v6ops                                                       I. Gashinsky
Internet-Draft                                                    Yahoo!
Intended status: Informational                                J. Jaeggli
Expires: July 28, 2012                                             Zynga
                                                               W. Kumari
                                                              Google Inc
                                                        January 25, 2012


                Operational Neighbor Discovery Problems
                   draft-ietf-v6ops-v6nd-problems-03

Abstract

   In IPv4, subnets are generally small, made just large enough to cover
   the actual number of machines on the subnet.  In contrast, the
   default IPv6 subnet size is a /64, a number so large it covers
   trillions of addresses, the overwhelming majority of which will be
   unassigned.
   Consequently, simplistic implementations of Neighbor Discovery (ND)
   can be vulnerable to deliberate or accidental denial of service,
   whereby they attempt to perform address resolution for large numbers
   of unassigned addresses.  Such denial-of-service attacks can be
   launched intentionally (by an attacker), or result from legitimate
   operational tools or accidental conditions.  As a result of these
   vulnerabilities, new devices may not be able to "join" a network, it
   may be impossible to establish new IPv6 flows, and existing IPv6
   transported flows may be interrupted.

   This document describes the potential for DoS in detail and suggests
   possible implementation improvements as well as operational
   mitigation techniques that can in some cases be used to protect
   against or at least alleviate the impact of such attacks.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on July 28, 2012.

Copyright Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.
   Please review these documents carefully, as they describe your
   rights and restrictions with respect to this document.  Code
   Components extracted from this document must include Simplified BSD
   License text as described in Section 4.e of the Trust Legal
   Provisions and are provided without warranty as described in the
   Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Applicability
   2.  The Problem
   3.  Terminology
   4.  Background
   5.  Neighbor Discovery Overview
   6.  Operational Mitigation Options
     6.1.  Filtering of unused address space.
     6.2.  Minimal Subnet Sizing.
     6.3.  Routing Mitigation.
     6.4.  Tuning of the NDP Queue Rate Limit.
   7.  Recommendations for Implementors.
     7.1.  Prioritize NDP Activities
     7.2.  Queue Tuning.
   8.  IANA Considerations
   9.  Security Considerations
   10. Acknowledgements
   11. References
     11.1.  Normative References
     11.2.  Informative References
   Authors' Addresses

1.  Introduction

   This document describes implementation issues with IPv6's Neighbor
   Discovery protocol that can result in vulnerabilities when a network
   is scanned, either by an intruder or through the use of scanning
   tools that perform network inventory, security audits, etc. (e.g.,
   "nmap").

   This document describes the problem in detail and suggests possible
   implementation improvements, as well as operational mitigation
   techniques, that can in some cases protect against such attacks.

   The RFC series documents generally describe the behavior of
   protocols, that is, "what" is to be done by a protocol, but not
   exactly "how" it is to be implemented.  The exact details of how best
   to implement a protocol will depend on the overall hardware and
   software architecture of a particular device.  The actual "how"
   decisions are (correctly) left in the hands of implementers, so long
   as the differing implementations produce proper on-the-wire
   behavior.

   While reading this document, it is important to keep in mind that
   discussion of how things have been implemented, beyond basic
   compliance with the specification, is not within the scope of the
   Neighbor Discovery RFCs.

1.1.  Applicability

   This document is primarily intended for operators of IPv6 networks
   and implementors of [RFC4861].  It provides some operational
   considerations as well as recommendations to increase the resilience
   of the Neighbor Discovery protocol.

2.  The Problem

   In IPv4, subnets are generally small, made just large enough to cover
   the actual number of machines on the subnet.  For example, an IPv4
   /20 contains only 4096 addresses.  In contrast, the default IPv6
   subnet size is a /64, a number so large it covers literally billions
   of billions of addresses, the overwhelming majority of which will be
   unassigned.
   Consequently, simplistic implementations of Neighbor Discovery may
   fail to perform as desired when they perform address resolution of
   large numbers of unassigned addresses.  Such failures can be
   triggered either intentionally by an attacker launching a Denial of
   Service (DoS) attack to exploit this vulnerability, or
   unintentionally due to the use of legitimate operational tools that
   scan networks for inventory and other purposes.  As a result of these
   failures, new devices may not be able to "join" a network, it may be
   impossible to establish new IPv6 flows, and existing IPv6 transport
   flows may be interrupted.

   Network scans attempt to find and probe devices on a network.
   Typically, scans are performed on a range of target addresses, or all
   the addresses on a particular subnet.  When such probes are directed
   via a router, and the target addresses are on a directly attached
   network, the router will attempt to perform address resolution on a
   large number of destinations (i.e., some fraction of the 2^64
   addresses on the subnet).  The router's process of testing for the
   (non)existence of neighbors can induce a denial of service condition,
   where the number of necessary Neighbor Discovery requests overwhelms
   the implementation's capacity to process them, exhausts available
   memory, and replaces existing in-use mappings with incomplete entries
   that will never be completed.  A directed DoS attack may seek to
   intentionally create conditions similar to those created
   unintentionally by a network scan.  The resulting network disruption
   may impact existing traffic, and devices that join the network may
   find that address resolution attempts fail.
   The DoS that results from network scanning was previously described
   in [RFC5157].

   In order to mitigate the risk associated with this DoS threat, some
   router implementations have taken steps to rate-limit the processing
   of Neighbor Solicitations (NS).  While these mitigations do help,
   they do not fully address the issue and may introduce their own set
   of issues to the Neighbor Discovery process.

3.  Terminology

   Address Resolution  Address resolution is the process through which a
      node determines the link-layer address of a neighbor given only
      its IP address.  In IPv6, address resolution is performed as part
      of Neighbor Discovery [RFC4861], p. 60.

   Forwarding Plane  That part of a router responsible for forwarding
      packets.  In higher-end routers, the forwarding plane is typically
      implemented in specialized hardware optimized for performance.
      Steps in the forwarding process include determining the correct
      outgoing interface for a packet, decrementing its Time To Live
      (TTL), verifying and updating the checksum, placing the correct
      link-layer header on the packet, and forwarding it.

   Control Plane  That part of the router implementation that maintains
      the data structures that determine where packets should be
      forwarded.  The control plane is typically implemented as a
      "slower" software process running on a general purpose processor
      and is responsible for such functions as communicating network
      status changes via routing protocols, maintaining the forwarding
      table, performing management, and resolving the correct link-layer
      address for adjacent neighbors.  The control plane "controls" the
      forwarding plane by programming it with the information needed for
      packet forwarding.

   Neighbor Cache  As described in [RFC4861], the data structure that
      holds the cache of (amongst other things) IP address to link-layer
      address mappings for connected nodes.
      As the information in the Neighbor Cache is needed by the
      forwarding plane every time it forwards a packet, it is usually
      implemented in an ASIC.

   Neighbor Discovery Process  The Neighbor Discovery Process (NDP) is
      that part of the control plane that implements the Neighbor
      Discovery protocol.  NDP is responsible for performing address
      resolution and maintaining the Neighbor Cache.  When forwarding
      packets, the forwarding plane accesses entries within the Neighbor
      Cache.  When the forwarding plane processes a packet for which the
      corresponding Neighbor Cache Entry is missing or incomplete, it
      notifies NDP to take appropriate action (typically via a shared
      queue).  NDP picks up requests from the shared queue and performs
      any necessary discovery action.  In many implementations the NDP
      is also responsible for responding to router solicitation
      messages, Neighbor Unreachability Detection (NUD), etc.

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

4.  Background

   Modern router architectures separate the forwarding of packets
   (forwarding plane) from the decisions needed to decide where the
   packets should go (control plane).  In order to deal with the high
   number of packets per second, the forwarding plane is generally
   implemented in hardware and is highly optimized for the task of
   forwarding packets.  In contrast, the NDP control plane is mostly
   implemented in software processes running on a general purpose
   processor.

   When a router needs to forward an IP packet, the forwarding plane
   logic performs the longest match lookup to determine where to send
   the packet and what outgoing interface to use.
   To deliver the packet to an adjacent node, the forwarding plane
   encapsulates the packet in a link-layer frame (which contains a
   header with the link-layer destination address).  The forwarding
   plane logic checks the Neighbor Cache to see if it already has a
   suitable link-layer destination, and if not, places the request for
   the required information into a queue, and signals the control plane
   (i.e., NDP) that it needs the link-layer address resolved.

   In order to protect NDP specifically and the control plane generally
   from being overwhelmed with these requests, appropriate steps must be
   taken.  For example, the size and fill rate of the queue might be
   limited.  NDP running in the control plane of the router dequeues
   requests and performs the address resolution function (by performing
   a neighbor solicitation and listening for a neighbor advertisement).
   This process is usually also responsible for other activities needed
   to maintain link-layer information, such as Neighbor Unreachability
   Detection (NUD).

   By sending appropriate packets to addresses on a given subnet, an
   attacker can cause the router to queue attempts to resolve so many
   addresses that it crowds out attempts to resolve "legitimate"
   addresses (and in many cases the router becomes unable to perform
   maintenance of existing entries in the Neighbor Cache, and unable to
   answer Neighbor Solicitations).  This condition can result in the
   inability to resolve new neighbors and loss of reachability to
   neighbors with existing Neighbor Cache entries.  During testing, it
   was found that 4 simultaneous nmap sessions from a low-end computer
   were sufficient to make a router's Neighbor Discovery process
   unusable, at which point forwarding to the destination subnets
   became unavailable.
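
   The crowding-out effect described above can be illustrated with a
   small model.  The following is only a sketch under stated
   assumptions, not a description of any real router: time advances in
   discrete ticks, the forwarding plane and NDP share one bounded FIFO
   queue with tail drop, and within each tick the scan-induced requests
   are assumed to arrive ahead of the legitimate ones (a pessimistic
   ordering).  The function name and all parameters are hypothetical.

```python
from collections import deque


def simulate_shared_queue(ticks, scan_per_tick, legit_per_tick,
                          queue_size, served_per_tick):
    """Model one shared, bounded FIFO between forwarding plane and NDP.

    Returns (legit_offered, legit_dropped).  All names and the tick-based
    timing are illustrative assumptions, not taken from an implementation.
    """
    queue = deque()
    legit_offered = 0
    legit_dropped = 0
    for _ in range(ticks):
        # Forwarding plane: enqueue resolution requests, tail-dropping
        # whenever the shared queue is already full.
        arrivals = ["scan"] * scan_per_tick + ["legit"] * legit_per_tick
        for kind in arrivals:
            if kind == "legit":
                legit_offered += 1
            if len(queue) < queue_size:
                queue.append(kind)
            elif kind == "legit":
                legit_dropped += 1
        # Control plane (NDP): dequeue a bounded number of requests.
        for _ in range(min(served_per_tick, len(queue))):
            queue.popleft()
    return legit_offered, legit_dropped


if __name__ == "__main__":
    # Same queue and service rate, without and with a scan in progress.
    print(simulate_shared_queue(10, 0, 5, 100, 50))
    print(simulate_shared_queue(10, 1000, 5, 100, 50))
```

   In this model, no legitimate request is dropped while the scan is
   absent, and every legitimate request is dropped once the scan keeps
   the queue full, even though the queue size and service rate are
   unchanged.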
   The failure to maintain proper NDP behavior whilst under attack has
   been observed across multiple platforms and implementations,
   including the largest routers available (when this document was
   started) from two of the largest vendors.

5.  Neighbor Discovery Overview

   When a packet arrives at (or is generated by) a router for a
   destination on an attached link, the router needs to determine the
   correct link-layer address to use in the destination field of the
   layer 2 encapsulation.  The router checks the Neighbor Cache for an
   existing Neighbor Cache Entry for the neighbor, and if none exists,
   invokes the address resolution portions of the IPv6 Neighbor
   Discovery [RFC4861] protocol to determine the link-layer address of
   the neighbor.

   [RFC4861] Section 5.2 (Conceptual Sending Algorithm) outlines how
   this process works.  A very high level summary is that the device
   creates a new Neighbor Cache Entry for the neighbor, sets the state
   to INCOMPLETE, queues the packet and initiates the actual address
   resolution process.  The device then sends out one or more Neighbor
   Solicitations, and when it receives a corresponding Neighbor
   Advertisement, completes the Neighbor Cache Entry and sends the
   queued packet.

6.  Operational Mitigation Options

   This section provides some feasible mitigation options that can be
   employed today by network operators in order to protect network
   availability while vendors implement more effective protection
   measures.  It can be stipulated that some of these options are
   "kludges", and are operationally difficult to manage.  They are
   presented because they are the options currently available.  It is
   each operator's responsibility to evaluate and understand the impact
   of changes to their network due to these measures.

6.1.  Filtering of unused address space.
   The DoS condition is induced by making a router try to resolve
   addresses on the subnet at a high rate.  By carefully addressing
   machines into a small portion of a subnet (such as the lowest
   numbered addresses), it is possible to filter access to addresses not
   in that assigned portion of address space using Access Control Lists
   (ACLs), or by null routing, features which are available on most
   existing platforms.  This will prevent the attacker from making the
   router attempt to resolve unused addresses.  For example, if there
   are only 50 hosts connected to an interface, you may be able to
   filter any address above the first 64 addresses of that subnet by
   null-routing the subnet while carrying a more specific /122 route, or
   by applying ACLs on the WAN link to prevent the attack traffic from
   reaching the vulnerable device.

   As mentioned at the beginning of this section, it is fully understood
   that this is ugly (and difficult to manage); but failing other
   options, it may be a useful technique, especially when responding to
   an attack.

   This solution requires that the hosts be statically or statefully
   addressed (as is often done in a datacenter) and may not interact
   well with networks using [RFC4862].

6.2.  Minimal Subnet Sizing.

   By sizing subnets to reflect the number of addresses actually in use,
   the problem can be avoided.  For example, [RFC6164] recommends sizing
   the subnets for inter-router links to only have 2 addresses (a /127).
   It is worth noting that this practice is common in IPv4 networks, in
   part to protect against the harmful effects of ARP request flooding.

   Subnet prefixes longer than a /64 are not able to use stateless
   auto-configuration [RFC4862], so this approach is not suitable for
   use with hosts that are not statically configured.

6.3.  Routing Mitigation.
   One very effective technique is to route the subnet to a discard
   interface (most modern router platforms can discard traffic in
   hardware / the forwarding plane) and then have individual hosts
   announce routes for their IP addresses into the network (or use some
   method to inject much more specific addresses into the local routing
   domain).  For example, the network 2001:db8:1:2::/64 could be routed
   to a discard interface on "border" routers, and then individual hosts
   could announce 2001:db8:1:2:3::10/128, 2001:db8:1:2:3::66/128 into
   the IGP.  This is typically done by having the IP address bound to a
   virtual interface on the host (for example the loopback interface),
   enabling IP forwarding on the host and having it run a routing
   daemon.  For obvious reasons, host participation in the IGP makes
   many operators uncomfortable, but it can be a very powerful technique
   if used in a disciplined and controlled manner.  One method to help
   address these concerns is to have the hosts participate in a
   different IGP (or a different instance of the same IGP) and carefully
   redistribute into the main IGP.

6.4.  Tuning of the NDP Queue Rate Limit.

   Many implementations provide a means to control the rate of
   resolution of unknown addresses.  By tuning this rate, it may be
   possible to ameliorate the issue, although, as with most tuning knobs
   (especially those that deal with rate limiting), the attack may be
   completed more quickly due to the lower threshold.  By excessively
   lowering this rate you may negatively impact how long the device
   takes to learn new addresses under normal conditions (for example,
   after clearing the Neighbor Cache or when the router first boots).
   Under attack conditions, you may find that you become unable to
   resolve "legitimate" addresses sooner than if you had just left the
   parameter untouched.
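
   The cost of lowering the rate can be estimated with simple
   arithmetic.  The sketch below is illustrative only: it assumes the
   rate limit is expressed in resolutions per second and that every
   resolution succeeds on the first attempt, so it gives a lower bound.
   The function name is hypothetical, and the 6,000-host figure is the
   datacenter example used in Section 7.2.

```python
def seconds_to_relearn(hosts, resolutions_per_second):
    """Lower bound on the time to resolve `hosts` neighbors from an
    empty Neighbor Cache, given a resolution rate limit.

    Assumes (for illustration) one successful resolution per host and a
    limit expressed in resolutions per second.
    """
    if resolutions_per_second <= 0:
        raise ValueError("rate limit must be positive")
    return hosts / resolutions_per_second


if __name__ == "__main__":
    # Compare a few candidate rate limits for 6,000 attached hosts.
    for rate in (1000, 500, 100):
        print(rate, seconds_to_relearn(6000, rate))
```

   Halving the rate doubles the time needed to repopulate the cache, so
   the limit should be weighed against how quickly the device must
   recover after a reboot or a cache flush.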
   Note that this technique is worth investigating only if the device
   has separate queues for resolution of unknown addresses and the
   maintenance of existing entries.

7.  Recommendations for Implementors.

   This section provides some recommendations to implementors of IPv6
   Neighbor Discovery.

   At a high level, implementors should program defensively.  That is,
   they should assume that attackers will attempt to exploit
   implementation weaknesses, and should ensure that implementations are
   robust to various attacks.  In the case of Neighbor Discovery, the
   following general considerations apply:

   Manage Resources Explicitly  Resources such as processor cycles,
      memory, etc. are never infinite, yet with IPv6's large subnets it
      is easy to cause NDP to generate large numbers of address
      resolution requests for non-existent destinations.
      Implementations need to limit resources devoted to processing
      Neighbor Discovery requests in a thoughtful manner.

   Prioritize  Some NDP requests are more important than others.  For
      example, when resources are limited, responding to Neighbor
      Solicitations for one's own address is more important than
      initiating address resolution requests that create new entries.
      Likewise, performing Neighbor Unreachability Detection, which by
      definition is only invoked on destinations that are actively being
      used, is more important than creating new entries for possibly
      non-existent neighbors.

7.1.  Prioritize NDP Activities

   Not all Neighbor Discovery activities are equally important.
   Specifically, requests to perform large numbers of address
   resolutions on non-existent Neighbor Cache Entries should not come at
   the expense of servicing requests related to keeping existing, in-use
   entries properly up-to-date.  Thus, implementations should divide
   work activities into categories having different priorities.
   The following gives examples of different activities and their
   importance in rough priority order.  If implemented, the operation
   and priority of these SHOULD be configurable by the operator.

   1.  It is critical to respond to Neighbor Solicitations for one's own
   address, especially for a router.  Whether for address resolution or
   Neighbor Unreachability Detection, failure to respond to Neighbor
   Solicitations results in immediate problems.  Failure to respond to
   NS requests that are part of NUD can cause neighbors to delete the
   NCE for that address, and will result in followup NS messages using
   multicast.  Once an entry has been flushed, existing traffic for
   destinations using that entry can no longer be forwarded until
   address resolution completes successfully.  In other words, not
   responding to NS messages further increases the NDP load, and causes
   on-going communication to fail.

   2.  It is critical to revalidate one's own existing NCEs in need of
   refresh.  As part of NUD, ND is required to frequently revalidate
   existing, in-use entries.  Failure to do so can result in the entry
   being discarded.  For in-use entries, discarding the entry will
   almost certainly result in a subsequent request to perform address
   resolution on the entry, but this time using multicast.  As above,
   once the entry has been flushed, existing traffic for destinations
   using that entry can no longer be forwarded until address resolution
   completes successfully.

   3.  To maintain the stability of the control plane, Neighbor
   Discovery activity related to traffic sourced by the router (as
   opposed to traffic being forwarded by the router) should be given
   high priority.  Whenever network problems occur, debugging and making
   other operational changes requires being able to query and access the
   router.
   In addition, routing protocols dependent on Neighbor Discovery for
   connectivity may begin to react (negatively) to perceived
   connectivity problems, causing additional undesirable ripple effects.

   4.  Traffic to unknown addresses should be given lowest priority.
   Indeed, it may be useful to distinguish between "never seen"
   addresses and those that have been seen before, but that do not have
   a corresponding NCE.  Specifically, the conceptual processing
   algorithm in IPv6 Neighbor Discovery [RFC4861] calls for deleting
   NCEs under certain conditions.  Rather than delete them completely,
   however, it might be useful to at least keep track of the fact that
   an entry at one time existed, in order to prioritize address
   resolution requests for such neighbors compared with neighbors that
   have never been seen before.

7.2.  Queue Tuning.

   On implementations in which requests to NDP are submitted via a
   single queue, router vendors SHOULD provide operators with means to
   control both the rate of link-layer address resolution requests
   placed into the queue and the size of the queue.  This will allow
   operators to tune Neighbor Discovery for their specific environment.
   The ability to set per-interface or per-prefix queue limits at a rate
   below that of the global queue limit might confine the damage to
   Neighbor Discovery processing to the network targeted by the attack.

   Setting those values must be a very careful balancing act: the lower
   the rate of entry into the queue, the less load there will be on the
   ND process; however, it will take the router longer to learn
   legitimate destinations as a result.  In a datacenter with 6,000
   hosts attached to a single router, setting that rate to under 1,000
   per second would mean that resolving all of the addresses from an
   initial state (or after something that invalidates the address cache,
   such as an STP TCN) may take over 6 seconds.
   Similarly, the smaller the queue, the higher the likelihood of an
   attack being able to knock out legitimate traffic (but the lower the
   memory utilization on the router).

8.  IANA Considerations

   No IANA resources or considerations are requested in this draft.

9.  Security Considerations

   This document outlines mitigation options that operators can use to
   protect themselves from Denial of Service attacks.  Implementation
   advice to router vendors aimed at ameliorating known problems carries
   the risk of previously unforeseen consequences.  It is not believed
   that these mitigation techniques or the implementation of
   finer-grained queuing of NDP activity create additional security
   risks or DoS exposure.

10.  Acknowledgements

   The authors would like to thank Ron Bonica, Troy Bonin, John Jason
   Brzozowski, Randy Bush, Vint Cerf, Tassos Chatzithomaoglou, Jason
   Fesler, Wes George, Erik Kline, Jared Mauch, Chris Morrow and Suran
   De Silva.  Special thanks to Thomas Narten and Ray Hunter for
   detailed review and (even more so) for providing text!

   Apologies to anyone we may have missed; it was not intentional.

11.  References

11.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC4861]  Narten, T., Nordmark, E., Simpson, W., and H. Soliman,
              "Neighbor Discovery for IP version 6 (IPv6)", RFC 4861,
              September 2007.

   [RFC4862]  Thomson, S., Narten, T., and T. Jinmei, "IPv6 Stateless
              Address Autoconfiguration", RFC 4862, September 2007.

   [RFC6164]  Kohno, M., Nitzan, B., Bush, R., Matsuzaki, Y., Colitti,
              L., and T. Narten, "Using 127-Bit IPv6 Prefixes on
              Inter-Router Links", RFC 6164, April 2011.

11.2.  Informative References

   [RFC5157]  Chown, T., "IPv6 Implications for Network Scanning",
              RFC 5157, March 2008.

Authors' Addresses

   Igor Gashinsky
   Yahoo!
   45 W 18th St
   New York, NY
   USA

   Email: igor@yahoo-inc.com


   Joel Jaeggli
   Zynga
   111 Evelyn
   Sunnyvale, CA
   USA

   Email: jjaeggli@zynga.com


   Warren Kumari
   Google Inc
   1600 Amphitheatre Parkway
   Mountain View, CA
   USA

   Email: warren@kumari.net