v6ops                                                       I. Gashinsky
Internet-Draft                                                    Yahoo!
Intended status: Informational                                J. Jaeggli
Expires: April 26, 2012                                            Zynga
                                                               W. Kumari
                                                                  Google
                                                        October 24, 2011


                Operational Neighbor Discovery Problems
                   draft-ietf-v6ops-v6nd-problems-00

Abstract

   In IPv4, subnets are generally small, made just large enough to cover
   the actual number of machines on the subnet.  In contrast, the
   default IPv6 subnet size is a /64, a number so large it covers
   billions of billions of addresses, the overwhelming majority of which
   will be unassigned.  Consequently, simplistic implementations of
   Neighbor Discovery can be vulnerable to deliberate or accidental
   denial of service, whereby they attempt to perform address resolution
   for large numbers of unassigned addresses.  Such denial-of-service
   conditions can be triggered intentionally (by an attacker) or can
   result from legitimate operational tools or accidental conditions.
   As a result of these vulnerabilities, new devices may not be able to
   "join" a network, it may be impossible to establish new IPv6 flows,
   and existing IPv6 transport flows may be interrupted.

   This document describes the potential for DOS in detail and suggests
   possible implementation improvements as well as operational
   mitigation techniques that can in some cases be used to protect
   against or at least alleviate the impact of such attacks.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on April 26, 2012.

Copyright Notice

   Copyright (c) 2011 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Applicability
   2.  The Problem
   3.  Terminology
   4.  Background
   5.  Neighbor Discovery Overview
   6.  Operational Mitigation Options
     6.1.  Filtering of unused address space.
     6.2.  Appropriate Subnet Sizing.
     6.3.  Routing Mitigation.
     6.4.  Tuning of the NDP Queue Rate Limit.
   7.  Recommendations for Implementors.
     7.1.  Prioritize NDP Activities
     7.2.  Queue Tuning.
   8.  IANA Considerations
   9.  Security Considerations
   10. Acknowledgements
   11. References
     11.1.  Normative References
     11.2.  Informative References
   Appendix A.  Text goes here.
   Authors' Addresses

1.  Introduction

   This document describes implementation issues with IPv6's Neighbor
   Discovery protocol that can result in vulnerabilities when a network
   is scanned, either by an intruder or through the use of scanning
   tools that perform network inventory, security audits, etc. (e.g.,
   "nmap").

   This document describes the problem in detail and suggests possible
   implementation improvements as well as operational mitigation
   techniques that can in some cases protect against such attacks.

   The RFC series documents generally describe the behavior of
   protocols, that is, "what" is to be done by a protocol, but not
   exactly "how" it is to be implemented.  The exact details of how best
   to implement a protocol will depend on the overall hardware and
   software architecture of a particular device.  The actual "how"
   decisions are (correctly) left in the hands of implementers, so long
   as implementation differences still produce proper on-the-wire
   behavior.

   While reading this document, it is important to keep in mind that
   discussions of how things have been implemented beyond basic
   compliance with the specification are not within the scope of the
   neighbor discovery RFCs.

1.1.  Applicability

   This document is primarily intended for operators of IPv6 networks
   and implementors of [RFC4861].  The document provides some
   operational considerations as well as recommendations to increase the
   resilience of the Neighbor Discovery protocol.

2.  The Problem

   In IPv4, subnets are generally small, made just large enough to cover
   the actual number of machines on the subnet.
   For example, an IPv4 /20 contains only 4096 addresses.  In contrast,
   the default IPv6 subnet size is a /64, a number so large it covers
   literally billions of billions of addresses, the overwhelming
   majority of which will be unassigned.  Consequently, simplistic
   implementations of Neighbor Discovery can be vulnerable to
   denial-of-service attacks whereby they perform address resolution for
   large numbers of unassigned addresses.  Such denial-of-service
   attacks can be launched intentionally (by an attacker), or result
   from legitimate operational tools that scan networks for inventory
   and other purposes.  As a result of these vulnerabilities, new
   devices may not be able to "join" a network, it may be impossible to
   establish new IPv6 flows, and existing IPv6 transport flows may be
   interrupted.

   Network scans attempt to find and probe devices on a network.
   Typically, scans are performed on a range of target addresses, or all
   the addresses on a particular subnet.  When such probes are directed
   via a router, and the target addresses are on a directly attached
   network, the router will attempt to perform address resolution on a
   large number of destinations (i.e., some fraction of the 2^64
   addresses on the subnet).  The router's process of testing for the
   (non)existence of neighbors can induce a denial of service condition,
   where the number of necessary Neighbor Discovery requests overwhelms
   the implementation's capacity to process them, exhausts available
   memory, replaces existing in-use mappings with incomplete entries
   that will never be completed, and so on.  The resulting network
   disruption may impact existing traffic, and devices that join the
   network may find that address resolution attempts fail.

   In order to alleviate risk associated with this DOS threat, some
   router implementations have taken steps to rate-limit the processing
   of Neighbor Solicitations (NS).
   While these mitigations do help, they do not fully address the issue
   and may introduce their own set of potential liabilities to the
   neighbor discovery process.

3.  Terminology

   Address Resolution  Address resolution is the process through which a
      node determines the link-layer address of a neighbor given only
      its IP address.  In IPv6, address resolution is performed as part
      of Neighbor Discovery [RFC4861], p. 60.

   Forwarding Plane  That part of a router responsible for forwarding
      packets.  In higher-end routers, the forwarding plane is typically
      implemented in specialized hardware optimized for performance.
      Forwarding steps include determining the correct outgoing
      interface for a packet, decrementing its Time To Live (TTL),
      verifying and updating the checksum, placing the correct link-
      layer header on the packet, and forwarding it.

   Control Plane  That part of the router implementation that maintains
      the data structures that determine where packets should be
      forwarded.  The control plane is typically implemented as a
      "slower" software process running on a general purpose processor
      and is responsible for such functions as running the routing
      protocols, performing management, and resolving the correct link-
      layer address for adjacent neighbors.  The control plane
      "controls" the forwarding plane by programming it with the
      information needed for packet forwarding.

   Neighbor Cache  As described in [RFC4861], the data structure that
      holds the cache of (amongst other things) IP address to link-layer
      address mappings for connected nodes.  The forwarding plane
      accesses the Neighbor Cache on every forwarded packet.  Thus it is
      usually implemented in an ASIC.

   Neighbor Discovery Process  The Neighbor Discovery Process (NDP) is
      that part of the control plane that implements the Neighbor
      Discovery protocol.
      NDP is responsible for performing address resolution and
      maintaining the Neighbor Cache.  When forwarding packets, the
      forwarding plane accesses entries within the Neighbor Cache.  When
      the forwarding plane processes a packet for which the
      corresponding Neighbor Cache Entry is missing or incomplete, it
      notifies NDP to take appropriate action (typically via a shared
      queue).  NDP picks up requests from the shared queue and performs
      any necessary discovery action.  In many implementations, NDP is
      also responsible for responding to router solicitation messages,
      Neighbor Unreachability Detection (NUD), etc.

4.  Background

   Modern router architectures separate the forwarding of packets
   (forwarding plane) from the decisions about where the packets should
   go (control plane).  In order to deal with the high number of packets
   per second, the forwarding plane is generally implemented in hardware
   and is highly optimized for the task of forwarding packets.  In
   contrast, the NDP control plane is mostly implemented in software
   processes running on a general purpose processor.

   When a router needs to forward an IP packet, the forwarding plane
   logic performs the longest match lookup to determine where to send
   the packet and what outgoing interface to use.  To deliver the packet
   to an adjacent node, the forwarding plane encapsulates the packet in
   a link-layer frame (which contains a header with the link-layer
   destination address).  The forwarding plane logic checks the Neighbor
   Cache to see if it already has a suitable link-layer destination, and
   if not, places the request for the required information into a queue,
   and signals the control plane (i.e., NDP) that it needs the link-
   layer address resolved.

   In order to protect NDP specifically and the control plane generally
   from being overwhelmed with these requests, appropriate steps must be
   taken.
   For example, the size and fill rate of the queue might be limited.
   NDP running in the control plane of the router dequeues requests and
   performs the address resolution function (by sending a Neighbor
   Solicitation and listening for a Neighbor Advertisement).  This
   process is usually also responsible for other activities needed to
   maintain link-layer information, such as Neighbor Unreachability
   Detection (NUD).

   An attacker sending the appropriate packets to addresses on a given
   subnet can cause the router to queue attempts to resolve so many
   addresses that it crowds out attempts to resolve "legitimate"
   addresses (and in many cases the router becomes unable to perform
   maintenance of existing entries in the neighbor cache, and unable to
   answer Neighbor Solicitations).  This condition can result in the
   inability to resolve new neighbors and loss of reachability to
   neighbors with existing ND-Cache entries.  During testing it was
   concluded that 4 simultaneous nmap sessions from a low-end computer
   were sufficient to make a router's neighbor discovery process
   unusable, and forwarding to the destination subnets therefore became
   unavailable.

   The NDP behavior under attack has been observed across multiple
   platforms and implementations.

5.  Neighbor Discovery Overview

   When a packet arrives at (or is generated by) a router for a
   destination on an attached link, the router needs to determine the
   correct link-layer address to send the packet to.  The router checks
   the Neighbor Cache for an existing Neighbor Cache Entry for the
   neighbor, and if none exists, invokes the address resolution portions
   of the IPv6 Neighbor Discovery [RFC4861] protocol to determine the
   link-layer address.

   RFC4861 Section 5.2 (Conceptual Sending Algorithm) outlines how this
   process works.
   A very high-level summary is that the device creates a new Neighbor
   Cache Entry for the neighbor, sets the state to INCOMPLETE, queues
   the packet and initiates the actual address resolution process.  The
   device then sends out one or more Neighbor Solicitations, and when it
   receives a corresponding Neighbor Advertisement, completes the
   Neighbor Cache Entry and sends the queued packet.

6.  Operational Mitigation Options

   This section provides some feasible mitigation options that can be
   employed today by network operators in order to protect network
   availability while vendors implement more effective protection
   measures.  It can be stipulated that some of these options are
   "kludges" and are operationally difficult to manage.  They are
   presented because they represent the options currently available.
   It is each operator's responsibility to evaluate and understand the
   impact of changes to their network due to these measures.

6.1.  Filtering of unused address space.

   The DOS condition is induced by making a router try to resolve
   addresses on the subnet at a high rate.  By carefully addressing
   machines into a small portion of a subnet (such as the lowest
   numbered addresses), it is possible to filter access to addresses not
   in that portion.  This will prevent the attacker from making the
   router attempt to resolve unused addresses.  For example, if there
   are only 50 hosts connected to an interface, you may be able to
   filter any address above the first 64 addresses of that subnet by
   null-routing the subnet while carrying a more specific /122 route for
   the populated range.

   As mentioned at the beginning of this section, it is fully understood
   that this is ugly (and difficult to manage); but failing other
   options, it may be a useful technique, especially when responding to
   an attack.
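
   The arithmetic behind the /122 example can be checked with a short
   Python sketch.  The prefix 2001:db8:1:2::/64 and the helper name
   is_resolvable are illustrative assumptions and do not come from any
   router implementation; the sketch only shows which destinations fall
   inside the more specific route.

```python
import ipaddress

# Hypothetical documentation prefixes: the on-link /64 is null-routed,
# and only the populated /122 is carried as a more specific route.
SUBNET = ipaddress.ip_network("2001:db8:1:2::/64")
POPULATED = ipaddress.ip_network("2001:db8:1:2::/122")


def is_resolvable(destination: str) -> bool:
    """Return True if the destination falls inside the populated /122.

    Any other address in the /64 would match the null route and never
    trigger address resolution.
    """
    return ipaddress.ip_address(destination) in POPULATED


# A /122 leaves 128 - 122 = 6 host bits, i.e. 2^6 = 64 addresses.
print(POPULATED.num_addresses)
print(is_resolvable("2001:db8:1:2::32"))   # host number 50, inside
print(is_resolvable("2001:db8:1:2::40"))   # host number 64, outside
```

   With 50 hosts numbered from the bottom of the subnet, every host
   address stays inside the /122, while a scan of the remainder of the
   /64 never reaches the Neighbor Discovery process.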

   This solution requires that the hosts be statically or statefully
   addressed (as is often done in a datacenter) and may not interact
   well with networks using stateless address autoconfiguration
   [RFC4862].

6.2.  Appropriate Subnet Sizing.

   By sizing subnets to reflect the number of addresses actually in use,
   the problem can be avoided.  For example, [RFC6164] recommends sizing
   the subnets for inter-router links to only have 2 addresses (a /127).
   It is worth noting that this practice is common in IPv4 networks, in
   part to protect against the harmful effects of ARP request flooding.

6.3.  Routing Mitigation.

   One very effective technique is to route the subnet to a discard
   interface (most modern router platforms can discard traffic in
   hardware / the forwarding plane) and then have individual hosts
   announce routes for their IP addresses into the network (or use some
   method to inject much more specific addresses into the local routing
   domain).  For example, the network 2001:db8:1:2:3::/64 could be
   routed to a discard interface on "border" routers, and then
   individual hosts could announce 2001:db8:1:2:3::10/128 and
   2001:db8:1:2:3::66/128 into the IGP.  This is typically done by
   having the IP address bound to a virtual interface on the host (for
   example the loopback interface), enabling IP forwarding on the host
   and having it run a routing daemon.  For obvious reasons, host
   participation in the IGP makes many operators uncomfortable, but it
   can be a very powerful technique if used in a disciplined and
   controlled manner.

6.4.  Tuning of the NDP Queue Rate Limit.

   Many implementations provide a means to control the rate of
   resolution of unknown addresses.  By tuning this rate, it may be
   possible to ameliorate the issue.  However, as with most tuning knobs
   (especially those that deal with rate limiting), the attack may be
   completed more quickly due to the lower threshold.
   By excessively lowering this rate you may negatively impact how long
   the device takes to learn new addresses under normal conditions (for
   example, after clearing the neighbor cache or when the router first
   boots).  Under attack conditions, you may become unable to resolve
   "legitimate" addresses sooner than if you had left the parameter
   untouched.

   It is worth noting that this technique is worth investigating only if
   the device has separate queues for resolution of unknown addresses
   and the maintenance of existing entries.

7.  Recommendations for Implementors.

   This section provides some recommendations to implementors of IPv6
   Neighbor Discovery.

   At a high level, implementors should program defensively.  That is,
   they should assume that intruders will attempt to exploit
   implementation weaknesses, and should ensure that implementations are
   robust against various attacks.  In the case of Neighbor Discovery,
   the following general considerations apply:

   Manage Resources Explicitly -  Resources such as processor cycles,
      memory, etc. are never infinite, yet with IPv6's large subnets it
      is easy to cause NDP to generate large numbers of address
      resolution requests for non-existent destinations.
      Implementations need to limit resources devoted to processing
      Neighbor Discovery requests in a thoughtful manner.

   Prioritize -  Some NDP requests are more important than others.  For
      example, when resources are limited, responding to Neighbor
      Solicitations for one's own address is more important than
      initiating address resolution requests that create new entries.
      Likewise, performing Neighbor Unreachability Detection, which by
      definition is only invoked on destinations that are actively being
      used, is more important than creating new entries for possibly
      non-existent neighbors.

7.1.  Prioritize NDP Activities

   Not all Neighbor Discovery activities are equally important.
   Specifically, requests to perform large numbers of address
   resolutions on non-existent Neighbor Cache Entries should not come at
   the expense of servicing requests related to keeping existing, in-use
   entries properly up-to-date.  Thus, implementations should divide
   work activities into categories having different priorities.  The
   following gives examples of different activities and their importance
   in rough priority order.

   1.  It is critical to respond to Neighbor Solicitations for one's own
       address, especially when acting as a router.  Whether for address
       resolution or Neighbor Unreachability Detection, failure to
       respond to Neighbor Solicitations results in immediate problems.
       Failure to respond to NS requests that are part of NUD can cause
       neighbors to delete the NCE for that address, and will result in
       follow-up NS messages using multicast.  Once an entry has been
       flushed, existing traffic for destinations using that entry can
       no longer be forwarded until address resolution completes
       successfully.  In other words, not responding to NS messages
       further increases the NDP load, and causes on-going communication
       to fail.

   2.  It is critical to revalidate one's own existing NCEs in need of
       refresh.  As part of NUD, ND is required to frequently revalidate
       existing, in-use entries.  Failure to do so can result in the
       entry being discarded.  For in-use entries, discarding the entry
       will almost certainly result in a subsequent request to perform
       address resolution on the entry, but this time using multicast.
       As above, once the entry has been flushed, existing traffic for
       destinations using that entry can no longer be forwarded until
       address resolution completes successfully.

   3.
       To maintain the stability of the control plane, Neighbor
       Discovery activity related to traffic sourced by the router (as
       opposed to traffic being forwarded by the router) should be given
       high priority.  Whenever network problems occur, debugging and
       making other operational changes require being able to query and
       access the router.  In addition, routing protocols dependent on
       Neighbor Discovery for connectivity may begin to react
       (negatively) to perceived connectivity problems, causing
       additional undesirable ripple effects.

   4.  Traffic to unknown addresses should be given lowest priority.
       Indeed, it may be useful to distinguish between "never seen"
       addresses and those that have been seen before, but that do not
       have a corresponding NCE.  Specifically, the conceptual
       processing algorithm in IPv6 Neighbor Discovery [RFC4861] calls
       for deleting NCEs under certain conditions.  Rather than delete
       them completely, however, it might be useful to at least keep
       track of the fact that an entry at one time existed, in order to
       prioritize address resolution requests for such neighbors
       compared with neighbors that have never been seen before.

7.2.  Queue Tuning.

   On implementations in which requests to NDP are submitted via a
   single queue, router vendors SHOULD provide operators with means to
   control both the rate of link-layer address resolution requests
   placed into the queue and the size of the queue.  This will allow
   operators to tune Neighbor Discovery for their specific environment.
   The ability to set per-interface or per-subnet queue limits at a rate
   below that of the global queue limit might confine the damage to
   Neighbor Discovery processing to the network targeted by the attack.
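
   The effect of a per-interface limit set below the global limit can be
   illustrated with a minimal Python sketch.  It is a simplification
   under stated assumptions: it caps queue occupancy rather than arrival
   rate, and the class name ResolutionQueue, the interface names, and
   the limit values are hypothetical rather than taken from any router
   implementation.

```python
from collections import Counter, deque


class ResolutionQueue:
    """Pending address-resolution requests with two occupancy caps."""

    def __init__(self, global_limit: int, per_interface_limit: int) -> None:
        self.global_limit = global_limit
        self.per_interface_limit = per_interface_limit
        self.pending = deque()      # FIFO of (interface, address) tuples
        self.in_queue = Counter()   # pending requests per interface

    def enqueue(self, interface: str, address: str) -> bool:
        """Queue a request; return False if either cap is already reached."""
        if len(self.pending) >= self.global_limit:
            return False
        if self.in_queue[interface] >= self.per_interface_limit:
            return False
        self.pending.append((interface, address))
        self.in_queue[interface] += 1
        return True

    def dequeue(self):
        """Return the oldest request, or None if the queue is empty."""
        if not self.pending:
            return None
        interface, address = self.pending.popleft()
        self.in_queue[interface] -= 1
        return interface, address


queue = ResolutionQueue(global_limit=8, per_interface_limit=3)

# A scan floods eth0 with requests for unassigned addresses.
accepted = sum(
    queue.enqueue("eth0", f"2001:db8:0:1::{i:x}") for i in range(1, 101)
)

# A request on another interface still fits under the global cap.
other_ok = queue.enqueue("eth1", "2001:db8:0:2::10")
print(accepted, other_ok)
```

   Because the flooded interface can occupy at most three of the eight
   slots in this sketch, requests arriving on other interfaces are still
   accepted while the scan is in progress.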

   Setting those values must be a very careful balancing act: the lower
   the rate of entry into the queue, the less load there will be on the
   ND process; however, it will take the router longer to learn
   legitimate destinations as a result.  In a datacenter with 6,000
   hosts attached to a single router, setting that value below 1,000
   requests per second would mean that resolving all of the addresses
   from an initial state (or after an event that invalidates the address
   cache, such as an STP TCN) may take over 6 seconds.  Similarly, the
   smaller the queue, the higher the likelihood of an attack being able
   to knock out legitimate traffic (but the lower the memory utilization
   on the router).

8.  IANA Considerations

   No IANA resources or considerations are requested in this draft.

9.  Security Considerations

   This document outlines mitigation options that operators can use to
   protect themselves from Denial of Service attacks.  Implementation
   advice to router vendors aimed at ameliorating known problems carries
   the risk of previously unforeseen consequences.  It is not believed
   that these mitigation techniques or the implementation of finer-
   grained queuing of NDP activity create additional security risks or
   DOS exposure.

10.  Acknowledgements

   The authors would like to thank Ron Bonica, Troy Bonin, John Jason
   Brzozowski, Randy Bush, Vint Cerf, Tassos Chatzithomaoglou, Jason
   Fesler, Wes George, Erik Kline, Jared Mauch, Chris Morrow and Suran
   De Silva.  Special thanks to Thomas Narten for detailed review and
   (even more so) for providing text!

   Apologies to anyone we may have missed; it was not intentional.

11.  References

11.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC4398]  Josefsson, S., "Storing Certificates in the Domain Name
              System (DNS)", RFC 4398, March 2006.

   [RFC4861]  Narten, T., Nordmark, E., Simpson, W., and H. Soliman,
              "Neighbor Discovery for IP version 6 (IPv6)", RFC 4861,
              September 2007.

   [RFC4862]  Thomson, S., Narten, T., and T. Jinmei, "IPv6 Stateless
              Address Autoconfiguration", RFC 4862, September 2007.

   [RFC6164]  Kohno, M., Nitzan, B., Bush, R., Matsuzaki, Y., Colitti,
              L., and T. Narten, "Using 127-Bit IPv6 Prefixes on Inter-
              Router Links", RFC 6164, April 2011.

11.2.  Informative References

   [RFC4255]  Schlyter, J. and W. Griffin, "Using DNS to Securely
              Publish Secure Shell (SSH) Key Fingerprints", RFC 4255,
              January 2006.

Appendix A.  Text goes here.

   TBD

Authors' Addresses

   Igor
   Yahoo!
   45 W 18th St
   New York, NY
   USA

   Email: igor@yahoo-inc.com


   Joel
   Zynga
   111 Evelyn
   Sunnyvale, CA
   USA

   Email: jjaeggli@zynga.com


   Warren Kumari
   Google

   Email: warren@kumari.net