Network Working Group                                          W. Kumari
Internet-Draft                                                    Google
Intended status: Informational                              I. Gashinsky
Expires: April 8, 2012                                            Yahoo!
                                                              J. Jaeggli
                                                                   Zynga
                                                        October 06, 2011


                Operational Neighbor Discovery Problems
                 draft-gashinsky-v6ops-v6nd-problems-00

Abstract

   In IPv4, subnets are generally small, made just large enough to cover
   the actual number of machines on the subnet. In contrast, the default
   IPv6 subnet size is a /64, a number so large it covers billions of
   billions of addresses, the overwhelming majority of which will be
   unassigned. Consequently, simplistic implementations of Neighbor
   Discovery can be vulnerable to denial of service attacks whereby they
   attempt to perform address resolution for large numbers of unassigned
   addresses. Such denial of service attacks can be launched
   intentionally (by an attacker), or result from legitimate operational
   tools that scan networks for inventory and other purposes. As a
   result of these vulnerabilities, new devices may not be able to
   "join" a network, it may be impossible to establish new IPv6 flows,
   and existing IPv6 transport flows may be interrupted.

   This document describes the problem in detail and suggests possible
   implementation improvements as well as operational mitigation
   techniques that can, in some cases, protect against such attacks.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute working
   documents as Internet-Drafts. The list of current Internet-Drafts is
   at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on April 8, 2012.

Copyright Notice

   Copyright (c) 2011 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Applicability
   2.  The Problem
   3.  Terminology
   4.  Background
   5.  Neighbor Discovery Overview
   6.  Operational Mitigation Options
     6.1.  Filtering of unused address space.
     6.2.  Appropriate Subnet Sizing.
     6.3.  Routing Mitigation.
     6.4.  Tuning of the NDP Queue Rate Limit.
   7.  Recommendations for Implementors.
     7.1.  Prioritize NDP Activities
     7.2.  Queue Tuning.
   8.  IANA Considerations
   9.  Security Considerations
   10. Acknowledgements
   11. References
     11.1.  Normative References
     11.2.  Informative References
   Appendix A.  Text goes here.
   Authors' Addresses

1.  Introduction

   This document describes implementation issues with IPv6's Neighbor
   Discovery protocol that can result in vulnerabilities when a network
   is scanned, either by an intruder or through the use of scanning
   tools that perform network inventory, security audits, etc. (e.g.,
   "nmap").

   This document describes the problem in detail and suggests possible
   implementation improvements as well as operational mitigation
   techniques that can, in some cases, protect against such attacks.

   Documents in the RFC series generally describe the behavior of
   protocols, that is, "what" is to be done by a protocol, but not
   exactly "how" it is to be implemented. The exact details of how best
   to implement a protocol will depend on the overall hardware and
   software architecture of a particular device. The actual "how"
   decisions are (correctly) left in the hands of implementers, so long
   as differing implementations still produce proper on-the-wire
   behavior.

   While reading this document, it is important to keep in mind that
   discussions of how things have been implemented, beyond basic
   compliance with the specification, are not within the scope of the
   Neighbor Discovery RFCs.

1.1.  Applicability

   This document is primarily intended for operators of IPv6 networks
   and implementors of [RFC4861]. It provides operational considerations
   as well as recommendations to increase the resilience of the Neighbor
   Discovery protocol.

2.  The Problem

   In IPv4, subnets are generally small, made just large enough to cover
   the actual number of machines on the subnet.
   For example, an IPv4 /20 contains only 4096 addresses. In contrast,
   the default IPv6 subnet size is a /64, a number so large it covers
   literally billions of billions of addresses, the overwhelming
   majority of which will be unassigned. Consequently, simplistic
   implementations of Neighbor Discovery can be vulnerable to denial of
   service attacks whereby they perform address resolution for large
   numbers of unassigned addresses. Such denial of service attacks can
   be launched intentionally (by an attacker), or result from legitimate
   operational tools that scan networks for inventory and other
   purposes. As a result of these vulnerabilities, new devices may not
   be able to "join" a network, it may be impossible to establish new
   IPv6 flows, and existing IPv6 transport flows may be interrupted.

   Network scans attempt to find and probe devices on a network.
   Typically, scans are performed on a range of target addresses, or all
   the addresses on a particular subnet. When such probes are directed
   via a router, and the target addresses are on a directly attached
   network, the router will attempt to perform address resolution on a
   large number of destinations (i.e., some fraction of the 2^64
   addresses on the subnet). The process of testing for the
   (non)existence of neighbors can induce a denial of service condition,
   where the number of Neighbor Discovery requests overwhelms the
   implementation's capacity to process them, exhausts available memory,
   replaces existing in-use mappings with incomplete entries that will
   never be completed, etc. The result can be network disruption, where
   existing traffic may be impacted, and devices that join the network
   find that address resolution fails.

   In order to alleviate the risk associated with this DoS threat, some
   router implementations have taken steps to rate-limit the processing
   of Neighbor Solicitations (NS).
   While these mitigations do help, they do not fully address the issue
   and may introduce their own set of potential liabilities to the
   Neighbor Discovery process.

3.  Terminology

   Address Resolution  Address resolution is the process through which a
      node determines the link-layer address of a neighbor given only
      its IP address. In IPv6, address resolution is performed as part
      of Neighbor Discovery [RFC4861], p. 60.

   Forwarding Plane  That part of a router responsible for forwarding
      packets. In higher-end routers, the forwarding plane is typically
      implemented in specialized hardware optimized for performance.
      Forwarding steps include determining the correct outgoing
      interface for a packet, decrementing its Time To Live (TTL),
      verifying and updating the checksum, placing the correct
      link-layer header on the packet, and forwarding it.

   Control Plane  That part of the router implementation that maintains
      the data structures that determine where packets should be
      forwarded. The control plane is typically implemented as a
      "slower" software process running on a general purpose processor
      and is responsible for such functions as running the routing
      protocols, performing management, and resolving the correct
      link-layer address for adjacent neighbors. The control plane
      "controls" the forwarding plane by programming it with the
      information needed for packet forwarding.

   Neighbor Cache  As described in [RFC4861], the data structure that
      holds the cache of (amongst other things) IP address to link-layer
      address mappings for connected nodes. The forwarding plane
      accesses the Neighbor Cache on every forwarded packet. Thus it is
      usually implemented in an ASIC.

   Neighbor Discovery Process  The Neighbor Discovery Process (NDP) is
      that part of the control plane that implements the Neighbor
      Discovery protocol.
      NDP is responsible for performing address resolution and
      maintaining the Neighbor Cache. When forwarding packets, the
      forwarding plane accesses entries within the Neighbor Cache.
      Whenever the forwarding plane processes a packet for which the
      corresponding Neighbor Cache Entry is missing or incomplete, it
      notifies NDP to take appropriate action (typically via a shared
      queue). NDP picks up requests from the shared queue and performs
      any necessary actions. In many implementations it is also
      responsible for responding to router solicitation messages,
      Neighbor Unreachability Detection (NUD), etc.

4.  Background

   Modern router architectures separate the forwarding of packets
   (forwarding plane) from the decisions about where the packets should
   go (control plane). In order to deal with the high number of packets
   per second, the forwarding plane is generally implemented in hardware
   and is highly optimized for the task of forwarding packets. In
   contrast, the NDP control plane is mostly implemented in software
   processes running on a general purpose processor.

   When a router needs to forward an IP packet, the forwarding plane
   logic performs the longest match lookup to determine where to send
   the packet and what outgoing interface to use. To deliver the packet
   to an adjacent node, it encapsulates the packet in a link-layer frame
   (which contains a header with the link-layer destination address).
   The forwarding plane logic checks the Neighbor Cache to see if it
   already has a suitable link-layer destination, and if not, places the
   request for the required information into a queue, and signals the
   control plane (i.e., NDP) that it needs the link-layer address
   resolved.

   In order to protect NDP specifically and the control plane generally
   from being overwhelmed with these requests, appropriate steps must be
   taken.
   For example, the size of the queue and the rate at which requests
   enter it might be limited. NDP running in the control plane of the
   router dequeues requests and performs the address resolution function
   (by sending a Neighbor Solicitation and listening for a Neighbor
   Advertisement). This process is usually also responsible for other
   activities needed to maintain link-layer information, such as
   Neighbor Unreachability Detection (NUD).

   An attacker sending the appropriate packets to addresses on a given
   subnet can cause the router to queue attempts to resolve so many
   addresses that it crowds out attempts to resolve "legitimate"
   addresses (and in many cases the router becomes unable to perform
   maintenance of existing entries in the Neighbor Cache, and unable to
   answer Neighbor Solicitations). This condition can result in the
   inability to resolve new neighbors and loss of reachability to
   neighbors with existing Neighbor Cache entries. During testing it was
   found that four simultaneous nmap sessions from a low-end computer
   were sufficient to overwhelm a router's Neighbor Discovery process
   and thereby render forwarding unusable.

   This behavior has been observed across multiple platforms and
   implementations.

5.  Neighbor Discovery Overview

   When a packet arrives at (or is generated by) a router for a
   destination on an attached link, the router needs to determine the
   correct link-layer address to send the packet to. The router checks
   the Neighbor Cache for an existing Neighbor Cache Entry for the
   neighbor, and if none exists, invokes the address resolution portions
   of the IPv6 Neighbor Discovery [RFC4861] protocol to determine the
   link-layer address.

   Section 5.2 of [RFC4861] (Conceptual Sending Algorithm) outlines how
   this process works.
   A very high-level summary is that the device creates a new Neighbor
   Cache Entry for the neighbor, sets the state to INCOMPLETE, queues
   the packet, and initiates the actual address resolution process. The
   device then sends out one or more Neighbor Solicitations, and when it
   receives a corresponding Neighbor Advertisement, completes the
   Neighbor Cache Entry and sends the queued packet.

6.  Operational Mitigation Options

   This section provides some feasible mitigation options that can be
   employed today by network operators in order to protect network
   availability while vendors implement more effective protection
   measures. Admittedly, some of these options are "kludges" and are
   operationally difficult to manage. They are presented because they
   represent the options currently available. It is each operator's
   responsibility to evaluate and understand the impact of changes to
   their network due to these measures.

6.1.  Filtering of unused address space.

   The DoS condition is induced by making a router try to resolve
   addresses on the subnet at a high rate. By carefully addressing
   machines into a small portion of a subnet (such as the lowest
   numbered addresses), it is possible to filter access to addresses not
   in that portion. This will prevent the attacker from making the
   router attempt to resolve unused addresses. For example, if there are
   only 50 hosts connected to an interface, you may be able to filter
   any address above the first 64 addresses of that subnet by
   null-routing the subnet and carrying a more specific /122 route for
   the populated portion.

   As mentioned at the beginning of this section, it is fully understood
   that this is ugly (and difficult to manage); but failing other
   options, it may be a useful technique, especially when responding to
   an attack.
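
   As a hedged illustration of the arithmetic behind the 50-host example
   in this section, the following Python sketch uses the standard
   ipaddress module to confirm that a /122 covers exactly 64 addresses
   and to test whether a destination falls inside the populated range.
   The prefix 2001:db8:1:2::/64 and the helper is_permitted are
   hypothetical, chosen only for illustration; this is not router
   configuration and does not describe any particular implementation.

```python
import ipaddress

# Hypothetical /64 taken from the IPv6 documentation prefix 2001:db8::/32.
subnet = ipaddress.ip_network("2001:db8:1:2::/64")

# Lowest-numbered /122 inside the /64. A /122 leaves 128 - 122 = 6 host
# bits, so it covers 2**6 = 64 addresses. subnets() is a lazy generator,
# so taking only the first element is cheap.
populated = next(subnet.subnets(new_prefix=122))


def is_permitted(addr: str) -> bool:
    """Return True if addr lies in the populated /122 (illustrative helper)."""
    return ipaddress.ip_address(addr) in populated


if __name__ == "__main__":
    print(populated, populated.num_addresses)
    print(is_permitted("2001:db8:1:2::32"))  # host number 0x32 = 50
    print(is_permitted("2001:db8:1:2::40"))  # host number 0x40 = 64
```

   Under these assumptions, traffic to any address for which
   is_permitted returns False would match only the null route and would
   never reach the address resolution queue.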

   This solution requires that the hosts be statically or statefully
   addressed (as is often done in a datacenter) and may not interact
   well with networks using stateless address autoconfiguration
   [RFC4862].

6.2.  Appropriate Subnet Sizing.

   By sizing subnets to reflect the number of addresses actually in use,
   the problem can be avoided. For example, [RFC6164] recommends sizing
   the subnets for inter-router links to have only 2 addresses (a /127).
   It is worth noting that this practice is common in IPv4 networks,
   partly to protect against the harmful effects of ARP flooding
   attacks.

6.3.  Routing Mitigation.

   One very effective technique is to route the subnet to a discard
   interface (most modern router platforms can discard traffic in
   hardware / the forwarding plane) and then have individual hosts
   announce routes for their IP addresses into the network (or use some
   other method to inject much more specific routes into the local
   routing domain). For example, the network 2001:db8:1:2::/64 could be
   routed to a discard interface on "border" routers, and then
   individual hosts could announce 2001:db8:1:2:3::10/128 and
   2001:db8:1:2:3::66/128 into the IGP. This is typically done by having
   the IP address bound to a virtual interface on the host (for example,
   the loopback interface), enabling IP forwarding on the host, and
   having it run a routing daemon. For obvious reasons, host
   participation in the IGP makes many operators uncomfortable, but it
   can be a very powerful technique if used in a disciplined and
   controlled manner.

6.4.  Tuning of the NDP Queue Rate Limit.

   Many implementations provide a means to control the rate of
   resolution of unknown addresses. By tuning this rate, it may be
   possible to ameliorate the issue, although, as with most tuning knobs
   (especially those that deal with rate limiting), you may be
   "completing the attack".
   By excessively lowering this rate you may negatively impact how long
   the device takes to learn new addresses under normal conditions (for
   example, after clearing the Neighbor Cache or when the router first
   boots) and, under attack conditions, you may become unable to resolve
   "legitimate" addresses sooner than if you had just left the knob
   alone.

   It is worth noting that this technique is only worth investigating if
   the device has separate queues for resolution of unknown addresses
   and for maintenance of existing entries.

7.  Recommendations for Implementors.

   This section provides some recommendations to implementors of IPv6
   Neighbor Discovery.

   At a high level, implementors should program defensively. That is,
   they should assume that intruders will attempt to exploit
   implementation weaknesses, and should ensure that implementations are
   robust to various attacks. In the case of Neighbor Discovery, the
   following general considerations apply:

   Manage Resources Explicitly - Resources such as processor cycles,
      memory, etc. are never infinite, yet with IPv6's large subnets it
      is easy to cause NDP to generate large numbers of address
      resolution requests for non-existent destinations.
      Implementations need to limit resources devoted to processing
      Neighbor Discovery requests in a thoughtful manner.

   Prioritize - Some NDP requests are more important than others. For
      example, when resources are limited, responding to Neighbor
      Solicitations for one's own address is more important than
      initiating address resolution requests that create new entries.
      Likewise, performing Neighbor Unreachability Detection, which by
      definition is only invoked on destinations that are actively being
      used, is more important than creating new entries for possibly
      non-existent neighbors.

7.1.  Prioritize NDP Activities

   Not all Neighbor Discovery activities are equally important.
   Specifically, requests to perform large numbers of address
   resolutions for non-existent Neighbor Cache Entries should not come
   at the expense of servicing requests related to keeping existing,
   in-use entries properly up-to-date. Thus, implementations should
   divide work activities into categories having different priorities.
   The following gives examples of different activities and their
   importance in rough priority order.

   1.  It is critical to respond to Neighbor Solicitations for one's own
       address, especially when the node is a router. Whether for
       address resolution or Neighbor Unreachability Detection, failure
       to respond to Neighbor Solicitations results in immediate
       problems. Failure to respond to NS requests that are part of NUD
       can cause neighbors to delete the NCE for that address, and will
       result in follow-up NS messages using multicast. Once an entry
       has been flushed, existing traffic for destinations using that
       entry can no longer be forwarded until address resolution
       completes successfully. In other words, not responding to NS
       messages further increases the NDP load, and causes on-going
       communication to fail.

   2.  It is critical to revalidate one's own existing NCEs in need of
       refresh. As part of NUD, ND is required to frequently revalidate
       existing, in-use entries. Failure to do so can result in the
       entry being discarded. For in-use entries, discarding the entry
       will almost certainly result in a subsequent request to perform
       address resolution on the entry, but this time using multicast.
       As above, once the entry has been flushed, existing traffic for
       destinations using that entry can no longer be forwarded until
       address resolution completes successfully.

   3.  To maintain the stability of the control plane, Neighbor
       Discovery activity related to traffic sourced by the router (as
       opposed to traffic being forwarded by the router) should be given
       high priority.
       Whenever network problems occur, debugging and making other
       operational changes requires being able to query and access the
       router. In addition, routing protocols may begin to react
       (negatively) to perceived connectivity problems, causing
       additional undesirable ripple effects.

   4.  Activities related to the sending and receiving of Router
       Advertisements also impact address resolution. [XXX say more?]

   5.  Traffic to unknown addresses should be given lowest priority.
       Indeed, it may be useful to distinguish between "never seen"
       addresses and those that have been seen before, but that do not
       have a corresponding NCE. Specifically, the conceptual processing
       algorithm in IPv6 Neighbor Discovery [RFC4861] calls for deleting
       NCEs under certain conditions. Rather than delete them
       completely, however, it might be useful to at least keep track of
       the fact that an entry at one time existed, in order to
       prioritize address resolution requests for such neighbors over
       those for neighbors that have never been seen before.

7.2.  Queue Tuning.

   On implementations in which requests to NDP are submitted via a
   single queue, router vendors SHOULD provide operators with means to
   control both the rate of link-layer address resolution requests
   placed into the queue and the size of the queue. This will allow
   operators to tune Neighbor Discovery for their specific environment.
   The ability to set per-interface or per-subnet queue limits at a rate
   below that of the global queue limit might confine the damage to the
   Neighbor Discovery process to the targeted network.

   Setting those values is a very careful balancing act: the lower the
   rate of entry into the queue, the less load there will be on the ND
   process; however, it also means that it will take the router longer
   to learn legitimate destinations.
   In a datacenter with 6,000 hosts attached to a single router, setting
   that rate below 1,000 requests per second would mean that resolving
   all of the addresses from an initial state (or after an event that
   invalidates the address cache, such as an STP TCN) may take more than
   6 seconds. Similarly, the smaller the queue, the higher the
   likelihood of an attack being able to knock out legitimate traffic
   (but the lower the memory utilization on the router).

8.  IANA Considerations

   No IANA resources or considerations are requested in this draft.

9.  Security Considerations

   This document outlines mitigation options that operators can use to
   protect themselves from Denial of Service attacks. Implementation
   advice to router vendors aimed at ameliorating known problems carries
   the risk of previously unforeseen consequences. It is not believed
   that these techniques create additional security or DoS exposure.

10.  Acknowledgements

   The authors would like to thank Ron Bonica, Troy Bonin, John Jason
   Brzozowski, Randy Bush, Vint Cerf, Jason Fesler, Erik Kline, Jared
   Mauch, Chris Morrow and Suran De Silva. Special thanks to Thomas
   Narten for detailed review and (even more so) for providing text!

   Apologies to anyone we may have missed; it was not intentional.

11.  References

11.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC4398]  Josefsson, S., "Storing Certificates in the Domain Name
              System (DNS)", RFC 4398, March 2006.

   [RFC4861]  Narten, T., Nordmark, E., Simpson, W., and H. Soliman,
              "Neighbor Discovery for IP version 6 (IPv6)", RFC 4861,
              September 2007.

   [RFC4862]  Thomson, S., Narten, T., and T. Jinmei, "IPv6 Stateless
              Address Autoconfiguration", RFC 4862, September 2007.

   [RFC6164]  Kohno, M., Nitzan, B., Bush, R., Matsuzaki, Y., Colitti,
              L., and T. Narten, "Using 127-Bit IPv6 Prefixes on
              Inter-Router Links", RFC 6164, April 2011.

11.2.  Informative References

   [RFC4255]  Schlyter, J. and W. Griffin, "Using DNS to Securely
              Publish Secure Shell (SSH) Key Fingerprints", RFC 4255,
              January 2006.

Appendix A.  Text goes here.

   TBD

Authors' Addresses

   Warren Kumari
   Google

   Email: warren@kumari.net


   Igor
   Yahoo!
   45 W 18th St
   New York, NY
   USA

   Email: igor@yahoo-inc.com


   Joel
   Zynga
   111 Evelyn
   Sunnyvale, CA
   USA

   Email: jjaeggli@zynga.com