DNS Extensions Working Group                               W. Wijngaards
Internet-Draft                                                NLnet Labs
Intended status: Informational                         February 24, 2009
Expires: August 28, 2009


                       Resolver side mitigations
          draft-wijngaards-dnsext-resolver-side-mitigation-01

Status of This Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on August 28, 2009.

Copyright Notice

   Copyright (c) 2009 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.

Abstract

   This document describes a set of mitigations that stop the known
   variations of the Kaminsky cache poisoning attacks against the DNS
   system, for which only resolver side deployment is necessary.

Table of Contents

   1. Introduction
   2. Criteria
   3. Mitigations
      3.1. Add Entropy
      3.2. Use Care with the Cache
      3.3. Obtain Authoritative Data
      3.4. Detection
   4. Variants to Protect against
   5. Security Considerations
   6. IANA Considerations
   7. Acknowledgments
   8. Informative References

1. Introduction

   [WW: These are the countermeasures for the Kaminsky attack scenarios
   that I envision for the Unbound resolver (http://unbound.net). These
   are countermeasures that require resolver side deployment only.
   Depending on working group input this document could remain an
   Unbound specific information document or can be made more generic,
   and move towards a BCP.]

   This document describes the mitigations that a resolver can deploy on
   its own in the meantime, while a more comprehensive (read: DNSSEC)
   solution is being rolled out. For countermeasures that require
   changes to authoritative and recursive servers everywhere, DNSSEC
   provides the most protection, followed by Nonce-based approaches
   (e.g. EDNS PING), followed by transport protocol games. Because
   Unbound implements DNSSEC validation already, and DNSSEC provides the
   most protection (e.g. against new unknown variations and also against
   full man-in-the-middle attacks), this is a good long term choice.

   The solutions covered in this document aim to cover all of the
   variations in the recent Kaminsky-style attacks. However, it seems
   likely that other variations besides the ones described in this
   document are going to be discovered.
   For that reason a number of generic protections are included; chief
   amongst those is the use of extra entropy.

   Since this document focuses on Unbound it is worth noting that
   although current versions implement these mitigations, they are not
   all turned on by default. Unbound should support the mitigations
   considered 'best' by the community. This means no weird,
   ill-considered mitigations of its own. Hence this document.

   It is assumed the reader is aware of, and implementing, the
   forgery-resilience [RFC5452] recommendations.

   In Section 2 the criteria are listed. In Section 3 the various
   measures that can be used to mitigate threats are described. Section
   4 enumerates Kaminsky-style attack variations, and shows what
   measures provide protection against each one of them. Section 5
   discusses consequences caused by the mitigations.

2. Criteria

   The first and foremost criterion is that these are resolver side
   solutions, thus only the resolver needs to be redeployed, or the
   software updated, for this to work. The reason behind this is that a
   short term deployment is possible. The idea is to provide some
   (partial) protection in the short term. In the long term it is
   possible to redeploy both authority servers and recursors, and the
   solution space is greatly increased (e.g. options range from EDNS
   PING, using TCP or SCTP, to DNSSEC deployment).

   Many solutions in this document could also be used in stub resolvers.
   Stub resolvers are not mentioned specifically further on; the main
   focus is on the caching recursive server.

   The solutions have to follow the DNS protocol.

   The solutions have to be non-disruptive and not anti-social.
   Specifically, they must not impose the costs of the solution on third
   parties.
   For example, large scale fallback to TCP both uses a limited resource
   (TCP connections to authority servers) and disrupts deployment behind
   many middle boxes.

   Solutions without an 'attack mode' are preferred. An 'attack mode'
   is a different state of behaviour that the resolver enters after
   something anomalous is detected. It may be for only a subset of
   operations or only a limited time. One reason to avoid such modal
   design is that paranoia dictates that maximal protection should
   always be used. A second reason is that if a protection measure
   cannot be used always, it is likely to be disruptive (see above).
   Such an 'attack mode' complicates implementation, testing and
   especially security analysis.

3. Mitigations

   Below, the resolver side mitigations are described.

3.1. Add Entropy

   The mitigations in this section increase the transaction entropy
   above the 16 bits in the ID number. This is close to the
   forgery-resilience [RFC5452] text; the differences are in the RTT
   banding text and the 0x20 consideration.

   o  port randomisation

      Use as many ports as possible; using only 1000 or 2000 ports (as
      some commercial DNS products do) is not enough. A range of 59000
      port numbers (15.8 bits) can be usefully achieved. This causes
      operational problems (NAT boxes using predictable port numbers),
      portability problems (bugs, features not available), and volume
      problems (every port number in use consumes a limited resource).

   o  0x20

      This breaks queries to some authorities, but more than 99.9% work.
      It is like a proposal that needs authority server deployment where
      the authority servers are already deployed to a large extent.
      [I-D.vixie-dnsext-dns0x20].

   o  RTT banding

      RTT banding refers to the method of picking a random nameserver
      for the query out of the set of nameservers that are within an RTT
      band (say at most 200 msec slower) from the fastest nameserver.

      New attack opportunities can be created by sending a new fake
      question to be resolved by the resolver. Therefore the actual
      size of the roundtrip time window is not as important as the
      additional entropy gained by selecting randomly from a set of
      servers.

   o  IPv4 - IPv6

      When both IPv4 and IPv6 are available, the protocol can be chosen
      randomly together with RTT banding to provide more entropy.

   o  source address randomisation

      If the resolver has multiple public IP addresses, the source
      address of each query can be chosen randomly among them.

   If all the above entropy settings are in use, it is estimated that
   Unbound can provide about 44 bits of entropy (16 ID bits, 15.8 port
   bits, about 8 0x20 bits, about 2 RTT banding + protocol bits and
   about 2 source address bits). Without user configuration or queries
   amenable to 0x20, 34 bits of entropy are likely, or even 18 if a NAT
   box kills the port randomisation. Entropy thus provides only limited
   protection.

3.2. Use Care with the Cache

   o  RFC 2181 adherence

      This means that RRsets are ranked in trustworthiness depending on
      whether they come from the answer section, or from another part of
      the message. The authoritative answers are preferred. [RFC2181]

      In addition, do not give data obtained from authority or
      additional sections in answer sections to clients.

   o  CNAME chain

      Only use the first entry in the answer section. Perform new
      lookups for the remainder.

   o  DNAME chain

      Only use the first DNAME entry and its synthesized CNAME from the
      answer section. Perform new lookups for the remainder.

   o  no DNAME from cache

      Do not pick a DNAME RR out of the cache for a query for which that
      DNAME RR was not returned. Thus, a DNAME is only used for query
      names for which answers have been received from the authority
      server.

      When the DNAME is signed with DNSSEC, it is allowed to synthesize
      new CNAMEs from it to answer new queries.
      This is because the zone owner whose zone is redirected is signing
      away his own zone.

3.3. Obtain Authoritative Data

   o  Authority query for NS after referral

      The idea is to obtain authoritative data for the NS RRset instead
      of using data tacked onto another message. Care must be taken to
      avoid DoSing parent nameservers, and not to break resolution in
      common cases where the NS RRsets in parent and child differ.

      On a referral, the data from the referral may be used to continue
      answering the current query, but it is not stored in the cache.
      If the question equals the referred zone name and has qtype NS,
      then the NS RRset from the referral does get stored in the cache.

      If the question is not that already, a new lookup is performed for
      the referred zone name with qtype NS. The results from that
      lookup are cached normally. The lookup has to start at a parent
      of the referred zone, so that a new referral is obtained.

      The upshot is that RFC 2181 adherence pins the NS RRset data in
      the cache because it is seen in the answer section, and tacked-on
      data from other messages is ignored until the TTL expires. It
      should be noted that most infrastructure TTLs for NS records are
      very large.

      It does not break existing disjoint RRsets, or servers that do not
      answer for qtype NS at all, or servers that are offline, because
      the referral is cached when making the qtype NS query. This is
      why the qtype NS query has to be made in such a way that it
      elicits a fresh referral from the parent server. This gives a
      once per TTL opportunity for spoofing the referral.

      The NS RRset answered from the child side of the zone cut
      overrides the NS RRset picked up from the referral. This causes
      the same data to be used as today, where the authority section NS
      set sent along by the child server overrides the NS set seen from
      the referral.

      Additional queries are sent for this solution. This increases
      resolver and authority server load and bandwidth usage.

   o  Authority queries for nameserver addresses, A and AAAA

      The same idea as the NS query above: ask for the A or AAAA records
      directly at the authoritative server. It is not necessary to
      elicit the referral again; the query can be directed at the best
      server.

      Additional queries are sent for this solution. This increases
      resolver and authority server load and bandwidth usage.

   A bonus when using the above methods to obtain authoritative data is
   that when using DNSSEC, the data can be validated, and thus spoofed
   infrastructure data can be detected and handled appropriately. This
   protects DNSSEC, where the referral contains unsigned NS, A and AAAA
   records, from spoofed infrastructure data. Of course, DNSSEC is
   designed to protect end-user data anyway, whether or not the referral
   data was poisoned. It simply adds the opportunity to add another
   layer of defense.

3.4. Detection

   o  trouble counter

      This is a simple detection method. It counts all packets that
      were not asked for. The only thing noted about the packet is that
      it is a query reply (QR bit) and was not asked for.

      This may show false positives due to UDP packet duplicates and
      delayed responses (delayed for longer than the implementation
      cares to keep track of what it asks for). The expectation is that
      false positives are few. Conversely, some unasked for packets may
      not be noticed because the implementation may not be listening on
      particular ports, or because of other implementation choices.

      When a particular threshold is met, the cache is wiped clean.

      The threshold is set so that denial of service does not become all
      that much easier, and that false positives do not (often) result
      in cache wipes. A threshold in the range of 10 million is
      proposed.
      That many packets is already a sizable denial of service attack in
      itself. In addition, the amount of data sent then approaches the
      cache size of the resolver, which keeps the amplification towards
      the authority servers low.

      Since this mitigation is meant to protect against hitherto unknown
      variations, it does not help to examine the packets any further
      than the QR bit (and the fact that they were not used for regular
      processing).

      The result of this is that the probability that there is a
      poisoned item present in the cache is capped at some maximum. The
      exact value depends on the entropy per message and the threshold.

4. Variants to Protect against

   In the descriptions below a short title is given to quickly summarize
   the exploit. The query 'q:' is what the attacker sends as a fake
   question for the resolver to answer. The answer, authority 'auth:'
   and additional 'add:' sections list the content that the spoofer
   provides. The mitigation strategy, and sometimes discussion, is
   provided in the 'protected:' line.

   The real target is example.com or www.example.com or ns1.example.com,
   which is the real nameserver for example.com here. The domain
   evil.example.net is under control of the attacker and
   192.0.2.66(evil) is an IP address under control of the attacker. The
   label 'bad123' is used in place of a label that the attacker varies
   on every attempt to obtain new spoofing windows.

   Glue with new DNS server
      q: bad123.example.com.
      answer: bad123.example.com. A whatever
      auth: example.com. NS evil.example.com.
      add: evil.example.com. A 192.0.2.66(evil)
      protected: 2181 adherence plus NS record pinned by NS query.
      Also name error or no data answers could be used, instead of
      this answer section.

   Glue for DNS server
      q: bad123.example.com.
      answer: bad123.example.com. A whatever
      auth: example.com. NS ns1.example.com. (normal entry)
      add: ns1.example.com. A 192.0.2.66(evil)
      protected: 2181 adherence plus NS record pinned by NS query,
      plus A record pinned by glue query.
      Also name error or no data answers could be used, instead of
      this answer section.

   Glue for Web server
      q: bad123.example.com.
      answer: bad123.example.com. A whatever
      auth: example.com. NS www.example.com.
      add: www.example.com. A 192.0.2.66(evil)
      protected: 2181 adherence plus NS record pinned by NS query.

   Glue smaller
      q: bad123.example.com.
      answer: bad123.example.com. A 192.0.2.66(evil)
      auth: example.com. NS bad123.example.com.
      protected: 2181 adherence plus NS record pinned by NS query.

   NS change
      q: bad123.example.com.
      answer: bad123.example.com. A whatever
      auth: example.com. NS evil.example.net.
      protected: 2181 adherence plus NS record pinned by NS query.

   NS server migration
      q: bad123.example.com.
      answer: bad123.example.com. A whatever
      auth: example.com. NS ns1.example.com. (normal entry)
      auth: example.com. NS ns2.example.com.evil.example.net.
      (evil, looks like typo in server migration)
      protected: 2181 adherence plus NS record pinned by NS query.

   CNAME
      q: bad123.example.com.
      answer: bad123.example.com. CNAME www.example.com.
      answer: www.example.com. A 192.0.2.66(evil)
      protected: CNAME chain cutoff.

   DNAME one message
      q: www.bad123.example.com.
      answer: bad123.example.com. DNAME example.com.
      answer: www.bad123.example.com. CNAME www.example.com.
      answer: www.example.com. A 192.0.2.66(evil)
      protected: DNAME chain cutoff.

   DNAME whole zone
      q: bad123.example.com.
      answer: example.com. DNAME evil.example.net.
      answer: bad123.example.com. CNAME bad123.evil.example.net.
      answer: bad123.evil.example.net. A whatever
      protected: no DNAME from cache.

   New Delegation - rigged
      q: bad123.www.example.com.
      answer: (empty)
      auth: www.example.com. NS www.example.com.
      add: www.example.com. A 192.0.2.66(evil)
      protected: the NS queries that ask referral confirmation
      together with glue queries.

   New Delegation - looks normal
      q: bad123.www.example.com.
      answer: (empty)
      auth: www.example.com. NS ns1.evil.example.net.
      auth: www.example.com. NS ns2.evil.example.net.
      protected: the NS queries that ask referral confirmation
      together with glue queries.

   New Delegation - for glue
      q: bad123.example.com.
      answer: (empty)
      auth: bad123.example.com. NS ns1.example.com.
      add: ns1.example.com. A 192.0.2.66(evil)
      protected: 2181 adherence.

   Another hitherto unknown variation
      These are a lot of variations and it is very likely that other
      people can come up with better, different ideas.
      protected: by entropy measures, by the count-and-wipe measure.
      Long term solutions (PING, TCP, DNSSEC) also aim to protect
      against these much more thoroughly.

5. Security Considerations

   All of the mitigations aim to provide more security. However,
   several of these mitigations have adverse effects on performance and
   bandwidth usage.

   The CNAME, DNAME, NS and nameserver address mitigations all require
   that additional lookups be performed. The CNAME and DNAME target
   lookups cause the answer to the client to be delayed. The NS set and
   nameserver address lookups cause a higher load on both authority and
   resolver servers.

   The detection mechanism is susceptible to denial of service attacks.
   A small, calculated, amount of additional DoS leverage is provided.
   This changes some spoof attacks into a denial of service.

   The NS set and nameserver address lookups cause the NS, A and AAAA
   RRsets to be pinned in the cache until the TTL expires. This
   provides cache overwriting protection, but at the cost of not picking
   up updates to these RRsets in the course of normal resolution.
   Changes to these RRsets are then no longer seen on the next query,
   but only after the TTL times out. This adversely affects the
   coherency of the DNS server infrastructure, as it becomes more likely
   that resolvers operate using out of date nameserver data.

6. IANA Considerations

   None.

7. Acknowledgments

   Thanks to Nicholas Weaver (ICSI Berkeley) and Olaf Kolkman (NLnet
   Labs).

8. Informative References

   [I-D.vixie-dnsext-dns0x20]
              Vixie, P. and D. Dagon, "Use of Bit 0x20 in DNS Labels to
              Improve Transaction Identity",
              draft-vixie-dnsext-dns0x20-00 (work in progress),
              March 2008.

   [RFC2181]  Elz, R. and R. Bush, "Clarifications to the DNS
              Specification", RFC 2181, July 1997.

   [RFC5452]  Hubert, A. and R. van Mook, "Measures for Making DNS More
              Resilient against Forged Answers", RFC 5452, January 2009.

Author's Address

   Wouter Wijngaards
   NLnet Labs
   Science Park 140
   Amsterdam 1098 XG
   The Netherlands

   Phone: +31-20-888-4551
   EMail: wouter@nlnetlabs.nl
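
   As an illustration of the entropy estimate in Section 3.1, the
   following Python sketch adds up the bits contributed by each
   independent random choice, and shows RTT banding as a random pick
   among the servers within a band of the fastest one. The function
   names, the assumption of four server/protocol alternatives and four
   source addresses (2 bits each), and the example RTT values are
   hypothetical and chosen for this sketch only; they are not taken
   from Unbound.

```python
import math
import random


def entropy_bits(ports=59000, case_letters=8, server_choices=4,
                 source_addresses=4, id_bits=16):
    """Total bits of entropy, assuming all choices are independent and uniform."""
    return (id_bits
            + math.log2(ports)              # source port randomisation
            + case_letters                  # 0x20: one bit per case-randomised letter
            + math.log2(server_choices)     # RTT banding plus IPv4/IPv6 choice
            + math.log2(source_addresses))  # source address randomisation


def pick_server(rtts_ms, band_ms=200, rng=random):
    """RTT banding: random server among those within band_ms of the fastest."""
    fastest = min(rtts_ms.values())
    band = [name for name, rtt in rtts_ms.items() if rtt <= fastest + band_ms]
    return rng.choice(band)


# All measures enabled (assumed parameters, see above).
full = entropy_bits()
# No 0x20-amenable letters and a single source address.
no_config = entropy_bits(case_letters=0, source_addresses=1)
# Additionally, a NAT box that makes the source port predictable.
behind_nat = entropy_bits(ports=1, case_letters=0, source_addresses=1)

print(round(full), round(no_config), round(behind_nat))

# Hypothetical RTT measurements in milliseconds; ns3 falls outside the band.
print(pick_server({"ns1": 10, "ns2": 150, "ns3": 500}))
```

   Under these assumptions the three totals come to roughly 44, 34 and
   18 bits, which matches the estimate given in Section 3.1.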