Network Working Group                                     W. Kumari, Ed.
Internet-Draft                                                    Google
Intended status: Informational                           P. Hoffman, Ed.
Expires: January 4, 2015                                  VPN Consortium
                                                            July 3, 2014


                   Securely Distributing the DNS Root
                    draft-wkumari-dnsop-dist-root-01

Abstract

   This document recommends that recursive DNS resolvers get copies of
   the root zone, validate it using DNSSEC, populate their caches with
   the information, and also give negative responses from the validated
   zone.

   [[ Note: This document is largely a discussion starting point. ]]

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.


   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on January 4, 2015.

Copyright Notice

   Copyright (c) 2014 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Requirements notation
   2.  Requirements
   3.  Open Question: How Should the Root Zone Be Distributed?
   4.  Open Question: Should Responses Have the AA Bit Set?
   5.  Pros and Cons of this Technique
     5.1.  Pros
     5.2.  Cons
   6.  IANA Considerations
   7.  Security Considerations
   8.  Acknowledgements
   9.  Contributors
   10. Normative References
   Authors' Addresses

1.  Introduction

   One of the main advantages of a DNSSEC-signed root zone is that it
   doesn't matter where you get the data from, as long as you validate
   the contents of the zone using DNSSEC information. From that point
   on, you know all of the contents of the root zone at the time that
   you retrieved and validated the zone.

   When a typical recursive resolver starts up, it has an empty cache
   and the addresses of the root servers. As it begins answering
   queries, it populates its cache by making a number of queries to the
   set of root servers and caching the results. All queries for root
   zone names that come to the recursive resolver that are not in either
   its positive or negative cache are sent to one of the root servers.
   As a result, a large number of the queries that hit the root are
   so-called "junk" queries, such as queries for second-level domains in
   non-existent TLDs.

   This document describes a mechanism to populate caches in recursive
   resolvers with the verified contents of the full root zone so that
   the recursive resolvers have all of the root zone content cached.
   This technique can be viewed as pre-populating a resolver's cache
   with the root zone information by retrieving a signed copy of the
   root zone and verifying the contents.

   The two goals of this mechanism are to provide faster negative
   responses to junk queries from stub resolvers, and to reduce the
   number of junk queries sent to the root servers. The mechanism has
   other minor advantages, but those two are the focus of this document.

1.1.  Requirements notation

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

2.  Requirements

   In the discussion below, the term "legacy operation" means the way
   that a recursive resolver acts when it is not using the mechanism
   described in this document, namely as a normal validating recursive
   resolver with no other special features.

   In order to implement the mechanism described in this document, a
   recursive resolver MUST support DNSSEC, and MUST have an up-to-date
   copy of the DNS root key.

   A recursive resolver using this mechanism MUST follow these steps at
   startup or after clearing its cache:

   1.  The resolver determines the list of root zone delivery servers.
       The delivery mechanism is not yet defined in this document, and
       some possible options for it are described in Section 3.

   2.  The resolver SHOULD randomly sort the list of zone delivery
       servers so that all the servers get a fairly even distribution of
       queries.

   3.  The resolver SHOULD attempt to transfer the signed root zone
       using the transfer protocol from each one of the servers until
       either success is achieved or the list has been exhausted. The
       resolver MAY attempt to transfer in parallel to minimize startup
       latency. If the root zone cannot be transferred, the resolver
       logs this as an error, and MUST fall back to legacy operation.

   4.  The resolver MUST validate the records in the zone using DNSSEC.
       If any of the records do not validate, the resolver MUST discard
       all records received, MUST log an error, and SHOULD try the next
       server in the list. If no transferred copy of the root zone can
       be validated, the resolver logs this as an error, and falls back
       to legacy operation. Note that the resolver MUST validate all of
       the zone contents, and MUST NOT start using the new contents
       until all have been validated; the resolver MUST NOT use "lazy
       validation". This means that the addition of the zone data MUST
       be an atomic operation.

   The resolver MAY store the contents of the validated root zone to
   disk. If the resolver has a stored copy of the root zone, and the
   data in the zone is not expired, and that copy was written within the
   refresh time listed in the zone, the resolver MAY load that zone
   instead of transferring.

   Once the resolver has transferred and validated the zone, it MUST
   attempt to keep its copy of the root zone up to date. This includes
   following the refresh, retry, expire logic, with certain
   modifications:

   o  If the zone expires (for example, because it cannot retransfer
      because of blocked TCP connections), the resolver MUST fall back
      to legacy operation and MUST log an error. It MUST NOT return
      SERVFAIL to queries only due to its copy of the root zone being
      expired.

   o  The resolver MUST validate the contents of the records in the zone
      using DNSSEC for every transfer. The resolver SHOULD try
      alternate servers if the validation fails. If the resolver is
      unable to transfer a copy of the zone that validates, it MUST
      treat this as an error, MUST discard the received records, and
      MUST fall back to legacy operation. Note that the resolver MUST
      validate all of the zone contents, and MUST NOT start using the
      new contents until all have been validated; the resolver MUST NOT
      use "lazy validation". This means that the replacement of the
      current zone data MUST be an atomic operation.

   o  The resolver SHOULD attempt to restart this process at every retry
      interval for the root zone.

   o  The resolver MUST set the AD bit on responses to queries for
      records in the root zone. This action is the same as if it had
      inserted the entry into its cache through a "normal" query that
      received a DNSSEC-validated answer.

   o  The resolver MUST set the TTL on responses in the same fashion as
      it would in legacy operation. The difference here is that, when
      the TTL times out, instead of fetching the new answer from the
      root, the resolver simply restarts the TTL at the maximum listed
      in the root zone.

   Compliant nameserver software MUST include an option to securely
   cache the root zone (an example name for this option could be
   "transfer-and-validate-root [yes|no]"). That is, the mechanism
   described in this document MUST be optional, and the cache operator
   MUST be able to turn it off and on.

3.  Open Question: How Should the Root Zone Be Distributed?

   The signed root zone can be distributed over almost any protocol.
   Because the zone is signed, the distribution protocol does not need
   to be authenticated. Suggestions for the distribution mechanism
   include:

      AXFR zone transfer within the DNS

      HTTP, most likely with appropriately-tuned caching

      FTP

      [[ Others... ]]

   Note that with any of these methods, the zone does not need to be
   transferred from the root servers themselves. Instead, a simple
   discovery mechanism can be built into the protocol that lets a
   recursive resolver discover where there are servers that will let it
   transfer the root zone.

4.  Open Question: Should Responses Have the AA Bit Set?

   A recursive resolver that has a securely validated copy of the root
   can be thought of in at least two ways: as a smarter cache, or as a
   pseudo-slave server for the root. This section discusses the
   ramifications of those two choices. In both scenarios, the resolver
   will send back NXDOMAIN responses for junk queries without sending
   queries to the root and the resolver will set the AD bit on the
   responses. However, the two scenarios differ in whether or not the
   responses have the AA bit set.

   A smarter cache does not set the AA bit. The responses for any query
   for a name in the root or an NXDOMAIN that is being sent because the
   TLD is junk come back with the AD bit set but the AA bit not set,
   just as they would in legacy operation.

   A pseudo-slave to the root sets the AA bit in response to any query
   for a name in the root or an NXDOMAIN that is being sent because the
   TLD is junk. The reason that this is called a pseudo-slave instead
   of a slave is that there is a general expectation that a slave has a
   relationship with the master that would cause the slave to be
   notified of changes in the master with a NOTIFY announcement; that is
   not the case here. It acts as a slave because it knows exactly how
   the master would reply at the time that it retrieved the signed zone,
   but it is a pseudo-slave because the master has no way of alerting it
   of changes.

   The advantage of a recursive resolver acting as a pseudo-slave is
   that other resolvers that demand authoritative answers can ask it for
   those. However, there are few scenarios in which those demanding
   resolvers exist. The disadvantage of a recursive resolver acting as
   a pseudo-slave is that there is no way to signal that it is a
   pseudo-slave and not a real slave. Thus, someone seeing the AA bit
   set might think that the resolver is a real slave. This opens the
   can of worms about trusting the settings of the AA and AD bits in
   responses.

5.  Pros and Cons of this Technique

   This is primarily a tracking / discussion section, and the text is
   kept even looser than in the rest of this doc. These are not
   ordered.

5.1.  Pros

   o  Junk queries / negative caching - Currently, a significant number
      of queries to the root servers are "junk" queries. Many of these
      queries are for TLDs that do not (and may never) exist in the
      root. Another significant source of junk is queries where the
      negative TLD answer did not get cached because the queries are for
      second-level domains (a negative cache entry for "foo.example"
      will not cover a subsequent query for "bar.example").

   o  DoS against the root service - By distributing the contents of the
      root to many recursive resolvers, the DoS protection for customers
      of the root servers is significantly increased. A DDoS may still
      be able to take down some recursive servers, but there is much
      more root service infrastructure to attack in order to be
      effective. Of course, there is still a zone distribution system
      that could be attacked (but it would need to be kept down for a
      much longer time to cause significant damage, and so far the root
      has stood up just fine to DDoS).

   o  Small increase to privacy of requests - This also removes a place
      where attackers could collect information. Although query name
      minimization also achieves some of this, it does still leak the
      TLDs that people behind a resolver are querying for, which may in
      itself be a concern (for example someone in a homophobic country
      who is querying for a name in .gay).

5.2.  Cons

   o  Loss of agility in making root zone changes - Currently, if there
      is an error in the root zone (or someone needs to make an
      emergency change), a new root zone can be created, and the root
      server operators can be notified and start serving the new zone
      quickly. Of course, this does not invalidate the bad information
      in (long TTL) cached answers. Notifying every recursive resolver
      is not feasible. Currently, an "oops" in the root zone will be
      cached for the TTL of the record by some percentage of servers.
      Using the technique described above, the information may be cached
      (by the same percentage of servers) for the refresh time + the TTL
      of the record.

   o  No central monitoring point - DNS operators lose the ability to
      monitor the root system. While there is work underway to
      implement better instrumentation of the root server system, this
      (potentially) removes the thing to monitor.

   o  Increased complexity in nameserver software and its operation -
      Any proposal for recursive servers to copy and serve the root
      inherently means more code to write and execute. Note that many
      recursive resolvers are on inexpensive home routers that are
      rarely (if ever) updated.

   o  Changes the nature and distribution of traffic hitting the root
      servers - If all the "good" recursive resolvers deploy root
      copying, then root servers end up servicing only "bad" recursive
      resolvers and attack traffic. The roots (could) become what AS112
      is for RFC1918.

6.  IANA Considerations

   This document requires no action from the IANA.

7.  Security Considerations

   A resolver that uses this mechanism but does not do full DNSSEC
   validation on the data it uses can obviously cause serious security
   issues because it can be fooled into giving wrong answers.

   [[ More? ]]

8.  Acknowledgements

   The editors fully acknowledge that this is not a new concept, and
   that we have chatted with many people about this. If we have spoken
   to you and your name is not listed below, let us know.

9.  Contributors

   The general concept in this document is not new; there have been
   discussions regarding recursive resolvers copying the root zone for
   many years. The fact that the root zone is now signed with DNSSEC
   makes implementing some of these techniques more feasible.

   The following is an unordered list of individuals who have
   contributed text and / or significant discussions to this document.

      Steve Crocker - Shinkuro

      Jaap Akkerhuis - NLnet Labs

      David Conrad - Virtualized, LLC.

      Lars-Johan Liman - Netnod

      Suzanne Woolf - Individual

      Roy Arends - Nominet

      Olaf Kolkman - NLnet Labs

      Danny McPherson - Verisign

      Joe Abley - Dyn

      Jim Martin - ISC

      Jared Mauch - NTT America

      Rob Austein - Dragon Research Labs

      Sam Weiler - Parsons

10.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

Authors' Addresses

   Warren Kumari (editor)
   Google
   1600 Amphitheatre Parkway
   Mountain View, CA 94043
   US

   Email: Warren@kumari.net


   Paul Hoffman (editor)
   VPN Consortium

   Email: paul.hoffman@vpnc.org