2	Network Working Group                                     W. Kumari, Ed.
3	Internet-Draft                                                    Google
4	Intended status: Informational                           P. Hoffman, Ed.
5	Expires: January 4, 2015                                  VPN Consortium
6	                                                            July 3, 2014

8	                   Securely Distributing the DNS Root
9	                    draft-wkumari-dnsop-dist-root-01

11	Abstract

13	   This document recommends that recursive DNS resolvers get copies of
14	   the root zone, validate it using DNSSEC, populate their caches with
15	   the information, and also give negative responses from the validated
16	   zone.

18	   [[ Note: This document is largely a discussion starting point. ]]

20	Status of This Memo

22	   This Internet-Draft is submitted in full conformance with the
23	   provisions of BCP 78 and BCP 79.

25	   Internet-Drafts are working documents of the Internet Engineering
26	   Task Force (IETF).  Note that other groups may also distribute
27	   working documents as Internet-Drafts.  The list of current Internet-
28	   Drafts is at http://datatracker.ietf.org/drafts/current/.

30	   Internet-Drafts are draft documents valid for a maximum of six months
31	   and may be updated, replaced, or obsoleted by other documents at any
32	   time.  It is inappropriate to use Internet-Drafts as reference
33	   material or to cite them other than as "work in progress."

35	   This Internet-Draft will expire on January 4, 2015.

37	Copyright Notice

39	   Copyright (c) 2014 IETF Trust and the persons identified as the
40	   document authors.  All rights reserved.

42	   This document is subject to BCP 78 and the IETF Trust's Legal
43	   Provisions Relating to IETF Documents
44	   (http://trustee.ietf.org/license-info) in effect on the date of
45	   publication of this document.  Please review these documents
46	   carefully, as they describe your rights and restrictions with respect
47	   to this document.  Code Components extracted from this document must
48	   include Simplified BSD License text as described in Section 4.e of
49	   the Trust Legal Provisions and are provided without warranty as
50	   described in the Simplified BSD License.

52	Table of Contents

54	   1.  Introduction
55	     1.1.  Requirements notation
56	   2.  Requirements
57	   3.  Open Question: How Should the Root Zone Be Distributed?
58	   4.  Open Question: Should Responses Have the AA Bit Set?
59	   5.  Pros and Cons of this Technique
60	     5.1.  Pros
61	     5.2.  Cons
62	   6.  IANA Considerations
63	   7.  Security Considerations
64	   8.  Acknowledgements
65	   9.  Contributors
66	   10. Normative References
67	   Authors' Addresses

69	1.  Introduction

71	   One of the main advantages of a DNSSEC-signed root zone is that it
72	   doesn't matter where you get the data from, as long as you validate
73	   the contents of the zone using DNSSEC information.  From that point
74	   on, you know all of the contents of the root zone at the time that
75	   you retrieved and validated the zone.

77	   When a typical recursive resolver starts up, it has an empty cache
78	   and the addresses of the root servers.  As it begins answering
79	   queries, it populates its cache by making a number of queries to the
80	   set of root servers and caching the results.  All queries for root
81	   zone names that reach the recursive resolver and are in neither its
82	   positive nor its negative cache are sent to one of the root servers.
83	   As a result, a large number of the queries that hit the root are
84	   so-called "junk" queries, such as queries for second-level domains in
85	   non-existent TLDs.

87	   This document describes a mechanism to populate caches in
88	   recursive resolvers with the verified contents of the full root zone
89	   so that the recursive resolvers have all of the root zone content
90	   cached.  This technique can be viewed as pre-populating a resolver's
91	   cache with the root zone information by retrieving a signed copy of
92	   the root zone and verifying the contents.

94	   The two goals of this mechanism are to provide faster negative
95	   responses to stub resolver queries that contain junk queries, and to
96	   reduce the number of junk queries sent to the root servers.  The
97	   mechanism has other minor advantages, but those two are the focus of
98	   this document.

100	1.1.  Requirements notation

102	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
103	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
104	   document are to be interpreted as described in [RFC2119].

106	2.  Requirements

108	   In the discussion below, the term "legacy operation" means the way
109	   that a recursive resolver acts when it is not using the mechanism
110	   described in this document, namely as a normal validating recursive
111	   resolver with no other special features.

113	   In order to implement the mechanism described in this document, a
114	   recursive resolver MUST support DNSSEC, and MUST have an up-to-date
115	   copy of the DNS root key.

117	   A recursive resolver using this mechanism MUST follow these steps at
118	   startup or after clearing its cache:

120	   1.  The resolver determines the list of root zone delivery servers.
121	       The delivery mechanism is not yet defined in this document, and
122	       some possible options for it are described in Section 3.

124	   2.  The resolver SHOULD randomly sort the list of zone delivery
125	       servers so that all the servers get a fairly even distribution of
126	       queries.

128	   3.  The resolver SHOULD attempt to transfer the signed root zone
129	       using the transfer protocol from each one of the servers until
130	       either success is achieved or the list has been exhausted.  The
131	       resolver MAY attempt to transfer in parallel to minimize startup
132	       latency.  If the root zone cannot be transferred, the resolver
133	       logs this as an error, and MUST fall back to legacy operation.

135	   4.  The resolver MUST validate the records in the zone using DNSSEC.
136	       If any of the records do not validate, the resolver MUST discard
137	       all records received, MUST log an error, and SHOULD try the next
138	       server in the list.  If no transferred copy of the root zone can
139	       be validated, the resolver logs this as an error, and falls back
140	       to legacy operation.  Note that the resolver MUST validate all of
141	       the zone contents, and MUST NOT start using the new contents
142	       until all have been validated; the resolver MUST NOT use "lazy
143	       validation".  This means that the addition of the zone data MUST
144	       be an atomic operation.
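   The four startup steps above can be sketched roughly as follows.
   This is only an illustration under stated assumptions: the draft does
   not define the transfer protocol, so the `transfer` and `validate`
   callables, the `load_root_zone` name, and the opaque "zone" object are
   all hypothetical stand-ins supplied by the caller.  The servers are
   tried one at a time here; parallel transfer, which step 3 permits, is
   not shown.

```python
import random
from typing import Callable, Iterable, Optional

# Hypothetical signatures for this sketch; the draft defines neither.
Transfer = Callable[[str], Optional[object]]  # returns None if the transfer fails
Validate = Callable[[object], bool]           # True only if EVERY record validates


def load_root_zone(servers: Iterable[str],
                   transfer: Transfer,
                   validate: Validate,
                   log: Callable[[str], None] = print) -> Optional[object]:
    """Return a fully validated zone, or None to signal legacy operation."""
    candidates = list(servers)
    random.shuffle(candidates)          # step 2: spread load across servers
    for server in candidates:           # step 3: try servers until one works
        zone = transfer(server)
        if zone is None:
            log(f"error: transfer from {server} failed")
            continue
        if not validate(zone):          # step 4: all-or-nothing validation
            log(f"error: zone from {server} failed validation; discarded")
            continue
        return zone                     # caller installs this copy atomically
    log("error: no usable root zone; falling back to legacy operation")
    return None
```

   Returning the whole zone only after validation has finished is one
   way to keep the installation atomic: the caller never sees a partly
   validated copy.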

146	   The resolver MAY store the contents of the validated root zone to
147	   disk.  If the resolver has a stored copy of the root zone, and the
148	   data in the zone is not expired, and that copy was written within the
149	   refresh time listed in the zone, the resolver MAY load that zone
150	   instead of transferring.
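   The stored-copy rule amounts to two checks, sketched below.  The
   function name and parameters are hypothetical, and how an
   implementation derives `expires_at` (the moment the zone data stops
   being usable) is left open here because the draft does not say.

```python
def may_load_stored_copy(written_at: float, expires_at: float,
                         refresh: int, now: float) -> bool:
    """True if a zone stored on disk may be loaded instead of transferred.

    written_at, expires_at, and now are timestamps in seconds; refresh is
    the refresh interval listed in the stored zone, also in seconds.
    """
    not_expired = now < expires_at
    written_recently = 0 <= now - written_at <= refresh
    return not_expired and written_recently
```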

152	   Once the resolver has transferred and validated the zone, it MUST
153	   attempt to keep its copy of the root zone up to date.  This includes
154	   following the refresh, retry, expire logic, with certain
155	   modifications:

157	   o  If the zone expires (for example, because it cannot retransfer
158	      because of blocked TCP connections), the resolver MUST fall back
159	      to legacy operation and MUST log an error.  It MUST NOT return
160	      SERVFAIL to queries only due to its copy of the root zone being
161	      expired.

163	   o  The resolver MUST validate the contents of the records in the zone
164	      using DNSSEC for every transfer.  The resolver SHOULD try
165	      alternate servers if the validation fails.  If the resolver is
166	      unable to transfer a copy of the zone that validates, it MUST
167	      treat this as an error, MUST discard the received records, and
168	      MUST fall back to legacy operation.  Note that the resolver MUST
169	      validate all of the zone contents, and MUST NOT start using the
170	      new contents until all have been validated; the resolver MUST NOT
171	      use "lazy validation".  This means that the replacement of the
172	      current zone data MUST be an atomic operation.

174	   o  The resolver SHOULD attempt to restart this process at every retry
175	      interval for the root zone.

177	   o  The resolver MUST set the AD bit on responses to queries for
178	      records in the root zone.  This action is the same as if it had
179	      inserted the entry into its cache through a "normal" query that
180	      received a DNSSEC-validated answer.

182	   o  The resolver MUST set the TTL on responses in the same fashion as
183	      it would in legacy operation.  The difference here is that, when
184	      the TTL times out, instead of fetching the new answer from the
185	      root, the resolver simply starts the TTL at the maximum listed in
186	      the root zone.
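   A rough sketch of the timer behaviour described in the list above,
   under simplifying assumptions: `SoaTimers`, `zone_state`, and
   `response_ttl` are hypothetical names, all times are in seconds, and
   every record TTL is assumed to be greater than zero.  The first
   function decides whether the resolver serves from its local copy or
   has fallen back to legacy operation, and whether a new transfer
   attempt is due.  The second shows a TTL that counts down and then
   restarts at the value listed in the zone instead of triggering a
   query to the root.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class SoaTimers:
    refresh: int
    retry: int
    expire: int


def zone_state(timers: SoaTimers, last_success: float,
               last_attempt: float, now: float) -> Tuple[str, bool]:
    """Return (mode, transfer_due) for the local root zone copy."""
    age = now - last_success
    # An expired copy means legacy operation, never SERVFAIL.
    mode = "legacy" if age >= timers.expire else "local-root"
    if last_attempt <= last_success:
        # No failed attempt since the last good transfer: wait for refresh.
        due = age >= timers.refresh
    else:
        # A transfer has failed since then: try again every retry interval.
        due = now - last_attempt >= timers.retry
    return mode, due


def response_ttl(record_ttl: int, installed_at: float, now: float) -> int:
    """TTL to put in a response; wraps back to record_ttl when it runs out."""
    elapsed = int(now - installed_at)
    return record_ttl - (elapsed % record_ttl)
```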

188	   Compliant nameserver software MUST include an option to securely
189	   cache the root zone (an example name for this option could be
190	   "transfer-and-validate-root [yes|no]").  That is, the mechanism
191	   described in this document MUST be optional, and the cache operator
192	   MUST be able to turn it off and on.
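   As an illustration only, a resolver might read such an option like
   this.  The configuration syntax (one "name value" pair per line, an
   optional trailing semicolon, "#" comments) and the default of "no"
   are assumptions made for this sketch; the draft specifies only that
   the operator can turn the mechanism on and off.

```python
OPTION = "transfer-and-validate-root"  # example name taken from the text


def root_copy_enabled(config_text: str, default: bool = False) -> bool:
    """Return the last yes/no value given for OPTION, or the default."""
    enabled = default
    for raw in config_text.splitlines():
        line = raw.split("#", 1)[0].strip().rstrip(";")
        if not line:
            continue                      # blank line or comment
        parts = line.split()
        if parts[0] != OPTION:
            continue                      # some other option
        if len(parts) != 2 or parts[1] not in ("yes", "no"):
            raise ValueError(f"{OPTION} expects 'yes' or 'no': {raw!r}")
        enabled = parts[1] == "yes"
    return enabled
```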

194	3.  Open Question: How Should the Root Zone Be Distributed?

196	   The signed root zone can be distributed over almost any protocol.
197	   Because the zone is signed, the distribution protocol does not need
198	   to be authenticated.  Suggestions for the distribution mechanism
199	   include:

201	      AXFR zone transfer within the DNS

203	      HTTP, most likely with appropriately-tuned caching

205	      FTP

207	      [[ Others... ]]

209	   Note that with any of these methods, the zone does not need to be
210	   transferred from the root servers themselves.  Instead, a simple
211	   discovery mechanism can be built into the protocol that lets a
212	   recursive resolver discover where there are servers that will let it
213	   transfer the root zone.

215	4.  Open Question: Should Responses Have the AA Bit Set?

217	   A recursive resolver that has a securely validated copy of the root
218	   can be thought of in at least two ways: as a smarter cache, or as a
219	   pseudo-slave server for the root.  This section discusses the
220	   ramifications of those two choices.  In both scenarios, the resolver
221	   will send back NXDOMAIN responses for junk queries without sending
222	   queries to the root and the resolver will set the AD bit on the
223	   responses.  However, the two scenarios differ in whether or not the
224	   responses have the AA bit set.

226	   A smarter cache does not set the AA bit.  The responses for any query
227	   for a name in the root or an NXDOMAIN that is being sent because the
228	   TLD is junk come back with the AD bit set but the AA bit not set,
229	   just as they would in legacy operation.

231	   A pseudo-slave to the root sets the AA bit in response to any query
232	   for a name in the root or an NXDOMAIN that is being sent because the
233	   TLD is junk.  The reason that this is called a pseudo-slave instead
234	   of a slave is that there is a general expectation that a slave has a
235	   relationship with the master that would cause the slave to be
236	   notified of changes in the master with a NOTIFY announcement; that is
237	   not the case here.  It acts as a slave because it knows exactly how
238	   the master would reply at the time that it retrieved the signed zone,
239	   but it is a pseudo-slave because the master has no way of alerting it
240	   to changes.
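   In terms of the DNS message header, the two choices differ in a
   single bit.  The sketch below builds the 16-bit flags word of a
   response using the standard header bit positions; the
   `response_flags` helper and its parameters are hypothetical, and
   always setting RA is a simplification for a recursive resolver.

```python
# Bit masks within the 16-bit flags word of the DNS header.
QR = 0x8000   # message is a response
AA = 0x0400   # authoritative answer
RD = 0x0100   # recursion desired (copied from the query)
RA = 0x0080   # recursion available
AD = 0x0020   # authentic data (DNSSEC-validated)

NXDOMAIN = 3  # RCODE for "name does not exist"


def response_flags(rcode: int, pseudo_slave: bool, rd: bool = True) -> int:
    """Flags for an answer served from the validated root zone copy."""
    flags = QR | RA | AD | (rcode & 0x000F)
    if rd:
        flags |= RD
    if pseudo_slave:
        flags |= AA       # the only difference between the two scenarios
    return flags
```

   For a junk query, `response_flags(NXDOMAIN, pseudo_slave=False)` gives
   the "smarter cache" header and `pseudo_slave=True` gives the
   pseudo-slave header; the two values differ only in the AA bit.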

242	   The advantage of a recursive resolver acting as a pseudo-slave is
243	   that other resolvers that demand authoritative answers can ask it for
244	   those.  However, there are few scenarios in which those demanding
245	   resolvers exist.  The disadvantage of a recursive resolver acting as
246	   a pseudo-slave is that there is no way to signal that it is a pseudo-
247	   slave and not a real slave.  Thus, someone seeing the AA bit set
248	   might think that the resolver is a real slave.  This opens the can of
249	   worms about trusting the settings of the AA and AD bits in responses.

251	5.  Pros and Cons of this Technique

253	   This is primarily a tracking / discussion section, and the text is
254	   kept even looser than in the rest of this doc.  These are not
255	   ordered.

257	5.1.  Pros

259	   o  Junk queries / negative caching - Currently, a significant number
260	      of queries to the root servers are "junk" queries.  Many of these
261	      queries are for TLDs that do not (and may never) exist in the
262	      root.  Another significant source of junk is queries where the
263	      negative TLD answer did not get cached because the queries are for
264	      second-level domains (a negative cache entry for "foo.example"
265	      will not cover a subsequent query for "bar.example").

267	   o  DoS against the root service - By distributing the contents of the
268	      root to many recursive resolvers, the DoS protection for customers
269	      of the root servers is significantly increased.  A DDoS may still
270	      be able to take down some recursive servers, but there is much
271	      more root service infrastructure to attack in order to be
272	      effective.  Of course, there is still a zone distribution system
273	      that could be attacked, but it would need to be kept down for a
274	      much longer time to cause significant damage, and so far the root
275	      has stood up just fine to DDoS.

277	   o  Small increase to privacy of requests - This also removes a place
278	      where attackers could collect information.  Although query name
279	      minimization also achieves some of this, it does still leak the
280	      TLDs that people behind a resolver are querying for, which may in
281	      itself be a concern (for example someone in a homophobic country
282	      who is querying for a name in .gay).
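   The negative-caching point in the first item above can be made
   concrete with a small sketch.  With the full set of TLDs from the
   validated root zone in hand, the resolver can classify any query name
   by its rightmost label alone, so "foo.example" and "bar.example" are
   both answered locally.  The `classify` helper and the idea of holding
   the TLDs in a Python set are assumptions made for illustration.

```python
from typing import Set


def classify(qname: str, root_tlds: Set[str]) -> str:
    """Classify a query name against the TLDs in the local root zone copy."""
    name = qname.rstrip(".").lower()
    if not name:
        return "root"                     # the query is for "." itself
    tld = name.rsplit(".", 1)[-1]         # rightmost label
    return "referral" if tld in root_tlds else "NXDOMAIN"
```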

284	5.2.  Cons

286	   o  Loss of agility in making root zone changes - Currently, if there
287	      is an error in the root zone (or someone needs to make an
288	      emergency change), a new root zone can be created, and the root
289	      server operators can be notified and start serving the new zone
290	      quickly.  Of course, this does not invalidate the bad information
291	      in (long TTL) cached answers.  Notifying every recursive resolver
292	      is not feasible.  Currently, an "oops" in the root zone will be
293	      cached for the TTL of the record by some percentage of servers.
294	      Using the technique described above, the information may be cached
295	      (by the same percentage of servers) for the refresh time + the TTL
296	      of the record.

298	   o  No central monitoring point - DNS operators lose the ability to
299	      monitor the root system.  While there is work underway to
300	      implement better instrumentation of the root server system, this
301	      (potentially) removes the thing to monitor.

303	   o  Increased complexity in nameserver software and its operation -
304	      Any proposal for recursive servers to copy and serve the root
305	      inherently means more code to write and execute.  Note that many
306	      recursive resolvers are on inexpensive home routers that are
307	      rarely (if ever) updated.

309	   o  Changes the nature and distribution of traffic hitting the root
310	      servers - If all the "good" recursive resolvers deploy root
311	      copying, then root servers end up servicing only "bad" recursive
312	      resolvers and attack traffic.  The roots (could) become what AS112
313	      is for RFC1918.
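   The first item in the list above compares how long an erroneous
   record can linger.  A worked example with illustrative numbers, which
   are assumptions rather than values taken from this document (a
   172800-second record TTL and an 1800-second refresh time):

```python
def worst_case_staleness(record_ttl: int, refresh: int = 0) -> int:
    """Seconds a bad record may persist: the TTL, plus any refresh delay."""
    return refresh + record_ttl


legacy = worst_case_staleness(172800)                    # TTL only
with_copy = worst_case_staleness(172800, refresh=1800)   # refresh + TTL
```

   With these numbers the bound grows from 172800 to 174600 seconds,
   that is, by exactly the refresh time.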

315	6.  IANA Considerations

317	   This document requires no action from the IANA.

319	7.  Security Considerations

321	   A resolver that uses this mechanism but does not do full DNSSEC
322	   validation on the data it uses can obviously cause serious security
323	   issues because it can be fooled into giving wrong answers.

325	   [[ More? ]]

327	8.  Acknowledgements

329	   The editors fully acknowledge that this is not a new concept, and
330	   that we have chatted with many people about this.  If we have spoken
331	   to you and your name is not listed below, let us know.

333	9.  Contributors

335	   The general concept in this document is not new; there have been
336	   discussions regarding recursive resolvers copying the root zone for
337	   many years.  The fact that the root zone is now signed with DNSSEC
338	   makes implementing some of these techniques more feasible.

340	   The following is an unordered list of individuals who have
341	   contributed text and / or significant discussions to this document.

343	      Steve Crocker - Shinkuro

345	      Jaap Akkerhuis - NLnet Labs

347	      David Conrad - Virtualized, LLC.

349	      Lars-Johan Liman - Netnod

351	      Suzanne Woolf - Individual

353	      Roy Arends - Nominet

355	      Olaf Kolkman - NLnet Labs

357	      Danny McPherson - Verisign

359	      Joe Abley - Dyn

361	      Jim Martin - ISC

363	      Jared Mauch - NTT America

365	      Rob Austein - Dragon Research Labs

367	      Sam Weiler - Parsons

369	10.  Normative References

371	   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
372	              Requirement Levels", BCP 14, RFC 2119, March 1997.

374	Authors' Addresses

376	   Warren Kumari (editor)
377	   Google
378	   1600 Amphitheatre Parkway
379	   Mountain View, CA  94043
380	   US

382	   Email: Warren@kumari.net

383	   Paul Hoffman (editor)
384	   VPN Consortium

386	   Email: paul.hoffman@vpnc.org