idnits 2.17.1 

draft-tale-dnsop-serve-stale-02.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  -- The draft header indicates that this document updates RFC1034, but the
     abstract doesn't seem to mention this, which it should.

  -- The draft header indicates that this document updates RFC1035, but the
     abstract doesn't seem to mention this, which it should.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

     (Using the creation date from RFC1034, updated by this document, for
     RFC5378 checks: 1987-11-01)

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (October 30, 2017) is 2370 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  -- Obsolete informational reference (is this intentional?): RFC 7719
     (Obsoleted by RFC 8499)


     Summary: 0 errors (**), 0 flaws (~~), 1 warning (==), 5 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.
--------------------------------------------------------------------------------


2	DNSOP Working Group                                          D. Lawrence
3	Internet-Draft                                       Akamai Technologies
4	Updates: 1034, 1035 (if approved)                              W. Kumari
5	Intended status: Standards Track                                  Google
6	Expires: May 3, 2018                                    October 30, 2017

8	              Serving Stale Data to Improve DNS Resiliency
9	                    draft-tale-dnsop-serve-stale-02

11	Abstract

13	   This draft defines a method for recursive resolvers to use stale DNS
14	   data to avoid outages when authoritative nameservers cannot be
15	   reached to refresh expired data.

17	Ed note

19	   Text inside square brackets ([]) is additional background
20	   information, answers to frequently asked questions, general musings,
21	   etc.  It will be removed before publication.  This document is
22	   being collaborated on in GitHub at
23	   <https://github.com/vttale/serve-stale>.  The most recent version of
24	   the document, open issues, etc., should all be available there.  The
25	   authors gratefully accept pull requests.

27	Status of This Memo

29	   This Internet-Draft is submitted in full conformance with the
30	   provisions of BCP 78 and BCP 79.

32	   Internet-Drafts are working documents of the Internet Engineering
33	   Task Force (IETF).  Note that other groups may also distribute
34	   working documents as Internet-Drafts.  The list of current Internet-
35	   Drafts is at https://datatracker.ietf.org/drafts/current/.

37	   Internet-Drafts are draft documents valid for a maximum of six months
38	   and may be updated, replaced, or obsoleted by other documents at any
39	   time.  It is inappropriate to use Internet-Drafts as reference
40	   material or to cite them other than as "work in progress."

42	   This Internet-Draft will expire on May 3, 2018.

44	Copyright Notice

46	   Copyright (c) 2017 IETF Trust and the persons identified as the
47	   document authors.  All rights reserved.

49	   This document is subject to BCP 78 and the IETF Trust's Legal
50	   Provisions Relating to IETF Documents
51	   (https://trustee.ietf.org/license-info) in effect on the date of
52	   publication of this document.  Please review these documents
53	   carefully, as they describe your rights and restrictions with respect
54	   to this document.  Code Components extracted from this document must
55	   include Simplified BSD License text as described in Section 4.e of
56	   the Trust Legal Provisions and are provided without warranty as
57	   described in the Simplified BSD License.

59	Table of Contents

61	   1.  Introduction
62	   2.  Terminology
63	   3.  Background
64	   4.  Standards Action
65	   5.  EDNS Option
66	     5.1.  Option Format
67	     5.2.  Option Usage
68	   6.  Example Method
69	   7.  Implementation Caveats
70	   8.  Implementation Status
71	   9.  Security Considerations
72	   10. Privacy Considerations
73	   11. NAT Considerations
74	   12. IANA Considerations
75	   13. Acknowledgements
76	   14. References
77	     14.1.  Normative References
78	     14.2.  Informative References
79	   Authors' Addresses

81	1.  Introduction

83	   Traditionally the Time To Live (TTL) of a DNS resource record has
84	   been understood to represent the maximum number of seconds that a
85	   record can be used before it must be discarded, based on its
86	   description and usage in [RFC1035] and clarifications in [RFC2181].

88	   This document proposes that the definition of the TTL be explicitly
89	   expanded to allow for expired data to be used in the exceptional
90	   circumstance that a recursive resolver is unable to refresh the
91	   information.  It is predicated on the observation that authoritative
92	   server unavailability can cause outages even when the underlying data
93	   those servers would return is typically unchanged.

95	   A method is described for this use of stale data, balancing the
96	   competing needs of resiliency and freshness.  While this is intended
97	   to be immediately useful to the installed base of DNS software, an
98	   [RFC6891] EDNS option is also proposed for enhanced signalling around
99	   the use of stale data by implementations that understand it.

101	2.  Terminology

103	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
104	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
105	   "OPTIONAL" in this document are to be interpreted as described in
106	   [RFC2119] when, and only when, they appear in all capitals, as shown
107	   here.

109	   For a comprehensive treatment of DNS terms, please see [RFC7719].

111	3.  Background

113	   There are a number of reasons why an authoritative server may become
114	   unreachable, including Denial of Service (DoS) attacks, network
115	   issues, and so on.  If the recursive server is unable to contact the
116	   authoritative servers for a name but still has relevant data that has
117	   aged past its TTL, that information can still be useful for
118	   generating an answer under the metaphorical assumption that, "stale
119	   bread is better than no bread."

121	   [RFC1035] Section 3.2.1 says that the TTL "specifies the time
122	   interval that the resource record may be cached before the source of
123	   the information should again be consulted", and Section 4.1.3 further
124	   says the TTL, "specifies the time interval (in seconds) that the
125	   resource record may be cached before it should be discarded."

127	   A natural English interpretation of these remarks would seem to be
128	   clear enough that records past their TTL expiration must not be used.
129	   However, [RFC1035] predates the more rigorous terminology of
130	   [RFC2119] which softened the interpretation of "may" and "should".

132	   [RFC2181] aimed to provide "the precise definition of the Time to
133	   Live" but in Section 8 was mostly concerned with the numeric range of
134	   values and the possibility that very large values should be capped.
135	   (It also has the curious suggestion that a value in the range
136	   2147483648 to 4294967295 should be treated as zero.)  It closes that
137	   section by noting, "The TTL specifies a maximum time to live, not a
138	   mandatory time to live."  This is again not [RFC2119]-normative
139	   language, but does convey the natural language connotation that data
140	   becomes unusable past TTL expiry.

142	   Several major recursive resolver operations currently use stale data
143	   for answers in some way, including Akamai, OpenDNS, Xerocole, and
144	   Nominum.  Their collective operational experience is that it provides
145	   significant benefit with minimal downside.

147	4.  Standards Action

149	   The definition of TTL in [RFC1035] Sections 3.2.1 and 4.1.3 is
150	   amended to read:

152	   TTL  a 32 bit unsigned integer number of seconds in the range 0 -
153	      2147483647 that specifies the time interval that the resource
154	      record MAY be cached before the source of the information MUST
155	      again be consulted.  Zero values are interpreted to mean that the
156	      RR can only be used for the transaction in progress, and should
157	      not be cached.  Values with the high order bit set SHOULD be
158	      capped at no more than 2147483647.  If the authority for the data
159	      is unavailable when attempting to refresh the data past the given
160	      interval, the record MAY be used as though it has a remaining TTL
161	      of 1 second.
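
   The amended definition can be read as a small decision procedure.
   The following Python sketch is illustrative only and is not part of
   the proposed text; the function names and the authority_reachable
   flag are assumptions introduced for the example.

```python
MAX_TTL = 2**31 - 1  # 2147483647, the top of the range in the definition
STALE_TTL = 1        # TTL reported when an expired record is reused


def normalize_ttl(wire_ttl: int) -> int:
    """Cap a 32-bit wire TTL that has its high-order bit set."""
    if not 0 <= wire_ttl <= 0xFFFFFFFF:
        raise ValueError("TTL must fit in 32 unsigned bits")
    return min(wire_ttl, MAX_TTL)


def is_cacheable(wire_ttl: int) -> bool:
    """Zero means 'this transaction only'; such records are not cached."""
    return normalize_ttl(wire_ttl) > 0


def effective_ttl(wire_ttl: int, age: int, authority_reachable: bool):
    """Return the TTL to report for a cached record, or None if unusable.

    age is the number of seconds since the record was cached.
    """
    remaining = normalize_ttl(wire_ttl) - age
    if remaining > 0:
        return remaining
    if not authority_reachable:
        return STALE_TTL  # the MAY clause: use as though 1 second remains
    return None  # expired but refreshable: consult the source again
```

   For instance, a record cached with a TTL of 300 seconds and now 400
   seconds old yields None while its authority is reachable and 1 when
   it is not.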

163	5.  EDNS Option

165	   While the basic behaviour of this answer-of-last-resort can be
166	   achieved with changes only to resolvers, explicit signalling about
167	   the use of stale data can be done with an EDNS [RFC6891] option.

169	   [ This section will be fleshed out a bit more thoroughly if there is
170	   interest in pursuing the option. ]

172	5.1.  Option Format

174	   The option is structured as follows:

176	                 +0 (MSB)                        +1 (LSB)
177	      +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
178	   0: |                         OPTION-CODE                       |
179	      +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
180	   2: |                        OPTION-LENGTH                      |
181	      +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
182	   4: |                     STALE-RRSET-INDEX 1                   |
183	      +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
184	   6: |                                                           |
185	   8: |                         TTL-EXPIRY 1                      |
186	      +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
187	      :  ... additional STALE-RRSET-INDEX / TTL-EXPIRY pairs ...  :
188	      +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+

190	   OPTION-CODE  2 octets per [RFC6891].  For Serve-Stale the code is TBD
191	      by IANA.

193	   OPTION-LENGTH  2 octets per [RFC6891].  Contains the length of the
194	      payload following OPTION-LENGTH, in octets.

196	   STALE-RRSET-INDEX  Two octets as a signed integer, indicating the
197	      first RRset in the message which is beyond its TTL, with RRset
198	      counting starting at 1 and spanning message sections.

200	   TTL-EXPIRY  Four octets as an unsigned integer, representing the
201	      number of seconds that have passed since the TTL for the RRset
202	      expired.
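
   As a rough illustration of the layout above, the option can be
   packed and unpacked with Python's struct module.  This is a sketch
   under stated assumptions: OPTION_CODE is a hypothetical placeholder
   taken from the local/experimental range, since the real code is TBD,
   and the helper names are invented for the example.

```python
import struct

# Hypothetical placeholder: the real code is "TBD by IANA".  65001 lies in
# the range reserved for local/experimental use.
OPTION_CODE = 65001

# One pair: signed 16-bit STALE-RRSET-INDEX, unsigned 32-bit TTL-EXPIRY,
# both in network byte order with no padding.
_PAIR = struct.Struct("!hI")


def encode_option(pairs):
    """Serialize a list of (index, expiry) pairs as a complete option."""
    payload = b"".join(_PAIR.pack(index, expiry) for index, expiry in pairs)
    return struct.pack("!HH", OPTION_CODE, len(payload)) + payload


def decode_option(data):
    """Parse a complete option into a list of (index, expiry) tuples."""
    code, length = struct.unpack_from("!HH", data, 0)
    if code != OPTION_CODE:
        raise ValueError("not a Serve-Stale option")
    payload = data[4:4 + length]
    if len(payload) != length or length % _PAIR.size:
        raise ValueError("malformed Serve-Stale payload")
    return [_PAIR.unpack_from(payload, off)
            for off in range(0, length, _PAIR.size)]
```

   Each pair occupies six octets, so an option carrying two pairs is 16
   octets long including the four-octet header.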

204	5.2.  Option Usage

206	   Software making a DNS request can signal that it understands Serve-
207	   Stale by including the option with one STALE-RRSET-INDEX initialized
208	   to any negative value and TTL-EXPIRY initialized to 0.  The index is
209	   set to a negative value to detect mere reflection of the option by
210	   responders that don't really understand it.

212	   If the request is made to a recursive resolver which used any stale
213	   RRsets in its reply, it then fills in the corresponding indices and
214	   staleness values.  If no records are stale, STALE-RRSET-INDEX and
215	   TTL-EXPIRY are set to 0.
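
   A requester might interpret the pairs that come back from a
   recursive resolver along the following lines.  The sketch assumes
   the option has already been decoded into (STALE-RRSET-INDEX,
   TTL-EXPIRY) tuples, or is None when absent; classify_reply and its
   labels are hypothetical names.

```python
def classify_reply(pairs):
    """Return (status, stale) for the pairs found in a resolver's reply.

    stale maps each stale RRset index to the seconds since its TTL expired.
    """
    if not pairs:
        return ("unsupported", {})  # option missing from the reply
    if any(index < 0 for index, _ in pairs):
        # The negative index sent in the request came back unchanged, so
        # the responder merely reflected the option.
        return ("reflected", {})
    stale = {index: expiry for index, expiry in pairs if index > 0}
    return ("stale" if stale else "fresh", stale)
```

   A reply carrying the single pair (0, 0) is therefore classified as
   fresh, while one that still carries a negative index is treated as
   coming from a responder that does not understand the option.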

217	   If the request is made to an authoritative nameserver, it can use the
218	   option in the reply to indicate how the resolver should treat the
219	   records in the reply if they are unable to be refreshed later.  A
220	   default for all RRsets in the message is established by setting the
221	   first STALE-RRSET-INDEX to 0, with optional additional STALE-RRSET-
222	   INDEX values overriding the default.  A TTL-EXPIRY value of 0 means
223	   to never serve the RRset as stale, while non-zero values represent
224	   the maximum amount of time it can be used before it MUST be evicted.
225	   [ Does anyone really want to do this?  It adds more state into
226	   resolvers.  Is the idea only for purists, or is there a practical
227	   application? ]
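
   One possible reading of the default-and-override rule for replies
   from authoritative servers is sketched below.  The function name is
   hypothetical, and returning None when the reply gives no guidance
   for an RRset is an assumption, since the text does not cover that
   case.

```python
def stale_limit(rrset_index, pairs):
    """Return how long (in seconds) an RRset may be served stale.

    pairs is the decoded list of (STALE-RRSET-INDEX, TTL-EXPIRY) tuples.
    0 means never serve stale; None means the reply gave no guidance.
    """
    limit = None
    overrides = pairs
    if pairs and pairs[0][0] == 0:
        limit = pairs[0][1]      # a first index of 0 sets the default
        overrides = pairs[1:]
    for index, expiry in overrides:
        if index == rrset_index:
            limit = expiry       # a specific index overrides the default
    return limit
```

   With the pairs [(0, 3600), (2, 0)], every RRset may be served stale
   for up to an hour except the second one, which is never served
   stale.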

229	   No facility is made for a client of a resolver to signal that it
230	   doesn't want stale answers, because if a client has knowledge of
231	   Serve-Stale as an option, it also has enough knowledge to just ignore
232	   any records that come back stale.  [ There is admittedly the issue
233	   that the client might just want to wait out the whole attempted
234	   resolution, which there's currently no way to indicate.  The absolute
235	   value of STALE-RRSET-INDEX could be taken as a timer the requester is
236	   willing to wait for an answer, but that's kind of gross overloading
237	   it like that.  Shame to burn another field on that though, but on the
238	   other hand it would be nice if a client could always signal its
239	   impatience level - "I must have an answer within 900 milliseconds!" ]

241	6.  Example Method

243	   There is conceivably more than one way a recursive resolver could
244	   responsibly implement this resiliency feature while still respecting
245	   the intent of the TTL as a signal for when data is to be refreshed.

247	   In this example method three notable timers drive considerations for
248	   the use of stale data, as follows:

250	   o  A client response timer, which is the maximum amount of time a
251	      recursive resolver should allow between the receipt of a
252	      resolution request and sending its response.

254	   o  A query resolution timer, which caps the total amount of time a
255	      recursive resolver spends processing the query.

257	   o  A maximum stale timer, which caps the amount of time that records
258	      will be kept past their expiration.

260	   Recursive resolvers already have the second timer; the first and
261	   third timers are new concepts for this mechanism.

263	   When a request is received by the recursive resolver, it SHOULD start
264	   the client response timer.  This timer is used to avoid client
265	   timeouts.  It SHOULD be configurable, with a recommended value of 1.8
266	   seconds.

268	   The resolver then checks its cache for an unexpired answer.  If it
269	   finds none and the Recursion Desired flag is not set in the request,
270	   it SHOULD immediately return the response without consulting the
271	   cache for expired records.

273	   If iterative lookups will be done, it SHOULD start the query
274	   resolution timer.  This timer bounds the work done by the resolver,
275	   and is commonly around 10 to 30 seconds.

277	   If the answer has not been completely determined by the time the
278	   client response timer has elapsed, the resolver SHOULD then check its
279	   cache to see whether there is expired data that would satisfy the
280	   request.  If so, it adds that data to the response message and SHOULD
281	   set the TTL of each expired record in the message to 1 second.  The
282	   response is then sent to the client while the resolver continues its
283	   attempt to refresh the data.  A TTL of 1 second was chosen because
284	   0 second TTLs have historically been problematic for some
285	   implementations.  It not only sidesteps those potential problems with
286	   no practical negative consequence, it also rate limits further
287	   queries from any client that is honoring the TTL, such as a
288	   forwarding resolver.
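
   The interaction of the client response timer and the query
   resolution timer can be sketched as a deterministic simulation.
   This is not the BIND implementation: network I/O is replaced by a
   hypothetical Lookup record that states how long resolution takes,
   and returning a late answer when no stale data exists is an
   assumption about ordinary resolver behaviour.

```python
from dataclasses import dataclass
from typing import Optional

CLIENT_RESPONSE_TIMER = 1.8    # seconds, the recommended value
QUERY_RESOLUTION_TIMER = 30.0  # seconds, upper end of the common range
STALE_TTL = 1                  # TTL placed on expired records


@dataclass
class Entry:
    data: str
    expires_at: float  # absolute time at which the TTL runs out


@dataclass
class Lookup:
    """Simulated outcome of iterative resolution (stand-in for the network)."""
    seconds: float        # how long the lookup takes
    data: Optional[str]   # answer obtained, or None on failure
    ttl: int = 0


def respond(cache, name, now, recursion_desired, lookup):
    """Return (data, ttl, origin); data is None when there is no answer."""
    entry = cache.get(name)
    if entry is not None and entry.expires_at > now:
        return entry.data, int(entry.expires_at - now), "fresh-cache"
    if not recursion_desired:
        # No iterative lookup is allowed, so expired data is not consulted.
        return None, 0, "no-answer"

    # The iterative lookup is bounded by the query resolution timer.
    finished = (lookup.data is not None
                and lookup.seconds <= QUERY_RESOLUTION_TIMER)
    if finished:
        # Even a late success refreshes the cache for later requests.
        cache[name] = Entry(lookup.data, now + lookup.seconds + lookup.ttl)
    if finished and lookup.seconds <= CLIENT_RESPONSE_TIMER:
        return lookup.data, lookup.ttl, "resolved"

    # The client response timer elapsed first: fall back to expired data.
    if entry is not None:
        return entry.data, STALE_TTL, "stale-cache"
    if finished:
        return lookup.data, lookup.ttl, "resolved-late"  # assumption
    return None, 0, "no-answer"
```

   In this sketch a lookup that needs five seconds causes the expired
   record to be sent with a TTL of 1, while the cache is still
   refreshed once the lookup completes.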

290	   The maximum stale timer is used for cache management and is
291	   independent of the query resolution process.  This timer is
292	   conceptually different from the maximum cache TTL that exists in many
293	   resolvers, the latter being a clamp on the value of TTLs as received
294	   from authoritative servers.  The maximum stale timer SHOULD be
295	   configurable, and defines the length of time after a record expires
296	   that it SHOULD be retained in the cache.  The suggested value is 7
297	   days, which gives time to notice the resolution problem and for human
298	   intervention to fix it.
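
   The difference between the maximum stale timer and a maximum cache
   TTL can be made concrete with a short sketch.  The cache here is
   assumed to be a plain mapping from owner name to TTL expiry time,
   and the function names are hypothetical.

```python
MAX_STALE_SECONDS = 7 * 24 * 60 * 60  # the suggested value of 7 days


def clamp_ttl(received_ttl, max_cache_ttl):
    """Maximum cache TTL: a clamp applied to TTLs as they are received."""
    return min(received_ttl, max_cache_ttl)


def is_retained(expires_at, now, max_stale=MAX_STALE_SECONDS):
    """Maximum stale timer: how long a record outlives its expiry."""
    return now < expires_at + max_stale


def sweep(cache, now, max_stale=MAX_STALE_SECONDS):
    """Evict records past the stale limit and return the evicted names.

    cache maps an owner name to the time at which its TTL expired.
    """
    evicted = [name for name, expires_at in cache.items()
               if not is_retained(expires_at, now, max_stale)]
    for name in evicted:
        del cache[name]
    return evicted
```

   The clamp affects when a record expires; the stale timer only
   affects how long it remains available after that point.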

300	   This same basic technique MAY be used to handle stale data associated
301	   with delegations.  If authoritative server addresses are not able to
302	   be refreshed, resolution can possibly still be successful if the
303	   authoritative servers themselves are still up.

305	7.  Implementation Caveats

307	   Answers from authoritative servers that have a DNS Response Code of
308	   either 0 (NOERROR) or 3 (NXDOMAIN) MUST be considered to have
309	   refreshed the data at the resolver.  In particular, this means that
310	   this method is not meant to protect against operator error at the
311	   authoritative server that turns a name that is intended to be valid
312	   into one that is non-existent, because there is no way for a resolver
313	   to know intent.
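
   The rule can be expressed as a simple predicate.  In this sketch a
   missing response is represented by None; treating every other
   outcome as leaving stale data eligible for use is an assumption
   drawn from the surrounding text rather than a stated requirement.

```python
NOERROR, SERVFAIL, NXDOMAIN, REFUSED = 0, 2, 3, 5  # DNS Response Codes


def refreshes_cache(rcode):
    """True if an authoritative answer must be treated as a refresh.

    rcode is the Response Code received, or None if no answer arrived.
    """
    return rcode in (NOERROR, NXDOMAIN)


def stale_data_eligible(rcode):
    """Assumption: anything that is not a refresh leaves stale data usable."""
    return not refreshes_cache(rcode)
```

   An NXDOMAIN answer therefore replaces the previously cached data
   even if the name was removed by mistake.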

315	   To adhere to the original intent of the design of the DNS, resolution
316	   is given a chance to succeed before stale data is used.  This
317	   mechanism is only intended to add robustness to failures, and is
318	   meant to be enabled all the time.  If stale data were used
319	   immediately and a cache refresh attempted only after the client
320	   response had been sent, the resolver would frequently be sending
321	   stale data that it would have had no trouble refreshing.

323	   It is important to continue the resolution attempt after the stale
324	   response has been sent, until the query resolution timeout, because
325	   some pathological resolutions can take many seconds to succeed as
326	   they cope with unavailable servers, bad networks, and other problems.
327	   Stopping the resolution attempt when the response with expired data
328	   has been sent would mean that answers in these pathological cases
329	   would never be refreshed.

331	   Canonical Name (CNAME) records mingled in the expired cache with
332	   other records at the same owner name can cause surprising results.
333	   This was observed with an initial implementation in BIND when a
334	   hostname changed from having an IPv4 Address (A) record to a CNAME.
335	   The version of BIND being used did not evict other types in the cache
336	   when a CNAME was received, which in normal operations is not a
337	   significant issue.  However, after both records expired and the
338	   authorities became unavailable, the fallback to stale answers
339	   returned the older A instead of the newer CNAME.
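
   One conceivable mitigation is for the cache to evict occluded types
   when it stores a record.  The sketch below is an assumption about
   how that could look, not a description of what BIND does, and it
   ignores types that may legitimately coexist with a CNAME, such as
   DNSSEC records.

```python
def store(cache, name, rrtype, rdata):
    """Store one RRset, evicting data it occludes at the same owner name.

    cache maps an owner name to a dict of {rrtype: rdata}.
    """
    rrsets = cache.setdefault(name, {})
    if rrtype == "CNAME":
        rrsets.clear()              # a new CNAME occludes everything older
    else:
        rrsets.pop("CNAME", None)   # new data replaces an older CNAME
    rrsets[rrtype] = rdata
```

   With this behaviour, a name that changes from an A record to a CNAME
   leaves only the CNAME available as a stale answer.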

341	   [ This probably applies to other occluding types, so more thought
342	   should be given to the overall issue. ]

344	   Keeping records around after their normal expiration will of course
345	   cause caches to grow larger than if records were removed at their
346	   TTL.  Specific guidance on managing cache sizes is outside the scope
347	   of this document.  Some areas for consideration include whether to
348	   track the popularity of names in client requests versus evicting by
349	   maximum age, and whether to provide a feature for manually flushing
350	   only stale records.

352	8.  Implementation Status

354	   [RFC Editor: per RFC 6982 this section should be removed prior to
355	   publication.]

357	   The algorithm described in Section 6 was originally implemented as a
358	   patch to BIND 9.7.0.  It has been in production on Akamai's network
359	   since 2011, and has effectively smoothed over transient failures and
360	   longer outages that would otherwise have resulted in major incidents.
361	   The patch was contributed to the Internet Systems Consortium and is
362	   now distributed with BIND 9.12.

364	   Unbound has a similar feature for serving stale answers, but it works
365	   in a very different way by returning whatever cached answer it has
366	   before trying to refresh expired records.  This is unfortunately not
367	   faithful to the ideal that a refresh of data past its expiry should
368	   be attempted before that data is served.

370	9.  Security Considerations

372	   The most obvious security issue is the increased likelihood of DNSSEC
373	   validation failures when using stale data because signatures could be
374	   returned outside their validity period.  This would only be an issue
375	   if the authoritative servers are unreachable, the only time the
376	   techniques in this document are used, and thus does not introduce a
377	   new failure in place of what would have otherwise been success.

379	   Additionally, bad actors have been known to use DNS caches to keep
380	   records alive even after their authorities have gone away.  This
381	   potentially makes that easier, although without introducing a new
382	   risk.

384	10.  Privacy Considerations

386	   This document does not add any practical new privacy issues.

388	11.  NAT Considerations

390	   The method described here is not affected by the use of NAT devices.

392	12.  IANA Considerations

394	   This document contains no actions for IANA.  This section will be
395	   removed during conversion into an RFC by the RFC editor.

397	13.  Acknowledgements

399	   The authors wish to thank Matti Klock, Mukund Sivaraman, Jean Roy,
400	   and Jason Moreau for initial review.  Feedback from Robert Edmonds
401	   and Davey Song has also been incorporated.

403	14.  References

405	14.1.  Normative References

407	   [RFC1035]  Mockapetris, P., "Domain names - implementation and
408	              specification", STD 13, RFC 1035, DOI 10.17487/RFC1035,
409	              November 1987, <https://www.rfc-editor.org/info/rfc1035>.

411	   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
412	              Requirement Levels", BCP 14, RFC 2119,
413	              DOI 10.17487/RFC2119, March 1997,
414	              <https://www.rfc-editor.org/info/rfc2119>.

416	   [RFC2181]  Elz, R. and R. Bush, "Clarifications to the DNS
417	              Specification", RFC 2181, DOI 10.17487/RFC2181, July 1997,
418	              <https://www.rfc-editor.org/info/rfc2181>.

420	   [RFC6891]  Damas, J., Graff, M., and P. Vixie, "Extension Mechanisms
421	              for DNS (EDNS(0))", STD 75, RFC 6891,
422	              DOI 10.17487/RFC6891, April 2013,
423	              <https://www.rfc-editor.org/info/rfc6891>.

425	14.2.  Informative References

427	   [RFC7719]  Hoffman, P., Sullivan, A., and K. Fujiwara, "DNS
428	              Terminology", RFC 7719, DOI 10.17487/RFC7719, December
429	              2015, <https://www.rfc-editor.org/info/rfc7719>.

431	Authors' Addresses

433	   David C Lawrence
434	   Akamai Technologies
435	   150 Broadway
436	   Cambridge  MA 02142-1054
437	   USA

439	   Email: tale@akamai.com

441	   Warren Kumari
442	   Google
443	   1600 Amphitheatre Parkway
444	   Mountain View  CA 94043
445	   USA

447	   Email: warren@kumari.net