< draft-ietf-dnsop-serve-stale-08.txt   draft-ietf-dnsop-serve-stale-09.txt >
DNSOP Working Group D. Lawrence DNSOP Working Group D. Lawrence
Internet-Draft Oracle Internet-Draft Oracle
Updates: 1034, 1035, 2181 (if approved) W. Kumari Updates: 1034, 1035, 2181 (if approved) W. Kumari
Intended status: Standards Track P. Sood Intended status: Standards Track P. Sood
Expires: March 21, 2020 Google Expires: April 26, 2020 Google
September 18, 2019 October 24, 2019
Serving Stale Data to Improve DNS Resiliency Serving Stale Data to Improve DNS Resiliency
draft-ietf-dnsop-serve-stale-08 draft-ietf-dnsop-serve-stale-09
Abstract Abstract
This draft defines a method (serve-stale) for recursive resolvers to This draft defines a method (serve-stale) for recursive resolvers to
use stale DNS data to avoid outages when authoritative nameservers use stale DNS data to avoid outages when authoritative nameservers
cannot be reached to refresh expired data. One of the motivations cannot be reached to refresh expired data. One of the motivations
for serve-stale is to make the DNS more resilient to DoS attacks, and for serve-stale is to make the DNS more resilient to DoS attacks, and
thereby make them less attractive as an attack vector. This document thereby make them less attractive as an attack vector. This document
updates the definitions of TTL from RFC 1034 and RFC 1035 so that updates the definitions of TTL from RFC 1034 and RFC 1035 so that
data can be kept in the cache beyond the TTL expiry, updates RFC 2181 data can be kept in the cache beyond the TTL expiry, updates RFC 2181
skipping to change at page 1, line 33 skipping to change at page 1, line 33
rather than 0, and suggests a cap of 7 days. rather than 0, and suggests a cap of 7 days.
Status of This Memo Status of This Memo
This Internet-Draft is submitted in full conformance with the This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79. provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/. Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on March 21, 2020. This Internet-Draft will expire on April 26, 2020.
Copyright Notice Copyright Notice
Copyright (c) 2019 IETF Trust and the persons identified as the Copyright (c) 2019 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (https://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License. described in the Simplified BSD License.
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2
skipping to change at page 3, line 40 skipping to change at page 3, line 40
says the TTL, "specifies the time interval (in seconds) that the says the TTL, "specifies the time interval (in seconds) that the
resource record may be cached before it should be discarded." resource record may be cached before it should be discarded."
A natural English interpretation of these remarks would seem to be A natural English interpretation of these remarks would seem to be
clear enough that records past their TTL expiration must not be used. clear enough that records past their TTL expiration must not be used.
However, [RFC1035] predates the more rigorous terminology of However, [RFC1035] predates the more rigorous terminology of
[RFC2119], which softened the interpretation of "may" and "should". [RFC2119], which softened the interpretation of "may" and "should".
[RFC2181] aimed to provide "the precise definition of the Time to [RFC2181] aimed to provide "the precise definition of the Time to
Live", but in Section 8 was mostly concerned with the numeric range Live", but in Section 8 was mostly concerned with the numeric range
of values and the possibility that very large values should be of values rather than data expiration behavior. It does, however,
capped. (It also has the curious suggestion that a value in the close that section by noting, "The TTL specifies a maximum time to
range 2147483648 to 4294967295 should be treated as zero.) It closes live, not a mandatory time to live." This wording again does not
that section by noting, "The TTL specifies a maximum time to live, contain BCP 14 [RFC2119] key words, but does convey the natural
not a mandatory time to live." This wording again does not contain language connotation that data becomes unusable past TTL expiry.
BCP 14 [RFC2119] key words, but does convey the natural language
connotation that data becomes unusable past TTL expiry.
Several recursive resolver operators, including Akamai, currently use Several recursive resolver operators, including Akamai, currently use
stale data for answers in some way. A number of recursive resolver stale data for answers in some way. A number of recursive resolver
packages (including BIND, Knot, OpenDNS, Unbound) provide options to packages (including BIND, Knot, OpenDNS, Unbound) provide options to
use stale data. Apple macOS can also use stale data as part of the use stale data. Apple macOS can also use stale data as part of the
Happy Eyeballs algorithms in mDNSResponder. The collective Happy Eyeballs algorithms in mDNSResponder. The collective
operational experience is that using stale data can provide operational experience is that using stale data can provide
significant benefit with minimal downside. significant benefit with minimal downside.
4. Standards Action 4. Standards Action
skipping to change at page 4, line 24 skipping to change at page 4, line 23
duration that the resource record MAY be cached before the source duration that the resource record MAY be cached before the source
of the information MUST again be consulted. Zero values are of the information MUST again be consulted. Zero values are
interpreted to mean that the RR can only be used for the interpreted to mean that the RR can only be used for the
transaction in progress, and should not be cached. Values SHOULD transaction in progress, and should not be cached. Values SHOULD
be capped on the order of days to weeks, with a recommended cap be capped on the order of days to weeks, with a recommended cap
of 604,800 seconds (seven days). If the data is unable to be of 604,800 seconds (seven days). If the data is unable to be
authoritatively refreshed when the TTL expires, the record MAY be authoritatively refreshed when the TTL expires, the record MAY be
used as though it is unexpired. See Section 5 and Section 6 for used as though it is unexpired. See Section 5 and Section 6 for
details. details.
Interpreting values which have the high order bit set as being Interpreting values which have the high-order bit set as being
positive, rather than 0, is a change from [RFC2181]. Suggesting a positive, rather than 0, is a change from [RFC2181], the rationale
cap of seven days, rather than the 68 years allowed by [RFC2181], for which is explained in Section 6. Suggesting a cap of seven days,
reflects the current practice of major modern DNS resolvers. rather than the 68 years allowed by [RFC2181], reflects the current
practice of major modern DNS resolvers.
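The two interpretations contrasted above can be illustrated with a minimal Python sketch. It is not text from either draft revision. The function and constant names are hypothetical; only the 604,800-second cap and the RFC 2181 treat-as-zero behavior are taken from the text.

```python
MAX_TTL = 604_800        # recommended cap from Section 4: seven days, in seconds
HIGH_BIT = 0x8000_0000   # most significant bit of the 32-bit TTL field


def ttl_rfc2181(raw: int) -> int:
    """RFC 2181 behavior: a TTL with the high-order bit set is treated as 0."""
    raw &= 0xFFFF_FFFF
    return 0 if raw & HIGH_BIT else raw


def ttl_serve_stale(raw: int, cap: int = MAX_TTL) -> int:
    """Draft behavior: read the field as unsigned, then clamp it to the cap."""
    raw &= 0xFFFF_FFFF
    return min(raw, cap)
```

Under these assumptions, a wire value of 0x80000001 yields 0 with the first function and 604800 with the second, while an ordinary value such as 300 passes through both unchanged.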
When returning a response containing stale records, a recursive When returning a response containing stale records, a recursive
resolver MUST set the TTL of each expired record in the message to a resolver MUST set the TTL of each expired record in the message to a
value greater than 0, with a RECOMMENDED value of 30 seconds. See value greater than 0, with a RECOMMENDED value of 30 seconds. See
Section 6 for explanation. Section 6 for explanation.
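A minimal Python sketch of the TTL rule for stale records follows, assuming a hypothetical cache that stores an absolute expiry time in whole seconds for each record. The names are illustrative and do not come from the draft; only the requirement for a value greater than 0 and the RECOMMENDED 30 seconds do.

```python
STALE_RESPONSE_TTL = 30  # RECOMMENDED value from Section 4, in seconds


def response_ttl(expires_at: int, now: int,
                 stale_ttl: int = STALE_RESPONSE_TTL) -> int:
    """Return the TTL to place on a record in an outgoing response.

    Unexpired records carry their remaining lifetime; expired (stale)
    records carry a fixed TTL, which must be greater than 0.
    """
    if stale_ttl <= 0:
        raise ValueError("TTL on stale records must be greater than 0")
    remaining = expires_at - now
    return remaining if remaining > 0 else stale_ttl
```

A record that expired an hour ago is therefore answered with a TTL of 30 rather than 0 or a negative number.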
Answers from authoritative servers that have a DNS Response Code of Answers from authoritative servers that have a DNS Response Code of
either 0 (NoError) or 3 (NXDomain) and the Authoritative Answer (AA) either 0 (NoError) or 3 (NXDomain) and the Authoritative Answer (AA)
bit set MUST be considered to have refreshed the data at the bit set MUST be considered to have refreshed the data at the
resolver. Answers from authoritative servers that have any other resolver. Answers from authoritative servers that have any other
skipping to change at page 7, line 33 skipping to change at page 7, line 33
Regarding the TTL to set on stale records in the response, Regarding the TTL to set on stale records in the response,
historically TTLs of zero seconds have been problematic for some historically TTLs of zero seconds have been problematic for some
implementations, and negative values can't effectively be implementations, and negative values can't effectively be
communicated to existing software. Other very short TTLs could lead communicated to existing software. Other very short TTLs could lead
to congestive collapse as TTL-respecting clients rapidly try to to congestive collapse as TTL-respecting clients rapidly try to
refresh. The recommended value of 30 seconds not only sidesteps refresh. The recommended value of 30 seconds not only sidesteps
those potential problems with no practical negative consequences, it those potential problems with no practical negative consequences, it
also rate limits further queries from any client that honors the TTL, also rate limits further queries from any client that honors the TTL,
such as a forwarding resolver. such as a forwarding resolver.
As for the change to treat a TTL with the high-order bit set as
positive and then clamping it, as opposed to [RFC2181] treating it as
zero, the rationale here is basically one of engineering simplicity
versus an inconsequential operational history. Negative TTLs had no
rational intentional meaning that wouldn't have been satisfied by
just sending 0 instead, and similarly there was realistically no
practical purpose for sending TTLs of 2^25 seconds (1 year) or more.
There's also no record of TTLs in the wild having the most
significant bit set in DNS-OARC's "Day in the Life" samples. With no
apparent reason for operators to use them intentionally, that leaves
either errors or non-standard experiments as explanations as to why
such TTLs might be encountered, with neither providing an obviously
compelling reason as to why having the leading bit set should be
treated differently from having any of the next eleven bits set and
then capped per Section 4.
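The arithmetic behind "2^25 seconds" and "the next eleven bits" can be checked with a short Python sketch. This is one reading of the paragraph above, assuming a 32-bit TTL field and the seven-day cap from Section 4; the variable names are illustrative only.

```python
TTL_FIELD_BITS = 32
MAX_TTL = 604_800  # seven days, in seconds

# 2^25 seconds expressed in days: a little over one year.
days_in_2_25 = 2**25 / 86_400

# The cap fits in the low 20 bits (bits 0 through 19).
cap_bits = MAX_TTL.bit_length()

# Bits between the cap and the high-order bit (bit 31): bits 20 through 30.
bits_between = (TTL_FIELD_BITS - 1) - cap_bits

# Any TTL with one of those bits set is at least 2^20, so it exceeds the cap.
smallest_with_extra_bit = 1 << cap_bits
```

Because any value with one of bits 20 through 30 set already exceeds the cap, clamping handles it the same way it handles a value with bit 31 set.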
Another implementation consideration is the use of stale nameserver Another implementation consideration is the use of stale nameserver
addresses for lookups. This is mentioned explicitly because, in some addresses for lookups. This is mentioned explicitly because, in some
resolvers, getting the addresses for nameservers is a separate path resolvers, getting the addresses for nameservers is a separate path
from a normal cache lookup. If authoritative server addresses are from a normal cache lookup. If authoritative server addresses are
not able to be refreshed, resolution can possibly still be successful not able to be refreshed, resolution can possibly still be successful
if the authoritative servers themselves are up. For instance, if the authoritative servers themselves are up. For instance,
consider an attack on a top-level domain that takes its nameservers consider an attack on a top-level domain that takes its nameservers
offline; serve-stale resolvers that had expired glue addresses for offline; serve-stale resolvers that had expired glue addresses for
subdomains within that TLD would still be able to resolve names subdomains within that TLD would still be able to resolve names
within those subdomains, even those it had not previously looked up. within those subdomains, even those it had not previously looked up.
skipping to change at page 11, line 41 skipping to change at page 11, line 48
[RFC1034] Mockapetris, P., "Domain names - concepts and facilities", [RFC1034] Mockapetris, P., "Domain names - concepts and facilities",
STD 13, RFC 1034, DOI 10.17487/RFC1034, November 1987, STD 13, RFC 1034, DOI 10.17487/RFC1034, November 1987,
<https://www.rfc-editor.org/info/rfc1034>. <https://www.rfc-editor.org/info/rfc1034>.
[RFC1035] Mockapetris, P., "Domain names - implementation and [RFC1035] Mockapetris, P., "Domain names - implementation and
specification", STD 13, RFC 1035, DOI 10.17487/RFC1035, specification", STD 13, RFC 1035, DOI 10.17487/RFC1035,
November 1987, <https://www.rfc-editor.org/info/rfc1035>. November 1987, <https://www.rfc-editor.org/info/rfc1035>.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, Requirement Levels", BCP 14, RFC 2119,
DOI 10.17487/RFC2119, March 1997, <https://www.rfc- DOI 10.17487/RFC2119, March 1997,
editor.org/info/rfc2119>. <https://www.rfc-editor.org/info/rfc2119>.
[RFC2181] Elz, R. and R. Bush, "Clarifications to the DNS [RFC2181] Elz, R. and R. Bush, "Clarifications to the DNS
Specification", RFC 2181, DOI 10.17487/RFC2181, July 1997, Specification", RFC 2181, DOI 10.17487/RFC2181, July 1997,
<https://www.rfc-editor.org/info/rfc2181>. <https://www.rfc-editor.org/info/rfc2181>.
[RFC2308] Andrews, M., "Negative Caching of DNS Queries (DNS [RFC2308] Andrews, M., "Negative Caching of DNS Queries (DNS
NCACHE)", RFC 2308, DOI 10.17487/RFC2308, March 1998, NCACHE)", RFC 2308, DOI 10.17487/RFC2308, March 1998,
<https://www.rfc-editor.org/info/rfc2308>. <https://www.rfc-editor.org/info/rfc2308>.
[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
 End of changes. 9 change blocks. 
19 lines changed or deleted 34 lines changed or added
