| < draft-ietf-dnsop-serve-stale-08.txt | draft-ietf-dnsop-serve-stale-09.txt > | |||
|---|---|---|---|---|
| DNSOP Working Group D. Lawrence | DNSOP Working Group D. Lawrence | |||
| Internet-Draft Oracle | Internet-Draft Oracle | |||
| Updates: 1034, 1035, 2181 (if approved) W. Kumari | Updates: 1034, 1035, 2181 (if approved) W. Kumari | |||
| Intended status: Standards Track P. Sood | Intended status: Standards Track P. Sood | |||
| Expires: March 21, 2020 Google | Expires: April 26, 2020 Google | |||
| September 18, 2019 | October 24, 2019 | |||
| Serving Stale Data to Improve DNS Resiliency | Serving Stale Data to Improve DNS Resiliency | |||
| draft-ietf-dnsop-serve-stale-08 | draft-ietf-dnsop-serve-stale-09 | |||
| Abstract | Abstract | |||
| This draft defines a method (serve-stale) for recursive resolvers to | This draft defines a method (serve-stale) for recursive resolvers to | |||
| use stale DNS data to avoid outages when authoritative nameservers | use stale DNS data to avoid outages when authoritative nameservers | |||
| cannot be reached to refresh expired data. One of the motivations | cannot be reached to refresh expired data. One of the motivations | |||
| for serve-stale is to make the DNS more resilient to DoS attacks, and | for serve-stale is to make the DNS more resilient to DoS attacks, and | |||
| thereby make them less attractive as an attack vector. This document | thereby make them less attractive as an attack vector. This document | |||
| updates the definitions of TTL from RFC 1034 and RFC 1035 so that | updates the definitions of TTL from RFC 1034 and RFC 1035 so that | |||
| data can be kept in the cache beyond the TTL expiry, updates RFC 2181 | data can be kept in the cache beyond the TTL expiry, updates RFC 2181 | |||
| skipping to change at page 1, line 33 ¶ | skipping to change at page 1, line 33 ¶ | |||
| rather than 0, and suggests a cap of 7 days. | rather than 0, and suggests a cap of 7 days. | |||
| Status of This Memo | Status of This Memo | |||
| This Internet-Draft is submitted in full conformance with the | This Internet-Draft is submitted in full conformance with the | |||
| provisions of BCP 78 and BCP 79. | provisions of BCP 78 and BCP 79. | |||
| Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
| Task Force (IETF). Note that other groups may also distribute | Task Force (IETF). Note that other groups may also distribute | |||
| working documents as Internet-Drafts. The list of current Internet- | working documents as Internet-Drafts. The list of current Internet- | |||
| Drafts is at http://datatracker.ietf.org/drafts/current/. | Drafts is at https://datatracker.ietf.org/drafts/current/. | |||
| Internet-Drafts are draft documents valid for a maximum of six months | Internet-Drafts are draft documents valid for a maximum of six months | |||
| and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
| time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
| material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
| This Internet-Draft will expire on March 21, 2020. | This Internet-Draft will expire on April 26, 2020. | |||
| Copyright Notice | Copyright Notice | |||
| Copyright (c) 2019 IETF Trust and the persons identified as the | Copyright (c) 2019 IETF Trust and the persons identified as the | |||
| document authors. All rights reserved. | document authors. All rights reserved. | |||
| This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
| Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
| (http://trustee.ietf.org/license-info) in effect on the date of | (https://trustee.ietf.org/license-info) in effect on the date of | |||
| publication of this document. Please review these documents | publication of this document. Please review these documents | |||
| carefully, as they describe your rights and restrictions with respect | carefully, as they describe your rights and restrictions with respect | |||
| to this document. Code Components extracted from this document must | to this document. Code Components extracted from this document must | |||
| include Simplified BSD License text as described in Section 4.e of | include Simplified BSD License text as described in Section 4.e of | |||
| the Trust Legal Provisions and are provided without warranty as | the Trust Legal Provisions and are provided without warranty as | |||
| described in the Simplified BSD License. | described in the Simplified BSD License. | |||
| Table of Contents | Table of Contents | |||
| 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 | 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 | |||
| skipping to change at page 3, line 40 ¶ | skipping to change at page 3, line 40 ¶ | |||
| says the TTL, "specifies the time interval (in seconds) that the | says the TTL, "specifies the time interval (in seconds) that the | |||
| resource record may be cached before it should be discarded." | resource record may be cached before it should be discarded." | |||
| A natural English interpretation of these remarks would seem to be | A natural English interpretation of these remarks would seem to be | |||
| clear enough that records past their TTL expiration must not be used. | clear enough that records past their TTL expiration must not be used. | |||
| However, [RFC1035] predates the more rigorous terminology of | However, [RFC1035] predates the more rigorous terminology of | |||
| [RFC2119] which softened the interpretation of "may" and "should". | [RFC2119] which softened the interpretation of "may" and "should". | |||
| [RFC2181] aimed to provide "the precise definition of the Time to | [RFC2181] aimed to provide "the precise definition of the Time to | |||
| Live", but in Section 8 was mostly concerned with the numeric range | Live", but in Section 8 was mostly concerned with the numeric range | |||
| of values and the possibility that very large values should be | of values rather than data expiration behavior. It does, however, | |||
| capped. (It also has the curious suggestion that a value in the | close that section by noting, "The TTL specifies a maximum time to | |||
| range 2147483648 to 4294967295 should be treated as zero.) It closes | live, not a mandatory time to live." This wording again does not | |||
| that section by noting, "The TTL specifies a maximum time to live, | contain BCP 14 [RFC2119] key words, but does convey the natural | |||
| not a mandatory time to live." This wording again does not contain | language connotation that data becomes unusable past TTL expiry. | |||
| BCP 14 [RFC2119] key words, but does convey the natural language | ||||
| connotation that data becomes unusable past TTL expiry. | ||||
| Several recursive resolver operators, including Akamai, currently use | Several recursive resolver operators, including Akamai, currently use | |||
| stale data for answers in some way. A number of recursive resolver | stale data for answers in some way. A number of recursive resolver | |||
| packages (including BIND, Knot, OpenDNS, Unbound) provide options to | packages (including BIND, Knot, OpenDNS, Unbound) provide options to | |||
| use stale data. Apple MacOS can also use stale data as part of the | use stale data. Apple MacOS can also use stale data as part of the | |||
| Happy Eyeballs algorithms in mDNSResponder. The collective | Happy Eyeballs algorithms in mDNSResponder. The collective | |||
| operational experience is that using stale data can provide | operational experience is that using stale data can provide | |||
| significant benefit with minimal downside. | significant benefit with minimal downside. | |||
| 4. Standards Action | 4. Standards Action | |||
| skipping to change at page 4, line 24 ¶ | skipping to change at page 4, line 23 ¶ | |||
| duration that the resource record MAY be cached before the source | duration that the resource record MAY be cached before the source | |||
| of the information MUST again be consulted. Zero values are | of the information MUST again be consulted. Zero values are | |||
| interpreted to mean that the RR can only be used for the | interpreted to mean that the RR can only be used for the | |||
| transaction in progress, and should not be cached. Values SHOULD | transaction in progress, and should not be cached. Values SHOULD | |||
| be capped on the orders of days to weeks, with a recommended cap | be capped on the orders of days to weeks, with a recommended cap | |||
| of 604,800 seconds (seven days). If the data is unable to be | of 604,800 seconds (seven days). If the data is unable to be | |||
| authoritatively refreshed when the TTL expires, the record MAY be | authoritatively refreshed when the TTL expires, the record MAY be | |||
| used as though it is unexpired. See the Section 5 and Section 6 | used as though it is unexpired. See the Section 5 and Section 6 | |||
| sections for details. | sections for details. | |||
| Interpreting values which have the high order bit set as being | Interpreting values which have the high-order bit set as being | |||
| positive, rather than 0, is a change from [RFC2181]. Suggesting a | positive, rather than 0, is a change from [RFC2181], the rationale | |||
| cap of seven days, rather than the 68 years allowed by [RFC2181], | for which is explained in Section 6. Suggesting a cap of seven days, | |||
| reflects the current practice of major modern DNS resolvers. | rather than the 68 years allowed by [RFC2181], reflects the current | |||
| practice of major modern DNS resolvers. | ||||
| When returning a response containing stale records, a recursive | When returning a response containing stale records, a recursive | |||
| resolver MUST set the TTL of each expired record in the message to a | resolver MUST set the TTL of each expired record in the message to a | |||
| value greater than 0, with a RECOMMENDED value of 30 seconds. See | value greater than 0, with a RECOMMENDED value of 30 seconds. See | |||
| Section 6 for explanation. | Section 6 for explanation. | |||
| Answers from authoritative servers that have a DNS Response Code of | Answers from authoritative servers that have a DNS Response Code of | |||
| either 0 (NoError) or 3 (NXDomain) and the Authoritative Answers (AA) | either 0 (NoError) or 3 (NXDomain) and the Authoritative Answers (AA) | |||
| bit set MUST be considered to have refreshed the data at the | bit set MUST be considered to have refreshed the data at the | |||
| resolver. Answers from authoritative servers that have any other | resolver. Answers from authoritative servers that have any other | |||
| skipping to change at page 7, line 33 ¶ | skipping to change at page 7, line 33 ¶ | |||
| Regarding the TTL to set on stale records in the response, | Regarding the TTL to set on stale records in the response, | |||
| historically TTLs of zero seconds have been problematic for some | historically TTLs of zero seconds have been problematic for some | |||
| implementations, and negative values can't effectively be | implementations, and negative values can't effectively be | |||
| communicated to existing software. Other very short TTLs could lead | communicated to existing software. Other very short TTLs could lead | |||
| to congestive collapse as TTL-respecting clients rapidly try to | to congestive collapse as TTL-respecting clients rapidly try to | |||
| refresh. The recommended value of 30 seconds not only sidesteps | refresh. The recommended value of 30 seconds not only sidesteps | |||
| those potential problems with no practical negative consequences, it | those potential problems with no practical negative consequences, it | |||
| also rate limits further queries from any client that honors the TTL, | also rate limits further queries from any client that honors the TTL, | |||
| such as a forwarding resolver. | such as a forwarding resolver. | |||
| As for the change to treat a TTL with the high-order bit set as | ||||
| positive and then clamping it, as opposed to [RFC2181] treating it as | ||||
| zero, the rationale here is basically one of engineering simplicity | ||||
| versus an inconsequential operational history. Negative TTLs had no | ||||
| rational intentional meaning that wouldn't have been satisfied by | ||||
| just sending 0 instead, and similarly there was realistically no | ||||
| practical purpose for sending TTLs of 2^25 seconds (1 year) or more. | ||||
| There's also no record of TTLs in the wild having the most | ||||
| significant bit set in DNS-OARC's "Day in the Life" samples. With no | ||||
| apparent reason for operators to use them intentionally, that leaves | ||||
| either errors or non-standard experiments as explanations as to why | ||||
| such TTLs might be encountered, with neither providing an obviously | ||||
| compelling reason as to why having the leading bit set should be | ||||
| treated differently from having any of the next eleven bits set and | ||||
| then capped per Section 4. | ||||
| Another implementation consideration is the use of stale nameserver | Another implementation consideration is the use of stale nameserver | |||
| addresses for lookups. This is mentioned explicitly because, in some | addresses for lookups. This is mentioned explicitly because, in some | |||
| resolvers, getting the addresses for nameservers is a separate path | resolvers, getting the addresses for nameservers is a separate path | |||
| from a normal cache lookup. If authoritative server addresses are | from a normal cache lookup. If authoritative server addresses are | |||
| not able to be refreshed, resolution can possibly still be successful | not able to be refreshed, resolution can possibly still be successful | |||
| if the authoritative servers themselves are up. For instance, | if the authoritative servers themselves are up. For instance, | |||
| consider an attack on a top-level domain that takes its nameservers | consider an attack on a top-level domain that takes its nameservers | |||
| offline; serve-stale resolvers that had expired glue addresses for | offline; serve-stale resolvers that had expired glue addresses for | |||
| subdomains within that TLD would still be able to resolve names | subdomains within that TLD would still be able to resolve names | |||
| within those subdomains, even those it had not previously looked up. | within those subdomains, even those it had not previously looked up. | |||
| skipping to change at page 11, line 41 ¶ | skipping to change at page 11, line 48 ¶ | |||
| [RFC1034] Mockapetris, P., "Domain names - concepts and facilities", | [RFC1034] Mockapetris, P., "Domain names - concepts and facilities", | |||
| STD 13, RFC 1034, DOI 10.17487/RFC1034, November 1987, | STD 13, RFC 1034, DOI 10.17487/RFC1034, November 1987, | |||
| <https://www.rfc-editor.org/info/rfc1034>. | <https://www.rfc-editor.org/info/rfc1034>. | |||
| [RFC1035] Mockapetris, P., "Domain names - implementation and | [RFC1035] Mockapetris, P., "Domain names - implementation and | |||
| specification", STD 13, RFC 1035, DOI 10.17487/RFC1035, | specification", STD 13, RFC 1035, DOI 10.17487/RFC1035, | |||
| November 1987, <https://www.rfc-editor.org/info/rfc1035>. | November 1987, <https://www.rfc-editor.org/info/rfc1035>. | |||
| [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | |||
| Requirement Levels", BCP 14, RFC 2119, | Requirement Levels", BCP 14, RFC 2119, | |||
| DOI 10.17487/RFC2119, March 1997, <https://www.rfc- | DOI 10.17487/RFC2119, March 1997, | |||
| editor.org/info/rfc2119>. | <https://www.rfc-editor.org/info/rfc2119>. | |||
| [RFC2181] Elz, R. and R. Bush, "Clarifications to the DNS | [RFC2181] Elz, R. and R. Bush, "Clarifications to the DNS | |||
| Specification", RFC 2181, DOI 10.17487/RFC2181, July 1997, | Specification", RFC 2181, DOI 10.17487/RFC2181, July 1997, | |||
| <https://www.rfc-editor.org/info/rfc2181>. | <https://www.rfc-editor.org/info/rfc2181>. | |||
| [RFC2308] Andrews, M., "Negative Caching of DNS Queries (DNS | [RFC2308] Andrews, M., "Negative Caching of DNS Queries (DNS | |||
| NCACHE)", RFC 2308, DOI 10.17487/RFC2308, March 1998, | NCACHE)", RFC 2308, DOI 10.17487/RFC2308, March 1998, | |||
| <https://www.rfc-editor.org/info/rfc2308>. | <https://www.rfc-editor.org/info/rfc2308>. | |||
| [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC | [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC | |||
| End of changes. 9 change blocks. | ||||
| 19 lines changed or deleted | 34 lines changed or added | |||
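
The TTL rules touched by this revision can be sketched in a few lines of Python. This is only an illustration of the draft text quoted above, not code from the draft or from any resolver: the function and constant names (`effective_ttl`, `response_ttl`, `refreshes_data`, `MAX_TTL`, `STALE_ANSWER_TTL`) are hypothetical, and the sketch assumes the TTL has already been parsed from the wire as a non-negative 32-bit integer.

```python
MAX_TTL = 604_800       # recommended cap from Section 4: seven days, in seconds
STALE_ANSWER_TTL = 30   # RECOMMENDED TTL for stale records in a response


def effective_ttl(wire_ttl: int, cap: int = MAX_TTL) -> int:
    """Treat the 32-bit wire TTL as unsigned, then clamp it to the cap.

    A value with the high-order bit set is therefore capped like any
    other large value, rather than being treated as zero per RFC 2181.
    """
    if not 0 <= wire_ttl <= 0xFFFFFFFF:
        raise ValueError("TTL must fit in 32 bits")
    return min(wire_ttl, cap)


def response_ttl(remaining: int) -> int:
    """TTL to put on a record in a response.

    Unexpired records keep their remaining lifetime; expired (stale)
    records get a value greater than 0, here the recommended 30 seconds.
    """
    return remaining if remaining > 0 else STALE_ANSWER_TTL


def refreshes_data(rcode: int, aa: bool) -> bool:
    """An answer refreshes cached data only if it is authoritative (AA set)
    and its response code is 0 (NoError) or 3 (NXDomain)."""
    return aa and rcode in (0, 3)
```

Under these assumptions, `effective_ttl(0x80000000)` is capped at 604,800 seconds instead of being read as zero, which is the behavioral change from RFC 2181 that the new Section 6 paragraph justifies.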