< draft-wkumari-dnsop-hammer-01.txt   draft-wkumari-dnsop-hammer-02.txt >
template W. Kumari Network Working Group W. Kumari
Internet-Draft Google Internet-Draft Google
Intended status: Informational R. Arends Intended status: Informational R. Arends
Expires: January 5, 2015 Nominet Expires: May 3, 2017 Nominet
S. Woolf S. Woolf
D. Migault D. Migault
Orange Orange
July 4, 2014 October 30, 2016
Highly Automated Method for Maintaining Expiring Records Highly Automated Method for Maintaining Expiring Records
draft-wkumari-dnsop-hammer-01 draft-wkumari-dnsop-hammer-02
Abstract Abstract
This document describes a simple DNS cache optimization which keeps This document describes a simple DNS cache optimization which keeps
the most popular records in the DNS cache: Highly Automated Method the most popular records in the DNS cache: Highly Automated Method
for Maintaining Expiring Records (HAMMER). The principle is that for Maintaining Expiring Records (HAMMER). The principle is that
records in the cache are fetched, that is to say resolved before records in the cache are fetched, that is to say resolved before
their TTL expires and the record is flushed from the cache. By their TTL expires and the record is flushed from the cache. By
fetching Records before they are being queried by an end user, HAMMER fetching Records before they are being queried by an end user, HAMMER
is expected to improve the quality of experience of the end users as is expected to improve the quality of experience of the end users as
skipping to change at page 1, line 43 skipping to change at page 1, line 43
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/. Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on January 5, 2015. This Internet-Draft will expire on May 3, 2017.
Copyright Notice Copyright Notice
Copyright (c) 2014 IETF Trust and the persons identified as the Copyright (c) 2016 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as the Trust Legal Provisions and are provided without warranty as
skipping to change at page 2, line 25 skipping to change at page 2, line 25
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2
1.1. Requirements notation . . . . . . . . . . . . . . . . . . 3 1.1. Requirements notation . . . . . . . . . . . . . . . . . . 3
2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 3 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 3
3. Motivations . . . . . . . . . . . . . . . . . . . . . . . . . 3 3. Motivations . . . . . . . . . . . . . . . . . . . . . . . . . 3
3.1. Improving browsing Quality of Experience by reducing 3.1. Improving browsing Quality of Experience by reducing
response time . . . . . . . . . . . . . . . . . . . . . . 3 response time . . . . . . . . . . . . . . . . . . . . . . 3
3.2. Optimize the resources involved in large DNSSEC resolving 3.2. Optimize the resources involved in large DNSSEC resolving
platforms . . . . . . . . . . . . . . . . . . . . . . . . 4 platforms . . . . . . . . . . . . . . . . . . . . . . . . 4
4. Overview of Operation . . . . . . . . . . . . . . . . . . . . 5 4. Overview of Operation . . . . . . . . . . . . . . . . . . . . 4
5. Known implementations . . . . . . . . . . . . . . . . . . . . 5 5. Known implementations . . . . . . . . . . . . . . . . . . . . 5
5.1. Unbound (NLNet Labs) . . . . . . . . . . . . . . . . . . 6 5.1. Unbound (NLNet Labs) . . . . . . . . . . . . . . . . . . 5
5.2. OpenDNS . . . . . . . . . . . . . . . . . . . . . . . . . 6 5.2. OpenDNS . . . . . . . . . . . . . . . . . . . . . . . . . 6
5.3. ISC BIND . . . . . . . . . . . . . . . . . . . . . . . . 6 5.3. ISC BIND . . . . . . . . . . . . . . . . . . . . . . . . 6
6. An example / reference implementation . . . . . . . . . . . 6 6. An example / reference implementation . . . . . . . . . . . 6
6.1. Variables . . . . . . . . . . . . . . . . . . . . . . . . 7 6.1. Variables . . . . . . . . . . . . . . . . . . . . . . . . 7
7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 8 7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 8
8. Security Considerations . . . . . . . . . . . . . . . . . . . 8 8. Security Considerations . . . . . . . . . . . . . . . . . . . 8
9. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 8 9. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 8
10. References . . . . . . . . . . . . . . . . . . . . . . . . . 9 10. References . . . . . . . . . . . . . . . . . . . . . . . . . 8
10.1. Normative References . . . . . . . . . . . . . . . . . . 9 10.1. Normative References . . . . . . . . . . . . . . . . . . 8
10.2. Informative References . . . . . . . . . . . . . . . . . 9 10.2. Informative References . . . . . . . . . . . . . . . . . 8
Appendix A. Changes / Author Notes. . . . . . . . . . . . . . . 9 Appendix A. Changes / Author Notes. . . . . . . . . . . . . . . 9
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 9 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 9
1. Introduction 1. Introduction
A recursive DNS resolver may cache a Resource Record (RR) for, at A recursive DNS resolver may cache a Resource Record (RR) for, at
most, the Time To Live (TTL) associated with that record. While the most, the Time To Live (TTL) associated with that record. While the
TTL is greater than zero, the resolver may respond to queries from TTL is greater than zero, the resolver may respond to queries from
its cache, but once the TTL has reached zero, the resolver flushes its cache; but once the TTL has reached zero, the resolver flushes
the RR. When the resolver gets another query for that resource, it the RR. When the resolver gets another query for that resource, it
needs to initiate a new query. This is then cached and returned to needs to initiate a new query. This is then cached and returned to
the querying client. This document discusses an optimization (Highly the querying client. This document discusses an optimization (Highly
Automated Method for Maintaining Expiring Records -- (HAMMER), also Automated Method for Maintaining Expiring Records -- (HAMMER), also
known as "prefetch") to help keep popular responses in the cache, by known as "prefetch") to help keep popular responses in the cache, by
fetching new responses before the TTL expires. This behavior is fetching new responses before the TTL expires. This behavior is
triggered by an incoming query, and only shortly before the cache triggered by an incoming query that arrives only shortly before the
entry was due to expire. cache entry was due to expire.
1.1. Requirements notation 1.1. Requirements notation
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC2119]. document are to be interpreted as described in [RFC2119].
2. Terminology 2. Terminology
- HAMMER resolver: A DNS resolver that implements the HAMMER mechanism. HAMMER resolver: A DNS resolver that implements the HAMMER mechanism.
- HAMMER FQDN: An FQDN that is a candidate for the HAMMER process. HAMMER FQDN: An FQDN that is a candidate for the HAMMER process.
- HAMMER TIME: TTL Time to consider before triggering the HAMMER HAMMER TIME: TTL Time to consider before triggering the HAMMER
mechanism. mechanism.
3. Motivations 3. Motivations
When a recursive resolver responds to a client, it either responds When a recursive resolver responds to a client, it either responds
from cache, or it initiates an iterative query to resolve the answer, from cache, or it initiates an iterative query to resolve the answer,
caches the answer and then responds with that answer. caches the answer and then responds with that answer.
3.1. Improving browsing Quality of Experience by reducing response time 3.1. Improving browsing Quality of Experience by reducing response time
Any end user querying a fetched FQDN will get the response from the Any end user querying a fetched FQDN will get the response from the
cache of the resolver. This provides faster responses, and thus cache of the resolver. This provides faster responses, thus
improves end user experience for browsing and other applications/ improving the end user experience for browsing and other
activities. applications/activities.
Popular FQDNs are highly queried, and end users have high Popular FQDNs are highly queried, and end users have high
expectations in terms of application response for these FQDNs. With expectations in terms of application responsiveness for these FQDNs.
regular DNS rules, once the FQDN has been flushed from the cache, it With regular DNS rules, once the FQDN has been flushed from the
waits for the next end user to request the FQDN before initiating a cache, it waits for the next end user to request the FQDN before
resolution for this given FQDN with iterative queries. This results initiating a resolution for this given FQDN with iterative queries.
in at least one end user waiting for this resolution to be performed This results in at least one end user waiting for this resolution to
over the Internet before the response is sent to him. This may be performed over the Internet before the response is sent to them.
provide a poor user experience since DNS response times over the This may provide a poor user experience since DNS response times over
Internet are unpredictable at best and it provides a response time the Internet are unpredictable at best and it provides a response
longer than usual. time longer than usual.
In some cases, not only the first end user querying that FQDN may be In some cases, not only the first end user querying that FQDN may be
impacted, but also other end users that request the FQDN between the impacted, but also other end users that request the FQDN between the
time the FQDN TTL expires and the time the cache is again filled. In time the FQDN TTL expires and the time the cache is again filled. In
this case, the result is impact on multiple end users and possible this case, the result is impact on multiple end users and possible
unnecessary load on the platform. Note that this load is increased unnecessary load on the platform. Note that this load is increased
by the use of DNSSEC since DNSSEC may involve additional resolutions, by the use of DNSSEC since DNSSEC may involve additional resolutions,
larger payloads, and signature checks. larger payloads, and signature checks.
DNS response time for a resolution over the Internet is highly
unpredictable as it depends on network congestion and servers'
availability. Links share their bandwidth, so heavily loaded links
result in higher response time, regardless of whether the congestion
occurs close to the resolver, close to the client, or close to the
authoritative servers. Loaded switches or routers may result in
packet drop, which requires the resolver to notice the packet has
been dropped (usually with a time out) and restart the iterative
resolution. These issues are increased by the use of DNSSEC which
makes DNS packets larger. Similarly, loaded servers have longer
response times.
3.2. Optimize the resources involved in large DNSSEC resolving 3.2. Optimize the resources involved in large DNSSEC resolving
platforms platforms
Large resolving platforms are often composed of a set of independent Large resolving platforms are often composed of a set of independent
resolving nodes. The traffic is usually load balanced based on the resolving nodes. The traffic is usually load balanced based on the
query source IP addresses. This results in most popular FQDNs being query source IP addresses. This results in most popular FQDNs being
resolved independently by all nodes. First this increases the number resolved independently by all nodes. This increases the number of
of end users who may experience unnecessary latency. Also, when end users who may experience unnecessary latency. Also, when DNSSEC
DNSSEC is used, all nodes independently perform signature check is used, all nodes independently perform signature check operations,
operations, possibly resulting in high loads on the authoritative possibly resulting in high loads on the authoritative server.
server.
The challenge these large DNSSEC resolving platforms have to overcome The challenge these large DNSSEC resolving platforms have to overcome
is to provide a uniform distribution of the nodes given that end user is to provide a uniform distribution of the nodes given that end user
and FQDNs do not have a uniform distribution of the resources. More and FQDNs do not have a uniform distribution of the resources. More
specifically, FQDNs and end users usually present Zipf popularity specifically, FQDNs and end users usually present Zipf popularity
distributions, which means that most of the traffic is performed by a distributions, which means that most of the traffic is performed by a
small set of end users and by a small set of FQDNs. small set of end users and by a small set of FQDNs.
DNS and large resolving DNS platforms have resulted in uniformly DNS and large resolving DNS platforms have resulted in uniformly
balanced traffic among the nodes. In fact the resolving traffic on balanced traffic among the nodes. In fact the resolving traffic on
skipping to change at page 5, line 32 skipping to change at page 5, line 18
[Ed: Well, this is kinda embarrassing. This idea occurred to us one [Ed: Well, this is kinda embarrassing. This idea occurred to us one
day while sitting around a pool in New Hampshire. It then took a day while sitting around a pool in New Hampshire. It then took a
while before I wrote it down, mostly because I *really* wanted to get while before I wrote it down, mostly because I *really* wanted to get
"Stop! Hammer Time!" into a draft. Anyway, we presented it in "Stop! Hammer Time!" into a draft. Anyway, we presented it in
Berlin, and Wouter Wijngaards stood up and mentioned that Unbound Berlin, and Wouter Wijngaards stood up and mentioned that Unbound
already does this (they use a percentage of TTL, instead of a number already does this (they use a percentage of TTL, instead of a number
of seconds). Then we heard from OpenDNS that they *also* implement of seconds). Then we heard from OpenDNS that they *also* implement
something similar. Then we had a number of discussions, then got something similar. Then we had a number of discussions, then got
sidetracked into other things. Anyway, BIND as of 9.10, around Feb sidetracked into other things. Anyway, BIND as of 9.10, around Feb
2014 now implements something like this (https://deepthought.isc.org/ 2014 now implements something like this
article/AA-01122/0/Early-refresh-of-cache-records-cache-prefetch-in- (https://deepthought.isc.org/article/AA-01122/0/Early-refresh-of-
BIND-9.10.html), and enables it by default. Unfortunately, while cache-records-cache-prefetch-in-BIND-9.10.html), and enables it by
BIND uses the time-based approach, they named their parameters default. Unfortunately, while BIND uses the time-based approach,
"trigger" and "eligibility" - and shouting "Eligibility! Trigger they named their parameters "trigger" and "eligibility" - and
time!" simply isn't funny (unless you have a very odd sense of shouting "Eligibility! Trigger time!" simply isn't funny (unless you
humor)... So, we are now documenting implementations that existed have a very odd sense of humor)... So, we are now documenting
before this was published and an implementation that we think was implementations that existed before this was published and an
based on this. We think that this has value to the community. I'm implementation that we think was based on this. We think that this
also leaving in the HAMMER TIME bit, because it makes me giggle. has value to the community. I'm also leaving in the HAMMER TIME bit,
This below section should be filled out with more detail, in because it makes me giggle. This below section should be filled out
collaboration with the implementors, but this is being written *just* with more detail, in collaboration with the implementors, but this is
before the draft cutoff.]. being written *just* before the draft cutoff.].
A number of recursive resolvers implement techniques similar to the A number of recursive resolvers implement techniques similar to the
techniques described in this document. This section documents some techniques described in this document. This section documents some
of these and tradeoffs they make in picking their techniques. of these and tradeoffs they make in picking their techniques.
5.1. Unbound (NLNet Labs) 5.1. Unbound (NLNet Labs)
The Unbound validating, recursive, and caching DNS resolver The Unbound validating, recursive, and caching DNS resolver
implements a HAMMER type feature, called "prefetch". This feature implements a HAMMER type feature, called "prefetch". This feature
can be enabled or disabled through the configuration option "prefetch: can be enabled or disabled through the configuration option "prefetch:
skipping to change at page 6, line 49 skipping to change at page 6, line 37
When a recursive resolver that implements HAMMER receives a query for When a recursive resolver that implements HAMMER receives a query for
information that it has in the cache, it responds from the cache. information that it has in the cache, it responds from the cache.
If the queried FQDN is a HAMMER FQDN, the HAMMER resolver compares If the queried FQDN is a HAMMER FQDN, the HAMMER resolver compares
the TTL value to the HAMMER TIME, as well as if the FQDN has already the TTL value to the HAMMER TIME, as well as if the FQDN has already
been fetched. been fetched.
If the HAMMER FQDN has already been fetched (or provisioned) then If the HAMMER FQDN has already been fetched (or provisioned) then
nothing is done. nothing is done.
If the HAMMER FQDN has not yet been fetched and the TTL is less then If the HAMMER FQDN has not yet been fetched and the TTL is less than
the HAMMER_TIME, the HAMMER resolver starts a resolution for the the HAMMER_TIME, the HAMMER resolver starts a resolution for the
queried FQDN in order to fill the cache, just as if the TTL had queried FQDN in order to fill the cache, just as if the TTL had
expired. During this cache fill operation the resolver continues to expired. During this cache fill operation the resolver continues to
respond from cache (until the TTL expires). When the cache fill respond from cache (until the TTL expires). When the cache fill
query completes, the new response replaces the existing cached query completes, the new response replaces the existing cached
information. This ensures the cache has fresh data for subsequent information. This ensures the cache has fresh data for subsequent
queries. queries.
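The steps above can be sketched roughly as follows. This is a minimal illustration only, not code from any of the implementations named in this document. The names CacheEntry, handle_query and the resolve callback are hypothetical, every FQDN is assumed to be a HAMMER FQDN, and the cache fill runs synchronously for brevity, whereas a real resolver would presumably perform it in the background while continuing to answer from cache.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple

HAMMER_TIME = 2  # seconds before TTL expiry at which a cache fill is triggered


@dataclass
class CacheEntry:
    answer: str
    expires_at: float      # absolute expiry time, in seconds
    fetched: bool = False  # True once a cache fill has been started for this entry


def handle_query(
    cache: Dict[str, CacheEntry],
    fqdn: str,
    resolve: Callable[[str], Tuple[str, int]],  # hypothetical: returns (answer, ttl)
    now: Optional[float] = None,
) -> str:
    now = time.time() if now is None else now
    entry = cache.get(fqdn)

    # Cache miss or expired entry: ordinary resolution, the client waits.
    if entry is None or entry.expires_at <= now:
        answer, ttl = resolve(fqdn)
        cache[fqdn] = CacheEntry(answer, now + ttl)
        return answer

    # Cache hit: the client is always answered from the existing entry.
    answer = entry.answer
    remaining_ttl = entry.expires_at - now

    # Not yet fetched and close to expiry: start a cache fill.
    if not entry.fetched and remaining_ttl < HAMMER_TIME:
        entry.fetched = True  # guards against duplicate fills in an async design
        new_answer, new_ttl = resolve(fqdn)
        cache[fqdn] = CacheEntry(new_answer, now + new_ttl)

    return answer
```

The fetched flag has little effect in this synchronous sketch, since the entry is replaced immediately; it is kept to mirror the "already been fetched" check in the text.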
Since the cache fill query is initiated before the existing cached Since the cache fill query is initiated before the existing cached
entry expires (and is flushed), responses will come from the cache entry expires (and is flushed), responses will come from the cache
skipping to change at page 8, line 5 skipping to change at page 7, line 39
cause excessive cache fill queries to occur. In order to prevent cause excessive cache fill queries to occur. In order to prevent
this an additional variable named STOP (described below) is this an additional variable named STOP (described below) is
introduced. If the original TTL of the RR is less than STOP * introduced. If the original TTL of the RR is less than STOP *
HAMMER_TIME then the cache entry should be marked with a "Can't touch HAMMER_TIME then the cache entry should be marked with a "Can't touch
this" flag, and the described method should not be used. this" flag, and the described method should not be used.
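As a small sketch of the STOP check described above: the function name cant_touch_this is hypothetical, and the defaults of 2 and 3 are the values given for HAMMER_TIME and STOP in Section 6.1. With those defaults, any RR whose original TTL is below 6 seconds would be excluded from the method.

```python
HAMMER_TIME = 2  # seconds, default from Section 6.1
STOP = 3         # multiplier, default from Section 6.1


def cant_touch_this(original_ttl: int,
                    hammer_time: int = HAMMER_TIME,
                    stop: int = STOP) -> bool:
    """Return True if the entry should be flagged and never prefetched.

    The comparison uses the original TTL of the RR, not the remaining TTL.
    """
    return original_ttl < stop * hammer_time
```

An implementation would presumably evaluate this once, when the RR is first cached, and store the result alongside the entry.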
6.1. Variables 6.1. Variables
These are the mandatory variables: These are the mandatory variables:
- HAMMER_TIME: is the number of seconds before TTL expiration that a HAMMER_TIME: is the number of seconds before TTL expiration that a
cache fill query should be initiated. This should be a user cache fill query should be initiated. This should be a user
configurable value. A default of 2 seconds is RECOMMENDED. configurable value. A default of 2 seconds is RECOMMENDED.
- STOP: should be a user configurable variable. A default of 3 is STOP: should be a user configurable variable. A default of 3 is
recommended. recommended.
Implementations may consider additional variables. These are not Implementations may consider additional variables. These are not
mandatory but would address specific use of the HAMMER. mandatory but would address specific use of the HAMMER.
- HAMMER_MATCH: should be a user configurable variable. It defines HAMMER_MATCH: should be a user configurable variable. It defines
FQDNs that are expected to implement HAMMER. This rule can be FQDNs that are expected to implement HAMMER. This rule can be
expressed in different ways. It can be a list of FQDNs, or a expressed in different ways. It can be a list of FQDNs, or a
number indicating the number of most popular FQDNs that need number indicating the number of most popular FQDNs that need
to be considered. How HAMMER_MATCH is expressed is to be considered. How HAMMER_MATCH is expressed is
implementation dependent. Implementations can use a list of implementation dependent. Implementations can use a list of
FQDNs, others can use a matching rule on the FQDNs, or define FQDNs, others can use a matching rule on the FQDNs, or define
the HAMMER_FQDNs as the X most popular FQDNs. the HAMMER_FQDNs as the X most popular FQDNs.
- HAMMER_FORWARDER: should be a user configurable variable. It is HAMMER_FORWARDER: should be a user configurable variable. It is
optional and designates the DNS server the resolver forwards optional and designates the DNS server the resolver forwards
the request to. the request to.
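One possible way to group these variables is sketched below. The class name HammerConfig, its field names, and the choice to express HAMMER_MATCH as an explicit set of FQDNs are assumptions made for illustration; the text above leaves the representation of HAMMER_MATCH implementation dependent. Treating an unset HAMMER_MATCH as "every FQDN is a candidate" is likewise an assumption.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class HammerConfig:
    hammer_time: int = 2  # mandatory; seconds before TTL expiration
    stop: int = 3         # mandatory; multiplier for the short-TTL check
    # Optional: explicit set of lowercase FQDNs without a trailing dot.
    # None means every FQDN is a candidate (assumption of this sketch).
    hammer_match: Optional[FrozenSet[str]] = None
    # Optional: address of the DNS server that requests are forwarded to.
    hammer_forwarder: Optional[str] = None

    def is_hammer_fqdn(self, fqdn: str) -> bool:
        """Return True if fqdn is a candidate for the HAMMER process."""
        if self.hammer_match is None:
            return True
        return fqdn.rstrip(".").lower() in self.hammer_match
```

For example, HammerConfig(hammer_match=frozenset({"example.com"})) would restrict the mechanism to that single name, while HammerConfig() applies the defaults to all names.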
7. IANA Considerations 7. IANA Considerations
This document makes no request of the IANA. This document makes no request of the IANA.
8. Security Considerations 8. Security Considerations
This technique leverages existing protocols, and should not introduce This technique leverages existing protocols, and should not introduce
any new risks, other than a slight increase in traffic. any new risks, other than a slight increase in traffic.
By initiating cache fill entries before the existing RR has expired By initiating cache fill entries before the existing RR has expired
this technique will slightly increase the number of queries seen by this technique will slightly increase the number of queries seen by
authoritative servers. This increase will be inversely proportional authoritative servers. This increase will be inversely proportional
to the average TTL of the records that they serve. to the average TTL of the records that they serve.
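To give a rough sense of scale, the sketch below estimates the relative increase under a simplified model that is an assumption of this example, not something stated in the draft: a record is continuously popular, so without HAMMER it is refetched once every TTL seconds, and with HAMMER it is refetched at the earliest every (TTL - HAMMER_TIME) seconds. The function name is hypothetical, and the result is an upper bound, since a cache fill is only triggered when a query actually arrives inside the HAMMER_TIME window.

```python
def relative_query_increase(ttl: float, hammer_time: float = 2) -> float:
    """Upper-bound fractional increase in queries to the authoritative server.

    Rate without HAMMER: 1 / ttl. Rate with HAMMER: 1 / (ttl - hammer_time).
    The ratio of the two, minus one, is hammer_time / (ttl - hammer_time).
    """
    if ttl <= hammer_time:
        raise ValueError("TTL must be greater than HAMMER_TIME")
    return hammer_time / (ttl - hammer_time)
```

Under this model, a TTL of 300 seconds with the default HAMMER_TIME of 2 gives an increase of 2/298, under one percent, while a TTL of 10 seconds gives 2/8, or 25 percent. For TTLs much larger than HAMMER_TIME the increase is approximately HAMMER_TIME / TTL, consistent with the inverse relationship described above.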
It is unlikely, but possible that this increase could cause a denial It is unlikely, but possible, that this increase could cause a denial
of service condition. of service condition.
9. Acknowledgements 9. Acknowledgements
The authors wish to thank Tony Finch and MC Hammer. We also wish to The authors wish to thank Tony Finch and MC Hammer. We also wish to
thank Brian Somers and Wouter Wijngaards for telling us that they thank Brian Somers and Wouter Wijngaards for telling us that they
already do this :-) (They should probably be co-authors, but I left already do this :-) (They should probably be co-authors, but I left
this too close to the draft cutoff time to confirm with them that this too close to the draft cutoff time to confirm with them that
they are willing to have thier names on this). they are willing to have their names on this).
10. References 10. References
10.1. Normative References 10.1. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997. Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/
RFC2119, March 1997,
<http://www.rfc-editor.org/info/rfc2119>.
10.2. Informative References 10.2. Informative References
[I-D.ietf-sidr-iana-objects] [I-D.ietf-sidr-iana-objects]
Manderson, T., Vegoda, L., and S. Kent, "RPKI Objects Manderson, T., Vegoda, L., and S. Kent, "RPKI Objects
issued by IANA", draft-ietf-sidr-iana-objects-03 (work in issued by IANA", draft-ietf-sidr-iana-objects-03 (work in
progress), May 2011. progress), May 2011.
Appendix A. Changes / Author Notes. Appendix A. Changes / Author Notes.
[RFC Editor: Please remove this section before publication ] [RFC Editor: Please remove this section before publication ]
From -01 to -02:
o Readability / cleanup.
o Tried to make it more clear that most implementations now support
this (although they call it "prefetch" )
From -00 to 01: From -00 to 01:
o Fairly large rewrite. o Fairly large rewrite.
o Added text on the fact that there are implementations that do this. o Added text on the fact that there are implementations that do this.
o Added the "prefetch" name, cleaned up some readability. o Added the "prefetch" name, cleaned up some readability.
o Daniel's test (Section 3.2) added. o Daniel's test (Section 3.2) added.
 End of changes. 28 change blocks. 
67 lines changed or deleted 63 lines changed or added
