<?xml version="1.0" encoding="US-ASCII"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd" [
<!ENTITY rfc2119 PUBLIC "" ".//reference.RFC.2119.xml">
]>
<!-- WK: Set category, IPR, docName -->
<rfc category="info" docName="draft-wkumari-dnsop-hammer-03" ipr="trust200902">
  <?xml-stylesheet type='text/xsl' href='rfc2629.xslt' ?>

  <?rfc toc="yes" ?>
  <?rfc symrefs="yes" ?>
  <?rfc sortrefs="yes"?>
  <?rfc iprnotified="no" ?>
  <?rfc strict="yes"?>
  <?rfc compact="yes" ?>
  <front>
    <!-- WK: Set long title. -->

    <title abbrev="hammer">Highly Automated Method for Maintaining Expiring
    Records</title>

    <author fullname="Warren Kumari" initials="W." surname="Kumari">
      <organization>Google</organization>

      <address>
        <postal>
          <street>1600 Amphitheatre Parkway</street>
          <city>Mountain View, CA</city>
          <code>94043</code>
          <country>US</country>
        </postal>

        <email>warren@kumari.net</email>
      </address>
    </author>

    <author fullname="Roy Arends" initials="R." surname="Arends">
      <organization>ICANN</organization>
      <address>
        <email>roy.arends@icann.org</email>
      </address>
    </author>

    <author fullname="Suzanne Woolf" initials="S." surname="Woolf">
      <address>
        <postal>
          <street>39 Dodge St. #317</street>
          <city>Beverly, MA</city>
          <code>01915</code>
          <country>US</country>
        </postal>
        <email>suzworldwide@gmail.com</email>
      </address>
    </author>

    <author fullname="Daniel Migault" initials="D." surname="Migault">
      <organization>Ericsson</organization>

      <address>
        <postal>
          <street>2039 Rue Cohen</street>
          <city>Saint-Laurent</city>
          <code>H4R 2A4</code>
          <country>Canada</country>
        </postal>

        <email>daniel.migaultf@ericsson.com</email>
      </address>
    </author>

    <date />

    <abstract> 

<t>This document describes a simple DNS cache optimization which keeps
the most popular Resource Records set (RRset) in the DNS cache: Highly
Automated Method for Maintaining Expiring Records (HAMMER).</t>

<t> The principle is that popular RRset in the cache are fetched, that
is to say resolved before their TTL expires and flushed. By fetching
RRset before they are being queried by an end user, that is to say
prefetched, HAMMER is expected to improve the quality of experience of
the end users as well as to optimize the resources involved in large
DNSSEC resolving platforms.</t> 

</abstract>
 </front>

  <middle>
    <section title="Introduction">

<t>A recursive DNS resolver may cache a Resource Record set (RRset) for,
at most, the Time To Live (TTL) associated with that RRset. While the
TTL is greater than zero, the resolver may respond to queries from its
cache; but once the TTL has reached zero, the resolver flushes the
RRset.  When the resolver gets another query for that RRset, the RRset
is not anymore in the cache, thus the resolver need to proceed to a new
resolution for that RRset with its associated latency and processing.
The resolved RRset are then cached and returned to the original querying
client. This document discusses an optimization (Highly Automated Method
for Maintaining Expiring Records -- (HAMMER), also known as "prefetch")
to help keep popular responses in the cache, by fetching (or resolving)
resources before their TTL expires.</t>

<t>In that document, a resolver implementing HAMMER (HAMMER resolver)
prefetches a RRset candidate to HAMMER (HAMMER RRset) when it receives a
query and its TTL is lower than HAMMER TIME.</t>

<t>Note that <xref target="RFC4035"/> assumes that all RR of a RRset
have the same TTL, while <xref target="RFC2181"/> allows the TTL of the
RR of a RRset to be different. When the RRset does not follow <xref
target="RFC4035"/>, the TTL of the RRset that is considered is the
minimum value of the TTL.</t>   

      <section title="Requirements notation">

<t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in <xref
target="RFC2119"/>.</t>

      </section>
    </section>

    <section title="Terminology">
      <t><list hangIndent="6" style="hanging">

<t hangText="HAMMER resolver:">A DNS resolver that implements HAMMER
mechanism.</t>

<t hangText="HAMMER RRset:">A RRset that is a candidate for the HAMMER
process.</t>

<t hangText="HAMMER TIME:"> a number of seconds that indicates the
period preceding the TTL expiration time. When a query for a HAMMER
RRset is received during that period, the HAMMER resolver prefetches
the HAMMER RRset by initiating a resolution. </t>

        </list></t>
    </section>

    <section title="Motivations">

<t>When a recursive resolver responds to a client, it either responds
from cache, or it initiates an iterative query to resolve the answer,
caches the answer and then responds with that answer.</t>

      <section title="Improving browsing Quality of Experience by
reducing response time" anchor="sec-user-qos">

<t>Any end user querying a fetched RRset will get the response from the
cache of the resolver. This provides faster responses, thus improving
the end user experience for browsing and other
applications/activities.</t>

 <t>Popular RRsets are highly queried, and end users have high
expectations in terms of application responsiveness for these RRsets.
With regular DNS rules, once the RRset has been flushed from the cache,
it waits for the next end user to request the RRset before initiating a
resolution for this given RRset with iterative queries. This results in
at least one end user waiting for this resolution to be performed over
the Internet before the response is sent to them. This may provide a
poor user experience since DNS response times over the Internet are
unpredictable at best and it provides a response time longer then usual.
Note that the response time may also be increased  by the use of DNSSEC
since DNSSEC may involve additional resolutions, larger payloads, and
signature checks.</t>

<t>In addition to that first end user querying RRset after it has been
flushed, end users querying the RRset during its resolution or fetching
phase are also impacted. As a result, especially for popular RRsets,
multiple end users are likely to be impacted and be provided a poor user
experience.  </t>

<t>The impact on users also depends on the architecture of resolving
platform architecture. In some case, a centralized resolver is
implemented as a farm of independent resolving nodes and the traffic is
split between the nodes according to the IP addresses and ports. In such
architectures, the number of affected end users is proportional to the
number of resolving nodes as each query is pseudo-randomly associated to
one of the resolving node. Similarly, some global resolving platform
uses anycast to steer the queries to the resolving node associated with
the shortest path. Unless all queries comes from a single region, such
architecture are also expected to impact a number of user proportional
to the number of resolving nodes.</t>

      </section>

      <section title="Optimize the resources involved in large DNSSEC
resolving platforms">
        
<t>As mentioned in <xref target="sec-user-qos"/>, large resolving
platforms are often composed of a set of independent resolving nodes in
order to distribute the traffic between these nodes. Traffic can be
distributed using various forms of load balancing between the resolving
nodes. This includes, for example, a pseudo-random distribution when
load balancing is based on the hash of the IP addresses and ports or
shortest path when anycast is deployed. Such distributions split the
traffic independently of the queried RRset. Ignoring the coordination of
the resolutions implicitly assumes the resource to perform the
resolutions is negligible compared to those necessary to handle the
queries of the end users.</t> 

<t>As a result, such platforms perform multiple parallel resolutions on
their various nodes. With DNS, the necessary resource associated to the
resolution were in fact minimal so little effort have been considered to
synchronize these nodes in order to reduce the number of resolutions. On
the other hand DNSSEC resolutions involve additional resolutions, larger
payloads and signature checks. The consequent increase of resource to
perform DNSSEC resolutions versus DNS resolution makes parallel
resolutions a non negligible lost of resource and leave place for
synchronization mechanisms.</t>
    
<t>One way to reduce the number of DNSSEC resolutions is to prefetch (or
provision) the nodes with the most popular RRsets before their TTL
expire. Note that in this case, the resolution is not performed by the
resolving node. At a node level, prefetching increases the nodes
availability. At the platform level, synchronizing the resolving nodes'
resolution globally reduce the number of resolution and so the overall
resource of the platform.</t>
 
<t>Synchronization of the resolution  may be performed by configuring
each node as a forwarder for these RRsets. This avoids parallel
resolutions and overall reduces cost, because signature checks are not
performed by each resolving node. In this case prefetching enables to
still benefit from the already existing load balancing architecture that
split the load of the end users' queries traffic between the nodes.
Note that the advantages of synchronizing the resolutions between the
resolving nodes may depend on the popularity of the RRsets. This
architecture takes advantage of the Zipf <xref target="ZIPF"/>
distribution of the RRsets' popularity. In fact, a few number of RRsets
need to be cached (a few thousands) to address most of the traffic (up
to 70%) <xref target="PREFETCH"/>.</t>


      </section>
    </section>


    <section title="Protocol Description">

<t>This section describes HAMMER. This section is not normative and
implementation may implement this mechanism with their own flavor.</t> 

<t>When a recursive resolver that implements HAMMER receives a query for
a HAMMER RRset that it has in the cache, it responds from the cache.</t>

<t>If the queried RRset is a HAMMER RRset, the HAMMER resolver compares
the TTL value to the HAMMER TIME, as well as if the RRset is being
(pre)fetched.</t>

<t>If the HAMMER RRset has a TTL greater then the HAMMER TIME, nothing
is done.</t>

<t>If the HAMMER RRset has a TTL less than the HAMMER TIME, the HAMMER
resolver starts a resolution for the RRset in order to fill the cache,
just as if the TTL had expired. The HAMMER RRset is prefetched. Note
that during the resolution, the HAMMER RRset is still cached, and
queries are responded form the cache until the TTL expires. Once the
resolution is performed, the freshly resolved RRset replace the existing
cached RRset. This ensures the cache has fresh data for subsequent
queries.</t>

<t>Since prefetching is initiated before the existing cached entry
expires (and is flushed), responses will come from the cache more often.
This decreases the client resolution latency and improves the user
experience.</t>

<t>Prefetching is triggered by an incoming query (and only if that query
arrives shortly before the record would expire anyway).  This
effectively keeps the most popular RRsets uniformly queried in the
cache, without having to maintain counters in the cache or proactively
resolve responses that are not likely to be needed as often. This is
purely an implementation optimization - resolvers always have the option
to cache records for less than the TTL (for example, when running low on
cache space, etc), this simply triggers a refresh of the RRset before it
expires.</t>

<t>Note that non-uniformly queried RRsets may be popular and may not
benefit from the HAMMER mechanism. For example, a RRset MAY be heavily
queried the first 10 minutes of every hour with a 30 minute TTL. In that
case DNS queries are not expected to come between TTL - HAMMER TIME and
TTL.</t>

<t>HAMMER RRset with small TTL may generate a prefetching process even
though they are not so popular. Suppose an end user is setting a
specific session which requires multiple DNS resolutions on a given
FQDN. These resolutions are necessary for a short period of time, i.e.
the necessary time to establish the session. If these RRset have been
set with a small TTL - in the order of the time session establishment -
the multiple queries to a HAMMER resolver may trigger an unnecessary
resolution. As a result HAMMER would not scale thousands of these
RRsets.  As a result, if the original TTL of the RRset is less than (or
close to HAMMER TIME), the described method could cause excessive
prefetching queries to occur. In order to prevent this an additional
variable named STOP (described below) is introduced. If the original TTL
of the RRset is less than STOP * HAMMER TIME then the cache entry should
be marked with a "Can't touch this" flag, and the described method
should not be used.</t>

      </section>

      <section title="Configuration Variables">

<t>These are the mandatory variables: <list hangIndent="6"
style="hanging">

<t hangText="HAMMER TIME:"> a number of seconds that indicates the
period preceding the TTL expiration time. When a query for a HAMMER
RRset is received during that period, the HAMMER resolver prefetches
the HAMMER RRset by initiating a resolution. A default of 2 seconds is
RECOMMENDED.</t>

<t hangText="STOP:">should be a user configurable variable. A default of
3 is recommended.</t>

          </list></t>

<t>Implementations may consider additional variables. These are not
mandatory but would address specific use of the HAMMER. <list
hangIndent="6" style="hanging">

<t hangText="HAMMER MATCH:">should be a user configurable variable. It
defines RRsets that are expected to implement HAMMER.  This rule can be
expressed in different ways. It can be a list of RRsets, or a number
indicating the number of most popular RRsets that needs to be considered.
How HAMMER MATCH is expressed is implementation dependent.
Implementations can use a list of FQDNs, others can use a matching rule
on the RRsets, or define the HAMMER RRsets as the X most popular
RRsets.</t>

<t hangText="HAMMER FORWARDER:">should be a user configurable variable.
It is optional and designates the DNS server the resolver forwards the
request to.</t>

          </list></t>
    </section>

    <section title="IANA Considerations">

<t>This document makes no request of the IANA.</t>

    </section>

    <section title="Security Considerations">

<t>This technique leverages existing protocols, and should not introduce
any new risks, other than a slight increase in traffic.</t>

<t>By initiating cache fill entries before the existing RR has expired
this technique will slightly increase the number of queries seen by
authoritative servers. This increase will be inversely proportional to
the average TTL of the records that they serve.</t>

<t>It is unlikely, but possible, that this increase could cause a denial
of service condition.</t> 

</section>

    <section title="Acknowledgements">

<t>The authors wish to thank Tony Finch and MC Hammer. We also wish to
thank Brian Somers and Wouter Wijngaards for telling us that they
already do this :-) (They should probably be co-authors, but I left this
too close to the draft cutoff time to confirm with them that they are
willing to have their names on this).</t>

    </section>
  </middle>

  <back>
    <references title="Normative References">
      <?rfc include='reference.RFC.2119'?>
      <?rfc include='reference.RFC.2181'?>
      <?rfc include='reference.RFC.4035'?>

    </references>

    <references title="Informative References">

     <reference anchor="ZIPF" target="http://aclweb.org/anthology/W98-1218">
        <front>
        <title>Applications and Explanations of Zipf's Law </title>
        <author fullname="David W. Powers" initials="D. W." surname="Powers">
            <organization></organization>
        </author>
                <date year="1998" month="Jan"/>
        </front>
    </reference>

     <reference anchor="PREFETCH" target="https://www.researchgate.net/publication/270571591_PREFETCHing_to_optimize_DNSSEC_deployment_over_large_Resolving_Platforms">
        <front>
        <title>PREFETCHing to optimize DNSSEC deployment over large Resolving Platforms</title>
        <author fullname="Daniel Migault" initials="D. M." surname="Migault">
            <organization></organization>
        </author>
        <author fullname="Stanislas " initials="S. F." surname="Francfort">
            <organization></organization>
        </author>
        <author fullname="Stephane" initials="S. S." surname="Senecal">
            <organization></organization>
        </author>
        <author fullname="Emmanuel Herbert" initials="E. H." surname="Herbert">
            <organization></organization>
        </author>
        <author fullname="Maryline Laurent" initials="M. L." surname="Laurent">
            <organization>Proceedings of the 2013 12th IEEE International Conference on Trust, Security and Privacy in Computing and Communications (TRUSTCOM'13)</organization>
        </author>
                <date year="2013" month="Jul"/>
        </front>
    </reference>
   
    </references>

<!--
    <section title="Overview of Operation">
      <t>When an incoming query is received, and the result is in the cache,
      the query is answered from the cache. If the remaining TTL of the record
      is below some threshold, the recursive server will also initiate a cache
      fill operation in the background to refresh the cache entry.</t>

      <t>The fact that the behavior is triggered by an incoming query (and not
      by periodically scanning the cache and refreshing all entries that are
      about to expire) allows unpopular names to age out of the cache
      naturally, while keeping popular entries in the cache.</t>
    </section>
-->

    <section title="Known implementations">
      <t>[RFC Editor: Please remove this section before publication ]</t>
 
     <t>[Ed: Well, this is kinda embarrassing. This idea occurred to us one
      day while sitting around a pool in New Hampshire. It then took a while
      before I wrote it down, mostly because I *really* wanted to get "Stop!
      Hammer Time!" into a draft. Anyway, we presented it in Berlin, and
      Wouter Wijngaards stood up and mentioned that Unbound already does this
      (they use a percentage of TTL, instead of a number of seconds). Then we
      heard from OpenDNS that they *also* implement something similar. Then we
      had a number of discussions, then got sidetracked into other things.
      Anyway, BIND as of 9.10, around Feb 2014 now implements something like
      this
      (https://deepthought.isc.org/article/AA-01122/0/Early-refresh-of-cache-records-cache-prefetch-in-BIND-9.10.html),
      and enables it by default. Unfortunately, while BIND uses the times
      based approach, they named their parameters "trigger" and "eligibility"
      - and shouting "Eligibility! Trigger time!" simply isn't funny (unless
      you have a very odd sense of humor... So, we are now documenting
      implementations that existed before this was published and an
      impl,entation that we think was based on this. We think that this has
      value to the community. I'm also leaving in the HAMMER TIME bit, because
      it makes me giggle. This below section should be filled out with more
      detail, in collaboration with the implementors, but this is being
      written *just* before the draft cutoff.].</t>

      <t>A number of recursive resolvers implement techniques similar to the
      techniques described in this document. This section documents some of
      these and tradeoffs they make in picking their techniques.</t>

      <section title="Unbound (NLNet Labs)">
        <t>The Unbound validating, recursive, and caching DNS resolver
        implements a HAMMER type feature, called "prefetch". This feature can
        be enabled or disabled though the configuration option "prefetch:
        &lt;yes or no&gt;". When enabled, Unbound will fetch expiring records
        when their remaining TTL is less than 10% of their original TTL.</t>

        <t>[Ed: Unbound's "prefetch" function was developed independently,
        before this draft was written. The authors were unaware of it when
        writing the document.]</t>
      </section>

      <section title="OpenDNS">
        <t>The public DNS resolver, OpenDNS implements a prefetch like
        solution.</t>

        <t>[Ed: Will work with OpenDNS to get more details.]</t>
      </section>

      <section title="ISC BIND">
        <t>As of version 9.10, Internet Systems Consortium's BIND implements
        the HAMMER functionality. This feature is enabled by default.</t>

        <t>The functionality is configured using the "prefetch" options
        statement, with two parameters:</t>

        <t><list style="hanging">
            <t hangText="Trigger">This is equivalent to the HAMMER_TIME
            parameter described below.</t>

            <t hangText="Eligibility">This is equivalent to the STOP parameter
            described below.</t>
          </list></t>
      </section>
    </section>

    <section title="Changes / Author Notes.">
      <t>[RFC Editor: Please remove this section before publication ]</t>

      <t>From -01 to -02:<list style="symbols">
          <t>Readbility / cleanup.</t>

          <t>Tried to make it more clear that most implementations now support
          this (although they call it "prefetch" )</t>
        </list></t>

      <t>From -00 to 01:<list style="symbols">
          <t>Fairly large rewrite.</t>

          <t>Added text on the fact that there are implmentations that do
          this.</t>

          <t>Added the "prefetch" name, cleaned up some readability.</t>

          <t>Daniel's test (Section 3.2) added.</t>
        </list></t>

      <t>From -template to -00.</t>

      <t><list style="symbols">
          <t>Wrote some text.</t>

          <t>Changed the name.</t>
        </list></t>
    </section>
  </back>
</rfc>
