idnits 2.17.1 

draft-ietf-dnsop-rfc5011-security-considerations-08.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  -- The draft header indicates that this document updates RFC7583, but the
     abstract doesn't seem to mention this, which it should.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (November 29, 2017) is 2339 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  ** Downref: Normative reference to an Informational RFC: RFC 7583

  ** Obsolete normative reference: RFC 7719 (Obsoleted by RFC 8499)


     Summary: 2 errors (**), 0 flaws (~~), 1 warning (==), 2 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	dnsop                                                        W. Hardaker
3	Internet-Draft                                                   USC/ISI
4	Updates: 7583 (if approved)                                    W. Kumari
5	Intended status: Standards Track                                  Google
6	Expires: June 2, 2018                                  November 29, 2017

8	             Security Considerations for RFC5011 Publishers
9	          draft-ietf-dnsop-rfc5011-security-considerations-08

11	Abstract

13	   This document extends the RFC5011 rollover strategy with timing
14	   advice that must be followed in order to maintain security.
15	   Specifically, this document describes the math behind the minimum
16	   time-length that a DNS zone publisher must wait before signing
17	   exclusively with recently added DNSKEYs.  It contains much math and
18	   complicated equations, but the summary is that the key rollover /
19	   revocation time is much longer than intuition would suggest.  If you
20	   are not both publishing a DNSSEC DNSKEY, and using RFC5011 to
21	   advertise this DNSKEY as a new Secure Entry Point key for use as a
22	   trust anchor, you probably don't need to read this document.

24	   This document also describes the minimum time-length that a DNS zone
25	   publisher must wait after publishing a revoked DNSKEY before assuming
26	   that all active RFC5011 resolvers should have seen the revocation-
27	   marked key and removed it from their list of trust anchors.

29	Status of This Memo

31	   This Internet-Draft is submitted in full conformance with the
32	   provisions of BCP 78 and BCP 79.

34	   Internet-Drafts are working documents of the Internet Engineering
35	   Task Force (IETF).  Note that other groups may also distribute
36	   working documents as Internet-Drafts.  The list of current Internet-
37	   Drafts is at http://datatracker.ietf.org/drafts/current/.

39	   Internet-Drafts are draft documents valid for a maximum of six months
40	   and may be updated, replaced, or obsoleted by other documents at any
41	   time.  It is inappropriate to use Internet-Drafts as reference
42	   material or to cite them other than as "work in progress."

44	   This Internet-Draft will expire on June 2, 2018.

46	Copyright Notice

48	   Copyright (c) 2017 IETF Trust and the persons identified as the
49	   document authors.  All rights reserved.

51	   This document is subject to BCP 78 and the IETF Trust's Legal
52	   Provisions Relating to IETF Documents
53	   (http://trustee.ietf.org/license-info) in effect on the date of
54	   publication of this document.  Please review these documents
55	   carefully, as they describe your rights and restrictions with respect
56	   to this document.  Code Components extracted from this document must
57	   include Simplified BSD License text as described in Section 4.e of
58	   the Trust Legal Provisions and are provided without warranty as
59	   described in the Simplified BSD License.

61	Table of Contents

63	   1.  Introduction
64	     1.1.  Document History and Motivation
65	     1.2.  Safely Rolling the Root Zone's KSK in 2017/2018
66	     1.3.  Requirements notation
67	   2.  Background
68	   3.  Terminology
69	   4.  Timing Associated with RFC5011 Processing
70	     4.1.  Timing Associated with Publication
71	     4.2.  Timing Associated with Revocation
72	   5.  Denial of Service Attack Walkthrough
73	     5.1.  Enumerated Attack Example
74	       5.1.1.  Attack Timing Breakdown
75	   6.  Minimum RFC5011 Timing Requirements
76	     6.1.  Equation Components
77	       6.1.1.  addHoldDownTime
78	       6.1.2.  sigExpirationTimeRemaining
79	       6.1.3.  activeRefresh
80	       6.1.4.  activeRefreshOffset
81	       6.1.5.  safetyMargin
82	     6.2.  Timing Requirements For Adding a New KSK
83	       6.2.1.  Wait Timer Based Calculation
84	       6.2.2.  Wall-Clock Based Calculation
85	       6.2.3.  Timing Constraint Summary
86	       6.2.4.  Additional Considerations for RFC7583
87	       6.2.5.  Example Scenario Calculations
88	     6.3.  Timing Requirements For Revoking an Old KSK
89	       6.3.1.  Wait Timer Based Calculation
90	       6.3.2.  Wall-Clock Based Calculation
91	       6.3.3.  Additional Considerations for RFC7583
92	       6.3.4.  Example Scenario Calculations
93	   7.  IANA Considerations
94	   8.  Operational Considerations
95	   9.  Security Considerations
96	   10. Acknowledgements
97	   11. Normative References
98	   Appendix A.  Real World Example: The 2017 Root KSK Key Roll
99	   Authors' Addresses

101	1.  Introduction

103	   [RFC5011] defines a mechanism by which DNSSEC validators can update
104	   their list of trust anchors when they've seen a new key published in
105	   a zone or revoke a properly marked key from a trust anchor list.
106	   However, RFC5011 [intentionally] provides no guidance to the
107	   publishers of DNSKEYs about how long they must wait before switching
108	   to exclusively using recently published keys for signing records, or
109	   how long they must wait before ceasing publication of a revoked key.
110	   Because of this lack of guidance, zone publishers may derive
111	   incorrect assumptions about safe usage of the RFC5011 DNSKEY
112	   advertising, rolling and revocation process.  This document describes
113	   the minimum security requirements from a publisher's point of view
114	   and is intended to complement the guidance offered in RFC5011 (which
115	   is written to provide timing guidance solely from a Validating
116	   Resolver's point of view).

118	1.1.  Document History and Motivation

120	   To verify that this lack of understanding is widespread, the authors
121	   reached out to 5 DNSSEC experts to ask them how long they thought
122	   they must wait before signing a zone exclusively with a new KSK
123	   [RFC4033] that was being introduced according to the 5011 process.
124	   All 5 experts answered with an insecure value, and we determined that
125	   this lack of operational guidance might cause security concerns in
126	   deployment and wrote this companion document to RFC5011.  We hope
127	   that this document will rectify this misunderstanding and provide
128	   better guidance to zone publishers that wish to make use of the
129	   RFC5011 rollover process.

131	1.2.  Safely Rolling the Root Zone's KSK in 2017/2018

133	   One important note about ICANN's (currently in process) 2017/2018 KSK
134	   rollover plan for the root zone: the timing values chosen for rolling
135	   the KSK in the root zone appear completely safe, and are not affected
136	   by the timing concerns introduced by this draft.

138	1.3.  Requirements notation

140	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
141	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
142	   document are to be interpreted as described in [RFC2119].

144	2.  Background

146	   RFC5011 describes a process by which a RFC5011 Resolver may accept
147	   a newly published KSK as a trust anchor for validating future
148	   DNSSEC signed records.  It also describes the process for
149	   publicly revoking a published KSK.  This document augments that
150	   information with additional constraints, from the points of view
151	   of DNSKEY publication and revocation.  Note that this document does
152	   not define any other operational guidance or recommendations about
153	   the RFC5011 process and restricts itself solely to the security and
154	   operational ramifications of switching to exclusively using recently
155	   added keys or removing revoked keys too soon.

157	   Failure of a DNSKEY publisher to follow the minimum recommendations
158	   associated with this draft can result in potential denial-of-service
159	   attack opportunities against validating resolvers.  Failure of a
160	   DNSKEY publisher to publish a revoked key for a long enough period of
161	   time may result in RFC5011 Resolvers leaving that key in their trust
162	   anchor storage beyond the key's expected lifetime.

164	3.  Terminology

166	   SEP Publisher  The entity responsible for publishing a DNSKEY (with
167	      the Secure Entry Point (SEP) bit set) that can be used as a trust
168	      anchor.

170	   Zone Signer  The owner of a zone intending to publish a new Key-
171	      Signing-Key (KSK) that may become a trust anchor for validators
172	      following the RFC5011 process.

174	   RFC5011 Resolver  A DNSSEC Resolver that is using the RFC5011
175	      processes to track and update trust anchors.

177	   Attacker  An entity intent on foiling the RFC5011 Resolver's ability
178	      to successfully adopt the Zone Signer's new DNSKEY as a new trust
179	      anchor or to prevent the RFC5011 Resolver from removing an old
180	      DNSKEY from its list of trust anchors.

182	   lastSigExpirationTime  The latest value of any RRSIG Signature
183	      Expiration field (which is a date and time) that has signed the
184	      previous DNSKEY RRset before a new DNSKEY is introduced to a
185	      published DNSKEY RRset, or the DNSKEY RRset of a DNSKEY that is to
186	      be revoked.  Note that for organizations pre-creating signatures
187	      this time may be fairly far in the future unless they can be
188	      significantly assured that none of their pre-generated signatures
189	      can be replayed at a later date.

191	   sigExpirationTime  The amount of time between the DNSKEY RRSIG's
192	      Signature Inception field and the Signature Expiration field.

194	   sigExpirationTimeRemaining  The amount of time remaining before
195	      lastSigExpirationTime is reached.

197	   Also see Section 2 of [RFC4033] and [RFC7719] for additional
198	   terminology.

200	4.  Timing Associated with RFC5011 Processing

202	   These sections define a high-level overview of [RFC5011] processing.
203	   These steps are not sufficient for proper RFC5011 implementation, but
204	   provide enough background for the reader to follow the discussion in
205	   this document.  Readers need to fully understand [RFC5011] as well to
206	   fully comprehend the content and importance of this document.

208	4.1.  Timing Associated with Publication

210	   RFC5011's process of safely publishing a new DNSKEY and then assuming
211	   RFC5011 Resolvers have adopted it for trust falls into a number of
212	   high-level steps to be performed by the SEP Publisher.  This document
213	   discusses the following scenario, which is the principal way
214	   RFC5011 is currently being used (even though Section 6 of RFC5011
215	   suggests having a stand-by key available):

217	   1.  Publish a new DNSKEY in a zone, but continue to sign the zone
218	       with the old one.

220	   2.  Wait a period of time.

222	   3.  Begin to exclusively use recently published DNSKEYs to sign the
223	       appropriate resource records.

225	   This document discusses the time required to wait during step 2 of
226	   the above process.  Some interpretations of RFC5011 have erroneously
227	   determined that the wait time is equal to RFC5011's "hold down time".
228	   Section 5 describes an attack based on this (common) erroneous
229	   belief, which can result in a denial of service attack against the
230	   zone.

232	4.2.  Timing Associated with Revocation

234	   RFC5011's process of advertising that an old key is to be revoked
235	   from RFC5011 Resolvers falls into a number of high-level steps:

237	   1.  Set the revoke bit on the DNSKEY to be revoked.

239	   2.  Sign the revoked DNSKEY with itself.

241	   3.  Wait a period of time.

243	   4.  Remove the revoked key from the zone.

245	   This document discusses the time required to wait in step 3 of the
246	   above process.  Some interpretations of RFC5011 have erroneously
247	   determined that the wait time is equal to RFC5011's "hold down time".
248	   This document describes an attack based on this (common) erroneous
249	   belief, which results in a revoked DNSKEY potentially remaining as a
250	   trust anchor in a RFC5011 Resolver long past its expected usage.
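
   As an illustration of step 1 in the list above, setting the revoke
   bit is a single bit change in the DNSKEY Flags field.  The sketch
   below assumes the flag values defined in RFC 4034 and RFC 5011
   (Zone Key = 0x0100, REVOKE = 0x0080, SEP = 0x0001).  The helper name
   set_revoke_bit is hypothetical and is not taken from this document.

```python
# DNSKEY Flags field bit values (16-bit field, bit 0 is the most
# significant bit).
ZONE_KEY = 0x0100  # bit 7, RFC 4034
REVOKE = 0x0080    # bit 8, RFC 5011
SEP = 0x0001       # bit 15, RFC 4034


def set_revoke_bit(flags: int) -> int:
    """Return the flags value with the REVOKE bit set (hypothetical helper)."""
    return flags | REVOKE


ksk_flags = ZONE_KEY | SEP                 # a typical KSK flags value
revoked_flags = set_revoke_bit(ksk_flags)  # same key, marked as revoked
print(ksk_flags, revoked_flags)            # 257 385
```

   The revoked DNSKEY must then be signed with itself (step 2) before
   the waiting period discussed in this document begins.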

252	5.  Denial of Service Attack Walkthrough

254	   This section serves as an illustrative example of the problem being
255	   discussed in this document.  Note that in order to keep the example
256	   simple enough to understand, some simplifications were made (such as
257	   by not creating a set of pre-signed RRSIGs and by not using values
258	   that result in the addHoldDownTime not being evenly divisible by the
259	   activeRefresh value); the mathematical formulas in Section 6,
260	   however, are complete.

262	   If an attacker is able to provide a RFC5011 Resolver with past
263	   responses, such as when it is in-path or able to perform any number
264	   of cache poisoning attacks, the attacker may be able to leave
265	   compliant RFC5011 Resolvers without an appropriate DNSKEY trust
266	   anchor.  This scenario will remain until an administrator manually
267	   fixes the situation.

269	   The time-line below illustrates this situation.

271	5.1.  Enumerated Attack Example

273	   The following example settings are used in the example scenario
274	   within this section:

276	   TTL (all records)  1 day

278	   sigExpirationTime  10 days
279	   Zone resigned every  1 day

281	   Given these settings, the sequence of events in Section 5.1.1 depicts
282	   how a SEP Publisher that waits for only the RFC5011 hold-down timer
283	   length of 30 days subjects its users to a potential Denial of Service
284	   attack.  The timing schedule listed below is based on a SEP Publisher
285	   publishing a new Key Signing Key (KSK), with the intent that it will
286	   later be used as a trust anchor.  We label this publication time as
287	   "T+0".  All numbers in this sequence refer to days before and after
288	   this initial publication event.  Thus, T-1 is the day before the
289	   introduction of the new key, and T+15 is the 15th day after the key
290	   was introduced into the fictitious zone being discussed.

292	   In this dialog, we consider two keys within the example zone:

294	   K_old:  An older KSK and Trust Anchor being replaced.

296	   K_new:  A new KSK being transitioned into active use and expected to
297	      become a Trust Anchor via the RFC5011 automated trust anchor
298	      update process.

300	5.1.1.  Attack Timing Breakdown

302	   The steps below show an attack that foils the adoption of a new
303	   DNSKEY by a RFC5011 Resolver when the SEP Publisher starts signing
304	   and publishing with the new DNSKEY too quickly.

306	   T-1  The K_old based RRSIGs are being published by the Zone Signer.
307	      [It may also be signing ZSKs as well, but they are not relevant to
308	      this event so we will not talk further about them; we are only
309	      considering the RRSIGs that cover the DNSKEYs in this document.]
310	      The Attacker queries for, retrieves and caches this DNSKEY set and
311	      corresponding RRSIG signatures.

313	   T+0  The Zone Signer adds K_new to their zone and signs the zone's
314	      key set with K_old.  The RFC5011 Resolver (later to be under
315	      attack) retrieves this new key set and corresponding RRSIGs and
316	      notices the publication of K_new.  The RFC5011 Resolver starts the
317	      (30-day) hold-down timer for K_new.  [Note that in a more real-
318	      world scenario there will likely be a further delay between the
319	      point where the Zone Signer publishes a new RRSIG and the RFC5011
320	      Resolver notices its publication; though not shown in this
321	      example, this delay is accounted for in the equation in Section 6
322	      below]

324	   T+5  The RFC5011 Resolver queries for the zone's keyset per the
325	      RFC5011 Active Refresh schedule, discussed in Section 2.3 of
326	      RFC5011.  Instead of receiving the intended published keyset, the
327	      Attacker successfully replays the keyset and associated signatures
328	      recorded at T-1.  Because the signature lifetime is 10 days (in
329	      this example), the replayed signature and keyset is accepted as
330	      valid (being only 6 days old, which is less than
331	      sigExpirationTime) and the RFC5011 Resolver cancels the (30-day)
332	      hold-down timer for K_new, per the RFC5011 algorithm.

334	   T+10  The RFC5011 Resolver queries for the zone's keyset and
335	      discovers a signed keyset that includes K_new (again), and is
336	      signed by K_old.  Note: the attacker is unable to replay the
337	      records cached at T-1, because they have now expired.  Thus at
338	      T+10, the RFC5011 Resolver restarts the hold-down timer for K_new.

340	   T+11 through T+29  The RFC5011 Resolver continues checking the zone's
341	      key set at the prescribed regular intervals.  During this period,
342	      the attacker can no longer replay traffic to their benefit.

344	   T+30  The Zone Signer knows that this is the first time at which some
345	      validators might accept K_new as a new trust anchor, since the
346	      hold-down timer of a RFC5011 Resolver not under attack that had
347	      queried and retrieved K_new at T+0 would now have reached 30 days.
348	      However, the hold-down timer of our attacked RFC5011 Resolver is
349	      only at 20 days.

351	   T+35  The Zone Signer (mistakenly) believes that all validators
352	      following the Active Refresh schedule (Section 2.3 of RFC5011)
353	      should have accepted K_new as the new trust anchor (since the
354	      hold down time (30 days) + the query interval [which is just 1/2
355	      the signature validity period in this example] would have passed).
356	      However, the hold-down timer of our attacked RFC5011 Resolver is
357	      only at 25 days (T+35 minus T+10); thus the RFC5011 Resolver won't
358	      consider it a valid trust anchor addition yet, as the required 30
359	      days have not yet elapsed.

361	   T+36  The Zone Signer, believing K_new is safe to use, switches their
362	      active signing KSK to K_new and publishes a new RRSIG, signed with
363	      (only) K_new, covering the DNSKEY set.  Non-attacked RFC5011
364	      validators, with a hold-down timer of at least 30 days, would have
365	      accepted K_new into their set of trusted keys.  But, because our
366	      attacked RFC5011 Resolver now has a hold-down timer for K_new of
367	      only 26 days, it failed to ever accept K_new as a trust anchor.
368	      Since K_old is no longer being used to sign the zone's DNSKEYs,
369	      all the DNSKEY records from the zone will be treated as invalid.
370	      Subsequently, all of the records in the DNS tree below the zone's
371	      apex will be deemed invalid by DNSSEC.
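
   The timeline above can be replayed with a small simulation.  This is
   a simplified sketch and not an RFC5011 implementation: it assumes the
   resolver queries every 5 days (matching the T+5 and T+10 events
   above), models only the start and cancellation of the hold-down
   timer, and uses the hypothetical function name hold_down_elapsed.

```python
HOLD_DOWN_DAYS = 30  # RFC5011 add hold-down time used in the walkthrough


def hold_down_elapsed(observations, day):
    """Days the hold-down timer for K_new has been running on `day`.

    `observations` is a chronological list of (query_day, saw_k_new)
    pairs.  Seeing K_new starts the timer if it is not running; a
    validated keyset without K_new cancels it.  Returns None if no
    timer is running.
    """
    timer_start = None
    for query_day, saw_k_new in observations:
        if query_day > day:
            break
        if saw_k_new:
            if timer_start is None:
                timer_start = query_day
        else:
            timer_start = None  # replayed K_old-only keyset cancels the timer
    return None if timer_start is None else day - timer_start


# Resolver under attack: the T-1 keyset is replayed at T+5.
attacked = [(0, True), (5, False)] + [(d, True) for d in range(10, 36, 5)]
# Resolver not under attack: K_new is seen at every query.
normal = [(d, True) for d in range(0, 36, 5)]

for day in (30, 35, 36):
    print(day, hold_down_elapsed(attacked, day), hold_down_elapsed(normal, day))
```

   On day 36 the attacked resolver's timer has run for 26 days, which is
   below HOLD_DOWN_DAYS, so K_new has not been accepted when the Zone
   Signer stops signing with K_old.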

373	6.  Minimum RFC5011 Timing Requirements

375	   This section defines the minimum timing requirements for making
376	   exclusive use of newly added DNSKEYs and timing requirements for
377	   ceasing the publication of DNSKEYs to be revoked.  First, we define
378	   the term components used in both equations in Section 6.1.

380	6.1.  Equation Components

382	6.1.1.  addHoldDownTime

384	   The addHoldDownTime is defined in Section 2.4.1 of [RFC5011] as:

386	       The add hold-down time is 30 days or the expiration time of the
387	       original TTL of the first trust point DNSKEY RRSet that contained
388	       the new key, whichever is greater.  This ensures that at least
389	       two validated DNSKEY RRSets that contain the new key MUST be seen
390	       by the resolver prior to the key's acceptance.

392	6.1.2.  sigExpirationTimeRemaining

394	   sigExpirationTimeRemaining is defined in Section 3.

396	6.1.3.  activeRefresh

398	   activeRefresh time is defined in Section 2.3 of [RFC5011] as:

400	     A resolver that has been configured for an automatic update
401	     of keys from a particular trust point MUST query that trust
402	     point (e.g., do a lookup for the DNSKEY RRSet and related
403	     RRSIG records) no less often than the lesser of 15 days, half
404	     the original TTL for the DNSKEY RRSet, or half the RRSIG
405	     expiration interval and no more often than once per hour.

407	   This translates to:

409	    activeRefresh = MAX(1 hour,
410	                        MIN(sigExpirationTime / 2,
411	                            MAX(TTL of K_old DNSKEY RRSet) / 2,
412	                            15 days)
413	                        )
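
   The formula above translates directly into code.  The following is a
   minimal sketch that represents durations as Python timedelta values;
   the function and parameter names are illustrative and are not
   defined by RFC5011 or by this document.

```python
from datetime import timedelta


def active_refresh(sig_expiration_time: timedelta,
                   dnskey_ttl: timedelta) -> timedelta:
    """MAX(1 hour, MIN(sigExpirationTime / 2, TTL / 2, 15 days))."""
    return max(
        timedelta(hours=1),
        min(sig_expiration_time / 2, dnskey_ttl / 2, timedelta(days=15)),
    )


# Section 5.1 example values: 10-day signature lifetime, 1-day TTL.
example = active_refresh(timedelta(days=10), timedelta(days=1))
print(example)  # 12:00:00
```

   With the Section 5.1 values the TTL term dominates, giving an
   activeRefresh of half a day.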

415	6.1.4.  activeRefreshOffset

417	   The activeRefreshOffset term must be added for situations where the
418	   activeRefresh value is not a factor of the addHoldDownTime.
419	   Specifically, activeRefreshOffset will be "addHoldDownTime %
420	   activeRefresh", where % is the mathematical mod operator (calculating
421	   the remainder in a division problem).  This will frequently be zero,
422	   but could be nearly as large as activeRefresh itself.  For
423	   simplicity, setting the activeRefreshOffset to the activeRefresh
424	   value itself is always safe.
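
   As a sketch of the remainder calculation described above, Python's
   timedelta type supports the % operator directly.  The function name
   is illustrative, and the 7-day activeRefresh in the second call is a
   made-up value chosen only to produce a non-zero remainder.

```python
from datetime import timedelta


def active_refresh_offset(add_hold_down_time: timedelta,
                          active_refresh: timedelta) -> timedelta:
    """addHoldDownTime % activeRefresh (remainder of the division)."""
    return add_hold_down_time % active_refresh


# A 30-day hold-down divides evenly by a 12-hour activeRefresh.
even = active_refresh_offset(timedelta(days=30), timedelta(hours=12))
# A hypothetical 7-day activeRefresh leaves a remainder of 2 days.
uneven = active_refresh_offset(timedelta(days=30), timedelta(days=7))
print(even == timedelta(0), uneven == timedelta(days=2))
```

   An operator who prefers not to compute the remainder can, as noted
   above, substitute the activeRefresh value itself.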

426	6.1.5.  safetyMargin

428	   The safetyMargin is an extra period of time to account for caching,
429	   network delays, dropped packets, and other operational concerns
430	   otherwise beyond the scope of this document.  The value operators
431	   should choose is highly dependent on the deployment situation
432	   associated with their zone.  Note that no value of a safetyMargin can
433	   protect against resolvers that are "down".  Nonetheless, we do
434	   offer the following as one method for considering reasonable values
435	   to select from.

437	   The following variables need to be considered when selecting
438	   an appropriate safetyMargin value:

440	   successRate:  A likely success rate for client queries and retries

442	   numResolvers:  The number of client RFC5011 Resolvers

444	   Note that RFC5011 defines retryTime as:

446	         If the query fails, the resolver MUST repeat the query until
447	         satisfied no more often than once an hour and no less often
448	         than the lesser of 1 day, 10% of the original TTL, or 10% of
449	         the original expiration interval.  That is,
450	         retryTime = MAX (1 hour, MIN (1 day, .1 * origTTL,
451	                                       .1 * expireInterval)).

453	   With these values selected and the definition of retryTime from
454	   RFC5011, one method for determining how many retryTime intervals to
455	   wait in order to reduce the set of uncompleted servers to 0 assuming
456	   normal probability is thus:

458	                         x = (1/(1 - successRate))

460	            retryCountWait = Log_base_x(numResolvers)

462	   To reduce the need for readers to pull out a scientific calculator,
463	   we offer the following lookup table based on successRate and
464	   numResolvers:

466	                        retryCountWait lookup table
467	                        ---------------------------

469	              Number of client RFC5011 Resolvers (numResolvers)
470	                        10,000  100,000 1,000,000 10,000,000 100,000,000
471	                 0.01      917     1146      1375       1604        1833
472	 Probability     0.05      180      225       270        315         360
473	 of Success      0.10       88      110       132        153         175
474	 Per Retry       0.15       57       71        86        100         114
475	 Interval        0.25       33       41        49         57          65
476	 (successRate)   0.50       14       17        20         24          27
477	                 0.90        4        5         6          7           8
478	                 0.95        4        4         5          6           7
479	                 0.99        2        3         3          4           4
480	                 0.999       2        2         2          3           3
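
   The table entries can be recomputed from the retryCountWait formula.
   The sketch below assumes that results are rounded up to the next
   whole retry interval, which is consistent with the values shown in
   the table; the function name is illustrative.  The small rounding
   step guards against floating-point noise when the logarithm is an
   exact integer (for example, a successRate of 0.99 with 10,000
   resolvers).

```python
import math


def retry_count_wait(success_rate: float, num_resolvers: int) -> int:
    """Log base x of numResolvers, x = 1 / (1 - successRate), rounded up."""
    x = 1 / (1 - success_rate)
    raw = math.log(num_resolvers) / math.log(x)
    # Remove floating-point noise before taking the ceiling.
    return math.ceil(round(raw, 9))


# Print a few rows of the table for 10^4 .. 10^8 resolvers.
for rate in (0.50, 0.90, 0.99):
    print(rate, [retry_count_wait(rate, 10 ** e) for e in range(4, 9)])
```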

482	   Finally, a suggested value of safetyMargin can then be this
483	   retryCountWait number multiplied by the retryTime from RFC5011:

485	                 safetyMargin = retryCountWait * retryTime
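
   As a worked sketch of this calculation, the code below combines the
   retryTime definition quoted above with a retryCountWait taken from
   the lookup table.  It assumes that the "original expiration
   interval" corresponds to sigExpirationTime, and the chosen
   successRate (0.95) and numResolvers (1,000,000) are illustrative
   inputs, not recommendations.

```python
from datetime import timedelta


def retry_time(orig_ttl: timedelta, expire_interval: timedelta) -> timedelta:
    """MAX(1 hour, MIN(1 day, .1 * origTTL, .1 * expireInterval))."""
    return max(
        timedelta(hours=1),
        min(timedelta(days=1), orig_ttl / 10, expire_interval / 10),
    )


def safety_margin(retry_count_wait: int, retry: timedelta) -> timedelta:
    """safetyMargin = retryCountWait * retryTime."""
    return retry_count_wait * retry


# Section 5.1 values: 1-day TTL, 10-day signature lifetime.
retry = retry_time(timedelta(days=1), timedelta(days=10))
# Table lookup: successRate 0.95, 1,000,000 resolvers -> 5 retries.
margin = safety_margin(5, retry)
print(retry, margin)  # 2:24:00 12:00:00
```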

487	6.2.  Timing Requirements For Adding a New KSK

489	   This section defines a method for calculating the amount of time to
490	   wait until it is safe to start signing exclusively with a new key
491	   (Section 6.2.1, especially useful for writing code involving sleep-
492	   based timers), and a method for calculating a wall-clock value
493	   after which it is safe to start signing exclusively with a new key
494	   (Section 6.2.2, especially useful for writing code based on clock-
495	   based event triggers).

497	6.2.1.  Wait Timer Based Calculation

499	   Given the attack description in Section 5, the correct minimum length
500	   of time required for the Zone Signer to wait after publishing K_new
501	   but before exclusively using it and newer keys is:

503	      addWaitTime = addHoldDownTime
504	                    + sigExpirationTimeRemaining
505	                    + activeRefresh
506	                    + activeRefreshOffset
507	                    + safetyMargin
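
   The five-term sum above can be sketched as follows, with durations
   held as timedelta values.  The function name is illustrative.  The
   inputs use the Section 5.1 parameters, and the 2-day safetyMargin is
   an assumption that corresponds to the 2 * TTL term used in the
   example in Section 6.2.5.

```python
from datetime import timedelta


def add_wait_time(add_hold_down_time: timedelta,
                  sig_expiration_time_remaining: timedelta,
                  active_refresh: timedelta,
                  active_refresh_offset: timedelta,
                  safety_margin: timedelta) -> timedelta:
    """Sum of the five addWaitTime components."""
    return (add_hold_down_time
            + sig_expiration_time_remaining
            + active_refresh
            + active_refresh_offset
            + safety_margin)


wait = add_wait_time(
    add_hold_down_time=timedelta(days=30),
    sig_expiration_time_remaining=timedelta(days=10),
    active_refresh=timedelta(hours=12),
    active_refresh_offset=timedelta(0),
    safety_margin=timedelta(days=2),  # assumed value, see lead-in
)
print(wait.total_seconds() / 86400)  # 42.5
```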

509	6.2.1.1.  Fully expanded equation

511	   The fully expanded equation is:

513	      addWaitTime = addHoldDownTime
514	                    + sigExpirationTimeRemaining
515	                    + 2 * MAX(1 hour,
516	                              MIN(sigExpirationTime / 2,
517	                                  MAX(TTL of K_old DNSKEY RRSet) / 2,
518	                                  15 days)
519	                              )
520	                    + (addHoldDownTime % activeRefresh)
521	                    + MAX(1.5 hours, 2 * MAX(TTL of all records))
522	                    + safetyMargin

524	6.2.2.  Wall-Clock Based Calculation

526	   The above equations are defined based upon how long to wait from a
527	   particular moment in time.  An alternative, but equivalent, method is
528	   to calculate the date and time before which it is unsafe to use a key
529	   for signing.  This calculation thus becomes:

531	      addWallClockTime = lastSigExpirationTime
532	                       + addHoldDownTime
533	                       + activeRefresh
534	                       + activeRefreshOffset
535	                       + safetyMargin

537	   where lastSigExpirationTime is the latest value of any
538	   sigExpirationTime for which RRSIGs were created that could
539	   potentially be replayed.  Fully expanded, this becomes:

541	    addWallClockTime = lastSigExpirationTime
542	                       + addHoldDownTime
543	                       + 2 * MAX(1 hour,
544	                                 MIN(sigExpirationTime / 2,
545	                                     MAX(TTL of K_old DNSKEY RRSet) / 2,
546	                                     15 days)
547	                                 )
548	                       + (addHoldDownTime % activeRefresh)
549	                       + MAX(1.5 hours, 2 * MAX(TTL of all records))
550	                       + safetyMargin
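
   For clock-based triggers, the unexpanded addWallClockTime sum can be
   sketched with timezone-aware datetime values.  The function name and
   the example date are hypothetical; the durations reuse the Section
   5.1 parameters together with an assumed 12-hour safetyMargin.

```python
from datetime import datetime, timedelta, timezone


def add_wall_clock_time(last_sig_expiration_time: datetime,
                        add_hold_down_time: timedelta,
                        active_refresh: timedelta,
                        active_refresh_offset: timedelta,
                        safety_margin: timedelta) -> datetime:
    """Earliest moment at which exclusive use of the new key is safe."""
    return (last_sig_expiration_time
            + add_hold_down_time
            + active_refresh
            + active_refresh_offset
            + safety_margin)


# Hypothetical latest Signature Expiration of any K_old-only RRSIG.
last_expiration = datetime(2017, 12, 9, tzinfo=timezone.utc)
safe_after = add_wall_clock_time(
    last_expiration,
    add_hold_down_time=timedelta(days=30),
    active_refresh=timedelta(hours=12),
    active_refresh_offset=timedelta(0),
    safety_margin=timedelta(hours=12),  # assumed value
)
print(safe_after.isoformat())  # 2018-01-09T00:00:00+00:00
```

   A scheduler can compare the current UTC time against the returned
   value and refuse to switch signing keys before it is reached.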

552	6.2.3.  Timing Constraint Summary

554	   The important timing constraint introduced by this memo relates to
555	   the last point at which a RFC5011 Resolver may have received a
556	   replayed original DNSKEY set, containing K_old and not K_new.  The
557	   next query of the RFC5011 validator at which K_new will be seen
558	   without the potential for a replay attack will occur after the
559	   publication time plus sigExpirationTime.  Thus, the latest time that
560	   a RFC5011 Validator may begin their hold down timer is an "Active
561	   Refresh" period after the last point that an attacker can replay the
562	   K_old DNSKEY set.  The worst case scenario of this attack is if the
563	   attacker can replay K_old just seconds before the time given in the
564	   Signature Expiration field of the last K_old-only DNSKEY RRSIG.

566	6.2.4.  Additional Considerations for RFC7583

568	   Note: our notion of addWaitTime is called "Itrp" in Section 3.3.4.1
569	   of [RFC7583].  The equation for Itrp in RFC7583 is insecure as it
570	   does not include the sigExpirationTime listed above.  The Itrp
571	   equation in RFC7583 also does not include the 2*TTL safety margin,
572	   though that is an operational consideration and not necessarily as
573	   critical.

575	6.2.5.  Example Scenario Calculations

577	   For the parameters listed in Section 5.1, the activeRefreshOffset is
578	   0, since 30 days is evenly divisible by activeRefresh (1/2 day), and
579	   our resulting addWaitTime is:

581	     addWaitTime = 30
582	                   + 10
583	                   + 1 / 2
584	                   + 2 * (1)        (days)

586	     addWaitTime = 42.5             (days)

588	   This addWaitTime of 42.5 days is 12.5 days longer than just the hold
589	   down timer.

591	6.3.  Timing Requirements For Revoking an Old KSK

593	   This issue affects not just the publication of new DNSKEYs intended
594	   to be used as trust anchors, but also the length of time required to
595	   continuously publish a DNSKEY with the revoke bit set.

597	   This section defines a method for calculating the amount of time
598	   operators need to wait until it is safe to cease publishing a DNSKEY
599	   (Section 6.3.1, especially useful for writing code involving sleep
600	   based timers), and a method for calculating a minimal wall-clock
601	   value after which it is safe to cease publishing a DNSKEY
602	   (Section 6.3.2, especially useful for writing code based on clock-
603	   based event triggers).

605	6.3.1.  Wait Timer Based Calculation

607	   Both of these publication timing requirements are affected by the
608	   attacks described in this document, but with revocation the key is
609	   revoked immediately and the addHoldDown timer does not apply.  Thus
610	   the minimum amount of time that a SEP Publisher must wait before
611	   removing a revoked key from publication is:

613	     remWaitTime = sigExpirationTimeRemaining
614	                   + MAX(1 hour,
615	                         MIN((sigExpirationTime) / 2,
616	                             MAX(TTL of K_old DNSKEY RRSet) / 2,
617	                             15 days)
618	                         )
619	                   + 2 * MAX(TTL of all records)

621	   Note that the activeRefreshOffset time does not apply to this
622	   equation.

624	   Note also that adding retryTime intervals to the remWaitTime may be
625	   wise, just as it was for addWaitTime in Section 6.
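
   As a rough illustration of the equation above, the following Python
   sketch computes remWaitTime with the standard library's timedelta
   type.  The function and parameter names are hypothetical, and the
   retry_margin parameter is an assumption added here to model the
   optional retryTime allowance mentioned above; it defaults to zero.

```python
from datetime import timedelta


def rem_wait_time(sig_expiration_remaining, sig_expiration,
                  dnskey_ttl, max_ttl, retry_margin=timedelta(0)):
    """Minimum time to keep publishing a revoked key (a sketch)."""
    # MAX(1 hour, MIN(sigExpirationTime / 2, DNSKEY TTL / 2, 15 days))
    active_refresh = max(timedelta(hours=1),
                         min(sig_expiration / 2,
                             dnskey_ttl / 2,
                             timedelta(days=15)))
    # 2 * MAX(TTL of all records); the caller supplies the maximum TTL.
    safety_margin = 2 * max_ttl
    return (sig_expiration_remaining + active_refresh
            + safety_margin + retry_margin)


# Hypothetical inputs: 10 day signatures, 1 day TTLs.
wait = rem_wait_time(sig_expiration_remaining=timedelta(days=10),
                     sig_expiration=timedelta(days=10),
                     dnskey_ttl=timedelta(days=1),
                     max_ttl=timedelta(days=1))
print(wait)   # 12 days, 12:00:00
```

   A sleep based timer could then simply wait for wait.total_seconds()
   seconds, measured from the moment the revoked key is first published.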

627	6.3.2.  Wall-Clock Based Calculation

629	   Like before, the above equations are defined based upon how long to
630	   wait from a particular moment in time.  An alternative, but
631	   equivalent, method is to calculate the date and time before which it
632	   is unsafe to cease publishing a revoked key.  This calculation thus
633	   becomes:

635	      remWallClockTime = lastSigExpirationTime
636	                       + activeRefresh
637	                       + activeRefreshOffset
638	                       + safetyMargin

640	   where lastSigExpirationTime is the latest value of any
641	   sigExpirationTime for which RRSIGs were created that could
642	   potentially be replayed.  Fully expanded, this becomes:

644	      remWallClockTime = lastSigExpirationTime
645	                       + 2 * MAX(1 hour,
646	                                 MIN(sigExpirationTime / 2,
647	                                     MAX(TTL of K_old DNSKEY RRSet) / 2,
648	                                     15 days)
649	                                 )
650	                       + (addHoldDownTime % activeRefresh)
651	                       + MAX(1.5 hours, 2 * MAX(TTL of all records))
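
   A clock-based trigger built on the compact form of this calculation
   might look like the following Python sketch.  The function name,
   parameter names, and the example date are hypothetical;
   activeRefresh and activeRefreshOffset are passed in already
   computed, and the safety margin helper follows the
   MAX(1.5 hours, 2 * MAX(TTL of all records)) term shown above.

```python
from datetime import datetime, timedelta, timezone


def safety_margin(max_ttl):
    """MAX(1.5 hours, 2 * MAX(TTL of all records))."""
    return max(timedelta(hours=1.5), 2 * max_ttl)


def rem_wall_clock_time(last_sig_expiration, active_refresh,
                        active_refresh_offset, max_ttl):
    """Earliest wall-clock time at which removal is safe (a sketch)."""
    return (last_sig_expiration + active_refresh
            + active_refresh_offset + safety_margin(max_ttl))


# Hypothetical values: last replayable RRSIG expires 2017-12-09 00:00 UTC.
earliest = rem_wall_clock_time(
    last_sig_expiration=datetime(2017, 12, 9, tzinfo=timezone.utc),
    active_refresh=timedelta(hours=12),
    active_refresh_offset=timedelta(0),
    max_ttl=timedelta(days=1))
print(earliest.isoformat())   # 2017-12-11T12:00:00+00:00
```

   Timezone-aware datetimes are used so that the result can be compared
   safely against the current time in UTC.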

653	6.3.3.  Additional Considerations for RFC7583

655	   Note that our notion of remWaitTime is called "Irev" in
656	   Section 3.3.4.2 of [RFC7583].  The equation for Irev in RFC7583 is
657	   insecure as it does not include the sigExpirationTime listed above.
658	   The Irev equation in RFC7583 also does not include the 2*TTL safety
659	   margin, though that is an operational consideration and not
660	   necessarily as critical.

662	6.3.4.  Example Scenario Calculations

664	   For the parameters listed in Section 5.1, the remWaitTime is:

666	     remWaitTime = 10
667	                   + 1 / 2
668	                   + 2 * (1)        (days)

670	     remWaitTime = 12.5             (days)

672	   Note that the values in this example produce a length shorter
673	   than the recommended 30 days in RFC5011's section 6.6, step 3.  Other
674	   values of sigExpirationTime and the original TTL of the K_old DNSKEY
675	   RRSet, however, can produce values longer than 30 days.

677	   Note that because revocation happens immediately, an attacker has a
678	   much harder job tricking an RFC5011 Resolver into leaving a trust
679	   anchor in place, as the attacker must successfully replay the old
680	   data for every query an RFC5011 Resolver sends, not just one.

682	7.  IANA Considerations

684	   This document contains no IANA considerations.

686	8.  Operational Considerations

688	   A companion document to RFC5011 was expected to be published that
689	   describes the best operational practice considerations from the
690	   perspective of a zone publisher and SEP Publisher.  However, this
691	   companion document has yet to be published.  The authors of this
692	   document hope that it will be at some point in the future, as RFC5011
693	   timing can be tricky as we have shown, and a BCP is clearly
694	   warranted.  This document is intended only to fill a single
695	   operational void which, when left misunderstood, can result in
696	   serious security ramifications.  This document does not attempt to
697	   document any other missing operational guidance for zone publishers.

699	9.  Security Considerations

701	   This document is solely about the security considerations with
702	   respect to the SEP Publisher's ability to advertise new DNSKEYs via
703	   the RFC5011 automated trust anchor update process.  Thus the entire
704	   document is a discussion of Security Considerations when adding or
705	   removing DNSKEYs from trust anchor storage using the RFC5011 process.

707	   For simplicity, this document assumes that the SEP Publisher will use
708	   a consistent RRSIG validity period.  SEP Publishers that vary the
709	   length of RRSIG validity periods will need to adjust the
710	   sigExpirationTime value accordingly so that the equations in
711	   Section 6 and Section 6.3 use a value that coincides with the last
712	   time a replay of older RRSIGs will no longer succeed.

714	10.  Acknowledgements

716	   The authors would like to especially thank Michael StJohns for his
717	   help and advice and the care and thought he put into RFC5011 itself.
718	   We would also like to thank Bob Harold, Shane Kerr, Matthijs Mekking,
719	   Duane Wessels, Petr Spacek, Ed Lewis, and the dnsop working
720	   group who have assisted with this document.

722	11.  Normative References

724	   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
725	              Requirement Levels", BCP 14, RFC 2119, March 1997.

727	   [RFC4033]  Arends, R., Austein, R., Larson, M., Massey, D., and S.
728	              Rose, "DNS Security Introduction and Requirements",
729	              RFC 4033, DOI 10.17487/RFC4033, March 2005,
730	              <http://www.rfc-editor.org/info/rfc4033>.

732	   [RFC5011]  StJohns, M., "Automated Updates of DNS Security (DNSSEC)
733	              Trust Anchors", STD 74, RFC 5011, DOI 10.17487/RFC5011,
734	              September 2007, <http://www.rfc-editor.org/info/rfc5011>.

736	   [RFC7583]  Morris, S., Ihren, J., Dickinson, J., and W. Mekking,
737	              "DNSSEC Key Rollover Timing Considerations", RFC 7583,
738	              DOI 10.17487/RFC7583, October 2015,
739	              <https://www.rfc-editor.org/info/rfc7583>.

741	   [RFC7719]  Hoffman, P., Sullivan, A., and K. Fujiwara, "DNS
742	              Terminology", RFC 7719, DOI 10.17487/RFC7719, December
743	              2015, <http://www.rfc-editor.org/info/rfc7719>.

745	Appendix A.  Real World Example: The 2017 Root KSK Key Roll

747	   In 2017, ICANN expects to (or has, depending on when you're reading
748	   this) roll the key signing key (KSK) for the root zone.  The relevant
749	   parameters associated with the root zone at the time of this writing
750	   are as follows:

752	         addHoldDownTime:                      30 days
753	         Old DNSKEY sigExpirationTime:         21 days
754	         Old DNSKEY TTL:                        2 days

756	   Thus, sticking this information into the equation in
757	   Section 6 yields (in days):

759	     addWaitTime = 30
760	                   + (21)
761	                   + MAX(MIN((21) / 2,
762	                             MAX(2) / 2,
763	                             15 days),
764	                         1 hour)
765	                   + 2 * MAX(2)

767	     addWaitTime = 30 + 21 + MAX(MIN(10.5, 1, 15), 1 hour) + 4

769	     addWaitTime = 30 + 21 + 1 + 4

771	     addWaitTime = 56 days

773	   Note that we use an activeRefreshOffset of 0, since 30 days is evenly
774	   divisible by activeRefresh (1 day).

776	   Thus, ICANN should wait a minimum of 56 days before switching to the
777	   newly published KSK (and 26 days before removing the old revoked key
778	   once it is published as revoked).  ICANN's current plans are to wait
779	   70 days before using the new key and 69 days before removing the old,
780	   revoked key.  Thus, their current rollover plans are sufficiently
781	   secure from the attack discussed in this memo.
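
   The figures in this appendix can be reproduced with a few lines of
   Python.  This is only a sketch: the variable names are hypothetical,
   all values are in days, and the 2 day DNSKEY TTL is assumed to also
   be the maximum TTL of all records, as in the calculation above.

```python
ADD_HOLD_DOWN = 30      # addHoldDownTime
SIG_EXPIRATION = 21     # old DNSKEY sigExpirationTime
DNSKEY_TTL = 2          # old DNSKEY TTL, also assumed to be the max TTL

# activeRefresh = MAX(1 hour, MIN(21 / 2, 2 / 2, 15 days)) = 1 day
active_refresh = max(1 / 24, min(SIG_EXPIRATION / 2, DNSKEY_TTL / 2, 15))
offset = ADD_HOLD_DOWN % active_refresh     # activeRefreshOffset
margin = 2 * DNSKEY_TTL                     # 2 * MAX(TTL of all records)

add_wait = ADD_HOLD_DOWN + SIG_EXPIRATION + active_refresh + offset + margin
rem_wait = SIG_EXPIRATION + active_refresh + margin

# Planned waits quoted in the text above.
PLANNED_ADD, PLANNED_REMOVE = 70, 69
print(add_wait, rem_wait)
print(PLANNED_ADD - add_wait, PLANNED_REMOVE - rem_wait)
```

   Under these assumptions the planned waits exceed the computed
   minimums by 14 days for adding the new key and 43 days for removing
   the revoked one.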

783	Authors' Addresses

785	   Wes Hardaker
786	   USC/ISI
787	   P.O. Box 382
788	   Davis, CA  95617
789	   US

791	   Email: ietf@hardakers.net

792	   Warren Kumari
793	   Google
794	   1600 Amphitheatre Parkway
795	   Mountain View, CA  94043
796	   US

798	   Email: warren@kumari.net