dnsop                                                        W. Hardaker
Internet-Draft                                                   USC/ISI
Updates: 7583 (if approved)                                    W. Kumari
Intended status: Standards Track                                  Google
Expires: June 10, 2018                                 December 07, 2017

             Security Considerations for RFC5011 Publishers
          draft-ietf-dnsop-rfc5011-security-considerations-09

Abstract

   This document extends the RFC5011 rollover strategy with timing
   advice that must be followed by the publisher in order to maintain
   security. Specifically, this document describes the math behind the
   minimum time-length that a DNS zone publisher must wait before
   signing exclusively with recently added DNSKEYs. This document also
   describes the minimum time-length that a DNS zone publisher must wait
   after publishing a revoked DNSKEY before assuming that all active
   RFC5011 resolvers should have seen the revocation-marked key and
   removed it from their list of trust anchors.

   This document contains much math and complicated equations, but the
   summary is that the key rollover / revocation time is much longer
   than intuition would suggest. If you are not both publishing a
   DNSSEC DNSKEY, and using RFC5011 to advertise this DNSKEY as a new
   Secure Entry Point key for use as a trust anchor, you probably don't
   need to read this document.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on June 10, 2018.

Copyright Notice

   Copyright (c) 2017 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Document History and Motivation
     1.2.  Safely Rolling the Root Zone's KSK in 2017/2018
     1.3.  Requirements notation
   2.  Background
   3.  Terminology
   4.  Timing Associated with RFC5011 Processing
     4.1.  Timing Associated with Publication
     4.2.  Timing Associated with Revocation
   5.  Denial of Service Attack Walkthrough
     5.1.  Enumerated Attack Example
       5.1.1.  Attack Timing Breakdown
   6.  Minimum RFC5011 Timing Requirements
     6.1.  Equation Components
       6.1.1.  addHoldDownTime
       6.1.2.  lastSigExpirationTime
       6.1.3.  sigExpirationTime
       6.1.4.  sigExpirationTimeRemaining
       6.1.5.  sigExpirationTimeRemaining
       6.1.6.  activeRefresh
       6.1.7.  activeRefreshOffset
       6.1.8.  safetyMargin
     6.2.  Timing Requirements For Adding a New KSK
       6.2.1.  Wait Timer Based Calculation
       6.2.2.  Wall-Clock Based Calculation
       6.2.3.  Timing Constraint Summary
       6.2.4.  Additional Considerations for RFC7583
       6.2.5.  Example Scenario Calculations
     6.3.  Timing Requirements For Revoking an Old KSK
       6.3.1.  Wait Timer Based Calculation
       6.3.2.  Wall-Clock Based Calculation
       6.3.3.  Additional Considerations for RFC7583
       6.3.4.  Example Scenario Calculations
   7.  IANA Considerations
   8.  Operational Considerations
   9.  Security Considerations
   10. Acknowledgements
   11. Normative References
   Appendix A.  Real World Example: The 2017 Root KSK Key Roll
   Authors' Addresses

1.  Introduction

   [RFC5011] defines a mechanism by which DNSSEC validators can update
   their list of trust anchors when they've seen a new key published in
   a zone or revoke a properly marked key from a trust anchor list.
   However, RFC5011 [intentionally] provides no guidance to the
   publishers of DNSKEYs about how long they must wait before switching
   to exclusively using recently published keys for signing records, or
   how long they must wait before ceasing publication of a revoked key.
   Because of this lack of guidance, zone publishers may derive
   incorrect assumptions about safe usage of the RFC5011 DNSKEY
   advertising, rolling and revocation process. This document describes
   the minimum security requirements from a publisher's point of view
   and is intended to complement the guidance offered in RFC5011 (which
   is written to provide timing guidance solely from a Validating
   Resolver's point of view).

1.1.  Document History and Motivation

   To verify that this lack of understanding is widespread, the authors
   reached out to 5 DNSSEC experts to ask them how long they thought
   they must wait before signing a zone exclusively with a new KSK
   [RFC4033] that was being introduced according to the RFC5011 process.
   All 5 experts answered with an insecure value, and we determined that
   this lack of mathematical understanding might cause security concerns
   in deployment. We hope that this companion document to RFC5011 will
   rectify this misunderstanding and provide better guidance to zone
   publishers that wish to make use of the RFC5011 rollover process.

1.2.  Safely Rolling the Root Zone's KSK in 2017/2018

   One important note about ICANN's (currently in process) 2017/2018 KSK
   rollover plan for the root zone: the timing values chosen for rolling
   the KSK in the root zone appear completely safe, and are not affected
   by the timing concerns introduced by this draft.

1.3.  Requirements notation

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

2.  Background

   RFC5011 describes a process by which an RFC5011 Resolver
   may accept a newly published KSK as a trust anchor for validating
   future DNSSEC signed records. It also describes the process for
   publicly revoking a published KSK.
   This document augments that
   information with additional constraints, from the SEP Publisher's
   point of view. Note that this document does not define any other
   operational guidance or recommendations about the RFC5011 process and
   restricts itself solely to the security and operational ramifications
   of switching to exclusively using recently added keys or removing
   revoked keys too soon.

   Failure of a DNSKEY publisher to follow the minimum recommendations
   associated with this draft can result in potential denial-of-service
   attack opportunities against validating resolvers. Failure of a
   DNSKEY publisher to publish a revoked key for a long enough period of
   time may result in RFC5011 Resolvers leaving that key in their trust
   anchor storage beyond the key's expected lifetime.

3.  Terminology

   SEP Publisher  The entity responsible for publishing a DNSKEY (with
      the Secure Entry Point (SEP) bit set) that can be used as a trust
      anchor.

   Zone Signer  The owner of a zone intending to publish a new Key-
      Signing-Key (KSK) that may become a trust anchor for validators
      following the RFC5011 process.

   RFC5011 Resolver  A DNSSEC Resolver that is using the RFC5011
      processes to track and update trust anchors.

   Attacker  An entity intent on foiling the RFC5011 Resolver's ability
      to successfully adopt the Zone Signer's new DNSKEY as a new trust
      anchor or to prevent the RFC5011 Resolver from removing an old
      DNSKEY from its list of trust anchors.

   sigExpirationTime  The amount of time between the DNSKEY RRSIG's
      Signature Inception field and the Signature Expiration field.

   Also see Section 2 of [RFC4033] and [RFC7719] for additional
   terminology.

4.  Timing Associated with RFC5011 Processing

   These sections provide a high-level overview of [RFC5011] processing.
   These steps are not sufficient for proper RFC5011 implementation, but
   provide enough background for the reader to follow the discussion in
   this document. Readers need to fully understand [RFC5011] as well to
   fully comprehend the content and importance of this document.

4.1.  Timing Associated with Publication

   RFC5011's process of safely publishing a new DNSKEY and then assuming
   RFC5011 Resolvers have adopted it for trust falls into a number of
   high-level steps to be performed by the SEP Publisher. This document
   discusses the following scenario, which is the principal way RFC5011
   is currently being used (even though Section 6 of RFC5011 suggests
   having a stand-by key available):

   1.  Publish a new DNSKEY in a zone, but continue to sign the zone
       with the old one.

   2.  Wait a period of time.

   3.  Begin to exclusively use recently published DNSKEYs to sign the
       appropriate resource records.

   This document discusses the time required to wait during step 2 of
   the above process. Some interpretations of RFC5011 have erroneously
   determined that the wait time is equal to RFC5011's "hold down time".
   Section 5 describes an attack based on this (common) erroneous
   belief, which can result in a denial of service attack against the
   zone.

4.2.  Timing Associated with Revocation

   RFC5011's process of advertising that an old key is to be revoked
   from RFC5011 Resolvers falls into a number of high-level steps:

   1.  Set the revoke bit on the DNSKEY to be revoked.

   2.  Sign the revoked DNSKEY with itself.

   3.  Wait a period of time.

   4.  Remove the revoked key from the zone.

   This document discusses the time required to wait in step 3 of the
   above process. Some interpretations of RFC5011 have erroneously
   determined that the wait time is equal to RFC5011's "hold down time".
   This document describes an attack based on this (common) erroneous
   belief, which results in a revoked DNSKEY potentially remaining as a
   trust anchor in an RFC5011 Resolver long past its expected usage.

5.  Denial of Service Attack Walkthrough

   This section serves as an illustrative example of the problem being
   discussed in this document. Note that in order to keep the example
   simple enough to understand, some simplifications were made (such as
   by not creating a set of pre-signed RRSIGs and by not using values
   that result in the addHoldDownTime not being evenly divisible by the
   activeRefresh value); the mathematical formulas in Section 6 are,
   however, complete.

   If an attacker is able to provide an RFC5011 Resolver with past
   responses, such as when it is in-path or able to perform any number
   of cache poisoning attacks, the attacker may be able to leave
   compliant RFC5011 Resolvers without an appropriate DNSKEY trust
   anchor. This situation will persist until an administrator manually
   fixes it.

   The time-line below illustrates an example of this situation.

5.1.  Enumerated Attack Example

   The following example settings are used in the example scenario
   within this section:

   TTL (all records)    1 day

   sigExpirationTime    10 days

   Zone resigned every  1 day

   Given these settings, the sequence of events in Section 5.1.1 depicts
   how a SEP Publisher that waits for only the RFC5011 hold-down timer
   length of 30 days subjects its users to a potential Denial of Service
   attack. The timing schedule listed below is based on a SEP Publisher
   publishing a new Key Signing Key (KSK), with the intent that it will
   later be used as a trust anchor. We label this publication time as
   "T+0". All numbers in this sequence refer to days before and after
   this initial publication event.
   Thus, T-1 is the day before the
   introduction of the new key, and T+15 is the 15th day after the key
   was introduced into the fictitious zone being discussed.

   In this dialog, we consider two keys within the example zone:

   K_old:  An older KSK and Trust Anchor being replaced.

   K_new:  A new KSK being transitioned into active use and expected to
      become a Trust Anchor via the RFC5011 automated trust anchor
      update process.

5.1.1.  Attack Timing Breakdown

   The steps below show an attack that foils the adoption of a new
   DNSKEY by an RFC5011 Resolver when the SEP Publisher starts signing
   and publishing with the new DNSKEY too quickly.

   T-1  The K_old based RRSIGs are being published by the Zone Signer.
      [It may be signing ZSKs as well, but they are not relevant to
      this event so we will not talk further about them; we are only
      considering the RRSIGs that cover the DNSKEYs in this document.]
      The Attacker queries for, retrieves and caches this DNSKEY set and
      corresponding RRSIG signatures.

   T+0  The Zone Signer adds K_new to their zone and signs the zone's
      key set with K_old. The RFC5011 Resolver (later to be under
      attack) retrieves this new key set and corresponding RRSIGs and
      notices the publication of K_new. The RFC5011 Resolver starts the
      (30-day) hold-down timer for K_new. [Note that in a more real-
      world scenario there will likely be a further delay between the
      point where the Zone Signer publishes a new RRSIG and the RFC5011
      Resolver notices its publication; though not shown in this
      example, this delay is accounted for in the equation in Section 6
      below.]

   T+5  The RFC5011 Resolver queries for the zone's keyset per the
      RFC5011 Active Refresh schedule, discussed in Section 2.3 of
      RFC5011. Instead of receiving the intended published keyset, the
      Attacker successfully replays the keyset and associated signatures
      recorded at T-1 to the victim RFC5011 Resolver.
      Because the
      signature lifetime is 10 days (in this example), the replayed
      signature and keyset are accepted as valid (being only 6 days old,
      which is less than sigExpirationTime) and the RFC5011 Resolver
      cancels the (30-day) hold-down timer for K_new, per the RFC5011
      algorithm.

   T+10  The RFC5011 Resolver queries for the zone's keyset and
      discovers a signed keyset that includes K_new (again), and is
      signed by K_old. Note: the attacker is unable to replay the
      records cached at T-1, because the signatures have now expired.

      Thus at T+10, the RFC5011 Resolver starts (anew) the hold-down
      timer for K_new.

   T+11 through T+29  The RFC5011 Resolver continues checking the zone's
      key set at the prescribed regular intervals. During this period,
      the attacker can no longer replay traffic to their benefit.

   T+30  The Zone Signer knows that this is the first time at which some
      validators might accept K_new as a new trust anchor, since the
      hold-down timer of an RFC5011 Resolver not under attack that had
      queried and retrieved K_new at T+0 would now have reached 30 days.
      However, the hold-down timer of our attacked RFC5011 Resolver is
      only at 20 days.

   T+35  The Zone Signer (mistakenly) believes that all validators
      following the Active Refresh schedule (Section 2.3 of RFC5011)
      should have accepted K_new as the new trust anchor (since the
      hold down time (30 days) + the query interval [which is just 1/2
      the signature validity period in this example] would have passed).
      However, the hold-down timer of our attacked RFC5011 Resolver is
      only at 25 days (T+35 minus T+10); thus the RFC5011 Resolver won't
      consider it a valid trust anchor addition yet, as the required 30
      days have not yet elapsed.

   T+36  The Zone Signer, believing K_new is safe to use, switches their
      active signing KSK to K_new and publishes a new RRSIG, signed with
      (only) K_new, covering the DNSKEY set.
      Non-attacked RFC5011
      validators, with a hold-down timer of at least 30 days, would have
      accepted K_new into their set of trusted keys. But, because our
      attacked RFC5011 Resolver now has a hold-down timer for K_new of
      only 26 days, it has never accepted K_new as a trust anchor.
      Since K_old is no longer being used to sign the zone's DNSKEYs,
      all the DNSKEY records from the zone will be treated as invalid.
      Subsequently, all of the records in the DNS tree below the zone's
      apex will be deemed invalid by DNSSEC.

6.  Minimum RFC5011 Timing Requirements

   This section defines the minimum timing requirements for making
   exclusive use of newly added DNSKEYs and timing requirements for
   ceasing the publication of DNSKEYs to be revoked. First, we define
   the term components used in both equations in Section 6.1.

6.1.  Equation Components

6.1.1.  addHoldDownTime

   The addHoldDownTime is defined in Section 2.4.1 of [RFC5011] as:

      The add hold-down time is 30 days or the expiration time of the
      original TTL of the first trust point DNSKEY RRSet that contained
      the new key, whichever is greater. This ensures that at least
      two validated DNSKEY RRSets that contain the new key MUST be seen
      by the resolver prior to the key's acceptance.

6.1.2.  lastSigExpirationTime

   The latest value (i.e., the date and time furthest in the future) of
   any RRSIG Signature Expiration field covering any DNSKEY RRSet
   containing only the old trust anchor(s) that are being superseded.
   Note that for organizations pre-creating signatures this time may be
   fairly far in the future unless they can be significantly assured
   that none of their pre-generated signatures can be replayed at a
   later date.

6.1.3.  sigExpirationTime

   The amount of time between the DNSKEY RRSIG's Signature Inception
   field and the Signature Expiration field.

6.1.4.  sigExpirationTimeRemaining

   The amount of time remaining before lastSigExpirationTime is reached.

6.1.5.  sigExpirationTimeRemaining

   sigExpirationTimeRemaining is defined in Section 3.

6.1.6.  activeRefresh

   The activeRefresh time is defined by RFC5011 as:

      A resolver that has been configured for an automatic update
      of keys from a particular trust point MUST query that trust
      point (e.g., do a lookup for the DNSKEY RRSet and related
      RRSIG records) no less often than the lesser of 15 days, half
      the original TTL for the DNSKEY RRSet, or half the RRSIG
      expiration interval and no more often than once per hour.

   This translates to:

   activeRefresh = MAX(1 hour,
                       MIN(sigExpirationTime / 2,
                           MAX(TTL of K_old DNSKEY RRSet) / 2,
                           15 days)
                       )

6.1.7.  activeRefreshOffset

   The activeRefreshOffset term must be added for situations where the
   activeRefresh value is not a factor of the addHoldDownTime.
   Specifically, activeRefreshOffset will be "addHoldDownTime %
   activeRefresh", where % is the mathematical mod operator (calculating
   the remainder in a division problem). This will frequently be zero,
   but could be nearly as large as activeRefresh itself. For
   simplicity, setting the activeRefreshOffset to the activeRefresh
   value itself is always safe.

6.1.8.  safetyMargin

   The safetyMargin is an extra period of time to account for caching,
   network delays, dropped packets, and other operational concerns
   otherwise beyond the scope of this document. The value operators
   should choose is highly dependent on the deployment situation
   associated with their zone. Note that no value of a safetyMargin can
   protect against resolvers that are "down". Nonetheless, we do
   offer the following as one method for considering reasonable values
   to select from.

   The following list of variables needs to be considered when selecting
   an appropriate safetyMargin value:

   successRate:  A likely success rate for client queries and retries

   numResolvers:  The number of client RFC5011 Resolvers

   Note that RFC5011 defines retryTime as:

      If the query fails, the resolver MUST repeat the query until
      satisfied no more often than once an hour and no less often
      than the lesser of 1 day, 10% of the original TTL, or 10% of
      the original expiration interval. That is,
      retryTime = MAX (1 hour, MIN (1 day, .1 * origTTL,
      .1 * expireInterval)).

   With the successRate and numResolvers values selected and the
   definition of retryTime from RFC5011, one method for determining how
   many retryTime intervals to wait in order to reduce the set of
   uncompleted servers to 0, assuming normal probability, is thus:

      x = (1/(1 - successRate))

      retryCountWait = Log_base_x(numResolvers)

   To reduce the need for readers to pull out a scientific calculator,
   we offer the following lookup table based on successRate and
   numResolvers:

                       retryCountWait lookup table
                       ---------------------------

                   Number of client RFC5011 Resolvers (numResolvers)
                   -------------------------------------------------
                   10,000  100,000  1,000,000  10,000,000  100,000,000
           0.01       917     1146       1375        1604         1833
   Probability 0.05   180      225        270         315          360
   of Success  0.10    88      110        132         153          175
   Per Retry   0.15    57       71         86         100          114
   Interval    0.25    33       41         49          57           65
   (successRate) 0.50  14       17         20          24           27
           0.90         4        5          6           7            8
           0.95         4        4          5           6            7
           0.99         2        3          3           4            4
           0.999        2        2          2           3            3

   Finally, a suggested value of safetyMargin can then be this
   retryCountWait number multiplied by the retryTime from RFC5011:

      safetyMargin = retryCountWait * retryTime

6.2.  Timing Requirements For Adding a New KSK

   Section 6.2.1 defines a method for calculating the amount of time to
   wait until it is safe to start signing exclusively with a new DNSKEY
   (especially useful for writing code involving sleep based timers),
   and Section 6.2.2 defines a method for calculating a wall-clock value
   after which it is safe to start signing exclusively with a new DNSKEY
   (especially useful for writing code based on clock-based event
   triggers).

6.2.1.  Wait Timer Based Calculation

   Given the attack description in Section 5, the correct minimum length
   of time required for the Zone Signer to wait after publishing K_new
   but before exclusively using it and newer keys is:

   addWaitTime = addHoldDownTime
                 + sigExpirationTimeRemaining
                 + activeRefresh
                 + activeRefreshOffset
                 + safetyMargin

6.2.1.1.  Fully expanded equation

   Given the equation components defined in Section 6.1, the full
   expanded equation is:

   addWaitTime = addHoldDownTime
                 + sigExpirationTimeRemaining
                 + MAX(1 hour,
                       MIN(sigExpirationTime / 2,
                           MAX(TTL of K_old DNSKEY RRSet) / 2,
                           15 days)
                       )
                 + (addHoldDownTime % activeRefresh)
                 + safetyMargin

6.2.2.  Wall-Clock Based Calculation

   The equations in Section 6.2.1 are defined based upon how long to
   wait from a particular moment in time. An alternative, but
   equivalent, method is to calculate the date and time before which it
   is unsafe to use a key for signing. This calculation thus becomes:

   addWallClockTime = lastSigExpirationTime
                      + addHoldDownTime
                      + activeRefresh
                      + activeRefreshOffset
                      + safetyMargin

   where lastSigExpirationTime is the latest Signature Expiration value
   of any RRSIG that was created and could
   potentially be replayed.
   Fully expanded, this becomes:

   addWallClockTime = lastSigExpirationTime
                      + addHoldDownTime
                      + MAX(1 hour,
                            MIN(sigExpirationTime / 2,
                                MAX(TTL of K_old DNSKEY RRSet) / 2,
                                15 days)
                            )
                      + (addHoldDownTime % activeRefresh)
                      + safetyMargin

6.2.3.  Timing Constraint Summary

   The important timing constraint introduced by this memo relates to
   the last point at which an RFC5011 Resolver may have received a
   replayed original DNSKEY set, containing K_old and not K_new. The
   next query of the RFC5011 validator at which K_new will be seen
   without the potential for a replay attack will occur after the old
   DNSKEY RRSIG's Signature Expiration Time. Thus, the latest time that
   an RFC5011 Validator may begin its hold-down timer is an "Active
   Refresh" period after the last point that an attacker can replay the
   K_old DNSKEY set. The worst case scenario of this attack is if the
   attacker can replay K_old just seconds before the Signature
   Expiration time of the last K_old-only DNSKEY RRSIG.

6.2.4.  Additional Considerations for RFC7583

   Note: our notion of addWaitTime is called "Itrp" in Section 3.3.4.1
   of [RFC7583]. The equation for Itrp in RFC7583 is insecure as it
   does not include the sigExpirationTime listed above. The Itrp
   equation in RFC7583 also does not include a safety margin,
   though that is an operational consideration.

6.2.5.  Example Scenario Calculations

   For the parameters listed in Section 5.1, the activeRefreshOffset is
   0, since 30 days is evenly divisible by activeRefresh (1/2 day), and
   our resulting addWaitTime is:

   addWaitTime = 30
                 + 10
                 + 1 / 2
                 + 0          (days)

   addWaitTime = 40.5 (days)

   This addWaitTime of 40.5 days is 10.5 days longer than just the hold
   down timer, even with the needed safetyMargin value being left out
   (which we exclude due to the lack of necessary operational
   parameters).

6.3.  Timing Requirements For Revoking an Old KSK

   This issue affects not just the publication of new DNSKEYs intended
   to be used as trust anchors, but also the length of time required to
   continuously publish a DNSKEY with the revoke bit set.

   Section 6.3.1 defines a method for calculating the amount of time
   operators need to wait until it is safe to cease publishing a DNSKEY
   (especially useful for writing code involving sleep based timers),
   and Section 6.3.2 defines a method for calculating a minimal wall-
   clock value after which it is safe to cease publishing a DNSKEY
   (especially useful for writing code based on clock-based event
   triggers).

6.3.1.  Wait Timer Based Calculation

   Both of these publication timing requirements are affected by the
   attacks described in this document, but with revocation the key is
   revoked immediately and the addHoldDownTime does not apply. Thus
   the minimum amount of time that a SEP Publisher must wait before
   removing a revoked key from publication is:

   remWaitTime = sigExpirationTimeRemaining
                 + activeRefresh
                 + safetyMargin

   remWaitTime = sigExpirationTimeRemaining
                 + MAX(1 hour,
                       MIN((sigExpirationTime) / 2,
                           MAX(TTL of K_old DNSKEY RRSet) / 2,
                           15 days))
                 + safetyMargin

   Note that the activeRefreshOffset time does not apply to this
   equation.
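
   As a rough illustration only, the remWaitTime equation above can be
   sketched in Python. The function and parameter names are our own
   hypothetical choices and are not defined by RFC5011 or by this
   document. The safetyMargin defaults to zero purely because no
   operational parameters are given, and the example assumes that the
   full 10 days of signature validity from Section 5.1 remain.

```python
from datetime import timedelta


def active_refresh(sig_expiration_time, dnskey_ttl):
    """activeRefresh = MAX(1 hour, MIN(sigExpirationTime / 2, TTL / 2, 15 days))."""
    return max(
        timedelta(hours=1),
        min(sig_expiration_time / 2, dnskey_ttl / 2, timedelta(days=15)),
    )


def rem_wait_time(sig_expiration_time_remaining, sig_expiration_time,
                  dnskey_ttl, safety_margin=timedelta(0)):
    """remWaitTime = sigExpirationTimeRemaining + activeRefresh + safetyMargin."""
    return (sig_expiration_time_remaining
            + active_refresh(sig_expiration_time, dnskey_ttl)
            + safety_margin)


# Section 5.1 example values; safety_margin is left at zero (an assumption).
example_rem_wait = rem_wait_time(
    sig_expiration_time_remaining=timedelta(days=10),
    sig_expiration_time=timedelta(days=10),
    dnskey_ttl=timedelta(days=1),
)
print(example_rem_wait)  # 10 days, 12:00:00
```

   A real deployment would pass a non-zero safety_margin chosen for its
   own operational situation.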
642 Note also that adding retryTime intervals to the remWaitTime may be 643 wise, just as it was for addWaitTime in Section 6. 645 6.3.2. Wall-Clock Based Calculation 647 Like before, the above equations are defined based upon how long to 648 wait from a particular moment in time. An alternative, but 649 equivalent, method is to calculate the date and time before which it 650 is unsafe to cease publishing a revoked key. This calculation thus 651 becomes: 653 remWallClockTime = lastSigExpirationTime 654 + activeRefresh 655 + safetyMargin 657 remWallClockTime = lastSigExpirationTime 658 + MAX(1 hour, 659 MIN((sigExpirationTime) / 2, 660 MAX(TTL of K_old DNSKEY RRSet) / 2, 661 15 days)) 662 + safetyMargin 664 where lastSigExpirationTime is the latest value of any 665 sigExpirationTime for which RRSIGs were created that could 666 potentially be replayed. Fully expanded, this becomes: 668 6.3.3. Additional Considerations for RFC7583 670 Note that our notion of remWaitTime is called "Irev" in 671 Section 3.3.4.2 of [RFC7583]. The equation for Irev in RFC7583 is 672 insecure as it does not include the sigExpirationTime listed above. 673 The Irev equation in RFC7583 also does not include a safety margin, 674 though that is an operational consideration. 676 6.3.4. Example Scenario Calculations 678 For the parameters listed in Section 5.1, our example: 680 remwaitTime = 10 681 + 1 / 2 (days) 683 remwaitTime = 10.5 (days) 685 Note that for the values in this example produce a length shorter 686 than the recommended 30 days in RFC5011's section 6.6, step 3. Other 687 values of sigExpirationTime and the original TTL of the K_old DNSKEY 688 RRSet, however, can produce values longer than 30 days. 690 Note that because revocation happens immediately, an attacker has a 691 much harder job tricking a RFC5011 Resolver into leaving a trust 692 anchor in place, as the attacker must successfully replay the old 693 data for every query a RFC5011 Resolver sends, not just one. 695 7. 
IANA Considerations

   This document contains no IANA considerations.

8. Operational Considerations

   A companion document to RFC5011 was expected to be published that
   describes the best operational practice considerations from the
   perspective of a zone publisher and SEP Publisher. However, this
   companion document has yet to be published. The authors of this
   document hope that it will be at some point in the future, as RFC5011
   timing can be tricky, as we have shown, and a BCP is clearly
   warranted. This document is intended only to fill a single
   operational void which, when left misunderstood, can result in
   serious security ramifications. This document does not attempt to
   document any other missing operational guidance for zone publishers.

9. Security Considerations

   This document is solely about the security considerations with
   respect to the SEP Publisher's ability to advertise new DNSKEYs via
   the RFC5011 automated trust anchor update process. Thus, the entire
   document is a discussion of Security Considerations when adding or
   removing DNSKEYs from trust anchor storage using the RFC5011 process.

   For simplicity, this document assumes that the SEP Publisher will use
   a consistent RRSIG validity period. SEP Publishers that vary the
   length of RRSIG validity periods will need to adjust the
   sigExpirationTime value accordingly so that the equations in
   Section 6 and Section 6.3 use a value that coincides with the point
   after which a replay of older RRSIGs will no longer succeed.

10. Acknowledgements

   The authors would especially like to thank Michael StJohns for his
   help and advice, the care and thought he put into RFC5011 itself,
   and his continued reviews and suggestions for this document. He also
   designed the math behind the suggested safetyMargin values in
   Section 6.1.8.
   We would also like to thank Bob Harold, Shane Kerr, Matthijs Mekking,
   Duane Wessels, Petr Spacek, Ed Lewis, and the dnsop working group,
   who have assisted with this document.

11. Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC4033]  Arends, R., Austein, R., Larson, M., Massey, D., and S.
              Rose, "DNS Security Introduction and Requirements",
              RFC 4033, DOI 10.17487/RFC4033, March 2005.

   [RFC5011]  StJohns, M., "Automated Updates of DNS Security (DNSSEC)
              Trust Anchors", STD 74, RFC 5011, DOI 10.17487/RFC5011,
              September 2007.

   [RFC7583]  Morris, S., Ihren, J., Dickinson, J., and W. Mekking,
              "DNSSEC Key Rollover Timing Considerations", RFC 7583,
              DOI 10.17487/RFC7583, October 2015.

   [RFC7719]  Hoffman, P., Sullivan, A., and K. Fujiwara, "DNS
              Terminology", RFC 7719, DOI 10.17487/RFC7719, December
              2015.

Appendix A. Real World Example: The 2017 Root KSK Key Roll

   In 2017 and 2018, ICANN expects to (or has, depending on when you're
   reading this) roll the key signing key (KSK) for the root zone. The
   relevant parameters associated with the root zone at the time of this
   writing are as follows:

     addHoldDownTime:              30 days
     Old DNSKEY sigExpirationTime: 21 days
     Old DNSKEY TTL:                2 days

   Thus, inserting this information into the equation in Section 6
   yields (in days from publication time):

     addWaitTime = 30
                   + 21
                   + MAX(1 hour,
                         MIN(21 / 2,       # activeRefresh
                             MAX(2) / 2,
                             15 days))
                   + 30 % activeRefresh

     addWaitTime = 30 + 21
                   + MAX(1 hour, MIN(10.5, 1, 15))
                   + 30 % activeRefresh

     addWaitTime = 30 + 21 + 1 + 30 % 1

     addWaitTime = 30 + 21 + 1 + 0

     addWaitTime = 52 days

   Note that activeRefreshOffset ends up being 0, since 30 days is
   evenly divisible by activeRefresh (1 day).
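   The arithmetic above can be reproduced with a short Python sketch.
   This is only an illustration under stated assumptions: the function
   names are hypothetical, activeRefreshOffset is taken to be
   addHoldDownTime modulo activeRefresh (as the "30 % activeRefresh"
   term in the calculation suggests), and safetyMargin defaults to zero
   because this example leaves it out.

```python
from datetime import timedelta


def active_refresh(sig_expiration_time, dnskey_ttl):
    """MAX(1 hour, MIN(sigExpirationTime / 2, TTL / 2, 15 days))."""
    return max(
        timedelta(hours=1),
        min(sig_expiration_time / 2, dnskey_ttl / 2, timedelta(days=15)),
    )


def add_wait_time(add_hold_down_time, sig_expiration_time, dnskey_ttl,
                  safety_margin=timedelta(0)):
    """addHoldDownTime + sigExpirationTime + activeRefresh
    + activeRefreshOffset + safetyMargin."""
    refresh = active_refresh(sig_expiration_time, dnskey_ttl)
    # Assumption: the offset is the remainder of the hold-down time
    # divided by the refresh interval.
    refresh_offset = add_hold_down_time % refresh
    return (add_hold_down_time + sig_expiration_time
            + refresh + refresh_offset + safety_margin)


# Root zone parameters listed in this appendix; safetyMargin is left
# at zero here, which is an assumption and not a recommendation.
root_wait = add_wait_time(
    add_hold_down_time=timedelta(days=30),
    sig_expiration_time=timedelta(days=21),
    dnskey_ttl=timedelta(days=2),
)
print(root_wait)  # 52 days, 0:00:00
```

   Passing a non-zero safety_margin shows how much further the wait
   extends once an operator has chosen a margin for its expected client
   population.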
   Also note that we exclude the safetyMargin value, which is calculated
   based on the expected client deployment size.

   Thus, ICANN must wait a minimum of 52 days before switching to the
   newly published KSK (and 26 days before removing the old revoked key
   once it is published as revoked). ICANN's current plans involve
   waiting over 3 months before using the new key and 69 days before
   removing the old, revoked key. Thus, their current rollover plans
   are sufficiently secure from the attack discussed in this memo.

Authors' Addresses

   Wes Hardaker
   USC/ISI
   P.O. Box 382
   Davis, CA 95617
   US

   Email: ietf@hardakers.net

   Warren Kumari
   Google
   1600 Amphitheatre Parkway
   Mountain View, CA 94043
   US

   Email: warren@kumari.net