idnits 2.17.1

draft-ietf-dnsop-rfc5011-security-considerations-08.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  -- The draft header indicates that this document updates RFC7583, but the
     abstract doesn't seem to mention this, which it should.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (November 29, 2017) is 2339 days in the past. Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  ** Downref: Normative reference to an Informational RFC: RFC 7583

  ** Obsolete normative reference: RFC 7719 (Obsoleted by RFC 8499)

     Summary: 2 errors (**), 0 flaws (~~), 1 warning (==), 2 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

dnsop                                                        W. Hardaker
Internet-Draft                                                   USC/ISI
Updates: 7583 (if approved)                                    W. Kumari
Intended status: Standards Track                                  Google
Expires: June 2, 2018                                  November 29, 2017

             Security Considerations for RFC5011 Publishers
           draft-ietf-dnsop-rfc5011-security-considerations-08

Abstract

   This document extends the RFC5011 rollover strategy with timing
   advice that must be followed in order to maintain security.
   Specifically, this document describes the math behind the minimum
   time-length that a DNS zone publisher must wait before signing
   exclusively with recently added DNSKEYs. It contains much math and
   complicated equations, but the summary is that the key rollover /
   revocation time is much longer than intuition would suggest. If you
   are not both publishing a DNSSEC DNSKEY, and using RFC5011 to
   advertise this DNSKEY as a new Secure Entry Point key for use as a
   trust anchor, you probably don't need to read this document.

   This document also describes the minimum time-length that a DNS zone
   publisher must wait after publishing a revoked DNSKEY before assuming
   that all active RFC5011 resolvers should have seen the revocation-
   marked key and removed it from their list of trust anchors.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on June 2, 2018.

Copyright Notice

   Copyright (c) 2017 IETF Trust and the persons identified as the
   document authors.
   All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Document History and Motivation
     1.2.  Safely Rolling the Root Zone's KSK in 2017/2018
     1.3.  Requirements notation
   2.  Background
   3.  Terminology
   4.  Timing Associated with RFC5011 Processing
     4.1.  Timing Associated with Publication
     4.2.  Timing Associated with Revocation
   5.  Denial of Service Attack Walkthrough
     5.1.  Enumerated Attack Example
       5.1.1.  Attack Timing Breakdown
   6.  Minimum RFC5011 Timing Requirements
     6.1.  Equation Components
       6.1.1.  addHoldDownTime
       6.1.2.  sigExpirationTimeRemaining
       6.1.3.  activeRefresh
       6.1.4.  activeRefreshOffset
       6.1.5.  safetyMargin
     6.2.  Timing Requirements For Adding a New KSK
       6.2.1.  Wait Timer Based Calculation
       6.2.2.  Wall-Clock Based Calculation
       6.2.3.  Timing Constraint Summary
       6.2.4.  Additional Considerations for RFC7583
       6.2.5.  Example Scenario Calculations
     6.3.  Timing Requirements For Revoking an Old KSK
       6.3.1.  Wait Timer Based Calculation
       6.3.2.  Wall-Clock Based Calculation
       6.3.3.  Additional Considerations for RFC7583
       6.3.4.  Example Scenario Calculations
   7.  IANA Considerations
   8.  Operational Considerations
   9.  Security Considerations
   10. Acknowledgements
   11. Normative References
   Appendix A.  Real World Example: The 2017 Root KSK Key Roll
   Authors' Addresses

1.  Introduction

   [RFC5011] defines a mechanism by which DNSSEC validators can update
   their list of trust anchors when they've seen a new key published in
   a zone or revoke a properly marked key from a trust anchor list.
   However, RFC5011 [intentionally] provides no guidance to the
   publishers of DNSKEYs about how long they must wait before switching
   to exclusively using recently published keys for signing records, or
   how long they must wait before ceasing publication of a revoked key.
   Because of this lack of guidance, zone publishers may derive
   incorrect assumptions about safe usage of the RFC5011 DNSKEY
   advertising, rolling and revocation process.
   This document describes the minimum security requirements from a
   publisher's point of view and is intended to complement the guidance
   offered in RFC5011 (which provides timing guidance solely from a
   Validating Resolver's point of view).

1.1.  Document History and Motivation

   To verify that this lack of understanding is widespread, the authors
   reached out to 5 DNSSEC experts to ask them how long they thought
   they must wait before signing a zone exclusively with a new KSK
   [RFC4033] that was being introduced according to the 5011 process.
   All 5 experts answered with an insecure value, and we determined that
   this lack of operational guidance might cause security concerns in
   deployment and wrote this companion document to RFC5011. We hope
   that this document will rectify this misunderstanding and provide
   better guidance to zone publishers that wish to make use of the
   RFC5011 rollover process.

1.2.  Safely Rolling the Root Zone's KSK in 2017/2018

   One important note about ICANN's (currently in process) 2017/2018 KSK
   rollover plan for the root zone: the timing values chosen for rolling
   the KSK in the root zone appear completely safe, and are not affected
   by the timing concerns introduced by this draft.

1.3.  Requirements notation

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

2.  Background

   RFC5011 describes a process by which an RFC5011 Resolver may accept
   a newly published KSK as a trust anchor for validating future DNSSEC
   signed records. It also describes the process for publicly revoking
   a published KSK. This document augments that information with
   additional constraints, from the DNSKEY publication and revocation's
   points of view.
   Note that this document does not define any other operational
   guidance or recommendations about the RFC5011 process and restricts
   itself to solely the security and operational ramifications of
   switching to exclusively using recently added keys or removing
   revoked keys too soon.

   Failure of a DNSKEY publisher to follow the minimum recommendations
   associated with this draft can result in potential denial-of-service
   attack opportunities against validating resolvers. Failure of a
   DNSKEY publisher to publish a revoked key for a long enough period of
   time may result in RFC5011 Resolvers leaving that key in their trust
   anchor storage beyond the key's expected lifetime.

3.  Terminology

   SEP Publisher  The entity responsible for publishing a DNSKEY (with
      the Secure Entry Point (SEP) bit set) that can be used as a trust
      anchor.

   Zone Signer  The owner of a zone intending to publish a new Key-
      Signing-Key (KSK) that may become a trust anchor for validators
      following the RFC5011 process.

   RFC5011 Resolver  A DNSSEC Resolver that is using the RFC5011
      processes to track and update trust anchors.

   Attacker  An entity intent on foiling the RFC5011 Resolver's ability
      to successfully adopt the Zone Signer's new DNSKEY as a new trust
      anchor or to prevent the RFC5011 Resolver from removing an old
      DNSKEY from its list of trust anchors.

   lastSigExpirationTime  The latest value of any RRSIG Signature
      Expiration field (which is a date and time) that has signed the
      previous DNSKEY RRset before a new DNSKEY is introduced to a
      published DNSKEY RRset, or the DNSKEY RRset of a DNSKEY that is to
      be revoked. Note that for organizations pre-creating signatures
      this time may be fairly far in the future unless they can be
      significantly assured that none of their pre-generated signatures
      can be replayed at a later date.

   sigExpirationTime  The amount of time between the DNSKEY RRSIG's
      Signature Inception field and the Signature Expiration field.

   sigExpirationTimeRemaining  The amount of time remaining before
      lastSigExpirationTime is reached.

   Also see Section 2 of [RFC4033] and [RFC7719] for additional
   terminology.

4.  Timing Associated with RFC5011 Processing

   These sections define a high-level overview of [RFC5011] processing.
   These steps are not sufficient for proper RFC5011 implementation, but
   provide enough background for the reader to follow the discussion in
   this document. Readers need to fully understand [RFC5011] as well to
   fully comprehend the content and importance of this document.

4.1.  Timing Associated with Publication

   RFC5011's process of safely publishing a new DNSKEY and then assuming
   RFC5011 Resolvers have adopted it for trust falls into a number of
   high-level steps to be performed by the SEP Publisher. This document
   discusses the following scenario, which is the principal way RFC5011
   is currently being used (even though Section 6 of RFC5011 suggests
   having a stand-by key available):

   1.  Publish a new DNSKEY in a zone, but continue to sign the zone
       with the old one.

   2.  Wait a period of time.

   3.  Begin to exclusively use recently published DNSKEYs to sign the
       appropriate resource records.

   This document discusses the time required to wait during step 2 of
   the above process. Some interpretations of RFC5011 have erroneously
   determined that the wait time is equal to RFC5011's "hold down time".
   Section 5 describes an attack based on this (common) erroneous
   belief, which can result in a denial of service attack against the
   zone.

4.2.  Timing Associated with Revocation

   RFC5011's process of advertising that an old key is to be revoked
   from RFC5011 Resolvers falls into a number of high-level steps:

   1.  Set the revoke bit on the DNSKEY to be revoked.

   2.  Sign the revoked DNSKEY with itself.

   3.  Wait a period of time.

   4.  Remove the revoked key from the zone.

   This document discusses the time required to wait in step 3 of the
   above process. Some interpretations of RFC5011 have erroneously
   determined that the wait time is equal to RFC5011's "hold down time".
   This document describes an attack based on this (common) erroneous
   belief, which results in a revoked DNSKEY potentially remaining as a
   trust anchor in an RFC5011 Resolver long past its expected usage.

5.  Denial of Service Attack Walkthrough

   This section serves as an illustrative example of the problem being
   discussed in this document. Note that in order to keep the example
   simple enough to understand, some simplifications were made (such as
   by not creating a set of pre-signed RRSIGs and by not using values
   that result in the addHoldDownTime not being evenly divisible by the
   activeRefresh value); the mathematical formulas in Section 6,
   however, are complete.

   If an attacker is able to provide an RFC5011 Resolver with past
   responses, such as when it is in-path or able to perform any number
   of cache poisoning attacks, the attacker may be able to leave
   compliant RFC5011 Resolvers without an appropriate DNSKEY trust
   anchor. This scenario will remain until an administrator manually
   fixes the situation.

   The time-line below illustrates this situation.

5.1.  Enumerated Attack Example

   The following example settings are used in the example scenario
   within this section:

      TTL (all records)     1 day
      sigExpirationTime     10 days
      Zone resigned every   1 day

   Given these settings, the sequence of events in Section 5.1.1 depicts
   how a SEP Publisher that waits for only the RFC5011 hold-down timer
   length of 30 days subjects its users to a potential Denial of Service
   attack.
   The timing schedule listed below is based on a SEP Publisher
   publishing a new Key Signing Key (KSK), with the intent that it will
   later be used as a trust anchor. We label this publication time as
   "T+0". All numbers in this sequence refer to days before and after
   this initial publication event. Thus, T-1 is the day before the
   introduction of the new key, and T+15 is the 15th day after the key
   was introduced into the fictitious zone being discussed.

   In this dialog, we consider two keys within the example zone:

   K_old:  An older KSK and Trust Anchor being replaced.

   K_new:  A new KSK being transitioned into active use and expected to
      become a Trust Anchor via the RFC5011 automated trust anchor
      update process.

5.1.1.  Attack Timing Breakdown

   The following steps show an attack that foils the adoption of a new
   DNSKEY by a 5011 Resolver when the SEP Publisher starts signing and
   publishing with the new DNSKEY too quickly.

   T-1  The K_old based RRSIGs are being published by the Zone Signer.
      [It may also be signing ZSKs as well, but they are not relevant to
      this event so we will not talk further about them; we are only
      considering the RRSIGs that cover the DNSKEYs in this document.]
      The Attacker queries for, retrieves and caches this DNSKEY set and
      corresponding RRSIG signatures.

   T+0  The Zone Signer adds K_new to their zone and signs the zone's
      key set with K_old. The RFC5011 Resolver (later to be under
      attack) retrieves this new key set and corresponding RRSIGs and
      notices the publication of K_new. The RFC5011 Resolver starts the
      (30-day) hold-down timer for K_new.
      [Note that in a more real-world scenario there will likely be a
      further delay between the point where the Zone Signer publishes a
      new RRSIG and the RFC5011 Resolver notices its publication; though
      not shown in this example, this delay is accounted for in the
      equation in Section 6 below.]

   T+5  The RFC5011 Resolver queries for the zone's keyset per the
      RFC5011 Active Refresh schedule, discussed in Section 2.3 of
      RFC5011. Instead of receiving the intended published keyset, the
      Attacker successfully replays the keyset and associated signatures
      recorded at T-1. Because the signature lifetime is 10 days (in
      this example), the replayed signature and keyset are accepted as
      valid (being only 6 days old, which is less than
      sigExpirationTime) and the RFC5011 Resolver cancels the (30-day)
      hold-down timer for K_new, per the RFC5011 algorithm.

   T+10  The RFC5011 Resolver queries for the zone's keyset and
      discovers a signed keyset that includes K_new (again), and is
      signed by K_old. Note: the attacker is unable to replay the
      records cached at T-1, because they have now expired. Thus at
      T+10, the RFC5011 Resolver starts (anew) the hold-down timer for
      K_new.

   T+11 through T+29  The RFC5011 Resolver continues checking the zone's
      key set at the prescribed regular intervals. During this period,
      the attacker can no longer replay traffic to their benefit.

   T+30  The Zone Signer knows that this is the first time at which some
      validators might accept K_new as a new trust anchor, since the
      hold-down timer of an RFC5011 Resolver not under attack that had
      queried and retrieved K_new at T+0 would now have reached 30 days.
      However, the hold-down timer of our attacked RFC5011 Resolver is
      only at 20 days.

   T+35  The Zone Signer (mistakenly) believes that all validators
      following the Active Refresh schedule (Section 2.3 of RFC5011)
      should have accepted K_new as the new trust anchor (since the
      hold down time (30 days) + the query interval [which is just 1/2
      the signature validity period in this example] would have passed).
      However, the hold-down timer of our attacked RFC5011 Resolver is
      only at 25 days (T+35 minus T+10); thus the RFC5011 Resolver won't
      consider it a valid trust anchor addition yet, as the required 30
      days have not yet elapsed.

   T+36  The Zone Signer, believing K_new is safe to use, switches their
      active signing KSK to K_new and publishes a new RRSIG, signed with
      (only) K_new, covering the DNSKEY set. Non-attacked RFC5011
      validators, with a hold-down timer of at least 30 days, would have
      accepted K_new into their set of trusted keys. But, because our
      attacked RFC5011 Resolver now has a hold-down timer for K_new of
      only 26 days, it failed to ever accept K_new as a trust anchor.
      Since K_old is no longer being used to sign the zone's DNSKEYs,
      all the DNSKEY records from the zone will be treated as invalid.
      Subsequently, all of the records in the DNS tree below the zone's
      apex will be deemed invalid by DNSSEC.

6.  Minimum RFC5011 Timing Requirements

   This section defines the minimum timing requirements for making
   exclusive use of newly added DNSKEYs and timing requirements for
   ceasing the publication of DNSKEYs to be revoked. First, we define
   the term components used in both equations in Section 6.1.

6.1.  Equation Components

6.1.1.  addHoldDownTime

   The addHoldDownTime is defined in Section 2.4.1 of [RFC5011] as:

      The add hold-down time is 30 days or the expiration time of the
      original TTL of the first trust point DNSKEY RRSet that contained
      the new key, whichever is greater.
      This ensures that at least two validated DNSKEY RRSets that
      contain the new key MUST be seen by the resolver prior to the
      key's acceptance.

6.1.2.  sigExpirationTimeRemaining

   sigExpirationTimeRemaining is defined in Section 3.

6.1.3.  activeRefresh

   activeRefresh time is defined by RFC5011 as:

      A resolver that has been configured for an automatic update
      of keys from a particular trust point MUST query that trust
      point (e.g., do a lookup for the DNSKEY RRSet and related
      RRSIG records) no less often than the lesser of 15 days, half
      the original TTL for the DNSKEY RRSet, or half the RRSIG
      expiration interval and no more often than once per hour.

   This translates to:

   activeRefresh = MAX(1 hour,
                       MIN(sigExpirationTime / 2,
                           MAX(TTL of K_old DNSKEY RRSet) / 2,
                           15 days)
                       )

6.1.4.  activeRefreshOffset

   The activeRefreshOffset term must be added for situations where the
   activeRefresh value is not a factor of the addHoldDownTime.
   Specifically, activeRefreshOffset will be "addHoldDownTime %
   activeRefresh", where % is the mathematical mod operator (calculating
   the remainder in a division problem). This will frequently be zero,
   but could be nearly as large as activeRefresh itself. For
   simplicity, setting the activeRefreshOffset to the activeRefresh
   value itself is always safe.

6.1.5.  safetyMargin

   The safetyMargin is an extra period of time to account for caching,
   network delays, dropped packets, and other operational concerns
   otherwise beyond the scope of this document. The value operators
   should choose is highly dependent on the deployment situation
   associated with their zone. Note that no value of a safetyMargin can
   protect against resolvers that are "down". Nonetheless, we do
   offer the following as one method for considering reasonable values
   to select from.
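
   As a brief aside on the activeRefresh and activeRefreshOffset terms
   defined in Sections 6.1.3 and 6.1.4, the following is a minimal
   Python sketch of those two formulas. The function and parameter
   names are illustrative assumptions, not part of RFC5011 or of this
   document; durations are modelled with the standard library's
   timedelta type.

```python
from datetime import timedelta


def active_refresh(sig_expiration_time, dnskey_ttl):
    """activeRefresh (Section 6.1.3):
    MAX(1 hour, MIN(sigExpirationTime / 2, TTL / 2, 15 days)).

    dnskey_ttl is assumed to be the largest TTL of the K_old DNSKEY RRSet.
    """
    return max(
        timedelta(hours=1),
        min(sig_expiration_time / 2, dnskey_ttl / 2, timedelta(days=15)),
    )


def active_refresh_offset(add_hold_down_time, refresh):
    """activeRefreshOffset (Section 6.1.4): addHoldDownTime % activeRefresh."""
    return add_hold_down_time % refresh


# Example settings from Section 5.1: 10-day signatures, 1-day TTL,
# and the 30-day add hold-down time.
refresh = active_refresh(timedelta(days=10), timedelta(days=1))
offset = active_refresh_offset(timedelta(days=30), refresh)
print(refresh, offset)
```

   With the Section 5.1 settings this yields an activeRefresh of half a
   day and an offset of zero, matching the values used in Section 6.2.5.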

   The following variables need to be considered when selecting an
   appropriate safetyMargin value:

   successRate:  A likely success rate for client queries and retries

   numResolvers:  The number of client RFC5011 Resolvers

   Note that RFC5011 defines retryTime as:

      If the query fails, the resolver MUST repeat the query until
      satisfied no more often than once an hour and no less often
      than the lesser of 1 day, 10% of the original TTL, or 10% of
      the original expiration interval. That is,
      retryTime = MAX (1 hour, MIN (1 day, .1 * origTTL,
                                    .1 * expireInterval)).

   With these values selected and the definition of retryTime from
   RFC5011, one method for determining how many retryTime intervals to
   wait in order to reduce the set of uncompleted servers to 0 assuming
   normal probability is thus:

      x = (1/(1 - successRate))

      retryCountWait = Log_base_x(numResolvers)

   To reduce the need for readers to pull out a scientific calculator,
   we offer the following lookup table based on successRate and
   numResolvers:

                       retryCountWait lookup table
                       ---------------------------

                 Number of client RFC5011 Resolvers (numResolvers)

                     10,000  100,000  1,000,000  10,000,000  100,000,000
                0.01    917     1146       1375        1604         1833
   Probability  0.05    180      225        270         315          360
   of Success   0.10     88      110        132         153          175
   Per Retry    0.15     57       71         86         100          114
   Interval     0.25     33       41         49          57           65
  (successRate) 0.50     14       17         20          24           27
                0.90      4        5          6           7            8
                0.95      4        4          5           6            7
                0.99      2        3          3           4            4
                0.999     2        2          2           3            3

   Finally, a suggested value of safetyMargin can then be this
   retryCountWait number multiplied by the retryTime from RFC5011:

      safetyMargin = retryCountWait * retryTime

6.2.  Timing Requirements For Adding a New KSK

   This section defines a method for calculating the amount of time to
   wait until it is safe to start signing exclusively with a new key
   (Section 6.2.1; especially useful for writing code involving sleep-
   based timers), and a method for calculating a wall-clock value
   after which it is safe to start signing exclusively with a new key
   (Section 6.2.2; especially useful for writing code based on clock-
   based event triggers).

6.2.1.  Wait Timer Based Calculation

   Given the attack description in Section 5, the correct minimum length
   of time required for the Zone Signer to wait after publishing K_new
   but before exclusively using it and newer keys is:

   addWaitTime = addHoldDownTime
                 + sigExpirationTimeRemaining
                 + activeRefresh
                 + activeRefreshOffset
                 + safetyMargin

6.2.1.1.  Fully expanded equation

   The fully expanded equation is:

   addWaitTime = addHoldDownTime
                 + sigExpirationTimeRemaining
                 + 2 * MAX(1 hour,
                           MIN(sigExpirationTime / 2,
                               MAX(TTL of K_old DNSKEY RRSet) / 2,
                               15 days)
                           )
                 + (addHoldDownTime % activeRefresh)
                 + MAX(1.5 hours, 2 * MAX(TTL of all records))
                 + safetyMargin

6.2.2.  Wall-Clock Based Calculation

   The above equations are defined based upon how long to wait from a
   particular moment in time. An alternative, but equivalent, method is
   to calculate the date and time before which it is unsafe to use a key
   for signing. This calculation thus becomes:

   addWallClockTime = lastSigExpirationTime
                      + addHoldDownTime
                      + activeRefresh
                      + activeRefreshOffset
                      + safetyMargin

   where lastSigExpirationTime is the latest value of any
   sigExpirationTime for which RRSIGs were created that could
   potentially be replayed.
   Fully expanded, this becomes:

   addWallClockTime = lastSigExpirationTime
                      + addHoldDownTime
                      + 2 * MAX(1 hour,
                                MIN(sigExpirationTime / 2,
                                    MAX(TTL of K_old DNSKEY RRSet) / 2,
                                    15 days)
                                )
                      + (addHoldDownTime % activeRefresh)
                      + MAX(1.5 hours, 2 * MAX(TTL of all records))
                      + safetyMargin

6.2.3.  Timing Constraint Summary

   The important timing constraint introduced by this memo relates to
   the last point at which an RFC5011 Resolver may have received a
   replayed original DNSKEY set, containing K_old and not K_new. The
   next query of the RFC5011 validator at which K_new will be seen
   without the potential for a replay attack will occur after the
   publication time plus sigExpirationTime. Thus, the latest time that
   an RFC5011 Validator may begin its hold down timer is an "Active
   Refresh" period after the last point that an attacker can replay the
   K_old DNSKEY set. The worst case scenario of this attack is if the
   attacker can replay K_old just seconds before the time in the
   Signature Expiration field of the last K_old-only DNSKEY RRSIG is
   reached.

6.2.4.  Additional Considerations for RFC7583

   Note: our notion of addWaitTime is called "Itrp" in Section 3.3.4.1
   of [RFC7583]. The equation for Itrp in RFC7583 is insecure as it
   does not include the sigExpirationTime listed above. The Itrp
   equation in RFC7583 also does not include the 2*TTL safety margin,
   though that is an operational consideration and not necessarily as
   critical.

6.2.5.  Example Scenario Calculations

   For the parameters listed in Section 5.1, the activeRefreshOffset is
   0, since 30 days is evenly divisible by activeRefresh (1/2 day), and
   our resulting addWaitTime is:

   addWaitTime = 30
                 + 10
                 + 1 / 2
                 + 2 * (1)        (days)

   addWaitTime = 42.5             (days)

   This addWaitTime of 42.5 days is 12.5 days longer than just the hold
   down timer.

6.3.  Timing Requirements For Revoking an Old KSK

   This issue affects not just the publication of new DNSKEYs intended
   to be used as trust anchors, but also the length of time required to
   continuously publish a DNSKEY with the revoke bit set.

   This section defines a method for calculating the amount of time
   operators need to wait until it is safe to cease publishing a DNSKEY
   (Section 6.3.1; especially useful for writing code involving sleep-
   based timers), and a method for calculating a minimal wall-clock
   value after which it is safe to cease publishing a DNSKEY
   (Section 6.3.2; especially useful for writing code based on clock-
   based event triggers).

6.3.1.  Wait Timer Based Calculation

   Both of these publication timing requirements are affected by the
   attacks described in this document, but with revocation the key is
   revoked immediately and the addHoldDown timer does not apply. Thus
   the minimum amount of time that a SEP Publisher must wait before
   removing a revoked key from publication is:

   remWaitTime = sigExpirationTimeRemaining
                 + MAX(1 hour,
                       MIN(sigExpirationTime / 2,
                           MAX(TTL of K_old DNSKEY RRSet) / 2,
                           15 days)
                       )
                 + 2 * MAX(TTL of all records)

   Note that the activeRefreshOffset time does not apply to this
   equation.

   Note also that adding retryTime intervals to the remWaitTime may be
   wise, just as it was for addWaitTime in Section 6.

6.3.2.  Wall-Clock Based Calculation

   Like before, the above equations are defined based upon how long to
   wait from a particular moment in time. An alternative, but
   equivalent, method is to calculate the date and time before which it
   is unsafe to cease publishing a revoked key.
   This calculation thus becomes:

   remWallClockTime = lastSigExpirationTime
                      + activeRefresh
                      + activeRefreshOffset
                      + safetyMargin

   where lastSigExpirationTime is the latest value of any
   sigExpirationTime for which RRSIGs were created that could
   potentially be replayed. Fully expanded, this becomes:

   remWallClockTime = lastSigExpirationTime
                      + 2 * MAX(1 hour,
                                MIN(sigExpirationTime / 2,
                                    MAX(TTL of K_old DNSKEY RRSet) / 2,
                                    15 days)
                                )
                      + (addHoldDownTime % activeRefresh)
                      + MAX(1.5 hours, 2 * MAX(TTL of all records))

6.3.3.  Additional Considerations for RFC7583

   Note that our notion of remWaitTime is called "Irev" in
   Section 3.3.4.2 of [RFC7583]. The equation for Irev in RFC7583 is
   insecure as it does not include the sigExpirationTime listed above.
   The Irev equation in RFC7583 also does not include the 2*TTL safety
   margin, though that is an operational consideration and not
   necessarily as critical.

6.3.4.  Example Scenario Calculations

   For the parameters listed in Section 5.1, our example yields:

   remWaitTime = 10
                 + 1 / 2
                 + 2 * (1)        (days)

   remWaitTime = 12.5             (days)

   Note that the values in this example produce a length shorter than
   the recommended 30 days in RFC5011's section 6.6, step 3. Other
   values of sigExpirationTime and the original TTL of the K_old DNSKEY
   RRSet, however, can produce values longer than 30 days.

   Note that because revocation happens immediately, an attacker has a
   much harder job tricking an RFC5011 Resolver into leaving a trust
   anchor in place, as the attacker must successfully replay the old
   data for every query an RFC5011 Resolver sends, not just one.

7.  IANA Considerations

   This document contains no IANA considerations.

8.  Operational Considerations

   A companion document to RFC5011 was expected to be published that
   describes the best operational practice considerations from the
   perspective of a zone publisher and SEP Publisher. However, this
   companion document has yet to be published. The authors of this
   document hope that it will be at some point in the future, as RFC5011
   timing can be tricky as we have shown, and a BCP is clearly
   warranted. This document is intended only to fill a single
   operational void which, when left misunderstood, can result in
   serious security ramifications. This document does not attempt to
   document any other missing operational guidance for zone publishers.

9.  Security Considerations

   This document is solely about the security considerations with
   respect to the SEP Publisher's ability to advertise new DNSKEYs via
   the RFC5011 automated trust anchor update process. Thus the entire
   document is a discussion of Security Considerations when adding or
   removing DNSKEYs from trust anchor storage using the RFC5011 process.

   For simplicity, this document assumes that the SEP Publisher will use
   a consistent RRSIG validity period. SEP Publishers that vary the
   length of RRSIG validity periods will need to adjust the
   sigExpirationTime value accordingly so that the equations in
   Section 6 and Section 6.3 use a value that coincides with the last
   time a replay of older RRSIGs will no longer succeed.

10.  Acknowledgements

   The authors would especially like to thank Michael StJohns for his
   help and advice and the care and thought he put into RFC5011 itself.
   We would also like to thank Bob Harold, Shane Kerr, Matthijs Mekking,
   Duane Wessels, Petr Spacek, Ed Lewis, and the dnsop working group who
   have assisted with this document.

11.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC4033]  Arends, R., Austein, R., Larson, M., Massey, D., and S.
              Rose, "DNS Security Introduction and Requirements",
              RFC 4033, DOI 10.17487/RFC4033, March 2005.

   [RFC5011]  StJohns, M., "Automated Updates of DNS Security (DNSSEC)
              Trust Anchors", STD 74, RFC 5011, DOI 10.17487/RFC5011,
              September 2007.

   [RFC7583]  Morris, S., Ihren, J., Dickinson, J., and W. Mekking,
              "DNSSEC Key Rollover Timing Considerations", RFC 7583,
              DOI 10.17487/RFC7583, October 2015.

   [RFC7719]  Hoffman, P., Sullivan, A., and K. Fujiwara, "DNS
              Terminology", RFC 7719, DOI 10.17487/RFC7719, December
              2015.

Appendix A.  Real World Example: The 2017 Root KSK Key Roll

   In 2017, ICANN expects to (or has, depending on when you're reading
   this) roll the key signing key (KSK) for the root zone. The relevant
   parameters associated with the root zone at the time of this writing
   are as follows:

      addHoldDownTime:               30 days
      Old DNSKEY sigExpirationTime:  21 days
      Old DNSKEY TTL:                 2 days

   Thus, sticking this information into the equation in Section 6
   yields (in days):

   addWaitTime = 30
                 + (21)
                 + MAX(MIN((21) / 2,
                           MAX(2) / 2,
                           15 days),
                       1 hour)
                 + 2 * MAX(2)

   addWaitTime = 30 + 21 + MAX(MIN(10.5, 1, 15), 1 hour) + 4

   addWaitTime = 30 + 21 + 1 + 4

   addWaitTime = 56 days

   Note that we use an activeRefreshOffset of 0, since 30 days is evenly
   divisible by activeRefresh (1 day).

   Thus, ICANN should wait a minimum of 56 days before switching to the
   newly published KSK (and 26 days before removing the old revoked key
   once it is published as revoked). ICANN's current plans are to wait
   70 days before using the new key and 69 days before removing the old,
   revoked key.
   Thus, their current rollover plans are sufficiently secure from the
   attack discussed in this memo.

Authors' Addresses

   Wes Hardaker
   USC/ISI
   P.O. Box 382
   Davis, CA 95617
   US

   Email: ietf@hardakers.net


   Warren Kumari
   Google
   1600 Amphitheatre Parkway
   Mountain View, CA 94043
   US

   Email: warren@kumari.net