idnits 2.17.1 draft-nir-ike-qcd-02.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** It looks like you're using RFC 3978 boilerplate. You should update this to the boilerplate described in the IETF Trust License Policy document (see https://trustee.ietf.org/license-info), which is required now. -- Found old boilerplate from RFC 3978, Section 5.1 on line 17. -- Found old boilerplate from RFC 3978, Section 5.5, updated by RFC 4748 on line 707. -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 718. -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 725. -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 731. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- No issues found here. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust Copyright Line does not match the current year -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (August 6, 2008) is 5741 days in the past. Is this intentional? 
Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Missing Reference: 'IDr' is mentioned on line 222, but not defined == Missing Reference: 'KEi' is mentioned on line 283, but not defined == Missing Reference: 'KEr' is mentioned on line 285, but not defined == Missing Reference: 'CERTREQ' is mentioned on line 467, but not defined == Missing Reference: 'TSi' is mentioned on line 487, but not defined == Missing Reference: 'TSr' is mentioned on line 487, but not defined ** Obsolete normative reference: RFC 4306 (Obsoleted by RFC 5996) ** Obsolete normative reference: RFC 4718 (Obsoleted by RFC 5996) Summary: 3 errors (**), 0 flaws (~~), 7 warnings (==), 7 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Network Working Group Y. Nir 3 Internet-Draft Check Point 4 Intended status: Standards Track F. Detienne 5 Expires: February 7, 2009 P. Sethi 6 Cisco 7 August 6, 2008 9 A Quick Crash Detection Method for IKE 10 draft-nir-ike-qcd-02 12 Status of this Memo 14 By submitting this Internet-Draft, each author represents that any 15 applicable patent or other IPR claims of which he or she is aware 16 have been or will be disclosed, and any of which he or she becomes 17 aware will be disclosed, in accordance with Section 6 of BCP 79. 19 Internet-Drafts are working documents of the Internet Engineering 20 Task Force (IETF), its areas, and its working groups. Note that 21 other groups may also distribute working documents as Internet- 22 Drafts. 24 Internet-Drafts are draft documents valid for a maximum of six months 25 and may be updated, replaced, or obsoleted by other documents at any 26 time. 
It is inappropriate to use Internet-Drafts as reference 27 material or to cite them other than as "work in progress." 29 The list of current Internet-Drafts can be accessed at 30 http://www.ietf.org/ietf/1id-abstracts.txt. 32 The list of Internet-Draft Shadow Directories can be accessed at 33 http://www.ietf.org/shadow.html. 35 This Internet-Draft will expire on February 7, 2009. 37 Abstract 39 This document describes an extension to the IKEv2 protocol that 40 allows for faster detection of SA desynchronization using a saved 41 token. 43 When an IPsec tunnel between two IKEv2 peers is disconnected due to a 44 restart of one peer, it can take as much as several minutes for the 45 other peer to discover that the reboot has occurred, thus delaying 46 recovery. In this text we propose an extension to the protocol, that 47 allows for recovery immediately following the restart. 49 Table of Contents 51 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3 52 1.1. Conventions Used in This Document . . . . . . . . . . . . 3 53 2. RFC 4306 Crash Recovery . . . . . . . . . . . . . . . . . . . 3 54 3. Protocol Outline . . . . . . . . . . . . . . . . . . . . . . . 4 55 4. Formats and Exchanges . . . . . . . . . . . . . . . . . . . . 5 56 4.1. Notification Format . . . . . . . . . . . . . . . . . . . 5 57 4.2. Passing a Token in the AUTH Exchange . . . . . . . . . . . 5 58 4.3. Replacing Tokens After Rekey or Resumption . . . . . . . . 7 59 4.4. Presenting the Token in an INFORMATIONAL Exchange . . . . 7 60 5. Token Generation and Verification . . . . . . . . . . . . . . 8 61 5.1. A Stateless Method of Token Generation . . . . . . . . . . 8 62 5.2. Token Lifetime . . . . . . . . . . . . . . . . . . . . . . 9 63 6. Backup Gateways . . . . . . . . . . . . . . . . . . . . . . . 9 64 7. Alternative Solutions . . . . . . . . . . . . . . . . . . . . 9 65 7.1. Initiating a new IKE SA . . . . . . . . . . . . . . . . . 9 66 7.2. Birth Certificates . . . . . . . . . . . . . 
. . . . . . . 9 67 8. Interaction with Session Resumption . . . . . . . . . . . . . 10 68 9. Operational Considerations . . . . . . . . . . . . . . . . . . 11 69 9.1. Who should implement this specification . . . . . . . . . 11 70 9.2. Response to unknown child SPI . . . . . . . . . . . . . . 12 71 10. Security Considerations . . . . . . . . . . . . . . . . . . . 13 72 10.1. QCD Token Handling . . . . . . . . . . . . . . . . . . . . 13 73 10.2. QCD Token Transmission . . . . . . . . . . . . . . . . . . 13 74 11. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 14 75 12. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 14 76 13. Change Log . . . . . . . . . . . . . . . . . . . . . . . . . . 14 77 13.1. Changes from draft-nir-ike-qcd-01 . . . . . . . . . . . . 14 78 13.2. Changes from draft-nir-ike-qcd-00 . . . . . . . . . . . . 14 79 13.3. Changes from draft-nir-qcr-00 . . . . . . . . . . . . . . 14 80 14. References . . . . . . . . . . . . . . . . . . . . . . . . . . 15 81 14.1. Normative References . . . . . . . . . . . . . . . . . . . 15 82 14.2. Informative References . . . . . . . . . . . . . . . . . . 15 83 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 15 84 Intellectual Property and Copyright Statements . . . . . . . . . . 17 86 1. Introduction 88 IKEv2, as described in [RFC4306] has a method for recovering from a 89 reboot of one peer. As long as traffic flows in both directions, the 90 rebooted peer should re-establish the tunnels immediately. However, 91 in many cases the rebooted peer is a VPN gateway that protects only 92 servers, or else the non-rebooted peer has a dynamic IP address. In 93 such cases, the rebooted peer will not be able to re-establish the 94 tunnels. Section 2 describes how recovery works under RFC 4306, and 95 explains why it takes several minutes. 97 The method proposed here, is to send a token in the IKE_AUTH exchange 98 that establishes the tunnel. 
That token can be stored on the peer as 99 part of the IKE SA. After a reboot, the rebooted implementation can 100 re-generate the token, and send it to the non-rebooted peer so as to 101 delete the IKE SA. Deleting the IKE SA results in a quick 102 re-establishment of the IPsec tunnels. This is described in Section 3. 104 1.1. Conventions Used in This Document 106 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 107 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 108 document are to be interpreted as described in [RFC2119]. 110 The term "token" refers to an octet string that an implementation can 111 generate using only the IKE SPIs as input. A conforming 112 implementation MUST be able to generate the same token from the same 113 input even after rebooting. 115 The term "token maker" refers to an implementation that generates a 116 token and sends it to the peer in the IKE_AUTH exchange. 118 The term "token taker" refers to an implementation that stores such a 119 token or a digest thereof, after receiving it in an IKE_AUTH 120 exchange. 122 2. RFC 4306 Crash Recovery 124 When one peer reboots, the other peer does not get any notification, 125 so IPsec traffic can still flow. The rebooted peer will not be able 126 to decrypt it, however, and the only remedy is to send an unprotected 127 INVALID_SPI notification as described in section 3.10.1 of [RFC4306]. 128 That section also describes the processing of such a notification: 129 "If this Informational Message is sent outside the context of an 130 IKE_SA, it should be used by the recipient only as a "hint" that 131 something might be wrong (because it could easily be forged)." 132 Since the INVALID_SPI can only be used as a hint, the non-rebooted 133 peer has to determine whether the IPsec SA, and indeed the parent IKE 134 SA, are still valid. The method of doing this is described in section 135 2.4 of [RFC4306].
This method, called "liveness check", involves 136 sending a protected empty INFORMATIONAL message, and awaiting a 137 response. This procedure is sometimes referred to as "Dead Peer 138 Detection" or DPD. 140 Section 2.4 does not mandate how many times the liveness check 141 message should be retransmitted, or for how long, but does recommend 142 the following: "It is suggested that messages be retransmitted at 143 least a dozen times over a period of at least several minutes before 144 giving up on an SA". Implementations differ, but any that follows 145 this advice will take several minutes to detect the failure. 147 3. Protocol Outline 149 Supporting implementations will send a notification, called a "QCD 150 token", as described in Section 4.1 in the last packets of the 151 IKE_AUTH exchange. These are the final request and final response 152 that contain the AUTH payloads. The generation of these tokens is a 153 local matter for implementations, but considerations are described in 154 Section 5. Implementations that send such a token will be called 155 "token makers". 157 A supporting implementation receiving such a token SHOULD store it as 158 part of the IKE SA. Implementations that support this part of the 159 protocol will be called "token takers". Section 9.1 has 160 considerations for which implementations need to be token takers, and 161 which should be token makers. Implementations that are not token 162 takers will silently ignore QCD tokens. 164 When a token maker receives a protected IKE request message with 165 unknown IKE SPIs, it MUST generate a new token that is identical to 166 the previous token, and send it to the requesting peer in an 167 unprotected IKE message as described in Section 4.4. 169 When a token taker receives the QCD token in an unprotected 170 notification, it MUST verify that the TOKEN_SECRET_DATA matches the 171 token stored in the matching IKE SA.
If the verification fails, 172 or if the IKE SPIs in the message do not match any existing IKE SA, 173 it SHOULD log the event. If it succeeds, it MUST delete the IKE SA 174 associated with the IKE_SPI fields, and all dependent child SAs. 175 This event MAY also be logged. The token taker MUST accept such 176 tokens from any address, so as to allow different kinds of 177 high-availability configuration of the token maker. 179 A supporting token taker MAY immediately create new SAs using an 180 Initial exchange, or it may wait for subsequent traffic to trigger 181 the creation of new SAs. 183 There is ongoing work on IKEv2 Session Resumption [resumption]. See 184 Section 8 for a short discussion about this protocol's interaction 185 with session resumption. 187 4. Formats and Exchanges 189 4.1. Notification Format 191 The notification payload called "QCD token" is formatted as follows: 193 1 2 3 194 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 195 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 196 ! Next Payload !C! RESERVED ! Payload Length ! 197 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 198 ! Protocol ID ! SPI Size ! QCD Token Notify Message Type ! 199 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 200 ! ! 201 ~ TOKEN_SECRET_DATA ~ 202 ! ! 203 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 205 o Protocol ID (1 octet) MUST contain 1, as this message is related 206 to an IKE SA. 207 o SPI Size (1 octet) MUST be zero, in conformance with [RFC4306]. 208 o QCD Token Notify Message Type (2 octets) - MUST be xxxxx, the 209 value assigned for QCD token notifications. TBA by IANA. 210 o TOKEN_SECRET_DATA (16-128 octets) contains a generated token as 211 described in Section 5. 213 4.2. Passing a Token in the AUTH Exchange 215 For clarity, only the EAP version of an AUTH exchange will be 216 presented here. The non-EAP version is very similar.
The figures 217 below are based on appendix A.3 of [RFC4718]. 219 first request --> IDi, 220 [N(INITIAL_CONTACT)], 221 [[N(HTTP_CERT_LOOKUP_SUPPORTED)], CERTREQ+], 222 [IDr], 223 [CP(CFG_REQUEST)], 224 [N(IPCOMP_SUPPORTED)+], 225 [N(USE_TRANSPORT_MODE)], 226 [N(ESP_TFC_PADDING_NOT_SUPPORTED)], 227 [N(NON_FIRST_FRAGMENTS_ALSO)], 228 SA, TSi, TSr, 229 [V(SIR_VID)] 230 [V+] 232 first response <-- IDr, [CERT+], AUTH, 233 EAP, 234 [V(SIR_VID)] 235 [V+] 237 / --> EAP 238 repeat 1..N times | 239 \ <-- EAP 241 last request --> AUTH 242 [N(QCD_TOKEN)] 244 last response <-- AUTH, 245 [N(QCD_TOKEN)] 246 [CP(CFG_REPLY)], 247 [N(IPCOMP_SUPPORTED)], 248 [N(USE_TRANSPORT_MODE)], 249 [N(ESP_TFC_PADDING_NOT_SUPPORTED)], 250 [N(NON_FIRST_FRAGMENTS_ALSO)], 251 SA, TSi, TSr, 252 [N(ADDITIONAL_TS_POSSIBLE)], 253 [V+] 255 Note that the QCD_TOKEN notification is marked as optional because it 256 is not required by this specification that every implementation be 257 both token maker and token taker. If only one peer sends the QCD 258 token, then a reboot of the other peer will not be recoverable by 259 this method. This may be acceptable if traffic typically originates 260 from the other peer. 262 In any case, the lack of a QCD_TOKEN notification MUST NOT be taken 263 as an indication that the peer does not support this standard. 264 Conversely, if a peer does not understand this notification, it will 265 simply ignore it. Therefore a peer MAY send this notification 266 freely, even if it does not know whether the other side supports it. 268 The QCD_TOKEN notification is related to the IKE SA and MUST follow 269 the AUTH payload and precede the Configuration payload and all 270 payloads related to the child SA. 272 4.3. Replacing Tokens After Rekey or Resumption 274 After rekeying an IKE SA, the IKE SPIs are replaced, so the new SA 275 also needs to have a token. 
If only the responder in the rekey 276 exchange is the token maker, this can be done within the 277 CREATE_CHILD_SA exchange. If the initiator is a token maker, then we 278 need an extra informational exchange. 280 The following figure shows the CREATE_CHILD_SA exchange for rekeying 281 the IKE SA. Only the responder sends a stateless token. 283 request --> SA, Ni, [KEi] 285 response <-- SA, Nr, [KEr], N(QCD_TOKEN) 287 If the initiator is also a token maker, it SHOULD soon initiate an 288 INFORMATIONAL exchange as follows: 290 request --> N(QCD_TOKEN) 292 response <-- 294 For session resumption, as specified in [resumption], the situation 295 is similar. The responder, which is necessarily the peer that has 296 crashed, SHOULD send a new token within the protected payload of the 297 IKE_SESSION_RESUME exchange. If the initiator is also a token maker, 298 it needs to send a QCD_TOKEN in a separate INFORMATIONAL exchange. 300 4.4. Presenting the Token in an INFORMATIONAL Exchange 302 This QCD_TOKEN notification is unprotected, and is sent as a response 303 to a protected IKE request, which uses an IKE SA that is unknown. 305 request --> N(INVALID_IKE_SPI), N(QCD_TOKEN)+ 307 If child SPIs are persistently mapped to IKE SPIs as described in 308 Section 9.2, we may get the following exchange in response to an ESP 309 or AH packet. 311 request --> N(INVALID_SPI), N(QCD_TOKEN)+ 313 The QCD_TOKEN and INVALID_IKE_SPI notifications are sent together to 314 support both implementations that conform to this specification and 315 implementations that don't. Similar to the description in section 316 2.21 of [RFC4306], the IKE SPI and message ID fields in the packet 317 headers are taken from the protected IKE request. 319 To support a periodic rollover of the secret used for token 320 generation, the token taker MUST support at least four QCD_TOKEN 321 notifications in a single packet. The token is considered verified 322 if any of the QCD_TOKEN notifications matches.
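The any-match rule can be sketched as follows. This is a minimal, non-normative illustration in Python. It assumes the example generation method of Section 5.1, with SHA-256 standing in for the unspecified HASH. The names make_token, tokens_for_unprotected_reply, token_verifies and MAX_GENERATIONS are hypothetical and are not defined by this specification.

```python
import hashlib
import hmac

# Assumption: the current secret plus up to three previous generations.
MAX_GENERATIONS = 4


def make_token(qcd_secret: bytes, spi_i: bytes, spi_r: bytes) -> bytes:
    """Return HASH(QCD_SECRET | SPI-I | SPI-R); SHA-256 is an assumed HASH."""
    if len(spi_i) != 8 or len(spi_r) != 8:
        raise ValueError("IKE SPIs are 8 octets each")
    # "|" in the formula is concatenation; the 32-octet digest falls
    # inside the 16-128 octet range allowed for TOKEN_SECRET_DATA.
    return hashlib.sha256(qcd_secret + spi_i + spi_r).digest()


def tokens_for_unprotected_reply(secrets, spi_i, spi_r):
    """Token maker: one candidate token per retained secret generation."""
    return [make_token(s, spi_i, spi_r) for s in secrets[:MAX_GENERATIONS]]


def token_verifies(stored_token: bytes, received_tokens) -> bool:
    """Token taker: verified if any received token equals the stored one."""
    matched = False
    for candidate in received_tokens:
        # compare_digest does a constant-time comparison of the two values.
        if hmac.compare_digest(stored_token, candidate):
            matched = True
    return matched
```

In this sketch the token taker keeps only the token it received in IKE_AUTH. After a secret rollover, the token maker cannot know which generation produced that token, so it sends one candidate per retained secret and the taker accepts a match on any of them.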
The token maker MAY 323 generate up to four QCD_TOKEN notifications, based on several 324 generations of keys. 326 If the QCD_TOKEN verifies successfully, an empty response MUST be sent. If the 327 QCD_TOKEN cannot be validated, a response SHOULD NOT be sent. 328 Section 5 defines token verification. 330 5. Token Generation and Verification 332 No token generation method is mandated by this document. A method is 333 documented in Section 5.1, but only serves as an example. 335 The following lists the requirements for a token generation 336 mechanism: 337 o Tokens MUST be at least 16 octets long, and no more than 128 338 octets long, to facilitate storage and transmission. Tokens 339 SHOULD be indistinguishable from random data. 340 o It should not be possible for an external attacker to guess the 341 QCD token generated by an implementation. Cryptographic 342 mechanisms such as PRNG and hash functions are RECOMMENDED. 343 o The token maker MUST be able to re-generate or retrieve the token 344 based on the IKE SPIs even after it reboots. 346 5.1. A Stateless Method of Token Generation 348 This section describes a stateless method of generating a token: 349 o At installation or immediately after the first boot of the IKE 350 implementation, 32 random octets are generated using a secure 351 random number generator or a PRNG. 352 o Those 32 octets, called the "QCD_SECRET", are stored in 353 non-volatile storage on the machine, and kept indefinitely. 354 o The TOKEN_SECRET_DATA is calculated as follows: 356 TOKEN_SECRET_DATA = HASH(QCD_SECRET | SPI-I | SPI-R) 358 o If key rollover is required by policy, the implementation MAY 359 periodically generate a new QCD_SECRET and keep up to 3 previous 360 generations. When sending an unprotected QCD_TOKEN, as many as 4 361 notification payloads may be sent, each from a different 362 QCD_SECRET. 364 5.2.
Token Lifetime 366 The token is associated with a single IKE SA, and SHOULD be deleted 367 by the token taker when the SA is deleted or expires. More formally, 368 the token is associated with the pair (SPI-I, SPI-R). 370 6. Backup Gateways 372 Making crash detection and recovery quick is a worthy goal, but since 373 rebooting a gateway takes a non-zero amount of time, many 374 implementations choose to have a stand-by gateway ready to take over 375 as soon as the primary gateway fails for any reason. 377 If such a configuration is available, it is RECOMMENDED that the 378 stand-by gateway be able to generate the same token as the active 379 gateway. If the method described in Section 5.1 is used, this means 380 that the QCD_SECRET is identical in both gateways. This has 381 the effect of having the crash recovery available immediately. 383 7. Alternative Solutions 385 7.1. Initiating a new IKE SA 387 Instead of sending a QCD token, we could have the rebooted 388 implementation start an Initial exchange with the peer, including the 389 INITIAL_CONTACT notification. This would have the same effect, 390 instructing the peer to erase the old IKE SA, as well as establishing 391 a new IKE SA with fewer rounds. 393 The disadvantage here is that in IKEv2 an authentication exchange 394 MUST have a piggy-backed Child SA set up. Since our use case is such 395 that the rebooted implementation does not have traffic flowing to the 396 peer, there are no good selectors for such a Child SA. 398 Additionally, when authentication is asymmetric, such as when EAP is 399 used, it is not possible for the rebooted implementation to initiate 400 IKE. 402 7.2. Birth Certificates 404 Birth Certificates is a method of crash detection that has never been 405 formally defined.
Bill Sommerfeld suggested this idea in a mail to 406 the IPsec mailing list on August 7, 2000, in a thread discussing 407 methods of crash detection: 409 If we have the system sign a "birth certificate" when it 410 reboots (including a reboot time or boot sequence number), 411 we could include that with a "bad spi" ICMP error and in 412 the negotiation of the IKE SA. 414 We believe that this method would have some problems. First, it 415 requires Alice to store the certificate, so as to be able to compare 416 the public keys. That requires more storage than does a QCD token. 417 Additionally, the public-key operations needed to verify the self- 418 signed certificates are more expensive for Alice. 420 We believe that a symmetric-key operation such as proposed here is 421 more light-weight and simple than that implied by the Birth 422 Certificate idea. 424 8. Interaction with Session Resumption 426 Session Resumption, specified in [resumption] proposes to make 427 setting up a new IKE SA consume less computing resources. This is 428 particularly useful in the case of a remote access gateway that has 429 many tunnels. A failure of such a gateway would require all these 430 many remote access clients to establish an IKE SA either with the 431 rebooted gateway or with a backup gateway. This tunnel re- 432 establishment should occur within a short period of time, creating a 433 burden on the remote access gateway. Session Resumption addresses 434 this problem by having the clients store an encrypted derivative of 435 the IKE SA for quick re-establishment. 437 What Session Resumption does not help, is the problem of detecting 438 that the peer gateway has failed. A failed gateway may go undetected 439 for as long as the lifetime of a child SA, because IPsec does not 440 have packet acknowledgement, and applications cannot signal the IPsec 441 layer that the tunnel "does not work". 
Before establishing a new IKE 442 SA using Session Resumption, a client MUST ascertain that the gateway 443 has indeed failed. This could be done using either a liveness check 444 (as in RFC 4306) or using the QCD tokens described in this document. 446 A remote access client conforming to both specifications will store 447 QCD tokens, as well as the Session Resumption ticket, if provided by 448 the gateway. A remote access gateway conforming to both 449 specifications will generate a QCD token for the client. When the 450 gateway reboots, the client will discover this in either of two ways: 451 1. The client does regular liveness checks, or else the time for 452 some other IKE exchange has come. Since the gateway is still 453 down, the IKE times out after several minutes. In this case QCD 454 does not help. 455 2. Either the primary gateway or a backup gateway (see Section 6) is 456 ready and sends a QCD token to the client. In that case the 457 client will quickly re-establish the IPsec tunnel, either with 458 the rebooted primary gateway, the backup gateway as described in 459 this document or another gateway as described in [resumption] 461 The full combined protocol looks like this: 463 Initiator Responder 464 ----------- ----------- 465 HDR, SAi1, KEi, Ni --> 467 <-- HDR, SAr1, KEr, Nr, [CERTREQ] 469 HDR, SK {IDi, [CERT,] 470 [CERTREQ,] [IDr,] 471 AUTH, N(QCD_TOKEN) 472 SAi2, TSi, TSr, 473 N(TICKET_REQUEST)} --> 474 <-- HDR, SK {IDr, [CERT,] AUTH, SAr2, TSi, 475 TSr, N(TICKET_OPAQUE) 476 [,N(TICKET_GATEWAY_LIST)]} 478 ---- Reboot ----- 480 HDR, {} --> 481 <-- HDR, N(QCD_Token) 483 HDR, Ni, N(TICKET_OPAQUE), 484 [N+,], SK {IDi, [IDr,] 485 SAi2, TSi, TSr, 486 [CP(CFG_REQUEST)]} --> 487 <-- HDR, SK {IDr, Nr, SAr2, [TSi, TSr], 488 [CP(CFG_REPLY)]} 490 9. Operational Considerations 492 9.1. 
Who should implement this specification 494 Throughout this document, we have referred to reboot time 495 alternately as the time that the implementation crashes and the 496 time when it is ready to process IPsec packets and IKE exchanges. 497 Depending on the hardware and software platforms and the cause of the 498 reboot, rebooting may take anywhere from a few seconds to several 499 minutes. If the implementation is down for a long time, the benefit 500 of this protocol extension is reduced. For this reason critical 501 systems should implement backup gateways as described in Section 6. 502 Note that the lower-case "should" in the previous sentence is 503 intentional, as we do not specify this in the sense of RFC 2119. 505 Implementing the "token maker" side of QCD makes sense for IKE 506 implementations where protected connections originate from the peer, 507 such as inter-domain VPNs and remote access gateways. Implementing 508 the "token taker" side of QCD makes sense for IKE implementations 509 where protected connections originate, such as inter-domain VPNs and 510 remote access clients. 512 To clarify the requirements: 513 o A remote-access client MUST be a token taker and MAY be a token 514 maker. 515 o A remote-access gateway MAY be a token taker and MUST be a token 516 maker. 517 o An inter-domain VPN gateway MUST be both token maker and token 518 taker. 520 In order to limit the effects of DoS attacks, a token taker SHOULD 521 limit the rate of QCD_TOKENs verified from a particular source. 523 If an excessive number of IKE requests protected with unknown IKE SPIs 524 arrive at a token maker, the IKE module SHOULD revert to the behavior 525 described in section 2.21 of [RFC4306] and either send an 526 INVALID_IKE_SPI notification, or ignore them entirely. 528 9.2. Response to unknown child SPI 530 After a reboot, it is more likely that an implementation receives 531 IPsec packets than IKE packets.
In that case, the rebooted 532 implementation will send an INVALID_SPI notification, triggering a 533 liveness check. The token will only be sent in a response to the 534 liveness check, thus requiring an extra round-trip. 536 To avoid this, an implementation that has access to non-volatile 537 storage MAY store a mapping of child SPIs to owning IKE SPIs, or to 538 generated tokens. If such a mapping is available and persistent 539 across reboots, the rebooted implementation SHOULD respond to the 540 IPsec packet with an INVALID_SPI notification, along with the 541 appropriate QCD_TOKEN notifications. A token taker SHOULD verify the 542 QCD token that arrives with an INVALID_SPI notification the same as 543 if it arrived with the IKE SPIs of the parent IKE SA. 545 However, a persistent storage module might not be updated in a timely 546 manner, and could be populated with IKE SPIs that have already been 547 rekeyed. A token taker MUST NOT take an invalid QCD token sent along 548 with an INVALID_SPI notification as evidence that the peer is either 549 malfunctioning or attacking, but it SHOULD limit the rate at which 550 such notifications are processed. 552 10. Security Considerations 554 10.1. QCD Token Handling 556 Tokens MUST be hard to guess. This is critical, because if an 557 attacker can guess the token associated with an IKE SA, she can tear 558 down the IKE SA and associated tunnels at will. When the token is 559 delivered in the IKE_AUTH exchange, it is encrypted. When it is sent 560 again in an unprotected notification, it is not, but that is the last 561 time this token is ever used. 563 An aggregation of some tokens generated by one peer together with the 564 related IKE SPIs MUST NOT give an attacker the ability to guess other 565 tokens. Specifically, if one peer does not properly secure the QCD 566 tokens and an attacker gains access to them, this attacker MUST NOT 567 be able to guess other tokens generated by the same peer.
This is 568 the reason that the QCD_SECRET in Section 5.1 needs to be 569 sufficiently long. 571 The QCD_SECRET MUST be protected from access by other parties. 572 Anyone gaining access to this value will be able to delete all the 573 IKE SAs for this token maker. 575 The QCD token is sent by the rebooted peer in an unprotected message. 576 A message like that is subject to modification, deletion and replay 577 by an attacker. However, these attacks will not compromise the 578 security of either side. Modification is meaningless because a 579 modified token is simply an invalid token. Deletion will only cause 580 the protocol not to work, resulting in a delay in tunnel re- 581 establishment as described in Section 2. Replay is also meaningless, 582 because the IKE SA has been deleted after the first transmission. 584 10.2. QCD Token Transmission 586 A token maker MUST NOT send a QCD token in an unprotected message for 587 an existing IKE SA. This implies that a conforming QCD token maker 588 MUST be able to tell whether a particular pair of IKE SPIs represent 589 a valid IKE SA. 591 This requirement is obvious and easy in the case of a single gateway. 592 However, some implementations use a load balancer to divide the load 593 between several physical gateways. It MUST NOT be possible even in 594 such a configuration to trick one gateway into sending a QCD token 595 for an IKE SA which is valid on another gateway. 597 11. IANA Considerations 599 IANA is requested to assign a notify message type from the error 600 types range (43-8191) of the "IKEv2 Notify Message Types" registry 601 with name "QUICK_CRASH_DETECTION". 603 12. Acknowledgements 605 We would like to thank Hannes Tschofenig and Yaron Sheffer for their 606 comments about Session Resumption. 608 13. Change Log 610 This section lists all changes in this document 612 NOTE TO RFC EDITOR : Please remove this section in the final RFC 614 13.1. Changes from draft-nir-ike-qcd-01 616 o Removed stateless method. 
617 o Added discussion of rekeying and resumption. 618 o Added discussion of non-synchronized load-balanced clusters of 619 gateways in the security considerations. 620 o Other wording fixes. 622 13.2. Changes from draft-nir-ike-qcd-00 624 o Merged proposal with draft-detienne-ikev2-recovery [recovery] 625 o Changed the protocol so that the rebooted peer generates the 626 token. This has the effect, that the need for persistent storage 627 is eliminated. 628 o Added discussion of birth certificates. 630 13.3. Changes from draft-nir-qcr-00 632 o Changed name to reflect that this relates to IKE. Also changed 633 from quick crash recovery to quick crash detection to avoid 634 confusion with IFARE. 635 o Added more operational considerations. 636 o Added interaction with IFARE. 638 o Added discussion of backup gateways. 640 14. References 642 14.1. Normative References 644 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 645 Requirement Levels", BCP 14, RFC 2119, March 1997. 647 [RFC4306] Kaufman, C., "Internet Key Exchange (IKEv2) Protocol", 648 RFC 4306, December 2005. 650 [RFC4718] Eronen, P. and P. Hoffman, "IKEv2 Clarifications and 651 Implementation Guidelines", RFC 4718, October 2006. 653 14.2. Informative References 655 [recovery] 656 Detienne, F., Sethi, P., and Y. Nir, "Safe IKE Recovery", 657 draft-detienne-ikev2-recovery (work in progress), 658 July 2008. 660 [resumption] 661 Sheffer, Y., Tschofenig, H., Dondeti, L., and V. 662 Narayanan, "IPsec Gateway Failover Protocol", 663 draft-sheffer-ipsec-failover (work in progress), 664 July 2008. 666 Authors' Addresses 668 Yoav Nir 669 Check Point Software Technologies Ltd. 670 5 Hasolelim st. 671 Tel Aviv 67897 672 Israel 674 Email: ynir@checkpoint.com 675 Frederic Detienne 676 Cisco Systems, Inc. 677 De Kleetlaan, 7 678 Diegem B-1831 679 Belgium 681 Phone: +32 2 704 5681 682 Email: fd@cisco.com 684 Pratima Sethi 685 Cisco Systems, Inc. 
686 O'Shaugnessy Road, 11 687 Bangalore, Karnataka 560027 688 India 690 Phone: +91 80 4154 1654 691 Email: psethi@cisco.com 693 Full Copyright Statement 695 Copyright (C) The IETF Trust (2008). 697 This document is subject to the rights, licenses and restrictions 698 contained in BCP 78, and except as set forth therein, the authors 699 retain all their rights. 701 This document and the information contained herein are provided on an 702 "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS 703 OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND 704 THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS 705 OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF 706 THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED 707 WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. 709 Intellectual Property 711 The IETF takes no position regarding the validity or scope of any 712 Intellectual Property Rights or other rights that might be claimed to 713 pertain to the implementation or use of the technology described in 714 this document or the extent to which any license under such rights 715 might or might not be available; nor does it represent that it has 716 made any independent effort to identify any such rights. Information 717 on the procedures with respect to rights in RFC documents can be 718 found in BCP 78 and BCP 79. 720 Copies of IPR disclosures made to the IETF Secretariat and any 721 assurances of licenses to be made available, or the result of an 722 attempt made to obtain a general license or permission for the use of 723 such proprietary rights by implementers or users of this 724 specification can be obtained from the IETF on-line IPR repository at 725 http://www.ietf.org/ipr. 
727 The IETF invites any interested party to bring to its attention any 728 copyrights, patents or patent applications, or other proprietary 729 rights that may cover technology that may be required to implement 730 this standard. Please address the information to the IETF at 731 ietf-ipr@ietf.org.