idnits 2.17.1

draft-ietf-ipsecme-ipsecha-protocol-05.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (March 29, 2011) is 4774 days in the past. Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  -- Looks like a reference, but probably isn't: 'CERT' on line 403

  -- Looks like a reference, but probably isn't: 'CERTREQ' on line 403

  -- Looks like a reference, but probably isn't: 'IDr' on line 403

  ** Obsolete normative reference: RFC 5996 (ref. '2') (Obsoleted by RFC 7296)

  == Outdated reference: A later version (-08) exists of
     draft-ietf-ipsecme-failure-detection-07

     Summary: 1 error (**), 0 flaws (~~), 2 warnings (==), 4 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

2   Network Working Group                                      R. Singh, Ed.
3   Internet-Draft                                                G. Kalyani
4   Intended status: Standards Track                                   Cisco
5   Expires: September 30, 2011                                       Y. Nir
6                                                                Check Point
7                                                                 Y. Sheffer
8                                                                Independent
9                                                                         D.
Zhang
10                                                                    Huawei
11                                                            March 29, 2011

13           Protocol Support for High Availability of IKEv2/IPsec
14                  draft-ietf-ipsecme-ipsecha-protocol-05

16  Abstract

18     The IPsec protocol suite is widely used for business-critical network
19     traffic. In order to make IPsec deployments highly available, more
20     scalable and failure-resistant, they are often implemented as IPsec
21     High Availability (HA) clusters. However, there are many issues in
22     IPsec HA clustering, and in particular in IKEv2 clustering. An
23     earlier document, "IPsec Cluster Problem Statement", enumerates the
24     issues encountered in the IKEv2/IPsec HA cluster environment. This
25     document resolves these issues with the least possible change to the
26     protocol.

28     This document defines an extension to the IKEv2 protocol to solve the
29     main issues of "IPsec Cluster Problem Statement" in the commonly
30     deployed hot-standby cluster, and provides implementation advice for
31     other issues. The main issues solved are the synchronization of
32     IKEv2 Message ID counters, and of IPsec Replay Counters.

34  Status of this Memo

36     This Internet-Draft is submitted in full conformance with the
37     provisions of BCP 78 and BCP 79.

39     Internet-Drafts are working documents of the Internet Engineering
40     Task Force (IETF). Note that other groups may also distribute
41     working documents as Internet-Drafts. The list of current Internet-
42     Drafts is at http://datatracker.ietf.org/drafts/current/.

44     Internet-Drafts are draft documents valid for a maximum of six months
45     and may be updated, replaced, or obsoleted by other documents at any
46     time. It is inappropriate to use Internet-Drafts as reference
47     material or to cite them other than as "work in progress."

48     This Internet-Draft will expire on September 30, 2011.

50  Copyright Notice

52     Copyright (c) 2011 IETF Trust and the persons identified as the
53     document authors. All rights reserved.
55     This document is subject to BCP 78 and the IETF Trust's Legal
56     Provisions Relating to IETF Documents
57     (http://trustee.ietf.org/license-info) in effect on the date of
58     publication of this document. Please review these documents
59     carefully, as they describe your rights and restrictions with respect
60     to this document. Code Components extracted from this document must
61     include Simplified BSD License text as described in Section 4.e of
62     the Trust Legal Provisions and are provided without warranty as
63     described in the Simplified BSD License.

65  Table of Contents

67     1. Introduction
68     2. Terminology
69     3. Issues Resolved from IPsec Cluster Problem Statement
70       3.1. Large Amount of State
71       3.2. Multiple Members Using the Same SA
72       3.3. Avoiding Collisions in SPI Number Allocation
73       3.4. Interaction with Counter Modes
74     4. The IKEv2/IPsec SA Counter Synchronization Problem
75     5. SA Counter Synchronization Solution
76       5.1. Processing Rules for IKE Message ID Synchronization
77       5.2. Processing Rules for IPsec Replay Counter
78            Synchronization
79     6. IKEv2/IPsec Synchronization Notification Payloads
80       6.1. The IKEV2_MESSAGE_ID_SYNC_SUPPORTED Notification
81       6.2. The IPSEC_REPLAY_COUNTER_SYNC_SUPPORTED Notification
82       6.3. The IKEV2_MESSAGE_ID_SYNC Notification
83       6.4. The IPSEC_REPLAY_COUNTER_SYNC Notification
84     7. Implementation Details
85     8. IKE SA and IPsec SA Message Sequencing
86       8.1. Handling of Pending IKE Messages
87       8.2.
Handling of Pending IPsec Messages
88       8.3. IKE SA Inconsistencies
89     9. Step by Step Details
90     10. Interaction with other drafts
91     11. Security Considerations
92     12. IANA Considerations
93     13. Acknowledgements
94     14. Change Log
95       14.1. Draft -04
96       14.2. Draft -03
97       14.3. Draft -02
98       14.4. Draft -01
99       14.5. Draft -00
100    15. References
101      15.1. Normative References
102      15.2. Informative References
103    Appendix A. IKEv2 Message ID Sync Examples
104      A.1. Normal Failover - Example 1
105      A.2. Normal Failover - Example 2
106      A.3. Simultaneous Failover
107    Authors' Addresses

109 1. Introduction

111    The IPsec protocol suite, including IKEv2, is a major building block
112    of virtual private networks (VPNs). In order to make such VPNs
113    highly available, more scalable and failure-resistant, these VPNs are
114    implemented as IKEv2/IPsec Highly Available (HA) clusters. However,
115    there are many issues with the IKEv2/IPsec HA cluster. Section 4
116    below enumerates the issues around the IKEv2/IPsec HA cluster
117    solution.
119    In the case of a hot-standby cluster implementation of IKEv2/IPsec
120    based VPNs, the IKEv2/IPsec session is first established between the
121    peer and the active member of the cluster. Later, the active member
122    continuously syncs/updates the IKE/IPsec SA state to the standby
123    member of the cluster. This primary SA state sync-up takes place
124    upon each SA bring-up and/or rekey. Performing the SA state
125    synchronization/update for every single IKE and IPsec message is very
126    costly, so normally it is done periodically. As a result, when the
127    failover event happens, this is first detected by the standby member
128    and, possibly after a considerable amount of time, it becomes the
129    active member. During this failover process the peer is unaware of
130    the failover event, and keeps sending IKE requests and IPsec packets
131    to the cluster, as in fact it is allowed to do because of the IKEv2
132    windowing feature. After the newly-active member starts, it detects
133    the mismatch in IKE Message ID values and IPsec replay counters and
134    needs to resolve this situation. Please see Section 4 for more
135    details of the problem.

137    This document proposes an extension to the IKEv2 protocol to solve
138    the main issues of IKE Message ID synchronization and IPsec SA replay
139    counter synchronization and gives implementation advice for others.
140    Following is a summary of the solutions provided in this document:

142    o IKEv2 Message ID synchronization: this is done by syncing up the
143      expected send and receive Message ID values with the peer, and
144      updating the values at the newly active cluster member.
145    o IPsec Replay Counter synchronization: this is done by incrementing
146      the cluster's outgoing SA replay counter values by a "large"
147      number; in addition, the newly-active member requests the peer to
148      increment the replay counter values it is using for the peer's
149      outgoing traffic.
151    Although this document describes the IKEv2 Message ID and IPsec
152    replay counter synchronization in the context of an IPsec HA cluster,
153    the solution provided is generic and can be used in other scenarios
154    where IKEv2 Message ID or IPsec SA replay counter synchronization may
155    be required.

157    Implementations differ on the need to synchronize the IKEv2 Message
158    ID and/or IPsec replay counters. Both of these problems are handled
159    separately, using a separate notification for each capability. This
160    provides the flexibility of implementing either or both of these
161    solutions.

163 2. Terminology

165    The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
166    "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
167    document are to be interpreted as described in [1].

169    "SA Counter Synchronization Request/Response" are the request and
170    response of the Informational exchange defined in this document to
171    synchronize the IKEv2/IPsec SA counter information between one member
172    of the cluster and the peer.

174    Some of the terms listed below are reused from [4] with further
175    clarification in the context of the current document.

177    o "Hot Standby Cluster", or "HS Cluster" is a cluster where only one
178      of the members is active at any one time. This member is also
179      referred to as the "active" member, whereas the other(s) are
180      referred to as "standby" members. VRRP [5] is one method of
181      building such a cluster. The goal of the Hot Standby Cluster is
182      to create the illusion of a single virtual gateway to the peer(s).
183    o "Active Member" is the primary member in the Hot-Standby cluster.
184      It is responsible for forwarding packets on behalf of the virtual
185      gateway.
186    o "Standby Member" is the primary backup member. This member takes
187      control, i.e. becomes the active member, after the failover event.
188    o "Peer" is an IKEv2/IPsec endpoint that maintains an IPsec
189      connection with the Hot-Standby cluster. The Peer identifies the
190      cluster by the cluster's (single) IP address. If a failover event
191      occurs, the standby member of the cluster becomes active, and the
192      peer normally doesn't notice that failover has taken place.
193      Although we treat the peer as a single entity, it may also be a
194      cluster.
195    o "Multiple failover" is the situation where, in a cluster with
196      three or more members, multiple failover events happen in rapid
197      succession, e.g. from M1 to M2, and then to M3. It is our goal
198      that the implementation should be able to handle this situation,
199      i.e. to handle the new failover event even if it is still
200      processing the old failover.
201    o "Simultaneous failover" is the situation where two clusters have
202      an IPsec connection between them, and failover happens at both
203      ends at the same time. It is our goal that implementations should
204      be able to handle simultaneous failover.

206    The generic term "IKEv2/IPsec SA Counters" is used throughout this
207    document. This term refers to both IKEv2 Message ID counters and
208    IPsec replay counters. According to the IPsec standards, the IKEv2
209    Message ID counter is mandatory, and used to ensure reliable delivery
210    as well as to protect against message replay in IKEv2; the IPsec SA
211    replay counters are optional, and are used to provide the IPsec anti-
212    replay feature.

214 3. Issues Resolved from IPsec Cluster Problem Statement

216    The IPsec Cluster Problem Statement [4] enumerates the problems
217    raised by IPsec clusters. The following list names the problem
218    statement's sections that are resolved by this document.

219    o 3.2. Lots of Long Lived State
220    o 3.3. IKE Counters
221    o 3.4. Outbound SA Counters
222    o 3.5. Inbound SA Counters
223    o 3.6. Missing Synchronization Messages
224    o 3.7. Simultaneous use of IKE and IPsec SAs by Different Members
225      * 3.7.1.
Outbound SAs using counter modes
226    o 3.8. Different IP addresses for IKE and IPsec
227    o 3.9. Allocation of SPIs

229    The main problem areas are solved using the protocol extension
230    defined below, starting with Section 5; additionally, this section
231    provides implementation advice for other issues in the following
232    subsections.

234 3.1. Large Amount of State

236    Section 3.2 of the Problem Statement mentions that a lot of state
237    needs to be synchronized for a cluster to be transparent. The actual
238    volume of that data is very much implementation-dependent, and even
239    for the same implementation, the amounts of data may vary wildly. An
240    IPsec gateway used for inter-domain VPN with a dozen other gateways,
241    and having SAs that are rekeyed every 8 hours, will need a lot less
242    synchronization traffic than a similar gateway used for remote
243    access, and supporting 10,000 clients. This is because counter
244    synchronization is proportional to the number of SAs and requires
245    little data, and the setting up of an SA requires a lot of data.
246    Additionally, remote access IKE and IPsec SA setup tend to happen at
247    a particular time of day, so the example gateway with the 10,000
248    clients may see 30-50 IKE SA setups per second at 9:00 AM. This
249    would require very heavy synchronization traffic over that short
250    period of time.

252    If a large volume of traffic is necessary, it may be advisable to use
253    a dedicated high-speed network interface for synch traffic. When
254    packet loss can be made extremely low, it may be advisable to use a
255    stateless transport such as UDP, to minimize network overhead.

257    If these methods are insufficient, it may be prudent that for some
258    SAs the entire state is not synchronized. Instead, only an
259    indication of the SA's existence is synchronized.
This, in
260    combination with a sticky solution (as described in section 3.7 of
261    the problem statement) ensures that the traffic from a particular
262    peer does not reach a different member before an actual failover
263    happens. When that happens, the method described in [6] can be used
264    to quickly force the peer to set up a new SA.

266 3.2. Multiple Members Using the Same SA

268    In a load-sharing cluster of the "duplicate" variety (see section 3.7
269    of the problem statement) multiple members may need to send traffic
270    with the same selectors. To actually use the same SA the cluster
271    would have to synchronize the Replay Counter after every packet, and
272    that would impose unreasonable requirements on the synch connection.

274    A far better solution would be to not synchronize the outbound SA,
275    and create multiple outbound SAs, one for each member. The problem
276    with this option is that the peer might view these multiple parallel
277    SAs as redundant, and tear down all but one of them.

279    Section 2.8 of [2] specifically allows multiple parallel SAs, but the
280    reason given for this is to have multiple SAs with different QoS
281    attributes. So while this is not a new requirement of IKEv2
282    implementations, we re-iterate here that IPsec peers MUST accept the
283    long-term existence of multiple parallel SAs, even when QoS
284    mechanisms are not in use.

286 3.3. Avoiding Collisions in SPI Number Allocation

288    Section 3.9 of the problem statement describes the problem of two
289    cluster members allocating the same SPI number for two different SAs.
290    This would violate section 4.4.2.1 of [3]. There are several schemes
291    to allow implementations to avoid such collisions, such as
292    partitioning the SPI space, a request-response over the synch
293    channel, and locking mechanisms.
We believe that these are
294    sufficiently robust and available so that we don't need to make an
295    exception to RFC 4301, and we can leave this problem for the
296    implementations to solve. Cluster members MUST NOT generate multiple
297    inbound SAs with the same SPI.

299 3.4. Interaction with Counter Modes

301    For SAs involving counter mode ciphers such as CTR [7] or GCM [8]
302    there is yet another complication. The initial vector for such modes
303    MUST NOT be repeated, and senders use methods such as counters or
304    LFSRs to ensure this property. For an SA shared between multiple
305    active members (load sharing cases), implementations MUST ensure that
306    no initial vector is ever repeated. Similar concerns apply to an SA
307    failing over from one member to another. See [9] for a discussion of
308    this problem in another context.

310    Just as in the SPI collision problem, there are ways to avoid a
311    collision of initial vectors, and this is left up to implementations.
312    In the context of load sharing, parallel SAs are a simple solution to
313    this problem as well.

315 4. The IKEv2/IPsec SA Counter Synchronization Problem

317    The IKEv2 protocol [2] states that "An IKE endpoint MUST NOT exceed
318    the peer's stated window size for transmitted IKE requests".

320    All IKEv2 messages are required to follow a request-response
321    paradigm. The initiator of an IKEv2 request MUST retransmit the
322    request, until it has received a response from the peer. IKEv2
323    introduces a windowing mechanism that allows multiple requests to be
324    outstanding at a given point of time, but mandates that the sender's
325    window should not move until the oldest message it has sent is
326    acknowledged. Loss of even a single message leads to repeated
327    retransmissions followed by an IKEv2 SA teardown if the
328    retransmissions remain unacknowledged.
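
The effect of the windowing rule on a member holding out-of-date counters can be pictured with a small Python sketch. This is a deliberately simplified model, not the normative IKEv2 window logic: the function name, the single `next_expected` counter, and the example numbers are all assumptions made for illustration.

```python
def accepts_request(message_id: int, next_expected: int, window_size: int = 1) -> bool:
    """Simplified receive-window check: accept a request only if its
    Message ID lies in [next_expected, next_expected + window_size)."""
    return next_expected <= message_id < next_expected + window_size


# Hypothetical scenario: the failed member had already processed requests
# 5, 6 and 7, so the peer sends request 8 next. The standby member was last
# updated when the expected value was still 5.
stale_next_expected = 5
peer_next_request = 8
in_window = accepts_request(peer_next_request, stale_next_expected, window_size=1)
```

With a window size of 1, the stale member would treat request 8 as out of window, and the peer would keep retransmitting it until it gave up on the SA.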
330    An IPsec Hot Standby Cluster is required to ensure that in the case
331    of failover, the standby member becomes active immediately. The
332    standby member is expected to have the exact value of the Message ID
333    counter as the active member had before failover. Even assuming the
334    best effort to update the Message ID values from active to standby
335    member, the values at the standby member can still be stale due to
336    the following reasons:
337    o The standby member is unaware of the last message that was
338      received and acknowledged by the previously active member, as the
339      failover event could have happened before the standby member could
340      be updated.
341    o The standby member does not have information about on-going
342      unacknowledged requests sent by the previously active member. As
343      a result after the failover event, the newly active member cannot
344      retransmit those requests.

346    When a standby member takes over as the active member, it can only
347    initialize the Message ID values from the previously updated values.
348    This would make it reject requests from the peer when these values
349    are stale. Conversely, the standby member may end up reusing a stale
350    Message ID value which would cause the peer to drop the request.
351    Eventually there is a high probability of the IKEv2 and corresponding
352    IPsec SAs getting torn down simply because of a transitory Message ID
353    mismatch and retransmission of requests, negating the benefits of the
354    high availability cluster despite the periodic update between the
355    cluster members.

357    A similar issue is also observed with IPsec anti-replay counters if
358    anti-replay protection is enabled, which is commonly the case.
359    Regardless of how well the ESP and AH SA counters are synchronized
360    from the active to the standby member, there is a chance that the
361    standby member would end up with stale counter values.
The standby
362    member would then use those stale counter values when sending IPsec
363    packets. The peer would reject/drop such packets since when the
364    anti-replay protection feature is enabled, duplicate use of counters
365    is not allowed. Note that IPsec allows the sender to skip some
366    counter values and continue sending with higher counter values.

368    We conclude that a mechanism is required to ensure that the standby
369    member has correct Message ID and IPsec counter values when it
370    becomes active, so that sessions are not torn down as a result of
371    mismatched counters.

373 5. SA Counter Synchronization Solution

375    This document proposes two separate approaches to resolving the
376    issues of mismatched IKE Message ID values and IPsec counter values.

378    o In the case of IKE Message ID values, the newly active cluster
379      member and the peer negotiate a pair of new values so that future
380      IKE messages will not be dropped.
381    o For IPsec counter values, the newly-active member and the peer
382      both increment their respective counter values, "skipping forward"
383      by a large number, to ensure that no IPsec counters are ever
384      reused.

386    Although conceptually separate, the two synchronization processes
387    would typically take place simultaneously.

389    First, the peer and the active member of the cluster negotiate their
390    ability to support IKEv2 Message ID synchronization and/or IPsec
391    Replay Counter synchronization. This is done by exchanging one or
392    both of the IKEV2_MESSAGE_ID_SYNC_SUPPORTED and
393    IPSEC_REPLAY_COUNTER_SYNC_SUPPORTED notifications during the IKE_AUTH
394    exchange. When negotiating these capabilities, the responder MUST
395    NOT assert support of a capability unless such support was asserted
396    by the initiator. Only a capability whose support was asserted by
397    both parties can be used during the lifetime of the SA.

399    This per-IKE SA information is shared with the other cluster members.
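
The negotiation rule described above amounts to a set intersection. The Python sketch below is only an illustration: the two notification names come from this document, while the function names and the use of plain strings to stand for notifications are assumptions.

```python
MESSAGE_ID_SYNC = "IKEV2_MESSAGE_ID_SYNC_SUPPORTED"
REPLAY_COUNTER_SYNC = "IPSEC_REPLAY_COUNTER_SYNC_SUPPORTED"


def responder_assertions(asserted_by_initiator, supported_by_responder):
    """Capabilities the responder may assert in its IKE_AUTH response: only
    those it supports AND that the initiator already asserted."""
    return set(asserted_by_initiator) & set(supported_by_responder)


def usable_capabilities(asserted_by_initiator, asserted_by_responder):
    """Capabilities usable for the lifetime of the SA: asserted by both."""
    return set(asserted_by_initiator) & set(asserted_by_responder)


# Hypothetical case: the initiator only offers Message ID synchronization,
# while the responder implements both mechanisms.
initiator = {MESSAGE_ID_SYNC}
responder = responder_assertions(initiator, {MESSAGE_ID_SYNC, REPLAY_COUNTER_SYNC})
agreed = usable_capabilities(initiator, responder)
```

In this case only Message ID synchronization ends up in `agreed`, and that result is the per-IKE SA information the cluster members would need to share.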
401    Peer                                                 Active Member
402    - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
403    HDR, SK {IDi, [CERT], [CERTREQ], [IDr], AUTH,
404         [N(IKEV2_MESSAGE_ID_SYNC_SUPPORTED),]
405         [N(IPSEC_REPLAY_COUNTER_SYNC_SUPPORTED),]
406         SAi2, TSi, TSr} ---------->

408    <-------- HDR, SK {IDr, [CERT+], [CERTREQ+], AUTH,
409                       [N(IKEV2_MESSAGE_ID_SYNC_SUPPORTED),]
410                       [N(IPSEC_REPLAY_COUNTER_SYNC_SUPPORTED),] SAr2, TSi, TSr}

412    After a failover event, the standby member MAY use the IKE Message ID
413    and/or IPsec Replay Counter synchronization capability when it
414    becomes the active member, and provided support for the capabilities
415    used has been negotiated. Following that, the peer MUST respond to
416    any synchronization message it receives from the newly-active cluster
417    member, subject to the rules noted below.

419    After the failover event, when the standby member becomes active, it
420    has to synchronize its SA counters with the peer. There are now
421    three possible cases:

423    1. The cluster member wishes to only perform IKE Message ID value
424       synchronization. In this case it initiates an Informational
425       exchange, with Message ID zero and the sole notification
426       IKEV2_MESSAGE_ID_SYNC.
427    2. If the newly-active member wishes to perform only IPsec replay
428       counter synchronization, it generates a regular IKEv2
429       Informational exchange using the current Message ID values, and
430       containing the IPSEC_REPLAY_COUNTER_SYNC notification.
431    3. If synchronization of both counters is needed, the cluster member
432       generates a zero-Message ID message as in case #1, and includes
433       both notifications in this message.

435    This figure contains the IKE message exchange used for SA counter
436    synchronization. The following subsections describe the details of
437    the sender and receiver processing of each message.
439    Standby [Newly Active] Member                              Peer
440    - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
441    HDR, SK {N(IKEV2_MESSAGE_ID_SYNC),
442         [N(IPSEC_REPLAY_COUNTER_SYNC)]} -------->

444                        <--------- HDR, SK {N(IKEV2_MESSAGE_ID_SYNC)}

446    Alternatively, if only IPsec Replay Counter synchronization is
447    desired, a normal Informational exchange is used, where the Message
448    ID is non-zero:

450    Standby [Newly Active] Member                              Peer
451    - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
452    HDR, SK{N(IPSEC_REPLAY_COUNTER_SYNC)} -------->

454                                                 <--------- HDR

456 5.1. Processing Rules for IKE Message ID Synchronization

458    The newly-active member sends a request containing two counter values,
459    one for the member (itself) and another for the peer, as well as a
460    random nonce. We denote the values M1 and P1. The peer responds
461    with a message containing two counter values, M2 and P2. The goal of
462    the rules below is to prevent an attacker from replaying a
463    synchronization message, thereby invalidating IKE messages that are
464    currently in process.

466    o M1 is the next sender's Message ID to be used by the member. M1
467      MUST be chosen so that it is larger than any value known to have
468      been used. It is RECOMMENDED to increment the known value at
469      least by the size of the IKE sender window.
470    o P1 SHOULD be 1 more than the last Message ID value received from
471      the peer, but may be any higher value.
472    o The member SHOULD communicate the sent values to the other cluster
473      members, so that if a second failover event takes place, the
474      synchronization message is not replayed. Such a replay would
475      result in the eventual deletion of the IKE SA (see below).
476    o The peer MUST reject any received synchronization message if M1 is
477      lower than or equal to the highest value it has seen from the
478      cluster. This includes any previously received synchronization
479      messages.
480    o M2 MUST be at least the higher of the received M1, and one more
481      than the highest sender value received from the cluster. This
482      includes any previously received synchronization messages.
484    o P2 MUST be the higher of the received P1 value, and one more than
485      the highest sender value used by the peer.
486    o The request contains a Nonce field. This field MUST be returned
487      in the response, unchanged. A response MUST be silently dropped
488      if the received Nonce does not match the one that was sent.
489    o Both the request and the response MUST NOT contain any additional
490      payloads, other than an optional IPSEC_REPLAY_COUNTER_SYNC
491      notification in the request.
492    o The request and the response MUST both be sent with a Message ID
493      value of zero.

495 5.2. Processing Rules for IPsec Replay Counter Synchronization

497    Upon failover, the newly-active member MUST increment its own Replay
498    Counter (the counter used for outgoing traffic), so as to prevent the
499    case of its traffic being dropped by the peer as replay. We note
500    that IPsec allows the replay counter to skip forward by any amount.
501    The increment is an estimate based on the outgoing IPsec bandwidth and
502    the frequency of synchronization between cluster members. In those
503    implementations where it is difficult to estimate this value, the
504    counter can be incremented by a very large number, e.g. 2**30. In
505    the latter case, a rekey SHOULD follow shortly afterwards, to ensure
506    that the counter never wraps around.

508    Next, the cluster member estimates the number of incoming messages it
509    might have missed, using similar logic. The member sends out an
510    IPSEC_REPLAY_COUNTER_SYNC notification, either stand-alone or
511    together with an IKEV2_MESSAGE_ID_SYNC notification.
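
One way to arrive at the estimate is sketched below in Python. The document only says the estimate depends on the IPsec bandwidth and the synchronization frequency, so the exact formula (packet rate times synchronization interval, scaled by a safety factor) is an assumption, as are the function name, parameter names and the default factor of 2. Only the 2**30 fallback comes from the text.

```python
import math

# "Very large number" suggested in the text for implementations that cannot
# estimate the value; a rekey should follow soon after it is used.
FALLBACK_INCREMENT = 2 ** 30


def estimate_increment(packets_per_second=None, sync_interval_s=None,
                       safety_factor=2.0):
    """Estimate how many sequence numbers may have been used since the last
    state synchronization between cluster members (hypothetical formula)."""
    if packets_per_second is None or sync_interval_s is None:
        return FALLBACK_INCREMENT
    return max(1, math.ceil(packets_per_second * sync_interval_s * safety_factor))


# Hypothetical figures: 50,000 packets per second on the SA, state
# synchronized every 2 seconds.
outgoing_skip = estimate_increment(50_000, 2)
unknown_rate_skip = estimate_increment()
```

The same function could be applied to the incoming direction to pick the delta value requested from the peer in the IPSEC_REPLAY_COUNTER_SYNC notification.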
513    If the IPSEC_REPLAY_COUNTER_SYNC is included in the same message as
514    IKEV2_MESSAGE_ID_SYNC, the peer MUST process the Message ID
515    notification first (which might cause the entire message to be
516    dropped as a replay). Then, it MUST increment the replay counters
517    for all Child SAs associated with the current IKE SA by the amount
518    requested by the cluster member.

520 6. IKEv2/IPsec Synchronization Notification Payloads

522    This section lists the new notification payload types defined by this
523    extension.

525 6.1. The IKEV2_MESSAGE_ID_SYNC_SUPPORTED Notification

527    This notification payload is included in the IKE_AUTH request/
528    response to indicate support of the IKEv2 Message ID synchronization
529    mechanism described in this document.

531                         1                   2                   3
532     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
533    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
534    | Next Payload  |C|  RESERVED   |         Payload Length        |
535    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
536    |Protocol ID(=0)| SPI Size (=0) |      Notify Message Type      |
537    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

539    The 'Next Payload', 'Payload Length', 'Protocol ID', 'SPI Size', and
540    'Notify Message Type' fields are the same as described in Section 3
541    of [2]. The 'SPI Size' field MUST be set to 0 to indicate that the
542    SPI is not present in this message. The 'Protocol ID' MUST be set to
543    0, since the notification is not specific to a particular security
544    association. The 'Payload Length' field is set to the length in
545    octets of the entire payload, including the generic payload header.
546    The 'Notify Message Type' field is set to indicate
547    IKEV2_MESSAGE_ID_SYNC_SUPPORTED, value TBD by IANA. There is no data
548    associated with this notification.

550 6.2.
The IPSEC_REPLAY_COUNTER_SYNC_SUPPORTED Notification

552    This notification payload is included in the IKE_AUTH request/
553    response to indicate support for the IPsec SA Replay Counter
554    synchronization mechanism described in this document.

556                         1                   2                   3
557     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
558    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
559    | Next Payload  |C|  RESERVED   |         Payload Length        |
560    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
561    |Protocol ID(=0)| SPI Size (=0) |      Notify Message Type      |
562    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

564    The 'Next Payload', 'Payload Length', 'Protocol ID', 'SPI Size', and
565    'Notify Message Type' fields are the same as described in Section 3
566    of [2]. The 'SPI Size' field MUST be set to 0 to indicate that the
567    SPI is not present in this message. The 'Protocol ID' MUST be set to
568    0, since the notification is not specific to a particular security
569    association. The 'Payload Length' field is set to the length in
570    octets of the entire payload, including the generic payload header.
571    The 'Notify Message Type' field is set to indicate
572    IPSEC_REPLAY_COUNTER_SYNC_SUPPORTED, value TBD by IANA. There is no
573    data associated with this notification.

575 6.3. The IKEV2_MESSAGE_ID_SYNC Notification

577    This notification payload type (value TBD by IANA) is defined to
578    synchronize the IKEv2 Message ID values between the newly-active
579    (formerly standby) cluster member and the peer.
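
As an illustration of how a peer might apply the Section 5.1 processing rules when this notification arrives, the following Python sketch computes the response values M2 and P2. The class, field and function names are hypothetical, and the sketch returns the smallest M2 the rules permit, although the text allows a higher value.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class PeerCounters:
    # Highest sender value seen from the cluster, including M1 values from
    # earlier synchronization requests (hypothetical bookkeeping).
    highest_from_cluster: int
    # Highest Message ID the peer has used in its own requests.
    highest_sent_by_peer: int


def handle_message_id_sync(state: PeerCounters, nonce: bytes,
                           m1: int, p1: int) -> Optional[Tuple[bytes, int, int]]:
    """Return (nonce, M2, P2) for the response, or None to reject."""
    # A request whose M1 is not strictly higher than anything seen so far
    # is rejected, which also covers a replayed synchronization request.
    if m1 <= state.highest_from_cluster:
        return None
    m2 = max(m1, state.highest_from_cluster + 1)
    p2 = max(p1, state.highest_sent_by_peer + 1)
    # Remember M1 so that the same request cannot be replayed later.
    state.highest_from_cluster = m1
    # The nonce is echoed back unchanged.
    return nonce, m2, p2


state = PeerCounters(highest_from_cluster=10, highest_sent_by_peer=7)
first = handle_message_id_sync(state, b"\x01\x02\x03\x04", m1=15, p1=5)
replayed = handle_message_id_sync(state, b"\x01\x02\x03\x04", m1=15, p1=5)
```

Here the first request is answered with M2 = 15 and P2 = 8, while the identical second request is rejected because 15 is no longer higher than the highest value seen.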

                        1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Next Payload  |C|  RESERVED   |         Payload Length        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |Protocol ID(=0)| SPI Size (=0) |      Notify Message Type      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          Nonce Data                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                 EXPECTED_SEND_REQ_MESSAGE_ID                  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                 EXPECTED_RECV_REQ_MESSAGE_ID                  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   It contains the following data.
   o  Nonce Data (4 octets): the random nonce data. The data should be
      identical in the synchronization request and response.
   o  EXPECTED_SEND_REQ_MESSAGE_ID (4 octets): this field is used by the
      sender of this notification payload to indicate the Message ID it
      will use in the next request that it will send to the other
      protocol peer.
   o  EXPECTED_RECV_REQ_MESSAGE_ID (4 octets): this field is used by the
      sender of this notification payload to indicate the Message ID it
      is expecting in the next request to be received from the other
      protocol peer.

6.4. The IPSEC_REPLAY_COUNTER_SYNC Notification

   This notification payload type (value TBD by IANA) is defined to
   synchronize the IPsec SA Replay Counters between the newly-active
   (formerly standby) cluster member and the peer. Since there may be
   numerous IPsec SAs established under a single IKE SA, we do not
   directly synchronize the value of each one. Instead, a delta value
   is sent and all Replay Counters for Child SAs of this IKE SA are
   incremented by the same value. Note that this solution requires that
   all these Child SAs either use or do not use Extended Sequence
   Numbers [3]. This notification is only sent by the cluster.
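
   As a non-normative illustration of the delta approach described
   above, the following Python sketch shows how a peer might apply a
   received delta to the outgoing sequence numbers of all Child SAs
   under one IKE SA. The ChildSA class, the apply_replay_delta
   function, and the choice to raise an error when a counter would
   overflow are assumptions made for this example; none of them is
   defined by this document.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ChildSA:
    """Hypothetical, minimal view of a Child SA (illustration only)."""
    spi: int
    esn: bool      # True if Extended Sequence Numbers are in use
    out_seq: int   # last sequence number used for outgoing traffic


def apply_replay_delta(child_sas: List[ChildSA], delta: int) -> None:
    """Increment the outgoing counter of every Child SA by the same delta."""
    if delta < 0:
        raise ValueError("delta must be non-negative")
    if not child_sas:
        return
    # The single delta field is 4 or 8 octets wide, so all Child SAs of
    # the IKE SA have to agree on whether ESN is used.
    if len({sa.esn for sa in child_sas}) != 1:
        raise ValueError("Child SAs of one IKE SA disagree on ESN")
    limit = 2**64 if child_sas[0].esn else 2**32
    # Assumed local policy: refuse the whole update rather than let any
    # counter wrap. Validation happens before any counter is changed.
    if any(sa.out_seq + delta >= limit for sa in child_sas):
        raise OverflowError("delta would overflow a sequence number")
    for sa in child_sas:
        sa.out_seq += delta
```

   Validating every SA before changing any of them keeps the update
   all-or-nothing, which seems to match the intent that the counters
   move together.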

                        1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Next Payload  |C|  RESERVED   |         Payload Length        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |Protocol ID(=0)| SPI Size (=0) |      Notify Message Type      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                 Incoming IPsec SA delta value                 |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The notification payload contains the following data.
   o  Incoming IPsec SA delta value (4 or 8 octets): The sender requests
      that the peer should increment all the Child SA Replay Counters
      for the sender's incoming (the peer's outgoing) traffic by this
      value. The size of this field depends on the ESN bit associated
      with the Child SAs: if the ESN bit is 1, the field's size is 8
      octets, otherwise it is 4 octets. We note that this constrains
      the Child SAs of each IKE SA to have the ESN bit either all on or
      all off.

7. Implementation Details

   This protocol does not change any of the existing IKEv2 rules
   regarding Message ID values.

   The standby member can initiate the synchronization of IKEv2 Message
   IDs under different circumstances:
   o  When it receives a problematic IKEv2/IPsec packet, i.e., a packet
      outside its expected receive window.
   o  When it has to send the first IKEv2/IPsec packet after a failover
      event.
   o  When it has just received control from the active member and
      wishes to update the values proactively, so that it need not start
      this exchange later, when sending or receiving the request.

   The standby member can initiate the synchronization of IPsec SA
   Replay Counters:
   o  If there has been traffic using the IPsec SA in the recent past
      and the standby member suspects that its Replay Counter may be
      stale.
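
   Whichever trigger applies, the member then has to build the
   synchronization payloads of Sections 6.3 and 6.4. The following
   Python sketch is a non-normative illustration of one way to
   serialize them. The function names are hypothetical, and the Notify
   Message Type values are passed in as parameters because they are
   still to be assigned by IANA.

```python
import struct


def build_notify(notify_type: int, data: bytes = b"",
                 next_payload: int = 0) -> bytes:
    """Serialize a Notify payload with Protocol ID = 0 and SPI Size = 0."""
    # 4-octet generic payload header + 4-octet notify header + data.
    length = 8 + len(data)
    # Fields: Next Payload, C bit and RESERVED (all zero), Payload Length,
    # Protocol ID (0), SPI Size (0), Notify Message Type.
    header = struct.pack("!BBHBBH", next_payload, 0, length, 0, 0,
                         notify_type)
    return header + data


def build_message_id_sync(notify_type: int, nonce: bytes,
                          expected_send: int, expected_recv: int,
                          next_payload: int = 0) -> bytes:
    """IKEV2_MESSAGE_ID_SYNC: 4-octet nonce plus two 4-octet Message IDs."""
    if len(nonce) != 4:
        raise ValueError("Nonce Data must be exactly 4 octets")
    data = nonce + struct.pack("!II", expected_send, expected_recv)
    return build_notify(notify_type, data, next_payload)


def build_replay_counter_sync(notify_type: int, delta: int, esn: bool,
                              next_payload: int = 0) -> bytes:
    """IPSEC_REPLAY_COUNTER_SYNC: delta is 8 octets with ESN, else 4."""
    data = struct.pack("!Q" if esn else "!I", delta)
    return build_notify(notify_type, data, next_payload)
```

   With these helpers, a Message ID synchronization payload is always
   20 octets long, and a Replay Counter payload is 12 or 16 octets long
   depending on whether ESN is in use.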

   Since there can be a large number of sessions at the standby member,
   and sending synchronization exchanges for all of them may result in
   overload, the standby member can choose to initiate the exchange in a
   "lazy" fashion: only when it has to send or receive the request. In
   general, the standby member is free to initiate this exchange at its
   discretion.

8. IKE SA and IPsec SA Message Sequencing

   The straightforward definitions of message sequence numbers,
   retransmissions and replay protection in IPsec and IKEv2 are strained
   by the failover scenarios described in this document. This section
   describes some policy choices that need to be made by implementations
   in this setting.

8.1. Handling of Pending IKE Messages

   After sending its "receive" counter, the cluster member MUST reject
   any incoming IKE messages that are outside its declared window. A
   similar rule applies to the peer. Local policies vary, and strict
   implementations will reject any incoming IKE message arriving before
   Message ID synchronization is complete.

8.2. Handling of Pending IPsec Messages

   For IPsec, there is often a trade-off between security and
   reliability of the protected protocols. Here again there is some
   leeway for local policy. Some implementations might accept incoming
   traffic that is outside the replay window for some time after the
   failover event. Strict implementations will only accept traffic
   that's inside the "safe" window.

8.3. IKE SA Inconsistencies

   IKEv2 is normally a reliable protocol. As long as an IKE SA is
   valid, both peers share a single, consistent view of the IKE SA and
   all associated Child SAs. Failover situations as described in this
   document may involve forced deletion of IKE messages, resulting in
   inconsistencies, such as Child SAs that exist on only one of the
   peers. Such SAs would cause an INVALID_SPI to be returned when used
   by that peer.

   The Working Group discussed at some point a proposed set of rules for
   dealing with such situations. However, we believe that these
   situations should be rare in practice; as a result the "default"
   behavior of tearing down the entire IKE SA is to be preferred over
   the complexity of dealing with a multitude of edge cases.

9. Step by Step Details

   This section goes through the sequence of steps of a typical failover
   event, looking at a case where the IKEv2 Message ID values are
   synchronized.

   o  The active cluster member and the peer device establish the
      session. They both announce the capability to synchronize counter
      information by sending the IKEV2_MESSAGE_ID_SYNC_SUPPORTED
      notification in the IKE_AUTH Exchange.
   o  Some time later, the active member dies, and a standby member
      takes over. The standby member sends its own idea of the IKE
      Message IDs (both incoming and outgoing) to the peer in an
      Informational message exchange with Message ID zero.
   o  The peer first authenticates the message. The peer compares the
      received values with the values available locally and picks the
      higher value. It then updates its Message IDs with the higher
      values and also proposes the same values in its response.
   o  The peer should not wait for any pending responses while
      responding with the new Message ID values. For example, if the
      window size is 5 and the peer's window is 3-7, and if the peer has
      sent requests 3, 4, 5, 6, 7 and received responses only for 4, 5,
      6, 7 but not for 3, then it should include the value 8 in its
      EXPECTED_SEND_REQ_MESSAGE_ID payload and should not wait for a
      response to message 3 anymore.
   o  Similarly, the peer should also not wait for pending (incoming)
      requests.
      For example, if the window size is 5 and the peer's
      window is 3-7 and if the peer has received requests 4, 5, 6, 7 but
      not 3, then it should send the value 8 in the
      EXPECTED_RECV_REQ_MESSAGE_ID payload, and should not expect to
      receive message 3 anymore.

10. Interaction with other drafts

   The usage scenario of the IKEv2/IPsec SA counter synchronization
   proposal is that an IKEv2 SA has been established between the active
   member of a hot-standby cluster and a peer, then a failover event
   occurred with the standby member becoming active. The proposal
   further assumes that the IKEv2 SA state was continuously synchronized
   between the active and standby members of the cluster before the
   failover event.
   o  Session resumption [10] assumes that a peer (client or initiator)
      detects the need to re-establish the session. In IKEv2/IPsec SA
      counter synchronization, it is the newly-active member (a gateway
      or responder) that detects the need to synchronize the SA counter
      after the failover event. Also in a hot-standby cluster, the peer
      establishes the IKEv2/IPsec session with a single IP address that
      represents the whole cluster, so the peer normally does not detect
      the event of failover in the cluster unless the standby member
      takes too long to become active and the IKEv2 SA times out by use
      of the IKEv2 liveness check mechanism. To conclude, session
      resumption and SA counter synchronization after failover are
      mutually exclusive.

   o  The IKEv2 Redirect mechanism for load-balancing [11] can be used
      either during the initial stages of SA setup (the IKE_SA_INIT and
      IKE_AUTH exchanges) or after session establishment. SA counter
      synchronization is only useful after the IKE SA has been
      established and a failover event has occurred. So, unlike
      Redirect, it is irrelevant during the first two exchanges.
      Redirect after the session has been established is mostly useful
      for timed or planned shutdown/maintenance. A real failover event
      cannot be detected by the active member ahead of time, and so
      using Redirect after session establishment is not possible in the
      case of failover. So, Redirect and SA counter synchronization
      after failover are mutually exclusive.
   o  IKEv2 Failure Detection [6] solves a similar problem where the
      peer can rapidly detect that a cluster member has crashed based on
      a token. It is unrelated to the current scenario because the goal
      in failover is for the peer not to notice that a failure has
      occurred.

11. Security Considerations

   Since Message ID synchronization messages need to be sent with
   Message ID zero, they are potentially vulnerable to replay attacks.
   Because of the semantics of this protocol, these can only be denial-
   of-service (DoS) attacks, and we are aware of two variants.
   o  Replay of Message ID synchronization request: This is countered by
      the requirement that the Send counter sent by the cluster member
      should always be monotonically increasing, a rule that the peer
      enforces by silently dropping messages that contradict it.
   o  Replay of the Message ID synchronization response: This is
      countered by sending the nonce data along with the synchronization
      payload. The same nonce data has to be returned in the response.
      Thus the standby member will accept a reply only for the current
      request. After it receives a valid response, it MUST NOT process
      the same response again and MUST discard any additional responses.

12. IANA Considerations

   This document introduces four new IKEv2 Notification Message types as
   described in Section 6. The new Notify Message Types must be
   assigned values between 16396 and 40959.

   +-------------------------------------+-------------+
   | Name                                | Value       |
   +-------------------------------------+-------------+
   | IKEV2_MESSAGE_ID_SYNC_SUPPORTED     | TBD by IANA |
   | IPSEC_REPLAY_COUNTER_SYNC_SUPPORTED | TBD by IANA |
   | IKEV2_MESSAGE_ID_SYNC               | TBD by IANA |
   | IPSEC_REPLAY_COUNTER_SYNC           | TBD by IANA |
   +-------------------------------------+-------------+

13. Acknowledgements

   We would like to thank Pratima Sethi and Frederic Detienne for their
   review comments and valuable suggestions for the initial version of
   the document.

   We would also like to thank the following people (in alphabetical
   order) for their review comments and valuable suggestions: Dan
   Harkins, Paul Hoffman, Steve Kent, Tero Kivinen, David McGrew, and
   Pekka Riikonen.

14. Change Log

   This section lists all the changes in this document.

   NOTE TO RFC EDITOR: Please remove this section before publication.

14.1. Draft -04

   Extended Sec. 3 for better coverage of other IPsec cluster-related
   issues, and how they are resolved within the existing standards.

14.2. Draft -03

   Clarified the rules for Message ID sync, so that replay attacks can
   be avoided without a failover counter.

   Added wording regarding inconsistent IKE state (basically choosing to
   ignore the problem) and further rules dealing with pending traffic.

   The IPsec replay counter delta value now refers to incoming traffic.
   The associated notification is only sent from the cluster to the
   peer, and not back.

14.3. Draft -02

   Addressed comments by Yaron Sheffer posted on the WG mailing list.

   Numerous editorial changes.

14.4. Draft -01

   Added "Multiple and Simultaneous failover" scenarios as pointed out
   by Pekka Riikonen.

   The document now provides a mechanism to sync either the IKEv2
   Message ID or the IPsec replay counter or both, to cater to
   different types of implementations.

   The HA cluster's "failover count" is used to counter replay of sync
   requests by an attacker.

   The sync of the IPsec SA replay counter was optimized to have just
   one global bumped-up outgoing IPsec SA counter for ALL Child SAs
   under an IKEv2 SA.

   Examples added for IKEv2 Message ID sync to provide more clarity.

   Some edits as per comments on mailing list to enhance clarity.

14.5. Draft -00

   Version 00 is identical to
   draft-kagarigi-ipsecme-ikev2-windowsync-04, started as WG document.

   Added IPSECME WG HA design team members as authors.

   Added comment in Introduction to discuss the window sync process on
   WG mailing list to solve some concerns.

15. References

15.1. Normative References

   [1]   Bradner, S., "Key words for use in RFCs to Indicate Requirement
         Levels", BCP 14, RFC 2119, March 1997.

   [2]   Kaufman, C., Hoffman, P., Nir, Y., and P. Eronen, "Internet Key
         Exchange Protocol Version 2 (IKEv2)", RFC 5996, September 2010.

   [3]   Kent, S. and K. Seo, "Security Architecture for the Internet
         Protocol", RFC 4301, December 2005.

15.2. Informative References

   [4]   Nir, Y., "IPsec Cluster Problem Statement", RFC 6027,
         October 2010.

   [5]   Nadas, S., "Virtual Router Redundancy Protocol (VRRP) Version 3
         for IPv4 and IPv6", RFC 5798, March 2010.

   [6]   Nir, Y., Wierbowski, D., Detienne, F., and P. Sethi, "A Quick
         Crash Detection Method for IKE",
         draft-ietf-ipsecme-failure-detection-07 (work in progress),
         March 2011.

   [7]   Housley, R., "Using Advanced Encryption Standard (AES) Counter
         Mode With IPsec Encapsulating Security Payload (ESP)",
         RFC 3686, January 2004.

   [8]   Viega, J. and D. McGrew, "The Use of Galois/Counter Mode (GCM)
         in IPsec Encapsulating Security Payload (ESP)", RFC 4106,
         June 2005.

   [9]   McGrew, D. and B.
         Weis, "Using Counter Modes with Encapsulating
         Security Payload (ESP) and Authentication Header (AH) to
         Protect Group Traffic", RFC 6054, November 2010.

   [10]  Sheffer, Y. and H. Tschofenig, "Internet Key Exchange Protocol
         Version 2 (IKEv2) Session Resumption", RFC 5723, January 2010.

   [11]  Devarapalli, V. and K. Weniger, "Redirect Mechanism for the
         Internet Key Exchange Protocol Version 2 (IKEv2)", RFC 5685,
         November 2009.

Appendix A. IKEv2 Message ID Sync Examples

   This (non-normative) section presents some examples that illustrate
   how the IKEv2 Message ID values are synchronized. We use a tuple
   notation, denoting the two counters EXPECTED_SEND_REQ_MESSAGE_ID and
   EXPECTED_RECV_REQ_MESSAGE_ID on a member as
   (EXPECTED_SEND_REQ_MESSAGE_ID, EXPECTED_RECV_REQ_MESSAGE_ID).

A.1. Normal Failover - Example 1

   Standby (Newly Active) Member                 Peer
   - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
   Sync Request (2, 3) -------->

                          Peer has the values (4, 5) so it sends
           <------------- (4, 5) as the Sync Response

A.2. Normal Failover - Example 2

   Standby (Newly Active) Member                 Peer
   - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
   Sync Request (2, 5) -------->

                          Peer has the values (2, 4) so it sends
           <------------- (5, 4) as the Sync Response

A.3. Simultaneous Failover

   In the case of simultaneous failover, both sides send the
   synchronization request, but whichever side has the higher value will
   be eventually synchronized.

   Standby (Newly Active) Member                 Peer
   - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

   Sync Request (4,4) ----->

                          <-------------- Sync Request (5,5)

   Sync Response (5,5) ---->

                          <-------- Sync Response (5,5)

Authors' Addresses

   Raj Singh (Editor)
   Cisco Systems, Inc.
   Divyashree Chambers, B Wing, O'Shaugnessy Road
   Bangalore, Karnataka 560025
   India

   Phone: +91 80 4301 3320
   Email: rsj@cisco.com

   Kalyani Garigipati
   Cisco Systems, Inc.
   Divyashree Chambers, B Wing, O'Shaugnessy Road
   Bangalore, Karnataka 560025
   India

   Phone: +91 80 4426 4831
   Email: kagarigi@cisco.com

   Yoav Nir
   Check Point Software Technologies Ltd.
   5 Hasolelim St.
   Tel Aviv 67897
   Israel

   Email: ynir@checkpoint.com

   Yaron Sheffer
   Independent

   Email: yaronf.ietf@gmail.com

   Dacheng Zhang
   Huawei Technologies Ltd.

   Email: zhangdacheng@huawei.com