| draft-ietf-ipsecme-ipsecha-protocol-05.txt | draft-ietf-ipsecme-ipsecha-protocol-06.txt | |||
|---|---|---|---|---|
| Network Working Group R. Singh, Ed. | Network Working Group R. Singh, Ed. | |||
| Internet-Draft G. Kalyani | Internet-Draft G. Kalyani | |||
| Intended status: Standards Track Cisco | Intended status: Standards Track Cisco | |||
| Expires: September 30, 2011 Y. Nir | Expires: November 7, 2011 Y. Nir | |||
| Check Point | Check Point | |||
| Y. Sheffer | Y. Sheffer | |||
| Independent | Porticor | |||
| D. Zhang | D. Zhang | |||
| Huawei | Huawei | |||
| March 29, 2011 | May 6, 2011 | |||
| Protocol Support for High Availability of IKEv2/IPsec | Protocol Support for High Availability of IKEv2/IPsec | |||
| draft-ietf-ipsecme-ipsecha-protocol-05 | draft-ietf-ipsecme-ipsecha-protocol-06 | |||
| Abstract | Abstract | |||
| The IPsec protocol suite is widely used for business-critical network | The IPsec protocol suite is widely used for business-critical network | |||
| traffic. In order to make IPsec deployments highly available, more | traffic. In order to make IPsec deployments highly available, more | |||
| scalable and failure-resistant, they are often implemented as IPsec | scalable and failure-resistant, they are often implemented as IPsec | |||
| High Availability (HA) clusters. However there are many issues in | High Availability (HA) clusters. However there are many issues in | |||
| IPsec HA clustering, and in particular in IKEv2 clustering. An | IPsec HA clustering, and in particular in IKEv2 clustering. An | |||
| earlier document, "IPsec Cluster Problem Statement", enumerates the | earlier document, "IPsec Cluster Problem Statement", enumerates the | |||
| issues encountered in the IKEv2/IPsec HA cluster environment. This | issues encountered in the IKEv2/IPsec HA cluster environment. This | |||
| skipping to change at page 2, line 4 | skipping to change at page 2, line 4 | |||
| Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
| Task Force (IETF). Note that other groups may also distribute | Task Force (IETF). Note that other groups may also distribute | |||
| working documents as Internet-Drafts. The list of current Internet- | working documents as Internet-Drafts. The list of current Internet- | |||
| Drafts is at http://datatracker.ietf.org/drafts/current/. | Drafts is at http://datatracker.ietf.org/drafts/current/. | |||
| Internet-Drafts are draft documents valid for a maximum of six months | Internet-Drafts are draft documents valid for a maximum of six months | |||
| and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
| time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
| material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
| This Internet-Draft will expire on September 30, 2011. | This Internet-Draft will expire on November 7, 2011. | |||
| Copyright Notice | Copyright Notice | |||
| Copyright (c) 2011 IETF Trust and the persons identified as the | Copyright (c) 2011 IETF Trust and the persons identified as the | |||
| document authors. All rights reserved. | document authors. All rights reserved. | |||
| This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
| Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
| (http://trustee.ietf.org/license-info) in effect on the date of | (http://trustee.ietf.org/license-info) in effect on the date of | |||
| publication of this document. Please review these documents | publication of this document. Please review these documents | |||
| carefully, as they describe your rights and restrictions with respect | carefully, as they describe your rights and restrictions with respect | |||
| to this document. Code Components extracted from this document must | to this document. Code Components extracted from this document must | |||
| include Simplified BSD License text as described in Section 4.e of | include Simplified BSD License text as described in Section 4.e of | |||
| the Trust Legal Provisions and are provided without warranty as | the Trust Legal Provisions and are provided without warranty as | |||
| described in the Simplified BSD License. | described in the Simplified BSD License. | |||
| Table of Contents | Table of Contents | |||
| 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 | 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 | |||
| 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 5 | 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 5 | |||
| 3. Issues Resolved from IPsec Cluster Problem Statement . . . . . 6 | 3. Issues Resolved from IPsec Cluster Problem Statement . . . . . 7 | |||
| 3.1. Large Amount of State . . . . . . . . . . . . . . . . . . 6 | 3.1. Large Amount of State . . . . . . . . . . . . . . . . . . 7 | |||
| 3.2. Multiple Members Using the Same SA . . . . . . . . . . . . 7 | 3.2. Multiple Members Using the Same SA . . . . . . . . . . . . 8 | |||
| 3.3. Avoiding Collisions in SPI Number Allocation . . . . . . . 7 | 3.3. Avoiding Collisions in SPI Number Allocation . . . . . . . 8 | |||
| 3.4. Interaction with Counter Modes . . . . . . . . . . . . . . 8 | 3.4. Interaction with Counter Modes . . . . . . . . . . . . . . 8 | |||
| 4. The IKEv2/IPsec SA Counter Synchronization Problem . . . . . . 8 | 4. The IKEv2/IPsec SA Counter Synchronization Problem . . . . . . 9 | |||
| 5. SA Counter Synchronization Solution . . . . . . . . . . . . . 9 | 5. SA Counter Synchronization Solution . . . . . . . . . . . . . 10 | |||
| 5.1. Processing Rules for IKE Message ID Synchronization . . . 11 | 5.1. Processing Rules for IKE Message ID Synchronization . . . 12 | |||
| 5.2. Processing Rules for IPsec Replay Counter | 5.2. Processing Rules for IPsec Replay Counter | |||
| Synchronization . . . . . . . . . . . . . . . . . . . . . 12 | Synchronization . . . . . . . . . . . . . . . . . . . . . 13 | |||
| 6. IKEv2/IPsec Synchronization Notification Payloads . . . . . . 12 | 6. IKEv2/IPsec Synchronization Notification Payloads . . . . . . 13 | |||
| 6.1. The IKEV2_MESSAGE_ID_SYNC_SUPPORTED Notification . . . . . 12 | 6.1. The IKEV2_MESSAGE_ID_SYNC_SUPPORTED Notification . . . . . 14 | |||
| 6.2. The IPSEC_REPLAY_COUNTER_SYNC_SUPPORTED Notification . . . 13 | 6.2. The IPSEC_REPLAY_COUNTER_SYNC_SUPPORTED Notification . . . 14 | |||
| 6.3. The IKEV2_MESSAGE_ID_SYNC Notification . . . . . . . . . . 13 | 6.3. The IKEV2_MESSAGE_ID_SYNC Notification . . . . . . . . . . 15 | |||
| 6.4. The IPSEC_REPLAY_COUNTER_SYNC Notification . . . . . . . . 14 | 6.4. The IPSEC_REPLAY_COUNTER_SYNC Notification . . . . . . . . 15 | |||
| 7. Implementation Details . . . . . . . . . . . . . . . . . . . . 15 | 7. Implementation Details . . . . . . . . . . . . . . . . . . . . 16 | |||
| 8. IKE SA and IPsec SA Message Sequencing . . . . . . . . . . . . 16 | 8. IKE SA and IPsec SA Message Sequencing . . . . . . . . . . . . 17 | |||
| 8.1. Handling of Pending IKE Messages . . . . . . . . . . . . . 16 | 8.1. Handling of Pending IKE Messages . . . . . . . . . . . . . 17 | |||
| 8.2. Handling of Pending IPsec Messages . . . . . . . . . . . . 16 | 8.2. Handling of Pending IPsec Messages . . . . . . . . . . . . 17 | |||
| 8.3. IKE SA Inconsistencies . . . . . . . . . . . . . . . . . . 16 | 8.3. IKE SA Inconsistencies . . . . . . . . . . . . . . . . . . 17 | |||
| 9. Step by Step Details . . . . . . . . . . . . . . . . . . . . . 16 | 9. Step by Step Details . . . . . . . . . . . . . . . . . . . . . 18 | |||
| 10. Interaction with other drafts . . . . . . . . . . . . . . . . 17 | 10. Interaction with other specifications . . . . . . . . . . . . 18 | |||
| 11. Security Considerations . . . . . . . . . . . . . . . . . . . 18 | 11. Security Considerations . . . . . . . . . . . . . . . . . . . 19 | |||
| 12. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 18 | 12. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 20 | |||
| 13. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 19 | 13. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 20 | |||
| 14. Change Log . . . . . . . . . . . . . . . . . . . . . . . . . . 19 | 14. Change Log . . . . . . . . . . . . . . . . . . . . . . . . . . 20 | |||
| 14.1. Draft -04 . . . . . . . . . . . . . . . . . . . . . . . . 19 | 14.1. Draft -06 . . . . . . . . . . . . . . . . . . . . . . . . 21 | |||
| 14.2. Draft -03 . . . . . . . . . . . . . . . . . . . . . . . . 19 | 14.2. Draft -05 . . . . . . . . . . . . . . . . . . . . . . . . 21 | |||
| 14.3. Draft -02 . . . . . . . . . . . . . . . . . . . . . . . . 20 | 14.3. Draft -04 . . . . . . . . . . . . . . . . . . . . . . . . 21 | |||
| 14.4. Draft -01 . . . . . . . . . . . . . . . . . . . . . . . . 20 | 14.4. Draft -03 . . . . . . . . . . . . . . . . . . . . . . . . 21 | |||
| 14.5. Draft -00 . . . . . . . . . . . . . . . . . . . . . . . . 20 | 14.5. Draft -02 . . . . . . . . . . . . . . . . . . . . . . . . 21 | |||
| 15. References . . . . . . . . . . . . . . . . . . . . . . . . . . 20 | 14.6. Draft -01 . . . . . . . . . . . . . . . . . . . . . . . . 21 | |||
| 15.1. Normative References . . . . . . . . . . . . . . . . . . . 20 | 14.7. Draft -00 . . . . . . . . . . . . . . . . . . . . . . . . 22 | |||
| 15.2. Informative References . . . . . . . . . . . . . . . . . . 21 | 15. References . . . . . . . . . . . . . . . . . . . . . . . . . . 22 | |||
| Appendix A. IKEv2 Message ID Sync Examples . . . . . . . . . . . 21 | 15.1. Normative References . . . . . . . . . . . . . . . . . . . 22 | |||
| A.1. Normal Failover - Example 1 . . . . . . . . . . . . . . . 22 | 15.2. Informative References . . . . . . . . . . . . . . . . . . 22 | |||
| A.2. Normal Failover - Example 2 . . . . . . . . . . . . . . . 22 | Appendix A. IKEv2 Message ID Sync Examples . . . . . . . . . . . 23 | |||
| A.3. Simultaneous Failover . . . . . . . . . . . . . . . . . . 22 | A.1. Normal Failover - Example 1 . . . . . . . . . . . . . . . 23 | |||
| Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 22 | A.2. Normal Failover - Example 2 . . . . . . . . . . . . . . . 24 | |||
| A.3. Normal Failover - Example 3 . . . . . . . . . . . . . . . 24 | ||||
| A.4. Simultaneous Failover . . . . . . . . . . . . . . . . . . 24 | ||||
| Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 25 | ||||
| 1. Introduction | 1. Introduction | |||
| The IPsec protocol suite, including IKEv2, is a major building block | The IPsec protocol suite, including IKEv2, is a major building block | |||
| of virtual private networks (VPNs). In order to make such VPNs | of virtual private networks (VPNs). In order to make such VPNs | |||
| highly available, more scalable and failure-resistant, these VPNs are | highly available, more scalable and failure-resistant, these VPNs are | |||
| implemented as IKEv2/IPsec Highly Available (HA) clusters. However | implemented as IKEv2/IPsec Highly Available (HA) clusters. However | |||
| there are many issues with the IKEv2/IPsec HA cluster. Section 4 | there are many issues with the IKEv2/IPsec HA cluster. Section 3 and | |||
| below enumerates the issues around the IKEv2/IPsec HA cluster | Section 4 below expand on the issues around the IKEv2/IPsec HA | |||
| solution. | cluster solution, issues which were first described in the Problem | |||
| Statement [4]. | ||||
| In the case of a hot-standby cluster implementation of IKEv2/IPsec | In the case of a hot-standby cluster implementation of IKEv2/IPsec | |||
| based VPNs, the IKEv2/IPsec session is first established between the | based VPNs, the IKEv2/IPsec session is first established between the | |||
| peer and the active member of the cluster. Later, the active member | peer and the active member of the cluster. Later, the active member | |||
| continuously syncs/updates the IKE/IPsec SA state to the standby | continuously syncs/updates the IKE/IPsec SA state to the standby | |||
| member of the cluster. This primary SA state sync-up takes place | member of the cluster. This primary SA state sync-up takes place | |||
| upon each SA bring-up and/or rekey. Performing the SA state | upon each SA bring-up and/or rekey. Performing the SA state | |||
| synchronization/update for every single IKE and IPsec message is very | synchronization/update for every single IKE and IPsec message is very | |||
| costly, so normally it is done periodically. As a result, when the | costly, so normally it is done periodically. As a result, when the | |||
| failover event happens, this is first detected by the standby member | failover event happens, this is first detected by the standby member | |||
| and, possibly after a considerable amount of time, it becomes the | and, possibly after a considerable amount of time, it becomes the | |||
| active member. During this failover process the peer is unaware of | active member. During this failover process the peer is unaware of | |||
| the failover event, and keeps sending IKE requests and IPsec packets | the failover event, and keeps sending IKE requests and IPsec packets | |||
| to the cluster, as in fact it is allowed to do because of the IKEv2 | to the cluster, as in fact it is allowed to do because of the IKEv2 | |||
| windowing feature. After the newly-active member starts, it detects | windowing feature. After the newly-active member starts, it detects | |||
| the mismatch in IKE Message ID values and IPsec replay counters and | the mismatch in IKE Message ID values and IPsec replay counters and | |||
| needs to resolve this situation. Please see Section 4 for more | needs to resolve this situation. Please see Section 4 for more | |||
| details of the problem. | details of the problem. | |||
| This document proposes an extension to the IKEv2 protocol to solve | This document defines an extension to the IKEv2 protocol to solve the | |||
| the main issues of IKE Message ID synchronization and IPsec SA replay | main issues of IKE Message ID synchronization and IPsec SA replay | |||
| counter synchronization and gives implementation advice for others. | counter synchronization, and gives implementation advice to address | |||
| Following is a summary of the solutions provided in this document: | other issues. Following is a summary of the solutions provided in | |||
| this document: | ||||
| o IKEv2 Message ID synchronization: this is done by syncing up the | o IKEv2 Message ID synchronization: this is done by syncing up the | |||
| expected send and receive Message ID values with the peer, and | expected send and receive Message ID values with the peer, and | |||
| updating the values at the newly active cluster member. | updating the values at the newly active cluster member. | |||
| o IPsec Replay Counter synchronization: this is done by incrementing | o IPsec Replay Counter synchronization: this is done by incrementing | |||
| the cluster's outgoing SA replay counter values by a "large" | the cluster's outgoing SA replay counter values by a "large" | |||
| number; in addition, the newly-active member requests the peer to | number; in addition, the newly-active member requests the peer to | |||
| increment the replay counter values it is using for the peer's | increment the replay counter values it is using for the peer's | |||
| outgoing traffic. | outgoing traffic. | |||
| skipping to change at page 5, line 49 | skipping to change at page 6, line 4 | |||
| occurs, the standby member of the cluster becomes active, and the | occurs, the standby member of the cluster becomes active, and the | |||
| peer normally doesn't notice that failover has taken place. | peer normally doesn't notice that failover has taken place. | |||
| Although we treat the peer as a single entity, it may also be a | Although we treat the peer as a single entity, it may also be a | |||
| cluster. | cluster. | |||
| o "Multiple failover" is the situation where, in a cluster with | o "Multiple failover" is the situation where, in a cluster with | |||
| three or more members, multiple failover events happen in rapid | three or more members, multiple failover events happen in rapid | |||
| succession, e.g. from M1 to M2, and then to M3. It is our goal | succession, e.g. from M1 to M2, and then to M3. It is our goal | |||
| that the implementation should be able to handle this situation, | that the implementation should be able to handle this situation, | |||
| i.e. to handle the new failover event even if it is still | i.e. to handle the new failover event even if it is still | |||
| processing the old failover. | processing the old failover. | |||
| o "Simultaneous failover" is the situation where two clusters have | o "Simultaneous failover" is the situation where two clusters have | |||
| an IPsec connection between them, and failover happens at both | an IPsec connection between them, and failover happens at both | |||
| ends at the same time. It is our goal that implementations should | ends at the same time. It is our goal that implementations should | |||
| be able to handle simultaneous failover. | be able to handle simultaneous failover. | |||
| The generic term "IKEv2/IPsec SA Counters" is used throughout this | The generic term "IKEv2/IPsec SA Counters" is used throughout this | |||
| document. This term refers to both IKEv2 Message ID counters and | document. This term refers to both IKEv2 Message ID counters and | |||
| IPsec replay counters. According to the IPsec standards, the IKEv2 | IPsec replay counters. According to the IPsec standards, the IKEv2 | |||
| Message ID counter is mandatory, and used to ensure reliable delivery | Message ID counter is mandatory, and used to ensure reliable delivery | |||
| as well as to protect against message replay in IKEv2; the IPsec SA | as well as to protect against message replay in IKEv2; the IPsec SA | |||
| replay counters are optional, and are used to provide the IPsec anti- | replay counters are optional, and are used to provide the IPsec anti- | |||
| replay feature. | replay feature. | |||
| Some of these terms are used in the following architectural diagram. | ||||
| (Diagram: an IPsec peer exchanges IKE/IPsec traffic with a hot standby cluster; inside the cluster, the active member and the standby member are connected by a sync channel.) | ||||
| An IPsec Hot Standby Cluster | ||||
| 3. Issues Resolved from IPsec Cluster Problem Statement | 3. Issues Resolved from IPsec Cluster Problem Statement | |||
| The IPsec Cluster Problem Statement [4] enumerates the problems | The IPsec Cluster Problem Statement [4] enumerates the problems | |||
| raised by IPsec clusters. The following table lists the problem | raised by IPsec clusters. The following table lists the problem | |||
| statement's sections that are resolved by this document. | statement's sections that are resolved by this document. | |||
| o 3.2. Lots of Long Lived State | o 3.2. Lots of Long Lived State | |||
| o 3.3. IKE Counters | o 3.3. IKE Counters | |||
| o 3.4. Outbound SA Counters | o 3.4. Outbound SA Counters | |||
| o 3.5. Inbound SA Counters | o 3.5. Inbound SA Counters | |||
| o 3.6. Missing Synchronization Messages | o 3.6. Missing Synchronization Messages | |||
| o 3.7. Simultaneous use of IKE and IPsec SAs by Different Members | o 3.7. Simultaneous use of IKE and IPsec SAs by Different Members | |||
| * 3.7.1. Outbound SAs using counter modes | * 3.7.1. Outbound SAs using counter modes | |||
| o 3.8. Different IP addresses for IKE and IPsec | o 3.8. Different IP addresses for IKE and IPsec | |||
| o 3.9. Allocation of SPIs | o 3.9. Allocation of SPIs | |||
| The main problem areas are solved using the protocol extension | The main problem areas are solved using the protocol extension | |||
| defined below, starting with Section 5; additionally, this section | defined below, starting with Section 5; additionally, this section | |||
| provides implementation advice for other issues in the following | provides implementation advice for other issues in the following | |||
| subsections. | subsections. Implementers should note that these subsections include | |||
| a number of new security-critical requirements. | ||||
| 3.1. Large Amount of State | 3.1. Large Amount of State | |||
| Section 3.2 of the Problem Statement mentions that a lot of state | Section 3.2 of the Problem Statement mentions that a lot of state | |||
| needs to be synchronized for a cluster to be transparent. The actual | needs to be synchronized for a cluster to be transparent. The actual | |||
| volume of that data is very much implementation-dependent, and even | volume of that data is very much implementation-dependent, and even | |||
| for the same implementation, the amounts of data may vary wildly. An | for the same implementation, the amounts of data may vary wildly. An | |||
| IPsec gateway used for inter-domain VPN with a dozen other gateways, | IPsec gateway used for inter-domain VPN with a dozen other gateways, | |||
| and having SAs that are rekeyed every 8 hours, will need a lot less | and having SAs that are rekeyed every 8 hours, will need a lot less | |||
| synchronization traffic than a similar gateway used for remote | synchronization traffic than a similar gateway used for remote | |||
| skipping to change at page 7, line 35 | skipping to change at page 8, line 25 | |||
| that would impose unreasonable requirements on the synch connection. | that would impose unreasonable requirements on the synch connection. | |||
| A far better solution would be to not synchronize the outbound SA, | A far better solution would be to not synchronize the outbound SA, | |||
| and create multiple outbound SAs, one for each member. The problem | and create multiple outbound SAs, one for each member. The problem | |||
| with this option is that the peer might view these multiple parallel | with this option is that the peer might view these multiple parallel | |||
| SAs as redundant, and tear down all but one of them. | SAs as redundant, and tear down all but one of them. | |||
| Section 2.8 of [2] specifically allows multiple parallel SAs, but the | Section 2.8 of [2] specifically allows multiple parallel SAs, but the | |||
| reason given for this is to have multiple SAs with different QoS | reason given for this is to have multiple SAs with different QoS | |||
| attributes. So while this is not a new requirement of IKEv2 | attributes. So while this is not a new requirement of IKEv2 | |||
| implementations, we re-iterate here that IPsec peers MUST accept the | implementations working with QoS, we re-iterate here that IPsec peers | |||
| long-term existence of multiple parallel SAs, even when QoS | MUST accept the long-term existence of multiple parallel SAs, even | |||
| mechanisms are not in use. | when QoS mechanisms are not in use. | |||
| 3.3. Avoiding Collisions in SPI Number Allocation | 3.3. Avoiding Collisions in SPI Number Allocation | |||
| Section 3.9 of the problem statement describes the problem of two | Section 3.9 of the problem statement describes the problem of two | |||
| cluster members allocating the same SPI number for two different SAs. | cluster members allocating the same SPI number for two different SAs. | |||
| This would violate section 4.4.2.1 of [3]. There are several schemes | This would violate section 4.4.2.1 of [3]. There are several schemes | |||
| to allow implementations to avoid such collisions, such as | to allow implementations to avoid such collisions, such as | |||
| partitioning the SPI space, a request-response over the synch | partitioning the SPI space, a request-response over the synch | |||
| channel, and locking mechanisms. We believe that these are | channel, and locking mechanisms. We believe that these are | |||
| sufficiently robust and available so that we don't need to make an | sufficiently robust and available so that we don't need to make an | |||
| exception to RFC 4301, and we can leave this problem for the | exception to RFC 4301, and we can leave this problem for the | |||
| implementations to solve. Cluster members MUST NOT generate multiple | implementations to solve. Cluster members must not generate multiple | |||
| inbound SAs with the same SPI. | inbound SAs with the same SPI. | |||
| 3.4. Interaction with Counter Modes | 3.4. Interaction with Counter Modes | |||
| For SAs involving counter mode ciphers such as CTR [7] or GCM [8] | For SAs involving counter mode ciphers such as CTR [7] or GCM [8] | |||
| there is yet another complication. The initial vector for such modes | there is yet another complication. The initial vector for such modes | |||
| MUST NOT be repeated, and senders use methods such as counters or | MUST NOT be repeated, and senders may use methods such as counters or | |||
| LFSRs to ensure this property. For an SA shared between multiple | LFSRs to ensure this property. For an SA shared between multiple | |||
| active members (load sharing cases), implementations MUST ensure that | active members (load sharing cases), implementations MUST ensure that | |||
| no initial vector is ever repeated. Similar concerns apply to an SA | no initial vector is ever repeated. Similar concerns apply to an SA | |||
| failing over from one member to another. See [9] for a discussion of | failing over from one member to another. See [9] for a discussion of | |||
| this problem in another context. | this problem in another context. | |||
| Just as in the SPI collision problem, there are ways to avoid a | Just as in the SPI collision problem, there are ways to avoid a | |||
| collision of initial vectors, and this is left up to implementations. | collision of initial vectors, and this is left up to implementations. | |||
| In the context of load sharing, parallel SAs are a simple solution to | In the context of load sharing, parallel SAs are a simple solution to | |||
| this problem as well. | this problem as well. | |||
| skipping to change at page 9, line 22 | skipping to change at page 10, line 11 | |||
| mismatch and retransmission of requests, negating the benefits of the | mismatch and retransmission of requests, negating the benefits of the | |||
| high availability cluster despite the periodic update between the | high availability cluster despite the periodic update between the | |||
| cluster members. | cluster members. | |||
| A similar issue is also observed with IPsec anti-replay counters if | A similar issue is also observed with IPsec anti-replay counters if | |||
| anti-replay protection is enabled, which is commonly the case. | anti-replay protection is enabled, which is commonly the case. | |||
| Regardless of how well the ESP and AH SA counters are synchronized | Regardless of how well the ESP and AH SA counters are synchronized | |||
| from the active to the standby member, there is a chance that the | from the active to the standby member, there is a chance that the | |||
| standby member would end up with stale counter values. The standby | standby member would end up with stale counter values. The standby | |||
| member would then use those stale counter values when sending IPsec | member would then use those stale counter values when sending IPsec | |||
| packets. The peer would reject/drop such packets since when the | packets. The peer would drop such packets since when the anti-replay | |||
| anti-replay protection feature is enabled, duplicate use of counters | protection feature is enabled, duplicate use of counters is not | |||
| is not allowed. Note that IPsec allows the sender to skip some | allowed. Note that IPsec allows the sender to skip some counter | |||
| counter values and continue sending with higher counter values. | values and continue sending with higher counter values. | |||
| We conclude that a mechanism is required to ensure that the standby | We conclude that a mechanism is required to ensure that the standby | |||
| member has correct Message ID and IPsec counter values when it | member has correct Message ID and IPsec counter values when it | |||
| becomes active, so that sessions are not torn down as a result of | becomes active, so that sessions are not torn down as a result of | |||
| mismatched counters. | mismatched counters. | |||
| 5. SA Counter Synchronization Solution | 5. SA Counter Synchronization Solution | |||
| This document proposes two separate approaches to resolving the | This document defines two separate approaches to resolving the issues | |||
| issues of mismatched IKE Message ID values and IPsec counter values. | of mismatched IKE Message ID values and IPsec counter values. | |||
| o In the case of IKE Message ID values, the newly active cluster | o In the case of IKE Message ID values, the newly active cluster | |||
| member and the peer negotiate a pair of new values so that future | member and the peer negotiate a pair of new values so that future | |||
| IKE messages will not be dropped. | IKE messages will not be dropped. | |||
| o For IPsec counter values, the newly-active member and the peer | o For IPsec counter values, the newly-active member and the peer | |||
| both increment their respective counter values, "skipping forward" | both increment their respective counter values, "skipping forward" | |||
| by a large number, to ensure that no IPsec counters are ever | by a large number, to ensure that no IPsec counters are ever | |||
| reused. | reused. | |||
| Although conceptually separate, the two synchronization processes | Although conceptually separate, the two synchronization processes | |||
| would typically take place simultaneously. | would typically take place simultaneously. | |||
| First, the peer and the active member of the cluster negotiate their | First, the peer and the active member of the cluster negotiate their | |||
| ability to support IKEv2 Message ID synchronization and/or IPsec | ability to support IKEv2 Message ID synchronization and/or IPsec | |||
| Replay Counter synchronization. This is done by exchanging one or | Replay Counter synchronization. This is done by exchanging one or | |||
| both of the IKEV2_MESSAGE_ID_SYNC_SUPPORTED and | both of the IKEV2_MESSAGE_ID_SYNC_SUPPORTED and | |||
| IPSEC_REPLAY_COUNTER_SYNC_SUPPORTED notifications during the IKE_AUTH | IPSEC_REPLAY_COUNTER_SYNC_SUPPORTED notifications during the IKE_AUTH | |||
| exchange. When negotiating these capabilities, the responder MUST | exchange. When negotiating these capabilities, the responder MUST | |||
| NOT assert support of a capability unless such support was asserted | NOT assert support of a capability unless such support was asserted | |||
| by the initiator. Only a capability whose support was asserted by | by the initiator. Only a capability whose support was asserted by | |||
| both parties can be used during the lifetime of the SA. | both parties can be used during the lifetime of the SA. The peer's | |||
| capabilities with regard to this extension are part of the IKEv2 SA | ||||
| state, and thus MUST be shared between the cluster members. | ||||
| This per-IKE SA information is shared with the other cluster members. | This per-IKE SA information is shared with the other cluster members. | |||
| Peer Active Member | Peer Active Member | |||
| - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - | - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - | |||
| HDR, SK {IDi, [CERT], [CERTREQ], [IDr], AUTH, | HDR, SK {IDi, [CERT], [CERTREQ], [IDr], AUTH, | |||
| [N(IKEV2_MESSAGE_ID_SYNC_SUPPORTED),] | [N(IKEV2_MESSAGE_ID_SYNC_SUPPORTED),] | |||
| [N(IPSEC_REPLAY_COUNTER_SYNC_SUPPORTED),] | [N(IPSEC_REPLAY_COUNTER_SYNC_SUPPORTED),] | |||
| SAi2, TSi, TSr} ----------> | SAi2, TSi, TSr} ----------> | |||
| skipping to change at page 10, line 31 ¶ | skipping to change at page 11, line 24 ¶ | |||
| [N(IPSEC_REPLAY_COUNTER_SYNC_SUPPORTED),] SAr2, TSi, TSr} | [N(IPSEC_REPLAY_COUNTER_SYNC_SUPPORTED),] SAr2, TSi, TSr} | |||
| After a failover event, the standby member MAY use the IKE Message ID | After a failover event, the standby member MAY use the IKE Message ID | |||
| and/or IPsec Replay Counter synchronization capability when it | and/or IPsec Replay Counter synchronization capability when it | |||
| becomes the active member, and provided support for the capabilities | becomes the active member, and provided support for the capabilities | |||
| used has been negotiated. Following that, the peer MUST respond to | used has been negotiated. Following that, the peer MUST respond to | |||
| any synchronization message it receives from the newly-active cluster | any synchronization message it receives from the newly-active cluster | |||
| member, subject to the rules noted below. | member, subject to the rules noted below. | |||
| After the failover event, when the standby member becomes active, it | After the failover event, when the standby member becomes active, it | |||
| has to synchronize its SA counters with the peer. There are now | has to synchronize its SA counters with the peer. There are now four | |||
| three possible cases: | possible cases: | |||
| 1. The cluster member wishes to only perform IKE Message ID value | 1. The cluster member wishes to only perform IKE Message ID value | |||
| synchronization. In this case it initiates an Informational | synchronization. In this case it initiates an Informational | |||
| exchange, with Message ID zero and the sole notification | exchange, with Message ID zero and the sole notification | |||
| IKEV2_MESSAGE_ID_SYNC. | IKEV2_MESSAGE_ID_SYNC. | |||
| 2. If the newly-active member wishes to perform only IPsec replay | 2. If the newly-active member wishes to perform only IPsec replay | |||
| counter synchronization, it generates a regular IKEv2 | counter synchronization, it generates a regular IKEv2 | |||
| Informational exchange using the current Message ID values, and | Informational exchange using the current Message ID values, and | |||
| containing the IPSEC_REPLAY_COUNTER_SYNC notification. | containing the IPSEC_REPLAY_COUNTER_SYNC notification. | |||
| 3. If synchronization of both counters is needed, the cluster member | 3. If synchronization of both counters is needed, the cluster member | |||
| generates a zero-Message ID message as in case #1, and includes | generates a zero-Message ID message as in case #1, and includes | |||
| both notifications in this message. | both notifications in this message. | |||
| 4. Lastly, the peer may not support this extension. This is known | ||||
| to the newly-active member (because the cluster members must | ||||
| share this information, as noted earlier). This case is the | ||||
| existing IKEv2 behavior, and the IKE and IPsec SAs may or may not | ||||
| survive the failover, depending on the exact state on the peer | ||||
| and the cluster member. | ||||
| This figure contains the IKE message exchange used for SA counter | This figure contains the IKE message exchange used for SA counter | |||
| synchronization. The following subsections describe the details of | synchronization. The following subsections describe the details of | |||
| the sender and receiver processing of each message. | the sender and receiver processing of each message. | |||
| Standby [Newly Active] Member Peer | Standby [Newly Active] Member Peer | |||
| - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - | - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - | |||
| HDR, SK {N(IKEV2_MESSAGE_ID_SYNC), | HDR, SK {N(IKEV2_MESSAGE_ID_SYNC), | |||
| [N(IPSEC_REPLAY_COUNTER_SYNC)]} --------> | [N(IPSEC_REPLAY_COUNTER_SYNC)]} --------> | |||
| skipping to change at page 11, line 24 ¶ | skipping to change at page 12, line 24 ¶ | |||
| ID is non-zero: | ID is non-zero: | |||
| Standby [Newly Active] Member Peer | Standby [Newly Active] Member Peer | |||
| - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - | - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - | |||
| HDR, SK{N(IPSEC_REPLAY_COUNTER_SYNC)} --------> | HDR, SK{N(IPSEC_REPLAY_COUNTER_SYNC)} --------> | |||
| <--------- HDR | <--------- HDR | |||
| 5.1. Processing Rules for IKE Message ID Synchronization | 5.1. Processing Rules for IKE Message ID Synchronization | |||
| The newly-active member sends a request containing two counter value, | The newly-active member sends a request containing two counter | |||
| one for the member (itself) and another for the peer, as well as a | values, one for the member (itself) and another for the peer, as well | |||
| random nonce. We denote the values M1 and P1. The peer responds | as a random nonce. We denote the values M1 and P1. The peer | |||
| with a message containing two counter values, M2 and P2. The goal of | responds with a message containing two counter values, M2 and P2 | |||
| the rules below is to prevent an attacker from replaying a | (note that the values appear in the opposite order in the | |||
| synchronization message, thereby invalidating IKE messages that are | notification's payload). The goal of the rules below is to prevent | |||
| currently in process. | an attacker from replaying a synchronization message, thereby | |||
| invalidating IKE messages that are currently in process. | ||||
| o M1 is the next sender's Message ID to be used by the member. M1 | o M1 is the next sender's Message ID to be used by the member. M1 | |||
| MUST be chosen so that it is larger than any value known to have | MUST be chosen so that it is larger than any value known to have | |||
| been used. It is RECOMMENDED to increment the known value at | been used. It is RECOMMENDED to increment the known value at | |||
| least by the size of the IKE sender window. | least by the size of the IKE sender window. | |||
| o P1 SHOULD be 1 more than the last Message ID value received from | o P1 SHOULD be 1 more than the last Message ID value received from | |||
| the peer, but may be any higher value. | the peer, but may be any higher value. | |||
| o The member SHOULD communicate the sent values to the other cluster | o The member SHOULD communicate the sent values to the other cluster | |||
| members, so that if a second failover event takes place, the | members, so that if a second failover event takes place, the | |||
| synchronization message is not replayed. Such a replay would | synchronization message is not replayed. Such a replay would | |||
| result in the eventual deletion of the IKE SA (see below). | result in the eventual deletion of the IKE SA (see below). | |||
| o The peer MUST reject any received synchronization message if M1 is | o The peer MUST silently drop any received synchronization message | |||
| lower than or equal to the highest value it has seen from the | if M1 is lower than or equal to the highest value it has seen from | |||
| cluster. This includes any previous received synchronization | the cluster. This includes any previous received synchronization | |||
| messages. | messages. | |||
| o M2 MUST be at least the higher of the received M1, and one more | o M2 MUST be at least the higher of the received M1, and one more | |||
| than the highest sender value received from the cluster. This | than the highest sender value received from the cluster. This | |||
| includes any previous received synchronization messages. | includes any previous received synchronization messages. | |||
| o P2 MUST be the higher of the received P1 value, and one more than | o P2 MUST be the higher of the received P1 value, and one more than | |||
| the highest sender value used by the peer. | the highest sender value used by the peer. | |||
| o The request contains a Nonce field. This field MUST be returned | o The request contains a Nonce field. This field MUST be returned | |||
| in the response, unchanged. A response MUST be silently dropped | in the response, unchanged. A response MUST be silently dropped | |||
| if the received Nonce does not match the one that was sent. | if the received Nonce does not match the one that was sent. | |||
| skipping to change at page 12, line 46 ¶ | skipping to change at page 13, line 46 ¶ | |||
| notification first (which might cause the entire message to be | notification first (which might cause the entire message to be | |||
| dropped as a replay). Then, it MUST increment the replay counters | dropped as a replay). Then, it MUST increment the replay counters | |||
| for all Child SAs associated with the current IKE SA by the amount | for all Child SAs associated with the current IKE SA by the amount | |||
| requested by the cluster member. | requested by the cluster member. | |||
| 6. IKEv2/IPsec Synchronization Notification Payloads | 6. IKEv2/IPsec Synchronization Notification Payloads | |||
| This section lists the new notification payload types defined by this | This section lists the new notification payload types defined by this | |||
| extension. | extension. | |||
| All multi-octet fields representing integers are laid out in big | ||||
| endian order (also known as "most significant byte first", or | ||||
| "network byte order"). | ||||
| 6.1. The IKEV2_MESSAGE_ID_SYNC_SUPPORTED Notification | 6.1. The IKEV2_MESSAGE_ID_SYNC_SUPPORTED Notification | |||
| This notification payload is included in the IKE_AUTH request/ | This notification payload is included in the IKE_AUTH request/ | |||
| response to indicate support of the IKEv2 Message ID synchronization | response to indicate support of the IKEv2 Message ID synchronization | |||
| mechanism described in this document. | mechanism described in this document. | |||
| 1 2 3 | 1 2 3 | |||
| 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| | Next Payload |C| RESERVED | Payload Length | | | Next Payload |C| RESERVED | Payload Length | | |||
| skipping to change at page 14, line 41 ¶ | skipping to change at page 15, line 49 ¶ | |||
| 6.4. The IPSEC_REPLAY_COUNTER_SYNC Notification | 6.4. The IPSEC_REPLAY_COUNTER_SYNC Notification | |||
| This notification payload type (value TBD by IANA) is defined to | This notification payload type (value TBD by IANA) is defined to | |||
| synchronize the IPsec SA Replay Counters between the newly-active | synchronize the IPsec SA Replay Counters between the newly-active | |||
| (formerly standby) cluster member and the peer. Since there may be | (formerly standby) cluster member and the peer. Since there may be | |||
| numerous IPsec SAs established under a single IKE SA, we do not | numerous IPsec SAs established under a single IKE SA, we do not | |||
| directly synchronize the value of each one. Instead, a delta value | directly synchronize the value of each one. Instead, a delta value | |||
| is sent and all Replay Counters for Child SAs of this IKE SA are | is sent and all Replay Counters for Child SAs of this IKE SA are | |||
| incremented by the same value. Note that this solution requires that | incremented by the same value. Note that this solution requires that | |||
| all these Child SAs either use or do not use Extended Sequence | either all Child SAs use Extended Sequence Numbers or else that no | |||
| Numbers [3]. This notification is only sent by the cluster. | Child SA uses Extended Sequence Numbers [3]. This notification is | |||
| only sent by the cluster. | ||||
| 1 2 3 | 1 2 3 | |||
| 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| | Next Payload |C| RESERVED | Payload Length | | | Next Payload |C| RESERVED | Payload Length | | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| |Protocol ID(=0)| SPI Size (=0) | Notify Message Type | | |Protocol ID(=0)| SPI Size (=0) | Notify Message Type | | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| | Incoming IPsec SA delta value | | | Incoming IPsec SA delta value | | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| The notification payload contains the following data. | The notification payload contains the following data. | |||
| o Incoming IPsec SA delta value (4 or 8 octets): The sender requests | o Incoming IPsec SA delta value (4 or 8 octets): The sender requests | |||
| that the peer should increment all the Child SA Replay Counters | that the peer should increment all the Child SA Replay Counters | |||
| for the sender's incoming (the peer's outgoing) traffic by this | for the sender's incoming (the peer's outgoing) traffic by this | |||
| skipping to change at page 15, line 40 ¶ | skipping to change at page 16, line 43 ¶ | |||
| The standby member can initiate the synchronization of IKEv2 Message | The standby member can initiate the synchronization of IKEv2 Message | |||
| IDs under different circumstances. | IDs under different circumstances. | |||
| o When it receives a problematic IKEv2/IPsec packet, i.e. a packet | o When it receives a problematic IKEv2/IPsec packet, i.e. a packet | |||
| outside its expected receive window. | outside its expected receive window. | |||
| o When it has to send the first IKEv2/IPsec packet after a failover | o When it has to send the first IKEv2/IPsec packet after a failover | |||
| event. | event. | |||
| o When it has just received control from the active member and | o When it has just received control from the active member and | |||
| wishes to update the values proactively, so that it need not start | wishes to update the values proactively, so that it need not start | |||
| this exchange later, when sending or receiving the request. | this exchange later, when sending or receiving the request. | |||
| To clarify the first alternative: the normal IKE behavior of | ||||
| rejecting out-of-window messages is not changed, but such messages | ||||
| can still be a valid trigger for the exchange defined in this | ||||
| document. To avoid DoS attacks resulting from replayed messages, the | ||||
| peer MUST NOT initiate counter synchronization for any particular IKE | ||||
| SA more than once per failover event. | ||||
| The standby member can initiate the synchronization of IPsec SA | The standby member can initiate the synchronization of IPsec SA | |||
| Replay Counters: | Replay Counters: | |||
| o If there has been traffic using the IPsec SA in the recent past | o If there has been traffic using the IPsec SA in the recent past | |||
| and the standby member suspects that its Replay Counter may be | and the standby member suspects that its Replay Counter may be | |||
| stale. | stale. | |||
| Since there can be a large number of sessions at the standby member, | Since there can be a large number of sessions at the standby member, | |||
| and sending synchronization exchanges for all of them may result in | and sending synchronization exchanges for all of them may result in | |||
| overload, the standby member can choose to initiate the exchange in a | overload, the standby member can choose to initiate the exchange in a | |||
| "lazy" fashion: only when it has to send or receive the request. In | "lazy" fashion: only when it has to send or expects to receive | |||
| general, the standby member is free to initiate this exchange at its | traffic from each peer. In general, the standby member is free to | |||
| discretion. | initiate this exchange at its discretion. Implementation | |||
| considerations include the ability to survive a certain amount of | ||||
| traffic loss, and the capacity of a cluster member to initiate | ||||
| counter synchronization simultaneously with a large number of peers. | ||||
| 8. IKE SA and IPsec SA Message Sequencing | 8. IKE SA and IPsec SA Message Sequencing | |||
| The straightforward definitions of message sequence numbers, | The straightforward definitions of message sequence numbers, | |||
| retransmissions and replay protection in IPsec and IKEv2 are strained | retransmissions and replay protection in IPsec and IKEv2 are strained | |||
| by the failover scenarios described in this document. This section | by the failover scenarios described in this document. This section | |||
| describes some policy choices that need to be made by implementations | describes some policy choices that need to be made by implementations | |||
| in this setting. | in this setting. | |||
| 8.1. Handling of Pending IKE Messages | 8.1. Handling of Pending IKE Messages | |||
| After sending its "receive" counter, the cluster member MUST reject | After sending its "receive" counter, the cluster member MUST reject | |||
| any incoming IKE messages that are outside its declared window. A | (silently drop) any incoming IKE messages that are outside its | |||
| similar rule applies to the peer. Local policies vary, and strict | declared window. A similar rule applies to the peer. Local policies | |||
| implementations will reject any incoming IKE message arriving before | vary, and strict implementations will reject any incoming IKE message | |||
| Message ID synchronization is complete. | arriving before Message ID synchronization is complete. | |||
| 8.2. Handling of Pending IPsec Messages | 8.2. Handling of Pending IPsec Messages | |||
| For IPsec, there is often a trade-off between security and | For IPsec, there is often a trade-off between security and | |||
| reliability of the protected protocols. Here again there is some | reliability of the protected protocols. Here again there is some | |||
| leeway for local policy. Some implementations might accept incoming | leeway for local policy. Some implementations might accept incoming | |||
| traffic that is outside the replay window for some time after the | traffic that is outside the replay window for some time after the | |||
| failover event. Strict implementations will only accept traffic | failover event, and until the counters have been synchronized. Strict | |||
| that's inside the "safe" window. | implementations will only accept traffic that's inside the "safe" | |||
| window. | ||||
| 8.3. IKE SA Inconsistencies | 8.3. IKE SA Inconsistencies | |||
| IKEv2 is normally a reliable protocol. As long as an IKE SA is | IKEv2 is normally a reliable protocol. As long as an IKE SA is | |||
| valid, both peers share a single, consistent view of the IKE SA and | valid, both peers share a single, consistent view of the IKE SA and | |||
| all associated Child SAs. Failover situations as described in this | all associated Child SAs. Failover situations as described in this | |||
| document may involve forced deletion of IKE messages, resulting in | document may involve forced deletion of IKE messages, resulting in | |||
| inconsistencies, such as Child SAs that exist on only one of the | inconsistencies, such as Child SAs that exist on only one of the | |||
| peers. Such SAs would cause an INVALID_SPI to be returned when used | peers. Such SAs might cause an INVALID_SPI to be returned when used | |||
| by that peer. | by that peer. Note that Sec. 1.5 of [2] allows but does not mandate | |||
| sending an INVALID_SPI notification in this case. | ||||
| The Working Group discussed at some point a proposed set of rules for | The Working Group discussed at some point a proposed set of rules for | |||
| dealing with such situations. However we believe that these | dealing with such situations. However we believe that these | |||
| situations should be rare in practice; as a result the "default" | situations should be rare in practice; as a result the "default" | |||
| behavior of tearing down the entire IKE SA is to be preferred over | behavior of tearing down the entire IKE SA is to be preferred over | |||
| the complexity of dealing with a multitude of edge cases. | the complexity of dealing with a multitude of edge cases. | |||
| 9. Step by Step Details | 9. Step by Step Details | |||
| This section goes through the sequence of steps of a typical failover | This section goes through the sequence of steps of a typical failover | |||
| skipping to change at page 17, line 31 ¶ | skipping to change at page 18, line 46 ¶ | |||
| 6, 7 but not for 3, then it should include the value 8 in its | 6, 7 but not for 3, then it should include the value 8 in its | |||
| EXPECTED_SEND_REQ_MESSAGE_ID payload and should not wait for a | EXPECTED_SEND_REQ_MESSAGE_ID payload and should not wait for a | |||
| response to message 3 anymore. | response to message 3 anymore. | |||
| o Similarly, the peer should also not wait for pending (incoming) | o Similarly, the peer should also not wait for pending (incoming) | |||
| requests. For example if the window size is 5 and the peer's | requests. For example if the window size is 5 and the peer's | |||
| window is 3-7 and if the peer has received requests 4, 5, 6, 7 but | window is 3-7 and if the peer has received requests 4, 5, 6, 7 but | |||
| not 3, then it should send the value 8 in the | not 3, then it should send the value 8 in the | |||
| EXPECTED_RECV_REQ_MESSAGE_ID payload, and should not expect to | EXPECTED_RECV_REQ_MESSAGE_ID payload, and should not expect to | |||
| receive message 3 anymore. | receive message 3 anymore. | |||
| 10. Interaction with other drafts | 10. Interaction with other specifications | |||
| The usage scenario of the IKEv2/IPsec SA counter synchronization | The usage scenario of this IKEv2/IPsec SA counter synchronization | |||
| proposal is that an IKEv2 SA has been established between the active | solution is that an IKEv2 SA has been established between the active | |||
| member of a hot-standby cluster and a peer, then a failover event | member of a hot-standby cluster and a peer, followed by a failover | |||
| occurred with the standby member becoming active. The proposal | event occurring and the standby member becoming active. The solution | |||
| further assumes that the IKEv2 SA state was continuously synchronized | further assumes that the IKEv2 SA state was continuously synchronized | |||
| between the active and standby members of the cluster before the | between the active and standby members of the cluster before the | |||
| failover event. | failover event. | |||
| o Session resumption [10] assumes that a peer (client or initiator) | o Session resumption [10] assumes that a peer (client or initiator) | |||
| detects the need to re-establish the session. In IKEv2/IPsec SA | detects the need to re-establish the session. In IKEv2/IPsec SA | |||
| counter synchronization, it is the newly-active member (a gateway | counter synchronization, it is the newly-active member (a gateway | |||
| or responder) that detects the need to synchronize the SA counter | or responder) that detects the need to synchronize the SA counter | |||
| after the failover event. Also in a hot-standby cluster, the peer | after the failover event. Also in a hot-standby cluster, the peer | |||
| establishes the IKEv2/IPsec session with a single IP address that | establishes the IKEv2/IPsec session with a single IP address that | |||
| represents the whole cluster, so the peer normally does not detect | represents the whole cluster, so the peer normally does not detect | |||
| the event of failover in the cluster unless the standby member | the event of failover in the cluster unless the standby member | |||
| takes too long to become active and the IKEv2 SA times out by use | takes too long to become active and the IKEv2 SA times out by use | |||
| of the IKEv2 liveness check mechanism. To conclude, session | of the IKEv2 liveness check mechanism. To conclude, session | |||
| resumption and SA counter synchronization after failover are | resumption and SA counter synchronization after failover are | |||
| mutually exclusive. | mutually exclusive: they are not expected to be used together, and | |||
| both features can coexist within the same implementation without | ||||
| affecting each other. | ||||
| o The IKEv2 Redirect mechanism for load-balancing [11] can be used | o The IKEv2 Redirect mechanism for load-balancing [11] can be used | |||
| either during the initial stages of SA setup (the IKE_SA_INIT and | either during the initial stages of SA setup (the IKE_SA_INIT and | |||
| IKE_AUTH exchanges) or after session establishment. SA counter | IKE_AUTH exchanges) or after session establishment. SA counter | |||
| synchronization is only useful after the IKE SA has been | synchronization is only useful after the IKE SA has been | |||
| established and a failover event has occurred. So, unlike | established and a failover event has occurred. So, unlike | |||
| Redirect, it is irrelevant during the first two exchanges. | Redirect, it is irrelevant during the first two exchanges. | |||
| Redirect after the session has been established is mostly useful | Redirect after the session has been established is mostly useful | |||
| for timed or planned shutdown/maintenance. A real failover event | for timed or planned shutdown/maintenance. A real failover event | |||
| cannot be detected by the active member ahead of time, and so | cannot be detected by the active member ahead of time, and so | |||
| using Redirect after session establishment is not possible in the | using Redirect after session establishment is not possible in the | |||
| case of failover. So, Redirect and SA counter synchronization | case of failover. So, Redirect and SA counter synchronization | |||
| after failover are mutually exclusive. | after failover are mutually exclusive, in the sense described | |||
| above. | ||||
| o IKEv2 Failure Detection [6] solves a similar problem where the | o IKEv2 Failure Detection [6] solves a similar problem where the | |||
| peer can rapidly detect that a cluster member has crashed based on | peer can rapidly detect that a cluster member has crashed based on | |||
| a token. It is unrelated to the current scenario because the goal | a token. It is unrelated to the current scenario because the goal | |||
| in failover is for the peer not to notice that a failure has | in failover is for the peer not to notice that a failure has | |||
| occurred. | occurred. | |||
| 11. Security Considerations | 11. Security Considerations | |||
| Since Message ID synchronization messages need to be sent with | Since Message ID synchronization messages need to be sent with | |||
| Message ID zero, they are potentially vulnerable to replay attacks. | Message ID zero, they are potentially vulnerable to replay attacks. | |||
| skipping to change at page 18, line 40 ¶ | skipping to change at page 20, line 12 ¶ | |||
| the requirement that the Send counter sent by the cluster member | the requirement that the Send counter sent by the cluster member | |||
| should always be monotonically increasing, a rule that the peer | should always be monotonically increasing, a rule that the peer | |||
| enforces by silently dropping messages that contradict it. | enforces by silently dropping messages that contradict it. | |||
| o Replay of the Message ID synchronization response: This is | o Replay of the Message ID synchronization response: This is | |||
| countered by sending the nonce data along with the synchronization | countered by sending the nonce data along with the synchronization | |||
| payload. The same nonce data has to be returned in the response. | payload. The same nonce data has to be returned in the response. | |||
| Thus the standby member will accept a reply only for the current | Thus the standby member will accept a reply only for the current | |||
| request. After it receives a valid response, it MUST NOT process | request. After it receives a valid response, it MUST NOT process | |||
| the same response again and MUST discard any additional responses. | the same response again and MUST discard any additional responses. | |||
| As mentioned in Section 7, triggering counter synchronization by out- | ||||
| of-window, potentially replayed messages, could open a DoS | ||||
| vulnerability. This risk is mitigated by the solution described in | ||||
| that section. | ||||
| 12. IANA Considerations | 12. IANA Considerations | |||
| This document introduces four new IKEv2 Notification Message types as | This document introduces four new IKEv2 Notification Message types as | |||
| described in Section 6. The new Notify Message Types must be | described in Section 6. The new Notify Message Types must be | |||
| assigned values between 16396 and 40959. | assigned values between 16396 and 40959. | |||
| +-------------------------------------+-------------+ | +-------------------------------------+-------------+ | |||
| | Name | Value | | | Name | Value | | |||
| +-------------------------------------+-------------+ | +-------------------------------------+-------------+ | |||
| | IKEV2_MESSAGE_ID_SYNC_SUPPORTED | TBD by IANA | | | IKEV2_MESSAGE_ID_SYNC_SUPPORTED | TBD by IANA | | |||
| skipping to change at page 19, line 31 ¶ | skipping to change at page 21, line 5 ¶ | |||
| order) for their review comments and valuable suggestions: Dan | order) for their review comments and valuable suggestions: Dan | |||
| Harkins, Paul Hoffman, Steve Kent, Tero Kivinen, David McGrew, and | Harkins, Paul Hoffman, Steve Kent, Tero Kivinen, David McGrew, and | |||
| Pekka Riikonen. | Pekka Riikonen. | |||
| 14. Change Log | 14. Change Log | |||
| This section lists all the changes in this document. | This section lists all the changes in this document. | |||
| NOTE TO RFC EDITOR: Please remove this section before publication. | NOTE TO RFC EDITOR: Please remove this section before publication. | |||
| 14.1. Draft -04 | 14.1. Draft -06 | |||
| Applied multiple review comments, from Pekka Riikonen, Alexey | ||||
| Melnikov, Stephen Farrell, Robert Sparks, Pete Resnick, Russ Housley | ||||
| and Adrian Farrel. Added an architectural reference diagram. Added | ||||
| a MUST requirement for cluster members to share peers' support of | ||||
| this protocol, which had been implicit in previous versions. | ||||
| 14.2. Draft -05 | ||||
| Applied Sean Turner's review comments. | ||||
| 14.3. Draft -04 | ||||
| Extended Sec. 3 for better coverage of other IPsec cluster-related | Extended Sec. 3 for better coverage of other IPsec cluster-related | |||
| issues, and how they are resolved within the existing standards. | issues, and how they are resolved within the existing standards. | |||
| 14.2. Draft -03 | 14.4. Draft -03 | |||
| Clarified the rules for Message ID sync, so that replay attacks can | Clarified the rules for Message ID sync, so that replay attacks can | |||
| be avoided without a failover counter. | be avoided without a failover counter. | |||
| Added wording regarding inconsistent IKE state (basically choosing to | Added wording regarding inconsistent IKE state (basically choosing to | |||
| ignore the problem) and further rules dealing with pending traffic. | ignore the problem) and further rules dealing with pending traffic. | |||
| The IPsec replay counter delta value now refers to incoming traffic. | The IPsec replay counter delta value now refers to incoming traffic. | |||
| The associated notification is only sent from the cluster to the | The associated notification is only sent from the cluster to the | |||
| peer, and not back. | peer, and not back. | |||
| 14.3. Draft -02 | 14.5. Draft -02 | |||
| Addressed comments by Yaron Sheffer posted on the WG mailing list. | Addressed comments by Yaron Sheffer posted on the WG mailing list. | |||
| Numerous editorial changes. | Numerous editorial changes. | |||
| 14.4. Draft -01 | 14.6. Draft -01 | |||
| Added "Multiple and Simultaneous failover' scenarios as pointed out | Added "Multiple and Simultaneous failover" scenarios as pointed out | |||
| by Pekka Riikonen. | by Pekka Riikonen. | |||
| The document now provides a mechanism to sync either IKEv2 message or | The document now provides a mechanism to sync either IKEv2 message or | |||
| IPsec replay counter or both, to cater to different types of | IPsec replay counter or both, to cater to different types of | |||
| implementations. | implementations. | |||
| HA cluster's "failover count" is used to counter replay of sync | HA cluster's "failover count" is used to counter replay of sync | |||
| requests by an attacker. | requests by an attacker. | |||
| The sync of IPsec SA replay counter is optimized to have just one | The sync of IPsec SA replay counter is optimized to have just one | |||
| global bumped-up outgoing IPsec SA counter of ALL Child SAs under an | global bumped-up outgoing IPsec SA counter of ALL Child SAs under an | |||
| IKEv2 SA. | IKEv2 SA. | |||
| The examples added for IKEv2 Message ID sync to provide more clarity. | The examples added for IKEv2 Message ID sync to provide more clarity. | |||
| Some edits as per comments on mailing list to enhance clarity. | Some edits as per comments on mailing list to enhance clarity. | |||
| 14.5. Draft -00 | 14.7. Draft -00 | |||
| Version 00 is identical to | Version 00 is identical to | |||
| draft-kagarigi-ipsecme-ikev2-windowsync-04, started as WG document. | draft-kagarigi-ipsecme-ikev2-windowsync-04, started as WG document. | |||
| Added IPSECME WG HA design team members as authors. | Added IPSECME WG HA design team members as authors. | |||
| Added comment in Introduction to discuss the window sync process on | Added comment in Introduction to discuss the window sync process on | |||
| WG mailing list to solve some concerns. | WG mailing list to solve some concerns. | |||
| 15. References | 15. References | |||
| skipping to change at page 21, line 16 ¶ | skipping to change at page 22, line 44 ¶ | |||
| 15.2. Informative References | 15.2. Informative References | |||
| [4] Nir, Y., "IPsec Cluster Problem Statement", RFC 6027, | [4] Nir, Y., "IPsec Cluster Problem Statement", RFC 6027, | |||
| October 2010. | October 2010. | |||
| [5] Nadas, S., "Virtual Router Redundancy Protocol (VRRP) Version 3 | [5] Nadas, S., "Virtual Router Redundancy Protocol (VRRP) Version 3 | |||
| for IPv4 and IPv6", RFC 5798, March 2010. | for IPv4 and IPv6", RFC 5798, March 2010. | |||
| [6] Nir, Y., Wierbowski, D., Detienne, F., and P. Sethi, "A Quick | [6] Nir, Y., Wierbowski, D., Detienne, F., and P. Sethi, "A Quick | |||
| Crash Detection Method for IKE", | Crash Detection Method for IKE", | |||
| draft-ietf-ipsecme-failure-detection-07 (work in progress), | draft-ietf-ipsecme-failure-detection-08 (work in progress), | |||
| March 2011. | April 2011. | |||
| [7] Housley, R., "Using Advanced Encryption Standard (AES) Counter | [7] Housley, R., "Using Advanced Encryption Standard (AES) Counter | |||
| Mode With IPsec Encapsulating Security Payload (ESP)", | Mode With IPsec Encapsulating Security Payload (ESP)", | |||
| RFC 3686, January 2004. | RFC 3686, January 2004. | |||
| [8] Viega, J. and D. McGrew, "The Use of Galois/Counter Mode (GCM) | [8] Viega, J. and D. McGrew, "The Use of Galois/Counter Mode (GCM) | |||
| in IPsec Encapsulating Security Payload (ESP)", RFC 4106, | in IPsec Encapsulating Security Payload (ESP)", RFC 4106, | |||
| June 2005. | June 2005. | |||
| [9] McGrew, D. and B. Weis, "Using Counter Modes with Encapsulating | [9] McGrew, D. and B. Weis, "Using Counter Modes with Encapsulating | |||
| skipping to change at page 21, line 43 ¶ | skipping to change at page 23, line 25 ¶ | |||
| [11] Devarapalli, V. and K. Weniger, "Redirect Mechanism for the | [11] Devarapalli, V. and K. Weniger, "Redirect Mechanism for the | |||
| Internet Key Exchange Protocol Version 2 (IKEv2)", RFC 5685, | Internet Key Exchange Protocol Version 2 (IKEv2)", RFC 5685, | |||
| November 2009. | November 2009. | |||
| Appendix A. IKEv2 Message ID Sync Examples | Appendix A. IKEv2 Message ID Sync Examples | |||
| This (non-normative) section presents some examples that illustrate | This (non-normative) section presents some examples that illustrate | |||
| how the IKEv2 Message ID values are synchronized. We use a tuple | how the IKEv2 Message ID values are synchronized. We use a tuple | |||
| notation, denoting the two counters EXPECTED_SEND_REQ_MESSAGE_ID and | notation, denoting the two counters EXPECTED_SEND_REQ_MESSAGE_ID and | |||
| EXPECTED_RECV_REQ_MESSAGE_ID on a member as | EXPECTED_RECV_REQ_MESSAGE_ID on each protocol party as | |||
| (EXPECTED_SEND_REQ_MESSAGE_ID, EXPECTED_RECV_REQ_MESSAGE_ID). | (EXPECTED_SEND_REQ_MESSAGE_ID, EXPECTED_RECV_REQ_MESSAGE_ID). | |||
| Note that if the IKE message counters are already synchronized (as in | ||||
| the first example), we expect the numbers to be reversed between the | ||||
| two sides. If one protocol party intends to send the next request as | ||||
| 4, then the other expects the next received request to be 4. | ||||
| A.1. Normal Failover - Example 1 | A.1. Normal Failover - Example 1 | |||
| Standby (Newly Active) Member Peer | Standby (Newly Active) Member Peer | |||
| - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - | - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - | |||
| Sync Request (0, 5) --------> | ||||
| Peer has the values (5, 0) so it sends | ||||
| <------------- (5, 0) as the Sync Response | ||||
| In this example, the peer has most recently sent an IKE request with | ||||
| Message ID 4, and has never received a request. So the peer's | ||||
| expected values for the next pair of messages are (5, 0). These are | ||||
| the same values as received from the member and therefore they are | ||||
| sent as-is. | ||||
| A.2. Normal Failover - Example 2 | ||||
| Standby (Newly Active) Member Peer | ||||
| - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - | ||||
| Sync Request (2, 3) --------> | Sync Request (2, 3) --------> | |||
| Peer has the values (4, 5) so it sends | Peer has the values (4, 5) so it sends | |||
| <------------- (4, 5) as the Sync Response | <------------- (4, 5) as the Sync Response | |||
| A.2. Normal Failover - Example 2 | In this example, the peer has most recently sent an IKE message with | |||
| the Message ID 3, and received one with ID 4. So the peer's expected | ||||
| values for the next pair of messages are (4, 5). These are both | ||||
| higher than the corresponding values just received from the member | ||||
| (the order of tuple members is reversed when doing this comparison!), | ||||
| and therefore they are sent as-is. | ||||
| A.3. Normal Failover - Example 3 | ||||
| Standby (Newly Active) Member Peer | Standby (Newly Active) Member Peer | |||
| - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - | - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - | |||
| Sync Request (2, 5) --------> | Sync Request (2, 5) --------> | |||
| Peer has the values (2, 4) so it sends | Peer has the values (2, 4) so it sends | |||
| <-------------(5, 4) as the Sync Response | <-------------(5, 4) as the Sync Response | |||
| A.3. Simultaneous Failover | In this example, the newly active member expects to send the next IKE | |||
| message with ID 2. It sends an expected receive value of 5, which is | ||||
| higher than the last ID value it has seen from the peer, because it | ||||
| believes some incoming messages may have been lost. The peer has | ||||
| last sent a message with ID 1, and received one with ID 3, indicating | ||||
| | that a couple of messages sent by the previously active member | |||
| had not been synchronized into the other member. So the peer's next | ||||
| expected (send, receive) values are (2, 4). The peer replies with | ||||
| the maximum of the received and the expected value for both send and | ||||
| receive counters: (max(2, 5), max(4, 2)) = (5, 4). | ||||
| In the case of simultaneous failover, both sides send the | A.4. Simultaneous Failover | |||
| synchronization request, but whichever side has the higher value will | ||||
| be eventually synchronized. | In the case of simultaneous failover, both sides send their | |||
| synchronization requests simultaneously. The eventual outcome of | ||||
| synchronization consists of the higher counter values. This is | ||||
| demonstrated in the following figure. | ||||
| Standby (Newly Active) Member Peer | Standby (Newly Active) Member Peer | |||
| - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - | - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - | |||
| Sync Request (4,4) -----> | Sync Request (4,4) -----> | |||
| <-------------- Sync Request (5,5) | <-------------- Sync Request (5,5) | |||
| Sync Response (5,5) ----> | Sync Response (5,5) ----> | |||
| skipping to change at page 23, line 32 ¶ | skipping to change at page 26, line 4 ¶ | |||
| Phone: +91 80 4426 4831 | Phone: +91 80 4426 4831 | |||
| Email: kagarigi@cisco.com | Email: kagarigi@cisco.com | |||
| Yoav Nir | Yoav Nir | |||
| Check Point Software Technologies Ltd. | Check Point Software Technologies Ltd. | |||
| 5 Hasolelim St. | 5 Hasolelim St. | |||
| Tel Aviv 67897 | Tel Aviv 67897 | |||
| Israel | Israel | |||
| Email: ynir@checkpoint.com | Email: ynir@checkpoint.com | |||
| Yaron Sheffer | Yaron Sheffer | |||
| Independent | Porticor Cloud Security | |||
| Email: yaronf.ietf@gmail.com | Email: yaronf.ietf@gmail.com | |||
| Dacheng Zhang | Dacheng Zhang | |||
| Huawei Technologies Ltd. | Huawei Technologies Ltd. | |||
| Email: zhangdacheng@huawei.com | Email: zhangdacheng@huawei.com | |||
| End of changes. 51 change blocks. | ||||
| 111 lines changed or deleted | 236 lines changed or added | |||
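
The max rule used in the Appendix A examples can be sketched in Python as follows. This is an illustrative sketch only and is not text from either draft revision: the function name `sync_response` and the `Counters` alias are hypothetical. Tuples follow the appendix's (EXPECTED_SEND_REQ_MESSAGE_ID, EXPECTED_RECV_REQ_MESSAGE_ID) order, and the sketch assumes the responder compares its own send counter with the requester's receive counter and vice versa, as the reversed-order remark in Example 2 describes.

```python
from typing import Tuple

# (EXPECTED_SEND_REQ_MESSAGE_ID, EXPECTED_RECV_REQ_MESSAGE_ID) as seen by one party.
# The alias name is an assumption made for this sketch.
Counters = Tuple[int, int]


def sync_response(local: Counters, received: Counters) -> Counters:
    """Return the counters a responder would send back in a Sync Response.

    `local` holds the responder's own (send, receive) expectations and
    `received` holds the requester's (send, receive) values from the
    Sync Request. The two sides view the counters in reversed order, so
    the requester's receive value is compared with the local send value,
    and the requester's send value with the local receive value.
    """
    local_send, local_recv = local
    remote_send, remote_recv = received
    return (max(local_send, remote_recv), max(local_recv, remote_send))
```

With the values from Example 3, `sync_response((2, 4), (2, 5))` evaluates to `(5, 4)`, matching the Sync Response shown there. The same function reproduces the responses in Examples 1 and 2 and the Sync Response in the simultaneous failover figure.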