Network Working Group                                      R. Singh, Ed.
Internet-Draft                                                G. Kalyani
Intended status: Standards Track                                   Cisco
Expires: March 10, 2011                                           Y. Nir
                                                             Check Point
                                                                D. Zhang
                                                                  Huawei
                                                       September 6, 2010


Protocol Support for High Availability IKEv2/IPsec
draft-ietf-ipsecme-ipsecha-protocol-00

Abstract

The IKEv2 and IPsec protocols are widely used for deploying VPNs. To make such VPNs highly available and resilient to failure, they are implemented as IKEv2/IPsec Highly Available (HA) clusters. Such clusters raise a number of issues, all of which are enumerated in the draft "IPsec Cluster Problem Statement".

This draft proposes an extension to the IKEv2 protocol that solves the main issues of the "IPsec Cluster Problem Statement" in a Hot Standby cluster, and gives implementation advice for the other issues. The main issues to be solved are IKE Message ID synchronization and IPsec SA replay counter synchronization.

Status of this Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as “work in progress.”

This Internet-Draft will expire on March 10, 2011.

Copyright Notice

Copyright (c) 2010 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.



Table of Contents

1.  Introduction
2.  Terminology
3.  Issues solved from IPsec Cluster Problem Statement
4.  IKEv2/IPsec SA Counter Synchronization Problem
5.  IKEv2/IPsec SA Counter Synchronization Solution
6.  SA counter synchronization notify and payload types
    6.1.  SYNC_SA_COUNTER_INFO_SUPPORTED
    6.2.  SYNC_SA_COUNTER_INFO
7.  Details of implementation
8.  Step-by-Step details
9.  Security Considerations
10.  Interaction with other drafts
11.  IANA Considerations
12.  Acknowledgements
13.  Change Log
    13.1.  Draft -00
14.  References
    14.1.  Normative References
    14.2.  Informative References
§  Authors' Addresses





1.  Introduction

IKEv2 is used for deploying IPsec-based VPNs. To make such VPNs highly available and resilient to failure, they are implemented as IKEv2/IPsec Highly Available (HA) clusters. Such clusters raise a number of issues, all of which are enumerated in the draft "IPsec Cluster Problem Statement".

In a Hot Standby cluster implementation of IKEv2/IPsec-based VPNs, the IKEv2/IPsec session is established between the peer and the active member of the cluster. The active member then syncs the IKE/IPsec SA state to the standby member of the cluster. This primary SA state sync is done on SA bring-up and/or rekey. Synchronizing SA state between the active and standby members for every IKE and IPsec message is very costly, so it is normally done periodically. When a "failover" event happens in the cluster, the standby member must first detect the failover and then become the active member, which takes considerable time. During this interval the peer is unaware of the failover and keeps sending IKE requests and IPsec packets to the cluster, as permitted by the IKEv2 and IPsec windowing features. Once it comes up, the newly active member finds a mismatch in the IKE message IDs and IPsec replay counters. Please see Section 4 (IKEv2/IPsec SA Counter Synchronization Problem) for more details.

This draft proposes an extension to the IKEv2 protocol that solves the main issues, IKE message ID synchronization and IPsec SA replay counter synchronization, and gives implementation advice for the others. The solutions provided in this draft are summarized below:

IKE Message ID synchronization: The newly active cluster member obtains the message ID values from the peer after the failover and updates its own values accordingly.

IPsec SA Counter synchronization: The newly active cluster member sends incremented replay counter values to the peer as the expected replay counter values.

Although this draft describes IKEv2/IPsec SA counter synchronization in the context of a hot standby cluster, the solution can be used in other scenarios where IKEv2/IPsec SA counters are mismatched and counter sync is needed.

Some concerns were raised about the current window sync process, in particular a request to make IKEv2 window sync optional. We believe IKEv2 window sync will be mandatory.

[[ This topic needs to be discussed further on the WG mailing list. ]]




2.  Terminology

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].

"SA Counter SYNC Request" is the INFORMATIONAL exchange request defined in this draft to synchronize the IKEv2/IPsec SA counter information between a member of the cluster and the peer.

"SA Counter SYNC Response" is the INFORMATIONAL exchange response defined in this draft to synchronize the IKEv2/IPsec SA counter information between a member of the cluster and the peer.

The terms below are taken from [IPsec Cluster Problem Statement], with added information in the context of this draft.

"Hot Standby Cluster", or "HS Cluster" is a cluster where only one of the members is active at any one time. This member is also referred to as the "active", whereas the other(s) are referred to as "standbys". VRRP ([RFC5798]) is one method of building such a cluster. The goal of a Hot Standby Cluster is to create the illusion of a single virtual gateway to the peer(s).

"Active Member" is the primary member in the Hot Standby cluster. It is responsible for forwarding packets for the virtual gateway.

"Standby Member" is the primary backup router. This member takes control, i.e. becomes the active member, after the "failover" event.

"Peer" is the IKEv2/IPsec endpoint that establishes a VPN connection with the Hot Standby cluster. The Peer knows the Hot Standby Cluster by the cluster's single IP address. In case of "failover", the standby member of the cluster becomes active, so the peer normally doesn't notice that "failover" has occurred in the cluster.

The generic term IKEv2/IPsec SA counters is used throughout. IKEv2 SA counter stands for the IKEv2 message IDs, and IPsec SA counter stands for the IPsec SA replay counters, which are used to provide the optional anti-replay feature.




3.  Issues solved from IPsec Cluster Problem Statement

The IPsec Cluster Problem Statement defines the problems encountered in IPsec clusters. The problems, along with their section names as given in the statement, are as follows.

This draft solves the main issues using the protocol extension, and provides implementation advice for the other issues, as follows.




4.  IKEv2/IPsec SA Counter Synchronization Problem

IKEv2 RFC states that "An IKE endpoint MUST NOT exceed the peer's stated window size for transmitted IKE requests".

As per the protocol, all IKEv2 packets follow a request-response paradigm. The initiator of an IKEv2 request MUST retransmit the request until it has received a response from the peer. IKEv2 introduces a windowing mechanism that allows multiple requests to be outstanding at a given point in time, but mandates that the sender window does not move until the oldest message sent from one peer to another is acknowledged. Loss of even a single packet leads to repeated retransmissions, followed by an IKEv2 SA teardown if the retransmissions are unacknowledged.
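As an illustration only, the sender side of the windowing rule described above might be modelled as follows. This is a minimal sketch, not part of the protocol: the class name RequestWindow and its methods are hypothetical, and retransmission timers are left out.

```python
class RequestWindow:
    """Sender-side view of the IKEv2 request window (illustrative only)."""

    def __init__(self, peer_window_size):
        self.peer_window_size = peer_window_size
        self.next_message_id = 0    # ID to use for the next new request
        self.outstanding = set()    # requests sent but not yet answered

    def can_send(self):
        # The window is anchored at the oldest unanswered request and
        # does not move until that request is acknowledged.
        oldest = min(self.outstanding, default=self.next_message_id)
        return self.next_message_id < oldest + self.peer_window_size

    def send(self):
        if not self.can_send():
            raise RuntimeError("would exceed the peer's stated window size")
        message_id = self.next_message_id
        self.outstanding.add(message_id)
        self.next_message_id += 1
        return message_id

    def on_response(self, message_id):
        self.outstanding.discard(message_id)


window = RequestWindow(peer_window_size=3)
sent = [window.send() for _ in range(3)]   # IDs 0, 1 and 2 are outstanding
window.on_response(1)                      # a later request is answered first
blocked = not window.can_send()            # ID 0 is still unanswered
window.on_response(0)                      # the oldest request is answered
unblocked = window.can_send()              # the window can move forward
```

In this model a response to a later request does not free the window; only the acknowledgement of the oldest outstanding request does, which is why a single lost packet can stall the sender.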

An IPsec Hot Standby Cluster is required to ensure that, in case of failover of the active member, the standby member becomes active immediately. The standby member is expected to have the exact message ID values that the active member had before the failover. Even with the best efforts to update the message ID values from the active to the standby member, the values at the standby member can be stale for the following reasons:

When a standby member takes over as the active member, it starts its message ID ranges from the previously synced values. Because those values are stale, it rejects requests from the peer. As a sender, the standby member may reuse a stale message ID, which causes the peer to drop the request. There is therefore a high probability that the IKEv2 SA and its corresponding IPsec SAs are torn down simply because of a transitory message ID mismatch and retransmission of requests. This is not desirable in an HA deployment. Even when the standby member is updated periodically, the cluster can lose the IKE SA, and with it all IPsec SAs, because of a message ID (i.e. SA counter) mismatch.

A similar issue is observed with IPsec counters if anti-replay protection/ESN is implemented. Even with the best efforts to sync the ESP and AH SA counter numbers from the active to the standby member, there is a chance that the standby member has stale counter values. The standby member would then send packets with stale counter numbers. The peer would reject such packets, since anti-replay protection does not allow duplicate use of counter values. For IPsec it is acceptable to skip some counter values and start with higher counter values.

Hence a mechanism is required in HA to ensure that the standby member has correct message ID values and IPsec counters, so that sessions are not torn down merely because of mismatched window ranges.
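The IPsec side of the problem, and the observation that skipping counter values is acceptable, can be sketched as below. This is a simplified model under stated assumptions: the function names, the skip_ahead margin and the example numbers are hypothetical, and the receiver check ignores the sliding window that a real anti-replay implementation keeps.

```python
def resume_outbound_counter(last_synced, skip_ahead, esn=False):
    """Pick the first outbound sequence number to use after failover.

    last_synced is the last value the standby learned from the active
    member; skip_ahead is a hypothetical margin meant to cover packets
    the active member sent after that sync.
    """
    limit = 2**64 - 1 if esn else 2**32 - 1
    resumed = last_synced + skip_ahead
    if resumed > limit:
        raise OverflowError("sequence number space exhausted; rekey the SA")
    return resumed


def peer_accepts(highest_seen_by_peer, sequence_number):
    # Simplified anti-replay check: only values above the highest one
    # already seen are accepted.
    return sequence_number > highest_seen_by_peer


last_synced = 1000           # value held by the standby member (stale)
highest_seen_by_peer = 1500  # the active member kept sending after the sync

stale_ok = peer_accepts(highest_seen_by_peer, last_synced + 1)
resumed = resume_outbound_counter(last_synced, skip_ahead=10000)
resumed_ok = peer_accepts(highest_seen_by_peer, resumed)
```

Here the packet sent with the stale counter is rejected, while the packet sent after skipping ahead is accepted. How large the margin needs to be depends on the sync interval and traffic rate, which this sketch does not try to estimate.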




5.  IKEv2/IPsec SA Counter Synchronization Solution

Once the standby member becomes the active member following a failover event in the cluster, it sends an authenticated IKEv2 request asking the peer for its SA counter values.

The standby member then updates its own SA counter values and starts sending and receiving requests.

The peer MUST negotiate its ability to support SA counter synchronization information with the active member by sending the SYNC_SA_COUNTER_INFO_SUPPORTED notification in the IKE_AUTH exchange.


Peer                                                  Active Member
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HDR, SK {IDi, [CERT], [CERTREQ], [IDr], AUTH,
  N[SYNC_SA_COUNTER_INFO_SUPPORTED], SAi2, TSi, TSr} ---------->

<---------- HDR, SK {IDr, [CERT+], [CERTREQ+], AUTH,
                  N[SYNC_SA_COUNTER_INFO_SUPPORTED], SAr2, TSi, TSr}

When the peer and the active member both support SA counter synchronization, the active member MUST sync the SA counter synchronization capability to the standby member after the establishment of the IKE SA, so that the standby member is aware of the capability and can use it when it becomes the active member after a failover event.

After a failover event, when the standby member becomes the active member, it has to ask the peer for the SA counters. The standby member initiates the SYNC Request as an INFORMATIONAL exchange containing the notify SYNC_SA_COUNTER_INFO. The SYNC_SA_COUNTER_INFO information can be used to update the IKEv2 counters, i.e. message IDs, and also the IPsec SA replay counters.

If there are many IPsec SAs and their counters cannot all be synchronized in a single counter sync exchange, another counter sync exchange SHOULD be sent for the remaining IPsec SAs. The message ID of this further exchange is the IKE message ID as synchronized by the first counter sync exchange, NOT zero.

The peer responds with the notify SYNC_SA_COUNTER_INFO. The SYNC_SA_COUNTER_INFO request contains nonce data to avoid a DoS attack through replay of an SA counter sync response. The nonce data sent in the SYNC_SA_COUNTER_INFO response MUST match the nonce data sent by the newly active member in the SYNC_SA_COUNTER_INFO request. If the nonce data received in the response does not match the nonce data sent in the request, the standby (i.e. newly active) member MUST discard the response, and the normal IKEv2 behaviour of retransmitting the request and waiting for a genuine reply from the peer SHOULD follow before the SA is torn down because of retransmissions.


Standby [Newly Active] Member                            Peer
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HDR, SK {N[SYNC_SA_COUNTER_INFO]+} -------->

             <--------- HDR, SK {N[SYNC_SA_COUNTER_INFO]+}
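The nonce check applied to the exchange above could look roughly like this. It is a sketch under assumptions: the helper names are hypothetical, and the 16-octet nonce length is chosen for illustration only since this document does not state a length here.

```python
import hmac
import secrets


def new_sync_nonce(length=16):
    # Fresh random nonce data for one SYNC_SA_COUNTER_INFO request.
    # The default length is an assumption made for this example.
    return secrets.token_bytes(length)


def response_matches_request(request_nonce, response_nonce):
    # Constant-time comparison of the nonce echoed by the peer.
    return hmac.compare_digest(request_nonce, response_nonce)


request_nonce = new_sync_nonce()

# A genuine response echoes the nonce data from the request.
genuine = response_matches_request(request_nonce, request_nonce)

# A replayed response from an earlier exchange carries different nonce
# data and is discarded; the request is then retransmitted as usual.
old_nonce = bytes(b ^ 0xFF for b in request_nonce)
replayed = response_matches_request(request_nonce, old_nonce)
```

A response that fails this check is simply dropped; the newly active member keeps waiting for a matching reply under the normal IKEv2 retransmission rules.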




6.  SA counter synchronization notify and payload types

The new notify and payload types defined by this document are described below.




6.1.  SYNC_SA_COUNTER_INFO_SUPPORTED

SYNC_SA_COUNTER_INFO_SUPPORTED: This notify is included in the IKE_AUTH request by the peer to indicate the support for IKEv2/IPsec SA counter synchronization mechanism described in this document.



                     1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Next Payload  |C|  RESERVED   |         Payload Length        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Protocol ID(=0)| SPI Size (=0) |      Notify Message Type      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

 SYNC_SA_COUNTER_INFO_SUPPORTED 

The 'Next Payload', 'Payload Length', 'Protocol ID', 'SPI Size', and 'Notify Message Type' fields are the same as described in Section 3 of [IKEv2bis]. The 'SPI Size' field MUST be set to 0 to indicate that the SPI is not present in this message. The 'Protocol ID' MUST be set to 0, since the notification is not specific to a particular security association. The 'Payload Length' field is set to the length in octets of the entire payload, including the generic payload header. The 'Notify Message Type' field is set to indicate the SYNC_SA_COUNTER_INFO_SUPPORTED payload.
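The field settings above can be illustrated by packing the eight octets of the notify. This is a sketch only: the function name is hypothetical, the critical bit is assumed to be clear, and PLACEHOLDER_NOTIFY_TYPE is not an assigned value (the actual Notify Message Type is left to IANA).

```python
import struct

# Hypothetical stand-in: no Notify Message Type value is assigned in
# this document.
PLACEHOLDER_NOTIFY_TYPE = 40959


def build_supported_notify(next_payload=0,
                           notify_type=PLACEHOLDER_NOTIFY_TYPE):
    flags = 0            # critical bit assumed clear, RESERVED bits zero
    protocol_id = 0      # not specific to a particular SA
    spi_size = 0         # no SPI present
    payload_length = 8   # generic header (4 octets) + notify fields (4)
    # Network byte order: two 1-octet fields, a 2-octet length, two
    # 1-octet fields, and the 2-octet Notify Message Type.
    return struct.pack("!BBHBBH", next_payload, flags, payload_length,
                       protocol_id, spi_size, notify_type)


notify = build_supported_notify()
```

Since there is no SPI and no notification data, the Payload Length is always 8 in this sketch.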




6.2.  SYNC_SA_COUNTER_INFO

SYNC_SA_COUNTER_INFO: This payload type is defined to sync the SA counter information between the newly active (standby) member and the peer. The SYNC_SA_COUNTER_INFO payload can be used to synchronize both the IKE SA counter and the IPsec SA counters. Multiple payloads of this type can therefore be used in a single exchange, where one payload syncs the IKE SA counter information and another syncs the Child SA (e.g. ESP, AH) information.



                     1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Next Payload  |M|  RESERVED   |         Payload Length        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Protocol ID    | SPI Size      | # of SPI's    |Counter Size   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
~                                                               ~
|                                                               |
~                     Nonce Data                                ~
|                                                               |
~                                                               ~
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|             EXPECTED_SEND_REQ_MESSAGE_ID                      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|             EXPECTED_RECV_REQ_MESSAGE_ID                      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                            SPI                                |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
~            Last Counter                                       ~
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

 SYNC_SA_COUNTER_INFO 

It contains the following data.
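As an illustration of the figure above, the first eight octets of the payload (the generic header followed by the Protocol ID, SPI Size, number of SPIs and Counter Size fields) can be decoded as sketched below. The names are hypothetical, the sample octets are made up for the example, and the variable-length fields that follow are deliberately not parsed, since their lengths are not assumed here.

```python
import struct
from typing import NamedTuple


class SyncCounterHeader(NamedTuple):
    next_payload: int
    flag_bit: bool        # first bit of the second octet in the figure
    payload_length: int
    protocol_id: int
    spi_size: int
    spi_count: int
    counter_size: int


def parse_sync_counter_header(data):
    """Decode only the fixed first eight octets shown in the figure."""
    if len(data) < 8:
        raise ValueError("need at least 8 octets")
    (next_payload, flags, payload_length, protocol_id,
     spi_size, spi_count, counter_size) = struct.unpack("!BBHBBBB", data[:8])
    return SyncCounterHeader(next_payload, bool(flags & 0x80),
                             payload_length, protocol_id,
                             spi_size, spi_count, counter_size)


# Made-up sample: Protocol ID 3 (ESP), 4-octet SPI, one SPI.
sample = bytes([0, 0, 0, 40, 3, 4, 1, 4])
header = parse_sync_counter_header(sample)
```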




7.  Details of implementation

The message ID used in this exchange MUST be zero so that it is not validated upon receipt. Message ID zero MUST be permitted only for an INFORMATIONAL exchange that carries a Notify of type SYNC_SA_COUNTER_INFO. If any packet uses message ID zero without this Notify along with the Nonce payload, the packet MUST be discarded upon decryption. No other payloads are allowed in this INFORMATIONAL exchange.
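The acceptance rule for message ID zero might be checked after decryption along the following lines. This is a sketch with a hypothetical data model: the Payload class and its kind labels are inventions for the example, and the check is meant for messages arriving on an already established IKE SA.

```python
from dataclasses import dataclass

INFORMATIONAL = 37   # IKEv2 exchange type number for INFORMATIONAL
SYNC_NOTIFY = "SYNC_SA_COUNTER_INFO"


@dataclass
class Payload:
    kind: str            # hypothetical label for a decrypted payload
    nonce: bytes = b""   # nonce data carried with the payload, if any


def accept_message_id_zero(exchange_type, payloads):
    """Decide whether a decrypted message with message ID zero is kept."""
    if exchange_type != INFORMATIONAL:
        return False
    if not payloads:
        return False
    # Every payload must be the sync notify and must carry nonce data;
    # any other payload causes the whole message to be discarded.
    return all(p.kind == SYNC_NOTIFY and len(p.nonce) > 0 for p in payloads)


nonce = bytes(16)
sync_only = accept_message_id_zero(
    INFORMATIONAL, [Payload(SYNC_NOTIFY, nonce)])
with_extra = accept_message_id_zero(
    INFORMATIONAL, [Payload(SYNC_NOTIFY, nonce), Payload("DELETE")])
without_nonce = accept_message_id_zero(
    INFORMATIONAL, [Payload(SYNC_NOTIFY)])
```

Messages with a non-zero message ID are outside this check and go through the usual window validation.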

The standby member can initiate the synchronization of IKEv2 Message Id's

The standby member can initiate the synchronization of IPsec SA Counters

Since there can be many sessions at Standby member, and sending exchanges from all of the sessions can cause throttling, the standby member can choose to initiate the exchange when it has to send or receive the request. Thus the trigger to initiate this exchange depends on the requirement/discretion of the standby member.

The member which has not announced its capability SYNC_SA_COUNTER_INFO_SUPPORTED MUST NOT send/receive the notify SYNC_SA_COUNTER_INFO.

If a peer gets SYNC_SA_COUNTER_INFO request even though it did not announce its capability in IKE_AUTH exchange, then it MUST ignore this message.
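The two capability rules above can be summarized in a few lines. This is only a sketch: the class and function names are hypothetical, and the receive-side check looks solely at the receiver's own announcement, which is the case spelled out above.

```python
from dataclasses import dataclass


@dataclass
class SyncCapability:
    announced_by_us: bool          # we sent SYNC_SA_COUNTER_INFO_SUPPORTED
    announced_by_other_side: bool  # the other side sent it in IKE_AUTH


def may_send_sync_request(cap):
    # A sync request is only useful when both sides announced support.
    return cap.announced_by_us and cap.announced_by_other_side


def on_sync_request(cap):
    # An endpoint that did not announce the capability ignores the request.
    return "process" if cap.announced_by_us else "ignore"


both = SyncCapability(announced_by_us=True, announced_by_other_side=True)
silent = SyncCapability(announced_by_us=False, announced_by_other_side=True)
```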




8.  Step-by-Step details

The step-by-step details of IKE message ID synchronization are as follows.

The step-by-step details of IPsec SA counter synchronization are as follows.




9.  Security Considerations

There can be two types of DoS attacks.




10.  Interaction with other drafts

The primary assumption of the IKEv2/IPsec SA Counter Synchronization proposal is that an IKEv2 SA has been established between the active member of the Hot Standby Cluster and the peer, that a failover event then occurred, and that the standby member has now "become" active. It also assumes that the IKEv2 SA state was synced between the active and standby members of the Hot Standby Cluster before the failover event.




11.  IANA Considerations

This document introduces two new IKEv2 Notification Message types as described in Section 6. The new Notify Message Types must be assigned values between 16396 and 40959.




12.  Acknowledgements

We would like to thank Pratima Sethi and Frederic Detienne for their review comments and valuable suggestions on the initial version of the document.

We would also like to thank the following people (in alphabetical order) for their review comments and valuable suggestions: Dan Harkins, Paul Hoffman, Steve Kent, Tero Kivinen, David McGrew, Pekka Riikonen, Yaron Sheffer.




13.  Change Log

This section lists all the changes in this document.

NOTE TO RFC EDITOR: Please remove this section before final RFC publication.




13.1.  Draft -00

Version 00 is identical to draft-kagarigi-ipsecme-ikev2-windowsync-04, started as WG document.

Added IPSECME WG HA design team members as authors.

Added comment in Introduction to discuss the window sync process on WG mailing list to solve some concerns.




14.  References




14.1. Normative References

[IKEv2bis] Kaufman, C., Hoffman, P., Nir, Y., and P. Eronen, “Internet Key Exchange Protocol: IKEv2,” draft-ietf-ipsecme-ikev2bis (work in progress), May 2010.
[IPsec Cluster Problem Statement] Nir, Y., “IPsec Cluster Problem Statement,” July 2010.
[RFC2119] Bradner, S., “Key words for use in RFCs to Indicate Requirement Levels,” BCP 14, RFC 2119, March 1997.



14.2. Informative References

[RFC5685] Devarapalli, V. and K. Weniger, “Redirect Mechanism for IKEv2,” RFC 5685, November 2009.
[RFC5723] Sheffer, Y. and H. Tschofenig, “IKEv2 Session Resumption,” RFC 5723, January 2010.



Authors' Addresses

  Raj Singh (Editor)
  Cisco Systems, Inc.
  Divyashree Chambers, B Wing, O'Shaugnessy Road
  Bangalore, Karnataka 560025
  India
Phone:  +91 80 4426 4833
Email:  rsj@cisco.com
  
  Kalyani Garigipati
  Cisco Systems, Inc.
  Divyashree Chambers, B Wing, O'Shaugnessy Road
  Bangalore, Karnataka 560025
  India
Phone:  +91 80 4426 4831
Email:  kagarigi@cisco.com
  
  Yoav Nir
  Check Point Software Technologies Ltd.
  5 Hasolelim st.
  Tel Aviv 67897
  Israel
Email:  ynir@checkpoint.com
  
  Dacheng Zhang
  Huawei Technologies Ltd.
Email:  zhangdacheng@huawei.com