idnits 2.17.1 

draft-ietf-ipsecme-rfc8229bis-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The abstract seems to contain references ([RFC8229]), which it
     shouldn't.  Please replace those with straight textual mentions of the
     documents in question.

  -- The draft header indicates that this document obsoletes RFC8229, but the
     abstract doesn't seem to directly say this.  It does mention RFC8229
     though, so this could be OK.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (April 29, 2021) is 1092 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Missing Reference: 'RFCXXXX' is mentioned on line 936, but not defined

  == Missing Reference: 'CERTREQ' is mentioned on line 1133, but not defined

  == Missing Reference: 'CERT' is mentioned on line 1138, but not defined

  == Missing Reference: 'CP' is mentioned on line 1185, but not defined

  -- Obsolete informational reference (is this intentional?): RFC 5246
     (Obsoleted by RFC 8446)

  -- Obsolete informational reference (is this intentional?): RFC 6528
     (Obsoleted by RFC 9293)

  -- Obsolete informational reference (is this intentional?): RFC 8229
     (Obsoleted by RFC 9329)


     Summary: 1 error (**), 0 flaws (~~), 5 warnings (==), 5 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Network Working Group                                         V. Smyslov
3	Internet-Draft                                                ELVIS-PLUS
4	Obsoletes: 8229 (if approved)                                   T. Pauly
5	Intended status: Standards Track                              Apple Inc.
6	Expires: October 31, 2021                                 April 29, 2021

8	               TCP Encapsulation of IKE and IPsec Packets
9	                    draft-ietf-ipsecme-rfc8229bis-00

11	Abstract

13	   This document describes a method to transport Internet Key Exchange
14	   Protocol (IKE) and IPsec packets over a TCP connection for traversing
15	   network middleboxes that may block IKE negotiation over UDP.  This
16	   method, referred to as "TCP encapsulation", involves sending both IKE
17	   packets for Security Association establishment and Encapsulating
18	   Security Payload (ESP) packets over a TCP connection.  This method is
19	   intended to be used as a fallback option when IKE cannot be
20	   negotiated over UDP.

22	   TCP encapsulation for IKE and IPsec was defined in [RFC8229].  This
23	   document updates the specification for TCP encapsulation by including
24	   additional clarifications obtained during implementation and
25	   deployment of this method.  This document makes RFC 8229 obsolete.

27	Status of This Memo

29	   This Internet-Draft is submitted in full conformance with the
30	   provisions of BCP 78 and BCP 79.

32	   Internet-Drafts are working documents of the Internet Engineering
33	   Task Force (IETF).  Note that other groups may also distribute
34	   working documents as Internet-Drafts.  The list of current Internet-
35	   Drafts is at https://datatracker.ietf.org/drafts/current/.

37	   Internet-Drafts are draft documents valid for a maximum of six months
38	   and may be updated, replaced, or obsoleted by other documents at any
39	   time.  It is inappropriate to use Internet-Drafts as reference
40	   material or to cite them other than as "work in progress."

42	   This Internet-Draft will expire on October 31, 2021.

44	Copyright Notice

46	   Copyright (c) 2021 IETF Trust and the persons identified as the
47	   document authors.  All rights reserved.

49	   This document is subject to BCP 78 and the IETF Trust's Legal
50	   Provisions Relating to IETF Documents
51	   (https://trustee.ietf.org/license-info) in effect on the date of
52	   publication of this document.  Please review these documents
53	   carefully, as they describe your rights and restrictions with respect
54	   to this document.  Code Components extracted from this document must
55	   include Simplified BSD License text as described in Section 4.e of
56	   the Trust Legal Provisions and are provided without warranty as
57	   described in the Simplified BSD License.

59	Table of Contents

61	   1.  Introduction
62	     1.1.  Prior Work and Motivation
63	   2.  Terminology and Notation
64	   3.  Configuration
65	   4.  TCP-Encapsulated Header Formats
66	     4.1.  TCP-Encapsulated IKE Header Format
67	     4.2.  TCP-Encapsulated ESP Header Format
68	   5.  TCP-Encapsulated Stream Prefix
69	   6.  Applicability
70	     6.1.  Recommended Fallback from UDP
71	   7.  Using TCP Encapsulation
72	     7.1.  Connection Establishment and Teardown
73	     7.2.  Retransmissions
74	     7.3.  Cookies and Puzzles
75	     7.4.  Error Handling in IKE_SA_INIT
76	     7.5.  NAT Detection Payloads
77	     7.6.  Keep-Alives and Dead Peer Detection
78	     7.7.  Implications of TCP Encapsulation on IPsec SA Processing
79	   8.  Interaction with IKEv2 Extensions
80	     8.1.  MOBIKE Protocol
81	     8.2.  IKE Redirect
82	     8.3.  IKEv2 Session Resumption
83	     8.4.  IKEv2 Protocol Support for High Availability
84	     8.5.  IKEv2 Fragmentation
85	   9.  Middlebox Considerations
86	   10. Performance Considerations
87	     10.1.  TCP-in-TCP
88	     10.2.  Added Reliability for Unreliable Protocols
89	     10.3.  Quality-of-Service Markings
90	     10.4.  Maximum Segment Size
91	     10.5.  Tunneling ECN in TCP
92	   11. Security Considerations
93	   12. IANA Considerations
94	   13. References
95	     13.1.  Normative References
96	     13.2.  Informative References

98	   Appendix A.  Using TCP Encapsulation with TLS
99	   Appendix B.  Example Exchanges of TCP Encapsulation with TLS 1.3
100	     B.1.  Establishing an IKE Session
101	     B.2.  Deleting an IKE Session
102	     B.3.  Re-establishing an IKE Session
103	     B.4.  Using MOBIKE between UDP and TCP Encapsulation
104	   Acknowledgments
105	   Authors' Addresses

107	1.  Introduction

109	   The Internet Key Exchange Protocol version 2 (IKEv2) [RFC7296] is a
110	   protocol for establishing IPsec Security Associations (SAs), using
111	   IKE messages over UDP for control traffic, and using Encapsulating
112	   Security Payload (ESP) [RFC4303] messages for encrypted data traffic.
113	   Many network middleboxes that filter traffic on public hotspots block
114	   all UDP traffic, including IKE and IPsec, but allow TCP connections
115	   through because they appear to be web traffic.  Devices on these
116	   networks that need to use IPsec (to access private enterprise
117	   networks, to route Voice over IP calls to carrier networks, or
118	   because of security policies) are unable to establish IPsec SAs.
119	   This document defines a method for encapsulating IKE control messages
120	   as well as IPsec data messages within a TCP connection.

122	   Using TCP as a transport for IPsec packets adds a third option to the
123	   list of traditional IPsec transports:

125	   1.  Direct.  Currently, IKE negotiations begin over UDP port 500.  If
126	       no Network Address Translation (NAT) device is detected between
127	       the Initiator and the Responder, then subsequent IKE packets are
128	       sent over UDP port 500, and IPsec data packets are sent using
129	       ESP.

131	   2.  UDP Encapsulation [RFC3948].  If a NAT is detected between the
132	       Initiator and the Responder, then subsequent IKE packets are sent
133	       over UDP port 4500 with four bytes of zero at the start of the
134	       UDP payload, and ESP packets are sent out over UDP port 4500.
135	       Some peers default to using UDP encapsulation even when no NAT is
136	       detected on the path, as some middleboxes do not support IP
137	       protocols other than TCP and UDP.

139	   3.  TCP Encapsulation.  If the other two methods are not available or
140	       appropriate, IKE negotiation packets as well as ESP packets can
141	       be sent over a single TCP connection to the peer.

143	   Direct use of ESP or UDP encapsulation should be preferred by IKE
144	   implementations due to performance concerns when using TCP
145	   encapsulation (Section 10).  Most implementations should use TCP
146	   encapsulation only on networks where negotiation over UDP has been
147	   attempted without receiving responses from the peer or if a network
148	   is known to not support UDP.

150	1.1.  Prior Work and Motivation

152	   Encapsulating IKE connections within TCP streams is a common approach
153	   to solve the problem of UDP packets being blocked by network
154	   middleboxes.  The specific goals of this document are as follows:

156	   o  To promote interoperability by defining a standard method of
157	      framing IKE and ESP messages within TCP streams.

159	   o  To be compatible with the current IKEv2 standard without requiring
160	      modifications or extensions.

162	   o  To use IKE over UDP by default to avoid the overhead of other
163	      alternatives that always rely on TCP or Transport Layer Security
164	      (TLS) [RFC5246][RFC8446].

166	   Some previous alternatives include:

168	   Cellular Network Access
169	      Interworking Wireless LAN (IWLAN) uses IKEv2 to create secure
170	      connections to cellular carrier networks for making voice calls
171	      and accessing other network services over Wi-Fi networks. 3GPP has
172	      recommended that IKEv2 and ESP packets be sent within a TLS
173	      connection to be able to establish connections on restrictive
174	      networks.

176	   ISAKMP over TCP
177	      Various non-standard extensions to the Internet Security
178	      Association and Key Management Protocol (ISAKMP) have been
179	      deployed that send IPsec traffic over TCP or TCP-like packets.

181	   Secure Sockets Layer (SSL) VPNs
182	      Many proprietary VPN solutions use a combination of TLS and IPsec
183	      in order to provide reliability.  These often run on TCP port 443.

185	   IKEv2 over TCP
186	      IKEv2 over TCP as described in [I-D.ietf-ipsecme-ike-tcp] is used
187	      to avoid UDP fragmentation.

189	2.  Terminology and Notation

191	   This document distinguishes between the IKE peer that initiates TCP
192	   connections to be used for TCP encapsulation and the roles of
193	   Initiator and Responder for particular IKE messages.  During the
194	   course of IKE exchanges, the role of IKE Initiator and Responder may
195	   swap for a given SA (as with IKE SA rekeys), while the Initiator of
196	   the TCP connection is still responsible for tearing down the TCP
197	   connection and re-establishing it if necessary.  For this reason,
198	   this document will use the term "TCP Originator" to indicate the IKE
199	   peer that initiates TCP connections.  The peer that receives TCP
200	   connections will be referred to as the "TCP Responder".  If an IKE SA
201	   is rekeyed one or more times, the TCP Originator MUST remain the peer
202	   that originally initiated the first IKE SA.

204	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
205	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
206	   "OPTIONAL" in this document are to be interpreted as described in BCP
207	   14 [RFC2119] [RFC8174] when, and only when, they appear in all
208	   capitals, as shown here.

210	3.  Configuration

212	   One of the main reasons to use TCP encapsulation is that UDP traffic
213	   may be entirely blocked on a network.  Because of this, support for
214	   TCP encapsulation is not specifically negotiated in the IKE exchange.
215	   Instead, support for TCP encapsulation must be pre-configured on both
216	   the TCP Originator and the TCP Responder.

218	   Implementations MUST support TCP encapsulation on TCP port 4500,
219	   which is reserved for IPsec NAT traversal.

221	   Beyond a flag indicating support for TCP encapsulation, the
222	   configuration for each peer can include the following optional
223	   parameters:

225	   o  Alternate TCP ports on which the specific TCP Responder listens
226	      for incoming connections.  Note that the TCP Originator may
227	      initiate TCP connections to the TCP Responder from any local port.

229	   o  An extra framing protocol to use on top of TCP to further
230	      encapsulate the stream of IKE and IPsec packets.  See Appendix A
231	      for a detailed discussion.

233	   Since TCP encapsulation of IKE and IPsec packets adds overhead and
234	   has potential performance trade-offs compared to direct or UDP-
235	   encapsulated SAs (as described in Section 10), implementations SHOULD
236	   prefer ESP direct or UDP-encapsulated SAs over TCP-encapsulated SAs
237	   when possible.

239	4.  TCP-Encapsulated Header Formats

241	   Like UDP encapsulation, TCP encapsulation uses the first four bytes
242	   of a message to differentiate IKE and ESP messages.  TCP
243	   encapsulation also adds a 16-bit Length field that precedes every
244	   message to define the boundaries of messages within a stream.  The
245	   value in this field is equal to the length of the original message
246	   plus the length of the field itself, in octets.  If the first 32 bits
247	   of the message are zeros (a non-ESP marker), then the contents
248	   comprise an IKE message.  Otherwise, the contents comprise an ESP
249	   message.  Authentication Header (AH) messages are not supported for
250	   TCP encapsulation.

252	   Although a TCP stream may be able to send very long messages,
253	   implementations SHOULD limit message lengths to typical UDP datagram
254	   ESP payload lengths.  The maximum message length is used as the
255	   effective MTU for connections that are being encrypted using ESP, so
256	   the maximum message length will influence characteristics of inner
257	   connections, such as the TCP Maximum Segment Size (MSS).

259	   Note that this method of encapsulation will also work for placing IKE
260	   and ESP messages within any protocol that presents a stream
261	   abstraction, beyond TCP.

263	4.1.  TCP-Encapsulated IKE Header Format

265	                        1                   2                   3
266	    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
267	                                   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
268	                                   |            Length             |
269	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
270	   |                         Non-ESP Marker                        |
271	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
272	   |                                                               |
273	   ~                      IKE header [RFC7296]                     ~
274	   |                                                               |
275	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

277	                                 Figure 1

279	   The IKE header is preceded by a 16-bit Length field in network byte
280	   order that specifies the length of the IKE message (including the
281	   non-ESP marker) within the TCP stream.  As with IKE over UDP port
282	   4500, a zeroed 32-bit non-ESP marker is inserted before the start of
283	   the IKE header in order to differentiate the traffic from ESP traffic
284	   between the same addresses and ports.

286	   o  Length (2 octets, unsigned integer) - Length of the IKE packet,
287	      including the Length field and non-ESP marker.  The value in the
288	      Length field MUST NOT be 0 or 1.  The receiver MUST treat these
289	      values as fatal errors and MUST close the TCP connection.
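
   The framing in Figure 1 can be illustrated with a minimal sketch.
   The helper name encapsulate_ike is hypothetical and not part of this
   specification; the sketch only shows how the Length value relates to
   the non-ESP marker and the IKE message.

```python
import struct

NON_ESP_MARKER = b"\x00\x00\x00\x00"  # 32 bits of zero, as in Figure 1
MAX_LENGTH = 0xFFFF                   # largest value a 16-bit field can hold


def encapsulate_ike(ike_message: bytes) -> bytes:
    """Frame one IKE message for a TCP stream (hypothetical helper)."""
    # Length covers the Length field itself (2 octets), the non-ESP
    # marker (4 octets), and the IKE message.
    length = 2 + len(NON_ESP_MARKER) + len(ike_message)
    if length > MAX_LENGTH:
        raise ValueError("IKE message too long for a 16-bit Length field")
    # "!H" packs an unsigned 16-bit integer in network byte order.
    return struct.pack("!H", length) + NON_ESP_MARKER + ike_message
```

   For a 28-octet IKE header with no payloads, the Length field would
   carry the value 34 (2 + 4 + 28).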

291	4.2.  TCP-Encapsulated ESP Header Format

293	                        1                   2                   3
294	    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
295	                                   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
296	                                   |            Length             |
297	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
298	   |                                                               |
299	   ~                     ESP header [RFC4303]                      ~
300	   |                                                               |
301	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

303	                                 Figure 2

305	   The ESP header is preceded by a 16-bit Length field in network byte
306	   order that specifies the length of the ESP packet within the TCP
307	   stream.

309	   The Security Parameter Index (SPI) field [RFC7296] in the ESP header
310	   MUST NOT be a zero value.

312	   o  Length (2 octets, unsigned integer) - Length of the ESP packet,
313	      including the Length field.  The value in the Length field MUST
314	      NOT be 0 or 1.  The receiver MUST treat these values as fatal
315	      errors and MUST close the TCP connection.
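
   As an illustration only, the following sketch frames an ESP packet
   and splits a received byte stream back into messages, using the first
   32 bits of each message to tell IKE from ESP.  The names
   encapsulate_esp, extract_messages, and FatalFramingError are
   hypothetical.  The "malformed" classification for messages shorter
   than four octets is an assumption of this sketch; the text above does
   not say how such messages are handled.

```python
import struct


class FatalFramingError(Exception):
    """Signals that the TCP connection should be closed (hypothetical)."""


def encapsulate_esp(esp_packet: bytes) -> bytes:
    """Frame one ESP packet for a TCP stream (hypothetical helper)."""
    # The SPI is the first four octets of the ESP header and must be
    # non-zero, otherwise it would be read as a non-ESP marker.
    if len(esp_packet) < 4 or esp_packet[:4] == b"\x00\x00\x00\x00":
        raise ValueError("ESP packet must start with a non-zero SPI")
    length = 2 + len(esp_packet)  # Length includes the Length field
    if length > 0xFFFF:
        raise ValueError("ESP packet too long for a 16-bit Length field")
    return struct.pack("!H", length) + esp_packet


def extract_messages(buffer: bytearray) -> list:
    """Remove every complete message from buffer; return (kind, data)."""
    messages = []
    while len(buffer) >= 2:
        (length,) = struct.unpack_from("!H", buffer, 0)
        if length < 2:
            # Length values 0 and 1 are fatal errors.
            raise FatalFramingError("Length field is 0 or 1")
        if len(buffer) < length:
            break  # message incomplete; wait for more bytes
        body = bytes(buffer[2:length])
        del buffer[:length]
        if len(body) < 4:
            # Assumption: too short to classify, left to the caller.
            messages.append(("malformed", body))
        elif body[:4] == b"\x00\x00\x00\x00":
            messages.append(("ike", body[4:]))  # strip non-ESP marker
        else:
            messages.append(("esp", body))
    return messages
```

   Bytes belonging to a partially received message stay in the buffer
   until the rest of that message arrives.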

317	5.  TCP-Encapsulated Stream Prefix

319	   Each stream of bytes used for IKE and IPsec encapsulation MUST begin
320	   with a fixed sequence of six bytes as a magic value, containing the
321	   characters "IKETCP" as ASCII values.  This value is intended to
322	   identify and validate that the TCP connection is being used for TCP
323	   encapsulation as defined in this document, to avoid conflicts with
324	   the prevalence of previous non-standard protocols that used TCP port
325	   4500.  This value is only sent once, by the TCP Originator only, at
326	   the beginning of any stream of IKE and ESP messages.

328	   If other framing protocols are used within TCP to further encapsulate
329	   or encrypt the stream of IKE and ESP messages, the stream prefix must
330	   be at the start of the TCP Originator's IKE and ESP message stream
331	   within the added protocol layer (Appendix A).  Although some framing
332	   protocols do support negotiating inner protocols, the stream prefix
333	   should always be used in order for implementations to be as generic
334	   as possible and not rely on other framing protocols on top of TCP.

336	                0      1      2      3      4      5
337	               +------+------+------+------+------+------+
338	               | 0x49 | 0x4b | 0x45 | 0x54 | 0x43 | 0x50 |
339	               +------+------+------+------+------+------+

341	                                 Figure 3
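
   A receiver-side check of the stream prefix might look like the
   following sketch.  The function name consume_stream_prefix is
   hypothetical.  Rejecting a stream as soon as one received byte
   differs from the prefix, rather than after all six bytes have
   arrived, is a design choice of this sketch and not a requirement
   stated above.

```python
STREAM_PREFIX = b"IKETCP"  # 0x49 0x4b 0x45 0x54 0x43 0x50 (Figure 3)


def consume_stream_prefix(buffer: bytearray) -> bool:
    """Check and strip the stream prefix at the start of buffer.

    Returns True once the whole prefix has been seen and removed,
    False if more bytes are needed, and raises ValueError if the bytes
    received so far cannot be the start of the prefix.
    """
    received = bytes(buffer[:len(STREAM_PREFIX)])
    if not STREAM_PREFIX.startswith(received):
        raise ValueError("stream does not begin with the IKETCP prefix")
    if len(received) < len(STREAM_PREFIX):
        return False  # prefix incomplete; do not parse messages yet
    del buffer[:len(STREAM_PREFIX)]
    return True
```

   Message parsing would start only after this function has returned
   True for the connection.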

343	6.  Applicability

345	   TCP encapsulation is applicable only when it has been configured to
346	   be used with specific IKE peers.  If a Responder is configured to use
347	   TCP encapsulation, it MUST listen on the configured port(s) in case
348	   any peers will initiate new IKE sessions.  Initiators MAY use TCP
349	   encapsulation for any IKE session to a peer that is configured to
350	   support TCP encapsulation, although it is recommended that Initiators
351	   use TCP encapsulation only when traffic over UDP is blocked.

353	   Since the support of TCP encapsulation is a configured property, not
354	   a negotiated one, it is recommended that if there are multiple IKE
355	   endpoints representing a single peer (such as multiple machines with
356	   different IP addresses when connecting by Fully Qualified Domain
357	   Name, or endpoints used with IKE redirection), all of the endpoints
358	   equally support TCP encapsulation.

360	   If TCP encapsulation is being used for a specific IKE SA, all
361	   messages for that IKE SA and its Child SAs MUST be sent over a TCP
362	   connection until the SA is deleted or IKEv2 Mobility and Multihoming
363	   (MOBIKE) is used to change the SA endpoints and/or the encapsulation
364	   protocol.  See Section 8.1 for more details on using MOBIKE to
365	   transition between encapsulation modes.

367	6.1.  Recommended Fallback from UDP

369	   Since UDP is the preferred method of transport for IKE messages,
370	   implementations that use TCP encapsulation should have an algorithm
371	   for deciding when to use TCP after determining that UDP is unusable.
372	   If an Initiator implementation has no prior knowledge about the
373	   network it is on and the status of UDP on that network, it SHOULD
374	   always attempt to negotiate IKE over UDP first.  IKEv2 defines how to
375	   use retransmission timers with IKE messages and, specifically,
376	   IKE_SA_INIT messages [RFC7296].  Generally, this means that the
377	   implementation will define a frequency of retransmission and the
378	   maximum number of retransmissions allowed before marking the IKE SA
379	   as failed.  An implementation can attempt negotiation over TCP once
380	   it has hit the maximum retransmissions over UDP, or slightly before
381	   to reduce connection setup delays.  It is recommended that the
382	   initial message over UDP be retransmitted at least once before
383	   falling back to TCP, unless the Initiator knows beforehand that the
384	   network is likely to block UDP.

386	   When switching from UDP to TCP, a new IKE_SA_INIT exchange MUST be
387	   initiated with a new Initiator's SPI and with recalculated content of
388	   the NAT_DETECTION_SOURCE_IP notification.
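
   One possible shape for the fallback decision is sketched below.  The
   function names and the fallback_after parameter (the number of
   unanswered UDP transmissions after which TCP is tried) are
   hypothetical, and its default of 3 is an arbitrary value chosen for
   illustration.  The lower bound of 2 reflects the recommendation to
   retransmit the initial message at least once over UDP.

```python
import os


def choose_transport(udp_known_blocked: bool,
                     unanswered_udp_sends: int,
                     fallback_after: int = 3) -> str:
    """Return "udp" or "tcp" for the next IKE_SA_INIT attempt."""
    if udp_known_blocked:
        # Prior knowledge that the network blocks UDP: go to TCP at once.
        return "tcp"
    if fallback_after < 2:
        raise ValueError("retransmit over UDP at least once before TCP")
    if unanswered_udp_sends >= fallback_after:
        return "tcp"
    return "udp"


def new_initiator_spi() -> bytes:
    """Pick a fresh, non-zero 8-octet Initiator's SPI for the new exchange."""
    while True:
        spi = os.urandom(8)
        if spi != b"\x00" * 8:
            return spi
```

   When the result changes from "udp" to "tcp", the Initiator would
   start a new IKE_SA_INIT exchange with a value from new_initiator_spi
   and a recalculated NAT_DETECTION_SOURCE_IP notification.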

390	7.  Using TCP Encapsulation

392	7.1.  Connection Establishment and Teardown

394	   When the IKE Initiator uses TCP encapsulation, it will initiate a TCP
395	   connection to the Responder using the configured TCP port.  The first
396	   bytes sent on the stream MUST be the stream prefix value (Section 5).
397	   After this prefix, encapsulated IKE messages will negotiate the IKE
398	   SA and initial Child SA [RFC7296].  After this point, both
399	   encapsulated IKE (Figure 1) and ESP (Figure 2) messages will be sent
400	   over the TCP connection.  The TCP Responder MUST wait for the entire
401	   stream prefix to be received on the stream before trying to parse out
402	   any IKE or ESP messages.  The stream prefix is sent only once, and
403	   only by the TCP Originator.

405	   In order to close an IKE session, either the Initiator or Responder
406	   SHOULD gracefully tear down IKE SAs with DELETE payloads.  Once the
407	   SA has been deleted, the TCP Originator SHOULD close the TCP
408	   connection if it does not intend to use the connection for another
409	   IKE session to the TCP Responder.  If the TCP connection is no longer
410	   associated with any active IKE SA, the TCP Responder MAY close the
411	   connection to clean up resources if the TCP Originator did not close
412	   it within some reasonable period of time.

414	   An unexpected FIN or a TCP Reset on the TCP connection may indicate a
415	   loss of connectivity, an attack, or some other error.  If a DELETE
416	   payload has not been sent, both sides SHOULD maintain the state for
417	   their SAs for the standard lifetime or timeout period.  The TCP
418	   Originator is responsible for re-establishing the TCP connection if
419	   it is torn down for any unexpected reason.  Since new TCP connections
420	   may use different ports due to NAT mappings or local port allocations
421	   changing, the TCP Responder MUST allow packets for existing SAs to be
422	   received from new source ports.

424	   A peer MUST discard a partially received message due to a broken
425	   connection.

427	   Whenever the TCP Originator opens a new TCP connection to be used for
428	   an existing IKE SA, it MUST send the stream prefix first, before any
429	   IKE or ESP messages.  This follows the same behavior as the initial
430	   TCP connection.

432	   If a TCP connection is being used to resume a previous IKE session,
433	   the TCP Responder can recognize the session using either the IKE SPI
434	   from an encapsulated IKE message or the ESP SPI from an encapsulated
435	   ESP message.  If the session had been fully established previously,
436	   it is suggested that the TCP Originator send an UPDATE_SA_ADDRESSES
437	   message if MOBIKE is supported, or an informational message (a keep-
438	   alive) otherwise.

440	   The TCP Responder MUST NOT accept any messages for the existing IKE
441	   session on a new incoming connection, unless that connection begins
442	   with the stream prefix.  If either the TCP Originator or TCP
443	   Responder detects corruption on a connection that was started with a
444	   valid stream prefix, it SHOULD close the TCP connection.  The
445	   connection can be determined to be corrupted if there are too many
446	   subsequent messages that cannot be parsed as valid IKE messages or
447	   ESP messages with known SPIs, or if the authentication check for an
448	   ESP message with a known SPI fails.  Implementations SHOULD NOT tear
449	   down a connection if only a single ESP message has an unknown SPI,
450	   since the SPI databases may be momentarily out of sync.  If there is
451	   instead a syntax issue within an IKE message, an implementation MUST
452	   send the INVALID_SYNTAX notify payload and tear down the IKE SA as
453	   usual, rather than tearing down the TCP connection directly.

455	   A TCP Originator SHOULD only open one TCP connection per IKE SA, over
456	   which it sends all of the corresponding IKE and ESP messages.  This
457	   helps ensure that any firewall or NAT mappings allocated for the TCP
458	   connection apply to all of the traffic associated with the IKE SA
459	   equally.

461	   Similarly, a TCP Responder SHOULD at any given time send packets for
462	   an IKE SA and its Child SAs over only one TCP connection.  It SHOULD
463	   choose the TCP connection on which it last received a valid and
464	   decryptable IKE or ESP message.  In order to be considered valid for
465	   choosing a TCP connection, an IKE message must be successfully
466	   decrypted and authenticated, not be a retransmission of a previously
467	   received message, and be within the expected window for IKE message
468	   IDs.  Similarly, an ESP message must pass authentication checks and
469	   be decrypted, and must not be a replay of a previous message.

471	   Since a connection may be broken and a new connection re-established
472	   by the TCP Originator without the TCP Responder being aware, a TCP
473	   Responder SHOULD accept receiving IKE and ESP messages on both old
474	   and new connections until the old connection is closed by the TCP
475	   Originator.  A TCP Responder MAY close a TCP connection that it
476	   perceives as idle and extraneous (one previously used for IKE and ESP
477	   messages that has been replaced by a new connection).
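
   The TCP Responder's choice of connection can be sketched as a small
   per-SA tracker.  The class and method names are hypothetical.  The
   valid flag stands for the checks described above (successful
   decryption and authentication, no retransmission or replay, message
   ID within the expected window), which are assumed to be performed
   elsewhere.

```python
class SaConnectionTracker:
    """Tracks the TCP connections seen for one IKE SA (hypothetical)."""

    def __init__(self):
        self.open_connections = set()  # accept input on all of these
        self.send_connection = None    # send output on this one only

    def on_message(self, conn_id, valid: bool) -> None:
        # Messages are accepted on old and new connections alike.
        self.open_connections.add(conn_id)
        if valid:
            # Send on the connection that last carried a valid message.
            self.send_connection = conn_id

    def on_closed(self, conn_id) -> None:
        self.open_connections.discard(conn_id)
        if self.send_connection == conn_id:
            self.send_connection = None
```

   Keying the tracker on the SA rather than on the peer's address and
   port lets a re-established connection from a new source port take
   over without disturbing the SA state.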

479	   Multiple IKE SAs MUST NOT share a single TCP connection, unless one
480	   is a rekey of an existing IKE SA, in which case there will
481	   temporarily be two IKE SAs on the same TCP connection.

483	7.2.  Retransmissions

485	   Section 2.1 of [RFC7296] describes how IKEv2 deals with the
486	   unreliability of the UDP protocol.  In brief, the exchange Initiator
487	   is responsible for retransmissions and must retransmit a request
488	   message until a response message is received.  If no reply is received
489	   after several retransmissions, the SA is deleted.  The Responder
490	   never initiates retransmission, but must send a response message
491	   again in case it receives a retransmitted request.

493	   When IKEv2 uses a reliable transport protocol, like TCP, the
494	   retransmission rules are as follows:

496	   o  the exchange Initiator SHOULD NOT retransmit a request message; if
497	      no response is received within some reasonable period of time, the
498	      IKE SA is deleted.

500	   o  if a TCP connection is broken and reestablished while the exchange
501	      Initiator is waiting for a response, the Initiator MUST retransmit
502	      its request and continue to wait for a response.

504	   o  the exchange Responder does not change its behavior, but acts as
505	      described in Section 2.1 of [RFC7296].
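
   The rules for the exchange Initiator listed above can be summarized
   as a small lookup, shown here only as a sketch.  The event and action
   names are hypothetical labels, not protocol elements.

```python
# Maps what the exchange Initiator observes while a request is
# outstanding over TCP to what it does next.
INITIATOR_ACTIONS = {
    "response_received": "complete_exchange",
    # No retransmission on timeout: TCP already delivers reliably.
    "response_timeout": "delete_ike_sa",
    # A broken and re-established connection may have lost the request.
    "tcp_reestablished": "retransmit_request",
}


def initiator_action(event: str) -> str:
    """Return the Initiator's next step for an observed event."""
    return INITIATOR_ACTIONS[event]
```

   The exchange Responder needs no such table, since it keeps resending
   its response whenever a retransmitted request arrives.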

507	7.3.  Cookies and Puzzles

509	   IKEv2 provides a DoS attack protection mechanism through Cookies,
510	   which is described in Section 2.6 of [RFC7296].  [RFC8019] extends
511	   this mechanism for protection against DDoS attacks by means of Client
512	   Puzzles.  Both mechanisms allow the Responder to avoid keeping state
513	   until the Initiator proves its IP address is legitimate (and after
514	   solving a puzzle if required).

516	   The connection-oriented nature of TCP transport brings additional
517	   considerations for using these mechanisms.  In general, Cookies
518	   provide less value in case of TCP encapsulation, since by the time a
519	   Responder receives the IKE_SA_INIT request, the TCP session has
520	   already been established and the Initiator's IP address has been
521	   verified.  Moreover, a TCP/IP stack creates state once a TCP SYN
522	   packet is received (unless SYN Cookies described in [RFC4987] are
523	   employed), which contradicts the statelessness of IKEv2 Cookies.  In
524	   particular, with TCP, an attacker is able to mount a SYN flooding DoS
525	   attack which an IKEv2 Responder cannot prevent using stateless IKEv2
526	   Cookies.  Thus, when using TCP encapsulation, it makes little sense
527	   to send Cookie requests without Puzzles unless the Responder is
528	   concerned with a possibility of TCP Sequence Number attacks (see
529	   [RFC6528] for details).  Puzzles, on the other hand, still remain
530	   useful (and their use requires using Cookies).

532	   The following considerations are applicable for using Cookie and
533	   Puzzle mechanisms in case of TCP encapsulation:

535	   o  the exchange Responder SHOULD NOT request a Cookie, with the
536	      exception of Puzzles or in rare cases like preventing TCP Sequence
537	      Number attacks.

539	   o  if the Responder chooses to send a Cookie request (possibly along
540	      with Puzzle request), then the TCP connection that the IKE_SA_INIT
541	      request message was received over SHOULD be closed, so that the
542	      Responder remains stateless at least until the Cookie (or Puzzle
543	      Solution) is returned.  Note that if this TCP connection is
544	      closed, the Responder MUST NOT include the Initiator's TCP port
545	      in the Cookie calculation (*), since the Cookie will be returned
546	      over a new TCP connection with a different port.

548	   o  the exchange Initiator acts as described in Section 2.6 of
549	      [RFC7296] and Section 7 of [RFC8019], i.e. using TCP encapsulation
550	      doesn't change the Initiator's behavior.

552	   (*) Examples of Cookie calculation methods are given in Section 2.6
553	   of [RFC7296] and in Section 7.1.1.3 of [RFC8019] and they don't
554	   include transport protocol ports.  However, these examples are given
555	   for illustrative purposes, since the Cookie generation algorithm is a
556	   local matter and some implementations might include port numbers,
557	   which won't work with TCP encapsulation.  Note also that these
558	   examples include the Initiator's IP address in Cookie calculation.
559	   In general, this address may change between two initial requests
560	   (with and without Cookies).  This may happen due to NATs, since NATs
561	   have more freedom to change source IP addresses for new TCP
562	   connections than for UDP.  In such cases, Cookie verification might
563	   fail.
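
   As a non-normative illustration of the port independence discussed
   above, the following Python sketch derives a Cookie from the inputs
   used by the example in Section 2.6 of [RFC7296] (Ni, IPi, SPIi and a
   local secret) and leaves the TCP port out.  The function names, the
   use of SHA-256 and the one-octet secret version are assumptions of
   this sketch; Cookie generation remains a local matter.

```python
import hashlib
import hmac
import ipaddress
import os

# Hypothetical local values: a one-octet version tag and a random secret.
SECRET_VERSION = b"\x01"
SECRET = os.urandom(32)


def make_cookie(ni: bytes, ip_i: str, spi_i: bytes) -> bytes:
    """Cookie = <VersionIDofSecret> | Hash(Ni | IPi | SPIi | <secret>).

    The Initiator's TCP port is deliberately not an input, so the Cookie
    stays valid when it is returned over a new TCP connection.
    """
    ip_bytes = ipaddress.ip_address(ip_i).packed
    digest = hashlib.sha256(ni + ip_bytes + spi_i + SECRET).digest()
    return SECRET_VERSION + digest


def verify_cookie(cookie: bytes, ni: bytes, ip_i: str, spi_i: bytes) -> bool:
    """Recompute the Cookie statelessly and compare in constant time."""
    return hmac.compare_digest(cookie, make_cookie(ni, ip_i, spi_i))
```

   Because the Initiator's IP address is still an input, verification
   in this sketch fails if a NAT assigns a different source address to
   the second TCP connection, which is the failure case noted above.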

565	7.4.  Error Handling in IKE_SA_INIT

567	   Section 2.21.1 of [RFC7296] describes how error notifications are
568	   handled in the IKE_SA_INIT exchange.  In particular, it is advised
569	   that the Initiator should not act immediately after receiving an
570	   error notification and should instead wait some time for a valid
571	   response, since the IKE_SA_INIT messages are completely
572	   unauthenticated.  This advice does not apply equally in case of TCP
573	   encapsulation.  If the Initiator receives a response message over
574	   TCP, then either this message is genuine and was sent by the peer, or
575	   the TCP session was hijacked and the message is forged.  In this
576	   latter case, no genuine messages from the Responder will be received.

578	   Thus, in case of TCP encapsulation, an Initiator SHOULD NOT wait for
579	   additional messages in case it receives an error notification from
580	   the Responder in the IKE_SA_INIT exchange.

582	7.5.  NAT Detection Payloads

584	   When negotiating over UDP port 500, IKE_SA_INIT packets include
585	   NAT_DETECTION_SOURCE_IP and NAT_DETECTION_DESTINATION_IP payloads to
586	   determine if UDP encapsulation of IPsec packets should be used.
587	   These payloads contain SHA-1 digests of the SPIs, IP addresses, and
588	   ports as defined in [RFC7296].  IKE_SA_INIT packets sent on a TCP
589	   connection SHOULD include these payloads with the same content as
590	   when sending over UDP and SHOULD use the applicable TCP ports when
591	   creating and checking the SHA-1 digests.

593	   If a NAT is detected due to the SHA-1 digests not matching the
594	   expected values, no change should be made for encapsulation of
595	   subsequent IKE or ESP packets, since TCP encapsulation inherently
596	   supports NAT traversal.  Implementations MAY use the information that
597	   a NAT is present to influence keep-alive timer values.

599	   If a NAT is detected, implementations need to handle transport mode
600	   TCP and UDP packet checksum fixup as defined for UDP encapsulation in
601	   [RFC3948].
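
   The digest computation described in this section can be sketched as
   follows (non-normative).  The sketch assumes the layout given in
   Section 3.10.1 of [RFC7296], i.e. SHA-1 over the two SPIs in header
   order, the IP address, and the port in network byte order; the
   function names are hypothetical.  With TCP encapsulation the port
   argument is the applicable TCP port.

```python
import hashlib
import ipaddress
import struct


def nat_detection_hash(spi_i: bytes, spi_r: bytes, ip: str, port: int) -> bytes:
    """SHA-1 over SPIi | SPIr | IP address | port (network byte order)."""
    if len(spi_i) != 8 or len(spi_r) != 8:
        raise ValueError("IKE SPIs are 8 octets long")
    ip_bytes = ipaddress.ip_address(ip).packed
    return hashlib.sha1(spi_i + spi_r + ip_bytes + struct.pack("!H", port)).digest()


def nat_detected(received: bytes, spi_i: bytes, spi_r: bytes,
                 observed_ip: str, observed_port: int) -> bool:
    """A NAT is assumed when the digest sent by the peer does not match
    the digest computed from the address and port actually observed."""
    return received != nat_detection_hash(spi_i, spi_r, observed_ip, observed_port)
```

   A mismatch only informs decisions such as keep-alive timing; it does
   not change the encapsulation, as stated above.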

603	7.6.  Keep-Alives and Dead Peer Detection

605	   Encapsulating IKE and IPsec inside of a TCP connection can impact the
606	   strategy that implementations use to detect peer liveness and to
607	   maintain middlebox port mappings.  Peer liveness should be checked
608	   using IKE informational packets [RFC7296].

610	   In general, TCP port mappings are maintained by NATs longer than UDP
611	   port mappings, so IPsec ESP NAT keep-alives [RFC3948] SHOULD NOT be
612	   sent when using TCP encapsulation.  Any implementation using TCP
613	   encapsulation MUST silently drop incoming NAT keep-alive packets and
614	   not treat them as errors.  NAT keep-alive packets over a TCP-
615	   encapsulated IPsec connection will be sent as an ESP message with a
616	   one-octet-long payload with the value 0xFF.
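
   A receiver can recognize and discard such keep-alives while reading
   the TCP stream.  The non-normative Python sketch below assumes the
   framing defined earlier in this document: a 16-bit Length field that
   counts itself, followed by either an IKE message prefixed with the
   four-octet Non-ESP Marker or an ESP message.  All function names are
   hypothetical.

```python
import struct

NON_ESP_MARKER = b"\x00" * 4   # precedes every IKE message in the stream
NAT_KEEPALIVE = b"\xff"        # one-octet payload described above


def split_frames(stream: bytes):
    """Return (complete messages, unconsumed tail) for a stream buffer."""
    messages = []
    offset = 0
    while len(stream) - offset >= 2:
        (length,) = struct.unpack_from("!H", stream, offset)
        if length <= 2:
            # Sketch choice: a frame must carry at least one octet.
            raise ValueError("malformed Length field")
        if len(stream) - offset < length:
            break  # frame not fully received yet
        messages.append(stream[offset + 2:offset + length])
        offset += length
    return messages, stream[offset:]


def classify(message: bytes) -> str:
    """Label a de-framed message as 'keepalive', 'ike' or 'esp'."""
    if message == NAT_KEEPALIVE:
        return "keepalive"
    if message[:4] == NON_ESP_MARKER:
        return "ike"
    return "esp"


def drop_keepalives(stream: bytes):
    """Silently discard NAT keep-alives; pass everything else on."""
    messages, rest = split_frames(stream)
    kept = [(classify(m), m) for m in messages if classify(m) != "keepalive"]
    return kept, rest
```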

618	   Note that, depending on the configuration of TCP and TLS on the
619	   connection, TCP keep-alives [RFC1122] and TLS keep-alives [RFC6520]
620	   may be used.  These MUST NOT be used as indications of IKE peer
621	   liveness.

623	7.7.  Implications of TCP Encapsulation on IPsec SA Processing

625	   Using TCP encapsulation affects some aspects of IPsec SA processing.

627	   1.  Section 8.1 of [RFC4301] requires all tunnel mode IPsec SAs to be
628	       able to copy the Don't Fragment (DF) bit from inner IP header to
629	       the outer (tunnel) one.  With TCP encapsulation this is generally
630	       not possible, because TCP/IP stack manages DF bit in the outer IP
631	       header, and usually the stack ensures that the DF bit is set for
632	       TCP packets to avoid IP fragmentation.

634	   2.  The other feature that is less applicable with TCP encapsulation
635	       is the ability to split traffic of different QoS classes into
636	       different IPsec SAs, created by a single IKE SA.  In this case
637	       the Differentiated Services Code Point (DSCP) field is usually
638	       copied from the inner IP header to the outer (tunnel) one,
639	       ensuring that IPsec traffic of each SA receives the corresponding
640	       level of service.  With TCP encapsulation all IPsec SAs created
641	       by a single IKE SA will share a single TCP connection and thus
642	       will receive the same level of service (see Section 10.3).  If
643	       this functionality is needed, implementations should create
644	       several IKE SAs over TCP and assign a corresponding DSCP value to
645	       each of them.

647	   Besides, TCP encapsulation of IPsec packets may have implications on
648	   performance of the encapsulated traffic.  Performance considerations
649	   are discussed in Section 10.

651	8.  Interaction with IKEv2 Extensions

653	8.1.  MOBIKE Protocol

655	   The MOBIKE protocol, which allows an IKEv2 SA to migrate between IP
656	   addresses, is defined in [RFC4555], and [RFC4621] further clarifies
657	   the details of the protocol.  When an IKE session that has negotiated
658	   MOBIKE is transitioning between networks, the Initiator of the
659	   transition may switch between using TCP encapsulation, UDP
660	   encapsulation, or no encapsulation.  Implementations that implement
661	   both MOBIKE and TCP encapsulation MUST support dynamically enabling
662	   and disabling TCP encapsulation as interfaces change.

664	   When a MOBIKE-enabled Initiator changes networks, the INFORMATIONAL
665	   exchange with the UPDATE_SA_ADDRESSES notification SHOULD be
666	   initiated first over UDP before attempting over TCP.  If there is a
667	   response to the request sent over UDP, then the ESP packets should be
668	   sent directly over IP or over UDP port 4500 (depending on whether a
669	   NAT was detected), regardless of whether a connection on a previous
670	   network was using TCP encapsulation.  If no response is received
671	   within a certain period of time after several retransmissions, the
672	   Initiator ought to change its transport for this exchange from UDP to
673	   TCP and resend the request message.  A new INFORMATIONAL exchange
674	   MUST NOT be started in this situation.  If the Responder only
675	   responds to the request sent over TCP, then the ESP packets should be
676	   sent over the TCP connection, even if the connection on a previous
677	   network did not use TCP encapsulation.
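
   The transport selection in the preceding paragraph can be sketched
   as follows (non-normative).  The callables send_udp and send_tcp are
   hypothetical stand-ins that send the request and report whether a
   response arrived; the number of UDP attempts is an arbitrary value
   chosen for this sketch.

```python
from enum import Enum


class Transport(Enum):
    DIRECT_ESP = "esp-over-ip"
    UDP_4500 = "udp-4500"
    TCP = "tcp"


def run_address_update(request, send_udp, send_tcp, nat_detected, udp_attempts=3):
    """Try the UPDATE_SA_ADDRESSES request over UDP first, then over TCP.

    The very same request is resent over TCP; no new INFORMATIONAL
    exchange is started.  Returns the transport to use for ESP, or None
    if the peer answered over neither transport.
    """
    for _ in range(udp_attempts):
        if send_udp(request):
            # UDP works: the previous network's transport is irrelevant.
            return Transport.UDP_4500 if nat_detected else Transport.DIRECT_ESP
    if send_tcp(request):
        return Transport.TCP
    return None
```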

679	   Since switching from UDP to TCP can occur during a single
680	   INFORMATIONAL message exchange, the content of the
681	   NAT_DETECTION_SOURCE_IP notification will in most cases be incorrect
682	   (since UDP and TCP source ports will most likely be different), and
683	   the peer may incorrectly detect the presence of a NAT.  This should
684	   not cause functional issues since all messages will be encapsulated
685	   in TCP anyway, and TCP encapsulation does not change based on the
686	   presence of NATs.

688	   The MOBIKE protocol defines the NO_NATS_ALLOWED notification, which
689	   can be used to detect the presence of a NAT between peers and to
690	   refuse to communicate in this situation.  In case of TCP, the
691	   NO_NATS_ALLOWED notification SHOULD be ignored because TCP generally
692	   has no problems with NAT boxes.

694	   Section 3.7 of [RFC4555] describes an additional optional step in the
695	   process of changing IP addresses called Return Routability Check.  It
696	   is performed by Responders in order to be sure that the new
697	   Initiator's address is in fact routable.  With TCP encapsulation,
698	   this check has little value, since the TCP handshake proves
699	   routability of the TCP Originator's address.  So, in case of TCP
700	   encapsulation, the Return Routability Check SHOULD NOT be performed.

702	8.2.  IKE Redirect

704	   A redirect mechanism for IKEv2 is defined in [RFC5685].  This
705	   mechanism allows security gateways to redirect clients to another
706	   gateway either during IKE SA establishment or after session setup.
707	   If a client is connecting to a security gateway using TCP and then is
708	   redirected to another security gateway, the client needs to reset its
709	   transport selection.  In other words, the client MUST again try first
710	   UDP and then fall back to TCP while establishing a new IKE SA,
711	   regardless of the transport of the SA the redirect notification was
712	   received over (unless the client's configuration instructs it to
713	   instantly use TCP for the gateway it is redirected to).

715	8.3.  IKEv2 Session Resumption

717	   Session resumption for IKEv2 is defined in [RFC5723].  Once an IKE SA
718	   is established, the server creates a resumption ticket where
719	   information about this SA is stored, and transfers this ticket to the
720	   client.  The ticket may be later used to resume the IKE SA after it
721	   is deleted.  In the event of resumption the client presents the
722	   ticket in a new exchange, called IKE_SESSION_RESUME.  Some parameters
723	   in the new SA are retrieved from the ticket and others are re-
724	   negotiated (more details are given in Section 5 of [RFC5723]).  If
725	   TCP encapsulation was used in an old SA, then the client SHOULD
726	   resume this SA using TCP, without first trying to connect over UDP.

728	8.4.  IKEv2 Protocol Support for High Availability

730	   [RFC6311] defines support for High Availability in IKEv2.  In case
731	   of cluster failover, a new active node must immediately initiate a
732	   special INFORMATIONAL exchange containing the IKEV2_MESSAGE_ID_SYNC
733	   notification, which instructs the client to skip some number of
734	   Message IDs that might not be synchronized yet between nodes at the
735	   time of failover.

737	   Synchronizing states when using TCP encapsulation is much harder than
738	   when using UDP; doing so requires access to TCP/IP stack internals,
739	   which is not always available from an IKE/IPsec implementation.  If a
740	   cluster implementation doesn't synchronize TCP states between nodes,
741	   then after a failover event the new active node will not have any TCP
742	   connection with the client, so the node cannot initiate the
743	   INFORMATIONAL exchange as required by [RFC6311].  Since the cluster
744	   usually acts as TCP Responder, the new active node cannot
745	   re-establish the TCP connection, since only the TCP Originator can do
746	   it.  For the client, the cluster failover event may remain undetected
747	   for a long time if it has no IKE or ESP traffic to send.  Once the
748	   client sends an ESP or IKEv2 packet, the cluster node will reply with
749	   a TCP RST and the client (as TCP Originator) will reestablish the TCP
750	   connection so that the node will be able to initiate the
751	   INFORMATIONAL exchange informing the client about the cluster
752	   failover.

754	   This document makes the following recommendation: if support for High
755	   Availability in IKEv2 is negotiated and TCP transport is used, a
756	   client that is a TCP Originator SHOULD periodically send IKEv2
757	   messages (e.g., by initiating a liveness check exchange) whenever
758	   there is no IKEv2 or ESP traffic.  This differs from the
759	   recommendations given in Section 2.4 of [RFC7296] in the following:
760	   the liveness check should be periodically performed even if the
761	   client has nothing to send over ESP.  The frequency of sending such
762	   messages should be high enough to allow quick detection and
763	   restoration of a broken TCP connection.

765	8.5.  IKEv2 Fragmentation

767	   IKE message fragmentation [RFC7383] is not required when using TCP
768	   encapsulation, since a TCP stream already handles the fragmentation
769	   of its contents across packets.  Since fragmentation is redundant in
770	   this case, implementations might choose to not negotiate IKE
771	   fragmentation.  Even if fragmentation is negotiated, an
772	   implementation SHOULD NOT send fragments when going over a TCP
773	   connection, although it MUST support receiving fragments.

775	   If an implementation supports both MOBIKE and IKE fragmentation, it
776	   SHOULD negotiate IKE fragmentation over a TCP-encapsulated session in
777	   case the session switches to UDP encapsulation on another network.

779	9.  Middlebox Considerations

781	   Many security networking devices, such as firewalls or intrusion
782	   prevention systems, network optimization/acceleration devices, and
783	   NAT devices, keep the state of sessions that traverse through them.

785	   These devices commonly track the transport-layer and/or application-
786	   layer data to drop traffic that is anomalous or malicious in nature.
787	   While many of these devices will be more likely to pass TCP-
788	   encapsulated traffic as opposed to UDP-encapsulated traffic, some may
789	   still block or interfere with TCP-encapsulated IKE and IPsec traffic.

791	   A network device that monitors the transport layer will track the
792	   state of TCP sessions, such as TCP sequence numbers.  TCP
793	   encapsulation of IKE should therefore use standard TCP behaviors to
794	   avoid being dropped by middleboxes.

796	10.  Performance Considerations

798	   Several aspects of TCP encapsulation for IKE and IPsec packets may
799	   negatively impact the performance of connections within a tunnel-mode
800	   IPsec SA.  Implementations should be aware of these performance
801	   impacts and take these into consideration when determining when to
802	   use TCP encapsulation.  Implementations SHOULD favor using direct ESP
803	   or UDP encapsulation over TCP encapsulation whenever possible.

805	10.1.  TCP-in-TCP

807	   If the outer connection between IKE peers is over TCP, inner TCP
808	   connections may suffer negative effects from using TCP within TCP.
809	   Running TCP within TCP is discouraged, since the TCP algorithms
810	   generally assume that they are running over an unreliable datagram
811	   layer.

813	   If the outer (tunnel) TCP connection experiences packet loss, this
814	   loss will be hidden from any inner TCP connections, since the outer
815	   connection will retransmit to account for the losses.  Since the
816	   outer TCP connection will deliver the inner messages in order, any
817	   messages after a lost packet may have to wait until the loss is
818	   recovered.  This means that loss on the outer connection will be
819	   interpreted only as delay by inner connections.  The burstiness of
820	   inner traffic can increase, since a large number of inner packets may
821	   be delivered across the tunnel at once.  The inner TCP connection may
822	   interpret a long period of delay as a transmission problem,
823	   triggering a retransmission timeout, which will cause spurious
824	   retransmissions.  The sending rate of the inner connection may be
825	   unnecessarily reduced if the retransmissions are not detected as
826	   spurious in time.

828	   The inner TCP connection's round-trip-time estimation will be
829	   affected by the burstiness of the outer TCP connection if there are
830	   long delays when packets are retransmitted by the outer TCP
831	   connection.  This will make the congestion control loop of the inner
832	   TCP traffic less reactive, potentially permanently leading to a lower
833	   sending rate than the outer TCP would allow for.

835	   TCP-in-TCP can also lead to increased buffering, or bufferbloat.
836	   This can occur when the window size of the outer TCP connection is
837	   reduced and becomes smaller than the window sizes of the inner TCP
838	   connections.  This can lead to packets backing up in the outer TCP
839	   connection's send buffers.  In order to limit this effect, the outer
840	   TCP connection should have limits on its send buffer size and on the
841	   rate at which it reduces its window size.

843	   Note that any negative effects will be shared between all flows going
844	   through the outer TCP connection.  This is of particular concern for
845	   any latency-sensitive or real-time applications using the tunnel.  If
846	   such traffic is using a TCP-encapsulated IPsec connection, it is
847	   recommended that the number of inner connections sharing the tunnel
848	   be limited as much as possible.

850	10.2.  Added Reliability for Unreliable Protocols

852	   Since ESP is an unreliable protocol, transmitting ESP packets over a
853	   TCP connection will change the fundamental behavior of the packets.
854	   Some application-level protocols that prefer packet loss to delay
855	   (such as Voice over IP or other real-time protocols) may be
856	   negatively impacted if their packets are retransmitted by the TCP
857	   connection due to packet loss.

859	10.3.  Quality-of-Service Markings

861	   Quality-of-Service (QoS) markings, such as the Differentiated
862	   Services Code Point (DSCP) and Traffic Class, should be used with
863	   care on TCP connections used for encapsulation.  Individual packets
864	   SHOULD NOT use different markings than the rest of the connection,
865	   since packets with different priorities may be routed differently and
866	   cause unnecessary delays in the connection.

868	10.4.  Maximum Segment Size

870	   A TCP connection used for IKE encapsulation SHOULD negotiate its MSS
871	   in order to avoid unnecessary fragmentation of packets.

873	10.5.  Tunneling ECN in TCP

875	   Since there is not a one-to-one relationship between outer IP packets
876	   and inner ESP/IP messages when using TCP encapsulation, the markings
877	   for Explicit Congestion Notification (ECN) [RFC3168] cannot be simply
878	   mapped.  However, any ECN Congestion Experienced (CE) marking on
879	   inner headers should be preserved through the tunnel.

881	   Implementations SHOULD follow the ECN compatibility mode for tunnel
882	   ingress as described in [RFC6040].  In compatibility mode, the outer
883	   tunnel TCP connection marks its packet headers as not ECN-capable.
884	   If upon egress, the arriving outer header is marked with CE, the
885	   implementation will drop the inner packet, since there is not a
886	   distinct inner packet header onto which to translate the ECN
887	   markings.

889	11.  Security Considerations

891	   IKE Responders that support TCP encapsulation may become vulnerable
892	   to new Denial-of-Service (DoS) attacks that are specific to TCP, such
893	   as SYN-flooding attacks.  TCP Responders should be aware of this
894	   additional attack surface.

896	   TCP Responders should be careful to ensure that (1) the stream prefix
897	   "IKETCP" uniquely identifies incoming streams as streams that use the
898	   TCP encapsulation protocol and (2) they are not running any other
899	   protocols on the same listening port (to avoid potential conflicts).
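
   A TCP Responder could check the stream prefix incrementally as bytes
   arrive, as in the following non-normative Python sketch.  The
   function name and the three-valued return convention are assumptions
   of this sketch.

```python
STREAM_PREFIX = b"IKETCP"


def check_stream_prefix(received: bytes):
    """Check the first bytes of a new stream against the stream prefix.

    Returns True once the full prefix has matched, False as soon as any
    received byte deviates from it, and None while more bytes are needed.
    """
    n = min(len(received), len(STREAM_PREFIX))
    if received[:n] != STREAM_PREFIX[:n]:
        return False
    return True if n == len(STREAM_PREFIX) else None
```

   Rejecting on the first mismatching byte lets the Responder close
   streams that belong to some other protocol without buffering them.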

901	   Attackers may be able to disrupt the TCP connection by sending
902	   spurious TCP Reset packets.  Therefore, implementations SHOULD make
903	   sure that IKE session state persists even if the underlying TCP
904	   connection is torn down.

906	   If MOBIKE is being used, all of the security considerations outlined
907	   for MOBIKE apply [RFC4555].

909	   Similarly to MOBIKE, TCP encapsulation requires a TCP Responder to
910	   handle changes to source address and port due to network or
911	   connection disruption.  The successful delivery of valid IKE or ESP
912	   messages over a new TCP connection is used by the TCP Responder to
913	   determine where to send subsequent responses.  If an attacker is able
914	   to send packets on a new TCP connection that pass the validation
915	   checks of the TCP Responder, it can influence which path future
916	   packets will take.  For this reason, the validation of messages on
917	   the TCP Responder must include decryption, authentication, and replay
918	   checks.

920	   Since TCP provides reliable, in-order delivery of ESP messages, the
921	   ESP anti-replay window size SHOULD be set to 1.  See [RFC4303] for a
922	   complete description of the ESP anti-replay window.  This increases
923	   the protection of implementations against replay attacks.
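
   With a window size of 1, the anti-replay check reduces to accepting
   only sequence numbers greater than the highest one seen so far.  The
   following non-normative Python sketch shows that behavior; the class
   name is hypothetical, and the sketch assumes it is called only after
   the packet's integrity check has succeeded and ignores extended
   sequence numbers.

```python
class ReplayWindowOfOne:
    """Anti-replay state for one inbound SA with a window size of 1."""

    def __init__(self):
        self.highest_seen = 0  # the first valid ESP sequence number is 1

    def accept(self, seq: int) -> bool:
        """Return True and advance the window if seq is new, else False."""
        if seq <= self.highest_seen:
            return False  # duplicate or older than the window
        self.highest_seen = seq
        return True
```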

925	12.  IANA Considerations

927	   TCP port 4500 is already allocated to IPsec for NAT traversal.  This
928	   port SHOULD be used for TCP-encapsulated IKE and ESP as described in
929	   this document.

931	   This document updates the reference for TCP port 4500 from RFC 8229
932	   to itself:

934	             Keyword       Decimal    Description           Reference
935	             -----------   --------   -------------------   ---------
936	             ipsec-nat-t   4500/tcp   IPsec NAT-Traversal   [RFCXXXX]

938	                                 Figure 4

940	13.  References

942	13.1.  Normative References

944	   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
945	              Requirement Levels", BCP 14, RFC 2119,
946	              DOI 10.17487/RFC2119, March 1997,
947	              <https://www.rfc-editor.org/info/rfc2119>.

949	   [RFC3948]  Huttunen, A., Swander, B., Volpe, V., DiBurro, L., and M.
950	              Stenberg, "UDP Encapsulation of IPsec ESP Packets",
951	              RFC 3948, DOI 10.17487/RFC3948, January 2005,
952	              <https://www.rfc-editor.org/info/rfc3948>.

954	   [RFC4301]  Kent, S. and K. Seo, "Security Architecture for the
955	              Internet Protocol", RFC 4301, DOI 10.17487/RFC4301,
956	              December 2005, <https://www.rfc-editor.org/info/rfc4301>.

958	   [RFC4303]  Kent, S., "IP Encapsulating Security Payload (ESP)",
959	              RFC 4303, DOI 10.17487/RFC4303, December 2005,
960	              <https://www.rfc-editor.org/info/rfc4303>.

962	   [RFC6040]  Briscoe, B., "Tunnelling of Explicit Congestion
963	              Notification", RFC 6040, DOI 10.17487/RFC6040, November
964	              2010, <https://www.rfc-editor.org/info/rfc6040>.

966	   [RFC7296]  Kaufman, C., Hoffman, P., Nir, Y., Eronen, P., and T.
967	              Kivinen, "Internet Key Exchange Protocol Version 2
968	              (IKEv2)", STD 79, RFC 7296, DOI 10.17487/RFC7296, October
969	              2014, <https://www.rfc-editor.org/info/rfc7296>.

971	   [RFC8019]  Nir, Y. and V. Smyslov, "Protecting Internet Key Exchange
972	              Protocol Version 2 (IKEv2) Implementations from
973	              Distributed Denial-of-Service Attacks", RFC 8019,
974	              DOI 10.17487/RFC8019, November 2016,
975	              <https://www.rfc-editor.org/info/rfc8019>.

977	   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
978	              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
979	              May 2017, <https://www.rfc-editor.org/info/rfc8174>.

981	13.2.  Informative References

983	   [I-D.ietf-ipsecme-ike-tcp]
984	              Nir, Y., "A TCP transport for the Internet Key Exchange",
985	              draft-ietf-ipsecme-ike-tcp-01 (work in progress), December
986	              2012.

988	   [RFC1122]  Braden, R., Ed., "Requirements for Internet Hosts -
989	              Communication Layers", STD 3, RFC 1122,
990	              DOI 10.17487/RFC1122, October 1989,
991	              <https://www.rfc-editor.org/info/rfc1122>.

993	   [RFC2817]  Khare, R. and S. Lawrence, "Upgrading to TLS Within
994	              HTTP/1.1", RFC 2817, DOI 10.17487/RFC2817, May 2000,
995	              <https://www.rfc-editor.org/info/rfc2817>.

997	   [RFC3168]  Ramakrishnan, K., Floyd, S., and D. Black, "The Addition
998	              of Explicit Congestion Notification (ECN) to IP",
999	              RFC 3168, DOI 10.17487/RFC3168, September 2001,
1000	              <https://www.rfc-editor.org/info/rfc3168>.

1002	   [RFC4555]  Eronen, P., "IKEv2 Mobility and Multihoming Protocol
1003	              (MOBIKE)", RFC 4555, DOI 10.17487/RFC4555, June 2006,
1004	              <https://www.rfc-editor.org/info/rfc4555>.

1006	   [RFC4621]  Kivinen, T. and H. Tschofenig, "Design of the IKEv2
1007	              Mobility and Multihoming (MOBIKE) Protocol", RFC 4621,
1008	              DOI 10.17487/RFC4621, August 2006,
1009	              <https://www.rfc-editor.org/info/rfc4621>.

1011	   [RFC4987]  Eddy, W., "TCP SYN Flooding Attacks and Common
1012	              Mitigations", RFC 4987, DOI 10.17487/RFC4987, August 2007,
1013	              <https://www.rfc-editor.org/info/rfc4987>.

1015	   [RFC5246]  Dierks, T. and E. Rescorla, "The Transport Layer Security
1016	              (TLS) Protocol Version 1.2", RFC 5246,
1017	              DOI 10.17487/RFC5246, August 2008,
1018	              <https://www.rfc-editor.org/info/rfc5246>.

1020	   [RFC5685]  Devarapalli, V. and K. Weniger, "Redirect Mechanism for
1021	              the Internet Key Exchange Protocol Version 2 (IKEv2)",
1022	              RFC 5685, DOI 10.17487/RFC5685, November 2009,
1023	              <https://www.rfc-editor.org/info/rfc5685>.

1025	   [RFC5723]  Sheffer, Y. and H. Tschofenig, "Internet Key Exchange
1026	              Protocol Version 2 (IKEv2) Session Resumption", RFC 5723,
1027	              DOI 10.17487/RFC5723, January 2010,
1028	              <https://www.rfc-editor.org/info/rfc5723>.

1030	   [RFC6311]  Singh, R., Ed., Kalyani, G., Nir, Y., Sheffer, Y., and D.
1031	              Zhang, "Protocol Support for High Availability of IKEv2/
1032	              IPsec", RFC 6311, DOI 10.17487/RFC6311, July 2011,
1033	              <https://www.rfc-editor.org/info/rfc6311>.

1035	   [RFC6520]  Seggelmann, R., Tuexen, M., and M. Williams, "Transport
1036	              Layer Security (TLS) and Datagram Transport Layer Security
1037	              (DTLS) Heartbeat Extension", RFC 6520,
1038	              DOI 10.17487/RFC6520, February 2012,
1039	              <https://www.rfc-editor.org/info/rfc6520>.

1041	   [RFC6528]  Gont, F. and S. Bellovin, "Defending against Sequence
1042	              Number Attacks", RFC 6528, DOI 10.17487/RFC6528, February
1043	              2012, <https://www.rfc-editor.org/info/rfc6528>.

1045	   [RFC7383]  Smyslov, V., "Internet Key Exchange Protocol Version 2
1046	              (IKEv2) Message Fragmentation", RFC 7383,
1047	              DOI 10.17487/RFC7383, November 2014,
1048	              <https://www.rfc-editor.org/info/rfc7383>.

1050	   [RFC8229]  Pauly, T., Touati, S., and R. Mantha, "TCP Encapsulation
1051	              of IKE and IPsec Packets", RFC 8229, DOI 10.17487/RFC8229,
1052	              August 2017, <https://www.rfc-editor.org/info/rfc8229>.

1054	   [RFC8446]  Rescorla, E., "The Transport Layer Security (TLS) Protocol
1055	              Version 1.3", RFC 8446, DOI 10.17487/RFC8446, August 2018,
1056	              <https://www.rfc-editor.org/info/rfc8446>.

1058	Appendix A.  Using TCP Encapsulation with TLS

1060	   This section provides recommendations on how to use TLS in addition
1061	   to TCP encapsulation.

1063	   When using TCP encapsulation, implementations may choose to use TLS
1064	   1.2 [RFC5246] or TLS 1.3 [RFC8446] on the TCP connection to be able
1065	   to traverse middleboxes, which may otherwise block the traffic.

1067	   If a web proxy is applied to the ports used for the TCP connection
1068	   and TLS is being used, the TCP Originator can send an HTTP CONNECT
1069	   message to establish an SA through the proxy [RFC2817].

1071	   The use of TLS should be configurable on the peers, and may be used
1072	   as the default when using TCP encapsulation or may be used as a
1073	   fallback when basic TCP encapsulation fails.  The TCP Responder may
1074	   expect to read encapsulated IKE and ESP packets directly from the TCP
1075	   connection, or it may expect to read them from a stream of TLS data
1076	   packets.  The TCP Originator should be pre-configured to use TLS or
1077	   not when communicating with a given port on the TCP Responder.

1079	   When new TCP connections are re-established due to a broken
1080	   connection, TLS must be renegotiated.  TLS session resumption is
1081	   recommended to improve efficiency in this case.

1083	   The security of the IKE session is entirely derived from the IKE
1084	   negotiation and key establishment and not from the TLS session (which
1085	   in this context is only used for encapsulation purposes); therefore,
1086	   when TLS is used on the TCP connection, both the TCP Originator and
1087	   the TCP Responder SHOULD allow the NULL cipher to be selected for
1088	   performance reasons.  Note that TLS 1.3 only supports AEAD
1089	   algorithms and at the time of writing this document there was no
1090	   recommended cipher suite for TLS 1.3 with the NULL cipher.

1092	   Implementations should be aware that the use of TLS introduces
1093	   another layer of overhead requiring more bytes to transmit a given
1094	   IKE and IPsec packet.  For this reason, direct ESP, UDP
1095	   encapsulation, or TCP encapsulation without TLS should be preferred
1096	   in situations in which TLS is not required in order to traverse
1097	   middleboxes.

1099	Appendix B.  Example Exchanges of TCP Encapsulation with TLS 1.3

1101	B.1.  Establishing an IKE Session

1103	                   Client                              Server
1104	                 ----------                          ----------
1105	     1)  --------------------  TCP Connection  -------------------
1106	         (IP_I:Port_I  -> IP_R:Port_R)
1107	         TcpSyn                    ---------->
1108	                                   <----------          TcpSyn,Ack
1109	         TcpAck                    ---------->

1111	     2)  ---------------------  TLS Session  ---------------------
1112	         ClientHello               ---------->
1113	                                                       ServerHello
1114	                                             {EncryptedExtensions}
1115	                                                    {Certificate*}
1116	                                              {CertificateVerify*}
1117	                                   <----------          {Finished}
1118	         {Finished}                ---------->

1120	     3)  ---------------------- Stream Prefix --------------------
1121	         "IKETCP"                  ---------->
1122	     4)  ----------------------- IKE Session ---------------------
1123	         Length + Non-ESP Marker   ---------->
1124	         IKE_SA_INIT
1125	         HDR, SAi1, KEi, Ni,
1126	         [N(NAT_DETECTION_*_IP)]
1127	                                   <------ Length + Non-ESP Marker
1128	                                                       IKE_SA_INIT
1129	                                               HDR, SAr1, KEr, Nr,
1130	                                           [N(NAT_DETECTION_*_IP)]
1131	         Length + Non-ESP Marker   ---------->
1132	         first IKE_AUTH
1133	         HDR, SK {IDi, [CERTREQ,]
1134	         CP(CFG_REQUEST), IDr,
1135	         SAi2, TSi, TSr, ...}
1136	                                   <------ Length + Non-ESP Marker
1137	                                                    first IKE_AUTH
1138	                                       HDR, SK {IDr, [CERT], AUTH,
1139	                                              EAP, SAr2, TSi, TSr}

1141	         Length + Non-ESP Marker   ---------->
1142	         IKE_AUTH + EAP
1143	         repeat 1..N times
1144	                                   <------ Length + Non-ESP Marker
1145	                                                    IKE_AUTH + EAP
1146	         Length + Non-ESP Marker   ---------->
1147	         final IKE_AUTH
1148	         HDR, SK {AUTH}
1149	                                   <------ Length + Non-ESP Marker
1150	                                                    final IKE_AUTH
1151	                                     HDR, SK {AUTH, CP(CFG_REPLY),
1152	                                                SA, TSi, TSr, ...}
1153	         -------------- IKE and IPsec SAs Established ------------
1154	         Length + ESP Frame        ---------->

1156	                                 Figure 5

1158	   1.  The client establishes a TCP connection with the server on port
1159	       4500 or on an alternate pre-configured port that the server is
1160	       listening on.

1162	   2.  If configured to use TLS, the client initiates a TLS handshake.
1163	       During the TLS handshake, the server SHOULD NOT request the
1164	       client's certificate, since authentication is handled as part of
1165	       IKE negotiation.

1167	   3.  The client sends the stream prefix for TCP-encapsulated IKE
1168	       (Section 5) traffic to signal the beginning of IKE negotiation.

1170	   4.  The client and server establish an IKE connection.  This example
1171	       shows EAP-based authentication, although any authentication type
1172	       may be used.
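The framing used in steps 3 and 4 of Figure 5 can be sketched as follows.
This is a minimal illustration, not a reference implementation.  It assumes
that the Length field is a 16-bit value in network byte order that counts
its own two octets, and that the Non-ESP Marker is four zero octets, as
described in the framing sections of this document.  The function names
(frame, frame_ike, frame_esp, opening_bytes) are hypothetical.

```python
import struct

STREAM_PREFIX = b"IKETCP"      # sent once, after TCP (and optional TLS) setup
NON_ESP_MARKER = b"\x00" * 4   # four zero octets identify an IKE message
LENGTH_FIELD_SIZE = 2          # assumption: the Length value counts itself


def frame(payload: bytes) -> bytes:
    """Prefix a payload with the 16-bit Length field."""
    total = LENGTH_FIELD_SIZE + len(payload)
    if total > 0xFFFF:
        raise ValueError("message too large for a 16-bit Length field")
    return struct.pack("!H", total) + payload


def frame_ike(ike_message: bytes) -> bytes:
    """Length + Non-ESP Marker + IKE message (step 4 of Figure 5)."""
    return frame(NON_ESP_MARKER + ike_message)


def frame_esp(esp_packet: bytes) -> bytes:
    """Length + ESP packet; the first four octets are the non-zero SPI."""
    if esp_packet[:4] == NON_ESP_MARKER:
        raise ValueError("an ESP packet must not start with four zero octets")
    return frame(esp_packet)


def opening_bytes(ike_sa_init_request: bytes) -> bytes:
    """First bytes the client writes: stream prefix, then IKE_SA_INIT."""
    return STREAM_PREFIX + frame_ike(ike_sa_init_request)
```

The caller would write the result of opening_bytes() to the TCP socket (or
to the TLS session, if TLS is in use) and then send every later IKE or ESP
message through frame_ike() or frame_esp().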

1174	B.2.  Deleting an IKE Session
1175	                   Client                              Server
1176	                 ----------                          ----------
1177	     1)  ----------------------- IKE Session ---------------------
1178	         Length + Non-ESP Marker   ---------->
1179	         INFORMATIONAL
1180	         HDR, SK {[N,] [D,]
1181	                [CP,] ...}
1182	                                   <------ Length + Non-ESP Marker
1183	                                                     INFORMATIONAL
1184	                                                HDR, SK {[N,] [D,]
1185	                                                        [CP,] ...}

1187	     2)  ---------------------  TLS Session  ---------------------
1188	         close_notify              ---------->
1189	                                   <----------        close_notify
1190	     3)  --------------------  TCP Connection  -------------------
1191	         TcpFin                    ---------->
1192	                                   <----------                 Ack
1193	                                   <----------              TcpFin
1194	         Ack                       ---------->
1195	         --------------------  IKE SA Deleted  -------------------

1197	                                 Figure 6

1199	   1.  The client and server exchange INFORMATIONAL messages to signal
1200	       deletion of the IKE SA.

1202	   2.  The client and server close the TLS session by exchanging TLS
1203	       close_notify alerts.

1205	   3.  The TCP connection is torn down.

1207	   The deletion of the IKE SA should lead to the disposal of the
1208	   underlying TLS and TCP state.

1210	B.3.  Re-establishing an IKE Session
1211	                   Client                              Server
1212	                 ----------                          ----------
1213	     1)  --------------------  TCP Connection  -------------------
1214	         (IP_I:Port_I  -> IP_R:Port_R)
1215	         TcpSyn                    ---------->
1216	                                   <----------          TcpSyn,Ack
1217	         TcpAck                    ---------->
1218	     2)  ---------------------  TLS Session  ---------------------
1219	         ClientHello               ---------->
1220	                                                       ServerHello
1221	                                             {EncryptedExtensions}
1222	                                   <----------          {Finished}
1223	         {Finished}                ---------->
1224	     3)  ---------------------- Stream Prefix --------------------
1225	         "IKETCP"                  ---------->
1226	     4)  <---------------------> IKE/ESP Flow <------------------>
1227	         Length + ESP Frame        ---------->

1229	                                 Figure 7

1231	   1.  If a previous TCP connection was broken (for example, due to a
1232	       TCP Reset), the client is responsible for re-initiating the TCP
1233	       connection.  The TCP Originator's address and port (IP_I and
1234	       Port_I) may be different from the previous connection's address
1235	       and port.

1237	   2.  The client SHOULD attempt TLS session resumption if it has
1238	       previously established a session with the server.

1240	   3.  After TCP and TLS are complete, the client sends the stream
1241	       prefix for TCP-encapsulated IKE traffic (Section 5).

1243	   4.  The IKE and ESP packet flow can resume.  If MOBIKE is being used,
1244	       the Initiator SHOULD send an UPDATE_SA_ADDRESSES message.
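On the server side, the re-established stream in Figure 7 starts with the
stream prefix and may be followed directly by ESP frames.  The sketch below
shows one way a receiver might validate the prefix and split the byte
stream into messages.  It assumes a 16-bit network-order Length field that
counts its own two octets and a four-octet zero Non-ESP Marker; the name
parse_stream and the ("IKE" / "ESP", bytes) tuples are hypothetical.

```python
import struct

STREAM_PREFIX = b"IKETCP"
NON_ESP_MARKER = b"\x00" * 4


def parse_stream(buffer: bytes):
    """Parse everything received so far on one TCP connection.

    Returns (messages, consumed): the complete messages found and the
    number of bytes of `buffer` they occupy (including the prefix).
    Bytes beyond `consumed` belong to a message that is still incomplete.
    """
    if len(buffer) < len(STREAM_PREFIX):
        if not STREAM_PREFIX.startswith(buffer):
            raise ValueError("stream does not start with the IKETCP prefix")
        return [], 0  # prefix not fully received yet
    if not buffer.startswith(STREAM_PREFIX):
        raise ValueError("stream does not start with the IKETCP prefix")

    offset = len(STREAM_PREFIX)
    messages = []
    while len(buffer) - offset >= 2:
        (length,) = struct.unpack_from("!H", buffer, offset)
        if length < 2:
            raise ValueError("invalid Length field")
        if len(buffer) - offset < length:
            break  # wait for the rest of this message
        body = buffer[offset + 2:offset + length]
        offset += length
        if not body:
            continue  # empty message: nothing to deliver
        if body[:4] == NON_ESP_MARKER:
            messages.append(("IKE", body[4:]))
        else:
            # Simplification: short bodies such as keepalives are not
            # handled separately in this sketch.
            messages.append(("ESP", body))
    return messages, offset
```

A caller would append newly received bytes to its buffer, call
parse_stream() again, and discard the consumed bytes only once it tracks
the prefix state separately; that bookkeeping is omitted here.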

1246	B.4.  Using MOBIKE between UDP and TCP Encapsulation

1248	                     Client                              Server
1249	                   ----------                          ----------
1250	         (IP_I1:UDP500 -> IP_R:UDP500)
1251	     1)  ----------------- IKE_SA_INIT Exchange -----------------
1252	         (IP_I1:UDP4500 -> IP_R:UDP4500)
1253	         Non-ESP Marker           ----------->
1254	         Initial IKE_AUTH
1255	         HDR, SK { IDi, CERT, AUTH,
1256	         CP(CFG_REQUEST),
1257	         SAi2, TSi, TSr,
1258	         N(MOBIKE_SUPPORTED) }
1259	                                  <-----------      Non-ESP Marker
1260	                                                  Initial IKE_AUTH
1261	                                        HDR, SK { IDr, CERT, AUTH,
1262	                                              EAP, SAr2, TSi, TSr,
1263	                                             N(MOBIKE_SUPPORTED) }
1264	         <------------------ IKE SA Establishment --------------->

1266	     2)  ------------ MOBIKE Attempt on New Network --------------
1267	         (IP_I2:UDP4500 -> IP_R:UDP4500)
1268	         Non-ESP Marker           ----------->
1269	         INFORMATIONAL
1270	         HDR, SK { N(UPDATE_SA_ADDRESSES),
1271	         N(NAT_DETECTION_SOURCE_IP),
1272	         N(NAT_DETECTION_DESTINATION_IP) }

1274	     3)  --------------------  TCP Connection  -------------------
1275	         (IP_I2:Port_I -> IP_R:Port_R)
1276	         TcpSyn                   ----------->
1277	                                  <-----------          TcpSyn,Ack
1278	         TcpAck                   ----------->

1280	     4)  ---------------------  TLS Session  ---------------------
1281	         ClientHello               ---------->
1282	                                                       ServerHello
1283	                                             {EncryptedExtensions}
1284	                                                    {Certificate*}
1285	                                              {CertificateVerify*}
1286	                                   <----------          {Finished}
1287	         {Finished}                ---------->

1289	     5)  ---------------------- Stream Prefix --------------------
1290	         "IKETCP"                  ---------->

1292	     6)  ----------------------- IKE Session ---------------------
1293	         Length + Non-ESP Marker  ----------->
1294	         INFORMATIONAL (Same as step 2)
1295	         HDR, SK { N(UPDATE_SA_ADDRESSES),
1296	         N(NAT_DETECTION_SOURCE_IP),
1297	         N(NAT_DETECTION_DESTINATION_IP) }

1299	                                  <------- Length + Non-ESP Marker
1300	                             HDR, SK { N(NAT_DETECTION_SOURCE_IP),
1301	                                 N(NAT_DETECTION_DESTINATION_IP) }
1302	     7)  <----------------- IKE/ESP Data Flow ------------------->

1304	                                 Figure 8

1306	   1.  During the IKE_AUTH exchange, the client and server exchange
1307	       MOBIKE_SUPPORTED notify payloads to indicate support for MOBIKE.

1309	   2.  The client changes its point of attachment to the network and
1310	       receives a new IP address.  The client attempts to re-establish
1311	       the IKE session using the UPDATE_SA_ADDRESSES notify payload, but
1312	       the server does not respond because the network blocks UDP
1313	       traffic.

1315	   3.  The client brings up a TCP connection to the server in order to
1316	       use TCP encapsulation.

1318	   4.  The client initiates a TLS handshake with the server.

1320	   5.  The client sends the stream prefix for TCP-encapsulated IKE
1321	       traffic (Section 5).

1323	   6.  The client sends the UPDATE_SA_ADDRESSES notify payload on the
1324	       TCP-encapsulated connection.  Note that this IKE message is the
1325	       same as the one sent over UDP in step 2; it should have the same
1326	       message ID and contents.

1328	   7.  The IKE and ESP packet flow can resume.
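Step 6 of Figure 8 relies on the IKE message itself being unchanged when
the client moves from UDP to TCP; only the encapsulation differs.  The
sketch below illustrates that relationship under the assumptions that UDP
port 4500 encapsulation prepends a four-octet zero Non-ESP Marker and that
TCP encapsulation additionally prepends a 16-bit network-order Length field
that counts its own two octets.  The function names are hypothetical.

```python
import struct

NON_ESP_MARKER = b"\x00" * 4


def encapsulate_udp4500(ike_message: bytes) -> bytes:
    """Datagram payload for IKE over UDP port 4500 (step 2 of Figure 8)."""
    return NON_ESP_MARKER + ike_message


def encapsulate_tcp(ike_message: bytes) -> bytes:
    """Framed IKE message for the TCP stream (step 6 of Figure 8)."""
    body = NON_ESP_MARKER + ike_message
    return struct.pack("!H", 2 + len(body)) + body


# The same INFORMATIONAL request (same message ID and contents) is reused;
# a placeholder byte string stands in for the real IKE message here.
update_sa_addresses = bytes(range(28))
udp_payload = encapsulate_udp4500(update_sa_addresses)
tcp_frame = encapsulate_tcp(update_sa_addresses)
```

With these definitions, the TCP frame is the UDP payload preceded by the
two-octet Length field, so a retransmission over TCP carries exactly the
bytes of the IKE message that was first sent over UDP.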

1330	Acknowledgments

1332	   The following people provided valuable feedback and advice while
1333	   preparing RFC 8229: Stuart Cheshire, Delziel Fernandes, Yoav Nir,
1334	   Christoph Paasch, Yaron Sheffer, David Schinazi, Graham Bartlett,
1335	   Byju Pularikkal, March Wu, Kingwel Xie, Valery Smyslov, Jun Hu, and
1336	   Tero Kivinen.  Special thanks to Eric Kinnear for his implementation
1337	   work.

1339	   The authors would like to thank Tero Kivinen and Paul Wouters for
1340	   their valuable comments while preparing this document.

1342	Authors' Addresses

1344	   Valery Smyslov
1345	   ELVIS-PLUS
1346	   PO Box 81
1347	   Moscow (Zelenograd)  124460
1348	   Russian Federation

1350	   Phone: +7 495 276 0211
1351	   Email: svan@elvis.ru
1352	   Tommy Pauly
1353	   Apple Inc.
1354	   1 Infinite Loop
1355	   Cupertino, California  95014
1356	   United States of America

1358	   Email: tpauly@apple.com