idnits 2.17.1 

draft-trammell-plus-statefulness-04.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (November 13, 2017) is 2354 days in the past.  Is this
     intentional?


  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  == Unused Reference: 'I-D.hardie-path-signals' is defined on line 703, but
     no explicit reference was found in the text

  == Unused Reference: 'I-D.ietf-quic-tls' is defined on line 707, but no
     explicit reference was found in the text

  == Outdated reference: A later version (-03) exists of
     draft-hardie-path-signals-01

  == Outdated reference: A later version (-34) exists of
     draft-ietf-quic-tls-07

  == Outdated reference: A later version (-34) exists of
     draft-ietf-quic-transport-07

  -- Obsolete informational reference (is this intentional?): RFC  793
     (Obsoleted by RFC 9293)


     Summary: 0 errors (**), 0 flaws (~~), 6 warnings (==), 2 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.
--------------------------------------------------------------------------------


2	Network Working Group                                      M. Kuehlewind
3	Internet-Draft                                               B. Trammell
4	Intended status: Informational                                ETH Zurich
5	Expires: May 17, 2018                                      J. Hildebrand
6	                                                       November 13, 2017

8	           Transport-Independent Path Layer State Management
9	                  draft-trammell-plus-statefulness-04

11	Abstract

13	   This document describes a simple state machine for stateful network
14	   devices on a path between two endpoints to associate state with
15	   traffic traversing them on a per-flow basis, as well as abstract
16	   signaling mechanisms for driving the state machine.  This state
17	   machine is intended to replace the de-facto use of the TCP state
18	   machine or incomplete forms thereof by stateful network devices in a
19	   transport-independent way, while still allowing for fast state
20	   timeout of non-established or undesirable flows.

22	Status of This Memo

24	   This Internet-Draft is submitted in full conformance with the
25	   provisions of BCP 78 and BCP 79.

27	   Internet-Drafts are working documents of the Internet Engineering
28	   Task Force (IETF).  Note that other groups may also distribute
29	   working documents as Internet-Drafts.  The list of current Internet-
30	   Drafts is at https://datatracker.ietf.org/drafts/current/.

32	   Internet-Drafts are draft documents valid for a maximum of six months
33	   and may be updated, replaced, or obsoleted by other documents at any
34	   time.  It is inappropriate to use Internet-Drafts as reference
35	   material or to cite them other than as "work in progress."

37	   This Internet-Draft will expire on May 17, 2018.

39	Copyright Notice

41	   Copyright (c) 2017 IETF Trust and the persons identified as the
42	   document authors.  All rights reserved.

44	   This document is subject to BCP 78 and the IETF Trust's Legal
45	   Provisions Relating to IETF Documents
46	   (https://trustee.ietf.org/license-info) in effect on the date of
47	   publication of this document.  Please review these documents
48	   carefully, as they describe your rights and restrictions with respect
49	   to this document.  Code Components extracted from this document must
50	   include Simplified BSD License text as described in Section 4.e of
51	   the Trust Legal Provisions and are provided without warranty as
52	   described in the Simplified BSD License.

54	Table of Contents

56	   1.  Introduction
57	   2.  Terminology
58	   3.  State Machine
59	     3.1.  Uniflow States
60	     3.2.  Biflow States
61	     3.3.  Additional States and Actions
62	   4.  Abstract Signaling Mechanisms
63	     4.1.  Flow Identification
64	     4.2.  Association and Confirmation Signaling
65	       4.2.1.  Start-of-flow versus continual signaling
66	     4.3.  Bidirectional Stop Signaling
67	       4.3.1.  Authenticated Stop Signaling
68	     4.4.  Separate Utility
69	   5.  Deployment Considerations
70	     5.1.  Middlebox Deployment
71	     5.2.  Endpoint Deployment
72	   6.  Signal mappings for transport protocols
73	     6.1.  Signal mapping for TCP
74	     6.2.  Signal mapping for QUIC
75	   7.  IANA Considerations
76	   8.  Security Considerations
77	   9.  Acknowledgments
78	   10. References
79	     10.1.  Normative References
80	     10.2.  Informative References
81	   Authors' Addresses

83	1.  Introduction

85	   The boundary between the network and transport layers was originally
86	   defined to be that between information used (and potentially
87	   modified) hop-by-hop, and that used end-to-end.  End-to-end
88	   information in the transport layer is associated with state at the
89	   endpoints, but processing of network-layer information was assumed to
90	   be stateless.

92	   The widespread deployment of stateful middleboxes in the Internet,
93	   such as network address and port translators (NAPT), firewalls that
94	   model the TCP state machine to distinguish packets belonging to
95	   desirable flows from backscatter and random attack traffic, and
96	   devices which keep per-flow state for reporting and monitoring
97	   purposes (e.g.  IPFIX [RFC7011] Metering Processes), has broken this
98	   assumption, and made it more difficult to deploy non-TCP transport
99	   protocols in the Internet.

101	   The deployment of new transport protocols encapsulated in UDP with
102	   encrypted transport headers (such as QUIC [I-D.ietf-quic-transport])
103	   will present a challenge to the operation of these devices, and their
104	   ubiquity likewise threatens to impair the deployability of these
105	   protocols.  There are two main causes for this problem: first,
106	   stateful devices often use an internal model of the TCP state machine
107	   to determine when TCP flows start and end, allowing them to manage
108	   state for these flows; for UDP flows, they must rely on timeouts.
109	   These timeouts are generally short relative to those for TCP
110	   [IMC-GATEWAYS], requiring UDP-encapsulated transports either to
111	   generate unproductive keepalive traffic for long-lived sessions, or
112	   to tolerate connectivity problems and the necessity of reconnection
113	   due to loss of on-path state.

115	   This document presents an abstract solution to this problem by
116	   defining a transport-independent state machine to be implemented at
117	   per-flow state-keeping middleboxes as a replacement for incomplete
118	   TCP state modeling.  A key concept behind this approach is that
119	   encryption of transport protocol headers allows a transport protocol
120	   to separate its wire image - what it looks like to devices on path -
121	   from its internal semantics.  We advocate the creation of a minimal
122	   wire image for these protocols that exposes enough information to
123	   drive the state machine presented.  Present and future evolution of
124	   encrypted transport protocols can then happen behind this wire image,
125	   and middleboxes implementing this state machine can use signals from
126	   a UDP encapsulation common to a set of encrypted transport protocols
127	   to obtain state information equivalent to that provided by TCP,
128	   reducing the friction between deployed middleboxes and these new
129	   transport protocols.

131	2.  Terminology

133	   In this document, the term "flow" is defined to be compatible with
134	   the definition given in [RFC7011]: A flow is defined as a set of
135	   packets passing a device on the network during a certain time
136	   interval.  All packets belonging to a particular Flow have a set of
137	   common properties.  Each property is defined as the result of
138	   applying a function to the values of:

140	   1.  one or more network layer header fields (e.g., destination IP
141	       address) or transport layer header fields (e.g., destination port
142	       number) that the device has access to;

144	   2.  one or more characteristics of the packet itself (e.g., number of
145	       MPLS labels, etc.);

147	   3.  one or more of the fields derived from packet treatment at the
148	       device (e.g., next-hop IP address, the output interface, etc.).

150	   A packet is defined as belonging to a flow if it completely satisfies
151	   all the defined properties of the flow.

153	   A bidirectional flow or biflow is defined as compatible with
154	   [RFC5103], by joining the "forward direction" flow with the "reverse
155	   direction" flow, derived by reversing the direction of directional
156	   fields (ports and IP addresses).  Biflows are only relevant at
157	   devices positioned so as to see all the packets in both directions of
158	   the biflow, generally on the endpoint side of the service demarcation
159	   point for either endpoint as defined in the reference path given in
160	   [RFC7398].

162	3.  State Machine

164	   A transport-independent state machine for on-path devices is shown in
165	   Figure 1.  It was designed to have the following properties:

167	   o  A device on path that can see traffic in both directions between
168	      two endpoints knows that each side of an association wishes that
169	      association to continue.  This allows firewalls to delegate policy
170	      decisions about accepting or continuing an association to the
171	      servers they protect.

173	   o  A device on path that can see traffic in both directions between
174	      two endpoints knows that each device can receive traffic at the
175	      source address it provides.  This allows firewalls to provide
176	      protection against trivially spoofed packets.

178	   Both of these properties hold with current firewalls and network
179	   address translation devices observing the flags and sequence/
180	   acknowledgment numbers exposed by TCP.

182	   It relies on six states, three configurable timeouts, and a set of
183	   signals defined in Section 4.  The states are defined as follows:

185	   o  zero: there is no state for a given flow at the device

187	   o  uniflow: at least one packet has been seen in one direction

189	   o  associating: at least one packet has been seen in one direction,
190	      and an indication that the receiving endpoint wishes to continue
191	      the association has been seen in the other direction.

193	   o  associated: a flow in associating state has further demonstrated
194	      that the initial sender can receive packets at its given source
195	      address.

197	   o  stop-wait: one side of a connection has sent an explicit stop
198	      signal, waiting for confirmation

200	   o  stopping: stop signal confirmed, association is stopping.

202	   We refer to the zero and uniflow states as "uniflow states", as they
203	   are relevant both for truly unidirectional flows, as well as in
204	   situations where an on-path device can see only one side of a
205	   communication.  We refer to the remaining four states as "biflow
206	   states", as they are only applicable to true bidirectional flows,
207	   where the on-path device can see both sides of the communication.

209	       `- - - - - - - - - - - - - - - - - - - - - - - - - - - -'
210	       `    +============+    a->b    +============+           '
211	       `   /              \--------->/              \<-+       '
212	     +--->(      zero      )        (    uniflow     ) | a->b  '
213	     ^ `   \              /<---------\              /--+       '
214	     | `    +============+  TO_IDLE   +============+           '
215	     | `- - - - - - - - - -   or    -  | association  - - - - -'
216	     |                   stop signal   V signal
217	     |                          +============+
218	     | TO_IDLE                 /              \
219	     +<-----------------------(  associating   )
220	     |                         \              /
221	     |                          +============+
222	     |                                 | confirmation
223	     |                                 V signal
224	     |                          +============+
225	     | TO_ASSOCIATED           /              \<-+
226	     +<-----------------------(   associated   ) | any packet
227	     |                         \              /--+
228	     |                          +============+
229	     |                           | stop
230	     |                           V signal
231	     |                    +============+
232	     | TO_ASSOCIATED     /              \<-+
233	     +<-----------------(   stop-wait    ) | any packet
234	     |                   \              /--+
235	     |                   +============+
236	     |                    | stop confirmation
237	     |                    V signal
238	     |              +============+
239	     | TO_STOP     /              \<-+
240	     +------------(    stopping    ) | any packet
241	                   \              /--+
242	                    +============+

244	    Figure 1: Transport-Independent State Machine for Stateful On-Path
245	                                  Devices

247	   The three timeouts are defined as follows:

249	   o  TO_IDLE, the unidirectional idle timeout, can be considered
250	      equivalent to the idle timeout for transport protocols where the
251	      device has no information about session start and end (e.g. most
252	      UDP protocols).

254	   o  TO_ASSOCIATED, the bidirectional idle timeout, can be considered
255	      equivalent to the timeout for transport protocols where the device
256	      has information about session start and end (e.g.  TCP).

258	   o  TO_STOP is the teardown timeout: how long the device will account
259	      additional packets to a flow after confirming a close signal,
260	      ensuring retransmitted and/or reordered close signals don't lead to
261	      the spurious creation of new flow state.

263	   Selection of timeouts is a configuration and implementation detail,
264	   but generally TO_STOP <= TO_IDLE << TO_ASSOCIATED; see [IMC-GATEWAYS]
265	   for an analysis of the magnitudes of these timeouts in presently
266	   deployed gateway devices.
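
The ordering constraint on the three timeouts can be captured as a small configuration check.  The following is a minimal sketch only: the class name `Timeouts`, its field names, and the default values (in seconds) are illustrative assumptions, not values taken from this document or from [IMC-GATEWAYS], and the "<<" relation is approximated by a strict "<".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Timeouts:
    """Hypothetical per-device timeout configuration, in seconds."""
    to_idle: float = 30.0           # TO_IDLE: unidirectional idle timeout
    to_associated: float = 7200.0   # TO_ASSOCIATED: bidirectional idle timeout
    to_stop: float = 4.0            # TO_STOP: teardown timeout

    def __post_init__(self) -> None:
        # Enforce the general ordering TO_STOP <= TO_IDLE < TO_ASSOCIATED.
        if not (0 < self.to_stop <= self.to_idle < self.to_associated):
            raise ValueError("expected 0 < TO_STOP <= TO_IDLE < TO_ASSOCIATED")
```

A device configured with, for example, `Timeouts(to_idle=1.0, to_stop=5.0)` would be rejected by this check, since the teardown timeout would exceed the unidirectional idle timeout.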

268	3.1.  Uniflow States

270	   Every packet received by a device keeping per-flow state must
271	   associate that packet with a flow (see Section 4.1).  When a device
272	   receives a packet associated with a flow it has no state for, and it
273	   is configured to forward the packet instead of dropping it, it moves
274	   that flow from the zero state into the uniflow state and starts a
275	   timer TO_IDLE.  It resets this timer for any additional packet it
276	   forwards in the same direction as long as the flow remains in the
277	   uniflow state.  When timer TO_IDLE expires on a flow in the uniflow
278	   state, the device drops state for the flow and performs any
279	   processing associated with doing so: tearing down NAT bindings,
280	   stopping associated firewall pinholes, exporting flow information,
281	   and so on.  The device may also drop state on a stop signal, if
282	   observed.

284	   Some devices will only see one side of a communication, e.g. if they
285	   are placed in a portion of a network with asymmetric routing.  These
286	   devices use only the zero and uniflow states (as marked in Figure 1.)
287	   In addition, true uniflows - protocols which are solely
288	   unidirectional (e.g. some applications over UDP) - will also use only
289	   the uniflow-only states.  In either case, current devices generally
290	   don't associate much state with observed uniflows, and an idle
291	   timeout is generally sufficient to expire this state.

293	3.2.  Biflow States

295	   A uniflow transitions to the associating state when the device
296	   observes an association signal, and further to the associated state
297	   when the device observes a subsequent confirmation signal; see
298	   Section 4.2 for details.  If the flow has not transitioned from
299	   the associating to the associated state after TO_IDLE, the device
300	   drops state for the flow.

302	   After transitioning to the associated state, the device starts a
303	   timer TO_ASSOCIATED.  It resets this timer for any packet it forwards
304	   in either direction.  The associated state represents a fully
305	   established bidirectional communication.  When timer TO_ASSOCIATED
306	   expires, the device assumes that the flow has shut down without
307	   signaling as such, and drops state for the flow, performing any
308	   associated processing.  When a bidirectional stop signal (see
309	   Section 4.3) is confirmed, the flow transitions to the stopping
310	   state.

312	   When a flow enters the stopping state, it starts a timer TO_STOP.
313	   While the stop signal should be the last packet on a flow, the
314	   TO_STOP timer ensures that reordered packets after the stop signal
315	   will be accounted to the flow.  When this timer expires, the device
316	   drops state for the flow, performing any associated processing.
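
The transitions in Figure 1 and in the two subsections above can be sketched as a per-flow object driven by packets and timer expiry.  This is an illustration under stated assumptions rather than a reference implementation: the names `State`, `Signal`, and `Flow` are hypothetical, the timeout values are placeholders, signal detection is assumed to happen elsewhere, and the sketch assumes that the TO_IDLE timer is restarted on entering the associating state and that packets seen in the stopping state do not restart TO_STOP.

```python
from dataclasses import dataclass
from enum import Enum, auto


class State(Enum):
    ZERO = auto()
    UNIFLOW = auto()
    ASSOCIATING = auto()
    ASSOCIATED = auto()
    STOP_WAIT = auto()
    STOPPING = auto()


class Signal(Enum):
    NONE = auto()
    ASSOCIATION = auto()
    CONFIRMATION = auto()
    STOP = auto()
    STOP_CONFIRMATION = auto()


@dataclass
class Flow:
    # Timeout values in seconds are placeholders, not recommendations.
    to_idle: float = 30.0
    to_associated: float = 7200.0
    to_stop: float = 4.0
    state: State = State.ZERO
    deadline: float = 0.0      # absolute expiry time of the running timer
    stop_forward: bool = True  # direction in which the stop signal was seen

    def expire(self, now: float) -> None:
        """Drop state if the running timer has expired."""
        if self.state is not State.ZERO and now >= self.deadline:
            self.state = State.ZERO

    def on_packet(self, now: float, forward: bool,
                  signal: Signal = Signal.NONE) -> None:
        """Update state for one forwarded packet.

        `forward` is True for packets travelling in the direction of the
        first packet of the flow (a->b in Figure 1).
        """
        self.expire(now)
        if self.state is State.ZERO:
            # The first packet of a flow defines the forward direction.
            self.state = State.UNIFLOW
            self.deadline = now + self.to_idle
        elif self.state is State.UNIFLOW:
            if signal is Signal.STOP:
                self.state = State.ZERO
            elif signal is Signal.ASSOCIATION and not forward:
                self.state = State.ASSOCIATING
                self.deadline = now + self.to_idle  # assumption: timer restarts
            elif forward:
                self.deadline = now + self.to_idle
        elif self.state is State.ASSOCIATING:
            if signal is Signal.CONFIRMATION and forward:
                self.state = State.ASSOCIATED
                self.deadline = now + self.to_associated
        elif self.state is State.ASSOCIATED:
            if signal is Signal.STOP:
                self.state = State.STOP_WAIT
                self.stop_forward = forward
            self.deadline = now + self.to_associated
        elif self.state is State.STOP_WAIT:
            if (signal is Signal.STOP_CONFIRMATION
                    and forward != self.stop_forward):
                self.state = State.STOPPING
                self.deadline = now + self.to_stop
            else:
                self.deadline = now + self.to_associated
        # STOPPING: packets are accounted to the flow; TO_STOP keeps running.
```

In this sketch a stop confirmation seen in the same direction as the stop signal does not advance the flow to the stopping state; it only resets the TO_ASSOCIATED timer like any other packet.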

318	3.3.  Additional States and Actions

320	   This document is concerned only with states and transitions common to
321	   transport- and function-independent state maintenance.  Devices may
322	   augment the transitions in this state diagram depending on their
323	   function.  For example, a firewall that decides based on some
324	   information beyond the signals used by this state machine to shut
325	   down a flow may transition it directly to a blacklist state on
326	   shutdown.  Or, a firewall may fail to forward additional packets in
327	   the uniflow state until an association signal is observed.

329	4.  Abstract Signaling Mechanisms

331	   The state machine in Section 3 requires four signals: a new flow
332	   signal, the first packet observed in a flow in the zero state; an
333	   association signal, allowing a device to verify that an endpoint
334	   wishes a bidirectional communication to be established or to
335	   continue; a confirmation signal, allowing a device to confirm that
336	   the initiator of a flow is reachable at its purported source address;
337	   and a stop signal, noting that an endpoint wishes to stop a
338	   bidirectional communication.  Additional related signals may also be
339	   useful, depending on the function a device provides.  There are a few
340	   different ways to implement these signals; here, we explore the
341	   properties of some potential implementations.

343	   We assume the following general requirements for these signals,
344	   parallel to those given in [draft-trammell-plus-abstract-mech]:

346	   o  At least the endpoints can verify the integrity of the signals
347	      exposed, and shut down a transport association when that
348	      verification fails, in order to reduce the incentive for on-path
349	      devices to attempt to spoof these signals.

351	   o  Endpoints and devices on path can probabilistically verify that an
352	      originator of a signal is on-path.

354	4.1.  Flow Identification

356	   In order to keep per-flow state, each device using this state machine
357	   must have a function it can apply to each packet to be able to
358	   extract common properties to identify the flow it is associated with.
359	   In general, the set of properties used for flow identification on
360	   presently deployed devices includes the source and destination IP
361	   address, the source and destination transport layer port number, and
362	   the transport protocol number.  The differentiated services field
363	   [RFC2474] may also be included in the set of properties defining a
364	   flow, since it may indicate different forwarding treatment.

366	   However, other protocols may use additional bits in their own headers
367	   for flow identification.  In any case, a protocol implementing
368	   signaling for this state machine must specify the function used for
369	   flow identification.
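
As an illustration of such a function, the sketch below builds a flow key from the commonly used five-tuple and derives a direction-independent key for biflow matching.  The type `FlowKey` and its method names are hypothetical, and packet parsing is assumed to happen elsewhere; a concrete protocol could add further header fields to the key.

```python
from typing import NamedTuple


class FlowKey(NamedTuple):
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: int  # IP protocol number, e.g. 6 for TCP, 17 for UDP

    def reversed(self) -> "FlowKey":
        """Key of the reverse-direction flow: addresses and ports swapped."""
        return FlowKey(self.dst_ip, self.src_ip,
                       self.dst_port, self.src_port, self.protocol)

    def biflow_key(self) -> "FlowKey":
        """Direction-independent key shared by both directions of a biflow."""
        return min(self, self.reversed())
```

Because `biflow_key()` discards the direction, a device using it would need to record separately which direction carried the first packet of the flow.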

371	4.2.  Association and Confirmation Signaling

373	   An association signal indicates that the endpoint that received the
374	   first packet seen by the device has indeed seen that packet, and is
375	   interested in continuing conversation with the sending endpoint.
376	   This signal is roughly an in-band analogue to consent signaling in
377	   ICE [RFC7675] that is carried to every device along the path.

379	   A confirmation signal indicates that the endpoint that sent the first
380	   packet seen by the device is reachable at its purported source
381	   address, and is necessary to prevent spoofed or reflected packets
382	   from driving the state machine into the associated state.  It is
383	   roughly equivalent to the final ACK in the TCP three-way handshake.

385	   These two signals are related to each other, in that association
386	   requires the receiving endpoint of the first packet to prove it has
387	   seen that packet (or a subsequent packet), and to acknowledge it
388	   wants to continue the association; while confirmation requires the
389	   sending endpoint to prove it has seen the association token.

391	   Transport-independent, path-verifiable association and confirmation
392	   signaling can be implemented using three values carried in the packet
393	   headers: an association token, a confirmation nonce, and an echo
394	   token.

396	   The association token is a cryptographically random value generated
397	   by the endpoint initiating a connection, and is carried on packets in
398	   the uniflow state.  When a receiving endpoint wishes to send an
399	   association signal, it generates an echo token from the association
400	   token using a well-known, defined function (e.g. a truncated SHA-256
401	   hash), and generates a cryptographically random confirmation nonce.

403	   The initiating endpoint sends a confirmation signal on the next
404	   packet it sends after receiving the confirmation nonce, by applying a
405	   function to the echo token and the confirmation nonce, and sending
406	   the result as a new association token.

408	   Devices on path verify that the echo token corresponds to a
409	   previously seen association token to recognize an association signal,
410	   and recognize that an association token corresponds to a previously
411	   seen echo token and confirmation nonce to recognize a confirmation
412	   signal.

414	   If the association token and confirmation nonce are predictable, off-
415	   path devices can spoof association and confirmation signals.  In
416	   choosing the number of bits for an association token, there is a
417	   tradeoff between per-packet overhead and state overhead at on-path
418	   devices, and assurance that an association token is hard to guess.
419	   This tradeoff must be evaluated at protocol design time.

421	   There are a few considerations in choosing a function (or functions)
422	   to generate the echo token from the association token, to verify an
423	   echo token given an association token, and to derive a next
424	   association token from the echo token and confirmation nonce.  The
425	   functions could be extremely simple (e.g., identity for the echo
426	   token and addition for the nonce) for ease of implementation even in
427	   extremely constrained environments.  Using one-way functions (e.g.,
428	   truncated SHA-256 hash to derive echo token from association token;
429	   XOR followed by truncated SHA-256 hash to derive association token
430	   from echo token and confirmation nonce) requires slightly more work
431	   from on-path devices, but the primitives will be available at any
432	   endpoint using an encrypted transport protocol.  In any case, a
433	   concrete implementation of association and confirmation signaling
434	   must choose a set of functions, or mechanism for unambiguously
435	   choosing one, at both endpoints as well as along the path.
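
The one-way variant mentioned above (truncated SHA-256 for the echo token; XOR followed by truncated SHA-256 for the next association token) can be sketched as follows.  The token width `TOKEN_LEN` and all function names are assumptions made for illustration; this document does not fix a token size, a wire format, or a concrete set of functions.

```python
import hashlib
import secrets

TOKEN_LEN = 8  # bytes; hypothetical width, to be chosen at protocol design time


def new_token() -> bytes:
    """Cryptographically random association token or confirmation nonce."""
    return secrets.token_bytes(TOKEN_LEN)


def echo_token(association_token: bytes) -> bytes:
    """Echo token: truncated SHA-256 of the association token."""
    return hashlib.sha256(association_token).digest()[:TOKEN_LEN]


def next_association_token(echo: bytes, nonce: bytes) -> bytes:
    """Next association token: truncated SHA-256 of (echo XOR nonce)."""
    mixed = bytes(a ^ b for a, b in zip(echo, nonce))
    return hashlib.sha256(mixed).digest()[:TOKEN_LEN]


def is_association_signal(seen_association_token: bytes, echo: bytes) -> bool:
    """On-path check: does the echo token match an earlier association token?"""
    return secrets.compare_digest(echo_token(seen_association_token), echo)


def is_confirmation_signal(seen_echo: bytes, seen_nonce: bytes,
                           new_association_token: bytes) -> bool:
    """On-path check: does the new association token match echo and nonce?"""
    expected = next_association_token(seen_echo, seen_nonce)
    return secrets.compare_digest(expected, new_association_token)
```

An on-path device would call `is_association_signal` on reverse-direction packets and `is_confirmation_signal` on forward-direction packets, keeping the most recently seen tokens as per-flow state.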

437	4.2.1.  Start-of-flow versus continual signaling

439	   There are two possible points in the design space here: these signals
440	   could be continually exposed throughout the flow, or could be exposed
441	   only on the first few packets of a connection (those corresponding to
442	   the cryptographic and/or transport state handshakes in the overlying
443	   protocols).

445	   In the former case, an on-path device could re-establish state in the
446	   middle of a flow; e.g. due to a reboot of the device, due to a NAT
447	   association change without the endpoints' knowledge, or due to idle
448	   periods longer than the TO_ASSOCIATED timeout value.  The on-path
449	   device would receive no special information about which packets were
450	   associated with the start of association.  In this case, the series
451	   of exposed association tokens, echo tokens, and confirmation nonces
452	   can also be observed to derive a running round-trip time estimate for
453	   the flow.

455	   In the latter case, an on-path device would need to observe the start
456	   of the flow to establish state, and would be able to distinguish
457	   connection-start packets from other packets.

459	4.3.  Bidirectional Stop Signaling

461	   The transport-independent state machine uses bidirectional stop
462	   signaling to tear down state.  This requires a stop signal to be
463	   observed in one direction, and a stop confirmation signal to be
464	   observed in the other, to complete tearing down an association.

466	   A stop signal is directly carried or otherwise encoded in the
467	   protocol header to indicate that a flow is ending, whether normally
468	   or abnormally, and that state associated with the flow should be torn
469	   down.  Upon decoding a stop signal, a device on path should move the
470	   flow from uniflow state to zero, or from associated state to stop-
471	   wait state, to wait for a confirmation signal in the other direction.
472	   While in stop-wait state, state will be maintained until a timer set
473	   to TO_ASSOCIATED expires, with any packet forwarded in either
474	   direction resetting the timer.

476	   A stop confirmation signal is directly carried or otherwise encoded
477	   in the protocol header to indicate that the endpoint receiving the
478	   stop signal confirms that the stop signal is valid.  The stop
479	   confirmation signal contains some assurance that the far endpoint has
480	   seen the stop signal.  When a stop confirmation signal is observed in
481	   the opposite direction from the stop signal, a device on path should
482	   move the flow from stop-wait state to stopping state.  The flow will
483	   then remain in stopping state until a timer set to TO_STOP has
484	   expired, after which state for the flow will be dropped.  The
485	   stopping timeout TO_STOP is intended to ensure that any packets
486	   reordered in delivery are accounted to the flow before state for it
487	   is dropped.

489	   We assume the encoding of stop and stop confirmation signals into a
490	   packet header, as with all other signals, is integrity protected end-
491	   to-end.  Stop signals, as association signals, could be forged by a
492	   single on-path device.  However, unless a stop confirmation signal
493	   that can be associated with the stop signal is observed in the other
494	   direction, the flow remains in stop-wait state, during which state is
495	   maintained and packets continue to be forwarded in both directions.
496	   So this attack is of limited utility; an attacker wishing to inject
497	   state teardown would need to control at least one on-path device on
498	   each side of a target device to spoof both stop and corresponding
499	   stop confirmation signals.

501	4.3.1.  Authenticated Stop Signaling

503	   Additionally, the stop and stop confirmation signals could be
504	   designed to authenticate themselves.  Each endpoint could reveal a
505	   stop hash during the initial association, which is the result of a
506	   chosen cryptographic hash function applied to a stop token which that
507	   endpoint keeps secret.  An endpoint wishing to end the association
508	   then reveals the stop token, which can be verified both by the far
509	   endpoint and devices on path which have cached the stop hash to be
510	   authentic.  A stop confirmation signal additionally contains
511	   information derived from the initiating stop signal's stop token, as
512	   further assurance that the stop token was observed by the far
513	   endpoint.
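
The hash-commitment scheme described above can be sketched as follows.  SHA-256, the 16-byte token length, the function names, and in particular the derivation used for the stop confirmation value are assumptions for illustration; this document only requires that the confirmation carry information derived from the stop token.

```python
import hashlib
import secrets


def new_stop_token() -> bytes:
    """Secret stop token, kept by the endpoint until it ends the association."""
    return secrets.token_bytes(16)


def stop_hash(stop_token: bytes) -> bytes:
    """Value revealed during the initial association and cached on path."""
    return hashlib.sha256(stop_token).digest()


def verify_stop(cached_stop_hash: bytes, revealed_token: bytes) -> bool:
    """Check a revealed stop token against the cached stop hash."""
    return secrets.compare_digest(stop_hash(revealed_token), cached_stop_hash)


def stop_confirmation(revealed_token: bytes) -> bytes:
    """Hypothetical confirmation value derived from the observed stop token."""
    return hashlib.sha256(b"stop-confirmation" + revealed_token).digest()
```

A device on path that cached `stop_hash(token)` at association time can call `verify_stop` when a stop signal arrives, and can recompute `stop_confirmation` to check the value carried in the opposite direction.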

515	4.4.  Separate Utility

517	   Although all of these signals are required to drive the state machine
518	   described by this document, note that association/confirmation and
519	   bidirectional stop signaling have separate utility.  A transport
520	   protocol may expose the end of a flow without any proof of
521	   association or confirmation of return routability of the initiator.
522	   Alternately, the transport protocol could rely on short timeouts to
523	   clean up stale state on path, while exposing continuous association
524	   and confirmation signals to quickly reestablish state.

526	5.  Deployment Considerations

528	   The state machine defined in this document is most useful when
529	   implemented in a single instantiation (wire format for signals, and
530	   selection of functions for deriving values to be exposed and
531	   verified) by multiple transport protocols.  It is intended for use
532	   with protocols that encrypt their transport-layer headers, and that
533	   are encapsulated within UDP, as is the case with QUIC
534	   [I-D.ietf-quic-transport].  Definition of that instantiation is out
535	   of scope for the present revision of this document.

537	   The following subsections discuss incentives for deployment of this
538	   state machine both at middleboxes and at endpoints.

540	5.1.  Middlebox Deployment

542	   The state machine defined herein is designed to replace TCP state-
543	   tracking for firewalls and NAT devices.  When encrypted transport
544	   protocols encapsulated in UDP adopt a set of signals and a wire
545	   format for those signals to drive this state machine, these
546	   middleboxes could continue using TCP-like logic to handle those UDP
547	   flows.  Recognizing the wire format used by those signals would allow
548	   these middleboxes to distinguish "UDP with an encrypted transport"
549	   from undifferentiated UDP, and to treat the former case more like
550	   TCP, providing longer timeouts for established flows, as well as
551	   stateful defense against spoofed or reflected garbage traffic.

553	5.2.  Endpoint Deployment

555	   An encrypted, UDP-encapsulated transport protocol has two primary
556	   incentives to expose these signals.  First, allowing firewalls on
557	   networks that generally block UDP (about 3-5% of Internet-connected
558	   networks, depending on the study) to distinguish "UDP with an
559	   encrypted transport" traffic from other UDP traffic may result in
560	   less blocking of that traffic.  Second, the difference between the
561	   timeouts TO_IDLE and TO_ASSOCIATED, as well as the continuous state
562	   establishment possible with some instantiations of the association
563	   and confirmation signals, would allow these transport protocols to
564	   send less unproductive keepalive traffic for long-lived, sparse
565	   flows.
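
   The keepalive saving can be illustrated with simple arithmetic.  The
   sketch below assumes that an endpoint sends a keepalive at a fixed
   fraction of the on-path timeout; the timeout values, the safety
   factor, and the function name are hypothetical and are not taken
   from this document.

```python
import math


def keepalives_needed(idle_seconds, timeout_seconds, safety=0.5):
    """Count keepalives sent during an idle period.

    Assumes (for illustration) one keepalive every
    timeout_seconds * safety seconds, so that state on path
    never expires while the flow is idle.
    """
    interval = timeout_seconds * safety
    return math.floor(idle_seconds / interval)


# Hypothetical values: a short timeout for undifferentiated UDP
# versus a long timeout for a flow in associated state.
SHORT_TIMEOUT = 30       # seconds, assumed
LONG_TIMEOUT = 7200      # seconds, assumed

idle = 3600              # one hour without application data
print(keepalives_needed(idle, SHORT_TIMEOUT))
print(keepalives_needed(idle, LONG_TIMEOUT))
```

   The actual saving depends entirely on the timeouts configured on the
   middleboxes along the path.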

567	   While both of these advantages require middleboxes on path to
568	   recognize and use the signals driving this state machine, we note
569	   that content providers driving the deployment of these protocols are
570	   also operators of their own content provision networks, and that many
571	   of the benefits of encrypted-encapsulated transport firewalls will
572	   accrue to them, giving these content providers incentives to deploy
573	   both endpoints and middleboxes.

575	6.  Signal mappings for transport protocols

577	   We now show how this state machine can be driven by signals available
578	   in TCP and QUIC.

580	6.1.  Signal mapping for TCP

582	   A mapping of TCP flags to transitions in the state machine in
583	   Section 3 shows how devices currently using a model of the TCP state
584	   machine can be converted to use this state machine.

586	   TCP [RFC0793] provides start-of-flow association only.  A packet with
587	   the SYN and ACK flags set in the absence of the FIN or RST flags, and
588	   an in-window acknowledgment number, is synonymous with the
589	   association signal.  A packet with the ACK flag set in the absence of
590	   the FIN or RST flags after an initial SYN, and an in-window
591	   acknowledgment number, is synonymous with the confirmation signal.
592	   For a typical TCP flow:

594	   1.  The initial SYN places the flow into uniflow state,

596	   2.  The SYN-ACK sent in reply acts as an association signal and places
597	       the flow into associating state,

599	   3.  The ACK sent in reply acts as a confirmation signal and places
600	       the flow into associated state,

602	   4.  The final FIN is a stop signal, and

604	   5.  the ACK of the final FIN is a stop confirmation signal, moving
605	       the flow into stopping state.
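
   The five steps above can be sketched as a small flag-driven tracker.
   This is a simplified illustration, not a complete model: it does not
   check that acknowledgment numbers are in window, does not check the
   direction of the SYN-ACK, and treats the "final FIN" as the point at
   which both directions have sent a FIN.  The class name, the
   direction labels, and the representation of flags as a set of
   strings are assumptions.

```python
from enum import Enum


class State(Enum):
    UNIFLOW = "uniflow"
    ASSOCIATING = "associating"
    ASSOCIATED = "associated"
    STOPPING = "stopping"


class TcpFlowTracker:
    """Hypothetical per-flow tracker driven by observed TCP flags."""

    def __init__(self):
        self.state = None      # no state before the initial SYN
        self.fin_from = set()  # directions that have sent a FIN

    def observe(self, direction, flags):
        flags = set(flags)
        if "RST" in flags:
            # Abnormal close: no stop confirmation, so not modeled here.
            return self.state
        if self.state is None:
            if flags == {"SYN"}:
                self.state = State.UNIFLOW
        elif self.state is State.UNIFLOW:
            # SYN-ACK acts as the association signal.
            if {"SYN", "ACK"} <= flags and "FIN" not in flags:
                self.state = State.ASSOCIATING
        elif self.state is State.ASSOCIATING:
            # ACK after the SYN-ACK acts as the confirmation signal.
            if "ACK" in flags and not flags & {"SYN", "FIN"}:
                self.state = State.ASSOCIATED
        elif self.state is State.ASSOCIATED:
            if "FIN" in flags:
                self.fin_from.add(direction)
            elif "ACK" in flags and len(self.fin_from) == 2:
                # ACK of the final FIN acts as the stop confirmation.
                self.state = State.STOPPING
        return self.state


tracker = TcpFlowTracker()
for direction, flags in [
    ("fwd", {"SYN"}),
    ("rev", {"SYN", "ACK"}),
    ("fwd", {"ACK"}),
    ("fwd", {"FIN", "ACK"}),
    ("rev", {"FIN", "ACK"}),
    ("fwd", {"ACK"}),
]:
    print(direction, sorted(flags), tracker.observe(direction, flags))
```

   Tracking which directions have sent a FIN is the kind of additional
   state modeling that TCP's half-closed flows require.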

607	   Note that abnormally closed flows (with RST) do not provide stop
608	   confirmation, and are therefore not provided for by this state
609	   machine.  Due to TCP's support for half-closed flows, additional
610	   state modeling is necessary to extract a stop signal from the final
611	   FIN.

613	   Note also that the association and stop signals derived from the TCP
614	   header are not integrity protected, and association and confirmation
615	   signals based on in-window ACK are not particularly resistant to off-
616	   path attacks [IMC-TCP].  The state machine is therefore more
617	   susceptible to manipulation when used with vanilla TCP than with a
618	   transport protocol providing full integrity protection for its
619	   headers end-to-end.

621	6.2.  Signal mapping for QUIC

623	   QUIC [I-D.ietf-quic-transport] is a moving target; however, signals
624	   for driving this state machine are fundamentally compatible with the
625	   protocol's design and could easily be added to the protocol
626	   specification.

628	   Specifically, QUIC's handshake is visible to on-path devices, as it
629	   begins with an unencrypted version negotiation which exposes a 64-bit
630	   connection ID, which can serve as an association and echo token as in
631	   Section 4.2.  The function of the confirmation nonce is not fully
632	   exposed to the path at this point, but could be implemented by
633	   exposing information from the proof of source address ownership
634	   (section 7.4 of [I-D.ietf-quic-transport]) or via echoing the random
635	   initial packet number (as suggested by https://github.com/quicwg/
636	   base-drafts/pull/391).

638	   The addition of a public reset signal that would act as a stop signal
639	   as in Section 4.3 is presently under discussion within the QUIC
640	   working group; the proposal for self-authenticating public reset at
641	   https://github.com/quicwg/base-drafts/pull/20 inspired the addition
642	   of Section 4.3.1 to this document.

644	7.  IANA Considerations

646	   This document has no actions for IANA.

648	8.  Security Considerations

650	   This document defines a state machine for transport-independent state
651	   management on middleboxes, using in-band signaling, to replace the
652	   commonly-implemented current practice of incomplete TCP state
653	   modeling on these devices.  It defines new signals for state
654	   management.  While these signals can be spoofed by any device on path
655	   that observes traffic in both directions, we presume the presence of
656	   end-to-end integrity protection of these signals provided by the
657	   upper-layer transport driving them.  This allows such spoofing to be
658	   detected and countered by endpoints, reducing the threat from on-path
659	   devices to connection disruption, which such devices are trivially
660	   placed to perform in any case.

662	9.  Acknowledgments

664	   Thanks to Christian Huitema for discussions leading to this document,
665	   and to Andrew Yourtchenko for the feedback.  The mechanism for using
666	   a revealed value to prove ownership of a stop token was inspired by
667	   Eric Rescorla's suggestion to use a fundamentally identical mechanism
668	   for the QUIC public reset.

670	   This work is partially supported by the European Commission under
671	   Horizon 2020 grant agreement no. 688421 Measurement and Architecture
672	   for a Middleboxed Internet (MAMI), and by the Swiss State Secretariat
673	   for Education, Research, and Innovation under contract no. 15.0268.
674	   This support does not imply endorsement.

676	10.  References

678	10.1.  Normative References

680	   [RFC5103]  Trammell, B. and E. Boschi, "Bidirectional Flow Export
681	              Using IP Flow Information Export (IPFIX)", RFC 5103,
682	              DOI 10.17487/RFC5103, January 2008,
683	              <https://www.rfc-editor.org/info/rfc5103>.

685	   [RFC7011]  Claise, B., Ed., Trammell, B., Ed., and P. Aitken,
686	              "Specification of the IP Flow Information Export (IPFIX)
687	              Protocol for the Exchange of Flow Information", STD 77,
688	              RFC 7011, DOI 10.17487/RFC7011, September 2013,
689	              <https://www.rfc-editor.org/info/rfc7011>.

691	   [RFC7398]  Bagnulo, M., Burbridge, T., Crawford, S., Eardley, P., and
692	              A. Morton, "A Reference Path and Measurement Points for
693	              Large-Scale Measurement of Broadband Performance",
694	              RFC 7398, DOI 10.17487/RFC7398, February 2015,
695	              <https://www.rfc-editor.org/info/rfc7398>.

697	10.2.  Informative References

699	   [draft-trammell-plus-abstract-mech]
700	              Trammell, B., "Abstract Mechanisms for a Cooperative Path
701	              Layer under Endpoint Control", September 2016.

703	   [I-D.hardie-path-signals]
704	              Hardie, T., "Path signals", draft-hardie-path-signals-01
705	              (work in progress), May 2017.

707	   [I-D.ietf-quic-tls]
708	              Thomson, M. and S. Turner, "Using Transport Layer Security
709	              (TLS) to Secure QUIC", draft-ietf-quic-tls-07 (work in
710	              progress), October 2017.

712	   [I-D.ietf-quic-transport]
713	              Iyengar, J. and M. Thomson, "QUIC: A UDP-Based Multiplexed
714	              and Secure Transport", draft-ietf-quic-transport-07 (work
715	              in progress), October 2017.

717	   [IMC-GATEWAYS]
718	              Hatonen, S., Nyrhinen, A., Eggert, L., Strowes, S.,
719	              Sarolahti, P., and M. Kojo, "An experimental study of home
720	              gateway characteristics (Proc. ACM IMC 2010)", October
721	              2010.

723	   [IMC-TCP]  Luckie, M., Beverly, R., Wu, T., Allman, M., and k.
724	              claffy, "Resilience of Deployed TCP to Blind Attacks.
725	              (Proc. ACM IMC 2015)", October 2015.

727	   [RFC0793]  Postel, J., "Transmission Control Protocol", STD 7,
728	              RFC 793, DOI 10.17487/RFC0793, September 1981,
729	              <https://www.rfc-editor.org/info/rfc793>.

731	   [RFC2474]  Nichols, K., Blake, S., Baker, F., and D. Black,
732	              "Definition of the Differentiated Services Field (DS
733	              Field) in the IPv4 and IPv6 Headers", RFC 2474,
734	              DOI 10.17487/RFC2474, December 1998,
735	              <https://www.rfc-editor.org/info/rfc2474>.

737	   [RFC7675]  Perumal, M., Wing, D., Ravindranath, R., Reddy, T., and M.
738	              Thomson, "Session Traversal Utilities for NAT (STUN) Usage
739	              for Consent Freshness", RFC 7675, DOI 10.17487/RFC7675,
740	              October 2015, <https://www.rfc-editor.org/info/rfc7675>.

742	Authors' Addresses

744	   Mirja Kuehlewind
745	   ETH Zurich
746	   Gloriastrasse 35
747	   8092 Zurich
748	   Switzerland

750	   Email: mirja.kuehlewind@tik.ee.ethz.ch

752	   Brian Trammell
753	   ETH Zurich
754	   Gloriastrasse 35
755	   8092 Zurich
756	   Switzerland

758	   Email: ietf@trammell.ch

760	   Joe Hildebrand

762	   Email: hildjj@cursive.net