idnits 2.17.1 

draft-ietf-taps-impl-03.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** There are 4 instances of too long lines in the document, the longest one
     being 3 characters in excess of 72.

  ** The abstract seems to contain references ([I-D.ietf-taps-arch]), which
     it shouldn't.  Please replace those with straight textual mentions of the
     documents in question.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (March 11, 2019) is 1872 days in the past.  Is this
     intentional?


  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  == Unused Reference: 'Trickle' is defined on line 1623, but no explicit
     reference was found in the text

  == Outdated reference: A later version (-19) exists of
     draft-ietf-taps-arch-02

  == Outdated reference: A later version (-26) exists of
     draft-ietf-taps-interface-02

  ** Obsolete normative reference: RFC 7540 (Obsoleted by RFC 9113)

  == Outdated reference: A later version (-34) exists of
     draft-ietf-quic-transport-18

  -- Obsolete informational reference (is this intentional?): RFC 5245
     (Obsoleted by RFC 8445, RFC 8839)


     Summary: 3 errors (**), 0 flaws (~~), 5 warnings (==), 2 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	TAPS Working Group                                     A. Brunstrom, Ed.
3	Internet-Draft                                       Karlstad University
4	Intended status: Informational                             T. Pauly, Ed.
5	Expires: September 12, 2019                                   Apple Inc.
6	                                                             T. Enghardt
7	                                                               TU Berlin
8	                                                           K-J. Grinnemo
9	                                                     Karlstad University
10	                                                                T. Jones
11	                                                  University of Aberdeen
12	                                                               P. Tiesel
13	                                                               TU Berlin
14	                                                              C. Perkins
15	                                                   University of Glasgow
16	                                                                M. Welzl
17	                                                      University of Oslo
18	                                                          March 11, 2019

20	             Implementing Interfaces to Transport Services
21	                        draft-ietf-taps-impl-03

23	Abstract

25	   The Transport Services architecture [I-D.ietf-taps-arch] defines a
26	   system that allows applications to use transport networking protocols
27	   flexibly.  This document serves as a guide on how to implement
28	   such a system.

30	Status of This Memo

32	   This Internet-Draft is submitted in full conformance with the
33	   provisions of BCP 78 and BCP 79.

35	   Internet-Drafts are working documents of the Internet Engineering
36	   Task Force (IETF).  Note that other groups may also distribute
37	   working documents as Internet-Drafts.  The list of current Internet-
38	   Drafts is at https://datatracker.ietf.org/drafts/current/.

40	   Internet-Drafts are draft documents valid for a maximum of six months
41	   and may be updated, replaced, or obsoleted by other documents at any
42	   time.  It is inappropriate to use Internet-Drafts as reference
43	   material or to cite them other than as "work in progress."

45	   This Internet-Draft will expire on September 12, 2019.

47	Copyright Notice

49	   Copyright (c) 2019 IETF Trust and the persons identified as the
50	   document authors.  All rights reserved.

52	   This document is subject to BCP 78 and the IETF Trust's Legal
53	   Provisions Relating to IETF Documents
54	   (https://trustee.ietf.org/license-info) in effect on the date of
55	   publication of this document.  Please review these documents
56	   carefully, as they describe your rights and restrictions with respect
57	   to this document.  Code Components extracted from this document must
58	   include Simplified BSD License text as described in Section 4.e of
59	   the Trust Legal Provisions and are provided without warranty as
60	   described in the Simplified BSD License.

62	Table of Contents

64	   1.  Introduction
65	   2.  Implementing Basic Objects
66	   3.  Implementing Pre-Establishment
67	     3.1.  Configuration-time errors
68	     3.2.  Role of system policy
69	   4.  Implementing Connection Establishment
70	     4.1.  Candidate Gathering
71	       4.1.1.  Structuring Options as a Tree
72	       4.1.2.  Branch Types
73	     4.2.  Branching Order-of-Operations
74	     4.3.  Sorting Branches
75	     4.4.  Candidate Racing
76	       4.4.1.  Delayed
77	       4.4.2.  Failover
78	     4.5.  Completing Establishment
79	       4.5.1.  Determining Successful Establishment
80	     4.6.  Establishing multiplexed connections
81	     4.7.  Handling racing with "unconnected" protocols
82	     4.8.  Implementing listeners
83	       4.8.1.  Implementing listeners for Connected Protocols
84	       4.8.2.  Implementing listeners for Unconnected Protocols
85	       4.8.3.  Implementing listeners for Multiplexed Protocols
86	   5.  Implementing Data Transfer
87	     5.1.  Data transfer for streams, datagrams, and frames
88	       5.1.1.  Sending Messages
89	       5.1.2.  Receiving Messages
90	     5.2.  Handling of data for fast-open protocols
91	   6.  Implementing Maintenance
92	     6.1.  Managing Connections
93	     6.2.  Handling Path Changes
94	   7.  Implementing Termination
95	   8.  Cached State
96	     8.1.  Protocol state caches
97	     8.2.  Performance caches
98	   9.  Specific Transport Protocol Considerations
99	     9.1.  TCP
100	     9.2.  UDP
101	     9.3.  SCTP
102	     9.4.  TLS
103	     9.5.  HTTP
104	     9.6.  QUIC
105	     9.7.  HTTP/2 transport
106	   10. Rendezvous and Environment Discovery
107	   11. IANA Considerations
108	   12. Security Considerations
109	     12.1.  Considerations for Candidate Gathering
110	     12.2.  Considerations for Candidate Racing
111	   13. Acknowledgements
112	   14. References
113	     14.1.  Normative References
114	     14.2.  Informative References
115	   Appendix A.  Additional Properties
116	     A.1.  Properties Affecting Sorting of Branches
117	   Authors' Addresses

119	1.  Introduction

121	   The Transport Services architecture [I-D.ietf-taps-arch] defines a
122	   system that allows applications to use transport networking protocols
123	   flexibly.  The interface such a system exposes to applications is
124	   defined as the Transport Services API [I-D.ietf-taps-interface].
125	   This API is designed to be generic across multiple transport
126	   protocols and sets of protocol features.

128	   This document serves as a guide on how to implement a system that
129	   provides a Transport Services API.  It is the job of an
130	   implementation of a Transport Services system to turn the requests of
131	   an application into decisions on how to establish connections, and
132	   how to transfer data over those connections once established.  The
133	   terminology used in this document is based on the Architecture
134	   [I-D.ietf-taps-arch].

136	2.  Implementing Basic Objects

138	   The basic objects that are exposed to applications for Transport
139	   Services are the Preconnection, the bundle of properties that
140	   describes the application constraints on the transport; the
141	   Connection, the basic object that represents a flow of data in either
142	   direction between the Local and Remote Endpoints; and the Listener, a
143	   passive waiting object that delivers new Connections.

145	   Preconnection objects should be implemented as bundles of properties
146	   that an application can both read and write.  Once a Preconnection
147	   has been used to create an outbound Connection or a Listener, the
148	   implementation should ensure that the copy of the properties held by
149	   the Connection or Listener is immutable.  This may involve performing
150	   a deep-copy if the application is still able to modify properties on
151	   the original Preconnection object.
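
The copy-on-use behaviour described above can be sketched as follows.  This is a minimal illustration, not an API defined by this document: the class layout, the `snapshot` helper, and the property name `reliability` are all assumptions made for the example.

```python
import copy
from types import MappingProxyType


class Preconnection:
    """Mutable bundle of properties that the application reads and writes."""

    def __init__(self):
        self.properties = {}

    def snapshot(self):
        # Deep-copy so that later writes to the Preconnection cannot leak
        # into the Connection, then wrap the copy in a read-only view.
        # Note: the view blocks top-level writes only; nested mutable
        # values would need their own immutable representation.
        return MappingProxyType(copy.deepcopy(self.properties))


class Connection:
    """Holds an immutable copy of the properties it was created with."""

    def __init__(self, preconnection):
        self.properties = preconnection.snapshot()


pre = Preconnection()
pre.properties["reliability"] = "require"   # hypothetical property name
conn = Connection(pre)
pre.properties["reliability"] = "prohibit"  # does not affect conn
```

The application can keep reusing and modifying `pre` for further Connections, while `conn.properties` stays fixed at the values in effect when the Connection was created.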

153	   Connection objects represent the interface between the application
154	   and the implementation to manage transport state, and conduct data
155	   transfer.  During the process of establishment (Section 4), the
156	   Connection will not be bound to a specific transport flow, since
157	   there may be multiple candidate Protocol Stacks being raced.  Once the
158	   Connection is established, the object should be considered mapped to
159	   a specific Protocol Stack.  The notion of a Connection maps to many
160	   different protocols, depending on the Protocol Stack.  For example,
161	   the Connection may ultimately represent the interface into a TCP
162	   connection, a TLS session over TCP, a UDP flow with fully-specified
163	   local and remote endpoints, a DTLS session, an SCTP stream, a QUIC
164	   stream, or an HTTP/2 stream.

166	   Listener objects are created with a Preconnection, at which point
167	   their configuration should be considered immutable by the
168	   implementation.  The process of listening is described in
169	   Section 4.8.

171	3.  Implementing Pre-Establishment

173	   During pre-establishment the application specifies the Endpoints to
174	   be used for communication as well as its preferences via Selection
175	   Properties and, if desired, also Connection Properties.  Generally,
176	   Connection Properties should be configured as early as possible, as
177	   they may serve as input to decisions that are made by the
178	   implementation (the Capacity Profile may guide usage of a protocol
179	   offering scavenger-type congestion control, for example).  In the
180	   remainder of this document, we only refer to Selection Properties
181	   because they are the more typical case and have to be handled by all
182	   implementations.

184	   The implementation stores these objects and properties as part of the
185	   Preconnection object for use during connection establishment.  For
186	   Selection Properties that are not provided by the application, the
187	   implementation must use the default values specified in the Transport
188	   Services API ([I-D.ietf-taps-interface]).

190	3.1.  Configuration-time errors

192	   The transport system should have a list of supported protocols
193	   available, each with transport features reflecting the
194	   capabilities of the protocol.  Once an application specifies its
195	   Transport Parameters, the transport system should match the required
196	   and prohibited properties against the transport features of the
197	   available protocols.

199	   In the following cases, failure should be detected during pre-
200	   establishment:

202	   o  The application requested Protocol Properties that include
203	      requirements or prohibitions that cannot be satisfied by any of
204	      the available protocols.  For example, if an application requires
205	      "Configure Reliability per Message", but no such protocol is
206	      available on the host running the transport system, e.g., because
207	      SCTP is not supported by the operating system, this should result
208	      in an error.

210	   o  The application requested Protocol Properties that are in conflict
211	      with each other, i.e., the required and prohibited properties
212	      cannot be satisfied by the same protocol.  For example, if an
213	      application prohibits "Reliable Data Transfer" but then requires
214	      "Configure Reliability per Message", this mismatch should result
215	      in an error.

217	   It is important to fail as early as possible in such cases in order
218	   to avoid allocating resources, e.g., to endpoint resolution, only to
219	   find out later that there is no protocol that satisfies the
220	   requirements.
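
Both failure cases can fall out of a single matching step, as the following sketch suggests.  The feature table, the feature names, and the `ConfigurationError` type are hypothetical; a real implementation would derive the table from the protocols actually present on the host.

```python
class ConfigurationError(Exception):
    """Raised during pre-establishment when no protocol can match."""


# Hypothetical feature table; names are illustrative only.
FEATURES = {
    "TCP": {"reliable", "ordered"},
    "UDP": {"message-boundaries"},
    "SCTP": {"reliable", "ordered", "message-boundaries",
             "per-message-reliability"},
}


def matching_protocols(required, prohibited, available):
    """Return the available protocols that satisfy the request."""
    required, prohibited = set(required), set(prohibited)
    matches = [
        name for name in available
        # Every required feature must be offered, and no prohibited
        # feature may be offered.
        if required <= FEATURES[name] and not (prohibited & FEATURES[name])
    ]
    if not matches:
        raise ConfigurationError(
            f"no protocol offers {sorted(required)} "
            f"without {sorted(prohibited)}")
    return matches
```

With this table, requiring "per-message-reliability" on a host that only has TCP and UDP fails (the first case above), and so does requiring it while prohibiting "reliable" even when SCTP is present (the second case), because the only protocol that offers the former also offers the latter.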

222	3.2.  Role of system policy

224	   The properties specified during pre-establishment have a close
225	   connection to system policy.  The implementation is responsible for
226	   combining and reconciling several different sources of preferences
227	   when establishing Connections.  These include, but are not limited
228	   to:

230	   1.  Application preferences, i.e., preferences specified during
231	       pre-establishment via Selection Properties.

233	   2.  Dynamic system policy, i.e., policy compiled from internally and
234	       externally acquired information about available network
235	       interfaces, supported transport protocols, and current/previous
236	       Connections.  Examples of ways to externally retrieve policy-
237	       support information are through OS-specific statistics/
238	       measurement tools and tools that reside on middleboxes and
239	       routers.

241	   3.  Default implementation policy, i.e., predefined policy by OS or
242	       application.

244	   In general, any protocol or path used for a connection must conform
245	   to all three sources of constraints.  Any violation of any of the
246	   layers should cause a protocol or path to be considered ineligible
247	   for use.  For an example of application preferences leading to
248	   constraints, an application may prohibit the use of metered network
249	   interfaces for a given Connection to avoid user cost.  Similarly, the
250	   system policy at a given time may prohibit the use of such a metered
251	   network interface from the application's process.  Lastly, the
252	   implementation itself may default to disallowing certain network
253	   interfaces unless explicitly requested by the application and allowed
254	   by the system.
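
One way to picture the "all three sources must agree" rule is to model each source as a predicate and accept a path only if every predicate does.  The sketch below is illustrative: the `Path` type, the interface names, and the individual rules are assumptions, not policies defined by this document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    interface: str
    metered: bool


def application_allows(path):
    # Application preference: prohibit metered interfaces.
    return not path.metered


def system_allows(path):
    # Dynamic system policy (hypothetical): this process may not use "vpn0".
    return path.interface != "vpn0"


def default_allows(path):
    # Implementation default (hypothetical): loopback is never a candidate.
    return path.interface != "lo"


POLICY_LAYERS = (application_allows, system_allows, default_allows)


def eligible_paths(paths, layers=POLICY_LAYERS):
    # A path is usable only if every policy layer accepts it.
    return [p for p in paths if all(layer(p) for layer in layers)]


candidates = [Path("wlan0", False), Path("wwan0", True), Path("vpn0", False)]
usable = eligible_paths(candidates)
```

Because the predicates are evaluated on every call, a layer that consults live system state would pick up policy changes without any cache invalidation.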

256	   It is expected that the database of system policies and the method of
257	   looking up these policies will vary across various platforms.  An
258	   implementation should attempt to look up the relevant policies for
259	   the system in a dynamic way to make sure it is reflecting an accurate
260	   version of the system policy, since the system's policy regarding the
261	   application's traffic may change over time due to user or
262	   administrative changes.

264	4.  Implementing Connection Establishment

266	   The process of establishing a network connection begins when an
267	   application expresses intent to communicate with a remote endpoint by
268	   calling Initiate.  (At this point, any constraints or requirements
269	   the application may have on the connection are available from pre-
270	   establishment.)  The process can be considered complete once there is
271	   at least one Protocol Stack that has completed any required setup to
272	   the point that it can transmit and receive the application's data.

274	   Connection establishment is divided into two top-level steps:
275	   Candidate Gathering, to identify the paths, protocols, and endpoints
276	   to use, and Candidate Racing, in which the necessary protocol
277	   handshakes are conducted so that the transport system can select
278	   which set to use.

280	   The simplest example of this process might involve identifying the
281	   single IP address to which the implementation wishes to connect,
282	   using the system's current default interface or path, and starting a
283	   TCP handshake to establish a stream to the specified IP address.
284	   However, each step may also vary depending on the requirements of the
285	   connection: if the endpoint is defined as a hostname and port, then
286	   there may be multiple resolved addresses that are available; there
287	   may also be multiple interfaces or paths available, other than the
288	   default system interface; and some protocols may not need any
289	   transport handshake to be considered "established" (such as UDP),
290	   while other connections may utilize layered protocol handshakes, such
291	   as TLS over TCP.

293	   Whenever an implementation has multiple options for connection
294	   establishment, it can view the set of all individual connection
295	   establishment options as a single, aggregate connection
296	   establishment.  The aggregate set conceptually includes every valid
297	   combination of endpoints, paths, and protocols.  As an example,
298	   consider an implementation that initiates a TCP connection to a
299	   hostname + port endpoint, and has two valid interfaces available (Wi-
300	   Fi and LTE).  The hostname resolves to a single IPv4 address on the
301	   Wi-Fi network, and resolves to the same IPv4 address on the LTE
302	   network, as well as a single IPv6 address.  The aggregate set of
303	   connection establishment options can be viewed as follows:

305	Aggregate [Endpoint: www.example.com:80] [Interface: Any]   [Protocol: TCP]
306	|-> [Endpoint: 192.0.2.1:80]       [Interface: Wi-Fi] [Protocol: TCP]
307	|-> [Endpoint: 192.0.2.1:80]       [Interface: LTE]   [Protocol: TCP]
308	|-> [Endpoint: 2001:DB8::1.80]     [Interface: LTE]   [Protocol: TCP]

310	   Any one of these sub-entries on the aggregate connection attempt
311	   would satisfy the original application intent.  The concern of this
312	   section is the algorithm defining which of these options to try,
313	   when, and in what order.

315	4.1.  Candidate Gathering

317	   The step of gathering candidates involves identifying which paths,
318	   protocols, and endpoints may be used for a given Connection.  This
319	   list is determined by the requirements, prohibitions, and preferences
320	   of the application as specified in the Selection Properties.

322	4.1.1.  Structuring Options as a Tree

324	   When an implementation responsible for connection establishment needs
325	   to consider multiple options, it should logically structure these
326	   options as a hierarchical tree.  Each leaf node of the tree
327	   represents a single, coherent connection attempt, with an Endpoint, a
328	   Path, and a set of protocols that can directly negotiate and send
329	   data on the network.  Each node in the tree that is not a leaf
330	   represents a connection attempt that is either underspecified, or
331	   else includes multiple distinct options.  For example, when
332	   connecting on an IP network, a connection attempt to a hostname and
333	   port is underspecified, because the connection attempt requires a
334	   resolved IP address as its remote endpoint.  In this case, the node
335	   represented by the connection attempt to the hostname is a parent
336	   node, with child nodes for each IP address.  Similarly, an
337	   implementation that is allowed to connect using multiple interfaces
338	   will have a parent node of the tree for the decision between the
339	   paths, with a branch for each interface.

341	   The example aggregate connection attempt above can be drawn as a tree
342	   by grouping the addresses resolved on the same interface into
343	   branches:

345	                             ||
346	                +==========================+
347	                |  www.example.com:80/Any  |
348	                +==========================+
349	                  //                    \\
350	+==========================+       +==========================+
351	| www.example.com:80/Wi-Fi |       |  www.example.com:80/LTE  |
352	+==========================+       +==========================+
353	             ||                      //                    \\
354	  +====================+  +====================+  +======================+
355	  | 192.0.2.1:80/Wi-Fi |  |  192.0.2.1:80/LTE  |  |  2001:DB8::1.80/LTE  |
356	  +====================+  +====================+  +======================+

358	   The rest of this section will use a notation scheme to represent this
359	   tree.  The parent (or trunk) node of the tree will be represented by
360	   a single integer, such as "1".  Each child of that node will have an
361	   integer that identifies it, from 1 to the number of children.  That
362	   child node will be uniquely identified by concatenating its integer
363	   to its parent's identifier with a dot in between, such as "1.1" and
364	   "1.2".  Each node will be summarized by a tuple of three elements:
365	   Endpoint, Path, and Protocol.  The above example can now be written
366	   more succinctly as:

368	   1 [www.example.com:80, Any, TCP]
369	     1.1 [www.example.com:80, Wi-Fi, TCP]
370	       1.1.1 [192.0.2.1:80, Wi-Fi, TCP]
371	     1.2 [www.example.com:80, LTE, TCP]
372	       1.2.1 [192.0.2.1:80, LTE, TCP]
373	       1.2.2 [2001:DB8::1.80, LTE, TCP]

375	   When an implementation views this aggregate set of connection
376	   attempts as a single connection establishment, it will use only one
377	   of the leaf nodes to transfer data.  Thus, when a single leaf node
378	   becomes ready to use, then the entire connection attempt is ready to
379	   use by the application.  Another way to represent this is that every
380	   leaf node updates the state of its parent node when it becomes ready,
381	   until the trunk node of the tree is ready, which then notifies the
382	   application that the connection as a whole is ready to use.

384	   A connection establishment tree may be degenerate, and only have a
385	   single leaf node, such as a connection attempt to an IP address over
386	   a single interface with a single protocol.

388	   1 [192.0.2.1:80, Wi-Fi, TCP]

390	   A parent node may also only have one child (or leaf) node, such as
391	   when a hostname resolves to only a single IP address.

393	   1 [www.example.com:80, Wi-Fi, TCP]
394	     1.1 [192.0.2.1:80, Wi-Fi, TCP]
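
The tree notation and the upward propagation of readiness described in this section can be sketched with a small node class.  The class and method names are assumptions made for illustration; only the identifier scheme and the [Endpoint, Path, Protocol] tuple come from the text.

```python
class Node:
    """One connection attempt, summarized as [Endpoint, Path, Protocol]."""

    def __init__(self, endpoint, path, protocol, parent=None):
        self.endpoint, self.path, self.protocol = endpoint, path, protocol
        self.parent = parent
        self.children = []
        self.ready = False
        if parent is None:
            self.ident = "1"  # trunk node
        else:
            parent.children.append(self)
            # Child index (starting at 1) appended to the parent identifier.
            self.ident = f"{parent.ident}.{len(parent.children)}"

    def mark_ready(self):
        # Propagate readiness upward until the trunk (or an already
        # ready ancestor) is reached.
        node = self
        while node is not None and not node.ready:
            node.ready = True
            node = node.parent

    def render(self, depth=0):
        line = (f"{'  ' * depth}{self.ident} "
                f"[{self.endpoint}, {self.path}, {self.protocol}]")
        return "\n".join([line] + [c.render(depth + 1) for c in self.children])


root = Node("www.example.com:80", "Any", "TCP")
wifi = Node("www.example.com:80", "Wi-Fi", "TCP", root)
wifi_v4 = Node("192.0.2.1:80", "Wi-Fi", "TCP", wifi)
lte = Node("www.example.com:80", "LTE", "TCP", root)
lte_v4 = Node("192.0.2.1:80", "LTE", "TCP", lte)
lte_v6 = Node("2001:DB8::1.80", "LTE", "TCP", lte)

lte_v6.mark_ready()  # one leaf becoming ready makes the trunk ready
```

Calling `root.render()` reproduces the indented listing used in this section; after `lte_v6.mark_ready()`, `root.ready` is true while the Wi-Fi branch remains not ready.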

396	4.1.2.  Branch Types

398	   There are three types of branching from a parent node into one or
399	   more child nodes.  Any parent node of the tree must use only one type
400	   of branching.

402	4.1.2.1.  Derived Endpoints

404	   If a connection originally targets a single endpoint, there may be
405	   multiple endpoints of different types that can be derived from the
406	   original.  The connection library should order the derived endpoints
407	   according to application preference, system policy and expected
408	   performance.

410	   DNS hostname-to-address resolution is the most common method of
411	   endpoint derivation.  When trying to connect to a hostname endpoint
412	   on a traditional IP network, the implementation should send DNS
413	   queries for both A (IPv4) and AAAA (IPv6) records if both are
414	   supported on the local link.  The algorithm for ordering and racing
415	   these addresses should follow the recommendations in Happy Eyeballs
416	   [RFC8305].

418	   1 [www.example.com:80, Wi-Fi, TCP]
419	     1.1 [2001:DB8::1.80, Wi-Fi, TCP]
420	     1.2 [192.0.2.1:80, Wi-Fi, TCP]
421	     1.3 [2001:DB8::2.80, Wi-Fi, TCP]
422	     1.4 [2001:DB8::3.80, Wi-Fi, TCP]
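
The ordering shown in the tree above (one IPv6 address, then the IPv4 address, then the remaining IPv6 addresses) is what address-family interleaving produces.  The following sketch is a simplification that assumes each family's list has already been sorted; the function name and the `first_family_count` parameter are illustrative, and it does not attempt to cover the full Happy Eyeballs algorithm.

```python
from itertools import zip_longest


def interleave(first_family, second_family, first_family_count=1):
    """Order addresses: a few from the preferred family, then alternate."""
    ordered = list(first_family[:first_family_count])
    remaining = first_family[first_family_count:]
    # Alternate families, starting with the family not yet attempted.
    for second, first in zip_longest(second_family, remaining):
        if second is not None:
            ordered.append(second)
        if first is not None:
            ordered.append(first)
    return ordered


v6 = ["2001:DB8::1", "2001:DB8::2", "2001:DB8::3"]
v4 = ["192.0.2.1"]
order = interleave(v6, v4)
```

Here `order` matches children 1.1 through 1.4 of the example tree.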

424	   DNS-Based Service Discovery can also provide an endpoint derivation
425	   step.  When trying to connect to a named service, the client may
426	   discover one or more hostname and port pairs on the local network
427	   using multicast DNS.  These hostnames should each be treated as a
428	   branch which can be attempted independently from other hostnames.

430	   Each of these hostnames may also resolve to one or more addresses,
431	   thus creating multiple layers of branching.

433	   1 [term-printer._ipp._tcp.meeting.ietf.org, Wi-Fi, TCP]
434	     1.1 [term-printer.meeting.ietf.org:631, Wi-Fi, TCP]
435	       1.1.1 [31.133.160.18:631, Wi-Fi, TCP]

437	4.1.2.2.  Alternate Paths

439	   If a client has multiple network interfaces available to it, such as
440	   a mobile client with both Wi-Fi and Cellular connectivity, it can
441	   attempt a connection over either interface.  This represents a branch
442	   point in the connection establishment.  As with derived endpoints,
443	   the interfaces should be ranked based on preference, system policy,
444	   and performance.  Attempts should be started on one interface, and
445	   then on other interfaces successively after delays based on expected
446	   round-trip-time or other available metrics.

448	   1 [192.0.2.1:80, Any, TCP]
449	     1.1 [192.0.2.1:80, Wi-Fi, TCP]
450	     1.2 [192.0.2.1:80, LTE, TCP]

452	   This same approach applies to any situation in which the client is
453	   aware of multiple links or views of the network.  Multiple Paths,
454	   each with a coherent set of addresses, routes, DNS server, and more,
455	   may share a single interface.  A path may also represent a virtual
456	   interface service such as a Virtual Private Network (VPN).

458	   The list of available paths should be constrained by any requirements
459	   or prohibitions the application sets, as well as system policy.

461	4.1.2.3.  Protocol Options

463	   Differences in possible protocol compositions and options can also
464	   provide a branching point in connection establishment.  This allows
465	   clients to be resilient to situations in which a certain protocol is
466	   not functioning on a server or network.

468	   This approach is commonly used for connections with optional proxy
469	   server configurations.  A single connection may be allowed to use an
470	   HTTP-based proxy, a SOCKS-based proxy, or connect directly.  These
471	   options should be ranked and attempted in succession.

473	   1 [www.example.com:80, Any, HTTP/TCP]
474	     1.1 [192.0.2.8:80, Any, HTTP/HTTP Proxy/TCP]
475	     1.2 [192.0.2.7:10234, Any, HTTP/SOCKS/TCP]
476	     1.3 [www.example.com:80, Any, HTTP/TCP]
477	       1.3.1 [192.0.2.1:80, Any, HTTP/TCP]

479	   This approach also allows a client to attempt different sets of
480	   application and transport protocols that may provide preferable
481	   characteristics when available.  For example, the protocol options
482	   could involve QUIC [I-D.ietf-quic-transport] over UDP on one branch,
483	   and HTTP/2 [RFC7540] over TLS over TCP on the other:

485	   1 [www.example.com:443, Any, Any HTTP]
486	     1.1 [www.example.com:443, Any, QUIC/UDP]
487	       1.1.1 [192.0.2.1:443, Any, QUIC/UDP]
488	     1.2 [www.example.com:443, Any, HTTP2/TLS/TCP]
489	       1.2.1 [192.0.2.1:443, Any, HTTP2/TLS/TCP]

491	   Another example is racing SCTP with TCP:

493	   1 [www.example.com:80, Any, Any Stream]
494	     1.1 [www.example.com:80, Any, SCTP]
495	       1.1.1 [192.0.2.1:80, Any, SCTP]
496	     1.2 [www.example.com:80, Any, TCP]
497	       1.2.1 [192.0.2.1:80, Any, TCP]

499	   Implementations that support racing protocols and protocol options
500	   should maintain a history of which protocols and protocol options
501	   successfully established, on a per-network basis (see Section 8.2).
502	   This information can influence future racing decisions to prioritize
503	   or prune branches.
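
A per-network history of this kind could be kept along the following lines.  This is a sketch under stated assumptions: the class name, the pruning threshold `prune_after`, and the rule "proven stacks first" are illustrative choices, not behaviour specified by this document.

```python
class RacingHistory:
    """Per-network record of which protocol stacks established."""

    def __init__(self):
        self._counts = {}  # (network, stack) -> [successes, failures]

    def record(self, network, stack, success):
        counts = self._counts.setdefault((network, stack), [0, 0])
        counts[0 if success else 1] += 1

    def prioritize(self, network, stacks, prune_after=3):
        """Prune stacks that have only failed; move proven stacks first."""
        kept = []
        for stack in stacks:
            ok, failed = self._counts.get((network, stack), (0, 0))
            if ok == 0 and failed >= prune_after:
                continue  # never worked on this network: prune the branch
            kept.append((stack, ok))
        # Stable sort: stacks with a recorded success come first; ties
        # keep the caller's original preference order.
        kept.sort(key=lambda item: item[1] == 0)
        return [stack for stack, _ in kept]


history = RacingHistory()
for _ in range(3):
    history.record("Wi-Fi", "SCTP", success=False)
history.record("Wi-Fi", "TCP", success=True)

wifi_order = history.prioritize("Wi-Fi", ["SCTP", "TCP"])
lte_order = history.prioritize("LTE", ["SCTP", "TCP"])
```

On the Wi-Fi network the SCTP branch is pruned after repeated failures, while on LTE, where nothing has been recorded, the original order is left untouched.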

505	4.2.  Branching Order-of-Operations

507	   Branch types must occur in a specific order relative to one another
508	   to avoid creating leaf nodes with invalid or incompatible settings.
509	   In the example above, it would be invalid to branch for derived
510	   endpoints (the DNS results for www.example.com) before branching
511	   between interface paths, since usable DNS results on one network may
512	   not necessarily be the same as DNS results on another network due to
513	   local network entities, supported address families, or enterprise
514	   network configurations.  Implementations must be careful to branch in
515	   an order that results in usable leaf nodes whenever there are
516	   multiple branch types that could be used from a single node.

518	   The order of operations for branching, where lower numbers are acted
519	   upon first, should be:

521	   1.  Alternate Paths

523	   2.  Protocol Options

525	   3.  Derived Endpoints

526	   Branching between paths is the first in the list because results
527	   across multiple interfaces are likely not related to one another:
528	   endpoint resolution may return different results, especially when
529	   using locally resolved host and service names, and which protocols
530	   are supported and preferred may differ across interfaces.  Thus, if
531	   multiple paths are attempted, the overall connection can be seen as a
532	   race between the available paths or interfaces.

534	   Protocol options are checked next in order.  Whether or not a set of
535	   protocols, or protocol-specific options, can successfully connect is
536	   generally not dependent on which specific IP address is used.
537	   Furthermore, the protocol stacks being attempted may influence or
538	   altogether change the endpoints being used.  Adding a proxy to a
539	   connection's branch will change the endpoint to the proxy's IP
540	   address or hostname.  Choosing an alternate protocol may also modify
541	   the ports that should be selected.

543	   Branching for derived endpoints is the final step, and may have
544	   multiple layers of derivation or resolution, such as DNS service
545	   resolution and DNS hostname resolution.

547	   For example, if the application has indicated both a preference for
548	   Wi-Fi over LTE and for a feature only available in SCTP, branches
549	   will first be sorted according to path selection, with Wi-Fi at the
550	   top.  Then, branches with SCTP will be sorted to the top within their
551	   subtree according to the properties influencing protocol selection.
552	   However, if the implementation has cached the information that SCTP
553	   is not available on the path over Wi-Fi, there is no SCTP node in the
554	   Wi-Fi subtree.  Here, the path over Wi-Fi will be tried first, and,
555	   if connection establishment succeeds, TCP will be used.  So the
556	   Selection Property of preferring Wi-Fi takes precedence over the
557	   Property that led to a preference for SCTP.

559	   1. [www.example.com:80, Any, Any Stream]
560	   1.1 [192.0.2.1:80, Wi-Fi, Any Stream]
561	   1.1.1 [192.0.2.1:80, Wi-Fi, TCP]
562	   1.2 [192.0.3.1:80, LTE, Any Stream]
563	   1.2.1 [192.0.3.1:80, LTE, SCTP]
564	   1.2.2 [192.0.3.1:80, LTE, TCP]
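
   The branching order can be illustrated with a minimal Python sketch.
   It is not part of any defined API: build_tree, protocols_on,
   resolve_on, and the two lookup tables are hypothetical names, and the
   cached protocol knowledge and DNS answers are made-up inputs chosen
   to reproduce the leaf nodes of the example above.  The intermediate
   node labels differ from the listing, because the sketch derives
   endpoints strictly last.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    children: list = field(default_factory=list)


def build_tree(host, port, paths, protocols_on, resolve_on):
    """Branch by path, then protocol, then derived endpoint."""
    root = Node(f"{host}:{port}, Any, Any Stream")
    for path in paths:                            # 1. Alternate Paths
        path_node = Node(f"{host}:{port}, {path}, Any Stream")
        for proto in protocols_on(path):          # 2. Protocol Options
            proto_node = Node(f"{host}:{port}, {path}, {proto}")
            for addr in resolve_on(host, path):   # 3. Derived Endpoints
                proto_node.children.append(
                    Node(f"{addr}:{port}, {path}, {proto}"))
            if proto_node.children:               # drop unusable branches
                path_node.children.append(proto_node)
        if path_node.children:
            root.children.append(path_node)
    return root


def leaves(node):
    """Return leaf labels in the order they would be attempted."""
    if not node.children:
        return [node.label]
    return [label for child in node.children for label in leaves(child)]


# Hypothetical cached knowledge: SCTP is known not to work over Wi-Fi.
cached_protocols = {"Wi-Fi": ["TCP"], "LTE": ["SCTP", "TCP"]}
# Hypothetical per-path DNS answers for www.example.com.
dns_answers = {"Wi-Fi": ["192.0.2.1"], "LTE": ["192.0.3.1"]}

tree = build_tree("www.example.com", 80, ["Wi-Fi", "LTE"],
                  lambda path: cached_protocols[path],
                  lambda host, path: dns_answers[path])
for label in leaves(tree):
    print(f"[{label}]")
```

   Because resolution happens inside the per-path loop, each path only
   ever sees the DNS answers obtained on that path.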

566	4.3.  Sorting Branches

568	   Implementations should sort the branches of the tree of connection
569	   options in order of their preference rank.  Leaf nodes on branches
570	   with higher rankings represent connection attempts that will be raced
571	   first.  Implementations should order the branches to reflect the
572	   preferences expressed by the application for its new connection,
573	   including Selection Properties, which are specified in
574	   [I-D.ietf-taps-interface].

576	   In addition to the properties provided by the application, an
577	   implementation may include additional criteria such as cached
578	   performance estimates, see Section 8.2, or system policy, see
579	   Section 3.2, in the ranking.  Two examples of how Selection and
580	   Connection Properties may be used to sort branches are provided
581	   below:

583	   o  "Interface Instance or Type": If the application specifies an
584	      interface type to be preferred or avoided, implementations should
585	      rank paths accordingly.  If the application specifies an interface
586	      type to be required or prohibited, we expect an implementation to
587	      not include the non-conforming paths in the tree.

589	   o  "Capacity Profile": An implementation may use the Capacity Profile
590	      to prefer paths optimized for the application's expected traffic
591	      pattern according to cached performance estimates, see
592	      Section 8.2:

594	      *  Scavenger: Prefer paths with the highest expected available
595	         bandwidth, based on observed maximum throughput

597	      *  Low Latency/Interactive: Prefer paths with the lowest expected
598	         Round Trip Time

600	      *  Constant-Rate Streaming: Prefer paths that can satisfy the
601	         requested Stream Send or Stream Receive Bitrate, based on
602	         observed maximum throughput

604	   Implementations should process properties in the following order:
605	   Prohibit, Require, Prefer, Avoid.  If Selection Properties contain
606	   any prohibited properties, the implementation should first purge
607	   branches containing nodes with these properties.  For required
608	   properties, it should only keep branches that satisfy these
609	   requirements.  Next, it should order branches according to
610	   preferred properties, and finally use avoided properties as a
611	   tiebreaker.
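
   The processing order can be sketched as follows.  This is only an
   illustration under stated assumptions: each branch is modelled as a
   dictionary with a hypothetical "props" set, and the property names
   ("wifi", "metered", and so on) are invented for the example rather
   than taken from the API.

```python
def rank_branches(branches, prohibit=(), require=(), prefer=(), avoid=()):
    """Apply Prohibit, Require, Prefer, Avoid in that order."""
    prohibit, require = set(prohibit), set(require)
    prefer, avoid = set(prefer), set(avoid)

    # 1. Prohibit: purge branches carrying any prohibited property.
    kept = [b for b in branches if not (b["props"] & prohibit)]
    # 2. Require: keep only branches satisfying every requirement.
    kept = [b for b in kept if require <= b["props"]]

    # 3. Prefer, 4. Avoid: more preferred properties rank first; the
    #    number of avoided properties only breaks ties.
    def sort_key(branch):
        return (-len(branch["props"] & prefer),
                len(branch["props"] & avoid))

    return sorted(kept, key=sort_key)   # stable: input order on ties


candidates = [
    {"name": "LTE/TCP", "props": {"lte", "tcp", "metered"}},
    {"name": "Ethernet/TCP", "props": {"ethernet", "tcp"}},
    {"name": "Wi-Fi/UDP", "props": {"wifi", "udp"}},
    {"name": "Wi-Fi/TCP", "props": {"wifi", "tcp"}},
]
ranked = rank_branches(candidates, prohibit={"udp"}, require={"tcp"},
                       prefer={"wifi"}, avoid={"metered"})
print([b["name"] for b in ranked])
```

   In this run the Wi-Fi/UDP branch is purged, Wi-Fi/TCP ranks first
   because it is preferred, and the avoided "metered" property places
   LTE/TCP behind Ethernet/TCP.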

613	4.4.  Candidate Racing

615	   The primary goal of the Candidate Racing process is to successfully
616	   negotiate a protocol stack to an endpoint over an interface--to
617	   connect a single leaf node of the tree--with as little delay and as
618	   few unnecessary connection attempts as possible.  Optimizing these
619	   two factors improves the user experience, while minimizing network
620	   load.

622	   This section covers the dynamic aspect of connection establishment.
623	   While the tree described above is a useful conceptual and
624	   architectural model, an implementation does not know what the full
625	   tree may become up front, nor will many of the possible branches be
626	   used in the common case.

628	   There are three different approaches to racing the attempts for
629	   different nodes of the connection establishment tree:

631	   1.  Immediate

633	   2.  Delayed

635	   3.  Failover

637	   Each approach is appropriate in different use-cases and branch types.
638	   However, to avoid consuming unnecessary network resources,
639	   implementations should not use immediate racing as a default
640	   approach.

642	   The timing algorithms for racing should remain independent across
643	   branches of the tree.  Any timers or racing logic is isolated to a
644	   given parent node, and is not ordered precisely with regard to the
645	   children of other nodes.

647	4.4.1.  Delayed

649	   Delayed racing can be used whenever a single node of the tree has
650	   multiple child nodes.  Based on the order determined when building
651	   the tree, the first child node will be initiated immediately,
652	   followed by the next child node after some delay.  Once that second
653	   child node is initiated, the third child node (if present) will begin
654	   after another delay, and so on until all child nodes have been
655	   initiated, or one of the child nodes successfully completes its
656	   negotiation.

658	   Delayed racing attempts occur in parallel.  Implementations should
659	   not terminate an earlier child connection attempt upon starting a
660	   secondary child.

662	   The delay between starting child nodes should be based on the
663	   properties of the previously started child node.  For example, if the
664	   first child represents an IP address with a known route, and the
665	   second child represents another IP address, the delay between
666	   starting the first and second IP addresses can be based on the
667	   expected retransmission cadence for the first child's connection
668	   (derived from historical round-trip-time).  Alternatively, if the
669	   first child represents a branch on a Wi-Fi interface, and the second
670	   child represents a branch on an LTE interface, the delay should be
671	   based on the expected time in which the branch for the first
672	   interface would be able to establish a connection, based on link
673	   quality and historical round-trip-time.

675	   Any delay should have a defined minimum and maximum value based on
676	   the branch type.  Generally, branches between paths and protocols
677	   should have longer delays than branches between derived endpoints.
678	   The maximum delay should be considered with regard to how long a
679	   user is expected to wait for the connection to complete.

681	   If a child node fails to connect before the delay timer has fired for
682	   the next child, the next child should be started immediately.
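
   The timing rules above can be explored with a small simulation.  The
   sketch below is an assumption-laden model, not a recommended
   implementation: delayed_race and its tuple layout are hypothetical,
   times are integer milliseconds, and the outcome of every attempt is
   given up front so that the schedule is deterministic.

```python
def clamp(value, low, high):
    return max(low, min(high, value))


def delayed_race(children, min_delay, max_delay):
    """Simulate delayed racing among the children of one parent node.

    children: ordered (name, delay_hint, duration, succeeds) tuples, with
    all times in integer milliseconds.  Returns (start_times, winner).
    """
    starts = {}
    finishes = []                 # (finish_time, name, succeeds)
    next_start = 0
    for name, delay_hint, duration, succeeds in children:
        # Stop starting children once an earlier attempt has connected.
        if any(ok and t <= next_start for t, _, ok in finishes):
            break
        starts[name] = next_start
        finishes.append((next_start + duration, name, succeeds))
        # The delay derives from the child just started, within bounds.
        timer = next_start + clamp(delay_hint, min_delay, max_delay)
        # A failure before the timer fires starts the next child at once.
        early_failures = [t for t, _, ok in finishes
                          if not ok and next_start < t < timer]
        next_start = min(early_failures, default=timer)
    connected = sorted((t, n) for t, n, ok in finishes if ok)
    return starts, (connected[0][1] if connected else None)


children = [
    ("A", 250, 1000, False),   # slow failure (timeout)
    ("B", 250, 100, False),    # fast failure, so C starts early
    ("C", 250, 200, True),     # connects at t=550
    ("D", 250, 100, True),     # never started: C connected before t=600
]
starts, winner = delayed_race(children, min_delay=100, max_delay=2000)
print(starts, winner)
```

   Attempt A keeps running while B and C are started, which mirrors the
   rule that earlier attempts are not terminated when a later child
   begins.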

684	4.4.2.  Failover

686	   If an implementation or application has a strong preference for one
687	   branch over another, the branching node may choose to wait until one
688	   child has failed before starting the next.  Failure of a leaf node is
689	   determined by its protocol negotiation failing or timing out; failure
690	   of a parent branching node is determined by all of its children
691	   failing.

693	   An example in which failover is recommended is a race between a
694	   protocol stack that uses a proxy and a protocol stack that bypasses
695	   the proxy.  Failover is useful in case the proxy is down or
696	   misconfigured, but any more aggressive type of racing may end up
697	   unnecessarily avoiding a proxy that was preferred by policy.
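
   A minimal sketch of failover between a proxied and a direct protocol
   stack is shown below.  The failover function and the connect
   callables are hypothetical stand-ins; a real attempt would be
   asynchronous and would report failure on negotiation error or
   timeout.

```python
def failover(attempts):
    """Try each (name, connect) in order; start the next only on failure.

    connect is a callable returning True on success.  Returns the name of
    the first attempt that connected (or None) and the attempts started.
    """
    started = []
    for name, connect in attempts:
        started.append(name)
        if connect():
            return name, started
    return None, started


# Proxy is up: the direct stack is never attempted.
print(failover([("via-proxy", lambda: True), ("direct", lambda: True)]))
# Proxy is down or misconfigured: fall back to the direct stack.
print(failover([("via-proxy", lambda: False), ("direct", lambda: True)]))
```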

699	4.5.  Completing Establishment

701	   The process of connection establishment completes when one leaf node
702	   of the tree has completed negotiation with the remote endpoint
703	   successfully, or else all nodes of the tree have failed to connect.
704	   The first leaf node to complete its connection is then used by the
705	   application to send and receive data.

707	   It is useful to process success and failure throughout the tree by
708	   child nodes reporting to their parent nodes (towards the trunk of the
709	   tree).  For example, in the following case, if 1.1.1 fails to
710	   connect, it reports the failure to 1.1.  Since 1.1 has no other child
711	   nodes, it also has failed and reports that failure to 1.  Because 1.2
712	   has not yet failed, 1 is not considered to have failed.  Since 1.2
713	   has not yet started, it is started and the process continues.
714	   Similarly, if 1.1.1 successfully connects, then it marks 1.1 as
715	   connected, which propagates to the trunk node 1.  At this point, the
716	   connection as a whole is considered to be successfully connected and
717	   ready to process application data.

718	   1 [www.example.com:80, Any, TCP]
719	     1.1 [www.example.com:80, Wi-Fi, TCP]
720	       1.1.1 [192.0.2.1:80, Wi-Fi, TCP]
721	     1.2 [www.example.com:80, LTE, TCP]
722	   ...
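
   The reporting of success and failure towards the trunk can be
   sketched as below.  RaceNode and its method names are hypothetical;
   the node labels follow the example tree above, and starting node 1.2
   after the failure of 1.1 is left out for brevity.

```python
class RaceNode:
    def __init__(self, label, parent=None):
        self.label = label
        self.parent = parent
        self.children = []
        self.state = "pending"
        if parent is not None:
            parent.children.append(self)

    def report_failure(self):
        self.state = "failed"
        parent = self.parent
        # A parent fails only once every one of its children has failed.
        if parent is not None and all(
                child.state == "failed" for child in parent.children):
            parent.report_failure()

    def report_success(self):
        # Success propagates unconditionally to the trunk node.
        self.state = "connected"
        if self.parent is not None:
            self.parent.report_success()


def example_tree():
    root = RaceNode("1")
    n11 = RaceNode("1.1", root)
    n111 = RaceNode("1.1.1", n11)
    n12 = RaceNode("1.2", root)
    return root, n11, n111, n12


root, n11, n111, n12 = example_tree()
n111.report_failure()
print(n11.state, root.state)     # 1.1 failed; 1 still pending (1.2 left)

root2, n11b, n111b, n12b = example_tree()
n111b.report_success()
print(n11b.state, root2.state)   # both connected
```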

724	   If a leaf node has successfully completed its connection, all other
725	   attempts should be made ineligible for use by the application for the
726	   original request.  New connection attempts that involve transmitting
727	   data on the network should not be started after another leaf node has
728	   completed successfully, as the connection as a whole has been
729	   established.  An implementation may choose to let certain handshakes
730	   and negotiations complete in order to gather metrics to influence
731	   future connections.  Similarly, an implementation may choose to hold
732	   onto fully established leaf nodes that were not the first to
733	   establish for use in future connections, but this approach is not
734	   recommended since those attempts were slower to connect and may
735	   exhibit less desirable properties.

737	4.5.1.  Determining Successful Establishment

739	   Implementations may select the criteria by which a leaf node is
740	   considered to be successfully connected differently on a per-protocol
741	   basis.  If the only protocol being used is a transport protocol with
742	   a clear handshake, like TCP, then the obvious choice is to declare
743	   that node "connected" when the last packet of the three-way handshake
744	   has been received.  If the only protocol being used is an
745	   "unconnected" protocol, like UDP, the implementation may consider the
746	   node fully "connected" the moment it determines a route is present,
747	   before sending any packets on the network, see further Section 4.7.

749	   For protocol stacks with multiple handshakes, the decision becomes
750	   more nuanced.  If the protocol stack involves both TLS and TCP, an
751	   implementation could determine that a leaf node is connected after
752	   the TCP handshake is complete, or it can wait for the TLS handshake
753	   to complete as well.  The benefit of declaring completion when the
754	   TCP handshake finishes, and thus stopping the race for other branches
755	   of the tree, is that there will be less burden on the network from
756	   other connection attempts.  On the other hand, by waiting until the
757	   TLS handshake is complete, an implementation avoids the scenario in
758	   which a TCP handshake completes quickly, but TLS negotiation is
759	   either very slow or fails altogether in particular network conditions
760	   or to a particular endpoint.  To avoid the issue of TLS possibly
761	   failing, the implementation should not generate a Ready event for the
762	   Connection until TLS is established.

764	   If all of the leaf nodes fail to connect during racing, i.e., none of
765	   the configurations that satisfy all requirements given in the
766	   Transport Parameters actually work over the available paths, then the
767	   transport system should notify the application with an InitiateError
768	   event.  An InitiateError event should also be generated in case the
769	   transport system finds no usable candidates to race.

771	4.6.  Establishing multiplexed connections

773	   Multiplexing several Connections over a single underlying transport
774	   connection requires that the Connections to be multiplexed belong to
775	   the same Connection Group (as is indicated by the application using
776	   the Clone call).  When the underlying transport connection supports
777	   multi-streaming, the Transport System can map each Connection in the
778	   Connection Group to a different stream.  Thus, when the Connections
779	   that are offered to an application by the Transport System are
780	   multiplexed, the Transport System may implement the establishment of
781	   a new Connection by simply beginning to use a new stream of an
782	   already established transport connection and there is no need for a
783	   connection establishment procedure.  This, then, also means that
784	   there may not be any "establishment" message (like a TCP SYN), but
785	   the application can simply start sending or receiving.  Therefore,
786	   when the Initiate action of a Transport System is called without
787	   Messages being handed over, it cannot be guaranteed that the other
788	   endpoint will have any way to know about this, and hence a passive
789	   endpoint's ConnectionReceived event may not be called upon an active
790	   endpoint's Initiate.  Instead, calling the ConnectionReceived event
791	   may be delayed until the first Message arrives.

793	4.7.  Handling racing with "unconnected" protocols

795	   While protocols that use an explicit handshake to validate a
796	   Connection to a peer can be used for racing multiple establishment
797	   attempts in parallel, "unconnected" protocols such as raw UDP do not
798	   offer a way to validate the presence of a peer or the usability of a
799	   Connection without application feedback.  An implementation should
800	   consider such a protocol stack to be established as soon as a local
801	   route to the peer endpoint is confirmed.

803	   However, if a peer is not reachable over the network using the
804	   unconnected protocol, or data cannot be exchanged for any other
805	   reason, the application may want to attempt using another candidate
806	   Protocol Stack.  The implementation should maintain the list of other
807	   candidate Protocol Stacks that were eligible to use.  In the case
808	   that the application signals that the initial Protocol Stack is
809	   failing for some reason and that another option should be attempted,
810	   the Connection can be updated to point to the next candidate Protocol
811	   Stack.  This can be viewed as an application-driven form of Protocol
812	   Stack racing.
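
   One possible shape for this application-driven fallback is sketched
   below.  The class UnconnectedConnection, its report_failure method,
   and the stack labels are hypothetical; the sketch only shows the
   bookkeeping of the remaining candidate Protocol Stacks.

```python
class UnconnectedConnection:
    """Keeps the eligible candidate Protocol Stacks in ranked order."""

    def __init__(self, candidates):
        if not candidates:
            raise ValueError("at least one candidate stack is required")
        self._candidates = list(candidates)
        self._index = 0

    @property
    def current(self):
        return self._candidates[self._index]

    def report_failure(self):
        """Called when the application signals the current stack fails.

        Returns True if the Connection now points to another candidate,
        or False if no candidates remain.
        """
        if self._index + 1 >= len(self._candidates):
            return False
        self._index += 1
        return True


conn = UnconnectedConnection(["UDP over Wi-Fi", "UDP over LTE"])
print(conn.current)             # UDP over Wi-Fi
print(conn.report_failure())    # True: switched to the next candidate
print(conn.current)             # UDP over LTE
print(conn.report_failure())    # False: nothing left to try
```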

814	4.8.  Implementing listeners

816	   When an implementation is asked to Listen, it registers with the
817	   system to wait for incoming traffic to the Local Endpoint.  If no
818	   Local Endpoint is specified, the implementation should either use an
819	   ephemeral port or generate an error.

821	   If the Selection Properties do not require a single network interface
822	   or path, but allow the use of multiple paths, the Listener object
823	   should register for incoming traffic on all of the network interfaces
824	   or paths that conform to the Properties.  The set of available paths
825	   can change over time, so the implementation should monitor network
826	   path changes and register and de-register the Listener across all
827	   usable paths.  When using multiple paths, the Listener is generally
828	   expected to use the same port for listening on each.

830	   If the Selection Properties allow multiple protocols to be used for
831	   listening, and the implementation supports it, the Listener object
832	   should register across the eligible protocols for each path.  This
833	   means that inbound Connections delivered by the implementation may
834	   have heterogeneous protocol stacks.

836	4.8.1.  Implementing listeners for Connected Protocols

838	   Connected protocols such as TCP and TLS-over-TCP have a strong
839	   mapping between the Local and Remote Endpoints (five-tuple) and their
840	   protocol connection state.  These map well into Connection objects.
841	   Whenever a new inbound handshake is being started, the Listener
842	   should generate a new Connection object and pass it to the
843	   application.

845	4.8.2.  Implementing listeners for Unconnected Protocols

847	   Unconnected protocols such as UDP and UDP-Lite generally do not
848	   provide the same mechanisms that connected protocols do to offer
849	   Connection objects.  Implementations should wait for incoming packets
850	   for unconnected protocols on a listening port and should perform
851	   five-tuple matching of packets to either existing Connection objects
852	   or the creation of new Connection objects.  On platforms with
853	   facilities to create a "virtual connection" for unconnected
854	   protocols, implementations should use these mechanisms to minimise
855	   the handling of datagrams intended for existing Connection objects.
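
   Five-tuple matching for an unconnected protocol can be sketched with
   a dictionary keyed by the five-tuple.  DatagramListener and
   on_datagram are hypothetical names, and the list new_connections
   merely stands in for raising a ConnectionReceived event.

```python
class DatagramListener:
    def __init__(self, local_addr, local_port, protocol="UDP"):
        self.local = (local_addr, local_port)
        self.protocol = protocol
        self.connections = {}       # five-tuple -> received payloads
        self.new_connections = []   # stand-in for ConnectionReceived

    def on_datagram(self, remote_addr, remote_port, payload):
        """Match a datagram to an existing Connection or create one."""
        key = (self.protocol, *self.local, remote_addr, remote_port)
        if key not in self.connections:
            self.connections[key] = []
            self.new_connections.append(key)
        self.connections[key].append(payload)
        return key


listener = DatagramListener("192.0.2.10", 5000)
listener.on_datagram("198.51.100.7", 40000, b"hello")
listener.on_datagram("198.51.100.7", 40000, b"again")    # same tuple
listener.on_datagram("198.51.100.8", 40001, b"other")    # new tuple
print(len(listener.new_connections))   # 2
```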

857	4.8.3.  Implementing listeners for Multiplexed Protocols

859	   Protocols that provide multiplexing of streams into a single five-
860	   tuple can listen both for entirely new connections (a new HTTP/2
861	   stream on a new TCP connection, for example) and for new sub-
862	   connections (a new HTTP/2 stream on an existing connection).  If the
863	   abstraction of Connection presented to the application is mapped to
864	   the multiplexed stream, then the Listener should deliver new
865	   Connection objects in the same way for either case.  The
866	   implementation should allow the application to introspect the
867	   Connection Group marked on the Connections to determine the grouping
868	   of the multiplexing.

870	5.  Implementing Data Transfer

872	5.1.  Data transfer for streams, datagrams, and frames

874	   The most basic mapping for sending a Message is an abstraction of
875	   datagrams, in which the transport protocol naturally deals in
876	   discrete packets.  Each Message here corresponds to a single
877	   datagram.  Generally, these will be short enough that sending and
878	   receiving will always use a complete Message.

880	   For protocols that expose byte-streams, the only delineation provided
881	   by the protocol is the end of the stream in a given direction.  Each
882	   Message in this case corresponds to the entire stream of bytes in a
883	   direction.  These Messages may be quite long, in which case they can
884	   be sent in multiple parts.

886	   Protocols that provide the framing (such as length-value protocols,
887	   or protocols that use delimiters) provide data boundaries that may be
888	   longer than a traditional packet datagram.  Each Message for framing
889	   protocols corresponds to a single frame, which may be sent either as
890	   a complete Message, or in multiple parts.

892	5.1.1.  Sending Messages

894	   The effect of the application sending a Message is determined by the
895	   top-level protocol in the established Protocol Stack.  That is, if
896	   the top-level protocol provides an abstraction of framed messages
897	   over a connection, the receiving application will be able to obtain
898	   multiple Messages on that connection, even if the framing protocol is
899	   built on a byte-stream protocol like TCP.

901	5.1.1.1.  Message Properties

903	   o  Lifetime: this should be implemented by removing the Message from
904	      its queue of pending Messages after the Lifetime has expired.  A
905	      queue of pending Messages within the transport system
906	      implementation that have yet to be handed to the Protocol Stack
907	      can always support this property, but once a Message has been sent
908	      into the send buffer of a protocol, only certain protocols may
909	      support de-queueing a message.  For example, TCP cannot remove
910	      bytes from its send buffer, while in case of SCTP, such control
911	      over the SCTP send buffer can be exercised using the partial
912	      reliability extension [RFC8303].  When there is no standing queue
913	      of Messages within the system, and the Protocol Stack does not
914	      support removing a Message from its buffer, this property may be
915	      ignored.

917	   o  Priority: this represents the ability to prioritize a Message over
918	      other Messages.  This can be implemented by the system re-ordering
919	      Messages that have yet to be handed to the Protocol Stack, or by
920	      giving relative priority hints to protocols that support
921	      priorities per Message.  For example, an implementation of HTTP/2
922	      could choose to send Messages of different Priority on streams of
923	      different priority.

925	   o  Ordered: when this is false, it disables the requirement of in-
926	      order-delivery for protocols that support configurable ordering.

928	   o  Idempotent: when this is true, it means that the Message can be
929	      used by mechanisms that might transfer it multiple times - e.g.,
930	      as a result of racing multiple transports or as part of TCP Fast
931	      Open.

933	   o  Final: when this is true, it means that a transport connection can
934	      be closed immediately after its transmission.

936	   o  Corruption Protection Length: when this is set to any value other
937	      than -1, it limits the required checksum in protocols that allow
938	      limiting the checksum length (e.g., UDP-Lite).

940	   o  Transmission Profile: TBD - because it's not final in the API yet.
941	      Old text follows: when this is set to "Interactive/Low Latency",
942	      the Message should be sent immediately, even when this comes at
943	      the cost of using the network capacity less efficiently.  For
944	      example, small messages can sometimes be bundled to fit into a
945	      single data packet for the sake of reducing header overhead; such
946	      bundling should not be used.  For example, in case of TCP, the
947	      Nagle algorithm should be disabled when Interactive/Low Latency is
948	      selected as the capacity profile.  Scavenger/Bulk can translate
949	      into usage of a congestion control mechanism such as LEDBAT, and/
950	      or the capacity profile can lead to a choice of a DSCP value as
951	      described in [I-D.ietf-taps-minset].

953	   o  Singular Transmission: when this is true, the application requests
954	      to avoid transport-layer segmentation or network-layer
955	      fragmentation.  Some transports implement network-layer
956	      fragmentation avoidance (Path MTU Discovery) without exposing this
957	      functionality to the application; in this case, only transport-
958	      layer segmentation should be avoided, by fitting the message into
959	      a single transport-layer segment or otherwise failing.  Otherwise,
960	      network-layer fragmentation should be avoided--e.g. by requesting
961	      the IP Don't Fragment bit to be set in case of UDP(-Lite) and IPv4
962	      (SET_DF in [RFC8304]).
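
   As an illustration of the Lifetime and Priority properties, the
   sketch below models the queue of Messages that have not yet been
   handed to the Protocol Stack.  PendingMessages is a hypothetical
   name, time is passed in explicitly to keep the example deterministic,
   and the sketch assumes that a lower priority number means "more
   urgent", which is an assumption rather than something defined here.

```python
import heapq
import itertools


class PendingMessages:
    def __init__(self):
        self._heap = []
        self._seq = itertools.count()   # keeps FIFO order per priority

    def enqueue(self, data, now, priority=0, lifetime=None):
        expiry = float("inf") if lifetime is None else now + lifetime
        heapq.heappush(self._heap,
                       (priority, next(self._seq), expiry, data))

    def dequeue(self, now):
        """Return the next Message to hand to the Protocol Stack."""
        while self._heap:
            _, _, expiry, data = heapq.heappop(self._heap)
            if now < expiry:            # Lifetime not yet expired
                return data
            # Expired Messages are silently dropped from the queue.
        return None


queue = PendingMessages()
queue.enqueue(b"bulk", now=0, priority=5)
queue.enqueue(b"stale", now=0, priority=1, lifetime=2)
queue.enqueue(b"urgent", now=0, priority=1)
print(queue.dequeue(now=3))   # b'urgent' (b'stale' expired at t=2)
print(queue.dequeue(now=3))   # b'bulk'
print(queue.dequeue(now=3))   # None
```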

964	5.1.1.2.  Send Completion

966	   The application should be notified whenever a Message or partial
967	   Message has been consumed by the Protocol Stack, or has failed to
968	   send.  The meaning of the Message being consumed by the stack may
969	   vary depending on the protocol.  For a basic datagram protocol like
970	   UDP, this may correspond to the time when the packet is sent into the
971	   interface driver.  For a protocol that buffers data in queues, like
972	   TCP, this may correspond to when the data has entered the send
973	   buffer.

975	5.1.1.3.  Batching Sends

977	   Since sending a Message may involve a context switch between the
978	   application and the transport system, sending patterns that involve
979	   multiple small Messages can incur high overhead if each needs to be
980	   enqueued separately.  To avoid this, the application should have a
981	   way to indicate a batch of Send actions, during which time the
982	   implementation will hold off on processing Messages until the batch
983	   is complete.  This can also reduce context switches when enqueuing
984	   data in the interface driver if the operation can be batched.
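
   One way to expose batching is a scoped construct that holds Messages
   until the batch ends.  The sketch below is hypothetical throughout:
   BatchingConnection, batch, and the handoffs list (which records one
   entry per hand-off to the transport system) are illustrative names,
   and nested batches are not handled.

```python
from contextlib import contextmanager


class BatchingConnection:
    def __init__(self):
        self._batching = False
        self._held = []
        self.handoffs = []   # one entry per hand-off to the system

    def send(self, message):
        if self._batching:
            self._held.append(message)        # hold until batch ends
        else:
            self.handoffs.append([message])   # processed individually

    @contextmanager
    def batch(self):
        self._batching = True
        try:
            yield self
        finally:
            self._batching = False
            if self._held:
                self.handoffs.append(self._held)   # single hand-off
                self._held = []


conn = BatchingConnection()
conn.send("a")
with conn.batch():
    conn.send("b")
    conn.send("c")
    conn.send("d")
print(conn.handoffs)   # [['a'], ['b', 'c', 'd']]
```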

986	5.1.2.  Receiving Messages

988	   Similar to sending, Receiving a Message is determined by the top-
989	   level protocol in the established Protocol Stack.  The main
990	   difference with Receiving is that the size and boundaries of the
991	   Message are not known beforehand.  The application can communicate in
992	   its Receive action the parameters for the Message, which can help the
993	   implementation know how much data to deliver and when.  For example,
994	   if the application only wants to receive a complete Message, the
995	   implementation should wait until an entire Message (datagram, stream,
996	   or frame) is read before delivering any Message content to the
997	   application.  This requires the implementation to understand where
998	   messages end, either via a supplied deframer or because the top-level
999	   protocol in the established Protocol Stack preserves message
1000	   boundaries; if, on the other hand, the top-level protocol only
1001	   supports a byte-stream and no deframer was supplied, the
1002	   application must specify the minimum number of bytes of Message
1003	   content it wants to receive (which may be just a single byte) to
1004	   control the flow of received data.

1006	   If a Connection becomes finished before a requested Receive action
1007	   can be satisfied, the implementation should deliver any partial
1008	   Message content outstanding, or if none is available, an indication
1009	   that there will be no more received Messages.
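
   The role of a deframer on top of a byte-stream can be sketched as
   follows.  The framing used here (a two-byte big-endian length
   followed by the payload) and the names LengthPrefixDeframer, feed,
   and finish are assumptions made for the example only.

```python
import struct


class LengthPrefixDeframer:
    def __init__(self):
        self._buffer = bytearray()

    def feed(self, data):
        """Add stream bytes; return the complete Messages now available."""
        self._buffer += data
        messages = []
        while len(self._buffer) >= 2:
            (length,) = struct.unpack_from("!H", self._buffer)
            if len(self._buffer) < 2 + length:
                break                      # Message not yet complete
            messages.append(bytes(self._buffer[2:2 + length]))
            del self._buffer[:2 + length]
        return messages

    def finish(self):
        """Connection finished: return partial content, or None."""
        partial = bytes(self._buffer[2:])
        self._buffer.clear()
        return partial or None


stream = struct.pack("!H", 5) + b"hello" + struct.pack("!H", 5) + b"wor"
deframer = LengthPrefixDeframer()
print(deframer.feed(stream[:4]))   # []: first Message still incomplete
print(deframer.feed(stream[4:]))   # [b'hello']
print(deframer.finish())           # b'wor': partial Message content
```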

1011	5.2.  Handling of data for fast-open protocols

1013	   Several protocols allow sending higher-level protocol or application
1014	   data within the first packet of their protocol establishment, such as
1015	   TCP Fast Open [RFC7413] and TLS 1.3 [RFC8446].  This approach is
1016	   referred to as sending Zero-RTT (0-RTT) data.  This is a desirable
1017	   property, but poses challenges to an implementation that uses racing
1018	   during connection establishment.

1020	   If the application has 0-RTT data to send in any protocol handshakes,
1021	   it needs to provide this data before the handshakes have begun.  When
1022	   racing, this means that the data should be provided before the
1023	   process of connection establishment has begun.  If the application
1024	   wants to send 0-RTT data, it must indicate this to the implementation
1025	   by setting the Idempotent send parameter to true when sending the
1026	   data.  In general, 0-RTT data may be replayed (for example, if a TCP
1027	   SYN contains data, and the SYN is retransmitted, the data will be
1028	   retransmitted as well), but racing means that different leaf nodes
1029	   have the opportunity to send the same data independently.  If data is
1030	   truly idempotent, this should be permissible.

1032	   Once the application has provided its 0-RTT data, an implementation
1033	   should keep a copy of this data and provide it to each new leaf node
1034	   that is started and for which a 0-RTT protocol is being used.

1036	   It is also possible that protocol stacks within a particular leaf
1037	   node use 0-RTT handshakes without any idempotent application data.
1038	   For example, TCP Fast Open could use a Client Hello from TLS as its
1039	   0-RTT data, shortening the cumulative handshake time.

1041	   0-RTT handshakes often rely on previous state, such as TCP Fast Open
1042	   cookies, previously established TLS tickets, or out-of-band
1043	   distributed pre-shared keys (PSKs).  Implementations should be aware
1044	   of security concerns around using these tokens across multiple
1045	   addresses or paths when racing.  In the case of TLS, any given ticket
1046	   or PSK should only be used on one leaf node.  If implementations have
1047	   multiple tickets available from a previous connection, each leaf node
1048	   attempt must use a different ticket.  In effect, each leaf node will
1049	   send the same early application data, yet encoded (encrypted)
1050	   differently on the wire.
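
   The handling of early application data and tickets across racing
   leaf nodes can be sketched as below.  plan_zero_rtt and the ticket
   and leaf labels are hypothetical; the sketch covers early application
   data only and assumes that a leaf node without an unused ticket
   simply falls back to a full handshake.

```python
def plan_zero_rtt(leaves, tickets, early_data, idempotent):
    """Map each leaf to (ticket, copy of early data) where permitted."""
    if not idempotent:
        return {}                 # data may not be sent more than once
    unused = list(tickets)
    plan = {}
    for leaf in leaves:
        if not unused:
            break                 # remaining leaves get no 0-RTT data
        # Each ticket is consumed by exactly one leaf node.
        plan[leaf] = (unused.pop(0), bytes(early_data))
    return plan


plan = plan_zero_rtt(["leaf-1", "leaf-2", "leaf-3"],
                     ["ticket-a", "ticket-b"],
                     b"GET / HTTP/1.1\r\n\r\n", idempotent=True)
for leaf, (ticket, data) in plan.items():
    print(leaf, ticket, len(data))
```

   With two tickets and three leaf nodes, only the first two leaf nodes
   carry early data, each under a different ticket.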

1052	6.  Implementing Maintenance

1054	   Maintenance encompasses changes that the application can request to a
1055	   Connection, or that a Connection can react to based on system and
1056	   network changes.

1058	6.1.  Managing Connections

1060	   Appendix A.1 of [I-D.ietf-taps-minset] explains, using primitives
1061	   from [RFC8303] and [RFC8304], how to implement changing some of the
1062	   following protocol properties of an established connection with TCP
1063	   and UDP.  Below, we amend this description for other protocols (if
1064	   applicable) and extend it with Connection Properties that are not
1065	   contained in [I-D.ietf-taps-minset].

1067	   o  Notification of excessive retransmissions: TODO

1069	   o  Retransmission threshold before excessive retransmission
1070	      notification: TODO; for TCP, this can be done using ERROR.TCP
1071	      described in section 4 of [RFC8303].

1073	   o  Notification of ICMP soft error message arrival: TODO

1075	   o  Required minimum coverage of the checksum for receiving: for UDP-
1076	      Lite, this can be done using the primitive
1077	      SET_MIN_CHECKSUM_COVERAGE.UDP-Lite described in section 4 of
1078	      [RFC8303].

1080	   o  Priority (Connection): TODO; for SCTP, this can be done using the
1081	      primitive CONFIGURE_STREAM_SCHEDULER.SCTP described in section 4
1082	      of [RFC8303].

1084	   o  Timeout for aborting Connection: for SCTP, this can be done using
1085	      the primitive CHANGE_TIMEOUT.SCTP described in section 4 of
1086	      [RFC8303].

1088	   o  Connection group transmission scheduler: for SCTP, this can be
1089	      done using the primitive SET_STREAM_SCHEDULER.SCTP described in
1090	      section 4 of [RFC8303].

1092	   o  Maximum message size concurrent with Connection establishment:
1093	      TODO

1095	   o  Maximum Message size before fragmentation or segmentation: TODO

1097	   o  Maximum Message size on send: TODO

1099	   o  Maximum Message size on receive: TODO

1100	   o  Capacity Profile: TODO

1102	   o  Bounds on Send or Receive Rate: TODO

1104	   o  TCP-specific Property: User Timeout: for TCP, this can be
1105	      configured using the primitive CHANGE_TIMEOUT.TCP described in
1106	      section 4 of [RFC8303].

1108	   It may happen that the application attempts to set a Protocol
1109	   Property which does not apply to the actually chosen protocol.  In
1110	   this case, the implementation should fail gracefully, i.e., it may
1111	   give a warning to the application, but it should not terminate the
1112	   Connection.

1114	6.2.  Handling Path Changes

1116	   When a path change occurs, the Transport Services implementation is
1117	   responsible for notifying Protocol Instances in the Protocol Stack.
1118	   If the Protocol Stack includes a transport protocol that supports
1119	   multipath connectivity, an update to the available paths should
1120	   inform the Protocol Instance of the new set of paths that are
1121	   permissible based on the Selection Properties passed by the
1122	   application.  A multipath protocol can establish new subflows over
1123	   new paths, and should tear down subflows over paths that are no
1124	   longer available.  If the Protocol Stack includes a transport
1125	   protocol that does not support multipath, but supports migrating
1126	   between paths, the update to available paths can be used as the
1127	   trigger for migrating the connection.  For protocols that do not
1128	   support multipath or migration, the Protocol Instances may be
1129	   informed of the path change, but should not be forcibly disconnected
1130	   if the previously used path becomes unavailable.  An exception to
1131	   this case is if the System Policy changes to prohibit traffic from
1132	   the Connection based on its properties, in which case the Protocol
1133	   Stack should be disconnected.
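
   The cases above can be sketched as a single dispatch function.  This
   is an illustration under stated assumptions: PathSupport,
   ProtocolInstance, and on_path_change are hypothetical names, paths
   are plain strings, and the set of permitted paths is assumed to have
   been filtered by the Selection Properties already.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class PathSupport(Enum):
    MULTIPATH = auto()
    MIGRATION = auto()
    NONE = auto()


@dataclass
class ProtocolInstance:
    support: PathSupport
    active_paths: set = field(default_factory=set)
    connected: bool = True


def on_path_change(inst, permitted_paths, policy_allows=True):
    """Update one Protocol Instance after the set of usable paths changes."""
    permitted = set(permitted_paths)
    if not policy_allows:
        # System Policy now prohibits this traffic: disconnect.
        inst.connected = False
        inst.active_paths = set()
        return "disconnected"
    if inst.support is PathSupport.MULTIPATH:
        # Add subflows on new paths, drop subflows on vanished paths.
        inst.active_paths = permitted
        return "subflows-updated"
    if inst.support is PathSupport.MIGRATION:
        # Migrate only if the current path is gone and another one exists.
        if permitted and not (inst.active_paths & permitted):
            inst.active_paths = {sorted(permitted)[0]}
            return "migrated"
        return "unchanged"
    # Neither multipath nor migration: inform, but never force a disconnect.
    return "informed"
```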

1135	7.  Implementing Termination

1137	   With TCP, when an application closes a connection, this means that it
1138	   has no more data to send (but expects all data that has been handed
1139	   over to be reliably delivered).  However, with TCP only, "close" does
1140	   not mean that the application will stop receiving data.  This is
1141	   related to TCP's ability to support half-closed connections.

1143	   SCTP is an example of a protocol that does not support such half-
1144	   closed connections.  Hence, with SCTP, the meaning of "close" is
1145	   stricter: an application has no more data to send (but expects all
1146	   data that has been handed over to be reliably delivered), and will
1147	   also not receive any more data.

1149	   Implementing a protocol-independent transport system means that the
1150	   exposed semantics must be the strictest subset of the semantics of
1151	   all supported protocols.  Hence, as is common with all reliable
1152	   transport protocols, after a Close action, the application can expect
1153	   to have its reliability requirements honored regarding the data it
1154	   has given to the Transport System, but it cannot expect to be able to
1155	   read any more data after calling Close.

1157	   Abort differs from Close only in that no guarantees are given
1158	   regarding data that the application has handed over to the Transport
1159	   System before calling Abort.
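
   The difference between the two actions can be illustrated with a toy
   model.  The class below is hypothetical and ignores the network
   entirely: "delivered" merely stands in for data whose reliability
   requirements were honored.

```python
class Connection:
    """Toy model of Close versus Abort semantics."""

    def __init__(self):
        self.pending = []     # handed over by the application, not yet delivered
        self.delivered = []   # data whose delivery requirements were honored
        self.state = "established"

    def send(self, data):
        if self.state != "established":
            raise RuntimeError("cannot send after Close or Abort")
        self.pending.append(data)

    def receive(self):
        # The strictest common semantics: no reads after Close or Abort.
        if self.state != "established":
            raise RuntimeError("cannot receive after Close or Abort")
        return b""

    def close(self):
        # Close: everything handed over so far is still delivered.
        self.delivered.extend(self.pending)
        self.pending.clear()
        self.state = "closed"

    def abort(self):
        # Abort: no guarantee for data that was handed over earlier.
        self.pending.clear()
        self.state = "aborted"
```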

1161	   As explained in Section 4.6, when a new stream is multiplexed on an
1162	   already existing connection of a Transport Protocol Instance, there
1163	   is no need for a connection establishment procedure.  Because the
1164	   Connections that are offered by the Transport System can be
1165	   implemented as streams that are multiplexed on a transport protocol's
1166	   connection, it can therefore not be guaranteed that one Endpoint's
1167	   Initiate action provokes a ConnectionReceived event at its peer.

1169	   For Close (provoking a Finished event) and Abort (provoking a
1170	   ConnectionError event), the same logic applies: while it is desirable
1171	   to be informed when a peer closes or aborts a Connection, whether
1172	   this is possible depends on the underlying protocol, and no
1173	   guarantees can be given.  With SCTP, the transport system can use the
1174	   stream reset procedure to cause a Finished event upon a Close action
1175	   from the peer [NEAT-flow-mapping].

1177	8.  Cached State

1179	   Beyond a single Connection's lifetime, it is useful for an
1180	   implementation to keep state and history.  This cached state can help
1181	   improve future Connection establishment due to re-using results and
1182	   credentials, and favoring paths and protocols that performed well in
1183	   the past.

1185	   Cached state may be associated with different Endpoints for the same
1186	   Connection, depending on the protocol generating the cached content.
1187	   For example, session tickets for TLS are associated with specific
1188	   endpoints, and thus should be cached based on a Connection's hostname
1189	   Endpoint (if applicable).  On the other hand, performance
1190	   characteristics of a path are more likely tied to the IP address and
1191	   subnet being used.

1193	8.1.  Protocol state caches

1195	   Some protocols will have long-term state to be cached in association
1196	   with Endpoints.  This state often has some time after which it is
1197	   expired, so the implementation should allow each protocol to specify
1198	   an expiration for cached content.

1200	   Examples of cached protocol state include:

1202	   o  The DNS protocol can cache resolution answers (A and AAAA queries,
1203	      for example), associated with a Time To Live (TTL) to be used for
1204	      future hostname resolutions without querying the DNS resolver
1205	      again.

1207	   o  TLS caches session state and tickets based on a hostname, which
1208	      can be used for resuming sessions with a server.

1210	   o  TCP can cache cookies for use in TCP Fast Open.

1212	   Cached protocol state is primarily used during Connection
1213	   establishment for a single Protocol Stack, but may be used to
1214	   influence an implementation's preference between several candidate
1215	   Protocol Stacks.  For example, if two IP address Endpoints are
1216	   otherwise equally preferred, an implementation may choose to attempt
1217	   a connection to an address for which it has a TCP Fast Open cookie.

1219	   Applications must have a way to flush protocol cache state if
1220	   desired.  This may be necessary, for example, if application-layer
1221	   identifiers rotate and clients wish to avoid linkability via
1222	   trackable TLS tickets or TFO cookies.
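
   One possible shape for such a cache is sketched below: entries are
   keyed by protocol and Endpoint, each protocol supplies its own
   lifetime, and the application can flush everything or a single
   protocol's state.  The class and method names are assumptions made
   for illustration only.

```python
import time


class ProtocolStateCache:
    """Per-protocol cache with expiry and an application-visible flush."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}   # (protocol, endpoint) -> (value, expiry time)

    def put(self, protocol, endpoint, value, lifetime):
        """Store a value; the protocol chooses the lifetime in seconds."""
        self._entries[(protocol, endpoint)] = (value, self._clock() + lifetime)

    def get(self, protocol, endpoint):
        """Return the cached value, or None if absent or expired."""
        key = (protocol, endpoint)
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires = entry
        if self._clock() >= expires:
            del self._entries[key]
            return None
        return value

    def flush(self, protocol=None):
        """Drop all state, or only the state of one protocol."""
        if protocol is None:
            self._entries.clear()
        else:
            self._entries = {
                k: v for k, v in self._entries.items() if k[0] != protocol
            }
```

   A client that rotates its application-layer identifiers could call
   flush("tls") in this sketch to discard resumable session tickets.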

1224	8.2.  Performance caches

1226	   In addition to protocol state, Protocol Instances should provide data
1227	   into a performance-oriented cache to help guide future protocol and
1228	   path selection.  Some performance information can be gathered
1229	   generically across several protocols to allow predictive comparisons
1230	   between protocols on given paths:

1232	   o  Observed Round Trip Time

1234	   o  Connection Establishment latency

1236	   o  Connection Establishment success rate

1238	   These items can be cached on a per-address and per-subnet
1239	   granularity, and averaged between different values.  The information
1240	   should be cached on a per-network basis, since it is expected that
1241	   different network attachments will have different performance
1242	   characteristics.  Besides Protocol Instances, other system entities
1243	   may also provide data into performance-oriented caches.  This could
1244	   for instance be signal strength information reported by radio modems
1245	   like Wi-Fi and mobile broadband or information about the battery-
1246	   level of the device.  Furthermore, the system may cache the observed
1247	   maximum throughput on a path as an estimate of the available
1248	   bandwidth.

1250	   An implementation should use this information, when possible, to
1251	   determine preference between candidate paths, endpoints, and protocol
1252	   options.  Eligible options that historically had significantly better
1253	   performance than others should be selected first when gathering
1254	   candidates (see Section 4.1) to ensure better performance for the
1255	   application.

1257	   The reasonable lifetime for cached performance values will vary
1258	   depending on the nature of the value.  Certain information, like the
1259	   connection establishment success rate to a Remote Endpoint using a
1260	   given protocol stack, can be stored for a long period of time (hours
1261	   or longer), since it is expected that the capabilities of the Remote
1262	   Endpoint are not changing very quickly.  On the other hand, Round
1263	   Trip Time observed by TCP over a particular network path may vary
1264	   over a relatively short time interval.  For such values, the
1265	   implementation should remove them from the cache more quickly, or
1266	   treat older values with less confidence/weight.
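
   One way to treat older values with less weight is an age-based
   exponential decay.  The sketch below is an assumption, not a
   prescribed algorithm: the function name and the half_life parameter
   are illustrative, and a short half-life would suit Round Trip Time
   while a long one would suit establishment success rates.

```python
def weighted_estimate(samples, now, half_life):
    """Age-weighted mean of (timestamp, value) samples.

    A sample that is half_life seconds old counts half as much as a
    fresh one.  Returns None when there is nothing usable.
    """
    total = 0.0
    weight_sum = 0.0
    for timestamp, value in samples:
        weight = 0.5 ** ((now - timestamp) / half_life)
        total += weight * value
        weight_sum += weight
    return total / weight_sum if weight_sum else None
```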

1268	9.  Specific Transport Protocol Considerations

1270	9.1.  TCP

1272	   Connection lifetime for TCP translates fairly simply into the
1273	   abstraction presented to an application.  When the TCP three-way
1274	   handshake is complete, its layer of the Protocol Stack can be
1275	   considered Ready (established).  This event will cause racing of
1276	   Protocol Stack options to complete if TCP is the top-level protocol,
1277	   at which point the application can be notified that the Connection is
1278	   Ready to send and receive.

1280	   If the application sends a Close, that can translate to a graceful
1281	   termination of the TCP connection, which is performed by sending a
1282	   FIN to the remote endpoint.  If the application sends an Abort, then
1283	   the TCP state can be closed abruptly, leading to a RST being sent to
1284	   the peer.
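
   With the BSD sockets API, the two actions could be mapped as in the
   following sketch.  The function names are illustrative.  Setting
   SO_LINGER with a zero timeout before close() is a common way to
   provoke a RST; the "ii" layout of struct linger is an assumption
   that holds on Linux and the BSDs and may differ elsewhere.

```python
import socket
import struct


def graceful_close(sock):
    """Close: send a FIN after queued data; the peer reads end-of-stream."""
    sock.shutdown(socket.SHUT_WR)
    sock.close()


def abort(sock):
    """Abort: discard unsent data and reset the connection on close()."""
    # l_onoff=1, l_linger=0: close() drops the connection with a RST.
    linger = struct.pack("ii", 1, 0)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, linger)
    sock.close()
```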

1286	   Without a layer of framing (a top-level protocol in the established
1287	   Protocol Stack that preserves message boundaries, or an application-
1288	   supplied deframer) on top of TCP, the receiver side of the transport
1289	   system implementation can only treat the incoming stream of bytes as
1290	   a single Message, terminated by a FIN when the Remote Endpoint closes
1291	   the Connection.
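
   As an illustration of an application-supplied deframer, the sketch
   below recovers Message boundaries from a TCP byte stream.  The
   4-byte big-endian length prefix is an assumed framing chosen for
   this example; this document does not define it.

```python
import struct


def frame(payload):
    """Prefix a Message with its length (4 bytes, network byte order)."""
    return struct.pack("!I", len(payload)) + payload


def deframe(buffer):
    """Split complete Messages out of a received byte buffer.

    Returns (messages, remainder); the remainder holds any incomplete
    trailing Message and should be kept until more bytes arrive.
    """
    messages = []
    offset = 0
    while len(buffer) - offset >= 4:
        (length,) = struct.unpack_from("!I", buffer, offset)
        if len(buffer) - offset - 4 < length:
            break   # body not fully received yet
        start = offset + 4
        messages.append(bytes(buffer[start:start + length]))
        offset = start + length
    return messages, bytes(buffer[offset:])
```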

1293	9.2.  UDP

1295	   UDP as a direct transport does not provide any handshake or
1296	   connectivity state, so the notion of the transport protocol becoming
1297	   Ready or established is degenerate.  Once the system has validated
1298	   that there is a route on which to send and receive UDP datagrams, the
1299	   protocol is considered Ready.  Similarly, a Close or Abort has no
1300	   meaning to the on-the-wire protocol, but simply leads to the local
1301	   state being torn down.

1303	   When sending and receiving messages over UDP, each Message should
1304	   correspond to a single UDP datagram.  The Message can contain
1305	   metadata about the packet, such as the ECN bits applied to the
1306	   packet.
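
   The one-Message-per-datagram mapping can be sketched with plain UDP
   sockets.  The helper names are illustrative, and the sketch does not
   handle metadata such as ECN bits, which would need platform-specific
   ancillary data.

```python
import socket


def send_messages(sock, messages, destination):
    """Send each Message as exactly one UDP datagram."""
    for message in messages:
        sock.sendto(message, destination)


def receive_message(sock, max_size=65535):
    """Receive exactly one datagram and return it as one Message."""
    data, _sender = sock.recvfrom(max_size)
    return data
```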

1308	9.3.  SCTP

1310	   To support stream schedulers, which are implemented on the sender
1311	   side, a receiver-side Transport System should always support
1312	   message interleaving [RFC8260].

1314	   SCTP messages can be very large.  To allow the reception of large
1315	   messages in pieces, a "partial flag" can be used to inform a (native
1316	   SCTP) receiving application that a message is incomplete.  After
1317	   receiving the "partial flag", this application would know that the
1318	   next receive calls will only deliver remaining parts of the same
1319	   message (i.e., no messages or partial messages will arrive on other
1320	   streams until the message is complete) (see Section 8.1.20 in
1321	   [RFC6458]).  The "partial flag" can therefore facilitate the
1322	   implementation of the receiver buffer in the receiving application,
1323	   at the cost of limiting multiplexing and temporarily creating head-
1324	   of-line blocking delay at the receiver.

1326	   When a Transport System transfers a Message, it seems natural to map
1327	   the Message object to SCTP messages in order to support properties
1328	   such as "Ordered" or "Lifetime" (which maps onto partially reliable
1329	   delivery with a SCTP_PR_SCTP_TTL policy [RFC6458]).  However, since
1330	   multiplexing of Connections onto SCTP streams may happen, and would
1331	   be hidden from the application, the Transport System requires a per-
1332	   stream receiver buffer anyway, so this potential benefit is lost and
1333	   the "partial flag" becomes unnecessary for the system.

1335	   The problem of long messages either requiring large receiver-side
1336	   buffers or getting in the way of multiplexing is addressed by message
1337	   interleaving [RFC8260], which is yet another reason why a receiver-
1338	   side transport system supporting SCTP should implement this
1339	   mechanism.

1341	9.4.  TLS

1343	   The mapping of a TLS stream abstraction into the application is
1344	   equivalent to the contract provided by TCP (see Section 9.1).  The
1345	   Ready state should be determined by the completion of the TLS
1346	   handshake, which involves potentially several more round trips beyond
1347	   the TCP handshake.  The application should not be notified that the
1348	   Connection is Ready until TLS is established.

1350	9.5.  HTTP

1352	   HTTP requests and responses map naturally into Messages, since they
1353	   are delineated chunks of data with metadata that can be sent over a
1354	   transport.  To that end, HTTP can be seen as the most prevalent
1355	   framing protocol that runs on top of streams like TCP, TLS, etc.

1357	   In order to use a transport Connection that provides HTTP Message
1358	   support, the establishment and closing of the connection can be
1359	   treated as it would without the framing protocol.  Sending and
1360	   receiving of Messages, however, changes to treat each Message as a
1361	   well-delineated HTTP request or response, with the content of the
1362	   Message representing the body, and the Headers being provided in
1363	   Message metadata.
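
   A minimal sketch of this mapping for an HTTP/1.1 request follows.
   The Message class and the metadata keys "method", "target", and
   "headers" are assumptions made for illustration; the interface does
   not define them.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    body: bytes = b""
    metadata: dict = field(default_factory=dict)


def serialize_request(msg):
    """Render a Message as an HTTP/1.1 request: metadata becomes the
    request line and headers, the Message content becomes the body."""
    method = msg.metadata.get("method", "GET")
    target = msg.metadata.get("target", "/")
    headers = dict(msg.metadata.get("headers", {}))
    headers.setdefault("Content-Length", str(len(msg.body)))
    lines = [f"{method} {target} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    return "\r\n".join(lines).encode("ascii") + b"\r\n\r\n" + msg.body
```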

1365	9.6.  QUIC

1367	   QUIC provides a multi-streaming interface to an encrypted transport.
1368	   Each stream can be viewed as equivalent to a TLS stream over TCP, so
1369	   a natural mapping is to present each QUIC stream as an individual
1370	   Connection.  The protocol for the stream will be considered Ready
1371	   whenever the underlying QUIC connection is established to the point
1372	   that this stream's data can be sent.  For streams after the first
1373	   stream, this will likely be an immediate operation.

1375	   Closing a single QUIC stream, presented to the application as a
1376	   Connection, does not imply closing the underlying QUIC connection
1377	   itself.  Rather, the implementation may choose to close the QUIC
1378	   connection once all streams have been closed (possibly after some
1379	   timeout), or after an individual stream Connection sends an Abort.
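
   The stream and connection lifetimes described above can be modelled
   as follows.  This toy class is hypothetical: it closes the
   underlying connection immediately once the last stream is closed,
   omitting the optional timeout, and uses a simple counter rather than
   real QUIC stream identifiers.

```python
class QuicConnection:
    """Toy model: one underlying QUIC connection carrying many streams."""

    def __init__(self):
        self.streams = set()
        self.closed = False
        self._next_id = 0

    def open_stream(self):
        """Each stream is presented to the application as a Connection."""
        if self.closed:
            raise RuntimeError("underlying QUIC connection is closed")
        stream_id = self._next_id
        self._next_id += 1
        self.streams.add(stream_id)
        return stream_id

    def close_stream(self, stream_id):
        """Closing one stream leaves the QUIC connection up while other
        streams remain; the last Close also closes the connection."""
        self.streams.discard(stream_id)
        if not self.streams:
            self.closed = True

    def abort_stream(self, stream_id):
        """In this sketch an Abort on any stream closes everything."""
        self.streams.clear()
        self.closed = True
```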

1381	   Messages over a direct QUIC stream should be represented similarly to
1382	   the TCP stream (one Message per direction, see Section 9.1), unless a
1383	   framing mapping is used on top of QUIC.

1385	9.7.  HTTP/2 transport

1387	   Similar to QUIC (Section 9.6), HTTP/2 provides a multi-streaming
1388	   interface.  This will generally use HTTP as the unit of Messages over
1389	   the streams, in which each stream can be represented as a transport
1390	   Connection.  The lifetime of streams and the HTTP/2 connection should
1391	   be managed as described for QUIC.

1393	   It is possible to treat each HTTP/2 stream as a raw byte-stream
1394	   instead of a carrier for HTTP messages, in which case the Messages
1395	   over the streams can be represented similarly to the TCP stream (one
1396	   Message per direction, see Section 9.1).

1398	10.  Rendezvous and Environment Discovery

1400	   The connection establishment process outlined in Section 4 is
1401	   appropriate for client-server connections, but needs to be expanded
1402	   in peer-to-peer Rendezvous scenarios, as follows:

1404	   o  Gathering Local Endpoint candidates

1406	      The set of possible Local Endpoints is gathered.  In the simple
1407	      case, this merely enumerates the local interfaces and protocols,
1408	      and allocates ephemeral source ports.  For example, a system that
1409	      has WiFi and Ethernet and supports IPv4 and IPv6 might gather four
1410	      candidate locals (IPv4 on Ethernet, IPv6 on Ethernet, IPv4 on
1411	      WiFi, and IPv6 on WiFi) that can form the source for a transient.

1413	      If NAT traversal is required, the process of gathering Local
1414	      Endpoints becomes broadly equivalent to the ICE candidate
1415	      gathering phase [RFC5245].  The endpoint determines its server
1416	      reflexive Local Endpoints (i.e., the translated address of a
1417	      local, on the other side of a NAT) and relayed locals (e.g., via a
1418	      TURN server or other relay), for each interface and network
1419	      protocol.  These are added to the set of candidate Local Endpoints
1420	      for this connection.

1422	      Gathering Local Endpoints is primarily a local operation, although
1423	      it might involve exchanges with a STUN server to derive server
1424	      reflexive locals, or with a TURN server or other relay to derive
1425	      relayed locals.  It does not involve communication with the Remote
1426	      Endpoint.

1428	   o  Gathering Remote Endpoint Candidates

1430	      The Remote Endpoint is typically a name that needs to be resolved
1431	      into a set of possible addresses that can be used for
1432	      communication.  Resolving the Remote Endpoint is the process of
1433	      recursively performing such name lookups, until fully resolved, to
1434	      return the set of candidates for the remote of this connection.

1436	      How this is done will depend on the type of the Remote Endpoint,
1437	      and can also be specific to each Local Endpoint.  A common case is
1438	      when the Remote Endpoint is a DNS name, in which case it is
1439	      resolved to give a set of IPv4 and IPv6 addresses representing
1440	      that name.  Some types of remote might require more complex
1441	      resolution.  Resolving the Remote Endpoint for a peer-to-peer
1442	      connection might involve communication with a rendezvous server,
1443	      which in turn contacts the peer to gain consent to communicate and
1444	      retrieve its set of candidate locals, which are returned and form
1445	      the candidate remote addresses for contacting that peer.

1447	      Resolving the remote is _not_ a local operation.  It will involve
1448	      a directory service, and can require communication with the remote
1449	      to rendezvous and exchange peer addresses.  This can expose some
1450	      or all of the candidate locals to the remote.

1452	   o  Establishing Connections

1454	      The set of candidate Local Endpoints and the set of candidate
1455	      Remote Endpoints are paired, to derive a priority ordered set of
1456	      Candidate Paths that can potentially be used to establish a
1457	      Connection.

1459	      Then, communication is attempted over each candidate path, in
1460	      priority order.  If there are multiple candidates with the same
1461	      priority, then connection establishment proceeds simultaneously
1462	      and uses the transient that wins the race to be established.
1463	      Otherwise, connection establishment is sequential, paced at a rate
1464	      that should not congest the network.  Depending on the chosen
1465	      transport, this phase might involve racing TCP connections to a
1466	      server over IPv4 and IPv6 [RFC8305], or it could involve a STUN
1467	      exchange to establish peer-to-peer UDP connectivity [RFC5245], or
1468	      some other means.

1470	   o  Confirming and Maintaining Connections

1472	      Once connectivity has been established, unused resources can be
1473	      released and the chosen path can be confirmed.  This is primarily
1474	      required when establishing peer-to-peer connectivity, where
1475	      connections supporting relayed locals that were not required can
1476	      be closed, and where an associated signalling operation might be
1477	      needed to inform middleboxes and proxies of the chosen path.
1478	      Keep-alive messages may also be sent, as appropriate, to ensure
1479	      NAT and firewall state is maintained, so the Connection remains
1480	      operational.
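
   The pairing and ordering step in the list above can be sketched as
   follows.  Endpoints are represented as plain dictionaries with a
   "family" key, and the priority function is supplied by the caller;
   both are assumptions for illustration.  Pairs that share a priority
   end up in one group and would be attempted simultaneously, while the
   groups themselves would be tried in order.

```python
from itertools import groupby, product


def candidate_paths(local_eps, remote_eps, priority):
    """Pair Local and Remote Endpoints and group the pairs by priority.

    Only endpoints of the same address family are paired.  Lower
    priority values are attempted first.
    """
    pairs = [
        (local, remote)
        for local, remote in product(local_eps, remote_eps)
        if local["family"] == remote["family"]
    ]
    pairs.sort(key=priority)   # stable sort keeps gathering order
    return [list(group) for _, group in groupby(pairs, key=priority)]
```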

1482	   To support ICE, or similar protocols that use an out-of-band,
1483	   indirect signalling exchange to share candidates with the Remote
1484	   Endpoint, it is important to be able to query the set of candidate
1485	   Local Endpoints, and give the protocol stack a set of candidate
1486	   Remote Endpoints, before it attempts to establish connections.

1488	   (TO-DO: It is expected that a single abstract algorithm can be
1489	   identified that supports both the peer-to-peer and client-server
1490	   connection racing, allowing this text to be merged with Section 4)

1492	11.  IANA Considerations

1494	   RFC-EDITOR: Please remove this section before publication.

1496	   This document has no actions for IANA.

1498	12.  Security Considerations

1500	12.1.  Considerations for Candidate Gathering

1502	   Implementations should avoid downgrade attacks that allow network
1503	   interference to cause the implementation to select less secure, or
1504	   entirely insecure, combinations of paths and protocols.

1506	12.2.  Considerations for Candidate Racing

1508	   See Section 5.2 for security considerations around racing with 0-RTT
1509	   data.

1511	   An attacker that knows a particular device is racing several options
1512	   during connection establishment may be able to block packets for the
1513	   first connection attempt, thus inducing the device to fall back to a
1514	   secondary attempt.  This is a problem if the secondary attempts have
1515	   worse security properties that enable further attacks.
1516	   Implementations should ensure that all options have equivalent
1517	   security properties to avoid incentivizing attacks.

1519	   Since results from the network can determine how a connection attempt
1520	   tree is built, such as when DNS returns a list of resolved endpoints,
1521	   it is possible for the network to cause an implementation to consume
1522	   significant on-device resources.  Implementations should limit the
1523	   maximum amount of state allowed for any given node, including the
1524	   number of child nodes, especially when the state is based on results
1525	   from the network.
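
   A simple way to bound such state is to cap the number of child
   nodes accepted from any one network result, as in the sketch below.
   The node layout and the limit of 8 are hypothetical values chosen
   for illustration.

```python
MAX_CHILDREN = 8   # hypothetical per-node limit


def add_children(node, results, limit=MAX_CHILDREN):
    """Attach network-provided results as child nodes, up to a limit.

    Returns the number of results that were dropped.
    """
    room = max(0, limit - len(node["children"]))
    accepted = results[:room]
    node["children"].extend(
        {"endpoint": result, "children": []} for result in accepted
    )
    return len(results) - len(accepted)
```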

1527	13.  Acknowledgements

1529	   This work has received funding from the European Union's Horizon 2020
1530	   research and innovation programme under grant agreement No. 644334
1531	   (NEAT).

1533	   This work has been supported by Leibniz Prize project funds of DFG -
1534	   German Research Foundation: Gottfried Wilhelm Leibniz-Preis 2011 (FKZ
1535	   FE 570/4-1).

1537	   This work has been supported by the UK Engineering and Physical
1538	   Sciences Research Council under grant EP/R04144X/1.

1540	   Thanks to Stuart Cheshire, Josh Graessley, David Schinazi, and Eric
1541	   Kinnear for their implementation and design efforts, including Happy
1542	   Eyeballs, that heavily influenced this work.

1544	14.  References

1546	14.1.  Normative References

1548	   [I-D.ietf-taps-arch]
1549	              Pauly, T., Trammell, B., Brunstrom, A., Fairhurst, G.,
1550	              Perkins, C., Tiesel, P., and C. Wood, "An Architecture for
1551	              Transport Services", draft-ietf-taps-arch-02 (work in
1552	              progress), October 2018.

1554	   [I-D.ietf-taps-interface]
1555	              Trammell, B., Welzl, M., Enghardt, T., Fairhurst, G.,
1556	              Kuehlewind, M., Perkins, C., Tiesel, P., and C. Wood, "An
1557	              Abstract Application Layer Interface to Transport
1558	              Services", draft-ietf-taps-interface-02 (work in
1559	              progress), October 2018.

1561	   [I-D.ietf-taps-minset]
1562	              Welzl, M. and S. Gjessing, "A Minimal Set of Transport
1563	              Services for End Systems", draft-ietf-taps-minset-11 (work
1564	              in progress), September 2018.

1566	   [RFC6458]  Stewart, R., Tuexen, M., Poon, K., Lei, P., and V.
1567	              Yasevich, "Sockets API Extensions for the Stream Control
1568	              Transmission Protocol (SCTP)", RFC 6458,
1569	              DOI 10.17487/RFC6458, December 2011,
1570	              <https://www.rfc-editor.org/info/rfc6458>.

1572	   [RFC7413]  Cheng, Y., Chu, J., Radhakrishnan, S., and A. Jain, "TCP
1573	              Fast Open", RFC 7413, DOI 10.17487/RFC7413, December 2014,
1574	              <https://www.rfc-editor.org/info/rfc7413>.

1576	   [RFC7540]  Belshe, M., Peon, R., and M. Thomson, Ed., "Hypertext
1577	              Transfer Protocol Version 2 (HTTP/2)", RFC 7540,
1578	              DOI 10.17487/RFC7540, May 2015,
1579	              <https://www.rfc-editor.org/info/rfc7540>.

1581	   [RFC8260]  Stewart, R., Tuexen, M., Loreto, S., and R. Seggelmann,
1582	              "Stream Schedulers and User Message Interleaving for the
1583	              Stream Control Transmission Protocol", RFC 8260,
1584	              DOI 10.17487/RFC8260, November 2017,
1585	              <https://www.rfc-editor.org/info/rfc8260>.

1587	   [RFC8303]  Welzl, M., Tuexen, M., and N. Khademi, "On the Usage of
1588	              Transport Features Provided by IETF Transport Protocols",
1589	              RFC 8303, DOI 10.17487/RFC8303, February 2018,
1590	              <https://www.rfc-editor.org/info/rfc8303>.

1592	   [RFC8304]  Fairhurst, G. and T. Jones, "Transport Features of the
1593	              User Datagram Protocol (UDP) and Lightweight UDP (UDP-
1594	              Lite)", RFC 8304, DOI 10.17487/RFC8304, February 2018,
1595	              <https://www.rfc-editor.org/info/rfc8304>.

1597	   [RFC8305]  Schinazi, D. and T. Pauly, "Happy Eyeballs Version 2:
1598	              Better Connectivity Using Concurrency", RFC 8305,
1599	              DOI 10.17487/RFC8305, December 2017,
1600	              <https://www.rfc-editor.org/info/rfc8305>.

1602	   [RFC8446]  Rescorla, E., "The Transport Layer Security (TLS) Protocol
1603	              Version 1.3", RFC 8446, DOI 10.17487/RFC8446, August 2018,
1604	              <https://www.rfc-editor.org/info/rfc8446>.

1606	14.2.  Informative References

1608	   [I-D.ietf-quic-transport]
1609	              Iyengar, J. and M. Thomson, "QUIC: A UDP-Based Multiplexed
1610	              and Secure Transport", draft-ietf-quic-transport-18 (work
1611	              in progress), January 2019.

1613	   [NEAT-flow-mapping]
1614	              "Transparent Flow Mapping for NEAT (in Workshop on Future
1615	              of Internet Transport (FIT 2017))", n.d.

1617	   [RFC5245]  Rosenberg, J., "Interactive Connectivity Establishment
1618	              (ICE): A Protocol for Network Address Translator (NAT)
1619	              Traversal for Offer/Answer Protocols", RFC 5245,
1620	              DOI 10.17487/RFC5245, April 2010,
1621	              <https://www.rfc-editor.org/info/rfc5245>.

1623	   [Trickle]  "Trickle - Rate Limiting YouTube Video Streaming (ATC
1624	              2012)", n.d.

1626	Appendix A.  Additional Properties

1628	   This appendix discusses implementation considerations for additional
1629	   parameters and properties that could be used to enhance transport
1630	   protocol and/or path selection, or the transmission of messages given
1631	   a Protocol Stack that implements them.  These are not part of the
1632	   interface, and may be removed from the final document, but are
1633	   presented here to support discussion within the TAPS working group as
1634	   to whether they should be added to a future revision of the base
1635	   specification.

1637	A.1.  Properties Affecting Sorting of Branches

1639	   In addition to the Protocol and Path Selection Properties discussed
1640	   in Section 4.3, the following properties under discussion can
1641	   influence branch sorting:

1643	   o  Bounds on Send or Receive Rate: If the application indicates a
1644	      bound on the expected Send or Receive bitrate, an implementation
1645	      may prefer a path that can likely provide the desired bandwidth,
1646	      based on cached maximum throughput, see Section 8.2.  The
1647	      application may know the Send or Receive Bitrate from metadata in
1648	      adaptive HTTP streaming, such as MPEG-DASH.

1650	   o  Cost Preferences: If the application indicates a preference to
1651	      avoid expensive paths, and some paths are associated with a
1652	      monetary cost, an implementation should decrease the ranking of
1653	      such paths.  If the application indicates that it prohibits using
1654	      expensive paths, paths that are associated with a cost should be
1655	      purged from the decision tree.
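
   The two Cost Preference behaviours can be sketched as follows.  The
   preference values "avoid" and "prohibit" and the "expensive" flag on
   each path are hypothetical names used only for this example.

```python
def apply_cost_preference(paths, preference):
    """Re-rank or prune candidate paths according to a cost preference.

    paths is a list ordered by current ranking, best first.
    """
    if preference == "prohibit":
        # Purge every path that carries a monetary cost.
        return [p for p in paths if not p["expensive"]]
    if preference == "avoid":
        # Stable sort: free paths move ahead, relative order is kept.
        return sorted(paths, key=lambda p: p["expensive"])
    return list(paths)
```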

1657	Authors' Addresses

1659	   Anna Brunstrom (editor)
1660	   Karlstad University
1661	   Universitetsgatan 2
1662	   651 88 Karlstad
1663	   Sweden

1665	   Email: anna.brunstrom@kau.se

1666	   Tommy Pauly (editor)
1667	   Apple Inc.
1668	   One Apple Park Way
1669	   Cupertino, California 95014
1670	   United States of America

1672	   Email: tpauly@apple.com

1674	   Theresa Enghardt
1675	   TU Berlin
1676	   Marchstrasse 23
1677	   10587 Berlin
1678	   Germany

1680	   Email: theresa@inet.tu-berlin.de

1682	   Karl-Johan Grinnemo
1683	   Karlstad University
1684	   Universitetsgatan 2
1685	   651 88 Karlstad
1686	   Sweden

1688	   Email: karl-johan.grinnemo@kau.se

1690	   Tom Jones
1691	   University of Aberdeen
1692	   Fraser Noble Building
1693	   Aberdeen, AB24 3UE
1694	   UK

1696	   Email: tom@erg.abdn.ac.uk

1698	   Philipp S. Tiesel
1699	   TU Berlin
1700	   Marchstrasse 23
1701	   10587 Berlin
1702	   Germany

1704	   Email: philipp@inet.tu-berlin.de

1705	   Colin Perkins
1706	   University of Glasgow
1707	   School of Computing Science
1708	   Glasgow G12 8QQ
1709	   United Kingdom

1711	   Email: csp@csperkins.org

1713	   Michael Welzl
1714	   University of Oslo
1715	   PO Box 1080 Blindern
1716	   0316  Oslo
1717	   Norway

1719	   Email: michawe@ifi.uio.no