idnits 2.17.1 

draft-ietf-taps-impl-06.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** There are 8 instances of too long lines in the document, the longest one
     being 44 characters in excess of 72.

  ** The abstract seems to contain references ([I-D.ietf-taps-arch]), which
     it shouldn't.  Please replace those with straight textual mentions of the
     documents in question.

  ** The document seems to lack both a reference to RFC 2119 and the
     recommended RFC 2119 boilerplate, even if it appears to use RFC 2119
     keywords. 

     RFC 2119 keyword, line 1233: '... Implementations SHOULD ensure that th...'


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (9 March 2020) is 1508 days in the past.  Is this
     intentional?


  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  == Missing Reference: 'SUBCATEGORY' is mentioned on line 1532, but not
     defined

  == Outdated reference: A later version (-19) exists of
     draft-ietf-taps-arch-06

  == Outdated reference: A later version (-26) exists of
     draft-ietf-taps-interface-05

  ** Obsolete normative reference: RFC 7540 (Obsoleted by RFC 9113)

  == Outdated reference: A later version (-34) exists of
     draft-ietf-quic-transport-27

  -- Obsolete informational reference (is this intentional?): RFC 5245
     (Obsoleted by RFC 8445, RFC 8839)


     Summary: 4 errors (**), 0 flaws (~~), 5 warnings (==), 2 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	TAPS Working Group                                     A. Brunstrom, Ed.
3	Internet-Draft                                       Karlstad University
4	Intended status: Informational                             T. Pauly, Ed.
5	Expires: 10 September 2020                                    Apple Inc.
6	                                                             T. Enghardt
7	                                                               TU Berlin
8	                                                           K-J. Grinnemo
9	                                                     Karlstad University
10	                                                                T. Jones
11	                                                  University of Aberdeen
12	                                                               P. Tiesel
13	                                                               TU Berlin
14	                                                              C. Perkins
15	                                                   University of Glasgow
16	                                                                M. Welzl
17	                                                      University of Oslo
18	                                                            9 March 2020

20	             Implementing Interfaces to Transport Services
21	                        draft-ietf-taps-impl-06

23	Abstract

25	   The Transport Services architecture [I-D.ietf-taps-arch] defines a
26	   system that allows applications to use transport networking protocols
27	   flexibly.  This document serves as an implementation guide for
28	   building such a system.

30	Status of This Memo

32	   This Internet-Draft is submitted in full conformance with the
33	   provisions of BCP 78 and BCP 79.

35	   Internet-Drafts are working documents of the Internet Engineering
36	   Task Force (IETF).  Note that other groups may also distribute
37	   working documents as Internet-Drafts.  The list of current Internet-
38	   Drafts is at https://datatracker.ietf.org/drafts/current/.

40	   Internet-Drafts are draft documents valid for a maximum of six months
41	   and may be updated, replaced, or obsoleted by other documents at any
42	   time.  It is inappropriate to use Internet-Drafts as reference
43	   material or to cite them other than as "work in progress."

45	   This Internet-Draft will expire on 10 September 2020.

47	Copyright Notice

49	   Copyright (c) 2020 IETF Trust and the persons identified as the
50	   document authors.  All rights reserved.

52	   This document is subject to BCP 78 and the IETF Trust's Legal
53	   Provisions Relating to IETF Documents (https://trustee.ietf.org/
54	   license-info) in effect on the date of publication of this document.
55	   Please review these documents carefully, as they describe your rights
56	   and restrictions with respect to this document.  Code Components
57	   extracted from this document must include Simplified BSD License text
58	   as described in Section 4.e of the Trust Legal Provisions and are
59	   provided without warranty as described in the Simplified BSD License.

61	Table of Contents

63	   1.  Introduction
64	   2.  Implementing Connection Objects
65	   3.  Implementing Pre-Establishment
66	     3.1.  Configuration-time errors
67	     3.2.  Role of system policy
68	   4.  Implementing Connection Establishment
69	     4.1.  Candidate Gathering
70	       4.1.1.  Gathering Endpoint Candidates
71	       4.1.2.  Structuring Options as a Tree
72	       4.1.3.  Branch Types
73	     4.2.  Branching Order-of-Operations
74	     4.3.  Sorting Branches
75	     4.4.  Candidate Racing
76	       4.4.1.  Delayed
77	       4.4.2.  Failover
78	     4.5.  Completing Establishment
79	       4.5.1.  Determining Successful Establishment
80	     4.6.  Establishing multiplexed connections
81	     4.7.  Handling racing with "unconnected" protocols
82	     4.8.  Implementing listeners
83	       4.8.1.  Implementing listeners for Connected Protocols
84	       4.8.2.  Implementing listeners for Unconnected Protocols
85	       4.8.3.  Implementing listeners for Multiplexed Protocols
86	   5.  Implementing Sending and Receiving Data
87	     5.1.  Sending Messages
88	       5.1.1.  Message Properties
89	       5.1.2.  Send Completion
90	       5.1.3.  Batching Sends
91	     5.2.  Receiving Messages
92	     5.3.  Handling of data for fast-open protocols
93	   6.  Implementing Message Framers
94	     6.1.  Defining Message Framers
95	     6.2.  Sender-side Message Framing
96	     6.3.  Receiver-side Message Framing
97	   7.  Implementing Connection Management
98	     7.1.  Pooled Connection
99	     7.2.  Handling Path Changes
100	   8.  Implementing Connection Termination
101	   9.  Cached State
102	     9.1.  Protocol state caches
103	     9.2.  Performance caches
104	   10. Specific Transport Protocol Considerations
105	     10.1.  TCP
106	     10.2.  UDP
107	     10.3.  UDP Multicast Receive
108	     10.4.  TLS
109	     10.5.  DTLS
110	     10.6.  HTTP
111	     10.7.  QUIC
112	     10.8.  HTTP/2 transport
113	     10.9.  SCTP
114	   11. IANA Considerations
115	   12. Security Considerations
116	     12.1.  Considerations for Candidate Gathering
117	     12.2.  Considerations for Candidate Racing
118	   13. Acknowledgements
119	   14. References
120	     14.1.  Normative References
121	     14.2.  Informative References
122	   Appendix A.  Additional Properties
123	     A.1.  Properties Affecting Sorting of Branches
124	   Appendix B.  Reasons for errors
125	   Appendix C.  Existing Implementations
126	   Authors' Addresses

128	1.  Introduction

130	   The Transport Services architecture [I-D.ietf-taps-arch] defines a
131	   system that allows applications to use transport networking protocols
132	   flexibly.  The interface such a system exposes to applications is
133	   defined as the Transport Services API [I-D.ietf-taps-interface].
134	   This API is designed to be generic across multiple transport
135	   protocols and sets of protocol features.

137	   This document serves as an implementation guide for building a
138	   system that provides a Transport Services API.  It is the job of an
139	   implementation of a Transport Services system to turn the requests of
140	   an application into decisions on how to establish connections, and
141	   how to transfer data over those connections once established.  The
142	   terminology used in this document is based on the Architecture
143	   [I-D.ietf-taps-arch].

145	2.  Implementing Connection Objects

147	   The connection objects that are exposed to applications for Transport
148	   Services are:

150	   *  the Preconnection, the bundle of properties that describes the
151	      application constraints on the transport;

153	   *  the Connection, the basic object that represents a flow of data in
154	      either direction between the Local and Remote Endpoints;

156	   *  and the Listener, a passive waiting object that delivers new
157	      Connections.

159	   Preconnection objects should be implemented as bundles of properties
160	   that an application can both read and write.  Once a Preconnection
161	   has been used to create an outbound Connection or a Listener, the
162	   implementation should ensure that the copy of the properties held by
163	   the Connection or Listener is immutable.  This may involve performing
164	   a deep-copy if the application is still able to modify properties on
165	   the original Preconnection object.
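
   The snapshot behaviour described above can be sketched as follows.
   This is a minimal illustration, not a definitive implementation: the
   class shapes, the initiate() method name, and the property keys
   "reliability" and "interfaces" are assumptions made for the example
   and are not taken from the Transport Services API.

```python
import copy
from types import MappingProxyType


class Connection:
    def __init__(self, remote, properties):
        self.remote = remote
        # Read-only view: assigning to a key raises TypeError.
        self.properties = MappingProxyType(properties)


class Preconnection:
    def __init__(self, remote, properties=None):
        self.remote = remote
        self.properties = dict(properties or {})  # application may edit

    def initiate(self):
        # Deep-copy so later edits to this Preconnection (including
        # edits to nested values) do not reach the Connection.
        return Connection(self.remote, copy.deepcopy(self.properties))


pre = Preconnection("www.example.com:443",
                    {"reliability": "require", "interfaces": ["Wi-Fi"]})
conn = pre.initiate()
pre.properties["reliability"] = "prohibit"   # does not affect conn
pre.properties["interfaces"].append("LTE")   # neither does this
```

   The read-only view only guards the top level of the mapping; the
   deep copy is what isolates nested values from the original
   Preconnection.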

167	   Connection objects represent the interface between the application
168	   and the implementation to manage transport state, and conduct data
169	   transfer.  During the process of establishment (Section 4), the
170	   Connection is not yet bound to a specific transport flow, since there
171	   may be multiple candidate Protocol Stacks being raced.  Once the
172	   Connection is established, the object should be considered mapped to
173	   a specific Protocol Stack.  The notion of a Connection maps to many
174	   different protocols, depending on the Protocol Stack.  For example,
175	   the Connection may ultimately represent the interface into a TCP
176	   connection, a TLS session over TCP, a UDP flow with fully-specified
177	   local and remote endpoints, a DTLS session, an SCTP stream, a QUIC
178	   stream, or an HTTP/2 stream.

180	   Listener objects are created with a Preconnection, at which point
181	   their configuration should be considered immutable by the
182	   implementation.  The process of listening is described in
183	   Section 4.8.

185	3.  Implementing Pre-Establishment

187	   During pre-establishment the application specifies the Endpoints to
188	   be used for communication as well as its preferences via Selection
189	   Properties and, if desired, also Connection Properties.  Generally,
190	   Connection Properties should be configured as early as possible, as
191	   they may serve as input to decisions that are made by the
192	   implementation (the Capacity Profile may guide usage of a protocol
193	   offering scavenger-type congestion control, for example).  In the
194	   remainder of this document, we only refer to Selection Properties
195	   because they are the more typical case and have to be handled by all
196	   implementations.

198	   The implementation stores these objects and properties as part of the
199	   Preconnection object for use during connection establishment.  For
200	   Selection Properties that are not provided by the application, the
201	   implementation must use the default values specified in the Transport
202	   Services API ([I-D.ietf-taps-interface]).

204	3.1.  Configuration-time errors

206	   The transport system should have a list of supported protocols
207	   available, each of which has transport features reflecting the
208	   capabilities of the protocol.  Once an application specifies its
209	   Transport Parameters, the transport system should match the required
210	   and prohibited properties against the transport features of the
211	   available protocols.

213	   In the following cases, failure should be detected during pre-
214	   establishment:

216	   *  The application requested Protocol Properties that include
217	      requirements or prohibitions that cannot be satisfied by any of
218	      the available protocols.  For example, if an application requires
219	      "Configure Reliability per Message", but no such protocol is
220	      available on the host running the transport system, e.g., because
221	      SCTP is not supported by the operating system, this should result
222	      in an error.

224	   *  The application requested Protocol Properties that are in conflict
225	      with each other, i.e., the required and prohibited properties
226	      cannot be satisfied by the same protocol.  For example, if an
227	      application prohibits "Reliable Data Transfer" but then requires
228	      "Configure Reliability per Message", this mismatch should result
229	      in an error.

231	   It is important to fail as early as possible in such cases in order
232	   to avoid allocating resources, e.g., to endpoint resolution, only to
233	   find out later that there is no protocol that satisfies the
234	   requirements.
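
   The two failure cases above can be sketched as a single early check.
   This is a hedged illustration: the feature table, the short feature
   names, and the ConfigurationError type are hypothetical, and a real
   system would derive the table from the protocols actually installed.

```python
class ConfigurationError(Exception):
    """Raised during pre-establishment, before any resolution work."""


# Hypothetical feature table for the protocols available on this host.
PROTOCOL_FEATURES = {
    "TCP": {"reliability", "preserve-order"},
    "UDP": {"preserve-msg-boundaries"},
    "SCTP": {"reliability", "preserve-msg-boundaries",
             "per-msg-reliability"},
}


def check_properties(required, prohibited, protocols=PROTOCOL_FEATURES):
    """Return the protocols that satisfy the constraints, or raise."""
    both = required & prohibited
    if both:
        raise ConfigurationError(f"required and prohibited: {sorted(both)}")
    usable = [name for name, features in protocols.items()
              if required <= features and not (prohibited & features)]
    if not usable:
        raise ConfigurationError("no available protocol satisfies "
                                 "the requested properties")
    return usable
```

   With this table, requiring "per-msg-reliability" while prohibiting
   "reliability" fails, because the only protocol offering the former
   also offers the latter.  Removing SCTP from the table makes a
   request for "per-msg-reliability" fail on its own.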

236	3.2.  Role of system policy

238	   The properties specified during pre-establishment have a close
239	   connection to system policy.  The implementation is responsible for
240	   combining and reconciling several different sources of preferences
241	   when establishing Connections.  These include, but are not limited
242	   to:

244	   1.  Application preferences, i.e., preferences specified during
245	       pre-establishment via Selection Properties.

247	   2.  Dynamic system policy, i.e., policy compiled from internally and
248	       externally acquired information about available network
249	       interfaces, supported transport protocols, and current/previous
250	       Connections.  Examples of ways to externally retrieve policy-
251	       support information are through OS-specific statistics/
252	       measurement tools and tools that reside on middleboxes and
253	       routers.

255	   3.  Default implementation policy, i.e., predefined policy by OS or
256	       application.

258	   In general, any protocol or path used for a connection must conform
259	   to all three sources of constraints.  A violation of any one of these
260	   sources should cause a protocol or path to be considered ineligible
261	   for use.  As an example of application preferences leading to
262	   constraints, an application may prohibit the use of metered network
263	   interfaces for a given Connection to avoid user cost.  Similarly, the
264	   system policy at a given time may prohibit the use of such a metered
265	   network interface from the application's process.  Lastly, the
266	   implementation itself may default to disallowing certain network
267	   interfaces unless explicitly requested by the application and allowed
268	   by the system.
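
   One possible way to combine the three sources for the interface
   example above is sketched below.  The function and parameter names
   are hypothetical, and how an implementation represents policy is
   expected to be platform-specific.

```python
def eligible_interfaces(interfaces, app_prohibited, system_prohibited,
                        default_disallowed, app_requested=frozenset()):
    """Keep only interfaces allowed by all three sources of policy."""
    eligible = []
    for name in interfaces:
        if name in app_prohibited:
            continue  # application preference
        if name in system_prohibited:
            continue  # dynamic system policy
        if name in default_disallowed and name not in app_requested:
            continue  # implementation default, unless explicitly requested
        eligible.append(name)
    return eligible


names = ["Wi-Fi", "LTE", "VPN"]
plain = eligible_interfaces(names, {"LTE"}, set(), {"VPN"})
with_vpn = eligible_interfaces(names, {"LTE"}, set(), {"VPN"},
                               app_requested={"VPN"})
```

   An explicit request from the application overrides only the
   implementation default; an interface prohibited by system policy
   stays ineligible either way.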

270	   It is expected that the database of system policies and the method of
271	   looking up these policies will vary across platforms.  An
272	   implementation should attempt to look up the relevant policies for
273	   the system in a dynamic way to make sure it is reflecting an accurate
274	   version of the system policy, since the system's policy regarding the
275	   application's traffic may change over time due to user or
276	   administrative changes.

278	4.  Implementing Connection Establishment

280	   The process of establishing a network connection begins when an
281	   application expresses intent to communicate with a remote endpoint by
282	   calling Initiate.  (At this point, any constraints or requirements
283	   the application may have on the connection are available from pre-
284	   establishment.)  The process can be considered complete once there is
285	   at least one Protocol Stack that has completed any required setup to
286	   the point that it can transmit and receive the application's data.

288	   Connection establishment is divided into two top-level steps:
289	   Candidate Gathering, to identify the paths, protocols, and endpoints
290	   to use, and Candidate Racing, in which the necessary protocol
291	   handshakes are conducted so that the transport system can select
292	   which set to use.  This document structures candidates for racing as
293	   a tree.

295	   The simplest example of this process might involve identifying the
296	   single IP address to which the implementation wishes to connect,
297	   using the system's current default interface or path, and starting a
298	   TCP handshake to establish a stream to the specified IP address.
299	   However, each step may also vary depending on the requirements of the
300	   connection: if the endpoint is defined as a hostname and port, then
301	   there may be multiple resolved addresses that are available; there
302	   may also be multiple interfaces or paths available, other than the
303	   default system interface; and some protocols may not need any
304	   transport handshake to be considered "established" (such as UDP),
305	   while other connections may utilize layered protocol handshakes, such
306	   as TLS over TCP.

308	   Whenever an implementation has multiple options for connection
309	   establishment, it can view the set of all individual connection
310	   establishment options as a single, aggregate connection
311	   establishment.  The aggregate set conceptually includes every valid
312	   combination of endpoints, paths, and protocols.  As an example,
313	   consider an implementation that initiates a TCP connection to a
314	   hostname + port endpoint, and has two valid interfaces available (Wi-
315	   Fi and LTE).  The hostname resolves to a single IPv4 address on the
316	   Wi-Fi network, and resolves to the same IPv4 address on the LTE
317	   network, as well as a single IPv6 address.  The aggregate set of
318	   connection establishment options can be viewed as follows:

320	   Aggregate [Endpoint: www.example.com:80] [Interface: Any]   [Protocol: TCP]
321	   |-> [Endpoint: 192.0.2.1:80]       [Interface: Wi-Fi] [Protocol: TCP]
322	   |-> [Endpoint: 192.0.2.1:80]       [Interface: LTE]   [Protocol: TCP]
323	   |-> [Endpoint: 2001:DB8::1.80]     [Interface: LTE]   [Protocol: TCP]

324	   Any one of these sub-entries on the aggregate connection attempt
325	   would satisfy the original application intent.  The concern of this
326	   section is the algorithm defining which of these options to try,
327	   when, and in what order.

329	   During Candidate Gathering, an implementation first excludes all
330	   protocols and paths that match a Prohibit or do not match all Require
331	   properties.  Then, the implementation will sort branches according to
332	   Preferred properties, Avoided properties, and possibly other
333	   criteria.
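
   The exclude-then-sort step can be sketched as follows.  The
   Candidate type, the feature labels, and the scoring rule (number of
   Preferred features matched minus number of Avoided features matched)
   are assumptions chosen for illustration; this document does not
   prescribe a particular scoring function.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    endpoint: str
    interface: str
    protocol: str
    features: frozenset  # hypothetical labels describing this option


def gather(candidates, require=(), prohibit=(), prefer=(), avoid=()):
    require, prohibit = set(require), set(prohibit)
    prefer, avoid = set(prefer), set(avoid)
    # Step 1: drop anything that matches a Prohibit or misses a Require.
    eligible = [c for c in candidates
                if require <= c.features and not (prohibit & c.features)]

    # Step 2: rank the survivors; sorted() is stable, so ties keep
    # their original order.
    def score(c):
        return len(prefer & c.features) - len(avoid & c.features)

    return sorted(eligible, key=score, reverse=True)


wifi_v4 = Candidate("192.0.2.1:80", "Wi-Fi", "TCP",
                    frozenset({"reliable"}))
lte_v4 = Candidate("192.0.2.1:80", "LTE", "TCP",
                   frozenset({"reliable", "metered"}))
lte_v6 = Candidate("2001:DB8::1.80", "LTE", "TCP",
                   frozenset({"reliable", "metered", "ipv6"}))
options = [wifi_v4, lte_v4, lte_v6]

strict = gather(options, require={"reliable"}, prohibit={"metered"})
ranked = gather(options, require={"reliable"},
                prefer={"ipv6"}, avoid={"metered"})
```

   Here a Prohibit on "metered" leaves only the Wi-Fi option, while
   merely avoiding it keeps all three options and only changes their
   order.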

335	4.1.  Candidate Gathering

337	   The step of gathering candidates involves identifying which paths,
338	   protocols, and endpoints may be used for a given Connection.  This
339	   list is determined by the requirements, prohibitions, and preferences
340	   of the application as specified in the Selection Properties.

342	4.1.1.  Gathering Endpoint Candidates

344	   Both Local and Remote Endpoint Candidates must be discovered during
345	   connection establishment.  To support ICE, or similar protocols, that
346	   involve out-of-band indirect signalling to exchange candidates with
347	   the Remote Endpoint, it's important to be able to query the set of
348	   candidate Local Endpoints, and give the protocol stack a set of
349	   candidate Remote Endpoints, before it attempts to establish
350	   connections.

352	4.1.1.1.  Local Endpoint candidates

354	   The set of possible Local Endpoints is gathered.  In the simple case,
355	   this merely enumerates the local interfaces and protocols, and
356	   allocates ephemeral source ports.  For example, a system that has
357	   Wi-Fi and Ethernet and supports IPv4 and IPv6 might gather four
358	   candidate locals (IPv4 on Ethernet, IPv6 on Ethernet, IPv4 on Wi-Fi,
359	   and IPv6 on Wi-Fi) that can form the source for a transient.

361	   If NAT traversal is required, the process of gathering Local
362	   Endpoints becomes broadly equivalent to the ICE candidate gathering
363	   phase [RFC5245].  The endpoint determines its server reflexive Local
364	   Endpoints (i.e., the translated address of a local, on the other side
365	   of a NAT) and relayed locals (e.g., via a TURN server or other
366	   relay), for each interface and network protocol.  These are added to
367	   the set of candidate Local Endpoints for this connection.

369	   Gathering Local Endpoints is primarily a local operation, although it
370	   might involve exchanges with a STUN server to derive server reflexive
371	   locals, or with a TURN server or other relay to derive relayed
372	   locals.  It does not involve communication with the Remote Endpoint.

374	4.1.1.2.  Remote Endpoint Candidates

376	   The Remote Endpoint is typically a name that needs to be resolved
377	   into a set of possible addresses that can be used for communication.
378	   Resolving the Remote Endpoint is the process of recursively
379	   performing such name lookups, until fully resolved, to return the set
380	   of candidates for the remote of this connection.

382	   How this is done will depend on the type of the Remote Endpoint, and
383	   can also be specific to each Local Endpoint.  A common case is when
384	   the Remote Endpoint is a DNS name, in which case it is resolved to
385	   give a set of IPv4 and IPv6 addresses representing that name.  Some
386	   types of remote might require more complex resolution.  Resolving the
387	   Remote Endpoint for a peer-to-peer connection might involve
388	   communication with a rendezvous server, which in turn contacts the
389	   peer to gain consent to communicate and retrieve its set of candidate
390	   locals, which are returned and form the candidate remote addresses
391	   for contacting that peer.

393	   Resolving the remote is not a local operation.  It will involve a
394	   directory service, and can require communication with the remote to
395	   rendezvous and exchange peer addresses.  This can expose some or all
396	   of the candidate locals to the remote.

398	4.1.2.  Structuring Options as a Tree

400	   When an implementation responsible for connection establishment needs
401	   to consider multiple options, it should logically structure these
402	   options as a hierarchical tree.  Each leaf node of the tree
403	   represents a single, coherent connection attempt, with an Endpoint, a
404	   Path, and a set of protocols that can directly negotiate and send
405	   data on the network.  Each node in the tree that is not a leaf
406	   represents a connection attempt that is either underspecified, or
407	   else includes multiple distinct options.  For example, when
408	   connecting on an IP network, a connection attempt to a hostname and
409	   port is underspecified, because the connection attempt requires a
410	   resolved IP address as its remote endpoint.  In this case, the node
411	   represented by the connection attempt to the hostname is a parent
412	   node, with child nodes for each IP address.  Similarly, an
413	   implementation that is allowed to connect using multiple interfaces
414	   will have a parent node of the tree for the decision between the
415	   paths, with a branch for each interface.

417	   The example aggregate connection attempt above can be drawn as a tree
418	   by grouping the addresses resolved on the same interface into
419	   branches:

421	                                ||
422	                   +==========================+
423	                   |  www.example.com:80/Any  |
424	                   +==========================+
425	                     //                    \\
426	   +==========================+       +==========================+
427	   | www.example.com:80/Wi-Fi |       |  www.example.com:80/LTE  |
428	   +==========================+       +==========================+
429	                ||                      //                    \\
430	     +====================+  +====================+  +======================+
431	     | 192.0.2.1:80/Wi-Fi |  |  192.0.2.1:80/LTE  |  |  2001:DB8::1.80/LTE  |
432	     +====================+  +====================+  +======================+

434	   The rest of this section will use a notation scheme to represent this
435	   tree.  The parent (or trunk) node of the tree will be represented by
436	   a single integer, such as "1".  Each child of that node will have an
437	   integer that identifies it, from 1 to the number of children.  That
438	   child node will be uniquely identified by concatenating its integer
439	   to its parent's identifier with a dot in between, such as "1.1" and
440	   "1.2".  Each node will be summarized by a tuple of three elements:
441	   Endpoint, Path, and Protocol.  The above example can now be written
442	   more succinctly as:

444	   1 [www.example.com:80, Any, TCP]
445	     1.1 [www.example.com:80, Wi-Fi, TCP]
446	       1.1.1 [192.0.2.1:80, Wi-Fi, TCP]
447	     1.2 [www.example.com:80, LTE, TCP]
448	       1.2.1 [192.0.2.1:80, LTE, TCP]
449	       1.2.2 [2001:DB8::1.80, LTE, TCP]
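
   The notation above can be generated mechanically from a tree
   structure.  The sketch below assumes a hypothetical Node type holding
   the (Endpoint, Path, Protocol) tuple and a list of children; it is
   meant only to make the numbering rule concrete.

```python
class Node:
    def __init__(self, endpoint, path, protocol, children=()):
        self.endpoint = endpoint
        self.path = path
        self.protocol = protocol
        self.children = list(children)


def render(node, ident="1", depth=0):
    """Return one line per node, using dotted identifiers."""
    indent = "  " * depth
    lines = [f"{indent}{ident} [{node.endpoint}, {node.path}, "
             f"{node.protocol}]"]
    # Children are numbered from 1 and appended to the parent's id.
    for index, child in enumerate(node.children, start=1):
        lines.extend(render(child, f"{ident}.{index}", depth + 1))
    return lines


tree = Node("www.example.com:80", "Any", "TCP", [
    Node("www.example.com:80", "Wi-Fi", "TCP", [
        Node("192.0.2.1:80", "Wi-Fi", "TCP"),
    ]),
    Node("www.example.com:80", "LTE", "TCP", [
        Node("192.0.2.1:80", "LTE", "TCP"),
        Node("2001:DB8::1.80", "LTE", "TCP"),
    ]),
])
```

   Joining render(tree) with newlines reproduces the six-line listing
   shown above.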

451	   When an implementation views this aggregate set of connection
452	   attempts as a single connection establishment, it will use only one
453	   of the leaf nodes to transfer data.  Thus, when a single leaf node
454	   becomes ready to use, then the entire connection attempt is ready to
455	   use by the application.  Another way to represent this is that every
456	   leaf node updates the state of its parent node when it becomes ready,
457	   until the trunk node of the tree is ready, which then notifies the
458	   application that the connection as a whole is ready to use.
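
   The upward propagation of readiness can be sketched as below.  The
   Attempt class and its fields are hypothetical, and the sketch assumes
   that the first leaf to become ready is the one the connection keeps.

```python
class Attempt:
    """One node of the establishment tree (illustrative only)."""

    def __init__(self, label, parent=None):
        self.label = label
        self.parent = parent
        self.ready = False
        self.winner = None  # leaf that first made this node ready

    def mark_ready(self, leaf=None):
        leaf = leaf if leaf is not None else self
        if self.ready:
            return  # an earlier leaf already won at this level
        self.ready = True
        self.winner = leaf
        if self.parent is not None:
            self.parent.mark_ready(leaf)


root = Attempt("1")
wifi = Attempt("1.1", root)
lte = Attempt("1.2", root)
wifi_v4 = Attempt("1.1.1", wifi)
lte_v6 = Attempt("1.2.2", lte)

wifi_v4.mark_ready()   # root becomes ready via the Wi-Fi branch
lte_v6.mark_ready()    # too late: root keeps its first winner
```

   In a real implementation, the point at which the trunk node becomes
   ready is where the application would be notified.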

460	   A connection establishment tree may be degenerate, and only have a
461	   single leaf node, such as a connection attempt to an IP address over
462	   a single interface with a single protocol.

464	   1 [192.0.2.1:80, Wi-Fi, TCP]

465	   A parent node may also have only one child (or leaf) node, such as
466	   when a hostname resolves to only a single IP address.

468	   1 [www.example.com:80, Wi-Fi, TCP]
469	     1.1 [192.0.2.1:80, Wi-Fi, TCP]

471	4.1.3.  Branch Types

473	   There are three types of branching from a parent node into one or
474	   more child nodes.  Any parent node of the tree must use only one type
475	   of branching.

477	4.1.3.1.  Derived Endpoints

479	   If a connection originally targets a single endpoint, there may be
480	   multiple endpoints of different types that can be derived from the
481	   original.  The connection library should order the derived endpoints
482	   according to application preference, system policy and expected
483	   performance.

485	   DNS hostname-to-address resolution is the most common method of
486	   endpoint derivation.  When trying to connect to a hostname endpoint
487	   on a traditional IP network, the implementation should send DNS
488	   queries for both A (IPv4) and AAAA (IPv6) records if both are
489	   supported on the local link.  The algorithm for ordering and racing
490	   these addresses should follow the recommendations in Happy Eyeballs
491	   [RFC8305].

493	   1 [www.example.com:80, Wi-Fi, TCP]
494	     1.1 [2001:DB8::1.80, Wi-Fi, TCP]
495	     1.2 [192.0.2.1:80, Wi-Fi, TCP]
496	     1.3 [2001:DB8::2.80, Wi-Fi, TCP]
497	     1.4 [2001:DB8::3.80, Wi-Fi, TCP]
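
   The ordering in the example above is consistent with interleaving
   address families, as Happy Eyeballs [RFC8305] recommends.  The sketch
   below is a simplified illustration of that interleaving only: it
   assumes each family's list is already sorted, it omits resolution
   and connection attempt delays, and the function and parameter names
   are chosen for this example.

```python
def interleave(preferred, other, first_family_count=1):
    """Order addresses so the two families alternate.

    `preferred` and `other` are lists for the two address families,
    each assumed to be sorted already.  The first `first_family_count`
    preferred addresses come first, then the families alternate until
    both lists are exhausted.
    """
    ordered = list(preferred[:first_family_count])
    remaining = list(preferred[first_family_count:])
    other = list(other)
    while other or remaining:
        if other:
            ordered.append(other.pop(0))
        if remaining:
            ordered.append(remaining.pop(0))
    return ordered


order = interleave(["2001:DB8::1", "2001:DB8::2", "2001:DB8::3"],
                   ["192.0.2.1"])
```

   For three IPv6 addresses and one IPv4 address, this yields the same
   sequence as branches 1.1 through 1.4 in the example.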

499	   DNS-Based Service Discovery can also provide an endpoint derivation
500	   step.  When trying to connect to a named service, the client may
501	   discover one or more hostname and port pairs on the local network
502	   using multicast DNS.  These hostnames should each be treated as a
503	   branch which can be attempted independently from other hostnames.
504	   Each of these hostnames may also resolve to one or more addresses,
505	   thus creating multiple layers of branching.

507	   1 [term-printer._ipp._tcp.meeting.ietf.org, Wi-Fi, TCP]
508	     1.1 [term-printer.meeting.ietf.org:631, Wi-Fi, TCP]
509	       1.1.1 [31.133.160.18:631, Wi-Fi, TCP]

511	4.1.3.2.  Alternate Paths

513	   If a client has multiple network interfaces available to it, such as
514	   a mobile client with both Wi-Fi and Cellular connectivity, it can
515	   attempt a connection over either interface.  This represents a branch
516	   point in the connection establishment.  As with derived endpoints,
517	   the interfaces should be ranked based on preference, system policy,
518	   and performance.  Attempts should be started on one interface, and
519	   then on other interfaces successively after delays based on expected
520	   round-trip-time or other available metrics.

522	   1 [192.0.2.1:80, Any, TCP]
523	     1.1 [192.0.2.1:80, Wi-Fi, TCP]
524	     1.2 [192.0.2.1:80, LTE, TCP]
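
   A staggered start schedule for ranked paths might be computed as
   sketched below.  The function name, the use of a per-path expected
   round-trip time as the stagger interval, and the 250 ms fallback are
   assumptions for illustration; an implementation may use other
   metrics.

```python
def start_schedule(ranked_paths, expected_rtt_ms, default_delay_ms=250):
    """Return (path, start_offset_ms) pairs for already-ranked paths.

    Each later attempt starts after waiting roughly the expected RTT
    of the attempt started just before it.
    """
    schedule = []
    offset = 0
    for path in ranked_paths:
        schedule.append((path, offset))
        offset += expected_rtt_ms.get(path, default_delay_ms)
    return schedule


plan = start_schedule(["Wi-Fi", "LTE", "VPN"], {"Wi-Fi": 40, "LTE": 90})
```

   In this example the Wi-Fi attempt starts immediately, the LTE
   attempt 40 ms later, and the VPN attempt 90 ms after that.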

526	   This same approach applies to any situation in which the client is
527	   aware of multiple links or views of the network.  Multiple Paths,
528	   each with a coherent set of addresses, routes, DNS server, and more,
529	   may share a single interface.  A path may also represent a virtual
530	   interface service such as a Virtual Private Network (VPN).

532	   The list of available paths should be constrained by any requirements
533	   or prohibitions the application sets, as well as system policy.

535	4.1.3.3.  Protocol Options

537	   Differences in possible protocol compositions and options can also
538	   provide a branching point in connection establishment.  This allows
539	   clients to be resilient to situations in which a certain protocol is
540	   not functioning on a server or network.

542	   This approach is commonly used for connections with optional proxy
543	   server configurations.  A single connection may be allowed to use an
544	   HTTP-based proxy, a SOCKS-based proxy, or connect directly.  These
545	   options should be ranked and attempted in succession.

547	   1 [www.example.com:80, Any, HTTP/TCP]
548	     1.1 [192.0.2.8:80, Any, HTTP/HTTP Proxy/TCP]
549	     1.2 [192.0.2.7:10234, Any, HTTP/SOCKS/TCP]
550	     1.3 [www.example.com:80, Any, HTTP/TCP]
551	       1.3.1 [192.0.2.1:80, Any, HTTP/TCP]

553	   This approach also allows a client to attempt different sets of
554	   application and transport protocols that may provide preferable
555	   characteristics when available.  For example, the protocol options
556	   could involve QUIC [I-D.ietf-quic-transport] over UDP on one branch,
557	   and HTTP/2 [RFC7540] over TLS over TCP on the other:

559	   1 [www.example.com:443, Any, Any HTTP]
560	     1.1 [www.example.com:443, Any, QUIC/UDP]
561	       1.1.1 [192.0.2.1:443, Any, QUIC/UDP]
562	     1.2 [www.example.com:443, Any, HTTP2/TLS/TCP]
563	       1.2.1 [192.0.2.1:443, Any, HTTP2/TLS/TCP]

565	   Another example is racing SCTP with TCP:

567	   1 [www.example.com:80, Any, Any Stream]
568	     1.1 [www.example.com:80, Any, SCTP]
569	       1.1.1 [192.0.2.1:80, Any, SCTP]
570	     1.2 [www.example.com:80, Any, TCP]
571	       1.2.1 [192.0.2.1:80, Any, TCP]

573	   Implementations that support racing protocols and protocol options
574	   should maintain a history of which protocols and protocol options
575	   successfully established, on a per-network basis (see Section 9.2).
576	   This information can influence future racing decisions to prioritize
577	   or prune branches.

579	4.2.  Branching Order-of-Operations

581	   Branch types must occur in a specific order relative to one another
582	   to avoid creating leaf nodes with invalid or incompatible settings.
583	   In the example above, it would be invalid to branch for derived
584	   endpoints (the DNS results for www.example.com) before branching
585	   between interface paths, since usable DNS results on one network may
586	   not necessarily be the same as DNS results on another network due to
587	   local network entities, supported address families, or enterprise
588	   network configurations.  Implementations must be careful to branch in
589	   an order that results in usable leaf nodes whenever there are
590	   multiple branch types that could be used from a single node.

592	   The order of operations for branching, where lower numbers are acted
593	   upon first, should be:

595	   1.  Alternate Paths

597	   2.  Protocol Options

599	   3.  Derived Endpoints

601	   Branching between paths is the first in the list because results
602	   across multiple interfaces are likely not related to one another:
603	   endpoint resolution may return different results, especially when
604	   using locally resolved host and service names, and which protocols
605	   are supported and preferred may differ across interfaces.  Thus, if
606	   multiple paths are attempted, the overall connection can be seen as a
607	   race between the available paths or interfaces.

609	   Protocol options are checked next in order.  Whether or not a set of
610	   protocols, or protocol-specific options, can successfully connect is
611	   generally not dependent on which specific IP address is used.
612	   Furthermore, the protocol stacks being attempted may influence or
613	   altogether change the endpoints being used.  Adding a proxy to a
614	   connection's branch will change the endpoint to the proxy's IP
615	   address or hostname.  Choosing an alternate protocol may also modify
616	   the ports that should be selected.

618	   Branching for derived endpoints is the final step, and may have
619	   multiple layers of derivation or resolution, such as DNS service
620	   resolution and DNS hostname resolution.

622	   For example, if the application has indicated both a preference for
623	   WiFi over LTE and for a feature only available in SCTP, branches will
624	   be first sorted according to path selection, with WiFi at the top.
625	   Then, branches with SCTP will be sorted to the top within their
626	   subtree according to the properties influencing protocol selection.
627	   However, if the implementation has cached the information that SCTP
628	   is not available on the path over WiFi, there is no SCTP node in the
629	   WiFi subtree.  Here, the path over WiFi will be tried first, and, if
630	   connection establishment succeeds, TCP will be used.  So the
631	   Selection Property of preferring WiFi takes precedence over the
632	   Property that led to a preference for SCTP.

634	   1 [www.example.com:80, Any, Any Stream]
635	     1.1 [192.0.2.1:80, Wi-Fi, Any Stream]
636	       1.1.1 [192.0.2.1:80, Wi-Fi, TCP]
637	     1.2 [192.0.3.1:80, LTE, Any Stream]
638	       1.2.1 [192.0.3.1:80, LTE, SCTP]
639	       1.2.2 [192.0.3.1:80, LTE, TCP]
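
   The branching order described in this section can be sketched in
   Python as three nested loops.  This is a minimal sketch under stated
   assumptions: the Node class and the callbacks protocols_for and
   resolve are hypothetical stand-ins for system policy, cached
   protocol availability, and per-path name resolution.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One node of the candidate tree: [endpoint, path, protocol]."""
    endpoint: str
    path: str
    protocol: str
    children: list = field(default_factory=list)


def build_tree(hostname, port, paths, protocols_for, resolve):
    """Branch by path first, then protocol, then derived endpoint.

    paths: path names in preference order (hypothetical input).
    protocols_for(path): protocols usable on that path, in order.
    resolve(hostname, path): addresses as resolved on that path.
    """
    root = Node(f"{hostname}:{port}", "Any", "Any")
    for path in paths:                              # 1. Alternate Paths
        path_node = Node(root.endpoint, path, "Any")
        for proto in protocols_for(path):           # 2. Protocol Options
            proto_node = Node(root.endpoint, path, proto)
            for addr in resolve(hostname, path):    # 3. Derived Endpoints
                proto_node.children.append(
                    Node(f"{addr}:{port}", path, proto))
            if proto_node.children:                 # drop empty branches
                path_node.children.append(proto_node)
        if path_node.children:
            root.children.append(path_node)
    return root


def leaves(node):
    """Yield leaf nodes depth-first, i.e., in preference order."""
    if not node.children:
        yield node
    for child in node.children:
        yield from leaves(child)
```

   Because resolution happens inside the path loop, each leaf carries
   an address that was obtained on the path it will be attempted over.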

641	4.3.  Sorting Branches

643	   Implementations should sort the branches of the tree of connection
644	   options in order of their preference rank.  Leaf nodes on branches
645	   with higher rankings represent connection attempts that will be raced
646	   first.  Implementations should order the branches to reflect the
647	   preferences expressed by the application for its new connection,
648	   including Selection Properties, which are specified in
649	   [I-D.ietf-taps-interface].

651	   In addition to the properties provided by the application, an
652	   implementation may include additional criteria such as cached
653	   performance estimates, see Section 9.2, or system policy, see
654	   Section 3.2, in the ranking.  Two examples of how Selection and
655	   Connection Properties may be used to sort branches are provided
656	   below:

658	   *  "Interface Instance or Type": If the application specifies an
659	      interface type to be preferred or avoided, implementations should
660	      rank paths accordingly.  If the application specifies an interface
661	      type to be required or prohibited, we expect an implementation to
662	      not include the non-conforming paths in the tree.

664	   *  "Capacity Profile": An implementation may use the Capacity Profile
665	      to prefer paths optimized for the application's expected traffic
666	      pattern according to cached performance estimates, see
667	      Section 9.2:

669	      -  Scavenger: Prefer paths with the highest expected available
670	         bandwidth, based on observed maximum throughput

672	      -  Low Latency/Interactive: Prefer paths with the lowest expected
673	         Round Trip Time

675	      -  Constant-Rate Streaming: Prefer paths that can satisfy the
676	         requested Stream Send or Stream Receive Bitrate, based on
677	         observed maximum throughput

679	   Implementations should process properties in the following order:
680	   Prohibit, Require, Prefer, Avoid.  If Selection Properties contain
681	   any prohibited properties, the implementation should first purge
682	   branches containing nodes with these properties.  For required
683	   properties, it should only keep branches that satisfy these
684	   requirements.  Next, it should order branches according to
685	   preferred properties, and finally use avoided properties as a
686	   tiebreaker.  When ordering branches, an implementation may give more
687	   weight to properties that the application has explicitly set than to
688	   properties that are default.
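
   The processing order above can be sketched in Python as a filter
   followed by a stable sort.  This is an illustration only: modelling
   each branch as a name plus a set of property labels, and the
   function name rank_branches, are assumptions made for the example.

```python
def rank_branches(branches, prohibit=(), require=(), prefer=(), avoid=()):
    """Filter and order branches: Prohibit, Require, Prefer, Avoid.

    branches: list of (name, set_of_property_labels) in initial order.
    Returns the names of the surviving branches, best first.
    """
    prohibit, require = set(prohibit), set(require)
    prefer, avoid = set(prefer), set(avoid)

    kept = [
        (name, props) for name, props in branches
        if not (props & prohibit)      # purge prohibited branches
        and require <= props           # keep only fully conforming ones
    ]

    def sort_key(branch):
        _, props = branch
        # More preferred properties rank first; avoided properties
        # only break ties between otherwise equal branches.
        return (-len(props & prefer), len(props & avoid))

    # sorted() is stable, so remaining ties keep their initial order.
    return [name for name, _ in sorted(kept, key=sort_key)]
```

   Weighting explicitly set properties above defaults, as mentioned
   above, would require a richer sort key than this sketch uses.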

690	   As the available protocols and paths on a specific system and in a
691	   specific context may vary, the result of sorting and the outcome of
692	   racing may vary even given the same Selection and Connection
693	   Properties.  However, an implementation ought to aim to provide a
694	   consistent outcome to applications, e.g., by preferring protocols and
695	   paths that existing Connections with similar Properties are already
696	   using.

698	4.4.  Candidate Racing

700	   The primary goal of the Candidate Racing process is to successfully
701	   negotiate a protocol stack to an endpoint over an interface--to
702	   connect a single leaf node of the tree--with as little delay and as
703	   few unnecessary connection attempts as possible.  Optimizing these
704	   two factors improves the user experience, while minimizing network
705	   load.

707	   This section covers the dynamic aspect of connection establishment.
708	   While the tree described above is a useful conceptual and
709	   architectural model, an implementation does not know what the full
710	   tree may become up front, nor will many of the possible branches be
711	   used in the common case.

713	   There are three different approaches to racing the attempts for
714	   different nodes of the connection establishment tree:

716	   1.  Immediate

718	   2.  Delayed

720	   3.  Failover

722	   Each approach is appropriate in different use-cases and branch types.
723	   However, to avoid consuming unnecessary network resources,
724	   implementations should not use immediate racing as a default
725	   approach.

727	   The timing algorithms for racing should remain independent across
728	   branches of the tree.  Any timer or racing logic is isolated to a
729	   given parent node, and is not ordered precisely with regard to the
730	   children of other nodes.

732	4.4.1.  Delayed

734	   Delayed racing can be used whenever a single node of the tree has
735	   multiple child nodes.  Based on the order determined when building
736	   the tree, the first child node will be initiated immediately,
737	   followed by the next child node after some delay.  Once that second
738	   child node is initiated, the third child node (if present) will begin
739	   after another delay, and so on until all child nodes have been
740	   initiated, or one of the child nodes successfully completes its
741	   negotiation.

743	   Delayed racing attempts occur in parallel.  Implementations should
744	   not terminate an earlier child connection attempt upon starting a
745	   secondary child.

747	   The delay between starting child nodes should be based on the
748	   properties of the previously started child node.  For example, if the
749	   first child represents an IP address with a known route, and the
750	   second child represents another IP address, the delay between
751	   starting the first and second IP addresses can be based on the
752	   expected retransmission cadence for the first child's connection
753	   (derived from historical round-trip-time).  Alternatively, if the
754	   first child represents a branch on a Wi-Fi interface, and the second
755	   child represents a branch on an LTE interface, the delay should be
756	   based on the expected time in which the branch for the first
757	   interface would be able to establish a connection, based on link
758	   quality and historical round-trip-time.

760	   Any delay should have a defined minimum and maximum value based on
761	   the branch type.  Generally, branches between paths and protocols
762	   should have longer delays than branches between derived endpoints.
763	   The maximum delay should be considered with regards to how long a
764	   user is expected to wait for the connection to complete.

766	   If a child node fails to connect before the delay timer has fired for
767	   the next child, the next child should be started immediately.
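
   Delayed racing for the children of a single node can be sketched in
   Python with asyncio.  This is a simplified sketch: it uses one fixed
   delay rather than a delay derived from the previously started child,
   and the function name race_delayed and the contract that an attempt
   raises an exception on failure are assumptions of the example.

```python
import asyncio


async def race_delayed(attempts, delay):
    """Start attempts in order, `delay` seconds apart; first success wins.

    attempts: zero-argument callables returning a coroutine that yields
    a connection object, or raises on failure (hypothetical contract).
    """
    remaining = list(attempts)
    pending = set()
    try:
        while remaining or pending:
            if remaining:
                # Earlier attempts keep running in parallel.
                pending.add(asyncio.create_task(remaining.pop(0)()))
            # Wake when any attempt finishes or the delay timer fires;
            # with nothing left to start, just wait for a result.
            timeout = delay if remaining else None
            done, pending = await asyncio.wait(
                pending, timeout=timeout,
                return_when=asyncio.FIRST_COMPLETED)
            winners = [t for t in done if t.exception() is None]
            if winners:
                return winners[0].result()
            # A failure before the timer fired falls through, so the
            # next child is started immediately.
        raise OSError("all candidates failed")
    finally:
        for task in pending:           # stop the losing attempts
            task.cancel()
```

   This sketch cancels the losing attempts as soon as one child
   succeeds; an implementation could instead let them finish in order
   to gather metrics.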

769	4.4.2.  Failover

771	   If an implementation or application has a strong preference for one
772	   branch over another, the branching node may choose to wait until one
773	   child has failed before starting the next.  Failure of a leaf node is
774	   determined by its protocol negotiation failing or timing out; failure
775	   of a parent branching node is determined by all of its children
776	   failing.

778	   An example in which failover is recommended is a race between a
779	   protocol stack that uses a proxy and a protocol stack that bypasses
780	   the proxy.  Failover is useful in case the proxy is down or
781	   misconfigured, but any more aggressive type of racing may end up
782	   unnecessarily avoiding a proxy that was preferred by policy.
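
   Failover can be sketched in Python as a strictly sequential loop.
   The function name connect_with_failover and the convention that a
   failed attempt raises OSError are assumptions of this example.

```python
def connect_with_failover(attempts):
    """Try attempts in preference order; start the next only on failure.

    attempts: callables that return a connection object or raise
    OSError when negotiation fails or times out (hypothetical contract).
    """
    errors = []
    for attempt in attempts:
        try:
            return attempt()
        except OSError as exc:
            errors.append(exc)         # remember why this branch failed
    raise OSError(f"all {len(errors)} candidates failed")
```

   In the proxy example, the proxied stack would be listed first, so
   the direct stack is only attempted once the proxy attempt has
   failed.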

784	4.5.  Completing Establishment

786	   The process of connection establishment completes when one leaf node
787	   of the tree has completed negotiation with the remote endpoint
788	   successfully, or else all nodes of the tree have failed to connect.
789	   The first leaf node to complete its connection is then used by the
790	   application to send and receive data.

792	   It is useful to process success and failure throughout the tree by
793	   child nodes reporting to their parent nodes (towards the trunk of the
794	   tree).  For example, in the following case, if 1.1.1 fails to
795	   connect, it reports the failure to 1.1.  Since 1.1 has no other child
796	   nodes, it also has failed and reports that failure to 1.  Because 1.2
797	   has not yet failed, 1 is not considered to have failed.  Since 1.2
798	   has not yet started, it is started and the process continues.
799	   Similarly, if 1.1.1 successfully connects, then it marks 1.1 as
800	   connected, which propagates to the trunk node 1.  At this point, the
801	   connection as a whole is considered to be successfully connected and
802	   ready to process application data.

804	   1 [www.example.com:80, Any, TCP]
805	     1.1 [www.example.com:80, Wi-Fi, TCP]
806	       1.1.1 [192.0.2.1:80, Wi-Fi, TCP]
807	     1.2 [www.example.com:80, LTE, TCP]
808	   ...
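
   The reporting of success and failure towards the trunk can be
   sketched in Python as follows.  The Candidate class and its state
   names are hypothetical and serve only to illustrate the propagation
   rules described above.

```python
class Candidate:
    """A node of the tree that reports its outcome to its parent."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        self.state = "pending"         # "pending", "connected", "failed"
        if parent is not None:
            parent.children.append(self)

    def report_success(self):
        """Mark this node and every ancestor up to the trunk connected."""
        node = self
        while node is not None:
            node.state = "connected"
            node = node.parent

    def report_failure(self):
        """Mark this node failed; a parent fails once all children have."""
        self.state = "failed"
        parent = self.parent
        if parent is not None and all(
                child.state == "failed" for child in parent.children):
            parent.report_failure()
```

   Applied to the tree above, a failure of 1.1.1 marks 1.1 as failed
   but leaves node 1 pending, because 1.2 has not failed.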

810	   If a leaf node has successfully completed its connection, all other
811	   attempts should be made ineligible for use by the application for the
812	   original request.  New connection attempts that involve transmitting
813	   data on the network should not be started after another leaf node has
814	   completed successfully, as the connection as a whole has been
815	   established.  An implementation may choose to let certain handshakes
816	   and negotiations complete in order to gather metrics to influence
817	   future connections.  Similarly, an implementation may choose to hold
818	   onto fully established leaf nodes that were not the first to
819	   establish for use as part of a Pooled Connection, see Section 7.1, or
820	   in future connections.  In both cases, keeping additional connections
821	   is generally not recommended since those attempts were slower to
822	   connect and may exhibit less desirable properties.

824	4.5.1.  Determining Successful Establishment

826	   Implementations may select the criteria by which a leaf node is
827	   considered to be successfully connected differently on a per-protocol
828	   basis.  If the only protocol being used is a transport protocol with
829	   a clear handshake, like TCP, then the obvious choice is to declare
830	   that node "connected" when the last packet of the three-way handshake
831	   has been received.  If the only protocol being used is an
832	   "unconnected" protocol, like UDP, the implementation may consider the
833	   node fully "connected" the moment it determines a route is present,
834	   before sending any packets on the network, see further Section 4.7.

836	   For protocol stacks with multiple handshakes, the decision becomes
837	   more nuanced.  If the protocol stack involves both TLS and TCP, an
838	   implementation could determine that a leaf node is connected after
839	   the TCP handshake is complete, or it can wait for the TLS handshake
840	   to complete as well.  The benefit of declaring completion when the
841	   TCP handshake finishes, and thus stopping the race for other branches
842	   of the tree, is that there will be less burden on the network from
843	   other connection attempts.  On the other hand, by waiting until the
844	   TLS handshake is complete, an implementation avoids the scenario in
845	   which a TCP handshake completes quickly, but TLS negotiation is
846	   either very slow or fails altogether in particular network conditions
847	   or to a particular endpoint.  To avoid the issue of TLS possibly
848	   failing, the implementation should not generate a Ready event for the
849	   Connection until TLS is established.

851	   If all of the leaf nodes fail to connect during racing, i.e., none of
852	   the configurations that satisfy all requirements given in the
853	   Transport Parameters actually work over the available paths, then the
854	   transport system should notify the application with an InitiateError
855	   event.  An InitiateError event should also be generated in case the
856	   transport system finds no usable candidates to race.

858	4.6.  Establishing multiplexed connections

860	   Multiplexing several Connections over a single underlying transport
861	   connection requires that the Connections to be multiplexed belong to
862	   the same Connection Group (as is indicated by the application using
863	   the Clone call).  When the underlying transport connection supports
864	   multi-streaming, the Transport System can map each Connection in the
865	   Connection Group to a different stream.  Thus, when the Connections
866	   that are offered to an application by the Transport System are
867	   multiplexed, the Transport System may implement the establishment of
868	   a new Connection by simply beginning to use a new stream of an
869	   already established transport connection and there is no need for a
870	   connection establishment procedure.  This, then, also means that
871	   there may not be any "establishment" message (like a TCP SYN), but
872	   the application can simply start sending or receiving.  Therefore,
873	   when the Initiate action of a Transport System is called without
874	   Messages being handed over, it cannot be guaranteed that the other
875	   endpoint will have any way to know about this, and hence a passive
876	   endpoint's ConnectionReceived event may not be called upon an active
877	   endpoint's Initiate.  Instead, calling the ConnectionReceived event
878	   may be delayed until the first Message arrives.

880	4.7.  Handling racing with "unconnected" protocols

882	   While protocols that use an explicit handshake to validate a
883	   Connection to a peer can be used for racing multiple establishment
884	   attempts in parallel, "unconnected" protocols such as raw UDP do not
885	   offer a way to validate the presence of a peer or the usability of a
886	   Connection without application feedback.  An implementation should
887	   consider such a protocol stack to be established as soon as a local
888	   route to the peer endpoint is confirmed.

890	   However, if a peer is not reachable over the network using the
891	   unconnected protocol, or data cannot be exchanged for any other
892	   reason, the application may want to attempt using another candidate
893	   Protocol Stack.  The implementation should maintain the list of other
894	   candidate Protocol Stacks that were eligible to use.  In the case
895	   that the application signals that the initial Protocol Stack is
896	   failing for some reason and that another option should be attempted,
897	   the Connection can be updated to point to the next candidate Protocol
898	   Stack.  This can be viewed as an application-driven form of Protocol
899	   Stack racing.

901	4.8.  Implementing listeners

903	   When an implementation is asked to Listen, it registers with the
904	   system to wait for incoming traffic to the Local Endpoint.  If no
905	   Local Endpoint is specified, the implementation should either use an
906	   ephemeral port or generate an error.

908	   If the Selection Properties do not require a single network interface
909	   or path, but allow the use of multiple paths, the Listener object
910	   should register for incoming traffic on all of the network interfaces
911	   or paths that conform to the Properties.  The set of available paths
912	   can change over time, so the implementation should monitor network
913	   path changes and register and de-register the Listener across all
914	   usable paths.  When using multiple paths, the Listener is generally
915	   expected to use the same port for listening on each.

917	   If the Selection Properties allow multiple protocols to be used for
918	   listening, and the implementation supports it, the Listener object
919	   should register across the eligible protocols for each path.  This
920	   means that inbound Connections delivered by the implementation may
921	   have heterogeneous protocol stacks.

923	4.8.1.  Implementing listeners for Connected Protocols

925	   Connected protocols such as TCP and TLS-over-TCP have a strong
926	   mapping between the Local and Remote Endpoints (five-tuple) and their
927	   protocol connection state.  These map well into Connection objects.
928	   Whenever a new inbound handshake is being started, the Listener
929	   should generate a new Connection object and pass it to the
930	   application.

932	4.8.2.  Implementing listeners for Unconnected Protocols

934	   Unconnected protocols such as UDP and UDP-lite generally do not
935	   provide the same mechanisms that connected protocols do to offer
936	   Connection objects.  Implementations should wait for incoming packets
937	   for unconnected protocols on a listening port and should match each
938	   packet's five-tuple to an existing Connection object, or else create
939	   a new Connection object.  On platforms with facilities to create a
940	   "virtual connection" for unconnected protocols, implementations
941	   should use these mechanisms to minimise the handling of datagrams
942	   intended for already created Connection objects.
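
   Five-tuple matching for an unconnected protocol can be sketched in
   Python as a dictionary lookup.  The DatagramListener class is
   hypothetical, and a Connection object is modelled here as a plain
   list of received datagrams.

```python
class DatagramListener:
    """Demultiplexes datagrams to per-peer Connection objects."""

    def __init__(self, local_addr, on_new_connection):
        self.local_addr = local_addr   # (ip, port) the listener uses
        self.on_new_connection = on_new_connection
        self.connections = {}          # five-tuple -> Connection (a list)

    def handle_datagram(self, remote_addr, payload, protocol="UDP"):
        """Deliver a datagram, creating a Connection for a new peer."""
        key = (protocol, *self.local_addr, *remote_addr)
        connection = self.connections.get(key)
        if connection is None:
            connection = self.connections[key] = []
            self.on_new_connection(key, connection)
        connection.append(payload)
        return key
```

   Only the first datagram from a given peer triggers the callback;
   later datagrams with the same five-tuple go to the existing object.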

944	4.8.3.  Implementing listeners for Multiplexed Protocols

946	   Protocols that provide multiplexing of streams into a single five-
947	   tuple can listen both for entirely new connections (a new HTTP/2
948	   stream on a new TCP connection, for example) and for new sub-
949	   connections (a new HTTP/2 stream on an existing connection).  If the
950	   abstraction of Connection presented to the application is mapped to
951	   the multiplexed stream, then the Listener should deliver new
952	   Connection objects in the same way for either case.  The
953	   implementation should allow the application to introspect the
954	   Connection Group marked on the Connections to determine the grouping
955	   of the multiplexing.

957	5.  Implementing Sending and Receiving Data

959	   The most basic mapping for sending a Message is an abstraction of
960	   datagrams, in which the transport protocol naturally deals in
961	   discrete packets.  Each Message here corresponds to a single
962	   datagram.  Generally, these will be short enough that sending and
963	   receiving will always use a complete Message.

965	   For protocols that expose byte-streams, the only delineation provided
966	   by the protocol is the end of the stream in a given direction.  Each
967	   Message in this case corresponds to the entire stream of bytes in a
968	   direction.  These Messages may be quite long, in which case they can
969	   be sent in multiple parts.

971	   Protocols that provide the framing (such as length-value protocols,
972	   or protocols that use delimiters) provide data boundaries that may be
973	   longer than a traditional packet datagram.  Each Message for framing
974	   protocols corresponds to a single frame, which may be sent either as
975	   a complete Message, or in multiple parts.

977	5.1.  Sending Messages

979	   The effect of the application sending a Message is determined by the
980	   top-level protocol in the established Protocol Stack.  That is, if
981	   the top-level protocol provides an abstraction of framed messages
982	   over a connection, the receiving application will be able to obtain
983	   multiple Messages on that connection, even if the framing protocol is
984	   built on a byte-stream protocol like TCP.

986	5.1.1.  Message Properties

988	   *  Lifetime: this should be implemented by removing the Message from
989	      its queue of pending Messages after the Lifetime has expired.  A
990	      queue of pending Messages within the transport system
991	      implementation that have yet to be handed to the Protocol Stack
992	      can always support this property, but once a Message has been sent
993	      into the send buffer of a protocol, only certain protocols may
994	      support de-queueing a message.  For example, TCP cannot remove
995	      bytes from its send buffer, while in case of SCTP, such control
996	      over the SCTP send buffer can be exercised using the partial
997	      reliability extension [RFC8303].  When there is no standing queue
998	      of Messages within the system, and the Protocol Stack does not
999	      support removing a Message from its buffer, this property may be
1000	      ignored.

1002	   *  Priority: this represents the ability to prioritize a Message over
1003	      other Messages.  This can be implemented by the system re-ordering
1004	      Messages that have yet to be handed to the Protocol Stack, or by
1005	      giving relative priority hints to protocols that support
1006	      priorities per Message.  For example, an implementation of HTTP/2
1007	      could choose to send Messages of different Priority on streams of
1008	      different priority.

1010	   *  Ordered: when this is false, it disables the requirement of in-
1011	      order-delivery for protocols that support configurable ordering.

1013	   *  Idempotent: when this is true, it means that the Message can be
1014	      used by mechanisms that might transfer it multiple times - e.g.,
1015	      as a result of racing multiple transports or as part of TCP Fast
1016	      Open.

1018	   *  Final: when this is true, it means that a transport connection can
1019	      be closed immediately after its transmission.

1021	   *  Corruption Protection Length: when this is set to any value other
1022	      than -1, it limits the required checksum in protocols that allow
1023	      limiting the checksum length (e.g., UDP-Lite).

1025	   *  Transmission Profile: TBD - because it's not final in the API yet.
1026	      Old text follows: when this is set to "Interactive/Low Latency",
1027	      the Message should be sent immediately, even when this comes at
1028	      the cost of using the network capacity less efficiently.  For
1029	      example, small messages can sometimes be bundled to fit into a
1030	      single data packet for the sake of reducing header overhead; such
1031	      bundling should not be used.  For example, in case of TCP, the
1032	      Nagle algorithm should be disabled when Interactive/Low Latency is
1033	      selected as the capacity profile.  Scavenger/Bulk can translate
1034	      into usage of a congestion control mechanism such as LEDBAT, and/
1035	      or the capacity profile can lead to a choice of a DSCP value as
1036	      described in [I-D.ietf-taps-minset].

1038	   *  Singular Transmission: when this is true, the application requests
1039	      to avoid transport-layer segmentation or network-layer
1040	      fragmentation.  Some transports implement network-layer
1041	      fragmentation avoidance (Path MTU Discovery) without exposing this
1042	      functionality to the application; in this case, only transport-
1043	      layer segmentation should be avoided, by fitting the message into
1044	      a single transport-layer segment or otherwise failing.  Otherwise,
1045	      network-layer fragmentation should be avoided--e.g. by requesting
1046	      the IP Don't Fragment bit to be set in case of UDP(-Lite) and IPv4
1047	      (SET_DF in [RFC8304]).
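
   The Lifetime and Priority properties in the list above can be
   sketched in Python as a queue of Messages that have not yet been
   handed to the Protocol Stack.  The PendingQueue class is
   hypothetical, and the sketch assumes that a larger Priority value is
   sent first and that Lifetime is given in seconds.

```python
import heapq
import itertools


class PendingQueue:
    """Holds Messages not yet handed to the Protocol Stack."""

    def __init__(self, clock):
        self._clock = clock            # callable returning time in seconds
        self._heap = []
        self._seq = itertools.count()  # keeps FIFO order within a priority

    def enqueue(self, data, priority=0, lifetime=None):
        """Queue a Message; lifetime=None means it never expires."""
        expires = None if lifetime is None else self._clock() + lifetime
        # Assumption: larger priority values are dequeued first.
        heapq.heappush(
            self._heap, (-priority, next(self._seq), expires, data))

    def dequeue(self):
        """Return the next unexpired Message, or None if there is none."""
        while self._heap:
            _, _, expires, data = heapq.heappop(self._heap)
            if expires is None or self._clock() < expires:
                return data
            # Lifetime expired while queued: drop the Message silently.
        return None
```

   Expired Messages are discarded lazily at dequeue time, which keeps
   the sketch short; an implementation could also purge them on a
   timer.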

1049	5.1.2.  Send Completion

1051	   The application should be notified whenever a Message or partial
1052	   Message has been consumed by the Protocol Stack, or has failed to
1053	   send.  The meaning of the Message being consumed by the stack may
1054	   vary depending on the protocol.  For a basic datagram protocol like
1055	   UDP, this may correspond to the time when the packet is sent into the
1056	   interface driver.  For a protocol that buffers data in queues, like
1057	   TCP, this may correspond to when the data has entered the send
1058	   buffer.

1060	5.1.3.  Batching Sends

1062	   Since sending a Message may involve a context switch between the
1063	   application and the transport system, sending patterns that involve
1064	   multiple small Messages can incur high overhead if each needs to be
1065	   enqueued separately.  To avoid this, the application should have a
1066	   way to indicate a batch of Send actions, during which time the
1067	   implementation will hold off on processing Messages until the batch
1068	   is complete.  This can also reduce context switches when enqueuing
1069	   data in the interface driver if the operation can be batched.
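
   One way to expose batching is sketched below in Python using a
   context manager.  The Sender class and its transmit callback are
   hypothetical; the callback stands in for the single hand-off to the
   transport system.

```python
from contextlib import contextmanager


class Sender:
    """Hands Messages to `transmit`, optionally grouped into a batch."""

    def __init__(self, transmit):
        self._transmit = transmit      # callable taking a list of Messages
        self._batch = None             # a list while a batch is open

    def send(self, message):
        if self._batch is not None:
            self._batch.append(message)    # hold until the batch ends
        else:
            self._transmit([message])

    @contextmanager
    def batch(self):
        """Hold Messages sent inside the block; hand them over once."""
        self._batch = []
        try:
            yield self
            held = self._batch
        finally:
            self._batch = None
        if held:                       # reached only on normal exit
            self._transmit(held)
```

   If the block raises an exception, the held Messages are discarded
   in this sketch; whether to send or drop them is a design choice.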

1071	5.2.  Receiving Messages

1073	   Similar to sending, Receiving a Message is determined by the top-
1074	   level protocol in the established Protocol Stack.  The main
1075	   difference with Receiving is that the size and boundaries of the
1076	   Message are not known beforehand.  The application can communicate in
1077	   its Receive action the parameters for the Message, which can help the
1078	   implementation know how much data to deliver and when.  For example,
1079	   if the application only wants to receive a complete Message, the
1080	   implementation should wait until an entire Message (datagram, stream,
1081	   or frame) is read before delivering any Message content to the
1082	   application.  This requires the implementation to understand where
1083	   messages end, either via a supplied deframer or because the top-level
1084	   protocol in the established Protocol Stack preserves message
1085	   boundaries; if, on the other hand, the top-level protocol only
1086	   supports a byte-stream and no deframers were supplied, the
1087	   application must specify the minimum number of bytes of Message
1088	   content it wants to receive (which may be just a single byte) to
1089	   control the flow of received data.

1091	   If a Connection becomes finished before a requested Receive action
1092	   can be satisfied, the implementation should deliver any partial
1093	   Message content outstanding, or if none is available, an indication
1094	   that there will be no more received Messages.
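
   Delivering only complete Messages over a byte-stream can be sketched
   in Python with a simple deframer.  The framing used here, a 4-byte
   big-endian length prefix, is an assumption chosen for illustration,
   and the class name LengthPrefixedDeframer is hypothetical.

```python
import struct


class LengthPrefixedDeframer:
    """Splits a byte-stream into Messages with a 4-byte length prefix."""

    def __init__(self):
        self._buffer = bytearray()

    def feed(self, data):
        """Add received bytes; return the complete Messages now available."""
        self._buffer += data
        messages = []
        while len(self._buffer) >= 4:
            (length,) = struct.unpack_from("!I", self._buffer)
            if len(self._buffer) < 4 + length:
                break                  # wait for the rest of this Message
            messages.append(bytes(self._buffer[4:4 + length]))
            del self._buffer[:4 + length]
        return messages
```

   Bytes belonging to an incomplete Message stay buffered until later
   data completes it, so the application never sees a partial Message.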

1096	5.3.  Handling of data for fast-open protocols

1098	   Several protocols allow sending higher-level protocol or application
1099	   data within the first packet of their protocol establishment, such as
1100	   TCP Fast Open [RFC7413] and TLS 1.3 [RFC8446].  This approach is
1101	   referred to as sending Zero-RTT (0-RTT) data.  This is a desirable
1102	   property, but poses challenges to an implementation that uses racing
1103	   during connection establishment.

1105	   If the application has 0-RTT data to send in any protocol handshakes,
1106	   it needs to provide this data before the handshakes have begun.  When
1107	   racing, this means that the data should be provided before the
1108	   process of connection establishment has begun.  If the application
1109	   wants to send 0-RTT data, it must indicate this to the implementation
1110	   by setting the Idempotent send parameter to true when sending the
1111	   data.  In general, 0-RTT data may be replayed (for example, if a TCP
1112	   SYN contains data, and the SYN is retransmitted, the data will be
1113	   retransmitted as well), but racing means that different leaf nodes
1114	   have the opportunity to send the same data independently.  If data is
1115	   truly idempotent, this should be permissible.

1117	   Once the application has provided its 0-RTT data, an implementation
1118	   should keep a copy of this data and provide it to each new leaf node
1119	   that is started and for which a 0-RTT protocol is being used.

1121	   It is also possible that protocol stacks within a particular leaf
1122	   node use 0-RTT handshakes without any idempotent application data.
1123	   For example, TCP Fast Open could use a Client Hello from TLS as its
1124	   0-RTT data, shortening the cumulative handshake time.

1126	   0-RTT handshakes often rely on previous state, such as TCP Fast Open
1127	   cookies, previously established TLS tickets, or out-of-band
1128	   distributed pre-shared keys (PSKs).  Implementations should be aware
1129	   of security concerns around using these tokens across multiple
1130	   addresses or paths when racing.  In the case of TLS, any given ticket
1131	   or PSK should only be used on one leaf node.  If implementations have
1132	   multiple tickets available from a previous connection, each leaf node
1133	   attempt must use a different ticket.  In effect, each leaf node will
1134	   send the same early application data, yet encoded (encrypted)
1135	   differently on the wire.
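
	   The ticket rule above can be sketched as follows.  This is a
	   minimal illustration, assuming that every leaf node runs TLS 1.3
	   and that a leaf without a ticket falls back to a full handshake
	   and sends the data afterwards.  "LeafAttempt" and
	   "plan_zero_rtt" are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class LeafAttempt:
    leaf_id: str
    early_data: Optional[bytes]  # copy of the application's 0-RTT data
    ticket: Optional[str]        # TLS ticket reserved for this leaf only


def plan_zero_rtt(leaf_ids: Sequence[str], early_data: bytes,
                  idempotent: bool,
                  tickets: Sequence[str]) -> List[LeafAttempt]:
    """Give each racing leaf its own ticket and a copy of the data."""
    available = list(tickets)
    plans = []
    for leaf_id in leaf_ids:
        # A ticket is consumed only for data marked Idempotent, and
        # never handed to more than one leaf.
        ticket = available.pop(0) if (idempotent and available) else None
        data = early_data if ticket is not None else None
        plans.append(LeafAttempt(leaf_id, data, ticket))
    return plans
```

	   With two tickets and three leaf nodes, the first two leaves each
	   carry the early data under a distinct ticket and the third sends
	   no early data.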

1137	6.  Implementing Message Framers

1139	   Message Framers are pieces of code that define simple transformations
1140	   between application Message data and raw transport protocol data.  A
1141	   Framer can encapsulate or encode outbound Messages, and decapsulate
1142	   or decode inbound data into Messages.

1144	   While many protocols can be represented as Message Framers, for the
1145	   purposes of the Transport Services interface these are ways for
1146	   applications or application frameworks to define their own Message
1147	   parsing to be included within a Connection's Protocol Stack.  As an
1148	   example, TLS can serve the purpose of framing data over TCP, but is
1149	   exposed as a protocol natively supported by the Transport Services
1150	   interface.

1152	   Most Message Framers fall into one of two categories:

1154	   *  Header-prefixed record formats, such as a basic Type-Length-Value
1155	      (TLV) structure

1157	   *  Delimiter-separated formats, such as HTTP/1.1.

1159	   Common Message Framers can be provided by the Transport Services
1160	   implementation, but an implementation ought to allow custom Message
1161	   Framers to be defined by the application or some other piece of
1162	   software.  This section describes one possible interface for defining
1163	   Message Framers as an example.

1165	6.1.  Defining Message Framers

1167	   A Message Framer is primarily defined by the set of code that handles
1168	   events for a framer implementation, specifically how it handles
1169	   inbound and outbound data parsing.  The piece of code that implements
1170	   custom framing logic will be referred to as the "framer
1171	   implementation", which may be provided by the Transport Services
1172	   implementation or the application itself.  The Message Framer refers
1173	   to the object or piece of code within the main Connection
1174	   implementation that delivers events to the custom framer
1175	   implementation whenever data is ready to be parsed or framed.

1177	   When a Connection establishment attempt begins, an event can be
1178	   delivered to notify the framer implementation that a new Connection
1179	   is being created.  Similarly, a stop event can be delivered when a
1180	   Connection is being torn down.  The framer implementation can use the
1181	   Connection object to look up specific properties of the Connection or
1182	   the network being used that may influence how to frame Messages.

1184	   MessageFramer -> Start(Connection)
1185	   MessageFramer -> Stop(Connection)

1187	   When a Message Framer generates a "Start" event, the framer
1188	   implementation has the opportunity to start writing some data prior
1189	   to the Connection delivering its "Ready" event.  This allows the
1190	   implementation to communicate control data to the remote endpoint
1191	   that can be used to parse Messages.

1193	   MessageFramer.MakeConnectionReady(Connection)

1195	   Similarly, when a Message Framer generates a "Stop" event, the framer
1196	   implementation has the opportunity to write some final data or clear
1197	   up its local state before the "Closed" event is delivered to the
1198	   Application.  The framer implementation can indicate that it has
1199	   finished with this.

1201	   MessageFramer.MakeConnectionClosed(Connection)

1203	   At any time, if the framer implementation encounters a fatal error,
1204	   it can also cause the Connection to fail and provide an error.

1206	   MessageFramer.FailConnection(Connection, Error)

1208	   Should the framer implementation deem the candidate selected during
1209	   racing unsuitable, it can signal this by failing the Connection prior
1210	   to marking it as ready.  If there are no other candidates available,
1211	   the Connection will fail.  Otherwise, the Connection will select a
1212	   different candidate and the Message Framer will generate a new
1213	   "Start" event.

1215	   Before an implementation marks a Message Framer as ready, it can also
1216	   dynamically add a protocol or framer above it in the stack.  This
1217	   allows protocols like STARTTLS, that need to add TLS conditionally,
1218	   to modify the Protocol Stack based on a handshake result.

1220	   otherFramer := NewMessageFramer()
1221	   MessageFramer.PrependFramer(Connection, otherFramer)

1223	6.2.  Sender-side Message Framing

1225	   Message Framers generate an event whenever a Connection sends a new
1226	   Message.

1228	   MessageFramer -> NewSentMessage<Connection, MessageData, MessageContext, IsEndOfMessage>

1230	   Upon receiving this event, a framer implementation is responsible for
1231	   performing any necessary transformations and sending the resulting
1232	   data back to the Message Framer, which will in turn send it to the
1233	   next protocol.  Implementations should ensure that there is a way to
1234	   pass the original data through without copying to improve
1235	   performance.

1237	   MessageFramer.Send(Connection, Data)

1239	   To provide an example, a simple protocol that adds a length as a
1240	   header would receive the "NewSentMessage" event, create a data
1241	   representation of the length of the Message data, and then send a
1242	   block of data that is the concatenation of the length header and the
1243	   original Message data.
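
	   A minimal sketch of such a sender-side transformation is shown
	   below.  The 4-byte, network-byte-order length field and the
	   function name "frame_with_length" are assumptions made for this
	   example only.

```python
import struct


def frame_with_length(message_data: bytes) -> bytes:
    """Prefix the Message data with its length as a 4-byte header."""
    # "!I" packs an unsigned 32-bit integer in network byte order.
    header = struct.pack("!I", len(message_data))
    # Concatenation copies the payload; an implementation that wants
    # to avoid the copy could hand header and payload down separately.
    return header + message_data
```

	   For instance, framing the five bytes "hello" yields a 9-byte
	   block whose first four bytes encode the value 5.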

1245	6.3.  Receiver-side Message Framing

1247	   In order to parse a received flow of data into Messages, the Message
1248	   Framer notifies the framer implementation whenever new data is
1249	   available to parse.

1251	   MessageFramer -> HandleReceivedData<Connection>

1253	   Upon receiving this event, the framer implementation can inspect the
1254	   inbound data.  The data is parsed from a particular cursor
1255	   representing the unprocessed data.  The application requests a
1256	   specific amount of data it needs to have available in order to parse.
1257	   If the data is not available, the parse fails.

1259	   MessageFramer.Parse(Connection, MinimumIncompleteLength, MaximumLength) -> (Data, MessageContext, IsEndOfMessage)
1260	   The framer implementation can directly advance the receive cursor
1261	   once it has parsed data to effectively discard data (for example,
1262	   discard a header once the content has been parsed).

1264	   To deliver a Message to the application, the framer implementation
1265	   can either directly deliver data that it has allocated, or deliver a
1266	   range of data directly from the underlying transport and
1267	   simultaneously advance the receive cursor.

1269	   MessageFramer.AdvanceReceiveCursor(Connection, Length)
1270	   MessageFramer.DeliverAndAdvanceReceiveCursor(Connection, MessageContext, Length, IsEndOfMessage)
1271	   MessageFramer.Deliver(Connection, MessageContext, Data, IsEndOfMessage)

1273	   Note that "MessageFramer.DeliverAndAdvanceReceiveCursor" allows the
1274	   framer implementation to earmark bytes as part of a Message even
1275	   before they are received by the transport.  This allows the delivery
1276	   of very large Messages without requiring the implementation to
1277	   directly inspect all of the bytes.

1279	   To provide an example, a simple protocol that parses a length as a
1280	   header value would receive the "HandleReceivedData" event, and call
1281	   "Parse" with a minimum and maximum set to the length of the header
1282	   field.  Once the parse succeeded, it would call
1283	   "AdvanceReceiveCursor" with the length of the header field, and then
1284	   call "DeliverAndAdvanceReceiveCursor" with the length of the body
1285	   that was parsed from the header, marking the new Message as complete.
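
	   The receive-side steps above can be sketched roughly as follows.
	   "ReceiveCursor" is a hypothetical in-memory stand-in for the
	   Message Framer's cursor operations, and the 4-byte
	   network-byte-order header is an assumption.  As a
	   simplification, this sketch waits for the whole body to arrive
	   rather than earmarking bytes that have not yet been received.

```python
import struct
from typing import List, Optional

HEADER_LEN = 4  # assumed size of the length header


class ReceiveCursor:
    """Toy model of the unprocessed inbound data and its cursor."""

    def __init__(self) -> None:
        self.unprocessed = bytearray()
        self.delivered: List[bytes] = []

    def parse(self, minimum: int, maximum: int) -> Optional[bytes]:
        # The parse fails (returns None) if too little data is available.
        if len(self.unprocessed) < minimum:
            return None
        return bytes(self.unprocessed[:maximum])

    def advance(self, length: int) -> None:
        del self.unprocessed[:length]

    def deliver_and_advance(self, length: int) -> None:
        self.delivered.append(bytes(self.unprocessed[:length]))
        del self.unprocessed[:length]


def handle_received_data(cursor: ReceiveCursor) -> None:
    """Deliver every complete length-prefixed Message in the buffer."""
    while True:
        header = cursor.parse(HEADER_LEN, HEADER_LEN)
        if header is None:
            return
        (body_len,) = struct.unpack("!I", header)
        if len(cursor.unprocessed) < HEADER_LEN + body_len:
            return  # body incomplete; wait for the next event
        cursor.advance(HEADER_LEN)             # discard the header
        cursor.deliver_and_advance(body_len)   # complete Message
```

	   Calling "handle_received_data" again after more bytes arrive
	   resumes from the cursor, so a Message split across several
	   events is delivered once its last byte is available.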

1287	7.  Implementing Connection Management

1289	   Once a Connection is established, the Transport Services system
1290	   allows applications to interact with the Connection by modifying or
1291	   inspecting Connection Properties.  A Connection can also generate
1292	   events in the form of Soft Errors.

1294	   The set of Connection Properties that are supported for setting and
1295	   getting on a Connection are described in [I-D.ietf-taps-interface].
1296	   For any properties that are generic, and thus could apply to all
1297	   protocols being used by a Connection, the Transport System should
1298	   store the properties in a generic storage, and notify all protocol
1299	   instances in the Protocol Stack whenever the properties have been
1300	   modified by the application.  For protocol-specific properties, such
1301	   as the User Timeout that applies to TCP, the Transport System only
1302	   needs to update the relevant protocol instance.

1304	   If an error is encountered in setting a property (for example, if the
1305	   application tries to set a TCP-specific property on a Connection that
1306	   is not using TCP), the action should fail gracefully.  The
1307	   application may be informed of the error, but the Connection itself
1308	   should not be terminated.
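
	   One way to picture this behaviour is the sketch below, in which
	   generic properties are stored once and broadcast to the whole
	   Protocol Stack, protocol-specific ones reach only the owning
	   instance, and an unsupported property yields an error without
	   touching the Connection.  All class and property names here are
	   hypothetical.

```python
from typing import Any, Dict, Iterable, List, Optional


class PropertyError(Exception):
    """Reported to the application; the Connection stays usable."""


class ProtocolInstance:
    def __init__(self, name: str, specific_keys: Iterable[str]) -> None:
        self.name = name
        self.specific_keys = set(specific_keys)
        self.values: Dict[str, Any] = {}

    def notify(self, key: str, value: Any) -> None:
        self.values[key] = value


class ConnectionProperties:
    def __init__(self, stack: List[ProtocolInstance],
                 generic_keys: Iterable[str]) -> None:
        self.stack = stack
        self.generic_keys = set(generic_keys)
        self.generic: Dict[str, Any] = {}
        self.connection_open = True

    def set(self, key: str, value: Any) -> Optional[PropertyError]:
        if key in self.generic_keys:
            # Generic storage, then notify every protocol instance.
            self.generic[key] = value
            for instance in self.stack:
                instance.notify(key, value)
            return None
        owners = [i for i in self.stack if key in i.specific_keys]
        if not owners:
            # Fail gracefully: report the error, keep the Connection.
            return PropertyError("unsupported property: " + key)
        for instance in owners:
            instance.notify(key, value)
        return None
```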

1310	   The Transport Services implementation should allow protocol instances
1311	   in the Protocol Stack to pass up arbitrary generic or protocol-
1312	   specific errors that can be delivered to the application as Soft
1313	   Errors.  These allow the application to be informed of ICMP errors,
1314	   and other similar events.

1316	7.1.  Pooled Connection

1318	   For protocols that employ request/response pairs and do not require
1319	   in-order delivery of the responses, like HTTP, the transport
1320	   implementation may distribute interactions across several underlying
1321	   transport connections.  For these kinds of protocols, implementations
1322	   may hide the connection management and only expose a single
1323	   Connection object and the individual requests/responses as messages.
1324	   These Pooled Connections can use multiple connections or multiple
1325	   streams of multi-streaming connections between endpoints, as long as
1326	   all of these satisfy the requirements and prohibitions specified in
1327	   the Selection Properties of the Pooled Connection.  This enables
1328	   implementations to realize transparent connection coalescing,
1329	   connection migration, and to perform per-message endpoint and path
1330	   selection by choosing among these underlying connections.

1332	7.2.  Handling Path Changes

1334	   When a path change occurs, the Transport Services implementation is
1335	   responsible for notifying Protocol Instances in the Protocol Stack.
1336	   If the Protocol Stack includes a transport protocol that supports
1337	   multipath connectivity, an update to the available paths should
1338	   inform the Protocol Instance of the new set of paths that are
1339	   permissible based on the Selection Properties passed by the
1340	   application.  A multipath protocol can establish new subflows over
1341	   new paths, and should tear down subflows over paths that are no
1342	   longer available.  Pooled Connections (Section 7.1) may add or remove
1343	   underlying transport connections in a similar manner.  If the
1344	   Protocol Stack includes a transport protocol that does not support
1345	   multipath, but supports migrating between paths, the update to
1346	   available paths can be used as the trigger for migrating the
1347	   connection.  For protocols that do not support multipath or
1348	   migration, the Protocol Instances may be informed of the path change,
1349	   but should not be forcibly disconnected if the previously used path
1350	   becomes unavailable.  An exception to this case is if the System
1351	   Policy changes to prohibit traffic from the Connection based on its
1352	   properties, in which case the Protocol Stack should be disconnected.
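
	   The decision described in this section can be summarized in a
	   short sketch.  The enumeration members and the function name
	   "on_path_change" are hypothetical labels for the cases in the
	   text, not defined terms.

```python
from enum import Enum


class PathSupport(Enum):
    MULTIPATH = "multipath"
    MIGRATION = "migration"
    NONE = "none"


class Action(Enum):
    UPDATE_SUBFLOWS = "add and tear down subflows"
    MIGRATE = "migrate the connection"
    INFORM_ONLY = "inform the Protocol Instance"
    DISCONNECT = "disconnect the Protocol Stack"


def on_path_change(support: PathSupport,
                   policy_prohibits: bool) -> Action:
    """Pick the reaction of one Protocol Stack to a path change."""
    # A System Policy prohibition overrides everything else.
    if policy_prohibits:
        return Action.DISCONNECT
    if support is PathSupport.MULTIPATH:
        return Action.UPDATE_SUBFLOWS
    if support is PathSupport.MIGRATION:
        return Action.MIGRATE
    # No multipath, no migration: never force a disconnect here.
    return Action.INFORM_ONLY
```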

1354	8.  Implementing Connection Termination

1356	   With TCP, when an application closes a connection, this means that it
1357	   has no more data to send (but expects all data that has been handed
1358	   over to be reliably delivered).  However, with TCP only, "close" does
1359	   not mean that the application will stop receiving data.  This is
1360	   related to TCP's ability to support half-closed connections.

1362	   SCTP is an example of a protocol that does not support such half-
1363	   closed connections.  Hence, with SCTP, the meaning of "close" is
1364	   stricter: an application has no more data to send (but expects all
1365	   data that has been handed over to be reliably delivered), and will
1366	   also not receive any more data.

1368	   Implementing a protocol-independent transport system means that the
1369	   exposed semantics must be the strictest subset of the semantics of
1370	   all supported protocols.  Hence, as is common with all reliable
1371	   transport protocols, after a Close action, the application can expect
1372	   to have its reliability requirements honored regarding the data it
1373	   has given to the Transport System, but it cannot expect to be able to
1374	   read any more data after calling Close.

1376	   Abort differs from Close only in that no guarantees are given
1377	   regarding data that the application has handed over to the Transport
1378	   System before calling Abort.

1380	   As explained in Section 4.6, when a new stream is multiplexed on an
1381	   already existing connection of a Transport Protocol Instance, there
1382	   is no need for a connection establishment procedure.  Because the
1383	   Connections that are offered by the Transport System can be
1384	   implemented as streams that are multiplexed on a transport protocol's
1385	   connection, it can therefore not be guaranteed that one Endpoint's
1386	   Initiate action provokes a ConnectionReceived event at its peer.

1388	   For Close (provoking a Finished event) and Abort (provoking a
1389	   ConnectionError event), the same logic applies: while it is desirable
1390	   to be informed when a peer closes or aborts a Connection, whether
1391	   this is possible depends on the underlying protocol, and no
1392	   guarantees can be given.  With SCTP, the transport system can use the
1393	   stream reset procedure to cause a Finished event upon a Close action
1394	   from the peer [NEAT-flow-mapping].

1396	9.  Cached State

1398	   Beyond a single Connection's lifetime, it is useful for an
1399	   implementation to keep state and history.  This cached state can help
1400	   improve future Connection establishment due to re-using results and
1401	   credentials, and favoring paths and protocols that performed well in
1402	   the past.

1404	   Cached state may be associated with different Endpoints for the same
1405	   Connection, depending on the protocol generating the cached content.
1406	   For example, session tickets for TLS are associated with specific
1407	   endpoints, and thus should be cached based on a Connection's hostname
1408	   Endpoint (if applicable).  On the other hand, performance
1409	   characteristics of a path are more likely tied to the IP address and
1410	   subnet being used.

1412	9.1.  Protocol state caches

1414	   Some protocols will have long-term state to be cached in association
1415	   with Endpoints.  This state often has some time after which it is
1416	   expired, so the implementation should allow each protocol to specify
1417	   an expiration for cached content.

1419	   Examples of cached protocol state include:

1421	   *  The DNS protocol can cache resolution answers (A and AAAA queries,
1422	      for example), associated with a Time To Live (TTL) to be used for
1423	      future hostname resolutions without having to query the DNS
1424	      resolver again.

1426	   *  TLS caches session state and tickets based on a hostname, which
1427	      can be used for resuming sessions with a server.

1429	   *  TCP can cache cookies for use in TCP Fast Open.

1431	   Cached protocol state is primarily used during Connection
1432	   establishment for a single Protocol Stack, but may be used to
1433	   influence an implementation's preference between several candidate
1434	   Protocol Stacks.  For example, if two IP address Endpoints are
1435	   otherwise equally preferred, an implementation may choose to attempt
1436	   a connection to an address for which it has a TCP Fast Open cookie.

1438	   Applications must have a way to flush protocol cache state if
1439	   desired.  This may be necessary, for example, if application-layer
1440	   identifiers rotate and clients wish to avoid linkability via
1441	   trackable TLS tickets or TFO cookies.
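
	   A protocol state cache with per-entry expiration and an
	   application-triggered flush might look like the following
	   sketch.  The class name, the (protocol, endpoint) key, and the
	   explicit "now" parameter (used so that expiry can be checked
	   deterministically) are assumptions of this example.

```python
import time
from typing import Any, Dict, Optional, Tuple


class ProtocolStateCache:
    """Caches protocol state per (protocol, endpoint) with expiry."""

    def __init__(self) -> None:
        self._entries: Dict[Tuple[str, str], Tuple[Any, float]] = {}

    def put(self, protocol: str, endpoint: str, value: Any,
            lifetime_s: float, now: Optional[float] = None) -> None:
        now = time.monotonic() if now is None else now
        # Each protocol chooses the lifetime of its own cached content.
        self._entries[(protocol, endpoint)] = (value, now + lifetime_s)

    def get(self, protocol: str, endpoint: str,
            now: Optional[float] = None) -> Optional[Any]:
        now = time.monotonic() if now is None else now
        entry = self._entries.get((protocol, endpoint))
        if entry is None:
            return None
        value, expires_at = entry
        if now >= expires_at:
            del self._entries[(protocol, endpoint)]  # drop stale state
            return None
        return value

    def flush(self, protocol: Optional[str] = None) -> None:
        # Flush everything, or only the state of one protocol.
        if protocol is None:
            self._entries.clear()
            return
        for key in [k for k in self._entries if k[0] == protocol]:
            del self._entries[key]
```

	   In this sketch a TLS ticket would be stored under a hostname
	   and a TCP Fast Open cookie under an IP address, matching the
	   association described in Section 9.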

1443	9.2.  Performance caches

1445	   In addition to protocol state, Protocol Instances should provide data
1446	   into a performance-oriented cache to help guide future protocol and
1447	   path selection.  Some performance information can be gathered
1448	   generically across several protocols to allow predictive comparisons
1449	   between protocols on given paths:

1451	   *  Observed Round Trip Time

1453	   *  Connection Establishment latency

1455	   *  Connection Establishment success rate

1457	   These items can be cached on a per-address and per-subnet
1458	   granularity, and averaged between different values.  The information
1459	   should be cached on a per-network basis, since it is expected that
1460	   different network attachments will have different performance
1461	   characteristics.  Besides Protocol Instances, other system entities
1462	   may also provide data into performance-oriented caches.  This could
1463	   for instance be signal strength information reported by radio modems
1464	   like Wi-Fi and mobile broadband or information about the battery-
1465	   level of the device.  Furthermore, the system may cache the observed
1466	   maximum throughput on a path as an estimate of the available
1467	   bandwidth.

1469	   An implementation should use this information, when possible, to
1470	   determine preference between candidate paths, endpoints, and protocol
1471	   options.  Eligible options that historically had significantly better
1472	   performance than others should be selected first when gathering
1473	   candidates (see Section 4.1) to ensure better performance for the
1474	   application.

1476	   The reasonable lifetime for cached performance values will vary
1477	   depending on the nature of the value.  Certain information, like the
1478	   connection establishment success rate to a Remote Endpoint using a
1479	   given protocol stack, can be stored for a long period of time (hours
1480	   or longer), since it is expected that the capabilities of the Remote
1481	   Endpoint are not changing very quickly.  On the other hand, Round
1482	   Trip Time observed by TCP over a particular network path may vary
1483	   over a relatively short time interval.  For such values, the
1484	   implementation should remove them from the cache more quickly, or
1485	   treat older values with less confidence/weight.
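
	   Treating older values with less weight could, for example, be
	   done with an exponentially decaying average.  The sketch below
	   is one possible approach, not a recommended algorithm; the
	   half-life parameter and the function name "weighted_rtt" are
	   assumptions.

```python
from typing import Iterable, Optional, Tuple


def weighted_rtt(samples: Iterable[Tuple[float, float]], now: float,
                 half_life_s: float) -> Optional[float]:
    """Average (timestamp, rtt) samples, halving the weight of a
    sample for every half_life_s seconds of age."""
    total_weight = 0.0
    total = 0.0
    for timestamp, rtt in samples:
        age = max(0.0, now - timestamp)
        weight = 0.5 ** (age / half_life_s)
        total_weight += weight
        total += weight * rtt
    if total_weight == 0.0:
        return None  # no usable samples
    return total / total_weight
```

	   With a 60-second half-life, a 100 ms sample that is 60 seconds
	   old counts half as much as a fresh 50 ms sample, which pulls
	   the estimate towards the more recent observation.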

1487	10.  Specific Transport Protocol Considerations

1489	   Each protocol that can run as part of a Transport Services
1490	   implementation defines both its API mapping as well as implementation
1491	   details.  API mappings for a protocol apply most to Connections in
1492	   which the given protocol is the "top" of the Protocol Stack.  For
1493	   example, the mapping of the "Send" function for TCP applies to
1494	   Connections in which the application directly sends over TCP.  If
1495	   HTTP/2 is used on top of TCP, the HTTP/2 mappings take precedence.

1497	   Each protocol has a notion of Connectedness.  Possible values for
1498	   Connectedness are:

1500	   *  Unconnected.  Unconnected protocols do not establish explicit
1501	      state between endpoints, and do not perform a handshake during
1502	      Connection establishment.

1504	   *  Connected.  Connected protocols establish state between endpoints,
1505	      and perform a handshake during Connection establishment.  The
1506	      handshake may be 0-RTT to send data or resume a session, but
1507	      bidirectional traffic is required to confirm connectedness.

1509	   *  Multiplexing Connected.  Multiplexing Connected protocols share
1510	      properties with Connected protocols, but also explicitly support
1511	      opening multiple application-level flows.  This means that they
1512	      can support cloning new Connection objects without a new explicit
1513	      handshake.

1515	   Protocols also define a notion of Data Unit.  Possible values for
1516	   Data Unit are:

1518	   *  Byte-stream.  Byte-stream protocols do not define any Message
1519	      boundaries of their own apart from the end of a stream in each
1520	      direction.

1522	   *  Datagram.  Datagram protocols define Message boundaries at the
1523	      same level of transmission, such that only complete (not partial)
1524	      Messages are supported.

1526	   *  Message.  Message protocols support Message boundaries that can be
1527	      sent and received either as complete or partial Messages.  Maximum
1528	      Message lengths can be defined, and Messages can be partially
1529	      reliable.
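
	   The two notions above can be modelled as enumerations attached
	   to each protocol.  The sketch below is illustrative only; the
	   type and method names are hypothetical, and the three table
	   entries repeat the values given for TCP, UDP, and TLS in the
	   subsections that follow.

```python
from dataclasses import dataclass
from enum import Enum


class Connectedness(Enum):
    UNCONNECTED = "Unconnected"
    CONNECTED = "Connected"
    MULTIPLEXING_CONNECTED = "Multiplexing Connected"


class DataUnit(Enum):
    BYTE_STREAM = "Byte-stream"
    DATAGRAM = "Datagram"
    MESSAGE = "Message"


@dataclass(frozen=True)
class ProtocolTraits:
    connectedness: Connectedness
    data_unit: DataUnit

    def performs_handshake(self) -> bool:
        return self.connectedness is not Connectedness.UNCONNECTED

    def clones_without_handshake(self) -> bool:
        return self.connectedness is Connectedness.MULTIPLEXING_CONNECTED

    def needs_framer_for_boundaries(self) -> bool:
        # Byte-streams define no Message boundaries of their own.
        return self.data_unit is DataUnit.BYTE_STREAM


TRAITS = {
    "TCP": ProtocolTraits(Connectedness.CONNECTED, DataUnit.BYTE_STREAM),
    "UDP": ProtocolTraits(Connectedness.UNCONNECTED, DataUnit.DATAGRAM),
    "TLS": ProtocolTraits(Connectedness.CONNECTED, DataUnit.BYTE_STREAM),
}
```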

1531	   Below, primitives in the style of
1532	   "CATEGORY.[SUBCATEGORY].PRIMITIVENAME.PROTOCOL" (e.g.,
1533	   "CONNECT.SCTP") refer to the primitives with the same name in section
1534	   4 of [RFC8303].  For further implementation details, the description
1535	   of these primitives in [RFC8303] points to section 3, which refers
1536	   back to the specifications for each protocol.  This back-tracking
1537	   method applies to all elements of [I-D.ietf-taps-minset] (see
1538	   appendix D of [I-D.ietf-taps-interface]): they are listed in appendix
1539	   A of [I-D.ietf-taps-minset] with an implementation hint in the same
1540	   style, pointing back to section 4 of [RFC8303].

1542	10.1.  TCP

1544	   Connectedness: Connected

1546	   Data Unit: Byte-stream

1548	   API mappings for TCP are as follows:

1550	   Connection Object:  TCP connections between two hosts map directly to
1551	      Connection objects.

1553	   Initiate:  CONNECT.TCP.  Calling "Initiate" on a TCP Connection
1554	      causes it to reserve a local port, and send a SYN to the Remote
1555	      Endpoint.

1557	   InitiateWithSend:  CONNECT.TCP with parameter "user message".  Early
1558	      idempotent data is sent on a TCP Connection in the SYN, as TCP
1559	      Fast Open data.

1561	   Ready:  A TCP Connection is ready once the three-way handshake is
1562	      complete.

1564	   InitiateError:  Failure of CONNECT.TCP.  TCP can throw various errors
1565	      during connection setup.  Specifically, it is important to handle
1566	      a RST being sent by the peer during the handshake.

1568	   ConnectionError:  Once established, TCP throws errors whenever the
1569	      connection is disconnected, such as due to receiving a RST from
1570	      the peer or hitting a TCP retransmission timeout.

1572	   Listen:  LISTEN.TCP.  Calling "Listen" for TCP binds a local port and
1573	      prepares it to receive inbound SYN packets from peers.

1575	   ConnectionReceived:  TCP Listeners will deliver new connections once
1576	      they have replied to an inbound SYN with a SYN-ACK.

1578	   Clone:  Calling "Clone" on a TCP Connection creates a new Connection
1579	      with equivalent parameters.  The two Connections are otherwise
1580	      independent.

1582	   Send:  SEND.TCP.  TCP does not on its own preserve Message
1583	      boundaries.  Calling "Send" on a TCP connection lays out the bytes
1584	      on the TCP send stream without any other delineation.  Any Message
1585	      marked as Final will cause TCP to send a FIN once the Message has
1586	      been completely written, by calling CLOSE.TCP immediately upon
1587	      successful termination of SEND.TCP.

1589	   Receive:  With RECEIVE.TCP, TCP delivers a stream of bytes without
1590	      any Message delineation.  All data delivered in the "Received" or
1591	      "ReceivedPartial" event will be part of a single stream-wide
1592	      Message that is marked Final (unless a Message Framer is used).
1593	      EndOfMessage will be delivered when the TCP Connection has
1594	      received a FIN (CLOSE-EVENT.TCP or ABORT-EVENT.TCP) from the peer.

1596	   Close:  Calling "Close" on a TCP Connection indicates that the
1597	      Connection should be gracefully closed (CLOSE.TCP) by sending a
1598	      FIN to the peer and waiting for a FIN-ACK before delivering the
1599	      "Closed" event.

1601	   Abort:  Calling "Abort" on a TCP Connection indicates that the
1602	      Connection should be immediately closed by sending a RST to the
1603	      peer (ABORT.TCP).
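
	   The Send mapping for a Message marked as Final can be
	   approximated with the sockets API: write the bytes, then shut
	   down the sending direction, which on a TCP socket results in a
	   FIN.  The helper names below are hypothetical.  To stay
	   self-contained, the usage shown after the block runs over a
	   local "socket.socketpair()", a stand-in for a TCP connection
	   that carries no actual FIN segment.

```python
import socket


def send_final(sock: socket.socket, data: bytes) -> None:
    """Write the last Message, then close the sending direction."""
    sock.sendall(data)
    # On a TCP socket, SHUT_WR causes a FIN to be sent once the
    # queued data has been written.
    sock.shutdown(socket.SHUT_WR)


def read_until_end(sock: socket.socket) -> bytes:
    """Read the stream-wide Message until the peer's end of stream."""
    chunks = []
    while True:
        chunk = sock.recv(4096)
        if not chunk:  # empty read: the peer closed its direction
            break
        chunks.append(chunk)
    return b"".join(chunks)
```

	   For example, after "a, b = socket.socketpair()", calling
	   "send_final(a, b'last message')" lets "read_until_end(b)"
	   return the full payload and then stop at the end of the stream.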

1605	10.2.  UDP

1607	   Connectedness: Unconnected

1609	   Data Unit: Datagram

1611	   API mappings for UDP are as follows:

1613	   Connection Object:  UDP connections represent a pair of specific IP
1614	      addresses and ports on two hosts.

1616	   Initiate:  CONNECT.UDP.  Calling "Initiate" on a UDP Connection
1617	      causes it to reserve a local port, but does not generate any
1618	      traffic.

1620	   InitiateWithSend:  Early data on a UDP Connection does not have any
1621	      special meaning.  The data is sent whenever the Connection is
1622	      Ready.

1624	   Ready:  A UDP Connection is ready once the system has reserved a
1625	      local port and has a path to send to the Remote Endpoint.

1627	   InitiateError:  UDP Connections can only generate errors on
1628	      initiation due to port conflicts on the local system.

1630	   ConnectionError:  Once in use, UDP throws "soft errors" (ERROR.UDP(-
1631	      Lite)) upon receiving ICMP notifications indicating failures in
1632	      the network.

1634	   Listen:  LISTEN.UDP.  Calling "Listen" for UDP binds a local port and
1635	      prepares it to receive inbound UDP datagrams from peers.

1637	   ConnectionReceived:  UDP Listeners will deliver new connections once
1638	      they have received traffic from a new Remote Endpoint.

1640	   Clone:  Calling "Clone" on a UDP Connection creates a new Connection
1641	      with equivalent parameters.  The two Connections are otherwise
1642	      independent.

1644	   Send:  SEND.UDP(-Lite).  Calling "Send" on a UDP connection sends the
1645	      data as the payload of a complete UDP datagram.  Marking Messages
1646	      as Final does not change anything in the datagram's contents.
1647	      Upon sending a UDP datagram, some relevant fields and flags in the
1648	      IP header can be controlled: DSCP (SET_DSCP.UDP(-Lite)), DF in
1649	      IPv4 (SET_DF.UDP(-Lite)) and ECN flag (SET_ECN.UDP(-Lite)).

1651	   Receive:  RECEIVE.UDP(-Lite).  UDP only delivers complete Messages to
1652	      "Received", each of which represents a single datagram received in
1653	      a UDP packet.  Upon receiving a UDP datagram, the ECN flag from
1654	      the IP header can be obtained (GET_ECN.UDP(-Lite)).

1656	   Close:  Calling "Close" on a UDP Connection (ABORT.UDP(-Lite))
1657	      releases the local port reservation.

1659	   Abort:  Calling "Abort" on a UDP Connection (ABORT.UDP(-Lite)) is
1660	      identical to calling "Close".

1662	10.3.  UDP Multicast Receive

1664	   Connectedness: Unconnected

1666	   Data Unit: Datagram

1668	   API mappings for Receiving Multicast UDP are as follows:

1670	   Connection Object:  Established UDP Multicast Receive connections
1671	      represent a pair of specific IP addresses and ports.  The
1672	      "unidirectional receive" transport property is required, and the
1673	      local endpoint must be configured with a group IP address and a
1674	      port.

1676	   Initiate:  Calling "Initiate" on a UDP Multicast Receive Connection
1677	      causes an immediate InitiateError.  This is an unsupported
1678	      operation.

1680	   InitiateWithSend:  Calling "InitiateWithSend" on a UDP Multicast
1681	      Receive Connection causes an immediate InitiateError.  This is an
1682	      unsupported operation.

1684	   Ready:  A UDP Multicast Receive Connection is ready once the system
1685	      has received traffic for the appropriate group and port.

1687	   InitiateError:  UDP Multicast Receive Connections generate an
1688	      InitiateError if Initiate is called.

1690	   ConnectionError:  Once in use, UDP throws "soft errors" (ERROR.UDP(-
1691	      Lite)) upon receiving ICMP notifications indicating failures in
1692	      the network.

1694	   Listen:  LISTEN.UDP.  Calling "Listen" for UDP Multicast Receive
1695	      binds a local port, prepares it to receive inbound UDP datagrams
1696	      from peers, and issues a multicast host join.  If a remote
1697	      endpoint with an address is supplied, the join is Source-specific
1698	      Multicast, and the path selection is based on the route to the
1699	      remote endpoint.  If a remote endpoint is not supplied, the join
1700	      is Any-source Multicast, and the path selection is based on the
1701	      outbound route to the group supplied in the local endpoint.

1703	   ConnectionReceived:  UDP Multicast Receive Listeners will deliver new
1704	      connections once they have received traffic from a new Remote
1705	      Endpoint.

1707	   Clone:  Calling "Clone" on a UDP Multicast Receive Connection creates
1708	      a new Connection with equivalent parameters.  The two Connections
1709	      are otherwise independent.

1711	   Send:  SEND.UDP(-Lite).  Calling "Send" on a UDP Multicast Receive
1712	      connection causes an immediate SendError.  This is an unsupported
1713	      operation.

1715	   Receive:  RECEIVE.UDP(-Lite).  The Receive operation in a UDP
1716	      Multicast Receive connection only delivers complete Messages to
1717	      "Received", each of which represents a single datagram received in
1718	      a UDP packet.  Upon receiving a UDP datagram, the ECN flag from
1719	      the IP header can be obtained (GET_ECN.UDP(-Lite)).

1721	   Close:  Calling "Close" on a UDP Multicast Receive Connection
1722	      (ABORT.UDP(-Lite)) releases the local port reservation and leaves
1723	      the group.

1725	   Abort:  Calling "Abort" on a UDP Multicast Receive Connection
1726	      (ABORT.UDP(-Lite)) is identical to calling "Close".

1728	10.4.  TLS

1730	   The mapping of a TLS stream abstraction into the application is
1731	   equivalent to the contract provided by TCP (see Section 10.1), and
1732	   builds upon many of the actions of TCP connections.

1734	   Connectedness: Connected

1736	   Data Unit: Byte-stream

1738	   Connection Object:  Connection objects represent a single TLS
1739	      connection running over a TCP connection between two hosts.

1741	   Initiate:  Calling "Initiate" on a TLS Connection causes it to first
1742	      initiate a TCP connection.  Once the TCP protocol is Ready, the
1743	      TLS handshake will be performed as a client (starting by sending a
1744	      "client_hello", and so on).

1746	   InitiateWithSend:  Early idempotent data is supported by TLS 1.3, and
1747	      sends encrypted application data in the first TLS message when
1748	      performing session resumption.  For older versions of TLS, or if a
1749	      session is not being resumed, the initial data will be delayed
1750	      until the TLS handshake is complete.  TCP Fast Open can also be
1751	      enabled automatically.

1753	   Ready:  A TLS Connection is ready once the underlying TCP connection
1754	      is Ready, and TLS handshake is also complete and keys have been
1755	      established to encrypt application data.

1757	   InitiateError:  In addition to TCP initiation errors, TLS can
1758	      generate errors during its handshake.  Examples of errors include a
1759	      failure of the peer to successfully authenticate, the peer
1760	      rejecting the local authentication, or a failure to match versions
1761	      or algorithms.

1763	   ConnectionError:  TLS connections will generate TCP errors, or errors
1764	      due to failures to rekey or decrypt received messages.

1766	   Listen:  Calling "Listen" for TLS listens on TCP, and sets up
1767	      received connections to perform server-side TLS handshakes.

1769	   ConnectionReceived:  TLS Listeners will deliver new connections once
1770	      they have successfully completed both TCP and TLS handshakes.

1772	   Clone:  As with TCP, calling "Clone" on a TLS Connection creates a
1773	      new Connection with equivalent parameters.  The two Connections
1774	      are otherwise independent.

1776	   Send:  Like TCP, TLS does not preserve message boundaries.  Although
1777	      application data is framed natively in TLS, there is not a general
1778	      guarantee that these TLS messages represent semantically
1779	      meaningful application stream boundaries.  Rather, sending data on
1780	      a TLS Connection only guarantees that the application data will be
1781	      transmitted in an encrypted form.  Marking Messages as Final
1782	      causes a "close_notify" to be generated once the data has been
1783	      written.

1785	   Receive:  Like TCP, TLS delivers a stream of bytes without any
1786	      Message delineation.  The data is decrypted prior to being
1787	      delivered to the application.  If a "close_notify" is received,
1788	      the stream-wide Message will be delivered with EndOfMessage set.

1790	   Close:  Calling "Close" on a TLS Connection indicates that the
1791	      Connection should be gracefully closed by sending a "close_notify"
1792	      to the peer and waiting for a corresponding "close_notify" before
1793	      delivering the "Closed" event.

1795	   Abort:  Calling "Abort" on a TLS Connection indicates that the
1796	      Connection should be immediately closed by sending a
1797	      "close_notify", optionally preceded by "user_canceled", to the
1798	      peer.  Implementations do not need to wait to receive
1799	      "close_notify" before delivering the "Closed" event.
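
   The InitiateWithSend behavior described above for TLS reduces to a
   small decision, sketched below.  The names FirstSend and
   place_initial_data are hypothetical, and treating data that is not
   marked idempotent as ineligible for early data is an assumption of
   this sketch rather than a rule stated in this section.

```python
from enum import Enum


class FirstSend(Enum):
    EARLY_DATA = "sent as early data in the first TLS flight"
    AFTER_HANDSHAKE = "delayed until the TLS handshake completes"


def place_initial_data(tls_version: str, resuming_session: bool,
                       idempotent: bool) -> FirstSend:
    """Decide where the initial Message of InitiateWithSend goes (sketch)."""
    # Early data needs TLS 1.3 and a resumed session; this sketch also
    # requires the application to have marked the data as idempotent.
    if tls_version == "1.3" and resuming_session and idempotent:
        return FirstSend.EARLY_DATA
    # Older TLS versions, or a fresh session: wait for the handshake.
    return FirstSend.AFTER_HANDSHAKE
```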

1801	10.5.  DTLS

1803	   DTLS follows the same behavior as TLS (Section 10.4), with the
1804	   notable exception of not inheriting behavior directly from TCP.
1805	   Differences from TLS are detailed below, and all cases not explicitly
1806	   mentioned should be considered the same as TLS.

1808	   Connectedness: Connected

1810	   Data Unit: Datagram

1812	   Connection Object:  Connection objects represent a single DTLS
1813	      connection running over a set of UDP ports between two hosts.

1815	   Initiate:  Calling "Initiate" on a DTLS Connection causes it to
1816	      reserve a UDP local port, and begin sending handshake messages to
1817	      the peer over UDP.  These messages are reliable, and will be
1818	      automatically retransmitted.

1820	   Ready:  A DTLS Connection is ready once the TLS handshake is complete
1821	      and keys have been established to encrypt application data.

1823	   Send:  Sending over DTLS does preserve message boundaries in the same
1824	      way that UDP datagrams do.  Marking a Message as Final does send a
1825	      "close_notify" like TLS.

1827	   Receive:  Receiving over DTLS delivers one decrypted Message for each
1828	      received DTLS datagram.  If a "close_notify" is received, a
1829	      Message will be delivered that is marked as Final.

1831	10.6.  HTTP

1833	   HTTP requests and responses map naturally into Messages, since they
1834	   are delineated chunks of data with metadata that can be sent over a
1835	   transport.  To that end, HTTP can be seen as the most prevalent
1836	   framing protocol that runs on top of streams like TCP, TLS, etc.

1838	   In order to use a transport Connection that provides HTTP Message
1839	   support, the establishment and closing of the connection can be
1840	   treated as it would without the framing protocol.  Sending and
1841	   receiving of Messages, however, changes to treat each Message as a
1842	   well-delineated HTTP request or response, with the content of the
1843	   Message representing the body, and the Headers being provided in
1844	   Message metadata.

1846	   Connectedness: Multiplexing Connected

1848	   Data Unit: Message

1850	   Connection Object:  Connection objects represent a flow of HTTP
1851	      messages between a client and a server, which may be an HTTP/1.1
1852	      connection over TCP, or a single stream in an HTTP/2 connection.

1854	   Initiate:  Calling "Initiate" on an HTTP connection initiates a TCP
1855	      or TLS connection as a client.

1857	   Clone:  Calling "Clone" on an HTTP Connection opens a new stream on
1858	      an existing HTTP/2 connection when possible.  If the underlying
1859	      version does not support multiplexed streams, calling "Clone"
1860	      simply creates a new parallel connection.

1862	   Send:  When an application sends an HTTP Message, it is expected to
1863	      provide HTTP header values as a MessageContext in a canonical
1864	      form, along with any associated HTTP message body as the Message
1865	      data.  The HTTP header values are encoded in the specific version
1866	      format upon sending.

1868	   Receive:  HTTP Connections deliver Messages in which HTTP header
1869	      values are attached to MessageContexts, and HTTP bodies are
1870	      carried in Message data.

1872	   Close:  Calling "Close" on an HTTP Connection will only close the
1873	      underlying TLS or TCP connection if the HTTP version does not
1874	      support multiplexing.  For HTTP/2, for example, closing the
1875	      connection only closes a specific stream.
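
   As an illustration of the "Send" mapping above, the sketch below
   encodes canonical header values and a Message body in the HTTP/1.1
   wire format.  The function name and its parameters are hypothetical;
   an HTTP/2 stack would encode the same header values differently, and
   adding Content-Length automatically is a choice made for this
   sketch.

```python
from typing import Dict


def encode_http1_request(method: str, target: str,
                         headers: Dict[str, str], body: bytes = b"") -> bytes:
    """Serialize header values plus a body as an HTTP/1.1 request."""
    fields = dict(headers)
    # Add a Content-Length field when a body is present and the caller
    # did not supply one (a simplification made for this sketch).
    if body and not any(k.lower() == "content-length" for k in fields):
        fields["Content-Length"] = str(len(body))
    lines = [f"{method} {target} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in fields.items()]
    # The header section ends with an empty line, then the body follows.
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body
```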

1877	10.7.  QUIC

1879	   QUIC provides a multi-streaming interface to an encrypted transport.
1880	   Each stream can be viewed as equivalent to a TLS stream over TCP, so
1881	   a natural mapping is to present each QUIC stream as an individual
1882	   Connection.  The protocol for the stream will be considered Ready
1883	   whenever the underlying QUIC connection is established to the point
1884	   that this stream's data can be sent.  For streams after the first
1885	   stream, this will likely be an immediate operation.

1887	   Closing a single QUIC stream, presented to the application as a
1888	   Connection, does not imply closing the underlying QUIC connection
1889	   itself.  Rather, the implementation may choose to close the QUIC
1890	   connection once all streams have been closed (often after some
1891	   timeout), or after an individual stream Connection sends an Abort.

1893	   Connectedness: Multiplexing Connected

1895	   Data Unit: Stream

1897	   Connection Object:  Connection objects represent a single QUIC stream
1898	      on a QUIC connection.
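
   The stream and connection lifetime rules above can be sketched as
   simple bookkeeping.  The class below is hypothetical and models only
   one permitted policy: the QUIC connection is closed when a timeout
   fires with no streams open, or as soon as any stream Connection is
   aborted.  Its stream identifiers are plain counters, not real QUIC
   stream IDs.

```python
class QuicConnectionSketch:
    """Tracks stream Connections that share one QUIC connection."""

    def __init__(self) -> None:
        self.open_streams = set()
        self.closed = False
        self._next_id = 0

    def open_stream(self) -> int:
        if self.closed:
            raise RuntimeError("underlying QUIC connection is closed")
        stream_id = self._next_id
        self._next_id += 1
        self.open_streams.add(stream_id)
        return stream_id

    def close_stream(self, stream_id: int) -> None:
        # Closing one stream leaves the QUIC connection itself open.
        self.open_streams.discard(stream_id)

    def abort_stream(self, stream_id: int) -> None:
        # Policy chosen for this sketch: an Abort on any stream tears
        # down the whole QUIC connection, whichever stream sent it.
        self.open_streams.clear()
        self.closed = True

    def idle_timeout_fired(self) -> None:
        # Close the QUIC connection only if no streams remain open.
        if not self.open_streams:
            self.closed = True
```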

1900	10.8.  HTTP/2 transport

1902	   Similar to QUIC (Section 10.7), HTTP/2 provides a multi-streaming
1903	   interface.  This will generally use HTTP as the unit of Messages over
1904	   the streams, in which each stream can be represented as a transport
1905	   Connection.  The lifetime of streams and the HTTP/2 connection should
1906	   be managed as described for QUIC.

1908	   It is possible to treat each HTTP/2 stream as a raw byte-stream
1909	   instead of a carrier for HTTP messages, in which case the Messages
1910	   over the streams can be represented similarly to the TCP stream (one
1911	   Message per direction, see Section 10.1).

1913	   Connectedness: Multiplexing Connected

1915	   Data Unit: Stream

1916	   Connection Object:  Connection objects represent a single HTTP/2
1917	      stream on an HTTP/2 connection.

1919	10.9.  SCTP

1921	   Connectedness: Connected

1923	   Data Unit: Message

1925	   API mappings for SCTP are as follows:

1927	   Connection Object:  Connection objects represent a flow of SCTP
1928	      messages between a client and a server, which may be an SCTP
1929	      association or a stream in an SCTP association.  How to map
1930	      Connection objects to streams is described in [NEAT-flow-mapping];
1931	      in the following, a similar method is described.  To map
1932	      Connection objects to SCTP streams without head-of-line blocking
1933	      on the sender side, both the sending and receiving SCTP
1934	      implementation must support message interleaving [RFC8260].  Both
1935	      SCTP implementations must also support stream reconfiguration.
1936	      Finally, both communicating endpoints must be aware of this
1937	      intended multiplexing; [NEAT-flow-mapping] describes a way for a
1938	      Transport System to negotiate the stream mapping capability using
1939	      SCTP's adaptation layer indication, such that this functionality
1940	      would only take effect if both sides are aware of it.  The
1941	      first flow, for which the SCTP association has been created, will
1942	      always use stream id zero.  All additional flows are assigned to
1943	      unused stream ids in growing order.  To avoid a conflict when both
1944	      endpoints map new flows simultaneously, the peer which initiated
1945	      the transport connection will use even stream numbers whereas the
1946	      remote side will map its flows to odd stream numbers.  Both sides
1947	      maintain a status map of the assigned stream numbers.  Generally,
1948	      new streams must consume the lowest available (even or odd,
1949	      depending on the side) stream number; this rule is relevant when
1950	      lower numbers become available because Connection objects
1951	      associated with the streams are closed.

1953	   Initiate:  If this is the only Connection object that is assigned to
1954	      the SCTP association or stream mapping has not been negotiated,
1955	      CONNECT.SCTP is called.  Else, a new stream is used: if there are
1956	      enough streams available, "Initiate" is just a local operation
1957	      that assigns a new stream number to the Connection object.  The
1958	      number of streams is negotiated as a parameter of the prior
1959	      CONNECT.SCTP call, and it represents a trade-off between local
1960	      resource usage and the number of Connection objects that can be
1961	      mapped without requiring a reconfiguration signal.  When running
1962	      out of streams, ADD_STREAM.SCTP must be called.

1964	   InitiateWithSend:  If this is the only Connection object that is
1965	      assigned to the SCTP association or stream mapping has not been
1966	      negotiated, CONNECT.SCTP is called with the "user message"
1967	      parameter.  Else, a new stream is used (see "Initiate" for how to
1968	      handle running out of streams), and this just sends the first
1969	      message on a new stream.

1971	   Ready:  "Initiate" or "InitiateWithSend" returns without an error,
1972	      i.e. SCTP's four-way handshake has completed.  If an association
1973	      with the peer already exists, and stream mapping has been
1974	      negotiated and enough streams are available, a Connection Object
1975	      instantly becomes Ready after calling "Initiate" or
1976	      "InitiateWithSend".

1978	   InitiateError:  Failure of CONNECT.SCTP.

1980	   ConnectionError:  TIMEOUT.SCTP or ABORT-EVENT.SCTP.

1982	   Listen:  LISTEN.SCTP.  If an association with the peer already exists
1983	      and stream mapping has been negotiated, "Listen" just expects to
1984	      receive a new message on a new stream id (chosen in accordance
1985	      with the stream number assignment procedure described above).

1987	   ConnectionReceived:  LISTEN.SCTP returns without an error (a result
1988	      of successful CONNECT.SCTP from the peer), or, in case of stream
1989	      mapping, the first message has arrived on a new stream (in this
1990	      case, "Receive" is also invoked).

1992	   Clone:  Calling "Clone" on an SCTP association creates a new
1993	      Connection object and assigns it a new stream number in accordance
1994	      with the stream number assignment procedure described above.  If
1995	      there are not enough streams available, ADD_STREAM.SCTP must be
1996	      called.

1998	   Priority (Connection):  When this value is changed, or a Message with
1999	      Message Property "Priority" is sent, and there are multiple
2000	      Connection objects assigned to the same SCTP association,
2001	      CONFIGURE_STREAM_SCHEDULER.SCTP is called to adjust the priorities
2002	      of streams in the SCTP association.

2004	   Send:  SEND.SCTP.  Message Properties such as "Lifetime" and
2005	      "Ordered" map to parameters of this primitive.

2007	   Receive:  RECEIVE.SCTP.  The "partial flag" of RECEIVE.SCTP invokes a
2008	      "ReceivedPartial" event.

2010	   Close:  If this is the only Connection object that is assigned to the
2011	      SCTP association, CLOSE.SCTP is called.  Else, the Connection
2012	      object is one of several Connection objects assigned to the same
2013	      SCTP association, and RESET_STREAM.SCTP must be called, which
2014	      informs the peer that the stream will no longer be used for
2015	      mapping and can be used by future "Initiate", "InitiateWithSend"
2016	      or "Listen" calls.  At the peer, the event RESET_STREAM-EVENT.SCTP
2017	      will fire, which the peer must answer by issuing RESET_STREAM.SCTP
2018	      too.  The resulting local RESET_STREAM-EVENT.SCTP informs the
2019	      transport system that the stream number can now be re-used by the
2020	      next "Initiate", "InitiateWithSend" or "Listen" calls.

2022	   Abort:  If this is the only Connection object that is assigned to the
2023	      SCTP association, ABORT.SCTP is called.  Else, the Connection
2024	      object is one of several Connection objects assigned to the same
2025	      SCTP association, and shutdown proceeds as described under
2026	      "Close".
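
   The stream number assignment procedure described under "Connection
   Object" above can be sketched as follows.  The class name and its
   methods are hypothetical, and the sketch tracks only the numbers
   assigned by the local side; it does not perform any SCTP signaling.

```python
class StreamNumberMap:
    """Local view of which SCTP stream numbers carry Connections."""

    def __init__(self, is_initiator: bool, num_streams: int) -> None:
        # The side that initiated the association uses even stream
        # numbers (starting with zero); the remote side uses odd ones.
        self.parity = 0 if is_initiator else 1
        self.num_streams = num_streams
        self.in_use = set()

    def assign(self) -> int:
        # Always consume the lowest available number of our parity, so
        # numbers freed by closed Connections are reused first.
        for stream_id in range(self.parity, self.num_streams, 2):
            if stream_id not in self.in_use:
                self.in_use.add(stream_id)
                return stream_id
        raise LookupError("no free stream; ADD_STREAM.SCTP would be needed")

    def release(self, stream_id: int) -> None:
        # Called once the stream reset exchange has completed.
        self.in_use.discard(stream_id)
```

   In this sketch, running out of streams surfaces as an exception; a
   Transport System would instead call ADD_STREAM.SCTP and retry.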

2028	11.  IANA Considerations

2030	   RFC-EDITOR: Please remove this section before publication.

2032	   This document has no actions for IANA.

2034	12.  Security Considerations

2036	12.1.  Considerations for Candidate Gathering

2038	   Implementations should avoid downgrade attacks that allow network
2039	   interference to cause the implementation to select less secure, or
2040	   entirely insecure, combinations of paths and protocols.

2042	12.2.  Considerations for Candidate Racing

2044	   See Section 5.3 for security considerations around racing with 0-RTT
2045	   data.

2047	   An attacker that knows a particular device is racing several options
2048	   during connection establishment may be able to block packets for the
2049	   first connection attempt, thus inducing the device to fall back to a
2050	   secondary attempt.  This is a problem if the secondary attempts have
2051	   worse security properties that enable further attacks.
2052	   Implementations should ensure that all options have equivalent
2053	   security properties to avoid incentivizing attacks.

2055	   Since results from the network can determine how a connection attempt
2056	   tree is built, such as when DNS returns a list of resolved endpoints,
2057	   it is possible for the network to cause an implementation to consume
2058	   significant on-device resources.  Implementations should limit the
2059	   maximum amount of state allowed for any given node, including the
2060	   number of child nodes, especially when the state is based on results
2061	   from the network.
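
   One way to bound the state created from network results is to cap
   the number of child nodes derived from a single resolution, as in
   the sketch below.  The function name and the limit of 8 are
   hypothetical; a suitable limit depends on the implementation.

```python
from typing import List, Sequence

MAX_CHILDREN_PER_NODE = 8  # hypothetical cap, not taken from this document


def bounded_children(resolved: Sequence[str],
                     limit: int = MAX_CHILDREN_PER_NODE) -> List[str]:
    """Keep at most `limit` distinct candidates, in resolver order."""
    kept: List[str] = []
    seen = set()
    for endpoint in resolved:
        if len(kept) >= limit:
            # Ignore anything beyond the cap, however many results the
            # network returned.
            break
        if endpoint not in seen:
            seen.add(endpoint)
            kept.append(endpoint)
    return kept
```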

2063	13.  Acknowledgements

2065	   This work has received funding from the European Union's Horizon 2020
2066	   research and innovation programme under grant agreement No. 644334
2067	   (NEAT).

2069	   This work has been supported by Leibniz Prize project funds of DFG -
2070	   German Research Foundation: Gottfried Wilhelm Leibniz-Preis 2011 (FKZ
2071	   FE 570/4-1).

2073	   This work has been supported by the UK Engineering and Physical
2074	   Sciences Research Council under grant EP/R04144X/1.

2076	   This work has been supported by the Research Council of Norway under
2077	   its "Toppforsk" programme through the "OCARINA" project.

2079	   Thanks to Stuart Cheshire, Josh Graessley, David Schinazi, and Eric
2080	   Kinnear for their implementation and design efforts, including Happy
2081	   Eyeballs, that heavily influenced this work.

2083	14.  References

2085	14.1.  Normative References

2087	   [I-D.ietf-taps-arch]
2088	              Pauly, T., Trammell, B., Brunstrom, A., Fairhurst, G.,
2089	              Perkins, C., Tiesel, P., and C. Wood, "An Architecture for
2090	              Transport Services", Work in Progress, Internet-Draft,
2091	              draft-ietf-taps-arch-06, 23 December 2019,
2092	              <http://www.ietf.org/internet-drafts/draft-ietf-taps-arch-
2093	              06.txt>.

2095	   [I-D.ietf-taps-interface]
2096	              Trammell, B., Welzl, M., Enghardt, T., Fairhurst, G.,
2097	              Kuehlewind, M., Perkins, C., Tiesel, P., Wood, C., and T.
2098	              Pauly, "An Abstract Application Layer Interface to
2099	              Transport Services", Work in Progress, Internet-Draft,
2100	              draft-ietf-taps-interface-05, 4 November 2019,
2101	              <http://www.ietf.org/internet-drafts/draft-ietf-taps-
2102	              interface-05.txt>.

2104	   [I-D.ietf-taps-minset]
2105	              Welzl, M. and S. Gjessing, "A Minimal Set of Transport
2106	              Services for End Systems", Work in Progress, Internet-
2107	              Draft, draft-ietf-taps-minset-11, 27 September 2018,
2108	              <http://www.ietf.org/internet-drafts/draft-ietf-taps-
2109	              minset-11.txt>.

2111	   [RFC7413]  Cheng, Y., Chu, J., Radhakrishnan, S., and A. Jain, "TCP
2112	              Fast Open", RFC 7413, DOI 10.17487/RFC7413, December 2014,
2113	              <https://www.rfc-editor.org/info/rfc7413>.

2115	   [RFC7540]  Belshe, M., Peon, R., and M. Thomson, Ed., "Hypertext
2116	              Transfer Protocol Version 2 (HTTP/2)", RFC 7540,
2117	              DOI 10.17487/RFC7540, May 2015,
2118	              <https://www.rfc-editor.org/info/rfc7540>.

2120	   [RFC8260]  Stewart, R., Tuexen, M., Loreto, S., and R. Seggelmann,
2121	              "Stream Schedulers and User Message Interleaving for the
2122	              Stream Control Transmission Protocol", RFC 8260,
2123	              DOI 10.17487/RFC8260, November 2017,
2124	              <https://www.rfc-editor.org/info/rfc8260>.

2126	   [RFC8303]  Welzl, M., Tuexen, M., and N. Khademi, "On the Usage of
2127	              Transport Features Provided by IETF Transport Protocols",
2128	              RFC 8303, DOI 10.17487/RFC8303, February 2018,
2129	              <https://www.rfc-editor.org/info/rfc8303>.

2131	   [RFC8304]  Fairhurst, G. and T. Jones, "Transport Features of the
2132	              User Datagram Protocol (UDP) and Lightweight UDP (UDP-
2133	              Lite)", RFC 8304, DOI 10.17487/RFC8304, February 2018,
2134	              <https://www.rfc-editor.org/info/rfc8304>.

2136	   [RFC8305]  Schinazi, D. and T. Pauly, "Happy Eyeballs Version 2:
2137	              Better Connectivity Using Concurrency", RFC 8305,
2138	              DOI 10.17487/RFC8305, December 2017,
2139	              <https://www.rfc-editor.org/info/rfc8305>.

2141	   [RFC8446]  Rescorla, E., "The Transport Layer Security (TLS) Protocol
2142	              Version 1.3", RFC 8446, DOI 10.17487/RFC8446, August 2018,
2143	              <https://www.rfc-editor.org/info/rfc8446>.

2145	14.2.  Informative References

2147	   [I-D.ietf-quic-transport]
2148	              Iyengar, J. and M. Thomson, "QUIC: A UDP-Based Multiplexed
2149	              and Secure Transport", Work in Progress, Internet-Draft,
2150	              draft-ietf-quic-transport-27, 21 February 2020,
2151	              <http://www.ietf.org/internet-drafts/draft-ietf-quic-
2152	              transport-27.txt>.

2154	   [NEAT-flow-mapping]
2155	              "Transparent Flow Mapping for NEAT (in Workshop on Future
2156	              of Internet Transport (FIT 2017))", 2017.

2158	   [RFC5245]  Rosenberg, J., "Interactive Connectivity Establishment
2159	              (ICE): A Protocol for Network Address Translator (NAT)
2160	              Traversal for Offer/Answer Protocols", RFC 5245,
2161	              DOI 10.17487/RFC5245, April 2010,
2162	              <https://www.rfc-editor.org/info/rfc5245>.

2164	Appendix A.  Additional Properties

2166	   This appendix discusses implementation considerations for additional
2167	   parameters and properties that could be used to enhance transport
2168	   protocol and/or path selection, or the transmission of messages given
2169	   a Protocol Stack that implements them.  These are not part of the
2170	   interface, and may be removed from the final document, but are
2171	   presented here to support discussion within the TAPS working group as
2172	   to whether they should be added to a future revision of the base
2173	   specification.

2175	A.1.  Properties Affecting Sorting of Branches

2177	   In addition to the Protocol and Path Selection Properties discussed
2178	   in Section 4.3, the following properties under discussion can
2179	   influence branch sorting:

2181	   *  Bounds on Send or Receive Rate: If the application indicates a
2182	      bound on the expected Send or Receive bitrate, an implementation
2183	      may prefer a path that can likely provide the desired bandwidth,
2184	      based on cached maximum throughput, see Section 9.2.  The
2185	      application may know the Send or Receive Bitrate from metadata in
2186	      adaptive HTTP streaming, such as MPEG-DASH.

2188	   *  Cost Preferences: If the application indicates a preference to
2189	      avoid expensive paths, and some paths are associated with a
2190	      monetary cost, an implementation should decrease the ranking of
2191	      such paths.  If the application indicates that it prohibits using
2192	      expensive paths, paths that are associated with a cost should be
2193	      purged from the decision tree.
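
   The Cost Preferences behavior above might be sketched as follows.
   The Branch type and the preference names "ignore", "avoid", and
   "prohibit" are hypothetical.  Placing every expensive path after
   every inexpensive one is just one possible way to decrease the
   ranking of such paths.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Branch:
    name: str
    rank: int        # lower values are tried first
    expensive: bool  # True if the path has a monetary cost


def apply_cost_preference(branches: List[Branch],
                          preference: str) -> List[Branch]:
    """Sort or prune branches according to a cost preference (sketch)."""
    if preference not in ("ignore", "avoid", "prohibit"):
        raise ValueError(f"unknown preference: {preference}")
    if preference == "prohibit":
        # Purge paths that are associated with a cost from the tree.
        branches = [b for b in branches if not b.expensive]
    if preference == "avoid":
        # Demote expensive paths; rank still orders each group.
        return sorted(branches, key=lambda b: (b.expensive, b.rank))
    return sorted(branches, key=lambda b: b.rank)
```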

2195	Appendix B.  Reasons for errors

2197	   The Transport Services API [I-D.ietf-taps-interface] allows
2198	   several generic error types to specify a more detailed reason as to
2199	   why an error occurred.  This appendix lists some of the possible
2200	   reasons.

2202	   *  InvalidConfiguration: The transport properties and endpoints
2203	      provided by the application are either contradictory or
2204	      incomplete.  Examples include the lack of a remote endpoint on an
2205	      active open or using a multicast group address while not
2206	      requesting a unidirectional receive.

2208	   *  NoCandidates: The configuration is valid, but none of the
2209	      available transport protocols can satisfy the transport properties
2210	      provided by the application.

2212	   *  ResolutionFailed: The remote or local specifier provided by the
2213	      application cannot be resolved.

2215	   *  EstablishmentFailed: The TAPS system was unable to establish a
2216	      transport-layer connection to the remote endpoint specified by the
2217	      application.

2219	   *  PolicyProhibited: The system policy prevents the transport system
2220	      from performing the action requested by the application.

2222	   *  NotCloneable: The protocol stack is not capable of being cloned.

2224	   *  MessageTooLarge: The message size is too big for the transport
2225	      system to handle.

2227	   *  ProtocolFailed: The underlying protocol stack failed.

2229	   *  InvalidMessageProperties: The message properties are either
2230	      contradictory to the transport properties or they cannot be
2231	      satisfied by the transport system.

2233	   *  DeframingFailed: The data that was received by the underlying
2234	      protocol stack could not be deframed.

2236	   *  ConnectionAborted: The connection was aborted by the peer.

2238	   *  Timeout: Delivery of a message was not possible after a timeout.

2240	Appendix C.  Existing Implementations

2242	   This appendix gives an overview of existing implementations, at the
2243	   time of writing, of transport systems that are (to some degree) in
2244	   line with this document.

2246	   *  Apple's Network.framework:

2248	      -  Network.framework is a transport-level API built for C,
2249	         Objective-C, and Swift.  It is a connect-by-name API that
2250	         supports transport security protocols.  It provides userspace
2251	         implementations of TCP, UDP, TLS, DTLS, proxy protocols, and
2252	         allows extension via custom framers.

2254	      -  Documentation:
2255	         https://developer.apple.com/documentation/network

2257	   *  NEAT:

2259	      -  NEAT is the output of the European H2020 research project
2260	         "NEAT"; it is a user-space library for protocol-independent
2261	         communication on top of TCP, UDP and SCTP, with many more
2262	         features such as a policy manager.

2264	      -  Code:
2265	         https://github.com/NEAT-project/neat

2267	      -  NEAT project:
2268	         https://www.neat-project.org

2270	   *  PyTAPS:

2272	      -  A TAPS implementation based on Python asyncio, offering
2273	         protocol-independent communication to applications on top of
2274	         TCP, UDP and TLS, with support for multicast.

2276	      -  Code:
2277	         https://github.com/fg-inet/python-asyncio-taps

2279	Authors' Addresses

2281	   Anna Brunstrom (editor)
2282	   Karlstad University
2283	   Universitetsgatan 2
2284	   SE- 651 88 Karlstad
2285	   Sweden

2287	   Email: anna.brunstrom@kau.se

2289	   Tommy Pauly (editor)
2290	   Apple Inc.
2291	   One Apple Park Way
2292	   Cupertino, California 95014,
2293	   United States of America

2295	   Email: tpauly@apple.com

2296	   Theresa Enghardt
2297	   TU Berlin
2298	   Marchstrasse 23
2299	   10587 Berlin
2300	   Germany

2302	   Email: theresa@inet.tu-berlin.de

2304	   Karl-Johan Grinnemo
2305	   Karlstad University
2306	   Universitetsgatan 2
2307	   SE- 651 88 Karlstad
2308	   Sweden

2310	   Email: karl-johan.grinnemo@kau.se

2312	   Tom Jones
2313	   University of Aberdeen
2314	   Fraser Noble Building
2315	   Aberdeen, AB24 3UE
2316	   United Kingdom

2318	   Email: tom@erg.abdn.ac.uk

2320	   Philipp S. Tiesel
2321	   TU Berlin
2322	   Einsteinufer 25
2323	   10587 Berlin
2324	   Germany

2326	   Email: philipp@tiesel.net

2328	   Colin Perkins
2329	   University of Glasgow
2330	   School of Computing Science
2331	   Glasgow G12 8QQ
2332	   United Kingdom

2334	   Email: csp@csperkins.org

2336	   Michael Welzl
2337	   University of Oslo
2338	   PO Box 1080 Blindern
2339	   0316  Oslo
2340	   Norway

2342	   Email: michawe@ifi.uio.no