2	Internet Engineering Task Force                                M. Scharf
3	Internet-Draft                                  Alcatel-Lucent Bell Labs
4	Intended status: Informational                                   A. Ford
5	Expires: August 19, 2012                                           Cisco
6	                                                       February 16, 2012

8	               MPTCP Application Interface Considerations
9	                        draft-ietf-mptcp-api-04

11	Abstract

13	   Multipath TCP (MPTCP) adds the capability of using multiple paths to
14	   a regular TCP session.  Even though it is designed to be fully
15	   backward compatible with applications, the data transport differs
16	   compared to regular TCP, and there are several additional degrees of
17	   freedom that applications may wish to exploit.  This document
18	   summarizes the impact that MPTCP may have on applications, such as
19	   changes in performance.  Furthermore, it discusses compatibility
20	   issues of MPTCP in combination with non-MPTCP-aware applications.
21	   Finally, the document describes a basic application interface for
22	   MPTCP-aware applications that provides access to multipath address
23	   information and a level of control equivalent to regular TCP.

25	Status of This Memo

27	   This Internet-Draft is submitted in full conformance with the
28	   provisions of BCP 78 and BCP 79.

30	   Internet-Drafts are working documents of the Internet Engineering
31	   Task Force (IETF).  Note that other groups may also distribute
32	   working documents as Internet-Drafts.  The list of current Internet-
33	   Drafts is at http://datatracker.ietf.org/drafts/current/.

35	   Internet-Drafts are draft documents valid for a maximum of six months
36	   and may be updated, replaced, or obsoleted by other documents at any
37	   time.  It is inappropriate to use Internet-Drafts as reference
38	   material or to cite them other than as "work in progress."

40	   This Internet-Draft will expire on August 19, 2012.

42	Copyright Notice

44	   Copyright (c) 2012 IETF Trust and the persons identified as the
45	   document authors.  All rights reserved.

47	   This document is subject to BCP 78 and the IETF Trust's Legal
48	   Provisions Relating to IETF Documents
49	   (http://trustee.ietf.org/license-info) in effect on the date of
50	   publication of this document.  Please review these documents
51	   carefully, as they describe your rights and restrictions with respect
52	   to this document.  Code Components extracted from this document must
53	   include Simplified BSD License text as described in Section 4.e of
54	   the Trust Legal Provisions and are provided without warranty as
55	   described in the Simplified BSD License.

57	Table of Contents

59	   1.  Introduction
60	   2.  Terminology
61	   3.  Comparison of MPTCP and Regular TCP
62	     3.1.  Performance Impact
63	       3.1.1.  Throughput
64	       3.1.2.  Delay
65	       3.1.3.  Resilience
66	     3.2.  Potential Problems
67	       3.2.1.  Impact of Middleboxes
68	       3.2.2.  Outdated Implicit Assumptions
69	       3.2.3.  Security Implications
70	   4.  Operation of MPTCP with Legacy Applications
71	     4.1.  Overview of the MPTCP Network Stack
72	     4.2.  Address Issues
73	       4.2.1.  Specification of Addresses by Applications
74	       4.2.2.  Querying of Addresses by Applications
75	     4.3.  Socket Option Issues
76	       4.3.1.  General Guideline
77	       4.3.2.  Disabling of the Nagle Algorithm
78	       4.3.3.  Buffer Sizing
79	       4.3.4.  Other Socket Options
80	     4.4.  Default Enabling of MPTCP
81	     4.5.  Summary of Advices to Application Developers
82	   5.  Basic API for MPTCP-aware Applications
83	     5.1.  Design Considerations
84	     5.2.  Requirements on the Basic MPTCP API
85	     5.3.  Sockets Interface Extensions by the Basic MPTCP API
86	       5.3.1.  Overview
87	       5.3.2.  Enabling and Disabling of MPTCP
88	       5.3.3.  Binding MPTCP to Specified Addresses
89	       5.3.4.  Querying the MPTCP Subflow Addresses
90	       5.3.5.  Getting a Unique Connection Identifier
91	   6.  Other Compatibility Issues
92	     6.1.  Usage of the SCTP Socket API
93	     6.2.  Incompatibilities with other Multihoming Solutions
94	     6.3.  Interactions with DNS
95	   7.  Security Considerations
96	   8.  IANA Considerations
97	   9.  Conclusion
98	   10. Acknowledgments
99	   11. References
100	     11.1. Normative References
101	     11.2. Informative References
102	   Appendix A.  Requirements on a Future Advanced MPTCP API
103	     A.1.  Design Considerations
104	     A.2.  MPTCP Usage Scenarios and Application Requirements
105	     A.3.  Potential Requirements on an Advanced MPTCP API
106	     A.4.  Integration with the SCTP Socket API
107	   Appendix B.  Change History of the Document

109	1.  Introduction

111	   Multipath TCP adds the capability of using multiple paths to a
112	   regular TCP session [1].  The motivations for this extension include
113	   increasing throughput, overall resource utilisation, and resilience
114	   to network failure, and these motivations are discussed, along with
115	   high-level design decisions, as part of the Multipath TCP
116	   architecture [4].  The MPTCP protocol [5] offers the same reliable,
117	   in-order, byte-stream transport as TCP, and is designed to be
118	   backward compatible with both applications and the network layer.  It
119	   requires support inside the network stack of both endpoints.

121	   This document first presents the impacts that MPTCP may have on
122	   applications, such as performance changes compared to regular TCP.
123	   Second, it defines the interoperation of MPTCP and applications that
124	   are unaware of the multipath transport.  MPTCP is designed to be
125	   usable without any application changes, but some compatibility issues
126	   have to be taken into account.  Third, this memo specifies a basic
127	   Application Programming Interface (API) for MPTCP-aware applications.
128	   The API presented here is an extension to the regular TCP API to
129	   allow an MPTCP-aware application the equivalent level of control and
130	   access to information of an MPTCP connection that would be possible
131	   with the standard TCP API on a regular TCP connection.

133	   An advanced API for MPTCP is outside the scope of this document.
134	   Such an advanced API could offer a more fine-grained control over
135	   multipath transport functions and policies.  The appendix includes a
136	   brief, non-compulsory list of potential features of such an advanced
137	   API.

139	   The de facto standard API for TCP/IP applications is the "sockets"
140	   interface.  This document provides an abstract definition of MPTCP-
141	   specific extensions to this interface.  These are operations that can
142	   be used by an application to get or set additional MPTCP-specific
143	   information on a socket, in order to provide an equivalent level of
144	   information and control over MPTCP as exists for an application using
145	   regular TCP.  It is up to the applications, high-level programming
146	   languages, or libraries to decide whether to use these optional
147	   extensions.  For instance, an application may want to turn on or off
148	   the MPTCP mechanism for certain data transfers, or limit its use to
149	   certain interfaces.  The abstract specification is in line with the
150	   POSIX standard [17] as much as possible.

152	   There are also various related extensions of the sockets interface:
153	   [11] specifies sockets API extensions for a multihoming shim layer.
154	   The API enables interactions between applications and the multihoming
155	   shim layer for advanced locator management and for access to
156	   information about failure detection and path exploration.

158	   Experimental extensions to the sockets API are also defined for the
159	   Host Identity Protocol (HIP) [12] in order to manage the bindings of
160	   identifiers and locators.  Further related API extensions exist for
161	   IPv6 [9], Mobile IP [10], and SCTP [13].  There can be interactions
162	   or incompatibilities of these APIs with MPTCP, which are discussed
163	   later in this document.

165	   Some network stack implementations, especially on mobile devices,
166	   have centralized connection managers or other higher-level APIs to
167	   solve multi-interface issues, as surveyed in [15].  Their interaction
168	   with MPTCP is outside the scope of this note.

170	   The target readers of this document are application developers whose
171	   software may benefit significantly from MPTCP.  This document also
172	   provides the necessary information for developers of MPTCP to
173	   implement the API in a TCP/IP network stack.

175	2.  Terminology

177	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
178	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
179	   document are to be interpreted as described in [3].

181	   This document uses the MPTCP terminology introduced in [5].

183	   Concerning the API towards applications, the following terms are
184	   distinguished:

186	   o  Legacy API: The interface towards TCP that is currently used by
187	      applications.  This document explains the impact of MPTCP for such
188	      applications, as well as resulting issues.

190	   o  Basic API: A simple extension of TCP's interface for applications
191	      that are aware of MPTCP.  This document abstractly describes this
192	      interface, which provides access to multipath address information
193	      and a level of control equivalent to regular TCP.

195	   o  Advanced API: An API that offers more fine-grained control over
196	      the MPTCP behaviour.  Its detailed specification is outside the
197	      scope of this document.

199	3.  Comparison of MPTCP and Regular TCP

201	   This section discusses the impact that the use of MPTCP will have on
202	   applications, in comparison to what may be expected from the use of
203	   regular TCP.

205	3.1.  Performance Impact

207	   One of the key goals of adding multipath capability to TCP is to
208	   improve the performance of a transport connection by load
209	   distribution over separate subflows across potentially disjoint
210	   paths.  Furthermore, it is an explicit goal of MPTCP that it should
211	   not provide a worse-performing connection than would have existed
212	   through the use of single-path TCP.  A corresponding congestion
213	   control algorithm is described in [7].  The following sections
214	   summarize the performance impact of MPTCP as seen by an application.

216	3.1.1.  Throughput

218	   The most obvious performance improvement that will be gained with the
219	   use of MPTCP is an increase in throughput, since MPTCP will pool more
220	   than one path (where available) between two endpoints.  This will
221	   provide greater bandwidth for an application.  If there are shared
222	   bottlenecks between the flows, then the congestion control algorithms
223	   will ensure that load is evenly spread amongst regular and multipath
224	   TCP sessions, so that no end user receives worse performance than
225	   single-path TCP.

227	   This performance increase additionally means that an MPTCP session
228	   could achieve throughput that is greater than the capacity of a
229	   single interface on the device.  If any applications make assumptions
230	   about interfaces due to throughput (or vice versa), they must take
231	   this into account (although an MPTCP implementation must always
232	   respect an application's request for a particular interface).

234	   Furthermore, the flexibility of MPTCP to add and remove subflows as
235	   paths change availability could lead to a greater variation, and more
236	   frequent change, in connection bandwidth.  Applications that adapt to
237	   available bandwidth (such as video and audio streaming) may need to
238	   adjust some of their assumptions to most effectively take this into
239	   account.

241	   The transport of MPTCP signaling information results in a small
242	   overhead.  If multiple subflows share the same bottleneck, this
243	   overhead slightly reduces the capacity that is available for data
244	   transport.  Yet, this potential reduction of throughput will be
245	   negligible in many usage scenarios, and the protocol contains
246	   optimisations in its design so that this overhead is minimal.

248	3.1.2.  Delay

250	   If the delays on the constituent subflows of an MPTCP connection
251	   differ, the jitter perceivable to an application may appear higher as
252	   the data is spread across the subflows.  Although MPTCP will ensure
253	   in-order delivery to the application, the application must be able to
254	   cope with the data delivery being burstier than may be usual with
255	   single-path TCP.  Since burstiness is commonplace on the Internet
256	   today, it is unlikely that applications will suffer from such an
257	   impact on the traffic profile, but application authors may wish to
258	   consider this in future development.

260	   In addition, applications that make round trip time (RTT) estimates
261	   at the application level may have some issues.  Whilst the average
262	   delay calculated will be accurate, whether this is useful for an
263	   application will depend on what it requires this information for.  If
264	   a new application wishes to derive such information, it should
265	   consider how multiple subflows may affect its measurements, and thus
266	   how it may wish to respond.  In such a case, an application may wish
267	   to express its scheduling preferences, as described later in this
268	   document.

270	3.1.3.  Resilience

272	   The use of multiple subflows simultaneously means that, if one should
273	   fail, all traffic will move to the remaining subflow(s), and
274	   additionally any lost packets can be retransmitted on these subflows.

276	   Subflow failure may be caused by issues within the network, which an
277	   application would be unaware of, or interface failure on the node.
278	   An application may, under certain circumstances, be in a position to
279	   be aware of such failure (e.g. by radio signal strength, or simply an
280	   interface enabled flag), and so must not make assumptions about an
281	   MPTCP flow's stability based on this.  An MPTCP implementation must
282	   never override an application's request for a given interface,
283	   however, so the cases where this issue may be applicable are limited.

285	3.2.  Potential Problems

287	3.2.1.  Impact of Middleboxes

289	   MPTCP has been designed in order to pass through the majority of
290	   middleboxes.  Empirical evidence suggests that new TCP options can
291	   successfully be used on most paths in the Internet.  Nevertheless
292	   some middleboxes may still refuse to pass MPTCP messages due to the
293	   presence of TCP options, or they may strip TCP options.  If this is
294	   the case, MPTCP should fall back to regular TCP.  Although this will
295	   not create a problem for the application (its communication will be
296	   set up either way), there may be additional (and indeed, user-
297	   perceivable) delay while the first handshake fails.  Therefore, an
298	   alternative approach could be to try both MPTCP and regular TCP
299	   connection attempts at the same time, and respond to whichever
300	   replies first (or apply a timeout on the MPTCP attempt, while having
301	   TCP SYN/ACK ready to reply to, thus reducing the setup delay by a
302	   RTT) in a similar fashion to the "Happy Eyeballs" proposal for IPv6
303	   [16].

305	   An MPTCP implementation can learn the rate of MPTCP connection
306	   attempt successes or failures to particular hosts or networks, and on
307	   particular interfaces, and could therefore learn heuristics of when
308	   and when not to use MPTCP.  A detailed discussion of the various
309	   fallback mechanisms, for failures occurring at different points in
310	   the connection, is presented in [5].
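
   The learning of success rates described above could be sketched as
   follows.  This is a minimal illustration only: the structure, the
   function names, and both thresholds are assumptions made for this
   example and are not defined by MPTCP or by this document.

```c
#include <stdbool.h>

/* Hypothetical per-destination record; not part of any MPTCP specification. */
struct mptcp_attempt_stats {
    unsigned attempts;   /* MPTCP connection attempts to this destination */
    unsigned successes;  /* attempts that did not fall back to regular TCP */
};

/* Record the outcome of one MPTCP connection attempt. */
static void mptcp_stats_record(struct mptcp_attempt_stats *s, bool success)
{
    s->attempts++;
    if (success)
        s->successes++;
}

/*
 * Decide whether to try MPTCP again.  With fewer than min_samples
 * observations the function stays optimistic; otherwise it requires the
 * observed success rate to reach min_percent.  Both thresholds are
 * illustrative assumptions.
 */
static bool mptcp_should_attempt(const struct mptcp_attempt_stats *s,
                                 unsigned min_samples, unsigned min_percent)
{
    if (s->attempts < min_samples)
        return true;
    return s->successes * 100u >= s->attempts * min_percent;
}
```

   A real stack would likely keep such records per host, network, and
   interface, and would age out old observations; the sketch omits both.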

312	   There may also be middleboxes that transparently change the length of
313	   content.  If such middleboxes are present, MPTCP's reassembly of the
314	   byte stream in the receiver is difficult.  Still, MPTCP can detect
315	   such middleboxes and then fall back to regular TCP.  An overview of
316	   the impact of middleboxes is presented in [4] and MPTCP's mechanisms
317	   to work around these are presented and discussed in [5].

319	   MPTCP can also have other unexpected implications.  For instance,
320	   intrusion detection systems could be triggered.  A full analysis of
321	   MPTCP's impact on such middleboxes is for further study after
322	   deployment experiments.

324	3.2.2.  Outdated Implicit Assumptions

326	   In regular TCP, there is a one-to-one mapping of the socket interface
327	   to a flow through a network.  Since MPTCP can make use of multiple
328	   subflows, applications cannot implicitly rely on this one-to-one
329	   mapping any more.  Applications that require the transport along a
330	   single path can disable the use of MPTCP as described later in this
331	   document.  Examples include monitoring tools that want to measure the
332	   available bandwidth on a path, or routing protocols such as BGP that
333	   require the use of a specific link.

335	   Furthermore, an implementation may choose to persist an MPTCP
336	   connection even if an IP address is not allocated any more to a host,
337	   depending on the policy concerning the first subflow (fate-sharing,
338	   see Section 4.2.2).  In this case, the IP address exposed to an
339	   MPTCP-unaware application can differ from the addresses actually
340	   being used by MPTCP.  It is even possible that an IP address gets
341	   assigned to another host during the lifetime of an MPTCP connection.

343	3.2.3.  Security Implications

345	   The support for multiple IP addresses within one MPTCP connection can
346	   result in additional security vulnerabilities, such as possibilities
347	   for attackers to hijack connections.  The protocol design of MPTCP
348	   minimizes this risk.  An attacker on one of the paths can cause harm,
349	   but this is hardly an additional security risk compared to single-
350	   path TCP, which is vulnerable to man-in-the-middle attacks, too.  A
351	   detailed threat analysis of MPTCP is published in [6].

353	4.  Operation of MPTCP with Legacy Applications

355	4.1.  Overview of the MPTCP Network Stack

357	   MPTCP is an extension of TCP, but it is designed to be backward
358	   compatible for legacy (MPTCP-unaware) applications.  TCP interacts
359	   with other parts of the network stack by different interfaces.  The
360	   de facto standard API between TCP and applications is the sockets
361	   interface.  The position of MPTCP in the protocol stack is
362	   illustrated in Figure 1.

364	                      +-------------------------------+
365	                      |           Application         |
366	                      +-------------------------------+
367	                             ^                 |
368	                  ~~~~~~~~~~~|~Socket Interface|~~~~~~~~~~~
369	                             |                 v
370	                      +-------------------------------+
371	                      |             MPTCP             |
372	                      + - - - - - - - + - - - - - - - +
373	                      | Subflow (TCP) | Subflow (TCP) |
374	                      +-------------------------------+
375	                      |       IP      |      IP       |
376	                      +-------------------------------+

378	                      Figure 1: MPTCP protocol stack

380	   In general, MPTCP can affect all interfaces that make assumptions
381	   about the coupling of a TCP connection to a single IP address and TCP
382	   port pair, to one sockets endpoint, to one network interface, or to a
383	   given path through the network.

385	   This means that there are two classes of applications:

387	   o  Legacy applications: These applications are unaware of MPTCP and
388	      use the existing API towards TCP without any changes.  This is the
389	      default case.

391	   o  MPTCP-aware applications: These applications indicate support for
392	      an enhanced MPTCP interface.  This document specifies a minimum
393	      set of API extensions for such applications.

395	   The following text discusses to what extent MPTCP affects legacy
396	   applications using the existing sockets API.  The existing sockets
397	   API implies that applications deal with data structures that store,
398	   amongst others, the IP addresses and TCP port numbers of a TCP
399	   connection.  A design objective of MPTCP is that legacy applications
400	   can continue to use the established sockets API without any changes.
401	   However, in MPTCP there is a one-to-many mapping between the socket
402	   endpoint and the subflows.  This has several subtle implications for
403	   legacy applications using sockets API functions.

405	4.2.  Address Issues

407	4.2.1.  Specification of Addresses by Applications

409	   During binding, an application can either select a specific address,
410	   or bind to INADDR_ANY.  Furthermore, on some systems other socket
411	   options (e.g., SO_BINDTODEVICE) can be used to bind to a specific
412	   interface.  If an application uses a specific address or binds to a
413	   specific interface, then MPTCP MUST respect this and not interfere in
414	   the application's choices.  The binding to a specific address or
415	   interface implies that the application is not aware of MPTCP and will
416	   disable the use of MPTCP on this connection.  An application that
417	   wishes to bind to a specific set of addresses with MPTCP must use
418	   multipath-aware calls to achieve this (as described in
419	   Section 5.3.3).

421	   If an application binds to INADDR_ANY, it is assumed that the
422	   application does not care which addresses to use locally.  In this
423	   case, a local policy MAY allow MPTCP to automatically set up multiple
424	   subflows on such a connection.

426	   The basic sockets API for MPTCP-aware applications allows them to
427	   express further preferences in an MPTCP-compatible way (e.g., binding
428	   to a subset of interfaces only).
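
   The distinction between a wildcard bind and a bind to a specific
   address can be sketched in C as follows.  make_bind_addr() only
   prepares the argument that a legacy application would pass to bind();
   legacy_bind_permits_mptcp() is a hypothetical stack-side check that
   mirrors the rule stated in this section.  Both names are assumptions
   for this example, and the sketch covers IPv4 only.

```c
#include <stdbool.h>
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Fill an IPv4 socket address; a NULL ip selects the wildcard INADDR_ANY. */
static int make_bind_addr(struct sockaddr_in *sa, const char *ip,
                          unsigned short port)
{
    memset(sa, 0, sizeof(*sa));
    sa->sin_family = AF_INET;
    sa->sin_port = htons(port);
    if (ip == NULL) {
        sa->sin_addr.s_addr = htonl(INADDR_ANY);
        return 0;
    }
    /* inet_pton() returns 1 only if the text is a valid IPv4 address. */
    return inet_pton(AF_INET, ip, &sa->sin_addr) == 1 ? 0 : -1;
}

/*
 * Hypothetical stack-side check: a legacy bind() leaves room for MPTCP
 * only if it used the wildcard address.
 */
static bool legacy_bind_permits_mptcp(const struct sockaddr_in *sa)
{
    return sa->sin_addr.s_addr == htonl(INADDR_ANY);
}
```

   An application would pass the prepared structure to bind() as usual;
   whether multiple subflows are then actually set up remains subject to
   local policy.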

430	4.2.2.  Querying of Addresses by Applications

432	   Applications can use the getpeername() or getsockname() functions in
433	   order to retrieve the IP address of the peer or of the local socket.
434	   These functions can be used for various purposes, including security
435	   mechanisms, geo-location, or interface checks.  The socket API was
436	   designed with an assumption that a socket is using just one address,
437	   and since this address is visible to the application, the application
438	   may assume that the information provided by the functions is the same
439	   during the lifetime of a connection.  However, in MPTCP, unlike in
440	   TCP, there is a one-to-many mapping of a connection to subflows, and
441	   subflows can be added and removed while the connection continues to
442	   exist.  Therefore, MPTCP cannot expose addresses by getpeername() or
443	   getsockname() that are both valid and constant during the
444	   connection's lifetime.

446	   This problem is addressed as follows: If used by a legacy
447	   application, the MPTCP stack MUST always return the addresses of the
448	   first subflow of an MPTCP connection, in all circumstances, even if
449	   that particular subflow is no longer in use.

451	   As this address may not be valid any more if the first subflow is
452	   closed, the MPTCP stack MAY close the whole MPTCP connection if the
453	   first subflow is closed (i.e. fate sharing between the initial
454	   subflow and the MPTCP connection as a whole).  Whether to close the
455	   whole MPTCP connection by default SHOULD be controlled by a local
456	   policy.  Further experiments are needed to investigate its
457	   implications.

459	   The functions getpeername() and getsockname() SHOULD also always
460	   return the addresses of the first subflow if the socket is used by an
461	   MPTCP-aware application, in order to be consistent with MPTCP-unaware
462	   applications, and, e.g., also with SCTP.  Instead of getpeername()
463	   or getsockname(), MPTCP-aware applications can use new API calls,
464	   documented later, in order to retrieve the full list of address pairs
465	   for the subflows in use.
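
   The address rule of this section can be illustrated with a small
   model.  The code does not call into any real MPTCP stack: the
   structures and the model_* functions are assumptions for this example
   and only show how a stack could keep returning the address of the
   first subflow to getsockname() while offering the list of subflows in
   use through a separate call.

```c
#include <stdbool.h>
#include <stddef.h>
#include <arpa/inet.h>
#include <netinet/in.h>

#define MODEL_MAX_SUBFLOWS 8

/* Hypothetical model of one subflow's address pair and state. */
struct model_subflow {
    struct sockaddr_in local;
    struct sockaddr_in peer;
    bool in_use;
};

/* Hypothetical model of a connection; subflows[0] is the first subflow. */
struct model_connection {
    struct model_subflow subflows[MODEL_MAX_SUBFLOWS];
    size_t count;
};

/* Legacy getsockname() view: always the first subflow, even if closed. */
static const struct sockaddr_in *
model_getsockname(const struct model_connection *c)
{
    return c->count > 0 ? &c->subflows[0].local : NULL;
}

/* MPTCP-aware view: copy out the local addresses of subflows in use. */
static size_t model_active_locals(const struct model_connection *c,
                                  struct sockaddr_in *out, size_t max)
{
    size_t n = 0;
    for (size_t i = 0; i < c->count && n < max; i++) {
        if (c->subflows[i].in_use)
            out[n++] = c->subflows[i].local;
    }
    return n;
}
```

   In this model, closing the first subflow changes the result of
   model_active_locals() but not that of model_getsockname(), which is
   the behavior a legacy application relies on.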

467	4.3.  Socket Option Issues

469	4.3.1.  General Guideline

471	   The existing sockets API includes options that modify the behavior of
472	   sockets and their underlying communications protocols.  Various
473	   socket options exist at the socket, TCP, and IP levels.  The value of
474	   an option can usually be set by the setsockopt() system function.
475	   The getsockopt() function gets information.  In general, the existing
476	   sockets interface functions cannot configure each MPTCP subflow
477	   individually.  In order to be backward compatible, existing APIs
478	   therefore SHOULD apply to all subflows within one connection, as far
479	   as possible.

481	4.3.2.  Disabling of the Nagle Algorithm

483	   One commonly used TCP socket option (TCP_NODELAY) disables the Nagle
484	   algorithm as described in [2].  This option is also specified in the
485	   POSIX standard [17].  Applications can use this option in combination
486	   with MPTCP exactly in the same way.  It then SHOULD disable the Nagle
487	   algorithm for the MPTCP connection, i.e., all subflows.

489	   In addition, the MPTCP protocol instance MAY use a different path
490	   scheduler algorithm if TCP_NODELAY is present.  For instance, it
491	   could use an algorithm that is optimized for latency-sensitive
492	   traffic.  Specific algorithms are outside the scope of this document.
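
   For reference, a legacy application typically sets and queries
   TCP_NODELAY as sketched below, using only the standard setsockopt()
   and getsockopt() calls.  The helper names are chosen for this
   example.  With MPTCP the calls stay the same; that the setting then
   applies to all subflows is the recommendation of this section, not
   something the code itself can verify.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Disable the Nagle algorithm on a TCP socket; 0 on success, -1 on error. */
static int disable_nagle(int fd)
{
    int on = 1;
    return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &on, sizeof(on));
}

/* Report whether TCP_NODELAY is set: 1 if set, 0 if not, -1 on error. */
static int nagle_disabled(int fd)
{
    int value = 0;
    socklen_t len = sizeof(value);
    if (getsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &value, &len) != 0)
        return -1;
    return value != 0;
}
```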

494	4.3.3.  Buffer Sizing

496	   Applications can explicitly configure send and receive buffer sizes
497	   by the sockets API (SO_SNDBUF, SO_RCVBUF).  These socket options can
498	   also be used in combination with MPTCP and then affect the buffer
499	   size of the MPTCP connection.  However, when defining buffer sizes,
500	   application programmers should take into account that the transport
501	   over several subflows requires a certain amount of buffer for
502	   resequencing in the receiver.  MPTCP may also require more storage
503	   space in the sender, in particular, if retransmissions are sent over
504	   more than one path.  In addition, very small send buffers may prevent
505	   MPTCP from efficiently scheduling data over different subflows.
506	   Therefore, it does not make sense to use MPTCP in combination with
507	   small send or receive buffers.

509	   An MPTCP implementation MAY set a lower bound for send and receive
510	   buffers and treat a small buffer size request as an implicit request
511	   not to use MPTCP.
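
   The following sketch shows how an application requests and reads back
   a send buffer size with the standard socket options, together with a
   hypothetical stack policy that treats small buffers as a request not
   to use MPTCP.  The function names and the threshold parameter are
   assumptions for this example; this document does not define a
   specific lower bound.

```c
#include <stdbool.h>
#include <sys/socket.h>

/* Request a send buffer size; the stack may round or clamp the value. */
static int request_send_buffer(int fd, int bytes)
{
    return setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &bytes, sizeof(bytes));
}

/* Read back the send buffer size in effect, or -1 on error. */
static int effective_send_buffer(int fd)
{
    int bytes = 0;
    socklen_t len = sizeof(bytes);
    if (getsockopt(fd, SOL_SOCKET, SO_SNDBUF, &bytes, &len) != 0)
        return -1;
    return bytes;
}

/*
 * Hypothetical stack policy: treat buffers below min_bytes as an implicit
 * request not to use MPTCP.  The threshold is an illustrative assumption.
 */
static bool buffers_permit_mptcp(int snd_bytes, int rcv_bytes, int min_bytes)
{
    return snd_bytes >= min_bytes && rcv_bytes >= min_bytes;
}
```

   Because operating systems may adjust the requested size, an
   application that cares about the result should read the value back
   rather than assume the request was honored exactly.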

513	4.3.4.  Other Socket Options

515	   Some network stacks also provide other implementation-specific socket
516	   options or interfaces that affect TCP's behavior.  If a network stack
517	   supports MPTCP, it must be ensured that these options do not
518	   interfere.

520	4.4.  Default Enabling of MPTCP

522	   It is up to a local policy at the end system whether a network stack
523	   should automatically enable MPTCP for sockets even if there is no
524	   explicit sign of MPTCP awareness of the corresponding application.
525	   Such a choice may be under the control of the user through system
526	   preferences.

528	   The enabling of MPTCP, either by application or by system defaults,
529	   does not necessarily mean that MPTCP will always be used.  Both
530	   endpoints must support MPTCP, and there must be multiple addresses at
531	   at least one endpoint, for MPTCP to be used.  Even if those
532	   requirements are met, however, MPTCP may not be immediately used on a
533	   connection.  It may make sense for multiple paths to be brought into
534	   operation only after a given period of time, or if the connection is
535	   saturated.
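
   The two preconditions named above can be written as a simple
   predicate.  The structure and the function name are assumptions for
   this example.  A true result only means that MPTCP is possible; as
   noted above, a stack may still defer the use of additional paths.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of what a stack knows about one endpoint. */
struct endpoint_info {
    bool supports_mptcp;
    size_t address_count;
};

/* True if both endpoints support MPTCP and at least one is multi-addressed. */
static bool mptcp_possible(const struct endpoint_info *local,
                           const struct endpoint_info *remote)
{
    if (!local->supports_mptcp || !remote->supports_mptcp)
        return false;
    return local->address_count > 1 || remote->address_count > 1;
}
```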

537	4.5.  Summary of Advices to Application Developers

539	   o  Using the default MPTCP configuration: Like TCP, MPTCP is designed
540	      to be efficient and robust in the default configuration.
541	      Application developers should not explicitly configure TCP (or
542	      MPTCP) features unless this is really needed.

544	   o  Socket buffer dimensioning: Multipath transport requires larger
545	      buffers in the receiver for resequencing, as already explained.
546	      Applications should use reasonable buffer sizes (such as the
547	      operating system default values) in order to fully benefit from
548	      MPTCP.  A full discussion of buffer sizing issues is given in [5].

550	   o  Facilitating stack-internal heuristics: The path management and
551	      data scheduling by MPTCP is realized by stack-internal algorithms
552	      that may implicitly try to self-optimize their behavior according
553	      to assumed application needs.  For instance, an MPTCP
554	      implementation may use heuristics to determine whether an
555	      application requires delay-sensitive or bulk data transport, using
556	      for instance port numbers, the TCP_NODELAY socket option, or the
557	      application's read/write patterns as input parameters.  An
558	      application developer can facilitate the operation of such
559	      heuristics by avoiding atypical interface use cases.  For
560	      instance, for long bulk data transfers, it makes sense neither
561	      to enable the TCP_NODELAY socket option nor to issue many
562	      consecutive socket "send()" calls that each carry only a small
563	      amount of data.
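
   The last piece of advice can be illustrated with a small helper that
   coalesces many small records into a few large writes and leaves
   TCP_NODELAY at its default.  The helper name and the chunk size are
   hypothetical; this is a sketch of typical bulk-transfer usage, not a
   required pattern.

```python
import socket


def send_bulk(sock, records, chunk_size=64 * 1024):
    """Send many small records using few large sendall() calls.

    Returns the number of sendall() calls that were issued.
    TCP_NODELAY is deliberately not set on the socket.
    """
    buf = bytearray()
    writes = 0
    for rec in records:
        buf += rec
        if len(buf) >= chunk_size:
            sock.sendall(buf)   # one large write instead of many small ones
            writes += 1
            buf.clear()
    if buf:
        sock.sendall(buf)       # flush the remainder
        writes += 1
    return writes
```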

565	5.  Basic API for MPTCP-aware Applications

567	5.1.  Design Considerations

569	   While applications can use MPTCP with the unmodified sockets API,
570	   multipath transport results in many degrees of freedom.  MPTCP
571	   manages the data transport over different subflows automatically.  By
572	   default, this is transparent to the application, but an application
573	   could use an additional API to interface with the MPTCP layer and to
574	   control important aspects of the MPTCP implementation's behaviour.

576	   This document describes a basic MPTCP API.  The API contains a
577	   minimum set of functions that provide an equivalent level of control
578	   and information as exists for regular TCP.  It maintains backward
579	   compatibility with legacy applications.

581	   An advanced MPTCP API is outside the scope of this document.  The
582	   basic API does not allow a sender or a receiver to express
583	   preferences about the management of paths or the scheduling of data,
584	   even if this can have a significant performance impact and if an
585	   MPTCP implementation could benefit from additional guidance by
586	   applications.  A list of potential further API extensions is provided
587	   in the appendix.  The specification of such an advanced API is for
588	   further study and may partly be implementation-specific.

590	   MPTCP mainly affects the sending of data.  Therefore, the basic API
591	   only affects the sender side of a data transfer.  A receiver may also
592	   have preferences about data transfer choices, and it may have
593	   performance requirements, too.  Yet, the configuration of such
594	   preferences is outside of the scope of the basic API.

598	5.2.  Requirements on the Basic MPTCP API

600	   Because of the importance of the sockets interface there are several
601	   fundamental design objectives for the basic interface between MPTCP
602	   and applications:

604	   o  Consistency with existing sockets APIs must be maintained as far
605	      as possible.  In order to support the large base of applications
606	      using the original API, a legacy application must be able to
607	      continue to use standard socket interface functions when run on a
608	      system supporting MPTCP.  Also, MPTCP-aware applications should be
609	      able to access the socket without any major changes.

611	   o  Sockets API extensions must be minimized and independent of an
612	      implementation.

614	   o  The interface should handle both IPv4 and IPv6.

616	   The following is a list of the core requirements for the basic API:

618	   REQ1:  Turn on/off MPTCP: An application should be able to request to
619	          turn on or turn off the usage of MPTCP.  This means that an
620	          application should be able to explicitly request the use of
621	          MPTCP if this is possible.  Applications should also be able
622	          to request not to enable MPTCP and to use regular TCP
623	          transport instead.  This can be implicit in many cases, since
624	          MPTCP must be disabled by the use of binding to a specific
625	          address.  MPTCP may also be enabled if an application uses a
626	          dedicated multipath address family (such as AF_MULTIPATH,
627	          [8]).

629	   REQ2:  An application should be able to restrict MPTCP to binding to
630	          a given set of addresses.

632	   REQ3:  An application should be able to obtain information on the
633	          addresses used by the MPTCP subflows.

635	   REQ4:  An application should be able to extract a unique identifier
636	          for the connection (per endpoint).

638	   The first requirement is the most important one, since some
639	   applications could benefit a lot from MPTCP, but there are also cases
640	   in which it hardly makes sense.  The existing sockets API provides
641	   similar mechanisms to enable or disable advanced TCP features.  The
642	   second requirement corresponds to the binding of addresses with the
643	   bind() socket call, or, e.g., explicit device bindings with a
644	   SO_BINDTODEVICE option.  The third requirement ensures that there is
645	   an equivalent to getpeername() or getsockname() that is able to deal
646	   with more than one subflow.  Finally, it should be possible for the
647	   application to retrieve a unique connection identifier (local to the
648	   endpoint on which it is running) for the MPTCP connection.  This is
649	   equivalent to using the (address, port) pair for a connection
650	   identifier in single-path TCP, which is no longer static in MPTCP.

652	   An application can continue to use getpeername() or getsockname() in
653	   addition to the basic MPTCP API.  In that case, both functions return
654	   the corresponding addresses of the first subflow, as already
655	   explained.

657	5.3.  Sockets Interface Extensions by the Basic MPTCP API

659	5.3.1.  Overview

661	   The abstract, basic MPTCP API consists of a set of new values that
662	   are associated with an MPTCP socket.  Such values may be used for
663	   changing properties of an MPTCP connection, or retrieving
664	   information.  These values could be accessed by new symbols on
665	   existing calls such as setsockopt() and getsockopt(), or could be
666	   implemented as entirely new function calls.  This implementation
667	   decision is out of scope for this document.  The following list
668	   presents symbolic names for these MPTCP socket settings.

670	   o  TCP_MULTIPATH_ENABLE: Enable/disable MPTCP

672	   o  TCP_MULTIPATH_ADD: Bind MPTCP to a set of given local addresses,
673	      or add a new local address to an existing MPTCP connection

675	   o  TCP_MULTIPATH_REMOVE: Remove a local address from an MPTCP
676	      connection

678	   o  TCP_MULTIPATH_SUBFLOWS: Get the pairs of addresses currently used
679	      by the MPTCP subflows

681	   o  TCP_MULTIPATH_CONNID: Get the local connection identifier for this
682	      MPTCP connection

684	   Table 1 shows a list of the abstract socket operations for the
685	   basic configuration of MPTCP.  The first column gives the symbolic
686	   name of the operation.  The second and third columns indicate whether
687	   the operation provides values to be read ("Get") or takes values to
688	   configure ("Set").  The fourth column lists the type of data
689	   associated with this operation.

691	    +------------------------+-----+-----+----------------------------+
692	    | Name                   | Get | Set |          Data type         |
693	    +------------------------+-----+-----+----------------------------+
694	    | TCP_MULTIPATH_ENABLE   |  o  |  o  |           boolean          |
695	    | TCP_MULTIPATH_ADD      |     |  o  |      list of addresses     |
696	    | TCP_MULTIPATH_REMOVE   |     |  o  |      list of addresses     |
697	    | TCP_MULTIPATH_SUBFLOWS |  o  |     | list of pairs of addresses |
698	    | TCP_MULTIPATH_CONNID   |  o  |     |       32-bit integer       |
699	    +------------------------+-----+-----+----------------------------+

701	                     Table 1: MPTCP Socket Operations

703	   There are restrictions when these new socket operations can be used:

705	   o  TCP_MULTIPATH_ENABLE: This value SHOULD only be set before the
706	      establishment of a TCP connection.  Its value SHOULD only be read
707	      after the establishment of a connection.

709	   o  TCP_MULTIPATH_ADD: This operation can be applied both before
710	      connection setup and during a connection.  If used before, it
711	      controls the local addresses that an MPTCP connection can use.  In
712	      the latter case, it allows MPTCP to use an additional local
713	      address, if there has been a restriction before connection setup.

715	   o  TCP_MULTIPATH_REMOVE: This operation can be applied both before
716	      connection setup and during a connection.  In both cases, it
717	      removes an address from the list of local addresses that may be
718	      used by subflows.

720	   o  TCP_MULTIPATH_SUBFLOWS: This value is read-only and SHOULD only be
721	      used after connection setup.

723	   o  TCP_MULTIPATH_CONNID: This value is read-only and SHOULD only be
724	      used after connection setup.
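
   Table 1 and the restrictions listed above can be captured as data
   plus one check.  The sketch below is a model only: it treats the
   SHOULD-level restrictions as hard checks for simplicity, and the
   function name is hypothetical.  It does not correspond to any real
   socket option implementation.

```python
# Table 1 as data: operation name -> (readable, settable).
OPERATIONS = {
    "TCP_MULTIPATH_ENABLE": (True, True),
    "TCP_MULTIPATH_ADD": (False, True),
    "TCP_MULTIPATH_REMOVE": (False, True),
    "TCP_MULTIPATH_SUBFLOWS": (True, False),
    "TCP_MULTIPATH_CONNID": (True, False),
}


def check_operation(name, access, connected):
    """Return True if 'get' or 'set' on `name` follows the restrictions.

    `connected` states whether the connection has been established.
    """
    readable, settable = OPERATIONS[name]
    if access == "get":
        # Every readable value is meant to be read after connection setup.
        return readable and connected
    if access == "set":
        if not settable:
            return False
        # ENABLE is set before connection setup; ADD and REMOVE may be
        # applied before or during a connection.
        if name == "TCP_MULTIPATH_ENABLE":
            return not connected
        return True
    raise ValueError("access must be 'get' or 'set'")
```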

726	5.3.2.  Enabling and Disabling of MPTCP

728	   An application can explicitly indicate multipath capability by
729	   setting TCP_MULTIPATH_ENABLE to a value larger than 0.  In this case,
730	   the MPTCP implementation SHOULD try to negotiate MPTCP for that
731	   connection.  Note that multipath transport will not necessarily be
732	   enabled, as it requires multiple addresses and support in the other
733	   end-system and potentially also on middleboxes.

735	   Building on the backwards-compatibility specified in Section 4.2.1,
736	   if an application enables MPTCP but binds to a specific address or
737	   interface, MPTCP MUST be enabled, but MPTCP MUST respect the
738	   application's choice and only use addresses that are explicitly
739	   provided by the application.  Note that it would be possible for an
740	   application to use the legacy bindings, and then expand on them by
741	   using TCP_MULTIPATH_ADD.  Note also that it is possible for more than
742	   one local address to be initially available to MPTCP in this case, if
743	   an application has bound to a specific interface with multiple
744	   addresses.

746	   An application can disable MPTCP by setting TCP_MULTIPATH_ENABLE to a
747	   value of 0.  In that case, MPTCP MUST NOT be used on that connection.

749	   After connection establishment, an application can get the value of
750	   TCP_MULTIPATH_ENABLE.  A value of 0 then means lack of MPTCP support.
751	   Any value equal to or larger than 1 means that MPTCP is supported.
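
   The rules in this section so far can be summarized in a short model.
   Both function names are hypothetical, addresses are plain strings
   from the documentation ranges, and returning exactly 1 for "MPTCP
   supported" is an assumption (any value of 1 or larger would do).

```python
def mptcp_local_addresses(enable, bound, added, all_local):
    """Local addresses that MPTCP may use for one socket.

    enable:    requested TCP_MULTIPATH_ENABLE value (0 disables MPTCP)
    bound:     addresses fixed by bind() or an interface binding
    added:     addresses supplied with TCP_MULTIPATH_ADD
    all_local: every address configured on the host
    """
    if enable == 0:
        return []  # MPTCP MUST NOT be used on this connection
    # Explicit choices by the application restrict MPTCP to those
    # addresses; duplicates are dropped while keeping the order.
    explicit = list(dict.fromkeys(list(bound) + list(added)))
    return explicit if explicit else list(all_local)


def reported_enable(requested, peer_supports_mptcp):
    """Value read back after connection setup (0 means no MPTCP)."""
    return 1 if requested > 0 and peer_supports_mptcp else 0
```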

753	   As an alternative to setting an explicit value, an application could
754	   also use a new, separate address family called AF_MULTIPATH [8].
755	   This separate address family can be used to exchange multiple
756	   addresses between an application and the standard sockets API, and
757	   additionally acts as an explicit indication that an application is
758	   MPTCP-aware, i.e., that it can deal with the semantic changes of the
759	   sockets API, in particular concerning getpeername() and
760	   getsockname().  The usage of AF_MULTIPATH is also more flexible with
761	   respect to multipath transport, either IPv4 or IPv6, or both in
762	   parallel [8].

764	5.3.3.  Binding MPTCP to Specified Addresses

766	   Before connection establishment, an application can use the
767	   TCP_MULTIPATH_ADD function to indicate a set of local IP addresses
768	   that MPTCP may bind to.  The parameter of the function is a list of
769	   addresses in a corresponding data structure.  By extension, this
770	   operation will also control the list of addresses that can be
771	   advertised to the peer via MPTCP signalling.

773	   If an application binds to a specific address or interface, it is not
774	   required to use the TCP_MULTIPATH_ADD operation for that address.  As
775	   explained in Section 5.3.2, MPTCP MUST only use the explicitly
776	   specified addresses in that case.

778	   An application MAY also indicate a TCP port number that MPTCP should
779	   bind to for a given address.  The port number MAY be different to the
780	   one used by existing subflows.  If no port number is provided by the
781	   application, the port number is automatically selected by the MPTCP
782	   implementation, and will usually be the same across all subflows.

784	   This operation can also be used to modify the address list in use
785	   during the lifetime of an MPTCP connection.  In this case, it is used
786	   to indicate a set of additional local addresses that the MPTCP
787	   connection can make use of, and which can be signalled to the peer.
788	   It should be noted that this signal is only a hint, and an MPTCP
789	   implementation MAY only use a subset of the addresses.

791	   The TCP_MULTIPATH_REMOVE operation can be used to remove a (set of)
792	   local addresses from an MPTCP connection.  MPTCP MUST close any
793	   corresponding subflows (i.e. those using the local address that is no
794	   longer present), and signal the removal of the address to the peer.
795	   If alternative paths are available using the supplied address list
796	   but MPTCP is not currently using them, an MPTCP implementation SHOULD
797	   establish alternative subflows before undertaking the address
798	   removal.

800	   It should be remembered that these operations SHOULD support both
801	   IPv4 and IPv6 addresses, potentially in the same call.
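
   The ordering implied by the TCP_MULTIPATH_REMOVE description (bring
   up alternatives, then close the affected subflows, then signal the
   removal) could be modelled as below.  This is a sketch under several
   assumptions: subflows are reduced to (local, remote) address pairs
   without ports, new subflows go to the remote address of the first
   affected subflow, and the function and action names are hypothetical.

```python
def remove_local_address(allowed, subflows, addr):
    """Model TCP_MULTIPATH_REMOVE for one local address.

    allowed:  list of local addresses MPTCP may use (modified in place)
    subflows: list of (local, remote) pairs in use (modified in place)
    Returns the actions taken, in the order they were taken.
    """
    actions = []
    allowed[:] = [a for a in allowed if a != addr]
    in_use = {local for local, _remote in subflows}
    affected = [sf for sf in subflows if sf[0] == addr]

    # Establish alternative subflows first, so that data transport is
    # not interrupted when the affected subflows are closed.
    if affected:
        remote = affected[0][1]
        for alt in allowed:
            if alt not in in_use:
                subflows.append((alt, remote))
                actions.append(("open", alt))

    # Close every subflow that uses the removed local address.
    for sf in affected:
        subflows.remove(sf)
        actions.append(("close", sf[0]))

    # Tell the peer that the address is no longer available.
    actions.append(("signal_removal", addr))
    return actions
```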

803	5.3.4.  Querying the MPTCP Subflow Addresses

805	   An application can get a list of the addresses used by the currently
806	   established subflows by means of the read-only TCP_MULTIPATH_SUBFLOWS
807	   operation.  The return value is a list of pairs of tuples of IP
808	   address and TCP port number.  In one pair, the first tuple refers to
809	   the local IP address and the local TCP port, and the second one to
810	   the remote IP address and remote TCP port used by the subflow.  The
811	   list MUST only include established subflows.  Both addresses in each
812	   pair MUST be either IPv4 or IPv6.
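
   As an illustration of the returned data, the sketch below represents
   each subflow as a pair of (IP address, port) tuples and checks the
   same-family rule with the standard "ipaddress" module.  The concrete
   representation and the helper names are assumptions; this document
   does not define a data structure.

```python
import ipaddress


def same_family(subflow):
    """True if local and remote address are both IPv4 or both IPv6."""
    (local_ip, _local_port), (remote_ip, _remote_port) = subflow
    return (ipaddress.ip_address(local_ip).version
            == ipaddress.ip_address(remote_ip).version)


def local_addresses(subflows):
    """Distinct local IP addresses, in first-seen order."""
    return list(dict.fromkeys(local[0] for local, _remote in subflows))
```

   Because subflows can be opened and closed at any time, an application
   should treat such a list as a snapshot and query it again when needed.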

814	5.3.5.  Getting a Unique Connection Identifier

816	   An application that wants a unique identifier for the connection,
817	   analogous to an (address, port) pair in regular TCP, can query the
818	   TCP_MULTIPATH_CONNID value to get a local connection identifier for
819	   the MPTCP connection.

821	   This is a 32-bit number, and SHOULD be the same as the local
822	   connection identifier sent in the MPTCP handshake.
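
   One possible use of the identifier is as a key for per-connection
   application state, in place of an (address, port) pair that may
   change as subflows come and go.  The class and function names below
   are hypothetical.

```python
def is_valid_connid(connid):
    """True if `connid` fits into an unsigned 32-bit integer."""
    return isinstance(connid, int) and 0 <= connid <= 0xFFFFFFFF


class ConnectionTable:
    """Application state keyed by the local MPTCP connection identifier."""

    def __init__(self):
        self._state = {}

    def register(self, connid, state):
        if not is_valid_connid(connid):
            raise ValueError("connection identifier must be a 32-bit number")
        self._state[connid] = state

    def lookup(self, connid):
        return self._state.get(connid)
```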

824	6.  Other Compatibility Issues
825	6.1.  Usage of the SCTP Socket API

827	   For dealing with multi-homing, several socket API extensions have
828	   been defined for SCTP [13].  As MPTCP realizes multipath transport
829	   from and to multi-homed endsystems, some of these interface function
830	   calls are actually applicable to MPTCP in a similar way.

832	   API developers MAY wish to integrate SCTP and MPTCP calls to provide
833	   a consistent interface to the application.  Yet, it must be
834	   emphasized that the transport service provided by MPTCP is different
835	   to SCTP, and this is why not all SCTP API functions can be mapped
836	   directly to MPTCP.  Furthermore, a network stack implementing MPTCP
837	   does not necessarily support SCTP and its specific socket interface
838	   extensions.  This is why the basic API of MPTCP defines additional
839	   socket options only, which are a backward compatible extension of
840	   TCP's application interface.  An integration with the SCTP API is
841	   outside the scope of the basic API.

843	6.2.  Incompatibilities with other Multihoming Solutions

845	   The use of MPTCP can interact with various related sockets API
846	   extensions.  The use of a multihoming shim layer conflicts with
847	   multipath transport such as MPTCP or SCTP [11].  Care should be taken
848	   not to confuse MPTCP usage with the overlapping features of other
849	   APIs:

851	   o  SHIM API [11]: This API specifies sockets API extensions for the
852	      multihoming shim layer.

854	   o  HIP API [12]: The Host Identity Protocol (HIP) also results in a
855	      new API.

857	   o  API for Mobile IPv6 [10]: For Mobile IPv6, a significantly
858	      extended socket API exists as well.

860	   In order to avoid any conflict, multiaddressed MPTCP SHOULD NOT be
861	   enabled if a network stack uses SHIM6, HIP, or Mobile IPv6.
862	   Furthermore, applications should not try to use both the MPTCP API
863	   and another multihoming or mobility layer API.

865	   It is possible, however, that some of the MPTCP functionality, such
866	   as congestion control, could be used in a SHIM6 or HIP environment.
867	   Such operation is outside the scope of this document.

869	6.3.  Interactions with DNS

871	   In multihomed or multiaddressed environments, there are various
872	   issues that are not specific to MPTCP, but have to be considered,
873	   too.  These problems are summarized in [14].

875	   Specifically, there can be interactions with DNS.  Whilst it is
876	   expected that an application will iterate over the list of addresses
877	   returned from a call such as getaddrinfo(), MPTCP itself MUST NOT
878	   make any assumptions about multiple A or AAAA records from the same
879	   DNS query referring to the same host, as it is possible that multiple
880	   addresses refer to multiple servers for load balancing purposes.
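
   The application-side iteration mentioned above might look as
   follows.  Each returned address is tried as an independent
   connection attempt, and no assumption is made that two addresses
   belong to the same host.  The function name and the timeout value
   are illustrative only.

```python
import socket


def connect_first(host, port, timeout=5.0):
    """Connect to the first address from getaddrinfo() that accepts."""
    last_error = None
    for family, socktype, proto, _name, sockaddr in socket.getaddrinfo(
            host, port, type=socket.SOCK_STREAM):
        sock = socket.socket(family, socktype, proto)
        sock.settimeout(timeout)
        try:
            sock.connect(sockaddr)
            return sock
        except OSError as exc:
            # This address did not work; try the next one.
            last_error = exc
            sock.close()
    raise last_error or OSError("no addresses returned")
```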

882	7.  Security Considerations

884	   This document first defines the behaviour of the standard TCP/IP API
885	   for MPTCP-unaware applications.  As the function offered by this
886	   interface is equivalent to existing APIs and does not offer
887	   additional functionality, the interface design does not result in new
888	   security issues.  In general, enabling MPTCP has some security
889	   implications for applications, which are introduced in Section 5.3.3,
890	   and these threats are further detailed in [6].  The protocol
891	   specification of MPTCP [5] defines several mechanisms to protect
892	   MPTCP against those attacks.

894	   In addition, the basic MPTCP API for MPTCP-aware applications defines
895	   functions that provide an equivalent level of control and information
896	   as exists for regular TCP.  New functions enable adding and removing
897	   local addresses from an MPTCP connection (TCP_MULTIPATH_ADD and
898	   TCP_MULTIPATH_REMOVE).  These functions don't add security threats if
899	   the MPTCP stack verifies that the addresses provided by the
900	   application are indeed available as source addresses for subflows.
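
   The verification mentioned above could be sketched as a filter that
   accepts only addresses configured on the host.  How a stack learns
   its configured addresses is out of scope here; the list is simply
   passed in, and the function name is hypothetical.

```python
import ipaddress


def accept_added_addresses(candidates, configured):
    """Split candidate addresses into (accepted, rejected) lists.

    A candidate is accepted only if it equals a configured local
    address; comparison is done on parsed addresses, so differences in
    textual form (such as letter case in IPv6) do not matter.
    """
    configured_set = {ipaddress.ip_address(a) for a in configured}
    accepted, rejected = [], []
    for cand in candidates:
        if ipaddress.ip_address(cand) in configured_set:
            accepted.append(cand)
        else:
            rejected.append(cand)
    return accepted, rejected
```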

902	   However, applications should use the TCP_MULTIPATH_ADD function with
903	   care, as new subflows might get established to those addresses.
904	   Furthermore, it could result in some form of information leakage
905	   since MPTCP might advertise those addresses to the other connection
906	   endpoint, which could learn IP addresses of interfaces that are not
907	   visible otherwise.

909	   Use of different addresses should not be assumed to lead to use of
910	   different paths, especially for security purposes.

912	   MPTCP-aware applications should also take care when querying and
913	   using information about the addresses used by subflows
914	   (TCP_MULTIPATH_SUBFLOWS).  As MPTCP can dynamically open and close
915	   subflows, a list of addresses queried once can get outdated during
916	   the lifetime of an MPTCP connection.  Then, the list may contain
917	   invalid entries, i.e. addresses that are not used any more, or that
918	   might not even be valid.  Applications that want to ensure that MPTCP
919	   only uses a certain set of addresses should explicitly bind to those
920	   addresses.

922	8.  IANA Considerations

924	   No IANA considerations.

926	9.  Conclusion

928	   This document discusses MPTCP's application implications and
929	   specifies a basic MPTCP API.  For legacy applications, it is ensured
930	   that the existing sockets API continues to work.  MPTCP-aware
931	   applications can use the basic MPTCP API that provides some control
932	   over the transport layer equivalent to regular TCP.  A more fine-
933	   granular interaction between applications and MPTCP requires an
934	   advanced MPTCP API, which is not specified in this document.

936	10.  Acknowledgments

938	   The authors sincerely thank the following people for their helpful
939	   comments and reviews of the document: Costin Raiciu, Philip Eardley,
940	   Javier Ubillos, Michael Tuexen, and John Leslie.

942	   Michael Scharf is supported by the German-Lab project
943	   (http://www.german-lab.de/) funded by the German Federal Ministry of
944	   Education and Research (BMBF).  Alan Ford was previously supported by
945	   Roke Manor Research and by Trilogy (http://www.trilogy-project.org/),
946	   a research project (ICT-216372) partially funded by the European
947	   Community under its Seventh Framework Program.  The views expressed
948	   here are those of the author(s) only.  The European Commission is not
949	   liable for any use that may be made of the information in this
950	   document.

952	11.  References

954	11.1.  Normative References

956	   [1]   Postel, J., "Transmission Control Protocol", STD 7, RFC 793,
957	         September 1981.

959	   [2]   Braden, R., "Requirements for Internet Hosts - Communication
960	         Layers", STD 3, RFC 1122, October 1989.

962	   [3]   Bradner, S., "Key words for use in RFCs to Indicate Requirement
963	         Levels", BCP 14, RFC 2119, March 1997.

965	   [4]   Ford, A., Raiciu, C., Handley, M., Barre, S., and J. Iyengar,
966	         "Architectural Guidelines for Multipath TCP Development",
967	         RFC 6182, March 2011.

969	   [5]   Ford, A., Raiciu, C., Handley, M., and O. Bonaventure, "TCP
970	         Extensions for Multipath Operation with Multiple Addresses",
971	         draft-ietf-mptcp-multiaddressed-06 (work in progress),
972	         January 2012.

974	   [6]   Bagnulo, M., "Threat Analysis for TCP Extensions for Multipath
975	         Operation with Multiple Addresses", RFC 6181, March 2011.

977	   [7]   Raiciu, C., Handley, M., and D. Wischik, "Coupled Congestion
978	         Control for Multipath Transport Protocols", RFC 6356,
979	         October 2011.

981	11.2.  Informative References

983	   [8]   Sarolahti, P., "Multi-address Interface in the Socket API",
984	         draft-sarolahti-mptcp-af-multipath-01 (work in progress),
985	         March 2010.

987	   [9]   Stevens, W., Thomas, M., Nordmark, E., and T. Jinmei, "Advanced
988	         Sockets Application Program Interface (API) for IPv6",
989	         RFC 3542, May 2003.

991	   [10]  Chakrabarti, S. and E. Nordmark, "Extension to Sockets API for
992	         Mobile IPv6", RFC 4584, July 2006.

994	   [11]  Komu, M., Bagnulo, M., Slavov, K., and S. Sugimoto, "Sockets
995	         Application Program Interface (API) for Multihoming Shim",
996	         RFC 6316, July 2011.

998	   [12]  Komu, M. and T. Henderson, "Basic Socket Interface Extensions
999	         for the Host Identity Protocol (HIP)", RFC 6317, July 2011.

1001	   [13]  Stewart, R., Tuexen, M., Poon, K., Lei, P., and V. Yasevich,
1002	         "Sockets API Extensions for the Stream Control Transmission
1003	         Protocol (SCTP)", RFC 6458, December 2011.

1005	   [14]  Blanchet, M. and P. Seite, "Multiple Interfaces and
1006	         Provisioning Domains Problem Statement", RFC 6418,
1007	         November 2011.

1009	   [15]  Wasserman, M. and P. Seite, "Current Practices for Multiple-
1010	         Interface Hosts", RFC 6419, November 2011.

1012	   [16]  Wing, D. and A. Yourtchenko, "Happy Eyeballs: Success with
1013	         Dual-Stack Hosts", draft-ietf-v6ops-happy-eyeballs-07 (work in
1014	         progress), December 2011.

1016	   [17]  "IEEE Std. 1003.1-2008 Standard for Information Technology --
1017	         Portable Operating System Interface (POSIX). Open Group
1018	         Technical Standard: Base Specifications, Issue 7, 2008.".

1020	Appendix A.  Requirements on a Future Advanced MPTCP API

1022	A.1.  Design Considerations

1024	   Multipath transport results in many degrees of freedom.  The basic
1025	   MPTCP API only defines a minimum set of the API extensions for the
1026	   interface between the MPTCP layer and applications, which does not
1027	   offer much control of the MPTCP implementation's behaviour.  A
1028	   future, advanced API could address further features of MPTCP and
1029	   provide more control.

1031	   Applications that use TCP may have different requirements on the
1032	   transport layer.  While developers have become used to the
1033	   characteristics of regular TCP, new opportunities created by MPTCP
1034	   could allow the service provided to be optimised further.  An
1035	   advanced API could enable MPTCP-aware applications to specify
1036	   preferences and control certain aspects of the behavior, in addition
1037	   to the simple control provided by the basic interface.  An advanced
1038	   API could also address aspects that are completely out-of-scope of
1039	   the basic API, for example, the question whether a receiving
1040	   application could influence the sending policy.

1042	   Furthermore, an advanced MPTCP API could be part of a new overall
1043	   interface between the network stack and applications that addresses
1044	   other issues as well, such as the split between identifiers and
1045	   locators.  An API that does not use IP addresses (but, instead e.g. a
1046	   connectbyname() function) would be useful for numerous purposes,
1047	   independent of MPTCP.

1049	   This appendix documents a list of potential usage scenarios and
1050	   requirements for the advanced API.  The specification and
1051	   implementation of a corresponding API is outside the scope of this
1052	   document.

1054	A.2.  MPTCP Usage Scenarios and Application Requirements

1056	   There are different MPTCP usage scenarios.  An application that
1057	   wishes to transmit bulk data will want MPTCP to provide a high
1058	   throughput service immediately, through creating and maximising
1059	   utilisation of all available subflows.  This is the default MPTCP use
1060	   case.

1062	   But at the other extreme, there are applications that are highly
1063	   interactive, but require only a small amount of throughput, and these
1064	   are optimally served by low latency and jitter stability.  In such a
1065	   situation, it would be preferable for the traffic to use only the
1066	   lowest latency subflow (assuming it has sufficient capacity), maybe
1067	   with one or two additional subflows for resilience and recovery
1068	   purposes.  The key challenge for such a strategy is that the delay on
1069	   a path may fluctuate significantly and that just always selecting the
1070	   path with the smallest delay might result in instability.

1072	   The choice between bulk data transport and latency-sensitive
1073	   transport affects the scheduler in terms of whether traffic should
1074	   be, by default, sent on one subflow or across several ones.  Even if
1075	   the total bandwidth required is less than that available on an
1076	   individual path, it is desirable to spread this load to reduce stress
1077	   on potential bottlenecks, and this is why this method should be the
1078	   default for bulk data transport.  However, that may not be optimal
1079	   for applications that require latency/jitter stability.

1081	   In the case of the latter option, a further question arises: Should
1082	   additional subflows be used whenever the primary subflow is
1083	   overloaded, or only when the primary path fails (hot-standby)?  In
1084	   other words, is latency stability or bandwidth more important to the
1085	   application?  This results in two different options: Firstly, there
1086	   is the single path which can overflow into an additional subflow; and
1087	   secondly there is single-path with hot-standby, whereby an
1088	   application may want an alternative backup subflow in order to
1089	   improve resilience.  If data delivery on the first subflow
1090	   fails, the data transport could immediately be continued on the
1091	   second subflow, which is idle otherwise.
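
   The difference between the two single-path options can be stated as
   a small decision function.  The policy names and the two boolean
   inputs are hypothetical simplifications; a real scheduler would
   derive them from measurements.

```python
def use_backup_subflow(policy, primary_failed, primary_saturated):
    """Decide whether data may be sent on the additional subflow."""
    if policy == "hot-standby":
        # The backup stays idle unless the primary path fails.
        return primary_failed
    if policy == "overflow":
        # The backup also absorbs load that the primary cannot carry.
        return primary_failed or primary_saturated
    raise ValueError("unknown policy: " + policy)
```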

1093	   Yet another complication is introduced with the potential that MPTCP
1094	   introduces for changes in available bandwidth as the number of
1095	   available subflows changes.  Such jitter in bandwidth may prove
1096	   confusing for some applications such as video or audio streaming that
1097	   dynamically adapt codecs based on available bandwidth.  Such
1098	   applications may prefer MPTCP to attempt to provide a consistent
1099	   bandwidth as far as is possible, and avoid maximising the use of all
1100	   subflows.

1102	   A further, mostly orthogonal question is whether data should be
1103	   duplicated over the different subflows, in particular if there is
1104	   spare capacity.  This could improve both the timeliness and
1105	   reliability of data delivery.

1107	   In summary, there are at least three possible performance objectives
1108	   for multipath transport (not necessarily disjoint):

1110	   1.  High bandwidth

1112	   2.  Low latency and jitter stability
1113	   3.  High reliability

1115	   In an advanced API, applications could provide high-level guidance to
1116	   the MPTCP implementation concerning these performance requirements,
1117	   for instance, which is considered to be the most important one.  The
1118	   MPTCP stack would then use internal mechanisms to fulfill this
1119	   abstract indication of a desired service, as far as possible.  This
1120	   would affect both the assignment of data (including retransmissions)
1121	   to existing subflows (e.g., 'use all in parallel', 'use as overflow',
1122	   'hot standby', 'duplicate traffic') and the decisions on when to
1123	   set up additional subflows and to which addresses.  In both cases
1124	   different policies can exist, which can be expected to be
1125	   implementation-specific.

1127	   Therefore, an advanced API could provide a mechanism for applications
1128	   to specify their high-level requirements in an implementation-
1129	   independent way.  One possibility would be to select one "application
1130	   profile" out of a number of choices that characterize typical
1131	   applications.  Yet, as applications today do not have to inform TCP
1132	   about their communication requirements, it requires further studies
1133	   whether such an approach would be realistic.
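
   Purely as an illustration of the "application profile" idea, the
   sketch below maps the three performance objectives to scheduling
   modes named earlier in this appendix.  The profile names and the
   particular mapping are assumptions; this document defines neither.

```python
from enum import Enum


class Profile(Enum):
    HIGH_BANDWIDTH = "high bandwidth"
    LOW_LATENCY = "low latency and jitter stability"
    HIGH_RELIABILITY = "high reliability"


# One conceivable mapping; an implementation could choose differently.
SCHEDULER_MODE = {
    Profile.HIGH_BANDWIDTH: "use all in parallel",
    Profile.LOW_LATENCY: "hot standby",
    Profile.HIGH_RELIABILITY: "duplicate traffic",
}


def scheduler_mode(profile=None):
    """Scheduling mode for a profile; no profile means the bulk default."""
    return SCHEDULER_MODE.get(profile, "use all in parallel")
```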

1135	   Of course, independent of an advanced API, such functionality could
1136	   also partly be achieved by MPTCP-internal heuristics that infer some
1137	   application preferences e.g. from existing socket options, such as
1138	   TCP_NODELAY.  Whether this would be reliable, and indeed appropriate,
1139	   is for further study, too.

1141	A.3.  Potential Requirements on an Advanced MPTCP API

1143	   The following is a list of potential requirements for an advanced
1144	   MPTCP API beyond the features of the basic API.  It is included here
1145	   for information only:

1147	   REQ5:   An application should be able to establish MPTCP connections
1148	           without using IP addresses as locators.

1150	   REQ6:   An application should be able to obtain usage information and
1151	           statistics about all subflows (e.g., ratio of traffic sent
1152	           via this subflow).

1154	   REQ7:   An application should be able to request a change in the
1155	           number of subflows in use, thus triggering removal or
1156	           addition of subflows.  An even finer control granularity
1157	           would be a request for the establishment of a new subflow to
1158	           a provided destination, or a request for the termination of a
1159	           specified, existing subflow.

1161	   REQ8:   An application should be able to inform the MPTCP
1162	           implementation about its high-level performance requirements,
1163	           e.g., in the form of a profile.

1165	   REQ9:   An application should be able to indicate communication
1166	           characteristics, e.g., the expected amount of data to be
1167	           sent, the expected duration of the connection, or the
1168	           expected rate at which data is provided.  Applications may in
1169	           some cases be able to forecast such properties.  If so, such
1170	           information could be an additional input parameter for
1171	           heuristics inside the MPTCP implementation, which could be
1172	           useful for example to decide when to set up additional
1173	           subflows.

1175	   REQ10:  An application should be able to control the automatic
1176	           establishment/termination of subflows.  This would imply a
1177	           selection among different heuristics of the path manager,
1178	           e.g., 'try as soon as possible', 'wait until there is a bunch
1179	           of data', etc.

1181	   REQ11:  An application should be able to set preferred subflows or
1182	           subflow usage policies.  This would result in a selection
1183	           among different configurations of the multipath scheduler.
1184	           For instance, an application might want to use certain
1185	           subflows as backup only.

1187	   REQ12:  An application should be able to control the level of
1188	           redundancy by specifying whether segments should be sent on
1189	           more than one path in parallel.

1191	   An advanced API fulfilling these requirements would allow application
1192	   developers to more specifically configure MPTCP.  It could avoid
1193	   suboptimal decisions of internal, implicit heuristics.  However, it
1194	   is unclear whether all of these requirements would have a significant
1195	   benefit to applications, since they are going above and beyond what
1196	   the existing API to regular TCP provides.

1198	   A subset of these functions might also be implemented system-wide or
1199	   by other configuration mechanisms.  These implementation details are
1200	   left for further study.
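
	   Two of the requirements above, REQ6 and REQ11, can be illustrated
	   with a toy model.  The Subflow record, its fields, and both
	   helper functions are hypothetical; the sketch shows the kind of
	   information and control involved, not an interface defined by
	   this document.

```python
from dataclasses import dataclass


@dataclass
class Subflow:
    """Toy model of one subflow; all fields are illustrative."""
    local: tuple           # (address, port)
    remote: tuple          # (address, port)
    bytes_sent: int = 0
    backup: bool = False   # REQ11: marked by the application as backup


def traffic_ratios(subflows):
    """REQ6: fraction of all bytes sent that went via each subflow."""
    total = sum(sf.bytes_sent for sf in subflows)
    if total == 0:
        return [0.0 for _ in subflows]
    return [sf.bytes_sent / total for sf in subflows]


def eligible_subflows(subflows):
    """REQ11: use regular subflows; fall back to backup subflows
    only when no regular subflow is available."""
    regular = [sf for sf in subflows if not sf.backup]
    return regular if regular else list(subflows)
```

	   For two subflows that have sent 300 and 100 bytes,
	   traffic_ratios() yields 0.75 and 0.25.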

1202	A.4.  Integration with the SCTP Socket API

1204	   The advanced API may also integrate or use the SCTP Socket API.  The
1205	   following functions defined for SCTP provide functionality similar
1206	   to that of the basic MPTCP API:

1208	   o  sctp_bindx()

1210	   o  sctp_connectx()

1212	   o  sctp_getladdrs()

1214	   o  sctp_getpaddrs()

1216	   o  sctp_freeladdrs()

1218	   o  sctp_freepaddrs()

1220	   The syntax and semantics of these functions are described in [13].

1222	   A potential objective for the advanced API is to provide a consistent
1223	   MPTCP and SCTP interface to the application.  This is left for
1224	   further study in this document.

1226	Appendix B.  Change History of the Document

1228	   Changes compared to version draft-ietf-mptcp-api-03:

1230	   o  Security consideration section

1232	   o  Better explanation of the implications of explicitly specified
1233	      addresses, most notably during the bind call

1235	   o  Editorial changes

1237	   Changes compared to version draft-ietf-mptcp-api-02:

1239	   o  Updated references

1241	   o  Editorial changes

1243	   Changes compared to version draft-ietf-mptcp-api-01:

1245	   o  Additional text on outdated assumptions if an MPTCP application
1246	      does not use fate sharing.

1248	   o  The appendix explicitly mentions an integration of the advanced
1249	      MPTCP API and the SCTP API as a potential objective, which is left
1250	      for further study for the basic API.

1252	   o  A short additional explanation of the parameters of the abstract
1253	      functions TCP_MULTIPATH_ADD and TCP_MULTIPATH_REMOVE.

1255	   o  Better explanation of when TCP_MULTIPATH_REMOVE may be used.

1257	   Changes compared to version draft-ietf-mptcp-api-00:

1259	   o  Explicitly specify that the TCP_MULTIPATH_SUBFLOWS function
1260	      returns port numbers, too.  Furthermore, add a new comment that
1261	      TCP_MULTIPATH_ADD permits the specification of a port number.

1263	   o  Mention possible additional extended API functions for the
1264	      indication of application characteristics and for backup paths,
1265	      based on comments received from the community.

1267	   o  Mention of alternative approaches for avoiding non-MPTCP-capable
1268	      paths to reduce impact on applications.

1270	   Changes compared to version draft-scharf-mptcp-api-03:

1272	   o  Removal of explicit references to "socket options" and getsockopt/
1273	      setsockopt.

1275	   o  Change of TCP_MULTIPATH_BIND to TCP_MULTIPATH_ADD and
1276	      TCP_MULTIPATH_REMOVE.

1278	   o  Mention of stability of bandwidth as another potential QoS
1279	      parameter for the advanced API.

1281	   o  Address comments received from Philip Eardley: Explanation of the
1282	      API terminology, more explicit statement concerning applications
1283	      that bind to a specific address, and some smaller editorial fixes

1285	   Changes compared to version draft-scharf-mptcp-api-02:

1287	   o  Definition of the behavior of getpeername() and getsockname() when
1288	      being called by an MPTCP-aware application.

1290	   o  Discussion of the possibility that an MPTCP implementation could
1291	      support the SCTP API, as far as it is applicable to MPTCP.

1293	   o  Various editorial fixes.

1295	   Changes compared to version draft-scharf-mptcp-api-01:

1297	   o  Second half of the document completely restructured

1299	   o  Separation between a basic API and an advanced API: The focus of
1300	      the document is the basic API only; all text concerning a
1301	      potential extended API is moved to the appendix

1303	   o  Several clarifications, e.g., concerning buffer sizing and the
1304	      use of different scheduling strategies triggered by TCP_NODELAY

1306	   o  Additional references

1308	   Changes compared to version draft-scharf-mptcp-api-00:

1310	   o  Distinction between legacy and MPTCP-aware applications

1312	   o  Guidance concerning default enabling, reaction to the shutdown of
1313	      the first subflow, etc.

1315	   o  Reference to a potential use of AF_MULTIPATH

1317	   o  Additional references to related work

1319	Authors' Addresses

1321	   Michael Scharf
1322	   Alcatel-Lucent Bell Labs
1323	   Lorenzstrasse 10
1324	   70435 Stuttgart
1325	   Germany

1327	   EMail: michael.scharf@alcatel-lucent.com

1329	   Alan Ford
1330	   Cisco
1331	   Ruscombe Business Park
1332	   Ruscombe, Berkshire  RG10 9NN
1333	   UK

1335	   EMail: alanford@cisco.com