idnits 2.17.1

draft-ietf-mptcp-api-04.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (February 16, 2012) is 4451 days in the past.  Is this
     intentional?

  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  ** Obsolete normative reference: RFC 793 (ref. '1') (Obsoleted by RFC 9293)

  == Outdated reference: A later version (-12) exists of
     draft-ietf-mptcp-multiaddressed-06

     Summary: 1 error (**), 0 flaws (~~), 2 warnings (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

Internet Engineering Task Force                                M. Scharf
Internet-Draft                                  Alcatel-Lucent Bell Labs
Intended status: Informational                                   A. Ford
Expires: August 19, 2012                                           Cisco
                                                       February 16, 2012

               MPTCP Application Interface Considerations
                        draft-ietf-mptcp-api-04

Abstract

   Multipath TCP (MPTCP) adds the capability of using multiple paths to
   a regular TCP session.
   Even though it is designed to be totally backward compatible to
   applications, the data transport differs compared to regular TCP,
   and there are several additional degrees of freedom that
   applications may wish to exploit.  This document summarizes the
   impact that MPTCP may have on applications, such as changes in
   performance.  Furthermore, it discusses compatibility issues of
   MPTCP in combination with non-MPTCP-aware applications.  Finally,
   the document describes a basic application interface for MPTCP-aware
   applications that provides access to multipath address information
   and a level of control equivalent to regular TCP.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on August 19, 2012.

Copyright Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.
   Code Components extracted from this document must include Simplified
   BSD License text as described in Section 4.e of the Trust Legal
   Provisions and are provided without warranty as described in the
   Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  Comparison of MPTCP and Regular TCP
     3.1.  Performance Impact
       3.1.1.  Throughput
       3.1.2.  Delay
       3.1.3.  Resilience
     3.2.  Potential Problems
       3.2.1.  Impact of Middleboxes
       3.2.2.  Outdated Implicit Assumptions
       3.2.3.  Security Implications
   4.  Operation of MPTCP with Legacy Applications
     4.1.  Overview of the MPTCP Network Stack
     4.2.  Address Issues
       4.2.1.  Specification of Addresses by Applications
       4.2.2.  Querying of Addresses by Applications
     4.3.  Socket Option Issues
       4.3.1.  General Guideline
       4.3.2.  Disabling of the Nagle Algorithm
       4.3.3.  Buffer Sizing
       4.3.4.  Other Socket Options
     4.4.  Default Enabling of MPTCP
     4.5.  Summary of Advices to Application Developers
   5.  Basic API for MPTCP-aware Applications
     5.1.  Design Considerations
     5.2.  Requirements on the Basic MPTCP API
     5.3.  Sockets Interface Extensions by the Basic MPTCP API
       5.3.1.  Overview
       5.3.2.  Enabling and Disabling of MPTCP
       5.3.3.  Binding MPTCP to Specified Addresses
       5.3.4.  Querying the MPTCP Subflow Addresses
       5.3.5.  Getting a Unique Connection Identifier
   6.  Other Compatibility Issues
     6.1.  Usage of the SCTP Socket API
     6.2.  Incompatibilities with other Multihoming Solutions
     6.3.  Interactions with DNS
   7.  Security Considerations
   8.  IANA Considerations
   9.  Conclusion
   10. Acknowledgments
   11. References
     11.1.  Normative References
     11.2.  Informative References
   Appendix A.  Requirements on a Future Advanced MPTCP API
     A.1.  Design Considerations
     A.2.  MPTCP Usage Scenarios and Application Requirements
     A.3.  Potential Requirements on an Advanced MPTCP API
     A.4.  Integration with the SCTP Socket API
   Appendix B.  Change History of the Document

1.  Introduction

   Multipath TCP adds the capability of using multiple paths to a
   regular TCP session [1].
   The motivations for this extension include increasing throughput,
   overall resource utilisation, and resilience to network failure, and
   these motivations are discussed, along with high-level design
   decisions, as part of the Multipath TCP architecture [4].  The MPTCP
   protocol [5] offers the same reliable, in-order, byte-stream
   transport as TCP, and is designed to be backward compatible with both
   applications and the network layer.  It requires support inside the
   network stack of both endpoints.

   This document first presents the impacts that MPTCP may have on
   applications, such as performance changes compared to regular TCP.
   Second, it defines the interoperation of MPTCP and applications that
   are unaware of the multipath transport.  MPTCP is designed to be
   usable without any application changes, but some compatibility issues
   have to be taken into account.  Third, this memo specifies a basic
   Application Programming Interface (API) for MPTCP-aware applications.
   The API presented here is an extension to the regular TCP API to
   allow an MPTCP-aware application the equivalent level of control and
   access to information of an MPTCP connection that would be possible
   with the standard TCP API on a regular TCP connection.

   An advanced API for MPTCP is outside the scope of this document.
   Such an advanced API could offer a more fine-grained control over
   multipath transport functions and policies.  The appendix includes a
   brief, non-compulsory list of potential features of such an advanced
   API.

   The de facto standard API for TCP/IP applications is the "sockets"
   interface.  This document provides an abstract definition of
   MPTCP-specific extensions to this interface.
   These are operations that can be used by an application to get or set
   additional MPTCP-specific information on a socket, in order to
   provide an equivalent level of information and control over MPTCP as
   exists for an application using regular TCP.  It is up to the
   applications, high-level programming languages, or libraries to
   decide whether to use these optional extensions.  For instance, an
   application may want to turn on or off the MPTCP mechanism for
   certain data transfers, or limit its use to certain interfaces.  The
   abstract specification is in line with the POSIX standard [17] as
   much as possible.

   There are also various related extensions of the sockets interface:
   [11] specifies sockets API extensions for a multihoming shim layer.
   The API enables interactions between applications and the multihoming
   shim layer for advanced locator management and for access to
   information about failure detection and path exploration.

   Experimental extensions to the sockets API are also defined for the
   Host Identity Protocol (HIP) [12] in order to manage the bindings of
   identifiers and locators.  Further related API extensions exist for
   IPv6 [9], Mobile IP [10], and SCTP [13].  There can be interactions
   or incompatibilities of these APIs with MPTCP, which are discussed
   later in this document.

   Some network stack implementations, especially on mobile devices,
   have centralized connection managers or other higher-level APIs to
   solve multi-interface issues, as surveyed in [15].  Their interaction
   with MPTCP is outside the scope of this note.

   The target readers of this document are application developers whose
   software may benefit significantly from MPTCP.  This document also
   provides the necessary information for developers of MPTCP to
   implement the API in a TCP/IP network stack.

2.  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [3].

   This document uses the MPTCP terminology introduced in [5].

   Concerning the API towards applications, the following terms are
   distinguished:

   o  Legacy API: The interface towards TCP that is currently used by
      applications.  This document explains the impact of MPTCP for such
      applications, as well as resulting issues.

   o  Basic API: A simple extension of TCP's interface for applications
      that are aware of MPTCP.  This document abstractly describes this
      interface, which provides access to multipath address information
      and a level of control equivalent to regular TCP.

   o  Advanced API: An API that offers more fine-grained control over
      the MPTCP behaviour.  Its detailed specification is outside the
      scope of this document.

3.  Comparison of MPTCP and Regular TCP

   This section discusses the impact that the use of MPTCP will have on
   applications, in comparison to what may be expected from the use of
   regular TCP.

3.1.  Performance Impact

   One of the key goals of adding multipath capability to TCP is to
   improve the performance of a transport connection by load
   distribution over separate subflows across potentially disjoint
   paths.  Furthermore, it is an explicit goal of MPTCP that it should
   not provide a worse performing connection than would have existed
   through the use of single-path TCP.  A corresponding congestion
   control algorithm is described in [7].  The following sections
   summarize the performance impact of MPTCP as seen by an application.

3.1.1.  Throughput

   The most obvious performance improvement that will be gained with the
   use of MPTCP is an increase in throughput, since MPTCP will pool more
   than one path (where available) between two endpoints.
   This will provide greater bandwidth for an application.  If there are
   shared bottlenecks between the flows, then the congestion control
   algorithms will ensure that load is evenly spread amongst regular and
   multipath TCP sessions, so that no end user receives worse
   performance than single-path TCP.

   This performance increase additionally means that an MPTCP session
   could achieve throughput that is greater than the capacity of a
   single interface on the device.  If any applications make assumptions
   about interfaces due to throughput (or vice versa), they must take
   this into account (although an MPTCP implementation must always
   respect an application's request for a particular interface).

   Furthermore, the flexibility of MPTCP to add and remove subflows as
   paths change availability could lead to a greater variation, and more
   frequent change, in connection bandwidth.  Applications that adapt to
   available bandwidth (such as video and audio streaming) may need to
   adjust some of their assumptions to most effectively take this into
   account.

   The transport of MPTCP signaling information results in a small
   overhead.  If multiple subflows share the same bottleneck, this
   overhead slightly reduces the capacity that is available for data
   transport.  Yet, this potential reduction of throughput will be
   negligible in many usage scenarios, and the protocol contains
   optimisations in its design so that this overhead is minimal.

3.1.2.  Delay

   If the delays on the constituent subflows of an MPTCP connection
   differ, the jitter perceivable to an application may appear higher as
   the data is spread across the subflows.  Although MPTCP will ensure
   in-order delivery to the application, the application must be able to
   cope with the data delivery being burstier than may be usual with
   single-path TCP.
   Since burstiness is commonplace on the Internet today, it is unlikely
   that applications will suffer from such an impact on the traffic
   profile, but application authors may wish to consider this in future
   development.

   In addition, applications that make round trip time (RTT) estimates
   at the application level may have some issues.  Whilst the average
   delay calculated will be accurate, whether this is useful for an
   application will depend on what it requires this information for.  If
   a new application wishes to derive such information, it should
   consider how multiple subflows may affect its measurements, and thus
   how it may wish to respond.  In such a case, an application may wish
   to express its scheduling preferences, as described later in this
   document.

3.1.3.  Resilience

   The use of multiple subflows simultaneously means that, if one should
   fail, all traffic will move to the remaining subflow(s), and
   additionally any lost packets can be retransmitted on these subflows.

   Subflow failure may be caused by issues within the network, which an
   application would be unaware of, or interface failure on the node.
   An application may, under certain circumstances, be in a position to
   be aware of such failure (e.g. by radio signal strength, or simply an
   interface enabled flag), and so must not make assumptions of an MPTCP
   flow's stability based on this.  An MPTCP implementation must never
   override an application's request for a given interface, however, so
   the cases where this issue may be applicable are limited.

3.2.  Potential Problems

3.2.1.  Impact of Middleboxes

   MPTCP has been designed in order to pass through the majority of
   middleboxes.  Empirical evidence suggests that new TCP options can
   successfully be used on most paths in the Internet.
   Nevertheless, some middleboxes may still refuse to pass MPTCP
   messages due to the presence of TCP options, or they may strip TCP
   options.  If this is the case, MPTCP should fall back to regular TCP.
   Although this will not create a problem for the application (its
   communication will be set up either way), there may be additional
   (and indeed, user-perceivable) delay while the first handshake fails.
   Therefore, an alternative approach could be to try both MPTCP and
   regular TCP connection attempts at the same time, and respond to
   whichever replies first (or apply a timeout on the MPTCP attempt,
   while having TCP SYN/ACK ready to reply to, thus reducing the setup
   delay by an RTT) in a similar fashion to the "Happy Eyeballs"
   proposal for IPv6 [16].

   An MPTCP implementation can learn the rate of MPTCP connection
   attempt successes or failures to particular hosts or networks, and on
   particular interfaces, and could therefore learn heuristics of when
   and when not to use MPTCP.  A detailed discussion of the various
   fallback mechanisms, for failures occurring at different points in
   the connection, is presented in [5].

   There may also be middleboxes that transparently change the length of
   content.  If such middleboxes are present, MPTCP's reassembly of the
   byte stream in the receiver is difficult.  Still, MPTCP can detect
   such middleboxes and then fall back to regular TCP.  An overview of
   the impact of middleboxes is presented in [4] and MPTCP's mechanisms
   to work around these are presented and discussed in [5].

   MPTCP can also have other unexpected implications.  For instance,
   intrusion detection systems could be triggered.  A full analysis of
   MPTCP's impact on such middleboxes is for further study after
   deployment experiments.

3.2.2.  Outdated Implicit Assumptions

   In regular TCP, there is a one-to-one mapping of the socket interface
   to a flow through a network.
   Since MPTCP can make use of multiple subflows, applications cannot
   implicitly rely on this one-to-one mapping any more.  Applications
   that require the transport along a single path can disable the use of
   MPTCP as described later in this document.  Examples include
   monitoring tools that want to measure the available bandwidth on a
   path, or routing protocols such as BGP that require the use of a
   specific link.

   Furthermore, an implementation may choose to persist an MPTCP
   connection even if an IP address is not allocated any more to a host,
   depending on the policy concerning the first subflow (fate-sharing,
   see Section 4.2.2).  In this case, the IP address exposed to an
   MPTCP-unaware application can differ from the addresses actually
   being used by MPTCP.  It is even possible that an IP address gets
   assigned to another host during the lifetime of an MPTCP connection.

3.2.3.  Security Implications

   The support for multiple IP addresses within one MPTCP connection can
   result in additional security vulnerabilities, such as possibilities
   for attackers to hijack connections.  The protocol design of MPTCP
   minimizes this risk.  An attacker on one of the paths can cause harm,
   but this is hardly an additional security risk compared to
   single-path TCP, which is vulnerable to man-in-the-middle attacks,
   too.  A detailed threat analysis of MPTCP is published in [6].

4.  Operation of MPTCP with Legacy Applications

4.1.  Overview of the MPTCP Network Stack

   MPTCP is an extension of TCP, but it is designed to be backward
   compatible for legacy (MPTCP-unaware) applications.  TCP interacts
   with other parts of the network stack by different interfaces.  The
   de facto standard API between TCP and applications is the sockets
   interface.  The position of MPTCP in the protocol stack is
   illustrated in Figure 1.

                    +-------------------------------+
                    |           Application         |
                    +-------------------------------+
                           ^                 |
                ~~~~~~~~~~~|~Socket Interface|~~~~~~~~~~~
                           |                 v
                    +-------------------------------+
                    |             MPTCP             |
                    + - - - - - - - + - - - - - - - +
                    | Subflow (TCP) | Subflow (TCP) |
                    +-------------------------------+
                    |       IP      |      IP       |
                    +-------------------------------+

                     Figure 1: MPTCP protocol stack

   In general, MPTCP can affect all interfaces that make assumptions
   about the coupling of a TCP connection to a single IP address and TCP
   port pair, to one sockets endpoint, to one network interface, or to a
   given path through the network.

   This means that there are two classes of applications:

   o  Legacy applications: These applications are unaware of MPTCP and
      use the existing API towards TCP without any changes.  This is the
      default case.

   o  MPTCP-aware applications: These applications indicate support for
      an enhanced MPTCP interface.  This document specifies a minimum
      set of API extensions for such applications.

   In the following, it is discussed to what extent MPTCP affects legacy
   applications using the existing sockets API.  The existing sockets
   API implies that applications deal with data structures that store,
   amongst others, the IP addresses and TCP port numbers of a TCP
   connection.  A design objective of MPTCP is that legacy applications
   can continue to use the established sockets API without any changes.
   However, in MPTCP there is a one-to-many mapping between the socket
   endpoint and the subflows.  This has several subtle implications for
   legacy applications using sockets API functions.

4.2.  Address Issues

4.2.1.  Specification of Addresses by Applications

   During binding, an application can either select a specific address,
   or bind to INADDR_ANY.
   Furthermore, on some systems other socket options (e.g.,
   SO_BINDTODEVICE) can be used to bind to a specific interface.  If an
   application uses a specific address or binds to a specific interface,
   then MPTCP MUST respect this and not interfere in the application's
   choices.  The binding to a specific address or interface implies that
   the application is not aware of MPTCP and will disable the use of
   MPTCP on this connection.  An application that wishes to bind to a
   specific set of addresses with MPTCP must use multipath-aware calls
   to achieve this (as described in Section 5.3.3).

   If an application binds to INADDR_ANY, it is assumed that the
   application does not care which addresses to use locally.  In this
   case, a local policy MAY allow MPTCP to automatically set up multiple
   subflows on such a connection.

   The basic sockets API of MPTCP-aware applications allows such
   applications to express further preferences in an MPTCP-compatible
   way (e.g. bind to a subset of interfaces only).

4.2.2.  Querying of Addresses by Applications

   Applications can use the getpeername() or getsockname() functions in
   order to retrieve the IP address of the peer or of the local socket.
   These functions can be used for various purposes, including security
   mechanisms, geo-location, or interface checks.  The socket API was
   designed with an assumption that a socket is using just one address,
   and since this address is visible to the application, the application
   may assume that the information provided by the functions is the same
   during the lifetime of a connection.  However, in MPTCP, unlike in
   TCP, there is a one-to-many mapping of a connection to subflows, and
   subflows can be added and removed while the connection continues to
   exist.  Therefore, MPTCP cannot expose addresses by getpeername() or
   getsockname() that are both valid and constant during the
   connection's lifetime.
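
   As an illustration of the two calls discussed above, the following
   minimal sketch queries both addresses on an ordinary TCP connection
   over the loopback interface.  It is written in Python, whose socket
   module mirrors the sockets API; the helper name endpoint_addresses
   and the loopback setup are assumptions made for this example only,
   and nothing in it is MPTCP-specific.

```python
import socket


def endpoint_addresses(sock):
    """Return the (local, peer) address tuples reported by the sockets API."""
    return sock.getsockname(), sock.getpeername()


# Loopback listener; port 0 lets the operating system pick a free port.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
listener_addr = listener.getsockname()

# Connect a client and accept the connection on the listener side.
client = socket.create_connection(listener_addr)
accepted, _ = listener.accept()

local, peer = endpoint_addresses(client)
print("local:", local, "peer:", peer)

# A legacy application typically reads these values once and keeps them.
cached_peer = peer

for s in (client, accepted, listener):
    s.close()
```

   An application that keeps a value such as cached_peer for the rest
   of the connection relies on the single-address assumption described
   above.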

   This problem is addressed as follows: If used by a legacy
   application, the MPTCP stack MUST always return the addresses of the
   first subflow of an MPTCP connection, in all circumstances, even if
   that particular subflow is no longer in use.

   As this address may not be valid any more if the first subflow is
   closed, the MPTCP stack MAY close the whole MPTCP connection if the
   first subflow is closed (i.e. fate sharing between the initial
   subflow and the MPTCP connection as a whole).  Whether to close the
   whole MPTCP connection by default SHOULD be controlled by a local
   policy.  Further experiments are needed to investigate its
   implications.

   The functions getpeername() and getsockname() SHOULD also always
   return the addresses of the first subflow if the socket is used by an
   MPTCP-aware application, in order to be consistent with MPTCP-unaware
   applications, and, e.g., also with SCTP.  Instead of getpeername()
   or getsockname(), MPTCP-aware applications can use new API calls,
   documented later, in order to retrieve the full list of address pairs
   for the subflows in use.

4.3.  Socket Option Issues

4.3.1.  General Guideline

   The existing sockets API includes options that modify the behavior of
   sockets and their underlying communications protocols.  Various
   socket options exist on socket, TCP, and IP level.  The value of an
   option can usually be set by the setsockopt() system function.  The
   getsockopt() function gets information.  In general, the existing
   sockets interface functions cannot configure each MPTCP subflow
   individually.  In order to be backward compatible, existing APIs
   therefore SHOULD apply to all subflows within one connection, as far
   as possible.

4.3.2.  Disabling of the Nagle Algorithm

   One commonly used TCP socket option (TCP_NODELAY) disables the Nagle
   algorithm as described in [2].  This option is also specified in the
   POSIX standard [17].
   Applications can use this option in combination with MPTCP exactly in
   the same way.  It then SHOULD disable the Nagle algorithm for the
   MPTCP connection, i.e., all subflows.

   In addition, the MPTCP protocol instance MAY use a different path
   scheduler algorithm if TCP_NODELAY is present.  For instance, it
   could use an algorithm that is optimized for latency-sensitive
   traffic.  Specific algorithms are outside the scope of this document.

4.3.3.  Buffer Sizing

   Applications can explicitly configure send and receive buffer sizes
   by the sockets API (SO_SNDBUF, SO_RCVBUF).  These socket options can
   also be used in combination with MPTCP and then affect the buffer
   size of the MPTCP connection.  However, when defining buffer sizes,
   application programmers should take into account that the transport
   over several subflows requires a certain amount of buffer for
   resequencing in the receiver.  MPTCP may also require more storage
   space in the sender, in particular, if retransmissions are sent over
   more than one path.  In addition, very small send buffers may prevent
   MPTCP from efficiently scheduling data over different subflows.
   Therefore, it does not make sense to use MPTCP in combination with
   small send or receive buffers.

   An MPTCP implementation MAY set a lower bound for send and receive
   buffers and treat a small buffer size request as an implicit request
   not to use MPTCP.

4.3.4.  Other Socket Options

   Some network stacks also provide other implementation-specific socket
   options or interfaces that affect TCP's behavior.  If a network stack
   supports MPTCP, it must be ensured that these options do not
   interfere.

4.4.  Default Enabling of MPTCP

   It is up to a local policy at the end system whether a network stack
   should automatically enable MPTCP for sockets even if there is no
   explicit sign of MPTCP awareness of the corresponding application.
   Such a choice may be under the control of the user through system
   preferences.

   The enabling of MPTCP, either by application or by system defaults,
   does not necessarily mean that MPTCP will always be used.  Both
   endpoints must support MPTCP, and there must be multiple addresses at
   at least one endpoint, for MPTCP to be used.  Even if those
   requirements are met, however, MPTCP may not be immediately used on a
   connection.  It may make sense for multiple paths to be brought into
   operation only after a given period of time, or if the connection is
   saturated.

4.5.  Summary of Advices to Application Developers

   o  Using the default MPTCP configuration: Like TCP, MPTCP is designed
      to be efficient and robust in the default configuration.
      Application developers should not explicitly configure TCP (or
      MPTCP) features unless this is really needed.

   o  Socket buffer dimensioning: Multipath transport requires larger
      buffers in the receiver for resequencing, as already explained.
      Applications should use reasonable buffer sizes (such as the
      operating system default values) in order to fully benefit from
      MPTCP.  A full discussion of buffer sizing issues is given in [5].

   o  Facilitating stack-internal heuristics: The path management and
      data scheduling by MPTCP is realized by stack-internal algorithms
      that may implicitly try to self-optimize their behavior according
      to assumed application needs.  For instance, an MPTCP
      implementation may use heuristics to determine whether an
      application requires delay-sensitive or bulk data transport, using
      for instance port numbers, the TCP_NODELAY socket option, or the
      application's read/write patterns as input parameters.  An
      application developer can facilitate the operation of such
      heuristics by avoiding atypical interface use cases.
      For instance, for long bulk data transfers, it makes no sense to
      enable the TCP_NODELAY socket option, nor is it reasonable to use
      many small subsequent socket "send()" calls with small amounts of
      data only.

5.  Basic API for MPTCP-aware Applications

5.1.  Design Considerations

   While applications can use MPTCP with the unmodified sockets API,
   multipath transport results in many degrees of freedom.  MPTCP
   manages the data transport over different subflows automatically.  By
   default, this is transparent to the application, but an application
   could use an additional API to interface with the MPTCP layer and to
   control important aspects of the MPTCP implementation's behaviour.

   This document describes a basic MPTCP API.  The API contains a
   minimum set of functions that provide an equivalent level of control
   and information as exists for regular TCP.  It maintains backward
   compatibility with legacy applications.

   An advanced MPTCP API is outside the scope of this document.  The
   basic API does not allow a sender or a receiver to express
   preferences about the management of paths or the scheduling of data,
   even if this can have a significant performance impact and if an
   MPTCP implementation could benefit from additional guidance by
   applications.  A list of potential further API extensions is provided
   in the appendix.  The specification of such an advanced API is for
   further study and may partly be implementation-specific.

   MPTCP mainly affects the sending of data.  Therefore, the basic API
   only affects the sender side of a data transfer.  A receiver may also
   have preferences about data transfer choices, and it may have
   performance requirements, too.  Yet, the configuration of such
   preferences is outside of the scope of the basic API.

5.2.  Requirements on the Basic MPTCP API

   Because of the importance of the sockets interface there are several
   fundamental design objectives for the basic interface between MPTCP
   and applications:

   o  Consistency with existing sockets APIs must be maintained as far
      as possible.  In order to support the large base of applications
      using the original API, a legacy application must be able to
      continue to use standard socket interface functions when run on a
      system supporting MPTCP.  Also, MPTCP-aware applications should be
      able to access the socket without any major changes.

   o  Sockets API extensions must be minimized and independent of an
      implementation.

   o  The interface should handle both IPv4 and IPv6.

   The following is a list of the core requirements for the basic API:

   REQ1:  Turn on/off MPTCP: An application should be able to request to
          turn on or turn off the usage of MPTCP.  This means that an
          application should be able to explicitly request the use of
          MPTCP if this is possible.  Applications should also be able
          to request not to enable MPTCP and to use regular TCP
          transport instead.  This can be implicit in many cases, since
          MPTCP must be disabled by the use of binding to a specific
          address.  MPTCP may also be enabled if an application uses a
          dedicated multipath address family (such as AF_MULTIPATH,
          [8]).

   REQ2:  An application should be able to restrict MPTCP to binding to
          a given set of addresses.

   REQ3:  An application should be able to obtain information on the
          addresses used by the MPTCP subflows.

   REQ4:  An application should be able to extract a unique identifier
          for the connection (per endpoint).

   The first requirement is the most important one, since some
   applications could benefit a lot from MPTCP, but there are also cases
   in which it hardly makes sense.  The existing sockets API provides
   similar mechanisms to enable or disable advanced TCP features.
The second requirement corresponds to the binding of addresses with the
bind() socket call, or, e.g., explicit device bindings with a
SO_BINDTODEVICE option. The third requirement ensures that there is
an equivalent to getpeername() or getsockname() that is able to deal
with more than one subflow. Finally, it should be possible for the
application to retrieve a unique connection identifier (local to the
endpoint on which it is running) for the MPTCP connection. This is
equivalent to using the (address, port) pair for a connection
identifier in single-path TCP, which is no longer static in MPTCP.

An application can continue to use getpeername() or getsockname() in
addition to the basic MPTCP API. In that case, both functions return
the corresponding addresses of the first subflow, as already
explained.

5.3. Sockets Interface Extensions by the Basic MPTCP API

5.3.1. Overview

The abstract, basic MPTCP API consists of a set of new values that
are associated with an MPTCP socket. Such values may be used for
changing properties of an MPTCP connection, or retrieving
information. These values could be accessed by new symbols on
existing calls such as setsockopt() and getsockopt(), or could be
implemented as entirely new function calls. This implementation
decision is out of scope for this document. The following list
presents symbolic names for these MPTCP socket settings.

o  TCP_MULTIPATH_ENABLE: Enable/disable MPTCP

o  TCP_MULTIPATH_ADD: Bind MPTCP to a set of given local addresses,
   or add a new local address to an existing MPTCP connection

o  TCP_MULTIPATH_REMOVE: Remove a local address from an MPTCP
   connection

o  TCP_MULTIPATH_SUBFLOWS: Get the pairs of addresses currently used
   by the MPTCP subflows

o  TCP_MULTIPATH_CONNID: Get the local connection identifier for this
   MPTCP connection

Table 1 shows a list of the abstract socket operations for the
basic configuration of MPTCP. The first column gives the symbolic
name of the operation. The second and third columns indicate whether
the operation provides values to be read ("Get") or takes values to
configure ("Set"). The fourth column lists the type of data
associated with this operation.

+------------------------+-----+-----+----------------------------+
| Name                   | Get | Set | Data type                  |
+------------------------+-----+-----+----------------------------+
| TCP_MULTIPATH_ENABLE   |  o  |  o  | boolean                    |
| TCP_MULTIPATH_ADD      |     |  o  | list of addresses          |
| TCP_MULTIPATH_REMOVE   |     |  o  | list of addresses          |
| TCP_MULTIPATH_SUBFLOWS |  o  |     | list of pairs of addresses |
| TCP_MULTIPATH_CONNID   |  o  |     | 32-bit integer             |
+------------------------+-----+-----+----------------------------+

                 Table 1: MPTCP Socket Operations

There are restrictions on when these new socket operations can be used:

o  TCP_MULTIPATH_ENABLE: This value SHOULD only be set before the
   establishment of a TCP connection. Its value SHOULD only be read
   after the establishment of a connection.

o  TCP_MULTIPATH_ADD: This operation can be applied both before
   connection setup and during a connection. If used before, it
   controls the local addresses that an MPTCP connection can use. In
   the latter case, it allows MPTCP to use an additional local
   address, if there has been a restriction before connection setup.

o  TCP_MULTIPATH_REMOVE: This operation can be applied both before
   connection setup and during a connection. In both cases, it
   removes an address from the list of local addresses that may be
   used by subflows.

o  TCP_MULTIPATH_SUBFLOWS: This value is read-only and SHOULD only be
   used after connection setup.

o  TCP_MULTIPATH_CONNID: This value is read-only and SHOULD only be
   used after connection setup.

5.3.2. Enabling and Disabling of MPTCP

An application can explicitly indicate multipath capability by
setting TCP_MULTIPATH_ENABLE to a value larger than 0. In this case,
the MPTCP implementation SHOULD try to negotiate MPTCP for that
connection. Note that multipath transport will not necessarily be
enabled, as it requires multiple addresses and support in the other
end-system and potentially also on middleboxes.

Building on the backwards-compatibility specified in Section 4.2.1,
if an application enables MPTCP but binds to a specific address or
interface, MPTCP MUST be enabled, but MPTCP MUST respect the
application's choice and only use addresses that are explicitly
provided by the application. Note that it would be possible for an
application to use the legacy bindings, and then expand on them by
using TCP_MULTIPATH_ADD. Note also that it is possible for more than
one local address to be initially available to MPTCP in this case, if
an application has bound to a specific interface with multiple
addresses.

An application can disable MPTCP by setting TCP_MULTIPATH_ENABLE to a
value of 0. In that case, MPTCP MUST NOT be used on that connection.

After connection establishment, an application can get the value of
TCP_MULTIPATH_ENABLE. A value of 0 then means lack of MPTCP support.
Any value equal to or larger than 1 means that MPTCP is supported.
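
As an illustration only, the enable/disable rules of this section can
be captured in a small in-memory model. The sketch is written in
Python and does not touch a real socket. The class MptcpSocketModel
and all of its method and attribute names are hypothetical and are not
defined by this document or by any operating system. The model also
assumes that MPTCP is enabled by default, and it treats the SHOULD
restrictions on when the value is set and read as hard errors for
simplicity.

```python
class MptcpSocketModel:
    """Toy in-memory model of the TCP_MULTIPATH_ENABLE rules.

    This is not a real socket: every name here is hypothetical and only
    mirrors the abstract operations described in the text.
    """

    def __init__(self, enable=1):
        self.enable_requested = enable   # assumed system default: enabled
        self.bound_addresses = None      # None means "no explicit bind"
        self.connected = False
        self.mptcp_negotiated = False
        self.usable_addresses = []

    def set_enable(self, value):
        # The value should only be set before connection establishment.
        if self.connected:
            raise RuntimeError("TCP_MULTIPATH_ENABLE set after connect")
        self.enable_requested = value

    def bind(self, address):
        # A legacy bind restricts MPTCP to the explicitly given address.
        self.bound_addresses = [address]

    def connect(self, local_addresses, peer_supports_mptcp):
        if self.bound_addresses is None:
            self.usable_addresses = list(local_addresses)
        else:
            self.usable_addresses = [
                a for a in local_addresses if a in self.bound_addresses
            ]
        # A value of 0 forbids MPTCP; otherwise negotiation is attempted
        # and succeeds only if the peer supports MPTCP as well.
        self.mptcp_negotiated = (
            self.enable_requested > 0 and peer_supports_mptcp
        )
        self.connected = True

    def get_enable(self):
        # The value should only be read after connection establishment.
        if not self.connected:
            raise RuntimeError("TCP_MULTIPATH_ENABLE read before connect")
        return 1 if self.mptcp_negotiated else 0
```

In this model, a socket that is bound to one address and then
connected to an MPTCP-capable peer still reports a value of 1, but its
list of usable local addresses contains only the bound address. A
socket connected to a peer without MPTCP support reports 0 regardless
of the value requested before connection setup.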

As an alternative to setting an explicit value, an application could
also use a new, separate address family called AF_MULTIPATH [8].
This separate address family can be used to exchange multiple
addresses between an application and the standard sockets API, and
additionally acts as an explicit indication that an application is
MPTCP-aware, i.e., that it can deal with the semantic changes of the
sockets API, in particular concerning getpeername() and
getsockname(). The usage of AF_MULTIPATH is also more flexible with
respect to multipath transport, either IPv4 or IPv6, or both in
parallel [8].

5.3.3. Binding MPTCP to Specified Addresses

Before connection establishment, an application can use the
TCP_MULTIPATH_ADD function to indicate a set of local IP addresses
that MPTCP may bind to. The parameter of the function is a list of
addresses in a corresponding data structure. By extension, this
operation will also control the list of addresses that can be
advertised to the peer via MPTCP signalling.

If an application binds to a specific address or interface, it is not
required to use the TCP_MULTIPATH_ADD operation for that address. As
explained in Section 5.3.2, MPTCP MUST only use the explicitly
specified addresses in that case.

An application MAY also indicate a TCP port number that MPTCP should
bind to for a given address. The port number MAY be different to the
one used by existing subflows. If no port number is provided by the
application, the port number is automatically selected by the MPTCP
implementation, and will usually be the same across all subflows.

This operation can also be used to modify the address list in use
during the lifetime of an MPTCP connection. In this case, it is used
to indicate a set of additional local addresses that the MPTCP
connection can make use of, and which can be signalled to the peer.
It should be noted that this signal is only a hint, and an MPTCP
implementation MAY use only a subset of the addresses.

The TCP_MULTIPATH_REMOVE operation can be used to remove a local
address, or a set of local addresses, from an MPTCP connection. MPTCP
MUST close any corresponding subflows (i.e., those using the local
address that is no longer present), and signal the removal of the
address to the peer. If alternative paths are available using the
supplied address list but MPTCP is not currently using them, an MPTCP
implementation SHOULD establish alternative subflows before
undertaking the address removal.

It should be remembered that these operations SHOULD support both
IPv4 and IPv6 addresses, potentially in the same call.

5.3.4. Querying the MPTCP Subflow Addresses

An application can get a list of the addresses used by the currently
established subflows by means of the read-only TCP_MULTIPATH_SUBFLOWS
operation. The return value is a list of pairs of tuples of IP
address and TCP port number. In one pair, the first tuple refers to
the local IP address and the local TCP port, and the second one to
the remote IP address and remote TCP port used by the subflow. The
list MUST only include established subflows. Both addresses in each
pair MUST be either IPv4 or IPv6.

5.3.5. Getting a Unique Connection Identifier

An application that wants a unique identifier for the connection,
analogous to an (address, port) pair in regular TCP, can query the
TCP_MULTIPATH_CONNID value to get a local connection identifier for
the MPTCP connection.

This is a 32-bit number, and SHOULD be the same as the local
connection identifier sent in the MPTCP handshake.

6. Other Compatibility Issues

6.1. Usage of the SCTP Socket API

For dealing with multi-homing, several socket API extensions have
been defined for SCTP [13].
As MPTCP realizes multipath transport
from and to multi-homed end-systems, some of these interface function
calls are actually applicable to MPTCP in a similar way.

API developers MAY wish to integrate SCTP and MPTCP calls to provide
a consistent interface to the application. Yet, it must be
emphasized that the transport service provided by MPTCP is different
to SCTP, and this is why not all SCTP API functions can be mapped
directly to MPTCP. Furthermore, a network stack implementing MPTCP
does not necessarily support SCTP and its specific socket interface
extensions. This is why the basic API of MPTCP defines additional
socket options only, which are a backward compatible extension of
TCP's application interface. An integration with the SCTP API is
outside the scope of the basic API.

6.2. Incompatibilities with other Multihoming Solutions

The use of MPTCP can interact with various related sockets API
extensions. The use of a multihoming shim layer conflicts with
multipath transport such as MPTCP or SCTP [11]. Care should be taken
not to confuse the usage of MPTCP with the overlapping features of
other APIs:

o  SHIM API [11]: This API specifies sockets API extensions for the
   multihoming shim layer.

o  HIP API [12]: The Host Identity Protocol (HIP) also results in a
   new API.

o  API for Mobile IPv6 [10]: For Mobile IPv6, a significantly
   extended socket API exists as well.

In order to avoid any conflict, multiaddressed MPTCP SHOULD NOT be
enabled if a network stack uses SHIM6, HIP, or Mobile IPv6.
Furthermore, applications should not try to use both the MPTCP API
and another multihoming or mobility layer API.

It is possible, however, that some of the MPTCP functionality, such
as congestion control, could be used in a SHIM6 or HIP environment.
Such operation is outside the scope of this document.

6.3. Interactions with DNS

In multihomed or multiaddressed environments, there are various
issues that are not specific to MPTCP, but have to be considered,
too. These problems are summarized in [14].

Specifically, there can be interactions with DNS. Whilst it is
expected that an application will iterate over the list of addresses
returned from a call such as getaddrinfo(), MPTCP itself MUST NOT
make any assumptions about multiple A or AAAA records from the same
DNS query referring to the same host, as it is possible that multiple
addresses refer to multiple servers for load balancing purposes.

7. Security Considerations

This document first defines the behaviour of the standard TCP/IP API
for MPTCP-unaware applications. As the function offered by this
interface is equivalent to existing APIs and does not offer
additional functionality, the interface design does not result in new
security issues. In general, enabling MPTCP has some security
implications for applications, which are introduced in Section 5.3.3,
and these threats are further detailed in [6]. The protocol
specification of MPTCP [5] defines several mechanisms to protect
MPTCP against those attacks.

In addition, the basic MPTCP API for MPTCP-aware applications defines
functions that provide an equivalent level of control and information
as exists for regular TCP. New functions enable adding and removing
local addresses from an MPTCP connection (TCP_MULTIPATH_ADD and
TCP_MULTIPATH_REMOVE). These functions don't add security threats if
the MPTCP stack verifies that the addresses provided by the
application are indeed available as source addresses for subflows.

However, applications should use the TCP_MULTIPATH_ADD function with
care, as new subflows might get established to those addresses.
Furthermore, it could result in some form of information leakage
since MPTCP might advertise those addresses to the other connection
endpoint, which could learn IP addresses of interfaces that are not
visible otherwise.

Use of different addresses should not be assumed to lead to use of
different paths, especially for security purposes.

MPTCP-aware applications should also take care when querying and
using information about the addresses used by subflows
(TCP_MULTIPATH_SUBFLOWS). As MPTCP can dynamically open and close
subflows, a list of addresses queried once can get outdated during
the lifetime of an MPTCP connection. Then, the list may contain
invalid entries, i.e., addresses that are not used any more, or that
might not even be valid. Applications that want to ensure that MPTCP
only uses a certain set of addresses should explicitly bind to those
addresses.

8. IANA Considerations

No IANA considerations.

9. Conclusion

This document discusses MPTCP's application implications and
specifies a basic MPTCP API. For legacy applications, it is ensured
that the existing sockets API continues to work. MPTCP-aware
applications can use the basic MPTCP API that provides some control
over the transport layer equivalent to regular TCP. A more
fine-granular interaction between applications and MPTCP requires an
advanced MPTCP API, which is not specified in this document.

10. Acknowledgments

The authors sincerely thank the following people for their helpful
comments and reviews of the document: Costin Raiciu, Philip Eardley,
Javier Ubillos, Michael Tuexen, and John Leslie.

Michael Scharf is supported by the German-Lab project
(http://www.german-lab.de/) funded by the German Federal Ministry of
Education and Research (BMBF).
Alan Ford was previously supported by
Roke Manor Research and by Trilogy (http://www.trilogy-project.org/),
a research project (ICT-216372) partially funded by the European
Community under its Seventh Framework Program. The views expressed
here are those of the author(s) only. The European Commission is not
liable for any use that may be made of the information in this
document.

11. References

11.1. Normative References

[1]  Postel, J., "Transmission Control Protocol", STD 7, RFC 793,
     September 1981.

[2]  Braden, R., "Requirements for Internet Hosts - Communication
     Layers", STD 3, RFC 1122, October 1989.

[3]  Bradner, S., "Key words for use in RFCs to Indicate Requirement
     Levels", BCP 14, RFC 2119, March 1997.

[4]  Ford, A., Raiciu, C., Handley, M., Barre, S., and J. Iyengar,
     "Architectural Guidelines for Multipath TCP Development",
     RFC 6182, March 2011.

[5]  Ford, A., Raiciu, C., Handley, M., and O. Bonaventure, "TCP
     Extensions for Multipath Operation with Multiple Addresses",
     draft-ietf-mptcp-multiaddressed-06 (work in progress),
     January 2012.

[6]  Bagnulo, M., "Threat Analysis for TCP Extensions for Multipath
     Operation with Multiple Addresses", RFC 6181, March 2011.

[7]  Raiciu, C., Handley, M., and D. Wischik, "Coupled Congestion
     Control for Multipath Transport Protocols", RFC 6356,
     October 2011.

11.2. Informative References

[8]  Sarolahti, P., "Multi-address Interface in the Socket API",
     draft-sarolahti-mptcp-af-multipath-01 (work in progress),
     March 2010.

[9]  Stevens, W., Thomas, M., Nordmark, E., and T. Jinmei, "Advanced
     Sockets Application Program Interface (API) for IPv6",
     RFC 3542, May 2003.

[10] Chakrabarti, S. and E. Nordmark, "Extension to Sockets API for
     Mobile IPv6", RFC 4584, July 2006.

[11] Komu, M., Bagnulo, M., Slavov, K., and S.
     Sugimoto, "Sockets
     Application Program Interface (API) for Multihoming Shim",
     RFC 6316, July 2011.

[12] Komu, M. and T. Henderson, "Basic Socket Interface Extensions
     for the Host Identity Protocol (HIP)", RFC 6317, July 2011.

[13] Stewart, R., Tuexen, M., Poon, K., Lei, P., and V. Yasevich,
     "Sockets API Extensions for the Stream Control Transmission
     Protocol (SCTP)", RFC 6458, December 2011.

[14] Blanchet, M. and P. Seite, "Multiple Interfaces and
     Provisioning Domains Problem Statement", RFC 6418,
     November 2011.

[15] Wasserman, M. and P. Seite, "Current Practices for Multiple-
     Interface Hosts", RFC 6419, November 2011.

[16] Wing, D. and A. Yourtchenko, "Happy Eyeballs: Success with
     Dual-Stack Hosts", draft-ietf-v6ops-happy-eyeballs-07 (work in
     progress), December 2011.

[17] "IEEE Std. 1003.1-2008 Standard for Information Technology --
     Portable Operating System Interface (POSIX). Open Group
     Technical Standard: Base Specifications, Issue 7, 2008."

Appendix A. Requirements on a Future Advanced MPTCP API

A.1. Design Considerations

Multipath transport results in many degrees of freedom. The basic
MPTCP API only defines a minimum set of the API extensions for the
interface between the MPTCP layer and applications, which does not
offer much control of the MPTCP implementation's behaviour. A
future, advanced API could address further features of MPTCP and
provide more control.

Applications that use TCP may have different requirements on the
transport layer. While developers have become used to the
characteristics of regular TCP, new opportunities created by MPTCP
could allow the service provided to be optimised further. An
advanced API could enable MPTCP-aware applications to specify
preferences and control certain aspects of the behavior, in addition
to the simple control provided by the basic interface.
An advanced
API could also address aspects that are completely out-of-scope of
the basic API, for example, the question whether a receiving
application could influence the sending policy.

Furthermore, an advanced MPTCP API could be part of a new overall
interface between the network stack and applications that addresses
other issues as well, such as the split between identifiers and
locators. An API that does not use IP addresses (but instead, e.g., a
connectbyname() function) would be useful for numerous purposes,
independent of MPTCP.

This appendix documents a list of potential usage scenarios and
requirements for the advanced API. The specification and
implementation of a corresponding API is outside the scope of this
document.

A.2. MPTCP Usage Scenarios and Application Requirements

There are different MPTCP usage scenarios. An application that
wishes to transmit bulk data will want MPTCP to provide a high
throughput service immediately, through creating and maximising
utilisation of all available subflows. This is the default MPTCP use
case.

But at the other extreme, there are applications that are highly
interactive, but require only a small amount of throughput, and these
are optimally served by low latency and jitter stability. In such a
situation, it would be preferable for the traffic to use only the
lowest latency subflow (assuming it has sufficient capacity), maybe
with one or two additional subflows for resilience and recovery
purposes. The key challenge for such a strategy is that the delay on
a path may fluctuate significantly and that just always selecting the
path with the smallest delay might result in instability.
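
The instability concern can be made concrete with a toy path selector.
The following Python sketch is not part of MPTCP or of any API
described in this document. It assumes one possible mitigation, a
hysteresis margin, which is introduced here purely for illustration,
and the function names, the margin_ms parameter, and the delay samples
used with it are all hypothetical. The selector leaves the current
subflow only when another subflow is better by more than the margin.

```python
def select_subflow(current, delays_ms, margin_ms=10.0):
    """Return the subflow to use, switching only on a clear improvement.

    current   -- name of the subflow in use, or None if none is chosen yet
    delays_ms -- dict mapping subflow name to its latest delay estimate
    margin_ms -- hypothetical hysteresis margin in milliseconds
    """
    best = min(delays_ms, key=delays_ms.get)
    if current is None or current not in delays_ms:
        return best
    # Leave the current subflow only if the best one beats it by more
    # than the margin; small fluctuations are ignored.
    if delays_ms[current] - delays_ms[best] > margin_ms:
        return best
    return current


def count_switches(samples, margin_ms):
    """Count how often the selected subflow changes over a delay trace."""
    current = None
    switches = 0
    for delays in samples:
        chosen = select_subflow(current, delays, margin_ms)
        if current is not None and chosen != current:
            switches += 1
        current = chosen
    return switches
```

With a margin of zero, this selector reduces to always picking the
subflow with the smallest delay, so two paths whose delays alternate
by a few milliseconds cause a switch at almost every sample. A margin
larger than the fluctuation keeps the traffic on its initial subflow
while still allowing a switch when the current path degrades
substantially.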

The choice between bulk data transport and latency-sensitive
transport affects the scheduler in terms of whether traffic should
be, by default, sent on one subflow or across several ones. Even if
the total bandwidth required is less than that available on an
individual path, it is desirable to spread this load to reduce stress
on potential bottlenecks, and this is why this method should be the
default for bulk data transport. However, that may not be optimal
for applications that require latency/jitter stability.

In the case of the latter option, a further question arises: Should
additional subflows be used whenever the primary subflow is
overloaded, or only when the primary path fails (hot-standby)? In
other words, is latency stability or bandwidth more important to the
application? This results in two different options: Firstly, there
is the single path which can overflow into an additional subflow; and
secondly, there is single-path with hot-standby, whereby an
application may want an alternative backup subflow in order to
improve resilience. If data delivery on the first subflow
fails, the data transport could immediately be continued on the
second subflow, which is idle otherwise.

Yet another complication is introduced with the potential that MPTCP
introduces for changes in available bandwidth as the number of
available subflows changes. Such jitter in bandwidth may prove
confusing for some applications such as video or audio streaming that
dynamically adapt codecs based on available bandwidth. Such
applications may prefer MPTCP to attempt to provide a consistent
bandwidth as far as is possible, and avoid maximising the use of all
subflows.

A further, mostly orthogonal question is whether data should be
duplicated over the different subflows, in particular if there is
spare capacity.
This could improve both the timeliness and
reliability of data delivery.

In summary, there are at least three possible performance objectives
for multipath transport (not necessarily disjoint):

1. High bandwidth

2. Low latency and jitter stability

3. High reliability

In an advanced API, applications could provide high-level guidance to
the MPTCP implementation concerning these performance requirements,
for instance by indicating which one is considered to be the most
important. The MPTCP stack would then use internal mechanisms to
fulfill this abstract indication of a desired service, as far as
possible. This would affect both the assignment of data (including
retransmissions) to existing subflows (e.g., 'use all in parallel',
'use as overflow', 'hot standby', 'duplicate traffic') and the
decisions on when to set up additional subflows and to which
addresses. In both cases different policies can exist, which can be
expected to be implementation-specific.

Therefore, an advanced API could provide a mechanism by which
applications can specify their high-level requirements in an
implementation-independent way. One possibility would be to select
one "application profile" out of a number of choices that
characterize typical applications. Yet, as applications today do not
have to inform TCP about their communication requirements, further
study is required to determine whether such an approach would be
realistic.

Of course, independent of an advanced API, such functionality could
also partly be achieved by MPTCP-internal heuristics that infer some
application preferences, e.g., from existing socket options, such as
TCP_NODELAY. Whether this would be reliable, and indeed appropriate,
is for further study, too.

A.3. Potential Requirements on an Advanced MPTCP API

The following is a list of potential requirements for an advanced
MPTCP API beyond the features of the basic API. It is included here
for information only:

REQ5:  An application should be able to establish MPTCP connections
       without using IP addresses as locators.

REQ6:  An application should be able to obtain usage information and
       statistics about all subflows (e.g., ratio of traffic sent
       via this subflow).

REQ7:  An application should be able to request a change in the
       number of subflows in use, thus triggering removal or
       addition of subflows. An even finer control granularity
       would be a request for the establishment of a new subflow to
       a provided destination, or a request for the termination of a
       specified, existing subflow.

REQ8:  An application should be able to inform the MPTCP
       implementation about its high-level performance requirements,
       e.g., in form of a profile.

REQ9:  An application should be able to indicate communication
       characteristics, e.g., the expected amount of data to be
       sent, the expected duration of the connection, or the
       expected rate at which data is provided. Applications may in
       some cases be able to forecast such properties. If so, such
       information could be an additional input parameter for
       heuristics inside the MPTCP implementation, which could be
       useful for example to decide when to set up additional
       subflows.

REQ10: An application should be able to control the automatic
       establishment/termination of subflows. This would imply a
       selection among different heuristics of the path manager,
       e.g., 'try as soon as possible', 'wait until there is a bunch
       of data', etc.

REQ11: An application should be able to set preferred subflows or
       subflow usage policies.
       This would result in a selection
       among different configurations of the multipath scheduler.
       For instance, an application might want to use certain
       subflows as backup only.

REQ12: An application should be able to control the level of
       redundancy by telling whether segments should be sent on more
       than one path in parallel.

An advanced API fulfilling these requirements would allow application
developers to more specifically configure MPTCP. It could avoid
suboptimal decisions of internal, implicit heuristics. However, it
is unclear whether all of these requirements would have a significant
benefit to applications, since they are going above and beyond what
the existing API to regular TCP provides.

A subset of these functions might also be implemented system wide or
by other configuration mechanisms. These implementation details are
left for further study.

A.4. Integration with the SCTP Socket API

The advanced API may also integrate or use the SCTP Socket API. The
following functions that are defined for SCTP have functionality
similar to that of the basic MPTCP API:

o  sctp_bindx()

o  sctp_connectx()

o  sctp_getladdrs()

o  sctp_getpaddrs()

o  sctp_freeladdrs()

o  sctp_freepaddrs()

The syntax and semantics of these functions are described in [13].

A potential objective for the advanced API is to provide a consistent
MPTCP and SCTP interface to the application. This is left for
further study in this document.

Appendix B. Change History of the Document

Changes compared to version draft-ietf-mptcp-api-03:

o  Security consideration section

o  Better explanation of the implications of explicitly specified
   addresses, most notably during the bind call

o  Editorial changes

Changes compared to version draft-ietf-mptcp-api-02:

o  Updated references

o  Editorial changes

Changes compared to version draft-ietf-mptcp-api-01:

o  Additional text on outdated assumptions if an MPTCP application
   does not use fate sharing.

o  The appendix explicitly mentions an integration of the advanced
   MPTCP API and the SCTP API as a potential objective, which is left
   for further study for the basic API.

o  A short additional explanation of the parameters of the abstract
   functions TCP_MULTIPATH_ADD and TCP_MULTIPATH_REMOVE.

o  Better explanation when TCP_MULTIPATH_REMOVE may be used.

Changes compared to version draft-ietf-mptcp-api-00:

o  Explicitly specify that the TCP_MULTIPATH_SUBFLOWS function
   returns port numbers, too. Furthermore, add a new comment that
   TCP_MULTIPATH_ADD permits the specification of a port number.

o  Mention possible additional extended API functions for the
   indication of application characteristics and for backup paths,
   based on comments received from the community.

o  Mention alternative approaches for avoiding non-MPTCP-capable
   paths to reduce impact on applications.

Changes compared to version draft-scharf-mptcp-api-03:

o  Removal of explicit references to "socket options" and getsockopt/
   setsockopt.

o  Change of TCP_MULTIPATH_BIND to TCP_MULTIPATH_ADD and
   TCP_MULTIPATH_REMOVE.

o  Mention of stability of bandwidth as another potential QoS
   parameter for the advanced API.

o  Address comments received from Philip Eardley: Explanation of the
   API terminology, more explicit statement concerning applications
   that bind to a specific address, and some smaller editorial fixes

Changes compared to version draft-scharf-mptcp-api-02:

o  Definition of the behavior of getpeername() and getsockname() when
   being called by an MPTCP-aware application.

o  Discussion of the possibility that an MPTCP implementation could
   support the SCTP API, as far as it is applicable to MPTCP.

o  Various editorial fixes.

Changes compared to version draft-scharf-mptcp-api-01:

o  Second half of the document completely restructured

o  Separation between a basic API and an advanced API: The focus of
   the document is the basic API only; all text concerning a
   potential extended API is moved to the appendix

o  Several clarifications, e.g., concerning buffer sizing and the
   use of different scheduling strategies triggered by TCP_NODELAY

o  Additional references

Changes compared to version draft-scharf-mptcp-api-00:

o  Distinction between legacy and MPTCP-aware applications

o  Guidance concerning default enabling, reaction to the shutdown of
   the first subflow, etc.

o  Reference to a potential use of AF_MULTIPATH

o  Additional references to related work

Authors' Addresses

Michael Scharf
Alcatel-Lucent Bell Labs
Lorenzstrasse 10
70435 Stuttgart
Germany

EMail: michael.scharf@alcatel-lucent.com

Alan Ford
Cisco
Ruscombe Business Park
Ruscombe, Berkshire RG10 9NN
UK

EMail: alanford@cisco.com