Internet Engineering Task Force                                M. Scharf
Internet-Draft                                  Alcatel-Lucent Bell Labs
Intended status: Informational                                   A. Ford
Expires: July 23, 2013                                             Cisco
                                                        January 19, 2013

               MPTCP Application Interface Considerations
                        draft-ietf-mptcp-api-07

Abstract

   Multipath TCP (MPTCP) adds the capability of using multiple paths to
   a regular TCP session.
   Even though it is designed to be totally
   backward compatible with applications, the data transport differs
   compared to regular TCP, and there are several additional degrees of
   freedom that applications may wish to exploit. This document
   summarizes the impact that MPTCP may have on applications, such as
   changes in performance. Furthermore, it discusses compatibility
   issues of MPTCP in combination with non-MPTCP-aware applications.
   Finally, the document describes a basic application interface which
   is a simple extension of TCP's interface for MPTCP-aware
   applications.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on July 23, 2013.

Copyright Notice

   Copyright (c) 2013 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.
   Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  Comparison of MPTCP and Regular TCP
     3.1.  Effect on Performance
       3.1.1.  Throughput
       3.1.2.  Delay
       3.1.3.  Resilience
     3.2.  Potential Problems
       3.2.1.  Impact of Middleboxes
       3.2.2.  Dealing with Multiple Addresses Inside Applications
       3.2.3.  Security Implications
   4.  Operation of MPTCP with Legacy Applications
     4.1.  Overview of the MPTCP Network Stack
     4.2.  Address Issues
       4.2.1.  Specification of Addresses by Applications
       4.2.2.  Querying of Addresses by Applications
     4.3.  MPTCP Connection Management
       4.3.1.  Reaction to Close Call by Application
       4.3.2.  Other Connection Management Functions
     4.4.  Socket Option Issues
       4.4.1.  General Guideline
       4.4.2.  Disabling of the Nagle Algorithm
       4.4.3.  Buffer Sizing
       4.4.4.  Other Socket Options
     4.5.  Default Enabling of MPTCP
     4.6.  Summary of Advice to Application Developers
   5.  Basic API for MPTCP-aware Applications
     5.1.  Design Considerations
     5.2.  Requirements on the Basic MPTCP API
     5.3.  Sockets Interface Extensions by the Basic MPTCP API
       5.3.1.  Overview
       5.3.2.  Enabling and Disabling of MPTCP
       5.3.3.  Binding MPTCP to Specified Addresses
       5.3.4.  Querying the MPTCP Subflow Addresses
       5.3.5.  Getting a Unique Connection Identifier
   6.  Other Compatibility Issues
     6.1.  Usage of TLS over MPTCP
     6.2.  Usage of the SCTP Socket API
     6.3.  Incompatibilities with other Multihoming Solutions
     6.4.  Interactions with DNS
   7.  Security Considerations
   8.  IANA Considerations
   9.  Conclusion
   10. Acknowledgments
   11. References
     11.1.  Normative References
     11.2.  Informative References
   Appendix A.  Requirements on a Future Advanced MPTCP API
     A.1.  Design Considerations
     A.2.  MPTCP Usage Scenarios and Application Requirements
     A.3.  Potential Requirements on an Advanced MPTCP API
     A.4.  Integration with the SCTP Socket API
   Appendix B.  Change History of the Document

1.  Introduction

   Multipath TCP adds the capability of using multiple paths to a
   regular TCP session [1]. The motivations for this extension include
   increasing throughput, overall resource utilisation, and resilience
   to network failure, and these motivations are discussed, along with
   high-level design decisions, as part of the Multipath TCP
   architecture [4]. The MPTCP protocol [5] offers the same reliable,
   in-order, byte-stream transport as TCP, and is designed to be
   backward compatible with both applications and the network layer. It
   requires support inside the network stack of both endpoints.

   This document first presents the effects that MPTCP may have on
   applications, such as performance changes compared to regular TCP.
   Second, it defines the interoperation of MPTCP and applications that
   are unaware of the multipath transport. MPTCP is designed to be
   usable without any application changes, but some compatibility issues
   have to be taken into account. Third, this memo specifies a basic
   Application Programming Interface (API) for MPTCP-aware applications.
   The API presented here is an extension to the regular TCP API to
   allow an MPTCP-aware application the equivalent level of control and
   access to information of an MPTCP connection that would be possible
   with the standard TCP API on a regular TCP connection.

   The de facto standard API for TCP/IP applications is the "sockets"
   interface [8]. This document provides an abstract definition of
   MPTCP-specific extensions to this interface. These are operations
   that can be used by an application to get or set additional
   MPTCP-specific information on a socket, in order to provide an
   equivalent level of information and control over MPTCP as exists for
   an application using regular TCP. It is up to the applications,
   high-level programming languages, or libraries to decide whether to
   use these optional extensions.
   For instance, an application may want to
   turn on or off the MPTCP mechanism for certain data transfers, or
   limit its use to certain interfaces. The abstract specification is
   in line with the POSIX standard [8] as much as possible.

   An advanced API for MPTCP is outside the scope of this document.
   Such an advanced API could offer a more fine-grained control over
   multipath transport functions and policies. The appendix includes a
   brief, non-compulsory list of potential features of such an advanced
   API.

   There can be interactions or incompatibilities of MPTCP with other
   APIs or socket interface extensions, which are discussed later in
   this document. Some network stack implementations, especially on
   mobile devices, have centralized connection managers or other
   higher-level APIs to solve multi-interface issues, as surveyed in
   [15]. Their interaction with MPTCP is outside the scope of this note.

   The target readers of this document are application developers whose
   software may benefit significantly from MPTCP. This document also
   provides the necessary information for developers of MPTCP to
   implement the API in a TCP/IP network stack.

2.  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [3].

   This document uses the MPTCP terminology introduced in [5].

   Concerning the API towards applications, the following terms are
   distinguished:

   o  Legacy API: The interface towards TCP that is currently used by
      applications. This document explains the effect of MPTCP for such
      applications, as well as resulting issues.

   o  Basic API: A simple extension of TCP's interface for applications
      that are aware of MPTCP.
      This document abstractly describes this
      interface, which provides access to multipath address information
      and a level of control equivalent to regular TCP.

   o  Advanced API: An API that offers more fine-grained control over
      the MPTCP behavior. Its specification is outside the scope of
      this document.

3.  Comparison of MPTCP and Regular TCP

   This section discusses the effect of MPTCP on performance as seen by
   an application, in comparison to what may be expected from the use of
   regular TCP.

3.1.  Effect on Performance

   One of the key goals of adding multipath capability to TCP is to
   improve the performance of a transport connection by load
   distribution over separate subflows across potentially disjoint
   paths. Furthermore, it is an explicit goal of MPTCP that it provides
   a connection that performs at least as well as one using single-path
   TCP. A corresponding congestion control algorithm is described in
   [7]. The following sections summarize the performance effect of
   MPTCP as seen by an application.

3.1.1.  Throughput

   The most obvious performance improvement that can be expected from
   the use of MPTCP is an increase in throughput, since MPTCP will pool
   more than one path (where available) between two endpoints. This
   will usually provide as great or greater bandwidth for an
   application, even though exceptions may exist, e.g., due to
   differences in the congestion control dynamics. For instance, if a
   new subflow is started, the short-term throughput can be smaller than
   the theoretical optimum. If there are shared bottlenecks between the
   flows, then the congestion control algorithms will in most cases
   ensure that load is evenly spread amongst regular and multipath TCP
   sessions, so that no end user receives worse performance than if all
   were using single-path TCP. There are some known corner cases in
   which an upgrade to MPTCP can affect other users [21].

   This performance increase additionally means that an MPTCP session
   could achieve throughput that is greater than the capacity of a
   single interface on the device. If any applications make assumptions
   about interfaces due to throughput, they must take this into account
   (although an MPTCP implementation must always respect an
   application's request for a particular interface).

   Furthermore, the flexibility of MPTCP to add and remove subflows as
   paths change availability could lead to a greater variation, and more
   frequent change, in connection bandwidth. Applications that adapt to
   available bandwidth (such as video and audio streaming) may need to
   adjust some of their assumptions to most effectively take this into
   account.

   The transport of MPTCP signalling information results in a small
   overhead. The use of MPTCP instead of a single TCP connection
   therefore results in a smaller goodput. Also, if multiple subflows
   share the same bottleneck, this overhead slightly reduces the
   capacity that is available for data transport. Yet, this potential
   reduction of throughput will be negligible in many usage scenarios,
   and the protocol contains optimisations in its design so that this
   overhead is minimal.

3.1.2.  Delay

   The benefits of MPTCP regarding throughput and resilience may come at
   some cost regarding data delivery delay and delay jitter.

   If the delays on the constituent subflows of an MPTCP connection
   differ, the jitter perceivable to an application may appear higher as
   the data are spread across the subflows. Although MPTCP will ensure
   in-order delivery to the application, the data delivery could be more
   bursty than may be usual with single-path TCP, in particular on
   highly asymmetric paths.

   Applications with high real-time requirements might be affected by
   that.
   One possible remedy is to disable MPTCP for such
   jitter-sensitive applications, either by using the Basic API defined
   in this document, or by other means, such as system policies.

   However, the actual delay and jitter of data transport over MPTCP
   depends on the scheduling and congestion control algorithms used for
   sending data, as well as the heuristics to establish and shut down
   subflows. A sender can implement strategies to minimize the delay
   jitter seen by applications, but this requires an accurate estimation
   of the path characteristics. If the scheduling decisions are
   suboptimal or if assumptions about the path characteristics turn out
   to be wrong, delay jitter may be increased and affect delay-sensitive
   applications. In general, for a delay-sensitive application, it
   would be desirable to select an appropriate congestion control
   algorithm for its traffic needs.

   Alternatively, MPTCP could be used in high reliability, rather than
   high throughput, modes of operation, such as by mirroring traffic on
   subflows, or by only using additional subflows for hot standby.
   These methods of traffic scheduling would not cause delay variation
   in the same way. These additional modes, and the selection of
   alternative scheduling algorithms, would need to be indicated by an
   advanced API, the specification of which requires further analysis
   and is outside the scope of this document.

   If data transport on one subflow fails, the retransmissions inside
   MPTCP could affect the delivery delay to the application. Yet,
   without MPTCP that data or the whole connection might have been lost,
   and other reliability mechanisms (e.g. application-level recovery)
   would likely have an even larger delay impact.

   In addition, applications that make round trip time (RTT) estimates
   at the application level may have some issues.
   Whilst the average
   delay calculated will be accurate, whether this is useful for an
   application will depend on what it requires this information for. If
   a new application wishes to derive such information, it should
   consider how multiple subflows may affect its measurements, and thus
   how it may wish to respond. In such a case, an application may wish
   to express its scheduling preferences, as described later in this
   document.

3.1.3.  Resilience

   Another performance improvement through the use of MPTCP is better
   resilience. The use of multiple subflows simultaneously means that,
   if one should fail, all traffic will move to the remaining
   subflow(s), and additionally any lost packets can be retransmitted on
   these subflows.

   As one special case, the MPTCP protocol can be used with only one
   active subflow at a given point in time. In that case, resilience
   compared to single-path TCP is improved. MPTCP also supports
   make-before-break and break-before-make handovers between subflows.
   In both cases, the MPTCP connection can survive an unavailability or
   change of an IP address (e.g., due to shutdown of an interface or
   handover). MPTCP closes or resets the MPTCP connection separately
   from the individual subflows, as described in [5].

   Subflow failure may be caused by issues within the network, which an
   application would be unaware of, or interface failure on the node.
   An application may, under certain circumstances, be in a position to
   be aware of such failure (e.g. by radio signal strength, or simply an
   interface enabled flag), and so must not make assumptions about an
   MPTCP flow's stability based on this. An MPTCP implementation must
   never override an application's request for a given interface,
   however, so the cases where this issue may be applicable are limited.

3.2.  Potential Problems

3.2.1.  Impact of Middleboxes

   MPTCP has been designed in order to pass through the majority of
   middleboxes. Empirical evidence suggests that new TCP options can
   successfully be used on most paths in the Internet [22].
   Nevertheless, some middleboxes may still refuse to pass MPTCP
   messages due to the presence of TCP options, or they may strip TCP
   options. If this is the case, MPTCP falls back to regular TCP.
   Although this will not create a problem for the application (its
   communication will be set up either way), there may be additional
   (and indeed, user-perceivable) delay while the first handshake fails.
   Therefore, an alternative approach could be to try both MPTCP and
   regular TCP connection attempts at the same time, and respond to
   whichever replies first in a similar fashion to the "Happy Eyeballs"
   mechanism for IPv6 [16]. One could also apply a shorter timeout on
   the MPTCP attempt and thus reduce the setup delay if fallback to
   regular TCP is needed.

   An MPTCP implementation can learn the rate of MPTCP connection
   attempt successes or failures to particular hosts or networks, and on
   particular interfaces, and could therefore learn heuristics of when
   and when not to use MPTCP. A detailed discussion of the various
   fallback mechanisms, for failures occurring at different points in
   the connection, is presented in [5]. It must be emphasized that all
   such heuristics could also fail, and learning can be difficult in
   certain environments, e.g., if the host is mobile.

   There may also be middleboxes that transparently change the length of
   content. If such middleboxes are present, MPTCP's reassembly of the
   byte stream in the receiver is difficult. Still, MPTCP can detect
   such middleboxes and then fall back to regular TCP. An overview of
   the impact of middleboxes is presented in [4] and MPTCP's mechanisms
   to work around these are presented and discussed in [5].
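
   The success-rate learning mentioned earlier in this section could,
   for instance, be kept per destination and interface. The following
   is only a minimal sketch under stated assumptions: the class name,
   method names, and both thresholds are hypothetical and are not
   defined by this document or by any MPTCP implementation.

```python
from collections import defaultdict


class MptcpAttemptHistory:
    """Hypothetical per-(destination, interface) record of MPTCP attempts."""

    def __init__(self, min_samples=3, min_success_rate=0.5):
        # Both thresholds are arbitrary illustration values (assumptions).
        self.min_samples = min_samples
        self.min_success_rate = min_success_rate
        # Maps (destination, interface) -> [successes, attempts].
        self._counts = defaultdict(lambda: [0, 0])

    def record(self, destination, interface, success):
        """Store the outcome of one MPTCP connection attempt."""
        entry = self._counts[(destination, interface)]
        entry[1] += 1
        if success:
            entry[0] += 1

    def should_try_mptcp(self, destination, interface):
        """Return True unless past attempts mostly fell back to regular TCP."""
        successes, attempts = self._counts.get((destination, interface), (0, 0))
        if attempts < self.min_samples:
            # Too little evidence: keep trying MPTCP.
            return True
        return successes / attempts >= self.min_success_rate
```

   As the text notes, such a heuristic can fail, for example on a
   mobile host whose paths change; a real implementation would
   probably also age out old entries.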

   MPTCP can also have other unexpected implications. For instance,
   intrusion detection systems could be triggered. A full analysis of
   MPTCP's impact on such middleboxes is for further study after
   deployment experiments.

3.2.2.  Dealing with Multiple Addresses Inside Applications

   In regular TCP, there is a one-to-one mapping of the socket interface
   to a flow through a network. Since MPTCP can make use of multiple
   subflows, applications cannot implicitly rely on this one-to-one
   mapping any more.

   Whilst this doesn't matter for most applications, some applications
   may need to adapt to the presence of multiple addresses, because
   implicit assumptions are outdated. In the following, selected
   examples for resulting issues are discussed. The question of whether
   such implicit assumptions matter is an application-level decision,
   and this document only provides general guidance and a basic API to
   retrieve relevant information.

   A few applications require the transport to be along a single path;
   they can disable the use of MPTCP as described later in this
   document. Examples include monitoring tools that want to measure the
   available bandwidth on a path, or routing protocols such as BGP that
   require the use of a specific link.

   Certain applications store the IP addresses of TCP connections, e.g.,
   by logging mechanisms. Such logging mechanisms will continue to work
   with MPTCP, but two important aspects have to be mentioned: First, if
   the application is not aware of MPTCP, it will use the existing
   interface to the network stack. This implies that an MPTCP-unaware
   application will track the IP addresses of the first subflow only.
   IP addresses used by follow-up subflows will be ignored. Second, an
   MPTCP-aware application can use the basic API described in this
   document to monitor the IP addresses of all subflows, e.g., for
   logging mechanisms.
   If an MPTCP connection uses several subflows,
   this will possibly imply that data structures have to be adapted and
   that the amount of data that has to be logged and stored per
   connection will increase.

   An MPTCP implementation may choose to maintain an MPTCP connection
   even if the IP address of the original subflow is no longer allocated
   to a host, depending on the policy concerning the first subflow
   (fate-sharing, see Section 4.2.2). In this case, the IP address
   exposed to an MPTCP-unaware application can differ from the addresses
   actually being used by MPTCP. It is even possible that the IP
   address gets assigned to another host during the lifetime of an MPTCP
   connection. As further discussed below, this could be an issue if
   the IP addresses are exchanged by applications, e.g., inside the
   application protocol. This issue can be addressed by enabling fate
   sharing, at the cost of resilience, because the MPTCP connection then
   cannot close the initial subflow.

3.2.3.  Security Implications

   The support for multiple IP addresses within one MPTCP connection can
   result in additional security vulnerabilities, such as possibilities
   for attackers to hijack connections. The protocol design of MPTCP
   minimizes this risk. An attacker on one of the paths can cause harm,
   but this is hardly an additional security risk compared to
   single-path TCP, which is vulnerable to man-in-the-middle attacks as
   well. A detailed threat analysis of MPTCP is published in [6].

   Impact on Transport Layer Security (TLS) is discussed in Section 6.1.

4.  Operation of MPTCP with Legacy Applications

4.1.  Overview of the MPTCP Network Stack

   MPTCP is an extension of TCP, but it is designed to be backward
   compatible for legacy (MPTCP-unaware) applications. TCP interacts
   with other parts of the network stack by different interfaces.
   The
   de facto standard API between TCP and applications is the sockets
   interface. The position of MPTCP in the protocol stack is
   illustrated in Figure 1.

                     +-------------------------------+
                     |           Application         |
                     +-------------------------------+
                        ^                 |
             ~~~~~~~~~~~|~Socket Interface|~~~~~~~~~~~
                        |                 v
                     +-------------------------------+
                     |             MPTCP             |
                     + - - - - - - - + - - - - - - - +
                     | Subflow (TCP) | Subflow (TCP) |
                     +-------------------------------+
                     |       IP      |      IP       |
                     +-------------------------------+

                      Figure 1: MPTCP protocol stack

   In general, MPTCP can affect all interfaces that make assumptions
   about the coupling of a TCP connection to a single IP address and TCP
   port pair, to one sockets endpoint, to one network interface, or to a
   given path through the network.

   This means that there are two classes of applications:

   o  Legacy applications: These applications are unaware of MPTCP and
      use the existing API towards TCP without any changes. This is the
      default case.

   o  MPTCP-aware applications: These applications indicate support for
      an enhanced MPTCP interface. This document specifies a minimum
      set of API extensions for such applications.

   The following discusses to what extent MPTCP affects legacy
   applications using the existing sockets API. The existing sockets
   API implies that applications deal with data structures that store,
   amongst others, the IP addresses and TCP port numbers of a TCP
   connection. A design objective of MPTCP is that legacy applications
   can continue to use the established sockets API without any changes.
   However, in MPTCP there is a one-to-many mapping between the socket
   endpoint and the subflows. This has several subtle implications for
   legacy applications using sockets API functions.

4.2.  Address Issues

4.2.1.  Specification of Addresses by Applications

   During binding, an application can either select a specific address,
   or bind to INADDR_ANY. Furthermore, on some systems other socket
   options (e.g., SO_BINDTODEVICE) can be used to bind to a specific
   interface. If an application uses a specific address or binds to a
   specific interface, then MPTCP MUST respect this and not interfere in
   the application's choices. The binding to a specific address or
   interface implies that the application is not aware of MPTCP and will
   disable the use of MPTCP on this connection. An application that
   wishes to bind to a specific set of addresses with MPTCP must use
   multipath-aware calls to achieve this (as described in
   Section 5.3.3).

   If an application binds to INADDR_ANY, it is assumed that the
   application does not care which addresses to use locally. In this
   case, a local policy MAY allow MPTCP to automatically set up multiple
   subflows on such a connection.

   The basic sockets API of MPTCP-aware applications allows applications
   to express further preferences in an MPTCP-compatible way (e.g. bind
   to a subset of interfaces only).

4.2.2.  Querying of Addresses by Applications

   Applications can use the getpeername() or getsockname() functions in
   order to retrieve the IP address of the peer or of the local socket.
   These functions can be used for various purposes, including security
   mechanisms, geo-location, or interface checks. The socket API was
   designed with an assumption that a socket is using just one address,
   and since this address is visible to the application, the application
   may assume that the information provided by the functions is the same
   during the lifetime of a connection. However, in MPTCP, unlike in
   TCP, there is a one-to-many mapping of a connection to subflows, and
   subflows can be added and removed while the connection continues to
   exist.
   Since the subflow addresses can change, MPTCP cannot expose
   addresses by getpeername() or getsockname() that are both valid and
   constant during the connection's lifetime.

   This problem is addressed as follows: If used by a legacy
   application, the MPTCP stack MUST always return the addresses and
   port numbers of the first subflow of an MPTCP connection, in all
   circumstances, even if that particular subflow is no longer in use.

   As the addresses may not be valid any more if the first subflow is
   closed, the MPTCP stack MAY close the whole MPTCP connection if the
   first subflow is closed (i.e. fate sharing between the initial
   subflow and the MPTCP connection as a whole). This fate-sharing
   prevents the pair of IP addresses and ports from being reused while
   an MPTCP connection is still in progress, but at the cost of reducing
   the utility of MPTCP if IP addresses of the first subflow are not
   available any more (e.g., mobility events). Whether to close the
   whole MPTCP connection by default SHOULD be controlled by a local
   policy. Further experiments are needed to investigate its
   implications.

   The functions getpeername() and getsockname() SHOULD also always
   return the addresses of the first subflow if the socket is used by an
   MPTCP-aware application, in order to be consistent with MPTCP-unaware
   applications, and, e.g., also with Stream Control Transmission
   Protocol (SCTP). Instead of getpeername() or getsockname(),
   MPTCP-aware applications can use new API calls, documented later, in
   order to retrieve the full list of address pairs for the subflows in
   use.

4.3.  MPTCP Connection Management

4.3.1.  Reaction to Close Call by Application

   As described in [5], MPTCP distinguishes between the closing of
   subflows (by TCP FIN) and closing the whole MPTCP connection (by DATA
   FIN).

   When an application closes a socket, e.g., by calling the close()
   function, this indicates that the application has no more data to
   send, as with single-path TCP. MPTCP will then close the MPTCP
   connection by DATA FIN messages. This is completely transparent for
   an application.

   In summary, the semantics of the close() interface for applications
   are not changed compared to TCP.

4.3.2.  Other Connection Management Functions

   In general, an MPTCP connection is maintained separately from
   individual subflows. The MPTCP protocol therefore has internal
   mechanisms to establish, close, or reset the MPTCP connection [5].
   They provide functions equivalent to those of single-path TCP and can
   be mapped accordingly. Therefore, these MPTCP internals do not
   affect the application interface.

4.4.  Socket Option Issues

4.4.1.  General Guideline

   The existing sockets API includes options that modify the behavior of
   sockets and their underlying communications protocols. Various
   socket options exist on the socket, TCP, and IP level. The value of
   an option can usually be set by the setsockopt() system function.
   The getsockopt() function gets information. In general, the existing
   sockets interface functions cannot configure each MPTCP subflow
   individually. In order to be backward compatible, existing APIs
   therefore SHOULD apply to all subflows within one connection, as far
   as possible.

4.4.2.  Disabling of the Nagle Algorithm

   One commonly used TCP socket option (TCP_NODELAY) disables the Nagle
   algorithm as described in [2]. This option is also specified in the
   POSIX standard [8]. Applications can use this option in combination
   with MPTCP exactly in the same way. It then SHOULD disable the Nagle
   algorithm for the MPTCP connection, i.e., all subflows.

   In addition, the MPTCP protocol instance MAY use a different path
   scheduler algorithm if TCP_NODELAY is present.
   For instance, it could use an algorithm that is optimized for
   latency-sensitive traffic (for instance only transmitting on one
   path). Specific algorithms are outside the scope of this document.

4.4.3.  Buffer Sizing

   Applications can explicitly configure send and receive buffer sizes
   by the sockets API (SO_SNDBUF, SO_RCVBUF). These socket options can
   also be used in combination with MPTCP and then affect the buffer
   size of the MPTCP connection. However, when defining buffer sizes,
   application programmers should take into account that the transport
   over several subflows requires a certain amount of buffer for
   resequencing in the receiver. MPTCP may also require more storage
   space in the sender, in particular, if retransmissions are sent over
   more than one path. In addition, very small send buffers may prevent
   MPTCP from efficiently scheduling data over different subflows.
   Therefore, it does not make sense to use MPTCP in combination with
   small send or receive buffers.

   An MPTCP implementation MAY set a lower bound for send and receive
   buffers and treat a small buffer size request as an implicit request
   not to use MPTCP.

4.4.4.  Other Socket Options

   TCP features the ability to send "Urgent" data, but its use is not
   recommended in general, and specifically not with MPTCP [4].

   Some network stacks also provide other implementation-specific socket
   options or interfaces that affect TCP's behavior. If a network stack
   supports MPTCP, it must be ensured that these options do not
   interfere.

4.5.  Default Enabling of MPTCP

   It is up to a local policy at the end system whether a network stack
   should automatically enable MPTCP for sockets even if there is no
   explicit sign of MPTCP awareness of the corresponding application.
   Such a choice may be under the control of the user through system
   preferences.

   The enabling of MPTCP, either by application or by system defaults,
   does not necessarily mean that MPTCP will always be used. Both
   endpoints must support MPTCP, and at least one endpoint must have
   multiple addresses, for MPTCP to be used. Even if those
   requirements are met, however, MPTCP may not be immediately used on a
   connection. It may make sense for multiple paths to be brought into
   operation only after a given period of time, or if the connection is
   saturated.

4.6.  Summary of Advice to Application Developers

   o  Using the default MPTCP configuration: Like TCP, MPTCP is designed
      to be efficient and robust in the default configuration.
      Application developers should not explicitly configure TCP (or
      MPTCP) features unless this is really needed.

   o  Socket buffer dimensioning: Multipath transport requires larger
      buffers in the receiver for resequencing, as already explained.
      Applications should use reasonable buffer sizes (such as the
      operating system default values) in order to fully benefit from
      MPTCP. A full discussion of buffer sizing issues is given in [5].

   o  Facilitating stack-internal heuristics: The path management and
      data scheduling by MPTCP are realized by stack-internal algorithms
      that may implicitly try to self-optimize their behavior according
      to assumed application needs. For instance, an MPTCP
      implementation may use heuristics to determine whether an
      application requires delay-sensitive or bulk data transport, using
      for instance port numbers, the TCP_NODELAY socket option, or the
      application's read/write patterns as input parameters. An
      application developer can facilitate the operation of such
      heuristics by avoiding atypical interface use cases.
      For instance, for long bulk data transfers, it makes sense
      neither to enable the TCP_NODELAY socket option nor to use many
      small socket send() calls, each carrying only a small amount of
      data.

5.  Basic API for MPTCP-aware Applications

5.1.  Design Considerations

   While applications can use MPTCP with the unmodified sockets API,
   multipath transport results in many degrees of freedom. MPTCP
   manages the data transport over different subflows automatically. By
   default, this is transparent to the application, but an application
   could use an additional API to interface with the MPTCP layer and to
   control important aspects of the MPTCP implementation's behavior.

   This document describes a basic MPTCP API. The API contains a
   minimum set of functions that provide an equivalent level of control
   and information as exists for regular TCP. It maintains backward
   compatibility with legacy applications.

   An advanced MPTCP API is outside the scope of this document. The
   basic API does not allow a sender or a receiver to express
   preferences about the management of paths or the scheduling of data,
   even if this can have a significant performance impact and if an
   MPTCP implementation could benefit from additional guidance by
   applications. A list of potential further API extensions is provided
   in the appendix. The specification of such an advanced API is for
   further study and may partly be implementation-specific.

   MPTCP mainly affects the sending of data. But a receiver may also
   have preferences about data transfer choices, and it may have
   performance requirements as well. Yet, the configuration of such
   preferences is outside of the scope of the basic API.

5.2.  Requirements on the Basic MPTCP API

   Because of the importance of the sockets interface, there are several
   fundamental design objectives for the basic interface between MPTCP
   and applications:

   o  Consistency with existing sockets APIs must be maintained as far
      as possible. In order to support the large base of applications
      using the original API, a legacy application must be able to
      continue to use standard socket interface functions when run on a
      system supporting MPTCP. Also, MPTCP-aware applications should be
      able to access the socket without any major changes.

   o  Sockets API extensions must be minimized and independent of an
      implementation.

   o  The interface should handle both IPv4 and IPv6.

   The following is a list of the core requirements for the basic API:

   REQ1:  Turn on/off MPTCP: An application should be able to request to
          turn on or turn off the usage of MPTCP. This means that an
          application should be able to explicitly request the use of
          MPTCP if this is possible. Applications should also be able
          to request not to enable MPTCP and to use regular TCP
          transport instead. This can be implicit in many cases, since
          MPTCP must be disabled by the use of binding to a specific
          address. MPTCP may also be enabled if an application uses a
          dedicated multipath address family (such as AF_MULTIPATH,
          [20]).

   REQ2:  An application should be able to restrict MPTCP to binding to
          a given set of addresses.

   REQ3:  An application should be able to obtain information on the
          pairs of addresses used by the MPTCP subflows.

   REQ4:  An application should be able to extract a unique identifier
          for the connection (per endpoint).

   The first requirement is the most important one, since some
   applications could benefit a lot from MPTCP, but there are also cases
   in which it hardly makes sense.
   The existing sockets API provides similar mechanisms to enable or
   disable advanced TCP features. The second requirement corresponds
   to the binding of addresses with the bind() socket call, or, e.g.,
   explicit device bindings with a SO_BINDTODEVICE option. The third
   requirement ensures that there is an equivalent to getpeername() or
   getsockname() that is able to deal with more than one subflow.
   Finally, it should be possible for the application to retrieve a
   unique connection identifier (local to the endpoint on which it is
   running) for the MPTCP connection. This replaces the (address, port)
   pair for a connection identifier in single-path TCP, which is no
   longer static in MPTCP.

   An application can continue to use getpeername() or getsockname() in
   addition to the basic MPTCP API. Both functions return the
   corresponding addresses of the first subflow, as already explained.

5.3.  Sockets Interface Extensions by the Basic MPTCP API

5.3.1.  Overview

   The abstract, basic MPTCP API consists of a set of new values that
   are associated with an MPTCP socket. Such values may be used for
   changing properties of an MPTCP connection, or retrieving
   information. These values could be accessed by new symbols on
   existing calls such as setsockopt() and getsockopt(), or could be
   implemented as entirely new function calls. This implementation
   decision is out of scope for this document. The following list
   presents symbolic names for these MPTCP socket settings.

   o  TCP_MULTIPATH_ENABLE: Enable/disable MPTCP

   o  TCP_MULTIPATH_ADD: Bind MPTCP to a set of given local addresses,
      or add a set of new local addresses to an existing MPTCP
      connection

   o  TCP_MULTIPATH_REMOVE: Remove a local address from an MPTCP
      connection

   o  TCP_MULTIPATH_SUBFLOWS: Get the pairs of addresses currently used
      by the MPTCP subflows

   o  TCP_MULTIPATH_CONNID: Get the local connection identifier for this
      MPTCP connection

   Table 1 shows a list of the abstract socket operations for the basic
   configuration of MPTCP. The first column gives the symbolic name of
   the operation. The second and third columns indicate whether the
   operation provides values to be read ("Get") or takes values to
   configure ("Set"). The fourth column lists the type of data
   associated with this operation. The data types are listed for
   information only. In addition to IP addresses, an application MAY
   also indicate TCP port numbers, as further detailed below.

   +------------------------+-----+-----+------------------------------+
   | Name                   | Get | Set | Data type                    |
   +------------------------+-----+-----+------------------------------+
   | TCP_MULTIPATH_ENABLE   |  o  |  o  | boolean                      |
   | TCP_MULTIPATH_ADD      |     |  o  | list of addresses (and       |
   |                        |     |     | ports)                       |
   | TCP_MULTIPATH_REMOVE   |     |  o  | list of addresses (and       |
   |                        |     |     | ports)                       |
   | TCP_MULTIPATH_SUBFLOWS |  o  |     | list of pairs of addresses   |
   |                        |     |     | (and ports)                  |
   | TCP_MULTIPATH_CONNID   |  o  |     | integer                      |
   +------------------------+-----+-----+------------------------------+

                     Table 1: MPTCP Socket Operations

   There are restrictions on when these new socket operations can be
   used:

   o  TCP_MULTIPATH_ENABLE: This value should only be set before the
      establishment of a TCP connection. Its value should only be read
      after the establishment of a connection.

   o  TCP_MULTIPATH_ADD: This operation can be applied both before
      connection setup and during a connection. If used before, it
      controls the local addresses that an MPTCP connection can use. In
      the latter case, it allows MPTCP to use an additional local
      address, if there has been a restriction before connection setup.

   o  TCP_MULTIPATH_REMOVE: This operation can be applied both before
      connection setup and during a connection. In both cases, it
      removes an address from the list of local addresses that may be
      used by subflows.

   o  TCP_MULTIPATH_SUBFLOWS: This value is read-only and can only be
      used after connection setup.

   o  TCP_MULTIPATH_CONNID: This value is read-only and should only be
      used after connection setup.

5.3.2.  Enabling and Disabling of MPTCP

   An application can explicitly indicate multipath capability by
   setting TCP_MULTIPATH_ENABLE to the value "true". In this case, the
   MPTCP implementation SHOULD try to negotiate MPTCP for that
   connection. Note that multipath transport will not necessarily be
   enabled, as it requires support at both end systems, no middleboxes
   on the path that would prevent any additional signalling, and at
   least one endpoint with multiple addresses.

   Building on the backward compatibility specified in Section 4.2.1,
   if an application enables MPTCP but binds to a specific address or
   interface, MPTCP MUST be enabled, but MPTCP MUST respect the
   application's choice and only use addresses that are explicitly
   provided by the application. Note that it would be possible for an
   application to use the legacy bindings, and then expand on them by
   using TCP_MULTIPATH_ADD. Note also that it is possible for more than
   one local address to be initially available to MPTCP in this case, if
   an application has bound to a specific interface with multiple
   addresses.
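
   The usage restrictions of Section 5.3.1 and the enable semantics
   described here can be sketched as a small model. The sketch makes
   several assumptions that are not part of this document: the class
   BasicMptcpSocketModel and its method names are hypothetical, the
   restrictions are enforced by raising an error, the default of
   enable_requested stands in for a local policy, and all conditions
   for a successful negotiation (peer support, middleboxes, multiple
   addresses) are collapsed into one flag, path_allows_mptcp.

```python
import ipaddress
from itertools import count

_connection_ids = count(1)  # stand-in for a locally unique value (e.g., a token)


class BasicMptcpSocketModel:
    """Hypothetical model of the abstract operations; not a real sockets API."""

    def __init__(self):
        self.connected = False
        self.enable_requested = True   # assumption: local policy enables MPTCP
        self.mptcp_in_use = False
        self.local_addrs = []          # empty list models "no restriction"
        self.subflows = []
        self.connid = None

    def _require(self, connected, name):
        if self.connected != connected:
            when = "after" if connected else "before"
            raise RuntimeError(f"{name} is only usable {when} connection setup")

    def set_enable(self, value):       # TCP_MULTIPATH_ENABLE (Set)
        self._require(False, "setting TCP_MULTIPATH_ENABLE")
        self.enable_requested = bool(value)

    def get_enable(self):              # TCP_MULTIPATH_ENABLE (Get)
        self._require(True, "reading TCP_MULTIPATH_ENABLE")
        return self.mptcp_in_use

    def add(self, addrs):              # TCP_MULTIPATH_ADD, before or during
        for text in addrs:
            addr = ipaddress.ip_address(text)  # accepts IPv4 and IPv6
            if addr not in self.local_addrs:
                self.local_addrs.append(addr)

    def remove(self, addrs):           # TCP_MULTIPATH_REMOVE, before or during
        # Simplification: a real stack would also close affected subflows.
        gone = {ipaddress.ip_address(text) for text in addrs}
        self.local_addrs = [a for a in self.local_addrs if a not in gone]

    def connect(self, local, remote, path_allows_mptcp):
        self._require(False, "connect")
        self.connected = True
        self.mptcp_in_use = self.enable_requested and path_allows_mptcp
        self.subflows = [(local, remote)]   # the first subflow
        self.connid = next(_connection_ids)

    def get_subflows(self):            # TCP_MULTIPATH_SUBFLOWS (Get)
        self._require(True, "TCP_MULTIPATH_SUBFLOWS")
        return list(self.subflows)

    def get_connid(self):              # TCP_MULTIPATH_CONNID (Get)
        self._require(True, "TCP_MULTIPATH_CONNID")
        return self.connid


sock = BasicMptcpSocketModel()
sock.set_enable(True)
sock.add(["192.0.2.1", "2001:db8::1"])  # IPv4 and IPv6 in the same call
sock.connect(("192.0.2.1", 40000), ("198.51.100.7", 80), path_allows_mptcp=True)
print(sock.get_enable(), sock.get_subflows(), sock.get_connid())
```

   Whether a real implementation rejects a misplaced call, ignores it,
   or behaves differently is left open by the abstract API. The error
   in this model only makes the restrictions visible.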

   An application can disable MPTCP by setting TCP_MULTIPATH_ENABLE to a
   value of "false". In that case, MPTCP MUST NOT be used on that
   connection.

   After connection establishment, an application can get the value of
   TCP_MULTIPATH_ENABLE. A value of "false" then means lack of MPTCP
   support. A value of "true" means that MPTCP is supported.

5.3.3.  Binding MPTCP to Specified Addresses

   Before connection establishment, an application can use the
   TCP_MULTIPATH_ADD function to indicate a set of local IP addresses
   that MPTCP may bind to. The parameter of the function is a list of
   addresses in a corresponding data structure. By extension, this
   operation will also control the list of addresses that can be
   advertised to the peer via MPTCP signalling.

   If an application binds to a specific address or interface, it is not
   required to use the TCP_MULTIPATH_ADD operation for that address. As
   explained in Section 5.3.2, MPTCP MUST only use the explicitly
   specified addresses in that case.

   An application MAY also indicate a TCP port number that, if
   specified, MPTCP MUST attempt to bind to. The port number MAY be
   different to the one used by existing subflows. If no port number is
   provided by the application, the port number is automatically
   selected by the MPTCP implementation, and it is RECOMMENDED that it
   is the same across all subflows.

   This operation can also be used to modify the address list in use
   during the lifetime of an MPTCP connection. In this case, it is used
   to indicate a set of additional local addresses that the MPTCP
   connection can make use of, and which can be signalled to the peer.
   It should be noted that this signal is only a hint, and an MPTCP
   implementation MAY select only a subset of the addresses.

   The TCP_MULTIPATH_REMOVE operation can be used to remove a local
   address, or a set of local addresses, from an MPTCP connection.
   MPTCP MUST close any corresponding subflows (i.e., those using the
   local address that is no longer present), and signal the removal of
   the address to the peer. If alternative paths are available using
   the supplied address list but MPTCP is not currently using them, an
   MPTCP implementation SHOULD establish alternative subflows before
   undertaking the address removal.

   It should be remembered that these operations SHOULD support both
   IPv4 and IPv6 addresses, potentially in the same call.

5.3.4.  Querying the MPTCP Subflow Addresses

   An application can get a list of the addresses used by the currently
   established subflows in an MPTCP connection by means of the read-only
   TCP_MULTIPATH_SUBFLOWS operation.

   The return value is a list of pairs of tuples of IP address and TCP
   port number. In one pair, the first tuple refers to the local IP
   address and the local TCP port, and the second one to the remote IP
   address and remote TCP port used by the subflow. The list MUST only
   include established subflows. Both addresses in each pair MUST be
   either IPv4 or IPv6.

5.3.5.  Getting a Unique Connection Identifier

   An application that wants a unique identifier for the connection,
   analogous to an (address, port) pair in regular TCP, can query the
   TCP_MULTIPATH_CONNID value to get a local connection identifier for
   the MPTCP connection.

   This SHOULD be an integer number, and SHOULD be locally unique (e.g.,
   the MPTCP token).

6.  Other Compatibility Issues

6.1.  Usage of TLS over MPTCP

   Transport Layer Security (TLS) [17] may be used over MPTCP's Basic
   API. When TLS compares any addresses used by MPTCP against names or
   addresses present in X.509 certificates [18] [19], it MUST only
   compare them with the address that MPTCP used to start the initial
   subflow as presented to TLS.
   The addresses used for subsequent subflows need not be compared
   against any TLS certificate information. Finer-grained control
   would require an Advanced API or proactive subflow management via
   the Basic API.

6.2.  Usage of the SCTP Socket API

   For dealing with multi-homing, several socket API extensions have
   been defined for SCTP [13]. As MPTCP realizes multipath transport
   from and to multi-homed endsystems, some of these interface function
   calls are actually applicable to MPTCP in a similar way.

   API developers may wish to integrate SCTP and MPTCP calls to provide
   a consistent interface to the application. Yet, it must be
   emphasized that the transport service provided by MPTCP is different
   to SCTP, and this is why not all SCTP API functions can be mapped
   directly to MPTCP. Furthermore, a network stack implementing MPTCP
   does not necessarily support SCTP and its specific socket interface
   extensions. This is why the basic API of MPTCP defines additional
   socket options only, which are a backward compatible extension of
   TCP's application interface. An integration with the SCTP API is
   outside the scope of the basic API.

6.3.  Incompatibilities with other Multihoming Solutions

   The use of MPTCP can interact with various related sockets API
   extensions. The use of a multihoming shim layer conflicts with
   multipath transport such as MPTCP or SCTP [11]. Care should be taken
   that the use of MPTCP is not confused with the overlapping features
   of other APIs:

   o  SHIM API [11]: This API specifies sockets API extensions for the
      multihoming shim layer.

   o  HIP API [12]: The Host Identity Protocol (HIP) also results in a
      new API.

   o  API for Mobile IPv6 [10]: For Mobile IPv6, a significantly
      extended socket API exists as well (in addition to API extensions
      for IPv6 [9]).

   In order to avoid any conflict, multiaddressed MPTCP SHOULD NOT be
   enabled if a network stack uses SHIM6, HIP, or Mobile IPv6.
   Furthermore, applications should not try to use both the MPTCP API
   and another multihoming or mobility layer API.

   It is possible, however, that some of the MPTCP functionality, such
   as congestion control, could be used in a SHIM6 or HIP environment.
   Such operation is for further study.

6.4.  Interactions with DNS

   In multihomed or multiaddressed environments, there are various
   issues that are not specific to MPTCP, but have to be considered as
   well. These problems are summarized in [14].

   Specifically, there can be interactions with DNS. Whilst it is
   expected that an application will iterate over the list of addresses
   returned from a call such as getaddrinfo(), MPTCP itself MUST NOT
   make any assumptions about multiple A or AAAA records from the same
   DNS query referring to the same host, as it is possible that multiple
   addresses refer to multiple servers for load balancing purposes.

7.  Security Considerations

   This document first defines the behavior of the standard TCP/IP API
   for MPTCP-unaware applications. In general, enabling MPTCP has some
   security implications for applications, which are introduced in
   Section 5.3.3, and these threats are further detailed in [6]. The
   protocol specification of MPTCP [5] defines several mechanisms to
   protect MPTCP against those attacks.

   The syntax and semantics of the API for MPTCP-unaware applications
   do not change. However, assumptions that non-MPTCP-aware
   applications may make on the data retrieved by the
   backwards-compatible API are discussed in Section 4.2.2. System
   administrators may wish to disable MPTCP for certain applications
   that signal addresses, or make security decisions (e.g., opening
   firewall holes), based on responses to such queries.

   In addition, the basic MPTCP API for MPTCP-aware applications defines
   functions that provide an equivalent level of control and information
   as exists for regular TCP. This document does not mandate a specific
   implementation of the basic MPTCP API. The implementation should be
   designed not to affect memory management assumptions in existing
   code. Implementors should take into account that data structures
   will be more complex than for standard TCP, e.g., when multiple
   subflow addresses have to be stored. When dealing with such data
   structures, care is needed not to add security vulnerabilities to
   applications.

   New functions enable adding and removing local addresses from an
   MPTCP connection (TCP_MULTIPATH_ADD and TCP_MULTIPATH_REMOVE). These
   functions don't add security threats if the MPTCP stack verifies that
   the addresses provided by the application are indeed available as
   source addresses for subflows.

   However, applications should use the TCP_MULTIPATH_ADD function with
   care, as new subflows might get established to those addresses.
   Furthermore, it could result in some form of information leakage
   since MPTCP might advertise those addresses to the other connection
   endpoint, which could learn IP addresses of interfaces that are not
   visible otherwise.

   Use of different addresses should not be assumed to lead to use of
   different paths, especially for security purposes.

   MPTCP-aware applications should also take care when querying and
   using information about the addresses used by subflows
   (TCP_MULTIPATH_SUBFLOWS). As MPTCP can dynamically open and close
   subflows, a list of addresses queried once can get outdated during
   the lifetime of an MPTCP connection. Then, the list may contain
   invalid entries, i.e., addresses that are not used any more, or that
   might not even be assigned to that host any more.
   Applications that want to ensure that MPTCP only uses a certain set
   of addresses should explicitly bind to those addresses.

   One specific example is the use of TLS on top of MPTCP.
   Corresponding guidance can be found in Section 6.1.

8.  IANA Considerations

   This document has no IANA actions. This document only defines an
   abstract API and therefore does not request the reservation of
   identifiers or names.

9.  Conclusion

   This document discusses MPTCP's implications and its performance
   impact on applications. In addition, it specifies a basic MPTCP API.
   For legacy applications, it is ensured that the existing sockets API
   continues to work. MPTCP-aware applications can use the basic MPTCP
   API that provides some control over the transport layer equivalent to
   regular TCP.

10.  Acknowledgments

   The authors sincerely thank the following people for their helpful
   comments and reviews of the document: Philip Eardley, Lavkesh
   Lahngir, John Leslie, Costin Raiciu, Michael Tuexen, and Javier
   Ubillos.

   Michael Scharf is supported by the German-Lab project
   (http://www.german-lab.de/) funded by the German Federal Ministry of
   Education and Research (BMBF). Alan Ford was previously supported by
   Roke Manor Research and by Trilogy (http://www.trilogy-project.org/),
   a research project (ICT-216372) partially funded by the European
   Community under its Seventh Framework Program.

11.  References

11.1.  Normative References

   [1]   Postel, J., "Transmission Control Protocol", STD 7, RFC 793,
         September 1981.

   [2]   Braden, R., "Requirements for Internet Hosts - Communication
         Layers", STD 3, RFC 1122, October 1989.

   [3]   Bradner, S., "Key words for use in RFCs to Indicate Requirement
         Levels", BCP 14, RFC 2119, March 1997.

   [4]   Ford, A., Raiciu, C., Handley, M., Barre, S., and J.
         Iyengar, "Architectural Guidelines for Multipath TCP
         Development", RFC 6182, March 2011.

   [5]   Ford, A., Raiciu, C., Handley, M., and O. Bonaventure, "TCP
         Extensions for Multipath Operation with Multiple Addresses",
         draft-ietf-mptcp-multiaddressed-12 (work in progress),
         October 2012.

   [6]   Bagnulo, M., "Threat Analysis for TCP Extensions for Multipath
         Operation with Multiple Addresses", RFC 6181, March 2011.

   [7]   Raiciu, C., Handley, M., and D. Wischik, "Coupled Congestion
         Control for Multipath Transport Protocols", RFC 6356,
         October 2011.

   [8]   "IEEE Std. 1003.1-2008 Standard for Information Technology --
         Portable Operating System Interface (POSIX)", Open Group
         Technical Standard: Base Specifications, Issue 7, 2008.

11.2.  Informative References

   [9]   Stevens, W., Thomas, M., Nordmark, E., and T. Jinmei, "Advanced
         Sockets Application Program Interface (API) for IPv6",
         RFC 3542, May 2003.

   [10]  Chakrabarti, S. and E. Nordmark, "Extension to Sockets API for
         Mobile IPv6", RFC 4584, July 2006.

   [11]  Komu, M., Bagnulo, M., Slavov, K., and S. Sugimoto, "Sockets
         Application Program Interface (API) for Multihoming Shim",
         RFC 6316, July 2011.

   [12]  Komu, M. and T. Henderson, "Basic Socket Interface Extensions
         for the Host Identity Protocol (HIP)", RFC 6317, July 2011.

   [13]  Stewart, R., Tuexen, M., Poon, K., Lei, P., and V. Yasevich,
         "Sockets API Extensions for the Stream Control Transmission
         Protocol (SCTP)", RFC 6458, December 2011.

   [14]  Blanchet, M. and P. Seite, "Multiple Interfaces and
         Provisioning Domains Problem Statement", RFC 6418,
         November 2011.

   [15]  Wasserman, M. and P. Seite, "Current Practices for Multiple-
         Interface Hosts", RFC 6419, November 2011.

   [16]  Wing, D. and A. Yourtchenko, "Happy Eyeballs: Success with
         Dual-Stack Hosts", RFC 6555, April 2012.

   [17]  Dierks, T. and E.
         Rescorla, "The Transport Layer Security (TLS) Protocol
         Version 1.2", RFC 5246, August 2008.

   [18]  Cooper, D., Santesson, S., Farrell, S., Boeyen, S., Housley,
         R., and W. Polk, "Internet X.509 Public Key Infrastructure
         Certificate and Certificate Revocation List (CRL) Profile",
         RFC 5280, May 2008.

   [19]  Saint-Andre, P. and J. Hodges, "Representation and Verification
         of Domain-Based Application Service Identity within Internet
         Public Key Infrastructure Using X.509 (PKIX) Certificates in
         the Context of Transport Layer Security (TLS)", RFC 6125,
         March 2011.

   [20]  Sarolahti, P., "Multi-address Interface in the Socket API",
         March 2010.

   [21]  Khalili, R., Gast, N., Popovic, M., and J. Le Boudec,
         "Performance Issues with MPTCP", March 2010.

   [22]  Honda, M., Nishida, Y., Raiciu, C., Greenhalgh, A., Handley,
         M., and H. Tokuda, "Is it Still Possible to Extend TCP?", Proc.
         ACM Internet Measurement Conference (IMC), November 2011.

Appendix A.  Requirements on a Future Advanced MPTCP API

A.1.  Design Considerations

   Multipath transport results in many degrees of freedom. The basic
   MPTCP API only defines a minimum set of the API extensions for the
   interface between the MPTCP layer and applications, which does not
   offer much control of the MPTCP implementation's behavior. A future,
   advanced API could address further features of MPTCP and provide more
   control.

   Applications that use TCP may have different requirements on the
   transport layer. While developers have become used to the
   characteristics of regular TCP, new opportunities created by MPTCP
   could allow the service provided to be optimised further. An
   advanced API could enable MPTCP-aware applications to specify
   preferences and control certain aspects of the behavior, in addition
   to the simple control provided by the basic interface.
   An advanced API could also address aspects that are completely
   out-of-scope of the basic API, for example, the question whether a
   receiving application could influence the sending policy. A better
   integration with TLS could be another relevant objective (cf.
   Section 6.1) that requires further work.

   Furthermore, an advanced MPTCP API could be part of a new overall
   interface between the network stack and applications that addresses
   other issues as well, such as the split between identifiers and
   locators. An API that does not use IP addresses (but instead, e.g.,
   a connectbyname() function) would be useful for numerous purposes,
   independent of MPTCP.

   It has also been suggested to use a separate address family called
   AF_MULTIPATH [20]. This separate address family could be used to
   exchange multiple addresses between an application and the standard
   sockets API, but it would be a more fundamental change compared to
   the basic API described in this document.

   This appendix documents a list of potential usage scenarios and
   requirements for the advanced API. The specification and
   implementation of a corresponding API is outside the scope of this
   document.

A.2.  MPTCP Usage Scenarios and Application Requirements

   There are different MPTCP usage scenarios. An application that
   wishes to transmit bulk data will want MPTCP to provide a high
   throughput service immediately, through creating and maximising
   utilisation of all available subflows. This is the default MPTCP use
   case.

   But at the other extreme, there are applications that are highly
   interactive, but require only a small amount of throughput, and these
   are optimally served by low latency and jitter stability.
   In such a situation, it would be preferable for the traffic to use
   only the lowest latency subflow (assuming it has sufficient
   capacity), maybe with one or two additional subflows for resilience
   and recovery purposes. The key challenge for such a strategy is
   that the delay on a path may fluctuate significantly and that just
   always selecting the path with the smallest delay might result in
   instability.

   The choice between bulk data transport and latency-sensitive
   transport affects the scheduler in terms of whether traffic should
   be, by default, sent on one subflow or across several ones. Even if
   the total bandwidth required is less than that available on an
   individual path, it is desirable to spread this load to reduce stress
   on potential bottlenecks, and this is why this method should be the
   default for bulk data transport. However, that may not be optimal
   for applications that require latency/jitter stability.

   In the case of the latter option, a further question arises: Should
   additional subflows be used whenever the primary subflow is
   overloaded, or only when the primary path fails (hot-standby)? In
   other words, is latency stability or bandwidth more important to the
   application? This results in two different options: Firstly, there
   is the single path which can overflow into an additional subflow; and
   secondly there is single-path with hot-standby, whereby an
   application may want an alternative backup subflow in order to
   improve resilience. If data delivery on the first subflow fails,
   the data transport could immediately be continued on the second
   subflow, which is idle otherwise.

   Yet another complication is introduced with the potential that MPTCP
   introduces for changes in available bandwidth as the number of
   available subflows changes.
   Such jitter in bandwidth may prove confusing for some applications
   such as video or audio streaming that dynamically adapt codecs based
   on available bandwidth. Such applications may prefer MPTCP to
   attempt to provide a consistent bandwidth as far as is possible, and
   avoid maximising the use of all subflows.

   A further, mostly orthogonal question is whether data should be
   duplicated over the different subflows, in particular if there is
   spare capacity. This could improve both the timeliness and
   reliability of data delivery.

   In summary, there are at least three possible performance objectives
   for multipath transport:

   1.  High bandwidth

   2.  Low latency and jitter stability

   3.  High reliability

   These are not necessarily disjoint, since there are also broadband
   interactive applications that require both high-speed bulk data
   traffic and low latency and jitter.

   In an advanced API, applications could provide high-level guidance
   to the MPTCP implementation concerning these performance
   requirements, for instance, which is considered to be the most
   important one. The MPTCP stack would then use internal mechanisms
   to fulfill this abstract indication of a desired service, as far as
   possible. This would affect both the assignment of data (including
   retransmissions) to existing subflows (e.g., 'use all in parallel',
   'use as overflow', 'hot standby', 'duplicate traffic') and the
   decisions on when to set up additional subflows and to which
   addresses. In both cases different policies can exist, which can be
   expected to be implementation-specific.

   Therefore, an advanced API could provide a mechanism by which
   applications can specify their high-level requirements in an
   implementation-independent way. One possibility would be to select
   one "application profile" out of a number of choices that
   characterize typical applications.
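
   As a purely illustrative sketch of the profile idea, the following
   C fragment maps the three performance objectives listed above to
   the data assignment strategies named in this section. All
   identifiers (mptcp_profile, mptcp_sched, mptcp_profile_to_sched)
   and the mapping itself are assumptions made for this example; they
   are not defined by this document or by any MPTCP implementation.

```c
/* Hypothetical application profiles, one per performance objective.
 * These names are assumptions for illustration only. */
enum mptcp_profile {
    MPTCP_PROFILE_BULK,        /* 1. High bandwidth */
    MPTCP_PROFILE_INTERACTIVE, /* 2. Low latency and jitter stability */
    MPTCP_PROFILE_RELIABLE     /* 3. High reliability */
};

/* Data assignment strategies quoted in the text. */
enum mptcp_sched {
    MPTCP_SCHED_PARALLEL,      /* 'use all in parallel' */
    MPTCP_SCHED_OVERFLOW,      /* 'use as overflow' */
    MPTCP_SCHED_HOT_STANDBY,   /* 'hot standby' */
    MPTCP_SCHED_DUPLICATE      /* 'duplicate traffic' */
};

/* One possible, implementation-specific mapping from an abstract
 * profile to a scheduler strategy. A real stack could choose
 * differently, e.g., 'use as overflow' for interactive traffic. */
enum mptcp_sched mptcp_profile_to_sched(enum mptcp_profile profile)
{
    switch (profile) {
    case MPTCP_PROFILE_BULK:
        return MPTCP_SCHED_PARALLEL;
    case MPTCP_PROFILE_INTERACTIVE:
        return MPTCP_SCHED_HOT_STANDBY;
    case MPTCP_PROFILE_RELIABLE:
        return MPTCP_SCHED_DUPLICATE;
    }
    /* Unknown value: fall back to the default MPTCP use case. */
    return MPTCP_SCHED_PARALLEL;
}
```

   In this sketch, a highly interactive application would select
   MPTCP_PROFILE_INTERACTIVE and leave the choice of mechanism to the
   stack, which keeps the application code independent of the
   implementation-specific policy behind the profile.
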
   Yet, as applications today do not have to inform TCP about their
   communication requirements, further study is needed to determine
   whether such an approach would be realistic.

   Of course, independent of an advanced API, such functionality could
   also partly be achieved by MPTCP-internal heuristics that infer some
   application preferences, e.g., from existing socket options such as
   TCP_NODELAY. Whether this would be reliable, and indeed
   appropriate, is for further study.

A.3.  Potential Requirements on an Advanced MPTCP API

   The following is a list of potential requirements for an advanced
   MPTCP API beyond the features of the basic API. It is included here
   for information only:

   REQ5:  An application should be able to establish MPTCP connections
          without using IP addresses as locators.

   REQ6:  An application should be able to obtain usage information and
          statistics about all subflows (e.g., ratio of traffic sent
          via this subflow).

   REQ7:  An application should be able to request a change in the
          number of subflows in use, thus triggering removal or
          addition of subflows. An even finer control granularity
          would be a request for the establishment of a specific
          subflow to a provided destination, or a request for the
          termination of a specified, existing subflow.

   REQ8:  An application should be able to inform the MPTCP
          implementation about its high-level performance requirements,
          e.g., in the form of a profile.

   REQ9:  An application should be able to indicate communication
          characteristics, e.g., the expected amount of data to be
          sent, the expected duration of the connection, or the
          expected rate at which data is provided. Applications may in
          some cases be able to forecast such properties.
          If so, such information could be an additional input
          parameter for heuristics inside the MPTCP implementation,
          which could be useful, for example, to decide when to set up
          additional subflows.

   REQ10: An application should be able to control the automatic
          establishment/termination of subflows. This would imply a
          selection among different heuristics of the path manager,
          e.g., 'try as soon as possible', 'wait until there is a bunch
          of data', etc.

   REQ11: An application should be able to set preferred subflows or
          subflow usage policies. This would result in a selection
          among different configurations of the multipath scheduler.
          For instance, an application might want to use certain
          subflows as backup only.

   REQ12: An application should be able to control the level of
          redundancy by telling whether segments should be sent on more
          than one path in parallel.

   REQ13: An application should be able to control the use of fate
          sharing of the MPTCP connection and the initial subflow,
          e.g., to override system policies.

   REQ14: An application should be able to register for callbacks to be
          informed of changes to subflows on an MPTCP connection. This
          "push" interface would allow the application to make timely
          logging and configuration changes, if required, and would
          avoid frequent polling of information.

   An advanced API fulfilling these requirements would allow
   application developers to more specifically configure MPTCP. It
   could avoid suboptimal decisions of internal, implicit heuristics.
   However, it is unclear whether all of these requirements would have
   a significant benefit to applications, since they go beyond what the
   existing API to regular TCP provides.

   A subset of these functions might also be implemented system-wide or
   by other configuration mechanisms.
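
   To make REQ6 more concrete, the following C fragment sketches how
   the "ratio of traffic sent via this subflow" could be derived from
   per-subflow byte counters. The structure subflow_stats and the
   function subflow_traffic_ratio() are hypothetical names introduced
   for this example only; this document does not define how such
   counters would be obtained from the MPTCP implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-subflow counter; an assumption for illustration,
 * not part of any defined MPTCP API. */
struct subflow_stats {
    uint64_t bytes_sent; /* bytes sent on this subflow so far */
};

/* Returns the fraction (0.0 to 1.0) of the connection's bytes that
 * were sent on subflow 'idx'. Returns 0.0 if the arguments are
 * invalid or if nothing has been sent yet. */
double subflow_traffic_ratio(const struct subflow_stats *flows,
                             size_t count, size_t idx)
{
    uint64_t total = 0;
    size_t i;

    if (flows == NULL || idx >= count)
        return 0.0;

    for (i = 0; i < count; i++)
        total += flows[i].bytes_sent;

    if (total == 0)
        return 0.0;

    return (double)flows[idx].bytes_sent / (double)total;
}
```

   Where such counters would be maintained, and whether they would be
   exposed per connection or system-wide, is not specified in this
   sketch.
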
   These implementation details are left for further study.

A.4.  Integration with the SCTP Socket API

   The advanced API may also integrate or use the SCTP Socket API. The
   following functions that are defined for SCTP have functionality
   similar to the basic MPTCP API:

   o  sctp_bindx()

   o  sctp_connectx()

   o  sctp_getladdrs()

   o  sctp_getpaddrs()

   o  sctp_freeladdrs()

   o  sctp_freepaddrs()

   The syntax and semantics of these functions are described in [13].

   A potential objective for the advanced API is to provide a
   consistent MPTCP and SCTP interface to the application. This is
   left for further study.

Appendix B.  Change History of the Document

   Note to RFC Editor: Remove this section before publication

   Changes compared to version draft-ietf-mptcp-api-06:

   o  IETF last call and IESG review comments

   Changes compared to version draft-ietf-mptcp-api-05:

   o  More explicitly mention port numbers in addition to IP addresses

   o  Better reference to socket API specification

   o  Better explanation of fall-back to regular TCP

   Changes compared to version draft-ietf-mptcp-api-04:

   o  Slightly changed abstract (comment by Philip Eardley)

   o  Removal of redundant text from intro (comment by Philip Eardley)

   o  New text on the lack of interface differences to regular TCP
      regarding closing the connection, also in the resilience
      discussion (comment by Philip Eardley)

   o  Moved AF_MULTIPATH to appendix (comment by Philip Eardley)

   o  Update of text on connection identifier to align with latest
      protocol specification (comment by Lavkesh Lahngir)

   o  Numerous small editorial changes

   Changes compared to version draft-ietf-mptcp-api-03:

   o  Security consideration section

   o  Better explanation of the implications of explicitly specified
      addresses, most notably during the bind call

   o  Editorial changes

   Changes compared to version draft-ietf-mptcp-api-02:

   o  Updated references

   o  Editorial changes

   Changes compared to version draft-ietf-mptcp-api-01:

   o  Additional text on outdated assumptions if an MPTCP application
      does not use fate sharing.

   o  The appendix explicitly mentions an integration of the advanced
      MPTCP API and the SCTP API as a potential objective, which is
      left for further study for the basic API.

   o  A short additional explanation of the parameters of the abstract
      functions TCP_MULTIPATH_ADD and TCP_MULTIPATH_REMOVE.

   o  Better explanation when TCP_MULTIPATH_REMOVE may be used.

   Changes compared to version draft-ietf-mptcp-api-00:

   o  Explicitly specify that the TCP_MULTIPATH_SUBFLOWS function
      returns port numbers, too. Furthermore, add a new comment that
      TCP_MULTIPATH_ADD permits the specification of a port number.

   o  Mention possible additional extended API functions for the
      indication of application characteristics and for backup paths,
      based on comments received from the community.

   o  Mentions alternative approaches for avoiding non-MPTCP-capable
      paths to reduce impact on applications.

   Changes compared to version draft-scharf-mptcp-api-03:

   o  Removal of explicit references to "socket options" and
      getsockopt/setsockopt.

   o  Change of TCP_MULTIPATH_BIND to TCP_MULTIPATH_ADD and
      TCP_MULTIPATH_REMOVE.

   o  Mention of stability of bandwidth as another potential QoS
      parameter for the advanced API.

   o  Address comments received from Philip Eardley: Explanation of the
      API terminology, more explicit statement concerning applications
      that bind to a specific address, and some smaller editorial fixes

   Changes compared to version draft-scharf-mptcp-api-02:

   o  Definition of the behavior of getpeername() and getsockname()
      when being called by an MPTCP-aware application.

   o  Discussion of the possibility that an MPTCP implementation could
      support the SCTP API, as far as it is applicable to MPTCP.

   o  Various editorial fixes.

   Changes compared to version draft-scharf-mptcp-api-01:

   o  Second half of the document completely restructured

   o  Separation between a basic API and an advanced API: The focus of
      the document is the basic API only; all text concerning a
      potential extended API is moved to the appendix

   o  Several clarifications, e.g., concerning buffer sizing and the
      use of different scheduling strategies triggered by TCP_NODELAY

   o  Additional references

   Changes compared to version draft-scharf-mptcp-api-00:

   o  Distinction between legacy and MPTCP-aware applications

   o  Guidance concerning default enabling, reaction to the shutdown of
      the first subflow, etc.

   o  Reference to a potential use of AF_MULTIPATH

   o  Additional references to related work

Authors' Addresses

   Michael Scharf
   Alcatel-Lucent Bell Labs
   Lorenzstrasse 10
   70435 Stuttgart
   Germany

   EMail: michael.scharf@alcatel-lucent.com


   Alan Ford
   Cisco
   Ruscombe Business Park
   Ruscombe, Berkshire RG10 9NN
   UK

   EMail: alanford@cisco.com