idnits 2.17.1 draft-scharf-mptcp-api-01.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** You're using the IETF Trust Provisions' Section 6.b License Notice from 12 Sep 2009 rather than the newer Notice from 28 Dec 2009. (See https://trustee.ietf.org/license-info/) Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- No issues found here. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD', or 'RECOMMENDED' is not an accepted usage according to RFC 2119. Please use uppercase 'NOT' together with RFC 2119 keywords (if that is what you mean). Found 'SHOULD not' in this paragraph: The use of a multihoming shim layer conflicts with multipath transport such as MPTCP or SCTP [11]. In order to avoid any conflict, multiaddressed MPTCP SHOULD not be enabled if a network stack uses SHIM6 or HIP. Furthermore, applications should not try to use both the MPTCP API and a multihoming shim layer API. It is feasible, however, that some of the MPTCP functionality, such as congestion control, could be used in a SHIM6 or HIP environment. Such operation is outside the scope of this document. -- The document date (March 8, 2010) is 5135 days in the past. Is this intentional? Checking references for intended status: Informational ---------------------------------------------------------------------------- ** Obsolete normative reference: RFC 793 (ref. 
'1') (Obsoleted by RFC 9293) == Outdated reference: A later version (-05) exists of draft-ietf-mptcp-architecture-00 == Outdated reference: A later version (-03) exists of draft-ford-mptcp-multiaddressed-02 == Outdated reference: A later version (-08) exists of draft-ietf-mptcp-threat-00 == Outdated reference: A later version (-01) exists of draft-raiciu-mptcp-congestion-00 == Outdated reference: A later version (-17) exists of draft-ietf-shim6-multihome-shim-api-13 == Outdated reference: A later version (-32) exists of draft-ietf-tsvwg-sctpsocket-21 == Outdated reference: A later version (-12) exists of draft-ietf-mif-current-practices-00 Summary: 2 errors (**), 0 flaws (~~), 9 warnings (==), 1 comment (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Internet Engineering Task Force M. Scharf 3 Internet-Draft Alcatel-Lucent Bell Labs 4 Intended status: Informational A. Ford 5 Expires: September 9, 2010 Roke Manor Research 6 March 8, 2010 8 MPTCP Application Interface Considerations 9 draft-scharf-mptcp-api-01 11 Abstract 13 Multipath TCP (MPTCP) adds the capability of using multiple paths to 14 a regular TCP session. Even though it is designed to be totally 15 backwards compatible to applications, the data transport differs 16 compared to regular TCP, and there are several additional degrees of 17 freedom that applications may wish to exploit. This document 18 summarizes the impact that MPTCP may have on applications, such as 19 changes in performance. Furthermore, it describes an optional 20 extended application interface that provides access to multipath 21 information and enables control of some aspects of the MPTCP 22 implementation's behaviour. 24 Status of This Memo 26 This Internet-Draft is submitted to IETF in full conformance with the 27 provisions of BCP 78 and BCP 79. 
29 Internet-Drafts are working documents of the Internet Engineering 30 Task Force (IETF), its areas, and its working groups. Note that 31 other groups may also distribute working documents as Internet- 32 Drafts. 34 Internet-Drafts are draft documents valid for a maximum of six months 35 and may be updated, replaced, or obsoleted by other documents at any 36 time. It is inappropriate to use Internet-Drafts as reference 37 material or to cite them other than as "work in progress." 39 The list of current Internet-Drafts can be accessed at 40 http://www.ietf.org/ietf/1id-abstracts.txt. 42 The list of Internet-Draft Shadow Directories can be accessed at 43 http://www.ietf.org/shadow.html. 45 This Internet-Draft will expire on September 9, 2010. 47 Copyright Notice 48 Copyright (c) 2010 IETF Trust and the persons identified as the 49 document authors. All rights reserved. 51 This document is subject to BCP 78 and the IETF Trust's Legal 52 Provisions Relating to IETF Documents 53 (http://trustee.ietf.org/license-info) in effect on the date of 54 publication of this document. Please review these documents 55 carefully, as they describe your rights and restrictions with respect 56 to this document. Code Components extracted from this document must 57 include Simplified BSD License text as described in Section 4.e of 58 the Trust Legal Provisions and are provided without warranty as 59 described in the BSD License. 61 Table of Contents 63 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 64 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 5 65 3. Comparison of MPTCP and Regular TCP . . . . . . . . . . . . . 5 66 3.1. Performance Impact . . . . . . . . . . . . . . . . . . . . 5 67 3.1.1. Throughput . . . . . . . . . . . . . . . . . . . . . . 5 68 3.1.2. Delay . . . . . . . . . . . . . . . . . . . . . . . . 6 69 3.1.3. Resilience . . . . . . . . . . . . . . . . . . . . . . 6 70 3.2. Potential Problems . . . . . . . . . . . . . . . . . . . . 
6 71 3.2.1. Impact of Middleboxes . . . . . . . . . . . . . . . . 7 72 3.2.2. Outdated Implicit Assumptions . . . . . . . . . . . . 7 73 3.2.3. Security Implications . . . . . . . . . . . . . . . . 7 74 4. Operation of MPTCP with Legacy Applications . . . . . . . . . 7 75 4.1. Overview of the MPTCP Network Stack . . . . . . . . . . . 7 76 4.2. Usage of Addresses Inside Applications . . . . . . . . . . 8 77 4.3. Usage of Existing Socket Options . . . . . . . . . . . . . 9 78 4.4. Default Enabling of MPTCP . . . . . . . . . . . . . . . . 10 79 4.5. Known Remaining Issues with Legacy Applications . . . . . 10 80 5. Minimal API Enhancements for MPTCP-aware Applications . . . . 11 81 5.1. Indicating MPTCP Awareness . . . . . . . . . . . . . . . . 11 82 5.2. Modified Address Handling . . . . . . . . . . . . . . . . 11 83 5.3. Usage of a New Address Family . . . . . . . . . . . . . . 11 84 6. Extended MPTCP API . . . . . . . . . . . . . . . . . . . . . . 11 85 6.1. MPTCP Usage Scenarios and Application Requirements . . . . 11 86 6.2. Requirements on API Extensions . . . . . . . . . . . . . . 13 87 6.3. Design Considerations . . . . . . . . . . . . . . . . . . 15 88 6.4. Overview of Sockets Interface Extensions . . . . . . . . . 15 89 6.5. Detailed Description . . . . . . . . . . . . . . . . . . . 16 90 6.5.1. TCP_MP_ENABLE . . . . . . . . . . . . . . . . . . . . 16 91 6.5.2. TCP_MP_SUBFLOWS . . . . . . . . . . . . . . . . . . . 16 92 6.5.3. TCP_MP_PROFILE . . . . . . . . . . . . . . . . . . . . 16 93 6.6. Usage examples . . . . . . . . . . . . . . . . . . . . . . 17 94 6.7. Interactions and Incompatibilities with other 95 Multihoming Solutions . . . . . . . . . . . . . . . . . . 17 96 6.8. Other Advice to Application Developers . . . . . . . . . . 17 97 7. Security Considerations . . . . . . . . . . . . . . . . . . . 17 98 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 17 99 9. Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . 18 100 10. 
Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 18 101 11. References . . . . . . . . . . . . . . . . . . . . . . . . . . 18 102 11.1. Normative References . . . . . . . . . . . . . . . . . . . 18 103 11.2. Informative References . . . . . . . . . . . . . . . . . . 19 104 Appendix A. Change History of the Document . . . . . . . . . . . 19 106 1. Introduction 108 Multipath TCP (MPTCP) adds the capability of using multiple paths to 109 a regular TCP session [1]. The motivations for this extension 110 include increasing throughput, overall resource utilisation, and 111 resilience to network failure, and these motivations are discussed, 112 along with high-level design decisions, as part of the MPTCP 113 architecture [4]. MPTCP [5] offers the same reliable, in-order, 114 byte-stream transport as TCP, and is designed to be backward- 115 compatible with both applications and the network layer. It requires 116 support inside the network stack of both endpoints. This document 117 presents the impacts that MPTCP may have on applications, such as 118 performance changes compared to regular TCP. Furthermore, it 119 specifies an extended Application Programming Interface (API) 120 describing how applications can exploit additional features of 121 multipath transport. MPTCP is designed to be usable without any 122 application changes. The specified API is an optional extension that 123 provides access to multipath information and enables control of some 124 aspects of the MPTCP implementation's behaviour, for example 125 switching on or off the automatic use of MPTCP. 127 The de facto standard API for TCP/IP applications is the "sockets" 128 interface. This document defines experimental MPTCP-specific 129 extensions, in particular additional socket options. It is up to the 130 applications, high-level programming languages, or libraries to 131 decide whether to use these optional extensions. 
For instance, an 132 application may want to turn on or off the MPTCP mechanism for 133 certain data transfers, or provide some guidance concerning its usage 134 (and thus the service the application receives). The syntax and 135 semantics of the specification are in line with the Posix standard [8] 136 as much as possible. 138 Some network stack implementations, especially on mobile devices, have 139 centralized connection managers or other higher-level APIs to solve 140 multi-interface issues, as surveyed in [14]. Their interaction with 141 MPTCP is outside the scope of this note. 143 There are also various related extensions of the sockets interface: 144 [11] specifies sockets API extensions for a multihoming shim layer. 145 The API enables interactions between applications and the multihoming 146 shim layer for advanced locator management and for access to 147 information about failure detection and path exploration. Other 148 experimental extensions to the sockets API are defined for the Host 149 Identity Protocol (HIP) [12] in order to manage the bindings of 150 identifiers and locators. Other related API extensions exist for IPv6 151 [10] and SCTP [13]. There can be interactions or incompatibilities 152 of these APIs with MPTCP, which are discussed later in this document. 154 The target readers of this document are application programmers who 155 develop application software that may benefit significantly from 156 MPTCP. This document also provides the necessary information for 157 developers of MPTCP to implement the API in a TCP/IP network stack. 159 2. Terminology 161 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 162 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 163 document are to be interpreted as described in [3]. 165 This document uses the terminology introduced in [5]. 167 3. 
Comparison of MPTCP and Regular TCP 169 This section discusses the impact that the use of MPTCP will have on 170 applications, in comparison to what may be expected from the use of 171 regular TCP. 173 3.1. Performance Impact 175 One of the key goals of adding multipath capability to TCP is to 176 improve the performance of a transport connection by load 177 distribution over separate subflows across potentially disjoint 178 paths. Furthermore, it is an explicit goal of MPTCP that it should 179 not provide a worse performing connection than would have existed 180 through the use of legacy, single-path TCP. A corresponding 181 congestion control algorithm is described in [7]. The following 182 sections summarize the performance impact of MPTCP as seen by an 183 application. 185 3.1.1. Throughput 187 The most obvious performance improvement that will be gained with the 188 use of MPTCP is an increase in throughput, since MPTCP will pool more 189 than one path (where available) between two endpoints. This will 190 provide greater bandwidth for an application. If there are shared 191 bottlenecks between the flows, then the congestion control algorithms 192 will ensure that load is evenly spread amongst regular and multipath 193 TCP sessions, so that no end user receives worse performance than 194 single-path TCP. 196 Furthermore, this means that an MPTCP session could achieve 197 throughput that is greater than the capacity of a single interface on 198 the device. If any applications make assumptions about interfaces 199 due to throughput (or vice versa), they must take this into account. 201 The transport of MPTCP signaling information results in a small 202 overhead. If multiple subflows share the same bottleneck, this 203 overhead slightly reduces the capacity that is available for data 204 transport. 
Yet, this potential reduction of throughput will be 205 negligible in many usage scenarios, and the protocol contains 206 optimisations in its design so that this overhead is minimal. 208 3.1.2. Delay 210 If the delays on the constituent subflows of an MPTCP connection 211 differ, the jitter perceivable to an application may appear higher as 212 the data is striped across the subflows. Although MPTCP will ensure 213 in-order delivery to the application, the application must be able to 214 cope with the data delivery being burstier than may be usual with 215 single-path TCP. Since burstiness is commonplace on the Internet 216 today, it is unlikely that applications will suffer from such an 217 impact on the traffic profile, but application authors may wish to 218 consider this in future development. 220 In addition, applications that make round trip time (RTT) estimates 221 at the application level may have some issues. Whilst the average 222 delay calculated will be accurate, whether this is useful for an 223 application will depend on what it requires this information for. If 224 a new application wishes to derive such information, it should 225 consider how multiple subflows may affect its measurements, and thus 226 how it may wish to respond. In such a case, an application may wish 227 to express its scheduling preferences, as described later in this 228 document. 230 3.1.3. Resilience 232 The use of multiple subflows simultaneously means that, if one should 233 fail, all traffic will move to the remaining subflow(s), and 234 additionally any lost packets can be retransmitted on these subflows. 236 Subflow failure may be caused by issues within the network, which an 237 application would be unaware of, or interface failure on the node. 238 An application may, under certain circumstances, be in a position to 239 be aware of such failure (e.g. 
by radio signal strength, or simply an 240 interface enabled flag), and so must not make assumptions about an MPTCP 241 flow's stability based on this. MPTCP will never override an 242 application's request for a given interface, however, so the cases 243 where this issue may be applicable are limited. 245 3.2. Potential Problems 246 3.2.1. Impact of Middleboxes 248 MPTCP has been designed in order to pass through the majority of 249 middleboxes, for example through its ability to open subflows in 250 either direction, and through its use of a data-level sequence 251 number. 253 Nevertheless some middleboxes may still refuse to pass MPTCP messages 254 due to the presence of TCP options. If this is the case, MPTCP 255 should fall back to regular TCP. Although this will not create a 256 problem for the application (its communication will be set up either 257 way), there may be additional (and indeed, user-perceivable) delay 258 while the first handshake fails. 260 Empirical evidence suggests that new TCP options can successfully be 261 used on most paths in the Internet. But they can also have other 262 unexpected implications. For instance, intrusion detection systems 263 could be triggered. Full analysis of MPTCP's impact on such 264 middleboxes is for further study. 266 3.2.2. Outdated Implicit Assumptions 268 MPTCP overcomes the one-to-one mapping of the socket interface to a 269 flow through the network. As a result, applications cannot 270 implicitly rely on this one-to-one mapping any more. Applications 271 that require the transport along a single path can disable the use of 272 MPTCP as described later in this document. Examples include 273 monitoring tools that want to measure the available bandwidth on a 274 path, or routing protocols such as BGP that require the use of a 275 specific link. 277 3.2.3. 
Security Implications 279 The support for multiple IP addresses within one MPTCP connection can 280 result in additional security vulnerabilities, such as possibilities 281 for attackers to hijack connections. The protocol design of MPTCP 282 minimizes this risk. An attacker on one of the paths can cause harm, 283 but this is hardly an additional security risk compared to single- 284 path TCP, which is vulnerable to man-in-the-middle attacks, too. A 285 detailed threat analysis of MPTCP is published in [6]. 287 4. Operation of MPTCP with Legacy Applications 289 4.1. Overview of the MPTCP Network Stack 291 MPTCP is an extension of TCP, but it is designed to be backward 292 compatible for legacy applications. TCP interacts with other parts 293 of the network stack by different interfaces. The de facto standard 294 API between TCP and applications is the sockets interface. The 295 position of MPTCP in the protocol stack is illustrated in 296 Figure 1.
298      +-------------------------------+
299      |          Application          |
300      +-------------------------------+
301              ^                 |
302   ~~~~~~~~~~~|~Socket Interface|~~~~~~~~~~~
303              |                 v
304      +-------------------------------+
305      |             MPTCP             |
306      + - - - - - - - + - - - - - - - +
307      | Subflow (TCP) | Subflow (TCP) |
308      +-------------------------------+
309      |       IP      |      IP       |
310      +-------------------------------+
312 Figure 1: MPTCP protocol stack 314 In general, MPTCP can affect all interfaces that rely on the coupling 315 of a TCP connection to a single IP address and TCP port pair, to one 316 sockets endpoint, to one network interface, or to a given path 317 through the network. 319 This means that there are two classes of applications: 321 o Legacy applications: These applications use the existing API 322 towards TCP without any changes. This is the default case. 324 o MPTCP-aware applications: These applications indicate support for 325 an enhanced MPTCP interface. 
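For orientation, the first class above can be pictured as any program that touches only the standard sockets calls. The following is a minimal sketch of such a legacy application, written in Python for brevity; the function name legacy_echo_once is invented for this illustration and the loopback echo is only a stand-in for a real workload. Nothing in it refers to MPTCP, so whether the stack underneath uses one subflow or several would be invisible to it.

```python
import socket


def legacy_echo_once(payload: bytes) -> bytes:
    """Echo a small payload over a loopback TCP connection.

    Only standard sockets calls are used; there is no MPTCP-specific
    code, which is what makes this a "legacy application".
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        listener.listen(1)
        # Connect to whatever address and port the listener was given.
        with socket.create_connection(listener.getsockname()) as client:
            server_side, _peer = listener.accept()
            with server_side:
                client.sendall(payload)
                # Echo back what arrived. A single recv() is enough for a
                # small payload on loopback; real code would loop.
                server_side.sendall(server_side.recv(len(payload)))
                return client.recv(len(payload))
```

Calling legacy_echo_once(b"hello") performs one round trip through the ordinary TCP sockets path.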
327 In the following, it is discussed to which extent MPTCP affects 328 legacy applications using the existing sockets API. 330 4.2. Usage of Addresses Inside Applications 332 The existing sockets API implies that applications deal with data 333 structures that store, amongst others, the IP addresses and TCP port 334 numbers of a TCP connection. A design objective of MPTCP is that 335 legacy applications can continue to use the established sockets API 336 without any changes. However, in MPTCP there is a one-to-many 337 mapping between the socket endpoint and the subflows. This has 338 several subtle implications for legacy applications using sockets API 339 functions. 341 During binding, an application can either select a specific address, 342 or bind to INADDR_ANY. Furthermore, the SO_BINDTODEVICE socket 343 option can be used to bind to a specific interface. If an 344 application uses a specific address, or sets the SO_BINDTODEVICE 345 socket option to bind to a specific interface, then MPTCP MUST 346 respect this and not interfere in the application's choices. If an 347 application binds to INADDR_ANY, it is assumed that the application 348 does not care which addresses to use locally. In this case, a local 349 policy MAY allow MPTCP to automatically set up multiple subflows on 350 such a connection. The extended sockets API will allow applications 351 to express specific preferences in an MPTCP-compatible way (e.g. bind 352 to a subset of interfaces only). 354 Applications can use the getpeername() or getsockname() functions in 355 order to retrieve the IP address of the peer or of the local socket. 356 These functions can be used for various purposes, including security 357 mechanisms, geo-location, or interface checks. 
The socket API was 358 designed with an assumption that a socket is using just one address, 359 and since this address is visible to the application, the application 360 may assume that the information provided by the functions is the same 361 during the lifetime of a connection. However, in MPTCP, unlike in 362 TCP, there is a one-to-many mapping of a connection to subflows, and 363 subflows can be added and removed while the connection continues to 364 exist. Therefore, MPTCP cannot expose addresses by getpeername() or 365 getsockname() that are both valid and constant during the 366 connection's lifetime. 368 This problem is addressed as follows: If used by a legacy 369 application, the MPTCP stack MUST always return the addresses of the 370 first subflow of an MPTCP connection, in all circumstances, even if 371 that particular subflow is no longer in use. As this address may not 372 be valid any more if the first subflow is closed, the MPTCP stack MAY 373 close the whole MPTCP connection if the first subflow is closed (fate 374 sharing). Whether to close the whole MPTCP connection by default 375 SHOULD be controlled by a local policy. Further experiments are 376 needed to investigate its implications. 378 Instead of getpeername() or getsockname(), MPTCP-aware applications 379 can use new API calls, documented later, in order to retrieve the 380 full list of address pairs for the subflows in use. 382 4.3. Usage of Existing Socket Options 384 The existing sockets API includes options that modify the behavior of 385 sockets and their underlying communications protocols. Various 386 socket options exist on socket, TCP, and IP level. The value of an 387 option can usually be set by the setsockopt() system function. The 388 getsockopt() function gets information. In general, the existing 389 sockets interface functions cannot configure each MPTCP subflow 390 individually. 
In order to be backward compatible, existing APIs 391 therefore should apply to all subflows within one connection, as far 392 as possible. 394 One commonly used TCP socket option (TCP_NODELAY) disables the Nagle 395 algorithm as described in [2]. This option is also specified in the 396 Posix standard [8]. Applications can use this option in combination 397 with MPTCP exactly in the same way. It then disables the Nagle 398 algorithm for the MPTCP connection, i.e., all subflows. 400 TODO: Setting this option could also trigger a different path 401 scheduler algorithm - specifically, that which is designed for 402 latency-sensitive traffic, as described in a later section. 404 Applications can also explicitly configure send and receive buffer 405 sizes by the sockets API (SO_SNDBUF, SO_RCVBUF). These socket 406 options can also be used in combination with MPTCP and then affect 407 the buffer size of the MPTCP connection. However, when defining 408 buffer sizes, application programmers should take into account that 409 the transport over several subflows requires a certain amount of 410 buffer for resequencing. Therefore, it does not make sense to use 411 MPTCP in combination with very small receive buffers. Small send 412 buffers may prevent MPTCP from efficiently scheduling data over 413 different subflows. It may be appropriate for an MPTCP 414 implementation to set a lower bound for such buffers, or 415 alternatively treat a small buffer size request as an implicit 416 request not to use MPTCP. 418 Some network stacks also provide other implementation-specific socket 419 options or interfaces that affect TCP's behavior. If a network stack 420 supports MPTCP, it must be ensured that these options do not 421 interfere. 423 4.4. Default Enabling of MPTCP 425 It is up to a local policy at the end system whether a network stack 426 should automatically enable MPTCP for sockets even if there is no 427 explicit sign of MPTCP awareness of the corresponding application. 
428 Such a choice may be under the control of the user through system 429 preferences. 431 4.5. Known Remaining Issues with Legacy Applications 433 TODO: Future experiments will show whether legacy applications could 434 break despite the backward-compatible API of MPTCP. 436 5. Minimal API Enhancements for MPTCP-aware Applications 438 5.1. Indicating MPTCP Awareness 440 While applications can use MPTCP with the unmodified sockets API, a 441 clean interface requires small semantic changes compared to the 442 existing sockets API. Even if these changes do not affect most 443 applications, they are only enabled if an application explicitly 444 signals that it supports multipath transport and the enhanced 445 interface, in order to maintain backward compatibility with legacy 446 applications. An application can explicitly indicate multipath 447 capability by setting the TCP_MP_ENABLE option described below. 449 5.2. Modified Address Handling 451 The main change of the sockets API for MPTCP-aware applications is as 452 follows: If a socket is MPTCP-aware and thus does not use the 453 backward-compatibility mode, the functions getpeername() and 454 getsockname() SHOULD fail with a new error code EMULTIPATH. Due to 455 their ambiguity, an MPTCP-aware application should not use these two 456 functions. Instead, the information about the addresses in use can 457 be accessed by the extended sockets API, if needed. 459 5.3. Usage of a New Address Family 461 As an alternative to setting a socket option, an application can also 462 use a new, separate address family called AF_MULTIPATH [9]. This 463 separate address family can be used to exchange multiple addresses 464 between an application and the standard sockets API, and additionally 465 acts as an explicit indication that an application is MPTCP-aware, 466 i.e., that it can deal with the semantic changes of the sockets API, 467 in particular concerning getpeername() and getsockname(). 
The usage 468 of AF_MULTIPATH is also more flexible with respect to multipath 469 transport, either IPv4 or IPv6, or both in parallel [9]. 471 6. Extended MPTCP API 473 6.1. MPTCP Usage Scenarios and Application Requirements 475 Applications that use TCP may have different requirements on the 476 transport layer. While developers have become used to the 477 characteristics of regular TCP, new opportunities created by MPTCP 478 could allow the service provided to be optimised further. An 479 extended API enables MPTCP-aware applications to specify preferences 480 and control certain aspects of the behavior, in addition to the 481 simple controls already discussed, such as switching on or off the 482 automatic use of MPTCP. 484 An application that wishes to transmit bulk data will want MPTCP to 485 provide a high throughput service immediately, through creating and 486 maximising utilisation of all available subflows. This is the 487 default MPTCP use case. 489 But at the other extreme, there are applications that are highly 490 interactive, but require only a small amount of throughput, and these 491 are optimally served by low latency and jitter stability. In such a 492 situation, it would be preferable for the traffic to use only the 493 lowest latency subflow (assuming it has sufficient capacity), with 494 one or two additional subflows for resilience and recovery purposes. 496 The choice between these two options affects the scheduler in terms 497 of whether traffic should be, by default, sent on one subflow or 498 across both. Even if the total bandwidth required is less than that 499 available on an individual path, it is desirable to spread this load 500 to reduce stress on potential bottlenecks, and this is why this 501 method should be the default. It is recognised, however, that this 502 may not benefit all applications that require latency/jitter 503 stability, so the other (single path) option is provided. 
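The two scheduling options contrasted above can be sketched as a small selection function. This is an illustrative model only, not something the draft specifies: the function name choose_subflows, the profile strings "bulk" and "latency", the spares parameter, and the use of a per-subflow RTT list as input are all assumptions made for this example.

```python
def choose_subflows(rtts_ms, profile, spares=1):
    """Pick which subflows carry data under a given (hypothetical) profile.

    rtts_ms -- measured round-trip time per subflow, in milliseconds
    profile -- "bulk": spread traffic over every available subflow
               "latency": send on the lowest-RTT subflow only and keep
               up to `spares` further subflows for resilience
    Returns a dict with the indices of active and standby subflows.
    """
    # Subflow indices ordered from lowest to highest RTT.
    by_rtt = sorted(range(len(rtts_ms)), key=lambda i: rtts_ms[i])
    if profile == "bulk":
        return {"active": sorted(by_rtt), "standby": []}
    if profile == "latency":
        return {"active": by_rtt[:1], "standby": by_rtt[1:1 + spares]}
    raise ValueError("unknown profile: %r" % (profile,))
```

With three subflows whose RTTs are 40, 12 and 95 ms, the "bulk" profile uses all three, while the "latency" profile sends on the 12 ms subflow and keeps the 40 ms subflow in reserve. The sketch ignores capacity, which the text above notes the lowest-latency subflow must have in sufficient amount.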
505 In the case of the latter option, however, a further question arises: 506 should additional subflows be used whenever the primary subflow is 507 overloaded, or only when the primary path fails (hot-standby)? In 508 other words, is latency stability or bandwidth more important to the 509 application? 511 We therefore divide this option into two: Firstly, there is the 512 single path which can overflow into an additional subflow; and 513 secondly there is single-path with hot-standby, whereby an 514 application may want an alternative backup subflow in order to 515 improve resilience. In case that data delivery on the first subflow 516 fails, the data transport could immediately be continued on the 517 second subflow, which is idle otherwise. 519 In summary, there are three different "application profiles" 520 concerning the use of MPTCP: 522 1. Bulk data transport 524 2. Latency-sensitive transport (with overflow) 526 3. Latency-sensitive transport (hot-standby) 528 These different application profiles affect both the management of 529 subflows, i.e., the decisions when to set up additional subflows to 530 which addresses as well as the assignment of data (including 531 retransmissions) to the existing subflows. In both cases different 532 policies can exist. 534 These profiles have been defined to cover the common application use 535 cases. It is not possible to cover all application requirements, 536 however, and as such applications may wish to have finer control over 537 subflows and packet scheduling. A set of requirements is listed 538 below. 540 Although it is intended that such functionality will be achieved 541 through new MPTCP-specific options, it may also be possible to infer 542 some application preferences from existing socket options, such as 543 TCP_NODELAY. Whether this would be reliable, and indeed appropriate, 544 is for further study. 546 6.2. 
Requirements on API Extensions 548 Because of the importance of the sockets interface there are several 549 fundamental design objectives for the interface between MPTCP and 550 applications: 552 o Consistency with existing sockets APIs must be maintained as far 553 as possible. In order to support the large base of applications 554 using the original API, a legacy application must be able to 555 continue to use standard socket interface functions when run on a 556 system supporting MPTCP. Also, MPTCP-aware applications should be 557 able to access the socket without any major changes. 559 o Sockets API extensions must be minimized and independent of an 560 implementation. 562 o The interface should handle both IPv4 and IPv6. 564 The following is a list of specific requirements from applications: 566 TODO: This list of requirements is preliminary and requires further 567 discussion. Some requirements have to be removed. 569 REQ1: Turn on/off MPTCP: An application should be able to request to 570 turn on or turn off the usage of MPTCP. This means that an 571 application should be able to explicitly request the use of 572 MPTCP if this is possible. Applications should also be able 573 to request not to enable MPTCP and to use regular TCP 574 transport instead. This can be implicit in many cases, e.g., 575 since MPTCP must be disabled by the use of binding to a specific 576 address, or may be enabled if an application uses 577 AF_MULTIPATH. 579 REQ2: An application will want to be able to restrict MPTCP to 580 binding to a given set of addresses or interfaces. 582 REQ3: An application should be able to know if multiple subflows are 583 in use. 585 REQ4: An application should be able to enumerate all subflows in 586 use, obtain information on the addresses used by a subflow, 587 and obtain a subflow's usage (e.g., ratio of traffic sent via 588 this subflow). 
   REQ5:  An application should be able to extract a unique identifier
          for the connection (per endpoint), analogous to a port, i.e.,
          it should be able to retrieve MPTCP's connection identifier.
          (TODO)

   REQ6:  Set/get the application profile, as discussed in the previous
          section.

   The above requirements are seen as having fairly clear benefits to
   applications.  Although in some cases they go beyond what regular
   TCP would provide, they allow an application to make optimal use of
   the new features that MPTCP provides.

   The following requirements are more specific, and could mostly be
   implied through more generic options, such as the application profile
   selection.  They are currently included here as potential discussion
   points, however, as they may be of use to application developers as
   more specific configuration options, beyond being an implicit part of
   a profile selection.

   REQ7:  Constrain the maximum number of subflows to be used by an
          MPTCP connection.

   REQ8:  Request a change in scheduling between subflows.

   REQ9:  Request a change in the number of subflows in use, thus
          triggering removal or addition of subflows.  (A finer control
          granularity would be: Request the establishment of a new
          subflow to a provided destination, and request the
          termination of a specified, existing subflow.)

   REQ10:  Control automatic establishment/termination of subflows?
           There could be different configurations of the path manager,
           e.g., 'try ASAP', 'wait until there is a bunch of data', etc.
           (Tied to application profile?)

   REQ11:  Set/get preferred subflows or subflow usage policies?  There
           could be different configurations of the multipath scheduler,
           e.g., 'all-or-nothing', 'overflow', etc.  (Again, tied to
           application profile?)

   REQ12:  Get/set redundancy, i.e., to send segments on more than one
           path in parallel.
   REQ13:  An application should be able to modify the MPTCP
           configuration while communication is ongoing, i.e., after
           establishment of the MPTCP connection.

6.3.  Design Considerations

   Multipath transport results in many degrees of freedom.  MPTCP
   manages the data transport over different subflows automatically.  By
   default, this is transparent to the application.  But applications
   can use the sockets API extensions defined in this section to
   interface with the MPTCP layer and to control important aspects of
   the MPTCP implementation's behaviour.  The API uses non-mandatory
   socket options and is designed to be as light-weight as possible.

   MPTCP mainly affects the sending of data.  Therefore, most of the new
   socket options must be set on the sender side of a data transfer in
   order to take effect.  Nevertheless, it is also possible for a
   receiver to have preferences about data transfer choices, as it too
   may have performance requirements.  (TODO) It is for further study as
   to whether it is feasible for a receiving application to influence
   sending policy, and if so, how this could be implemented.

   As this document specifies sockets API extensions, it is written so
   that the syntax and semantics are in line with the POSIX standard [8]
   as much as possible.

6.4.  Overview of Sockets Interface Extensions

   The extended MPTCP API consists of several new socket options that
   are specific to MPTCP.  All of these socket options are defined at
   TCP level (IPPROTO_TCP).  These socket options can be used either by
   the getsockopt() or by the setsockopt() system call.

   The new API functions can be classified into general configuration
   and more advanced configuration.
   The new socket options for the general configuration of MPTCP are:

   o  TCP_MP_ENABLE: Enable/disable MPTCP

   o  TCP_MP_SUBFLOWS: Get the addresses currently used by the MPTCP
      subflows, optionally complemented by further information such as
      usage ratio

   o  TCP_MP_PROFILE: Get/set the MPTCP profile

   o  ...

   Table 1 shows a list of the socket options for the general
   configuration of MPTCP.  The first column gives the name of the
   option.  The second and third columns indicate whether the option can
   be handled by the getsockopt() system call and/or by the setsockopt()
   system call.  The fourth column lists the type of data structure
   specified along with the socket option.

   +-----------------+-----+-----+-----------+
   | Option name     | Get | Set | Data type |
   +-----------------+-----+-----+-----------+
   | TCP_MP_ENABLE   |  o  |  o  | int       |
   | TCP_MP_SUBFLOWS |  o  |     | *1        |
   | TCP_MP_PROFILE  |  o  |  o  | int       |
   | ...             |     |     |           |
   +-----------------+-----+-----+-----------+

   *1: Data structure containing the addresses of each subflow, plus
   further information

                    Table 1: Socket options for MPTCP

   TODO: More options may be added in a future version of this note.

6.5.  Detailed Description

6.5.1.  TCP_MP_ENABLE

   TODO: Description

6.5.2.  TCP_MP_SUBFLOWS

   TODO: Description

6.5.3.  TCP_MP_PROFILE

   TODO: Description

6.6.  Usage examples

   TODO: Example C code for one or more API functions

6.7.  Interactions and Incompatibilities with other Multihoming
      Solutions

   The use of MPTCP can interact with various related sockets API
   extensions.  Care should be taken not to confuse the use of MPTCP
   with the overlapping features of these extensions:

   o  SHIM API [11]: This API specifies sockets API extensions for the
      multihoming shim layer.

   o  HIP API [12]: The Host Identity Protocol (HIP) also results in a
      new API.
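
   As an illustration of keeping the two API families apart, an
   application that relies on a shim-layer API could explicitly ask for
   regular TCP on its sockets.  The following C fragment is a minimal
   sketch of that idea using the TCP_MP_ENABLE option from Section 6.4.
   It rests on several assumptions: this document does not assign a
   numeric value to TCP_MP_ENABLE, so the value below is a placeholder;
   the int argument is assumed to mean 0 = disable and non-zero =
   enable; and the helper names are hypothetical.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>   /* close(), for callers of the helpers below */

/* ASSUMPTION: placeholder value; this document assigns no number. */
#ifndef TCP_MP_ENABLE
#define TCP_MP_ENABLE 0x4d50
#endif

/*
 * Ask the stack to enable (non-zero) or disable (0) MPTCP on fd.
 * ASSUMPTION: the int argument carries on/off semantics.
 * Returns 0 on success, -1 on failure with errno set by setsockopt().
 */
int mptcp_set_enabled(int fd, int enable)
{
    return setsockopt(fd, IPPROTO_TCP, TCP_MP_ENABLE,
                      &enable, sizeof(enable));
}

/*
 * Create a TCP socket for an application that uses a shim-layer API
 * and therefore requests regular TCP.  A stack that rejects the
 * option is treated as non-fatal in this sketch: the socket is still
 * returned and behaves according to the stack's defaults.
 * Returns the socket descriptor, or -1 if socket() fails.
 */
int open_socket_without_mptcp(void)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    if (mptcp_set_enabled(fd, 0) < 0)
        perror("TCP_MP_ENABLE not accepted; continuing with defaults");
    return fd;
}
```

   A caller would use the returned descriptor with connect() as usual
   and release it with close().  An application with stricter needs
   could instead treat a rejected option as an error.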

   The use of a multihoming shim layer conflicts with multipath
   transport such as MPTCP or SCTP [11].  In order to avoid any
   conflict, multiaddressed MPTCP SHOULD NOT be enabled if a network
   stack uses SHIM6 or HIP.  Furthermore, applications should not try to
   use both the MPTCP API and a multihoming shim layer API.  It is
   feasible, however, that some of the MPTCP functionality, such as
   congestion control, could be used in a SHIM6 or HIP environment.
   Such operation is outside the scope of this document.

6.8.  Other Advice to Application Developers

   o  Using the default MPTCP configuration: MPTCP is designed to be
      efficient and robust in the default configuration.  Application
      developers should not explicitly configure features unless this is
      really needed.

   o  Socket buffer dimensioning: Multipath transport requires larger
      buffers in the receiver for resequencing, as already explained.
      Applications should use reasonable buffer sizes (such as the
      operating system default values) in order to fully benefit from
      MPTCP.

7.  Security Considerations

   Will be added in a later version of this document.

8.  IANA Considerations

   No IANA considerations.

9.  Conclusion

   This document discusses MPTCP's application implications and
   specifies an extended API.  From an architectural point of view,
   MPTCP offers additional degrees of freedom concerning the transport
   of data.  The extended sockets API allows MPTCP-aware applications to
   have additional control of some aspects of the MPTCP implementation's
   behaviour and to obtain information about its usage.  The new socket
   options for MPTCP can be used by getsockopt() and/or setsockopt()
   system calls.  At the same time, the existing sockets API continues
   to work for legacy applications.

10.
Acknowledgments

   The authors sincerely thank the following people for their helpful
   comments on the document: Costin Raiciu.

   Michael Scharf is supported by the German-Lab project
   (http://www.german-lab.de/) funded by the German Federal Ministry of
   Education and Research (BMBF).  Alan Ford is supported by Trilogy
   (http://www.trilogy-project.org/), a research project (ICT-216372)
   partially funded by the European Community under its Seventh
   Framework Program.  The views expressed here are those of the
   author(s) only.  The European Commission is not liable for any use
   that may be made of the information in this document.

11.  References

11.1.  Normative References

   [1]   Postel, J., "Transmission Control Protocol", STD 7, RFC 793,
         September 1981.

   [2]   Braden, R., "Requirements for Internet Hosts - Communication
         Layers", STD 3, RFC 1122, October 1989.

   [3]   Bradner, S., "Key words for use in RFCs to Indicate Requirement
         Levels", BCP 14, RFC 2119, March 1997.

   [4]   Ford, A., Raiciu, C., Barre, S., and J. Iyengar, "Architectural
         Guidelines for Multipath TCP Development",
         draft-ietf-mptcp-architecture-00 (work in progress),
         March 2010.

   [5]   Ford, A., Raiciu, C., and M. Handley, "TCP Extensions for
         Multipath Operation with Multiple Addresses",
         draft-ford-mptcp-multiaddressed-02 (work in progress),
         October 2009.

   [6]   Bagnulo, M., "Threat Analysis for Multi-addressed/Multi-path
         TCP", draft-ietf-mptcp-threat-00 (work in progress),
         February 2010.

   [7]   Raiciu, C., Handley, M., and D. Wischik, "Coupled Multipath-
         Aware Congestion Control", draft-raiciu-mptcp-congestion-00
         (work in progress), October 2009.

   [8]   "IEEE Std. 1003.1-2008 Standard for Information Technology --
         Portable Operating System Interface (POSIX)", Open Group
         Technical Standard: Base Specifications, Issue 7, 2008.

11.2.
Informative References

   [9]   Sarolahti, P., "Multi-address Interface in the Socket API",
         draft-sarolahti-mptcp-af-multipath-01 (work in progress),
         March 2010.

   [10]  Stevens, W., Thomas, M., Nordmark, E., and T. Jinmei, "Advanced
         Sockets Application Program Interface (API) for IPv6",
         RFC 3542, May 2003.

   [11]  Komu, M., Bagnulo, M., Slavov, K., and S. Sugimoto, "Socket
         Application Program Interface (API) for Multihoming Shim",
         draft-ietf-shim6-multihome-shim-api-13 (work in progress),
         February 2010.

   [12]  Komu, M. and T. Henderson, "Basic Socket Interface Extensions
         for Host Identity Protocol (HIP)", draft-ietf-hip-native-api-12
         (work in progress), January 2010.

   [13]  Stewart, R., Poon, K., Tuexen, M., Yasevich, V., and P. Lei,
         "Sockets API Extensions for Stream Control Transmission
         Protocol (SCTP)", draft-ietf-tsvwg-sctpsocket-21 (work in
         progress), February 2010.

   [14]  Wasserman, M., "Current Practices for Multiple Interface
         Hosts", draft-ietf-mif-current-practices-00 (work in progress),
         October 2009.

Appendix A.  Change History of the Document

   Changes compared to version 00:

   o  Distinction between legacy and MPTCP-aware applications

   o  Guidance concerning default enabling, reaction to the shutdown of
      the first sub-flow, etc.

   o  Reference to a potential use of AF_MULTIPATH

   o  Additional references to related work

Authors' Addresses

   Michael Scharf
   Alcatel-Lucent Bell Labs
   Lorenzstrasse 10
   70435 Stuttgart
   Germany

   EMail: michael.scharf@alcatel-lucent.com


   Alan Ford
   Roke Manor Research
   Old Salisbury Lane
   Romsey, Hampshire  SO51 0ZN
   UK

   Phone: +44 1794 833 465
   EMail: alan.ford@roke.co.uk