Internet Engineering Task Force                                  A. Ford
Internet-Draft                                       Roke Manor Research
Intended status: Informational                                 C. Raiciu
Expires: July 25, 2011                                        M. Handley
                                               University College London
                                                                S. Barre
                                                Universite catholique de
                                                                 Louvain
                                                              J. Iyengar
                                           Franklin and Marshall College
                                                        January 21, 2011

         Architectural Guidelines for Multipath TCP Development
                    draft-ietf-mptcp-architecture-05

Abstract

   Hosts are often connected by multiple paths, but TCP restricts
   communications to a single path per transport connection. Resource
   usage within the network would be more efficient were these multiple
   paths able to be used concurrently. This should enhance user
   experience through improved resilience to network failure and higher
   throughput.

   This document outlines architectural guidelines for the development
   of a Multipath Transport Protocol, with references to how these
   architectural components come together in the development of a
   Multipath TCP protocol. This document lists certain high level
   design decisions that provide foundations for the design of the MPTCP
   protocol, based upon these architectural requirements.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on July 25, 2011.

Copyright Notice

   Copyright (c) 2011 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Requirements Language
     1.2.  Terminology
     1.3.  Reference Scenario
   2.  Goals
     2.1.  Functional Goals
     2.2.  Compatibility Goals
       2.2.1.  Application Compatibility
       2.2.2.  Network Compatibility
       2.2.3.  Compatibility with other network users
     2.3.  Security Goals
     2.4.  Related Protocols
   3.  An Architectural Basis For Multipath TCP
   4.  A Functional Decomposition of MPTCP
   5.  High-Level Design Decisions
     5.1.  Sequence Numbering
     5.2.  Reliability and Retransmissions
     5.3.  Buffers
     5.4.  Signalling
     5.5.  Path Management
     5.6.  Connection Identification
     5.7.  Congestion Control
     5.8.  Security
   6.  Software Interactions
     6.1.  Interactions with Applications
     6.2.  Interactions with Management Systems
   7.  Interactions with Middleboxes
   8.  Contributors
   9.  Acknowledgements
   10. IANA Considerations
   11. Security Considerations
   12. References
     12.1.  Normative References
     12.2.  Informative References
   Appendix A.  Changelog
     A.1.  Changes since draft-ietf-mptcp-architecture-04
     A.2.  Changes since draft-ietf-mptcp-architecture-03
     A.3.  Changes since draft-ietf-mptcp-architecture-02
     A.4.  Changes since draft-ietf-mptcp-architecture-01
     A.5.  Changes since draft-ietf-mptcp-architecture-00
   Authors' Addresses

1.  Introduction

   As the Internet evolves, demands on Internet resources are ever-
   increasing, but often these resources (in particular, bandwidth)
   cannot be fully utilised due to protocol constraints both on the end-
   systems and within the network. If these resources could be used
   concurrently, end user experience could be greatly improved.
   Such enhancements would also reduce the necessary expenditure on
   network infrastructure that would otherwise be needed to create an
   equivalent improvement in user experience. By the application of
   resource pooling [3], these available resources can be 'pooled' such
   that they appear as a single logical resource to the user.

   Multipath transport aims to realize some of the goals of resource
   pooling by simultaneously making use of multiple disjoint (or
   partially disjoint) paths across a network. The two key benefits of
   multipath transport are:

   o  To increase the resilience of the connectivity by providing
      multiple paths, protecting end hosts from the failure of one.

   o  To increase the efficiency of the resource usage, and thus
      increase the network capacity available to end hosts.

   Multipath TCP is a modified version of TCP [1] that implements a
   multipath transport and achieves these goals by pooling multiple
   paths within a transport connection, transparently to the
   application. Multipath TCP is primarily concerned with utilising
   multiple paths end-to-end, where one or both end hosts are
   multi-homed. It may also have applications where multiple paths
   exist within the network and can be manipulated by an end host, such
   as using different port numbers with ECMP [4].

   MPTCP, defined in [5], is a specific protocol that instantiates the
   Multipath TCP concept. This document looks both at general
   architectural principles for a Multipath TCP fulfilling the goals
   described in Section 2, as well as the key design decisions behind
   MPTCP, which are detailed in Section 5.

   Although multihoming and multipath functions are not new to transport
   protocols (SCTP [6] being a notable example), MPTCP aims to gain
   wide-scale deployment by recognising the importance of application
   and network compatibility goals.
   These goals, discussed in detail in Section 2, relate to the
   appearance of MPTCP to the network (so non-MPTCP-aware entities see
   it as TCP) and to the application (through providing a service
   equivalent to TCP for non-MPTCP-aware applications).

   This document has three key purposes: (i) it describes goals for a
   multipath transport - goals that MPTCP is designed to meet; (ii) it
   lays out an architectural basis for MPTCP's design - a discussion
   that applies to other multipath transports as well; and (iii) it
   discusses and documents high-level design decisions made in MPTCP's
   development, and considers their implications.

   Companion documents to this architectural overview are those which
   provide details of the protocol extensions [5], congestion control
   algorithms [7], and application-level considerations [8]. Put
   together, these components specify a complete Multipath TCP design.
   We note that specific components are replaceable in accordance with
   the layer and functional decompositions discussed in this document.

1.1.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [2].

1.2.  Terminology

   Regular/Single-Path TCP: The standard version of the TCP [1]
      protocol in use today, operating between a single pair of IP
      addresses.

   Multipath TCP: A modified version of the TCP protocol that supports
      the simultaneous use of multiple paths between hosts.

   Path: A sequence of links between a sender and a receiver, defined
      in this context by a source and destination address pair.

   Host: An end host either initiating or terminating a Multipath TCP
      connection.

   MPTCP: The proposed protocol extensions specified in [5] to provide
      a Multipath TCP implementation.

   Subflow: A flow of TCP segments operating over an individual path,
      which forms part of a larger Multipath TCP connection.

   (Multipath TCP) Connection: A set of one or more subflows combined
      to provide a single Multipath TCP service to an application at a
      host.

1.3.  Reference Scenario

   The diagram shown in Figure 1 illustrates a typical usage scenario
   for Multipath TCP. Two hosts, A and B, are communicating with each
   other. These hosts are multi-homed and multi-addressed, providing
   two disjoint connections to the Internet. The addresses on each host
   are referred to as A1, A2, B1 and B2. There are therefore up to four
   different paths between the two hosts: A1-B1, A1-B2, A2-B1, A2-B2.

            +------+           __________           +------+
            |      |A1 ______ (          ) ______ B1|      |
            | Host |--/      (            )      \--| Host |
            |      |        (   Internet   )        |      |
            |  A   |--\______(            )______/--|  B   |
            |      |A2        (__________)        B2|      |
            +------+                                +------+

              Figure 1: Simple Multipath TCP Usage Scenario

   The scenario could have any number of addresses (1 or more) on each
   host, as long as the number of paths available between the two hosts
   is 2 or more (i.e. num_addr(A) * num_addr(B) > 1). The paths created
   by these address combinations through the Internet need not be
   entirely disjoint - potential fairness issues introduced by shared
   bottlenecks need to be handled by the Multipath TCP congestion
   controller. Furthermore, the paths through the Internet often do not
   provide a pure end-to-end service, and instead may be affected by
   middleboxes such as NATs and Firewalls.

2.  Goals

   This section outlines primary goals that Multipath TCP aims to meet.
   These are broadly broken down into: functional goals, which steer
   services and features that Multipath TCP must provide; and
   compatibility goals, which determine how Multipath TCP should appear
   to entities that interact with it.

2.1.  Functional Goals

   In supporting the use of multiple paths, Multipath TCP has the
   following two functional goals.

   o  Improve Throughput: Multipath TCP MUST support the concurrent use
      of multiple paths. To meet the minimum performance incentives for
      deployment, a Multipath TCP connection over multiple paths SHOULD
      achieve no lesser throughput than a single TCP connection over the
      best constituent path.

   o  Improve Resilience: Multipath TCP MUST support the use of multiple
      paths interchangeably for resilience purposes, by permitting
      segments to be sent and re-sent on any available path. It follows
      that, in the worst case, the protocol MUST be no less resilient
      than regular single-path TCP.

   As distribution of traffic among available paths and responses to
   congestion are done in accordance with resource pooling principles
   [3], a secondary effect of meeting these goals is that widespread use
   of Multipath TCP over the Internet should improve overall network
   utility by shifting load away from congested bottlenecks and by
   taking advantage of spare capacity wherever possible.

   Furthermore, Multipath TCP SHOULD feature automatic negotiation of
   its use. A host supporting Multipath TCP that requires the other
   host to do so too must be able to detect reliably whether that other
   host does in fact support the required extensions, using them if so,
   and otherwise automatically falling back to single-path TCP.

2.2.  Compatibility Goals

   In addition to the functional goals listed above, a Multipath TCP
   must meet a number of compatibility goals in order to support
   deployment in today's Internet. These goals fall into the following
   categories:

2.2.1.  Application Compatibility

   Application compatibility refers to the appearance of Multipath TCP
   to the application both in terms of the API that can be used and the
   expected service model that is provided.

   Multipath TCP MUST follow the same service model as TCP [1]:
   in-order, reliable, and byte-oriented delivery. Furthermore, a
   Multipath TCP connection SHOULD provide the application with no worse
   throughput or resilience than it would expect from running a single
   TCP connection over any one of its available paths. A Multipath TCP
   may not, however, be able to provide the same level of consistency of
   throughput and latency as a single TCP connection. These, and other,
   application considerations are discussed in detail in [8].

   A multipath-capable equivalent of TCP MUST retain some level of
   backward compatibility with existing TCP APIs, so that existing
   applications can use the newer transport merely by upgrading the
   operating systems of the end-hosts. This does not preclude the use
   of an advanced API to permit multipath-aware applications to specify
   preferences, nor for users to configure their systems in a different
   way from the default, for example switching on or off the automatic
   use of multipath extensions.

   It is possible for regular TCP sessions today to survive brief breaks
   in connectivity by retaining state at end hosts before a timeout
   occurs. It would be desirable to support similar session continuity
   in MPTCP; however, the circumstances could be different. Whilst in
   regular TCP the IP addresses will remain constant across the break in
   connectivity, in MPTCP a different interface may appear. It is
   desirable (but not mandated) to support this kind of
   "break-before-make" session continuity. This places constraints on
   security mechanisms, however, as discussed in Section 5.8. Timeouts
   for this function would be locally configured.

2.2.2.  Network Compatibility

   In the traditional Internet architecture, network devices operate at
   the network layer and lower layers, with the layers above the network
   layer instantiated only at the end-hosts.
   While this architecture, shown in Figure 2, was initially largely
   adhered to, this layering no longer reflects the "ground truth" in
   the Internet with the proliferation of middleboxes [9]. Middleboxes
   routinely interpose on the transport layer; sometimes even completely
   terminating transport connections, thus leaving the application layer
   as the first real end-to-end layer, as shown in Figure 3.

   +-------------+                                       +-------------+
   | Application |<------------ end-to-end ------------->| Application |
   +-------------+                                       +-------------+
   |  Transport  |<------------ end-to-end ------------->|  Transport  |
   +-------------+   +-------------+   +-------------+   +-------------+
   |   Network   |<->|   Network   |<->|   Network   |<->|   Network   |
   +-------------+   +-------------+   +-------------+   +-------------+
      End Host           Router            Router           End Host

              Figure 2: Traditional Internet Architecture

   +-------------+                                       +-------------+
   | Application |<------------ end-to-end ------------->| Application |
   +-------------+                     +-------------+   +-------------+
   |  Transport  |<------------------->|  Transport  |<->|  Transport  |
   +-------------+   +-------------+   +-------------+   +-------------+
   |   Network   |<->|   Network   |<->|   Network   |<->|   Network   |
   +-------------+   +-------------+   +-------------+   +-------------+
                                          Firewall,
      End Host           Router         NAT, or Proxy       End Host

                      Figure 3: Internet Reality

   Middleboxes that interpose on the transport layer result in loss of
   "fate-sharing" [10], that is, they often hold "hard" state that, when
   lost or corrupted, results in loss or corruption of the end-to-end
   transport connection.

   The network compatibility goal requires that the multipath extension
   to TCP retains compatibility with the Internet as it exists today,
   including making reasonable efforts to be able to traverse
   predominant middleboxes such as firewalls, NATs, and performance
   enhancing proxies [9].
   This requirement comes from recognizing middleboxes as a significant
   deployment bottleneck for any transport that is not TCP or UDP, and
   constrains Multipath TCP to appear as TCP does on the wire and to use
   established TCP extensions where necessary. To ensure end-to-endness
   of the transport, we further require Multipath TCP to preserve
   fate-sharing without making any assumptions about middlebox behavior.

   A detailed analysis of middlebox behaviour and the impact on the
   Multipath TCP architecture is presented in Section 7. In addition,
   network compatibility must be retained to the extent that Multipath
   TCP MUST fall back to regular TCP if there are insurmountable
   incompatibilities for the multipath extension on a path.

   Middleboxes may also cause some TCP features to be available on one
   subflow but not another. Typically these will be at the subflow
   level (such as SACK [11]) and thus do not affect the connection-level
   behaviour. In the future, any proposed TCP connection-level
   extensions should consider how they can co-exist with MPTCP.

   The modifications to support Multipath TCP remain at the transport
   layer, although some knowledge of the underlying network layer is
   required. Multipath TCP SHOULD work with IPv4 and IPv6
   interchangeably, i.e. one connection may operate over both IPv4 and
   IPv6 networks.

2.2.3.  Compatibility with other network users

   As a corollary to both network and application compatibility, the
   architecture must enable new Multipath TCP flows to coexist
   gracefully with existing single-path TCP flows, competing for
   bandwidth neither unduly aggressively nor unduly timidly (unless
   low-precedence operation is specifically requested by the
   application, such as with LEDBAT).
   The use of multiple paths MUST NOT unduly harm users using
   single-path TCP at shared bottlenecks, beyond the impact that would
   occur from another single-path TCP flow. Multiple Multipath TCP
   flows on a shared bottleneck MUST share bandwidth between each other
   with similar fairness to that which occurs at a shared bottleneck
   with single-path TCP.

2.3.  Security Goals

   The extension of TCP with multipath capabilities will bring with it a
   number of new threats, analysed in detail in [12]. The security goal
   for Multipath TCP is to provide a service no less secure than
   regular, single-path TCP. This will be achieved through a
   combination of existing TCP security mechanisms (potentially modified
   to align with the Multipath TCP extensions) and of protection against
   the new multipath threats identified. The design decisions derived
   from this goal are presented in Section 5.8.

2.4.  Related Protocols

   There are several similarities between SCTP [6] and MPTCP, in that
   both can make use of multiple addresses at end hosts to give some
   multi-path capability. In SCTP, the primary use case is to support
   redundancy and mobility for multihomed hosts (i.e. a single path will
   change one of its end host addresses); the simultaneous use of
   multiple paths is not supported. Extensions are proposed to support
   simultaneous multipath transport [13], but these are yet to be
   standardised. By far the most widely used stream-based transport
   protocol is, however, TCP [1], and SCTP does not meet the network and
   application compatibility goals specified in Section 2.2. For
   network compatibility, there are issues with various middleboxes
   (especially NATs) that are unaware of SCTP and consequently end up
   blocking it. For application compatibility, applications need to
   actively choose to use SCTP, and with the deployment issues very few
   choose to do so.
   MPTCP's compatibility goals are in part based on these observations
   of SCTP's deployment issues.

3.  An Architectural Basis For Multipath TCP

   We now present one possible transport architecture that we believe
   can effectively support the goals for Multipath TCP. The new
   Internet model described here is based on ideas proposed earlier in
   Tng ("Transport next-generation") [14]. While by no means the only
   possible architecture supporting multipath transport, Tng
   incorporates many lessons learned from previous transport research
   and development practice, and offers a strong starting point from
   which to consider the extant Internet architecture and its bearing on
   the design of any new Internet transports or transport extensions.

      +------------------+
      |   Application    |
      +------------------+  ^ Application-oriented transport
      |                  |  |   functions (Semantic Layer)
      + - - Transport - -+  ----------------------------------
      |                  |  |   Network-oriented transport
      +------------------+  v functions (Flow+Endpoint Layer)
      |     Network      |
      +------------------+
        Existing Layers             Tng Decomposition

            Figure 4: Decomposition of Transport Functions

   Tng loosely splits the transport layer into "application-oriented"
   and "network-oriented" layers, as shown in Figure 4. The
   application-oriented "Semantic" layer implements functions driven
   primarily by concerns of supporting and protecting the application's
   end-to-end communication, while the network-oriented "Flow+Endpoint"
   layer implements functions such as endpoint identification (using
   port numbers) and congestion control. These network-oriented
   functions, while traditionally located in the ostensibly "end-to-end"
   Transport layer, have proven in practice to be of great concern to
   network operators and the middleboxes they deploy in the network to
   enforce network usage policies [15] [16] or optimize communication
   performance [17].
   Figure 5 shows how middleboxes interact with different layers in this
   decomposed model of the transport layer: the application-oriented
   layer operates end-to-end, while the network-oriented layer operates
   "segment-by-segment" and can be interposed upon by middleboxes.

   +-------------+                                       +-------------+
   | Application |<------------ end-to-end ------------->| Application |
   +-------------+                                       +-------------+
   |  Semantic   |<------------ end-to-end ------------->|  Semantic   |
   +-------------+   +-------------+   +-------------+   +-------------+
   |Flow+Endpoint|<->|Flow+Endpoint|<->|Flow+Endpoint|<->|Flow+Endpoint|
   +-------------+   +-------------+   +-------------+   +-------------+
   |   Network   |<->|   Network   |<->|   Network   |<->|   Network   |
   +-------------+   +-------------+   +-------------+   +-------------+
                        Firewall         Performance
      End Host           or NAT        Enhancing Proxy      End Host

            Figure 5: Middleboxes in the new Internet model

   MPTCP's architectural design follows Tng's decomposition as shown in
   Figure 6. MPTCP, which provides application compatibility through
   the preservation of TCP-like semantics of global ordering of
   application data and reliability, is an instantiation of the
   "application-oriented" Semantic layer; whereas the subflow TCP
   component, which provides network compatibility by appearing and
   behaving as a TCP flow in the network, is an instantiation of the
   "network-oriented" Flow+Endpoint layer.

   +--------------------------+    +-------------------------------+
   |       Application        |    |          Application          |
   +--------------------------+    +-------------------------------+
   |         Semantic         |    |             MPTCP             |
   |------------+-------------|    + - - - - - - - + - - - - - - - +
   | Flow+Endpt | Flow+Endpt  |    | Subflow (TCP) | Subflow (TCP) |
   +------------+-------------+    +---------------+---------------+
   |  Network   |   Network   |    |      IP       |      IP       |
   +------------+-------------+    +---------------+---------------+

      Figure 6: Relationship between Tng (left) and MPTCP (right)

   As a protocol extension to TCP, MPTCP thus explicitly acknowledges
   middleboxes in its design, and specifies a protocol that operates at
   two scales: the MPTCP component operates end-to-end, while it allows
   the TCP component to operate segment-by-segment.

4.  A Functional Decomposition of MPTCP

   The previous two sections have discussed the goals for a Multipath
   TCP design, and provided a basis for decomposing the functions of a
   transport protocol in order to better understand the form a solution
   should take. This section builds upon this analysis by presenting
   the functional components that are used within the MPTCP design.

   MPTCP makes use of (what appear to the network to be) standard TCP
   sessions, termed "subflows", to provide the underlying transport per
   path, and as such these retain the network compatibility desired.
   MPTCP-specific information is carried in a TCP-compatible manner,
   although this mechanism is separate from the actual information being
   transferred, so it could evolve in future revisions. Figure 7
   illustrates the layered architecture.

                                +-------------------------------+
                                |          Application          |
   +---------------+            +-------------------------------+
   |  Application  |            |             MPTCP             |
   +---------------+            + - - - - - - - + - - - - - - - +
   |      TCP      |            | Subflow (TCP) | Subflow (TCP) |
   +---------------+            +-------------------------------+
   |      IP       |            |      IP       |      IP       |
   +---------------+            +-------------------------------+

     Figure 7: Comparison of Standard TCP and MPTCP Protocol Stacks

   Situated below the application, the MPTCP extension in turn manages
   multiple TCP subflows below it. In order to do this, it must
   implement the following functions:

   o  Path Management: This is the function to detect and use multiple
      paths between two hosts. MPTCP uses the presence of multiple IP
      addresses at one or both of the hosts as an indicator of this.
      The path management features of the MPTCP protocol are the
      mechanisms to signal alternative addresses to hosts, and
      mechanisms to set up new subflows joined to an existing MPTCP
      connection.

   o  Packet Scheduling: This function breaks the bytestream received
      from the application into segments to be transmitted on one of the
      available subflows. The MPTCP design makes use of a data sequence
      mapping, associating segments sent on different subflows to a
      connection-level sequence numbering, thus allowing segments sent
      on different subflows to be correctly re-ordered at the receiver.
      The packet scheduler is dependent upon information about the
      availability of paths exposed by the path management component,
      and then makes use of the subflows to transmit queued segments.
      This function is also responsible for connection-level re-ordering
      on receipt of packets from the TCP subflows, according to the
      attached data sequence mappings.

   o  Subflow (single-path TCP) Interface: A subflow component takes
      segments from the packet-scheduling component and transmits them
      over the specified path, ensuring detectable delivery to the host.
      MPTCP uses TCP underneath for network compatibility; TCP ensures
      in-order, reliable delivery. TCP adds its own sequence numbers to
      the segments; these are used to detect and retransmit lost packets
      at the subflow layer. On receipt, the subflow passes its
      reassembled data to the packet scheduling component for
      connection-level reassembly; the data sequence mapping from the
      sender's packet scheduling component allows re-ordering of the
      entire bytestream.

   o  Congestion Control: This function coordinates congestion control
      across the subflows. As specified, this congestion control
      algorithm MUST ensure that an MPTCP connection does not unfairly
      take more bandwidth than a single path TCP flow would take at a
      shared bottleneck. An algorithm to support this is specified in
      [7].

   These functions fit together as follows. The Path Management looks
   after the discovery (and if necessary, initialisation) of multiple
   paths between two hosts. The Packet Scheduler then receives a stream
   of data from the application destined for the network, and undertakes
   the necessary operations on it (such as segmenting the data into
   connection-level segments, and adding a connection-level sequence
   number) before sending it on to a subflow. The subflow then adds its
   own sequence numbers and ACKs, and passes the segments to the
   network. The receiving subflow re-orders data (if necessary) and
   passes it to the packet scheduling component, which performs
   connection level re-ordering, and sends the data stream to the
   application. Finally, the congestion control component exists as
   part of the packet scheduling, in order to schedule which segments
   should be sent at what rate on which subflow.

5.  High-Level Design Decisions

   There is seemingly a wide range of choices when designing a multipath
   extension to TCP. However, the goals as discussed earlier in this
   document constrain the possible solutions, leaving relatively little
   choice in many areas. Here, we outline high-level design choices
   that draw from the architectural basis discussed earlier in
   Section 3, which the design of MPTCP [5] takes into account.

5.1.  Sequence Numbering

   MPTCP uses two levels of sequence spaces: a connection level sequence
   number, and another sequence number for each subflow. This permits
   connection-level segmentation and reassembly, and retransmission of
   the same part of connection-level sequence space on different
   subflow-level sequence space.

   The alternative approach would be to use a single connection level
   sequence number, which gets sent on multiple subflows. This has two
   problems: first, the individual subflows will appear to the network
   as TCP sessions with gaps in the sequence space; this in turn may
   upset certain middleboxes such as intrusion detection systems, or
   certain transparent proxies, and would thus go against the network
   compatibility goal. Second, the sender would not be able to
   attribute packet losses or receptions to the correct path when the
   same segment is sent on multiple paths (i.e. in the case of
   retransmissions).

   The sender must be able to tell the receiver how to reassemble the
   data, for delivery to the application. In order to achieve this, the
   receiver must determine how subflow-level data (carrying subflow
   sequence numbers) maps at the connection level. We refer to this as
   the Data Sequence Mapping. This mapping takes the form (data seq,
   subflow seq, length), i.e.
   for a given number of bytes (the length), the subflow sequence space
   beginning at the given sequence number maps to the connection-level
   sequence space (beginning at the given data seq number). This
   information could conceivably have various sources.

   One option to signal the Data Sequence Mapping would be to use
   existing fields in the TCP segment (such as subflow seqno, length)
   and only add the data sequence number to each segment, for instance
   as a TCP option. This would be vulnerable, however, to middleboxes
   that resegment or assemble data, since there is no specified
   behaviour for coalescing TCP options. If one signalled (data seqno,
   length), this would still be vulnerable to middleboxes that coalesce
   segments and do not understand MPTCP signalling, and so do not
   correctly rewrite the options.

   Because of these potential issues, the design decision taken in the
   MPTCP protocol is that whenever a mapping for subflow data needs to
   be conveyed to the other host, all three pieces of data (data seq,
   subflow seq, length) must be sent. To reduce the overhead, it would
   be permissible for the mapping to be sent periodically and cover
   more than a single segment. Further experimentation is required to
   determine what tradeoffs exist regarding the frequency at which
   mappings should be sent. The mapping could also be omitted entirely
   while a connection uses only a single subflow, when the data-level
   and subflow-level sequence spaces are the same.

5.2. Reliability and Retransmissions

   MPTCP features acknowledgements at connection-level as well as
   subflow-level acknowledgements, in order to provide a robust service
   to the application.

   Under normal behaviour, MPTCP can use the data sequence mapping and
   subflow ACKs to decide when a connection-level segment was received.
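
   As a rough illustration of how a (data seq, subflow seq, length)
   mapping and a cumulative subflow ACK can be combined, the following
   Python sketch translates between the two sequence spaces. The names
   DataSequenceMapping, to_data_seq and acked_data_range are
   hypothetical and are not taken from the protocol specification [5];
   sequence-number wraparound and the wire encoding are deliberately
   ignored.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class DataSequenceMapping:
    """Hypothetical in-memory form of a (data seq, subflow seq, length) triple."""
    data_seq: int     # first connection-level sequence number covered
    subflow_seq: int  # first subflow-level sequence number covered
    length: int       # number of bytes covered by this mapping

    def to_data_seq(self, subflow_seq: int) -> Optional[int]:
        """Map one subflow sequence number to the connection level.

        Returns None when the byte lies outside this mapping.
        """
        offset = subflow_seq - self.subflow_seq
        if 0 <= offset < self.length:
            return self.data_seq + offset
        return None


def acked_data_range(mapping: DataSequenceMapping,
                     subflow_ack: int) -> Optional[Tuple[int, int]]:
    """Connection-level range [start, end) covered by a cumulative subflow ACK.

    subflow_ack is assumed to be the next subflow sequence number the
    receiver expects, as in regular TCP. Wraparound is not handled.
    """
    acked = min(max(subflow_ack - mapping.subflow_seq, 0), mapping.length)
    if acked == 0:
        return None
    return (mapping.data_seq, mapping.data_seq + acked)
```

   With a mapping of (data seq 1000, subflow seq 1, length 100), a
   subflow ACK of 51 corresponds to connection-level bytes 1000 to 1049
   in this sketch.
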
   The transmission of TCP ACKs for a subflow is handled entirely at
   the subflow level, in order to maintain TCP semantics and trigger
   subflow-level retransmissions. This has certain implications on
   end-to-end semantics. Firstly, it means that once a segment is ACKed
   at the subflow level it cannot be discarded in the re-order buffer
   at the connection level. Secondly, unlike in standard TCP, a
   receiver cannot simply drop out-of-order segments if needed (for
   instance, due to memory pressure). Under certain circumstances,
   therefore, it may be desirable to drop segments after
   acknowledgement on the subflow but before delivery to the
   application, and this can be facilitated by a connection-level
   acknowledgement.

   Furthermore, it is possible to conceive of some cases where
   connection-level acknowledgements could improve robustness. Consider
   a subflow traversing a transparent proxy: if the proxy ACKs a
   segment and then crashes, the sender will not retransmit the lost
   segment on another subflow, as it thinks the segment has been
   received. The connection grinds to a halt despite having other
   working subflows, and the sender would be unable to determine the
   cause of the problem. An example situation where this may occur
   would be mobility between wireless access points, each of which
   operates a transport-level proxy. Finally, as an optimisation, it
   may be feasible for a connection-level acknowledgement to be
   transmitted over the shortest Round-Trip Time (RTT) path,
   potentially reducing send buffer requirements (see Section 5.3).

   Therefore, to provide a fully robust multipath TCP solution given
   the above constraints, MPTCP for use on the public Internet MUST
   feature explicit connection-level acknowledgements, in addition to
   subflow-level acknowledgements.
   A connection-level acknowledgement would only be required in order
   to signal when the receive window moves forward; the heuristics for
   using such a signal are discussed in more detail in the protocol
   specification [5].

   Regarding retransmissions, it MUST be possible for a segment to be
   retransmitted on a different subflow to that on which it was
   originally sent. This is one of MPTCP's core goals, in order to
   maintain integrity during temporary or permanent subflow failure,
   and this is enabled by the dual sequence number space.

   The scheduling of retransmissions will have significant impact on
   MPTCP user experience. The current MPTCP specification suggests that
   data outstanding on subflows that have timed out should be
   rescheduled for transmission on different subflows. This behaviour
   aims to minimize disruption when a path breaks, and uses the first
   timeout as an indicator. More conservative versions would be to use
   second or third timeouts for the same segment.

   Typically, fast retransmit on an individual subflow will not trigger
   retransmission on another subflow, although this may still be
   desirable in certain cases, for instance to reduce the receive
   buffer requirements. However, in all cases with retransmissions on
   different subflows, the lost segments SHOULD still be sent on the
   path that lost them. This is currently believed to be necessary to
   maintain subflow integrity, as per the network compatibility goal.
   By doing this, some efficiency is lost, and it is unclear at this
   point what the optimal retransmit strategy is.

   Large-scale experiments are therefore required in order to determine
   the most appropriate retransmission strategy, and recommendations
   will be refined once more information is available.

5.3. Buffers

   To ensure in-order delivery, MPTCP must use a connection-level
   receive buffer, where segments are placed until they are in order
   and can be read by the application.

   In regular, single-path TCP, it is usually recommended to set the
   receive buffer to 2*BDP (Bandwidth-Delay Product, i.e. BDP = BW*RTT,
   where BW = Bandwidth and RTT = Round-Trip Time). One BDP allows
   supporting reordering of segments by the network. The other BDP
   allows the connection to continue during fast retransmit: when a
   segment is fast retransmitted, the receiver must be able to store
   incoming data during one more RTT.

   For MPTCP, the story is a bit more complicated. The ultimate goal is
   that a subflow packet loss or subflow failure should not affect the
   throughput of other working subflows; the receiver should have
   enough buffering to store all data until the missing segment is
   retransmitted and reaches the destination.

   The worst case scenario would be when the subflow with the highest
   RTT/RTO (Round-Trip Time or Retransmission TimeOut) experiences a
   timeout; in that case the receiver has to buffer data from all
   subflows for the duration of the RTO. Thus, the smallest
   connection-level receive buffer that would be needed to avoid
   stalling with subflow failures is sum(BW_i)*RTO_max, where BW_i =
   Bandwidth for each subflow and RTO_max is the largest RTO across all
   subflows.

   This is an order of magnitude more than the receive buffer required
   for a single connection, and is probably too expensive for practical
   purposes. A more sensible requirement is to avoid stalls in the
   absence of timeouts. Therefore, the RECOMMENDED receive buffer is
   2*sum(BW_i)*RTT_max, where RTT_max is the largest RTT across all
   subflows. This buffer sizing ensures subflows do not stall when fast
   retransmit is triggered on any subflow.
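
   The sizing formulas above can be evaluated with a few lines of
   Python. This is only a sketch: the function names, the Path tuple
   and its units (bits per second, milliseconds) are assumptions made
   for illustration, and it presumes that per-subflow bandwidth and RTT
   figures are already known, whereas a real stack would have to
   estimate them.

```python
from typing import Iterable, Sequence, Tuple

# Hypothetical path description: (bandwidth in bits per second, RTT in ms).
Path = Tuple[int, int]


def buffer_bytes(bandwidths_bps: Iterable[int], delay_ms: int,
                 factor: int = 1) -> int:
    """Return factor * sum(BW_i) * delay, converted from bits to whole bytes.

    With factor=1 and delay_ms set to RTO_max this gives the
    sum(BW_i)*RTO_max bound from the text.
    """
    total_bits = factor * sum(bandwidths_bps) * delay_ms // 1000
    return total_bits // 8


def recommended_buffer_bytes(paths: Sequence[Path]) -> int:
    """RECOMMENDED size from the text: 2 * sum(BW_i) * RTT_max."""
    rtt_max_ms = max(rtt_ms for _, rtt_ms in paths)
    return buffer_bytes((bw for bw, _ in paths), rtt_max_ms, factor=2)
```

   For two hypothetical paths of 100 Mb/s at 10 ms and 1 Mb/s at
   1000 ms, recommended_buffer_bytes gives 25,250,000 bytes (a single
   sum(BW_i)*RTT_max term is 12,625,000 bytes), whereas the fast path
   on its own needs only 250,000 bytes.
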

   The resulting buffer size should be small enough for practical use.
   However, there may be extreme cases where fast, high throughput
   paths (e.g. 100Mb/s, 10ms RTT) are used in conjunction with slow
   paths (e.g. 1Mb/s, 1000ms RTT). In that case the required receive
   buffer would be about 12.5MB for a single sum(BW_i)*RTT_max term
   (and roughly twice that with the recommended factor of two), which
   is likely too big. In extreme cases such as this example, it may be
   prudent to only use some of the fastest available paths for the
   MPTCP connection, potentially using the slow path(s) for backup
   only.

   Send Buffer: The RECOMMENDED send buffer is the same size as the
   recommended receive buffer, i.e. 2*sum(BW_i)*RTT_max. This is
   because the sender must store locally the segments sent but
   unacknowledged by the connection-level ACK. The send buffer size
   matters particularly for hosts that maintain a large number of
   ongoing connections. If the required send buffer is too large, a
   host can choose to only send data on the fast subflows, using the
   slow subflows only in cases of failure.

5.4. Signalling

   Since MPTCP uses TCP as its subflow transport mechanism, an MPTCP
   connection will also begin as a single TCP connection. Nevertheless,
   it must signal to the peer that it supports MPTCP and wishes to use
   it on this connection. As such, a TCP Option will be used to
   transmit this information, since this is the established mechanism
   for indicating additional functionality on a TCP session.

   In addition, further signalling is required during the operation of
   an MPTCP session, such as that for reassembly for multiple subflows,
   and for informing the other host about potential other available
   addresses.

   The MPTCP protocol design will also use TCP Options for this
   additional signalling. This has been chosen as the mechanism most
   fitting in with the goals as specified in Section 2.
   With this mechanism, the signalling required to operate MPTCP is
   transported separately from the data, allowing it to be created and
   processed separately from the data stream, and retaining
   architectural compatibility with network entities.

   This decision is the consensus of the Working Group (following
   detailed discussions at IETF78), and the main reasons for this are
   as follows:

   o  TCP options are the traditional signalling method for TCP;

   o  A TCP option on a SYN is the most compatible way for an end host
      to signal it is MPTCP-capable;

   o  If connection-level ACKs are signalled in the payload then they
      may suffer from packet loss and may be congestion-controlled,
      which may affect the data throughput in the forward direction and
      could lead to head-of-line blocking;

   o  Middleboxes, such as NAT traversal helpers, can easily parse TCP
      options, e.g., to rewrite addresses.

   On the other hand, the main drawbacks of TCP options compared to TLV
   encoding in the payload are:

   o  There is limited space for signalling messages;

   o  A middlebox may, potentially, drop a packet with an unknown
      option;

   o  The transport of control information in options is not
      necessarily reliable.

   The detailed design of MPTCP alleviates these issues as far as
   possible by carefully considering the size of MPTCP options, and
   seamlessly falling back to regular TCP on the loss of control data.

   Both option and payload encoding may interfere with offloading of
   TCP processing to high speed network interface cards, such as
   segmentation, checksumming, and reassembly. For network cards
   supporting MPTCP, signalling in TCP options should simplify
   offloading due to the separate handling of MPTCP signalling and
   data.

5.5. Path Management

   Currently, the network does not expose path diversity between pairs
   of IP addresses.
   In order to achieve path diversity from today's IP networks, in the
   typical case MPTCP uses multiple addresses at one or both hosts to
   infer different paths across the network. It is expected that these
   paths, whilst not necessarily entirely non-overlapping, will be
   sufficiently disjoint to allow multipath to achieve improved
   throughput and robustness. The use of multiple IP addresses is a
   simple mechanism that requires no additional features in the
   network.

   Multiple different (source, destination) address pairs will thus be
   used as path selectors in most cases. Each path will be identified
   by a standard five-tuple (i.e. source address, destination address,
   source port, destination port, protocol), however, which can allow
   the extension of MPTCP to use ports as well as addresses as path
   selectors. This will allow hosts to use port-based load balancing
   with MPTCP, for example if the network routes different ports over
   different paths (which may be the case with technologies such as
   Equal Cost MultiPath (ECMP) routing [4]). It should be noted,
   however, that ISPs often undertake traffic engineering in order to
   optimise resource utilisation within their networks, and care should
   be taken (by both ISPs and developers) that MPTCP using broadly
   similar paths does not adversely interfere with this.

   For increased chance of successfully setting up additional subflows
   (such as when one end is behind a firewall, NAT, or other
   restrictive middlebox), either host SHOULD be able to add new
   subflows to an MPTCP connection. MPTCP MUST be able to handle paths
   that appear and disappear during the lifetime of a connection (for
   example, through the activation of an additional network interface).

   The path management is a separate function from the packet
   scheduling, subflow interface, and congestion control functions of
   MPTCP, as documented in Section 4.
   As such it would be feasible to replace this IP-address-based design
   with an alternative path selection mechanism in the future, with no
   significant changes to the other functional components.

5.6. Connection Identification

   Since an MPTCP connection may not be bound to a traditional 5-tuple
   (source address and port, destination address and port, protocol
   number) for the entirety of its existence, it is desirable to
   provide a new mechanism for connection identification. This will be
   useful for MPTCP-aware applications, and for the MPTCP
   implementation (and MPTCP-aware middleboxes) to have a unique
   identifier with which to associate the multiple subflows.

   Therefore, each MPTCP connection requires a connection identifier at
   each host, which is locally unique within that host. In many ways,
   this is analogous to an ephemeral port number in regular TCP. The
   manifestation and purpose of such an identifier are out of the scope
   of this architecture document.

   Legacy applications will not, however, have access to this
   identifier and in such cases an MPTCP connection will be identified
   by the 5-tuple of the first TCP subflow. It is out of the scope of
   this document, however, to define the behaviour of the MPTCP
   implementation if the first TCP subflow later fails. If there are
   MPTCP-unaware applications that make assumptions about continued
   existence of the initial address pair, their behaviour could be
   disrupted by carrying on regardless. It is expected that this is a
   very small, possibly negligible, set of applications, however. MPTCP
   MUST NOT be used for applications that request to bind to a specific
   address or interface, since such applications are making a
   deliberate choice of path in use.
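
   One way to picture the relationship between a locally unique
   connection identifier and the subflow 5-tuples is the Python sketch
   below. Since the form of the identifier is out of scope for this
   document, everything here is an assumption made for illustration:
   the ConnectionTable and FiveTuple names, the use of a simple counter
   as the identifier, and the method names are all hypothetical.

```python
import itertools
from typing import Dict, List, NamedTuple, Optional


class FiveTuple(NamedTuple):
    src_addr: str
    src_port: int
    dst_addr: str
    dst_port: int
    protocol: int = 6  # IP protocol number for TCP


class ConnectionTable:
    """Hypothetical per-host table associating subflows with connections."""

    def __init__(self) -> None:
        self._ids = itertools.count(1)  # assumed source of locally unique IDs
        self._subflows: Dict[int, List[FiveTuple]] = {}

    def open(self, first_subflow: FiveTuple) -> int:
        """Create a connection from its first subflow; return its local ID."""
        conn_id = next(self._ids)
        self._subflows[conn_id] = [first_subflow]
        return conn_id

    def add_subflow(self, conn_id: int, subflow: FiveTuple) -> None:
        """Associate a further subflow with an existing connection."""
        self._subflows[conn_id].append(subflow)

    def lookup(self, subflow: FiveTuple) -> Optional[int]:
        """Find the connection that a given subflow 5-tuple belongs to."""
        for conn_id, flows in self._subflows.items():
            if subflow in flows:
                return conn_id
        return None

    def legacy_identity(self, conn_id: int) -> FiveTuple:
        """5-tuple of the first subflow, as seen by a legacy application."""
        return self._subflows[conn_id][0]
```

   In this sketch a legacy application continues to see the 5-tuple of
   the first subflow, while the implementation uses the local
   identifier to group every subflow that is added later.
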

   Since the requirements of applications are not clear at this stage,
   however, it is as yet unconfirmed whether carrying on in the event
   of the loss of the initial address pair would be a damaging
   assumption to make. This behaviour will be an
   implementation-specific solution, and as such it is expected to be
   chosen by implementors once more research has been undertaken to
   determine its impact.

5.7. Congestion Control

   As discussed in the network-layer compatibility requirements
   (Section 2.2.3), there are three goals for the congestion control
   algorithms used by an MPTCP implementation: improve throughput (at
   least as well as a single-path TCP connection would perform); do no
   harm to other network users (do not take up more capacity on any one
   path than if it were a single path flow using only that route - this
   is particularly relevant for shared bottlenecks); and balance
   congestion by moving traffic away from the most congested paths. To
   achieve these goals, the congestion control algorithms on each
   subflow must be coupled in some way. A proposal for a suitable
   congestion control algorithm is given in [7].

5.8. Security

   A detailed threat analysis for Multipath TCP is presented in a
   separate document [12]. This focuses on flooding attacks and
   hijacking attacks that can be launched against a Multipath TCP
   connection.

   The basic security goal of Multipath TCP, as introduced in
   Section 2.3, can be stated as: "provide a solution that is no worse
   than standard TCP".

   From the threat analysis, and with this goal in mind, three key
   security requirements can be identified. A multi-addressed Multipath
   TCP SHOULD be able to:

   o  Provide a mechanism to confirm that the parties in a subflow
      handshake are the same as in the original connection setup (e.g.
      require use of a key exchanged in the initial handshake in the
      subflow handshake, to limit the scope for hijacking attacks).

   o  Provide verification that the peer can receive traffic at a new
      address before adding it (i.e. verify that the address belongs to
      the other host, to prevent flooding attacks).

   o  Provide replay protection, i.e. ensure that a request to
      add/remove a subflow is 'fresh'.

   Additional mechanisms have been deployed as part of standard TCP
   stacks to provide resistance to Denial-of-Service attacks. For
   example, there are various mechanisms to protect against TCP reset
   attacks [18], and Multipath TCP should continue to support similar
   protection. In addition, TCP SYN Cookies [19] were developed to
   allow a TCP server to defer the creation of session state in the
   SYN_RCVD state, and remain stateless until the ESTABLISHED state had
   been reached. Multipath TCP should, ideally, continue to provide
   such functionality and, at a minimum, avoid significant
   computational burden prior to reaching the ESTABLISHED state (of the
   Multipath TCP connection as a whole).

   It should be noted that aspects of the Multipath TCP design space
   place constraints on the security solution:

   o  The use of TCP options significantly limits the amount of
      information that can be carried in the handshake.

   o  The need to work through middleboxes results in the need to
      handle mutability of packets.

   o  The desire to support a 'break-before-make' (as well as a
      'make-before-break') approach to adding subflows (within a
      limited time period) implies that a host cannot rely on using a
      pre-existing subflow to support the addition of a new one.

   The MPTCP protocol will be designed with these security requirements
   in mind, and the protocol specification [5] will document how these
   are met.

6. Software Interactions

6.1. Interactions with Applications

   In the case of applications that have used an existing API call to
   bind to a specific address or interface, the MPTCP extension MUST
   NOT be used. This is because the applications are indicating a clear
   choice of path to use and thus will have expectations of behaviour
   that must be maintained, in order to adhere to the application
   compatibility goals.

   Interactions with applications are presented in [8], including, but
   not limited to, performance changes that may be expected, semantic
   changes, and new features that may be requested through an enhanced
   API.

   TCP features the ability to send "Urgent" data, the delivery of
   which to the application may or may not be out-of-band. The use of
   this feature is not recommended due to security implications and
   implementation differences [20]. MPTCP requires contiguous data to
   support its Data Sequence Mapping over multiple segments, and
   therefore the Urgent pointer cannot interrupt an existing mapping.
   An MPTCP implementation MAY choose to support sending Urgent data,
   and if it does, it SHOULD send the Urgent data on the soonest
   available unassigned subflow sequence space. Incoming Urgent data
   SHOULD be mapped to connection-level sequence space and delivered to
   the application analogous to Urgent data in regular TCP.

6.2. Interactions with Management Systems

   To enable interactions between TCP and network management systems,
   the TCP [21] and TCP Extended Statistics (ESTATS) [22] MIBs have
   been defined. MPTCP should share these MIBs for aspects that are
   designed to be transparent to the application.

   It is anticipated that an MPTCP MIB will be defined in the future,
   once experience of experimental MPTCP deployments is gathered.
   This MIB would provide access to MPTCP-specific properties such as
   whether MPTCP is enabled, and the number and properties of the
   individual paths in use.

7. Interactions with Middleboxes

   As discussed in Section 2.2, it is a goal of MPTCP to be deployable
   today and thus compatible with the majority of middleboxes. This
   section summarises the issues that may arise with NATs, firewalls,
   proxies, intrusion detection systems, and other middleboxes that, if
   not considered in the protocol design, may hinder its deployment.

   This section is intended primarily as a description of options and
   considerations only. Protocol-specific solutions to these issues
   will be given in the companion documents.

   Multipath TCP will be deployed in a network that no longer provides
   just basic datagram delivery. A myriad of middleboxes are deployed
   to optimize various perceived problems with the Internet protocols:
   NATs primarily address IP address space shortage [15], Performance
   Enhancing Proxies (PEPs) optimize TCP for different link
   characteristics [17], firewalls [16] and intrusion detection systems
   try to block malicious content from reaching a host, and traffic
   normalizers [23] ensure a consistent view of the traffic stream to
   Intrusion Detection Systems (IDS) and hosts.

   All these middleboxes optimize current applications at the expense
   of future applications. In effect, future applications will often
   need to behave in a similar fashion to existing ones, in order to
   increase the chances of successful deployment. Further, the precise
   behaviour of all these middleboxes is not clearly specified, and
   implementation errors make matters worse, raising the bar for the
   deployment of new technologies.

   The following list of middlebox classes documents behaviour that
   could impact the use of MPTCP.
   This list is used in [5] to describe the features of the MPTCP
   protocol that are used to mitigate the impact of these middlebox
   behaviours.

   o  NATs: Network Address Translators decouple the host's local IP
      address (and, in the case of NAPTs, port) from that which is seen
      in the wider Internet when the packets are transmitted through a
      NAT. This adds complexity, and reduces the chances of success,
      when signalling IP addresses.

   o  PEPs: Performance Enhancing Proxies, which aim to improve the
      performance of protocols over low-performance (e.g. high latency
      or high error rate) links. As such, they may "split" a TCP
      connection and behaviour such as proactive ACKing may occur, and
      therefore it is no longer guaranteed that one host is
      communicating directly with another. PEPs, firewalls or other
      middleboxes may also change the declared receive window size.

   o  Traffic Normalizers: These aim to eliminate ambiguities and
      potential attacks at the network level, and amongst other things
      are unlikely to permit holes in TCP-level sequence space (which
      has impact on MPTCP's retransmission and subflow sequence
      numbering design choices).

   o  Firewalls: on top of preventing incoming connections, firewalls
      may also attempt additional protection such as sequence number
      randomization (so a sender cannot reliably know what TCP sequence
      number the receiver will see).

   o  Intrusion Detection Systems: IDSs may look for traffic patterns
      to protect a network, and may have false positives with MPTCP and
      drop the connections during normal operation. Future MPTCP-aware
      middleboxes will require the ability to correlate the various
      paths in use.

   o  Content-aware Firewalls: Some middleboxes may actively change
      data in packets, such as re-writing URIs in HTTP traffic.

   In addition, all classes of middleboxes may affect TCP traffic in
   the following ways:

   o  TCP Options: some middleboxes may drop packets with unknown TCP
      options, or strip those options from the packets.

   o  Segmentation and Coalescing: middleboxes (or even something as
      close to the end host as TCP Segmentation Offloading (TSO) on a
      Network Interface Card (NIC)) may change the packet boundaries
      from those which the sender intended. They may do this by
      splitting packets, or coalescing them together. This leads to two
      major impacts: we cannot guarantee where a packet boundary will
      be, and we cannot say for sure what a middlebox will do with TCP
      options in these cases (they may be repeated, dropped, or sent
      only once).

8. Contributors

   The authors would like to acknowledge the contributions of Andrew
   McDonald and Bryan Ford to this document.

   The authors would also like to thank the following people for
   detailed reviews: Olivier Bonaventure, Gorry Fairhurst, Iljitsch van
   Beijnum, Philip Eardley, Michael Scharf, Lars Eggert, Cullen
   Jennings, Joel Halpern, Juergen Quittek, Alexey Melnikov, David
   Harrington, Jari Arkko and Stewart Bryant.

9. Acknowledgements

   Alan Ford, Costin Raiciu, Mark Handley, and Sebastien Barre are
   supported by Trilogy (http://www.trilogy-project.org), a research
   project (ICT-216372) partially funded by the European Community
   under its Seventh Framework Program. The views expressed here are
   those of the author(s) only. The European Commission is not liable
   for any use that may be made of the information in this document.

10. IANA Considerations

   None.

11. Security Considerations

   This informational document provides an architectural overview for
   Multipath TCP and so does not, in itself, raise any security issues.
   A separate threat analysis [12] lists threats that can exist with a
   Multipath TCP. However, a protocol based on the architecture in this
   document will have a number of security requirements. The high level
   goals for such a protocol are identified in Section 2.3, whilst
   Section 5.8 provides more detailed discussion of security
   requirements and design decisions which are applied in the MPTCP
   protocol design [5].

12. References

12.1. Normative References

   [1]   Postel, J., "Transmission Control Protocol", STD 7, RFC 793,
         September 1981.

   [2]   Bradner, S., "Key words for use in RFCs to Indicate
         Requirement Levels", BCP 14, RFC 2119, March 1997.

12.2. Informative References

   [3]   Wischik, D., Handley, M., and M. Bagnulo Braun, "The Resource
         Pooling Principle", ACM SIGCOMM CCR vol. 38 num. 5, pp. 47-52,
         October 2008.

   [4]   Hopps, C., "Analysis of an Equal-Cost Multi-Path Algorithm",
         RFC 2992, November 2000.

   [5]   Ford, A., Raiciu, C., and M. Handley, "TCP Extensions for
         Multipath Operation with Multiple Addresses",
         draft-ietf-mptcp-multiaddressed-02 (work in progress),
         October 2010.

   [6]   Stewart, R., "Stream Control Transmission Protocol", RFC 4960,
         September 2007.

   [7]   Raiciu, C., Handley, M., and D. Wischik, "Coupled Congestion
         Control for Multipath Transport Protocols",
         draft-ietf-mptcp-congestion-01 (work in progress),
         January 2011.

   [8]   Scharf, M. and A. Ford, "MPTCP Application Interface
         Considerations", draft-ietf-mptcp-api-00 (work in progress),
         November 2010.

   [9]   Carpenter, B. and S. Brim, "Middleboxes: Taxonomy and Issues",
         RFC 3234, February 2002.

   [10]  Carpenter, B., "Internet Transparency", RFC 2775,
         February 2000.

   [11]  Mathis, M., Mahdavi, J., Floyd, S., and A. Romanow, "TCP
         Selective Acknowledgment Options", RFC 2018, October 1996.

   [12]  Bagnulo, M., "Threat Analysis for TCP Extensions for
         Multi-path Operation with Multiple Addresses",
         draft-ietf-mptcp-threat-07 (work in progress), January 2011.

   [13]  Becke, M., Dreibholz, T., Iyengar, J., Natarajan, P., and M.
         Tuexen, "Load Sharing for the Stream Control Transmission
         Protocol (SCTP)", draft-tuexen-tsvwg-sctp-multipath-01 (work
         in progress), December 2010.

   [14]  Ford, B. and J. Iyengar, "Breaking Up the Transport Logjam",
         ACM HotNets, October 2008.

   [15]  Srisuresh, P. and K. Egevang, "Traditional IP Network Address
         Translator (Traditional NAT)", RFC 3022, January 2001.

   [16]  Freed, N., "Behavior of and Requirements for Internet
         Firewalls", RFC 2979, October 2000.

   [17]  Border, J., Kojo, M., Griner, J., Montenegro, G., and Z.
         Shelby, "Performance Enhancing Proxies Intended to Mitigate
         Link-Related Degradations", RFC 3135, June 2001.

   [18]  Ramaiah, A., Stewart, R., and M. Dalal, "Improving TCP's
         Robustness to Blind In-Window Attacks", RFC 5961, August 2010.

   [19]  Eddy, W., "TCP SYN Flooding Attacks and Common Mitigations",
         RFC 4987, August 2007.

   [20]  Gont, F. and A. Yourtchenko, "On the Implementation of the TCP
         Urgent Mechanism", RFC 6093, January 2011.

   [21]  Raghunarayan, R., "Management Information Base for the
         Transmission Control Protocol (TCP)", RFC 4022, March 2005.

   [22]  Mathis, M., Heffner, J., and R. Raghunarayan, "TCP Extended
         Statistics MIB", RFC 4898, May 2007.

   [23]  Handley, M., Paxson, V., and C. Kreibich, "Network Intrusion
         Detection: Evasion, Traffic Normalization, and End-to-End
         Protocol Semantics", Usenix Security 2001, 2001.

Appendix A. Changelog

   (For removal by the RFC Editor)

A.1. Changes since draft-ietf-mptcp-architecture-04

   o  Responded to IETF Last Call and IESG review comments.

A.2. Changes since draft-ietf-mptcp-architecture-03

   o  Responded to AD review comments.

A.3. Changes since draft-ietf-mptcp-architecture-02

   o  Responded to WG last call review comments. Included editorial
      fixes, adding Section 2.4, and improving Section 5.4 and
      Section 7.

A.4. Changes since draft-ietf-mptcp-architecture-01

   o  Responded to review comments.

   o  Added security sections.

A.5. Changes since draft-ietf-mptcp-architecture-00

   o  Added middlebox compatibility discussion (Section 7).

   o  Clarified path identification (TCP 4-tuple) in Section 5.5.

   o  Added brief scenario and diagram to Section 1.3.

Authors' Addresses

   Alan Ford
   Roke Manor Research
   Old Salisbury Lane
   Romsey, Hampshire SO51 0ZN
   UK

   Phone: +44 1794 833 465
   Email: alan.ford@roke.co.uk

   Costin Raiciu
   University College London
   Gower Street
   London WC1E 6BT
   UK

   Email: c.raiciu@cs.ucl.ac.uk

   Mark Handley
   University College London
   Gower Street
   London WC1E 6BT
   UK

   Email: m.handley@cs.ucl.ac.uk

   Sebastien Barre
   Universite catholique de Louvain
   Pl. Ste Barbe, 2
   Louvain-la-Neuve 1348
   Belgium

   Phone: +32 10 47 91 03
   Email: sebastien.barre@uclouvain.be

   Janardhan Iyengar
   Franklin and Marshall College
   Mathematics and Computer Science
   PO Box 3003
   Lancaster, PA 17604-3003
   USA

   Phone: 717-358-4774
   Email: jiyengar@fandm.edu