idnits 2.17.1 

draft-ietf-mptcp-architecture-05.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (January 21, 2011) is 4837 days in the past.  Is this
     intentional?


  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  ** Obsolete normative reference: RFC  793 (ref. '1') (Obsoleted by RFC 9293)

  == Outdated reference: A later version (-12) exists of
     draft-ietf-mptcp-multiaddressed-02

  -- Obsolete informational reference (is this intentional?): RFC 4960 (ref.
     '6') (Obsoleted by RFC 9260)

  == Outdated reference: A later version (-07) exists of
     draft-ietf-mptcp-congestion-01

  == Outdated reference: A later version (-07) exists of
     draft-ietf-mptcp-api-00

  == Outdated reference: A later version (-08) exists of
     draft-ietf-mptcp-threat-07

  == Outdated reference: A later version (-27) exists of
     draft-tuexen-tsvwg-sctp-multipath-01

  -- Obsolete informational reference (is this intentional?): RFC 6093 (ref.
     '20') (Obsoleted by RFC 9293)


     Summary: 1 error (**), 0 flaws (~~), 6 warnings (==), 3 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Internet Engineering Task Force                                  A. Ford
3	Internet-Draft                                       Roke Manor Research
4	Intended status: Informational                                 C. Raiciu
5	Expires: July 25, 2011                                        M. Handley
6	                                               University College London
7	                                                                S. Barre
8	                                                Universite catholique de
9	                                                                 Louvain
10	                                                              J. Iyengar
11	                                           Franklin and Marshall College
12	                                                        January 21, 2011

14	         Architectural Guidelines for Multipath TCP Development
15	                    draft-ietf-mptcp-architecture-05

17	Abstract

19	   Hosts are often connected by multiple paths, but TCP restricts
20	   communications to a single path per transport connection.  Resource
21	   usage within the network would be more efficient were these multiple
22	   paths able to be used concurrently.  This should enhance user
23	   experience through improved resilience to network failure and higher
24	   throughput.

26	   This document outlines architectural guidelines for the development
27	   of a Multipath Transport Protocol, with references to how these
28	   architectural components come together in the development of a
29	   Multipath TCP protocol.  This document lists certain high-level
30	   design decisions that provide foundations for the design of the MPTCP
31	   protocol, based upon these architectural requirements.

33	Status of this Memo

35	   This Internet-Draft is submitted in full conformance with the
36	   provisions of BCP 78 and BCP 79.

38	   Internet-Drafts are working documents of the Internet Engineering
39	   Task Force (IETF).  Note that other groups may also distribute
40	   working documents as Internet-Drafts.  The list of current Internet-
41	   Drafts is at http://datatracker.ietf.org/drafts/current/.

43	   Internet-Drafts are draft documents valid for a maximum of six months
44	   and may be updated, replaced, or obsoleted by other documents at any
45	   time.  It is inappropriate to use Internet-Drafts as reference
46	   material or to cite them other than as "work in progress."

48	   This Internet-Draft will expire on July 25, 2011.

50	Copyright Notice

52	   Copyright (c) 2011 IETF Trust and the persons identified as the
53	   document authors.  All rights reserved.

55	   This document is subject to BCP 78 and the IETF Trust's Legal
56	   Provisions Relating to IETF Documents
57	   (http://trustee.ietf.org/license-info) in effect on the date of
58	   publication of this document.  Please review these documents
59	   carefully, as they describe your rights and restrictions with respect
60	   to this document.  Code Components extracted from this document must
61	   include Simplified BSD License text as described in Section 4.e of
62	   the Trust Legal Provisions and are provided without warranty as
63	   described in the Simplified BSD License.

65	Table of Contents

67	   1.  Introduction
68	     1.1.  Requirements Language
69	     1.2.  Terminology
70	     1.3.  Reference Scenario
71	   2.  Goals
72	     2.1.  Functional Goals
73	     2.2.  Compatibility Goals
74	       2.2.1.  Application Compatibility
75	       2.2.2.  Network Compatibility
76	       2.2.3.  Compatibility with other network users
77	     2.3.  Security Goals
78	     2.4.  Related Protocols
79	   3.  An Architectural Basis For Multipath TCP
80	   4.  A Functional Decomposition of MPTCP
81	   5.  High-Level Design Decisions
82	     5.1.  Sequence Numbering
83	     5.2.  Reliability and Retransmissions
84	     5.3.  Buffers
85	     5.4.  Signalling
86	     5.5.  Path Management
87	     5.6.  Connection Identification
88	     5.7.  Congestion Control
89	     5.8.  Security
90	   6.  Software Interactions
91	     6.1.  Interactions with Applications
92	     6.2.  Interactions with Management Systems
93	   7.  Interactions with Middleboxes
94	   8.  Contributors
95	   9.  Acknowledgements
96	   10. IANA Considerations
97	   11. Security Considerations
98	   12. References
99	     12.1. Normative References
100	     12.2. Informative References
101	   Appendix A.  Changelog
102	     A.1.  Changes since draft-ietf-mptcp-architecture-04
103	     A.2.  Changes since draft-ietf-mptcp-architecture-03
104	     A.3.  Changes since draft-ietf-mptcp-architecture-02
105	     A.4.  Changes since draft-ietf-mptcp-architecture-01
106	     A.5.  Changes since draft-ietf-mptcp-architecture-00
107	   Authors' Addresses

109	1.  Introduction

111	   As the Internet evolves, demands on Internet resources are ever-
112	   increasing, but often these resources (in particular, bandwidth)
113	   cannot be fully utilised due to protocol constraints both on the end-
114	   systems and within the network.  If these resources could be used
115	   concurrently, end user experience could be greatly improved.  Such
116	   enhancements would also reduce the necessary expenditure on network
117	   infrastructure that would otherwise be needed to create an equivalent
118	   improvement in user experience.  By the application of resource
119	   pooling [3], these available resources can be 'pooled' such that they
120	   appear as a single logical resource to the user.

122	   Multipath transport aims to realize some of the goals of resource
123	   pooling by simultaneously making use of multiple disjoint (or
124	   partially disjoint) paths across a network.  The two key benefits of
125	   multipath transport are:

127	   o  To increase the resilience of the connectivity by providing
128	      multiple paths, protecting end hosts from the failure of one.

130	   o  To increase the efficiency of the resource usage, and thus
131	      increase the network capacity available to end hosts.

133	   Multipath TCP is a modified version of TCP [1] that implements a
134	   multipath transport and achieves these goals by pooling multiple
135	   paths within a transport connection, transparently to the
136	   application.  Multipath TCP is primarily concerned with utilising
137	   multiple paths end-to-end, where one or both end hosts are
138	   multi-homed.  It may also have applications where multiple paths
139	   exist within the network and can be manipulated by an end host, such
140	   as using different port numbers with ECMP [4].

142	   MPTCP, defined in [5], is a specific protocol that instantiates the
143	   Multipath TCP concept.  This document looks both at general
144	   architectural principles for a Multipath TCP fulfilling the goals
145	   described in Section 2, as well as the key design decisions behind
146	   MPTCP, which are detailed in Section 5.

148	   Although multihoming and multipath functions are not new to transport
149	   protocols (SCTP [6] being a notable example), MPTCP aims to gain
150	   wide-scale deployment by recognising the importance of application
151	   and network compatibility goals.  These goals, discussed in detail in
152	   Section 2, relate to the appearance of MPTCP to the network (so non-
153	   MPTCP-aware entities see it as TCP) and to the application (through
154	   providing a service equivalent to TCP for non-MPTCP-aware
155	   applications).

157	   This document has three key purposes: (i) it describes goals for a
158	   multipath transport - goals that MPTCP is designed to meet; (ii) it
159	   lays out an architectural basis for MPTCP's design - a discussion
160	   that applies to other multipath transports as well; and (iii) it
161	   discusses and documents high-level design decisions made in MPTCP's
162	   development, and considers their implications.

164	   Companion documents to this architectural overview are those which
165	   provide details of the protocol extensions [5], congestion control
166	   algorithms [7], and application-level considerations [8].  Put
167	   together, these components specify a complete Multipath TCP design.
168	   We note that specific components are replaceable in accordance with
169	   the layer and functional decompositions discussed in this document.

171	1.1.  Requirements Language

173	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
174	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
175	   document are to be interpreted as described in RFC 2119 [2].

177	1.2.  Terminology

179	   Regular/Single-Path TCP:  The standard version of the TCP [1]
180	      protocol in use today, operating between a single pair of IP
181	      addresses.

183	   Multipath TCP:  A modified version of the TCP protocol that supports
184	      the simultaneous use of multiple paths between hosts.

186	   Path:  A sequence of links between a sender and a receiver, defined
187	      in this context by a source and destination address pair.

189	   Host:  An end host either initiating or terminating a Multipath TCP
190	      connection.

192	   MPTCP:  The proposed protocol extensions specified in [5] to provide
193	      a Multipath TCP implementation.

195	   Subflow:  A flow of TCP segments operating over an individual path,
196	      which forms part of a larger Multipath TCP connection.

198	   (Multipath TCP) Connection:  A set of one or more subflows combined
199	      to provide a single Multipath TCP service to an application at a
200	      host.
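
The relationship between these terms can be pictured with a minimal
data-model sketch.  The class and field names below are hypothetical and
chosen only to mirror the definitions above; the address labels are
placeholders, and nothing here is taken from the protocol specification
in [5].

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Path:
    """A path, identified here by a source and destination address pair."""
    src_addr: str
    dst_addr: str


@dataclass
class Subflow:
    """A flow of TCP segments operating over one individual path."""
    path: Path


@dataclass
class Connection:
    """One or more subflows presented to the application as one service."""
    subflows: List[Subflow] = field(default_factory=list)

    def add_subflow(self, path: Path) -> Subflow:
        # A new subflow joins the existing connection rather than
        # creating a separate one.
        subflow = Subflow(path)
        self.subflows.append(subflow)
        return subflow


conn = Connection()
conn.add_subflow(Path("A1", "B1"))
conn.add_subflow(Path("A2", "B2"))
```

The application would hold only the Connection; the subflows, and the
paths they run over, stay hidden beneath it.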

202	1.3.  Reference Scenario

204	   The diagram shown in Figure 1 illustrates a typical usage scenario
205	   for Multipath TCP.  Two hosts, A and B, are communicating with each
206	   other.  These hosts are multi-homed and multi-addressed, providing
207	   two disjoint connections to the Internet.  The addresses on each host
208	   are referred to as A1, A2, B1 and B2.  There are therefore up to four
209	   different paths between the two hosts: A1-B1, A1-B2, A2-B1, A2-B2.

211	               +------+           __________           +------+
212	               |      |A1 ______ (          ) ______ B1|      |
213	               | Host |--/      (            )      \--| Host |
214	               |      |        (   Internet   )        |      |
215	               |  A   |--\______(            )______/--|   B  |
216	               |      |A2        (__________)        B2|      |
217	               +------+                                +------+

219	               Figure 1: Simple Multipath TCP Usage Scenario

221	   The scenario could have any number of addresses (1 or more) on each
222	   host, as long as the number of paths available between the two hosts
223	   is 2 or more (i.e. num_addr(A) * num_addr(B) > 1).  The paths created
224	   by these address combinations through the Internet need not be
225	   entirely disjoint - potential fairness issues introduced by shared
226	   bottlenecks need to be handled by the Multipath TCP congestion
227	   controller.  Furthermore, the paths through the Internet often do not
228	   provide a pure end-to-end service, and instead may be affected by
229	   middleboxes such as NATs and Firewalls.
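
The path count in this scenario can be sketched in a few lines.  The
function names are hypothetical; the sketch simply treats every source
and destination address pair as a candidate path and says nothing about
whether a given pair is actually reachable or disjoint from the others.

```python
from itertools import product


def candidate_paths(addrs_a, addrs_b):
    """Return every (source, destination) address pair between two hosts."""
    return list(product(addrs_a, addrs_b))


def multipath_possible(addrs_a, addrs_b):
    """True when num_addr(A) * num_addr(B) > 1, i.e. two or more paths."""
    return len(addrs_a) * len(addrs_b) > 1


paths = candidate_paths(["A1", "A2"], ["B1", "B2"])
```

For the two-address hosts of Figure 1, `paths` holds the four pairs
A1-B1, A1-B2, A2-B1 and A2-B2.  A host with one address can still take
part, provided its peer has at least two.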

231	2.  Goals

233	   This section outlines primary goals that Multipath TCP aims to meet.
234	   These are broadly broken down into: functional goals, which steer
235	   services and features that Multipath TCP must provide; and
236	   compatibility goals, which determine how Multipath TCP should appear
237	   to entities that interact with it.

239	2.1.  Functional Goals

241	   In supporting the use of multiple paths, Multipath TCP has the
242	   following two functional goals.

244	   o  Improve Throughput: Multipath TCP MUST support the concurrent use
245	      of multiple paths.  To meet the minimum performance incentives for
246	      deployment, a Multipath TCP connection over multiple paths SHOULD
247	      achieve no lesser throughput than a single TCP connection over the
248	      best constituent path.

250	   o  Improve Resilience: Multipath TCP MUST support the use of multiple
251	      paths interchangeably for resilience purposes, by permitting
252	      segments to be sent and re-sent on any available path.  It follows
253	      that, in the worst case, the protocol MUST be no less resilient
254	      than regular single-path TCP.

256	   As distribution of traffic among available paths and responses to
257	   congestion are done in accordance with resource pooling principles
258	   [3], a secondary effect of meeting these goals is that widespread use
259	   of Multipath TCP over the Internet should improve overall network
260	   utility by shifting load away from congested bottlenecks and by
261	   taking advantage of spare capacity wherever possible.

263	   Furthermore, Multipath TCP SHOULD feature automatic negotiation of
264	   its use.  Because both hosts must support the extensions, a host
265	   supporting Multipath TCP must be able to detect reliably whether its
266	   peer also supports them, using them if so, and otherwise
267	   automatically falling back to single-path TCP.
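
The negotiation outcome can be summarised as a small decision function.
This is only a sketch: the two boolean inputs and all names are
hypothetical, and how a host actually learns of its peer's support is
left to the protocol specification.

```python
from enum import Enum


class TransportMode(Enum):
    MULTIPATH = "multipath"
    SINGLE_PATH = "single-path"


def negotiate(local_supports_multipath: bool,
              peer_supports_multipath: bool) -> TransportMode:
    """Pick the transport mode once the peer's capabilities are known."""
    if local_supports_multipath and peer_supports_multipath:
        return TransportMode.MULTIPATH
    # Any other combination falls back automatically to regular TCP.
    return TransportMode.SINGLE_PATH


mode = negotiate(local_supports_multipath=True, peer_supports_multipath=False)
```

The fallback branch is the default: multipath operation is selected only
when support has been positively established at both ends.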

269	2.2.  Compatibility Goals

271	   In addition to the functional goals listed above, a Multipath TCP
272	   must meet a number of compatibility goals in order to support
273	   deployment in today's Internet.  These goals fall into the following
274	   categories:

276	2.2.1.  Application Compatibility

278	   Application compatibility refers to the appearance of Multipath TCP
279	   to the application both in terms of the API that can be used and the
280	   expected service model that is provided.

282	   Multipath TCP MUST follow the same service model as TCP [1]: in-
283	   order, reliable, and byte-oriented delivery.  Furthermore, a
284	   Multipath TCP connection SHOULD provide the application with no worse
285	   throughput or resilience than it would expect from running a single
286	   TCP connection over any one of its available paths.  A Multipath TCP
287	   may not, however, be able to provide the same level of consistency of
288	   throughput and latency as a single TCP connection.  These, and other,
289	   application considerations are discussed in detail in [8].

291	   A multipath-capable equivalent of TCP MUST retain some level of
292	   backward compatibility with existing TCP APIs, so that existing
293	   applications can use the newer transport merely by upgrading the
294	   operating systems of the end-hosts.  This does not preclude the use
295	   of an advanced API to permit multipath-aware applications to specify
296	   preferences, nor for users to configure their systems in a different
297	   way from the default, for example switching on or off the automatic
298	   use of multipath extensions.

300	   It is possible for regular TCP sessions today to survive brief breaks
301	   in connectivity by retaining state at end hosts before a timeout
302	   occurs.  It would be desirable to support similar session continuity
303	   in MPTCP; however, the circumstances could be different.  Whilst in
304	   regular TCP the IP addresses will remain constant across the break in
305	   connectivity, in MPTCP a different interface may appear.  It is
306	   desirable (but not mandated) to support this kind of "break-before-
307	   make" session continuity.  This places constraints on security
308	   mechanisms, however, as discussed in Section 5.8.  Timeouts for this
309	   function would be locally configured.

311	2.2.2.  Network Compatibility

313	   In the traditional Internet architecture, network devices operate at
314	   the network layer and lower layers, with the layers above the network
315	   layer instantiated only at the end-hosts.  While this architecture,
316	   shown in Figure 2, was initially largely adhered to, this layering no
317	   longer reflects the "ground truth" in the Internet with the
318	   proliferation of middleboxes [9].  Middleboxes routinely interpose on
319	   the transport layer; sometimes even completely terminating transport
320	   connections, thus leaving the application layer as the first real
321	   end-to-end layer, as shown in Figure 3.

323	   +-------------+                                       +-------------+
324	   | Application |<------------ end-to-end ------------->| Application |
325	   +-------------+                                       +-------------+
326	   |  Transport  |<------------ end-to-end ------------->|  Transport  |
327	   +-------------+   +-------------+   +-------------+   +-------------+
328	   |   Network   |<->|   Network   |<->|   Network   |<->|   Network   |
329	   +-------------+   +-------------+   +-------------+   +-------------+
330	      End Host           Router             Router          End Host

332	                Figure 2: Traditional Internet Architecture

334	   +-------------+                                       +-------------+
335	   | Application |<------------ end-to-end ------------->| Application |
336	   +-------------+                     +-------------+   +-------------+
337	   |  Transport  |<------------------->|  Transport  |<->|  Transport  |
338	   +-------------+   +-------------+   +-------------+   +-------------+
339	   |   Network   |<->|   Network   |<->|   Network   |<->|   Network   |
340	   +-------------+   +-------------+   +-------------+   +-------------+
341	                                          Firewall,
342	      End Host           Router         NAT, or Proxy      End Host

344	                        Figure 3: Internet Reality

346	   Middleboxes that interpose on the transport layer result in loss of
347	   "fate-sharing" [10], that is, they often hold "hard" state that, when
348	   lost or corrupted, results in loss or corruption of the end-to-end
349	   transport connection.

351	   The network compatibility goal requires that the multipath extension
352	   to TCP retains compatibility with the Internet as it exists today,
353	   including making reasonable efforts to be able to traverse
354	   predominant middleboxes such as firewalls, NATs, and performance
355	   enhancing proxies [9].  This requirement comes from recognizing
356	   middleboxes as a significant deployment bottleneck for any transport
357	   that is not TCP or UDP, and constrains Multipath TCP to appear as TCP
358	   does on the wire and to use established TCP extensions where
359	   necessary.  To ensure end-to-endness of the transport, we further
360	   require Multipath TCP to preserve fate-sharing without making any
361	   assumptions about middlebox behavior.

363	   A detailed analysis of middlebox behaviour and the impact on the
364	   Multipath TCP architecture is presented in Section 7.  In addition,
365	   network compatibility must be retained to the extent that Multipath
366	   TCP MUST fall back to regular TCP if there are insurmountable
367	   incompatibilities for the multipath extension on a path.

369	   Middleboxes may also cause some TCP features to be available on one
370	   subflow but not on another.  Typically these will be at the subflow
371	   level (such as SACK [11]) and thus do not affect the connection-level
372	   behaviour.  In the future, any proposed TCP connection-level
373	   extensions should consider how they can co-exist with MPTCP.

375	   The modifications to support Multipath TCP remain at the transport
376	   layer, although some knowledge of the underlying network layer is
377	   required.  Multipath TCP SHOULD work with IPv4 and IPv6
378	   interchangeably, i.e. one connection may operate over both IPv4 and
379	   IPv6 networks.

381	2.2.3.  Compatibility with other network users

383	   As a corollary to both network and application compatibility, the
384	   architecture must enable new Multipath TCP flows to coexist
385	   gracefully with existing single-path TCP flows, competing for
386	   bandwidth neither unduly aggressively nor unduly timidly (unless low-
387	   precedence operation is specifically requested by the application,
388	   such as with LEDBAT).  The use of multiple paths MUST NOT unduly harm
389	   users using single-path TCP at shared bottlenecks, beyond the impact
390	   that would occur from another single-path TCP flow.  Multiple
391	   Multipath TCP flows on a shared bottleneck MUST share bandwidth
392	   between each other with similar fairness to that which occurs at a
393	   shared bottleneck with single-path TCP.
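
A small worked example shows why this goal constrains the design.  It
rests on an idealised assumption that is not part of this document: a
bottleneck is divided among competing flows in proportion to a per-flow
weight, with a regular TCP flow having weight 1.  The weights are
illustrative only and do not represent the congestion control algorithm
in [7].

```python
from fractions import Fraction


def bottleneck_shares(flow_weights):
    """Split one bottleneck in proportion to each flow's integer weight."""
    total = sum(flow_weights.values())
    return {name: Fraction(w, total) for name, w in flow_weights.items()}


# Uncoupled case: each of two subflows competes like a full TCP flow.
uncoupled = bottleneck_shares(
    {"single-path": 1, "subflow-1": 1, "subflow-2": 1})
multipath_uncoupled = uncoupled["subflow-1"] + uncoupled["subflow-2"]

# Target case: the two subflows together weigh as much as one TCP flow.
coupled = bottleneck_shares(
    {"single-path": 2, "subflow-1": 1, "subflow-2": 1})
multipath_coupled = coupled["subflow-1"] + coupled["subflow-2"]
```

Under these assumptions the uncoupled multipath connection takes 2/3 of
the bottleneck, leaving the single-path user 1/3.  In the target case
each party receives 1/2, which is the impact another single-path TCP
flow would have had.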

395	2.3.  Security Goals

397	   The extension of TCP with multipath capabilities will bring with it a
398	   number of new threats, analysed in detail in [12].  The security goal
399	   for Multipath TCP is to provide a service no less secure than
400	   regular, single-path TCP.  This will be achieved through a
401	   combination of existing TCP security mechanisms (potentially modified
402	   to align with the Multipath TCP extensions) and of protection against
403	   the new multipath threats identified.  The design decisions derived
404	   from this goal are presented in Section 5.8.

406	2.4.  Related Protocols

408	   There are several similarities between SCTP [6] and MPTCP, in that
409	   both can make use of multiple addresses at end hosts to give some
410	   multi-path capability.  In SCTP, the primary use case is to support
411	   redundancy and mobility for multihomed hosts (i.e. a single path will
412	   change one of its end host addresses); the simultaneous use of
413	   multiple paths is not supported.  Extensions are proposed to support
414	   simultaneous multipath transport [13], but these are yet to be
415	   standardised.  By far the most widely used stream-based transport
416	   protocol is, however, TCP [1], and SCTP does not meet the network and
417	   application compatibility goals specified in Section 2.2.  For
418	   network compatibility, there are issues with various middleboxes
419	   (especially NATs) that are unaware of SCTP and consequently end up
420	   blocking it.  For application compatibility, applications need to
421	   actively choose to use SCTP and, given its deployment issues, very
422	   few choose to do so.  MPTCP's compatibility goals are in part based
423	   on these observations of SCTP's deployment issues.

425	3.  An Architectural Basis For Multipath TCP

427	   We now present one possible transport architecture that we believe
428	   can effectively support the goals for Multipath TCP.  The new
429	   Internet model described here is based on ideas proposed earlier in
430	   Tng ("Transport next-generation") [14].  While by no means the only
431	   possible architecture supporting multipath transport, Tng
432	   incorporates many lessons learned from previous transport research
433	   and development practice, and offers a strong starting point from
434	   which to consider the extant Internet architecture and its bearing on
435	   the design of any new Internet transports or transport extensions.

437	          +------------------+
438	          |    Application   |
439	          +------------------+  ^ Application-oriented transport
440	          |                  |  | functions (Semantic Layer)
441	          + - - Transport - -+ ----------------------------------
442	          |                  |  | Network-oriented transport
443	          +------------------+  v functions (Flow+Endpoint Layer)
444	          |      Network     |
445	          +------------------+
446	            Existing Layers             Tng Decomposition

448	              Figure 4: Decomposition of Transport Functions

450	   Tng loosely splits the transport layer into "application-oriented"
451	   and "network-oriented" layers, as shown in Figure 4.  The
452	   application-oriented "Semantic" layer implements functions driven
453	   primarily by concerns of supporting and protecting the application's
454	   end-to-end communication, while the network-oriented "Flow+Endpoint"
455	   layer implements functions such as endpoint identification (using
456	   port numbers) and congestion control.  These network-oriented
457	   functions, while traditionally located in the ostensibly "end-to-end"
458	   Transport layer, have proven in practice to be of great concern to
459	   network operators and the middleboxes they deploy in the network to
460	   enforce network usage policies [15] [16] or optimize communication
461	   performance [17].  Figure 5 shows how middleboxes interact with
462	   different layers in this decomposed model of the transport layer: the
463	   application-oriented layer operates end-to-end, while the network-
464	   oriented layer operates "segment-by-segment" and can be interposed
465	   upon by middleboxes.

467	   +-------------+                                       +-------------+
468	   | Application |<------------ end-to-end ------------->| Application |
469	   +-------------+                                       +-------------+
470	   |  Semantic   |<------------ end-to-end ------------->|  Semantic   |
471	   +-------------+   +-------------+   +-------------+   +-------------+
472	   |Flow+Endpoint|<->|Flow+Endpoint|<->|Flow+Endpoint|<->|Flow+Endpoint|
473	   +-------------+   +-------------+   +-------------+   +-------------+
474	   |   Network   |<->|   Network   |<->|   Network   |<->|   Network   |
475	   +-------------+   +-------------+   +-------------+   +-------------+
476	                        Firewall         Performance
477	      End Host           or NAT        Enhancing Proxy      End Host

479	              Figure 5: Middleboxes in the new Internet model

481	   MPTCP's architectural design follows Tng's decomposition as shown in
482	   Figure 6.  MPTCP, which provides application compatibility through
483	   the preservation of TCP-like semantics of global ordering of
484	   application data and reliability, is an instantiation of the
485	   "application-oriented" Semantic layer; whereas the subflow TCP
486	   component, which provides network compatibility by appearing and
487	   behaving as a TCP flow in the network, is an instantiation of the
488	   "network-oriented" Flow+Endpoint layer.

490	     +--------------------------+    +-------------------------------+
491	     |      Application         |    |          Application          |
492	     +--------------------------+    +-------------------------------+
493	     |        Semantic          |    |             MPTCP             |
494	     |------------+-------------|    + - - - - - - - + - - - - - - - +
495	     | Flow+Endpt | Flow+Endpt  |    | Subflow (TCP) | Subflow (TCP) |
496	     +------------+-------------+    +---------------+---------------+
497	     |   Network  |   Network   |    |       IP      |       IP      |
498	     +------------+-------------+    +---------------+---------------+

500	        Figure 6: Relationship between Tng (left) and MPTCP (right)

502	   As a protocol extension to TCP, MPTCP thus explicitly acknowledges
503	   middleboxes in its design, and specifies a protocol that operates at
504	   two scales: the MPTCP component operates end-to-end, while it allows
505	   the TCP component to operate segment-by-segment.
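
The two scales can be illustrated with a minimal sketch of the
end-to-end component at a receiver.  It assumes that each chunk of data
is tagged with its offset in the connection-level byte stream, that
chunks never partially overlap, and that each subflow has already
delivered its chunk reliably.  The class and method names are
hypothetical.

```python
class ConnectionReassembler:
    """Deliver bytes in connection-level order, whichever subflow
    carried them."""

    def __init__(self):
        self.next_offset = 0          # next connection-level byte expected
        self.pending = {}             # offset -> bytes waiting for a gap
        self.delivered = bytearray()  # in-order stream for the application

    def receive(self, offset, data):
        """Accept a chunk that arrived on any subflow."""
        if offset < self.next_offset:
            return  # already delivered, e.g. re-sent on another path
        self.pending[offset] = data
        # Release every chunk that is now contiguous with the stream.
        while self.next_offset in self.pending:
            chunk = self.pending.pop(self.next_offset)
            self.delivered += chunk
            self.next_offset += len(chunk)


rx = ConnectionReassembler()
rx.receive(5, b"world")  # arrives first, on one subflow
rx.receive(0, b"hello")  # arrives later, on another subflow
```

The application reads `rx.delivered` and sees a single ordered byte
stream, as it would with regular TCP; the per-subflow behaviour stays
below this layer.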

507	4.  A Functional Decomposition of MPTCP

509	   The previous two sections have discussed the goals for a Multipath
510	   TCP design, and provided a basis for decomposing the functions of a
511	   transport protocol in order to better understand the form a solution
512	   should take.  This section builds upon this analysis by presenting
513	   the functional components that are used within the MPTCP design.

515	   MPTCP makes use of (what appear to the network to be) standard TCP
516	   sessions, termed "subflows", to provide the underlying transport per
517	   path, and as such these retain the network compatibility desired.
518	   MPTCP-specific information is carried in a TCP-compatible manner,
519	   although this mechanism is separate from the actual information being
520	   transferred and so could evolve in future revisions.  Figure 7
521	   illustrates the layered architecture.

523	                                   +-------------------------------+
524	                                   |           Application         |
525	      +---------------+            +-------------------------------+
526	      |  Application  |            |             MPTCP             |
527	      +---------------+            + - - - - - - - + - - - - - - - +
528	      |      TCP      |            | Subflow (TCP) | Subflow (TCP) |
529	      +---------------+            +-------------------------------+
530	      |      IP       |            |       IP      |      IP       |
531	      +---------------+            +-------------------------------+

533	      Figure 7: Comparison of Standard TCP and MPTCP Protocol Stacks

535	   Situated below the application, the MPTCP extension in turn manages
536	   multiple TCP subflows below it.  In order to do this, it must
537	   implement the following functions:

539	   o  Path Management: This is the function to detect and use multiple
540	      paths between two hosts.  MPTCP uses the presence of multiple IP
541	      addresses at one or both of the hosts as an indicator of this.
542	      The path management features of the MPTCP protocol are the
543	      mechanisms to signal alternative addresses to hosts, and
544	      mechanisms to set up new subflows joined to an existing MPTCP
545	      connection.

547	   o  Packet Scheduling: This function breaks the bytestream received
548	      from the application into segments to be transmitted on one of the
549	      available subflows.  The MPTCP design makes use of a data sequence
550	      mapping, associating segments sent on different subflows to a
551	      connection-level sequence numbering, thus allowing segments sent
552	      on different subflows to be correctly re-ordered at the receiver.
553	      The packet scheduler is dependent upon information about the
554	      availability of paths exposed by the path management component,
555	      and then makes use of the subflows to transmit queued segments.
556	      This function is also responsible for connection-level re-ordering
557	      on receipt of packets from the TCP subflows, according to the
558	      attached data sequence mappings.

560	   o  Subflow (single-path TCP) Interface: A subflow component takes
561	      segments from the packet-scheduling component and transmits them
562	      over the specified path, ensuring detectable delivery to the host.
563	      MPTCP uses TCP underneath for network compatibility; TCP ensures
564	      in-order, reliable delivery.  TCP adds its own sequence numbers to
565	      the segments; these are used to detect and retransmit lost packets
566	      at the subflow layer.  On receipt, the subflow passes its
567	      reassembled data to the packet scheduling component for
568	      connection-level reassembly; the data sequence mapping from the
569	      sender's packet scheduling component allows re-ordering of the
570	      entire bytestream.

572	   o  Congestion Control: This function coordinates congestion control
573	      across the subflows.  As specified, this congestion control
574	      algorithm MUST ensure that an MPTCP connection does not unfairly
575	      take more bandwidth than a single path TCP flow would take at a
576	      shared bottleneck.  An algorithm to support this is specified in
577	      [7].

579	   These functions fit together as follows.  The Path Management looks
580	   after the discovery (and if necessary, initialisation) of multiple
581	   paths between two hosts.  The Packet Scheduler then receives a stream
582	   of data from the application destined for the network, and undertakes
583	   the necessary operations on it (such as segmenting the data into
584	   connection-level segments, and adding a connection-level sequence
585	   number) before sending it on to a subflow.  The subflow then adds its
586	   own sequence number and ACKs, and passes the segments to the network.
587	   The receiving subflow re-orders data (if necessary) and passes it to
588	   the packet scheduling component, which performs connection-level
589	   re-ordering, and sends the data stream to the application.  Finally,
590	   the congestion control component exists as part of the packet
591	   scheduling, in order to schedule which segments should be sent at
592	   what rate on which subflow.
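
   As a rough illustration of the receive side of this pipeline, the
   sketch below holds data handed up by the subflows until it is
   contiguous in connection-level sequence space.  The class and method
   names are hypothetical, segments are assumed not to overlap, and
   sequence-number wrap-around is ignored.

```python
# Illustrative sketch only; names are hypothetical, not from the draft.
class ReorderBuffer:
    """Connection-level re-ordering of data arriving from subflows."""

    def __init__(self, initial_data_seq=0):
        self.next_seq = initial_data_seq  # next connection-level byte expected
        self.pending = {}                 # data_seq -> payload held back

    def receive(self, data_seq, payload):
        """Store one chunk and return whatever is now deliverable."""
        self.pending[data_seq] = payload
        delivered = bytearray()
        while self.next_seq in self.pending:
            chunk = self.pending.pop(self.next_seq)
            delivered += chunk
            self.next_seq += len(chunk)
        return bytes(delivered)


buf = ReorderBuffer()
out = []
out.append(buf.receive(5, b"world"))  # arrives first, on one subflow
out.append(buf.receive(0, b"hello"))  # arrives later, on another subflow
```

   Nothing is delivered until the gap at the start of the stream is
   filled, after which both chunks are passed up in order.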

594	5.  High-Level Design Decisions

596	   There is seemingly a wide range of choices when designing a multipath
597	   extension to TCP.  However, the goals as discussed earlier in this
598	   document constrain the possible solutions, leaving relatively little
599	   choice in many areas.  Here, we outline high-level design choices
600	   that draw from the architectural basis discussed earlier in
601	   Section 3, which the design of MPTCP [5] takes into account.

603	5.1.  Sequence Numbering

605	   MPTCP uses two levels of sequence spaces: a connection level sequence
606	   number, and another sequence number for each subflow.  This permits
607	   connection-level segmentation and reassembly, and retransmission of
608	   the same part of connection-level sequence space on different
609	   subflow-level sequence space.

611	   The alternative approach would be to use a single connection level
612	   sequence number, which gets sent on multiple subflows.  This has two
613	   problems: first, the individual subflows will appear to the network
614	   as TCP sessions with gaps in the sequence space; this in turn may
615	   upset certain middleboxes such as intrusion detection systems, or
616	   certain transparent proxies, and would thus go against the network
617	   compatibility goal.  Second, the sender would not be able to
618	   attribute packet losses or receptions to the correct path when the
619	   same segment is sent on multiple paths (i.e. in the case of
620	   retransmissions).

622	   The sender must be able to tell the receiver how to reassemble the
623	   data, for delivery to the application.  In order to achieve this, the
624	   receiver must determine how subflow-level data (carrying subflow
625	   sequence numbers) maps at the connection level.  We refer to this as
626	   the Data Sequence Mapping.  This mapping takes the form (data seq,
627	   subflow seq, length), i.e. for a given number of bytes (the length),
628	   the subflow sequence space beginning at the given sequence number
629	   maps to the connection-level sequence space (beginning at the given
630	   data seq number).  This information could conceivably have various
631	   sources.
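
   The mapping just described can be sketched as a small data structure.
   This is an illustration under simplifying assumptions (no sequence-
   number wrap-around, hypothetical names), not the encoding used by the
   protocol specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataSequenceMapping:
    data_seq: int     # start in connection-level sequence space
    subflow_seq: int  # start in subflow-level sequence space
    length: int       # number of bytes covered by the mapping

    def covers(self, subflow_seq):
        end = self.subflow_seq + self.length
        return self.subflow_seq <= subflow_seq < end

    def to_data_seq(self, subflow_seq):
        """Translate a subflow sequence number to connection level."""
        if not self.covers(subflow_seq):
            raise ValueError("sequence number outside this mapping")
        return self.data_seq + (subflow_seq - self.subflow_seq)


m = DataSequenceMapping(data_seq=10_000, subflow_seq=500, length=1_000)
first = m.to_data_seq(500)     # first byte covered by the mapping
last = m.to_data_seq(1_499)    # last byte covered by the mapping
```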

633	   One option to signal the Data Sequence Mapping would be to use
634	   existing fields in the TCP segment (such as subflow seqno, length)
635	   and only add the data sequence number to each segment, for instance
636	   as a TCP option.  This would be vulnerable, however, to middleboxes
637	   that resegment or assemble data, since there is no specified
638	   behaviour for coalescing TCP options.  If one signalled (data seqno,
639	   length), this would still be vulnerable to middleboxes that coalesce
640	   segments and do not understand MPTCP signalling so do not correctly
641	   rewrite the options.

643	   Because of these potential issues, the design decision taken in the
644	   MPTCP protocol is that whenever a mapping for subflow data needs to
645	   be conveyed to the other host, all three pieces of data (data seq,
646	   subflow seq, length) must be sent.  To reduce the overhead, it would
647	   be permissible for the mapping to be sent periodically and cover more
648	   than a single segment.  Further experimentation is required to
649	   determine what tradeoffs exist regarding the frequency at which
650	   mappings should be sent.  The mapping could also be omitted entirely
651	   while a connection uses only a single subflow, in which case the
652	   data-level and subflow-level sequence spaces are the same.
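
   The value of carrying all three fields can be seen in a short sketch:
   because the mapping is expressed against subflow sequence numbers
   rather than packet boundaries, the receiver places the bytes
   identically whether or not a middlebox has split the segment.  The
   helper names and the example values are hypothetical.

```python
def place(mapping, subflow_seq, payload):
    """Return (data_seq, payload) for bytes inside one mapping."""
    data_seq, map_subflow_seq, length = mapping
    offset = subflow_seq - map_subflow_seq
    if offset < 0 or offset + len(payload) > length:
        raise ValueError("bytes fall outside the mapping")
    return data_seq + offset, payload


def rebuild(mapping, segments):
    """Reassemble segments; returns (starting data_seq, all bytes)."""
    placed = sorted(place(mapping, seq, data) for seq, data in segments)
    return placed[0][0], b"".join(data for _, data in placed)


mapping = (7_000, 100, 8)                       # (data seq, subflow seq, length)
as_sent = [(100, b"ABCDEFGH")]                  # one segment from the sender
resegmented = [(103, b"DEFGH"), (100, b"ABC")]  # split in the network

result_as_sent = rebuild(mapping, as_sent)
result_resegmented = rebuild(mapping, resegmented)
```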

654	5.2.  Reliability and Retransmissions

656	   MPTCP features acknowledgements at connection-level as well as
657	   subflow-level acknowledgements, in order to provide a robust service
658	   to the application.

660	   Under normal behaviour, MPTCP can use the data sequence mapping and
661	   subflow ACKs to decide when a connection-level segment was received.
662	   The transmission of TCP ACKs for a subflow is handled entirely at
663	   the subflow level, in order to maintain TCP semantics and trigger
664	   subflow-level retransmissions.  This has certain implications on end-
665	   to-end semantics.  Firstly, once a segment is ACKed at the subflow
666	   level it cannot be discarded in the re-order buffer at the
667	   connection level.  Secondly, unlike in standard TCP, a receiver
668	   cannot simply drop out-of-order segments if needed (for instance, due
669	   to memory pressure).  Under certain circumstances, therefore, it may
670	   be desirable to drop segments after acknowledgement on the subflow
671	   but before delivery to the application, and this can be facilitated
672	   by a connection-level acknowledgement.

674	   Furthermore, it is possible to conceive of some cases where
675	   connection-level acknowledgements could improve robustness.  Consider
676	   a subflow traversing a transparent proxy: if the proxy ACKs a segment
677	   and then crashes, the sender will not retransmit the lost segment on
678	   another subflow, as it thinks the segment has been received.  The
679	   connection grinds to a halt despite having other working subflows,
680	   and the sender would be unable to determine the cause of the problem.
681	   An example situation where this may occur would be mobility between
682	   wireless access points, each of which operates a transport-level
683	   proxy.  Finally, as an optimisation, it may be feasible for a
684	   connection-level acknowledgement to be transmitted over the shortest
685	   Round-Trip Time (RTT) path, potentially reducing send buffer
686	   requirements (see Section 5.3).

688	   Therefore, to provide a fully robust multipath TCP solution given the
689	   above constraints, MPTCP for use on the public Internet MUST feature
690	   explicit connection-level acknowledgements, in addition to subflow-
691	   level acknowledgements.  A connection-level acknowledgement would
692	   only be required in order to signal when the receive window moves
693	   forward; the heuristics for using such a signal are discussed in more
694	   detail in the protocol specification [5].
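
   The sender-side consequence can be sketched as follows: data is
   released only by a connection-level acknowledgement, never by a
   subflow ACK alone.  This is a minimal illustration with hypothetical
   names; the actual signalling is defined in the protocol
   specification [5].

```python
class ConnectionSendBuffer:
    """Keeps sent data until it is acknowledged at connection level."""

    def __init__(self):
        self.unacked = {}  # data_seq -> payload

    def on_send(self, data_seq, payload):
        self.unacked[data_seq] = payload

    def on_subflow_ack(self, data_seq):
        # A subflow ACK may have been generated by a proxy on the path,
        # so nothing is released here.
        return data_seq in self.unacked

    def on_data_ack(self, cumulative_ack):
        # Release every segment lying wholly below the cumulative
        # connection-level acknowledgement.
        done = [seq for seq, data in self.unacked.items()
                if seq + len(data) <= cumulative_ack]
        for seq in done:
            del self.unacked[seq]


sb = ConnectionSendBuffer()
sb.on_send(0, b"aaaa")
sb.on_send(4, b"bbbb")
sb.on_subflow_ack(0)                     # subflow-level ACK only
held_after_subflow_ack = sorted(sb.unacked)
sb.on_data_ack(4)                        # covers connection bytes 0..3
held_after_data_ack = sorted(sb.unacked)
```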

696	   Regarding retransmissions, it MUST be possible for a segment to be
697	   retransmitted on a different subflow to that on which it was
698	   originally sent.  This is one of MPTCP's core goals, in order to
699	   maintain integrity during temporary or permanent subflow failure, and
700	   this is enabled by the dual sequence number space.

702	   The scheduling of retransmissions will have significant impact on
703	   MPTCP user experience.  The current MPTCP specification suggests that
704	   data outstanding on subflows that have timed out should be
705	   rescheduled for transmission on different subflows.  This behaviour
706	   aims to minimize disruption when a path breaks, and uses the first
707	   timeout as the indicator.  More conservative versions would be to use
708	   second or third timeouts for the same segment.

710	   Typically, fast retransmit on an individual subflow will not trigger
711	   retransmission on another subflow, although this may still be
712	   desirable in certain cases, for instance to reduce the receive buffer
713	   requirements.  However, in all cases with retransmissions on
714	   different subflows, the lost segments SHOULD still be sent on the
715	   path that lost them.  This is currently believed to be necessary to
716	   maintain subflow integrity, as per the network compatibility goal.
717	   By doing this, some efficiency is lost, and it is unclear at this
718	   point what the optimal retransmit strategy is.

720	   Large-scale experiments are therefore required in order to determine
721	   the most appropriate retransmission strategy, and recommendations
722	   will be refined once more information is available.
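
   Pending such experiments, the rescheduling behaviour discussed in
   this section can be sketched as below.  The function, its parameters
   and the round-robin choice of alternative subflow are assumptions
   made for illustration; the threshold selects between rescheduling on
   the first timeout and the more conservative variants.

```python
def plan_retransmissions(outstanding, timeouts, threshold=1):
    """Map each stalled data_seq to the subflows that should carry it.

    outstanding: subflow id -> list of unacknowledged data_seq values
    timeouts:    subflow id -> consecutive timeouts seen on that subflow
    threshold:   timeouts needed before data is rescheduled elsewhere
    """
    stalled = [sf for sf in outstanding
               if timeouts.get(sf, 0) >= threshold]
    healthy = [sf for sf in outstanding if sf not in stalled]
    plan = {}
    slot = 0
    for sf in stalled:
        for data_seq in outstanding[sf]:
            targets = [sf]  # still resent on the path that lost it
            if healthy:
                targets.append(healthy[slot % len(healthy)])
                slot += 1
            plan[data_seq] = targets
    return plan


outstanding = {"wifi": [100, 200], "cellular": [300]}
eager = plan_retransmissions(outstanding, {"wifi": 1}, threshold=1)
cautious = plan_retransmissions(outstanding, {"wifi": 1}, threshold=3)
```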

724	5.3.  Buffers

726	   To ensure in-order delivery, MPTCP must use a connection level
727	   receive buffer, where segments are placed until they are in order and
728	   can be read by the application.

730	   In regular, single-path TCP, it is usually recommended to set the
731	   receive buffer to 2*BDP (Bandwidth-Delay Product, i.e.  BDP = BW*RTT,
732	   where BW = Bandwidth and RTT = Round-Trip Time).  One BDP allows
733	   supporting reordering of segments by the network.  The other BDP
734	   allows the connection to continue during fast retransmit: when a
735	   segment is fast retransmitted, the receiver must be able to store
736	   incoming data during one more RTT.

738	   For MPTCP, the story is a bit more complicated.  The ultimate goal is
739	   that a subflow packet loss or subflow failure should not affect the
740	   throughput of other working subflows; the receiver should have enough
741	   buffering to store all data until the missing segment is re-
742	   transmitted and reaches the destination.

744	   The worst case scenario would be when the subflow with the highest
745	   RTT/RTO (Round-Trip Time or Retransmission TimeOut) experiences a
746	   timeout; in that case the receiver has to buffer data from all
747	   subflows for the duration of the RTO.  Thus, the smallest connection-
748	   level receive buffer that would be needed to avoid stalling with
749	   subflow failures is sum(BW_i)*RTO_max, where BW_i = Bandwidth for
750	   each subflow and RTO_max is the largest RTO across all subflows.

752	   This is an order of magnitude more than the receive buffer required
753	   for a single connection, and is probably too expensive for practical
754	   purposes.  A more sensible requirement is to avoid stalls in the
755	   absence of timeouts.  Therefore, the RECOMMENDED receive buffer is
756	   2*sum(BW_i)*RTT_max, where RTT_max is the largest RTT across all
757	   subflows.  This buffer sizing ensures subflows do not stall when fast
758	   retransmit is triggered on any subflow.

760	   The resulting buffer size should be small enough for practical use.
761	   However, there may be extreme cases where fast, high throughput paths
762	   (e.g. 100Mb/s, 10ms RTT) are used in conjunction with slow paths
763	   (e.g. 1Mb/s, 1000ms RTT).  In that case the required receive buffer
764	   would be about 25MB, which is likely too big.  In extreme cases such as
765	   this example, it may be prudent to only use some of the fastest
766	   available paths for the MPTCP connection, potentially using the slow
767	   path(s) for backup only.
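
   The sizing rule can be checked with a few lines of arithmetic.  The
   sketch below uses the example figures above and decimal units; the
   function name is hypothetical.  For the two paths together,
   sum(BW_i)*RTT_max alone comes to roughly 12.6MB and the factor of two
   in the recommended size doubles it, whereas using only the fast path
   needs 250kB.

```python
def recommended_buffer_bytes(paths):
    """paths: list of (bandwidth in bit/s, RTT in ms) per subflow."""
    total_bw = sum(bw for bw, _ in paths)
    rtt_max_ms = max(rtt for _, rtt in paths)
    # 2 * sum(BW_i) * RTT_max, converted from bit*ms to bytes.
    return 2 * total_bw * rtt_max_ms // 8000


fast = (100_000_000, 10)   # 100Mb/s, 10ms RTT
slow = (1_000_000, 1000)   # 1Mb/s, 1000ms RTT

both_paths = recommended_buffer_bytes([fast, slow])
one_bdp_equivalent = both_paths // 2   # sum(BW_i) * RTT_max
fast_only = recommended_buffer_bytes([fast])
```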

769	   Send Buffer: The RECOMMENDED send buffer is the same size as the
770	   recommended receive buffer i.e., 2*sum(BW_i)*RTT_max.  This is
771	   because the sender must store locally the segments sent but
772	   unacknowledged by the connection level ACK.  The send buffer size
773	   matters particularly for hosts that maintain a large number of
774	   ongoing connections.  If the required send buffer is too large, a
775	   host can choose to only send data on the fast subflows, using the
776	   slow subflows only in cases of failure.

778	5.4.  Signalling

780	   Since MPTCP uses TCP as its subflow transport mechanism, an MPTCP
781	   connection will also begin as a single TCP connection.  Nevertheless,
782	   it must signal to the peer that it supports MPTCP and wishes to use
783	   it on this connection.  As such, a TCP Option will be used to
784	   transmit this information, since this is the established mechanism
785	   for indicating additional functionality on a TCP session.

787	   In addition, further signalling is required during the operation of
788	   an MPTCP session, such as that for reassembly for multiple subflows,
789	   and for informing the other host about potential other available
790	   addresses.

792	   The MPTCP protocol design will, however, use TCP Options for this
793	   additional signalling.  This has been chosen as the mechanism most
794	   fitting in with the goals as specified in Section 2.  With this
795	   mechanism, the signalling required to operate MPTCP is transported
796	   separately from the data, allowing it to be created and processed
797	   separately from the data stream, and retaining architectural
798	   compatibility with network entities.

800	   This decision is the consensus of the Working Group (following
801	   detailed discussions at IETF78), and the main reasons for this are as
802	   follows:

804	   o  TCP options are the traditional signalling method for TCP;

806	   o  A TCP option on a SYN is the most compatible way for an end host
807	      to signal it is MPTCP-capable;

809	   o  If connection-level ACKs are signalled in the payload then they
810	      may suffer from packet loss and may be congestion-controlled,
811	      which may affect the data throughput in the forward direction and
812	      could lead to head-of-line blocking;

814	   o  Middleboxes, such as NAT traversal helpers, can easily parse TCP
815	      options, e.g., to rewrite addresses.

817	   On the other hand, the main drawbacks of TCP options compared to TLV
818	   encoding in the payload are:

820	   o  There is limited space for signalling messages;

822	   o  A middlebox may, potentially, drop a packet with an unknown
823	      option;

825	   o  The transport of control information in options is not necessarily
826	      reliable.

828	   The detailed design of MPTCP alleviates these issues as far as
829	   possible by carefully considering the size of MPTCP options, and
830	   seamlessly falling back to regular TCP on the loss of control data.
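
   The space constraint can be made concrete with a small sketch.  A TCP
   header carries at most 40 bytes of options, each encoded as kind,
   length and value.  The option kind used below is an experimental
   value standing in for an MPTCP assignment, and the mapping layout
   (4-byte data sequence number, 4-byte subflow sequence number, 2-byte
   length) is a hypothetical one chosen for illustration only.

```python
import struct

MAX_OPTION_BYTES = 40  # 60-byte maximum TCP header minus 20 fixed bytes
STAND_IN_KIND = 253    # experimental option kind; not an MPTCP value


def encode_option(kind, value):
    """Encode one TCP option as kind, length, value."""
    length = 2 + len(value)
    if length > MAX_OPTION_BYTES:
        raise ValueError("option cannot fit in a TCP header")
    return struct.pack("!BB", kind, length) + value


def fits(options, bytes_already_used=0):
    """True if the options fit alongside those already in the header."""
    total = bytes_already_used + sum(len(o) for o in options)
    return total <= MAX_OPTION_BYTES


# Hypothetical layout: data seq (4), subflow seq (4), length (2).
mapping = encode_option(STAND_IN_KIND, struct.pack("!IIH", 7000, 100, 1400))
one_with_timestamps = fits([mapping], bytes_already_used=12)
three_with_timestamps = fits([mapping] * 3, bytes_already_used=12)
```

   With 12 bytes already taken by a padded timestamp option, one such
   mapping fits but three do not.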

832	   Both option and payload encoding may interfere with offloading of TCP
833	   processing to high speed network interface cards, such as
834	   segmentation, checksumming, and reassembly.  For network cards
835	   supporting MPTCP, signalling in TCP options should simplify
836	   offloading due to the separate handling of MPTCP signalling and data.

838	5.5.  Path Management

840	   Currently, the network does not expose path diversity between pairs
841	   of IP addresses.  In order to achieve path diversity from today's IP
842	   networks, in the typical case MPTCP uses multiple addresses at one or
843	   both hosts to infer different paths across the network.  It is
844	   expected that these paths, whilst not necessarily entirely non-
845	   overlapping, will be sufficiently disjoint to allow multipath to
846	   achieve improved throughput and robustness.  The use of multiple IP
847	   addresses is a simple mechanism that requires no additional features
848	   in the network.

850	   Multiple different (source, destination) address pairs will thus be
851	   used as path selectors in most cases.  Each path will be identified
852	   by a standard five-tuple (i.e. source address, destination address,
853	   source port, destination port, protocol), however, which can allow
854	   the extension of MPTCP to use ports as well as addresses as path
855	   selectors.  This will allow hosts to use port-based load balancing
856	   with MPTCP, for example if the network routes different ports over
857	   different paths (which may be the case with technologies such as
858	   Equal Cost MultiPath (ECMP) routing [4]).  It should be noted,
859	   however, that ISPs often undertake traffic engineering in order to
860	   optimise resource utilisation within their networks, and care should
861	   be taken (by both ISPs and developers) that MPTCP using broadly
862	   similar paths does not adversely interfere with this.
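
   A minimal sketch of address-based path selection follows.  It simply
   enumerates the (source, destination) pairs that share an IP version;
   the function name is hypothetical and the addresses are taken from
   the documentation ranges.

```python
import ipaddress
from itertools import product


def candidate_paths(local_addrs, remote_addrs):
    """Return (source, destination) pairs usable as path selectors."""
    pairs = []
    for src, dst in product(local_addrs, remote_addrs):
        src_version = ipaddress.ip_address(src).version
        dst_version = ipaddress.ip_address(dst).version
        if src_version == dst_version:  # never pair IPv4 with IPv6
            pairs.append((src, dst))
    return pairs


paths = candidate_paths(
    ["192.0.2.10", "198.51.100.10", "2001:db8::10"],
    ["203.0.113.5", "2001:db8:1::5"],
)
```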

864	   For increased chance of successfully setting up additional subflows
865	   (such as when one end is behind a firewall, NAT, or other restrictive
866	   middlebox), either host SHOULD be able to add new subflows to an
867	   MPTCP connection.  MPTCP MUST be able to handle paths that appear and
868	   disappear during the lifetime of a connection (for example, through
869	   the activation of an additional network interface).

871	   The path management is a separate function from the packet
872	   scheduling, subflow interface, and congestion control functions of
873	   MPTCP, as documented in Section 4.  As such it would be feasible to
874	   replace this IP-address-based design with an alternative path
875	   selection mechanism in the future, with no significant changes to the
876	   other functional components.

878	5.6.  Connection Identification

880	   Since an MPTCP connection may not be bound to a traditional 5-tuple
881	   (source address and port, destination address and port, protocol
882	   number) for the entirety of its existence, it is desirable to provide
883	   a new mechanism for connection identification.  This will be useful
884	   for MPTCP-aware applications, and for the MPTCP implementation (and
885	   MPTCP-aware middleboxes) to have a unique identifier with which to
886	   associate the multiple subflows.

888	   Therefore, each MPTCP connection requires a connection identifier at
889	   each host, which is locally unique within that host.  In many ways,
890	   this is analogous to an ephemeral port number in regular TCP.  The
891	   manifestation and purpose of such an identifier is out of the scope
892	   of this architecture document.
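
   Purely as an illustration of local uniqueness, a host could keep a
   table keyed by such an identifier and attach each subflow's 5-tuple
   to it.  The 32-bit random identifier and all names below are
   assumptions; this document does not define the identifier's form.

```python
import secrets


class ConnectionTable:
    """Associates subflow 5-tuples with a locally unique identifier."""

    def __init__(self):
        self.subflows = {}  # identifier -> list of 5-tuples

    def new_connection(self, first_five_tuple):
        while True:
            ident = secrets.randbits(32)  # width is an assumption
            if ident not in self.subflows:
                self.subflows[ident] = [first_five_tuple]
                return ident

    def add_subflow(self, ident, five_tuple):
        self.subflows[ident].append(five_tuple)


table = ConnectionTable()
first = ("192.0.2.10", 49152, "203.0.113.5", 80, "tcp")
second = ("198.51.100.10", 49153, "203.0.113.5", 80, "tcp")
ident = table.new_connection(first)
table.add_subflow(ident, second)
```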

894	   Legacy applications will not, however, have access to this identifier
895	   and in such cases an MPTCP connection will be identified by the
896	   5-tuple of the first TCP subflow.  It is out of the scope of this
897	   document, however, to define the behaviour of the MPTCP
898	   implementation if the first TCP subflow later fails.  If there are
899	   MPTCP-unaware applications that make assumptions about continued
900	   existence of the initial address pair, their behaviour could be
901	   disrupted by carrying on regardless.  It is expected that this is a
902	   very small, possibly negligible, set of applications, however.  MPTCP
903	   MUST NOT be used for applications that request to bind to a specific
904	   address or interface, since such applications are making a deliberate
905	   choice of path in use.

907	   Since the requirements of applications are not clear at this stage,
908	   however, it is as yet unconfirmed whether carrying on in the event of
909	   the loss of the initial address pair would be a damaging assumption
910	   to make.  This behaviour will be an implementation-specific solution,
911	   and as such it is expected to be chosen by implementors once more
912	   research has been undertaken to determine its impact.

914	5.7.  Congestion Control

916	   As discussed in the network-layer compatibility requirements
917	   (Section 2.2.3), there are three goals for the congestion control
918	   algorithms used by an MPTCP implementation: improve throughput (at
919	   least as well as a single-path TCP connection would perform); do no
920	   harm to other network users (do not take up more capacity on any one
921	   path than if it was a single path flow using only that route - this
922	   is particularly relevant for shared bottlenecks); and balance
923	   congestion by moving traffic away from the most congested paths.  To
924	   achieve these goals, the congestion control algorithms on each
925	   subflow must be coupled in some way.  A proposal for a suitable
926	   congestion control algorithm is given in [7].
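
   To show what coupling can look like, the sketch below computes a
   per-ACK window increase in congestion avoidance using the linked-
   increase rule later published as RFC 6356.  It is offered as one
   example of a coupled algorithm, not as a restatement of [7]; windows
   are in packets, RTTs in seconds, and the function name is
   hypothetical.

```python
def coupled_increase(cwnds, rtts, i):
    """Per-ACK window increase (in packets) for subflow i."""
    total = sum(cwnds)
    best = max(c / (r * r) for c, r in zip(cwnds, rtts))
    alpha = total * best / sum(c / r for c, r in zip(cwnds, rtts)) ** 2
    # Never more aggressive than a single-path TCP flow on this path.
    return min(alpha / total, 1.0 / cwnds[i])


cwnds = [10.0, 10.0]  # congestion windows in packets
rtts = [0.1, 0.1]     # round-trip times in seconds
coupled = coupled_increase(cwnds, rtts, 0)
uncoupled = 1.0 / cwnds[0]  # what an independent TCP flow would add
```

   With two identical subflows, each grows at a quarter of the rate of
   an independent TCP flow, so the connection as a whole does not take
   more than one flow's share at a shared bottleneck.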

928	5.8.  Security

930	   A detailed threat analysis for Multipath TCP is presented in a
931	   separate document [12].  This focuses on flooding attacks and
932	   hijacking attacks that can be launched against a Multipath TCP
933	   connection.

935	   The basic security goal of Multipath TCP, as introduced in
936	   Section 2.3, can be stated as: "provide a solution that is no worse
937	   than standard TCP".

939	   From the threat analysis, and with this goal in mind, three key
940	   security requirements can be identified.  A multi-addressed Multipath
941	   TCP SHOULD be able to:

943	   o  Provide a mechanism to confirm that the parties in a subflow
944	      handshake are the same as in the original connection setup (e.g.
945	      require use of a key exchanged in the initial handshake in the
946	      subflow handshake, to limit the scope for hijacking attacks).

948	   o  Provide verification that the peer can receive traffic at a new
949	      address before adding it (i.e. verify that the address belongs to
950	      the other host, to prevent flooding attacks).

952	   o  Provide replay protection, i.e. ensure that a request to add/
953	      remove a subflow is 'fresh'.
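
   The first and third requirements can be illustrated with a keyed MAC
   over a nonce.  This is a sketch under stated assumptions (HMAC-
   SHA256, an 8-byte nonce, hypothetical names) and is not the mechanism
   of the protocol specification [5]; address verification, the second
   requirement, is not covered.

```python
import hashlib
import hmac
import secrets


def join_mac(key, nonce, address):
    """MAC binding a subflow request to the initial handshake key."""
    return hmac.new(key, nonce + address, hashlib.sha256).digest()


class JoinVerifier:
    def __init__(self, key):
        self.key = key
        self.used_nonces = set()

    def accept(self, nonce, address, mac):
        if nonce in self.used_nonces:
            return False  # replayed request, not fresh
        expected = join_mac(self.key, nonce, address)
        if not hmac.compare_digest(mac, expected):
            return False  # sender does not hold the connection key
        self.used_nonces.add(nonce)
        return True


key = secrets.token_bytes(16)   # stands in for the handshake key
nonce = secrets.token_bytes(8)
addr = b"198.51.100.10"
verifier = JoinVerifier(key)
genuine = verifier.accept(nonce, addr, join_mac(key, nonce, addr))
replayed = verifier.accept(nonce, addr, join_mac(key, nonce, addr))
forged = verifier.accept(secrets.token_bytes(8), addr, b"\x00" * 32)
```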

955	   Additional mechanisms have been deployed as part of standard TCP
956	   stacks to provide resistance to Denial-of-Service attacks.  For
957	   example, there are various mechanisms to protect against TCP reset
958	   attacks [18], and Multipath TCP should continue to support similar
959	   protection.  In addition, TCP SYN Cookies [19] were developed to
960	   allow a TCP server to defer the creation of session state in the
961	   SYN_RCVD state, and remain stateless until the ESTABLISHED state had
962	   been reached.  Multipath TCP should, ideally, continue to provide
963	   such functionality and, at a minimum, avoid significant computational
964	   burden prior to reaching the ESTABLISHED state (of the Multipath TCP
965	   connection as a whole).

967	   It should be noted that aspects of the Multipath TCP design space
968	   place constraints on the security solution:

970	   o  The use of TCP options significantly limits the amount of
971	      information that can be carried in the handshake.

973	   o  The need to work through middleboxes results in the need to handle
974	      mutability of packets.

976	   o  The desire to support a 'break-before-make' (as well as a 'make-
977	      before-break') approach to adding subflows (within a limited time
978	      period) implies that a host cannot rely on using a pre-existing
979	      subflow to support the addition of a new one.

981	   The MPTCP protocol will be designed with these security requirements
982	   in mind, and the protocol specification [5] will document how these
983	   are met.

985	6.  Software Interactions

987	6.1.  Interactions with Applications

989	   In the case of applications that have used an existing API call to
990	   bind to a specific address or interface, the MPTCP extension MUST NOT
991	   be used.  This is because the applications are indicating a clear
992	   choice of path to use and thus will have expectations of behaviour
993	   that must be maintained, in order to adhere to the application
994	   compatibility goals.

996	   Interactions with applications are presented in [8] - including, but
997	   not limited to, performance changes that may be expected, semantic
998	   changes, and new features that may be requested through an enhanced
999	   API.

1001	   TCP features the ability to send "Urgent" data, the delivery of which
1002	   to the application may or may not be out-of-band.  The use of this
1003	   feature is not recommended due to security implications and
1004	   implementation differences [20].  MPTCP requires contiguous data to
1005	   support its Data Sequence Mapping over multiple segments, and
1006	   therefore the Urgent pointer cannot interrupt an existing mapping.
1007	   An MPTCP implementation MAY choose to support sending Urgent data,
1008	   and if it does, it SHOULD send the Urgent data on the soonest
1009	   available unassigned subflow sequence space.  Incoming Urgent data
1010	   SHOULD be mapped to connection-level sequence space and delivered to
1011	   the application analogous to Urgent data in regular TCP.

1013	6.2.  Interactions with Management Systems

1015	   To enable interactions between TCP and network management systems,
1016	   the TCP [21] and TCP Extended Statistics (ESTATS) [22] MIBs have been
1017	   defined.  MPTCP should share these MIBs for aspects that are
1018	   designed to be transparent to the application.

1020	   It is anticipated that an MPTCP MIB will be defined in the future,
1021	   once experience of experimental MPTCP deployments is gathered.  This
1022	   MIB would provide access to MPTCP-specific properties such as whether
1023	   MPTCP is enabled, and the number and properties of the individual
1024	   paths in use.

1026	7.  Interactions with Middleboxes

1028	   As discussed in Section 2.2, it is a goal of MPTCP to be deployable
1029	   today and thus compatible with the majority of middleboxes.  This
1030	   section summarises the issues that may arise with NATs, firewalls,
1031	   proxies, intrusion detection systems, and other middleboxes that, if
1032	   not considered in the protocol design, may hinder its deployment.

1034	   This section is intended primarily as a description of options and
1035	   considerations only.  Protocol-specific solutions to these issues
1036	   will be given in the companion documents.

1038	   Multipath TCP will be deployed in a network that no longer provides
1039	   just basic datagram delivery.  A myriad of middleboxes are deployed
1040	   to optimize various perceived problems with the Internet protocols:
1041	   NATs primarily address IP address space shortage [15], Performance
1042	   Enhancing Proxies (PEPs) optimize TCP for different link
1043	   characteristics [17], firewalls [16] and intrusion detection systems
1044	   try to block malicious content from reaching a host, and traffic
1045	   normalizers [23] ensure a consistent view of the traffic stream to
1046	   Intrusion Detection Systems (IDS) and hosts.

1048	   All these middleboxes optimize current applications at the expense of
1049	   future applications.  In effect, future applications will often need
1050	   to behave in a similar fashion to existing ones, in order to increase
1051	   the chances of successful deployment.  Further, the precise behaviour
1052	   of all these middleboxes is not clearly specified, and implementation
1053	   errors make matters worse, raising the bar for the deployment of new
1054	   technologies.

1056	   The following list of middlebox classes documents behaviour that
1057	   could impact the use of MPTCP.  This list is used in [5] to describe
1058	   the features of the MPTCP protocol that are used to mitigate the
1059	   impact of these middlebox behaviours.

1061	   o  NATs: Network Address Translators decouple the host's local IP
1062	      address (and, in the case of NAPTs, port) from that which is seen
1063	      in the wider Internet when the packets are transmitted through a
1064	      NAT.  This adds complexity, and reduces the chances of success,
1065	      when signalling IP addresses.

1067	   o  PEPs: Performance Enhancing Proxies, which aim to improve the
1068	      performance of protocols over low-performance (e.g. high latency
1069	      or high error rate) links.  As such, they may "split" a TCP
1070	      connection and behaviour such as proactive ACKing may occur, and
1071	      therefore it is no longer guaranteed that one host is
1072	      communicating directly with another.  PEPs, firewalls or other
1073	      middleboxes may also change the declared receive window size.

1075	   o  Traffic Normalizers: These aim to eliminate ambiguities and
1076	      potential attacks at the network level, and amongst other things
1077	      are unlikely to permit holes in TCP-level sequence space (which
1078	      has impact on MPTCP's retransmission and subflow sequence
1079	      numbering design choices).

1081	   o  Firewalls: on top of preventing incoming connections, firewalls
1082	      may also attempt additional protection such as sequence number
1083	      randomization (so a sender cannot reliably know what TCP sequence
1084	      number the receiver will see).

1086	   o  Intrusion Detection Systems: IDSs may look for traffic patterns to
1087	      protect a network, and may have false positives with MPTCP and
1088	      drop the connections during normal operation.  Future MPTCP-aware
1089	      middleboxes will require the ability to correlate the various
1090	      paths in use.

1092	   o  Content-aware Firewalls: Some middleboxes may actively change data
1093	      in packets, such as re-writing URIs in HTTP traffic.
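
Two of the behaviours listed above (NAPT address rewriting and firewall
sequence number randomization) can be illustrated with a minimal sketch.
It is a toy model, not part of the MPTCP design: the Segment type, its
field names, and the helper functions are all hypothetical, and the
addresses are documentation and private-range examples.

```python
import random
from dataclasses import dataclass, replace

SEQ_SPACE = 2**32  # TCP sequence numbers are 32 bits wide


@dataclass(frozen=True)
class Segment:
    """Toy model of a segment; every field name here is hypothetical."""
    src_ip: str
    src_port: int
    seq: int
    signalled_ip: str  # address the sender advertises to its peer


def napt(seg: Segment, public_ip: str, public_port: int) -> Segment:
    """Rewrite only the header source address and port, as a NAPT would."""
    return replace(seg, src_ip=public_ip, src_port=public_port)


def make_seq_randomizer(rng: random.Random):
    """Return a per-connection function that shifts sequence numbers."""
    offset = rng.randrange(1, SEQ_SPACE)  # non-zero, fixed per connection

    def randomize(seg: Segment) -> Segment:
        return replace(seg, seq=(seg.seq + offset) % SEQ_SPACE)

    return randomize


sent = Segment("10.0.0.5", 40000, 1000, signalled_ip="10.0.0.5")
firewall = make_seq_randomizer(random.Random(1))
seen = firewall(napt(sent, "203.0.113.7", 61000))

# The advertised address no longer matches the address the peer observes,
# and the sender cannot predict the sequence number the peer sees.
print(seen.src_ip != seen.signalled_ip, seen.seq != sent.seq)
```

In this model the signalled address passes through unchanged while the
header address is rewritten, which is why signalling IP addresses
across a NAT is less likely to succeed.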

1095	   In addition, all classes of middleboxes may affect TCP traffic in the
1096	   following ways:

1098	   o  TCP Options: some middleboxes may drop packets with unknown TCP
1099	      options, or strip those options from the packets.

1101	   o  Segmentation and Coalescing: middleboxes (or even something as
1102	      close to the end host as TCP Segmentation Offloading (TSO) on a
1103	      Network Interface Card (NIC)) may change the packet boundaries
1104	      from those the sender intended.  They may do this by splitting
1105	      packets, or coalescing them together.  This leads to two major
1106	      impacts: we cannot guarantee where a packet boundary will be, and
1107	      we cannot say for sure what a middlebox will do with TCP options
1108	      in these cases (they may be repeated, dropped, or sent only once).
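
The two effects above can be sketched as follows.  This is an
illustrative model only, under stated assumptions: the Segment type,
the option names, and the three resegmentation policies ("repeat",
"first", "drop") are hypothetical labels for the outcomes described in
the text, not behaviour of any particular middlebox or NIC.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Segment:
    """Toy segment; options are modelled as plain strings (hypothetical)."""
    seq: int
    payload: bytes
    options: tuple = ()


def strip_unknown(seg: Segment, known: set) -> Segment:
    """Remove any option the middlebox does not recognise."""
    return replace(seg, options=tuple(o for o in seg.options if o in known))


def resegment(seg: Segment, mss: int, policy: str) -> list:
    """Split a segment into pieces of at most mss payload bytes.

    policy selects what happens to the options on the pieces:
    "repeat" copies them to every piece, "first" keeps them on the
    first piece only, and "drop" removes them everywhere.
    """
    if policy not in ("repeat", "first", "drop"):
        raise ValueError("unknown policy: " + policy)
    pieces = []
    for off in range(0, len(seg.payload), mss):
        keep = policy == "repeat" or (policy == "first" and off == 0)
        opts = seg.options if keep else ()
        pieces.append(Segment(seg.seq + off, seg.payload[off:off + mss], opts))
    return pieces


original = Segment(100, b"abcdefghij", ("new-option",))
for policy in ("repeat", "first", "drop"):
    print(policy, [(p.seq, p.payload, p.options)
                   for p in resegment(original, 4, policy)])
```

Whichever policy applies, the receiver sees three segments where the
sender emitted one, so a design cannot rely on packet boundaries or on
an option arriving exactly once per original packet.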

1110	8.  Contributors

1112	   The authors would like to acknowledge the contributions of Andrew
1113	   McDonald and Bryan Ford to this document.

1115	   The authors would also like to thank the following people for
1116	   detailed reviews: Olivier Bonaventure, Gorry Fairhurst, Iljitsch van
1117	   Beijnum, Philip Eardley, Michael Scharf, Lars Eggert, Cullen
1118	   Jennings, Joel Halpern, Juergen Quittek, Alexey Melnikov, David
1119	   Harrington, Jari Arkko and Stewart Bryant.

1121	9.  Acknowledgements

1123	   Alan Ford, Costin Raiciu, Mark Handley, and Sebastien Barre are
1124	   supported by Trilogy (http://www.trilogy-project.org), a research
1125	   project (ICT-216372) partially funded by the European Community under
1126	   its Seventh Framework Program.  The views expressed here are those of
1127	   the author(s) only.  The European Commission is not liable for any
1128	   use that may be made of the information in this document.

1130	10.  IANA Considerations

1132	   None.

1134	11.  Security Considerations

1136	   This informational document provides an architectural overview for
1137	   Multipath TCP and so does not, in itself, raise any security issues.
1138	   A separate threat analysis [12] lists threats that can exist with
1139	   Multipath TCP.  However, a protocol based on the architecture in this
1140	   document will have a number of security requirements.  The high level
1141	   goals for such a protocol are identified in Section 2.3, whilst
1142	   Section 5.8 provides more detailed discussion of security
1143	   requirements and design decisions which are applied in the MPTCP
1144	   protocol design [5].

1146	12.  References

1148	12.1.  Normative References

1150	   [1]   Postel, J., "Transmission Control Protocol", STD 7, RFC 793,
1151	         September 1981.

1153	   [2]   Bradner, S., "Key words for use in RFCs to Indicate Requirement
1154	         Levels", BCP 14, RFC 2119, March 1997.

1156	12.2.  Informative References

1158	   [3]   Wischik, D., Handley, M., and M. Bagnulo Braun, "The Resource
1159	         Pooling Principle", ACM SIGCOMM CCR vol. 38 num. 5, pp. 47-52,
1160	         October 2008,
1161	         <http://ccr.sigcomm.org/online/files/p47-handleyA4.pdf>.

1163	   [4]   Hopps, C., "Analysis of an Equal-Cost Multi-Path Algorithm",
1164	         RFC 2992, November 2000.

1166	   [5]   Ford, A., Raiciu, C., and M. Handley, "TCP Extensions for
1167	         Multipath Operation with Multiple Addresses",
1168	         draft-ietf-mptcp-multiaddressed-02 (work in progress),
1169	         October 2010.

1171	   [6]   Stewart, R., "Stream Control Transmission Protocol", RFC 4960,
1172	         September 2007.

1174	   [7]   Raiciu, C., Handley, M., and D. Wischik, "Coupled Congestion
1175	         Control for Multipath Transport Protocols",
1176	         draft-ietf-mptcp-congestion-01 (work in progress),
1177	         January 2011.

1179	   [8]   Scharf, M. and A. Ford, "MPTCP Application Interface
1180	         Considerations", draft-ietf-mptcp-api-00 (work in progress),
1181	         November 2010.

1183	   [9]   Carpenter, B. and S. Brim, "Middleboxes: Taxonomy and Issues",
1184	         RFC 3234, February 2002.

1186	   [10]  Carpenter, B., "Internet Transparency", RFC 2775,
1187	         February 2000.

1189	   [11]  Mathis, M., Mahdavi, J., Floyd, S., and A. Romanow, "TCP
1190	         Selective Acknowledgment Options", RFC 2018, October 1996.

1192	   [12]  Bagnulo, M., "Threat Analysis for TCP Extensions for Multi-path
1193	         Operation with Multiple Addresses", draft-ietf-mptcp-threat-07
1194	         (work in progress), January 2011.

1196	   [13]  Becke, M., Dreibholz, T., Iyengar, J., Natarajan, P., and M.
1197	         Tuexen, "Load Sharing for the Stream Control Transmission
1198	         Protocol (SCTP)", draft-tuexen-tsvwg-sctp-multipath-01 (work in
1199	         progress), December 2010.

1201	   [14]  Ford, B. and J. Iyengar, "Breaking Up the Transport Logjam",
1202	         ACM HotNets, October 2008.

1204	   [15]  Srisuresh, P. and K. Egevang, "Traditional IP Network Address
1205	         Translator (Traditional NAT)", RFC 3022, January 2001.

1207	   [16]  Freed, N., "Behavior of and Requirements for Internet
1208	         Firewalls", RFC 2979, October 2000.

1210	   [17]  Border, J., Kojo, M., Griner, J., Montenegro, G., and Z.
1211	         Shelby, "Performance Enhancing Proxies Intended to Mitigate
1212	         Link-Related Degradations", RFC 3135, June 2001.

1214	   [18]  Ramaiah, A., Stewart, R., and M. Dalal, "Improving TCP's
1215	         Robustness to Blind In-Window Attacks", RFC 5961, August 2010.

1217	   [19]  Eddy, W., "TCP SYN Flooding Attacks and Common Mitigations",
1218	         RFC 4987, August 2007.

1220	   [20]  Gont, F. and A. Yourtchenko, "On the Implementation of the TCP
1221	         Urgent Mechanism", RFC 6093, January 2011.

1223	   [21]  Raghunarayan, R., "Management Information Base for the
1224	         Transmission Control Protocol (TCP)", RFC 4022, March 2005.

1226	   [22]  Mathis, M., Heffner, J., and R. Raghunarayan, "TCP Extended
1227	         Statistics MIB", RFC 4898, May 2007.

1229	   [23]  Handley, M., Paxson, V., and C. Kreibich, "Network Intrusion
1230	         Detection: Evasion, Traffic Normalization, and End-to-End
1231	         Protocol Semantics", Usenix Security 2001, 2001,
1232	         <http://www.usenix.org/events/sec01/full_papers/handley/handley.pdf>.

1234	Appendix A.  Changelog

1236	   (For removal by the RFC Editor)

1238	A.1.  Changes since draft-ietf-mptcp-architecture-04

1240	   o  Responded to IETF Last Call and IESG review comments.

1242	A.2.  Changes since draft-ietf-mptcp-architecture-03

1244	   o  Responded to AD review comments.

1246	A.3.  Changes since draft-ietf-mptcp-architecture-02

1248	   o  Responded to WG last call review comments.  Included editorial
1249	      fixes, adding Section 2.4, and improving Section 5.4 and
1250	      Section 7.

1252	A.4.  Changes since draft-ietf-mptcp-architecture-01

1254	   o  Responded to review comments.

1256	   o  Added security sections.

1258	A.5.  Changes since draft-ietf-mptcp-architecture-00

1260	   o  Added middlebox compatibility discussion (Section 7).

1262	   o  Clarified path identification (TCP 4-tuple) in Section 5.5.

1264	   o  Added brief scenario and diagram to Section 1.3.

1266	Authors' Addresses

1268	   Alan Ford
1269	   Roke Manor Research
1270	   Old Salisbury Lane
1271	   Romsey, Hampshire  SO51 0ZN
1272	   UK

1274	   Phone: +44 1794 833 465
1275	   Email: alan.ford@roke.co.uk

1276	   Costin Raiciu
1277	   University College London
1278	   Gower Street
1279	   London  WC1E 6BT
1280	   UK

1282	   Email: c.raiciu@cs.ucl.ac.uk

1284	   Mark Handley
1285	   University College London
1286	   Gower Street
1287	   London  WC1E 6BT
1288	   UK

1290	   Email: m.handley@cs.ucl.ac.uk

1292	   Sebastien Barre
1293	   Universite catholique de Louvain
1294	   Pl. Ste Barbe, 2
1295	   Louvain-la-Neuve  1348
1296	   Belgium

1298	   Phone: +32 10 47 91 03
1299	   Email: sebastien.barre@uclouvain.be

1301	   Janardhan Iyengar
1302	   Franklin and Marshall College
1303	   Mathematics and Computer Science
1304	   PO Box 3003
1305	   Lancaster, PA  17604-3003
1306	   USA

1308	   Phone: 717-358-4774
1309	   Email: jiyengar@fandm.edu