idnits 2.17.1 

draft-singh-mptcp-plmt-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** There are 35 instances of too long lines in the document, the longest
     one being 7 characters in excess of 72.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD',
     or 'RECOMMENDED' is not an accepted usage according to RFC 2119.  Please
     use uppercase 'NOT' together with RFC 2119 keywords (if that is what you
     mean).
     
     Found 'MUST not' in this paragraph:
     
     PLMT provides two modes of operation, which differ by the time when
     the control connection is established: Parallel Setup and Late Setup. The
     Parallel Setup is significantly simpler for a Passive Opener, as
     Signatures are sent in the first bytes of a connection and therefore are
     simple to identify. But, unfortunately, the setup of a Control Connection
     for every data transfer with a short duration results in overhead and
     additional delay without any performance gains. This mode is therefore
     mainly useful if it is known in advance that a TCP connection will
     transport a large amount of data. In order to reduce the overhead for
     short connection, PLMT also allows that the Control Connection is
     established later than the Initial Connection. In this case, the PLMT
     Layer on a host MUST not initiate the TLV data encoding before the PLMT
     capability of the other host has been determined through the Control
     Connection, (cf. Figure 3).

  == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD',
     or 'RECOMMENDED' is not an accepted usage according to RFC 2119.  Please
     use uppercase 'NOT' together with RFC 2119 keywords (if that is what you
     mean).
     
     Found 'MUST not' in this paragraph:
     
     The Control Connection is used to determine the PLMT capability of
     the end hosts. The Initial Connection MUST not transport any data before
     the Control Connection is established and the PLMT Capability Exchange is
     completed. If the Control Connection setup or PLMT Capability Exchange
     fails, then the Initial Connection MUST not transmit data with TLV
     encoding but the legacy TCP bytestream.

  -- The document date (August 6, 2010) is 5002 days in the past.  Is this
     intentional?


  Checking references for intended status: Experimental
  ----------------------------------------------------------------------------

  ** Obsolete normative reference: RFC  793 (ref. '2') (Obsoleted by RFC 9293)

  ** Obsolete normative reference: RFC 5246 (ref. '3') (Obsoleted by RFC 8446)

  == Outdated reference: A later version (-07) exists of
     draft-ietf-mptcp-congestion-00

  == Outdated reference: A later version (-05) exists of
     draft-ietf-mptcp-architecture-01

  == Outdated reference: A later version (-12) exists of
     draft-ietf-mptcp-multiaddressed-01

  == Outdated reference: A later version (-08) exists of
     draft-ietf-mptcp-threat-02

  == Outdated reference: A later version (-04) exists of
     draft-scharf-mptcp-api-02

  == Outdated reference: A later version (-01) exists of
     draft-scharf-mptcp-mctcp-00


     Summary: 3 errors (**), 0 flaws (~~), 9 warnings (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

1	Internet Engineering Task Force                                A. Singh
2	Internet Draft                                     University of Bremen
3	Intended status: experimental                                 M. Scharf
4	Expires: January 2011                          Alcatel-Lucent Bell Labs
5	                                                         August 6, 2010

7	        PayLoad Multi-connection Transport using Multiple Addresses
8	                       draft-singh-mptcp-plmt-00.txt

10	Abstract

12	   The single path transport provided by the Transmission Control
13	   Protocol (TCP) can be extended to a multipath transport session for
14	   multi-homed end hosts by coupling several TCP connections over
15	   multiple interfaces of the end hosts. Payload Multi-connection
16	   Transport (PLMT) is a multipath protocol variant that encodes all
17	   the control/signaling information in the payload of TCP connections
18	   and therefore requires no additional TCP options. PLMT allows the
19	   simultaneous use of multiple connections over potentially disjoint
20	   paths while remaining mostly backward compatible with single-path
21	   TCP transport. PLMT operates as an additional protocol layer
22	   between the network stack and the application layer. This document
23	   describes PLMT as an example of a multipath mechanism that could
24	   possibly be realized entirely in the user space of an operating
25	   system.

27	Status of this Memo

29	   This Internet-Draft is submitted in full conformance with the
30	   provisions of BCP 78 and BCP 79.

32	   Internet-Drafts are working documents of the Internet Engineering
33	   Task Force (IETF).  Note that other groups may also distribute
34	   working documents as Internet-Drafts. The list of current Internet-
35	   Drafts is at http://datatracker.ietf.org/drafts/current/.

37	   Internet-Drafts are draft documents valid for a maximum of six
38	   months and may be updated, replaced, or obsoleted by other documents
39	   at any time.  It is inappropriate to use Internet-Drafts as
40	   reference material or to cite them other than as "work in progress."

42	   This Internet-Draft will expire on January 6, 2011.

44	Copyright Notice

46	   Copyright (c) 2010 IETF Trust and the persons identified as the
47	   document authors. All rights reserved.

49	   This document is subject to BCP 78 and the IETF Trust's Legal
50	   Provisions Relating to IETF Documents
51	   (http://trustee.ietf.org/license-info) in effect on the date of
52	   publication of this document. Please review these documents
53	   carefully, as they describe your rights and restrictions with
54	   respect to this document. Code Components extracted from this
55	   document must include Simplified BSD License text as described in
56	   Section 4.e of the Trust Legal Provisions and are provided without
57	   warranty as described in the Simplified BSD License.

59	Table of Contents

61	   1. Introduction...................................................5
62	   2. Terminology....................................................6
63	   3. Design Considerations..........................................7
64	      3.1. Goals.....................................................7
65	      3.2. Layered Representation....................................8
66	      3.3. Operation Summary.........................................8
67	      3.4. Compatibility............................................11
68	      3.5. Advantages and Drawbacks of PLMT.........................11
69	   4. PLMT Protocol.................................................16
70	      4.1. Session Initiation.......................................16
71	      4.2. Exchange of PLMT Signaling Over the PLMT Control Channel.16
72	         4.2.1. Establishment of the Control Connection.............16
73	         4.2.2. PLMT Capable Messages...............................17
74	         4.2.3. Further Usage of the Control Connection.............19
75	         4.2.4. Discussion of Control Connection Failure Cases......20
76	      4.3. PLMT Data Connection Setup and Operation.................20
77	         4.3.1. Guidelines for selection of a Signature.............21
78	         4.3.2. Bundling of Initial Connection to the Control
79	         Connection in Parallel Setup...............................21
80	         4.3.3. Bundling of Initial Connection to the Control
81	         Connection in Late Setup...................................23
82	      4.4. Additional Subflow Connections Initiation and Operation..24
83	         4.4.1. Address Advertisement...............................24
84	         4.4.2. Subflow Connection Setup............................25
85	         4.4.3. TLV Encoding of Data Segments.......................26
86	         4.4.4. Data Acknowledgments................................26
87	      4.5. Other Aspects............................................27
88	         4.5.1. Congestion Control..................................27
89	         4.5.2. Path Management and Scheduling......................28
90	         4.5.3. Closing Connections and Sessions....................28
91	   5. Interaction with Middleboxes..................................28
92	      5.1. Middleboxes that Translate Address/Ports.................29
93	      5.2. Middleboxes that Manipulate TCP Options..................29
94	      5.3. Middleboxes that Parse Content...........................29
95	      5.4. Middleboxes that Change Content..........................30
96	   6. Security Considerations.......................................30
97	      6.1. Reappearance of Signature in Application Data............30
98	      6.2. Resilience against Malicious Attacks.....................31
99	   7. Open Issues...................................................31
100	   8. IANA Considerations...........................................31
101	   9. Conclusion....................................................32
102	   10. References...................................................32
103	      10.1. Normative References....................................32
104	      10.2. Informative References..................................32
105	   11. Acknowledgments..............................................33

107	1. Introduction

109	   The objective of a multipath transport mechanism is to allow the
110	   simultaneous use of multiple connections over multiple paths. A
111	   multipath transport mechanism is expected to be beneficial since it
112	   enhances the network resource utilization and since it provides
113	   resilience to node failures in the network [5].

115	   One key mechanism that aims to provide multipath transport is
116	   Multipath TCP (MPTCP). MPTCP enables multipath transport by
117	   utilizing multiple addresses of the end host to establish multiple
118	   paths (subflows) for a TCP connection [6]. MPTCP extends the
119	   standard Transmission Control Protocol (TCP) [2] to add the
120	   multipath capability and uses several new TCP options to encode
121	   control/signaling information.

123	   Another multipath transport solution, MCTCP [9], uses new TCP
124	   options only during connection setup to transport signaling
125	   information. Afterwards, the additional signaling information is
126	   sent together with the application data in the payload using a
127	   type-length-value (TLV) framing format.

129	   This document presents the Payload Multi-connection Transport (PLMT)
130	   protocol design as a further alternative multipath transport
131	   mechanism. PLMT also uses a type-length-value (TLV) framing format
132	   to send application data and control/signaling information. However,
133	   unlike other multipath transport solutions, PLMT does not use new
134	   TCP options to transmit control/signaling information. Instead,
135	   PLMT sets up a control connection to a well-known port for the
136	   signaling information exchange, and it uses payload encoding
137	   over standard TCP connections. The control connection can be set
138	   up either before starting the data transport or afterwards. In
139	   either case, it is possible to implement the PLMT signaling without
140	   changing the network stack. Each PLMT connection is a standard TCP
141	   connection that transports TLV-encoded data segments, and all such
142	   connections are coupled together into one PLMT session.

144	   Therefore, PLMT is easily deployable and extensible. PLMT is also
145	   transparent to applications and offers reliable transport similar to
146	   a standard TCP connection. PLMT is also mostly backward compatible
147	   with single-path standard TCP. By design, PLMT operates robustly in
148	   environments with middleboxes that prevent the use of new TCP
149	   options. But the use of out-of-band signaling also comes at some
150	   cost concerning complexity, fall-back options, and security.
151	   However, as outlined in this document, PLMT is designed to minimize
152	   these risks and is rather robust. This document presents PLMT and
153	   discusses both the advantages and drawbacks of its design.

155	2. Terminology

157	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
158	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
159	   document are to be interpreted as described in RFC 2119 [1].

161	   This document uses the terminology defined in [5][6], though some of
162	   the terms are re-defined.

164	   Session: A connection over which an application can communicate
165	   between two hosts. For an application, there is a one-to-one mapping
166	   between a session and the socket. If a session includes only the
167	   initial connection, it is almost identical to a standard TCP
168	   connection.

170	   PLMT Control Port: A port allocated to accept the PLMT control
171	   connections.

173	   PLMT Layer: A protocol layer implementing the multi-connection
174	   capability of PLMT. It can, for instance, be realized in the user
175	   space of an operating system.

177	   Initial Connection: A TCP connection established by an application
178	   request. If both ends are PLMT capable, the first subflow uses this
179	   connection.

181	   Additional Subflow Connection: A new TCP connection established for
182	   a subsequent subflow.

184	   Control Connection: A TCP connection that is established to the PLMT
185	   Control Port. The IP addresses are identical to the Initial
186	   Connection.

188	   PLMT Data Segment: The segmented application data with a TLV header.

190	   Active Opener: Refers to the TCP client for a Session with a PLMT
191	   Layer.

193	   Passive Opener: Refers to the TCP server for a Session with a PLMT
194	   Layer.

196	   Legacy End-host: Refers to a host without a PLMT Layer.

198	   Token: A 64-bit number that is unique on a host.

200	   Signature: A long bit pattern that is used to identify PLMT messages
201	   inside TCP connections. Its length is 16 bytes (128 bits). It MUST
202	   be selected such that it is unlikely to occur in application
203	   protocols. Guidelines for selecting a Signature are given in
204	   Section 4.3.1.

206	   Session Sequence Number: The sequence number of a byte inside a byte
207	   stream of a session, determined by the PLMT Layer.
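
   As a rough illustration of how a receiver might use a Signature, the
   following Python sketch classifies the first bytes of an incoming
   connection. The function name, the return labels, and the example
   Signature value are assumptions made for this sketch and are not
   defined by PLMT; only the 16-byte Signature length is taken from
   the definition above.

```python
SIGNATURE_LEN = 16  # bytes, per the Signature definition (128 bits)


def classify_prefix(received: bytes, signature: bytes) -> str:
    """Classify the first bytes of a connection (hypothetical helper).

    Returns "plmt" if the stream starts with the full Signature,
    "undecided" if more bytes are needed to tell, and "legacy" otherwise.
    """
    if len(signature) != SIGNATURE_LEN:
        raise ValueError("a Signature must be exactly 16 bytes long")
    head = received[:SIGNATURE_LEN]
    if head == signature:
        return "plmt"
    # A shorter prefix that still matches (including no data at all)
    # cannot be decided yet; the caller has to wait or time out.
    if signature.startswith(head):
        return "undecided"
    return "legacy"


# Example Signature chosen only for illustration.
EXAMPLE_SIGNATURE = bytes(range(0xF0, 0x100))
```

   An "undecided" result covers a connection that has so far delivered
   fewer bytes than the Signature length, which is why a receiver may
   have to wait for more data or time out before it can decide.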

209	3. Design Considerations

211	   This section gives a high-level overview of PLMT's design.

213	    3.1. Goals

215	   Important design assumptions and goals of the PLMT design are:

217	     o  No change of network stack: PLMT is designed to minimize the
218	        impact on the network stack implementation. The signaling can
219	        be completely implemented in the user-space of an operating
220	        system.

222	     o  Backward compatible: PLMT should be backward compatible with
223	        standard TCP. A PLMT Session with a single connection should
224	        behave exactly like a standard TCP connection. As long as only
225	        one connection exists, it is not necessary to use TLV framing
226	        on that connection.

228	     o  Co-existence with standard TCP connections: A PLMT capable end
229	        host must be able to differentiate between PLMT connections and
230	        regular TCP connections. This is crucial, since PLMT
231	        connections use TLV encoding.

233	     o  Multihomed and multiaddressed end hosts: PLMT assumes that for
234	        the establishment of multiple connections at least one of the
235	        end hosts must be multihomed and multiaddressed.

237	     o  Middlebox compatibility: PLMT should be compatible with the
238	        vast majority of middleboxes, such as NAPT middleboxes and
239	        firewalls. Therefore, PLMT should not rely on TCP extensions.
240	        PLMT should also allow a middlebox to identify that a host
241	        establishes PLMT connections, and to prevent this.

243	     o  Transparency: PLMT should be transparent to legacy
244	        applications, i.e., it should provide the same API and
245	        services as standard TCP to the application.

247	    3.2. Layered Representation

249	   PLMT operates as an additional protocol layer (shim layer) between
250	   the application layer and the transport layer. It is designed to be
251	   transparent to both higher and lower layers and to be implemented in
252	   the user space. It can be used by legacy applications without any
253	   changes. Figure 1 illustrates this layering.

255	                                        +-------------------------------+
256	                                        |           Application         |
257	    +-----------------------------+     +-------------------------------+
258	    |          Application        |     |             PLMT              |
259	    +-----------------------------+     +---------------+---------------+
260	    |             TCP             |     |      TCP      |      TCP      |
261	    +-----------------------------+     +---------------+---------------+
262	    |             IP              |     |       IP      |       IP      |
263	    +-----------------------------+     +---------------+---------------+
264	       Figure 1 Comparison of Standard TCP and PLMT Protocol Stacks

266	    3.3. Operation Summary

268	   This section gives an overview of the high-level operation of
269	   PLMT. Figure 2 depicts a simple scenario to illustrate the basic
270	   PLMT operation. A detailed PLMT protocol specification and operation
271	   description is provided in Section 4.

273	     o  A legacy application, unaware of the presence of PLMT, will
274	        initiate a standard TCP connection by opening a TCP socket for
275	        a Session. PLMT-aware applications MAY use a new application
276	        interface [8] to control the functioning of PLMT.

278	     o  The PLMT Layer then manages the establishment of the Initial
279	        Connection, the Control Connection, and additional Subflow
280	        Connections.

282	     o  In order to enable PLMT, the Active Opener opens a PLMT
283	        Control Connection to a well-known port at the Passive Opener.
284	        The Control Connection is used to determine whether the remote
285	        end supports PLMT, and to exchange the necessary control
286	        information such as the Tokens. Both the Control Connection and
287	        the Subflow Connections are established in the standard TCP
288	        way by the PLMT Layer.

290	     o  A node may set up a Control Connection either before or in
291	        parallel with the setup of the Initial Connection (see Figure
292	        2). Alternatively, it may first use the Initial Connection and
293	        decide later to open the Control Connection. The latter case is
294	        discussed in Section 4.3.3. The Control Connection must be set
295	        up using the same IP source and destination addresses as the
296	        Initial Connection, and must use the PLMT Control Port. If the
297	        setup of the Control Connection fails, PLMT is not enabled and
298	        the Session falls back to standard TCP.

300	     o  If the Passive Opener supports PLMT and TLV transport is
301	        successfully enabled, the Initial Connection will use TLV
302	        framing for data transmission. Then, the Initial Connection is
303	        also termed the first Subflow Connection. The setup of the TCP
304	        connections between two hosts A and B is illustrated in Figure
305	        2. PLMT signals the use of TLV encoding by sending the
306	        Signature in the payload of the TCP byte stream. The Signature
307	        is a long bit pattern that is selected in such a way that it is
308	        unlikely to occur in a TCP connection not using PLMT.
309	        Furthermore, Tokens are used to verify that the Initial and
310	        Control Connections indeed originate at the same hosts. A
311	        detailed analysis of the security implications of PLMT and the
312	        resulting very small risk of false positives when detecting its
313	        connections is provided in Section 6.

315	     o  If multiple interfaces are present, PLMT can establish
316	        multiple Subflow Connections to allow data transport over
317	        multiple paths. Once TLV-encoded data transport is activated, a
318	        Session-level data sequence number is used for in-order
319	        delivery of the Data Segments over multiple Subflow
320	        Connections. The PLMT Layer manages the multiple interfaces and
321	        connections and sends the Data Segments over the different
322	        connections. At the receiver, the PLMT Layer reassembles the
323	        byte stream and transparently delivers it to the application.

325	     o  As the Subflow Connections are standard TCP connections, they
326	        are terminated like regular TCP connections, with the four-way
327	        FIN handshake. The Session is terminated with the termination
328	        of the last subflow.
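
   The receiver-side reassembly described in the list above can be
   sketched in Python as follows. This is only an illustration under
   stated assumptions: each Data Segment is assumed to carry the
   Session Sequence Number of its first byte, segments are assumed not
   to overlap partially, and the class and method names are invented
   for this sketch rather than taken from PLMT.

```python
class SessionReassembler:
    """Reorders Data Segments by Session Sequence Number (sketch only)."""

    def __init__(self, initial_seq: int = 0) -> None:
        self.next_seq = initial_seq  # next byte expected by the application
        self.pending = {}            # sequence number -> buffered payload

    def receive(self, seq: int, data: bytes) -> bytes:
        """Accept one segment and return the bytes now deliverable in order."""
        # Segments that start before next_seq were already delivered and
        # are dropped here; partial overlaps are not handled.
        if seq >= self.next_seq and data:
            self.pending.setdefault(seq, data)
        out = bytearray()
        # Drain every buffered segment that continues the byte stream.
        while self.next_seq in self.pending:
            chunk = self.pending.pop(self.next_seq)
            out += chunk
            self.next_seq += len(chunk)
        return bytes(out)
```

   A segment that arrives early on one Subflow Connection is held back
   until the gap before it has been filled by segments arriving on
   another, so the application only ever sees an in-order byte stream.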

330	            End-host A                                 End-host B
331	     ---------------------------             ---------------------------
332	      Address A1     Address A2               Address B1     Address B2
333	     ------------   ------------             ------------   ------------
334	           |               |                          |              |
335	           |     (Initial Connection setup)           |              |
336	           |---------------SYN----------------------->|              |
337	           |<------------SYN/ACK----------------------|              |
338	           |---------------ACK----------------------->|              |
339	           |               |                          |              |
340	           |     (Control Connection setup)           |              |
341	           |~~~~~~~~~~~~~~~SYN~~~~~~~~~~~~~~~~~~~~~~~>|              |
342	           |<~~~~~~~~~~~~SYN/ACK~~~~~~~~~~~~~~~~~~~~~~|              |
343	           |~~~~~~~~~~~~~~~ACK~~~~~~~~~~~~~~~~~~~~~~~>|              |
344	           |               |                          |              |
345	           | (Token exchange over Control Connection  |              |
346	           |       as detailed in Figure 3)           |              |
347	           |~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~>|              |
348	           |<~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~|              |
349	           |               |                          |              |
350	           |           Signature+Token                |              |
351	           |----------------------------------------->|              |
352	           |           Signature+Token                |              |
353	           |<-----------------------------------------|              |
354	           |               |                          |              |
355	           |     (TLV-encoded Data Segments           |              |
356	           |     over the Initial Connection)         |              |
357	           |----------------------------------------->|              |
358	           |<-----------------------------------------|              |
359	           |               |                          |              |
360	           |(Address Exchange over the Control or Initial Connection)|
361	           |               |                          |              |
362	           |       (Additional Subflow Connection setup (TCP))       |
363	           |==========================SYN===========================>|
364	           |<=======================SYN/ACK==========================|
365	           |==========================ACK===========================>|
366	           |               |                          |              |
367	           |        (Signature and TLV-encoded Data Segments         |
368	           |              over the Subflow Connection)               |
369	           |========================================================>|
370	           |<========================================================|
371	           |               |                          |              |

373	     Figure 2 PLMT Connection Establishment when the Control Connection
374	               is set up in parallel with the Initial Connection

376	    3.4. Compatibility

378	   PLMT uses the Control Connection to detect whether a Passive Opener
379	   indeed supports its operation. If the setup of the Control
380	   Connection fails, PLMT falls back to standard TCP transport and
381	   does not use any additional PLMT signaling. PLMT is thus compatible
382	   with legacy TCP stacks and is able to detect them.
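
   The fall-back behavior can be illustrated with a short Python
   sketch. The port number, the timeout, and the function names below
   are placeholders chosen for this example, since no PLMT Control
   Port value is given here, and the PLMT Capability Exchange that
   would follow a successful connection setup is left out.

```python
import socket

# Placeholder value for illustration; not an assigned PLMT Control Port.
PLMT_CONTROL_PORT = 49151


def open_control_connection(host, port=PLMT_CONTROL_PORT, timeout=1.0,
                            connect=socket.create_connection):
    """Try to reach the Control Port; return a socket, or None on failure."""
    try:
        return connect((host, port), timeout)
    except OSError:
        # Refused, unreachable, or timed out: treat the peer as legacy.
        return None


def select_transport(host, connect=socket.create_connection):
    """Return ("plmt", control_socket) or ("legacy-tcp", None)."""
    ctrl = open_control_connection(host, connect=connect)
    if ctrl is None:
        return ("legacy-tcp", None)
    return ("plmt", ctrl)
```

   The connect parameter exists only so that the decision logic can be
   exercised without network access; a caller would normally leave it
   at its default.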

384	   The PLMT Layer is transparent to applications, i.e., it is
385	   compatible with legacy applications unaware of PLMT.

387	   The PLMT protocol does not require extensions to TCP and reuses
388	   the standard TCP mechanisms for the reliable, in-order
389	   operation of its connections. PLMT uses its own frame format based
390	   on the TLV encoding to send the application data and the control
391	   information. The use of TLV encoding is known from other TCP-based
392	   protocols such as TLS [3]. Therefore, PLMT should pass most
393	   middleboxes, in particular all middleboxes that would block TCP
394	   options. An exception is the case of middleboxes that parse the byte
395	   stream and block TLV content. In this case, PLMT transport may
396	   fail, as discussed in Sections 5.3 and 5.4.

398	   The signaling and message transport of PLMT can be implemented on a
399	   host without changing the network stack, i.e., as a library in the
400	   user space. With a combination of scheduling and rate shaping
401	   mechanisms, the PLMT Layer can also try to emulate congestion
402	   control coupling algorithms such as [4]. In this case, it may be
403	   possible to implement PLMT entirely in the user space of a host.

405	    3.5. Advantages and Drawbacks of PLMT

407	   PLMT follows the principles outlined for a multipath transport
408	   solution based on TCP in [5]. PLMT uses the TCP payload to transport
409	   signaling messages and requires no new TCP options. Thus, PLMT
410	   brings along all advantages of the payload encoding mechanism (cf.
411	   [9]):

413	     o  PLMT does not use any TCP option to set up its connections.
414	        Therefore, it might be possible to implement PLMT entirely in
415	        the user space, which would significantly facilitate deployment
416	        of PLMT.

418	     o  In addition, the signaling messages are not constrained by
419	        the limited size of the TCP options, and PLMT does not consume
420	        further option space in SYN segments.

422	     o  PLMT does not modify TCP and is therefore compatible with many
423	        middleboxes, especially ones which do not allow unknown TCP
424	        options to get through, or ones that re-write the TCP options.

426	     o  Middleboxes can very easily identify the setup of a PLMT
427	        Control Connection due to the use of a well-known port. If a
428	        middlebox on the path of the Initial Connection wants to
429	        prevent the use of multipath transport, it can simply block the
430	        connection setup to that port. Then, multipath transport will
431	        not be used for the corresponding connection.

433	   PLMT is developed as an example of a multipath transport protocol
434	   that does not use any new TCP option, or other TCP extensions, and
435	   that is still backward compatible. Still, due to the use of payload
436	   encoding and an out-of-band control channel for the exchange of
437	   control information, a number of issues arise. The following text
438	   discusses these problems (some of which may exist for other
439	   multipath transport solutions as well) and possible solutions.

441	     o  PLMT opens a Control Connection per PLMT Session, i.e., an
442	        additional TCP connection. If a host opens Control Connections
443	        for every short TCP-based transfer, this would result in a
444	        large number of additional connection setups, which would
445	        consume bandwidth, processing resources, and port numbers. The
446	        worst case is that a PLMT Control Connection is set up for
447	        every Initial Connection, but additional subflows are never
448	        established. Then, the number of TCP connections doubles without
449	        any performance benefit. As a remedy, PLMT can also first use
450	        an Initial Connection without a Control Connection, and try to
451	        establish the Control Connection after some time. Once PLMT
452	        capability is detected and additional signaling information has
453	        been exchanged, the Initial Connection as well as potential
454	        additional Subflow Connections can then be used to transport
455	        PLMT TLV-encoded data traffic. This mechanism avoids needless
456	        Control Connection setups for short transfers.

458	     o  PLMT needs a well-known, dedicated port for the Control
459	        Connections, similar to TLS [3]. If PLMT is enabled on a host,
460	        it may try to establish Control Connections to that port for
461	        all communication partners. Even if heuristics can be used to
462	        learn whether or not servers support PLMT, and thus reduce
463	        the connection setup attempts, numerous legacy hosts in
464	        the Internet will receive connection setups on that port. To
465	        legacy systems, this may look similar to a SYN flooding attack.
466	        As a countermeasure, network administrators may configure
467	        firewalls to block the PLMT Control Port, which would prevent
468	        the use of the protocol once it is more widely deployed.

470	     o  Middleboxes that transparently change the length of content
471	        are a problem for multipath transport protocols. When using
472	        TLV-based transport, PLMT could detect such middleboxes by
473	        using a checksum, or by observing broken TLV headers, and try
474	        retransmissions. However, if the byte stream is transparently
475	        changed before switching to TLV encoding, difficulties can
476	        arise. For instance, the Signature may not be at the position
477	        where it is expected. In this case, PLMT cannot enter the TLV
478	        mode, but it cannot necessarily fall back either, and it may
479	        either have to cancel that transfer by closing the PLMT
480	        Session, or, in the worst case, it may even deliver corrupted
481	        data to an application.

483	     o  PLMT delays the setup of connections in various scenarios. If
484	        an Active Opener wants to use TLV encoding immediately on the
485	        Initial Connection, it must await the setup of the control
486	        connection. If there is no response (no SYN/ACK), the Active
487	        Opener may either retransmit the SYN, i. e., wait for a longer
488	        time, or give up. Then, multipath transport is not possible. In
489	        all cases, there is at least a small delay before the data
490	        transport over the Initial Connection can start. If the Active
491	        Opener decides to set up the Control Connection later, this
492	        delay is avoided. But then the Active Opener must stop data
493	        transmission after the setup of the Control Connection, in
494	        order to ensure a safe exchange of tokens, which interrupts the
495	        data transport.

497	     o  The Passive Opener has a significant processing overhead due
498	        to PLMT. First and most obviously, there is the overhead of
499	        maintaining the Control Connections, which can be significant
500	        for a highly-loaded server with thousands of connections.

502	     o  The second and trickier challenge is the distinction between
503	        legacy TCP connections and connections originating from hosts
504	        that use PLMT. PLMT Subflow Connections are characterized by
505	        the presence of the Signature in the byte stream. This means
506	        that the PLMT layer must accept all incoming connections, parse
507	        for the presence of a valid Signature, and then decide whether
508	        it is a legacy connection or a connection transporting PLMT
509	        content with TLV encoding. The parsing for Signatures is
510	        difficult if an incoming connection sends less data than the
511	        length of the Signature. If the first bytes match a valid
512	        Signature, or if no bytes are received at all, the PLMT layer
513	        must wait for the arrival of further data, or time out, e. g.,
514	        if the corresponding application does not send enough bytes. If
515	        it times out, the only safe option is to close the connection.
516	        This means that the PLMT layer may reject not only PLMT
517	        connections that suffered from retransmissions within the first
518	        bytes, but also valid TCP connection setups from legacy stacks if
519	        they happen to (partly) match a Signature. If the delayed setup
520	        of Control Connections is allowed, the parsing overhead is even
521	        larger. The PLMT layer must then parse all established TCP
522	        connections for all valid Signatures at the negotiated
523	        positions in the byte stream, which may also require temporary
524	        buffering of data, if only parts of a valid Signature are
525	        received, or if the rest of the first TLV message is missing.
526	        In all cases, the delivery of data to applications may be
527	        delayed.

529	     o  On a Passive Opener, the PLMT layer has to accept incoming
530	        connections in order to parse the payload, before it can hand
531	        over the connection to the application. This can delay data
532	        delivery, and may also result in inconsistent views of when the
533	        connection is actually established. Further studies are needed to
534	        understand whether the delay of connection establishment as
535	        seen by applications, which does not occur in case of option-
536	        based multipath protocols, could break existing applications.

538	     o  Due to the processing and buffer overhead required to identify
539	        connections by payload parsing, the Passive Opener is
540	        vulnerable to a Denial-of-Service (DoS) attack: An attacker can
541	        open a large number of Control Connections, which will consume
542	        resources on a server and slow down data delivery on other
543	        connections. Passive Openers can reduce the risk by only
544	        accepting Coupled Connections from source IP addresses that
545	        also originate an existing connection, but this does not offer
546	        complete protection, in particular if an attacker is sitting
547	        behind a large NAPT middlebox. Another remedy is to limit the
548	        number of allowed Control Connections, but then other users of
549	        PLMT suffer from the effects of Control Connection setup
550	        failures.

552	     o  PLMT must exchange the Token information in the payload of the
553	        Initial Connection, in order to verify that an Initial
554	        Connection and a Coupled Connection indeed have the same
555	        endpoints. This requires the transport of a TLV-encoded
556	        message. As a consequence, unlike other multipath transport
557	        protocols [6] [9], PLMT cannot fall back to a backward
558	        compatible byte stream transport if a middlebox on the path
559	        should block the TLV transport.

561	     o  If there is a single-homed Active Opener and a multi-homed
562	        Passive Opener, PLMT cannot indicate to the Active Opener that
563	        multipath transport may make sense, i. e., that it could
564	        establish a Control Connection, before that connection actually
565	        exists. Other multipath transport protocols [6] [9] have a
566	        signaling mechanism for this. PLMT can only detect this
567	        situation if it blindly opens Control Connections in all cases.

569	     o  If a middlebox does not intercept the information on the
570	        Control Connections, or if it does not know the Signature by
571	        other means, it cannot determine if a given TCP connection
572	        transports PLMT data, or not. If a middlebox is not on the path
573	        of the Control Connection, it cannot prevent the usage of TLV
574	        encoding. For the latter case, a possible remedy would be that
575	        Additional Subflow Connections use another well-known port,
576	        which could then be blocked.

578	     o  With a small probability, a Passive Opener can erroneously
579	        accept a connection from a legacy host as a PLMT Subflow
580	        Connection, if an application happens to send a bit pattern
581	        that is identical to one of the valid Signatures of that Passive
582	        Opener, plus the valid Tokens. This may either happen if the
583	        first bytes of a standard TCP connection match an active
584	        Signature, or if a corresponding bit pattern is present exactly
585	        at the same sequence position as negotiated on a control
586	        connection. In that case, TLV-encoded content will be injected
587	        into a legacy connection, which will be corrupted. Due to the
588	        length of the Signature, this error probability is very small.

590	     o  An attacker can abuse PLMT to break legacy TCP connections to
591	        a PLMT-enabled Passive Opener, if it is sitting behind the same
592	        NAPT middlebox as another Active Opener, as already
593	        explained. In this case, the attacker can open multiple Control
594	        Connections, not only as a DoS attack, but also to attack other
595	        users. With a very small probability, the Signature and Tokens
596	        negotiated over the Control Connection will match another
597	        connection. If so, TLV content will be injected on that
598	        connection, and it will break, too. Again, the success
599	        probability of this attack is very small.
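
   The start-of-connection check that a Passive Opener performs under
   Parallel Setup, as discussed in the list above, can be sketched as
   follows. This is a minimal, non-normative Python illustration. The
   names Verdict and classify_prefix are hypothetical and not defined
   by PLMT, and time-out handling is left out.

```python
from enum import Enum


class Verdict(Enum):
    LEGACY = "legacy"        # cannot be PLMT: hand over to the application
    NEED_MORE = "need_more"  # still a prefix of a valid Signature: keep waiting
    PLMT = "plmt"            # a full Signature was seen: switch to TLV parsing


def classify_prefix(received: bytes, valid_signatures) -> Verdict:
    """Classify the first bytes of an accepted TCP connection.

    valid_signatures is a list or set of 16-byte Signatures that this
    host currently accepts (hypothetical bookkeeping).
    """
    for sig in valid_signatures:
        if received[:len(sig)] == sig:
            return Verdict.PLMT
    for sig in valid_signatures:
        # Also true for zero received bytes, so the caller keeps waiting.
        if sig.startswith(received):
            return Verdict.NEED_MORE
    return Verdict.LEGACY
```

   A caller would re-run the check whenever more bytes arrive and
   close the connection if NEED_MORE persists until a time-out, which
   is the behavior the list above identifies as the only safe option.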

601	   In summary, PLMT is a multipath protocol that is designed as a
602	   payload-only solution. It is useful for controlled and trusted
603	   environments, for networks with middleboxes that affect the use of
604	   TCP options, and for use cases where it is impossible to change the
605	   network stack.

607	4. PLMT Protocol

609	  This section details the operation of the PLMT protocol.

611	4.1. Session Initiation

613	  A session initiation begins with an application request for a new TCP
614	  connection, upon which the PLMT protocol performs the following
615	  actions.

617	4.2. Exchange of PLMT Signaling Over the PLMT Control Channel

619	  A node MAY set up a TCP Control Connection before or in parallel with
620	  the setup of the Initial Connection (Parallel Setup), or it MAY
621	  set up the Control Connection at a later point in time (Late Setup).
622	  Both variants have advantages and drawbacks and affect how the
623	  Initial Connection is used.

625	      4.2.1. Establishment of the Control Connection

627	   The Active Opener MUST set up the TCP Control Connection using the
628	   same source and destination IP addresses as the Initial Connection,
629	   and it MUST be destined to the PLMT Control Port. If the TCP
630	   connection is successfully set up, this is a first indication that
631	   the Passive Opener indeed supports PLMT. In order to exclude the
632	   case that another service is accidentally running on that port,
633	   PLMT support is further verified by PLMT Capable Messages.

635	   A Passive Opener SHOULD verify whether there are already established
636	   TCP connections from the same Active Opener, in order to reduce the
637	   vulnerability to DoS attacks.
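
   The admission check recommended above can be sketched as a small
   bookkeeping table. This is a non-normative Python illustration
   under stated assumptions: the class name ControlAdmission, its
   methods, and the per-source limit are hypothetical, and peers are
   identified by source IP address only.

```python
from collections import Counter


class ControlAdmission:
    """Decide whether to accept a Control Connection from a source address."""

    def __init__(self, max_control_per_source: int = 4):  # limit is an assumption
        self._established = Counter()  # source IP -> established TCP connections
        self._control = Counter()      # source IP -> admitted Control Connections
        self._max = max_control_per_source

    def tcp_established(self, src_ip: str) -> None:
        self._established[src_ip] += 1

    def tcp_closed(self, src_ip: str) -> None:
        if self._established[src_ip] > 0:
            self._established[src_ip] -= 1

    def admit_control(self, src_ip: str) -> bool:
        # Require an already established TCP connection from this source.
        if self._established[src_ip] == 0:
            return False
        # Cap the number of Control Connections per source.
        if self._control[src_ip] >= self._max:
            return False
        self._control[src_ip] += 1
        return True
```

   Note that a strict check of this kind would reject a Control
   Connection that arrives before the Initial Connection, so an
   implementation that allows this ordering would need a more lenient
   policy.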

639	      4.2.2. PLMT Capable Messages

641	  If the Control Connection is set up successfully, the two hosts can
642	  be expected to have an operational PLMT Shim Layer. The end-hosts MUST
643	  exchange the Tokens as shown in Figure 3 for further validation of
644	  the existence of the PLMT Shim Layer and the willingness of the Passive
645	  Opener to use PLMT. Note that at this stage of the signaling the
646	  Passive Opener cannot safely identify the Initial Connection that
647	  this Control Connection shall be associated with.

649	            End-host A                                 End-host B
650	     ---------------------------             ---------------------------
651	      Address A1     Address A2               Address B1     Address B2
652	     ------------   ------------             ------------   ------------
653	           |               |                          |              |
654	           |     (Control Connection setup (TCP))     |              |
655	           |~~~~~~~~~~~~~~~SYN~~~~~~~~~~~~~~~~~~~~~~~>|              |
656	           |<~~~~~~~~~~~~SYN/ACK~~~~~~~~~~~~~~~~~~~~~~|              |
657	           |~~~~~~~~~~~~~~~ACK~~~~~~~~~~~~~~~~~~~~~~~>|              |
658	           |               |                          |              |
659	           |       (PLMT Capable Signaling)           |              |
660	           |~~~~~~~~~~PLMT Token Indication~~~~~~~~~~>|              |
661	           |<~~~~~~~~PLMT Token Confirmation~~~~~~~~~~|              |
662	           |               |                          |              |

664	   Figure 3 PLMT Signaling Exchange over the Control Connection

666	   The frame format of the PLMT Token Indication message is shown in
667	   Figure 4. The Token is a unique number for a host and is used to
668	   identify a particular PLMT Session. To make it harder for an
669	   attacker to guess the Token by a brute-force method, a 64-bit Token
670	   SHOULD be generated randomly [7]. Furthermore, the PLMT Token
671	   Indication message includes the Signature of the Active Opener, as
672	   well as the byte position at which this Signature will be present
673	   on the Initial Connection. The byte position is provided in the
674	   Token Indication in order to reduce the parsing overhead of a
675	   Passive Opener, and the risk that an attacker can hijack a
676	   connection by negotiating a large number of Signatures and Tokens
677	   with a Passive Opener. This implies that an Active Opener can only
678	   send data up to this position before it receives a PLMT Token
679	   Confirmation message. In case of a Parallel Setup, this position is
680	   set to 0, as the Signature is sent at the beginning of the
681	   connection. As a side note, the whole mechanism can fail if the
682	   bytestream length is affected by a middlebox.

685	                            1                   2                   3
686	       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
687	      +---------------+--------------------------------+--------------+
688	      |Kind=TOKENIND  |            Length=32           |   reserved   |
689	      +---------------+--------------------------------+--------------+
690	      :          Active Opener Signature (in total 16 byte)           :
691	      +---------------------------------------------------------------+
692	      :          Active Opener Token (in total 8 bytes)               :
693	      +---------------------------------------------------------------+
694	      |                    Signature offset (4 byte)                  |
695	      +---------------------------------------------------------------+

697	       Figure 4 PLMT Token Indication message (sent via the Control
698	                                Connection)
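
   As an illustration of the layout in Figure 4, the message can be
   encoded and decoded as sketched below. This is a non-normative
   Python example under stated assumptions: the Kind code point is
   hypothetical, the header is read as Kind (8 bits), Length (16
   bits) and reserved (8 bits), the Length field counts the whole
   message, and network byte order is used. None of these details are
   fixed by the text above.

```python
import secrets
import struct

KIND_TOKENIND = 0x01  # hypothetical code point, not assigned by the draft

# Kind (1), Length (2), reserved (1), Signature (16), Token (8), offset (4).
# "!" selects network byte order and disables padding, giving 32 bytes.
_TOKENIND = struct.Struct("!BHB16s8sI")


def pack_token_indication(signature: bytes, token: bytes, offset: int) -> bytes:
    if len(signature) != 16 or len(token) != 8:
        raise ValueError("Signature must be 16 bytes and Token 8 bytes")
    return _TOKENIND.pack(KIND_TOKENIND, _TOKENIND.size, 0,
                          signature, token, offset)


def unpack_token_indication(data: bytes):
    kind, length, _reserved, signature, token, offset = _TOKENIND.unpack(data)
    if kind != KIND_TOKENIND or length != _TOKENIND.size:
        raise ValueError("not a well-formed PLMT Token Indication")
    return signature, token, offset


signature = secrets.token_bytes(16)
token = secrets.token_bytes(8)   # 64-bit random Token
message = pack_token_indication(signature, token, 0)  # offset 0: Parallel Setup
```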

700	   In response to the PLMT Token Indication from the Active Opener,
701	   the Passive Opener SHOULD either send back its own Token in a PLMT
702	   Token Confirmation message, shown in Figure 5, or immediately close
703	   the Control Connection instead. The Token Confirmation message also
704	   echoes back the Active Opener's Token, in order to verify that the
705	   reply is indeed sent by a PLMT layer.

707	                            1                   2                   3
708	       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
709	      +---------------+--------------------------------+--------------+
710	      |Kind=TOKENCONF |             Length=36          |  reserved    |
711	      +---------------+--------------------------------+--------------+
712	      :         Passive Opener Signature (in total 16 byte)           :
713	      +---------------------------------------------------------------+
714	      :              Passive Opener Token (in total 8 byte)           :
715	      +---------------------------------------------------------------+
716	      :       Echo of Active Opener Token (in total 8 bytes)          :
717	      +---------------------------------------------------------------+

719	      Figure 5 PLMT Token Confirmation message (sent via the Control
720	                                Connection)
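
   The Token Confirmation of Figure 5 and the check of the echoed
   Token can be sketched in the same way. This is a non-normative
   Python example. The Kind code point and the function names are
   hypothetical, and the header layout and byte order are assumed as
   Kind (8 bits), Length (16 bits), reserved (8 bits) in network byte
   order.

```python
import hmac
import struct

KIND_TOKENCONF = 0x02  # hypothetical code point, not assigned by the draft

# Kind (1), Length (2), reserved (1), Signature (16), Token (8), echo (8).
_TOKENCONF = struct.Struct("!BHB16s8s8s")  # 36 bytes in total


def pack_token_confirmation(signature: bytes, token: bytes,
                            echoed_token: bytes) -> bytes:
    if len(signature) != 16 or len(token) != 8 or len(echoed_token) != 8:
        raise ValueError("unexpected field length")
    return _TOKENCONF.pack(KIND_TOKENCONF, _TOKENCONF.size, 0,
                           signature, token, echoed_token)


def check_token_confirmation(data: bytes, own_token: bytes):
    """Return (peer_signature, peer_token), or None if the message is invalid."""
    if len(data) != _TOKENCONF.size:
        return None
    kind, length, _reserved, signature, token, echo = _TOKENCONF.unpack(data)
    if kind != KIND_TOKENCONF or length != _TOKENCONF.size:
        return None
    # The echo must equal the Token this host sent in its Token Indication.
    if not hmac.compare_digest(echo, own_token):
        return None
    return signature, token
```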

722	   Upon reception of that message, the Active Opener MUST first verify
723	   the validity of the message (in particular the echoed Token). If the
724	   message is valid, it MUST send the Signature provided by the
725	   Passive Opener at the indicated byte position in the Initial
726	   Connection, directly followed by a PLMT Token Message. Afterwards,
727	   TLV framing has to be used. The Passive Opener must react similarly:
728	   After having received the Signature and Token on an Initial
729	   Connection, the Passive Opener MUST send the Active Opener's
730	   Signature and a PLMT Token Message over the Initial Connection, too,
731	   and use TLV framing afterwards. Thus, after having sent the
732	   Signature, the Active Opener must parse all incoming bytes on the
733	   Initial Connection for the Signature sent by the Passive Opener, in
734	   order to detect the beginning of TLV transfer in the reverse
735	   direction. In the simplest case, the Passive Opener has not sent any
736	   data in the meantime, i. e., the Signature is received immediately.
737	   However, other cases are possible, too.

739	   Note that this method is inefficient and also has a very small risk
740	   of false positives, as it requires byte-wise parsing of the byte
741	   stream. Yet, the fundamental problem is that the Passive Opener
742	   cannot provide a byte offset for the Signature over the Control
743	   Channel during the PLMT Capability Signaling phase, as the Initial
744	   Connection and the Control Connection cannot be associated at that
745	   time. As an optimization, the Passive Opener could provide a
746	   bytestream offset by a separate signaling message once it has
747	   received the Token on the Initial Connection, but PLMT cannot rely
748	   on this, as the Control Connection could fail or stall in the
749	   meantime and then the PLMT session would not be in a consistent state.
750	   The PLMT signaling exchange is designed to reflect an atomic
751	   transaction.
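
   The byte-wise parsing described above has to cope with a Signature
   that is split across several reads. The following non-normative
   Python sketch shows one way to do this by holding back only those
   trailing bytes that could still be the start of the Signature. The
   class name SignatureScanner is hypothetical. As noted above, a
   legacy byte stream that happens to contain the Signature would be
   misdetected.

```python
class SignatureScanner:
    """Find a Signature in a byte stream that arrives in chunks."""

    def __init__(self, signature: bytes):
        self.signature = signature
        self._tail = b""    # held-back bytes that may begin a Signature
        self.found = False

    def feed(self, chunk: bytes):
        """Return (legacy_bytes, tlv_bytes) for this chunk.

        legacy_bytes can be delivered to the application unchanged;
        tlv_bytes is everything after the Signature (empty until found).
        """
        if self.found:
            return b"", chunk
        data = self._tail + chunk
        idx = data.find(self.signature)
        if idx >= 0:
            self.found = True
            self._tail = b""
            return data[:idx], data[idx + len(self.signature):]
        # Keep the longest suffix that is a proper prefix of the Signature.
        keep = 0
        for n in range(min(len(self.signature) - 1, len(data)), 0, -1):
            if data.endswith(self.signature[:n]):
                keep = n
                break
        self._tail = data[len(data) - keep:]
        return data[:len(data) - keep], b""
```

   Bytes that are held back delay delivery to the application until
   the next read shows whether they belong to a Signature.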

753	      4.2.3. Further Usage of the Control Connection

755	   The Control Connection is only needed to exchange token information
756	   and to verify the association with the Initial Connection. After the
757	   PLMT capability exchange has been completed, the Control Connection
758	   is actually not needed any more, and it MAY be closed. All further
759	   control information, such as additional addresses etc., can also be
760	   exchanged over the Subflow Connections, by corresponding TLV
761	   messages. However, the Control Connection MAY also be kept
762	   established and used for further PLMT signaling. In particular, it
763	   could be useful to exchange address information over the Control
764	   Connection instead of the Subflow Connections. This would enable a
765	   future NAPT helper for the PLMT protocol that could try to translate
766	   private to public addresses. A detailed discussion of this is
767	   outside the scope of this document.

769	      4.2.4. Discussion of Control Connection Failure Cases

771	  A failure to set up a Control Connection is an indication that the
772	  other end host does not have a PLMT Layer, or that middleboxes do not
773	  allow the establishment of a PLMT Control Connection. An Active
774	  Opener MUST await the successful PLMT capability exchange on the
775	  Control Connection before starting to send the Signature and TLV
776	  encoded content. An Active Opener MAY also give up after a certain
777	  waiting time. Then, it MUST close the Control Connection, and use
778	  backward compatible bytestream transport on the Initial Connection.

780	  The PLMT capability exchange requires a single exchange of messages
781	  on the Control Connection only. If the Connection fails afterwards,
782	  all control information can be exchanged over Subflow Connections. If
783	  the Control Connection fails and the Active Opener does not receive
784	  the Token Confirmation message, without the Passive Opener
785	  detecting this, there may be a synchronization mismatch and the
786	  Passive Opener may inject a Signature and a Token into the Initial
787	  Connection even if this is not expected by the Active Opener. In
788	  order to avoid data corruption, the Active Opener could parse all
789	  incoming data for the Signature after failure of a Control
790	  Connection, but this may increase the processing overhead.

792	  If a Control Connection fails after the exchange of the tokens, PLMT
793	  could in principle continue to operate, since TLV encoded data can be
794	  transported over the established Subflow Connections, and since the
795	  Signatures and Tokens are already known.

797	4.3. PLMT Data Connection Setup and Operation

799	   PLMT provides two modes of operation, which differ in the time when
800	   the Control Connection is established: Parallel Setup and Late
801	   Setup. The Parallel Setup is significantly simpler for a Passive
802	   Opener, as Signatures are sent in the first bytes of a connection
803	   and therefore are simple to identify. But, unfortunately, the setup
804	   of a Control Connection for every data transfer with a short
805	   duration results in overhead and additional delay without any
806	   performance gains. This mode is therefore mainly useful if it is
807	   known in advance that a TCP connection will transport a large amount
808	   of data. In order to reduce the overhead for short connections, PLMT
809	   also allows the Control Connection to be established later than
810	   the Initial Connection. In this case, the PLMT Layer on a host MUST
811	   NOT initiate the TLV data encoding before the PLMT capability of the
812	   other host has been determined through the Control Connection (cf.
813	   Figure 3).

815	      4.3.1. Guidelines for selection of a Signature

817	  To allow for a simple identification of where exactly the TLV
818	  encoding inside the byte stream starts, a 128-bit Signature is used
819	  as a delimiter between the bytestream and the TLV encoding (cf.
820	  Figure 6). The Signature is selected by the hosts that must parse it,
821	  and MUST be chosen such that collisions with existing application
822	  protocols are minimal. Note that it is up to the hosts to decide what
823	  Signature to use for different connections. The most secure solution
824	  is to use a different Signature for every Control Connection, but
825	  then the parsing effort is the largest. For performance optimization,
826	  the PLMT Layer at a host MAY use the same Signature in more than one
827	  connection, but it MUST change the value on a regular basis.

829	                            1                   2                   3
830	       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
831	      +---------------------------------------------------------------+
832	      |                       Signature (16 byte)                     :
833	      +-----------------------------------------------+---------------+
834	      :                           Signature                           :
835	      +-----------------------------------------------+---------------+
836	      :                           Signature                           :
837	      +-----------------------------------------------+---------------+
838	      :                           Signature                           |
839	      +---------------------------------------------------------------+

841	           Figure 6 PLMT Signature (sent on Subflow Connections)
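
   One possible way to reuse a Signature across connections while
   still changing it on a regular basis is sketched below. This is a
   non-normative Python illustration. Drawing the 128-bit value at
   random is an assumption of this sketch, not a rule of PLMT, and the
   class name SignaturePool and the lifetime value are hypothetical.

```python
import secrets
import time


class SignaturePool:
    """Hand out one shared Signature and replace it after a lifetime."""

    def __init__(self, lifetime_s: float = 60.0, clock=time.monotonic):
        self._lifetime = lifetime_s   # rotation interval, an assumed value
        self._clock = clock           # injectable clock, useful for testing
        self._current = None
        self._issued_at = 0.0

    def get(self) -> bytes:
        now = self._clock()
        if self._current is None or now - self._issued_at >= self._lifetime:
            self._current = secrets.token_bytes(16)  # 128-bit Signature
            self._issued_at = now
        return self._current
```

   A host that rotates Signatures in this way would still have to
   recognize older values for sessions that negotiated them, which
   adds to the parsing effort discussed above.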

843	      4.3.2. Bundling of Initial Connection to the Control Connection
844	         in Parallel Setup

846	   The Control Connection is used to determine the PLMT capability of
847	   the end hosts. The Initial Connection MUST NOT transport any data
848	   before the Control Connection is established and the PLMT Capability
849	   Exchange is completed. If the Control Connection setup or PLMT
850	   Capability Exchange fails, then the Initial Connection MUST NOT
851	   use TLV encoding and carries the legacy TCP bytestream instead.

853	   Before using TLV encoding, a host must first send the Signature on
854	   the Initial Connection as depicted in Figure 7. The first TLV-
855	   encoded messages after that delimiter must exchange the tokens to
856	   bundle the Initial Connection with the Control Connection, and to
857	   verify at both endpoints that the Initial Connection and the Control
858	   Connection indeed terminate at the same host. The tokens are
859	   exchanged by a Token Indication and a Token Confirmation message.
860	   After these messages, both sides are allowed to send other PLMT
861	   messages in TLV encoding over the Connection, or to establish
862	   further Subflow Connections. Both Active and Passive Opener must
863	   verify the Tokens. If the Tokens do not match the ones exchanged
864	   over the Control Connection, the PLMT session must be closed, as
865	   apparently an error has occurred.

867	            End-host A                                 End-host B
868	     ---------------------------             ---------------------------
869	      Address A1     Address A2               Address B1     Address B2
870	     ------------   ------------             ------------   ------------
871	           |               |                          |              |
872	           |      (Initial Connection setup (TCP))    |              |
873	           |---------------SYN----------------------->|              |
874	           |<------------SYN/ACK----------------------|              |
875	           |---------------ACK----------------------->|              |
876	           |               |                          |              |
877	           |    (PLMT Capability of the Other End-host has been      |
878	           |         determined over the Control Connection)         |
879	           |               |                          |              |
880	           |    (First TLV encoded message exchange   |              |
881	           |         over the Initial Connection)     |              |
882	           |---B's Signature + Token B Verification-->|              |
883	           |               |                          | Token        |
884	           |               |                          | verif.       |
885	           |<--A's Signature + Token A Verification---|              |
886	    Token  |               |                          |              |
887	    verif. |               |                          |              |
888	           |               |                          |              |
889	           |     (TLV encoded data transport          |              |
890	           |     over the Initial Connection)         |              |
891	           |---------------TLV----------------------->|              |
892	           |<--------------TLV------------------------|              |
893	           |               |                          |              |
894	     Figure 7 Bundling of Initial PLMT Subflow Connection and Control
895	                       Connection for Parallel Setup

897	                                 1                   2                   3
898	           0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
899	         +---------------+---------------------------------+-------------+
900	         |Kind=TOKEN     |                 Length=12       |  reserved   |
901	         +---------------+---------------------------------+-------------+
902	         |                         Token (8 byte)                        |
903	         +---------------------------------------------------------------+

905	      Figure 8 PLMT Token Verification Message (sent over the Initial
906	                                Connection)
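
   The transition to TLV encoding shown in Figures 7 and 8 can be
   sketched as follows. This is a non-normative Python example. As in
   Figure 7, a host sends the Signature and Token that it received
   from its peer, and the receiver compares them with its own values.
   The Kind code point, the function names, and the header layout of
   Kind (8 bits), Length (16 bits), reserved (8 bits) in network byte
   order are assumptions.

```python
import hmac
import struct

KIND_TOKEN = 0x03  # hypothetical code point, not assigned by the draft

_TOKEN = struct.Struct("!BHB8s")  # Kind, Length, reserved, Token: 12 bytes


def start_of_tlv(peer_signature: bytes, peer_token: bytes) -> bytes:
    """Bytes a host sends to switch the Initial Connection to TLV encoding."""
    if len(peer_signature) != 16 or len(peer_token) != 8:
        raise ValueError("unexpected field length")
    return peer_signature + _TOKEN.pack(KIND_TOKEN, _TOKEN.size, 0, peer_token)


def verify_start_of_tlv(data: bytes, own_signature: bytes,
                        own_token: bytes) -> bool:
    """Check that data begins with our Signature and a matching Token message."""
    if len(data) < len(own_signature) + _TOKEN.size:
        return False
    if not data.startswith(own_signature):
        return False
    kind, length, _reserved, token = _TOKEN.unpack_from(data, len(own_signature))
    return (kind == KIND_TOKEN and length == _TOKEN.size
            and hmac.compare_digest(token, own_token))
```

   If the verification fails, the text above requires the PLMT
   session to be closed.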

908	      4.3.3. Bundling of Initial Connection to the Control Connection
909	         in Late Setup

911	  In order to avoid the setup overhead of Control Connections for
912	  short-lived transfers, the PLMT protocol MAY establish the Control
913	  Connection after data has already been exchanged on the Initial
914	  Connection. This document does not describe heuristics for when to
915	  set up the Control Connection. They may take into account factors
916	  such as number of bytes transferred, cached information about
917	  support of PLMT, or user preferences.

919	  A receiver MUST assume that all bytes received on an incoming TCP
920	  connection are sent by a legacy end system, before a match with a valid
921	  Signature is possible. Until then, all data must be passed to the
922	  application in unmodified form. Thus, PLMT risks with a very small
923	  probability that corrupted data is delivered to an application.

925	   Once the Control Connection is established and the PLMT capability
926	   information of the end hosts has been exchanged, the Active Opener
927	   can send the Passive Opener's Signature and a PLMT Token
928	   Verification message over the Initial Connection, at the position in
929	   the byte stream that has been advertised over the control channel.
930	   The mechanism of token exchange in the payload of the Initial
931	   Connection is used to verify that the Initial Connection and Control
932	   Connection actually involve the same hosts.

934	            End-host A                                 End-host B
935	     ---------------------------             ---------------------------
936	      Address A1     Address A2               Address B1     Address B2
937	     ------------   ------------             ------------   ------------
938	           |               |                          |              |
939	           |      (Initial Connection setup (TCP))    |              |
940	           |---------------SYN----------------------->|              |
941	           |<------------SYN/ACK----------------------|              |
942	           |---------------ACK----------------------->|              |
943	           |               |                          |              |
944	           |              (Data Segments              |              |
945	           |     sent over the Initial Connection)    |              |
946	           |----------------------------------------->|              |
947	           |<-----------------------------------------|              |
948	           |               |                          |              |
949	           |     (Control Connection setup (TCP))     |              |
950	           |~~~~~~~~~~~~~~~SYN~~~~~~~~~~~~~~~~~~~~~~~>|              |
951	           |<~~~~~~~~~~~~SYN/ACK~~~~~~~~~~~~~~~~~~~~~~|              |
952	           |~~~~~~~~~~~~~~~ACK~~~~~~~~~~~~~~~~~~~~~~~>|              |
953	           |               |                          |              |
954	           |   (TLV-Enabled PLMT Control Signaling    |              |
955	           |     sent over the Control Connection)    |              |
956	           |~~~Sign. indic. (A's sign., A's token)~~~>|              |
957	           |<~~Sign. confirm. (B's sign., B's token)~~|              |
958	           |               |                          |              |
959	           |       (Message exchange over the         |              |
960	           |           Initial Connection)            |              |
961	           |---B's Signature + Token B verification-->|              |
962	           |                                          | Token        |
963	   ........|..........................................| verif.       |
964	           |<--A's Signature + Token A verification---|              |
965	    Token  |               |                          |              |
966	    verif. |     (TLV encoded data transport          |              |
967	           |     over the Initial Connection)         |              |
968	           |---------------TLV----------------------->|              |
969	           |<--------------TLV------------------------|              |
970	           |               |                          |              |
971	      Figure 9 Bundling of PLMT First Subflow Connection and Control
972	                       Connection for Late Setup
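
   The receiver side of the exchange in Figure 9 can be sketched for
   a single Initial Connection as follows. This is a non-normative
   Python illustration. The class name LateSetupReceiver and its
   methods are hypothetical, and the sketch assumes that the Control
   Connection has already been matched to this Initial Connection,
   which a real Passive Opener cannot know at this stage.

```python
class LateSetupReceiver:
    """Pass legacy bytes through until the advertised Signature offset."""

    def __init__(self):
        self.position = 0      # legacy bytes consumed so far
        self.offset = None     # advertised Signature offset, once known
        self.signature = b""
        self._pending = b""    # bytes at the offset, not yet classified
        self.tlv_mode = False

    def expect(self, signature: bytes, offset: int) -> None:
        """Record the Signature and offset announced on the Control Connection."""
        if offset < self.position:
            raise ValueError("advertised offset has already been passed")
        self.signature, self.offset = signature, offset

    def feed(self, chunk: bytes):
        """Return (bytes_for_application, tlv_bytes) for this chunk."""
        if self.tlv_mode:
            return b"", chunk
        if self.offset is None:
            self.position += len(chunk)
            return chunk, b""
        legacy_len = max(0, min(len(chunk), self.offset - self.position))
        legacy, rest = chunk[:legacy_len], chunk[legacy_len:]
        self.position += legacy_len
        self._pending += rest
        head = self._pending[:len(self.signature)]
        if head == self.signature:
            self.tlv_mode = True
            tlv = self._pending[len(self.signature):]
            self._pending = b""
            return legacy, tlv
        if self.signature.startswith(head):
            return legacy, b""  # partial match or nothing yet: keep buffering
        # Mismatch at the advertised position: treat the stream as legacy data.
        released, self._pending = self._pending, b""
        self.position += len(released)
        self.offset = None
        return legacy + released, b""
```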

974	4.4. Additional Subflow Connections Initiation and Operation

976	      4.4.1. Address Advertisement

978	  The Initial Subflow Connection, as well as the Control Connection, is
979	  established by the Active Opener. Once TLV encoding is enabled on the
980	  Initial Subflow Connection, and it is thus verified that the two
981	  end-hosts are PLMT capable, either end-host MAY initiate further
982	  Subflow Connections. PLMT assumes that at least one of the two
983	  connection endpoints is multihomed, i. e., has at least two IP
984	  addresses. The end-hosts MAY exchange these addresses via the Control
985	  Connection or via any Subflow Connection, once TLV transport is
986	  enabled. The frame formats for advertising and releasing addresses are
987	  given in Figures 10 and 11, respectively.

989	                             1                   2                   3
990	         0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
991	        +---------------+-------------------------------+-------+-------+
992	        | Kind=ADD_ADDR |            Length             | IPVer | (res) |
993	        +---------------+-------------------------------+-------+-------+
994	        |          Address (IPv4 - 4 octets / IPv6 - 16 octets)         |
995	        +---------------------------------------------------------------+

997	                       Figure 10   PLMT Add Address
998	                             1                   2                   3
999	         0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
1000	        +---------------+-------------------------------+-------+-------+
1001	        | Kind=DEL_ADDR |               Length          | IPVer | (res) |
1002	        +---------------+-------------------------------+-------+-------+
1003	        |          Address (IPv4 - 4 octets / IPv6 - 16 octets)         |
1004	        +---------------------------------------------------------------+

1006	                      Figure 11   PLMT Remove Address
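   As a rough illustration of the layouts in Figures 10 and 11, the
   following Python sketch packs and parses an address message. The
   numeric Kind codes, the reading of Length as the total message size
   in octets, and the IPVer values 4 and 6 are placeholders assumed
   for this sketch only.

```python
import ipaddress
import struct

# Placeholder Kind codes, assumed for illustration only.
KIND_ADD_ADDR = 0x10
KIND_DEL_ADDR = 0x11


def encode_addr_msg(kind, address):
    """Pack Kind (8 bits), Length (16 bits), IPVer (4 bits), reserved (4 bits), address."""
    ip = ipaddress.ip_address(address)
    body = ip.packed                 # 4 octets for IPv4, 16 octets for IPv6
    length = 4 + len(body)           # assumption: Length counts the whole message
    ver_res = ip.version << 4        # IPVer in the high nibble, reserved bits zero
    return struct.pack("!BHB", kind, length, ver_res) + body


def decode_addr_msg(msg):
    """Return (kind, address) or raise ValueError on a malformed message."""
    kind, length, ver_res = struct.unpack("!BHB", msg[:4])
    addr_len = {4: 4, 6: 16}.get(ver_res >> 4)
    if addr_len is None or length != 4 + addr_len or len(msg) < length:
        raise ValueError("malformed address message")
    return kind, ipaddress.ip_address(msg[4:4 + addr_len])
```

   For example, encode_addr_msg(KIND_ADD_ADDR, "192.0.2.1") yields an
   8-octet message under these assumptions.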

1008	      4.4.2. Subflow Connection Setup

1010	   For each initiation of an additional Subflow Connection, a new TCP
1011	   connection is initiated with a three-way handshake (SYN, SYN/ACK,
1012	   ACK). The Signatures are used by both ends to distinguish Subflow
1013	   Connections from normal TCP connections, and to detect the start of
1014	   TLV encoding. If a Subflow Connection is established that shall
1015	   carry TLV Data Segments, a sender MUST send the Signature before
1016	   it starts to send TLV Data Segments. In all cases, the first
1017	   Data Segment after the Signature MUST be a Token Indication (from
1018	   Active Opener) or Token Confirmation message (from Passive Opener).
1019	   This setup of an additional Subflow Connection is illustrated in
1020	   Figure 12.

1022	            End-host A                                 End-host B
1023	     ---------------------------             ---------------------------
1024	      Address A1     Address A2               Address B1     Address B2
1025	     ------------   ------------             ------------   ------------
1026	           |               |                          |              |
1027	           |     (TLV encoded Data Segments)          |              |
1028	           |----------------------------------------->|              |
1029	           |<-----------------------------------------|              |
1030	           |               |                          |              |
1031	           |   (Over Subflow or Control Connection)   |              |
1032	           |<--------------ADD_ADDR-B2----------------|              |
1033	           |               |                          |              |
1034	           |      (Additional Subflow Connection Setup (TCP))        |
1035	           |***************************SYN**************************>|
1036	           |<************************SYN/ACK*************************|
1037	           |***************************ACK**************************>|
1038	           |               |                          |              |
1039	           |***B's Signature + Token B verification*****************>|
1040	           |                                          | Token        |
1041	   ........|..........................................| verif.       |
1042	           |<**A's Signature + Token A verification******************|
1043	    Token  |               |                          |              |
1044	    verif. |     (TLV encoded data transport          |              |
1045	           |  over the additional Subflow Connection) |              |
1046	           |***************TLV**************************************>|
1047	           |               |                          |              |
1048	           |<**************TLV***************************************|

1050	              Figure 12   Additional Subflow Connection setup

1052	      4.4.3. TLV Encoding of Data Segments

1054	   TLV encoded Data Segments can be sent on each Subflow Connection.
1055	   Each Data Segment carries a 64-bit Session Sequence Number. A PLMT-
1056	   capable host must maintain a Session Sequence Number in addition to
1057	   the TCP sequence numbers of each Subflow Connection.

1059	                              1                   2                   3
1060	          0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
1061	         +----------------+-------------------------------+--------------+
1062	         |   Kind = DATA  |         Length=20+n           |  reserved    |
1063	         +----------------+-------------------------------+--------------+
1064	         :                   Session Sequence Number (8 bytes)           :
1065	         +---------------------------------------------------------------+
1066	         :                    Data Segment (n bytes total)               :
1067	         +---------------------------------------------------------------+

1069	               Figure 13   TLV encoded Data Segment message
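   The framing in Figure 13 could be handled roughly as in the Python
   sketch below. The Kind code is a placeholder assumed for this
   example. The Length value follows the figure literally (20 + n),
   although the fields drawn ahead of the data occupy 12 octets; which
   octets Length is meant to cover is an assumption of this sketch.

```python
import struct

KIND_DATA = 0x01                  # placeholder Kind code, illustrative only
HEADER = struct.Struct("!BHBQ")   # Kind, Length, reserved, 64-bit Session Sequence Number
LENGTH_BASE = 20                  # Figure 13 prints Length = 20 + n


def encode_data_segment(ssn, payload):
    """Frame payload as a DATA message starting at Session Sequence Number ssn."""
    length = LENGTH_BASE + len(payload)
    if length > 0xFFFF:
        raise ValueError("payload too large for the 16-bit Length field")
    return HEADER.pack(KIND_DATA, length, 0, ssn) + payload


def decode_data_segment(buf):
    """Return (ssn, payload, remaining bytes), or None if buf is still incomplete."""
    if len(buf) < HEADER.size:
        return None
    kind, length, _reserved, ssn = HEADER.unpack_from(buf)
    if kind != KIND_DATA or length < LENGTH_BASE:
        raise ValueError("not a DATA segment")
    end = HEADER.size + (length - LENGTH_BASE)
    if len(buf) < end:
        return None               # wait for more bytes from the TCP stream
    return ssn, buf[HEADER.size:end], buf[end:]
```

   Returning None for an incomplete buffer reflects that TCP delivers
   a byte stream, so a message may arrive split across several reads.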

1071	   Session Sequence Numbers are used to reorder the data inside the
1072	   PLMT session that arrives over multiple Subflow Connections. The
1073	   Session Sequence Number is thus similar to the TCP sequence number
1074	   and identifies each byte of data. Each Data Segment carries the
1075	   Session Sequence Number, which refers to the byte number of the
1076	   first byte in the Data Segment.
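   A minimal sketch of receiver-side reordering by Session Sequence
   Number is given below in Python. The class and method names are
   hypothetical, and wrap-around of the 64-bit number space is ignored
   for brevity.

```python
class ReorderBuffer:
    """Delivers bytes to the application in Session Sequence Number order."""

    def __init__(self, initial_ssn=0):
        self.next_ssn = initial_ssn   # next byte expected in order
        self.pending = {}             # SSN of first byte -> payload

    def receive(self, ssn, payload):
        """Store one Data Segment and return the bytes that became deliverable."""
        if ssn + len(payload) > self.next_ssn:
            # Keep the longest payload seen for a given starting SSN.
            if len(payload) > len(self.pending.get(ssn, b"")):
                self.pending[ssn] = payload
        out = bytearray()
        while self.pending:
            start = min(self.pending)
            if start > self.next_ssn:
                break                              # gap: wait for missing bytes
            data = self.pending.pop(start)
            fresh = data[self.next_ssn - start:]   # skip bytes already delivered
            out += fresh
            self.next_ssn += len(fresh)
        return bytes(out)
```

   Segments that arrive early on a faster Subflow Connection are held
   until the gap is filled, and duplicates from a retransmission on
   another Subflow Connection are dropped.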

1078	   Even when a PLMT-capable host is not transmitting TLV data segments,
1079	   the end host MUST store Session Sequence Numbers for all ongoing TCP
1080	   connections, in order to be able to deal with late setups of a
1081	   Control Connection.

1083	      4.4.4. Data Acknowledgments

1085	   In addition to the regular Subflow Connection TCP acknowledgements,
1086	   session-level Data Acknowledgements are used to cumulatively
1087	   acknowledge the data received over the different Subflow
1088	   Connections. A Data Acknowledgement that acknowledges the reception
1089	   of a Data Segment message carries the Session Sequence Number of the
1090	   next expected byte. In normal operation, session-level Data
1091	   Acknowledgements are not needed, but certain performance enhancing
1092	   proxies or middlebox failures may result in situations in which the
1093	   TCP acknowledgements on a Subflow Connection erroneously allow the
1094	   sender to release data that has not yet been received.

1096	   The Data Acknowledgements also include a session-level receive
1097	   window to correctly perform flow control at session level, and to
1098	   avoid deadlocks.

1100	   Since the use of Data Acknowledgements is only a mechanism to
1101	   increase robustness, Data Acknowledgements SHOULD be sent at
1102	   long intervals. It is left for further study how often
1103	   they should be sent. Another open question is on which of the
1104	   connections the messages should be sent.

1106	                           1                   2                   3
1107	       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
1108	      +---------------+--------------------------------+--------------+
1109	      |Kind=SESS_ACK  |           Length=12            |  reserved    |
1110	      +---------------+--------------------------------+--------------+
1111	      :    Next expected Session Sequence Number (8 bytes in total)   :
1112	      +---------------------------------------------------------------+
1113	      :          Session receive window (8 bytes in total)            :
1114	      +---------------------------------------------------------------+

1116	                 Figure 14   Data Acknowledgement message
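   The following Python sketch shows one way the message in Figure 14
   and the cumulative release of sender-side data could fit together.
   The Kind code and the helper names are assumptions of this example.
   The Length value is copied from the figure as printed (12), although
   the fields drawn occupy 20 octets.

```python
import struct

KIND_SESS_ACK = 0x02                 # placeholder Kind code, illustrative only
SESS_ACK = struct.Struct("!BHBQQ")   # Kind, Length, reserved, next expected SSN, window
SESS_ACK_LENGTH = 12                 # Length value as printed in Figure 14


def encode_sess_ack(next_expected_ssn, recv_window):
    """Build a session-level Data Acknowledgement."""
    return SESS_ACK.pack(KIND_SESS_ACK, SESS_ACK_LENGTH, 0,
                         next_expected_ssn, recv_window)


def decode_sess_ack(msg):
    """Return (next expected SSN, session receive window)."""
    kind, _length, _reserved, next_ssn, window = SESS_ACK.unpack(msg)
    if kind != KIND_SESS_ACK:
        raise ValueError("not a SESS_ACK message")
    return next_ssn, window


def release_acked(unacked, next_expected_ssn):
    """Drop sender-side segments that lie entirely below the cumulative ack.

    unacked maps the SSN of a segment's first byte to its payload.
    """
    done = [s for s, data in unacked.items() if s + len(data) <= next_expected_ssn]
    for ssn in done:
        del unacked[ssn]
    return unacked
```

   In this sketch the sender keeps a segment until the session-level
   acknowledgement covers it, regardless of what the TCP
   acknowledgements on the Subflow Connection reported.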

1118	4.5. Other Aspects

1120	      4.5.1. Congestion Control

1122	   One of the goals for having a multi-connection transport solution is
1123	   to enhance the usage of network resources, commonly known as the
1124	   resource pooling principle. In order to achieve resource pooling,
1125	   the congestion windows of the different Subflow Connections of the
1126	   Session should be coupled together. The coupling should lead to
1127	   transmission of more Data Segments over the less congested
1128	   connections as compared to the more congested connections.

1130	   Different congestion control algorithms may be implemented for
1131	   multipath transport mechanisms to achieve the goals of resource
1132	   pooling and fairness. One such algorithm is presented in [4]. The
1133	   algorithm offers a potential solution in the current Internet by
1134	   controlling the Subflow Connection congestion window increase as a
1135	   function of the performance of other Subflow Connections of a
1136	   session.

1138	   PLMT could use this algorithm for congestion control as well. If
1139	   PLMT is entirely implemented in the user space, an alternative
1140	   algorithm could be used that runs a corresponding scheduler, which
1141	   uses its own estimates for the path characteristics. The design of
1142	   alternative algorithms for congestion control coupling is beyond the
1143	   scope of this document.

1145	      4.5.2. Path Management and Scheduling

1147	   The establishment of multiple Subflow Connections to different
1148	   addresses aims at a better utilization of the network resources.
1149	   PLMT could use cross-layer information from the network layer for
1150	   path management.

1152	   The scheduling of TLV-encoded Data Segments over the different
1153	   Subflow Connections is based on local policy. PLMT can use
1154	   different algorithms to control the splitting of the data stream
1155	   from the application over the different Subflow Connections. PLMT
1156	   uses the standard TCP mechanisms for reliable transport of data on
1157	   its Subflow Connections.
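   Since scheduling is left to local policy, the Python sketch below
   shows just one possible policy: send each Data Segment on the
   Subflow Connection with the lowest round-trip time estimate that
   still has room. The Subflow fields, the per-subflow budget, and the
   function names are all hypothetical; this document does not
   prescribe them.

```python
from dataclasses import dataclass


@dataclass
class Subflow:
    name: str
    srtt: float      # smoothed RTT estimate in seconds (application-level estimate)
    in_flight: int   # bytes sent but not yet acknowledged
    budget: int      # bytes allowed to be outstanding (hypothetical limit)


def pick_subflow(subflows, segment_len):
    """Return the lowest-RTT subflow with room for segment_len bytes, or None."""
    candidates = [s for s in subflows if s.in_flight + segment_len <= s.budget]
    if not candidates:
        return None
    return min(candidates, key=lambda s: s.srtt)


def schedule(subflows, segments):
    """Assign (ssn, payload) segments to subflows.

    Returns (plan, unsent), where plan is a list of (subflow name, ssn).
    """
    plan = []
    for i, (ssn, payload) in enumerate(segments):
        sf = pick_subflow(subflows, len(payload))
        if sf is None:
            return plan, segments[i:]   # every subflow is full for now
        sf.in_flight += len(payload)
        plan.append((sf.name, ssn))
    return plan, []
```

   Other policies, such as round-robin or weighting by measured
   throughput, would fit the same interface.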

1159	   The retransmission strategy for lost Data Segments is a local
1160	   policy. The session sequence number allows lost Data Segments to be
1161	   sent over another Subflow Connection in addition to the
1162	   retransmission over the same Subflow Connection. How often a Data
1163	   Segment is sent over another Subflow Connection is again a design
1164	   choice of the local policy.

1166	      4.5.3. Closing Connections and Sessions

1168	   A Subflow Connection is a standard TCP connection. To close a
1169	   Subflow Connection the TCP 4-way FIN handshake mechanism is used.

1171	   Closing the Session requires closing all of its PLMT connections,
1172	   including the Control Connection.

1174	5. Interaction with Middleboxes

1176	   The Internet contains many different types of middleboxes. Some
1177	   parse the contents of the stream of a TCP connection, and others
1178	   rewrite the content of packet headers or even the payload. For a new
1179	   multipath transport like PLMT to be successfully deployable, its
1180	   operation should be understood and tested against such middleboxes.
1181	   A well-known example of a middlebox is the Network Address and Port
1182	   Translator (NAPT). PLMT is designed to be compatible with
1183	   middleboxes that have problems with TCP options. However, there are
1184	   some problems with other types of middleboxes.

1186	5.1. Middleboxes that Translate Address/Ports

1188	   Middleboxes that perform Network Address and Port Translations
1189	   (NAPT) may cause problems for the creation of multiple connections
1190	   (this is a potential issue for all multipath transport protocols).
1191	   Hosts behind the NAPT know their local addresses but might not be
1192	   aware of the global addresses that the NAPT uses. Therefore, the
1193	   hosts MUST NOT advertise their multiple local addresses to the other
1194	   host. The host behind the NAPT MAY still be multipath capable and
1195	   MAY open a PLMT connection to the other host if the other host is
1196	   also PLMT capable. Over the established PLMT connection, the other
1197	   host MAY advertise its multiple addresses. These addresses will be
1198	   used by the host behind the NAPT to open further Subflow
1199	   Connections.

1201	5.2. Middleboxes that Manipulate TCP Options

1203	   Multipath solutions that use the TCP options field for their
1204	   operation may suffer from middleboxes that remove or modify
1205	   TCP options. Some middleboxes may even drop packets with unknown TCP
1206	   options, and this may happen for the connection establishment
1207	   packets as well. PLMT does not employ any new TCP option and hence
1208	   is not affected by such middlebox behavior.

1210	5.3. Middleboxes that Parse Content

1212	   Current middleboxes in the Internet are not aware of multipath
1213	   transport. Therefore, middleboxes will treat each individual Subflow
1214	   Connection as a standard TCP connection. The TLV encoding of the
1215	   payload may confuse the middlebox and may lead the middlebox to
1216	   stall the connection if the middlebox parses the content.

1218	   If a middlebox blocks TLV encoding, PLMT can try to transmit data
1219	   over another path. However, PLMT cannot fall back to a mode that
1220	   does not use TLV transport, since it must send the Signature and
1221	   tokens in TLV encoding over the Initial Subflow Connection.

1223	   Middleboxes that want to prevent multipath transport can block
1224	   connection setups to the well-known port. This prevents the use of
1225	   multipath transport if a middlebox is both on the path of the
1226	   Initial Subflow Connection and the Control Connection. A middlebox
1227	   that is not on the path of the Control Connection cannot safely
1228	   distinguish normal TCP connections and PLMT Subflow Connections with
1229	   TLV transport.

1231	5.4. Middleboxes that Change Content

1233	   Middleboxes may also modify the payload and not only the packet
1234	   headers. All multipath solutions require a session-level data
1235	   sequence number to re-order/combine the data stream received over
1236	   the Subflow Connections. The PLMT design allows detecting such
1237	   middlebox behavior by identifying the connection which gets stalled
1238	   due to undecodable TLV framing. In addition, checksums could be
1239	   used. The Data Acknowledgements will identify the holes in the
1240	   session sequence numbers so that a retransmission of the missing
1241	   segments over other Subflow Connections will be initiated. This
1242	   allows working around content-modifying middleboxes, unless they are
1243	   present on all paths.

1245	   If this type of middlebox is present on the Initial Connection, then
1246	   the Signature matching may fail. This means that data transport over
1247	   the Initial Connection may be corrupted, as, e.g., the Signature
1248	   may be delivered to the application as part of the byte stream.

1250	6. Security Considerations

1252	   The Signature-based method to identify the setup of a new TLV-
1253	   enabled data flow has two security issues: First, an application can
1254	   accidentally generate a bit pattern that is equal to the Signature.
1255	   Second, due to the use of out-of-band signaling, PLMT's method must
1256	   be robust against malicious attacks that try to break or hijack PLMT
1257	   sessions or normal connections. Unlike other multipath transport
1258	   protocols, it is theoretically possible to attack a normal TCP
1259	   connection to a PLMT-enabled server, even if it does not use
1260	   multipath transport.

1262	6.1. Reappearance of Signature in Application Data

1264	   The Signature (and the tokens) is sent in two different contexts:

1266	      o A connection which was started as a single legacy TCP
1267	         connection is later switched to PLMT/TLV-enabled operation. In
1268	         this case, the Active Opener provides the Session sequence
1269	         number over the control connection of the last byte that is
1270	         not TLV encoded. This way, the PLMT Layer of the Passive
1271	         Opener knows how much user data has been transmitted through
1272	         the legacy TCP connection and when to expect the Signature.
1273	         Given the length of the Signature, as well as the following
1274	         token exchange, it is extremely unlikely that a normal TCP
1275	         connection is wrongly classified as a Subflow Connection. A
1276	         similar problem occurs at the Active Opener.

1278	      o The Signature can also be present in the first bytes of a new
1279	         PLMT Subflow Connection, if it is an additional Subflow
1280	         Connection, or if the Control Connection is established first.
1281	         In these cases, the Subflow Connection is characterized by the
1282	         Signature being present in the first bytes of a connection. If
1283	         an application itself opens an additional TCP connection to
1284	         the same corresponding end host, a problem could occur if the
1285	         Signature pattern (and follow-up token messages) is contained
1286	         in the first data packet of the connection.

1288	   Because of both effects, there is a residual probability that PLMT
1289	   accepts a connection erroneously, if an application accidentally
1290	   sends a bit pattern that is identical to the Signature (plus the
1291	   Tokens), or if an attacker manages to guess the pattern. This
1292	   probability is very small as the Signature is a long, random bit
1293	   pattern.
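   As a back-of-the-envelope estimate, let s be the Signature length
   in bits and t the combined length of the Tokens in bits. Both
   symbols, and the numbers in the comment, are illustrative
   assumptions. If the first s + t bits of an unrelated connection are
   modelled as uniformly random, which real application data is not,
   the chance of a false match on one connection and a union bound
   over k such connections are:

```latex
\[
  P_{\mathrm{single}} = 2^{-(s+t)},
  \qquad
  P_{\mathrm{any\ of\ } k} \le k \cdot 2^{-(s+t)}
\]
% Illustrative numbers only: with s + t = 256 and k = 10^{9},
% the bound is 10^{9} \cdot 2^{-256} \approx 8.6 \times 10^{-69}.
```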

1295	   This probabilistic approach of a token-based identification is
1296	   general practice in challenge-response authentication methods, where
1297	   there is also an extremely small residual probability that an
1298	   unauthorized (malicious) node guesses the response correctly.

1300	6.2. Resilience against Malicious Attacks

1302	   One concern with address-agile multipath transport mechanisms is
1303	   possible malicious attacks. PLMT suffers from a DoS vulnerability,
1304	   but it has protection methods against other attacks.

1306	   PLMT uses the same token mechanism as other multipath transport
1307	   protocols, but with much longer tokens. An attacker must not only
1308	   correctly guess the Tokens, but also the Signature. As a
1309	   consequence, the probability of blind guess attacks on PLMT is
1310	   extremely small.

1312	7. Open Issues

1314	   This PLMT protocol specification is a work in progress, and there
1315	   are still unsolved issues that need further
1316	   consideration.

1318	8. IANA Considerations

1320	   This document will make a request to IANA to allocate a new TCP/UDP
1321	   port value for the PLMT Control Connection.

1323	9. Conclusion

1325	   PLMT is a user-space solution to enable reliable, in-order data
1326	   transfer over multiple paths. This specification defines the PLMT
1327	   protocol. PLMT is defined as a worked example for a payload-based
1328	   multipath transport, as an alternative to TCP option based signaling
1329	   mechanisms. Due to some security vulnerabilities, it is mainly
1330	   suitable for controlled and trusted environments.

1332	10. References

1334	10.1. Normative References

1336	   [1]   Bradner, S., "Key words for use in RFCs to Indicate
1337	         Requirement Levels", BCP 14, RFC 2119, March 1997.

1339	   [2]   Postel, J., "Transmission Control Protocol", STD 7, RFC 793,
1340	         September 1981.

1342	   [3]   Dierks, T. and E. Rescorla, "The Transport Layer Security
1343	         (TLS) Protocol Version 1.2", RFC 5246, August 2008.

1345	10.2. Informative References

1347	   [4]   Raiciu, C., Handley, M. and D. Wischik, "Coupled Multipath-
1348	         Aware Congestion Control", draft-ietf-mptcp-congestion-00
1349	         (work in progress), July 2010.

1351	   [5]   Ford, A., Raiciu, C., Barre, S. and J. Iyengar, "Architectural
1352	         Guidelines for Multipath TCP Development", draft-ietf-mptcp-
1353	         architecture-01 (work in progress), June 2010.

1355	   [6]   Ford, A., Raiciu, C. and M. Handley, "TCP Extensions for
1356	         Multipath Operation with Multiple Addresses", draft-ietf-
1357	         mptcp-multiaddressed-01 (work in progress), July 2010.

1359	   [7]   Bagnulo, M., "Threat Analysis for Multi-addressed/Multi-path
1360	         TCP", draft-ietf-mptcp-threat-02 (work in progress), March
1361	         2010.

1363	   [8]   Scharf, M. and A. Ford, "MPTCP Application Interface
1364	         Considerations", draft-scharf-mptcp-api-02 (work in progress),
1365	         July 2010.

1367	   [9]   Scharf, M., "Multi-Connection TCP (MCTCP) Transport", draft-
1368	         scharf-mptcp-mctcp-00 (work in progress), July 2010.

1370	11. Acknowledgments

1372	   The authors are supported by the German-Lab project
1373	   (http://www.german-lab.de/), a research project funded by the German
1374	   Federal Ministry of Education and Research (BMBF). The views
1375	   expressed here are those of the author(s) only. The BMBF is not
1376	   liable for any use that may be made of the information in this
1377	   document.

1379	   The authors gratefully acknowledge significant input into this
1380	   document from Koojana Kuladinithi, Asanga Udugama, Andreas Koensgen,
1381	   Andres Toro (all from University of Bremen), Andreas Timm-Giel
1382	   (Hamburg University of Technology), Thomas-Rolf Banniza and Peter
1383	   Schefczik (all from Alcatel-Lucent Bell Labs).

1385	Authors' Addresses

1387	   Amanpreet Singh
1388	   University of Bremen
1389	   Otto-Hahn-Allee 1
1390	   28359 Bremen
1391	   Germany

1393	   Email: aps@comnets.uni-bremen.de

1395	   Michael Scharf
1396	   Alcatel-Lucent Bell Labs
1397	   Lorenzstrasse 10
1398	   70435 Stuttgart
1399	   Germany

1401	   Email: michael.scharf@alcatel-lucent.com