Internet Engineering Task Force                                 A. Singh
Internet Draft                                      University of Bremen
Intended status: Experimental                                  M. Scharf
Expires: January 2011                           Alcatel-Lucent Bell Labs
                                                          August 6, 2010

      PayLoad Multi-connection Transport using Multiple Addresses
                     draft-singh-mptcp-plmt-00.txt

Abstract

   The single path transport provided by the Transmission Control
   Protocol (TCP) can be extended to a multipath transport session for
   multi-homed end hosts by coupling several TCP connections over
   multiple interfaces of the end hosts. Payload Multi-connection
   Transport (PLMT) is a multipath protocol variant that encodes all
   the control/signaling information in the payload of TCP connections
   and therefore requires no additional TCP options. PLMT allows for
   the simultaneous use of multiple connections over potentially
   disjoint paths while being mostly backward compatible with the
   single path transport of TCP. PLMT operates as an additional
   protocol layer between the network stack and the application layer.
   This document describes PLMT as an example of a multipath mechanism
   that could possibly be realized entirely in the user space of an
   operating system.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   This Internet-Draft will expire on January 6, 2011.

Copyright Notice

   Copyright (c) 2010 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document. Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.

Table of Contents

   1. Introduction
   2. Terminology
   3. Design Considerations
      3.1. Goals
      3.2. Layered Representation
      3.3. Operation Summary
      3.4. Compatibility
      3.5. Advantages and Drawbacks of PLMT
   4. PLMT Protocol
      4.1. Session Initiation
      4.2. Exchange of PLMT Signaling Over the PLMT Control Channel
         4.2.1. Establishment of the Control Connection
         4.2.2. PLMT Capable Messages
         4.2.3. Further Usage of the Control Connection
         4.2.4. Discussion of Control Connection Failure Cases
      4.3. PLMT Data Connection Setup and Operation
         4.3.1. Guidelines for selection of a Signature
         4.3.2. Bundling of Initial Connection to the Control
                Connection in Parallel Setup
         4.3.3. Bundling of Initial Connection to the Control
                Connection in Late Setup
      4.4. Additional Subflow Connections Initiation and Operation
         4.4.1. Address Advertisement
         4.4.2. Subflow Connection Setup
         4.4.3. TLV Encoding of Data Segments
         4.4.4. Data Acknowledgments
      4.5. Other Aspects
         4.5.1. Congestion Control
         4.5.2. Path Management and Scheduling
         4.5.3. Closing Connections and Sessions
   5. Interaction with Middleboxes
      5.1. Middleboxes that Translate Address/Ports
      5.2. Middleboxes that Manipulate TCP Options
      5.3. Middleboxes that Parse Content
      5.4. Middleboxes that Change content
   6. Security Considerations
      6.1. Reappearance of Signature in Application Data
      6.2. Resilience against Malicious Attacks
   7. Open Issues
   8. IANA Considerations
   9. Conclusion
   10. References
      10.1. Normative References
      10.2. Informative References
   11. Acknowledgments

1. Introduction

   The objective of a multipath transport mechanism is to allow the
   simultaneous use of multiple connections over multiple paths.
   A multipath transport mechanism is expected to be beneficial since
   it enhances the network resource utilization and since it provides
   resilience to node failures in the network [5].

   One key mechanism that aims to provide multipath transport is
   Multipath TCP (MPTCP). MPTCP enables multipath transport by
   utilizing multiple addresses of the end host to establish multiple
   paths (subflows) for a TCP connection [6]. MPTCP extends the
   standard Transmission Control Protocol (TCP) [2] to add the
   multipath capability and uses several new TCP options to encode
   control/signaling information.

   Another multipath transport solution, MCTCP [9], uses new TCP
   options only during connection setup to transport signaling
   information. Afterwards, the additional signaling information is
   sent together with the application data in the payload using a
   type-length-value (TLV) framing format.

   This document presents the Payload Multi-connection Transport (PLMT)
   protocol design as a further alternative multipath transport
   mechanism. PLMT also uses a type-length-value (TLV) framing format
   to send application data and control/signaling information. However,
   unlike other multipath transport solutions, PLMT does not use new
   TCP options to transmit control/signaling information. Instead,
   PLMT sets up a control connection to a well-known port for the
   signaling information exchange, and it uses payload encoding over
   standard TCP connections. The control connection can be set up
   either before starting the data transport or afterwards. In either
   case, it is possible to implement the PLMT signaling without
   changing the network stack. Each of the multiple PLMT connections is
   a standard TCP connection that transports TLV-encoded data segments,
   and these connections are coupled together into the PLMT session.

   Therefore, PLMT is easily deployable and extensible.
   PLMT is also transparent to applications and offers reliable
   transport similar to a standard TCP connection. PLMT is also mostly
   backward compatible with single path standard TCP. By design, PLMT
   operates robustly in environments with middleboxes that prevent the
   use of new TCP options. However, the use of out-of-band signaling
   comes at some cost concerning complexity, fall-back options, and
   security. As outlined in this document, PLMT is designed to minimize
   these risks and is rather robust. This document presents PLMT and
   discusses both the advantages and drawbacks of its design.

2. Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [1].

   This document uses the terminology defined in [5][6], though some of
   the terms are re-defined.

   Session: A connection over which an application can communicate
   between two hosts. For an application, there is a one-to-one mapping
   between a session and the socket. If a session includes only the
   Initial Connection, it is almost identical to a standard TCP
   connection.

   PLMT Control Port: A port allocated to accept the PLMT Control
   Connections.

   PLMT Layer: A protocol layer implementing the multi-connection
   capability of PLMT. It can, for instance, be realized in the user
   space of an operating system.

   Initial Connection: A TCP connection established by an application
   request. If both ends are PLMT capable, the first subflow uses this
   connection.

   Additional Subflow Connection: A new TCP connection established for
   a subsequent subflow.

   Control Connection: A TCP connection that is established to the PLMT
   Control Port. The IP addresses are identical to those of the Initial
   Connection.

   PLMT Data Segment: The segmented application data with TLV header.

   Active Opener: Refers to the TCP client for a Session with a PLMT
   Layer.

   Passive Opener: Refers to the TCP server for a Session with a PLMT
   Layer.

   Legacy End-host: Refers to a host without a PLMT Layer.

   Token: A 64-bit number that is unique on a host.

   Signature: A long bit pattern that is used to identify PLMT messages
   inside TCP connections. The length is 16 bytes (128 bits). It MUST
   be selected such that it is unlikely to occur in application
   protocols. Guidelines on how to select a Signature are given in
   Section 4.3.1.

   Session Sequence Number: The sequence number of a byte inside a byte
   stream of a session, determined by the PLMT Layer.

3. Design Considerations

   This section gives a high-level overview of PLMT's design.

3.1. Goals

   Important design assumptions and goals of the PLMT design are:

   o  No change of network stack: PLMT is designed to minimize the
      impact on the network stack implementation. The signaling can
      be completely implemented in the user space of an operating
      system.

   o  Backward compatible: PLMT should be backward compatible with
      standard TCP. A PLMT Session with a single connection should
      behave exactly like a standard TCP connection. As long as only
      one connection exists, it is not necessary to use TLV framing
      on that connection.

   o  Co-existence with standard TCP connections: A PLMT capable end
      host must be able to differentiate between PLMT connections and
      regular TCP connections. This is crucial, since PLMT
      connections use TLV encoding.

   o  Multihomed and multiaddressed end hosts: PLMT assumes that, for
      the establishment of multiple connections, at least one of the
      end hosts is multihomed and multiaddressed.

   o  Middlebox compatibility: PLMT should be compatible with the vast
      majority of middleboxes, such as NAPT middleboxes and
      firewalls. Therefore, PLMT should not rely on TCP extensions.
      PLMT should also allow a middlebox to identify that a host
      establishes PLMT connections, and to prevent this.

   o  Transparency: PLMT should be transparent to legacy
      applications, i.e., it should provide the same API and services
      as standard TCP to the application.

3.2. Layered Representation

   PLMT operates as an additional protocol layer (shim layer) between
   the application layer and the transport layer. It is designed to be
   transparent to both higher and lower layers and to be implemented in
   the user space. It can be used by legacy applications without any
   changes. Figure 1 illustrates this layering.

                                   +-------------------------------+
                                   |          Application          |
   +-----------------------------+ +-------------------------------+
   |         Application         | |             PLMT              |
   +-----------------------------+ +---------------+---------------+
   |             TCP             | |      TCP      |      TCP      |
   +--------------+--------------+ +---------------+---------------+
   |             IP              | |      IP       |      IP       |
   +--------------+--------------+ +---------------+---------------+

   Figure 1 Comparison of Standard TCP and PLMT Protocol Stacks

3.3. Operation Summary

   This section gives an overview of the high-level operation of
   PLMT. Figure 2 depicts a simple scenario to illustrate the basic
   PLMT operation. A detailed PLMT protocol specification and operation
   description is provided in Section 4.

   o  A legacy application, unaware of the presence of PLMT, will
      initiate a standard TCP connection by opening a TCP socket for
      a Session. PLMT-aware applications MAY use a new application
      interface [8] to control the functioning of PLMT.

   o  The PLMT Layer then manages the connection establishment of the
      Initial Connection, the Control Connection, and additional
      Subflow Connections.

   o  In order to enable PLMT, the Active Opener opens a PLMT
      Control Connection to a well-known port at the Passive Opener.
      The Control Connection is used to determine whether the remote
      end supports PLMT, and to exchange the necessary control
      information such as the Tokens. The Control Connection and the
      Subflow Connections are established in the standard TCP way by
      the PLMT Layer.

   o  A node may set up a Control Connection either before or in
      parallel to the setup of the Initial Connection (see Figure 2).
      Alternatively, it may first use the Initial Connection and
      decide later to open the Control Connection. The latter case is
      discussed in Section 4.3.3. The Control Connection must be set
      up using the same source and destination IP addresses as the
      Initial Connection, and it must use the PLMT Control Port. If
      the setup of the Control Connection fails, PLMT will not be
      enabled and the Session falls back to standard TCP.

   o  If the Passive Opener supports PLMT and TLV transport is
      successfully enabled, the Initial Connection will use TLV
      framing for data transmission. The Initial Connection is then
      also termed the first Subflow Connection. The setup of the TCP
      connections between two hosts A and B is illustrated in
      Figure 2. PLMT signals the use of TLV encoding by sending the
      Signature in the payload of the TCP byte stream. The Signature
      is a long bit pattern that is selected in such a way that it is
      unlikely to occur in a TCP connection not using PLMT.
      Furthermore, Tokens are used to verify that the Initial
      Connection and the Control Connection indeed originate at the
      same hosts. A detailed analysis of the security implications of
      PLMT and the resulting very small risk of false positives when
      detecting its connections is provided in Section 6.

   o  If multiple interfaces are present, PLMT can establish
      multiple Subflow Connections to allow data transport over
      multiple paths.
      Once TLV-encoded data transport is activated, a Session-level
      data sequence number is used for in-order delivery of the Data
      Segments over multiple Subflow Connections. The PLMT Layer
      manages the multiple interfaces and connections and delivers
      the packets over the different connections. At the receiver,
      the PLMT Layer reassembles the byte stream and transparently
      delivers it to the application.

   o  As the Subflow Connections are standard TCP connections, they
      are terminated like a regular TCP connection with the 4-way FIN
      handshake. The Session is terminated with the termination of
      the last subflow.

          End-host A                                 End-host B
  ---------------------------                ---------------------------
   Address A1     Address A2                  Address B1     Address B2
  ------------   ------------                ------------   ------------
       |              |                           |              |
       |        (Initial Connection setup)        |              |
       |---------------SYN----------------------->|              |
       |<------------SYN/ACK----------------------|              |
       |---------------ACK----------------------->|              |
       |              |                           |              |
       |        (Control Connection setup)        |              |
       |~~~~~~~~~~~~~~~SYN~~~~~~~~~~~~~~~~~~~~~~~>|              |
       |<~~~~~~~~~~~~SYN/ACK~~~~~~~~~~~~~~~~~~~~~~|              |
       |~~~~~~~~~~~~~~~ACK~~~~~~~~~~~~~~~~~~~~~~~>|              |
       |              |                           |              |
       | (Token exchange over Control Connection  |              |
       |         as detailed in Figure 3)         |              |
       |~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~>|              |
       |<~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~|              |
       |              |                           |              |
       |             Signature+Token              |              |
       |----------------------------------------->|              |
       |             Signature+Token              |              |
       |<-----------------------------------------|              |
       |              |                           |              |
       |        (TLV-encoded Data Segments        |              |
       |       over the Initial Connection)       |              |
       |----------------------------------------->|              |
       |<-----------------------------------------|              |
       |              |                           |              |
       |(Address Exchange over the Control or Initial Connection)|
       |              |                           |              |
       |       (Additional Subflow Connection setup (TCP))       |
       |==========================SYN===========================>|
       |<=======================SYN/ACK==========================|
       |==========================ACK===========================>|
       |              |                           |              |
       |        (Signature and TLV-encoded Data Segments         |
       |              over the Subflow Connection)               |
       |========================================================>|
       |<========================================================|
       |              |                           |              |

   Figure 2 PLMT Connection Establishment when the Control Connection
            is set up in parallel to the Initial Connection

3.4. Compatibility

   PLMT uses the Control Connection to detect whether a Passive Opener
   indeed supports its operation. If the setup of the Control
   Connection fails, it falls back to standard TCP transport, and does
   not use any additional PLMT signaling. PLMT is thus compatible with
   legacy TCP stacks and is able to detect them.

   The PLMT Layer is transparent to applications, i.e., it is
   compatible with legacy applications unaware of PLMT.

   The PLMT protocol does not require extensions of the TCP protocol
   and reuses the standard TCP mechanisms for the reliable, in-order
   operation of its connections. PLMT uses its own frame format based
   on the TLV encoding to send the application data and the control
   information. The use of TLV encoding is known from other TCP-based
   protocols such as TLS [3]. Therefore, PLMT should pass most
   middleboxes, in particular all middleboxes that would block TCP
   options. An exception is the case of middleboxes that parse the byte
   stream and block TLV content. In this case, PLMT transport may fail,
   as discussed in Sections 5.3 and 5.4.

   The signaling and message transport of PLMT can be implemented on a
   host without changing the network stack, i.e., as a library in the
   user space. With a combination of scheduling and rate shaping
   mechanisms, the PLMT Layer can also try to emulate congestion
   control coupling algorithms such as [4].
   In this case, it may be possible to implement PLMT entirely in the
   user space of a host.

3.5. Advantages and Drawbacks of PLMT

   PLMT follows the principles outlined for a multipath transport
   solution based on TCP in [5]. PLMT uses the TCP payload to transport
   signaling messages and requires no new TCP options. Thus, PLMT
   brings along all advantages of the payload encoding mechanism (cf.
   [9]):

   o  PLMT does not use any TCP option to set up its connections.
      Therefore, it might be possible to implement PLMT entirely in
      the user space, which would significantly facilitate deployment
      of PLMT.

   o  In addition, the signaling messages are not constrained by the
      limited size of the TCP options, and PLMT does not consume
      further option space in SYN segments.

   o  PLMT does not modify TCP and is therefore compatible with many
      middleboxes, especially ones which do not allow unknown TCP
      options to get through, or ones that re-write the TCP options.

   o  Middleboxes can very easily identify the setup of a PLMT
      Control Connection due to the use of a well-known port. If a
      middlebox on the path of the Initial Connection wants to
      prevent the use of multipath transport, it can simply block the
      connection setup to that port. Then, multipath transport will
      not be used for the corresponding connection.

   PLMT is developed as an example of a multipath transport protocol
   that does not use any new TCP option, or other TCP extensions, and
   that is still backward compatible. Still, due to the use of payload
   encoding and an out-of-band control channel for the exchange of
   control information, a number of issues arise. The following text
   discusses these problems (some of which may exist for other
   multipath transport solutions as well) and possible solutions.

   o  PLMT opens a Control Connection per PLMT Session, i.e., an
      additional TCP connection.
      If a host opens a Control Connection for every short TCP-based
      transfer, this results in a large number of additional
      connection setups, which consume bandwidth, processing
      resources, and port numbers. The worst case is that a PLMT
      Control Connection is set up for every Initial Connection, but
      additional subflows are never established. Then, the number of
      TCP connections doubles without any performance benefit. As a
      remedy, PLMT can also first use an Initial Connection without a
      Control Connection, and try to establish the Control Connection
      after some time. Once PLMT capability is detected and
      additional signaling information has been exchanged, the
      Initial Connection as well as potential additional Subflow
      Connections can then be used to transport PLMT TLV-encoded data
      traffic. This mechanism avoids needless Control Connection
      setups for short transfers.

   o  PLMT needs a well-known, dedicated port for the Control
      Connections, similar to TLS [3]. If PLMT is enabled on a host,
      it may try to establish Control Connections to that port for
      all communication partners. Even if heuristics can be used to
      learn whether servers support PLMT, and thus to reduce the
      connection setup attempts, numerous legacy hosts in the
      Internet will receive connection setups on that port. To
      legacy systems, this may look similar to a SYN flooding attack.
      As a countermeasure, network administrators may configure
      firewalls to block the PLMT Control Port, which would prevent
      the usage of the protocol once it is more widely deployed.

   o  Middleboxes that transparently change the length of content
      are a problem for multipath transport protocols. When using
      TLV-based transport, PLMT could detect such middleboxes by
      using a checksum, or by observing broken TLV headers, and try
      retransmissions.
      However, if the byte stream is transparently changed before
      switching to TLV encoding, difficulties can arise. For
      instance, the Signature may not be at the position where it is
      expected. In this case, PLMT cannot enter the TLV mode, but it
      also cannot necessarily fall back. It may either have to cancel
      the transfer by closing the PLMT Session or, in the worst case,
      it may even deliver corrupted data to an application.

   o  PLMT delays the setup of connections in various scenarios. If
      an Active Opener wants to use TLV encoding immediately on the
      Initial Connection, it must await the setup of the Control
      Connection. If there is no response (no SYN/ACK), the Active
      Opener may either retransmit the SYN, i.e., wait for a longer
      time, or give up. Then, multipath transport is not possible. In
      all cases, there is at least a small delay before the data
      transport over the Initial Connection can start. If the Active
      Opener decides to set up the Control Connection later, this
      delay is avoided. But then the Active Opener must stop data
      transmission after the setup of the Control Connection, in
      order to ensure a safe exchange of Tokens, which interrupts the
      data transport.

   o  The Passive Opener has a significant processing overhead due
      to PLMT. First and most obviously, there is the overhead of
      maintaining the Control Connections, which can be significant
      for a highly loaded server with thousands of connections.

   o  The second and trickier challenge is the distinction between
      legacy TCP connections and connections originating from hosts
      that use PLMT. PLMT Subflow Connections are characterized by
      the presence of the Signature in the byte stream.
      This means that the PLMT Layer must accept all incoming
      connections, parse for the presence of a valid Signature, and
      then decide whether it is a legacy connection or a connection
      transporting PLMT content with TLV encoding. The parsing for
      Signatures is difficult if an incoming connection sends less
      data than the length of the Signature. If the first bytes match
      a valid Signature, or if no bytes are received at all, the PLMT
      Layer must wait for the arrival of further data, or time out,
      e.g., if the corresponding application does not send enough
      bytes. If it times out, the only safe option is to close the
      connection. This means that the PLMT Layer may reject not only
      PLMT connections that suffered from retransmissions within the
      first bytes, but also valid TCP connection setups from legacy
      stacks if they happen to (partly) match a Signature. If the
      delayed setup of Control Connections is allowed, the parsing
      overhead is even larger. The PLMT Layer must then parse all
      established TCP connections for all valid Signatures at the
      negotiated positions in the byte stream, which may also require
      temporary buffering of data if only parts of a valid Signature
      are received, or if the rest of the first TLV message is
      missing. In all cases, the delivery of data to applications may
      be delayed.

   o  On a Passive Opener, the PLMT Layer has to accept incoming
      connections in order to parse the payload before it can hand
      over the connection to the application. This can delay data
      delivery, and may also result in inconsistent views of when the
      connection is actually established. Further studies are needed
      to understand whether the delay of connection establishment as
      seen by applications, which does not occur in case of
      option-based multipath protocols, could break existing
      applications.

   o  Due to the processing and buffer overhead required to identify
      connections by payload parsing, the Passive Opener is
      vulnerable to a Denial-of-Service (DoS) attack: An attacker can
      open a large number of Control Connections, which will consume
      resources on a server and slow down data delivery on other
      connections. Passive Openers can reduce the risk by only
      accepting Coupled Connections from source IP addresses that
      also originate an existing connection, but this does not offer
      complete protection, in particular if an attacker is sitting
      behind a large NAPT middlebox. Another remedy is to limit the
      number of allowed Control Connections, but then other users of
      PLMT suffer from the effects of Control Connection setup
      failures.

   o  PLMT must exchange the Token information in the payload of the
      Initial Connection, in order to verify that an Initial
      Connection and a Coupled Connection indeed have the same
      endpoints. This requires the transport of a TLV-encoded
      message. As a consequence, unlike other multipath transport
      protocols [6] [9], PLMT cannot fall back to a backward
      compatible byte stream transport if a middlebox on the path
      blocks the TLV transport.

   o  If there is a single-homed Active Opener and a multi-homed
      Passive Opener, PLMT cannot indicate to the Active Opener that
      multipath transport may make sense, i.e., that it could
      establish a Control Connection, before that connection actually
      exists. Other multipath transport protocols [6] [9] have a
      signaling mechanism for this. PLMT can only detect this
      situation if it blindly opens Control Connections in all cases.

   o  If a middlebox does not intercept the information on the
      Control Connections, or if it does not know the Signature by
      other means, it cannot determine whether a given TCP connection
      transports PLMT data.
      If a middlebox is not on the path of the Control Connection, it
      cannot prevent the use of TLV encoding. For the latter case, a
      possible remedy would be for Additional Subflow Connections to
      use another well-known port, which could then be blocked.

   o  With a small probability, a Passive Opener can erroneously
      accept a connection from a legacy host as a PLMT Subflow
      Connection, if an application happens to send a bit pattern
      that is identical to one of the valid Signatures of that
      Passive Opener, plus the valid Tokens. This may happen either
      if the first bytes of a standard TCP connection match an active
      Signature, or if a corresponding bit pattern is present at
      exactly the sequence position negotiated on a Control
      Connection. In that case, TLV-encoded content will be injected
      into a legacy connection, which will be corrupted. Due to the
      length of the Signature, this error probability is very small.

   o  An attacker can abuse PLMT to break legacy TCP connections to
      a PLMT-enabled Passive Opener if it is sitting behind the same
      NAPT middlebox as another Active Opener, as already explained.
      In this case, the attacker can open multiple Control
      Connections, not only as a DoS attack, but also to attack other
      users. With a very small probability, the Signature and Tokens
      negotiated over the Control Connection will match another
      connection. If so, TLV content will be injected on that
      connection, and it will break, too. Again, the success
      probability of this attack is very small.

   In summary, PLMT is a multipath protocol that is designed as a
   payload-only solution. It is useful for controlled and trusted
   environments, for networks with middleboxes that affect the use of
   TCP options, and for use cases where it is impossible to change the
   network stack.

4. PLMT Protocol

   This section details the operation of the PLMT protocol.

4.1.
Session Initiation

   A session initiation begins with an application request for a new
   TCP connection, upon which the PLMT protocol performs the following
   actions.

4.2. Exchange of PLMT Signaling Over the PLMT Control Channel

   A node MAY set up a TCP Control Connection before or in parallel
   with the setup of the Initial Connection (Parallel Setup), or it
   MAY set up the Control Connection at a later point in time (Late
   Setup). Both variants have advantages and drawbacks, and they
   affect how the Initial Connection is used.

4.2.1. Establishment of the Control Connection

   The Active Opener MUST set up the TCP Control Connection using the
   same source and destination IP addresses as the Initial Connection,
   and it MUST be destined to the PLMT Control Port. If the TCP
   connection is successfully set up, this is a first indication that
   the Passive Opener indeed supports PLMT. In order to exclude the
   case that another service is accidentally running on that port,
   PLMT support is further verified by PLMT Capable Messages.

   A Passive Opener SHOULD verify whether there are already
   established TCP connections from the same Active Opener, in order
   to reduce the vulnerability to DoS attacks.

4.2.2. PLMT Capable Messages

   If the Control Connection is set up successfully, the two hosts can
   be expected to have an operational PLMT Shim Layer. The end hosts
   MUST exchange the Tokens as shown in Figure 3 to further validate
   the existence of the PLMT Shim Layer and the willingness of the
   Passive Opener to use PLMT. Note that at this stage of the
   signaling the Passive Opener cannot safely identify the Initial
   Connection that this Control Connection shall be associated with.

           End-host A                                End-host B
   ---------------------------               ---------------------------
   Address A1     Address A2                 Address B1     Address B2
   ------------   ------------               ------------   ------------
        |               |                          |              |
        |     (Control Connection setup (TCP))     |              |
        |~~~~~~~~~~~~~~~SYN~~~~~~~~~~~~~~~~~~~~~~~>|              |
        |<~~~~~~~~~~~~SYN/ACK~~~~~~~~~~~~~~~~~~~~~~|              |
        |~~~~~~~~~~~~~~~ACK~~~~~~~~~~~~~~~~~~~~~~~>|              |
        |               |                          |              |
        |         (PLMT Capable Signaling)         |              |
        |~~~~~~~~~~PLMT Token Indication~~~~~~~~~~>|              |
        |<~~~~~~~~PLMT Token Confirmation~~~~~~~~~~|              |
        |               |                          |              |

     Figure 3 PLMT Signaling Exchange over the Control Connection

   The frame format of the PLMT Token Indication message is shown in
   Figure 4. The Token is a unique number for a host and is used to
   identify a particular PLMT Session. To make it harder for an
   attacker to guess the Token by brute force, a 64-bit Token SHOULD
   be generated randomly [7]. Furthermore, the PLMT Token Indication
   message includes the Signature of the Active Opener, as well as the
   byte position in the Initial Connection where this Signature will
   be present. The byte position is provided in the Token Indication
   in order to reduce the parsing overhead of a Passive Opener, and
   the risk that an attacker can hijack a connection by negotiating a
   large number of Signatures and Tokens with a Passive Opener. This
   implies that an Active Opener can only send data up to this
   position before it receives a PLMT Token Confirmation message. In
   case of a Parallel Setup, this position is set to 0, as the
   Signature is sent at the beginning of the connection. As a side
   note, the whole mechanism can fail if the bytestream length is
   changed by a middlebox.
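
   As an illustration only, the Token Indication message described
   above could be encoded and decoded as sketched below. The sketch
   assumes the layout of Figure 4 with an 8-bit Kind, a 16-bit Length
   that counts the whole message, and an 8-bit reserved field, all in
   network byte order. The numeric Kind value and the function names
   are hypothetical; this document does not assign them.

```python
import secrets
import struct

# Hypothetical Kind code; the draft does not assign numeric values.
KIND_TOKENIND = 0x01

# Assumed layout, read off Figure 4: 8-bit Kind, 16-bit Length,
# 8-bit reserved, Signature (16 bytes), Token (8 bytes),
# Signature offset (4 bytes), all in network byte order.
TOKENIND = struct.Struct("!BHB16s8sI")


def new_token() -> bytes:
    """Return a random 64-bit Token taken from the OS CSPRNG."""
    return secrets.token_bytes(8)


def pack_token_indication(signature: bytes, token: bytes, offset: int) -> bytes:
    """Encode a Token Indication; offset is 0 for Parallel Setup."""
    if len(signature) != 16 or len(token) != 8:
        raise ValueError("Signature must be 16 bytes and Token 8 bytes")
    return TOKENIND.pack(KIND_TOKENIND, TOKENIND.size, 0,
                         signature, token, offset)


def unpack_token_indication(message: bytes) -> tuple[bytes, bytes, int]:
    """Decode a Token Indication into (signature, token, offset)."""
    kind, length, _reserved, signature, token, offset = TOKENIND.unpack(message)
    if kind != KIND_TOKENIND or length != TOKENIND.size:
        raise ValueError("not a well-formed Token Indication")
    return signature, token, offset
```

   With these assumptions the encoded message is 32 bytes long, which
   agrees with the Length value given in Figure 4.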

                        1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +---------------+--------------------------------+--------------+
   |Kind=TOKENIND  |           Length=32            |   reserved   |
   +---------------+--------------------------------+--------------+
   :          Active Opener Signature (in total 16 bytes)          :
   +---------------------------------------------------------------+
   :            Active Opener Token (in total 8 bytes)             :
   +---------------------------------------------------------------+
   |                  Signature offset (4 bytes)                   |
   +---------------------------------------------------------------+

     Figure 4 PLMT Token Indication message (sent via the Control
                              Connection)

   In response to the PLMT Token Indication from the Active Opener,
   the Passive Opener SHOULD either send back its own Token in a PLMT
   Token Confirmation message, shown in Figure 5, or immediately close
   the Control Connection instead. This message also echoes back the
   Active Opener's Token, in order to verify that the reply is indeed
   sent by a PLMT layer.

                        1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +---------------+--------------------------------+--------------+
   |Kind=TOKENCONF |           Length=36            |   reserved   |
   +---------------+--------------------------------+--------------+
   :         Passive Opener Signature (in total 16 bytes)          :
   +---------------------------------------------------------------+
   :            Passive Opener Token (in total 8 bytes)            :
   +---------------------------------------------------------------+
   :        Echo of Active Opener Token (in total 8 bytes)         :
   +---------------------------------------------------------------+

     Figure 5 PLMT Token Confirmation message (sent via the Control
                              Connection)

   Upon reception of that message, the Active Opener MUST first verify
   the validity of the message (in particular the echoed Token).
   If the message is valid, it MUST send the Signature provided by the
   Passive Opener at the indicated byte position in the Initial
   Connection, directly followed by a PLMT Token Message. Afterwards,
   TLV framing has to be used. The Passive Opener must react
   similarly: after having received the Signature and Token on an
   Initial Connection, the Passive Opener MUST send the Active
   Opener's Signature and a PLMT Token Message over the Initial
   Connection, too, and use TLV framing afterwards. Thus, after having
   sent the Signature, the Active Opener must parse all incoming bytes
   on the Initial Connection for the Signature sent by the Passive
   Opener, in order to detect the beginning of TLV transfer in the
   reverse direction. In the simplest case, the Passive Opener has not
   sent any data in the meantime, i.e., the Signature is received
   immediately. However, other cases are possible, too.

   Note that this method is inefficient and also has a very small risk
   of false positives, as it requires byte-wise parsing of the byte
   stream. Yet, the fundamental problem is that the Passive Opener
   cannot provide a byte offset for the Signature over the Control
   Channel during the PLMT Capability Signaling phase, as the Initial
   Connection and the Control Connection cannot be associated at that
   time. As an optimization, the Passive Opener could provide a
   bytestream offset in a separate signaling message once it has
   received the Token on the Initial Connection, but PLMT cannot rely
   on this, as the Control Connection could fail or stall in the
   meantime and then the PLMT session would not be in a consistent
   state. The PLMT signaling exchange is designed to reflect an atomic
   transaction.

4.2.3. Further Usage of the Control Connection

   The Control Connection is only needed to exchange token information
   and to verify the association with the Initial Connection.
   After the PLMT capability exchange has been completed, the Control
   Connection is no longer needed, and it MAY be closed. All further
   control information, such as additional addresses, can also be
   exchanged over the Subflow Connections by corresponding TLV
   messages. However, the Control Connection MAY also be kept
   established and used for further PLMT signaling. In particular, it
   could be useful to exchange address information over the Control
   Connection instead of the Subflow Connections. This would enable a
   future NAPT helper for the PLMT protocol that could try to
   translate private to public addresses. A detailed discussion of
   this is outside the scope of this document.

4.2.4. Discussion of Control Connection Failure Cases

   A failure to set up a Control Connection is an indication that the
   other end host does not have a PLMT Layer, or that middleboxes do
   not allow the establishment of a PLMT Control Connection. An Active
   Opener MUST await the successful PLMT capability exchange on the
   Control Connection before starting to send the Signature and TLV
   encoded content. An Active Opener MAY also give up after a certain
   waiting time. Then, it MUST close the Control Connection, and use
   backward compatible bytestream transport on the Initial Connection.

   The PLMT capability exchange requires only a single exchange of
   messages on the Control Connection. If the Connection fails
   afterwards, all control information can be exchanged over Subflow
   Connections. If the Control Connection fails and the Active Opener
   does not receive the Token Confirmation message, without the
   Passive Opener detecting this, there may be a synchronization
   mismatch and the Passive Opener may inject a Signature and a Token
   into the Initial Connection even if this is not expected by the
   Active Opener.
   In order to avoid data corruption, the Active Opener could parse
   all incoming data for the Signature after failure of a Control
   Connection, but this may increase the processing overhead.

   If a Control Connection fails after the exchange of the tokens,
   PLMT could in principle continue to operate, since TLV encoded data
   can be transported over the established Subflow Connections, and
   since the Signatures and Tokens are already known.

4.3. PLMT Data Connection Setup and Operation

   PLMT provides two modes of operation, which differ in the time at
   which the Control Connection is established: Parallel Setup and
   Late Setup. Parallel Setup is significantly simpler for a Passive
   Opener, as Signatures are sent in the first bytes of a connection
   and are therefore simple to identify. Unfortunately, setting up a
   Control Connection for every short data transfer results in
   overhead and additional delay without any performance gain. This
   mode is therefore mainly useful if it is known in advance that a
   TCP connection will transport a large amount of data. In order to
   reduce the overhead for short connections, PLMT also allows the
   Control Connection to be established later than the Initial
   Connection. In this case, the PLMT Layer on a host MUST NOT
   initiate the TLV data encoding before the PLMT capability of the
   other host has been determined through the Control Connection (cf.
   Figure 3).

4.3.1. Guidelines for selection of a Signature

   To allow for a simple identification of where exactly the TLV
   encoding starts inside the byte stream, a 128-bit Signature is used
   as a delimiter between bytestream and TLV encoding (cf. Figure 6).
   The Signature is selected by the host that must parse it, and MUST
   be chosen such that collisions with existing application protocols
   are minimal.
   Note that it is up to the hosts to decide what Signature to use for
   different connections. The most secure solution is to use a
   different Signature for every Control Connection, but then the
   parsing effort is the largest. For performance optimization, the
   PLMT Layer at a host MAY use the same Signature in more than one
   connection, but it MUST change the value on a regular basis.

                        1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +---------------------------------------------------------------+
   |                     Signature (16 bytes)                      :
   +-----------------------------------------------+---------------+
   :                           Signature                           :
   +-----------------------------------------------+---------------+
   :                           Signature                           :
   +-----------------------------------------------+---------------+
   :                           Signature                           |
   +---------------------------------------------------------------+

        Figure 6 PLMT Signature (sent on Subflow Connections)

4.3.2. Bundling of Initial Connection to the Control Connection
       in Parallel Setup

   The Control Connection is used to determine the PLMT capability of
   the end hosts. The Initial Connection MUST NOT transport any data
   before the Control Connection is established and the PLMT
   Capability Exchange is completed. If the Control Connection setup
   or the PLMT Capability Exchange fails, then the Initial Connection
   MUST NOT transmit data with TLV encoding, but only the legacy TCP
   bytestream.

   Before using TLV encoding, a host must first send the Signature on
   the Initial Connection as depicted in Figure 7. The first
   TLV-encoded messages after that delimiter must exchange the tokens
   to bundle the Initial Connection with the Control Connection, and
   to verify at both endpoints that the Initial Connection and the
   Control Connection indeed terminate at the same host. The tokens
   are exchanged by a Token Indication and a Token Confirmation
   message.
   After these messages, both sides are allowed to send other PLMT
   messages in TLV encoding over the Connection, or to establish
   further Subflow Connections. Both the Active and the Passive Opener
   must verify the Tokens. If the Tokens do not match the ones
   exchanged over the Control Connection, the PLMT session must be
   closed, as an error has apparently occurred.

           End-host A                                End-host B
   ---------------------------               ---------------------------
   Address A1     Address A2                 Address B1     Address B2
   ------------   ------------               ------------   ------------
        |               |                          |              |
        |     (Initial Connection setup (TCP))     |              |
        |---------------SYN----------------------->|              |
        |<------------SYN/ACK----------------------|              |
        |---------------ACK----------------------->|              |
        |               |                          |              |
        |     (PLMT Capability of the Other End-host has been     |
        |         determined over the Control Connection)         |
        |               |                          |              |
        |   (First TLV encoded message exchange    |              |
        |       over the Initial Connection)       |              |
        |---B's Signature + Token B Verification-->|              |
        |               |                          |    Token     |
        |               |                          |    verif.    |
        |<--A's Signature + Token A Verification---|              |
  Token |               |                          |              |
 verif. |               |                          |              |
        |               |                          |              |
        |       (TLV encoded data transport        |              |
        |       over the Initial Connection)       |              |
        |---------------TLV----------------------->|              |
        |<--------------TLV------------------------|              |
        |               |                          |              |

   Figure 7 Bundling of Initial PLMT Subflow Connection and Control
                    Connection for Parallel Setup

                        1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +---------------+---------------------------------+-------------+
   |Kind=TOKEN     |            Length=12            |  reserved   |
   +---------------+---------------------------------+-------------+
   |                        Token (8 bytes)                        |
   +---------------------------------------------------------------+

    Figure 8 PLMT Token Verification Message (sent over the Initial
                              Connection)

4.3.3.
Bundling of Initial Connection to the Control Connection
       in Late Setup

   In order to avoid the setup overhead of Control Connections for
   short-lived transfers, the PLMT protocol MAY establish the Control
   Connection after data has already been exchanged on the Initial
   Connection. This document does not describe heuristics for when to
   set up the Control Connection. They may take into account factors
   such as the number of bytes transferred, cached information about
   support of PLMT, or user preferences.

   A receiver MUST assume that all bytes received on an incoming TCP
   connection are sent by a legacy end system until a match with a
   valid Signature is possible. Until then, all data must be passed to
   the application in unmodified form. Thus, with a very small
   probability, PLMT risks delivering corrupted data to an
   application.

   Once the Control Connection is established and the PLMT capability
   information of the end hosts has been exchanged, the Active Opener
   can send the Passive Opener's Signature and a PLMT Token
   Verification message over the Initial Connection, at the position
   in the byte stream that has been advertised over the control
   channel. The token exchange in the payload of the Initial
   Connection is used to verify that the Initial Connection and the
   Control Connection actually involve the same hosts.
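
   The receive-side behavior described above can be sketched as
   follows. This is a minimal sketch under stated assumptions, not a
   normative algorithm: the class and method names are hypothetical,
   the Signature and offset are assumed to have been learned over the
   Control Connection, and delivering held-back bytes to the
   application when the Signature does not match is a choice made for
   this illustration only.

```python
SIGNATURE_LEN = 16  # 128-bit Signature


class LateSetupReceiver:
    """Receive side of one Initial Connection during Late Setup."""

    def __init__(self) -> None:
        self.position = 0      # legacy bytes passed to the application
        self.expected = None   # (signature, offset) from the Control Connection
        self.pending = b""     # bytes held back at the advertised offset
        self.tlv_mode = False
        self.tlv_buffer = b""  # raw TLV bytes kept for a later parser

    def expect_signature(self, signature: bytes, offset: int) -> None:
        """Record the Signature and byte position advertised by the peer."""
        if len(signature) != SIGNATURE_LEN:
            raise ValueError("Signature must be 16 bytes")
        if offset < self.position:
            raise ValueError("advertised offset has already been passed")
        self.expected = (signature, offset)

    def feed(self, data: bytes) -> bytes:
        """Consume received bytes; return the bytes for the application."""
        if self.tlv_mode:
            self.tlv_buffer += data
            return b""
        if self.expected is None:
            # No Signature negotiated yet: everything is legacy data.
            self.position += len(data)
            return data
        signature, offset = self.expected
        # Bytes in front of the advertised offset are still legacy data.
        legacy_len = max(0, min(len(data), offset - self.position))
        legacy, rest = data[:legacy_len], data[legacy_len:]
        self.position += legacy_len
        self.pending += rest
        if len(self.pending) < SIGNATURE_LEN:
            return legacy  # wait until a full comparison is possible
        if self.pending[:SIGNATURE_LEN] == signature:
            self.tlv_mode = True
            self.tlv_buffer = self.pending[SIGNATURE_LEN:]
            self.pending = b""
            return legacy
        # Assumption of this sketch: on a mismatch, treat the held-back
        # bytes as legacy data and stop looking for the Signature.
        released = self.pending
        self.position += len(released)
        self.pending = b""
        self.expected = None
        return legacy + released
```

   Because the comparison happens only at the advertised position, the
   receiver needs to buffer at most one Signature length of data per
   connection in this sketch.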

           End-host A                                End-host B
   ---------------------------               ---------------------------
   Address A1     Address A2                 Address B1     Address B2
   ------------   ------------               ------------   ------------
        |               |                          |              |
        |     (Initial Connection setup (TCP))     |              |
        |---------------SYN----------------------->|              |
        |<------------SYN/ACK----------------------|              |
        |---------------ACK----------------------->|              |
        |               |                          |              |
        |              (Data Segments              |              |
        |    sent over the Initial Connection)     |              |
        |----------------------------------------->|              |
        |<-----------------------------------------|              |
        |               |                          |              |
        |     (Control Connection setup (TCP))     |              |
        |~~~~~~~~~~~~~~~SYN~~~~~~~~~~~~~~~~~~~~~~~>|              |
        |<~~~~~~~~~~~~SYN/ACK~~~~~~~~~~~~~~~~~~~~~~|              |
        |~~~~~~~~~~~~~~~ACK~~~~~~~~~~~~~~~~~~~~~~~>|              |
        |               |                          |              |
        |   (TLV-Enabled PLMT Control Signaling    |              |
        |    sent over the Control Connection)     |              |
        |~~~Sign. indic. (A's sign., A's token)~~~>|              |
        |<~~Sign. confirm. (B's sign., B's token)~~|              |
        |               |                          |              |
        |        (Message exchange over the        |              |
        |           Initial Connection)            |              |
        |---B's Signature + Token B verification-->|              |
        |                                          |    Token     |
........|..........................................|    verif.    |
        |<--A's Signature + Token A verification---|              |
  Token |               |                          |              |
 verif. |       (TLV encoded data transport        |              |
        |       over the Initial Connection)       |              |
        |---------------TLV----------------------->|              |
        |<--------------TLV------------------------|              |
        |               |                          |              |

    Figure 9 Bundling of PLMT First Subflow Connection and Control
                     Connection for Delayed Setup

4.4. Additional Subflow Connections Initiation and Operation

4.4.1. Address Advertisement

   The Initial Subflow Connection, as well as the Control Connection,
   is established by the Active Opener. Once TLV encoding is enabled
   on the Initial Subflow Connection, and it is thus verified that the
   two end-hosts are PLMT capable, any of the end-hosts MAY initiate
   further Subflow Connections.
   PLMT assumes that at least one of the two connection endpoints is
   multihomed, i.e., has at least two IP addresses. The end-hosts MAY
   exchange these addresses via the Control Connection or via any
   Subflow Connection, once TLV transport is enabled. The frame
   formats for advertising and releasing addresses are given in
   Figures 10 and 11, respectively.

                        1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +---------------+-------------------------------+-------+-------+
   | Kind=ADD_ADDR |            Length             | IPVer | (res) |
   +---------------+-------------------------------+-------+-------+
   |         Address (IPv4 - 4 octets / IPv6 - 16 octets)          |
   +---------------------------------------------------------------+

                     Figure 10 PLMT Add Address

                        1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +---------------+-------------------------------+-------+-------+
   | Kind=DEL_ADDR |            Length             | IPVer | (res) |
   +---------------+-------------------------------+-------+-------+
   |         Address (IPv4 - 4 octets / IPv6 - 16 octets)          |
   +---------------------------------------------------------------+

                    Figure 11 PLMT Remove Address

4.4.2. Subflow Connection Setup

   For each additional Subflow Connection, a new TCP connection is
   initiated with a three-way handshake (SYN, SYN/ACK, ACK). The
   Signatures are used by both ends to distinguish Subflow Connections
   from normal TCP connections, and to detect the start of TLV
   encoding. If a Subflow Connection is established that shall carry
   TLV Data Segments, a sender MUST send the Signature first, before
   starting to send TLV Data Segments. In all cases, the first Data
   Segment after the Signature MUST be a Token Indication (from the
   Active Opener) or a Token Confirmation message (from the Passive
   Opener). This setup of an additional Subflow Connection is
   illustrated in Figure 12.
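
   For illustration, the address advertisement messages of Section
   4.4.1 (Figures 10 and 11) could be built as sketched below. The
   sketch assumes an 8-bit Kind, a 16-bit Length that counts the whole
   message, a 4-bit IPVer carrying the IP version number, and a 4-bit
   reserved field. The numeric Kind values and the function names are
   hypothetical; this document does not define them.

```python
import ipaddress
import struct

# Hypothetical Kind codes; the draft does not assign numeric values.
KIND_ADD_ADDR = 0x10
KIND_DEL_ADDR = 0x11

HEADER = struct.Struct("!BHB")  # Kind, Length, IPVer(4 bits) + reserved(4 bits)


def pack_addr_message(kind: int, address: str) -> bytes:
    """Encode an ADD_ADDR or DEL_ADDR message for an IPv4/IPv6 address."""
    ip = ipaddress.ip_address(address)
    length = HEADER.size + len(ip.packed)  # 8 for IPv4, 20 for IPv6
    return HEADER.pack(kind, length, ip.version << 4) + ip.packed


def unpack_addr_message(message: bytes):
    """Decode a message into (kind, address object)."""
    kind, length, ver_res = HEADER.unpack(message[:HEADER.size])
    version = ver_res >> 4
    body = message[HEADER.size:]
    if version == 4 and len(body) == 4:
        address = ipaddress.IPv4Address(body)
    elif version == 6 and len(body) == 16:
        address = ipaddress.IPv6Address(body)
    else:
        raise ValueError("IPVer and address length do not agree")
    if length != len(message):
        raise ValueError("Length field does not match the message size")
    return kind, address
```

   As discussed in Section 5.1, a host behind a NAPT would not
   advertise its local addresses with such a message.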

           End-host A                                End-host B
   ---------------------------               ---------------------------
   Address A1     Address A2                 Address B1     Address B2
   ------------   ------------               ------------   ------------
        |               |                          |              |
        |       (TLV encoded Data Segments)        |              |
        |----------------------------------------->|              |
        |<-----------------------------------------|              |
        |               |                          |              |
        |          (Over Subflow or Control Connection)           |
        |<--------------ADD_ADDR-B2----------------|              |
        |               |                          |              |
        |       (Additional Subflow Connection Setup (TCP))       |
        |***************************SYN**************************>|
        |<************************SYN/ACK*************************|
        |***************************ACK**************************>|
        |               |                          |              |
        |***B's Signature + Token B verification*****************>|
        |                                          |    Token     |
........|..........................................|    verif.    |
        |<**A's Signature + Token A verification******************|
  Token |               |                          |              |
 verif. |       (TLV encoded data transport        |              |
        | over the additional Subflow Connection)  |              |
        |***************TLV**************************************>|
        |               |                          |              |
        |<**************TLV***************************************|

            Figure 12 Additional Subflow Connection setup

4.4.3. TLV Encoding of Data Segments

   TLV encoded Data Segments can be sent on each Subflow Connection.
   Each Data Segment carries a 64-bit Session Sequence Number. A
   PLMT-capable host must maintain a Session Sequence Number in
   addition to the TCP sequence numbers of each Subflow Connection.

                        1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +----------------+-------------------------------+--------------+
   | Kind = DATA    |          Length=20+n          |   reserved   |
   +----------------+-------------------------------+--------------+
   :               Session Sequence Number (8 bytes)               :
   +---------------------------------------------------------------+
   :                 Data Segment (n bytes total)                  |
   +---------------------------------------------------------------+

              Figure 13 TLV encoded Data Segment message

   Session Sequence Numbers are used to reorder the data inside the
   PLMT session that arrives over multiple Subflow Connections. The
   Session Sequence Number is thus similar to the TCP sequence number
   and identifies each byte of data. Each Data Segment carries the
   Session Sequence Number of the first byte in the Data Segment.

   Even when a PLMT-capable host is not transmitting TLV Data
   Segments, the end host MUST store Session Sequence Numbers for all
   ongoing TCP connections, in order to be able to deal with late
   setups of a Control Connection.

4.4.4. Data Acknowledgments

   In addition to the regular TCP acknowledgements of each Subflow
   Connection, session-level Data Acknowledgements are used to
   cumulatively acknowledge the data received over the different
   Subflow Connections. A Data Acknowledgement that acknowledges the
   reception of a Data Segment message includes the next expected byte
   of the Data Segments. In normal operation, session-level Data
   Acknowledgements are not needed, but certain performance enhancing
   proxies or middlebox failures may result in situations in which the
   acknowledgments on a Subflow Connection erroneously allow the
   sender to release data that has not yet been received.
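
   The session-level reordering and cumulative acknowledgement
   described above can be sketched as follows. This is a minimal
   illustration with hypothetical names; it assumes that the Session
   Sequence Number counts bytes starting at a known initial value, and
   it ignores sequence number wrap-around and the session-level
   receive window.

```python
class SessionReassembler:
    """Reorders Data Segments that arrive over several Subflow Connections."""

    def __init__(self, initial_ssn: int = 0) -> None:
        self.next_expected = initial_ssn  # value carried in a Data Acknowledgement
        self.segments = {}                # out-of-order data, keyed by first byte

    def receive(self, ssn: int, payload: bytes) -> bytes:
        """Store one Data Segment; return the bytes now deliverable in order."""
        if ssn + len(payload) <= self.next_expected:
            return b""  # complete duplicate, e.g. sent again on another subflow
        if ssn < self.next_expected:
            # Partial overlap: drop the part that was already delivered.
            payload = payload[self.next_expected - ssn:]
            ssn = self.next_expected
        held = self.segments.get(ssn)
        if held is None or len(payload) > len(held):
            self.segments[ssn] = payload
        delivered = bytearray()
        for start in sorted(self.segments):
            if start > self.next_expected:
                break  # hole in the sequence space: wait for a retransmission
            data = self.segments.pop(start)
            skip = self.next_expected - start
            if skip < len(data):
                delivered += data[skip:]
                self.next_expected += len(data) - skip
        return bytes(delivered)
```

   After each call, next_expected holds the "next expected Session
   Sequence Number" that a receiver would report in a Data
   Acknowledgement.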

   The Data Acknowledgements also include a session-level receive
   window to correctly perform flow control at session level, and to
   avoid deadlocks.

   Since the use of Data Acknowledgements is only a mechanism to
   increase robustness, the Data Acknowledgements SHOULD be sent at
   larger time intervals. It is left for further study how often they
   should be sent. Another open question is on which of the
   connections the messages should be sent.

                        1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +---------------+--------------------------------+--------------+
   |Kind=SESS_ACK  |           Length=12            |   reserved   |
   +---------------+--------------------------------+--------------+
   :   Next expected Session Sequence Number (8 bytes in total)    :
   +---------------------------------------------------------------+
   :           Session receive window (8 bytes in total)           :
   +---------------------------------------------------------------+

                Figure 14 Data Acknowledgement message

4.5. Other Aspects

4.5.1. Congestion Control

   One of the goals of a multi-connection transport solution is to
   enhance the usage of network resources, commonly known as the
   resource pooling principle. In order to achieve resource pooling,
   the congestion windows of the different Subflow Connections of the
   Session should be coupled together. The coupling should lead to the
   transmission of more Data Segments over the less congested
   connections than over the more congested connections.

   Different congestion control algorithms may be implemented for
   multipath transport mechanisms to achieve the goals of resource
   pooling and fairness. One such algorithm is presented in [4].
   The algorithm offers a potential solution in the current Internet
   by controlling the congestion window increase of a Subflow
   Connection as a function of the performance of the other Subflow
   Connections of a session.

   PLMT could use this algorithm for congestion control as well. If
   PLMT is entirely implemented in user space, an alternative
   algorithm could be used that runs a corresponding scheduler, which
   uses its own estimates of the path characteristics. The design of
   alternative algorithms for congestion control coupling is beyond
   the scope of this document.

4.5.2. Path Management and Scheduling

   The establishment of multiple Subflow Connections to different
   addresses aims at a better utilization of the network resources.
   PLMT could use cross-layer information from the network layer for
   path management.

   The scheduling of TLV-encoded Data Segments over the different
   Subflow Connections is based on local policy. PLMT can use
   different algorithms to control the splitting of the data stream
   from the application over the different Subflow Connections. PLMT
   uses the standard TCP mechanisms for reliable transport of data on
   its Subflow Connections.

   The retransmission strategy for lost Data Segments is a local
   policy. The Session Sequence Number allows lost Data Segments to be
   sent over another Subflow Connection in addition to the
   retransmission over the same Subflow Connection. How often a Data
   Segment is sent over another Subflow Connection is again a design
   choice of the local policy.

4.5.3. Closing Connections and Sessions

   A Subflow Connection is a standard TCP connection. To close a
   Subflow Connection, the TCP 4-way FIN handshake mechanism is used.

   When the Session needs to be closed, all the PLMT Connections need
   to be closed, including the Control Connection.

5.
Interaction with Middleboxes

   The Internet contains many different types of middleboxes: some
   parse the contents of the stream of a TCP connection, others
   rewrite the content of packet headers or even the payload. For a
   new multipath transport like PLMT to be successfully deployable,
   its operation should be understood and tested against such
   middleboxes. Well-known examples of middleboxes are Network Address
   and Port Translators (NAPT). PLMT is designed to be compatible with
   middleboxes that have problems with TCP options. But there are also
   some problems with other types of middleboxes.

5.1. Middleboxes that Translate Address/Ports

   Middleboxes that perform Network Address and Port Translation
   (NAPT) may cause problems for the creation of multiple connections
   (this is a potential issue for all multipath transport protocols).
   Hosts behind the NAPT know their local addresses but might not be
   aware of the global addresses that the NAPT uses. Therefore, the
   hosts MUST NOT advertise their multiple local addresses to the
   other host. The host behind the NAPT MAY still be multipath capable
   and MAY open a PLMT connection to the other host if the other host
   is also PLMT capable. Over the established PLMT connection, the
   other host MAY advertise its multiple addresses. These addresses
   will be used by the host behind the NAPT to open further Subflow
   Connections.

5.2. Middleboxes that Manipulate TCP Options

   Multipath solutions that use the TCP options field for their
   operation may suffer from middleboxes that remove or modify TCP
   options. Some middleboxes may even drop packets with unknown TCP
   options, and this may happen for the connection establishment
   packets as well. PLMT does not employ any new TCP option and hence
   would not be affected by such middlebox behavior.

5.3.
Middleboxes that Parse Content

   Current middleboxes in the Internet are not aware of multipath
   transport. Therefore, middleboxes will identify a single Subflow
   Connection as a standard TCP connection. The TLV encoding of the
   payload may confuse a middlebox that parses the content and may
   lead it to stall the connection.

   If a middlebox blocks TLV encoding, PLMT can try to transmit data
   over another path. However, PLMT cannot fall back to a mode that
   does not use TLV transport, since it must send the Signature and
   tokens in TLV encoding over the Initial Subflow Connection.

   Middleboxes that want to prevent multipath transport can block
   connection setups to the well-known port. This prevents the use of
   multipath transport if a middlebox is on the path of both the
   Initial Subflow Connection and the Control Connection. A middlebox
   that is not on the path of the Control Connection cannot safely
   distinguish normal TCP connections from PLMT Subflow Connections
   with TLV transport.

5.4. Middleboxes that Change Content

   Middleboxes may also modify the payload and not only the packet
   headers. All multipath solutions require a session-level data
   sequence number to re-order and combine the data stream received
   over the Subflow Connections. The PLMT design allows detecting such
   middlebox behavior by identifying the connection which gets stalled
   due to undecodable TLV framing. In addition, checksums could be
   used. The Data Acknowledgements will identify the holes in the
   Session Sequence Numbers so that a retransmission of the missing
   segments over other Subflow Connections will be initiated. This
   allows working around content-modifying middleboxes, unless they
   are present on all paths.

   If this type of middlebox is present on the Initial Connection,
   then the Signature matching may fail.
   This means that data transport over the Initial Connection may be
   corrupted because, for example, the Signature may be delivered to
   the application as part of the byte stream.

6. Security Considerations

   The Signature-based method to identify the setup of a new
   TLV-enabled data flow has two security issues. First, an
   application can accidentally generate a bit pattern that is equal
   to the Signature. Second, due to the use of out-of-band signaling,
   PLMT's method must be robust against malicious attacks that try to
   break or hijack PLMT sessions or normal connections. Unlike with
   other multipath transport protocols, it is theoretically possible
   to attack a normal TCP connection to a PLMT-enabled server, even if
   that connection does not use multipath transport.

6.1. Reappearance of Signature in Application Data

   The Signature (together with the tokens) is sent in two different
   contexts:

   o  A connection that was started as a single legacy TCP connection
      is later switched to PLMT/TLV-enabled operation. In this case,
      the Active Opener uses the Control Connection to provide the
      Session sequence number of the last byte that is not TLV
      encoded. This way, the PLMT Layer of the Passive Opener knows
      how much user data has been transmitted through the legacy TCP
      connection and when to expect the Signature. Given the length of
      the Signature, as well as the following token exchange, it is
      extremely unlikely that a normal TCP connection is wrongly
      classified as a Subflow Connection. A similar problem occurs at
      the Active Opener.

   o  The Signature can also be present in the first bytes of a new
      PLMT Subflow Connection, if it is an additional Subflow
      Connection, or if the Control Connection is established first.
      In these cases, the Subflow Connection is characterized by the
      Signature being present in the first bytes of a connection.
      If an application itself opens an additional TCP connection to
      the same corresponding end host, a problem could occur if the
      Signature pattern (and the follow-up token messages) is
      contained in the first data packet of that connection.

   Because of both effects, there is a residual probability that PLMT
   accepts a connection erroneously, if an application accidentally
   sends a bit pattern that is identical to the Signature (plus the
   Tokens), or if an attacker manages to guess the pattern. This
   probability is very small as the Signature is a long, random bit
   pattern.

   This probabilistic approach of token-based identification is
   general practice in challenge-response authentication methods,
   where there is also an extremely small residual probability that an
   unauthorized (malicious) node guesses the response correctly.

6.2. Resilience against Malicious Attacks

   Address-agile multipath transport mechanisms must consider possible
   malicious attacks. PLMT suffers from a DoS vulnerability, but it
   has protection methods against other attacks.

   PLMT uses the same token mechanism as other multipath transport
   protocols, but with much longer tokens. An attacker must correctly
   guess not only the Tokens, but also the Signature. As a
   consequence, the probability that a blind guessing attack on PLMT
   succeeds is extremely small.

7. Open Issues

   This PLMT protocol specification is a work in progress, and some
   unsolved issues remain that need further consideration.

8. IANA Considerations

   This document will make a request to IANA to allocate a new TCP/UDP
   port value for the PLMT Control Connection.

9. Conclusion

   PLMT is a user-space solution to enable reliable, in-order data
   transfer over multiple paths. This specification defines the PLMT
   protocol.
   PLMT is defined as a worked example for a payload-based multipath
   transport, as an alternative to TCP option based signaling
   mechanisms. Due to some security vulnerabilities, it is mainly
   suitable for controlled and trusted environments.

10. References

10.1. Normative References

   [1]  Bradner, S., "Key words for use in RFCs to Indicate
        Requirement Levels", BCP 14, RFC 2119, March 1997.

   [2]  Postel, J., "Transmission Control Protocol", STD 7, RFC 793,
        September 1981.

   [3]  Dierks, T. and E. Rescorla, "The Transport Layer Security
        (TLS) Protocol Version 1.2", RFC 5246, August 2008.

10.2. Informative References

   [4]  Raiciu, C., Handley, M. and D. Wischik, "Coupled Multipath-
        Aware Congestion Control", draft-ietf-mptcp-congestion-00
        (work in progress), July 2010.

   [5]  Ford, A., Raiciu, C., Barre, S. and J. Iyengar,
        "Architectural Guidelines for Multipath TCP Development",
        draft-ietf-mptcp-architecture-01 (work in progress), June
        2010.

   [6]  Ford, A., Raiciu, C. and M. Handley, "TCP Extensions for
        Multipath Operation with Multiple Addresses",
        draft-ietf-mptcp-multiaddressed-01 (work in progress), July
        2010.

   [7]  Bagnulo, M., "Threat Analysis for Multi-addressed/Multi-path
        TCP", draft-ietf-mptcp-threat-02 (work in progress), March
        2010.

   [8]  Scharf, M. and A. Ford, "MPTCP Application Interface
        Considerations", draft-scharf-mptcp-api-02 (work in
        progress), July 2010.

   [9]  Scharf, M., "Multi-Connection TCP (MCTCP) Transport",
        draft-scharf-mptcp-mctcp-00 (work in progress), July 2010.

11. Acknowledgments

   The authors are supported by the German-Lab project
   (http://www.german-lab.de/), a research project funded by the
   German Federal Ministry of Education and Research (BMBF). The views
   expressed here are those of the author(s) only.
   The BMBF is not liable for any use that may be made of the
   information in this document.

   The authors gratefully acknowledge significant input into this
   document from Koojana Kuladinithi, Asanga Udugama, Andreas
   Koensgen, Andres Toro (all from University of Bremen), Andreas
   Timm-Giel (Hamburg University of Technology), Thomas-Rolf Banniza
   and Peter Schefczik (both from Alcatel-Lucent Bell Labs).

Authors' Addresses

   Amanpreet Singh
   University of Bremen
   Otto-Hahn-Allee 1
   28359 Bremen
   Germany

   Email: aps@comnets.uni-bremen.de

   Michael Scharf
   Alcatel-Lucent Bell Labs
   Lorenzstrasse 10
   70435 Stuttgart
   Germany

   Email: michael.scharf@alcatel-lucent.com