MPTCP Working Group                                       O. Bonaventure
Internet-Draft                                                 UCLouvain
Intended status: Informational                             July 06, 2015
Expires: January 7, 2016


              Supporting long TCP options in Multipath TCP
                draft-bonaventure-mptcp-long-options-00

Abstract

   The extensibility of TCP is severely limited by the Data Offset field
   of the TCP header that limits the number of bytes that can be used in
   each segment to transport options.  This document considers Multipath
   TCP as the starting point and analyses different alternatives to
   improve the ability of Multipath TCP to transport TCP extensions.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on January 7, 2016.

Copyright Notice

   Copyright (c) 2015 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Lessons learned with Multipath TCP
   3.  Extending the DSS option
     3.1.  First approach : TCP EOL as marker
     3.2.  Second approach : Option length in DSS
     3.3.  Third approach : using the control stream for options
     3.4.  Middlebox interference
   4.  Negotiating the extended DSS option
   5.  Compatibility with the existing TCP options
     5.1.  End-of-Option List
     5.2.  Maximum Segment Size
     5.3.  No-Operation
     5.4.  SACK-Permitted
     5.5.  SACK option
     5.6.  Timestamps
     5.7.  TCP-TFO
     5.8.  TCP-AO option
     5.9.  TCP-User Timeout
     5.10. Multipath TCP options
   6.  Security consideration
   7.  Conclusion
   8.  Acknowledgements
   9.  Informative References
   Author's Address

1.  Introduction

   Multipath TCP is an extension to TCP [RFC0793] that was standardized
   in [RFC6824].  Multipath TCP uses several types of TCP options to
   exchange data.  Like other TCP extensions, Multipath TCP suffers from
   the 4-bit Data Offset field of the TCP header, which limits the size
   of the entire header, including options, to 60 bytes.  This limits
   the total length of the TCP options to 40 bytes, which becomes a
   problem for segments that include SACK [RFC2018], Timestamp
   [RFC1323], Multipath TCP [RFC6824] and TCP-AO [RFC5925].  For
   example, it becomes difficult to combine these four option types in a
   single segment.

   Various techniques to extend the TCP option space are being discussed
   [I-D.ietf-tcpm-tcp-edo] within the TCPM working group.  As of this
   writing, there is not enough experience with such extensions and
   their possible interference with middleboxes that are known to limit
   the extensibility of TCP [IMC11].

   Instead of starting from regular TCP as the other proposals do, we
   assume that Multipath TCP has been successfully negotiated for the
   TCP connection and evaluate how the unique features of Multipath TCP
   can be tweaked to carry large TCP options including long SACK blocks.
   Instead of changing the semantics of the Data Offset field, adding a
   new TCP option or encoding the payload, we start from the DSS options
   used by Multipath TCP.  This gives us more flexibility and leverages
   the middlebox resilience provided by Multipath TCP.

   In this document, we focus on the transport of longer TCP options
   once the Multipath TCP connection has been established.  We leave for
   further work the problem of extending the TCP option space in the SYN
   segment during the three-way handshake.

   This document is organized as follows.  We first discuss in Section 2
   the lessons that were learned from the specification, implementation
   and deployment of Multipath TCP.  We then discuss in Section 3
   several solutions that modify the DSS option to support the
   transmission of additional TCP options.  We propose in Section 4 to
   negotiate the utilisation of this option as a new Multipath TCP
   version.  Section 5 discusses how the existing TCP options can be
   supported with the solution proposed in this document.

2.  Lessons learned with Multipath TCP

   During the design, implementation and deployment of Multipath TCP
   over the last years, we have learned some lessons on how various
   types of middleboxes can interfere with a TCP extension.  These
   lessons have been documented in several scientific publications
   [IMC11] [HotMiddlebox13], but it is worth summarising them again.

   The first lesson concerns the initial three way handshake.  Most TCP
   extensions assume that the utilisation of a new option can be safely
   negotiated by sending an option inside the initial SYN segment and
   verifying that the same option is present in the returned SYN+ACK.
   This is the approach used for SACK [RFC2018] and to some extent the
   WSCALE and TS extensions defined in [RFC7323].
   Unfortunately, measurements with Multipath TCP showed that there are
   middleboxes that simply echo in the SYN+ACK segment any unknown
   option received in the SYN segment.  To cope with such middleboxes,
   the TCP option used to negotiate the utilisation of a TCP extension
   cannot be the same in the SYN and SYN+ACK segments.  The active
   opener must verify that the option sent in the SYN segment has not
   been modified before being returned in the SYN+ACK segment.

   The second lesson is also about the initial three way handshake.  A
   middlebox can modify the SYN+ACK segment on the return path without
   having observed the option contained in the SYN segment.  This kind
   of middlebox interference can be a problem for the negotiation of the
   utilisation of a TCP extension since the states of the active and the
   passive opener might disagree.  Multipath TCP solves this problem by
   sending information in the third ACK.

   The third lesson is that despite a successful negotiation during the
   three way handshake by using option type x, middleboxes might still
   remove option type x in some segments on the connection.  Any TCP
   extension must be able to cope with segments that do not contain an
   expected option.

   The fourth lesson is that middleboxes can split segments.  Such
   middleboxes can be located inside the network or be simply the TSO
   implementation on the network interface card of the sending hosts
   [IMC11].  These network interface cards are very popular and it is
   difficult for a TCP stack to correctly detect their exact behaviour
   concerning the handling of TCP options.  These cards expose a large
   MSS to the TCP stack and then split the large segment in smaller
   segments that are (usually) checksummed by the card.  According to
   [IMC11] all the tested cards copy all the options included in the
   large segment into the smaller ones.

   The fifth lesson is that middleboxes can coalesce segments.  Such
   middleboxes can be located inside the network or more likely on the
   receiving host.  Most network interface cards implement GRO by
   performing the opposite operation of TSO.  If the segments that are
   received in sequence always contain the same option, then these
   segments are coalesced in a larger segment which is delivered to the
   TCP stack.  Note that the size of the segments that are delivered to
   the TCP stack will vary as a function of the packet losses and also
   of the utilisation of options by the sending host.

   The sixth lesson is that middleboxes can modify, inject or remove
   data from the payload of TCP segments.  Typical examples are the
   Application Level Gateways (ALGs) used on Network Address Translators
   to "seamlessly" support application-level protocols that exchange IP
   addresses in the bytestream.  To support these applications, ALGs
   need to modify the IP addresses (and possibly port numbers) exchanged
   in the bytestream.

3.  Extending the DSS option

   The DSS option is defined in [RFC6824].  This option has a variable
   length depending on the utilisation of the 'm', 'M', 'a' and 'A'
   flags.  The format of this option is shown in Figure 1.  An important
   point to note is that it contains several flags that are currently
   reserved for future use.

                        1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +---------------+---------------+-------+----------------------+
   |     Kind      |    Length     |Subtype| (reserved) |F|m|M|a|A|
   +---------------+---------------+-------+----------------------+
   |         Data ACK (4 or 8 octets, depending on flags)         |
   +--------------------------------------------------------------+
   |   Data sequence number (4 or 8 octets, depending on flags)   |
   +--------------------------------------------------------------+
   |              Subflow Sequence Number (4 octets)              |
   +-------------------------------+------------------------------+
   | Data-Level Length (2 octets)  |     Checksum (2 octets)      |
   +-------------------------------+------------------------------+

                          Figure 1: DSS option

   We assume in this document that the utilisation of the modified DSS
   option is negotiated as a new Multipath TCP version in the three way
   handshake.

   The general approach evaluated in this document is to place the
   additional TCP options inside the payload and modify the DSS to
   indicate which part of the payload contains TCP options (if any) and
   which part contains payload data (if any).  The generic solution is
   shown graphically in Figure 2.

                                <- ext. ->
                                 options

   +--------+---------+--------+----------+--------...-----+
   | IP hdr | TCP hdr |  DSS   | Options  |    User Data   |
   +--------+---------+--------+----------+--------...-----+
             <-      DO      -> <- Payload of TCP segment->
                  covered         covered by DSS mapping

               Figure 2: Usage of the modified DSS option

   The key question for all the approaches discussed in this section is
   how to modify the DSS option to indicate the boundary between the
   additional TCP options and the payload.  For simplicity, we assume in
   this section that only the Multipath TCP DSS option is placed in the
   extended TCP header.  The other TCP options are placed inside the
   payload which is covered by the DSS option.

3.1.  First approach : TCP EOL as marker

   A first approach to extend the TCP option space is to simply assume
   that each DSS option is always followed by one or more TCP options
   and that the EOL option, defined in [RFC0793], is used to mark the
   end of the additional TCP options.  For simplicity, we assume that
   the DSS option is the only option that consumes space in the TCP
   header extension (i.e. the Data Offset field of the TCP header has a
   value equal to the length of the DSS option, possibly with a few NOP
   options to align it on 32-bit boundaries).  The length included in
   the DSS option remains the length of the payload which is part of the
   bytestream, without taking into account the bytes consumed by the
   included TCP options.  This is illustrated in Figure 3.  If the DSS
   checksum has been negotiated, it is computed only on the user data
   and the pseudo-header defined in [RFC6824].

   +--------+---------+--------+--------+-----+------...-----+
   | IP hdr | TCP hdr |  DSS   | Opt x  | EOL |  User data   |
   +--------+---------+--------+--------+-----+------...-----+
                                <- Payload of TCP segment -->
                                   covered by DSS mapping

   Figure 3: Using the TCP EOL option to mark the boundary between TCP
                         options and user data

   In this figure, Option x and the payload are optional.  With this
   design there is always one byte used by the EOL option inside each
   segment that contains a DSS option.

   Let us now analyse how this solution reacts to middleboxes that
   coalesce or split segments.  Figure 4 illustrates a middlebox that
   splits a large segment and copies the DSS option in both resulting
   segments.  The DSS option shows the mapping between the DSN (2) and
   the subflow sequence number (1) and the length covered by the DSS
   option.

   +---------+--------+-----+------...---+
   | TCP hdr |  DSS   | EOL |  Payload   |
   | seq=1   |  2->1  |     |  4 bytes   |
   | len=5   | len=4  |     |            |
   +---------+--------+-----+------...---+
                      ||
                      \/
   +---------+--------+-----+------...---+
   | TCP hdr |  DSS   | EOL |  Payload   |
   | seq=1   |  2->1  |     |  2 bytes   |
   | len=3   | len=4  |     |            |
   +---------+--------+-----+------...---+

   +---------+--------+-------...---+
   | TCP hdr |  DSS   |   Payload   |
   | seq=3   |  2->1  |   2 bytes   |
   | len=2   | len=4  |             |
   +---------+--------+-------...---+

                 Figure 4: Effect of Segment splitting

   If the segment is split as shown in the above example, a Multipath
   TCP receiver will parse the DSS option in the first segment and wait
   until it has received all the mapped data before extracting the EOL
   option and the payload.  Segment splitting does not affect this
   solution.

   If a middlebox coalesces segments, the situation is different.  Let
   us consider the scenario shown in Figure 5.

   +---------+--------+-----+------...---+
   | TCP hdr |  DSS   | EOL |  Payload   |
   | seq=1   |  2->1  |     |  2 bytes   |
   | len=3   | len=2  |     |            |
   +---------+--------+-----+------...---+

   +---------+--------+-----+------...---+
   | TCP hdr |  DSS   | EOL |  Payload   |
   | seq=3   |  4->3  |     |  2 bytes   |
   | len=3   | len=2  |     |            |
   +---------+--------+-----+------...---+
                      ||
                      \/

   +---------+--------+-----+------...---+-----+------...---+
   | TCP hdr |  DSS   | EOL |  Payload   | EOL |  Payload   |
   | seq=1   |  2->1  |     |  2 bytes   |     |  2 bytes   |
   | len=6   | len=2  |     |            |     |            |
   +---------+--------+-----+------...---+-----+------...---+

   or

   +---------+--------+-----+------...---+-----+------...---+
   | TCP hdr |  DSS   | EOL |  Payload   | EOL |  Payload   |
   | seq=1   |  4->3  |     |  2 bytes   |     |  2 bytes   |
   | len=6   | len=2  |     |            |     |            |
   +---------+--------+-----+------...---+-----+------...---+

                 Figure 5: Effect of Segment Coalescing

   Since the two small segments do not contain the same option, a
   middlebox should not in theory coalesce them.  Anyway, let us analyse
   what happens in this case.  If the middlebox only copies the DSS
   option of the first segment, then the receiver will process the first
   EOL option and the first block of 2 bytes of payload.  The remaining
   bytes are not covered by a mapping.  The receiver will ack at the
   data level the first block of two bytes in the payload but not the
   second.  After some time, the sender will retransmit the
   unacknowledged data with a new DSS option that will cover the data of
   the second segment.  The same applies if the middlebox copies the
   second DSS option in the coalesced segment.

   A first drawback of this solution is that the TCP stack must parse
   the payload to extract all the options that are included inside a DSS
   map.
   A possible alternative to simplify this parsing would be to redefine
   an unused bit of the DSS option to indicate that the mapped payload
   starts with TCP options.  If this bit is reset, then there are no
   options in the payload and the receiver can process the payload as
   usual.  Parsing the payload might have a performance impact on packet
   filters used on some routers or firewalls that process TCP options.

   A second drawback of this approach is that the TCP options consume
   TCP sequence space and are transmitted reliably.  This implies that a
   measurement application that uses the difference between the sequence
   numbers of the SYN and FIN segments to compute the number of bytes
   exchanged over a connection will overestimate the number of bytes
   exchanged by the communicating applications.

3.2.  Second approach : Option length in DSS

   The second approach is to include a 'TCP Options Length' field inside
   the DSS option to indicate the part of the payload that follows the
   DSS option that is used by TCP options.  This is illustrated with the
   modified DSS option below.

                        1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +---------------+---------------+-------+----------------------+
   |     Kind      |    Length     |Subtype| (reserved) |F|m|M|a|A|
   +---------------+---------------+-------+----------------------+
   |         Data ACK (4 or 8 octets, depending on flags)         |
   +--------------------------------------------------------------+
   |   Data sequence number (4 or 8 octets, depending on flags)   |
   +--------------------------------------------------------------+
   |              Subflow Sequence Number (4 octets)              |
   +-------------------------------+------------------------------+
   | Data-Level Length (2 octets)  |     Checksum (2 octets)      |
   +-------------------------------+------------------------------+
   |      TCP Options Length       |
   +-------------------------------+

   Figure 6: Extending the DSS option to include a TCP options length

   This 'TCP Options Length' field could be encoded in several ways :

   o  as a 16-bit field that encodes the number of bytes used for
      options.  This solution has the advantage of being aligned on
      16-bit boundaries and supports options of up to 64 KBytes.

   o  as an 8-bit field that encodes the number of bytes used for
      options.  This solution supports options of up to 255 bytes.  The
      DSS option is not aligned on 32-bit boundaries and TCP option
      padding might be required in the TCP extended header.

   o  as an 8-bit field that encodes the number of 32-bit words used
      for options.  This solution supports options of up to 1020 bytes.
      The DSS option is aligned on 16-bit boundaries and TCP option
      padding is required inside the DSS option.

   The last solution is probably the best compromise between overhead
   and extensibility.  It however adds some complexity in the padding
   mechanism that needs to be used to encode the TCP options that appear
   before the DSS option and the TCP options that are encoded within the
   DSS option, since both need to be aligned on 32-bit boundaries.

   With the three solutions above, it is possible to send data mapped by
   a DSS option without any TCP option by using a Length value (i.e. 20,
   24 or 28 bytes) for the DSS option that does not include space for
   the 'TCP Options Length' field.

3.3.  Third approach : using the control stream for options

   The control stream is an extension to Multipath TCP that was proposed
   in [I-D.paasch-mptcp-control-stream].  This extension redefines one
   flag (called 'S' in Figure 7) of the DSS option.

                        1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +---------------+---------------+-------+----------------------+
   |     Kind      |    Length     |Subtype|(reserved)|S|F|m|M|a|A|
   +---------------+---------------+-------+----------------------+
   |       Control ACK (4 or 8 octets, depending on flags)        |
   +--------------------------------------------------------------+
   | Control sequence number (4 or 8 octets, depending on flags)  |
   +--------------------------------------------------------------+
   |              Subflow Sequence Number (4 octets)              |
   +-------------------------------+------------------------------+
   |Control-Level Length (2 octets)|     Checksum (2 octets)      |
   +-------------------------------+------------------------------+

   Figure 7: Modification of the DSS option to indicate whether it maps
                  user data (S=0) or TCP options (S=1)

   The control stream defines two separate bytestreams.  The default
   bytestream is used to carry application level data.  This data is
   mapped by using the above DSS option with the 'S' flag set to 0.
   When the 'S' flag of the DSS option is set to 1, this indicates that
   the mapped data belongs to the second bytestream.  In
   [I-D.paasch-mptcp-control-stream], this second bytestream was used to
   exchange options using a special TLV format.  One of the use cases
   mentioned in [I-D.paasch-mptcp-control-stream] was the exchange of
   security keys.  However, this solution was neither implemented nor
   deployed.  We reuse it to map TCP options on the second bytestream.

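   As an illustration only, the following Python sketch shows how a
   receiver could interpret the flags of the modified DSS option of
   Figure 7.  The positions of the 'F', 'm', 'M', 'a' and 'A' flags
   follow [RFC6824]; the position of the 'S' flag (0x20) is an
   assumption read from Figure 7, and the names FLAG_*,
   classify_mapping and dss_length are hypothetical.

```python
# Flag bits of the DSS option, least significant bit first (A, a, M, m, F).
FLAG_A = 0x01  # Data ACK present
FLAG_a = 0x02  # Data ACK is 8 octets
FLAG_M = 0x04  # mapping (DSN, SSN, length, checksum) present
FLAG_m = 0x08  # DSN is 8 octets
FLAG_F = 0x10  # DATA_FIN
FLAG_S = 0x20  # ASSUMPTION: 'S' sits just left of 'F', as drawn in Figure 7


def classify_mapping(flags: int) -> str:
    """Tell which bytestream the mapped payload belongs to."""
    if not flags & FLAG_M:
        return "no-mapping"
    return "options" if flags & FLAG_S else "data"


def dss_length(flags: int, checksum: bool = True) -> int:
    """Length in bytes of a DSS option carrying the given flags."""
    length = 4  # Kind, Length, Subtype and flags
    if flags & FLAG_A:
        length += 8 if flags & FLAG_a else 4
    if flags & FLAG_M:
        length += 8 if flags & FLAG_m else 4  # sequence number
        length += 4                           # Subflow Sequence Number
        length += 2                           # Data-Level Length
        length += 2 if checksum else 0
    return length
```

   With both a Data ACK and a mapping present, dss_length() yields the
   20, 24 and 28 byte values mentioned in Section 3.2, depending on the
   'a' and 'm' flags.
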
   Figure 8 shows some examples of segments containing only TCP options,
   only user data or both user data and TCP options.  In the first
   example, the S bit is set to 1 and thus the mapped payload only
   contains TCP options.  In this segment, the DO field of the TCP
   header covers only the DSS option.  In the second example, the S bit
   is set to zero and the payload contains user data.  Again, the DO
   field of the TCP header only covers the DSS option.  The third
   example combines both options and user data in a single segment.  The
   first part of the payload, containing 'Opt x', is covered by the DSS
   with the S bit set to 1.  This DSS maps TCP options.  The second part
   of the payload contains user data.  In this last example, the DO
   field of the TCP header covers the two DSS options.

   TCP segment containing extended options

   +--------+---------+--------------+--------+-------+
   | IP hdr | TCP hdr |  DSS (S=1)   | Opt x  | Opt y |
   +--------+---------+--------------+--------+-------+
                                     <--- Payload ---->
                                       Covered by DSS

   TCP segment containing data

   +--------+---------+--------------+------...---+
   | IP hdr | TCP hdr |  DSS (S=0)   |  Payload   |
   +--------+---------+--------------+------...---+
                                     <- Payload -->
                                     Covered by DSS

   TCP segment containing both extended options and data

                                             covered by DSS (S=0)
                                                       <-- + --->
   +--------+---------+-----------+-----------+-------+---...---+
   | IP hdr | TCP hdr | DSS (S=1) | DSS (S=0) | Opt x | Payload |
   +--------+---------+-----------+-----------+-------+---...---+
                                               <- + ->
                                        covered by DSS (S=1)

                Figure 8: Examples of extended segments

   Note that to support both data and extended TCP options, two DSS
   options must be placed inside the segment.  This implies that each of
   them must have a length of 20 bytes.  In practice, we do not expect
   that hosts will often need to send extended options and data
   simultaneously.
   A better approach is to send data and extended options in different
   segments.

   Each TCP option that is included in the payload mapped by a DSS
   option MUST be encoded by using the standard Type-Length-Value format
   for TCP options [RFC0793].  The TCP options that are included in the
   payload mapped by a DSS option MAY end with TCP option zero (End of
   Option List).  With this modified DSS option, the length of the
   extended TCP options is only limited by the length that can be mapped
   by a DSS option.  Given that a DSS option can map up to 64 KBytes of
   data, it is possible to send up to 64 KBytes worth of options.

3.4.  Middlebox interference

   We now qualitatively evaluate the last solution and analyze how
   middleboxes could interfere with the transmission of extended TCP
   options.

   A first point to note is that although the examples below assume for
   simplicity that only the DSS option is part of the TCP header covered
   by the Data Offset, this is not a requirement.  A TCP segment can
   include other TCP options inside the header covered by the Data
   Offset field.

   Our first use case is a middlebox that inserts a new TCP option and
   modifies the TCP Data Offset.  Figure 9 considers the impact of such
   a middlebox on both a segment containing user data and a segment
   containing extended TCP options.  Such a middlebox was proposed in
   [I-D.ananth-middisc-tcpopt].

   Initial segment containing options

   <----- DO covered ------> <-- DSS mapped-->
   +---------+--------------+--------+-------+
   | TCP hdr |  DSS (S=1)   | Opt x  | Opt y |
   +---------+--------------+--------+-------+

   Modified segment containing options

   <----------- DO covered -----------> <-- DSS mapped-->
   +---------+----------+--------------+--------+-------+
   | TCP hdr |Added Opt |  DSS (S=1)   | Opt x  | Opt y |
   +---------+----------+--------------+--------+-------+

   Initial segment containing user data

   <----- DO covered ------> <-- DSS mapped-->
   +---------+--------------+----------------+
   | TCP hdr |  DSS (S=0)   |   User data    |
   +---------+--------------+----------------+

   Modified segment containing user data

   <----------- DO covered -----------> <-- DSS mapped-->
   +---------+----------+--------------+----------------+
   | TCP hdr |Added Opt |  DSS (S=0)   |   User data    |
   +---------+----------+--------------+----------------+

           Figure 9: Impact of the insertion of a TCP option

   The proposed solution correctly copes with such a middlebox.  It does
   not work on a path where a middlebox removes the DSS option because
   this option is required for the operation of Multipath TCP.  In this
   case, Multipath TCP performs a fallback to regular TCP.

   Our second use case is a middlebox that splits a segment that
   contains TCP options or user data and copies the DSS option in both
   resulting segments.
   This is illustrated in Figure 10.

   +---------+--------+--------+
   | TCP hdr | DSS(S1)| Opt x  |
   | seq=1   |  2->1  | 4 bytes|
   | len=5   | len=4  |        |
   +---------+--------+--------+
                 ||
                 \/
   +---------+--------+--------+
   | TCP hdr | DSS(S1)|  Opt   |
   | seq=1   |  2->1  | 2 bytes|
   | len=3   | len=4  |        |
   +---------+--------+--------+

   +---------+--------+--------+
   | TCP hdr | DSS(S1)|  Opt   |
   | seq=3   |  2->1  | 2 bytes|
   | len=2   | len=4  |        |
   +---------+--------+--------+

                Figure 10: Effect of segment splitting

   With such a middlebox that splits a segment and copies the DSS
   option, the receiver can recover the extended option once it has
   received the two segments.  The same applies for user data and is
   already supported by Multipath TCP implementations [IMC13a].

   The third use case is a middlebox that coalesces two segments.  We
   consider the case where a segment containing extended options is
   coalesced with a segment containing user data.  A similar reasoning
   applies for other types of segment.  This is illustrated in
   Figure 11.

592 +---------+--------+-------------+ 593 | TCP hdr | DSS(S1)| Options | 594 | seq=1 | 2->1 | 4 bytes | 595 | len=4 | len=4 | | 596 +---------+--------+-------------+ 598 +---------+--------+-------- --+ 599 | TCP hdr | DSS(S0)| User data | 600 | seq=5 | 4->5 | 2 bytes | 601 | len=2 | len=2 | | 602 +---------+--------+-----------+ 603 || 604 \/ Coalescing middlebox 606 +---------+--------+----------+------------+ 607 | TCP hdr | DSS(S1)| Options | User data | 608 | seq=1 | 2->1 | 4 bytes | 2 bytes | 609 | len=6 | len=4 | | | 610 +---------+--------+----------+------------+ 612 or 614 +---------+--------+----------+------------+ 615 | TCP hdr | DSS(S0)| Options | User data | 616 | seq=1 | 4->5 | 4 bytes | 2 bytes | 617 | len=6 | len=2 | | | 618 +---------+--------+----------+------------+ 620 Figure 11: Effect of Segment Coalescing 622 Multipath TCP implementations already support middleboxes that 623 coalesce consecutive segments containing data as demonstrated in 624 [IMC13a]. 626 There are two different possibilities depending on whether the first 627 or the second DSS option is copied in the coalesced segment If the 628 first DSS option is copied, then the receiver has a valid mapping for 629 the extended TCP options and can decode them but no mapping for the 630 user data. The sender will timeout and retransmit a segment 631 containing the user data with a valid mapping. If the second DSS 632 option is copied, then the receiver can process the user data but has 633 to wait for a retransmission of the mapping that covers the extended 634 TCP options. 636 4. Negotiating the extended DSS option 638 The proposed extended DSS option should only be used between hosts 639 that support the extension. This should be negotiated during the 640 three way handshake for the initial subflow. There are two possible 641 solutions for this negotiation : 643 o Redefine one of the unused bits (e.g. 
      'B') of the MP_CAPABLE option [RFC6824] to negotiate the use of
      the extended DSS option.

   o  Define a new version of the Multipath TCP protocol [RFC6824].

   Given the impact of the change, it is probably safer to increment
   the protocol version number.

5.  Compatibility with the existing TCP options

   In this section, we discuss how the existing TCP options can be
   transported by using the control stream.

5.1.  End-of-Option List

   If used, this option, defined in [RFC0793], MUST appear as the last
   TCP option in each DSS-mapped payload that contains TCP options.

5.2.  Maximum Segment Size

   The MSS option, defined in [RFC0793], can only appear in SYN
   segments.  It MUST NOT be sent inside a DSS-mapped payload.  If a
   host receives a DSS-mapped payload that contains this option, it
   MUST ignore the entire DSS-mapped payload.

5.3.  No-Operation

   This option, defined in [RFC0793], is often used to align TCP
   options to word boundaries.  Some middleboxes replace existing TCP
   options with this option [IMC11].  A host that sends TCP options
   inside a DSS-mapped payload MAY send one or more No-Operation
   options inside the DSS-mapped payload.

5.4.  SACK-Permitted

   This option, defined in [RFC2018], is used to negotiate the use of
   selective acknowledgements during the three-way handshake.  It
   therefore MUST NOT appear in any DSS-mapped payload.  If a host
   receives a DSS-mapped payload that contains this option, it MUST
   ignore the entire DSS-mapped payload.

5.5.  SACK option

   This option, defined in [RFC2018], MAY be sent either as a regular
   TCP option or inside a DSS-mapped payload.  Placing the option in a
   DSS-mapped payload has two advantages.  First, the length of the
   SACK option is no longer limited by the maximum length of the TCP
   header.  Second, the option is delivered reliably to the
   destination.

5.6.
Timestamps

   This option, defined in [RFC1323] and revised in [RFC7323], SHOULD
   NOT appear in any DSS-mapped payload.  It does not benefit from the
   reliability provided by the DSS-mapped payload.

5.7.  TCP-TFO

   Two options are defined in [I-D.ietf-tcpm-fastopen]: Fast Open
   Cookie and Fast Open Cookie Request.  These two options can only be
   used inside SYN segments.  For this reason, they MUST NOT be sent
   inside a DSS-mapped payload.  If a host receives a DSS-mapped
   payload that contains one of these options, it MUST ignore the
   entire DSS-mapped payload.

5.8.  TCP-AO option

   This option, defined in [RFC5925], authenticates the TCP segments
   exchanged between hosts.  Given the processing rules defined in
   [RFC5925], it seems difficult to place the option, as currently
   defined, inside a DSS-mapped payload.  For this reason, a host MUST
   NOT send the TCP-AO option inside a DSS-mapped payload.

   Note that a DSS option of length 20 bytes can still be used inside a
   segment that is covered by a TCP-AO option of 20 bytes or less.

5.9.  TCP User Timeout

   This option is defined in [RFC5482].  It MAY be sent either as a
   regular TCP option or inside a DSS-mapped payload.  Placing the
   option in a DSS-mapped payload provides reliable delivery of the
   option for the applications that require it.

5.10.  Multipath TCP options

   Several Multipath TCP options are defined in [RFC6824].  Some of
   them can benefit from the reliability and the unrestricted length of
   the DSS-mapped payload.

   The MP_CAPABLE and MP_JOIN options can only appear in SYN segments.
   They MUST NOT be sent inside a DSS-mapped payload.  If a host
   receives a DSS-mapped payload that contains one of these options, it
   MUST reject the entire DSS-mapped payload.

   Similarly, a DSS option cannot appear inside a DSS-mapped payload.
   If a host receives a DSS-mapped payload that contains another DSS
   option, it MUST reject the entire DSS-mapped payload.

   The ADD_ADDR and REMOVE_ADDR options can benefit from the
   reliability of being transported inside a DSS-mapped payload.  As
   discussed in [Cellnet12], the loss of such options can impact the
   performance of Multipath TCP in failover scenarios.  Another benefit
   of the DSS-mapped payload is that a multihomed host that has several
   IPv6 addresses could advertise all its addresses by sending a single
   DSS-mapped segment.  The MP_PRIO option can also benefit from the
   added reliability of being placed inside a DSS-mapped payload.

   The MP_FAIL option is used when there are problems with middleboxes.
   In this case, placing it inside a DSS-mapped payload is unlikely to
   help.  For this reason, it MUST NOT appear inside a DSS-mapped
   payload.

   The MP_FASTCLOSE option is used to abruptly terminate an MPTCP
   connection.  It can be transmitted as a DSS-mapped option.

6.  Security Considerations

   The solution proposed in this document does not modify the security
   properties of Multipath TCP.  The security considerations listed in
   [RFC6824] and [RFC6181] apply.

7.  Conclusion

   In this document, we have proposed a simple modification to the
   format of the DSS option in Multipath TCP to support the transport
   of long TCP options inside the TCP payload while leveraging the
   existing Multipath TCP mechanisms.

8.  Acknowledgements

   This work was partially supported by the FP7-Trilogy2 project.  We
   would like to thank Joe Touch and Bob Briscoe, whose work on
   extending the TCP option space [I-D.briscoe-tcpm-inner-space]
   [I-D.ietf-tcpm-tcp-edo] has motivated this work.  This document also
   benefitted from comments and suggestions from Fabien Duchene,
   Christoph Paasch and Benjamin Hesmans.

9.
Informative References

   [Cellnet12]
              Paasch, C., Detal, G., Duchene, F., Raiciu, C., and O.
              Bonaventure, "Exploring Mobile/WiFi Handover with
              Multipath TCP", ACM SIGCOMM workshop on Cellular Networks
              (Cellnet12), 2012.

   [HotMiddlebox13]
              Hesmans, B., Duchene, F., Paasch, C., Detal, G., and O.
              Bonaventure, "Are TCP Extensions Middlebox-proof?", CoNEXT
              workshop HotMiddlebox, December 2013.

   [I-D.ananth-middisc-tcpopt]
              Knutsen, A., Ramaiah, A., and A. Ramasamy, "TCP option for
              transparent Middlebox negotiation", draft-ananth-middisc-
              tcpopt-02 (work in progress), February 2013.

   [I-D.briscoe-tcpm-inner-space]
              Briscoe, B., "Inner Space for TCP Options", draft-briscoe-
              tcpm-inner-space-01 (work in progress), October 2014.

   [I-D.ietf-tcpm-fastopen]
              Cheng, Y., Chu, J., Radhakrishnan, S., and A. Jain, "TCP
              Fast Open", draft-ietf-tcpm-fastopen-10 (work in
              progress), September 2014.

   [I-D.ietf-tcpm-tcp-edo]
              Touch, J. and W. Eddy, "TCP Extended Data Offset Option",
              draft-ietf-tcpm-tcp-edo-03 (work in progress), April 2015.

   [I-D.paasch-mptcp-control-stream]
              Paasch, C. and O. Bonaventure, "A generic control stream
              for Multipath TCP", draft-paasch-mptcp-control-stream-00
              (work in progress), February 2014.

   [IMC11]    Honda, M., Nishida, Y., Raiciu, C., Greenhalgh, A.,
              Handley, M., and H. Tokuda, "Is it still possible to
              extend TCP?", Proceedings of the 2011 ACM SIGCOMM
              conference on Internet measurement conference (IMC '11),
              2011.

   [IMC13a]   Detal, G., Hesmans, B., Bonaventure, O., Vanaubel, Y., and
              B. Donnet, "Revealing Middlebox Interference with
              Tracebox", Proceedings of the 2013 ACM SIGCOMM conference
              on Internet measurement conference, 2013.

   [RFC0793]  Postel, J., "Transmission Control Protocol", STD 7, RFC
              793, September 1981.

   [RFC1323]  Jacobson, V., Braden, B., and D.
              Borman, "TCP Extensions for High Performance", RFC 1323,
              May 1992.

   [RFC2018]  Mathis, M., Mahdavi, J., Floyd, S., and A. Romanow, "TCP
              Selective Acknowledgment Options", RFC 2018, October 1996.

   [RFC5482]  Eggert, L. and F. Gont, "TCP User Timeout Option", RFC
              5482, March 2009.

   [RFC5925]  Touch, J., Mankin, A., and R. Bonica, "The TCP
              Authentication Option", RFC 5925, June 2010.

   [RFC6181]  Bagnulo, M., "Threat Analysis for TCP Extensions for
              Multipath Operation with Multiple Addresses", RFC 6181,
              March 2011.

   [RFC6824]  Ford, A., Raiciu, C., Handley, M., and O. Bonaventure,
              "TCP Extensions for Multipath Operation with Multiple
              Addresses", RFC 6824, January 2013.

   [RFC7323]  Borman, D., Braden, B., Jacobson, V., and R.
              Scheffenegger, "TCP Extensions for High Performance", RFC
              7323, September 2014.

Author's Address

   Olivier Bonaventure
   UCLouvain

   Email: Olivier.Bonaventure@uclouvain.be
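
   Non-normative illustration of Section 5.  The per-option rules of
   Section 5 can be read as a lookup table applied while walking the
   TCP options carried in a DSS-mapped payload.  The sketch below is
   one possible reading, not a reference implementation.  The names
   Verdict, classify and must_drop are hypothetical.  The option kinds
   are the IANA-assigned values and the MPTCP subtypes are those of
   [RFC6824].  Two points are assumptions of the sketch: options for
   which Section 5 states only a sender rule (TCP-AO, MP_FAIL) are
   reported but do not cause the payload to be dropped, and options
   not discussed in Section 5 are reported as unknown.

```python
from enum import Enum
from typing import List, Tuple


class Verdict(Enum):
    ALLOWED = "allowed"                  # MAY be carried in a DSS-mapped payload
    SHOULD_NOT = "should-not"            # discouraged (Timestamps)
    SENDER_MUST_NOT = "sender-must-not"  # sender rule only (assumption: no drop)
    DROP_PAYLOAD = "drop-payload"        # receiver ignores/rejects whole payload
    UNKNOWN = "unknown"                  # not discussed in Section 5 (assumption)


# IANA-assigned TCP option kinds.
EOL, NOP, MSS, SACK_PERMITTED, SACK, TIMESTAMPS = 0, 1, 2, 4, 5, 8
USER_TIMEOUT, TCP_AO, MPTCP, TFO = 28, 29, 30, 34

KIND_RULES = {
    MSS: Verdict.DROP_PAYLOAD,
    SACK_PERMITTED: Verdict.DROP_PAYLOAD,
    SACK: Verdict.ALLOWED,
    TIMESTAMPS: Verdict.SHOULD_NOT,
    USER_TIMEOUT: Verdict.ALLOWED,
    TCP_AO: Verdict.SENDER_MUST_NOT,
    TFO: Verdict.DROP_PAYLOAD,
}

# MPTCP option subtypes as defined in RFC 6824.
SUBTYPE_RULES = {
    0x0: Verdict.DROP_PAYLOAD,     # MP_CAPABLE
    0x1: Verdict.DROP_PAYLOAD,     # MP_JOIN
    0x2: Verdict.DROP_PAYLOAD,     # DSS nested in a DSS-mapped payload
    0x3: Verdict.ALLOWED,          # ADD_ADDR
    0x4: Verdict.ALLOWED,          # REMOVE_ADDR
    0x5: Verdict.ALLOWED,          # MP_PRIO
    0x6: Verdict.SENDER_MUST_NOT,  # MP_FAIL
    0x7: Verdict.ALLOWED,          # MP_FASTCLOSE
}


def classify(payload: bytes) -> List[Tuple[int, Verdict]]:
    """Walk the options in a DSS-mapped payload; return (kind, verdict) pairs."""
    verdicts = []
    i = 0
    while i < len(payload):
        kind = payload[i]
        if kind == EOL:
            # End-of-Option List is the last option: stop parsing here.
            verdicts.append((kind, Verdict.ALLOWED))
            break
        if kind == NOP:
            # No-Operation is a single byte with no length field.
            verdicts.append((kind, Verdict.ALLOWED))
            i += 1
            continue
        if i + 1 >= len(payload):
            raise ValueError("truncated option header")
        length = payload[i + 1]
        if length < 2 or i + length > len(payload):
            raise ValueError("invalid option length")
        if kind == MPTCP:
            if length < 3:
                raise ValueError("MPTCP option too short for a subtype")
            subtype = payload[i + 2] >> 4  # upper four bits of the third byte
            verdict = SUBTYPE_RULES.get(subtype, Verdict.UNKNOWN)
        else:
            verdict = KIND_RULES.get(kind, Verdict.UNKNOWN)
        verdicts.append((kind, verdict))
        i += length
    return verdicts


def must_drop(payload: bytes) -> bool:
    """True if any option requires ignoring the entire DSS-mapped payload."""
    return any(v is Verdict.DROP_PAYLOAD for _, v in classify(payload))
```

   For example, must_drop(bytes([2, 4, 5, 180])) is true because the
   payload carries an MSS option, while a payload holding a SACK option
   followed by No-Operation padding is accepted.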