idnits 2.17.1

draft-ietf-tcpm-tcp-edo-09.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

     (Using the creation date from RFC793, updated by this document, for
     RFC5378 checks: 1981-09-01)

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008. If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment. If not, you may need to add the pre-RFC5378 disclaimer.
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (January 19, 2018) is 2261 days in the past. Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  ** Obsolete normative reference: RFC 793 (Obsoleted by RFC 9293)

  -- Obsolete informational reference (is this intentional?): RFC 6824
     (Obsoleted by RFC 8684)

     Summary: 1 error (**), 0 flaws (~~), 1 warning (==), 3 comments (--).
     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

1    TCPM WG                                                    J. Touch
2    Internet Draft
3    Updates: 793                                               Wes Eddy
4    Intended status: Standards Track                        MTI Systems
5    Expires: July 2018                                 January 19, 2018

7                     TCP Extended Data Offset Option
8                      draft-ietf-tcpm-tcp-edo-09.txt

10   Status of this Memo

12      This Internet-Draft is submitted in full conformance with the
13      provisions of BCP 78 and BCP 79.

15      Internet-Drafts are working documents of the Internet Engineering
16      Task Force (IETF), its areas, and its working groups. Note that
17      other groups may also distribute working documents as
18      Internet-Drafts.

20      Internet-Drafts are draft documents valid for a maximum of six
21      months and may be updated, replaced, or obsoleted by other documents
22      at any time. It is inappropriate to use Internet-Drafts as
23      reference material or to cite them other than as "work in progress."

25      The list of current Internet-Drafts can be accessed at
26      http://www.ietf.org/ietf/1id-abstracts.txt

28      The list of Internet-Draft Shadow Directories can be accessed at
29      http://www.ietf.org/shadow.html

31      This Internet-Draft will expire on July 19, 2018.

33   Copyright Notice

35      Copyright (c) 2018 IETF Trust and the persons identified as the
36      document authors. All rights reserved.

38      This document is subject to BCP 78 and the IETF Trust's Legal
39      Provisions Relating to IETF Documents
40      (http://trustee.ietf.org/license-info) in effect on the date of
41      publication of this document. Please review these documents
42      carefully, as they describe your rights and restrictions with
43      respect to this document. Code Components extracted from this
44      document must include Simplified BSD License text as described in
45      Section 4.e of the Trust Legal Provisions and are provided without
46      warranty as described in the Simplified BSD License.

48   Abstract

50      TCP segments include a Data Offset field to indicate space for TCP
51      options, but the size of the field can limit the space available for
52      complex options such as SACK and Multipath TCP and can limit the
53      combination of such options supported in a single connection. This
54      document updates RFC 793 with an optional TCP extension to that
55      space to support the use of multiple large options. It also explains
56      why the initial SYN of a connection cannot be extended within a
57      single segment.

59   Table of Contents

61      1. Introduction
62      2. Conventions used in this document
63      3. Motivation
64      4. Requirements for Extending TCP's Data Offset
65      5. The TCP EDO Option
66         5.1. EDO Supported
67         5.2. EDO Extension
68         5.3. The two EDO Extension variants
69      6. TCP EDO Interaction with TCP
70         6.1. TCP User Interface
71         6.2. TCP States and Transitions
72         6.3. TCP Segment Processing
73         6.4. Impact on TCP Header Size
74         6.5. Connectionless Resets
75         6.6. ICMP Handling
76      7. Interactions with Middleboxes
77         7.1. Middlebox Coexistence with EDO
78         7.2. Middlebox Interference with EDO
79      8. Comparison to Previous Proposals
80         8.1. EDO Criteria
81         8.2. Summary of Approaches
82         8.3.
                Extended Segments
83         8.4. TCPx2
84         8.5. LO/SLO
85         8.6. LOIC
86         8.7. Problems with Extending the Initial SYN
87      9. Implementation Issues
88      10. Security Considerations
89      11. IANA Considerations
90      12. References
91         12.1. Normative References
92         12.2. Informative References
93      13. Acknowledgments

95   1. Introduction

97      TCP's Data Offset (DO) is a 4-bit field that indicates the number
98      of 32-bit words in the entire TCP header [RFC793]. This limits the
99      current total header size to 60 bytes, of which the basic header
100     occupies 20, leaving 40 bytes for options. These 40 bytes are
101     increasingly becoming a limitation to the development of advanced
102     capabilities, such as when SACK [RFC2018][RFC6675] is combined with
103     Multipath TCP [RFC6824], TCP-AO [RFC5925], or TCP Fast Open
104     [RFC7413].

106     This document specifies the TCP Extended Data Offset (EDO) option,
107     which is independent of (and thus compatible with) IPv4 and IPv6. EDO
108     extends the space available for TCP options, except for the initial
109     SYN and SYN/ACK. This document also explains why the option space of
110     the initial SYN segments cannot be extended as individual segments
111     without severe impact on TCP's initial handshake, and describes the
112     SYN/ACK limitation that results from potential middlebox misbehavior.
113     Multiple other TCP extensions are being considered in the TCPM
114     working group in order to address the case of SYN and SYN/ACK
115     segments [Bo14][Br14][To18]. Some of these other extensions can work
116     in conjunction with EDO (e.g., [To18]).

118  2. Conventions used in this document

120     The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
121     "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
122     document are to be interpreted as described in RFC-2119 [RFC2119].

124     In this document, these words will appear with that interpretation
125     only when in ALL CAPS. Lower case uses of these words are not to be
126     interpreted as carrying RFC-2119 significance.

128     In this document, the characters ">>" preceding one or more indented
129     lines indicate a compliance requirement statement using the key words
130     listed above. This convention aids reviewers in quickly identifying
131     or finding the explicit compliance requirements of this RFC.

133  3. Motivation

135     TCP supports headers with a total length of up to 15 32-bit words,
136     as indicated in the 4-bit Data Offset field [RFC793]. This accounts
137     for a total of 60 bytes, of which the default TCP header fields
138     occupy 20 bytes, leaving 40 bytes for options.

140     TCP connections already use this option space for a variety of
141     capabilities. These include Maximum Segment Size (MSS) [RFC793],
142     Window Scale (WS) [RFC7323], Timestamp (TS) [RFC7323], Selective
143     Acknowledgement (SACK) [RFC2018][RFC6675], TCP Authentication Option
144     (TCP-AO) [RFC5925], Multipath TCP (MP-TCP) [RFC6824], and TCP User
145     Timeout [RFC5482]. Some options occur only in a SYN or SYN/ACK (MSS,
146     WS), and others vary in size when used in SYN vs. non-SYN segments.

148     Each of these options consumes space; some options consume as much
149     space as is available (SACK), and other desired combinations can
150     easily exceed the currently available space.
        For example, it is not
151     currently possible to use TCP-AO with both TS and MP-TCP in the same
152     non-SYN segment, i.e., to combine accurate round-trip estimation,
153     authentication, and multipath support in the same connection - even
154     though these options can be negotiated during a SYN exchange (10
155     bytes for TS, 16 bytes for TCP-AO, and 12 bytes for MP-TCP).

157     TCP EDO is intended to overcome this limitation for non-SYN
158     segments, as well as to increase the space available for SACK
159     blocks. The impact of EDO on existing options is discussed further
160     in Section 6.4. Extending SYN segments is much more
161     complicated, as discussed in Section 8.7.

163  4. Requirements for Extending TCP's Data Offset

165     The primary goal of extending the TCP Data Offset field is to
166     increase the space available for TCP options in all segments except
167     the initial SYN.

169     An important requirement of any such extension is that it not impact
170     legacy endpoints. Endpoints seeking to use this new option should
171     not incur additional delay or segment exchanges to connect to either
172     new endpoints supporting this option or legacy endpoints without
173     this option. We call this a "backward downgrade" capability.

175     An additional consideration for this extension is avoiding user data
176     corruption in the presence of popular network devices, including
177     middleboxes. Potential middlebox misbehavior can also
178     interfere with extension in the SYN/ACK.

180  5. The TCP EDO Option

182     TCP EDO extends the option space for all segments except the initial
183     SYN (i.e., SYN set and ACK not set) and SYN/ACK response. EDO is
184     indicated by the TCP option codepoint of EDO-OPT and has two types:
185     EDO Supported and EDO Extension, as discussed in the following
186     subsections.

188  5.1. EDO Supported

190     EDO capability is determined in both directions using a single
191     exchange of the EDO Supported option (Figure 1).
        When EDO is desired
192     on a given connection, the SYN and SYN/ACK segments include the EDO
193     Supported option, which consists of the two required TCP option
194     fields: Kind and Length. The EDO Supported option is used only in
195     the SYN and SYN/ACK segments and only to confirm support for EDO in
196     subsequent segments.

198                          +--------+--------+
199                          |  Kind  | Length |
200                          +--------+--------+

202                  Figure 1 TCP EDO Supported option

204     An endpoint seeking to enable EDO includes the EDO Supported option
205     in the initial SYN. If the receiver of that SYN agrees to use EDO, it
206     responds with the EDO Supported option in the SYN/ACK. The EDO
207     Supported option does not extend the TCP option space.

209     >> Connections using EDO MUST negotiate its availability during the
210     SYN exchange of the initial three-way handshake.

212     >> An endpoint confirming and agreeing to EDO use MUST respond with
213     the EDO Supported option in its SYN/ACK.

215     The SYN/ACK uses only the EDO Supported option (and not the EDO
216     Extension option, below) because it may not yet be safe to extend
217     the option space in the reverse direction due to potential middlebox
218     misbehavior (see Section 7.2). Extension of the SYN and SYN/ACK
219     space is addressed separately (see Section 8.7).

221  5.2. EDO Extension

223     When EDO is successfully negotiated, all other segments use the EDO
224     Extension option, of which there are two variants (Figure 2 and
225     Figure 3). Both variants are considered equivalent and either
226     variant can be used in any segment where the EDO Extension option is
227     required. Both variants add a Header_Length field (in
228     network-standard byte order), indicating the length of the entire
229     TCP header in 32-bit words.
        Figure 3 depicts the longer variant, which includes
230     an additional Segment_Length field, which is identical to the TCP
231     pseudoheader TCP Length field and is used to detect when segments
232     have been altered in ways that would interfere with EDO (discussed
233     further in Section 5.3).

235                 +--------+--------+--------+--------+
236                 |  Kind  | Length |  Header_Length  |
237                 +--------+--------+--------+--------+

239          Figure 2 TCP EDO Extension option - simple variant

241                 +--------+--------+--------+--------+
242                 |  Kind  | Length |  Header_Length  |
243                 +--------+--------+--------+--------+
244                 | Segment_Length  |
245                 +--------+--------+

247     Figure 3 TCP EDO Extension option - with segment length verification

249     >> Once enabled on a connection, all segments in both directions
250     MUST include the EDO Extension option. Segments not needing
251     extension MUST set the EDO Extension option Header_Length field
252     equal to the Data Offset length.

254     >> The EDO Extension option MAY be used only if confirmed when the
255     connection transitions to the ESTABLISHED state, e.g., a client is
256     enabled after receiving the EDO Supported option in the SYN/ACK and
257     the server is enabled after seeing the EDO Extension option in the
258     final ACK of the three-way handshake. If either of those segments
259     lacks the appropriate EDO option, the connection MUST NOT use any
260     EDO options on any other segments.

262     Internet paths may vary after connection establishment, introducing
263     misbehaving middleboxes (see Section 7.2). Using EDO on all segments
264     in both directions allows this condition to be detected.

266     >> The EDO Supported option MAY occur in an initial SYN as desired
267     (e.g., as expressed by the user/application) and in the SYN/ACK as
268     confirmation, but MUST NOT be inserted in other segments. If the EDO
269     Supported option is received in other segments, it MUST be silently
270     ignored.

272     >> If EDO has not been negotiated and agreed, the EDO Extension
273     option MUST be silently ignored on subsequent segments. The EDO
274     Extension option MUST NOT be sent in an initial SYN segment or
275     SYN/ACK, and MUST be silently ignored and not acknowledged if so
276     received.

278     >> If EDO has been negotiated, any subsequent segments arriving
279     without the EDO Extension option MUST be silently ignored. Such
280     events MAY be logged as warnings, and logging MUST be rate
281     limited.

283     When processing a segment, EDO needs to be visible within the area
284     indicated by the Data Offset field, so that processing can use the
285     EDO Header_Length to override the field for that segment.

287     >> The EDO Extension option MUST occur within the space indicated by
288     the TCP Data Offset.

290     >> The EDO Extension option indicates the total length of the
291     header. The length indicated by the EDO Header_Length field MUST NOT
292     exceed the total segment size (i.e., TCP Length).

294     >> The EDO Header_Length MUST be at least as large as the TCP Data
295     Offset field of the segment in which they both appear. When the EDO
296     Header_Length equals the Data Offset length, the EDO Extension
297     option is present but it does not extend the option space. When the
298     EDO Header_Length is invalid, the TCP segment MUST be silently
299     dropped.

301     >> The EDO Supported option SHOULD be aligned on a 16-bit boundary
302     and the EDO Extension option SHOULD be aligned on a 32-bit boundary,
303     in both cases for simpler processing.

305     For example, a segment with only EDO would have a Data Offset of 6
306     or 7 (depending on the EDO Extension variant used), where EDO would
307     be the first option processed, at which point the EDO Extension
308     option would override the Data Offset and processing would continue
309     until the end of the TCP header as indicated by the EDO
310     Header_Length field.
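
The Header_Length rules above can be illustrated with a short receive-side
sketch in Python. It is only a sketch under stated assumptions: the draft names
the codepoint "EDO-OPT" without giving a value, so EDO_OPT_KIND = 253 is a
placeholder chosen here, and the function name edo_header_length is
hypothetical. The sketch assumes EDO was already negotiated on the connection
and does not model options (such as authentication) that might be processed
before EDO.

```python
import struct

# Assumption: placeholder for the unassigned "EDO-OPT" codepoint.
EDO_OPT_KIND = 253


def edo_header_length(segment: bytes):
    """Return the effective TCP header length in bytes, or None to drop.

    `segment` is a whole TCP segment (header plus data) received on a
    connection where EDO has already been negotiated.
    """
    if len(segment) < 20:
        return None
    data_offset = (segment[12] >> 4) * 4      # Data Offset field, in bytes
    if data_offset < 20 or data_offset > len(segment):
        return None
    i = 20
    # The EDO Extension option must lie within the Data Offset area.
    while i < data_offset:
        kind = segment[i]
        if kind == 0:                         # End of Option List
            break
        if kind == 1:                         # No-Operation, one byte
            i += 1
            continue
        if i + 1 >= data_offset:
            return None                       # truncated option
        length = segment[i + 1]
        if length < 2 or i + length > data_offset:
            return None                       # malformed option
        if kind == EDO_OPT_KIND and length in (4, 6):
            (words,) = struct.unpack_from("!H", segment, i + 2)
            header_len = words * 4            # Header_Length is in 32-bit words
            # Must be at least the Data Offset and at most the TCP Length.
            if header_len < data_offset or header_len > len(segment):
                return None
            return header_len
        i += length
    return None  # EDO negotiated but option absent: segment is ignored
```

A caller would continue option processing from the end of the EDO Extension
option up to the returned length and treat the rest of the segment as user
data.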

312     There are cases where it might be useful to process other options
313     before EDO, notably those that determine whether the TCP header is
314     valid, such as authentication, encryption, or alternate checksums.
315     In those cases, the EDO Extension option is preferably the first
316     option after a validation option, and the payload after the Data
317     Offset is treated as user data for the purposes of validation.

319     >> The EDO Extension option SHOULD occur as early as possible,
320     either first or just after any authentication or encryption, and
321     SHOULD be the last option covered by the Data Offset value.

323     Other options are generally handled in the same manner as when the
324     EDO option is not active, unless they interact with other options.

326     One such example is TCP-AO [RFC5925], which optionally ignores the
327     contents of TCP options, so it would need to be aware of EDO to
328     operate correctly when options are excluded from the MAC
329     calculation.

331     >> Options that depend on other options, such as TCP-AO [RFC5925]
332     (which may include or exclude options in MAC calculations), MUST also
333     be augmented to interpret the EDO Extension option to operate
334     correctly.

336  5.3. The two EDO Extension variants

338     There are two variants of the EDO Extension option; one includes a
339     copy of the TCP segment length, copied from the TCP pseudoheader
340     [RFC793]. The Segment_Length field is added to the longer variant to
341     detect when segments are incorrectly and inappropriately merged by
342     middleboxes or TCP offload processing without consideration for
343     the additional option space indicated by the EDO Header_Length
344     field. Such effects are described in further detail in Section 7.2.

346     >> An endpoint MAY use either variant of the EDO Extension option
347     interchangeably.

349     When the longer, 6-byte variant is used, the Segment_Length field is
350     used to check whether modification of the segment was performed
351     consistent with knowledge of the EDO option. The Segment_Length
352     field will detect any modification of the length of the segment,
353     such as might occur when segments are split or merged, that occurs
354     without also updating the Segment_Length field. The Segment_Length
355     field thus helps endpoints detect devices that merge or
356     split TCP segments without support for EDO. Devices that merge or
357     split TCP segments that support EDO would update the Segment_Length
358     field as needed, but would also ensure that the user data is handled
359     separately from the extended option space indicated by EDO.

361     >> When an endpoint creates a new segment using the 6-byte EDO
362     Extension option, the Segment_Length field is initialized with a
363     copy of the segment length from the TCP pseudoheader.

365     >> When an endpoint receives a segment using the 6-byte EDO
366     Extension option, it MUST validate the Segment_Length field with the
367     length of the segment as indicated in the TCP pseudoheader. If the
368     segment lengths do not match, the segment MUST be discarded and an
369     error SHOULD be logged in a rate-limited manner.

371     >> The 6-byte EDO Extension variant SHOULD be used where middlebox
372     or TCP offload support could merge or split TCP segments without
373     consideration for the EDO option. Because these conditions could
374     occur at either endpoint or along the network path, the 6-byte
375     variant SHOULD be preferred until sufficient evidence for safe use
376     of the 4-byte variant is determined by the community.

378     The field will not detect other modification of the TCP user data;
379     such modifications would need more complex detection mechanisms,
380     such as checksums or hashes. When these are used, as with IPsec or
381     TCP-AO, the 4-byte variant is sufficient.
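
The Segment_Length check described above can be sketched as follows. This is a
hedged illustration, not a reference implementation: the function name
segment_length_ok is hypothetical, the option bytes are assumed to have been
located already, and the Kind value used in the usage note (253) is a
placeholder because the draft only names the codepoint "EDO-OPT".

```python
import struct


def segment_length_ok(edo_option: bytes, tcp_length: int) -> bool:
    """Check one EDO Extension option against the pseudoheader TCP Length.

    edo_option: raw bytes of the option (Kind, Length, Header_Length and,
                for the 6-byte variant, Segment_Length).
    tcp_length: TCP Length from the pseudoheader (header plus data, bytes).
    """
    if len(edo_option) < 2 or edo_option[1] != len(edo_option):
        return False                  # Length field disagrees with the bytes
    if edo_option[1] == 4:
        return True                   # 4-byte variant carries no Segment_Length
    if edo_option[1] != 6:
        return False                  # not a known EDO Extension variant
    (segment_length,) = struct.unpack_from("!H", edo_option, 4)
    # A mismatch suggests the segment was merged or split by a device that
    # did not update the field; the segment would then be discarded.
    return segment_length == tcp_length
```

For instance, an option built as bytes([253, 6]) followed by a Header_Length of
8 and a Segment_Length of 1000 passes for a 1000-byte segment and fails if a
middlebox has coalesced it into a 1460-byte segment.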

383     >> The 4-byte EDO Extension variant is sufficient when EDO is used
384     in conjunction with other mechanisms that provide integrity
385     protection, such as IPsec or TCP-AO.

387  6. TCP EDO Interaction with TCP

389     The following subsections describe how EDO interacts with the TCP
390     specification [RFC793].

392  6.1. TCP User Interface

394     The TCP EDO option is enabled on a connection using a mechanism
395     similar to any other per-connection option. In Unix systems, this is
396     typically performed using the 'setsockopt' system call.

398     >> Implementations can also employ system-wide defaults; however,
399     systems SHOULD NOT activate this extension by default to avoid
400     interfering with legacy applications.

402     >> Due to the potential impacts of legacy middleboxes (discussed in
403     Section 7), a TCP implementation supporting EDO SHOULD log any
404     events within an EDO connection when options that are malformed or
405     show other evidence of tampering arrive. An operating system MAY
406     choose to cache the list of destination endpoints where this has
407     occurred and block use of EDO on future connections to those
408     endpoints, but this cache MUST be accessible to users/applications
409     on the host. Note that such endpoint assumptions can vary in the
410     presence of load balancers where server implementations vary behind
411     such balancers.

413  6.2. TCP States and Transitions

415     TCP EDO does not alter the existing TCP state or state transition
416     mechanisms.

418  6.3. TCP Segment Processing

420     TCP EDO alters segment processing during the TCP option processing
421     step. Once detected, the TCP EDO Extension option overrides the TCP
422     Data Offset field for all subsequent option processing. Option
423     processing continues at the next option (if present) after the EDO
424     Extension option.

426  6.4.
     Impact on TCP Header Size

428     The TCP EDO Supported option increases SYN header length by a
429     minimum of 2 bytes, but could increase it by more depending on
430     32-bit word alignment. Currently popular SYN options total 19 bytes,
431     which leaves more than enough room for the EDO Supported option:

433     o  SACK permitted (2 bytes in SYN, optionally 2 + 8N bytes after)
434        [RFC2018][RFC6675]

436     o  Timestamp (10 bytes) [RFC7323]

438     o  Window scale (3 bytes) [RFC7323]

440     o  MSS option (4 bytes) [RFC793]

442     Adding the EDO Supported option would result in a total of 21 bytes
443     of SYN option space.

445     Subsequent segments would use 10 bytes of option space without any
446     SACK blocks (TS only; WS and MSS are used only in SYN and SYN/ACK)
447     or allow up to 3 SACK blocks before needing to use EDO; with EDO,
448     the number of SACK blocks or additional options would be
449     substantially increased. There are also other options that are
450     emerging in the SYN, including TCP Fast Open, which uses another
451     6-18 (typically 10) bytes in the SYN/ACK of the first connection and
452     in the SYN of subsequent connections [RFC7413].

454     TCP EDO can also be negotiated in SYNs with either of the following
455     large options:

457     o  TCP-AO (authentication) (16 bytes) [RFC5925]

459     o  Multipath TCP (12 bytes in SYN and SYN/ACK, 20 after) [RFC6824]

461     Including TCP-AO with TS, WS, MSS, and SACK increases the SYN option
462     space use to 35 bytes; with Multipath TCP instead, the use is 31
463     bytes. When Multipath TCP is enabled with the typical options, later
464     segments would require 30 bytes without SACK, thus limiting the SACK
465     option to one block unless EDO is also supported on at least non-SYN
466     segments.

468     The full combination of the above options (47 bytes for TS, WS, MSS,
469     SACK, TCP-AO, and MPTCP) does not fit in the existing SYN option
470     space and (as noted) that space cannot be extended within a single
471     SYN segment.
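
The byte counts quoted in this section can be reproduced with a few lines of
Python. This is only a worked check of the arithmetic; the option sizes come
from the lists above, while the constant and function names are illustrative
and not part of the draft.

```python
# Option sizes in bytes, taken from the lists in Section 6.4.
OPTION_SPACE = 40      # 60-byte maximum header minus the 20-byte base header
SACK_PERMITTED, TS, WS, MSS = 2, 10, 3, 4
EDO_SUPPORTED = 2
TCP_AO = 16
MPTCP_SYN, MPTCP_LATER = 12, 20


def max_sack_blocks(other_option_bytes: int, space: int = OPTION_SPACE) -> int:
    """Largest N for which a SACK option of 2 + 8N bytes still fits."""
    return max(0, (space - other_option_bytes - 2) // 8)


popular_syn = SACK_PERMITTED + TS + WS + MSS
syn_totals = {
    "popular options": popular_syn,
    "with EDO Supported": popular_syn + EDO_SUPPORTED,
    "with TCP-AO": popular_syn + TCP_AO,
    "with Multipath TCP": popular_syn + MPTCP_SYN,
    "with TCP-AO and Multipath TCP": popular_syn + TCP_AO + MPTCP_SYN,
}

for name, total in syn_totals.items():
    verdict = "fits" if total <= OPTION_SPACE else "does not fit"
    print(f"SYN {name}: {total} bytes ({verdict} in {OPTION_SPACE})")

# Non-SYN segments without EDO: SACK blocks that fit beside the other options.
print("SACK blocks with TS only:", max_sack_blocks(TS))
print("SACK blocks with TS and Multipath TCP:", max_sack_blocks(TS + MPTCP_LATER))
```

The totals come out as 19, 21, 35, 31, and 47 bytes, and the SACK limits as 3
blocks and 1 block, matching the figures in the text.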
        There has been a proposal to change TS to a 2-byte
472     "TS-permitted" signal in the initial SYN, provided it can be safely
473     enabled later in the connection or might be avoided completely
474     [Ni15]. Even using "TS-permitted", the total space is still too
475     large to support in the initial SYN without SYN option space
476     extension [Bo14][Br14][To18].

478     The EDO Extension option has negligible impact on other headers,
479     because it can either come first or just after security information,
480     and in either case the additional 4 or 6 bytes are easily
481     accommodated within the TCP Data Offset length. Once the EDO option
482     is processed, the entirety of the remainder of the TCP segment is
483     available for any remaining options.

485  6.5. Connectionless Resets

487     A RST may arrive during a currently active connection or may be
488     needed to clean up old state from an abandoned connection. The latter
489     occurs when a new SYN is sent to an endpoint with matching existing
490     connection state, at which point that endpoint responds with a RST
491     and both ends remove stale information.

493     The EDO Extension option is mandatory on all TCP segments once
494     negotiated, i.e., except in the SYN and SYN/ACK (which establish
495     support) and the RST. A RST may lack the context to know that EDO is
496     active on a connection.

498     >> The EDO Extension option MAY occur in a RST when the endpoint has
499     connection state that has negotiated EDO. However, unless the RST is
500     generated by an incoming segment that includes an EDO Extension
501     option, the transmitted RST MUST NOT include the EDO Extension
502     option.

504  6.6. ICMP Handling

506     ICMP responses are intended to include the IP and the port fields of
507     TCP and UDP headers of typical TCP/IP and UDP/IP packets [RFC792].
508     This includes the first 8 data bytes of the original datagram,
509     intended to include the transport port numbers used for connection
510     demultiplexing.
        Later specifications encourage returning as much of
511     the original payload as possible [RFC1812]. In either case, legacy
512     options or new options in the EDO extension area might or might not
513     be included, and so options are generally not assumed to be part of
514     ICMP processing anyway.

516  7. Interactions with Middleboxes

518     Middleboxes are on-path devices that typically examine or modify
519     packets in ways that Internet routers do not [RFC3234]. This
520     includes parsing transport headers and/or rewriting transport
521     segments in ways that may affect EDO.

523     There are several cases to consider:

525     -  Typical NAT/NAPT devices, which modify only IP address and/or TCP
526        port number fields (with associated TCP checksum updates)

528     -  Middleboxes that try to reconstitute TCP data streams, such as
529        for deep-packet inspection for virus scanning

531     -  Middleboxes that modify known TCP header fields

533     -  Middleboxes that rewrite TCP segments

535  7.1. Middlebox Coexistence with EDO

537     Middleboxes can coexist with EDO when they either support EDO or
538     when they ignore its impact on segment structure.

540     NATs and NAPTs, which rewrite IP address and/or transport port
541     fields, are the most common form of middlebox and are not affected
542     by the EDO option.

544     Middleboxes that support EDO would be those that correctly parse the
545     EDO option. Such boxes can reconstitute the TCP data stream
546     correctly or can modify header fields and/or rewrite segments
547     without impact to EDO.

549     Conventional TCP proxies terminate the TCP connection in both
550     directions and thus operate as TCP endpoints, such as when a
551     client-middlebox and middlebox-server each have separate TCP
552     connections. They would support EDO by following the host
553     requirements herein on both connections. The use of EDO on one
554     connection is independent of its use on the other in this case.

556  7.2.
     Middlebox Interference with EDO

558     Middleboxes that do not support EDO cannot coexist with its use when
559     they modify segment boundaries or do not forward unknown (e.g., the
560     EDO) options.

562     So-called "transparent" rewriting proxies, which inappropriately and
563     incorrectly modify TCP segment boundaries, might mix option
564     information with user data if they did not support EDO. Such devices
565     might also interfere with other TCP options such as TCP-AO. There
566     are three types of such boxes:

568     o  Those that process received options and transmit sent options
569        separately, i.e., although they rewrite segments, they behave as
570        TCP endpoints in both directions.

572     o  Those that split segments, taking a received segment and emitting
573        two or more segments with revised headers.

575     o  Those that join segments, receiving multiple segments and
576        emitting a single segment whose data is the concatenation of the
577        components.

579     In all three cases, EDO is either treated as independent on
580     different sides of such boxes or not. If independent, EDO would
581     either be correctly terminated in either or both directions or
582     disabled due to lack of SYN/ACK confirmation in either or both
583     directions. Problems would occur only when TCP segments with EDO are
584     combined or split while ignoring the EDO option. In the split case,
585     the key concern is if the split happens within the option extension
586     space or if EDO is silently copied to both segments without copying
587     the corresponding extended option space contents. However, the most
588     comprehensive study of these cases indicates that "although
589     middleboxes do split and coalesce segments, none did so while
590     passing unknown options" [Ho11].

592     Note that the second and third types of middlebox behaviors listed
593     above may create syndromes similar to TCP transmit and receive
594     hardware offload engines that incorrectly modify segments with
595     unknown options.

597     Middleboxes that silently remove options that they do not implement
598     have been observed [Ho11]. Such boxes interfere with the use of the
599     EDO Extension option in the SYN and SYN/ACK segments because
600     extended option space would be misinterpreted as user data if the
601     EDO Extension option were removed, and this cannot be avoided. This
602     is one reason that SYN and SYN/ACK extension requires alternate
603     mechanisms (see Section 8.7). It is also the reason for the 6-byte
604     EDO Extension variant (see Section 5.3), which can detect such
605     merging or splitting of segments. Further, if such middleboxes
606     become present on a path they could cause similar misinterpretation
607     on segments exchanged in the ESTABLISHED and subsequent states. As a
608     result, this document requires that the EDO Extension option be
609     avoided on the SYN/ACK and that this option be used on all
610     segments once successfully negotiated, and encourages use of the
611     6-byte EDO Extension variant.

613     Deep-packet inspection systems that inspect TCP segment payloads or
614     attempt to reconstitute the data stream would incorrectly include
615     option data in the reconstituted user data stream, which might
616     interfere with their operation.

618     >> It can be important to detect misbehavior that could cause EDO
619     space to be misinterpreted as user data. In such cases, EDO SHOULD
620     be used in conjunction with an integrity protection mechanism. This
621     includes the 6-byte EDO Extension variant or stronger mechanisms
622     such as IPsec, TCP-AO, etc. It is useful to note that such
623     protection only helps detect non-compliant components and enables
624     avoidance (e.g., disabling EDO); integrity protection alone cannot
625     correct the misinterpretation of EDO space as user data.

627     This situation is similar to that of ECN and ICMP support in the
628     Internet. In both cases, endpoints have evolved mechanisms for
629     detecting and robustly operating around "black holes".
   Very similar algorithms are expected to be applicable for EDO.

8. Comparison to Previous Proposals

   EDO is the latest in a long line of attempts to increase TCP option
   space [Al06][Ed08][Ko04][Ra12][Yo11]. The following is a comparison
   of these approaches to EDO, based partly on a previous summary
   [Ra12]. This comparison differs from that summary by using a
   different set of success criteria.

8.1. EDO Criteria

   Our criteria for a successful solution are as follows:

   o  Zero-cost fallback to legacy endpoints.

   o  Minimal impact on middlebox compatibility.

   o  No additional side-effects.

   Zero-cost fallback requires that upgraded hosts incur no penalty for
   attempting to use EDO. This disqualifies dual-stack approaches,
   because the client might have to delay connection establishment to
   wait for the preferred connection mode to complete. Note that the
   impact of legacy endpoints that silently reflect unknown options is
   not considered, as they are already non-compliant with existing TCP
   requirements [RFC793].

   Minimal impact on middlebox compatibility requires that EDO work
   through simple NAT and NAPT boxes, which modify IP addresses and
   ports and recompute IPv4 header and TCP segment checksums.
   Middleboxes that reject unknown options or that process segments in
   detail without regard for unknown options are not considered; they
   process segments as if they were an endpoint but do so in ways that
   are not compliant with existing TCP requirements (e.g., they should
   have rejected the initial SYN because of its unknown options rather
   than silently relaying it).

   EDO also attempts to avoid creating side-effects, such as might
   happen if options were split across multiple TCP segments (which
   could arrive out of order or be lost) or across different TCP
   connections (which could fail to share fate through firewalls or
   NAT/NAPTs).

   These requirements are similar to those noted in [Ra12], but EDO
   groups cases of segment modification beyond address and port - such
   as rewriting, segment drop, sequence number modification, and option
   stripping - as already in violation of existing TCP requirements
   regarding unknown options, and so we do not consider their impact on
   this new option.

8.2. Summary of Approaches

   There are three basic ways in which TCP option space extension has
   been attempted:

   1. Use of a TCP option.

   2. Redefinition of the existing TCP header fields.

   3. Use of option space in multiple TCP segments (split across
      multiple segments).

   A TCP option is the most direct way to extend the option space and
   is the basis of EDO. This approach cannot extend the option space of
   the initial SYN.

   Redefining existing TCP header fields can be used to either contain
   additional options or as a pointer indicating alternate ways to
   interpret the segment payload. All such redefinitions make it
   difficult to achieve zero-impact backward compatibility, both with
   legacy endpoints and middleboxes.

   Splitting option space across separate segments can create
   unintended side-effects, such as increased delay to deal with path
   latency or loss differences.

   The following discusses four of the most notable past attempts to
   extend the TCP option space: Extended Segments, TCPx2, LO/SLO, and
   LOIC. [Ra12] suggests a few other approaches, including use of TCP
   option cookies, reuse/overload of other TCP fields (e.g., the URG
   pointer), or compressing TCP options. None of these is compatible
   with legacy endpoints or middleboxes.

8.3. Extended Segments

   TCP Extended Segments redefined the meaning of currently unused
   values of the Data Offset (DO) field [Ko04]. TCP defines DO as
   indicating the length of the TCP header, including options, in 32-bit
   words.
   The default TCP header with no options is 5 such words, so the
   minimum currently valid DO value is 5 (meaning no option space; the
   maximum DO value of 15 allows 40 bytes of option space). The
   Extended Segments proposal defines interpretations of values 0-4:
   DO=0 means 48 bytes of option space, DO=1 means 64, DO=2 means 128,
   DO=3 means 256, and DO=4 means unlimited (e.g., the entire payload
   is option space). This variant negotiates the use of this capability
   by using one of these invalid DO values in the initial SYN.

   Use of this variant is not backward-compatible with legacy TCP
   implementations, whether at the desired endpoint or on middleboxes.
   The variant also defines a way to initiate the feature on the
   passive side, e.g., using an invalid DO during the SYN/ACK when the
   initial SYN had a valid DO. This capability allows either side to
   initiate use of the feature but is also not backward compatible.

8.4. TCPx2

   TCPx2 redefines legacy TCP headers by basically doubling all TCP
   header fields [Al06]. It relies on a new transport protocol number
   to indicate its use, defeating backward compatibility with all
   existing TCP capabilities, including firewalls, NATs/NAPTs, and
   legacy endpoints and applications.

8.5. LO/SLO

   The TCP Long Option (LO, [Ed08]) is very similar to EDO, except that
   presence of LO results in ignoring the existing Data Offset (DO)
   field and that LO is required to be the first option. EDO considers
   the need for other fields to be first and declares that the EDO is
   the last option as indicated by the DO field value. Like LO, EDO is
   required in every segment once negotiated.

   The TCP Long Option draft also specified the SYN Long Option (SLO)
   [Ed08]. If SLO is used in the initial SYN and successfully
   negotiated, it is used in each subsequent segment until all of the
   initial SYN options are transmitted.

   LO is backward compatible, as is SLO; in both cases, endpoints not
   supporting the option would not respond with the option, and in both
   cases the initial SYN is not itself extended.

   SLO does modify the three-way handshake because the connection isn't
   considered completely established until the first data byte is
   acknowledged. Legacy TCP can establish a connection even in the
   absence of data. SLO also changes the semantics of the SYN/ACK; for
   legacy TCP, this completes the active side connection establishment,
   whereas in SLO an additional data ACK is required. A connection
   whose initial SYN options have been confirmed in the SYN/ACK might
   still fail upon receipt of additional options sent in later SLO
   segments. This case - of late negotiation failure - is not addressed
   in the specification.

8.6. LOIC

   TCP Long Options by Invalid Checksum is a dual-stack approach that
   uses two initial SYNs to initiate all updated connections [Yo11].
   One SYN negotiates the new option, and the payload of the other SYN
   contains only the complete set of options. The negotiation SYN is
   compliant with existing procedures, but the option SYN has a
   deliberately incorrect TCP checksum (decremented by 2). A legacy
   endpoint would discard the segment with the incorrect checksum and
   respond to the negotiation SYN without the LO option.

   Use of the option SYN and its incorrect checksum both interfere with
   other legacy components. Segments with incorrect checksums will be
   silently dropped by most middleboxes, including NATs/NAPTs. Use of
   two SYNs creates side-effects that can delay connections to upgraded
   endpoints, notably when the option SYN is lost or the SYNs arrive
   out of order. Finally, by not allowing other options in the
   negotiation SYN, all connections to legacy endpoints either use no
   options or require a separate connection attempt (either concurrent
   or subsequent).

8.7.
Problems with Extending the Initial SYN

   The key difficulty with most previous proposals is the desire to
   extend the option space in all TCP segments, including the initial
   SYN, i.e., SYN with no ACK, typically the first segment of a
   connection, as well as possibly the SYN/ACK. It has proven difficult
   to extend space within the segment of the initial SYN in the absence
   of prior negotiation while maintaining current TCP three-way
   handshake properties, and it may be similarly challenging to extend
   the SYN/ACK (depending on asymmetric middlebox assumptions).

   A new TCP option cannot extend the Data Offset of a single TCP
   initial SYN segment, and cannot extend a SYN/ACK in a single segment
   when considering misbehaving middleboxes. All TCP segments,
   including the initial SYN and SYN/ACK, may include user data in the
   payload data [RFC793], and this can be useful for some proposed
   features such as TCP Fast Open [RFC7413]. Legacy endpoints that
   ignore the new option would process the payload contents as user
   data and send an ACK. Once ACK'd, this data cannot be removed from
   the user stream.

   The Reserved TCP header bits cannot be redefined easily, even though
   three of the six total bits have already been redefined (ECE/CWR
   [RFC3168] and NS [RFC3540]). Legacy endpoints have been known to
   reflect received values in these fields; this was safely dealt with
   for ECN but would be difficult here [RFC3168].

   TCP initial SYN (SYN and not ACK) segments can use every other TCP
   header field except the Acknowledgement number, which is not used
   because the ACK field is not set. In all other segments, all fields
   except the three remaining Reserved header bits are actively used.
   The total amount of available header fields, in either case, is
   insufficient to be useful in extending the option space.
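   The legacy-endpoint problem described earlier in this section can be
   illustrated with a short Python sketch. It is an informal example,
   not part of the specification. It models a legacy receiver that
   locates user data using only the Data Offset field, so any option
   bytes placed beyond Data Offset are delivered to the application as
   data. The helper names, port numbers, and the 8 bytes of "extended
   option" content are hypothetical; the unknown option uses
   experimental Kind 253 with the ExID 0x0ED0 only as a placeholder, and
   the checksum is left at zero for brevity.

```python
import struct


def build_segment(options: bytes, trailing: bytes) -> bytes:
    """Build a minimal TCP SYN: base header, options, then trailing bytes."""
    assert len(options) % 4 == 0, "options must be padded to 32-bit words"
    data_offset_words = 5 + len(options) // 4
    header = struct.pack(
        "!HHIIBBHHH",
        1234, 80,                 # source and destination ports (arbitrary)
        0, 0,                     # sequence and acknowledgment numbers
        data_offset_words << 4,   # Data Offset in the upper 4 bits
        0x02,                     # flags: SYN only
        65535, 0, 0,              # window, checksum (not computed), urgent
    )
    return header + options + trailing


def legacy_payload(segment: bytes) -> bytes:
    """Return the bytes a legacy receiver would treat as user data."""
    data_offset_words = segment[12] >> 4
    return segment[data_offset_words * 4:]


if __name__ == "__main__":
    unknown_option = bytes([253, 4, 0x0E, 0xD0])  # ignored by legacy TCP
    extended_space = b"\x01" * 8                  # hypothetical extra options
    user_data = b"hello"
    syn = build_segment(unknown_option, extended_space + user_data)
    # The extra option bytes appear at the front of the "user data".
    print(legacy_payload(syn))
```

   Because the legacy receiver acknowledges all 13 trailing bytes, the
   sender has no opportunity to withdraw the 8 option bytes from the
   user stream.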

   The representation of TCP options can be optimized to minimize the
   space needed. In such cases, multiple Kind and Length fields are
   combined, so that a new Kind would indicate a specific combination
   of options, whose order is fixed and whose length is indicated by
   one Length field. Most TCP options use fields whose size is much
   larger than the required Kind and Length components, so the
   resulting efficiency is typically insufficient for additional
   options.

   The option space of an initial SYN segment might be extended by
   using multiple initial segments (e.g., multiple SYNs or a SYN and
   non-SYN) or based on the context of previous or parallel
   connections. This method may also be needed to extend space in the
   SYN/ACK in the presence of misbehaving middleboxes. Because of their
   potential complexity, these approaches are addressed in separate
   documents [Bo14][Br14][To18].

   Option space cannot be extended in outer layer headers, e.g., IPv4
   or IPv6. These layers typically try to avoid extensions altogether,
   to simplify forwarding processing at routers. Introducing new shim
   layers to accommodate additional option space would interfere with
   deep-packet inspection mechanisms that are in widespread use.

   As a result, EDO does not attempt to extend the space available for
   options in TCP initial SYNs. It does extend that space in all other
   segments (including SYN/ACK), which has always been trivially
   possible once an option is defined.

9. Implementation Issues

   TCP segment processing can involve accessing nonlinear data
   structures, such as chains of buffers. Such chains are often
   designed so that the maximum default TCP header (60 bytes) fits in
   the first buffer. Extending the TCP header across multiple buffers
   may necessitate buffer traversal functions that span boundaries
   between buffers.
   Such traversal can also have a significant performance impact, which
   is additional rationale for using TCP option space - even extended
   option space - sparingly.

   Although EDO can be large enough to consume the entire segment, it
   is important to leave space for data so that the TCP connection can
   make forward progress. It would be wise to limit EDO to consuming no
   more than MSS-4 bytes of the IP segment, preferably even less (e.g.,
   MSS-128 bytes).

   When using the ExID variant for testing and experimentation, either
   TCP option codepoint (253, 254) is valid in sent or received
   segments.

   Implementers need to be careful about the potential for offload
   support interfering with this option. The EDO data needs to be
   passed to the protocol stack as part of the option space, not
   integrated with the user segment, to allow the offload to
   independently determine user data segment boundaries and combine
   them correctly with the extended option data. Some legacy hardware
   receive offload engines may present challenges in this regard, and
   may be incompatible with EDO where they incorrectly attempt to
   process segments with unknown options. Such offload engines are part
   of the protocol stack and need to be updated accordingly. Issues
   with incorrect resegmentation by an offload engine can be detected
   in the same way as middlebox tampering.

10. Security Considerations

   It is meaningless to have the Data Offset further exceed the
   position of the EDO data offset option.

   >> When the EDO Extension option is present, the EDO Extension
   option SHOULD be the last non-null option covered by the TCP Data
   Offset, because it would be the last option affected by Data Offset.

   This also makes it more difficult to use the Data Offset field as a
   covert channel.

11.
IANA Considerations

   We request that, upon publication, this option be assigned a TCP
   Option codepoint by IANA, and that the RFC Editor replace EDO-OPT
   in this document with the assigned codepoint value.

   The TCP Experimental ID (ExID) with a 16-bit value of 0x0ED0 (in
   network standard byte order) has been assigned for use during
   testing and preliminary experiments.

12. References

12.1. Normative References

   [RFC793]  Postel, J., "Transmission Control Protocol", STD 7, RFC
             793, September 1981.

   [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
             Requirement Levels", BCP 14, RFC 2119, March 1997.

12.2. Informative References

   [Al06]    Allman, M., "TCPx2: Don't Fence Me In", draft-allman-
             tcpx2-hack-00 (work in progress), May 2006.

   [Bo14]    Borman, D., "TCP Four-Way Handshake", draft-borman-
             tcp4way-00 (work in progress), October 2014.

   [Br14]    Briscoe, B., "Inner Space for TCP Options", draft-briscoe-
             tcpm-inner-space-01 (work in progress), October 2014.

   [Ed08]    Eddy, W. and A. Langley, "Extending the Space Available
             for TCP Options", draft-eddy-tcp-loo-04 (work in
             progress), July 2008.

   [Ho11]    Honda, M., Nishida, Y., Raiciu, C., Greenhalgh, A.,
             Handley, M., and H. Tokuda, "Is it still possible to
             extend TCP", Proc. ACM Sigcomm Internet Measurement
             Conference (IMC), 2011, pp. 181-194.

   [Ko04]    Kohler, E., "Extended Option Space for TCP", draft-kohler-
             tcpm-extopt-00 (work in progress), September 2004.

   [Ni15]    Nishida, Y., "A-PAWS: Alternative Approach for PAWS",
             draft-nishida-tcpm-apaws-02 (work in progress), October
             2015.

   [Ra12]    Ramaiah, A., "TCP option space extension", draft-ananth-
             tcpm-tcpoptext-00 (work in progress), March 2012.

   [RFC792]  Postel, J., "Internet Control Message Protocol", RFC 792,
             September 1981.

   [RFC1812] Baker, F. (Ed.), "Requirements for IP Version 4 Routers",
             RFC 1812, June 1995.

   [RFC2018] Mathis, M., Mahdavi, J., Floyd, S., and A. Romanow, "TCP
             Selective Acknowledgment Options", RFC 2018, October 1996.

   [RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition
             of Explicit Congestion Notification (ECN) to IP", RFC
             3168, September 2001.

   [RFC3234] Carpenter, B. and S. Brim, "Middleboxes: Taxonomy and
             Issues", RFC 3234, February 2002.

   [RFC3540] Spring, N., Wetherall, D., and D. Ely, "Robust Explicit
             Congestion Notification (ECN) Signaling with Nonces", RFC
             3540, June 2003.

   [RFC5482] Eggert, L. and F. Gont, "TCP User Timeout Option", RFC
             5482, March 2009.

   [RFC5925] Touch, J., Mankin, A., and R. Bonica, "The TCP
             Authentication Option", RFC 5925, June 2010.

   [RFC6675] Blanton, E., Allman, M., Wang, L., Jarvinen, I., Kojo, M.,
             and Y. Nishida, "A Conservative Loss Recovery Algorithm
             Based on Selective Acknowledgment (SACK) for TCP", RFC
             6675, August 2012.

   [RFC6824] Ford, A., Raiciu, C., Handley, M., and O. Bonaventure,
             "TCP Extensions for Multipath Operation with Multiple
             Addresses", RFC 6824, January 2013.

   [RFC7323] Borman, D., Braden, B., Jacobson, V., and R. Scheffenegger
             (Ed.), "TCP Extensions for High Performance", RFC 7323,
             September 2014.

   [RFC7413] Cheng, Y., Chu, J., Radhakrishnan, S., and A. Jain, "TCP
             Fast Open", RFC 7413, December 2014.

   [To18]    Touch, J. and T. Faber, "TCP SYN Extended Option Space
             Using an Out-of-Band Segment", draft-touch-tcpm-tcp-syn-
             ext-opt (work in progress), January 2018.

   [Yo11]    Yourtchenko, A., "Introducing TCP Long Options by Invalid
             Checksum", draft-yourtchenko-tcp-loic-00 (work in
             progress), April 2011.

13. Acknowledgments

   The authors would like to thank the IETF TCPM WG for their feedback,
   in particular: Olivier Bonaventure, Bob Briscoe, Ted Faber, John
   Leslie, Pasi Sarolahti, Richard Scheffenegger, and Alexander
   Zimmermann.

   This work is partly supported by USC/ISI's Postel Center.

   This document was prepared using 2-Word-v2.0.template.dot.

Authors' Addresses

   Joe Touch

   Manhattan Beach, CA 90266 USA

   Phone: +1 (310) 560-0334
   Email: touch@strayalpha.com

   Wesley M. Eddy
   MTI Systems
   US

   Email: wes@mti-systems.com