Network Working Group                                            W. Eddy
Internet-Draft                                      NASA GRC/Verizon FNS
Expires: January 2, 2009                                      A. Langley
                                                              Google Inc
                                                            July 1, 2008


             Extending the Space Available for TCP Options
                         draft-eddy-tcp-loo-04

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on January 2, 2009.

Abstract

   This document describes a method for increasing the space available
   for TCP options. Two new TCP options (LO and SLO) are detailed which
   reduce the limitations imposed by the TCP header's Data Offset field.
   The LO option provides this extension after connection establishment,
   and the SLO option aids in transmission of lengthy connection
   initialization and configuration options.

Table of Contents

   1.  Requirements Notation
   2.  Introduction
   3.  The Long Options (LO) Option
   4.  The SYN Long Options (SLO) Option
   5.  Middlebox Interactions
   6.  Comparison to Extended Segments
   7.  Security Considerations
   8.  IANA Considerations
   9.  Acknowledgements
   10. References
       10.1.  Normative References
       10.2.  Informative References
   Appendix A.  Changes
   Authors' Addresses
   Intellectual Property and Copyright Statements

1.  Requirements Notation

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

2.  Introduction

   Every TCP segment's header contains a 4-bit Data Offset (DO) field
   that implies the length of that segment's TCP header. The DO field
   has been specified as: "The number of 32-bit words in the TCP Header.
   This indicates where the data begins. The TCP header (even one
   including options) is an integral number of 32 bits long" [RFC0793].
   For a TCP implementation, this means that the boundary separating TCP
   control data and application data is always exactly DO * 4 bytes from
   the beginning of the TCP header.

   As a 4-bit unsigned integer, DO's value is bounded between 0 and 15.
   This allows for a maximum TCP header length of 60 bytes (15 * 4
   bytes). The required fields in a TCP header occupy a fixed 20 bytes,
   leaving 40 bytes as the maximum amount of space for use by TCP
   options.

   While 40 bytes is a reasonable amount of space, sufficient for the
   concurrent use of several presently defined TCP options, there are
   cases where more space might be useful. For example, the Selective
   Acknowledgement (SACK) option [RFC2018] uses a fixed 2 bytes for its
   kind and length fields, and requires an additional 8 bytes per SACK
   block. Thus, the maximum number of SACK blocks a TCP acknowledgement
   may carry is limited to 4 (with 6 bytes left over). Since SACK is
   commonly used with the Timestamp option [RFC1323], which uses 10
   bytes, this further limits the number of SACK blocks that may be
   carried to 3. For specific scenarios involving large windows and
   combinations of data and acknowledgement loss, additional capacity
   for SACK blocks is known to be useful [more-sack].

   Creation of new TCP options is also hindered by the lack of space
   left over after currently-used options are accounted for. For long
   options that must be present at connection-startup time, this is a
   particular problem, as all negotiable options need to share 40 bytes
   of space in a SYN segment. One method that has been used to get
   around this limitation is overloading the Timestamp bytes in the SYN
   segments [migrate]. There are other header fields that might be
   similarly overloaded (e.g.
   the urgent pointer), but this approach is of obviously limited
   utility, as it does not address the fundamental limitation imposed
   by the DO field, and there are a finite number of overloadable
   header bits.

   This document specifies two new TCP options, LO and SLO. The Long
   Options (LO) option allows two hosts to negotiate for the ability to
   use TCP headers longer than 60 bytes (and thus options space of
   greater than 40 bytes) on subsequent segments. This is accomplished
   by ignoring the DO field's value and adding a 16-bit field at a fixed
   location in the header's options to replace it. The format and usage
   of the LO option are detailed in Section 3.

   Attempting to process initial SYN segments with greater than 60 bytes
   of TCP headers might cause errors if received by hosts that consider
   anything past the DO-specified boundary to be application data. For
   backwards compatibility reasons, the maximum length of options on a
   connection-initiating SYN segment remains 40 bytes. The SYN Long
   Options (SLO) option is used in the case where these 40 bytes are
   not enough space to carry the desired startup configuration options,
   and negotiates for later reliable delivery of the left-off options.
   Section 4 describes the format and usage of the SLO option.

3.  The Long Options (LO) Option

   A host might implement some set of TCP options allowing it to predict
   that greater than 40 bytes of TCP options space may be useful (for
   example SACK, Timestamps, alternate checksums, etc.). In this case,
   a host MAY implement the LO option. When initiating connections
   through an active open, hosts implementing the LO option SHOULD place
   an LO option of the form shown in Figure 1 somewhere in the SYN
   segment's options. The 16-bit field labelled "Header Length" should
   be filled in with the same value as the DO field in the required
   portion of the TCP header, left-padded with zeros.
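
   As a rough illustration of the LO option layout in Figure 1, the
   Python sketch below encodes the option for a SYN. It is not part of
   the specification: the kind value 253 is only a stand-in for
   TBD-IANA-KIND1, and the function names are hypothetical.

```python
import struct

# Assumption: 253 is only a stand-in; the draft leaves the kind as
# TBD-IANA-KIND1 pending IANA assignment.
LO_KIND = 253
FIXED_HEADER_WORDS = 5  # the 20 required bytes of a TCP header


def build_lo_option(header_words: int) -> bytes:
    """Encode Kind, Length = 4, and the 16-bit Header Length (in words)."""
    if not 0 <= header_words <= 0xFFFF:
        raise ValueError("Header Length must fit in 16 bits")
    return struct.pack("!BBH", LO_KIND, 4, header_words)


def lo_for_syn(options_len: int) -> tuple[int, bytes]:
    """Return (DO, LO option) for a SYN carrying options_len option bytes.

    options_len counts every option on the SYN, the 4-byte LO option and
    any padding included, so it must be a multiple of 4 and at most 40.
    """
    if options_len % 4 != 0 or not 4 <= options_len <= 40:
        raise ValueError("SYN options must be 4..40 bytes, padded to words")
    data_offset = FIXED_HEADER_WORDS + options_len // 4
    # On a SYN the Header Length simply mirrors DO, left-padded with zeros.
    return data_offset, build_lo_option(data_offset)
```

   For a SYN whose options fill all 40 bytes, lo_for_syn(40) yields a
   DO of 15 and a Header Length of 15 words, so a receiver that does
   not understand LO still finds the data boundary where DO says it is.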

                        1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +---------------+---------------+-------------------------------+
   |    Kind =     |  Length = 4   |         Header Length         |
   | TBD-IANA-KIND1|               |       (in 4 byte words)       |
   +---------------+---------------+-------------------------------+

                      TCP Long Options (LO) Option

                                Figure 1

   Receipt of an acknowledgement covering the SYN and also containing an
   LO option means that future segments MAY include an LO option which
   expands the length of the TCP header beyond the limit of the DO
   field. In such a segment, the LO option MUST be the first option and
   the DO field MUST be set to 6. The value 6 is the length, in 32-bit
   words, of the required portion of the TCP header (5 words) plus the
   LO option (1 word).

   An LO option SHOULD NOT be used when not required by the options in a
   given segment. A host MUST reject any non-SYN segment containing an
   LO option if the DO field is not equal to 6.

   Since an LO option's Header Length field has greater range than the
   IP header's Total Length field [RFC0791], this allows TCP options to
   consume an entire maximum-sized IP datagram's length (minus the IP
   header and required TCP header fields). No matter what size the
   options section of a TCP header is, it must still be padded with
   zeros to make the total header a multiple of 32 bits, per RFC 793
   [RFC0793].

   Listening hosts that implement the LO option, after reception of a
   SYN segment with the LO option present, SHOULD reply with an LO
   option in their SYN-ACK. It can be seen that in both the normal case
   where one host passively opens and another actively opens, and the
   rarer case where two hosts simultaneously initiate active opens, the
   LO option's use can be successfully negotiated.

4.  The SYN Long Options (SLO) Option

   If the LO option has been successfully negotiated, an active-opening
   host that has more bytes of initialization options than would fit in
   the SYN can use the SYN Long Options (SLO) option. If a host
   supports the LO option, then it MUST support the SLO option.

   Any option bytes transmitted using the SLO option will be treated as
   if they were carried on the SYN segment. Since there is no guarantee
   that the LO option will be successfully negotiated, the additional 36
   bytes left over aside from the 4-byte LO option on a SYN segment
   should be filled with the most important remaining options that will
   fit, as determined by the particular implementation. A host issuing
   a passive open MUST NOT use the SLO option, as it can use the LO
   option on SYN-ACK segments if it needs to send long initialization
   options. The SLO option only serves the needs of an active-opening
   host that, for backwards compatibility reasons, could not send more
   than 40 bytes of options on the SYN segment.

   After successful LO negotiation, if a host has any options that did
   not fit on the SYN, then additional data or acknowledgement segments
   MUST carry an SLO option until the first data byte has been
   acknowledged. The SLO option's format is shown in Figure 2. The
   trailing 2 bytes hold a 16-bit unsigned count of the additional
   bytes that would have been in the SYN segment's options, if they had
   been possible to include. This represents an offset from the end of
   the SLO option to the last byte that should be considered a SYN
   option. The next "Additional Byte Count"-number of bytes trailing
   the SLO option MUST be the ones that did not fit in the SYN segment.
   The SLO option should always immediately follow the LO option,
   followed by the additional SYN options, and then by normal options,
   and finally application data.
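
   The ordering just described (LO, then SLO, then the deferred SYN
   options, then normal options, then application data) can be sketched
   as a receiver-side split of a non-SYN segment. This is a minimal
   illustration under stated assumptions, not a reference
   implementation: the kind values 253 and 254 are stand-ins for
   TBD-IANA-KIND1 and TBD-IANA-KIND2, the function name is hypothetical,
   and Header Length is taken to count the whole TCP header in 4-byte
   words, as it does on the SYN.

```python
import struct

LO_KIND = 253    # assumption: stand-in for TBD-IANA-KIND1
SLO_KIND = 254   # assumption: stand-in for TBD-IANA-KIND2
FIXED_HEADER = 20  # bytes of required TCP header fields


def split_long_segment(segment: bytes):
    """Split a non-SYN segment that starts its options with LO and SLO.

    Returns (deferred_syn_options, normal_options, application_data).
    """
    if len(segment) < FIXED_HEADER + 8:
        raise ValueError("segment too short to hold LO and SLO options")
    # The Data Offset field is the high nibble of byte 12 of the header.
    if segment[12] >> 4 != 6:
        raise ValueError("DO must be 6 when an LO option is present")
    kind, length, header_words = struct.unpack_from("!BBH", segment, FIXED_HEADER)
    if kind != LO_KIND or length != 4:
        raise ValueError("LO must be the first option")
    header_len = header_words * 4
    if header_len < FIXED_HEADER + 8 or header_len > len(segment):
        raise ValueError("Header Length is inconsistent with the segment")
    kind, length, extra = struct.unpack_from("!BBH", segment, FIXED_HEADER + 4)
    if kind != SLO_KIND or length != 4:
        raise ValueError("SLO must immediately follow LO")
    start = FIXED_HEADER + 8
    if start + extra > header_len:
        raise ValueError("Additional Byte Count overruns the header")
    return (segment[start:start + extra],       # options deferred from the SYN
            segment[start + extra:header_len],  # options for this segment
            segment[header_len:])               # application data
```

   A receiver would process the first returned slice as if it had
   arrived on the SYN, and the second as ordinary per-segment options.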

                        1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +---------------+---------------+-------------------------------+
   |    Kind =     |  Length = 4   |     Additional Byte Count     |
   | TBD-IANA-KIND2|               |                               |
   +---------------+---------------+-------------------------------+

                   TCP SYN Long Options (SLO) Option

                                Figure 2

   Since TCP connection establishment is often concluded by a pure
   acknowledgement (carrying no data), only placing the SLO option and
   additional SYN options in such a single, unreliable segment would be
   risky. This is why a host MUST continue transmitting SLO options on
   all segments until its first byte of sent data is acknowledged.
   Acknowledgement of the first data byte implicitly covers the SLO and
   trailing options, as these must have been received end-to-end with
   the first data byte.

   If a host does not send any data bytes, but if by some means (perhaps
   through the received options) it is possible to derive either an
   explicit or implicit acknowledgement of even a single option
   transmitted in an SLO-carrying segment (for example via a Timestamp
   echo), then a host MAY choose to stop transmitting the SLO data.
   This special case overrides the previously specified MUST condition.

   A host SHOULD NOT continue sending SLO options after it has received
   acknowledgement of the first data byte, nor should a host process
   incoming SLO options other than on the first valid segment it
   receives that carries them.

5.  Middlebox Interactions

   The large number of middleboxes (firewalls, proxies, protocol
   scrubbers, etc.) currently present in the Internet poses some
   difficulty for deploying new TCP options. Some firewalls may block
   segments that carry unknown options. For instance, if the LO option
   is not understood by a firewall, incoming SYNs advertising LO support
   may be dropped, preventing connection establishment.
   This is similar to the ECN blackhole problem, where certain faulty
   hosts and routers throw away packets with ECN bits set [RFC3168].
   Some recent results indicate that for new TCP options, this may not
   be a significant threat, with only 0.2% of web requests failing when
   carrying an unknown option [transport-middlebox].

   More problematic are the implications of TCP connection-splitting
   middleboxes and protocol scrubbers that do not understand the LO
   option. Since such middleboxes may operate on a packet's contents
   (aggregating application data between multiple segments, rewriting
   sequence numbers, etc.), if the LO option is not understood, then
   there may be a mangling of the data passed to the application, as
   control data could end up intermingled with the application data.
   Such errors could be difficult to detect at the transport layer, and
   many applications might not perform their own integrity checks. An
   encouraging fact is that some of these devices reset connection
   attempts when they see TCP options that they do not understand.
   Hosts that implement the TCP options described in this document MAY
   retry connection attempts without LO options on the SYNs, if their
   first attempt with LO options fails.

6.  Comparison to Extended Segments

   Another proposal that solves the same problem as the LO and SLO
   options is that of TCP "extended segments" [ex-segs]. The extended
   segments technique was proposed following the initial introduction
   and discussion of the LO and SLO options within the IETF's TCP
   Maintenance and Minor Extensions working group. The two methods
   solve the same problem in rather different ways, and have several
   minor comparative advantages and disadvantages.

   The LO and SLO options are designed using the philosophy of using the
   TCP options space to compensate for insufficiency of the standard
   header.
   This is in keeping with the way that several currently-used options
   work. For example, the Window Scale option deals with the limited
   space in the advertised receive window field, and the Selective
   Acknowledgement option solves the lack of information in the
   cumulative acknowledgement field. The extended segments approach
   overloads the meaning of the standard Data Offset field, keeping its
   original meaning for values of 5 and greater, but redefining it for
   values less than 5. This is seen as acceptable since values less
   than 5 are currently impossible, illegal, and unusable. Extended
   segments avoid the need for new options by changing the way that the
   existing standard header is parsed.

   A key advantage of the extended segments approach is that it does not
   increase the TCP header size, whereas the LO option adds 4 bytes of
   space to TCP headers. The severity or triviality of this bloat in
   header overhead depends entirely upon the network properties and
   application traffic for particular use cases.

   It is also not altogether clear that extended segments will always
   save space in comparison to LO options. The granularity of option
   lengths that extended segments can support is limited to the number
   of unusable Data Offset values (five values: 0 through 4).
   Currently, the extended segments proposal defines 4 fixed lengths,
   and one "infinite" length that means the entire segment is options,
   with no application data. The fixed option lengths are 48, 64, 128,
   and 256 bytes. If the required per-data-segment options space for
   some extension or combination of extensions does not map to exactly
   these values, then padding bytes are required. If 129 bytes of
   options are required on a data segment, then a length of 256 must be
   used, and 127 bytes of useless padding are added.
   The LO option's Header Length is expressed in 4-byte words, so it
   avoids all wasteful padding aside from that already mandated to make
   the header a multiple of 4 bytes. It is possible that the overhead
   on a single extended segment could be more than that of several
   segments using the LO option.

   Some networkers have found the SLO mechanism that is required for
   processing of long initialization options to be somewhat "ugly".
   Extended segments avoid this by sending long initialization options
   on the initial SYN and SYN-ACK segments. If the other side does not
   support extended segments, this adds needless confusion and delay in
   connection setup. The protocol dance to negotiate use of extended
   segments is arguably much worse than using SLO. If an extended SYN
   is not understood, a non-reliably transmitted RST segment signals the
   initiating host to retry without extended segments. Such a retry
   mechanism is not commonly found in existing TCP implementations. If
   the LO option is not understood, a SYN-ACK is still immediately
   generated and the connection goes on uninterrupted, without any
   additional retry mechanisms. Furthermore, extended SYN-ACKs may be
   sent in response to non-extended SYNs. If not understood, this
   complicates the recovery procedure even more, and goes against the
   way that all current negotiable TCP extensions operate (only used on
   SYN-ACK if advertised on SYN).

   Over-zealous middleboxes are immensely troublesome for the deployment
   of most transport layer extensions. It is unclear whether LO and
   extended segments have any real difference in robustness in the
   presence of different types of middleboxes. Both types of segments
   may appear as invalid to some middleboxes, and both may be mangled if
   rewritten by a middlebox.

7.  Security Considerations

   The TCP options presented in this document open no additional
   vulnerabilities that we are aware of.

8.  IANA Considerations

   This document does not create any new registries or modify the rules
   for any existing registries managed by IANA.

   This document requires IANA to update values in its registry of TCP
   option numbers to assign two new entries, referred to herein as
   "TBD-IANA-KIND1" and "TBD-IANA-KIND2".

9.  Acknowledgements

   This document benefitted specifically from discussions with Josh
   Blanton and Shawn Ostermann. Some comments from Eddie Kohler
   motivated the discussion of middlebox interactions. Valuable
   feedback was obtained from Mark Allman and other participants in the
   TCP Maintenance and Minor Extensions (TCPM) Working Group.

10.  References

10.1.  Normative References

   [RFC0793]  Postel, J., "Transmission Control Protocol", STD 7,
              RFC 793, September 1981.

   [RFC0791]  Postel, J., "Internet Protocol", STD 5, RFC 791,
              September 1981.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

10.2.  Informative References

   [RFC1323]  Jacobson, V., Braden, B., and D. Borman, "TCP Extensions
              for High Performance", RFC 1323, May 1992.

   [RFC2018]  Mathis, M., Mahdavi, J., Floyd, S., and A. Romanow, "TCP
              Selective Acknowledgment Options", RFC 2018, October 1996.

   [more-sack]
              Srijith, K., Jacob, L., and A. Ananda, "Worst-case
              Performance Limitation of TCP SACK and a Feasible
              Solution", Proceedings of 8th IEEE International
              Conference on Communications Systems (ICCS),
              November 2002.

   [migrate]  Snoeren, A. and H. Balakrishnan, "An End-to-End Approach
              to Host Mobility", Proc. of the Sixth Annual ACM/IEEE
              International Conference on Mobile Computing and
              Networking, August 2000.

   [RFC3168]  Ramakrishnan, K., Floyd, S., and D. Black, "The Addition
              of Explicit Congestion Notification (ECN) to IP",
              RFC 3168, September 2001.

   [transport-middlebox]
              Medina, A., Allman, M., and S.
              Floyd, "Measuring
              Interactions Between Transport Protocols and Middleboxes",
              ACM SIGCOMM/USENIX Internet Measurement Conference,
              October 2004.

   [ex-segs]  Kohler, E., "Extended Option Space for TCP", Internet
              Draft (work in progress), September 2004.

Appendix A.  Changes

   To be removed by RFC Editor before publication

   Changes since 03

   1.  Change the option numbers specified to placeholders:
       "TBD-IANA-KIND1" and "TBD-IANA-KIND2".

   2.  Change the requirement that all segments include the LO option,
       if negotiated, to a SHOULD NOT unless the options require it.
       The reasoning behind the initial requirement was for
       implementation ease but, having implemented it myself, the
       ability to use the fast path processing for LO connections
       outweighs that.

   3.  Change the units of the LO option from bytes to words. This was
       ambiguous in the 03 draft and, since padding to four bytes was
       required anyway, it seemed best to remove one extra way that the
       option could be invalid.

Authors' Addresses

   Wesley M. Eddy
   NASA GRC/Verizon FNS
   21000 Brookpark Rd, MS 54-5
   Cleveland, OH 44135

   Phone: 216-433-6682
   Email: weddy@grc.nasa.gov


   Adam Langley
   Google Inc

   Email: agl@imperialviolet.org

Full Copyright Statement

   Copyright (C) The IETF Trust (2008).

   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
   THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
   OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Intellectual Property

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights. Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard. Please address the information to the IETF at
   ietf-ipr@ietf.org.