| < draft-ietf-tcpm-tcp-edo-01.txt | draft-ietf-tcpm-tcp-edo-02.txt > | |||
|---|---|---|---|---|
| TCPM WG J. Touch | TCPM WG J. Touch | |||
| Internet Draft USC/ISI | Internet Draft USC/ISI | |||
| Updates: 793 Wes Eddy | Updates: 793 Wes Eddy | |||
| Intended status: Standards Track MTI Systems | Intended status: Standards Track MTI Systems | |||
| Expires: April 2015 October 13, 2014 | Expires: October 2015 April 15, 2015 | |||
| TCP Extended Data Offset Option | TCP Extended Data Offset Option | |||
| draft-ietf-tcpm-tcp-edo-01.txt | draft-ietf-tcpm-tcp-edo-02.txt | |||
| Status of this Memo | Status of this Memo | |||
| This Internet-Draft is submitted in full conformance with the | This Internet-Draft is submitted in full conformance with the | |||
| provisions of BCP 78 and BCP 79. | provisions of BCP 78 and BCP 79. | |||
| Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
| Task Force (IETF), its areas, and its working groups. Note that | Task Force (IETF), its areas, and its working groups. Note that | |||
| other groups may also distribute working documents as Internet- | other groups may also distribute working documents as Internet- | |||
| Drafts. | Drafts. | |||
| skipping to change at page 1, line 32 | skipping to change at page 1, line 32 | |||
| months and may be updated, replaced, or obsoleted by other documents | months and may be updated, replaced, or obsoleted by other documents | |||
| at any time. It is inappropriate to use Internet-Drafts as | at any time. It is inappropriate to use Internet-Drafts as | |||
| reference material or to cite them other than as "work in progress." | reference material or to cite them other than as "work in progress." | |||
| The list of current Internet-Drafts can be accessed at | The list of current Internet-Drafts can be accessed at | |||
| http://www.ietf.org/ietf/1id-abstracts.txt | http://www.ietf.org/ietf/1id-abstracts.txt | |||
| The list of Internet-Draft Shadow Directories can be accessed at | The list of Internet-Draft Shadow Directories can be accessed at | |||
| http://www.ietf.org/shadow.html | http://www.ietf.org/shadow.html | |||
| This Internet-Draft will expire on April 13, 2015. | This Internet-Draft will expire on October 15, 2015. | |||
| Copyright Notice | Copyright Notice | |||
| Copyright (c) 2014 IETF Trust and the persons identified as the | Copyright (c) 2015 IETF Trust and the persons identified as the | |||
| document authors. All rights reserved. | document authors. All rights reserved. | |||
| This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
| Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
| (http://trustee.ietf.org/license-info) in effect on the date of | (http://trustee.ietf.org/license-info) in effect on the date of | |||
| publication of this document. Please review these documents | publication of this document. Please review these documents | |||
| carefully, as they describe your rights and restrictions with | carefully, as they describe your rights and restrictions with | |||
| respect to this document. Code Components extracted from this | respect to this document. Code Components extracted from this | |||
| document must include Simplified BSD License text as described in | document must include Simplified BSD License text as described in | |||
| Section 4.e of the Trust Legal Provisions and are provided without | Section 4.e of the Trust Legal Provisions and are provided without | |||
| warranty as described in the Simplified BSD License. | warranty as described in the Simplified BSD License. | |||
| Abstract | Abstract | |||
| TCP segments include a Data Offset field to indicate space for TCP | TCP segments include a Data Offset field to indicate space for TCP | |||
| options, but the size of the field can limit the space available for | options but the size of the field can limit the space available for | |||
| complex options that have evolved. This document updates RFC 793 | complex options such as SACK and Multipath TCP and can limit the | |||
| with an optional TCP extension to that space to support the use of | combination of such options supported in a single connection. This | |||
| multiple large options such as SACK with either TCP Multipath or TCP | document updates RFC 793 with an optional TCP extension to that | |||
| AO. It also explains why the initial SYN of a connection cannot be | space to support the use of multiple large options. It also explains | |||
| extended as a single segment. | why the initial SYN of a connection cannot be extended as a single | |||
| segment. | ||||
| Table of Contents | Table of Contents | |||
| 1. Introduction...................................................3 | 1. Introduction...................................................3 | |||
| 2. Conventions used in this document..............................3 | 2. Conventions used in this document..............................3 | |||
| 3. Requirements for Extending TCP's Data Offset...................3 | 3. Motivation.....................................................3 | |||
| 4. The TCP EDO Option.............................................4 | 4. Requirements for Extending TCP's Data Offset...................4 | |||
| 5. TCP EDO Interaction with TCP...................................6 | 5. The TCP EDO Option.............................................4 | |||
| 5.1. TCP User Interface........................................6 | 5.1. EDO Supported.............................................5 | |||
| 5.2. TCP States and Transitions................................7 | 5.2. EDO Extension.............................................5 | |||
| 5.3. TCP Segment Processing....................................7 | 5.3. The two EDO Extension variants............................8 | |||
| 5.4. Impact on TCP Header Size.................................7 | 6. TCP EDO Interaction with TCP...................................9 | |||
| 5.5. Connectionless Resets.....................................8 | 6.1. TCP User Interface........................................9 | |||
| 5.6. ICMP Handling.............................................9 | 6.2. TCP States and Transitions................................9 | |||
| 6. Interactions with Middleboxes..................................9 | 6.3. TCP Segment Processing....................................9 | |||
| 6.1. Middlebox Coexistence with EDO............................9 | 6.4. Impact on TCP Header Size................................10 | |||
| 6.2. Middlebox Interference with EDO..........................10 | 6.5. Connectionless Resets....................................11 | |||
| 7. Comparison to Previous Proposals..............................11 | 6.6. ICMP Handling............................................11 | |||
| 7.1. EDO Criteria.............................................11 | 7. Interactions with Middleboxes.................................11 | |||
| 7.2. Summary of Approaches....................................12 | 7.1. Middlebox Coexistence with EDO...........................12 | |||
| 7.3. Extended Segments........................................13 | 7.2. Middlebox Interference with EDO..........................12 | |||
| 7.4. TCPx2....................................................13 | 8. Comparison to Previous Proposals..............................14 | |||
| 7.5. LO/SLO...................................................13 | 8.1. EDO Criteria.............................................14 | |||
| 7.6. LOIC.....................................................14 | 8.2. Summary of Approaches....................................15 | |||
| 7.7. Problems with Extending the Initial SYN..................14 | 8.3. Extended Segments........................................16 | |||
| 8. Implementation Issues.........................................16 | 8.4. TCPx2....................................................16 | |||
| 9. Security Considerations.......................................16 | 8.5. LO/SLO...................................................16 | |||
| 10. IANA Considerations..........................................17 | 8.6. LOIC.....................................................17 | |||
| 11. References...................................................17 | 8.7. Problems with Extending the Initial SYN..................17 | |||
| 11.1. Normative References....................................17 | 9. Implementation Issues.........................................19 | |||
| 11.2. Informative References..................................17 | 10. Security Considerations......................................19 | |||
| 12. Acknowledgments..............................................19 | 11. IANA Considerations..........................................20 | |||
| 12. References...................................................20 | ||||
| 12.1. Normative References....................................20 | ||||
| 12.2. Informative References..................................20 | ||||
| 13. Acknowledgments..............................................22 | ||||
| 1. Introduction | 1. Introduction | |||
| TCP's Data Offset is a 4-bit field, which indicates the number of | TCP's Data Offset (DO) is a 4-bit field, which indicates the number | |||
| 32-bit words of the entire TCP header [RFC793]. This limits the | of 32-bit words of the entire TCP header [RFC793]. This limits the | |||
| current total header size to 60 bytes, of which the basic header | current total header size to 60 bytes, of which the basic header | |||
| occupies 20, leaving 40 bytes for options. These 40 bytes are | occupies 20, leaving 40 bytes for options. These 40 bytes are | |||
| increasingly becoming a limitation to the development of advanced | increasingly becoming a limitation to the development of advanced | |||
| capabilities, such as when SACK [RFC2018][RFC6675] is combined with | capabilities, such as when SACK [RFC2018][RFC6675] is combined with | |||
| either Multipath TCP [RFC6824], TCP-AO [RFC5925], or TCP Fast Open | either Multipath TCP [RFC6824], TCP-AO [RFC5925], or TCP Fast Open | |||
| [Ch14]. | [RFC7413]. | |||
| This document specifies the TCP Extended Data Offset (EDO) option, | This document specifies the TCP Extended Data Offset (EDO) option, | |||
| and is independent of (and thus compatible with) IPv4 and IPv6. EDO | and is independent of (and thus compatible with) IPv4 and IPv6. EDO | |||
| extends the space available for TCP options, except for the initial | extends the space available for TCP options, except for the initial | |||
| SYN and SYN/ACK. This document also explains why the option space of | SYN and SYN/ACK. This document also explains why the option space of | |||
| the initial SYN segments cannot be extended as individual segments | the initial SYN segments cannot be extended as individual segments | |||
| without severe impact on TCP's initial handshake and the SYN/ACK | without severe impact on TCP's initial handshake and the SYN/ACK | |||
| limitation that results from middlebox misbehavior. | limitation that results from potential middlebox misbehavior. | |||
| Multiple other TCP extensions are being considered in the TCPM | ||||
| working group in order to address the case of SYN and SYN/ACK | ||||
| segments [Bo14][Br14][To15]. Some of these other extensions can work | ||||
| in conjunction with EDO (e.g., [To15]). | ||||
| 2. Conventions used in this document | 2. Conventions used in this document | |||
| The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | |||
| "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | |||
| document are to be interpreted as described in RFC-2119 [RFC2119]. | document are to be interpreted as described in RFC-2119 [RFC2119]. | |||
| In this document, these words will appear with that interpretation | In this document, these words will appear with that interpretation | |||
| only when in ALL CAPS. Lower case uses of these words are not to be | only when in ALL CAPS. Lower case uses of these words are not to be | |||
| interpreted as carrying RFC-2119 significance. | interpreted as carrying RFC-2119 significance. | |||
| In this document, the characters ">>" preceding an indented line(s) | In this document, the characters ">>" preceding an indented line(s) | |||
| indicates a compliance requirement statement using the key words | indicates a compliance requirement statement using the key words | |||
| listed above. This convention aids reviewers in quickly identifying | listed above. This convention aids reviewers in quickly identifying | |||
| or finding the explicit compliance requirements of this RFC. | or finding the explicit compliance requirements of this RFC. | |||
| 3. Requirements for Extending TCP's Data Offset | 3. Motivation | |||
| TCP supports headers with a total length of up to 15 32-bit words, | ||||
| as indicated in the 4-bit Data Offset field [RFC793]. This accounts | ||||
| for a total of 60 bytes, of which the default TCP header fields | ||||
| occupy 20 bytes, leaving 40 bytes for options. | ||||
| TCP connections already use this option space for a variety of | ||||
| capabilities. These include Maximum Segment Size (MSS) [RFC793], | ||||
| Window Scale (WS) [RFC7323], Timestamp (TS) [RFC7323], Selective | ||||
| Acknowledgement (SACK) [RFC2018][RFC6675], TCP Authentication Option | ||||
| (TCP-AO) [RFC5925], Multipath TCP (MP-TCP) [RFC6824], and TCP User | ||||
| Timeout [RFC5482]. Some options occur only in a SYN or SYN/ACK (MSS, | ||||
| WS), and others vary in size when used in SYN vs. non-SYN segments. | ||||
| Each of these options consumes space; some options consume as much | ||||
| space as is available (SACK), and other desired combinations can | ||||
| easily exceed the currently available space. For example, it is not | ||||
| currently possible to use TCP-AO with both TS and MP-TCP in the same | ||||
| non-SYN segment, i.e., to combine accurate round-trip estimation, | ||||
| authentication, and multipath support in the same connection - even | ||||
| though these options can be negotiated during a SYN exchange (10 bytes for | ||||
| TS, 16 for TCP-AO, and 12 for MP-TCP). | ||||
| TCP EDO is intended to overcome this limitation for non-SYN | ||||
| segments, as well as to increase the space available for SACK | ||||
| blocks. The impact of EDO on existing options is discussed further | ||||
| in Section 6.4. Extending SYN segments is much more | ||||
| complicated, as discussed in Section 8.7. | ||||
| 4. Requirements for Extending TCP's Data Offset | ||||
| The primary goal of extending the TCP Data Offset field is to | The primary goal of extending the TCP Data Offset field is to | |||
| increase the space available for TCP options in all segments except | increase the space available for TCP options in all segments except | |||
| the initial SYN. | the initial SYN. | |||
| An important requirement of any such extension is that it not impact | An important requirement of any such extension is that it not impact | |||
| legacy endpoints. Endpoints seeking to use this new option should | legacy endpoints. Endpoints seeking to use this new option should | |||
| not incur additional delay or segment exchanges to connect to either | not incur additional delay or segment exchanges to connect to either | |||
| new endpoints supporting this option or legacy endpoints without | new endpoints supporting this option or legacy endpoints without | |||
| this option. We call this a "backward downgrade" capability. | this option. We call this a "backward downgrade" capability. | |||
| An additional consideration of this extension is avoiding user data | An additional consideration of this extension is avoiding user data | |||
| corruption in the presence of popular network devices, including | corruption in the presence of popular network devices, including | |||
| middleboxes. Middlebox misbehavior can also interfere with | middleboxes. Middlebox misbehavior can also interfere with | |||
| extension in the SYN/ACK. | extension in the SYN/ACK. | |||
| 4. The TCP EDO Option | 5. The TCP EDO Option | |||
| TCP EDO extends the option space for all segments except the initial | TCP EDO extends the option space for all segments except the initial | |||
| SYN (i.e., SYN set and ACK not set) and SYN/ACK response. The EDO | SYN (i.e., SYN set and ACK not set) and SYN/ACK response. EDO is | |||
| option is organized as indicated in Figure 1 and Figure 2. When | indicated by the TCP option codepoint of EDO-OPT and has two types: | |||
| desired, initial SYN segments (i.e., those whose ACK bit is not set) | EDO Supported and EDO Extension, as discussed in the following | |||
| use the EDO request option, which consists of the required Kind and | subsections. | |||
| Length fields only. Depending on capability and whether EDO is | ||||
| successfully negotiated, any other segments can use the EDO length | 5.1. EDO Supported | |||
| option, which adds a Header_Length field (in network-standard byte | ||||
| order), indicating the length of the entire TCP header in 32-bit | EDO capability is determined in both directions using a single | |||
| words. The codepoint value of the EDO Kind is EDO-OPT. | exchange of the EDO Supported option (Figure 1). When EDO is desired | |||
| on a given connection, the SYN and SYN/ACK segments include the EDO | ||||
| Supported option, which consists of the two required TCP option | ||||
| fields: Kind and Length. The EDO Supported option is used only in | ||||
| the SYN and SYN/ACK segments and only to confirm support for EDO in | ||||
| subsequent segments. | ||||
| +--------+--------+ | +--------+--------+ | |||
| | Kind | Length | | | Kind | Length | | |||
| +--------+--------+ | +--------+--------+ | |||
| Figure 1 TCP EDO request option | Figure 1 TCP EDO Supported option | |||
| +--------+--------+--------+--------+ | An endpoint seeking to enable EDO includes the EDO Supported option | |||
| | Kind | Length | Header_length | | in the initial SYN. If the receiver of that SYN agrees to use EDO, it | |||
| +--------+--------+--------+--------+ | responds with the EDO Supported option in the SYN/ACK. The EDO | |||
| Supported option does not extend the TCP option space. | ||||
| Figure 2 TCP EDO length option | >> Connections using EDO MUST negotiate its availability during the | |||
| SYN exchange of the initial three-way handshake. | ||||
| EDO support is determined in both directions using a single | >> An endpoint confirming and agreeing to EDO use MUST respond with | |||
| exchange. An endpoint seeking to enable EDO support includes the EDO | the EDO Supported option in its SYN/ACK. | |||
| request option in the initial SYN. If the receiver of that SYN agrees to | ||||
| support EDO, it responds with a null EDO length option in the | ||||
| SYN/ACK. A null EDO length option contains the same value as the DO | ||||
| field, i.e., it does not extend the TCP option space. | ||||
| >> Connections using EDO MUST negotiate its availability during the | The SYN/ACK uses only the EDO Supported option (and not the EDO | |||
| initial three-way handshake. | Extension option, below) because it may not yet be safe to extend | |||
| the option space in the reverse direction due to potential middlebox | ||||
| misbehavior (see Section 7.2). Extension of the SYN and SYN/ACK | ||||
| space is addressed as a separate option (see Section 8.7). | ||||
| >> An endpoint confirming EDO support MUST respond with a null EDO | 5.2. EDO Extension | |||
| length option in its SYN/ACK. | ||||
| The SYN/ACK uses the null EDO length option because it may not yet | When EDO is successfully negotiated, all other segments use the EDO | |||
| be safe to extend the option space in the reverse direction due to | Extension option, of which there are two variants (Figure 2 and | |||
| middlebox misbehavior (see Section 6.2). Extension of the SYN and | Figure 3). Both variants are considered equivalent and either | |||
| SYN/ACK space is addressed as a separate option (see Section 7.7). | variant can be used in any segment where the EDO Extension option is | |||
| required. Both variants add a Header_Length field (in network- | ||||
| standard byte order), indicating the length of the entire TCP header | ||||
| in 32-bit words. Figure 3 depicts the longer variant, which includes | ||||
| an additional Segment_Length field, which is identical to the TCP | ||||
| pseudoheader TCP Length field and used to detect when segments have | ||||
| been altered in ways that would interfere with EDO (discussed | ||||
| further in Section 5.3). | ||||
| >> The EDO length option MAY be used only if confirmed when the | +--------+--------+--------+--------+ | |||
| connection transitions to the ESTABLISHED state, e.g., a client is | | Kind | Length | Header_Length | | |||
| enabled after receiving the null EDO length option in the SYN/ACK | +--------+--------+--------+--------+ | |||
| and the server is enabled after seeing a null or non-null EDO length | ||||
| option in the final ACK of the three-way handshake. If either of | Figure 2 TCP EDO Extension option - simple variant | |||
| those segments lacks the EDO length option, the connection MUST NOT | ||||
| use EDO on any other segments. | +--------+--------+--------+--------+ | |||
| | Kind | Length | Header_Length | | ||||
| +--------+--------+--------+--------+ | ||||
| | Segment_Length | | ||||
| +--------+--------+ | ||||
| Figure 3 TCP EDO Extension option - with segment length verification | ||||
| >> Once enabled on a connection, all segments in both directions | >> Once enabled on a connection, all segments in both directions | |||
| MUST include the EDO length option. Segments not needing extension | MUST include the EDO Extension option. Segments not needing | |||
| MUST set the EDO length equal to the DO length. | extension MUST set the EDO Extension option Header Length field | |||
| equal to the Data Offset length. | ||||
| >> The EDO Extension option MAY be used only if confirmed when the | ||||
| connection transitions to the ESTABLISHED state, e.g., a client is | ||||
| enabled after receiving the EDO Supported option in the SYN/ACK and | ||||
| the server is enabled after seeing the EDO Extension option in the | ||||
| final ACK of the three-way handshake. If either of those segments | ||||
| lacks the appropriate EDO option, the connection MUST NOT use any | ||||
| EDO options on any other segments. | ||||
| Internet paths may vary after connection establishment, introducing | Internet paths may vary after connection establishment, introducing | |||
| misbehaving middleboxes (see Section 6.2). Using EDO on all segments | misbehaving middleboxes (see Section 7.2). Using EDO on all segments | |||
| in both directions allows this condition to be detected. | in both directions allows this condition to be detected. | |||
| >> The EDO request option MAY occur in an initial SYN as desired | >> The EDO Supported option MAY occur in an initial SYN as desired | |||
| (e.g., as expressed by the user/application), but MUST NOT be | (e.g., as expressed by the user/application) and in the SYN/ACK as | |||
| inserted in other segments. If the EDO request option is received in | confirmation, but MUST NOT be inserted in other segments. If the EDO | |||
| other segments, it MUST be silently ignored. | Supported option is received in other segments, it MUST be silently | |||
| ignored. | ||||
| >> If EDO has not been negotiated and agreed, the EDO length option | >> If EDO has not been negotiated and agreed, the EDO Extension | |||
| MUST be silently ignored on subsequent segments. The EDO length | option MUST be silently ignored on subsequent segments. The EDO | |||
| option MUST NOT be sent in an initial SYN segment, and MUST be | Extension option MUST NOT be sent in an initial SYN segment or | |||
| silently ignored and not acknowledged if so received. | SYN/ACK, and MUST be silently ignored and not acknowledged if so | |||
| received. | ||||
| >> If EDO has been negotiated, any subsequent segments arriving | >> If EDO has been negotiated, any subsequent segments arriving | |||
| without the EDO length option MUST be silently ignored. Such events | without the EDO Extension option MUST be silently ignored. Such | |||
| MAY be logged as warning errors and logging MUST be rate limited. | events MAY be logged as warning errors and logging MUST be rate | |||
| limited. | ||||
| When processing a segment, EDO needs to be visible within the area | When processing a segment, EDO needs to be visible within the area | |||
| indicated by the Data Offset field, so that processing can use the | indicated by the Data Offset field, so that processing can use the | |||
| EDO Header_length to override the Data Offset for that segment. | EDO Header_length to override the field for that segment. | |||
| >> The EDO length option MUST occur within the space indicated by | >> The EDO Extension or EDO Verified Extension options MUST occur | |||
| the TCP Data Offset. | within the space indicated by the TCP Data Offset. | |||
| >> The EDO length option indicates the total length of the header. | >> The EDO Extension or EDO Verified Extension options indicate the | |||
| The EDO Header_length field MUST NOT exceed that of the total | total length of the header. The EDO Header_length field MUST NOT | |||
| segment size (i.e., TCP Length). | exceed that of the total segment size (i.e., TCP Length). | |||
| >> The EDO length option MUST be at least as large as the TCP Data | >> The EDO Header Length MUST be at least as large as the TCP Data | |||
| Offset field of the segment in which they both appear. When the EDO | Offset field of the segment in which they both appear. When the EDO | |||
| length equals the DO length, the EDO option is present but it does | Header Length equals the Data Offset length, the EDO Extension | |||
| not extend the option space. When the EDO length is invalid, the TCP | option is present but it does not extend the option space. When the | |||
| segment MUST be silently dropped. | EDO Header Length is invalid, the TCP segment MUST be silently | |||
| dropped. | ||||
| >> The EDO request option SHOULD be aligned on a 16-bit boundary and | >> The EDO Supported option SHOULD be aligned on a 16-bit boundary | |||
| the EDO length option SHOULD be aligned on a 32-bit boundary, in | and the EDO Extension option SHOULD be aligned on a 32-bit boundary, | |||
| both cases for simpler processing. | in both cases for simpler processing. | |||
| For example, a segment with only EDO would have a Data Offset of 6, | For example, a segment with only EDO would have a Data Offset of 6 | |||
| where EDO would be the first option processed, at which point the | or 7 (depending on the EDO Extension variant used), where EDO would | |||
| EDO length option would override the Data Offset and processing | be the first option processed, at which point the EDO Extension | |||
| would continue until the end of the TCP header as indicated by the | option would override the Data Offset and processing would continue | |||
| EDO Header_length field. | until the end of the TCP header as indicated by the EDO | |||
| Header_length field. | ||||
| There are cases where it might be useful to process other options | There are cases where it might be useful to process other options | |||
| before EDO, notably those that determine whether the TCP header is | before EDO, notably those that determine whether the TCP header is | |||
| valid, such as authentication, encryption, or alternate checksums. | valid, such as authentication, encryption, or alternate checksums. | |||
| In those cases, the EDO length option is preferably the first option | In those cases, the EDO Extension option is preferably the first | |||
| after a validation option, and the payload after the Data Offset is | option after a validation option, and the payload after the Data | |||
| treated as user data for the purposes of validation. | Offset is treated as user data for the purposes of validation. | |||
| >> The EDO length option SHOULD occur as early as possible, either | >> The EDO Extension option SHOULD occur as early as possible, | |||
| first or just after any authentication or encryption, and SHOULD be | either first or just after any authentication or encryption, and | |||
| the last option covered by the Data Offset value. | SHOULD be the last option covered by the Data Offset value. | |||
| Other options are generally handled in the same manner as when the | Other options are generally handled in the same manner as when the | |||
| EDO option is not active, unless they interact with other options. | EDO option is not active, unless they interact with other options. | |||
| One such example is TCP-AO [RFC5925], which optionally ignores the | One such example is TCP-AO [RFC5925], which optionally ignores the | |||
| contents of TCP options, so it would need to be aware of EDO to | contents of TCP options, so it would need to be aware of EDO to | |||
| operate correctly when options are excluded from the HMAC | operate correctly when options are excluded from the HMAC | |||
| calculation. | calculation. | |||
| >> Options that depend on other options, such as TCP-AO [RFC5925] | >> Options that depend on other options, such as TCP-AO [RFC5925] | |||
| (which may include or exclude options in MAC calculations) MUST also | (which may include or exclude options in MAC calculations) MUST also | |||
| be augmented to interpret the EDO length option to operate | be augmented to interpret the EDO Extension option to operate | |||
| correctly. | correctly. | |||
| 5. TCP EDO Interaction with TCP | 5.3. The two EDO Extension variants | |||
| There are two variants of the EDO Extension option; one includes a | ||||
| copy of the TCP segment length, copied from the TCP pseudoheader | ||||
| [RFC793]. The Segment_Length field is added to the longer variant to | ||||
| detect when segments are merged by middleboxes or TCP offload | ||||
| processing but without consideration for the additional option space | ||||
| indicated by the EDO Header_Length field. Such effects are described | ||||
| in further detail in Section 7.2. | ||||
| >> An endpoint MAY use either variant of the EDO Extension option | ||||
| interchangeably. | ||||
| When the longer, 6-byte variant is used, the Segment_Length field is | ||||
| used to check whether modification of the segment was performed | ||||
| consistent with knowledge of the EDO option. The Segment_Length | ||||
| field will detect any modification of the length of the segment, | ||||
| such as might occur when segments are split or merged. | ||||
| >> When an endpoint creates a new segment using the 6-byte EDO | ||||
| Extension option, the Segment_Length field is initialized with a | ||||
| copy of the segment length from the TCP pseudoheader. | ||||
| >> When an endpoint receives a segment using the 6-byte EDO | ||||
| Extension option, it MUST validate the Segment_Length field with the | ||||
| length of the segment as indicated in the TCP pseudoheader. If the | ||||
| segment lengths do not match, the segment MUST be discarded and an | ||||
| error SHOULD be logged in a rate-limited manner. | ||||
| >> The 6-byte EDO Extension variant SHOULD be used where middlebox | ||||
| or TCP offload support could merge or split TCP segments without | ||||
| consideration for the EDO option. Because these conditions could | ||||
| occur at either endpoint or along the network path, the 6-byte | ||||
| variant SHOULD be preferred until sufficient evidence for safe use | ||||
| of the 4-byte variant is determined by the community. | ||||
| The Segment_Length field does not detect other modifications of the TCP user data; | ||||
| such modifications would need more complex detection mechanisms, | ||||
| such as checksums or hashes. When these are used, as with IPsec or | ||||
| TCP-AO, the 4-byte variant is sufficient. | ||||
| >> The 4-byte EDO Extension variant is sufficient when EDO is used | ||||
| in conjunction with other mechanisms that provide integrity | ||||
| protection, such as IPsec or TCP-AO. | ||||
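The receive-side check described in this subsection can be sketched as follows. This is a minimal illustration, not the draft's reference code. It assumes an option layout of Kind (1 byte), Length (1 byte), Header_Length (2 bytes) and, in the 6-byte variant only, Segment_Length (2 bytes), all in network byte order. The kind value `EDO_EXT_KIND` and the function names are hypothetical placeholders.

```python
import struct

# Placeholder only: an experimental option kind, not an assigned EDO codepoint.
EDO_EXT_KIND = 253


def parse_edo_extension(option: bytes):
    """Parse one EDO Extension option into (header_length, segment_length).

    Assumed layout: Kind (1), Length (1), Header_Length (2) and, in the
    6-byte variant only, Segment_Length (2), in network byte order.
    segment_length is None for the 4-byte variant.
    """
    if len(option) < 4 or option[0] != EDO_EXT_KIND:
        raise ValueError("not an EDO Extension option")
    length = option[1]
    if length not in (4, 6) or len(option) < length:
        raise ValueError("malformed EDO Extension option")
    (header_length,) = struct.unpack_from("!H", option, 2)
    segment_length = None
    if length == 6:
        (segment_length,) = struct.unpack_from("!H", option, 4)
    return header_length, segment_length


def accept_segment(option: bytes, pseudoheader_tcp_length: int) -> bool:
    """Return False when the segment has to be discarded.

    Only the 6-byte variant carries Segment_Length; the 4-byte variant has
    nothing to compare, so it always passes this particular check.
    """
    _, segment_length = parse_edo_extension(option)
    if segment_length is None:
        return True
    return segment_length == pseudoheader_tcp_length
```

A caller would drop the segment when `accept_segment` returns False and record the event through whatever rate-limited logging facility the stack provides; that logging is deliberately left out of the sketch.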
| 6. TCP EDO Interaction with TCP | ||||
| The following subsections describe how EDO interacts with the TCP | The following subsections describe how EDO interacts with the TCP | |||
| specification [RFC793]. | specification [RFC793]. | |||
| 5.1. TCP User Interface | 6.1. TCP User Interface | |||
| The TCP EDO option is enabled on a connection using a mechanism | The TCP EDO option is enabled on a connection using a mechanism | |||
| similar to any other per-connection option. In Unix systems, this is | similar to any other per-connection option. In Unix systems, this is | |||
| typically performed using the 'setsockopt' system call. | typically performed using the 'setsockopt' system call. | |||
| >> Implementations can also employ system-wide defaults; however, | >> Implementations can also employ system-wide defaults; however, | |||
| systems SHOULD NOT activate this extension by default to avoid | systems SHOULD NOT activate this extension by default to avoid | |||
| interfering with legacy applications. | interfering with legacy applications. | |||
| >> Due to the potential impacts of legacy middleboxes (discussed in | >> Due to the potential impacts of legacy middleboxes (discussed in | |||
| Section 6), a TCP implementation supporting EDO SHOULD log any | Section 7), a TCP implementation supporting EDO SHOULD log any | |||
| events within an EDO connection when options that are malformed or | events within an EDO connection when options that are malformed or | |||
| show other evidence of tampering arrive. An operating system MAY | show other evidence of tampering arrive. An operating system MAY | |||
| choose to cache the list of destination endpoints where this has | choose to cache the list of destination endpoints where this has | |||
| occurred and block use of EDO on future connections to those | occurred and block use of EDO on future connections to those | |||
| endpoints, but this cache MUST be accessible to users/applications | endpoints, but this cache MUST be accessible to users/applications | |||
| on the host. Note that such endpoint assumptions can vary in the | on the host. Note that such endpoint assumptions can vary in the | |||
| presence of load balancers where server implementations vary behind | presence of load balancers where server implementations vary behind | |||
| such balancers. | such balancers. | |||
| 5.2. TCP States and Transitions | 6.2. TCP States and Transitions | |||
| TCP EDO does not alter the existing TCP state or state transition | TCP EDO does not alter the existing TCP state or state transition | |||
| mechanisms. | mechanisms. | |||
| 5.3. TCP Segment Processing | 6.3. TCP Segment Processing | |||
| TCP EDO alters segment processing during the TCP option processing | TCP EDO alters segment processing during the TCP option processing | |||
| step. Once detected, the TCP EDO length option overrides the TCP | step. Once detected, the TCP EDO Extension option overrides the TCP | |||
| Data Offset field for all subsequent option processing. Option | Data Offset field for all subsequent option processing. Option | |||
| processing continues at the next option (if present) after the EDO | processing continues at the next option (if present) after the EDO | |||
| length option. | Extension option. | |||
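The override of the Data Offset field can be sketched as below. This is an illustration under stated assumptions: the unit of Header_Length is not given in this section, so it is passed in as the hypothetical parameter `header_length_unit`, and rejecting a Header_Length that falls short of the Data Offset is likewise an assumption of the sketch.

```python
def option_space_end(data_offset_words, edo_header_length=None, header_length_unit=4):
    """Return the byte offset at which TCP options end and user data begins.

    data_offset_words:  the TCP Data Offset field, in 32-bit words.
    edo_header_length:  Header_Length taken from an EDO Extension option
                        found inside the Data Offset area, or None if the
                        segment carries no such option.
    header_length_unit: bytes per Header_Length unit (an assumption here).
    """
    base_end = data_offset_words * 4
    if edo_header_length is None:
        # No EDO Extension option: the Data Offset field governs as usual.
        return base_end
    extended_end = edo_header_length * header_length_unit
    if extended_end < base_end:
        # Assumed sanity check: the extension cannot shrink the header.
        raise ValueError("EDO Header_Length is shorter than Data Offset")
    # Option parsing continues after the EDO option up to this offset.
    return extended_end
```

For example, `option_space_end(6, 20)` treats the first 80 bytes of the segment as header when Header_Length counts 32-bit words, while `option_space_end(6, 80, header_length_unit=1)` gives the same result if it counts bytes.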
| 5.4. Impact on TCP Header Size | 6.4. Impact on TCP Header Size | |||
| The TCP EDO request option increases SYN header length by a minimum | The TCP EDO Supported option increases SYN header length by a | |||
| of 2 bytes. Currently popular SYN options total 19 bytes, which | minimum of 2 bytes, but could increase it by more depending on 32- | |||
| leaves more than enough room for the EDO request: | bit word alignment. Currently popular SYN options total 19 bytes, | |||
| which leaves more than enough room for the EDO Supported option: | ||||
| o SACK permitted (2 bytes in SYN, optionally 2 + 8N bytes after) | o SACK permitted (2 bytes in SYN, optionally 2 + 8N bytes after) | |||
| [RFC2018][RFC6675] | [RFC2018][RFC6675] | |||
| o Timestamp (10 bytes) [RFC7323] | o Timestamp (10 bytes) [RFC7323] | |||
| o Window scale (3 bytes) [RFC7323] | o Window scale (3 bytes) [RFC7323] | |||
| o MSS option (4 bytes) [RFC793] | o MSS option (4 bytes) [RFC793] | |||
| Adding the EDO option would result in a total of 21 bytes of SYN | Adding the EDO Supported option would result in a total of 21 bytes | |||
| option space. Subsequent segments would use 19 bytes of option space | of SYN option space. | |||
| without any SACK blocks or allow up to 3 SACK blocks before needing | ||||
| to use EDO; with EDO, the number of SACK blocks or additional | Subsequent segments would use 10 bytes of option space without any | |||
| options would be substantially increased. There are also other | SACK blocks (TS only; WS and MSS are used only in SYN and SYN/ACK) | |||
| options that are emerging in the SYN, including TCP Fast Open, which | or allow up to 3 SACK blocks before needing to use EDO; with EDO, | |||
| uses another 6-18 (typically 10) bytes in the SYN/ACK of the first | the number of SACK blocks or additional options would be | |||
| connection and in the SYN of subsequent connections [Ch14]. | substantially increased. There are also other options that are | |||
| emerging in the SYN, including TCP Fast Open, which uses another 6- | ||||
| 18 (typically 10) bytes in the SYN/ACK of the first connection and | ||||
| in the SYN of subsequent connections [RFC7413]. | ||||
| TCP EDO can also be negotiated in SYNs with either of the following | TCP EDO can also be negotiated in SYNs with either of the following | |||
| large options: | large options: | |||
| o TCP-AO (authentication) (16 bytes) [RFC5925] | o TCP-AO (authentication) (16 bytes) [RFC5925] | |||
| o Multipath TCP (12 bytes in SYN and SYN/ACK, 20 after) [RFC6824] | o Multipath TCP (12 bytes in SYN and SYN/ACK, 20 after) [RFC6824] | |||
| Including TCP-AO increases the SYN option space use to 37 bytes; | Including TCP-AO with TS, WS, MSS, SACK increases the SYN option space | |||
| with Multipath TCP the use is 33 bytes. When Multipath TCP is | use to 35 bytes; with Multipath TCP the use is 31 bytes. When | |||
| enabled with the typical options, later segments might require 39 | Multipath TCP is enabled with the typical options, later segments | |||
| bytes without SACK, thus effectively disabling the SACK option | would require 30 bytes without SACK, thus limiting the SACK option | |||
| unless EDO is also supported on at least non-SYN segments. | to one block unless EDO is also supported on at least non-SYN | |||
| segments. | ||||
| The full combination of the above options (49 bytes including EDO) | The full combination of the above options (47 bytes for TS, WS, MSS, | |||
| does not fit in the existing SYN option space and (as noted) that | SACK, TCP-AO, and MPTCP) does not fit in the existing SYN option | |||
| space cannot be extended within a single SYN segment. There has been | space and (as noted) that space cannot be extended within a single | |||
| a proposal to change TS to a 2 byte "TS permitted" signal in the | SYN segment. There has been a proposal to change TS to a 2 byte "TS | |||
| initial SYN, provided it can be safely enabled during the connection | permitted" signal in the initial SYN, provided it can be safely | |||
| later or might be avoided completely [Ni14]. Even using "TS- | enabled during the connection later or might be avoided completely | |||
| permitted", the total space is still too large to support in the | [Ni14]. Even using "TS-permitted", the total space is still too | |||
| initial SYN without SYN option space extension [Br14][To14]. | large to support in the initial SYN without SYN option space | |||
| extension [Bo14][Br14][To15]. | ||||
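The byte counts quoted in this subsection can be rechecked with a short calculation. The sketch below uses only the option sizes listed above and the 40-byte legacy option space; it ignores padding to 32-bit word alignment, so real segments may use slightly more. All names are illustrative.

```python
BASE_OPTION_SPACE = 40  # bytes of options a legacy TCP header can carry

# Option sizes in bytes, as listed in the text above.
POPULAR_SYN = {"SACK permitted": 2, "Timestamp": 10, "Window scale": 3, "MSS": 4}
EDO_SUPPORTED = 2
TCP_AO = 16
MPTCP_SYN = 12
MPTCP_LATER = 20
TIMESTAMP = POPULAR_SYN["Timestamp"]


def max_sack_blocks(other_option_bytes):
    """SACK blocks that fit beside the other options (SACK uses 2 + 8N bytes)."""
    free = BASE_OPTION_SPACE - other_option_bytes
    return max(0, (free - 2) // 8)


popular = sum(POPULAR_SYN.values())
totals = {
    "popular SYN options": popular,
    "popular + EDO Supported": popular + EDO_SUPPORTED,
    "popular + TCP-AO": popular + TCP_AO,
    "popular + MPTCP": popular + MPTCP_SYN,
    "popular + TCP-AO + MPTCP": popular + TCP_AO + MPTCP_SYN,
}

for name, size in totals.items():
    verdict = "fits" if size <= BASE_OPTION_SPACE else "does not fit"
    print(f"{name}: {size} bytes, {verdict} in {BASE_OPTION_SPACE}")

print("SACK blocks with TS only:", max_sack_blocks(TIMESTAMP))
print("SACK blocks with TS + MPTCP:", max_sack_blocks(TIMESTAMP + MPTCP_LATER))
```

Only the full combination exceeds the legacy limit, which is why the text turns to SYN option space extension for that case.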
| The EDO option has negligible impact on other headers, because it | The EDO Extension option has negligible impact on other headers, | |||
| can either come first or just after security information, and in | because it can either come first or just after security information, | |||
| either case the additional 4 bytes are easily accommodated within | and in either case the additional 4 or 6 bytes are easily | |||
| the TCP Data Offset length. Once the EDO option is processed, the | accommodated within the TCP Data Offset length. Once the EDO option | |||
| entirety of the remainder of the TCP segment is available for any | is processed, the entirety of the remainder of the TCP segment is | |||
| remaining options. | available for any remaining options. | |||
| 5.5. Connectionless Resets | 6.5. Connectionless Resets | |||
| A RST may arrive during a currently active connection or may be | A RST may arrive during a currently active connection or may be | |||
| needed to clean up old state from an abandoned connection. The latter | needed to clean up old state from an abandoned connection. The latter | |||
| occurs when a new SYN is sent to an endpoint with matching existing | occurs when a new SYN is sent to an endpoint with matching existing | |||
| connection state, at which point that endpoint responds with a RST | connection state, at which point that endpoint responds with a RST | |||
| and both ends remove stale information. | and both ends remove stale information. | |||
| The EDO option is mandatory on all TCP segments once negotiated, | The EDO Extension option is mandatory on all TCP segments once | |||
| except the SYN and SYN/ACK of the three-way handshake to establish | negotiated, i.e., except in the SYN and SYN/ACK (which establish | |||
| its support and the RST. A RST may lack the context to know that EDO | support) and the RST. A RST may lack the context to know that EDO is | |||
| is active on a connection. | active on a connection. | |||
| >> The EDO length option MAY occur in a RST when the endpoint has | >> The EDO Extension option MAY occur in a RST when the endpoint has | |||
| connection state that has negotiated EDO. However, unless the RST is | connection state that has negotiated EDO. However, unless the RST is | |||
| generated by an incoming segment that includes an EDO option, the | generated by an incoming segment that includes an EDO Extension | |||
| transmitted RST MUST NOT include the EDO length option. | option, the transmitted RST MUST NOT include the EDO Extension | |||
| option. | ||||
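One conservative reading of the RST rule above is that both conditions must hold before the option is attached. The sketch below encodes that reading; the function and parameter names are hypothetical, and the conjunction is an interpretation rather than text taken from the draft.

```python
def rst_may_carry_edo(has_negotiated_edo_state: bool, trigger_has_edo: bool) -> bool:
    """Decide whether an outgoing RST is allowed to carry the EDO Extension option.

    has_negotiated_edo_state: the endpoint holds connection state in which
        EDO was successfully negotiated.
    trigger_has_edo: the RST answers an incoming segment that itself
        carried an EDO Extension option.
    """
    # The option is permitted (MAY) only when both conditions hold;
    # in every other case the RST is sent without it.
    return has_negotiated_edo_state and trigger_has_edo
```

A RST sent without any matching connection state therefore never carries the option under this reading, which matches the observation that such a RST lacks the context to know EDO was active.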
| 5.6. ICMP Handling | 6.6. ICMP Handling | |||
| ICMP responses are intended to include the IP and the port fields of | ICMP responses are intended to include the IP and the port fields of | |||
| TCP and UDP headers of typical TCP/IP and UDP/IP packets [RFC792]. | TCP and UDP headers of typical TCP/IP and UDP/IP packets [RFC792]. | |||
| This includes the first 8 data bytes of the original datagram, | This includes the first 8 data bytes of the original datagram, | |||
| intended to include the transport port numbers used for connection | intended to include the transport port numbers used for connection | |||
| demultiplexing. Later specifications encourage returning as much of | demultiplexing. Later specifications encourage returning as much of | |||
| the original payload as possible [RFC1812]. In either case, legacy | the original payload as possible [RFC1812]. In either case, legacy | |||
| options or new options in the EDO extension area might or might not | options or new options in the EDO extension area might or might not | |||
| be included, and so options are generally not assumed to be part of | be included, and so options are generally not assumed to be part of | |||
| ICMP processing anyway. | ICMP processing anyway. | |||
| 6. Interactions with Middleboxes | 7. Interactions with Middleboxes | |||
| Middleboxes are on-path devices that typically examine or modify | Middleboxes are on-path devices that typically examine or modify | |||
| packets in ways that Internet routers do not [RFC3234]. This | packets in ways that Internet routers do not [RFC3234]. This | |||
| includes parsing transport headers and/or rewriting transport | includes parsing transport headers and/or rewriting transport | |||
| segments in ways that may affect EDO. | segments in ways that may affect EDO. | |||
| There are several cases to consider: | There are several cases to consider: | |||
| - Typical NAT/NAPT devices, which modify only IP address and/or TCP | - Typical NAT/NAPT devices, which modify only IP address and/or TCP | |||
| port number fields (with associated TCP checksum updates) | port number fields (with associated TCP checksum updates) | |||
| - Middleboxes that try to reconstitute TCP data streams, such as | - Middleboxes that try to reconstitute TCP data streams, such as | |||
| for deep-packet inspection for virus scanning | for deep-packet inspection for virus scanning | |||
| - Middleboxes that modify known TCP header fields | - Middleboxes that modify known TCP header fields | |||
| - Middleboxes that rewrite TCP segments | - Middleboxes that rewrite TCP segments | |||
| 6.1. Middlebox Coexistence with EDO | 7.1. Middlebox Coexistence with EDO | |||
| Middleboxes can coexist with EDO when they either support EDO or | Middleboxes can coexist with EDO when they either support EDO or | |||
| when they ignore its impact on segment structure. | when they ignore its impact on segment structure. | |||
| NATs and NAPTs, which rewrite IP address and/or transport port | NATs and NAPTs, which rewrite IP address and/or transport port | |||
| fields, are the most common form of middlebox and are not affected | fields, are the most common form of middlebox and are not affected | |||
| by the EDO option. | by the EDO option. | |||
| Middleboxes that support EDO would be those that correctly parse the | Middleboxes that support EDO would be those that correctly parse the | |||
| EDO option. Such boxes can reconstitute the TCP data stream | EDO option. Such boxes can reconstitute the TCP data stream | |||
| correctly or can modify header fields and/or rewrite segments | correctly or can modify header fields and/or rewrite segments | |||
| without impact to EDO. | without impact to EDO. | |||
| Conventional TCP proxies terminate the TCP connection in both | Conventional TCP proxies terminate the TCP connection in both | |||
| directions and thus operate as TCP endpoints, such as when a client- | directions and thus operate as TCP endpoints, such as when a client- | |||
| middlebox and middlebox-server each have separate TCP connections. | middlebox and middlebox-server each have separate TCP connections. | |||
| They would support EDO by following the host requirements herein on | They would support EDO by following the host requirements herein on | |||
| both connections. The use of EDO on one connection is independent of | both connections. The use of EDO on one connection is independent of | |||
| its use on the other in this case. | its use on the other in this case. | |||
| 6.2. Middlebox Interference with EDO | 7.2. Middlebox Interference with EDO | |||
| Middleboxes that do not support EDO cannot coexist with its use when | Middleboxes that do not support EDO cannot coexist with its use when | |||
| they modify segment boundaries or do not forward unknown (e.g., the | they modify segment boundaries or do not forward unknown (e.g., the | |||
| EDO) options. | EDO) options. | |||
| So-called "transparent" rewriting proxies, which modify TCP segment | So-called "transparent" rewriting proxies, which modify TCP segment | |||
| boundaries, might mix option information with user data if they did | boundaries, might mix option information with user data if they did | |||
| not support EDO. Such devices might also interfere with other TCP | not support EDO. Such devices might also interfere with other TCP | |||
| options such as TCP-AO. There are three types of such boxes: | options such as TCP-AO. There are three types of such boxes: | |||
| skipping to change at page 10, line 44 ¶ | skipping to change at page 13, line 29 ¶ | |||
| disabled due to lack of SYN/ACK confirmation in either or both | disabled due to lack of SYN/ACK confirmation in either or both | |||
| directions. Problems would occur only when TCP segments with EDO are | directions. Problems would occur only when TCP segments with EDO are | |||
| combined or split while ignoring the EDO option. In the split case, | combined or split while ignoring the EDO option. In the split case, | |||
| the key concern is if the split happens within the option extension | the key concern is if the split happens within the option extension | |||
| space or if EDO is silently copied to both segments without copying | space or if EDO is silently copied to both segments without copying | |||
| the corresponding extended option space contents. However, the most | the corresponding extended option space contents. However, the most | |||
| comprehensive study of these cases indicates that "although | comprehensive study of these cases indicates that "although | |||
| middleboxes do split and coalesce segments, none did so while | middleboxes do split and coalesce segments, none did so while | |||
| passing unknown options" [Ho11]. | passing unknown options" [Ho11]. | |||
| Note that the second and third types of middlebox behaviors listed | ||||
| above may create syndromes similar to TCP transmit and receive | ||||
| hardware offload engines that incorrectly modify segments with | ||||
| unknown options. | ||||
| Middleboxes that silently remove options they do not implement have | Middleboxes that silently remove options they do not implement have | |||
| been observed [Ho11]. Such boxes interfere with the use of the EDO | been observed [Ho11]. Such boxes interfere with the use of the EDO | |||
| length option in the SYN and SYN/ACK segments because extended | Extension option in the SYN and SYN/ACK segments because extended | |||
| option space would be misinterpreted as user data if the EDO option | option space would be misinterpreted as user data if the EDO | |||
| were removed, and this cannot be avoided. This is one reason that | Extension option were removed, and this cannot be avoided. This is | |||
| SYN and SYN/ACK extension requires alternate mechanisms (see Section | one reason that SYN and SYN/ACK extension requires alternate | |||
| 7.7). Further, if such middleboxes become present on a path they | mechanisms (see Section 8.7). It is also the reason for the 6-byte | |||
| could cause similar misinterpretation on segments exchanged in the | EDO Extension variant (see Section 5.3), which can detect such | |||
| ESTABLISHED and subsequent states. As a result, this document | merging or splitting of segments. Further, if such middleboxes | |||
| requires that the EDO length option be avoided on the SYN/ACK and | become present on a path they could cause similar misinterpretation | |||
| that this option needs to be used on all segments once successfully | on segments exchanged in the ESTABLISHED and subsequent states. As a | |||
| negotiated. | result, this document requires that the EDO Extension option be | |||
| avoided on the SYN/ACK and that this option needs to be used on all | ||||
| segments once successfully negotiated and encourages use of the 6- | ||||
| byte EDO Extension variant. | ||||
| Deep-packet inspection systems that inspect TCP segment payloads or | Deep-packet inspection systems that inspect TCP segment payloads or | |||
| attempt to reconstitute the data stream would incorrectly include | attempt to reconstitute the data stream would incorrectly include | |||
| option data in the reconstituted user data stream, which might | option data in the reconstituted user data stream, which might | |||
| interfere with their operation. | interfere with their operation. | |||
| >> It can be important to detect misbehavior that could cause EDO | >> It can be important to detect misbehavior that could cause EDO | |||
| space to be misinterpreted as user data. In such cases, EDO SHOULD | space to be misinterpreted as user data. In such cases, EDO SHOULD | |||
| be used in conjunction with an integrity protection mechanism, such | be used in conjunction with an integrity protection mechanism. This | |||
| as IPsec, TCP-AO, etc. It is useful to note that such protection | includes the 6-byte EDO Extension variant or stronger mechanisms | |||
| helps find only non-compliant components. | such as IPsec, TCP-AO, etc. It is useful to note that such | |||
| protection only helps find non-compliant components and enables avoidance | ||||
| (e.g., disabling EDO), but integrity protection alone cannot correct | ||||
| the misinterpretation of EDO space as user data. | ||||
| This situation is similar to that of ECN and ICMP support in the | This situation is similar to that of ECN and ICMP support in the | |||
| Internet. In both cases, endpoints have evolved mechanisms for | Internet. In both cases, endpoints have evolved mechanisms for | |||
| detecting and robustly operating around "black holes". Very similar | detecting and robustly operating around "black holes". Very similar | |||
| algorithms are expected to be applicable for EDO. | algorithms are expected to be applicable for EDO. | |||
| 7. Comparison to Previous Proposals | 8. Comparison to Previous Proposals | |||
| EDO is the latest in a long line of attempts to increase TCP option | EDO is the latest in a long line of attempts to increase TCP option | |||
| space [Al06][Ed08][Ko04][Ra12][Yo11]. The following is a comparison | space [Al06][Ed08][Ko04][Ra12][Yo11]. The following is a comparison | |||
| of these approaches to EDO, based partly on a previous summary | of these approaches to EDO, based partly on a previous summary | |||
| [Ra12]. This comparison differs from that summary by using a | [Ra12]. This comparison differs from that summary by using a | |||
| different set of success criteria. | different set of success criteria. | |||
| 7.1. EDO Criteria | 8.1. EDO Criteria | |||
| Our criteria for a successful solution are as follows: | Our criteria for a successful solution are as follows: | |||
| o Zero-cost fallback to legacy endpoints. | o Zero-cost fallback to legacy endpoints. | |||
| o Minimal impact on middlebox compatibility. | o Minimal impact on middlebox compatibility. | |||
| o No additional side-effects. | o No additional side-effects. | |||
| Zero-cost fallback requires that upgraded hosts incur no penalty for | Zero-cost fallback requires that upgraded hosts incur no penalty for | |||
| skipping to change at page 12, line 22 ¶ | skipping to change at page 15, line 17 ¶ | |||
| have rejected the initial SYN because of its unknown options rather | have rejected the initial SYN because of its unknown options rather | |||
| than silently relaying it). | than silently relaying it). | |||
| EDO also attempts to avoid creating side-effects, such as might | EDO also attempts to avoid creating side-effects, such as might | |||
| happen if options were split across multiple TCP segments (which | happen if options were split across multiple TCP segments (which | |||
| could arrive out of order or be lost) or across different TCP | could arrive out of order or be lost) or across different TCP | |||
| connections (which could fail to share fate through firewalls or | connections (which could fail to share fate through firewalls or | |||
| NAT/NAPTs). | NAT/NAPTs). | |||
| These requirements are similar to those noted in [Ra12], but EDO | These requirements are similar to those noted in [Ra12], but EDO | |||
| groups cases of segment modification beyond address and port - such | groups cases of segment modification beyond address and port - such | |||
| as rewriting, segment drop, sequence number modification, and option | as rewriting, segment drop, sequence number modification, and option | |||
| stripping - as already in violation of existing TCP requirements | stripping - as already in violation of existing TCP requirements | |||
| regarding unknown options, and so we do not consider their impact on | regarding unknown options, and so we do not consider their impact on | |||
| this new option. | this new option. | |||
| 7.2. Summary of Approaches | 8.2. Summary of Approaches | |||
| There are three basic ways in which TCP option space extension has | There are three basic ways in which TCP option space extension has | |||
| been attempted: | been attempted: | |||
| 1. Use of a TCP option. | 1. Use of a TCP option. | |||
| 2. Redefinition of the existing TCP header fields. | 2. Redefinition of the existing TCP header fields. | |||
| 3. Use of option space in multiple TCP segments (split across | 3. Use of option space in multiple TCP segments (split across | |||
| multiple segments). | multiple segments). | |||
| skipping to change at page 13, line 16 ¶ | skipping to change at page 16, line 9 ¶ | |||
| unintended side-effects, such as increased delay to deal with path | unintended side-effects, such as increased delay to deal with path | |||
| latency or loss differences. | latency or loss differences. | |||
| The following discusses four of the most notable past attempts to | The following discusses four of the most notable past attempts to | |||
| extend the TCP option space: Extended Segments, TCPx2, LO/SLO, and | extend the TCP option space: Extended Segments, TCPx2, LO/SLO, and | |||
| LOIC. [Ra12] suggests a few other approaches, including use of TCP | LOIC. [Ra12] suggests a few other approaches, including use of TCP | |||
| option cookies, reuse/overload of other TCP fields (e.g., the URG | option cookies, reuse/overload of other TCP fields (e.g., the URG | |||
| pointer), or compressing TCP options. None of these is compatible | pointer), or compressing TCP options. None of these is compatible | |||
| with legacy endpoints or middleboxes. | with legacy endpoints or middleboxes. | |||
| 7.3. Extended Segments | 8.3. Extended Segments | |||
| TCP Extended Segments redefined the meaning of currently unused | TCP Extended Segments redefined the meaning of currently unused | |||
| values of the Data Offset (DO) field [Ko04]. TCP defines DO as | values of the Data Offset (DO) field [Ko04]. TCP defines DO as | |||
| indicating the length of the TCP header, including options, in 32- | indicating the length of the TCP header, including options, in 32- | |||
| bit words. The default TCP header with no options is 5 such words, | bit words. The default TCP header with no options is 5 such words, | |||
| so the minimum currently valid DO value is 5 (no option space) and the | so the minimum currently valid DO value is 5 (no option space) and the | |||
| maximum, 15, allows 40 bytes of option space. [Ko04] defines interpretations of values 0-4: | maximum, 15, allows 40 bytes of option space. [Ko04] defines interpretations of values 0-4: | |||
| DO=0 means 48 bytes of option space, DO=1 means 64, DO=2 means 128, | DO=0 means 48 bytes of option space, DO=1 means 64, DO=2 means 128, | |||
| DO=3 means 256, and DO=4 means unlimited (e.g., the entire payload | DO=3 means 256, and DO=4 means unlimited (e.g., the entire payload | |||
| is option space). This variant negotiates the use of this capability | is option space). This variant negotiates the use of this capability | |||
| by using one of these invalid DO values in the initial SYN. | by using one of these invalid DO values in the initial SYN. | |||
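The reinterpretation of the otherwise invalid DO values can be written as a lookup table. This sketch covers only the five redefined values quoted above; the names are illustrative and are not taken from [Ko04].

```python
# Option space, in bytes, signalled by each redefined Data Offset value.
# None stands for "unlimited": the whole segment payload is option space.
EXTENDED_DO_OPTION_BYTES = {0: 48, 1: 64, 2: 128, 3: 256, 4: None}


def extended_segments_option_space(data_offset, payload_len):
    """Return the option space in bytes for a redefined DO value (0-4)."""
    if data_offset not in EXTENDED_DO_OPTION_BYTES:
        raise ValueError("DO value is not one of the redefined values 0-4")
    size = EXTENDED_DO_OPTION_BYTES[data_offset]
    return payload_len if size is None else size
```

A legacy receiver has no such table and sees any DO below 5 as invalid, which is the source of the backward-compatibility problem discussed next.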
| Use of this variant is not backward-compatible with legacy TCP | Use of this variant is not backward-compatible with legacy TCP | |||
| implementations, whether at the desired endpoint or on middleboxes. | implementations, whether at the desired endpoint or on middleboxes. | |||
| The variant also defines a way to initiate the feature on the | The variant also defines a way to initiate the feature on the | |||
| passive side, e.g., using an invalid DO during the SYN/ACK when the | passive side, e.g., using an invalid DO during the SYN/ACK when the | |||
| initial SYN had a valid DO. This capability allows either side to | initial SYN had a valid DO. This capability allows either side to | |||
| initiate use of the feature but is also not backward compatible. | initiate use of the feature but is also not backward compatible. | |||
| 7.4. TCPx2 | 8.4. TCPx2 | |||
| TCPx2 redefines legacy TCP headers by basically doubling all TCP | TCPx2 redefines legacy TCP headers by basically doubling all TCP | |||
| header fields [Al06]. It relies on a new transport protocol number | header fields [Al06]. It relies on a new transport protocol number | |||
| to indicate its use, defeating backward compatibility with all | to indicate its use, defeating backward compatibility with all | |||
| existing TCP capabilities, including firewalls, NATs/NAPTs, and | existing TCP capabilities, including firewalls, NATs/NAPTs, and | |||
| legacy endpoints and applications. | legacy endpoints and applications. | |||
| 7.5. LO/SLO | 8.5. LO/SLO | |||
| The TCP Long Option (LO, [Ed08]) is very similar to EDO, except that | The TCP Long Option (LO, [Ed08]) is very similar to EDO, except that | |||
| presence of LO results in ignoring the existing DO field and that LO | presence of LO results in ignoring the existing Data Offset (DO) | |||
| is required to be the first option. EDO considers the need for other | field and that LO is required to be the first option. EDO considers | |||
| fields to be first and declares that the EDO is the last option as | the need for other fields to be first and declares that the EDO is | |||
| indicated by the DO field value. Like LO, EDO is required in every | the last option as indicated by the DO field value. Like LO, EDO is | |||
| segment once negotiated. | required in every segment once negotiated. | |||
| The TCP Long Option draft also specified the SYN Long Option (SLO) | The TCP Long Option draft also specified the SYN Long Option (SLO) | |||
| [Ed08]. If SLO is used in the initial SYN and successfully | [Ed08]. If SLO is used in the initial SYN and successfully | |||
| negotiated, it is used in each subsequent segment until all of the | negotiated, it is used in each subsequent segment until all of the | |||
| initial SYN options are transmitted. | initial SYN options are transmitted. | |||
| LO is backward compatible, as is SLO; in both cases, endpoints not | LO is backward compatible, as is SLO; in both cases, endpoints not | |||
| supporting the option would not respond with the option, and in both | supporting the option would not respond with the option, and in both | |||
| cases the initial SYN is not itself extended. | cases the initial SYN is not itself extended. | |||
| skipping to change at page 14, line 25 ¶ | skipping to change at page 17, line 20 ¶ | |||
| considered completely established until the first data byte is | considered completely established until the first data byte is | |||
| acknowledged. Legacy TCP can establish a connection even in the | acknowledged. Legacy TCP can establish a connection even in the | |||
| absence of data. SLO also changes the semantics of the SYN/ACK; for | absence of data. SLO also changes the semantics of the SYN/ACK; for | |||
| legacy TCP, this completes the active side connection establishment, | legacy TCP, this completes the active side connection establishment, | |||
| where in SLO an additional data ACK is required. A connection whose | where in SLO an additional data ACK is required. A connection whose | |||
| initial SYN options have been confirmed in the SYN/ACK might still | initial SYN options have been confirmed in the SYN/ACK might still | |||
| fail upon receipt of additional options sent in later SLO segments. | fail upon receipt of additional options sent in later SLO segments. | |||
| This case - of late negotiation failure - is not addressed in the | This case - of late negotiation failure - is not addressed in the | |||
| specification. | specification. | |||
| 7.6. LOIC | 8.6. LOIC | |||
| TCP Long Options by Invalid Checksum is a dual-stack approach that | TCP Long Options by Invalid Checksum is a dual-stack approach that | |||
| uses two initial SYNs to initiate all updated connections [Yo11]. | uses two initial SYNs to initiate all updated connections [Yo11]. | |||
| One SYN negotiates the new option; the payload of the other SYN | One SYN negotiates the new option; the payload of the other SYN | |||
| contains only the complete set of options. The negotiation SYN is compliant with | contains only the complete set of options. The negotiation SYN is compliant with | |||
| existing procedures, but the option SYN has a deliberately incorrect | existing procedures, but the option SYN has a deliberately incorrect | |||
| TCP checksum (decremented by 2). A legacy endpoint would discard the | TCP checksum (decremented by 2). A legacy endpoint would discard the | |||
| segment with the incorrect checksum and respond to the negotiation | segment with the incorrect checksum and respond to the negotiation | |||
| SYN without the LO option. | SYN without the LO option. | |||
| Use of the option SYN and its incorrect checksum both interfere with | Use of the option SYN and its incorrect checksum both interfere with | |||
| other legacy components. Segments with incorrect checksums will be | other legacy components. Segments with incorrect checksums will be | |||
| silently dropped by most middleboxes, including NATs/NAPTs. Use of | silently dropped by most middleboxes, including NATs/NAPTs. Use of | |||
| two SYNs creates side-effects that can delay connections to upgraded | two SYNs creates side-effects that can delay connections to upgraded | |||
| endpoints, notably when the option SYN is lost or the SYNs arrive | endpoints, notably when the option SYN is lost or the SYNs arrive | |||
| out of order. Finally, by not allowing other options in the | out of order. Finally, by not allowing other options in the | |||
| negotiation SYN, all connections to legacy endpoints either use no | negotiation SYN, all connections to legacy endpoints either use no | |||
| options or require a separate connection attempt (either concurrent | options or require a separate connection attempt (either concurrent | |||
| or subsequent). | or subsequent). | |||
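
The checksum trick described for LOIC can be illustrated with a minimal sketch. It assumes a toy 20-byte TCP header, ignores the pseudo-header that a real TCP checksum also covers, and treats "decremented by 2" as plain 16-bit modular subtraction; the helper names are hypothetical and are not taken from [Yo11].

```python
import struct

def internet_checksum(data: bytes) -> int:
    """One's-complement sum of 16-bit words, then complemented."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def with_checksum(header: bytes, delta: int = 0) -> bytes:
    """Fill the checksum field (bytes 16-17), optionally lowered by delta."""
    csum = (internet_checksum(header) - delta) & 0xFFFF
    return header[:16] + struct.pack("!H", csum) + header[18:]

def legacy_accepts(segment: bytes) -> bool:
    """A receiver accepts when the checksum over the whole segment is zero."""
    return internet_checksum(segment) == 0

# Toy SYN header: ports, seq, ack, DO=5 with SYN flag, window, checksum=0, urgent=0.
# Pseudo-header omitted for brevity (assumption of this sketch).
syn = struct.pack("!HHIIHHHH", 12345, 80, 1, 0, (5 << 12) | 0x002, 65535, 0, 0)

negotiation_syn = with_checksum(syn)        # valid checksum
option_syn = with_checksum(syn, delta=2)    # deliberately invalid checksum
```

In this sketch `legacy_accepts(negotiation_syn)` holds while `legacy_accepts(option_syn)` does not, which is the same property that causes middleboxes that validate checksums to drop the option SYN.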
| 7.7. Problems with Extending the Initial SYN | 8.7. Problems with Extending the Initial SYN | |||
| The key difficulty with most previous proposals is the desire to | The key difficulty with most previous proposals is the desire to | |||
| extend the option space in all TCP segments, including the initial | extend the option space in all TCP segments, including the initial | |||
| SYN, i.e., SYN with no ACK, typically the first segment of a | SYN, i.e., SYN with no ACK, typically the first segment of a | |||
| connection, as well as possibly the SYN/ACK. It has proven difficult | connection, as well as possibly the SYN/ACK. It has proven difficult | |||
| to extend space within the segment of the initial SYN in the absence | to extend space within the segment of the initial SYN in the absence | |||
| of prior negotiation while maintaining current TCP three-way | of prior negotiation while maintaining current TCP three-way | |||
| handshake properties, and it may be similarly challenging to extend | handshake properties, and it may be similarly challenging to extend | |||
| the SYN/ACK (depending on asymmetric middlebox assumptions). | the SYN/ACK (depending on asymmetric middlebox assumptions). | |||
| A new TCP option cannot extend the Data Offset of a single TCP | A new TCP option cannot extend the Data Offset of a single TCP | |||
| initial SYN segment, and cannot extend a SYN/ACK in a single segment | initial SYN segment, and cannot extend a SYN/ACK in a single segment | |||
| when considering misbehaving middleboxes. All TCP segments, | when considering misbehaving middleboxes. All TCP segments, | |||
| including the initial SYN and SYN/ACK, may include user data in the | including the initial SYN and SYN/ACK, may include user data in the | |||
| payload data [RFC793], and this can be useful for some proposed | payload data [RFC793], and this can be useful for some proposed | |||
| features such as TCP Fast Open [Ch14]. Legacy endpoints that ignore | features such as TCP Fast Open [RFC7413]. Legacy endpoints that | |||
| the new option would process the payload contents as user data and | ignore the new option would process the payload contents as user | |||
| send an ACK. Once ACK'd, this data cannot be removed from the user | data and send an ACK. Once ACK'd, this data cannot be removed from | |||
| stream. | the user stream. | |||
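
The failure mode above can be sketched as follows. The example assumes a legacy receiver that locates the payload purely from the 4-bit Data Offset field; the function name and the "extra option" bytes are hypothetical and serve only as an illustration.

```python
import struct

def legacy_payload(segment: bytes) -> bytes:
    """Return what a legacy receiver treats as user data.

    The Data Offset is the upper 4 bits of byte 12 and counts 32-bit words.
    """
    data_offset_words = segment[12] >> 4
    return segment[data_offset_words * 4:]

# 20-byte SYN header with DO=5 (no options inside the header proper).
header = struct.pack("!HHIIHHHH", 12345, 80, 1, 0, (5 << 12) | 0x002, 65535, 0, 0)

# Hypothetical extension bytes that the sender intends as options,
# placed beyond the Data Offset, followed by real user data.
extra_options = bytes([253, 4, 0x0E, 0xD0])
user_data = b"hello"

delivered = legacy_payload(header + extra_options + user_data)
```

Here `delivered` begins with the intended option bytes, so a legacy endpoint would acknowledge them as part of the user stream.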
| The Reserved TCP header bits cannot be redefined easily, even though | The Reserved TCP header bits cannot be redefined easily, even though | |||
| three of the six total bits have already been redefined (ECE/CWR | three of the six total bits have already been redefined (ECE/CWR | |||
| [RFC3168] and NS [RFC3540]). Legacy endpoints have been known to | [RFC3168] and NS [RFC3540]). Legacy endpoints have been known to | |||
| reflect received values in these fields; this was safely dealt with | reflect received values in these fields; this was safely dealt with | |||
| for ECN but would be difficult here [RFC3168]. | for ECN but would be difficult here [RFC3168]. | |||
| TCP initial SYN (SYN and not ACK) segments can use every other TCP | TCP initial SYN (SYN and not ACK) segments can use every other TCP | |||
| header field except the Acknowledgement number, which is not used | header field except the Acknowledgement number, which is not used | |||
| because the ACK field is not set. In all other segments, all fields | because the ACK field is not set. In all other segments, all fields | |||
| skipping to change at page 15, line 47 ¶ | skipping to change at page 18, line 43 ¶ | |||
| larger than the required Kind and Length components, so the | larger than the required Kind and Length components, so the | |||
| resulting efficiency is typically insufficient for additional | resulting efficiency is typically insufficient for additional | |||
| options. | options. | |||
| The option space of an initial SYN segment might be extended by | The option space of an initial SYN segment might be extended by | |||
| using multiple initial segments (e.g., multiple SYNs or a SYN and | using multiple initial segments (e.g., multiple SYNs or a SYN and | |||
| non-SYN) or based on the context of previous or parallel | non-SYN) or based on the context of previous or parallel | |||
| connections. This method may also be needed to extend space in the | connections. This method may also be needed to extend space in the | |||
| SYN/ACK in the presence of misbehaving middleboxes. Because of their | SYN/ACK in the presence of misbehaving middleboxes. Because of their | |||
| potential complexity, these approaches are addressed in separate | potential complexity, these approaches are addressed in separate | |||
| documents [Br14][To14]. | documents [Bo14][Br14][To15]. | |||
| Option space cannot be extended in outer layer headers, e.g., IPv4 | Option space cannot be extended in outer layer headers, e.g., IPv4 | |||
| or IPv6. These layers typically try to avoid extensions altogether, | or IPv6. These layers typically try to avoid extensions altogether, | |||
| to simplify forwarding processing at routers. Introducing new shim | to simplify forwarding processing at routers. Introducing new shim | |||
| layers to accommodate additional option space would interfere with | layers to accommodate additional option space would interfere with | |||
| deep-packet inspection mechanisms that are in widespread use. | deep-packet inspection mechanisms that are in widespread use. | |||
| As a result, EDO does not attempt to extend the space available for | As a result, EDO does not attempt to extend the space available for | |||
| options in TCP initial SYNs. It does extend that space in all other | options in TCP initial SYNs. It does extend that space in all other | |||
| segments (including SYN/ACK), which has always been trivially | segments (including SYN/ACK), which has always been trivially | |||
| possible once an option is defined. | possible once an option is defined. | |||
| 8. Implementation Issues | 9. Implementation Issues | |||
| TCP segment processing can involve accessing nonlinear data | TCP segment processing can involve accessing nonlinear data | |||
| structures, such as chains of buffers. Such chains are often | structures, such as chains of buffers. Such chains are often | |||
| designed so that the maximum default TCP header (60 bytes) fits in | designed so that the maximum default TCP header (60 bytes) fits in | |||
| the first buffer. Extending the TCP header across multiple buffers | the first buffer. Extending the TCP header across multiple buffers | |||
| may necessitate buffer traversal functions that span boundaries | may necessitate buffer traversal functions that span boundaries | |||
| between buffers. Such traversal can also have a significant | between buffers. Such traversal can also have a significant | |||
| performance impact, which is additional rationale for using TCP | performance impact, which is additional rationale for using TCP | |||
| option space - even extended option space - sparingly. | option space - even extended option space - sparingly. | |||
| skipping to change at page 16, line 39 ¶ | skipping to change at page 19, line 36 ¶ | |||
| When using the ExID variant for testing and experimentation, either | When using the ExID variant for testing and experimentation, either | |||
| TCP option codepoint (253, 254) is valid in sent or received | TCP option codepoint (253, 254) is valid in sent or received | |||
| segments. | segments. | |||
| Implementers need to be careful about the potential for offload | Implementers need to be careful about the potential for offload | |||
| support interfering with this option. The EDO data needs to be | support interfering with this option. The EDO data needs to be | |||
| passed to the protocol stack as part of the option space, not | passed to the protocol stack as part of the option space, not | |||
| integrated with the user segment, to allow the offload to | integrated with the user segment, to allow the offload to | |||
| independently determine user data segment boundaries and combine | independently determine user data segment boundaries and combine | |||
| them correctly with the extended option data. | them correctly with the extended option data. Some legacy hardware | |||
| receive offload engines may present challenges in this regard, and | ||||
| may be incompatible with EDO where they incorrectly process segments | ||||
| with unknown options. Such offload engines should be considered part | ||||
| of the protocol stack and updated accordingly. Issues with incorrect | ||||
| resegmentation by an offload engine can be detected in the same way | ||||
| as middlebox tampering. | ||||
| 9. Security Considerations | 10. Security Considerations | |||
| It is meaningless to have the Data Offset further exceed the | It is meaningless to have the Data Offset further exceed the | |||
| position of the EDO data offset option. | position of the EDO data offset option. | |||
| >> When the EDO length option is present, the EDO length option | >> When the EDO Extension option is present, the EDO Extension | |||
| SHOULD be the last non-null option covered by the TCP Data Offset, | option SHOULD be the last non-null option covered by the TCP Data | |||
| because it would be the last option affected by Data Offset. | Offset, because it would be the last option affected by Data Offset. | |||
| This also makes it more difficult to use the Data Offset field as a | This also makes it more difficult to use the Data Offset field as a | |||
| covert channel. | covert channel. | |||
| 10. IANA Considerations | 11. IANA Considerations | |||
| We request that, upon publication, this option be assigned a TCP | We request that, upon publication, this option be assigned a TCP | |||
| Option codepoint by IANA, and that the RFC Editor replace EDO-OPT | Option codepoint by IANA, and that the RFC Editor replace EDO-OPT | |||
| in this document with the codepoint value. | in this document with the codepoint value. | |||
| The TCP Experimental ID (ExID) with a 16-bit value of 0x0ED0 (in | The TCP Experimental ID (ExID) with a 16-bit value of 0x0ED0 (in | |||
| network standard byte order) has been assigned for use during | network standard byte order) has been assigned for use during | |||
| testing and preliminary experiments. | testing and preliminary experiments. | |||
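
As an illustration of how a receiver might recognize the experimental variant, the sketch below walks a TCP option list and looks for an option of Kind 253 or 254 whose first two value bytes equal the ExID 0x0ED0 in network byte order. Placing the ExID immediately after the Length byte follows the common convention for shared experimental options and is an assumption here; the function name is hypothetical.

```python
EDO_EXID = 0x0ED0
EXPERIMENTAL_KINDS = (253, 254)

def find_edo_exid(options: bytes):
    """Return the bytes after the ExID of the first matching option, else None."""
    i = 0
    while i < len(options):
        kind = options[i]
        if kind == 0:            # End of Option List
            break
        if kind == 1:            # No-Operation, single byte
            i += 1
            continue
        if i + 1 >= len(options):
            break                # truncated option
        length = options[i + 1]
        if length < 2 or i + length > len(options):
            break                # malformed length
        body = options[i + 2 : i + length]
        if (kind in EXPERIMENTAL_KINDS and len(body) >= 2
                and int.from_bytes(body[:2], "big") == EDO_EXID):
            return body[2:]
        i += length
    return None
```

Both experimental codepoints are checked because either is valid in sent or received segments when the ExID variant is used.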
| 11. References | 12. References | |||
| 11.1. Normative References | 12.1. Normative References | |||
| [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | |||
| Requirement Levels", BCP 14, RFC 2119, March 1997. | Requirement Levels", BCP 14, RFC 2119, March 1997. | |||
| [RFC793] Postel, J., "Transmission Control Protocol", STD 7, RFC | [RFC793] Postel, J., "Transmission Control Protocol", STD 7, RFC | |||
| 793, September 1981. | 793, September 1981. | |||
| 11.2. Informative References | 12.2. Informative References | |||
| [Al06] Allman, M., "TCPx2: Don't Fence Me In", draft-allman- | [Al06] Allman, M., "TCPx2: Don't Fence Me In", draft-allman- | |||
| tcpx2-hack-00 (work in progress), May 2006. | tcpx2-hack-00 (work in progress), May 2006. | |||
| [Br14] Briscoe, B., "Extended TCP Option Space in the Payload of | [Bo14] Borman, D., "TCP Four-Way Handshake", draft-borman- | |||
| an Alternative SYN", draft-briscoe-tcpm-syn-op-sis-02 | tcp4way-00 (work in progress), October 2014. | |||
| (work in progress), September 2014. | ||||
| [Ch14] Cheng, Y., Chu, J., and A. Jain, "TCP Fast Open", draft- | [Br14] Briscoe, B., "Inner Space for TCP Options", draft-briscoe- | |||
| ietf-tcpm-fastopen-10, September 2014. | tcpm-inner-space-01 (work in progress), October 2014. | |||
| [Ed08] Eddy, W. and A. Langley, "Extending the Space Available | [Ed08] Eddy, W. and A. Langley, "Extending the Space Available | |||
| for TCP Options", draft-eddy-tcp-loo-04 (work in | for TCP Options", draft-eddy-tcp-loo-04 (work in | |||
| progress), July 2008. | progress), July 2008. | |||
| [Ho11] Honda, M., Nishida, Y., Raiciu, C., Greenhalgh, A., | [Ho11] Honda, M., Nishida, Y., Raiciu, C., Greenhalgh, A., | |||
| Handley, M., and H. Tokuda, "Is it still possible to | Handley, M., and H. Tokuda, "Is it still possible to | |||
| extend TCP", Proc. ACM Sigcomm Internet Measurement | extend TCP", Proc. ACM Sigcomm Internet Measurement | |||
| Conference (IMC), 2011, pp. 181-194. | Conference (IMC), 2011, pp. 181-194. | |||
| skipping to change at page 18, line 31 ¶ | skipping to change at page 21, line 31 ¶ | |||
| of Explicit Congestion Notification (ECN) to IP", RFC | of Explicit Congestion Notification (ECN) to IP", RFC | |||
| 3168, September 2001. | 3168, September 2001. | |||
| [RFC3234] Carpenter, B. and S. Brim, "Middleboxes: Taxonomy and | [RFC3234] Carpenter, B. and S. Brim, "Middleboxes: Taxonomy and | |||
| Issues", RFC 3234, February 2002. | Issues", RFC 3234, February 2002. | |||
| [RFC3540] Spring, N., Wetherall, D., and D. Ely, "Robust Explicit | [RFC3540] Spring, N., Wetherall, D., and D. Ely, "Robust Explicit | |||
| Congestion Notification (ECN) Signaling with Nonces", RFC | Congestion Notification (ECN) Signaling with Nonces", RFC | |||
| 3540, June 2003. | 3540, June 2003. | |||
| [RFC5482] Eggert, L., and F. Gont, "TCP User Timeout Option", RFC | ||||
| 5482, March 2009. | ||||
| [RFC5925] Touch, J., Mankin, A., and R. Bonica, "The TCP | [RFC5925] Touch, J., Mankin, A., and R. Bonica, "The TCP | |||
| Authentication Option", RFC 5925, June 2010. | Authentication Option", RFC 5925, June 2010. | |||
| [RFC6675] Blanton, E., Allman, M., Wang, L., Jarvinen, I., Kojo, M., | [RFC6675] Blanton, E., Allman, M., Wang, L., Jarvinen, I., Kojo, M., | |||
| and Y. Nishida, "A Conservative Loss Recovery Algorithm | and Y. Nishida, "A Conservative Loss Recovery Algorithm | |||
| Based on Selective Acknowledgment (SACK) for TCP", RFC | Based on Selective Acknowledgment (SACK) for TCP", RFC | |||
| 6675, August 2012. | 6675, August 2012. | |||
| [RFC6824] Ford, A., Raiciu, C., Handley, M., and O. Bonaventure, | [RFC6824] Ford, A., Raiciu, C., Handley, M., and O. Bonaventure, | |||
| "TCP Extensions for Multipath Operation with Multiple | "TCP Extensions for Multipath Operation with Multiple | |||
| Addresses", RFC 6824, January 2013. | Addresses", RFC 6824, January 2013. | |||
| [RFC7323] Borman, D., Braden, B., Jacobson, V., and R. Scheffenegger | [RFC7323] Borman, D., Braden, B., Jacobson, V., and R. Scheffenegger | |||
| (Ed.), "TCP Extensions for High Performance", RFC 7323, | (Ed.), "TCP Extensions for High Performance", RFC 7323, | |||
| September 2014. | September 2014. | |||
| [To14] Touch, J., T. Faber, "TCP SYN Extended Option Space Using | [RFC7413] Cheng, Y., Chu, J., Radhakrishnan, S., and A. Jain, "TCP | |||
| Fast Open", RFC 7413, December 2014. | ||||
| [To15] Touch, J., T. Faber, "TCP SYN Extended Option Space Using | ||||
| an Out-of-Band Segment", draft-touch-tcpm-tcp-syn-ext-opt- | an Out-of-Band Segment", draft-touch-tcpm-tcp-syn-ext-opt- | |||
| 01 (work in progress), September 2014. | 02 (work in progress), April 2015. | |||
| [Yo11] Yourtchenko, A., "Introducing TCP Long Options by Invalid | [Yo11] Yourtchenko, A., "Introducing TCP Long Options by Invalid | |||
| Checksum", draft-yourtchenko-tcp-loic-00 (work in | Checksum", draft-yourtchenko-tcp-loic-00 (work in | |||
| progress), April 2011. | progress), April 2011. | |||
| 12. Acknowledgments | 13. Acknowledgments | |||
| The authors would like to thank the IETF TCPM WG for their feedback, | The authors would like to thank the IETF TCPM WG for their feedback, | |||
| in particular: Olivier Bonaventure, Bob Briscoe, Ted Faber, John | in particular: Olivier Bonaventure, Bob Briscoe, Ted Faber, John | |||
| Leslie, Pasi Sarolahti, Richard Scheffenegger, and Alexander | Leslie, Pasi Sarolahti, Richard Scheffenegger, and Alexander | |||
| Zimmermann. | Zimmermann. | |||
| This document was prepared using 2-Word-v2.0.template.dot. | This document was prepared using 2-Word-v2.0.template.dot. | |||
| Authors' Addresses | Authors' Addresses | |||
| End of changes. 87 change blocks. | ||||
| 218 lines changed or deleted | 364 lines changed or added | |||