| < draft-touch-tcpm-tcp-edo-02.txt | draft-touch-tcpm-tcp-edo-03.txt > | |||
|---|---|---|---|---|
| TCPM WG J. Touch | TCPM WG J. Touch | |||
| Internet Draft USC/ISI | Internet Draft USC/ISI | |||
| Updates: 793 Wes Eddy | Updates: 793 Wes Eddy | |||
| Intended status: Standards Track MTI Systems | Intended status: Standards Track MTI Systems | |||
| Expires: November 2014 May 5, 2014 | Expires: January 2015 July 2, 2014 | |||
| TCP Extended Data Offset Option | TCP Extended Data Offset Option | |||
| draft-touch-tcpm-tcp-edo-02.txt | draft-touch-tcpm-tcp-edo-03.txt | |||
| Status of this Memo | Status of this Memo | |||
| This Internet-Draft is submitted in full conformance with the | This Internet-Draft is submitted in full conformance with the | |||
| provisions of BCP 78 and BCP 79. | provisions of BCP 78 and BCP 79. | |||
| Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
| Task Force (IETF), its areas, and its working groups. Note that | Task Force (IETF), its areas, and its working groups. Note that | |||
| other groups may also distribute working documents as Internet- | other groups may also distribute working documents as Internet- | |||
| Drafts. | Drafts. | |||
| skipping to change at page 1, line 32 ¶ | skipping to change at page 1, line 32 ¶ | |||
| months and may be updated, replaced, or obsoleted by other documents | months and may be updated, replaced, or obsoleted by other documents | |||
| at any time. It is inappropriate to use Internet-Drafts as | at any time. It is inappropriate to use Internet-Drafts as | |||
| reference material or to cite them other than as "work in progress." | reference material or to cite them other than as "work in progress." | |||
| The list of current Internet-Drafts can be accessed at | The list of current Internet-Drafts can be accessed at | |||
| http://www.ietf.org/ietf/1id-abstracts.txt | http://www.ietf.org/ietf/1id-abstracts.txt | |||
| The list of Internet-Draft Shadow Directories can be accessed at | The list of Internet-Draft Shadow Directories can be accessed at | |||
| http://www.ietf.org/shadow.html | http://www.ietf.org/shadow.html | |||
| This Internet-Draft will expire on November 5, 2014. | This Internet-Draft will expire on January 2, 2015. | |||
| Copyright Notice | Copyright Notice | |||
| Copyright (c) 2014 IETF Trust and the persons identified as the | Copyright (c) 2014 IETF Trust and the persons identified as the | |||
| document authors. All rights reserved. | document authors. All rights reserved. | |||
| This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
| Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
| (http://trustee.ietf.org/license-info) in effect on the date of | (http://trustee.ietf.org/license-info) in effect on the date of | |||
| publication of this document. Please review these documents | publication of this document. Please review these documents | |||
| carefully, as they describe your rights and restrictions with | carefully, as they describe your rights and restrictions with | |||
| respect to this document. | respect to this document. | |||
| Abstract | Abstract | |||
| TCP segments include a Data Offset field to indicate space for TCP | TCP segments include a Data Offset field to indicate space for TCP | |||
| options, but the size of the field can limit the space available for | options, but the size of the field can limit the space available for | |||
| complex options that have evolved. This document updates RFC 793 | complex options that have evolved. This document updates RFC 793 | |||
| with an optional TCP extension to that space and explains why such | with an optional TCP extension to that space to support the use of | |||
| extensions cannot be used in the initial SYN of a connection. | multiple large options such as SACK with either TCP Multipath or TCP | |||
| AO. It also explains why the initial SYN of a connection cannot be | ||||
| extending as a single segment. | ||||
| Table of Contents | Table of Contents | |||
| 1. Introduction...................................................2 | 1. Introduction...................................................2 | |||
| 2. Conventions used in this document..............................3 | 2. Conventions used in this document..............................3 | |||
| 3. Requirements for Extending TCP's Data Offset...................3 | 3. Requirements for Extending TCP's Data Offset...................3 | |||
| 4. Initial SYN Data Offset Cannot be Extended.....................3 | 4. The TCP EDO Option.............................................3 | |||
| 5. The TCP EDO Option.............................................4 | 5. TCP EDO Interaction with TCP...................................6 | |||
| 6. TCP EDO Interaction with TCP...................................7 | 5.1. TCP User Interface........................................6 | |||
| 6.1. TCP User Interface........................................7 | 5.2. TCP States and Transitions................................6 | |||
| 6.2. TCP States and Transitions................................7 | 5.3. TCP Segment Processing....................................6 | |||
| 6.3. TCP Segment Processing....................................7 | 5.4. Impact on TCP Header Size.................................6 | |||
| 6.4. Impact on TCP Header Size.................................7 | 5.5. Connectionless Resets.....................................7 | |||
| 6.5. Connectionless Resets.....................................8 | 5.6. ICMP Handling.............................................8 | |||
| 6.6. ICMP Handling.............................................8 | 6. Interactions with Middleboxes..................................8 | |||
| 7. Interactions with Middleboxes..................................9 | 7. Comparison to Previous Proposals...............................9 | |||
| 8. Security Considerations.......................................10 | 7.1. EDO Criteria..............................................9 | |||
| 9. IANA Considerations...........................................10 | 7.2. Summary of Approaches....................................10 | |||
| 10. References...................................................10 | 7.3. Extended Segments........................................11 | |||
| 10.1. Normative References....................................10 | 7.4. TCPx2....................................................11 | |||
| 10.2. Informative References..................................11 | 7.5. LOO/SLO..................................................12 | |||
| 11. Acknowledgments..............................................12 | 7.6. LOIC.....................................................12 | |||
| 7.7. Problems with Extending the Initial SYN..................13 | ||||
| 8. Security Considerations.......................................14 | ||||
| 9. IANA Considerations...........................................14 | ||||
| 10. References...................................................14 | ||||
| 10.1. Normative References....................................14 | ||||
| 10.2. Informative References..................................15 | ||||
| 11. Acknowledgments..............................................16 | ||||
| 1. Introduction | 1. Introduction | |||
| TCP's Data Offset is a 4-bit field, which indicates the number of | TCP's Data Offset is a 4-bit field, which indicates the number of | |||
| 32-bit words of the entire TCP header [RFC793]. This limits the | 32-bit words of the entire TCP header [RFC793]. This limits the | |||
| current total header size to 60 bytes, of which the basic header | current total header size to 60 bytes, of which the basic header | |||
| occupies 20, leaving 40 bytes for options. These 40 bytes are | occupies 20, leaving 40 bytes for options. These 40 bytes are | |||
| increasingly becoming a limitation to the development of advanced | increasingly becoming a limitation to the development of advanced | |||
| capabilities, such as when SACK [RFC2018][RFC6675], Multipath | capabilities, such as when SACK [RFC2018][RFC6675] is combined with | |||
| [RFC6824], and TCP-AO [RFC5925] are combined. | either Multipath TCP [RFC6824] or TCP-AO [RFC5925]. | |||
| This document specifies the TCP Extended Data Offset (EDO) option, | This document specifies the TCP Extended Data Offset (EDO) option, | |||
| and is independent of (and thus compatible with) IPv4 and IPv6. EDO | and is independent of (and thus compatible with) IPv4 and IPv6. EDO | |||
| extends the space available for TCP options, except for the initial | extends the space available for TCP options, except for the initial | |||
| SYN segment (i.e., SYN and not ACK, the first segment of a new | SYN segment (i.e., SYN and not ACK, the first segment of a new | |||
| connection). This document also explains why the option space of an | connection). This document also explains why the option space of a | |||
| initial SYN cannot be extended without severe impact on TCP's | single initial SYN segment cannot be extended without severe impact | |||
| initial handshake. | on TCP's initial handshake. | |||
| 2. Conventions used in this document | 2. Conventions used in this document | |||
| The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | |||
| "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | |||
| document are to be interpreted as described in RFC-2119 [RFC2119]. | document are to be interpreted as described in RFC-2119 [RFC2119]. | |||
| In this document, these words will appear with that interpretation | In this document, these words will appear with that interpretation | |||
| only when in ALL CAPS. Lower case uses of these words are not to be | only when in ALL CAPS. Lower case uses of these words are not to be | |||
| interpreted as carrying RFC-2119 significance. | interpreted as carrying RFC-2119 significance. | |||
| In this document, the characters ">>" preceding an indented line(s) | In this document, the characters ">>" preceding an indented line(s) | |||
| indicates a compliance requirement statement using the key words | indicates a compliance requirement statement using the key words | |||
| listed above. This convention aids reviewers in quickly identifying | listed above. This convention aids reviewers in quickly identifying | |||
| or finding the explicit compliance requirements of this RFC. | or finding the explicit compliance requirements of this RFC. | |||
| 3. Requirements for Extending TCP's Data Offset | 3. Requirements for Extending TCP's Data Offset | |||
| The primary goal of extending the TCP Data Offset field is to | The primary goal of extending the TCP Data Offset field is to | |||
| increase the space available for TCP options. | increase the space available for TCP options in all segments except | |||
| the initial SYN. | ||||
| An important requirement of any such extension is that it not impact | An important requirement of any such extension is that it not impact | |||
| legacy endpoints. I.e., endpoints seeking to use this new option | legacy endpoints. Endpoints seeking to use this new option should | |||
| should not incur additional delay or segment exchanges to connect to | not incur additional delay or segment exchanges to connect to either | |||
| either new endpoints supporting this option or legacy endpoints | new endpoints supporting this option or legacy endpoints without | |||
| without this option. We call this a "backward downgrade" capability. | this option. We call this a "backward downgrade" capability. | |||
| 4. Initial SYN Data Offset Cannot be Extended | ||||
| There have been a number of previous attempts to extend the space | ||||
| available for TCP options [Al06][Ed08][Ko04][Ra12][Yo11]. The key | ||||
| difficulty with most previous proposals is the desire to extend the | ||||
| option space in all TCP segments, including the initial SYN, i.e., | ||||
| SYN with no ACK, typically the first segment of a connection. There | ||||
| are three basic ways in which this has been attempted: | ||||
| 1. Use of a TCP option. | ||||
| 2. Redefinition of the existing TCP header fields. | ||||
| 3. Use of option space in multiple TCP segments (split-space). | ||||
| A new TCP option cannot extend the Data Offset of a TCP initial SYN. | ||||
| All TCP segments, including the initial SYN, may include user data | ||||
| in the payload data [RFC793], and this can be useful for some | ||||
| proposed features such as TCP Fast Open [Ch14]. Legacy endpoints | ||||
| that ignore the new option would process the payload contents as | ||||
| user data and send an ACK. Once ACK'd, this data cannot be removed | ||||
| from the user stream. | ||||
| The six Reserved TCP header bits cannot be redefined easily, because | ||||
| the original specification did not require their contents to be | ||||
| ignored, even though three of those bits have already been redefined | ||||
| (ECE/CWR [RFC3168] and NS [RFC3540]). Legacy endpoints are required | ||||
| to drop TCP segments where those bits are not zero, making it | ||||
| difficult to assume backward downgrade. | ||||
| TCP initial SYN (SYN and not ACK) segments can use every other TCP | ||||
| header field except the Acknowledgement number, which is not used | ||||
| because the ACK field is not set. In all other segments, all fields | ||||
| except the three remaining Reserved header bits. The total amount of | ||||
| space, in either case, is insufficient to be useful in extending the | ||||
| option space. | ||||
| The representation of TCP options can be optimized to minimize the | ||||
| space needed. In such cases, multiple Kind and Length fields are | ||||
| combined, so that a new Kind would indicate a specific combination | ||||
| of options, whose order is fixed and whose length is indicated by | ||||
| one Length field. Most TCP options use fields whose size is much | ||||
| larger than the required Kind and Length components, so the | ||||
| resulting efficiency is typically insufficient for additional | ||||
| options. | ||||
| Option space cannot be extended across multiple SYN segments. A | ||||
| legacy endpoint would continue the connection with incomplete option | ||||
| information. | ||||
| Option space cannot be extended in outer layer headers, e.g., IPv4 | ||||
| or IPv6. These layers typically try to avoid extensions altogether, | ||||
| to simplify forwarding processing at routers. Introducing new shim | ||||
| layers to accommodate additional option space would interfere with | ||||
| deep-packet inspection mechanisms that are in widespread use. | ||||
| As a result, EDO does not attempt to extend the space available for | ||||
| options in TCP initial SYNs. It does extend that space in all other | ||||
| segments (including SYN-ACK), which has always been trivially | ||||
| possible once an option is defined. | ||||
| 5. The TCP EDO Option | 4. The TCP EDO Option | |||
| The TCP EDO option is organized as indicated in Figure 1 and Figure | TCP EDO extends the option space for all segments except the initial | |||
| 2. For initial SYN segments (i.e., those whose ACK bit is not set), | SYN (i.e., SYN set and ACK not set). The EDO option is organized as | |||
| the EDO request option consists of the required Kind and Length | indicated in Figure 1 and Figure 2. For initial SYN segments (i.e., | |||
| fields only. All other segments optionally use the EDO length | those whose ACK bit is not set), the EDO request option consists of | |||
| option, which adds a Header_Length field (in network-standard byte | the required Kind and Length fields only. All other segments | |||
| rder), indicating the length of the entire TCP header in bytes. The | optionally use the EDO length option, which adds a Header_Length | |||
| codepoint value of the EDO Kind is EDO-OPT. | field (in network-standard byte rder), indicating the length of the | |||
| entire TCP header in bytes. The codepoint value of the EDO Kind is | ||||
| EDO-OPT. | ||||
| +--------+--------+ | +--------+--------+ | |||
| | Kind | Length | | | Kind | Length | | |||
| +--------+--------+ | +--------+--------+ | |||
| Figure 1 TCP EDO request option | Figure 1 TCP EDO request option | |||
| +--------+--------+--------+--------+ | +--------+--------+--------+--------+ | |||
| | Kind | Length | Header_length | | | Kind | Length | Header_length | | |||
| +--------+--------+--------+--------+ | +--------+--------+--------+--------+ | |||
| skipping to change at page 7, line 10 ¶ | skipping to change at page 6, line 7 ¶ | |||
| One such example is TCP-AO [RFC5925], which optionally ignores the | One such example is TCP-AO [RFC5925], which optionally ignores the | |||
| contents of TCP options, so it would need to be aware of EDO to | contents of TCP options, so it would need to be aware of EDO to | |||
| operate correctly when options are excluded from the HMAC | operate correctly when options are excluded from the HMAC | |||
| calculation. | calculation. | |||
| >> Options that depend on other options, such as TCP-AO [RFC5925] | >> Options that depend on other options, such as TCP-AO [RFC5925] | |||
| (which may include or exclude options in MAC calculations) MUST also | (which may include or exclude options in MAC calculations) MUST also | |||
| be augmented to interpret the EDO length option to operate | be augmented to interpret the EDO length option to operate | |||
| correctly. | correctly. | |||
| 6. TCP EDO Interaction with TCP | 5. TCP EDO Interaction with TCP | |||
| The following subsections describe how EDO interacts with the TCP | The following subsections describe how EDO interacts with the TCP | |||
| specification [RFC793]. | specification [RFC793]. | |||
| 6.1. TCP User Interface | 5.1. TCP User Interface | |||
| The TCP EDO option is enabled on a connection using a mechanism | The TCP EDO option is enabled on a connection using a mechanism | |||
| similar to any other per-connection option. In Unix systems, this is | similar to any other per-connection option. In Unix systems, this is | |||
| typically performed using the 'setsockopt' function. | typically performed using the 'setsockopt' function. | |||
| >> Implementations can also employ system-wide defaults, however | >> Implementations can also employ system-wide defaults, however | |||
| systems SHOULD NOT use this extension by default to avoid | systems SHOULD NOT use this extension by default to avoid | |||
| interfering with legacy applications. | interfering with legacy applications. | |||
| 6.2. TCP States and Transitions | 5.2. TCP States and Transitions | |||
| TCP EDO does not alter the existing TCP state or state transition | TCP EDO does not alter the existing TCP state or state transition | |||
| mechanisms. | mechanisms. | |||
| 6.3. TCP Segment Processing | 5.3. TCP Segment Processing | |||
| TCP EDO alters segment processing during the TCP option processing | TCP EDO alters segment processing during the TCP option processing | |||
| step. Once detected, the TCP EDO length option overrides the TCP | step. Once detected, the TCP EDO length option overrides the TCP | |||
| Data Offset field for all subsequent option processing. Option | Data Offset field for all subsequent option processing. Option | |||
| processing continues at the next option after the EDO length option. | processing continues at the next option after the EDO length option. | |||
| Implementers need to be careful about the potential for offload | Implementers need to be careful about the potential for offload | |||
| support interfering with this option. The EDO data needs to be | support interfering with this option. The EDO data needs to be | |||
| passed to the protocol stack as part of the option space, not | passed to the protocol stack as part of the option space, not | |||
| integrated with the user segment, to allow the offload to | integrated with the user segment, to allow the offload to | |||
| independently determine user data segment boundaries and combine | independently determine user data segment boundaries and combine | |||
| them correctly with the extended option data. | them correctly with the extended option data. | |||
| 6.4. Impact on TCP Header Size | 5.4. Impact on TCP Header Size | |||
| The TCP EDO request option increases SYN header length by a minimum | The TCP EDO request option increases SYN header length by a minimum | |||
| of 2 bytes. Currently popular SYN options total 11 bytes, which | of 2 bytes. Currently popular SYN options total 19 bytes, which | |||
| leaves more than enough room for the EDO request: | leaves more than enough room for the EDO request: | |||
| o SACK permitted (2 bytes) [RFC2018][RFC6675] | o SACK permitted (2 bytes in SYN, optionally 2 + 8N bytes after) | |||
| o Timestamp (10 bytes) [RFC1323] | [RFC2018][RFC6675] | |||
| o Timestamp (10 bytes) [RFC1323] | ||||
| o Window scale (3 bytes) [RFC1323] | o Window scale (3 bytes) [RFC1323] | |||
| o MSS option (4 bytes) [RFC793] | o MSS option (4 bytes) [RFC793] | |||
| TCP EDO can also be negotiated in SYNs with the following options: | Adding the EDO option would result in a total of 21 bytes of SYN | |||
| option space. Subsequent segments would use 19 bytes of option space | ||||
| without any SACK blocks or allow up to 4 SACK blocks before needing | ||||
| to use EDO; with EDO, the number of SACK blocks or additional | ||||
| options would be substantially increased. | ||||
| TCP EDO can also be negotiated in SYNs with either of the following | ||||
| options: | ||||
| o TCP-AO (authentication) (16 bytes) [RFC5925] | o TCP-AO (authentication) (16 bytes) [RFC5925] | |||
| o Multipath TCP (20 bytes) [RFC6824] | o Multipath TCP (12 bytes in SYN and SYN-ACK, 20 after) [RFC6824] | |||
| Some combinations of the above options may not fit in the existing | Including TCP-AO increases the SYN option space use to 37 bytes; | |||
| SYN option space, and (as noted) that space cannot be extended. | with Multipath TCP the use is 33 bytes. When Multipath TCP is | |||
| enabled with the typical options, later segments might require 39 | ||||
| bytes without SACK, thus effectively disabling the SACK option | ||||
| unless EDO is also supported on at least non-SYN segments. | ||||
| The full combination of the above options (49 bytes including EDO) | ||||
| does not fit in the existing SYN option space and (as noted) that | ||||
| space cannot be extended within a single SYN segment. There has been | ||||
| a proposal to change TS to a 2 byte "TS permitted" signal in the | ||||
| initial SYN, provided it can be safely enabled during the connection | ||||
| later or might be avoided completely [Ni14]. Even using "TS- | ||||
| permitted", the total space is still too large to support in the | ||||
| initial SYN without SYN option space extension [To14]. | ||||
| The EDO option has negligible impact on other headers, because it | The EDO option has negligible impact on other headers, because it | |||
| can either come first or just after security information, and in | can either come first or just after security information, and in | |||
| either case the additional 4 bytes are easily accommodated within | either case the additional 4 bytes are easily accommodated within | |||
| the TCP Data Offset length. Once the EDO option is processed, the | the TCP Data Offset length. Once the EDO option is processed, the | |||
| entirety of the remainder of the TCP segment is available for any | entirety of the remainder of the TCP segment is available for any | |||
| remaining options. | remaining options. | |||
| 6.5. Connectionless Resets | 5.5. Connectionless Resets | |||
| A RST may arrive during a currently active connection or may be | A RST may arrive during a currently active connection or may be | |||
| needed to cleanup old state from an abandoned connection. The latter | needed to cleanup old state from an abandoned connection. The latter | |||
| occurs when a new SYN is sent to an endpoint with matching existing | occurs when a new SYN is sent to an endpoint with matching existing | |||
| connection state, at which point that endpoint responds with a RST | connection state, at which point that endpoint responds with a RST | |||
| and both ends remove stale information. | and both ends remove stale information. | |||
| The EDO option is not mandatory in any TCP segment, except the SYN | The EDO option is not mandatory in any TCP segment, except the SYN | |||
| and SYN-ACK of the three-way handshake to establish its support. | and SYN-ACK of the three-way handshake to establish its support. | |||
| >> The EDO length option MAY occur in a RST when the endpoint has | >> The EDO length option MAY occur in a RST when the endpoint has | |||
| connection state that has negotiated EDO. However, unless the RST is | connection state that has negotiated EDO. However, unless the RST is | |||
| generated by an incoming segment that includes an EDO option, the | generated by an incoming segment that includes an EDO option, the | |||
| RST MUST NOT include the EDO length option. | RST MUST NOT include the EDO length option. | |||
| 6.6. ICMP Handling | 5.6. ICMP Handling | |||
| ICMP responses are intended to include the IP and the port fields of | ICMP responses are intended to include the IP and the port fields of | |||
| TCP and UDP headers of typical TCP/IP and UDP/IP packets [RFC792]. | TCP and UDP headers of typical TCP/IP and UDP/IP packets [RFC792]. | |||
| This includes the first 8 data bytes of the original datagram, | This includes the first 8 data bytes of the original datagram, | |||
| intended to include the transport port numbers used for connection | intended to include the transport port numbers used for connection | |||
| demultiplexing. Later specifications encourage returning as much of | demultiplexing. Later specifications encourage returning as much of | |||
| the original payload as possible [RFC1812]. In either case, legacy | the original payload as possible [RFC1812]. In either case, legacy | |||
| options or new options in the EDO extension area might or might not | options or new options in the EDO extension area might or might not | |||
| be included, and so options are generally not assumed to be part of | be included, and so options are generally not assumed to be part of | |||
| ICMP processing anyway. | ICMP processing anyway. | |||
| 7. Interactions with Middleboxes | 6. Interactions with Middleboxes | |||
| Any new TCP option may be impacted by the presence of any on-path | Any new TCP option may be impacted by the presence of any on-path | |||
| device that examines or modifies transport headers [RFC3234]. Boxes | device that examines or modifies transport headers [RFC3234]. Boxes | |||
| that parse or modify TCP options need to follow the same | that parse or modify TCP options need to follow the same | |||
| requirements of TCP endpoints in supporting EDO, or they could | requirements of TCP endpoints in supporting EDO, or they could | |||
| interfere with connections. The primary concern is so-called | interfere with connections. The primary concern is so-called | |||
| "transparent" rewriting proxies, which modify TCP segment boundaries | "transparent" rewriting proxies, which modify TCP segment boundaries | |||
| and thus would mix option information with user data if they do not | and thus would mix option information with user data if they do not | |||
| support EDO. Such devices interfere with many other TCP options, and | support EDO. Such devices interfere with many other TCP options, and | |||
| although their use is not common they would interfere with | although their use is not common they would interfere with | |||
| connections that use EDO. | connections that use EDO. | |||
| More common are NATs, which rewrite IP address and/or transport port | More common are NATs, which rewrite IP address and/or transport port | |||
| fields. NATs are not affected by the EDO option. | fields. NATs are not affected by the EDO option. | |||
| Deep-packet inspection systems that inspect TCP segment payloads or | Deep-packet inspection systems that inspect TCP segment payloads or | |||
| attempt to reconstitute the data stream would incorrectly include | attempt to reconstitute the data stream would incorrectly include | |||
| options, which might interfere with their operation. | options, which might interfere with their operation. | |||
| ?? PROPOSED SOLUTION BELOW | ||||
| It may be important to detect misbehavior that could cause EDO space | It may be important to detect misbehavior that could cause EDO space | |||
| to be misinterpreted as user data. In such cases, EDO SHOULD be used | to be misinterpreted as user data. In such cases, EDO SHOULD be used | |||
| in conjunction with integrity protection mechanisms, such as IPsec, | in conjunction with integrity protection mechanisms, such as IPsec, | |||
| TCP-AO, etc. | TCP-AO, etc. It is useful to note that such protection helps find | |||
| only non-compliant components. | ||||
| --- | --- | |||
| ?? should we include a header checksum, as suggested by Oliver | ?? should we include a header checksum, as suggested by Oliver | |||
| Bonaventure, too? Should it be integrated with EDO or independent? | Bonaventure, too? Should it be integrated with EDO or independent? | |||
| If so, we either need one of the following: | If so, we either need one of the following: | |||
| +--------+--------+--------+--------+ | +--------+--------+--------+--------+ | |||
| | Kind | Length | Header_length | | | Kind | Length | Header_length | | |||
| +--------+--------+--------+--------+ | +--------+--------+--------+--------+ | |||
| | EDO-checksum | | | EDO-checksum | | |||
| +--------+--------+ | +--------+--------+ | |||
| skipping to change at page 10, line 4 ¶ | skipping to change at page 9, line 15 ¶ | |||
| Bonaventure, too? Should it be integrated with EDO or independent? | Bonaventure, too? Should it be integrated with EDO or independent? | |||
| If so, we either need one of the following: | If so, we either need one of the following: | |||
| +--------+--------+--------+--------+ | +--------+--------+--------+--------+ | |||
| | Kind | Length | Header_length | | | Kind | Length | Header_length | | |||
| +--------+--------+--------+--------+ | +--------+--------+--------+--------+ | |||
| | EDO-checksum | | | EDO-checksum | | |||
| +--------+--------+ | +--------+--------+ | |||
| Figure 3 TCP EDO length + sum option | Figure 3 TCP EDO length + sum option | |||
| +--------+--------+--------+--------+ | +--------+--------+--------+--------+ | |||
| | Kind | Length | Header_checksum | | | Kind | Length | Header_checksum | | |||
| +--------+--------+--------+--------+ | +--------+--------+--------+--------+ | |||
| Figure 4 TCP header checksum option | Figure 4 TCP header checksum option | |||
| The former adds a header checksum to EDO - this could be optional or | The former adds a header checksum to EDO - this could be optional or | |||
| required, and its presence can be determined from the option length. | required, and its presence can be determined from the option length. | |||
| The checksum would be over only the additional EDO space. | The checksum would be over only the additional EDO space. | |||
| The latter requires a new option and we'd need to decide what the | The latter requires a new option and we'd need to decide what the | |||
| option covered. It could cover just the option space. | option covered. It could cover just the option space. | |||
| Finally, another option would be to try to make rewriting | Finally, another option would be to try to make rewriting | |||
| middleboxes fail by some other, more opaque means - such as using | middleboxes fail by some other, more opaque means - such as using | |||
| DO=0 when EDO is used as suggested by John Leslie (but this would | DO=0 when EDO is used as suggested by John Leslie (but this would | |||
| also require considering whether EDO would be required on all | also require considering whether EDO would be required on all | |||
| packets, or just on DO=0, as well as where EDO would be and whether | packets, or just on DO=0, as well as where EDO would be and whether | |||
| it could precede security/integrity options such as TCP-AO). | it could precede security/integrity options such as TCP-AO). | |||
| 7. Comparison to Previous Proposals | ||||
| EDO is the latest in a long line of attempts to increase TCP option | ||||
| space [Al06][Ed08][Ko04][Ra12][Yo11]. The following is a comparison | ||||
| of these approaches to EDO, based partly on a previous summary | ||||
| [Ra12]. This comparison differs from that summary by using a | ||||
| different set of success criteria. | ||||
| 7.1. EDO Criteria | ||||
| Our criteria for a successful solution are as follows: | ||||
| o Zero-cost fallback to legacy endpoints. | ||||
| o Minimal impact on middlebox compatibility. | ||||
| o No additional side-effects. | ||||
| Zero-cost fallback requires that upgraded hosts incur no penalty for | ||||
| attempting to use EDO. This disqualifies dual-stack approaches, | ||||
| because the client might have to delay connection establishment to | ||||
| wait for the preferred connection mode to complete. Note that the | ||||
| impact of legacy endpoints that silently reflect unknown options are | ||||
| not considered, as they are already non-compliant with existing TCP | ||||
| requirements [RFC793]. | ||||
| Minimal impact on middlebox compatibility requires that EDO works | ||||
| through simple NAT and NAPT boxes, which modify IP addresses and | ||||
| ports and recompute IPv4 header and TCP segment checksums. | ||||
| Middleboxes that reject unknown options or that process segments in | ||||
| detail without regard for unknown options are not considered; they | ||||
| process segments as if they were an endpoint but do so in ways that | ||||
| are not compliant with existing TCP requirements (e.g., they should | ||||
| have rejected the initial SYN because of its unknown options rather | ||||
| than silently relaying it). | ||||
| EDO also attempts to avoid creating side-effects, such as might | ||||
| happen if options were split across multiple TCP segments (which | ||||
| could arrive out of order or be lost) or across different TCP | ||||
| connections (which could fail to share fate through firewalls or | ||||
| NAT/NAPTs). | ||||
| These requirements are similar to those noted in [Ra12], but EDO | ||||
| groups cases of segment modification beyond address and port - such | ||||
| as rewriting, segment drop, sequence number modification, and option | ||||
| stripping - as already in violation of existing TCP requirements | ||||
| regarding unknown options, and so we do not consider their impact on | ||||
| this new option. | ||||
| 7.2. Summary of Approaches | ||||
| There are three basic ways in which TCP option space extension has | ||||
| been attempted: | ||||
| 1. Use of a TCP option. | ||||
| 2. Redefinition of the existing TCP header fields. | ||||
| 3. Use of option space in multiple TCP segments (split across | ||||
| multiple segments). | ||||
| A TCP option is the most direct way to extend the option space and | ||||
| is the basis of EDO. This approach cannot extend the option space of | ||||
| the initial SYN. | ||||
| Redefining existing TCP header fields can be used to either contain | ||||
| additional options or as a pointer indicating alternate ways to | ||||
| interpret the segment payload. All such redefinitions make it | ||||
| difficult to achieve zero-impact backward compatibility, both with | ||||
| legacy endpoints and middleboxes. | ||||
| Splitting option space across separate segments can create | ||||
| unintended side-effects, such as increased delay to deal with path | ||||
| latency or loss differences. | ||||
| The following discusses three of the most notable past attempts to | ||||
| extend the TCP option space: Extended Segments, TCPx2, LOO/SLO, and | ||||
| LOIC. [Ra12] suggests a few other approaches, including use of TCP | ||||
| option cookies, reuse/overload of other TCP fields (e.g., the URG | ||||
| pointer), or compressing TCP options. None of these is compatible | ||||
| with legacy endpoints or middleboxes. | ||||
| 7.3. Extended Segments | ||||
| TCP Extended Segments redefined the meaning of currently unused | ||||
| values of the Data Offset (DO) field [Ko04]. TCP defines DO as | ||||
| indicating the length of the TCP header, including options, in 32- | ||||
| bit words. The default TCP header with no options is 5 such words, | ||||
| so the minimum currently valid DO value is 5 (meaning 40 bytes of | ||||
| option space). This document defines interpretations of values 0-4: | ||||
| DO=0 means 48 bytes of option space, DO=1 means 64, DO=2 means 128, | ||||
| DO=3 means 256, and DO=4 means unlimited (e.g., the entire payload | ||||
| is option space). This variant negotiates the use of this capability | ||||
| by using one of these invalid DO values in the initial SYN. | ||||
| Use of this variant is not backward-compatible with legacy TCP | ||||
| implementations, whether at the desired endpoint or on middleboxes. | ||||
| The variant also defines a way to initiate the feature on the | ||||
| passive side, e.g., using an invalid DO during the SYN-ACK when the | ||||
| initial SYN had a valid DO. This capability allows either side to | ||||
| initiate use of the feature but is also not backward compatible. | ||||
| 7.4. TCPx2 | ||||
| TCPx2 redefines legacy TCP headers by basically doubling all TCP | ||||
| header fields [Al06]. It relies on a new transport protocol number | ||||
| to indicate its use, defeating backward compatibility with all | ||||
| existing TCP capabilities, including firewalls, NATs/NAPTs, and | ||||
| legacy endpoints and applications. | ||||
| 7.5. LOO/SLO | ||||
| The TCP Long Option (LO, [Ed08]) is very similar to EDO, except that | ||||
| presence of LO results in ignoring the existing DO field and that LO | ||||
| is required to be the first option. EDO considers the need for other | ||||
| fields to be first and declares that the EDO is the last option as | ||||
| indicated by the DO field value. Unlike LO, EDO is not required in | ||||
| every segment once negotiated, saving 6 bytes if not actively | ||||
| needed. | ||||
| The TCP Long Option draft also specified the SYN Long Option (SLO) | ||||
| [Ed08]. If SLO is used in the initial SYN and successfully | ||||
| negotiated, it is used in each subsequent segment until all of the | ||||
| initial SYN options are transmitted. | ||||
| LO is backward compatible, as is SLO; in both cases, endpoints not | ||||
| supporting the option would not respond with the option, and in both | ||||
| cases the initial SYN is not itself extended. | ||||
| SLO does modify the three-way handshake because the connection isn't | ||||
| considered completely established until the first data byte is | ||||
| ACKed. Legacy TCP can establish a connection even in the absence of | ||||
| data. SLO also changes the semantics of the SYN-ACK; for legacy TCP, | ||||
| this completes the active side connection establishment, where in | ||||
| SLO an additional data ACK is required. A connection whose initial | ||||
| SYN options have been confirmed in the SYN-ACK might still fail upon | ||||
| receipt of additional options sent in later SLO segments. This case | ||||
| - of late negotiation fail - is not addressed in the specification. | ||||
| 7.6. LOIC | ||||
| TCP Long Options by Invalid Checksum is a dual-stack approach that | ||||
| uses two initial SYNS to initiate all updated connections [Yo11]. | ||||
| One SYN negotiates the new option and the other SYN payload contains | ||||
| only the entire options. The negotiation SYN is compliant with | ||||
| existing procedures, but the option SYN has a deliberately incorrect | ||||
| TCP checksum (decremented by 2). A legacy endpoint would discard the | ||||
| segment with the incorrect checksum and respond to the negotiation | ||||
| SYN without the LO option. | ||||
| Use of the option SYN and its incorrect checksum both interfere with | ||||
| other legacy components. Segments with incorrect checksums will be | ||||
| silently dropped by most middleboxes, including NATs/NAPTs. Use of | ||||
| two SYNs creates side-effects that can delay connections to upgraded | ||||
| endpoints, notably when the option SYN is lost or the SYNs arrive | ||||
| out of order. Finally, by not allowing other options in the | ||||
| negotiation SYN, all connections to legacy endpoints either use no | ||||
| options or require a separate connection attempt (either concurrent | ||||
| or subsequent). | ||||
| 7.7. Problems with Extending the Initial SYN | ||||
| The key difficulty with most previous proposals is the desire to | ||||
| extend the option space in all TCP segments, including the initial | ||||
| SYN, i.e., SYN with no ACK, typically the first segment of a | ||||
| connection. It has proven difficult to extend space in this initial | ||||
| SYN in the absence of prior negotiation while maintaining current | ||||
| TCP three-way handshake properties. | ||||
| A new TCP option cannot extend the Data Offset of a single TCP | ||||
| initial SYN segment. All TCP segments, including the initial SYN, | ||||
| may include user data in the payload data [RFC793], and this can be | ||||
| useful for some proposed features such as TCP Fast Open [Ch14]. | ||||
| Legacy endpoints that ignore the new option would process the | ||||
| payload contents as user data and send an ACK. Once ACK'd, this data | ||||
| cannot be removed from the user stream. | ||||
| The six Reserved TCP header bits cannot be redefined easily, because | ||||
| the original specification did not require their contents to be | ||||
| ignored, even though three of those bits have already been redefined | ||||
| (ECE/CWR [RFC3168] and NS [RFC3540]). Legacy endpoints are required | ||||
| to drop TCP segments where those bits are not zero, making it | ||||
| difficult to assume backward downgrade. | ||||
| TCP initial SYN (SYN and not ACK) segments can use every other TCP | ||||
| header field except the Acknowledgement number, which is not used | ||||
| because the ACK field is not set. In all other segments, all fields | ||||
| except the three remaining Reserved header bits are actively used. | ||||
| The total amount of available header fields, in either case, is | ||||
| insufficient to be useful in extending the option space. | ||||
| The representation of TCP options can be optimized to minimize the | ||||
| space needed. In such cases, multiple Kind and Length fields are | ||||
| combined, so that a new Kind would indicate a specific combination | ||||
| of options, whose order is fixed and whose length is indicated by | ||||
| one Length field. Most TCP options use fields whose size is much | ||||
| larger than the required Kind and Length components, so the | ||||
| resulting efficiency is typically insufficient for additional | ||||
| options. | ||||
| The option space of an initial SYN segment might be extended by | ||||
| using multiple initial segments (e.g., multiple SYNs or a SYN and | ||||
| non-SYN) or based on the context of previous or parallel | ||||
| connections. Because of their potential complexity, these approaches | ||||
| are addressed in a separate document [To14]. | ||||
| Option space cannot be extended in outer layer headers, e.g., IPv4 | ||||
| or IPv6. These layers typically try to avoid extensions altogether, | ||||
| to simplify forwarding processing at routers. Introducing new shim | ||||
| layers to accommodate additional option space would interfere with | ||||
| deep-packet inspection mechanisms that are in widespread use. | ||||
| As a result, EDO does not attempt to extend the space available for | ||||
| options in TCP initial SYNs. It does extend that space in all other | ||||
| segments (including SYN-ACK), which has always been trivially | ||||
| possible once an option is defined. | ||||
| 8. Security Considerations | 8. Security Considerations | |||
| It is meaningless to have the Data Offset further exceed the | It is meaningless to have the Data Offset further exceed the | |||
| position of the EDO data offset option. | position of the EDO data offset option. | |||
| >> When the EDO length option is present, the EDO length option | >> When the EDO length option is present, the EDO length option | |||
| SHOULD be the last non-null option covered by the TCP Data Offset, | SHOULD be the last non-null option covered by the TCP Data Offset, | |||
| because it would be the last option affected by Data Offset. | because it would be the last option affected by Data Offset. | |||
| This also makes it more difficult to use the Data Offset field as a | This also makes it more difficult to use the Data Offset field as a | |||
| skipping to change at page 11, line 14 ¶ | skipping to change at page 15, line 11 ¶ | |||
| [RFC793] Postel, J., "Transmission Control Protocol", STD 7, RFC | [RFC793] Postel, J., "Transmission Control Protocol", STD 7, RFC | |||
| 793, September 1981. | 793, September 1981. | |||
| 10.2. Informative References | 10.2. Informative References | |||
| [Al06] Allman, M., "TCPx2: Don't Fence Me In", draft-allman- | [Al06] Allman, M., "TCPx2: Don't Fence Me In", draft-allman- | |||
| tcpx2-hack-00 (work in progress), May 2006. | tcpx2-hack-00 (work in progress), May 2006. | |||
| [Ch14] Cheng, Y., Chu, J., and A. Jain, "TCP Fast Open", draft- | [Ch14] Cheng, Y., Chu, J., and A. Jain, "TCP Fast Open", draft- | |||
| ietf-tcpm-fastopen-08, March 2014. | ietf-tcpm-fastopen-09, June 2014. | |||
| [Ed08] Eddy, W. and A. Langley, "Extending the Space Available | [Ed08] Eddy, W. and A. Langley, "Extending the Space Available | |||
| for TCP Options", draft-eddy-tcp-loo-04 (work in | for TCP Options", draft-eddy-tcp-loo-04 (work in | |||
| progress), July 2008. | progress), July 2008. | |||
| [Ko04] Kohler, E., "Extended Option Space for TCP", draft-kohler- | [Ko04] Kohler, E., "Extended Option Space for TCP", draft-kohler- | |||
| tcpm-extopt-00 (work in progress), September 2004. | tcpm-extopt-00 (work in progress), September 2004. | |||
| [Ni14] Nishida, Y., "A-PAWS: Alternative Approach for PAWS", | ||||
| draft-nishida-tcpm-apaws-01 (work in progress), June 2014. | ||||
| [Ra12] Ramaiah, A., "TCP option space extension", draft-ananth- | [Ra12] Ramaiah, A., "TCP option space extension", draft-ananth- | |||
| tcpm-tcpoptext-00 (work in progress), March 2012. | tcpm-tcpoptext-00 (work in progress), March 2012. | |||
| [RFC792] Postel, J., "Internet Control Message Protocol", RFC 792. | [RFC792] Postel, J., "Internet Control Message Protocol", RFC 792. | |||
| [RFC1323] Jacobson, V., Braden, R., and D. Borman, "TCP Extensions | [RFC1323] Jacobson, V., Braden, R., and D. Borman, "TCP Extensions | |||
| for High Performance", RFC 1323, May 1992. | for High Performance", RFC 1323, May 1992. | |||
| [RFC1812] Baker, F. (Ed.), "Requirements for IP Version 4 | [RFC1812] Baker, F. (Ed.), "Requirements for IP Version 4 Routers," | |||
| Routers," RFC 1812, June 1995. | RFC 1812, June 1995. | |||
| [RFC2018] Mathis, M., Mahdavi, J., Floyd, S., and A. Romanow, "TCP | [RFC2018] Mathis, M., Mahdavi, J., Floyd, S., and A. Romanow, "TCP | |||
| Selective Acknowledgment Options", RFC 2018, October 1996. | Selective Acknowledgment Options", RFC 2018, October 1996. | |||
| [RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition | [RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition | |||
| of Explicit Congestion Notification (ECN) to IP", RFC | of Explicit Congestion Notification (ECN) to IP", RFC | |||
| 3168, September 2001. | 3168, September 2001. | |||
| [RFC3234] Carpenter, B. and S. Brim, "Middleboxes: Taxonomy and | [RFC3234] Carpenter, B. and S. Brim, "Middleboxes: Taxonomy and | |||
| Issues", RFC 3234, February 2002. | Issues", RFC 3234, February 2002. | |||
| skipping to change at page 12, line 14 ¶ | skipping to change at page 16, line 14 ¶ | |||
| [RFC6675] Blanton, E., Allman, M., Wang, L., Jarvinen, I., Kojo, M., | [RFC6675] Blanton, E., Allman, M., Wang, L., Jarvinen, I., Kojo, M., | |||
| and Y.. Nishida, "A Conservative Loss Recovery Algorithm | and Y.. Nishida, "A Conservative Loss Recovery Algorithm | |||
| Based on Selective Acknowledgment (SACK) for TCP", RFC | Based on Selective Acknowledgment (SACK) for TCP", RFC | |||
| 6675, August 2012. | 6675, August 2012. | |||
| [RFC6824] Ford, A., Raiciu, C., Handley, M., and O. Bonaventure, | [RFC6824] Ford, A., Raiciu, C., Handley, M., and O. Bonaventure, | |||
| "TCP Extensions for Multipath Operation with Multiple | "TCP Extensions for Multipath Operation with Multiple | |||
| Addresses", RFC 6824, January 2013. | Addresses", RFC 6824, January 2013. | |||
| [To14] Touch, J., (et al., TBD), "Extending TCP Option Space | ||||
| During Connection Establishment", draft-touch-tcp-syn-edo- | ||||
| 00 (work in progress), July 2014. | ||||
| [Yo11] Yourtchenko, A., "Introducing TCP Long Options by Invalid | [Yo11] Yourtchenko, A., "Introducing TCP Long Options by Invalid | |||
| Checksum", draft-yourtchenko-tcp-loic-00 (work in | Checksum", draft-yourtchenko-tcp-loic-00 (work in | |||
| progress), April 2011. | progress), April 2011. | |||
| 11. Acknowledgments | 11. Acknowledgments | |||
| The authors would like to thank the IETF TCPM WG for their feedback, | The authors would like to thank the IETF TCPM WG for their feedback, | |||
| in particular: Oliver Bonaventure, Ted Faber, John Leslie, Richard | in particular: Oliver Bonaventure, Bob Briscoe, Ted Faber, John | |||
| Scheffenegger, and Alexander Zimmerman. | Leslie, Richard Scheffenegger, and Alexander Zimmerman. | |||
| This document was prepared using 2-Word-v2.0.template.dot. | This document was prepared using 2-Word-v2.0.template.dot. | |||
| Authors' Addresses | Authors' Addresses | |||
| Joe Touch | Joe Touch | |||
| USC/ISI | USC/ISI | |||
| 4676 Admiralty Way | 4676 Admiralty Way | |||
| Marina del Rey, CA 90292-6695 USA | Marina del Rey, CA 90292-6695 USA | |||
| End of changes. 36 change blocks. | ||||
| 125 lines changed or deleted | 319 lines changed or added | |||
This html diff was produced by rfcdiff 1.48. The latest version is available from http://tools.ietf.org/tools/rfcdiff/ | ||||