| < draft-agl-tcpm-sadata-00.txt | draft-agl-tcpm-sadata-01.txt > | |||
|---|---|---|---|---|
| Network Working Group A. Langley | Network Working Group A. Langley | |||
| Internet-Draft Google Inc | Internet-Draft Google Inc | |||
| Expires: January 16, 2009 July 15, 2008 | Expires: February 6, 2009 August 5, 2008 | |||
| Faster application handshakes with SYN/ACK payloads | Faster application handshakes with SYN/ACK payloads | |||
| draft-agl-tcpm-sadata-00 | draft-agl-tcpm-sadata-01 | |||
| Status of this Memo | Status of this Memo | |||
| By submitting this Internet-Draft, each author represents that any | By submitting this Internet-Draft, each author represents that any | |||
| applicable patent or other IPR claims of which he or she is aware | applicable patent or other IPR claims of which he or she is aware | |||
| have been or will be disclosed, and any of which he or she becomes | have been or will be disclosed, and any of which he or she becomes | |||
| aware will be disclosed, in accordance with Section 6 of BCP 79. | aware will be disclosed, in accordance with Section 6 of BCP 79. | |||
| Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
| Task Force (IETF), its areas, and its working groups. Note that | Task Force (IETF), its areas, and its working groups. Note that | |||
| skipping to change at page 1, line 33 ¶ | skipping to change at page 1, line 33 ¶ | |||
| and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
| time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
| material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
| The list of current Internet-Drafts can be accessed at | The list of current Internet-Drafts can be accessed at | |||
| http://www.ietf.org/ietf/1id-abstracts.txt. | http://www.ietf.org/ietf/1id-abstracts.txt. | |||
| The list of Internet-Draft Shadow Directories can be accessed at | The list of Internet-Draft Shadow Directories can be accessed at | |||
| http://www.ietf.org/shadow.html. | http://www.ietf.org/shadow.html. | |||
| This Internet-Draft will expire on January 16, 2009. | This Internet-Draft will expire on February 6, 2009. | |||
| Abstract | Abstract | |||
| This document describes an extension to TCP [RFC0793] which permits a | This document advocates the usage of small, mostly constant payloads | |||
| small, mostly constant data payload to be carried in the SYN+ACK | in the SYN+ACK frame of the 3-way TCP [RFC0793] handshake. We show | |||
| frame of the 3-way handshake. This new behaviour is enabled by an | how this can have immediate benefits for some protocols. | |||
| option in the SYN packet to ensure backwards compatibility. We | Additionally, we describe a new TCP option that enables a wider range | |||
| show how this has latency benefits, specifically for cryptographic | of protocols to gain from it. | |||
| applications. | ||||
| Table of Contents | Table of Contents | |||
| 1. Requirements Notation . . . . . . . . . . . . . . . . . . . . 3 | 1. Requirements Notation . . . . . . . . . . . . . . . . . . . . 3 | |||
| 2. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 | 2. Changes since 00 . . . . . . . . . . . . . . . . . . . . . . . 4 | |||
| 3. The SYNACK Payload Permitted Option . . . . . . . . . . . . . 6 | 3. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 5 | |||
| 4. Security Considerations . . . . . . . . . . . . . . . . . . . 8 | 4. Example One: Opportunistic HTTP encryption . . . . . . . . . . 7 | |||
| 5. Implementation details . . . . . . . . . . . . . . . . . . . . 9 | 5. Example Two: Faster SSH connections . . . . . . . . . . . . . 9 | |||
| 6. Comparison to T/TCP . . . . . . . . . . . . . . . . . . . . . 10 | 6. Example Three: Compressed HTTP headers . . . . . . . . . . . . 11 | |||
| 7. Middlebox Interactions . . . . . . . . . . . . . . . . . . . . 11 | 7. The SYNACK Payload Processed Option . . . . . . . . . . . . . 12 | |||
| 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 12 | 8. Security Considerations . . . . . . . . . . . . . . . . . . . 15 | |||
| 9. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 13 | 9. Comparison to T/TCP . . . . . . . . . . . . . . . . . . . . . 16 | |||
| 10. References . . . . . . . . . . . . . . . . . . . . . . . . . . 14 | 10. Middlebox Interactions . . . . . . . . . . . . . . . . . . . . 17 | |||
| 10.1. Normative References . . . . . . . . . . . . . . . . . . 14 | 11. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 18 | |||
| 10.2. Informative References . . . . . . . . . . . . . . . . . 14 | 12. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 19 | |||
| Appendix A. Changes . . . . . . . . . . . . . . . . . . . . . . . 15 | 13. References . . . . . . . . . . . . . . . . . . . . . . . . . . 20 | |||
| Author's Address . . . . . . . . . . . . . . . . . . . . . . . . . 16 | 13.1. Normative References . . . . . . . . . . . . . . . . . . 20 | |||
| Intellectual Property and Copyright Statements . . . . . . . . . . 17 | 13.2. Informative References . . . . . . . . . . . . . . . . . 20 | |||
| Appendix A. Changes . . . . . . . . . . . . . . . . . . . . . . . 23 | ||||
| Author's Address . . . . . . . . . . . . . . . . . . . . . . . . . 24 | ||||
| Intellectual Property and Copyright Statements . . . . . . . . . . 25 | ||||
| 1. Requirements Notation | 1. Requirements Notation | |||
| The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | |||
| "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | |||
| document are to be interpreted as described in RFC 2119 [RFC2119]. | document are to be interpreted as described in RFC 2119 [RFC2119]. | |||
| 2. Introduction | 2. Changes since 00 | |||
| To be removed by the RFC Editor before publication. | ||||
| o Greatly expanded on the introduction | ||||
| o Fixed the wording around retransmissions which mistakenly | ||||
| suggested that no packets of any type could be transmitted without | ||||
| payloads. | ||||
| o Renamed the flag to SYNACK Payloads Processed. | ||||
| o Required that the flag be echoed in resulting SYNACK frames. | ||||
| o Added discussion of simultaneous open. | ||||
| o Added discussion of SYNACKs with payloads that are nothing to do | ||||
| with this spec, noting that they are still permitted. | ||||
| o Changed the option to a standard, 2 byte, flags option. | ||||
| 3. Introduction | ||||
| At the current time, almost no stacks will send payloads in the | ||||
| SYNACK frame of a TCP handshake even though RFC 793 [RFC0793] permits | ||||
| it. This springs from a handful of reasons: | ||||
| 1. Processing time for a SYN must be minimal to mitigate the effects | ||||
| of SYN floods. Even waking up an application to process a SYN | ||||
| would greatly increase the costs. | ||||
| 2. Replies to SYNs must be small, otherwise they provide a way to | ||||
| amplify DDoS attacks using false source IP addresses. | ||||
| 3. The ubiquitous sockets API doesn't make it easy to do so. | ||||
| This document proposes that a semi-constant payload (a payload such | ||||
| that it's trivial for the kernel to compute) overcomes the first and | ||||
| third reasons. Additionally, limiting that payload to 64 bytes | ||||
| overcomes the second. | ||||
| There are protocols that could immediately benefit from a gradual | ||||
| deployment of hosts which supported a "setsockopt" to set a constant | ||||
| payload and hosts that would ACK and enqueue such a payload. SMTP | ||||
| would be one such protocol: clients wait for a 220 code banner from | ||||
| the server before starting their part of the exchange and the banner | ||||
| is small and constant. SMTP is also a protocol which ends up making | ||||
| many short-lived connections. | ||||
| Since such behaviour is already permitted by TCP it requires no | ||||
| standards work. It would also be easy to deploy. Active open hosts | ||||
| that don't enqueue payloads in SYNACK frames will ACK only the SYN | ||||
| flag and the passive open host then knows to retransmit the payload | ||||
| immediately after. | ||||
| However, this common lack of carrying data in SYNACK frames, and the | ||||
| sockets API which reflects it, has guided the design of many | ||||
| application layer protocols. These protocols are often designed such | ||||
| that: | ||||
| 1. The client starts the exchange. For example, the first | ||||
| application layer bytes sent on an HTTP connection are the | ||||
| client's request. | ||||
| 2. The exchange is large since there is little space pressure. SSH | ||||
| algorithm agreement uses strings like | ||||
| "diffie-hellman-group14-sha1" (28 bytes) because of this. | ||||
| In these cases we suggest that, had a general ability to send | ||||
| payloads in SYNACK frames existed at the time that these protocols | ||||
| were written, they may have ended up differently. However, the | ||||
| ability for a passive open host to send a payload with no latency | ||||
| overhead is of value: we outline three motivating examples in the next | ||||
| sections. | ||||
| Modifications to take advantage of SYNACK payloads would then require | ||||
| changes to the application level protocol. This could be managed by | ||||
| assigning new ports, trying connections on the new ports first, | ||||
| backing off etc. However, given that SYNACK payloads are partly a | ||||
| latency optimisation, that would utterly negate any gains. | ||||
| Because of this, we also describe a TCP option that lets the | ||||
| application layer on both sides know that their respective stacks | ||||
| support at least the limited SYNACK payloads described herein, and | ||||
| also to agree to use an alternative protocol which takes advantage of | ||||
| it. | ||||
| Fundamentally, any protocol which used payloads in SYNACK frames | ||||
| could achieve the same effect without them, at the cost of an extra | ||||
| round trip. Thus, this should only be used where latency is | ||||
| important. Nonetheless, the advantage of avoiding a round trip | ||||
| should not be discounted. Round trip times are often in excess of | ||||
| 100ms for distant hosts, or in poorly networked areas of the world. | ||||
| 4. Example One: Opportunistic HTTP encryption | ||||
| Here we assume that both HTTP client and server implement this | ||||
| specification. | ||||
| The client, before calling "connect", calls "setsockopt" to instruct | ||||
| the kernel to include an option to advertise support for this | ||||
| specification. The server has already configured its listening | ||||
| socket to include a Diffie-Hellman public value in the SYNACK | ||||
| payloads elicited from SYN frames carrying this option. | ||||
| Additionally, the server's stack generates an 8-byte random nonce and | ||||
| includes it in the payload. | ||||
| The client is aware that the server implements this specification | ||||
| because the advertising option is echoed back in the SYNACK frame. | ||||
| Thus, it expects to read the nonce and public value from the | ||||
| connection. It then sends its own nonce and public value to the | ||||
| server. Both sides can calculate a shared key and use a cipher to | ||||
| encrypt the remaining data in both directions. | ||||
| Choosing the correct cryptographic primitives can make this | ||||
| particularly cheap. Curve25519 [curve25519] is an elliptic curve | ||||
| Diffie-Hellman function that can be calculated in 240 microseconds on | ||||
| a 2.33GHz Intel Core2. Salsa20/8 [salsa20] is a stream cipher that | ||||
| can encrypt data in 2 cycles/byte on the same hardware. | ||||
| The resulting key could also be used to establish integrity using the | ||||
| forthcoming TCP Auth Option specification [1]. | ||||
| This example demonstrates a number of salient features of this | ||||
| specification: | ||||
| o By using the correct primitive (curve25519), a constant payload | ||||
| can be used to establish cryptographic connections. | ||||
| o We can add significant extensions to latency sensitive protocols | ||||
| without affecting latency. Previous attempts [RFC2817] to do the | ||||
| same have required an extra round trip whether the server side | ||||
| supported the protocol or not. | ||||
| o We can do so in a backwards compatible fashion, affording a gradual | ||||
| deployment. | ||||
| The above is cursory in order not to distract from the topic of | ||||
| this document; however, enquiring readers are welcome to continue | ||||
| reading this section if they still have questions. | ||||
| *Why Elliptic Curves?*: The payload must be short otherwise SYN- | ||||
| floods could use this as an amplification to backscatter DDoS another | ||||
| host. The reduced computation cost (as compared to Diffie-Hellman | ||||
| over a multiplicative finite field) is very nice. | ||||
| Most importantly, curve25519 is specifically designed to allow a | ||||
| constant public value to be used for multiple key agreements. If a | ||||
| new public value had to be generated for every SYN, not only would | ||||
| the stack have to be able to perform that operation, a SYN flood | ||||
| would be very effective. | ||||
| *Can't the client's public value fit in the SYN?*: A SYN generally | ||||
| has twenty bytes of free option space these days. (We can't use the | ||||
| payload space in a SYN). Since we wouldn't want to define the last | ||||
| option ever, we need to leave four bytes spare. Two bytes for the | ||||
| option header means fourteen bytes (or 112 bits) for the public | ||||
| value. The closest prime is then 2^112-75. | ||||
| The best, general algorithm currently known for breaking the Diffie- | ||||
| Hellman problem on elliptic curves is Pollard's Rho. The work | ||||
| involved in this attack is sqrt(n), which is 2^56 in this case. | ||||
| Critically, once you have solved a single instance you can precompute | ||||
| tables to speed up breaking more instances. With a petabyte of | ||||
| storage, you could break 112-bit curves in only 2^12 operations. | ||||
| *Can't a smaller field be used?*: Some speedup could be gained by | ||||
| using an elliptic curve with a field size around 200 bits. However | ||||
| the effort of defining such a curve is pretty huge. The standard | ||||
| NIST curves around that size are slower than curve25519 [2]. | ||||
| *What about man-in-the-middle attacks?*: All opportunistic schemes are | ||||
| open to man-in-the-middle and downgrade attacks. This is no | ||||
| exception; it's a trade-off and, for real security, TLS [RFC4346] | ||||
| should be used. It has been suggested that the HTTP server include a | ||||
| header in replies giving a URL on the same domain, using the "https" | ||||
| scheme, which contains the server's public value and an expiry time. | ||||
| 5. Example Two: Faster SSH connections | ||||
| SSH [RFC4253] connection latency is a small, but quotidian | ||||
| frustration for those who use it. Current efforts to address it | ||||
| involve multiplexing interactive sessions over long-term, persistent | ||||
| connections. | ||||
| Consider the following, diagrammatic representation of the beginning | Consider the following, diagrammatic representation of the beginning | |||
| of an SSH [RFC4253] connection: | of an SSH [RFC4253] connection: | |||
| 0 SYN ------------> 0.5 | 0 SYN ------------> 0.5 | |||
| 1 <------------ SYNACK 0.5 | 1 <------------ SYNACK 0.5 | |||
| 1 Ident ------------> 1.5 | 1 Ident ------------> 1.5 | |||
| 1 NList ------------> 1.5 | 1 NList ------------> 1.5 | |||
| 2 <------------ Ident 1.5 | 2 <------------ Ident 1.5 | |||
| 2 <------------ NList 1.5 | 2 <------------ NList 1.5 | |||
| skipping to change at page 4, line 36 ¶ | skipping to change at page 9, line 41 ¶ | |||
| Figure 1 | Figure 1 | |||
| Here, arrows from the left to the right are frames from client to | Here, arrows from the left to the right are frames from client to | |||
| server. Times on the left are the times that the client either | server. Times on the left are the times that the client either | |||
| transmits or receives a packet (and vice versa). Times are measured | transmits or receives a packet (and vice versa). Times are measured | |||
| in round trip times (RTT), so that it takes 0.5 units for a frame to | in round trip times (RTT), so that it takes 0.5 units for a frame to | |||
| pass between the hosts. | pass between the hosts. | |||
| The above diagram is for a latency tuned implementation of SSH, | The above diagram is for a latency tuned implementation of SSH, | |||
| specifically, the client doesn't wait for the server's identity | specifically, the client doesn't wait for the server's identity | |||
| string to be received. And yes, in this ideal scenario, the client | string to be received. And yet, in this ideal scenario, the client | |||
| can only start transmitting useful data after 2 RTT and the server | can only start transmitting useful data after 2 RTT and the server | |||
| can only start transmitting after 2.5 RTT. As a rule of thumb, the | can only start transmitting after 2.5 RTT. As a rule of thumb, the | |||
| RTT from San Francisco to London is 150ms, so this means a 300ms | RTT from San Francisco to London is 150ms, so this means a 300ms | |||
| latency, at least, when setting up this connection. | latency, at least, when setting up this connection. | |||
| (To keep the discussion simple, we assume there is no packet loss, | (To keep the discussion simple, we assume there is no packet loss, | |||
| that the path is symmetrical and that the client's ACK of the 3-way | that the path is symmetrical and that the client's ACK of the 3-way | |||
| handshake carries a data payload.) | handshake carries a data payload.) | |||
| Now, let us consider a hypothetical SSH protocol where the server | Now, let us consider the situation when SYNACK payloads are | |||
| could include a short, constant byte-string in the SYNACK packet of | available. First we compact the name-list (which is part of the | |||
| the TCP exchange. First we compact the name-list (part of the | ||||
| algorithm negotiation) and put it in the SYNACK. | algorithm negotiation) and put it in the SYNACK. | |||
| 0 SYN ------------> 0.5 | 0 SYN ------------> 0.5 | |||
| 1 <------------ SA+NList 0.5 | 1 <------------ SA+NList 0.5 | |||
| 1 NList ------------> 1.5 | 1 NList ------------> 1.5 | |||
| 1 KX ------------> 1.5 | 1 KX ------------> 1.5 | |||
| 2 <------------ KX 1.5 | 2 <------------ KX 1.5 | |||
| SSH protocol with a compact name list carried in the SYN+ACK frame | SSH protocol with a compact name list carried in the SYN+ACK frame | |||
| skipping to change at page 5, line 36 ¶ | skipping to change at page 10, line 37 ¶ | |||
| 1 <------------ SA+KX 0.5 | 1 <------------ SA+KX 0.5 | |||
| 1 KX ------------> 1.5 | 1 KX ------------> 1.5 | |||
| A protocol which includes key exchange information in the SYN+ACK | A protocol which includes key exchange information in the SYN+ACK | |||
| frame. | frame. | |||
| Figure 3 | Figure 3 | |||
| Here the client's latency is 1 RTT and the server's is 1.5 RTT, which | Here the client's latency is 1 RTT and the server's is 1.5 RTT, which | |||
| is equal to the minimum required by the 3-way handshake, saving a | is equal to the minimum required by the 3-way handshake, saving a | |||
| full RTT of latency from the initial diagram. In order for the | full RTT of latency from the initial diagram. | |||
| SYNACK to be sent without application involvement, some cryptographic | ||||
| tricks are needed, as detailed below. | ||||
| None of the above discussion is specific to SSH. Many cryptographic | 6. Example Three: Compressed HTTP headers | |||
| protocols, such as TLS [RFC4346], involve a similar scheme and could | ||||
| benefit from lower latency. Specifically designed protocols could | ||||
| style themselves on the third example and achieve, essentially, no | ||||
| latency overhead. This also allows existing protocols to be extended | ||||
| with encryption with no additional round trips and with transparent | ||||
| fallback. | ||||
| 3. The SYNACK Payload Permitted Option | So that all the examples aren't cryptography based, we consider a | |||
| third example. | ||||
| Several commonly used TCP stacks don't support receiving data | There are many HTTP resources that are very small, or even empty. | |||
| payloads in SYNACK packets. Thus, SYNACK payloads cannot be enabled | Consider that clicking on Google results involves requesting a | |||
| unless it's known that the remote host can support them. To that end | resource from the Google server to redirect to the true result. Or | |||
| we define an option in the SYN frame: | OCSP [RFC2560] revocation servers which serve small ASN.1 documents. | |||
| For these services the size of HTTP headers might dominate the | ||||
| bandwidth requirements: Firefox 3 transmits over 350 bytes to request | ||||
| the shortest URL possible ("/"). | ||||
| 1 2 | HTTP headers, however, are highly compressible. They are highly | |||
| 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 . . . | structured, and many strings are very common (such as "Keep-Alive"). | |||
| +---------------+---------------+--------------- . . . | Careful examination of the current patterns in both client requests | |||
| | Kind = | Length = 2+n | Flag bits | and server replies would probably yield a range coding [rangecoding] | |||
| | TDB-IANA-KIND1| |0 1 2 3 4 5 6 7 . . . | model that achieved significant savings. | |||
| +---------------+---------------+--------------- . . . | ||||
| TCP Flags Option | However, there is no easy method to deploy such a scheme. Obviously | |||
| the first client request on a connection could not use such a scheme. A | ||||
| server could advertise support in its reply headers for subsequent | ||||
| requests on the same connection, although that could only affect | ||||
| requests that haven't already been pipelined. | ||||
| Figure 4 | A SYNACK payload could serve to advertise support for this, and any | |||
| other extensions, allowing every request on a connection to use such | ||||
| a scheme when both ends support it. | ||||
| The flags option contains a conceptually infinite number of bits | 7. The SYNACK Payload Processed Option | |||
| which are numbered from the MSB of the first byte, upwards. If not | ||||
| specified, bits are assumed to be false. The flags option MUST be as | ||||
| short as possible and yet cover all the true bits that need to be | ||||
| specified. If no bits are true, the flags option MUST NOT be | ||||
| included. The meanings of flag bits are to be assigned by IANA. For | ||||
| this RFC, bit 0 is assumed to be "SYNACK Payload Permitted". | ||||
| For example, if no other flag bits are to be set, SYNACK payload | Alternative application protocols that take advantage of data in a | |||
| support would be advertised by a 3-byte option whose first data byte | SYNACK frame necessarily require the application level to know when | |||
| is "0x80". | this specification is in effect. To that end, we define an option | |||
| which signifies compliance with this specification to be carried in | ||||
| the SYN and SYNACK frames: | ||||
| Hosts MUST NOT set the SYNACK Payload Permitted bit unless an | 1 | |||
| 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 | ||||
| +---------------+---------------+ | ||||
| | Kind = | Length = 2 | | ||||
| | TDB-IANA-KIND1| | | ||||
| +---------------+---------------+ | ||||
| SYNACK Payload Processed Option | ||||
| Figure 4 | ||||
| It is required that both endpoints reach agreement about when this | ||||
| option is in effect since it affects the application layer. The next | ||||
| five paragraphs deal with this. This specification considers the | ||||
| option to be an optimisation, however, and a valid agreement might be | ||||
| that the option is not in effect even in the case that both endpoints | ||||
| support it. This is to allow implementations to back off in the case | ||||
| of possible middleware interactions and overload. | ||||
| Hosts MUST NOT include the SYNACK Payload Processed option unless an | ||||
| application has requested it for the current socket. If SYNACK | application has requested it for the current socket. If SYNACK | |||
| Payload Permitted is requested for a socket, the host SHOULD include | Payload Processed is requested for a socket, the host SHOULD include | |||
| the SYNACK Payload Permitted. For example, it may choose not to in | the SYNACK Payload Processed option. For example, it may choose not | |||
| the case of having to retransmit the SYN frame as middleware may be | to in the case of having to retransmit the SYN frame as middleware | |||
| filtering the extra option. | may be filtering the extra option. | |||
| Upon receipt of a SYN frame with SYNACK Payload Permitted, a host | Upon receipt of a SYN frame with a SYNACK Payload Processed option, | |||
| SHOULD include a data payload in any resulting SYNACK frame, if so | to a valid, passive open socket, that socket will have either been | |||
| configured. For a given SYN, if any resulting SYNACK frame has a | configured by an application to take advantage of this specification | |||
| payload then all resulting frames MUST have a payload. If a host | or not. In the case that it has not, the host MUST NOT include the | |||
| chooses to retransmit a SYN frame without a SYNACK Payload Permitted | SYNACK Payload Processed option in any SYNACK. In the case that it | |||
| bit when previous transmissions included the bit, it MUST reject any | has been so configured, the host SHOULD include the configured | |||
| SYNACK with a payload. If a SYN frame for a given handshake is ever | payload in the SYNACK. Iff it chooses to do so, it MUST include the | |||
| transmitted without a SYNACK Payload Permitted bit, all | SYNACK Payload Processed option. | |||
| retransmissions MUST NOT include the bit. | ||||
| The data payload affects the SEQ/ACK numbers like any other data. | For a given connection, for all resulting SYNACK frames, the presence | |||
| of the SYNACK Payload Processed option MUST NOT differ. | ||||
| If a host has alternative mechanisms which involve sending payloads | ||||
| in a SYNACK frame, they MUST NOT be used concurrently with this | ||||
| specification for a given connection. This specification does not | ||||
| prohibit SYNACK frames with payloads generated by other means as long | ||||
| as the SYNACK Payload Processed option is not included. | ||||
| It's expected that a host will make a best effort to include a SYNACK | ||||
| payload when the application has set one. It may choose not to for a | ||||
| number of reasons including: the SYN frame didn't request it, the | ||||
| host is under heavy SYN load, is using SYN cookies or that the host | ||||
| is having to retransmit the SYNACK. | ||||
| The next four paragraphs seek to establish a minimal basis for | ||||
| application protocols to build upon. An implementation may allow | ||||
| applications to set arbitrary payloads on a per connection basis, but | ||||
| we expect that most will wish to expose a more limited scope. | ||||
| Obviously some of these capabilities, such as the inclusion of random | ||||
| bytes, are motivated by the examples above. | ||||
| In the case that the SYNACK Payload Processed option is in effect: | ||||
| The data payload MUST affect the SEQ/ACK numbers like any other data. | ||||
| Any ACK frame resulting from such a SYNACK frame MUST acknowledge the | Any ACK frame resulting from such a SYNACK frame MUST acknowledge the | |||
| whole SYNACK frame, including the SYN flag. If a frame is the final | whole SYNACK frame, including the SYN flag. If a frame is the final | |||
| ACK in a 3-way handshake, a host MUST reject it unless it | ACK in a 3-way handshake, a host MUST reject it unless it | |||
| acknowledges the whole SYNACK frame. | acknowledges the whole SYNACK frame. | |||
| A host MUST NOT include a data payload in any SYNACK resulting from a | ||||
| SYN frame without SYNACK Payload Permitted. | ||||
| A host MUST provide a method for applications to set a SYNACK | A host MUST provide a method for applications to set a SYNACK | |||
| payload, to determine if a passive-open connection sent a SYNACK | payload, to determine if a passive-open connection sent a SYNACK | |||
| payload and to determine if an active open connection received a | payload and to determine if an active open connection received the | |||
| payload in the SYNACK frame. This is because the SYNACK data appears | SYNACK Payload Processed option in the SYNACK frame. | |||
| to the application like any other, but its presence may alter the | ||||
| application level protocol. | ||||
| It's expected that a host will make a best effort to include a SYNACK | A host MUST support configuring passive open sockets with at least | |||
| payload when the application has set one. It may choose not to for a | 64 bytes of data. (See "Security Considerations", below.) | |||
| number of reasons including: the SYN frame didn't request it, the | ||||
| host is under heavy SYN load, is using SYN cookies or that the host | ||||
| is having to retransmit the SYNACK. | ||||
| 4. Security Considerations | A host SHOULD support including at least 8 random bytes in the SYNACK | |||
| payload, at any arbitrary (but within range) byte offset. If it | ||||
| does, the random bytes MUST be consistent between retransmissions of | ||||
| the SYNACK frame and the host MUST support a method for the | ||||
| application to learn the value of the random bytes included in any | ||||
| resulting connection. | ||||
| What follows is clarification on some corner cases: | ||||
| In the case of a simultaneous open where one or both SYN frames | ||||
| include the SYNACK Payload Processed flag, this specification is not | ||||
| in effect. The connection continues as usual. | ||||
| In the case of a frame carrying the SYNACK Payload Processed option | ||||
| and with both SYN and FIN flags set, the host MAY support this | ||||
| specification. In practice, many stacks will ignore a FIN flag and | ||||
| any payload in a SYN frame, in which case such a packet is no | ||||
| different from any other SYN frame. | ||||
| In the case that the MTU makes transmitting the larger SYNACKs | ||||
| problematic, the host MAY choose to fragment the packet or it MAY | ||||
| choose not to echo the SYNACK Payload Processed option, resulting in | ||||
| a smaller SYNACK frame. | ||||
| 8. Security Considerations | ||||
| Any payload in a SYNACK packet must be as frugal as possible since a | Any payload in a SYNACK packet must be as frugal as possible since a | |||
| host will be transmitting it to an unconfirmed address. If a 40 byte | host will be transmitting it to an unconfirmed address. If a 40 byte | |||
| frame could elicit a 1500 byte reply to an attacker controlled | frame could elicit a 1500 byte reply to an attacker controlled | |||
| address, this would be readily used to hide and amplify distributed | address, this would be readily used to hide and amplify distributed | |||
| denial of service attacks. | denial of service attacks. | |||
| Thus we specify a maximum size of 64 bytes for the payload. This is | Thus we specify a maximum size of 64 bytes for the payload. This is | |||
| sufficient to include a strong elliptic curve key (256 bits), a 64- | sufficient to include a strong elliptic curve key (256 bits), a 64- | |||
| bit nonce and a small amount of overhead (12 bytes). | bit nonce and a small amount of overhead (24 bytes). | |||
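The byte budget behind the 64-byte limit can be checked with a few lines of arithmetic. This is only an illustrative sketch: the constant and function names are assumptions made for this example and do not appear in the draft, and the overhead figure used is the 24 bytes quoted in the -01 text (the -00 text quoted 12 bytes, which would leave 12 bytes of the limit unused).

```python
# Illustrative byte budget for the 64-byte SYNACK payload limit.
# All names here are hypothetical; only the numbers come from the draft.

MAX_SYNACK_PAYLOAD = 64   # maximum payload size specified by the draft

KEY_BYTES = 256 // 8      # a 256-bit elliptic curve key
NONCE_BYTES = 64 // 8     # a 64-bit nonce
OVERHEAD_BYTES = 24       # overhead figure quoted in the -01 text


def payload_budget_used() -> int:
    """Return the number of payload bytes consumed by key, nonce and overhead."""
    return KEY_BYTES + NONCE_BYTES + OVERHEAD_BYTES


if __name__ == "__main__":
    used = payload_budget_used()
    print(f"{used} of {MAX_SYNACK_PAYLOAD} bytes used")
```

With the -01 figures the three components fill the 64-byte limit exactly.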
| 5. Implementation details | ||||
| Although the exact implementation details shouldn't be spelled out by | ||||
| this document, consideration must be given to them. | ||||
| Breaking the very common BSD sockets API by having applications get | ||||
| advance notice of connections so that they can specify the SYNACK | ||||
| payload (if any) would be painfully incongruent with current | ||||
| implementations. Thus it would be ideal if the SYNACK payload for a | ||||
| given, listening socket were constant; a constant value can be | ||||
| specified by a "setsockopt". | ||||
| However, for the specific motivating case here (cryptography), it's | ||||
| very helpful to include a nonce. One could consider using the SEQ | ||||
| and ACK numbers as nonces but the overloading is distasteful and they | ||||
| are quite short for secure nonces. So, at the risk of | ||||
| over-optimising for a specific case: implementations SHOULD allow | ||||
| applications to specify that the first 8 bytes of the SYNACK payload | ||||
| be replaced with a cryptographically strong nonce. | ||||
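The nonce substitution described above might look roughly like the following in user-space terms. This is a sketch under stated assumptions, not how any real TCP stack does it: the function name and the idea of a per-socket "template" payload are hypothetical, and a real implementation would perform this step inside the stack for each outgoing SYNACK. The 64-byte limit and the 8-byte nonce length are taken from the draft text.

```python
import os

# Hypothetical names; only the two sizes below come from the draft.
MAX_SYNACK_PAYLOAD = 64   # maximum SYNACK payload size in bytes
NONCE_BYTES = 8           # the first 8 bytes are replaced by the nonce


def with_fresh_nonce(template: bytes) -> bytes:
    """Return a copy of the constant payload with its first 8 bytes
    replaced by a fresh nonce from the OS random source."""
    if len(template) > MAX_SYNACK_PAYLOAD:
        raise ValueError("SYNACK payload exceeds 64 bytes")
    if len(template) < NONCE_BYTES:
        raise ValueError("payload too short to hold an 8-byte nonce")
    # os.urandom draws from the operating system's CSPRNG.
    return os.urandom(NONCE_BYTES) + template[NONCE_BYTES:]


if __name__ == "__main__":
    # 8 placeholder bytes followed by 32 bytes standing in for a public key.
    template = bytes(NONCE_BYTES) + b"K" * 32
    print(with_fresh_nonce(template).hex())
```

The constant part of the payload stays the same for every connection on the listening socket, which keeps the BSD sockets API unchanged apart from one option.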
| For the case where the key exchange material is carried in the SYN+ | ||||
| ACK frame, the public key thus has to be constant. This means that | ||||
| certain schemes which provide perfect forward secrecy are | ||||
| inapplicable and that implementors should be careful to use key | ||||
| exchange algorithms which are still secure under this model. | ||||
| 6. Comparison to T/TCP | 9. Comparison to T/TCP | |||
| The idea of including data in frames which also carry a SYN flag | The idea of including data in frames which also carry a SYN flag | |||
| isn't new: it was included in the experimental T/TCP RFCs 1379 | isn't new: it was included in the experimental T/TCP RFCs 1379 | |||
| [RFC1379] and 1644 [RFC1644]. T/TCP suffered because it broke the | [RFC1379] and 1644 [RFC1644]. T/TCP suffered because it broke the | |||
| assumption that the source address of a new connection from a | assumption that the source address of a new connection from a | |||
| passive-open socket had been verified by a 3-way handshake. This was | passive-open socket had been verified by a 3-way handshake. This was | |||
| a critical security issue for applications like RSH which often used | a critical security issue for applications like RSH which often used | |||
| source address whitelists. | source address whitelists. | |||
| This draft doesn't break any such assumptions that applications may | This draft doesn't break any such assumptions that applications may | |||
| be depending on. Source addresses for new connections are still | be depending on. Source addresses for new connections are still | |||
| validated by a 3-way handshake for passive-open sockets. | validated by a 3-way handshake for passive-open sockets. | |||
| Additionally, this draft is dramatically simpler than T/TCP: it | Additionally, this draft is dramatically simpler than T/TCP: it | |||
| doesn't introduce any additional TCP states nor does it deal with the | doesn't introduce any additional TCP states nor does it deal with the | |||
| complexity of including payloads in a SYN frame. Nor does this draft | complexity of including payloads in a SYN frame. Nor does this draft | |||
| apply to any application which is unaware of it since applications | apply to any application which is unaware of it since applications | |||
| are required to explicitly configure SYNACK payloads before they come | are required to explicitly configure SYNACK payloads before they come | |||
| into effect. | into effect. | |||
| 7. Middlebox Interactions | 10. Middlebox Interactions | |||
| The large number of middleboxes (firewalls, proxies, protocol | The large number of middleboxes (firewalls, proxies, protocol | |||
| scrubbers, etc) currently present in the Internet pose some | scrubbers, etc) currently present in the Internet pose some | |||
| difficulty for deploying new TCP options. Some firewalls may block | difficulty for deploying new TCP options. Some firewalls may block | |||
| segments that carry unknown options. For instance, if the flags | segments that carry unknown options. For instance, if the flags | |||
| option is not understood by a firewall, incoming SYNs advertising | option is not understood by a firewall, incoming SYNs advertising | |||
| SYNACK payload support may be dropped, preventing connection | SYNACK payload support may be dropped, preventing connection | |||
| establishment. This is similar to the ECN blackhole problem, where | establishment. This is similar to the ECN blackhole problem, where | |||
| certain faulty hosts and routers throw away packets with ECN bits set | certain faulty hosts and routers throw away packets with ECN bits set | |||
| [RFC3168]. Some recent results indicate that for new TCP options, | [RFC3168]. Some recent results indicate that for new TCP options, | |||
| this may not be a significant threat, with only 0.2% of web requests | this may not be a significant threat, with only 0.2% of web requests | |||
| failing when carrying an unknown option [transport-middlebox]. | failing when carrying an unknown option [transport-middlebox]. | |||
| 8. IANA Considerations | 11. IANA Considerations | |||
| This document requires IANA to create a new registry of flag option | ||||
| bits, currently containing a single entry: bit 0 is assigned to | ||||
| SYNACK Payload Permitted. | ||||
| This document requires IANA to update values in its registry of TCP | This document requires IANA to update values in its registry of TCP | |||
| option numbers to assign a new entry, referred to herein as | option numbers to assign a new entry, referred to herein as | |||
| "TBD-IANA-KIND1". | "TBD-IANA-KIND1". | |||
| 9. Acknowledgements | 12. Acknowledgements | |||
| Wesley Eddy kindly reviewed initial versions of this draft. | Wesley Eddy kindly reviewed initial versions of this draft. | |||
| 10. References | Joe Touch provided many helpful comments. | |||
| 10.1. Normative References | 13. References | |||
| 13.1. Normative References | ||||
| [RFC0793] Postel, J., "Transmission Control Protocol", STD 7, | [RFC0793] Postel, J., "Transmission Control Protocol", STD 7, | |||
| RFC 793, September 1981. | RFC 793, September 1981. | |||
| [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | |||
| Requirement Levels", BCP 14, RFC 2119, March 1997. | Requirement Levels", BCP 14, RFC 2119, March 1997. | |||
| 10.2. Informative References | 13.2. Informative References | |||
| [RFC1379] Braden, B., "Extending TCP for Transactions -- Concepts", | ||||
| RFC 1379, November 1992. | ||||
| [RFC1644] Braden, B., "T/TCP -- TCP Extensions for Transactions | ||||
| Functional Specification", RFC 1644, July 1994. | ||||
| [RFC2560] Myers, M., Ankney, R., Malpani, A., Galperin, S., and C. | ||||
| Adams, "X.509 Internet Public Key Infrastructure Online | ||||
| Certificate Status Protocol - OCSP", RFC 2560, June 1999. | ||||
| [RFC2817] Khare, R. and S. Lawrence, "Upgrading to TLS Within | ||||
| HTTP/1.1", RFC 2817, May 2000. | ||||
| [RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition | [RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition | |||
| of Explicit Congestion Notification (ECN) to IP", | of Explicit Congestion Notification (ECN) to IP", | |||
| RFC 3168, September 2001. | RFC 3168, September 2001. | |||
| [RFC4253] Ylonen, T. and C. Lonvick, "The Secure Shell (SSH) | [RFC4253] Ylonen, T. and C. Lonvick, "The Secure Shell (SSH) | |||
| Transport Layer Protocol", RFC 4253, January 2006. | Transport Layer Protocol", RFC 4253, January 2006. | |||
| [RFC4346] Dierks, T. and E. Rescorla, "The Transport Layer Security | [RFC4346] Dierks, T. and E. Rescorla, "The Transport Layer Security | |||
| (TLS) Protocol Version 1.1", RFC 4346, April 2006. | (TLS) Protocol Version 1.1", RFC 4346, April 2006. | |||
| [RFC1379] Braden, B., "Extending TCP for Transactions -- Concepts", | [curve25519] | |||
| RFC 1379, November 1992. | Bernstein, D., "Curve25519: new Diffie-Hellman speed | |||
| records". | ||||
| [RFC1644] Braden, B., "T/TCP -- TCP Extensions for Transactions | [salsa20] Bernstein, D., "Salsa20/8 and Salsa20/12". | |||
| Functional Specification", RFC 1644, July 1994. | ||||
| [transport-middlebox] | [transport-middlebox] | |||
| Medina, A., Allman, M., and S. Floyd, "Measuring | Medina, A., Allman, M., and S. Floyd, "Measuring | |||
| Interactions Between Transport Protocols and Middleboxes", | Interactions Between Transport Protocols and Middleboxes", | |||
| ACM SIGCOMM/USENIX Internet Measurement Conference, | ACM SIGCOMM/USENIX Internet Measurement Conference, | |||
| October 2004. | October 2004. | |||
| [rangecoding] | ||||
| Martin, G., "Range encoding: an algorithm for removing | ||||
| redundancy from a digitized message", Video and Data | ||||
| Recording Conference, July 1979. | ||||
| URIs | ||||
| [1] <http://www.ietf.org/internet-drafts/ | ||||
| draft-ietf-tcpm-tcp-auth-opt-01.txt> | ||||
| [2] <http://cr.yp.to/ecdh/reports.html> | ||||
| Appendix A. Changes | Appendix A. Changes | |||
| Author's Address | Author's Address | |||
| Adam Langley | Adam Langley | |||
| Google Inc | Google Inc | |||
| Email: agl@imperialviolet.org | Email: agl@imperialviolet.org | |||
| Full Copyright Statement | Full Copyright Statement | |||
| End of changes. 37 change blocks. | ||||
| 131 lines changed or deleted | 365 lines changed or added | |||