idnits 2.17.1

draft-ietf-tcpm-fastopen-02.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  ** The document is more than 15 pages and seems to lack a Table of Contents.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The abstract seems to contain references ([RFC2119]), which it
     shouldn't. Please replace those with straight textual mentions of the
     documents in question.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == Line 682 has weird spacing: '... by encrypt...'

  -- The document date (Octobor 22, 2012) is 4507 days in the past. Is this
     intentional?

  -- Found something which looks like a code comment -- if you have code
     sections in the document, please surround them with '<CODE BEGINS>' and
     '<CODE ENDS>' lines.
  Checking references for intended status: Experimental
  ----------------------------------------------------------------------------

  == Missing Reference: 'RFC2119' is mentioned on line 65, but not defined

  == Missing Reference: 'RFC3390' is mentioned on line 500, but not defined

  == Missing Reference: 'RFC 1323' is mentioned on line 693, but not defined

  ** Obsolete undefined reference: RFC 1323 (Obsoleted by RFC 7323)

  == Missing Reference: 'RCCJR11' is mentioned on line 741, but not defined

  == Unused Reference: 'HNRGHT11' is defined on line 906, but no explicit
     reference was found in the text

  == Unused Reference: 'LANGLEY06' is defined on line 911, but no explicit
     reference was found in the text

  == Unused Reference: 'QWGMSS11' is defined on line 927, but no explicit
     reference was found in the text

  ** Obsolete normative reference: RFC 793 (Obsoleted by RFC 9293)

  == Outdated reference: A later version (-08) exists of
     draft-ietf-tcpm-initcwnd-02

  -- Obsolete informational reference (is this intentional?): RFC 1644
     (Obsoleted by RFC 6247)

  -- Obsolete informational reference (is this intentional?): RFC 2140
     (Obsoleted by RFC 9040)

  -- Obsolete informational reference (is this intentional?): RFC 6013
     (Obsoleted by RFC 7805)

  == Outdated reference: A later version (-10) exists of
     draft-ietf-tcpm-fastopen-01

     Summary: 4 errors (**), 0 flaws (~~), 11 warnings (==), 5 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

2      Internet Draft                                          Y. Cheng
3      draft-ietf-tcpm-fastopen-02.txt                           J. Chu
4      Intended status: Experimental                   S. Radhakrishnan
5      Expiration date: April, 2013                             A. Jain
6                                                          Google, Inc.
7                                                      October 22, 2012

9                               TCP Fast Open

11     Status of this Memo

13     Distribution of this memo is unlimited.

15     This Internet-Draft is submitted in full conformance with the
16     provisions of BCP 78 and BCP 79.
18 Internet-Drafts are working documents of the Internet Engineering 19 Task Force (IETF), its areas, and its working groups. Note that other 20 groups may also distribute working documents as Internet-Drafts. 22 Internet-Drafts are draft documents valid for a maximum of six months 23 and may be updated, replaced, or obsoleted by other documents at any 24 time. It is inappropriate to use Internet-Drafts as reference 25 material or to cite them other than as "work in progress." 27 The list of current Internet-Drafts can be accessed at 28 http://www.ietf.org/1id-abstracts.html 30 The list of Internet-Draft Shadow Directories can be accessed at 31 http://www.ietf.org/shadow.html 33 This Internet-Draft will expire in August, 2012. 35 Copyright Notice 37 Copyright (c) 2012 IETF Trust and the persons identified as the 38 document authors. All rights reserved. 40 This document is subject to BCP 78 and the IETF Trust's Legal 41 Provisions Relating to IETF Documents 42 (http://trustee.ietf.org/license-info) in effect on the date of 43 publication of this document. Please review these documents 44 carefully, as they describe your rights and restrictions with respect 45 to this document. Code Components extracted from this document must 46 include Simplified BSD License text as described in Section 4.e of 47 the Trust Legal Provisions and are provided without warranty as 48 described in the Simplified BSD License. 50 Abstract 52 TCP Fast Open (TFO) allows data to be carried in the SYN and SYN-ACK 53 packets and consumed by the receiving end during the initial 54 connection handshake, thus saving up to one full round trip time 55 (RTT) compared to standard TCP which requires a three-way handshake 56 (3WHS) to complete before data can be exchanged. 58 Terminology 60 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 61 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 62 document are to be interpreted as described in RFC 2119 [RFC2119]. 
63 TFO refers to TCP Fast Open. Client refers to the TCP's active open 64 side and server refers to the TCP's passive open side. 66 1. Introduction 68 TCP Fast Open (TFO) enables data to be exchanged safely during TCP's 69 connection handshake. 71 This document describes a design that enables applications to save a 72 round trip while avoiding severe security ramifications. At the core 73 of TFO is a security cookie used by the server side to authenticate a 74 client initiating a TFO connection. This document covers the details 75 of exchanging data during TCP's initial handshake, the protocol for 76 TFO cookies, and potential new security vulnerabilities and their 77 mitigation. It also includes discussions of deployment issues and 78 related proposals. TFO requires extensions to the socket API but this 79 document does not cover that. 81 TFO is motivated by the performance needs of today's Web 82 applications. Network latency is determined by the round-trip time 83 (RTT) and the number of round trips required to transfer application 84 data. RTT consists of propagation delay and queuing delay. Network 85 bandwidth has grown substantially over the past two decades, reducing 86 queuing delay, while propagation delay is largely constrained by the 87 speed of light and has remained unchanged. Therefore reducing the 88 number of round trips has become the most effective way to improve 89 the latency of Web applications [CDCM11]. 91 Standard TCP only permits data exchange after 3WHS [RFC793], which 92 adds one RTT to the network latency. For short transfers (e.g., web 93 objects) this additional RTT is a significant portion of the network 94 latency [THK98]. One widely deployed solution is HTTP persistent 95 connections. However, this solution is limited since hosts and middle 96 boxes terminate idle TCP connections due to resource constraints. 
For 97 example, the Chrome browser keeps idle TCP connections open for up to 5 98 minutes, yet 35% of Chrome HTTP requests are made on new TCP 99 connections. We discuss HTTP persistent connections further in 100 section 7.1. 102 2. Data In SYN 104 [RFC793] (section 3.4) already allows data in SYN packets but forbids 105 the receiver from delivering the data to the application until 3WHS is 106 completed. This is because TCP's initial handshake serves to capture 107 1) old or duplicate SYNs and 2) SYNs with spoofed IP addresses. 109 TFO allows data to be delivered to the application before 3WHS is 110 completed, thus opening itself to a possible data integrity problem 111 caused by the problematic SYN packets above. This could cause a 112 problem in the following two examples: a) the receiver host receives 113 both duplicate and original SYNs before and after the host reboots, 114 and b) the duplicate is received after the connection created by the 115 original SYN has been closed. The receiver will not be protected by 116 the 2MSL TIMEWAIT state if the close is initiated by the sender. In 117 both cases, the data is replayed. 119 2.1. TCP Semantics and Duplicate SYNs 121 The proposed T/TCP protocol employs a new TCP "TAO" option and 122 connection count to guard against old or duplicate SYNs [RFC1644]. 123 The solution is complex, involving state tracking on a per remote 124 peer basis, and is vulnerable to IP spoofing attacks. Moreover, it 125 has been shown that despite its complexity, T/TCP is still not 126 entirely protected. Old or duplicate SYNs may still be accepted by a 127 T/TCP server [PHRACK98]. 129 Rather than trying to capture all dubious SYN packets to make TFO 130 100% compatible with TCP semantics, we made a design decision early 131 on to accept old SYN packets with data, i.e., to restrict TFO to use 132 with a class of applications that are tolerant of duplicate SYN 133 packets with data.
We believe this is the right design trade-off, 134 balancing complexity with usefulness. Applications that require 135 transactional semantics already deploy specific mechanisms to 136 tolerate similar data replay issues in TCP today. For example, a 137 browser reload event may replay any HTTP request even without data in 138 SYN. For transactional HTTP requests, applications typically include 139 unique identifiers in the HTTP headers. Thus, allowing data in SYN 140 poses little risk to existing HTTP applications. 142 However, we note that some applications may rely on TCP 3-way 143 handshake semantics. For this reason, TFO MUST be used explicitly by 144 applications on a per service port basis. 146 2.2. SYNs with spoofed IP addresses 148 Standard TCP suffers from the SYN flood attack [RFC4987] because 149 bogus SYN packets, i.e., SYN packets with spoofed source IP addresses, 150 can easily fill up a listener's small queue, causing a service port 151 to be blocked completely until timeouts. Secondary damage comes from 152 these SYN requests taking up memory space, though this is less of an 153 issue today because servers typically have plenty of memory. 155 TFO goes one step further to allow server side TCP to process and 156 send up data to the application layer before 3WHS is completed. This 157 opens up more serious new vulnerabilities. Applications serving ports 158 that have TFO enabled may waste significant CPU and memory resources 159 processing the requests and producing the responses. If the response 160 is much larger than the request, the attacker can mount an amplified 161 reflection attack against victims of choice beyond the TFO server 162 itself. 164 Numerous mitigation techniques against the regular SYN flood attack 165 exist and have been well documented [RFC4987]. Unfortunately, none are 166 applicable to TFO. We propose a server supplied cookie to mitigate 167 most of the security issues introduced by TFO.
We defer further 168 discussion of SYN flood attacks to the "Security Considerations" 169 section. 171 3. Protocol Overview 173 The key component of TFO is the Fast Open Cookie (cookie), a message 174 authentication code (MAC) tag generated by the server. The client 175 requests a cookie in one regular TCP connection, then uses it for 176 future TCP connections to exchange data during 3WHS: 177 Requesting a Fast Open Cookie: 178 1. The client sends a SYN with a Fast Open Cookie Request option. 180 2. The server generates a cookie and sends it through the Fast Open 181 Cookie option of a SYN-ACK packet. 183 3. The client caches the cookie for future TCP Fast Open connections 184 (see below). 186 Performing TCP Fast Open: 188 1. The client sends a SYN with Fast Open Cookie option and data. 190 2. The server validates the cookie: 191 a. If the cookie is valid, the server sends a SYN-ACK 192 acknowledging both the SYN and the data. The server then 193 delivers the data to the application. 195 b. Otherwise, the server drops the data and sends a SYN-ACK 196 acknowledging only the SYN sequence number. 198 3. If the server accepts the data in the SYN packet, it may send the 199 response data before the handshake finishes. The max amount is 200 governed by the TCP's congestion control [RFC5681]. 202 4. The client sends an ACK acknowledging the SYN and the server data. 203 If the client's data is not acknowledged, the client retransmits 204 the data in the ACK packet. 206 5. The rest of the connection proceeds like a normal TCP connection. 207 The client can repeat many Fast Open operations once it acquires a 208 cookie (until the cookie is expired by the server). Thus TFO is 209 useful for applications that have temporal locality on client and 210 server connections. 
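The acknowledgment rule in step 2 of the Fast Open exchange above, and the client's fallback in step 4, can be illustrated with a small sketch. It is not part of the protocol text: the function names are hypothetical, and the sketch assumes only that the SYN consumes one sequence number and that sequence numbers wrap modulo 2^32, as in standard TCP.

```python
SEQ_MOD = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around


def syn_ack_number(client_isn: int, data_len: int, cookie_valid: bool) -> int:
    """Acknowledgment number the server places in its SYN-ACK (hypothetical helper)."""
    ack = client_isn + 1      # the SYN itself consumes one sequence number
    if cookie_valid:
        ack += data_len       # valid cookie: the data in the SYN is accepted too
    return ack % SEQ_MOD


def data_to_resend(client_isn: int, syn_data: bytes, ack: int) -> bytes:
    """SYN data the client still has to retransmit in its ACK packet, if any."""
    acked = (ack - client_isn - 1) % SEQ_MOD  # data bytes covered by the SYN-ACK
    return syn_data[acked:]
```

With a valid cookie, a 5-byte request sent with initial sequence number 1000 is acknowledged with 1006 and nothing needs to be resent; with an invalid cookie the acknowledgment is 1001 and all 5 bytes are sent again in the ACK packet.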
212    Requesting Fast Open Cookie in connection 1:

214       TCP A (Client)                                  TCP B (Server)
215       ______________                                  ______________
216       CLOSED                                                  LISTEN

218    #1 SYN-SENT       ----- ---------->                      SYN-RCVD

220    #2 ESTABLISHED    <---- ----------                       SYN-RCVD
221       (caches cookie C)

223    Performing TCP Fast Open in connection 2:

225       TCP A (Client)                                  TCP B (Server)
226       ______________                                  ______________
227       CLOSED                                                  LISTEN

229    #1 SYN-SENT       ----- ---->                            SYN-RCVD

231    #2 ESTABLISHED    <---- ----                             SYN-RCVD

233    #3 ESTABLISHED    <---- ----                             SYN-RCVD

235    #4 ESTABLISHED    ----- -------------------->         ESTABLISHED

237    #5 ESTABLISHED    --- ---------->                     ESTABLISHED

239    4. Protocol Details

241    4.1. Fast Open Cookie

243    The Fast Open Cookie is designed to mitigate new security
244    vulnerabilities in order to enable data exchange during handshake.
245    The cookie is a message authentication code tag generated by the
246    server and is opaque to the client; the client simply caches the
247    cookie and passes it back on subsequent SYN packets to open new
248    connections. The server can expire the cookie at any time to enhance
249    security.

251    4.1.1. TCP Options

253    Fast Open Cookie Option

255    The server uses this option to grant a cookie to the client in the
256    SYN-ACK packet; the client uses it to pass the cookie back to the
257    server in the SYN packet.

259                                    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
260                                    |      Kind     |    Length     |
261    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
262    |                                                               |
263    ~                            Cookie                             ~
264    |                                                               |
265    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

267    Kind            1 byte: constant TBD (assigned by IANA)
268    Length          1 byte: range 6 to 18 (bytes); limited by
269                            remaining space in the options field.
270                            The number MUST be even.
271    Cookie          4 to 16 bytes (Length - 2)

273    Options with invalid Length values or without SYN flag set MUST be
274    ignored. The minimum Cookie size is 4 bytes. Although the diagram
275    shows a cookie aligned on 32-bit boundaries, alignment is not
276    required.
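The layout and validity rules of the Fast Open Cookie option can be sketched as an encoder and a parser. This is an illustration under stated assumptions rather than a reference implementation: the function names are hypothetical, and TFO_KIND is a placeholder chosen for this sketch because the Kind value is still TBD in the text above.

```python
from typing import Optional

TFO_KIND = 254  # placeholder only; the draft leaves Kind as TBD (assigned by IANA)


def encode_cookie_option(cookie: bytes) -> bytes:
    """Build Kind | Length | Cookie for a 4 to 16 byte, even-sized cookie."""
    if not 4 <= len(cookie) <= 16 or len(cookie) % 2:
        raise ValueError("cookie must be 4 to 16 bytes and of even length")
    return bytes([TFO_KIND, len(cookie) + 2]) + cookie


def parse_cookie_option(option: bytes, syn_flag: bool) -> Optional[bytes]:
    """Return the cookie, or None when the option has to be ignored."""
    if not syn_flag or len(option) < 2:
        return None                        # only meaningful on segments with SYN set
    kind, length = option[0], option[1]
    if kind != TFO_KIND:
        return None
    if length < 6 or length > 18 or length % 2 or length != len(option):
        return None                        # invalid Length values are ignored
    return option[2:length]                # Cookie is Length - 2 bytes
```

An 8-byte cookie therefore produces a 10-byte option, and a parser that sees an odd Length or a segment without the SYN flag simply ignores the option.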
278    Fast Open Cookie Request Option

280    The client uses this option in the SYN packet to request a cookie
281    from a TFO-enabled server.

283                                    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
284                                    |      Kind     |    Length     |
285                                    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
286    Kind            1 byte: same as the Fast Open Cookie option
287    Length          1 byte: constant 2. This distinguishes the option
288                            from the Fast Open cookie option.

290    Options with invalid Length values, without the SYN flag set, or with
291    the ACK flag set MUST be ignored.

293    4.1.2. Server Cookie Handling

295    The server is in charge of cookie generation and authentication. The
296    cookie SHOULD be a message authentication code tag with the following
297    properties:

299    1. The cookie authenticates the client's (source) IP address of the
300       SYN packet. The IP address can be an IPv4 or IPv6 address.

302    2. The cookie can only be generated by the server and cannot be
303       fabricated by any other parties including the client.

305    3. The generation and verification are fast relative to the rest of
306       SYN and SYN-ACK processing.

308    4. A server may encode other information in the cookie, and accept
309       more than one valid cookie per client at any given time. But this
310       is all server implementation dependent and transparent to the
311       client.

313    5. The cookie expires after a certain amount of time. The reason for
314       cookie expiration is detailed in the "Security Considerations"
315       section. This can be done by either periodically changing the
316       server key used to generate cookies or including a timestamp when
317       generating the cookie.

319       To gradually invalidate cookies over time, the server can
320       implement key rotation to generate and verify cookies using
321       multiple keys. This approach lets large-scale servers keep Fast
322       Open working across rolling key updates. We do not specify a
323       particular mechanism because the implementation is often server
324       specific.
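The properties listed above, including gradual invalidation through key rotation, can be sketched as follows. This is one possible shape, not the mechanism the document prescribes: HMAC-SHA-256 from the Python standard library stands in for the MAC, the 8-byte cookie length is an arbitrary choice within the 4 to 16 byte option limit, and the class and method names are hypothetical.

```python
import hashlib
import hmac
import ipaddress

COOKIE_LEN = 8  # assumption of this sketch; any even size from 4 to 16 bytes fits the option


def make_cookie(key: bytes, client_ip: str) -> bytes:
    """MAC over the packed IPv4 or IPv6 source address, truncated to COOKIE_LEN."""
    packed = ipaddress.ip_address(client_ip).packed
    return hmac.new(key, packed, hashlib.sha256).digest()[:COOKIE_LEN]


class CookieKeys:
    """Keeps the newest key first; older keys stay valid until rotated out."""

    def __init__(self, current_key: bytes, max_keys: int = 2) -> None:
        self.keys = [current_key]
        self.max_keys = max_keys

    def rotate(self, new_key: bytes) -> None:
        self.keys.insert(0, new_key)
        del self.keys[self.max_keys:]      # cookies made with dropped keys expire

    def get_cookie(self, client_ip: str) -> bytes:
        return make_cookie(self.keys[0], client_ip)

    def is_cookie_valid(self, client_ip: str, cookie: bytes) -> bool:
        # Regenerate the cookie under every live key and compare in constant time.
        return any(
            hmac.compare_digest(make_cookie(key, client_ip), cookie)
            for key in self.keys
        )
```

With max_keys set to 2, a cookie survives one rotation and is rejected after the second, which gives clients a full key period to pick up a fresh cookie.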
326 The server supports the following cookie generation and verification 327 operations: 329 - GetCookie(IP_Address): returns a (new) cookie 331 - IsCookieValid(IP_Address, Cookie): checks if the cookie is valid, 332 i.e., it has not expired and it authenticates the client IP address. 334 Example Implementation: a simple implementation is to use AES_128 to 335 encrypt the IPv4 (with padding) or IPv6 address and truncate the result to 64 336 bits. The server can periodically update the key to expire the 337 cookies. AES encryption on recent processors is fast and takes only a 338 few hundred nanoseconds [RCCJB11]. 340 If only one valid cookie is allowed per client and the server can 341 regenerate the cookie independently, the best validation process is 342 to simply regenerate a valid cookie and compare it against the 343 incoming cookie. In that case, if the incoming cookie fails the check, 344 a valid cookie is readily available to be sent to the client. 346 The server MAY return a cookie request option, e.g., a null cookie, 347 to signal the support of Fast Open without generating cookies, for 348 probing or debugging purposes. 350 4.1.3. Client Cookie Handling 352 The client MUST cache cookies from servers for later Fast Open 353 connections. For a multi-homed client, the cookies are both client 354 and server IP dependent. Besides the cookie, we RECOMMEND that the 355 client cache the MSS and RTT to the server to enhance performance. 357 The MSS advertised by the server is stored in the cache to determine 358 the maximum amount of data that can be supported in the SYN packet. 359 This information is needed because data is sent before the server 360 announces its MSS in the SYN-ACK packet. Without this information, 361 the data size in the SYN packet is limited to the default MSS of 536 362 bytes [RFC1122]. The client SHOULD update the cached MSS value 363 whenever it discovers a new MSS value, e.g., through path MTU 364 discovery.
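A client-side cache along these lines might look like the sketch below. The class, field, and method names are hypothetical; the only facts taken from the text are that entries depend on both the client and the server IP address, that the MSS and RTT are cached next to the cookie, and that 536 bytes is the limit when no MSS is known.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

DEFAULT_MSS = 536  # bytes; the limit on SYN data when no MSS is cached [RFC1122]


@dataclass
class CacheEntry:
    cookie: bytes
    mss: Optional[int] = None      # MSS last advertised by the server
    rtt: Optional[float] = None    # last measured RTT, in seconds


class CookieCache:
    """Cookie cache keyed by (local IP, server IP) for multi-homed clients."""

    def __init__(self) -> None:
        self._entries: Dict[Tuple[str, str], CacheEntry] = {}

    def store(self, local_ip: str, server_ip: str, cookie: bytes,
              mss: Optional[int] = None, rtt: Optional[float] = None) -> None:
        self._entries[(local_ip, server_ip)] = CacheEntry(cookie, mss, rtt)

    def lookup(self, local_ip: str, server_ip: str) -> Optional[CacheEntry]:
        return self._entries.get((local_ip, server_ip))

    def update_mss(self, local_ip: str, server_ip: str, mss: int) -> None:
        entry = self.lookup(local_ip, server_ip)
        if entry is not None:
            entry.mss = mss        # e.g. after path MTU discovery

    def max_syn_data(self, local_ip: str, server_ip: str) -> int:
        entry = self.lookup(local_ip, server_ip)
        if entry is None or entry.mss is None:
            return DEFAULT_MSS
        return entry.mss
```

The sketch ignores the space taken by TCP options when sizing the SYN payload; a real stack would subtract it.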
366 Caching RTT allows seeding a more accurate SYN timeout than the 367 default value [RFC6298]. This lowers the performance penalty if the 368 network or the server drops the SYN packets with data or the cookie 369 options (See "Reliability and Deployment Issues" section below). 371 The cache replacement algorithm is not specified and is left for the 372 implementations. 374 Note that before TFO sees wide deployment, clients are advised to 375 also cache negative responses from servers in order to reduce the 376 amount of futile TFO attempts. Since TFO is enabled on a per-service 377 port basis but cookies are independent of service ports, clients' 378 cache should include remote port numbers too. 380 4.2. Fast Open Protocol 382 One predominant requirement of TFO is to be fully compatible with 383 existing TCP implementations, both on the client and the server 384 sides. 386 The server keeps two variables per listening port: 388 FastOpenEnabled: default is off. It MUST be turned on explicitly by 389 the application. When this flag is off, the server does not perform 390 any TFO related operations and MUST ignore all cookie options. 392 PendingFastOpenRequests: tracks number of TFO connections in SYN-RCVD 393 state. If this variable goes over a preset system limit, the server 394 SHOULD disable TFO for all new connection requests until 395 PendingFastOpenRequests drops below the system limit. This variable 396 is used for defending some vulnerabilities discussed in the "Security 397 Considerations" section. 399 The server keeps a FastOpened flag per TCB to mark if a connection 400 has successfully performed a TFO. 402 4.2.1. Fast Open Cookie Request 404 Any client attempting TFO MUST first request a cookie from the server 405 with the following steps: 407 1. The client sends a SYN packet with a Fast Open Cookie Request 408 option. 410 2. The server SHOULD respond with a SYN-ACK based on the procedures 411 in the "Server Cookie Handling" section. 
This SYN-ACK SHOULD 412 contain a Fast Open Cookie option if the server currently supports 413 TFO for this listener port. 415 3. If the SYN-ACK contains a Fast Open Cookie option, the client 416 replaces the cookie and other information as described in the 417 "Client Cookie Handling" section. Otherwise, if the SYN-ACK is 418 first seen, i.e., not a (spurious) retransmission, the client MAY 419 remove the server information from the cookie cache. If the SYN- 420 ACK is a spurious retransmission without a valid Fast Open Cookie 421 Option, the client does nothing to the cookie cache for the 422 reasons below. 424 The network or servers may drop the SYN or SYN-ACK packets with the 425 new cookie options, which causes SYN or SYN-ACK timeouts. We RECOMMEND 426 that both the client and the server retransmit SYN and SYN-ACK without the 427 cookie options on timeouts. This ensures the connections of cookie 428 requests will go through and lowers the latency penalties (of dropped 429 SYN/SYN-ACK packets). The obvious downside for maximum compatibility 430 is that any regular SYN drop will fail the cookie (although one can 431 argue the delay in the data transmission till after 3WHS is justified 432 if the SYN drop is due to network congestion). The next section 433 describes a heuristic to detect such drops when the client receives 434 the SYN-ACK. 436 We also RECOMMEND that the client record servers that failed to respond 437 to cookie requests and only attempt another cookie request after 438 a certain period. An alternate proposal is to request the cookie in the FIN 439 instead, since a FIN drop by an incompatible middle-box does not affect 440 latency. However, such paths are likely to drop SYN packets with data 441 later, and many applications close the connections with RST instead, 442 so the actual benefit of this approach is not clear. 444 4.2.2.
TCP Fast Open 446 Once the client obtains the cookie from the target server, the client 447 can perform subsequent TFO connections until the cookie is expired by 448 the server. The nature of TCP sequencing makes the TFO specific 449 changes relatively small in addition to [RFC793]. 451 Client: Sending SYN 453 To open a TFO connection, the client MUST have obtained the cookie 454 from the server: 456 1. Send a SYN packet. 458 a. If the SYN packet does not have enough option space for the 459 Fast Open Cookie option, abort TFO and fall back to regular 3WHS. 461 b. Otherwise, include the Fast Open Cookie option with the cookie 462 of the server. Include any data up to the cached server MSS or 463 default 536 bytes. 465 2. Advance to SYN-SENT state and update SND.NXT to include the data 466 accordingly. 468 3. If RTT is available from the cache, seed SYN timer according to 469 [RFC6298]. 471 To deal with network or servers dropping SYN packets with payload or 472 unknown options, when the SYN timer fires, the client SHOULD 473 retransmit a SYN packet without data and Fast Open Cookie options. 475 Server: Receiving SYN and responding with SYN-ACK 477 Upon receiving the SYN packet with Fast Open Cookie option: 479 1. Initialize and reset a local FastOpened flag. If FastOpenEnabled 480 is false, go to step 5. 482 2. If PendingFastOpenRequests is over the system limit, go to step 5. 484 3. If IsCookieValid() in section 4.1.2 returns false, go to step 5. 486 4. Buffer the data and notify the application. Set FastOpened flag 487 and increment PendingFastOpenRequests. 489 5. Send the SYN-ACK packet. The packet MAY include a Fast Open 490 Option. If FastOpened flag is set, the packet acknowledges the SYN 491 and data sequence. Otherwise it acknowledges only the SYN 492 sequence. The server MAY include data in the SYN-ACK packet if the 493 response data is readily available. 
Some applications may favor 494 delaying the SYN-ACK, allowing the application to process the 495 request in order to produce a response, but this is left to the 496 implementation. 498 6. Advance to the SYN-RCVD state. If the FastOpened flag is set, the 499 server MUST follow the congestion control [RFC5681], in particular 500 the initial congestion window [RFC3390], to send more data 501 packets. 503 If the SYN-ACK timer fires, the server SHOULD retransmit a SYN-ACK 504 segment with neither data nor Fast Open Cookie options for 505 compatibility reasons. 507 Client: Receiving SYN-ACK 509 The client SHOULD perform the following steps upon receiving the SYN- 510 ACK: 511 1. Update the cookie cache if the SYN-ACK has a Fast Open Cookie 512 Option or MSS option or both. 514 2. Send an ACK packet. Set the acknowledgment number to RCV.NXT and 515 include the data after SND.UNA if data is available. 517 3. Advance to the ESTABLISHED state. 519 Note that there is no latency penalty if the server does not acknowledge 520 the data in the original SYN packet. The client SHOULD retransmit any 521 unacknowledged data in the first ACK packet in step 2. The data 522 exchange will start after the handshake like a regular TCP 523 connection. 525 If the client has timed out and retransmitted only regular SYN 526 packets, it can heuristically detect paths that intentionally drop 527 SYNs carrying the Fast Open option or data. If the SYN-ACK acknowledges only 528 the initial sequence and does not carry a Fast Open cookie option, 529 presumably it is triggered by a retransmitted (regular) SYN and the 530 original SYN or the corresponding SYN-ACK was lost. 532 Server: Receiving ACK 534 Upon receiving an ACK acknowledging the SYN sequence, the server 535 decrements PendingFastOpenRequests and advances to the ESTABLISHED 536 state. No further special handling is required. 538 5.
Reliability and Deployment Issues 540 Network or Hosts Dropping SYN packets with data or unknown options 542 A study [MAF04] found that some middle-boxes and end-hosts may incorrectly drop 543 packets with unknown TCP options. Studies [LANGLEY06, 544 HNRGHT11] both found that 6% of the probed paths on the Internet drop 545 SYN packets with data or with unknown TCP options. The TFO protocol 546 deals with this problem by retransmitting the SYN without data or cookie 547 options, and we recommend tracking these servers in the client. 549 Server Farms 551 A common server-farm setup is to have many physical hosts behind a 552 load-balancer sharing the same server IP. The load-balancer forwards 553 new TCP connections to different physical hosts based on certain 554 load-balancing algorithms. For TFO to work, the physical hosts need 555 to share the same key and update the key at about the same time. 557 Network Address Translation (NAT) 559 Hosts behind a NAT that share the same IP address will get the same cookie 560 for the same server. This will not prevent TFO from working. But on 561 some carrier-grade NAT configurations where every new TCP connection 562 from the same physical host uses a different public IP address, TFO 563 does not provide any latency benefit. However, there is no performance 564 penalty either, as described in Section "Client: Receiving SYN-ACK". 566 6. Security Considerations 568 The Fast Open cookie stops an attacker from trivially flooding 569 spoofed SYN packets with data to burn server resources or to mount an 570 amplified reflection attack on random hosts. The server can defend 571 against spoofed SYN floods with invalid cookies using existing 572 techniques [RFC4987]. We note that generating bogus cookies is 573 usually cheaper than validating them. But the additional cost of 574 validating the cookies, inherent to any authentication scheme, may 575 not be substantial compared to processing a regular SYN packet.
577 However, the attacker may still obtain cookies from some compromised 578 hosts, then flood spoofed SYN with data and "valid" cookies (from 579 these hosts or other vantage points). With DHCP, it's possible to 580 obtain cookies of past IP addresses without compromising any host. 581 Below we identify new vulnerabilities of TFO and describe the 582 countermeasures. 584 6.1. Server Resource Exhaustion Attack by SYN Flood with Valid Cookies 586 Like regular TCP handshakes, TFO is vulnerable to such an attack. But 587 the potential damage can be much more severe. Besides causing 588 temporary disruption to service ports under attack, it may exhaust 589 server CPU and memory resources. 591 For this reason it is crucial for the TFO server to limit the maximum 592 number of total pending TFO connection requests, i.e., 593 PendingFastOpenRequests. When the limit is exceeded, the server 594 temporarily disables TFO entirely as described in "Server Cookie 595 Handling". Then subsequent TFO requests will be downgraded to regular 596 connection requests, i.e., with the data dropped and only SYN 597 acknowledged. This allows regular SYN flood defense techniques 598 [RFC4987] like SYN-cookies to kick in and prevent further service 599 disruption. 601 There are other subtle but important differences in the vulnerability 602 between TFO and regular TCP handshake. Before the SYN flood attack 603 broke out in the late '90s, typical listener's max qlen was small, 604 enough to sustain the highest expected new connection rate and the 605 average RTT for the SYN-ACK packets to be acknowledged in time. E.g., 606 if a server is designed to handle at most 100 connection requests per 607 second, and the average RTT is 100ms, a max qlen on the order of 10 608 will be sufficient. 
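The two limits discussed in this section can be sketched together. The queue estimate is the arithmetic from the example above (connection rate multiplied by RTT), and the listener shows one way to apply the PendingFastOpenRequests limit. All names are hypothetical, and treating "over the limit" as "at or above the limit" is a choice made for this sketch so that the counter never exceeds the limit.

```python
def estimated_max_qlen(conn_per_sec: int, rtt_ms: int) -> int:
    """Connections expected to sit in SYN-RCVD at once: rate times RTT, rounded up."""
    return (conn_per_sec * rtt_ms + 999) // 1000


class TfoListener:
    """Tracks pending TFO requests for one listening port."""

    def __init__(self, pending_limit: int) -> None:
        self.pending_limit = pending_limit
        self.pending_fast_open_requests = 0

    def admit_fast_open(self) -> bool:
        """Call for a SYN with a valid cookie; False means fall back to regular 3WHS."""
        if self.pending_fast_open_requests >= self.pending_limit:
            return False           # data is dropped and only the SYN is acknowledged
        self.pending_fast_open_requests += 1
        return True

    def handshake_completed(self) -> None:
        """Call when the ACK completing a TFO handshake arrives."""
        if self.pending_fast_open_requests > 0:
            self.pending_fast_open_requests -= 1
```

For the figures in the text, 100 requests per second at a 100 ms RTT give an estimate of 10, which matches the "on the order of 10" stated above.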
610 This small max qlen made it very easy for any attacker, even one equipped 611 with just a dial-up modem to the Internet, to cause major disruptions 612 to a web site by simply throwing a handful of "SYN bombs" at a 613 victim of choice. But for this attack scheme to work, the attacker 614 must pick a non-responsive source IP address to spoof. Otherwise, 615 the SYN-ACK packet will trigger a TCP RST from the host whose IP 616 address has been spoofed, causing the corresponding connection to be 617 removed from the server's listener queue, hence defeating the attack. 619 In other words, the main damage of SYN bombs against the standard TCP 620 stack is not directly from the bombs themselves costing TCP 621 processing overhead or host memory, but rather from the spoofed SYN 622 packets filling up the listener's often small queue. 624 On the other hand, TFO SYN bombs can cause damage directly if 625 admitted without limit into the stack. The RST packets from the 626 spoofed host will fuel rather than defeat the SYN bombs as compared 627 to the non-TFO case, because the attacker can flood more SYNs with 628 data to cost more data processing resources. For this reason, a TFO 629 server needs to monitor the connections in SYN-RCVD being reset in 630 addition to imposing a reasonable max qlen. Implementations may 631 combine the two, e.g., by continuing to account for those connection 632 requests that have just been reset against the listener's 633 PendingFastOpenRequests until a timeout period has passed. 635 Limiting the maximum number of pending TFO connection requests does 636 make it easy for an attacker to overflow the queue, causing TFO to be 637 disabled. We argue that causing TFO to be disabled is unlikely to be 638 of interest to attackers because the service will remain intact 639 without TFO; hence there is hardly any real damage. 641 6.2.
Amplified Reflection Attack to Random Host 643 Limiting PendingFastOpenRequests with a system limit can be done 644 without Fast Open Cookies and would protect the server from resource 645 exhaustion. It would also limit how much damage an attacker can cause 646 through an amplified reflection attack from that server. However, it 647 would still be vulnerable to an amplified reflection attack from a 648 large number of servers. An attacker can easily cause damage by 649 tricking many servers to respond with data packets at once to any 650 spoofed victim IP address of choice. 652 With the use of Fast Open Cookies, the attacker would first have to 653 steal a valid cookie from its target victim. This likely requires the 654 attacker to compromise the victim host or network first. 656 The attacker here has little interest in mounting an attack on the 657 victim host that has already been compromised. But she may be 658 motivated to disrupt the victim's network. Since a stolen cookie is 659 only valid for a single server, she has to steal valid cookies from a 660 large number of servers and use them before they expire to cause 661 sufficient damage without triggering the defense in the previous 662 section. 664 One can argue that if the attacker has compromised the target network 665 or hosts, she could perform a similar but simpler attack by injecting 666 bits directly. The degree of damage will be identical, but TFO- 667 specific attack allows the attacker to remain anonymous and disguises 668 the attack as from other servers. 670 The best defense is for the server not to respond with data until 671 handshake finishes. In this case the risk of amplification reflection 672 attack is completely eliminated. But the potential latency saving 673 from TFO may diminish if the server application produces responses 674 earlier before the handshake completes. 
676 6.3. Attacks from behind Shared Public IPs (NATs)

678    An attacker behind a NAT can easily obtain valid cookies to launch the
679    above attack to hurt other clients that share the path. [BOB12]
680    suggested that the server can extend cookie generation to include the
681    TCP timestamp, i.e., GetCookie(IP_Address, Timestamp), and implement it
682    by encrypting the concatenation of the two values to generate the
683    cookie. The client stores both the cookie and its corresponding
684    timestamp, and echoes both in the SYN. The server then implements
685    IsCookieValid(IP_Address, Timestamp, Cookie) by encrypting the IP and
686    timestamp data and comparing the result with the cookie value.

688    This enables the server to issue different cookies to clients that
689    share the same IP address, and hence to selectively discard the
690    cookies misused by the attacker. However, the attacker can simply
691    repeat the attack with new cookies. The server would eventually need
692    to throttle all requests from the IP address just like the current
693    approach. Moreover, this approach requires modifying [RFC 1323] to
694    send a non-zero Timestamp Echo Reply in the SYN, potentially causing
695    firewall issues. Therefore we believe the benefit may not outweigh the
696    drawbacks.

698 7. Web Performance

700 7.1. HTTP persistent connection

702    TCP connection setup overhead has long been identified as a
703    performance bottleneck for web applications [THK98]. HTTP persistent
704    connection was proposed to mitigate this issue and has been widely
705    deployed. However, [RCCJB11][AERG11] show that the average number of
706    transactions per connection is between 2 and 4, based on large-scale
707    measurements from both servers and clients. In these studies, the
708    servers and clients both kept idle connections open for up to several
709    minutes, well into the human think time.

711    Can the utilization rate increase by keeping connections even longer?
712    Unfortunately, this is problematic due to middle-boxes and the rapidly
713    growing number of mobile end hosts. One major issue is NAT. Studies
715    [HNESSK10][MQXMZ11] show that the majority of home routers and ISPs
716    fail to meet the 124-minute idle timeout mandated in [RFC5382].
717    In [MQXMZ11], 35% of mobile ISPs time out idle connections within 30
718    minutes. NAT boxes do not possess a reliable mechanism to notify end
719    hosts when idle connections are removed from local tables, either due
720    to resource constraints such as mapping table size, memory, or lookup
721    overhead, or due to the limited port number and IP address space.
722    Moreover, unmapped packets received by NAT boxes are often dropped
723    silently. (TCP RST is not required by RFC5382.) End hosts
724    attempting to use these broken connections are often forced to wait
725    for a lengthy TCP timeout. Thus the browser risks a large performance
726    penalty when keeping idle connections open. To circumvent this
727    problem, some applications send frequent TCP keep-alive probes.
728    However, this technique drains power on mobile devices [MQXMZ11]. In
729    fact, power has become such a prominent issue in modern LTE devices
730    that mobile browsers close HTTP connections within seconds or even
731    immediately [SOUDERS11].

733    Idle connections also consume more memory resources. Due to the
734    complexity of today's web applications, the application layer often
735    needs orders of magnitude more memory than the TCP connection
736    footprint. As a result, servers need to implement advanced resource
737    management in order to support a large number of idle connections.

739 7.2. Case Study: Chrome Browser

741    [RCCJB11] studied Chrome browser performance based on 28 days of
742    global statistics. The Chrome browser keeps idle HTTP persistent
743    connections for up to 5 to 10 minutes. However, the average number of
744    transactions per connection is only 3.3.
       Due to the low utilization,
745    TCP 3WHS accounts for up to 25% of the HTTP transaction network latency.
746    The authors tested a Linux TFO implementation with a TFO-enabled Chrome
747    browser on popular web sites in emulated environments such as
748    residential broadband and mobile networks. They showed that TFO
749    improves page load time by 10% to 40%. More details on the design
750    tradeoffs and measurements can be found in [RCCJB11].

752 8. TFO's Applicability

754    TFO aims at latency-conscious applications that are sensitive to
755    TCP's initial connection setup delay. These application protocols
756    often employ short-lived TCP connections, or employ long-lived
757    connections but are more sensitive to the connection setup delay due
758    to, e.g., a stricter connection fail-over requirement.

760    Only transaction-type applications where RTT constitutes a
761    significant portion of the total end-to-end latency will likely
762    benefit from TFO. Moreover, the client request must fit in the SYN
763    packet. Otherwise there may not be any saving in the total number of
764    round trips required to complete a transaction.

766    To the extent possible, application protocols SHOULD employ long-
767    lived connections to best take advantage of TCP's built-in congestion
768    control algorithm, and to reduce the impact of TCP's connection
769    setup overhead. E.g., for web applications, P-HTTP will likely
770    help and is much easier to deploy, hence should be attempted first.
771    TFO will likely provide further latency reduction on top of P-HTTP.
772    But the additional benefit will depend on how much persistency one
773    can get from HTTP in a given operating environment.

775    One alternative to short-lived TCP connections might be UDP, which is
776    connectionless, hence doesn't inflict any connection setup delay, and
777    is best suited for application protocols that are transactional.
778    Practical deployment issues such as middle-box and/or firewall
779    traversal may severely limit the use of UDP-based application
780    protocols though.

782    Note that when the application employs too many short-lived
783    connections, it may negatively impact network stability, as these
784    connections often exit before TCP's congestion control algorithm
785    kicks in. Implementations supporting a large number of short-lived
786    connections should employ temporal sharing of TCB data as described
787    in [RFC2140].

789    More discussion on TCP Fast Open and its projected performance
790    benefit can be found in [RCCJB11].

792 9. Related Work

794 9.1. T/TCP

796    TCP Extensions for Transactions [RFC1644] attempted to bypass the
797    three-way handshake, among other things, hence shared the same goal
798    but also the same set of issues as TFO. It focused most of its effort
799    on battling old or duplicate SYNs, but paid no attention to the security
800    vulnerabilities it introduced when bypassing 3WHS. Its TAO option and
801    connection count, besides adding complexity, require the server to
802    keep state per remote host, while still leaving it wide open to
803    attacks. It is trivial for an attacker to fake a CC value that will
804    pass the TAO test. Unfortunately, in the end its scheme is still not
805    100% bulletproof, as pointed out by [PHRACK98].

807    As stated earlier, we take a practical approach and focus TFO on the
808    security aspect, while allowing old, duplicate SYN packets with data,
809    after recognizing that 100% TCP semantics is likely infeasible. We
810    believe this approach strikes the right tradeoff, and makes TFO much
811    simpler and more appealing to TCP implementers and users.

813 9.2. Common Defenses Against SYN Flood Attacks

815    TFO is still vulnerable to SYN flood attacks just like normal TCP
816    handshakes, but the damage may be much worse, and thus deserves careful
817    thought.
819    There have been plenty of studies on how to mitigate attacks from
820    regular SYN floods, i.e., SYNs without data [RFC4987]. But from the
821    stateless SYN cookies to the stateful SYN Cache, none can preserve
822    data sent with a SYN safely while still providing an effective defense.

824    The best defense may be to simply disable TFO when a host is
825    suspected to be under a SYN flood attack, e.g., when the SYN backlog is
826    filled. Once TFO is disabled, normal SYN flood defenses can be
827    applied. The "Security Considerations" section contains a thorough
828    discussion on this topic.

830 9.3. TCP Cookie Transaction (TCPCT)

832    TCPCT [RFC6013] eliminates server state during the initial handshake and
833    defends against spoofing DoS attacks. Like TFO, TCPCT allows SYN and
834    SYN-ACK packets to carry data. However, TCPCT and TFO are designed for
835    different goals and they are not compatible.

837    The TCPCT server does not keep any connection state during the
838    handshake, therefore the server application needs to consume the data
839    in the SYN and (immediately) produce the data in the SYN-ACK before
840    sending the SYN-ACK. Otherwise the application's response has to wait
841    until the handshake completes. In contrast, TFO allows the server to
842    respond with data during the handshake. Therefore for many
843    request-response style applications, TCPCT may not achieve the same
       latency benefit as TFO.

845    Rapid-Restart [SIMPSON11] is based on TCPCT and shares a similar goal
846    with TFO. In Rapid-Restart, both the server and the client retain the
847    TCP control blocks after a connection is terminated in order to
848    allow/resume data exchange in the next connection handshake. In contrast,
849    TFO does not require keeping TCBs on both sides and is more
850    scalable.

852 10. IANA Considerations

854    The Fast Open Cookie Option and Fast Open Cookie Request Option
855    define no new namespace. The options require IANA to allocate one value
856    from the TCP option Kind namespace.
       Early implementations before the
857    allocation SHOULD follow [EXPOPT] and use experimental option 254 and
858    magic number 0xF989 (16 bits), and migrate to the new option after
859    the allocation accordingly.

861 11. Acknowledgement
862    We thank Rick Jones, Bob Briscoe, Adam Langley, Matt Mathis, Neal
863    Cardwell, Roberto Peon, and Tom Herbert for their feedback. We
864    especially thank Barath Raghavan for his contribution to the security
865    design of Fast Open.

867 12. References

869 12.1. Normative References

871    [RFC793]  Postel, J., "Transmission Control Protocol", RFC 793,
872              September 1981.

874    [RFC1122] Braden, R., Ed., "Requirements for Internet Hosts -
875              Communication Layers", STD 3, RFC 1122, October 1989.

877    [RFC5382] Guha, S., Ed., Biswas, K., Ford, B., Sivakumar, S., and P.
878              Srisuresh, "NAT Behavioral Requirements for TCP", RFC 5382,
              October 2008.

880    [RFC5681] Allman, M., Paxson, V. and E. Blanton, "TCP Congestion
881              Control", RFC 5681, September 2009.

883    [RFC6298] Paxson, V., Allman, M., Chu, J. and M. Sargent, "Computing
884              TCP's Retransmission Timer", RFC 6298, June 2011.

886 12.2. Informative References

888    [AERG11]  M. Al-Fares, K. Elmeleegy, B. Reed, and I. Gashinsky,
889              "Overclocking the Yahoo! CDN for Faster Web Page Loads". In
890              Proceedings of Internet Measurement Conference, November
891              2011.

893    [CDCM11]  Chu, J., Dukkipati, N., Cheng, Y. and M. Mathis,
894              "Increasing TCP's Initial Window", Internet-Draft draft-
895              ietf-tcpm-initcwnd-02.txt (work in progress), October 2011.

897    [EXPOPT]  Touch, J., "Shared Use of Experimental TCP Options",
898              Internet-Draft draft-ietf-tcpm-experimental-options (work
899              in progress), October 2012.

901    [HNESSK10] S. Haetoenen, A. Nyrhinen, L. Eggert, S. Strowes, P.
902              Sarolahti, M. Kojo, "An Experimental Study of Home Gateway
903              Characteristics". In Proceedings of Internet Measurement
904              Conference, October 2010.

906    [HNRGHT11] M. Honda, Y. Nishida, C. Raiciu, A. Greenhalgh, M.
907              Handley, H.
             Tokuda, "Is it Still Possible to Extend TCP?".
908              In Proceedings of Internet Measurement Conference, November
909              2011.

911    [LANGLEY06] Langley, A., "Probing the viability of TCP extensions",
912              URL http://www.imperialviolet.org/binary/ecntest.pdf

914    [MAF04]   Medina, A., Allman, M., and S. Floyd, "Measuring
915              Interactions Between Transport Protocols and Middleboxes".
916              In Proceedings of Internet Measurement Conference, October
917              2004.

919    [MQXMZ11] Z. Wang, Z. Qian, Q. Xu, Z. Mao, M. Zhang, "An Untold Story
920              of Middleboxes in Cellular Networks". In Proceedings of
921              SIGCOMM, August 2011.

923    [PHRACK98] "T/TCP vulnerabilities", Phrack Magazine, Volume 8, Issue
924              53, article 6, July 8, 1998. URL
925              http://www.phrack.com/issues.html?issue=53&id=6

927    [QWGMSS11] F. Qian, Z. Wang, A. Gerber, Z. Mao, S. Sen, O.
928              Spatscheck, "Profiling Resource Usage for Mobile
929              Applications: A Cross-layer Approach". In Proceedings of
930              International Conference on Mobile Systems, April 2011.

932    [RCCJB11] Radhakrishnan, S., Cheng, Y., Chu, J., Jain, A. and B.
933              Raghavan, "TCP Fast Open". In Proceedings of 7th ACM CoNEXT
934              Conference, December 2011.

936    [RFC1644] Braden, R., "T/TCP -- TCP Extensions for Transactions
937              Functional Specification", RFC 1644, July 1994.

939    [RFC2140] Touch, J., "TCP Control Block Interdependence", RFC 2140,
940              April 1997.

942    [RFC4987] Eddy, W., "TCP SYN Flooding Attacks and Common
943              Mitigations", RFC 4987, August 2007.

945    [RFC6013] Simpson, W., "TCP Cookie Transactions (TCPCT)", RFC 6013,
946              January 2011.

948    [SIMPSON11] Simpson, W., "TCP Cookie Transactions (TCPCT) Rapid
949              Restart", Internet-Draft draft-simpson-tcpct-rr-02.txt
950              (work in progress), July 2011.

952    [SOUDERS11] S. Souders, "Making A Mobile Connection".
953              http://www.stevesouders.com/blog/2011/09/21/making-a-
954              mobile-connection/

956    [THK98]   Touch, J., Heidemann, J., Obraczka, K., "Analysis of HTTP
957              Performance", USC/ISI Research Report 98-463.
             December
958              1998.

960    [BOB12]   Briscoe, B., "Some ideas building on draft-ietf-tcpm-
961              fastopen-01", tcpm list,
962              http://www.ietf.org/mail-archive/web/tcpm/current/
963              msg07192.html

965 Authors' Addresses

967    Yuchung Cheng
968    Google, Inc.
969    1600 Amphitheatre Parkway
970    Mountain View, CA 94043, USA
971    EMail: ycheng@google.com

973    Jerry Chu
974    Google, Inc.
975    1600 Amphitheatre Parkway
976    Mountain View, CA 94043, USA
977    EMail: hkchu@google.com

979    Sivasankar Radhakrishnan
980    Department of Computer Science and Engineering
981    University of California, San Diego
982    9500 Gilman Dr
983    La Jolla, CA 92093-0404
984    EMail: sivasankar@cs.ucsd.edu

986    Arvind Jain
987    Google, Inc.
988    1600 Amphitheatre Parkway
989    Mountain View, CA 94043, USA
990    EMail: arvind@google.com