Internet Draft                                                  Y. Cheng
draft-ietf-tcpm-fastopen-01.txt                                   J. Chu
Intended status: Experimental                           S. Radhakrishnan
Expiration date: February, 2013                                  A. Jain
                                                            Google, Inc.
                                                           July 16, 2012

                             TCP Fast Open

Status of this Memo

Distribution of this memo is unlimited.

This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that other
groups may also distribute working documents as Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at
http://www.ietf.org/1id-abstracts.html

The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html

This Internet-Draft will expire in August, 2012.

Copyright Notice

Copyright (c) 2012 IETF Trust and the persons identified as the
document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.

Abstract

TCP Fast Open (TFO) allows data to be carried in the SYN and SYN-ACK
packets and consumed by the receiving end during the initial
connection handshake, thus providing a saving of up to one full round
trip time (RTT) compared to standard TCP, which requires a three-way
handshake (3WHS) to complete before data can be exchanged.

Terminology

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [RFC2119].
TFO refers to TCP Fast Open. Client refers to the TCP's active open
side and server refers to the TCP's passive open side.

1. Introduction

TCP Fast Open (TFO) enables data to be exchanged safely during the
TCP connection handshake.

This document describes a design that enables qualified applications
to attain a round trip saving while avoiding severe security
ramifications. At the core of TFO is a security cookie used by the
server side to authenticate a client initiating a TFO connection. The
document covers the details of exchanging data during TCP's initial
handshake, the protocol for TFO cookies, and potential new security
vulnerabilities and their mitigation. It also includes discussions on
deployment issues and related proposals. TFO requires extensions to
the existing socket API, which will be covered in a separate
document.

TFO is motivated by the performance needs of today's Web
applications. Network latency is determined by the round-trip time
(RTT) and the number of round trips required to transfer application
data. RTT consists of transmission delay and propagation delay.
Network bandwidth has grown substantially over the past two decades,
much reducing the transmission delay, while propagation delay is
largely constrained by the speed of light and has remained unchanged.
Therefore reducing the number of round trips has become the most
effective way to improve the latency of Web applications [CDCM11].

Standard TCP only permits data exchange after 3WHS [RFC793], which
adds one RTT of delay to the network latency. For short transfers,
e.g., web objects, this additional RTT becomes a significant portion
of the network latency [THK98]. One widely deployed solution is HTTP
persistent connections. However, this solution is limited since hosts
and middle boxes terminate idle TCP connections due to resource
constraints. E.g., the Chrome browser keeps idle TCP connections open
for up to 5 minutes, but 35% of Chrome HTTP requests are still made
on new TCP connections.
More discussion of HTTP persistent connections is in section 7.1.

2. Data In SYN

[RFC793] (section 3.4) already allows data in SYN packets but forbids
the receiver from delivering the data to the application until 3WHS
is completed. This is because TCP's initial handshake serves to
capture:

- Old or duplicate SYNs

- SYNs with spoofed IP addresses

TFO allows data to be delivered to the application before 3WHS is
completed, thus opening itself to a possible data integrity problem
caused by the dubious SYN packets above.

2.1. TCP Semantics and Duplicate SYNs

A past proposal called T/TCP employs a new TCP "TAO" option and
connection count to guard against old or duplicate SYNs [RFC1644].
The solution is complex, involving state tracking on a per remote
peer basis, and is vulnerable to IP spoofing attacks. Moreover, it
has been shown that even with all the complexity, T/TCP is still not
100% bulletproof. Old or duplicate SYNs may still slip through and
get accepted by a T/TCP server [PHRACK98].

Rather than trying to capture all the dubious SYN packets to make TFO
100% compatible with TCP semantics, we've made a design decision
early on to accept old SYN packets with data, i.e., to restrict TFO
to a class of applications that are tolerant of duplicate SYN
packets with data, e.g., idempotent or query type transactions. We
believe this is the right design trade-off balancing complexity with
usefulness. There is a large class of applications that can tolerate
dubious transaction requests.

For this reason, TFO MUST be disabled by default, and only enabled
explicitly by applications on a per service port basis.

2.2. SYNs with spoofed IP addresses

Standard TCP suffers from the SYN flood attack [RFC4987] because
bogus SYN packets, i.e., SYN packets with spoofed source IP
addresses, can easily fill up a listener's small queue, causing a
service port to be blocked completely until timeouts. Secondary
damage comes from faked SYN requests taking up memory space. This is
normally not an issue today with typical servers having plenty of
memory.

TFO goes one step further to allow server side TCP to process and
send up data to the application layer before 3WHS is completed. This
opens up much more serious new vulnerabilities. Applications serving
ports that have TFO enabled may waste lots of CPU and memory
resources processing the requests and producing the responses. If the
response is much larger than the request, the attacker can mount an
amplified reflection attack against victims of choice beyond the TFO
server itself.

Numerous mitigation techniques against the regular SYN flood attack
exist and have been well documented [RFC4987]. Unfortunately none are
applicable to TFO. We propose a server supplied cookie to mitigate
most of the security risks introduced by TFO. A more thorough
discussion on SYN flood attack against TFO is deferred to the
"Security Considerations" section.

3. Protocol Overview

The key component of TFO is the Fast Open Cookie (cookie), a message
authentication code (MAC) tag generated by the server. The client
requests a cookie in one regular TCP connection, then uses it for
future TCP connections to exchange data during 3WHS:

Requesting Fast Open Cookie:

1. The client sends a SYN with a Fast Open Cookie Request option.
2. The server generates a cookie and sends it through the Fast Open
   Cookie option of a SYN-ACK packet.
3. The client caches the cookie for future TCP Fast Open connections
   (see below).

Performing TCP Fast Open:

1. The client sends a SYN with Fast Open Cookie option and data.
2. The server validates the cookie:
   a. If the cookie is valid, the server sends a SYN-ACK
      acknowledging both the SYN and the data. The server then
      delivers the data to the application.
   b. Otherwise, the server drops the data and sends a SYN-ACK
      acknowledging only the SYN sequence number.
3. If the server accepts the data in the SYN packet, it may send the
   response data before the handshake finishes. The maximum amount is
   governed by TCP's congestion control [RFC5681].
4. The client sends an ACK acknowledging the SYN and the server data.
   If the client's data is not acknowledged, the client retransmits
   the data in the ACK packet.
5. The rest of the connection proceeds like a normal TCP connection.

The client can perform many TFO operations once it acquires a cookie
until the cookie is expired by the server. Thus TFO is useful for
applications that have temporal locality on client and server
connections.

Requesting Fast Open Cookie in connection 1:

TCP A (Client)                                        TCP B (Server)
______________                                        ______________
CLOSED                                                        LISTEN

#1 SYN-SENT     ----- SYN, Cookie Request ----------->      SYN-RCVD

#2 ESTABLISHED  <---- SYN-ACK, Cookie C --------------      SYN-RCVD
(caches cookie C)

Performing TCP Fast Open in connection 2:

TCP A (Client)                                        TCP B (Server)
______________                                        ______________
CLOSED                                                        LISTEN

#1 SYN-SENT     ----- SYN, Cookie C, data ----------->      SYN-RCVD

#2 ESTABLISHED  <---- SYN-ACK of SYN and data --------      SYN-RCVD

#3 ESTABLISHED  <---- server data --------------------      SYN-RCVD

#4 ESTABLISHED  ----- ACK of server SYN ------------->   ESTABLISHED

#5 ESTABLISHED  ----- ACK of server data ------------>   ESTABLISHED

4. Protocol Details

4.1. Fast Open Cookie

The Fast Open Cookie is designed to mitigate new security
vulnerabilities in order to enable data exchange during the
handshake.
The cookie is a message authentication code tag generated by the
server and is opaque to the client; the client simply caches the
cookie and passes it back on subsequent SYN packets to open new
connections. The server can expire the cookie at any time to enhance
security.

4.1.1. TCP Options

Fast Open Cookie Option

The server uses this option to grant a cookie to the client in the
SYN-ACK packet; the client uses it to pass the cookie back to the
server in the SYN packet.

                                +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                                |      Kind     |    Length     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
~                            Cookie                             ~
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Kind            1 byte: constant TBD (assigned by IANA)
Length          1 byte: range 6 to 18 (bytes); limited by
                        remaining space in the options field.
                        The number MUST be even.
Cookie          4 to 16 bytes (Length - 2)

Options with invalid Length values or without SYN flag set MUST be
ignored. The minimum Cookie size is 4 bytes. Although the diagram
shows a cookie aligned on 32-bit boundaries, that is not required.

Fast Open Cookie Request Option

The client uses this option in the SYN packet to request a cookie
from a TFO-enabled server.

+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|      Kind     |    Length     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Kind            1 byte: same as the Fast Open Cookie option
Length          1 byte: constant 2. This distinguishes the option
                        from the Fast Open Cookie option.

Options with invalid Length values, without SYN flag set, or with ACK
flag set MUST be ignored.

4.1.2. Server Cookie Handling

The server is in charge of cookie generation and authentication. The
cookie SHOULD be a message authentication code tag with the following
properties:

1. The cookie authenticates the client's (source) IP address of the
   SYN packet. The IP address can be an IPv4 or IPv6 address.

2. The cookie can only be generated by the server and cannot be
   fabricated by any other party, including the client.

3. The cookie expires after a certain amount of time. The reason is
   detailed in the "Security Considerations" section. This can be
   done by either periodically changing the server key used to
   generate cookies or including a timestamp in the cookie.

4. The generation and verification are fast relative to the rest of
   SYN and SYN-ACK processing.

5. A server may encode other information in the cookie, and accept
   more than one valid cookie per client at any given time. But this
   is all server implementation dependent and transparent to the
   client.

The server supports the cookie generation and verification
operations:

- GetCookie(IP_Address): returns a (new) cookie

- IsCookieValid(IP_Address, Cookie): checks if the cookie is valid,
  i.e., it has not expired and it authenticates the client IP
  address.

Example Implementation: a simple implementation is to use AES-128 to
encrypt the IPv4 (with padding) or IPv6 address and truncate to 64
bits. The server can periodically update the key to expire the
cookies. AES encryption on recent processors is fast and takes only a
few hundred nanoseconds [RCCJB11].

Note that if only one valid cookie is allowed per client and the
server can regenerate the cookie independently, the best validation
process may be for the server to simply regenerate a valid cookie and
compare it against the incoming cookie. In that case if the incoming
cookie fails the check, a valid cookie is readily available to be
sent to the client without additional computation.

Also note the server may want to use special cookie values, e.g.,
"0", for specific scenarios.
For example, the server may want to notify the client that it
supports TFO, but choose not to return a valid cookie for security or
performance reasons upon receiving a TFO request.

4.1.3. Client Cookie Handling

The client MUST cache cookies from servers for later Fast Open
connections. For a multi-homed client, the cookies are both client
and server IP dependent. Besides the cookie, we RECOMMEND that the
client cache the MSS and RTT to the server to enhance performance.

The MSS advertised by the server is stored in the cache to determine
the maximum amount of data that can be supported in the SYN packet.
This information is needed because data is sent before the server
announces its MSS in the SYN-ACK packet. Without this information,
the data size in the SYN packet is limited to the default MSS of 536
bytes [RFC1122]. The client SHOULD update the cached MSS value
whenever it discovers a new MSS value, e.g., through path MTU
discovery.

Caching RTT allows seeding a more accurate SYN timeout than the
default value [RFC6298]. This lowers the performance penalty if the
network or the server drops the SYN packets with data or the cookie
options (see the "Reliability and Deployment Issues" section below).

The cache replacement algorithm is not specified and is left to the
implementations.

Note that before TFO sees wide deployment, clients are advised to
also cache negative responses from servers in order to reduce the
number of futile TFO attempts. Since TFO is enabled on a per-service
port basis but cookies are independent of service ports, the client's
cache should include remote port numbers too.

4.2. Fast Open Protocol

One predominant requirement of TFO is to be fully compatible with
existing TCP implementations, both on the client and the server
sides.

The server keeps two variables per listening port:

FastOpenEnabled: default is off.
It MUST be turned on explicitly by the application. When this flag is
off, the server does not perform any TFO related operations and MUST
ignore all cookie options.

PendingFastOpenRequests: tracks the number of TFO connections in
SYN-RCVD state. If this variable goes over a preset system limit, the
server SHOULD disable TFO for all new connection requests until
PendingFastOpenRequests drops below the system limit. This variable
is used to defend against some of the vulnerabilities discussed in
the "Security Considerations" section.

The server keeps a FastOpened flag per TCB to mark if a connection
has successfully performed a TFO.

4.2.1. Fast Open Cookie Request

Any client attempting TFO MUST first request a cookie from the server
with the following steps:

1. The client sends a SYN packet with a Fast Open Cookie Request
   option.

2. The server SHOULD respond with a SYN-ACK based on the procedures
   in the "Server Cookie Handling" section. This SYN-ACK SHOULD
   contain a Fast Open Cookie option if the server currently
   supports TFO for this listener port.

3. If the SYN-ACK contains a Fast Open Cookie option, the client
   replaces the cookie and other information as described in the
   "Client Cookie Handling" section. Otherwise, if the SYN-ACK is
   first seen, i.e., not a (spurious) retransmission, the client MAY
   remove the server information from the cookie cache. If the
   SYN-ACK is a spurious retransmission without a valid Fast Open
   Cookie option, the client does nothing to the cookie cache for the
   reasons below.

The network or servers may drop the SYN or SYN-ACK packets with the
new cookie options, which causes SYN or SYN-ACK timeouts. We
RECOMMEND that both the client and the server retransmit SYN and
SYN-ACK without the cookie options on timeouts.
This ensures that connections carrying cookie requests will go
through and lowers the latency penalty of dropped SYN/SYN-ACK
packets. The obvious downside of maximum compatibility is that any
regular SYN drop will fail the cookie request (although one can argue
that delaying the data transmission until after 3WHS is justified if
the SYN drop is due to network congestion). The next section
describes a heuristic to detect such drops when the client receives
the SYN-ACK.

We also RECOMMEND that the client record servers that failed to
respond to cookie requests and only attempt another cookie request
after a certain period.

4.2.2. TCP Fast Open

Once the client obtains the cookie from the target server, the client
can perform subsequent TFO connections until the cookie is expired by
the server. The nature of TCP sequencing makes the TFO specific
changes relatively small in addition to [RFC793].

Client: Sending SYN

To open a TFO connection, the client MUST have obtained the cookie
from the server:

1. Send a SYN packet.

   a. If the SYN packet does not have enough option space for the
      Fast Open Cookie option, abort TFO and fall back to regular
      3WHS.

   b. Otherwise, include the Fast Open Cookie option with the cookie
      of the server. Include any data up to the cached server MSS or
      default 536 bytes.

2. Advance to SYN-SENT state and update SND.NXT to include the data
   accordingly.

3. If RTT is available from the cache, seed SYN timer according to
   [RFC6298].

To deal with network or servers dropping SYN packets with payload or
unknown options, when the SYN timer fires, the client SHOULD
retransmit a SYN packet without data and Fast Open Cookie options.

Server: Receiving SYN and responding with SYN-ACK

Upon receiving the SYN packet with Fast Open Cookie option:

1. Initialize and reset a local FastOpened flag. If FastOpenEnabled
   is false, go to step 5.

2. If PendingFastOpenRequests is over the system limit, go to step 5.

3. If IsCookieValid() in section 4.1.2 returns false, go to step 5.

4. Buffer the data and notify the application. Set FastOpened flag
   and increment PendingFastOpenRequests.

5. Send the SYN-ACK packet. The packet MAY include a Fast Open
   Option. If FastOpened flag is set, the packet acknowledges the SYN
   and data sequence. Otherwise it acknowledges only the SYN
   sequence.

   The server MAY include data in the SYN-ACK packet if the response
   data is readily available. Some applications may favor delaying
   the SYN-ACK, allowing the application to process the request in
   order to produce a response, but this is left to the
   implementation.

6. Advance to the SYN-RCVD state. If the FastOpened flag is set, the
   server MAY send more data packets before the handshake completes.
   The maximum amount is governed by the initial congestion window
   and the receiver window [RFC3390].

If the SYN-ACK timer fires, the server SHOULD retransmit a SYN-ACK
segment with neither data nor Fast Open Cookie options for
compatibility reasons.

Client: Receiving SYN-ACK

The client SHOULD perform the following steps upon receiving the
SYN-ACK:

1. Update the cookie cache if the SYN-ACK has a Fast Open Cookie
   Option or MSS option or both.

2. Send an ACK packet. Set acknowledgment number to RCV.NXT and
   include the data after SND.UNA if data is available.

3. Advance to the ESTABLISHED state.

Note there is no latency penalty if the server does not acknowledge
the data in the original SYN packet. The client SHOULD retransmit any
unacknowledged data in the first ACK packet in step 2. The data
exchange will start after the handshake like a regular TCP
connection.
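
As an illustration only, the server-side steps of "Server: Receiving
SYN and responding with SYN-ACK" and the cookie operations of section
4.1.2 can be sketched in Python as follows. This is a minimal sketch
under stated assumptions, not part of the protocol: HMAC-SHA256
truncated to 64 bits stands in for the AES-128 example so that only
the standard library is needed, and the names Listener, SynAck,
on_syn_with_cookie and the limit value 16 are hypothetical. Cookie
expiry through key rotation and the buffering of data for the
application are omitted.

```python
import hashlib
import hmac
import ipaddress
from dataclasses import dataclass
from typing import Optional

COOKIE_LEN = 8  # 64-bit cookie, inside the 4 to 16 byte option range


def get_cookie(key: bytes, ip_address: str) -> bytes:
    """GetCookie(IP_Address): MAC tag over the client's source address.

    Assumption: HMAC-SHA256 replaces the AES-128 example in the text.
    """
    packed = ipaddress.ip_address(ip_address).packed  # IPv4 or IPv6
    return hmac.new(key, packed, hashlib.sha256).digest()[:COOKIE_LEN]


def is_cookie_valid(key: bytes, ip_address: str, cookie: bytes) -> bool:
    """IsCookieValid(IP_Address, Cookie): regenerate and compare."""
    return hmac.compare_digest(get_cookie(key, ip_address), cookie)


@dataclass
class Listener:
    key: bytes
    fast_open_enabled: bool = False  # FastOpenEnabled, off by default
    pending: int = 0                 # PendingFastOpenRequests
    limit: int = 16                  # hypothetical system limit


@dataclass
class SynAck:
    acks_data: bool          # True: SYN and data acked; False: SYN only
    cookie: Optional[bytes]  # Fast Open Cookie option, if any


def on_syn_with_cookie(listener: Listener, src_ip: str,
                       cookie: bytes, data: bytes) -> SynAck:
    # Steps 1-3: any failed check falls through to step 5.
    fast_opened = (listener.fast_open_enabled
                   and not listener.pending > listener.limit
                   and is_cookie_valid(listener.key, src_ip, cookie))
    if fast_opened:
        # Step 4: the data would be buffered for the application here.
        listener.pending += 1
    # Step 5: the cookie option is optional; this sketch includes a
    # fresh cookie whenever TFO is enabled on the listener.
    new_cookie = (get_cookie(listener.key, src_ip)
                  if listener.fast_open_enabled else None)
    return SynAck(acks_data=fast_opened, cookie=new_cookie)
```

With TFO enabled, a SYN carrying the cookie that get_cookie() yields
for its source address has both its SYN and its data acknowledged.
The same cookie presented from a different address has only its SYN
acknowledged, and the SYN-ACK carries a cookie valid for that address.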
If the client has timed out and retransmitted only regular SYN
packets, it can heuristically detect paths that intentionally drop
SYN packets with the Fast Open option or data. If the SYN-ACK
acknowledges only the initial sequence and does not carry a Fast Open
Cookie option, presumably it is triggered by a retransmitted
(regular) SYN and the original SYN or the corresponding SYN-ACK was
lost.

Server: Receiving ACK

Upon receiving an ACK acknowledging the SYN sequence, the server
decrements PendingFastOpenRequests and advances to the ESTABLISHED
state. No further special handling is required.

5. Reliability and Deployment Issues

Network or Hosts Dropping SYN packets with data or unknown options

A study [MAF04] found that some middle-boxes and end-hosts may
incorrectly drop packets with unknown TCP options. Studies
[LANGLEY06, HNRGHT11] both found that 6% of the probed paths on the
Internet drop SYN packets with data or with unknown TCP options. The
TFO protocol deals with this problem by retransmitting SYN without
data or cookie options, and we recommend tracking these servers in
the client.

Server Farms

A common server-farm setup is to have many physical hosts behind a
load-balancer sharing the same server IP. The load-balancer forwards
new TCP connections to different physical hosts based on certain
load-balancing algorithms. For TFO to work, the physical hosts need
to share the same key and update the key at about the same time.

Network Address Translation (NAT)

The hosts behind a NAT sharing the same IP address will get the same
cookie for the same server. This will not prevent TFO from working.
But on some carrier-grade NAT configurations where every new TCP
connection from the same physical host uses a different public IP
address, TFO does not provide any latency benefit.
However, there is no performance penalty either, as described in the
"Client: Receiving SYN-ACK" section.

6. Security Considerations

The Fast Open cookie stops an attacker from trivially flooding
spoofed SYN packets with data to burn server resources or to mount an
amplified reflection attack on random hosts. The server can defend
against spoofed SYN floods with invalid cookies using existing
techniques [RFC4987]. We note that generating bogus cookies is
usually cheaper than validating them. But the additional cost of
validating the cookies, inherent to any authentication scheme, may
not be substantial compared to processing a regular SYN packet.

However, the attacker may still obtain cookies from some compromised
hosts, then flood spoofed SYN with data and "valid" cookies (from
these hosts or other vantage points). With DHCP, it's possible to
obtain cookies of past IP addresses without compromising any host.
Below we identify two new vulnerabilities of TFO and describe the
countermeasures.

6.1. Server Resource Exhaustion Attack by SYN Flood with Valid Cookies

Like regular TCP handshakes, TFO is vulnerable to such an attack. But
the potential damage can be much more severe. Besides causing
temporary disruption to service ports under attack, it may exhaust
server CPU and memory resources.

For this reason it is crucial for the TFO server to limit the maximum
number of total pending TFO connection requests, i.e.,
PendingFastOpenRequests. When the limit is exceeded, the server
temporarily disables TFO entirely as described in the "Fast Open
Protocol" section. Subsequent TFO requests are then downgraded to
regular connection requests, i.e., with the data dropped and only the
SYN acknowledged. This allows regular SYN flood defense techniques
[RFC4987] like SYN-cookies to kick in and prevent further service
disruption.
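
The accounting behind this limit can be sketched as a small counter
kept per listening port. This is a minimal sketch, not a definitive
implementation: the class name FastOpenLimiter, its method names, and
the limit passed to it are hypothetical, and only the counter
lifecycle is modeled.

```python
class FastOpenLimiter:
    """Tracks PendingFastOpenRequests for one listening port."""

    def __init__(self, limit: int) -> None:
        self.limit = limit  # hypothetical preset system limit
        self.pending = 0    # TFO connections currently in SYN-RCVD

    def admit(self) -> bool:
        """Return True to accept SYN data, False to downgrade."""
        if self.pending > self.limit:
            # Over the limit: drop the data and acknowledge only the
            # SYN, so the request proceeds as a regular 3WHS.
            return False
        self.pending += 1
        return True

    def on_handshake_ack(self) -> None:
        """Called when the ACK completing a TFO handshake arrives."""
        if self.pending > 0:
            self.pending -= 1
```

Because the check is "over the limit", the counter can reach one above
the limit before TFO is disabled; TFO is accepted again as soon as a
pending handshake completes and the counter drops back.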
There are other subtle but important differences in the vulnerability
between TFO and the regular TCP handshake. Before the SYN flood
attack broke out in the late '90s, a typical listener's max qlen was
small, just enough to sustain the highest expected new connection
rate and the average RTT for the SYN-ACK packets to be acknowledged
in time. E.g., if a server is designed to handle at most 100
connection requests per second, and the average RTT is 100ms, a max
qlen on the order of 10 will be sufficient.

This small max qlen made it very easy for any attacker, even equipped
with just a dial-up modem to the Internet, to cause major disruptions
to a web site by simply throwing a handful of "SYN bombs" at its
victim of choice. But for this attack scheme to work, the attacker
must pick a non-responsive source IP address to spoof with. Otherwise
the SYN-ACK packet will trigger a TCP RST from the host whose IP
address has been spoofed, causing the corresponding connection to be
removed from the server's listener queue and hence defeating the
attack. In other words, the main damage of SYN bombs against the
standard TCP stack is not directly from the bombs themselves costing
TCP processing overhead or host memory, but rather from the spoofed
SYN packets filling up the often small listener's queue.

On the other hand, TFO SYN bombs can cause damage directly if
admitted without limit into the stack. The RST packets from the
spoofed host will fuel rather than defeat the SYN bombs as compared
to the non-TFO case, because the attacker can flood more SYNs with
data to cost more data processing resources. For this reason, a TFO
server needs to monitor the connections in SYN-RCVD being reset in
addition to imposing a reasonable max qlen.
Implementations may combine the two, e.g., by continuing to account
for those connection requests that have just been reset against the
listener's PendingFastOpenRequests until a timeout period has passed.

Limiting the maximum number of pending TFO connection requests does
make it easy for an attacker to overflow the queue, causing TFO to be
disabled. We argue that causing TFO to be disabled is unlikely to be
of interest to attackers because the service will remain intact
without TFO, hence there is hardly any real damage.

6.2. Amplified Reflection Attack to Random Host

Limiting PendingFastOpenRequests with a system limit can be done
without Fast Open Cookies and would protect the server from resource
exhaustion. It would also limit how much damage an attacker can cause
through an amplified reflection attack from that server. However, it
would still be vulnerable to an amplified reflection attack from a
large number of servers. An attacker can easily cause damage by
tricking many servers into responding with data packets at once to
any spoofed victim IP address of choice.

With the use of Fast Open Cookies, the attacker would first have to
steal a valid cookie from its target victim. This likely requires the
attacker to compromise the victim host or network first.

The attacker here has little interest in mounting an attack on the
victim host that has already been compromised. But she may be
motivated to disrupt the victim's network. Since a stolen cookie is
only valid for a single server, she has to steal valid cookies from a
large number of servers and use them before they expire to cause
sufficient damage without triggering the defense in the previous
section.

One can argue that if the attacker has compromised the target network
or hosts, she could perform a similar but simpler attack by injecting
bits directly.
The degree of damage will be identical, but a TFO-specific attack
allows the attacker to remain anonymous and disguises the attack as
coming from other servers.

The best defense is for the server not to respond with data until the
handshake finishes. In this case the risk of an amplified reflection
attack is completely eliminated. But the potential latency saving
from TFO may diminish if the server application produces responses
before the handshake completes.

7. Web Performance

7.1. HTTP persistent connection

TCP connection setup overhead has long been identified as a
performance bottleneck for web applications [THK98]. HTTP persistent
connection was proposed to mitigate this issue and has been widely
deployed. However, [RCCJB11][AERG11] show that the average number of
transactions per connection is between 2 and 4, based on large-scale
measurements from both servers and clients. In these studies, the
servers and clients both kept the idle connections up to several
minutes, well into the human think time.

Can the utilization rate increase by keeping connections even longer?
Unfortunately, this is problematic due to middle-boxes and rapidly
growing mobile end hosts. One major issue is NAT. Studies
[HNESSK10][MQXMZ11] show that the majority of home routers and ISPs
fail to meet the 124-minute idle timeout mandated in [RFC5382].
In [MQXMZ11], 35% of mobile ISPs time out idle connections within 30
minutes. NAT boxes do not possess a reliable mechanism to notify
end hosts when idle connections are removed from local tables, either
due to resource constraints such as mapping table size, memory, or
lookup overhead, or due to the limited port number and IP address
space. Moreover, unmapped packets received by NAT boxes are often
dropped silently. (TCP RST is not required by RFC5382.)
       End hosts
677    attempting to use these broken connections are often forced to wait
678    for a lengthy TCP timeout. Thus the browser risks a large performance
679    penalty when keeping idle connections open. To circumvent this
680    problem, some applications send frequent TCP keep-alive probes.
681    However, this technique drains power on mobile devices [MQXMZ11]. In
682    fact, power has become such a prominent issue in modern LTE devices
683    that mobile browsers close HTTP connections within seconds or even
684    immediately [SOUDERS11].

686    Idle connections also consume more memory resources. Due to the
687    complexity of today's web applications, the application layer often
688    needs orders of magnitude more memory than the TCP connection
689    footprint. As a result, servers need to implement advanced resource
690    management in order to support a large number of idle connections.

692 7.2. Case Study: Chrome Browser

694    [RCCJB11] studied Chrome browser performance based on 28 days of
695    global statistics. Chrome browser keeps idle HTTP persistent
696    connections for 5 to 10 minutes. However, the average number of
697    transactions per connection is only 3.3. Due to the low utilization,
698    TCP 3WHS accounts for up to 25% of the HTTP transaction network
699    latency. The authors tested a Linux TFO implementation with a
700    TFO-enabled Chrome browser on popular web sites in emulated
701    environments such as residential broadband and mobile networks. They
702    showed that TFO improves page load time by 10% to 40%. More details
703    on the design tradeoffs and measurement can be found in [RCCJB11].

705 8. TFO's Applicability

707    TFO aims at latency-conscious applications that are sensitive to
708    TCP's initial connection setup delay. These application protocols
709    often employ short-lived TCP connections, or employ long-lived
710    connections but are more sensitive to the connection setup delay due
711    to, e.g., a stricter connection failover requirement.
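
As a rough illustration of why connection setup delay matters for
short-lived connections, the following sketch estimates the share of a
connection's network latency that goes to the three-way handshake. It is
a back-of-the-envelope model, not the measurement method of [RCCJB11]:
the function name handshake_share, its parameters, and the assumption
that every transaction costs exactly one RTT plus optional server
processing time are all hypothetical.

```python
def handshake_share(transactions, rtt_ms, server_ms=0.0, tfo=False):
    """Return the fraction of one connection's latency spent in the 3WHS.

    Hypothetical model, not taken from [RCCJB11]: each transaction costs
    one RTT plus server_ms of processing, and a connection without TFO
    pays one additional RTT for the handshake before its first request.
    With TFO the first request is assumed to ride in the SYN.
    """
    if transactions <= 0 or rtt_ms <= 0:
        raise ValueError("transactions and rtt_ms must be positive")
    handshake_ms = 0.0 if tfo else rtt_ms
    transfer_ms = transactions * (rtt_ms + server_ms)
    return handshake_ms / (handshake_ms + transfer_ms)


if __name__ == "__main__":
    # 3.3 is the average transactions per connection quoted in Section 7.2;
    # the 100 ms RTT is an arbitrary illustrative value.
    for n in (1, 2, 3.3, 10):
        share = handshake_share(n, rtt_ms=100.0)
        print(f"{n:>4} transactions/connection: 3WHS share {share:.0%}")
```

Under these assumptions, 3.3 transactions per connection puts the
handshake at roughly 23% of the connection's network latency, which is
in line with the "up to 25%" figure in Section 7.2. The share shrinks as
a connection is reused for more transactions or as server processing
time grows.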
713    Only transaction-type applications where RTT constitutes a
714    significant portion of the total end-to-end latency will likely
715    benefit from TFO. Moreover, the client request must fit in the SYN
716    packet. Otherwise there may not be any saving in the total number of
717    round trips required to complete a transaction.

719    To the extent possible, application protocols SHOULD employ
720    long-lived connections to best take advantage of TCP's built-in
721    congestion control algorithm, and to reduce the impact of TCP's
722    connection setup overhead. For example, for web applications, P-HTTP
723    will likely help and is much easier to deploy, hence should be
724    attempted first. TFO will likely provide further latency reduction
725    on top of P-HTTP. But the additional benefit will depend on how much
726    persistency one can get from HTTP in a given operating environment.

728    One alternative to short-lived TCP connections might be UDP, which is
729    connectionless, hence doesn't inflict any connection setup delay, and
730    is best suited for application protocols that are transactional.
731    Practical deployment issues such as middlebox and/or firewall
732    traversal may severely limit the use of UDP-based application
733    protocols, though.

735    Note that when the application employs too many short-lived
736    connections, it may negatively impact network stability, as these
737    connections often exit before TCP's congestion control algorithm
738    kicks in. Implementations supporting a large number of short-lived
739    connections should employ temporal sharing of TCB data as described
740    in [RFC2140].

742    More discussion on TCP Fast Open and its projected performance
743    benefit can be found in [RCCJB11].

745 9. Related Work

747 9.1. T/TCP

749    TCP Extensions for Transactions [RFC1644] attempted to bypass the
750    three-way handshake, among other things, hence shared the same goal
751    but also the same set of issues as TFO.
       It focused most of its effort
752    battling old or duplicate SYNs, but paid no attention to the security
753    vulnerabilities it introduced when bypassing 3WHS. Its TAO option and
754    connection count, besides adding complexity, require the server to
755    keep state per remote host, while still leaving it wide open for
756    attacks. It is trivial for an attacker to fake a CC value that will
757    pass the TAO test. Unfortunately, in the end its scheme is still not
758    100% bulletproof, as pointed out by [PHRACK98].

760    As stated earlier, we take a practical approach to focus TFO on the
761    security aspect, while allowing old, duplicate SYN packets with data
762    after recognizing that 100% TCP semantics is likely infeasible. We
763    believe this approach strikes the right tradeoff, and makes TFO much
764    simpler and more appealing to TCP implementers and users.

766 9.2. Common Defenses Against SYN Flood Attacks

768    TFO is still vulnerable to SYN flood attacks just like normal TCP
769    handshakes, but the damage may be much worse, and this deserves
770    careful thought.

772    There have been plenty of studies on how to mitigate attacks from
773    regular SYN flood, i.e., SYN without data [RFC4987]. But from the
774    stateless SYN-cookies to the stateful SYN Cache, none can preserve
775    data sent with SYN safely while still providing an effective defense.

777    The best defense may be to simply disable TFO when a host is
778    suspected to be under a SYN flood attack, e.g., the SYN backlog is
779    filled. Once TFO is disabled, normal SYN flood defenses can be
780    applied. The "Security Considerations" section contains a thorough
781    discussion on this topic.

783 9.3. TCP Cookie Transactions (TCPCT)

785    TCPCT [RFC6013] eliminates server state during initial handshake and
786    defends against spoofing DoS attacks. Like TFO, TCPCT allows SYN and
787    SYN-ACK packets to carry data. However, TCPCT and TFO are designed
788    for different goals and they are not compatible.
790    The TCPCT server does not keep any connection state during the
791    handshake, so the server application needs to consume the data in
792    SYN and (immediately) produce the data in SYN-ACK before sending
793    SYN-ACK. Otherwise the application's response has to wait until the
794    handshake completes. In contrast, TFO lets the server respond with
795    data during the handshake. Therefore, for many request-response
796    style applications, TCPCT may not match TFO's latency benefit.

798    Rapid-Restart [SIMPSON11] is based on TCPCT and shares a similar goal
799    with TFO. In Rapid-Restart, both the server and the client retain the
800    TCP control blocks after a connection is terminated in order to
801    allow/resume data exchange in the next connection handshake. In
802    contrast, TFO does not require keeping TCBs on both sides and is more
803    scalable.

805 10. IANA Considerations

807    The Fast Open Cookie Option and Fast Open Cookie Request Option
808    define no new namespace. The options require IANA to allocate one
809    value from the TCP option Kind namespace.

811 11. Acknowledgements

813    The authors would like to thank Tom Herbert, Rick Jones, Adam
814    Langley, Mathew Mathis, Roberto Peon, and Barath Raghavan for their
815    insightful comments.

817 12. References

819 12.1. Normative References

821    [RFC793] Postel, J., "Transmission Control Protocol", RFC 793,
822              September 1981.

824    [RFC1122] Braden, R., Ed., "Requirements for Internet Hosts -
825              Communication Layers", STD 3, RFC 1122, October 1989.

827    [RFC5382] Guha, S., Ed., Biswas, K., Ford, B., Sivakumar, S., and
828              P. Srisuresh, "NAT Behavioral Requirements for TCP",
                 RFC 5382, October 2008.

830    [RFC5681] Allman, M., Paxson, V., and E. Blanton, "TCP Congestion
831              Control", RFC 5681, September 2009.

833    [RFC6298] Paxson, V., Allman, M., Chu, J., and M. Sargent, "Computing
834              TCP's Retransmission Timer", RFC 6298, June 2011.

836 12.2. Informative References

838    [AERG11] Al-Fares, M., Elmeleegy, K., Reed, B., and I.
                 Gashinsky,
839              "Overclocking the Yahoo! CDN for Faster Web Page Loads", In
840              Proceedings of Internet Measurement Conference, November
841              2011.

843    [CDCM11] Chu, J., Dukkipati, N., Cheng, Y., and M. Mathis,
844              "Increasing TCP's Initial Window", Internet-Draft
845              draft-ietf-tcpm-initcwnd-02.txt (work in progress), October
846              2011.

848    [HNESSK10] Haetoenen, S., Nyrhinen, A., Eggert, L., Strowes, S.,
849              Sarolahti, P., and M. Kojo, "An Experimental Study of Home
850              Gateway Characteristics", In Proceedings of Internet
851              Measurement Conference, October 2010.

853    [HNRGHT11] Honda, M., Nishida, Y., Raiciu, C., Greenhalgh, A.,
854              Handley, M., and H. Tokuda, "Is it Still Possible to Extend
855              TCP?", In Proceedings of Internet Measurement Conference,
856              November 2011.

858    [LANGLEY06] Langley, A., "Probing the viability of TCP extensions",
860              URL http://www.imperialviolet.org/binary/ecntest.pdf

862    [MAF04] Medina, A., Allman, M., and S. Floyd, "Measuring
863              Interactions Between Transport Protocols and Middleboxes",
864              In Proceedings of Internet Measurement Conference, October
865              2004.

867    [MQXMZ11] Wang, Z., Qian, Z., Xu, Q., Mao, Z., and M. Zhang, "An
868              Untold Story of Middleboxes in Cellular Networks", In
869              Proceedings of SIGCOMM, August 2011.

871    [PHRACK98] "T/TCP vulnerabilities", Phrack Magazine, Volume 8, Issue
872              53, article 6, July 8, 1998. URL
873              http://www.phrack.com/issues.html?issue=53&id=6

875    [QWGMSS11] Qian, F., Wang, Z., Gerber, A., Mao, Z., Sen, S., and O.
876              Spatscheck, "Profiling Resource Usage for Mobile
877              Applications: A Cross-layer Approach", In Proceedings of
878              International Conference on Mobile Systems, April 2011.

880    [RCCJB11] Radhakrishnan, S., Cheng, Y., Chu, J., Jain, A., and B.
881              Raghavan, "TCP Fast Open", In Proceedings of 7th ACM CoNEXT
882              Conference, December 2011.

884    [RFC1644] Braden, R., "T/TCP -- TCP Extensions for Transactions
885              Functional Specification", RFC 1644, July 1994.
887    [RFC2140] Touch, J., "TCP Control Block Interdependence", RFC 2140,
888              April 1997.

890    [RFC4987] Eddy, W., "TCP SYN Flooding Attacks and Common
891              Mitigations", RFC 4987, August 2007.

893    [RFC6013] Simpson, W., "TCP Cookie Transactions (TCPCT)", RFC 6013,
894              January 2011.

896    [SIMPSON11] Simpson, W., "TCP Cookie Transactions (TCPCT) Rapid
897              Restart", Internet-Draft draft-simpson-tcpct-rr-02.txt
898              (work in progress), July 2011.

900    [SOUDERS11] Souders, S., "Making A Mobile Connection", URL
901              http://www.stevesouders.com/blog/2011/09/21/making-a-mobile-connection/

904    [THK98] Touch, J., Heidemann, J., and K. Obraczka, "Analysis of HTTP
905              Performance", USC/ISI Research Report 98-463, December
906              1998.

908 Authors' Addresses

910    Yuchung Cheng
911    Google, Inc.
912    1600 Amphitheatre Parkway
913    Mountain View, CA 94043, USA
914    EMail: ycheng@google.com

916    Jerry Chu
917    Google, Inc.
918    1600 Amphitheatre Parkway
919    Mountain View, CA 94043, USA
920    EMail: hkchu@google.com

922    Sivasankar Radhakrishnan
923    Google, Inc.
924    1600 Amphitheatre Parkway
925    Mountain View, CA 94043, USA
926    EMail: sivasankar@cs.ucsd.edu

928    Arvind Jain
929    Google, Inc.
930    1600 Amphitheatre Parkway
931    Mountain View, CA 94043, USA
932    EMail: arvind@google.com

934 Acknowledgement

936    Funding for the RFC Editor function is currently provided by the
937    Internet Society.