idnits 2.17.1

draft-gont-tcpm-tcp-timestamps-04.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** You're using the IETF Trust Provisions' Section 6.b License Notice from
     12 Sep 2009 rather than the newer Notice from 28 Dec 2009.  (See
     https://trustee.ietf.org/license-info/)

  -- The document has an IETF Trust Provisions (28 Dec 2009) Section 6.c(ii)
     Publication Limitation clause.  If this document is intended for
     submission to the IESG for publication, this constitutes an error.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  == There are 1 instance of lines with non-RFC6890-compliant IPv4 addresses
     in the document.  If these are example addresses, they should be changed.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == The document doesn't use any RFC 2119 keywords, yet seems to have RFC
     2119 boilerplate text.

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer.
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (March 30, 2010) is 5141 days in the past.  Is this
     intentional?


  Checking references for intended status: Best Current Practice
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  ** Obsolete normative reference: RFC 793 (Obsoleted by RFC 9293)

  ** Obsolete normative reference: RFC 1323 (Obsoleted by RFC 7323)

  ** Downref: Normative reference to an Informational RFC: RFC 1337


     Summary: 4 errors (**), 0 flaws (~~), 3 warnings (==), 3 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

TCP Maintenance and Minor                                        F. Gont
Extensions (tcpm)                                                UK CPNI
Internet-Draft                                            March 30, 2010
Intended status: BCP
Expires: October 1, 2010


           Reducing the TIME-WAIT state using TCP timestamps
                 draft-gont-tcpm-tcp-timestamps-04.txt

Abstract

   This document describes an algorithm for processing incoming SYN
   segments that allows higher connection-establishment rates between
   any two TCP endpoints when a TCP timestamps option is present in the
   incoming SYN segment.

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.  This document may not be modified,
   and derivative works of it may not be created, and it may not be
   published except as an Internet-Draft.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on October 1, 2010.

Copyright Notice

   Copyright (c) 2010 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the BSD License.

Table of Contents

   1.  Introduction
   2.  Improved processing of incoming connection requests
   3.  Interaction with various timestamps generation algorithms
   4.  Corner-cases
     4.1.  Connection request after system reboot
   5.  Security Considerations
   6.  IANA Considerations
   7.  Acknowledgements
   8.  References
     8.1.  Normative References
     8.2.  Informative References
   Appendix A.  Changes from previous versions of the draft (to be
                removed by the RFC Editor before publishing this
                document as an RFC)
     A.1.  Changes from draft-gont-tcpm-tcp-timestamps-03
     A.2.  Changes from draft-gont-tcpm-tcp-timestamps-02
     A.3.  Changes from draft-gont-tcpm-tcp-timestamps-01
     A.4.  Changes from draft-gont-tcpm-tcp-timestamps-00
   Author's Address

1.  Introduction

   The Timestamps option, specified in RFC 1323 [RFC1323], allows a TCP
   to include a timestamp value in its segments, which can be used to
   perform two functions: Round-Trip Time Measurement (RTTM), and
   Protection Against Wrapped Sequences (PAWS).

   For the purpose of PAWS, the timestamps sent on a connection are
   required to be monotonically increasing.  While there is no
   requirement that timestamps are monotonically increasing across TCP
   connections, the generation of timestamps such that they are
   monotonically increasing across connections between the same two
   endpoints allows the use of timestamps for improving the handling of
   SYN segments that are received while the corresponding four-tuple is
   in the TIME-WAIT state.  That is, the timestamp option could be used
   to perform heuristics to determine whether to allow the creation of a
   new incarnation of a connection that is in the TIME-WAIT state.

   This use of TCP timestamps is simply an extrapolation of the use of
   Initial Sequence Numbers (ISNs) for the same purpose, as allowed by
   RFC 1122 [RFC1122], and it has been incorporated in a number of TCP
   implementations, such as that included in the Linux kernel [Linux].

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

2.  Improved processing of incoming connection requests

   In a number of scenarios a socket pair may need to be reused while
   the corresponding four-tuple is still in the TIME-WAIT state in a
   remote TCP peer.  For example, a client accessing some service on a
   host may try to create a new incarnation of a previous connection,
   while the corresponding four-tuple is still in the TIME-WAIT state at
   the remote TCP peer (the server).  This may happen if the ephemeral
   port numbers are being reused too quickly, either because of a poor
   ephemeral port selection policy, or simply because of a high
   connection rate to the corresponding service.  In such scenarios, the
   establishment of new connections that reuse a four-tuple that is in
   the TIME-WAIT state would fail.

   In order to avoid this problem, RFC 1122 [RFC1122] (in Section
   4.2.2.13) states that when a connection request is received with a
   four-tuple that is in the TIME-WAIT state, the connection request
   could be accepted if the sequence number of the incoming SYN segment
   is greater than the last sequence number seen on the previous
   incarnation of the connection (for that direction of the data
   transfer).  This requirement aims to prevent the sequence number
   spaces of the new and old incarnations of the connection from
   overlapping, so that old segments from the previous incarnation of
   the connection are not accepted as valid by the new connection.

   The same policy may be extrapolated to TCP timestamps.  That is, when
   a connection request is received with a four-tuple that is in the
   TIME-WAIT state, the connection request could be accepted if the
   timestamp of the incoming SYN segment is greater than the last
   timestamp seen on the previous incarnation of the connection (for
   that direction of the data transfer).

   The following paragraphs summarize the processing of SYN segments
   received for connections in the TIME-WAIT state.  Both the ISN
   (Initial Sequence Number) and the timestamp option (if present) of
   the incoming SYN segment are included in the heuristics performed for
   allowing a high connection-establishment rate.

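   Both policies above rely on a "greater than" test between 32-bit
   values that wrap around.  The sketch below is illustrative only and
   is not part of this document's specification: it assumes the
   comparison is done modulo 2^32, in the way TCP implementations
   commonly compare sequence numbers and timestamp values.  The names
   serial_gt, isn_allows_reuse and timestamp_allows_reuse are
   hypothetical.

```python
MOD = 1 << 32    # sequence numbers and timestamp values are 32 bits wide
HALF = 1 << 31   # half of the number space


def serial_gt(a: int, b: int) -> bool:
    """Return True if a is 'greater than' b modulo 2^32.

    Assumption: a is considered newer than b when it lies less than
    half the number space ahead of b.
    """
    return 0 < ((a - b) % MOD) < HALF


def isn_allows_reuse(syn_seq: int, last_seq: int) -> bool:
    """ISN-based test: the SYN's sequence number must exceed the last
    sequence number seen on the previous incarnation."""
    return serial_gt(syn_seq, last_seq)


def timestamp_allows_reuse(syn_tsval: int, last_tsval: int) -> bool:
    """Timestamp-based test: the SYN's timestamp must exceed the last
    timestamp seen on the previous incarnation."""
    return serial_gt(syn_tsval, last_tsval)
```

   Under this assumption a value that has just wrapped past zero still
   compares as greater than a value near 2^32 - 1, so a wrap alone does
   not make either test fail.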
   Processing of SYN segments received for connections in the
   synchronized states should occur as follows:

   o  If a SYN segment is received for a connection in any synchronized
      state other than TIME-WAIT, respond with an ACK, applying
      rate-throttling.

   o  If the corresponding connection is in the TIME-WAIT state, then,

      *  If the previous incarnation of the connection used timestamps,
         then,

         +  If TCP timestamps would be enabled for the new incarnation
            of the connection, and the timestamp contained in the
            incoming SYN segment is greater than the last timestamp seen
            on the previous incarnation of the connection (for that
            direction of the data transfer), honour the connection
            request (creating a connection in the SYN-RECEIVED state).

         +  If TCP timestamps would be enabled for the new incarnation
            of the connection, the timestamp contained in the incoming
            SYN segment is equal to the last timestamp seen on the
            previous incarnation of the connection (for that direction
            of the data transfer), and the Sequence Number of the
            incoming SYN segment is larger than the last sequence number
            seen on the previous incarnation of the connection (for that
            direction of the data transfer), then honour the connection
            request (creating a connection in the SYN-RECEIVED state).

         +  If TCP timestamps would not be enabled for the new
            incarnation of the connection, but the Sequence Number of
            the incoming SYN segment is larger than the last sequence
            number seen on the previous incarnation of the connection
            (for the same direction of the data transfer), honour the
            connection request (creating a connection in the
            SYN-RECEIVED state).

         +  Otherwise, silently drop the incoming SYN segment, thus
            leaving the previous incarnation of the connection in the
            TIME-WAIT state.

      *  If the previous incarnation of the connection did not use
         timestamps, then,

         +  If TCP timestamps would be enabled for the new incarnation
            of the connection, honour the incoming connection request.

         +  If TCP timestamps would not be enabled for the new
            incarnation of the connection, but the Sequence Number of
            the incoming SYN segment is larger than the last sequence
            number seen on the previous incarnation of the connection
            (for the same direction of the data transfer), then honour
            the incoming connection request (even if the sequence number
            of the incoming SYN segment falls within the receive window
            of the previous incarnation of the connection).

         +  Otherwise, silently drop the incoming SYN segment, thus
            leaving the previous incarnation of the connection in the
            TIME-WAIT state.

   Note:

      In the above explanation, the phrase "TCP timestamps would be
      enabled for the new incarnation of the connection" means that the
      incoming SYN segment contains a TCP Timestamps option (i.e., the
      client has enabled TCP timestamps), and that the SYN/ACK segment
      that would be sent in response to it would also contain a
      Timestamps option (i.e., the server has enabled TCP timestamps).
      In such a scenario, TCP timestamps would be enabled for the new
      incarnation of the connection.

      The "last sequence number seen on the previous incarnation of the
      connection (for the same direction of the data transfer)" refers
      to the last sequence number used by the previous incarnation of
      the connection (for the same direction of the data transfer), and
      not to the last value seen in the Sequence Number field of the
      corresponding segments.  That is, it refers to the sequence number
      corresponding to the FIN flag of the previous incarnation of the
      connection, for that direction of the data transfer.

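   The TIME-WAIT rules above can be sketched as a single decision
   function.  This is a minimal illustration rather than a reference
   implementation: the names TimeWaitState and accept_syn_in_time_wait
   are hypothetical, a missing Timestamps option is modelled as None,
   and the "greater than" comparisons are assumed to be performed
   modulo 2^32.

```python
from dataclasses import dataclass
from typing import Optional

MOD = 1 << 32
HALF = 1 << 31


def serial_gt(a: int, b: int) -> bool:
    """Assumed modulo-2^32 'greater than' comparison."""
    return 0 < ((a - b) % MOD) < HALF


@dataclass
class TimeWaitState:
    """State remembered for the previous incarnation (hypothetical)."""
    used_timestamps: bool       # did the old incarnation use timestamps?
    last_seq: int               # sequence number of the peer's FIN
    last_tsval: Optional[int]   # last timestamp seen from the peer


def accept_syn_in_time_wait(tw: TimeWaitState,
                            syn_seq: int,
                            syn_tsval: Optional[int],
                            local_ts_enabled: bool) -> bool:
    """Return True to honour the SYN, False to silently drop it."""
    # Timestamps "would be enabled" only if the SYN carries the option
    # and the SYN/ACK sent in response would carry it as well.
    ts_enabled = syn_tsval is not None and local_ts_enabled

    if tw.used_timestamps:
        if ts_enabled:
            if serial_gt(syn_tsval, tw.last_tsval):
                return True
            # Equal timestamps: fall back to the sequence number test.
            return (syn_tsval == tw.last_tsval
                    and serial_gt(syn_seq, tw.last_seq))
        return serial_gt(syn_seq, tw.last_seq)

    # The previous incarnation did not use timestamps.
    if ts_enabled:
        return True
    return serial_gt(syn_seq, tw.last_seq)
```

   For example, with TimeWaitState(True, 1000, 500), a SYN carrying
   timestamp 501 is honoured regardless of its sequence number, while a
   SYN carrying timestamp 499 is dropped even if its sequence number is
   larger than 1000.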
   Many implementations do not include the TCP timestamp option when
   performing the above heuristics, thus imposing stricter constraints
   on the generation of Initial Sequence Numbers, the average data
   transfer rate of the connections, and the amount of data transferred
   with them.  RFC 793 [RFC0793] states that the ISN generator should be
   incremented roughly once every four microseconds (i.e., roughly
   250000 times per second).  As a result, any connection that transfers
   more than 250000 bytes of data at more than 250 KB/s could lead to
   scenarios in which the last sequence number seen on a connection that
   moves into the TIME-WAIT state is still greater than the sequence
   number of an incoming SYN segment that aims at creating a new
   incarnation of the same connection.  In those scenarios, the 4.4BSD
   heuristics would fail, and therefore the connection request would
   usually time out.  By including the TCP timestamp option in the
   heuristics described above, all these constraints are greatly
   relaxed.

   It is clear that the use of TCP timestamps for the heuristics
   described above benefits from timestamps that are monotonically
   increasing across connections between the same two TCP endpoints.

3.  Interaction with various timestamps generation algorithms

   The algorithm proposed in Section 2 clearly benefits from timestamps
   that are monotonically-increasing across connections to the same
   end-point.  In particular, generating monotonically-increasing
   timestamps is important for TCPs that perform the active open, as
   those are the timestamps that will be used by the proposed
   algorithm.

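   To illustrate the point, the sketch below compares two hypothetical
   timestamp sources at a TCP that performs the active open: a single
   clock shared by all connections, and a clock with a per-connection
   offset.  Neither scheme is proposed by this document; the tick
   values and offsets are made up for the example, and the comparison
   is assumed to be performed modulo 2^32.

```python
MOD = 1 << 32
HALF = 1 << 31


def serial_gt(a: int, b: int) -> bool:
    """Assumed modulo-2^32 'greater than' comparison."""
    return 0 < ((a - b) % MOD) < HALF


def global_clock_tsval(now_ticks: int) -> int:
    """One timestamp clock shared by every connection (hypothetical)."""
    return now_ticks % MOD


def per_connection_tsval(now_ticks: int, offset: int) -> int:
    """Clock plus a per-connection offset (hypothetical scheme)."""
    return (now_ticks + offset) % MOD


# The previous incarnation sent its last segment at tick 1000; the new
# SYN is sent at tick 1200.
last_global = global_clock_tsval(1000)
syn_global = global_clock_tsval(1200)

# Made-up offsets: 50000 for the old connection, 300 for the new one.
last_offset = per_connection_tsval(1000, 50_000)
syn_offset = per_connection_tsval(1200, 300)

# Shared clock: the SYN timestamp is newer, so the check passes.
global_ok = serial_gt(syn_global, last_global)

# Per-connection offsets: the SYN timestamp looks older, so the
# timestamp check alone does not allow reuse of the four-tuple.
offset_ok = serial_gt(syn_offset, last_offset)
```

   With these made-up values the shared clock passes the timestamp
   check while the per-connection offsets do not, in which case the
   outcome depends on the remaining rules of Section 2.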
   While monotonically-increasing timestamps ensure that the proposed
   algorithm will be able to reduce the TIME-WAIT state of a previous
   incarnation of a connection, implementation of the algorithm does not
   imply by itself a requirement on the timestamps generation algorithm
   of other TCPs.

   In the worst-case scenario, an incoming SYN corresponding to a new
   incarnation of a connection in the TIME-WAIT state contains a
   timestamp that is smaller than the last timestamp seen on the
   previous incarnation of the connection, the heuristics fail, and the
   result is no worse than the current state-of-affairs.  That is,
   either:

   o  The TIME-WAIT state is assassinated, with the connection request
      being rejected (as specified in [RFC0793]), or,

   o  The SYN segment is ignored (as specified in [RFC1337]), and thus
      the connection request times out, or is accepted after future
      retransmissions of the SYN.

   Some stacks may implement timestamps generation algorithms that do
   not lead to monotonically-increasing timestamps across connections
   with the same remote endpoint.  An example of such algorithms is the
   one described in [RFC4987] and [Opperman], which allows the
   implementation of extended TCP SYN cookies.

   Note:

      It should be noted that this algorithm could co-exist with an
      algorithm for generating timestamps such that they are
      monotonically-increasing.  Monotonically-increasing timestamps
      could be generated for TCPs that perform the active open, while
      timestamps for TCPs that perform the passive open could be
      generated according to [Opperman].

4.  Corner-cases

4.1.  Connection request after system reboot

   The question was raised on the tcpm mailing-list as to how this
   algorithm would operate in case a computer reboots, keeps the same IP
   address, loses memory of the previous timestamps, and then tries to
   reestablish a previous connection.

   Firstly, as specified in [RFC0793], hosts must not establish new
   connections for a period of 2*MSL after they boot (this is the "quiet
   time" concept).  As a result, specs-wise, this scenario should never
   occur.

   If a host does not comply with the "quiet time" concept, then the
   possible scenarios are:

   o  If the selected timestamp for the new connection is
      monotonically-increasing with respect to the last timestamp seen
      on the previous incarnation of the connection, the TIME-WAIT state
      is tossed, and the new connection request succeeds.

   o  Otherwise, the connection request may time out or be rejected
      (depending on whether the workaround described in [RFC1337] is
      implemented or not).  This case corresponds to the current
      state-of-affairs without the algorithm proposed in this document.

5.  Security Considerations

   While the algorithm described in this document for processing
   incoming SYN segments would benefit from TCP timestamps that are
   monotonically-increasing across connections, this document does not
   propose any specific algorithm for generating timestamps, nor does it
   require monotonically-increasing timestamps across connections.

   [CPNI-TCP] contains a detailed discussion of the security
   implications of TCP timestamps.

6.  IANA Considerations

   This document has no actions for IANA.

7.  Acknowledgements

   The author of this document would like to thank (in alphabetical
   order) Mark Allman, Christian Huitema, Alfred Hoenes, Eric Rescorla,
   Joe Touch, and Alexander Zimmermann for providing valuable feedback
   on an earlier version of this document.

   Additionally, the author would like to thank David Borman for a
   fruitful discussion on TCP timestamps at IETF 73.

   Finally, the author would like to thank the United Kingdom's Centre
   for the Protection of National Infrastructure (UK CPNI) for their
   continued support.

8.  References

8.1.  Normative References

   [RFC0793]  Postel, J., "Transmission Control Protocol", STD 7,
              RFC 793, September 1981.

   [RFC1122]  Braden, R., "Requirements for Internet Hosts -
              Communication Layers", STD 3, RFC 1122, October 1989.

   [RFC1323]  Jacobson, V., Braden, B., and D. Borman, "TCP Extensions
              for High Performance", RFC 1323, May 1992.

   [RFC1337]  Braden, B., "TIME-WAIT Assassination Hazards in TCP",
              RFC 1337, May 1992.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

8.2.  Informative References

   [CPNI-TCP]
              CPNI, "Security Assessment of the Transmission Control
              Protocol (TCP)", http://www.cpni.gov.uk/Docs/
              tn-03-09-security-assessment-TCP.pdf, 2009.

   [Linux]    The Linux Project, "http://www.kernel.org".

   [Opperman]
              Oppermann, A., "FYI: Extended TCP syncookies in
              FreeBSD-current", Post to the tcpm mailing-list.
              Available at: http://www.ietf.org/mail-archive/web/tcpm/
              current/msg02251.html, 2006.

   [RFC4987]  Eddy, W., "TCP SYN Flooding Attacks and Common
              Mitigations", RFC 4987, August 2007.

Appendix A.  Changes from previous versions of the draft (to be removed
             by the RFC Editor before publishing this document as an
             RFC)

A.1.  Changes from draft-gont-tcpm-tcp-timestamps-03

   o  Changed the document title.

   o  Removed all the text related to the algorithm earlier proposed for
      timestamps generation.

   o  Addressed comments received from Alexander Zimmermann, Christian
      Huitema, Joe Touch, and others.

A.2.  Changes from draft-gont-tcpm-tcp-timestamps-02

   o  Minor edits (the I-D was just about to expire, so it was
      resubmitted with almost no changes).

A.3.  Changes from draft-gont-tcpm-tcp-timestamps-01

   o  Version -01 of the draft had expired, and hence the I-D is
      resubmitted to make it available again (no changes).

A.4.  Changes from draft-gont-tcpm-tcp-timestamps-00

   o  Fixed author's affiliation.

   o  Addressed feedback submitted by Alfred Hoenes (see:
      http://www.ietf.org/mail-archive/web/tcpm/current/msg04281.html),
      plus nits sent by Alfred off-list.

Author's Address

   Fernando Gont
   UK Centre for the Protection of National Infrastructure

   Email: fernando@gont.com.ar
   URI:   http://www.cpni.gov.uk