TSVWG                                                         S. Dawkins
Internet-Draft                                               C. Williams
Expires: April 23, 2004                                        MCSR Labs
                                                        October 24, 2003


              End-to-end, Implicit "Link-Up" Notification
                  draft-dawkins-trigtran-linkup-01.txt

Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that other
   groups may also distribute working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on April 23, 2004.

Copyright Notice

   Copyright (C) The Internet Society (2003). All Rights Reserved.

Abstract

   The Performance Implications of Link Characteristics [PILC] working
   group is recommending an end-to-end implicit notification when an
   access link outage ends. This document codifies the "Link Up
   Notification" for TCP.

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

1. Introduction

   The Transmission Control Protocol (TCP) [RFC793] uses a
   retransmission timer to ensure data delivery in the absence of any
   feedback from a remote data receiver, and prescribes an "exponential
   backoff" for this timer in cases where retransmissions are also
   unacknowledged. This timer can grow to a very large value (the
   retransmission timer in deployed implementations is often capped at
   64 seconds, and even this limit isn't required by standards-track
   specifications).

   This exponential backoff is necessary to prevent sustained congestion
   (if loss occurs due to congestion), but may provide an unnecessarily
   unpleasant user experience (if the loss occurs due to link outages in
   a wireless environment).

   The Performance Implications of Link Characteristics [PILC] working
   group is recommending an end-to-end implicit notification when an
   access link outage ends [LINK, section 8.2]. The goal is to allow
   sending transports to retransmit in a timely fashion without
   modifying the exponential backoff mechanism. This notification was
   well supported at the IETF 56 TRIGTRAN BoF [TRIGTRAN56].

   PILC is not chartered to propose protocol changes, so this proposal
   is targeted for the Transport Area Working Group (TSVWG).

   This note describes a method of "short-circuiting" a "backed-off"
   retransmission timer in a case where a TCP detects that a local
   interface has become operational, so that a sender is notified that
   another retransmission attempt may be appropriate. The TCP using the
   interface sends a "Link Up Notification" (or "LUN") to its peer.

2. Problem Statement

   The Transmission Control Protocol (TCP) [RFC793] uses a
   retransmission timer to ensure data delivery in the absence of any
   feedback from a remote data receiver. This timer, called the
   retransmission timeout (RTO), is calculated using an algorithm
   specified in [RFC2988].

   When an RTO occurs, the sender retransmits an unacknowledged segment.
   If this retransmitted segment is also unacknowledged, the sender
   waits twice as long before attempting an additional retransmission,
   and this doubling compounds with each successive retransmission that
   does not result in an acknowledgement from the receiver.

   The initial value of RTO is 3 seconds, and subsequent values during
   normal operation approach a smoothed average of the RTT (plus a
   factor based on the variance in RTT), with a lower bound of 1 second.

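   As an illustration only, the doubling described above can be
   sketched in Python. The function name, its parameters, and the
   seven-attempt example are assumptions made for this sketch, not part
   of any TCP specification; the 3-second initial RTO and the optional
   64-second cap are the figures cited in this document.

```python
def rto_backoff_schedule(initial_rto=3.0, max_rto=None, attempts=6):
    """Return the wait, in seconds, before each successive retransmission.

    initial_rto: RTO in effect when the first timeout fires (assumed input).
    max_rto:     optional implementation cap; None means no cap is applied.
    attempts:    number of retransmission attempts to list.
    """
    schedule = []
    rto = initial_rto
    for _ in range(attempts):
        schedule.append(rto)
        rto *= 2  # an unacknowledged retransmission doubles the timer
        if max_rto is not None:
            rto = min(rto, max_rto)
    return schedule


waits = rto_backoff_schedule(initial_rto=3.0, max_rto=64.0, attempts=7)
print(waits)       # [3.0, 6.0, 12.0, 24.0, 48.0, 64.0, 64.0]
print(sum(waits))  # 221.0
```

   With these inputs, the seventh retransmission is not sent until 221
   seconds after the original transmission, which is the kind of delay
   a LUN is meant to shorten.
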
   When a segment is lost, and cannot be recovered by other means (for
   example, Fast Retransmit), the RTO used to trigger the first
   retransmission attempt will be as short as is "reasonable" - the RTO
   is calculated based on the measured RTT, so the RTO will happen with
   a reasonable expectation that no acknowledgement for data sent before
   RTO will be received after RTO. This might be characterized as "as
   soon as possible, but no sooner".

   All well and good, if the retransmitted segment is acknowledged. If
   it is not acknowledged, the TCP will wait twice as long before
   retransmitting again, and will continue to double the RTO interval
   each time its attempt to retransmit fails.

   This behavior is conservative, ensuring that sending TCPs "back off"
   in the presence of path congestion. This desirable property comes at
   a price - current RTO values quickly increase into the tens of
   seconds between retransmission attempts, a painfully slow interval if
   a human being is "in the loop". BSD-based TCPs finally "cap" the
   maximum RTO value at 64 seconds, but this "cap" is not required
   [RFC2988] - conformant TCPs are allowed to continue to increase RTO
   into multiple minutes between retransmission attempts.

   If an RTO has happened because of path congestion, high and rising
   RTO-based periods of "silence" are necessary to ensure that path
   congestion does not remain, or even increase, at a time when the
   sending TCP is not receiving any feedback from the receiver.

   If an RTO has happened because of an access link failure, an
   all-too-common situation when the access link is a wireless link, and
   the access link becomes available again, the unexpired portion of the
   full RTO period is not required to prevent sustained congestion,
   because no congestion was occurring. However, today's sending TCPs
   cannot know this is the case, have no indication that the RTO is
   caused by an access link failure, and must make the conservative
   assumption that lost packets are being lost due to congestion.

   It is near-axiomatic that a "human in the loop" will abandon any
   operation leading to minutes of inactivity and "try again" - for
   instance, pressing the "stop" and "reload" buttons on an HTTP
   browser. These operations often reset or abandon existing TCP
   connections, causing TCPs to discard learned path characteristics,
   and add packets (SYN/SYN-ACK on new connections, etc.) to the
   connection path. If it's possible to prevent this, it's desirable
   to do so.

2.1 A Historical Note: "Kicking" TCP

   The IETF PILC Working Group is recommending retransmission of packets
   on an interface that has returned to operational status, in [LINK].
   [LINK] documents informal practice, but additional details are
   required for standards-track TCPs.

   "Kicking TCP" takes its name from Phil Karn's posting to the PILC
   mailing list, proposing that routers driving subnetworks subject to
   lengthy outages "try to hold onto the last IP packet of each flow
   when a link goes down and forward it to its destination when the link
   comes back up" [LINKNOTE].

   This document takes "Kicking TCP" as a starting point. It extends
   "Kicking TCP" by adding sender-side behavior for
   apparently-duplicated packets received on an RTOed TCP connection.

2.2 Transport and Deployability Considerations

   Ideally, a "Link Up Notification" (or "LUN") would be accomplished
   using an ICMP message, but in today's Internet, an end-to-end TCP
   packet for an existing connection is more likely to "arrive" at its
   destination across border gateways, firewalls, and NATs. "Kicking
   TCP" takes advantage of this - the LUN is exactly a packet that has
   already been transmitted on an existing connection path.

2.3 Applicability Statement

   Hosts supporting TCP-based applications over subnetwork interfaces
   subject to multi-second outages MAY perform the actions described in
   Section 3. These actions are more attractive for TCP implementations
   used with "human-in-the-loop" applications, but are safe for any
   TCP-based implementation.

   All hosts supporting TCP-based applications SHOULD perform the
   actions described in Section 4.

3. When a Local Interface Returns to "UP"

   If a host contains a local interface that is subject to frequent and
   lengthy outages, the host subnetwork implementation MAY retain a copy
   of "the last" packet transmitted on each TCP connection.

   When the subnetwork implementation detects that a local interface has
   returned to "UP" status, the subnetwork implementation MAY retransmit
   the last packet stored for each TCP connection.

3.1 Layering Violation Tradeoffs

   This proposal casually assumes that subnetwork implementations can
   track TCP connections between two end hosts. This is a layering
   violation.

   If an implementation finds it more convenient to provide "local link
   up" indications to its own TCP, LUN functionality can be implemented
   in the TCP/IP stack.

   Not all subnetwork implementations are able to distinguish between
   TCP connections. In this case, the subnetwork may choose to store one
   packet per destination host.

   TCP source and destination port numbers will be masked when the host
   is using IPsec Encapsulating Security Payload [ESP], because this
   cryptographic privacy mechanism obscures these fields from the TCP/IP
   "pseudo header". In these cases, the subnetwork may also choose to
   store one packet per destination host.

   If a host is storing one packet per destination host, it should be
   the most recently transmitted packet, to maximize the probability
   that a LUN will restart an active TCP connection.

3.2 Stopping the Babbling

   LUNs are intended as an end-to-end implicit notification to a peer
   TCP, not a reliable signal. If a LUN is also lost due to a new link
   outage, no additional LUNs will be sent unless the local interface
   "cycles" again.

   Some subnetwork technologies can cycle between operational and
   non-operational status very rapidly. The authors have been informed,
   in a private conversation [BAPC], of a scenario with more than ten
   802.11 "link up" transitions per second. To prevent "LUN storms",
   hosts MUST wait at least one second (the minimum RTO value) after an
   interface becomes operational before sending a LUN.

   Modified hosts MUST NOT send LUNs more frequently than once every
   three seconds. This restriction matches the RTO period for a new TCP
   connection, so is assumed to be "safe enough".

4. When an RTOed TCP Sender Receives a LUN

   The LUN described in Section 3 will contain an acknowledgement
   sequence number, if the TCP connection has advanced to the
   ESTABLISHED state. There are several possibilities (using
   [RFC793]-style notation):

   1.  SND.NXT < SEG.ACK - in this case, the receiver has retransmitted
       an acknowledgement for a segment that hasn't been sent yet.

   2.  SND.UNA < SEG.ACK <= SND.NXT - in this case, the receiver has
       retransmitted a "new" ACK that the sender has not seen. The TCP
       would process this segment normally - it would remove the
       acknowledged segments from the retransmission queue and perform
       slow start (since the connection is already in RTO).

   3.  SEG.ACK <= SND.UNA - in this case, the receiver has retransmitted
       a "duplicate" ACK that the sender has seen previously.
       In today's standard-conformant TCPs, this segment would be
       ignored (the sender would assume the ACK has been duplicated or
       reordered by the IP network). This memo adds the following TCP
       mechanism: for a connection in RETRANSMISSION-WAIT, the sending
       TCP SHOULD perform slow start.

   OPEN ISSUE: should we tighten the criteria for a LUN, so that we only
   respond to a LUN that duplicates the "most recent" ACK received? Our
   sense is that if we got an ACK before the link went inactive, we
   should expect to get that ACK again as a LUN when the link becomes
   active again, and not some earlier ACK (yes, IP networks can reorder
   packets, but during RTO, the sender sends only one packet into the
   network, and older packets shouldn't still be active in the network).
   But responding to earlier ACKs as LUNs wouldn't be much of a risk,
   because LUN has no effect except during RTO anyway.

5. Security Considerations

   This memo describes a (small) change in the behavior of TCP, the most
   widely used transport protocol on the Internet today.

   The procedures defined in this memo will cause sending hosts to
   retransmit one packet per RTOed connection before RTO timers would
   have expired (when the sending host would have retransmitted one
   packet per connection anyway).

   The procedures defined in this memo may cause a TCP to "give up" on
   an RTOed connection more rapidly than it would have previously (for
   instance, modified BSD-derived sending TCPs may still abandon a TCP
   connection after 12 attempted retransmissions, but the 12
   retransmissions may take place over a shorter time interval if LUNs
   cause retransmissions to take place before the sender's RTO timer
   expires).

   It is possible to spoof LUNs.
   For this to work, an attacker would identify a TCP connection that
   has experienced RTO, and send a forged packet with appropriate
   addresses and port numbers, and reasonable sequence numbers, to the
   TCP sender. This seems like a lot of work to generate a single TCP
   segment retransmission followed by Slow Start (the effect of a LUN) -
   an attacker with this capability could simply start sending an ACK
   stream today, and cause more packets to enter the network.

   The authors assume that fully-backed-off TCP connections for
   interactive applications will often be abandoned anyway, resulting in
   additional traffic (SYN/SYN-ACKs, etc.), so the tiny increase in
   traffic from a single LUN would be outweighed by the traffic avoided
   in these situations.

6. IANA Considerations

   There are no IANA considerations for this document.

7. Acknowledgements

   We want to clearly acknowledge Phil Karn as the person who brought
   "Kicking TCP" to the PILC working group.

   We want to thank Mark Allman and Bernard Aboba for a number of
   helpful comments on previous variants of this discussion.

Authors' Addresses

   Spencer Dawkins
   MCSR Labs
   1547 Rivercrest Blvd.
   Allen, TX 75002
   US

   Phone: +1-972-727-9834
   EMail: spencer@mcsr-labs.org

   Carl Williams
   MCSR Labs
   3790 El Camino Real
   Palo Alto, CA 94306
   US

   Phone: +1-650-279-5903
   EMail: carlw@mcsr-labs.org

Appendix A. References

   [BAPC]: Bernard Aboba, private conversation at IETF 57

   [LINK]: "Advice for Internet Subnetwork Designers", Phil Karn
      (editor), February 2003 [draft-ietf-pilc-link-design-13.txt, work
      in progress]

   [LINKNOTE]: "Kicking TCP", posting on PILC mailing list by Phil Karn,
      March 7, 2000 [http://pilc.grc.nasa.gov/list/archive/0691.html]

   [PILC]: "Performance Implications of Link Characteristics", IETF
      Working Group
      [http://www.ietf.org/html.charters/pilc-charter.html]

   [RFC793]: "Transmission Control Protocol", J. Postel, September 1981
      [ftp://ftp.rfc-editor.org/in-notes/rfc793.txt]

   [RFC2119]: "Key words for use in RFCs to Indicate Requirement
      Levels", S. Bradner, March 1997
      [ftp://ftp.rfc-editor.org/in-notes/rfc2119.txt]

   [RFC2988]: "Computing TCP's Retransmission Timer", V. Paxson, M.
      Allman, November 2000
      [ftp://ftp.rfc-editor.org/in-notes/rfc2988.txt]

   [TRIGTRAN56]: "Triggers for Transport (TRIGTRAN) BoF minutes", March
      2003 [http://www.ietf.org/proceedings/03mar/minutes/trigtran.htm]

Intellectual Property Statement

   The IETF takes no position regarding the validity or scope of any
   intellectual property or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; neither does it represent that it
   has made any effort to identify any such rights. Information on the
   IETF's procedures with respect to rights in standards-track and
   standards-related documentation can be found in BCP-11.
   Copies of claims of rights made available for publication and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementors or users of this
   specification can be obtained from the IETF Secretariat.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights which may cover technology that may be required to practice
   this standard. Please address the information to the IETF Executive
   Director.

Full Copyright Statement

   Copyright (C) The Internet Society (2003). All Rights Reserved.

   This document and translations of it may be copied and furnished to
   others, and derivative works that comment on or otherwise explain it
   or assist in its implementation may be prepared, copied, published
   and distributed, in whole or in part, without restriction of any
   kind, provided that the above copyright notice and this paragraph are
   included on all such copies and derivative works. However, this
   document itself may not be modified in any way, such as by removing
   the copyright notice or references to the Internet Society or other
   Internet organizations, except as needed for the purpose of
   developing Internet standards in which case the procedures for
   copyrights defined in the Internet Standards process must be
   followed, or as required to translate it into languages other than
   English.

   The limited permissions granted above are perpetual and will not be
   revoked by the Internet Society or its successors or assignees.

   This document and the information contained herein is provided on an
   "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
   TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
   BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
   HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
   MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Acknowledgment

   Funding for the RFC Editor function is currently provided by the
   Internet Society.