2	TCP Maintenance and Minor                                  A. Zimmermann
3	Extensions (TCPM) WG                                        A. Hannemann
4	Internet-Draft                                    RWTH Aachen University
5	Intended status: Experimental                             March 30, 2010
6	Expires: October 1, 2010

8	   Making TCP more Robust to Long Connectivity Disruptions (TCP-LCD)
9	                       draft-ietf-tcpm-tcp-lcd-01

11	Abstract

13	   Disruptions in end-to-end path connectivity that last longer than
14	   one retransmission timeout cause suboptimal TCP performance.  The
15	   reason for this performance degradation is that TCP interprets
16	   segment loss induced by long connectivity disruptions as a sign of
17	   congestion, resulting in repeated retransmission timer backoffs.
18	   This, in turn, leads to a delayed detection of the re-establishment
19	   of the connection since TCP waits for the next retransmission timeout
20	   before it attempts a retransmission.

22	   This document proposes an algorithm to make TCP more robust to long
23	   connectivity disruptions (TCP-LCD).  It describes how standard ICMP
24	   messages can be exploited during timeout-based loss recovery to
25	   disambiguate true congestion loss from non-congestion loss caused by
26	   connectivity disruptions.  Moreover, a revert strategy of the
27	   retransmission timer is specified that enables a more prompt
28	   detection of whether or not the connectivity to a previously
29	   disconnected peer node has been restored.  TCP-LCD is a TCP sender-
30	   only modification that effectively improves TCP performance in case
31	   of connectivity disruptions.

33	Status of this Memo

35	   This Internet-Draft is submitted to IETF in full conformance with the
36	   provisions of BCP 78 and BCP 79.

38	   Internet-Drafts are working documents of the Internet Engineering
39	   Task Force (IETF), its areas, and its working groups.  Note that
40	   other groups may also distribute working documents as Internet-
41	   Drafts.

43	   Internet-Drafts are draft documents valid for a maximum of six months
44	   and may be updated, replaced, or obsoleted by other documents at any
45	   time.  It is inappropriate to use Internet-Drafts as reference
46	   material or to cite them other than as "work in progress."

48	   The list of current Internet-Drafts can be accessed at
49	   http://www.ietf.org/ietf/1id-abstracts.txt.

51	   The list of Internet-Draft Shadow Directories can be accessed at
52	   http://www.ietf.org/shadow.html.

54	   This Internet-Draft will expire on October 1, 2010.

56	Copyright Notice

58	   Copyright (c) 2010 IETF Trust and the persons identified as the
59	   document authors.  All rights reserved.

61	   This document is subject to BCP 78 and the IETF Trust's Legal
62	   Provisions Relating to IETF Documents
63	   (http://trustee.ietf.org/license-info) in effect on the date of
64	   publication of this document.  Please review these documents
65	   carefully, as they describe your rights and restrictions with respect
66	   to this document.  Code Components extracted from this document must
67	   include Simplified BSD License text as described in Section 4.e of
68	   the Trust Legal Provisions and are provided without warranty as
69	   described in the BSD License.

71	Table of Contents

73	   1.  Terminology
74	   2.  Introduction
75	   3.  Connectivity Disruption Indication
76	   4.  Connectivity Disruption Reaction
77	     4.1.  Basic Idea
78	     4.2.  Algorithm Details
79	   5.  Discussion of TCP-LCD
80	     5.1.  Retransmission Ambiguity
81	     5.2.  Wrapped Sequence Numbers
82	     5.3.  Packet Duplication
83	     5.4.  Probing Frequency
84	     5.5.  Reaction during Connection Establishment
85	     5.6.  Reaction in Steady-State
86	   6.  Dissolving Ambiguity Issues (the Safe Variant)
87	   7.  Interoperability Issues
88	     7.1.  Detection of TCP Connection Failures
89	     7.2.  Explicit Congestion Notification
90	     7.3.  ICMP for IP version 6
91	     7.4.  TCP-LCD and IP Tunnels
92	   8.  Related Work
93	   9.  IANA Considerations
94	   10. Security Considerations
95	   11. Acknowledgments
96	   12. References
97	     12.1. Normative References
98	     12.2. Informative References
99	   Appendix A.  Changes from previous versions of the draft
100	     A.1.  Changes from draft-ietf-tcpm-tcp-lcd-00
101	     A.2.  Changes from draft-zimmermann-tcp-lcd-02
102	     A.3.  Changes from draft-zimmermann-tcp-lcd-01
103	     A.4.  Changes from draft-zimmermann-tcp-lcd-00
104	   Authors' Addresses

106	1.  Terminology

108	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
109	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
110	   document are to be interpreted as described in [RFC2119].

112	   The reader should be familiar with the algorithm and terminology from
113	   [RFC2988], which defines the standard algorithm that Transmission
114	   Control Protocol (TCP) senders are required to use to compute and
115	   manage their retransmission timer.  In this document, the terms
116	   "retransmission timer" and "retransmission timeout" are used as
117	   defined in [RFC2988].  The retransmission timer ensures data delivery
118	   in the absence of any feedback from the receiver.  The duration of
119	   this timer is referred to as retransmission timeout (RTO).

121	   As defined in [RFC0793], the term "acceptable acknowledgment (ACK)"
122	   refers to a TCP segment that acknowledges previously unacknowledged
123	   data.  The TCP sender state variable "SND.UNA" and the current
124	   segment variable "SEG.SEQ" are used as defined in [RFC0793].  SND.UNA
125	   holds the segment sequence number of the earliest segment that has
126	   not been acknowledged by the TCP receiver (the oldest outstanding
127	   segment).  SEG.SEQ is the segment sequence number of a given segment.

129	   For the purposes of this specification, we define the term "timeout-
130	   based loss recovery" as the state that a TCP sender enters upon the
131	   first timeout of the oldest outstanding segment (SND.UNA) and leaves
132	   upon the arrival of the *first* acceptable ACK.
133	   It is important to note that other documents use a different
134	   interpretation of the term "timeout-based loss recovery".  For
135	   example, the NewReno modification to TCP's Fast Recovery algorithm
136	   [RFC3782] extends the period a TCP sender remains in timeout-based
137	   loss recovery compared to the one defined in this document.  This is
138	   because [RFC3782] attempts to avoid unnecessary multiple Fast
139	   Retransmits that can occur after an RTO.

141	2.  Introduction

143	   Connectivity disruptions can occur in many different situations.  The
144	   frequency of connectivity disruptions depends on the properties of
145	   the end-to-end path between the communicating hosts.  While
146	   connectivity disruptions can occur in traditional wired networks too,
147	   e.g., caused by an unplugged network cable, the likelihood of
148	   occurrence is significantly higher in wireless (multi-hop) networks.
149	   In particular, end-host mobility, network topology changes, and
150	   wireless interference are crucial factors.  In the case of the
151	   Transmission Control Protocol (TCP) [RFC0793], the performance of the
152	   connection can experience a significant reduction compared to a
153	   permanently connected path [SESB05].  This is because TCP, which was
154	   originally designed to operate in fixed and wired networks, generally
155	   assumes that the end-to-end path connectivity is relatively stable
156	   over the connection's lifetime.

158	   Depending on their duration, connectivity disruptions can be
159	   classified into two groups [I-D.schuetz-tcpm-tcp-rlci]: "short" and
160	   "long".  A connectivity disruption is "short" if connectivity returns
161	   before the retransmission timer fires for the first time.  In this
162	   case, TCP recovers lost data segments through Fast Retransmit and
163	   lost acknowledgments (ACK) through successfully delivered later ACKs.
164	   Connectivity disruptions are declared as "long" for a given TCP
165	   connection if the retransmission timer fires at least once before
166	   connectivity is resumed.  Whether or not path characteristics, like
167	   the round trip time (RTT) or the available bandwidth, have changed
168	   when connectivity resumes after a disruption is another important
169	   aspect for TCP's retransmission scheme [I-D.schuetz-tcpm-tcp-rlci].

171	   This document improves TCP's behavior in case of "long connectivity
172	   disruptions".  In particular, it focuses on the period "prior" to the
173	   re-establishment of the connectivity to a previously disconnected
174	   peer node.  The document does not describe any modifications of TCP's
175	   behavior and its congestion control mechanisms [RFC5681] "after"
176	   connectivity has been restored.

178	   When a long connectivity disruption occurs on a TCP connection, the
179	   TCP sender eventually stops receiving acknowledgments.
180	   After the retransmission timer expires, the TCP sender enters the
181	   timeout-based loss recovery and declares the oldest outstanding
182	   segment (SND.UNA) as lost.  Since TCP tightly couples reliability and
183	   congestion control, the retransmission of SND.UNA is triggered
184	   together with the reduction of the transmission rate.  This is based
185	   on the assumption that segment loss is an indication of congestion
186	   [RFC5681].  As long as the connectivity disruption persists, TCP will
187	   repeat this procedure until the oldest outstanding segment has
188	   successfully been acknowledged, or until the connection has timed
189	   out.  TCP implementations that follow the recommended retransmission
190	   timeout (RTO) management of RFC 2988 [RFC2988] double the RTO after
191	   each retransmission attempt.  However, the RTO's growth may be
192	   bounded by an upper limit, the maximum RTO, which is at least 60s,
193	   but may be longer: Linux, for example, uses 120s.  If connectivity is
194	   restored between two retransmission attempts, TCP still has to wait
195	   until the retransmission timer expires before resuming transmission,
196	   since it simply does not have any means to know if the connectivity
197	   has been re-established.  Therefore, depending on when connectivity
198	   becomes available again, this can waste up to a maximum RTO of
199	   possible transmission time.
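
The cost of this waiting can be illustrated with a minimal sketch.  It
assumes a sender that doubles the RTO after every timeout and caps it
at a maximum RTO, as described above.  The function names and the
example values (an RTO of 1 second before the first timeout, a cap of
60 seconds) are illustrative assumptions and are not taken from
RFC 2988 or from any particular implementation.

```python
def backoff_schedule(initial_rto, max_rto, timeouts):
    """Return the RTO (in seconds) armed after each of `timeouts` expirations."""
    schedule = []
    rto = initial_rto
    for _ in range(timeouts):
        rto = min(rto * 2, max_rto)  # exponential backoff, bounded by the maximum RTO
        schedule.append(rto)
    return schedule


def worst_case_wait(schedule):
    """Longest idle period if connectivity returns right after a retransmission."""
    return max(schedule)


# Assumed example values: RTO of 1 s before the first timeout, maximum RTO of 60 s.
schedule = backoff_schedule(initial_rto=1.0, max_rto=60.0, timeouts=8)
print(schedule)
print(worst_case_wait(schedule))
```

Under these assumptions the interval between retransmissions grows
until it reaches the cap, so a path that recovers just after a
retransmission was sent stays unused for up to one maximum RTO.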

201	   This retransmission behavior is not efficient, especially in
202	   scenarios with long connectivity disruptions.  In the ideal case, TCP
203	   would attempt a retransmission as soon as connectivity to its peer
204	   has been re-established.  In this document, we specify a TCP sender-
205	   only modification to provide robustness to long connectivity
206	   disruptions (TCP-LCD).  The memo describes how the standard Internet
207	   Control Message Protocol (ICMP) can be exploited during timeout-based
208	   loss recovery to identify non-congestion loss caused by long
209	   connectivity disruptions.  TCP-LCD's revert strategy of the
210	   retransmission timer enables higher-frequency retransmissions and
211	   thereby a prompt detection when connectivity to a previously
212	   disconnected peer node has been restored.  If no congestion is
213	   present, TCP-LCD approaches the ideal behavior.

215	3.  Connectivity Disruption Indication

217	   If the queue of an intermediate router experiencing a link outage can
218	   buffer all incoming packets, a connectivity disruption will only
219	   cause a variation in delay, which is handled well by TCP
220	   implementations using either Eifel [RFC3522], [RFC4015] or Forward
221	   RTO-Recovery (F-RTO) [RFC5682].  However, if the link outage lasts
222	   for too long, the router experiencing the link outage is forced to
223	   drop packets, and finally to discard the corresponding route.  Means
224	   to detect such link outages include reacting to failed Address
225	   Resolution Protocol (ARP) [RFC0826] queries, unsuccessful link
226	   sensing, and the like.  However, this is solely the responsibility
227	   of the respective router.

229	      Note: The focus of this memo is on introducing a method by which
230	      ICMP messages may be exploited to improve TCP's performance; how
231	      different physical and link layer mechanisms below the network
232	      layer may trigger ICMP destination unreachable messages is out of
233	      scope of this memo.

235	   Provided that no other route to the specific destination exists, the
236	   router will notify the corresponding sending host about the dropped
237	   packets via ICMP destination unreachable messages of code 0 (net
238	   unreachable) or code 1 (host unreachable) [RFC1812].  Therefore, the
239	   sending host can use the ICMP destination unreachable messages of
240	   these codes as an indication for a connectivity disruption, since the
241	   reception of these messages provides evidence that packets were
242	   dropped due to a link outage.

244	   Note that there are also other ICMP destination unreachable messages
245	   with different codes.  Some of them are candidates for connectivity
246	   disruption indications, too, but need further investigation.
247	   Examples are ICMP destination unreachable messages with code 5
248	   (source route failed), code 11 (net unreachable for TOS), or code 12
249	   (host unreachable for TOS) [RFC1812].  On the other hand, codes that
250	   flag hard errors are of no use for the proposed scheme, since TCP
251	   should abort the connection when those are received [RFC1122].  In
252	   the following, the term "ICMP unreachable message" is used as a
253	   synonym for ICMP destination unreachable messages of code 0 or code 1.
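
The code selection described above can be sketched as a small
classifier.  This is only an illustration: the enum, the set names,
and the function name are assumptions introduced here, and the
"candidate" category merely records the codes this memo lists as
needing further investigation.  ICMP type 3 is the destination
unreachable type.

```python
from enum import Enum

ICMP_DEST_UNREACHABLE = 3  # ICMP type of destination unreachable messages


class Indication(Enum):
    DISRUPTION = "connectivity disruption indication"
    CANDIDATE = "candidate, needs further investigation"
    NOT_USED = "not used by the scheme"


DISRUPTION_CODES = {0, 1}      # net unreachable, host unreachable
CANDIDATE_CODES = {5, 11, 12}  # source route failed, net/host unreachable for TOS


def classify(icmp_type, icmp_code):
    """Map an ICMP type/code pair to its role in the scheme (illustrative only)."""
    if icmp_type != ICMP_DEST_UNREACHABLE:
        return Indication.NOT_USED
    if icmp_code in DISRUPTION_CODES:
        return Indication.DISRUPTION
    if icmp_code in CANDIDATE_CODES:
        return Indication.CANDIDATE
    # All remaining codes, including those that flag hard errors.
    return Indication.NOT_USED


print(classify(3, 1))
print(classify(3, 11))
```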

255	   The accurate interpretation of ICMP unreachable messages as a
256	   connectivity disruption indication is complicated by the following
257	   two peculiarities of ICMP messages.  Firstly, they do not necessarily
258	   operate on the same timescale as the packets, i.e., TCP segments that
259	   elicited them.  When a router drops a packet due to a missing route
260	   it will not necessarily send an ICMP unreachable message immediately,
261	   but will rather queue it for later delivery.  Secondly, ICMP messages
262	   are subject to rate limiting, e.g., when a router drops a whole
263	   window of data due to a link outage, it is unlikely to send as many
264	   ICMP unreachable messages as it dropped TCP segments.  Depending on
265	   the load of the router, it may even send no ICMP unreachable messages
266	   at all.  Both peculiarities originate from [RFC1812].

268	   Fortunately, according to [RFC0792], ICMP unreachable messages have
269	   to contain in their body the entire Internet Protocol (IP) header
270	   [RFC0791] of the datagram eliciting the ICMP unreachable message,
271	   plus the first 64 bits of the payload of that datagram.  This allows
272	   the sending host to match the ICMP error message to the connection
273	   that elicited it.  RFC 1812 [RFC1812] augments the requirements and
274	   states that ICMP messages should contain as much of the original
275	   datagram as possible without the length of the ICMP datagram
276	   exceeding 576 bytes.  Therefore, in case of TCP, at least the source
277	   port number, the destination port number, and the 32-bit TCP sequence
278	   number are included.  This allows the originating TCP to demultiplex
279	   the received ICMP message and to identify the faulty connection.
280	   Moreover, it can identify which segment of the respective connection
281	   triggered the ICMP unreachable message, unless there are several
282	   segments in-flight with the same sequence number (see Section 5.1).
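
The demultiplexing step can be sketched as follows, assuming IPv4 and
assuming that the caller has already stripped the 8-byte ICMP header
so that only the quoted datagram remains.  The function name, the
return format, and the sample addresses and ports are hypothetical and
serve only as an illustration.

```python
import socket
import struct

IPPROTO_TCP = 6  # IP protocol number for TCP


def extract_tcp_info(quoted):
    """Parse the datagram quoted in an ICMP unreachable message.

    Returns (src_ip, src_port, dst_ip, dst_port, seq), or None if the
    quoted data is not a usable IPv4/TCP datagram.
    """
    if len(quoted) < 20:
        return None
    version = quoted[0] >> 4
    ihl = (quoted[0] & 0x0F) * 4  # IP header length in bytes
    if version != 4 or ihl < 20 or quoted[9] != IPPROTO_TCP:
        return None
    if len(quoted) < ihl + 8:     # need the first 64 bits of the TCP header
        return None
    src_ip = socket.inet_ntoa(quoted[12:16])
    dst_ip = socket.inet_ntoa(quoted[16:20])
    # The first 64 bits of the TCP header: source port, destination port, sequence number.
    src_port, dst_port, seq = struct.unpack_from("!HHI", quoted, ihl)
    return (src_ip, src_port, dst_ip, dst_port, seq)


# Hypothetical quoted datagram: 20-byte IPv4 header plus 8 bytes of TCP header.
sample = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40, 0, 0, 64, IPPROTO_TCP, 0,
                     socket.inet_aton("192.0.2.1"),
                     socket.inet_aton("198.51.100.7"))
sample += struct.pack("!HHI", 49152, 80, 123456789)
info = extract_tcp_info(sample)
print(info)
```

The address and port pairs select the connection, and the sequence
number is the value that is later compared against SND.UNA.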

284	   A connectivity disruption indication in the form of an ICMP
285	   unreachable message associated with a presumably lost TCP segment
286	   provides strong evidence that the segment was not dropped due to
287	   congestion, but was successfully delivered to the temporary end-point
288	   of the employed path, i.e., the reporting router.  It therefore did
289	   not witness any congestion at least on that part of the path that was
290	   traversed by both the TCP segment eliciting the ICMP unreachable
291	   message and the ICMP unreachable message itself.

293	4.  Connectivity Disruption Reaction

295	   Section 4.1 introduces the basic idea of TCP-LCD.  The complete
296	   algorithm is specified in Section 4.2.

298	4.1.  Basic Idea

300	   The goal of the algorithm is to promptly detect when connectivity to
301	   a previously disconnected peer node has been restored after a long
302	   connectivity disruption, while retaining appropriate behavior in case
303	   of congestion.  TCP-LCD exploits standard ICMP unreachable messages
304	   during timeout-based loss recovery.  This increases TCP's
305	   retransmission frequency by undoing one retransmission timer backoff
306	   whenever an ICMP unreachable message reports on the sequence number
307	   of a presumably lost retransmission.

309	   This approach has the advantage of appropriately reducing the probing
310	   rate in case of congestion.  If either the retransmission itself or
311	   the corresponding ICMP message is dropped, the previously performed
312	   retransmission timer backoff is not undone, which effectively halves
313	   the probing rate.

315	4.2.  Algorithm Details

317	   A TCP sender using RFC 2988 [RFC2988] to compute TCP's retransmission
318	   timer MAY employ the following scheme to avoid over-conservative
319	   retransmission timer backoffs in case of long connectivity
320	   disruptions.  If a TCP sender does implement the following steps, the
321	   algorithm MUST be initiated upon the first timeout of the oldest
322	   outstanding segment (SND.UNA) and MUST be stopped upon the arrival of
323	   the first acceptable ACK.  The algorithm MUST NOT be re-initiated
324	   upon subsequent timeouts for the same segment.  The scheme SHOULD NOT
325	   be used in SYN-SENT or SYN-RECEIVED states [RFC0793] (i.e., during
326	   connection establishment).

328	   A TCP sender that does not employ RFC 2988 [RFC2988] to compute TCP's
329	   retransmission timer SHOULD NOT use TCP-LCD.  We envision that the
330	   scheme could be easily adapted to algorithms other than RFC 2988.
331	   However, we leave this as future work.

333	   In rule (2.5), RFC 2988 [RFC2988] provides the option to place a
334	   maximum value on the RTO.  When a TCP implements this rule to provide
335	   an upper bound for the RTO, that bound SHOULD also be used in the
336	   following algorithm.  In particular, if the RTO is bounded by an
337	   upper limit (maximum RTO), the "MAX_RTO" variable used in this scheme
338	   SHOULD be initialized with this upper limit.  Otherwise, if the RTO
339	   is unbounded, the "MAX_RTO" variable SHOULD be set to infinity.

341	   The scheme specified in this document uses the "BACKOFF_CNT"
342	   variable, whose initial value is zero.  The variable is used to count
343	   the number of performed retransmission timer backoffs during one
344	   timeout-based loss recovery.  Moreover, the "RTO_BASE" variable is
345	   used to recover the previous RTO if the retransmission timer backoff
346	   was unnecessary.  The variable is initialized with the RTO upon
347	   initiation of timeout-based loss recovery.

349	   (1)  Before TCP updates the variable "RTO" when it initiates timeout-
350	        based loss recovery, set the variables "BACKOFF_CNT" and
351	        "RTO_BASE" as follows:

353	           BACKOFF_CNT := 0;
354	           RTO_BASE := RTO.

356	        Proceed to step (R).

358	   (R)  This is a placeholder for standard TCP's behavior in case the
359	        retransmission timer has expired.  In particular, if RFC 2988
360	        [RFC2988] is used, steps (5.4) - (5.6) of that algorithm go
361	        here.  Proceed to step (2).

363	   (2)  To account for the expiration of the retransmission timer in the
364	        previous step (R), increment the "BACKOFF_CNT" variable by one:

366	           BACKOFF_CNT := BACKOFF_CNT + 1.

368	   (3)  Wait either

370	           for the expiration of the retransmission timer.  When the
371	           retransmission timer expires, proceed to step (R);

373	           or for the arrival of an acceptable ACK.  When an acceptable
374	           ACK arrives, proceed to step (A);

376	           or for the arrival of an ICMP unreachable message.  When the
377	           ICMP unreachable message "ICMP_DU" arrives, proceed to step
378	           (4).

380	   (4)  If "BACKOFF_CNT > 0", i.e., if at least one retransmission timer
381	        backoff can be undone, then

383	           proceed to step (5);

385	        else

387	           proceed to step (3).

389	   (5)  Extract the TCP segment header included in the ICMP unreachable
390	        message "ICMP_DU":

392	           SEG := Extract(ICMP_DU).

394	   (6)  If "SEG.SEQ == SND.UNA", i.e., if the TCP segment "SEG"
395	        eliciting the ICMP unreachable message "ICMP_DU" carries the
396	        sequence number of a retransmission, then

398	           proceed to step (7);

400	        else

402	           proceed to step (3).

404	   (7)  Undo the last retransmission timer backoff:

406	           BACKOFF_CNT := BACKOFF_CNT - 1;
407	           RTO := min(RTO_BASE * 2^(BACKOFF_CNT), MAX_RTO).

409	   (8)  If the retransmission timer expires due to the undoing in the
410	        previous step (7), then

412	           proceed to step (R);

414	        else

416	           proceed to step (3).

418	   (A)  This is a placeholder for standard TCP's behavior in case an
419	        acceptable ACK has arrived.  No further processing.

421	   When a TCP in steady-state detects a segment loss using the
422	   retransmission timer, it enters the timeout-based loss recovery and
423	   initiates the algorithm (step 1).  It adjusts the slow start
424	   threshold (ssthresh), sets the congestion window (CWND) to one
425	   segment, backs off the retransmission timer, and retransmits the
426	   first unacknowledged segment (step R) [RFC5681], [RFC2988].  To
427	   account for the expiration of the retransmission timer the TCP sender
428	   increments the "BACKOFF_CNT" variable by one (step 2).

430	   In case the retransmission timer expires again (step 3a), a TCP will
431	   repeat the retransmission of the first unacknowledged segment and
432	   back off the retransmission timer once more (step R) [RFC2988] as
433	   well as increment the "BACKOFF_CNT" variable by one (step 2).  Note
434	   that a TCP may implement RFC 2988's [RFC2988] option to place a
435	   maximum value on the RTO that may result in not performing the
436	   retransmission timer backoff.  However, step (2) MUST always and
437	   unconditionally be applied, no matter whether or not the
438	   retransmission timer is actually backed off.  In other words, each
439	   time the retransmission timer expires, the "BACKOFF_CNT" variable
440	   MUST be incremented by one.
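
The interplay between a bounded RTO and the unconditional increment
can be illustrated with a short worked example that applies the
formula from step (7).  The values of RTO_BASE (8 seconds) and MAX_RTO
(60 seconds) and the helper name are assumptions chosen for
illustration.

```python
def recompute_rto(rto_base, backoff_cnt, max_rto):
    """RTO as given by the formula in step (7)."""
    return min(rto_base * 2 ** backoff_cnt, max_rto)


RTO_BASE = 8.0   # assumed RTO (seconds) when loss recovery started
MAX_RTO = 60.0   # assumed upper bound on the RTO

# Four timeouts: step (2) counts every expiration, including the one
# where the RTO is already pinned at MAX_RTO.
backoff_cnt = 0
rto = RTO_BASE
history = []
for _ in range(4):
    rto = min(rto * 2, MAX_RTO)  # bounded backoff performed in step (R)
    backoff_cnt += 1             # step (2), applied unconditionally
    history.append(rto)

# Undo the backoffs one at a time, as step (7) would on matching ICMP messages.
undone = []
while backoff_cnt > 0:
    backoff_cnt -= 1
    undone.append(recompute_rto(RTO_BASE, backoff_cnt, MAX_RTO))

print(history)
print(undone)
```

In this example the fourth timeout does not change the timer value
but is still counted, so the first undo leaves the RTO at the cap and
only the later ones shorten it, down to "RTO_BASE".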

442	   If the first received packet after the retransmission(s) is an
443	   acceptable ACK (step 3b), a TCP will proceed as normal, i.e., slow
444	   start the connection and terminate the algorithm (step A).  Later
445	   ICMP unreachable messages from the just terminated timeout-based loss
446	   recovery are ignored since the ACK clock is already restarting due to
447	   the successful retransmission.

449	   On the other hand, if the first received packet after the
450	   retransmission(s) is an ICMP unreachable message (step 3c), and if
451	   step (4) permits it, a TCP SHOULD undo one backoff for each ICMP
452	   unreachable message reporting an error on a retransmission.  To
453	   decide if an ICMP unreachable message reports on a retransmission,
454	   the sequence number therein is exploited (step 5, step 6).  The undo
455	   is performed by re-calculating the RTO with the decremented
456	   "BACKOFF_CNT" variable (step 7).  This calculation explicitly matches
457	   the (bounded) exponential backoff specified in rule (5.5) of
458	   [RFC2988].

460	   Upon receipt of an ICMP unreachable message that legitimately undoes
461	   one backoff there is the possibility that the shortened
462	   retransmission timer has already expired (step 8).  Then, a TCP
463	   SHOULD retransmit immediately, i.e., an ICMP message clocked
464	   retransmission.  In case the shortened retransmission timer has not
465	   yet expired, TCP MUST wait accordingly.
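
The steps above can be sketched as a small event-driven model.  It is
a simplified illustration under stated assumptions, not a reference
implementation.  The class and method names are hypothetical, the
retransmission timer is modeled only through an "elapsed" argument
(seconds since the timer was last restarted), step (R) is reduced to
the bounded doubling of the RTO, and neither connection establishment
nor the RTO recalculation that follows an acceptable ACK is modeled.

```python
import math


class TcpLcdSender:
    """Illustrative model of steps (1) to (8); all names are hypothetical."""

    def __init__(self, rto, max_rto=math.inf):
        self.rto = rto
        self.max_rto = max_rto      # MAX_RTO, infinity if the RTO is unbounded
        self.rto_base = rto         # RTO_BASE
        self.backoff_cnt = 0        # BACKOFF_CNT
        self.in_recovery = False    # True while in timeout-based loss recovery
        self.retransmissions = 0

    def on_timer_expired(self):
        if not self.in_recovery:
            # Step (1): initialize on the first timeout only.
            self.in_recovery = True
            self.backoff_cnt = 0
            self.rto_base = self.rto
        # Step (R): stand-in for the standard reaction (retransmit, back off).
        self.retransmissions += 1
        self.rto = min(self.rto * 2, self.max_rto)
        # Step (2): count the expiration unconditionally.
        self.backoff_cnt += 1

    def on_acceptable_ack(self):
        # Step (A): the algorithm stops; standard TCP processing continues.
        self.in_recovery = False

    def on_icmp_unreachable(self, seg_seq, snd_una, elapsed):
        """Return True if one backoff was undone."""
        # Step (4): nothing to undo outside recovery or with a zero counter.
        if not self.in_recovery or self.backoff_cnt == 0:
            return False
        # Steps (5) and (6): the message must report on SND.UNA.
        if seg_seq != snd_una:
            return False
        # Step (7): undo the last backoff.
        self.backoff_cnt -= 1
        self.rto = min(self.rto_base * 2 ** self.backoff_cnt, self.max_rto)
        # Step (8): the shortened timer may already have expired.
        if elapsed >= self.rto:
            self.on_timer_expired()
        return True


sender = TcpLcdSender(rto=1.0, max_rto=60.0)
for _ in range(3):
    sender.on_timer_expired()
first = sender.on_icmp_unreachable(seg_seq=1000, snd_una=1000, elapsed=0.5)
state_after_first = (sender.backoff_cnt, sender.rto)
print(first, state_after_first)
```

In this usage example, three timeouts raise the RTO from 1 to 8
seconds; the matching ICMP unreachable message then undoes one backoff
and shortens the RTO to 4 seconds.  Because only 0.5 seconds have
elapsed, the sender keeps waiting instead of retransmitting at once.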

467	5.  Discussion of TCP-LCD

469	   TCP-LCD takes care to react to connectivity disruption indications in
470	   the form of ICMP unreachable messages only during timeout-based loss
471	   recovery.  Therefore, TCP's behavior is not altered when either no
472	   ICMP unreachable messages are received, or the retransmission timer
473	   of the TCP sender did not expire since the last received acceptable
474	   ACK.  Thus, by definition, the algorithm triggers only in case of
475	   long connectivity disruptions.

477	   Only such ICMP unreachable messages that report on the sequence
478	   number of a retransmission, i.e., report on SND.UNA, are evaluated by
479	   TCP-LCD.  All other ICMP unreachable messages are ignored.  The
480	   arrival of those ICMP unreachable messages provides strong evidence
481	   that the retransmissions were not dropped due to congestion but were
482	   successfully delivered to the temporary end-point of the employed
483	   path, i.e., the reporting router.  In other words, there is no
484	   evidence for any congestion at least on that very part of the path
485	   that was traveled by both the TCP segment eliciting the ICMP
486	   unreachable message and the ICMP unreachable message itself.

488	   However, there are some situations where TCP-LCD makes a false
489	   decision and incorrectly undoes a retransmission timer backoff.  This
490	   can happen even though the received ICMP unreachable message reports
491	   on the segment number of a retransmission (SND.UNA), because the TCP
492	   segment that elicited the ICMP unreachable message may either not be
493	   a retransmission (Section 5.1), or does not belong to the current
494	   timeout-based loss recovery (Section 5.2).  Finally, packet
495	   duplication (Section 5.3) can also spuriously trigger the algorithm.

497	   Section 5.4 discusses possible probing frequencies, while Section 5.6
498	   describes the motivation for not reacting to ICMP unreachable
499	   messages while TCP is in steady-state.

501	5.1.  Retransmission Ambiguity

503	   Historically, the retransmission ambiguity problem [Zh86], [KP87] is
504	   the TCP sender's inability to distinguish whether the first
505	   acceptable ACK after a retransmission refers to the original
506	   transmission or to the retransmission.  This problem occurs after
507	   both a Fast Retransmit and a timeout-based retransmit.  However,
508	   modern TCP implementations can eliminate the retransmission ambiguity
509	   with either the help of Eifel [RFC3522], [RFC4015] or Forward RTO-
510	   Recovery (F-RTO) [RFC5682].

512	   The revert strategy of the given algorithm suffers from a form of
513	   retransmission ambiguity, too.  In contrast to the above case, TCP
514	   suffers from ambiguity regarding ICMP unreachable messages received
515	   during timeout-based loss recovery.  With the TCP segment number
516	   included in the ICMP unreachable message, a TCP sender is not able to
517	   determine if the ICMP unreachable message refers to the original
518	   transmission or to any of the timeout-based retransmissions.  That
519	   is, there is an ambiguity which TCP segment an ICMP unreachable
520	   message reports on.

522	   However, for the algorithm this ambiguity is not considered to be a
523	   problem.  The assumption that a received ICMP message provides
524	   evidence that a non-congestion loss caused by the connectivity
525	   disruption was wrongly considered a congestion loss still holds,
526	   regardless of which TCP segment, transmission or retransmission, the
527	   message refers to.

529	5.2.  Wrapped Sequence Numbers

531	   Besides the ambiguity whether a received ICMP unreachable message
532	   refers to the original transmission or to any of the retransmissions,
533	   there is another source of ambiguity about the TCP sequence numbers
534	   contained in ICMP unreachable messages.  For high-bandwidth paths
535	   like modern gigabit links, the sequence space may wrap rather quickly,
536	   thereby allowing the possibility that delayed ICMP unreachable
537	   messages - a router dropping packets due to a link outage is not
538	   obliged to send ICMP unreachable messages in a timely manner
539	   [RFC1812] - may coincidentally fit as valid input in the proposed
540	   scheme.  As a result, the scheme may incorrectly undo retransmission
541	   timer backoffs.  Chances for this to happen are minuscule, since a
542	   particular ICMP message would need to contain the exact sequence
543	   number of the current oldest outstanding segment (SND.UNA), while at
544	   the same time TCP is in timeout-based loss recovery.  However, two
545	   "worst case" scenarios for the algorithm are possible:

547	   For instance, consider a steady state TCP connection, which will be
548	   disrupted at an intermediate router R due to a link outage.  Upon the
549	   expiration of the RTO, the TCP sender enters the timeout-based loss
550	   recovery and starts to retransmit the earliest segment that has not
551	   been acknowledged (SND.UNA).  For some reason, router R delays all
552	   corresponding ICMP unreachable messages so that the TCP sender backs
553	   off the retransmission timer normally without any undoing.  At
554	   the end of the connectivity disruption, the TCP sender eventually
555	   detects the re-establishment, leaves the scheme and finally the
556	   timeout-based loss recovery, too.  A sequence number wrap-around
557	   later, the connectivity between the two peers is disrupted again, but
558	   this time due to congestion and exactly at the time at which the
559	   current SND.UNA matches the SND.UNA from the previous cycle.  If
560	   router R emits the delayed ICMP unreachable messages now, the TCP
561	   sender would incorrectly undo retransmission timer backoffs.  As the
562	   TCP sequence number contains 32 bits, the probability of this
563	   scenario is at most 1/2^32.  Given sufficiently many retransmissions
564	   in the first timeout-based loss recovery, the corresponding ICMP
565	   unreachable messages could reduce the RTO in the second recovery at
566	   most to "RTO_BASE".  However, once the ICMP unreachable messages are
567	   depleted, the standard exponential backoff will be performed.  Thus,
568	   the congestion response will only be delayed by some false
569	   retransmissions.

571	   Similar to the above, consider the case where a steady state TCP
572	   connection with n segments in-flight will be disrupted at some point
573	   due to a link outage at an intermediate router R.  For each segment
574	   in-flight, router R may generate an ICMP unreachable message.
575	   However, for some reason it delays them.  Once the link outage is
576	   over and the connection has been re-established, the TCP sender
577	   leaves the scheme and slow-starts the connection.  Following a
578	   sequence number wrap-around, a retransmission timeout occurs, just at
579	   the moment the TCP sender's current window of data reaches the
580	   previous range of the sequence number space again.  In case router R
581	   emits the delayed ICMP unreachable messages now, one spurious undoing
582	   of the retransmission timer backoff is possible, if the TCP sequence
583	   number contained in an ICMP unreachable message matches the current
584	   SND.UNA, and the timeout was a result of congestion.  In the case of
585	   another connectivity disruption, the additional undoing of the
586	   retransmission timer backoff has no impact.  The probability of this
587	   scenario is at most n/2^32.
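
   The arithmetic behind "may wrap rather quickly" and the n/2^32 bound
   can be sketched as follows.  This is only an illustration: it assumes
   a single connection that saturates the link and ignores header
   overhead, and the function names are hypothetical.

```python
SEQ_SPACE_BYTES = 2 ** 32  # TCP sequence numbers count bytes modulo 2^32


def wrap_time_seconds(link_bits_per_second: float) -> float:
    """Time for one connection saturating the link to cycle the sequence space.

    Assumption: every byte on the link is payload of this one connection.
    """
    bytes_per_second = link_bits_per_second / 8
    return SEQ_SPACE_BYTES / bytes_per_second


def false_match_bound(delayed_icmp_messages: int) -> float:
    """Upper bound n/2^32 from the text for a delayed message matching SND.UNA."""
    return delayed_icmp_messages / SEQ_SPACE_BYTES
```

   Under these assumptions, a 1 Gbit/s path cycles through the whole
   sequence space in roughly 34 seconds, which is why a delayed ICMP
   unreachable message could in principle land in a later cycle.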

589	5.3.  Packet Duplication

591	   In case an intermediate router duplicates packets, a TCP sender may
592	   receive more ICMP unreachable messages during timeout-based loss
593	   recovery than it actually has sent timeout-based retransmissions.
594	   However, since TCP-LCD keeps track of the number of performed
595	   retransmission timer backoffs in the "BACKOFF_CNT" variable, it will
596	   not undo more retransmission timer backoffs than were actually
597	   performed.  Nevertheless, if packet duplication and congestion
598	   coincide on the path between the two communicating hosts, duplicated
599	   ICMP messages could hide the congestion loss of some retransmissions
600	   or ICMP messages, and the algorithm may incorrectly undo
601	   retransmission timer backoffs.  Considering the overall impact of a
602	   router that duplicates packets, the additional load induced by some
603	   spurious timeout-based retransmits can probably be neglected.
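
   The bounding effect of "BACKOFF_CNT" can be illustrated with a small
   sketch.  The class and method names are hypothetical, the backoff
   step is assumed to recompute the RTO from "RTO_BASE" and
   "BACKOFF_CNT" in the same way as the undo step, and the 60 s value
   for MAX_RTO is an assumption.  The checks on SEG.SEQ are omitted.

```python
MAX_RTO = 60.0  # seconds; assumed upper bound on the RTO


class LcdBackoff:
    """Tracks only the backoff counter; segment matching is left out."""

    def __init__(self, rto: float) -> None:
        # State captured when timeout-based loss recovery starts.
        self.rto_base = rto
        self.rto = rto
        self.backoff_cnt = 0

    def on_timeout(self) -> None:
        # Assumed backoff step: one more doubling, bounded by MAX_RTO.
        self.backoff_cnt += 1
        self.rto = min(self.rto_base * 2 ** self.backoff_cnt, MAX_RTO)

    def on_icmp_unreachable(self) -> bool:
        # Never undo more backoffs than were performed.
        if self.backoff_cnt == 0:
            return False
        self.backoff_cnt -= 1
        self.rto = min(self.rto_base * 2 ** self.backoff_cnt, MAX_RTO)
        return True
```

   With two backoffs performed, a third (duplicated) ICMP unreachable
   message is ignored in this sketch and the RTO stays at its base value.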

605	5.4.  Probing Frequency

607	   One could argue that if an ICMP unreachable message arrives for a
608	   timeout-based retransmission, the RTO shall be reset or recalculated,
609	   similar to what is done when an ACK arrives during timeout-based loss
610	   recovery (see Karn's algorithm [KP87], [RFC2988]), and a new
611	   retransmission should be sent immediately.  Generally, this would
612	   allow for a much higher probing frequency based on the round trip
613	   time up to the router where connectivity has been disrupted.
614	   However, we believe the current scheme provides a good trade-off
615	   between conservative behavior and fast detection of connectivity re-
616	   establishment.

618	5.5.  Reaction during Connection Establishment

620	   It is possible that a TCP sender enters timeout-based loss recovery
621	   while the connection is in SYN-SENT or SYN-RECEIVED states [RFC0793].
622	   The algorithm described in this document could also be used for
623	   faster connection establishment in networks with connectivity
624	   disruptions.  However, because existing TCP implementations [RFC5461]
625	   already interpret ICMP unreachable messages during connection
626	   establishment and abort the corresponding connection, we refrain from
627	   suggesting this.

629	5.6.  Reaction in Steady-State

631	   Another exploitation of ICMP unreachable messages in the context of
632	   TCP congestion control might seem appropriate in case the ICMP
633	   unreachable message is received while TCP is in steady-state, and the
634	   message refers to a segment from within the current window of data.
635	   As the RTT up to the router that generated the ICMP unreachable
636	   message is likely to be substantially shorter than the overall RTT to
637	   the destination, the ICMP unreachable message may very well reach the
638	   originating TCP while it is transmitting the current window of data.
639	   In case the remaining window is large, it might seem appropriate to
640	   refrain from transmitting the remaining window as there is timely
641	   evidence that it will only trigger further ICMP unreachable messages
642	   at that same router.  Although this promises improvement from a
643	   wastage perspective, it may be counterproductive from a security
644	   perspective.  An attacker could forge such ICMP messages, thereby
645	   forcing the originating TCP to stop sending data, very similar to the
646	   blind throughput-reduction attack mentioned in
647	   [I-D.ietf-tcpm-icmp-attacks].

649	   An additional consideration is the following: in the presence of
650	   multi-path routing even the receipt of a legitimate ICMP unreachable
651	   message cannot be exploited accurately because there is the option
652	   that only one of the multiple paths to the destination is suffering
653	   from a connectivity disruption, which causes ICMP unreachable
654	   messages to be sent.  Then, however, there is the possibility that
655	   the path along which the connectivity disruption occurred contributed
656	   considerably to the overall bandwidth, such that a congestion
657	   response is very well reasonable.  However, this is not necessarily
658	   the case.  Therefore, a TCP has no means except for its inherent
659	   congestion control to decide on this matter.  All in all, it seems
660	   that for a connection in steady-state, i.e., not in timeout-based
661	   loss recovery, reacting to ICMP unreachable messages with regard to
662	   congestion control is not appropriate.  For the case of timeout-based
663	   retransmissions, however, there is a reasonable congestion response,
664	   which is skipping further retransmission timer backoffs because there
665	   is no congestion indication - as described above.

667	6.  Dissolving Ambiguity Issues (the Safe Variant)

669	   Given that the TCP Timestamps option [RFC1323] is enabled for a
670	   connection, a TCP sender MAY use the following algorithm to dissolve
671	   the ambiguity issues mentioned in Sections 5.1, 5.2, and 5.3.  In
672	   particular, both the retransmission ambiguity and the packet
673	   duplication problems are prevented by the following TCP-LCD variant.
674	   On the other hand, the false positives caused by wrapped sequence
675	   numbers cannot be completely avoided, but the likelihood is further
676	   reduced by a factor of 1/2^32 since the Timestamp Value field (TSval)
677	   of the TCP Timestamps Option contains 32 bits.

679	   Hence, implementers may choose to implement TCP-LCD with the
680	   following modifications.

682	   Step (1) is replaced by step (1'):

684	   (1')  Before TCP updates the variable "RTO" when it initiates
685	         timeout-based loss recovery, set the variables "BACKOFF_CNT"
686	         and "RTO_BASE" and the data structure "RETRANS_TS" as follows:

688	            BACKOFF_CNT := 0;
689	            RTO_BASE := RTO;
690	            RETRANS_TS := [].

692	         Proceed to step (R).

694	   Step (2) is extended by step (2b):

696	   (2b)  Store the value of the Timestamp Value field (TSval) of the TCP
697	         Timestamps option included in the retransmission "RET" sent in
698	         step (R) into the "RETRANS_TS" data structure:

700	            RETRANS_TS.add(RET.TSval)

702	   Step (6) is replaced by step (6'):

704	   (6')  If "SEG.SEQ == SND.UNA && RETRANS_TS.exists(SEG.TSval)", i.e.,
705	         if the TCP segment "SEG" eliciting the ICMP unreachable message
706	         "ICMP_DU" carries the sequence number of a retransmission, and
707	         the value in its Timestamp Value field (TSval) is valid, then

709	               proceed to step (7');

711	         else

713	               proceed to step (3).

715	   Step (7) is replaced by step (7'):

717	   (7')  Undo the last retransmission timer backoff:

719	               RETRANS_TS.remove(SEG.TSval);
720	               BACKOFF_CNT := BACKOFF_CNT - 1;
721	               RTO := min(RTO_BASE * 2^(BACKOFF_CNT), MAX_RTO).
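
   The modified steps can be sketched in Python as shown here.  This is
   a minimal illustration rather than a reference implementation: the
   class and method names are hypothetical, "RETRANS_TS" is modeled as
   a set, the backoff step is assumed to increment "BACKOFF_CNT" and
   recompute the RTO, and the 60 s value for MAX_RTO is an assumption.

```python
MAX_RTO = 60.0  # seconds; assumed upper bound on the RTO


class SafeLcd:
    def __init__(self, rto: float) -> None:
        # Step (1'): initialize state when loss recovery starts.
        self.rto_base = rto
        self.rto = rto
        self.backoff_cnt = 0
        self.retrans_ts = set()

    def on_retransmit(self, tsval: int) -> None:
        # Assumed backoff step, followed by step (2b): remember the
        # TSval carried by this retransmission.
        self.backoff_cnt += 1
        self.rto = min(self.rto_base * 2 ** self.backoff_cnt, MAX_RTO)
        self.retrans_ts.add(tsval)

    def on_icmp_unreachable(self, seg_seq: int, seg_tsval: int,
                            snd_una: int) -> bool:
        # Step (6'): the quoted segment must match SND.UNA and carry a
        # TSval of a retransmission that has not been used yet.
        if seg_seq != snd_una or seg_tsval not in self.retrans_ts:
            return False
        # Step (7'): undo exactly one retransmission timer backoff.
        self.retrans_ts.remove(seg_tsval)
        self.backoff_cnt -= 1
        self.rto = min(self.rto_base * 2 ** self.backoff_cnt, MAX_RTO)
        return True
```

   Because each stored TSval is removed once it has been used, a
   duplicated ICMP unreachable message fails the check in step (6') in
   this sketch and cannot undo a second backoff.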

723	   The downside of the safe variant is twofold.  First, the
724	   modifications come at a cost: the TCP sender is required to store the
725	   timestamps of all retransmissions sent during one timeout-based loss
726	   recovery.  Second, the safe variant can only undo a retransmission
727	   timer backoff if the intermediate router experiencing the link outage
728	   implements [RFC1812] and chooses to include more than the first 64
729	   bits of the payload of the triggering datagram, namely as many bytes
730	   as are needed to include the TCP Timestamps option in the ICMP
731	   unreachable message.
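
   To illustrate the second point, the following sketch extracts
   SEG.SEQ, and SEG.TSval where available, from the IPv4 datagram
   quoted in an ICMP unreachable message.  The function name is
   hypothetical, and the sketch assumes the caller has already stripped
   the outer IP and ICMP headers.  If only the first 64 bits of the TCP
   header are quoted, the sequence number is recoverable but the
   Timestamps option is not.

```python
import struct
from typing import Optional, Tuple

TCP_OPT_EOL, TCP_OPT_NOP, TCP_OPT_TIMESTAMPS = 0, 1, 8
IPPROTO_TCP = 6


def parse_quoted_segment(quoted: bytes) -> Optional[Tuple[int, Optional[int]]]:
    """Return (SEG.SEQ, SEG.TSval or None), or None if too little is quoted."""
    if len(quoted) < 20 or quoted[9] != IPPROTO_TCP:
        return None
    ihl = (quoted[0] & 0x0F) * 4            # IPv4 header length in bytes
    if ihl < 20 or len(quoted) < ihl + 8:
        return None                          # fewer than 64 bits of TCP header
    tcp = quoted[ihl:]
    _sport, _dport, seq = struct.unpack("!HHI", tcp[:8])
    if len(tcp) < 13:
        return seq, None                     # data offset field not quoted
    header_len = (tcp[12] >> 4) * 4
    options = tcp[20:header_len]             # empty if options were not quoted
    i = 0
    while i < len(options):
        kind = options[i]
        if kind == TCP_OPT_EOL:
            break
        if kind == TCP_OPT_NOP:
            i += 1
            continue
        if i + 1 >= len(options) or options[i + 1] < 2:
            break                            # truncated or malformed option
        length = options[i + 1]
        # Conservative: require the complete 10-byte Timestamps option.
        if kind == TCP_OPT_TIMESTAMPS and length == 10 and i + 10 <= len(options):
            tsval, _tsecr = struct.unpack("!II", options[i + 2:i + 10])
            return seq, tsval
        i += length
    return seq, None
```

   A caller following the safe variant would treat a result without a
   TSval as unusable for step (6'), since the timestamp check cannot be
   performed.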

733	7.  Interoperability Issues

735	   This section discusses interoperability issues related to introducing
736	   TCP-LCD.

738	7.1.  Detection of TCP Connection Failures

740	   TCP-LCD may have side-effects on TCP implementations that attempt to
741	   detect TCP connection failures by counting timeout-based
742	   retransmissions.  RFC 1122 [RFC1122] states in Section 4.2.3.5 that a
743	   TCP host must handle excessive retransmissions of data segments with
744	   two thresholds R1 and R2 measuring the number of retransmissions that
745	   have occurred for the same segment.  Both thresholds might either be
746	   measured in time units or as a count of retransmissions.

748	   Due to TCP-LCD's revert strategy of the retransmission timer, the
749	   assumption that a certain number of retransmissions corresponds to a
750	   specific time interval no longer holds, as additional retransmissions
751	   may be performed during timeout-based loss recovery to detect the end
752	   of the connectivity disruption.  Therefore, a TCP employing TCP-LCD
753	   either SHOULD measure the thresholds R1 and R2 in time units or, in
754	   case R1 and R2 are counters of retransmissions, SHOULD convert them
755	   into time intervals, which correspond to the time an unmodified TCP
756	   would need to reach the specified number of retransmissions.
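
   One possible way to perform this conversion is sketched below.  It
   assumes an unmodified TCP that doubles the RTO after every timeout
   up to a maximum; the function name and the 60 s default for the
   maximum RTO are assumptions for illustration only.

```python
def retransmission_count_to_seconds(count: int, rto: float,
                                    max_rto: float = 60.0) -> float:
    """Time an unmodified TCP would need to perform `count` retransmissions.

    The i-th retransmission (i = 0, 1, ...) is sent after a timer of
    min(rto * 2^i, max_rto) has expired, so the total is the sum of
    these timer values.
    """
    return sum(min(rto * 2 ** i, max_rto) for i in range(count))
```

   For example, with an RTO of 1 s at the start of loss recovery, a
   threshold of 3 retransmissions corresponds to 1 + 2 + 4 = 7 seconds
   under these assumptions.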

758	7.2.  Explicit Congestion Notification

760	   With Explicit Congestion Notification (ECN) [RFC3168], ECN-capable
761	   routers are no longer limited to dropping packets as an indication
762	   of congestion.  Instead, they can set the Congestion
763	   Experienced (CE) codepoint in the IP header to indicate congestion.

765	   With TCP-LCD it may happen that during a connectivity disruption a
766	   received ICMP unreachable message has been elicited by a timeout-
767	   based retransmission that was marked with the CE codepoint before
768	   reaching the router experiencing the link outage.  In such a case, we
769	   suggest that the TCP sender SHOULD additionally reset the
770	   retransmission timer in case the algorithm undoes a retransmission
771	   timer backoff.

773	7.3.  ICMP for IP version 6

775	   RFC 4443 [RFC4443] specifies the Internet Control Message Protocol
776	   (ICMPv6) to be used with the Internet Protocol version 6 (IPv6)
777	   [RFC2460].  From TCP-LCD's point of view, it is important to notice
778	   that for IPv6, the payload of an ICMPv6 error message has to include
779	   as many bytes as possible from the IPv6 datagram that elicited the
780	   ICMPv6 error message, without making the error message exceed the
781	   minimum IPv6 MTU (1280 bytes) [RFC4443].  Thus, more information is
782	   available for TCP-LCD than in the case of IPv4.

784	   The counterpart of the ICMPv4 destination unreachable message of code
785	   0 (net unreachable) and of code 1 (host unreachable) is the ICMPv6
786	   destination unreachable message of code 0 (no route to destination)
787	   [RFC4443].  As with IPv4, a router should generate an ICMPv6
788	   destination unreachable message of code 0 in response to a packet
789	   that cannot be delivered to its destination address because it lacks
790	   a matching entry in its routing table.  As a result, TCP-LCD can
791	   employ these ICMPv6 error messages as a connectivity disruption
792	   indication, too.

794	7.4.  TCP-LCD and IP Tunnels

796	   It is worth noting that IP tunnels, including IPsec [RFC4301], IP in
797	   IP [RFC2003], Generic Routing Encapsulation (GRE) [RFC2784], and
798	   others are compatible with TCP-LCD, as long as the received ICMP
799	   unreachable messages can be demultiplexed and extracted appropriately
800	   by the TCP sender during timeout-based loss recovery.

802	   If, for example, end-to-end tunnels like IPsec in transport mode
803	   [RFC4301] are employed, a TCP sender may receive ICMP unreachable
804	   messages where additional steps, e.g., decrypting in step (5) of the
805	   algorithm, are needed to extract the TCP header from these ICMP
806	   messages.  Provided that the received ICMP unreachable message
807	   contains enough information, i.e., SEG.SEQ is extractable, this
808	   information MAY still be used as a valid input for the proposed
809	   algorithm.

811	   Likewise, if IP encapsulation like [RFC2003] is used in some part of
812	   the path between the communicating hosts, the tunnel ingress node may
813	   receive the ICMP unreachable messages from an intermediate router
814	   experiencing the link outage.  Nevertheless, the tunnel ingress node
815	   may replay the ICMP unreachable messages in order to inform the TCP
816	   sender.  If enough information is preserved to extract SEG.SEQ, the
817	   replayed ICMP unreachable messages MAY still be used in TCP-LCD.

819	8.  Related Work

821	   Several methods that address TCP's problems in the presence of
822	   connectivity disruptions have been proposed in literature.  Some of
823	   them try to improve TCP's performance by modifying lower layers.  For
824	   example, [SM03] introduces a "smart link layer", which buffers one
825	   segment for each active connection and replays these segments upon
826	   connectivity re-establishment.  This approach has a serious drawback:
827	   previously stateless intermediate routers have to be modified in
828	   order to inspect TCP headers, to track the end-to-end connection, and
829	   to provide additional buffer space.  This leads to an additional need
830	   of memory and processing power.

832	   On the other hand, stateless link layer schemes, as proposed in
833	   [RFC3819], which unconditionally buffer some small number of packets
834	   may have another problem: if a packet is buffered longer than the
835	   maximum segment lifetime (MSL) of 2 min [RFC0793], i.e., the
836	   disconnection lasts longer than MSL, TCP's assumption that such
837	   segments will never be received will no longer be true, violating
838	   TCP's semantics [I-D.eggert-tcpm-tcp-retransmit-now].

840	   Other approaches, like TCP-F [CRVP01] or the Explicit Link Failure
841	   Notification (ELFN) [HV02] inform a TCP sender about a disrupted path
842	   by special messages generated and sent from intermediate routers.  In
843	   case of a link failure the TCP sender stops sending segments and
844	   freezes its retransmission timers.  TCP-F stays in this state and
845	   remains silent until either a "route establishment notification" is
846	   received or an internal timer expires.  In contrast, ELFN
847	   periodically probes the network to detect connectivity re-
848	   establishment.  Both proposals rely on changes to intermediate
849	   routers, whereas the scheme proposed in this document is a sender-
850	   only modification.  Moreover, ELFN does not consider congestion and
851	   may impose serious additional load on the network, depending on the
852	   probe interval.

854	   The authors of ATCP [LS01] propose enhancements to identify different
855	   types of packet loss by introducing a layer between TCP and IP.  They
856	   utilize ICMP destination unreachable messages to set TCP's receiver
857	   advertised window to zero, thus forcing the TCP sender to perform
858	   zero window probing with an exponential backoff.  ICMP destination
859	   unreachable messages that arrive during this probing period are
860	   ignored.  This approach is nearly orthogonal to this document, which
861	   exploits ICMP messages to undo a retransmission timer backoff when
862	   TCP is already probing.  In principle, both mechanisms could be
863	   combined.  However, due to security considerations it does not seem
864	   appropriate to adopt ATCP's reaction as discussed in Section 5.6.

866	   Schuetz et al. describe, in [I-D.schuetz-tcpm-tcp-rlci], a set of TCP
867	   extensions that improve TCP's behavior when transmitting over paths
868	   whose characteristics can change rapidly.  Their proposed extensions
869	   modify the local behavior of TCP and introduce a new TCP option to
870	   signal locally received connectivity-change indications (CCIs) to
871	   remote peers.  Upon receipt of a CCI, they re-probe the path
872	   characteristics either by performing a speculative retransmission or
873	   by sending a single segment of new data, depending on whether the
874	   connection is currently stalled in exponential backoff or
875	   transmitting in steady-state, respectively.  The authors focus on
876	   specifying TCP response mechanisms; nevertheless, underlying layers
877	   would have to be modified to explicitly send CCIs to make these
878	   immediate responses possible.

880	9.  IANA Considerations

882	   This memo includes no request to IANA.

884	10.  Security Considerations

886	   The algorithm proposed in this document is considered to be secure.
887	   For example, an attacker who already guessed the correct four-tuple
888	   (i.e., Source IP Address, Source TCP port, Destination IP Address,
889	   and Destination TCP port), still cannot make a TCP modified with
890	   TCP-LCD flood the network just by sending forged ICMP unreachable
891	   messages in an attempt to maliciously shorten the retransmission
892	   timer.  The attacker additionally would need to guess the correct
893	   segment sequence number of the current timeout-based retransmission,
894	   with a probability of at most 1/2^32.  Even in the case of man-in-
895	   the-middle attacks, i.e., attacks performed in scenarios in which the
896	   attacker can sniff the retransmissions, the impact on network load is
897	   considered to be low, since the retransmission frequency is limited
898	   by the RTO that was computed before TCP had entered the timeout-based
899	   loss recovery.  Hence, the highest probing frequency is expected to
900	   be even lower than once per minimum RTO, i.e., 1 s as specified by
901	   [RFC2988].

903	11.  Acknowledgments

905	   We would like to thank Kai Jakobs, Ilpo Jarvinen, Pasi Sarolahti,
906	   Timothy Shepard, Joe Touch and Carsten Wolff for feedback on earlier
907	   versions of this document.  We also thank Michael Faber, Daniel
908	   Schaffrath, and Damian Lukowski for implementing and testing the
909	   algorithm in Linux.  Special thanks go to Ilpo Jarvinen for giving
910	   valuable feedback regarding the Linux implementation.

912	   This work has been supported by the German National Science
913	   Foundation (DFG) within the research excellence cluster Ultra High-
914	   Speed Mobile Information and Communication (UMIC), RWTH Aachen
915	   University.

917	12.  References

919	12.1.  Normative References

921	   [RFC0792]  Postel, J., "Internet Control Message Protocol", STD 5,
922	              RFC 792, September 1981.

924	   [RFC0793]  Postel, J., "Transmission Control Protocol", STD 7,
925	              RFC 793, September 1981.

927	   [RFC1323]  Jacobson, V., Braden, B., and D. Borman, "TCP Extensions
928	              for High Performance", RFC 1323, May 1992.

930	   [RFC1812]  Baker, F., "Requirements for IP Version 4 Routers",
931	              RFC 1812, June 1995.

933	   [RFC2988]  Paxson, V. and M. Allman, "Computing TCP's Retransmission
934	              Timer", RFC 2988, November 2000.

936	   [RFC5681]  Allman, M., Paxson, V., and E. Blanton, "TCP Congestion
937	              Control", RFC 5681, September 2009.

939	12.2.  Informative References

941	   [CRVP01]   Chandran, K., Raghunathan, S., Venkatesan, S., and R.
942	              Prakash, "A feedback-based scheme for improving TCP
943	              performance in ad hoc wireless networks", IEEE Personal
944	              Communications vol. 8, no. 1, pp. 34-39, February 2001.

946	   [HV02]     Holland, G. and N. Vaidya, "Analysis of TCP performance
947	              over mobile ad hoc networks", Wireless Networks vol. 8,
948	              no. 2-3, pp. 275-288, March 2002.

950	   [I-D.eggert-tcpm-tcp-retransmit-now]
951	              Eggert, L., "TCP Extensions for Immediate
952	              Retransmissions", draft-eggert-tcpm-tcp-retransmit-now-02
953	              (work in progress), June 2005.

955	   [I-D.ietf-tcpm-icmp-attacks]
956	              Gont, F., "ICMP attacks against TCP",
957	              draft-ietf-tcpm-icmp-attacks-12 (work in progress),
958	              March 2010.

960	   [I-D.schuetz-tcpm-tcp-rlci]
961	              Schuetz, S., Koutsianas, N., Eggert, L., Eddy, W., Swami,
962	              Y., and K. Le, "TCP Response to Lower-Layer Connectivity-
963	              Change Indications", draft-schuetz-tcpm-tcp-rlci-03 (work
964	              in progress), February 2008.

966	   [KP87]     Karn, P. and C. Partridge, "Improving Round-Trip Time
967	              Estimates in Reliable Transport Protocols", Proceedings of
968	              the Conference on Applications, Technologies,
969	              Architectures, and Protocols for Computer Communication
970	              (SIGCOMM'87) pp. 2-7, August 1987.

972	   [LS01]     Liu, J. and S. Singh, "ATCP: TCP for mobile ad hoc
973	              networks", IEEE Journal on Selected Areas in
974	              Communications vol. 19, no. 7, pp. 1300-1315, July 2001.

976	   [RFC0791]  Postel, J., "Internet Protocol", STD 5, RFC 791,
977	              September 1981.

979	   [RFC0826]  Plummer, D., "Ethernet Address Resolution Protocol: Or
980	              converting network protocol addresses to 48.bit Ethernet
981	              address for transmission on Ethernet hardware", STD 37,
982	              RFC 826, November 1982.

984	   [RFC1122]  Braden, R., "Requirements for Internet Hosts -
985	              Communication Layers", STD 3, RFC 1122, October 1989.

987	   [RFC2003]  Perkins, C., "IP Encapsulation within IP", RFC 2003,
988	              October 1996.

990	   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
991	              Requirement Levels", BCP 14, RFC 2119, March 1997.

993	   [RFC2460]  Deering, S. and R. Hinden, "Internet Protocol, Version 6
994	              (IPv6) Specification", RFC 2460, December 1998.

996	   [RFC2784]  Farinacci, D., Li, T., Hanks, S., Meyer, D., and P.
997	              Traina, "Generic Routing Encapsulation (GRE)", RFC 2784,
998	              March 2000.

1000	   [RFC3168]  Ramakrishnan, K., Floyd, S., and D. Black, "The Addition
1001	              of Explicit Congestion Notification (ECN) to IP",
1002	              RFC 3168, September 2001.

1004	   [RFC3522]  Ludwig, R. and M. Meyer, "The Eifel Detection Algorithm
1005	              for TCP", RFC 3522, April 2003.

1007	   [RFC3782]  Floyd, S., Henderson, T., and A. Gurtov, "The NewReno
1008	              Modification to TCP's Fast Recovery Algorithm", RFC 3782,
1009	              April 2004.

1011	   [RFC3819]  Karn, P., Bormann, C., Fairhurst, G., Grossman, D.,
1012	              Ludwig, R., Mahdavi, J., Montenegro, G., Touch, J., and L.
1013	              Wood, "Advice for Internet Subnetwork Designers", BCP 89,
1014	              RFC 3819, July 2004.

1016	   [RFC4015]  Ludwig, R. and A. Gurtov, "The Eifel Response Algorithm
1017	              for TCP", RFC 4015, February 2005.

1019	   [RFC4301]  Kent, S. and K. Seo, "Security Architecture for the
1020	              Internet Protocol", RFC 4301, December 2005.

1022	   [RFC4443]  Conta, A., Deering, S., and M. Gupta, "Internet Control
1023	              Message Protocol (ICMPv6) for the Internet Protocol
1024	              Version 6 (IPv6) Specification", RFC 4443, March 2006.

1026	   [RFC5461]  Gont, F., "TCP's Reaction to Soft Errors", RFC 5461,
1027	              February 2009.

1029	   [RFC5682]  Sarolahti, P., Kojo, M., Yamamoto, K., and M. Hata,
1030	              "Forward RTO-Recovery (F-RTO): An Algorithm for Detecting
1031	              Spurious Retransmission Timeouts with TCP", RFC 5682,
1032	              September 2009.

1034	   [SESB05]   Schuetz, S., Eggert, L., Schmid, S., and M. Brunner,
1035	              "Protocol enhancements for intermittently connected
1036	              hosts", SIGCOMM Computer Communication Review vol. 35, no.
1037	              3, pp. 5-18, December 2005.

1039	   [SM03]     Scott, J. and G. Mapp, "Link layer-based TCP optimisation
1040	              for disconnecting networks", SIGCOMM Computer
1041	              Communication Review vol. 33, no. 5, pp. 31-42,
1042	              October 2003.

1044	   [Zh86]     Zhang, L., "Why TCP Timers Don't Work Well", Proceedings
1045	              of the Conference on Applications, Technologies,
1046	              Architectures, and Protocols for Computer Communication
1047	              (SIGCOMM'86) pp. 397-405, August 1986.

1049	Appendix A.  Changes from previous versions of the draft

1051	A.1.  Changes from draft-ietf-tcpm-tcp-lcd-00

1053	   o  Editorial changes.

1055	   o  Clarified TCP-LCD's behaviour during connection establishment
1056	      (Thanks to Mark Handley).

1058	A.2.  Changes from draft-zimmermann-tcp-lcd-02

1060	   o  Incorporated feedback submitted by Ilpo Jarvinen.
1061	      <http://www.ietf.org/mail-archive/web/tcpm/current/msg04841.html>

1063	   o  Incorporated feedback submitted by Pasi Sarolahti.
1064	      <http://www.ietf.org/mail-archive/web/tcpm/current/msg04870.html>

1066	   o  Incorporated feedback submitted by Joe Touch.
1067	      <http://www.ietf.org/mail-archive/web/tcpm/current/msg04895.html>
1068	      <http://www.ietf.org/mail-archive/web/tcpm/current/msg04900.html>

1070	   o  Extended and reorganized the discussion (Section 5):

1072	      *  Every discussion item got its own title, so that we have a
1073	         better overview.

1075	      *  Extended Retransmission Ambiguity section.  Added also some
1076	         references to the historical retransmission ambiguity problem.

1078	      *  Heavily extended discussion about wrapped sequence numbers (see
1079	         Joe's comments).

1081	      *  Described the influence of packet duplication on the algorithm
1082	         (Thanks to Ilpo).

1084	      *  The section "Protecting Against Misbehaving Routers" is not a
1085	         subsection anymore.  Moreover, the section was renamed to
1086	         "Dissolving Ambiguity Issues" and has now real content.

1088	   o  An interoperability issues section (Section 7) was added.  In
1089	      particular comments to ECN, ICMPv6, and to the two thresholds R1
1090	      and R2 of [RFC1122] (Section 4.2.3.5) were added.

1092	   o  Miscellaneous editorial changes.  In particular, the algorithm has
1093	      a name now: TCP-LCD.

1095	A.3.  Changes from draft-zimmermann-tcp-lcd-01

1097	   o  The algorithm in Section 4.2 was slightly changed.  Instead of
1098	      reverting the last retransmission timer backoff by halving the
1099	      RTO, the RTO is recalculated with help of the "BACKOFF_CNT"
1100	      variable.  This fixes an issue that occurred when the
1101	      retransmission timer was backed off but bounded by a maximum
1102	      value.  The algorithm in the previous version of the draft would
1103	      have "reverted" to half of that maximum value, instead of using
1104	      the value from before the RTO was doubled (and then bounded).

1106	   o  Miscellaneous editorial changes.

1108	A.4.  Changes from draft-zimmermann-tcp-lcd-00

1110	   o  Miscellaneous editorial changes in Section 1, 2 and 3.

1112	   o  The document was restructured in Section 1, 2 and 3 for easier
1113	      reading.  The motivation for the algorithm was changed to reflect
1114	      TCP's problem of disambiguating congestion from non-congestion loss.

1116	   o  Added Section 4.1.

1118	   o  The algorithm in Section 4.2 was restructured and simplified:

1120	      *  The special case of the first received ICMP destination
1121	         unreachable message after an RTO was removed.

1123	      *  The "BACKOFF_CNT" variable was introduced so it is no longer
1124	         possible to perform more reverts than backoffs.

1126	   o  The discussion in Section 5 was improved and expanded according to
1127	      the algorithm changes.

1129	Authors' Addresses

1131	   Alexander Zimmermann
1132	   RWTH Aachen University
1133	   Ahornstrasse 55
1134	   Aachen,   52074
1135	   Germany

1137	   Phone: +49 241 80 21422
1138	   Email: zimmermann@cs.rwth-aachen.de

1140	   Arnd Hannemann
1141	   RWTH Aachen University
1142	   Ahornstrasse 55
1143	   Aachen,   52074
1144	   Germany

1146	   Phone: +49 241 80 21423
1147	   Email: hannemann@nets.rwth-aachen.de