Re: [tcpm] Minimum RTO

Jerry Chu <hkchu@google.com> Tue, 10 April 2012 20:57 UTC

In-Reply-To: <2A08B7ACA17C9F4BAA8D5395DF908D731DED9A27@SEAEMBX02.olympus.F5Net.com>
References: <2A08B7ACA17C9F4BAA8D5395DF908D731DED9A27@SEAEMBX02.olympus.F5Net.com>
Date: Tue, 10 Apr 2012 13:57:04 -0700
Message-ID: <CAPshTCh_vn6qKAs8_X3VFVT=nBOL4rg80s7SMjk=KDiG3yjHmw@mail.gmail.com>
From: Jerry Chu <hkchu@google.com>
To: Mark Lloyd <M.Lloyd@f5.com>
Cc: "tcpm@ietf.org" <tcpm@ietf.org>
Subject: Re: [tcpm] Minimum RTO

Mark,

RFC 6298 was mainly about reducing the initial RTO from 3 seconds to
1 second. We did not touch the minimum RTO due to a lack of consensus
and for fear of opening a prolonged debate. Besides, OSes have reduced
minRTO anyway, as you noted.

Personally I don't think a somewhat arbitrary RTO lower bound is really
necessary, but delayed acks complicate the matter.
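
To make the delayed-ack concern concrete, here is a minimal sketch. It is
not taken from any stack; the function names and the sample delays of 40ms
and 200ms are illustrative assumptions (RFC 1122 only requires the ack delay
to be under 500ms). It checks whether a given RTO would fire before a
delayed ack could get back to the sender, even though nothing was lost.

```python
def ack_arrival(rtt: float, ack_delay: float) -> float:
    """Seconds after sending a segment at which its ACK reaches the sender,
    assuming the receiver holds the ACK for ack_delay seconds."""
    return rtt + ack_delay


def is_spurious(rto: float, rtt: float, ack_delay: float) -> bool:
    """True if the retransmission timer would expire before the (delayed)
    ACK arrives, i.e. the sender retransmits although nothing was lost."""
    return rto < ack_arrival(rtt, ack_delay)


if __name__ == "__main__":
    rtt = 0.0006  # sub-millisecond RTT, roughly what the quoted trace shows
    # Hypothetical candidate RTO floors and ack delays, in seconds.
    for rto in (0.005, 0.2, 1.0):
        for ack_delay in (0.0, 0.040, 0.200):
            verdict = "spurious" if is_spurious(rto, rtt, ack_delay) else "ok"
            print(f"rto={rto:.3f}s ack_delay={ack_delay:.3f}s -> {verdict}")
```

Under these assumptions, a floor of a few milliseconds is only safe when
the receiver acks immediately. That is the sense in which the lower bound
and the peer's ack delay cannot be chosen independently.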

Jerry

On Tue, Apr 10, 2012 at 12:52 PM, Mark Lloyd <M.Lloyd@f5.com> wrote:

> Reading RFC 6298
>
> http://tools.ietf.org/html/rfc6298
>
>
>  (2.4) Whenever RTO is computed, if it is less than 1 second, then the
>          RTO SHOULD be rounded up to 1 second.
>
>
> While troubleshooting a problem that involves minimum RTO, I found that
> RFC 6298 suggests a 1 second value for minimum RTO that is not honored in
> the implementations I have tested, and even in these implementations TCP
> does not do a good job of handling packet loss in a common scenario I
> tested.
>
> I use two Linux hosts for my test. They are set up to imitate the kind of
> connection that would be used for a database or RPC transaction.
>
> My server does a netcat command that loops back data as it is received:
>
> touch /tmp/testfile
> tail -f /tmp/testfile | sudo nc -l 1000 >>/tmp/testfile
>
> The client sends data, then echoes a response, simulating a TCP
> conversation of small transactions.
>
> touch  /tmp/testfile
> echo hello >/tmp/testfile
> tail -f /tmp/testfile | nc 10.10.30.121 1000 >>/tmp/testfile
>
> I then set up a WAN emulator to simulate a 1% loss rate back to the client.
>
> I'm including the frames leading up to the packet loss to show what the
> traffic looks like before the loss.
>
> Server side:
>
> Frm    Time
> Num    Delta
> 1941   0.0000830     Server -> Client     TCP:Flags=...AP...,
> SrcPort=1001, DstPort=55552, PayloadLen=6, Seq=1538646100 - 1538646106,
> Ack=1257857589,
> 1942   0.0005920     Client -> Server     TCP:Flags=...AP...,
> SrcPort=55552, DstPort=1001, PayloadLen=6, Seq=1257857589 - 1257857595,
> Ack=1538646106
> 1943   0.0000920     Server -> Client     TCP:Flags=...AP...,
> SrcPort=1001, DstPort=55552, PayloadLen=6, Seq=1538646106 - 1538646112,
> Ack=1257857595
> 1944   0.0005590     Client -> Server     TCP:Flags=...AP...,
> SrcPort=55552, DstPort=1001, PayloadLen=6, Seq=1257857595 - 1257857601,
> Ack=1538646112
> 1945   0.0000730     Server -> Client     TCP:Flags=...AP...,
> SrcPort=1001, DstPort=55552, PayloadLen=6, Seq=1538646112 - 1538646118,
> Ack=1257857601
> 1946   0.0005770     Client -> Server     TCP:Flags=...AP...,
> SrcPort=55552, DstPort=1001, PayloadLen=6, Seq=1257857601 - 1257857607,
> Ack=1538646118
> 1947   0.0000930     Server -> Client     TCP:Flags=...AP...,
> SrcPort=1001, DstPort=55552, PayloadLen=6, Seq=1538646118 - 1538646124,
> Ack=1257857607
> 1948   0.0006190     Client -> Server     TCP:Flags=...AP...,
> SrcPort=55552, DstPort=1001, PayloadLen=6, Seq=1257857607 - 1257857613,
> Ack=1538646124
> 1949   0.0000790     Server -> Client     TCP:Flags=...AP...,
> SrcPort=1001, DstPort=55552, PayloadLen=6, Seq=1538646124 - 1538646130,
> Ack=1257857613
> 1950   0.0005110     Client -> Server     TCP:Flags=...AP...,
> SrcPort=55552, DstPort=1001, PayloadLen=6, Seq=1257857613 - 1257857619,
> Ack=1538646130
> 1951   0.0000850     Server -> Client     TCP:Flags=...AP...,
> SrcPort=1001, DstPort=55552, PayloadLen=6, Seq=1538646130 - 1538646136,
> Ack=1257857619
>        Packet 1951 will be lost in transit. Note the 205ms gap.
>
> 1952   0.2053670     Client -> Server     TCP:[ReTransmit
> #1950]Flags=...AP..., SrcPort=55552, DstPort=1001, PayloadLen=6,
> Seq=1257857613 - 1257857619
> 1953   0.0000420     Server -> Client     TCP:Flags=...A....,
> SrcPort=1001, DstPort=55552, PayloadLen=0, Seq=1538646136, Ack=1257857619
> 1954   0.0007630     Server -> Client     TCP:[ReTransmit #1951]
> Flags=...AP..., SrcPort=1001, DstPort=55552, PayloadLen=6, Seq=1538646130 -
> 1538646136, Ack=1257857619
> 1955   0.0006640     Client -> Server     TCP:Flags=...AP...,
> SrcPort=55552, DstPort=1001, PayloadLen=6, Seq=1257857619 - 1257857625,
> Ack=1538646136
>
> Client side:
>
> 1928   0.0006400     Server -> Client     TCP:Flags=...AP...,
> SrcPort=1001, DstPort=55552, PayloadLen=6, Seq=1538646094 - 1538646100,
> Ack=1257857583
> 1929   0.0000690     Client -> Server     TCP:Flags=...AP...,
> SrcPort=55552, DstPort=1001, PayloadLen=6, Seq=1257857583 - 1257857589,
> Ack=1538646100
> 1930   0.0006200     Server -> Client     TCP:Flags=...AP...,
> SrcPort=1001, DstPort=55552, PayloadLen=6, Seq=1538646100 - 1538646106,
> Ack=1257857589
> 1931   0.0000690     Client -> Server     TCP:Flags=...AP...,
> SrcPort=55552, DstPort=1001, PayloadLen=6, Seq=1257857589 - 1257857595,
> Ack=1538646106
> 1932   0.0006300     Server -> Client     TCP:Flags=...AP...,
> SrcPort=1001, DstPort=55552, PayloadLen=6, Seq=1538646106 - 1538646112,
> Ack=1257857595
> 1933   0.0000730     Client -> Server     TCP:Flags=...AP...,
> SrcPort=55552, DstPort=1001, PayloadLen=6, Seq=1257857595 - 1257857601,
> Ack=1538646112
> 1934   0.0005230     Server -> Client     TCP:Flags=...AP...,
> SrcPort=1001, DstPort=55552, PayloadLen=6, Seq=1538646112 - 1538646118,
> Ack=1257857601
> 1935   0.0000830     Client -> Server     TCP:Flags=...AP...,
> SrcPort=55552, DstPort=1001, PayloadLen=6, Seq=1257857601 - 1257857607,
> Ack=1538646118
> 1936   0.0006140     Server -> Client     TCP:Flags=...AP...,
> SrcPort=1001, DstPort=55552, PayloadLen=6, Seq=1538646118 - 1538646124,
> Ack=1257857607
> 1937   0.0000750     Client -> Server     TCP:Flags=...AP...,
> SrcPort=55552, DstPort=1001, PayloadLen=6, Seq=1257857607 - 1257857613,
> Ack=1538646124
> 1938   0.0006230     Server -> Client     TCP:Flags=...AP...,
> SrcPort=1001, DstPort=55552, PayloadLen=6, Seq=1538646124 - 1538646130,
> Ack=1257857613
> 1939   0.0000650     Client -> Server     TCP:Flags=...AP...,
> SrcPort=55552, DstPort=1001, PayloadLen=6, Seq=1257857613 - 1257857619,
> Ack=1538646130
>        Packet does not arrive from server. Client retransmits in 205ms.
>
> 1940   0.2053300     Client -> Server     TCP:[ReTransmit #1939]
> Flags=...AP..., SrcPort=55552, DstPort=1001, PayloadLen=6, Seq=1257857613 -
> 1257857619, Ack=1538646130
> 1941   0.0006840     Server -> Client     TCP:Flags=...A....,
> SrcPort=1001, DstPort=55552, PayloadLen=0, Seq=1538646136, Ack=1257857619
> 1942   0.0006970     Server -> Client     TCP:Flags=...AP...,
> SrcPort=1001, DstPort=55552, PayloadLen=6, Seq=1538646130 - 1538646136,
> Ack=1257857619,
> 1943   0.0001120     Client -> Server     TCP:Flags=...AP...,
> SrcPort=55552, DstPort=1001, PayloadLen=6, Seq=1257857619 - 1257857625,
> Ack=15386461
>
> Linux responds in ~200ms, not the suggested 1 second, but this is still
> not very helpful. The RTO of 200ms recovers faster than the suggested
> minimum RTO would, but 200ms is way too long when we have a sub-millisecond
> RTT. A similar test with a Windows host showed a minimum RTO of ~300ms.
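
For reference, the RFC 6298 computation can be sketched as below to show
why the floor, not the measured RTT, decides the timeout here. This is a
minimal sketch under stated assumptions: the class and function names are
hypothetical, the clock granularity G is assumed to be 1ms, and the
client-response deltas from the server-side capture above are treated as
RTT samples (they also include the client's processing time).

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

ALPHA = 1 / 8  # SRTT gain (RFC 6298, section 2.3)
BETA = 1 / 4   # RTTVAR gain (RFC 6298, section 2.3)
K = 4          # RTTVAR multiplier (RFC 6298, section 2.2)


@dataclass
class RtoEstimator:
    min_rto: float = 1.0        # floor from RFC 6298 rule (2.4), seconds
    granularity: float = 0.001  # clock granularity G; 1 ms is an assumption
    srtt: Optional[float] = None
    rttvar: Optional[float] = None

    def raw_rto(self) -> float:
        """RTO before the floor is applied: SRTT + max(G, K * RTTVAR)."""
        return self.srtt + max(self.granularity, K * self.rttvar)

    def update(self, r: float) -> float:
        """Feed one RTT sample (seconds) and return the floored RTO."""
        if self.srtt is None:
            # First measurement, rule (2.2).
            self.srtt = r
            self.rttvar = r / 2
        else:
            # Later measurements, rule (2.3): RTTVAR is updated first,
            # using the SRTT value from before this sample.
            self.rttvar = (1 - BETA) * self.rttvar + BETA * abs(self.srtt - r)
            self.srtt = (1 - ALPHA) * self.srtt + ALPHA * r
        return max(self.min_rto, self.raw_rto())


def final_rto(samples: List[float], min_rto: float) -> Tuple[float, float]:
    """Return (raw RTO, floored RTO) after feeding every sample."""
    est = RtoEstimator(min_rto=min_rto)
    floored = min_rto
    for r in samples:
        floored = est.update(r)
    return est.raw_rto(), floored


# Deltas of frames 1942, 1944, 1946, 1948 and 1950 in the server-side capture.
SAMPLES = [0.000592, 0.000559, 0.000577, 0.000619, 0.000511]

if __name__ == "__main__":
    for floor in (1.0, 0.2):
        raw, floored = final_rto(SAMPLES, floor)
        print(f"floor={floor}s raw={raw * 1000:.3f}ms floored={floored * 1000:.0f}ms")
```

With these samples the unfloored value stays within a few milliseconds, so
both the 1 second floor and a 200ms floor are what end up setting the timer.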
>
>
> 1. Why was the now quaint 1 second minimum RTO carried over into RFC 6298?
> 2. Is there a way we can get the minimum RTO down below even 200ms in a
> low latency network without breaking other TCP features?
>
> I can see some discussion on the internet about this, but I don't see any
> movement toward a reasonable modern value in the most recent RFC 6298.
>
> e.g.
> http://www.icir.org/mallman/papers/draft-allman-tcpm-rto-consider-00.txt
>
>
>    Also, note, that in addition to the experiments discussed in [AP99],
>    the Linux TCP implementation has been using various non-standard RTO
>    mechanisms for many years seemingly without large scale problems
>    (e.g., using different EWMA gains).  Also, a number of
>    implementations use minimum RTOs that are less than the 1 second
>    specified in [RFC2988].  While the precise implications of this may
>    show more spurious retransmits (per [AP99]) we are aware of no large
>    scale problems caused by this change to the minimum RTO
>
> Mark Lloyd | Enterprise Network Engineer
> F5 Networks
>
>
>