[tcpm] 2581 implementation report, take 2
Mark Allman <mallman@icir.org> Tue, 30 October 2007 19:58 UTC
To: tcpm@ietf.org
From: Mark Allman <mallman@icir.org>
Organization: ICSI Center for Internet Research (ICIR)
Attached is a slightly tweaked version of the 2581 implementation
report. The report includes input from the Linux community noting that
2581 is implemented in their stack and that they have not seen any sort
of big problems because of it.

allman
Background:

  + RFC 2581 is a re-write of RFC 2001. RFC 2001 was a description of
    TCP's congestion control algorithms that was published long after
    these algorithms were in nearly ubiquitous deployment throughout
    the Internet (largely triggered by the congestion collapses of the
    mid-1980s).

  + While RFC 2001 was a description of the algorithms, RFC 2581 is a
    more traditional specification. We stress that the RFC was written
    based on running code and experience.

  + The mechanisms in RFC 3042 (Limited Transmit) and RFC 3390 (Larger
    Initial Congestion Window) are also rolled into the current
    document. Both of these enhancements are Proposed Standards that
    have gathered wide consensus within the community based on
    deployment experience.

  + The traditional test of two interoperable implementations to move
    a Proposed Standard to Draft Standard is less obvious in the case
    of congestion control mechanisms. Congestion control is about
    *when* to send a segment and not *what* that segment looks like,
    how to process it, how big fields are, etc. Therefore, it is
    difficult to assess "interoperability" in the traditional sense.
    Below we cite several sources that show or suggest that multiple
    implementations of the mechanisms exist and seem to work as
    intended.

  + The new version of the document clarifies a number of small issues
    that implementers have asked about over the years, but does not
    make any large changes to the algorithms.

Known Implementations:

  + [WS95] discusses the BSD implementation of the core algorithms in
    RFC 2581 (slow start, congestion avoidance, fast retransmit and
    fast recovery). This implementation has formed the basis of the
    TCP stack in numerous operating systems (NetBSD, FreeBSD, OpenBSD,
    SunOS 4.x, BSDI, etc.). While various operating systems may have
    diverged in small details (some of which are documented in
    RFC 2581), the basic algorithms do not seem to have changed.

  + Linux also supports RFC 2581 and does not report any adverse
    impacts. See Attachment 1 below. (The complaint in that email is
    not about the document itself or even the algorithm within
    RFC 2581, but rather goes to our congestion control principles.
    Further, as sketched, the behavior given in RFC 2581 is more
    conservative than desired; therefore, if this RFC is in error, it
    is erring in the right direction for stable operation.)

  + [Pax97] analyzes a number of implementations, finding both correct
    and incorrect behavior relative to RFC 2581. The incorrect
    behavior fed into [RFC2525].

  + [MAF05] tests for conformance along a number of angles by probing
    the TCPs of over 70K web servers with specialized packet streams
    that induce the stack to show how it handles various situations.
    The results include:

      + The vast majority of servers reduce their congestion window by
        half in response to congestion (per RFC 2581's congestion
        avoidance).

      + The majority of the web servers used an initial congestion
        window of 1--2 packets.

      + Limited Transmit was used in over 20% of the servers.

      + While some servers do not use fast retransmit, the
        overwhelming majority implement it.

      + Many web servers use the fast recovery algorithm (with a
        number using more advanced recovery such as NewReno [RFC3782]
        or SACK-based loss recovery techniques [RFC2018,RFC3517]).

    (Note that [MAF05] updates some of the results of [PF01]. The
    newer results confirm the older results.)

References:

  [MAF05] Alberto Medina, Mark Allman, Sally Floyd. Measuring the
          Evolution of Transport Protocols in the Internet. ACM
          Computer Communication Review, 35(2), April 2005.

  [Pax97] Vern Paxson. Automated Packet Trace Analysis of TCP
          Implementations. ACM SIGCOMM, September 1997.

  [PF01]  Jitu Padhye, Sally Floyd. Identifying the TCP Behavior of
          Web Servers. ACM SIGCOMM, August 2001.

  [WS95]  G. Wright, W. Stevens. TCP/IP Illustrated, Volume 2: The
          Implementation. Addison-Wesley, 1995.
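For readers who want the behaviors probed in [MAF05] in concrete form, the following is a minimal sketch of the window arithmetic in RFC 2581 and RFC 3390. It is an illustration only, not code from any of the stacks cited above. The function names and the 1460-byte SMSS default are assumptions made for the example, and the sketch omits fast recovery's window inflation on further duplicate ACKs.

```python
SMSS = 1460  # assumed sender maximum segment size, in bytes


def initial_window(smss=SMSS):
    """Upper bound on the initial congestion window per RFC 3390, in bytes."""
    return min(4 * smss, max(2 * smss, 4380))


def on_new_ack(cwnd, ssthresh, smss=SMSS):
    """Return the new cwnd after one ACK that acknowledges new data."""
    if cwnd < ssthresh:
        # Slow start: grow by one full segment per ACK.
        return cwnd + smss
    # Congestion avoidance: grow by roughly one segment per round trip.
    return cwnd + max(1, smss * smss // cwnd)


def on_fast_retransmit(flight_size, smss=SMSS):
    """Return (ssthresh, cwnd) after the third duplicate ACK."""
    ssthresh = max(flight_size // 2, 2 * smss)
    # cwnd is inflated by the three segments known to have left the network.
    return ssthresh, ssthresh + 3 * smss


def on_timeout(flight_size, smss=SMSS):
    """Return (ssthresh, cwnd) after a retransmission timeout."""
    ssthresh = max(flight_size // 2, 2 * smss)
    # The sender restarts from a loss window of one segment.
    return ssthresh, smss
```

With these assumptions, on_fast_retransmit(14600) sets ssthresh to 7300 bytes, which is the "reduce by half" response that [MAF05] observed in the vast majority of servers.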
Attachment 1:

  Date: Mon, 24 Sep 2007 19:55:20 PDT
  To: mallman@icir.org
  From: Stephen Hemminger <shemminger@linux-foundation.org>
  Subject: Re: rfc2581

  [...]

  Yes Linux implements RFC2581 and has not had any unstable or
  congestion problems caused by that. In recent years, there has been
  lots of refinements and alternatives added, but all the other
  algorithms are more complex attempts to ensure proper and stable
  response in "corner case" domains of large delay bandwidth products
  and/or small router queues.

  Linux also implements RFC2861 (congestion window validation) by
  default which makes it less aggressive than many other
  implementations. Because this caused some bursty applications to
  have poor performance it was made optional.

  The only real complaint against the principles of congestion control
  has come from the financial community. Slow start can cause
  connections to have latency, and when latency equates to real $$
  during transactions, customers get very sensitive to the added
  delay. For a discussion of this see the presentation from Credit
  Suisse at this 2007 Kernel Summit.

      http://lwn.net/Articles/248878/

  For that reason, they are looking to alternatives to TCP/IP such as
  Infiniband.

  --
  Stephen Hemminger <shemminger@linux-foundation.org>
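The latency complaint in the attachment can be made concrete with a rough sketch of how many round trips slow start needs to deliver a short transfer. This is an idealized model, not a measurement. It assumes no loss, an ACK for every segment (no delayed ACKs), an unlimited receiver window, and a window that never reaches ssthresh, so the congestion window doubles every round trip. The function name and its parameters are hypothetical.

```python
def slow_start_round_trips(segments, initial_window=2):
    """Count round trips needed to send `segments` full-sized segments.

    Idealized model: the congestion window (in segments) starts at
    `initial_window` and doubles each round trip, with no loss.
    """
    cwnd = initial_window
    sent = 0
    round_trips = 0
    while sent < segments:
        sent += cwnd          # one window of data goes out per round trip
        cwnd *= 2             # each ACK adds a segment, so the window doubles
        round_trips += 1
    return round_trips
```

Under this model a 100-segment transfer takes 6 round trips from an initial window of 2 segments and 7 from an initial window of 1, which is the added delay that latency-sensitive applications notice on new or idle connections.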