[rrg] LISP PMTU - 2 methods in draft-farinacci-lisp-11
Robin Whittle <rw@firstpr.com.au> Tue, 27 January 2009 05:27 UTC
Message-ID: <497E9BB2.5040101@firstpr.com.au>
Date: Tue, 27 Jan 2009 16:29:22 +1100
From: Robin Whittle <rw@firstpr.com.au>
Organization: First Principles
To: RRG <rrg@irtf.org>
Cc: lisp-interest@lists.civil-tongue.net
Subject: [rrg] LISP PMTU - 2 methods in draft-farinacci-lisp-11
Short version: The Stateless approach is a really bad idea for the
future, when ~9000 byte MTU ITR to ETR paths become common. The
Stateful approach somewhat resembles Ivip's approach - which is
documented in greater detail and limits the number of traffic packets
for which state must be stored in case a PTB is received.

Here are my thoughts on the two techniques (Stateless and Stateful)
for LISP Path MTU Discovery management in the latest (19 December)
draft.

My understanding of the whole problem (for Ivip's map-encap modes or
for any other map-encap scheme, such as LISP or APT), together with
my solution for Ivip, is here:

  http://www.firstpr.com.au/ip/ivip/pmtud-frag/

Here is my critique of the Stateless approach, assuming L = 1500, as
is recommended at the end of:

  http://tools.ietf.org/html/draft-farinacci-lisp-11#section-5.4.1

The stateless approach was rejected by the OpenLISP team:

  http://tools.ietf.org/html/draft-iannone-openlisp-implementation-01#section-6.8.1


Stateless IPv4 DF=0
-------------------

L is chosen for the entire global LISP system to be a minimum value
of MTU which can be expected of any ITR->ETR tunnel. Here I assume
1500 bytes, as is recommended in the last line of section 5.4.1.
Perhaps it would need to be less, such as 1470 or less. Google
servers regularly send 1470 byte DF=0 packets:

  http://www.firstpr.com.au/ip/ivip/ipv4-bits/actual-packets.html#google-no-pmtud

IPv4 DF=0 packets longer than some size S will be fragmented into two
fragments, each of which will be encapsulated. The ETR decapsulates
them separately and sends the two fragments to the destination
network.

S is set globally to L minus the encapsulation overhead H, which is
36 bytes (IPv4, UDP and LISP headers, in section 5.1). So S = 1464
bytes.

This will not work when the packet length is more than about twice S
(since fragmentation produces two packets whose combined length is
marginally longer than the original packet).
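The arithmetic above can be checked with a rough sketch. The function
names are hypothetical, and the splitting rule (two near-equal
fragments, the first rounded up to an 8 byte boundary, 20 byte IPv4
header with no options) is my assumption rather than text from the
draft:

```python
from typing import Tuple

IPV4_HEADER = 20   # bytes; assumes the original packet has no IP options
H = 36             # encapsulation overhead: IPv4 + UDP + LISP headers
L = 1500           # assumed global minimum ITR->ETR tunnel MTU
S = L - H          # packets longer than this get fragmented by the ITR


def split_in_two(packet_len: int) -> Tuple[int, int]:
    """Lengths of the two IPv4 fragments of a packet longer than S.

    Assumption: a near-equal split, with the first fragment's payload
    rounded up to a multiple of 8 bytes, since IPv4 fragment offsets
    are counted in 8 byte units.
    """
    payload = packet_len - IPV4_HEADER
    half = (payload + 1) // 2
    first = min(payload, (half + 7) // 8 * 8)
    second = payload - first
    return IPV4_HEADER + first, IPV4_HEADER + second


def encapsulated_sizes(packet_len: int) -> Tuple[int, int]:
    """Size of each fragment once the H bytes of headers are added."""
    a, b = split_in_two(packet_len)
    return a + H, b + H


def fits(packet_len: int) -> bool:
    """True if both encapsulated fragments are no longer than L."""
    return all(size <= L for size in encapsulated_sizes(packet_len))
```

With these assumptions, a 1500 byte DF=0 packet splits into two
fragments which both fit after encapsulation, while a 3000 byte packet
does not, because each half is already longer than S.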
The statement:

  "This will ensure that the new, encapsulated packets are of size
   (S/2 + H), which is always below the effective tunnel MTU."

will not apply when the incoming packets are more than about twice S
in length.

Still, with a bit of tightening of the spec, I see no major problems
with this approach to IPv4 fragmentable packets compared to what I
propose for Ivip.

I don't think a core-edge separation scheme should be required to
support DF=0 packets longer than something like 1470 bytes into the
indefinite future. RFC 1191 PMTUD has been around since the early
1990s and I think it is time that hosts stopped expecting the network
to fragment packets. The IPv6 designers evidently thought the same in
1996.


Stateless IPv4 DF=1 and IPv6
----------------------------

This approach makes no sense for the long-term future since it forces
all traffic to be in packets no longer than 1464 bytes (IPv4):

  the ITR will drop the packet when the size is greater than L, and
  sends an ICMP Too Big message to the source with a value of S,
  where S is (L - H).

Replacing the variables with constants for IPv4, this means:

  the ITR will drop the packet when the size is greater than 1500
  bytes, and sends an ICMP Too Big message to the source with a
  value of 1464 bytes.

For IPv6, the limiting size is set by the 56 byte overhead, to 1444
bytes.

A proper solution should keep working well when the DFZ includes MTUs
of 9000 bytes between ITR and ETR - and should allow the sending host
to generate packets of nearly this size, so that once encapsulated
they still fit within whatever the MTU is for the path to this ETR.

Fred Templin raised similar concerns:

  http://www.irtf.org/pipermail/rrg/2009-January/000884.html

(My guess is that the non-DF=0 part of this approach was written with
an assumption that L would be individually determined by the ITR for
each ETR - but that is not the way the I-D is written: L is fixed at
1500 bytes, and any such method would be stateful.)
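To make the cost of a fixed L concrete, here is a small sketch
comparing the MTU value the stateless ITR reports with what a 9000
byte path could actually carry. The function names are hypothetical;
the overheads are the 36 and 56 byte figures used above:

```python
H_IPV4 = 36   # IPv4 + UDP + LISP encapsulation overhead
H_IPV6 = 56   # overhead when the outer header is IPv6


def stateless_ptb_mtu(l: int, h: int) -> int:
    """MTU value S = L - H reported to the sending host in the PTB."""
    return l - h


def largest_packet_path_could_carry(path_mtu: int, h: int) -> int:
    """Longest traffic packet which still fits the ITR->ETR path once
    encapsulated, if the ITR knew the real path MTU."""
    return path_mtu - h


for name, h in (("IPv4", H_IPV4), ("IPv6", H_IPV6)):
    capped = stateless_ptb_mtu(1500, h)
    possible = largest_packet_path_could_carry(9000, h)
    print(f"{name}: host limited to {capped}, path could carry {possible}")
```

On a 9000 byte path the sending host is still held to 1464 (IPv4) or
1444 (IPv6) bytes, when packets of 8964 or 8944 bytes would fit.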
Stateful approach for all packets
---------------------------------

I am reading the LISP version - I have not looked in detail at the
OpenLISP source of this approach.

This makes no reference to IPv4 DF=0 packets. So this approach of the
ITR sending a PTB packet to the sending host when a DF=0 packet
exceeds some length is not going to result in any action on the part
of the sending host. Such a DF=0 packet will be dropped by the ITR.
That may be OK - Ivip will do much the same - but it needs to be
specified clearly.

This approach of determining the MTU to each ETR by receiving ICMP
messages from an intermediate router needs to be done securely. It
requires the ITR to cache significant amounts of information for
every packet it sends which might trigger such a PTB. The
intermediate router would need to send back enough of the original
packet to the ITR to include the LISP nonce. Otherwise, PTBs spoofed
by off-path attackers would be accepted and the whole system could
easily be DoSed.

The ITR needs to store an initial fragment of each incoming traffic
packet for some time, so it can generate a PTB message for the
sending host. It can't rely on enough of the original packet coming
back in the PTB from the intermediate router. The ITR needs to cache
this for a second or two at least - while it waits for a possible
PTB. This is an onerous requirement in a high-volume ITR.

Ivip map-encap ITRs need to perform a stateful PMTU determination
process which is somewhat similar to this. However, the Ivip approach
quickly narrows the zone of uncertainty so that the number of packets
involved in testing PMTU is very limited. The Ivip approach is
specified in greater detail, including occasionally sending longer
packets (if and when the current or some other sending host generates
them) to see if the MTU has grown since the last PTB set a limit on
it, as far as the ITR was concerned.
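The per-packet state which the stateful approach demands of an ITR
can be sketched roughly as follows. This is only an illustration of
the caching burden described above, not code from LISP, OpenLISP or
Ivip. The class and method names, the 2 second hold time and the 28
byte stored head (a 20 byte IPv4 header plus the 8 bytes an ICMP
error must quote) are all my assumptions:

```python
from dataclasses import dataclass
from typing import Dict, Optional

HOLD_SECONDS = 2.0   # assumption: "a second or two" waiting for a PTB
HEAD_BYTES = 28      # assumption: IPv4 header (no options) + 8 bytes


@dataclass
class Pending:
    head: bytes      # initial fragment of the original traffic packet
    expires: float   # time after which the entry is no longer trusted


class PtbCache:
    """Hypothetical ITR-side cache, keyed by the nonce of each sent packet."""

    def __init__(self) -> None:
        self._pending: Dict[int, Pending] = {}

    def remember(self, nonce: int, packet: bytes, now: float) -> None:
        """Store the start of a traffic packet when it is encapsulated."""
        self._pending[nonce] = Pending(packet[:HEAD_BYTES], now + HOLD_SECONDS)

    def on_ptb(self, nonce: Optional[int], now: float) -> Optional[bytes]:
        """Handle a PTB from an intermediate router.

        Returns the stored head, from which a PTB to the sending host
        can be built, or None if the nonce is missing, unknown or
        stale - in which case the PTB is ignored as possibly spoofed.
        """
        entry = self._pending.pop(nonce, None)
        if entry is None or entry.expires < now:
            return None
        return entry.head

    def expire(self, now: float) -> int:
        """Drop entries whose hold time has passed; returns how many."""
        stale = [n for n, e in self._pending.items() if e.expires < now]
        for n in stale:
            del self._pending[n]
        return len(stale)
```

Every traffic packet which might trigger a PTB adds an entry, so the
size of this cache grows with packet rate multiplied by hold time -
which is why the requirement is onerous in a high-volume ITR.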
 - Robin