Path MTU Discovery WG (pmtud)
Monday, 7 November 2005, 18:50--19:50
CHAIRS: Matt Mathis
(Jabber) Scribe: Michael Richardson
These minutes were generated by the Chairs from notes taken by
Michael Richardson and Matt Zekauskas.
1. Agenda bashing, milestones review
2. PMTUD method draft update and issue summary
3. draft-templin-linkadapt-01.txt - should it be a pmtud item?
4. Remaining Issues
1. Agenda bashing, milestones review
Matt Zekauskas opened the meeting by reviewing the agenda and milestones.
We think we're close to the end of the second milestone (revise draft based
on stakeholder feedback and implementation experience). We need to come
to closure and submit the method draft along with a MIB document.
John Heffner is working on a Linux implementation that follows the
current draft; when that is done and the one or two remaining technical
issues resolved, we should be ready to submit the method draft.
2. PMTUD method draft overview
John Heffner gave a presentation on the rewrite of the method draft.
The changes were all made to simplify it and to relax things that were
overspecified. There's only a small part that needs to be in
standards language -- when it is OK to ignore loss as a congestion
signal. Most of the text describes heuristics. The algorithm is now
strictly an extension to classical PMTUD: there are no changes to ICMP
processing specified, the probing algorithm has been decoupled from
verification, and the terminology has been simplified.
Historically, the drafts have had a lot of ICMP verification. However,
this wasn't strictly necessary, and it dovetailed with work Fernando Gont
has been doing more generally with ICMP. Thus it was all removed from
this draft, which simplifies the text considerably. The old text could
be made into a separate document, or combined with Fernando's work.
The other major change was to decouple verification from probing; this
was discussed a bit in Minneapolis, and the idea is based on
implementation experience. By decoupling verification from probing,
an implementation could potentially probe with larger and larger
packets every RTT, and keep pace with slow-start. It makes
discovering a large MTU quite fast. It also simplifies the
description; there can now be two independent state machines. It also
allows for evolution and innovation in verification; the authors feel
that the current verification method could be improved.
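As an illustration only (this is not taken from the draft or from any
implementation discussed in the meeting), the probing half of the
decoupled design could be sketched in Python as below. The class name
ProbeSearch, its method names, and the doubling-then-bisecting policy are
all assumptions made for the sketch; it also treats a single lost probe
as conclusive, which a real implementation would likely not do.

```python
class ProbeSearch:
    """Tracks the probing search range only; verification lives elsewhere."""

    def __init__(self, known_good_mtu, upper_bound_mtu):
        self.low = known_good_mtu     # largest size known to get through
        self.high = upper_bound_mtu   # largest size still worth trying

    def next_probe_size(self):
        """Return the next size to probe, or None once the search converges."""
        if self.low >= self.high:
            return None
        doubled = self.low * 2
        if doubled <= self.high:
            # Grow geometrically, so one probe per RTT keeps pace with slow-start.
            return doubled
        # Doubling would overshoot the upper bound: bisect the remaining range.
        return (self.low + self.high + 1) // 2

    def on_probe_delivered(self, size):
        # A delivered probe raises the known-good size.
        self.low = max(self.low, size)

    def on_probe_lost(self, size):
        # Simplification (assumption): one lost probe rules out that size.
        self.high = min(self.high, size - 1)
```

Because the object holds no verification state, a separate state machine
can decide when a newly raised MTU is trusted for regular traffic.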
Finally, the notion of maximum packet size was removed, and all descriptions
are in terms of MTU. This implies that the packetization protocol must
now know the total packet sizes sent out (including IP headers). However,
how it knows is no longer specified.
John then described a Linux implementation he is working on that
follows the current draft. It currently has no verification phase;
he's adding that back in. He did add a black hole discovery mechanism
that allows an implementation to start with a large MTU, and only
"probe up" (that is, fall back to a small MTU and then probe for larger
and larger values) if the initial MTU is too big. He's also working
on the proper caching of state.
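The minutes do not record how the Linux code detects a black hole, so the
following Python sketch is only one plausible shape for the fallback
behavior described above. The class name BlackHoleDetector, the
loss_threshold parameter, and the idea of counting consecutive losses of
full-sized packets are assumptions for illustration.

```python
class BlackHoleDetector:
    """Start with a large MTU; fall back and probe up only if it appears black-holed."""

    def __init__(self, initial_mtu, fallback_mtu, loss_threshold=3):
        self.mtu = initial_mtu
        self.fallback_mtu = fallback_mtu
        self.loss_threshold = loss_threshold  # hypothetical tuning knob
        self.consecutive_losses = 0
        self.probing_up = False               # True once we have fallen back

    def on_full_sized_packet_acked(self):
        # Any delivered full-sized packet shows the current MTU works.
        self.consecutive_losses = 0

    def on_full_sized_packet_lost(self):
        if self.probing_up:
            return  # already fell back; losses are handled by the prober
        self.consecutive_losses += 1
        if self.consecutive_losses >= self.loss_threshold:
            # Assume a black hole: drop to the small MTU and start probing up.
            self.mtu = self.fallback_mtu
            self.probing_up = True
```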
Fernando Gont asked about starting with large packets, and whether there
is any guidance for implementers. John replied that the general point is
that if you expect large packets won't work, then start with a small
MTU and probe for larger values. If you expect them to work in the
common case, then start with a large MTU.
Fernando also commented on the difficulty of TCP (or any other
packetization protocol) finding the complete size of packets with
headers. He noted that in BSD-derived systems, at least, it is not
difficult. John stated that in all stacks, you know the maximum
packet size; the issue is if the packetization layer knows the length
of the particular packet. He noted that in general, you know how much
is reserved, not how much is used. Matt Mathis pointed out the real
issue for PMTUD is that the probe will fail if it's not padded out to the
proper length, including headers. In particular, if options are used
there could be a problem.
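To make the padding concern concrete, here is a small Python sketch of
the arithmetic. The function names are hypothetical; the header sizes in
the usage note (20-byte IPv4 header without options, 20-byte base TCP
header, 12 bytes for a padded timestamp option, 60-byte maximum TCP
header) are standard values, not figures quoted in the meeting.

```python
def probe_payload_len(probe_size, ip_header_len, transport_header_len):
    """Payload bytes needed so the whole IP packet is exactly probe_size."""
    payload = probe_size - ip_header_len - transport_header_len
    if payload <= 0:
        raise ValueError("probe size is too small to hold the headers")
    return payload


def wire_size(payload_len, ip_header_len, transport_header_len):
    """Total IP packet size actually sent for a given payload."""
    return payload_len + ip_header_len + transport_header_len
```

For a 1500-byte probe over IPv4 with TCP timestamps in use, the payload
must be probe_payload_len(1500, 20, 32), i.e. 1448 bytes. If the payload
were instead sized from the 60 bytes reserved for the largest possible
TCP header (1420 bytes), the packet on the wire would be only
wire_size(1420, 20, 32) = 1472 bytes, and the probe would not actually
test a 1500-byte MTU.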
Fernando also offered to create an implementation in OpenBSD. John
(and Matt) thought it would be an excellent contribution; Kevin Lahey
did a very early implementation for NetBSD, and a new current
implementation would be appreciated (as would comments based on
Matt Mathis started a discussion around one of the open technical issues
in the draft: how often should you probe to see if the MTU has been raised
along a path. The draft currently says five minutes, but that value
was somewhat arbitrary. Matt wondered if anyone has insights into
existing precedent or a rational way of picking a number.
Michael Richardson thought 5 minutes sounded about right, and in
any case it should be greater than the 2 minute TCP timeout; he
also thought it should not be larger than the typical damping interval
for BGP. John Heffner commented that ten minutes is commonly used for
values in route caches. Fernando Gont suggested a strategy being considered
for OpenBSD: base the probe interval on the current MTU size. If the
size is small, probe more frequently; if the value is large (and still
working) use a larger timeout. Matt asked if Fernando considered 1500
large or small. Fernando thought "small", and a "large" value might
be something greater than 6000 right now; a value of 10 to 15 minutes
might be appropriate there. Dave Thaler noted that since the draft
is now an extension of "PMTUD classic", RFC 1981 says 10 minutes,
so that's a value that you would not have to defend.
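The MTU-dependent interval Fernando described could look roughly like the
Python sketch below. The function and constant names are hypothetical.
The 6000-byte threshold and the 15-minute upper value come from the
discussion above; using the draft's current five minutes for "small" MTUs
is an assumption, since no specific small-MTU interval was proposed.

```python
LARGE_MTU_THRESHOLD = 6000  # bytes; "large" per the discussion, as of 2005


def reprobe_interval_seconds(current_mtu,
                             small_mtu_interval=5 * 60,
                             large_mtu_interval=15 * 60):
    """Seconds to wait before probing again for a larger path MTU."""
    if current_mtu > LARGE_MTU_THRESHOLD:
        # A large MTU that is still working can be re-probed less often.
        return large_mtu_interval
    # A small MTU (1500 counts as small here) is re-probed more often.
    return small_mtu_interval
```

Passing 10 * 60 for both intervals reduces this to the fixed ten-minute
value from classical PMTUD that Dave mentioned.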
3. draft-templin-linkadapt-01.txt - should it be a pmtud item?
Fred Templin came in late, but wanted to bring up one of his drafts,
draft-templin-linkadapt-01.txt. As we currently understand it, this
draft discusses having IPv6 tunnel endpoints know the size of the MTU
for the IPv4 tunnel, so that they can respond correctly. Thus, the
endpoints would run PMTUD on the tunnel itself. This generated
discussion to see what the draft proposed, and try to understand the
nature of Fred's request, which we believe is: "is this draft in-scope
Dave Thaler noted that this draft tries to solve the same problem as
the current draft, but for IPv6-over-IPv4 tunnel endpoints. It was
noted that there are cases where an IPv4 packet might or might not
have the DF bit set; and even then there is the problem noted in
earlier meetings that IPsec allows an IPsec tunnel to fragment packets
and reassemble them at tunnel exit. In any event, the tunnel entry
would like to send back an appropriate ICMP packet too big response.
Dave Thaler felt that the draft is certainly of interest to working
group members, but wasn't sure if it was in scope.
Matt Mathis thought that the current outline of what to do with
transports such as DCCP would be applicable to tunnels. He also
noted that he felt pressure to get the current draft done, and didn't
want to broaden scope before we were done. Whether we should consider
it for the group will require AD discussion, and we should remain
mindful that there is a lot of desire to get the current working
group draft done.
4. Remaining Issues
Dave Thaler asked about a more specific plan to get the MIB
written. Matt Mathis asked for volunteers to help specify the MIB
(no one stepped forward in the meeting -- the chairs will ask again on the
mailing list). One issue that Matt Mathis hasn't solved is what to use for an
index; you want something like a protocol-independent connection ID.
It's a bit more granular than what is in existing host MIB RFCs, since
it potentially specifies a particular flow. Perhaps it should be an
extension to the IP MIB if the contents are (transport) protocol independent.
Fernando Gont asked about the ICMP text that was taken out of the -05
version of the draft; John Heffner mentioned that it could go in another
more general ICMP advice draft. Matt and John said that they have no
concrete plans to pursue this at this time; there are some really good
ideas that are only embodied in Internet Drafts and they deserve a more
permanent home. The current focus is on finishing the milestones.
With that the meeting ended.