IETF Path MTU Discovery WG (pmtud)
Thursday, March 4, 2004 at 15:30 to 17:30
The meeting was moderated by the working group chairs, Matt Mathis and Matt
Zekauskas. Simon Leinen and Matt Zekauskas took notes, which were edited
into these minutes by the chairs.
1. Agenda Bashing
2. PMTUD method document status
3. (if interest, and advocate):
4. Milestone status
1. Agenda Bashing
-- The Chairs
Joe Touch: are we going to discuss Richardson IPsec draft?
issue -- on IPsec ML, there is an active 2401bis discussion
dealing with fragmentation problems
reassemble before decrypt? after decrypt?
it's a mess
MM: thinking about RFC, fragmentation even more harmful than thought
MZ: what can we do?
JT: add value to the discussion on the ipsec ML
issue of fragmentation 2 network, a few rubrics:
2. different sources, else fail before resend
[hangs that result from tunnels?--mjz]
3. view DNF as a covert channel
SPI negotiates SET ("ok"), COPY, or CLEAR
argue, !clear, then drop. prevent covert channel
These folks want complete control over visibility in header...
even if violates semantics to change something in middle...
MM: one soln, IPv4 that emulates IPv6... DF on frags, too.
JT: problem is IPsec in middle on tunnel, not enough room for all the
IPsec headers, must fragment.
S. Parthi noted that in Solaris fragmentation is not on the fast path and is
not often used, but that recently there has been a big rise in its use for
tunnels and encryption.
MM: there are security boxes that ignore fragmentation, or fragment
anyway if DF is set; and most of these are major manufacturers' boxes...
2. PMTUD method update
-- Matt Mathis
On the "running code" slide, the sender first sends a probe packet (a small
probe each RTT), and then the MTU changes. In this case, the transfer is CPU
limited, and the transfer speeds up when the MTU rises.
JT: It looks like the MTU went up and down?
You send a single larger MTU probe packet, and spend the next RTT at the old
MTU until the probe succeeds (so you don't lose a whole window of data if
the MTU is too large).
Joe wondered if the case where a tunnel is introduced and the MTU goes down
had been tested.
If there is no ICMP message, TCP will get a timeout. At that point we pull
the MTU way back and then restart.
Joe wanted to be sure that that case had been tested... and MM replied that
one of the implementations did that.
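
[Editorial illustration, not presented at the meeting.] The probing behaviour
described above can be sketched roughly as follows. This is a minimal sketch
under stated assumptions: the class and method names are hypothetical, and
the fallback value SAFE_MTU = 512 is borrowed from the later discussion of
repeated timeouts rather than from the method document itself.

```python
from dataclasses import dataclass
from typing import Optional

SAFE_MTU = 512  # assumed "safe" fallback size; illustrative only


@dataclass
class ProbeState:
    """Hypothetical per-connection state for one-probe-at-a-time PMTUD."""
    mtu: int                          # size used for ordinary data
    probe_size: Optional[int] = None  # size of the outstanding probe, if any

    def start_probe(self, candidate: int) -> None:
        # Send a single larger packet; all other data stays at the old MTU,
        # so a failed probe costs one packet rather than a whole window.
        if candidate > self.mtu and self.probe_size is None:
            self.probe_size = candidate

    def on_probe_acked(self) -> None:
        # The probe got through: raise the MTU used for data.
        if self.probe_size is not None:
            self.mtu = self.probe_size
            self.probe_size = None

    def on_probe_lost(self) -> None:
        # The probe was lost: keep the old MTU and forget the probe.
        self.probe_size = None

    def on_timeout(self) -> None:
        # No ICMP message arrived and TCP timed out (for example, a tunnel
        # appeared on the path): pull the MTU way back and restart.
        self.probe_size = None
        self.mtu = min(self.mtu, SAFE_MTU)


state = ProbeState(mtu=1500)
state.start_probe(4000)   # data continues at 1500 while the probe is out
state.on_probe_acked()    # the probe succeeded, so data moves to 4000
state.on_timeout()        # later, the path shrinks without any ICMP message
```

After the final call, the state has fallen back to the assumed safe size and
probing can begin again from there.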
Matt continued with the slide presentation.
Matt Mathis made a call for implementations. We encourage people to do
implementations. To get the next round of bugs out of the document will
require people to look at code and see where the document is
Matt noted there were two open robustness issues. First, what the transport
should do when there are repeated timeouts. At one point we thought we
would be able to have unified text on this issue. For MTU, you want to
pull it down to something safe. However, you would need to be unified on
the different timeouts. This is a bigger problem than we thought it was.
Second, what to do when the path ignores DF.
However, the issue with MTU raising loss rate has been solved in the
In the case of not honoring DF:
There is evidence it is becoming more common. However, there is only
lore, not based on measurements.
There's a case where this might be fatal with high-speed transfers. At high
rates it does not take much time to wrap the 16-bit IPid field. Suppose a
single high-speed connection uses IP fragmentation inappropriately (UDP
transfer tools, say). If a low fragment is dropped and the protocol recovers
in the usual way, the remaining high fragment may still be sitting in the
reassembly queue.
Now the sequence space wraps.
A new low fragment can now be associated with the old high fragment.
If this causes a CRC error, then the packet is dropped (and in fact all
packets would be dropped, because from then on the lower and upper halves
would not associate correctly). If the packet does not fail a CRC check,
then you could pass incorrect data to applications.
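
[Editorial illustration, not presented at the meeting.] The mis-association
can be reproduced with a toy reassembler. This is a simplified sketch: every
datagram is assumed to split into exactly two fragments ("lo" and "hi"), no
reassembly timer is modelled, and the Reassembler class, addresses, and
payloads are all hypothetical.

```python
ID_SPACE = 1 << 16  # the IPid field is 16 bits wide


class Reassembler:
    """Toy reassembly queue keyed by (src, dst, protocol, IPid)."""

    def __init__(self):
        self.pending = {}  # key -> {"lo": bytes, "hi": bytes}

    def receive(self, key, half, payload):
        parts = self.pending.setdefault(key, {})
        parts[half] = payload
        if "lo" in parts and "hi" in parts:
            del self.pending[key]
            return parts["lo"] + parts["hi"]
        return None  # still waiting for the other half


def demo():
    r = Reassembler()
    flow = ("192.0.2.1", "192.0.2.2", 17)  # src, dst, protocol (UDP)
    old_id = 7
    new_id = (old_id + ID_SPACE) % ID_SPACE  # the same IPid after one wrap

    # The low fragment of the old datagram was lost; only "hi" arrives.
    r.receive(flow + (old_id,), "hi", b"OLD-HI")
    # After the wrap, a new datagram reuses the IPid. Its low fragment is
    # paired with the stale high fragment.
    mixed = r.receive(flow + (new_id,), "lo", b"NEW-LO")
    # The new high fragment is now stranded, ready to corrupt the next wrap.
    stranded = r.receive(flow + (new_id,), "hi", b"NEW-HI")
    return mixed, stranded, len(r.pending)


mixed, stranded, queued = demo()
```

Here `mixed` holds bytes from two different datagrams, and one fragment is
left queued, which is why the halves keep failing to line up afterwards.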
Joe: it seems like the party doing the fragmentation has violated semantics;
reusing an IPid within 2MSL is broken.
Matt agreed, but said that applications are then limited to 100 pps.
Joe: if we are declaring that people do this, and that we have to live with
it, then maybe we should declare that the IPid field is not used.
Matt thought it might be better to discourage using
Joe said that if tunnels are introduced you must fragment.
Matt said that if PMTU discovery works, then you should not fragment. This
probably belongs in a second document.
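
[Editorial illustration, not presented at the meeting.] The arithmetic behind
the IPid discussion above can be sketched as follows. The function names are
hypothetical, the link speed, packet size, and lifetime values are purely
illustrative, and the sketch assumes one IPid is consumed per packet from a
single counter.

```python
ID_SPACE = 2 ** 16  # number of distinct 16-bit IPid values


def wrap_time_seconds(link_bps, packet_bytes):
    """Time to consume the whole IPid space at a given sending rate."""
    packets_per_second = link_bps / (packet_bytes * 8)
    return ID_SPACE / packets_per_second


def max_pps_without_reuse(lifetime_seconds):
    """Highest packet rate that never reuses an IPid within the lifetime."""
    return ID_SPACE / lifetime_seconds


# Illustrative numbers only: 1 Gb/s of 1500-byte packets wraps in under a
# second, while avoiding reuse for several minutes caps the sender at a few
# hundred packets per second.
print(f"wrap time at 1 Gb/s: {wrap_time_seconds(1e9, 1500):.3f} s")
print(f"rate limit for a 240 s lifetime: {max_pps_without_reuse(240):.0f} pps")
```

The exact rate limit depends on the lifetime assumed, which the minutes do
not state.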
On repeated timeouts, the MTU is set to 512/1230. However, imagine a
scenario where probing gives an acceptable answer, but when you raise the MTU
and send many packets at the new MTU the link fails. Is there something we
should do to detect or prevent this?
Matt said we still need contributors, especially in the following areas:
* How to address interaction with tunnels in the document.
Joe Touch volunteered to help with the tunnels.
Larry Dunn asked about how much implementation experience we need before
pushing the document out.
Matt replied that we can declare doc done before there is extensive
implementation experience. There is a little bit in the document that is
standard, the rest is heuristics that we can try changing. So, we can gain
experience, and then tweak the document in two years based on the
At this point, we skipped to milestone status.
We noted that the first milestone was complete, but the last one had an
item that we had not at all addressed: MIBs.
We cover items that must be instrumented for visibility in the method
document. For example, there are APIs for Unix that no longer support MTU
testing. They support application fragmentation, and kernel
fragmentation (where the kernel does PMTUD for you) but no "send this big
packet even though you think it's too big".
Furthermore, the MIBs are typically transport-dependent, so the most we
could specify might be a MIB fragment that is inserted into other MIBs.
Joe Touch mentioned that there is work in another working group that is
relevant... API/MIB info for link. They did not think about the issue of
override, however. It is going to be a BCP, we should comment on that
Joe also went back to the Richardson draft; it provides interim
solutions for an IPsec gateway. It's not about IPsec, exactly, but
fragmentation after processing. PMTUD is related to the IPsec working
group, and they need to know about this proposed technology. We need to go
to that group and make them aware of it. 2401bis has been long coming -- we
should look at it before it goes to last call.
Sunil from Sun asked whether there is a testbed for this kind of thing:
someplace to show the benefits of PMTUD working, to be able to
quantify the gain. Both Matts said to come talk; there are
networks that support > 1500 byte MTUs.
Finally, we touched on the Welzl draft. There were no advocates in the room.
Joe Touch noted that if the options are only partially implemented it
would not be productive to proceed forward, because you cannot ratchet the
MTU down. You must search up.
Matt Mathis said that indeed the document does not stand alone, but must be
used with other methods.
Joe Touch thought that we needed to heed the rule to be gentle in what
you do, liberal in what you accept.
With no other comments the meeting adjourned early.