2.8.12 Path MTU Discovery (pmtud)

NOTE: This charter is a snapshot of the 60th IETF Meeting in San Diego, CA USA. It may now be out-of-date.

Last Modified: 2004-07-23

Chair(s):
Matt Mathis <mathis@psc.edu>
Matthew Zekauskas <matt@internet2.edu>
Transport Area Director(s):
Allison Mankin <mankin@psg.com>
Jon Peterson <jon.peterson@neustar.biz>
Transport Area Advisor:
Allison Mankin <mankin@psg.com>
Mailing Lists:
General Discussion: pmtud@ietf.org
To Subscribe: pmtud-request@ietf.org
In Body: subscribe email_address
Archive: http://www.ietf.org/mail-archive/web/pmtud/index.html
Description of Working Group:
The goal of the PMTUD working group is to specify a robust method for determining the IP Maximum Transmission Unit supported over an end-to-end path. This new method is expected to update most uses of RFC1191 and RFC1981, the current standards track protocols for this purpose. Various weaknesses in the current methods are documented in RFC2923, and have proven to be a chronic impediment to the deployment of new technologies that alter the path MTU, such as tunnels and new types of link layers.

The proposed new method does not rely on ICMP or other messages from the network. It finds the proper MTU by starting a connection using relatively small packets (e.g. TCP segments) and searching upwards by probing with progressively larger test packets (containing application data). If a probe packet is successfully delivered, then the path MTU is raised. The isolated loss of a probe packet (with or without an ICMP can't fragment message) is treated as an indication of an MTU limit, and not a congestion indicator.

The working group will specify the method for use in TCP and SCTP, and will outline what is necessary to support the method in transports such as DCCP. It will particularly describe the precise conditions under which lost packets are not treated as congestion indications. The work will pay particular attention to details that affect robustness and security.

Path MTU discovery has the potential to interact with many other parts of the Internet, including all link, transport, encapsulation and tunnel protocols. Therefore, this working group will particularly encourage input from a wide cross section of the IETF to help maximize the robustness of path MTU discovery in the presence of pathological behaviors from other components.

Input draft: Packetization Layer Path MTU Discovery
draft-mathis-plpmtud-00.txt
Goals and Milestones:
Jul 03  Reorganized Internet-Draft. Solicit implementation and field experience.
Dec 03  Update Internet-Draft incorporating implementers experience, actively solicit input from stakeholders - all communities that might be affected by changing PMTUD.
Feb 04  Submit completed Internet-draft and a PMTUD MIB draft for Proposed Standard.
Internet-Drafts:
  • draft-ietf-pmtud-method-02.txt
  • No Request For Comments

    Current Meeting Report

    Path MTU Discovery WG (pmtud)
    Tuesday, August 3, 2004 at 14:15 to 15:15
    =========================================
    The meeting was chaired by Matt Zekauskas and Matt Mathis. Dave Thaler was the Jabber scribe. These minutes are edited from notes taken by Aaron Falk.


    AGENDA:


    1. Agenda Bashing, Milestones Status


    2. Packet-Level Path MTU Discovery Method update
    http://www.ietf.org/internet-drafts/draft-ietf-pmtud-method-02.txt
    Additional background reading:
    http://www.ietf.org/internet-drafts/draft-richardson-ipsec-fragment-01.txt


    3. Datagram Transport Layer Security


    4. MTU and Fragmentation Issues with In-the-network Tunneling


    5. Path MTU Discovery Using Options
    http://www.ietf.org/internet-drafts/draft-welzl-pmtud-options-01.txt


    6. Packet-Level Path MTU Discovery implementation experiences


    7. If Time Available: Fragmentation Considered Very Harmful
    http://www.ietf.org/internet-drafts/draft-mathis-frag-harmful-00.txt
    Slides are at http://www.psc.edu/~mathis/papers/frag200408





    1. Agenda Bashing, Milestones Status


    Matt Zekauskas opened the meeting. There were no additional agenda changes requested.


    - Milestones:
    * DONE: reorganized ID incorporating implementers experience.
    * Aug 04: update ID, solicit feedback from affected communities; we feel this is done.
    * Jan 05: Is the current target for a draft to the IESG. One issue is the MIB: there really isn't a single MIB for this technique, but more that each protocol would have entries in its MIB. The current draft has language about what protocols must provide to allow debugging; that's the basis for items that should be included in individual MIBs.






    2. Packet-Level Path MTU Discovery Method update
    -- Matt Mathis


    Matt began with an algorithm review: start with 1k MTU, test larger MTUs by probing with larger packets, verify provisional MTU for 1 RTT. Most of the algorithm runs in the transport layer but you want to keep cached/shared state in the IP layer. Matt showed a trace from an implementation.
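
    The search described above can be sketched as follows. This is only an illustration under stated assumptions, not the draft's algorithm: the binary search step, the 1500-byte ceiling, and the names discover_path_mtu and probe_ok are all hypothetical, and the step of verifying a provisional MTU for one RTT is left out.

```python
from typing import Callable


def discover_path_mtu(probe_ok: Callable[[int], bool],
                      start_mtu: int = 1024,
                      max_mtu: int = 1500) -> int:
    """Search upward from start_mtu for the largest size that probe_ok accepts.

    probe_ok(size) is a hypothetical callback that sends one probe packet of
    the given size and reports whether it was delivered. The sketch assumes
    that start_mtu itself already works on the path.
    """
    confirmed = start_mtu      # largest size known to get through
    too_big = max_mtu + 1      # smallest size known (or assumed) to fail
    while too_big - confirmed > 1:
        candidate = (confirmed + too_big) // 2
        if probe_ok(candidate):
            # Probe delivered: raise the path MTU.
            confirmed = candidate
        else:
            # Isolated probe loss: treated as an MTU limit, not congestion.
            too_big = candidate
    return confirmed


if __name__ == "__main__":
    # Simulated path whose narrowest link carries 1400-byte packets.
    print(discover_path_mtu(lambda size: size <= 1400))
```

    A real sender would spread the probes over time and share the confirmed value through state cached in the IP layer, as noted above.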


    The key properties of packetization-layer PMTUD: not designing a protocol, but making use of existing features of protocols. Careful thought has been placed into maximizing robustness. Implementation differences don't affect interoperability; it's a sender-side change only.


    There is now a new unofficial status page:
    http://www.psc.edu/~mathis/MTU/pmtud/index.html.
    Included is a pointer to the draft, and both this page and the draft give the latest updates in "real time".


    The changes from -01 to -02: There are no significant changes to algorithms; the text has been restructured for cleaner layer separation, and to make the document much less TCP-centric. TCP may be harder than most protocols in that it has no way to mark out-of-band data. There is new text on SCTP.


    Robustness issues: there is added discussion of "full stop timeouts", and we removed the state machine to detect pMTU discovery induced failures. Devices that ignore DF remain the big worry.


    New Topics added: Tunnels discussion with sermon on not ignoring DF, subnets with non-uniform MTU, recommendation that IPv4 fragmentation be used in a mode that emulates IPv6 fragmentation.


    Christian Huitema said that he was getting feedback from an implementer that there is tension between wanting to retain the incentive to fix broken traditional ICMP MTU discovery and developing a new solution (plPMTUD). Matt noted that the traditional way is a layer violation, and that it was permanently broken and won't be fixed. Christian responded that a working ICMP-based pmtud is very valuable when there are tunnels on the path. Many tunnel endpoints don't copy the DF bit into the outer header.


    Joe Touch thought that the text was missing the concepts of "passive" and "bidirectional". Without a bidirectional ULP, you'll never get back the information you need. Also, this draft breaks layers all over the place (so don't throw stones at ICMP). He also felt that it is a protocol. Just because you don't set a tag in a header doesn't mean you don't have a protocol. But he is more comfortable with this solution than the ICMP solution. Joe noted that it will be hard to make every implementation respect the DF bit: RFC2401 (a draft standard) says it's OK for implementations to clear the DF bit.


    Magnus Westerlund noted that this pMTUd method has the advantage of moving the detection up the stack. Although you can't do it with generic one-way UDP, RTP with RTCP can do it.


    Andrew McGregor asked how this interacts with middleboxes that alter the maximum segment size as the TCP negotiation goes by. Matt said that those middleboxes in effect create an upper bound.


    Dave Thaler reiterated that ICMP is mandatory for IPv6, and there is no DF bit. That's an incentive to fix. On the other hand, perhaps UDP-like applications will inappropriately start using TCP with plPMTUD, rather than providing an incentive to fix the infrastructure.


    Fred Templin thought that this probed for the maximum segment size, rather than the maximum MTU, and we might end up with too large a value. Matt Mathis clarified the issue: segment size is the natural unit for TCP. The plPMTUD layering needs to translate between MSS and MTU. The primary difficulty of the draft is describing this translation clearly everywhere it is needed.
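
    To make the MSS/MTU translation concrete, here is a minimal sketch of the arithmetic. The function names are hypothetical, and the header sizes assume the fixed IPv4, IPv6 and TCP headers with no IP options or extension headers; a real stack also has to account for whatever options are actually in use.

```python
IPV4_HEADER = 20  # bytes, fixed IPv4 header without IP options
IPV6_HEADER = 40  # bytes, fixed IPv6 header without extension headers
TCP_HEADER = 20   # bytes, fixed TCP header without TCP options


def tcp_payload_for_mtu(mtu: int, ip_header: int = IPV4_HEADER,
                        tcp_options: int = 0) -> int:
    """Largest TCP payload that fits in one unfragmented packet of size mtu."""
    return mtu - ip_header - TCP_HEADER - tcp_options


def mtu_for_tcp_payload(payload: int, ip_header: int = IPV4_HEADER,
                        tcp_options: int = 0) -> int:
    """IP packet size produced by sending a TCP segment with this payload."""
    return payload + ip_header + TCP_HEADER + tcp_options


if __name__ == "__main__":
    print(tcp_payload_for_mtu(1500))                         # IPv4
    print(tcp_payload_for_mtu(1500, ip_header=IPV6_HEADER))  # IPv6
```

    A probe sized in segment units has to be converted with the same header assumptions in both directions, otherwise the cached path MTU ends up off by the header overhead.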



    Back to the presentation, Matt noted that we need a section for "every" packetization layer, including all future protocols. Perhaps we need a list of questions to be answered for every packetization layer?


    Joe said that it seems awkward to do this at every packetization layer. Matt said that's why he's looking for common questions. Joe argued that we should just add an API to a, e.g., congestion manager, to determine the correct MTU -- simpler, cleaner, controllable, and reusable.


    Stanislav Shalunov pointed out an analogy between PMTUD and congestion control. The same considerations apply to a large extent. Whether "source quench" or "fragmentation needed", and hosts inferring congestion from losses: reduction is required at the sender. The same layering violation discussion applies, and the same reuse discussion applies.


    Steve Casner noted that RTP doesn't have the same answer for all codecs/payloads, so it is hard to come up with a common answer. Also, an adjunct protocol may not get the same answer as the primary protocol, so it may come up with the wrong answer.





    3. Datagram Transport Layer Security (DTLS)
    -- Eric Rescorla


    Eric is working on a version of TLS that works on UDP. Protocol flow is the same as for TLS. Data is sent in DTLS records, which are large. The handshake takes 2-3 round trips and contains certificates, 500-1000 bytes per certificate (possibly many certificates per message). Application data transfer can contain records up to 2^14 bytes -- which obviously is larger than most PMTUs.


    Aaron Falk asked if application records were all the same size. Eric said that there was no need to have them all be the same size; one could use small packets for handshake then discover during data exchange.


    Matt Mathis asked how often the same nodes talk to each other. Eric said that the same nodes communicate often, for brief periods.


    Matt Mathis and Eric will talk more about this.






    4. MTU and Fragmentation Issues with In-the-network Tunneling
    -- Pekka Savola


    There are a number of different tunneling protocols (e.g., IP-IP, GRE, L2TP, IPv6-in-IPv4). Many of those are used in router-to-router tunneling; all these protocols need to discuss and address the same issues. This document aims to spell out the issues and describe the options. Issues with tunneling in the "core" are quite tricky. Four different solutions, none of which is good for all cases.


    Matt Z asked everyone to please read the draft; comments to the list are appropriate.





    5. Path MTU Discovery Using Options
    -- Michael Welzl


    In the end, pMTUd always loses a packet. It would be nice to avoid this. In addition, the algorithm should converge fast. Thus, this proposes adding a mechanism to converge faster without loss. Algorithm: before doing pMTUd, include "Provide MTU" IP option, which routers mark. The receiver feeds back result to source. Implementation results shown. There are some known problems with IP options: slow path processing and some routers drop packets with unknown IP options. Added delay increased by 24% in measurements. 25% of hosts did not respond when there was an IP option. Clearly not recommendable for each TCP connection. Also some security issues: falsification of MTU value or number of routers. Recommended mainly for "special" scenarios. See
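
    The marking idea can be illustrated with a small simulation. This is a sketch under assumptions, not the option format from the draft: the function name, the initial value of 65535, and the per-hop "understands the option" flag are all hypothetical.

```python
from typing import Iterable, Tuple


def collect_path_mtu(hops: Iterable[Tuple[int, bool]],
                     initial: int = 65535) -> Tuple[int, int]:
    """Simulate a packet carrying an MTU-recording option along a path.

    Each hop is (outgoing_link_mtu, understands_option). A router that
    understands the option lowers the recorded value to its own link MTU
    if that is smaller, and counts itself as having marked the packet.
    Returns (recorded_mtu, routers_that_marked).
    """
    recorded = initial
    marked = 0
    for link_mtu, understands_option in hops:
        if understands_option:
            recorded = min(recorded, link_mtu)
            marked += 1
    return recorded, marked


if __name__ == "__main__":
    # Every router participates: the narrowest link is found.
    print(collect_path_mtu([(1500, True), (1400, True), (9000, True)]))
    # The narrow hop ignores the option: the recorded value is too large.
    print(collect_path_mtu([(1500, True), (1400, False)]))
```

    The second case shows why the receiver's feedback is only an upper bound unless every router on the path marked the packet, which is one reason the count of marking routers matters.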


    http://www.welzl.at/research/projects/ip-options/


    Mark Allman shared some of his measured data: 98% of 80,000 web servers were reachable, but 70% were not reachable with an unknown option. With the IP record route option, 34% of the servers were not reachable. With the IP timestamp option (?), 43% of the servers were not reachable.


    Joe Touch said that he liked explicitness of solution but this may suffer from same problems as ICMP. Assumption is that processing IP options is faster than ICMP. Need mechanisms which ensure that an overloaded router doesn't have to do more work. (Have more suggestions offline.) See draft-touch-tcp-antispoof: nonces don't give security, they only protect against off-path attacks.


    Phil Karn noted that in the real world an ICMP "message too large" is not generated most of the time, since the local link sets the MTU. So, this solution only adds overhead.


    Krishnan noted a problem with the TTL check proposed in the draft: while the probability of guessing TTL is less than one in 256, most OS's set it to some known lower value (32, 64). It's not random. So the window in which you need to guess becomes much smaller.
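
    The size of the guessing window can be estimated with a short calculation. This is a sketch under assumptions: the function name is hypothetical, the maximum hop count of 30 is an arbitrary illustrative bound, and the initial TTL values are the two mentioned above.

```python
from typing import Iterable, Set


def candidate_ttls(initial_ttls: Iterable[int], max_hops: int) -> Set[int]:
    """TTL values a packet could arrive with, given the known initial TTLs
    and an assumed upper bound on the number of hops traversed."""
    return {
        ttl - hops
        for ttl in initial_ttls
        for hops in range(max_hops + 1)
        if ttl - hops > 0
    }


if __name__ == "__main__":
    window = candidate_ttls((32, 64), max_hops=30)
    # Number of values an attacker would need to try, versus 256 for a
    # uniformly random 8-bit TTL.
    print(len(window), "candidates instead of 256")
```

    An attacker who also knows the approximate distance to the target can shrink max_hops further, narrowing the window even more.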


    Matt Z asked if this should become a working group item.


    "How many people are interested in having this become a wg item?": none


    "not interested": clear hum


    In order to get us interested, you would need to establish why this solution is clearly better than ICMP.





    6. Packet-Level Path MTU Discovery implementation experiences


    Rao Shoaib commented on experience (so far) implementing the algorithm in Solaris.


    * The draft requirement that TCP not start plPMTUD until it holds three packets of data is overly restrictive and too hard to implement. If you're already congested or throttled, you don't want to wait for three packet times.


    * A generic solution for all protocols, and not just one per protocol would be a better idea.


    * The draft should describe the increments for probing, i.e., why not suggest well-known MTUs?
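
    The last suggestion could look something like the sketch below. The table and the function name are illustrative assumptions, not values taken from the draft; the sizes listed are commonly encountered MTUs.

```python
from typing import Optional, Sequence

# Illustrative table of commonly encountered MTUs, in ascending order.
WELL_KNOWN_MTUS = (
    576,   # datagram size every IPv4 host must be able to receive
    1280,  # IPv6 minimum link MTU
    1492,  # Ethernet with PPPoE
    1500,  # Ethernet
    4352,  # FDDI
    9000,  # common jumbo frame size
)


def next_probe_size(current_mtu: int, ceiling: int,
                    table: Sequence[int] = WELL_KNOWN_MTUS) -> Optional[int]:
    """Return the next well-known MTU above current_mtu, or None if no
    table entry lies between current_mtu and ceiling (inclusive)."""
    for size in table:
        if current_mtu < size <= ceiling:
            return size
    return None


if __name__ == "__main__":
    mtu = 1024
    while (probe := next_probe_size(mtu, ceiling=1500)) is not None:
        print("probe with", probe)
        mtu = probe  # pretend every probe succeeds
```

    Probing at table values reaches common MTUs in few steps, but a path whose limit falls between two entries (a tunnel, for example) settles on the lower one unless a finer search follows.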


    Due to lack of time, the meeting was adjourned without discussing the last agenda item.

    Slides

    Agenda
    Path Maximum Transmission Unit Discovery
    Datagram Transport Layer Security (DTLS)
    MTU and Fragmentation Issues with In-the-Network Tunneling
    Path MTU Discovery Using Options