CURRENT_MEETING_REPORT_

Reported by Jeffrey Mogul/DEC

Meeting of 12 December 1989

Held at the Digital Equipment Corporation Western Research Laboratory,
Palo Alto, California

AGENDA

   1. Discuss proposed MTU Discovery protocols
   2. Attempt a consensus protocol
   3. Find a victim to write up a draft
   4. Set next meeting date

MINUTES

This was the first meeting of the MTU Discovery Working Group. We had already made and discussed a number of proposals for a protocol design, and had also discussed (over the mailing list) a number of technical and non-technical constraints on such a protocol.

The most important non-technical constraint was that we wanted to devise a protocol that would work fairly well, or at least no worse than ``no protocol at all'', in the presence of large numbers of unconverted hosts and routers.

Technical constraints include issues such as

   the cost of processing options;
   the possible use of the reserved bit in the IP header [probably not available to us];
   the layers in which the MTU discovery protocol is implemented;
   the need to support TOS, security, and asymmetric paths;
   deciding when to send IP options;
   the problem of LANs with more than one MTU [a possible consequence of the use of bridges between Ethernets and FDDI];
   the delays in propagation of information through the network;
   the realization that the path MTU is first known at the wrong end of the path.

The proposals made before the meeting basically fell into two categories: those that probed the network to find out the path MTU (the minimum MTU over a path), essentially by asking the routers along the path to report the MTU, and those that asked the receiving host to report fragmentation when it occurs, with the understanding that the size of the first fragment of a packet received probably reflects the path MTU. In the latter case, the sender must send large packets occasionally in order to discover whether they will be fragmented.

The ``Report Fragmentation'' approach would be cleanest if we could have used the reserved bit in the IP header as a flag to tell the receiver that the sender is willing to receive and utilize a (new) ICMP Fragmentation Occurred message. However, we were told that this bit is unlikely to be released for this purpose.

``Report Fragmentation'' (R-F) can also be done using an IP header option, again ``allowing'' the receiver to send the ICMP report. R-F has the advantage that it requires no changes in the routers, but the disadvantage that it requires changes in most end-hosts before it does much good. (If the receiving host does not implement it, the sending host could obliviously continue to send packets that are being fragmented and perhaps lost.)

During the meeting, we discussed how to decide when and how often to send the ICMP Fragmentation Occurred message, and which layer should send it. Since option processing is expensive for routers, we believe that the R-F option cannot be sent on every packet. Thus, the receiving host should remember that a sender has specified Report Fragmentation, and if fragmentation does occur later on, the ICMP message should be sent even if the fragmented packet did not carry the option.

Since some protocols on the sending host (e.g., NFS) might not be able to use the MTU information even though others (e.g., TCP) might have requested it, some felt that it was necessary to send fragmentation reports in response only to those host+protocol+port tuples that had sent the R-F option. This means that the receiving host must keep a table keyed in this way, and probably that the table has to be maintained by the transport protocols (TCP, UDP, etc.). At the same time, it was felt to be unfortunate that the transport protocols would be burdened with this chore; we consider it an ``implementation suggestion'' rather than a protocol specification.

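The receiver-side table described above can be sketched roughly as follows. This is only an illustration of the ``implementation suggestion'', not a specification: the names Requester, FragmentationReporter, and packet_arrived are hypothetical, and the sketch assumes that the reported value is simply the size of the first fragment received. How often a report should be repeated was left open at the meeting, so the sketch reports on every fragmented packet from a registered tuple.

```python
from typing import NamedTuple, Optional, Set


class Requester(NamedTuple):
    """Hypothetical table key: sender address, IP protocol number, and port."""
    host: str
    protocol: int
    port: int


class FragmentationReporter:
    """Remembers which host+protocol+port tuples have sent the R-F option."""

    def __init__(self) -> None:
        self._requested: Set[Requester] = set()

    def packet_arrived(
        self,
        key: Requester,
        has_rf_option: bool,
        first_fragment_size: Optional[int] = None,
    ) -> Optional[int]:
        """Return the size to report in an ICMP Fragmentation Occurred
        message, or None if no report should be sent.

        first_fragment_size is None when the packet arrived unfragmented.
        """
        if has_rf_option:
            # The option is not sent on every packet, so remember the request.
            self._requested.add(key)
        if first_fragment_size is not None and key in self._requested:
            # Report even if this particular packet did not carry the option.
            return first_fragment_size
        return None


reporter = FragmentationReporter()
tcp_peer = Requester("10.0.0.1", 6, 1023)    # asked for reports
nfs_peer = Requester("10.0.0.1", 17, 2049)   # same host, never asked

reporter.packet_arrived(tcp_peer, has_rf_option=True)
tcp_report = reporter.packet_arrived(tcp_peer, False, first_fragment_size=576)
nfs_report = reporter.packet_arrived(nfs_peer, False, first_fragment_size=576)
```

Here tcp_report carries the fragment size, while nfs_report is None because that tuple never sent the option, even though both tuples belong to the same sending host.
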
One approach might be to insist that a host not send the R-F option unless all of its protocols are willing to abide by MTU change reports. We know that it is possible to make NFS obey certain MTUs, and perhaps it is worth making this rule rather than complicating the discovery protocol implementation.

Talk then turned to the ``MTU Probe Option'' approach. In this approach (as it evolved from RFC 1063 before the meeting), the sender would (occasionally) send this option on packets flowing on the connection. The option would have three fields: a ``next hop IP address'' field, an ``OK'' field, and a ``minimum MTU'' field. The initial values would be the first-hop router address, ``true'', and the MTU of the first-hop link.

Each router along the path would set the MTU field to the minimum of the current value and the MTUs of the incoming and outgoing links. If its own address was NOT the same as the ``next hop'' field, it would set the OK field to ``false''; otherwise, it would update the next-hop field. In any case, the option is forwarded (but not copied on fragmentation).

The last-hop router (it should be able to tell that it is such; we'll address the FDDI-Ethernet bridge issue somehow) does the same thing to the option as the previous routers, but if the OK field is still ``true'', it now knows the path MTU. It therefore sends a (new) ICMP Path MTU message back to the sender, including the usual IP header info for matching at the sender. In any case, the option continues to the receiving host.

Note that the sender will not get a Path MTU message unless every router along the way understands this protocol. This is because we could otherwise be misled by a low-MTU link bordered by two unconverted routers. However, we believe that routers will be converted much sooner than most end hosts.

Since the option is also received by the ultimate receiving host, that host can also interpret this option as a ``Report Fragmentation'' flag (as above).

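The per-router handling of the MTU Probe option might look roughly like the sketch below. It is an illustration under stated assumptions, not the working group's design: the names MtuProbeOption, new_probe, and router_process are hypothetical, addresses are plain strings, and the minutes say only that a router ``changes'' the next-hop field, so the sketch assumes it writes the address of the next router on the path. A next_router of None is used here to mark the last-hop router.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MtuProbeOption:
    next_hop: str  # router expected to process the option next
    ok: bool       # True only while every router so far understood the option
    min_mtu: int   # smallest link MTU seen so far


def new_probe(first_hop_router: str, first_link_mtu: int) -> MtuProbeOption:
    """Initial values set by the sender."""
    return MtuProbeOption(first_hop_router, True, first_link_mtu)


def router_process(
    opt: MtuProbeOption,
    my_addr: str,
    in_mtu: int,
    out_mtu: int,
    next_router: Optional[str],
) -> Optional[int]:
    """Update the option in place at a converted router.

    Returns the path MTU to send back in an ICMP Path MTU message when this
    is the last-hop router and the OK field is still true, otherwise None.
    """
    opt.min_mtu = min(opt.min_mtu, in_mtu, out_mtu)
    if my_addr != opt.next_hop:
        # Some earlier router did not process the option.
        opt.ok = False
    elif next_router is not None:
        # Assumption: the field is set to the next router's address.
        opt.next_hop = next_router
    if next_router is None and opt.ok:
        return opt.min_mtu
    return None


# Every router on the path is converted.
probe = new_probe("R1", 1500)
router_process(probe, "R1", 1500, 4352, "R2")
router_process(probe, "R2", 4352, 1006, "R3")
all_converted = router_process(probe, "R3", 1006, 1500, None)

# R2 is unconverted and forwards the option untouched.
probe2 = new_probe("R1", 1500)
router_process(probe2, "R1", 1500, 4352, "R2")
with_gap = router_process(probe2, "R3", 1006, 1500, None)
```

In the first walk the last-hop router obtains a path MTU of 1006. In the second, R3 finds that the next-hop field still names R2, clears the OK field, and sends no Path MTU message.
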
Having the receiver report fragmentation in this way is a backup mechanism; if the Path MTU message is not generated, or if the MTU then decreases enough to cause fragmentation, the sender will still find out. Of course, the sender cannot know that the receiver is doing this. One partial solution is that if the receiver gets an MTU Probe option with the OK field set to ``false'', then it should send a Path MTU message to the sender with a code indicating ``unknown''; this tells the sender that it is OK to use the ``Report Fragmentation'' approach. If the sender receives neither kind of Path MTU message, then it must assume that it will not receive Fragmentation Occurred reports, and it should stick to the current ``no more than 576 bytes if non-local'' rule.

This hybrid approach seems (to the attendees, at least) to combine the best of both methods: it gives accurate, early results (i.e., before a connection has to start sending big datagrams) without incurring fragmentation if the routers cooperate; it detects fragmentation if the receiver cooperates; and it causes conservative behavior otherwise.

There are still several issues to nail down, including how often to send the MTU Probe option, how many times to send the Fragmentation Occurred ICMP message, etc.

Keith McCloghrie and Rich Fox were volunteered to write up a draft RFC, which we hope to circulate several weeks before the next IETF meeting.

We expect to meet again at the February IETF meeting.

ATTENDEES

   Art Berggreen         art@salt.acc.com
   Noel Chiappa          jnc@PTT.LCS.MIT.EDU
   Farokh Deboo          sun!iruucp!ntrlink!fjd
   Steve Deering         deering@pescadero.stanford.edu
   Rich Fox              sytek!rfox@sun.com
   Ivan Liu              iliu@orville.nas.nasa.gov
   Keith McCloghrie      sytek!kzm@hplabs.HP.COM
   Jeff Mogul            mogul@decwrl.dec.com
   Nuggehalli Pradeep    pradeep@orville.nas.nasa.gov
   Stephanie Price       cmcvax!price@hub.ucsb.edu

[Noel Chiappa participated via telephone]