idnits 2.17.1 

draft-generic-6man-tunfrag-09.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (July 15, 2013) is 3931 days in the past.  Is this
     intentional?


  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  == Unused Reference: 'RFC4443' is defined on line 278, but no explicit
     reference was found in the text

  ** Obsolete normative reference: RFC 2460 (Obsoleted by RFC 8200)

  == Outdated reference: A later version (-68) exists of
     draft-templin-intarea-seal-60

  -- Obsolete informational reference (is this intentional?): RFC 1981
     (Obsoleted by RFC 8201)


     Summary: 1 error (**), 0 flaws (~~), 3 warnings (==), 2 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Network Working Group                                    F. Templin, Ed.
3	Internet-Draft                              Boeing Research & Technology
4	Intended status: Informational                             July 15, 2013
5	Expires: January 16, 2014

7	                        Fragmentation Revisited
8	                   draft-generic-6man-tunfrag-09.txt

10	Abstract

12	   IP fragmentation has been subject to scrutiny since the
13	   publication of "Fragmentation Considered Harmful" in 1987.  This work
14	   cast fragmentation in a negative light that has persisted to the
15	   present day.  However, the tone of the work failed to honor two
16	   principles of creative thinking: never say "always" and never say
17	   "never".  This document discusses uses for fragmentation that apply
18	   both to the present day and moving forward into the future.

20	Status of this Memo

22	   This Internet-Draft is submitted in full conformance with the
23	   provisions of BCP 78 and BCP 79.

25	   Internet-Drafts are working documents of the Internet Engineering
26	   Task Force (IETF).  Note that other groups may also distribute
27	   working documents as Internet-Drafts.  The list of current Internet-
28	   Drafts is at http://datatracker.ietf.org/drafts/current/.

30	   Internet-Drafts are draft documents valid for a maximum of six months
31	   and may be updated, replaced, or obsoleted by other documents at any
32	   time.  It is inappropriate to use Internet-Drafts as reference
33	   material or to cite them other than as "work in progress."

35	   This Internet-Draft will expire on January 16, 2014.

37	Copyright Notice

39	   Copyright (c) 2013 IETF Trust and the persons identified as the
40	   document authors.  All rights reserved.

42	   This document is subject to BCP 78 and the IETF Trust's Legal
43	   Provisions Relating to IETF Documents
44	   (http://trustee.ietf.org/license-info) in effect on the date of
45	   publication of this document.  Please review these documents
46	   carefully, as they describe your rights and restrictions with respect
47	   to this document.  Code Components extracted from this document must
48	   include Simplified BSD License text as described in Section 4.e of
49	   the Trust Legal Provisions and are provided without warranty as
50	   described in the Simplified BSD License.

52	Table of Contents

54	   1.  Introduction
55	   2.  Problem Statement
56	   3.  IPv6 Hosts Sending Large Isolated Packets
57	   4.  IPv6 Tunnels
58	   5.  IANA Considerations
59	   6.  Security Considerations
60	   7.  Acknowledgments
61	   8.  References
62	     8.1.  Normative References
63	     8.2.  Informative References
64	   Author's Address

66	1.  Introduction

68	   IP fragmentation has been subject to scrutiny since the
69	   publication of "Fragmentation Considered Harmful" in 1987 [FRAG].
70	   This work cast fragmentation in a negative light that has persisted
71	   to the present day.  However, the tone of the work failed to honor
72	   two principles of creative thinking: never say "always" and never say
73	   "never".  This document discusses uses for fragmentation that apply
74	   both to the present day and moving forward into the future.

76	2.  Problem Statement

78	   The de facto "Internet cell size" is effectively 1500 bytes, i.e.,
79	   the minimum Maximum Transmission Unit (minMTU) configured by the vast
80	   majority of links in the Internet.  IPv6 constrains this even further
81	   by specifying a minMTU of 1280 bytes and a minimum Maximum Reassembly
82	   Unit (minMRU) of 1500 bytes [RFC2460].  IPv4 specifies both minMTU/
83	   minMRU as only 576 bytes [RFC0791][RFC1122], although it is widely
84	   assumed that the vast majority of nodes will configure an IPv4 minMRU
85	   of at least 1500 bytes.

87	   The 1280 IPv6 minMTU originated from a November 14, 1997 message from
88	   Steve Deering to the IPng mailing list, which stated:

90	      "In the ipngwg meeting in Munich, I proposed increasing the IPv6
91	      minimum MTU from 576 bytes to something closer to the Ethernet MTU
92	      of 1500 bytes, (i.e., 1500 minus room for a couple layers of
93	      encapsulating headers, so that min-MTU-size packets that are
94	      tunneled across 1500-byte-MTU paths won't be subject to
95	      fragmentation/reassembly on ingress/egress from the tunnels, in
96	      most cases).

98	      ...

100	      The number I propose for the new minimum MTU is 1280 bytes (1024 +
101	      256, as compared to the classic 576 value which is 512 + 64).
102	      That would leave generous room for encapsulating/tunnel headers
103	      within the Ethernet MTU of 1500, e.g., enough for two layers of
104	      secure tunneling including both ESP and AUTH headers."

106	   However, there was a fundamental flaw in this reasoning.  In
107	   particular, to avoid fragmentation for several nested layers of
108	   encapsulation, the first tunnel (T1) would have to set a 1280 MTU so
109	   that its tunneled packets would emerge as 1320 bytes (1280 bytes plus
110	   40 bytes for the encapsulating IPv6 header).  Then, the next tunnel
111	   (T2) would have to set a 1320 MTU so its tunneled packets would
112	   emerge as 1360.  Then the next tunnel (T3) would have to set a 1360
113	   MTU so that its tunneled packets would emerge as 1400, etc. until the
114	   available path MTU is exhausted.  The question is, how can those
115	   nested tunnels be so carefully coordinated so that there would never
116	   be an MTU infraction?  In a single administrative domain where an
117	   operator can lay hands on every tunnel ingress this may be possible,
118	   but in the general case it cannot be expected that the nested tunnel
119	   MTUs would be so well orchestrated.  It is therefore necessary to
120	   consider as a limiting condition a tunnel that configures a 1280 MTU
121	   in which the tunnel crosses a link (perhaps another tunnel) that also
122	   configures a 1280 MTU.  In that case, the tunnel ingress has two
123	   choices: 1) perform fragmentation that the tunnel egress needs to
124	   reassemble, or 2) shut down the tunnel due to failure to meet the
125	   IPv6 minMTU requirement.
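
The nested-tunnel arithmetic above can be sketched as follows.  This
is only an illustration: the function names are hypothetical, and each
layer is assumed to add exactly one 40-byte IPv6 header, as in the
example in the text.

```python
IPV6_MIN_MTU = 1280     # IPv6 minMTU
IPV6_HEADER_LEN = 40    # assumed: one encapsulating IPv6 header per layer


def nested_tunnel_mtus(path_mtu, hlen=IPV6_HEADER_LEN, inner_mtu=IPV6_MIN_MTU):
    """Return (tunnel_mtu, wire_size) for T1, T2, ... until path_mtu is exhausted."""
    layers = []
    mtu = inner_mtu
    while mtu + hlen <= path_mtu:
        layers.append((mtu, mtu + hlen))
        mtu += hlen  # the next tunnel must accept the previous wire size
    return layers


def fits_without_fragmentation(tunnel_mtu, hlen, link_mtu):
    """True if a tunnel_mtu-sized packet plus encapsulation fits on the link."""
    return tunnel_mtu + hlen <= link_mtu


if __name__ == "__main__":
    for n, (mtu, wire) in enumerate(nested_tunnel_mtus(1500), start=1):
        print(f"T{n}: MTU {mtu}, emerges as {wire} bytes")
    # Limiting condition: a 1280-MTU tunnel over a 1280-MTU link.
    print(fits_without_fragmentation(1280, 40, 1280))
```

For a 1500-byte path the first three layers reproduce the T1, T2, and
T3 values given above, and the limiting condition fails the check,
which is why the ingress must either fragment or shut down the tunnel.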

127	   In addition, it is becoming increasingly evident that Path MTU
128	   Discovery (PMTUD) [RFC1981] does not work properly in all cases.
129	   This is due to the fact that the Packet Too Big (PTB) messages
130	   required for PMTUD can be lost due to network filters that block
131	   ICMPv6 messages [RFC2923][WAND][SIGCOMM][RIPE].  It is therefore
132	   necessary to consider the case where IPv6 packets are dropped
133	   silently in the network due to a size restriction, but the IPv6
134	   source host never receives the necessary indication from the network
135	   that the packet was lost.  The source host must therefore support
136	   some form of IP fragmentation in order to ensure that isolated large
137	   packets are delivered, as well as a packet size probing capability
138	   (see [RFC4821]) to ensure that large packets that are part of a
139	   coordinated stream are making it through to the destination.

141	   Due to these considerations, there are at least two use cases for
142	   network layer fragmentation that must be satisfied now and for the
143	   long term.  In the following sections, we discuss these
144	   considerations in more detail.

146	3.  IPv6 Hosts Sending Large Isolated Packets

148	   IPv6 hosts that send large isolated packets have no way of ensuring
149	   that the packets are delivered to the final destination if their size
150	   exceeds the path MTU.  The host must therefore perform network layer
151	   fragmentation to a fragment size of no larger than 1280 bytes to
152	   ensure that the fragmented packets are delivered to the destination
153	   without loss due to a size restriction.  However, the destination
154	   node need only configure a minMRU size of 1500 bytes per the IPv6
155	   specs.  Therefore, the source must either limit its packet sizes to
156	   1500 bytes (i.e., before fragmentation) or somehow have a way of
157	   determining that the destination configures a larger minMRU.  Two
158	   uses for this host-based fragmentation to support large isolated
159	   packets are OSPFv3 and DNS.
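
As a rough sketch of the host behavior described in this section, the
code below computes the on-the-wire fragment sizes for one packet.  It
assumes standard IPv6 fragmentation with a 40-byte IPv6 header, an
8-byte Fragment header, no other extension headers, and fragment data
in multiples of 8 bytes for all but the last fragment.  The function
name and the decision to raise an error above the minMRU are
illustrative choices, not part of this document.

```python
IPV6_HEADER_LEN = 40   # fixed IPv6 header
FRAG_HEADER_LEN = 8    # IPv6 Fragment extension header
IPV6_MIN_MTU = 1280    # largest fragment the host should emit
IPV6_MIN_MRU = 1500    # largest packet the destination must reassemble


def fragment_sizes(packet_len, max_fragment=IPV6_MIN_MTU, mru=IPV6_MIN_MRU):
    """Return the sizes of the packets sent for one original packet."""
    if packet_len > mru:
        raise ValueError("packet exceeds the destination's assumed minMRU")
    if packet_len <= max_fragment:
        return [packet_len]  # small enough to send whole
    payload = packet_len - IPV6_HEADER_LEN
    # Data carried per fragment, rounded down to a multiple of 8 bytes.
    chunk = (max_fragment - IPV6_HEADER_LEN - FRAG_HEADER_LEN) // 8 * 8
    sizes = []
    while payload > 0:
        data = min(chunk, payload)
        sizes.append(IPV6_HEADER_LEN + FRAG_HEADER_LEN + data)
        payload -= data
    return sizes


if __name__ == "__main__":
    print(fragment_sizes(1500))
```

Under these assumptions a 1500-byte packet is sent as two fragments,
neither larger than 1280 bytes.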

161	4.  IPv6 Tunnels

163	   IPv6 tunnels are used for many purposes, including transition,
164	   security, mobility, routing control, etc.  While it is assumed that
165	   transition mechanisms will eventually give way to native IPv6, it is
166	   clear that the use of tunnels for other purposes will continue and
167	   even expand.  A long term strategy for dealing with tunnel MTUs is
168	   therefore required.

170	   Tunnels may cross links (perhaps even other tunnels) that configure
171	   only the IPv6 minMTU of 1280 bytes while the tunnel ingress must be
172	   able to send packets that are at least 1280 bytes in length so that
173	   the IPv6 minMTU is extended to the source.  However, these tunneled
174	   packets become (1280 + HLEN) bytes on the wire (where HLEN is the
175	   length of the encapsulating headers), meaning that they would be
176	   vulnerable to loss at a link within the tunnel that configures a
177	   smaller MTU.  Therefore, the only way to satisfy the IPv6 minMTU is
178	   through network layer fragmentation and reassembly between the tunnel
179	   ingress and egress, where the ingress fragments its tunneled packets
180	   that are larger than (1280 - HLEN) bytes.

182	   Unfortunately, fragmentation and reassembly are a pain point for in-
183	   the-network routers - especially for those that are nearer the core
184	   of the network.  It is therefore highly desirable for the tunnel
185	   ingress to discover whether this fragmentation and reassembly can be
186	   avoided.  This can only be done by allowing the ingress to probe the
187	   path to the egress by sending whole 1500 byte probe packets to
188	   discover whether the probes can be delivered to the egress without
189	   fragmentation.  These 1500 byte probes appear as (1500 + HLEN) bytes
190	   on the wire, therefore the path must support an MTU of at least this
191	   size in order for the probe to succeed.

193	   The tunnel fragmentation and reassembly strategy is therefore as
194	   follows:

196	   1.  When the tunnel ingress receives a packet that is no larger than
197	       (1280-HLEN) bytes, it encapsulates the packet and sends it to the
198	       egress without fragmentation.  The egress will receive the packet
199	       since it is small enough to fit within the IPv6 minMTU of 1280
200	       bytes.

202	   2.  When the tunnel ingress receives a packet that is larger than 1500
203	       bytes, it encapsulates the packet and sends it to the egress
204	       without fragmentation.  If the packet is lost in the network due
205	       to a size restriction, the ingress may or may not receive a PTB
206	       message which it can then forward to the original source.
207	       Whether or not a PTB message is received, however, it is the
208	       responsibility of the original source to ensure that its packets
209	       larger than 1500 bytes are making it to the final destination by
210	       using a path probing technique such as specified by [RFC4821].

212	   3.  When the tunnel ingress receives a packet larger than (1280 -
213	       HLEN) but no larger than 1500 bytes, and it is not yet known
214	       whether packets of this size can reach the egress without
215	       fragmentation, the ingress encapsulates the packet and uses
216	       network layer fragmentation to fragment it into two pieces that
217	       are each significantly smaller than (1280 - HLEN) bytes.  At the
218	       same time, the tunnel ingress sends an unfragmented 1500 byte
219	       probe packet toward the egress (subject to rate limiting) which
220	       will appear as (1500 + HLEN) bytes on the wire.  If the egress
221	       receives the probe, it informs the ingress that the probe
222	       succeeded.  If the probe succeeds, the ingress can suspend the
223	       fragmentation process and send packets between (1280-HLEN) and
224	       1500 bytes without using fragmentation.  This probing process
225	       exactly parallels [RFC4821].
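
The three cases above can be sketched as a single decision function
at the tunnel ingress.  This is a minimal illustration, not the SEAL
specification: the names Action and ingress_action, and the
probe_succeeded flag that stands in for the egress feedback, are
hypothetical.

```python
from enum import Enum

IPV6_MIN_MTU = 1280   # IPv6 minMTU
PROBE_SIZE = 1500     # size of the unfragmented probe packet


class Action(Enum):
    SEND_WHOLE = "encapsulate and send without fragmentation"
    FRAGMENT_AND_PROBE = "encapsulate, fragment in two, and send a probe"


def ingress_action(packet_len, hlen, probe_succeeded=False):
    """Choose the ingress behavior for one packet of packet_len bytes."""
    if packet_len <= IPV6_MIN_MTU - hlen:
        return Action.SEND_WHOLE          # case 1: fits within the minMTU
    if packet_len > PROBE_SIZE:
        return Action.SEND_WHOLE          # case 2: the source must probe
    if probe_succeeded:
        return Action.SEND_WHOLE          # case 3, after a successful probe
    return Action.FRAGMENT_AND_PROBE      # case 3, path not yet verified


if __name__ == "__main__":
    for size in (1000, 1240, 1241, 1500, 1501):
        print(size, ingress_action(size, hlen=40).name)
```

Only packets in the range from (1280 - HLEN) + 1 to 1500 bytes are
ever fragmented, and only until a probe shows that the path can carry
them whole.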

227	   In this method, the tunnel egress must configure a slightly larger
228	   MRU than the minMRU specified for IPv6 in order to accommodate the
229	   HLEN bytes of tunnel encapsulation during reassembly. 2KB is
230	   recommended as the minMRU for this reason.
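
The MRU requirement can be checked with simple arithmetic.  The sketch
below assumes that "2KB" means 2048 bytes and that the largest packet
the egress reassembles is a 1500-byte inner packet plus HLEN bytes of
encapsulation.  The function names are hypothetical.

```python
MAX_FRAGMENTED_INNER = 1500    # largest inner packet the ingress fragments
RECOMMENDED_EGRESS_MRU = 2048  # "2KB", read here as 2048 bytes (assumption)


def required_egress_mru(hlen, max_inner=MAX_FRAGMENTED_INNER):
    """Bytes the egress must be able to reassemble for a given HLEN."""
    return max_inner + hlen


def max_hlen_for_mru(mru=RECOMMENDED_EGRESS_MRU, max_inner=MAX_FRAGMENTED_INNER):
    """Largest encapsulation overhead that still fits within the MRU."""
    return mru - max_inner


if __name__ == "__main__":
    print(required_egress_mru(40))   # a single encapsulating IPv6 header
    print(max_hlen_for_mru())
```

With a single 40-byte encapsulating header the egress needs to
reassemble 1540 bytes, which the recommended value leaves ample room
for.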

232	   These procedures make it possible for the tunnel ingress to
233	   configure an unlimited MTU (theoretical limit is 64KB for IPv4 and
234	   4GB for IPv6).  They will therefore naturally lead to the Internet
235	   migrating to larger packet sizes with no dependence on traditional
236	   path MTU discovery.  Operators will also soon discover that
237	   configuring larger MTUs on links between routers (e.g., 2KB or
238	   larger) will dampen the fragmentation and reassembly requirements
239	   until fragmentation and reassembly usage is gradually tuned out of
240	   the network.

242	   These procedures are not supported by the existing IPv6 fragmentation
243	   method; however, they are exactly those specified in the Subnetwork
244	   Encapsulation and Adaptation Layer (SEAL) [I-D.templin-intarea-seal].
245	   Widespread adoption of SEAL will therefore naturally lead to an
246	   Internet which no longer places MTU restrictions on tunnels and
247	   therefore supports natural migration to unbounded packet sizes.  The
248	   approach can best be summarized as: "take care of the smalls, and let
249	   the bigs take care of themselves".

251	5.  IANA Considerations

253	   There are no IANA considerations for this document.

255	6.  Security Considerations

257	   The security considerations for [RFC2460] apply also to this
258	   document.

260	7.  Acknowledgments

262	   This method was inspired through discussion on various IETF mailing
263	   lists in the 2012-2013 timeframe.

265	8.  References

267	8.1.  Normative References

269	   [RFC0791]  Postel, J., "Internet Protocol", STD 5, RFC 791,
270	              September 1981.

272	   [RFC1122]  Braden, R., "Requirements for Internet Hosts -
273	              Communication Layers", STD 3, RFC 1122, October 1989.

275	   [RFC2460]  Deering, S. and R. Hinden, "Internet Protocol, Version 6
276	              (IPv6) Specification", RFC 2460, December 1998.

278	   [RFC4443]  Conta, A., Deering, S., and M. Gupta, "Internet Control
279	              Message Protocol (ICMPv6) for the Internet Protocol
280	              Version 6 (IPv6) Specification", RFC 4443, March 2006.

282	8.2.  Informative References

284	   [FRAG]     Kent, C. and J. Mogul, "Fragmentation Considered Harmful",
285	              October 1987.

287	   [I-D.templin-intarea-seal]
288	              Templin, F., "The Subnetwork Encapsulation and Adaptation
289	              Layer (SEAL)", draft-templin-intarea-seal-60 (work in
290	              progress), July 2013.

292	   [RFC1981]  McCann, J., Deering, S., and J. Mogul, "Path MTU Discovery
293	              for IP version 6", RFC 1981, August 1996.

295	   [RFC2923]  Lahey, K., "TCP Problems with Path MTU Discovery",
296	              RFC 2923, September 2000.

298	   [RFC4821]  Mathis, M. and J. Heffner, "Packetization Layer Path MTU
299	              Discovery", RFC 4821, March 2007.

301	   [RIPE]     De Boer, M. and J. Bosma, "Discovering Path MTU Black
302	              Holes on the Internet using RIPE Atlas", July 2012.

304	   [SIGCOMM]  Luckie, M. and B. Stasiewicz, "Measuring Path MTU
305	              Discovery Behavior", November 2010.

307	   [WAND]     Luckie, M., Cho, K., and B. Owens, "Inferring and
308	              Debugging Path MTU Discovery Failures", October 2005.

310	Author's Address

312	   Fred L. Templin (editor)
313	   Boeing Research & Technology
314	   P.O. Box 3707
315	   Seattle, WA  98124
316	   USA

318	   Email: fltemplin@acm.org