idnits 2.17.1 

draft-mathis-frag-harmful-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** It looks like you're using RFC 3978 boilerplate.  You should update this
     to the boilerplate described in the IETF Trust License Policy document
     (see https://trustee.ietf.org/license-info), which is required now.

  -- Found old boilerplate from RFC 3667, Section 5.1 on line 15.

  -- Found old boilerplate from RFC 3978, Section 5.5 on line 334.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 318.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 324.

  ** Found boilerplate matching RFC 3978, Section 5.4, paragraph 1 (on line
     340), which is fine, but *also* found old RFC 2026, Section 10.4C,
     paragraph 1 text on line 36.

  ** The document seems to lack an RFC 3978 Section 5.1 IPR Disclosure
     Acknowledgement -- however, there's a paragraph with a matching
     beginning. Boilerplate error?

  ** This document has an original RFC 3978 Section 5.4 Copyright Line,
     instead of the newer IETF Trust Copyright according to RFC 4748.

  ** This document has an original RFC 3978 Section 5.5 Disclaimer, instead
     of the newer disclaimer which includes the IETF Trust according to RFC
     4748.

  ** The document seems to lack an RFC 3979 Section 5, para. 1 IPR Disclosure
     Acknowledgement -- however, there's a paragraph with a matching
     beginning. Boilerplate error?

  ** The document uses RFC 3667 boilerplate or RFC 3978-like boilerplate
     instead of verbatim RFC 3978 boilerplate.  After 6 May 2005, submission
     of drafts without verbatim RFC 3978 boilerplate is not accepted.

     The following non-3978 patterns matched text found in the document. 
     That text should be removed or replaced:

        By submitting this Internet-Draft, I certify that any applicable patent
        or other IPR claims of which I am aware have been disclosed, or
        will be disclosed, and any of which I become aware will be
        disclosed, in accordance with RFC 3668.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an IANA Considerations section.  (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** The document seems to lack separate sections for Informative/Normative
     References.  All references will be assumed normative when checking for
     downward references.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not
     match the current year

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (July 10, 2004) is 7227 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Unused Reference: '9' is defined on line 260, but no explicit reference
     was found in the text

  == Unused Reference: '11' is defined on line 266, but no explicit reference
     was found in the text

  -- Possible downref: Non-RFC (?) normative reference: ref. '1'

  ** Downref: Normative reference to an Informational RFC: RFC 2923 (ref. '2')

  -- Possible downref: Non-RFC (?) normative reference: ref. '5'

  -- Possible downref: Non-RFC (?) normative reference: ref. '6'

  ** Obsolete normative reference: RFC 2460 (ref. '7') (Obsoleted by RFC 8200)

  ** Obsolete normative reference: RFC 2960 (ref. '8') (Obsoleted by RFC 4960)

  ** Obsolete normative reference: RFC 2402 (ref. '9') (Obsoleted by RFC
     4302, RFC 4305)

  == Outdated reference: A later version (-24) exists of
     draft-ietf-secsh-transport-18

  ** Downref: Normative reference to an Unknown state RFC: RFC  815 (ref.
     '11')


     Summary: 14 errors (**), 0 flaws (~~), 5 warnings (==), 9 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

1	Network Working Group                                          M. Mathis
2	Internet-Draft                                                J. Heffner
3	Expires: January 8, 2005                                     B. Chandler
4	                                                                     PSC
5	                                                           July 10, 2004

7	                 Fragmentation Considered Very Harmful
8	                      draft-mathis-frag-harmful-00

10	Status of this Memo

12	   By submitting this Internet-Draft, I certify that any applicable
13	   patent or other IPR claims of which I am aware have been disclosed,
14	   and any of which I become aware will be disclosed, in accordance with
15	   RFC 3668.

17	   Internet-Drafts are working documents of the Internet Engineering
18	   Task Force (IETF), its areas, and its working groups. Note that other
19	   groups may also distribute working documents as Internet-Drafts.

21	   Internet-Drafts are draft documents valid for a maximum of six months
22	   and may be updated, replaced, or obsoleted by other documents at any
23	   time. It is inappropriate to use Internet-Drafts as reference
24	   material or to cite them other than as "work in progress."

26	   The list of current Internet-Drafts can be accessed at http://
27	   www.ietf.org/ietf/1id-abstracts.txt.

29	   The list of Internet-Draft Shadow Directories can be accessed at
30	   http://www.ietf.org/shadow.html.

32	   This Internet-Draft will expire on January 8, 2005.

34	Copyright Notice

36	   Copyright (C) The Internet Society (2004). All Rights Reserved.

38	Abstract

40	   IPv4 fragmentation is not sufficiently robust for general use in
41	   today's Internet. The 16-bit IP identification field is not large
42	   enough to prevent frequent mis-associated IP fragments and the TCP
43	   and UDP checksums are insufficient to prevent the resulting corrupted
44	   data from being delivered to higher protocol layers.  In this note we
45	   describe some easily reproduced experiments demonstrating the problem
46	   and estimate the scale of the data corruption in the presence of
47	   ever-growing data rates.

49	1.  Introduction

51	   The IPv4 header was designed at a time when data rates were several
52	   orders of magnitude lower than those achievable today.  In this
53	   document, we describe a consequent scale-related failure in the IP
54	   identification (ID) field, where fragments may be mis-associated at a
55	   rate likely high enough to invalidate assumptions about data
56	   integrity failure rates.  We also outline scenarios in which data
57	   corruption may happen reliably and reproducibly.

59	   While a number of problems with IP fragmentation have been well
60	   documented [1], this failure presents a relatively new and serious
61	   operational problem given the severity of the failure mode, and that
62	   it occurs on what is today common communications equipment. It is
63	   especially pertinent due to the recent proliferation of UDP bulk
64	   transport tools which do not do MTU discovery, and some network
65	   equipment which ignores the Don't Fragment (DF) bit in the IP header
66	   as a work-around for MTU discovery problems [2].

68	2.  Wrapping the IP ID Field

70	   The Internet Protocol standard specifies:

72	      "The choice of the Identifier for a datagram is based on the need
73	      to provide a way to uniquely identify the fragments of a
74	      particular datagram.  The protocol module assembling fragments
75	      judges fragments to belong to the same datagram if they have the
76	      same source, destination, protocol, and Identifier.  Thus, the
77	      sender must choose the Identifier to be unique for this source,
78	      destination pair and protocol for the time the datagram (or any
79	      fragment of it) could be alive in the internet." [3]

81	   Strict conformance to this standard limits transmissions in one
82	   direction between any address pair to no more than 65536 datagrams
83	   per maximum packet lifetime.

85	   Obviously hosts do not follow the standard so strictly.  Assuming a
86	   maximum packet lifetime on the order of seconds, today it is common
87	   for host interfaces to send at rates higher than this.  For example,
88	   a host with a 100 Mbps interface sending 1500 byte packets may send
89	   65536 packets in under 8 seconds.
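
   The arithmetic behind this example can be checked with a short
   sketch.  It assumes packets are sent back to back and ignores
   link-layer framing overhead; the function name id_wrap_seconds is
   hypothetical and used only for illustration.

```python
def id_wrap_seconds(link_bps: float, packet_bytes: int, id_bits: int = 16) -> float:
    """Seconds needed to send 2**id_bits packets back to back at link_bps.

    Assumption: every packet is packet_bytes long and the link is kept
    fully busy, so this is a lower bound on the time to wrap the ID field.
    """
    packets = 2 ** id_bits
    return packets * packet_bytes * 8 / link_bps


if __name__ == "__main__":
    # Wrap time of the 16-bit IPv4 ID field for 1500-byte packets.
    for rate in (10e6, 100e6, 1e9, 10e9):
        seconds = id_wrap_seconds(rate, 1500)
        print(f"{rate / 1e6:>8.0f} Mbps: ID field wraps in {seconds:.3f} s")
```

   At 100 Mbps the result is just under 8 seconds, matching the figure
   above; each tenfold increase in data rate shortens the wrap time by
   the same factor.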

91	   The problem occurs when a fragment is dropped by the network, and a
92	   later fragment is received that, while part of a different datagram,
93	   has the same ID value and fragment offset as the dropped fragment.
94	   The two fragments will be incorrectly spliced together and delivered
95	   to the layer above IP.  It is common that the fragment offset and
96	   length would match since packets of the same size sent along the same
97	   path will be fragmented in the same manner.  In 65537 segments, there
98	   must be at least two with matching ID fields.  If the sender is
99	   transmitting segments fast enough that datagrams are sent with
100	   duplicate ID fields within the reassembly timeout (a suggested value
101	   is 15 seconds [3]), then fragments may be mis-associated.

103	   The case of particular concern occurs when only the first fragment of
104	   a datagram is lost by the network.  The remaining fragments will be
105	   stored in the fragment reassembly buffer, and at some point in the
106	   future a new packet will arrive with the matching ID field. This new
107	   first fragment will be (incorrectly) matched up with the rest of the
108	   old packet and delivered to the upper layer.  Assuming the fragments
109	   are delivered in order, the rest of the new datagram will be
110	   buffered, forming a cycle.  One of every 65536 datagrams will be
111	   incorrectly reassembled by the IP layer.  It is possible to have a
112	   number of simultaneous cycles, bounded by the size of the fragment
113	   reassembly buffer.
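
   The cycle described above can be illustrated with a small
   simulation.  This is a sketch under simplifying assumptions that are
   not taken from the experiments in this note: every datagram splits
   into exactly two fragments, fragments arrive in order, the reassembly
   buffer never times out, and only one first fragment is ever lost.
   The name simulate_cycle is hypothetical.

```python
ID_SPACE = 2 ** 16  # number of distinct 16-bit IP ID values


def simulate_cycle(total_datagrams: int, lost_datagram: int):
    """Simulate in-order delivery of two-fragment datagrams.

    The first fragment of datagram number lost_datagram is dropped.
    Returns (delivered, mis_associated), where mis_associated lists
    (first_fragment_owner, second_fragment_owner) pairs that the
    receiver spliced together from two different datagrams.
    """
    buffers = {}          # ip_id -> {fragment_index: datagram_number}
    delivered = 0
    mis_associated = []

    for n in range(total_datagrams):
        ip_id = n % ID_SPACE              # sender's ID counter wraps
        for frag in (0, 1):
            if n == lost_datagram and frag == 0:
                continue                  # the network drops this fragment
            parts = buffers.setdefault(ip_id, {})
            parts[frag] = n
            if len(parts) == 2:           # receiver believes it is complete
                del buffers[ip_id]
                delivered += 1
                if parts[0] != parts[1]:
                    mis_associated.append((parts[0], parts[1]))
    return delivered, mis_associated


if __name__ == "__main__":
    delivered, bad = simulate_cycle(4 * ID_SPACE, lost_datagram=10)
    print("delivered:", delivered)
    print("mis-associated pairs:", bad)
```

   In this model a single lost first fragment produces one incorrectly
   reassembled datagram per wrap of the ID field, and the cycle
   continues until a buffered fragment is removed by some other means,
   such as a reassembly timeout.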

115	   Most TCP implementations today participate in MTU discovery [4],
116	   which will avoid this problem by avoiding fragmentation. However, as
117	   a work-around for MTU discovery problems [2], some TCP
118	   implementations and communications gear provide mechanisms to disable
119	   path MTU discovery by clearing or ignoring the DF bit.

121	3.  Harmful Effects of Mis-associated Fragments

123	   When the mis-associated fragments are delivered, transport-layer
124	   checksumming should detect these datagrams as incorrect and discard
125	   them.  When the datagrams are discarded, it could pose a problem for
126	   loss feedback congestion control algorithms since there will be a
127	   high number of non-congestion-related losses.

129	   However, transport checksums may not be designed to handle such high
130	   error rates, either.  The UDP checksum is only 16 bits in length.  If
131	   these checksums follow a uniform random distribution, we expect
132	   mis-associated datagrams to be accepted by the checksum at a rate of
133	   one per 65536.  With only one mis-association cycle, we expect
134	   corrupt data delivered to the application layer once per 2^32
135	   datagrams.  This number can be significantly higher with multiple
136	   cycles.
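
   The estimate above can be reproduced as follows.  The sketch assumes
   uniformly distributed checksums, as in the text, and additionally
   assumes that the rate scales linearly with the number of
   simultaneous cycles; that scaling and the function names are
   illustrative assumptions rather than measured results.

```python
from fractions import Fraction


def corrupt_delivery_rate(id_bits: int = 16, checksum_bits: int = 16,
                          cycles: int = 1) -> Fraction:
    """Fraction of datagrams delivered to the application as corrupt data.

    Each cycle mis-associates one datagram per wrap of the ID field, and
    a mis-associated datagram passes a uniformly random checksum with
    probability 1 / 2**checksum_bits.
    """
    mis_association_rate = Fraction(cycles, 2 ** id_bits)
    false_accept_rate = Fraction(1, 2 ** checksum_bits)
    return mis_association_rate * false_accept_rate


def seconds_between_corruptions(packets_per_second: float,
                                cycles: int = 1) -> float:
    """Mean time between corrupt deliveries at a steady packet rate."""
    return float(1 / corrupt_delivery_rate(cycles=cycles)) / packets_per_second


if __name__ == "__main__":
    print("rate with one cycle:", corrupt_delivery_rate())
    pps = 100e6 / (1500 * 8)  # 100 Mbps of 1500-byte packets
    print("seconds between corruptions:", seconds_between_corruptions(pps))
```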

138	   With non-random data, the UDP checksum may be weaker still.  It
139	   is possible to construct datasets where mis-associated fragments will
140	   always have the same checksum.  Such a case may be considered
141	   unlikely, but is worth considering. "Real" data may be more likely
142	   than random data to cause checksum hotspots and increase the
143	   probability of false checksum match [5].  Also, some applications may
144	   turn off checksumming to increase speed, though this practice has
145	   been found to be dangerous for other reasons [6].

147	4.  Experimental Results

149	   To test the practical impact of fragmentation on UDP, we ran a series
150	   of experiments with a common UDP bulk transport protocol, Reliable
151	   Blast UDP (RBUDP), part of the QUANTA networking toolkit.  It is one
152	   of the tools used as an alternative to TCP for high-bandwidth
153	   applications on specialized networks. The choice to use RBUDP has
154	   very little to do with the protocol itself, as any UDP transport tool
155	   without extra corruption detection would work equally well.

157	   In order to diagnose corruption on files transferred with RBUDP, we
158	   used a file format including embedded sequence numbers and MD5
159	   checksums.  These were placed such that one set was included in each
160	   fragment of each datagram.  Thus it was possible to distinguish
161	   random corruption from that caused by mis-associated fragments.

163	   Two types of dataset were used.  In the first, all space not used for
164	   sequence numbers and MD5 checksums was filled with pseudo-random
165	   data, giving datagrams random checksums.  The second was constructed
166	   in a similar manner except that the upper halves of each 32-bit word
167	   were filled with the 16-bit ones-complement of the lower half. This
168	   gave each 32-bit word a zero ones-complement sum, so datagrams had
169	   constant checksums.  With these constant checksums, mis-associated
170	   fragments were guaranteed not to fail the UDP checksum test.  Each
171	   dataset used was 400 MB in size.
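
   The construction of the constant-checksum dataset can be sketched as
   follows.  This is not the tool used in our experiments; the function
   names are hypothetical, the embedded sequence numbers and MD5
   checksums are omitted, and only the payload sum is modeled.  The UDP
   checksum also covers a pseudo-header and the UDP header, which are
   assumed here to be identical for datagrams of the same flow and
   length.

```python
import os
import struct


def ones_complement_sum16(data: bytes) -> int:
    """16-bit ones-complement sum with end-around carry (as in RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"                      # pad to a whole 16-bit word
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    while total >> 16:                       # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return total


def constant_checksum_payload(n_words: int) -> bytes:
    """Build n_words 32-bit words whose halves are ones-complements.

    The lower 16 bits are random; the upper 16 bits are their
    ones-complement, so each word sums to 0xFFFF (ones-complement zero).
    """
    out = bytearray()
    for _ in range(n_words):
        low = int.from_bytes(os.urandom(2), "big")
        high = ~low & 0xFFFF
        out += struct.pack("!HH", high, low)
    return bytes(out)


if __name__ == "__main__":
    a = constant_checksum_payload(64)
    b = constant_checksum_payload(64)
    # Splice the front of one payload onto the back of another, as a
    # mis-associated reassembly would.  The split is 8-byte aligned,
    # as IP fragment offsets are.
    spliced = a[:128] + b[128:]
    for name, payload in (("a", a), ("b", b), ("spliced", spliced)):
        print(name, hex(ones_complement_sum16(payload)))
```

   Because every 32-bit word contributes the same sum, any splice on a
   word boundary leaves the payload sum unchanged, which is why the
   checksum cannot detect the mis-association.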

173	   The RBUDP tools were used to send the datasets between a pair of
174	   hosts at slightly less than the available datarate.  Near the
175	   beginning of each flow, a brief secondary flow was started to induce
176	   packet loss in the primary flow. Throughout the life of the primary
177	   flow, we typically observed mis-association rates on the order of
178	   0.05%.  In datasets with constant checksums, each of these
179	   mis-associations resulted in corrupted data.  In sending datasets
180	   with random checksums 100 times (for a total of 100 GB), we observed
181	   one corruption and 41091 bad UDP checksums.
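
   As a rough consistency check, the number of undetected corruptions
   that a 16-bit checksum would be expected to let through can be
   computed from the number of detected failures.  The sketch assumes
   that mis-associated datagrams pass the checksum independently and
   uniformly at random; expected_undetected is a hypothetical helper.

```python
def expected_undetected(detected: int, checksum_bits: int = 16) -> float:
    """Expected undetected mis-associations given the detected count.

    With pass probability p = 2**-checksum_bits, detected failures make
    up a (1 - p) share of all mis-associations, so the expected number
    that slip through is detected * p / (1 - p).
    """
    p = 1 / 2 ** checksum_bits
    return detected * p / (1 - p)


if __name__ == "__main__":
    # 41091 is the number of bad UDP checksums reported above.
    print(expected_undetected(41091))
```

   The expected value is somewhat less than one, which is in line with
   the single corruption observed.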

183	5.  Remedies

185	   IPv6 is less vulnerable to this type of problem, since its fragment
186	   header contains a 32-bit identification field [7]. Mis-association
187	   will only be a problem at packet rates 65536 times higher than for
188	   IPv4.

190	   Since mis-association of fragments will only occur when the IP ID
191	   field is wrapped within the fragment reassembly timeout, it is
192	   possible to reduce the timeout so that this situation is less likely
193	   to occur.  Since the timeout is set by the receiving host while the
194	   IP ID field is set by the sending host, it is not generally possible
195	   to set the timeout low enough so that a fast sender's fragments will
196	   not be mis-associated, yet high enough so that a slow sender's
197	   fragments will not be unconditionally discarded before it is possible
198	   to reassemble them.  It is not within the scope of this document to
199	   recommend timeout values.
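
   The relationship between the reassembly timeout and the highest
   sending rate that cannot wrap the ID field within it can be sketched
   as below.  The packet size, the example timeouts and the function
   name are illustrative assumptions and are not recommendations; the
   id_bits parameter also allows comparison with the 32-bit IPv6
   identification field.

```python
def max_safe_rate_bps(timeout_s: float, packet_bytes: int,
                      id_bits: int = 16) -> float:
    """Highest data rate at which the ID field cannot wrap in timeout_s.

    A sender may emit at most 2**id_bits datagrams per timeout period
    before an ID value is reused while old fragments may still be held.
    """
    max_packets_per_second = 2 ** id_bits / timeout_s
    return max_packets_per_second * packet_bytes * 8


if __name__ == "__main__":
    for timeout in (15.0, 5.0, 1.0):
        v4 = max_safe_rate_bps(timeout, 1500)
        v6 = max_safe_rate_bps(timeout, 1500, id_bits=32)
        print(f"timeout {timeout:>4.1f} s: "
              f"IPv4 {v4 / 1e6:.1f} Mbps, IPv6 {v6 / 1e9:.1f} Gbps")
```

   With the suggested 15 second timeout and 1500-byte packets, the
   bound for the 16-bit field falls below the 100 Mbps interface rate
   used in the earlier example.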

201	   Another means of solving the corruption issue is to add stronger
202	   integrity checking, which can be done at any layer above IP.  This is
203	   a natural side effect of using cryptographic authentication.  At the
204	   network layer, if IPsec AH is in use, the mis-associated fragments
205	   should be discarded with extremely high probability.  Other higher
206	   layers may use longer checksums (for example, SCTP's is 32 bits in
207	   length [8]) or cryptographic authentication (SSH message
208	   authentication codes [10]). While stronger integrity checking may
209	   prevent data corruption, it will not solve the problem of a high
210	   effective loss rate.

212	6.  Security Considerations

214	   If a malicious entity knows that a pair of hosts are communicating
215	   using a fragmented stream, it may present an opportunity for this
216	   entity to corrupt the flow.  By sending "high" fragments (those with
217	   offset greater than zero) with a forged source address, the attacker
218	   can deliberately cause corruption as described above.  Exploiting
219	   this vulnerability requires only knowledge of the source and
220	   destination addresses of the flow, and fragment boundaries.  It does
221	   not require knowledge of port or sequence numbers.

223	   If the attacker has visibility of packets on the path, the attack
224	   profile is similar to injecting full segments.  Using this attack
225	   makes blind disruptions easier, and could certainly be used
226	   effectively to cause denial of service. However, only streams using
227	   IPv4 fragmentation are vulnerable.  Because of the nature of the
228	   problems outlined in this draft, the use of IPv4 fragmentation for
229	   critical applications may not be advisable regardless of security
230	   concerns.

232	7.  References

234	   [1]   Kent, C. and J. Mogul, "Fragmentation considered harmful",
235	         Proc. SIGCOMM '87 vol. 17, No. 5, October 1987.

237	   [2]   Lahey, K., "TCP Problems with Path MTU Discovery", RFC 2923,
238	         September 2000.

240	   [3]   Postel, J., "Internet Protocol", STD 5, RFC 791, September
241	         1981.

243	   [4]   Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191,
244	         November 1990.

246	   [5]   Stone, J., Greenwald, M., Partridge, C. and J. Hughes,
247	         "Performance of Checksums and CRC's over Real Data", IEEE/ACM
248	         Transactions on Networking vol. 6, No. 5, October 1998.

250	   [6]   Stone, J. and C. Partridge, "When The CRC and TCP Checksum
251	         Disagree", Proc. SIGCOMM 2000 vol. 30, No. 4, October 2000.

253	   [7]   Deering, S. and R. Hinden, "Internet Protocol, Version 6 (IPv6)
254	         Specification", RFC 2460, December 1998.

256	   [8]   Stewart, R., Xie, Q., Morneault, K., Sharp, C., Schwarzbauer,
257	         H., Taylor, T., Rytina, I., Kalla, M., Zhang, L. and V. Paxson,
258	         "Stream Control Transmission Protocol", RFC 2960, October 2000.

260	   [9]   Kent, S. and R. Atkinson, "IP Authentication Header", RFC 2402,
261	         November 1998.

263	   [10]  Ylonen, T. and C. Lonvick, "SSH Transport Layer Protocol",
264	         draft-ietf-secsh-transport-18 (work in progress), June 2004.

266	   [11]  Clark, D., "IP datagram reassembly algorithms", RFC 815, July
267	         1982.

269	Authors' Addresses

271	   Matt Mathis
272	   Pittsburgh Supercomputing Center
273	   4400 Fifth Avenue
274	   Pittsburgh, PA  15213
275	   US

277	   Phone: 412-268-3319
278	   EMail: mathis@psc.edu
279	   John W. Heffner
280	   Pittsburgh Supercomputing Center
281	   4400 Fifth Avenue
282	   Pittsburgh, PA  15213
283	   US

285	   Phone: 412-268-2329
286	   EMail: jheffner@psc.edu

288	   Ben Chandler
289	   Pittsburgh Supercomputing Center
290	   4400 Fifth Avenue
291	   Pittsburgh, PA  15213
292	   US

294	   Phone: 412-268-9783
295	   EMail: bchandle@psc.edu

297	Appendix A.  Support

299	   This work was supported by the National Science Foundation under
300	   Grant No. 0083285.

302	Intellectual Property Statement

304	   The IETF takes no position regarding the validity or scope of any
305	   Intellectual Property Rights or other rights that might be claimed to
306	   pertain to the implementation or use of the technology described in
307	   this document or the extent to which any license under such rights
308	   might or might not be available; nor does it represent that it has
309	   made any independent effort to identify any such rights. Information
310	   on the IETF's procedures with respect to rights in IETF Documents can
311	   be found in BCP 78 and BCP 79.

313	   Copies of IPR disclosures made to the IETF Secretariat and any
314	   assurances of licenses to be made available, or the result of an
315	   attempt made to obtain a general license or permission for the use of
316	   such proprietary rights by implementers or users of this
317	   specification can be obtained from the IETF on-line IPR repository at
318	   http://www.ietf.org/ipr.

320	   The IETF invites any interested party to bring to its attention any
321	   copyrights, patents or patent applications, or other proprietary
322	   rights that may cover technology that may be required to implement
323	   this standard. Please address the information to the IETF at
324	   ietf-ipr@ietf.org.

326	Disclaimer of Validity

328	   This document and the information contained herein are provided on an
329	   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
330	   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
331	   ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
332	   INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
333	   INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
334	   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

336	Copyright Statement

338	   Copyright (C) The Internet Society (2004). This document is subject
339	   to the rights, licenses and restrictions contained in BCP 78, and
340	   except as set forth therein, the authors retain all their rights.

342	Acknowledgment

344	   Funding for the RFC Editor function is currently provided by the
345	   Internet Society.