idnits 2.17.1

draft-mathis-frag-harmful-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** It looks like you're using RFC 3978 boilerplate. You should update this
     to the boilerplate described in the IETF Trust License Policy document
     (see https://trustee.ietf.org/license-info), which is required now.

  -- Found old boilerplate from RFC 3667, Section 5.1 on line 15.

  -- Found old boilerplate from RFC 3978, Section 5.5 on line 334.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 318.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 324.

  ** Found boilerplate matching RFC 3978, Section 5.4, paragraph 1 (on line
     340), which is fine, but *also* found old RFC 2026, Section 10.4C,
     paragraph 1 text on line 36.

  ** The document seems to lack an RFC 3978 Section 5.1 IPR Disclosure
     Acknowledgement -- however, there's a paragraph with a matching
     beginning. Boilerplate error?

  ** This document has an original RFC 3978 Section 5.4 Copyright Line,
     instead of the newer IETF Trust Copyright according to RFC 4748.

  ** This document has an original RFC 3978 Section 5.5 Disclaimer, instead
     of the newer disclaimer which includes the IETF Trust according to RFC
     4748.

  ** The document seems to lack an RFC 3979 Section 5, para. 1 IPR Disclosure
     Acknowledgement -- however, there's a paragraph with a matching
     beginning. Boilerplate error?

  ** The document uses RFC 3667 boilerplate or RFC 3978-like boilerplate
     instead of verbatim RFC 3978 boilerplate. After 6 May 2005, submission
     of drafts without verbatim RFC 3978 boilerplate is not accepted.

     The following non-3978 patterns matched text found in the document.
     That text should be removed or replaced:

       By submitting this Internet-Draft, I certify that any applicable
       patent or other IPR claims of which I am aware have been disclosed, or
       will be disclosed, and any of which I become aware will be disclosed,
       in accordance with RFC 3668.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an IANA Considerations section. (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** The document seems to lack separate sections for Informative/Normative
     References. All references will be assumed normative when checking for
     downward references.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not
     match the current year

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008. If you
     have contacted all the original authors and they are all willing to
     grant the BCP78 rights to the IETF Trust, then this is fine, and you can
     ignore this comment. If not, you may need to add the pre-RFC5378
     disclaimer. (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (July 10, 2004) is 7227 days in the past. Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Unused Reference: '9' is defined on line 260, but no explicit reference
     was found in the text

  == Unused Reference: '11' is defined on line 266, but no explicit reference
     was found in the text

  -- Possible downref: Non-RFC (?) normative reference: ref. '1'

  ** Downref: Normative reference to an Informational RFC: RFC 2923 (ref.
     '2')

  -- Possible downref: Non-RFC (?) normative reference: ref. '5'

  -- Possible downref: Non-RFC (?) normative reference: ref. '6'

  ** Obsolete normative reference: RFC 2460 (ref. '7') (Obsoleted by RFC
     8200)

  ** Obsolete normative reference: RFC 2960 (ref. '8') (Obsoleted by RFC
     4960)

  ** Obsolete normative reference: RFC 2402 (ref. '9') (Obsoleted by RFC
     4302, RFC 4305)

  == Outdated reference: A later version (-24) exists of
     draft-ietf-secsh-transport-18

  ** Downref: Normative reference to an Unknown state RFC: RFC 815 (ref.
     '11')


     Summary: 14 errors (**), 0 flaws (~~), 5 warnings (==), 9 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

1    Network Working Group                                          M. Mathis
2    Internet-Draft                                                J. Heffner
3    Expires: January 8, 2005                                     B. Chandler
4                                                                         PSC
5                                                               July 10, 2004

7                     Fragmentation Considered Very Harmful
8                          draft-mathis-frag-harmful-00

10   Status of this Memo

12   By submitting this Internet-Draft, I certify that any applicable
13   patent or other IPR claims of which I am aware have been disclosed,
14   and any of which I become aware will be disclosed, in accordance with
15   RFC 3668.

17   Internet-Drafts are working documents of the Internet Engineering
18   Task Force (IETF), its areas, and its working groups.
     Note that other
19   groups may also distribute working documents as Internet-Drafts.

21   Internet-Drafts are draft documents valid for a maximum of six months
22   and may be updated, replaced, or obsoleted by other documents at any
23   time. It is inappropriate to use Internet-Drafts as reference
24   material or to cite them other than as "work in progress."

26   The list of current Internet-Drafts can be accessed at http://
27   www.ietf.org/ietf/1id-abstracts.txt.

29   The list of Internet-Draft Shadow Directories can be accessed at
30   http://www.ietf.org/shadow.html.

32   This Internet-Draft will expire on January 8, 2005.

34   Copyright Notice

36   Copyright (C) The Internet Society (2004). All Rights Reserved.

38   Abstract

40   IPv4 fragmentation is not sufficiently robust for general use in
41   today's Internet. The 16-bit IP identification field is not large
42   enough to prevent frequent mis-associated IP fragments and the TCP
43   and UDP checksums are insufficient to prevent the resulting corrupted
44   data from being delivered to higher protocol layers. In this note we
45   describe some easily reproduced experiments demonstrating the problem
46   and estimate the scale of the data corruption in the presence of ever
47   growing data rates.

49   1. Introduction

51   The IPv4 header was designed at a time when data rates were several
52   orders of magnitude lower than those achievable today. In this
53   document, we describe a consequent scale-related failure in the IP
54   identification (ID) field, where fragments may be mis-associated at a
55   rate likely high enough to invalidate assumptions about data
56   integrity failure rates. We also outline scenarios in which data
57   corruption may happen reliably and reproducibly.

59   While a number of problems with IP fragmentation have been well
60   documented [1], this presents a relatively new and serious
61   operational problem given the severity of the failure mode, and that
62   it occurs on what is today common communications equipment.
     It is
63   especially pertinent due to the recent proliferation of UDP bulk
64   transport tools which do not do MTU discovery, and some network
65   equipment which ignores the Don't Fragment (DF) bit in the IP header
66   as a work-around for MTU discovery problems [2].

68   2. Wrapping the IP ID Field

70   The Internet Protocol standard specifies:

72   "The choice of the Identifier for a datagram is based on the need
73   to provide a way to uniquely identify the fragments of a
74   particular datagram. The protocol module assembling fragments
75   judges fragments to belong to the same datagram if they have the
76   same source, destination, protocol, and Identifier. Thus, the
77   sender must choose the Identifier to be unique for this source,
78   destination pair and protocol for the time the datagram (or any
79   fragment of it) could be alive in the internet." [3]

81   Strict conformance to this standard limits transmissions in one
82   direction between any address pair to no more than 65536 datagrams
83   per maximum packet lifetime.

85   Obviously hosts do not follow the standard so strictly. Assuming a
86   maximum packet lifetime on the order of seconds, today it is common
87   for host interfaces to send at rates higher than this. For example,
88   a host with a 100 Mbps interface sending 1500-byte packets may send
89   65536 packets in under 8 seconds.

91   The problem occurs when a fragment is dropped by the network, and a
92   later fragment is received that, while part of a different datagram,
93   has the same ID value and fragment offset as the dropped fragment.
94   The two fragments will be incorrectly spliced together and delivered
95   to the layer above IP. It is common that the fragment offset and
96   length would match since packets of the same size sent along the same
97   path will be fragmented in the same manner. In 65537 segments, there
98   must be at least two with matching ID fields.
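
As a rough check of the 100 Mbps figure above, the time a sender needs to cycle through all 65536 ID values can be computed directly. The sketch below is illustrative only: the function name `id_wrap_seconds` and its parameters are not from the draft, and it assumes a saturated link carrying only full-sized packets.

```python
def id_wrap_seconds(link_bps: float, packet_bytes: int, id_bits: int = 16) -> float:
    """Seconds needed to send 2**id_bits packets back to back on a saturated link."""
    packets = 2 ** id_bits
    bits_per_packet = packet_bytes * 8
    return packets * bits_per_packet / link_bps

# 100 Mbps interface, 1500-byte packets, 16-bit IPv4 ID field
wrap = id_wrap_seconds(100e6, 1500)
print(f"ID space wraps after {wrap:.2f} s")   # ID space wraps after 7.86 s
```

Under the same assumptions, a tenfold faster link shortens the wrap time tenfold, so the margin shrinks as data rates grow.
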
     If the sender is
99   transmitting segments fast enough that datagrams are sent with
100  duplicate ID fields within the reassembly timeout (a suggested value
101  is 15 seconds [3]), then fragments may be mis-associated.

103  The case of particular concern occurs when only the first fragment of
104  a datagram is lost by the network. The remaining fragments will be
105  stored in the fragment reassembly buffer, and at some point in the
106  future a new packet will arrive with the matching ID field. This new
107  first fragment will be (incorrectly) matched up with the rest of the
108  old packet and delivered to the upper layer. Assuming the fragments
109  are delivered in order, the rest of the new datagram will be
110  buffered, forming a cycle. One of every 65536 datagrams will be
111  incorrectly reassembled by the IP layer. It is possible to have a
112  number of simultaneous cycles, bounded by the size of the fragment
113  reassembly buffer.

115  Most TCP implementations today participate in MTU discovery [4],
116  which will avoid this problem by avoiding fragmentation. However, as
117  a work-around for MTU discovery problems [2], some TCP
118  implementations and communications gear provide mechanisms to disable
119  path MTU discovery by clearing or ignoring the DF bit.

121  3. Harmful Effects of Mis-associated Fragments

123  When the mis-associated fragments are delivered, transport-layer
124  checksumming should detect these datagrams as incorrect and discard
125  them. When the datagrams are discarded, it could pose a problem for
126  loss feedback congestion control algorithms since there will be a
127  high number of non-congestion-related losses.

129  However, transport checksums may not be designed to handle such high
130  error rates, either. The UDP checksum is only 16 bits in length. If
131  these checksums follow a uniform random distribution, we expect
132  mis-associated datagrams to be accepted by the checksum at a rate of
133  one per 65536.
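
The mis-association cycle from Section 2 can be reproduced with a toy model. The following is a minimal sketch, not a model of any real IP stack: it assumes two fragments per datagram, in-order delivery, a reassembly buffer that never times out, and a single lost first fragment. The names `simulate_cycle`, `buffered`, and `spliced` are invented for illustration.

```python
ID_SPACE = 1 << 16  # 16-bit IPv4 identification field

def simulate_cycle(num_datagrams: int, lost_first_fragment: int = 0) -> list:
    """Return the sequence numbers of datagrams whose first fragment was
    spliced onto a stale second fragment left over from an earlier datagram."""
    buffered = {}   # ID -> (datagram number, fragment index) awaiting a partner
    spliced = []
    for n in range(num_datagrams):
        ip_id = n % ID_SPACE
        for frag in (0, 1):
            if n == lost_first_fragment and frag == 0:
                continue  # the network drops this one fragment
            held = buffered.get(ip_id)
            if held is None:
                buffered[ip_id] = (n, frag)
            elif held[1] != frag:
                # Two different offsets with the same ID: reassembly completes.
                del buffered[ip_id]
                if held[0] != n:
                    spliced.append(n)  # the pieces came from different datagrams
            # Same-offset duplicates are ignored in this sketch.
    return spliced

print(simulate_cycle(4 * ID_SPACE))   # [65536, 131072, 196608]
```

After the single loss, every datagram that reuses ID 0 is reassembled from mismatched pieces, which is one datagram in every 65536 for as long as the stream continues.
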
     With only one mis-association cycle, we expect
134  corrupt data delivered to the application layer once per 2^32
135  datagrams. This number can be significantly higher with multiple
136  cycles.

138  With non-random data, the UDP checksum may be weaker still. It
139  is possible to construct datasets where mis-associated fragments will
140  always have the same checksum. Such a case may be considered
141  unlikely, but is worth considering. "Real" data may be more likely
142  than random data to cause checksum hotspots and increase the
143  probability of a false checksum match [5]. Also, some applications may
144  turn off checksumming to increase speed, though this practice has
145  been found to be dangerous for other reasons [6].

147  4. Experimental Results

149  To test the practical impact of fragmentation on UDP, we ran a series
150  of experiments with a common UDP bulk transport protocol, Reliable
151  Blast UDP (RBUDP), part of the QUANTA networking toolkit. It is one
152  of the tools used as an alternative to TCP for high-bandwidth
153  applications on specialized networks. The choice to use RBUDP has
154  very little to do with the protocol itself, as any UDP transport tool
155  without extra corruption detection would work equally well.

157  In order to diagnose corruption on files transferred with RBUDP, we
158  used a file format including embedded sequence numbers and MD5
159  checksums. These were placed such that one set was included in each
160  fragment of each datagram. Thus it was possible to distinguish
161  random corruption from that caused by mis-associated fragments.

163  Two types of dataset were used. In the first, all space not used for
164  sequence numbers and MD5 checksums was filled with pseudo-random
165  data, giving datagrams random checksums. The second was constructed
166  in a similar manner except that the upper halves of each 32-bit word
167  were filled with the 16-bit ones complement of the lower half.
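
The second construction can be checked with a short sketch. This is not the authors' tooling: `ones_complement_sum` and `constant_sum_payload` are hypothetical helpers, and only the payload's contribution to the Internet checksum is computed (the UDP header and pseudo-header are left out). Each 16-bit value paired with its ones complement adds up to 0xFFFF, which is a representation of zero in ones-complement arithmetic, so any splice on a 32-bit boundary leaves the sum unchanged.

```python
import random
import struct

def ones_complement_sum(data: bytes) -> int:
    """16-bit ones-complement sum of an even-length byte string (big-endian words)."""
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
        total = (total & 0xFFFF) + (total >> 16)  # end-around carry
    return total

def constant_sum_payload(num_words: int, rng: random.Random) -> bytes:
    """Build 32-bit words whose upper half is the ones complement of the lower half."""
    out = bytearray()
    for _ in range(num_words):
        low = rng.getrandbits(16)
        high = ~low & 0xFFFF
        out += struct.pack("!HH", high, low)
    return bytes(out)

rng = random.Random(1)
a = constant_sum_payload(256, rng)
b = constant_sum_payload(256, rng)
# Splice the first half of one payload onto the second half of the other,
# as a mis-associated reassembly would.
spliced = a[:512] + b[512:]
sums = {ones_complement_sum(p) for p in (a, b, spliced)}
print(sums)   # {65535}
```
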
     This
168  gave each 32-bit word a zero ones-complement sum, so datagrams had
169  constant checksums. With these constant checksums, mis-associated
170  fragments were guaranteed not to fail the UDP checksum test. Each
171  dataset used was 400 MB in size.

173  The RBUDP tools were used to send the datasets between a pair of
174  hosts at slightly less than the available data rate. Near the
175  beginning of each flow, a brief secondary flow was started to induce
176  packet loss in the primary flow. Throughout the life of the primary
177  flow, we typically observed mis-association rates on the order of
178  0.05%. In datasets with constant checksums, each of these
179  mis-associations resulted in corrupted data. In sending datasets
180  with random checksums 100 times (for a total of 100 GB), we observed
181  one corruption and 41091 bad UDP checksums.

183  5. Remedies

185  IPv6 is less vulnerable to this type of problem, since its fragment
186  header contains a 32-bit identification field [7]. Mis-association
187  will only be a problem at packet rates 65536 times higher than for
188  IPv4.

190  Since mis-association of fragments will only occur when the IP ID
191  field is wrapped within the fragment reassembly timeout, it is
192  possible to reduce the timeout so that this situation is less likely
193  to occur. Since the timeout is set by the receiving host while the
194  IP ID field is set by the sending host, it is not generally possible
195  to set the timeout low enough so that a fast sender's fragments will
196  not be mis-associated, yet high enough so that a slow sender's
197  fragments will not be unconditionally discarded before it is possible
198  to reassemble them. It is not within the scope of this document to
199  recommend timeout values.

201  Another means of solving the corruption issue is to add stronger
202  integrity checking, which can be done at any layer above IP. This is
203  a natural side effect of using cryptographic authentication.
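
As one illustration of stronger integrity checking above IP, an application can append a keyed message authentication code to each datagram payload and verify it on receipt. The sketch below is not taken from the draft: the names `seal` and `unseal`, the choice of HMAC-SHA-256, the pre-shared key, and the 1480-byte splice point are all assumptions made for the example.

```python
import hashlib
import hmac

TAG_LEN = 32  # bytes in an HMAC-SHA-256 tag

def seal(key: bytes, payload: bytes) -> bytes:
    """Append an HMAC-SHA-256 tag to the payload before sending."""
    return payload + hmac.new(key, payload, hashlib.sha256).digest()

def unseal(key: bytes, datagram: bytes):
    """Return the payload if the tag verifies, else None."""
    if len(datagram) < TAG_LEN:
        return None
    payload, tag = datagram[:-TAG_LEN], datagram[-TAG_LEN:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    return payload if hmac.compare_digest(tag, expected) else None

key = b"example pre-shared key"       # hypothetical key for the demo
old = seal(key, b"A" * 2000)
new = seal(key, b"B" * 2000)
spliced = new[:1480] + old[1480:]     # first fragment of new, rest of old
print(unseal(key, new) is not None)   # intact datagram is accepted
print(unseal(key, spliced) is None)   # mis-associated datagram is rejected
```
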
     At the
204  network layer, if IPsec AH is in use, the mis-associated fragments
205  should be discarded with extremely high probability. Other higher
206  layers may use longer checksums (for example, SCTP's is 32 bits in
207  length [8]) or cryptographic authentication (SSH message
208  authentication codes [10]). While stronger integrity checking may
209  prevent data corruption, it will not solve the problem of a high
210  effective loss rate.

212  6. Security Considerations

214  If a malicious entity knows that a pair of hosts are communicating
215  using a fragmented stream, it may present an opportunity for this
216  entity to corrupt the flow. By sending "high" fragments (those with
217  offset greater than zero) with a forged source address, the attacker
218  can deliberately cause corruption as described above. Exploiting
219  this vulnerability requires only knowledge of the source and
220  destination addresses of the flow, and fragment boundaries. It does
221  not require knowledge of port or sequence numbers.

223  If the attacker has visibility of packets on the path, the attack
224  profile is similar to injecting full segments. Using this attack
225  makes blind disruptions easier, and could certainly be used
226  effectively to cause denial of service. However, only streams using
227  IPv4 fragmentation are vulnerable. Because of the nature of the
228  problems outlined in this draft, the use of IPv4 fragmentation for
229  critical applications may not be advisable regardless of security
230  concerns.

232  7. References

234  [1] Kent, C. and J. Mogul, "Fragmentation considered harmful",
235      Proc. SIGCOMM '87 vol. 17, No. 5, October 1987.

237  [2] Lahey, K., "TCP Problems with Path MTU Discovery", RFC 2923,
238      September 2000.

240  [3] Postel, J., "Internet Protocol", STD 5, RFC 791, September
241      1981.

243  [4] Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191,
244      November 1990.

246  [5] Stone, J., Greenwald, M., Partridge, C. and J.
         Hughes,
247      "Performance of Checksums and CRC's over Real Data", IEEE/ACM
248      Transactions on Networking vol. 6, No. 5, October 1998.

250  [6] Stone, J. and C. Partridge, "When The CRC and TCP Checksum
251      Disagree", Proc. SIGCOMM 2000 vol. 30, No. 4, October 2000.

253  [7] Deering, S. and R. Hinden, "Internet Protocol, Version 6 (IPv6)
254      Specification", RFC 2460, December 1998.

256  [8] Stewart, R., Xie, Q., Morneault, K., Sharp, C., Schwarzbauer,
257      H., Taylor, T., Rytina, I., Kalla, M., Zhang, L. and V. Paxson,
258      "Stream Control Transmission Protocol", RFC 2960, October 2000.

260  [9] Kent, S. and R. Atkinson, "IP Authentication Header", RFC 2402,
261      November 1998.

263  [10] Ylonen, T. and C. Lonvick, "SSH Transport Layer Protocol",
264       draft-ietf-secsh-transport-18 (work in progress), June 2004.

266  [11] Clark, D., "IP datagram reassembly algorithms", RFC 815, July
267       1982.

269  Authors' Addresses

271  Matt Mathis
272  Pittsburgh Supercomputing Center
273  4400 Fifth Avenue
274  Pittsburgh, PA 15213
275  US

277  Phone: 412-268-3319
278  EMail: mathis@psc.edu
279  John W. Heffner
280  Pittsburgh Supercomputing Center
281  4400 Fifth Avenue
282  Pittsburgh, PA 15213
283  US

285  Phone: 412-268-2329
286  EMail: jheffner@psc.edu

288  Ben Chandler
289  Pittsburgh Supercomputing Center
290  4400 Fifth Avenue
291  Pittsburgh, PA 15213
292  US

294  Phone: 412-268-9783
295  EMail: bchandle@psc.edu

297  Appendix A. Support

299  This work was supported by the National Science Foundation under
300  Grant No. 0083285.

302  Intellectual Property Statement

304  The IETF takes no position regarding the validity or scope of any
305  Intellectual Property Rights or other rights that might be claimed to
306  pertain to the implementation or use of the technology described in
307  this document or the extent to which any license under such rights
308  might or might not be available; nor does it represent that it has
309  made any independent effort to identify any such rights.
     Information
310  on the IETF's procedures with respect to rights in IETF Documents can
311  be found in BCP 78 and BCP 79.

313  Copies of IPR disclosures made to the IETF Secretariat and any
314  assurances of licenses to be made available, or the result of an
315  attempt made to obtain a general license or permission for the use of
316  such proprietary rights by implementers or users of this
317  specification can be obtained from the IETF on-line IPR repository at
318  http://www.ietf.org/ipr.

320  The IETF invites any interested party to bring to its attention any
321  copyrights, patents or patent applications, or other proprietary
322  rights that may cover technology that may be required to implement
323  this standard. Please address the information to the IETF at
324  ietf-ipr@ietf.org.

326  Disclaimer of Validity

328  This document and the information contained herein are provided on an
329  "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
330  OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
331  ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
332  INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
333  INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
334  WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

336  Copyright Statement

338  Copyright (C) The Internet Society (2004). This document is subject
339  to the rights, licenses and restrictions contained in BCP 78, and
340  except as set forth therein, the authors retain all their rights.

342  Acknowledgment

344  Funding for the RFC Editor function is currently provided by the
345  Internet Society.