Internet Area WG                                               J. Touch
Internet Draft                                                  USC/ISI
Updates: 791,1122,2003                                November 27, 2012
Intended status: Proposed Standard
Expires: May 2013

              Updated Specification of the IPv4 ID Field
               draft-ietf-intarea-ipv4-id-update-07.txt

Status of this Memo

This Internet-Draft is submitted to IETF in full conformance with the
provisions of BCP 78 and BCP 79.

This document may contain material from IETF Documents or IETF
Contributions published or made publicly available before November
10, 2008.
The person(s) controlling the copyright in some of this
material may not have granted the IETF Trust the right to allow
modifications of such material outside the IETF Standards Process.
Without obtaining an adequate license from the person(s) controlling
the copyright in such materials, this document may not be modified
outside the IETF Standards Process, and derivative works of it may
not be created outside the IETF Standards Process, except to format
it for publication as an RFC or to translate it into languages other
than English.

Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html

This Internet-Draft will expire on May 27, 2013.

Copyright Notice

Copyright (c) 2012 IETF Trust and the persons identified as the
document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.

Abstract

The IPv4 Identification (ID) field enables fragmentation and
reassembly, and as currently specified is required to be unique
within the maximum lifetime for all datagrams with a given
source/destination/protocol tuple. If enforced, this uniqueness
requirement would limit all connections to 6.4 Mbps. Because
individual connections commonly exceed this speed, it is clear that
existing systems violate the current specification. This document
updates the specification of the IPv4 ID field in RFC791, RFC1122,
and RFC2003 to more closely reflect current practice and to more
closely match IPv6 so that the field's value is defined only when a
datagram is actually fragmented. It also discusses the impact of
these changes on how datagrams are used.

Table of Contents

1. Introduction
2. Conventions used in this document
3. The IPv4 ID Field
   3.1. Uses of the IPv4 ID Field
   3.2. Background on IPv4 ID Reassembly Issues
4. Updates to the IPv4 ID Specification
   4.1. IPv4 ID Used Only for Fragmentation
   4.2. Encourage Safe IPv4 ID Use
   4.3. IPv4 ID Requirements That Persist
5. Impact of Proposed Changes
   5.1. Impact on Legacy Internet Devices
   5.2. Impact on Datagram Generation
   5.3. Impact on Middleboxes
      5.3.1. Rewriting Middleboxes
      5.3.2. Filtering Middleboxes
   5.4. Impact on Header Compression
   5.5. Impact of Network Reordering and Loss
      5.5.1. Atomic Datagrams Experiencing Reordering or Loss
      5.5.2. Non-atomic Datagrams Experiencing Reordering or Loss
6. Updates to Existing Standards
   6.1. Updates to RFC 791
   6.2. Updates to RFC 1122
   6.3. Updates to RFC 2003
7. Security Considerations
8. IANA Considerations
9. References
   9.1. Normative References
   9.2. Informative References
10. Acknowledgments

1. Introduction

In IPv4, the Identification (ID) field is a 16-bit value that is
unique for every datagram for a given source address, destination
address, and protocol, such that it does not repeat within the
maximum datagram lifetime (MDL) [RFC791][RFC1122]. As currently
specified, all datagrams between a source and destination of a given
protocol must have unique IPv4 ID values over a period of this MDL,
which is typically interpreted as two minutes, and is related to the
recommended reassembly timeout [RFC1122]. This uniqueness is
currently required for all datagrams, regardless of fragmentation
settings.

Uniqueness of the IPv4 ID is commonly violated by high speed devices;
if strictly enforced, it would limit the speed of a single protocol
between two IP endpoints to 6.4 Mbps for typical MTUs of 1500 bytes
[RFC4963]. It is common for a single connection to operate far in
excess of these rates, which strongly indicates that the uniqueness
of the IPv4 ID as specified is already moot.
Further, some sources
have been generating non-varying IPv4 IDs for many years (e.g.,
cellphones), which resulted in support for such IDs in ROHC
[RFC5225].

This document updates the specification of the IPv4 ID field to more
closely reflect current practice, and to include considerations taken
into account during the specification of the similar field in IPv6.

2. Conventions used in this document

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC-2119 [RFC2119].

In this document, the characters ">>" preceding one or more indented
lines indicate a requirement using the key words listed above. This
convention aids reviewers in quickly identifying or finding this
document's explicit requirements.

3. The IPv4 ID Field

IP supports datagram fragmentation, where large datagrams are split
into smaller components to traverse links with limited maximum
transmission units (MTUs). Fragments are indicated in different ways
in IPv4 and IPv6:

o In IPv4, fragments are indicated using four fields of the basic
  header: Identification (ID), Fragment Offset, a "Don't Fragment"
  flag (DF), and a "More Fragments" flag (MF) [RFC791]

o In IPv6, fragments are indicated in an extension header that
  includes an ID, Fragment Offset, and M (more fragments) flag
  similar to their counterparts in IPv4 [RFC2460]

IPv4 and IPv6 fragmentation differ in a few important ways. IPv6
fragmentation occurs only at the source, so a DF bit is not needed to
prevent downstream devices from initiating fragmentation (i.e., IPv6
always acts as if DF=1).
The IPv6 fragment header is present only
when a datagram has been fragmented, or when the source has received
a "packet too big" ICMPv6 error message indicating that the path
cannot support the required minimum 1280-byte IPv6 MTU and is thus
subject to translation [RFC2460][RFC4443]. The latter case is
relevant only for IPv6 datagrams sent to IPv4 destinations to support
subsequent fragmentation after translation to IPv4.

With the exception of these two cases, the ID field is not present
for non-fragmented datagrams, and thus is meaningful only for
datagrams that are already fragmented or datagrams intended to be
fragmented as part of IPv4 translation. Finally, the IPv6 ID field is
32 bits and is required to be unique per source/destination address
pair, whereas the IPv4 ID field is only 16 bits and is required to be
unique per source/destination/protocol triple.

This document focuses on the IPv4 ID field issues, because in IPv6
the field is larger and present only in fragments.

3.1. Uses of the IPv4 ID Field

The IPv4 ID field was originally intended for fragmentation and
reassembly [RFC791]. Within a given source address, destination
address, and protocol, fragments of an original datagram are matched
based on their IPv4 ID. This requires that IDs be unique within the
address/protocol triple when fragmentation is possible (e.g., DF=0)
or when it has already occurred (e.g., frag_offset>0 or MF=1).

Other uses have been envisioned for the IPv4 ID field. The field has
been proposed as a way to detect and remove duplicate datagrams,
e.g., at congested routers (noted in Sec. 3.2.1.5 of [RFC1122]) or in
network accelerators. It has similarly been proposed for use at end
hosts to reduce the impact of duplication on higher-layer protocols
(e.g., additional processing in TCP, or the need for application-
layer duplicate suppression in UDP).
This is discussed further
in Section 5.1.

The IPv4 ID field is used in some diagnostic tools to correlate
datagrams measured at various locations along a network path. This is
already insufficient in IPv6 because unfragmented datagrams lack an
ID, so these tools are already being updated to avoid such reliance
on the ID field. This is also discussed further in Section 5.1.

The ID clearly needs to be unique (within MDL, within the
src/dst/protocol tuple) to support fragmentation and reassembly, but
not all datagrams are fragmented or allow fragmentation. This
document deprecates non-fragmentation uses, allowing the ID to be
repeated (within MDL, within the src/dst/protocol tuple) in those
cases.

3.2. Background on IPv4 ID Reassembly Issues

The following is a summary of issues with IPv4 fragment reassembly in
high speed environments raised previously [RFC4963]. Readers are
encouraged to consult RFC 4963 for a more detailed discussion of
these issues.

With the maximum IPv4 datagram size of 64KB, a 16-bit ID field that
does not repeat within 120 seconds means that the aggregate of all
connections of a given protocol between two IP endpoints is
limited to roughly 286 Mbps; at a more typical MTU of 1500 bytes,
this speed drops to 6.4 Mbps [RFC791][RFC1122][RFC4963]. This limit
currently applies for all IPv4 datagrams within a single protocol
(i.e., the IPv4 protocol field) between two IP addresses, regardless
of whether fragmentation is enabled or inhibited, and whether a
datagram is fragmented or not.

IPv6, even at typical MTUs, is capable of 18.7 Tbps with
fragmentation between two IP endpoints as an aggregate across all
protocols, due to the larger 32-bit ID field (and the fact that the
IPv6 next-header field, the equivalent of the IPv4 protocol field, is
not considered in differentiating fragments).
When fragmentation is
not used the field is absent, and in that case IPv6 speeds are not
limited by the ID field uniqueness.

Note also that 120 seconds is only an estimate of the MDL. It is
related to the reassembly timeout as a lower bound and the TCP
Maximum Segment Lifetime as an upper bound (both as noted in
[RFC1122]). Network delays are incurred in other ways, e.g.,
satellite links, which can add seconds of delay even though the TTL
is not decremented by a corresponding amount. There is thus no
enforcement mechanism to ensure that datagrams older than 120 seconds
are discarded.

Wireless Internet devices are frequently connected at speeds over 54
Mbps, and wired links of 1 Gbps have been the default for several
years. Although many end-to-end transport paths are congestion
limited, these devices easily achieve 100+ Mbps application-layer
throughput over LANs (e.g., disk-to-disk file transfer rates), and
numerous throughput demonstrations with COTS systems over wide-area
paths have exhibited these speeds for over a decade. This strongly
suggests that IPv4 ID uniqueness has been moot for a long time.

4. Updates to the IPv4 ID Specification

This document updates the specification of the IPv4 ID field in three
distinct ways, as discussed in subsequent subsections:

o Use the IPv4 ID field only for fragmentation

o Avoid a performance impact when the IPv4 ID field is used

o Encourage safe operation when the IPv4 ID field is used

There are two kinds of datagrams used in the following discussion,
named as follows:

o Atomic datagrams are datagrams not yet fragmented and for which
  further fragmentation has been inhibited.

o Non-atomic datagrams are datagrams that either already have been
  fragmented or for which fragmentation remains possible.

This same definition can be expressed in pseudo code using common
logical operators (equals is ==, logical 'and' is &&, logical 'or' is
||, greater than is >, and parentheses function as usual):

o Atomic datagrams: (DF==1)&&(MF==0)&&(frag_offset==0)

o Non-atomic datagrams: (DF==0)||(MF==1)||(frag_offset>0)

The test for non-atomic datagrams is the logical negation of the test
for atomic datagrams, thus all possibilities are considered.

4.1. IPv4 ID Used Only for Fragmentation

Although RFC1122 suggests the IPv4 ID field has other uses, including
datagram de-duplication, such uses are already not interoperable with
known implementations of sources that do not vary their ID. This
document thus defines this field's value only for fragmentation and
reassembly:

   >> IPv4 ID field MUST NOT be used for purposes other than
   fragmentation and reassembly.

Datagram de-duplication is accomplished using hash-based duplicate
detection for cases where the ID field is absent (IPv6 unfragmented
datagrams), which can also be applied to IPv4 atomic datagrams
without utilizing the ID field [RFC6621].

In atomic datagrams, the IPv4 ID field has no meaning, and thus can
be set to an arbitrary value, i.e., non-repeating IDs within the
address/protocol triple are no longer required for atomic datagrams:

   >> Originating sources MAY set the IPv4 ID field of atomic datagrams
   to any value.

Further, all network nodes, whether at intermediate routers,
destination hosts, or other devices (e.g., NATs and other address
sharing mechanisms, firewalls, tunnel egresses), cannot rely on the
field:

   >> All devices that examine IPv4 headers MUST ignore the IPv4 ID
   field of atomic datagrams.
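
As an illustration only, and not as part of this specification, the
atomic/non-atomic test given in Section 4 can be sketched in Python.
The names FragmentFields, parse_fragment_fields, and is_atomic are
hypothetical and chosen for this example; the bit masks assume the
IPv4 header layout of [RFC791], in which bytes 6-7 carry the three
flag bits followed by the 13-bit Fragment Offset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FragmentFields:
    """Fragmentation-related IPv4 header fields (hypothetical helper type)."""
    df: bool          # "Don't Fragment" flag
    mf: bool          # "More Fragments" flag
    frag_offset: int  # Fragment Offset, in 8-byte units (13-bit field)


def parse_fragment_fields(header: bytes) -> FragmentFields:
    """Extract DF, MF, and Fragment Offset from bytes 6-7 of an IPv4 header."""
    if len(header) < 20:
        raise ValueError("an IPv4 header is at least 20 bytes long")
    word = int.from_bytes(header[6:8], "big")
    return FragmentFields(
        df=bool(word & 0x4000),     # second-highest bit of the 16-bit word
        mf=bool(word & 0x2000),     # third-highest bit
        frag_offset=word & 0x1FFF,  # low 13 bits
    )


def is_atomic(fields: FragmentFields) -> bool:
    """Atomic test: (DF==1)&&(MF==0)&&(frag_offset==0)."""
    return fields.df and not fields.mf and fields.frag_offset == 0


def is_non_atomic(fields: FragmentFields) -> bool:
    """Non-atomic test: (DF==0)||(MF==1)||(frag_offset>0)."""
    return (not fields.df) or fields.mf or fields.frag_offset > 0
```

A device following the requirements above would consult the ID field
only when is_non_atomic() holds for the parsed header, and would
disregard it otherwise.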

The IPv4 ID field is thus meaningful only for non-atomic datagrams -
datagrams that have either already been fragmented, or those for
which fragmentation remains permitted. Atomic datagrams are detected
by their DF, MF, and fragmentation offset fields as explained in
Section 4, because such a test is completely backward compatible;
this document thus does not reserve any IPv4 ID values, including 0,
as distinguished.

Deprecating the use of the IPv4 ID field for non-reassembly uses
should have little - if any - impact. IPv4 IDs are already frequently
repeated, e.g., over even moderately fast connections and from some
sources that do not vary the ID at all, and no adverse impact has
been observed. Duplicate suppression was suggested [RFC1122] and has
been implemented in some protocol accelerators, but no impacts of
IPv4 ID reuse have been noted to date. Routers are not required to
issue ICMPs on any particular timescale, so IPv4 ID uniqueness should
not have been relied upon to validate them; no such use has been
observed, and because repetition already occurs, any resulting
problems would have been noticed [RFC1812]. ICMP relaying at tunnel
ingresses is specified to use soft state rather than a datagram
cache, and any reliance on a datagram cache would already have been
noticed for similar reasons [RFC2003]. These and other legacy issues
are discussed further in Section 5.1.

4.2. Encourage Safe IPv4 ID Use

This document makes the following further changes, as corollary
requirements, to encourage safe use of the IPv4 ID field.

RFC 1122 discusses that if TCP retransmits a segment it may be
possible to reuse the IPv4 ID (see Section 6.2). This can make it
difficult for a source to avoid IPv4 ID repetition for received
fragments.
RFC 1122 concludes that this behavior "is not useful";
this document formalizes that conclusion as follows:

   >> The IPv4 ID of non-atomic datagrams MUST NOT be reused when
   sending a copy of an earlier non-atomic datagram.

RFC 1122 also suggests that fragments can overlap [RFC1122]. Such
overlap can occur if successive retransmissions are fragmented in
different ways but with the same reassembly IPv4 ID. This overlap is
noted as the result of reusing IPv4 IDs when retransmitting
datagrams, which this document deprecates. However, it is also the
result of in-network datagram duplication, which can still occur. As
a result this document does not change the need to support
overlapping fragments.

4.3. IPv4 ID Requirements That Persist

This document does not relax the IPv4 ID field uniqueness
requirements of [RFC791] for non-atomic datagrams, i.e.:

   >> Sources emitting non-atomic datagrams MUST NOT repeat IPv4 ID
   values within one MDL for a given source address/destination
   address/protocol triple.

Such sources include originating hosts, tunnel ingresses, and NATs
(including other address sharing mechanisms) (see Section 5.3).

This document does not relax the requirement that all network devices
honor the DF bit, i.e.:

   >> IPv4 datagrams whose DF=1 MUST NOT be fragmented.

   >> IPv4 datagram transit devices MUST NOT clear the DF bit.

Specifically, DF=1 prevents fragmenting atomic datagrams. DF=1 also
prevents further fragmenting received fragments. In-network
fragmentation is permitted only when DF=0; this document does not
change that requirement.

5. Impact of Proposed Changes

This section discusses the impact of the proposed changes on legacy
devices, datagram generation in updated devices, middleboxes, and
header compression.

5.1. Impact on Legacy Internet Devices

Legacy uses of the IPv4 ID field consist of fragment generation,
fragment reassembly, duplicate datagram detection, and "other" uses.

Current devices already generate ID values that are reused within the
source address, destination address, protocol, and ID tuple in less
than the current estimated Internet MDL of two minutes. They assume
that the MDL over their end-to-end path is much lower.

Existing devices have been known to generate non-varying IDs for
atomic datagrams for nearly a decade, notably some cell phones. Such
constant ID values are the reason for their support as an
optimization of ROHC [RFC5225]. This is discussed further in Section
5.4. Generation of IPv4 datagrams with constant (zero) IDs is also
described as part of the IP/ICMP translation standard [RFC6145].

Many current devices support fragmentation that ignores the IPv4
Don't Fragment (DF) bit. Such devices already transit traffic from
sources that reuse the ID. If fragments of different datagrams
reusing the same ID (within the source/destination/protocol tuple)
arrive at the destination interleaved, reassembly would fail and
traffic would be dropped. Either such interleaving is uncommon, or
traffic from such sources is not widely traversing these DF-ignoring
devices, because significant occurrence of reassembly errors has not
been reported. DF-ignoring devices do not comply with existing
standards, and it is not feasible to update the standards to allow
them as compliant.

The ID field has been envisioned for use in duplicate detection, as
discussed in Section 4.1 [RFC1122]. Although this document now allows
IPv4 ID reuse for atomic datagrams, such reuse is already common (as
noted above).
Protocol accelerators are known to implement IPv4
duplicate detection, but such devices are also known to violate other
Internet standards to achieve higher end-to-end performance. These
devices would already exhibit erroneous drops for this current
traffic, and this has not been reported.

There are other potential uses of the ID field, such as for
diagnostic purposes. Such uses already need to accommodate atomic
datagrams with reused ID fields. There are no reports of such uses
having problems with current datagrams that reuse IDs. These and any
other uses of the ID field are encouraged to apply IPv6-compatible
methods for IPv4 as well.

Thus, as a result of previous requirements, this document recommends
that IPv4 duplicate detection and diagnostic mechanisms apply IPv6-
compatible methods, i.e., that do not rely on the ID field (e.g., as
suggested in [RFC6621]). This is a consequence of using the ID field
only for reassembly, as well as the known hazard of existing devices
already reusing the ID field.

5.2. Impact on Datagram Generation

The following is a summary of the recommendations that are the result
of the previous changes to the IPv4 ID field specification.

Because atomic datagrams can use arbitrary IPv4 ID values, the ID
field no longer imposes a performance impact in those cases. However,
the performance impact remains for non-atomic datagrams. As a result:

   >> Sources of non-atomic IPv4 datagrams MUST rate-limit their output
   to comply with the ID uniqueness requirements. Such sources include,
   in particular, DNS over UDP [RFC2671].

Because there is no strict definition of the MDL, reassembly hazards
exist regardless of the IPv4 ID reuse interval or the reassembly
timeout.
As a result:

   >> Higher layer protocols SHOULD verify the integrity of IPv4
   datagrams, e.g., using a checksum or hash that can detect reassembly
   errors (the UDP checksum is weak in this regard, but better than
   nothing).

Additional integrity checks can be employed using tunnels, as
supported by SEAL, IPsec, or SCTP [RFC4301][RFC4960][RFC5320]. Such
checks can avoid the reassembly hazards that can occur when using UDP
and TCP checksums [RFC4963], or when using partial checksums as in
UDP-Lite [RFC3828]. Because such integrity checks can avoid the
impact of reassembly errors:

   >> Sources of non-atomic IPv4 datagrams using strong integrity checks
   MAY reuse the ID within MDL values smaller than is typical.

Note, however, that such frequent reuse can still result in corrupted
reassembly and poor throughput, although it would not propagate
reassembly errors to higher layer protocols.

5.3. Impact on Middleboxes

Middleboxes include rewriting devices such as network address
translators (NATs), address/port translators (NAPTs), and other
address sharing mechanisms (ASMs). They also include non-router
devices that inspect and filter datagrams, such as accelerators and
firewalls.

The changes proposed in this document may not be implemented by
middleboxes; however, these changes are more likely to make current
middlebox behavior compliant than to affect the service provided by
those devices.

5.3.1. Rewriting Middleboxes

NATs and NAPTs rewrite IP fields, and tunnel ingresses (using IPv4
encapsulation) copy and modify some IPv4 fields, so all are
considered sources, as are any devices that rewrite any portion of
the source address, destination address, protocol, and ID tuple for
any datagrams [RFC3022]. This is also true for other ASMs, including
4rd, IVI, and others in the "A+P" (address plus port) family [Bo11]
[De11] [RFC6219].
It is equally true for any other datagram rewriting
mechanism. As a result, they are subject to all the requirements of
any source, as has been noted.

NATs/ASMs/rewriters present a particularly challenging situation for
fragmentation. Because they overwrite portions of the reassembly
tuple in both directions, they can destroy tuple uniqueness and
result in a reassembly hazard. Whenever IPv4 source address,
destination address, or protocol fields are modified, a
NAT/ASM/rewriter needs to ensure that the ID field is generated
appropriately, rather than simply copied from the incoming datagram.
Specifically:

   >> Address sharing or rewriting devices MUST ensure that the IPv4 ID
   field of datagrams whose addresses or protocol are translated
   complies with these requirements as if the datagram were sourced by
   that device.

This compliance means that the IPv4 ID field of non-atomic datagrams
translated at a NAT/ASM/rewriter needs to obey the uniqueness
requirements of any IPv4 datagram source. Unfortunately, fragments
already violate that requirement, as they repeat an IPv4 ID within
the MDL for a given source address, destination address, and protocol
triple.

Such problems with transmitting fragments through NATs/ASMs/rewriters
are already known; translation is based on the transport port number,
which is present in only the first fragment anyway [RFC3022]. This
document underscores the point that not only is reassembly (and
possibly subsequent fragmentation) required for translation, it can
be used to avoid issues with IPv4 ID uniqueness.

Note that NATs/ASMs already need to exercise special care when
emitting datagrams on their public side, because merging datagrams
from many sources onto a single outgoing source address can result in
IPv4 ID collisions. This situation precedes this document, and is not
affected by it.
It is exacerbated in large-scale, so-called "carrier
grade" NATs [Pe11].

Tunnel ingresses act as sources for the outermost header, but tunnels
act as routers for the inner headers (i.e., the datagram as arriving
at the tunnel ingress). Ingresses can always fragment as originating
sources of the outer header, because they control the uniqueness of
that IPv4 ID field and the value of DF on the outer header
independent of those values on the inner (arriving datagram) header.

5.3.2. Filtering Middleboxes

Middleboxes also include devices that filter datagrams, including
network accelerators and firewalls. Some such devices reportedly
feature datagram de-duplication that relies on IP ID uniqueness to
identify duplicates, which has been discussed in Section 5.1.

5.4. Impact on Header Compression

Header compression algorithms already accommodate various ways in
which the IPv4 ID changes between sequential datagrams [RFC1144]
[RFC2508] [RFC3545] [RFC5225]. Such algorithms currently assume that
the IPv4 ID is preserved end-to-end. Some algorithms already allow
assuming the ID does not change (e.g., ROHC [RFC5225]), where others
include non-changing IDs via zero deltas (e.g., ECRTP [RFC3545]).

When compression assumes a changing ID as a default, having a non-
changing ID can make compression less efficient. Such non-changing
IDs have been described in various RFCs (e.g., footnote 21 of
[RFC1144] and cRTP [RFC2508]). When compression can assume a non-
changing IPv4 ID - as with ROHC and ECRTP - efficiency can be
increased.

5.5. Impact of Network Reordering and Loss

Tolerance to network reordering and loss is a key feature of the
Internet architecture. Although most current IP networks avoid
gratuitous reordering and loss, both can and do occur.
577 Datagrams are already intended to be reordered or lost, and recovery 578 from those errors (where supported) already occurs at the transport 579 or higher protocol layers. 581 Reordering is typically associated with routing transients or where 582 multiple alternate paths exist. Loss is typically associated with 583 path congestion or link failure (partial or complete). The impact of 584 such events is different for atomic and non-atomic datagrams, and is 585 discussed below. In summary, the recommendations of this document 586 make the Internet more robust to reordering and loss by emphasizing 587 the requirements of ID uniqueness for non-atomic datagrams and by 588 more clearly indicating the impact of these requirements on both 589 endpoints and datagram transit devices. 591 5.5.1. Atomic Datagrams Experiencing Reordering or Loss 593 Reusing ID values does not affect atomic datagrams when the DF bit is 594 correctly respected, because order restoration does not depend on the 595 datagram header. TCP uses a transport header sequence number; in some 596 other protocols, sequence is indicated and restored at the 597 application layer. 599 When DF=1 is ignored, reordering or loss can cause fragments of 600 different datagrams to be interleaved and thus incorrectly 601 reassembled and thus discarded. Reuse of ID values in atomic packets, 602 as permitted by this document, can result in higher datagram loss in 603 such cases. Such cases already can exist because there are known 604 devices that use a constant ID for atomic packets (some cellphones), 605 and there are known devices that ignore DF=1, but high levels of 606 corresponding loss have not been reported. The lack of such reports 607 indicates either a lack of reordering or loss in such cases, or a 608 tolerance to the resulting losses. 
If such issues are reported, it 609 would be more productive to address non-compliant devices (that 610 ignore DF=1), because it is impractical to define Internet 611 specifications to tolerate devices that ignore those specifications. 612 This is why this document emphasizes the need to honor DF=1, as well 613 as that datagram transit devices need to retain the DF bit as 614 received (i.e., rather than clear it). 616 5.5.2. Non-atomic Datagrams Experiencing Reordering or Loss 618 Non-atomic datagrams rely on the uniqueness of the ID value to 619 tolerate reordering of fragments, notably where fragments of 620 different datagrams are interleaved as a result of such reordering. 621 Fragment loss can result in reassembly of fragments from different 622 origin datagrams, which is why ID reuse in non-atomic datagrams is 623 based on datagram (fragment) maximum lifetime, not just expected 624 reordering interleaving. 626 This document does not change the requirements for uniqueness of IDs 627 in non-atomic datagrams, and thus does not affect their tolerance to 628 such reordering or loss. This document emphasizes the need for ID 629 uniqueness for all datagram sources including rewriting middleboxes, 630 the need to rate-limit sources to ensure ID uniqueness, the need to 631 not reuse the ID for retransmitted datagrams, and the need to use 632 higher-layer integrity checks to prevent reassembly errors - all of 633 which result in a higher tolerance to reordering or loss events. 635 6. Updates to Existing Standards 637 The following sections address the specific changes to existing 638 protocols indicated by this document. 640 6.1. Updates to RFC 791 642 RFC 791 states that: 644 The originating protocol module of an internet datagram sets the 645 identification field to a value that must be unique for that 646 source-destination pair and protocol for the time the datagram 647 will be active in the internet system. 
649 And later that: 651 Thus, the sender must choose the Identifier to be unique for this 652 source, destination pair and protocol for the time the datagram 653 (or any fragment of it) could be alive in the internet. 655 It seems then that a sending protocol module needs to keep a table 656 of Identifiers, one entry for each destination it has communicated 657 with in the last maximum datagram lifetime for the internet. 659 However, since the Identifier field allows 65,536 different 660 values, some host may be able to simply use unique identifiers 661 independent of destination. 663 It is appropriate for some higher level protocols to choose the 664 identifier. For example, TCP protocol modules may retransmit an 665 identical TCP segment, and the probability for correct reception 666 would be enhanced if the retransmission carried the same 667 identifier as the original transmission since fragments of either 668 datagram could be used to construct a correct TCP segment. 670 This document changes RFC 791 as follows: 672 o IPv4 ID uniqueness applies to only non-atomic datagrams. 674 o Retransmitted non-atomic IPv4 datagrams are no longer permitted to 675 reuse the ID value. 677 6.2. Updates to RFC 1122 679 RFC 1122 states that: 681 3.2.1.5 Identification: RFC-791 Section 3.2 683 When sending an identical copy of an earlier datagram, a 684 host MAY optionally retain the same Identification field in 685 the copy. 687 DISCUSSION: 689 Some Internet protocol experts have maintained that when a 690 host sends an identical copy of an earlier datagram, the new 691 copy should contain the same Identification value as the 692 original. 
There are two suggested advantages: (1) if the 693 datagrams are fragmented and some of the fragments are lost, 694 the receiver may be able to reconstruct a complete datagram 695 from fragments of the original and the copies; (2) a 696 congested gateway might use the IP Identification field (and 697 Fragment Offset) to discard duplicate datagrams from the 698 queue. 700 This document changes RFC 1122 as follows: 702 o The IPv4 ID field is no longer permitted to be used for duplicate 703 detection. This applies to both atomic and non-atomic datagrams. 705 o Retransmitted non-atomic IPv4 datagrams are no longer permitted to 706 reuse the ID value. 708 6.3. Updates to RFC 2003 710 This document updates how IPv4-in-IPv4 tunnels create IPv4 ID values 711 for the IPv4 outer header [RFC2003], but only in the same way as for 712 any other IPv4 datagram source. Specifically, RFC 2003 states the 713 following, where ref. [10] is RFC 791: 715 Identification, Flags, Fragment Offset 717 These three fields are set as specified in [10]... 719 This document changes RFC 2003 as follows: 721 o The IPv4 ID field is set as permitted by RFCXXXX. 723 7. Security Considerations 725 When the IPv4 ID is ignored on receipt (e.g., for atomic datagrams), 726 its value becomes unconstrained; that field then can more easily be 727 used as a covert channel. For some atomic datagrams it is now 728 possible, and may be desirable, to rewrite the IPv4 ID field to avoid 729 its use as such a channel. Rewriting would be prohibited for 730 datagrams protected by the IPsec Authentication Header (AH), although we 731 do not recommend use of AH to achieve this result [RFC4302]. 733 The IPv4 ID also now adds much less to the entropy of the header of a 734 datagram. Such entropy might be used as input to cryptographic 735 algorithms or pseudorandom generators, although IDs have never been 736 assured sufficient entropy for such purposes.
The IPv4 ID had 737 previously been required to be unique (for a given source/destination address pair, and protocol 738 field) within one MDL, although this requirement was not enforced and 739 clearly is typically ignored. The IPv4 ID of atomic datagrams is not 740 required to be unique, and so contributes no entropy to the header. 742 The deprecation of the IPv4 ID field's uniqueness for atomic 743 datagrams can defeat the ability to count devices behind a 744 NAT/ASM/rewriter [Be02]. This is not intended as a security feature, 745 however. 747 8. IANA Considerations 749 There are no IANA considerations in this document. 751 The RFC Editor should remove this section prior to publication. 753 9. References 755 9.1. Normative References 757 [RFC791] Postel, J., "Internet Protocol", RFC 791 / STD 5, September 758 1981. 760 [RFC1122] Braden, R., Ed., "Requirements for Internet Hosts - 761 Communication Layers", RFC 1122 / STD 3, October 1989. 763 [RFC1812] Baker, F. (Ed.), "Requirements for IP Version 4 Routers", 764 RFC 1812 / STD 4, Jun. 1995. 766 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 767 Requirement Levels", RFC 2119 / BCP 14, March 1997. 769 [RFC2003] Perkins, C., "IP Encapsulation within IP", RFC 2003, 770 October 1996. 772 9.2. Informative References 774 [Be02] Bellovin, S., "A Technique for Counting NATted Hosts", 775 Internet Measurement Conference, Proceedings of the 2nd ACM 776 SIGCOMM Workshop on Internet Measurement, Nov. 2002. 778 [Bo11] Boucadair, M., J. Touch, P. Levis, R. Penno, "Analysis of 779 Solution Candidates to Reveal a Host Identifier in Shared 780 Address Deployments", (work in progress), draft-boucadair- 781 intarea-nat-reveal-analysis, Sept. 2011. 783 [De11] Despres, R. (Ed.), S. Matsushima, T. Murakami, O. Troan, 784 "IPv4 Residual Deployment across IPv6-Service networks 785 (4rd)", (work in progress), draft-despres-intarea-4rd, Mar. 786 2011. 788 [Pe11] Perreault, S., (Ed.), I. Yamagata, S. Miyakawa, A. 789 Nakagawa, H.
Ashida, "Common requirements of IP address 790 sharing schemes", (work in progress), draft-ietf-behave- 791 lsn-requirements, Mar. 2011. 793 [RFC1144] Jacobson, V., "Compressing TCP/IP Headers", RFC 1144, Feb. 794 1990. 796 [RFC2460] Deering, S., R. Hinden, "Internet Protocol, Version 6 797 (IPv6) Specification", RFC 2460, Dec. 1998. 799 [RFC2508] Casner, S., V. Jacobson, "Compressing IP/UDP/RTP Headers 800 for Low-Speed Serial Links", RFC 2508, Feb. 1999. 802 [RFC2671] Vixie, P., "Extension Mechanisms for DNS (EDNS0)", RFC 2671, 803 Aug. 1999. 805 [RFC3022] Srisuresh, P. and K. Egevang, "Traditional IP Network 806 Address Translator (Traditional NAT)", RFC 3022, Jan. 2001. 808 [RFC3545] Koren, T., S. Casner, J. Geevarghese, B. Thompson, P. 809 Ruddy, "Enhanced Compressed RTP (CRTP) for Links with High 810 Delay, Packet Loss and Reordering", RFC 3545, Jul. 2003. 812 [RFC3828] Larzon, L-A., M. Degermark, S. Pink, L-E. Jonsson, Ed., G. 813 Fairhurst, Ed., "The Lightweight User Datagram Protocol 814 (UDP-Lite)", RFC 3828, Jul. 2004. 816 [RFC4301] Kent, S., K. Seo, "Security Architecture for the Internet 817 Protocol", RFC 4301, Dec. 2005. 819 [RFC4302] Kent, S., "IP Authentication Header", RFC 4302, Dec. 2005. 821 [RFC4443] Conta, A., S. Deering, M. Gupta (Ed.), "Internet Control 822 Message Protocol (ICMPv6) for the Internet Protocol Version 823 6 (IPv6) Specification", RFC 4443, Mar. 2006. 825 [RFC4960] Stewart, R. (Ed.), "Stream Control Transmission Protocol", 826 RFC 4960, Sep. 2007. 828 [RFC4963] Heffner, J., M. Mathis, B. Chandler, "IPv4 Reassembly 829 Errors at High Data Rates," RFC 4963, Jul. 2007. 831 [RFC5225] Pelletier, G., K. Sandlund, "RObust Header Compression 832 Version 2 (ROHCv2): Profiles for RTP, UDP, IP, ESP and UDP- 833 Lite", RFC 5225, Apr. 2008. 835 [RFC5320] Templin, F., Ed., "The Subnetwork Encapsulation and 836 Adaptation Layer (SEAL)", RFC 5320, Feb. 2010. 838 [RFC6145] Li, X., C. Bao, F.
Baker, "IP/ICMP Translation Algorithm," 839 RFC 6145, Apr. 2011. 841 [RFC6219] Li, X., C. Bao, M. Chen, H. Zhang, J. Wu, "The China 842 Education and Research Network (CERNET) IVI Translation 843 Design and Deployment for the IPv4/IPv6 Coexistence and 844 Transition", RFC 6219, May 2011. 846 [RFC6621] Macker, J. (Ed.), "Simplified Multicast Forwarding," RFC 847 6621, May 2012. 849 10. Acknowledgments 851 This document was inspired by numerous discussions among the 852 authors, Jari Arkko, Lars Eggert, Dino Farinacci, and Fred Templin, 853 as well as members participating in the Internet Area Working Group. 854 Detailed feedback was provided by Gorry Fairhurst, Brian Haberman, 855 Ted Hardie, Mike Heard, Erik Nordmark, Carlos Pignataro, and Dan 856 Wing. This document originated as an Independent Stream draft co- 857 authored by Matt Mathis, PSC, and his contributions are greatly 858 appreciated. 860 This document was prepared using 2-Word-v2.0.template.dot. 862 Author's Address 864 Joe Touch 865 USC/ISI 866 4676 Admiralty Way 867 Marina del Rey, CA 90292-6695 868 U.S.A. 870 Phone: +1 (310) 448-9151 871 Email: touch@isi.edu