idnits 2.17.1 

draft-ietf-intarea-ipv4-id-update-07.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  == There are 2 instances of lines with non-RFC6890-compliant IPv4 addresses
     in the document.  If these are example addresses, they should be changed.

  -- The draft header indicates that this document updates RFC2003, but the
     abstract doesn't seem to directly say this.  It does mention RFC2003
     though, so this could be OK.

  -- The draft header indicates that this document updates RFC1122, but the
     abstract doesn't seem to directly say this.  It does mention RFC1122
     though, so this could be OK.

  -- The draft header indicates that this document updates RFC791, but the
     abstract doesn't seem to directly say this.  It does mention RFC791
     though, so this could be OK.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

     (Using the creation date from RFC791, updated by this document, for
     RFC5378 checks: 1981-09-01)

  -- The document seems to contain a disclaimer for pre-RFC5378 work, and may
     have content which was first submitted before 10 November 2008.  The
     disclaimer is necessary when there are original authors that you have
     been unable to contact, or if some do not wish to grant the BCP78 rights
     to the IETF Trust.  If you are able to get all authors (current and
     original) to grant those rights, you can and should remove the
     disclaimer; otherwise, the disclaimer is needed and you can ignore this
     comment. (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (November 27, 2012) is 4165 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  -- Looks like a reference, but probably isn't: '10' on line 717

  -- Obsolete informational reference (is this intentional?): RFC 2460
     (Obsoleted by RFC 8200)

  -- Obsolete informational reference (is this intentional?): RFC 2671
     (Obsoleted by RFC 6891)

  -- Obsolete informational reference (is this intentional?): RFC 4960
     (Obsoleted by RFC 9260)

  -- Obsolete informational reference (is this intentional?): RFC 6145
     (Obsoleted by RFC 7915)


     Summary: 0 errors (**), 0 flaws (~~), 2 warnings (==), 10 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.
--------------------------------------------------------------------------------

1	Internet Area WG                                               J. Touch
2	Internet Draft                                                 USC/ISI
3	Updates: 791,1122,2003                                November 27, 2012
4	Intended status: Proposed Standard
5	Expires: May 2013

7	                Updated Specification of the IPv4 ID Field
8	                 draft-ietf-intarea-ipv4-id-update-07.txt

10	Status of this Memo

12	   This Internet-Draft is submitted to IETF in full conformance with the
13	   provisions of BCP 78 and BCP 79.

15	   This document may contain material from IETF Documents or IETF
16	   Contributions published or made publicly available before November
17	   10, 2008. The person(s) controlling the copyright in some of this
18	   material may not have granted the IETF Trust the right to allow
19	   modifications of such material outside the IETF Standards Process.
20	   Without obtaining an adequate license from the person(s) controlling
21	   the copyright in such materials, this document may not be modified
22	   outside the IETF Standards Process, and derivative works of it may
23	   not be created outside the IETF Standards Process, except to format
24	   it for publication as an RFC or to translate it into languages other
25	   than English.

27	   Internet-Drafts are working documents of the Internet Engineering
28	   Task Force (IETF), its areas, and its working groups.  Note that
29	   other groups may also distribute working documents as Internet-
30	   Drafts.

32	   Internet-Drafts are draft documents valid for a maximum of six months
33	   and may be updated, replaced, or obsoleted by other documents at any
34	   time.  It is inappropriate to use Internet-Drafts as reference
35	   material or to cite them other than as "work in progress."

37	   The list of current Internet-Drafts can be accessed at
38	   http://www.ietf.org/ietf/1id-abstracts.txt

40	   The list of Internet-Draft Shadow Directories can be accessed at
41	   http://www.ietf.org/shadow.html

43	   This Internet-Draft will expire on May 27, 2013.

45	Copyright Notice

47	   Copyright (c) 2012 IETF Trust and the persons identified as the
48	   document authors. All rights reserved.

50	   This document is subject to BCP 78 and the IETF Trust's Legal
51	   Provisions Relating to IETF Documents
52	   (http://trustee.ietf.org/license-info) in effect on the date of
53	   publication of this document. Please review these documents
54	   carefully, as they describe your rights and restrictions with respect
55	   to this document. Code Components extracted from this document must
56	   include Simplified BSD License text as described in Section 4.e of
57	   the Trust Legal Provisions and are provided without warranty as
58	   described in the Simplified BSD License.

60	Abstract

62	   The IPv4 Identification (ID) field enables fragmentation and
63	   reassembly, and as currently specified is required to be unique
64	   within the maximum lifetime for all datagrams with a given
65	   source/destination/protocol tuple. If enforced, this uniqueness
66	   requirement would limit all connections to 6.4 Mbps. Because
67	   individual connections commonly exceed this speed, it is clear that
68	   existing systems violate the current specification. This document
69	   updates the specification of the IPv4 ID field in RFC791, RFC1122,
70	   and RFC2003 to more closely reflect current practice and to more
71	   closely match IPv6 so that the field's value is defined only when a
72	   datagram is actually fragmented. It also discusses the impact of
73	   these changes on how datagrams are used.

75	Table of Contents

77	   1. Introduction
78	   2. Conventions used in this document
79	   3. The IPv4 ID Field
80	      3.1. Uses of the IPv4 ID Field
81	      3.2. Background on IPv4 ID Reassembly Issues
82	   4. Updates to the IPv4 ID Specification
83	      4.1. IPv4 ID Used Only for Fragmentation
84	      4.2. Encourage Safe IPv4 ID Use
85	      4.3. IPv4 ID Requirements That Persist
86	   5. Impact of Proposed Changes
87	      5.1. Impact on Legacy Internet Devices
88	      5.2. Impact on Datagram Generation
89	      5.3. Impact on Middleboxes
90	         5.3.1. Rewriting Middleboxes
91	         5.3.2. Filtering Middleboxes
92	      5.4. Impact on Header Compression
93	      5.5. Impact of Network Reordering and Loss
94	         5.5.1. Atomic Datagrams Experiencing Reordering or Loss
95	         5.5.2. Non-atomic Datagrams Experiencing Reordering or Loss
96	   6. Updates to Existing Standards
97	      6.1. Updates to RFC 791
98	      6.2. Updates to RFC 1122
99	      6.3. Updates to RFC 2003
100	   7. Security Considerations
101	   8. IANA Considerations
102	   9. References
103	      9.1. Normative References
104	      9.2. Informative References
105	   10. Acknowledgments

107	1. Introduction

109	   In IPv4, the Identification (ID) field is a 16-bit value that is
110	   unique for every datagram for a given source address, destination
111	   address, and protocol, such that it does not repeat within the
112	   maximum datagram lifetime (MDL) [RFC791][RFC1122]. As currently
113	   specified, all datagrams between a source and destination of a given
114	   protocol must have unique IPv4 ID values over a period of this MDL,
115	   which is typically interpreted as two minutes, and is related to the
116	   recommended reassembly timeout [RFC1122]. This uniqueness is
117	   currently specified for all datagrams, regardless of fragmentation
118	   settings.

120	   Uniqueness of the IPv4 ID is commonly violated by high speed devices;
121	   if strictly enforced, it would limit the speed of a single protocol
122	   between two IP endpoints to 6.4 Mbps for typical MTUs of 1500 bytes
123	   [RFC4963]. It is common for a single connection to operate far in
124	   excess of these rates, which strongly indicates that the uniqueness
125	   of the IPv4 ID as specified is already moot. Further, some sources
126	   have been generating non-varying IPv4 IDs for many years (e.g.,
127	   cellphones), which resulted in support for such IDs in ROHC [RFC5225].

129	   This document updates the specification of the IPv4 ID field to more
130	   closely reflect current practice, and to include considerations taken
131	   into account during the specification of the similar field in IPv6.

133	2. Conventions used in this document

135	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
136	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
137	   document are to be interpreted as described in RFC-2119 [RFC2119].

139	   In this document, the characters ">>" preceding an indented line(s)
140	   indicate a requirement using the key words listed above. This
141	   convention aids reviewers in quickly identifying or finding this
142	   document's explicit requirements.

144	3. The IPv4 ID Field

146	   IP supports datagram fragmentation, where large datagrams are split
147	   into smaller components to traverse links with limited maximum
148	   transmission units (MTUs). Fragments are indicated in different ways
149	   in IPv4 and IPv6:

151	   o  In IPv4, fragments are indicated using four fields of the basic
152	      header: Identification (ID), Fragment Offset, a "Don't Fragment"
153	      flag (DF), and a "More Fragments" flag (MF) [RFC791]

155	   o  In IPv6, fragments are indicated in an extension header that
156	      includes an ID, Fragment Offset, and M (more fragments) flag
157	      similar to their counterparts in IPv4 [RFC2460]

159	   IPv4 and IPv6 fragmentation differ in a few important ways. IPv6
160	   fragmentation occurs only at the source, so a DF bit is not needed to
161	   prevent downstream devices from initiating fragmentation (i.e., IPv6
162	   always acts as if DF=1). The IPv6 fragment header is present only
163	   when a datagram has been fragmented, or when the source has received
164	   a "packet too big" ICMPv6 error message indicating that the path
165	   cannot support the required minimum 1280-byte IPv6 MTU and is thus
166	   subject to translation [RFC2460][RFC4443]. The latter case is
167	   relevant only for IPv6 datagrams sent to IPv4 destinations to support
168	   subsequent fragmentation after translation to IPv4.

170	   With the exception of these two cases, the ID field is not present
171	   for non-fragmented datagrams, and thus is meaningful only for
172	   datagrams that are already fragmented or datagrams intended to be
173	   fragmented as part of IPv4 translation. Finally, the IPv6 ID field is
174	   32 bits, and required unique per source/destination address pair for
175	   IPv6, whereas for IPv4 it is only 16 bits and required unique per
176	   source/destination/protocol triple.

178	   This document focuses on the IPv4 ID field issues, because in IPv6
179	   the field is larger and present only in fragments.

181	3.1. Uses of the IPv4 ID Field

183	   The IPv4 ID field was originally intended for fragmentation and
184	   reassembly [RFC791]. Within a given source address, destination
185	   address, and protocol, fragments of an original datagram are matched
186	   based on their IPv4 ID. This requires that IDs are unique within the
187	   address/protocol triple when fragmentation is possible (e.g., DF=0)
188	   or when it has already occurred (e.g., frag_offset>0 or MF=1).

190	   Other uses have been envisioned for the IPv4 ID field. The field has
191	   been proposed as a way to detect and remove duplicate datagrams,
192	   e.g., at congested routers (noted in Sec. 3.2.1.5 of [RFC1122]) or in
193	   network accelerators. It has similarly been proposed for use at end
194	   hosts to reduce the impact of duplication on higher-layer protocols
195	   (e.g., additional processing in TCP, or the need for application-
196	   layer duplicate suppression in UDP). This is also discussed further
197	   in Section 5.1.

199	   The IPv4 ID field is used in some diagnostic tools to correlate
200	   datagrams measured at various locations along a network path. This is
201	   already insufficient in IPv6 because unfragmented datagrams lack an
202	   ID, so these tools are already being updated to avoid such reliance
203	   on the ID field. This is also discussed further in Section 5.1.

205	   The ID clearly needs to be unique (within MDL, within the
206	   src/dst/protocol tuple) to support fragmentation and reassembly, but
207	   not all datagrams are fragmented or allow fragmentation. This
208	   document deprecates non-fragmentation uses, allowing the ID to be
209	   repeated (within MDL, within the src/dst/protocol tuple) in those
210	   cases.

212	3.2. Background on IPv4 ID Reassembly Issues

214	   The following is a summary of issues with IPv4 fragment reassembly in
215	   high speed environments raised previously [RFC4963]. Readers are
216	   encouraged to consult RFC 4963 for a more detailed discussion of
217	   these issues.

219	   With the maximum IPv4 datagram size of 64KB, a 16-bit ID field that
220	   does not repeat within 120 seconds means that the aggregate of all
221	   datagrams of a given protocol between two IP endpoints is
222	   limited to roughly 286 Mbps; at a more typical MTU of 1500 bytes,
223	   this speed drops to 6.4 Mbps [RFC791][RFC1122][RFC4963]. This limit
224	   currently applies for all IPv4 datagrams within a single protocol
225	   (i.e., the IPv4 protocol field) between two IP addresses, regardless
226	   of whether fragmentation is enabled or inhibited, and whether a
227	   datagram is fragmented or not.
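
   The arithmetic behind these limits can be sketched as follows. This
   is a minimal illustration, not part of the specification: it assumes
   a 120-second MDL and that every datagram consumes one ID, and the
   function name max_rate_bps is hypothetical. For 1500-byte datagrams
   it yields a figure slightly above the 6.4 Mbps quoted from RFC 4963;
   the exact sizes assumed there are not reproduced here.

```python
ID_BITS_IPV4 = 16      # width of the IPv4 ID field
MDL_SECONDS = 120      # assumed maximum datagram lifetime (two minutes)


def max_rate_bps(datagram_bytes, id_bits=ID_BITS_IPV4, mdl_seconds=MDL_SECONDS):
    """Upper bound on throughput if no ID may repeat within one MDL."""
    distinct_ids = 2 ** id_bits
    # Each distinct ID can carry at most one datagram per MDL.
    return distinct_ids * datagram_bytes * 8 / mdl_seconds


# Print the bound for the maximum datagram size and for a typical MTU.
for size in (65535, 1500):
    print(f"{size:>5} bytes: {max_rate_bps(size) / 1e6:.1f} Mbps")
```

   The bound scales linearly with datagram size and with the number of
   distinct IDs, which is why a wider ID field raises it so sharply.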

229	   IPv6, even at typical MTUs, is capable of 18.7 Tbps with
230	   fragmentation between two IP endpoints as an aggregate across all
231	   protocols, due to the larger 32-bit ID field (and the fact that the
232	   IPv6 next-header field, the equivalent of the IPv4 protocol field, is
233	   not considered in differentiating fragments). When fragmentation is
234	   not used the field is absent, and in that case IPv6 speeds are not
235	   limited by the ID field uniqueness.

237	   Note also that 120 seconds is only an estimate on the MDL. It is
238	   related to the reassembly timeout as a lower bound and the TCP
239	   Maximum Segment Lifetime as an upper bound (both as noted in
240	   [RFC1122]). Network delays are incurred in other ways, e.g.,
241	   satellite links, which can add seconds of delay even though the TTL
242	   is not decremented by a corresponding amount. There is thus no
243	   enforcement mechanism to ensure that datagrams older than 120 seconds
244	   are discarded.

246	   Wireless Internet devices are frequently connected at speeds over 54
247	   Mbps, and wired links of 1 Gbps have been the default for several
248	   years. Although many end-to-end transport paths are congestion
249	   limited, these devices easily achieve 100+ Mbps application-layer
250	   throughput over LANs (e.g., disk-to-disk file transfer rates), and
251	   numerous throughput demonstrations with COTS systems over wide-area
252	   paths have exhibited these speeds for over a decade. This strongly
253	   suggests that IPv4 ID uniqueness has been moot for a long time.

255	4. Updates to the IPv4 ID Specification

257	   This document updates the specification of the IPv4 ID field in three
258	   distinct ways, as discussed in subsequent subsections:

260	   o  Using the IPv4 ID field only for fragmentation

262	   o  Avoiding a performance impact when the IPv4 ID field is used

264	   o  Encouraging safe operation when the IPv4 ID field is used

266	   There are two kinds of datagrams used in the following discussion,
267	   named as follows:

269	   o  Atomic datagrams are datagrams not yet fragmented and for which
270	      further fragmentation has been inhibited.

272	   o  Non-atomic datagrams are datagrams that either already have been
273	      fragmented or for which fragmentation remains possible.

275	   This same definition can be expressed in pseudocode using common
276	   logical operators (equals is ==, logical 'and' is &&, logical 'or' is
277	   ||, greater than is >, and parentheses function as usual) as:

279	   o  Atomic datagrams: (DF==1)&&(MF==0)&&(frag_offset==0)
280	   o  Non-atomic datagrams: (DF==0)||(MF==1)||(frag_offset>0)

282	   The test for non-atomic datagrams is the logical negation of the test
283	   for atomic datagrams; thus, all possibilities are considered.
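
   The two tests above can also be written as runnable code. The
   following is a sketch only; the function names and the helper that
   reads the flags from a raw header are illustrative assumptions, not
   defined by this document. The helper relies on the RFC 791 layout,
   in which bytes 6-7 of the header carry three flag bits followed by a
   13-bit fragment offset.

```python
import struct


def is_atomic(df, mf, frag_offset):
    """Atomic: not yet fragmented and further fragmentation inhibited."""
    return df == 1 and mf == 0 and frag_offset == 0


def is_non_atomic(df, mf, frag_offset):
    """Non-atomic: already fragmented, or fragmentation still possible."""
    return df == 0 or mf == 1 or frag_offset > 0


def flags_from_header(header):
    """Return (DF, MF, fragment offset) from a raw IPv4 header (bytes)."""
    (word,) = struct.unpack_from("!H", header, 6)
    df = (word >> 14) & 1          # second-highest bit: Don't Fragment
    mf = (word >> 13) & 1          # third-highest bit: More Fragments
    frag_offset = word & 0x1FFF    # low 13 bits: offset in 8-byte units
    return df, mf, frag_offset
```

   Because one predicate is the negation of the other, an
   implementation needs only one of them; both are shown here to
   mirror the definitions.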

285	4.1. IPv4 ID Used Only for Fragmentation

287	   Although RFC1122 suggests the IPv4 ID field has other uses, including
288	   datagram de-duplication, such uses are already not interoperable with
289	   known implementations of sources that do not vary their ID. This
290	   document thus defines this field's value only for fragmentation and
291	   reassembly:

293	   >> The IPv4 ID field MUST NOT be used for purposes other than
294	   fragmentation and reassembly.

296	   Datagram de-duplication is accomplished using hash-based duplicate
297	   detection for cases where the ID field is absent (IPv6 unfragmented
298	   datagrams), which can also be applied to IPv4 atomic datagrams
299	   without utilizing the ID field [RFC6621].
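
   A hash-based duplicate detector that never consults the ID field
   might look like the sketch below. It is an illustration under
   stated assumptions, not the algorithm of RFC 6621: the class name,
   the choice of SHA-256, the fields hashed (source address,
   destination address, protocol, payload), and the fixed-size cache
   are all hypothetical. Addresses are passed as 4-byte strings.

```python
import hashlib
from collections import OrderedDict


class DuplicateDetector:
    """Remembers digests of recently seen datagrams (bounded cache)."""

    def __init__(self, capacity=1024):
        self.capacity = capacity
        self._seen = OrderedDict()

    @staticmethod
    def digest(src, dst, protocol, payload):
        # The IPv4 ID field is deliberately left out of the hash input.
        h = hashlib.sha256()
        h.update(src)
        h.update(dst)
        h.update(bytes([protocol]))
        h.update(payload)
        return h.digest()

    def is_duplicate(self, src, dst, protocol, payload):
        key = self.digest(src, dst, protocol, payload)
        if key in self._seen:
            self._seen.move_to_end(key)      # refresh its position
            return True
        self._seen[key] = None
        if len(self._seen) > self.capacity:
            self._seen.popitem(last=False)   # evict the oldest digest
        return False
```

   Note that such a detector cannot tell a network duplicate from a
   source that intentionally sends identical contents twice; that
   limitation is inherent in content-based detection.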

301	   In atomic datagrams, the IPv4 ID field has no meaning, and thus can
302	   be set to an arbitrary value, i.e., non-repeating IDs within the
303	   source address/destination address/protocol triple are no longer
304	   required for atomic datagrams:

306	   >> Originating sources MAY set the IPv4 ID field of atomic datagrams
307	   to any value.

309	   Likewise, all network nodes, whether intermediate routers,
310	   destination hosts, or other devices (e.g., NATs and other address
311	   sharing mechanisms, firewalls, tunnel egresses), cannot rely on the
312	   ID field of atomic datagrams:

314	   >> All devices that examine IPv4 headers MUST ignore the IPv4 ID
315	   field of atomic datagrams.

317	   The IPv4 ID field is thus meaningful only for non-atomic datagrams -
318	   datagrams that have either already been fragmented, or those for
319	   which fragmentation remains permitted. Atomic datagrams are detected
320	   by their DF, MF, and fragmentation offset fields as explained in
321	   Section 4, because such a test is completely backward compatible;
322	   this document thus does not reserve any IPv4 ID values, including 0,
323	   as distinguished.

325	   Deprecating the use of the IPv4 ID field for non-reassembly uses
326	   should have little - if any - impact. IPv4 IDs are already frequently
327	   repeated, e.g., over even moderately fast connections and from some
328	   sources that do not vary the ID at all, and no adverse impact has
329	   been observed. Duplicate suppression was suggested [RFC1122] and has
330	   been implemented in some protocol accelerators, but no impacts of
331	   IPv4 ID reuse have been noted to date. Routers are not required to
332	   issue ICMPs on any particular timescale, so IPv4 ID repetition should
333	   not have been used for validation; no such use has been observed,
334	   and because repetition already occurs, any resulting problem would
335	   have been noticed [RFC1812]. ICMP relaying at tunnel ingresses is
336	   specified to use soft state rather than a datagram cache; for similar
337	   reasons, use of the latter would have been noticed [RFC2003]. These
338	   and other legacy issues are discussed further in Section 5.1.

340	4.2. Encourage Safe IPv4 ID Use

342	   This document makes further changes to the specification of the IPv4
343	   ID field to encourage its safe use; the corollary requirement
344	   changes are as follows.

346	   RFC 1122 notes that if TCP retransmits a segment, it may be
347	   possible to reuse the IPv4 ID (see Section 6.2). This can make it
348	   difficult for a source to avoid IPv4 ID repetition for received
349	   fragments. RFC 1122 concludes that this behavior "is not useful";
350	   this document formalizes that conclusion as follows:

352	   >> The IPv4 ID of non-atomic datagrams MUST NOT be reused when
353	   sending a copy of an earlier non-atomic datagram.

355	   RFC 1122 also suggests that fragments can overlap [RFC1122]. Such
356	   overlap can occur if successive retransmissions are fragmented in
357	   different ways but with the same reassembly IPv4 ID. This overlap is
358	   noted as the result of reusing IPv4 IDs when retransmitting
359	   datagrams, which this document deprecates. However, it is also the
360	   result of in-network datagram duplication, which can still occur. As
361	   a result this document does not change the need to support
362	   overlapping fragments.

364	4.3. IPv4 ID Requirements That Persist

366	   This document does not relax the IPv4 ID field uniqueness
367	   requirements of [RFC791] for non-atomic datagrams, i.e.:

369	   >> Sources emitting non-atomic datagrams MUST NOT repeat IPv4 ID
370	   values within one MDL for a given source address/destination
371	   address/protocol triple.

373	   Such sources include originating hosts, tunnel ingresses, and NATs
374	   (including other address sharing mechanisms) (see Section 5.3).
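
   One way a source could honor this requirement is sketched below: a
   per-triple sequential allocator that refuses to hand out an ID last
   used less than one MDL ago. This is an assumption-laden
   illustration, not a mandated design; the class name, the 120-second
   MDL default, and the injectable clock are all hypothetical.

```python
import time

MDL_SECONDS = 120.0    # assumed maximum datagram lifetime


class IdAllocator:
    """Hands out IPv4 IDs for non-atomic datagrams, per triple."""

    def __init__(self, mdl=MDL_SECONDS, clock=time.monotonic):
        self.mdl = mdl
        self.clock = clock
        self._next = {}         # triple -> next candidate ID
        self._last_used = {}    # (triple, ID) -> time that ID was last sent

    def allocate(self, src, dst, protocol):
        """Return a fresh ID, or None if it would repeat within one MDL."""
        triple = (src, dst, protocol)
        now = self.clock()
        candidate = self._next.get(triple, 0)
        last = self._last_used.get((triple, candidate))
        if last is not None and now - last < self.mdl:
            return None         # caller must wait rather than reuse the ID
        self._last_used[(triple, candidate)] = now
        self._next[triple] = (candidate + 1) & 0xFFFF   # wrap at 16 bits
        return candidate
```

   Because IDs are issued in sequence, the next candidate is always
   the least recently used one, so checking it alone is sufficient. A
   return value of None signals that the source has exhausted the ID
   space for that triple and has to slow down.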

376	   This document does not relax the requirement that all network devices
377	   honor the DF bit, i.e.:

379	   >> IPv4 datagrams whose DF=1 MUST NOT be fragmented.

381	   >> IPv4 datagram transit devices MUST NOT clear the DF bit.

383	   Specifically, DF=1 prevents fragmenting atomic datagrams. DF=1 also
384	   prevents further fragmenting received fragments. In-network
385	   fragmentation is permitted only when DF=0; this document does not
386	   change that requirement.

388	5. Impact of Proposed Changes

390	   This section discusses the impact of the proposed changes on legacy
391	   devices, datagram generation in updated devices, middleboxes, and
392	   header compression.

394	5.1. Impact on Legacy Internet Devices

396	   Legacy uses of the IPv4 ID field consist of fragment generation,
397	   fragment reassembly, duplicate datagram detection, and "other" uses.

399	   Current devices already generate ID values that are reused within the
400	   source address/destination address/protocol triple in less than the
401	   current estimated Internet MDL of two minutes. They assume that the
402	   MDL over their end-to-end path is much lower.

404	   Existing devices have been known to generate non-varying IDs for
405	   atomic datagrams for nearly a decade, notably some cell phones. Such
406	   constant ID values are the reason for their support as an
407	   optimization of ROHC [RFC5225]. This is discussed further in Section
408	   5.4. Generation of IPv4 datagrams with constant (zero) IDs is also
409	   described as part of the IP/ICMP translation standard [RFC6145].

411	   Many current devices support fragmentation that ignores the IPv4
412	   Don't Fragment (DF) bit. Such devices already transit traffic from
413	   sources that reuse the ID. If fragments of different datagrams
414	   reusing the same ID (within the source/destination/protocol tuple)
415	   arrive at the destination interleaved, reassembly would fail and
416	   traffic would be dropped. Either such interleaving is uncommon, or
417	   traffic from such devices is not widely traversing these DF-ignoring
418	   devices, because significant occurrence of reassembly errors has not
419	   been reported. DF-ignoring devices do not comply with existing
420	   standards, and it is not feasible to update the standards to allow
421	   them as compliant.

423	   The ID field has been envisioned for use in duplicate detection, as
424	   discussed in Section 4.1 [RFC1122]. Although this document now allows
425	   IPv4 ID reuse for atomic datagrams, such reuse is already common (as
426	   noted above). Protocol accelerators are known to implement IPv4
427	   duplicate detection, but such devices are also known to violate other
428	   Internet standards to achieve higher end-to-end performance. These
429	   devices would already exhibit erroneous drops for this current
430	   traffic, and this has not been reported.

432	   There are other potential uses of the ID field, such as for
433	   diagnostic purposes. Such uses already need to accommodate atomic
434	   datagrams with reused ID fields. There are no reports of such uses
435	   having problems with current datagrams that reuse IDs. These and any
436	   other uses of the ID field are encouraged to apply IPv6-compatible
437	   methods for IPv4 as well.

439	   Thus, as a result of previous requirements, this document recommends
440	   that IPv4 duplicate detection and diagnostic mechanisms apply IPv6-
441	   compatible methods, i.e., that do not rely on the ID field (e.g., as
442	   suggested in [RFC6621]). This is a consequence of using the ID field
443	   only for reassembly, as well as the known hazard of existing devices
444	   already reusing the ID field.

446	5.2. Impact on Datagram Generation

448	   The following is a summary of the recommendations that are the result
449	   of the previous changes to the IPv4 ID field specification.

451	   Because atomic datagrams can use arbitrary IPv4 ID values, the ID
452	   field no longer imposes a performance impact in those cases. However,
453	   the performance impact remains for non-atomic datagrams. As a result:

455	   >> Sources of non-atomic IPv4 datagrams MUST rate-limit their output
456	   to comply with the ID uniqueness requirements. Such sources include,
457	   in particular, DNS over UDP [RFC2671].
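
   The rate limit implied by this requirement can be estimated with a
   short calculation, sketched here under the assumption of a
   120-second MDL and one ID per datagram for a single source
   address/destination address/protocol triple. The names below,
   including the Pacer class, are hypothetical.

```python
ID_SPACE = 2 ** 16     # number of distinct 16-bit IPv4 ID values


def max_datagrams_per_second(mdl_seconds=120.0):
    """Highest sustained non-atomic datagram rate without ID reuse."""
    return ID_SPACE / mdl_seconds


def min_send_interval(mdl_seconds=120.0):
    """Smallest average spacing, in seconds, between such datagrams."""
    return mdl_seconds / ID_SPACE


class Pacer:
    """Spaces sends so that no more than ID_SPACE occur in one MDL."""

    def __init__(self, mdl_seconds=120.0):
        self.interval = min_send_interval(mdl_seconds)
        self.next_allowed = 0.0

    def delay_before_send(self, now):
        """Seconds to wait at time `now` before the next send."""
        wait = max(0.0, self.next_allowed - now)
        # Schedule the following send one interval after this one.
        self.next_allowed = max(now, self.next_allowed) + self.interval
        return wait
```

   With the assumed MDL this works out to roughly 546 non-atomic
   datagrams per second per triple, or one every 1.8 milliseconds.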

459	   Because there is no strict definition of the MDL, reassembly hazards
460	   exist regardless of the IPv4 ID reuse interval or the reassembly
461	   timeout. As a result:

463	   >> Higher layer protocols SHOULD verify the integrity of IPv4
464	   datagrams, e.g., using a checksum or hash that can detect reassembly
465	   errors (the UDP checksum is weak in this regard, but better than
466	   nothing).
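
   One well-known weakness of the 16-bit ones' complement checksum is
   that it is insensitive to the order of 16-bit words. The sketch
   below illustrates this with a hypothetical pair of payloads whose
   halves are exchanged, as could happen if pieces were assembled in
   the wrong positions; a cryptographic hash distinguishes them. The
   function computes the checksum over the given bytes only and omits
   the UDP pseudo-header, which is a simplification.

```python
import hashlib


def internet_checksum(data):
    """16-bit ones' complement checksum of the kind used by UDP and TCP."""
    if len(data) % 2:
        data += b"\x00"                       # pad odd-length input
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:                        # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


original = b"AAAA" + b"BBBB"
rearranged = b"BBBB" + b"AAAA"                # same words, different order

same_checksum = internet_checksum(original) == internet_checksum(rearranged)
same_hash = hashlib.sha256(original).digest() == hashlib.sha256(rearranged).digest()
print("checksums equal:", same_checksum, "hashes equal:", same_hash)
```

   The checksum still catches many reassembly errors, which is why it
   is better than nothing, but it cannot be relied upon to catch all
   of them.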

468	   Additional integrity checks can be employed using tunnels, as
469	   supported by SEAL, IPsec, or SCTP [RFC4301][RFC4960][RFC5320]. Such
470	   checks can avoid the reassembly hazards that can occur when using UDP
471	   and TCP checksums [RFC4963], or when using partial checksums as in
472	   UDP-Lite [RFC3828]. Because such integrity checks can avoid the
473	   impact of reassembly errors:

475	   >> Sources of non-atomic IPv4 datagrams using strong integrity checks
476	   MAY reuse the ID within MDL values smaller than is typical.

478	   Note, however, that such frequent reuse can still result in corrupted
479	   reassembly and poor throughput, although it would not propagate
480	   reassembly errors to higher layer protocols.

482	5.3. Impact on Middleboxes

484	   Middleboxes include rewriting devices such as network address
485	   translators (NATs), address/port translators (NAPTs), and other
486	   address sharing mechanisms (ASMs). They also include devices other
487	   than routers that inspect and filter datagrams, such as
488	   accelerators and firewalls.

490	   The changes proposed in this document may not be implemented by
491	   middleboxes; however, these changes are more likely to make current
492	   middlebox behavior compliant than to affect the service provided by
493	   those devices.

495	5.3.1. Rewriting Middleboxes

497	   NATs and NAPTs rewrite IP fields, and tunnel ingresses (using IPv4
498	   encapsulation) copy and modify some IPv4 fields, so all are
499	   considered sources, as are any devices that rewrite any portion of the
500	   source address, destination address, protocol, and ID tuple for any
501	   datagrams [RFC3022]. This is also true for other ASMs, including 4rd,
502	   IVI, and others in the "A+P" (address plus port) family [Bo11] [De11]
503	   [RFC6219]. It is equally true for any other datagram rewriting
504	   mechanism. As a result, they are subject to all the requirements of
505	   any source, as has been noted.

507	   NATs/ASMs/rewriters present a particularly challenging situation for
508	   fragmentation. Because they overwrite portions of the reassembly
509	   tuple in both directions, they can destroy tuple uniqueness and
510	   result in a reassembly hazard. Whenever IPv4 source address,
511	   destination address, or protocol fields are modified, a
512	   NAT/ASM/rewriter needs to ensure that the ID field is generated
513	   appropriately, rather than simply copied from the incoming datagram.
514	   Specifically:

516	   >> Address sharing or rewriting devices MUST ensure that the IPv4 ID
517	   field of datagrams whose addresses or protocols are translated
518	   complies with these requirements as if the datagram were sourced by
519	   that device.

521	   This compliance means that the IPv4 ID field of non-atomic datagrams
522	   translated at a NAT/ASM/rewriter needs to obey the uniqueness
523	   requirements of any IPv4 datagram source. Unfortunately, fragments
524	   already violate that requirement, as they repeat an IPv4 ID within
525	   the MDL for a given source address, destination address, and protocol
526	   triple.

528	   Such problems with transmitting fragments through NATs/ASMs/rewriters
529	   are already known; translation is based on the transport port number,
530	   which is present in only the first fragment anyway [RFC3022]. This
531	   document underscores the point that not only is reassembly (and
532	   possibly subsequent fragmentation) required for translation, it can
533	   be used to avoid issues with IPv4 ID uniqueness.
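
   The behavior described above can be sketched for the outbound
   direction of a rewriting device. This is a simplified illustration
   with hypothetical names (Ipv4Header, Rewriter, translate_outbound):
   it assumes fragments have been reassembled before translation, it
   models only the header fields relevant here, and its counter does
   not by itself enforce uniqueness within one MDL, so a real device
   would also need to limit its output rate.

```python
from dataclasses import dataclass, replace
from itertools import count


@dataclass(frozen=True)
class Ipv4Header:
    src: str
    dst: str
    protocol: int
    ident: int
    df: int
    mf: int
    frag_offset: int


class Rewriter:
    """Rewrites the source address and regenerates IDs where required."""

    def __init__(self, public_addr):
        self.public_addr = public_addr
        self._counters = {}     # translated triple -> ID counter

    def _fresh_id(self, triple):
        counter = self._counters.setdefault(triple, count())
        return next(counter) & 0xFFFF

    def translate_outbound(self, hdr):
        if hdr.mf == 1 or hdr.frag_offset > 0:
            raise ValueError("reassemble fragments before translating")
        new = replace(hdr, src=self.public_addr)
        if hdr.df == 0:
            # Non-atomic: the device acts as the source, so it picks the ID.
            triple = (new.src, new.dst, new.protocol)
            new = replace(new, ident=self._fresh_id(triple))
        # Atomic datagrams keep their ID; its value carries no meaning.
        return new
```

   Two private hosts that happen to choose the same ID toward the same
   destination thus leave the device with different IDs, which removes
   the collision that simple copying would create.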

535	   Note that NATs/ASMs already need to exercise special care when
536	   emitting datagrams on their public side, because merging datagrams
537	   from many sources onto a single outgoing source address can result in
538	   IPv4 ID collisions. This situation precedes this document, and is not
539	   affected by it. It is exacerbated in large-scale, so-called "carrier
540	   grade" NATs [Pe11].

542	   Tunnel ingresses act as sources for the outermost header, but tunnels
543	   act as routers for the inner headers (i.e., the datagram as arriving
544	   at the tunnel ingress). Ingresses can always fragment as originating
545	   sources of the outer header, because they control the uniqueness of
546	   that IPv4 ID field and the value of DF on the outer header
547	   independent of those values on the inner (arriving datagram) header.

549	5.3.2. Filtering Middleboxes

551	   Middleboxes also include devices that filter datagrams, including
552	   network accelerators and firewalls. Some such devices reportedly
553	   feature datagram de-duplication that relies on IP ID uniqueness to
554	   identify duplicates, which has been discussed in Section 5.1.

556	5.4. Impact on Header Compression

558	   Header compression algorithms already accommodate various ways in
559	   which the IPv4 ID changes between sequential datagrams [RFC1144]
560	   [RFC2508] [RFC3545] [RFC5225]. Such algorithms currently assume that
561	   the IPv4 ID is preserved end-to-end. Some algorithms already allow
562	   assuming the ID does not change (e.g., ROHC [RFC5225]), where others
563	   include non-changing IDs via zero deltas (e.g., ECRTP [RFC3545]).

565	   When compression assumes a changing ID as a default, having a non-
566	   changing ID can make compression less efficient. Such non-changing
567	   IDs have been described in various RFCs (e.g., footnote 21 of
568	   [RFC1144] and cRTP [RFC2508]). When compression can assume a non-
569	   changing IPv4 ID - as with ROHC and ECRTP - efficiency can be
570	   increased.
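
   The effect can be illustrated with a small sketch that computes the
   16-bit difference between the IDs of sequential datagrams. The
   function name is hypothetical, and this is not the encoding used by
   any of the cited compression schemes; it only shows that a
   non-changing ID yields zero deltas.

```python
def id_deltas(ids):
    """Return the modulo-2^16 differences between consecutive IPv4 IDs."""
    return [(b - a) % 65536 for a, b in zip(ids, ids[1:])]


incrementing = id_deltas([65534, 65535, 0, 1])  # counter that wraps
constant = id_deltas([5, 5, 5, 5])              # non-changing ID
```

   A compressor that expects the incrementing pattern sees a constant
   nonzero delta, while one that can assume a non-changing ID has
   nothing to encode for this field.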

572	5.5. Impact of Network Reordering and Loss

574	   Tolerance to network reordering and loss is a key feature of the
575	   Internet architecture. Although most current IP networks avoid
576	   gratuitous reordering and loss, both can and do occur. IP already
577	   permits datagrams to be reordered or lost, and recovery from those
578	   events (where supported) already occurs at the transport or higher
579	   protocol layers.

581	   Reordering is typically associated with routing transients or where
582	   multiple alternate paths exist. Loss is typically associated with
583	   path congestion or link failure (partial or complete). The impact of
584	   such events is different for atomic and non-atomic datagrams, and is
585	   discussed below. In summary, the recommendations of this document
586	   make the Internet more robust to reordering and loss by emphasizing
587	   the requirements of ID uniqueness for non-atomic datagrams and by
588	   more clearly indicating the impact of these requirements on both
589	   endpoints and datagram transit devices.

591	5.5.1. Atomic Datagrams Experiencing Reordering or Loss

593	   Reusing ID values does not affect atomic datagrams when the DF bit is
594	   correctly respected, because order restoration does not depend on the
595	   datagram header. TCP uses a transport header sequence number; in some
596	   other protocols, sequence is indicated and restored at the
597	   application layer.

599	   When DF=1 is ignored, reordering or loss can cause fragments of
600	   different datagrams to be interleaved, incorrectly reassembled,
601	   and then discarded. Reuse of ID values in atomic packets,
602	   as permitted by this document, can result in higher datagram loss in
603	   such cases. Such cases already can exist because there are known
604	   devices that use a constant ID for atomic packets (some cellphones),
605	   and there are known devices that ignore DF=1, but high levels of
606	   corresponding loss have not been reported. The lack of such reports
607	   indicates either a lack of reordering or loss in such cases, or a
608	   tolerance to the resulting losses. If such issues are reported, it
609	   would be more productive to address non-compliant devices (that
610	   ignore DF=1), because it is impractical to define Internet
611	   specifications to tolerate devices that ignore those specifications.
612	   This is why this document emphasizes the need to honor DF=1, as well
613	   as that datagram transit devices need to retain the DF bit as
614	   received (i.e., rather than clear it).

616	5.5.2. Non-atomic Datagrams Experiencing Reordering or Loss

618	   Non-atomic datagrams rely on the uniqueness of the ID value to
619	   tolerate reordering of fragments, notably where fragments of
620	   different datagrams are interleaved as a result of such reordering.
621	   Fragment loss can result in reassembly of fragments from different
622	   origin datagrams, which is why ID reuse in non-atomic datagrams is
623	   based on datagram (fragment) maximum lifetime, not just expected
624	   reordering interleaving.
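
   The hazard described above can be sketched with a simplified
   reassembly routine keyed on the (source, destination, protocol, ID)
   tuple. The function name and tuple layout are assumptions for
   illustration, and offsets are expressed in bytes rather than in the
   8-byte units of the actual header field.

```python
def reassemble(fragments):
    """Reassemble fragments given as (key, offset, more_fragments, data).

    key is the (src, dst, proto, id) tuple. Returns a list of
    (key, payload) pairs for datagrams that became complete.
    """
    buffers = {}
    done = []
    for key, offset, more_fragments, data in fragments:
        buf = buffers.setdefault(key, {"parts": {}, "total": None})
        buf["parts"][offset] = data
        if not more_fragments:
            buf["total"] = offset + len(data)
        if buf["total"] is not None:
            # Walk the parts to check for contiguous coverage.
            pos = 0
            while pos in buf["parts"]:
                pos += len(buf["parts"][pos])
            if pos == buf["total"]:
                parts = buf["parts"]
                done.append((key, b"".join(parts[o] for o in sorted(parts))))
                del buffers[key]
    return done


key = ("192.0.2.1", "198.51.100.5", 17, 7)
# The second fragment of one datagram and the first fragment of the
# next are lost; both datagrams used ID 7.
mixed = reassemble([(key, 0, True, b"A" * 8), (key, 8, False, b"D" * 8)])
```

   Because the two surviving fragments share a tuple, the receiver
   joins the first half of one datagram to the second half of another.
   Had the second datagram used a different ID, neither buffer would
   complete and both would eventually time out.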

626	   This document does not change the requirements for uniqueness of IDs
627	   in non-atomic datagrams, and thus does not affect their tolerance to
628	   such reordering or loss. This document emphasizes the need for ID
629	   uniqueness for all datagram sources including rewriting middleboxes,
630	   the need to rate-limit sources to ensure ID uniqueness, the need to
631	   not reuse the ID for retransmitted datagrams, and the need to use
632	   higher-layer integrity checks to prevent reassembly errors - all of
633	   which result in a higher tolerance to reordering or loss events.
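
   The rate limit implied by ID uniqueness follows from simple
   arithmetic: a source has 2^16 ID values to spend per MDL for each
   source address, destination address, and protocol tuple. The sketch
   below works this out. The function name is hypothetical, and the
   MDL of 120 seconds and datagram size of 1500 bytes are example
   values assumed for illustration.

```python
ID_SPACE = 2 ** 16  # number of distinct values in the 16-bit ID field


def max_nonatomic_rate(mdl_seconds, datagram_bytes):
    """Return (datagrams per second, bits per second) that keep IDs
    unique within one MDL for a single (src, dst, proto) tuple."""
    datagrams_per_second = ID_SPACE / mdl_seconds
    bits_per_second = datagrams_per_second * datagram_bytes * 8
    return datagrams_per_second, bits_per_second


# Assumed example values: MDL of 120 s and 1500-byte datagrams.
dps, bps = max_nonatomic_rate(120, 1500)
```

   Under these assumptions the limit is about 546 non-atomic datagrams
   per second, or roughly 6.5 Mbps, per tuple.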

635	6. Updates to Existing Standards

637	   The following sections address the specific changes to existing
638	   protocols indicated by this document.

640	6.1. Updates to RFC 791

642	   RFC 791 states that:

644	      The originating protocol module of an internet datagram sets the
645	      identification field to a value that must be unique for that
646	      source-destination pair and protocol for the time the datagram
647	      will be active in the internet system.

649	   And later that:

651	      Thus, the sender must choose the Identifier to be unique for this
652	      source, destination pair and protocol for the time the datagram
653	      (or any fragment of it) could be alive in the internet.

655	      It seems then that a sending protocol module needs to keep a table
656	      of Identifiers, one entry for each destination it has communicated
657	      with in the last maximum datagram lifetime for the internet.

659	      However, since the Identifier field allows 65,536 different
660	      values, some host may be able to simply use unique identifiers
661	      independent of destination.

663	      It is appropriate for some higher level protocols to choose the
664	      identifier. For example, TCP protocol modules may retransmit an
665	      identical TCP segment, and the probability for correct reception
666	      would be enhanced if the retransmission carried the same
667	      identifier as the original transmission since fragments of either
668	      datagram could be used to construct a correct TCP segment.

670	   This document changes RFC 791 as follows:

672	   o  IPv4 ID uniqueness applies to only non-atomic datagrams.

674	   o  Retransmitted non-atomic IPv4 datagrams are no longer permitted to
675	      reuse the ID value.

677	6.2. Updates to RFC 1122

679	   RFC 1122 states that:

681	        3.2.1.5  Identification: RFC-791 Section 3.2

683	            When sending an identical copy of an earlier datagram, a
684	            host MAY optionally retain the same Identification field in
685	            the copy.

687	            DISCUSSION:

689	            Some Internet protocol experts have maintained that when a
690	            host sends an identical copy of an earlier datagram, the new
691	            copy should contain the same Identification value as the
692	            original.  There are two suggested advantages:  (1) if the
693	            datagrams are fragmented and some of the fragments are lost,
694	            the receiver may be able to reconstruct a complete datagram
695	            from fragments of the original and the copies; (2) a
696	            congested gateway might use the IP Identification field (and
697	            Fragment Offset) to discard duplicate datagrams from the
698	            queue.

700	   This document changes RFC 1122 as follows:

702	   o  The IPv4 ID field is no longer permitted to be used for duplicate
703	      detection. This applies to both atomic and non-atomic datagrams.

705	   o  Retransmitted non-atomic IPv4 datagrams are no longer permitted to
706	      reuse the ID value.

708	6.3. Updates to RFC 2003

710	   This document updates how IPv4-in-IPv4 tunnels create IPv4 ID values
711	   for the IPv4 outer header [RFC2003], but only in the same way as for
712	   any other IPv4 datagram source. Specifically, RFC 2003 states the
713	   following, where ref. [10] is RFC 791:

715	         Identification, Flags, Fragment Offset

717	            These three fields are set as specified in [10]...

719	   This document changes RFC 2003 as follows:

721	   o  The IPv4 ID field is set as permitted by RFCXXXX.

723	7. Security Considerations

725	   When the IPv4 ID is ignored on receipt (e.g., for atomic datagrams),
726	   its value becomes unconstrained; that field then can more easily be
727	   used as a covert channel. For some atomic datagrams it is now
728	   possible, and may be desirable, to rewrite the IPv4 ID field to avoid
729	   its use as such a channel. Rewriting would be prohibited for
730	   datagrams protected by IPsec Authentication Header (AH), although we
731	   do not recommend use of AH to achieve this result [RFC4302].

733	   The IPv4 ID also now adds much less to the entropy of the header of a
734	   datagram. Such entropy might be used as input to cryptographic
735	   algorithms or pseudorandom generators, although IDs have never been
736	   assured sufficient entropy for such purposes. The IPv4 ID had
737	   previously been unique (for a given source/destination address pair
738	   and protocol field) within one MDL, although this requirement was not
739	   enforced and is typically ignored. The IPv4 ID of atomic datagrams
740	   need not be unique, and so contributes no entropy to the header.

742	   The deprecation of the IPv4 ID field's uniqueness for atomic
743	   datagrams can defeat the ability to count devices behind a
744	   NAT/ASM/rewriter [Be02]. This is not intended as a security feature,
745	   however.

747	8. IANA Considerations

749	   There are no IANA considerations in this document.

751	   The RFC Editor should remove this section prior to publication.

753	9. References

755	9.1. Normative References

757	   [RFC791]  Postel, J., "Internet Protocol", RFC 791 / STD 5, September
758	             1981.

760	   [RFC1122] Braden, R., Ed., "Requirements for Internet Hosts -
761	             Communication Layers", RFC 1122 / STD 3, October 1989.

763	   [RFC1812] Baker, F. (Ed.), "Requirements for IP Version 4 Routers",
764	             RFC 1812 / STD 4, Jun. 1995.

766	   [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
767	             Requirement Levels", RFC 2119 / BCP 14, March 1997.

769	   [RFC2003] Perkins, C., "IP Encapsulation within IP", RFC 2003,
770	             October 1996.

772	9.2. Informative References

774	   [Be02]    Bellovin, S., "A Technique for Counting NATted Hosts",
775	             Internet Measurement Conference, Proceedings of the 2nd ACM
776	             SIGCOMM Workshop on Internet Measurement, Nov. 2002.

778	   [Bo11]    Boucadair, M., J. Touch, P. Levis, R. Penno, "Analysis of
779	             Solution Candidates to Reveal a Host Identifier in Shared
780	             Address Deployments", (work in progress), draft-boucadair-
781	             intarea-nat-reveal-analysis, Sept. 2011.

783	   [De11]    Despres, R. (Ed.), S. Matsushima, T. Murakami, O. Troan,
784	             "IPv4 Residual Deployment across IPv6-Service networks
785	             (4rd)", (work in progress), draft-despres-intarea-4rd, Mar.
786	             2011.

788	   [Pe11]    Perreault, S., (Ed.), I. Yamagata, S. Miyakawa, A.
789	             Nakagawa, H. Ashida, "Common requirements of IP address
790	             sharing schemes", (work in progress), draft-ietf-behave-
791	             lsn-requirements, Mar. 2011.

793	   [RFC1144] Jacobson, V., "Compressing TCP/IP Headers", RFC 1144, Feb.
794	             1990.

796	   [RFC2460] Deering, S., R. Hinden, "Internet Protocol, Version 6
797	             (IPv6) Specification", RFC 2460, Dec. 1998.

799	   [RFC2508] Casner, S., V. Jacobson, "Compressing IP/UDP/RTP Headers
800	             for Low-Speed Serial Links", RFC 2508, Feb. 1999.

802	   [RFC2671] Vixie, P., "Extension Mechanisms for DNS (EDNS0)", RFC 2671,
803	             Aug. 1999.

805	   [RFC3022] Srisuresh, P. and K. Egevang, "Traditional IP Network
806	             Address Translator (Traditional NAT)", RFC 3022, Jan. 2001.

808	   [RFC3545] Koren, T., S. Casner, J. Geevarghese, B. Thompson, P.
809	             Ruddy, "Enhanced Compressed RTP (CRTP) for Links with High
810	             Delay, Packet Loss and Reordering", RFC 3545, Jul. 2003.

812	   [RFC3828] Larzon, L-A., M. Degermark, S. Pink, L-E. Jonsson, Ed., G.
813	             Fairhurst, Ed., "The Lightweight User Datagram Protocol
814	             (UDP-Lite)", RFC 3828, Jul. 2004.

816	   [RFC4301] Kent, S., K. Seo, "Security Architecture for the Internet
817	             Protocol", RFC 4301, Dec. 2005.

819	   [RFC4302] Kent, S., "IP Authentication Header", RFC 4302, Dec. 2005.

821	   [RFC4443] Conta, A., S. Deering, M. Gupta (Ed.), "Internet Control
822	             Message Protocol (ICMPv6) for the Internet Protocol Version
823	             6 (IPv6) Specification", RFC 4443, Mar. 2006.

825	   [RFC4960] Stewart, R. (Ed.), "Stream Control Transmission Protocol",
826	             RFC 4960, Sep. 2007.

828	   [RFC4963] Heffner, J., M. Mathis, B. Chandler, "IPv4 Reassembly
829	             Errors at High Data Rates", RFC 4963, Jul. 2007.

831	   [RFC5225] Pelletier, G., K. Sandlund, "RObust Header Compression
832	             Version 2 (ROHCv2): Profiles for RTP, UDP, IP, ESP and UDP-
833	             Lite", RFC 5225, Apr. 2008.

835	   [RFC5320] Templin, F., Ed., "The Subnetwork Encapsulation and
836	             Adaptation Layer (SEAL)", RFC 5320, Feb. 2010.

838	   [RFC6145] Li, X., C. Bao, F. Baker, "IP/ICMP Translation Algorithm",
839	             RFC 6145, Apr. 2011.

841	   [RFC6219] Li, X., C. Bao, M. Chen, H. Zhang, J. Wu, "The China
842	             Education and Research Network (CERNET) IVI Translation
843	             Design and Deployment for the IPv4/IPv6 Coexistence and
844	             Transition", RFC 6219, May 2011.

846	   [RFC6621] Macker, J. (Ed.), "Simplified Multicast Forwarding", RFC
847	             6621, May 2012.

849	10. Acknowledgments

851	   This document was inspired by numerous discussions among the
852	   authors, Jari Arkko, Lars Eggert, Dino Farinacci, and Fred Templin,
853	   as well as members participating in the Internet Area Working Group.
854	   Detailed feedback was provided by Gorry Fairhurst, Brian Haberman,
855	   Ted Hardie, Mike Heard, Erik Nordmark, Carlos Pignataro, and Dan
856	   Wing. This document originated as an Independent Stream draft co-
857	   authored by Matt Mathis, PSC, and his contributions are greatly
858	   appreciated.

860	   This document was prepared using 2-Word-v2.0.template.dot.

862	Author's Address

864	   Joe Touch
865	   USC/ISI
866	   4676 Admiralty Way
867	   Marina del Rey, CA 90292-6695
868	   U.S.A.

870	   Phone: +1 (310) 448-9151
871	   Email: touch@isi.edu