idnits 2.17.1 

draft-irtf-pearg-numeric-ids-generation-07.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (February 2, 2021) is 1178 days in the past.  Is this
     intentional?

  -- Found something which looks like a code comment -- if you have code
     sections in the document, please surround them with '<CODE BEGINS>' and
     '<CODE ENDS>' lines.


  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  ** Obsolete normative reference: RFC  793 (Obsoleted by RFC 9293)

  ** Obsolete normative reference: RFC 2460 (Obsoleted by RFC 8200)

  ** Obsolete normative reference: RFC 4941 (Obsoleted by RFC 8981)

  ** Obsolete normative reference: RFC 6528 (Obsoleted by RFC 9293)

  == Outdated reference: A later version (-11) exists of
     draft-gont-numeric-ids-sec-considerations-06

  == Outdated reference: A later version (-11) exists of
     draft-irtf-pearg-numeric-ids-history-06


     Summary: 4 errors (**), 0 flaws (~~), 3 warnings (==), 2 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Internet Research Task Force (IRTF)                              F. Gont
3	Internet-Draft                                              SI6 Networks
4	Intended status: Informational                                   I. Arce
5	Expires: August 6, 2021                                        Quarkslab
6	                                                        February 2, 2021

8	           On the Generation of Transient Numeric Identifiers
9	               draft-irtf-pearg-numeric-ids-generation-07

11	Abstract

13	   This document performs an analysis of the security and privacy
14	   implications of different types of "transient numeric identifiers"
15	   used in IETF protocols, and tries to categorize them based on their
16	   interoperability requirements and their associated failure severity
17	   when such requirements are not met.  Subsequently, it provides advice
18	   on possible algorithms that could be employed to satisfy the
19	   interoperability requirements of each identifier category, while
20	   minimizing the negative security and privacy implications, thus
21	   providing guidance to protocol designers and protocol implementers.
22	   Finally, it describes a number of algorithms that have been employed
23	   in real implementations to generate transient numeric identifiers,
24	   and analyzes their security and privacy properties.  This document is
25	   a product of the Privacy Enhancement and Assessment Research Group
26	   (PEARG) in the IRTF.

28	Status of This Memo

30	   This Internet-Draft is submitted in full conformance with the
31	   provisions of BCP 78 and BCP 79.

33	   Internet-Drafts are working documents of the Internet Engineering
34	   Task Force (IETF).  Note that other groups may also distribute
35	   working documents as Internet-Drafts.  The list of current Internet-
36	   Drafts is at https://datatracker.ietf.org/drafts/current/.

38	   Internet-Drafts are draft documents valid for a maximum of six months
39	   and may be updated, replaced, or obsoleted by other documents at any
40	   time.  It is inappropriate to use Internet-Drafts as reference
41	   material or to cite them other than as "work in progress."

43	   This Internet-Draft will expire on August 6, 2021.

45	Copyright Notice

47	   Copyright (c) 2021 IETF Trust and the persons identified as the
48	   document authors.  All rights reserved.

50	   This document is subject to BCP 78 and the IETF Trust's Legal
51	   Provisions Relating to IETF Documents
52	   (https://trustee.ietf.org/license-info) in effect on the date of
53	   publication of this document.  Please review these documents
54	   carefully, as they describe your rights and restrictions with respect
55	   to this document.

57	Table of Contents

59	   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   3
60	   2.  Terminology . . . . . . . . . . . . . . . . . . . . . . . . .   5
61	   3.  Threat Model  . . . . . . . . . . . . . . . . . . . . . . . .   5
62	   4.  Issues with the Specification of Transient Numeric
63	       Identifiers . . . . . . . . . . . . . . . . . . . . . . . . .   6
64	   5.  Protocol Failure Severity . . . . . . . . . . . . . . . . . .   7
65	   6.  Categorizing Transient Numeric Identifiers  . . . . . . . . .   7
66	   7.  Common Algorithms for Transient Numeric Identifier Generation  10
67	     7.1.  Category #1: Uniqueness (soft failure)  . . . . . . . . .  10
68	     7.2.  Category #2: Uniqueness (hard failure)  . . . . . . . . .  13
69	     7.3.  Category #3: Uniqueness, stable within context (soft
70	           failure)  . . . . . . . . . . . . . . . . . . . . . . . .  13
71	     7.4.  Category #4: Uniqueness, monotonically increasing within
72	           context (hard failure)  . . . . . . . . . . . . . . . . .  15
73	   8.  Common Vulnerabilities Associated with Transient Numeric
74	       Identifiers . . . . . . . . . . . . . . . . . . . . . . . . .  21
75	     8.1.  Network Activity Correlation  . . . . . . . . . . . . . .  21
76	     8.2.  Information Leakage . . . . . . . . . . . . . . . . . . .  22
77	     8.3.  Fingerprinting  . . . . . . . . . . . . . . . . . . . . .  23
78	     8.4.  Exploitation of the Semantics of Transient Numeric
79	           Identifiers . . . . . . . . . . . . . . . . . . . . . . .  24
80	     8.5.  Exploitation of Collisions of Transient Numeric
81	           Identifiers . . . . . . . . . . . . . . . . . . . . . . .  24
82	     8.6.  Exploitation of Predictable Transient Numeric Identifiers
83	           for Injection Attacks . . . . . . . . . . . . . . . . . .  24
84	     8.7.  Cryptanalysis . . . . . . . . . . . . . . . . . . . . . .  25
85	   9.  Vulnerability Assessment of Transient Numeric Identifiers . .  26
86	     9.1.  Category #1: Uniqueness (soft failure)  . . . . . . . . .  26
87	     9.2.  Category #2: Uniqueness (hard failure)  . . . . . . . . .  26
88	     9.3.  Category #3: Uniqueness, stable within context (soft
89	           failure)  . . . . . . . . . . . . . . . . . . . . . . . .  27
90	     9.4.  Category #4: Uniqueness, monotonically increasing within
91	           context (hard failure)  . . . . . . . . . . . . . . . . .  27
92	   10. IANA Considerations . . . . . . . . . . . . . . . . . . . . .  30
93	   11. Security Considerations . . . . . . . . . . . . . . . . . . .  30
94	   12. Acknowledgements  . . . . . . . . . . . . . . . . . . . . . .  30
95	   13. References  . . . . . . . . . . . . . . . . . . . . . . . . .  30
96	     13.1.  Normative References . . . . . . . . . . . . . . . . . .  30
97	     13.2.  Informative References . . . . . . . . . . . . . . . . .  32
98	   Appendix A.  Algorithms and Techniques with Known Issues  . . . .  37
99	     A.1.  Predictable Linear Identifiers Algorithm  . . . . . . . .  37
100	     A.2.  Random-Increments Algorithm . . . . . . . . . . . . . . .  39
101	     A.3.  Re-using Identifiers Across Different Contexts  . . . . .  40
102	   Authors' Addresses  . . . . . . . . . . . . . . . . . . . . . . .  40

104	1.  Introduction

106	   Networking protocols employ a variety of transient numeric
107	   identifiers for different protocol objects, such as IPv4 and IPv6
108	   Fragment Identifiers [RFC0791] [RFC8200], IPv6 Interface Identifiers
109	   (IIDs) [RFC4291], transport protocol ephemeral port numbers
110	   [RFC6056], TCP Initial Sequence Numbers (ISNs) [RFC0793], and DNS
111	   Transaction IDs (TxIDs) [RFC1035].  These identifiers usually have
112	   specific interoperability requirements (e.g. uniqueness during a
113	   specified period of time) that must be satisfied such that they do
114	   not result in negative interoperability implications, and an
115	   associated failure severity when such requirements are not met,
116	   ranging from soft to hard failures.

118	   For more than 30 years, a large number of implementations of the TCP/
119	   IP protocol suite have been subject to a variety of attacks, with
120	   effects ranging from Denial of Service (DoS) or data injection, to
121	   information leakages that could be exploited for pervasive monitoring
122	   [RFC7258].  The root cause of these issues has been, in many cases,
123	   the poor selection of transient numeric identifiers in such
124	   protocols, usually as a result of insufficient or misleading
125	   specifications.  While it is generally trivial to identify an
126	   algorithm that can satisfy the interoperability requirements of a
127	   given transient numeric identifier, empirical evidence exists that
128	   doing so without negatively affecting the security and/or privacy
129	   properties of the aforementioned protocols is prone to error
130	   [I-D.irtf-pearg-numeric-ids-history].

132	   For example, implementations have been subject to security and/or
133	   privacy issues resulting from:

135	   o  Predictable IPv4 or IPv6 Fragment Identifiers (see e.g.
136	      [Sanfilippo1998a], [RFC6274], and [RFC7739])

138	   o  Predictable IPv6 IIDs (see e.g.  [RFC7721], [RFC7707], and
139	      [RFC7217])

141	   o  Predictable transport protocol ephemeral port numbers (see e.g.
142	      [RFC6056] and [Silbersack2005])

144	   o  Predictable TCP Initial Sequence Numbers (ISNs) (see e.g.
145	      [Morris1985], [Bellovin1989], and [RFC6528])

147	   o  Predictable initial timestamp in TCP timestamps Options (see e.g.
148	      [TCPT-uptime] and [RFC7323])

150	   o  Predictable DNS TxIDs (see e.g.  [Schuba1993] and [Klein2007])

152	   Recent history indicates that when new protocols are standardized or
153	   new protocol implementations are produced, the security and privacy
154	   properties of the associated transient numeric identifiers tend to be
155	   overlooked, and inappropriate algorithms to generate transient
156	   numeric identifiers are either suggested in the specifications or
157	   selected by implementers.  As a result, it should be evident that
158	   advice in this area is warranted.

160	   We note that the use of cryptographic techniques may readily mitigate
161	   some of the issues arising from predictable transient numeric
162	   identifiers.  For example, cryptographic integrity and authentication
163	   can readily mitigate data injection attacks even in the presence of
164	   predictable transient numeric identifiers (such as "sequence
165	   numbers").  However, use of flawed algorithms (such as global
166	   counters) for generating transient numeric identifiers could still
167	   result in information leakages even when cryptographic techniques are
168	   employed.

170	   This document contains a non-exhaustive survey of transient numeric
171	   identifiers employed in various IETF protocols, and aims to
172	   categorize such identifiers based on their interoperability
173	   requirements, and the associated failure severity when such
174	   requirements are not met.  Subsequently, it provides advice on
175	   possible algorithms that could be employed to satisfy the
176	   interoperability requirements of each category, while minimizing
177	   negative security and privacy implications.  Finally, it analyzes
178	   several algorithms that have been employed in real implementations to
179	   meet such requirements, and analyzes their security and privacy
180	   properties.

182	   This document represents the consensus of the Privacy Enhancement and
183	   Assessment Research Group (PEARG).

185	2.  Terminology

187	   Transient Numeric Identifier:
188	      A data object in a protocol specification that can be used to
189	      definitely distinguish a protocol object (a datagram, network
190	      interface, transport protocol endpoint, session, etc.) from all
191	      other objects of the same type, in a given context.  Transient
192	      numeric identifiers are usually defined as a series of bits, and
193	      represented using integer values.  These identifiers are typically
194	      dynamically selected, as opposed to statically-assigned numeric
195	      identifiers (see e.g.  [IANA-PROT]).  We note that different
196	      transient numeric identifiers may have additional requirements or
197	      properties depending on their specific use in a protocol.  We use
198	      the term "transient numeric identifier" (or simply "numeric
199	      identifier" or "identifier" as short forms) as a generic term to
200	      refer to any data object in a protocol specification that
201	      satisfies the identification property stated above.

203	   Failure Severity:
204	      The consequences of a failure to comply with the interoperability
205	      requirements of a given identifier.  Severity considers the worst
206	      potential consequence of a failure, determined by the system
207	      damage and/or time lost to repair the failure.  In this document
208	      we define two types of failure severity: "soft failure" and "hard
209	      failure".

211	   Soft Failure:
212	      A soft failure is a recoverable condition in which a protocol does
213	      not operate in the prescribed manner but normal operation can be
214	      resumed automatically in a short period of time.  For example, a
215	      simple packet-loss event that is subsequently recovered with a
216	      packet-retransmission can be considered a soft failure.

218	   Hard Failure:
219	      A hard failure is a non-recoverable condition in which a protocol
220	      does not operate in the prescribed manner or it operates with
221	      excessive degradation of service.  For example, an established TCP
222	      connection that is aborted due to an error condition constitutes,
223	      from the point of view of the transport protocol, a hard failure,
224	      since it enters a state from which normal operation cannot be
225	      resumed.

227	3.  Threat Model

229	   Throughout this document, we assume an attacker does not have
230	   physical or logical access to the system(s) being attacked, and
231	   cannot observe the packets being transferred between the sender and
232	   the receiver(s) of the target protocol (if any).  However, we assume
233	   the attacker can send any traffic to the target device(s), to e.g.
234	   sample transient numeric identifiers employed by such device(s).

236	4.  Issues with the Specification of Transient Numeric Identifiers

238	   While assessing protocol specifications regarding the use of
239	   transient numeric identifiers, we have found that most of the issues
240	   discussed in this document arise as a result of one of the following
241	   conditions:

243	   o  Protocol specifications that under-specify the requirements for
244	      their transient numeric identifiers

246	   o  Protocol specifications that over-specify their transient numeric
247	      identifiers

249	   o  Protocol implementations that simply fail to comply with the
250	      specified requirements

252	   A number of protocol specifications (too many of them) have simply
253	   overlooked the security and privacy implications of transient numeric
254	   identifiers [I-D.irtf-pearg-numeric-ids-history].  Examples of them
255	   are the specification of TCP ephemeral ports in [RFC0793], the
256	   specification of TCP sequence numbers in [RFC0793], or the
257	   specification of the DNS TxID in [RFC1035].

259	   On the other hand, there are a number of protocol specifications that
260	   over-specify some of their associated transient numeric identifiers.
261	   For example, [RFC4291] essentially overloads the semantics of IPv6
262	   Interface Identifiers (IIDs) by embedding link-layer addresses in the
263	   IPv6 IIDs, when the interoperability requirement of uniqueness could
264	   be achieved in other ways that do not result in negative security and
265	   privacy implications [RFC7721].  Similarly, [RFC2460] suggested the
266	   use of a global counter for the generation of Fragment Identification
267	   values, when the interoperability properties of uniqueness per {IPv6
268	   Source Address, IPv6 Destination Address} could be achieved with
269	   other algorithms that do not result in negative security and privacy
270	   implications [RFC7739].

272	   Finally, there are protocol implementations that simply fail to
273	   comply with existing protocol specifications.  For example, some
274	   popular operating systems (notably Microsoft Windows) still fail to
275	   implement transport protocol ephemeral port randomization, as
276	   recommended in [RFC6056].

278	5.  Protocol Failure Severity

280	   Section 2 defines the concept of "Failure Severity", along with two
281	   types of failure severities that we employ throughout this document:
282	   soft and hard.

284	   Our analysis of the severity of a failure is performed from the point
285	   of view of the protocol in question.  However, the corresponding
286	   severity on the upper protocol (or application) might not be the same
287	   as that of the protocol in question.  For example, a TCP connection
288	   that is aborted might or might not result in a hard failure of the
289	   upper application: if the upper application can establish a new TCP
290	   connection without any impact on the application, a hard failure at
291	   the TCP protocol may have no severity at the application level.  On
292	   the other hand, if a hard failure of a TCP connection results in
293	   excessive degradation of service at the application layer, it will
294	   also result in a hard failure at the application.

296	6.  Categorizing Transient Numeric Identifiers

298	   This section includes a non-exhaustive survey of transient numeric
299	   identifiers, and proposes a number of categories that can accommodate
300	   these identifiers based on their interoperability requirements and
301	   their associated failure severity (soft or hard)

303	   +------------------+---------------------------------+--------------+
304	   |    Identifier    |  Interoperability Requirements  |   Failure    |
305	   |                  |                                 |   Severity   |
306	   +------------------+---------------------------------+--------------+
307	   |   IPv6 Frag ID   |    Uniqueness (for IP address   |  Soft/Hard   |
308	   |                  |              pair)              |     (1)      |
309	   +------------------+---------------------------------+--------------+
310	   |     IPv6 IID     |  Uniqueness (and stable within  |   Soft (3)   |
311	   |                  |         IPv6 prefix) (2)        |              |
312	   +------------------+---------------------------------+--------------+
313	   |     TCP ISN      |   Monotonically-increasing (4)  |   Hard (4)   |
314	   +------------------+---------------------------------+--------------+
315	   |   TCP initial    |   Monotonically-increasing (5)  |   Hard (5)   |
316	   |    timestamps    |                                 |              |
317	   +------------------+---------------------------------+--------------+
318	   |  TCP eph. port   |  Uniqueness (for connection ID) |     Hard     |
319	   +------------------+---------------------------------+--------------+
320	   | IPv6 Flow Label  |            Uniqueness           |   None (6)   |
321	   +------------------+---------------------------------+--------------+
322	   |     DNS TxID     |            Uniqueness           |   None (7)   |
323	   +------------------+---------------------------------+--------------+

325	             Table 1: Survey of Transient Numeric Identifiers

327	   NOTE:

329	   (1)
330	      While a single collision of Fragment ID values would simply lead
331	      to a single packet drop (and hence a "soft" failure), repeated
332	      collisions at high data rates might trash the Fragment ID space,
333	      leading to a hard failure [RFC4963].

335	   (2)
336	      While the interoperability requirements are simply that the
337	      Interface ID results in a unique IPv6 address, for operational
338	      reasons it is typically desirable that the resulting IPv6 address
339	      (and hence the corresponding Interface ID) be stable within each
340	      network [RFC7217] [RFC8064].

342	   (3)
343	      While IPv6 Interface IDs must result in unique IPv6 addresses,
344	      IPv6 Duplicate Address Detection (DAD) [RFC4862] allows for the
345	      detection of duplicate addresses, and hence such Interface ID
346	      collisions can be recovered.

348	   (4)
349	      In theory, there are no interoperability requirements for TCP
350	      Initial Sequence Numbers (ISNs), since the TIME-WAIT state and
351	      TCP's "quiet time" concept take care of old segments from previous
352	      incarnations of a connection.  However, a widespread optimization
353	      allows for a new incarnation of a previous connection to be
354	      created if the ISN of the incoming SYN is larger than the last
355	      sequence number seen in that direction for the previous
356	      incarnation of the connection.  Thus, monotonically-increasing TCP
357	      ISNs allow for such optimization to work as expected [RFC6528],
358	      and can help avoid connection-establishment failures.

360	   (5)
361	      Strictly speaking, there are no interoperability requirements for
362	      the *initial* TCP timestamp employed by a TCP instance (i.e., the
363	      TS Value (TSval) in a segment with the SYN bit set).  However,
364	      some TCP implementations allow a new incarnation of a previous
365	      connection to be created if the TSval of the incoming SYN is
366	      larger than the last TSval seen in that direction for the previous
367	      incarnation of the connection (please see [RFC6191]).  Thus,
368	      monotonically-increasing TCP initial timestamps (across
369	      connections to the same endpoint) allow for such optimization to
370	      work as expected [RFC6191], and can help avoid connection-
371	      establishment failures.

373	   (6)
374	      The IPv6 Flow Label is typically employed for load sharing
375	      [RFC7098], along with the Source and Destination IPv6 addresses.
376	      Reuse of a Flow Label value for the same set {Source Address,
377	      Destination Address} would typically cause both flows to be
378	      multiplexed onto the same link.  However, as long as this does not
379	      occur deterministically, it will not result in any negative
380	      implications.

382	   (7)
383	      DNS TxIDs are employed, together with the Source Address,
384	      Destination Address, Source Port, and Destination Port, to match
385	      DNS requests and responses.  However, since an implementation
386	      knows which DNS requests were sent for that set of {Source
387	      Address, Destination Address, Source Port, and Destination Port,
388	      DNS TxID}, a collision of TxID would result, if anything, in a
389	      small performance penalty (the response would nevertheless be
390	      discarded when it is found that it does not answer the query sent
391	      in the corresponding DNS query).

393	   Based on the survey above, we can categorize identifiers as follows:

395	   +-----+---------------------------------------+---------------------+
396	   | Cat |                Category               |   Sample Proto IDs  |
397	   |  #  |                                       |                     |
398	   +-----+---------------------------------------+---------------------+
399	   |  1  |       Uniqueness (soft failure)       |  IPv6 Flow L., DNS  |
400	   |     |                                       |        TxIDs        |
401	   +-----+---------------------------------------+---------------------+
402	   |  2  |       Uniqueness (hard failure)       |  IPv6 Frag ID, TCP  |
403	   |     |                                       |    ephemeral port   |
404	   +-----+---------------------------------------+---------------------+
405	   |  3  |   Uniqueness, stable within context   |      IPv6 IIDs      |
406	   |     |             (soft failure)            |                     |
407	   +-----+---------------------------------------+---------------------+
408	   |  4  |  Uniqueness, monotonically increasing |     TCP ISN, TCP    |
409	   |     |     within context (hard failure)     |  initial timestamps |
410	   +-----+---------------------------------------+---------------------+

412	                      Table 2: Identifier Categories

414	   We note that Category #4 could be considered a generalized case of
415	   category #3, in which a monotonically increasing element is added to
416	   a stable (within context) element, such that the resulting
417	   identifiers are monotonically increasing within a specified context.
418	   That is, the same algorithm could be employed for both #3 and #4,
419	   given appropriate parameters.

421	7.  Common Algorithms for Transient Numeric Identifier Generation

423	   The following subsections describe some sample algorithms that can be
424	   employed for generating transient numeric identifiers for each of the
425	   categories above.

427	   All of the variables employed in the algorithms of the following
428	   subsections are of "unsigned integer" type, except for the "retry"
429	   variable, that is of (signed) "integer" type.

431	7.1.  Category #1: Uniqueness (soft failure)

433	   The requirement of uniqueness with a soft failure severity can be
434	   complied with a Pseudo-Random Number Generator (PRNG).

436	   We note that since the premise is that collisions of transient
437	   numeric identifiers of this category only leads to soft failures, in
438	   many cases, the algorithm might not need to check the suitability of
439	   a selected identifier (i.e., suitable_id() could always return
440	   "true").

442	   In scenarios where e.g. simultaneous use of a given numeric ID is
443	   undesirable and the implementation detects such condition, an
444	   implementation may opt to select the next available identifier in the
445	   same sequence, or select another random number.  Section 7.1.1 is an
446	   implementation of the former strategy, while Section 7.1.2 is an
447	   implementation of the later.  Typically, the algorithm in
448	   Section 7.1.2 results in a more uniform distribution of the generated
449	   transient numeric identifiers.  However, for transient numeric
450	   identifiers where an implementation typically keeps local state about
451	   unsuitable/used identifiers, the algorithm in Section 7.1.2 may
452	   require many more iterations than the algorithm in Section 7.1.1 to
453	   generate a suitable transient numeric identifier.  This will usually
454	   be affected by the current usage ratio of transient numeric
455	   identifiers (i.e., number of numeric identifiers considered suitable
456	   / total number of numeric identifiers) and other parameters.
457	   Therefore, in such cases many implementations tend to prefer the
458	   algorithm in Section 7.1.1 over the algorithm in Section 7.1.2.

460	7.1.1.  Simple Randomization Algorithm
461	       /* Transient Numeric ID selection function */

463	       id_range = max_id - min_id + 1;
464	       next_id = min_id + (random() % id_range);
465	       retry = id_range;

467	       do {
468	           if (suitable_id(next_id)) {
469	               return next_id;
470	           }

472	           if (next_id == max_id) {
473	               next_id = min_id;
474	           } else {
475	               next_id++;
476	           }

478	           retry--;

480	       } while (retry > 0);

482	       return ERROR;

484	   NOTE:
485	      random() is a function that returns a pseudo-random unsigned
486	      integer number of appropriate size.  Note that the output needs to
487	      be unpredictable, and typical implementations of the POSIX
488	      random() function do not necessarily meet this requirement.  See
489	      [RFC4086] for randomness requirements for security.  Beware that
490	      "adapting" the length of the output of random() with a modulo
491	      operator (e.g., C language's "%") may change the distribution of
492	      the PRNG.

494	      The function suitable_id() can check, when possible and desirable,
495	      whether a selected transient numeric identifier is suitable (e.g.
496	      it is not already in use).  Depending on how/where the numeric
497	      identifier is used, it may or may not be possible (or even
498	      desirable) to check whether the numeric identifier is in use (or
499	      whether it has been recently employed).  When an identifier is
500	      found to be unsuitable, this algorithm selects the next available
501	      numeric identifier in sequence.

503	      Even when this algorithm selects numeric IDs randomly, it is
504	      biased towards the first available numeric ID after a sequence of
505	      unavailable numeric IDs.  For example, if this algorithm is
506	      employed for transport protocol ephemeral port randomization
507	      [RFC6056] and the local list of unsuitable port numbers (e.g.,
508	      registered port numbers that should not be used for ephemeral
509	      ports) is significant, an attacker may actually have a
510	      significantly better chance of guessing a port number.

512	      All the variables (in this and all the algorithms discussed in
513	      this document) are unsigned integers.

515	   Assuming the randomness requirements for the PRNG are met (see
516	   [RFC4086]), this algorithm does not suffer from any of the issues
517	   discussed in Section 8.

519	7.1.2.  Another Simple Randomization Algorithm

521	   The following pseudo-code illustrates another algorithm for selecting
522	   a random transient numeric identifier which, in the event a selected
523	   identifier is found to be unsuitable (e.g., already in use), another
524	   identifier is randomly selected:

526	       /* Transient Numeric ID selection function */

528	       id_range = max_id - min_id + 1;
529	       retry = id_range;

531	       do {
532	           next_id = min_id + (random() % id_range);

534	           if (suitable_id(next_id)) {
535	               return next_id;
536	           }

538	           retry--;

540	       } while (retry > 0);

542	       return ERROR;

544	   This algorithm might be unable to select a transient numeric
545	   identifier (i.e., return "ERROR") even if there are suitable
546	   identifiers available, in cases where a large number of identifiers
547	   are found to be unsuitable (e.g. "in use").

549	   The same considerations from Section 7.1.1 with respect to the
550	   properties of random() and the adaptation of its output length apply
551	   to this algorithm.

553	   Assuming the randomness requirements for the PRNG are met (see
554	   [RFC4086]), this algorithm does not suffer from any of the issues
555	   discussed in Section 8.

557	7.2.  Category #2: Uniqueness (hard failure)

559	   One of the most trivial approaches for generating unique transient
560	   numeric identifier (with a hard failure severity) is to reduce the
561	   identifier reuse frequency by generating the numeric identifiers with
562	   a monotonically-increasing function (e.g. linear).  As a result, any
563	   of the algorithms described in Section 7.4 ("Category #4: Uniqueness,
564	   monotonically increasing within context (hard failure)") can be
565	   readily employed for complying with the requirements of this
566	   transient numeric identifier category.

568	   In cases where suitability (e.g. uniqueness) of the selected
569	   identifiers can be definitely assessed by the local system, any of
570	   the algorithms described in Section 7.1 ("Category #1: Uniqueness
571	   (soft failure)") can be readily employed for complying with the
572	   requirements of this numeric identifier category.

574	   NOTE:
575	      In the case of e.g.  TCP ephemeral ports or TCP ISNs, a transient
576	      numeric identifier that might seem suitable from the perspective
577	      of the local system, might actually be unsuitable from the
578	      perspective of the remote system (e.g., because there is state
579	      associated with the selected identifier at the remote system).
580	      Therefore, in such cases it is not possible employ the algorithms
581	      from Section 7.1 ("Category #1: Uniqueness (soft failure)").

583	7.3.  Category #3: Uniqueness, stable within context (soft failure)

585	   The goal of the following algorithm is to produce identifiers that
586	   are stable for a given context (identified by "CONTEXT"), but that
587	   change when the aforementioned context changes.

589	   In order to avoid storing in memory the transient numeric identifiers
590	   computed for each CONTEXT, the following algorithm employs a
591	   calculated technique (as opposed to keeping state in memory) to
592	   generate a stable transient numeric identifier for each given
593	   context.

595	       /* Transient Numeric ID selection function  */

597	       id_range = max_id - min_id + 1;

599	       retry = 0;

601	       do {
602	           offset = F(CONTEXT, retry, secret_key);
603	           next_id = min_id + (offset % id_range);

605	           if (suitable_id(next_id)) {
606	               return next_id;
607	           }

609	           retry++;

611	       } while (retry <= MAX_RETRIES);

613	       return ERROR;

615	   In this algorithm, the function F() provides a stateless and stable
616	   per-CONTEXT offset, where CONTEXT is the concatenation of all the
617	   elements that define the given context.

619	      For example, if this algorithm is expected to produce IPv6 IIDs
620	      that are unique per network interface and SLAAC autoconfiguration
621	      prefix, the CONTEXT should be the concatenation of e.g. the
622	      network interface index and the SLAAC autoconfiguration prefix
623	      (please see [RFC7217] for an implementation of this algorithm for
624	      generation of stable IPv6 IIDs).

626	   F() is a pseudorandom function (PRF).  It must not computable from
627	   the outside (without knowledge of the secret key).  F() must also be
628	   difficult to reverse, such that it resists attempts to obtain the
629	   secret_key, even when given samples of the output of F() and
630	   knowledge or control of the other input parameters.  F() should
631	   produce an output of at least as many bits as required for the
632	   transient numeric identifier.  SipHash-2-4 (128-bit key, 64-bit
633	   output) [SipHash] and BLAKE3 (256-bit key, arbitrary-length output)
634	   [BLAKE3] are two possible options for F().  Alternatively, F() could
635	   be implemented with a keyed-hash message authentication code (HMAC)
636	   [RFC2104].  HMAC-SHA-256 [FIPS-SHS] would be one possible option for
637	   such implementation alternative.  Note: Use of HMAC-MD5 [RFC1321] is
638	   not recommended for F() [RFC6151].

640	   The result of F() is no more secure than the secret key, and
641	   therefore 'secret_key' must be unknown to the attacker, and must be
642	   of a reasonable length. 'secret_key' must remain stable for a given
643	   CONTEXT, since otherwise the numeric identifiers generated by this
644	   algorithm would not have the desired stability properties (i.e.,
645	   stable for a given CONTEXT).  In most cases, 'secret_key' should be
646	   selected with a PRNG (see [RFC4086] for recommendations on choosing
647	   secrets) at an appropriate time, and stored in stable or volatile
648	   storage (as necessary) for future use.

650	   The result of F() is stored in the variable 'offset', which may take
651	   any value within the storage type range, since we are restricting the
652	   resulting identifier to be in the range [min_id, max_id] in a similar
653	   way as in the algorithm described in Section 7.1.1.

655	   suitable_id() checks whether the candidate identifier has suitable
656	   uniqueness properties.  Collisions (i.e., an identifier that is not
657	   unique) are recovered by incrementing the 'retry' variable and
658	   recomputing F(), up to a maximum of MAX_RETRIES times.  However,
659	   recovering from collisions will usually result in identifiers that
660	   fail to remain constant for the specified context.  This is normally
661	   acceptable when the probability of collisions is small, as in the
662	   case of e.g.  IPv6 IIDs resulting from SLAAC [RFC7217] [RFC4941].

664	   For obvious reasons, the transient numeric identifiers generated with
665	   this algorithm allow for network activity correlation and
666	   fingerprinting within "CONTEXT".  However, this is essentially a
667	   design goal of this category of transient numeric identifiers.

669	7.4.  Category #4: Uniqueness, monotonically increasing within context
670	      (hard failure)

672	7.4.1.  Per-context Counter Algorithm

674	   One possible way of selecting unique monotonically-increasing
675	   identifiers (per context) is to employ a per-context counter.  Such
676	   an algorithm could be described as follows:

678	       /* Transient Numeric ID selection function */

680	       id_range = max_id - min_id + 1;
681	       retry = id_range;
682	       id_inc = increment() % id_range;

684	       if( (next_id = lookup_counter(CONTEXT)) == ERROR){
685	            next_id = min_id + random() % id_range;
686	       }

688	       do {
689	           if ( (max_id - next_id) >= id_inc){
690	               next_id = next_id + id_inc;
691	           }
692	           else {
693	               next_id = min_id + id_inc - (max_id - next_id);
694	           }

696	           if (suitable_id(next_id)){
697	               store_counter(CONTEXT, next_id);
698	               return next_id;
699	           }

701	           retry = retry - id_inc;

703	       } while (retry > 0);

705	       return ERROR;

707	   NOTE:
708	      increment() returns a small integer that is employed to increment
709	      the current counter value to obtain the next transient numeric
710	      identifier.  This value must be much smaller than the number of
711	      possible values for the numeric IDs (i.e., "id_range").  Most
712	      implementations of this algorithm employ a constant increment of
713	      1.  Using a value other than 1 can help mitigate some information
714	      leakages (please see below), at the expense of a possible increase
715	      in the numeric ID reuse frequency.

717	      The code above makes sure that the increment employed in the
718	      algorithm (id_inc) is always smaller than the number of possible
719	      values for the numeric IDs (i.e., "max_id - min_d + 1").  However,
720	      as noted above, this value must also be much smaller than the
721	      number of possible values for the numeric IDs.

723	      lookup_counter() is a function that returns the current counter
724	      for a given context, or an error condition if that counter does
725	      not exist.

727	      store_counter() is a function that saves a counter value for a
728	      given context.

730	      suitable_id() is a function that checks whether the resulting
731	      identifier is acceptable (e.g., whether it is not already in use,
732	      etc.).

734	   Essentially, whenever a new identifier is to be selected, the
735	   algorithm checks whether a counter for the corresponding context
736	   exists.  If does, the value of such counter is incremented to obtain
737	   the new transient numeric identifier, and the counter is updated.  If
738	   no counter exists for such context, a new counter is created and
739	   initialized to a random value, and used as the selected transient
740	   numeric identifier.  This algorithm produces a per-context counter,
741	   which results in one monotonically-increasing function for each
742	   context.  Since each counter is initialized to a random value, the
743	   resulting values are unpredictable by an off-path attacker.

745	   The choice of id_inc has implications on both the security and
746	   privacy properties of the resulting identifiers, but also on the
747	   corresponding interoperability properties.  On one hand, minimizing
748	   the increments generally minimizes the identifier reuse frequency,
749	   albeit at increased predictability.  On the other hand, if the
750	   increments are randomized, predictability of the resulting
751	   identifiers is reduced, and the information leakage produced by
752	   global constant increments is mitigated.  However, using larger
753	   increments than necessary can result in higher numeric ID reuse
754	   frequency.

756	   This algorithm has the following drawbacks:

758	   o  It requires an implementation to store each per-CONTEXT counter in
759	      memory.  If, as a result of resource management, the counter for a
760	      given context must be removed, the last transient numeric
761	      identifier value used for that context will be lost.  Thus, if
762	      subsequently an identifier needs to be generated for the same
763	      context, the corresponding counter will need to be recreated and
764	      reinitialized to a random value, thus possibly leading to reuse/
765	      collision of numeric identifiers.

767	   o  Keeping one counter for each possible "context" may in some cases
768	      be considered too onerous in terms of memory requirements.

770	   Otherwise, the identifiers produced by this algorithm do not suffer
771	   from the other issues discussed in Section 8.

773	7.4.2.  Simple PRF-Based Algorithm

775	   The goal of this algorithm is to produce monotonically-increasing
776	   transient numeric identifiers (for each given context), with a
777	   randomized initial value.  For example, if the identifiers being
778	   generated must be monotonically-increasing for each {IP Source
779	   Address, IP Destination Address} set, then each possible combination
780	   of {IP Source Address, IP Destination Address} should have a separate
781	   monotonically-increasing sequence, that starts at a different random
782	   value.

784	   Instead of maintaining a per-context counter (as in the algorithm
785	   from Section 7.4.1), the following algorithm employs a calculated
786	   technique to maintain a random offset for each possible context.

788	       /* Initialization code */
789	       counter = 0;

791	       /* Transient Numeric ID selection function  */

793	       id_range = max_id - min_id + 1;
794	       id_inc = increment() % id_range;
795	       offset = F(CONTEXT, secret_key);
796	       retry = id_range;

798	       do {
799	           next_id = min_id + (offset + counter) % id_range;
800	           counter = counter + id_inc;

802	           if (suitable_id(next_id)) {
803	               return next_id;
804	           }

806	           retry = retry - id_inc;

808	       } while (retry > 0);

810	       return ERROR;

812	   In the algorithm above, the function F() provides a (stateless)
813	   unpredictable offset for each given context (as identified by
814	   'CONTEXT').

816	   F() is a PRF, with the same properties as those specified for F() in
817	   Section 7.3.

819	   CONTEXT is the concatenation of all the elements that define a given
820	   context.  For example, if this algorithm is expected to produce
821	   identifiers that are monotonically-increasing for each set (Source IP
822	   Address, Destination IP Address), CONTEXT should be the concatenation
823	   of these two IP addresses.

825	   The function F() provides a "per-CONTEXT" fixed offset within the
826	   numeric identifier "space".  Both the 'offset' and 'counter'
827	   variables may take any value within the storage type range since we
828	   are restricting the resulting identifier to be in the range [min_id,
829	   max_id] in a similar way as in the algorithm described in
830	   Section 7.1.1.  This allows us to simply increment the 'counter'
831	   variable and rely on the unsigned integer to wrap around.

833	   The result of F() is no more secure than the secret key, and
834	   therefore 'secret_key' must be unknown to the attacker, and must be
835	   of a reasonable length. 'secret_key' must remain stable for a given
836	   CONTEXT, since otherwise the numeric identifiers generated by this
837	   algorithm would not have the desired stability properties (i.e.,
838	   monotonically-increasing for a given CONTEXT).  In most cases,
839	   'secret_key' should be selected with a PRNG (see [RFC4086] for
840	   recommendations on choosing secrets) at an appropriate time, and
841	   stored in stable or volatile storage (as necessary) for future use.

843	   It should be noted that, since this algorithm uses a global counter
844	   ("counter") for selecting identifiers (i.e., all counters share the
845	   same increments space), this algorithm results in an information
846	   leakage (as described in Section 8.2).  For example, if this
847	   algorithm were used for selecting TCP ephemeral ports, and an
848	   attacker could force a client to periodically establish a new TCP
849	   connection to an attacker-controlled system (or through an attacker-
850	   observable routing path), the attacker could subtract consecutive
851	   source port values to obtain the number of outgoing TCP connections
852	   established globally by the victim host within that time period (up
853	   to wrap-around issues and five-tuple collisions, of course).  This
854	   information leakage could be partially mitigated by employing small
855	   random values for the increments (i.e., increment() function),
856	   instead of having increment() return the constant "1".

858	   We nevertheless note that an improved mitigation of this information
859	   leakage could be more successfully achieved by employing the
860	   algorithm from Section 7.4.3, instead.

862	7.4.3.  Double-PRF Algorithm

864	   A trade-off between maintaining a single global 'counter' variable
865	   and maintaining 2**N 'counter' variables (where N is the width of the
866	   result of F()), could be achieved as follows.  The system would keep
867	   an array of TABLE_LENGTH values, which would provide a separation of
868	   the increment space into multiple buckets.  This improvement could be
869	   incorporated into the algorithm from Section 7.4.2 as follows:

871	       /* Initialization code */

873	       for(i = 0; i < TABLE_LENGTH; i++) {
874	           table[i] = random();
875	       }

877	       /* Transient Numeric ID selection function */

879	       id_range = max_id - min_id + 1;
880	       id_inc = increment() % id_range;
881	       offset = F(CONTEXT, secret_key1);
882	       index = G(CONTEXT, secret_key2) % TABLE_LENGTH;
883	       retry = id_range;

885	       do {
886	           next_id = min_id + (offset + table[index]) % id_range;
887	           table[index] = table[index] + id_inc;

889	           if (suitable_id(next_id)) {
890	               return next_id;
891	           }

893	          retry = retry - id_inc;

895	       } while (retry > 0);

897	       return ERROR;

899	   'table[]' could be initialized with random values, as indicated by
900	   the initialization code in the pseudo-code above.

902	   Both F() and G() are PRFs, with the same properties as those required
903	   for F() in Section 7.3.

905	   The results of F() and G() are no more secure than their respective
906	   secret keys ('secret_key1' and 'secret_key2', respectively), and
907	   therefore both secret keys must be unknown to the attacker, and must
908	   be of a reasonable length.  Both secret keys must remain stable for
909	   the given CONTEXT, since otherwise the transient numeric identifiers
910	   generated by this algorithm would not have the desired stability
911	   properties (i.e., monotonically-increasing for a given CONTEXT).  In
912	   most cases, both secret keys should be selected with a PRNG (see
913	   [RFC4086] for recommendations on choosing secrets) at an appropriate
914	   time, and stored in stable or volatile storage (as necessary) for
915	   future use.

917	   The 'table[]' array assures that successive transient numeric
918	   identifiers for a given context will be monotonically-increasing.
919	   Since the increments space is separated into TABLE_LENGTH different
920	   spaces, the identifier reuse frequency will be (probabilistically)
921	   lower than that of the algorithm in Section 7.4.2.  That is, the
922	   generation of an identifier for one given context will not
923	   necessarily result in increments in the identifier sequence of other
924	   contexts.  It is interesting to note that the size of 'table[]' does
925	   not limit the number of different identifier sequences, but rather
926	   separates the *increment space* into TABLE_LENGTH different spaces.
927	   The selected transient numeric identifier sequence will be obtained
928	   by adding the corresponding entry from 'table[]' to the value in the
929	   'offset' variable, which selects the actual identifier sequence space
930	   (as in the algorithm from Section 7.4.2).

932	   An attacker can perform traffic analysis for any "increment space"
933	   (i.e., context) into which the attacker has "visibility" -- namely,
934	   the attacker can force a system to generate identifiers for
935	   G(CONTEXT, secret_key2), where the result of G() identifies the
936	   target "increment space".  However, the attacker's ability to perform
937	   traffic analysis is very reduced when compared to the simple PRF-
938	   based identifiers (described in Section 7.4.2) and the predictable
939	   linear identifiers (described in Appendix A.1).  Additionally, an
940	   implementation can further limit the attacker's ability to perform
941	   traffic analysis by further separating the increment space (that is,
942	   using a larger value for TABLE_LENGTH) and/or by randomizing the
943	   increments (i.e., increment() returning a small random number as
944	   opposed to the constant "1").

946	   Otherwise, this algorithm does not suffer from the issues discussed
947	   in Section 8.

949	8.  Common Vulnerabilities Associated with Transient Numeric Identifiers

951	8.1.  Network Activity Correlation

953	   An identifier that is predictable within a given context allows for
954	   network activity correlation within that context.

956	   For example, a stable IPv6 Interface Identifier allows for network
957	   activity to be correlated within the context in which the Interface
958	   Identifier is stable [RFC7721].  A stable-per-network IPv6 Interface
959	   Identifier (as in [RFC7217]) allows for network activity correlation
960	   within a network, whereas a constant IPv6 Interface Identifier (that
961	   remains constant across networks) allows not only network activity
962	   correlation within the same network, but also across networks ("host
963	   tracking").

965	   Similarly, an implementation that generates TCP ISNs with a global
966	   counter could allow for fingerprinting and network activity
967	   correlation across networks, since an attacker could passively infer
968	   the identity of the victim based on the TCP ISNs employed for
969	   subsequent communication instances.  Similarly, an implementation
970	   that generates predictable IPv6 Fragment Identification values could
971	   be subject to fingerprinting attacks (see e.g.  [Bellovin2002]).

973	8.2.  Information Leakage

975	   Transient numeric identifiers that result in specific patterns can
976	   produce an information leakage to other communicating entities.  For
977	   example, it is common to generate transient numeric identifiers with
978	   an algorithm such as:

980	                   ID = offset(CONTEXT) + mono(CONTEXT);

982	   This generic expression generates identifiers by adding a
983	   monotonically-increasing function (e.g. linear) to a randomized
984	   offset. offset() is constant within a given context, whereas mono()
985	   produces a monotonically-increasing sequence for the given context.
986	   Identifiers generated with this expression will generally be
987	   predictable within CONTEXT.

989	   The predictability of mono(), irrespective of the predictability of
990	   offset(), can leak information that may be of use to attackers.  For
991	   example, a node that selects ephemeral port numbers as in:

993	                 ephemeral_port = offset(Dest_IP) + mono()

995	   that is, with a per-destination offset, but a global mono() function
996	   (e.g., a global counter), will leak information about total number of
997	   outgoing connections that have been issued by the vulnerable
998	   implementation.

1000	   Similarly, a node that generates Fragment Identification values as
1001	   in:

1003	            Frag_ID = offset(IP_src_addr, IP_dst_addr) + mono()

1005	   will leak out information about the total number of fragmented
1006	   packets that have been transmitted by the vulnerable implementation.
1007	   The vulnerabilities described in [Sanfilippo1998a],
1008	   [Sanfilippo1998b], and [Sanfilippo1999] are all associated with the
1009	   use of a global mono() function (i.e., with a global and constant
1010	   "context") -- particularly when it is a linear function (constant
1011	   increments of 1).

1013	   Predicting transient numeric identifiers can be of help for other
1014	   types of attacks.  For example, predictable TCP ISNs can open the
1015	   door to trivial connection-reset and data injection attacks (see
1016	   Section 8.6).

1018	8.3.  Fingerprinting

1020	   Fingerprinting is the capability of an attacker to identify or re-
1021	   identify a visiting user, user agent or device via configuration
1022	   settings or other observable characteristics.  Observable protocol
1023	   objects and characteristics can be employed to identify/re-identify a
1024	   variety of entities, ranging from the underlying hardware or
1025	   Operating System (vendor, type and version), to the user itself (i.e.
1026	   his/her identity).  [EFF] illustrates web browser-based
1027	   fingerprinting, but similar techniques can be applied at other layers
1028	   and protocols, whether alternatively or in conjunction with it.

1030	   Transient numeric identifiers are one of the observable protocol
1031	   components that could be leveraged for fingerprinting purposes.  That
1032	   is, an attacker could sample transient numeric identifiers to infer
1033	   the algorithm (and its associated parameters, if any) for generating
1034	   such identifiers, possibly revealing the underlying Operating System
1035	   (OS) vendor, type, and version.  This information could possibly be
1036	   further leveraged in conjunction with other fingerprinting techniques
1037	   and sources.

1039	   Evasion of protocol-stack fingerprinting can prove to be a very
1040	   difficult task: most systems make use of a wide variety of protocols,
1041	   each of which have a large number of parameters that can be set to
1042	   arbitrary values or generated with a variety of algorithms with
1043	   multiple parameters.

1045	   NOTE:
1046	      General protocol-based fingerprinting is discussed in [RFC6973],
1047	      along with guidelines to mitigate the associated vulnerability.
1048	      [Fyodor1998] and [Fyodor2006] are classic references on Operating
1049	      System detection via TCP/IP stack fingerprinting.  Nmap [nmap] is
1050	      probably the most popular tool for remote OS identification via
1051	      active TCP/IP stack fingerprinting. p0f [Zalewski2012], on the
1052	      other hand, is a tool for performing remote OS detection via
1053	      passive TCP/IP stack fingerprinting.  Finally, [TBIT] is a TCP
1054	      fingerprinting tool that aims at characterizing the behaviour of a
1055	      remote TCP peer based on active probes, and which has been widely
1056	      used in the research community.

1058	   Algorithms that, from the perspective of an observer (e.g., the
1059	   legitimate communicating peer), result in specific values or
1060	   patterns, will allow for at least some level of fingerprinting.  For
1061	   example, the algorithm from Section 7.3 will typically allow
1062	   fingerprinting within the context where the resulting identifiers are
1063	   stable.  Similarly, the algorithms from Section 7.4 will result in a
1064	   monotonically-increasing sequences within a given context, thus
1065	   allowing for at least some level of fingerprinting (when the other
1066	   communicating entity can correlate different sampled identifiers as
1067	   belonging to the same monotonically-increasing sequence).

1069	   Thus, where possible, algorithms from Section 7.1 should be preferred
1070	   over algorithms that result in specific values or patterns.

1072	8.4.  Exploitation of the Semantics of Transient Numeric Identifiers

1074	   Identifiers that are not semantically opaque tend to be more
1075	   predictable than semantically-opaque identifiers.  For example, a MAC
1076	   address contains an OUI (Organizationally-Unique Identifier) which
1077	   identifies the vendor that manufactured the corresponding network
1078	   interface card.  This can be leveraged by an attacker trying to
1079	   "guess" MAC addresses, who has some knowledge about the possible NIC
1080	   vendor.

1082	   [RFC7707] discusses a number of techniques to reduce the search space
1083	   when performing IPv6 address-scanning attacks by leveraging the
1084	   semantics of the IIDs produced by traditional SLAAC algorithms
1085	   (eventually replaced by [RFC7217]) that embed MAC addresses in the
1086	   IID of IPv6 addresses.

1088	8.5.  Exploitation of Collisions of Transient Numeric Identifiers

1090	   In many cases, the collision of transient network identifiers can
1091	   have a hard failure severity (or result in a hard failure severity if
1092	   an attacker can cause multiple collisions deterministically, one
1093	   after another).  For example, predictable Fragment Identification
1094	   values open the door to Denial of Service (DoS) attacks (see e.g.
1095	   [RFC5722].).

1097	8.6.  Exploitation of Predictable Transient Numeric Identifiers for
1098	      Injection Attacks

1100	   Some protocols rely on "sequence numbers" for the validation of
1101	   incoming packets.  For example, TCP employs sequence numbers for
1102	   reassembling TCP segments, while IPv4 and IPv6 employ Fragment
1103	   Identification values for reassembling IPv4 and IPv6 fragments
1104	   (respectively).  Lacking built-in cryptographic mechanisms for
1105	   validating packets, these protocols are therefore vulnerable to on-
1106	   path data (see e.g.  [Joncheray1995]) and/or control-information (see
1107	   e.g.  [RFC4953] and [RFC5927]) injection attacks.  The extent to
1108	   which these protocols may resist off-path (i.e. "blind") injection
1109	   attacks depends on whether the associated "sequence numbers" are
1110	   predictable, and effort required to successfully predict a valid
1111	   "sequence number" (see e.g.  [RFC4953] and [RFC5927]).

1113	   We note that the use of unpredictable "sequence numbers" is a
1114	   completely-ineffective mitigation for on-path injection attacks, and
1115	   also a mostly-ineffective mitigation for off-path (i.e. "blind")
1116	   injection attacks.  However, many legacy protocols (such as TCP) do
1117	   not natively incorporate cryptographic mitigations, but rather only
1118	   as optional features (see e.g.  [RFC5925]), if at all available.
1119	   Additionally, ad-hoc use of cryptographic mitigations might not be
1120	   sufficient to relieve a protocol implementation of generating
1121	   appropriate transient numeric identifiers.  For example, use of the
1122	   Transport Layer Security (TLS) protocol [RFC8446] with TCP will
1123	   protect the application protocol, but will not help to mitigate e.g.
1124	   TCP-based connection-reset attacks (see e.g.  [RFC4953]).  Similarly,
1125	   use of SEcure Neighbor Discovery (SEND) [RFC3971] will still imply
1126	   reliance on the successful reassembly of IPv6 fragments in those
1127	   cases where SEND packets do not fit into the link Maximum
1128	   Transmission Unit (MTU) (see [RFC6980]).

1130	8.7.  Cryptanalysis

1132	   A number of algorithms discussed in this document (such as those
1133	   described in Section 7.4.2 and Section 7.4.3) rely on PRFs.
1134	   Implementations that employ weak PRFs or keys of inappropriate size
1135	   can be subject to cryptanalysis, where an attacker can obtain the
1136	   secret key employed for the PRF, predict numeric identifiers, etc.

1138	   Furthermore, an implementation that overloads the semantics of the
1139	   secret key can result in more trivial cryptanalysis, possibly
1140	   resulting in the leakage of the value employed for the secret key.

1142	   NOTE:
1143	      [IPID-DEV] describes two vulnerable transient numeric ID
1144	      generators that employ cryptographically-weak hash functions.
1145	      Additionally, one of such implementations employs 32-bits of a
1146	      kernel address as the secret key for a hash function, and
1147	      therefore successful cryptanalysis leaks the aforementioned kernel
1148	      address, allowing for Kernel Address Space Layout Randomization
1149	      (KASLR) [KASLR] bypass.

1151	9.  Vulnerability Assessment of Transient Numeric Identifiers

1153	   The following subsections analyze possible vulnerabilities associated
1154	   with the algorithms described in Section 7.

1156	9.1.  Category #1: Uniqueness (soft failure)

1158	   Possible vulnerabilities associated with the algorithms from
1159	   Section 7.1 include:

1161	   o  Use of flawed PRNGs (please see e.g.  [Zalewski2001],
1162	      [Zalewski2002] and [Klein2007]),

1164	   o  Inadvertently affecting the distribution of an otherwise suitable
1165	      PRNG.

1167	   An implementer should consult [RFC4086] regarding randomness
1168	   requirements for security, and consult relevant documentation when
1169	   employing a PRNG provided by the underlying system.

1171	   When employing a PRNG, many implementations "adapt" the length of its
1172	   output with a modulo operator (e.g., C language's "%"), possibly
1173	   changing the distribution of the output of the PRNG.

1175	   For example, consider an implementation that employs the following
1176	   code:

1178	                          id = random() % 50000;

1180	   This example implementation means to obtain a transient numeric
1181	   identifier in the range 0-49000.  If random() produces e.g. a
1182	   pseudorandom number of 16 bits (with uniform distribution), the
1183	   selected numeric ID will have a non-uniform distribution with the
1184	   numbers in the range 0-15535 having double-frequency as the numbers
1185	   in the range 15536-49000.  This effect is reduced if the PRNG
1186	   produces an output that is much longer than the length implied by the
1187	   modulo operation.

1189	   Use of algorithms other than PRNGs for generating identifiers of this
1190	   category is discouraged.

1192	9.2.  Category #2: Uniqueness (hard failure)

1194	   As noted in Section 7.2, this category can employ the same algorithms
1195	   as Category #4, since a monotonically-increasing sequence tends to
1196	   minimize the transient numeric identifier reuse frequency.
1197	   Therefore, the vulnerability analysis in Section 9.4 also applies to
1198	   this category.

1200	   Additionally, as noted in Section 7.2, some transient numeric
1201	   identifiers of this category might be able to use the algorithms from
1202	   Section 7.1, in which case the same considerations as in Section 9.1
1203	   would apply.

1205	9.3.  Category #3: Uniqueness, stable within context (soft failure)

1207	   Possible vulnerabilities associated with the algorithms from
1208	   Section 7.3 are:

1210	   1.  Use of weak PRFs, or inappropriate secret keys (whether
1211	       inappropriate selection or inappropriate size) could allow for
1212	       cryptanalysis, which could eventually be exploited by an attacker
1213	       to predict future transient numeric identifiers.

1215	   2.  Since the algorithm generates a unique and stable identifier
1216	       within a specified context, it may allow for network activity
1217	       correlation and fingerprinting within the specified context.

1219	9.4.  Category #4: Uniqueness, monotonically increasing within context
1220	      (hard failure)

1222	   The algorithm described in Section 7.4.1 for generating identifiers
1223	   of Category #4 will result in an identifiable pattern (i.e. a
1224	   monotonically-increasing sequence) for the transient numeric
1225	   identifiers generated for each CONTEXT, and thus will allow for
1226	   fingerprinting and network activity correlation within each CONTEXT.

1228	   On the other hand, a simple way to generalize and analyze the
1229	   algorithms described in Section 7.4.2 and Section 7.4.3 for
1230	   generating identifiers of Category #4, is as follows:

1232	       /* Transient Numeric ID selection function */

1234	       id_range = max_id - min_id + 1;
1235	       retry = id_range;
1236	       id_inc = increment() % id_range;

1238	       do {
1239	           update_mono(CONTEXT, id_inc);
1240	           next_id = min_id + (offset(CONTEXT) + \
1241	                               mono(CONTEXT)) % id_range;

1243	           if (suitable_id(next_id)) {
1244	               return next_id;
1245	           }

1247	           retry = retry - id_inc;

1249	       } while (retry > 0);

1251	       return ERROR;

1253	   NOTE:
1254	      increment() returns a small integer that is employed to generate a
1255	      monotonically-increasing function.  Most implementations employ a
1256	      constant value for "increment()" (usually 1).  The value returned
1257	      by increment() must be much smaller than the value computed for
1258	      "id_range".

1260	      update_mono(CONTEXT, id_inc) increments the counter corresponding
1261	      to CONTEXT by "id_inc".

1263	      mono(CONTEXT) reads the counter corresponding to CONTEXT.

1265	   Essentially, an identifier (next_id) is generated by adding a
1266	   monotonically-increasing function (mono()) to an offset value,
1267	   unknown to the attacker and stable for given context (CONTEXT).

1269	   The following aspects of the algorithm should be considered:

1271	   o  For the most part, it is the offset() function that results in
1272	      identifiers that are unpredictable by an off-patch attacker.
1273	      While the resulting sequence is known to be monotonically-
1274	      increasing, the use of a randomized offset value makes the
1275	      resulting values unknown to the attacker.

1277	   o  The most straightforward "stateless" implementation of offset() is
1278	      with a PRF that takes the values that identify the context and a
1279	      "secret_key" (not shown in the figure above) as arguments.

1281	   o  One possible implementation of mono() would be to have mono()
1282	      internally employ a single counter (as in the algorithm from
1283	      Section 7.4.2), or map the increments for different contexts into
1284	      a number of counters/buckets, such that the number of counters
1285	      that need to be maintained in memory is reduced (as in the
1286	      algorithm from algorithm in Section 7.4.3).

1288	   o  In all cases, a monotonically increasing function is implemented
1289	      by incrementing the previous value of a counter by increment()
1290	      units.  In the most trivial case, increment() could return the
1291	      constant "1".  But increment() could also be implemented to return
1292	      small random integers such that the increments are unpredictable
1293	      (see Appendix A of [RFC7739]).  This represents a tradeoff between
1294	      the unpredictability of the resulting transient numeric IDs and
1295	      the transient numeric ID reuse frequency.

1297	   Considering the generic algorithm illustrated above, we can identify
1298	   the following possible vulnerabilities:

1300	   o  Since the algorithms for this category are similar to those of
1301	      Section 9.3, with the addition of a monotonically-increasing
1302	      function, all the issues discussed in Section 9.3 ("Category #3:
1303	      Uniqueness, stable within context (soft failure)") also apply to
1304	      this case.

1306	   o  mono() can be correlated to the number of identifiers generated
1307	      for a given context (CONTEXT).  Thus, if mono() spans more than
1308	      the necessary context, the "increments" could be leaked to other
1309	      parties, thus disclosing information about the number of
1310	      identifiers that have been generated by the algorithm for all
1311	      contexts.  This is information disclosure becomes more evident
1312	      when an implementation employs a constant increment of 1.  For
1313	      example, an implementation where mono() is actually a single
1314	      global counter, will unnecessarily leak information the number of
1315	      identifiers that have been generated by the algorithm (globally,
1316	      for all contexts).  [Fyodor2003] is one example of how such
1317	      information leakages can be exploited.  We note that limiting the
1318	      span of the increments space will require a larger number of
1319	      counters to be stored in memory (i.e., a larger value for the
1320	      TABLE_LENGTH parameter of the algorithm in Section 7.4.3).

1322	   o  Transient numeric identifiers generated with the algorithms
1323	      described in Section 7.4.2 and Section 7.4.3 will normally allow
1324	      for fingerprinting within CONTEXT since, for such context, the
1325	      resulting identifiers will have an identifiable pattern (i.e. a
1326	      monotonically-increasing sequence).

1328	10.  IANA Considerations

1330	   This document has no IANA actions.

1332	11.  Security Considerations

1334	   The entire document is about the security and privacy implications of
1335	   transient numeric identifiers.
1336	   [I-D.gont-numeric-ids-sec-considerations] recommends that protocol
1337	   specifications specify the interoperability requirements of their
1338	   transient numeric identifiers, perform a vulnerability assessment of
1339	   their transient numeric identifiers, and suggest an algorithm for
1340	   generating each of their transient numeric identifiers.  This
1341	   document analyzes possible algorithms (and their implications) that
1342	   could be employed to comply with the interoperability properties of
1343	   most common categories of transient numeric identifiers, while
1344	   minimizing the associated negative security and privacy implications.

1346	12.  Acknowledgements

1348	   The authors would like to thank (in alphabetical order) Bernard
1349	   Aboba, Jean-Philippe Aumasson, Steven Bellovin, Luis Leon Cardenas
1350	   Graide, Guillermo Gont, Joseph Lorenzo Hall, Gre Norcie, Shivan
1351	   Sahib, Rich Salz, Martin Thomson, and Michael Tuexen, for providing
1352	   valuable comments on earlier versions of this document.

1354	   The authors would like to thank Shivan Sahib and Christopher Wood,
1355	   for their guidance during the publication process of this document.

1357	   The authors would like to thank Jean-Philippe Aumasson and Mathew D.
1358	   Green (John Hopkins University) for kindly answering a number of
1359	   questions.

1361	   The authors would like to thank Diego Armando Maradona for his magic
1362	   and inspiration.

1364	13.  References

1366	13.1.  Normative References

1368	   [RFC0791]  Postel, J., "Internet Protocol", STD 5, RFC 791,
1369	              DOI 10.17487/RFC0791, September 1981,
1370	              <https://www.rfc-editor.org/info/rfc791>.

1372	   [RFC0793]  Postel, J., "Transmission Control Protocol", STD 7,
1373	              RFC 793, DOI 10.17487/RFC0793, September 1981,
1374	              <https://www.rfc-editor.org/info/rfc793>.

1376	   [RFC1035]  Mockapetris, P., "Domain names - implementation and
1377	              specification", STD 13, RFC 1035, DOI 10.17487/RFC1035,
1378	              November 1987, <https://www.rfc-editor.org/info/rfc1035>.

1380	   [RFC1321]  Rivest, R., "The MD5 Message-Digest Algorithm", RFC 1321,
1381	              DOI 10.17487/RFC1321, April 1992,
1382	              <https://www.rfc-editor.org/info/rfc1321>.

1384	   [RFC2460]  Deering, S. and R. Hinden, "Internet Protocol, Version 6
1385	              (IPv6) Specification", RFC 2460, DOI 10.17487/RFC2460,
1386	              December 1998, <https://www.rfc-editor.org/info/rfc2460>.

1388	   [RFC4086]  Eastlake 3rd, D., Schiller, J., and S. Crocker,
1389	              "Randomness Requirements for Security", BCP 106, RFC 4086,
1390	              DOI 10.17487/RFC4086, June 2005,
1391	              <https://www.rfc-editor.org/info/rfc4086>.

1393	   [RFC4291]  Hinden, R. and S. Deering, "IP Version 6 Addressing
1394	              Architecture", RFC 4291, DOI 10.17487/RFC4291, February
1395	              2006, <https://www.rfc-editor.org/info/rfc4291>.

1397	   [RFC4862]  Thomson, S., Narten, T., and T. Jinmei, "IPv6 Stateless
1398	              Address Autoconfiguration", RFC 4862,
1399	              DOI 10.17487/RFC4862, September 2007,
1400	              <https://www.rfc-editor.org/info/rfc4862>.

1402	   [RFC4941]  Narten, T., Draves, R., and S. Krishnan, "Privacy
1403	              Extensions for Stateless Address Autoconfiguration in
1404	              IPv6", RFC 4941, DOI 10.17487/RFC4941, September 2007,
1405	              <https://www.rfc-editor.org/info/rfc4941>.

1407	   [RFC5722]  Krishnan, S., "Handling of Overlapping IPv6 Fragments",
1408	              RFC 5722, DOI 10.17487/RFC5722, December 2009,
1409	              <https://www.rfc-editor.org/info/rfc5722>.

1411	   [RFC5925]  Touch, J., Mankin, A., and R. Bonica, "The TCP
1412	              Authentication Option", RFC 5925, DOI 10.17487/RFC5925,
1413	              June 2010, <https://www.rfc-editor.org/info/rfc5925>.

1415	   [RFC6056]  Larsen, M. and F. Gont, "Recommendations for Transport-
1416	              Protocol Port Randomization", BCP 156, RFC 6056,
1417	              DOI 10.17487/RFC6056, January 2011,
1418	              <https://www.rfc-editor.org/info/rfc6056>.

1420	   [RFC6151]  Turner, S. and L. Chen, "Updated Security Considerations
1421	              for the MD5 Message-Digest and the HMAC-MD5 Algorithms",
1422	              RFC 6151, DOI 10.17487/RFC6151, March 2011,
1423	              <https://www.rfc-editor.org/info/rfc6151>.

1425	   [RFC6191]  Gont, F., "Reducing the TIME-WAIT State Using TCP
1426	              Timestamps", BCP 159, RFC 6191, DOI 10.17487/RFC6191,
1427	              April 2011, <https://www.rfc-editor.org/info/rfc6191>.

1429	   [RFC6528]  Gont, F. and S. Bellovin, "Defending against Sequence
1430	              Number Attacks", RFC 6528, DOI 10.17487/RFC6528, February
1431	              2012, <https://www.rfc-editor.org/info/rfc6528>.

1433	   [RFC7217]  Gont, F., "A Method for Generating Semantically Opaque
1434	              Interface Identifiers with IPv6 Stateless Address
1435	              Autoconfiguration (SLAAC)", RFC 7217,
1436	              DOI 10.17487/RFC7217, April 2014,
1437	              <https://www.rfc-editor.org/info/rfc7217>.

1439	   [RFC7323]  Borman, D., Braden, B., Jacobson, V., and R.
1440	              Scheffenegger, Ed., "TCP Extensions for High Performance",
1441	              RFC 7323, DOI 10.17487/RFC7323, September 2014,
1442	              <https://www.rfc-editor.org/info/rfc7323>.

1444	   [RFC8064]  Gont, F., Cooper, A., Thaler, D., and W. Liu,
1445	              "Recommendation on Stable IPv6 Interface Identifiers",
1446	              RFC 8064, DOI 10.17487/RFC8064, February 2017,
1447	              <https://www.rfc-editor.org/info/rfc8064>.

1449	   [RFC8200]  Deering, S. and R. Hinden, "Internet Protocol, Version 6
1450	              (IPv6) Specification", STD 86, RFC 8200,
1451	              DOI 10.17487/RFC8200, July 2017,
1452	              <https://www.rfc-editor.org/info/rfc8200>.

1454	13.2.  Informative References

1456	   [Bellovin1989]
1457	              Bellovin, S., "Security Problems in the TCP/IP Protocol
1458	              Suite", Computer Communications Review, vol. 19, no. 2,
1459	              pp. 32-48, 1989,
1460	              <https://www.cs.columbia.edu/~smb/papers/ipext.pdf>.

1462	   [Bellovin2002]
1463	              Bellovin, S., "A Technique for Counting NATted Hosts",
1464	              IMW'02 Nov. 6-8, 2002, Marseille, France, 2002,
1465	              <https://www.cs.columbia.edu/~smb/papers/fnat.pdf>.

1467	   [BLAKE3]   O'Connor, J., Aumasson, J., Neves, S., and Z. Wilcox-
1468	              O'Hearn, "BLAKE3: one function, fast everywhere", 2020,
1469	              <https://blake3.io/>.

1471	   [CPNI-TCP]
1472	              Gont, F., "Security Assessment of the Transmission Control
1473	              Protocol (TCP)",  United Kingdom's Centre for the
1474	              Protection of National Infrastructure (CPNI) Technical
1475	              Report, 2009, <https://www.gont.com.ar/papers/tn-03-09-
1476	              security-assessment-TCP.pdf>.

1478	   [EFF]      EFF, "Cover your tracks: See how trackers view your
1479	              browser", 2020, <https://coveryourtracks.eff.org/>.

1481	   [FIPS-SHS]
1482	              NIST, "Secure Hash Standard (SHS)",  Federal Information
1483	              Processing Standards Publication 180-4, August 2015,
1484	              <https://nvlpubs.nist.gov/nistpubs/FIPS/
1485	              NIST.FIPS.180-4.pdf>.

1487	   [Fyodor1998]
1488	              Fyodor, "Remote OS Detection via TCP/IP Stack
1489	              Fingerprinting",  Phrack Magazine, Volume 9, Issue 54,
1490	              1998, <http://www.phrack.org/archives/issues/54/9.txt>.

1492	   [Fyodor2003]
1493	              Fyodor, "Idle scanning and related IP ID games", 2003,
1494	              <https://nmap.org/presentations/CanSecWest03/CD_Content/
1495	              idlescan_paper/idlescan.html>.

1497	   [Fyodor2006]
1498	              Lyon, G., "Chapter 8. Remote OS Detection", 2006,
1499	              <https://nmap.org/book/osdetect.html>.

1501	   [I-D.gont-numeric-ids-sec-considerations]
1502	              Gont, F. and I. Arce, "Security Considerations for
1503	              Transient Numeric Identifiers Employed in Network
1504	              Protocols", draft-gont-numeric-ids-sec-considerations-06
1505	              (work in progress), December 2020.

1507	   [I-D.irtf-pearg-numeric-ids-history]
1508	              Gont, F. and I. Arce, "Unfortunate History of Transient
1509	              Numeric Identifiers", draft-irtf-pearg-numeric-ids-
1510	              history-06 (work in progress), January 2021.

1512	   [IANA-PROT]
1513	              IANA, "Protocol Registries",
1514	              <https://www.iana.org/protocols>.

1516	   [IPID-DEV]
1517	              Klein, A. and B. Pinkas, "From IP ID to Device ID and
1518	              KASLR Bypass (Extended Version)", June 2019,
1519	              <https://arxiv.org/pdf/1906.10478.pdf>.

1521	   [Joncheray1995]
1522	              Joncheray, L., "A Simple Active Attack Against TCP", Proc.
1523	              Fifth Usenix UNIX Security Symposium, 1995, <https://www.u
1524	              senix.org/legacy/publications/library/proceedings/
1525	              security95/full_papers/joncheray.pdf>.

1527	   [KASLR]    PaX Team, "Address Space Layout Randomization",
1528	              <https://pax.grsecurity.net/docs/aslr.txt>.

1530	   [Klein2007]
1531	              Klein, A., "OpenBSD DNS Cache Poisoning and Multiple O/S
1532	              Predictable IP ID Vulnerability", 2007,
1533	              <https://dl.packetstormsecurity.net/papers/attack/OpenBSD_
1534	              DNS_Cache_Poisoning_and_Multiple_OS_Predictable_IP_ID_Vuln
1535	              erability.pdf>.

1537	   [Morris1985]
1538	              Morris, R., "A Weakness in the 4.2BSD UNIX TCP/IP
1539	              Software", CSTR 117, AT&T Bell Laboratories, Murray Hill,
1540	              NJ, 1985,
1541	              <https://pdos.csail.mit.edu/~rtm/papers/117.pdf>.

1543	   [nmap]     nmap, "Nmap: Free Security Scanner For Network Exploration
1544	              and Audit", 2020, <https://www.insecure.org/nmap>.

1546	   [RFC2104]  Krawczyk, H., Bellare, M., and R. Canetti, "HMAC: Keyed-
1547	              Hashing for Message Authentication", RFC 2104,
1548	              DOI 10.17487/RFC2104, February 1997,
1549	              <https://www.rfc-editor.org/info/rfc2104>.

1551	   [RFC3971]  Arkko, J., Ed., Kempf, J., Zill, B., and P. Nikander,
1552	              "SEcure Neighbor Discovery (SEND)", RFC 3971,
1553	              DOI 10.17487/RFC3971, March 2005,
1554	              <https://www.rfc-editor.org/info/rfc3971>.

1556	   [RFC4953]  Touch, J., "Defending TCP Against Spoofing Attacks",
1557	              RFC 4953, DOI 10.17487/RFC4953, July 2007,
1558	              <https://www.rfc-editor.org/info/rfc4953>.

1560	   [RFC4963]  Heffner, J., Mathis, M., and B. Chandler, "IPv4 Reassembly
1561	              Errors at High Data Rates", RFC 4963,
1562	              DOI 10.17487/RFC4963, July 2007,
1563	              <https://www.rfc-editor.org/info/rfc4963>.

1565	   [RFC5927]  Gont, F., "ICMP Attacks against TCP", RFC 5927,
1566	              DOI 10.17487/RFC5927, July 2010,
1567	              <https://www.rfc-editor.org/info/rfc5927>.

1569	   [RFC6274]  Gont, F., "Security Assessment of the Internet Protocol
1570	              Version 4", RFC 6274, DOI 10.17487/RFC6274, July 2011,
1571	              <https://www.rfc-editor.org/info/rfc6274>.

1573	   [RFC6973]  Cooper, A., Tschofenig, H., Aboba, B., Peterson, J.,
1574	              Morris, J., Hansen, M., and R. Smith, "Privacy
1575	              Considerations for Internet Protocols", RFC 6973,
1576	              DOI 10.17487/RFC6973, July 2013,
1577	              <https://www.rfc-editor.org/info/rfc6973>.

1579	   [RFC6980]  Gont, F., "Security Implications of IPv6 Fragmentation
1580	              with IPv6 Neighbor Discovery", RFC 6980,
1581	              DOI 10.17487/RFC6980, August 2013,
1582	              <https://www.rfc-editor.org/info/rfc6980>.

1584	   [RFC7098]  Carpenter, B., Jiang, S., and W. Tarreau, "Using the IPv6
1585	              Flow Label for Load Balancing in Server Farms", RFC 7098,
1586	              DOI 10.17487/RFC7098, January 2014,
1587	              <https://www.rfc-editor.org/info/rfc7098>.

1589	   [RFC7258]  Farrell, S. and H. Tschofenig, "Pervasive Monitoring Is an
1590	              Attack", BCP 188, RFC 7258, DOI 10.17487/RFC7258, May
1591	              2014, <https://www.rfc-editor.org/info/rfc7258>.

1593	   [RFC7707]  Gont, F. and T. Chown, "Network Reconnaissance in IPv6
1594	              Networks", RFC 7707, DOI 10.17487/RFC7707, March 2016,
1595	              <https://www.rfc-editor.org/info/rfc7707>.

1597	   [RFC7721]  Cooper, A., Gont, F., and D. Thaler, "Security and Privacy
1598	              Considerations for IPv6 Address Generation Mechanisms",
1599	              RFC 7721, DOI 10.17487/RFC7721, March 2016,
1600	              <https://www.rfc-editor.org/info/rfc7721>.

1602	   [RFC7739]  Gont, F., "Security Implications of Predictable Fragment
1603	              Identification Values", RFC 7739, DOI 10.17487/RFC7739,
1604	              February 2016, <https://www.rfc-editor.org/info/rfc7739>.

1606	   [RFC8446]  Rescorla, E., "The Transport Layer Security (TLS) Protocol
1607	              Version 1.3", RFC 8446, DOI 10.17487/RFC8446, August 2018,
1608	              <https://www.rfc-editor.org/info/rfc8446>.

1610	   [Sanfilippo1998a]
1611	              Sanfilippo, S., "about the ip header id", Post to Bugtraq
1612	              mailing-list, Mon Dec 14 1998,
1613	              <http://seclists.org/bugtraq/1998/Dec/48>.

1615	   [Sanfilippo1998b]
1616	              Sanfilippo, S., "Idle scan", Post to Bugtraq mailing-list,
1617	              1998, <https://github.com/antirez/hping/raw/master/docs/
1618	              SPOOFED_SCAN.txt>.

1620	   [Sanfilippo1999]
1621	              Sanfilippo, S., "more ip id", Post to Bugtraq mailing-
1622	              list, 1999,
1623	              <https://github.com/antirez/hping/raw/master/docs/MORE-
1624	              FUN-WITH-IPID>.

1626	   [Schuba1993]
1627	              Schuba, C., "ADDRESSING WEAKNESSES IN THE DOMAIN NAME
1628	              SYSTEM PROTOCOL", 1993,
1629	              <http://ftp.cerias.purdue.edu/pub/papers/christoph-schuba/
1630	              schuba-DNS-msthesis.pdf>.

1632	   [Shimomura1995]
1633	              Shimomura, T., "Technical details of the attack described
1634	              by Markoff in NYT", Message posted in USENET's
1635	              comp.security.misc newsgroup  Message-ID:
1636	              <3g5gkl$5j1@ariel.sdsc.edu>, 1995,
1637	              <https://www.gont.com.ar/docs/post-shimomura-usenet.txt>.

1639	   [Silbersack2005]
1640	              Silbersack, M., "Improving TCP/IP security through
1641	              randomization without sacrificing interoperability",
1642	              EuroBSDCon 2005 Conference, 2005,
1643	              <https://citeseerx.ist.psu.edu/viewdoc/
1644	              download?doi=10.1.1.91.4542&rep=rep1&type=pdf>.

1646	   [SipHash]  Aumasson, J. and D. Bernstein, "SipHash: a fast short-
1647	              input PRF", 2012, <https://github.com/veorq/SipHash>.

1649	   [TBIT]     TBIT, "TBIT, the TCP Behavior Inference Tool", 2001,
1650	              <https://www.icir.org/tbit/>.

1652	   [TCPT-uptime]
1653	              McDanel, B., "TCP Timestamping - Obtaining System Uptime
1654	              Remotely", March 2001,
1655	              <https://securiteam.com/securitynews/5np0c153pi/>.

1657	   [Zalewski2001]
1658	              Zalewski, M., "Strange Attractors and TCP/IP Sequence
1659	              Number Analysis", 2001,
1660	              <https://lcamtuf.coredump.cx/oldtcp/tcpseq.html>.

1662	   [Zalewski2002]
1663	              Zalewski, M., "Strange Attractors and TCP/IP Sequence
1664	              Number Analysis - One Year Later", 2001,
1665	              <https://lcamtuf.coredump.cx/newtcp/>.

1667	   [Zalewski2012]
1668	              Zalewski, M., "p0f v3 (version 3.09b)", 2012,
1669	              <https://lcamtuf.coredump.cx/p0f.shtml>.

1671	Appendix A.  Algorithms and Techniques with Known Issues

1673	   The following subsections discuss algorithms and techniques with
1674	   known negative security and privacy implications.

1676	   NOTE:
1677	      As discussed in Section 1, the use of cryptographic techniques
1678	      might allow for the safe use of some of these algorithms and
1679	      techniques.  However, this should be evaluated on a case by case
1680	      basis.

1682	A.1.  Predictable Linear Identifiers Algorithm

1684	   One of the most trivial ways to achieve uniqueness with a low
1685	   identifier reuse frequency is to produce a linear sequence.  This
1686	   type of algorithm has been employed in the past to generate
1687	   identifiers of Categories #1, #2, and #4 (please see Section 6 for an
1688	   analysis of these categories).

1690	   For example, the following algorithm has been employed (see e.g.
1691	   [Morris1985], [Shimomura1995], [Silbersack2005] and [CPNI-TCP]) in a
1692	   number of operating systems for selecting IP fragment IDs, TCP
1693	   ephemeral ports, etc.:

1695	       /* Initialization code */

1697	       next_id = min_id;
1698	       id_inc= 1;

1700	       /* Transient Numeric ID selection function */

1702	       id_range = max_id - min_id + 1;
1703	       retry = id_range;

1705	       do {
1706	           if (next_id == max_id) {
1707	               next_id = min_id;
1708	           }
1709	           else {
1710	               next_id = next_id + id_inc;
1711	           }

1713	           if (suitable_id(next_id)) {
1714	               return next_id;
1715	           }

1717	           retry--;

1719	       } while (retry > 0);

1721	       return ERROR;

1723	   NOTE:
1724	      suitable_id() is a function that checks whether the resulting
1725	      identifier is acceptable (e.g., whether it's in use, etc.).

1727	   For obvious reasons, this algorithm results in predictable sequences.
1728	   Since a global counter is used to generate the transient numeric
1729	   identifiers ("next_id" in the example above), an entity that learns
1730	   one numeric identifier can infer past numeric identifiers and predict
1731	   future values to be generated by the same algorithm.  Since the value
1732	   employed for the increments is known (such as "1" in this case), an
1733	   attacker can sample two values, and learn the number of identifiers
1734	   that have been were generated in-between the two sampled values.
1735	   Furthermore, if the counter is initialized e.g. when the system its
1736	   bootstrapped to some known value, the algorithm will leak additional
1737	   information, such as the number of transmitted fragmented datagrams
1738	   in the case of an IP ID generator [Sanfilippo1998a], or the system
1739	   uptime in the case of TCP timestamps [TCPT-uptime].

1741	A.2.  Random-Increments Algorithm

1743	   This algorithm offers a middle ground between the algorithms that
1744	   generate randomized transient numeric identifiers (such as those
1745	   described in Section 7.1.1 and Section 7.1.2), and those that
1746	   generate identifiers with a predictable monotonically-increasing
1747	   function (see Appendix A.1).

1749	       /* Initialization code */

1751	       next_id = random();        /* Initialization value */
1752	       id_rinc = 500;             /* Determines the trade-off */

1754	       /* Transient Numeric ID selection function */

1756	       id_range = max_id - min_id + 1;
1757	       retry = id_range;

1759	       do {
1760	           /* Random increment */
1761	           id_inc = (random() % id_rinc) + 1;

1763	           if ( (max_id - next_id) >= id_inc){
1764	               next_id = next_id + id_inc;
1765	           }
1766	           else {
1767	               next_id = min_id + id_inc - (max_id - next_id);
1768	           }

1770	           if (suitable_id(next_id)) {
1771	              return next_id;
1772	           }

1774	           retry = retry - id_inc;

1776	       } while (retry > 0);

1778	       return ERROR;

1780	   This algorithm aims at producing a global monotonically-increasing
1781	   sequence of transient numeric identifiers, while avoiding the use of
1782	   fixed increments, which would lead to trivially predictable
1783	   sequences.  The value "id_inc" allows for direct control of the
1784	   trade-off between unpredictability and identifier reuse frequency.
1785	   The smaller the value of "id_inc", the more similar this algorithm is
1786	   to a predicable, global linear ID generation algorithm (as the one in
1787	   Appendix A.1).  The larger the value of "id_inc", the more similar
1788	   this algorithm is to the algorithm described in Section 7.1.1 of this
1789	   document.

1791	   When the identifiers wrap, there is a risk of collisions of transient
1792	   numeric identifiers (i.e., identifier reuse).  Therefore, "id_inc"
1793	   should be selected according to the following criteria:

1795	   o  It should maximize the wrapping time of the identifier space.

1797	   o  It should minimize identifier reuse frequency.

1799	   o  It should maximize unpredictability.

1801	   Clearly, these are competing goals, and the decision of which value
1802	   of "id_inc" to use is a trade-off.  Therefore, the value of "id_inc"
1803	   is at times a configurable parameter so that system administrators
1804	   can make the trade-off for themselves.  We note that the alternative
1805	   algorithms discussed throughout this document offer better
1806	   interoperability, security and privacy properties than this
1807	   algorithm, and hence implementation of this algorithm is discouraged.

1809	A.3.  Re-using Identifiers Across Different Contexts

1811	   Employing the same identifier across contexts in which stability is
1812	   not required (i.e. overloading the semantics of transient numeric
1813	   identifier) usually has negative security and privacy implications.

1815	   For example, in order to generate transient numeric identifiers of
1816	   Category #2 or Category #3, an implementation or specification might
1817	   be tempted to employ a source for the numeric identifiers which is
1818	   known to provide unique values, but that may also be predictable or
1819	   leak information related to the entity generating the identifier.
1820	   This technique has been employed in the past for e.g. generating IPv6
1821	   IIDs by re-using the MAC address of the underlying network interface.
1822	   However, as noted in [RFC7721] and [RFC7707], embedding link-layer
1823	   addresses in IPv6 IIDs not only results in predictable values, but
1824	   also leaks information about the manufacturer of the underlying
1825	   network interface card, allows for network activity correlation, and
1826	   makes address-based scanning attacks feasible.

1828	Authors' Addresses
1829	   Fernando Gont
1830	   SI6 Networks
1831	   Evaristo Carriego 2644
1832	   Haedo, Provincia de Buenos Aires  1706
1833	   Argentina

1835	   Email: fgont@si6networks.com
1836	   URI:   https://www.si6networks.com

1838	   Ivan Arce
1839	   Quarkslab

1841	   Email: iarce@quarkslab.com
1842	   URI:   https://www.quarkslab.com