idnits 2.17.1 

draft-irtf-pearg-numeric-ids-generation-08.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (31 January 2022) is 815 days in the past.  Is this
     intentional?

  -- Found something which looks like a code comment -- if you have code
     sections in the document, please surround them with '<CODE BEGINS>' and
     '<CODE ENDS>' lines.


  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  ** Obsolete normative reference: RFC  793 (Obsoleted by RFC 9293)

  ** Obsolete normative reference: RFC 6528 (Obsoleted by RFC 9293)

  ** Obsolete normative reference: RFC 2460 (Obsoleted by RFC 8200)

  ** Obsolete normative reference: RFC 4941 (Obsoleted by RFC 8981)

  == Outdated reference: A later version (-11) exists of
     draft-irtf-pearg-numeric-ids-history-09

  == Outdated reference: A later version (-11) exists of
     draft-gont-numeric-ids-sec-considerations-06


     Summary: 4 errors (**), 0 flaws (~~), 3 warnings (==), 2 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Internet Research Task Force (IRTF)                              F. Gont
3	Internet-Draft                                                   EdgeUno
4	Intended status: Informational                                   I. Arce
5	Expires: 4 August 2022                                         Quarkslab
6	                                                         31 January 2022

8	           On the Generation of Transient Numeric Identifiers
9	               draft-irtf-pearg-numeric-ids-generation-08

11	Abstract

13	   This document performs an analysis of the security and privacy
14	   implications of different types of "transient numeric identifiers"
15	   used in IETF protocols, and tries to categorize them based on their
16	   interoperability requirements and their associated failure severity
17	   when such requirements are not met.  Subsequently, it provides advice
18	   on possible algorithms that could be employed to satisfy the
19	   interoperability requirements of each identifier category, while
20	   minimizing the negative security and privacy implications, thus
21	   providing guidance to protocol designers and protocol implementers.
22	   Finally, it describes a number of algorithms that have been employed
23	   in real implementations to generate transient numeric identifiers,
24	   and analyzes their security and privacy properties.  This document is
25	   a product of the Privacy Enhancement and Assessment Research Group
26	   (PEARG) in the IRTF.

28	Status of This Memo

30	   This Internet-Draft is submitted in full conformance with the
31	   provisions of BCP 78 and BCP 79.

33	   Internet-Drafts are working documents of the Internet Engineering
34	   Task Force (IETF).  Note that other groups may also distribute
35	   working documents as Internet-Drafts.  The list of current Internet-
36	   Drafts is at https://datatracker.ietf.org/drafts/current/.

38	   Internet-Drafts are draft documents valid for a maximum of six months
39	   and may be updated, replaced, or obsoleted by other documents at any
40	   time.  It is inappropriate to use Internet-Drafts as reference
41	   material or to cite them other than as "work in progress."

43	   This Internet-Draft will expire on 4 August 2022.

45	Copyright Notice

47	   Copyright (c) 2022 IETF Trust and the persons identified as the
48	   document authors.  All rights reserved.

50	   This document is subject to BCP 78 and the IETF Trust's Legal
51	   Provisions Relating to IETF Documents (https://trustee.ietf.org/
52	   license-info) in effect on the date of publication of this document.
53	   Please review these documents carefully, as they describe your rights
54	   and restrictions with respect to this document.

56	Table of Contents

58	   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   3
59	   2.  Terminology . . . . . . . . . . . . . . . . . . . . . . . . .   4
60	   3.  Threat Model  . . . . . . . . . . . . . . . . . . . . . . . .   5
61	   4.  Issues with the Specification of Transient Numeric
62	           Identifiers . . . . . . . . . . . . . . . . . . . . . . .   6
63	   5.  Protocol Failure Severity . . . . . . . . . . . . . . . . . .   7
64	   6.  Categorizing Transient Numeric Identifiers  . . . . . . . . .   7
65	   7.  Common Algorithms for Transient Numeric Identifier
66	           Generation  . . . . . . . . . . . . . . . . . . . . . . .  10
67	     7.1.  Category #1: Uniqueness (soft failure)  . . . . . . . . .  10
68	     7.2.  Category #2: Uniqueness (hard failure)  . . . . . . . . .  13
69	     7.3.  Category #3: Uniqueness, stable within context (soft
70	           failure)  . . . . . . . . . . . . . . . . . . . . . . . .  14
71	     7.4.  Category #4: Uniqueness, monotonically increasing within
72	           context (hard failure)  . . . . . . . . . . . . . . . . .  16
73	   8.  Common Vulnerabilities Associated with Transient Numeric
74	           Identifiers . . . . . . . . . . . . . . . . . . . . . . .  22
75	     8.1.  Network Activity Correlation  . . . . . . . . . . . . . .  22
76	     8.2.  Information Leakage . . . . . . . . . . . . . . . . . . .  23
77	     8.3.  Fingerprinting  . . . . . . . . . . . . . . . . . . . . .  24
78	     8.4.  Exploitation of the Semantics of Transient Numeric
79	           Identifiers . . . . . . . . . . . . . . . . . . . . . . .  25
80	     8.5.  Exploitation of Collisions of Transient Numeric
81	           Identifiers . . . . . . . . . . . . . . . . . . . . . . .  26
82	     8.6.  Exploitation of Predictable Transient Numeric Identifiers
83	           for Injection Attacks . . . . . . . . . . . . . . . . . .  26
84	     8.7.  Cryptanalysis . . . . . . . . . . . . . . . . . . . . . .  27
85	   9.  Vulnerability Assessment of Transient Numeric Identifiers . .  27
86	     9.1.  Category #1: Uniqueness (soft failure)  . . . . . . . . .  27
87	     9.2.  Category #2: Uniqueness (hard failure)  . . . . . . . . .  28
88	     9.3.  Category #3: Uniqueness, stable within context (soft
89	           failure)  . . . . . . . . . . . . . . . . . . . . . . . .  28
90	     9.4.  Category #4: Uniqueness, monotonically increasing within
91	           context (hard failure)  . . . . . . . . . . . . . . . . .  29
92	   10. IANA Considerations . . . . . . . . . . . . . . . . . . . . .  31
93	   11. Security Considerations . . . . . . . . . . . . . . . . . . .  31
94	   12. Acknowledgements  . . . . . . . . . . . . . . . . . . . . . .  31
95	   13. References  . . . . . . . . . . . . . . . . . . . . . . . . .  32
96	     13.1.  Normative References . . . . . . . . . . . . . . . . . .  32
97	     13.2.  Informative References . . . . . . . . . . . . . . . . .  33

99	   Appendix A.  Algorithms and Techniques with Known Issues  . . . .  38
100	     A.1.  Predictable Linear Identifiers Algorithm  . . . . . . . .  38
101	     A.2.  Random-Increments Algorithm . . . . . . . . . . . . . . .  40
102	     A.3.  Re-using Identifiers Across Different Contexts  . . . . .  41
103	   Authors' Addresses  . . . . . . . . . . . . . . . . . . . . . . .  42

105	1.  Introduction

107	   Networking protocols employ a variety of transient numeric
108	   identifiers for different protocol objects, such as IPv4 and IPv6
109	   Fragment Identifiers [RFC0791] [RFC8200], IPv6 Interface Identifiers
110	   (IIDs) [RFC4291], transport protocol ephemeral port numbers
111	   [RFC6056], TCP Initial Sequence Numbers (ISNs) [RFC0793], and DNS
112	   Transaction IDs (TxIDs) [RFC1035].These identifiers usually have
113	   specific interoperability requirements (e.g. uniqueness during a
114	   specified period of time) that must be satisfied such that they do
115	   not result in negative interoperability implications, and an
116	   associated failure severity when such requirements are not met,
117	   ranging from soft to hard failures.

119	   For more than 30 years, a large number of implementations of the TCP/
120	   IP protocol suite have been subject to a variety of attacks, with
121	   effects ranging from Denial of Service (DoS) or data injection, to
122	   information leakages that could be exploited for pervasive monitoring
123	   [RFC7258].  The root cause of these issues has been, in many cases,
124	   the poor selection of transient numeric identifiers in such
125	   protocols, usually as a result of insufficient or misleading
126	   specifications.  While it is generally trivial to identify an
127	   algorithm that can satisfy the interoperability requirements of a
128	   given transient numeric identifier, empirical evidence exists that
129	   doing so without negatively affecting the security and/or privacy
130	   properties of the aforementioned protocols is prone to error
131	   [I-D.irtf-pearg-numeric-ids-history].

133	   For example, implementations have been subject to security and/or
134	   privacy issues resulting from:

136	   *  Predictable IPv4 or IPv6 Fragment Identifiers (see e.g.
137	      [Sanfilippo1998a], [RFC6274], and [RFC7739])

139	   *  Predictable IPv6 IIDs (see e.g.  [RFC7721], [RFC7707], and
140	      [RFC7217])

142	   *  Predictable transport protocol ephemeral port numbers (see e.g.
143	      [RFC6056] and [Silbersack2005])

145	   *  Predictable TCP Initial Sequence Numbers (ISNs) (see e.g.
146	      [Morris1985], [Bellovin1989], and [RFC6528])

148	   *  Predictable initial timestamp in TCP timestamps Options (see e.g.
149	      [TCPT-uptime] and [RFC7323])

151	   *  Predictable DNS TxIDs (see e.g.  [Schuba1993] and [Klein2007])

153	   Recent history indicates that when new protocols are standardized or
154	   new protocol implementations are produced, the security and privacy
155	   properties of the associated transient numeric identifiers tend to be
156	   overlooked, and inappropriate algorithms to generate transient
157	   numeric identifiers are either suggested in the specifications or
158	   selected by implementers.  As a result, it should be evident that
159	   advice in this area is warranted.

161	   We note that the use of cryptographic techniques may readily mitigate
162	   some of the issues arising from predictable transient numeric
163	   identifiers.  For example, cryptographic integrity and authentication
164	   can readily mitigate data injection attacks even in the presence of
165	   predictable transient numeric identifiers (such as "sequence
166	   numbers").  However, use of flawed algorithms (such as global
167	   counters) for generating transient numeric identifiers could still
168	   result in information leakages even when cryptographic techniques are
169	   employed.

171	   This document contains a non-exhaustive survey of transient numeric
172	   identifiers employed in various IETF protocols, and aims to
173	   categorize such identifiers based on their interoperability
174	   requirements, and the associated failure severity when such
175	   requirements are not met.  Subsequently, it provides advice on
176	   possible algorithms that could be employed to satisfy the
177	   interoperability requirements of each category, while minimizing
178	   negative security and privacy implications.  Finally, it analyzes
179	   several algorithms that have been employed in real implementations to
180	   meet such requirements, and analyzes their security and privacy
181	   properties.

183	   This document represents the consensus of the Privacy Enhancement and
184	   Assessment Research Group (PEARG).

186	2.  Terminology

188	   Transient Numeric Identifier:
189	      A data object in a protocol specification that can be used to
190	      definitely distinguish a protocol object (a datagram, network
191	      interface, transport protocol endpoint, session, etc.) from all
192	      other objects of the same type, in a given context.  Transient
193	      numeric identifiers are usually defined as a series of bits, and
194	      represented using integer values.  These identifiers are typically
195	      dynamically selected, as opposed to statically-assigned numeric
196	      identifiers (see e.g.  [IANA-PROT]).  We note that different
197	      transient numeric identifiers may have additional requirements or
198	      properties depending on their specific use in a protocol.  We use
199	      the term "transient numeric identifier" (or simply "numeric
200	      identifier" or "identifier" as short forms) as a generic term to
201	      refer to any data object in a protocol specification that
202	      satisfies the identification property stated above.

204	   Failure Severity:
205	      The consequences of a failure to comply with the interoperability
206	      requirements of a given identifier.  Severity considers the worst
207	      potential consequence of a failure, determined by the system
208	      damage and/or time lost to repair the failure.  In this document
209	      we define two types of failure severity: "soft failure" and "hard
210	      failure".

212	   Soft Failure:
213	      A soft failure is a recoverable condition in which a protocol does
214	      not operate in the prescribed manner but normal operation can be
215	      resumed automatically in a short period of time.  For example, a
216	      simple packet-loss event that is subsequently recovered with a
217	      packet-retransmission can be considered a soft failure.

219	   Hard Failure:
220	      A hard failure is a non-recoverable condition in which a protocol
221	      does not operate in the prescribed manner or it operates with
222	      excessive degradation of service.  For example, an established TCP
223	      connection that is aborted due to an error condition constitutes,
224	      from the point of view of the transport protocol, a hard failure,
225	      since it enters a state from which normal operation cannot be
226	      resumed.

228	3.  Threat Model

230	   Throughout this document, we assume an attacker does not have
231	   physical or logical access to the system(s) being attacked, and that
232	   the attacker can only observe traffic explicitly directed to the
233	   attacker.  For example, an attacker cannot observe traffic
234	   transferred between a sender and the receiver(s) of a target
235	   protocol, but may be able to interact with any of these entities,
236	   including by e.g. sending any traffic to them to sample transient
237	   numeric identifiers employed by the target systems when communicating
238	   with the attacker.

240	   For example, when analyzing vulnerabilities associated with TCP
241	   Initial Sequence Numbers (ISNs), we consider the attacker is unable
242	   to capture network traffic corresponding to a TCP connection between
243	   two other hosts.  However, we consider the attacker is able to
244	   communicate with any of these hosts (e.g., establish a TCP connection
245	   with any of them), to e.g. sample the TCP ISNs employed by these
246	   systems when communicating with the attacker.

248	   Similarly, when considering host-tracking attacks based on IPv6
249	   interface identifiers, we consider an attacker may learn the IPv6
250	   address employed by a victim node if e.g. the address becomes exposed
251	   as a result of the victim node communicating with an attacker-
252	   operated server.  Subsequently, an attacker may perform host-tracking
253	   by probing a set of target addresses composed by a set of target
254	   prefixes and the IPv6 interface identifier originally learned by the
255	   attacker.  Alternatively, an attacker may perform host tracking if
256	   e.g. the victim node communicates with an attacker-operated server as
257	   it moves from one location to another, those exposing its configured
258	   addresses.  We note that none of these scenarios requires the
259	   attacker observe traffic not explicitly directed to the attacker.

261	4.  Issues with the Specification of Transient Numeric Identifiers

263	   While assessing protocol specifications regarding the use of
264	   transient numeric identifiers, we have found that most of the issues
265	   discussed in this document arise as a result of one of the following
266	   conditions:

268	   *  Protocol specifications that under-specify the requirements for
269	      their transient numeric identifiers

271	   *  Protocol specifications that over-specify their transient numeric
272	      identifiers

274	   *  Protocol implementations that simply fail to comply with the
275	      specified requirements

277	   A number of protocol specifications (too many of them) have simply
278	   overlooked the security and privacy implications of transient numeric
279	   identifiers [I-D.irtf-pearg-numeric-ids-history].  Examples of them
280	   are the specification of TCP ephemeral ports in [RFC0793], the
281	   specification of TCP sequence numbers in [RFC0793], or the
282	   specification of the DNS TxID in [RFC1035].

284	   On the other hand, there are a number of protocol specifications that
285	   over-specify some of their associated transient numeric identifiers.
286	   For example, [RFC4291] essentially overloads the semantics of IPv6
287	   Interface Identifiers (IIDs) by embedding link-layer addresses in the
288	   IPv6 IIDs, when the interoperability requirement of uniqueness could
289	   be achieved in other ways that do not result in negative security and
290	   privacy implications [RFC7721].  Similarly, [RFC2460] suggested the
291	   use of a global counter for the generation of Fragment Identification
292	   values, when the interoperability properties of uniqueness per {IPv6
293	   Source Address, IPv6 Destination Address} could be achieved with
294	   other algorithms that do not result in negative security and privacy
295	   implications [RFC7739].

297	   Finally, there are protocol implementations that simply fail to
298	   comply with existing protocol specifications.  For example, some
299	   popular operating systems (notably Microsoft Windows) still fail to
300	   implement transport protocol ephemeral port randomization, as
301	   recommended in [RFC6056].

303	5.  Protocol Failure Severity

305	   Section 2 defines the concept of "Failure Severity", along with two
306	   types of failure severities that we employ throughout this document:
307	   soft and hard.

309	   Our analysis of the severity of a failure is performed from the point
310	   of view of the protocol in question.  However, the corresponding
311	   severity on the upper protocol (or application) might not be the same
312	   as that of the protocol in question.  For example, a TCP connection
313	   that is aborted might or might not result in a hard failure of the
314	   upper application: if the upper application can establish a new TCP
315	   connection without any impact on the application, a hard failure at
316	   the TCP protocol may have no severity at the application level.  On
317	   the other hand, if a hard failure of a TCP connection results in
318	   excessive degradation of service at the application layer, it will
319	   also result in a hard failure at the application.

321	6.  Categorizing Transient Numeric Identifiers

323	   This section includes a non-exhaustive survey of transient numeric
324	   identifiers, which are representative of all the possible
325	   combinations of interoperability requirements and failure severities
326	   found in popular protocols from different layers.  Additionally, it
327	   proposes a number of categories that can accommodate these
328	   identifiers based on their interoperability requirements and their
329	   associated failure severity (soft or hard).

331	   NOTE:
332	      All other transient numeric identifiers that were analyzed as part
333	      of this effort could be accommodated into one of the existing
334	      categories from Table 1.

336	    +==============+===============================+==================+
337	    |  Identifier  | Interoperability Requirements | Failure Severity |
338	    +==============+===============================+==================+
339	    | IPv6 Frag ID |   Uniqueness (for IP address  |  Soft/Hard (1)   |
340	    |              |             pair)             |                  |
341	    +--------------+-------------------------------+------------------+
342	    |   IPv6 IID   | Uniqueness (and stable within |     Soft (3)     |
343	    |              |        IPv6 prefix) (2)       |                  |
344	    +--------------+-------------------------------+------------------+
345	    |   TCP ISN    |  Monotonically-increasing (4) |     Hard (4)     |
346	    +--------------+-------------------------------+------------------+
347	    | TCP initial  |  Monotonically-increasing (5) |     Hard (5)     |
348	    |  timestamps  |                               |                  |
349	    +--------------+-------------------------------+------------------+
350	    |   TCP eph.   |   Uniqueness (for connection  |       Hard       |
351	    |     port     |              ID)              |                  |
352	    +--------------+-------------------------------+------------------+
353	    |  IPv6 Flow   |           Uniqueness          |     None (6)     |
354	    |    Label     |                               |                  |
355	    +--------------+-------------------------------+------------------+
356	    |   DNS TxID   |           Uniqueness          |     None (7)     |
357	    +--------------+-------------------------------+------------------+

359	              Table 1: Survey of Transient Numeric Identifiers

361	   NOTE:

363	   (1)
364	      While a single collision of Fragment ID values would simply lead
365	      to a single packet drop (and hence a "soft" failure), repeated
366	      collisions at high data rates might trash the Fragment ID space,
367	      leading to a hard failure [RFC4963].

369	   (2)
370	      While the interoperability requirements are simply that the
371	      Interface ID results in a unique IPv6 address, for operational
372	      reasons it is typically desirable that the resulting IPv6 address
373	      (and hence the corresponding Interface ID) be stable within each
374	      network [RFC7217] [RFC8064].

376	   (3)
377	      While IPv6 Interface IDs must result in unique IPv6 addresses,
378	      IPv6 Duplicate Address Detection (DAD) [RFC4862] allows for the
379	      detection of duplicate addresses, and hence such Interface ID
380	      collisions can be recovered.

382	   (4)
383	      In theory, there are no interoperability requirements for TCP
384	      Initial Sequence Numbers (ISNs), since the TIME-WAIT state and
385	      TCP's "quiet time" concept take care of old segments from previous
386	      incarnations of a connection.  However, a widespread optimization
387	      allows for a new incarnation of a previous connection to be
388	      created if the ISN of the incoming SYN is larger than the last
389	      sequence number seen in that direction for the previous
390	      incarnation of the connection.  Thus, monotonically-increasing TCP
391	      ISNs allow for such optimization to work as expected [RFC6528],
392	      and can help avoid connection-establishment failures.

394	   (5)
395	      Strictly speaking, there are no interoperability requirements for
396	      the *initial* TCP timestamp employed by a TCP instance (i.e., the
397	      TS Value (TSval) in a segment with the SYN bit set).  However,
398	      some TCP implementations allow a new incarnation of a previous
399	      connection to be created if the TSval of the incoming SYN is
400	      larger than the last TSval seen in that direction for the previous
401	      incarnation of the connection (please see [RFC6191]).  Thus,
402	      monotonically-increasing TCP initial timestamps (across
403	      connections to the same endpoint) allow for such optimization to
404	      work as expected [RFC6191], and can help avoid connection-
405	      establishment failures.

407	   (6)
408	      The IPv6 Flow Label is typically employed for load sharing
409	      [RFC7098], along with the Source and Destination IPv6 addresses.
410	      Reuse of a Flow Label value for the same set {Source Address,
411	      Destination Address} would typically cause both flows to be
412	      multiplexed onto the same link.  However, as long as this does not
413	      occur deterministically, it will not result in any negative
414	      implications.

416	   (7)
417	      DNS TxIDs are employed, together with the Source Address,
418	      Destination Address, Source Port, and Destination Port, to match
419	      DNS requests and responses.  However, since an implementation
420	      knows which DNS requests were sent for that set of {Source
421	      Address, Destination Address, Source Port, and Destination Port,
422	      DNS TxID}, a collision of TxID would result, if anything, in a
423	      small performance penalty (the response would nevertheless be
424	      discarded when it is found that it does not answer the query sent
425	      in the corresponding DNS query).

427	   Based on the survey above, we can categorize identifiers as follows:

429	   +=======+======================================+====================+
430	   |  Cat  |               Category               |  Sample Proto IDs  |
431	   |   #   |                                      |                    |
432	   +=======+======================================+====================+
433	   |   1   |      Uniqueness (soft failure)       | IPv6 Flow L., DNS  |
434	   |       |                                      |       TxIDs        |
435	   +-------+--------------------------------------+--------------------+
436	   |   2   |      Uniqueness (hard failure)       | IPv6 Frag ID, TCP  |
437	   |       |                                      |   ephemeral port   |
438	   +-------+--------------------------------------+--------------------+
439	   |   3   |  Uniqueness, stable within context   |     IPv6 IIDs      |
440	   |       |            (soft failure)            |                    |
441	   +-------+--------------------------------------+--------------------+
442	   |   4   | Uniqueness, monotonically increasing |    TCP ISN, TCP    |
443	   |       |    within context (hard failure)     | initial timestamps |
444	   +-------+--------------------------------------+--------------------+

446	                       Table 2: Identifier Categories

448	   We note that Category #4 could be considered a generalized case of
449	   category #3, in which a monotonically increasing element is added to
450	   a stable (within context) element, such that the resulting
451	   identifiers are monotonically increasing within a specified context.
452	   That is, the same algorithm could be employed for both #3 and #4,
453	   given appropriate parameters.

455	7.  Common Algorithms for Transient Numeric Identifier Generation

457	   The following subsections describe some sample algorithms that can be
458	   employed for generating transient numeric identifiers for each of the
459	   categories above, while mitigating the vulnerabilities analyzed in
460	   Section 8 of this document.

462	   All of the variables employed in the algorithms of the following
463	   subsections are of "unsigned integer" type, except for the "retry"
464	   variable, that is of (signed) "integer" type.

466	7.1.  Category #1: Uniqueness (soft failure)

468	   The requirement of uniqueness with a soft failure severity can be
469	   complied with a Pseudo-Random Number Generator (PRNG).

471	   NOTE:
472	      Please see [RFC4086] regarding randomness requirements for
473	      security.

475	   We note that since the premise is that collisions of transient
476	   numeric identifiers of this category only leads to soft failures, in
477	   many cases, the algorithm might not need to check the suitability of
478	   a selected identifier (i.e., the suitable_id() function, described
479	   below, could always return "true").

481	   In scenarios where e.g. simultaneous use of a given numeric ID is
482	   undesirable and the implementation detects such condition, an
483	   implementation may opt to select the next available identifier in the
484	   same sequence, or select another random number.  Section 7.1.1 is an
485	   implementation of the former strategy, while Section 7.1.2 is an
486	   implementation of the later.  Typically, the algorithm in
487	   Section 7.1.2 results in a more uniform distribution of the generated
488	   transient numeric identifiers.  However, for transient numeric
489	   identifiers where an implementation typically keeps local state about
490	   unsuitable/used identifiers, the algorithm in Section 7.1.2 may
491	   require many more iterations than the algorithm in Section 7.1.1 to
492	   generate a suitable transient numeric identifier.  This will usually
493	   be affected by the current usage ratio of transient numeric
494	   identifiers (i.e., number of numeric identifiers considered suitable
495	   / total number of numeric identifiers) and other parameters.
496	   Therefore, in such cases many implementations tend to prefer the
497	   algorithm in Section 7.1.1 over the algorithm in Section 7.1.2.

499	7.1.1.  Simple Randomization Algorithm

501	       /* Transient Numeric ID selection function */

503	       id_range = max_id - min_id + 1;
504	       next_id = min_id + (random() % id_range);
505	       retry = id_range;

507	       do {
508	           if (suitable_id(next_id)) {
509	               return next_id;
510	           }

512	           if (next_id == max_id) {
513	               next_id = min_id;
514	           } else {
515	               next_id++;
516	           }

518	           retry--;

520	       } while (retry > 0);

522	       return ERROR;

524	   NOTE:
525	      random() is a function that returns a pseudo-random unsigned
526	      integer number of appropriate size.  Note that the output needs to
527	      be unpredictable, and typical implementations of the POSIX
528	      random() function do not necessarily meet this requirement.  See
529	      [RFC4086] for randomness requirements for security.  Beware that
530	      "adapting" the length of the output of random() with a modulo
531	      operator (e.g., C language's "%") may change the distribution of
532	      the PRNG.

534	      The function suitable_id() can check, when possible and desirable,
535	      whether a selected transient numeric identifier is suitable (e.g.
536	      it is not already in use).  Depending on how/where the numeric
537	      identifier is used, it may or may not be possible (or even
538	      desirable) to check whether the numeric identifier is in use (or
539	      whether it has been recently employed).  When an identifier is
540	      found to be unsuitable, this algorithm selects the next available
541	      numeric identifier in sequence.

543	      Even when this algorithm selects numeric IDs randomly, it is
544	      biased towards the first available numeric ID after a sequence of
545	      unavailable numeric IDs.  For example, if this algorithm is
546	      employed for transport protocol ephemeral port randomization
547	      [RFC6056] and the local list of unsuitable port numbers (e.g.,
548	      registered port numbers that should not be used for ephemeral
549	      ports) is significant, an attacker may actually have a
550	      significantly better chance of guessing a port number.

552	      All the variables (in this and all the algorithms discussed in
553	      this document) are unsigned integers.

555	   Assuming the randomness requirements for the PRNG are met (see
556	   [RFC4086]), this algorithm does not suffer from any of the issues
557	   discussed in Section 8.

559	7.1.2.  Another Simple Randomization Algorithm

561	   The following pseudo-code illustrates another algorithm for selecting
562	   a random transient numeric identifier which, in the event a selected
563	   identifier is found to be unsuitable (e.g., already in use), another
564	   identifier is randomly selected:

566	       /* Transient Numeric ID selection function */

568	       id_range = max_id - min_id + 1;
569	       retry = id_range;

571	       do {
572	           next_id = min_id + (random() % id_range);

574	           if (suitable_id(next_id)) {
575	               return next_id;
576	           }

578	           retry--;

580	       } while (retry > 0);

582	       return ERROR;

584	   This algorithm might be unable to select a transient numeric
585	   identifier (i.e., return "ERROR") even if there are suitable
586	   identifiers available, in cases where a large number of identifiers
587	   are found to be unsuitable (e.g. "in use").

589	   The same considerations from Section 7.1.1 with respect to the
590	   properties of random() and the adaptation of its output length apply
591	   to this algorithm.

593	   Assuming the randomness requirements for the PRNG are met (see
594	   [RFC4086]), this algorithm does not suffer from any of the issues
595	   discussed in Section 8.

597	7.2.  Category #2: Uniqueness (hard failure)

599	   One of the most trivial approaches for generating unique transient
600	   numeric identifier (with a hard failure severity) is to reduce the
601	   identifier reuse frequency by generating the numeric identifiers with
602	   a monotonically-increasing function (e.g. linear).  As a result, any
603	   of the algorithms described in Section 7.4 ("Category #4: Uniqueness,
604	   monotonically increasing within context (hard failure)") can be
605	   readily employed for complying with the requirements of this
606	   transient numeric identifier category.

608	   In cases where suitability (e.g. uniqueness) of the selected
609	   identifiers can be definitely assessed by the local system, any of
610	   the algorithms described in Section 7.1 ("Category #1: Uniqueness
611	   (soft failure)") can be readily employed for complying with the
612	   requirements of this numeric identifier category.

614	   NOTE:
615	      In the case of e.g.  TCP ephemeral ports or TCP ISNs, a transient
616	      numeric identifier that might seem suitable from the perspective
617	      of the local system, might actually be unsuitable from the
618	      perspective of the remote system (e.g., because there is state
619	      associated with the selected identifier at the remote system).
620	      Therefore, in such cases it is not possible employ the algorithms
621	      from Section 7.1 ("Category #1: Uniqueness (soft failure)").

623	7.3.  Category #3: Uniqueness, stable within context (soft failure)

625	   The goal of the following algorithm is to produce identifiers that
626	   are stable for a given context (identified by "CONTEXT"), but that
627	   change when the aforementioned context changes.

629	   In order to avoid storing in memory the transient numeric identifiers
630	   computed for each CONTEXT, the following algorithm employs a
631	   calculated technique (as opposed to keeping state in memory) to
632	   generate a stable transient numeric identifier for each given
633	   context.

635	       /* Transient Numeric ID selection function  */

637	       id_range = max_id - min_id + 1;

639	       retry = 0;

641	       do {
642	           offset = F(CONTEXT, retry, secret_key);
643	           next_id = min_id + (offset % id_range);

645	           if (suitable_id(next_id)) {
646	               return next_id;
647	           }

649	           retry++;

651	       } while (retry <= MAX_RETRIES);

653	       return ERROR;

655	   In this algorithm, the function F() provides a stateless and stable
656	   per-CONTEXT offset, where CONTEXT is the concatenation of all the
657	   elements that define the given context.

659	      For example, if this algorithm is expected to produce IPv6 IIDs
660	      that are unique per network interface and SLAAC autoconfiguration
661	      prefix, the CONTEXT should be the concatenation of e.g. the
662	      network interface index and the SLAAC autoconfiguration prefix
663	      (please see [RFC7217] for an implementation of this algorithm for
664	      generation of stable IPv6 IIDs).

666	   F() is a pseudorandom function (PRF).  It must not be computable from
667	   the outside (without knowledge of the secret key).  F() must also be
668	   difficult to reverse, such that it resists attempts to obtain the
669	   secret_key, even when given samples of the output of F() and
670	   knowledge or control of the other input parameters.  F() should
671	   produce an output of at least as many bits as required for the
672	   transient numeric identifier.  SipHash-2-4 (128-bit key, 64-bit
673	   output) [SipHash] and BLAKE3 (256-bit key, arbitrary-length output)
674	   [BLAKE3] are two possible options for F().  Alternatively, F() could
675	   be implemented with a keyed-hash message authentication code (HMAC)
676	   [RFC2104].  HMAC-SHA-256 [FIPS-SHS] would be one possible option for
677	   such implementation alternative.  Note: Use of HMAC-MD5 [RFC1321] is
678	   not recommended for F() [RFC6151].

680	   The result of F() is no more secure than the secret key, and
681	   therefore 'secret_key' must be unknown to the attacker, and must be
682	   of a reasonable length. 'secret_key' must remain stable for a given
683	   CONTEXT, since otherwise the numeric identifiers generated by this
684	   algorithm would not have the desired stability properties (i.e.,
685	   stable for a given CONTEXT).  In most cases, 'secret_key' should be
686	   selected with a PRNG (see [RFC4086] for recommendations on choosing
687	   secrets) at an appropriate time, and stored in stable or volatile
688	   storage (as necessary) for future use.

690	   The result of F() is stored in the variable 'offset', which may take
691	   any value within the storage type range, since we are restricting the
692	   resulting identifier to be in the range [min_id, max_id] in a similar
693	   way as in the algorithm described in Section 7.1.1.

695	   suitable_id() checks whether the candidate identifier has suitable
696	   uniqueness properties.  Collisions (i.e., an identifier that is not
697	   unique) are recovered by incrementing the 'retry' variable and
698	   recomputing F(), up to a maximum of MAX_RETRIES times.  However,
699	   recovering from collisions will usually result in identifiers that
700	   fail to remain constant for the specified context.  This is normally
701	   acceptable when the probability of collisions is small, as in the
702	   case of e.g.  IPv6 IIDs resulting from SLAAC [RFC7217] [RFC4941].

704	   For obvious reasons, the transient numeric identifiers generated with
705	   this algorithm allow for network activity correlation and
706	   fingerprinting within "CONTEXT".  However, this is essentially a
707	   design goal of this category of transient numeric identifiers.

709	7.4.  Category #4: Uniqueness, monotonically increasing within context
710	      (hard failure)

712	7.4.1.  Per-context Counter Algorithm

714	   One possible way of selecting unique monotonically-increasing
715	   identifiers (per context) is to employ a per-context counter.  Such
716	   an algorithm could be described as follows:

718	       /* Transient Numeric ID selection function */

720	       id_range = max_id - min_id + 1;
721	       retry = id_range;
722	       id_inc = increment() % id_range;

724	       if( (next_id = lookup_counter(CONTEXT)) == ERROR){
725	            next_id = min_id + random() % id_range;
726	       }

728	       do {
729	           if ( (max_id - next_id) >= id_inc){
730	               next_id = next_id + id_inc;
731	           }
732	           else {
733	               next_id = min_id + id_inc - (max_id - next_id);
734	           }

736	           if (suitable_id(next_id)){
737	               store_counter(CONTEXT, next_id);
738	               return next_id;
739	           }

741	           retry = retry - id_inc;

743	       } while (retry > 0);

745	       return ERROR;

747	   NOTES:
748	      increment() returns a small integer that is employed to increment
749	      the current counter value to obtain the next transient numeric
750	      identifier.  This value must be much smaller than the number of
751	      possible values for the numeric IDs (i.e., "id_range").  Most
752	      implementations of this algorithm employ a constant increment of
753	      1.  Using a value other than 1 can help mitigate some information
754	      leakages (please see below), at the expense of a possible increase
755	      in the numeric ID reuse frequency.

757	      The code above makes sure that the increment employed in the
758	      algorithm (id_inc) is always smaller than the number of possible
759	      values for the numeric IDs (i.e., "max_id - min_d + 1").  However,
760	      as noted above, this value must also be much smaller than the
761	      number of possible values for the numeric IDs.

763	      lookup_counter() is a function that returns the current counter
764	      for a given context, or an error condition if that counter does
765	      not exist.

767	      store_counter() is a function that saves a counter value for a
768	      given context.

770	      suitable_id() is a function that checks whether the resulting
771	      identifier is acceptable (e.g., whether it is not already in use,
772	      etc.).

774	   Essentially, whenever a new identifier is to be selected, the
775	   algorithm checks whether a counter for the corresponding context
776	   exists.  If does, the value of such counter is incremented to obtain
777	   the new transient numeric identifier, and the counter is updated.  If
778	   no counter exists for such context, a new counter is created and
779	   initialized to a random value, and used as the selected transient
780	   numeric identifier.  This algorithm produces a per-context counter,
781	   which results in one monotonically-increasing function for each
782	   context.  Since each counter is initialized to a random value, the
783	   resulting values are unpredictable by an off-path attacker.

785	   The choice of id_inc has implications on both the security and
786	   privacy properties of the resulting identifiers, but also on the
787	   corresponding interoperability properties.  On one hand, minimizing
788	   the increments generally minimizes the identifier reuse frequency,
789	   albeit at increased predictability.  On the other hand, if the
790	   increments are randomized, predictability of the resulting
791	   identifiers is reduced, and the information leakage produced by
792	   global constant increments is mitigated.  However, using larger
793	   increments than necessary can result in higher numeric ID reuse
794	   frequency.

796	   This algorithm has the following drawbacks:

798	   *  It requires an implementation to store each per-CONTEXT counter in
799	      memory.  If, as a result of resource management, the counter for a
800	      given context must be removed, the last transient numeric
801	      identifier value used for that context will be lost.  Thus, if
802	      subsequently an identifier needs to be generated for the same
803	      context, the corresponding counter will need to be recreated and
804	      reinitialized to a random value, thus possibly leading to reuse/
805	      collision of numeric identifiers.

807	   *  Keeping one counter for each possible "context" may in some cases
808	      be considered too onerous in terms of memory requirements.

810	   Otherwise, the identifiers produced by this algorithm do not suffer
811	   from the other issues discussed in Section 8.

813	7.4.2.  Simple PRF-Based Algorithm

815	   The goal of this algorithm is to produce monotonically-increasing
816	   transient numeric identifiers (for each given context), with a
817	   randomized initial value.  For example, if the identifiers being
818	   generated must be monotonically-increasing for each {IP Source
819	   Address, IP Destination Address} set, then each possible combination
820	   of {IP Source Address, IP Destination Address} should have a separate
821	   monotonically-increasing sequence, that starts at a different random
822	   value.

824	   Instead of maintaining a per-context counter (as in the algorithm
825	   from Section 7.4.1), the following algorithm employs a calculated
826	   technique to maintain a random offset for each possible context.

828	       /* Initialization code */
829	       counter = 0;

831	       /* Transient Numeric ID selection function  */

833	       id_range = max_id - min_id + 1;
834	       id_inc = increment() % id_range;
835	       offset = F(CONTEXT, secret_key);
836	       retry = id_range;

838	       do {
839	           next_id = min_id + (offset + counter) % id_range;
840	           counter = counter + id_inc;

842	           if (suitable_id(next_id)) {
843	               return next_id;
844	           }

846	           retry = retry - id_inc;

848	       } while (retry > 0);

850	       return ERROR;

852	   In the algorithm above, the function F() provides a (stateless)
853	   unpredictable offset for each given context (as identified by
854	   'CONTEXT').

856	   F() is a PRF, with the same properties as those specified for F() in
857	   Section 7.3.

859	   CONTEXT is the concatenation of all the elements that define a given
860	   context.  For example, if this algorithm is expected to produce
861	   identifiers that are monotonically-increasing for each set (Source IP
862	   Address, Destination IP Address), CONTEXT should be the concatenation
863	   of these two IP addresses.

865	   The function F() provides a "per-CONTEXT" fixed offset within the
866	   numeric identifier "space".  Both the 'offset' and 'counter'
867	   variables may take any value within the storage type range since we
868	   are restricting the resulting identifier to be in the range [min_id,
869	   max_id] in a similar way as in the algorithm described in
870	   Section 7.1.1.  This allows us to simply increment the 'counter'
871	   variable and rely on the unsigned integer to wrap around.

873	   The result of F() is no more secure than the secret key, and
874	   therefore 'secret_key' must be unknown to the attacker, and must be
875	   of a reasonable length. 'secret_key' must remain stable for a given
876	   CONTEXT, since otherwise the numeric identifiers generated by this
877	   algorithm would not have the desired stability properties (i.e.,
878	   monotonically-increasing for a given CONTEXT).  In most cases,
879	   'secret_key' should be selected with a PRNG (see [RFC4086] for
880	   recommendations on choosing secrets) at an appropriate time, and
881	   stored in stable or volatile storage (as necessary) for future use.

883	   It should be noted that, since this algorithm uses a global counter
884	   ("counter") for selecting identifiers (i.e., all counters share the
885	   same increments space), this algorithm results in an information
886	   leakage (as described in Section 8.2).  For example, if this
887	   algorithm were used for selecting TCP ephemeral ports, and an
888	   attacker could force a client to periodically establish a new TCP
889	   connection to an attacker-controlled system (or through an attacker-
890	   observable routing path), the attacker could subtract consecutive
891	   source port values to obtain the number of outgoing TCP connections
892	   established globally by the victim host within that time period (up
893	   to wrap-around issues and five-tuple collisions, of course).  This
894	   information leakage could be partially mitigated by employing small
895	   random values for the increments (i.e., increment() function),
896	   instead of having increment() return the constant "1".

898	   We nevertheless note that an improved mitigation of this information
899	   leakage could be more successfully achieved by employing the
900	   algorithm from Section 7.4.3, instead.

902	7.4.3.  Double-PRF Algorithm

904	   A trade-off between maintaining a single global 'counter' variable
905	   and maintaining 2**N 'counter' variables (where N is the width of the
906	   result of F()), could be achieved as follows.  The system would keep
907	   an array of TABLE_LENGTH values, which would provide a separation of
908	   the increment space into multiple buckets.  This improvement could be
909	   incorporated into the algorithm from Section 7.4.2 as follows:

911	       /* Initialization code */

913	       for(i = 0; i < TABLE_LENGTH; i++) {
914	           table[i] = random();
915	       }

917	       /* Transient Numeric ID selection function */

919	       id_range = max_id - min_id + 1;
920	       id_inc = increment() % id_range;
921	       offset = F(CONTEXT, secret_key1);
922	       index = G(CONTEXT, secret_key2) % TABLE_LENGTH;
923	       retry = id_range;

925	       do {
926	           next_id = min_id + (offset + table[index]) % id_range;
927	           table[index] = table[index] + id_inc;

929	           if (suitable_id(next_id)) {
930	               return next_id;
931	           }

933	          retry = retry - id_inc;

935	       } while (retry > 0);

937	       return ERROR;

939	   'table[]' could be initialized with random values, as indicated by
940	   the initialization code in the pseudo-code above.

942	   Both F() and G() are PRFs, with the same properties as those required
943	   for F() in Section 7.3.

945	   The results of F() and G() are no more secure than their respective
946	   secret keys ('secret_key1' and 'secret_key2', respectively), and
947	   therefore both secret keys must be unknown to the attacker, and must
948	   be of a reasonable length.  Both secret keys must remain stable for
949	   the given CONTEXT, since otherwise the transient numeric identifiers
950	   generated by this algorithm would not have the desired stability
951	   properties (i.e., monotonically-increasing for a given CONTEXT).  In
952	   most cases, both secret keys should be selected with a PRNG (see
953	   [RFC4086] for recommendations on choosing secrets) at an appropriate
954	   time, and stored in stable or volatile storage (as necessary) for
955	   future use.

957	   The 'table[]' array assures that successive transient numeric
958	   identifiers for a given context will be monotonically-increasing.
959	   Since the increments space is separated into TABLE_LENGTH different
960	   spaces, the identifier reuse frequency will be (probabilistically)
961	   lower than that of the algorithm in Section 7.4.2.  That is, the
962	   generation of an identifier for one given context will not
963	   necessarily result in increments in the identifier sequence of other
964	   contexts.  It is interesting to note that the size of 'table[]' does
965	   not limit the number of different identifier sequences, but rather
966	   separates the *increment space* into TABLE_LENGTH different spaces.
967	   The selected transient numeric identifier sequence will be obtained
968	   by adding the corresponding entry from 'table[]' to the value in the
969	   'offset' variable, which selects the actual identifier sequence space
970	   (as in the algorithm from Section 7.4.2).

972	   An attacker can perform traffic analysis for any "increment space"
973	   (i.e., context) into which the attacker has "visibility" -- namely,
974	   the attacker can force a system to generate identifiers for
975	   G(CONTEXT, secret_key2), where the result of G() identifies the
976	   target "increment space".  However, the attacker's ability to perform
977	   traffic analysis is very reduced when compared to the simple PRF-
978	   based identifiers (described in Section 7.4.2) and the predictable
979	   linear identifiers (described in Appendix A.1).  Additionally, an
980	   implementation can further limit the attacker's ability to perform
981	   traffic analysis by further separating the increment space (that is,
982	   using a larger value for TABLE_LENGTH) and/or by randomizing the
983	   increments (i.e., increment() returning a small random number as
984	   opposed to the constant "1").

986	   Otherwise, this algorithm does not suffer from the issues discussed
987	   in Section 8.

989	8.  Common Vulnerabilities Associated with Transient Numeric Identifiers

991	8.1.  Network Activity Correlation

993	   An identifier that is predictable within a given context allows for
994	   network activity correlation within that context.

996	   For example, a stable IPv6 Interface Identifier allows for network
997	   activity to be correlated within the context in which the Interface
998	   Identifier is stable [RFC7721].  A stable-per-network IPv6 Interface
999	   Identifier (as in [RFC7217]) allows for network activity correlation
1000	   within a network, whereas a constant IPv6 Interface Identifier (that
1001	   remains constant across networks) allows not only network activity
1002	   correlation within the same network, but also across networks ("host
1003	   tracking").

1005	   Similarly, an implementation that generates TCP ISNs with a global
1006	   counter could allow for fingerprinting and network activity
1007	   correlation across networks, since an attacker could passively infer
1008	   the identity of the victim based on the TCP ISNs employed for
1009	   subsequent communication instances.  Similarly, an implementation
1010	   that generates predictable IPv6 Fragment Identification values could
1011	   be subject to fingerprinting attacks (see e.g.  [Bellovin2002]).

1013	8.2.  Information Leakage

1015	   Transient numeric identifiers that result in specific patterns can
1016	   produce an information leakage to other communicating entities.  For
1017	   example, it is common to generate transient numeric identifiers with
1018	   an algorithm such as:

1020	                   ID = offset(CONTEXT) + mono(CONTEXT);

1022	   This generic expression generates identifiers by adding a
1023	   monotonically-increasing function (e.g. linear) to a randomized
1024	   offset. offset() is constant within a given context, whereas mono()
1025	   produces a monotonically-increasing sequence for the given context.
1026	   Identifiers generated with this expression will generally be
1027	   predictable within CONTEXT.

1029	   The predictability of mono(), irrespective of the predictability of
1030	   offset(), can leak information that may be of use to attackers.  For
1031	   example, a node that selects ephemeral port numbers as in:

1033	                 ephemeral_port = offset(Dest_IP) + mono()

1035	   that is, with a per-destination offset, but a global mono() function
1036	   (e.g., a global counter), will leak information about total number of
1037	   outgoing connections that have been issued by the vulnerable
1038	   implementation.

1040	   Similarly, a node that generates Fragment Identification values as
1041	   in:

1043	            Frag_ID = offset(IP_src_addr, IP_dst_addr) + mono()

1045	   will leak out information about the total number of fragmented
1046	   packets that have been transmitted by the vulnerable implementation.
1047	   The vulnerabilities described in

1049	   [Sanfilippo1998a], [Sanfilippo1998b], and [Sanfilippo1999] are all
1050	   associated with the use of a global mono() function (i.e., with a
1051	   global and constant "context") -- particularly when it is a linear
1052	   function (constant increments of 1).

1054	   Predicting transient numeric identifiers can be of help for other
1055	   types of attacks.  For example, predictable TCP ISNs can open the
1056	   door to trivial connection-reset and data injection attacks (see
1057	   Section 8.6).

1059	8.3.  Fingerprinting

1061	   Fingerprinting is the capability of an attacker to identify or re-
1062	   identify a visiting user, user agent or device via configuration
1063	   settings or other observable characteristics.  Observable protocol
1064	   objects and characteristics can be employed to identify/re-identify a
1065	   variety of entities, ranging from the underlying hardware or
1066	   Operating System (vendor, type and version), to the user itself (i.e.
1067	   his/her identity).  [EFF] illustrates web browser-based
1068	   fingerprinting, but similar techniques can be applied at other layers
1069	   and protocols, whether alternatively or in conjunction with it.

1071	   Transient numeric identifiers are one of the observable protocol
1072	   components that could be leveraged for fingerprinting purposes.  That
1073	   is, an attacker could sample transient numeric identifiers to infer
1074	   the algorithm (and its associated parameters, if any) for generating
1075	   such identifiers, possibly revealing the underlying Operating System
1076	   (OS) vendor, type, and version.  This information could possibly be
1077	   further leveraged in conjunction with other fingerprinting techniques
1078	   and sources.

1080	   Evasion of protocol-stack fingerprinting can prove to be a very
1081	   difficult task: most systems make use of a wide variety of protocols,
1082	   each of which have a large number of parameters that can be set to
1083	   arbitrary values or generated with a variety of algorithms with
1084	   multiple parameters.

1086	   NOTE:
1087	      General protocol-based fingerprinting is discussed in [RFC6973],
1088	      along with guidelines to mitigate the associated vulnerability.
1089	      [Fyodor1998] and [Fyodor2006] are classic references on Operating
1090	      System detection via TCP/IP stack fingerprinting.  Nmap [nmap] is
1091	      probably the most popular tool for remote OS identification via
1092	      active TCP/IP stack fingerprinting. p0f [Zalewski2012], on the
1093	      other hand, is a tool for performing remote OS detection via
1094	      passive TCP/IP stack fingerprinting.  Finally, [TBIT] is a TCP
1095	      fingerprinting tool that aims at characterizing the behaviour of a
1096	      remote TCP peer based on active probes, and which has been widely
1097	      used in the research community.

1099	   Algorithms that, from the perspective of an observer (e.g., the
1100	   legitimate communicating peer), result in specific values or
1101	   patterns, will allow for at least some level of fingerprinting.  For
1102	   example, the algorithm from Section 7.3 will typically allow
1103	   fingerprinting within the context where the resulting identifiers are
1104	   stable.  Similarly, the algorithms from Section 7.4 will result in a
1105	   monotonically-increasing sequences within a given context, thus
1106	   allowing for at least some level of fingerprinting (when the other
1107	   communicating entity can correlate different sampled identifiers as
1108	   belonging to the same monotonically-increasing sequence).

1110	   Thus, where possible, algorithms from Section 7.1 should be preferred
1111	   over algorithms that result in specific values or patterns.

1113	8.4.  Exploitation of the Semantics of Transient Numeric Identifiers

1115	   Identifiers that are not semantically opaque tend to be more
1116	   predictable than semantically-opaque identifiers.  For example, a MAC
1117	   address contains an OUI (Organizationally-Unique Identifier) which
1118	   identifies the vendor that manufactured the corresponding network
1119	   interface card.  This can be leveraged by an attacker trying to
1120	   "guess" MAC addresses, who has some knowledge about the possible NIC
1121	   vendor.

1123	   [RFC7707] discusses a number of techniques to reduce the search space
1124	   when performing IPv6 address-scanning attacks by leveraging the
1125	   semantics of the IIDs produced by traditional SLAAC algorithms
1126	   (eventually replaced by [RFC7217]) that embed MAC addresses in the
1127	   IID of IPv6 addresses.

1129	8.5.  Exploitation of Collisions of Transient Numeric Identifiers

1131	   In many cases, the collision of transient network identifiers can
1132	   have a hard failure severity (or result in a hard failure severity if
1133	   an attacker can cause multiple collisions deterministically, one
1134	   after another).  For example, predictable Fragment Identification
1135	   values open the door to Denial of Service (DoS) attacks (see e.g.
1136	   [RFC5722].).

1138	8.6.  Exploitation of Predictable Transient Numeric Identifiers for
1139	      Injection Attacks

1141	   Some protocols rely on "sequence numbers" for the validation of
1142	   incoming packets.  For example, TCP employs sequence numbers for
1143	   reassembling TCP segments, while IPv4 and IPv6 employ Fragment
1144	   Identification values for reassembling IPv4 and IPv6 fragments
1145	   (respectively).  Lacking built-in cryptographic mechanisms for
1146	   validating packets, these protocols are therefore vulnerable to on-
1147	   path data (see e.g.  [Joncheray1995]) and/or control-information (see
1148	   e.g.  [RFC4953] and [RFC5927]) injection attacks.  The extent to
1149	   which these protocols may resist off-path (i.e. "blind") injection
1150	   attacks depends on whether the associated "sequence numbers" are
1151	   predictable, and effort required to successfully predict a valid
1152	   "sequence number" (see e.g.  [RFC4953] and [RFC5927]).

1154	   We note that the use of unpredictable "sequence numbers" is a
1155	   completely-ineffective mitigation for on-path injection attacks, and
1156	   also a mostly-ineffective mitigation for off-path (i.e. "blind")
1157	   injection attacks.  However, many legacy protocols (such as TCP) do
1158	   not natively incorporate cryptographic mitigations, but rather only
1159	   as optional features (see e.g.  [RFC5925]), if at all available.
1160	   Additionally, ad-hoc use of cryptographic mitigations might not be
1161	   sufficient to relieve a protocol implementation of generating
1162	   appropriate transient numeric identifiers.  For example, use of the
1163	   Transport Layer Security (TLS) protocol [RFC8446] with TCP will
1164	   protect the application protocol, but will not help to mitigate e.g.
1165	   TCP-based connection-reset attacks (see e.g.  [RFC4953]).  Similarly,
1166	   use of SEcure Neighbor Discovery (SEND) [RFC3971] will still imply
1167	   reliance on the successful reassembly of IPv6 fragments in those
1168	   cases where SEND packets do not fit into the link Maximum
1169	   Transmission Unit (MTU) (see [RFC6980]).

1171	8.7.  Cryptanalysis

1173	   A number of algorithms discussed in this document (such as those
1174	   described in Section 7.4.2 and Section 7.4.3) rely on PRFs.
1175	   Implementations that employ weak PRFs or keys of inappropriate size
1176	   can be subject to cryptanalysis, where an attacker can obtain the
1177	   secret key employed for the PRF, predict numeric identifiers, etc.

1179	   Furthermore, an implementation that overloads the semantics of the
1180	   secret key can result in more trivial cryptanalysis, possibly
1181	   resulting in the leakage of the value employed for the secret key.

1183	   NOTE:
1184	      [IPID-DEV] describes two vulnerable transient numeric ID
1185	      generators that employ cryptographically-weak hash functions.
1186	      Additionally, one of such implementations employs 32-bits of a
1187	      kernel address as the secret key for a hash function, and
1188	      therefore successful cryptanalysis leaks the aforementioned kernel
1189	      address, allowing for Kernel Address Space Layout Randomization
1190	      (KASLR) [KASLR] bypass.

1192	9.  Vulnerability Assessment of Transient Numeric Identifiers

1194	   The following subsections analyze possible vulnerabilities associated
1195	   with the algorithms described in Section 7.

1197	9.1.  Category #1: Uniqueness (soft failure)

1199	   Possible vulnerabilities associated with the algorithms from
1200	   Section 7.1 include:

1202	   *  Use of flawed PRNGs (please see e.g.  [Zalewski2001],
1203	      [Zalewski2002] and [Klein2007]),

1205	   *  Inadvertently affecting the distribution of an otherwise suitable
1206	      PRNG.

1208	   An implementer should consult [RFC4086] regarding randomness
1209	   requirements for security, and consult relevant documentation when
1210	   employing a PRNG provided by the underlying system.

1212	   When employing a PRNG, many implementations "adapt" the length of its
1213	   output with a modulo operator (e.g., C language's "%"), possibly
1214	   changing the distribution of the output of the PRNG.

1216	   For example, consider an implementation that employs the following
1217	   code:

1219	                          id = random() % 50000;

1221	   This example implementation means to obtain a transient numeric
1222	   identifier in the range 0-49999.  If random() produces e.g. a
1223	   pseudorandom number of 16 bits (with uniform distribution), the
1224	   selected transient numeric identifier will have a non-uniform
1225	   distribution with the numbers in the range 0-15535 having double-
1226	   frequency than the numbers in the range 15536-49999.

1228	   NOTE:
1229	      For example, both an output of 10 and output of 50010 from the
1230	      random() function will result in an 'id' value of 10.

1232	   This effect is reduced if the PRNG produces an output that is much
1233	   longer than the length implied by the modulo operation.

1235	   Use of algorithms other than PRNGs for generating identifiers of this
1236	   category is discouraged.

1238	9.2.  Category #2: Uniqueness (hard failure)

1240	   As noted in Section 7.2, this category can employ the same algorithms
1241	   as Category #4, since a monotonically-increasing sequence tends to
1242	   minimize the transient numeric identifier reuse frequency.
1243	   Therefore, the vulnerability analysis in Section 9.4 also applies to
1244	   this category.

1246	   Additionally, as noted in Section 7.2, some transient numeric
1247	   identifiers of this category might be able to use the algorithms from
1248	   Section 7.1, in which case the same considerations as in Section 9.1
1249	   would apply.

1251	9.3.  Category #3: Uniqueness, stable within context (soft failure)

1253	   Possible vulnerabilities associated with the algorithms from
1254	   Section 7.3 are:

1256	   1.  Use of weak PRFs, or inappropriate secret keys (whether
1257	       inappropriate selection or inappropriate size) could allow for
1258	       cryptanalysis, which could eventually be exploited by an attacker
1259	       to predict future transient numeric identifiers.

1261	   2.  Since the algorithm generates a unique and stable identifier
1262	       within a specified context, it may allow for network activity
1263	       correlation and fingerprinting within the specified context.

1265	9.4.  Category #4: Uniqueness, monotonically increasing within context
1266	      (hard failure)

1268	   The algorithm described in Section 7.4.1 for generating identifiers
1269	   of Category #4 will result in an identifiable pattern (i.e. a
1270	   monotonically-increasing sequence) for the transient numeric
1271	   identifiers generated for each CONTEXT, and thus will allow for
1272	   fingerprinting and network activity correlation within each CONTEXT.

1274	   On the other hand, a simple way to generalize and analyze the
1275	   algorithms described in Section 7.4.2 and Section 7.4.3 for
1276	   generating identifiers of Category #4, is as follows:

1278	       /* Transient Numeric ID selection function */

1280	       id_range = max_id - min_id + 1;
1281	       retry = id_range;
1282	       id_inc = increment() % id_range;

1284	       do {
1285	           update_mono(CONTEXT, id_inc);
1286	           next_id = min_id + (offset(CONTEXT) + \
1287	                               mono(CONTEXT)) % id_range;

1289	           if (suitable_id(next_id)) {
1290	               return next_id;
1291	           }

1293	           retry = retry - id_inc;

1295	       } while (retry > 0);

1297	       return ERROR;

1299	   NOTE:
1300	      increment() returns a small integer that is employed to generate a
1301	      monotonically-increasing function.  Most implementations employ a
1302	      constant value for "increment()" (usually 1).  The value returned
1303	      by increment() must be much smaller than the value computed for
1304	      "id_range".

1306	      update_mono(CONTEXT, id_inc) increments the counter corresponding
1307	      to CONTEXT by "id_inc".

1309	      mono(CONTEXT) reads the counter corresponding to CONTEXT.

1311	   Essentially, an identifier (next_id) is generated by adding a
1312	   monotonically-increasing function (mono()) to an offset value,
1313	   unknown to the attacker and stable for given context (CONTEXT).

1315	   The following aspects of the algorithm should be considered:

1317	   *  For the most part, it is the offset() function that results in
1318	      identifiers that are unpredictable by an off-patch attacker.
1319	      While the resulting sequence is known to be monotonically-
1320	      increasing, the use of a randomized offset value makes the
1321	      resulting values unknown to the attacker.

1323	   *  The most straightforward "stateless" implementation of offset() is
1324	      with a PRF that takes the values that identify the context and a
1325	      "secret_key" (not shown in the figure above) as arguments.

1327	   *  One possible implementation of mono() would be to have mono()
1328	      internally employ a single counter (as in the algorithm from
1329	      Section 7.4.2), or map the increments for different contexts into
1330	      a number of counters/buckets, such that the number of counters
1331	      that need to be maintained in memory is reduced (as in the
1332	      algorithm from algorithm in Section 7.4.3).

1334	   *  In all cases, a monotonically increasing function is implemented
1335	      by incrementing the previous value of a counter by increment()
1336	      units.  In the most trivial case, increment() could return the
1337	      constant "1".  But increment() could also be implemented to return
1338	      small random integers such that the increments are unpredictable
1339	      (see Appendix A of [RFC7739]).  This represents a trade-off
1340	      between the unpredictability of the resulting transient numeric
1341	      IDs and the transient numeric ID reuse frequency.

1343	   Considering the generic algorithm illustrated above, we can identify
1344	   the following possible vulnerabilities:

1346	   *  Since the algorithms for this category are similar to those of
1347	      Section 9.3, with the addition of a monotonically-increasing
1348	      function, all the issues discussed in Section 9.3 ("Category #3:
1349	      Uniqueness, stable within context (soft failure)") also apply to
1350	      this case.

1352	   *  mono() can be correlated to the number of identifiers generated
1353	      for a given context (CONTEXT).  Thus, if mono() spans more than
1354	      the necessary context, the "increments" could be leaked to other
1355	      parties, thus disclosing information about the number of
1356	      identifiers that have been generated by the algorithm for all
1357	      contexts.  This is information disclosure becomes more evident
1358	      when an implementation employs a constant increment of 1.  For
1359	      example, an implementation where mono() is actually a single
1360	      global counter, will unnecessarily leak information the number of
1361	      identifiers that have been generated by the algorithm (globally,
1362	      for all contexts).  [Fyodor2003] is one example of how such
1363	      information leakages can be exploited.  We note that limiting the
1364	      span of the increments space will require a larger number of
1365	      counters to be stored in memory (i.e., a larger value for the
1366	      TABLE_LENGTH parameter of the algorithm in Section 7.4.3).

1368	   *  Transient numeric identifiers generated with the algorithms
1369	      described in Section 7.4.2 and Section 7.4.3 will normally allow
1370	      for fingerprinting within CONTEXT since, for such context, the
1371	      resulting identifiers will have an identifiable pattern (i.e. a
1372	      monotonically-increasing sequence).

1374	10.  IANA Considerations

1376	   This document has no IANA actions.

1378	11.  Security Considerations

1380	   The entire document is about the security and privacy implications of
1381	   transient numeric identifiers.
1382	   [I-D.gont-numeric-ids-sec-considerations] recommends that protocol
1383	   specifications specify the interoperability requirements of their
1384	   transient numeric identifiers, perform a vulnerability assessment of
1385	   their transient numeric identifiers, and suggest an algorithm for
1386	   generating each of their transient numeric identifiers.  This
1387	   document analyzes possible algorithms (and their implications) that
1388	   could be employed to comply with the interoperability properties of
1389	   most common categories of transient numeric identifiers, while
1390	   minimizing the associated negative security and privacy implications.

1392	12.  Acknowledgements

1394	   The authors would like to thank (in alphabetical order) Bernard
1395	   Aboba, Jean-Philippe Aumasson, Steven Bellovin, Luis Leon Cardenas
1396	   Graide, Guillermo Gont, Joseph Lorenzo Hall, Gre Norcie, Colin
1397	   Perkins, Vincent Roca, Shivan Sahib, Rich Salz, Martin Thomson, and
1398	   Michael Tuexen, for providing valuable comments on earlier versions
1399	   of this document.

1401	   The authors would like to thank Shivan Sahib and Christopher Wood,
1402	   for their guidance during the publication process of this document.

1404	   The authors would like to thank Jean-Philippe Aumasson and Mathew D.
1405	   Green (John Hopkins University) for kindly answering a number of
1406	   questions.

1408	   The authors would like to thank Diego Armando Maradona for his magic
1409	   and inspiration.

1411	13.  References

1413	13.1.  Normative References

1415	   [RFC0793]  Postel, J., "Transmission Control Protocol", STD 7,
1416	              RFC 793, DOI 10.17487/RFC0793, September 1981,
1417	              <https://www.rfc-editor.org/info/rfc793>.

1419	   [RFC6528]  Gont, F. and S. Bellovin, "Defending against Sequence
1420	              Number Attacks", RFC 6528, DOI 10.17487/RFC6528, February
1421	              2012, <https://www.rfc-editor.org/info/rfc6528>.

1423	   [RFC6056]  Larsen, M. and F. Gont, "Recommendations for Transport-
1424	              Protocol Port Randomization", BCP 156, RFC 6056,
1425	              DOI 10.17487/RFC6056, January 2011,
1426	              <https://www.rfc-editor.org/info/rfc6056>.

1428	   [RFC5925]  Touch, J., Mankin, A., and R. Bonica, "The TCP
1429	              Authentication Option", RFC 5925, DOI 10.17487/RFC5925,
1430	              June 2010, <https://www.rfc-editor.org/info/rfc5925>.

1432	   [RFC0791]  Postel, J., "Internet Protocol", STD 5, RFC 791,
1433	              DOI 10.17487/RFC0791, September 1981,
1434	              <https://www.rfc-editor.org/info/rfc791>.

1436	   [RFC2460]  Deering, S. and R. Hinden, "Internet Protocol, Version 6
1437	              (IPv6) Specification", RFC 2460, DOI 10.17487/RFC2460,
1438	              December 1998, <https://www.rfc-editor.org/info/rfc2460>.

1440	   [RFC8200]  Deering, S. and R. Hinden, "Internet Protocol, Version 6
1441	              (IPv6) Specification", STD 86, RFC 8200,
1442	              DOI 10.17487/RFC8200, July 2017,
1443	              <https://www.rfc-editor.org/info/rfc8200>.

1445	   [RFC4086]  Eastlake 3rd, D., Schiller, J., and S. Crocker,
1446	              "Randomness Requirements for Security", BCP 106, RFC 4086,
1447	              DOI 10.17487/RFC4086, June 2005,
1448	              <https://www.rfc-editor.org/info/rfc4086>.

1450	   [RFC4291]  Hinden, R. and S. Deering, "IP Version 6 Addressing
1451	              Architecture", RFC 4291, DOI 10.17487/RFC4291, February
1452	              2006, <https://www.rfc-editor.org/info/rfc4291>.

1454	   [RFC4941]  Narten, T., Draves, R., and S. Krishnan, "Privacy
1455	              Extensions for Stateless Address Autoconfiguration in
1456	              IPv6", RFC 4941, DOI 10.17487/RFC4941, September 2007,
1457	              <https://www.rfc-editor.org/info/rfc4941>.

1459	   [RFC4862]  Thomson, S., Narten, T., and T. Jinmei, "IPv6 Stateless
1460	              Address Autoconfiguration", RFC 4862,
1461	              DOI 10.17487/RFC4862, September 2007,
1462	              <https://www.rfc-editor.org/info/rfc4862>.

1464	   [RFC5722]  Krishnan, S., "Handling of Overlapping IPv6 Fragments",
1465	              RFC 5722, DOI 10.17487/RFC5722, December 2009,
1466	              <https://www.rfc-editor.org/info/rfc5722>.

1468	   [RFC7217]  Gont, F., "A Method for Generating Semantically Opaque
1469	              Interface Identifiers with IPv6 Stateless Address
1470	              Autoconfiguration (SLAAC)", RFC 7217,
1471	              DOI 10.17487/RFC7217, April 2014,
1472	              <https://www.rfc-editor.org/info/rfc7217>.

1474	   [RFC8064]  Gont, F., Cooper, A., Thaler, D., and W. Liu,
1475	              "Recommendation on Stable IPv6 Interface Identifiers",
1476	              RFC 8064, DOI 10.17487/RFC8064, February 2017,
1477	              <https://www.rfc-editor.org/info/rfc8064>.

1479	   [RFC6191]  Gont, F., "Reducing the TIME-WAIT State Using TCP
1480	              Timestamps", BCP 159, RFC 6191, DOI 10.17487/RFC6191,
1481	              April 2011, <https://www.rfc-editor.org/info/rfc6191>.

1483	   [RFC7323]  Borman, D., Braden, B., Jacobson, V., and R.
1484	              Scheffenegger, Ed., "TCP Extensions for High Performance",
1485	              RFC 7323, DOI 10.17487/RFC7323, September 2014,
1486	              <https://www.rfc-editor.org/info/rfc7323>.

1488	   [RFC1321]  Rivest, R., "The MD5 Message-Digest Algorithm", RFC 1321,
1489	              DOI 10.17487/RFC1321, April 1992,
1490	              <https://www.rfc-editor.org/info/rfc1321>.

1492	   [RFC6151]  Turner, S. and L. Chen, "Updated Security Considerations
1493	              for the MD5 Message-Digest and the HMAC-MD5 Algorithms",
1494	              RFC 6151, DOI 10.17487/RFC6151, March 2011,
1495	              <https://www.rfc-editor.org/info/rfc6151>.

1497	   [RFC1035]  Mockapetris, P., "Domain names - implementation and
1498	              specification", STD 13, RFC 1035, DOI 10.17487/RFC1035,
1499	              November 1987, <https://www.rfc-editor.org/info/rfc1035>.

1501	13.2.  Informative References

1503	   [KASLR]    PaX Team, "Address Space Layout Randomization",
1504	              <https://pax.grsecurity.net/docs/aslr.txt>.

1506	   [IANA-PROT]
1507	              IANA, "Protocol Registries",
1508	              <https://www.iana.org/protocols>.

1510	   [RFC6973]  Cooper, A., Tschofenig, H., Aboba, B., Peterson, J.,
1511	              Morris, J., Hansen, M., and R. Smith, "Privacy
1512	              Considerations for Internet Protocols", RFC 6973,
1513	              DOI 10.17487/RFC6973, July 2013,
1514	              <https://www.rfc-editor.org/info/rfc6973>.

1516	   [Fyodor1998]
1517	              Fyodor, "Remote OS Detection via TCP/IP Stack
1518	              Fingerprinting",  Phrack Magazine, Volume 9, Issue 54,
1519	              1998, <http://www.phrack.org/archives/issues/54/9.txt>.

1521	   [Fyodor2006]
1522	              Lyon, G., "Chapter 8. Remote OS Detection", 2006,
1523	              <https://nmap.org/book/osdetect.html>.

1525	   [nmap]     nmap, "Nmap: Free Security Scanner For Network Exploration
1526	              and Audit", 2020, <https://www.insecure.org/nmap>.

1528	   [EFF]      EFF, "Cover your tracks: See how trackers view your
1529	              browser", 2020, <https://coveryourtracks.eff.org/>.

1531	   [Schuba1993]
1532	              Schuba, C., "ADDRESSING WEAKNESSES IN THE DOMAIN NAME
1533	              SYSTEM PROTOCOL", 1993,
1534	              <http://ftp.cerias.purdue.edu/pub/papers/christoph-schuba/
1535	              schuba-DNS-msthesis.pdf>.

1537	   [TBIT]     TBIT, "TBIT, the TCP Behavior Inference Tool", 2001,
1538	              <https://www.icir.org/tbit/>.

1540	   [Zalewski2012]
1541	              Zalewski, M., "p0f v3 (version 3.09b)", 2012,
1542	              <https://lcamtuf.coredump.cx/p0f.shtml>.

1544	   [RFC2104]  Krawczyk, H., Bellare, M., and R. Canetti, "HMAC: Keyed-
1545	              Hashing for Message Authentication", RFC 2104,
1546	              DOI 10.17487/RFC2104, February 1997,
1547	              <https://www.rfc-editor.org/info/rfc2104>.

1549	   [RFC7098]  Carpenter, B., Jiang, S., and W. Tarreau, "Using the IPv6
1550	              Flow Label for Load Balancing in Server Farms", RFC 7098,
1551	              DOI 10.17487/RFC7098, January 2014,
1552	              <https://www.rfc-editor.org/info/rfc7098>.

1554	   [RFC7258]  Farrell, S. and H. Tschofenig, "Pervasive Monitoring Is an
1555	              Attack", BCP 188, RFC 7258, DOI 10.17487/RFC7258, May
1556	              2014, <https://www.rfc-editor.org/info/rfc7258>.

1558	   [CPNI-TCP] Gont, F., "Security Assessment of the Transmission Control
1559	              Protocol (TCP)",  United Kingdom's Centre for the
1560	              Protection of National Infrastructure (CPNI) Technical
1561	              Report, 2009, <https://www.gont.com.ar/papers/tn-03-09-
1562	              security-assessment-TCP.pdf>.

1564	   [Zalewski2001]
1565	              Zalewski, M., "Strange Attractors and TCP/IP Sequence
1566	              Number Analysis", 2001,
1567	              <https://lcamtuf.coredump.cx/oldtcp/tcpseq.html>.

1569	   [Zalewski2002]
1570	              Zalewski, M., "Strange Attractors and TCP/IP Sequence
1571	              Number Analysis - One Year Later", 2001,
1572	              <https://lcamtuf.coredump.cx/newtcp/>.

1574	   [Joncheray1995]
1575	              Joncheray, L., "A Simple Active Attack Against TCP", Proc.
1576	              Fifth Usenix UNIX Security Symposium, 1995, <https://www.u
1577	              senix.org/legacy/publications/library/proceedings/
1578	              security95/full_papers/joncheray.pdf>.

1580	   [Morris1985]
1581	              Morris, R., "A Weakness in the 4.2BSD UNIX TCP/IP
1582	              Software", CSTR 117, AT&T Bell Laboratories, Murray Hill,
1583	              NJ, 1985,
1584	              <https://pdos.csail.mit.edu/~rtm/papers/117.pdf>.

1586	   [Shimomura1995]
1587	              Shimomura, T., "Technical details of the attack described
1588	              by Markoff in NYT", Message posted in USENET's
1589	              comp.security.misc newsgroup Message-ID:
1590	              <3g5gkl$5j1@ariel.sdsc.edu>, 1995,
1591	              <https://www.gont.com.ar/docs/post-shimomura-usenet.txt>.

1593	   [RFC5927]  Gont, F., "ICMP Attacks against TCP", RFC 5927,
1594	              DOI 10.17487/RFC5927, July 2010,
1595	              <https://www.rfc-editor.org/info/rfc5927>.

1597	   [RFC4953]  Touch, J., "Defending TCP Against Spoofing Attacks",
1598	              RFC 4953, DOI 10.17487/RFC4953, July 2007,
1599	              <https://www.rfc-editor.org/info/rfc4953>.

1601	   [RFC8446]  Rescorla, E., "The Transport Layer Security (TLS) Protocol
1602	              Version 1.3", RFC 8446, DOI 10.17487/RFC8446, August 2018,
1603	              <https://www.rfc-editor.org/info/rfc8446>.

1605	   [RFC3971]  Arkko, J., Ed., Kempf, J., Zill, B., and P. Nikander,
1606	              "SEcure Neighbor Discovery (SEND)", RFC 3971,
1607	              DOI 10.17487/RFC3971, March 2005,
1608	              <https://www.rfc-editor.org/info/rfc3971>.

1610	   [RFC6980]  Gont, F., "Security Implications of IPv6 Fragmentation
1611	              with IPv6 Neighbor Discovery", RFC 6980,
1612	              DOI 10.17487/RFC6980, August 2013,
1613	              <https://www.rfc-editor.org/info/rfc6980>.

1615	   [RFC7739]  Gont, F., "Security Implications of Predictable Fragment
1616	              Identification Values", RFC 7739, DOI 10.17487/RFC7739,
1617	              February 2016, <https://www.rfc-editor.org/info/rfc7739>.

1619	   [RFC4963]  Heffner, J., Mathis, M., and B. Chandler, "IPv4 Reassembly
1620	              Errors at High Data Rates", RFC 4963,
1621	              DOI 10.17487/RFC4963, July 2007,
1622	              <https://www.rfc-editor.org/info/rfc4963>.

1624	   [RFC6274]  Gont, F., "Security Assessment of the Internet Protocol
1625	              Version 4", RFC 6274, DOI 10.17487/RFC6274, July 2011,
1626	              <https://www.rfc-editor.org/info/rfc6274>.

1628	   [Bellovin1989]
1629	              Bellovin, S., "Security Problems in the TCP/IP Protocol
1630	              Suite", Computer Communications Review, vol. 19, no. 2,
1631	              pp. 32-48, 1989,
1632	              <https://www.cs.columbia.edu/~smb/papers/ipext.pdf>.

1634	   [Bellovin2002]
1635	              Bellovin, S. M., "A Technique for Counting NATted Hosts",
1636	              IMW'02 Nov. 6-8, 2002, Marseille, France, 2002,
1637	              <https://www.cs.columbia.edu/~smb/papers/fnat.pdf>.

1639	   [Fyodor2003]
1640	              Fyodor, "Idle scanning and related IP ID games", 2003,
1641	              <https://nmap.org/presentations/CanSecWest03/CD_Content/
1642	              idlescan_paper/idlescan.html>.

1644	   [Sanfilippo1998a]
1645	              Sanfilippo, S., "about the ip header id", Post to Bugtraq
1646	              mailing-list, Mon Dec 14 1998,
1647	              <http://seclists.org/bugtraq/1998/Dec/48>.

1649	   [Sanfilippo1998b]
1650	              Sanfilippo, S., "Idle scan", Post to Bugtraq mailing-list,
1651	              1998, <https://github.com/antirez/hping/raw/master/docs/
1652	              SPOOFED_SCAN.txt>.

1654	   [Sanfilippo1999]
1655	              Sanfilippo, S., "more ip id", Post to Bugtraq mailing-
1656	              list, 1999,
1657	              <https://github.com/antirez/hping/raw/master/docs/MORE-
1658	              FUN-WITH-IPID>.

1660	   [Silbersack2005]
1661	              Silbersack, M.J., "Improving TCP/IP security through
1662	              randomization without sacrificing interoperability",
1663	              EuroBSDCon 2005 Conference, 2005,
1664	              <https://citeseerx.ist.psu.edu/viewdoc/
1665	              download?doi=10.1.1.91.4542&rep=rep1&type=pdf>.

1667	   [Klein2007]
1668	              Klein, A., "OpenBSD DNS Cache Poisoning and Multiple O/S
1669	              Predictable IP ID Vulnerability", 2007,
1670	              <https://dl.packetstormsecurity.net/papers/attack/OpenBSD_
1671	              DNS_Cache_Poisoning_and_Multiple_OS_Predictable_IP_ID_Vuln
1672	              erability.pdf>.

1674	   [IPID-DEV] Klein, A. and B. Pinkas, "From IP ID to Device ID and
1675	              KASLR Bypass (Extended Version)", June 2019,
1676	              <https://arxiv.org/pdf/1906.10478.pdf>.

1678	   [I-D.irtf-pearg-numeric-ids-history]
1679	              Gont, F. and I. Arce, "Unfortunate History of Transient
1680	              Numeric Identifiers", Work in Progress, Internet-Draft,
1681	              draft-irtf-pearg-numeric-ids-history-09, 27 January 2022,
1682	              <https://www.ietf.org/archive/id/draft-irtf-pearg-numeric-
1683	              ids-history-09.txt>.

1685	   [I-D.gont-numeric-ids-sec-considerations]
1686	              Gont, F. and I. Arce, "Security Considerations for
1687	              Transient Numeric Identifiers Employed in Network
1688	              Protocols", Work in Progress, Internet-Draft, draft-gont-
1689	              numeric-ids-sec-considerations-06, 5 December 2020,
1690	              <https://www.ietf.org/archive/id/draft-gont-numeric-ids-
1691	              sec-considerations-06.txt>.

1693	   [RFC7721]  Cooper, A., Gont, F., and D. Thaler, "Security and Privacy
1694	              Considerations for IPv6 Address Generation Mechanisms",
1695	              RFC 7721, DOI 10.17487/RFC7721, March 2016,
1696	              <https://www.rfc-editor.org/info/rfc7721>.

1698	   [RFC7707]  Gont, F. and T. Chown, "Network Reconnaissance in IPv6
1699	              Networks", RFC 7707, DOI 10.17487/RFC7707, March 2016,
1700	              <https://www.rfc-editor.org/info/rfc7707>.

1702	   [TCPT-uptime]
1703	              McDanel, B., "TCP Timestamping - Obtaining System Uptime
1704	              Remotely", 14 March 2001,
1705	              <https://securiteam.com/securitynews/5np0c153pi/>.

1707	   [SipHash]  Aumasson, J. P. and D. J. Bernstein, "SipHash: a fast
1708	              short-input PRF", 2012,
1709	              <https://github.com/veorq/SipHash>.

1711	   [BLAKE3]   O'Connor, J., Aumasson, J. P., Neves, S., and Z. Wilcox-
1712	              O'Hearn, "BLAKE3: one function, fast everywhere", 2020,
1713	              <https://blake3.io/>.

1715	   [FIPS-SHS] NIST, "Secure Hash Standard (SHS)",  Federal Information
1716	              Processing Standards Publication 180-4, August 2015,
1717	              <https://nvlpubs.nist.gov/nistpubs/FIPS/
1718	              NIST.FIPS.180-4.pdf>.

1720	Appendix A.  Algorithms and Techniques with Known Issues

1722	   The following subsections discuss algorithms and techniques with
1723	   known negative security and privacy implications.

1725	   NOTE:
1726	      As discussed in Section 1, the use of cryptographic techniques
1727	      might allow for the safe use of some of these algorithms and
1728	      techniques.  However, this should be evaluated on a case by case
1729	      basis.

1731	A.1.  Predictable Linear Identifiers Algorithm

1733	   One of the most trivial ways to achieve uniqueness with a low
1734	   identifier reuse frequency is to produce a linear sequence.  This
1735	   type of algorithm has been employed in the past to generate
1736	   identifiers of Categories #1, #2, and #4 (please see Section 6 for an
1737	   analysis of these categories).

1739	   For example, the following algorithm has been employed (see e.g.
1740	   [Morris1985], [Shimomura1995], [Silbersack2005] and [CPNI-TCP]) in a
1741	   number of operating systems for selecting IP fragment IDs, TCP
1742	   ephemeral ports, etc.:

1744	       /* Initialization code */

1746	       next_id = min_id;
1747	       id_inc= 1;

1749	       /* Transient Numeric ID selection function */

1751	       id_range = max_id - min_id + 1;
1752	       retry = id_range;

1754	       do {
1755	           if (next_id == max_id) {
1756	               next_id = min_id;
1757	           }
1758	           else {
1759	               next_id = next_id + id_inc;
1760	           }

1762	           if (suitable_id(next_id)) {
1763	               return next_id;
1764	           }

1766	           retry--;

1768	       } while (retry > 0);

1770	       return ERROR;

1772	   NOTE:
1773	      suitable_id() is a function that checks whether the resulting
1774	      identifier is acceptable (e.g., whether it's in use, etc.).

1776	   For obvious reasons, this algorithm results in predictable sequences.
1777	   Since a global counter is used to generate the transient numeric
1778	   identifiers ("next_id" in the example above), an entity that learns
1779	   one numeric identifier can infer past numeric identifiers and predict
1780	   future values to be generated by the same algorithm.  Since the value
1781	   employed for the increments is known (such as "1" in this case), an
1782	   attacker can sample two values, and learn the number of identifiers
1783	   that have been were generated in-between the two sampled values.
1784	   Furthermore, if the counter is initialized e.g. when the system its
1785	   bootstrapped to some known value, the algorithm will leak additional
1786	   information, such as the number of transmitted fragmented datagrams
1787	   in the case of an IP ID generator [Sanfilippo1998a], or the system
1788	   uptime in the case of TCP timestamps [TCPT-uptime].

1790	A.2.  Random-Increments Algorithm

1792	   This algorithm offers a middle ground between the algorithms that
1793	   generate randomized transient numeric identifiers (such as those
1794	   described in Section 7.1.1 and Section 7.1.2), and those that
1795	   generate identifiers with a predictable monotonically-increasing
1796	   function (see Appendix A.1).

1798	       /* Initialization code */

1800	       next_id = random();        /* Initialization value */
1801	       id_rinc = 500;             /* Determines the trade-off */

1803	       /* Transient Numeric ID selection function */

1805	       id_range = max_id - min_id + 1;
1806	       retry = id_range;

1808	       do {
1809	           /* Random increment */
1810	           id_inc = (random() % id_rinc) + 1;

1812	           if ( (max_id - next_id) >= id_inc){
1813	               next_id = next_id + id_inc;
1814	           }
1815	           else {
1816	               next_id = min_id + id_inc - (max_id - next_id);
1817	           }

1819	           if (suitable_id(next_id)) {
1820	              return next_id;
1821	           }

1823	           retry = retry - id_inc;

1825	       } while (retry > 0);

1827	       return ERROR;

1829	   This algorithm aims at producing a global monotonically-increasing
1830	   sequence of transient numeric identifiers, while avoiding the use of
1831	   fixed increments, which would lead to trivially predictable
1832	   sequences.  The value "id_inc" allows for direct control of the
1833	   trade-off between unpredictability and identifier reuse frequency.
1834	   The smaller the value of "id_inc", the more similar this algorithm is
1835	   to a predicable, global linear ID generation algorithm (as the one in
1836	   Appendix A.1).  The larger the value of "id_inc", the more similar
1837	   this algorithm is to the algorithm described in Section 7.1.1 of this
1838	   document.

1840	   When the identifiers wrap, there is a risk of collisions of transient
1841	   numeric identifiers (i.e., identifier reuse).  Therefore, "id_inc"
1842	   should be selected according to the following criteria:

1844	   *  It should maximize the wrapping time of the identifier space.

1846	   *  It should minimize identifier reuse frequency.

1848	   *  It should maximize unpredictability.

1850	   Clearly, these are competing goals, and the decision of which value
1851	   of "id_inc" to use is a trade-off.  Therefore, the value of "id_inc"
1852	   is at times a configurable parameter so that system administrators
1853	   can make the trade-off for themselves.  We note that the alternative
1854	   algorithms discussed throughout this document offer better
1855	   interoperability, security and privacy properties than this
1856	   algorithm, and hence implementation of this algorithm is discouraged.

1858	A.3.  Re-using Identifiers Across Different Contexts

1860	   Employing the same identifier across contexts in which stability is
1861	   not required (i.e. overloading the semantics of transient numeric
1862	   identifier) usually has negative security and privacy implications.

1864	   For example, in order to generate transient numeric identifiers of
1865	   Category #2 or Category #3, an implementation or specification might
1866	   be tempted to employ a source for the numeric identifiers which is
1867	   known to provide unique values, but that may also be predictable or
1868	   leak information related to the entity generating the identifier.
1869	   This technique has been employed in the past for e.g. generating IPv6
1870	   IIDs by re-using the MAC address of the underlying network interface.
1871	   However, as noted in [RFC7721] and [RFC7707], embedding link-layer
1872	   addresses in IPv6 IIDs not only results in predictable values, but
1873	   also leaks information about the manufacturer of the underlying
1874	   network interface card, allows for network activity correlation, and
1875	   makes address-based scanning attacks feasible.

1877	Authors' Addresses

1879	   Fernando Gont
1880	   EdgeUno
1881	   Segurola y Habana 4310 7mo piso
1882	   Ciudad Autonoma de Buenos Aires
1883	   Buenos Aires
1884	   Argentina

1886	   Email: fernando.gont@edgeuno.com
1887	   URI:   https://www.edgeuno.com

1889	   Ivan Arce
1890	   Quarkslab
1891	   Segurola y Habana 4310 7mo piso
1892	   Ciudad Autonoma de Buenos Aires
1893	   Buenos Aires
1894	   Argentina

1896	   Email: iarce@quarkslab.com
1897	   URI:   https://www.quarkslab.com