idnits 2.17.1 

draft-irtf-pearg-numeric-ids-generation-06.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (January 13, 2021) is 1199 days in the past.  Is this
     intentional?

  -- Found something which looks like a code comment -- if you have code
     sections in the document, please surround them with '<CODE BEGINS>' and
     '<CODE ENDS>' lines.


  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  ** Obsolete normative reference: RFC  793 (Obsoleted by RFC 9293)

  ** Obsolete normative reference: RFC 2460 (Obsoleted by RFC 8200)

  ** Obsolete normative reference: RFC 4941 (Obsoleted by RFC 8981)

  ** Obsolete normative reference: RFC 6528 (Obsoleted by RFC 9293)

  == Outdated reference: A later version (-11) exists of
     draft-gont-numeric-ids-sec-considerations-06

  == Outdated reference: A later version (-11) exists of
     draft-irtf-pearg-numeric-ids-history-06


     Summary: 4 errors (**), 0 flaws (~~), 3 warnings (==), 2 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Internet Research Task Force (IRTF)                              F. Gont
3	Internet-Draft                                              SI6 Networks
4	Intended status: Informational                                   I. Arce
5	Expires: July 17, 2021                                         Quarkslab
6	                                                        January 13, 2021

8	           On the Generation of Transient Numeric Identifiers
9	               draft-irtf-pearg-numeric-ids-generation-06

11	Abstract

13	   This document performs an analysis of the security and privacy
14	   implications of different types of "transient numeric identifiers"
15	   used in IETF protocols, and tries to categorize them based on their
16	   interoperability requirements and their associated failure severity
17	   when such requirements are not met.  Subsequently, it provides advice
18	   on possible algorithms that could be employed to satisfy the
19	   interoperability requirements of each identifier category, while
20	   minimizing the negative security and privacy implications, thus
21	   providing guidance to protocol designers and protocol implementers.
22	   Finally, it describes a number of algorithms that have been employed
23	   in real implementations to generate transient numeric identifiers,
24	   and analyzes their security and privacy properties.  This document is
25	   a product of the Privacy Enhancement and Assessment Research Group
26	   (PEARG) in the IRTF.

28	Status of This Memo

30	   This Internet-Draft is submitted in full conformance with the
31	   provisions of BCP 78 and BCP 79.

33	   Internet-Drafts are working documents of the Internet Engineering
34	   Task Force (IETF).  Note that other groups may also distribute
35	   working documents as Internet-Drafts.  The list of current Internet-
36	   Drafts is at https://datatracker.ietf.org/drafts/current/.

38	   Internet-Drafts are draft documents valid for a maximum of six months
39	   and may be updated, replaced, or obsoleted by other documents at any
40	   time.  It is inappropriate to use Internet-Drafts as reference
41	   material or to cite them other than as "work in progress."

43	   This Internet-Draft will expire on July 17, 2021.

45	Copyright Notice

47	   Copyright (c) 2021 IETF Trust and the persons identified as the
48	   document authors.  All rights reserved.

50	   This document is subject to BCP 78 and the IETF Trust's Legal
51	   Provisions Relating to IETF Documents
52	   (https://trustee.ietf.org/license-info) in effect on the date of
53	   publication of this document.  Please review these documents
54	   carefully, as they describe your rights and restrictions with respect
55	   to this document.

57	Table of Contents

59	   1.  Introduction
60	   2.  Terminology
61	   3.  Threat Model
62	   4.  Issues with the Specification of Transient Numeric
63	       Identifiers
64	   5.  Protocol Failure Severity
65	   6.  Categorizing Transient Numeric Identifiers
66	   7.  Common Algorithms for Transient Numeric Identifier Generation
67	     7.1.  Category #1: Uniqueness (soft failure)
68	     7.2.  Category #2: Uniqueness (hard failure)
69	     7.3.  Category #3: Uniqueness, stable within context (soft
70	           failure)
71	     7.4.  Category #4: Uniqueness, monotonically increasing within
72	           context (hard failure)
73	   8.  Common Vulnerabilities Associated with Transient Numeric
74	       Identifiers
75	     8.1.  Network Activity Correlation
76	     8.2.  Information Leakage
77	     8.3.  Fingerprinting
78	     8.4.  Exploitation of the Semantics of Transient Numeric
79	           Identifiers
80	     8.5.  Exploitation of Collisions of Transient Numeric
81	           Identifiers
82	     8.6.  Exploitation of Predictable Transient Numeric Identifiers
83	           for Injection Attacks
84	     8.7.  Cryptanalysis
85	   9.  Vulnerability Assessment of Transient Numeric Identifiers
86	     9.1.  Category #1: Uniqueness (soft failure)
87	     9.2.  Category #2: Uniqueness (hard failure)
88	     9.3.  Category #3: Uniqueness, stable within context (soft
89	           failure)
90	     9.4.  Category #4: Uniqueness, monotonically increasing within
91	           context (hard failure)
92	   10. IANA Considerations
93	   11. Security Considerations
94	   12. Acknowledgements
95	   13. References
96	     13.1.  Normative References
97	     13.2.  Informative References
98	   Appendix A.  Algorithms and Techniques with Known Issues
99	     A.1.  Predictable Linear Identifiers Algorithm
100	     A.2.  Random-Increments Algorithm
101	     A.3.  Re-using Identifiers Across Different Contexts
102	   Authors' Addresses

104	1.  Introduction

106	   Networking protocols employ a variety of transient numeric
107	   identifiers for different protocol objects, such as IPv4 and IPv6
108	   Fragment Identifiers [RFC0791] [RFC8200], IPv6 Interface Identifiers
109	   (IIDs) [RFC4291], transport protocol ephemeral port numbers
110	   [RFC6056], TCP Initial Sequence Numbers (ISNs) [RFC0793], and DNS
111	   Transaction IDs (TxIDs) [RFC1035].  These identifiers usually have
112	   specific interoperability requirements (e.g. uniqueness during a
113	   specified period of time) that must be satisfied to avoid negative
114	   interoperability implications, and an
115	   associated failure severity when such requirements are not met,
116	   ranging from soft to hard failures.

118	   For more than 30 years, a large number of implementations of the TCP/
119	   IP protocol suite have been subject to a variety of attacks, with
120	   effects ranging from Denial of Service (DoS) or data injection, to
121	   information leakages that could be exploited for pervasive monitoring
122	   [RFC7258].  The root cause of these issues has been, in many cases,
123	   the poor selection of transient numeric identifiers in such
124	   protocols, usually as a result of insufficient or misleading
125	   specifications.  While it is generally trivial to identify an
126	   algorithm that can satisfy the interoperability requirements of a
127	   given transient numeric identifier, empirical evidence exists that
128	   doing so without negatively affecting the security and/or privacy
129	   properties of the aforementioned protocols is prone to error
130	   [I-D.irtf-pearg-numeric-ids-history].

132	   For example, implementations have been subject to security and/or
133	   privacy issues resulting from:

135	   o  Predictable IPv4 or IPv6 Fragment Identifiers (see e.g.
136	      [Sanfilippo1998a], [RFC6274], and [RFC7739])

138	   o  Predictable IPv6 IIDs (see e.g.  [RFC7721], [RFC7707], and
139	      [RFC7217])

141	   o  Predictable transport protocol ephemeral port numbers (see e.g.
142	      [RFC6056] and [Silbersack2005])

144	   o  Predictable TCP Initial Sequence Numbers (ISNs) (see e.g.
145	      [Morris1985], [Bellovin1989], and [RFC6528])

147	   o  Predictable initial timestamp in TCP timestamps Options (see e.g.
148	      [TCPT-uptime] and [RFC7323])

150	   o  Predictable DNS TxIDs (see e.g.  [Schuba1993] and [Klein2007])

152	   Recent history indicates that when new protocols are standardized or
153	   new protocol implementations are produced, the security and privacy
154	   properties of the associated transient numeric identifiers tend to be
155	   overlooked, and inappropriate algorithms to generate transient
156	   numeric identifiers are either suggested in the specifications or
157	   selected by implementers.  As a result, it should be evident that
158	   advice in this area is warranted.

160	   We note that the use of cryptographic techniques may readily mitigate
161	   some of the issues arising from predictable transient numeric
162	   identifiers.  For example, cryptographic integrity and authentication
163	   can readily mitigate data injection attacks even in the presence of
164	   predictable transient numeric identifiers (such as "sequence
165	   numbers").  However, use of flawed algorithms (such as global
166	   counters) for generating transient numeric identifiers could still
167	   result in information leakages even when cryptographic techniques are
168	   employed.

170	   This document contains a non-exhaustive survey of transient numeric
171	   identifiers employed in various IETF protocols, and aims to
172	   categorize such identifiers based on their interoperability
173	   requirements, and the associated failure severity when such
174	   requirements are not met.  Subsequently, it provides advice on
175	   possible algorithms that could be employed to satisfy the
176	   interoperability requirements of each category, while minimizing
177	   negative security and privacy implications.  Finally, it describes
178	   several algorithms that have been employed in real implementations to
179	   meet such requirements, and analyzes their security and privacy
180	   properties.

182	   This document represents the consensus of the Privacy Enhancement and
183	   Assessment Research Group (PEARG).

185	2.  Terminology

187	   Transient Numeric Identifier:
188	      A data object in a protocol specification that can be used to
189	      definitely distinguish a protocol object (a datagram, network
190	      interface, transport protocol endpoint, session, etc.) from all
191	      other objects of the same type, in a given context.  Transient
192	      numeric identifiers are usually defined as a series of bits, and
193	      represented using integer values.  These identifiers are typically
194	      dynamically selected, as opposed to statically-assigned numeric
195	      identifiers (see e.g.  [IANA-PROT]).  We note that different
196	      transient numeric identifiers may have additional requirements or
197	      properties depending on their specific use in a protocol.  We use
198	      the term "transient numeric identifier" (or simply "numeric
199	      identifier" or "identifier" as short forms) as a generic term to
200	      refer to any data object in a protocol specification that
201	      satisfies the identification property stated above.

203	   Failure Severity:
204	      The consequences of a failure to comply with the interoperability
205	      requirements of a given identifier.  Severity considers the worst
206	      potential consequence of a failure, determined by the system
207	      damage and/or time lost to repair the failure.  In this document
208	      we define two types of failure severity: "soft failure" and "hard
209	      failure".

211	   Soft Failure:
212	      A soft failure is a recoverable condition in which a protocol does
213	      not operate in the prescribed manner but normal operation can be
214	      resumed automatically in a short period of time.  For example, a
215	      simple packet-loss event that is subsequently recovered with a
216	      packet-retransmission can be considered a soft failure.

218	   Hard Failure:
219	      A hard failure is a non-recoverable condition in which a protocol
220	      does not operate in the prescribed manner or it operates with
221	      excessive degradation of service.  For example, an established TCP
222	      connection that is aborted due to an error condition constitutes,
223	      from the point of view of the transport protocol, a hard failure,
224	      since it enters a state from which normal operation cannot be
225	      resumed.

227	3.  Threat Model

229	   Throughout this document, we assume an attacker does not have
230	   physical or logical access to the system(s) being attacked, and
231	   cannot observe the packets being transferred between the sender and
232	   the receiver(s) of the target protocol (if any).  However, we assume
233	   the attacker can send any traffic to the target device(s), to e.g.
234	   sample transient numeric identifiers employed by such device(s).

236	4.  Issues with the Specification of Transient Numeric Identifiers

238	   While assessing protocol specifications regarding the use of
239	   transient numeric identifiers, we have found that most of the issues
240	   discussed in this document arise as a result of one of the following
241	   conditions:

243	   o  Protocol specifications that under-specify the requirements for
244	      their transient numeric identifiers

246	   o  Protocol specifications that over-specify their transient numeric
247	      identifiers

249	   o  Protocol implementations that simply fail to comply with the
250	      specified requirements

252	   A number of protocol specifications (too many of them) have simply
253	   overlooked the security and privacy implications of transient numeric
254	   identifiers [I-D.irtf-pearg-numeric-ids-history].  Examples include
255	   the specification of TCP ephemeral ports in [RFC0793], the
256	   specification of TCP sequence numbers in [RFC0793], and the
257	   specification of the DNS TxID in [RFC1035].

259	   On the other hand, there are a number of protocol specifications that
260	   over-specify some of their associated transient numeric identifiers.
261	   For example, [RFC4291] essentially overloads the semantics of IPv6
262	   Interface Identifiers (IIDs) by embedding link-layer addresses in the
263	   IPv6 IIDs, when the interoperability requirement of uniqueness could
264	   be achieved in other ways that do not result in negative security and
265	   privacy implications [RFC7721].  Similarly, [RFC2460] suggested the
266	   use of a global counter for the generation of Fragment Identification
267	   values, when the interoperability properties of uniqueness per {IPv6
268	   Source Address, IPv6 Destination Address} could be achieved with
269	   other algorithms that do not result in negative security and privacy
270	   implications [RFC7739].

272	   Finally, there are protocol implementations that simply fail to
273	   comply with existing protocol specifications.  For example, some
274	   popular operating systems (notably Microsoft Windows) still fail to
275	   implement transport protocol ephemeral port randomization, as
276	   recommended in [RFC6056].

278	5.  Protocol Failure Severity

280	   Section 2 defines the concept of "Failure Severity", along with two
281	   types of failure severities that we employ throughout this document:
282	   soft and hard.

284	   Our analysis of the severity of a failure is performed from the point
285	   of view of the protocol in question.  However, the corresponding
286	   severity on the upper protocol (or application) might not be the same
287	   as that of the protocol in question.  For example, a TCP connection
288	   that is aborted might or might not result in a hard failure of the
289	   upper application: if the upper application can establish a new TCP
290	   connection without any impact on the application, a hard failure at
291	   the TCP protocol may have no severity at the application level.  On
292	   the other hand, if a hard failure of a TCP connection results in
293	   excessive degradation of service at the application layer, it will
294	   also result in a hard failure at the application.

296	6.  Categorizing Transient Numeric Identifiers

298	   This section includes a non-exhaustive survey of transient numeric
299	   identifiers, and proposes a number of categories that can accommodate
300	   these identifiers based on their interoperability requirements and
301	   their associated failure severity (soft or hard).

303	   +------------------+---------------------------------+--------------+
304	   |    Identifier    |  Interoperability Requirements  |   Failure    |
305	   |                  |                                 |   Severity   |
306	   +------------------+---------------------------------+--------------+
307	   |   IPv6 Frag ID   |    Uniqueness (for IP address   |  Soft/Hard   |
308	   |                  |              pair)              |     (1)      |
309	   +------------------+---------------------------------+--------------+
310	   |     IPv6 IID     |  Uniqueness (and stable within  |   Soft (3)   |
311	   |                  |         IPv6 prefix) (2)        |              |
312	   +------------------+---------------------------------+--------------+
313	   |     TCP ISN      |   Monotonically-increasing (4)  |   Hard (4)   |
314	   +------------------+---------------------------------+--------------+
315	   |   TCP initial    |   Monotonically-increasing (5)  |   Hard (5)   |
316	   |    timestamps    |                                 |              |
317	   +------------------+---------------------------------+--------------+
318	   |  TCP eph. port   |  Uniqueness (for connection ID) |     Hard     |
319	   +------------------+---------------------------------+--------------+
320	   | IPv6 Flow Label  |            Uniqueness           |   None (6)   |
321	   +------------------+---------------------------------+--------------+
322	   |     DNS TxID     |            Uniqueness           |   None (7)   |
323	   +------------------+---------------------------------+--------------+

325	             Table 1: Survey of Transient Numeric Identifiers

327	   Notes:

329	   (1)
330	      While a single collision of Fragment ID values would simply lead
331	      to a single packet drop (and hence a "soft" failure), repeated
332	      collisions at high data rates might trash the Fragment ID space,
333	      leading to a hard failure [RFC4963].

335	   (2)
336	      While the interoperability requirements are simply that the
337	      Interface ID results in a unique IPv6 address, for operational
338	      reasons it is typically desirable that the resulting IPv6 address
339	      (and hence the corresponding Interface ID) be stable within each
340	      network [RFC7217] [RFC8064].

342	   (3)
343	      While IPv6 Interface IDs must result in unique IPv6 addresses,
344	      IPv6 Duplicate Address Detection (DAD) [RFC4862] allows for the
345	      detection of duplicate addresses, and hence such Interface ID
346	      collisions can be recovered.

348	   (4)
349	      In theory, there are no interoperability requirements for TCP
350	      Initial Sequence Numbers (ISNs), since the TIME-WAIT state and
351	      TCP's "quiet time" concept take care of old segments from previous
352	      incarnations of a connection.  However, a widespread optimization
353	      allows for a new incarnation of a previous connection to be
354	      created if the ISN of the incoming SYN is larger than the last
355	      sequence number seen in that direction for the previous
356	      incarnation of the connection.  Thus, monotonically-increasing TCP
357	      ISNs allow for such optimization to work as expected [RFC6528],
358	      and can help avoid connection-establishment failures.

360	   (5)
361	      Strictly speaking, there are no interoperability requirements for
362	      the *initial* TCP timestamp employed by a TCP instance (i.e., the
363	      TS Value (TSval) in a segment with the SYN bit set).  However,
364	      some TCP implementations allow a new incarnation of a previous
365	      connection to be created if the TSval of the incoming SYN is
366	      larger than the last TSval seen in that direction for the previous
367	      incarnation of the connection (please see [RFC6191]).  Thus,
368	      monotonically-increasing TCP initial timestamps (across
369	      connections to the same endpoint) allow for such optimization to
370	      work as expected [RFC6191], and can help avoid connection-
371	      establishment failures.

373	   (6)
374	      The IPv6 Flow Label is typically employed for load sharing
375	      [RFC7098], along with the Source and Destination IPv6 addresses.
376	      Reuse of a Flow Label value for the same set {Source Address,
377	      Destination Address} would typically cause both flows to be
378	      multiplexed onto the same link.  However, as long as this does not
379	      occur deterministically, it will not result in any negative
380	      implications.

382	   (7)
383	      DNS TxIDs are employed, together with the Source Address,
384	      Destination Address, Source Port, and Destination Port, to match
385	      DNS requests and responses.  However, since an implementation
386	      knows which DNS requests were sent for that set of {Source
387	      Address, Destination Address, Source Port, and Destination Port,
388	      DNS TxID}, a collision of TxID would result, if anything, in a
389	      small performance penalty (the response would nevertheless be
390	      discarded when it is found that it does not answer the query sent
391	      in the corresponding DNS query).
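
   The "larger than" comparisons mentioned in notes (4) and (5) operate
   on 32-bit values that wrap around, so they are normally evaluated in
   modulo-2^32 arithmetic rather than as plain integer comparisons.  The
   following is a minimal sketch in C of such a comparison; the function
   name seq_gt() is hypothetical and is not taken from any of the
   referenced specifications:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns true if 'a' is "larger than" 'b' in
 * modulo-2^32 arithmetic, i.e., if 'a' is ahead of 'b' by less than
 * half of the 32-bit space.
 */
static bool seq_gt(uint32_t a, uint32_t b)
{
    uint32_t diff = a - b;      /* unsigned subtraction wraps modulo 2^32 */

    return diff != 0 && diff < UINT32_C(0x80000000);
}
```

   With this comparison, a value that has just wrapped around (e.g. 5)
   is still considered larger than a value close to the top of the
   space (e.g. 0xFFFFFFF0).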

393	   Based on the survey above, we can categorize identifiers as follows:

395	   +-----+---------------------------------------+---------------------+
396	   | Cat |                Category               |   Sample Proto IDs  |
397	   |  #  |                                       |                     |
398	   +-----+---------------------------------------+---------------------+
399	   |  1  |       Uniqueness (soft failure)       |  IPv6 Flow L., DNS  |
400	   |     |                                       |        TxIDs        |
401	   +-----+---------------------------------------+---------------------+
402	   |  2  |       Uniqueness (hard failure)       |  IPv6 Frag ID, TCP  |
403	   |     |                                       |    ephemeral port   |
404	   +-----+---------------------------------------+---------------------+
405	   |  3  |   Uniqueness, stable within context   |      IPv6 IIDs      |
406	   |     |             (soft failure)            |                     |
407	   +-----+---------------------------------------+---------------------+
408	   |  4  |  Uniqueness, monotonically increasing |     TCP ISN, TCP    |
409	   |     |     within context (hard failure)     |  initial timestamps |
410	   +-----+---------------------------------------+---------------------+

412	                      Table 2: Identifier Categories

414	   We note that Category #4 could be considered a generalized case of
415	   Category #3, in which a monotonically increasing element is added to
416	   a stable (within context) element, such that the resulting
417	   identifiers are monotonically increasing within a specified context.
418	   That is, the same algorithm could be employed for both #3 and #4,
419	   given appropriate parameters.

421	7.  Common Algorithms for Transient Numeric Identifier Generation

423	   The following subsections describe some sample algorithms that can be
424	   employed for generating transient numeric identifiers for each of the
425	   categories above.

427	   All of the variables employed in the algorithms of the following
428	   subsections are of "unsigned integer" type, except for the "retry"
429	   variable, which is of (signed) "integer" type.

431	7.1.  Category #1: Uniqueness (soft failure)

433	   The requirement of uniqueness with a soft failure severity can be
434	   met with a Pseudo-Random Number Generator (PRNG).

436	   We note that since the premise is that collisions of transient
437	   numeric identifiers of this category only lead to soft failures, in
438	   many cases, the algorithm might not need to check the suitability of
439	   a selected identifier (i.e., suitable_id() could always return
440	   "true").

442	   In scenarios where e.g. simultaneous use of a given numeric ID is
443	   undesirable and the implementation detects such condition, an
444	   implementation may opt to select the next available identifier in the
445	   same sequence, or select another random number.  Section 7.1.1 is an
446	   implementation of the former strategy, while Section 7.1.2 is an
447	   implementation of the latter.  Typically, the algorithm in
448	   Section 7.1.2 results in a more uniform distribution of the generated
449	   transient numeric identifiers.  However, for transient numeric
450	   identifiers where an implementation typically keeps local state about
451	   unsuitable/used identifiers, the algorithm in Section 7.1.2 may
452	   require many more iterations than the algorithm in Section 7.1.1 to
453	   generate a suitable transient numeric identifier.  This will usually
454	   be affected by the current usage ratio of transient numeric
455	   identifiers (i.e., number of numeric identifiers considered suitable
456	   / total number of numeric identifiers) and other parameters.
457	   Therefore, in such cases many implementations tend to prefer the
458	   algorithm in Section 7.1.1 over the algorithm in Section 7.1.2.
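
   As a rough illustration of this dependence, assume (hypothetically)
   that each output of random() is independent and uniformly
   distributed, and let s be the fraction of identifiers in the range
   that are currently suitable.  Ignoring the retry limit, the number of
   draws K needed by the algorithm in Section 7.1.2 then follows a
   geometric distribution:

```latex
\Pr[K = k] = (1 - s)^{k-1}\, s, \qquad \mathbb{E}[K] = \frac{1}{s}
```

   For example, with s = 0.1 (90% of the identifiers unsuitable), about
   10 draws are needed on average.  The algorithm in Section 7.1.1
   instead performs a single draw followed by a linear scan of at most
   id_range iterations.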

460	7.1.1.  Simple Randomization Algorithm

461	       /* Transient Numeric ID selection function */

463	       id_range = max_id - min_id + 1;
464	       next_id = min_id + (random() % id_range);
465	       retry = id_range;

467	       do {
468	           if (suitable_id(next_id)) {
469	               return next_id;
470	           }

472	           if (next_id == max_id) {
473	               next_id = min_id;
474	           } else {
475	               next_id++;
476	           }

478	           retry--;

480	       } while (retry > 0);

482	       return ERROR;

484	   NOTES:
485	      random() is a function that returns a pseudo-random unsigned
486	      integer number of appropriate size.  Note that the output needs to
487	      be unpredictable, and typical implementations of the POSIX
488	      random() function do not necessarily meet this requirement.  See
489	      [RFC4086] for randomness requirements for security.  Beware that
490	      "adapting" the length of the output of random() with a modulo
491	      operator (e.g., C language's "%") may change the distribution of
492	      the PRNG.

494	      The function suitable_id() can check, when possible and desirable,
495	      whether a selected transient numeric identifier is suitable (e.g.
496	      it is not already in use).  Depending on how/where the numeric
497	      identifier is used, it may or may not be possible (or even
498	      desirable) to check whether the numeric identifier is in use (or
499	      whether it has been recently employed).  When an identifier is
500	      found to be unsuitable, this algorithm selects the next available
501	      numeric identifier in sequence.

503	      Even when this algorithm selects numeric IDs randomly, it is
504	      biased towards the first available numeric ID after a sequence of
505	      unavailable numeric IDs.  For example, if this algorithm is
506	      employed for transport protocol ephemeral port randomization
507	      [RFC6056] and the local list of unsuitable port numbers (e.g.,
508	      registered port numbers that should not be used for ephemeral
509	      ports) is significant, an attacker may actually have a
510	      significantly better chance of guessing a port number.

512	      As noted in Section 7, all variables are unsigned integers, except
513	      for the "retry" variable, which is a (signed) integer.
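
      The bias towards the first available numeric ID after a sequence
      of unavailable ones can be quantified with a short sketch in C.
      It assumes an ideal, uniform initial draw; the function name
      selection_weights() and the representation of suitable_id() as an
      array of flags are hypothetical:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * For each slot i (identifier min_id + i), store in weight[i] the number
 * of initial draws, out of n equally likely ones, that make the linear
 * scan return slot i.  The selection probability is weight[i] / n.
 */
static void selection_weights(const bool *unsuitable, size_t n,
                              size_t *weight)
{
    assert(n > 0);

    for (size_t i = 0; i < n; i++) {
        weight[i] = 0;
        if (unsuitable[i])
            continue;               /* never returned by the algorithm */

        weight[i] = 1;              /* the draw that hits slot i directly */

        /* Add every unsuitable slot in the run that ends just before i. */
        size_t j = (i + n - 1) % n;
        while (j != i && unsuitable[j]) {
            weight[i]++;
            j = (j + n - 1) % n;
        }
    }
}
```

      With eight identifiers of which the first three are unsuitable,
      the fourth identifier is selected with probability 4/8, while each
      of the remaining four is selected with probability 1/8.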

515	   Assuming the randomness requirements for the PRNG are met (see
516	   [RFC4086]), this algorithm does not suffer from any of the issues
517	   discussed in Section 8.
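
   The following is a minimal C sketch of the algorithm in this section,
   under several assumptions that are not part of this document:
   identifiers are 32-bit values, suitable_id() is modeled as a caller-
   supplied array of flags, unpredictable bits are read from
   /dev/urandom (which depends on the host platform), and the modulo
   bias mentioned in the NOTES is avoided by rejection sampling.  The
   names os_random32(), uniform_below(), and select_id() are
   hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Assumption: the host provides unpredictable bytes via /dev/urandom. */
static bool os_random32(uint32_t *out)
{
    FILE *f = fopen("/dev/urandom", "rb");
    size_t n;

    if (f == NULL)
        return false;
    n = fread(out, sizeof(*out), 1, f);
    fclose(f);
    return n == 1;
}

/*
 * Stores in *out a value uniformly distributed in [0, range), range > 0.
 * Draws below (2^32 mod range) are rejected, so that the remaining draws
 * map evenly onto [0, range); this avoids the bias of a plain "%".
 */
static bool uniform_below(uint32_t range, uint32_t *out)
{
    uint32_t skip = (uint32_t)(0u - range) % range;  /* 2^32 mod range */
    uint32_t r;

    do {
        if (!os_random32(&r))
            return false;
    } while (r < skip);

    *out = r % range;
    return true;
}

/*
 * Selects an identifier in [min_id, max_id].  unsuitable[i] tells whether
 * identifier min_id + i must be skipped (the role of suitable_id()).
 * Returns false where the pseudo-code returns ERROR, and also if no
 * random data could be obtained.
 */
static bool select_id(uint32_t min_id, uint32_t max_id,
                      const bool *unsuitable, uint32_t *id_out)
{
    uint32_t id_range, offset, retry;

    assert(max_id >= min_id && max_id - min_id < UINT32_MAX);
    id_range = max_id - min_id + 1;

    if (!uniform_below(id_range, &offset))
        return false;

    for (retry = id_range; retry > 0; retry--) {
        if (!unsuitable[offset]) {
            *id_out = min_id + offset;
            return true;
        }
        /* Linear scan: try the next identifier, wrapping after max_id. */
        offset = (offset + 1 == id_range) ? 0 : offset + 1;
    }
    return false;
}
```

   Opening /dev/urandom on every draw keeps the sketch short; a real
   implementation would more likely keep the random source open or use a
   platform-specific interface.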

519	7.1.2.  Another Simple Randomization Algorithm

521	   The following pseudo-code illustrates another algorithm for selecting
522	   a random transient numeric identifier.  In the event a selected
523	   identifier is found to be unsuitable (e.g., already in use), another
524	   identifier is randomly selected:

526	       /* Transient Numeric ID selection function */

528	       id_range = max_id - min_id + 1;
529	       retry = id_range;

531	       do {
532	           next_id = min_id + (random() % id_range);

534	           if (suitable_id(next_id)) {
535	               return next_id;
536	           }

538	           retry--;

540	       } while (retry > 0);

542	       return ERROR;

544	   This algorithm might be unable to select a transient numeric
545	   identifier (i.e., return "ERROR") even if there are suitable
546	   identifiers available, in cases where a large number of identifiers
547	   are found to be unsuitable (e.g. "in use").
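
   To illustrate, assume (hypothetically) independent and uniformly
   distributed outputs of random(), and let s be the fraction of
   suitable identifiers and N = id_range the number of attempts.  The
   probability that every attempt selects an unsuitable identifier is:

```latex
\Pr[\mathrm{ERROR}] = (1 - s)^{N}
```

   In the extreme case where exactly one identifier is suitable
   (s = 1/N), this becomes (1 - 1/N)^N, which approaches 1/e (about
   0.37) as N grows.  That is, the algorithm would return "ERROR"
   roughly 37% of the time even though a suitable identifier exists.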

549	   The same considerations from Section 7.1.1 with respect to the
550	   properties of random() and the adaptation of its output length apply
551	   to this algorithm.

553	   Assuming the randomness requirements for the PRNG are met (see
554	   [RFC4086]), this algorithm does not suffer from any of the issues
555	   discussed in Section 8.
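
   The following is a minimal C sketch of the algorithm in this section,
   under assumptions that are not part of this document: identifiers are
   32-bit values, suitable_id() is modeled as a caller-supplied array of
   flags, and unpredictable bits are read from /dev/urandom (which
   depends on the host platform).  The names os_random32(),
   uniform_below(), and select_id_retry() are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Assumption: the host provides unpredictable bytes via /dev/urandom. */
static bool os_random32(uint32_t *out)
{
    FILE *f = fopen("/dev/urandom", "rb");
    size_t n;

    if (f == NULL)
        return false;
    n = fread(out, sizeof(*out), 1, f);
    fclose(f);
    return n == 1;
}

/* Uniform value in [0, range), range > 0, using rejection sampling. */
static bool uniform_below(uint32_t range, uint32_t *out)
{
    uint32_t skip = (uint32_t)(0u - range) % range;  /* 2^32 mod range */
    uint32_t r;

    do {
        if (!os_random32(&r))
            return false;
    } while (r < skip);

    *out = r % range;
    return true;
}

/*
 * Selects an identifier in [min_id, max_id], drawing a fresh random
 * candidate on every attempt.  unsuitable[i] tells whether identifier
 * min_id + i must be skipped.  Returns false where the pseudo-code
 * returns ERROR, and also if no random data could be obtained.
 */
static bool select_id_retry(uint32_t min_id, uint32_t max_id,
                            const bool *unsuitable, uint32_t *id_out)
{
    uint32_t id_range, offset, retry;

    assert(max_id >= min_id && max_id - min_id < UINT32_MAX);
    id_range = max_id - min_id + 1;

    for (retry = id_range; retry > 0; retry--) {
        if (!uniform_below(id_range, &offset))
            return false;
        if (!unsuitable[offset]) {
            *id_out = min_id + offset;
            return true;
        }
    }
    return false;
}
```

   Unlike a linear scan, this variant keeps no memory of the candidates
   it has already rejected, which is why it can give up while suitable
   identifiers remain.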

557	7.2.  Category #2: Uniqueness (hard failure)

559	   One of the most trivial approaches for generating unique transient
560	   numeric identifiers (with a hard failure severity) is to reduce the
561	   identifier reuse frequency by generating the numeric identifiers with
562	   a monotonically-increasing function (e.g. linear).  As a result, any
563	   of the algorithms described in Section 7.4 ("Category #4: Uniqueness,
564	   monotonically increasing within context (hard failure)") can be
565	   readily employed for complying with the requirements of this
566	   transient numeric identifier category.

568	   In cases where suitability (e.g. uniqueness) of the selected
569	   identifiers can be definitively assessed by the local system, any of
570	   the algorithms described in Section 7.1 ("Category #1: Uniqueness
571	   (soft failure)") can be readily employed for complying with the
572	   requirements of this numeric identifier category.

574	   NOTE:
575	      In the case of e.g.  TCP ephemeral ports or TCP ISNs, a transient
576	      numeric identifier that might seem suitable from the perspective
577	      of the local system, might actually be unsuitable from the
578	      perspective of the remote system (e.g., because there is state
579	      associated with the selected identifier at the remote system).
580	      Therefore, in such cases the algorithms from Section 7.1
581	      ("Category #1: Uniqueness (soft failure)") cannot be employed.

583	7.3.  Category #3: Uniqueness, stable within context (soft failure)

585	   The goal of the following algorithm is to produce identifiers that
586	   are stable for a given context (identified by "CONTEXT"), but that
587	   change when the aforementioned context changes.

589	   In order to avoid storing in memory the transient numeric identifiers
590	   computed for each CONTEXT, the following algorithm employs a
591	   calculated technique (as opposed to keeping state in memory) to
592	   generate a stable transient numeric identifier for each given
593	   context.

595	       /* Transient Numeric ID selection function  */

597	       id_range = max_id - min_id + 1;

599	       retry = 0;

601	       do {
602	           offset = F(CONTEXT, retry, secret_key);
603	           next_id = min_id + (offset % id_range);

605	           if (suitable_id(next_id)) {
606	               return next_id;
607	           }

609	           retry++;

611	       } while (retry <= MAX_RETRIES);

613	       return ERROR;

615	   In this algorithm, the function F() provides a stateless and stable
616	   per-CONTEXT offset, where CONTEXT is the concatenation of all the
617	   elements that define the given context.

619	      For example, if this algorithm is expected to produce IPv6 IIDs
620	      that are unique per network interface and SLAAC autoconfiguration
621	      prefix, the CONTEXT should be the concatenation of e.g. the
622	      network interface index and the SLAAC autoconfiguration prefix
623	      (please see [RFC7217] for an implementation of this algorithm for
624	      generation of stable IPv6 IIDs).

626	   F() is a pseudorandom function (PRF) that must not be computable from
627	   the outside (without knowledge of the secret key).  F() must also be
628	   difficult to reverse, such that it resists attempts to obtain the
629	   secret_key, even when given samples of the output of F() and
630	   knowledge or control of the other input parameters.  F() should
631	   produce an output of at least as many bits as required for the
632	   transient numeric identifier.  F() could be the result of applying a
633	   cryptographic hash over an encoded version of the function
634	   parameters.  While this document does not recommend a specific
635	   mechanism for encoding the function parameters (or a specific
636	   cryptographic hash function), a cryptographically robust construction
637	   will ensure that the mapping from parameters to the hash function
638	   input is an injective map, as might be attained by using fixed-width
639	   encodings and/or length-prefixing variable-length parameters.
640	   SHA-256 [FIPS-SHS] is one possible option for F().  Note: MD5
641	   [RFC1321] is considered unacceptable for F() [RFC6151].
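
	   One way to obtain an injective encoding is to length-prefix each
	   variable-length parameter and to encode fixed-size parameters with
	   a fixed width.  The C sketch below assumes two variable-length
	   context elements and a 32-bit retry counter; the function names and
	   the field layout are illustrative and are not specified by this
	   document.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Store a 32-bit value in network byte order (fixed-width encoding). */
static void put_u32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/*
 * Append one variable-length field as <32-bit length><bytes>.
 * Returns the new write offset, or 0 if the buffer is too small.
 */
static size_t put_field(uint8_t *buf, size_t cap, size_t off,
                        const void *data, uint32_t len)
{
    if (off > cap || cap - off < 4 || cap - off - 4 < len)
        return 0;

    put_u32(buf + off, len);
    if (len > 0)
        memcpy(buf + off + 4, data, len);

    return off + 4 + len;
}

/*
 * Encode (a, b, retry) as the input to the hash function.
 * Returns the encoded length, or 0 on error.
 */
static size_t encode_params(uint8_t *buf, size_t cap,
                            const void *a, uint32_t a_len,
                            const void *b, uint32_t b_len,
                            uint32_t retry)
{
    size_t off = put_field(buf, cap, 0, a, a_len);

    if (off == 0)
        return 0;

    off = put_field(buf, cap, off, b, b_len);
    if (off == 0 || cap - off < 4)
        return 0;

    put_u32(buf + off, retry);
    return off + 4;
}
```

	   With plain concatenation, the pairs ("ab", "c") and ("a", "bc")
	   would both yield "abc"; with the encoding above they yield
	   different byte strings.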

643	   The result of F() is no more secure than the secret key, and
644	   therefore 'secret_key' must be unknown to the attacker, and must be
645	   of a reasonable length. 'secret_key' must remain stable for a given
646	   CONTEXT, since otherwise the numeric identifiers generated by this
647	   algorithm would not have the desired stability properties (i.e.,
648	   stable for a given CONTEXT).  In most cases, 'secret_key' can be
649	   selected with a PRNG (see [RFC4086] for recommendations on choosing
650	   secrets) at an appropriate time, and stored in stable or volatile
651	   storage (as necessary) for future use.

653	   The result of F() is stored in the variable 'offset', which may take
654	   any value within the storage type range, since we are restricting the
655	   resulting identifier to be in the range [min_id, max_id] in a similar
656	   way as in the algorithm described in Section 7.1.1.

658	   suitable_id() checks whether the candidate identifier has suitable
659	   uniqueness properties.  Collisions (i.e., an identifier that is not
660	   unique) are recovered by incrementing the 'retry' variable and
661	   recomputing F(), up to a maximum of MAX_RETRIES times.  However,
662	   recovering from collisions will usually result in identifiers that
663	   fail to remain constant for the specified context.  This is normally
664	   acceptable when the probability of collisions is small, as in the
665	   case of e.g.  IPv6 IIDs resulting from SLAAC [RFC7217] [RFC4941].

667	   For obvious reasons, the transient numeric identifiers generated with
668	   this algorithm allow for network activity correlation and
669	   fingerprinting within "CONTEXT".  However, this is essentially a
670	   design goal of this category of transient numeric identifiers.
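
	   The control flow of the algorithm in this section can be sketched
	   in C as follows.  toy_f() is only a placeholder for F(): it is a
	   keyed FNV-1a-style mix, which is NOT a PRF and provides no
	   security.  The values of MAX_RETRIES and ID_ERROR, the 32-bit
	   identifier width, and the callback are likewise assumptions of this
	   sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ID_ERROR    UINT32_MAX  /* assumed sentinel outside [min_id, max_id] */
#define MAX_RETRIES 8           /* assumed value; the document leaves it open */

/*
 * PLACEHOLDER for F(): keyed FNV-1a-style mixing. It is NOT a PRF; a real
 * implementation would use a cryptographic construction instead.
 */
static uint64_t toy_f(const uint8_t *ctx, size_t ctx_len,
                      uint32_t retry, uint64_t secret_key)
{
    uint64_t h = 14695981039346656037ULL ^ secret_key;

    for (size_t i = 0; i < ctx_len; i++) {
        h ^= ctx[i];
        h *= 1099511628211ULL;
    }
    for (int i = 0; i < 4; i++) {           /* fixed-width retry counter */
        h ^= (retry >> (8 * i)) & 0xff;
        h *= 1099511628211ULL;
    }
    return h;
}

/* Hypothetical stand-in for suitable_id(). */
typedef int (*suitable_fn)(uint32_t id, void *state);

/* Same inputs give the same ID, unless a collision forces a retry. */
static uint32_t select_stable_id(uint32_t min_id, uint32_t max_id,
                                 const uint8_t *ctx, size_t ctx_len,
                                 uint64_t secret_key,
                                 suitable_fn suitable, void *state)
{
    uint32_t id_range = max_id - min_id + 1;
    uint32_t retry = 0;

    do {
        uint64_t offset = toy_f(ctx, ctx_len, retry, secret_key);
        uint32_t next_id = min_id + (uint32_t)(offset % id_range);

        if (suitable(next_id, state))
            return next_id;

        retry++;
    } while (retry <= MAX_RETRIES);

    return ID_ERROR;
}

/* Demo check: every ID is suitable except the one stored in *state. */
static int not_equal_to(uint32_t id, void *state)
{
    return id != *(uint32_t *)state;
}
```

	   Calling select_stable_id() twice with the same CONTEXT and key
	   yields the same identifier; if that identifier is later reported as
	   unsuitable, the retry path yields a different result.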

672	7.4.  Category #4: Uniqueness, monotonically increasing within context
673	      (hard failure)

675	7.4.1.  Per-context Counter Algorithm

677	   One possible way of selecting unique monotonically-increasing
678	   identifiers (per context) is to employ a per-context counter.  Such
679	   an algorithm could be described as follows:

681	       /* Transient Numeric ID selection function */

683	       id_range = max_id - min_id + 1;
684	       retry = id_range;
685	       id_inc = increment() % id_range;

687	       if( (next_id = lookup_counter(CONTEXT)) == ERROR){
688	            next_id = min_id + random() % id_range;
689	       }

691	       do {
692	           if ( (max_id - next_id) >= id_inc){
693	               next_id = next_id + id_inc;
694	           }
695	           else {
696	               next_id = min_id + id_inc - (max_id - next_id) - 1;
697	           }

699	           if (suitable_id(next_id)){
700	               store_counter(CONTEXT, next_id);
701	               return next_id;
702	           }

704	           retry = retry - id_inc;

706	       } while (retry > 0);

708	       return ERROR;

710	   NOTES:
711	      increment() returns a small integer that is employed to increment
712	      the current counter value to obtain the next transient numeric
713	      identifier.  This value must be much smaller than the number of
714	      possible values for the numeric IDs (i.e., "id_range").  Most
715	      implementations of this algorithm employ a constant increment of
716	      1.  Using a value other than 1 can help mitigate some information
717	      leakages (please see below), at the expense of a possible increase
718	      in the numeric ID reuse frequency.

720	      The code above makes sure that the increment employed in the
721	      algorithm (id_inc) is always smaller than the number of possible
722	      values for the numeric IDs (i.e., "max_id - min_id + 1").  However,
723	      as noted above, this value must also be much smaller than the
724	      number of possible values for the numeric IDs.

726	      lookup_counter() is a function that returns the current counter
727	      for a given context, or an error condition if that counter does
728	      not exist.

730	      store_counter() is a function that saves a counter value for a
731	      given context.

733	      suitable_id() is a function that checks whether the resulting
734	      identifier is acceptable (e.g., whether it is not already in use,
735	      etc.).

737	   Essentially, whenever a new identifier is to be selected, the
738	   algorithm checks whether a counter for the corresponding context
739	   exists.  If it does, the value of that counter is incremented to
740	   obtain the new transient numeric identifier, and the counter is
741	   updated.  If no counter exists for that context, a new counter is
742	   created and initialized to a random value, and used as the selected
743	   transient numeric identifier.  This algorithm produces a per-context
744	   counter, which results in one monotonically-increasing function for
745	   each context.  Since each counter is initialized to a random value,
746	   the resulting values are unpredictable by an off-path attacker.

748	   The choice of id_inc has implications not only on the security and
749	   privacy properties of the resulting identifiers, but also on the
750	   corresponding interoperability properties.  On one hand, minimizing
751	   the increments generally minimizes the identifier reuse frequency,
752	   albeit at increased predictability.  On the other hand, if the
753	   increments are randomized, predictability of the resulting
754	   identifiers is reduced, and the information leakage produced by
755	   global constant increments is mitigated.  However, using larger
756	   increments than necessary can result in higher numeric ID reuse
757	   frequency.

759	   This algorithm has the following drawbacks:

761	   o  It requires an implementation to store each per-CONTEXT counter in
762	      memory.  If, as a result of resource management, the counter for a
763	      given context must be removed, the last transient numeric
764	      identifier value used for that context will be lost.  Thus, if
765	      subsequently an identifier needs to be generated for the same
766	      context, the corresponding counter will need to be recreated and
767	      reinitialized to a random value, thus possibly leading to reuse/
768	      collision of numeric identifiers.

770	   o  Keeping one counter for each possible "context" may in some cases
771	      be considered too onerous in terms of memory requirements.

773	   Otherwise, the identifiers produced by this algorithm do not suffer
774	   from the other issues discussed in Section 8.
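
	   The wrap-around step of the per-context counter can be sketched in
	   C as follows.  This shows one possible convention, in which the
	   value that follows max_id (for an increment of 1) is min_id; the
	   helper name advance_id() is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Advance next_id by id_inc within [min_id, max_id], wrapping past max_id.
 * Assumes min_id <= next_id <= max_id and id_inc < (max_id - min_id + 1).
 */
static uint32_t advance_id(uint32_t next_id, uint32_t id_inc,
                           uint32_t min_id, uint32_t max_id)
{
    if (max_id - next_id >= id_inc)
        return next_id + id_inc;            /* no wrap needed */

    /* Wrap: the step that would land one past max_id maps to min_id. */
    return min_id + (id_inc - (max_id - next_id) - 1);
}
```

	   For the range [10, 20], advance_id(20, 1, 10, 20) returns 10 and
	   advance_id(19, 3, 10, 20) returns 11.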

776	7.4.2.  Simple PRF-Based Algorithm

778	   The goal of this algorithm is to produce monotonically-increasing
779	   transient numeric identifiers (for each given context), with a
780	   randomized initial value.  For example, if the identifiers being
781	   generated must be monotonically-increasing for each {IP Source
782	   Address, IP Destination Address} set, then each possible combination
783	   of {IP Source Address, IP Destination Address} should have a separate
784	   monotonically-increasing sequence, that starts at a different random
785	   value.

787	   Instead of maintaining a per-context counter (as in the algorithm
788	   from Section 7.4.1), the following algorithm employs a calculated
789	   technique to maintain a random offset for each possible context.

791	       /* Initialization code */
792	       counter = 0;

794	       /* Transient Numeric ID selection function  */

796	       id_range = max_id - min_id + 1;
797	       id_inc = increment() % id_range;
798	       offset = F(CONTEXT, secret_key);
799	       retry = id_range;

801	       do {
802	           next_id = min_id + (offset + counter) % id_range;
803	           counter = counter + id_inc;

805	           if (suitable_id(next_id)) {
806	               return next_id;
807	           }

809	           retry = retry - id_inc;

811	       } while (retry > 0);

813	       return ERROR;

815	   In the algorithm above, the function F() provides a (stateless)
816	   unpredictable offset for each given context (as identified by
817	   'CONTEXT').

819	   F() is a PRF, with the same properties as those specified for F() in
820	   Section 7.3.

822	   CONTEXT is the concatenation of all the elements that define a given
823	   context.  For example, if this algorithm is expected to produce
824	   identifiers that are monotonically-increasing for each set (Source IP
825	   Address, Destination IP Address), CONTEXT should be the concatenation
826	   of these two IP addresses.

828	   The function F() provides a "per-CONTEXT" fixed offset within the
829	   numeric identifier "space".  Both the 'offset' and 'counter'
830	   variables may take any value within the storage type range since we
831	   are restricting the resulting identifier to be in the range [min_id,
832	   max_id] in a similar way as in the algorithm described in
833	   Section 7.1.1.  This allows us to simply increment the 'counter'
834	   variable and rely on the unsigned integer to wrap around.

836	   The result of F() is no more secure than the secret key, and
837	   therefore 'secret_key' must be unknown to the attacker, and must be
838	   of a reasonable length. 'secret_key' must remain stable for a given
839	   CONTEXT, since otherwise the numeric identifiers generated by this
840	   algorithm would not have the desired stability properties (i.e.,
841	   monotonically-increasing for a given CONTEXT).  In most cases,
842	   'secret_key' can be selected with a PRNG (see [RFC4086] for
843	   recommendations on choosing secrets) at an appropriate time, and
844	   stored in stable or volatile storage (as necessary) for future use.

846	   It should be noted that, since this algorithm uses a global counter
847	   ("counter") for selecting identifiers (i.e., all counters share the
848	   same increments space), this algorithm results in an information
849	   leakage (as described in Section 8.2).  For example, if this
850	   algorithm were used for selecting TCP ephemeral ports, and an
851	   attacker could force a client to periodically establish a new TCP
852	   connection to an attacker-controlled system (or through an attacker-
853	   observable routing path), the attacker could subtract consecutive
854	   source port values to obtain the number of outgoing TCP connections
855	   established globally by the victim host within that time period (up
856	   to wrap-around issues and five-tuple collisions, of course).  This
857	   information leakage could be partially mitigated by employing small
858	   random values for the increments (i.e., increment() function),
859	   instead of having increment() return the constant "1".

861	   We nevertheless note that an improved mitigation of this information
862	   leakage could be more successfully achieved by employing the
863	   algorithm from Section 7.4.3, instead.

865	7.4.3.  Double-PRF Algorithm

867	   A trade-off between maintaining a single global 'counter' variable
868	   and maintaining 2**N 'counter' variables (where N is the width of the
869	   result of F()), could be achieved as follows.  The system would keep
870	   an array of TABLE_LENGTH values, which would provide a separation of
871	   the increment space into multiple buckets.  This improvement could be
872	   incorporated into the algorithm from Section 7.4.2 as follows:

874	       /* Initialization code */

876	       for(i = 0; i < TABLE_LENGTH; i++) {
877	           table[i] = random();
878	       }

880	       /* Transient Numeric ID selection function */

882	       id_range = max_id - min_id + 1;
883	       id_inc = increment() % id_range;
884	       offset = F(CONTEXT, secret_key1);
885	       index = G(CONTEXT, secret_key2) % TABLE_LENGTH;
886	       retry = id_range;

888	       do {
889	           next_id = min_id + (offset + table[index]) % id_range;
890	           table[index] = table[index] + id_inc;

892	           if (suitable_id(next_id)) {
893	               return next_id;
894	           }

896	          retry = retry - id_inc;

898	       } while (retry > 0);

900	       return ERROR;

902	   'table[]' could be initialized with random values, as indicated by
903	   the initialization code in the pseudo-code above.

905	   Both F() and G() are PRFs, with the same properties as those required
906	   for F() in Section 7.3.

908	   The results of F() and G() are no more secure than their respective
909	   secret keys ('secret_key1' and 'secret_key2', respectively), and
910	   therefore both secret keys must be unknown to the attacker, and must
911	   be of a reasonable length.  Both secret keys must remain stable for
912	   the given CONTEXT, since otherwise the transient numeric identifiers
913	   generated by this algorithm would not have the desired stability
914	   properties (i.e., monotonically-increasing for a given CONTEXT).  In
915	   most cases, both secret keys can be selected with a PRNG (see
916	   [RFC4086] for recommendations on choosing secrets) at an appropriate
917	   time, and stored in stable or volatile storage (as necessary) for
918	   future use.

920	   The 'table[]' array assures that successive transient numeric
921	   identifiers for a given context will be monotonically-increasing.
922	   Since the increments space is separated into TABLE_LENGTH different
923	   spaces, the identifier reuse frequency will be (probabilistically)
924	   lower than that of the algorithm in Section 7.4.2.  That is, the
925	   generation of an identifier for one given context will not
926	   necessarily result in increments in the identifier sequence of other
927	   contexts.  It is interesting to note that the size of 'table[]' does
928	   not limit the number of different identifier sequences, but rather
929	   separates the *increment space* into TABLE_LENGTH different spaces.
930	   The selected transient numeric identifier sequence will be obtained
931	   by adding the corresponding entry from 'table[]' to the value in the
932	   'offset' variable, which selects the actual identifier sequence space
933	   (as in the algorithm from Section 7.4.2).

935	   An attacker can perform traffic analysis for any "increment space"
936	   (i.e., context) into which the attacker has "visibility" -- namely,
937	   the attacker can force a system to generate identifiers for
938	   G(CONTEXT, secret_key2), where the result of G() identifies the
939	   target "increment space".  However, the attacker's ability to perform
940	   traffic analysis is greatly reduced when compared to the simple PRF-
941	   based identifiers (described in Section 7.4.2) and the predictable
942	   linear identifiers (described in Appendix A.1).  Additionally, an
943	   implementation can further limit the attacker's ability to perform
944	   traffic analysis by further separating the increment space (that is,
945	   using a larger value for TABLE_LENGTH) and/or by randomizing the
946	   increments (i.e., increment() returning a small random number as
947	   opposed to the constant "1").

949	   Otherwise, this algorithm does not suffer from the issues discussed
950	   in Section 8.
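
	   A compact C sketch of the Double-PRF algorithm follows.  It makes
	   several simplifying assumptions that are not part of this document:
	   toy_prf() is a placeholder for F() and G() and is NOT a real PRF,
	   the table is seeded from a toy generator, the identifier range is
	   fixed at compile time, table entries are kept reduced modulo the
	   range, and an increment of 0 is rejected.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MIN_ID       1024u
#define MAX_ID       65535u
#define ID_RANGE     (MAX_ID - MIN_ID + 1u)
#define TABLE_LENGTH 16u          /* assumed size; larger separates more */
#define ID_ERROR     UINT32_MAX   /* assumed sentinel outside the ID range */

static uint32_t table[TABLE_LENGTH];

/* PLACEHOLDER for F() and G(): keyed FNV-1a-style mix, NOT a real PRF. */
static uint64_t toy_prf(const uint8_t *ctx, size_t ctx_len, uint64_t key)
{
    uint64_t h = 14695981039346656037ULL ^ key;

    for (size_t i = 0; i < ctx_len; i++) {
        h ^= ctx[i];
        h *= 1099511628211ULL;
    }
    return h;
}

/* Fill table[] from a toy LCG; real code would use a proper PRNG. */
static void init_table(uint32_t seed)
{
    for (uint32_t i = 0; i < TABLE_LENGTH; i++) {
        seed = seed * 1664525u + 1013904223u;
        table[i] = (seed >> 8) % ID_RANGE;
    }
}

/* Hypothetical stand-in for suitable_id(). */
typedef int (*suitable_fn)(uint32_t id, void *state);

static uint32_t select_id(const uint8_t *ctx, size_t ctx_len,
                          uint64_t secret_key1, uint64_t secret_key2,
                          uint32_t id_inc,
                          suitable_fn suitable, void *state)
{
    uint32_t offset = (uint32_t)(toy_prf(ctx, ctx_len, secret_key1) % ID_RANGE);
    uint32_t slot = (uint32_t)(toy_prf(ctx, ctx_len, secret_key2) % TABLE_LENGTH);
    int64_t retry = ID_RANGE;

    if (id_inc == 0 || id_inc >= ID_RANGE)
        return ID_ERROR;          /* the increment must be in [1, ID_RANGE) */

    do {
        uint32_t next_id = MIN_ID + (offset + table[slot]) % ID_RANGE;

        /* Entries stay reduced modulo ID_RANGE so wrap-around is exact. */
        table[slot] = (table[slot] + id_inc) % ID_RANGE;

        if (suitable(next_id, state))
            return next_id;

        retry -= id_inc;
    } while (retry > 0);

    return ID_ERROR;
}

/* Demo check that accepts every candidate. */
static int always_suitable(uint32_t id, void *state)
{
    (void)id;
    (void)state;
    return 1;
}
```

	   Two consecutive calls for the same CONTEXT return identifiers that
	   differ by id_inc (modulo the range).  A call for another CONTEXT in
	   between advances that sequence only if both contexts happen to map
	   to the same table entry.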

952	8.  Common Vulnerabilities Associated with Transient Numeric Identifiers

954	8.1.  Network Activity Correlation

956	   An identifier that is predictable within a given context allows for
957	   network activity correlation within that context.

959	   For example, a stable IPv6 Interface Identifier allows for network
960	   activity to be correlated within the context in which the Interface
961	   Identifier is stable [RFC7721].  A stable-per-network IPv6 Interface
962	   Identifier (as in [RFC7217]) allows for network activity correlation
963	   within a network, whereas a constant IPv6 Interface Identifier (that
964	   remains constant across networks) allows not only network activity
965	   correlation within the same network, but also across networks ("host
966	   tracking").

968	   Similarly, an implementation that generates TCP ISNs with a global
969	   counter could allow for fingerprinting and network activity
970	   correlation across networks, since an attacker could passively infer
971	   the identity of the victim based on the TCP ISNs employed for
972	   subsequent communication instances.  Similarly, an implementation
973	   that generates predictable IPv6 Fragment Identification values could
974	   be subject to fingerprinting attacks (see e.g.  [Bellovin2002]).

976	8.2.  Information Leakage

978	   Transient numeric identifiers that result in specific patterns can
979	   produce an information leakage to other communicating entities.  For
980	   example, it is common to generate transient numeric identifiers with
981	   an algorithm such as:

983	                   ID = offset(CONTEXT) + mono(CONTEXT);

985	   This generic expression generates identifiers by adding a
986	   monotonically-increasing function (e.g. linear) to a randomized
987	   offset. offset() is constant within a given context, whereas mono()
988	   produces a monotonically-increasing sequence for the given context.
989	   Identifiers generated with this expression will generally be
990	   predictable within CONTEXT.

992	   The predictability of mono(), irrespective of the predictability of
993	   offset(), can leak information that may be of use to attackers.  For
994	   example, a node that selects ephemeral port numbers as in:

996	                 ephemeral_port = offset(Dest_IP) + mono()

998	   that is, with a per-destination offset, but a global mono() function
999	   (e.g., a global counter), will leak information about the total
1000	   number of outgoing connections that have been issued by the
1001	   vulnerable implementation.
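
	   The following C sketch illustrates this leakage.  It assumes that
	   each per-destination offset is a fixed secret value and that mono()
	   is a single global counter incremented by 1; the function names and
	   the port range are assumptions of the sketch.

```c
#include <assert.h>
#include <stdint.h>

#define MIN_PORT   1024u
#define MAX_PORT   65535u
#define PORT_RANGE (MAX_PORT - MIN_PORT + 1u)

/* mono(): one counter shared by all destinations. */
static uint32_t global_counter;

/*
 * ephemeral_port = offset(Dest_IP) + mono(), reduced to the port range.
 * dest_offset stands for the secret, fixed offset of one destination.
 * (Overflow of the 32-bit sum is ignored in this small demonstration.)
 */
static uint32_t next_port(uint32_t dest_offset)
{
    uint32_t port = MIN_PORT + (dest_offset + global_counter) % PORT_RANGE;

    global_counter++;
    return port;
}

/*
 * What an observer at a single destination can compute from two of the
 * ports it received: the number of ports allocated from the first sample
 * (inclusive) up to the second one (exclusive), modulo the port range.
 */
static uint32_t inferred_connections(uint32_t first_port, uint32_t second_port)
{
    return (second_port + PORT_RANGE - first_port) % PORT_RANGE;
}
```

	   If the victim opens one connection to the observer, five
	   connections to other destinations, and then another connection to
	   the observer, the two observed ports differ by 6 even though the
	   observer knows none of the offsets.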

1003	   Similarly, a node that generates Fragment Identification values as
1004	   in:

1006	            Frag_ID = offset(IP_src_addr, IP_dst_addr) + mono()

1008	   will leak out information about the total number of fragmented
1009	   packets that have been transmitted by the vulnerable implementation.
1010	   The vulnerabilities described in [Sanfilippo1998a],
1011	   [Sanfilippo1998b], and [Sanfilippo1999] are all associated with the
1012	   use of a global mono() function (i.e., with a global and constant
1013	   "context") -- particularly when it is a linear function (constant
1014	   increments of 1).

1016	   Predicting transient numeric identifiers can be of help for other
1017	   types of attacks.  For example, predictable TCP ISNs can open the
1018	   door to trivial connection-reset and data injection attacks (see
1019	   Section 8.6).

1021	8.3.  Fingerprinting

1023	   Fingerprinting is the capability of an attacker to identify or re-
1024	   identify a visiting user, user agent or device via configuration
1025	   settings or other observable characteristics.  Observable protocol
1026	   objects and characteristics can be employed to identify/re-identify a
1027	   variety of entities, ranging from the underlying hardware or
1028	   Operating System (vendor, type and version), to the user (i.e.,
1029	   their identity).  [EFF] illustrates web browser-based
1030	   fingerprinting, but similar techniques can be applied at other layers
1031	   and protocols, whether alternatively or in conjunction with it.

1033	   Transient numeric identifiers are one of the observable protocol
1034	   components that could be leveraged for fingerprinting purposes.  That
1035	   is, an attacker could sample transient numeric identifiers to infer
1036	   the algorithm (and its associated parameters, if any) for generating
1037	   such identifiers, possibly revealing the underlying Operating System
1038	   (OS) vendor, type, and version.  This information could possibly be
1039	   further leveraged in conjunction with other fingerprinting techniques
1040	   and sources.

1042	   Evasion of protocol-stack fingerprinting can prove to be a very
1043	   difficult task: most systems make use of a wide variety of protocols,
1044	   each of which has a large number of parameters that can be set to
1045	   arbitrary values or generated with a variety of algorithms with
1046	   multiple parameters.

1048	   NOTE:
1049	      General protocol-based fingerprinting is discussed in [RFC6973],
1050	      along with guidelines to mitigate the associated vulnerability.
1051	      [Fyodor1998] and [Fyodor2006] are classic references on Operating
1052	      System detection via TCP/IP stack fingerprinting.  Nmap [nmap] is
1053	      probably the most popular tool for remote OS identification via
1054	      active TCP/IP stack fingerprinting. p0f [Zalewski2012], on the
1055	      other hand, is a tool for performing remote OS detection via
1056	      passive TCP/IP stack fingerprinting.  Finally, [TBIT] is a TCP
1057	      fingerprinting tool that aims at characterising the behaviour of a
1058	      remote TCP peer based on active probes, and which has been widely
1059	      used in the research community.

1061	   Algorithms that, from the perspective of an observer (e.g., the
1062	   legitimate communicating peer), result in specific values or
1063	   patterns, will allow for at least some level of fingerprinting.  For
1064	   example, the algorithm from Section 7.3 will typically allow
1065	   fingerprinting within the context where the resulting identifiers are
1066	   stable.  Similarly, the algorithms from Section 7.4 will result in
1067	   monotonically-increasing sequences within a given context, thus
1068	   allowing for at least some level of fingerprinting (when the other
1069	   communicating entity can correlate different sampled identifiers as
1070	   belonging to the same monotonically-increasing sequence).

1072	   Thus, where possible, algorithms from Section 7.1 should be preferred
1073	   over algorithms that result in specific values or patterns.

1075	8.4.  Exploitation of the Semantics of Transient Numeric Identifiers

1077	   Identifiers that are not semantically opaque tend to be more
1078	   predictable than semantically-opaque identifiers.  For example, a MAC
1079	   address contains an OUI (Organizationally-Unique Identifier) which
1080	   identifies the vendor that manufactured the corresponding network
1081	   interface card.  This can be leveraged by an attacker trying to
1082	   "guess" MAC addresses, who has some knowledge about the possible NIC
1083	   vendor.

1085	   [RFC7707] discusses a number of techniques to reduce the search space
1086	   when performing IPv6 address-scanning attacks by leveraging the
1087	   semantics of the IIDs produced by traditional SLAAC algorithms
1088	   (eventually replaced by [RFC7217]) that embed MAC addresses in the
1089	   IID of IPv6 addresses.

1091	8.5.  Exploitation of Collisions of Transient Numeric Identifiers

1093	   In many cases, the collision of transient numeric identifiers can
1094	   have a hard failure severity (or result in a hard failure severity if
1095	   an attacker can cause multiple collisions deterministically, one
1096	   after another).  For example, predictable Fragment Identification
1097	   values open the door to Denial of Service (DoS) attacks (see e.g.
1098	   [RFC5722]).

1100	8.6.  Exploitation of Predictable Transient Numeric Identifiers for
1101	      Injection Attacks

1103	   Some protocols rely on "sequence numbers" for the validation of
1104	   incoming packets.  For example, TCP employs sequence numbers for
1105	   reassembling TCP segments, while IPv4 and IPv6 employ Fragment
1106	   Identification values for reassembling IPv4 and IPv6 fragments
1107	   (respectively).  Lacking built-in cryptographic mechanisms for
1108	   validating packets, these protocols are therefore vulnerable to on-
1109	   path data (see e.g.  [Joncheray1995]) and/or control-information (see
1110	   e.g.  [RFC4953] and [RFC5927]) injection attacks.  The extent to
1111	   which these protocols may resist off-path (i.e. "blind") injection
1112	   attacks depends on whether the associated "sequence numbers" are
1113	   predictable, and the effort required to successfully predict a valid
1114	   "sequence number" (see e.g.  [RFC4953] and [RFC5927]).

1116	   We note that the use of unpredictable "sequence numbers" is a
1117	   completely ineffective mitigation for on-path injection attacks, and
1118	   also a mostly-ineffective mitigation for off-path (i.e. "blind")
1119	   injection attacks.  However, many legacy protocols (such as TCP) do
1120	   not natively incorporate cryptographic mitigations, but rather only
1121	   as optional features (see e.g.  [RFC5925]), if at all available.
1122	   Additionally, ad-hoc use of cryptographic mitigations might not be
1123	   sufficient to relieve a protocol implementation of generating
1124	   appropriate transient numeric identifiers.  For example, use of the
1125	   Transport Layer Security (TLS) protocol [RFC8446] with TCP will
1126	   protect the application protocol, but will not help to mitigate e.g.
1127	   TCP-based connection-reset attacks (see e.g.  [RFC4953]).  Similarly,
1128	   use of SEcure Neighbor Discovery (SEND) [RFC3971] will still imply
1129	   reliance on the successful reassembly of IPv6 fragments in those
1130	   cases where SEND packets do not fit into the link Maximum
1131	   Transmission Unit (MTU) (see [RFC6980]).

1133	8.7.  Cryptanalysis

1135	   A number of algorithms discussed in this document (such as those
1136	   described in Section 7.4.2 and Section 7.4.3) rely on PRFs.
1137	   Implementations that employ weak PRFs or keys of inappropriate size
1138	   can be subject to cryptanalysis, where an attacker can obtain the
1139	   secret key employed for the PRF, predict numeric identifiers, etc.

1141	   Furthermore, an implementation that overloads the semantics of the
1142	   secret key can result in more trivial cryptanalysis, possibly
1143	   resulting in the leakage of the value employed for the secret key.

1145	   NOTE:
1146	      [IPID-DEV] describes two vulnerable transient numeric ID
1147	      generators that employ cryptographically-weak hash functions.
1148	      Additionally, one such implementation employs 32 bits of a
1149	      kernel address as the secret key for a hash function, and
1150	      therefore successful cryptanalysis leaks the aforementioned kernel
1151	      address, allowing for Kernel Address Space Layout Randomization
1152	      (KASLR) [KASLR] bypass.

1154	9.  Vulnerability Assessment of Transient Numeric Identifiers

1156	   The following subsections analyze possible vulnerabilities associated
1157	   with the algorithms described in Section 7.

1159	9.1.  Category #1: Uniqueness (soft failure)

1161	   Possible vulnerabilities associated with the algorithms from
1162	   Section 7.1 include:

1164	   o  Use of flawed PRNGs (please see e.g.  [Zalewski2001],
1165	      [Zalewski2002] and [Klein2007]),

1167	   o  Inadvertently affecting the distribution of an otherwise suitable
1168	      PRNG.

1170	   An implementer should consult [RFC4086] regarding randomness
1171	   requirements for security, and consult relevant documentation when
1172	   employing a PRNG provided by the underlying system.

1174	   When employing a PRNG, many implementations "adapt" the length of its
1175	   output with a modulo operator (e.g., C language's "%"), possibly
1176	   changing the distribution of the output of the PRNG.

1178	   For example, consider an implementation that employs the following
1179	   code:

1181	                          id = random() % 50000;

1183	   The example implementation means to obtain a transient numeric
1184	   identifier in the range 0-49999.  If random() produces e.g. a
1185	   pseudorandom number of 16 bits (with uniform distribution), the
1186	   selected numeric ID will have a non-uniform distribution, with the
1187	   numbers in the range 0-15535 occurring twice as often as those in
1188	   the range 15536-49999.  This effect is reduced if the PRNG produces
1189	   an output much longer than the length implied by the modulo operation.
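
   As an illustration of the effect described above, the following C
   sketch counts how many 16-bit PRNG outputs map to each identifier
   under "% 50000", and shows one common way to avoid the skew
   (rejection sampling).  The names count_modulo_hits(), uniform_id()
   and seq16() are hypothetical and chosen only for this example;
   seq16() is a demonstration-only stand-in for a real PRNG.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ID_RANGE 50000u   /* identifiers 0-49999, as in the example above */

/* Feed every possible 16-bit PRNG output through "% ID_RANGE" and
 * record how many inputs map to each identifier. */
static void count_modulo_hits(uint8_t hits[ID_RANGE])
{
    memset(hits, 0, ID_RANGE);
    for (uint32_t v = 0; v <= UINT16_MAX; v++)
        hits[v % ID_RANGE]++;
}

/* Demonstration-only source: walks through all 16-bit values in order.
 * A real implementation would use a suitable PRNG instead. */
static uint16_t seq16(void)
{
    static uint16_t state = 0;
    return state++;
}

/* Rejection sampling: discard PRNG outputs that fall in the incomplete
 * final block, so that every identifier is equally likely.
 * next16() is a caller-supplied source of uniform 16-bit values. */
static uint16_t uniform_id(uint16_t (*next16)(void), uint16_t range)
{
    assert(range > 0);
    /* Largest multiple of "range" that fits in the 16-bit output space */
    uint32_t limit = (65536u / range) * range;
    uint32_t v;

    do {
        v = next16();
    } while (v >= limit);

    return (uint16_t)(v % range);
}
```

   With a range of 50000, uniform_id() discards outputs 50000-65535, so
   each identifier corresponds to exactly one accepted PRNG output, at
   the cost of an occasional extra call to the PRNG.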

1191	   Use of algorithms other than PRNGs for generating identifiers of this
1192	   category is discouraged.

1194	9.2.  Category #2: Uniqueness (hard failure)

1196	   As noted in Section 7.2, this category can employ the same algorithms
1197	   as Category #4, since a monotonically-increasing sequence tends to
1198	   minimize the transient numeric identifier reuse frequency.
1199	   Therefore, the vulnerability analysis in Section 9.4 also applies to
1200	   this category.

1202	   Additionally, as noted in Section 7.2, some transient numeric
1203	   identifiers of this category might be able to use the algorithms from
1204	   Section 7.1, in which case the same considerations as in Section 9.1
1205	   would apply.

1207	9.3.  Category #3: Uniqueness, stable within context (soft failure)

1209	   Possible vulnerabilities associated with the algorithms from
1210	   Section 7.3 are:

1212	   1.  Use of weak PRFs, or inappropriate secret keys (whether
1213	       inappropriate selection or inappropriate size) could allow for
1214	       cryptanalysis, which could eventually be exploited by an attacker
1215	       to predict future transient numeric identifiers.

1217	   2.  Since the algorithm generates a unique and stable identifier
1218	       within a specified context, it may allow for network activity
1219	       correlation and fingerprinting within the specified context.

1221	9.4.  Category #4: Uniqueness, monotonically increasing within context
1222	      (hard failure)

1224	   The algorithm described in Section 7.4.1 for generating identifiers
1225	   of Category #4 will result in an identifiable pattern (i.e. a
1226	   monotonically-increasing sequence) for the transient numeric
1227	   identifiers generated for each CONTEXT, and thus will allow for
1228	   fingerprinting and network activity correlation within each CONTEXT.

1230	   On the other hand, a simple way to generalize and analyze the
1231	   algorithms described in Section 7.4.2 and Section 7.4.3 for
1232	   generating identifiers of Category #4 is as follows:

1234	       /* Transient Numeric ID selection function */

1236	       id_range = max_id - min_id + 1;
1237	       retry = id_range;
1238	       id_inc = increment() % id_range;

1240	       do {
1241	           update_mono(CONTEXT, id_inc);
1242	           next_id = min_id + (offset(CONTEXT) + \
1243	                               mono(CONTEXT)) % id_range;

1245	           if (suitable_id(next_id)) {
1246	               return next_id;
1247	           }

1249	           retry = retry - id_inc;

1251	       } while (retry > 0);

1253	       return ERROR;

1255	   NOTES:
1256	      increment() returns a small integer that is employed to generate a
1257	      monotonically-increasing function.  Most implementations employ a
1258	      constant value for "increment()" (usually 1).  The value returned
1259	      by increment() must be much smaller than the value computed for
1260	      "id_range".

1262	      update_mono(CONTEXT, id_inc) increments the counter corresponding
1263	      to CONTEXT by "id_inc".

1265	      mono(CONTEXT) reads the counter corresponding to CONTEXT.

1267	   Essentially, an identifier (next_id) is generated by adding a
1268	   monotonically-increasing function (mono()) to an offset value,
1269	   unknown to the attacker and stable for a given context (CONTEXT).

1271	   The following aspects of the algorithm should be considered:

1273	   o  For the most part, it is the offset() function that results in
1274	      identifiers that are unpredictable by an off-path attacker.
1275	      While the resulting sequence is known to be monotonically-
1276	      increasing, the use of a randomized offset value makes the
1277	      resulting values unknown to the attacker.

1279	   o  The most straightforward "stateless" implementation of offset() is
1280	      with a PRF that takes the values that identify the context and a
1281	      "secret_key" (not shown in the figure above) as arguments.

1283	   o  One possible implementation of mono() would be to have mono()
1284	      internally employ a single counter (as in the algorithm from
1285	      Section 7.4.2), or map the increments for different contexts into
1286	      a number of counters/buckets, such that the number of counters
1287	      that need to be maintained in memory is reduced (as in the
1288	      algorithm from Section 7.4.3).

1290	   o  In all cases, a monotonically increasing function is implemented
1291	      by incrementing the previous value of a counter by increment()
1292	      units.  In the most trivial case, increment() could return the
1293	      constant "1".  But increment() could also be implemented to return
1294	      small random integers such that the increments are unpredictable
1295	      (see Appendix A of [RFC7739]).  This represents a tradeoff between
1296	      the unpredictability of the resulting transient numeric IDs and
1297	      the transient numeric ID reuse frequency.
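
   The aspects listed above can be put together in a short C sketch.
   It is only one possible instantiation, under several assumptions:
   contexts are represented as 64-bit values, increment() is the
   constant 1, mono() uses a small table of counters, and the
   suitable_id() retry loop is omitted.  toy_prf() is a hypothetical
   placeholder and is NOT a cryptographic PRF; an implementation would
   substitute a vetted keyed PRF.  The names key, key2 and TABLE_LENGTH
   are likewise illustrative.

```c
#include <assert.h>
#include <stdint.h>

#define TABLE_LENGTH 16u            /* number of counters (assumed value) */

static uint32_t counters[TABLE_LENGTH];

/* Placeholder keyed mixing function (FNV-1a style, key folded into the
 * initial state).  NOT a cryptographic PRF: for illustration only. */
static uint64_t toy_prf(uint64_t context, uint64_t secret_key)
{
    uint64_t h = 14695981039346656037ull ^ secret_key;

    for (int i = 0; i < 8; i++) {
        h ^= (context >> (8 * i)) & 0xff;
        h *= 1099511628211ull;
    }
    return h;
}

/* offset(): stable for a given context and key */
static uint32_t offset(uint64_t context, uint64_t key)
{
    return (uint32_t)toy_prf(context, key);
}

/* Map a context to one of TABLE_LENGTH counters, using a second key */
static uint32_t *bucket(uint64_t context, uint64_t key2)
{
    return &counters[toy_prf(context, key2) % TABLE_LENGTH];
}

/* Select the next identifier in [min_id, max_id] for a context */
static uint32_t next_id(uint64_t context, uint64_t key, uint64_t key2,
                        uint32_t min_id, uint32_t max_id)
{
    assert(min_id <= max_id);
    uint64_t id_range = (uint64_t)max_id - min_id + 1;
    uint32_t *mono = bucket(context, key2);

    *mono += 1;                      /* update_mono(), increment() == 1 */
    return (uint32_t)(min_id +
                      ((uint64_t)offset(context, key) + *mono) % id_range);
}
```

   Two consecutive calls for the same context return consecutive
   identifiers (modulo the range), while the starting point depends on
   the key and the context.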

1299	   Considering the generic algorithm illustrated above, we can identify
1300	   the following possible vulnerabilities:

1302	   o  Since the algorithms for this category are similar to those of
1303	      Section 9.3, with the addition of a monotonically-increasing
1304	      function, all the issues discussed in Section 9.3 ("Category #3:
1305	      Uniqueness, stable within context (soft failure)") also apply to
1306	      this case.

1308	   o  mono() can be correlated to the number of identifiers generated
1309	      for a given context (CONTEXT).  Thus, if mono() spans more than
1310	      the necessary context, the "increments" could be leaked to other
1311	      parties, thus disclosing information about the number of
1312	      identifiers that have been generated by the algorithm for all
1313	      contexts.  This information disclosure becomes more evident
1314	      when an implementation employs a constant increment of 1.  For
1315	      example, an implementation where mono() is actually a single
1316	      global counter will unnecessarily leak the number of
1317	      identifiers that have been generated by the algorithm (globally,
1318	      for all contexts).  [Fyodor2004] is one example of how such
1319	      information leakages can be exploited.  We note that limiting the
1320	      span of the increments space will require a larger number of
1321	      counters to be stored in memory (i.e., a larger value for the
1322	      TABLE_LENGTH parameter of the algorithm in Section 7.4.3).

1324	   o  Transient numeric identifiers generated with the algorithms
1325	      described in Section 7.4.2 and Section 7.4.3 will normally allow
1326	      for fingerprinting within CONTEXT since, for such context, the
1327	      resulting identifiers will have an identifiable pattern (i.e. a
1328	      monotonically-increasing sequence).
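
   The information leakage caused by a mono() that spans more than the
   necessary context can be illustrated with a small C sketch.  It
   assumes a constant increment of 1 and 16-bit counters, and it leaves
   out offset(), since a per-context constant offset cancels out when
   an observer subtracts two identifiers sampled from its own context.
   All function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_CONTEXTS 4u

static uint16_t global_counter;                    /* shared by all contexts */
static uint16_t per_context_counter[NUM_CONTEXTS]; /* one per context */

/* mono() implemented as a single global counter */
static uint16_t next_global(unsigned context)
{
    assert(context < NUM_CONTEXTS);
    return ++global_counter;
}

/* mono() implemented with one counter per context */
static uint16_t next_per_context(unsigned context)
{
    assert(context < NUM_CONTEXTS);
    return ++per_context_counter[context];
}

/* Number of identifiers issued in between two samples that an observer
 * obtained from its own context (unsigned arithmetic handles wrap). */
static uint16_t leaked_count(uint16_t first, uint16_t second)
{
    return (uint16_t)(second - first - 1);
}
```

   With next_global(), the value returned by leaked_count() equals the
   number of identifiers generated for all other contexts between the
   two samples; with next_per_context(), it is always zero.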

1330	10.  IANA Considerations

1332	   This document has no IANA actions.

1334	11.  Security Considerations

1336	   The entire document is about the security and privacy implications of
1337	   transient numeric identifiers.
1338	   [I-D.gont-numeric-ids-sec-considerations] recommends that protocol
1339	   specifications specify the interoperability requirements of their
1340	   transient numeric identifiers, perform a vulnerability assessment of
1341	   their transient numeric identifiers, and suggest an algorithm for
1342	   generating each of their transient numeric identifiers.  This
1343	   document analyzes possible algorithms (and their implications) that
1344	   could be employed to comply with the interoperability properties of
1345	   most common categories of transient numeric identifiers, while
1346	   minimizing the associated negative security and privacy implications.

1348	12.  Acknowledgements

1350	   The authors would like to thank (in alphabetical order) Bernard
1351	   Aboba, Steven Bellovin, Luis Leon Cardenas Graide, Guillermo Gont,
1352	   Joseph Lorenzo Hall, Gre Norcie, Shivan Sahib, Martin Thomson,
1353	   and Michael Tuexen for providing valuable comments on earlier
1354	   versions of this document.

1356	   The authors would like to thank Shivan Sahib and Christopher Wood,
1357	   for their guidance during the publication process of this document.

1359	   The authors would like to thank Diego Armando Maradona for his magic
1360	   and inspiration.

1362	13.  References

1364	13.1.  Normative References

1366	   [RFC0791]  Postel, J., "Internet Protocol", STD 5, RFC 791,
1367	              DOI 10.17487/RFC0791, September 1981,
1368	              <https://www.rfc-editor.org/info/rfc791>.

1370	   [RFC0793]  Postel, J., "Transmission Control Protocol", STD 7,
1371	              RFC 793, DOI 10.17487/RFC0793, September 1981,
1372	              <https://www.rfc-editor.org/info/rfc793>.

1374	   [RFC1035]  Mockapetris, P., "Domain names - implementation and
1375	              specification", STD 13, RFC 1035, DOI 10.17487/RFC1035,
1376	              November 1987, <https://www.rfc-editor.org/info/rfc1035>.

1378	   [RFC1321]  Rivest, R., "The MD5 Message-Digest Algorithm", RFC 1321,
1379	              DOI 10.17487/RFC1321, April 1992,
1380	              <https://www.rfc-editor.org/info/rfc1321>.

1382	   [RFC2460]  Deering, S. and R. Hinden, "Internet Protocol, Version 6
1383	              (IPv6) Specification", RFC 2460, DOI 10.17487/RFC2460,
1384	              December 1998, <https://www.rfc-editor.org/info/rfc2460>.

1386	   [RFC4086]  Eastlake 3rd, D., Schiller, J., and S. Crocker,
1387	              "Randomness Requirements for Security", BCP 106, RFC 4086,
1388	              DOI 10.17487/RFC4086, June 2005,
1389	              <https://www.rfc-editor.org/info/rfc4086>.

1391	   [RFC4291]  Hinden, R. and S. Deering, "IP Version 6 Addressing
1392	              Architecture", RFC 4291, DOI 10.17487/RFC4291, February
1393	              2006, <https://www.rfc-editor.org/info/rfc4291>.

1395	   [RFC4862]  Thomson, S., Narten, T., and T. Jinmei, "IPv6 Stateless
1396	              Address Autoconfiguration", RFC 4862,
1397	              DOI 10.17487/RFC4862, September 2007,
1398	              <https://www.rfc-editor.org/info/rfc4862>.

1400	   [RFC4941]  Narten, T., Draves, R., and S. Krishnan, "Privacy
1401	              Extensions for Stateless Address Autoconfiguration in
1402	              IPv6", RFC 4941, DOI 10.17487/RFC4941, September 2007,
1403	              <https://www.rfc-editor.org/info/rfc4941>.

1405	   [RFC5722]  Krishnan, S., "Handling of Overlapping IPv6 Fragments",
1406	              RFC 5722, DOI 10.17487/RFC5722, December 2009,
1407	              <https://www.rfc-editor.org/info/rfc5722>.

1409	   [RFC5925]  Touch, J., Mankin, A., and R. Bonica, "The TCP
1410	              Authentication Option", RFC 5925, DOI 10.17487/RFC5925,
1411	              June 2010, <https://www.rfc-editor.org/info/rfc5925>.

1413	   [RFC6056]  Larsen, M. and F. Gont, "Recommendations for Transport-
1414	              Protocol Port Randomization", BCP 156, RFC 6056,
1415	              DOI 10.17487/RFC6056, January 2011,
1416	              <https://www.rfc-editor.org/info/rfc6056>.

1418	   [RFC6151]  Turner, S. and L. Chen, "Updated Security Considerations
1419	              for the MD5 Message-Digest and the HMAC-MD5 Algorithms",
1420	              RFC 6151, DOI 10.17487/RFC6151, March 2011,
1421	              <https://www.rfc-editor.org/info/rfc6151>.

1423	   [RFC6191]  Gont, F., "Reducing the TIME-WAIT State Using TCP
1424	              Timestamps", BCP 159, RFC 6191, DOI 10.17487/RFC6191,
1425	              April 2011, <https://www.rfc-editor.org/info/rfc6191>.

1427	   [RFC6528]  Gont, F. and S. Bellovin, "Defending against Sequence
1428	              Number Attacks", RFC 6528, DOI 10.17487/RFC6528, February
1429	              2012, <https://www.rfc-editor.org/info/rfc6528>.

1431	   [RFC7217]  Gont, F., "A Method for Generating Semantically Opaque
1432	              Interface Identifiers with IPv6 Stateless Address
1433	              Autoconfiguration (SLAAC)", RFC 7217,
1434	              DOI 10.17487/RFC7217, April 2014,
1435	              <https://www.rfc-editor.org/info/rfc7217>.

1437	   [RFC7323]  Borman, D., Braden, B., Jacobson, V., and R.
1438	              Scheffenegger, Ed., "TCP Extensions for High Performance",
1439	              RFC 7323, DOI 10.17487/RFC7323, September 2014,
1440	              <https://www.rfc-editor.org/info/rfc7323>.

1442	   [RFC8064]  Gont, F., Cooper, A., Thaler, D., and W. Liu,
1443	              "Recommendation on Stable IPv6 Interface Identifiers",
1444	              RFC 8064, DOI 10.17487/RFC8064, February 2017,
1445	              <https://www.rfc-editor.org/info/rfc8064>.

1447	   [RFC8200]  Deering, S. and R. Hinden, "Internet Protocol, Version 6
1448	              (IPv6) Specification", STD 86, RFC 8200,
1449	              DOI 10.17487/RFC8200, July 2017,
1450	              <https://www.rfc-editor.org/info/rfc8200>.

1452	13.2.  Informative References

1454	   [Bellovin1989]
1455	              Bellovin, S., "Security Problems in the TCP/IP Protocol
1456	              Suite", Computer Communications Review, vol. 19, no. 2,
1457	              pp. 32-48, 1989,
1458	              <https://www.cs.columbia.edu/~smb/papers/ipext.pdf>.

1460	   [Bellovin2002]
1461	              Bellovin, S., "A Technique for Counting NATted Hosts",
1462	              IMW'02 Nov. 6-8, 2002, Marseille, France, 2002.

1464	   [CPNI-TCP]
1465	              Gont, F., "Security Assessment of the Transmission Control
1466	              Protocol (TCP)",  United Kingdom's Centre for the
1467	              Protection of National Infrastructure (CPNI) Technical
1468	              Report, 2009, <https://www.gont.com.ar/papers/tn-03-09-
1469	              security-assessment-TCP.pdf>.

1471	   [EFF]      EFF, "Cover your tracks: See how trackers view your
1472	              browser", 2020, <https://coveryourtracks.eff.org/>.

1474	   [FIPS-SHS]
1475	              FIPS, "Secure Hash Standard (SHS)",  Federal Information
1476	              Processing Standards Publication 180-4, August 2015,
1477	              <https://nvlpubs.nist.gov/nistpubs/FIPS/
1478	              NIST.FIPS.180-4.pdf>.

1480	   [Fyodor1998]
1481	              Fyodor, "Remote OS Detection via TCP/IP Stack
1482	              Fingerprinting",  Phrack Magazine, Volume 9, Issue 54,
1483	              1998, <http://www.phrack.org/archives/issues/54/9.txt>.

1485	   [Fyodor2004]
1486	              Fyodor, "Idle scanning and related IP ID games", 2004,
1487	              <http://www.insecure.org/nmap/idlescan.html>.

1489	   [Fyodor2006]
1490	              Fyodor, "Remote OS Detection via TCP/IP Fingerprinting
1491	              (2nd Generation)", 1998,
1492	              <http://insecure.org/nmap/osdetect/>.

1494	   [I-D.gont-numeric-ids-sec-considerations]
1495	              Gont, F. and I. Arce, "Security Considerations for
1496	              Transient Numeric Identifiers Employed in Network
1497	              Protocols", draft-gont-numeric-ids-sec-considerations-06
1498	              (work in progress), December 2020.

1500	   [I-D.irtf-pearg-numeric-ids-history]
1501	              Gont, F. and I. Arce, "Unfortunate History of Transient
1502	              Numeric Identifiers", draft-irtf-pearg-numeric-ids-
1503	              history-06 (work in progress), January 2021.

1505	   [IANA-PROT]
1506	              IANA, "Protocol Registries",
1507	              <https://www.iana.org/protocols>.

1509	   [IPID-DEV]
1510	              Klein, A. and B. Pinkas, "From IP ID to Device ID and
1511	              KASLR Bypass (Extended Version)", June 2019,
1512	              <https://arxiv.org/pdf/1906.10478.pdf>.

1514	   [Joncheray1995]
1515	              Joncheray, L., "A Simple Active Attack Against TCP", Proc.
1516	              Fifth Usenix UNIX Security Symposium, 1995.

1518	   [KASLR]    PaX Team, "Address Space Layout Randomization",
1519	              <https://pax.grsecurity.net/docs/aslr.txt>.

1521	   [Klein2007]
1522	              Klein, A., "OpenBSD DNS Cache Poisoning and Multiple O/S
1523	              Predictable IP ID Vulnerability", 2007,
1524	              <http://www.trusteer.com/files/OpenBSD_DNS_Cache_Poisoning
1525	              _and_Multiple_OS_Predictable_IP_ID_Vulnerability.pdf>.

1527	   [Morris1985]
1528	              Morris, R., "A Weakness in the 4.2BSD UNIX TCP/IP
1529	              Software", CSTR 117, AT&T Bell Laboratories, Murray Hill,
1530	              NJ, 1985,
1531	              <https://pdos.csail.mit.edu/~rtm/papers/117.pdf>.

1533	   [nmap]     Fyodor, "Nmap: Free Security Scanner For Network
1534	              Exploration and Audit", 2020,
1535	              <https://www.insecure.org/nmap>.

1537	   [RFC3971]  Arkko, J., Ed., Kempf, J., Zill, B., and P. Nikander,
1538	              "SEcure Neighbor Discovery (SEND)", RFC 3971,
1539	              DOI 10.17487/RFC3971, March 2005,
1540	              <https://www.rfc-editor.org/info/rfc3971>.

1542	   [RFC4953]  Touch, J., "Defending TCP Against Spoofing Attacks",
1543	              RFC 4953, DOI 10.17487/RFC4953, July 2007,
1544	              <https://www.rfc-editor.org/info/rfc4953>.

1546	   [RFC4963]  Heffner, J., Mathis, M., and B. Chandler, "IPv4 Reassembly
1547	              Errors at High Data Rates", RFC 4963,
1548	              DOI 10.17487/RFC4963, July 2007,
1549	              <https://www.rfc-editor.org/info/rfc4963>.

1551	   [RFC5927]  Gont, F., "ICMP Attacks against TCP", RFC 5927,
1552	              DOI 10.17487/RFC5927, July 2010,
1553	              <https://www.rfc-editor.org/info/rfc5927>.

1555	   [RFC6274]  Gont, F., "Security Assessment of the Internet Protocol
1556	              Version 4", RFC 6274, DOI 10.17487/RFC6274, July 2011,
1557	              <https://www.rfc-editor.org/info/rfc6274>.

1559	   [RFC6973]  Cooper, A., Tschofenig, H., Aboba, B., Peterson, J.,
1560	              Morris, J., Hansen, M., and R. Smith, "Privacy
1561	              Considerations for Internet Protocols", RFC 6973,
1562	              DOI 10.17487/RFC6973, July 2013,
1563	              <https://www.rfc-editor.org/info/rfc6973>.

1565	   [RFC6980]  Gont, F., "Security Implications of IPv6 Fragmentation
1566	              with IPv6 Neighbor Discovery", RFC 6980,
1567	              DOI 10.17487/RFC6980, August 2013,
1568	              <https://www.rfc-editor.org/info/rfc6980>.

1570	   [RFC7098]  Carpenter, B., Jiang, S., and W. Tarreau, "Using the IPv6
1571	              Flow Label for Load Balancing in Server Farms", RFC 7098,
1572	              DOI 10.17487/RFC7098, January 2014,
1573	              <https://www.rfc-editor.org/info/rfc7098>.

1575	   [RFC7258]  Farrell, S. and H. Tschofenig, "Pervasive Monitoring Is an
1576	              Attack", BCP 188, RFC 7258, DOI 10.17487/RFC7258, May
1577	              2014, <https://www.rfc-editor.org/info/rfc7258>.

1579	   [RFC7707]  Gont, F. and T. Chown, "Network Reconnaissance in IPv6
1580	              Networks", RFC 7707, DOI 10.17487/RFC7707, March 2016,
1581	              <https://www.rfc-editor.org/info/rfc7707>.

1583	   [RFC7721]  Cooper, A., Gont, F., and D. Thaler, "Security and Privacy
1584	              Considerations for IPv6 Address Generation Mechanisms",
1585	              RFC 7721, DOI 10.17487/RFC7721, March 2016,
1586	              <https://www.rfc-editor.org/info/rfc7721>.

1588	   [RFC7739]  Gont, F., "Security Implications of Predictable Fragment
1589	              Identification Values", RFC 7739, DOI 10.17487/RFC7739,
1590	              February 2016, <https://www.rfc-editor.org/info/rfc7739>.

1592	   [RFC8446]  Rescorla, E., "The Transport Layer Security (TLS) Protocol
1593	              Version 1.3", RFC 8446, DOI 10.17487/RFC8446, August 2018,
1594	              <https://www.rfc-editor.org/info/rfc8446>.

1596	   [Sanfilippo1998a]
1597	              Sanfilippo, S., "about the ip header id", Post to Bugtraq
1598	              mailing-list, Mon Dec 14 1998,
1599	              <http://seclists.org/bugtraq/1998/Dec/48>.

1601	   [Sanfilippo1998b]
1602	              Sanfilippo, S., "Idle scan", Post to Bugtraq mailing-list,
1603	              1998, <https://github.com/antirez/hping/blob/master/docs/
1604	              SPOOFED_SCAN.txt>.

1606	   [Sanfilippo1999]
1607	              Sanfilippo, S., "more ip id", Post to Bugtraq mailing-
1608	              list, 1999,
1609	              <https://github.com/antirez/hping/raw/master/docs/MORE-
1610	              FUN-WITH-IPID>.

1612	   [Schuba1993]
1613	              Schuba, C., "ADDRESSING WEAKNESSES IN THE DOMAIN NAME
1614	              SYSTEM PROTOCOL", 1993,
1615	              <http://ftp.cerias.purdue.edu/pub/papers/christoph-schuba/
1616	              schuba-DNS-msthesis.pdf>.

1618	   [Shimomura1995]
1619	              Shimomura, T., "Technical details of the attack described
1620	              by Markoff in NYT", Message posted in USENET's
1621	              comp.security.misc newsgroup  Message-ID:
1622	              <3g5gkl$5j1@ariel.sdsc.edu>, 1995,
1623	              <https://www.gont.com.ar/docs/post-shimomura-usenet.txt>.

1625	   [Silbersack2005]
1626	              Silbersack, M., "Improving TCP/IP security through
1627	              randomization without sacrificing interoperability",
1628	              EuroBSDCon 2005 Conference, 2005,
1629	              <http://citeseerx.ist.psu.edu/viewdoc/
1630	              download?doi=10.1.1.91.4542&rep=rep1&type=pdf>.

1632	   [TBIT]     TBIT, "TBIT, the TCP Behavior Inference Tool", 2001,
1633	              <http://www.icir.org/tbit/>.

1635	   [TCPT-uptime]
1636	              McDanel, B., "TCP Timestamping - Obtaining System Uptime
1637	              Remotely", March 2001,
1638	              <https://securiteam.com/securitynews/5np0c153pi/>.

1640	   [Zalewski2001]
1641	              Zalewski, M., "Strange Attractors and TCP/IP Sequence
1642	              Number Analysis", 2001,
1643	              <http://lcamtuf.coredump.cx/oldtcp/tcpseq.html>.

1645	   [Zalewski2002]
1646	              Zalewski, M., "Strange Attractors and TCP/IP Sequence
1647	              Number Analysis - One Year Later", 2001,
1648	              <http://lcamtuf.coredump.cx/newtcp/>.

1650	   [Zalewski2012]
1651	              Zalewski, M., "p0f v3 (version 3.09b)", 2012,
1652	              <http://lcamtuf.coredump.cx/p0f.shtml>.

1654	Appendix A.  Algorithms and Techniques with Known Issues

1656	   The following subsections discuss algorithms and techniques with
1657	   known negative security and privacy implications.

1659	   Note:

1661	      As discussed in Section 1, the use of cryptographic techniques
1662	      might allow for the safe use of some of these algorithms and
1663	      techniques.  However, this should be evaluated on a case by case
1664	      basis.

1666	A.1.  Predictable Linear Identifiers Algorithm

1668	   One of the most trivial ways to achieve uniqueness with a low
1669	   identifier reuse frequency is to produce a linear sequence.  This
1670	   type of algorithm has been employed in the past to generate
1671	   identifiers of Categories #1, #2, and #4 (please see Section 6 for an
1672	   analysis of these categories).

1674	   For example, the following algorithm has been employed (see e.g.
1675	   [Morris1985], [Shimomura1995], [Silbersack2005] and [CPNI-TCP]) in a
1676	   number of operating systems for selecting IP fragment IDs, TCP
1677	   ephemeral ports, etc.:

1679	       /* Initialization code */

1681	       next_id = min_id;
1682	       id_inc = 1;

1684	       /* Transient Numeric ID selection function */

1686	       id_range = max_id - min_id + 1;
1687	       retry = id_range;

1689	       do {
1690	           if (next_id == max_id) {
1691	               next_id = min_id;
1692	           }
1693	           else {
1694	               next_id = next_id + id_inc;
1695	           }

1697	           if (suitable_id(next_id)) {
1698	               return next_id;
1699	           }

1701	           retry--;

1703	       } while (retry > 0);

1705	       return ERROR;

1707	   Note:

1709	      suitable_id() is a function that checks whether the resulting
1710	      identifier is acceptable (e.g., whether it's in use, etc.).

1712	   For obvious reasons, this algorithm results in predictable sequences.
1713	   Since a global counter is used to generate the transient numeric
1714	   identifiers ("next_id" in the example above), an entity that learns
1715	   one numeric identifier can infer past numeric identifiers and predict
1716	   future values to be generated by the same algorithm.  Since the value
1717	   employed for the increments is known (such as "1" in this case), an
1718	   attacker can sample two values, and learn the number of identifiers
1719	   that were generated in-between the two sampled values.
1720	   Furthermore, if the counter is initialized e.g. when the system is
1721	   bootstrapped to some known value, the algorithm will leak additional
1722	   information, such as the number of transmitted fragmented datagrams
1723	   in the case of an IP ID generator [Sanfilippo1998a], or the system
1724	   uptime in the case of TCP timestamps [TCPT-uptime].
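
   The inferences described above require very little work on the
   attacker's side, as the following C sketch suggests.  It assumes an
   increment of 1, that suitable_id() did not cause any value to be
   skipped, and, for the uptime estimate, a counter that starts at zero
   at boot and advances at a rate known to the attacker.  The function
   names and the ticks_per_second parameter are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Identifier the generator will produce right after "observed" */
static uint32_t predict_next(uint32_t observed,
                             uint32_t min_id, uint32_t max_id)
{
    assert(min_id <= observed && observed <= max_id);
    return (observed == max_id) ? min_id : observed + 1;
}

/* Identifier the generator produced right before "observed" */
static uint32_t infer_previous(uint32_t observed,
                               uint32_t min_id, uint32_t max_id)
{
    assert(min_id <= observed && observed <= max_id);
    return (observed == min_id) ? max_id : observed - 1;
}

/* Uptime estimate from a timestamp counter that was initialized to
 * zero at boot and advances "ticks_per_second" times per second */
static uint32_t uptime_seconds(uint32_t tsval, uint32_t ticks_per_second)
{
    assert(ticks_per_second > 0);
    return tsval / ticks_per_second;
}
```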

1726	A.2.  Random-Increments Algorithm

1728	   This algorithm offers a middle ground between the algorithms that
1729	   generate randomized transient numeric identifiers (such as those
1730	   described in Section 7.1.1 and Section 7.1.2), and those that
1731	   generate identifiers with a predictable monotonically-increasing
1732	   function (see Appendix A.1).

1734	       /* Initialization code */

1736	       next_id = random();        /* Initialization value */
1737	       id_rinc = 500;             /* Determines the trade-off */

1739	       /* Transient Numeric ID selection function */

1741	       id_range = max_id - min_id + 1;
1742	       retry = id_range;

1744	       do {
1745	           /* Random increment */
1746	           id_inc = (random() % id_rinc) + 1;

1748	           if ( (max_id - next_id) >= id_inc){
1749	               next_id = next_id + id_inc;
1750	           }
1751	           else {
1752	               next_id = min_id + id_inc - (max_id - next_id);
1753	           }

1755	           if (suitable_id(next_id)) {
1756	              return next_id;
1757	           }

1759	           retry = retry - id_inc;

1761	       } while (retry > 0);

1763	       return ERROR;

1765	   This algorithm aims at producing a global monotonically-increasing
1766	   sequence of transient numeric identifiers, while avoiding the use of
1767	   fixed increments, which would lead to trivially predictable
1768	   sequences.  The value "id_inc" allows for direct control of the
1769	   trade-off between unpredictability and identifier reuse frequency.
1770	   The smaller the value of "id_inc", the more similar this algorithm is
1771	   to a predictable, global linear ID generation algorithm (as the one in
1772	   Appendix A.1).  The larger the value of "id_inc", the more similar
1773	   this algorithm is to the algorithm described in Section 7.1.1 of this
1774	   document.

1776	   When the identifiers wrap, there is a risk of collisions of transient
1777	   numeric identifiers (i.e., identifier reuse).  Therefore, "id_inc"
1778	   should be selected according to the following criteria:

1780	   o  It should maximize the wrapping time of the identifier space.

1782	   o  It should minimize identifier reuse frequency.

1784	   o  It should maximize unpredictability.

1786	   Clearly, these are competing goals, and the decision of which value
1787	   of "id_inc" to use is a trade-off.  Therefore, the value of "id_inc"
1788	   is at times a configurable parameter so that system administrators
1789	   can make the trade-off for themselves.  We note that the alternative
1790	   algorithms discussed throughout this document offer better
1791	   interoperability, security and privacy properties than this
1792	   algorithm, and hence implementation of this algorithm is discouraged.
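
   The trade-off can be quantified with a rough estimate.  Assuming
   that each increment is uniformly distributed in the range
   1..id_rinc (i.e., ignoring any bias introduced by the modulo
   operator), the mean increment is (id_rinc + 1) / 2, and the
   identifier space wraps after approximately id_range divided by that
   mean.  The following C sketch computes both values; the function
   names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Mean of an increment drawn uniformly from 1..id_rinc */
static double mean_increment(uint32_t id_rinc)
{
    assert(id_rinc > 0);
    return (id_rinc + 1.0) / 2.0;
}

/* Approximate number of identifiers generated before the sequence
 * wraps around an identifier space of "id_range" values */
static double ids_per_wrap(uint64_t id_range, uint32_t id_rinc)
{
    assert(id_range > 0);
    return (double)id_range / mean_increment(id_rinc);
}
```

   For a 16-bit identifier space and id_rinc = 500, this estimate
   yields a wrap after roughly 260 identifiers, compared with 65536
   identifiers for a fixed increment of 1.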

1794	A.3.  Re-using Identifiers Across Different Contexts

1796	   Employing the same identifier across contexts in which stability is
1797	   not required (i.e. overloading the semantics of a transient numeric
1798	   identifier) usually has negative security and privacy implications.

1800	   For example, in order to generate transient numeric identifiers of
1801	   Category #2 or Category #3, an implementation or specification might
1802	   be tempted to employ a source for the numeric identifiers which is
1803	   known to provide unique values, but that may also be predictable or
1804	   leak information related to the entity generating the identifier.
1805	   This technique has been employed in the past for e.g. generating IPv6
1806	   IIDs by re-using the MAC address of the underlying network interface.
1807	   However, as noted in [RFC7721] and [RFC7707], embedding link-layer
1808	   addresses in IPv6 IIDs not only results in predictable values, but
1809	   also leaks information about the manufacturer of the underlying
1810	   network interface card, allows for network activity correlation, and
1811	   makes address-based scanning attacks feasible.

1813	Authors' Addresses

1815	   Fernando Gont
1816	   SI6 Networks
1817	   Evaristo Carriego 2644
1818	   Haedo, Provincia de Buenos Aires  1706
1819	   Argentina

1821	   Email: fgont@si6networks.com
1822	   URI:   https://www.si6networks.com

1823	   Ivan Arce
1824	   Quarkslab

1826	   Email: iarce@quarkslab.com
1827	   URI:   https://www.quarkslab.com