idnits 2.17.1 

draft-gont-numeric-ids-generation-03.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  -- The document has an IETF Trust Provisions (28 Dec 2009) Section 6.c(ii)
     Publication Limitation clause.  If this document is intended for
     submission to the IESG for publication, this constitutes an error.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == The document doesn't use any RFC 2119 keywords, yet seems to have RFC
     2119 boilerplate text.

  -- The document date (March 11, 2019) is 1870 days in the past.  Is this
     intentional?

  -- Found something which looks like a code comment -- if you have code
     sections in the document, please surround them with '<CODE BEGINS>' and
     '<CODE ENDS>' lines.


  Checking references for intended status: Best Current Practice
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Unused Reference: 'Bellovin1989' is defined on line 1051, but no
     explicit reference was found in the text

  == Unused Reference: 'CERT2001' is defined on line 1061, but no explicit
     reference was found in the text

  == Unused Reference: 'Fyodor2004' is defined on line 1079, but no explicit
     reference was found in the text

  == Unused Reference: 'USCERT2001' is defined on line 1179, but no explicit
     reference was found in the text

  ** Obsolete normative reference: RFC  793 (Obsoleted by RFC 9293)

  ** Obsolete normative reference: RFC 2460 (Obsoleted by RFC 8200)

  ** Obsolete normative reference: RFC 6528 (Obsoleted by RFC 9293)


     Summary: 3 errors (**), 0 flaws (~~), 6 warnings (==), 3 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Network Working Group                                            F. Gont
3	Internet-Draft                                    SI6 Networks / UTN-FRH
4	Intended status: Best Current Practice                           I. Arce
5	Expires: September 12, 2019                                    Quarkslab
6	                                                          March 11, 2019

8	           On the Generation of Transient Numeric Identifiers
9	                  draft-gont-numeric-ids-generation-03

11	Abstract

13	   This document performs an analysis of the security and privacy
14	   implications of different types of "numeric identifiers" used in IETF
15	   protocols, and tries to categorize them based on their
16	   interoperability requirements and the associated failure severity
17	   when such requirements are not met.  Subsequently, it provides advice
18	   on possible algorithms that could be employed to satisfy the
19	   interoperability requirements of each identifier type, while
20	   minimizing the security and privacy implications, thus providing
21	   guidance to protocol designers and protocol implementers.  Finally,
22	   describes a number of algorithms that have been employed in real
23	   implementations to generate transient numeric identifiers and
24	   analyzes their security and privacy properties.

26	Status of This Memo

28	   This Internet-Draft is submitted in full conformance with the
29	   provisions of BCP 78 and BCP 79.

31	   Internet-Drafts are working documents of the Internet Engineering
32	   Task Force (IETF).  Note that other groups may also distribute
33	   working documents as Internet-Drafts.  The list of current Internet-
34	   Drafts is at https://datatracker.ietf.org/drafts/current/.

36	   Internet-Drafts are draft documents valid for a maximum of six months
37	   and may be updated, replaced, or obsoleted by other documents at any
38	   time.  It is inappropriate to use Internet-Drafts as reference
39	   material or to cite them other than as "work in progress."

41	   This Internet-Draft will expire on September 12, 2019.

43	Copyright Notice

45	   Copyright (c) 2019 IETF Trust and the persons identified as the
46	   document authors.  All rights reserved.

48	   This document is subject to BCP 78 and the IETF Trust's Legal
49	   Provisions Relating to IETF Documents
50	   (https://trustee.ietf.org/license-info) in effect on the date of
51	   publication of this document.  Please review these documents
52	   carefully, as they describe your rights and restrictions with respect
53	   to this document.  Code Components extracted from this document must
54	   include Simplified BSD License text as described in Section 4.e of
55	   the Trust Legal Provisions and are provided without warranty as
56	   described in the Simplified BSD License.

58	   This document may not be modified, and derivative works of it may not
59	   be created, and it may not be published except as an Internet-Draft.

61	Table of Contents

63	   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   3
64	   2.  Terminology . . . . . . . . . . . . . . . . . . . . . . . . .   4
65	   3.  Threat Model  . . . . . . . . . . . . . . . . . . . . . . . .   5
66	   4.  Issues with the Specification of Identifiers  . . . . . . . .   5
67	   5.  Protocol Failure Severity . . . . . . . . . . . . . . . . . .   6
68	   6.  Categorizing Identifiers  . . . . . . . . . . . . . . . . . .   6
69	   7.  Common Algorithms for Identifier Generation . . . . . . . . .   9
70	     7.1.  Category #1: Uniqueness (soft failure)  . . . . . . . . .   9
71	     7.2.  Category #2: Uniqueness (hard failure)  . . . . . . . . .  10
72	     7.3.  Category #3: Uniqueness, constant within context (soft-
73	           failure)  . . . . . . . . . . . . . . . . . . . . . . . .  10
74	     7.4.  Category #4: Uniqueness, monotonically increasing within
75	           context (hard failure)  . . . . . . . . . . . . . . . . .  12
76	   8.  Common Vulnerabilities Associated with Transient Numeric
77	       Identifiers . . . . . . . . . . . . . . . . . . . . . . . . .  17
78	     8.1.  Network Activity Correlation  . . . . . . . . . . . . . .  17
79	     8.2.  Information Leakage . . . . . . . . . . . . . . . . . . .  17
80	     8.3.  Exploitation of Semantics of Transient Numeric
81	           Identifiers . . . . . . . . . . . . . . . . . . . . . . .  19
82	     8.4.  Exploitation of Collisions of Transient Numeric
83	           Identifiers . . . . . . . . . . . . . . . . . . . . . . .  19
84	   9.  Vulnerability Analysis of Specific Transient Numeric
85	       Identifiers Categories  . . . . . . . . . . . . . . . . . . .  19
86	     9.1.  Category #1: Uniqueness (soft failure)  . . . . . . . . .  19
87	     9.2.  Category #2: Uniqueness (hard failure)  . . . . . . . . .  20
88	     9.3.  Category #3: Uniqueness, constant within context (soft
89	           failure)  . . . . . . . . . . . . . . . . . . . . . . . .  20
90	     9.4.  Category #4: Uniqueness, monotonically increasing within
91	           context (hard failure)  . . . . . . . . . . . . . . . . .  20
92	   10. IANA Considerations . . . . . . . . . . . . . . . . . . . . .  22
93	   11. Security Considerations . . . . . . . . . . . . . . . . . . .  22
94	   12. Acknowledgements  . . . . . . . . . . . . . . . . . . . . . .  22
95	   13. References  . . . . . . . . . . . . . . . . . . . . . . . . .  23
96	     13.1.  Normative References . . . . . . . . . . . . . . . . . .  23
97	     13.2.  Informative References . . . . . . . . . . . . . . . . .  24
98	   Appendix A.  Flawed Algorithms  . . . . . . . . . . . . . . . . .  27
99	     A.1.  Predictable Linear Identifiers Algorithm  . . . . . . . .  27
100	     A.2.  Random-Increments Algorithm . . . . . . . . . . . . . . .  28
101	   Authors' Addresses  . . . . . . . . . . . . . . . . . . . . . . .  30

103	1.  Introduction

105	   Network protocols employ a variety of numeric identifiers for
106	   different protocol entities, ranging from DNS Transaction IDs (TxIDs)
107	   to transport protocol numbers (e.g.  TCP ports) or IPv6 Interface
108	   Identifiers (IIDs).  These identifiers usually have specific
109	   properties that must be satisfied such that they do not result in
110	   negative interoperability implications (e.g. uniqueness during a
111	   specified period of time), and associated failure severities when
112	   such properties are not met, ranging from soft to hard failures.

114	   For more than 30 years, a large number of implementations of the TCP/
115	   IP protocol suite have been subject to a variety of attacks, with
116	   effects ranging from Denial of Service (DoS) or data injection, to
117	   information leakage that could be exploited for pervasive monitoring
118	   [RFC7528].  The root of these issues has been, in many cases, the
119	   poor selection of identifiers in such protocols, usually as a result
120	   of an insufficient or misleading specification.  While it is
121	   generally trivial to identify an algorithm that can satisfy the
122	   interoperability requirements for a given identifier, there exists
123	   practical evidence that doing so without negatively affecting the
124	   security and/or privacy properties of the aforementioned protocols is
125	   prone to error.

127	   For example, implementations have been subject to security and/or
128	   privacy issues resulting from:

130	   o  Predictable TCP sequence numbers

132	   o  Predictable transport protocol numbers

134	   o  Predictable IPv4 or IPv6 Fragment Identifiers

136	   o  Predictable IPv6 IIDs

138	   o  Predictable DNS TxIDs

140	   Recent history indicates that when new protocols are standardized or
141	   new protocol implementations are produced, the security and privacy
142	   properties of the associated identifiers tend to be overlooked and
143	   inappropriate algorithms to generate identifier values are either
144	   suggested in the specification or selected by implementers.  As a
145	   result, we believe that advice in this area is warranted.

147	   This document contains a non-exhaustive survey of identifiers
148	   employed in various IETF protocols, and aims to categorize such
149	   identifiers based on their interoperability requirements, and the
150	   associated failure severity when such requirements are not met.
151	   Subsequently, it provides advice on possible algorithms that could be
152	   employed to satisfy the interoperability requirements of each
153	   category, while minimizing the associated security and privacy
154	   implications.  Finally, it analyzes several algorithms that have been
155	   employed in real implementations to meet such requirements and
156	   analyzes their security and privacy properties.

158	2.  Terminology

160	   Identifier:
161	      A data object in a protocol specification that can be used to
162	      definitely distinguish a protocol object (a datagram, network
163	      interface, transport protocol endpoint, session, etc) from all
164	      other objects of the same type, in a given context.  Identifiers
165	      are usually defined as a series of bits and represented using
166	      integer values.  We note that different identifiers may have
167	      additional requirements or properties depending on their specific
168	      use in a protocol.  We use the term "identifier" as a generic term
169	      to refer to any data object in a protocol specification that
170	      satisfies the identification property stated above.

172	   Failure Severity:
173	      The consequences of a failure to comply with the interoperability
174	      requirements of a given identifier.  Severity considers the worst
175	      potential consequence of a failure, determined by the system
176	      damage and/or time lost to repair the failure.  In this document
177	      we define two types of failure severity: "soft" and "hard".

179	   Hard Failure:
180	      A hard failure is a non-recoverable condition in which a protocol
181	      does not operate in the prescribed manner or it operates with
182	      excessive degradation of service.  For example, an established TCP
183	      connection that is aborted due to an error condition constitutes,
184	      from the point of view of the transport protocol, a hard failure,
185	      since it enters a state from which normal operation cannot be
186	      recovered.

188	   Soft Failure:
189	      A soft failure is a recoverable condition in which a protocol does
190	      not operate in the prescribed manner but normal operation can be
191	      resumed automatically in a short period of time.  For example, a
192	      simple packet-loss event that is subsequently recovered with a
193	      retransmission can be considered a soft failure.

195	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
196	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
197	   document are to be interpreted as described in RFC 2119 [RFC2119].

199	3.  Threat Model

201	   Throughout this document, we assume an attacker does not have
202	   physical or logical device to the device(s) being attacked.  We
203	   assume the attacker can simply send any traffic to the target
204	   devices, to e.g. sample identifiers employed by such devices.

206	4.  Issues with the Specification of Identifiers

208	   While assessing protocol specifications regarding the use of
209	   identifiers, we found that most of the issues discussed in this
210	   document arise as a result of one of the following:

212	   o  Protocol specifications which under-specify the requirements for
213	      their identifiers

215	   o  Protocol specifications that over-specify their identifiers

217	   o  Protocol implementations that simply fail to comply with the
218	      specified requirements

220	   A number of protocol implementations (too many of them) simply
221	   overlook the security and privacy implications of identifiers.
222	   Examples of them are the specification of TCP port numbers in
223	   [RFC0793], the specification of TCP sequence numbers in [RFC0793], or
224	   the specification of the DNS TxID in [RFC1035].

226	   On the other hand, there are a number of protocol specifications that
227	   over-specify some of their associated protocol identifiers.  For
228	   example, [RFC4291] essentially results in link-layer addresses being
229	   embedded in the IPv6 Interface Identifiers (IIDs) when the
230	   interoperability requirement of uniqueness could be achieved in other
231	   ways that do not result in negative security and privacy implications
232	   [RFC7721].  Similarly, [RFC2460] suggests the use of a global counter
233	   for the generation of Fragment Identification values, when the
234	   interoperability properties of uniqueness per {Src IP, Dst IP} could
235	   be achieved with other algorithms that do not result in negative
236	   security and privacy implications.

238	   Finally, there are protocol implementations that simply fail to
239	   comply with existing protocol specifications.  For example, some
240	   popular operating systems (notably Microsoft Windows) still fail to
241	   implement randomization of transport protocol ephemeral ports, as
242	   specified in [RFC6056].

244	5.  Protocol Failure Severity

246	   Section 2 defines the concept of "Failure Severity" and two types of
247	   failures that we employ throughout this document: soft and hard.

249	   Our analysis of the severity of a failure is performed from the point
250	   of view of the protocol in question.  However, the corresponding
251	   severity on the upper application or protocol may not be the same as
252	   that of the protocol in question.  For example, a TCP connection that
253	   is aborted may or may not result in a hard failure of the upper
254	   application: if the upper application can establish a new TCP
255	   connection without any impact on the application, a hard failure at
256	   the TCP protocol may have no severity at the application level.  On
257	   the other hand, if a hard failure of a TCP connection results in
258	   excessive degradation of service at the application layer, it will
259	   also result in a hard failure at the application.

261	6.  Categorizing Identifiers

263	   This section includes a non-exhaustive survey of identifiers, and
264	   proposes a number of categories that can accommodate these
265	   identifiers based on their interoperability requirements and their
266	   failure modes (soft or hard)
267	   +------------+--------------------------------------+---------------+
268	   | Identifier |    Interoperability Requirements     |    Failure    |
269	   |            |                                      |    Severity   |
270	   +------------+--------------------------------------+---------------+
271	   | IPv6 Frag  |   Uniqueness (for IP address pair)   | Soft/Hard (1) |
272	   |     ID     |                                      |               |
273	   +------------+--------------------------------------+---------------+
274	   |  IPv6 IID  | Uniqueness (and constant within IPv6 |    Soft (3)   |
275	   |            |             prefix) (2)              |               |
276	   +------------+--------------------------------------+---------------+
277	   |  TCP SEQ   |       Monotonically-increasing       |    Hard (4)   |
278	   +------------+--------------------------------------+---------------+
279	   |  TCP eph.  |    Uniqueness (for connection ID)    |      Hard     |
280	   |    port    |                                      |               |
281	   +------------+--------------------------------------+---------------+
282	   | IPv6 Flow  |              Uniqueness              |    None (5)   |
283	   |     L.     |                                      |               |
284	   +------------+--------------------------------------+---------------+
285	   |  DNS TxID  |              Uniqueness              |    None (6)   |
286	   +------------+--------------------------------------+---------------+

288	                      Table 1: Survey of Identifiers

290	   Notes:

292	   (1)
293	      While a single collision of Fragment ID values would simply lead
294	      to a single packet drop (and hence a "soft" failure), repeated
295	      collisions at high data rates might trash the Fragment ID space,
296	      leading to a hard failure [RFC4963].

298	   (2)
299	      While the interoperability requirements are simply that the
300	      Interface ID results in a unique IPv6 address, for operational
301	      reasons it is typically desirable that the resulting IPv6 address
302	      (and hence the corresponding Interface ID) be constant within each
303	      network [I-D.ietf-6man-default-iids] [RFC7217].

305	   (3)
306	      While IPv6 Interface IDs must result in unique IPv6 addresses,
307	      IPv6 Duplicate Address Detection (DAD) [RFC4862] allows for the
308	      detection of duplicate Interface IDs/addresses, and hence such
309	      Interface ID collisions can be recovered.

311	   (4)
312	      In theory there are no interoperability requirements for TCP
313	      sequence numbers, since the TIME-WAIT state and TCP's "quiet time"
314	      take care of old segments from previous incarnations of the
315	      connection.  However, a widespread optimization allows for a new
316	      incarnation of a previous connection to be created if the Initial
317	      Sequence Number (ISN) of the incoming SYN is larger than the last
318	      sequence number seen in that direction for the previous
319	      incarnation of the connection.  Thus, monotonically-increasing TCP
320	      sequence numbers allow for such optimization to work as expected
321	      [RFC6528].

323	   (5)
324	      The IPv6 Flow Label is typically employed for load sharing
325	      [RFC7098], along with the Source and Destination IPv6 addresses.
326	      Reuse of a Flow Label value for the same set {Source Address,
327	      Destination Address} would typically cause both flows to be
328	      multiplexed into the same link.  However, as long as this does not
329	      occur deterministically, it will not result in any negative
330	      implications.

332	   (6)
333	      DNS TxIDs are employed, together with the Source Address,
334	      Destination Address, Source Port, and Destination Port, to match
335	      DNS requests and responses.  However, since an implementation
336	      knows which DNS requests were sent for that set of {Source
337	      Address, Destination Address, Source Port, and Destination Port,
338	      DNS TxID}, a collision of TxID would result, if anything, in a
339	      small performance penalty (the response would be discarded when it
340	      is found that it does not answer the query sent in the
341	      corresponding DNS query).

343	   Based on the survey above, we can categorize identifiers as follows:

345	   +-----+---------------------------------------+---------------------+
346	   | Cat |                Category               |   Sample Proto IDs  |
347	   |  #  |                                       |                     |
348	   +-----+---------------------------------------+---------------------+
349	   |  1  |       Uniqueness (soft failure)       |  IPv6 Flow L., DNS  |
350	   |     |                                       |        TxIDs        |
351	   +-----+---------------------------------------+---------------------+
352	   |  2  |       Uniqueness (hard failure)       |  IPv6 Frag ID, TCP  |
353	   |     |                                       |    ephemeral port   |
354	   +-----+---------------------------------------+---------------------+
355	   |  3  |  Uniqueness, constant within context  |      IPv6 IIDs      |
356	   |     |             (soft failure)            |                     |
357	   +-----+---------------------------------------+---------------------+
358	   |  4  |  Uniqueness, monotonically increasing |       TCP ISN       |
359	   |     |     within context (hard failure)     |                     |
360	   +-----+---------------------------------------+---------------------+

362	                      Table 2: Identifier Categories

364	   We note that Category #4 could be considered a generalized case of
365	   category #3, in which a monotonically increasing element is added to
366	   a constant (within context) element, such that the resulting
367	   identifiers are monotonically increasing within a specified context.
368	   That is, the same algorithm could be employed for both #3 and #4,
369	   given appropriate parameters.

371	7.  Common Algorithms for Identifier Generation

373	   The following subsections describe common algorithms found for
374	   Protocol ID generation for each of the categories above.

376	7.1.  Category #1: Uniqueness (soft failure)

378	7.1.1.  Simple Randomization Algorithm

380	       /* Ephemeral port selection function */
381	       id_range = max_id - min_id + 1;
382	       next_id = min_id + (random() % id_range);
383	       count = next_id;

385	       do {
386	           if(check_suitable_id(next_id))
387	               return next_id;

389	           if (next_id == max_id) {
390	               next_id = min_id;
391	           } else {
392	               next_id++;
393	           }

395	           count--;
396	       } while (count > 0);

398	       return ERROR;

400	   Note:
401	      random() is a function that returns a pseudo-random unsigned
402	      integer number of appropriate size.  Note that the output needs to
403	      be unpredictable, and typical implementations of POSIX random()
404	      function do not necessarily meet this requirement.  See [RFC4086]
405	      for randomness requirements for security.

407	      The function check_suitable_id() can check, when possible, whether
408	      this identifier is e.g. already in use.  When already used, this
409	      algorithm selects the next available protocol ID.

411	      All the variables (in this and all the algorithms discussed in
412	      this document) are unsigned integers.

414	   This algorithm does not suffer from any of the issues discussed in
415	   Section 8.

417	7.1.2.  Another Simple Randomization Algorithm

419	   The following pseudo-code illustrates another algorithm for selecting
420	   a random identifier in which, in the event the identifier is found to
421	   be not suitable (e.g., already in use), another identifier is
422	   selected randomly:

424	       id_range = max_id - min_id + 1;
425	       next_id = min_id + (random() % id_range);
426	       count = id_range;

428	       do {
429	           if(check_suitable_id(next_id))
430	               return next_id;

432	           next_id = min_id + (random() % id_range);
433	           count--;
434	       } while (count > 0);

436	       return ERROR;

438	   This algorithm might be unable to select an identifier (i.e., return
439	   "ERROR") even if there are suitable identifiers available, when there
440	   are a large number of identifiers "in use".

442	   This algorithm does not suffer from any of the issues discussed in
443	   Section 8.

445	7.2.  Category #2: Uniqueness (hard failure)

447	   One of the most trivial approaches for achieving uniqueness for an
448	   identifier (with a hard failure mode) is to implement a linear
449	   function.  As a result, all of the algorithms described in
450	   Section 7.4 are of use for complying the requirements of this
451	   identifier category.

453	7.3.  Category #3: Uniqueness, constant within context (soft-failure)

455	   The goal of this algorithm is to produce identifiers that are
456	   constant for a given context, but that change when the aforementioned
457	   context changes.

459	   Keeping one value for each possible "context" may in many cases be
460	   considered too onerous in terms of memory requirements.  As a
461	   workaround, the following algorithm employs a calculated technique
462	   (as opposed to keeping state in memory) to maintain the constant
463	   identifier for each given context.

465	   In the following algorithm, the function F() provides (statelessly) a
466	   constant identifier for each given context.

468	       /* Protocol ID selection function  */
469	       id_range = max_id - min_id + 1;

471	       counter = 0;

473	       do {
474	           offset = F(CONTEXT, counter, secret_key);
475	           next_id = min_id + (offset % id_range);

477	           if(check_suitable_id(next_id))
478	               return next_id;

480	           counter++;

482	       } while (counter <= MAX_RETRIES);

484	       return ERROR;

486	   The function F() provides a "per-CONTEXT" constant identifier for a
487	   given context. 'offset' may take any value within the storage type
488	   range since we are restricting the resulting identifier to be in the
489	   range [min_id, max_id] in a similar way as in the algorithm described
490	   in Section 7.1.1.  Collisions can be recovered by incrementing the
491	   'counter' variable and recomputing F().

493	   The function F() should be a cryptographic hash function like SHA-256
494	   [FIPS-SHS].  Note: MD5 [RFC1321] is considered unacceptable for F()
495	   [RFC6151].  CONTEXT is the concatenation of all the elements that
496	   define a given context.  For example, if this algorithm is expected
497	   to produce identifiers that are unique per network interface card
498	   (NIC) and SLAAC autoconfiguration prefix, the CONTEXT should be the
499	   concatenation of e.g. the interface index and the SLAAC
500	   autoconfiguration prefix (please see [RFC7217] for an implementation
501	   of this algorithm for the generation of IPv6 IIDs).

503	   The secret should be chosen to be as random as possible (see
504	   [RFC4086] for recommendations on choosing secrets).

506	   For obvious reasons, the transient network identifiers generated with
507	   this algorithm allow for network activity correlation within
508	   "CONTEXT".  However, this is essentially a design goal of this
509	   category of transient network identifiers.

511	7.4.  Category #4: Uniqueness, monotonically increasing within context
512	      (hard failure)

514	7.4.1.  Per-context Counter Algorithm

516	   One possible way to achieve low identifier reuse frequency while
517	   still avoiding predictable sequences would be to employ a per-context
518	   counter, as opposed to a global counter.  Such an algorithm could be
519	   described as follows:

521	       /* Initialization at system boot time. Could be random */
522	       id_inc= 1;

524	       /* Identifier selection function */
525	       count = max_id - min_id + 1;

527	       if(lookup_counter(CONTEXT) == ERROR){
528	            create_counter(CONTEXT);
529	       }

531	       next_id= lookup_counter(CONTEXT);

533	       do {
534	           if (next_id == max_id) {
535	               next_id = min_id;
536	           }
537	           else {
538	               next_id = next_id + id_inc;
539	           }

541	           if (check_suitable_id(next_id)){
542	               store_counter(CONTEXT, next_id);
543	               return next_id;
544	           }

546	           count--;

548	       } while (count > 0);

550	       store_counter(CONTEXT, next_id);
551	       return ERROR;

553	   NOTE:

555	      lookup_counter() returns the current counter for a given context,
556	      or an error condition if such a counter does not exist.

558	      create_counter() creates a counter for a given context, and
559	      initializes such counter to a random value.

561	      store_counter() saves (updates) the current counter for a given
562	      context.

564	      check_suitable_id() is a function that checks whether the
565	      resulting identifier is acceptable (e.g., whether its in use,
566	      etc.).

568	   Essentially, whenever a new identifier is to be selected, the
569	   algorithm checks whether there there is a counter for the
570	   corresponding context.  If there is, such counter is incremented to
571	   obtain the new identifier, and the new identifier updates the
572	   corresponding counter.  If there is no counter for such context, a
573	   new counter is created an initialized to a random value, and used as
574	   the new identifier.

576	   This algorithm produces a per-context counter, which results in one
577	   linear function for each context.  Since the origin of each "line" is
578	   a random value, the resulting values are unknown to an off-path
579	   attacker.

581	   This algorithm has the following drawbacks:

583	   o  If, as a result of resource management, the counter for a given
584	      context must be removed, the last identifier value used for that
585	      context will be lost.  Thus, if subsequently an identifier needs
586	      to be generated for such context, that counter will need to be
587	      recreated and reinitialized to random value, thus possibly leading
588	      to reuse/collistion of identifiers.

590	   o  If the identifiers are predictable by the destination system
591	      (e.g., the destination host represents the "context"), a
592	      vulnerable host might possibly leak to third parties the
593	      identifiers used by other hosts to send traffic to it (i.e., a
594	      vulnerable Host B could leak to Host C the identifier values that
595	      Host A is using to send packets to Host B).  Appendix A of
596	      [RFC7739] describes one possible scenario for such leakage in
597	      detail.

599	   Otherwise, the identifiers produced by this algorithm do not suffer
600	   from the other issues discussed in Section 8.

602	7.4.2.  Simple Hash-Based Algorithm

604	   The goal of this algorithm is to produce monotonically-increasing
605	   sequences, with a randomized initial value, for each given context.
606	   For example, if the identifiers being generated must be unique for
607	   each {src IP, dst IP} set, then each possible combination of {src IP,
608	   dst IP} should have a corresponding "next_id" value.

610	   Keeping one value for each possible "context" may in many cases be
611	   considered too onerous in terms of memory requirements.  As a
612	   workaround, the following algorithm employs a calculated technique
613	   (as opposed to keeping state in memory) to maintain the random offset
614	   for each possible context.

616	   In the following algorithm, the function F() provides (statelessly) a
617	   random offset for each given context.

619	       /* Initialization at system boot time. Could be random. */
620	       counter = 0;

622	       /* Protocol ID selection function  */
623	       id_range = max_id - min_id + 1;
624	       offset = F(CONTEXT, secret_key);
625	       count = id_range;

627	       do {
628	           next_id = min_id +
629	                  (counter + offset) % id_range;

631	           counter++;

633	           if(check_suitable_id(next_id))
634	               return next_id;

636	           count--;

638	       } while (count > 0);

640	       return ERROR;

642	   The function F() provides a "per-CONTEXT" fixed offset within the
643	   identifier space.  Both the 'offset' and 'counter' variables may take
644	   any value within the storage type range since we are restricting the
645	   resulting identifier to be in the range [min_id, max_id] in a similar
646	   way as in the algorithm described in Section 7.1.1.  This allows us
647	   to simply increment the 'counter' variable and rely on the unsigned
648	   integer to wrap around.

650	   The function F() should be a cryptographic hash function like SHA-256
651	   [FIPS-SHS].  Note: MD5 [RFC1321] is considered unacceptable for F()
652	   [RFC6151].  CONTEXT is the concatenation of all the elements that
653	   define a given context.  For example, if this algorithm is expected
654	   to produce identifiers that are monotonically-increasing for each set
655	   (Source IP Address, Destination IP Address), the CONTEXT should be
656	   the concatenation of these two values.

658	   The secret should be chosen to be as random as possible (see
659	   [RFC4086] for recommendations on choosing secrets).

661	   It should be noted that, since this algorithm uses a global counter
662	   ("counter") for selecting identifiers, this algorithm produces an
663	   information leakage (as described in Section 8.2).  For example, if
664	   an attacker could force a client to periodically establish a new TCP
665	   connection to an attacker-controlled machine (or through an attacker-
666	   observable routing path), the attacker could subtract consecutive
667	   source port values to obtain the number of outgoing TCP connections
668	   established globally by the target host within that time period (up
669	   to wrap-around issues and five-tuple collisions, of course).

671	7.4.3.  Double-Hash Algorithm

673	   A trade-off between maintaining a single global 'counter' variable
674	   and maintaining 2**N 'counter' variables (where N is the width of the
675	   result of F()) could be achieved as follows.  The system would keep
676	   an array of TABLE_LENGTH integers, which would provide a separation
677	   of the increment of the 'counter' variable.  This improvement could
678	   be incorporated into the algorithm from Section 7.4.2 as follows:

680	       /* Initialization at system boot time */
681	       for(i = 0; i < TABLE_LENGTH; i++)
682	           table[i] = random();

684	       id_inc = 1;

686	       /* Protocol ID selection function */
687	       id_range = max_id - min_id + 1;
688	       offset = F(CONTEXT, secret_key1);
689	       index = G(CONTEXT, secret_key2);
690	       count = id_range;

692	       do {
693	           next_id = min_id + (offset + table[index]) % id_range;
694	           table[index] = table[index] + id_inc;

696	           if(check_suitable_id(next_id))
697	               return next_id;

699	          count--;

701	       } while (count > 0);

703	       return ERROR;

705	   'table[]' could be initialized with random values, as indicated by
706	   the initialization code in pseudo-code above.  The function G()
707	   should be a cryptographic hash function.  It should use the same
708	   CONTEXT as F(), and a secret key value to compute a value between 0
709	   and (TABLE_LENGTH-1).  Alternatively, G() could take an "offset" as
710	   input, and perform the exclusive-or (XOR) operation between all the
711	   bytes in 'offset'.

713	   The array 'table[]' assures that successive identifiers for a given
714	   context will be monotonically-increasing.  However, the increments
715	   space is separated into TABLE_LENGTH different spaces, and thus
716	   identifier reuse frequency will be (probabilistically) lower than
717	   that of the algorithm in Section 7.4.2.  That is, the generation of
718	   identifier for one given context will not necessarily result in
719	   increments in the identifiers for other contexts.

721	   It is interesting to note that the size of 'table[]' does not limit
722	   the number of different identifier sequences, but rather separates
723	   the *increments* into TABLE_LENGTH different spaces.  The identifier
724	   sequence will result from adding the corresponding entry of 'table[]'
725	   to the variable 'offset', which selects the actual identifier
726	   sequence (as in the algorithm from Section 7.4.2).

728	   An attacker can perform traffic analysis for any "increment space"
729	   (i.e., context) into which the attacker has "visibility" -- namely,
730	   the attacker can force a node to generate identifiers where G(offset)
731	   identifies the target "increment space".  However, the attacker's
732	   ability to perform traffic analysis is very reduced when compared to
733	   the predictable linear identifiers (described in Appendix A.1) and
734	   the hash-based identifiers (described in Section 7.4.2).
735	   Additionally, an implementation can further limit the attacker's
736	   ability to perform traffic analysis by further separating the
737	   increment space (that is, using a larger value for TABLE_LENGTH) and/
738	   or by randomizing the increments.

740	   Otherwise, this algorithm does not suffer from the issues discussed
741	   in Section 8.

743	8.  Common Vulnerabilities Associated with Transient Numeric Identifiers

745	8.1.  Network Activity Correlation

747	   An idetifier that is preditable or stable within a given context
748	   allows for network activity correlation within that context.

750	   For example, a stable IPv6 Interface Identifier allows for network
751	   activity to be correlated for the context in which that address is
752	   stable [RFC7721].  A stable-per-network (as in [RFC7217] allows for
753	   network activity correlation within a network, whereas a constant
754	   IPv6 Interface Identifier allows not only network activity
755	   correlation within the same network, but also across networks ("host
756	   tracking").

758	   Predictable transient numeric identifiers can also allow for network
759	   activity correlation.  For example, a node that generates TCP ISNs
760	   with a global counter will typically allow network activity
761	   correlation even as it roams across networks, since the communicating
762	   nodes could infer the identity of the node based on the TCP ISNs
763	   employed for subsequent communication instances.  Similarly, a node
764	   that generates predictable IPv6 Fragment Identification values could
765	   be subject to network activity correlation (see e.g.
766	   [Bellovin2002]).

768	8.2.  Information Leakage

770	   Transient numeric identifiers that are not randomized can leak out
771	   information to other communicating nodes.  For example, it is common
772	   to generate identifiers like:

774	                ID = offset(CONTEXT_1) + linear(CONTEXT_2);

776	   This generic expression generates identifiers by adding a linear
777	   function to an offset.  The offset is constant within a given
778	   context, whereas linear() is a linear function for a given context
779	   (possibly different to that of offset()).  Identifiers generated with
780	   this expression will generally be predictable within CONTEXT_1.
781	   Thus, CONTEXT_1 essentially specifies e.g. the context within which
782	   network activity correlation is possible thanks to these identifiers.
783	   When CONTEXT_1 is "global" (e.g., offset() is simply a constant
784	   value), then all the corresponding transient numeric identifiers
785	   become predictable in all contexts.

787	      NOTE: If offset() has a global context and the specific value is
788	      known, the resulting identifiers may leak even more information.
789	      For example, the if Fragment Identification values are generated
790	      with the generic function above, and CONTEXT_1 is "global", then
791	      the corresponding identifiers will leak the number of fragmented
792	      datagrams sent for CONTEXT_2.  If both CONTEXT_1 and CONTEXT_2 are
793	      "global", then Fragment Identification values would be generated
794	      with a global counter (initialized to offset()), and thus each
795	      generated Fragment Identification value would leak the number of
796	      fragmented datagrams transmitted by the node since it was
797	      bootstrapped.

799	   On the other hand, linear() will be predictable within CONTEXT_2.
800	   The predictability of linear(), irrespective of the context and/or
801	   predictability of offset(), can leak out information that is of use
802	   to attackers.  For example, a node that selects ephemeral port
803	   numbers on as in_

805	                ehemeral_port = offset(Dest_IP) + linear()

807	   that is, with a per-destination offset, but global linear() function
808	   (e.g., a global counter), will leak information about the number of
809	   outgoing connections that have been issued between any two issued
810	   outgoing connections.  Similarly, a node that generates Fragment
811	   Identification values as in:

813	                Frag_ID = offset(Srd_IP, Dst_IP) + linear()

815	   will leak out information about the number of fragmented packets that
816	   have been transmitted between any two other transmitted fragmented
817	   packets.  The vulnerabilities described in [Sanfilippo1998a],
818	   [Sanfilippo1998b], and [Sanfilippo1999] are all associated with the
819	   use of a global linear() function (i.e., a global CONTEXT_2).

821	8.3.  Exploitation of Semantics of Transient Numeric Identifiers

823	   Identifiers that are not semantically opaque tend to be more
824	   predictable than semantically-opaque identifiers.  For example, a MAC
825	   address contains an OUI (Organizationally-Unique Identifier) which
826	   identifies the vendor that manufactured the underlying network
827	   interface card.  This fact may be leveraged by an attacker meaning to
828	   "predict" MAC addresses, if he has some knowledge about the possible
829	   NIC vendor.

831	   [RFC7707] discusses a number of techniques to reduce the search space
832	   when performing IPv6 address-scanning attacks by leveraging the
833	   semantics of the IIDs produced by a number of IID-generation
834	   techniques.

836	8.4.  Exploitation of Collisions of Transient Numeric Identifiers

838	   In many cases, th collision of transient network identifiers can have
839	   a hard failure severity (or result in a hard failure severity if an
840	   attacker can cause multiple collisions deterministically, one after
841	   another).  For example, predictable Fragment Identification values
842	   open the door to Denial of Service (DoS) attacks (see e.g.
843	   [RFC5722].  Similarly, predictable TCP ISNs open the door to trivial
844	   connection-reset and data injection attacks (see e.g.
845	   [Joncheray1995]).

847	9.  Vulnerability Analysis of Specific Transient Numeric Identifiers
848	    Categories

850	   The following subsections analyze common vulnerabilities associated
851	   with the generation of identifiers for each of the categories
852	   identified in Section 6.

854	9.1.  Category #1: Uniqueness (soft failure)

856	   Possible vulnerabilities associated with identifiers of this category
857	   are:

859	   o  Use of trivial algorithms (e.g. global counters) that generate
860	      predictable identifiers

862	   o  Use of flawed PRNGs (please see e.g.  [Zalewski2001],
863	      [Zalewski2002] and [Klein2007])

865	   Since the only interoperability requirement for these identifiers is
866	   uniqueness, the obvious approach to generate them is to employ a
867	   PRNG.  An implementer should consult [RFC4086] regarding randomness
868	   requirements for security, and consult relevant documentation when
869	   employing a PRNG provided by the underlying system.

871	   Use of algorithms other than PRNGs for generating identifiers of this
872	   category is discouraged.

874	9.2.  Category #2: Uniqueness (hard failure)

876	   As noted in Section 7.2 this category typically employs the same
877	   algorithms as Category #4, since a monotonically-increasing sequence
878	   tends to minimize the identifier reuse frequency.  Therefore, the
879	   vulnerability analysis of Section 9.4 applies to this case.

881	9.3.  Category #3: Uniqueness, constant within context (soft failure)

883	   There are two main vulnerabilities that may be associated with
884	   identifiers of this category:

886	   1.  Use algorithms or sources that result in predictable identifiers

888	   2.  Employing the same identifier across contexts in which constantcy
889	       is not required

891	   At times, an implementation or specification may be tempted to employ
892	   a source for the identifier which is known to provide unique values.
893	   However, while unique, the associated identifiers may have other
894	   properties such as being predictable or leaking information about the
895	   node in question.  For example, as noted in [RFC7721], embedding
896	   link-layer addresses for generating IPv6 IIDs not only results in
897	   predictable values, but also leaks information about the manufacturer
898	   of the network interface card.

900	   On the other hand, using an identifier across contexts where
901	   constantcy is not required can be leveraged for correlation of
902	   activities.  On of the most trivial examples of this is the use of
903	   IPv6 IIDs that are constant across networks (such as IIDs that embed
904	   the underlying link-layer address).

906	9.4.  Category #4: Uniqueness, monotonically increasing within context
907	      (hard failure)

909	   A simple way to generalize algorithms employed for generating
910	   identifiers of Category #4 would be as follows:

912	       /* Identifier selection function */
913	       count = max_id - min_id + 1;

915	       do {
916	                   linear(CONTEXT_2)= linear(CONTEXT_2) + increment();
917	           next_id= offset(CONTEXT_1) + linear(CONTEXT_1);

919	           if(check_suitable_id(next_id))
920	              return next_id;

922	           count--;
923	       } while (count > 0);

925	       return ERROR;

927	   Essentially, an identifier (next_id) is generated by adding a linear
928	   function (linear()) to an offset value, which is unknown to the
929	   attacker, and constant for given context (CONTEXT_1).

931	   The following aspects of the algorithm should be considered:

933	   o  For the most part, it is the offset() function that results in
934	      identifiers that are unpredictable by an off-path attacker.  While
935	      the resulting sequence will be monotonically-increasing, the use
936	      of an offset value that is unknown to the attacker makes the
937	      resulting values unknown to the attacker.

939	   o  The most straightforward "stateless" implementation of offset
940	      would be that in which offset() is the result of a
941	      cryptographically-secure hash-function that takes the values that
942	      identify the context and a "secret" (not shown in the figure
943	      above) as arguments.

945	   o  Another possible (but stateful) approach would be to simply
946	      generate a random offset and store it in memory, and then look-up
947	      the corresponding context when a new identifier is to be selected.
948	      The algorithm in Section 7.4.1 is essentially an implementation of
949	      this type.

951	   o  The linear function is incremented according to increment().  In
952	      the most trivial case increment() could always return the constant
953	      "1".  But it could also possibly return small integers such the
954	      increments are randomized.

956	   Considering the generic algorithm illustrated above we can identify
957	   the following possible vulnerabilities:

959	   o  If the offset value spans more than the necessary context,
960	      identifiers could be unnecessarily predictable by other parties,
961	      since the offset value would be unnecessarily leaked to them.  For
962	      example, an implementation that means to produce a per-destination
963	      counter but replaces offset() with a constant number (i.e.,
964	      employs a global counter), will unnecessarily result in
965	      predictable identifiers.

967	   o  The function linear() could be seen as representing the number of
968	      identifiers that have so far been generated for a given context
969	      (CONTEXT_2).  If linear() spans more than the necessary context,
970	      the "increments" could be leaked to other parties, thus disclosing
971	      information about the number of identifiers that have so far been
972	      generated.  For example, an implementation in which linear() is
973	      implemented as a single global counter will unnecessarily leak
974	      information the number of identifiers that have been produced.

976	   o  increment() determines how the linear() is incremented for each
977	      identifier that is selected.  In the most trivial case,
978	      increment() will return the integer "1".  However, an
979	      implementation may have increment() return a "small" integer value
980	      such that even if the current value employed by the generator is
981	      guessed (see Appendix A of [RFC7739]), the exact next identifier
982	      to be selected will be slightly harder to identify.

984	10.  IANA Considerations

986	   There are no IANA registries within this document.  The RFC-Editor
987	   can remove this section before publication of this document as an
988	   RFC.

990	11.  Security Considerations

992	   The entire document is about the security and privacy implications of
993	   identifiers.

995	12.  Acknowledgements

997	   The authors would like to thank (in alphabetical order) Steven
998	   Bellovin, Joseph Lorenzo Hall, Gre Norcie, and Martin Thomson, for
999	   providing valuable comments on earlier versions of this document.

1001	   The authors would like to thank Diego Armando Maradona for his magic
1002	   and inspiration.

1004	13.  References

1006	13.1.  Normative References

1008	   [RFC0793]  Postel, J., "Transmission Control Protocol", STD 7,
1009	              RFC 793, DOI 10.17487/RFC0793, September 1981,
1010	              <https://www.rfc-editor.org/info/rfc793>.

1012	   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
1013	              Requirement Levels", BCP 14, RFC 2119,
1014	              DOI 10.17487/RFC2119, March 1997,
1015	              <https://www.rfc-editor.org/info/rfc2119>.

1017	   [RFC2460]  Deering, S. and R. Hinden, "Internet Protocol, Version 6
1018	              (IPv6) Specification", RFC 2460, DOI 10.17487/RFC2460,
1019	              December 1998, <https://www.rfc-editor.org/info/rfc2460>.

1021	   [RFC4086]  Eastlake 3rd, D., Schiller, J., and S. Crocker,
1022	              "Randomness Requirements for Security", BCP 106, RFC 4086,
1023	              DOI 10.17487/RFC4086, June 2005,
1024	              <https://www.rfc-editor.org/info/rfc4086>.

1026	   [RFC4291]  Hinden, R. and S. Deering, "IP Version 6 Addressing
1027	              Architecture", RFC 4291, DOI 10.17487/RFC4291, February
1028	              2006, <https://www.rfc-editor.org/info/rfc4291>.

1030	   [RFC4862]  Thomson, S., Narten, T., and T. Jinmei, "IPv6 Stateless
1031	              Address Autoconfiguration", RFC 4862,
1032	              DOI 10.17487/RFC4862, September 2007,
1033	              <https://www.rfc-editor.org/info/rfc4862>.

1035	   [RFC5722]  Krishnan, S., "Handling of Overlapping IPv6 Fragments",
1036	              RFC 5722, DOI 10.17487/RFC5722, December 2009,
1037	              <https://www.rfc-editor.org/info/rfc5722>.

1039	   [RFC6528]  Gont, F. and S. Bellovin, "Defending against Sequence
1040	              Number Attacks", RFC 6528, DOI 10.17487/RFC6528, February
1041	              2012, <https://www.rfc-editor.org/info/rfc6528>.

1043	   [RFC7217]  Gont, F., "A Method for Generating Semantically Opaque
1044	              Interface Identifiers with IPv6 Stateless Address
1045	              Autoconfiguration (SLAAC)", RFC 7217,
1046	              DOI 10.17487/RFC7217, April 2014,
1047	              <https://www.rfc-editor.org/info/rfc7217>.

1049	13.2.  Informative References

1051	   [Bellovin1989]
1052	              Bellovin, S., "Security Problems in the TCP/IP Protocol
1053	              Suite", Computer Communications Review, vol. 19, no. 2,
1054	              pp. 32-48, 1989,
1055	              <https://www.cs.columbia.edu/~smb/papers/ipext.pdf>.

1057	   [Bellovin2002]
1058	              Bellovin, S., "A Technique for Counting NATted Hosts",
1059	              IMW'02 Nov. 6-8, 2002, Marseille, France, 2002.

1061	   [CERT2001]
1062	              CERT, "CERT Advisory CA-2001-09: Statistical Weaknesses in
1063	              TCP/IP Initial Sequence Numbers", 2001,
1064	              <http://www.cert.org/advisories/CA-2001-09.html>.

1066	   [CPNI-TCP]
1067	              Gont, F., "Security Assessment of the Transmission Control
1068	              Protocol (TCP)",  United Kingdom's Centre for the
1069	              Protection of National Infrastructure (CPNI) Technical
1070	              Report, 2009, <http://www.gont.com.ar/papers/
1071	              tn-03-09-security-assessment-TCP.pdf>.

1073	   [FIPS-SHS]
1074	              FIPS, "Secure Hash Standard (SHS)",  Federal Information
1075	              Processing Standards Publication 180-4, March 2012,
1076	              <http://csrc.nist.gov/publications/fips/fips180-4/
1077	              fips-180-4.pdf>.

1079	   [Fyodor2004]
1080	              Fyodor, "Idle scanning and related IP ID games", 2004,
1081	              <http://www.insecure.org/nmap/idlescan.html>.

1083	   [I-D.ietf-6man-default-iids]
1084	              Gont, F., Cooper, A., Thaler, D., and S. LIU,
1085	              "Recommendation on Stable IPv6 Interface Identifiers",
1086	              draft-ietf-6man-default-iids-16 (work in progress),
1087	              September 2016.

1089	   [Joncheray1995]
1090	              Joncheray, L., "A Simple Active Attack Against TCP", Proc.
1091	              Fifth Usenix UNIX Security Symposium, 1995.

1093	   [Klein2007]
1094	              Klein, A., "OpenBSD DNS Cache Poisoning and Multiple O/S
1095	              Predictable IP ID Vulnerability", 2007,
1096	              <http://www.trusteer.com/files/OpenBSD_DNS_Cache_Poisoning
1097	              _and_Multiple_OS_Predictable_IP_ID_Vulnerability.pdf>.

1099	   [Morris1985]
1100	              Morris, R., "A Weakness in the 4.2BSD UNIX TCP/IP
1101	              Software", CSTR 117, AT&T Bell Laboratories, Murray Hill,
1102	              NJ, 1985,
1103	              <https://pdos.csail.mit.edu/~rtm/papers/117.pdf>.

1105	   [RFC1035]  Mockapetris, P., "Domain names - implementation and
1106	              specification", STD 13, RFC 1035, DOI 10.17487/RFC1035,
1107	              November 1987, <https://www.rfc-editor.org/info/rfc1035>.

1109	   [RFC1321]  Rivest, R., "The MD5 Message-Digest Algorithm", RFC 1321,
1110	              DOI 10.17487/RFC1321, April 1992,
1111	              <https://www.rfc-editor.org/info/rfc1321>.

1113	   [RFC4963]  Heffner, J., Mathis, M., and B. Chandler, "IPv4 Reassembly
1114	              Errors at High Data Rates", RFC 4963,
1115	              DOI 10.17487/RFC4963, July 2007,
1116	              <https://www.rfc-editor.org/info/rfc4963>.

1118	   [RFC6056]  Larsen, M. and F. Gont, "Recommendations for Transport-
1119	              Protocol Port Randomization", BCP 156, RFC 6056,
1120	              DOI 10.17487/RFC6056, January 2011,
1121	              <https://www.rfc-editor.org/info/rfc6056>.

1123	   [RFC6151]  Turner, S. and L. Chen, "Updated Security Considerations
1124	              for the MD5 Message-Digest and the HMAC-MD5 Algorithms",
1125	              RFC 6151, DOI 10.17487/RFC6151, March 2011,
1126	              <https://www.rfc-editor.org/info/rfc6151>.

1128	   [RFC7098]  Carpenter, B., Jiang, S., and W. Tarreau, "Using the IPv6
1129	              Flow Label for Load Balancing in Server Farms", RFC 7098,
1130	              DOI 10.17487/RFC7098, January 2014,
1131	              <https://www.rfc-editor.org/info/rfc7098>.

1133	   [RFC7528]  Higgs, P. and J. Piesing, "A Uniform Resource Name (URN)
1134	              Namespace for the Hybrid Broadcast Broadband TV (HbbTV)
1135	              Association", RFC 7528, DOI 10.17487/RFC7528, April 2015,
1136	              <https://www.rfc-editor.org/info/rfc7528>.

1138	   [RFC7707]  Gont, F. and T. Chown, "Network Reconnaissance in IPv6
1139	              Networks", RFC 7707, DOI 10.17487/RFC7707, March 2016,
1140	              <https://www.rfc-editor.org/info/rfc7707>.

1142	   [RFC7721]  Cooper, A., Gont, F., and D. Thaler, "Security and Privacy
1143	              Considerations for IPv6 Address Generation Mechanisms",
1144	              RFC 7721, DOI 10.17487/RFC7721, March 2016,
1145	              <https://www.rfc-editor.org/info/rfc7721>.

1147	   [RFC7739]  Gont, F., "Security Implications of Predictable Fragment
1148	              Identification Values", RFC 7739, DOI 10.17487/RFC7739,
1149	              February 2016, <https://www.rfc-editor.org/info/rfc7739>.

1151	   [Sanfilippo1998a]
1152	              Sanfilippo, S., "about the ip header id", Post to Bugtraq
1153	              mailing-list, Mon Dec 14 1998,
1154	              <http://seclists.org/bugtraq/1998/Dec/48>.

1156	   [Sanfilippo1998b]
1157	              Sanfilippo, S., "Idle scan", Post to Bugtraq mailing-list,
1158	              1998, <http://www.kyuzz.org/antirez/papers/dumbscan.html>.

1160	   [Sanfilippo1999]
1161	              Sanfilippo, S., "more ip id", Post to Bugtraq mailing-
1162	              list, 1999,
1163	              <http://www.kyuzz.org/antirez/papers/moreipid.html>.

1165	   [Shimomura1995]
1166	              Shimomura, T., "Technical details of the attack described
1167	              by Markoff in NYT", Message posted in USENET's
1168	              comp.security.misc newsgroup  Message-ID:
1169	              <3g5gkl$5j1@ariel.sdsc.edu>, 1995,
1170	              <http://www.gont.com.ar/docs/post-shimomura-usenet.txt>.

1172	   [Silbersack2005]
1173	              Silbersack, M., "Improving TCP/IP security through
1174	              randomization without sacrificing interoperability",
1175	              EuroBSDCon 2005 Conference, 2005,
1176	              <http://citeseerx.ist.psu.edu/viewdoc/
1177	              download?doi=10.1.1.91.4542&rep=rep1&type=pdf>.

1179	   [USCERT2001]
1180	              US-CERT, "US-CERT Vulnerability Note VU#498440: Multiple
1181	              TCP/IP implementations may use statistically predictable
1182	              initial sequence numbers", 2001,
1183	              <http://www.kb.cert.org/vuls/id/498440>.

1185	   [Zalewski2001]
1186	              Zalewski, M., "Strange Attractors and TCP/IP Sequence
1187	              Number Analysis", 2001,
1188	              <http://lcamtuf.coredump.cx/oldtcp/tcpseq.html>.

1190	   [Zalewski2002]
1191	              Zalewski, M., "Strange Attractors and TCP/IP Sequence
1192	              Number Analysis - One Year Later", 2001,
1193	              <http://lcamtuf.coredump.cx/newtcp/>.

1195	Appendix A.  Flawed Algorithms

1197	   The following subsections document algorithms with known negative
1198	   security and privacy imlpications.

1200	A.1.  Predictable Linear Identifiers Algorithm

1202	   One of the most trivial ways to achieve uniqueness with a low
1203	   identifier reuse frequency is to produce a linear sequence.

1205	   For example, the following algorithm has been employed (see e.g.
1206	   [Morris1985], [Shimomura1995], [Silbersack2005] and [CPNI-TCP]) in a
1207	   number of operating systems for selecting IP fragment IDs, TCP
1208	   ephemeral ports, etc.:

1210	       /* Initialization at system boot time. Could be random */
1211	       next_id = min_id;
1212	       id_inc= 1;

1214	       /* Identifier selection function */
1215	       count = max_id - min_id + 1;

1217	       do {
1218	           if (next_id == max_id) {
1219	               next_id = min_id;
1220	           }
1221	           else {
1222	               next_id = next_id + id_inc;
1223	           }

1225	           if (check_suitable_id(next_id))
1226	               return next_id;

1228	           count--;

1230	       } while (count > 0);

1232	       return ERROR;

1234	   Note:
1235	      check_suitable_id() is a function that checks whether the
1236	      resulting identifier is acceptable (e.g., whether its in use,
1237	      etc.).

1239	   For obvious reasons, this algorithm results in predicable sequences.
1240	   If a global counter is used (such as "next_id" in the example above),
1241	   a node that learns one protocol identifier can also learn or guess
1242	   values employed by past and future protocol instances.  On the other
1243	   hand, when the value of increments is known (such as "1" in this
1244	   case), an attacker can sample two values, and learn the number of
1245	   identifiers that were generated in-between.

1247	   Where identifier reuse would lead to a hard failure, one typical
1248	   approach to generate unique identifiers (while minimizing the
1249	   security and privacy implications of predictable identifiers) is to
1250	   obfuscate the resulting protocol IDs by either:

1252	   o  Replace the global counter with multiple counters (initialized to
1253	      a random value)

1255	   o  Randomizing the "increments"

1257	   Avoiding global counters essentially means that learning one
1258	   identifier for a given context (e.g., one TCP ephemeral port for a
1259	   given {src IP, Dst IP, Dst Port}) is of no use for learning or
1260	   guessing identifiers for a different context (e.g., TCP ephemeral
1261	   ports that involve other peers).  However, this may imply keeping one
1262	   additional variable/counter per context, which may be prohibitive in
1263	   some environments.  The choice of id_inc has implications on both the
1264	   security and privacy properties of the resulting identifiers, but
1265	   also on the corresponding interoperability properties.  On one hand,
1266	   minimizing the increments (as in "id_inc = 1" in our case) generally
1267	   minimizes the identifier reuse frequency, albeit at increased
1268	   predictability.  On the other hand, if the increments are randomized,
1269	   predictability of the resulting identifiers is reduced, and the
1270	   information leakage produced by global constant increments is
1271	   mitigated.  However, using larger increments than necessary can
1272	   result in an increased identifier reuse frequency.

1274	A.2.  Random-Increments Algorithm

1276	   This algorithm offers a middle ground between the algorithms that
1277	   select ephemeral ports randomly (such as those described in
1278	   Section 7.1.1 and Section 7.1.2), and those that offer obfuscation
1279	   but no randomization (such as those described in Section 7.4.2 and
1280	   Section 7.4.3).

1282	       /* Initialization code at system boot time. */
1283	       next_id = random();        /* Initialization value */
1284	       id_inc = 500;        /* Determines the trade-off */

1286	       /* Identifier selection function */
1287	       id_range = max_id - min_id + 1;

1289	       count = id_range;

1291	       do {
1292	           /* Random increment */
1293	           next_id = next_id + (random() % id_inc) + 1;

1295	           /* Keep the identifier within acceptable range */
1296	           next_id = min_id + (next_id % id_range);

1298	           if(check_suitable_id(next_id))
1299	              return next_id;

1301	           count--;
1302	       } while (count > 0);

1304	       return ERROR;

1306	   This algorithm aims at producing a monotonically increasing sequence
1307	   of identifiers, while avoiding the use of fixed increments, which
1308	   would lead to trivially predictable sequences.  The value "id_inc"
1309	   allows for direct control of the trade-off between the level of
1310	   obfuscation and the ID reuse frequency.  The smaller the value of
1311	   "id_inc", the more similar this algorithm is to a predicable, global
1312	   monotonically-increasing ID generation algorithm.  The larger the
1313	   value of "id_inc", the more similar this algorithm is to the
1314	   algorithm described in Section 7.1.1 of this document.

1316	   When the identifiers wrap, there is the risk of collisions of
1317	   identifiers (i.e., identifier reuse).  Therefore, "id_inc" should be
1318	   selected according to the following criteria:

1320	   o  It should maximize the wrapping time of the identifier space.

1322	   o  It should minimize identifier reuse frequency.

1324	   o  It should maximize obfuscation.

1326	   Clearly, these are competing goals, and the decision of which value
1327	   of "id_inc" to use is a trade-off.  Therefore, the value of "id_inc"
1328	   should be configurable so that system administrators can make the
1329	   trade-off for themselves.

1331	Authors' Addresses

1333	   Fernando Gont
1334	   SI6 Networks / UTN-FRH
1335	   Evaristo Carriego 2644
1336	   Haedo, Provincia de Buenos Aires  1706
1337	   Argentina

1339	   Phone: +54 11 4650 8472
1340	   Email: fgont@si6networks.com
1341	   URI:   http://www.si6networks.com

1343	   Ivan Arce
1344	   Quarkslab

1346	   Email: iarce@quarkslab.com
1347	   URI:   https://www.quarkslab.com