idnits 2.17.1 draft-irtf-pearg-numeric-ids-generation-08.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- No issues found here. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year -- The document date (31 January 2022) is 815 days in the past. Is this intentional? -- Found something which looks like a code comment -- if you have code sections in the document, please surround them with '' and '' lines. Checking references for intended status: Informational ---------------------------------------------------------------------------- ** Obsolete normative reference: RFC 793 (Obsoleted by RFC 9293) ** Obsolete normative reference: RFC 6528 (Obsoleted by RFC 9293) ** Obsolete normative reference: RFC 2460 (Obsoleted by RFC 8200) ** Obsolete normative reference: RFC 4941 (Obsoleted by RFC 8981) == Outdated reference: A later version (-11) exists of draft-irtf-pearg-numeric-ids-history-09 == Outdated reference: A later version (-11) exists of draft-gont-numeric-ids-sec-considerations-06 Summary: 4 errors (**), 0 flaws (~~), 3 warnings (==), 2 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Internet Research Task Force (IRTF) F. Gont 3 Internet-Draft EdgeUno 4 Intended status: Informational I. Arce 5 Expires: 4 August 2022 Quarkslab 6 31 January 2022 8 On the Generation of Transient Numeric Identifiers 9 draft-irtf-pearg-numeric-ids-generation-08 11 Abstract 13 This document performs an analysis of the security and privacy 14 implications of different types of "transient numeric identifiers" 15 used in IETF protocols, and tries to categorize them based on their 16 interoperability requirements and their associated failure severity 17 when such requirements are not met. Subsequently, it provides advice 18 on possible algorithms that could be employed to satisfy the 19 interoperability requirements of each identifier category, while 20 minimizing the negative security and privacy implications, thus 21 providing guidance to protocol designers and protocol implementers. 22 Finally, it describes a number of algorithms that have been employed 23 in real implementations to generate transient numeric identifiers, 24 and analyzes their security and privacy properties. This document is 25 a product of the Privacy Enhancement and Assessment Research Group 26 (PEARG) in the IRTF. 28 Status of This Memo 30 This Internet-Draft is submitted in full conformance with the 31 provisions of BCP 78 and BCP 79. 33 Internet-Drafts are working documents of the Internet Engineering 34 Task Force (IETF). Note that other groups may also distribute 35 working documents as Internet-Drafts. The list of current Internet- 36 Drafts is at https://datatracker.ietf.org/drafts/current/. 38 Internet-Drafts are draft documents valid for a maximum of six months 39 and may be updated, replaced, or obsoleted by other documents at any 40 time. It is inappropriate to use Internet-Drafts as reference 41 material or to cite them other than as "work in progress." 43 This Internet-Draft will expire on 4 August 2022. 45 Copyright Notice 47 Copyright (c) 2022 IETF Trust and the persons identified as the 48 document authors. All rights reserved. 50 This document is subject to BCP 78 and the IETF Trust's Legal 51 Provisions Relating to IETF Documents (https://trustee.ietf.org/ 52 license-info) in effect on the date of publication of this document. 53 Please review these documents carefully, as they describe your rights 54 and restrictions with respect to this document. 56 Table of Contents 58 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 59 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 4 60 3. Threat Model . . . . . . . . . . . . . . . . . . . . . . . . 5 61 4. Issues with the Specification of Transient Numeric 62 Identifiers . . . . . . . . . . . . . . . . . . . . . . . 6 63 5. Protocol Failure Severity . . . . . . . . . . . . . . . . . . 7 64 6. Categorizing Transient Numeric Identifiers . . . . . . . . . 7 65 7. Common Algorithms for Transient Numeric Identifier 66 Generation . . . . . . . . . . . . . . . . . . . . . . . 10 67 7.1. Category #1: Uniqueness (soft failure) . . . . . . . . . 10 68 7.2. Category #2: Uniqueness (hard failure) . . . . . . . . . 13 69 7.3. Category #3: Uniqueness, stable within context (soft 70 failure) . . . . . . . . . . . . . . . . . . . . . . . . 14 71 7.4. Category #4: Uniqueness, monotonically increasing within 72 context (hard failure) . . . . . . . . . . . . . . . . . 16 73 8. Common Vulnerabilities Associated with Transient Numeric 74 Identifiers . . . . . . . . . . . . . . . . . . . . . . . 22 75 8.1. Network Activity Correlation . . . . . . . . . . . . . . 22 76 8.2. Information Leakage . . . . . . . . . . . . . . . . . . . 23 77 8.3. Fingerprinting . . . . . . . . . . . . . . . . . . . . . 24 78 8.4. Exploitation of the Semantics of Transient Numeric 79 Identifiers . . . . . . . . . . . . . . . . . . . . . . . 25 80 8.5. Exploitation of Collisions of Transient Numeric 81 Identifiers . . . . . . . . . . . . . . . . . . . . . . . 26 82 8.6. Exploitation of Predictable Transient Numeric Identifiers 83 for Injection Attacks . . . . . . . . . . . . . . . . . . 26 84 8.7. Cryptanalysis . . . . . . . . . . . . . . . . . . . . . . 27 85 9. Vulnerability Assessment of Transient Numeric Identifiers . . 27 86 9.1. Category #1: Uniqueness (soft failure) . . . . . . . . . 27 87 9.2. Category #2: Uniqueness (hard failure) . . . . . . . . . 28 88 9.3. Category #3: Uniqueness, stable within context (soft 89 failure) . . . . . . . . . . . . . . . . . . . . . . . . 28 90 9.4. Category #4: Uniqueness, monotonically increasing within 91 context (hard failure) . . . . . . . . . . . . . . . . . 29 92 10. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 31 93 11. Security Considerations . . . . . . . . . . . . . . . . . . . 31 94 12. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 31 95 13. References . . . . . . . . . . . . . . . . . . . . . . . . . 32 96 13.1. Normative References . . . . . . . . . . . . . . . . . . 32 97 13.2. Informative References . . . . . . . . . . . . . . . . . 33 99 Appendix A. Algorithms and Techniques with Known Issues . . . . 38 100 A.1. Predictable Linear Identifiers Algorithm . . . . . . . . 38 101 A.2. Random-Increments Algorithm . . . . . . . . . . . . . . . 40 102 A.3. Re-using Identifiers Across Different Contexts . . . . . 41 103 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 42 105 1. Introduction 107 Networking protocols employ a variety of transient numeric 108 identifiers for different protocol objects, such as IPv4 and IPv6 109 Fragment Identifiers [RFC0791] [RFC8200], IPv6 Interface Identifiers 110 (IIDs) [RFC4291], transport protocol ephemeral port numbers 111 [RFC6056], TCP Initial Sequence Numbers (ISNs) [RFC0793], and DNS 112 Transaction IDs (TxIDs) [RFC1035].These identifiers usually have 113 specific interoperability requirements (e.g. uniqueness during a 114 specified period of time) that must be satisfied such that they do 115 not result in negative interoperability implications, and an 116 associated failure severity when such requirements are not met, 117 ranging from soft to hard failures. 119 For more than 30 years, a large number of implementations of the TCP/ 120 IP protocol suite have been subject to a variety of attacks, with 121 effects ranging from Denial of Service (DoS) or data injection, to 122 information leakages that could be exploited for pervasive monitoring 123 [RFC7258]. The root cause of these issues has been, in many cases, 124 the poor selection of transient numeric identifiers in such 125 protocols, usually as a result of insufficient or misleading 126 specifications. While it is generally trivial to identify an 127 algorithm that can satisfy the interoperability requirements of a 128 given transient numeric identifier, empirical evidence exists that 129 doing so without negatively affecting the security and/or privacy 130 properties of the aforementioned protocols is prone to error 131 [I-D.irtf-pearg-numeric-ids-history]. 133 For example, implementations have been subject to security and/or 134 privacy issues resulting from: 136 * Predictable IPv4 or IPv6 Fragment Identifiers (see e.g. 137 [Sanfilippo1998a], [RFC6274], and [RFC7739]) 139 * Predictable IPv6 IIDs (see e.g. [RFC7721], [RFC7707], and 140 [RFC7217]) 142 * Predictable transport protocol ephemeral port numbers (see e.g. 143 [RFC6056] and [Silbersack2005]) 145 * Predictable TCP Initial Sequence Numbers (ISNs) (see e.g. 146 [Morris1985], [Bellovin1989], and [RFC6528]) 148 * Predictable initial timestamp in TCP timestamps Options (see e.g. 149 [TCPT-uptime] and [RFC7323]) 151 * Predictable DNS TxIDs (see e.g. [Schuba1993] and [Klein2007]) 153 Recent history indicates that when new protocols are standardized or 154 new protocol implementations are produced, the security and privacy 155 properties of the associated transient numeric identifiers tend to be 156 overlooked, and inappropriate algorithms to generate transient 157 numeric identifiers are either suggested in the specifications or 158 selected by implementers. As a result, it should be evident that 159 advice in this area is warranted. 161 We note that the use of cryptographic techniques may readily mitigate 162 some of the issues arising from predictable transient numeric 163 identifiers. For example, cryptographic integrity and authentication 164 can readily mitigate data injection attacks even in the presence of 165 predictable transient numeric identifiers (such as "sequence 166 numbers"). However, use of flawed algorithms (such as global 167 counters) for generating transient numeric identifiers could still 168 result in information leakages even when cryptographic techniques are 169 employed. 171 This document contains a non-exhaustive survey of transient numeric 172 identifiers employed in various IETF protocols, and aims to 173 categorize such identifiers based on their interoperability 174 requirements, and the associated failure severity when such 175 requirements are not met. Subsequently, it provides advice on 176 possible algorithms that could be employed to satisfy the 177 interoperability requirements of each category, while minimizing 178 negative security and privacy implications. Finally, it analyzes 179 several algorithms that have been employed in real implementations to 180 meet such requirements, and analyzes their security and privacy 181 properties. 183 This document represents the consensus of the Privacy Enhancement and 184 Assessment Research Group (PEARG). 186 2. Terminology 188 Transient Numeric Identifier: 189 A data object in a protocol specification that can be used to 190 definitely distinguish a protocol object (a datagram, network 191 interface, transport protocol endpoint, session, etc.) from all 192 other objects of the same type, in a given context. Transient 193 numeric identifiers are usually defined as a series of bits, and 194 represented using integer values. These identifiers are typically 195 dynamically selected, as opposed to statically-assigned numeric 196 identifiers (see e.g. [IANA-PROT]). We note that different 197 transient numeric identifiers may have additional requirements or 198 properties depending on their specific use in a protocol. We use 199 the term "transient numeric identifier" (or simply "numeric 200 identifier" or "identifier" as short forms) as a generic term to 201 refer to any data object in a protocol specification that 202 satisfies the identification property stated above. 204 Failure Severity: 205 The consequences of a failure to comply with the interoperability 206 requirements of a given identifier. Severity considers the worst 207 potential consequence of a failure, determined by the system 208 damage and/or time lost to repair the failure. In this document 209 we define two types of failure severity: "soft failure" and "hard 210 failure". 212 Soft Failure: 213 A soft failure is a recoverable condition in which a protocol does 214 not operate in the prescribed manner but normal operation can be 215 resumed automatically in a short period of time. For example, a 216 simple packet-loss event that is subsequently recovered with a 217 packet-retransmission can be considered a soft failure. 219 Hard Failure: 220 A hard failure is a non-recoverable condition in which a protocol 221 does not operate in the prescribed manner or it operates with 222 excessive degradation of service. For example, an established TCP 223 connection that is aborted due to an error condition constitutes, 224 from the point of view of the transport protocol, a hard failure, 225 since it enters a state from which normal operation cannot be 226 resumed. 228 3. Threat Model 230 Throughout this document, we assume an attacker does not have 231 physical or logical access to the system(s) being attacked, and that 232 the attacker can only observe traffic explicitly directed to the 233 attacker. For example, an attacker cannot observe traffic 234 transferred between a sender and the receiver(s) of a target 235 protocol, but may be able to interact with any of these entities, 236 including by e.g. sending any traffic to them to sample transient 237 numeric identifiers employed by the target systems when communicating 238 with the attacker. 240 For example, when analyzing vulnerabilities associated with TCP 241 Initial Sequence Numbers (ISNs), we consider the attacker is unable 242 to capture network traffic corresponding to a TCP connection between 243 two other hosts. However, we consider the attacker is able to 244 communicate with any of these hosts (e.g., establish a TCP connection 245 with any of them), to e.g. sample the TCP ISNs employed by these 246 systems when communicating with the attacker. 248 Similarly, when considering host-tracking attacks based on IPv6 249 interface identifiers, we consider an attacker may learn the IPv6 250 address employed by a victim node if e.g. the address becomes exposed 251 as a result of the victim node communicating with an attacker- 252 operated server. Subsequently, an attacker may perform host-tracking 253 by probing a set of target addresses composed by a set of target 254 prefixes and the IPv6 interface identifier originally learned by the 255 attacker. Alternatively, an attacker may perform host tracking if 256 e.g. the victim node communicates with an attacker-operated server as 257 it moves from one location to another, those exposing its configured 258 addresses. We note that none of these scenarios requires the 259 attacker observe traffic not explicitly directed to the attacker. 261 4. Issues with the Specification of Transient Numeric Identifiers 263 While assessing protocol specifications regarding the use of 264 transient numeric identifiers, we have found that most of the issues 265 discussed in this document arise as a result of one of the following 266 conditions: 268 * Protocol specifications that under-specify the requirements for 269 their transient numeric identifiers 271 * Protocol specifications that over-specify their transient numeric 272 identifiers 274 * Protocol implementations that simply fail to comply with the 275 specified requirements 277 A number of protocol specifications (too many of them) have simply 278 overlooked the security and privacy implications of transient numeric 279 identifiers [I-D.irtf-pearg-numeric-ids-history]. Examples of them 280 are the specification of TCP ephemeral ports in [RFC0793], the 281 specification of TCP sequence numbers in [RFC0793], or the 282 specification of the DNS TxID in [RFC1035]. 284 On the other hand, there are a number of protocol specifications that 285 over-specify some of their associated transient numeric identifiers. 286 For example, [RFC4291] essentially overloads the semantics of IPv6 287 Interface Identifiers (IIDs) by embedding link-layer addresses in the 288 IPv6 IIDs, when the interoperability requirement of uniqueness could 289 be achieved in other ways that do not result in negative security and 290 privacy implications [RFC7721]. Similarly, [RFC2460] suggested the 291 use of a global counter for the generation of Fragment Identification 292 values, when the interoperability properties of uniqueness per {IPv6 293 Source Address, IPv6 Destination Address} could be achieved with 294 other algorithms that do not result in negative security and privacy 295 implications [RFC7739]. 297 Finally, there are protocol implementations that simply fail to 298 comply with existing protocol specifications. For example, some 299 popular operating systems (notably Microsoft Windows) still fail to 300 implement transport protocol ephemeral port randomization, as 301 recommended in [RFC6056]. 303 5. Protocol Failure Severity 305 Section 2 defines the concept of "Failure Severity", along with two 306 types of failure severities that we employ throughout this document: 307 soft and hard. 309 Our analysis of the severity of a failure is performed from the point 310 of view of the protocol in question. However, the corresponding 311 severity on the upper protocol (or application) might not be the same 312 as that of the protocol in question. For example, a TCP connection 313 that is aborted might or might not result in a hard failure of the 314 upper application: if the upper application can establish a new TCP 315 connection without any impact on the application, a hard failure at 316 the TCP protocol may have no severity at the application level. On 317 the other hand, if a hard failure of a TCP connection results in 318 excessive degradation of service at the application layer, it will 319 also result in a hard failure at the application. 321 6. Categorizing Transient Numeric Identifiers 323 This section includes a non-exhaustive survey of transient numeric 324 identifiers, which are representative of all the possible 325 combinations of interoperability requirements and failure severities 326 found in popular protocols from different layers. Additionally, it 327 proposes a number of categories that can accommodate these 328 identifiers based on their interoperability requirements and their 329 associated failure severity (soft or hard). 331 NOTE: 332 All other transient numeric identifiers that were analyzed as part 333 of this effort could be accommodated into one of the existing 334 categories from Table 1. 336 +==============+===============================+==================+ 337 | Identifier | Interoperability Requirements | Failure Severity | 338 +==============+===============================+==================+ 339 | IPv6 Frag ID | Uniqueness (for IP address | Soft/Hard (1) | 340 | | pair) | | 341 +--------------+-------------------------------+------------------+ 342 | IPv6 IID | Uniqueness (and stable within | Soft (3) | 343 | | IPv6 prefix) (2) | | 344 +--------------+-------------------------------+------------------+ 345 | TCP ISN | Monotonically-increasing (4) | Hard (4) | 346 +--------------+-------------------------------+------------------+ 347 | TCP initial | Monotonically-increasing (5) | Hard (5) | 348 | timestamps | | | 349 +--------------+-------------------------------+------------------+ 350 | TCP eph. | Uniqueness (for connection | Hard | 351 | port | ID) | | 352 +--------------+-------------------------------+------------------+ 353 | IPv6 Flow | Uniqueness | None (6) | 354 | Label | | | 355 +--------------+-------------------------------+------------------+ 356 | DNS TxID | Uniqueness | None (7) | 357 +--------------+-------------------------------+------------------+ 359 Table 1: Survey of Transient Numeric Identifiers 361 NOTE: 363 (1) 364 While a single collision of Fragment ID values would simply lead 365 to a single packet drop (and hence a "soft" failure), repeated 366 collisions at high data rates might trash the Fragment ID space, 367 leading to a hard failure [RFC4963]. 369 (2) 370 While the interoperability requirements are simply that the 371 Interface ID results in a unique IPv6 address, for operational 372 reasons it is typically desirable that the resulting IPv6 address 373 (and hence the corresponding Interface ID) be stable within each 374 network [RFC7217] [RFC8064]. 376 (3) 377 While IPv6 Interface IDs must result in unique IPv6 addresses, 378 IPv6 Duplicate Address Detection (DAD) [RFC4862] allows for the 379 detection of duplicate addresses, and hence such Interface ID 380 collisions can be recovered. 382 (4) 383 In theory, there are no interoperability requirements for TCP 384 Initial Sequence Numbers (ISNs), since the TIME-WAIT state and 385 TCP's "quiet time" concept take care of old segments from previous 386 incarnations of a connection. However, a widespread optimization 387 allows for a new incarnation of a previous connection to be 388 created if the ISN of the incoming SYN is larger than the last 389 sequence number seen in that direction for the previous 390 incarnation of the connection. Thus, monotonically-increasing TCP 391 ISNs allow for such optimization to work as expected [RFC6528], 392 and can help avoid connection-establishment failures. 394 (5) 395 Strictly speaking, there are no interoperability requirements for 396 the *initial* TCP timestamp employed by a TCP instance (i.e., the 397 TS Value (TSval) in a segment with the SYN bit set). However, 398 some TCP implementations allow a new incarnation of a previous 399 connection to be created if the TSval of the incoming SYN is 400 larger than the last TSval seen in that direction for the previous 401 incarnation of the connection (please see [RFC6191]). Thus, 402 monotonically-increasing TCP initial timestamps (across 403 connections to the same endpoint) allow for such optimization to 404 work as expected [RFC6191], and can help avoid connection- 405 establishment failures. 407 (6) 408 The IPv6 Flow Label is typically employed for load sharing 409 [RFC7098], along with the Source and Destination IPv6 addresses. 410 Reuse of a Flow Label value for the same set {Source Address, 411 Destination Address} would typically cause both flows to be 412 multiplexed onto the same link. However, as long as this does not 413 occur deterministically, it will not result in any negative 414 implications. 416 (7) 417 DNS TxIDs are employed, together with the Source Address, 418 Destination Address, Source Port, and Destination Port, to match 419 DNS requests and responses. However, since an implementation 420 knows which DNS requests were sent for that set of {Source 421 Address, Destination Address, Source Port, and Destination Port, 422 DNS TxID}, a collision of TxID would result, if anything, in a 423 small performance penalty (the response would nevertheless be 424 discarded when it is found that it does not answer the query sent 425 in the corresponding DNS query). 427 Based on the survey above, we can categorize identifiers as follows: 429 +=======+======================================+====================+ 430 | Cat | Category | Sample Proto IDs | 431 | # | | | 432 +=======+======================================+====================+ 433 | 1 | Uniqueness (soft failure) | IPv6 Flow L., DNS | 434 | | | TxIDs | 435 +-------+--------------------------------------+--------------------+ 436 | 2 | Uniqueness (hard failure) | IPv6 Frag ID, TCP | 437 | | | ephemeral port | 438 +-------+--------------------------------------+--------------------+ 439 | 3 | Uniqueness, stable within context | IPv6 IIDs | 440 | | (soft failure) | | 441 +-------+--------------------------------------+--------------------+ 442 | 4 | Uniqueness, monotonically increasing | TCP ISN, TCP | 443 | | within context (hard failure) | initial timestamps | 444 +-------+--------------------------------------+--------------------+ 446 Table 2: Identifier Categories 448 We note that Category #4 could be considered a generalized case of 449 category #3, in which a monotonically increasing element is added to 450 a stable (within context) element, such that the resulting 451 identifiers are monotonically increasing within a specified context. 452 That is, the same algorithm could be employed for both #3 and #4, 453 given appropriate parameters. 455 7. Common Algorithms for Transient Numeric Identifier Generation 457 The following subsections describe some sample algorithms that can be 458 employed for generating transient numeric identifiers for each of the 459 categories above, while mitigating the vulnerabilities analyzed in 460 Section 8 of this document. 462 All of the variables employed in the algorithms of the following 463 subsections are of "unsigned integer" type, except for the "retry" 464 variable, that is of (signed) "integer" type. 466 7.1. Category #1: Uniqueness (soft failure) 468 The requirement of uniqueness with a soft failure severity can be 469 complied with a Pseudo-Random Number Generator (PRNG). 471 NOTE: 472 Please see [RFC4086] regarding randomness requirements for 473 security. 475 We note that since the premise is that collisions of transient 476 numeric identifiers of this category only leads to soft failures, in 477 many cases, the algorithm might not need to check the suitability of 478 a selected identifier (i.e., the suitable_id() function, described 479 below, could always return "true"). 481 In scenarios where e.g. simultaneous use of a given numeric ID is 482 undesirable and the implementation detects such condition, an 483 implementation may opt to select the next available identifier in the 484 same sequence, or select another random number. Section 7.1.1 is an 485 implementation of the former strategy, while Section 7.1.2 is an 486 implementation of the later. Typically, the algorithm in 487 Section 7.1.2 results in a more uniform distribution of the generated 488 transient numeric identifiers. However, for transient numeric 489 identifiers where an implementation typically keeps local state about 490 unsuitable/used identifiers, the algorithm in Section 7.1.2 may 491 require many more iterations than the algorithm in Section 7.1.1 to 492 generate a suitable transient numeric identifier. This will usually 493 be affected by the current usage ratio of transient numeric 494 identifiers (i.e., number of numeric identifiers considered suitable 495 / total number of numeric identifiers) and other parameters. 496 Therefore, in such cases many implementations tend to prefer the 497 algorithm in Section 7.1.1 over the algorithm in Section 7.1.2. 499 7.1.1. Simple Randomization Algorithm 501 /* Transient Numeric ID selection function */ 503 id_range = max_id - min_id + 1; 504 next_id = min_id + (random() % id_range); 505 retry = id_range; 507 do { 508 if (suitable_id(next_id)) { 509 return next_id; 510 } 512 if (next_id == max_id) { 513 next_id = min_id; 514 } else { 515 next_id++; 516 } 518 retry--; 520 } while (retry > 0); 522 return ERROR; 524 NOTE: 525 random() is a function that returns a pseudo-random unsigned 526 integer number of appropriate size. Note that the output needs to 527 be unpredictable, and typical implementations of the POSIX 528 random() function do not necessarily meet this requirement. See 529 [RFC4086] for randomness requirements for security. Beware that 530 "adapting" the length of the output of random() with a modulo 531 operator (e.g., C language's "%") may change the distribution of 532 the PRNG. 534 The function suitable_id() can check, when possible and desirable, 535 whether a selected transient numeric identifier is suitable (e.g. 536 it is not already in use). Depending on how/where the numeric 537 identifier is used, it may or may not be possible (or even 538 desirable) to check whether the numeric identifier is in use (or 539 whether it has been recently employed). When an identifier is 540 found to be unsuitable, this algorithm selects the next available 541 numeric identifier in sequence. 543 Even when this algorithm selects numeric IDs randomly, it is 544 biased towards the first available numeric ID after a sequence of 545 unavailable numeric IDs. For example, if this algorithm is 546 employed for transport protocol ephemeral port randomization 547 [RFC6056] and the local list of unsuitable port numbers (e.g., 548 registered port numbers that should not be used for ephemeral 549 ports) is significant, an attacker may actually have a 550 significantly better chance of guessing a port number. 552 All the variables (in this and all the algorithms discussed in 553 this document) are unsigned integers. 555 Assuming the randomness requirements for the PRNG are met (see 556 [RFC4086]), this algorithm does not suffer from any of the issues 557 discussed in Section 8. 559 7.1.2. Another Simple Randomization Algorithm 561 The following pseudo-code illustrates another algorithm for selecting 562 a random transient numeric identifier which, in the event a selected 563 identifier is found to be unsuitable (e.g., already in use), another 564 identifier is randomly selected: 566 /* Transient Numeric ID selection function */ 568 id_range = max_id - min_id + 1; 569 retry = id_range; 571 do { 572 next_id = min_id + (random() % id_range); 574 if (suitable_id(next_id)) { 575 return next_id; 576 } 578 retry--; 580 } while (retry > 0); 582 return ERROR; 584 This algorithm might be unable to select a transient numeric 585 identifier (i.e., return "ERROR") even if there are suitable 586 identifiers available, in cases where a large number of identifiers 587 are found to be unsuitable (e.g. "in use"). 589 The same considerations from Section 7.1.1 with respect to the 590 properties of random() and the adaptation of its output length apply 591 to this algorithm. 593 Assuming the randomness requirements for the PRNG are met (see 594 [RFC4086]), this algorithm does not suffer from any of the issues 595 discussed in Section 8. 597 7.2. Category #2: Uniqueness (hard failure) 599 One of the most trivial approaches for generating unique transient 600 numeric identifier (with a hard failure severity) is to reduce the 601 identifier reuse frequency by generating the numeric identifiers with 602 a monotonically-increasing function (e.g. linear). As a result, any 603 of the algorithms described in Section 7.4 ("Category #4: Uniqueness, 604 monotonically increasing within context (hard failure)") can be 605 readily employed for complying with the requirements of this 606 transient numeric identifier category. 608 In cases where suitability (e.g. uniqueness) of the selected 609 identifiers can be definitely assessed by the local system, any of 610 the algorithms described in Section 7.1 ("Category #1: Uniqueness 611 (soft failure)") can be readily employed for complying with the 612 requirements of this numeric identifier category. 614 NOTE: 615 In the case of e.g. TCP ephemeral ports or TCP ISNs, a transient 616 numeric identifier that might seem suitable from the perspective 617 of the local system, might actually be unsuitable from the 618 perspective of the remote system (e.g., because there is state 619 associated with the selected identifier at the remote system). 620 Therefore, in such cases it is not possible employ the algorithms 621 from Section 7.1 ("Category #1: Uniqueness (soft failure)"). 623 7.3. Category #3: Uniqueness, stable within context (soft failure) 625 The goal of the following algorithm is to produce identifiers that 626 are stable for a given context (identified by "CONTEXT"), but that 627 change when the aforementioned context changes. 629 In order to avoid storing in memory the transient numeric identifiers 630 computed for each CONTEXT, the following algorithm employs a 631 calculated technique (as opposed to keeping state in memory) to 632 generate a stable transient numeric identifier for each given 633 context. 635 /* Transient Numeric ID selection function */ 637 id_range = max_id - min_id + 1; 639 retry = 0; 641 do { 642 offset = F(CONTEXT, retry, secret_key); 643 next_id = min_id + (offset % id_range); 645 if (suitable_id(next_id)) { 646 return next_id; 647 } 649 retry++; 651 } while (retry <= MAX_RETRIES); 653 return ERROR; 655 In this algorithm, the function F() provides a stateless and stable 656 per-CONTEXT offset, where CONTEXT is the concatenation of all the 657 elements that define the given context. 659 For example, if this algorithm is expected to produce IPv6 IIDs 660 that are unique per network interface and SLAAC autoconfiguration 661 prefix, the CONTEXT should be the concatenation of e.g. the 662 network interface index and the SLAAC autoconfiguration prefix 663 (please see [RFC7217] for an implementation of this algorithm for 664 generation of stable IPv6 IIDs). 666 F() is a pseudorandom function (PRF). It must not be computable from 667 the outside (without knowledge of the secret key). F() must also be 668 difficult to reverse, such that it resists attempts to obtain the 669 secret_key, even when given samples of the output of F() and 670 knowledge or control of the other input parameters. F() should 671 produce an output of at least as many bits as required for the 672 transient numeric identifier. SipHash-2-4 (128-bit key, 64-bit 673 output) [SipHash] and BLAKE3 (256-bit key, arbitrary-length output) 674 [BLAKE3] are two possible options for F(). Alternatively, F() could 675 be implemented with a keyed-hash message authentication code (HMAC) 676 [RFC2104]. HMAC-SHA-256 [FIPS-SHS] would be one possible option for 677 such implementation alternative. Note: Use of HMAC-MD5 [RFC1321] is 678 not recommended for F() [RFC6151]. 680 The result of F() is no more secure than the secret key, and 681 therefore 'secret_key' must be unknown to the attacker, and must be 682 of a reasonable length. 'secret_key' must remain stable for a given 683 CONTEXT, since otherwise the numeric identifiers generated by this 684 algorithm would not have the desired stability properties (i.e., 685 stable for a given CONTEXT). In most cases, 'secret_key' should be 686 selected with a PRNG (see [RFC4086] for recommendations on choosing 687 secrets) at an appropriate time, and stored in stable or volatile 688 storage (as necessary) for future use. 690 The result of F() is stored in the variable 'offset', which may take 691 any value within the storage type range, since we are restricting the 692 resulting identifier to be in the range [min_id, max_id] in a similar 693 way as in the algorithm described in Section 7.1.1. 695 suitable_id() checks whether the candidate identifier has suitable 696 uniqueness properties. Collisions (i.e., an identifier that is not 697 unique) are recovered by incrementing the 'retry' variable and 698 recomputing F(), up to a maximum of MAX_RETRIES times. However, 699 recovering from collisions will usually result in identifiers that 700 fail to remain constant for the specified context. This is normally 701 acceptable when the probability of collisions is small, as in the 702 case of e.g. IPv6 IIDs resulting from SLAAC [RFC7217] [RFC4941]. 704 For obvious reasons, the transient numeric identifiers generated with 705 this algorithm allow for network activity correlation and 706 fingerprinting within "CONTEXT". However, this is essentially a 707 design goal of this category of transient numeric identifiers. 709 7.4. Category #4: Uniqueness, monotonically increasing within context 710 (hard failure) 712 7.4.1. Per-context Counter Algorithm 714 One possible way of selecting unique monotonically-increasing 715 identifiers (per context) is to employ a per-context counter. Such 716 an algorithm could be described as follows: 718 /* Transient Numeric ID selection function */ 720 id_range = max_id - min_id + 1; 721 retry = id_range; 722 id_inc = increment() % id_range; 724 if( (next_id = lookup_counter(CONTEXT)) == ERROR){ 725 next_id = min_id + random() % id_range; 726 } 728 do { 729 if ( (max_id - next_id) >= id_inc){ 730 next_id = next_id + id_inc; 731 } 732 else { 733 next_id = min_id + id_inc - (max_id - next_id); 734 } 736 if (suitable_id(next_id)){ 737 store_counter(CONTEXT, next_id); 738 return next_id; 739 } 741 retry = retry - id_inc; 743 } while (retry > 0); 745 return ERROR; 747 NOTES: 748 increment() returns a small integer that is employed to increment 749 the current counter value to obtain the next transient numeric 750 identifier. This value must be much smaller than the number of 751 possible values for the numeric IDs (i.e., "id_range"). Most 752 implementations of this algorithm employ a constant increment of 753 1. Using a value other than 1 can help mitigate some information 754 leakages (please see below), at the expense of a possible increase 755 in the numeric ID reuse frequency. 757 The code above makes sure that the increment employed in the 758 algorithm (id_inc) is always smaller than the number of possible 759 values for the numeric IDs (i.e., "max_id - min_d + 1"). However, 760 as noted above, this value must also be much smaller than the 761 number of possible values for the numeric IDs. 763 lookup_counter() is a function that returns the current counter 764 for a given context, or an error condition if that counter does 765 not exist. 767 store_counter() is a function that saves a counter value for a 768 given context. 770 suitable_id() is a function that checks whether the resulting 771 identifier is acceptable (e.g., whether it is not already in use, 772 etc.). 774 Essentially, whenever a new identifier is to be selected, the 775 algorithm checks whether a counter for the corresponding context 776 exists. If does, the value of such counter is incremented to obtain 777 the new transient numeric identifier, and the counter is updated. If 778 no counter exists for such context, a new counter is created and 779 initialized to a random value, and used as the selected transient 780 numeric identifier. This algorithm produces a per-context counter, 781 which results in one monotonically-increasing function for each 782 context. Since each counter is initialized to a random value, the 783 resulting values are unpredictable by an off-path attacker. 785 The choice of id_inc has implications on both the security and 786 privacy properties of the resulting identifiers, but also on the 787 corresponding interoperability properties. On one hand, minimizing 788 the increments generally minimizes the identifier reuse frequency, 789 albeit at increased predictability. On the other hand, if the 790 increments are randomized, predictability of the resulting 791 identifiers is reduced, and the information leakage produced by 792 global constant increments is mitigated. However, using larger 793 increments than necessary can result in higher numeric ID reuse 794 frequency. 796 This algorithm has the following drawbacks: 798 * It requires an implementation to store each per-CONTEXT counter in 799 memory. If, as a result of resource management, the counter for a 800 given context must be removed, the last transient numeric 801 identifier value used for that context will be lost. Thus, if 802 subsequently an identifier needs to be generated for the same 803 context, the corresponding counter will need to be recreated and 804 reinitialized to a random value, thus possibly leading to reuse/ 805 collision of numeric identifiers. 807 * Keeping one counter for each possible "context" may in some cases 808 be considered too onerous in terms of memory requirements. 810 Otherwise, the identifiers produced by this algorithm do not suffer 811 from the other issues discussed in Section 8. 813 7.4.2. Simple PRF-Based Algorithm 815 The goal of this algorithm is to produce monotonically-increasing 816 transient numeric identifiers (for each given context), with a 817 randomized initial value. For example, if the identifiers being 818 generated must be monotonically-increasing for each {IP Source 819 Address, IP Destination Address} set, then each possible combination 820 of {IP Source Address, IP Destination Address} should have a separate 821 monotonically-increasing sequence, that starts at a different random 822 value. 824 Instead of maintaining a per-context counter (as in the algorithm 825 from Section 7.4.1), the following algorithm employs a calculated 826 technique to maintain a random offset for each possible context. 828 /* Initialization code */ 829 counter = 0; 831 /* Transient Numeric ID selection function */ 833 id_range = max_id - min_id + 1; 834 id_inc = increment() % id_range; 835 offset = F(CONTEXT, secret_key); 836 retry = id_range; 838 do { 839 next_id = min_id + (offset + counter) % id_range; 840 counter = counter + id_inc; 842 if (suitable_id(next_id)) { 843 return next_id; 844 } 846 retry = retry - id_inc; 848 } while (retry > 0); 850 return ERROR; 852 In the algorithm above, the function F() provides a (stateless) 853 unpredictable offset for each given context (as identified by 854 'CONTEXT'). 856 F() is a PRF, with the same properties as those specified for F() in 857 Section 7.3. 859 CONTEXT is the concatenation of all the elements that define a given 860 context. For example, if this algorithm is expected to produce 861 identifiers that are monotonically-increasing for each set (Source IP 862 Address, Destination IP Address), CONTEXT should be the concatenation 863 of these two IP addresses. 865 The function F() provides a "per-CONTEXT" fixed offset within the 866 numeric identifier "space". Both the 'offset' and 'counter' 867 variables may take any value within the storage type range since we 868 are restricting the resulting identifier to be in the range [min_id, 869 max_id] in a similar way as in the algorithm described in 870 Section 7.1.1. This allows us to simply increment the 'counter' 871 variable and rely on the unsigned integer to wrap around. 873 The result of F() is no more secure than the secret key, and 874 therefore 'secret_key' must be unknown to the attacker, and must be 875 of a reasonable length. 'secret_key' must remain stable for a given 876 CONTEXT, since otherwise the numeric identifiers generated by this 877 algorithm would not have the desired stability properties (i.e., 878 monotonically-increasing for a given CONTEXT). In most cases, 879 'secret_key' should be selected with a PRNG (see [RFC4086] for 880 recommendations on choosing secrets) at an appropriate time, and 881 stored in stable or volatile storage (as necessary) for future use. 883 It should be noted that, since this algorithm uses a global counter 884 ("counter") for selecting identifiers (i.e., all counters share the 885 same increments space), this algorithm results in an information 886 leakage (as described in Section 8.2). For example, if this 887 algorithm were used for selecting TCP ephemeral ports, and an 888 attacker could force a client to periodically establish a new TCP 889 connection to an attacker-controlled system (or through an attacker- 890 observable routing path), the attacker could subtract consecutive 891 source port values to obtain the number of outgoing TCP connections 892 established globally by the victim host within that time period (up 893 to wrap-around issues and five-tuple collisions, of course). This 894 information leakage could be partially mitigated by employing small 895 random values for the increments (i.e., increment() function), 896 instead of having increment() return the constant "1". 898 We nevertheless note that an improved mitigation of this information 899 leakage could be more successfully achieved by employing the 900 algorithm from Section 7.4.3, instead. 902 7.4.3. Double-PRF Algorithm 904 A trade-off between maintaining a single global 'counter' variable 905 and maintaining 2**N 'counter' variables (where N is the width of the 906 result of F()), could be achieved as follows. The system would keep 907 an array of TABLE_LENGTH values, which would provide a separation of 908 the increment space into multiple buckets. This improvement could be 909 incorporated into the algorithm from Section 7.4.2 as follows: 911 /* Initialization code */ 913 for(i = 0; i < TABLE_LENGTH; i++) { 914 table[i] = random(); 915 } 917 /* Transient Numeric ID selection function */ 919 id_range = max_id - min_id + 1; 920 id_inc = increment() % id_range; 921 offset = F(CONTEXT, secret_key1); 922 index = G(CONTEXT, secret_key2) % TABLE_LENGTH; 923 retry = id_range; 925 do { 926 next_id = min_id + (offset + table[index]) % id_range; 927 table[index] = table[index] + id_inc; 929 if (suitable_id(next_id)) { 930 return next_id; 931 } 933 retry = retry - id_inc; 935 } while (retry > 0); 937 return ERROR; 939 'table[]' could be initialized with random values, as indicated by 940 the initialization code in the pseudo-code above. 942 Both F() and G() are PRFs, with the same properties as those required 943 for F() in Section 7.3. 945 The results of F() and G() are no more secure than their respective 946 secret keys ('secret_key1' and 'secret_key2', respectively), and 947 therefore both secret keys must be unknown to the attacker, and must 948 be of a reasonable length. Both secret keys must remain stable for 949 the given CONTEXT, since otherwise the transient numeric identifiers 950 generated by this algorithm would not have the desired stability 951 properties (i.e., monotonically-increasing for a given CONTEXT). In 952 most cases, both secret keys should be selected with a PRNG (see 953 [RFC4086] for recommendations on choosing secrets) at an appropriate 954 time, and stored in stable or volatile storage (as necessary) for 955 future use. 957 The 'table[]' array assures that successive transient numeric 958 identifiers for a given context will be monotonically-increasing. 959 Since the increments space is separated into TABLE_LENGTH different 960 spaces, the identifier reuse frequency will be (probabilistically) 961 lower than that of the algorithm in Section 7.4.2. That is, the 962 generation of an identifier for one given context will not 963 necessarily result in increments in the identifier sequence of other 964 contexts. It is interesting to note that the size of 'table[]' does 965 not limit the number of different identifier sequences, but rather 966 separates the *increment space* into TABLE_LENGTH different spaces. 967 The selected transient numeric identifier sequence will be obtained 968 by adding the corresponding entry from 'table[]' to the value in the 969 'offset' variable, which selects the actual identifier sequence space 970 (as in the algorithm from Section 7.4.2). 972 An attacker can perform traffic analysis for any "increment space" 973 (i.e., context) into which the attacker has "visibility" -- namely, 974 the attacker can force a system to generate identifiers for 975 G(CONTEXT, secret_key2), where the result of G() identifies the 976 target "increment space". However, the attacker's ability to perform 977 traffic analysis is very reduced when compared to the simple PRF- 978 based identifiers (described in Section 7.4.2) and the predictable 979 linear identifiers (described in Appendix A.1). Additionally, an 980 implementation can further limit the attacker's ability to perform 981 traffic analysis by further separating the increment space (that is, 982 using a larger value for TABLE_LENGTH) and/or by randomizing the 983 increments (i.e., increment() returning a small random number as 984 opposed to the constant "1"). 986 Otherwise, this algorithm does not suffer from the issues discussed 987 in Section 8. 989 8. Common Vulnerabilities Associated with Transient Numeric Identifiers 991 8.1. Network Activity Correlation 993 An identifier that is predictable within a given context allows for 994 network activity correlation within that context. 996 For example, a stable IPv6 Interface Identifier allows for network 997 activity to be correlated within the context in which the Interface 998 Identifier is stable [RFC7721]. A stable-per-network IPv6 Interface 999 Identifier (as in [RFC7217]) allows for network activity correlation 1000 within a network, whereas a constant IPv6 Interface Identifier (that 1001 remains constant across networks) allows not only network activity 1002 correlation within the same network, but also across networks ("host 1003 tracking"). 1005 Similarly, an implementation that generates TCP ISNs with a global 1006 counter could allow for fingerprinting and network activity 1007 correlation across networks, since an attacker could passively infer 1008 the identity of the victim based on the TCP ISNs employed for 1009 subsequent communication instances. Similarly, an implementation 1010 that generates predictable IPv6 Fragment Identification values could 1011 be subject to fingerprinting attacks (see e.g. [Bellovin2002]). 1013 8.2. Information Leakage 1015 Transient numeric identifiers that result in specific patterns can 1016 produce an information leakage to other communicating entities. For 1017 example, it is common to generate transient numeric identifiers with 1018 an algorithm such as: 1020 ID = offset(CONTEXT) + mono(CONTEXT); 1022 This generic expression generates identifiers by adding a 1023 monotonically-increasing function (e.g. linear) to a randomized 1024 offset. offset() is constant within a given context, whereas mono() 1025 produces a monotonically-increasing sequence for the given context. 1026 Identifiers generated with this expression will generally be 1027 predictable within CONTEXT. 1029 The predictability of mono(), irrespective of the predictability of 1030 offset(), can leak information that may be of use to attackers. For 1031 example, a node that selects ephemeral port numbers as in: 1033 ephemeral_port = offset(Dest_IP) + mono() 1035 that is, with a per-destination offset, but a global mono() function 1036 (e.g., a global counter), will leak information about total number of 1037 outgoing connections that have been issued by the vulnerable 1038 implementation. 1040 Similarly, a node that generates Fragment Identification values as 1041 in: 1043 Frag_ID = offset(IP_src_addr, IP_dst_addr) + mono() 1045 will leak out information about the total number of fragmented 1046 packets that have been transmitted by the vulnerable implementation. 1047 The vulnerabilities described in 1049 [Sanfilippo1998a], [Sanfilippo1998b], and [Sanfilippo1999] are all 1050 associated with the use of a global mono() function (i.e., with a 1051 global and constant "context") -- particularly when it is a linear 1052 function (constant increments of 1). 1054 Predicting transient numeric identifiers can be of help for other 1055 types of attacks. For example, predictable TCP ISNs can open the 1056 door to trivial connection-reset and data injection attacks (see 1057 Section 8.6). 1059 8.3. Fingerprinting 1061 Fingerprinting is the capability of an attacker to identify or re- 1062 identify a visiting user, user agent or device via configuration 1063 settings or other observable characteristics. Observable protocol 1064 objects and characteristics can be employed to identify/re-identify a 1065 variety of entities, ranging from the underlying hardware or 1066 Operating System (vendor, type and version), to the user itself (i.e. 1067 his/her identity). [EFF] illustrates web browser-based 1068 fingerprinting, but similar techniques can be applied at other layers 1069 and protocols, whether alternatively or in conjunction with it. 1071 Transient numeric identifiers are one of the observable protocol 1072 components that could be leveraged for fingerprinting purposes. That 1073 is, an attacker could sample transient numeric identifiers to infer 1074 the algorithm (and its associated parameters, if any) for generating 1075 such identifiers, possibly revealing the underlying Operating System 1076 (OS) vendor, type, and version. This information could possibly be 1077 further leveraged in conjunction with other fingerprinting techniques 1078 and sources. 1080 Evasion of protocol-stack fingerprinting can prove to be a very 1081 difficult task: most systems make use of a wide variety of protocols, 1082 each of which have a large number of parameters that can be set to 1083 arbitrary values or generated with a variety of algorithms with 1084 multiple parameters. 1086 NOTE: 1087 General protocol-based fingerprinting is discussed in [RFC6973], 1088 along with guidelines to mitigate the associated vulnerability. 1089 [Fyodor1998] and [Fyodor2006] are classic references on Operating 1090 System detection via TCP/IP stack fingerprinting. Nmap [nmap] is 1091 probably the most popular tool for remote OS identification via 1092 active TCP/IP stack fingerprinting. p0f [Zalewski2012], on the 1093 other hand, is a tool for performing remote OS detection via 1094 passive TCP/IP stack fingerprinting. Finally, [TBIT] is a TCP 1095 fingerprinting tool that aims at characterizing the behaviour of a 1096 remote TCP peer based on active probes, and which has been widely 1097 used in the research community. 1099 Algorithms that, from the perspective of an observer (e.g., the 1100 legitimate communicating peer), result in specific values or 1101 patterns, will allow for at least some level of fingerprinting. For 1102 example, the algorithm from Section 7.3 will typically allow 1103 fingerprinting within the context where the resulting identifiers are 1104 stable. Similarly, the algorithms from Section 7.4 will result in a 1105 monotonically-increasing sequences within a given context, thus 1106 allowing for at least some level of fingerprinting (when the other 1107 communicating entity can correlate different sampled identifiers as 1108 belonging to the same monotonically-increasing sequence). 1110 Thus, where possible, algorithms from Section 7.1 should be preferred 1111 over algorithms that result in specific values or patterns. 1113 8.4. Exploitation of the Semantics of Transient Numeric Identifiers 1115 Identifiers that are not semantically opaque tend to be more 1116 predictable than semantically-opaque identifiers. For example, a MAC 1117 address contains an OUI (Organizationally-Unique Identifier) which 1118 identifies the vendor that manufactured the corresponding network 1119 interface card. This can be leveraged by an attacker trying to 1120 "guess" MAC addresses, who has some knowledge about the possible NIC 1121 vendor. 1123 [RFC7707] discusses a number of techniques to reduce the search space 1124 when performing IPv6 address-scanning attacks by leveraging the 1125 semantics of the IIDs produced by traditional SLAAC algorithms 1126 (eventually replaced by [RFC7217]) that embed MAC addresses in the 1127 IID of IPv6 addresses. 1129 8.5. Exploitation of Collisions of Transient Numeric Identifiers 1131 In many cases, the collision of transient network identifiers can 1132 have a hard failure severity (or result in a hard failure severity if 1133 an attacker can cause multiple collisions deterministically, one 1134 after another). For example, predictable Fragment Identification 1135 values open the door to Denial of Service (DoS) attacks (see e.g. 1136 [RFC5722].). 1138 8.6. Exploitation of Predictable Transient Numeric Identifiers for 1139 Injection Attacks 1141 Some protocols rely on "sequence numbers" for the validation of 1142 incoming packets. For example, TCP employs sequence numbers for 1143 reassembling TCP segments, while IPv4 and IPv6 employ Fragment 1144 Identification values for reassembling IPv4 and IPv6 fragments 1145 (respectively). Lacking built-in cryptographic mechanisms for 1146 validating packets, these protocols are therefore vulnerable to on- 1147 path data (see e.g. [Joncheray1995]) and/or control-information (see 1148 e.g. [RFC4953] and [RFC5927]) injection attacks. The extent to 1149 which these protocols may resist off-path (i.e. "blind") injection 1150 attacks depends on whether the associated "sequence numbers" are 1151 predictable, and effort required to successfully predict a valid 1152 "sequence number" (see e.g. [RFC4953] and [RFC5927]). 1154 We note that the use of unpredictable "sequence numbers" is a 1155 completely-ineffective mitigation for on-path injection attacks, and 1156 also a mostly-ineffective mitigation for off-path (i.e. "blind") 1157 injection attacks. However, many legacy protocols (such as TCP) do 1158 not natively incorporate cryptographic mitigations, but rather only 1159 as optional features (see e.g. [RFC5925]), if at all available. 1160 Additionally, ad-hoc use of cryptographic mitigations might not be 1161 sufficient to relieve a protocol implementation of generating 1162 appropriate transient numeric identifiers. For example, use of the 1163 Transport Layer Security (TLS) protocol [RFC8446] with TCP will 1164 protect the application protocol, but will not help to mitigate e.g. 1165 TCP-based connection-reset attacks (see e.g. [RFC4953]). Similarly, 1166 use of SEcure Neighbor Discovery (SEND) [RFC3971] will still imply 1167 reliance on the successful reassembly of IPv6 fragments in those 1168 cases where SEND packets do not fit into the link Maximum 1169 Transmission Unit (MTU) (see [RFC6980]). 1171 8.7. Cryptanalysis 1173 A number of algorithms discussed in this document (such as those 1174 described in Section 7.4.2 and Section 7.4.3) rely on PRFs. 1175 Implementations that employ weak PRFs or keys of inappropriate size 1176 can be subject to cryptanalysis, where an attacker can obtain the 1177 secret key employed for the PRF, predict numeric identifiers, etc. 1179 Furthermore, an implementation that overloads the semantics of the 1180 secret key can result in more trivial cryptanalysis, possibly 1181 resulting in the leakage of the value employed for the secret key. 1183 NOTE: 1184 [IPID-DEV] describes two vulnerable transient numeric ID 1185 generators that employ cryptographically-weak hash functions. 1186 Additionally, one of such implementations employs 32-bits of a 1187 kernel address as the secret key for a hash function, and 1188 therefore successful cryptanalysis leaks the aforementioned kernel 1189 address, allowing for Kernel Address Space Layout Randomization 1190 (KASLR) [KASLR] bypass. 1192 9. Vulnerability Assessment of Transient Numeric Identifiers 1194 The following subsections analyze possible vulnerabilities associated 1195 with the algorithms described in Section 7. 1197 9.1. Category #1: Uniqueness (soft failure) 1199 Possible vulnerabilities associated with the algorithms from 1200 Section 7.1 include: 1202 * Use of flawed PRNGs (please see e.g. [Zalewski2001], 1203 [Zalewski2002] and [Klein2007]), 1205 * Inadvertently affecting the distribution of an otherwise suitable 1206 PRNG. 1208 An implementer should consult [RFC4086] regarding randomness 1209 requirements for security, and consult relevant documentation when 1210 employing a PRNG provided by the underlying system. 1212 When employing a PRNG, many implementations "adapt" the length of its 1213 output with a modulo operator (e.g., C language's "%"), possibly 1214 changing the distribution of the output of the PRNG. 1216 For example, consider an implementation that employs the following 1217 code: 1219 id = random() % 50000; 1221 This example implementation means to obtain a transient numeric 1222 identifier in the range 0-49999. If random() produces e.g. a 1223 pseudorandom number of 16 bits (with uniform distribution), the 1224 selected transient numeric identifier will have a non-uniform 1225 distribution with the numbers in the range 0-15535 having double- 1226 frequency than the numbers in the range 15536-49999. 1228 NOTE: 1229 For example, both an output of 10 and output of 50010 from the 1230 random() function will result in an 'id' value of 10. 1232 This effect is reduced if the PRNG produces an output that is much 1233 longer than the length implied by the modulo operation. 1235 Use of algorithms other than PRNGs for generating identifiers of this 1236 category is discouraged. 1238 9.2. Category #2: Uniqueness (hard failure) 1240 As noted in Section 7.2, this category can employ the same algorithms 1241 as Category #4, since a monotonically-increasing sequence tends to 1242 minimize the transient numeric identifier reuse frequency. 1243 Therefore, the vulnerability analysis in Section 9.4 also applies to 1244 this category. 1246 Additionally, as noted in Section 7.2, some transient numeric 1247 identifiers of this category might be able to use the algorithms from 1248 Section 7.1, in which case the same considerations as in Section 9.1 1249 would apply. 1251 9.3. Category #3: Uniqueness, stable within context (soft failure) 1253 Possible vulnerabilities associated with the algorithms from 1254 Section 7.3 are: 1256 1. Use of weak PRFs, or inappropriate secret keys (whether 1257 inappropriate selection or inappropriate size) could allow for 1258 cryptanalysis, which could eventually be exploited by an attacker 1259 to predict future transient numeric identifiers. 1261 2. Since the algorithm generates a unique and stable identifier 1262 within a specified context, it may allow for network activity 1263 correlation and fingerprinting within the specified context. 1265 9.4. Category #4: Uniqueness, monotonically increasing within context 1266 (hard failure) 1268 The algorithm described in Section 7.4.1 for generating identifiers 1269 of Category #4 will result in an identifiable pattern (i.e. a 1270 monotonically-increasing sequence) for the transient numeric 1271 identifiers generated for each CONTEXT, and thus will allow for 1272 fingerprinting and network activity correlation within each CONTEXT. 1274 On the other hand, a simple way to generalize and analyze the 1275 algorithms described in Section 7.4.2 and Section 7.4.3 for 1276 generating identifiers of Category #4, is as follows: 1278 /* Transient Numeric ID selection function */ 1280 id_range = max_id - min_id + 1; 1281 retry = id_range; 1282 id_inc = increment() % id_range; 1284 do { 1285 update_mono(CONTEXT, id_inc); 1286 next_id = min_id + (offset(CONTEXT) + \ 1287 mono(CONTEXT)) % id_range; 1289 if (suitable_id(next_id)) { 1290 return next_id; 1291 } 1293 retry = retry - id_inc; 1295 } while (retry > 0); 1297 return ERROR; 1299 NOTE: 1300 increment() returns a small integer that is employed to generate a 1301 monotonically-increasing function. Most implementations employ a 1302 constant value for "increment()" (usually 1). The value returned 1303 by increment() must be much smaller than the value computed for 1304 "id_range". 1306 update_mono(CONTEXT, id_inc) increments the counter corresponding 1307 to CONTEXT by "id_inc". 1309 mono(CONTEXT) reads the counter corresponding to CONTEXT. 1311 Essentially, an identifier (next_id) is generated by adding a 1312 monotonically-increasing function (mono()) to an offset value, 1313 unknown to the attacker and stable for given context (CONTEXT). 1315 The following aspects of the algorithm should be considered: 1317 * For the most part, it is the offset() function that results in 1318 identifiers that are unpredictable by an off-patch attacker. 1319 While the resulting sequence is known to be monotonically- 1320 increasing, the use of a randomized offset value makes the 1321 resulting values unknown to the attacker. 1323 * The most straightforward "stateless" implementation of offset() is 1324 with a PRF that takes the values that identify the context and a 1325 "secret_key" (not shown in the figure above) as arguments. 1327 * One possible implementation of mono() would be to have mono() 1328 internally employ a single counter (as in the algorithm from 1329 Section 7.4.2), or map the increments for different contexts into 1330 a number of counters/buckets, such that the number of counters 1331 that need to be maintained in memory is reduced (as in the 1332 algorithm from algorithm in Section 7.4.3). 1334 * In all cases, a monotonically increasing function is implemented 1335 by incrementing the previous value of a counter by increment() 1336 units. In the most trivial case, increment() could return the 1337 constant "1". But increment() could also be implemented to return 1338 small random integers such that the increments are unpredictable 1339 (see Appendix A of [RFC7739]). This represents a trade-off 1340 between the unpredictability of the resulting transient numeric 1341 IDs and the transient numeric ID reuse frequency. 1343 Considering the generic algorithm illustrated above, we can identify 1344 the following possible vulnerabilities: 1346 * Since the algorithms for this category are similar to those of 1347 Section 9.3, with the addition of a monotonically-increasing 1348 function, all the issues discussed in Section 9.3 ("Category #3: 1349 Uniqueness, stable within context (soft failure)") also apply to 1350 this case. 1352 * mono() can be correlated to the number of identifiers generated 1353 for a given context (CONTEXT). Thus, if mono() spans more than 1354 the necessary context, the "increments" could be leaked to other 1355 parties, thus disclosing information about the number of 1356 identifiers that have been generated by the algorithm for all 1357 contexts. This is information disclosure becomes more evident 1358 when an implementation employs a constant increment of 1. For 1359 example, an implementation where mono() is actually a single 1360 global counter, will unnecessarily leak information the number of 1361 identifiers that have been generated by the algorithm (globally, 1362 for all contexts). [Fyodor2003] is one example of how such 1363 information leakages can be exploited. We note that limiting the 1364 span of the increments space will require a larger number of 1365 counters to be stored in memory (i.e., a larger value for the 1366 TABLE_LENGTH parameter of the algorithm in Section 7.4.3). 1368 * Transient numeric identifiers generated with the algorithms 1369 described in Section 7.4.2 and Section 7.4.3 will normally allow 1370 for fingerprinting within CONTEXT since, for such context, the 1371 resulting identifiers will have an identifiable pattern (i.e. a 1372 monotonically-increasing sequence). 1374 10. IANA Considerations 1376 This document has no IANA actions. 1378 11. Security Considerations 1380 The entire document is about the security and privacy implications of 1381 transient numeric identifiers. 1382 [I-D.gont-numeric-ids-sec-considerations] recommends that protocol 1383 specifications specify the interoperability requirements of their 1384 transient numeric identifiers, perform a vulnerability assessment of 1385 their transient numeric identifiers, and suggest an algorithm for 1386 generating each of their transient numeric identifiers. This 1387 document analyzes possible algorithms (and their implications) that 1388 could be employed to comply with the interoperability properties of 1389 most common categories of transient numeric identifiers, while 1390 minimizing the associated negative security and privacy implications. 1392 12. Acknowledgements 1394 The authors would like to thank (in alphabetical order) Bernard 1395 Aboba, Jean-Philippe Aumasson, Steven Bellovin, Luis Leon Cardenas 1396 Graide, Guillermo Gont, Joseph Lorenzo Hall, Gre Norcie, Colin 1397 Perkins, Vincent Roca, Shivan Sahib, Rich Salz, Martin Thomson, and 1398 Michael Tuexen, for providing valuable comments on earlier versions 1399 of this document. 1401 The authors would like to thank Shivan Sahib and Christopher Wood, 1402 for their guidance during the publication process of this document. 1404 The authors would like to thank Jean-Philippe Aumasson and Mathew D. 1405 Green (John Hopkins University) for kindly answering a number of 1406 questions. 1408 The authors would like to thank Diego Armando Maradona for his magic 1409 and inspiration. 1411 13. References 1413 13.1. Normative References 1415 [RFC0793] Postel, J., "Transmission Control Protocol", STD 7, 1416 RFC 793, DOI 10.17487/RFC0793, September 1981, 1417 . 1419 [RFC6528] Gont, F. and S. Bellovin, "Defending against Sequence 1420 Number Attacks", RFC 6528, DOI 10.17487/RFC6528, February 1421 2012, . 1423 [RFC6056] Larsen, M. and F. Gont, "Recommendations for Transport- 1424 Protocol Port Randomization", BCP 156, RFC 6056, 1425 DOI 10.17487/RFC6056, January 2011, 1426 . 1428 [RFC5925] Touch, J., Mankin, A., and R. Bonica, "The TCP 1429 Authentication Option", RFC 5925, DOI 10.17487/RFC5925, 1430 June 2010, . 1432 [RFC0791] Postel, J., "Internet Protocol", STD 5, RFC 791, 1433 DOI 10.17487/RFC0791, September 1981, 1434 . 1436 [RFC2460] Deering, S. and R. Hinden, "Internet Protocol, Version 6 1437 (IPv6) Specification", RFC 2460, DOI 10.17487/RFC2460, 1438 December 1998, . 1440 [RFC8200] Deering, S. and R. Hinden, "Internet Protocol, Version 6 1441 (IPv6) Specification", STD 86, RFC 8200, 1442 DOI 10.17487/RFC8200, July 2017, 1443 . 1445 [RFC4086] Eastlake 3rd, D., Schiller, J., and S. Crocker, 1446 "Randomness Requirements for Security", BCP 106, RFC 4086, 1447 DOI 10.17487/RFC4086, June 2005, 1448 . 1450 [RFC4291] Hinden, R. and S. Deering, "IP Version 6 Addressing 1451 Architecture", RFC 4291, DOI 10.17487/RFC4291, February 1452 2006, . 1454 [RFC4941] Narten, T., Draves, R., and S. Krishnan, "Privacy 1455 Extensions for Stateless Address Autoconfiguration in 1456 IPv6", RFC 4941, DOI 10.17487/RFC4941, September 2007, 1457 . 1459 [RFC4862] Thomson, S., Narten, T., and T. Jinmei, "IPv6 Stateless 1460 Address Autoconfiguration", RFC 4862, 1461 DOI 10.17487/RFC4862, September 2007, 1462 . 1464 [RFC5722] Krishnan, S., "Handling of Overlapping IPv6 Fragments", 1465 RFC 5722, DOI 10.17487/RFC5722, December 2009, 1466 . 1468 [RFC7217] Gont, F., "A Method for Generating Semantically Opaque 1469 Interface Identifiers with IPv6 Stateless Address 1470 Autoconfiguration (SLAAC)", RFC 7217, 1471 DOI 10.17487/RFC7217, April 2014, 1472 . 1474 [RFC8064] Gont, F., Cooper, A., Thaler, D., and W. Liu, 1475 "Recommendation on Stable IPv6 Interface Identifiers", 1476 RFC 8064, DOI 10.17487/RFC8064, February 2017, 1477 . 1479 [RFC6191] Gont, F., "Reducing the TIME-WAIT State Using TCP 1480 Timestamps", BCP 159, RFC 6191, DOI 10.17487/RFC6191, 1481 April 2011, . 1483 [RFC7323] Borman, D., Braden, B., Jacobson, V., and R. 1484 Scheffenegger, Ed., "TCP Extensions for High Performance", 1485 RFC 7323, DOI 10.17487/RFC7323, September 2014, 1486 . 1488 [RFC1321] Rivest, R., "The MD5 Message-Digest Algorithm", RFC 1321, 1489 DOI 10.17487/RFC1321, April 1992, 1490 . 1492 [RFC6151] Turner, S. and L. Chen, "Updated Security Considerations 1493 for the MD5 Message-Digest and the HMAC-MD5 Algorithms", 1494 RFC 6151, DOI 10.17487/RFC6151, March 2011, 1495 . 1497 [RFC1035] Mockapetris, P., "Domain names - implementation and 1498 specification", STD 13, RFC 1035, DOI 10.17487/RFC1035, 1499 November 1987, . 1501 13.2. Informative References 1503 [KASLR] PaX Team, "Address Space Layout Randomization", 1504 . 1506 [IANA-PROT] 1507 IANA, "Protocol Registries", 1508 . 1510 [RFC6973] Cooper, A., Tschofenig, H., Aboba, B., Peterson, J., 1511 Morris, J., Hansen, M., and R. Smith, "Privacy 1512 Considerations for Internet Protocols", RFC 6973, 1513 DOI 10.17487/RFC6973, July 2013, 1514 . 1516 [Fyodor1998] 1517 Fyodor, "Remote OS Detection via TCP/IP Stack 1518 Fingerprinting", Phrack Magazine, Volume 9, Issue 54, 1519 1998, . 1521 [Fyodor2006] 1522 Lyon, G., "Chapter 8. Remote OS Detection", 2006, 1523 . 1525 [nmap] nmap, "Nmap: Free Security Scanner For Network Exploration 1526 and Audit", 2020, . 1528 [EFF] EFF, "Cover your tracks: See how trackers view your 1529 browser", 2020, . 1531 [Schuba1993] 1532 Schuba, C., "ADDRESSING WEAKNESSES IN THE DOMAIN NAME 1533 SYSTEM PROTOCOL", 1993, 1534 . 1537 [TBIT] TBIT, "TBIT, the TCP Behavior Inference Tool", 2001, 1538 . 1540 [Zalewski2012] 1541 Zalewski, M., "p0f v3 (version 3.09b)", 2012, 1542 . 1544 [RFC2104] Krawczyk, H., Bellare, M., and R. Canetti, "HMAC: Keyed- 1545 Hashing for Message Authentication", RFC 2104, 1546 DOI 10.17487/RFC2104, February 1997, 1547 . 1549 [RFC7098] Carpenter, B., Jiang, S., and W. Tarreau, "Using the IPv6 1550 Flow Label for Load Balancing in Server Farms", RFC 7098, 1551 DOI 10.17487/RFC7098, January 2014, 1552 . 1554 [RFC7258] Farrell, S. and H. Tschofenig, "Pervasive Monitoring Is an 1555 Attack", BCP 188, RFC 7258, DOI 10.17487/RFC7258, May 1556 2014, . 1558 [CPNI-TCP] Gont, F., "Security Assessment of the Transmission Control 1559 Protocol (TCP)", United Kingdom's Centre for the 1560 Protection of National Infrastructure (CPNI) Technical 1561 Report, 2009, . 1564 [Zalewski2001] 1565 Zalewski, M., "Strange Attractors and TCP/IP Sequence 1566 Number Analysis", 2001, 1567 . 1569 [Zalewski2002] 1570 Zalewski, M., "Strange Attractors and TCP/IP Sequence 1571 Number Analysis - One Year Later", 2001, 1572 . 1574 [Joncheray1995] 1575 Joncheray, L., "A Simple Active Attack Against TCP", Proc. 1576 Fifth Usenix UNIX Security Symposium, 1995, . 1580 [Morris1985] 1581 Morris, R., "A Weakness in the 4.2BSD UNIX TCP/IP 1582 Software", CSTR 117, AT&T Bell Laboratories, Murray Hill, 1583 NJ, 1985, 1584 . 1586 [Shimomura1995] 1587 Shimomura, T., "Technical details of the attack described 1588 by Markoff in NYT", Message posted in USENET's 1589 comp.security.misc newsgroup Message-ID: 1590 <3g5gkl$5j1@ariel.sdsc.edu>, 1995, 1591 . 1593 [RFC5927] Gont, F., "ICMP Attacks against TCP", RFC 5927, 1594 DOI 10.17487/RFC5927, July 2010, 1595 . 1597 [RFC4953] Touch, J., "Defending TCP Against Spoofing Attacks", 1598 RFC 4953, DOI 10.17487/RFC4953, July 2007, 1599 . 1601 [RFC8446] Rescorla, E., "The Transport Layer Security (TLS) Protocol 1602 Version 1.3", RFC 8446, DOI 10.17487/RFC8446, August 2018, 1603 . 1605 [RFC3971] Arkko, J., Ed., Kempf, J., Zill, B., and P. Nikander, 1606 "SEcure Neighbor Discovery (SEND)", RFC 3971, 1607 DOI 10.17487/RFC3971, March 2005, 1608 . 1610 [RFC6980] Gont, F., "Security Implications of IPv6 Fragmentation 1611 with IPv6 Neighbor Discovery", RFC 6980, 1612 DOI 10.17487/RFC6980, August 2013, 1613 . 1615 [RFC7739] Gont, F., "Security Implications of Predictable Fragment 1616 Identification Values", RFC 7739, DOI 10.17487/RFC7739, 1617 February 2016, . 1619 [RFC4963] Heffner, J., Mathis, M., and B. Chandler, "IPv4 Reassembly 1620 Errors at High Data Rates", RFC 4963, 1621 DOI 10.17487/RFC4963, July 2007, 1622 . 1624 [RFC6274] Gont, F., "Security Assessment of the Internet Protocol 1625 Version 4", RFC 6274, DOI 10.17487/RFC6274, July 2011, 1626 . 1628 [Bellovin1989] 1629 Bellovin, S., "Security Problems in the TCP/IP Protocol 1630 Suite", Computer Communications Review, vol. 19, no. 2, 1631 pp. 32-48, 1989, 1632 . 1634 [Bellovin2002] 1635 Bellovin, S. M., "A Technique for Counting NATted Hosts", 1636 IMW'02 Nov. 6-8, 2002, Marseille, France, 2002, 1637 . 1639 [Fyodor2003] 1640 Fyodor, "Idle scanning and related IP ID games", 2003, 1641 . 1644 [Sanfilippo1998a] 1645 Sanfilippo, S., "about the ip header id", Post to Bugtraq 1646 mailing-list, Mon Dec 14 1998, 1647 . 1649 [Sanfilippo1998b] 1650 Sanfilippo, S., "Idle scan", Post to Bugtraq mailing-list, 1651 1998, . 1654 [Sanfilippo1999] 1655 Sanfilippo, S., "more ip id", Post to Bugtraq mailing- 1656 list, 1999, 1657 . 1660 [Silbersack2005] 1661 Silbersack, M.J., "Improving TCP/IP security through 1662 randomization without sacrificing interoperability", 1663 EuroBSDCon 2005 Conference, 2005, 1664 . 1667 [Klein2007] 1668 Klein, A., "OpenBSD DNS Cache Poisoning and Multiple O/S 1669 Predictable IP ID Vulnerability", 2007, 1670 . 1674 [IPID-DEV] Klein, A. and B. Pinkas, "From IP ID to Device ID and 1675 KASLR Bypass (Extended Version)", June 2019, 1676 . 1678 [I-D.irtf-pearg-numeric-ids-history] 1679 Gont, F. and I. Arce, "Unfortunate History of Transient 1680 Numeric Identifiers", Work in Progress, Internet-Draft, 1681 draft-irtf-pearg-numeric-ids-history-09, 27 January 2022, 1682 . 1685 [I-D.gont-numeric-ids-sec-considerations] 1686 Gont, F. and I. Arce, "Security Considerations for 1687 Transient Numeric Identifiers Employed in Network 1688 Protocols", Work in Progress, Internet-Draft, draft-gont- 1689 numeric-ids-sec-considerations-06, 5 December 2020, 1690 . 1693 [RFC7721] Cooper, A., Gont, F., and D. Thaler, "Security and Privacy 1694 Considerations for IPv6 Address Generation Mechanisms", 1695 RFC 7721, DOI 10.17487/RFC7721, March 2016, 1696 . 1698 [RFC7707] Gont, F. and T. Chown, "Network Reconnaissance in IPv6 1699 Networks", RFC 7707, DOI 10.17487/RFC7707, March 2016, 1700 . 1702 [TCPT-uptime] 1703 McDanel, B., "TCP Timestamping - Obtaining System Uptime 1704 Remotely", 14 March 2001, 1705 . 1707 [SipHash] Aumasson, J. P. and D. J. Bernstein, "SipHash: a fast 1708 short-input PRF", 2012, 1709 . 1711 [BLAKE3] O'Connor, J., Aumasson, J. P., Neves, S., and Z. Wilcox- 1712 O'Hearn, "BLAKE3: one function, fast everywhere", 2020, 1713 . 1715 [FIPS-SHS] NIST, "Secure Hash Standard (SHS)", Federal Information 1716 Processing Standards Publication 180-4, August 2015, 1717 . 1720 Appendix A. Algorithms and Techniques with Known Issues 1722 The following subsections discuss algorithms and techniques with 1723 known negative security and privacy implications. 1725 NOTE: 1726 As discussed in Section 1, the use of cryptographic techniques 1727 might allow for the safe use of some of these algorithms and 1728 techniques. However, this should be evaluated on a case by case 1729 basis. 1731 A.1. Predictable Linear Identifiers Algorithm 1733 One of the most trivial ways to achieve uniqueness with a low 1734 identifier reuse frequency is to produce a linear sequence. This 1735 type of algorithm has been employed in the past to generate 1736 identifiers of Categories #1, #2, and #4 (please see Section 6 for an 1737 analysis of these categories). 1739 For example, the following algorithm has been employed (see e.g. 1740 [Morris1985], [Shimomura1995], [Silbersack2005] and [CPNI-TCP]) in a 1741 number of operating systems for selecting IP fragment IDs, TCP 1742 ephemeral ports, etc.: 1744 /* Initialization code */ 1746 next_id = min_id; 1747 id_inc= 1; 1749 /* Transient Numeric ID selection function */ 1751 id_range = max_id - min_id + 1; 1752 retry = id_range; 1754 do { 1755 if (next_id == max_id) { 1756 next_id = min_id; 1757 } 1758 else { 1759 next_id = next_id + id_inc; 1760 } 1762 if (suitable_id(next_id)) { 1763 return next_id; 1764 } 1766 retry--; 1768 } while (retry > 0); 1770 return ERROR; 1772 NOTE: 1773 suitable_id() is a function that checks whether the resulting 1774 identifier is acceptable (e.g., whether it's in use, etc.). 1776 For obvious reasons, this algorithm results in predictable sequences. 1777 Since a global counter is used to generate the transient numeric 1778 identifiers ("next_id" in the example above), an entity that learns 1779 one numeric identifier can infer past numeric identifiers and predict 1780 future values to be generated by the same algorithm. Since the value 1781 employed for the increments is known (such as "1" in this case), an 1782 attacker can sample two values, and learn the number of identifiers 1783 that have been were generated in-between the two sampled values. 1784 Furthermore, if the counter is initialized e.g. when the system its 1785 bootstrapped to some known value, the algorithm will leak additional 1786 information, such as the number of transmitted fragmented datagrams 1787 in the case of an IP ID generator [Sanfilippo1998a], or the system 1788 uptime in the case of TCP timestamps [TCPT-uptime]. 1790 A.2. Random-Increments Algorithm 1792 This algorithm offers a middle ground between the algorithms that 1793 generate randomized transient numeric identifiers (such as those 1794 described in Section 7.1.1 and Section 7.1.2), and those that 1795 generate identifiers with a predictable monotonically-increasing 1796 function (see Appendix A.1). 1798 /* Initialization code */ 1800 next_id = random(); /* Initialization value */ 1801 id_rinc = 500; /* Determines the trade-off */ 1803 /* Transient Numeric ID selection function */ 1805 id_range = max_id - min_id + 1; 1806 retry = id_range; 1808 do { 1809 /* Random increment */ 1810 id_inc = (random() % id_rinc) + 1; 1812 if ( (max_id - next_id) >= id_inc){ 1813 next_id = next_id + id_inc; 1814 } 1815 else { 1816 next_id = min_id + id_inc - (max_id - next_id); 1817 } 1819 if (suitable_id(next_id)) { 1820 return next_id; 1821 } 1823 retry = retry - id_inc; 1825 } while (retry > 0); 1827 return ERROR; 1829 This algorithm aims at producing a global monotonically-increasing 1830 sequence of transient numeric identifiers, while avoiding the use of 1831 fixed increments, which would lead to trivially predictable 1832 sequences. The value "id_inc" allows for direct control of the 1833 trade-off between unpredictability and identifier reuse frequency. 1834 The smaller the value of "id_inc", the more similar this algorithm is 1835 to a predicable, global linear ID generation algorithm (as the one in 1836 Appendix A.1). The larger the value of "id_inc", the more similar 1837 this algorithm is to the algorithm described in Section 7.1.1 of this 1838 document. 1840 When the identifiers wrap, there is a risk of collisions of transient 1841 numeric identifiers (i.e., identifier reuse). Therefore, "id_inc" 1842 should be selected according to the following criteria: 1844 * It should maximize the wrapping time of the identifier space. 1846 * It should minimize identifier reuse frequency. 1848 * It should maximize unpredictability. 1850 Clearly, these are competing goals, and the decision of which value 1851 of "id_inc" to use is a trade-off. Therefore, the value of "id_inc" 1852 is at times a configurable parameter so that system administrators 1853 can make the trade-off for themselves. We note that the alternative 1854 algorithms discussed throughout this document offer better 1855 interoperability, security and privacy properties than this 1856 algorithm, and hence implementation of this algorithm is discouraged. 1858 A.3. Re-using Identifiers Across Different Contexts 1860 Employing the same identifier across contexts in which stability is 1861 not required (i.e. overloading the semantics of transient numeric 1862 identifier) usually has negative security and privacy implications. 1864 For example, in order to generate transient numeric identifiers of 1865 Category #2 or Category #3, an implementation or specification might 1866 be tempted to employ a source for the numeric identifiers which is 1867 known to provide unique values, but that may also be predictable or 1868 leak information related to the entity generating the identifier. 1869 This technique has been employed in the past for e.g. generating IPv6 1870 IIDs by re-using the MAC address of the underlying network interface. 1871 However, as noted in [RFC7721] and [RFC7707], embedding link-layer 1872 addresses in IPv6 IIDs not only results in predictable values, but 1873 also leaks information about the manufacturer of the underlying 1874 network interface card, allows for network activity correlation, and 1875 makes address-based scanning attacks feasible. 1877 Authors' Addresses 1879 Fernando Gont 1880 EdgeUno 1881 Segurola y Habana 4310 7mo piso 1882 Ciudad Autonoma de Buenos Aires 1883 Buenos Aires 1884 Argentina 1886 Email: fernando.gont@edgeuno.com 1887 URI: https://www.edgeuno.com 1889 Ivan Arce 1890 Quarkslab 1891 Segurola y Habana 4310 7mo piso 1892 Ciudad Autonoma de Buenos Aires 1893 Buenos Aires 1894 Argentina 1896 Email: iarce@quarkslab.com 1897 URI: https://www.quarkslab.com