idnits 2.17.1 draft-gont-numeric-ids-generation-03.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- -- The document has an IETF Trust Provisions (28 Dec 2009) Section 6.c(ii) Publication Limitation clause. If this document is intended for submission to the IESG for publication, this constitutes an error. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- No issues found here. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year == The document doesn't use any RFC 2119 keywords, yet seems to have RFC 2119 boilerplate text. -- The document date (March 11, 2019) is 1870 days in the past. Is this intentional? -- Found something which looks like a code comment -- if you have code sections in the document, please surround them with '' and '' lines. Checking references for intended status: Best Current Practice ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Unused Reference: 'Bellovin1989' is defined on line 1051, but no explicit reference was found in the text == Unused Reference: 'CERT2001' is defined on line 1061, but no explicit reference was found in the text == Unused Reference: 'Fyodor2004' is defined on line 1079, but no explicit reference was found in the text == Unused Reference: 'USCERT2001' is defined on line 1179, but no explicit reference was found in the text ** Obsolete normative reference: RFC 793 (Obsoleted by RFC 9293) ** Obsolete normative reference: RFC 2460 (Obsoleted by RFC 8200) ** Obsolete normative reference: RFC 6528 (Obsoleted by RFC 9293) Summary: 3 errors (**), 0 flaws (~~), 6 warnings (==), 3 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Network Working Group F. Gont 3 Internet-Draft SI6 Networks / UTN-FRH 4 Intended status: Best Current Practice I. Arce 5 Expires: September 12, 2019 Quarkslab 6 March 11, 2019 8 On the Generation of Transient Numeric Identifiers 9 draft-gont-numeric-ids-generation-03 11 Abstract 13 This document performs an analysis of the security and privacy 14 implications of different types of "numeric identifiers" used in IETF 15 protocols, and tries to categorize them based on their 16 interoperability requirements and the associated failure severity 17 when such requirements are not met. Subsequently, it provides advice 18 on possible algorithms that could be employed to satisfy the 19 interoperability requirements of each identifier type, while 20 minimizing the security and privacy implications, thus providing 21 guidance to protocol designers and protocol implementers. Finally, 22 describes a number of algorithms that have been employed in real 23 implementations to generate transient numeric identifiers and 24 analyzes their security and privacy properties. 26 Status of This Memo 28 This Internet-Draft is submitted in full conformance with the 29 provisions of BCP 78 and BCP 79. 31 Internet-Drafts are working documents of the Internet Engineering 32 Task Force (IETF). Note that other groups may also distribute 33 working documents as Internet-Drafts. The list of current Internet- 34 Drafts is at https://datatracker.ietf.org/drafts/current/. 36 Internet-Drafts are draft documents valid for a maximum of six months 37 and may be updated, replaced, or obsoleted by other documents at any 38 time. It is inappropriate to use Internet-Drafts as reference 39 material or to cite them other than as "work in progress." 41 This Internet-Draft will expire on September 12, 2019. 43 Copyright Notice 45 Copyright (c) 2019 IETF Trust and the persons identified as the 46 document authors. All rights reserved. 48 This document is subject to BCP 78 and the IETF Trust's Legal 49 Provisions Relating to IETF Documents 50 (https://trustee.ietf.org/license-info) in effect on the date of 51 publication of this document. Please review these documents 52 carefully, as they describe your rights and restrictions with respect 53 to this document. Code Components extracted from this document must 54 include Simplified BSD License text as described in Section 4.e of 55 the Trust Legal Provisions and are provided without warranty as 56 described in the Simplified BSD License. 58 This document may not be modified, and derivative works of it may not 59 be created, and it may not be published except as an Internet-Draft. 61 Table of Contents 63 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 64 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 4 65 3. Threat Model . . . . . . . . . . . . . . . . . . . . . . . . 5 66 4. Issues with the Specification of Identifiers . . . . . . . . 5 67 5. Protocol Failure Severity . . . . . . . . . . . . . . . . . . 6 68 6. Categorizing Identifiers . . . . . . . . . . . . . . . . . . 6 69 7. Common Algorithms for Identifier Generation . . . . . . . . . 9 70 7.1. Category #1: Uniqueness (soft failure) . . . . . . . . . 9 71 7.2. Category #2: Uniqueness (hard failure) . . . . . . . . . 10 72 7.3. Category #3: Uniqueness, constant within context (soft- 73 failure) . . . . . . . . . . . . . . . . . . . . . . . . 10 74 7.4. Category #4: Uniqueness, monotonically increasing within 75 context (hard failure) . . . . . . . . . . . . . . . . . 12 76 8. Common Vulnerabilities Associated with Transient Numeric 77 Identifiers . . . . . . . . . . . . . . . . . . . . . . . . . 17 78 8.1. Network Activity Correlation . . . . . . . . . . . . . . 17 79 8.2. Information Leakage . . . . . . . . . . . . . . . . . . . 17 80 8.3. Exploitation of Semantics of Transient Numeric 81 Identifiers . . . . . . . . . . . . . . . . . . . . . . . 19 82 8.4. Exploitation of Collisions of Transient Numeric 83 Identifiers . . . . . . . . . . . . . . . . . . . . . . . 19 84 9. Vulnerability Analysis of Specific Transient Numeric 85 Identifiers Categories . . . . . . . . . . . . . . . . . . . 19 86 9.1. Category #1: Uniqueness (soft failure) . . . . . . . . . 19 87 9.2. Category #2: Uniqueness (hard failure) . . . . . . . . . 20 88 9.3. Category #3: Uniqueness, constant within context (soft 89 failure) . . . . . . . . . . . . . . . . . . . . . . . . 20 90 9.4. Category #4: Uniqueness, monotonically increasing within 91 context (hard failure) . . . . . . . . . . . . . . . . . 20 92 10. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 22 93 11. Security Considerations . . . . . . . . . . . . . . . . . . . 22 94 12. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 22 95 13. References . . . . . . . . . . . . . . . . . . . . . . . . . 23 96 13.1. Normative References . . . . . . . . . . . . . . . . . . 23 97 13.2. Informative References . . . . . . . . . . . . . . . . . 24 98 Appendix A. Flawed Algorithms . . . . . . . . . . . . . . . . . 27 99 A.1. Predictable Linear Identifiers Algorithm . . . . . . . . 27 100 A.2. Random-Increments Algorithm . . . . . . . . . . . . . . . 28 101 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 30 103 1. Introduction 105 Network protocols employ a variety of numeric identifiers for 106 different protocol entities, ranging from DNS Transaction IDs (TxIDs) 107 to transport protocol numbers (e.g. TCP ports) or IPv6 Interface 108 Identifiers (IIDs). These identifiers usually have specific 109 properties that must be satisfied such that they do not result in 110 negative interoperability implications (e.g. uniqueness during a 111 specified period of time), and associated failure severities when 112 such properties are not met, ranging from soft to hard failures. 114 For more than 30 years, a large number of implementations of the TCP/ 115 IP protocol suite have been subject to a variety of attacks, with 116 effects ranging from Denial of Service (DoS) or data injection, to 117 information leakage that could be exploited for pervasive monitoring 118 [RFC7528]. The root of these issues has been, in many cases, the 119 poor selection of identifiers in such protocols, usually as a result 120 of an insufficient or misleading specification. While it is 121 generally trivial to identify an algorithm that can satisfy the 122 interoperability requirements for a given identifier, there exists 123 practical evidence that doing so without negatively affecting the 124 security and/or privacy properties of the aforementioned protocols is 125 prone to error. 127 For example, implementations have been subject to security and/or 128 privacy issues resulting from: 130 o Predictable TCP sequence numbers 132 o Predictable transport protocol numbers 134 o Predictable IPv4 or IPv6 Fragment Identifiers 136 o Predictable IPv6 IIDs 138 o Predictable DNS TxIDs 140 Recent history indicates that when new protocols are standardized or 141 new protocol implementations are produced, the security and privacy 142 properties of the associated identifiers tend to be overlooked and 143 inappropriate algorithms to generate identifier values are either 144 suggested in the specification or selected by implementers. As a 145 result, we believe that advice in this area is warranted. 147 This document contains a non-exhaustive survey of identifiers 148 employed in various IETF protocols, and aims to categorize such 149 identifiers based on their interoperability requirements, and the 150 associated failure severity when such requirements are not met. 151 Subsequently, it provides advice on possible algorithms that could be 152 employed to satisfy the interoperability requirements of each 153 category, while minimizing the associated security and privacy 154 implications. Finally, it analyzes several algorithms that have been 155 employed in real implementations to meet such requirements and 156 analyzes their security and privacy properties. 158 2. Terminology 160 Identifier: 161 A data object in a protocol specification that can be used to 162 definitely distinguish a protocol object (a datagram, network 163 interface, transport protocol endpoint, session, etc) from all 164 other objects of the same type, in a given context. Identifiers 165 are usually defined as a series of bits and represented using 166 integer values. We note that different identifiers may have 167 additional requirements or properties depending on their specific 168 use in a protocol. We use the term "identifier" as a generic term 169 to refer to any data object in a protocol specification that 170 satisfies the identification property stated above. 172 Failure Severity: 173 The consequences of a failure to comply with the interoperability 174 requirements of a given identifier. Severity considers the worst 175 potential consequence of a failure, determined by the system 176 damage and/or time lost to repair the failure. In this document 177 we define two types of failure severity: "soft" and "hard". 179 Hard Failure: 180 A hard failure is a non-recoverable condition in which a protocol 181 does not operate in the prescribed manner or it operates with 182 excessive degradation of service. For example, an established TCP 183 connection that is aborted due to an error condition constitutes, 184 from the point of view of the transport protocol, a hard failure, 185 since it enters a state from which normal operation cannot be 186 recovered. 188 Soft Failure: 189 A soft failure is a recoverable condition in which a protocol does 190 not operate in the prescribed manner but normal operation can be 191 resumed automatically in a short period of time. For example, a 192 simple packet-loss event that is subsequently recovered with a 193 retransmission can be considered a soft failure. 195 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 196 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 197 document are to be interpreted as described in RFC 2119 [RFC2119]. 199 3. Threat Model 201 Throughout this document, we assume an attacker does not have 202 physical or logical device to the device(s) being attacked. We 203 assume the attacker can simply send any traffic to the target 204 devices, to e.g. sample identifiers employed by such devices. 206 4. Issues with the Specification of Identifiers 208 While assessing protocol specifications regarding the use of 209 identifiers, we found that most of the issues discussed in this 210 document arise as a result of one of the following: 212 o Protocol specifications which under-specify the requirements for 213 their identifiers 215 o Protocol specifications that over-specify their identifiers 217 o Protocol implementations that simply fail to comply with the 218 specified requirements 220 A number of protocol implementations (too many of them) simply 221 overlook the security and privacy implications of identifiers. 222 Examples of them are the specification of TCP port numbers in 223 [RFC0793], the specification of TCP sequence numbers in [RFC0793], or 224 the specification of the DNS TxID in [RFC1035]. 226 On the other hand, there are a number of protocol specifications that 227 over-specify some of their associated protocol identifiers. For 228 example, [RFC4291] essentially results in link-layer addresses being 229 embedded in the IPv6 Interface Identifiers (IIDs) when the 230 interoperability requirement of uniqueness could be achieved in other 231 ways that do not result in negative security and privacy implications 232 [RFC7721]. Similarly, [RFC2460] suggests the use of a global counter 233 for the generation of Fragment Identification values, when the 234 interoperability properties of uniqueness per {Src IP, Dst IP} could 235 be achieved with other algorithms that do not result in negative 236 security and privacy implications. 238 Finally, there are protocol implementations that simply fail to 239 comply with existing protocol specifications. For example, some 240 popular operating systems (notably Microsoft Windows) still fail to 241 implement randomization of transport protocol ephemeral ports, as 242 specified in [RFC6056]. 244 5. Protocol Failure Severity 246 Section 2 defines the concept of "Failure Severity" and two types of 247 failures that we employ throughout this document: soft and hard. 249 Our analysis of the severity of a failure is performed from the point 250 of view of the protocol in question. However, the corresponding 251 severity on the upper application or protocol may not be the same as 252 that of the protocol in question. For example, a TCP connection that 253 is aborted may or may not result in a hard failure of the upper 254 application: if the upper application can establish a new TCP 255 connection without any impact on the application, a hard failure at 256 the TCP protocol may have no severity at the application level. On 257 the other hand, if a hard failure of a TCP connection results in 258 excessive degradation of service at the application layer, it will 259 also result in a hard failure at the application. 261 6. Categorizing Identifiers 263 This section includes a non-exhaustive survey of identifiers, and 264 proposes a number of categories that can accommodate these 265 identifiers based on their interoperability requirements and their 266 failure modes (soft or hard) 267 +------------+--------------------------------------+---------------+ 268 | Identifier | Interoperability Requirements | Failure | 269 | | | Severity | 270 +------------+--------------------------------------+---------------+ 271 | IPv6 Frag | Uniqueness (for IP address pair) | Soft/Hard (1) | 272 | ID | | | 273 +------------+--------------------------------------+---------------+ 274 | IPv6 IID | Uniqueness (and constant within IPv6 | Soft (3) | 275 | | prefix) (2) | | 276 +------------+--------------------------------------+---------------+ 277 | TCP SEQ | Monotonically-increasing | Hard (4) | 278 +------------+--------------------------------------+---------------+ 279 | TCP eph. | Uniqueness (for connection ID) | Hard | 280 | port | | | 281 +------------+--------------------------------------+---------------+ 282 | IPv6 Flow | Uniqueness | None (5) | 283 | L. | | | 284 +------------+--------------------------------------+---------------+ 285 | DNS TxID | Uniqueness | None (6) | 286 +------------+--------------------------------------+---------------+ 288 Table 1: Survey of Identifiers 290 Notes: 292 (1) 293 While a single collision of Fragment ID values would simply lead 294 to a single packet drop (and hence a "soft" failure), repeated 295 collisions at high data rates might trash the Fragment ID space, 296 leading to a hard failure [RFC4963]. 298 (2) 299 While the interoperability requirements are simply that the 300 Interface ID results in a unique IPv6 address, for operational 301 reasons it is typically desirable that the resulting IPv6 address 302 (and hence the corresponding Interface ID) be constant within each 303 network [I-D.ietf-6man-default-iids] [RFC7217]. 305 (3) 306 While IPv6 Interface IDs must result in unique IPv6 addresses, 307 IPv6 Duplicate Address Detection (DAD) [RFC4862] allows for the 308 detection of duplicate Interface IDs/addresses, and hence such 309 Interface ID collisions can be recovered. 311 (4) 312 In theory there are no interoperability requirements for TCP 313 sequence numbers, since the TIME-WAIT state and TCP's "quiet time" 314 take care of old segments from previous incarnations of the 315 connection. However, a widespread optimization allows for a new 316 incarnation of a previous connection to be created if the Initial 317 Sequence Number (ISN) of the incoming SYN is larger than the last 318 sequence number seen in that direction for the previous 319 incarnation of the connection. Thus, monotonically-increasing TCP 320 sequence numbers allow for such optimization to work as expected 321 [RFC6528]. 323 (5) 324 The IPv6 Flow Label is typically employed for load sharing 325 [RFC7098], along with the Source and Destination IPv6 addresses. 326 Reuse of a Flow Label value for the same set {Source Address, 327 Destination Address} would typically cause both flows to be 328 multiplexed into the same link. However, as long as this does not 329 occur deterministically, it will not result in any negative 330 implications. 332 (6) 333 DNS TxIDs are employed, together with the Source Address, 334 Destination Address, Source Port, and Destination Port, to match 335 DNS requests and responses. However, since an implementation 336 knows which DNS requests were sent for that set of {Source 337 Address, Destination Address, Source Port, and Destination Port, 338 DNS TxID}, a collision of TxID would result, if anything, in a 339 small performance penalty (the response would be discarded when it 340 is found that it does not answer the query sent in the 341 corresponding DNS query). 343 Based on the survey above, we can categorize identifiers as follows: 345 +-----+---------------------------------------+---------------------+ 346 | Cat | Category | Sample Proto IDs | 347 | # | | | 348 +-----+---------------------------------------+---------------------+ 349 | 1 | Uniqueness (soft failure) | IPv6 Flow L., DNS | 350 | | | TxIDs | 351 +-----+---------------------------------------+---------------------+ 352 | 2 | Uniqueness (hard failure) | IPv6 Frag ID, TCP | 353 | | | ephemeral port | 354 +-----+---------------------------------------+---------------------+ 355 | 3 | Uniqueness, constant within context | IPv6 IIDs | 356 | | (soft failure) | | 357 +-----+---------------------------------------+---------------------+ 358 | 4 | Uniqueness, monotonically increasing | TCP ISN | 359 | | within context (hard failure) | | 360 +-----+---------------------------------------+---------------------+ 362 Table 2: Identifier Categories 364 We note that Category #4 could be considered a generalized case of 365 category #3, in which a monotonically increasing element is added to 366 a constant (within context) element, such that the resulting 367 identifiers are monotonically increasing within a specified context. 368 That is, the same algorithm could be employed for both #3 and #4, 369 given appropriate parameters. 371 7. Common Algorithms for Identifier Generation 373 The following subsections describe common algorithms found for 374 Protocol ID generation for each of the categories above. 376 7.1. Category #1: Uniqueness (soft failure) 378 7.1.1. Simple Randomization Algorithm 380 /* Ephemeral port selection function */ 381 id_range = max_id - min_id + 1; 382 next_id = min_id + (random() % id_range); 383 count = next_id; 385 do { 386 if(check_suitable_id(next_id)) 387 return next_id; 389 if (next_id == max_id) { 390 next_id = min_id; 391 } else { 392 next_id++; 393 } 395 count--; 396 } while (count > 0); 398 return ERROR; 400 Note: 401 random() is a function that returns a pseudo-random unsigned 402 integer number of appropriate size. Note that the output needs to 403 be unpredictable, and typical implementations of POSIX random() 404 function do not necessarily meet this requirement. See [RFC4086] 405 for randomness requirements for security. 407 The function check_suitable_id() can check, when possible, whether 408 this identifier is e.g. already in use. When already used, this 409 algorithm selects the next available protocol ID. 411 All the variables (in this and all the algorithms discussed in 412 this document) are unsigned integers. 414 This algorithm does not suffer from any of the issues discussed in 415 Section 8. 417 7.1.2. Another Simple Randomization Algorithm 419 The following pseudo-code illustrates another algorithm for selecting 420 a random identifier in which, in the event the identifier is found to 421 be not suitable (e.g., already in use), another identifier is 422 selected randomly: 424 id_range = max_id - min_id + 1; 425 next_id = min_id + (random() % id_range); 426 count = id_range; 428 do { 429 if(check_suitable_id(next_id)) 430 return next_id; 432 next_id = min_id + (random() % id_range); 433 count--; 434 } while (count > 0); 436 return ERROR; 438 This algorithm might be unable to select an identifier (i.e., return 439 "ERROR") even if there are suitable identifiers available, when there 440 are a large number of identifiers "in use". 442 This algorithm does not suffer from any of the issues discussed in 443 Section 8. 445 7.2. Category #2: Uniqueness (hard failure) 447 One of the most trivial approaches for achieving uniqueness for an 448 identifier (with a hard failure mode) is to implement a linear 449 function. As a result, all of the algorithms described in 450 Section 7.4 are of use for complying the requirements of this 451 identifier category. 453 7.3. Category #3: Uniqueness, constant within context (soft-failure) 455 The goal of this algorithm is to produce identifiers that are 456 constant for a given context, but that change when the aforementioned 457 context changes. 459 Keeping one value for each possible "context" may in many cases be 460 considered too onerous in terms of memory requirements. As a 461 workaround, the following algorithm employs a calculated technique 462 (as opposed to keeping state in memory) to maintain the constant 463 identifier for each given context. 465 In the following algorithm, the function F() provides (statelessly) a 466 constant identifier for each given context. 468 /* Protocol ID selection function */ 469 id_range = max_id - min_id + 1; 471 counter = 0; 473 do { 474 offset = F(CONTEXT, counter, secret_key); 475 next_id = min_id + (offset % id_range); 477 if(check_suitable_id(next_id)) 478 return next_id; 480 counter++; 482 } while (counter <= MAX_RETRIES); 484 return ERROR; 486 The function F() provides a "per-CONTEXT" constant identifier for a 487 given context. 'offset' may take any value within the storage type 488 range since we are restricting the resulting identifier to be in the 489 range [min_id, max_id] in a similar way as in the algorithm described 490 in Section 7.1.1. Collisions can be recovered by incrementing the 491 'counter' variable and recomputing F(). 493 The function F() should be a cryptographic hash function like SHA-256 494 [FIPS-SHS]. Note: MD5 [RFC1321] is considered unacceptable for F() 495 [RFC6151]. CONTEXT is the concatenation of all the elements that 496 define a given context. For example, if this algorithm is expected 497 to produce identifiers that are unique per network interface card 498 (NIC) and SLAAC autoconfiguration prefix, the CONTEXT should be the 499 concatenation of e.g. the interface index and the SLAAC 500 autoconfiguration prefix (please see [RFC7217] for an implementation 501 of this algorithm for the generation of IPv6 IIDs). 503 The secret should be chosen to be as random as possible (see 504 [RFC4086] for recommendations on choosing secrets). 506 For obvious reasons, the transient network identifiers generated with 507 this algorithm allow for network activity correlation within 508 "CONTEXT". However, this is essentially a design goal of this 509 category of transient network identifiers. 511 7.4. Category #4: Uniqueness, monotonically increasing within context 512 (hard failure) 514 7.4.1. Per-context Counter Algorithm 516 One possible way to achieve low identifier reuse frequency while 517 still avoiding predictable sequences would be to employ a per-context 518 counter, as opposed to a global counter. Such an algorithm could be 519 described as follows: 521 /* Initialization at system boot time. Could be random */ 522 id_inc= 1; 524 /* Identifier selection function */ 525 count = max_id - min_id + 1; 527 if(lookup_counter(CONTEXT) == ERROR){ 528 create_counter(CONTEXT); 529 } 531 next_id= lookup_counter(CONTEXT); 533 do { 534 if (next_id == max_id) { 535 next_id = min_id; 536 } 537 else { 538 next_id = next_id + id_inc; 539 } 541 if (check_suitable_id(next_id)){ 542 store_counter(CONTEXT, next_id); 543 return next_id; 544 } 546 count--; 548 } while (count > 0); 550 store_counter(CONTEXT, next_id); 551 return ERROR; 553 NOTE: 555 lookup_counter() returns the current counter for a given context, 556 or an error condition if such a counter does not exist. 558 create_counter() creates a counter for a given context, and 559 initializes such counter to a random value. 561 store_counter() saves (updates) the current counter for a given 562 context. 564 check_suitable_id() is a function that checks whether the 565 resulting identifier is acceptable (e.g., whether its in use, 566 etc.). 568 Essentially, whenever a new identifier is to be selected, the 569 algorithm checks whether there there is a counter for the 570 corresponding context. If there is, such counter is incremented to 571 obtain the new identifier, and the new identifier updates the 572 corresponding counter. If there is no counter for such context, a 573 new counter is created an initialized to a random value, and used as 574 the new identifier. 576 This algorithm produces a per-context counter, which results in one 577 linear function for each context. Since the origin of each "line" is 578 a random value, the resulting values are unknown to an off-path 579 attacker. 581 This algorithm has the following drawbacks: 583 o If, as a result of resource management, the counter for a given 584 context must be removed, the last identifier value used for that 585 context will be lost. Thus, if subsequently an identifier needs 586 to be generated for such context, that counter will need to be 587 recreated and reinitialized to random value, thus possibly leading 588 to reuse/collistion of identifiers. 590 o If the identifiers are predictable by the destination system 591 (e.g., the destination host represents the "context"), a 592 vulnerable host might possibly leak to third parties the 593 identifiers used by other hosts to send traffic to it (i.e., a 594 vulnerable Host B could leak to Host C the identifier values that 595 Host A is using to send packets to Host B). Appendix A of 596 [RFC7739] describes one possible scenario for such leakage in 597 detail. 599 Otherwise, the identifiers produced by this algorithm do not suffer 600 from the other issues discussed in Section 8. 602 7.4.2. Simple Hash-Based Algorithm 604 The goal of this algorithm is to produce monotonically-increasing 605 sequences, with a randomized initial value, for each given context. 606 For example, if the identifiers being generated must be unique for 607 each {src IP, dst IP} set, then each possible combination of {src IP, 608 dst IP} should have a corresponding "next_id" value. 610 Keeping one value for each possible "context" may in many cases be 611 considered too onerous in terms of memory requirements. As a 612 workaround, the following algorithm employs a calculated technique 613 (as opposed to keeping state in memory) to maintain the random offset 614 for each possible context. 616 In the following algorithm, the function F() provides (statelessly) a 617 random offset for each given context. 619 /* Initialization at system boot time. Could be random. */ 620 counter = 0; 622 /* Protocol ID selection function */ 623 id_range = max_id - min_id + 1; 624 offset = F(CONTEXT, secret_key); 625 count = id_range; 627 do { 628 next_id = min_id + 629 (counter + offset) % id_range; 631 counter++; 633 if(check_suitable_id(next_id)) 634 return next_id; 636 count--; 638 } while (count > 0); 640 return ERROR; 642 The function F() provides a "per-CONTEXT" fixed offset within the 643 identifier space. Both the 'offset' and 'counter' variables may take 644 any value within the storage type range since we are restricting the 645 resulting identifier to be in the range [min_id, max_id] in a similar 646 way as in the algorithm described in Section 7.1.1. This allows us 647 to simply increment the 'counter' variable and rely on the unsigned 648 integer to wrap around. 650 The function F() should be a cryptographic hash function like SHA-256 651 [FIPS-SHS]. Note: MD5 [RFC1321] is considered unacceptable for F() 652 [RFC6151]. CONTEXT is the concatenation of all the elements that 653 define a given context. For example, if this algorithm is expected 654 to produce identifiers that are monotonically-increasing for each set 655 (Source IP Address, Destination IP Address), the CONTEXT should be 656 the concatenation of these two values. 658 The secret should be chosen to be as random as possible (see 659 [RFC4086] for recommendations on choosing secrets). 661 It should be noted that, since this algorithm uses a global counter 662 ("counter") for selecting identifiers, this algorithm produces an 663 information leakage (as described in Section 8.2). For example, if 664 an attacker could force a client to periodically establish a new TCP 665 connection to an attacker-controlled machine (or through an attacker- 666 observable routing path), the attacker could subtract consecutive 667 source port values to obtain the number of outgoing TCP connections 668 established globally by the target host within that time period (up 669 to wrap-around issues and five-tuple collisions, of course). 671 7.4.3. Double-Hash Algorithm 673 A trade-off between maintaining a single global 'counter' variable 674 and maintaining 2**N 'counter' variables (where N is the width of the 675 result of F()) could be achieved as follows. The system would keep 676 an array of TABLE_LENGTH integers, which would provide a separation 677 of the increment of the 'counter' variable. This improvement could 678 be incorporated into the algorithm from Section 7.4.2 as follows: 680 /* Initialization at system boot time */ 681 for(i = 0; i < TABLE_LENGTH; i++) 682 table[i] = random(); 684 id_inc = 1; 686 /* Protocol ID selection function */ 687 id_range = max_id - min_id + 1; 688 offset = F(CONTEXT, secret_key1); 689 index = G(CONTEXT, secret_key2); 690 count = id_range; 692 do { 693 next_id = min_id + (offset + table[index]) % id_range; 694 table[index] = table[index] + id_inc; 696 if(check_suitable_id(next_id)) 697 return next_id; 699 count--; 701 } while (count > 0); 703 return ERROR; 705 'table[]' could be initialized with random values, as indicated by 706 the initialization code in pseudo-code above. The function G() 707 should be a cryptographic hash function. It should use the same 708 CONTEXT as F(), and a secret key value to compute a value between 0 709 and (TABLE_LENGTH-1). Alternatively, G() could take an "offset" as 710 input, and perform the exclusive-or (XOR) operation between all the 711 bytes in 'offset'. 713 The array 'table[]' assures that successive identifiers for a given 714 context will be monotonically-increasing. However, the increments 715 space is separated into TABLE_LENGTH different spaces, and thus 716 identifier reuse frequency will be (probabilistically) lower than 717 that of the algorithm in Section 7.4.2. That is, the generation of 718 identifier for one given context will not necessarily result in 719 increments in the identifiers for other contexts. 721 It is interesting to note that the size of 'table[]' does not limit 722 the number of different identifier sequences, but rather separates 723 the *increments* into TABLE_LENGTH different spaces. The identifier 724 sequence will result from adding the corresponding entry of 'table[]' 725 to the variable 'offset', which selects the actual identifier 726 sequence (as in the algorithm from Section 7.4.2). 728 An attacker can perform traffic analysis for any "increment space" 729 (i.e., context) into which the attacker has "visibility" -- namely, 730 the attacker can force a node to generate identifiers where G(offset) 731 identifies the target "increment space". However, the attacker's 732 ability to perform traffic analysis is very reduced when compared to 733 the predictable linear identifiers (described in Appendix A.1) and 734 the hash-based identifiers (described in Section 7.4.2). 735 Additionally, an implementation can further limit the attacker's 736 ability to perform traffic analysis by further separating the 737 increment space (that is, using a larger value for TABLE_LENGTH) and/ 738 or by randomizing the increments. 740 Otherwise, this algorithm does not suffer from the issues discussed 741 in Section 8. 743 8. Common Vulnerabilities Associated with Transient Numeric Identifiers 745 8.1. Network Activity Correlation 747 An idetifier that is preditable or stable within a given context 748 allows for network activity correlation within that context. 750 For example, a stable IPv6 Interface Identifier allows for network 751 activity to be correlated for the context in which that address is 752 stable [RFC7721]. A stable-per-network (as in [RFC7217] allows for 753 network activity correlation within a network, whereas a constant 754 IPv6 Interface Identifier allows not only network activity 755 correlation within the same network, but also across networks ("host 756 tracking"). 758 Predictable transient numeric identifiers can also allow for network 759 activity correlation. For example, a node that generates TCP ISNs 760 with a global counter will typically allow network activity 761 correlation even as it roams across networks, since the communicating 762 nodes could infer the identity of the node based on the TCP ISNs 763 employed for subsequent communication instances. Similarly, a node 764 that generates predictable IPv6 Fragment Identification values could 765 be subject to network activity correlation (see e.g. 766 [Bellovin2002]). 768 8.2. Information Leakage 770 Transient numeric identifiers that are not randomized can leak out 771 information to other communicating nodes. For example, it is common 772 to generate identifiers like: 774 ID = offset(CONTEXT_1) + linear(CONTEXT_2); 776 This generic expression generates identifiers by adding a linear 777 function to an offset. The offset is constant within a given 778 context, whereas linear() is a linear function for a given context 779 (possibly different to that of offset()). Identifiers generated with 780 this expression will generally be predictable within CONTEXT_1. 781 Thus, CONTEXT_1 essentially specifies e.g. the context within which 782 network activity correlation is possible thanks to these identifiers. 783 When CONTEXT_1 is "global" (e.g., offset() is simply a constant 784 value), then all the corresponding transient numeric identifiers 785 become predictable in all contexts. 787 NOTE: If offset() has a global context and the specific value is 788 known, the resulting identifiers may leak even more information. 789 For example, the if Fragment Identification values are generated 790 with the generic function above, and CONTEXT_1 is "global", then 791 the corresponding identifiers will leak the number of fragmented 792 datagrams sent for CONTEXT_2. If both CONTEXT_1 and CONTEXT_2 are 793 "global", then Fragment Identification values would be generated 794 with a global counter (initialized to offset()), and thus each 795 generated Fragment Identification value would leak the number of 796 fragmented datagrams transmitted by the node since it was 797 bootstrapped. 799 On the other hand, linear() will be predictable within CONTEXT_2. 800 The predictability of linear(), irrespective of the context and/or 801 predictability of offset(), can leak out information that is of use 802 to attackers. For example, a node that selects ephemeral port 803 numbers on as in_ 805 ehemeral_port = offset(Dest_IP) + linear() 807 that is, with a per-destination offset, but global linear() function 808 (e.g., a global counter), will leak information about the number of 809 outgoing connections that have been issued between any two issued 810 outgoing connections. Similarly, a node that generates Fragment 811 Identification values as in: 813 Frag_ID = offset(Srd_IP, Dst_IP) + linear() 815 will leak out information about the number of fragmented packets that 816 have been transmitted between any two other transmitted fragmented 817 packets. The vulnerabilities described in [Sanfilippo1998a], 818 [Sanfilippo1998b], and [Sanfilippo1999] are all associated with the 819 use of a global linear() function (i.e., a global CONTEXT_2). 821 8.3. Exploitation of Semantics of Transient Numeric Identifiers 823 Identifiers that are not semantically opaque tend to be more 824 predictable than semantically-opaque identifiers. For example, a MAC 825 address contains an OUI (Organizationally-Unique Identifier) which 826 identifies the vendor that manufactured the underlying network 827 interface card. This fact may be leveraged by an attacker meaning to 828 "predict" MAC addresses, if he has some knowledge about the possible 829 NIC vendor. 831 [RFC7707] discusses a number of techniques to reduce the search space 832 when performing IPv6 address-scanning attacks by leveraging the 833 semantics of the IIDs produced by a number of IID-generation 834 techniques. 836 8.4. Exploitation of Collisions of Transient Numeric Identifiers 838 In many cases, th collision of transient network identifiers can have 839 a hard failure severity (or result in a hard failure severity if an 840 attacker can cause multiple collisions deterministically, one after 841 another). For example, predictable Fragment Identification values 842 open the door to Denial of Service (DoS) attacks (see e.g. 843 [RFC5722]. Similarly, predictable TCP ISNs open the door to trivial 844 connection-reset and data injection attacks (see e.g. 845 [Joncheray1995]). 847 9. Vulnerability Analysis of Specific Transient Numeric Identifiers 848 Categories 850 The following subsections analyze common vulnerabilities associated 851 with the generation of identifiers for each of the categories 852 identified in Section 6. 854 9.1. Category #1: Uniqueness (soft failure) 856 Possible vulnerabilities associated with identifiers of this category 857 are: 859 o Use of trivial algorithms (e.g. global counters) that generate 860 predictable identifiers 862 o Use of flawed PRNGs (please see e.g. [Zalewski2001], 863 [Zalewski2002] and [Klein2007]) 865 Since the only interoperability requirement for these identifiers is 866 uniqueness, the obvious approach to generate them is to employ a 867 PRNG. An implementer should consult [RFC4086] regarding randomness 868 requirements for security, and consult relevant documentation when 869 employing a PRNG provided by the underlying system. 871 Use of algorithms other than PRNGs for generating identifiers of this 872 category is discouraged. 874 9.2. Category #2: Uniqueness (hard failure) 876 As noted in Section 7.2 this category typically employs the same 877 algorithms as Category #4, since a monotonically-increasing sequence 878 tends to minimize the identifier reuse frequency. Therefore, the 879 vulnerability analysis of Section 9.4 applies to this case. 881 9.3. Category #3: Uniqueness, constant within context (soft failure) 883 There are two main vulnerabilities that may be associated with 884 identifiers of this category: 886 1. Use algorithms or sources that result in predictable identifiers 888 2. Employing the same identifier across contexts in which constantcy 889 is not required 891 At times, an implementation or specification may be tempted to employ 892 a source for the identifier which is known to provide unique values. 893 However, while unique, the associated identifiers may have other 894 properties such as being predictable or leaking information about the 895 node in question. For example, as noted in [RFC7721], embedding 896 link-layer addresses for generating IPv6 IIDs not only results in 897 predictable values, but also leaks information about the manufacturer 898 of the network interface card. 900 On the other hand, using an identifier across contexts where 901 constantcy is not required can be leveraged for correlation of 902 activities. On of the most trivial examples of this is the use of 903 IPv6 IIDs that are constant across networks (such as IIDs that embed 904 the underlying link-layer address). 906 9.4. Category #4: Uniqueness, monotonically increasing within context 907 (hard failure) 909 A simple way to generalize algorithms employed for generating 910 identifiers of Category #4 would be as follows: 912 /* Identifier selection function */ 913 count = max_id - min_id + 1; 915 do { 916 linear(CONTEXT_2)= linear(CONTEXT_2) + increment(); 917 next_id= offset(CONTEXT_1) + linear(CONTEXT_1); 919 if(check_suitable_id(next_id)) 920 return next_id; 922 count--; 923 } while (count > 0); 925 return ERROR; 927 Essentially, an identifier (next_id) is generated by adding a linear 928 function (linear()) to an offset value, which is unknown to the 929 attacker, and constant for given context (CONTEXT_1). 931 The following aspects of the algorithm should be considered: 933 o For the most part, it is the offset() function that results in 934 identifiers that are unpredictable by an off-path attacker. While 935 the resulting sequence will be monotonically-increasing, the use 936 of an offset value that is unknown to the attacker makes the 937 resulting values unknown to the attacker. 939 o The most straightforward "stateless" implementation of offset 940 would be that in which offset() is the result of a 941 cryptographically-secure hash-function that takes the values that 942 identify the context and a "secret" (not shown in the figure 943 above) as arguments. 945 o Another possible (but stateful) approach would be to simply 946 generate a random offset and store it in memory, and then look-up 947 the corresponding context when a new identifier is to be selected. 948 The algorithm in Section 7.4.1 is essentially an implementation of 949 this type. 951 o The linear function is incremented according to increment(). In 952 the most trivial case increment() could always return the constant 953 "1". But it could also possibly return small integers such the 954 increments are randomized. 956 Considering the generic algorithm illustrated above we can identify 957 the following possible vulnerabilities: 959 o If the offset value spans more than the necessary context, 960 identifiers could be unnecessarily predictable by other parties, 961 since the offset value would be unnecessarily leaked to them. For 962 example, an implementation that means to produce a per-destination 963 counter but replaces offset() with a constant number (i.e., 964 employs a global counter), will unnecessarily result in 965 predictable identifiers. 967 o The function linear() could be seen as representing the number of 968 identifiers that have so far been generated for a given context 969 (CONTEXT_2). If linear() spans more than the necessary context, 970 the "increments" could be leaked to other parties, thus disclosing 971 information about the number of identifiers that have so far been 972 generated. For example, an implementation in which linear() is 973 implemented as a single global counter will unnecessarily leak 974 information the number of identifiers that have been produced. 976 o increment() determines how the linear() is incremented for each 977 identifier that is selected. In the most trivial case, 978 increment() will return the integer "1". However, an 979 implementation may have increment() return a "small" integer value 980 such that even if the current value employed by the generator is 981 guessed (see Appendix A of [RFC7739]), the exact next identifier 982 to be selected will be slightly harder to identify. 984 10. IANA Considerations 986 There are no IANA registries within this document. The RFC-Editor 987 can remove this section before publication of this document as an 988 RFC. 990 11. Security Considerations 992 The entire document is about the security and privacy implications of 993 identifiers. 995 12. Acknowledgements 997 The authors would like to thank (in alphabetical order) Steven 998 Bellovin, Joseph Lorenzo Hall, Gre Norcie, and Martin Thomson, for 999 providing valuable comments on earlier versions of this document. 1001 The authors would like to thank Diego Armando Maradona for his magic 1002 and inspiration. 1004 13. References 1006 13.1. Normative References 1008 [RFC0793] Postel, J., "Transmission Control Protocol", STD 7, 1009 RFC 793, DOI 10.17487/RFC0793, September 1981, 1010 . 1012 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 1013 Requirement Levels", BCP 14, RFC 2119, 1014 DOI 10.17487/RFC2119, March 1997, 1015 . 1017 [RFC2460] Deering, S. and R. Hinden, "Internet Protocol, Version 6 1018 (IPv6) Specification", RFC 2460, DOI 10.17487/RFC2460, 1019 December 1998, . 1021 [RFC4086] Eastlake 3rd, D., Schiller, J., and S. Crocker, 1022 "Randomness Requirements for Security", BCP 106, RFC 4086, 1023 DOI 10.17487/RFC4086, June 2005, 1024 . 1026 [RFC4291] Hinden, R. and S. Deering, "IP Version 6 Addressing 1027 Architecture", RFC 4291, DOI 10.17487/RFC4291, February 1028 2006, . 1030 [RFC4862] Thomson, S., Narten, T., and T. Jinmei, "IPv6 Stateless 1031 Address Autoconfiguration", RFC 4862, 1032 DOI 10.17487/RFC4862, September 2007, 1033 . 1035 [RFC5722] Krishnan, S., "Handling of Overlapping IPv6 Fragments", 1036 RFC 5722, DOI 10.17487/RFC5722, December 2009, 1037 . 1039 [RFC6528] Gont, F. and S. Bellovin, "Defending against Sequence 1040 Number Attacks", RFC 6528, DOI 10.17487/RFC6528, February 1041 2012, . 1043 [RFC7217] Gont, F., "A Method for Generating Semantically Opaque 1044 Interface Identifiers with IPv6 Stateless Address 1045 Autoconfiguration (SLAAC)", RFC 7217, 1046 DOI 10.17487/RFC7217, April 2014, 1047 . 1049 13.2. Informative References 1051 [Bellovin1989] 1052 Bellovin, S., "Security Problems in the TCP/IP Protocol 1053 Suite", Computer Communications Review, vol. 19, no. 2, 1054 pp. 32-48, 1989, 1055 . 1057 [Bellovin2002] 1058 Bellovin, S., "A Technique for Counting NATted Hosts", 1059 IMW'02 Nov. 6-8, 2002, Marseille, France, 2002. 1061 [CERT2001] 1062 CERT, "CERT Advisory CA-2001-09: Statistical Weaknesses in 1063 TCP/IP Initial Sequence Numbers", 2001, 1064 . 1066 [CPNI-TCP] 1067 Gont, F., "Security Assessment of the Transmission Control 1068 Protocol (TCP)", United Kingdom's Centre for the 1069 Protection of National Infrastructure (CPNI) Technical 1070 Report, 2009, . 1073 [FIPS-SHS] 1074 FIPS, "Secure Hash Standard (SHS)", Federal Information 1075 Processing Standards Publication 180-4, March 2012, 1076 . 1079 [Fyodor2004] 1080 Fyodor, "Idle scanning and related IP ID games", 2004, 1081 . 1083 [I-D.ietf-6man-default-iids] 1084 Gont, F., Cooper, A., Thaler, D., and S. LIU, 1085 "Recommendation on Stable IPv6 Interface Identifiers", 1086 draft-ietf-6man-default-iids-16 (work in progress), 1087 September 2016. 1089 [Joncheray1995] 1090 Joncheray, L., "A Simple Active Attack Against TCP", Proc. 1091 Fifth Usenix UNIX Security Symposium, 1995. 1093 [Klein2007] 1094 Klein, A., "OpenBSD DNS Cache Poisoning and Multiple O/S 1095 Predictable IP ID Vulnerability", 2007, 1096 . 1099 [Morris1985] 1100 Morris, R., "A Weakness in the 4.2BSD UNIX TCP/IP 1101 Software", CSTR 117, AT&T Bell Laboratories, Murray Hill, 1102 NJ, 1985, 1103 . 1105 [RFC1035] Mockapetris, P., "Domain names - implementation and 1106 specification", STD 13, RFC 1035, DOI 10.17487/RFC1035, 1107 November 1987, . 1109 [RFC1321] Rivest, R., "The MD5 Message-Digest Algorithm", RFC 1321, 1110 DOI 10.17487/RFC1321, April 1992, 1111 . 1113 [RFC4963] Heffner, J., Mathis, M., and B. Chandler, "IPv4 Reassembly 1114 Errors at High Data Rates", RFC 4963, 1115 DOI 10.17487/RFC4963, July 2007, 1116 . 1118 [RFC6056] Larsen, M. and F. Gont, "Recommendations for Transport- 1119 Protocol Port Randomization", BCP 156, RFC 6056, 1120 DOI 10.17487/RFC6056, January 2011, 1121 . 1123 [RFC6151] Turner, S. and L. Chen, "Updated Security Considerations 1124 for the MD5 Message-Digest and the HMAC-MD5 Algorithms", 1125 RFC 6151, DOI 10.17487/RFC6151, March 2011, 1126 . 1128 [RFC7098] Carpenter, B., Jiang, S., and W. Tarreau, "Using the IPv6 1129 Flow Label for Load Balancing in Server Farms", RFC 7098, 1130 DOI 10.17487/RFC7098, January 2014, 1131 . 1133 [RFC7528] Higgs, P. and J. Piesing, "A Uniform Resource Name (URN) 1134 Namespace for the Hybrid Broadcast Broadband TV (HbbTV) 1135 Association", RFC 7528, DOI 10.17487/RFC7528, April 2015, 1136 . 1138 [RFC7707] Gont, F. and T. Chown, "Network Reconnaissance in IPv6 1139 Networks", RFC 7707, DOI 10.17487/RFC7707, March 2016, 1140 . 1142 [RFC7721] Cooper, A., Gont, F., and D. Thaler, "Security and Privacy 1143 Considerations for IPv6 Address Generation Mechanisms", 1144 RFC 7721, DOI 10.17487/RFC7721, March 2016, 1145 . 1147 [RFC7739] Gont, F., "Security Implications of Predictable Fragment 1148 Identification Values", RFC 7739, DOI 10.17487/RFC7739, 1149 February 2016, . 1151 [Sanfilippo1998a] 1152 Sanfilippo, S., "about the ip header id", Post to Bugtraq 1153 mailing-list, Mon Dec 14 1998, 1154 . 1156 [Sanfilippo1998b] 1157 Sanfilippo, S., "Idle scan", Post to Bugtraq mailing-list, 1158 1998, . 1160 [Sanfilippo1999] 1161 Sanfilippo, S., "more ip id", Post to Bugtraq mailing- 1162 list, 1999, 1163 . 1165 [Shimomura1995] 1166 Shimomura, T., "Technical details of the attack described 1167 by Markoff in NYT", Message posted in USENET's 1168 comp.security.misc newsgroup Message-ID: 1169 <3g5gkl$5j1@ariel.sdsc.edu>, 1995, 1170 . 1172 [Silbersack2005] 1173 Silbersack, M., "Improving TCP/IP security through 1174 randomization without sacrificing interoperability", 1175 EuroBSDCon 2005 Conference, 2005, 1176 . 1179 [USCERT2001] 1180 US-CERT, "US-CERT Vulnerability Note VU#498440: Multiple 1181 TCP/IP implementations may use statistically predictable 1182 initial sequence numbers", 2001, 1183 . 1185 [Zalewski2001] 1186 Zalewski, M., "Strange Attractors and TCP/IP Sequence 1187 Number Analysis", 2001, 1188 . 1190 [Zalewski2002] 1191 Zalewski, M., "Strange Attractors and TCP/IP Sequence 1192 Number Analysis - One Year Later", 2001, 1193 . 1195 Appendix A. Flawed Algorithms 1197 The following subsections document algorithms with known negative 1198 security and privacy imlpications. 1200 A.1. Predictable Linear Identifiers Algorithm 1202 One of the most trivial ways to achieve uniqueness with a low 1203 identifier reuse frequency is to produce a linear sequence. 1205 For example, the following algorithm has been employed (see e.g. 1206 [Morris1985], [Shimomura1995], [Silbersack2005] and [CPNI-TCP]) in a 1207 number of operating systems for selecting IP fragment IDs, TCP 1208 ephemeral ports, etc.: 1210 /* Initialization at system boot time. Could be random */ 1211 next_id = min_id; 1212 id_inc= 1; 1214 /* Identifier selection function */ 1215 count = max_id - min_id + 1; 1217 do { 1218 if (next_id == max_id) { 1219 next_id = min_id; 1220 } 1221 else { 1222 next_id = next_id + id_inc; 1223 } 1225 if (check_suitable_id(next_id)) 1226 return next_id; 1228 count--; 1230 } while (count > 0); 1232 return ERROR; 1234 Note: 1235 check_suitable_id() is a function that checks whether the 1236 resulting identifier is acceptable (e.g., whether its in use, 1237 etc.). 1239 For obvious reasons, this algorithm results in predicable sequences. 1240 If a global counter is used (such as "next_id" in the example above), 1241 a node that learns one protocol identifier can also learn or guess 1242 values employed by past and future protocol instances. On the other 1243 hand, when the value of increments is known (such as "1" in this 1244 case), an attacker can sample two values, and learn the number of 1245 identifiers that were generated in-between. 1247 Where identifier reuse would lead to a hard failure, one typical 1248 approach to generate unique identifiers (while minimizing the 1249 security and privacy implications of predictable identifiers) is to 1250 obfuscate the resulting protocol IDs by either: 1252 o Replace the global counter with multiple counters (initialized to 1253 a random value) 1255 o Randomizing the "increments" 1257 Avoiding global counters essentially means that learning one 1258 identifier for a given context (e.g., one TCP ephemeral port for a 1259 given {src IP, Dst IP, Dst Port}) is of no use for learning or 1260 guessing identifiers for a different context (e.g., TCP ephemeral 1261 ports that involve other peers). However, this may imply keeping one 1262 additional variable/counter per context, which may be prohibitive in 1263 some environments. The choice of id_inc has implications on both the 1264 security and privacy properties of the resulting identifiers, but 1265 also on the corresponding interoperability properties. On one hand, 1266 minimizing the increments (as in "id_inc = 1" in our case) generally 1267 minimizes the identifier reuse frequency, albeit at increased 1268 predictability. On the other hand, if the increments are randomized, 1269 predictability of the resulting identifiers is reduced, and the 1270 information leakage produced by global constant increments is 1271 mitigated. However, using larger increments than necessary can 1272 result in an increased identifier reuse frequency. 1274 A.2. Random-Increments Algorithm 1276 This algorithm offers a middle ground between the algorithms that 1277 select ephemeral ports randomly (such as those described in 1278 Section 7.1.1 and Section 7.1.2), and those that offer obfuscation 1279 but no randomization (such as those described in Section 7.4.2 and 1280 Section 7.4.3). 1282 /* Initialization code at system boot time. */ 1283 next_id = random(); /* Initialization value */ 1284 id_inc = 500; /* Determines the trade-off */ 1286 /* Identifier selection function */ 1287 id_range = max_id - min_id + 1; 1289 count = id_range; 1291 do { 1292 /* Random increment */ 1293 next_id = next_id + (random() % id_inc) + 1; 1295 /* Keep the identifier within acceptable range */ 1296 next_id = min_id + (next_id % id_range); 1298 if(check_suitable_id(next_id)) 1299 return next_id; 1301 count--; 1302 } while (count > 0); 1304 return ERROR; 1306 This algorithm aims at producing a monotonically increasing sequence 1307 of identifiers, while avoiding the use of fixed increments, which 1308 would lead to trivially predictable sequences. The value "id_inc" 1309 allows for direct control of the trade-off between the level of 1310 obfuscation and the ID reuse frequency. The smaller the value of 1311 "id_inc", the more similar this algorithm is to a predicable, global 1312 monotonically-increasing ID generation algorithm. The larger the 1313 value of "id_inc", the more similar this algorithm is to the 1314 algorithm described in Section 7.1.1 of this document. 1316 When the identifiers wrap, there is the risk of collisions of 1317 identifiers (i.e., identifier reuse). Therefore, "id_inc" should be 1318 selected according to the following criteria: 1320 o It should maximize the wrapping time of the identifier space. 1322 o It should minimize identifier reuse frequency. 1324 o It should maximize obfuscation. 1326 Clearly, these are competing goals, and the decision of which value 1327 of "id_inc" to use is a trade-off. Therefore, the value of "id_inc" 1328 should be configurable so that system administrators can make the 1329 trade-off for themselves. 1331 Authors' Addresses 1333 Fernando Gont 1334 SI6 Networks / UTN-FRH 1335 Evaristo Carriego 2644 1336 Haedo, Provincia de Buenos Aires 1706 1337 Argentina 1339 Phone: +54 11 4650 8472 1340 Email: fgont@si6networks.com 1341 URI: http://www.si6networks.com 1343 Ivan Arce 1344 Quarkslab 1346 Email: iarce@quarkslab.com 1347 URI: https://www.quarkslab.com