idnits 2.17.1

draft-peabody-dispatch-new-uuid-format-02.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The abstract seems to contain references ([RFC4122]), which it
     shouldn't. Please replace those with straight textual mentions of the
     documents in question.

  -- The draft header indicates that this document updates RFC4122, but the
     abstract doesn't seem to directly say this. It does mention RFC4122
     though, so this could be OK.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD',
     or 'RECOMMENDED' is not an accepted usage according to RFC 2119. Please
     use uppercase 'NOT' together with RFC 2119 keywords (if that is what you
     mean).

     Found 'MUST not' in this paragraph:

     MAC addresses pose inherent security risks and MUST not be used for
     node generation. As such they have been strictly forbidden from
     time-based UUIDs within this specification. Instead pseudo-random bits
     SHOULD selected from a source with sufficient entropy to ensure
     guaranteed uniqueness among UUID generation.

     (Using the creation date from RFC4122, updated by this document, for
     RFC5378 checks: 2002-09-23)

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.
     If you have contacted all the original authors and they are all
     willing to grant the BCP78 rights to the IETF Trust, then this is fine,
     and you can ignore this comment. If not, you may need to add the
     pre-RFC5378 disclaimer. (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (7 October 2021) is 931 days in the past. Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

     No issues found here.

     Summary: 1 error (**), 0 flaws (~~), 2 warnings (==), 3 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

dispatch                                                    BGP. Peabody
Internet-Draft
Updates: 4122 (if approved)                                     K. Davis
Intended status: Standards Track                          7 October 2021
Expires: 10 April 2022

                            New UUID Formats
               draft-peabody-dispatch-new-uuid-format-02

Abstract

   This document presents new time-based UUID formats which are suited
   for use as a database key.

   A common case for modern applications is to create a unique
   identifier for use as a primary key in a database table. This
   identifier usually implements an embedded timestamp that is sortable
   using the monotonic creation time in the most significant bits. In
   addition, the identifier is highly collision resistant, difficult to
   guess, and provides minimal security attack surfaces. None of the
   existing UUID versions, including UUIDv1, fulfill each of these
   requirements in the most efficient possible way. This document is a
   proposal to update [RFC4122] with three new UUID versions that
   address these concerns, each with different trade-offs.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 10 April 2022.

Copyright Notice

   Copyright (c) 2021 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents (https://trustee.ietf.org/
   license-info) in effect on the date of publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document. Code Components
   extracted from this document must include Simplified BSD License text
   as described in Section 4.e of the Trust Legal Provisions and are
   provided without warranty as described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Background
   3.  Summary of Changes
     3.1.  changelog
   4.  Format
     4.1.  Versions
     4.2.  Variant
     4.3.  UUIDv6 Layout and Bit Order
       4.3.1.  UUIDv6 Basic Creation Algorithm
     4.4.  UUIDv7 Layout and Bit Order
       4.4.1.  UUIDv7 Timestamp Usage
       4.4.2.  UUIDv7 Clock Sequence Usage
       4.4.3.  UUIDv7 Node Usage
       4.4.4.  UUIDv7 Encoding and Decoding
     4.5.  UUIDv8 Layout and Bit Order
       4.5.1.  UUIDv8 Timestamp Usage
       4.5.2.  UUIDv8 Clock Sequence Usage
       4.5.3.  UUIDv8 Node Usage
       4.5.4.  UUIDv8 Basic Creation Algorithm
   5.  Encoding and Storage
   6.  Global Uniqueness
   7.  Distributed UUID Generation
   8.  IANA Considerations
   9.  Security Considerations
   10. Acknowledgements
   11. Normative References
   12. Informative References
   Authors' Addresses

1.  Introduction

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

2.  Background

   Much has changed since UUIDs were originally created. Modern
   applications need to use UUIDs as database primary keys, and many
   have already implemented them as such.

   The motivation for using UUIDs as database keys stems primarily from
   the fact that applications are increasingly distributed in nature.
   Simplistic "auto increment" schemes with integers in sequence do not
   work well in a distributed system since the effort required to
   synchronize such numbers across a network can easily become a burden.
   The fact that UUIDs can be used to create unique and reasonably short
   values in distributed systems without requiring synchronization makes
   them a good candidate for use as a database key in such environments.

   However, some properties of [RFC4122] UUIDs are not well suited to
   this task. First, most of the existing UUID versions, such as UUIDv4,
   have poor database index locality: new values created in succession
   are not close to each other in the index and thus require inserts to
   be performed at random locations. The negative performance effects
   of this on the structures commonly used for indexes (B-tree and its
   variants) can be dramatic. As such, newly inserted values SHOULD be
   time-ordered.

   While UUIDv1 does contain an embedded timestamp and can be
   time-ordered, it has other issues. It is possible to sort Version 1
   UUIDs by time, but it is a laborious task. The process requires
   breaking the bytes of the UUID into various pieces, re-ordering the
   bits, and then determining the order from the reconstructed
   timestamp. This is not efficient in very large systems.
   Implementations would be simplified with a sort order where the UUID
   can simply be treated as an opaque sequence of bytes and ordered as
   such.

   After the embedded timestamp, the remaining 64 bits are in essence
   used to provide uniqueness both on a global scale and within a given
   timestamp tick. The clock sequence value ensures that multiple UUIDs
   generated for the same timestamp value are given a monotonic
   sequence value. This explicit sequencing helps further facilitate
   sorting. The remaining random bits ensure collisions are minimal.
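
   To illustrate the UUIDv1 sorting cost described above, the following
   Python sketch rebuilds the 60 bit timestamp from the three UUIDv1
   time fields before it can compare two values. The function names are
   illustrative only and are not defined by this document or by
   [RFC4122]; only the standard library "uuid" module is assumed.

```python
import uuid


def uuid1_timestamp(u: uuid.UUID) -> int:
    """Rebuild the 60 bit Gregorian timestamp scattered across a UUIDv1."""
    time_low = (u.int >> 96) & 0xFFFFFFFF   # octets 0-3: low 32 bits of the timestamp
    time_mid = (u.int >> 80) & 0xFFFF       # octets 4-5: middle 16 bits
    time_high = (u.int >> 64) & 0x0FFF      # octets 6-7 with the 4 version bits masked off
    return (time_high << 48) | (time_mid << 32) | time_low


def sort_uuid1_by_time(values):
    """Sort UUIDv1 values chronologically (hypothetical helper).

    Sorting the raw 16 octets would compare time_low first, so it does
    not generally yield chronological order.
    """
    return sorted(values, key=uuid1_timestamp)
```

   Every comparison pays for three shifts, three masks, and a
   reassembly, which is the overhead that a layout sortable as an opaque
   byte sequence avoids.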

   Furthermore, UUIDv1 utilizes a non-standard timestamp epoch derived
   from the Gregorian Calendar. More specifically, the timestamp is
   Coordinated Universal Time (UTC) represented as a count of
   100-nanosecond intervals since 00:00:00.00, 15 October 1582.
   Implementations and many languages may find it easier to implement
   the widely adopted and well known Unix Epoch, a custom epoch, or
   another timestamp source with the level of timestamp precision
   required by the application.

   Lastly, privacy and network security issues arise from using a MAC
   address in the node field of Version 1 UUIDs. Exposed MAC addresses
   can be used as an attack surface to locate machines and reveal
   various other information about such machines (minimally
   manufacturer, potentially other details). Instead, "cryptographically
   secure" pseudo-random number generators (CSPRNGs) or pseudo-random
   number generators (PRNGs) SHOULD be used within an application
   context to provide uniqueness and unguessability.

   Due to the shortcomings of UUIDv1 and UUIDv4 detailed so far, many
   widely distributed database applications and large application
   vendors have sought to solve the problem of creating a better
   time-based, sortable unique identifier for use as a database key.
   This has led to numerous implementations over the past 10+ years
   solving the same problem in slightly different ways.

   While preparing this specification the following 16 different
   implementations were analyzed for trends in total ID length, bit
   layout, lexical formatting/encoding, timestamp type, timestamp
   format, timestamp accuracy, node format/components, collision
   handling and multi-timestamp tick generation sequencing.

   1.   [LexicalUUID] by Twitter
   2.   [Snowflake] by Twitter
   3.   [Flake] by Boundary
   4.   [ShardingID] by Instagram
   5.   [KSUID] by Segment
   6.   [Elasticflake] by P. Pearcy
   7.   [FlakeID] by T. Pawlak
   8.   [Sonyflake] by Sony
   9.   [orderedUuid] by IT. Cabrera
   10.  [COMBGUID] by R. Tallent
   11.  [ULID] by A. Feerasta
   12.  [SID] by A. Chilton
   13.  [pushID] by Google
   14.  [XID] by O. Poitrey
   15.  [ObjectID] by MongoDB
   16.  [CUID] by E. Elliott

   An inspection of these implementations details the following trends
   that help define this standard:

   -  Timestamps MUST be k-sortable. That is, values within or close
      to the same timestamp are ordered properly by sorting algorithms.
   -  Timestamps SHOULD be big-endian with the most-significant bits
      of the time embedded as-is without reordering.
   -  Timestamps SHOULD utilize millisecond precision and Unix Epoch
      as timestamp source, although there is some variation to this
      among implementations depending on the application requirements.
   -  The ID format SHOULD be lexicographically sortable while in the
      textual representation.
   -  IDs MUST ensure proper embedded sequencing to facilitate sorting
      when multiple UUIDs are created during a given timestamp.
   -  IDs MUST NOT require unique network identifiers as part of
      achieving uniqueness.
   -  Distributed nodes MUST be able to create collision resistant
      Unique IDs without consulting a centralized resource.

3.  Summary of Changes

   In order to solve these challenges this specification introduces
   three new version identifiers assigned for time-based UUIDs.

   The first, UUIDv6, aims to be the easiest to implement for
   applications which already implement UUIDv1. The UUIDv6
   specification keeps the original Gregorian timestamp source but does
   not reorder the timestamp bits as per the process utilized by UUIDv1.
   UUIDv6 also requires that pseudo-random data MUST be used in place of
   the MAC address. The rest of the UUIDv1 format remains unchanged in
   UUIDv6.
   See Section 4.3.

   Next, UUIDv7 introduces an entirely new time-based UUID bit layout
   utilizing a variable length timestamp sourced from the widely
   implemented and well known Unix Epoch timestamp source. The
   timestamp is broken into a 36 bit integer seconds part, followed by
   a field of variable length which represents the sub-second timestamp
   portion, encoded so that each bit from most to least significant
   adds more precision. See Section 4.4.

   Finally, UUIDv8 introduces a relaxed time-based UUID format that
   caters to application implementations that cannot utilize UUIDv1,
   UUIDv6, or UUIDv7. UUIDv8 also future-proofs this specification by
   allowing time-based UUID formats from timestamp sources that are not
   yet defined. The variable size timestamp offers lots of flexibility
   to create an implementation specific RFC compliant time-based UUID
   while retaining the properties that make UUID great. See
   Section 4.5.

3.1.  changelog

   RFC EDITOR PLEASE DELETE THIS SECTION.

   draft-02

   -  Added Changelog
   -  Fixed misc. grammatical errors
   -  Fixed section numbering issue
   -  Fixed some UUIDvX reference issues
   -  Changed all instances of "motonic" to "monotonic"
   -  Changed all instances of "#-bit" to "# bit"
   -  Changed "proceeding" verbiage to "after" in section 7
   -  Added details on how to pad 32 bit unix timestamp to 36 bits in
      UUIDv7
   -  Added details on how to truncate 64 bit unix timestamp to 36
      bits in UUIDv7
   -  Added forward reference and bullet to UUIDv8 if truncating 64
      bit Unix Epoch is not an option.
   -  Fixed bad reference to non-existent "time_or_node" in section
      4.5.4

   draft-01

   -  Complete rewrite of entire document.
   -  The format, flow and verbiage used in the specification has been
      reworked to mirror the original RFC 4122 and current IETF
      standards.
   -  Removed the topics of UUID length modification, alternate UUID
      text formats, and alternate UUID encoding techniques.
   -  Research into 16 different historical and current
      implementations of time-based universal identifiers was completed
      at the end of 2020 in an attempt to identify trends which have
      directly influenced design decisions in this draft document
      (https://github.com/uuid6/uuid6-ietf-draft/tree/master/research)
   -  Prototype implementations have been completed for UUIDv6, UUIDv7,
      and UUIDv8 in various languages by many GitHub community members.
      (https://github.com/uuid6/prototypes)

4.  Format

   The UUID length of 16 octets (128 bits) remains unchanged. The
   textual representation of a UUID consisting of 36 hexadecimal and
   dash characters in the format 8-4-4-4-12 remains unchanged for human
   readability. In addition, the positions of both the Version and
   Variant bits remain unchanged in the layout.

4.1.  Versions

   Table 1 defines the 4 bit version found in Bits 48 through 51 within
   a given UUID.

   +------+------+------+------+---------+-----------------------+
   | Msb0 | Msb1 | Msb2 | Msb3 | Version | Description           |
   +------+------+------+------+---------+-----------------------+
   | 0    | 1    | 1    | 0    | 6       | Reordered Gregorian   |
   |      |      |      |      |         | time-based UUID       |
   +------+------+------+------+---------+-----------------------+
   | 0    | 1    | 1    | 1    | 7       | Variable length Unix  |
   |      |      |      |      |         | Epoch time-based UUID |
   +------+------+------+------+---------+-----------------------+
   | 1    | 0    | 0    | 0    | 8       | Custom time-based     |
   |      |      |      |      |         | UUID                  |
   +------+------+------+------+---------+-----------------------+

        Table 1: UUID versions defined by this specification

4.2.  Variant

   The variant bits utilized by UUIDs in this specification remain the
   same as [RFC4122], Section 4.1.1.
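
   As a minimal sketch of how a parser might read these two fields, the
   Python below extracts the 4 bit version from bits 48 through 51 and
   checks that the variant in bits 64 and 65 is "10". The function
   names are hypothetical, and bit 0 is taken to be the most significant
   bit of the 128 bit value, matching the numbering used in this
   section.

```python
import uuid


def version_bits(u: uuid.UUID) -> int:
    """Return the 4 bit version field (bits 48-51, counted from the MSB)."""
    # Bit 51 is 76 positions above the least significant bit (127 - 51).
    return (u.int >> 76) & 0xF


def has_expected_variant(u: uuid.UUID) -> bool:
    """Return True when bits 64 and 65 carry the variant value '10'."""
    # Bit 65 is 62 positions above the least significant bit (127 - 65).
    return ((u.int >> 62) & 0b11) == 0b10
```

   In the textual representation these fields are the first hex digit
   of the third group and the first hex digit of the fourth group,
   respectively.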

   Table 2 lists the contents of the variant field, bits 64 and 65,
   where the letter "x" indicates a "don't-care" value. Common hex
   values of 8 (1000), 9 (1001), A (1010), and B (1011) frequent the
   text representation.

   +------+------+------+-----------------------------------------+
   | Msb0 | Msb1 | Msb2 | Description                             |
   +------+------+------+-----------------------------------------+
   | 1    | 0    | x    | The variant specified in this document. |
   +------+------+------+-----------------------------------------+

         Table 2: UUID Variant defined by this specification

4.3.  UUIDv6 Layout and Bit Order

   UUIDv6 aims to be the easiest to implement by reusing most of the
   layout of bits found in UUIDv1 but with changes to bit ordering for
   the timestamp. Where UUIDv1 splits the timestamp bits into three
   distinct parts and orders them as time_low, time_mid,
   time_high_and_version, UUIDv6 instead keeps the source bits from the
   timestamp intact and changes the order to time_high, time_mid, and
   time_low. Incidentally this will match the original 60 bit Gregorian
   timestamp source with 100-nanosecond precision defined in [RFC4122],
   Section 4.1.4. The clock sequence bits remain unchanged from their
   usage and position in [RFC4122], Section 4.1.5.
   The 48 bit node SHOULD be set to a pseudo-random value; however,
   implementations MAY choose to retain the old MAC address behavior
   from [RFC4122], Section 4.1.6 and [RFC4122], Section 4.5.

   The format for the 16-octet, 128 bit UUIDv6 is shown in Figure 1.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           time_high                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           time_mid            |     time_low_and_version      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |clk_seq_hi_res |  clk_seq_low  |          node (0-1)           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          node (2-5)                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                Figure 1: UUIDv6 Field and Bit Layout

   time_high:
      The most significant 32 bits of the 60 bit starting timestamp.
      Occupies bits 0 through 31 (octets 0-3).

   time_mid:
      The middle 16 bits of the 60 bit starting timestamp. Occupies
      bits 32 through 47 (octets 4-5).

   time_low_and_version:
      The first four most significant bits MUST contain the UUIDv6
      version (0110) while the remaining 12 bits will contain the least
      significant 12 bits from the 60 bit starting timestamp. Occupies
      bits 48 through 63 (octets 6-7).

   clk_seq_hi_res:
      The first two bits MUST be set to the UUID variant (10). The
      remaining 6 bits contain the high portion of the clock sequence.
      Occupies bits 64 through 71 (octet 8).

   clk_seq_low:
      The 8 bit low portion of the clock sequence. Occupies bits 72
      through 79 (octet 9).

   node:
      48 bit spatially unique identifier. Occupies bits 80 through 127
      (octets 10-15).

4.3.1.  UUIDv6 Basic Creation Algorithm

   The following implementation algorithm is based on [RFC4122] but with
   changes specific to UUIDv6:

   1.   From a system-wide shared stable store (e.g., a file) or global
        variable, read the UUID generator state: the values of the
        timestamp and clock sequence used to generate the last UUID.

   2.   Obtain the current time as a 60 bit count of 100-nanosecond
        intervals since 00:00:00.00, 15 October 1582.

   3.   Set the time_low field to the 12 least significant bits of the
        starting 60 bit timestamp.

   4.   Truncate the timestamp to the 48 most significant bits in order
        to create time_high_and_time_mid.

   5.   Set the time_high field to the 32 most significant bits of the
        truncated timestamp.

   6.   Set the time_mid field to the 16 least significant bits of the
        truncated timestamp.

   7.   Create the 16 bit time_low_and_version by concatenating the 4
        bit UUIDv6 version with the 12 bit time_low.

   8.   If the state was unavailable (e.g., non-existent or corrupted)
        or the saved timestamp is greater than the current timestamp,
        generate a random 14 bit clock sequence value.

   9.   If the state was available, but the saved timestamp is less than
        or equal to the current timestamp, increment the clock sequence
        value.

   10.  Complete the 16 bit clock sequence high, low and reserved
        creation by concatenating the clock sequence onto the UUID
        variant bits, which take the most significant position in the
        16 bit value.

   11.  Generate a 48 bit pseudo-random node.

   12.  Format by concatenating the 128 bits from each part:
        time_high|time_mid|time_low_and_version|variant_clk_seq|node

   13.  Save the state (current timestamp and clock sequence) back to
        the stable store.

   The steps for splitting time_high_and_time_mid into time_high and
   time_mid are optional since the 48 bits of time_high and time_mid
   will remain in the same order as time_high_and_time_mid during the
   final concatenation.
   This extra step of splitting into the most significant 32 bits and
   least significant 16 bits proves useful when reusing an existing
   UUIDv1 implementation, in which case the following mapping can be
   applied to reshuffle the bits with minimal modifications.

   +--------------+------+--------------+
   | UUIDv1 Field | Bits | UUIDv6 Field |
   +--------------+------+--------------+
   | time_low     | 32   | time_high    |
   +--------------+------+--------------+
   | time_mid     | 16   | time_mid     |
   +--------------+------+--------------+
   | time_high    | 12   | time_low     |
   +--------------+------+--------------+

          Table 3: UUIDv1 to UUIDv6 Field Mappings

4.4.  UUIDv7 Layout and Bit Order

   The UUIDv7 format is designed to encode a Unix timestamp with
   arbitrary sub-second precision. The key property provided by UUIDv7
   is that timestamp values generated by one system and parsed by
   another are guaranteed to have sub-second precision of either the
   generator or the parser, whichever is less. Additionally, the system
   parsing the UUIDv7 value does not need to know which precision was
   used during encoding in order to function correctly.

   The format for the 16-octet, 128 bit UUIDv7 is shown in Figure 2.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                            unixts                             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |unixts |       subsec_a        |  ver  |       subsec_b        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |var|                      subsec_seq_node                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        subsec_seq_node                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                Figure 2: UUIDv7 Field and Bit Layout

   unixts:
      36 bit big-endian unsigned Unix Timestamp value

   subsec_a:
      12 bits allocated to sub-second precision values.

   ver:
      The 4 bit UUIDv7 version (0111)

   subsec_b:
      12 bits allocated to sub-second precision values.

   var:
      2 bit UUID variant (10)

   subsec_seq_node:
      The remaining 62 bits which MAY be allocated to any combination of
      additional sub-second precision, sequence counter, or
      pseudo-random data.

4.4.1.  UUIDv7 Timestamp Usage

   UUIDv7 utilizes a 36 bit big-endian unsigned Unix Timestamp value
   (number of seconds since the epoch of 1 Jan 1970, leap seconds
   excluded so each hour is exactly 3600 seconds long). The 36 bit
   value was selected in order to provide more available time to the
   Unix timestamp and avoid the Year 2038 problem by extending the
   maximum timestamp to the year 4147.

   To achieve a 36 bit UUIDv7 timestamp, the lower 36 bits of a 64 bit
   Unix time are extracted verbatim into UUIDv7.

   In the event that a 32 bit Unix timestamp is in use, four zero bits
   MUST be prepended in the most significant (left-most) positions of
   the 32 bit Unix timestamp, creating the 36 bit Unix timestamp. This
   ensures sorting compatibility with 64 bit Unix timestamps which have
   been truncated to 36 bits.

   Additional sub-second precision (millisecond, microsecond,
   nanosecond, etc.) MAY be provided for encoding and decoding in the
   remaining bits in the layout.

   UUIDv8 SHOULD be used in place of UUIDv7 if an application or
   implementation does not want to truncate a 64 bit Unix Epoch to the
   lower 36 bits.

4.4.2.  UUIDv7 Clock Sequence Usage

   UUIDv7 SHOULD utilize a monotonic sequence counter to provide
   additional sequencing guarantees when multiple UUIDv7 values are
   created in the same UNIXTS and SUBSEC timestamp. The number of bits
   allocated to the sequence counter depends on the precision of the
   timestamp.
   For example, a more accurate timestamp source using nanosecond
   precision will require fewer clock sequence bits than a timestamp
   source utilizing seconds for precision. For best sequencing results
   the sequence counter SHOULD be placed immediately after the
   available sub-second bits.

   The clock sequence MUST start at zero and increment monotonically for
   each new UUIDv7 created by the application on the same timestamp.
   When the timestamp increments the clock sequence MUST be reset to
   zero. The clock sequence MUST NOT roll over or reset to zero unless
   the timestamp has incremented. Care MUST be given to ensure that an
   adequately sized clock sequence is selected for a given application
   based on expected timestamp precision and expected UUIDv7 generation
   rates.

4.4.3.  UUIDv7 Node Usage

   UUIDv7 implementations, even with very detailed sub-second precision
   and the optional sequence counter, MAY have leftover bits that will
   be identified as the Node for this section. The UUIDv7 Node MAY
   contain any set of data an implementation desires; however, the node
   MUST NOT be set to all 0s, which does not ensure global uniqueness.
   In most scenarios the node SHOULD be filled with pseudo-random data.

4.4.4.  UUIDv7 Encoding and Decoding

   The UUIDv7 bit layouts for encoding and decoding are described
   separately in this document.

4.4.4.1.  UUIDv7 Encoding

   Since the UUIDv7 Unix timestamp is fixed at 36 bits in length, the
   exact layout for encoding UUIDv7 depends on the precision (number of
   bits) used for the sub-second portion and the sizes of the optionally
   desired sequence counter and node bits.

   Three examples of UUIDv7 encoding are given below as general
   guidelines, but implementations are not limited to just these three
   examples.
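
   Whatever width an encoding reserves for the sequence counter, the
   counter is expected to follow the rules of Section 4.4.2. The Python
   sketch below is one possible reading of those rules, not a
   normative implementation: the class name, the choice to raise
   OverflowError when the counter is exhausted, and the handling of a
   clock that moves backwards (treated as the same tick) are all
   assumptions made for illustration.

```python
class SequenceCounter:
    """Per-timestamp sequence counter (illustrative, hypothetical API)."""

    def __init__(self, bits: int) -> None:
        self.max_value = (1 << bits) - 1   # largest value that fits in the field
        self.last_timestamp = None
        self.value = 0

    def next(self, timestamp: int) -> int:
        """Return the sequence value to embed for a UUID created at `timestamp`."""
        if self.last_timestamp is None or timestamp > self.last_timestamp:
            # The timestamp incremented: the sequence is reset to zero.
            self.last_timestamp = timestamp
            self.value = 0
        elif self.value == self.max_value:
            # Rolling over within one timestamp is not allowed, so refuse.
            raise OverflowError("sequence exhausted for this timestamp")
        else:
            # Same timestamp (or, by assumption, a clock that went
            # backwards): keep counting upwards.
            self.value += 1
        return self.value
```

   A caller that hits the overflow case could wait for the next
   timestamp tick before retrying; selecting a counter wide enough for
   the expected generation rate makes that case rare.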

   All of these fields are only used during encoding. During decoding
   the system is unaware of the bit layout used for them and considers
   this information opaque. As such, implementations generating these
   values can assign whatever lengths to each field they deem
   applicable, as long as doing so does not break decoding
   compatibility (i.e., Unix timestamp (unixts), version (ver) and
   variant (var) have to stay where they are, and clock sequence
   counter (seq), random (rand) or other implementation specific values
   must follow the sub-second encoding).

   In Figure 3 the UUIDv7 has been created with millisecond precision
   using the available sub-second precision bits.

   Examining Figure 3 one can observe:

   *  The first 36 bits have been dedicated to the Unix Timestamp
      (unixts).

   *  All 12 bits of subsec_a are dedicated to millisecond information
      (msec).

   *  The 4 Version bits remain unchanged (ver).

   *  All 12 bits of subsec_b have been dedicated to a monotonic clock
      sequence counter (seq).

   *  The 2 Variant bits remain unchanged (var).

   *  Finally, all 62 bits of the subsec_seq_node section are filled
      with random data to pad the length and provide guaranteed
      uniqueness (rand).
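
   The millisecond layout described by the bullets above can be
   sketched in Python as follows. This is an illustration under stated
   assumptions rather than a reference encoder: the function name and
   parameters are hypothetical, and the msec field is assumed to hold
   the sub-second value as a 12 bit binary fraction of a second
   (floor(ms / 1000 * 2^12)), in line with the fractional
   interpretation used for decoding in this document.

```python
import os
import uuid
from typing import Optional

UUID7_VERSION = 0b0111
VARIANT = 0b10


def encode_uuid7_ms(unix_seconds: int, milliseconds: int, seq: int,
                    rand: Optional[int] = None) -> uuid.UUID:
    """Pack fields as 36 unixts | 12 msec | 4 ver | 12 seq | 2 var | 62 rand."""
    if not 0 <= milliseconds < 1000:
        raise ValueError("milliseconds must be in 0..999")
    if not 0 <= seq < (1 << 12):
        raise ValueError("seq must fit in 12 bits")
    if rand is None:
        # 8 random octets trimmed to 62 bits.
        rand = int.from_bytes(os.urandom(8), "big") >> 2
    unixts = unix_seconds & ((1 << 36) - 1)   # lower 36 bits of the Unix time
    msec = (milliseconds << 12) // 1000       # binary fraction of a second
    value = ((unixts << 92) | (msec << 80) | (UUID7_VERSION << 76)
             | (seq << 64) | (VARIANT << 62) | (rand & ((1 << 62) - 1)))
    return uuid.UUID(int=value)
```

   Because the timestamp, sub-second and sequence fields sit in the
   most significant bits in that order, values produced this way sort
   by creation time when compared as plain 128 bit integers or byte
   strings.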

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                            unixts                             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |unixts |         msec          |  ver  |          seq          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |var|                           rand                            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                             rand                              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Figure 3: UUIDv7 Field and Bit Layout - Encoding Example (Millisecond
                               Precision)

   In Figure 4 the UUIDv7 has been created with microsecond precision
   using the available sub-second precision bits.

   Examining Figure 4 one can observe:

   *  The first 36 bits have been dedicated to the Unix Timestamp
      (unixts).

   *  All 12 bits of subsec_a are dedicated to providing sub-second
      encoding for the microsecond precision (usec).

   *  The 4 Version bits remain unchanged (ver).

   *  All 12 bits of subsec_b have been dedicated to providing
      sub-second encoding for the microsecond precision (usec).

   *  The 2 Variant bits remain unchanged (var).

   *  A 14 bit monotonic clock sequence counter (seq) has been embedded
      in the most significant position of subsec_seq_node.

   *  Finally, the remaining 48 bits of the subsec_seq_node section are
      filled with random data to pad the length and provide guaranteed
      uniqueness (rand).

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                            unixts                             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |unixts |         usec          |  ver  |         usec          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |var|            seq            |             rand              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                             rand                              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Figure 4: UUIDv7 Field and Bit Layout - Encoding Example (Microsecond
                               Precision)

   In Figure 5 the UUIDv7 has been created with nanosecond precision
   using the available sub-second precision bits.

   Examining Figure 5 one can observe:

   *  The first 36 bits have been dedicated to the Unix Timestamp
      (unixts).

   *  All 12 bits of subsec_a are dedicated to providing sub-second
      encoding for the nanosecond precision (nsec).

   *  The 4 Version bits remain unchanged (ver).

   *  All 12 bits of subsec_b have been dedicated to providing
      sub-second encoding for the nanosecond precision (nsec).

   *  The 2 Variant bits remain unchanged (var).

   *  The first 14 bits of subsec_seq_node are dedicated to providing
      sub-second encoding for the nanosecond precision (nsec).

   *  The next 8 bits of subsec_seq_node are dedicated to a monotonic
      clock sequence counter (seq).

   *  Finally, the remaining 40 bits of the subsec_seq_node section are
      filled with random data to pad the length and provide guaranteed
      uniqueness (rand).
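
   In the nanosecond layout described by the bullets above, the
   sub-second value no longer fits in one field and has to be split
   around the version and variant bits. The Python sketch below shows
   one way to do that split. It is a hedged illustration: the function
   name and parameters are hypothetical, and the 38 nsec bits (12 + 12
   + 14) are assumed to hold a binary fraction of a second
   (floor(ns / 10^9 * 2^38)).

```python
import os
import uuid
from typing import Optional


def encode_uuid7_ns(unix_seconds: int, nanoseconds: int, seq: int,
                    rand: Optional[int] = None) -> uuid.UUID:
    """Pack 36 unixts | 12 nsec | 4 ver | 12 nsec | 2 var | 14 nsec | 8 seq | 40 rand."""
    if not 0 <= nanoseconds < 1_000_000_000:
        raise ValueError("nanoseconds must be in 0..999999999")
    if not 0 <= seq < (1 << 8):
        raise ValueError("seq must fit in 8 bits")
    if rand is None:
        rand = int.from_bytes(os.urandom(5), "big")   # 40 random bits
    unixts = unix_seconds & ((1 << 36) - 1)
    frac = (nanoseconds << 38) // 1_000_000_000       # 38 bit binary fraction
    nsec_a = frac >> 26                # most significant 12 bits -> subsec_a
    nsec_b = (frac >> 14) & 0xFFF      # next 12 bits             -> subsec_b
    nsec_c = frac & 0x3FFF             # last 14 bits -> top of subsec_seq_node
    value = ((unixts << 92) | (nsec_a << 80) | (0b0111 << 76)
             | (nsec_b << 64) | (0b10 << 62) | (nsec_c << 48)
             | (seq << 40) | (rand & ((1 << 40) - 1)))
    return uuid.UUID(int=value)
```

   The microsecond layout follows the same pattern with a 24 bit
   fraction split across subsec_a and subsec_b, a 14 bit sequence
   counter, and 48 random bits.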
    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                            unixts                             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |unixts |         nsec          |  ver  |         nsec          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |var|           nsec            |      seq      |     rand      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                             rand                              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

        Figure 5: UUIDv7 Field and Bit Layout - Encoding Example
                         (Nanosecond Precision)

4.4.4.2.  UUIDv7 Decoding

   When decoding or parsing a UUIDv7 value there are only two values to
   be considered:

   1.  The Unix timestamp defined as unixts

   2.  The sub-second precision values defined as subsec_a, subsec_b,
       and subsec_seq_node

   As detailed in Figure 2 the Unix timestamp (unixts) is always the
   first 36 bits of the UUIDv7 layout.

   Similarly, as per Figure 2, the sub-second precision values lie
   within subsec_a, subsec_b, and subsec_seq_node, which are all
   interpreted as sub-second information after skipping over the
   version (ver) and variant (var) bits.  These concatenated sub-second
   bits are interpreted so that each bit, from most to least
   significant, represents a further division by two.  This is the same
   positional notation used to express fractional numbers, except in
   binary.  For example, in decimal ".1" means one tenth, and ".01"
   means one hundredth.  In this subsec field, a 1 means one half, 01
   means one quarter, 001 is one eighth, etc.  This scheme can work for
   any number of bits up to the maximum available, and keeps the most
   significant data leftmost in the bit sequence.

   To perform the sub-second math, simply take the first (most
   significant/leftmost) N bits of subsec and divide the resulting
   integer by 2^N.  Take for example:

   1.  To parse the first 16 bits, extract that value as an integer and
       divide it by 65536 (2^16).

   2.  If these 16 bits are 0101 0101 0101 0101, then treating that as
       an integer gives 0x5555 or 21845 in decimal, and dividing by
       65536 gives 0.3333282.

   This sub-second encoding scheme provides maximum interoperability
   across systems where different levels of time precision are
   required, feasible, or available.  The timestamp value derived from
   a UUIDv7 value SHOULD be "as close to the correct value as possible"
   when parsed, even across disparate systems.

   Take for example the starting point for the next two UUIDv7 parsing
   scenarios:

   1.  System A produces a UUIDv7 with a microsecond-precise timestamp
       value.

   2.  System B is unaware of the precision encoded in the UUIDv7
       timestamp by System A.

   Scenario 1:

   1.  System B parses the embedded timestamp with millisecond
       precision (less precision than the encoder).

   2.  System B SHOULD return the correct millisecond value encoded by
       System A (truncated to milliseconds).

   Scenario 2:

   1.  System B parses the timestamp with nanosecond precision (more
       precision than the encoder).

   2.  The value returned by System B SHOULD have the same microsecond
       level of precision provided by the encoder, with the additional
       precision down to the nanosecond level being essentially random,
       as determined by the random value encoded at the end of the
       UUIDv7.

4.5.  UUIDv8 Layout and Bit Order

   UUIDv8 offers variable-size timestamp, clock sequence, and node
   values which allow for a highly customizable UUID that fits a given
   application's needs.

   UUIDv8 SHOULD only be utilized if an implementation cannot utilize
   UUIDv1, UUIDv6, or UUIDv7.  Some situations in which UUIDv8 usage
   could occur:

   *  An implementation would like to utilize a timestamp source not
      defined by the current time-based UUIDs.

   *  An implementation would like to utilize a timestamp bit layout not
      defined by the current time-based UUIDs.

   *  An implementation would like to avoid truncating a 64 bit Unix
      timestamp to 36 bits as defined by UUIDv7.

   *  An implementation would like a specific level of precision within
      the timestamp not offered by current time-based UUIDs.

   *  An implementation would like to embed extra information within the
      UUID node other than what is defined in this document.

   *  An implementation has other application/language restrictions
      which inhibit the usage of one of the current time-based UUIDs.

   Roughly speaking, a properly formatted UUIDv8 SHOULD contain the
   following sections, adding up to a total of 128 bits.

   -  Timestamp Bits (Variable Length)
   -  Clock Sequence Bits (Variable Length)
   -  Node Bits (Variable Length)
   -  UUIDv8 Version Bits (4 bits)
   -  UUID Variant Bits (2 bits)

   The only explicitly defined bits are the Version and Variant, leaving
   122 bits for implementation-specific time-based UUIDs.  To be clear:
   UUIDv8 is not a replacement for UUIDv4 where all 122 extra bits are
   filled with random data.  UUIDv8's 128 bits (including the version
   and variant) SHOULD contain at the minimum a timestamp of some format
   in the most significant bit position, followed directly by a clock
   sequence counter, and finally a node containing either random data or
   implementation-specific data.

   A sample format in Figure 6 is used to further illustrate the point
   for the 16-octet, 128 bit UUIDv8.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         timestamp_32                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |         timestamp_48          |  ver  |      time_or_seq      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |var|  seq_or_node  |                   node                    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                             node                              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                  Figure 6: UUIDv8 Field and Bit Layout

   timestamp_32:
      The most significant 32 bits of the desired timestamp source.
      Occupies bits 0 through 31 (octets 0-3).

   timestamp_48:
      The next 16 bits of the timestamp source when a timestamp source
      with at least 48 bits is used.  When a 32 bit timestamp source is
      utilized, these bits are set to 0.  Occupies bits 32 through 47
      (octets 4-5).

   ver:
      The 4 bit UUIDv8 version (1000).  Occupies bits 48 through 51.

   time_or_seq:
      If a 60 bit, or larger, timestamp is used, these 12 bits are used
      to fill out the remaining timestamp.  If a 32 or 48 bit timestamp
      is leveraged, a 12 bit clock sequence MAY be used.  Together ver
      and time_or_seq occupy bits 48 through 63 (octets 6-7).

   var:
      The 2 bit UUID variant (10).

   seq_or_node:
      If a 60 bit, or larger, timestamp source is leveraged, these 8
      bits SHOULD be allocated for an 8 bit clock sequence counter.  If
      a 32 or 48 bit timestamp source is used, these 8 bits SHOULD be
      set to random data.

   node:
      In most implementations these bits will likely be set to
      pseudo-random data.  However, implementations can utilize the
      node as they see fit.  Together var, seq_or_node, and node occupy
      bits 64 through 127 (octets 8-15).

4.5.1.  UUIDv8 Timestamp Usage

   UUIDv8's usage of timestamp relaxes both the timestamp source and
   timestamp length.
   Implementations are free to utilize any monotonically stable
   timestamp source for UUIDv8.

   Some examples include:

   -  Custom Epoch
   -  NTP timestamp
   -  ISO 8601 timestamp
   -  Full, non-truncated 64 bit Unix Epoch timestamp

   The relaxed nature of UUIDv8 timestamps also works to future-proof
   this specification and gives implementations a method to create
   compliant time-based UUIDs using timestamp sources that might not yet
   be defined.

   Timestamps come in many sizes, and UUIDv8 defines three fields that
   can easily be used for the majority of timestamp lengths:

   *  32 bit timestamp: using timestamp_32 and setting timestamp_48 to
      0s

   *  48 bit timestamp: using timestamp_32 and timestamp_48 entirely

   *  60 bit timestamp: using timestamp_32, timestamp_48, and
      time_or_seq

   *  64 bit timestamp: using timestamp_32, timestamp_48, and
      time_or_seq and truncating the timestamp to the 60 most
      significant bits.

   Although it is possible to create a timestamp larger than 64 bits in
   size, the usage and bit layout of that timestamp format is up to the
   implementation.  When a timestamp exceeds the 64th bit (octet 7),
   extra care must be taken to ensure the Variant bits are properly
   inserted at their respective location in the UUID.  Likewise, the
   Version MUST always be implemented at the appropriate location.

   Any timestamp that does not entirely fill timestamp_32, timestamp_48,
   or time_or_seq MUST set all leftover bits in the least significant
   position of the respective field to 0.  For example, a 36 bit
   timestamp source would fully utilize timestamp_32 and 4 bits of
   timestamp_48.  The remaining 12 bits in timestamp_48 MUST be set to
   0.
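   The zero-padding rule above can be illustrated with a short Python
   sketch.  The helper name split_uuid8_timestamp is hypothetical; it
   left-aligns a timestamp of a stated bit length across timestamp_32,
   timestamp_48, and time_or_seq, and keeps only the 60 most significant
   bits of longer timestamps, as described for 64 bit sources.

```python
from typing import Tuple


def split_uuid8_timestamp(ts: int, bits: int) -> Tuple[int, int, int]:
    """Return (timestamp_32, timestamp_48, time_or_seq) for a timestamp.

    Hypothetical helper: `bits` is the declared width of the timestamp
    source, between 1 and 64.
    """
    if not 1 <= bits <= 64:
        raise ValueError("bits must be in the range 1..64")
    if not 0 <= ts < (1 << bits):
        raise ValueError("timestamp does not fit in the declared width")

    # Sources wider than 60 bits keep their 60 most significant bits.
    if bits > 60:
        ts >>= bits - 60
        bits = 60

    # Left-align within the 60 available timestamp bits so that any
    # leftover least significant bits are 0.
    aligned = ts << (60 - bits)

    timestamp_32 = aligned >> 28
    timestamp_48 = (aligned >> 12) & 0xFFFF
    # For sources of 48 bits or fewer this is 0, which leaves the field
    # free to carry a clock sequence instead.
    time_or_seq = aligned & 0xFFF
    return timestamp_32, timestamp_48, time_or_seq
```

   With the 36 bit input 0xABCDEF123 this yields timestamp_32 =
   0xABCDEF12 and timestamp_48 = 0x3000, that is, 4 timestamp bits
   followed by 12 zero bits.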

   By using implementation-specific timestamp sources it is not
   guaranteed that devices outside of the application context are able
   to extract and parse the timestamp from a UUIDv8 without some
   pre-existing knowledge of the source timestamp used by the UUIDv8
   implementation.

4.5.2.  UUIDv8 Clock Sequence Usage

   A clock sequence MUST be used with UUIDv8 to provide added sequencing
   guarantees when multiple UUIDv8 values are created on the same clock
   tick.  The number of bits allocated to the clock sequence depends on
   the precision of the timestamp source.  For example, a more precise
   timestamp source using nanosecond precision will require fewer clock
   sequence bits than a timestamp source utilizing seconds for
   precision.

   The UUIDv8 layout in Figure 6 generically defines two possible clock
   sequence values that can be leveraged:

   *  12 bit clock sequence using time_or_seq, for use when the
      timestamp is 48 bits or fewer, which allows for 4096 UUIDs per
      clock tick.

   *  8 bit clock sequence using seq_or_node, for use when the timestamp
      uses more than 48 bits, which allows for 256 UUIDs per clock tick.

   An implementation MAY use both time_or_seq and seq_or_node for clock
   sequencing; however, it is highly unlikely that 20 bits of clock
   sequence are needed for a given clock tick.  Furthermore, more bits
   from the node MAY be used for clock sequencing in the event that 8
   bits is not sufficient.

   The clock sequence MUST start at zero and increment monotonically for
   each new UUIDv8 created by the application on the same timestamp.
   When the timestamp increments the clock sequence MUST be reset to
   zero.  The clock sequence MUST NOT roll over or reset to zero unless
   the timestamp has incremented.  Care MUST be given to ensure that an
   adequately sized clock sequence is selected for a given application
   based on expected timestamp precision and expected UUIDv8 generation
   rates.

4.5.3.  UUIDv8 Node Usage

   The UUIDv8 Node MAY contain any set of data an implementation
   desires; however, the node MUST NOT be set to all 0s because an
   all-zero node does not ensure global uniqueness.  In most scenarios
   the node will be filled with pseudo-random data.

   The UUIDv8 layout in Figure 6 defines 2 sizes of Node depending on
   the timestamp size:

   *  62 bit node encompassing seq_or_node and node, used when a
      timestamp of 48 bits or less is leveraged.

   *  54 bit node, used when all 60 bits of the timestamp are in use and
      seq_or_node is used for clock sequencing.

   An implementation MAY choose to allocate bits from the node to the
   timestamp, clock sequence, or an application-specific embedded field.
   It is recommended that implementations utilize a node of at least 48
   bits to ensure global uniqueness can be guaranteed.

4.5.4.  UUIDv8 Basic Creation Algorithm

   The entire usage of UUIDv8 is meant to be variable and allow as much
   customization as possible to meet specific application/language
   requirements.  As such, UUIDv8 implementations will likely vary among
   applications.

   The following algorithms are generic implementations using Figure 6
   and the recommendations outlined in this specification.

   *32 bit timestamp, 12 bit sequence counter, 62 bit node:*

   1.   From a system-wide shared stable store (e.g., a file) or global
        variable, read the UUID generator state: the values of the
        timestamp and clock sequence used to generate the last UUID.

   2.   Obtain the current time from the selected clock source as 32
        bits.

   3.   Set the 32 bit field timestamp_32 to the 32 bits from the
        timestamp.

   4.   Set the 16 bit timestamp_48 to all 0s.

   5.   Set the version to 8 (1000).

   6.   If the state was unavailable (e.g., non-existent or corrupted)
        or the current timestamp is greater than the saved timestamp,
        set the 12 bit clock sequence value (time_or_seq) to 0.

   7.   If the state was available and the saved timestamp is equal to
        the current timestamp, increment the clock sequence value
        (time_or_seq).

   8.   Set the variant to binary 10.

   9.   Generate 62 random bits and fill in 8 bits for seq_or_node and
        54 bits for the node.

   10.  Format by concatenating the 128 bits as: timestamp_32|
        timestamp_48|version|time_or_seq|variant|seq_or_node|node

   11.  Save the state (current timestamp and clock sequence) back to
        the stable store.

   *48 bit timestamp, 12 bit sequence counter, 62 bit node:*

   1.  From a system-wide shared stable store (e.g., a file) or global
       variable, read the UUID generator state: the values of the
       timestamp and clock sequence used to generate the last UUID.

   2.  Obtain the current time from the selected clock source as 48
       bits.

   3.  Set the 32 bit field timestamp_32 to the 32 most significant bits
       from the timestamp.

   4.  Set the 16 bit timestamp_48 to the 16 least significant bits from
       the timestamp.

   5.  The rest of the steps are the same as the previous example.

   *60 bit timestamp, 8 bit sequence counter, 54 bit node:*

   1.   From a system-wide shared stable store (e.g., a file) or global
        variable, read the UUID generator state: the values of the
        timestamp and clock sequence used to generate the last UUID.

   2.   Obtain the current time from the selected clock source as 60
        bits.

   3.   Set the 32 bit field timestamp_32 to the 32 most significant
        bits from the timestamp.

   4.   Set the 16 bit timestamp_48 to the 16 middle bits from the
        timestamp.

   5.   Set the version to 8 (1000).

   6.   Set the 12 bit time_or_seq to the 12 least significant bits from
        the timestamp.

   7.   Set the variant to binary 10.

   8.   If the state was unavailable (e.g., non-existent or corrupted)
        or the current timestamp is greater than the saved timestamp,
        set the 8 bit clock sequence value (seq_or_node) to 0.

   9.   If the state was available and the saved timestamp is equal to
        the current timestamp, increment the clock sequence value
        (seq_or_node).

   10.  Generate 54 random bits and fill in the node.

   11.  Format by concatenating the 128 bits as: timestamp_32|
        timestamp_48|version|time_or_seq|variant|seq_or_node|node

   12.  Save the state (current timestamp and clock sequence) back to
        the stable store.

   *64 bit timestamp, 8 bit sequence counter, 54 bit node:*

   1.  The same steps as the 60 bit timestamp can be utilized if the 64
       bit timestamp is truncated to 60 bits.

   2.  Implementations MAY choose to truncate the most or least
       significant bits, but it is recommended to utilize the most
       significant 60 bits and lose 4 bits of precision in the
       nanoseconds or microseconds position.

   *General algorithm for generating UUIDv8 layouts not defined here:*

   1.  From a system-wide shared stable store (e.g., a file) or global
       variable, read the UUID generator state: the values of the
       timestamp and clock sequence used to generate the last UUID.

   2.  Obtain the current time from the selected clock source using the
       desired number of bits.

   3.  Place the timestamp bits in the most significant positions of the
       128 bit UUID.

   4.  Care MUST be taken to ensure that the UUID Version and UUID
       Variant are in the correct bit positions.

       UUID Version:  Bits 48 through 51

       UUID Variant:  Bits 64 and 65

   5.  If the state was unavailable (e.g., non-existent or corrupted) or
       the current timestamp is greater than the saved timestamp, set
       the desired clock sequence value to 0.

   6.  If the state was available and the saved timestamp is equal to
       the current timestamp, increment the clock sequence value.

   7.  Set the remaining bits to the node as pseudo-random data.

   8.  Format by concatenating the 128 bits together.

   9.  Save the state (current timestamp and clock sequence) back to the
       stable store.

5.  Encoding and Storage

   The existing UUID hex and dash format of 8-4-4-4-12 is retained for
   both backwards compatibility and human readability.

   For many applications, such as databases, this format is
   unnecessarily verbose, totaling 288 bits.

   *  8 bits for each of the 32 hex characters = 256 bits

   *  8 bits for each of the 4 hyphens = 32 bits

   Where possible, UUIDs SHOULD be stored within database applications
   as the underlying 128 bit binary value.

6.  Global Uniqueness

   UUIDs created by this specification offer the same guarantees for
   global uniqueness as those found in [RFC4122].  Furthermore, the
   time-based UUIDs defined in this specification are geared towards
   database applications but MAY be used for a wide variety of use
   cases.  Just as global uniqueness is guaranteed, UUIDs are guaranteed
   to be unique within an application context within the enterprise
   domain.

7.  Distributed UUID Generation

   Some implementations might desire to utilize multi-node, clustered
   applications which involve 2 or more applications independently
   generating UUIDs that will be stored in a common location.  UUIDs
   already feature sufficient entropy to ensure that the chances of
   collision are low.  However, implementations MAY dedicate a portion
   of the node's most significant random bits to a pseudo-random
   machineID which helps identify UUIDs created by a given node.  This
   adds an extra layer of collision avoidance.

   The machineID MUST be placed in the UUID after the timestamp and
   sequence counter bits.  This position is selected to ensure that
   sorting by timestamp and clock sequence is still possible.  The
   machineID MUST NOT be an IEEE 802 MAC address.
   The creation and negotiation of the machineID among distributed nodes
   is out of scope for this specification.

8.  IANA Considerations

   This document has no IANA actions.

9.  Security Considerations

   MAC addresses pose inherent security risks and MUST NOT be used for
   node generation.  As such they have been strictly forbidden from
   time-based UUIDs within this specification.  Instead, pseudo-random
   bits SHOULD be selected from a source with sufficient entropy to
   ensure uniqueness among generated UUIDs.

   Timestamps embedded in the UUID do pose a very small attack surface.
   The timestamp in conjunction with the clock sequence does signal the
   order of creation for a given UUID and its corresponding data but
   does not define anything about the data itself or the application as
   a whole.  If UUIDs are required for use with any security operation
   within an application context in any shape or form then [RFC4122]
   UUIDv4 SHOULD be utilized.

   The machineID portion of the node, described in Section 7, does
   provide a small unique identifier which could be used to determine
   which application is generating data, but this machineID alone is not
   enough to identify a node on the network without other corresponding
   data points.  Furthermore the machineID, like the timestamp and
   sequence, does not provide any context about the data that
   corresponds to the UUID or the current state of the application as a
   whole.

10.  Acknowledgements

   The authors gratefully acknowledge the contributions of Ben Campbell,
   Ben Ramsey, Fabio Lima, Gonzalo Salgueiro, Martin Thomson, Murray S.
   Kucherawy, Rick van Rein, Rob Wilton, Sean Leonard, and Theodore Y.
   Ts'o, as well as all of those in and outside the IETF community who
   contributed to the discussions which resulted in this document.

11.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC4122]  Leach, P., Mealling, M., and R. Salz, "A Universally
              Unique IDentifier (UUID) URN Namespace", RFC 4122,
              DOI 10.17487/RFC4122, July 2005.

12.  Informative References

   [LexicalUUID]
              Twitter, "A Scala client for Cassandra", Commit f6da4e0,
              November 2012.

   [Snowflake]
              Twitter, "Snowflake is a network service for generating
              unique ID numbers at high scale with some simple
              guarantees.", Commit b3f6a3c, May 2014.

   [Flake]    Boundary, "Flake: A decentralized, k-ordered id generation
              service in Erlang", Commit 15c933a, February 2017.

   [ShardingID]
              Instagram Engineering, "Sharding & IDs at Instagram",
              December 2012.

   [KSUID]    Segment, "K-Sortable Globally Unique IDs", Commit bf376a7,
              July 2020.

   [Elasticflake]
              Pearcy, P., "Sequential UUID / Flake ID generator pulled
              out of elasticsearch common", Commit dd71c21, January
              2015.

   [FlakeID]  Pawlak, T., "Flake ID Generator", Commit fcd6a2f, April
              2020.

   [Sonyflake]
              Sony, "A distributed unique ID generator inspired by
              Twitter's Snowflake", Commit 848d664, August 2020.

   [orderedUuid]
              Cabrera, IT., "Laravel: The mysterious "Ordered UUID"",
              January 2020.

   [COMBGUID] Tallent, R., "Creating sequential GUIDs in C# for MSSQL or
              PostgreSql", Commit 2759820, December 2020.

   [ULID]     Feerasta, A., "Universally Unique Lexicographically
              Sortable Identifier", Commit d0c7170, May 2019.

   [SID]      Chilton, A., "sid : generate sortable identifiers",
              Commit 660e947, June 2019.

   [pushID]   Google, "The 2^120 Ways to Ensure Unique Identifiers",
              February 2015.

   [XID]      Poitrey, O., "Globally Unique ID Generator",
              Commit efa678f, October 2020.

   [ObjectID] MongoDB, "ObjectId - MongoDB Manual".

   [CUID]     Elliott, E., "Collision-resistant ids optimized for
              horizontal scaling and performance.", Commit 215b27b,
              October 2020.

Authors' Addresses

   Brad G. Peabody

   Email: brad@peabody.io


   Kyzer R. Davis

   Email: kydavis@cisco.com