< draft-peabody-dispatch-new-uuid-format-02.txt   draft-peabody-dispatch-new-uuid-format-03.txt >
dispatch BGP. Peabody dispatch BGP. Peabody
Internet-Draft Internet-Draft
Updates: 4122 (if approved) K. Davis Updates: 4122 (if approved) K. Davis
Intended status: Standards Track 7 October 2021 Intended status: Standards Track 31 March 2022
Expires: 10 April 2022 Expires: 2 October 2022
New UUID Formats New UUID Formats
draft-peabody-dispatch-new-uuid-format-02 draft-peabody-dispatch-new-uuid-format-03
Abstract Abstract
This document presents new time-based UUID formats which are suited This document presents new Universally Unique Identifier (UUID)
for use as a database key. formats for use in modern applications and databases.
A common case for modern applications is to create a unique
identifier for use as a primary key in a database table. This
identifier usually implements an embedded timestamp that is sortable
using the monotonic creation time in the most significant bits. In
addition the identifier is highly collision resistant, difficult to
guess, and provides minimal security attack surfaces. None of the
existing UUID versions, including UUIDv1, fulfill each of these
requirements in the most efficient possible way. This document is a
proposal to update [RFC4122] with three new UUID versions that
address these concerns, each with different trade-offs.
Status of This Memo Status of This Memo
This Internet-Draft is submitted in full conformance with the This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79. provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/. Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on 10 April 2022. This Internet-Draft will expire on 2 October 2022.
Copyright Notice Copyright Notice
Copyright (c) 2021 IETF Trust and the persons identified as the Copyright (c) 2022 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents (https://trustee.ietf.org/ Provisions Relating to IETF Documents (https://trustee.ietf.org/
license-info) in effect on the date of publication of this document. license-info) in effect on the date of publication of this document.
Please review these documents carefully, as they describe your rights Please review these documents carefully, as they describe your rights
and restrictions with respect to this document. Code Components and restrictions with respect to this document. Code Components
extracted from this document must include Simplified BSD License text extracted from this document must include Revised BSD License text as
as described in Section 4.e of the Trust Legal Provisions and are described in Section 4.e of the Trust Legal Provisions and are
provided without warranty as described in the Simplified BSD License. provided without warranty as described in the Revised BSD License.
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3
2. Background . . . . . . . . . . . . . . . . . . . . . . . . . 3 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 4
2.1. Requirements Language . . . . . . . . . . . . . . . . . . 4
2.2. Abbreviations . . . . . . . . . . . . . . . . . . . . . . 5
3. Summary of Changes . . . . . . . . . . . . . . . . . . . . . 5 3. Summary of Changes . . . . . . . . . . . . . . . . . . . . . 5
3.1. changelog . . . . . . . . . . . . . . . . . . . . . . . . 6 3.1. changelog . . . . . . . . . . . . . . . . . . . . . . . . 5
4. Format . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 4. Variant and Version Fields . . . . . . . . . . . . . . . . . 7
4.1. Versions . . . . . . . . . . . . . . . . . . . . . . . . 7 5. New Formats . . . . . . . . . . . . . . . . . . . . . . . . . 8
4.2. Variant . . . . . . . . . . . . . . . . . . . . . . . . . 7 5.1. UUID Version 6 . . . . . . . . . . . . . . . . . . . . . 8
4.3. UUIDv6 Layout and Bit Order . . . . . . . . . . . . . . . 7 5.2. UUID Version 7 . . . . . . . . . . . . . . . . . . . . . 9
4.3.1. UUIDv6 Basic Creation Algorithm . . . . . . . . . . . 9 5.3. UUID Version 8 . . . . . . . . . . . . . . . . . . . . . 10
4.4. UUIDv7 Layout and Bit Order . . . . . . . . . . . . . . . 10 5.4. Max UUID . . . . . . . . . . . . . . . . . . . . . . . . 11
4.4.1. UUIDv7 Timestamp Usage . . . . . . . . . . . . . . . 11 6. UUID Best Practices . . . . . . . . . . . . . . . . . . . . . 12
4.4.2. UUIDv7 Clock Sequence Usage . . . . . . . . . . . . . 12 6.1. Timestamp Granularity . . . . . . . . . . . . . . . . . . 12
4.4.3. UUIDv7 Node Usage . . . . . . . . . . . . . . . . . . 12 6.2. Monotonicity and Counters . . . . . . . . . . . . . . . . 13
4.4.4. UUIDv7 Encoding and Decoding . . . . . . . . . . . . 12 6.3. Distributed UUID Generation . . . . . . . . . . . . . . . 16
4.5. UUIDv8 Layout and Bit Order . . . . . . . . . . . . . . . 17 6.4. Collision Resistance . . . . . . . . . . . . . . . . . . 17
4.5.1. UUIDv8 Timestamp Usage . . . . . . . . . . . . . . . 19 6.5. Global and Local Uniqueness . . . . . . . . . . . . . . . 18
4.5.2. UUIDv8 Clock Sequence Usage . . . . . . . . . . . . . 20 6.6. Unguessability . . . . . . . . . . . . . . . . . . . . . 18
4.5.3. UUIDv8 Node Usage . . . . . . . . . . . . . . . . . . 21 6.7. Sorting . . . . . . . . . . . . . . . . . . . . . . . . . 18
4.5.4. UUIDv8 Basic Creation Algorithm . . . . . . . . . . . 21 6.8. Opacity . . . . . . . . . . . . . . . . . . . . . . . . . 19
5. Encoding and Storage . . . . . . . . . . . . . . . . . . . . 24 6.9. DBMS and Database Considerations . . . . . . . . . . . . 19
6. Global Uniqueness . . . . . . . . . . . . . . . . . . . . . . 25 7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 19
7. Distributed UUID Generation . . . . . . . . . . . . . . . . . 25 8. Security Considerations . . . . . . . . . . . . . . . . . . . 19
8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 25 9. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 20
9. Security Considerations . . . . . . . . . . . . . . . . . . . 25 10. Normative References . . . . . . . . . . . . . . . . . . . . 20
10. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 26 11. Informative References . . . . . . . . . . . . . . . . . . . 20
11. Normative References . . . . . . . . . . . . . . . . . . . . 26 Appendix A. Example Code . . . . . . . . . . . . . . . . . . . . 22
12. Informative References . . . . . . . . . . . . . . . . . . . 26 A.1. Creating a UUIDv6 Value . . . . . . . . . . . . . . . . . 22
A.2. Creating a UUIDv7 Value . . . . . . . . . . . . . . . . . 23
A.3. Creating a UUIDv8 Value . . . . . . . . . . . . . . . . . 24
Appendix B. Test Vectors . . . . . . . . . . . . . . . . . . . . 24
B.1. Example of a UUIDv6 Value . . . . . . . . . . . . . . . . 25
B.2. Example of a UUIDv7 Value . . . . . . . . . . . . . . . . 26
B.3. Example of a UUIDv8 Value . . . . . . . . . . . . . . . . 26
Appendix C. Version and Variant Tables . . . . . . . . . . . . . 27
C.1. Variant 10xx Versions . . . . . . . . . . . . . . . . . . 27
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 28 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 28
1. Introduction 1. Introduction
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", Many things have changed in the time since UUIDs were originally
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this created. Modern applications have a need to create and utilize UUIDs
document are to be interpreted as described in [RFC2119]. as the primary identifier for a variety of different items in complex
computational systems, including but not limited to database keys,
2. Background file names, machine or system names, and identifiers for event-driven
transactions.
A lot of things have changed in the time since UUIDs were originally One area in which UUIDs have gained popularity is database keys. This
created. Modern applications have a need to use (and many have stems from the increasingly distributed nature of modern
already implemented) UUIDs as database primary keys. applications. In such cases, "auto increment" schemes often used by
databases do not work well, as the effort required to coordinate
unique numeric identifiers across a network can easily become a
burden. The fact that UUIDs can be used to create unique, reasonably
short values in distributed systems without requiring synchronization
makes them a good alternative, but UUID versions 1-5 lack certain
other desirable characteristics:
The motivation for using UUIDs as database keys stems primarily from 1. Non-time-ordered UUID versions such as UUIDv4 have poor database
the fact that applications are increasingly distributed in nature. index locality, meaning new values created in succession are not
Simplistic "auto increment" schemes with integers in sequence do not close to each other in the index and thus require inserts to be
work well in a distributed system since the effort required to performed at random locations. The negative performance effects
synchronize such numbers across a network can easily become a burden. on the structures commonly used for indexes (B-tree and its
The fact that UUIDs can be used to create unique and reasonably short variants) can be dramatic.
values in distributed systems without requiring synchronization makes
them a good candidate for use as a database key in such environments.
However some properties of [RFC4122] UUIDs are not well suited to 2. The 100-nanosecond, Gregorian epoch used in UUIDv1 timestamps is
this task. First, most of the existing UUID versions such as UUIDv4 uncommon and difficult to represent accurately using a standard
have poor database index locality, meaning new values created in number format such as [IEEE754].
succession are not close to each other in the index and thus require
inserts to be performed at random locations. The negative
performance effects on the structures commonly used for indexes
(B-tree and its variants) can be dramatic. As such newly inserted
values SHOULD be time-ordered to address this.
While it is true that UUIDv1 does contain an embedded timestamp and 3. Introspection/parsing is required to order by time sequence, as
can be time-ordered, UUIDv1 has other issues. It is possible to sort opposed to being able to perform a simple byte-by-byte
Version 1 UUIDs by time but it is a laborious task. The process comparison.
requires breaking the bytes of the UUID into various pieces, re-
ordering the bits, and then determining the order from the
reconstructed timestamp. This is not efficient in very large
systems. Implementations would be simplified with a sort order where
the UUID can simply be treated as an opaque sequence of bytes and
ordered as such.
After the embedded timestamp, the remaining 64 bits are in essence 4. Privacy and network security issues arise from using a MAC
used to provide uniqueness both on a global scale and within a given address in the node field of Version 1 UUIDs. Exposed MAC
timestamp tick. The clock sequence value ensures that multiple addresses can be used as an attack surface to locate machines and
UUIDs generated for the same timestamp value are given a reveal various other information about such machines (minimally
monotonic sequence value. This explicit sequencing helps further manufacturer, potentially other details). Additionally, with the
facilitate sorting. The remaining random bits ensure collisions are advent of virtual machines and containers, MAC address uniqueness
minimal. is no longer guaranteed.
Furthermore, UUIDv1 utilizes a non-standard timestamp epoch derived 5. Many of the implementation details specified in [RFC4122] involve
from the Gregorian Calendar. More specifically, the Coordinated trade-offs that are neither possible to specify for all
Universal Time (UTC) as a count of 100-nanosecond intervals since applications nor necessary to produce interoperable
00:00:00.00, 15 October 1582. Implementations and many languages may implementations.
find it easier to implement the widely adopted and well known Unix
Epoch, a custom epoch, or another timestamp source with various
levels of timestamp precision required by the application.
Lastly, privacy and network security issues arise from using a MAC 6. [RFC4122] does not distinguish between the requirements for
address in the node field of Version 1 UUIDs. Exposed MAC addresses generation of a UUID versus an application which simply stores
can be used as an attack surface to locate machines and reveal one, which are often different.
various other information about such machines (minimally
manufacturer, potentially other details). Instead "cryptographically
secure" pseudo-random number generators (CSPRNGs) or pseudo-random
number generators (PRNG) SHOULD be used within an application context
to provide uniqueness and unguessability.
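As a rough illustration of the timestamp concerns above, the sketch below converts a Unix time to the 100-nanosecond Gregorian count used by UUIDv1 and shows that such a count may not survive a round trip through an [IEEE754] double. The helper name gregorian_100ns and the sample value are assumptions made for this example only and do not appear in either draft.

```python
from datetime import datetime, timezone

# 100-nanosecond intervals between 1582-10-15 and 1970-01-01 (both UTC).
GREGORIAN_EPOCH = datetime(1582, 10, 15, tzinfo=timezone.utc)
UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
EPOCH_OFFSET_100NS = (UNIX_EPOCH - GREGORIAN_EPOCH).days * 86_400 * 10_000_000


def gregorian_100ns(unix_ns: int) -> int:
    """Hypothetical helper: Unix nanoseconds -> Gregorian 100-ns count."""
    return unix_ns // 100 + EPOCH_OFFSET_100NS


# An illustrative 57 bit timestamp with its lowest bit set.
sample = 0x1EC9414C232AB01
# A double carries a 53 bit significand, so the lowest bits are rounded away.
round_tripped = int(float(sample))
print(hex(sample), hex(round_tripped))
```

An integer type of at least 60 bits avoids the rounding; the float conversion is shown only to make the precision loss visible.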
Due to the shortcomings of UUIDv1 and UUIDv4 detailed so far, many Due to the aforementioned issues, many widely distributed database
widely distributed database applications and large application applications and large application vendors have sought to solve the
vendors have sought to solve the problem of creating a better time- problem of creating a better time-based, sortable unique identifier
based, sortable unique identifier for use as a database key. This for use as a database key. This has led to numerous implementations
has led to numerous implementations over the past 10+ years solving over the past 10+ years solving the same problem in slightly
the same problem in slightly different ways. different ways.
While preparing this specification the following 16 different While preparing this specification the following 16 different
implementations were analyzed for trends in total ID length, bit implementations were analyzed for trends in total ID length, bit
layout, lexical formatting/encoding, timestamp type, timestamp layout, lexical formatting/encoding, timestamp type, timestamp
format, timestamp accuracy, node format/components, collision format, timestamp accuracy, node format/components, collision
handling and multi-timestamp tick generation sequencing. handling and multi-timestamp tick generation sequencing.
1. [LexicalUUID] by Twitter 1. [ULID] by A. Feerasta
2. [Snowflake] by Twitter 2. [LexicalUUID] by Twitter
3. [Flake] by Boundary 3. [Snowflake] by Twitter
4. [ShardingID] by Instagram 4. [Flake] by Boundary
5. [KSUID] by Segment 5. [ShardingID] by Instagram
6. [Elasticflake] by P. Pearcy 6. [KSUID] by Segment
7. [FlakeID] by T. Pawlak 7. [Elasticflake] by P. Pearcy
8. [Sonyflake] by Sony 8. [FlakeID] by T. Pawlak
9. [orderedUuid] by IT. Cabrera 9. [Sonyflake] by Sony
10. [COMBGUID] by R. Tallent 10. [orderedUuid] by IT. Cabrera
11. [ULID] by A. Feerasta 11. [COMBGUID] by R. Tallent
12. [SID] by A. Chilton 12. [SID] by A. Chilton
13. [pushID] by Google 13. [pushID] by Google
14. [XID] by O. Poitrey 14. [XID] by O. Poitrey
15. [ObjectID] by MongoDB 15. [ObjectID] by MongoDB
16. [CUID] by E. Elliott 16. [CUID] by E. Elliott
An inspection of these implementations details the following trends
that help define this standard:
- Timestamps MUST be k-sortable. That is, values within or close An inspection of these implementations and the issues described above
to the same timestamp are ordered properly by sorting algorithms. has led to this document which attempts to adapt UUIDs to address
- Timestamps SHOULD be big-endian with the most-significant bits these issues.
of the time embedded as-is without reordering.
- Timestamps SHOULD utilize millisecond precision and Unix Epoch 2. Terminology
as timestamp source. Although, there is some variation to this
among implementations depending on the application requirements. 2.1. Requirements Language
- The ID format SHOULD be Lexicographically sortable while in the
textual representation. The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
- IDs MUST ensure proper embedded sequencing to facilitate sorting "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
when multiple UUIDs are created during a given timestamp. "OPTIONAL" in this document are to be interpreted as described in BCP
- IDs MUST NOT require unique network identifiers as part of 14 [RFC2119] [RFC8174] when, and only when, they appear in all
achieving uniqueness. capitals, as shown here.
- Distributed nodes MUST be able to create collision resistant
Unique IDs without consulting a centralized resource. 2.2. Abbreviations
The following abbreviations are used in this document:
UUID Universally Unique Identifier [RFC4122]
CSPRNG Cryptographically Secure Pseudo-Random Number Generator
MAC Media Access Control
MSB Most Significant Bit
DBMS Database Management System
3. Summary of Changes 3. Summary of Changes
In order to solve these challenges this specification introduces The following UUIDs are hereby introduced:
three new version identifiers assigned for time-based UUIDs.
The first, UUIDv6, aims to be the easiest to implement for UUID version 6 (UUIDv6)
applications which already implement UUIDv1. The UUIDv6 A re-ordering of UUID version 1 so it is sortable as an opaque
specification keeps the original Gregorian timestamp source but does sequence of bytes. Easy to implement given an existing UUIDv1
not reorder the timestamp bits as per the process utilized by UUIDv1. implementation. See Section 5.1
UUIDv6 also requires that pseudo-random data MUST be used in place of
the MAC address. The rest of the UUIDv1 format remains unchanged in
UUIDv6. See Section 4.3
Next, UUIDv7 introduces an entirely new time-based UUID bit layout UUID version 7 (UUIDv7)
utilizing a variable length timestamp sourced from the widely An entirely new time-based UUID bit layout sourced from the widely
implemented and well known Unix Epoch timestamp source. The implemented and well known Unix Epoch timestamp source. See
timestamp is broken into a 36 bit integer sections part, and is Section 5.2
followed by a field of variable length which represents the sub-
second timestamp portion, encoded so that each bit from most to least
significant adds more precision. See Section 4.4
Finally, UUIDv8 introduces a relaxed time-based UUID format that UUID version 8 (UUIDv8)
caters to application implementations that cannot utilize UUIDv1, A free-form UUID format which has no explicit requirements except
UUIDv6, or UUIDv7. UUIDv8 also future-proofs this specification by maintaining backward compatibility. See Section 5.3
allowing time-based UUID formats from timestamp sources that are not
yet defined. The variable size timestamp offers lots of Max UUID
flexibility to create an implementation specific RFC compliant time- A specialized UUID which is the inverse of [RFC4122],
based UUID while retaining the properties that make UUID great. See Section 4.1.7 See Section 5.4
Section 4.5
3.1. changelog 3.1. changelog
RFC EDITOR PLEASE DELETE THIS SECTION. RFC EDITOR PLEASE DELETE THIS SECTION.
draft-03
- Reworked the draft body to make the content more concise
- UUIDv6 section reworked to just the reorder of the timestamp
- UUIDv7 changed to simplify timestamp mechanism to just
millisecond Unix timestamp
- UUIDv8 relaxed to be custom in all elements except version and
variant
- Introduced Max UUID.
- Added C code samples in Appendix.
- Added test vectors in Appendix.
- Version and Variant section combined into one section.
- Changed from pseudo-random number generators to
cryptographically secure pseudo-random number generator (CSPRNG).
- Combined redundant topics from all UUIDs into sections such as
Timestamp granularity, Monotonicity and Counters, Collision
Resistance, Sorting, and Unguessability, etc.
- Split Encoding and Storage into Opacity and DBMS and Database
Considerations
- Reworked Global Uniqueness under new section Global and Local
Uniqueness
- Node verbiage only used in UUIDv6 all others reference random/
rand instead
- Clock sequence verbiage changed simply to counter in any section
other than UUIDv6
- Added Abbreviations section
- Updated IETF Draft XML Layout
- Added information about little-endian UUIDs
draft-02 draft-02
- Added Changelog - Added Changelog
- Fixed misc. grammatical errors - Fixed misc. grammatical errors
- Fixed section numbering issue - Fixed section numbering issue
- Fixed some UUIDvX reference issues - Fixed some UUIDvX reference issues
- Changed all instances of "motonic" to "monotonic" - Changed all instances of "motonic" to "monotonic"
- Changed all instances of "#-bit" to "# bit" - Changed all instances of "#-bit" to "# bit"
- Changed "proceeding" veriage to "after" in section 7 - Changed "proceeding" verbiage to "after" in section 7
- Added details on how to pad 32 bit unix timestamp to 36 bits in - Added details on how to pad 32 bit Unix timestamp to 36 bits in
UUIDv7 UUIDv7
- Added details on how to truncate 64 bit unix timestamp to 36 - Added details on how to truncate 64 bit Unix timestamp to 36
bits in UUIDv7 bits in UUIDv7
- Added forward reference and bullet to UUIDv8 if truncating 64 - Added forward reference and bullet to UUIDv8 if truncating 64
bit Unix Epoch is not an option. bit Unix Epoch is not an option.
- Fixed bad reference to non-existent "time_or_node" in section - Fixed bad reference to non-existent "time_or_node" in section
4.5.4 4.5.4
draft-01 draft-01
- Complete rewrite of entire document. - Complete rewrite of entire document.
- The format, flow and verbiage used in the specification has been - The format, flow and verbiage used in the specification has been
reworked to mirror the original RFC 4122 and current IETF reworked to mirror the original RFC 4122 and current IETF
standards. standards.
- Removed the topics of UUID length modification, alternate UUID - Removed the topics of UUID length modification, alternate UUID
text formats, and alternate UUID encoding techniques. text formats, and alternate UUID encoding techniques.
- Research into 16 different historical and current - Research into 16 different historical and current
implementations of time-based universal identifiers was completed implementations of time-based universal identifiers was completed
at the end of 2020 in an attempt to identify trends which have at the end of 2020 in an attempt to identify trends which have
directly influenced design decisions in this draft document directly influenced design decisions in this draft document
(https://github.com/uuid6/uuid6-ietf-draft/tree/master/research) (https://github.com/uuid6/uuid6-ietf-draft/tree/master/research)
- Prototype implementations have been completed for UUIDv6, UUIDv7, - Prototype implementations have been completed for UUIDv6, UUIDv7,
and UUIDv8 in various languages by many GitHub community members. and UUIDv8 in various languages by many GitHub community members.
(https://github.com/uuid6/prototypes) (https://github.com/uuid6/prototypes)
4. Format 4. Variant and Version Fields
The UUID length of 16 octets (128 bits) remains unchanged. The The variant bits utilized by UUIDs in this specification remain in
textual representation of a UUID consisting of 36 hexadecimal and the same octet as originally defined by [RFC4122], Section 4.1.1.
dash characters in the format 8-4-4-4-12 remains unchanged for human
readability. In addition the position of both the Version and
Variant bits remain unchanged in the layout.
4.1. Versions The next table details Variant 10xx (8/9/A/B) and the new versions
defined by this specification. A complete guide to all versions
within this variant has been included in Appendix C.1.
Table 1 defines the 4 bit version found in Bits 48 through 51 within +------+------+------+------+---------+---------------------------+
a given UUID. | Msb0 | Msb1 | Msb2 | Msb3 | Version | Description |
+------+------+------+------+---------+---------------------------+
| 0 | 1 | 1 | 0 | 6 | Reordered Gregorian time- |
| | | | | | based UUID specified in |
| | | | | | this document. |
+------+------+------+------+---------+---------------------------+
| 0 | 1 | 1 | 1 | 7 | Unix Epoch time-based |
| | | | | | UUID specified in this |
| | | | | | document. |
+------+------+------+------+---------+---------------------------+
| 1 | 0 | 0 | 0 | 8 | Reserved for custom UUID |
| | | | | | formats specified in this |
| | | | | | document |
+------+------+------+------+---------+---------------------------+
+------+------+------+------+---------+-----------------------+ Table 1: New UUID variant 10xx (8/9/A/B) versions defined by this
| Msb0 | Msb1 | Msb2 | Msb3 | Version | Description | specification
+------+------+------+------+---------+-----------------------+
| 0 | 1 | 1 | 0 | 6 | Reordered Gregorian |
| | | | | | time-based UUID |
+------+------+------+------+---------+-----------------------+
| 0 | 1 | 1 | 1 | 7 | Variable length Unix |
| | | | | | Epoch time-based UUID |
+------+------+------+------+---------+-----------------------+
| 1 | 0 | 0 | 0 | 8 | Custom time-based |
| | | | | | UUID |
+------+------+------+------+---------+-----------------------+
Table 1: UUID versions defined by this specification For UUID version 6, 7 and 8 the variant field placement from
[RFC4122] is unchanged. An example version/variant layout for
UUIDv6 follows the table where M is the version and N is the variant.
4.2. Variant 00000000-0000-6000-8000-000000000000
00000000-0000-6000-9000-000000000000
00000000-0000-6000-A000-000000000000
00000000-0000-6000-B000-000000000000
xxxxxxxx-xxxx-Mxxx-Nxxx-xxxxxxxxxxxx
The variant bits utilized by UUIDs in this specification remains the Figure 1: UUIDv6 Variant Examples
same as [RFC4122], Section 4.1.1.
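The placement of the version and variant bits can be checked with a few lines of code. The following is a minimal sketch that reads both fields from the four example values listed above; the function name version_and_variant is an assumption for illustration and is not defined by this specification.

```python
import uuid


def version_and_variant(u: uuid.UUID) -> tuple[int, int]:
    """Return (version nibble, two most significant variant bits)."""
    version = (u.int >> 76) & 0xF    # bits 48 through 51
    variant = (u.int >> 62) & 0b11   # bits 64 and 65
    return version, variant


examples = [
    "00000000-0000-6000-8000-000000000000",
    "00000000-0000-6000-9000-000000000000",
    "00000000-0000-6000-A000-000000000000",
    "00000000-0000-6000-B000-000000000000",
]
for text in examples:
    version, variant = version_and_variant(uuid.UUID(text))
    print(text, "version", version, "variant bits", format(variant, "02b"))
```

Each of the four values reports version 6 and variant bits 10, because the hex digits 8, 9, A and B all begin with the bit pattern 10.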
Table 2 lists the contents of the variant field, bits 64 and 65, 5. New Formats
where the letter "x" indicates a "don't-care" value. Common hex
values of 8 (1000), 9 (1001), A (1010), and B (1011) frequent the
text representation.
+------+------+------+-----------------------------------------+ The UUID format is 16 octets; the variant bits in conjunction with
| Msb0 | Msb1 | Msb2 | Description | the version bits described in the next section determine finer
+------+------+------+-----------------------------------------+ structure.
| 1 | 0 | x | The variant specified in this document. |
+------+------+------+-----------------------------------------+
Table 2: UUID Variant defined by this specification 5.1. UUID Version 6
4.3. UUIDv6 Layout and Bit Order UUID version 6 is a field-compatible version of UUIDv1, reordered for
improved DB locality. It is expected that UUIDv6 will primarily be
used in contexts where there are existing v1 UUIDs. Systems that do
not involve legacy UUIDv1 SHOULD consider using UUIDv7 instead.
UUIDv6 aims to be the easiest to implement by reusing most of the Instead of splitting the timestamp into the low, mid and high
layout of bits found in UUIDv1 but with changes to bit ordering for sections from UUIDv1, UUIDv6 changes this sequence so timestamp bytes
the timestamp. Where UUIDv1 splits the timestamp bits into three are stored from most to least significant. That is, given a 60 bit
distinct parts and orders them as time_low, time_mid, timestamp value as specified for UUIDv1 in [RFC4122], Section 4.1.4,
time_high_and_version. UUIDv6 instead keeps the source bits from the for UUIDv6, the first 48 most significant bits are stored first,
timestamp intact and changes the order to time_high, time_mid, and followed by the 4 bit version (same position), followed by the
time_low. Incidentally this will match the original 60 bit Gregorian remaining 12 bits of the original 60 bit timestamp.
timestamp source with 100-nanosecond precision defined in [RFC4122],
Section 4.1.4. The clock sequence bits remain unchanged from their
usage and position in [RFC4122], Section 4.1.5. The 48 bit node
SHOULD be set to a pseudo-random value however implementations MAY
choose retain the old MAC address behavior from [RFC4122],
Section 4.1.6 and [RFC4122], Section 4.5
The format for the 16-octet, 128 bit UUIDv6 is shown in Figure 1 The clock sequence bits remain unchanged from their usage and
position in [RFC4122], Section 4.1.5.
The 48 bit node SHOULD be set to a pseudo-random value; however,
implementations MAY choose to retain the old MAC address behavior
from [RFC4122], Section 4.1.6 and [RFC4122], Section 4.5. For more
information on MAC address usage within UUIDs see Section 8.
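The reordering described in this section can be sketched as a conversion from an existing UUIDv1 value. The helper name uuid1_to_uuid6 and the input value are assumptions chosen for illustration; the sketch keeps the variant, clock sequence and node bits exactly as they were in the UUIDv1 value rather than replacing the node with pseudo-random data.

```python
import uuid


def uuid1_to_uuid6(u1: uuid.UUID) -> uuid.UUID:
    """Hypothetical helper: reorder a UUIDv1 into the UUIDv6 layout."""
    # Rebuild the 60 bit timestamp from the three UUIDv1 time fields.
    ts = ((u1.time_hi_version & 0x0FFF) << 48) | (u1.time_mid << 32) | u1.time_low
    time_high_and_mid = ts >> 12   # 48 most significant bits, stored first
    time_low = ts & 0x0FFF         # remaining 12 bits, stored after the version
    value = (
        (time_high_and_mid << 80)
        | (0x6 << 76)              # the version keeps its original position
        | (time_low << 64)
        | (u1.int & 0xFFFF_FFFF_FFFF_FFFF)  # variant, clock sequence, node
    )
    return uuid.UUID(int=value)


u1 = uuid.UUID("c232ab00-9414-11ec-b3c8-9e6bdeced846")
print(uuid1_to_uuid6(u1))
```

Only the first 64 bits change; the trailing 64 bits are copied through unmodified, which is why an existing UUIDv1 implementation is a convenient starting point.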
The format for the 16-byte, 128 bit UUIDv6 is shown in Figure 2.
0 1 2 3 0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| time_high | | time_high |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| time_mid | time_low_and_version | | time_mid | time_low_and_version |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|clk_seq_hi_res | clk_seq_low | node (0-1) | |clk_seq_hi_res | clk_seq_low | node (0-1) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| node (2-5) | | node (2-5) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 1: UUIDv6 Field and Bit Layout Figure 2: UUIDv6 Field and Bit Layout
time_high: time_high:
The most significant 32 bits of the 60 bit starting timestamp. The most significant 32 bits of the 60 bit starting timestamp.
Occupies bits 0 through 31 (octets 0-3) Occupies bits 0 through 31 (octets 0-3)
time_mid: time_mid:
The middle 16 bits of the 60 bit starting timestamp. Occupies The middle 16 bits of the 60 bit starting timestamp. Occupies
bits 32 through 47 (octets 4-5) bits 32 through 47 (octets 4-5)
time_low_and_version: time_low_and_version:
skipping to change at page 9, line 5 skipping to change at page 9, line 28
Occupies bits 64 through 71 (octet 8) Occupies bits 64 through 71 (octet 8)
clock_seq_low: clock_seq_low:
The 8 bit low portion of the clock sequence. Occupies bits 72 The 8 bit low portion of the clock sequence. Occupies bits 72
through 79 (octet 9) through 79 (octet 9)
node: node:
48 bit spatially unique identifier. Occupies bits 80 through 127 48 bit spatially unique identifier. Occupies bits 80 through 127
(octets 10-15) (octets 10-15)
4.3.1. UUIDv6 Basic Creation Algorithm With UUIDv6 the steps for splitting the timestamp into time_high and
time_mid are OPTIONAL since the 48 bits of time_high and time_mid
The following implementation algorithm is based on [RFC4122] but with will remain in the same order. An extra step of splitting the first
changes specific to UUIDv6: 48 bits of the timestamp into the most significant 32 bits and least
significant 16 bits proves useful when reusing an existing UUIDv1
1. From a system-wide shared stable store (e.g., a file) or global implementation.
variable, read the UUID generator state: the values of the
timestamp and clock sequence used to generate the last UUID.
2. Obtain the current time as a 60 bit count of 100-nanosecond
intervals since 00:00:00.00, 15 October 1582.
3. Set the time_low field to the 12 least significant bits of the
starting 60 bit timestamp.
4. Truncate the timestamp to the 48 most significant bits in order
to create time_high_and_time_mid.
5. Set the time_high field to the 32 most significant bits of the
truncated timestamp.
6. Set the time_mid field to the 16 least significant bits of the
truncated timestamp.
7. Create the 16 bit time_low_and_version by concatenating the 4
bit UUIDv6 version with the 12 bit time_low.
8. If the state was unavailable (e.g., non-existent or corrupted)
or the saved timestamp is greater than the current timestamp,
generate a random 14 bit clock sequence value.
9. If the state was available, but the saved timestamp is less than
or equal to the current timestamp, increment the clock sequence
value.
10. Complete the 16 bit clock sequence high, low and reserved
creation by concatenating the clock sequence onto UUID variant
bits which take the most significant position in the 16 bit
value.
11. Generate a 48 bit pseudo-random node.
12. Format by concatenating the 128 bits from each part:
time_high|time_mid|time_low_and_version|variant_clk_seq|node
13. Save the state (current timestamp and clock sequence) back to
the stable store
The steps for splitting time_high_and_time_mid into time_high and
time_mid are optional since the 48 bits of time_high and time_mid
will remain in the same order as time_high_and_time_mid during the
final concatenation. This extra step of splitting into the most
significant 32 bits and least significant 16 bits proves useful when
reusing an existing UUIDv1 implementation, in which case the
following logic can be applied to reshuffle the bits with minimal
modifications.
+--------------+------+--------------+
| UUIDv1 Field | Bits | UUIDv6 Field |
+--------------+------+--------------+
| time_low | 32 | time_high |
+--------------+------+--------------+
| time_mid | 16 | time_mid |
+--------------+------+--------------+
| time_high | 12 | time_low |
+--------------+------+--------------+
Table 3: UUIDv1 to UUIDv6 Field
Mappings
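Reading Table 3 as a mapping of field positions, an existing UUIDv1
value can be turned into the UUIDv6 layout by reassembling its 60 bit
timestamp and writing it back most significant bits first. The
following Python sketch illustrates the idea. The function name
uuidv1_to_uuidv6 is illustrative only and is not defined by this
document, and carrying the UUIDv1 clock sequence and node over
unchanged is an assumption made for the example.

```python
import uuid


def uuidv1_to_uuidv6(u1: uuid.UUID) -> uuid.UUID:
    """Reorder the timestamp bits of a UUIDv1 into the UUIDv6 layout.

    Hypothetical helper: the clock sequence and node (lower 64 bits)
    are copied as-is, which is an assumption of this sketch.
    """
    if u1.version != 1:
        raise ValueError("expected a version 1 UUID")

    timestamp = u1.time                          # 60 bit Gregorian timestamp
    time_high = (timestamp >> 28) & 0xFFFFFFFF   # most significant 32 bits
    time_mid = (timestamp >> 12) & 0xFFFF        # middle 16 bits
    time_low = timestamp & 0x0FFF                # least significant 12 bits

    # 4 bit version (6) followed by the 12 bit time_low
    time_low_and_version = (0x6 << 12) | time_low

    lower_64 = u1.int & 0xFFFFFFFFFFFFFFFF       # variant, clock seq, node
    value = (
        (time_high << 96)
        | (time_mid << 80)
        | (time_low_and_version << 64)
        | lower_64
    )
    return uuid.UUID(int=value)
```

For example, uuidv1_to_uuidv6(uuid.UUID("c232ab00-9414-11ec-b3c8-
9e6bdeced846")) yields a value whose leading 48 bits are the most
significant bits of the original timestamp, so such values sort by
creation time when compared as plain byte strings.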
4.4. UUIDv7 Layout and Bit Order
The UUIDv7 format is designed to encode a Unix timestamp with 5.2. UUID Version 7
arbitrary sub-second precision. The key property provided by UUIDv7
is that timestamp values generated by one system and parsed by
another are guaranteed to have sub-second precision of either the
generator or the parser, whichever is less. Additionally, the system
parsing the UUIDv7 value does not need to know which precision was
used during encoding in order to function correctly.
The format for the 16-octet, 128 bit UUIDv7 is shown in Figure 2 UUID version 7 features a time-ordered value field derived from the
widely implemented and well known Unix Epoch timestamp source, the
number of milliseconds since midnight 1 Jan 1970 UTC, leap
seconds excluded, as well as improved entropy characteristics over
versions 1 or 6.
0 1 2 3 Implementations SHOULD utilize UUID version 7 over UUID version 1 and
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 6 if possible.
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| unixts |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|unixts | subsec_a | ver | subsec_b |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|var| subsec_seq_node |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| subsec_seq_node |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 2: UUIDv7 Field and Bit Layout 0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| unix_ts_ms |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| unix_ts_ms | ver | rand_a |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|var| rand_b |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| rand_b |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
unixts: Figure 3: UUIDv7 Field and Bit Layout
36 bit big-endian unsigned Unix Timestamp value
subsec_a: unix_ts_ms:
12 bits allocated to sub-second precision values. 48 bit big-endian unsigned number of Unix epoch timestamp as per
Section 6.1.
ver: ver:
The 4 bit UUIDv7 version (0111) 4 bit UUIDv7 version set as per Section 4
subsec_b: rand_a:
12 bits allocated to sub-second precision values. 12 bits pseudo-random data to provide uniqueness as per
Section 6.2 and Section 6.6.
var: var:
2 bit UUID variant (10) The 2 bit variant defined by Section 4.
subsec_seq_node:
The remaining 62 bits which MAY be allocated to any combination of
additional sub-second precision, sequence counter, or pseudo-
random data.
4.4.1. UUIDv7 Timestamp Usage
UUIDv7 utilizes a 36 bit big-endian unsigned Unix Timestamp value
(number of seconds since the epoch of 1 Jan 1970, leap seconds
excluded so each hour is exactly 3600 seconds long). The 36 bit
value was selected in order to provide more available time to the
unix timestamp and avoid the Year 2038 problem by extending the
maximum timestamp to the year 4147.
To achieve a 36 bit UUIDv7 timestamp, the lower 36 bits of a 64 bit
Unix time are extracted verbatim into UUIDv7.
In the event that a 32 bit Unix timestamp is in use, four zeros MUST
be prepended in the most significant (left-most) bits of
the 32 bit Unix timestamp, creating the 36 bit Unix timestamp. This
ensures sorting compatibility with 64 bit Unix timestamps which have
been truncated to 36 bits.
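The truncation and padding rules above can be sketched in Python as
follows. The names to_unixts36 and unixts36_limit are illustrative
and not defined by this document. The sketch assumes the timestamp
is available as a non-negative integer count of seconds.

```python
from datetime import datetime, timedelta, timezone

UNIXTS_BITS = 36
UNIXTS_MASK = (1 << UNIXTS_BITS) - 1


def to_unixts36(unix_seconds: int) -> int:
    """Keep the lower 36 bits of a Unix timestamp in whole seconds.

    A 32 bit timestamp passes through unchanged, which is the same as
    prepending four zero bits in the most significant position.
    """
    if unix_seconds < 0:
        raise ValueError("timestamps before the Unix epoch are not handled")
    return unix_seconds & UNIXTS_MASK


def unixts36_limit() -> datetime:
    """Return the last instant representable in 36 bits of seconds."""
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    return epoch + timedelta(seconds=UNIXTS_MASK)
```

Calling unixts36_limit().year returns 4147, which matches the
maximum year stated for the 36 bit timestamp.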
Additional sub-second precision (millisecond, nanosecond,
microsecond, etc) MAY be provided for encoding and decoding in the
remaining bits in the layout.
UUIDv8 SHOULD be used in place of UUIDv7 if an application or
implementation does not want to truncate a 64 bit Unix Epoch to the
lower 36 bits.
4.4.2. UUIDv7 Clock Sequence Usage
UUIDv7 SHOULD utilize a monotonic sequence counter to provide
additional sequencing guarantees when multiple UUIDv7 values are
created in the same UNIXTS and SUBSEC timestamp. The number of bits
allocated to the sequence counter depends on the precision of the
timestamp. For example, a more accurate timestamp source using
nanosecond precision will require fewer clock sequence bits than a
timestamp source utilizing seconds for precision. For best
sequencing results the sequence counter SHOULD be placed immediately
after available sub-second bits.
The clock sequence MUST start at zero and increment monotonically for
each new UUIDv7 created by the application on the same timestamp.
When the timestamp increments the clock sequence MUST be reset to
zero. The clock sequence MUST NOT roll over or reset to zero unless
the timestamp has incremented. Care MUST be taken to ensure that an
adequately sized clock sequence is selected for a given application
based on expected timestamp precision and expected UUIDv7 generation
rates.
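A minimal Python sketch of these counter rules is shown below. The
class name SequenceCounter and its interface are hypothetical. The
choice to raise an error when the counter is exhausted or when the
timestamp moves backwards is an assumption of this sketch, since the
text above does not prescribe how those cases are handled.

```python
class SequenceCounter:
    """Per-timestamp counter: starts at zero, resets only on a new tick."""

    def __init__(self, bits: int = 12) -> None:
        self.max_value = (1 << bits) - 1
        self.last_timestamp = None
        self.value = 0

    def next(self, timestamp: int) -> int:
        """Return the sequence value to embed for the given timestamp."""
        if self.last_timestamp is None or timestamp > self.last_timestamp:
            # The timestamp has incremented: reset the counter to zero.
            self.last_timestamp = timestamp
            self.value = 0
        elif timestamp < self.last_timestamp:
            # Assumption: refuse to continue if the clock moved backwards.
            raise ValueError("timestamp moved backwards")
        else:
            # Same timestamp: increment, but never roll over to zero.
            if self.value == self.max_value:
                raise OverflowError("sequence counter exhausted for this tick")
            self.value += 1
        return self.value
```

With a 12 bit counter this allows 4096 values (0 through 4095) per
timestamp tick before the generator has to wait for the next tick
or report an error.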
4.4.3. UUIDv7 Node Usage
UUIDv7 implementations, even with very detailed sub-second precision
and the optional sequence counter, MAY have leftover bits that will
be identified as the Node for this section. The UUIDv7 Node MAY
contain any set of data an implementation desires however the node
MUST NOT be set to all 0s which does not ensure global uniqueness.
In most scenarios the node SHOULD be filled with pseudo-random data.
4.4.4. UUIDv7 Encoding and Decoding
The UUIDv7 bit layouts for encoding and decoding are described
separately in this document.
4.4.4.1. UUIDv7 Encoding
Since the UUIDv7 Unix timestamp is fixed at 36 bits in length the
exact layout for encoding UUIDv7 depends on the precision (number of
bits) used for the sub-second portion and the sizes of the optionally
desired sequence counter and node bits.
Three examples of UUIDv7 encoding are given below as general
guidelines but implementations are not limited to just these three
examples.
All of these fields are only used during encoding, and during
decoding the system is unaware of the bit layout used for them and
considers this information opaque. As such, implementations
generating these values can assign whatever lengths to each field they
deem applicable, as long as this does not break decoding compatibility
(i.e. Unix timestamp (unixts), version (ver) and variant (var) have
to stay where they are, and clock sequence counter (seq), random
(random) or other implementation specific values must follow the
sub-second encoding).
In Figure 3 the UUIDv7 has been created with millisecond precision
with the available sub-second precision bits.
Examining Figure 3 one can observe:
* The first 36 bits have been dedicated to the Unix Timestamp
(unixts)
* All 12 bits of subsec_a are fully dedicated to millisecond
information (msec).
* The 4 Version bits remain unchanged (ver).
* All 12 bits of subsec_b have been dedicated to a monotonic clock
sequence counter (seq).
* The 2 Variant bits remain unchanged (var).
* Finally the remaining 62 bits in the subsec_seq_node section of the
layout are filled out with random data to pad the length and
provide guaranteed uniqueness (rand).
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| unixts |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|unixts | msec | ver | seq |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|var| rand |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| rand |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 3: UUIDv7 Field and Bit Layout - Encoding Example (Millisecond
Precision)
In Figure 4 the UUIDv7 has been created with Microsecond precision
with the available sub-second precision bits.
Examining Figure 4 one can observe:
* The first 36 bits have been dedicated to the Unix Timestamp
(unixts)
* All 12 bits of subsec_a are fully dedicated to providing
sub-second encoding for the Microsecond precision (usec).
* The 4 Version bits remain unchanged (ver).
* All 12 bits of subsec_b have been dedicated to providing
sub-second encoding for the Microsecond precision (usec).
* The 2 Variant bits remain unchanged (var).
* A 14 bit monotonic clock sequence counter (seq) has been embedded
in the most significant position of subsec_seq_node.
* Finally the remaining 48 bits in the subsec_seq_node section of the
layout are filled out with random data to pad the length and
provide guaranteed uniqueness (rand).
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| unixts |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|unixts | usec | ver | usec |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|var| seq | rand |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| rand |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 4: UUIDv7 Field and Bit Layout - Encoding Example (Microsecond
Precision)
In Figure 5 the UUIDv7 has been created with Nanosecond precision
with the available sub-second precision bits.
Examining Figure 5 one can observe:
* The first 36 bits have been dedicated to the Unix Timestamp
(unixts)
* All 12 bits of subsec_a are fully dedicated to providing
sub-second encoding for the Nanosecond precision (nsec).
* The 4 Version bits remain unchanged (ver).
* All 12 bits of subsec_b have been dedicated to providing
sub-second encoding for the Nanosecond precision (nsec).
* The 2 Variant bits remain unchanged (var).
* The first 14 bits of subsec_seq_node are dedicated to providing
sub-second encoding for the Nanosecond precision (nsec).
* The next 8 bits of subsec_seq_node are dedicated to a monotonic
clock sequence counter (seq).
* Finally the remaining 40 bits in the subsec_seq_node section of the
layout are filled out with random data to pad the length and
provide guaranteed uniqueness (rand).
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| unixts |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|unixts | nsec | ver | nsec |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|var| nsec | seq | rand |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| rand |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 5: UUIDv7 Field and Bit Layout - Encoding Example
(Nanosecond Precision)
4.4.4.2. UUIDv7 Decoding
When decoding or parsing a UUIDv7 value there are only two values to
be considered:
1. The unix timestamp defined as unixts
2. The sub-second precision values defined as subsec_a, subsec_b,
and subsec_seq_node
As detailed in Figure 2 the unix timestamp (unixts) is always the
first 36 bits of the UUIDv7 layout.
Similarly as per Figure 2, the sub-second precision values lie within
subsec_a, subsec_b, and subsec_seq_node which are all interpreted as
sub-second information after skipping over the version (ver) and
variant (var) bits. These concatenated sub-second information bits are
interpreted in a way where most to least significant bits represent a
further division by two. This is the same normal place notation used
to express fractional numbers, except in binary. For example, in
decimal ".1" means one tenth, and ".01" means one hundredth. In this
subsec field, a 1 means one half, 01 means one quarter, 001 is one
eighth, etc. This scheme can work for any number of bits up to the
maximum available, and keeps the most significant data leftmost in
the bit sequence.
To perform the sub-second math, simply take the first (most
significant/leftmost) N bits of subsec and divide it by 2^N. Take
for example:
1. To parse the first 16 bits, extract that value as an integer and
divide it by 65536 (2 to the 16th).
2. If these 16 bits are 0101 0101 0101 0101, then treating that as
an integer gives 0x5555 or 21845 in decimal, and dividing by
65536 gives 0.3333282.
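The sub-second arithmetic described above can be sketched in Python
as follows. The function name subsec_fraction is illustrative and
not defined by this document. It assumes the caller has already
extracted the leading N sub-second bits as an unsigned integer.

```python
def subsec_fraction(bits_value: int, n_bits: int) -> float:
    """Interpret the leading n_bits of subsec as a binary fraction.

    The value is divided by 2^n_bits, so the most significant bit
    stands for one half, the next for one quarter, and so on.
    """
    if n_bits <= 0:
        raise ValueError("n_bits must be positive")
    if not 0 <= bits_value < (1 << n_bits):
        raise ValueError("bits_value does not fit in n_bits")
    return bits_value / (1 << n_bits)


# The 16 bit pattern 0101 0101 0101 0101 from the example above.
fraction = subsec_fraction(0b0101010101010101, 16)

# A parser that only needs milliseconds can truncate the fraction.
milliseconds = int(fraction * 1000)
```

Here fraction is approximately 0.3333282 of a second and
milliseconds is 333. A parser reading more or fewer bits applies the
same division with a different N.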
This sub-second encoding scheme provides maximum interoperability
across systems where different levels of time precision are
required/feasible/available. The timestamp value derived from a
UUIDv7 value SHOULD be "as close to the correct value as possible"
when parsed, even across disparate systems.
Take for example the starting point for our next two UUIDv7 parsing
scenarios:
1. System A produces a UUIDv7 with a microsecond-precise timestamp
value.
2. System B is unaware of the precision encoded in the UUIDv7
timestamp by System A.
Scenario 1:
1. System B parses the embedded timestamp with millisecond
precision. (Less precision than the encoder)
2. System B SHOULD return the correct millisecond value encoded by
system A (truncated to milliseconds).
Scenario 2:
1. System B parses the timestamp with nanosecond precision. (More
precision than the encoder)
2. System B's value returned SHOULD have the same microsecond level
of precision provided by the encoder with the additional
precision down to nanosecond level being essentially random as
per the encoded random value at the end of the UUIDv7.
4.5. UUIDv8 Layout and Bit Order
UUIDv8 offers variable-size timestamp, clock sequence, and node
values which allow for a highly customizable UUID that fits a given
application's needs.
UUIDv8 SHOULD only be utilized if an implementation cannot utilize rand_b:
UUIDv1, UUIDv6, or UUIDv7. Some situations in which UUIDv8 usage The final 62 bits of pseudo-random data to provide uniqueness as
could occur: per Section 6.2 and Section 6.6.
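The UUIDv7 layout built from unix_ts_ms, ver, rand_a, var, and rand_b
can be sketched in Python as shown below. The function name
uuid7_sketch is hypothetical. Using os.urandom as the random source
and filling rand_a and rand_b entirely with random data (no counter)
are assumptions of this sketch, not requirements of this document.

```python
import os
import time
import uuid
from typing import Optional


def uuid7_sketch(unix_ts_ms: Optional[int] = None) -> uuid.UUID:
    """Assemble a UUIDv7-shaped value: 48 bit timestamp plus random bits."""
    if unix_ts_ms is None:
        unix_ts_ms = time.time_ns() // 1_000_000  # milliseconds since epoch
    unix_ts_ms &= (1 << 48) - 1                   # keep the lower 48 bits

    rand = int.from_bytes(os.urandom(10), "big")  # 80 random bits, 74 used
    rand_a = (rand >> 62) & 0xFFF                 # 12 bits
    rand_b = rand & ((1 << 62) - 1)               # 62 bits

    value = (
        (unix_ts_ms << 80)   # bits 0 through 47
        | (0x7 << 76)        # ver: 0111
        | (rand_a << 64)     # 12 bits after the version
        | (0b10 << 62)       # var: 10
        | rand_b             # final 62 bits
    )
    return uuid.UUID(int=value)
```

Because the timestamp occupies the most significant 48 bits, two
values created in different milliseconds compare in creation order,
for example uuid7_sketch(1) < uuid7_sketch(2).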
* An implementation would like to utilize a timestamp source not 5.3. UUID Version 8
defined by the current time-based UUIDs.
* An implementation would like to utilize a timestamp bit layout not UUID version 8 provides an RFC-compatible format for experimental or
defined by the current time-based UUIDs. vendor-specific use cases. The only requirement is that the variant
and version bits MUST be set as defined in Section 4. UUIDv8's
uniqueness will be implementation-specific and SHOULD NOT be assumed.
* An implementation would like to avoid truncating a 64 bit Unix to The only explicitly defined bits are the Version and Variant leaving
36 bits as defined by UUIDv7. 120 bits for implementation specific time-based UUIDs. To be clear:
UUIDv8 is not a replacement for UUIDv4 where all 122 extra bits are
filled with random data.
* An implementation would like a specific level of precision within Some example situations in which UUIDv8 usage could occur:
the timestamp not offered by current time-based UUIDs.
* An implementation would like to embed extra information within the * An implementation would like to embed extra information within the
UUID node other than what is defined in this document. UUID other than what is defined in this document.
* An implementation has other application/language restrictions * An implementation has other application/language restrictions
which inhibit the usage of one of the current time-based UUIDs. which inhibit the use of one of the current UUIDs.
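As a rough illustration of how little UUIDv8 fixes in place, the
following Python sketch packs three implementation-chosen values
around the version and variant bits, using the custom_a, custom_b,
and custom_c field names of the UUIDv8 layout. The function name
uuid8_sketch is hypothetical, and what goes into the three custom
fields is left entirely to the caller.

```python
import uuid


def uuid8_sketch(custom_a: int, custom_b: int, custom_c: int) -> uuid.UUID:
    """Pack 48 + 12 + 62 custom bits around the version and variant."""
    if not 0 <= custom_a < (1 << 48):
        raise ValueError("custom_a must fit in 48 bits")
    if not 0 <= custom_b < (1 << 12):
        raise ValueError("custom_b must fit in 12 bits")
    if not 0 <= custom_c < (1 << 62):
        raise ValueError("custom_c must fit in 62 bits")

    value = (
        (custom_a << 80)   # first 48 bits
        | (0x8 << 76)      # ver: 1000
        | (custom_b << 64) # 12 bits after the version
        | (0b10 << 62)     # var: 10
        | custom_c         # final 62 bits
    )
    return uuid.UUID(int=value)
```

Uniqueness of the result depends entirely on the values supplied
for the custom fields, so it cannot be assumed by a consumer.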
Roughly speaking a properly formatted UUIDv8 SHOULD contain the
following sections adding up to a total of 128 bits.
- Timestamp Bits (Variable Length)
- Clock Sequence Bits (Variable Length)
- Node Bits (Variable Length)
- UUIDv8 Version Bits (4 bits)
- UUID Variant Bits (2 Bits)
The only explicitly defined bits are the Version and Variant leaving
122 bits for implementation specific time-based UUIDs. To be clear:
UUIDv8 is not a replacement for UUIDv4 where all 122 extra bits are
filled with random data. UUIDv8's 128 bits (including the version
and variant) SHOULD contain at the minimum a timestamp of some format
in the most significant bit position followed directly by a clock
sequence counter and finally a node containing either random data or
implementation specific data.
A sample format in Figure 6 is used to further illustrate the point
for the 16-octet, 128 bit UUIDv8.
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| timestamp_32 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| timestamp_48 | ver | time_or_seq |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|var| seq_or_node | node |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| node |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 6: UUIDv8 Field and Bit Layout 0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| custom_a |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| custom_a | ver | custom_b |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|var| custom_c |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| custom_c |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
timestamp_32: Figure 4: UUIDv8 Field and Bit Layout
The most significant 32 bits of the desired timestamp source.
Occupies bits 0 through 31 (octets 0-3).
timestamp_48: custom_a:
The next 16 bits of the timestamp source when a timestamp source The first 48 bits of the layout that can be filled as an
with at least 48 bits is used. When a 32 bit timestamp source is implementation sees fit.
utilized, these bits are set to 0. Occupies bits 32 through 47
ver: ver:
The 4 bit UUIDv8 version (1000). Occupies bits 48 through 51. The 4 bit version field as defined by Section 4
time_or_seq: custom_b:
If a 60 bit, or larger, timestamp is used these 12 bits are used 12 more bits of the layout that can be filled as an implementation
to fill out the remaining timestamp. If a 32 or 48 bit timestamp sees fit.
is leveraged a 12 bit clock sequence MAY be used. Together ver
and time_or_seq occupy bits 48 through 63 (octets 6-7)
var: var:
2 bit UUID variant (10) The 2 bit variant field as defined by Section 4.
seq_or_node:
If a 60 bit, or larger, timestamp source is leveraged, these 8 bits
SHOULD be allocated for an 8 bit clock sequence counter. If a 32
or 48 bit timestamp source is used these 8 bits SHOULD be set to
random.
node:
In most implementations these bits will likely be set to pseudo-
random data. However, implementations MAY utilize the node as they
see fit. Together var, seq_or_node, and node occupy bits 64
through 127 (octets 8-15)
4.5.1. UUIDv8 Timestamp Usage
UUIDv8's usage of timestamp relaxes both the timestamp source and
timestamp length. Implementations are free to utilize any
monotonically stable timestamp source for UUIDv8.
Some examples include:
- Custom Epoch
- NTP timestamp
- ISO 8601 timestamp
- Full, Non-truncated 64 bit Unix Epoch timestamp
The relaxed nature of UUIDv8 timestamps also works to future-proof this
specification and allow implementations a method to create compliant
time-based UUIDs using timestamp sources that might not yet be
defined.
Timestamps come in many sizes and UUIDv8 defines three fields that
can easily be used for the majority of timestamp lengths:
* 32 bit timestamp: using timestamp_32 and setting timestamp_48 to
0s
* 48 bit timestamp: using timestamp_32 and timestamp_48 entirely
* 60 bit timestamp: using timestamp_32, timestamp_48, and
time_or_seq
* 64 bit timestamp: using timestamp_32, timestamp_48, and
time_or_seq and truncating the timestamp to the 60 most significant
bits.
Although it is possible to create a timestamp larger than 64 bits in
size, the usage and bit layout of that timestamp format are up to the
implementation. When a timestamp exceeds the 64th bit (octet 7),
extra care must be taken to ensure the Variant bits are properly
inserted at their respective location in the UUID. Likewise, the
Version MUST always be implemented at the appropriate location.
Any timestamp that does not entirely fill the timestamp_32,
timestamp_48 or time_or_seq MUST set all leftover bits in the least
significant position of the respective field to 0. For example a 36
bit timestamp source would fully utilize timestamp_32 and 4 bits of
timestamp_48. The remaining 12 bits in timestamp_48 MUST be set to
0.
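The zero-fill rule for partially used timestamp fields can be
sketched in Python as follows. The function name
left_align_timestamp is illustrative and not defined by this
document. It treats timestamp_32 and timestamp_48 together as one 48
bit field, which is an assumption made to keep the example short.

```python
def left_align_timestamp(timestamp: int, source_bits: int,
                         field_bits: int = 48) -> int:
    """Place a timestamp in the most significant bits of a field.

    Any leftover bits in the least significant position of the field
    are set to 0.
    """
    if not 0 < source_bits <= field_bits:
        raise ValueError("source_bits must be between 1 and field_bits")
    if not 0 <= timestamp < (1 << source_bits):
        raise ValueError("timestamp does not fit in source_bits")
    return timestamp << (field_bits - source_bits)


# A 36 bit timestamp source: 32 bits land in timestamp_32, 4 bits in
# the top of timestamp_48, and the remaining 12 bits are zero.
field = left_align_timestamp(0xFFFFFFFFF, 36)
timestamp_32 = field >> 16
timestamp_48 = field & 0xFFFF
```

In this example timestamp_32 is 0xFFFFFFFF and timestamp_48 is
0xF000, showing the 12 trailing zero bits.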
By using implementation-specific timestamp sources it is not
guaranteed that devices outside of the application context are able
to extract and parse the timestamp from UUIDv8 without some pre-
existing knowledge of the source timestamp used by the UUIDv8
implementation.
4.5.2. UUIDv8 Clock Sequence Usage
A clock sequence MUST be used with UUIDv8 to provide added sequencing
guarantees when multiple UUIDv8 values are created on the same clock
tick. The number of bits allocated to the clock sequence depends on
the precision of the timestamp source. For example, a more accurate
timestamp source using nanosecond precision will require fewer clock
sequence bits than a timestamp source utilizing seconds for
precision.
The UUIDv8 layout in Figure 6 generically defines two possible clock
sequence values that can be leveraged:
* 12 bit clock sequence using time_or_seq for use when the timestamp
is less than 48 bits which allows for 4096 UUIDs per clock tick.
* 8 bit clock sequence using seq_or_node when the timestamp uses
more than 48 bits which allows for 256 UUIDs per clock tick.
An implementation MAY use both time_or_seq and seq_or_node for clock
sequencing however it is highly unlikely that 20 bits of clock
sequence are needed for a given clock tick. Furthermore, more bits
from the node MAY be used for clock sequencing in the event that 8
bits is not sufficient.
The clock sequence MUST start at zero and increment monotonically for
each new UUIDv8 created by the application on the same timestamp.
When the timestamp increments the clock sequence MUST be reset to
zero. The clock sequence MUST NOT roll over or reset to zero unless
the timestamp has incremented. Care MUST be taken to ensure that an
adequately sized clock sequence is selected for a given application
based on expected timestamp precision and expected UUIDv8 generation
rates.
4.5.3. UUIDv8 Node Usage
The UUIDv8 Node MAY contain any set of data an implementation desires
however the node MUST NOT be set to all 0s which does not ensure
global uniqueness. In most scenarios the node will be filled with
pseudo-random data.
The UUIDv8 layout in Figure 6 defines 2 sizes of Node depending on custom_c:
the timestamp size: The final 62 bits of the layout immediately following the var field
to be filled as an implementation sees fit.
* 62 bit node encompassing seq_or_node and node Used when a 5.4. Max UUID
timestamp of 48 bits or less is leveraged.
* 54 bit node when all 60 bits of the timestamp are in use and the
seq_or_node is used as clock sequencing.
An implementation MAY choose to allocate bits from the node to the The Max UUID is a special form of UUID that is specified to have all
timestamp, clock sequence or application-specific embedded field. It 128 bits set to 1. This UUID can be thought of as the inverse of Nil
is recommended that implementation utilize a node of at least 48 bits UUID defined in [RFC4122], Section 4.1.7
to ensure global uniqueness can be guaranteed.
4.5.4. UUIDv8 Basic Creation Algorithm FFFFFFFF-FFFF-FFFF-FFFF-FFFFFFFFFFFF
The entire usage of UUIDv8 is meant to be variable and allow as much Figure 5: Max UUID Format
customization as possible to meet specific application/language
requirements. As such any UUIDv8 implementations will likely vary
among applications.
The following algorithm is a generic implementation using Figure 6 6. UUID Best Practices
and the recommendations outlined in this specification.
*32 bit timestamp, 12 bit sequence counter, 62 bit node:* The minimum requirements for generating UUIDs are described in this
document for each version. Everything else is an implementation
detail and up to the implementer to decide what is appropriate for a
given implementation. That being said, various relevant factors are
covered below to help guide an implementer through the different
trade-offs among differing UUID implementations.
1. From a system-wide shared stable store (e.g., a file) or global 6.1. Timestamp Granularity
variable, read the UUID generator state: the values of the
timestamp and clock sequence used to generate the last UUID.
2. Obtain the current time from the selected clock source as 32 UUID timestamp source, precision and length were the topic of great
bits. debate while creating this specification. As such choosing the right
timestamp for your application is a very important topic. This
section will detail some of the most common points on this topic.
3. Set the 32 bit field timestamp_32 to the 32 bits from the Reliability:
timestamp Implementations SHOULD use the current timestamp from a reliable
source to provide values that are time-ordered and continually
increasing. Care SHOULD be taken to ensure that timestamp changes
from the environment or operating system are handled in a way that
is consistent with implementation requirements. For example, if
it is possible for the system clock to move backward due to either
manual adjustment or corrections from a time synchronization
protocol, implementations must decide how to handle such cases.
(See Altering, Fuzzing, or Smearing bullet below.)
4. Set 16 bit timestamp_48 to all 0s Source:
UUID version 1 and 6 both utilize a Gregorian epoch timestamp
while UUIDv7 utilizes a Unix Epoch timestamp. If other timestamp
sources or a custom timestamp epoch are required UUIDv8 SHOULD be
leveraged.
5. Set the version to 8 (1000) Sub-second Precision and Accuracy:
Many levels of precision exist for timestamps: milliseconds,
microseconds, nanoseconds, and beyond. Additionally fractional
representations of sub-second precision may be desired to mix
various levels of precision in a time-ordered manner.
Furthermore, system clocks themselves have an underlying
granularity and it is frequently less than the precision offered
by the operating system. With UUID version 1 and 6,
100-nanoseconds of precision are present while UUIDv7 features
fixed millisecond level of precision within the Unix epoch that
does not exceed the granularity capable in most modern systems.
For other levels of precision UUIDv8 SHOULD be utilized.
6. If the state was unavailable (e.g., non-existent or corrupted) Length:
or the timestamp is greater than the current timestamp; set the The length of a given timestamp directly impacts how long a given
12 bit clock sequence value (time_or_seq) to 0 UUID will be valid. That is, how many timestamp ticks can be
contained in a UUID before the maximum value for the timestamp
field is reached. Care should be given to ensure that the proper
length is selected for a given timestamp. UUID version 1 and 6
utilize a 60 bit timestamp and UUIDv7 features a 48 bit timestamp.
7. If the state was available, but the saved timestamp is less than Altering, Fuzzing, or Smearing:
or equal to the current timestamp, increment the clock sequence Implementations MAY alter the actual timestamp. Some examples
value (time_or_seq). include security considerations around providing a real clock
value within a UUID, to correct inaccurate clocks or to handle
leap seconds. This specification makes no requirement or
guarantee about how close the clock value needs to be to actual
time.
8. Set the variant to binary 10 Padding:
When timestamp padding is required, implementations MUST pad the
most significant (left-most) bits with zeros. An example is
padding the most significant, left-most bits of a 32 bit Unix
timestamp with zeros to fill out the 48 bit timestamp in UUIDv7.
9. Generate 62 random bits and fill in 8 bits for seq_or_node and Truncating:
54 bits for the node. Similarly, when timestamps need to be truncated: the lower, least
significant bits MUST be used. An example would be truncating a
64 bit Unix timestamp to the least significant, right-most 48 bits
for UUIDv7.
10. Format by concatenating the 128 bits as: timestamp_32|timestamp_ 6.2. Monotonicity and Counters
48|version|time_or_seq|variant|seq_or_node|node
11. Save the state (current timestamp and clock sequence) back to Monotonicity is the backbone of time-based sortable UUIDs. Naturally
the stable store time-based UUIDs from this document will be monotonic due to an
embedded timestamp however implementations can guarantee additional
monotonicity via the concepts covered in this section.
*48 bit timestamp, 12 bit sequence counter, 62 bit node:* Additionally, care MUST be taken to ensure UUIDs generated in batches
are also monotonic. That is, if one-thousand UUIDs are generated for
the same timestamp; there is sufficient logic for organizing the
creation order of those one-thousand UUIDs. For batch UUID creation
implementations MAY utilize a monotonic counter which SHOULD increment
for each UUID created during a given timestamp.
1. From a system-wide shared stable store (e.g., a file) or global For single-node UUID implementations that do not need to create
variable, read the UUID generator state: the values of the batches of UUIDs, the embedded timestamp within UUID version 1, 6,
timestamp and clock sequence used to generate the last UUID. and 7 can provide sufficient monotonicity guarantees by simply
ensuring that the timestamp increments before creating a new UUID. For
the topic of Distributed Nodes please refer to Section 6.3.
Implementations SHOULD choose one method for single-node UUID
implementations that require batch UUID creation.
2. Obtain the current time from the selected clock source as 32 Fixed-Length Dedicated Counter Bits (Method 1):
bits. This references the practice of allocating a specific number of
bits in the UUID layout to the sole purpose of tallying the total
number of UUIDs created during a given UUID timestamp tick.
Positioning of a fixed bit-length counter SHOULD be immediately
after the embedded timestamp. This promotes sortability and
allows random data generation for each counter increment. With
this method, the rand_a section of UUIDv7 MAY be utilized as fixed-
length dedicated counter bits. In the event more counter bits are
required, the most significant, left-most bits of rand_b MAY be
leveraged as additional counter bits.
3. Set the 32 bit field timestamp_32 to the 32 most significant bits Monotonic Random (Method 2):
from the timestamp With this method the random data is extended to also double as a
counter. This monotonic random can be thought of as a "randomly
seeded counter" which MUST be incremented in the least significant
position for each UUID created on a given timestamp tick.
UUIDv7's rand_b section SHOULD be utilized with this method to
handle batch UUID generation during a single timestamp tick.
4. Set 16 bit timestamp_48 to the 16 least significant bits from the The following sub-topics cover methods behind incrementing either
timestamp type of counter method:
5. The rest of the steps are the same as the previous example. Plus One Increment (Type A):
With this increment logic the counter is incremented by one
for every UUID generation. When this increment method is utilized
with a Fixed-Length Dedicated Counter, the trailing random data generated
for each new UUID can help produce unguessable UUIDs. When this
increment method is utilized with Monotonic Random counters, the
resulting values are easily guessable. Implementations that favor
unguessability SHOULD NOT utilize this method with the Monotonic
Random method.
*60 bit timestamp, 8 bit sequence counter, 54 bit node:* Random Increment (Type B):
With this increment the actual increment of the counter MAY be a
random integer of any desired length larger than zero. When this
increment method is utilized with Fixed-Length Dedicated Counters,
the random increments MAY deplete the counter bit space (including
any rollover guards) faster than desired if a counter of
adequate length is not selected. When this increment method is
utilized with Monotonic Random counters, the counter ensures the
UUIDs retain the required level of unguessability characteristics
provided by the underlying entropy.
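As an illustration of Monotonic Random (Method 2), the sketch below treats the 62 bit rand_b section of a UUIDv7 as a big-endian integer and adds an increment to it. It assumes rand_b occupies bytes 8 through 15 below the two variant bits; the name rand_b_increment is hypothetical. An increment of one corresponds to Type A, and a random non-zero increment to Type B.

```c
#include <stdbool.h>
#include <stdint.h>

/* Adds `inc` to the 62 bit rand_b field of a UUIDv7, treated as a
 * big-endian integer. Returns false and leaves the UUID untouched
 * when inc is zero or when the addition would roll the field over. */
bool rand_b_increment(uint8_t uuid[16], uint32_t inc) {
    const uint64_t max62 = (UINT64_C(1) << 62) - 1;
    uint64_t rand_b = 0;

    for (int i = 8; i < 16; i++) {
        rand_b = (rand_b << 8) | uuid[i];
    }
    rand_b &= max62;                 /* drop the two variant bits */

    if (inc == 0 || rand_b > max62 - inc) {
        return false;                /* zero step or counter rollover */
    }
    rand_b += inc;
    rand_b |= UINT64_C(2) << 62;     /* restore variant bits 1, 0 */

    for (int i = 15; i >= 8; i--) {
        uuid[i] = (uint8_t)rand_b;
        rand_b >>= 8;
    }
    return true;
}
```

A caller that receives false can, for example, wait for the next timestamp tick before generating another UUID.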
1. From a system-wide shared stable store (e.g., a file) or global The following sub-topics cover topics related solely to creating
variable, read the UUID generator state: the values of the reliable fixed-length dedicated counters:
timestamp and clock sequence used to generate the last UUID.
2. Obtain the current time from the selected clock source as 32 Fixed-Length Dedicated Counter Seeding:
bits. Implementations utilizing fixed-length counter method SHOULD
randomly initialize the counter with each new timestamp tick.
However, when the timestamp has not incremented, the counter
SHOULD be frozen and incremented via the desired increment logic.
When utilizing a randomly seeded counter alongside Method 1, the
random data MAY be regenerated with each counter increment without
impacting sortability. The downside is that Method 1 is prone to
overflows if a counter of adequate length is not selected or the
random data generated leaves little room for the required number
of increments. Implementations utilizing the fixed-length counter
method MAY also choose to randomly initialize a portion of the counter
rather than the entire counter. For example, a 24 bit counter
could have the 23 bits in the least significant, right-most position
randomly initialized. The remaining most significant, left-most
counter bit is initialized as zero for the sole purpose of
guarding against counter rollovers.
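The 24 bit example above can be sketched as follows. The macro and function names are hypothetical, and the random bits are assumed to be supplied by the caller, ideally from a CSPRNG.

```c
#include <stdint.h>

#define COUNTER_BITS 24   /* total counter length from the example */
#define GUARD_BITS    1   /* left-most bits initialized as zero */

/* Seeds the counter: the 23 least significant bits are taken from
 * random_bits and the most significant guard bit starts as zero. */
uint32_t seed_counter(uint32_t random_bits) {
    uint32_t seed_mask = (UINT32_C(1) << (COUNTER_BITS - GUARD_BITS)) - 1;
    return random_bits & seed_mask;
}

/* Number of plus-one increments still possible before rollover. */
uint32_t increments_before_rollover(uint32_t counter) {
    uint32_t counter_max = (UINT32_C(1) << COUNTER_BITS) - 1;
    return counter_max - counter;
}
```

With one guard bit, even the largest possible seed (0x7FFFFF) leaves 2^23 increments available within a single timestamp tick.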
3. Set the 32 bit field timestamp_32 to the 32 bits from the Fixed-Length Dedicated Counter Length:
timestamp Care MUST be taken to select a counter bit-length that can
properly handle the level of timestamp precision in use. For
example, millisecond precision SHOULD require a larger counter
than a timestamp with nanosecond precision. General guidance is
that the counter SHOULD be at least 12 bits but no longer than 42
bits. Care SHOULD also be given to ensure that the counter length
selected leaves room for sufficient entropy in the random portion
of the UUID after the counter. This entropy helps improve the
unguessability characteristics of UUIDs created within the batch.
4. Set 16 bit timestamp_48 to the 16 middle bits from the timestamp The following sub-topics cover rollover handling with either type of
counter method:
5. Set the version to 8 (1000) Counter Rollover Guards:
The technique from Fixed-Length Dedicated Counter Seeding which
describes allocating a segment of the fixed-length counter as a
rollover guard is also recommended and SHOULD be employed to help
mitigate counter rollover issues. This same technique can be
leveraged with Monotonic Random counter methods by ensuring the
total length of a possible increment in the least significant,
right-most position is less than the total length of the random
data being incremented. As such, the most significant, left-most bits
act as a rollover guard.
6. Set 12 bit time_or_seq to the 12 least significant bits from the Counter Rollover Handling:
timestamp Counter rollovers SHOULD be handled by the application to avoid
sorting issues. The general guidance is that applications that
care about absolute monotonicity and sortability SHOULD freeze the
counter and wait for the timestamp to advance which ensures
monotonicity is not broken.
7. Set the variant to 10 Implementations MAY use the following logic to ensure UUIDs featuring
embedded counters are monotonic in nature:
8. If the state was unavailable (e.g., non-existent or corrupted) 1. Compare the current timestamp against the previously stored
or the timestamp is greater than the current timestamp; set the timestamp.
12 bit clock sequence value (seq_or_node) to 0 2. If the current timestamp is equal to the previous timestamp;
increment the counter according to the desired method and type.
3. If the current timestamp is greater than the previous timestamp;
re-initialize the desired counter method to the new timestamp and
generate new random bytes (if the bytes were frozen or being used
as the seed for a monotonic counter).
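The three steps above can be sketched as follows, assuming a 12 bit fixed-length counter (for example, the rand_a section of UUIDv7) with a plus-one increment and a one bit rollover guard. The structure and function names are hypothetical. Treating a clock that moved backwards like an unchanged timestamp is an assumption of this sketch, not a requirement of this document.

```c
#include <stdbool.h>
#include <stdint.h>

#define COUNTER_MAX 0x0FFFu   /* largest value of a 12 bit counter */

struct gen_state {
    uint64_t last_ts_ms;      /* previously stored timestamp */
    uint16_t counter;         /* counter for that timestamp tick */
};

/* Advances the state for one new UUID. now_ms is the current Unix
 * timestamp in milliseconds and seed is fresh random data. Returns
 * false when the counter would roll over, in which case the caller
 * can wait for the timestamp to advance and try again. */
bool next_tick(struct gen_state *st, uint64_t now_ms, uint16_t seed) {
    if (now_ms > st->last_ts_ms) {
        /* Step 3: new tick, re-seed with the guard bit left as zero */
        st->last_ts_ms = now_ms;
        st->counter = (uint16_t)(seed & (COUNTER_MAX >> 1));
        return true;
    }
    /* Step 2: same tick (or clock rollback, see assumption above) */
    if (st->counter == COUNTER_MAX) {
        return false;         /* freeze: do not wrap around */
    }
    st->counter++;
    return true;
}
```

The timestamp and counter held in the state are then written into the UUID layout, followed by any remaining random data.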
9. If the state was available, but the saved timestamp is less than Implementations SHOULD check if the currently generated UUID is
or equal to the current timestamp, increment the clock sequence greater than the previously generated UUID. If this is not the case,
value (seq_or_node). then any number of things could have occurred, such as, but not
limited to, clock rollbacks, leap second handling or counter
rollovers. Applications SHOULD embed sufficient logic to catch these
scenarios and correct the problem ensuring the next UUID generated is
greater than the previous.
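Because the formats in this document are laid out in network byte order, the check described above can be a plain byte-wise comparison. A minimal sketch follows; the function name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Returns true when `current` sorts strictly after `previous`
 * when both are compared as 16 opaque bytes. */
bool uuid_is_greater(const uint8_t current[16], const uint8_t previous[16]) {
    return memcmp(current, previous, 16) > 0;
}
```

A generator could call this before returning each UUID and fall back to its correction logic (for example, waiting for the next timestamp tick) when the result is false.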
10. Generate 54 random bits and fill in the node 6.3. Distributed UUID Generation
11. Format by concatenating the 128 bits as: timestamp_32|timestamp_ Some implementations MAY desire to utilize multi-node, clustered,
48|version|time_or_seq|variant|seq_or_node|node applications which involve two or more nodes independently generating
UUIDs that will be stored in a common location. While UUIDs already
feature sufficient entropy to ensure that the chances of collision
are low, as the total number of nodes increases, so does the likelihood
of a collision. This section details the approaches that MAY be
utilized by multi-node UUID implementations in distributed
environments.
12. Save the state (current timestamp and clock sequence) back to Centralized Registry:
the stable store With this method all nodes tasked with creating UUIDs consult a
central registry and confirm the generated value is unique. As
applications scale the communication with the central registry
could become a bottleneck and impact UUID generation in a negative
way. Utilization of shared knowledge schemes with central/global
registries is outside the scope of this specification.
*64 bit timestamp, 8 bit sequence counter, 54 bit node:* Node IDs:
With this method, a pseudo-random Node ID value is placed within
the UUID layout. This identifier helps ensure the bit-space for a
given node is unique, resulting in UUIDs that do not conflict with
any other UUID created by another node with a different node id.
Implementations that choose to leverage an embedded node id SHOULD
utilize UUIDv8. The node id SHOULD NOT be an IEEE 802 MAC address
as per Section 8. The location and bit length are left to
implementations and are outside the scope of this specification.
Furthermore, the creation and negotiation of unique node ids among
nodes is also out of scope for this specification.
1. The same steps as the 60 bit timestamp can be utilized if the 64 Utilization of either a Centralized Registry or Node ID is not
bit timestamp is truncated to 60 bits. required for implementing UUIDs in this specification. However,
implementations SHOULD utilize one of the two aforementioned methods
if distributed UUID generation is a requirement.
2. Implementations MAY choose to truncate the most or least 6.4. Collision Resistance
significant bits but it is recommended to utilize the most
significant 60 bits and lose 4 bits of precision in the
nanoseconds or microseconds position.
*General algorithm for generation of UUIDv8 not defined here:* Implementations SHOULD weigh the consequences of UUID collisions
within their application when deciding between UUID versions that
use entropy (random) versus the other components such as those in Section 6.1
and Section 6.2. This is especially true for distributed node
collision resistance as defined by Section 6.3.
1. From a system-wide shared stable store (e.g., a file) or global There are two example scenarios below which help illustrate the
variable, read the UUID generator state: the values of the varying seriousness of a collision within an application.
timestamp and clock sequence used to generate the last UUID.
2. Obtain the current time from the selected clock source as desired Low Impact:
bit total A UUID collision generated a duplicate log entry which results in
incorrect statistics derived from the data. Implementations that
are not negatively affected by collisions may continue with the
entropy and uniqueness provided by the traditional UUID format.
3. Set total amount of bits for timestamp as required in the most High Impact:
significant positions of the 128 bit UUID A duplicate key causes an airplane to receive the wrong course
which puts people's lives at risk. In this scenario there is no
margin for error. Collisions MUST be avoided and failure is
unacceptable. Applications dealing with this type of scenario
MUST employ as much collision resistance as possible within the
given application context.
4. Care MUST be taken to ensure that the UUID Version and UUID 6.5. Global and Local Uniqueness
Variant are in the correct bit positions.
UUID Version: Bits 48 through 51 UUIDs created by this specification MAY be used to provide local
uniqueness guarantees. For example, ensuring UUIDs created within a
local application context are unique within a database MAY be
sufficient for some implementations where global uniqueness outside
of the application context, in other applications, or around the
world is not required.
UUID Variant: Bits 64 and 65 Although true global uniqueness is impossible to guarantee without a
shared knowledge scheme, a shared knowledge scheme is not required by
UUID to provide uniqueness guarantees. Implementations MAY implement
a shared knowledge scheme introduced in Section 6.3 as they see fit
to extend the uniqueness guaranteed by this specification and [RFC4122].
5. If the state was unavailable (e.g., non-existent or corrupted) or 6.6. Unguessability
the timestamp is greater than the current timestamp; set the
desired clock sequence value to 0
6. If the state was available, but the saved timestamp is less than Implementations SHOULD utilize a cryptographically secure pseudo-
or equal to the current timestamp, increment the clock sequence random number generator (CSPRNG) to provide values that are both
value. difficult to predict ("unguessable") and have a low likelihood of
collision ("unique"). A CSPRNG ensures that the best of the properties
discussed in Section 6.4 and Section 8 are present in modern UUIDs.
7. Set the remaining bits to the node as pseudo-random data Advice on generating cryptographic-quality random numbers can be
found in [RFC4086].
8. Format by concatenating the 128 bits together 6.7. Sorting
9. Save the state (current timestamp and clock sequence) back to the UUIDv6 and UUIDv7 are designed so that implementations that require
stable store sorting (e.g. database indexes) SHOULD sort as opaque raw bytes,
without need for parsing or introspection.
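Sorting as opaque raw bytes can be sketched with the standard C library as follows. The names uuid_cmp and uuid_sort are hypothetical, and UUIDs are assumed to be held as 16 byte arrays in network byte order.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* qsort comparator: compares two UUIDs as 16 unsigned bytes,
 * without parsing any field. */
static int uuid_cmp(const void *a, const void *b) {
    return memcmp(a, b, 16);
}

/* Sorts an array of `count` UUIDs in ascending byte order. */
void uuid_sort(uint8_t (*uuids)[16], size_t count) {
    qsort(uuids, count, sizeof uuids[0], uuid_cmp);
}
```

For UUIDv6 and UUIDv7 this byte order matches the order of the embedded timestamps, since the timestamp occupies the most significant bits.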
5. Encoding and Storage Time ordered monotonic UUIDs benefit from greater database index
locality because the new values are near each other in the index. As
a result objects are more easily clustered together for better
performance. The real-world differences in this approach of index
locality vs random data inserts can be quite large.
The existing UUID hex and dash format of 8-4-4-4-12 is retained for UUID formats created by this specification SHOULD be
both backwards compatibility and human readability. lexicographically sortable while in the textual representation.
For many applications such as databases this format is unnecessarily UUIDs created by this specification are crafted with big-endian byte
verbose totaling 288 bits. order (network byte order) in mind. If little-endian style is
required a custom UUID format SHOULD be created using UUIDv8.
* 8 bits for each of the 32 hex characters = 256 bits 6.8. Opacity
* 8 bits for each of the 4 hyphens = 32 bits
Where possible UUIDs SHOULD be stored within database applications as UUIDs SHOULD be treated as opaque values and implementations SHOULD
the underlying 128 bit binary value. NOT examine the bits in a UUID to whatever extent is possible.
However, where necessary, inspectors should refer to Section 4 for
more information on determining UUID version and variant.
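Where inspection is necessary, the version and variant can be read with two byte operations. This sketch assumes the version occupies bits 48 through 51 and the variant bits 64 and 65, counting from the most significant bit; the function names are hypothetical.

```c
#include <stdint.h>

/* Version: the four most significant bits of byte 6. */
unsigned uuid_version(const uint8_t uuid[16]) {
    return uuid[6] >> 4;
}

/* Returns 1 when the two most significant bits of byte 8 are 1, 0,
 * which is the variant used by the versions in this document. */
int uuid_is_variant_10xx(const uint8_t uuid[16]) {
    return (uuid[8] & 0xC0) == 0x80;
}
```

Code that only stores, compares, or sorts UUIDs does not need either function.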
6. Global Uniqueness 6.9. DBMS and Database Considerations
UUIDs created by this specification offer the same guarantees for For many applications, such as databases, storing UUIDs as text is
global uniqueness as those found in [RFC4122]. Furthermore, the unnecessarily verbose, requiring 288 bits to represent 128 bit UUID
time-based UUIDs defined in this specification are geared towards values. Thus, where feasible, UUIDs SHOULD be stored within database
database applications but MAY be used for a wide variety of use- applications as the underlying 128 bit binary value.
cases. Just as global uniqueness is guaranteed, UUIDs are guaranteed
to be unique within an application context within the enterprise
domain.
7. Distributed UUID Generation For other systems, UUIDs MAY be stored in binary form or as text, as
appropriate. The trade-offs of both approaches are as follows:
Some implementations might desire to utilize multi-node, clustered, * Storing as binary requires less space and may result in faster
applications which involve 2 or more applications independently data access.
generating UUIDs that will be stored in a common location. UUIDs * Storing as text requires more space but may require less
already feature sufficient entropy to ensure that the chances of translation if the resulting text form is to be used after
collision are low. However, implementations MAY dedicate a portion retrieval and thus may be simpler to implement.
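To make the size difference concrete, the sketch below converts the 16 byte (128 bit) binary value into the 36 character (288 bit) 8-4-4-4-12 text form. The function name is hypothetical and lowercase hexadecimal output is an assumption of this sketch.

```c
#include <stdint.h>

/* Writes the 8-4-4-4-12 text form of `uuid` into `out`:
 * 32 hex characters, 4 hyphens, and a terminating NUL. */
void uuid_to_text(const uint8_t uuid[16], char out[37]) {
    static const char hex[] = "0123456789abcdef";
    char *p = out;

    for (int i = 0; i < 16; i++) {
        if (i == 4 || i == 6 || i == 8 || i == 10) {
            *p++ = '-';              /* hyphen before bytes 4, 6, 8, 10 */
        }
        *p++ = hex[uuid[i] >> 4];
        *p++ = hex[uuid[i] & 0x0F];
    }
    *p = '\0';
}
```

A system that stores the binary form only needs this conversion at the point where a UUID is displayed or exchanged as text.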
of the node's most significant random bits to a pseudo-random
machineID which helps identify UUIDs created by a given node. This
works to add an extra layer of collision avoidance.
This machine ID MUST be placed in the UUID after the timestamp and DBMS vendors are encouraged to provide functionality to generate and
sequence counter bits. This position is selected to ensure that the store UUID formats defined by this specification for use as
sorting by timestamp and clock sequence is still possible. The identifiers or left parts of identifiers such as, but not limited to,
machineID MUST NOT be an IEEE 802 MAC address. The creation and primary keys, surrogate keys for temporal databases, foreign keys
negotiation of the machineID among distributed nodes is out of scope included in polymorphic relationships, and keys for key-value pairs
for this specification. in JSON columns and key-value databases. Applications using a
monolithic database may find using database-generated UUIDs (as
opposed to client-generated UUIDs) provides the best UUID
monotonicity. In addition to UUIDs, additional identifiers MAY be
used to ensure integrity and feedback.
8. IANA Considerations 7. IANA Considerations
This document has no IANA actions. This document has no IANA actions.
9. Security Considerations 8. Security Considerations
MAC addresses pose inherent security risks and MUST NOT be used for MAC addresses pose inherent security risks and SHOULD NOT be used
node generation. As such they have been strictly forbidden from within a UUID. Instead, CSPRNG data SHOULD be selected from a source
time-based UUIDs within this specification. Instead pseudo-random with sufficient entropy to ensure guaranteed uniqueness among UUID
bits SHOULD be selected from a source with sufficient entropy to ensure generation. See Section 6.6 for more information.
guaranteed uniqueness among UUID generation.
Timestamps embedded in the UUID do pose a very small attack surface. Timestamps embedded in the UUID do pose a very small attack surface.
The timestamp in conjunction with the clock sequence does signal the The timestamp in conjunction with an embedded counter does signal the
order of creation for a given UUID and its corresponding data but order of creation for a given UUID and its corresponding data but
does not define anything about the data itself or the application as does not define anything about the data itself or the application as
a whole. If UUIDs are required for use with any security operation a whole. If UUIDs are required for use with any security operation
within an application context in any shape or form then [RFC4122] within an application context in any shape or form then [RFC4122]
UUIDv4 SHOULD be utilized. UUIDv4 SHOULD be utilized.
The machineID portion of node, described in Section 7, does provide 9. Acknowledgements
small unique identifier which could be used to determine which
application is generating data but this machineID alone is not enough
to identify a node on the network without other corresponding data
points. Furthermore the machineID, like the timestamp+sequence, does
not provide any context about the data the corresponds to the UUID or
the current state of the application as a whole.
10. Acknowledgements
The authors gratefully acknowledge the contributions of Ben Campbell, The authors gratefully acknowledge the contributions of Ben Campbell,
Ben Ramsey, Fabio Lima, Gonzalo Salgueiro, Martin Thomson, Murray S. Ben Ramsey, Fabio Lima, Gonzalo Salgueiro, Martin Thomson, Murray S.
Kucherawy, Rick van Rein, Rob Wilton, Sean Leonard, Theodore Y. Kucherawy, Rick van Rein, Rob Wilton, Sean Leonard, Theodore Y.
Ts'o, as well as all of those in and outside the IETF community Ts'o, Robert Kieffer, sergeyprokhorenko, and LiosK, as well as all of
who contributed to the discussions which resulted in this document. those in the IETF community and on GitHub who contributed to the
discussions which resulted in this document.
11. Normative References 10. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, Requirement Levels", BCP 14, RFC 2119,
DOI 10.17487/RFC2119, March 1997, DOI 10.17487/RFC2119, March 1997,
<https://www.rfc-editor.org/info/rfc2119>. <https://www.rfc-editor.org/info/rfc2119>.
[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
May 2017, <https://www.rfc-editor.org/info/rfc8174>.
[RFC4122] Leach, P., Mealling, M., and R. Salz, "A Universally [RFC4122] Leach, P., Mealling, M., and R. Salz, "A Universally
Unique IDentifier (UUID) URN Namespace", RFC 4122, Unique IDentifier (UUID) URN Namespace", RFC 4122,
DOI 10.17487/RFC4122, July 2005, DOI 10.17487/RFC4122, July 2005,
<https://www.rfc-editor.org/info/rfc4122>. <https://www.rfc-editor.org/info/rfc4122>.
12. Informative References [RFC4086] Eastlake 3rd, D., Schiller, J., and S. Crocker,
"Randomness Requirements for Security", RFC 4086,
DOI 10.17487/RFC4086, June 2005,
<https://www.rfc-editor.org/info/rfc4086>.
11. Informative References
[LexicalUUID] [LexicalUUID]
Twitter, "A Scala client for Cassandra", commit f6da4e0, Twitter, "A Scala client for Cassandra", commit f6da4e0,
November 2012, November 2012,
<https://github.com/twitter-archive/cassie>. <https://github.com/twitter-archive/cassie>.
[Snowflake] [Snowflake]
Twitter, "Snowflake is a network service for generating Twitter, "Snowflake is a network service for generating
unique ID numbers at high scale with some simple unique ID numbers at high scale with some simple
guarantees.", Commit b3f6a3c, May 2014, guarantees.", Commit b3f6a3c, May 2014,
skipping to change at page 28, line 9 skipping to change at page 22, line 13
Commit efa678f, October 2020, <https://github.com/rs/xid>. Commit efa678f, October 2020, <https://github.com/rs/xid>.
[ObjectID] MongoDB, "ObjectId - MongoDB Manual", [ObjectID] MongoDB, "ObjectId - MongoDB Manual",
<https://docs.mongodb.com/manual/reference/method/ <https://docs.mongodb.com/manual/reference/method/
ObjectId/>. ObjectId/>.
[CUID] Elliott, E., "Collision-resistant ids optimized for [CUID] Elliott, E., "Collision-resistant ids optimized for
horizontal scaling and performance.", Commit 215b27b, horizontal scaling and performance.", Commit 215b27b,
October 2020, <https://github.com/ericelliott/cuid>. October 2020, <https://github.com/ericelliott/cuid>.
[IEEE754] IEEE, "IEEE Standard for Floating-Point Arithmetic.",
Series 754-2019, July 2019,
<https://standards.ieee.org/ieee/754/6210/>.
Appendix A. Example Code
A.1. Creating a UUIDv6 Value
This section details a function in C which converts from a UUID
version 1 to version 6:
#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>
#include <arpa/inet.h>
#include <uuid/uuid.h>

/* Converts UUID version 1 to version 6 in place. */
void uuidv1tov6(uuid_t u) {
    uint64_t ut;
    unsigned char *up = (unsigned char *)u;

    // load ut with the first 64 bits of the UUID
    ut = ((uint64_t)ntohl(*((uint32_t*)up))) << 32;
    ut |= ((uint64_t)ntohl(*((uint32_t*)&up[4])));

    // dance the bit-shift...
    ut =
        ((ut >> 32) & 0x0FFF) |             // 12 least significant bits
        (0x6000) |                          // version number
        ((ut >> 28) & 0x0000000FFFFF0000) | // next 20 bits
        ((ut << 20) & 0x000FFFF000000000) | // next 16 bits
        (ut << 52);                         // 12 most significant bits

    // store back in UUID
    *((uint32_t*)up) = htonl((uint32_t)(ut >> 32));
    *((uint32_t*)&up[4]) = htonl((uint32_t)(ut));
}
Figure 6: UUIDv6 Function in C
A.2. Creating a UUIDv7 Value
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

// ...

// csprng data source
FILE *rndf;
rndf = fopen("/dev/urandom", "rb");
if (rndf == NULL) {
    printf("fopen /dev/urandom error\n");
    return 1;
}

// ...

// generate one UUIDv7
uint8_t u[16];
struct timespec ts;
int ret;

ret = clock_gettime(CLOCK_REALTIME, &ts);
if (ret != 0) {
    printf("clock_gettime error: %d\n", ret);
    return 1;
}

uint64_t tms;
tms = ((uint64_t)ts.tv_sec) * 1000;
tms += ((uint64_t)ts.tv_nsec) / 1000000;

memset(u, 0, 16);

// fill everything after the timestamp with random bytes
if (fread(&u[6], 10, 1, rndf) != 1) {
    printf("fread /dev/urandom error\n");
    return 1;
}

// store the 48 bit timestamp in bytes 0-5 in network byte order
// (byte-wise, since htonll is not available on every platform)
for (int i = 0; i < 6; i++) {
    u[i] = (uint8_t)(tms >> (40 - 8 * i));
}

u[8] = 0x80 | (u[8] & 0x3F); // set variant field, top two bits are 1, 0
u[6] = 0x70 | (u[6] & 0x0F); // set version field, top four bits are 0, 1, 1, 1
Figure 7: UUIDv7 Function in C
A.3. Creating a UUIDv8 Value
UUIDv8 will vary greatly from implementation to implementation. A
good candidate use case for UUIDv8 is to embed exotic timestamps like
the one found in this example which employs approximately 0.25
milliseconds and approximately 5 microseconds per timestamp tick as a
48 bit value.
#include <stdint.h>
#include <stdio.h>
#include <time.h>

int main(void) {
    struct timespec tp;
    clock_gettime(CLOCK_REALTIME, &tp);

    // whole seconds occupy the bits above the 12 bit fraction
    uint64_t timestamp = (uint64_t)tp.tv_sec << 12;

    // compute 12 bit (~0.25 msec precision) fraction from nsecs
    timestamp |= ((uint64_t)tp.tv_nsec << 12) / 1000000000;

    // cast for %llx, since uint64_t is not always unsigned long long
    printf("%08llx-%04llx\n",
           (unsigned long long)(timestamp >> 16),
           (unsigned long long)(timestamp & 0xFFFF));
    return 0;
}
Figure 8: UUIDv8 Function in C
Appendix B. Test Vectors
Both UUIDv1 and UUIDv6 test vectors utilize the same 60 bit
timestamp: 0x1EC9414C232AB00 (138648505420000000), which is Tuesday, February
22, 2022 2:22:22.000000 PM GMT-05:00.
Both UUIDv1 and UUIDv6 utilize the same values in clk_seq_hi_res,
clock_seq_low, and node, all of which have been generated with
random data.
# Unix Nanosecond precision to Gregorian 100-nanosecond intervals
gregorian_100_ns = (Unix_64_bit_nanoseconds / 100) + gregorian_Unix_offset
# Gregorian to Unix Offset:
# The number of 100-ns intervals between the
# UUID epoch 1582-10-15 00:00:00 and the Unix epoch 1970-01-01 00:00:00.
# gregorian_Unix_offset = 0x01b21dd213814000 or 122192928000000000
# Unix 64 bit Nanosecond Timestamp:
# Unix NS: Tuesday, February 22, 2022 2:22:22 PM GMT-05:00
# Unix_64_bit_nanoseconds = 0x16D6320C3D4DCC00 or 1645557742000000000
# Work:
# gregorian_100_ns = (1645557742000000000 / 100) + 122192928000000000
# (138648505420000000 - 122192928000000000) * 100 = Unix_64_bit_nanoseconds
# Final:
# gregorian_100_ns = 0x1EC9414C232AB00 or 138648505420000000
# Original: 000111101100100101000001010011000010001100101010101100000000
# UUIDv1: 11000010001100101010101100000000|1001010000010100|0001|000111101100
# UUIDv6: 00011110110010010100000101001100|0010001100101010|0110|101100000000
Figure 9: Test Vector Timestamp Pseudo-code
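The conversion in the pseudo-code above can be sketched in C as follows. The macro and function names are hypothetical; the offset constant is the one given in Figure 9.

```c
#include <stdint.h>

/* Number of 100-ns intervals between the UUID epoch
 * (1582-10-15 00:00:00) and the Unix epoch (1970-01-01 00:00:00). */
#define GREGORIAN_UNIX_OFFSET UINT64_C(0x01B21DD213814000)

/* Converts a 64 bit Unix timestamp in nanoseconds into a count of
 * Gregorian 100-nanosecond intervals. */
uint64_t unix_ns_to_gregorian_100ns(uint64_t unix_ns) {
    return unix_ns / 100 + GREGORIAN_UNIX_OFFSET;
}
```

Passing the Unix nanosecond value 1645557742000000000 from the pseudo-code yields 138648505420000000, the 60 bit timestamp used by the UUIDv1 and UUIDv6 test vectors.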
B.1. Example of a UUIDv6 Value
----------------------------------------------
field                 bits    value_hex
----------------------------------------------
time_low              32      0xC232AB00
time_mid              16      0x9414
time_hi_and_version   16      0x11EC
clk_seq_hi_res         8      0xB3
clock_seq_low          8      0xC8
node                  48      0x9E6BDECED846
----------------------------------------------
total                 128
----------------------------------------------
final_hex: C232AB00-9414-11EC-B3C8-9E6BDECED846
Figure 10: UUIDv1 Example Test Vector
-----------------------------------------------
field                  bits    value_hex
-----------------------------------------------
time_high              32      0x1EC9414C
time_mid               16      0x232A
time_low_and_version   16      0x6B00
clk_seq_hi_res          8      0xB3
clock_seq_low           8      0xC8
node                   48      0x9E6BDECED846
-----------------------------------------------
total                  128
-----------------------------------------------
final_hex: 1EC9414C-232A-6B00-B3C8-9E6BDECED846
Figure 11: UUIDv6 Example Test Vector
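The two field layouts above differ only in how the same 60 bit timestamp is split. The sketch below builds the first 64 bits of each layout from a 60 bit timestamp; the function names are hypothetical.

```c
#include <stdint.h>

/* First 64 bits of a UUIDv1:
 * time_low (32) | time_mid (16) | version (4) | time_hi (12) */
uint64_t v1_time_fields(uint64_t ts60) {
    return ((ts60 & UINT64_C(0xFFFFFFFF)) << 32)  /* 32 least significant bits */
         | (((ts60 >> 32) & 0xFFFF) << 16)        /* middle 16 bits */
         | UINT64_C(0x1000)                       /* version 1 */
         | ((ts60 >> 48) & 0x0FFF);               /* 12 most significant bits */
}

/* First 64 bits of a UUIDv6:
 * time_high (32) | time_mid (16) | version (4) | time_low (12) */
uint64_t v6_time_fields(uint64_t ts60) {
    return (((ts60 >> 28) & UINT64_C(0xFFFFFFFF)) << 32) /* 32 most significant bits */
         | (((ts60 >> 12) & 0xFFFF) << 16)               /* middle 16 bits */
         | UINT64_C(0x6000)                              /* version 6 */
         | (ts60 & 0x0FFF);                              /* 12 least significant bits */
}
```

For the test vector timestamp 0x1EC9414C232AB00, the first function produces 0xC232AB00941411EC and the second produces 0x1EC9414C232A6B00, matching Figures 10 and 11.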
B.2. Example of a UUIDv7 Value
This example UUIDv7 test vector utilizes a well-known 32 bit Unix
epoch with additional millisecond precision to fill the first 48 bits.
rand_a and rand_b are filled with random data.
The timestamp is Tuesday, February 22, 2022 2:22:22.00 PM GMT-05:00
represented as 0x17F22E279B0 or 1645557742000
-------------------------------
field       bits    value
-------------------------------
unix_ts_ms  48      0x017F22E279B0
ver          4      0x7
rand_a      12      0xCC3
var          2      b10
rand_b      62      0x18C4DC0C0C07398F
-------------------------------
total       128
-------------------------------
final: 017F22E2-79B0-7CC3-98C4-DC0C0C07398F
Figure 12: UUIDv7 Example Test Vector
B.3. Example of a UUIDv8 Value
This example UUIDv8 test vector utilizes a well-known 64 bit Unix
epoch with nanosecond precision, truncated to the least-significant,
right-most, bits to fill the first 48 bits through version.
The next two segments of custom_b and custom_c are filled with
random data.
Timestamp is Tuesday, February 22, 2022 2:22:22.000000 PM GMT-05:00
represented as 0x16D6320C3D4DCC00 or 1645557742000000000
It should be noted that this example is just to illustrate one
scenario for UUIDv8. Test vectors will likely be implementation
specific and vary greatly from this simple example.
-------------------------------
field       bits    value
-------------------------------
custom_a    48      0x320C3D4DCC00
ver          4      0x8
custom_b    12      0x75B
var          2      b10
custom_c    62      0xEC932D5F69181C0
-------------------------------
total       128
-------------------------------
final: 320C3D4D-CC00-875B-8EC9-32D5F69181C0
Figure 13: UUIDv8 Example Test Vector
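The field table above can be turned into the final 16 bytes with a small packing routine. This is a sketch for the 48/4/12/2/62 bit split shown in Figure 13 only; the function and parameter names are hypothetical, and other UUIDv8 layouts would need different field widths.

```c
#include <stdint.h>

/* Packs a 48 bit field, a 4 bit version, a 12 bit field, the variant
 * bits 1, 0, and a 62 bit field into 16 bytes in network byte order. */
void uuid_pack(uint8_t out[16], uint64_t a48, unsigned version,
               uint16_t b12, uint64_t c62) {
    uint64_t hi = ((a48 & UINT64_C(0xFFFFFFFFFFFF)) << 16)
                | ((uint64_t)(version & 0x0F) << 12)
                | (uint64_t)(b12 & 0x0FFF);
    uint64_t lo = (UINT64_C(2) << 62)                     /* variant 10 */
                | (c62 & ((UINT64_C(1) << 62) - 1));

    for (int i = 0; i < 8; i++) {
        out[i]     = (uint8_t)(hi >> (56 - 8 * i));
        out[8 + i] = (uint8_t)(lo >> (56 - 8 * i));
    }
}
```

Called with custom_a 0x320C3D4DCC00, version 8, custom_b 0x75B, and custom_c 0xEC932D5F69181C0, the routine yields the bytes of 320C3D4D-CC00-875B-8EC9-32D5F69181C0.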
Appendix C. Version and Variant Tables
C.1. Variant 10xx Versions
+------+------+------+------+---------+----------------------------+
| Msb0 | Msb1 | Msb2 | Msb3 | Version | Description                |
+------+------+------+------+---------+----------------------------+
| 0    | 0    | 0    | 0    | 0       | Unused                     |
+------+------+------+------+---------+----------------------------+
| 0    | 0    | 0    | 1    | 1       | The Gregorian time-based   |
|      |      |      |      |         | UUID from [RFC4122],       |
|      |      |      |      |         | Section 4.1.3              |
+------+------+------+------+---------+----------------------------+
| 0    | 0    | 1    | 0    | 2       | DCE Security version, with |
|      |      |      |      |         | embedded POSIX UIDs from   |
|      |      |      |      |         | [RFC4122], Section 4.1.3   |
+------+------+------+------+---------+----------------------------+
| 0    | 0    | 1    | 1    | 3       | The name-based version     |
|      |      |      |      |         | specified in [RFC4122],    |
|      |      |      |      |         | Section 4.1.3 that uses    |
|      |      |      |      |         | MD5 hashing.               |
+------+------+------+------+---------+----------------------------+
| 0    | 1    | 0    | 0    | 4       | The randomly or pseudo-    |
|      |      |      |      |         | randomly generated version |
|      |      |      |      |         | specified in [RFC4122],    |
|      |      |      |      |         | Section 4.1.3.             |
+------+------+------+------+---------+----------------------------+
| 0    | 1    | 0    | 1    | 5       | The name-based version     |
|      |      |      |      |         | specified in [RFC4122],    |
|      |      |      |      |         | Section 4.1.3 that uses    |
|      |      |      |      |         | SHA-1 hashing.             |
+------+------+------+------+---------+----------------------------+
| 0    | 1    | 1    | 0    | 6       | Reordered Gregorian time-  |
|      |      |      |      |         | based UUID specified in    |
|      |      |      |      |         | this document.             |
+------+------+------+------+---------+----------------------------+
| 0    | 1    | 1    | 1    | 7       | Unix Epoch time-based UUID |
|      |      |      |      |         | specified in this          |
|      |      |      |      |         | document.                  |
+------+------+------+------+---------+----------------------------+
| 1    | 0    | 0    | 0    | 8       | Reserved for custom UUID   |
|      |      |      |      |         | formats specified in this  |
|      |      |      |      |         | document.                  |
+------+------+------+------+---------+----------------------------+
| 1    | 0    | 0    | 1    | 9       | Reserved for future        |
|      |      |      |      |         | definition.                |
+------+------+------+------+---------+----------------------------+
| 1    | 0    | 1    | 0    | 10      | Reserved for future        |
|      |      |      |      |         | definition.                |
+------+------+------+------+---------+----------------------------+
| 1    | 0    | 1    | 1    | 11      | Reserved for future        |
|      |      |      |      |         | definition.                |
+------+------+------+------+---------+----------------------------+
| 1    | 1    | 0    | 0    | 12      | Reserved for future        |
|      |      |      |      |         | definition.                |
+------+------+------+------+---------+----------------------------+
| 1    | 1    | 0    | 1    | 13      | Reserved for future        |
|      |      |      |      |         | definition.                |
+------+------+------+------+---------+----------------------------+
| 1    | 1    | 1    | 0    | 14      | Reserved for future        |
|      |      |      |      |         | definition.                |
+------+------+------+------+---------+----------------------------+
| 1    | 1    | 1    | 1    | 15      | Reserved for future        |
|      |      |      |      |         | definition.                |
+------+------+------+------+---------+----------------------------+
Table 2: All UUID variant 10xx (8/9/A/B) version definitions.
Authors' Addresses Authors' Addresses
Brad G. Peabody Brad G. Peabody
Email: brad@peabody.io Email: brad@peabody.io
Kyzer R. Davis Kyzer R. Davis
Email: kydavis@cisco.com Email: kydavis@cisco.com
 End of changes. 140 change blocks. 
921 lines changed or deleted 942 lines changed or added
