Network Working Group                                           J. Benet
Internet-Draft                                             Protocol Labs
Intended status: Informational                                 M. Sporny
Expires: 20 August 2022                                   Digital Bazaar
                                                           February 2022

                       The Multibase Data Format
                    draft-multiformats-multibase-05

Abstract

   Raw binary data is often encoded using a mechanism that enables the
   data to be included in human-readable text-based formats.  This
   mechanism is often referred to as "base-encoding the data".
   Base-encoding is often used when expressing binary data in
   hyperlinks, cryptographic keys in web pages, or security tokens in
   application software.  There are a variety of base-encodings, such
   as base32, base58, and base64.  It is not always possible to
   differentiate one base-encoding from another.  The purpose of this
   specification is to provide a mechanism for deterministically
   identifying the base-encoding of a particular string of data.

Feedback

   This specification is a joint work product of Protocol Labs
   (https://protocol.ai/), the W3C Digital Verification Community Group
   (https://w3c-dvcg.github.io/), and the W3C Credentials Community
   Group (https://w3c-ccg.github.io/).  Feedback related to this
   specification should be logged in the issue tracker
   (https://github.com/w3c-dvcg/multibase/issues) or be sent to
   public-credentials@w3.org (mailto:public-credentials@w3.org).

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 5 August 2022.

Copyright Notice

   Copyright (c) 2022 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.

Table of Contents

   1.  Introduction
   2.  The Multibase Format
     2.1.  A Multibase Example
   3.  Normative References
   Appendix A.  Security Considerations
   Appendix B.  Test Values
     B.1.  Hexadecimal upper-case encoding
     B.2.  Base-32 upper-case encoding, no padding
     B.3.  Base-58 Bitcoin encoding
     B.4.  Base-64 with padding and MIME-encoding
   Appendix C.  Acknowledgements
   Appendix D.  IANA Considerations
     D.1.  The Multibase Algorithms Registry
   Authors' Addresses

1.  Introduction

   This specification describes a forward-compatible data model for
   expressing raw binary data in a variety of base-encoding formats such
   as base32, base58, and base64.

   When text is encoded as bytes, we can usually use a one-size-fits-all
   encoding (UTF-8) because we're always encoding to the same set of 256
   bytes.  When that doesn't work, usually for historical or performance
   reasons, we can usually infer the encoding from the context.

   However, when bytes are encoded as text (using a base encoding), the
   choice of base encoding is often restricted by the context.  Worse,
   these restrictions can change based on where the data appears in the
   text.  In some cases, we can only use [a-z0-9].  In others, we can
   use a larger set of characters but need a compact encoding.
   This has led to a large set of "base encodings", one for every
   use-case.  Unlike when encoding text to bytes, we can't just
   standardize around a single base encoding because there is no
   optimal encoding for all cases.

   Unfortunately, it's not always clear what base encoding is used;
   that's where this specification comes in.  It answers the question:

   Given data 'd' encoded into text 's', what base is it encoded with?

2.  The Multibase Format

   A multibase-encoded value follows a simple format:

   base-encoding-character base-encoded-data

   The base-encoding-character is a single character that identifies
   the encoding algorithm and is always the first byte of the data.
   The possible values for this field are provided in the Multibase
   Algorithms Registry (Appendix D.1).

2.1.  A Multibase Example

   The following is an encoding of "Hello World!" using the version of
   base-58 that utilizes the Bitcoin encoding character set:

   z2NEpo7TZRRrLZSi2U

   The first byte (z) specifies the multibase encoding algorithm.  The
   rest of the data is the output of that encoding algorithm.

3.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC4648]  Josefsson, S., "The Base16, Base32, and Base64 Data
              Encodings", RFC 4648, DOI 10.17487/RFC4648, October 2006.

Appendix A.  Security Considerations

   There are a number of security considerations to take into account
   when implementing or utilizing this specification.  TBD

Appendix B.  Test Values

   The multibase examples are chosen to show different encoding
   algorithms and different output lengths at play.  The input test data
   for all of the examples in this section is:

   Multibase is awesome! \o/

B.1.  Hexadecimal upper-case encoding

   F4D756C74696261736520697320617765736F6D6521205C6F2F

B.2.  Base-32 upper-case encoding, no padding

   BJV2WY5DJMJQXGZJANFZSAYLXMVZW63LFEEQFY3ZP

B.3.  Base-58 Bitcoin encoding

   zYAjKoNbau5KiqmHPmSxYCvn66dA1vLmwbt

B.4.  Base-64 with padding and MIME-encoding

   MTXVsdGliYXNlIGlzIGF3ZXNvbWUhIFxvLw==

Appendix C.  Acknowledgements

   The editors would like to thank the following individuals for
   feedback on and implementations of the specification (in alphabetical
   order):

Appendix D.  IANA Considerations

D.1.  The Multibase Algorithms Registry

   The following initial entries should be added to the Multibase
   Algorithms Registry to be created and maintained at (the suggested
   URI) http://www.iana.org/assignments/multibase-algorithms:

   +===================+=============+========+========================+
   | Algorithm         | Identifier  | Status | Specification          |
   |                   | (character) |        |                        |
   +===================+=============+========+========================+
   | identity          | 0x00        | active | 8-bit binary (encoder  |
   |                   |             |        | and decoder keep       |
   |                   |             |        | data unmodified)       |
   +-------------------+-------------+--------+------------------------+
   | base2             | 0           | active | binary (01010101)      |
   +-------------------+-------------+--------+------------------------+
   | base8             | 7           | active | octal                  |
   +-------------------+-------------+--------+------------------------+
   | base10            | 9           | active | decimal                |
   +-------------------+-------------+--------+------------------------+
   | base16            | f           | active | hexadecimal            |
   +-------------------+-------------+--------+------------------------+
   | base16upper       | F           | active | hexadecimal            |
   +-------------------+-------------+--------+------------------------+
   | base32hex         | v           | active | RFC 4648 [RFC4648]     |
   |                   |             |        | case-insensitive - no  |
   |                   |             |        | padding - highest      |
   |                   |             |        | char                   |
   +-------------------+-------------+--------+------------------------+
   | base32hexupper    | V           | active | RFC 4648 [RFC4648]     |
   |                   |             |        | case-insensitive - no  |
   |                   |             |        | padding - highest      |
   |                   |             |        | char                   |
   +-------------------+-------------+--------+------------------------+
   | base32hexpad      | t           | active | RFC 4648 [RFC4648]     |
   |                   |             |        | case-insensitive -     |
   |                   |             |        | with padding           |
   +-------------------+-------------+--------+------------------------+
   | base32hexpadupper | T           | active | RFC 4648 [RFC4648]     |
   |                   |             |        | case-insensitive -     |
   |                   |             |        | with padding           |
   +-------------------+-------------+--------+------------------------+
   | base32            | b           | active | RFC 4648 [RFC4648]     |
   |                   |             |        | case-insensitive - no  |
   |                   |             |        | padding                |
   +-------------------+-------------+--------+------------------------+
   | base32upper       | B           | active | RFC 4648 [RFC4648]     |
   |                   |             |        | case-insensitive - no  |
   |                   |             |        | padding                |
   +-------------------+-------------+--------+------------------------+
   | base32pad         | c           | active | RFC 4648 [RFC4648]     |
   |                   |             |        | case-insensitive -     |
   |                   |             |        | with padding           |
   +-------------------+-------------+--------+------------------------+
   | base32padupper    | C           | active | RFC 4648 [RFC4648]     |
   |                   |             |        | case-insensitive -     |
   |                   |             |        | with padding           |
   +-------------------+-------------+--------+------------------------+
   | base32z           | h           | active | z-base-32 (used by     |
   |                   |             |        | Tahoe-LAFS)            |
   +-------------------+-------------+--------+------------------------+
   | base36            | k           | active | base36 [0-9a-z] case-  |
   |                   |             |        | insensitive - no       |
   |                   |             |        | padding                |
   +-------------------+-------------+--------+------------------------+
   | base36upper       | K           | active | base36 [0-9a-z] case-  |
   |                   |             |        | insensitive - no       |
   |                   |             |        | padding                |
   +-------------------+-------------+--------+------------------------+
   | base58btc         | z           | active | base58 bitcoin         |
   +-------------------+-------------+--------+------------------------+
   | base58flickr      | Z           | active | base58 flickr          |
   +-------------------+-------------+--------+------------------------+
   | base64            | m           | active | RFC 4648 [RFC4648] no  |
   |                   |             |        | padding                |
   +-------------------+-------------+--------+------------------------+
   | base64pad         | M           | active | RFC 4648 [RFC4648]     |
   |                   |             |        | with padding - MIME    |
   |                   |             |        | encoding               |
   +-------------------+-------------+--------+------------------------+
   | base64url         | u           | active | RFC 4648 [RFC4648] no  |
   |                   |             |        | padding                |
   +-------------------+-------------+--------+------------------------+
   | base64urlpad      | U           | active | RFC 4648 [RFC4648]     |
   |                   |             |        | with padding           |
   +-------------------+-------------+--------+------------------------+
   | proquint          | p           | active | PRO-QUINT              |
   |                   |             |        | https://arxiv.org/     |
   |                   |             |        | html/0901.4016         |
   +-------------------+-------------+--------+------------------------+

                 Table 1: Multibase Algorithms Registry

   NOTE: The most up-to-date version of the table above is maintained
   for developers at
   https://github.com/multiformats/multibase/blob/master/multibase.csv.

Authors' Addresses

   Juan Benet
   Protocol Labs
   548 Market Street, #51207
   San Francisco, CA 94104
   United States of America

   Phone: +1 619 957 7606
   Email: juan@protocol.ai
   URI:   http://juan.benet.ai/

   Manu Sporny
   Digital Bazaar
   203 Roanoke Street W.
   Blacksburg, VA 24060
   United States of America

   Phone: +1 540 961 4469
   Email: msporny@digitalbazaar.com
   URI:   http://manu.sporny.org/
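
   The prefix scheme of Section 2 and the test values of Appendix B can
   be illustrated with a minimal, non-normative Python sketch.  It
   assumes only the Python standard library and covers just four
   registry entries (F, B, z, and M).  The names multibase_encode,
   multibase_decode, b58btc_encode, and b58btc_decode are hypothetical
   helpers introduced for this illustration; they are not defined by
   this specification or by any particular library.

```python
import base64

# Base58 alphabet used by Bitcoin (registry entry "base58btc", prefix "z").
B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def b58btc_encode(data: bytes) -> str:
    """Encode bytes as base58 with the Bitcoin alphabet (hypothetical helper)."""
    n = int.from_bytes(data, "big")
    out = ""
    while n > 0:
        n, rem = divmod(n, 58)
        out = B58_ALPHABET[rem] + out
    # Each leading zero byte is represented by one leading '1'.
    zeros = len(data) - len(data.lstrip(b"\x00"))
    return "1" * zeros + out


def b58btc_decode(text: str) -> bytes:
    """Decode base58 (Bitcoin alphabet) back to bytes (hypothetical helper)."""
    n = 0
    for ch in text:
        n = n * 58 + B58_ALPHABET.index(ch)  # ValueError on an invalid character
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    zeros = len(text) - len(text.lstrip("1"))
    return b"\x00" * zeros + body


# Illustrative subset of the registry: prefix character -> encoder / decoder.
ENCODERS = {
    "F": lambda d: d.hex().upper(),                                  # base16upper
    "B": lambda d: base64.b32encode(d).decode("ascii").rstrip("="),  # base32upper
    "z": b58btc_encode,                                              # base58btc
    "M": lambda d: base64.b64encode(d).decode("ascii"),              # base64pad
}
DECODERS = {
    "F": bytes.fromhex,
    # Restore the padding that base32upper omits before decoding.
    "B": lambda s: base64.b32decode(s + "=" * (-len(s) % 8)),
    "z": b58btc_decode,
    "M": base64.b64decode,
}


def multibase_encode(prefix: str, data: bytes) -> str:
    """Return the prefix character followed by the base-encoded data."""
    return prefix + ENCODERS[prefix](data)


def multibase_decode(text: str) -> bytes:
    """Read the first character to select the decoder, then decode the rest."""
    if not text:
        raise ValueError("empty multibase string")
    prefix, body = text[0], text[1:]
    if prefix not in DECODERS:
        raise ValueError(f"unsupported multibase prefix: {prefix!r}")
    return DECODERS[prefix](body)


if __name__ == "__main__":
    sample = b"Multibase is awesome! \\o/"
    for prefix in "FBzM":
        print(multibase_encode(prefix, sample))
```

   Because the decoder is selected solely by the first character, a
   receiver written this way does not need out-of-band knowledge of the
   base encoding, which is the question posed in Section 1.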