idnits 2.17.1

draft-hallambaker-udf-10.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The abstract seems to contain references ([1]), which it shouldn't.
     Please replace those with straight textual mentions of the documents in
     question.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD',
     or 'RECOMMENDED' is not an accepted usage according to RFC 2119.  Please
     use uppercase 'NOT' together with RFC 2119 keywords (if that is what you
     mean).

     Found 'MUST not' in this paragraph:

     Since PKIX certificates and CLRs contain security policy information,
     UDF fingerprints used to identify certificates or CRLs SHOULD be
     presented with a minimum of 200 bits of precision.  PKIX applications
     MUST not accept UDF fingerprints specified with less than 200 bits of
     precision for purposes of identifying trust anchors.

  -- The document date (April 11, 2018) is 2197 days in the past.  Is this
     intentional?

  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  -- Looks like a reference, but probably isn't: '1' on line 762

  == Missing Reference: 'RFC2119' is mentioned on line 240, but not defined

     Summary: 1 error (**), 0 flaws (~~), 3 warnings (==), 2 comments (--).
     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

Network Working Group                                    P. Hallam-Baker
Internet-Draft                                         Comodo Group Inc.
Intended status: Informational                            April 11, 2018
Expires: October 13, 2018

                     Uniform Data Fingerprint (UDF)
                        draft-hallambaker-udf-10

Abstract

   This document describes means of generating Uniform Data Fingerprint
   (UDF) values and their presentation as text sequences and as URIs.
   Uses of UDF fingerprints include but are not limited to creating
   Strong Internet Names (SINs).

   Cryptographic digests provide a means of uniquely identifying static
   data without the need for a registration authority.  A fingerprint is
   a form of presenting a cryptographic digest that makes it suitable
   for use in applications where human readability is required.  The UDF
   fingerprint format improves over existing formats through the
   introduction of a compact algorithm identifier affording an
   intentionally limited choice of digest algorithm and the inclusion of
   an IANA registered MIME Content-Type identifier within the scope of
   the digest input to allow the use of a single fingerprint format in
   multiple application domains.

   Alternative means of rendering fingerprint values are considered
   including machine-readable codes, word and image lists.

   This document is also available online at
   http://mathmesh.com/Documents/draft-hallambaker-udf.html [1].

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on October 13, 2018.

Copyright Notice

   Copyright (c) 2018 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Algorithm Identifier
     1.2.  Content Type Identifier
     1.3.  Representation
     1.4.  Truncation
   2.  Definitions
     2.1.  Requirements Language
     2.2.  Defined Terms
     2.3.  Related Specifications
     2.4.  Implementation Status
   3.  Encoding
     3.1.  Binary Fingerprint Value
       3.1.1.  Version ID
     3.2.  Truncation
     3.3.  Base32 Representation
     3.4.  Example Encoding
       3.4.1.  Using SHA-2-512 Digest
       3.4.2.  Using SHA-3-512 Digest
     3.5.  Fingerprint Improvement
     3.6.  Compressed Presentation
     3.7.  Example of Compressed Encoding
       3.7.1.  Example
   4.  Content Types
     4.1.  PKIX Certificates and Keys
     4.2.  OpenPGP Key
     4.3.  DNSSEC
   5.  Additional UDF Renderings
     5.1.  Machine Readable Rendering
     5.2.  Word Lists
     5.3.  Image List
   6.  Security Considerations
     6.1.  Work Factor and Precision
     6.2.  Semantic Substitution
   7.  IANA Considerations
     7.1.  URI Registration
     7.2.  Content Type Registration
     7.3.  Version Registry
   8.  References
     8.1.  Normative References
     8.2.  Informative References
     8.3.  URIs
   Author's Address

1.  Introduction

   The use of cryptographic digest functions to produce identifiers is
   well established as a means of generating a unique identifier for
   fixed data without the need for a registration authority.

   While the use of fingerprints of public keys was popularized by PGP,
   they are employed in many other applications including OpenPGP, SSH,
   Bitcoin and PKIX.

   A cryptographic digest is a particular form of hash function that has
   the properties:

   o  It is easy to compute the digest value for any given message

   o  It is infeasible to generate a message from its digest value

   o  It is infeasible to modify a message without changing the digest
      value

   o  It is infeasible to find two different messages with the same
      digest value.

   If these properties are met, the only way that two data objects can
   map to the same digest value is by random chance.  If the number of
   possible digest values is sufficiently large (i.e. the digest is a
   sufficiently large number of bits in length), this chance is reduced
   to an arbitrarily small probability.  Such values are described as
   being probabilistically unique.

   A fingerprint is a representation of a cryptographic digest value
   optimized for purposes of verification and in some cases data entry.

1.1.  Algorithm Identifier

   Although a secure cryptographic digest algorithm has properties that
   make it ideal for certain types of identifier use, several
   cryptographic digest algorithms have found widespread use, some of
   which have been demonstrated to be insecure.

   For example, the MD5 message digest algorithm [RFC1321] was widely
   used in IETF protocols until it was demonstrated to be vulnerable to
   collision attacks [Dobertin95].

   The secure use of a fingerprint scheme therefore requires the digest
   algorithm to either be fixed or otherwise determined by the
   fingerprint value itself.
   Otherwise an attacker may be able to use a weak, broken digest
   algorithm to generate a data object matching a fingerprint value
   generated using a strong digest algorithm.

   The two digest algorithms currently used in the UDF scheme are both
   believed to be strong.  These are SHA-2-512 [SHA-2] and SHA-3-512
   [SHA-3].  The most secure, 512 bit version of the algorithm is used
   in both cases although the output is almost invariably truncated to a
   shorter length.  Use of the strongest version of the algorithm in
   every circumstance eliminates the need to negotiate the algorithm
   strength.

1.2.  Content Type Identifier

   A secure cryptographic digest algorithm provides a digest value that
   is probabilistically unique for a particular byte sequence but does
   not fix the context in which a byte sequence is interpreted.  While
   such ambiguity may be tolerated in a fingerprint format designed for
   a single specific field of use, it is not acceptable in a general
   purpose format.

   For example, the SSH and OpenPGP applications both make use of
   fingerprints as identifiers for the public keys used, but use
   different digest algorithms and data formats for representing the
   public key data.  While no such vulnerability has been demonstrated
   to date, it is certainly conceivable that a crafty attacker might
   construct an SSH key in such a fashion that OpenPGP interprets the
   data in an insecure fashion.  If the number of applications making
   use of a fingerprint format that permits such substitutions is
   sufficiently large, the probability of a semantic substitution
   vulnerability being possible becomes unacceptably large.

   A simple control that defeats such attacks is to incorporate a
   content type identifier within the scope of the data input to the
   hash function.

1.3.  Representation

   The representation of a fingerprint is the format in which it is
   presented to either an application or the user.

   Base32 encoding is used to produce the preferred text representation
   of a UDF fingerprint.  This encoding uses only the letters of the
   Latin alphabet and digits chosen to minimize the risk of ambiguity
   between numbers and letters (2, 3, 4, 5, 6 and 7).

   To enhance readability and improve data entry, characters are grouped
   into groups of five.

1.4.  Truncation

   Different applications of fingerprints demand different tradeoffs
   between compactness of the representation and the number of
   significant bits.  A larger number of significant bits reduces the
   risk of collision but at a cost to convenience.

   Modern cryptographic digest functions such as SHA-2 produce output
   values of at least 256 bits in length.  This is considerably larger
   than most uses of fingerprints require and certainly greater than can
   be represented in human readable form on a business card.

   Since a strong cryptographic digest function produces an output value
   in which every bit in the input value affects every bit in the output
   value with equal probability, it follows that truncating the digest
   value to produce a fingerprint is at least as strong as any other
   mechanism if the digest algorithm used is strong.

   Using truncation to reduce the precision of the digest function has
   the advantage that a lower precision fingerprint of some data content
   is always a prefix of a higher precision fingerprint of the same
   content.  This allows higher precision fingerprints to be converted
   to a lower precision without the need for special tools.

2.  Definitions

   This section presents the related specifications and standards, the
   terms that are used as terms of art within the document and the
   terms used as requirements language.

2.1.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

2.2.  Defined Terms

   Cryptographic Digest Function

      A hash function that has the properties required for use as a
      cryptographic hash function.  These include collision resistance,
      first pre-image resistance and second pre-image resistance.

   Content Type  An identifier indicating how a Data Value is to be
      interpreted as specified in the IANA registry Media Types.

   Data Value  The binary octet stream that is the input to the digest
      function used to calculate a digest value.

   Data Object  A Data Value and its associated Content Type

   Digest Algorithm  A synonym for Cryptographic Digest Function

   Digest Value  The output of a Cryptographic Digest Function

   Data Digest Value  The output of a Cryptographic Digest Function for
      a given Data Value input.

   Fingerprint  A presentation of the digest value of a data value or
      data object.

   Fingerprint Presentation  The representation of at least some part of
      a fingerprint value in human or machine readable form.

   Fingerprint Improvement  The practice of recording a higher precision
      presentation of a fingerprint on successful validation.

   Fingerprint Work Hardening  The practice of generating a sequence of
      fingerprints until one is found that matches criteria that permit
      a compressed presentation form to be used.  The compressed
      fingerprint is thus shorter than, but presents the same work
      factor as, an uncompressed one.

   Hash  A function which takes an input and returns a fixed-size
      output.  Ideally, the output of a hash function is unbiased and
      not correlated to the outputs returned to similar inputs in any
      predictable fashion.

   Precision  The number of significant bits provided by a Fingerprint
      Presentation.

   Work Factor  A measure of the computational effort required to
      perform an attack against some security property.

2.3.  Related Specifications

   This specification makes use of Base32 [RFC4648] encoding, SHA-2
   [SHA-2] and SHA-3 [SHA-3] digest functions.

   UDFs are used in the definition of Strong Internet Names
   [hallambaker-sin].

2.4.  Implementation Status

   The implementation status of the reference code base is described in
   the companion document [draft-hallambaker-mesh-developer].

3.  Encoding

   A UDF fingerprint for a given data object is generated by calculating
   the Binary Fingerprint Value for the given data object and type
   identifier, truncating it to obtain the desired degree of precision
   and then converting the truncated value to a representation.

3.1.  Binary Fingerprint Value

   The binary encoding of a fingerprint is calculated using the formula:

   Fingerprint = <Version-ID> + H (<Content-ID> + ':' + H(<Data>))

                                 Figure 1

   Where

   H(x) is the cryptographic digest function
   <Version-ID> is the fingerprint version and algorithm identifier.
   <Content-ID> is the MIME Content-Type of the data.
   <Data> is the binary data.

                                 Figure 2

   The use of the nested hash function permits a fingerprint to be taken
   of data for which a digest value is already known without the need to
   calculate a new digest over the data.

   The inclusion of a MIME content type prevents message substitution
   attacks in which one content type is substituted for another.

3.1.1.  Version ID

   A Version Identifier consists of a single byte.
   The following digest algorithm identifiers are specified in this
   document:

   +-----------------+------------------------+-----------------+
   | Version ID      | Algorithm              | Reference       |
   +-----------------+------------------------+-----------------+
   | 96              | SHA-2-512              |                 |
   | 97, 98, 99, 100 | SHA-2-512 (compressed) |                 |
   | 144             | SHA-3-512              |                 |
   +-----------------+------------------------+-----------------+

                                  Table 1

   These algorithm identifiers have been chosen so that the first
   character in a SHA-2-512 fingerprint will always be 'M' and the first
   character in a SHA-3-512 fingerprint will always be 'S'.  These
   provide mnemonics for 'Merkle-Damgard' and 'Sponge' respectively.

3.2.  Truncation

   The Binary Fingerprint Value is truncated to an integer multiple of
   25 bits regardless of the intended output presentation.

   The output of the hash function is truncated to a sequence of n bits
   by first selecting the first floor(n/8) bytes of the output.  If n is
   an integer multiple of 8, no additional bits are required and this is
   the result.  Otherwise the remaining bits are taken from the most
   significant bits of the next byte and any unused bits are set to 0.

   For example, to truncate the byte sequence [a0, b1, c2, d3, e4] to 25
   bits: 25/8 = 3 bytes with 1 bit remaining.  The first three bytes of
   the truncated sequence are [a0, b1, c2] and the final byte is d3 AND
   80 = 80, which is appended to the previous result to obtain the final
   truncated sequence [a0, b1, c2, 80].

3.3.  Base32 Representation

   A modified version of Base32 [RFC4648] encoding is used to present
   the fingerprint in text form, grouping the output text into groups of
   five characters separated by a dash '-'.  This representation
   improves the accuracy of both data entry and verification.

3.4.  Example Encoding

   In the following examples, <Content-ID> is the UTF8 encoding of the
   string "text/plain" and <Data> is the UTF8 encoding of the string
   "UDF Data Value"

   Data =
     55 44 46 20 44 61 74 61 20 56 61 6C 75 65

   ContentType =
     74 65 78 74 2F 70 6C 61 69 6E

                                 Figure 3

3.4.1.  Using SHA-2-512 Digest

   H(<Data>) =

     48 DA 47 CC AB FE A4 5C 76 61 D3 21 BA 34 3E 58
     10 87 2A 03 B4 02 9D AB 84 7C CE D2 22 B6 9C AB
     02 38 D4 E9 1E 2F 6B 36 A0 9E ED 11 09 8A EA AC
     99 D9 E0 BD EA 47 93 15 BD 7A E9 E1 2E AD C4 15

   <Content-ID> + ':' + H(<Data>) =

     74 65 78 74 2F 70 6C 61 69 6E 3A 48 DA 47 CC AB
     FE A4 5C 76 61 D3 21 BA 34 3E 58 10 87 2A 03 B4
     02 9D AB 84 7C CE D2 22 B6 9C AB 02 38 D4 E9 1E
     2F 6B 36 A0 9E ED 11 09 8A EA AC 99 D9 E0 BD EA
     47 93 15 BD 7A E9 E1 2E AD C4 15

   H(<Content-ID> + ':' + H(<Data>)) =

     C6 AF B7 C0 FE BE 04 E5 AE 94 E3 7B AA 5F 1A 40
     5B A3 CE CC 97 4D 55 C0 9E 61 E4 B0 EF 9C AE F9
     EB 83 BB 9D 5F 0F 39 F6 5F AA 06 DC 67 2A 67 71
     4F FF 8F 83 C4 55 38 36 38 AE 42 7A 82 9C 85 BB

                                 Figure 4

   Text Presentation (100 bit) MDDK7-N6A72-7AJZN-OSTRX

   Text Presentation (125 bit) MDDK7-N6A72-7AJZN-OSTRX-XKS7D

   Text Presentation (150 bit) MDDK7-N6A72-7AJZN-OSTRX-XKS7D-JAFXI

   Text Presentation (250 bit) MDDK7-N6A72-7AJZN-OSTRX-XKS7D-JAFXI-
   6OZSL-U2VOA-TZQ6J-MHPTS

3.4.2.  Using SHA-3-512 Digest

   H(<Data>) =

     6D 2E CF E6 93 5A 0C FC F2 A9 1A 49 E0 0C D8 07
     A1 4E 70 AB 72 94 6E CC BB 47 48 F1 8E 41 49 95
     07 1D F3 6E 0D 0C 8B 60 39 C1 8E B4 0F 6E C8 08
     65 B4 C4 45 9B A2 7E 97 74 7B BE 68 BC A8 C2 17

   <Content-ID> + ':' + H(<Data>) =

     74 65 78 74 2F 70 6C 61 69 6E 3A 6D 2E CF E6 93
     5A 0C FC F2 A9 1A 49 E0 0C D8 07 A1 4E 70 AB 72
     94 6E CC BB 47 48 F1 8E 41 49 95 07 1D F3 6E 0D
     0C 8B 60 39 C1 8E B4 0F 6E C8 08 65 B4 C4 45 9B
     A2 7E 97 74 7B BE 68 BC A8 C2 17

   H(<Content-ID> + ':' + H(<Data>)) =

     58 9B 76 70 35 B4 55 E5 41 4C 29 4D 73 C1 FD 48
     F9 9A D6 29 35 A3 14 9A 32 6C EA 9E 7D 7A 8C 3F
     26 B0 0F 15 84 CB BE 6F 35 C6 37 48 AF 5C F1 02
     31 79 50 B1 A1 4F 97 50 97 49 5E DA A2 A0 A9 B5

                                 Figure 5

   Text Presentation (100 bit) SCFIN-CQGDR-KG47R-7OVPT

   Text Presentation (125 bit) SCFIN-CQGDR-KG47R-7OVPT-TCHZ7

   Text Presentation (150 bit) SCFIN-CQGDR-KG47R-7OVPT-TCHZ7-UXY5S

   Text Presentation (250 bit) SCFIN-CQGDR-KG47R-7OVPT-TCHZ7-UXY5S-
   CFSMN-YBKBP-FELHX-I56EH

3.5.  Fingerprint Improvement

   Since an application must always calculate the full fingerprint value
   as part of the verification process, an application MAY accept a low
   precision (e.g. 100 bit) fingerprint value from the user and replace
   it with a higher precision fingerprint (e.g. 250 bits) after
   verification.

   Applications are encouraged to make use of the practice of
   fingerprint improvement wherever possible.

3.6.  Compressed Presentation

   Fingerprint compression permits the use of a shorter fingerprint
   presentation without a reduction in the attacker work factor by
   requiring the fingerprint value to match a particular pattern.

   UDF fingerprints MUST use compression if possible.
   A compressed fingerprint uses a version identifier that specifies the
   form of compression used as follows:

   +------------+-------------------------+
   | Version ID | Compression             |
   +------------+-------------------------+
   | 96         | None                    |
   | 97         | First 25 bits are zeros |
   | 98         | First 35 bits are zeros |
   | 99         | First 40 bits are zeros |
   | 100        | First 45 bits are zeros |
   | 101        | First 50 bits are zeros |
   +------------+-------------------------+

                                  Table 2

   Support for compression may introduce perverse incentives such as
   performing key generation on machines that are less secure but offer
   fast (or cheap) processing power.  An attacker might even offer to
   generate public key pairs for free using their 'ultra fast' machine.
   For this reason, it is probably desirable to at least support if not
   mandate the use of some sort of salting scheme when compression is in
   use.  This allows the key to be generated in secure, trusted
   hardware, with only the discovery of a salt providing the desired
   compression being performed on less trusted or untrusted devices.

   Currently, 25 bit compression may be achieved on commodity machines
   with minimal impact on key generation if salting is used.  Use of 35
   bit compression has a noticeable impact, but can still be achieved
   within hours without the use of special purpose hardware (e.g. a
   GPU).  Use of 40 bit compression is feasible with a GPU, and use of
   50 bit compression, which would allow a fingerprint to be shortened
   by ten significant characters, is on the outer edge of practicality.
   While support for even higher levels of compression is conceivable,
   it is probably not very sensible.

3.7.  Example of Compressed Encoding

3.7.1.  Example

   The string "290668103" has a SHA-2-512 UDF fingerprint with 29
   leading zero bits.
   The inputs to the fingerprint are:

   Data =
     32 39 30 36 36 38 31 30 33

   ContentType =
     74 65 78 74 2F 70 6C 61 69 6E

                                 Figure 6

   The 100 bit UDF fingerprint is:

   MF3VV-FOFE2-CLRW (Maybe)

   NB: The above is not generated from code and might well be incorrect.

4.  Content Types

   While a UDF fingerprint MAY be used to identify any form of static
   data, the use of a UDF fingerprint to identify a public key signature
   key provides a level of indirection and thus the ability to identify
   dynamic data.  The content types used to identify public keys are
   thus of particular interest.

   As described in the security considerations section, the use of
   fingerprints to identify a bare public key and the use of
   fingerprints to identify a public key and associated security policy
   information are very different.

4.1.  PKIX Certificates and Keys

   UDF fingerprints MAY be used to identify PKIX certificates, CRLs and
   public keys in the ASN.1 encoding used in PKIX certificates.

   Since PKIX certificates and CRLs contain security policy information,
   UDF fingerprints used to identify certificates or CRLs SHOULD be
   presented with a minimum of 200 bits of precision.  PKIX applications
   MUST NOT accept UDF fingerprints specified with less than 200 bits of
   precision for purposes of identifying trust anchors.

   PKIX certificates, keys and related content data are identified by
   the following content types:

   application/pkix-cert  A PKIX Certificate

   application/pkix-crl  A PKIX CRL

   application/pkix-keyinfo  The KeyInfo structure defined in the PKIX
      certificate specification

4.2.  OpenPGP Key

   OpenPGPv5 keys and key set content data are identified by the
   following content types:

   application/pgp-key-v5  An OpenPGP key

   application/pgp-keys  An OpenPGP key set.

4.3.  DNSSEC

   DNSSEC record data consists of DNS records which are identified by
   the following content type:

   application/dns  A DNS resource record in binary format

5.  Additional UDF Renderings

   By default, a UDF fingerprint is rendered in the Base32 encoding
   described in this document.  Additional renderings MAY be employed to
   facilitate entry and/or verification of fingerprint values.

5.1.  Machine Readable Rendering

   The use of a machine-readable rendering such as a QR Code allows a
   UDF value to be input directly using a smartphone or other device
   equipped with a camera.

   A QR code fixed to a network capable device might contain the
   fingerprint of a machine readable description of the device.

5.2.  Word Lists

   The use of a Word List to encode fingerprint values was introduced by
   Patrick Juola and Philip Zimmermann for the PGPfone application.  The
   PGP Word List is designed to facilitate exchange and verification of
   fingerprint values in a voice application.  To minimize the risk of
   misinterpretation, two word lists of 256 values each are used to
   encode alternate fingerprint bytes.  The compact size of the lists
   used allowed the compilers to curate them so as to maximize the
   phonetic distance of the words selected.

   The PGP Word List is designed to achieve a balance between ease of
   entry and verification.  Applications where only verification is
   required may be better served by a much larger word list, permitting
   shorter fingerprint encodings.

   For example, a word list with 16384 entries permits 14 bits of the
   fingerprint to be encoded at once, 65536 entries permits 16.  These
   encodings allow a 125 bit fingerprint to be encoded in 9 and 8 words
   respectively.

5.3.  Image List

   An image list is used in the same manner as a word list, affording
   rapid visual verification of a fingerprint value.  For obvious
For obvious 626 reasons, this approach is not generally suited to data entry. 628 6. Security Considerations 630 6.1. Work Factor and Precision 632 A given UDF data object has a single fingerprint value that may be 633 presented at different precisions. The shortest legitimate precision 634 with which a UDF fingerprint may be presented has 96 significant bits 636 A UDF fingerprint presents the same work factor as any other 637 cryptographic digest function. The difficulty of finding a second 638 data item that matches a given fingerprint is 2^n and the difficulty 639 or finding two data items that have the same fingerprint is 2^(n/2). 640 Where n is the precision of the fingerprint. 642 For the algorithms specified in this document, n = 512 and thus the 643 work factor for finding collisions is 2^256, a value that is 644 generally considered to be computationally infeasible. 646 Since the use of 512 bit fingerprints is impractical in the type of 647 applications where fingerprints are generally used, truncation is a 648 practical necessity. The longer a fingerprint is, the less likely it 649 is that a user will check every character. It is therefore important 650 to consider carefully whether the security of an application depends 651 on second pre-image resistance or collision resistance. 653 In most fingerprint applications, such as the use of fingerprints to 654 identify public keys, the fact that a malicious party might generate 655 two keys that have the same fingerprint value is a minor concern. 656 Combined with a flawed protocol architecture, such a vulnerability 657 may permit an attacker to construct a document such that the 658 signature will be accepted as valid by some parties but not by 659 others. 661 For example, Alice generates keypairs until two are generated that 662 have the same 100 bit UDF presentation (typically 2^48 attempts). 663 She registers one keypair with a merchant and the other with her 664 bank. 
   This allows Alice to create a payment instrument that will be
   accepted as valid by one and rejected by the other.

   The ability to generate two PKIX certificates with the same
   fingerprint and different certificate attributes raises very
   different and more serious security concerns.  For example, an
   attacker might generate two certificates with the same key and
   different use constraints.  This might allow an attacker to present a
   highly constrained certificate that does not present a security risk
   to an application for purposes of gaining approval and an
   unconstrained certificate to request a malicious action.

   In general, any use of fingerprints to identify data that has
   security policy semantics requires the risk of collision attacks to
   be considered.  For this reason short, 'user friendly' fingerprint
   presentations (less than 200 bits) SHOULD only be used for public key
   values.

6.2.  Semantic Substitution

   Many applications record the fact that a data item is trusted; rather
   fewer record the circumstances in which the data item is trusted.
   This results in a semantic substitution vulnerability which an
   attacker may exploit by presenting the trusted data item in the wrong
   context.

   The UDF format provides protection against high level semantic
   substitution attacks by incorporating the content type into the input
   to the outermost fingerprint digest function.  The work factor for
   generating a UDF fingerprint that is valid in both contexts is thus
   the same as the work factor for finding a second preimage in the
   digest function (2^512 for the specified digest algorithms).

   It is thus infeasible to generate a data item such that some
   applications will interpret it as a PKIX key and others will accept
   it as an OpenPGP key.
   While attempting to parse a PKIX key as an OpenPGP key is virtually
   certain to fail to return the correct key parameters, it cannot be
   assumed that the attempt is guaranteed to fail with an error message.

   The UDF format does not provide protection against semantic
   substitution attacks that do not affect the content type.

7.  IANA Considerations

   [This will be extended later]

7.1.  URI Registration

   [Here a URI registration for the udf: scheme]

7.2.  Content Type Registration

   [application/pkix-keyinfo]

   [application/pgp-key]

7.3.  Version Registry

   96 = SHA-2-512
   97 = SHA-2-512 with 25 leading zeros
   98 = SHA-2-512 with 40 leading zeros
   99 = SHA-2-512 with 50 leading zeros
   100 = SHA-2-512 with 55 leading zeros
   144 = SHA-3-512

                                 Figure 7

8.  References

8.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC4648]  Josefsson, S., "The Base16, Base32, and Base64 Data
              Encodings", RFC 4648, DOI 10.17487/RFC4648, October 2006.

   [SHA-2]    NIST, "Secure Hash Standard", August 2015.

   [SHA-3]    Dworkin, M., "SHA-3 Standard: Permutation-Based Hash and
              Extendable-Output Functions", August 2015.

8.2.  Informative References

   [Dobertin95]
              Dobbertin, H., "Cryptanalysis of MD5 Compress", Eurocrypt
              1996.

   [draft-hallambaker-mesh-developer]
              Hallam-Baker, P., "Mathematical Mesh: Reference
              Implementation", draft-hallambaker-mesh-developer-06 (work
              in progress), April 2018.

   [hallambaker-sin]
              "[Reference Not Found!]".

   [RFC1321]  Rivest, R., "The MD5 Message-Digest Algorithm", RFC 1321,
              DOI 10.17487/RFC1321, April 1992.

8.3.  URIs

   [1] http://mathmesh.com/Documents/draft-hallambaker-udf.html

Author's Address

   Phillip Hallam-Baker
   Comodo Group Inc.

   Email: philliph@comodo.com
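
   The encoding steps of Section 3 (binary fingerprint value,
   truncation and Base32 presentation) can be illustrated with the
   following Python sketch.  It is not the reference implementation.
   The function names binary_fingerprint, truncate and present are
   hypothetical, and the sketch assumes, based on the examples in
   Section 3.4, that the stated precision counts the leading Version ID
   byte.  Compressed presentation (Section 3.6) is not handled.

```python
import hashlib

# RFC 4648 Base32 alphabet (letters A-Z followed by digits 2-7).
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"

# Version IDs from Table 1 (uncompressed forms only).
DIGESTS = {96: hashlib.sha512, 144: hashlib.sha3_512}


def binary_fingerprint(data: bytes, content_type: str, version: int = 96) -> bytes:
    """Version-ID + H(Content-ID + ':' + H(Data)), as in Section 3.1."""
    if version not in DIGESTS:
        raise ValueError("unsupported version identifier")
    h = DIGESTS[version]
    inner = h(data).digest()
    outer = h(content_type.encode("utf-8") + b":" + inner).digest()
    return bytes([version]) + outer


def truncate(value: bytes, bits: int) -> bytes:
    """Keep the first `bits` bits; unused bits of the last byte are zeroed."""
    full, extra = divmod(bits, 8)
    if extra == 0:
        return value[:full]
    mask = (0xFF << (8 - extra)) & 0xFF  # e.g. extra = 1 gives 0x80
    return value[:full] + bytes([value[full] & mask])


def present(fingerprint: bytes, bits: int = 125) -> str:
    """Base32 text form in dash-separated groups of five characters."""
    if bits <= 0 or bits % 25 != 0 or bits > len(fingerprint) * 8:
        raise ValueError("precision must be a positive multiple of 25 bits")
    truncated = truncate(fingerprint, bits)
    # Drop the zero padding so exactly `bits` bits remain, then read
    # them five at a time from the most significant end.
    value = int.from_bytes(truncated, "big") >> (len(truncated) * 8 - bits)
    text = "".join(
        ALPHABET[(value >> shift) & 31] for shift in range(bits - 5, -1, -5)
    )
    return "-".join(text[i:i + 5] for i in range(0, len(text), 5))


if __name__ == "__main__":
    fp = binary_fingerprint(b"UDF Data Value", "text/plain")
    for precision in (100, 125, 150, 250):
        print(precision, present(fp, precision))
```

   Because truncation keeps the most significant bits, each lower
   precision presentation produced by this sketch is a prefix of the
   higher precision ones, as described in Section 1.4.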