idnits 2.17.1

draft-hallambaker-udf-06.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == Line 367 has weird spacing: '...a 47 cc ab fe...'

  == Line 370 has weird spacing: '...9 e0 bd ea 47...'

  == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD',
     or 'RECOMMENDED' is not an accepted usage according to RFC 2119.  Please
     use uppercase 'NOT' together with RFC 2119 keywords (if that is what you
     mean).

     Found 'MUST not' in this paragraph:

     Since PKIX certificates and CLRs contain security policy information,
     UDF fingerprints used to identify certificates or CRLs SHOULD be
     presented with a minimum of 200 bits of precision. PKIX applications
     MUST not accept UDF fingerprints specified with less than 200 bits of
     precision for purposes of identifying trust anchors.

  -- The document date (August 14, 2017) is 2440 days in the past.  Is this
     intentional?

  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  == Missing Reference: 'RFC2119' is mentioned on line 279, but not defined

     Summary: 0 errors (**), 0 flaws (~~), 5 warnings (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.
--------------------------------------------------------------------------------

2    Network Working Group                                    P. Hallam-Baker
3    Internet-Draft                                         Comodo Group Inc.
4    Intended status: Informational                           August 14, 2017
5    Expires: February 15, 2018

7                         Uniform Data Fingerprint (UDF)
8                            draft-hallambaker-udf-06

10   Abstract

12      This document is also available online at
13      http://prismproof.org/Documents/draft-hallambaker-udf.html .

15   Status of This Memo

17      This Internet-Draft is submitted in full conformance with the
18      provisions of BCP 78 and BCP 79.

20      Internet-Drafts are working documents of the Internet Engineering
21      Task Force (IETF).  Note that other groups may also distribute
22      working documents as Internet-Drafts.  The list of current Internet-
23      Drafts is at http://datatracker.ietf.org/drafts/current/.

25      Internet-Drafts are draft documents valid for a maximum of six months
26      and may be updated, replaced, or obsoleted by other documents at any
27      time.  It is inappropriate to use Internet-Drafts as reference
28      material or to cite them other than as "work in progress."

30      This Internet-Draft will expire on February 15, 2018.

32   Copyright Notice

34      Copyright (c) 2017 IETF Trust and the persons identified as the
35      document authors.  All rights reserved.

37      This document is subject to BCP 78 and the IETF Trust's Legal
38      Provisions Relating to IETF Documents
39      (http://trustee.ietf.org/license-info) in effect on the date of
40      publication of this document.  Please review these documents
41      carefully, as they describe your rights and restrictions with respect
42      to this document.  Code Components extracted from this document must
43      include Simplified BSD License text as described in Section 4.e of
44      the Trust Legal Provisions and are provided without warranty as
45      described in the Simplified BSD License.

47   Table of Contents

49      1.  Abstract
50      2.  Introduction
51        2.1.  Algorithm Identifier
52        2.2.  Content Type Identifier
53        2.3.  Representation
54        2.4.  Truncation
55      3.  Definitions
56        3.1.  Requirements Language
57      4.  Encoding
58        4.1.  Binary Fingerprint Value
59          4.1.1.  Version ID
60        4.2.  Truncation
61        4.3.  Base32 Representation
62        4.4.  Examples
63          4.4.1.  Using SHA-2-512 Digest
64        4.5.  Fingerprint Improvement
65        4.6.  Compressed Presentation
66        4.7.  Identifiers formed using UDFs
67          4.7.1.  URI Representation
68          4.7.2.  DNS Name
69      5.  Content Types
70        5.1.  PKIX Certificates and Keys
71        5.2.  OpenPGP Key
72        5.3.  DNSSEC
73      6.  Additional UDF Renderings
74        6.1.  Machine Readable Rendering
75        6.2.  Word Lists
76        6.3.  Image List
77      7.  Security Considerations
78        7.1.  Work Factor and Precision
79        7.2.  Semantic Substitution
80      8.  IANA Considerations
81        8.1.  URI Registration
82        8.2.  Content Type Registration
83        8.3.  Version Registry
84      9.  References
85        9.1.  Normative References
86        9.2.  Informative References
87      Author's Address

89   1.  Abstract

91      This document describes means of generating Uniform Data Fingerprint
92      (UDF) values and their presentation as text sequences and as URIs.

94      Cryptographic digests provide a means of uniquely identifying static
95      data without the need for a registration authority.  A fingerprint is
96      a form of presenting a cryptographic digest that makes it suitable
97      for use in applications where human readability is required.  The UDF
98      fingerprint format improves over existing formats through the
99      introduction of a compact algorithm identifier affording an
100     intentionally limited choice of digest algorithm and the inclusion of
101     an IANA registered MIME Content-Type identifier within the scope of
102     the digest input to allow the use of a single fingerprint format in
103     multiple application domains.

105     Alternative means of rendering fingerprint values are considered
106     including machine-readable codes, word and image lists.

108  2.  Introduction

110     The use of cryptographic digest functions to produce identifiers is
111     well established as a means of generating a unique identifier for
112     fixed data without the need for a registration authority.

114     While the use of fingerprints of public keys was popularized by PGP,
115     they are employed in many other applications including OpenPGP, SSH,
116     Bitcoin and PKIX.

118     A cryptographic digest is a particular form of hash function that has
119     the properties:

121     o  It is easy to compute the digest value for any given message

123     o  It is infeasible to generate a message from its digest value

125     o  It is infeasible to modify a message without changing the digest
126        value

128     o  It is infeasible to find two different messages with the same
129        digest value.

131     If these properties are met, the only way that two data objects can
132     map to the same digest value is by random chance.  If the number of
133     possible digest values is sufficiently large (i.e. the digest is a
134     sufficiently large number of bits in length), this chance is reduced
135     to an arbitrarily small probability.  Such values are described as
136     being probabilistically unique.

138     A fingerprint is a representation of a cryptographic digest value
139     optimized for purposes of verification and in some cases data entry.

141  2.1.  Algorithm Identifier

143     Although a secure cryptographic digest algorithm has properties that
144     make it ideal for certain types of identifier use, several
145     cryptographic digest algorithms have found widespread use, some of
146     which have been demonstrated to be insecure.

148     For example, the MD5 message digest algorithm [RFC1321]
149     was widely used in IETF protocols until it was demonstrated to be
150     vulnerable to collision attacks [Dobertin95].

152     The secure use of a fingerprint scheme therefore requires the digest
153     algorithm to either be fixed or otherwise determined by the
154     fingerprint value itself.  Otherwise an attacker may be able to use a
155     weak, broken digest algorithm to generate a data object matching a
156     fingerprint value generated using a strong digest algorithm.

158     The two digest algorithms currently used in the UDF scheme are both
159     believed to be strong.  These are SHA-2-512 [SHA-2] and
160     SHA-3-512 [SHA-3].  The most secure, 512 bit version of the
161     algorithm is used in both cases although the output is almost
162     invariably truncated to a shorter length.  Use of the strongest
163     version of the algorithm in every circumstance eliminates the need to
164     negotiate the algorithm strength.

166  2.2.  Content Type Identifier

168     A secure cryptographic digest algorithm provides a digest value
169     that is probabilistically unique for a particular byte sequence
170     but does not fix the context in which a byte sequence is interpreted.
171     While such ambiguity may be tolerated in a fingerprint format
172     designed for a single specific field of use, it is not acceptable in
173     a general purpose format.

175     For example, the SSH and OpenPGP applications both make use of
176     fingerprints as identifiers for the public keys used but using
177     different digest algorithms and data formats for representing the
178     public key data.  While no such vulnerability has been demonstrated
179     to date, it is certainly conceivable that a crafty attacker might
180     construct an SSH key in such a fashion that OpenPGP interprets the
181     data in an insecure fashion.  If the number of applications making
182     use of a fingerprint format that permits such substitutions is
183     sufficiently large, the probability of a semantic substitution
184     vulnerability being possible becomes unacceptably large.

186     A simple control that defeats such attacks is to incorporate a
187     content type identifier within the scope of the data input to the
188     hash function.

190  2.3.  Representation

192     The representation of a fingerprint is the format in which it is
193     presented to either an application or the user.

195     Base32 encoding is used to produce the preferred text representation
196     of a UDF fingerprint.  This encoding uses only the letters of the
197     Latin alphabet together with digits chosen to minimize the risk of
198     ambiguity between numbers and letters (2, 3, 4, 5, 6 and 7).
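
As a minimal sketch of the alphabet described above, the following Python
fragment encodes bytes with the standard library's RFC 4648 Base32 encoder
and strips the trailing padding.  The helper name base32_text is
hypothetical and is not taken from this document; dropping the "="
padding is an assumption made for readability.

```python
import base64

# RFC 4648 Base32 alphabet: the 26 Latin letters plus the digits 2-7.
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"


def base32_text(value: bytes) -> str:
    """Encode bytes as Base32 text without '=' padding (hypothetical helper)."""
    return base64.b32encode(value).decode("ascii").rstrip("=")


if __name__ == "__main__":
    sample = base32_text(bytes(range(256)))
    # The digits 0, 1, 8 and 9 never appear in the output.
    print(set(sample) <= set(ALPHABET))
```

Because the digits 0, 1, 8 and 9 are never emitted, a reader cannot
confuse them with the letters O, I or B.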

200     To enhance readability and improve data entry, characters are grouped
201     into groups of five.

203  2.4.  Truncation

205     Different applications of fingerprints demand different tradeoffs
206     between compactness of the representation and the number of
207     significant bits.  A larger number of significant bits reduces
208     the risk of collision but at a cost to convenience.

210     Modern cryptographic digest functions such as SHA-2 produce output
211     values of at least 256 bits in length.  This is considerably larger
212     than most uses of fingerprints require and certainly greater than can
213     be represented in human readable form on a business card.

215     Since a strong cryptographic digest function produces an output value
216     in which every bit in the input value affects every bit in the output
217     value with equal probability, it follows that truncating the digest
218     value to produce a fingerprint is at least as strong as any other
219     mechanism if the digest algorithm used is strong.

221     Using truncation to reduce the precision of the digest function has
222     the advantage that a lower precision fingerprint of some data content
223     is always a prefix of a higher precision fingerprint of the same
224     content.  This allows higher precision fingerprints to be converted
225     to a lower precision without the need for special tools.

227  3.  Definitions

229     Cryptographic Digest Function

231        A hash function that has the properties required for use as a
232        cryptographic hash function.  These include collision resistance,
233        first pre-image resistance and second pre-image resistance.

235        An identifier indicating how a Data Value is to be interpreted as
236        specified in the IANA registry Media Types.

238        The binary octet stream that is the input to the digest function
239        used to calculate a digest value.

241        A Data Value and its associated Content Type

243        A synonym for Cryptographic Digest Function

245        The output of a Cryptographic Digest Function

247        The output of a Cryptographic Digest Function for a given Data
248        Value input.

250        A presentation of the digest value of a data value or data object.

252        The representation of at least some part of a fingerprint value in
253        human or machine readable form.

255        The practice of recording a higher precision presentation of a
256        fingerprint on successful validation.

258        The practice of generating a sequence of fingerprints until one is
259        found that matches criteria that permit a compressed presentation
260        form to be used.  The compressed fingerprint is thus shorter
261        than an uncompressed one but presents the same work factor.

263        A function which takes an input and returns a fixed-size output.
264        Ideally, the output of a hash function is unbiased and not
265        correlated to the outputs returned to similar inputs in any
266        predictable fashion.

268        The number of significant bits provided by a Fingerprint
269        Presentation.

271        A measure of the computational effort required to perform an
272        attack against some security property.

274  3.1.  Requirements Language

276     The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
277     "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
278     document are to be interpreted as described in RFC 2119 [RFC2119].

280  4.  Encoding

282     A UDF fingerprint for a given data object is generated by calculating
283     the Binary Fingerprint Value for the given data object and type
284     identifier, truncating it to obtain the desired degree of precision
285     and then converting the truncated value to a representation.

287  4.1.  Binary Fingerprint Value

289     The binary encoding of a fingerprint is calculated using the formula:

291     Fingerprint = <Version-ID> + H (<Content-ID> + ':' + H(<Data>))

293                                  Figure 1

295     Where

297     H(x) is the cryptographic digest function
298     <Version-ID> is the fingerprint version and algorithm identifier.
299     <Content-ID> is the MIME Content-Type of the data.
300     <Data> is the binary data.

302                                  Figure 2

304     The use of the nested hash function permits a fingerprint to be taken
305     of data for which a digest value is already known without the need to
306     calculate a new digest over the data.

308     The inclusion of a MIME content type prevents message substitution
309     attacks in which one content type is substituted for another.

311  4.1.1.  Version ID

313     A Version Identifier consists of a single byte.  The following digest
314     algorithm identifiers are specified in this document:

316               +-----------------+------------------------+-----------------+
317               | Version ID      | Algorithm              | Reference       |
318               +-----------------+------------------------+-----------------+
319               | 96              | SHA-2-512              |                 |
320               | 97, 98, 99, 100 | SHA-2-512 (compressed) |                 |
321               | 144             | SHA-3-512              |                 |
322               +-----------------+------------------------+-----------------+

324                                   Table 1

326     These algorithm identifiers have been chosen so that the first
327     character in a SHA-2-512 fingerprint will always be 'M' and the first
328     character in a SHA-3-512 fingerprint will always be 'S'.  These
329     provide mnemonics for 'Merkle-Damgard' and 'Sponge' respectively.

331  4.2.  Truncation

333     The Binary Fingerprint Value is truncated to an integer multiple of
334     25 bits regardless of the intended output presentation.

336     The output of the hash function is truncated to a sequence of n bits
337     by first selecting the first n/8 bytes of the output function.  If n
338     is an integer multiple of 8, no additional bits are required and this
339     is the result.  Otherwise the remaining bits are taken from the most
340     significant bits of the next byte and any unused bits set to 0.

342     For example, to truncate the byte sequence [a0, b1, c2, d3, e4] to 25
343     bits: 25/8 = 3 bytes with 1 bit remaining.  The first three bytes of
344     the truncated sequence are [a0, b1, c2] and the final byte is d3 AND
345     80 = 80, which we append to the previous result to obtain the final
346     truncated sequence [a0, b1, c2, 80].

348  4.3.  Base32 Representation

350     A modified version of Base32 [RFC4648] encoding is used to
351     present the fingerprint in text form, grouping the output text into
352     groups of five characters separated by a dash '-'.  This
353     representation improves the accuracy of both data entry and
354     verification.

356  4.4.  Examples

358     In the following examples, <Content-ID> is the UTF8 encoding of the
359     string "text/plain" and <Data> is the UTF8 encoding of the string
360     "UDF Data Value"

362     Data = 55 44 46 20 44 61 74 61 20 56 61 6c 75 65

364  4.4.1.  Using SHA-2-512 Digest

366     H(<Data>) =
367     48 da 47 cc ab fe a4 5c 76 61 d3 21 ba 34 3e 58
368     10 87 2a 03 b4 02 9d ab 84 7c ce d2 22 b6 9c ab
369     02 38 d4 e9 1e 2f 6b 36 a0 9e ed 11 09 8a ea ac
370     99 d9 e0 bd ea 47 93 15 bd 7a e9 e1 2e ad c4 15
371     H(H( ) + Content-ID>) =
372     45 e0 59 e0 39 34 ea b7 f6 5d 83 b2 d8 f9 b1 6d
373     2a 6b 08 63 d9 3c c1 02 86 7b 83 49 f2 d9 f0 8f
374     fe 07 87 30 c7 c9 05 74 ac a1 38 2b b3 14 4d c6
375     39 f9 8c 12 c0 4a 3e b5 05 0b 3e 67 df 52 4b 57

377                                  Figure 3

379     MB2GK-6DUF5-YGYYL-JNY5E

381     MB2GK-6DUF5-YGYYL-JNY5E-RWSHZ

383     MB2GK-6DUF5-YGYYL-JNY5E-RWSHZ-SV75J

385     MB2GK-6DUF5-YGYYL-JNY5E-RWSHZ-SV75J-C4OZQ-5GIN2-GQ7FQ-EEHFI

387  4.5.  Fingerprint Improvement

389     Since an application must always calculate the full fingerprint value
390     as part of the verification process, an application MAY record a
391     higher precision presentation of the fingerprint on successful
        validation.

392     Applications are encouraged to make use of the practice of
393     fingerprint improvement wherever possible.

395  4.6.  Compressed Presentation

397     Fingerprint compression permits the use of shorter fingerprint
398     presentation without a reduction in the attacker work factor by
399     requiring the fingerprint value to match a particular pattern.
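
The uncompressed encoding steps of Sections 4.1 to 4.3 can be sketched
end to end as follows.  This is an illustrative sketch under stated
assumptions, not a reference implementation: the function names are
hypothetical, the content type is assumed to be joined to the inner
digest with a single ':' byte, and padding characters are assumed to be
dropped from the Base32 text.  Compression is not attempted.

```python
import base64
import hashlib

VERSION_SHA2_512 = 96  # Version ID for uncompressed SHA-2-512 (Table 1)


def binary_fingerprint(content_type: str, data: bytes) -> bytes:
    """Version ID byte followed by H(content type + ':' + H(data))."""
    inner = hashlib.sha512(data).digest()
    outer = hashlib.sha512(content_type.encode("utf-8") + b":" + inner).digest()
    return bytes([VERSION_SHA2_512]) + outer


def truncate_bits(value: bytes, n_bits: int) -> bytes:
    """Keep the first n_bits of value, zeroing unused bits of the last byte."""
    if n_bits > 8 * len(value):
        raise ValueError("not enough input bits")
    whole, extra = divmod(n_bits, 8)
    out = bytearray(value[:whole])
    if extra:
        mask = (0xFF << (8 - extra)) & 0xFF  # top `extra` bits set
        out.append(value[whole] & mask)
    return bytes(out)


def present(fingerprint: bytes, n_bits: int) -> str:
    """Base32 text of the first n_bits in dash-separated groups of five."""
    if n_bits <= 0 or n_bits % 25:
        raise ValueError("precision must be a positive multiple of 25 bits")
    truncated = truncate_bits(fingerprint, n_bits)
    text = base64.b32encode(truncated).decode("ascii").rstrip("=")
    text = text[: n_bits // 5]  # drop characters that carry only zero fill
    return "-".join(text[i:i + 5] for i in range(0, len(text), 5))


if __name__ == "__main__":
    fp = binary_fingerprint("text/plain", b"UDF Data Value")
    print(present(fp, 100))
    print(present(fp, 125))
```

Since 25 bits correspond to exactly five Base32 characters, every
permitted precision yields whole groups, and a lower precision
presentation is a prefix of any higher precision one.  Changing only the
content type argument changes the whole fingerprint, which is the
property Section 4.1 relies on.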

401     UDF fingerprints MUST use compression if possible.  A compressed
402     fingerprint uses a version identifier that specifies the form of
403     compression used as follows:

405                    +------------+-------------------------+
406                    | Version ID | Compression             |
407                    +------------+-------------------------+
408                    | 96         | None                    |
409                    | 97         | First 25 bits are zeros |
410                    | 98         | First 40 bits are zeros |
411                    | 99         | First 50 bits are zeros |
412                    | 100        | First 55 bits are zeros |
413                    +------------+-------------------------+

415                                   Table 2

417     Thus, the fingerprint that would be represented in uncompressed form
418     as MAAAA-AAWIY-LTMFTG-CZTRO is instead represented as
419     MIWIY-LTMFTG-CZTRO.

421  4.7.  Identifiers formed using UDFs

423     UDF fingerprints MAY be used to form a part of another protocol
424     identifier.  Such practice carries the implicit semantic that the
425     interpretation of the identifier formed is bound to the document
426     identified by the fingerprint.

428  4.7.1.  URI Representation

430     Any UDF fingerprint MAY be encoded as a URI by prefixing the Base32
431     text representation of the fingerprint with the string 'udf:'

433  4.7.2.  DNS Name

435     A UDF fingerprint MAY be encoded as a DNS label by prefixing the
436     Base32 text representation with the string 'mm--'.

438     A DNS name that includes a UDF fingerprint as a DNS label carries the
439     implicit assertion that the interpretation of the address MUST be
440     authorized by a security policy that is validated under a key that
441     matches the corresponding fingerprint.

443     Placing such a DNS label as the top level (rightmost) label in a DNS
444     address creates an address that is not legal and thus cannot be
445     resolved by the Internet DNS infrastructure.  This ensures that the
446     address is rejected by applications that are not capable of
447     performing the associated validation steps.

449     For example, Alice has the email security key with fingerprint
450     MB2GK-6DUF5-YGYYL-JNY5E.  She uses the following email addresses:

452        Alice publishes this email address when she does not want the
453        other party to use the secure email system.

455        Alice publishes this email address when she wants to give the
456        other party the option of using secure email if their system
457        supports it.

459        The DNS server for example.com has been configured to redirect
460        requests to resolve zz--mb2gk-6duf5-ygyyl-jny5e.example.com to the
461        mail server example.com.

463        Alice uses this email address when she wants the other party to be
464        able to send her email if and only if their client supports use of
465        the secure messaging system.

467     While there should never be a DNS label of the form mm--* in the
468     authoritative DNS root, such labels MAY be introduced by a trusted
469     local resolver.  This would allow attempts at making an untrusted
470     communication request to be transparently redirected through a
471     locally trusted security enhancing proxy.

473  5.  Content Types

475     While a UDF fingerprint MAY be used to identify any form of static
476     data, the use of a UDF fingerprint to identify a public key signature
477     key provides a level of indirection and thus the ability to identify
478     dynamic data.  The content types used to identify public keys are
479     thus of particular interest.

481     As described in the security considerations section, the use of
482     fingerprints to identify a bare public key and the use of
483     fingerprints to identify a public key and associated security policy
484     information are very different.

486  5.1.  PKIX Certificates and Keys

488     UDF fingerprints MAY be used to identify PKIX certificates, CRLs and
489     public keys in the ASN.1 encoding used in PKIX certificates.

491     Since PKIX certificates and CLRs contain security policy information,
492     UDF fingerprints used to identify certificates or CRLs SHOULD be
493     presented with a minimum of 200 bits of precision.  PKIX applications
494     MUST not accept UDF fingerprints specified with less than 200 bits of
495     precision for purposes of identifying trust anchors.

497     PKIX certificates, keys and related content data are identified by
498     the following content types:

500        A PKIX Certificate

502        A PKIX CRL

504        The KeyInfo structure defined in the PKIX certificate
505        specification

507  5.2.  OpenPGP Key

509     OpenPGPv5 keys and key set content data are identified by the
510     following content types:

512        An OpenPGP key

514        An OpenPGP key set.

516  5.3.  DNSSEC

518     DNSSEC record data consists of DNS records which are identified by
519     the following content type:

521        A DNS resource record in binary format

523  6.  Additional UDF Renderings

525     By default, a UDF fingerprint is rendered in the Base32 encoding
526     described in this document.  Additional renderings MAY be employed to
527     facilitate entry and/or verification of fingerprint values.

529  6.1.  Machine Readable Rendering

531     The use of a machine-readable rendering such as a QR Code allows a
532     UDF value to be input directly using a smartphone or other device
533     equipped with a camera.

535     A QR code fixed to a network capable device might contain the
536     fingerprint of a machine readable description of the device.

538  6.2.  Word Lists

540     The use of a Word List to encode fingerprint values was introduced by
541     Patrick Juola and Philip Zimmermann for the PGPfone application.  The
542     PGP Word List is designed to facilitate exchange and verification of
543     fingerprint values in a voice application.  To minimize the risk of
544     misinterpretation, two word lists of 256 values each are used to
545     encode alternate fingerprint bytes.  The compact size of the lists
546     used allowed the compilers to curate them so as to maximize the
547     phonetic distance of the words selected.

549     The PGP Word List is designed to achieve a balance between ease of
550     entry and verification.  Applications where only verification is
551     required may be better served by a much larger word list, permitting
552     shorter fingerprint encodings.

554     For example, a word list with 16384 entries permits 14 bits of the
555     fingerprint to be encoded at once, 65536 entries permits 16.  These
556     encodings allow a 125 bit fingerprint to be encoded in 9 and 8 words
557     respectively.

559  6.3.  Image List

561     An image list is used in the same manner as a word list affording
562     rapid visual verification of a fingerprint value.  For obvious
563     reasons, this approach is not generally suited to data entry.

565  7.  Security Considerations

567  7.1.  Work Factor and Precision

569     A given UDF data object has a single fingerprint value that may be
570     presented at different precisions.  The shortest legitimate precision
571     with which a UDF fingerprint may be presented has 96 significant bits.

573     A UDF fingerprint presents the same work factor as any other
574     cryptographic digest function.  The difficulty of finding a second
575     data item that matches a given fingerprint is 2^n and the difficulty
576     of finding two data items that have the same fingerprint is 2^(n/2),
577     where n is the precision of the fingerprint.

579     For the algorithms specified in this document, n = 512 and thus the
580     work factor for finding collisions is 2^256, a value that is
581     generally considered to be computationally infeasible.

583     Since the use of 512 bit fingerprints is impractical in the type of
584     applications where fingerprints are generally used, truncation is a
585     practical necessity.  The longer a fingerprint is, the less likely it
586     is that a user will check every character.  It is therefore important
587     to consider carefully whether the security of an application depends
588     on second pre-image resistance or collision resistance.
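
The trade-off described above can be made concrete with a short worked
calculation.  The sketch below simply evaluates the 2^n and 2^(n/2)
estimates from this section for a few precisions.  The function name is
hypothetical, and treating every presented bit as contributing to the
work factor is a simplifying assumption.

```python
def work_factor_exponents(precision_bits: int) -> tuple[int, float]:
    """Return the base-2 exponents of the second pre-image and collision work."""
    return precision_bits, precision_bits / 2


if __name__ == "__main__":
    for n in (125, 150, 200, 250, 512):
        second, collision = work_factor_exponents(n)
        print(f"{n:>3} bits: second pre-image 2^{second}, collision 2^{collision:g}")
```

Under these estimates, an application that only needs second pre-image
resistance can tolerate roughly half the precision of one that must also
resist collisions.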

590     In most fingerprint applications, such as the use of fingerprints to
591     identify public keys, the fact that a malicious party might generate
592     two keys that have the same fingerprint value is a minor concern.
593     Combined with a flawed protocol architecture, such a vulnerability
594     may permit an attacker to construct a document such that the
595     signature will be accepted as valid by some parties but not by
596     others.

598     For example, Alice generates keypairs until two are generated that
599     have the same 100 bit UDF presentation (typically 2^48 attempts).
600     She registers one keypair with a merchant and the other with her
601     bank.  This allows Alice to create a payment instrument that will be
602     accepted as valid by one and rejected by the other.

604     The ability to generate two PKIX certificates with the same
605     fingerprint and different certificate attributes raises very
606     different and more serious security concerns.  For example, an
607     attacker might generate two certificates with the same key and
608     different use constraints.  This might allow an attacker to present a
609     highly constrained certificate that does not present a security risk
610     to an application for purposes of gaining approval and an
611     unconstrained certificate to request a malicious action.

613     In general, any use of fingerprints to identify data that has
614     security policy semantics requires the risk of collision attacks to
615     be considered.  For this reason short, 'user friendly'
616     fingerprint presentations (less than 200 bits) SHOULD only be used
617     for public key values.

619  7.2.  Semantic Substitution

621     Many applications record the fact that a data item is trusted; rather
622     fewer record the circumstances in which the data item is trusted.
623     This results in a semantic substitution vulnerability which an
624     attacker may exploit by presenting the trusted data item in the wrong
625     context.

627     The UDF format provides protection against high level semantic
628     substitution attacks by incorporating the content type into the input
629     to the outermost fingerprint digest function.  The work factor for
630     generating a UDF fingerprint that is valid in both contexts is thus
631     the same as the work factor for finding a second preimage in the
632     digest function (2^512 for the specified digest algorithms).

634     It is thus infeasible to generate a data item such that some
635     applications will interpret it as a PKIX key and others will accept
636     it as an OpenPGP key.  While attempting to parse a PKIX key as an
637     OpenPGP key is virtually certain to fail to return the correct key
638     parameters, it cannot be assumed that the attempt is guaranteed to
639     fail with an error message.

641     The UDF format does not provide protection against semantic
642     substitution attacks that do not affect the content type.

644  8.  IANA Considerations

646     [This will be extended later]

648  8.1.  URI Registration

650     [Here a URI registration for the udf: scheme]

652  8.2.  Content Type Registration

654     [application/pkix-keyinfo]

656     [application/pgp-key]

658  8.3.  Version Registry

660     96 = SHA-2-512
661     97 = SHA-2-512 with 25 leading zeros
662     98 = SHA-2-512 with 40 leading zeros
663     99 = SHA-2-512 with 50 leading zeros
664     100 = SHA-2-512 with 55 leading zeros
665     144 = SHA-3-512

667                                  Figure 4

669  9.  References

671  9.1.  Normative References

673     [RFC4648]  Josefsson, S., "The Base16, Base32, and Base64 Data
674                Encodings", RFC 4648, DOI 10.17487/RFC4648, October 2006.

676     [SHA-2]    "[Reference Not Found!]".

678     [SHA-3]    "[Reference Not Found!]".

680  9.2.  Informative References

682     [Dobertin95]
683                "[Reference Not Found!]".

685     [RFC1321]  Rivest, R., "The MD5 Message-Digest Algorithm", RFC 1321,
686                DOI 10.17487/RFC1321, April 1992.

688  Author's Address

690     Phillip Hallam-Baker
691     Comodo Group Inc.

693     Email: philliph@comodo.com