idnits 2.17.1 

draft-hallambaker-udf-10.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The abstract seems to contain references ([1]), which it shouldn't. 
     Please replace those with straight textual mentions of the documents in
     question.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD',
     or 'RECOMMENDED' is not an accepted usage according to RFC 2119.  Please
     use uppercase 'NOT' together with RFC 2119 keywords (if that is what you
     mean).
     
     Found 'MUST not' in this paragraph:
     
     Since PKIX certificates and CLRs contain security policy
     information, UDF fingerprints used to identify certificates or CRLs
     SHOULD be presented with a minimum of 200 bits of precision.  PKIX
     applications MUST not accept UDF fingerprints specified with less than
     200 bits of precision for purposes of identifying trust anchors.

  -- The document date (April 11, 2018) is 2197 days in the past.  Is this
     intentional?


  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  -- Looks like a reference, but probably isn't: '1' on line 762

  == Missing Reference: 'RFC2119' is mentioned on line 240, but not defined


     Summary: 1 error (**), 0 flaws (~~), 3 warnings (==), 2 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Network Working Group                                    P. Hallam-Baker
3	Internet-Draft                                         Comodo Group Inc.
4	Intended status: Informational                            April 11, 2018
5	Expires: October 13, 2018

7	                     Uniform Data Fingerprint (UDF)
8	                        draft-hallambaker-udf-10

10	Abstract

12	   This document describes means of generating Uniform Data Fingerprint
13	   (UDF) values and their presentation as text sequences and as URIs.
14	   Uses of UDF fingerprints include but are not limited to creating
15	   Strong Internet Names (SINs).

17	   Cryptographic digests provide a means of uniquely identifying static
18	   data without the need for a registration authority.  A fingerprint is
19	   a form of presenting a cryptographic digest that makes it suitable
20	   for use in applications where human readability is required.  The UDF
21	   fingerprint format improves over existing formats through the
22	   introduction of a compact algorithm identifier affording an
23	   intentionally limited choice of digest algorithm and the inclusion of
24	   an IANA registered MIME Content-Type identifier within the scope of
25	   the digest input to allow the use of a single fingerprint format in
26	   multiple application domains.

28	   Alternative means of rendering fingerprint values are considered
29	   including machine-readable codes, word and image lists.

31	   This document is also available online at
32	   http://mathmesh.com/Documents/draft-hallambaker-udf.html [1] .

34	Status of This Memo

36	   This Internet-Draft is submitted in full conformance with the
37	   provisions of BCP 78 and BCP 79.

39	   Internet-Drafts are working documents of the Internet Engineering
40	   Task Force (IETF).  Note that other groups may also distribute
41	   working documents as Internet-Drafts.  The list of current Internet-
42	   Drafts is at https://datatracker.ietf.org/drafts/current/.

44	   Internet-Drafts are draft documents valid for a maximum of six months
45	   and may be updated, replaced, or obsoleted by other documents at any
46	   time.  It is inappropriate to use Internet-Drafts as reference
47	   material or to cite them other than as "work in progress."
48	   This Internet-Draft will expire on October 13, 2018.

50	Copyright Notice

52	   Copyright (c) 2018 IETF Trust and the persons identified as the
53	   document authors.  All rights reserved.

55	   This document is subject to BCP 78 and the IETF Trust's Legal
56	   Provisions Relating to IETF Documents
57	   (https://trustee.ietf.org/license-info) in effect on the date of
58	   publication of this document.  Please review these documents
59	   carefully, as they describe your rights and restrictions with respect
60	   to this document.  Code Components extracted from this document must
61	   include Simplified BSD License text as described in Section 4.e of
62	   the Trust Legal Provisions and are provided without warranty as
63	   described in the Simplified BSD License.

65	Table of Contents

67	   1.  Introduction
68	     1.1.  Algorithm Identifier
69	     1.2.  Content Type Identifier
70	     1.3.  Representation
71	     1.4.  Truncation
72	   2.  Definitions
73	     2.1.  Requirements Language
74	     2.2.  Defined Terms
75	     2.3.  Related Specifications
76	     2.4.  Implementation Status
77	   3.  Encoding
78	     3.1.  Binary Fingerprint Value
79	       3.1.1.  Version ID
80	     3.2.  Truncation
81	     3.3.  Base32 Representation
82	     3.4.  Example Encoding
83	       3.4.1.  Using SHA-2-512 Digest
84	       3.4.2.  Using SHA-3-512 Digest
85	     3.5.  Fingerprint Improvement
86	     3.6.  Compressed Presentation
87	     3.7.  Example of Compressed Encoding.
88	       3.7.1.  Example
89	   4.  Content Types
90	     4.1.  PKIX Certificates and Keys
91	     4.2.  OpenPGP Key
92	     4.3.  DNSSEC
93	   5.  Additional UDF Renderings
94	     5.1.  Machine Readable Rendering
95	     5.2.  Word Lists
96	     5.3.  Image List
97	   6.  Security Considerations
98	     6.1.  Work Factor and Precision
99	     6.2.  Semantic Substitution
100	   7.  IANA Considerations
101	     7.1.  URI Registration
102	     7.2.  Content Type Registration
103	     7.3.  Version Registry
104	   8.  References
105	     8.1.  Normative References
106	     8.2.  Informative References
107	     8.3.  URIs
108	   Author's Address

110	1.  Introduction

112	   The use of cryptographic digest functions to produce identifiers is
113	   well established as a means of generating a unique identifier for
114	   fixed data without the need for a registration authority.

116	   While the use of fingerprints of public keys was popularized by PGP,
117	   they are employed in many other applications including OpenPGP, SSH,
118	   Bitcoin and PKIX.

120	   A cryptographic digest is a particular form of hash function that has
121	   the properties:

123	   o  It is easy to compute the digest value for any given message

125	   o  It is infeasible to generate a message from its digest value

127	   o  It is infeasible to modify a message without changing the digest
128	      value

130	   o  It is infeasible to find two different messages with the same
131	      digest value.

133	   If these properties are met, the only way that two data objects can
134	   map to the same digest value is by random chance.  If the number of
135	   possible digest values is sufficiently large (i.e. is a sufficiently
136	   large number of bits in length), this chance is reduced to an
137	   arbitrarily infinitesimal probability.  Such values are described as
138	   being probabilistically unique.

140	   A fingerprint is a representation of a cryptographic digest value
141	   optimized for purposes of verification and in some cases data entry.

143	1.1.  Algorithm Identifier

145	   Although a secure cryptographic digest algorithm has properties that
146	   make it ideal for certain types of identifier use, several
147	   cryptographic digest algorithms have found widespread use, some of
148	   which have been demonstrated to be insecure.

150	   For example, the MD5 message digest algorithm [RFC1321] was widely
151	   used in IETF protocols until it was demonstrated to be vulnerable to
152	   collision attacks [Dobertin95].

154	   The secure use of a fingerprint scheme therefore requires the digest
155	   algorithm to either be fixed or otherwise determined by the
156	   fingerprint value itself.  Otherwise an attacker may be able to use a
157	   weak, broken digest algorithm to generate a data object matching a
158	   fingerprint value generated using a strong digest algorithm.

160	   The two digest algorithms currently used in the UDF scheme are both
161	   believed to be strong.  These are SHA-2-512 [SHA-2] and SHA-3-512
162	   [SHA-3].  The most secure, 512 bit version of the algorithm is used
163	   in both cases although the output is almost invariably truncated to a
164	   shorter length.  Use of the strongest version of the algorithm in
165	   every circumstance eliminates the need to negotiate the algorithm
166	   strength.

168	1.2.  Content Type Identifier

170	   A secure cryptographic digest algorithm provides a digest value
171	   that is probabilistically unique for a particular byte sequence
172	   but does not fix the context in which a byte sequence is interpreted.
173	   While such ambiguity may be tolerated in a fingerprint format
174	   designed for a single specific field of use, it is not acceptable in
175	   a general purpose format.

177	   For example, the SSH and OpenPGP applications both make use of
178	   fingerprints as identifiers for the public keys used but using
179	   different digest algorithms and data formats for representing the
180	   public key data.  While no such vulnerability has been demonstrated
181	   to date, it is certainly conceivable that a crafty attacker might
182	   construct an SSH key in such a fashion that OpenPGP interprets the
183	   data in an insecure fashion.  If the number of applications making
184	   use of a fingerprint format that permits such substitutions is
185	   sufficiently large, the probability of a semantic substitution
186	   vulnerability being possible becomes unacceptably large.

188	   A simple control that defeats such attacks is to incorporate a
189	   content type identifier within the scope of the data input to the
190	   hash function.

192	1.3.  Representation

194	   The representation of a fingerprint is the format in which it is
195	   presented to either an application or the user.

197	   Base32 encoding is used to produce the preferred text representation
198	   of a UDF fingerprint.  This encoding uses only the letters of the
199	   Latin alphabet with numbers chosen to minimize the risk of ambiguity
200	   between numbers and letters (2, 3, 4, 5, 6 and 7).

202	   To enhance readability and improve data entry, characters are grouped
203	   into groups of five.

205	1.4.  Truncation

207	   Different applications of fingerprints demand different tradeoffs
208	   between compactness of the representation and the number of
209	   significant bits.  A larger number of significant bits reduces
210	   the risk of collision but at a cost to convenience.

212	   Modern cryptographic digest functions such as SHA-2 produce output
213	   values of at least 256 bits in length.  This is considerably larger
214	   than most uses of fingerprints require and certainly greater than can
215	   be represented in human readable form on a business card.

217	   Since a strong cryptographic digest function produces an output value
218	   in which every bit in the input value affects every bit in the output
219	   value with equal probability, it follows that truncating the digest
220	   value to produce a fingerprint is at least as strong as any other
221	   mechanism if the digest algorithm used is strong.

223	   Using truncation to reduce the precision of the digest function has
224	   the advantage that a lower precision fingerprint of some data content
225	   is always a prefix of a higher precision fingerprint of the same
226	   content.  This
226	   allows higher precision fingerprints to be converted to a lower
227	   precision without the need for special tools.

229	2.  Definitions

231	   This section presents the related specifications and standards, the
232	   terms that are used as terms of art within the documents and the
233	   terms used as requirements language.

235	2.1.  Requirements Language

237	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
238	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
239	   document are to be interpreted as described in [RFC2119].

241	2.2.  Defined Terms

243	   Cryptographic Digest Function

245	   A hash function that has the properties required for use as a
246	   cryptographic hash function.  These include collision resistance,
247	   first pre-image resistance and second pre-image resistance.

249	   Content Type  An identifier indicating how a Data Value is to be
250	      interpreted as specified in the IANA registry Media Types.

252	   Data Value  The binary octet stream that is the input to the digest
253	      function used to calculate a digest value.

255	   Data Object  A Data Value and its associated Content Type

257	   Digest Algorithm  A synonym for Cryptographic Digest Function

259	   Digest Value  The output of a Cryptographic Digest Function

261	   Data Digest Value  The output of a Cryptographic Digest Function for
262	      a given Data Value input.

264	   Fingerprint  A presentation of the digest value of a data value or
265	      data object.

267	   Fingerprint Presentation  The representation of at least some part of
268	      a fingerprint value in human or machine readable form.

270	   Fingerprint Improvement  The practice of recording a higher precision
271	      presentation of a fingerprint on successful validation.

273	   Fingerprint Work Hardening  The practice of generating a sequence of
274	      fingerprints until one is found that matches criteria that permit
275	      a compressed presentation form to be used.  The compressed
276	      fingerprint thus being shorter than but presenting the same work
277	      factor as an uncompressed one.

279	   Hash  A function which takes an input and returns a fixed-size
280	      output.  Ideally, the output of a hash function is unbiased and
281	      not correlated to the outputs returned to similar inputs in any
282	      predictable fashion.

284	   Precision  The number of significant bits provided by a Fingerprint
285	      Presentation.

287	   Work Factor  A measure of the computational effort required to
288	      perform an attack against some security property.

290	2.3.  Related Specifications

292	   This specification makes use of Base32 [RFC4648] encoding, SHA-2
293	   [SHA-2] and SHA-3 [SHA-3] digest functions.

295	   UDFs are used in the definition of Strong Internet Names
296	   [hallambaker-sin].

298	2.4.  Implementation Status

300	   The implementation status of the reference code base is described in
301	   the companion document [draft-hallambaker-mesh-developer].

303	3.  Encoding

305	   A UDF fingerprint for a given data object is generated by calculating
306	   the Binary Fingerprint Value for the given data object and type
307	   identifier, truncating it to obtain the desired degree of precision
308	   and then converting the truncated value to a representation.

310	3.1.  Binary Fingerprint Value

312	   The binary encoding of a fingerprint is calculated using the formula:

314	   Fingerprint = <Version-ID> + H (<Content-ID> + ':' + H(<Data>))

316	                                 Figure 1

318	   Where

320	   H(x) is the cryptographic digest function
321	   <Version-ID> is the fingerprint version and algorithm identifier.
322	   <Content-ID> is the MIME Content-Type of the data.
323	   <Data> is the binary data.

325	                                 Figure 2

327	   The use of the nested hash function permits a fingerprint to be taken
328	   of data for which a digest value is already known without the need to
329	   calculate a new digest over the data.

331	   The inclusion of a MIME content type prevents message substitution
332	   attacks in which one content type is substituted for another.
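
The formula above can be sketched in Python as follows.  This is a
minimal illustration under stated assumptions, not the reference
implementation: the function name binary_fingerprint and the constant
names are hypothetical, the version identifiers are the uncompressed
values from Table 1, and <Content-ID> is assumed to be UTF8 encoded as
in the examples of Section 3.4.

```python
import hashlib

# Uncompressed version identifiers from Table 1 (names are illustrative).
VERSION_SHA2_512 = 96
VERSION_SHA3_512 = 144


def binary_fingerprint(content_id: str, data: bytes,
                       version_id: int = VERSION_SHA2_512) -> bytes:
    """Return <Version-ID> + H(<Content-ID> + ':' + H(<Data>))."""
    if version_id == VERSION_SHA2_512:
        digest = hashlib.sha512
    elif version_id == VERSION_SHA3_512:
        digest = hashlib.sha3_512
    else:
        raise ValueError("unsupported version identifier")
    inner = digest(data).digest()  # H(<Data>)
    outer_input = content_id.encode("utf-8") + b":" + inner
    # One version byte followed by the 64 byte outer digest.
    return bytes([version_id]) + digest(outer_input).digest()
```

Because the content type is part of the outer digest input, the same
data value yields unrelated fingerprints under different content types.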

334	3.1.1.  Version ID

336	   A Version Identifier consists of a single byte.  The following digest
337	   algorithm identifiers are specified in this document:

339	      +-----------------+------------------------+-----------+
340	      | Version ID      | Algorithm              | Reference |
341	      +-----------------+------------------------+-----------+
342	      | 96              | SHA-2-512              | [SHA-2]   |
343	      | 97, 98, 99, 100 | SHA-2-512 (compressed) | [SHA-2]   |
344	      | 144             | SHA-3-512              | [SHA-3]   |
345	      +-----------------+------------------------+-----------+

347	                                  Table 1

349	   These algorithm identifiers have been chosen so that the first
350	   character in a SHA-2-512 fingerprint will always be 'M' and the first
351	   character in a SHA-3-512 fingerprint will always be 'S'.  These
352	   provide mnemonics for "Merkle-Damgard" and "Sponge" respectively.

354	3.2.  Truncation

356	   The Binary Fingerprint Value is truncated to an integer multiple of
357	   25 bits regardless of the intended output presentation.

359	   The output of the hash function is truncated to a sequence of n bits
360	   by first selecting the first floor(n/8) bytes of the output.  If n
361	   is an integer multiple of 8, no additional bits are required and this
362	   is the result.  Otherwise the remaining bits are taken from the most
363	   significant bits of the next byte and any unused bits set to 0.

365	   For example, to truncate the byte sequence [a0, b1, c2, d3, e4] to 25
366	   bits: 25/8 = 3 bytes with 1 bit remaining, so the first three bytes
367	   of the truncated sequence are [a0, b1, c2] and the final byte is d3
368	   AND 80 = 80, which we append to the previous result to obtain the
369	   final truncated sequence [a0, b1, c2, 80].
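
The truncation rule can be sketched in Python as follows.  The function
name truncate_bits is hypothetical; the sketch takes the remaining bits
from the byte that immediately follows the whole bytes already selected,
as the rule above describes.

```python
def truncate_bits(value: bytes, n: int) -> bytes:
    """Truncate value to its first n bits, zeroing unused trailing bits."""
    if n < 0 or n > len(value) * 8:
        raise ValueError("n is out of range for the input")
    whole, extra = divmod(n, 8)
    result = bytearray(value[:whole])
    if extra:
        # Keep only the `extra` most significant bits of the next byte.
        mask = (0xFF << (8 - extra)) & 0xFF
        result.append(value[whole] & mask)
    return bytes(result)
```

Calling truncate_bits(bytes.fromhex("a0b1c2d3e4"), 25) keeps three whole
bytes plus the top bit of the fourth.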

371	3.3.  Base32 Representation

373	   A modified version of Base32 [RFC4648] encoding is used to present
374	   the fingerprint in text form, grouping the output text into groups of
375	   five characters separated by a dash "-".  This representation improves
376	   the accuracy of both data entry and verification.
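
A minimal Python sketch of this presentation follows.  The function name
present is hypothetical.  The sketch assumes that the only modifications
to RFC 4648 Base32 are the removal of padding and the grouping into
fives, which the text above does not state explicitly, and its output
has not been checked against the reference implementation.

```python
import base64


def present(binary_fingerprint: bytes, bits: int) -> str:
    """Render the first `bits` bits as dash-separated Base32 groups."""
    if bits % 25 != 0:
        raise ValueError("precision must be a multiple of 25 bits")
    if bits > len(binary_fingerprint) * 8:
        raise ValueError("not enough input bits")
    n_bytes = (bits + 7) // 8
    # b32encode works on whole bytes; surplus characters and any '='
    # padding are dropped by keeping only bits/5 characters.
    text = base64.b32encode(binary_fingerprint[:n_bytes]).decode("ascii")
    text = text[: bits // 5]
    return "-".join(text[i:i + 5] for i in range(0, len(text), 5))
```

The input is the Binary Fingerprint Value, that is, the version byte
followed by the outer digest.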

378	3.4.  Example Encoding

380	   In the following examples, <Content-ID> is the UTF8 encoding of the
381	   string "text/plain" and <Data> is the UTF8 encoding of the string
382	   "UDF Data Value"

384	   Data =
385	     55 44 46 20  44 61 74 61  20 56 61 6C  75 65

387	   ContentType =
388	     74 65 78 74  2F 70 6C 61  69 6E

390	                                 Figure 3

392	3.4.1.  Using SHA-2-512 Digest

394	   H(<Data> ) =

396	     48 DA 47 CC  AB FE A4 5C  76 61 D3 21  BA 34 3E 58
397	     10 87 2A 03  B4 02 9D AB  84 7C CE D2  22 B6 9C AB
398	     02 38 D4 E9  1E 2F 6B 36  A0 9E ED 11  09 8A EA AC
399	     99 D9 E0 BD  EA 47 93 15  BD 7A E9 E1  2E AD C4 15

401	   <Content-ID> + ':' + H(<Data>) =

403	     74 65 78 74  2F 70 6C 61  69 6E 3A 48  DA 47 CC AB
404	     FE A4 5C 76  61 D3 21 BA  34 3E 58 10  87 2A 03 B4
405	     02 9D AB 84  7C CE D2 22  B6 9C AB 02  38 D4 E9 1E
406	     2F 6B 36 A0  9E ED 11 09  8A EA AC 99  D9 E0 BD EA
407	     47 93 15 BD  7A E9 E1 2E  AD C4 15

409	   H ( <Content-ID> + ':' + H(<Data>))=

411	     C6 AF B7 C0  FE BE 04 E5  AE 94 E3 7B  AA 5F 1A 40
412	     5B A3 CE CC  97 4D 55 C0  9E 61 E4 B0  EF 9C AE F9
413	     EB 83 BB 9D  5F 0F 39 F6  5F AA 06 DC  67 2A 67 71
414	     4F FF 8F 83  C4 55 38 36  38 AE 42 7A  82 9C 85 BB

416	                                 Figure 4

418	   Text Presentation (100 bit)  MDDK7-N6A72-7AJZN-OSTR3

420	   Text Presentation (125 bit)  MDDK7-N6A72-7AJZN-OSTRX-XKS72

422	   Text Presentation (150 bit)  MDDK7-N6A72-7AJZN-OSTRX-XKS7D-JAFXD

424	   Text Presentation (250 bit)  MDDK7-N6A72-7AJZN-OSTRX-XKS7D-JAFXI-
425	      6OZSL-U2VOA-TZQ6J-MHPTS

427	3.4.2.  Using SHA-3-512 Digest

429	   H(<Data> ) =

431	     6D 2E CF E6  93 5A 0C FC  F2 A9 1A 49  E0 0C D8 07
432	     A1 4E 70 AB  72 94 6E CC  BB 47 48 F1  8E 41 49 95
433	     07 1D F3 6E  0D 0C 8B 60  39 C1 8E B4  0F 6E C8 08
434	     65 B4 C4 45  9B A2 7E 97  74 7B BE 68  BC A8 C2 17

436	   <Content-ID> + ':' + H(<Data>) =

438	     74 65 78 74  2F 70 6C 61  69 6E 3A 6D  2E CF E6 93
439	     5A 0C FC F2  A9 1A 49 E0  0C D8 07 A1  4E 70 AB 72
440	     94 6E CC BB  47 48 F1 8E  41 49 95 07  1D F3 6E 0D
441	     0C 8B 60 39  C1 8E B4 0F  6E C8 08 65  B4 C4 45 9B
442	     A2 7E 97 74  7B BE 68 BC  A8 C2 17

444	   H ( <Content-ID> + ':' + H(<Data>))=

446	     58 9B 76 70  35 B4 55 E5  41 4C 29 4D  73 C1 FD 48
447	     F9 9A D6 29  35 A3 14 9A  32 6C EA 9E  7D 7A 8C 3F
448	     26 B0 0F 15  84 CB BE 6F  35 C6 37 48  AF 5C F1 02
449	     31 79 50 B1  A1 4F 97 50  97 49 5E DA  A2 A0 A9 B5

451	                                 Figure 5

453	   Text Presentation (100 bit)  SCFIN-CQGDR-KG47R-7OVPZ

455	   Text Presentation (125 bit)  SCFIN-CQGDR-KG47R-7OVPT-TCHZ5

457	   Text Presentation (150 bit)  SCFIN-CQGDR-KG47R-7OVPT-TCHZ7-UXY4I

459	   Text Presentation (250 bit)  SCFIN-CQGDR-KG47R-7OVPT-TCHZ7-UXY5S-
460	      CFSMN-YBKBP-FELHX-I56EH

462	3.5.  Fingerprint Improvement

464	   Since an application must always calculate the full fingerprint value
465	   as part of the verification process, an application MAY accept a low
466	   precision (e.g. 100 bit) fingerprint value from the user and replace
467	   it with a higher precision fingerprint (e.g. 250 bits) after
468	   verification.

470	   Applications are encouraged to make use of the practice of
471	   fingerprint improvement wherever possible.
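
Fingerprint improvement can be sketched in Python as follows.  The
function name improve is hypothetical.  The sketch relies on the prefix
property described in Section 1.4 and treats the comparison as case
insensitive, which is an assumption.

```python
from typing import Optional


def improve(presented: str, computed: str) -> Optional[str]:
    """Return the higher precision `computed` if `presented` matches it."""
    short = presented.replace("-", "").upper()
    full = computed.replace("-", "").upper()
    # A lower precision fingerprint must be a prefix of the full one.
    if short and full.startswith(short):
        return computed
    return None
```

An application would store the returned value in place of the one the
user entered, and reject the data if None is returned.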

473	3.6.  Compressed Presentation

475	   Fingerprint compression permits the use of shorter fingerprint
476	   presentation without a reduction in the attacker work factor by
477	   requiring the fingerprint value to match a particular pattern.

479	   UDF fingerprints MUST use compression if possible.  A compressed
480	   fingerprint uses a version identifier that specifies the form of
481	   compression used as follows:

483	                 +------------+-------------------------+
484	                 | Version ID | Compression             |
485	                 +------------+-------------------------+
486	                 | 96         | None                    |
487	                 | 97         | First 25 bits are zeros |
488	                 | 98         | First 35 bits are zeros |
489	                 | 99         | First 40 bits are zeros |
490	                 | 100        | First 45 bits are zeros |
491	                 | 101        | First 50 bits are zeros |
492	                 +------------+-------------------------+

494	                                  Table 2

496	   Support for compression may introduce perverse incentives such as
497	   performing key generation on machines that are less secure but offer
498	   fast (or cheap) processing power.  An attacker might even offer to
499	   generate public key pairs for free using their 'ultra fast' machine.
500	   For this reason, it is probably desirable to at least support if not
501	   mandate the use of some sort of salting scheme when compression is in
502	   use.  This allows the key to be generated in secure, trusted hardware
503	   with only the discovery of a salt providing the desired compression
504	   being performed on less trusted or untrusted devices.

506	   Currently, 25 bit compression may be achieved on commodity machines
507	   with minimal impact on key generation if salting is used.  Use of 35
508	   bit compression has a noticeable impact, but can still be achieved
509	   within hours without the use of special purpose hardware (e.g. a
510	   GPU).  Use of 40 bit compression is feasible with a GPU, and use of
511	   50 bit compression, which would allow a fingerprint to be shortened
512	   by ten significant characters, is on the outer edge of practicality.
513	   While support for even higher levels of compression is conceivable,
514	   it is probably not very sensible.
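
Selection of the version identifier from Table 2 can be sketched in
Python as follows.  The names are hypothetical, and the sketch assumes
that the leading zero bits are counted over the outer digest value.  It
covers only the choice of identifier and does not show how the
compressed text is formed.

```python
# (Version ID, required leading zero bits) from Table 2, strongest first.
COMPRESSION_LEVELS = [(101, 50), (100, 45), (99, 40), (98, 35), (97, 25)]
UNCOMPRESSED = 96


def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits before the first one bit of the digest."""
    return len(digest) * 8 - int.from_bytes(digest, "big").bit_length()


def compression_version_id(digest: bytes) -> int:
    """Return the strongest Table 2 identifier the digest qualifies for."""
    zeros = leading_zero_bits(digest)
    for version_id, required in COMPRESSION_LEVELS:
        if zeros >= required:
            return version_id
    return UNCOMPRESSED
```

Picking the strongest matching level follows the requirement that
compression be used whenever possible.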

516	3.7.  Example of Compressed Encoding.

518	3.7.1.  Example

520	   The string "290668103" has a SHA-2-512 UDF fingerprint with 29
521	   leading zero bits.  The inputs to the fingerprint are:

523	   Data =
524	     32 39 30 36  36 38 31 30  33

526	   ContentType =
527	     74 65 78 74  2F 70 6C 61  69 6E

529	                                 Figure 6

531	   The 100 bit UDF fingerprint is:

533	      MF3VV-FOFE2-CLRW (Maybe)

535	   NB: The above is not generated from code and might well be incorrect.

537	4.  Content Types

539	   While a UDF fingerprint MAY be used to identify any form of static
540	   data, the use of a UDF fingerprint to identify a public key signature
541	   key provides a level of indirection and thus the ability to identify
542	   dynamic data.  The content types used to identify public keys are
543	   thus of particular interest.

545	   As described in the security considerations section, the use of
546	   fingerprints to identify a bare public key and the use of
547	   fingerprints to identify a public key and associated security policy
548	   information are very different.

550	4.1.  PKIX Certificates and Keys

552	   UDF fingerprints MAY be used to identify PKIX certificates, CRLs and
553	   public keys in the ASN.1 encoding used in PKIX certificates.

555	   Since PKIX certificates and CRLs contain security policy information,
556	   UDF fingerprints used to identify certificates or CRLs SHOULD be
557	   presented with a minimum of 200 bits of precision.  PKIX applications
558	   MUST NOT accept UDF fingerprints specified with less than 200 bits of
559	   precision for purposes of identifying trust anchors.

561	   PKIX certificates, keys and related content data are identified by
562	   the following content types:

564	   application/pkix-cert  A PKIX Certificate
565	   application/pkix-crl  A PKIX CRL

567	   application/pkix-keyinfo  The KeyInfo structure defined in the PKIX
568	      certificate specification

570	4.2.  OpenPGP Key

572	   OpenPGPv5 keys and key set content data are identified by the
573	   following content types:

575	   application/pgp-key-v5  An OpenPGP key

577	   application/pgp-keys  An OpenPGP key set.

579	4.3.  DNSSEC

581	   DNSSEC record data consists of DNS records which are identified by
582	   the following content type:

584	   application/dns  A DNS resource record in binary format

586	5.  Additional UDF Renderings

588	   By default, a UDF fingerprint is rendered in the Base32 encoding
589	   described in this document.  Additional renderings MAY be employed to
590	   facilitate entry and/or verification of fingerprint values.

592	5.1.  Machine Readable Rendering

594	   The use of a machine-readable rendering such as a QR Code allows a
595	   UDF value to be input directly using a smartphone or other device
596	   equipped with a camera.

598	   A QR code fixed to a network capable device might contain the
599	   fingerprint of a machine readable description of the device.

601	5.2.  Word Lists

603	   The use of a Word List to encode fingerprint values was introduced by
604	   Patrick Juola and Philip Zimmermann for the PGPfone application.  The
605	   PGP Word List is designed to facilitate exchange and verification of
606	   fingerprint values in a voice application.  To minimize the risk of
607	   misinterpretation, two word lists of 256 values each are used to
608	   encode alternative fingerprint bytes.  The compact size of the lists
609	   used allowed the compilers to curate them so as to maximize the
610	   phonetic distance of the words selected.

612	   The PGP Word List is designed to achieve a balance between ease of
613	   entry and verification.  Applications where only verification is
614	   required may be better served by a much larger word list, permitting
615	   shorter fingerprint encodings.

617	   For example, a word list with 16384 entries permits 14 bits of the
618	   fingerprint to be encoded at once, 65536 entries permits 16.  These
619	   encodings allow a 125 bit fingerprint to be encoded in 9 and 8 words
620	   respectively.
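
The word counts above follow from dividing the precision by the number
of bits each word carries and rounding up.  A minimal Python sketch,
with the hypothetical function name words_needed and assuming the list
size is a power of two:

```python
def words_needed(fingerprint_bits: int, word_list_size: int) -> int:
    """Return how many words encode a fingerprint of the given precision."""
    if word_list_size < 2 or word_list_size & (word_list_size - 1):
        raise ValueError("word list size must be a power of two")
    bits_per_word = word_list_size.bit_length() - 1
    # Ceiling division: a partially filled final word still counts.
    return -(-fingerprint_bits // bits_per_word)
```

For a 125 bit fingerprint this gives 9 words with 16384 entries and 8
words with 65536 entries.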

622	5.3.  Image List

624	   An image list is used in the same manner as a word list affording
625	   rapid visual verification of a fingerprint value.  For obvious
626	   reasons, this approach is not generally suited to data entry.

628	6.  Security Considerations

630	6.1.  Work Factor and Precision

632	   A given UDF data object has a single fingerprint value that may be
633	   presented at different precisions.  The shortest legitimate precision
634	   with which a UDF fingerprint may be presented has 96 significant bits.

636	   A UDF fingerprint presents the same work factor as any other
637	   cryptographic digest function.  The difficulty of finding a second
638	   data item that matches a given fingerprint is 2^n and the difficulty
639	   of finding two data items that have the same fingerprint is 2^(n/2),
640	   where n is the precision of the fingerprint.

642	   For the algorithms specified in this document, n = 512 and thus the
643	   work factor for finding collisions is 2^256, a value that is
644	   generally considered to be computationally infeasible.

646	   Since the use of 512 bit fingerprints is impractical in the type of
647	   applications where fingerprints are generally used, truncation is a
648	   practical necessity.  The longer a fingerprint is, the less likely it
649	   is that a user will check every character.  It is therefore important
650	   to consider carefully whether the security of an application depends
651	   on second pre-image resistance or collision resistance.

653	   In most fingerprint applications, such as the use of fingerprints to
654	   identify public keys, the fact that a malicious party might generate
655	   two keys that have the same fingerprint value is a minor concern.
656	   Combined with a flawed protocol architecture, such a vulnerability
657	   may permit an attacker to construct a document such that the
658	   signature will be accepted as valid by some parties but not by
659	   others.

661	   For example, Alice generates keypairs until two are generated that
662	   have the same 100 bit UDF presentation (typically 2^48 attempts).
663	   She registers one keypair with a merchant and the other with her
664	   bank.  This allows Alice to create a payment instrument that will be
665	   accepted as valid by one and rejected by the other.

667	   The ability to generate two PKIX certificates with the same
668	   fingerprint and different certificate attributes raises very
669	   different and more serious security concerns.  For example, an
670	   attacker might generate two certificates with the same key and
671	   different use constraints.  This might allow an attacker to present a
672	   highly constrained certificate that does not present a security risk
673	   to an application for purposes of gaining approval and an
674	   unconstrained certificate to request a malicious action.

676	   In general, any use of fingerprints to identify data that has
677	   security policy semantics requires the risk of collision attacks to
678	   be considered.  For this reason, short, "user friendly" fingerprint
679	   presentations (less than 200 bits) SHOULD only be used for public key
680	   values.
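
An application might enforce the precision guidance in this section with
a check along the following lines.  This is a hypothetical sketch: the
constant and function names are not part of this specification, and the
200 bit threshold is treated as a hard limit although the text above
states it as a SHOULD.

```python
MIN_PRECISION_BITS = 96       # shortest legitimate presentation
POLICY_PRECISION_BITS = 200   # below this, public key values only


def precision_acceptable(precision_bits, identifies_public_key_only):
    """Apply the precision guidance to one presented fingerprint."""
    if precision_bits < MIN_PRECISION_BITS:
        return False
    if precision_bits < POLICY_PRECISION_BITS:
        # Short presentations are reserved for bare public key values.
        return identifies_public_key_only
    return True
```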

682	6.2.  Semantic Substitution

684	   Many applications record the fact that a data item is trusted; rather
685	   fewer record the circumstances in which the data item is trusted.
686	   This results in a semantic substitution vulnerability which an
687	   attacker may exploit by presenting the trusted data item in the wrong
688	   context.

690	   The UDF format provides protection against high level semantic
691	   substitution attacks by incorporating the content type into the input
692	   to the outermost fingerprint digest function.  The work factor for
693	   generating a UDF fingerprint that is valid in both contexts is thus
694	   the same as the work factor for finding a second preimage in the
695	   digest function (2^512 for the specified digest algorithms).

697	   It is thus infeasible to generate a data item such that some
698	   applications will interpret it as a PKIX key and others will accept
699	   it as an OpenPGP key.  While attempting to parse a PKIX key as an
700	   OpenPGP key is virtually certain to fail to return the correct key
701	   parameters, it cannot be assumed that the attempt is guaranteed to
702	   fail with an error message.
702	   fail with an error message.
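
The effect of binding the content type into the outermost digest can be
illustrated as follows.  This sketch is simplified and makes several
assumptions: the outer input is taken to be the content type, a colon,
and the SHA-2-512 digest of the data; the version identifier and the
grouping of the Base32 output are omitted; and the function name
udf_like_fingerprint is illustrative.  It is not a conforming UDF
implementation.

```python
import base64
import hashlib


def udf_like_fingerprint(content_type, data, precision_bits=200):
    """Digest of content type plus digest of data, truncated and Base32 encoded."""
    if precision_bits % 40 != 0 or not 0 < precision_bits <= 512:
        # Multiples of 40 bits encode to Base32 without padding.
        raise ValueError("precision must be a multiple of 40, at most 512")
    inner = hashlib.sha512(data).digest()
    # Assumed construction: H(content_type + ":" + H(data)).
    outer = hashlib.sha512(content_type.encode("utf-8") + b":" + inner).digest()
    return base64.b32encode(outer[: precision_bits // 8]).decode("ascii")


key_bytes = b"example key material (placeholder, not a real key)"
as_pkix = udf_like_fingerprint("application/pkix-keyinfo", key_bytes)
as_pgp = udf_like_fingerprint("application/pgp-key", key_bytes)
print(as_pkix != as_pgp)
```

Because the content type is part of the outer digest input, the same
bytes presented under two content types yield unrelated fingerprints, so
a fingerprint accepted for one content type does not validate the data
under the other.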

704	   The UDF format does not provide protection against semantic
705	   substitution attacks that do not affect the content type.

707	7.  IANA Considerations

709	   [This will be extended later]

711	7.1.  URI Registration

713	   [Here a URI registration for the udf: scheme]

715	7.2.  Content Type Registration

717	   [application/pkix-keyinfo]

719	   [application/pgp-key]

721	7.3.  Version Registry

723	   96 = SHA-2-512
724	   97 = SHA-2-512 with 25 leading zeros
725	   98 = SHA-2-512 with 40 leading zeros
726	   99 = SHA-2-512 with 50 leading zeros
727	   100 = SHA-2-512 with 55 leading zeros
728	   144 = SHA-3-512

730	                                 Figure 7

732	8.  References

734	8.1.  Normative References

736	   [RFC4648]  Josefsson, S., "The Base16, Base32, and Base64 Data
737	              Encodings", RFC 4648, DOI 10.17487/RFC4648, October 2006.

739	   [SHA-2]    NIST, "Secure Hash Standard", August 2015.

741	   [SHA-3]    Dworkin, M., "SHA-3 Standard: Permutation-Based Hash and
742	              Extendable-Output Functions", August 2015.

744	8.2.  Informative References

746	   [Dobertin95]
747	              Eurocrypt 1996, "Cryptanalysis of MD5 Compress".

749	   [draft-hallambaker-mesh-developer]
750	              Hallam-Baker, P., "Mathematical Mesh: Reference
751	              Implementation", draft-hallambaker-mesh-developer-06 (work
752	              in progress), April 2018.

754	   [hallambaker-sin]
755	              "[Reference Not Found!]".

757	   [RFC1321]  Rivest, R., "The MD5 Message-Digest Algorithm", RFC 1321,
758	              DOI 10.17487/RFC1321, April 1992.

760	8.3.  URIs

762	   [1] http://mathmesh.com/Documents/draft-hallambaker-udf.html

764	Author's Address

766	   Phillip Hallam-Baker
767	   Comodo Group Inc.

769	   Email: philliph@comodo.com