idnits 2.17.1 

draft-hallambaker-mesh-udf-01.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The abstract seems to contain references ([1]), which it shouldn't. 
     Please replace those with straight textual mentions of the documents in
     question.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == Line 1640 has weird spacing: '... suffix  srv/m...'

  == Line 1957 has weird spacing: '... set of  point...'

  == Line 1958 has weird spacing: '... degree  in a...'

  == Line 1959 has weird spacing: '...o prime  to...'

  == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD',
     or 'RECOMMENDED' is not an accepted usage according to RFC 2119.  Please
     use uppercase 'NOT' together with RFC 2119 keywords (if that is what you
     mean).
     
     Found 'MUST not' in this paragraph:
     
     UDF Binary Data Sequence types are either fixed length or variable
     length.  A variable length Binary Data Sequence MUST be truncated for
     presentation.  Fixed length Binary Data Sequences MUST not be truncated.

  == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD',
     or 'RECOMMENDED' is not an accepted usage according to RFC 2119.  Please
     use uppercase 'NOT' together with RFC 2119 keywords (if that is what you
     mean).
     
     Found 'MUST not' in this paragraph:
     
     Since PKIX certificates and CLRs contain security policy
     information, UDF fingerprints used to identify certificates or CRLs
     SHOULD be presented with a minimum of 200 bits of precision.  PKIX
     applications MUST not accept UDF fingerprints specified with less than
     200 bits of precision for purposes of identifying trust anchors.

  -- The document date (February 25, 2019) is 1884 days in the past.  Is this
     intentional?


  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  -- Looks like a reference, but probably isn't: '1' on line 2127

  -- Looks like a reference, but probably isn't: '2' on line 2129

  == Missing Reference: 'This' is mentioned on line 1776, but not defined

  -- Obsolete informational reference (is this intentional?): RFC 5785
     (Obsoleted by RFC 8615)


     Summary: 1 error (**), 0 flaws (~~), 8 warnings (==), 4 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Network Working Group                                    P. Hallam-Baker
3	Internet-Draft                                         February 25, 2019
4	Intended status: Informational
5	Expires: August 29, 2019

7	          Mathematical Mesh Part II: Uniform Data Fingerprint.
8	                     draft-hallambaker-mesh-udf-01

10	Abstract

12	   This document describes the naming and addressing schemes used in the
13	   Mathematical Mesh.  The means of generating Uniform Data Fingerprint
14	   (UDF) values and their presentation as text sequences and as URIs are
15	   described.

17	   A UDF consists of a binary sequence, the initial eight bits of which
18	   specify a type identifier code.  Type identifier codes have been
19	   selected so as to provide a useful mnemonic indicating their purpose
20	   when presented in Base32 encoding.

22	   Two categories of UDF are described.  Data UDFs provide a compact
23	   presentation of a fixed length binary data value in a format that is
24	   convenient for data entry.  A Data UDF may represent a cryptographic
25	   key, a nonce value or a share of a secret.  Fingerprint UDFs provide
26	   a compact presentation of a Message Digest or Message Authentication
27	   Code value.

29	   A Strong Internet Name (SIN) consists of a DNS name which contains at
30	   least one label that is a UDF fingerprint of a policy document
31	   controlling interpretation of the name.  SINs allow a direct trust
32	   model to be applied to achieve end-to-end security in existing
33	   Internet applications without the need for trusted third parties.

35	   UDFs may be presented as URIs to form either names or locators for
36	   use with the UDF location service.  An Encrypted Authenticated
37	   Resource Locator (EARL) is a UDF locator URI presenting a service
38	   from which an encrypted resource may be obtained and a symmetric key
39	   that may be used to decrypt the content.  EARLs may be presented on
40	   paper correspondence as a QR code to securely provide a machine
41	   readable version of the same content.  This may be applied to
42	   automate processes such as invoicing or to provide accessibility
43	   services for the partially sighted.

45	   This document is also available online at
46	   http://mathmesh.com/Documents/draft-hallambaker-mesh-udf.html [1].

48	Status of This Memo

50	   This Internet-Draft is submitted in full conformance with the
51	   provisions of BCP 78 and BCP 79.

53	   Internet-Drafts are working documents of the Internet Engineering
54	   Task Force (IETF).  Note that other groups may also distribute
55	   working documents as Internet-Drafts.  The list of current Internet-
56	   Drafts is at https://datatracker.ietf.org/drafts/current/.

58	   Internet-Drafts are draft documents valid for a maximum of six months
59	   and may be updated, replaced, or obsoleted by other documents at any
60	   time.  It is inappropriate to use Internet-Drafts as reference
61	   material or to cite them other than as "work in progress."

63	   This Internet-Draft will expire on August 29, 2019.

65	Copyright Notice

67	   Copyright (c) 2019 IETF Trust and the persons identified as the
68	   document authors.  All rights reserved.

70	   This document is subject to BCP 78 and the IETF Trust's Legal
71	   Provisions Relating to IETF Documents
72	   (https://trustee.ietf.org/license-info) in effect on the date of
73	   publication of this document.  Please review these documents
74	   carefully, as they describe your rights and restrictions with respect
75	   to this document.  Code Components extracted from this document must
76	   include Simplified BSD License text as described in Section 4.e of
77	   the Trust Legal Provisions and are provided without warranty as
78	   described in the Simplified BSD License.

80	Table of Contents

82	   1.  Introduction
83	     1.1.  UDF Types
84	       1.1.1.  Cryptographic Keys and Nonces
85	       1.1.2.  Fingerprint type UDFs
86	     1.2.  UDF URIs
87	       1.2.1.  Name Form
88	       1.2.2.  Locator Form
89	     1.3.  Strong Internet Names
90	   2.  Definitions
91	     2.1.  Requirements Language
92	     2.2.  Defined Terms
93	     2.3.  Related Specifications
94	     2.4.  Implementation Status
95	   3.  Architecture
96	     3.1.  Base32 Presentation
97	       3.1.1.  Precision Improvement
98	     3.2.  Type Identifier
99	     3.3.  Content Type Identifier
100	     3.4.  Truncation
101	       3.4.1.  Compression
102	     3.5.  Presentation
103	     3.6.  Alternative Presentations
104	       3.6.1.  Word Lists
105	       3.6.2.  Image List
106	   4.  Fixed Length UDFs
107	     4.1.  Nonce Type
108	     4.2.  Encryption/Authentication Type
109	     4.3.  Shamir Shared Secret
110	       4.3.1.  Secret Generation
111	       4.3.2.  Recovery
112	   5.  Variable Length UDFs
113	     5.1.  Content Digest
114	       5.1.1.  Content Digest Value
115	       5.1.2.  Typed Content Digest Value
116	       5.1.3.  Compression
117	       5.1.4.  Presentation
118	       5.1.5.  Example Encoding
119	       5.1.6.  Using SHA-2-512 Digest
120	       5.1.7.  Using SHA-3-512 Digest
121	       5.1.8.  Using SHA-2-512 Digest with Compression
122	       5.1.9.  Using SHA-3-512 Digest with Compression
123	     5.2.  Authenticator UDF
124	       5.2.1.  Content Digest Value
125	       5.2.2.  Authentication Value
126	     5.3.  Content Type Values
127	       5.3.1.  PKIX Certificates and Keys
128	       5.3.2.  OpenPGP Key
129	       5.3.3.  DNSSEC
130	   6.  UDF URIs
131	     6.1.  Name form
132	     6.2.  Locator form
133	       6.2.1.  DNS Web service discovery
134	       6.2.2.  Content Identifier
135	       6.2.3.  Target URI
136	       6.2.4.  Postprocessing
137	       6.2.5.  Decryption and Authentication
138	       6.2.6.  QR Presentation
139	   7.  Strong Internet Names
140	   8.  Security Considerations
141	     8.1.  Confidentiality
142	     8.2.  Availability
143	     8.3.  Integrity
144	     8.4.  Work Factor and Precision
145	     8.5.  Semantic Substitution
146	     8.6.  QR Code Scanning
147	   9.  IANA Considerations
148	     9.1.  Protocol Service Name
149	     9.2.  Well Known
150	     9.3.  URI Registration
151	     9.4.  Media Types Registrations
152	       9.4.1.  Media Type: application/pkix-keyinfo
153	       9.4.2.  Media Type: application/udf-encryption
154	       9.4.3.  Media Type: application/udf-secret
155	     9.5.  Uniform Data Fingerprint Type Identifier Registry
156	       9.5.1.  The name of the registry
157	       9.5.2.  Required information for registrations
158	       9.5.3.  Applicable registration policy
159	       9.5.4.  Size, format, and syntax of registry entries
160	       9.5.5.  Initial assignments and reservations
161	   10. Appendix A: Prime Values for Secret Sharing
162	   11. Recovering Shamir Shared Secret
163	   12. References
164	     12.1.  Normative References
165	     12.2.  Informative References
166	     12.3.  URIs
167	   Author's Address

169	1.  Introduction

171	   A Uniform Data Fingerprint (UDF) is a generalized format for
172	   presenting and interpreting short binary sequences representing
173	   cryptographic keys or fingerprints of data of any specified type.
174	   The UDF format provides a superset of the OpenPGP [RFC4880]
175	   fingerprint encoding capability with greater encoding density and
176	   readability.

178	   This document describes the syntax and encoding of UDFs, the means of
179	   constructing and comparing them and their use in other Internet
180	   addressing schemes.

182	1.1.  UDF Types

184	   Two categories of UDF are described.  Data UDFs provide a compact
185	   presentation of a fixed length binary data value in a format that is
186	   convenient for data entry.  A Data UDF may represent a cryptographic
187	   key, a nonce value, or a share of a key generated using a secret
188	   sharing mechanism.  Fingerprint UDFs provide a compact presentation
189	   of a Message Digest or Message Authentication Code value.

191	   Both categories of UDF are encoded as a UDF binary sequence, the
192	   first octet of which is a Type Identifier and the remaining octets
193	   specify the binary value according to the type identifier and data
194	   referenced.

196	   UDFs are typically presented to the user as a Base32 encoded sequence
197	   in groups of four characters separated by dashes.  This format
198	   provides a useful balance between compactness and readability.  The
199	   type identifier codes have been selected so as to provide a useful
200	   mnemonic when presented in Base32 encoding.

202	   The following are examples of UDF values:

204	   NA4C-5USH-UPDO-KGBT-VTXN-UGY4-47KP
205	   ECNY-JVFQ-26XG-25SM-2GT6-6KNE-XGPA
206	   SAQM-GT3M-N3EA-CLAZ-DHCC-DPUO-RFSO-E
207	   MB5S-R4AJ-3FBT-7NHO-T26Z-2E6Y-WFH4
208	   KCM5-7VB6-IJXJ-WKHX-NZQF-OKGZ-EWVN
209	   ACON-ADL2-6W6O-LDE7-XJ7B-EFQE-BUOZ

211	   Like email addresses, UDFs are not Uniform Resource Identifiers
212	   (URIs) but may be expressed in URI form by adding the scheme
213	   identifier (udf) for use in contexts where an identifier in URI
214	   syntax is required.  A UDF URI MAY contain a domain name component
215	   allowing it to be used as a locator.

217	1.1.1.  Cryptographic Keys and Nonces

219	   A Nonce (N) UDF represents a short, fixed length randomly chosen
220	   binary value.

222	   Nonce UDFs are used within many Mesh protocols and data formats where
223	   it is necessary to represent a nonce value in text form.

225	   Nonce UDF:
226	     NA4C-5USH-UPDO-KGBT-VTXN-UGY4-47KP

228	   An Encryption/Authentication (E) UDF has the same format as a Nonce
229	   UDF but is identified as being intended to be used as a symmetric key
230	   for encryption and/or authentication.

232	   KeyValue:
233	     9B 84 D4 B0  D7 AE 6D 76  4C D1 A7 EF  29 A4 B9 9E

235	   Encryption/Authentication UDF:
236	     ECNY-JVFQ-26XG-25SM-2GT6-6KNE-XGPA
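
   The relationship between the KeyValue and the UDF shown above can be
   reproduced with a short sketch.  It assumes that the binary sequence
   is the type identifier octet (32 for type E) followed by the key
   octets, encoded in Base32 without padding and split into groups of
   four characters.  The function name present_fixed is illustrative
   and is not defined by this document.

```python
import base64

ENCRYPTION_TYPE_ID = 32  # type identifier for the Encryption/Authentication (E) UDF


def present_fixed(type_id: int, value: bytes) -> str:
    """Present a fixed length value as a UDF (illustrative helper)."""
    # Assumed layout: one type identifier octet, then the value octets.
    sequence = bytes([type_id]) + value
    # Base32 encode and drop the '=' padding characters.
    text = base64.b32encode(sequence).decode("ascii").rstrip("=")
    # Insert a dash between groups of four characters.
    return "-".join(text[i:i + 4] for i in range(0, len(text), 4))


key_value = bytes.fromhex("9B84D4B0D7AE6D764CD1A7EF29A4B99E")
print(present_fixed(ENCRYPTION_TYPE_ID, key_value))
# prints ECNY-JVFQ-26XG-25SM-2GT6-6KNE-XGPA
```

   No truncation is applied because the sequence is fixed length.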

238	   A Share (S) UDF also represents a short, fixed length binary value
239	   but only provides one share in a secret sharing scheme.  Recovery of
240	   the binary value requires a sufficient number of shares.

242	   Share UDFs are used in the Mesh to support key and data escrow
243	   operations without the need to rely on trusted hardware.  A share UDF
244	   can be copied by hand or printed in human or machine-readable form
245	   (e.g., a QR code).

247	   Key:     ECNY-JVFQ-26XG-25SM-2GT6-6KNE-XGPA
248	   Share 0: SAQM-GT3M-N3EA-CLAZ-DHCC-DPUO-RFSO-E
249	   Share 1: SAQ6-WGQE-FS4F-H2V3-423J-XDPT-NYIC-M
250	   Share 2: SARB-FZE3-5KUK-NKK6-WOUR-KXKY-KK5T-O

252	1.1.2.  Fingerprint type UDFs

254	   Fingerprint type UDFs contain a fingerprint value calculated over a
255	   content data item and an IANA media type.

257	   A Content Digest type UDF is a fingerprint type UDF in which the
258	   fingerprint is formed using a cryptographic digest algorithm.  Two
259	   digest algorithms are currently supported: SHA-2-512 (M, for
260	   Merkle-Damgard) and SHA-3-512 (K, for Keccak).

262	   The inclusion of the media type in the calculation of the UDF value
263	   provides protection against semantic substitution attacks in which
264	   content that has been found to be trustworthy when interpreted as one
265	   content type is presented in a context in which it is interpreted as
266	   a different content type in which it is unsafe.

268	   SHA-2-512: MB5S-R4AJ-3FBT-7NHO-T26Z-2E6Y-WFH4
269	   SHA-3-512: KCM5-7VB6-IJXJ-WKHX-NZQF-OKGZ-EWVN

271	   An Authentication UDF (A) is formed in the same manner as a
272	   fingerprint but using a Message Authentication Code algorithm and a
273	   symmetric key.

275	   Authentication UDFs are used to express commitments and to provide a
276	   means of blinding fingerprint values within a protocol by means of a
277	   nonce.

279	   HMAC-SHA-2-512: ACON-ADL2-6W6O-LDE7-XJ7B-EFQE-BUOZ

281	1.2.  UDF URIs

283	   The UDF URI scheme allows use of a UDF in contexts where a URI is
284	   expected.  The UDF URI scheme has two forms, name and locator.

286	1.2.1.  Name Form

288	   Name form UDF URIs identify a data resource but do not provide a
289	   means of discovery.  The URI is simply the scheme (udf) followed by
290	   the UDF value:

292	   udf:MB5S-R4AJ-3FBT-7NHO-T26Z-2E6Y-WFH4

294	1.2.2.  Locator Form

296	   Locator form UDF URIs identify a data resource and provide a hint
297	   that MAY provide a means of discovery.  If the content is not
298	   available from the location indicated, content obtained from a
299	   different source that matches the fingerprint MAY be used instead.

301	   udf://MB5S-R4AJ-3FBT-7NHO-T26Z-2E6Y-WFH4

303	   UDF locator form URIs presenting a fingerprint type UDF provide a
304	   tight binding of the content to the locator.  This allows the
305	   resolved content to be verified and rejected if it has been modified.

307	   UDF locator form URIs presenting an Encryption/Authentication type
308	   UDF provide a mechanism for identification, discovery and decryption
309	   of encrypted content.  UDF locators of this type are known as
310	   Encrypted/Authenticated Resource Locators (EARLs).

312	   Regardless of the type of the embedded UDF, UDF locator form URIs are
313	   resolved by first performing DNS Web Service Discovery to identify
314	   the Web Service Endpoint for the mmm-udf service at the specified
315	   domain.

317	   Resolution is completed by presenting the Content Digest Fingerprint
318	   of the UDF value specified in the URI to the specified Web Service
319	   Endpoint and performing a GET method request on the result.

321	   For example, Alice subscribes to Example.com, a purveyor of cat and
322	   kitten images.  The company generates paper and electronic invoices
323	   on a monthly basis.

325	   To generate the paper invoice, Example.com first creates a new
326	   encryption key:

328	   EAQE-BCUO-RXQ2-RQND-PVQ2-24T6-BK6W-PG

330	   One or more electronic forms of the invoice are encrypted under the
331	   key EAQE-BCUO-RXQ2-RQND-PVQ2-24T6-BK6W-PG and placed on the
332	   Example.com Web site so that the appropriate version is returned if
333	   Alice scans the QR code.

335	   The key is then converted to form an EARL for the example.com UDF
336	   resolution service:

338	   udf://example.com/EAQE-BCUO-RXQ2-RQND-PVQ2-24T6-BK6W-PG

340	   The EARL is then rendered as a QR code:

342	   [[This figure is not viewable in this format.  The figure is
343	   available at http://mathmesh.com/Documents/draft-hallambaker-mesh-
344	   udf.html [2].]]

346	   QR Code with embedded decryption and location key

348	   A printable invoice containing the QR code is now generated and sent
349	   to Alice.

351	   When Alice receives the invoice, she can pay it by simply scanning
352	   the invoice with a device that recognizes at least one of the invoice
353	   formats supported by Example.com.

355	   The UDF EARL locator shown above is resolved by first determining the
356	   Web Service Endpoint for the mmm-udf service for the domain
357	   example.com.

359	   Discover ("example.com", "mmm-udf") =
360	   https://example.com/.well-known/mmm-udf/

362	   Next the fingerprint of the source UDF is obtained.

364	   UDF (EAQE-BCUO-RXQ2-RQND-PVQ2-24T6-BK6W-PG) =
365	   MCR2-OH2Y-4XG7-Y3PP-3MFT-R57N-MZ4G-B3FN-2Z5S-K35M-PQHJ-ZZS2-4RR7-ILTA

367	   Combining the Web Service Endpoint and the fingerprint of the source
368	   UDF provides the URI from which the content is obtained using the
369	   normal HTTP GET method:

371	   https://example.com/.well-known/mmm-udf/MCR2-OH2Y-4XG7-Y3PP-3MFT-
372	   R57N-MZ4G-B3FN-2Z5S-K35M-PQHJ-ZZS2-4RR7-ILTA
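
   The resolution steps above can be sketched as follows.  This is a
   minimal illustration under stated assumptions: the discover function
   is a stand-in that returns the .well-known endpoint used in the
   example rather than performing DNS Web Service Discovery, and the
   fingerprint of the source UDF is supplied by the caller because its
   calculation is specified later in this document.  The function names
   are hypothetical.

```python
from urllib.parse import urlsplit


def discover(domain: str, service: str) -> str:
    """Stand-in for DNS Web Service Discovery (assumption: no DNS lookup)."""
    return f"https://{domain}/.well-known/{service}/"


def target_uri(earl: str, fingerprint: str) -> str:
    """Combine the Web Service Endpoint with the fingerprint of the source UDF."""
    parts = urlsplit(earl)
    if parts.scheme != "udf" or not parts.netloc:
        raise ValueError("not a locator form UDF URI")
    return discover(parts.netloc, "mmm-udf") + fingerprint


earl = "udf://example.com/EAQE-BCUO-RXQ2-RQND-PVQ2-24T6-BK6W-PG"
fingerprint = ("MCR2-OH2Y-4XG7-Y3PP-3MFT-R57N-MZ4G-"
               "B3FN-2Z5S-K35M-PQHJ-ZZS2-4RR7-ILTA")
print(target_uri(earl, fingerprint))
```

   The key carried in the EARL never appears in the request: only the
   fingerprint of the source UDF is sent to the service.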

374	   Having established that Alice can read postal mail sent to a physical
375	   address and having delivered a secret to that address, this process
376	   might be extended to provide a means of automating the process of
377	   enrolment in electronic delivery of future invoices.

379	1.3.  Strong Internet Names

381	   A SIN is an Internet Identifier that contains a UDF fingerprint of a
382	   security policy document that may be used to verify the
383	   interpretation of the identifier.  This permits traditional forms of
384	   Internet address such as URIs and RFC822 email addresses to be used
385	   to express a trusted address that is independent of any trusted third
386	   party.

388	   This document only describes the syntax and interpretation of the
389	   identifiers themselves.  The means by which the security policy
390	   documents bound to an address govern interpretation of the name is
391	   discussed separately in [draft-hallambaker-mesh-trust].

393	   For example, Example Inc. holds the domain name example.com and has
394	   deployed a private CA whose root of trust is a PKIX certificate with
395	   the UDF fingerprint MB2GK-6DUF5-YGYYL-JNY5E-RWSHZ.

397	   Alice is an employee of Example Inc.  She uses three email addresses:

399	   alice@example.com  A regular email address (not a SIN).

401	   alice@mm--mb2gk-6duf5-ygyyl-jny5e-rwshz.example.com  A strong email
402	      address that is backwards compatible.

404	   alice@example.com.mm--mb2gk-6duf5-ygyyl-jny5e-rwshz  A strong email
405	      address that is backwards incompatible.

407	   All three forms of the address are valid RFC822 addresses and may be
408	   used in a legacy email client, stored in an address book application,
409	   etc.  But the ability of a legacy client to make use of the address
410	   differs.  Addresses of the first type may always be used.  Addresses
411	   of the second type may only be used if an appropriate MX record is
412	   provisioned.  Addresses of the third type will always fail unless the
413	   resolver understands that it is a SIN requiring special processing.

415	   These rules allow Bob to send email to Alice with either 'best
416	   effort' security or mandatory security as the circumstances demand.
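
   A client that understands SINs needs to find the fingerprint label
   in an address before it can apply the policy.  The sketch below
   assumes, based only on the examples above, that the label is marked
   by the prefix "mm--" and that the remainder of the label is the UDF
   in lower case.  The function name sin_fingerprint is hypothetical.

```python
from typing import Optional

SIN_PREFIX = "mm--"  # assumption: prefix inferred from the example addresses


def sin_fingerprint(address: str) -> Optional[str]:
    """Return the UDF carried in a SIN label, or None for an ordinary name."""
    # Keep only the domain part of an email address.
    domain = address.rsplit("@", 1)[-1]
    for label in domain.split("."):
        if label.lower().startswith(SIN_PREFIX):
            return label[len(SIN_PREFIX):].upper()
    return None


for addr in ("alice@example.com",
             "alice@mm--mb2gk-6duf5-ygyyl-jny5e-rwshz.example.com",
             "alice@example.com.mm--mb2gk-6duf5-ygyyl-jny5e-rwshz"):
    print(addr, "->", sin_fingerprint(addr))
```

   The first address yields no fingerprint, so only best effort
   security is available for it.  The other two yield the fingerprint
   of the policy document against which the name is to be verified.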

418	2.  Definitions

420	   This section presents the related specifications and standards, the
421	   terms that are used as terms of art within the documents, and the
422	   terms used as requirements language.

424	2.1.  Requirements Language

426	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
427	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
428	   document are to be interpreted as described in [RFC2119].

430	2.2.  Defined Terms

432	   Cryptographic Digest Function  A hash function that has the
433	      properties required for use as a cryptographic hash function.
434	      These include collision resistance, first pre-image resistance and
435	      second pre-image resistance.

437	   Content Type  An identifier indicating how a Data Value is to be
438	      interpreted as specified in the IANA registry Media Types.

440	   Commitment  A cryptographic primitive that allows one to commit to a
441	      chosen value while keeping it hidden to others, with the ability
442	      to reveal the committed value later.

444	   Data Value  The binary octet stream that is the input to the digest
445	      function used to calculate a digest value.

447	   Data Object  A Data Value and its associated Content Type

449	   Digest Algorithm  A synonym for Cryptographic Digest Function

451	   Digest Value  The output of a Cryptographic Digest Function

453	   Data Digest Value  The output of a Cryptographic Digest Function for
454	      a given Data Value input.

456	   Fingerprint  A presentation of the digest value of a data value or
457	      data object.

459	   Fingerprint Presentation  The representation of at least some part of
460	      a fingerprint value in human or machine-readable form.

462	   Fingerprint Improvement  The practice of recording a higher precision
463	      presentation of a fingerprint on successful validation.

465	   Fingerprint Work Hardening  The practice of generating a sequence of
466	      fingerprints until one is found that matches criteria that permit
467	      a compressed presentation form to be used.  The compressed
468	      fingerprint is thus shorter than an uncompressed one while
469	      presenting the same work factor.

471	   Hash  A function which takes an input and returns a fixed-size
472	      output.  Ideally, the output of a hash function is unbiased and
473	      not correlated to the outputs returned for similar inputs in any
474	      predictable fashion.

476	   Precision  The number of significant bits provided by a Fingerprint
477	      Presentation.

479	   Work Factor  A measure of the computational effort required to
480	      perform an attack against some security property.

482	2.3.  Related Specifications

484	   This specification makes use of Base32 [RFC4648] encoding, SHA-2
485	   [SHA-2] and SHA-3 [SHA-3] digest functions in the derivation of basic
486	   fingerprints.  The derivation of keyed fingerprints additionally
487	   requires the use of the HMAC [RFC2104] and HKDF [RFC5869] functions.

489	   Resolution of UDF URI Locators makes use of DNS Web Service Discovery
490	   [draft-hallambaker-web-service-discovery].

492	2.4.  Implementation Status

494	   The implementation status of the reference code base is described in
495	   the companion document [draft-hallambaker-mesh-developer].

497	3.  Architecture

499	   A Uniform Data Fingerprint (UDF) is a presentation of a UDF Binary
500	   Data Sequence.

502	   This document specifies seven UDF Binary Data Sequence types and one
503	   presentation.

505	   The first octet of a UDF Binary Data Sequence identifies the UDF type
506	   and is referred to as the Type identifier.

508	   UDF Binary Data Sequence types are either fixed length or variable
509	   length.  A variable length Binary Data Sequence MUST be truncated for
510	   presentation.  Fixed length Binary Data Sequences MUST NOT be
511	   truncated.

513	3.1.  Base32 Presentation

515	   The default UDF presentation is Base32 Presentation.

517	   Variable length Binary Data Sequences are truncated to an integer
518	   multiple of 20 bits that provides the desired precision before
519	   conversion to Base32 form.

521	   Fixed length Binary Data Sequences are converted to Base32 form
522	   without truncation.

524	   After conversion to Base32 form, dash '-' characters are inserted
525	   between groups of 4 characters to aid reading.  This representation
526	   improves the accuracy of both data entry and verification.
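
   The truncation and grouping rules can be sketched as follows.  The
   sketch assumes that the requested precision is rounded up to the
   next multiple of 20 bits, which corresponds to whole groups of four
   Base32 characters.  The input in the usage lines is a placeholder
   (a type octet followed by a SHA-2-512 digest of arbitrary data), not
   the digest construction specified later in this document.  The
   function name present_truncated is illustrative.

```python
import base64
import hashlib


def present_truncated(sequence: bytes, precision_bits: int) -> str:
    """Present a variable length sequence at the requested precision."""
    # Round up to a whole number of 20 bit groups (4 Base32 characters each).
    groups = -(-precision_bits // 20)
    if groups * 20 > len(sequence) * 8:
        raise ValueError("sequence is shorter than the requested precision")
    # The first 4*groups characters encode exactly the first 20*groups bits.
    text = base64.b32encode(sequence).decode("ascii")[:groups * 4]
    return "-".join(text[i:i + 4] for i in range(0, len(text), 4))


# Placeholder input: type identifier 96 followed by a SHA-2-512 digest.
sequence = bytes([96]) + hashlib.sha512(b"example data").digest()
print(present_truncated(sequence, 120))
print(present_truncated(sequence, 260))
```

   The second line printed begins with the whole of the first, since
   both are prefixes of the same Base32 encoding.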

528	3.1.1.  Precision Improvement

530	   Precision improvement is the practice of using a high precision UDF
531	   (e.g. 260 bits) calculated from content data that has been validated
532	   according to a lower precision UDF (e.g. 120 bits).

534	   This allows a lower precision UDF to be used in a medium such as a
535	   business card where space is constrained without compromising
536	   subsequent uses.

538	   Applications SHOULD make use of precision improvement wherever
539	   possible.

541	3.2.  Type Identifier

543	   A Type Identifier consists of a single byte.

545	   The byte codes have been chosen so that the first character of the
546	   Base32 presentation of the UDF provides a mnemonic for its type.  A
547	   SHA-2 fingerprint UDF will always have M (for Merkle-Damgard) as the
548	   initial letter, a SHA-3 fingerprint UDF will always have K (for
549	   Keccak) as the initial letter, and so on.

551	   The following type identifiers are specified in this document:

553	          +---------+---------+--------------------------------+
554	          | Type ID | Initial | Algorithm                      |
555	          +---------+---------+--------------------------------+
556	          | 0       | A       | HMAC-SHA-2-512                 |
557	          | 32      | E       | HKDF-AES-512                   |
558	          | 80      | K       | SHA-3-512                      |
559	          | 81      | K       | SHA-3-512 (20 bits compressed) |
560	          | 82      | K       | SHA-3-512 (30 bits compressed) |
561	          | 83      | K       | SHA-3-512 (40 bits compressed) |
562	          | 84      | K       | SHA-3-512 (50 bits compressed) |
563	          | 96      | M       | SHA-2-512                      |
564	          | 97      | M       | SHA-2-512 (20 bits compressed) |
565	          | 98      | M       | SHA-2-512 (30 bits compressed) |
566	          | 99      | M       | SHA-2-512 (40 bits compressed) |
567	          | 100     | M       | SHA-2-512 (50 bits compressed) |
568	          | 104     | N       | Nonce data                     |
569	          | 144     | S       | Shamir Secret Sharing          |
570	          +---------+---------+--------------------------------+

572	                                  Table 1
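
   The Initial column of Table 1 follows from the Base32 alphabet of
   [RFC4648]: the first character encodes the five most significant
   bits of the type identifier octet.  The sketch below checks this for
   every row of the table.  The names used are illustrative.

```python
BASE32_ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"  # RFC 4648 Base32 alphabet

# Type ID -> Initial, copied from Table 1.
TABLE_1 = {
    0: "A", 32: "E",
    80: "K", 81: "K", 82: "K", 83: "K", 84: "K",
    96: "M", 97: "M", 98: "M", 99: "M", 100: "M",
    104: "N", 144: "S",
}


def initial_letter(type_id: int) -> str:
    """First Base32 character of a sequence starting with type_id."""
    # The top five bits of the first octet select the first character.
    return BASE32_ALPHABET[type_id >> 3]


for type_id, initial in TABLE_1.items():
    assert initial_letter(type_id) == initial
print("all Table 1 initials match")
```

   Because only the top five bits matter, each initial letter covers a
   block of eight consecutive type identifiers, which is how the
   compressed variants 81 to 84 and 97 to 100 keep the letters K and M.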

574	3.3.  Content Type Identifier

576	   A secure cryptographic digest algorithm provides a digest value
577	   that is probabilistically unique for a particular byte sequence
578	   but does not fix the context in which a byte sequence is interpreted.
579	   While such ambiguity may be tolerated in a fingerprint format
580	   designed for a single specific field of use, it is not acceptable in
581	   a general-purpose format.

583	   For example, the SSH and OpenPGP applications both make use of
584	   fingerprints as identifiers for public keys but use different
585	   digest algorithms and data formats to represent the public key
586	   data.  While no such vulnerability has been demonstrated to date,
587	   it is conceivable that a crafty attacker might construct an SSH
588	   key in such a fashion that OpenPGP interprets the data in an
589	   insecure fashion.  If the number of applications making use of a
590	   fingerprint format that permits such substitutions is
591	   sufficiently large, the probability of a semantic substitution
592	   vulnerability being possible becomes unacceptably large.

594	   A simple control that defeats such attacks is to incorporate a
595	   content type identifier within the scope of the data input to the
596	   hash function.

598	3.4.  Truncation

600	   Different applications of fingerprints demand different tradeoffs
601	   between compactness of the representation and the number of
602	   significant bits.  A larger number of significant bits reduces
603	   the risk of collision but at a cost to convenience.

605	   Modern cryptographic digest functions such as SHA-2 produce output
606	   values of at least 256 bits in length.  This is considerably larger
607	   than most uses of fingerprints require and certainly greater than can
608	   be represented in human readable form on a business card.

610	   Since a strong cryptographic digest function produces an output value
611	   in which every bit in the input value affects every bit in the output
612	   value with equal probability, it follows that truncating the digest
613	   value to produce a fingerprint is at least as strong as any other
614	   mechanism if the digest algorithm used is strong.

616	   Using truncation to reduce the precision of the digest function has
617	   the advantage that a lower precision fingerprint of some data content
618	   is always a prefix of a higher precision fingerprint of the same
619	   content.  This allows higher precision fingerprints to be converted
620	   to a lower precision without the need for special tools.

622	3.4.1.  Compression

624	   The Content Digest UDF types make use of work factor compression.
625	   Additional type identifiers are used to indicate digest values with
626	   20, 30, 40 or 50 trailing zero bits allowing a UDF fingerprint
627	   offering the equivalent of up to 150 bits of precision to be
628	   expressed in 20 characters instead of 30.

630	   To use compressed UDF identifiers, it is necessary to search for
631	   content that can be compressed.  If the digest algorithm used is
632	   secure, this means that by definition, the fastest means of search is
633	   brute force.  Thus, the reduction in fingerprint size is achieved by
634	   transferring the work factor from the attacker to the defender.  To
635	   maintain a work factor of 2^120 with an 80 bit fingerprint, it is
636	   necessary for the content generator to perform a brute force search
637	   at a cost of the order of 2^40 operations.

639	   For example, the smallest allowable work factor for a UDF
640	   presentation of a public key fingerprint is 92 bits.  This would
641	   normally require a presentation with 20 significant characters.
642	   Reducing this to 16 characters requires a brute force search of
643	   approximately 10^6 attempts.  Reducing this to 12 characters would
644	   require 10^12 attempts and to 10 characters, 10^15 attempts.
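
	   The arithmetic of this example can be checked with a short Python
	   sketch.  The function names are illustrative only and are not part
	   of this specification; the sketch assumes five bits per Base32
	   character.

```python
# Each Base32 character carries 5 bits, so every 5 bits of work factor
# compression removes one character from the presentation.
BITS_PER_CHAR = 5


def compressed_chars(base_chars: int, compression_bits: int) -> int:
    """Characters needed once compression_bits are covered by the search."""
    return base_chars - compression_bits // BITS_PER_CHAR


def search_cost(compression_bits: int) -> int:
    """Expected order of brute force attempts to find compressible content."""
    return 2 ** compression_bits


for bits in (20, 40, 50):
    print(compressed_chars(20, bits), "characters, about",
          f"{search_cost(bits):.0e}", "attempts")
```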

646	   Omission of support for higher levels of compression than 2^50 is
647	   intentional.

649	   In addition to allowing use of shorter presentations, work factor
650	   compression MAY be used as evidence of proof of work.

652	3.5.  Presentation

654	   The presentation of a fingerprint is the format in which it is
655	   presented to either an application or the user.

657	   Base32 encoding is used to produce the preferred text representation
658	   of a UDF fingerprint.  This encoding uses only the letters of the
659	   Latin alphabet with numbers chosen to minimize the risk of ambiguity
660	   between numbers and letters (2, 3, 4, 5, 6 and 7).

662	   To enhance readability and improve data entry, characters are grouped
663	   into groups of four.  This means that each block of four characters
664	   represents an increase in work factor of approximately one million
665	   times.
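
	   The following Python sketch illustrates the Base32 presentation and
	   the prefix property obtained by truncation.  The function name
	   udf_present is hypothetical; the input octets are the prefixed
	   Binary Data Sequence of the SHA-2-512 example in Section 5.1.6.

```python
import base64


def udf_present(data: bytes, bits: int) -> str:
    """Return the Base32 presentation of data truncated to `bits` bits."""
    chars = -(-bits // 5)  # ceiling division: 5 bits per character
    text = base64.b32encode(data).decode("ascii").rstrip("=")[:chars]
    # Group into blocks of four characters separated by hyphens.
    return "-".join(text[i:i + 4] for i in range(0, len(text), 4))


# Prefixed Binary Data Sequence from the SHA-2-512 example.
bds = bytes.fromhex("60C6AFB7C0FEBE04E5AE94E37BAA5F1A405B")

short = udf_present(bds, 100)   # MDDK-7N6A-727A-JZNO-STRX
longer = udf_present(bds, 140)  # MDDK-7N6A-727A-JZNO-STRX-XKS7-DJAF
print(short)
print(longer)
```

	   The lower precision presentation is a prefix of the higher
	   precision one, so a reader can shorten a fingerprint by discarding
	   trailing groups.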

667	3.6.  Alternative Presentations

669	   Applications that support UDF MUST support use of the Base32
670	   presentation.  Applications MAY support alternative presentations.

672	3.6.1.  Word Lists

674	   The use of a Word List to encode fingerprint values was introduced by
675	   Patrick Juola and Philip Zimmermann for the PGPfone application.  The
676	   PGP Word List is designed to facilitate exchange and verification of
677	   fingerprint values in a voice application.  To minimize the risk of
678	   misinterpretation, two word lists of 256 values each are used to
679	   encode alternate fingerprint bytes.  The compact size of the lists
680	   used allowed the compilers to curate them so as to maximize the
681	   phonetic distance of the words selected.

683	   The PGP Word List is designed to achieve a balance between ease of
684	   entry and verification.  Applications where only verification is
685	   required may be better served by a much larger word list, permitting
686	   shorter fingerprint encodings.

688	   For example, a word list with 16384 entries permits 14 bits of the
689	   fingerprint to be encoded at once and one with 65536 entries permits
690	   16 bits.  These encodings allow a 120 bit fingerprint to be encoded
691	   in 9 and 8 words respectively.

693	3.6.2.  Image List

695	   An image list is used in the same manner as a word list affording
696	   rapid visual verification of a fingerprint value.  For obvious
697	   reasons, this approach is not suited to data entry but is preferable
698	   for comparison purposes.  An image list of 1,048,576 images would
699	   provide a 20 bit encoding allowing 120 bit precision fingerprints to
700	   be displayed in six images.
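
	   The word and image counts quoted for these alternative
	   presentations follow from simple arithmetic, sketched below in
	   Python.  The function name symbols_needed is illustrative only.

```python
def symbols_needed(fingerprint_bits: int, list_size: int) -> int:
    """Number of list entries (words or images) needed for a fingerprint."""
    bits_per_symbol = list_size.bit_length() - 1  # floor(log2(list_size))
    return -(-fingerprint_bits // bits_per_symbol)  # ceiling division


for size in (16384, 65536, 1048576):
    print(size, "entries:", symbols_needed(120, size), "symbols")
```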

702	4.  Fixed Length UDFs

704	   Fixed length UDFs are used to represent cryptographic keys, nonces
705	   and secret shares and have a fixed length determined by their
706	   function that cannot be truncated without loss of information.

708	   All fixed length Binary Data Sequence values are an integer multiple
709	   of eight bits.

711	4.1.  Nonce Type

713	   A Nonce Type UDF consists of the type identifier octet 104 followed
714	   by the Binary Data Sequence value.

716	   The Binary Data Sequence value is an integer number of octets that
717	   SHOULD have been generated in accordance with processes and
718	   procedures that ensure that it is sufficiently unpredictable for the
719	   purposes of the protocol in which the value is to be used.
720	   Requirements for such processes and procedures are described in
721	   [RFC4086].

723	   Nonce Type UDFs are intended for use in contexts where it is
724	   necessary for a randomly chosen value to be unpredictable but not
725	   secret.  For example, the challenge in a challenge/response
726	   mechanism.
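
	   A minimal Python sketch of nonce generation follows.  It assumes
	   the type identifier 104 listed for nonce data in Table 1, a 128 bit
	   value and presentation of the complete value without truncation;
	   the function name nonce_udf is hypothetical.

```python
import base64
import secrets

NONCE_TYPE_ID = 104  # Table 1: Nonce data, initial letter N


def nonce_udf(length: int = 16) -> str:
    """Return a Nonce Type UDF holding `length` unpredictable octets."""
    value = bytes([NONCE_TYPE_ID]) + secrets.token_bytes(length)
    text = base64.b32encode(value).decode("ascii").rstrip("=")
    return "-".join(text[i:i + 4] for i in range(0, len(text), 4))


print(nonce_udf())
```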

728	4.2.  Encryption/Authentication Type

730	   An Encryption/Authentication Type UDF consists of the type identifier
731	   octet 32 followed by the Binary Data Sequence value.

733	   The Binary Data Sequence value is an integer number of octets that
734	   SHOULD have been generated in accordance with processes and
735	   procedures that ensure that it is sufficiently unpredictable and
736	   unguessable for the purposes of the protocol in which the value is to
737	   be used.  Requirements for such processes and procedures are
738	   described in [RFC4086].

740	   Encryption/Authentication Type UDFs are intended to be used as a
741	   means of specifying secret cryptographic keying material.  For
742	   example, the input to a Key Derivation Function used to encrypt a
743	   document.  Accordingly, the identifier UDF corresponding to an
744	   Encryption/Authentication type UDF is a UDF fingerprint of the
745	   Encryption/Authentication Type UDF in Base32 presentation with
746	   content type 'application/udf-encryption'.

748	4.3.  Shamir Shared Secret

750	   The UDF format MAY be used to encode shares generated by a secret
751	   sharing mechanism.  The only secret sharing mechanism currently
752	   supported is the Shamir Secret Sharing mechanism [Shamir79].  Each
753	   secret share represents a point (x, f(x)) on a polynomial f over
754	   the modular field p.  The secret being shared is an integer
755	   multiple of 32 bits represented by the polynomial value f(0).

757	   A Shamir Shared Secret Type UDF consists of the type identifier octet
758	   144 followed by the Binary Data Sequence value describing the share
759	   value.

761	   The first octet of the Binary Data Sequence value specifies the
762	   threshold value and the x value of the particular share:

764	   o  Bits 4-7 of the first byte specify the threshold value.

766	   o  Bits 0-3 of the first byte specify the x value minus 1.

768	   The remaining octets specify the value f(x) in network byte (big-
769	   endian) order with leading padding if necessary so that the share has
770	   the same number of bytes as the secret.

772	   The algorithm requires that the value p be a prime larger than the
773	   integer representing the largest secret being shared.  For
774	   compactness of representation we choose p to be the smallest prime
775	   that is greater than 2^n where n is an integer multiple of 32.  This
776	   approach leaves a small probability that a set of chosen polynomial
777	   parameters causes one or more share values to be 2^n or larger.
778	   Since it is the value of the secret rather than the polynomial
779	   parameters that is of importance, such parameters MUST NOT be used.

781	4.3.1.  Secret Generation

783	   To share a secret of L bits with a threshold of n we use f(x), a
784	   polynomial of degree n-1 in the modular field p:

786	   f(x) = a_0 + a_1.x + a_2.x^2 + ... + a_(n-1).x^(n-1)
787	   where:

789	   L  Is the length of the secret in bits, an integer multiple of 32.

791	   n  Is the threshold, the number of shares required to reconstitute
792	      the secret.

794	   a_0  Is the integer representation of the secret to be shared.

796	   a_1 ... a_(n-1)  Are randomly chosen integers less than p.

798	   p  Is the smallest prime that is greater than 2^L.

800	   For L=128, p = 2^128+51.

802	   The m key shares (m >= n) are the values f(1), f(2), ... f(m).

804	   The most straightforward approach to generation of Shamir secrets is
805	   to generate the set of polynomial coefficients a_0, a_1, ...
806	   a_(n-1) and use these to generate the share values f(1), ... f(m).

808	   Note that if this approach is adopted, there is a small probability
809	   that one or more of the values f(1), f(2), ... f(m) exceeds the range
810	   of values supported by the encoding.  Should this occur, at least one
811	   of the polynomial coefficients MUST be replaced.

813	   An alternative means of generating the set of secrets is to select up
814	   to n-1 secret share values and use secret recovery to determine at
815	   least one additional share.  If n shares are selected, the shared
816	   secret becomes an output of rather than an input to the process.

818	4.3.2.  Recovery

820	   To recover the value of the shared secret, it is necessary to obtain
821	   sufficient shares to meet the threshold and recover the value f(0) =
822	   a_0.

824	   Applications MAY employ any approach that returns the correct result.
825	   The use of Lagrange basis polynomials is described in Appendix C.

827	   Alice decides to encrypt an important document and split the
828	   encryption key so that there are five key shares, three of which will
829	   be required to recover the key.

831	   Alice's master secret is
832	     98 E3 AE F6  0C CA 1A 53  9A 42 30 B5  D6 AB 80 A6

834	   This has the UDF representation:

836	   ECMO-HLXW-BTFB-UU42-IIYL-LVVL-QCTA

838	   The master secret is converted to an integer applying network byte
839	   order conventions.  Since the master secret is 128 bits, it is
840	   guaranteed to be smaller than the modulus.  The resulting value
841	   becomes the polynomial value a0.

843	   Since a threshold of three shares is required, we will need a second
844	   order polynomial.  The coefficients of the polynomial a1, a2 are
845	   random numbers smaller than the modulus:

847	   a0 = 203224855379551779909878019697389830310
848	   a1 = 213435878098443219772173517206501812827
849	   a2 = 14119443632507462021491753632290899362

851	   The master secret is the value f(0) = a0.  The key shares are the
852	   values f(1), f(2)...f(5):

854	   f(1) = 90497810189563998240168683104414330992
855	   f(2) = 6009652264591140613442853776020630398
856	   f(3) = 290042748525571670493075139143976940035
857	   f(4) = 262032365130628660952316324344746836889
858	   f(5) = 262260869000700575454541016810098532467

860	   The first byte of each share specifies the recovery information
861	   (quorum, x value), the remaining bytes specify the share value in
862	   network byte order:

864	   f(1) =
865	     30 44 15 3E  87 77 6F 2D  78 FF E9 CE  FB 72 B9 F0
866	     70
867	   f(2) =
868	     31 04 85 6A  BB 9B AF 30  8C 51 55 0F  4A 38 1A 6B
869	     7E
870	   f(3) =
871	     32 DA 34 33  92 79 8A 23  8D 8E 83 F1  A2 26 CC F2
872	     03
873	   f(4) =
874	     33 C5 21 99  0C 11 00 06  7C B7 76 76  03 3E D1 83
875	     99
876	   f(5) =
877	     34 C5 4D 9B  28 62 10 D9  59 CC 2C 9C  6D 80 28 20
878	     73

880	   The UDF presentation of the key shares is thus:

882	   f(1) = SAYE-IFJ6-Q53W-6LLY-77U4-563S-XHYH-A
883	   f(2) = SAYQ-JBLK-XON2-6MEM-KFKQ-6SRY-DJVX-4
884	   f(3) = SAZN-UNBT-SJ4Y-UI4N-R2B7-DIRG-ZTZA-G
885	   f(4) = SAZ4-KIMZ-BQIQ-ABT4-W53H-MAZ6-2GBZ-S
886	   f(5) = SA2M-KTM3-FBRB-BWKZ-ZQWJ-Y3MA-FAQH-G

888	   To recover the value f(0) from any three shares, we need to fit a
889	   polynomial curve to the three points and use it to calculate the
890	   value at x=0 using the Lagrange polynomial basis.
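
	   The example can be reproduced with the following Python sketch,
	   which uses the coefficients listed above.  The function names are
	   hypothetical and the recovery routine is one possible use of the
	   Lagrange basis; applications are free to use any approach that
	   returns the correct result.

```python
P = 2**128 + 51  # smallest prime greater than 2^128, as given for L=128

# Coefficients from the worked example: a0 is the secret, a1 and a2 random.
COEFFS = [
    203224855379551779909878019697389830310,
    213435878098443219772173517206501812827,
    14119443632507462021491753632290899362,
]


def f(x: int) -> int:
    """Evaluate the polynomial at x modulo P using Horner's rule."""
    result = 0
    for a in reversed(COEFFS):
        result = (result * x + a) % P
    return result


def encode_share(threshold: int, x: int, y: int, length: int = 16) -> bytes:
    """First octet: threshold in bits 4-7, x-1 in bits 0-3; then f(x)."""
    # to_bytes raises OverflowError if y does not fit, which is the case
    # in which a polynomial coefficient has to be replaced.
    return bytes([(threshold << 4) | (x - 1)]) + y.to_bytes(length, "big")


def recover(points: list[tuple[int, int]]) -> int:
    """Lagrange interpolation at x=0 over the field modulo P."""
    secret = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * xj % P
                den = den * (xj - xi) % P
        secret = (secret + yi * num * pow(den, -1, P)) % P
    return secret


shares = {x: f(x) for x in range(1, 6)}
recovered = recover([(x, shares[x]) for x in (1, 3, 5)])
print(recovered == COEFFS[0])
print(encode_share(3, 1, shares[1]).hex(" ").upper())
```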

892	5.  Variable Length UDFs

894	   Variable length UDFs are used to represent fingerprint values
895	   calculated over a content type identifier and the cryptographic
896	   digest of a content data item.  The fingerprint value MAY be
897	   specified at any integer multiple of 20 bits that provides a work
898	   factor sufficient for the intended purpose.

900	   Two types of fingerprint are specified:

902	   Digest fingerprints  Are computed with the same cryptographic digest
903	      algorithm used to calculate the digest of the content data.

905	   Message Authentication Code fingerprints  Are computed using a
906	      Message Authentication Code.

908	   For a given algorithm (and key, if required), if two UDF fingerprints
909	   are of the same content data and content type, either the fingerprint
910	   values will be the same or the initial characters of one will be
911	   exactly equal to the other.

913	5.1.  Content Digest

915	   A Content Digest Type UDF consists of the type identifier octet
916	   followed by the Binary Data Sequence value.

918	   The type identifier specifies the digest algorithm used and the
919	   compression level.  Two digest algorithms are currently specified,
920	   each with an uncompressed form and four compression levels, making
921	   a total of ten possible type identifiers.

923	   The Content Digest UDF for given content data is generated by the
924	   steps of:

926	   1.  Applying the digest algorithm to determine the Content Digest
927	       Value

929	   2.  Applying the digest algorithm to determine the Typed Content
930	       Digest Value

932	   3.  Determining the compression level from bytes 0-3 of the Typed
933	       Content Digest Value.

935	   4.  Determining the Type Identifier octet from the Digest algorithm
936	       identifier and compression level.

938	   5.  Truncating bytes 4-63 of the Typed Content Digest Value to
939	       determine the Binary Data Sequence value.

941	5.1.1.  Content Digest Value

943	   The Content Digest Value (CDV) is determined by applying the digest
944	   algorithm to the content data:

946	   CDV = H(<Data>)

948	   Where

950	      H(x) is the cryptographic digest function

952	      <Data> is the binary data.

954	5.1.2.  Typed Content Digest Value

956	   The Typed Content Digest Value (TCDV) is determined by applying the
957	   digest algorithm to the content type identifier and the CDV:

959	   TCDV = H (<Content-ID> + ':' + CDV)

961	   Where

963	      A + B represents concatenation of the binary sequences A and B.

965	      <Content-ID> is the IANA Content Type of the data in UTF8 encoding

967	   The two-step approach to calculating the Typed Content Digest Value
968	   allows an application to attempt to match a set of content data
969	   against multiple types without the need to recalculate the value of
970	   the content data digest.
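
	   A minimal Python sketch of the two-step calculation follows, using
	   SHA-2-512 from the standard hashlib module.  The function names
	   and the list of candidate content types are illustrative only.

```python
import hashlib


def content_digest_value(data: bytes) -> bytes:
    """CDV = H(<Data>) using SHA-2-512."""
    return hashlib.sha512(data).digest()


def typed_content_digest_value(content_id: str, cdv: bytes) -> bytes:
    """TCDV = H(<Content-ID> + ':' + CDV)."""
    return hashlib.sha512(content_id.encode("utf-8") + b":" + cdv).digest()


data = b"UDF Data Value"
cdv = content_digest_value(data)  # the content data is hashed only once

# Only the short second hash is repeated for each candidate content type.
candidates = ["text/plain", "application/pkix-cert"]
tcdvs = {ct: typed_content_digest_value(ct, cdv) for ct in candidates}
for content_type, tcdv in tcdvs.items():
    print(content_type, tcdv[:4].hex(" ").upper())
```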

972	5.1.3.  Compression

974	   The compression factor is determined according to the number of
975	   trailing zero bits in the first 8 bytes of the Typed Content Digest
976	   Value as follows:

978	   19 or fewer trailing zero bits  Compression factor = 0

980	   20 to 29 trailing zero bits  Compression factor = 20

982	   30 to 39 trailing zero bits  Compression factor = 30

984	   40 to 49 trailing zero bits  Compression factor = 40

986	   50 or more trailing zero bits  Compression factor = 50

988	   The least significant bits of each octet are regarded to be
989	   'trailing'.

991	   Applications MUST use compression when creating and comparing UDFs.
992	   Applications MAY support content generation techniques that search
993	   for UDF values that use a compressed representation.  Presentation of
994	   a content digest value indicating use of compression MAY be used as
995	   an indicator of 'proof of work'.

997	5.1.4.  Presentation

999	   The type identifier is determined by the algorithm and compression
1000	   factor as follows:

1002	              +---------+---------+-----------+-------------+
1003	              | Type ID | Initial | Algorithm | Compression |
1004	              +---------+---------+-----------+-------------+
1005	              | 80      | K       | SHA-3-512 | 0           |
1006	              | 81      | K       | SHA-3-512 | 20          |
1007	              | 82      | K       | SHA-3-512 | 30          |
1008	              | 83      | K       | SHA-3-512 | 40          |
1009	              | 84      | K       | SHA-3-512 | 50          |
1010	              | 96      | M       | SHA-2-512 | 0           |
1011	              | 97      | M       | SHA-2-512 | 20          |
1012	              | 98      | M       | SHA-2-512 | 30          |
1013	              | 99      | M       | SHA-2-512 | 40          |
1014	              | 100     | M       | SHA-2-512 | 50          |
1015	              +---------+---------+-----------+-------------+

1017	                                  Table 2

1019	   The Binary Data Sequence value is taken from the Typed Content Digest
1020	   Value starting at the 9^th octet and as many additional bytes as are
1021	   required to meet the presentation precision.

1023	5.1.5.  Example Encoding

1025	   In the following examples, <Content-ID> is the UTF8 encoding of the
1026	   string "text/plain" and <Data> is the UTF8 encoding of the string
1027	   "UDF Data Value"

1029	   Data =
1030	     55 44 46 20  44 61 74 61  20 56 61 6C  75 65

1032	   ContentType =
1033	     74 65 78 74  2F 70 6C 61  69 6E

1035	5.1.6.  Using SHA-2-512 Digest

1037	   H(<Data>) =
1038	     48 DA 47 CC  AB FE A4 5C  76 61 D3 21  BA 34 3E 58
1039	     10 87 2A 03  B4 02 9D AB  84 7C CE D2  22 B6 9C AB
1040	     02 38 D4 E9  1E 2F 6B 36  A0 9E ED 11  09 8A EA AC
1041	     99 D9 E0 BD  EA 47 93 15  BD 7A E9 E1  2E AD C4 15

1043	   <Content-ID> + ':' + H(<Data>) =
1044	     74 65 78 74  2F 70 6C 61  69 6E 3A 48  DA 47 CC AB
1045	     FE A4 5C 76  61 D3 21 BA  34 3E 58 10  87 2A 03 B4
1046	     02 9D AB 84  7C CE D2 22  B6 9C AB 02  38 D4 E9 1E
1047	     2F 6B 36 A0  9E ED 11 09  8A EA AC 99  D9 E0 BD EA
1048	     47 93 15 BD  7A E9 E1 2E  AD C4 15

1050	   H(<Content-ID> + ':' + H(<Data>)) =
1051	     C6 AF B7 C0  FE BE 04 E5  AE 94 E3 7B  AA 5F 1A 40
1052	     5B A3 CE CC  97 4D 55 C0  9E 61 E4 B0  EF 9C AE F9
1053	     EB 83 BB 9D  5F 0F 39 F6  5F AA 06 DC  67 2A 67 71
1054	     4F FF 8F 83  C4 55 38 36  38 AE 42 7A  82 9C 85 BB

1056	   The prefixed Binary Data Sequence is thus
1057	     60 C6 AF B7  C0 FE BE 04  E5 AE 94 E3  7B AA 5F 1A
1058	     40 5B

1060	   The 125 bit fingerprint value is MDDK-7N6A-727A-JZNO-STRX-XKS7-DJAF

1062	   This fingerprint MAY be specified with higher or lower precision as
1063	   appropriate.

1065	   100 bit precision  MDDK-7N6A-727A-JZNO-STRX

1067	   120 bit precision  MDDK-7N6A-727A-JZNO-STRX-XKS7

1069	   200 bit precision  MDDK-7N6A-727A-JZNO-STRX-XKS7-DJAF-XI6O-ZSLU-2VOA
1070	   260 bit precision  MDDK-7N6A-727A-JZNO-STRX-XKS7-DJAF-XI6O-ZSLU-2VOA-
1071	      TZQ6-JMHP-TSXP

1073	5.1.7.  Using SHA-3-512 Digest

1075	   H(<Data>) =
1076	     6D 2E CF E6  93 5A 0C FC  F2 A9 1A 49  E0 0C D8 07
1077	     A1 4E 70 AB  72 94 6E CC  BB 47 48 F1  8E 41 49 95
1078	     07 1D F3 6E  0D 0C 8B 60  39 C1 8E B4  0F 6E C8 08
1079	     65 B4 C4 45  9B A2 7E 97  74 7B BE 68  BC A8 C2 17

1081	   <Content-ID> + ':' + H(<Data>) =
1082	     74 65 78 74  2F 70 6C 61  69 6E 3A 6D  2E CF E6 93
1083	     5A 0C FC F2  A9 1A 49 E0  0C D8 07 A1  4E 70 AB 72
1084	     94 6E CC BB  47 48 F1 8E  41 49 95 07  1D F3 6E 0D
1085	     0C 8B 60 39  C1 8E B4 0F  6E C8 08 65  B4 C4 45 9B
1086	     A2 7E 97 74  7B BE 68 BC  A8 C2 17

1088	   H(<Content-ID> + ':' + H(<Data>)) =
1089	     8A 86 8A 06  1C 54 6E 7E  3F 75 5F 39  88 F9 FD 2F
1090	     8E C8 45 93  1B 80 A8 2F  29 16 7B A3  BE 21 1F 8A
1091	     75 61 88 A1  D5 7F 07 D5  9D 68 A4 2D  17 F4 4D 23
1092	     F9 E4 0B B2  1A 8D B9 F5  8D FC EC BD  01 F4 37 7C

1094	   The prefixed Binary Data Sequence is thus
1095	     50 8A 86 8A  06 1C 54 6E  7E 3F 75 5F  39 88 F9 FD
1096	     2F 8E

1098	   The 125 bit fingerprint value is KCFI-NCQG-DRKG-47R7-OVPT-TCHZ-7UXY

1100	5.1.8.  Using SHA-2-512 Digest with Compression

1102	   The content data "UDF Compressed Document 4187123" produces a UDF
1103	   Content Digest SHA-2-512 binary value with 20 trailing zeros and is
1104	   therefore presented using compressed presentation:

1106	   Data =
1107	     55 44 46 20  43 6F 6D 70  72 65 73 73  65 64 20 44
1108	     6F 63 75 6D  65 6E 74 20  34 31 38 37  31 32 33

1110	   The UTF8 Content Digest is given as:

1112	   H(<Data>) =
1113	     36 21 FA 2A  C5 D8 62 5C  2D 0B 45 FB  65 93 FC 69
1114	     C1 ED F7 00  AE 6F E3 3D  38 13 FE AB  76 AA 74 13
1115	     6D 5A 2B 20  DE D6 A5 CF  6C 04 E6 56  3F F3 C0 C7
1116	     C4 1D 3F 43  DD DC F1 A5  67 A7 E0 67  9A B0 C6 B7

1118	   <Content-ID> + ':' + H(<Data>) =
1119	     74 65 78 74  2F 70 6C 61  69 6E 3A 36  21 FA 2A C5
1120	     D8 62 5C 2D  0B 45 FB 65  93 FC 69 C1  ED F7 00 AE
1121	     6F E3 3D 38  13 FE AB 76  AA 74 13 6D  5A 2B 20 DE
1122	     D6 A5 CF 6C  04 E6 56 3F  F3 C0 C7 C4  1D 3F 43 DD
1123	     DC F1 A5 67  A7 E0 67 9A  B0 C6 B7

1125	   H(<Content-ID> + ':' + H(<Data>)) =
1126	     8E 14 D9 19  4E D6 02 12  C3 30 A7 BB  5F C7 17 6D
1127	     AE 9A 56 7C  A8 2A 23 1F  96 75 ED 53  10 EC E8 F2
1128	     60 14 24 D0  C8 BC 55 3D  C0 70 F7 5E  86 38 1A 0B
1129	     CB 55 9C B2  87 81 27 FF  3C EC E2 F0  90 A0 00 00

1131	   The prefixed Binary Data Sequence is thus
1132	     61 8E 14 D9  19 4E D6 02  12 C3 30 A7  BB 5F C7 17
1133	     6D AE

1135	   The 125 bit fingerprint value is MGHB-JWIZ-J3LA-EEWD-GCT3-WX6H-C5W2
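
	   The derivation of the type identifier and the prefixed Binary Data
	   Sequence in this example can be sketched in Python as follows.  The
	   sketch follows the worked example rather than any normative text:
	   it assumes that zero bits are counted from the end of the Typed
	   Content Digest Value and that the octets following the type
	   identifier are taken from its start.  All function names are
	   hypothetical.

```python
BASE_TYPE_ID = {"SHA-3-512": 80, "SHA-2-512": 96}  # Table 2, compression 0


def trailing_zero_bits(tcdv: bytes) -> int:
    """Count zero bits at the least significant end of the digest."""
    n = int.from_bytes(tcdv, "big")
    return len(tcdv) * 8 if n == 0 else (n & -n).bit_length() - 1


def compression_factor(tcdv: bytes) -> int:
    """Map the trailing zero bit count onto 0, 20, 30, 40 or 50."""
    zeros = trailing_zero_bits(tcdv)
    for factor in (50, 40, 30, 20):
        if zeros >= factor:
            return factor
    return 0


def type_identifier(algorithm: str, tcdv: bytes) -> int:
    """Type ID = uncompressed ID for the algorithm plus compression step."""
    step = {0: 0, 20: 1, 30: 2, 40: 3, 50: 4}[compression_factor(tcdv)]
    return BASE_TYPE_ID[algorithm] + step


def binary_data_sequence(algorithm: str, tcdv: bytes, octets: int) -> bytes:
    """Type identifier octet followed by the leading digest octets."""
    return bytes([type_identifier(algorithm, tcdv)]) + tcdv[:octets]


# Typed Content Digest Value from the example above.
tcdv = bytes.fromhex(
    "8E 14 D9 19 4E D6 02 12 C3 30 A7 BB 5F C7 17 6D"
    "AE 9A 56 7C A8 2A 23 1F 96 75 ED 53 10 EC E8 F2"
    "60 14 24 D0 C8 BC 55 3D C0 70 F7 5E 86 38 1A 0B"
    "CB 55 9C B2 87 81 27 FF 3C EC E2 F0 90 A0 00 00"
)
print(trailing_zero_bits(tcdv), compression_factor(tcdv))
print(binary_data_sequence("SHA-2-512", tcdv, 17).hex(" ").upper())
```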

1137	5.1.9.  Using SHA-3-512 Digest with Compression

1139	   The content data "UDF Compressed Document 774665" produces a UDF
1140	   Content Digest SHA-3-512 binary value with 20 trailing zeros and is
1141	   therefore presented using compressed presentation:

1143	   Data =
1144	     55 44 46 20  43 6F 6D 70  72 65 73 73  65 64 20 44
1145	     6F 63 75 6D  65 6E 74 20  37 37 34 36  36 35

1147	   The UTF8 SHA-3-512 Content Digest is KEJI-Y225-BDUG-XX22-MXKE-5ITF-
1148	   YVYM

1150	5.2.  Authenticator UDF

1152	   An Authenticator Type UDF consists of the type identifier octet
1153	   followed by the Binary Data Sequence value.

1155	   The type identifier specifies the digest and Message Authentication
1156	   Code algorithm.  Two algorithm suites are currently specified.  Use
1157	   of compression is not supported.

1159	   The Authenticator UDF for given content data and key is generated by
1160	   the steps of:

1162	   1.  Applying the digest algorithm to determine the Content Digest
1163	       Value

1165	   2.  Applying the MAC algorithm to determine the Authentication value

1167	   3.  Determining the Type Identifier octet from the digest and MAC
1168	       algorithm identifier.

1170	   4.  Truncating the Authentication value to determine the Binary Data
1171	       Sequence value.

1173	   The key used to calculate an Authenticator Type UDF is always a
1174	   Unicode string.  If use of a binary value as a key is required, the
1175	   value MUST be converted to a string format first, for example by
1176	   conversion to an Encryption/Authentication Type UDF.

1178	5.2.1.  Content Digest Value

1180	   The Content Digest Value (CDV) is determined in the exact same
1181	   fashion as for a Content Digest UDF by applying the digest algorithm
1182	   to the content data:

1184	   CDV = H(<Data>)

1186	   Where

1188	      H(x) is the cryptographic digest function

1190	      <Data> is the binary data.

1192	5.2.2.  Authentication Value

1194	   The Authentication Value (AV) is determined by applying the MAC
1195	   algorithm to the content type identifier and the CDV:

1197	   AV = MAC (<OKM>, (<Content-ID> + ':' + CDV))

1199	   Where

1201	      <OKM> is the authentication key as specified below

1203	      MAC( <Key>, <data>) is the result of applying the Message
1204	      Authentication Code algorithm with key <Key> to data <data>

1206	   The value <OKM> is calculated as follows:

1208	   IKM = UTF8 (Key)
1209	   PRK = MAC (UTF8 ("KeyedUDFMaster"), IKM)
1210	   OKM = HKDF-Expand(PRK, UTF8 ("KeyedUDFExpand"), HashLen)

1212	   Where the function UTF8(string) converts a string to the binary UTF8
1213	   representation, HKDF-Expand is as defined in [RFC5869] and the
1214	   function MAC(k,m) is the HMAC function formed from the specified hash
1215	   H(m) as specified in [RFC2104].

1217	   Keyed UDFs are typically used in circumstances where user interaction
1218	   requires a cryptographic commitment type functionality.

1220	   In the following example, <Content-ID> is the UTF8 encoding of the
1221	   string "text/plain" and <Data> is the UTF8 encoding of the string
1222	   "Konrad is the traitor".  The randomly chosen key is NDD7-6CMX-H2FW-
1223	   ISAL-K4VB-DQ3E-PEDM.

1225	   Data =
1226	     4B 6F 6E 72  61 64 20 69  73 20 74 68  65 20 74 72
1227	     61 69 74 6F  72

1229	   ContentType =
1230	     74 65 78 74  2F 70 6C 61  69 6E

1232	   Key =
1233	     4E 44 44 37  2D 36 43 4D  58 2D 48 32  46 57 2D 49
1234	     53 41 4C 2D  4B 34 56 42  2D 44 51 33  45 2D 50 45
1235	     44 4D

1237	   Processing is performed in the same manner as an unkeyed fingerprint
1238	   except that compression is never used:

1240	   H(<Data>) =
1241	     93 FC DA F9  FA FD 1E 26  50 26 C3 C1  28 43 40 73
1242	     D8 BC 3D 62  87 73 2B 73  B8 EC 93 B6  DE 80 FF DA
1243	     70 0A D1 CE  E8 F4 36 68  EF 4E 71 63  41 53 91 5C
1244	     CE 8C 5C CE  C7 9A 46 94  6A 35 79 F9  33 70 85 01

1246	   <Content-ID> + ':' + H(<Data>) =
1247	     74 65 78 74  2F 70 6C 61  69 6E 3A 93  FC DA F9 FA
1248	     FD 1E 26 50  26 C3 C1 28  43 40 73 D8  BC 3D 62 87
1249	     73 2B 73 B8  EC 93 B6 DE  80 FF DA 70  0A D1 CE E8
1250	     F4 36 68 EF  4E 71 63 41  53 91 5C CE  8C 5C CE C7
1251	     9A 46 94 6A  35 79 F9 33  70 85 01

1253	   PRK(Key) =
1254	     77 D3 0A 08  39 BD 9D C0  97 44 DA 33  15 0A 42 5E
1255	     CD 17 80 03  B3 CF CC 89  7A C7 84 12  B4 51 5B 25
1256	     DC 26 F5 E1  1B 20 F3 89  2E 9A 1A 7B  0E 73 23 39
1257	     0E C3 4C EF  2D 40 DA 05  B4 70 C6 1C  82 C1 49 33

1259	   HKDF(Key) =
1260	     BF A9 B4 58  9C 1D 68 D7  9A B7 11 F6  C8 98 59 14
1261	     20 D7 82 67  C5 84 22 E5  A0 F9 93 52  B1 C3 87 EB
1262	     05 06 CB C4  E4 D6 E6 EE  1F F0 D4 7A  97 68 5E CE
1263	     28 1C CA AF  D8 B5 D1 24  4A 71 EC E3  AC B5 D2 04

1265	   MAC(<key>, <Content-ID> + ':' + H(<Data>)) =
1266	     4C C3 7F D3  F9 9E 52 CF  07 90 74 53  84 65 95 BC
1267	     1A 2B A5 D1  68 9D 05 6D  06 C5 CA BF  17 CB E0 49
1268	     95 39 57 08  79 C4 E5 49  D3 3A 59 A3  32 05 45 A6
1269	     30 26 25 AE  8A F4 47 C6  1F B5 33 7F  AD 69 A6 30

1271	   The prefixed Binary Data Sequence is thus
1272	     00 4C C3 7F  D3 F9 9E 52  CF 07 90 74  53 84 65 95
1273	     BC 1A

1275	   The 125 bit fingerprint value is ABGM-G76T-7GPF-FTYH-SB2F-HBDF-SW6B
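
	   The key derivation and authentication steps of this example can be
	   sketched in Python with the standard hmac and hashlib modules.  The
	   function names are hypothetical, and HKDF-Expand is written out
	   from its definition in RFC 5869.  Whether the sketch reproduces
	   the octets listed above depends on details such as the exact key
	   string, which are assumptions here.

```python
import hashlib
import hmac

HASH_LEN = 64  # octets of SHA-2-512 output


def mac(key: bytes, message: bytes) -> bytes:
    """HMAC formed from SHA-2-512."""
    return hmac.new(key, message, hashlib.sha512).digest()


def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """HKDF-Expand as defined in RFC 5869, using HMAC-SHA-2-512."""
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = mac(prk, block + info + bytes([counter]))
        okm += block
        counter += 1
    return okm[:length]


def authentication_value(key: str, content_id: str, data: bytes) -> bytes:
    """AV = MAC(OKM, <Content-ID> + ':' + CDV)."""
    ikm = key.encode("utf-8")
    prk = mac(b"KeyedUDFMaster", ikm)
    okm = hkdf_expand(prk, b"KeyedUDFExpand", HASH_LEN)
    cdv = hashlib.sha512(data).digest()
    return mac(okm, content_id.encode("utf-8") + b":" + cdv)


av = authentication_value("NDD7-6CMX-H2FW-ISAL-K4VB-DQ3E-PEDM",
                          "text/plain", b"Konrad is the traitor")
# Type identifier 0 (HMAC-SHA-2-512) is prefixed before presentation.
prefixed = bytes([0]) + av[:17]
print(prefixed.hex(" ").upper())
```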

1277	5.3.  Content Type Values

1279	   While a UDF fingerprint MAY be used to identify any form of static
1280	   data, the use of a UDF fingerprint to identify a public key signature
1281	   key provides a level of indirection and thus the ability to identify
1282	   dynamic data.  The content types used to identify public keys are
1283	   thus of particular interest.

1285	   As described in the security considerations section, the use of
1286	   fingerprints to identify a bare public key and the use of
1287	   fingerprints to identify a public key and associated security policy
1288	   information are very different.

1290	5.3.1.  PKIX Certificates and Keys

1292	   UDF fingerprints MAY be used to identify PKIX certificates, CRLs and
1293	   public keys in the ASN.1 encoding used in PKIX certificates.

1295	   Since PKIX certificates and CRLs contain security policy information,
1296	   UDF fingerprints used to identify certificates or CRLs SHOULD be
1297	   presented with a minimum of 200 bits of precision.  PKIX applications
1298	   MUST NOT accept UDF fingerprints specified with less than 200 bits of
1299	   precision for purposes of identifying trust anchors.

1301	   PKIX certificates, keys and related content data are identified by
1302	   the following content types:

1304	   application/pkix-cert  A PKIX Certificate

1306	   application/pkix-crl  A PKIX CRL

1308	   application/pkix-keyinfo  The KeyInfo structure defined in the PKIX
1309	      certificate specification.

1311	5.3.2.  OpenPGP Key

1313	   OpenPGPv5 keys and key set content data are identified by the
1314	   following content type:

1316	   application/pgp-keys  An OpenPGP key set.

1318	5.3.3.  DNSSEC

1320	   DNSSEC record data consists of DNS records which are identified by
1321	   the following content type:

1323	   application/dns  A DNS resource record in binary format

1325	6.  UDF URIs

1327	   The UDF URI scheme describes a means of constructing URIs from a UDF
1328	   value.

1330	   Two forms of UDF URI are specified, Name and Locator.  In both cases
1331	   the URI MUST specify the scheme type "UDF", and a UDF fingerprint and
1332	   MAY specify a query identifier and/or a fragment identifier.

1334	   By definition a Locator form URI contains an authority field which
1335	   MUST be a DNS domain name.  The use of IP address forms for this
1336	   purpose is not permitted.

1338	   Name Form URIs allow static content data to be identified without
1339	   specifying the means by which the content data may be retrieved.
1340	   Locator form URIs allow static content data or dynamic network
1341	   resources to be identified together with the means of retrieval.

1343	   The syntax of a UDF URI is a subset of the generic URI syntax
1344	   specified in [RFC3986].  The use of userinfo and port numbers is not
1345	   supported and the path part of the URI is a UDF in Base32
1346	   presentation.

1348	   URI           = "UDF:" udf [ "?" query ] [ "#" fragment ]

1350	   udf           = name-form / locator-form

1352	   name-form     = udf-value
1353	   locator-form  = "//" authority "/" udf-value

1355	   authority     = host
1356	   host          = reg-name
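
   As a rough illustration of the grammar above, the following Python
   sketch splits a UDF URI into its parts.  The regular expression is a
   simplified stand-in for the [RFC3986] productions, the names UdfUri
   and parse_udf_uri are hypothetical, the fragment delimiter is assumed
   to be "#" as in the generic URI syntax, and the UDF values shown in
   any call are placeholders rather than real fingerprints.

```python
import re
from typing import NamedTuple, Optional


class UdfUri(NamedTuple):
    form: str                 # "name" or "locator"
    authority: Optional[str]  # DNS name (locator form only)
    udf: str                  # UDF value in Base32 presentation
    query: Optional[str]
    fragment: Optional[str]


# Simplified pattern; an assumption, not the normative grammar.
_URI = re.compile(
    r"^UDF:(?://(?P<authority>[^/?#]*)/)?(?P<udf>[^/?#]+)"
    r"(?:\?(?P<query>[^#]*))?(?:#(?P<fragment>.*))?$",
    re.IGNORECASE,  # URI scheme names are case-insensitive
)
_IPV4 = re.compile(r"^\d{1,3}(?:\.\d{1,3}){3}$")


def parse_udf_uri(uri: str) -> UdfUri:
    """Split a UDF URI into name-form or locator-form components."""
    m = _URI.match(uri)
    if m is None:
        raise ValueError("not a UDF URI")
    authority = m.group("authority")
    if authority is not None:
        # userinfo, port numbers and IP address forms are not permitted
        if (not authority or "@" in authority or ":" in authority
                or authority.startswith("[") or _IPV4.match(authority)):
            raise ValueError("authority must be a plain DNS name")
    form = "name" if authority is None else "locator"
    return UdfUri(form, authority, m.group("udf"),
                  m.group("query"), m.group("fragment"))
```

   A URI without "//" is reported as name form; one with an authority
   is reported as locator form, and IP literals or ports are rejected.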

1358	6.1.  Name form

1360	   Name form UDF URIs provide a means of presenting a UDF value in a
1361	   context in which a URI form of a name is required without providing a
1362	   means of resolution.

1364	   Adding the UDF scheme prefix to a UDF fingerprint does not change the
1365	   semantics of the fingerprint itself.  The semantics of the name
1366	   result from the context in which it is used.

1368	   For example, a UDF value of any type MAY be used to give a unique
1369	   targetNamespace value in an XML Schema [XMLSchema].

1371	6.2.  Locator form

1373	   The locator form of an unkeyed UDF URI is resolved by the following
1374	   steps:

1376	   o  Use DNS Web service discovery to determine the Web Service
1377	      Endpoint.

1379	   o  Determine the content identifier from the source URI.

1381	   o  Append the content identifier to the Web Service Endpoint as a
1382	      suffix to form the target URI.

1384	   o  Retrieve content from the Web Service Endpoint by means of a GET
1385	      method.

1387	   o  Perform post processing as specified by the UDF type.

1389	6.2.1.  DNS Web service discovery

1391	   DNS Web Discovery is performed as specified in
1392	   [draft-hallambaker-web-service-discovery] for the service mmm-udf and
1393	   domain name specified in the URI.  For a full description of the
1394	   discovery mechanism, consult the referenced specification.

1396	   The use of DNS Web Discovery permits service providers to make full
1397	   use of the load balancing and service description capabilities
1398	   afforded by use of DNS SRV and TXT records in accordance with the
1399	   approach described in [RFC6763].

1401	   If no SRV or TXT records are specified, DNS Web Discovery specifies
1402	   that the Web Service Endpoint be the Well Known Service [RFC5785]
1403	   with the prefix /.well-known/srv/mmm-udf.

1405	6.2.2.  Content Identifier

1407	   For all UDF types other than Secret Share, the Content Identifier
1408	   value is the UDF SHA-2-512 Content Digest of the canonical form of
1409	   the UDF specified in the source URI presented at twice the precision
1410	   to a maximum of 440 bits.

1412	   If the UDF is of type Secret Share, the shared secret MUST be
1413	   recovered before the content identifier can be resolved.  The shared
1414	   secret is then expressed as a UDF of type Encryption/Authentication
1415	   and the Content Identifier determined as for an Encryption/
1416	   Authentication type UDF.

1418	6.2.3.  Target URI

1420	   The target URI is formed by appending a slash separator '/' and the
1421	   Content Identifier value to the Web Service Endpoint.

1423	   Since the path portion of a URI is case sensitive, the UDF value MUST
1424	   be specified in upper case and MUST include separator marks.
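
   The construction of the target URI can be sketched in Python as
   follows.  This is a minimal sketch under stated assumptions: the
   "https" scheme, the use of "-" as the separator mark, and the
   function names are not taken from this section, and the Content
   Identifier is assumed to have been computed already.

```python
def well_known_endpoint(domain: str) -> str:
    """Fallback Web Service Endpoint when no SRV or TXT records exist.

    The https scheme is an assumption made for this sketch.
    """
    return f"https://{domain}/.well-known/srv/mmm-udf"


def target_uri(endpoint: str, content_identifier: str) -> str:
    """Append '/' and the upper-cased Content Identifier to the endpoint."""
    cid = content_identifier.upper()
    if "-" not in cid:  # '-' as the separator mark is an assumption
        raise ValueError("content identifier must include separator marks")
    return endpoint.rstrip("/") + "/" + cid
```

   Upper-casing the identifier rather than rejecting lower case input
   is a design choice of the sketch; a strict resolver might refuse it.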

1426	6.2.4.  Postprocessing

1428	   After retrieving the content data, the resolver MUST perform post
1429	   processing as indicated by the content type:

1431	   Nonce  No additional post processing is required.

1433	   Content Digest  The resolver MUST verify that the content returned
1434	      matches the UDF fingerprint value.

1436	   Authenticator  The resolver MUST verify that the content returned
1437	      matches the UDF fingerprint value.

1439	   Encryption/Authentication  The content data returned is decrypted and
1440	      authenticated using the key specified in the UDF value as the
1441	      initial keying material (see below).

1443	   Secret Share (set)  The content data returned is decrypted and
1444	      authenticated using the shared secret as the initial keying
1445	      material (see below).

1447	6.2.5.  Decryption and Authentication

1449	   The steps performed to decode cryptographically enhanced content data
1450	   depend on the content type specified in the returned content.  Two
1451	   formats are currently supported:

1453	   o  DARE Message format as specified in
1454	      [draft-hallambaker-dare-message]

1456	   o  Cryptographic Message Syntax (CMS) Symmetric Key Package as
1457	      specified in [RFC6031]

1459	6.2.6.  QR Presentation

1461	   Encoding of a UDF URI as a QR code requires only the characters in
1462	   alphanumeric encoding, thus achieving compactness with minimal
1463	   overhead.

1465	7.  Strong Internet Names

1467	   A Strong Internet Name is an Internet address that is bound to a
1468	   policy governing interpretation of that address by means of a Content
1469	   Digest type UDF of the policy expressed as a UDF prefixed DNS label
1470	   within the address itself.

1472	   The Reserved LDH labels as defined in [RFC5890] that begin with the
1473	   prefix mm-- are reserved for use as Strong Internet Names.  The
1474	   characters following the prefix are a Content Digest type UDF in
1475	   Base32 presentation.

1477	   Since DNS labels are limited to 63 characters, the presentation of
1478	   the SIN itself is limited to 59 characters and thus 240 bits of
1479	   precision.
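
   A resolver might locate the Strong Internet Name label along the
   following lines.  This Python sketch is illustrative only: the
   function name is hypothetical, and returning the UDF in upper case
   assumes that the label is compared case-insensitively, as DNS labels
   usually are.

```python
SIN_PREFIX = "mm--"
MAX_LABEL_LENGTH = 63  # DNS label limit; leaves 59 characters for the UDF


def find_sin_udf(address: str):
    """Return the UDF carried by the first mm-- label, or None."""
    for label in address.split("."):
        if label.lower().startswith(SIN_PREFIX):
            if len(label) > MAX_LABEL_LENGTH:
                raise ValueError("DNS label exceeds 63 characters")
            return label[len(SIN_PREFIX):].upper()
    return None
```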

1481	8.  Security Considerations

1483	8.1.  Confidentiality

1485	   An encrypted locator is a bearer token.

1487	8.2.  Availability

1489	   Corruption of a part of a shared secret may prevent recovery.

1491	8.3.  Integrity

1493	   Shared secret parts do not contain context information to specify
1494	   which secret they relate to.

1496	8.4.  Work Factor and Precision

1498	   A given UDF data object has a single fingerprint value that may be
1499	   presented at different precisions.  The shortest legitimate precision
1500	   with which a UDF fingerprint may be presented has 96 significant bits.

1502	   A UDF fingerprint presents the same work factor as any other
1503	   cryptographic digest function.  The difficulty of finding a second
1504	   data item that matches a given fingerprint is 2^n and the difficulty
1505	   of finding two data items that have the same fingerprint is 2^(n/2),
1506	   where n is the precision of the fingerprint.

1508	   For the algorithms specified in this document, n = 512 and thus the
1509	   work factor for finding collisions is 2^256, a value that is
1510	   generally considered to be computationally infeasible.
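
   The relationship between precision and work factor can be tabulated
   with a few lines of Python.  The sketch below simply restates the
   2^n and 2^(n/2) estimates given above for some sample precisions;
   the function name and the choice of sample values are illustrative.

```python
from typing import Tuple


def work_factor_exponents(precision_bits: int) -> Tuple[int, int]:
    """Return (second pre-image, collision) work factors as exponents of 2."""
    return precision_bits, precision_bits // 2


for n in (96, 200, 512):
    second_preimage, collision = work_factor_exponents(n)
    print(f"n={n}: second pre-image 2^{second_preimage}, "
          f"collision 2^{collision}")
```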

1512	   Since the use of 512 bit fingerprints is impractical in the type of
1513	   applications where fingerprints are generally used, truncation is a
1514	   practical necessity.  The longer a fingerprint is, the less likely it
1515	   is that a user will check every character.  It is therefore important
1516	   to consider carefully whether the security of an application depends
1517	   on second pre-image resistance or collision resistance.

1519	   In most fingerprint applications, such as the use of fingerprints to
1520	   identify public keys, the fact that a malicious party might generate
1521	   two keys that have the same fingerprint value is a minor concern.

1523	   Combined with a flawed protocol architecture, such a vulnerability
1524	   may permit an attacker to construct a document such that the
1525	   signature will be accepted as valid by some parties but not by
1526	   others.

1528	   For example, Alice generates keypairs until two are generated that
1529	   have the same 100 bit UDF presentation (typically 2^48 attempts).
1530	   She registers one keypair with a merchant and the other with her
1531	   bank.  This allows Alice to create a payment instrument that will be
1532	   accepted as valid by one and rejected by the other.

1534	   The ability to generate two PKIX certificates with the same
1535	   fingerprint and different certificate attributes raises very
1536	   different and more serious security concerns.  For example, an
1537	   attacker might generate two certificates with the same key and
1538	   different use constraints.  This might allow an attacker to present a
1539	   highly constrained certificate that does not present a security risk
1540	   to an application for purposes of gaining approval and an
1541	   unconstrained certificate to request a malicious action.

1543	   In general, any use of fingerprints to identify data that has
1544	   security policy semantics requires the risk of collision attacks to
1545	   be considered.  For this reason, short, 'user friendly' fingerprint
1546	   presentations (less than 200 bits) SHOULD only be used for public
1547	   key values.

1549	8.5.  Semantic Substitution

1551	   Many applications record the fact that a data item is trusted; rather
1552	   fewer record the circumstances in which the data item is trusted.
1553	   This results in a semantic substitution vulnerability which an
1554	   attacker may exploit by presenting the trusted data item in the wrong
1555	   context.

1557	   The UDF format provides protection against high level semantic
1558	   substitution attacks by incorporating the content type into the input
1559	   to the outermost fingerprint digest function.  The work factor for
1560	   generating a UDF fingerprint that is valid in both contexts is thus
1561	   the same as the work factor for finding a second preimage in the
1562	   digest function (2^512 for the specified digest algorithms).

1564	   It is thus infeasible to generate a data item such that some
1565	   applications will interpret it as a PKIX key and others will accept
1566	   it as an OpenPGP key.  While attempting to parse a PKIX key as an
1567	   OpenPGP key is virtually certain to fail to return the correct key
1568	   parameters, it cannot be assumed that the attempt is guaranteed to
1569	   fail with an error message.

1571	   The UDF format does not provide protection against semantic
1572	   substitution attacks that do not affect the content type.

1574	8.6.  QR Code Scanning

1576	   The act of scanning a QR code SHOULD be considered equivalent to
1577	   clicking on an unlabeled hypertext link.  Since QR codes are scanned
1578	   in many different contexts, the mere act of scanning a QR code MUST
1579	   NOT be interpreted as constituting an affirmative acceptance of terms
1580	   or conditions or as creating an electronic signature.

1582	   If such semantics are required in the context of an application,
1583	   these MUST be established by secondary user actions made subsequent
1584	   to the scanning of the QR code.

1586	   There is a risk that use of QR codes to automate processes such as
1587	   payment will lead to abusive practices such as presentation of
1588	   fraudulent invoices for goods not ordered or delivered.  It is
1589	   therefore important to ensure that such requests are subject to
1590	   adequate accountability controls.

1592	9.  IANA Considerations

1594	   Registrations are requested in the following registries:

1596	   o  Service Name and Transport Protocol Port Number

1598	   o  well-known URI registry

1600	   o  Uniform Resource Identifier (URI) Schemes

1602	   o  Media Types

1604	   In addition, the creation of the following registry is requested:
1605	   Uniform Data Fingerprint Type Identifier Registry.

1607	9.1.  Protocol Service Name

1609	   The following registration is requested in the Service Name and
1610	   Transport Protocol Port Number Registry in accordance with [RFC6355]

1612	   Service Name (REQUIRED)  mmm-udf

1614	   Transport Protocol(s) (REQUIRED)  TCP

1616	   Assignee (REQUIRED)  Phillip Hallam-Baker, phill@hallambaker.com

1618	   Contact (REQUIRED)  Phillip Hallam-Baker, phill@hallambaker.com

1619	   Description (REQUIRED)  mmm-udf is a Web Service protocol that
1620	      resolves Mathematical Mesh Uniform Data Fingerprints (UDF) to
1621	      resources.  The mmm-udf service name is used in service discovery
1622	      to identify a Web Service endpoint to perform resolution of a UDF
1623	      presented in URI locator form.

1625	   Reference (REQUIRED)  [This document]

1627	   Port Number (OPTIONAL)  None

1629	   Service Code (REQUIRED for DCCP only)  None

1631	   Known Unauthorized Uses (OPTIONAL)  None

1633	   Assignment Notes (OPTIONAL)  None

1635	9.2.  Well Known

1637	   The following registration is requested in the well-known URI
1638	   registry in accordance with [RFC5785]

1640	   URI suffix  srv/mmm-udf

1642	   Change controller  Phillip Hallam-Baker, phill@hallambaker.com

1644	   Specification document(s):  [This document]

1646	   Related information

1648	   [draft-hallambaker-web-service-discovery]

1650	9.3.  URI Registration

1652	   The following registration is requested in the Uniform Resource
1653	   Identifier (URI) Schemes registry in accordance with [RFC7595]

1655	   Scheme name:  UDF

1657	   Status:  Provisional

1659	   Applications/protocols that use this scheme name:  Mathematical Mesh
1660	      Service protocols (mmm)

1662	   Contact:  Phillip Hallam-Baker mailto:phill@hallambaker.com

1664	   Change controller:  Phillip Hallam-Baker

1666	   References:  [This document]

1668	9.4.  Media Types Registrations

1670	9.4.1.  Media Type: application/pkix-keyinfo

1672	   Type name:  application

1674	   Subtype name:  pkix-keyinfo

1676	   Required parameters:  None

1678	   Optional parameters:  None

1680	   Encoding considerations:  None

1682	   Security considerations:  Described in [This]

1684	   Interoperability considerations:  None

1686	   Published specification:  [This]

1688	   Applications that use this media type:  Uniform Data Fingerprint

1690	   Fragment identifier considerations:  None

1692	   Additional information:  Deprecated alias names for this type: None

1694	      Magic number(s): None

1696	      File extension(s): None

1698	      Macintosh file type code(s): None

1700	   Person & email address to contact for further information:
1701	      Phillip Hallam-Baker <phill@hallambaker.com>

1703	   Intended usage:  Content type identifier to be used in constructing
1704	      UDF Content Digests and Authenticators and related cryptographic
1705	      purposes.

1707	   Restrictions on usage:  None

1709	   Author:  Phillip Hallam-Baker

1711	   Change controller:  Phillip Hallam-Baker

1713	   Provisional registration? (standards tree only):  Yes

1715	9.4.2.  Media Type: application/udf-encryption

1717	   Type name:  application

1719	   Subtype name:  udf-encryption

1721	   Required parameters:  None

1723	   Optional parameters:  None

1725	   Encoding considerations:  None

1727	   Security considerations:  Described in [This]

1729	   Interoperability considerations:  None

1731	   Published specification:  [This]

1733	   Applications that use this media type:  Uniform Data Fingerprint

1735	   Fragment identifier considerations:  None

1737	   Additional information:  Deprecated alias names for this type: None

1739	      Magic number(s): None

1741	      File extension(s): None

1743	      Macintosh file type code(s): None

1745	   Person & email address to contact for further information:
1746	      Phillip Hallam-Baker <phill@hallambaker.com>

1748	   Intended usage:  Content type identifier to be used in constructing
1749	      UDF Content Digests and Authenticators and related cryptographic
1750	      purposes.

1752	   Restrictions on usage:  None

1754	   Author:  Phillip Hallam-Baker

1756	   Change controller:  Phillip Hallam-Baker

1758	   Provisional registration? (standards tree only):  Yes

1760	9.4.3.  Media Type: application/udf-secret

1762	   Type name:  application

1764	   Subtype name:  udf-secret

1766	   Required parameters:  None

1768	   Optional parameters:  None

1770	   Encoding considerations:  None

1772	   Security considerations:  Described in [This]

1774	   Interoperability considerations:  None

1776	   Published specification:  [This]

1778	   Applications that use this media type:  Uniform Data Fingerprint

1780	   Fragment identifier considerations:  None

1782	   Additional information:  Deprecated alias names for this type: None

1784	      Magic number(s): None

1786	      File extension(s): None

1788	      Macintosh file type code(s): None

1790	   Person & email address to contact for further information:
1791	      Phillip Hallam-Baker <phill@hallambaker.com>

1793	   Intended usage:  Content type identifier to be used in constructing
1794	      UDF Content Digests and Authenticators and related cryptographic
1795	      purposes.

1797	   Restrictions on usage:  None

1799	   Author:  Phillip Hallam-Baker

1801	   Change controller:  Phillip Hallam-Baker

1803	   Provisional registration? (standards tree only):  Yes

1805	9.5.  Uniform Data Fingerprint Type Identifier Registry

1807	   This document describes a new extensible data format employing fixed
1808	   length version identifiers for UDF types.

1810	9.5.1.  The name of the registry

1812	   Uniform Data Fingerprint Type Identifier Registry

1814	9.5.2.  Required information for registrations

1816	   Registrants must specify the Type identifier code(s) requested,
1817	   description and RFC number for the corresponding standards action
1818	   document.

1820	   The standards document must specify the means of generating and
1821	   interpreting the UDF Data Sequence Value and the purpose(s) for which
1822	   it is proposed.

1824	   Since the initial letter of the Base32 presentation provides a
1825	   mnemonic function in UDFs, the standards document must explain why
1826	   the proposed Type Identifier and associated initial letter are
1827	   appropriate.  In cases where a new initial letter is to be created,
1828	   there must be an explanation of why this is appropriate.  If an
1829	   existing initial letter is to be reused, there must be an
1830	   explanation of why this is appropriate and/or acceptable.

1832	9.5.3.  Applicable registration policy

1834	   Due to the intended field of use (human data entry), the code space
1835	   is severely constrained.  Accordingly, it is intended that code point
1836	   registrations be as infrequent as possible.

1838	   Registration of new digest algorithms is strongly discouraged and
1839	   should not occur unless (1) there is a known security vulnerability
1840	   in one of the two schemes specified in the original assignment and
1841	   (2) the proposed algorithm has been subjected to rigorous peer
1842	   review, preferably in the form of an open, international competition
1843	   and (3) the proposed algorithm has been adopted as a preferred
1844	   algorithm for use in IETF protocols.

1846	   Accordingly, the applicable registration policy is Standards Action.

1848	9.5.4.  Size, format, and syntax of registry entries

1850	   Each registry entry consists of a single byte code.

1852	9.5.5.  Initial assignments and reservations

1854	   The following entries should be added to the registry as initial
1855	   assignments:

1857	   Code  Description                       Reference
1858	   ----  --------------------------------  ---------------
1859	   00    HMAC and SHA-2-512                [This document]
1860	   32    HKDF-AES-512                      [This document]
1861	   80    SHA-3-512                         [This document]
1862	   81    SHA-3-512 with 20 trailing zeros  [This document]
1863	   82    SHA-3-512 with 30 trailing zeros  [This document]
1864	   83    SHA-3-512 with 40 trailing zeros  [This document]
1865	   84    SHA-3-512 with 50 trailing zeros  [This document]
1866	   96    SHA-2-512                         [This document]
1867	   97    SHA-2-512 with 20 trailing zeros  [This document]
1868	   98    SHA-2-512 with 30 trailing zeros  [This document]
1869	   99    SHA-2-512 with 40 trailing zeros  [This document]
1870	   100   SHA-2-512 with 50 trailing zeros  [This document]
1871	   104   Random nonce                      [This document]
1872	   144   Shamir Secret Share               [This document]

1874	10.  Appendix A: Prime Values for Secret Sharing

1876	   The following are the prime values to be used for sharing secrets of
1877	   up to 512 bits.

1879	   If it is necessary to share larger secrets, the corresponding prime
1880	   may be found by choosing a value (2^32)^n that is larger than the
1881	   secret to be encoded and determining the smallest prime that is
1882	   greater than that value.

1884	                 +----------------+------------------------+
1885	                 | Number of bits | Offset = Prime_n - 2^n |
1886	                 +----------------+------------------------+
1887	                 | 32             | 15                     |
1888	                 | 64             | 13                     |
1889	                 | 96             | 61                     |
1890	                 | 128            | 51                     |
1891	                 | 160            | 7                      |
1892	                 | 192            | 133                    |
1893	                 | 224            | 735                    |
1894	                 | 256            | 297                    |
1895	                 | 288            | 127                    |
1896	                 | 320            | 27                     |
1897	                 | 352            | 55                     |
1898	                 | 384            | 231                    |
1899	                 | 416            | 235                    |
1900	                 | 448            | 211                    |
1901	                 | 480            | 165                    |
1902	                 | 512            | 75                     |
1903	                 +----------------+------------------------+

1905	                                  Table 3

1907	   For example, the prime to be used to share a 128 bit value is 2^128 +
1908	   51.
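
   The table lookup can be expressed as a short Python sketch.  The
   offsets are copied from Table 3; the function name is hypothetical,
   and rounding the secret length up to the next multiple of 32 bits is
   an assumption about how an implementation would select a row.

```python
# Offsets from Table 3: prime = 2^bits + offset
PRIME_OFFSETS = {
    32: 15, 64: 13, 96: 61, 128: 51, 160: 7, 192: 133, 224: 735,
    256: 297, 288: 127, 320: 27, 352: 55, 384: 231, 416: 235,
    448: 211, 480: 165, 512: 75,
}


def share_prime(secret_bits: int) -> int:
    """Return the prime modulus for a secret of up to secret_bits bits."""
    size = 32 * ((secret_bits + 31) // 32)  # round up to a table row
    if size not in PRIME_OFFSETS:
        raise ValueError("secret length not covered by Table 3")
    return 2 ** size + PRIME_OFFSETS[size]
```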

1910	11.  Recovering Shamir Shared Secret

1912	   The value of a Shamir Shared secret may be recovered using Lagrange
1913	   basis polynomials.

1915	   To share a secret of L bits with a threshold of n shares we
1916	   construct f(x), a polynomial of degree n-1 in the modular field p
1917	   where p is the smallest prime greater than 2^L:

1919	   f(x) = a_0 + a_1.x + a_2.x^2 + ... + a_n-1.x^(n-1)

1921	   The shared secret is the binary representation of the value a_0.

1923	   Given n shares (x_0, y_0), (x_1, y_1), ... (x_n-1, y_n-1), the
1924	   corresponding Lagrange basis polynomials l_0, l_1, ... l_n-1 are
1925	   given by:

1927	   l_m = ((x - x(m_0)) / (x(m) - x(m_0))) . ((x - x(m_1)) / (x(m) -
1928	   x(m_1))) . ... . ((x - x(m_n-2)) / (x(m) - x(m_n-2)))

1930	   Where x(i) denotes x_i and the values m_0, m_1, ... m_n-2 are the
1931	   integers 0, 1, ... n-1, excluding the value m.

1933	   These can be used to compute f(x) as follows:

1935	   f(x) = y_0.l_0 + y_1.l_1 + ... + y_n-1.l_n-1

1937	   Since it is only the value of f(0) that we are interested in, we
1938	   compute the Lagrange basis for the value x = 0:

1940	   lz_m = (-x(m_0) / (x(m) - x(m_0))) . ... . (-x(m_n-2) / (x(m) - x(m_n-2)))

1942	   Hence,

1944	   a_0 = f(0) = y_0.lz_0 + y_1.lz_1 + ... + y_n-1.lz_n-1

1946	   The following C# code recovers the values.

1948	   using System;
1949	   using System.Collections.Generic;
1950	   using System.Numerics;

1952	   namespace Examples {

1954	       class Examples {

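1956	           /// <summary>
1957	           /// Combine a set of <paramref name="n"/> points (x, f(x))
1958	           /// on a polynomial of degree n-1 in a discrete field
1959	           /// modulo the prime <paramref name="p"/> to
1960	           /// recover the value f(0) using Lagrange basis polynomials.
1961	           /// </summary>
1962	           /// <param name="fx">The values f(x).</param>
1963	           /// <param name="x">The values for x.</param>
1964	           /// <param name="p">The modulus.</param>
1965	           /// <param name="n">The number of shares to combine.</param>
1966	           /// <returns>The value f(0).</returns>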
1967	           static BigInteger CombineNK(
1968	                       BigInteger[] fx,
1969	                       int[] x,
1970	                       BigInteger p,
1971	                       int n) {
1972	               if (fx.Length < n) {
1973	                   throw new Exception("Insufficient shares");
1974	                   }

1976	               BigInteger accumulator = 0;
1977	               for (var formula = 0; formula < n; formula++) {
1978	                   var value = fx[formula];

1980	                   BigInteger numerator = 1, denominator = 1;
1981	                   for (var count = 0; count < n; count++) {
1982	                       if (formula == count) {
1983	                           continue;  // Skip the share's own term
1984	                           }

1986	                       var start = x[formula];
1987	                       var next = x[count];

1989	                       numerator = (numerator * -next) % p;
1990	                       denominator = (denominator * (start - next)) % p;
1991	                       }

1993	                   var InvDenominator = ModInverse(denominator, p);

1995	                   accumulator = Modulus((accumulator +
1996	                       (fx[formula] * numerator * InvDenominator)), p);
1997	                   }

1999	               return accumulator;
2000	               }

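2002	           /// <summary>
2003	           /// Compute the modular multiplicative inverse of the value
2004	           /// <paramref name="k"/> modulo the prime <paramref name="p"/>
2005	           /// </summary>
2006	           /// <param name="k">The value to find the inverse of</param>
2007	           /// <param name="p">The modulus.</param>
2008	           /// <returns>The inverse of k modulo p.</returns>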
2009	           static BigInteger ModInverse(
2010	                       BigInteger k,
2011	                       BigInteger p) {
2012	               var m2 = p - 2;
2013	               if (k < 0) {
2014	                   k = k + p;
2015	                   }

2017	               return BigInteger.ModPow(k, m2, p);
2018	               }

2020	           /// <summary>
2021	           /// Calculate the modulus of a number with correct handling
2022	           /// for negative numbers.
2023	           /// </summary>
2024	           /// <param name="x">Value</param>
2025	           /// <param name="p">The modulus.</param>
2026	           /// <returns>x mod p</returns>
2027	           public static BigInteger Modulus(
2028	                       BigInteger x,
2029	                       BigInteger p) {
2030	               var Result = x % p;
2031	               return Result.Sign >= 0 ? Result : Result + p;
2032	               }
2033	           }
2034	       }
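
   For readers who wish to experiment, the recovery step can also be
   sketched in Python together with a small worked example.  This is
   not part of the reference code: the split function, the choice of
   x values 1, 2, ... for the shares, and all names are assumptions
   made so that the example is self-contained.  The modulus must be
   prime, since the inverse is computed as den^(p-2) mod p.

```python
import secrets


def split(secret: int, threshold: int, shares: int, p: int):
    """Create shares (x, f(x)) of a random polynomial with f(0) = secret."""
    coeffs = [secret] + [secrets.randbelow(p) for _ in range(threshold - 1)]

    def f(x: int) -> int:
        acc = 0
        for c in reversed(coeffs):  # Horner evaluation modulo p
            acc = (acc * x + c) % p
        return acc

    return [(x, f(x)) for x in range(1, shares + 1)]


def recover(points, p: int) -> int:
    """Recover f(0) from threshold-many points using Lagrange bases at x = 0."""
    acc = 0
    for m, (xm, ym) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if j == m:
                continue  # skip the share's own term
            num = (num * -xj) % p
            den = (den * (xm - xj)) % p
        # den^(p-2) is the modular inverse of den because p is prime
        acc = (acc + ym * num * pow(den, p - 2, p)) % p
    return acc


# Worked example: f(x) = 7 + 3x + 2x^2 modulo the 32 bit prime 2^32 + 15
P32 = 2 ** 32 + 15
print(recover([(1, 12), (2, 21), (3, 34)], P32))
```

   Any three of the shares produced by split with a threshold of three
   recover the same secret; fewer than three do not determine it.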

2036	12.  References

2038	12.1.  Normative References

2040	   [draft-hallambaker-dare-message]
2041	              Hallam-Baker, P., "Data At Rest Encryption Part 1: DARE
2042	              Message Syntax", draft-hallambaker-dare-message-02 (work
2043	              in progress), August 2018.

2045	   [draft-hallambaker-web-service-discovery]
2046	              Hallam-Baker, P., "DNS Web Service Discovery", draft-
2047	              hallambaker-web-service-discovery-01 (work in progress),
2048	              February 2019.

2050	   [RFC2014]  Weinrib, A. and J. Postel, "IRTF Research Group Guidelines
2051	              and Procedures", BCP 8, RFC 2014, DOI 10.17487/RFC2014,
2052	              October 1996.

2054	   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
2055	              Requirement Levels", BCP 14, RFC 2119,
2056	              DOI 10.17487/RFC2119, March 1997.

2058	   [RFC3986]  Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
2059	              Resource Identifier (URI): Generic Syntax", STD 66,
2060	              RFC 3986, DOI 10.17487/RFC3986, January 2005.

2062	   [RFC4648]  Josefsson, S., "The Base16, Base32, and Base64 Data
2063	              Encodings", RFC 4648, DOI 10.17487/RFC4648, October 2006.

2065	   [RFC5869]  Krawczyk, H. and P. Eronen, "HMAC-based Extract-and-Expand
2066	              Key Derivation Function (HKDF)", RFC 5869,
2067	              DOI 10.17487/RFC5869, May 2010.

2069	   [RFC6031]  Turner, S. and R. Housley, "Cryptographic Message Syntax
2070	              (CMS) Symmetric Key Package Content Type", RFC 6031,
2071	              DOI 10.17487/RFC6031, December 2010.

2073	   [SHA-2]    NIST, "Secure Hash Standard", August 2015.

2075	   [SHA-3]    Dworkin, M., "SHA-3 Standard: Permutation-Based Hash and
2076	              Extendable-Output Functions", August 2015.

2078	12.2.  Informative References

2080	   [draft-hallambaker-mesh-developer]
2081	              Hallam-Baker, P., "Mathematical Mesh: Reference
2082	              Implementation", draft-hallambaker-mesh-developer-07 (work
2083	              in progress), April 2018.

2085	   [draft-hallambaker-mesh-trust]
2086	              Hallam-Baker, P., "Mathematical Mesh Part IV: The Trust
2087	              Mesh", draft-hallambaker-mesh-trust-00 (work in progress),
2088	              January 2019.

2090	   [RFC4086]  Eastlake 3rd, D., Schiller, J., and S. Crocker,
2091	              "Randomness Requirements for Security", BCP 106, RFC 4086,
2092	              DOI 10.17487/RFC4086, June 2005.

2094	   [RFC4880]  Callas, J., Donnerhacke, L., Finney, H., Shaw, D., and R.
2095	              Thayer, "OpenPGP Message Format", RFC 4880,
2096	              DOI 10.17487/RFC4880, November 2007.

2098	   [RFC5785]  Nottingham, M. and E. Hammer-Lahav, "Defining Well-Known
2099	              Uniform Resource Identifiers (URIs)", RFC 5785,
2100	              DOI 10.17487/RFC5785, April 2010.

2102	   [RFC5890]  Klensin, J., "Internationalized Domain Names for
2103	              Applications (IDNA): Definitions and Document Framework",
2104	              RFC 5890, DOI 10.17487/RFC5890, August 2010.

2106	   [RFC6355]  Narten, T. and J. Johnson, "Definition of the UUID-Based
2107	              DHCPv6 Unique Identifier (DUID-UUID)", RFC 6355,
2108	              DOI 10.17487/RFC6355, August 2011.

2110	   [RFC6763]  Cheshire, S. and M. Krochmal, "DNS-Based Service
2111	              Discovery", RFC 6763, DOI 10.17487/RFC6763, February 2013.

2113	   [RFC7595]  Thaler, D., Hansen, T., and T. Hardie, "Guidelines and
2114	              Registration Procedures for URI Schemes", BCP 35,
2115	              RFC 7595, DOI 10.17487/RFC7595, June 2015.

2117	   [Shamir79] Shamir, A., "How to share a secret", Communications
2118	              of the ACM 22(11), pp. 612-613, November 1979.

2120	   [XMLSchema]
2121	              Gao, S., Sperberg-McQueen, C., Thompson, H., Mendelsohn,
2122	              N., Beech, D., and M. Maloney, "W3C XML Schema Definition
2123	              Language (XSD) 1.1 Part 1: Structures", April 2012.

2125	12.3.  URIs

2127	   [1] http://mathmesh.com/Documents/draft-hallambaker-mesh-udf.html

2129	   [2] http://mathmesh.com/Documents/draft-hallambaker-mesh-udf.html

2131	Author's Address

2133	   Phillip Hallam-Baker

2135	   Email: phill@hallambaker.com