idnits 2.17.1 

draft-hallambaker-mesh-udf-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The abstract seems to contain references ([1]), which it shouldn't. 
     Please replace those with straight textual mentions of the documents in
     question.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == Line 1633 has weird spacing: '... suffix  srv/m...'

  == Line 1950 has weird spacing: '... set of  point...'

  == Line 1951 has weird spacing: '... degree  in a...'

  == Line 1952 has weird spacing: '...o prime  to...'

  == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD',
     or 'RECOMMENDED' is not an accepted usage according to RFC 2119.  Please
     use uppercase 'NOT' together with RFC 2119 keywords (if that is what you
     mean).
     
     Found 'MUST not' in this paragraph:
     
     UDF Binary Data Sequence types are either fixed length or variable
     length.  A variable length Binary Data Sequence MUST be truncated for
     presentation.  Fixed length Binary Data Sequences MUST not be truncated.

  == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD',
     or 'RECOMMENDED' is not an accepted usage according to RFC 2119.  Please
     use uppercase 'NOT' together with RFC 2119 keywords (if that is what you
     mean).
     
     Found 'MUST not' in this paragraph:
     
     Since PKIX certificates and CLRs contain security policy
     information, UDF fingerprints used to identify certificates or CRLs
     SHOULD be presented with a minimum of 200 bits of precision.  PKIX
     applications MUST not accept UDF fingerprints specified with less than
     200 bits of precision for purposes of identifying trust anchors.

  -- The document date (February 24, 2019) is 1882 days in the past.  Is this
     intentional?


  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  -- Looks like a reference, but probably isn't: '1' on line 2120

  -- Looks like a reference, but probably isn't: '2' on line 2122

  == Missing Reference: 'This' is mentioned on line 1769, but not defined

  -- Obsolete informational reference (is this intentional?): RFC 5785
     (Obsoleted by RFC 8615)


     Summary: 1 error (**), 0 flaws (~~), 8 warnings (==), 4 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Network Working Group                                    P. Hallam-Baker
3	Internet-Draft                                         February 24, 2019
4	Intended status: Informational
5	Expires: August 28, 2019

7	          Mathematical Mesh Part II: Uniform Data Fingerprint.
8	                     draft-hallambaker-mesh-udf-00

10	Abstract

12	   This document describes the naming and addressing schemes used in the
13	   Mathematical Mesh.  The means of generating Uniform Data Fingerprint
14	   (UDF) values and their presentation as text sequences and as URIs are
15	   described.

17	   A UDF consists of a binary sequence, the initial eight bits of which
18	   specify a type identifier code.  Type identifier codes have been
19	   selected so as to provide a useful mnemonic indicating their purpose
20	   when presented in Base32 encoding.

22	   Two categories of UDF are described.  Data UDFs provide a compact
23	   presentation of a fixed length binary data value in a format that is
24	   convenient for data entry.  A Data UDF may represent a cryptographic
25	   key, a nonce value or a share of a secret.  Fingerprint UDFs provide
26	   a compact presentation of a Message Digest or Message Authentication
27	   Code value.

29	   A Strong Internet Name (SIN) consists of a DNS name which contains at
30	   least one label that is a UDF fingerprint of a policy document
31	   controlling interpretation of the name.  SINs allow a direct trust
32	   model to be applied to achieve end-to-end security in existing
33	   Internet applications without the need for trusted third parties.

35	   UDFs may be presented as URIs to form either names or locators for
36	   use with the UDF location service.  An Encrypted Authenticated
37	   Resource Locator (EARL) is a UDF locator URI presenting a service
38	   from which an encrypted resource may be obtained and a symmetric key
39	   that may be used to decrypt the content.  EARLs may be presented on
40	   paper correspondence as a QR code to securely provide a machine
41	   readable version of the same content.  This may be applied to
42	   automate processes such as invoicing or to provide accessibility
43	   services for the partially sighted.

45	   This document is also available online at
46	   http://mathmesh.com/Documents/draft-hallambaker-mesh-udf.html [1].

48	Status of This Memo

50	   This Internet-Draft is submitted in full conformance with the
51	   provisions of BCP 78 and BCP 79.

53	   Internet-Drafts are working documents of the Internet Engineering
54	   Task Force (IETF).  Note that other groups may also distribute
55	   working documents as Internet-Drafts.  The list of current Internet-
56	   Drafts is at https://datatracker.ietf.org/drafts/current/.

58	   Internet-Drafts are draft documents valid for a maximum of six months
59	   and may be updated, replaced, or obsoleted by other documents at any
60	   time.  It is inappropriate to use Internet-Drafts as reference
61	   material or to cite them other than as "work in progress."

63	   This Internet-Draft will expire on August 28, 2019.

65	Copyright Notice

67	   Copyright (c) 2019 IETF Trust and the persons identified as the
68	   document authors.  All rights reserved.

70	   This document is subject to BCP 78 and the IETF Trust's Legal
71	   Provisions Relating to IETF Documents
72	   (https://trustee.ietf.org/license-info) in effect on the date of
73	   publication of this document.  Please review these documents
74	   carefully, as they describe your rights and restrictions with respect
75	   to this document.  Code Components extracted from this document must
76	   include Simplified BSD License text as described in Section 4.e of
77	   the Trust Legal Provisions and are provided without warranty as
78	   described in the Simplified BSD License.

80	Table of Contents

82	   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   4
83	     1.1.  UDF Types . . . . . . . . . . . . . . . . . . . . . . . .   4
84	       1.1.1.  Cryptographic Keys and Nonces . . . . . . . . . . . .   5
85	       1.1.2.  Fingerprint type UDFs . . . . . . . . . . . . . . . .   6
86	     1.2.  UDF URIs  . . . . . . . . . . . . . . . . . . . . . . . .   6
87	       1.2.1.  Name Form . . . . . . . . . . . . . . . . . . . . . .   7
88	       1.2.2.  Locator Form  . . . . . . . . . . . . . . . . . . . .   7
89	     1.3.  Secure Internet Names . . . . . . . . . . . . . . . . . .   9
90	   2.  Definitions . . . . . . . . . . . . . . . . . . . . . . . . .   9
91	     2.1.  Requirements Language . . . . . . . . . . . . . . . . . .   9
92	     2.2.  Defined Terms . . . . . . . . . . . . . . . . . . . . . .  10
93	     2.3.  Related Specifications  . . . . . . . . . . . . . . . . .  11
94	     2.4.  Implementation Status . . . . . . . . . . . . . . . . . .  11
95	   3.  Architecture  . . . . . . . . . . . . . . . . . . . . . . . .  11
96	     3.1.  Base32 Presentation . . . . . . . . . . . . . . . . . . .  11
97	       3.1.1.  Precision Improvement . . . . . . . . . . . . . . . .  12
98	     3.2.  Type Identifier . . . . . . . . . . . . . . . . . . . . .  12
99	     3.3.  Content Type Identifier . . . . . . . . . . . . . . . . .  13
100	     3.4.  Truncation  . . . . . . . . . . . . . . . . . . . . . . .  13
101	       3.4.1.  Compression . . . . . . . . . . . . . . . . . . . . .  14
102	     3.5.  Presentation  . . . . . . . . . . . . . . . . . . . . . .  14
103	     3.6.  Alternative Presentations . . . . . . . . . . . . . . . .  15
104	       3.6.1.  Word Lists  . . . . . . . . . . . . . . . . . . . . .  15
105	       3.6.2.  Image List  . . . . . . . . . . . . . . . . . . . . .  15
106	   4.  Fixed Length UDFs . . . . . . . . . . . . . . . . . . . . . .  15
107	     4.1.  Nonce Type  . . . . . . . . . . . . . . . . . . . . . . .  16
108	     4.2.  Encryption/Authentication Type  . . . . . . . . . . . . .  16
109	     4.3.  Shamir Shared Secret  . . . . . . . . . . . . . . . . . .  16
110	       4.3.1.  Secret Generation . . . . . . . . . . . . . . . . . .  17
111	       4.3.2.  Recovery  . . . . . . . . . . . . . . . . . . . . . .  18
112	   5.  Variable Length UDFs  . . . . . . . . . . . . . . . . . . . .  19
113	     5.1.  Content Digest  . . . . . . . . . . . . . . . . . . . . .  20
114	       5.1.1.  Content Digest Value  . . . . . . . . . . . . . . . .  20
115	       5.1.2.  Typed Content Digest Value  . . . . . . . . . . . . .  21
116	       5.1.3.  Compression . . . . . . . . . . . . . . . . . . . . .  21
117	       5.1.4.  Presentation  . . . . . . . . . . . . . . . . . . . .  22
118	       5.1.5.  Example Encoding  . . . . . . . . . . . . . . . . . .  22
119	       5.1.6.  Using SHA-2-512 Digest  . . . . . . . . . . . . . . .  22
120	       5.1.7.  Using SHA-3-512 Digest  . . . . . . . . . . . . . . .  23
121	       5.1.8.  Using SHA-2-512 Digest with Compression . . . . . . .  24
122	       5.1.9.  Using SHA-3-512 Digest with Compression . . . . . . .  25
123	     5.2.  Authenticator UDF . . . . . . . . . . . . . . . . . . . .  25
124	       5.2.1.  Content Digest Value  . . . . . . . . . . . . . . . .  26
125	       5.2.2.  Authentication Value  . . . . . . . . . . . . . . . .  26
126	     5.3.  Content Type Values . . . . . . . . . . . . . . . . . . .  28
127	       5.3.1.  PKIX Certificates and Keys  . . . . . . . . . . . . .  29
128	       5.3.2.  OpenPGP Key . . . . . . . . . . . . . . . . . . . . .  29
129	       5.3.3.  DNSSEC  . . . . . . . . . . . . . . . . . . . . . . .  29
130	   6.  UDF URIs  . . . . . . . . . . . . . . . . . . . . . . . . . .  29
131	     6.1.  Name form . . . . . . . . . . . . . . . . . . . . . . . .  30
132	     6.2.  Locator form  . . . . . . . . . . . . . . . . . . . . . .  30
133	       6.2.1.  DNS Web service discovery . . . . . . . . . . . . . .  31
134	       6.2.2.  Content Identifier  . . . . . . . . . . . . . . . . .  31
135	       6.2.3.  Target URI  . . . . . . . . . . . . . . . . . . . . .  31
136	       6.2.4.  Postprocessing  . . . . . . . . . . . . . . . . . . .  32
137	       6.2.5.  Decryption and Authentication . . . . . . . . . . . .  32
138	       6.2.6.  QR Presentation . . . . . . . . . . . . . . . . . . .  32
139	   7.  Strong Internet Names . . . . . . . . . . . . . . . . . . . .  32
140	   8.  Security Considerations . . . . . . . . . . . . . . . . . . .  33
141	     8.1.  Confidentiality . . . . . . . . . . . . . . . . . . . . .  33
142	     8.2.  Availability  . . . . . . . . . . . . . . . . . . . . . .  33
143	     8.3.  Integrity . . . . . . . . . . . . . . . . . . . . . . . .  33
144	     8.4.  Work Factor and Precision . . . . . . . . . . . . . . . .  33
145	     8.5.  Semantic Substitution . . . . . . . . . . . . . . . . . .  34
146	     8.6.  QR Code Scanning  . . . . . . . . . . . . . . . . . . . .  35
147	   9.  IANA Considerations . . . . . . . . . . . . . . . . . . . . .  35
148	     9.1.  Protocol Service Name . . . . . . . . . . . . . . . . . .  35
149	     9.2.  Well Known  . . . . . . . . . . . . . . . . . . . . . . .  36
150	     9.3.  URI Registration  . . . . . . . . . . . . . . . . . . . .  36
151	     9.4.  Media Types Registrations . . . . . . . . . . . . . . . .  37
152	       9.4.1.  Media Type: application/pkix-keyinfo  . . . . . . . .  37
153	       9.4.2.  Media Type: application/udf-encryption  . . . . . . .  38
154	       9.4.3.  Media Type: application/udf-secret  . . . . . . . . .  39
155	     9.5.  Uniform Data Fingerprint Type Identifier Registry . . . .  40
156	       9.5.1.  The name of the registry  . . . . . . . . . . . . . .  40
157	       9.5.2.  Required information for registrations  . . . . . . .  40
158	       9.5.3.  Applicable registration policy  . . . . . . . . . . .  40
159	       9.5.4.  Size, format, and syntax of registry entries  . . . .  40
160	       9.5.5.  Initial assignments and reservations  . . . . . . . .  41
161	   10. Appendix A: Prime Values for Secret Sharing . . . . . . . . .  41
162	   11. Recovering Shamir Shared Secret . . . . . . . . . . . . . . .  42
163	   12. References  . . . . . . . . . . . . . . . . . . . . . . . . .  45
164	     12.1.  Normative References . . . . . . . . . . . . . . . . . .  45
165	     12.2.  Informative References . . . . . . . . . . . . . . . . .  46
166	     12.3.  URIs . . . . . . . . . . . . . . . . . . . . . . . . . .  47
167	   Author's Address  . . . . . . . . . . . . . . . . . . . . . . . .  47

169	1.  Introduction

171	   A Uniform Data Fingerprint (UDF) is a generalized format for
172	   presenting and interpreting short binary sequences representing
173	   cryptographic keys or fingerprints of data of any specified type.
174	   The UDF format provides a superset of the OpenPGP [RFC4880]
175	   fingerprint encoding capability with greater encoding density and
176	   readability.

178	   This document describes the syntax and encoding of UDFs, the means of
179	   constructing and comparing them and their use in other Internet
180	   addressing schemes.

182	1.1.  UDF Types

184	   Two categories of UDF are described.  Data UDFs provide a compact
185	   presentation of a fixed length binary data value in a format that is
186	   convenient for data entry.  A Data UDF may represent a cryptographic
187	   key, a nonce value or a share of a key generated using a secret
188	   sharing mechanism.  Fingerprint UDFs provide a compact presentation
189	   of a Message Digest or Message Authentication Code value.

191	   Both categories of UDF are encoded as a UDF binary sequence, the
192	   first octet of which is a Type Identifier.  The remaining octets
193	   specify the binary value according to the type identifier and data
194	   referenced.

196	   UDFs are typically presented to the user as a Base32 encoded sequence
197	   in groups of four characters separated by dashes.  This format
198	   provides a useful balance between compactness and readability.  The
199	   type identifier codes have been selected so as to provide a useful
200	   mnemonic when presented in Base32 encoding.

202	   The following are examples of UDF values:

204	   ND6M-3FGP-TEQ2-C3CA-47LI-LGDT-KRKP
205	   EBP6-ZN2L-4ZYR-CCSY-ESBF-IYOZ-WBKQ
206	   SAQD-SEKQ-ZIEF-EQIZ-I7Y7-IJA5-ALRD-A
207	   MB5S-R4AJ-3FBT-7NHO-T26Z-2E6Y-WFH4
208	   KCM5-7VB6-IJXJ-WKHX-NZQF-OKGZ-EWVN
209	   AARM-PPXK-JH54-MUOW-Q4QH-ZFCV-LMM7

211	   Like email addresses, UDFs are not Uniform Resource Identifiers
212	   (URIs) but may be expressed in URI form by adding the scheme
213	   identifier (udf) for use in contexts where an identifier in URI
214	   syntax is required.  A UDF URI MAY contain a domain name component
215	   allowing it to be used as a locator.

217	1.1.1.  Cryptographic Keys and Nonces

219	   A Nonce (N) UDF represents a short, fixed length randomly chosen
220	   binary value.

222	   Nonce UDFs are used within many Mesh protocols and data formats where
223	   it is necessary to represent a nonce value in text form.

225	   Nonce UDF:
226	     ND6M-3FGP-TEQ2-C3CA-47LI-LGDT-KRKP

228	   An Encryption/Authentication (E) UDF has the same format as a Nonce
229	   UDF but is identified as being intended to be used as a symmetric key
230	   for encryption and/or authentication.

232	   KeyValue:
233	     5F EC B7 4B  E6 71 11 0A  58 24 82 54  61 D9 B0 55

235	   Encryption/Authentication UDF:
236	     EBP6-ZN2L-4ZYR-CCSY-ESBF-IYOZ-WBKQ

238	   A Share (S) UDF also represents a short, fixed length binary value
239	   but only provides one share in a secret sharing scheme.  Recovery of
240	   the binary value requires a sufficient number of shares.

242	   Share UDFs are used in the Mesh to support key and data escrow
243	   operations without the need to rely on trusted hardware.  A share UDF
244	   can be copied by hand or printed in human or machine-readable form
245	   (e.g., a QR code).

247	   Key:     EBP6-ZN2L-4ZYR-CCSY-ESBF-IYOZ-WBKQ
248	   Share 0: SAQD-SEKQ-ZIEF-EQIZ-I7Y7-IJA5-ALRD-A
249	   Share 1: SAQR-ENPK-JAVD-G4JI-G67W-L46Y-FQKA-W
250	   Share 2: SARO-WWUD-YZGB-JIJX-E6GN-PQ4T-KVDB-S

252	1.1.2.  Fingerprint type UDFs

254	   Fingerprint type UDFs contain a fingerprint value calculated over a
255	   content data item and an IANA media type.

257	   A Content Digest type UDF is a fingerprint type UDF in which the
258	   fingerprint is formed using a cryptographic digest algorithm.  Two
259	   digest algorithms are currently supported: SHA-2-512 (M, for
260	   Merkle-Damgard) and SHA-3-512 (K, for Keccak).

262	   The inclusion of the media type in the calculation of the UDF value
263	   provides protection against semantic substitution attacks in which
264	   content that has been found to be trustworthy when interpreted as one
265	   content type is presented in a context in which it is interpreted as
266	   a different content type in which it is unsafe.

268	   SHA-2-512: MB5S-R4AJ-3FBT-7NHO-T26Z-2E6Y-WFH4
269	   SHA-3-512: KCM5-7VB6-IJXJ-WKHX-NZQF-OKGZ-EWVN

271	   An Authentication UDF (A) is formed in the same manner as a
272	   fingerprint but using a Message Authentication Code algorithm and a
273	   symmetric key.

275	   Authentication UDFs are used to express commitments and to provide a
276	   means of blinding fingerprint values within a protocol by means of a
277	   nonce.

279	   HMAC-SHA-2-512: AARM-PPXK-JH54-MUOW-Q4QH-ZFCV-LMM7

281	1.2.  UDF URIs

283	   The UDF URI scheme allows use of a UDF in contexts where a URI is
284	   expected.  The UDF URI scheme has two forms, name and locator.

286	1.2.1.  Name Form

288	   Name form UDF URIs identify a data resource but do not provide a
289	   means of discovery.  The URI is simply the scheme (udf) followed by
290	   the UDF value:

292	   udf:MB5S-R4AJ-3FBT-7NHO-T26Z-2E6Y-WFH4

294	1.2.2.  Locator Form

296	   Locator form UDF URIs identify a data resource and provide a hint
297	   that MAY provide a means of discovery.  If the content is not
298	   available from the location indicated, content obtained from a
299	   different source that matches the fingerprint MAY be used instead.

301	   udf://MB5S-R4AJ-3FBT-7NHO-T26Z-2E6Y-WFH4

303	   UDF locator form URIs presenting a fingerprint type UDF provide a
304	   tight binding of the content to the locator.  This allows the
305	   resolved content to be verified and rejected if it has been modified.

307	   UDF locator form URIs presenting an Encryption/Authentication type
308	   UDF provide a mechanism for identification, discovery and decryption
309	   of encrypted content.  UDF locators of this type are known as
310	   Encrypted/Authenticated Resource Locators (EARLs).

312	   Regardless of the type of the embedded UDF, UDF locator form URIs are
313	   resolved by first performing DNS Web Service Discovery to identify
314	   the Web Service Endpoint for the mmm-udf service at the specified
315	   domain.

317	   Resolution is completed by presenting the Content Digest Fingerprint
318	   of the UDF value specified in the URI to the specified Web Service
319	   Endpoint and performing a GET method request on the result.

321	   For example, Alice subscribes to Example.com, a purveyor of cat and
322	   kitten images.  The company generates paper and electronic invoices
323	   on a monthly basis.

325	   To generate the paper invoice, Example.com first creates a new
326	   encryption key:

328	   EC5P-GNKW-2C7N-WEJJ-65V2-FGU2-QBI3-HC

330	   One or more electronic forms of the invoice are encrypted under the
331	   key EC5P-GNKW-2C7N-WEJJ-65V2-FGU2-QBI3-HC and placed on the
332	   Example.com Web site so that the appropriate version is returned if
333	   Alice scans the QR code.

335	   The key is then converted to form an EARL for the example.com UDF
336	   resolution service:

338	   udf://example.com/EC5P-GNKW-2C7N-WEJJ-65V2-FGU2-QBI3-HC

340	   The EARL is then rendered as a QR code:

342	   [[This figure is not viewable in this format.  The figure is
343	   available at
344	   http://mathmesh.com/Documents/draft-hallambaker-mesh-udf.html [2].]]

346	   QR Code with embedded decryption and location key

348	   A printable invoice containing the QR code is now generated and sent
349	   to Alice.

351	   When Alice receives the invoice, she can pay it by simply scanning
352	   the invoice with a device that recognizes at least one of the invoice
353	   formats supported by Example.com.

355	   The UDF EARL locator shown above is resolved by first determining the
356	   Web Service Endpoint for the mmm-udf service for the domain
357	   example.com.

359	   Discover ("example.com", "mmm-udf") =
360	   https://example.com/.well-known/mmm-udf/

362	   Next the fingerprint of the source UDF is obtained.

364	   UDF (EC5P-GNKW-2C7N-WEJJ-65V2-FGU2-QBI3-HC) =
365	   MC4C-K3CC-R2V5-DDX2-W4SX-C6WR-MTAC-YCFL-GTG3-L5CK-7I7L-HRPK-B64E-VLUS

367	   Combining the Web Service Endpoint and the fingerprint of the source
368	   UDF provides the URI from which the content is obtained using the
369	   normal HTTP GET method:

371	   https://example.com/.well-known/mmm-udf/MC4C-K3CC-R2V5-DDX2-W4SX-
372	   C6WR-MTAC-YCFL-GTG3-L5CK-7I7L-HRPK-B64E-VLUS
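
   The resolution steps above can be sketched in Python as follows.
   This is only an illustration under stated assumptions: the names
   discover and target_uri are hypothetical, discover merely builds the
   .well-known URI shown in the example instead of performing DNS Web
   Service Discovery, and the fingerprint of the source UDF is passed
   in as an argument rather than computed.

```python
from urllib.parse import urlsplit


def discover(domain: str, service: str) -> str:
    # Hypothetical stand-in: real resolution uses DNS Web Service Discovery.
    return f"https://{domain}/.well-known/{service}/"


def target_uri(locator: str, fingerprint: str) -> str:
    """Combine the service endpoint with the fingerprint of the source UDF."""
    parts = urlsplit(locator)
    if parts.scheme != "udf" or not parts.netloc:
        raise ValueError("not a locator form UDF URI with a domain")
    return discover(parts.netloc, "mmm-udf") + fingerprint


# Values copied from the example in this section.
earl = "udf://example.com/EC5P-GNKW-2C7N-WEJJ-65V2-FGU2-QBI3-HC"
fingerprint = ("MC4C-K3CC-R2V5-DDX2-W4SX-C6WR-MTAC-YCFL-"
               "GTG3-L5CK-7I7L-HRPK-B64E-VLUS")
print(target_uri(earl, fingerprint))
```

   The path component of the locator carries the key itself, which is
   never sent to the service; only the fingerprint appears in the
   request URI.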

374	   Having established that Alice can read postal mail sent to a physical
375	   address and having delivered a secret to that address, this process
376	   might be extended to provide a means of automating the process of
377	   enrolment in electronic delivery of future invoices.

379	1.3.  Secure Internet Names

381	   A SIN is an Internet Identifier that contains a UDF fingerprint of a
382	   security policy document that may be used to verify the
383	   interpretation of the identifier.  This permits traditional forms of
384	   Internet address such as URIs and RFC822 email addresses to be used
385	   to express a trusted address that is independent of any trusted third
386	   party.

388	   This document only describes the syntax and interpretation of the
389	   identifiers themselves.  The means by which the security policy
390	   documents bound to an address govern interpretation of the name is
391	   discussed separately in [draft-hallambaker-mesh-trust] .

393	   For example, Example Inc holds the domain name example.com and has
394	   deployed a private CA whose root of trust is a PKIX certificate with
395	   the UDF fingerprint MB2GK-6DUF5-YGYYL-JNY5E-RWSHZ.  Alice is an
396	   employee of Example Inc. and uses three email addresses:
397	   alice@example.com A regular email address (not a SIN). alice@mm--
398	   mb2gk-6duf5-ygyyl-jny5e-rwshz.example.com A strong email address that
399	   is backwards compatible. alice@example.com.mm--mb2gk-6duf5-ygyyl-
400	   jny5e-rwshz A strong email address that is backwards incompatible.

402	   All three forms of the address are valid RFC822 addresses and may be
403	   used in a legacy email client, stored in an address book application,
404	   etc.  But the ability of a legacy client to make use of the address
405	   differs.  Addresses of the first type may always be used.  Addresses
406	   of the second type may only be used if an appropriate MX record is
407	   provisioned.  Addresses of the third type will always fail unless the
408	   resolver understands that it is a SIN requiring special processing.

410	   These rules allow Bob to send email to Alice with either 'best
411	   effort' security or mandatory security as the circumstances demand.
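
   As an informal illustration, the following Python sketch picks out
   the UDF fingerprint labels from the three example addresses.  It
   assumes, based only on the examples above, that a SIN label is a DNS
   label beginning with the prefix "mm--"; the function name
   sin_fingerprints is hypothetical, and the policy processing itself is
   out of scope here.

```python
from typing import List


def sin_fingerprints(address: str) -> List[str]:
    """Return the UDF fingerprints found in 'mm--' labels of the domain."""
    domain = address.rsplit("@", 1)[-1]
    return [
        label[4:].upper()              # drop the assumed 'mm--' prefix
        for label in domain.split(".")
        if label.lower().startswith("mm--")
    ]


for example in (
    "alice@example.com",
    "alice@mm--mb2gk-6duf5-ygyyl-jny5e-rwshz.example.com",
    "alice@example.com.mm--mb2gk-6duf5-ygyyl-jny5e-rwshz",
):
    print(example, sin_fingerprints(example))
```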

413	2.  Definitions

415	   This section presents the related specifications and standards, the
416	   terms that are used as terms of art within the documents and the
417	   terms used as requirements language.

419	2.1.  Requirements Language

421	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
422	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
423	   document are to be interpreted as described in [RFC2119] .

425	2.2.  Defined Terms

427	   Cryptographic Digest Function  A hash function that has the
428	      properties required for use as a cryptographic hash function.
429	      These include collision resistance, first pre-image resistance and
430	      second pre-image resistance.

432	   Content Type  An identifier indicating how a Data Value is to be
433	      interpreted as specified in the IANA registry Media Types.

435	   Commitment  A cryptographic primitive that allows one to commit to a
436	      chosen value while keeping it hidden from others, with the ability
437	      to reveal the committed value later.

439	   Data Value  The binary octet stream that is the input to the digest
440	      function used to calculate a digest value.

442	   Data Object  A Data Value and its associated Content Type

444	   Digest Algorithm  A synonym for Cryptographic Digest Function

446	   Digest Value  The output of a Cryptographic Digest Function

448	   Data Digest Value  The output of a Cryptographic Digest Function for
449	      a given Data Value input.

451	   Fingerprint  A presentation of the digest value of a data value or
452	      data object.

454	   Fingerprint Presentation  The representation of at least some part of
455	      a fingerprint value in human or machine-readable form.

457	   Fingerprint Improvement  The practice of recording a higher precision
458	      presentation of a fingerprint on successful validation.

460	   Fingerprint Work Hardening  The practice of generating a sequence of
461	      fingerprints until one is found that matches criteria that permit
462	      a compressed presentation form to be used.  The compressed
463	      fingerprint thus being shorter than but presenting the same work
464	      factor as an uncompressed one.

466	   Hash  A function which takes an input and returns a fixed-size
467	      output.  Ideally, the output of a hash function is unbiased and
468	      not correlated with the outputs returned for similar inputs in any
469	      predictable fashion.

471	   Precision  The number of significant bits provided by a Fingerprint
472	      Presentation.

474	   Work Factor  A measure of the computational effort required to
475	      perform an attack against some security property.

477	2.3.  Related Specifications

479	   This specification makes use of Base32 [RFC4648] encoding, SHA-2
480	   [SHA-2] and SHA-3 [SHA-3] digest functions in the derivation of basic
481	   fingerprints.  The derivation of keyed fingerprints additionally
482	   requires the use of the HMAC [RFC2104] and HKDF [RFC5869] functions.

484	   Resolution of UDF URI Locators makes use of DNS Web Service Discovery
485	   [draft-hallambaker-web-service-discovery] .

487	2.4.  Implementation Status

489	   The implementation status of the reference code base is described in
490	   the companion document [draft-hallambaker-mesh-developer] .

492	3.  Architecture

494	   A Uniform Data Fingerprint (UDF) is a presentation of a UDF Binary
495	   Data Sequence.

497	   This document specifies seven UDF Binary Data Sequence types and one
498	   presentation.

500	   The first octet of a UDF Binary Data Sequence identifies the UDF type
501	   and is referred to as the Type identifier.

503	   UDF Binary Data Sequence types are either fixed length or variable
504	   length.  A variable length Binary Data Sequence MUST be truncated for
505	   presentation.  Fixed length Binary Data Sequences MUST NOT be
506	   truncated.

508	3.1.  Base32 Presentation

510	   The default UDF presentation is Base32 Presentation.

512	   Variable length Binary Data Sequences are truncated to an integer
513	   multiple of 20 bits that provides the desired precision before
514	   conversion to Base32 form.

516	   Fixed length Binary Data Sequences are converted to Base32 form
517	   without truncation.

519	   After conversion to Base32 form, dash '-' characters are inserted
520	   between groups of 4 characters to aid reading.  This representation
521	   improves the accuracy of both data entry and verification.
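
   The conversion of a fixed length sequence can be sketched in Python
   as follows.  This is a minimal illustration, not a reference
   implementation: the function name udf_present is hypothetical, and
   the inputs are the key value from the Encryption/Authentication
   example in Section 1.1.1 together with type identifier 32 from
   Table 1.

```python
import base64


def udf_present(type_id: int, data: bytes) -> str:
    """Base32-encode the type identifier octet plus data, dash-grouped by 4."""
    sequence = bytes([type_id]) + data
    # RFC 4648 Base32, with the trailing '=' padding removed.
    text = base64.b32encode(sequence).decode("ascii").rstrip("=")
    return "-".join(text[i:i + 4] for i in range(0, len(text), 4))


key_value = bytes.fromhex("5FECB74BE671110A5824825461D9B055")
print(udf_present(32, key_value))
# EBP6-ZN2L-4ZYR-CCSY-ESBF-IYOZ-WBKQ
```

   The printed value matches the Encryption/Authentication UDF given
   for this key value in Section 1.1.1.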

523	3.1.1.  Precision Improvement

525	   Precision improvement is the practice of using a high precision UDF
526	   (e.g. 260 bits) calculated from content data that has been validated
527	   according to a lower precision UDF (e.g. 120 bits).

529	   This allows a lower precision UDF to be used in a medium such as a
530	   business card where space is constrained without compromising
531	   subsequent uses.

533	   Applications SHOULD make use of precision improvement wherever
534	   possible.

536	3.2.  Type Identifier

538	   A Type Identifier consists of a single byte.

540	   The byte codes have been chosen so that the first character of the
541	   Base32 presentation of the UDF provides a mnemonic for its type.  A
542	   SHA-2 fingerprint UDF will always have M (for Merkle-Damgard) as the
543	   initial letter, a SHA-3 fingerprint UDF will always have K (for
544	   Keccak) as the initial letter, and so on.

546	   The following type identifiers are specified in this document:

548	          +---------+---------+--------------------------------+
549	          | Type ID | Initial | Algorithm                      |
550	          +---------+---------+--------------------------------+
551	          | 0       | A       | HMAC-SHA-2-512                 |
552	          | 32      | E       | HKDF-AES-512                   |
553	          | 80      | K       | SHA-3-512                      |
554	          | 81      | K       | SHA-3-512 (20 bits compressed) |
555	          | 82      | K       | SHA-3-512 (30 bits compressed) |
556	          | 83      | K       | SHA-3-512 (40 bits compressed) |
557	          | 84      | K       | SHA-3-512 (50 bits compressed) |
558	          | 96      | M       | SHA-2-512                      |
559	          | 97      | M       | SHA-2-512 (20 bits compressed) |
560	          | 98      | M       | SHA-2-512 (30 bits compressed) |
561	          | 99      | M       | SHA-2-512 (40 bits compressed) |
562	          | 100     | M       | SHA-2-512 (50 bits compressed) |
563	          | 104     | N       | Nonce data                     |
564	          | 144     | S       | Shamir Secret Sharing          |
565	          +---------+---------+--------------------------------+

567	                                  Table 1
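
   The relationship between the Type ID and Initial columns can be
   checked with a short Python sketch.  Because Base32 encodes five
   bits per character, the first character depends only on the top
   five bits of the type identifier octet.  The names BASE32_ALPHABET,
   initial_letter and TABLE_1 are illustrative and not part of this
   specification.

```python
# RFC 4648 Base32 alphabet.
BASE32_ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"


def initial_letter(type_id: int) -> str:
    """First Base32 character of a sequence starting with type_id."""
    return BASE32_ALPHABET[type_id >> 3]  # keep the top five bits


# (Type ID, Initial) pairs copied from Table 1.
TABLE_1 = {
    0: "A", 32: "E",
    80: "K", 81: "K", 82: "K", 83: "K", 84: "K",
    96: "M", 97: "M", 98: "M", 99: "M", 100: "M",
    104: "N", 144: "S",
}

for type_id, expected in TABLE_1.items():
    assert initial_letter(type_id) == expected
print("all initials in Table 1 match")
```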

569	3.3.  Content Type Identifier

571	   A secure cryptographic digest algorithm provides a digest value
572	   that is probabilistically unique for a particular byte sequence
573	   but does not fix the context in which a byte sequence is interpreted.
574	   While such ambiguity may be tolerated in a fingerprint format
575	   designed for a single specific field of use, it is not acceptable in
576	   a general-purpose format.

578	   For example, the SSH and OpenPGP applications both make use of
579	   fingerprints as identifiers for the public keys used but use
580	   different digest algorithms and data formats for representing the
581	   public key data.  While no such vulnerability has been demonstrated
582	   to date, it is certainly conceivable that a crafty attacker might
583	   construct an SSH key in such a fashion that OpenPGP interprets the
584	   data in an insecure fashion.  If the number of applications making
585	   use of a fingerprint format that permits such substitutions is
586	   sufficiently large, the probability of a semantic substitution
587	   vulnerability being possible becomes unacceptably large.

589	   A simple control that defeats such attacks is to incorporate a
590	   content type identifier within the scope of the data input to the
591	   hash function.

593	3.4.  Truncation

595	   Different applications of fingerprints demand different tradeoffs
596	   between compactness of the representation and the number of
597	   significant bits.  A larger number of significant bits reduces
598	   the risk of collision but at a cost to convenience.

600	   Modern cryptographic digest functions such as SHA-2 produce output
601	   values of at least 256 bits in length.  This is considerably larger
602	   than most uses of fingerprints require and certainly greater than can
603	   be represented in human readable form on a business card.

605	   Since a strong cryptographic digest function produces an output value
606	   in which every bit in the input value affects every bit in the output
607	   value with equal probability, it follows that truncating the digest
608	   value to produce a fingerprint is at least as strong as any other
609	   mechanism if the digest algorithm used is strong.

611	   Using truncation to reduce the precision of the digest function has
612	   the advantage that a lower precision fingerprint of some data content
613	   is always a prefix of a higher precision fingerprint of the same
614	   content.  This allows higher precision fingerprints to be converted
615	   to a lower precision without the need for special tools.

617	3.4.1.  Compression

619	   The Content Digest UDF types make use of work factor compression.
620	   Additional type identifiers are used to indicate digest values with
621	   20, 30, 40 or 50 trailing zero bits allowing a UDF fingerprint
622	   offering the equivalent of up to 150 bits of precision to be
623	   expressed in 20 characters instead of 30.

625	   To use compressed UDF identifiers, it is necessary to search for
626	   content that can be compressed.  If the digest algorithm used is
627	   secure, this means that by definition, the fastest means of search is
628	   brute force.  Thus, the reduction in fingerprint size is achieved by
629	   transferring the work factor from the attacker to the defender.  To
630	   maintain a work factor of 2^120 with an 80 bit fingerprint, it is
631	   necessary for the content generator to perform a brute force search
632	   at a cost of the order of 2^40 operations.

634	   For example, the smallest allowable work factor for a UDF
635	   presentation of a public key fingerprint is 92 bits.  This would
636	   normally require a presentation with 20 significant characters.
637	   Reducing this to 16 characters requires a brute force search of
638	   approximately 10^6 attempts.  Reducing this to 12 characters would
639	   require 10^12 attempts and to 10 characters, 10^15 attempts.
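
   The arithmetic behind these figures can be sketched as follows.
   This is an illustration only: the function name search_cost is
   hypothetical, and it assumes that each Base32 character carries
   5 bits, so that every character removed doubles the expected
   search effort five times over.

```python
BITS_PER_CHAR = 5  # each Base32 character carries 5 bits


def search_cost(full_chars: int, short_chars: int) -> int:
    """Expected number of digest attempts needed to find content whose
    fingerprint can be shortened from full_chars to short_chars."""
    saved_bits = (full_chars - short_chars) * BITS_PER_CHAR
    return 2 ** saved_bits


for short in (16, 12, 10):
    cost = search_cost(20, short)
    exponent = cost.bit_length() - 1
    print(f"20 -> {short} characters: 2^{exponent} = about {cost:.1e} attempts")
```

   The three cases correspond to 20, 40 and 50 bits of compression,
   which is why the text quotes roughly 10^6, 10^12 and 10^15 attempts.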

641	   Omission of support for higher levels of compression than 2^50 is
642	   intentional.

644	   In addition to allowing use of shorter presentations, work factor
645	   compression MAY be used as evidence of proof of work.

647	3.5.  Presentation

649	   The presentation of a fingerprint is the format in which it is
650	   presented to either an application or the user.

652	   Base32 encoding is used to produce the preferred text representation
653	   of a UDF fingerprint.  This encoding uses only the letters of the
654	   Latin alphabet with numbers chosen to minimize the risk of ambiguity
655	   between numbers and letters (2, 3, 4, 5, 6 and 7).

657	   To enhance readability and improve data entry, characters are grouped
658	   into groups of four.  This means that each block of four characters
659	   represents an increase in work factor of approximately one million
660	   times.
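
   A minimal sketch of the Base32 presentation follows.  The function
   name udf_present is hypothetical and the precision is assumed to be
   given in bits as a multiple of 5.  The input bytes are the prefixed
   Binary Data Sequence and the master secret that appear in the worked
   examples of Sections 5.1.6 and 4.3.2.

```python
import base64


def udf_present(data: bytes, bits: int) -> str:
    """Base32-encode data, keep bits/5 characters, group them in fours."""
    text = base64.b32encode(data).decode("ascii").rstrip("=")
    chars = bits // 5
    if chars > len(text):
        raise ValueError("not enough data for the requested precision")
    text = text[:chars]
    return "-".join(text[i:i + 4] for i in range(0, len(text), 4))


# Type identifier 96 (0x60) followed by the leading digest bytes.
sequence = bytes.fromhex("60 C6 AF B7 C0 FE BE 04 E5 AE 94 E3 7B AA 5F 1A 40 5B")
long_form = udf_present(sequence, 140)
short_form = udf_present(sequence, 100)
print(long_form)
print(short_form)
# A lower precision presentation is a prefix of a higher precision one.
print(long_form.startswith(short_form))
```

   Because truncation happens on the encoded characters, shortening a
   presentation never requires access to the original data.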

662	3.6.  Alternative Presentations

664	   Applications that support UDF MUST support use of the Base32
665	   presentation.  Applications MAY support alternative presentations.

667	3.6.1.  Word Lists

669	   The use of a Word List to encode fingerprint values was introduced by
670	   Patrick Juola and Philip Zimmermann for the PGPfone application.  The
671	   PGP Word List is designed to facilitate exchange and verification of
672	   fingerprint values in a voice application.  To minimize the risk of
673	   misinterpretation, two word lists of 256 values each are used to
674	   encode alternate fingerprint bytes.  The compact size of the lists
675	   used allowed the compilers to curate them so as to maximize the
676	   phonetic distance of the words selected.

678	   The PGP Word List is designed to achieve a balance between ease of
679	   entry and verification.  Applications where only verification is
680	   required may be better served by a much larger word list, permitting
681	   shorter fingerprint encodings.

683	   For example, a word list with 16384 entries permits 14 bits of the
684	   fingerprint to be encoded at once, 65536 entries permits encoding of
685	   16 bits.  These encodings allow a 120 bit fingerprint to be encoded
686	   in 9 and 8 words respectively.
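
   The word counts above follow from dividing the fingerprint length
   by the number of bits each list entry encodes and rounding up.  A
   short sketch, using the hypothetical helper symbols_needed and
   assuming the list size is a power of two:

```python
def symbols_needed(fingerprint_bits: int, list_size: int) -> int:
    """Number of list entries (words or images) needed to present a
    fingerprint, assuming list_size is a power of two."""
    bits_per_symbol = list_size.bit_length() - 1
    if 2 ** bits_per_symbol != list_size:
        raise ValueError("list size must be a power of two")
    # Ceiling division without floating point.
    return -(-fingerprint_bits // bits_per_symbol)


for size in (256, 16384, 65536):
    print(size, "entries:", symbols_needed(120, size), "symbols")
```

   The same calculation applies to any fixed-size list of symbols,
   not only to words.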

688	3.6.2.  Image List

690	   An image list is used in the same manner as a word list affording
691	   rapid visual verification of a fingerprint value.  For obvious
692	   reasons, this approach is not suited to data entry but is preferable
693	   for comparison purposes.  An image list of 1,048,576 images would
694	   provide a 20 bit encoding allowing 120 bit precision fingerprints to
695	   be displayed in six images.

697	4.  Fixed Length UDFs

699	   Fixed length UDFs are used to represent cryptographic keys, nonces
700	   and secret shares and have a fixed length determined by their
701	   function that cannot be truncated without loss of information.

703	   All fixed length Binary Data Sequence values are an integer multiple
704	   of eight bits.

706	4.1.  Nonce Type

708	   A Nonce Type UDF consists of the type identifier octet 104 followed
709	   by the Binary Data Sequence value.

711	   The Binary Data Sequence value is an integer number of octets that
712	   SHOULD have been generated in accordance with processes and
713	   procedures that ensure that it is sufficiently unpredictable for the
714	   purposes of the protocol in which the value is to be used.
715	   Requirements for such processes and procedures are described in
716	   [RFC4086].

718	   Nonce Type UDFs are intended for use in contexts where it is
719	   necessary for a randomly chosen value to be unpredictable but not
720	   secret.  For example, the challenge in a challenge/response
721	   mechanism.

723	4.2.  Encryption/Authentication Type

725	   An Encryption/Authentication Type UDF consists of the type identifier
726	   octet 32 followed by the Binary Data Sequence value.

728	   The Binary Data Sequence value is an integer number of octets that
729	   SHOULD have been generated in accordance with processes and
730	   procedures that ensure that it is sufficiently unpredictable and
731	   unguessable for the purposes of the protocol in which the value is to
732	   be used.  Requirements for such processes and procedures are
733	   described in [RFC4086].

735	   Encryption/Authentication Type UDFs are intended to be used as a
736	   means of specifying secret cryptographic keying material.  For
737	   example, the input to a Key Derivation Function used to encrypt a
738	   document.  Accordingly, the identifier UDF corresponding to an
739	   Encryption/Authentication type UDF is a UDF fingerprint of the
740	   Encryption/Authentication Type UDF in Base32 presentation with
741	   content type 'application/udf-encryption'.

743	4.3.  Shamir Shared Secret

745	   The UDF format MAY be used to encode shares generated by a secret
746	   sharing mechanism.  The only secret sharing mechanism currently
747	   supported is the Shamir Secret Sharing mechanism [Shamir79].  Each
748	   secret share represents a point (x, f(x)) on the curve of a
749	   polynomial f in a modular field p.  The secret being shared is an
750	   integer multiple of 32 bits represented by the polynomial value f(0).

752	   A Shamir Shared Secret Type UDF consists of the type identifier octet
753	   144 followed by the Binary Data Sequence value describing the share
754	   value.

756	   The first octet of the Binary Data Sequence value specifies the
757	   threshold value and the x value of the particular share:

759	   o  Bits 4-7 of the first byte specify the threshold value.

761	   o  Bits 0-3 of the first byte specify the x value minus 1.

763	   The remaining octets specify the value f(x) in network byte (big-
764	   endian) order with leading padding if necessary so that the share has
765	   the same number of bytes as the secret.
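
   The share layout described above can be sketched as follows.  The
   function names encode_share and decode_share are hypothetical.  The
   sketch produces only the Binary Data Sequence value; the type
   identifier octet 144 would be prepended before presentation.

```python
def encode_share(threshold: int, x: int, fx: int, secret_bytes: int) -> bytes:
    """Pack threshold and x into the first octet, then append f(x) in
    network byte order, padded to the length of the secret."""
    if not 1 <= threshold <= 15:
        raise ValueError("threshold must fit in bits 4-7")
    if not 1 <= x <= 16:
        raise ValueError("x - 1 must fit in bits 0-3")
    first = (threshold << 4) | (x - 1)
    # to_bytes raises OverflowError if f(x) does not fit the secret length.
    return bytes([first]) + fx.to_bytes(secret_bytes, "big")


def decode_share(share: bytes):
    """Return (threshold, x, f(x)) from an encoded share."""
    threshold = share[0] >> 4
    x = (share[0] & 0x0F) + 1
    return threshold, x, int.from_bytes(share[1:], "big")


encoded = encode_share(3, 5, 12345, 16)
print(encoded.hex(" "))
print(decode_share(encoded))
```

   With four bits each, this layout limits both the threshold and the
   x value to small numbers, which keeps every share one octet longer
   than the secret.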

767	   The algorithm requires that the value p be a prime larger than the
768	   integer representing the largest secret being shared.  For
769	   compactness of representation we chose p to be the smallest prime
770	   that is greater than 2^n where n is an integer multiple of 32.  This
771	   approach leaves a small probability that a set of chosen polynomial
772	   parameters causes one or more share values to be larger than 2^n.
773	   Since it is the value of the secret rather than the polynomial
774	   parameters that is important, such parameters MUST NOT be used.

776	4.3.1.  Secret Generation

778	   To share a secret of L bits with a threshold of n we use f(x), a
779	   polynomial of degree n-1 in the modular field p:

781	   f(x) = a_0 + a_1.x + a_2.x^2 + ... + a_(n-1).x^(n-1)

783	   where:

785	   L  Is the length of the secret in bits, an integer multiple of 32.

787	   n  Is the threshold, the number of shares required to reconstitute
788	      the secret.

790	   a_0  Is the integer representation of the secret to be shared.

792	   a_1 ... a_(n-1)  Are randomly chosen integers less than p.

794	   p  Is the smallest prime that is greater than 2^L.

796	   For L=128, p = 2^128+51.

798	   The key shares are the values f(1), f(2), ... f(m), where m >= n.

800	   The most straightforward approach to generation of Shamir secrets is
801	   to generate the set of polynomial coefficients a_0, a_1, ... a_(n-1)
802	   and use these to generate the share values f(1), f(2), ... f(m).

804	   Note that if this approach is adopted, there is a small probability
805	   that one or more of the values f(1), f(2), ... f(m) exceeds the range
806	   of values supported by the encoding.  Should this occur, at least one
807	   of the polynomial coefficients MUST be replaced.

809	   An alternative means of generating the set of secrets is to select up
810	   to n-1 secret share values and use secret recovery to determine at
811	   least one additional share.  If n shares are selected, the shared
812	   secret becomes an output of rather than an input to the process.
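
   The straightforward generation approach can be sketched as follows
   for L=128.  The names eval_poly and make_shares are hypothetical,
   and the optional coeffs argument exists only so that the fixed
   coefficients of the worked example in Section 4.3.2 can be
   reproduced.  A threshold of three is served here by a polynomial
   with three coefficients, as in that example.

```python
import secrets

P128 = 2 ** 128 + 51  # modulus given in the text for L = 128


def eval_poly(coeffs, x, p):
    """Evaluate a_0 + a_1.x + a_2.x^2 + ... mod p using Horner's rule."""
    result = 0
    for a in reversed(coeffs):
        result = (result * x + a) % p
    return result


def make_shares(secret, threshold, count, bits=128, p=P128, coeffs=None):
    """Return [(x, f(x))] for x = 1..count, replacing the random
    coefficients whenever a share value would not fit in `bits` bits."""
    while True:
        if coeffs is not None:
            random_part = list(coeffs)
        else:
            random_part = [secrets.randbelow(p) for _ in range(threshold - 1)]
        poly = [secret] + random_part
        shares = [(x, eval_poly(poly, x, p)) for x in range(1, count + 1)]
        if all(fx < 2 ** bits for _, fx in shares):
            return shares
        if coeffs is not None:
            raise ValueError("fixed coefficients give an unencodable share")


a0 = 145910295552647520022097220682790335077
a1 = 297322838204719131118519352197937676082
a2 = 188590136456605872847847519358002837359
for x, fx in make_shares(a0, 3, 5, coeffs=[a1, a2]):
    print(f"f({x}) = {fx}")
```

   Retrying with fresh random coefficients is one simple way to meet
   the requirement that unencodable parameters not be used; the
   secret itself is never altered.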

814	4.3.2.  Recovery

816	   To recover the value of the shared secret, it is necessary to obtain
817	   sufficient shares to meet the threshold and recover the value f(0) =
818	   a_0.

820	   Applications MAY employ any approach that returns the correct result.
821	   The use of Lagrange basis polynomials is described in Appendix C.

823	   Alice decides to encrypt an important document and split the
824	   encryption key so that there are five key shares, three of which will
825	   be required to recover the key.

827	   Alice's master secret is
828	     6D C5 4C FD  4A A8 49 0A  D3 73 61 80  A5 21 F2 65

830	   This has the UDF representation:

832	   EBW4-KTH5-JKUE-SCWT-ONQY-BJJB-6JSQ

834	   The master secret is converted to an integer applying network byte
835	   order conventions.  Since the master secret is 128 bits, it is
836	   guaranteed to be smaller than the modulus.  The resulting value
837	   becomes the polynomial value a0.

839	   Since a threshold of three shares is required, we will need a second
840	   order polynomial.  The coefficients of the polynomial a1, a2 are
841	   random numbers smaller than the modulus:

843	   a0 = 145910295552647520022097220682790335077
844	   a1 = 297322838204719131118519352197937676082
845	   a2 = 188590136456605872847847519358002837359
846	   The master secret is the value f(0) = a0.  The key shares are the
847	   values f(1), f(2)...f(5):

849	   f(1) = 291540903293034060525089484806962637011
850	   f(2) = 133787050104755419797027572783604190649
851	   f(3) = 12931102908750061301286092044483207498
852	   f(4) = 269255428625956448501239650021367899065
853	   f(5) = 222195293414497654470139031850721842336

855	   The first byte of each share specifies the recovery information
856	   (quorum, x value), the remaining bytes specify the share value in
857	   network byte order:

859	   f(1) =
860	     30 DB 54 BC  4E 17 52 EA  DA BA B8 5D  98 53 61 68
861	     D3
862	   f(2) =
863	     31 64 A6 72  D7 3D 62 31  89 91 37 A1  D7 09 66 4D
864	     B9
865	   f(3) =
866	     32 09 BA 70  98 BC D6 1D  17 56 F1 2E  3C C7 30 A1
867	     4A
868	   f(4) =
869	     33 CA 90 B5  92 95 AE AD  84 0B E5 02  C9 8C C0 63
870	     B9
871	   f(5) =
872	     34 A7 29 41  C4 C7 EB E2  CF B0 13 1F  7D 5A 15 94
873	     A0

875	   The UDF presentation of the key shares is thus:

877	   f(1) = SAYN-WVF4-JYLV-F2W2-XK4F-3GCT-MFUN-G
878	   f(2) = SAYW-JJTS-246W-EMMJ-SE32-DVYJ-MZG3-S
879	   f(3) = SAZA-TOTQ-TC6N-MHIX-K3YS-4PGH-GCQU-U
880	   f(4) = SAZ4-VEFV-SKK2-5LME-BPSQ-FSMM-YBR3-S
881	   f(5) = SA2K-OKKB-YTD6-XYWP-WAJR-67K2-CWKK-A

883	   To recover the value f(0) from any three shares, we need to fit a
884	   polynomial curve to the three points and use it to calculate the
885	   value at x=0 using the Lagrange polynomial basis.
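
   Recovery by Lagrange interpolation at x=0 can be sketched as
   follows.  The function name recover_secret is hypothetical, the
   modulus is the one given for L=128, and the modular inverse relies
   on the three-argument pow() of Python 3.8 or later.  The share
   values are copied from the example above.

```python
P128 = 2 ** 128 + 51  # modulus given in the text for L = 128


def recover_secret(shares, p=P128):
    """Evaluate the interpolating polynomial at x = 0 over the integers
    mod p, given a list of (x, f(x)) pairs meeting the threshold."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        numerator, denominator = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                numerator = (numerator * -xj) % p
                denominator = (denominator * (xi - xj)) % p
        # Lagrange basis value at x = 0 is numerator / denominator mod p.
        secret = (secret + yi * numerator * pow(denominator, -1, p)) % p
    return secret


example_shares = [
    (1, 291540903293034060525089484806962637011),
    (3, 12931102908750061301286092044483207498),
    (5, 222195293414497654470139031850721842336),
]
recovered = recover_secret(example_shares)
print(recovered.to_bytes(17, "big").hex(" "))
```

   Any three distinct shares may be supplied; the order in which
   they are listed does not affect the result.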

887	5.  Variable Length UDFs

889	   Variable length UDFs are used to represent fingerprint values
890	   calculated over a content type identifier and the cryptographic
891	   digest of a content data item.  The fingerprint value MAY be
892	   specified at any integer multiple of 20 bits that provides a work
893	   factor sufficient for the intended purpose.

895	   Two types of fingerprint are specified:

897	   Digest fingerprints  Are computed with the same cryptographic digest
898	      algorithm used to calculate the digest of the content data.

900	   Message Authentication Code fingerprints  Are computed using a
901	      Message Authentication Code.

903	   For a given algorithm (and key, if required), if two UDF fingerprints
904	   are of the same content data and content type, either the fingerprint
905	   values will be the same or the initial characters of one will be
906	   exactly equal to the other.

908	5.1.  Content Digest

910	   A Content Digest Type UDF consists of the type identifier octet
911	   followed by the Binary Data Sequence value.

913	   The type identifier specifies the digest algorithm used and the
914	   compression level.  Two digest algorithms are currently specified
915	   with four compression levels for each in addition to the
916	   uncompressed form, making a total of ten possible type identifiers.

918	   The Content Digest UDF for given content data is generated by the
919	   steps of:

921	   1.  Applying the digest algorithm to determine the Content Digest
922	       Value

924	   2.  Applying the digest algorithm to determine the Typed Content
925	       Digest Value

927	   3.  Determining the compression level from bytes 0-3 of the Typed
928	       Content Digest Value.

930	   4.  Determining the Type Identifier octet from the Digest algorithm
931	       identifier and compression level.

933	   5.  Truncating bytes 4-63 of the Typed Content Digest Value to
934	       determine the Binary Data Sequence value.

936	5.1.1.  Content Digest Value

938	   The Content Digest Value (CDV) is determined by applying the digest
939	   algorithm to the content data:

941	   CDV = H(<Data>)
942	   Where

944	      H(x) is the cryptographic digest function

946	      <Data> is the binary data.

948	5.1.2.  Typed Content Digest Value

950	   The Typed Content Digest Value (TCDV) is determined by applying the
951	   digest algorithm to the content type identifier and the CDV:

953	   TCDV = H (<Content-ID> + ':' + CDV)

955	   Where

957	      A + B represents concatenation of the binary sequences A and B.

959	      <Content-ID> is the IANA Content Type of the data in UTF8 encoding

961	   The two-step approach to calculating the Typed Content Digest Value
962	   allows an application to attempt to match a set of content data
963	   against multiple types without the need to recalculate the value of
964	   the content data digest.
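
   The two-step calculation can be sketched with the Python standard
   library as follows.  The function names content_digest and
   typed_content_digest are hypothetical, and the ':' separator is
   taken from the worked examples in Section 5.1.6.

```python
import hashlib


def content_digest(data: bytes, algorithm: str = "sha512") -> bytes:
    """CDV = H(<Data>)"""
    return hashlib.new(algorithm, data).digest()


def typed_content_digest(content_id: str, cdv: bytes,
                         algorithm: str = "sha512") -> bytes:
    """TCDV = H(<Content-ID> + ':' + CDV)"""
    return hashlib.new(algorithm, content_id.encode("utf-8") + b":" + cdv).digest()


# The content digest is computed once and reused for each candidate type.
cdv = content_digest("UDF Data Value".encode("utf-8"))
for content_id in ("text/plain", "application/octet-stream"):
    tcdv = typed_content_digest(content_id, cdv)
    print(content_id, tcdv[:8].hex(" "))
```

   Passing "sha3_512" as the algorithm name selects the SHA-3-512
   digest in the same way.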

966	5.1.3.  Compression

968	   The compression factor is determined according to the number of
969	   trailing zero bits in the first 8 bytes of the Typed Content Digest
970	   Value as follows:

972	   19 or fewer trailing zero bits  Compression factor = 0

974	   29 or fewer trailing zero bits  Compression factor = 20

976	   39 or fewer trailing zero bits  Compression factor = 30

978	   49 or fewer trailing zero bits  Compression factor = 40

980	   50 or more trailing zero bits  Compression factor = 50

982	   The least significant bits of each octet are regarded to be
983	   'trailing'.

985	   Applications MUST use compression when creating and comparing UDFs.
986	   Applications MAY support content generation techniques that search
987	   for UDF values that use a compressed representation.  Presentation of
988	   a content digest value indicating use of compression MAY be used as
989	   an indicator of 'proof of work'.
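
   A sketch of the compression factor calculation follows.  It
   assumes, as in the worked example of Section 5.1.8, that the zero
   bits are counted from the end of the digest value, and it takes
   the type identifier values from Table 1.  All function names are
   hypothetical.

```python
def trailing_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the least significant end of the digest."""
    value = int.from_bytes(digest, "big")
    if value == 0:
        return len(digest) * 8
    # value & -value isolates the lowest set bit.
    return (value & -value).bit_length() - 1


def compression_factor(digest: bytes) -> int:
    """Map the trailing zero count to 0, 20, 30, 40 or 50."""
    zeros = trailing_zero_bits(digest)
    for factor in (50, 40, 30, 20):
        if zeros >= factor:
            return factor
    return 0


TYPE_BASE = {"sha3_512": 80, "sha512": 96}
FACTOR_OFFSET = {0: 0, 20: 1, 30: 2, 40: 3, 50: 4}


def type_identifier(algorithm: str, factor: int) -> int:
    """Type identifier octet for a digest algorithm and compression factor."""
    return TYPE_BASE[algorithm] + FACTOR_OFFSET[factor]


# Final four octets of the Typed Content Digest Value in Section 5.1.8.
tail = bytes.fromhex("90 A0 00 00")
factor = compression_factor(tail)
print(trailing_zero_bits(tail), factor, type_identifier("sha512", factor))
```

   Leading octets of the digest do not change the trailing zero
   count, so the four-octet tail is sufficient for this illustration.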

991	5.1.4.  Presentation

993	   The type identifier is determined by the algorithm and compression
994	   factor as follows:

996	              +---------+---------+-----------+-------------+
997	              | Type ID | Initial | Algorithm | Compression |
998	              +---------+---------+-----------+-------------+
999	              | 80      | K       | SHA-3-512 | 0           |
1000	              | 81      | K       | SHA-3-512 | 20          |
1001	              | 82      | K       | SHA-3-512 | 30          |
1002	              | 83      | K       | SHA-3-512 | 40          |
1003	              | 84      | K       | SHA-3-512 | 50          |
1004	              | 96      | M       | SHA-2-512 | 0           |
1005	              | 97      | M       | SHA-2-512 | 20          |
1006	              | 98      | M       | SHA-2-512 | 30          |
1007	              | 99      | M       | SHA-2-512 | 40          |
1008	              | 100     | M       | SHA-2-512 | 50          |
1009	              +---------+---------+-----------+-------------+

1011	                                  Table 2

1013	   The Binary Data Sequence value is taken from the Typed Content Digest
1014	   Value starting at the 9^th octet and as many additional bytes as are
1015	   required to meet the presentation precision.

1017	5.1.5.  Example Encoding

1019	   In the following examples, <Content-ID> is the UTF8 encoding of the
1020	   string "text/plain" and <Data> is the UTF8 encoding of the string
1021	   "UDF Data Value"

1023	   Data =
1024	     55 44 46 20  44 61 74 61  20 56 61 6C  75 65

1026	   ContentType =
1027	     74 65 78 74  2F 70 6C 61  69 6E

1029	5.1.6.  Using SHA-2-512 Digest
1030	   H(<Data>) =
1031	     48 DA 47 CC  AB FE A4 5C  76 61 D3 21  BA 34 3E 58
1032	     10 87 2A 03  B4 02 9D AB  84 7C CE D2  22 B6 9C AB
1033	     02 38 D4 E9  1E 2F 6B 36  A0 9E ED 11  09 8A EA AC
1034	     99 D9 E0 BD  EA 47 93 15  BD 7A E9 E1  2E AD C4 15

1036	   <Content-ID> + ':' + H(<Data>) =
1037	     74 65 78 74  2F 70 6C 61  69 6E 3A 48  DA 47 CC AB
1038	     FE A4 5C 76  61 D3 21 BA  34 3E 58 10  87 2A 03 B4
1039	     02 9D AB 84  7C CE D2 22  B6 9C AB 02  38 D4 E9 1E
1040	     2F 6B 36 A0  9E ED 11 09  8A EA AC 99  D9 E0 BD EA
1041	     47 93 15 BD  7A E9 E1 2E  AD C4 15

1043	   H(<Content-ID> + ':' + H(<Data>)) =
1044	     C6 AF B7 C0  FE BE 04 E5  AE 94 E3 7B  AA 5F 1A 40
1045	     5B A3 CE CC  97 4D 55 C0  9E 61 E4 B0  EF 9C AE F9
1046	     EB 83 BB 9D  5F 0F 39 F6  5F AA 06 DC  67 2A 67 71
1047	     4F FF 8F 83  C4 55 38 36  38 AE 42 7A  82 9C 85 BB

1049	   The prefixed Binary Data Sequence is thus
1050	     60 C6 AF B7  C0 FE BE 04  E5 AE 94 E3  7B AA 5F 1A
1051	     40 5B

1053	   The 125 bit fingerprint value is MDDK-7N6A-727A-JZNO-STRX-XKS7-DJAF

1055	   This fingerprint MAY be specified with higher or lower precision as
1056	   appropriate.

1058	   100 bit precision  MDDK-7N6A-727A-JZNO-STRX

1060	   120 bit precision  MDDK-7N6A-727A-JZNO-STRX-XKS7

1062	   200 bit precision  MDDK-7N6A-727A-JZNO-STRX-XKS7-DJAF-XI6O-ZSLU-2VOA

1064	   260 bit precision  MDDK-7N6A-727A-JZNO-STRX-XKS7-DJAF-XI6O-ZSLU-2VOA-
1065	      TZQ6-JMHP-TSXP

1067	5.1.7.  Using SHA-3-512 Digest
1068	   H(<Data>) =
1069	     6D 2E CF E6  93 5A 0C FC  F2 A9 1A 49  E0 0C D8 07
1070	     A1 4E 70 AB  72 94 6E CC  BB 47 48 F1  8E 41 49 95
1071	     07 1D F3 6E  0D 0C 8B 60  39 C1 8E B4  0F 6E C8 08
1072	     65 B4 C4 45  9B A2 7E 97  74 7B BE 68  BC A8 C2 17

1074	   <Content-ID> + ':' + H(<Data>) =
1075	     74 65 78 74  2F 70 6C 61  69 6E 3A 6D  2E CF E6 93
1076	     5A 0C FC F2  A9 1A 49 E0  0C D8 07 A1  4E 70 AB 72
1077	     94 6E CC BB  47 48 F1 8E  41 49 95 07  1D F3 6E 0D
1078	     0C 8B 60 39  C1 8E B4 0F  6E C8 08 65  B4 C4 45 9B
1079	     A2 7E 97 74  7B BE 68 BC  A8 C2 17

1081	   H(<Content-ID> + ':' + H(<Data>)) =
1082	     8A 86 8A 06  1C 54 6E 7E  3F 75 5F 39  88 F9 FD 2F
1083	     8E C8 45 93  1B 80 A8 2F  29 16 7B A3  BE 21 1F 8A
1084	     75 61 88 A1  D5 7F 07 D5  9D 68 A4 2D  17 F4 4D 23
1085	     F9 E4 0B B2  1A 8D B9 F5  8D FC EC BD  01 F4 37 7C

1087	   The prefixed Binary Data Sequence is thus
1088	     50 8A 86 8A  06 1C 54 6E  7E 3F 75 5F  39 88 F9 FD
1089	     2F 8E

1091	   The 125 bit fingerprint value is KCFI-NCQG-DRKG-47R7-OVPT-TCHZ-7UXY

1093	5.1.8.  Using SHA-2-512 Digest with Compression

1095	   The content data "UDF Compressed Document 4187123" produces a UDF
1096	   Content Digest SHA-2-512 binary value with 20 trailing zeros and is
1097	   therefore presented using compressed presentation:

1099	   Data =
1100	     55 44 46 20  43 6F 6D 70  72 65 73 73  65 64 20 44
1101	     6F 63 75 6D  65 6E 74 20  34 31 38 37  31 32 33

1103	   The UTF8 Content Digest is given as:

1105	   H(<Data>) =
1106	     36 21 FA 2A  C5 D8 62 5C  2D 0B 45 FB  65 93 FC 69
1107	     C1 ED F7 00  AE 6F E3 3D  38 13 FE AB  76 AA 74 13
1108	     6D 5A 2B 20  DE D6 A5 CF  6C 04 E6 56  3F F3 C0 C7
1109	     C4 1D 3F 43  DD DC F1 A5  67 A7 E0 67  9A B0 C6 B7

1111	   <Content-ID> + ':' + H(<Data>) =
1112	     74 65 78 74  2F 70 6C 61  69 6E 3A 36  21 FA 2A C5
1113	     D8 62 5C 2D  0B 45 FB 65  93 FC 69 C1  ED F7 00 AE
1114	     6F E3 3D 38  13 FE AB 76  AA 74 13 6D  5A 2B 20 DE
1115	     D6 A5 CF 6C  04 E6 56 3F  F3 C0 C7 C4  1D 3F 43 DD
1116	     DC F1 A5 67  A7 E0 67 9A  B0 C6 B7

1118	   H(<Content-ID> + ':' + H(<Data>)) =
1119	     8E 14 D9 19  4E D6 02 12  C3 30 A7 BB  5F C7 17 6D
1120	     AE 9A 56 7C  A8 2A 23 1F  96 75 ED 53  10 EC E8 F2
1121	     60 14 24 D0  C8 BC 55 3D  C0 70 F7 5E  86 38 1A 0B
1122	     CB 55 9C B2  87 81 27 FF  3C EC E2 F0  90 A0 00 00

1124	   The prefixed Binary Data Sequence is thus
1125	     61 8E 14 D9  19 4E D6 02  12 C3 30 A7  BB 5F C7 17
1126	     6D AE

1128	   The 125 bit fingerprint value is MGHB-JWIZ-J3LA-EEWD-GCT3-WX6H-C5W2

1130	5.1.9.  Using SHA-3-512 Digest with Compression

1132	   The content data "UDF Compressed Document 774665" produces a UDF
1133	   Content Digest SHA-3-512 binary value with 20 trailing zeros and is
1134	   therefore presented using compressed presentation:

1136	   Data =
1137	     55 44 46 20  43 6F 6D 70  72 65 73 73  65 64 20 44
1138	     6F 63 75 6D  65 6E 74 20  37 37 34 36  36 35

1140	   The UTF8 SHA-3-512 Content Digest is KEJI-Y225-BDUG-XX22-MXKE-5ITF-
1141	   YVYM

1143	5.2.  Authenticator UDF

1145	   An authenticator Type UDF consists of the type identifier octet
1146	   followed by the Binary Data Sequence value.

1148	   The type identifier specifies the digest and Message Authentication
1149	   Code algorithm.  Two algorithm suites are currently specified.  Use
1150	   of compression is not supported.

1152	   The Authenticator UDF for given content data and key is generated by
1153	   the steps of:

1155	   1.  Applying the digest algorithm to determine the Content Digest
1156	       Value

1158	   2.  Applying the MAC algorithm to determine the Authentication value

1160	   3.  Determining the Type Identifier octet from the digest and
1161	       Message Authentication Code algorithm.

1163	   4.  Truncating the Authentication value to determine the Binary Data
1164	       Sequence value.

1166	   The key used to calculate an Authenticator Type UDF is always a
1167	   Unicode string.  If use of a binary value as a key is required, the
1168	   value MUST be converted to a string format first.  For example, by
1169	   conversion to an Encryption/Authentication type UDF.

1171	5.2.1.  Content Digest Value

1173	   The Content Digest Value (CDV) is determined in the exact same
1174	   fashion as for a Content Digest UDF by applying the digest algorithm
1175	   to the content data:

1177	   CDV = H(<Data>)

1179	   Where

1181	      H(x) is the cryptographic digest function

1183	      <Data> is the binary data.

1185	5.2.2.  Authentication Value

1187	   The Authentication Value (AV) is determined by applying the MAC
1188	   algorithm to the content type identifier and the CDV:

1190	   AV = MAC (<OKM>, (<Content-ID> + ':' + CDV))

1192	   Where

1194	      <OKM> is the authentication key as specified below

1196	      MAC( <Key>, <data>) is the result of applying the Message
1197	      Authentication Code algorithm with key <Key> to data <data>

1199	   The value of <OKM> is calculated as follows:

1201	   IKM = UTF8 (Key)
1202	   PRK = MAC (UTF8 ("KeyedUDFMaster"), IKM)
1203	   OKM = HKDF-Expand(PRK, UTF8 ("KeyedUDFExpand"), HashLen)

1205	   Where the function UTF8(string) converts a string to the binary UTF8
1206	   representation, HKDF-Expand is as defined in [RFC5869] and the
1207	   function MAC(k,m) is the HMAC function formed from the specified hash
1208	   H(m) as specified in [RFC2014] .
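
   The key derivation and Authentication Value calculation can be
   sketched as follows for the HMAC-SHA-2-512 suite.  The function
   names are hypothetical, HashLen is assumed to be the 64 octet
   output length of SHA-2-512, and HKDF-Expand is written out by hand
   because the Python standard library does not provide it.

```python
import hashlib
import hmac

HASH = hashlib.sha512
HASH_LEN = HASH().digest_size  # 64 octets for SHA-2-512


def mac(key: bytes, data: bytes) -> bytes:
    """HMAC formed from the digest H, with the key as first argument."""
    return hmac.new(key, data, HASH).digest()


def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """HKDF-Expand: T(i) = HMAC(PRK, T(i-1) + info + i), concatenated."""
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = mac(prk, block + info + bytes([counter]))
        okm += block
        counter += 1
    return okm[:length]


def authentication_value(key: str, content_id: str, data: bytes) -> bytes:
    """AV = MAC(OKM, <Content-ID> + ':' + CDV)"""
    ikm = key.encode("utf-8")
    prk = mac(b"KeyedUDFMaster", ikm)
    okm = hkdf_expand(prk, b"KeyedUDFExpand", HASH_LEN)
    cdv = HASH(data).digest()
    return mac(okm, content_id.encode("utf-8") + b":" + cdv)


av = authentication_value("NDD7-6CMX-H2FW-ISAL-K4VB-DQ3E-PEDM",
                          "text/plain", b"Konrad is the traitor")
print(av.hex(" "))
```

   Since the requested output length equals the digest length, the
   expansion loop runs exactly once in this configuration.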

1210	   Keyed UDFs are typically used in circumstances where user interaction
1211	   requires a cryptographic commitment type functionality.

1213	   In the following example, <Content-ID> is the UTF8 encoding of the
1214	   string "text/plain" and <Data> is the UTF8 encoding of the string
1215	   "Konrad is the traitor".  The randomly chosen key is NDD7-6CMX-H2FW-
1216	   ISAL-K4VB-DQ3E-PEDM.

1218	   Data =
1219	     4B 6F 6E 72  61 64 20 69  73 20 74 68  65 20 74 72
1220	     61 69 74 6F  72

1222	   ContentType =
1223	     74 65 78 74  2F 70 6C 61  69 6E

1225	   Key =
1226	     4E 44 44 37  2D 36 43 4D  58 2D 48 32  46 57 2D 49
1227	     53 41 4C 2D  4B 34 56 42  2D 44 51 33  45 2D 50 45
1228	     44 4D

1230	   Processing is performed in the same manner as an unkeyed fingerprint
1231	   except that compression is never used:

1233	   H(<Data>) =
1234	     93 FC DA F9  FA FD 1E 26  50 26 C3 C1  28 43 40 73
1235	     D8 BC 3D 62  87 73 2B 73  B8 EC 93 B6  DE 80 FF DA
1236	     70 0A D1 CE  E8 F4 36 68  EF 4E 71 63  41 53 91 5C
1237	     CE 8C 5C CE  C7 9A 46 94  6A 35 79 F9  33 70 85 01

1239	   <Content-ID> + ':' + H(<Data>) =
1240	     74 65 78 74  2F 70 6C 61  69 6E 3A 93  FC DA F9 FA
1241	     FD 1E 26 50  26 C3 C1 28  43 40 73 D8  BC 3D 62 87
1242	     73 2B 73 B8  EC 93 B6 DE  80 FF DA 70  0A D1 CE E8
1243	     F4 36 68 EF  4E 71 63 41  53 91 5C CE  8C 5C CE C7
1244	     9A 46 94 6A  35 79 F9 33  70 85 01

1246	   PRK(Key) =
1247	     77 D3 0A 08  39 BD 9D C0  97 44 DA 33  15 0A 42 5E
1248	     CD 17 80 03  B3 CF CC 89  7A C7 84 12  B4 51 5B 25
1249	     DC 26 F5 E1  1B 20 F3 89  2E 9A 1A 7B  0E 73 23 39
1250	     0E C3 4C EF  2D 40 DA 05  B4 70 C6 1C  82 C1 49 33

1252	   HKDF(Key) =
1253	     BF A9 B4 58  9C 1D 68 D7  9A B7 11 F6  C8 98 59 14
1254	     20 D7 82 67  C5 84 22 E5  A0 F9 93 52  B1 C3 87 EB
1255	     05 06 CB C4  E4 D6 E6 EE  1F F0 D4 7A  97 68 5E CE
1256	     28 1C CA AF  D8 B5 D1 24  4A 71 EC E3  AC B5 D2 04

1258	   MAC(<key>, <Content-ID> + ':' + H(<Data>)) =
1259	     4C C3 7F D3  F9 9E 52 CF  07 90 74 53  84 65 95 BC
1260	     1A 2B A5 D1  68 9D 05 6D  06 C5 CA BF  17 CB E0 49
1261	     95 39 57 08  79 C4 E5 49  D3 3A 59 A3  32 05 45 A6
1262	     30 26 25 AE  8A F4 47 C6  1F B5 33 7F  AD 69 A6 30

1264	   The prefixed Binary Data Sequence is thus
1265	     00 4C C3 7F  D3 F9 9E 52  CF 07 90 74  53 84 65 95
1266	     BC 1A

1268	   The 125 bit fingerprint value is ABGM-G76T-7GPF-FTYH-SB2F-HBDF-SW6B

1270	5.3.  Content Type Values

1272	   While a UDF fingerprint MAY be used to identify any form of static
1273	   data, the use of a UDF fingerprint to identify a public key signature
1274	   key provides a level of indirection and thus the ability to identify
1275	   dynamic data.  The content types used to identify public keys are
1276	   thus of particular interest.

1278	   As described in the security considerations section, the use of
1279	   fingerprints to identify a bare public key and the use of
1280	   fingerprints to identify a public key and associated security policy
1281	   information are very different.

1283	5.3.1.  PKIX Certificates and Keys

1285	   UDF fingerprints MAY be used to identify PKIX certificates, CRLs and
1286	   public keys in the ASN.1 encoding used in PKIX certificates.

1288	   Since PKIX certificates and CRLs contain security policy information,
1289	   UDF fingerprints used to identify certificates or CRLs SHOULD be
1290	   presented with a minimum of 200 bits of precision.  PKIX applications
1291	   MUST NOT accept UDF fingerprints specified with less than 200 bits of
1292	   precision for purposes of identifying trust anchors.

1294	   PKIX certificates, keys and related content data are identified by
1295	   the following content types:

1297	   application/pkix-cert  A PKIX Certificate

1299	   application/pkix-crl  A PKIX CRL

1301	   application/pkix-keyinfo  The KeyInfo structure defined in the PKIX
1302	      certificate specification.

1304	5.3.2.  OpenPGP Key

1306	   OpenPGPv5 keys and key set content data are identified by the
1307	   following content type:

1309	   application/pgp-keys  An OpenPGP key set.

1311	5.3.3.  DNSSEC

1313	   DNSSEC record data consists of DNS records which are identified by
1314	   the following content type:

1316	   application/dns  A DNS resource record in binary format

1318	6.  UDF URIs

1320	   The UDF URI scheme describes a means of constructing URIs from a UDF
1321	   value.

1323	   Two forms of UDF URI are specified, Name and Locator.  In both cases
1324	   the URI MUST specify the scheme type "UDF", and a UDF fingerprint and
1325	   MAY specify a query identifier and/or a fragment identifier.

1327	   By definition a Locator form URI contains an authority field which
1328	   MUST be a DNS domain name.  The use of IP address forms for this
1329	   purpose is not permitted.

1331	   Name Form URIs allow static content data to be identified without
1332	   specifying the means by which the content data may be retrieved.
1333	   Locator form URIs allow static content data or dynamic network
1334	   resources to be identified and the means of retrieval.

1336	   The syntax of a UDF URI is a subset of the generic URI syntax
1337	   specified in [RFC3986].  The use of userinfo and port numbers is not
1338	   supported and the path part of the URI is a UDF in Base32
1339	   presentation.

1341	   URI           = "UDF:" udf [ "?" query ] [ "#" fragment ]

1343	   udf           = name-form / locator-form

1345	   name-form     = udf-value
1346	   locator-form  = "//" authority "/" udf-value

1348	   authority     = host
1349	   host          = reg-name
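
   The grammar above can be exercised with a small parser.  The
   following C# sketch is illustrative only: the names UdfUri and Parse
   are hypothetical and not defined by this specification, the fragment
   delimiter is assumed to be "#" as in the generic syntax of [RFC3986],
   and the udf-value itself is not validated.

```csharp
using System;

namespace Examples {

    /// <summary>Hypothetical container for the parts of a UDF URI.</summary>
    public sealed class UdfUri {
        public string Authority { get; private set; }  // null for the name form
        public string Udf { get; private set; }
        public string Query { get; private set; }      // null if absent
        public string Fragment { get; private set; }   // null if absent
        public bool IsLocator => Authority != null;

        public static UdfUri Parse(string uri) {
            const string scheme = "UDF:";
            if (!uri.StartsWith(scheme, StringComparison.OrdinalIgnoreCase)) {
                throw new FormatException("Scheme is not UDF");
                }
            var result = new UdfUri();
            var rest = uri.Substring(scheme.Length);

            // Split off the optional fragment first, then the optional query.
            var hash = rest.IndexOf('#');
            if (hash >= 0) {
                result.Fragment = rest.Substring(hash + 1);
                rest = rest.Substring(0, hash);
                }
            var mark = rest.IndexOf('?');
            if (mark >= 0) {
                result.Query = rest.Substring(mark + 1);
                rest = rest.Substring(0, mark);
                }

            // Locator form: "//" authority "/" udf-value
            if (rest.StartsWith("//", StringComparison.Ordinal)) {
                var slash = rest.IndexOf('/', 2);
                if (slash < 0) {
                    throw new FormatException("Locator form has no UDF value");
                    }
                result.Authority = rest.Substring(2, slash - 2);
                rest = rest.Substring(slash + 1);

                // Rejects IP address forms as well as userinfo and ports.
                if (Uri.CheckHostName(result.Authority) != UriHostNameType.Dns) {
                    throw new FormatException("Authority must be a DNS name");
                    }
                }

            if (rest.Length == 0) {
                throw new FormatException("UDF value is empty");
                }
            result.Udf = rest;
            return result;
            }
        }
    }
```

   With a placeholder fingerprint, UdfUri.Parse("UDF:MDDK-7N6A-727A-JZNO")
   yields a name form value, while
   UdfUri.Parse("UDF://example.com/MDDK-7N6A-727A-JZNO") yields a
   locator form value whose Authority is "example.com".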

1351	6.1.  Name form

1353	   Name form UDF URIs provide a means of presenting a UDF value in a
1354	   context in which a URI form of a name is required without providing a
1355	   means of resolution.

1357	   Adding the UDF scheme prefix to a UDF fingerprint does not change the
1358	   semantics of the fingerprint itself.  The semantics of the name
1359	   result from the context in which it is used.

1361	   For example, a UDF value of any type MAY be used to give a unique
1362	   targetNamespace value in an XML Schema [XMLSchema].

1364	6.2.  Locator form

1366	   The locator form of an unkeyed UDF URI is resolved by the following
1367	   steps:

1369	   o  Use DNS Web service discovery to determine the Web Service
1370	      Endpoint.

1372	   o  Determine the content identifier from the source URI.

1374	   o  Append the content identifier to the Web Service Endpoint as a
1375	      suffix to form the target URI.

1377	   o  Retrieve content from the Web Service Endpoint by means of a GET
1378	      method.

1380	   o  Perform post processing as specified by the UDF type.

1382	6.2.1.  DNS Web service discovery

1384	   DNS Web Discovery is performed as specified in
1385	   [draft-hallambaker-web-service-discovery] for the service mmm-udf and
1386	   domain name specified in the URI.  For a full description of the
1387	   discovery mechanism, consult the referenced specification.

1389	   The use of DNS Web Discovery permits service providers to make full
1390	   use of the load balancing and service description capabilities
1391	   afforded by use of DNS SRV and TXT records in accordance with the
1392	   approach described in [RFC6763].

1394	   If no SRV or TXT records are specified, DNS Web Discovery specifies
1395	   that the Web Service Endpoint be the Well Known Service [RFC5785]
1396	   with the prefix /.well-known/srv/mmm-udf.

1398	6.2.2.  Content Identifier

1400	   For all UDF types other than Secret Share, the Content Identifier
1401	   value is the UDF SHA-2-512 Content Digest of the canonical form of
1402	   the UDF specified in the source URI presented at twice the precision
1403	   to a maximum of 440 bits.

1405	   If the UDF is of type Secret Share, the shared secret MUST be
1406	   recovered before the content identifier can be resolved.  The shared
1407	   secret is then expressed as a UDF of type Encryption/Authentication
1408	   and the Content Identifier determined as for an Encryption/
1409	   Authentication type UDF.

1411	6.2.3.  Target URI

1413	   The target URI is formed by appending a slash separator '/' and the
1414	   Content Identifier value to the Web Service Endpoint.

1416	   Since the path portion of a URI is case sensitive, the UDF value MUST
1417	   be specified in upper case and MUST include separator marks.
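
   The construction of the target URI can be sketched as follows.  This
   is a minimal C# illustration rather than a normative algorithm: the
   names UdfTarget, WellKnownEndpoint and TargetUri are hypothetical,
   the use of the https scheme for the Well Known Service fallback is an
   assumption, and the Content Identifier is taken as an input that has
   already been computed.

```csharp
using System;

namespace Examples {

    public static class UdfTarget {

        // Fallback Web Service Endpoint used when no SRV or TXT records
        // are found. The "https" scheme is an assumption of this sketch.
        public static string WellKnownEndpoint(string domain) {
            return "https://" + domain + "/.well-known/srv/mmm-udf";
            }

        // Append '/' and the Content Identifier to the endpoint. The
        // identifier is forced to upper case because URI paths are case
        // sensitive; separator marks are expected to be present already.
        public static string TargetUri(string endpoint, string contentId) {
            return endpoint.TrimEnd('/') + "/" + contentId.ToUpperInvariant();
            }
        }
    }
```

   The resulting string is the URI to which the GET request is
   directed.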

1419	6.2.4.  Postprocessing

1421	   After retrieving the content data, the resolver MUST perform post
1422	   processing as indicated by the content type:

1424	   Nonce  No additional post processing is required.

1426	   Content Digest  The resolver MUST verify that the content returned
1427	      matches the UDF fingerprint value.

1429	   Authenticator  The resolver MUST verify that the content returned
1430	      matches the UDF fingerprint value.

1432	   Encryption/Authentication  The content data returned is decrypted and
1433	      authenticated using the key specified in the UDF value as the
1434	      initial keying material (see below).

1436	   Secret Share (set)  The content data returned is decrypted and
1437	      authenticated using the shared secret as the initial keying
1438	      material (see below).
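
   The mapping from UDF type to post processing action listed above can
   be expressed as a simple dispatch.  The enumerations and the
   UdfPostProcessing class in this C# sketch are hypothetical names
   introduced for illustration; the verification and decryption steps
   themselves are not implemented here.

```csharp
using System;

namespace Examples {

    public enum UdfType {
        Nonce, ContentDigest, Authenticator,
        EncryptionAuthentication, SecretShare
        }

    public enum PostProcess {
        None, VerifyFingerprint,
        DecryptWithUdfKey, DecryptWithSharedSecret
        }

    public static class UdfPostProcessing {

        // Return the post processing step the resolver has to perform
        // for content retrieved via a UDF of the given type.
        public static PostProcess Required(UdfType type) {
            switch (type) {
                case UdfType.Nonce:
                    return PostProcess.None;
                case UdfType.ContentDigest:
                case UdfType.Authenticator:
                    return PostProcess.VerifyFingerprint;
                case UdfType.EncryptionAuthentication:
                    return PostProcess.DecryptWithUdfKey;
                case UdfType.SecretShare:
                    return PostProcess.DecryptWithSharedSecret;
                default:
                    throw new ArgumentOutOfRangeException(nameof(type));
                }
            }
        }
    }
```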

1440	6.2.5.  Decryption and Authentication

1442	   The steps performed to decode cryptographically enhanced content data
1443	   depend on the content type specified in the returned content.  Two
1444	   formats are currently supported:

1446	   o  DARE Message format as specified in
1447	      [draft-hallambaker-dare-message]

1449	   o  Cryptographic Message Syntax (CMS) Symmetric Key Package as
1450	      specified in [RFC6031]

1452	6.2.6.  QR Presentation

1454	   Encoding of a UDF URI as a QR code requires only the characters in
1455	   alphanumeric encoding, thus achieving compactness with minimal
1456	   overhead.

1458	7.  Strong Internet Names

1460	   A Strong Internet Name is an Internet address that is bound to a
1461	   policy governing interpretation of that address by means of a Content
1462	   Digest type UDF of the policy expressed as a UDF prefixed DNS label
1463	   within the address itself.

1465	   The Reserved LDH labels as defined in [RFC5890] that begin with the
1466	   prefix mm-- are reserved for use as Strong Internet Names.  The
1467	   characters following the prefix are a Content Digest type UDF in
1468	   Base32 presentation.

1470	   Since DNS labels are limited to 63 characters, the presentation of
1471	   the SIN itself is limited to 59 characters and thus 240 bits of
1472	   precision.
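
   The length arithmetic above can be checked with a short C# sketch.
   The class and method names are hypothetical, and the sketch assumes
   that the UDF is presented in groups separated by '-' marks and that
   each remaining Base32 character carries 5 bits.

```csharp
using System;

namespace Examples {

    public static class StrongName {
        const string Prefix = "mm--";

        // Extract the UDF from a DNS label carrying the mm-- prefix.
        // Returns false if the label is too long or lacks the prefix.
        public static bool TryGetUdf(string label, out string udf) {
            udf = null;
            if (label.Length > 63 ||
                !label.StartsWith(Prefix, StringComparison.OrdinalIgnoreCase)) {
                return false;
                }
            var value = label.Substring(Prefix.Length);
            if (value.Length == 0) {
                return false;
                }
            udf = value.ToUpperInvariant();
            return true;
            }

        // Precision in bits: 5 bits per Base32 character, not counting
        // the '-' separator marks.
        public static int PrecisionBits(string udf) {
            var characters = 0;
            foreach (var c in udf) {
                if (c != '-') {
                    characters++;
                    }
                }
            return characters * 5;
            }
        }
    }
```

   A 59 character presentation made up of twelve groups of four
   characters and eleven separators gives 48 x 5 = 240 bits.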

1474	8.  Security Considerations

1476	8.1.  Confidentiality

1478	   An encrypted locator is a bearer token.

1480	8.2.  Availability

1482	   Corruption of a part of a shared secret may prevent recovery.

1484	8.3.  Integrity

1486	   Shared secret parts do not contain context information to specify
1487	   which secret they relate to.

1489	8.4.  Work Factor and Precision

1491	   A given UDF data object has a single fingerprint value that may be
1492	   presented at different precisions.  The shortest legitimate precision
1493	   with which a UDF fingerprint may be presented has 96 significant bits.

1495	   A UDF fingerprint presents the same work factor as any other
1496	   cryptographic digest function.  The difficulty of finding a second
1497	   data item that matches a given fingerprint is 2^n and the difficulty
1498	   of finding two data items that have the same fingerprint is 2^(n/2),
1499	   where n is the precision of the fingerprint.

1501	   For the algorithms specified in this document, n = 512 and thus the
1502	   work factor for finding collisions is 2^256, a value that is
1503	   generally considered to be computationally infeasible.

1505	   Since the use of 512 bit fingerprints is impractical in the type of
1506	   applications where fingerprints are generally used, truncation is a
1507	   practical necessity.  The longer a fingerprint is, the less likely it
1508	   is that a user will check every character.  It is therefore important
1509	   to consider carefully whether the security of an application depends
1510	   on second pre-image resistance or collision resistance.

1512	   In most fingerprint applications, such as the use of fingerprints to
1513	   identify public keys, the fact that a malicious party might generate
1514	   two keys that have the same fingerprint value is a minor concern.

1516	   Combined with a flawed protocol architecture, such a vulnerability
1517	   may permit an attacker to construct a document such that the
1518	   signature will be accepted as valid by some parties but not by
1519	   others.

1521	   For example, Alice generates keypairs until two are generated that
1522	   have the same 100 bit UDF presentation (typically 2^48 attempts).
1523	   She registers one keypair with a merchant and the other with her
1524	   bank.  This allows Alice to create a payment instrument that will be
1525	   accepted as valid by one and rejected by the other.

1527	   The ability to generate two PKIX certificates with the same
1528	   fingerprint and different certificate attributes raises very
1529	   different and more serious security concerns.  For example, an
1530	   attacker might generate two certificates with the same key and
1531	   different use constraints.  This might allow an attacker to present a
1532	   highly constrained certificate that does not present a security risk
1533	   to an application for purposes of gaining approval and an
1534	   unconstrained certificate to request a malicious action.

1536	   In general, any use of fingerprints to identify data that has
1537	   security policy semantics requires the risk of collision attacks to
1538	   be considered.  For this reason, short, 'user friendly'
1539	   fingerprint presentations (less than 200 bits) SHOULD only be used
1540	   for public key values.

1542	8.5.  Semantic Substitution

1544	   Many applications record the fact that a data item is trusted; rather
1545	   fewer record the circumstances in which the data item is trusted.
1546	   This results in a semantic substitution vulnerability which an
1547	   attacker may exploit by presenting the trusted data item in the wrong
1548	   context.

1550	   The UDF format provides protection against high level semantic
1551	   substitution attacks by incorporating the content type into the input
1552	   to the outermost fingerprint digest function.  The work factor for
1553	   generating a UDF fingerprint that is valid in both contexts is thus
1554	   the same as the work factor for finding a second preimage in the
1555	   digest function (2^512 for the specified digest algorithms).

1557	   It is thus infeasible to generate a data item such that some
1558	   applications will interpret it as a PKIX key and others will accept
1559	   it as an OpenPGP key.  While attempting to parse a PKIX key as an
1560	   OpenPGP key is virtually certain to fail to return the correct key
1561	   parameters, it cannot be assumed that the attempt is guaranteed to
1562	   fail with an error message.

1564	   The UDF format does not provide protection against semantic
1565	   substitution attacks that do not affect the content type.

1567	8.6.  QR Code Scanning

1569	   The act of scanning a QR code SHOULD be considered equivalent to
1570	   clicking on an unlabeled hypertext link.  Since QR codes are scanned
1571	   in many different contexts, the mere act of scanning a QR code MUST
1572	   NOT be interpreted as constituting an affirmative acceptance of terms
1573	   or conditions or as creating an electronic signature.

1575	   If such semantics are required in the context of an application,
1576	   these MUST be established by secondary user actions made subsequent
1577	   to the scanning of the QR code.

1579	   There is a risk that use of QR codes to automate processes such as
1580	   payment will lead to abusive practices such as presentation of
1581	   fraudulent invoices for goods not ordered or delivered.  It is
1582	   therefore important to ensure that such requests are subject to
1583	   adequate accountability controls.

1585	9.  IANA Considerations

1587	   Registrations are requested in the following registries:

1589	   o  Service Name and Transport Protocol Port Number

1591	   o  well-known URI registry

1593	   o  Uniform Resource Identifier (URI) Schemes

1595	   o  Media Types

1597	   In addition, the creation of the following registry is requested:
1598	   Uniform Data Fingerprint Type Identifier Registry.

1600	9.1.  Protocol Service Name

1602	   The following registration is requested in the Service Name and
1603	   Transport Protocol Port Number Registry in accordance with [RFC6355]

1605	   Service Name (REQUIRED)  mmm-udf

1607	   Transport Protocol(s) (REQUIRED)  TCP

1609	   Assignee (REQUIRED)  Phillip Hallam-Baker, phill@hallambaker.com

1611	   Contact (REQUIRED)  Phillip Hallam-Baker, phill@hallambaker.com
1612	   Description (REQUIRED)  mmm-udf is a Web Service protocol that
1613	      resolves Mathematical Mesh Uniform Data Fingerprints (UDF) to
1614	      resources.  The mmm-udf service name is used in service discovery
1615	      to identify a Web Service endpoint to perform resolution of a UDF
1616	      presented in URI locator form.

1618	   Reference (REQUIRED)  [This document]

1620	   Port Number (OPTIONAL)  None

1622	   Service Code (REQUIRED for DCCP only)  None

1624	   Known Unauthorized Uses (OPTIONAL)  None

1626	   Assignment Notes (OPTIONAL)  None

1628	9.2.  Well Known

1630	   The following registration is requested in the well-known URI
1631	   registry in accordance with [RFC5785]

1633	   URI suffix  srv/mmm-udf

1635	   Change controller  Phillip Hallam-Baker, phill@hallambaker.com

1637	   Specification document(s):  [This document]

1639	   Related information

1641	   [draft-hallambaker-web-service-discovery]

1643	9.3.  URI Registration

1645	   The following registration is requested in the Uniform Resource
1646	   Identifier (URI) Schemes registry in accordance with [RFC7595]

1648	   Scheme name:  UDF

1650	   Status:  Provisional

1652	   Applications/protocols that use this scheme name:  Mathematical Mesh
1653	      Service protocols (mmm)

1655	   Contact:  Phillip Hallam-Baker mailto:phill@hallambaker.com

1657	   Change controller:  Phillip Hallam-Baker

1659	   References:  [This document]

1661	9.4.  Media Types Registrations

1663	9.4.1.  Media Type: application/pkix-keyinfo

1665	   Type name:  application

1667	   Subtype name:  pkix-keyinfo

1669	   Required parameters:  None

1671	   Optional parameters:  None

1673	   Encoding considerations:  None

1675	   Security considerations:  Described in [This]

1677	   Interoperability considerations:  None

1679	   Published specification:  [This]

1681	   Applications that use this media type:  Uniform Data Fingerprint

1683	   Fragment identifier considerations:  None

1685	   Additional information:  Deprecated alias names for this type: None

1687	      Magic number(s): None

1689	      File extension(s): None

1691	      Macintosh file type code(s): None

1693	   Person & email address to contact for further information:
1694	      Phillip Hallam-Baker <phill@hallambaker.com>

1696	   Intended usage:  Content type identifier to be used in constructing
1697	      UDF Content Digests and Authenticators and related cryptographic
1698	      purposes.

1700	   Restrictions on usage:  None

1702	   Author:  Phillip Hallam-Baker

1704	   Change controller:  Phillip Hallam-Baker

1706	   Provisional registration? (standards tree only):  Yes

1708	9.4.2.  Media Type: application/udf-encryption

1710	   Type name:  application

1712	   Subtype name:  udf-encryption

1714	   Required parameters:  None

1716	   Optional parameters:  None

1718	   Encoding considerations:  None

1720	   Security considerations:  Described in [This]

1722	   Interoperability considerations:  None

1724	   Published specification:  [This]

1726	   Applications that use this media type:  Uniform Data Fingerprint

1728	   Fragment identifier considerations:  None

1730	   Additional information:  Deprecated alias names for this type: None

1732	      Magic number(s): None

1734	      File extension(s): None

1736	      Macintosh file type code(s): None

1738	   Person & email address to contact for further information:
1739	      Phillip Hallam-Baker <phill@hallambaker.com>

1741	   Intended usage:  Content type identifier to be used in constructing
1742	      UDF Content Digests and Authenticators and related cryptographic
1743	      purposes.

1745	   Restrictions on usage:  None

1747	   Author:  Phillip Hallam-Baker

1749	   Change controller:  Phillip Hallam-Baker

1751	   Provisional registration? (standards tree only):  Yes

1753	9.4.3.  Media Type: application/udf-secret

1755	   Type name:  application

1757	   Subtype name:  udf-secret

1759	   Required parameters:  None

1761	   Optional parameters:  None

1763	   Encoding considerations:  None

1765	   Security considerations:  Described in [This]

1767	   Interoperability considerations:  None

1769	   Published specification:  [This]

1771	   Applications that use this media type:  Uniform Data Fingerprint

1773	   Fragment identifier considerations:  None

1775	   Additional information:  Deprecated alias names for this type: None

1777	      Magic number(s): None

1779	      File extension(s): None

1781	      Macintosh file type code(s): None

1783	   Person & email address to contact for further information:
1784	      Phillip Hallam-Baker <phill@hallambaker.com>

1786	   Intended usage:  Content type identifier to be used in constructing
1787	      UDF Content Digests and Authenticators and related cryptographic
1788	      purposes.

1790	   Restrictions on usage:  None

1792	   Author:  Phillip Hallam-Baker

1794	   Change controller:  Phillip Hallam-Baker

1796	   Provisional registration? (standards tree only):  Yes

1798	9.5.  Uniform Data Fingerprint Type Identifier Registry

1800	   This document describes a new extensible data format employing fixed
1801	   length version identifiers for UDF types.

1803	9.5.1.  The name of the registry

1805	   Uniform Data Fingerprint Type Identifier Registry

1807	9.5.2.  Required information for registrations

1809	   Registrants must specify the Type identifier code(s) requested,
1810	   description and RFC number for the corresponding standards action
1811	   document.

1813	   The standards document must specify the means of generating and
1814	   interpreting the UDF Data Sequence Value and the purpose(s) for which
1815	   it is proposed.

1817	   Since the initial letter of the Base32 presentation provides a
1818	   mnemonic function in UDFs, the standards document must explain why
1819	   the proposed Type Identifier and associated initial letter are
1820	   appropriate.  In cases where a new initial letter is to be created,
1821	   there must be an explanation of why this is appropriate.  If an
1822	   existing initial letter is to be reused, there must be an
1823	   explanation of why this is appropriate and/or acceptable.

1825	9.5.3.  Applicable registration policy

1827	   Due to the intended field of use (human data entry), the code space
1828	   is severely constrained.  Accordingly, it is intended that code point
1829	   registrations be as infrequent as possible.

1831	   Registration of new digest algorithms is strongly discouraged and
1832	   should not occur unless (1) there is a known security vulnerability
1833	   in one of the two schemes specified in the original assignment,
1834	   (2) the proposed algorithm has been subjected to rigorous peer
1835	   review, preferably in the form of an open, international competition,
1836	   and (3) the proposed algorithm has been adopted as a preferred
1837	   algorithm for use in IETF protocols.

1839	   Accordingly, the applicable registration policy is Standards Action.

1841	9.5.4.  Size, format, and syntax of registry entries

1843	   Each registry entry consists of a single byte code.

1845	9.5.5.  Initial assignments and reservations

1847	   The following entries should be added to the registry as initial
1848	   assignments:

1850	   Code  Description                      Reference
1851	   ---  -------------------               ---------
1852	   00   HMAC and SHA-2-512                [This document]
1853	   32   HKDF-AES-512                      [This document]
1854	   80   SHA-3-512                         [This document]
1855	   81   SHA-3-512 with 20 trailing zeros  [This document]
1856	   82   SHA-3-512 with 30 trailing zeros  [This document]
1857	   83   SHA-3-512 with 40 trailing zeros  [This document]
1858	   84   SHA-3-512 with 50 trailing zeros  [This document]
1859	   96   SHA-2-512                         [This document]
1860	   97   SHA-2-512 with 20 trailing zeros  [This document]
1861	   98   SHA-2-512 with 30 trailing zeros  [This document]
1862	   99   SHA-2-512 with 40 trailing zeros  [This document]
1863	   100  SHA-2-512 with 50 trailing zeros  [This document]
1864	   104  Random nonce                      [This document]
1865	   144  Shamir Secret Share               [This document]

1867	10.  Appendix A: Prime Values for Secret Sharing

1869	   The following are the prime values to be used for sharing secrets of
1870	   up to 512 bits.

1872	   If it is necessary to share larger secrets, the corresponding prime
1873	   may be found by choosing a value (2^32)^n that is larger than the
1874	   secret to be encoded and determining the smallest prime greater than
1875	   that value.

1877	                +----------------+------------------------+
1878	                | Number of bits | Offset = Prime_n - 2^n |
1879	                +----------------+------------------------+
1880	                | 32             | 15                     |
1881	                | 64             | 13                     |
1882	                | 96             | 61                     |
1883	                | 128            | 51                     |
1884	                | 160            | 7                      |
1885	                | 192            | 133                    |
1886	                | 224            | 735                    |
1887	                | 256            | 297                    |
1888	                | 288            | 127                    |
1889	                | 320            | 27                     |
1890	                | 352            | 55                     |
1891	                | 384            | 231                    |
1892	                | 416            | 235                    |
1893	                | 448            | 211                    |
1894	                | 480            | 165                    |
1895	                | 512            | 75                     |
1896	                +----------------+------------------------+

1898	                                  Table 3

1900	   For example, the prime to be used to share a 128 bit value is 2^128 +
1901	   51.
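
   Selecting the prime for a given secret size amounts to a table
   lookup.  The following C# sketch is illustrative: the names
   SharePrimes and PrimeFor are hypothetical, and rounding the secret
   size up to the next multiple of 32 bits is an assumption about how
   the table is intended to be applied.

```csharp
using System;
using System.Numerics;

namespace Examples {

    public static class SharePrimes {

        // Offsets copied from Table 3, indexed by (bits / 32) - 1.
        static readonly int[] Offsets = {
            15, 13, 61, 51, 7, 133, 735, 297,
            127, 27, 55, 231, 235, 211, 165, 75
            };

        // Return the prime 2^bits + offset, where bits is the secret
        // size rounded up to a multiple of 32 (assumption of this sketch).
        public static BigInteger PrimeFor(int secretBits) {
            if (secretBits < 1 || secretBits > 512) {
                throw new ArgumentOutOfRangeException(nameof(secretBits));
                }
            var index = (secretBits + 31) / 32 - 1;
            var bits = 32 * (index + 1);
            return BigInteger.Pow(2, bits) + Offsets[index];
            }
        }
    }
```

   For example, SharePrimes.PrimeFor(128) returns 2^128 + 51, matching
   the example above.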

1903	11.  Recovering Shamir Shared Secret

1905	   The value of a Shamir Shared secret may be recovered using Lagrange
1906	   basis polynomials.

1908	   To share a secret of L bits with a threshold of n shares, we
1909	   construct f(x), a polynomial of degree n-1 over the integers modulo
1910	   p, where p is the smallest prime greater than 2^L:

1912	   f(x) = a_0 + a_1.x + a_2.x^2 + ... + a_(n-1).x^(n-1)

1914	   The shared secret is the binary representation of the value a_0.

1916	   Given n shares (x_0, y_0), (x_1, y_1), ... (x_(n-1), y_(n-1)), the
1917	   corresponding Lagrange basis polynomials l_0, l_1, ... l_(n-1) are
1918	   given by:

1920	   l_m = ((x - x_(m_0)) / (x_m - x_(m_0))) . ((x - x_(m_1)) / (x_m -
1921	   x_(m_1))) . ... . ((x - x_(m_(n-2))) / (x_m - x_(m_(n-2))))

1923	   where m_0, m_1, ... m_(n-2) are the integers 0, 1, ... n-1,
1924	   excluding the value m.

1926	   These can be used to compute f(x) as follows:

1928	   f(x) = y_0.l_0 + y_1.l_1 + ... + y_(n-1).l_(n-1)

1930	   Since it is only the value of f(0) that we are interested in, we
1931	   compute the Lagrange basis for the value x = 0:

1933	   lz_m = product of (-x_j / (x_m - x_j)) for j = m_0, m_1, ... m_(n-2)

1935	   Hence,

1937	   a_0 = f(0) = y_0.lz_0 + y_1.lz_1 + ... + y_(n-1).lz_(n-1)

1939	   The following C# code recovers the values.

1941	   using System;
1942	   using System.Collections.Generic;
1943	   using System.Numerics;

1945	   namespace Examples {

1947	       class Examples {

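1949	           /// <summary>
1950	           /// Combine a set of <paramref name="n"/> points (x, f(x))
1951	           /// on a polynomial of degree <paramref name="n"/>-1 in a
1952	           /// discrete field modulo prime <paramref name="p"/> to
1953	           /// recover the value f(0) using Lagrange basis polynomials.
1954	           /// </summary>
1955	           /// <param name="fx">The values f(x).</param>
1956	           /// <param name="x">The values for x.</param>
1957	           /// <param name="p">The modulus.</param>
1958	           /// <param name="n">The number of shares to combine.</param>
1959	           /// <returns>The value f(0).</returns>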
1960	           static BigInteger CombineNK(
1961	                       BigInteger[] fx,
1962	                       int[] x,
1963	                       BigInteger p,
1964	                       int n) {
1965	               if (fx.Length < n || x.Length < n) {
1966	                   throw new ArgumentException("Insufficient shares");
1967	                   }

1969	               BigInteger accumulator = 0;
1970	               for (var formula = 0; formula < n; formula++) {
1971	                   var value = fx[formula];

1973	                   BigInteger numerator = 1, denominator = 1;
1974	                   for (var count = 0; count < n; count++) {
1975	                       if (formula == count) {
1976	                           continue;  // Skip the term for the share itself
1977	                           }

1979	                       var start = x[formula];
1980	                       var next = x[count];

1982	                       numerator = (numerator * -next) % p;
1983	                       denominator = (denominator * (start - next)) % p;
1984	                       }

1986	                   var InvDenominator = ModInverse(denominator, p);

1988	                   accumulator = Modulus((accumulator +
1989	                       (value * numerator * InvDenominator)), p);
1990	                   }

1992	               return accumulator;
1993	               }

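1995	           /// <summary>
1996	           /// Compute the modular multiplicative inverse of the value
1997	           /// <paramref name="k"/> modulo the prime <paramref name="p"/>.
1998	           /// </summary>
1999	           /// <param name="k">The value to find the inverse of</param>
2000	           /// <param name="p">The modulus.</param>
2001	           /// <returns>k^(p-2) mod p, the inverse when p is prime.</returns>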
2002	           static BigInteger ModInverse(
2003	                       BigInteger k,
2004	                       BigInteger p) {
2005	               var m2 = p - 2;
2006	               if (k < 0) {
2007	                   k = k + p;
2008	                   }

2010	               return BigInteger.ModPow(k, m2, p);
2011	               }

2013	           /// <summary>
2014	           /// Calculate the modulus of a number with correct handling
2015	           /// for negative numbers.
2016	           /// </summary>
2017	           /// <param name="x">Value</param>
2018	           /// <param name="p">The modulus.</param>
2019	           /// <returns>x mod p</returns>
2020	           public static BigInteger Modulus(
2021	                       BigInteger x,
2022	                       BigInteger p) {
2023	               var Result = x % p;
2024	               return Result.Sign >= 0 ? Result : Result + p;
2025	               }
2026	           }
2027	       }
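
   The recovery can be exercised end to end by generating shares from a
   known polynomial.  The following self-contained C# sketch is
   illustrative only: the names ShamirSketch, Evaluate and Recover are
   hypothetical, and it assumes non-negative coefficients below p and
   distinct positive x values below p.  In real use the coefficients
   other than a_0 would be chosen at random.

```csharp
using System;
using System.Numerics;

namespace Examples {

    public static class ShamirSketch {

        // Evaluate f(x) = a[0] + a[1].x + a[2].x^2 + ... modulo p
        // using Horner's rule. The share for x is (x, Evaluate(a, x, p)).
        public static BigInteger Evaluate(BigInteger[] a, int x, BigInteger p) {
            BigInteger y = 0;
            for (var i = a.Length - 1; i >= 0; i--) {
                y = (y * x + a[i]) % p;
                }
            return y;
            }

        // Recover f(0) from the shares (x[m], y[m]) by summing
        // y[m] times the Lagrange basis value at zero, modulo p.
        public static BigInteger Recover(int[] x, BigInteger[] y, BigInteger p) {
            BigInteger sum = 0;
            for (var m = 0; m < x.Length; m++) {
                BigInteger numerator = 1, denominator = 1;
                for (var j = 0; j < x.Length; j++) {
                    if (j == m) {
                        continue;  // Skip the term for the share itself
                        }
                    // -x_j and (x_m - x_j), both reduced to 0..p-1
                    numerator = numerator * (p - x[j]) % p;
                    denominator = denominator * ((x[m] - x[j] + p) % p) % p;
                    }
                // Inverse by Fermat's little theorem; requires p prime.
                var inverse = BigInteger.ModPow(denominator, p - 2, p);
                sum = (sum + y[m] * numerator % p * inverse) % p;
                }
            return sum;
            }
        }
    }
```

   For a threshold of three, the coefficient array holds three values
   a_0, a_1, a_2 and any three shares passed to Recover return a_0.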

2029	12.  References

2031	12.1.  Normative References

2033	   [draft-hallambaker-dare-message]
2034	              Hallam-Baker, P., "Data At Rest Encryption Part 1: DARE
2035	              Message Syntax", draft-hallambaker-dare-message-02 (work
2036	              in progress), August 2018.

2038	   [draft-hallambaker-web-service-discovery]
2039	              Hallam-Baker, P., "DNS Web Service Discovery", draft-
2040	              hallambaker-web-service-discovery-00 (work in progress),
2041	              June 2018.

2043	   [RFC2014]  Weinrib, A. and J. Postel, "IRTF Research Group Guidelines
2044	              and Procedures", BCP 8, RFC 2014, DOI 10.17487/RFC2014,
2045	              October 1996.

2047	   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
2048	              Requirement Levels", BCP 14, RFC 2119,
2049	              DOI 10.17487/RFC2119, March 1997.

2051	   [RFC3986]  Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
2052	              Resource Identifier (URI): Generic Syntax", STD 66,
2053	              RFC 3986, DOI 10.17487/RFC3986, January 2005.

2055	   [RFC4648]  Josefsson, S., "The Base16, Base32, and Base64 Data
2056	              Encodings", RFC 4648, DOI 10.17487/RFC4648, October 2006.

2058	   [RFC5869]  Krawczyk, H. and P. Eronen, "HMAC-based Extract-and-Expand
2059	              Key Derivation Function (HKDF)", RFC 5869,
2060	              DOI 10.17487/RFC5869, May 2010.

2062	   [RFC6031]  Turner, S. and R. Housley, "Cryptographic Message Syntax
2063	              (CMS) Symmetric Key Package Content Type", RFC 6031,
2064	              DOI 10.17487/RFC6031, December 2010.

2066	   [SHA-2]    NIST, "Secure Hash Standard", August 2015.

2068	   [SHA-3]    Dworkin, M., "SHA-3 Standard: Permutation-Based Hash and
2069	              Extendable-Output Functions", August 2015.

2071	12.2.  Informative References

2073	   [draft-hallambaker-mesh-developer]
2074	              Hallam-Baker, P., "Mathematical Mesh: Reference
2075	              Implementation", draft-hallambaker-mesh-developer-07 (work
2076	              in progress), April 2018.

2078	   [draft-hallambaker-mesh-trust]
2079	              Hallam-Baker, P., "Mathematical Mesh Part IV: The Trust
2080	              Mesh", draft-hallambaker-mesh-trust-00 (work in progress),
2081	              January 2019.

2083	   [RFC4086]  Eastlake 3rd, D., Schiller, J., and S. Crocker,
2084	              "Randomness Requirements for Security", BCP 106, RFC 4086,
2085	              DOI 10.17487/RFC4086, June 2005.

2087	   [RFC4880]  Callas, J., Donnerhacke, L., Finney, H., Shaw, D., and R.
2088	              Thayer, "OpenPGP Message Format", RFC 4880,
2089	              DOI 10.17487/RFC4880, November 2007.

2091	   [RFC5785]  Nottingham, M. and E. Hammer-Lahav, "Defining Well-Known
2092	              Uniform Resource Identifiers (URIs)", RFC 5785,
2093	              DOI 10.17487/RFC5785, April 2010.

2095	   [RFC5890]  Klensin, J., "Internationalized Domain Names for
2096	              Applications (IDNA): Definitions and Document Framework",
2097	              RFC 5890, DOI 10.17487/RFC5890, August 2010.

2099	   [RFC6355]  Narten, T. and J. Johnson, "Definition of the UUID-Based
2100	              DHCPv6 Unique Identifier (DUID-UUID)", RFC 6355,
2101	              DOI 10.17487/RFC6355, August 2011.

2103	   [RFC6763]  Cheshire, S. and M. Krochmal, "DNS-Based Service
2104	              Discovery", RFC 6763, DOI 10.17487/RFC6763, February 2013.

2106	   [RFC7595]  Thaler, D., Hansen, T., and T. Hardie, "Guidelines and
2107	              Registration Procedures for URI Schemes", BCP 35,
2108	              RFC 7595, DOI 10.17487/RFC7595, June 2015.

2110	   [Shamir79] Shamir, A., "How to share a secret", Communications of
2111	              the ACM, Volume 22, Issue 11, pp. 612-613, November 1979.

2113	   [XMLSchema]
2114	              Gao, S., Sperberg-McQueen, C., Thompson, H., Mendelsohn,
2115	              N., Beech, D., and M. Maloney, "W3C XML Schema Definition
2116	              Language (XSD) 1.1 Part 1: Structures", April 2012.

2118	12.3.  URIs

2120	   [1] http://mathmesh.com/Documents/draft-hallambaker-mesh-udf.html

2122	   [2] http://mathmesh.com/Documents/draft-hallambaker-mesh-udf.html

2124	Author's Address

2126	   Phillip Hallam-Baker

2128	   Email: phill@hallambaker.com