idnits 2.17.1 

draft-thomson-http-mice-03.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The abstract seems to contain references ([2], [1]), which it shouldn't.
      Please replace those with straight textual mentions of the documents in
     question.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (August 14, 2018) is 2075 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  -- Looks like a reference, but probably isn't: '1' on line 461

  -- Looks like a reference, but probably isn't: '2' on line 463

  -- Looks like a reference, but probably isn't: '3' on line 465

  -- Possible downref: Non-RFC (?) normative reference: ref. 'FIPS180-4'

  == Outdated reference: A later version (-19) exists of
     draft-ietf-httpbis-header-structure-07

  -- Possible downref: Non-RFC (?) normative reference: ref. 'MERKLE'

  ** Obsolete normative reference: RFC 3230 (Obsoleted by RFC 9530)

  ** Obsolete normative reference: RFC 7231 (Obsoleted by RFC 9110)

  -- Obsolete informational reference (is this intentional?): RFC 2818
     (Obsoleted by RFC 9110)

  -- Obsolete informational reference (is this intentional?): RFC 6962
     (Obsoleted by RFC 9162)

  -- Obsolete informational reference (is this intentional?): RFC 7233
     (Obsoleted by RFC 9110)


     Summary: 3 errors (**), 0 flaws (~~), 2 warnings (==), 9 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Network Working Group                                         M. Thomson
3	Internet-Draft                                                   Mozilla
4	Intended status: Standards Track                              J. Yasskin
5	Expires: February 15, 2019                                        Google
6	                                                         August 14, 2018

8	                   Merkle Integrity Content Encoding
9	                       draft-thomson-http-mice-03

11	Abstract

13	   This memo introduces a content-coding for HTTP that provides
14	   progressive integrity for message contents.  This integrity
15	   protection can be evaluated on a partial representation, allowing a
16	   recipient to process a message as it is delivered while retaining
17	   strong integrity protection.

19	Note to Readers

21	   _RFC EDITOR: please remove this section before publication_

23	   Discussion of this draft takes place on the HTTP working group
24	   mailing list (ietf-http-wg@w3.org), which is archived at
25	   https://lists.w3.org/Archives/Public/ietf-http-wg/ [1].

27	   The source code and issues list for this draft can be found at
28	   https://github.com/martinthomson/http-mice [2].

30	Status of This Memo

32	   This Internet-Draft is submitted in full conformance with the
33	   provisions of BCP 78 and BCP 79.

35	   Internet-Drafts are working documents of the Internet Engineering
36	   Task Force (IETF).  Note that other groups may also distribute
37	   working documents as Internet-Drafts.  The list of current Internet-
38	   Drafts is at https://datatracker.ietf.org/drafts/current/.

40	   Internet-Drafts are draft documents valid for a maximum of six months
41	   and may be updated, replaced, or obsoleted by other documents at any
42	   time.  It is inappropriate to use Internet-Drafts as reference
43	   material or to cite them other than as "work in progress."

45	   This Internet-Draft will expire on February 15, 2019.

47	Copyright Notice

49	   Copyright (c) 2018 IETF Trust and the persons identified as the
50	   document authors.  All rights reserved.

52	   This document is subject to BCP 78 and the IETF Trust's Legal
53	   Provisions Relating to IETF Documents
54	   (https://trustee.ietf.org/license-info) in effect on the date of
55	   publication of this document.  Please review these documents
56	   carefully, as they describe your rights and restrictions with respect
57	   to this document.  Code Components extracted from this document must
58	   include Simplified BSD License text as described in Section 4.e of
59	   the Trust Legal Provisions and are provided without warranty as
60	   described in the Simplified BSD License.

62	Table of Contents

64	   1.  Introduction
65	     1.1.  Notational Conventions
66	   2.  The "mi-sha256" HTTP Content Encoding
67	     2.1.  Content Encoding Structure
68	     2.2.  Validating Integrity Proofs
69	   3.  The "mi-sha256" Digest Algorithm
70	   4.  Examples
71	     4.1.  Simple Example
72	     4.2.  Example with Multiple Records
73	   5.  Security Considerations
74	     5.1.  Message Truncation
75	     5.2.  Algorithm Agility
76	   6.  IANA Considerations
77	     6.1.  The "mi-sha256" HTTP Content Encoding
78	     6.2.  The "mi-sha256" Digest Algorithm
79	   7.  References
80	     7.1.  Normative References
81	     7.2.  Informative References
82	     7.3.  URIs
83	   Appendix A.  Acknowledgements
84	   Appendix B.  FAQ
85	   Authors' Addresses

87	1.  Introduction

89	   Integrity protection for HTTP content is highly valuable.  HTTPS
90	   [RFC2818] is the most common form of integrity protection deployed,
91	   but that requires a direct TLS [RFC8446] connection to a host.
92	   However, additional integrity protection might be desirable for some
93	   use cases.  This might be for additional protection against failures
94	   or attack (see [SRI]) or because content needs to remain unmodified
95	   throughout multiple HTTPS-protected exchanges.

97	   This document describes a "mi-sha256" content-encoding (see
98	   Section 2) that is a progressive, hash-based integrity check based on
99	   Merkle Hash Trees [MERKLE].

101	   The means of conveying the root integrity proof used by this content
102	   encoding will depend on deployment requirements.  This document
103	   defines a digest algorithm (see Section 3) that can carry an
104	   integrity proof.

106	1.1.  Notational Conventions

108	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
109	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
110	   document are to be interpreted as described in [RFC2119].

112	2.  The "mi-sha256" HTTP Content Encoding

114	   A Merkle Hash Tree [MERKLE] is a structured integrity mechanism that
115	   collates multiple integrity checks into a tree.  The leaf nodes of
116	   the tree contain data (or hashes of data) and non-leaf nodes contain
117	   hashes of the nodes below them.

119	   A balanced Merkle Hash Tree is used to efficiently prove membership
120	   in large sets (such as in [RFC6962]).  However, in this case, a
121	   right-skewed tree is used to provide a progressive integrity proof.
122	   This integrity proof is used to establish that a given record is part
123	   of a message.

125	   The hash function used for "mi-sha256" content encoding is SHA-256
126	   [FIPS180-4].  The integrity proof for all records other than the last
127	   is the hash of the concatenation of the record, the integrity proof
128	   of all subsequent records, and a single octet with a value of 0x1:

130	      proof(r[i]) = SHA-256(r[i] || proof(r[i+1]) || 0x1)

132	   The integrity proof for the final record is the hash of the record
133	   with a single octet with a value 0x0 appended:

135	      proof(r[last]) = SHA-256(r[last] || 0x0)
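
   As a non-normative illustration, the two definitions above can be
   transcribed almost literally into Python.  The function name "proof"
   and the list-of-records argument are assumptions made for this
   sketch.  The recursion mirrors the definitions and is not suited to
   messages with a very large number of records.

```python
import hashlib


def proof(records, i=0):
    """Integrity proof for records[i], following the two definitions above."""
    if i == len(records) - 1:
        # Final record: hash of the record with a 0x0 octet appended.
        return hashlib.sha256(records[i] + b"\x00").digest()
    # Any other record: record || proof of the next record || 0x1.
    next_proof = proof(records, i + 1)
    return hashlib.sha256(records[i] + next_proof + b"\x01").digest()
```

   Calling "proof(records)" with the default index yields the proof for
   the first record, which is the proof for the entire message.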

137	   Figure 1 shows the structure of the integrity proofs for a message
138	   that is split into 4 blocks (A, B, C, D).  As shown, the integrity
139	   proof for the entire message (that is, "proof(A)") is derived from
140	   the content of the first block (A), plus the value of the proof for
141	   the second and subsequent blocks.

143	       proof(A)
144	         /\
145	        /  \
146	       /    \
147	      A    proof(B)
148	            /\
149	           /  \
150	          /    \
151	         B    proof(C)
152	                /\
153	               /  \
154	              /    \
155	             C    proof(D)
156	                    |
157	                    |
158	                    D

160	           Figure 1: Proof structure for a message with 4 blocks

162	   The final encoded message is formed from the record size and first
163	   record, followed by an arbitrary number of tuples of the integrity
164	   proof of the next record and then the record itself.  Thus, in
165	   Figure 1, the body is:

167	      rs || A || proof(B) || B || proof(C) || C || proof(D) || D

169	   Note:  The "||" operator is used to represent concatenation.

171	   A message that has a content length less than or equal to the record
172	   size does not include any inline proofs.  The proof for a message
173	   with a single record is simply the hash of the body plus a trailing
174	   zero octet.

176	   As a special case, the encoding of an empty payload is itself an
177	   empty message (i.e. it omits the initial record size), and its
178	   integrity proof is SHA-256("\0").
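
   The proof for the empty payload can be checked with a few lines of
   Python.  This is only a sketch; the constant names are assumptions
   chosen for illustration.

```python
import base64
import hashlib

# Proof for the empty payload: SHA-256 over a single zero octet.
EMPTY_PROOF = hashlib.sha256(b"\x00").digest()

# Base64 form, as it would be carried in a Digest header field.
EMPTY_PROOF_B64 = base64.b64encode(EMPTY_PROOF).decode("ascii")

print(EMPTY_PROOF_B64)  # bjQLnP+zepicpUTmu3gKLHiQHT+zNzh2hRGjBhevoB0=
```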

180	   _RFC EDITOR: Please remove the next paragraph before publication._

182	   Implementations of drafts of this specification MUST implement a
183	   content encoding named "mi-sha256-##" instead of the "mi-sha256"
184	   content encoding specified by the final RFC, with "##" replaced by
185	   the draft number being implemented.  For example, implementations of
186	   draft-thomson-http-mice-03 would implement "mi-sha256-03".

188	2.1.  Content Encoding Structure

190	   To produce the final content encoding, the content of the message is
191	   split into equal-sized records.  The final record can contain fewer
192	   octets than the defined record size.

194	   For non-empty payloads, the record size is included in the first 8
195	   octets of the message as an unsigned 64-bit integer.  This refers to
196	   the length of each data block.

198	   The final encoded stream consists of the record size ("rs"), plus a
199	   sequence of records, each "rs" octets in length.  Each record, other
200	   than the last, is followed by a 32 octet proof for the record that
201	   follows.  This allows a receiver to validate and act upon each record
202	   after receiving the proof that precedes it.  The final record is not
203	   followed by a proof.

205	   Note:  This content encoding increases the size of a message by 8
206	      plus 32 octets times the length of the message divided by the
207	      record size, rounded up, less one.  That is, 8 + 32 * (ceil(length
208	      / rs) - 1).
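
   The arithmetic in the note above can be written as a small helper.
   This is a sketch only: the function name is an assumption, as is the
   choice to reject a record size of zero, which this document does not
   address.

```python
def encoded_length(length, rs):
    """Size in octets of the "mi-sha256" encoding of a `length`-octet payload."""
    if rs <= 0:
        raise ValueError("record size must be positive")  # assumption
    if length == 0:
        return 0  # the empty payload encodes to an empty message
    records = -(-length // rs)  # ceil(length / rs) using integer arithmetic
    return length + 8 + 32 * (records - 1)
```

   For the 41-octet message used in Section 4, this gives 49 octets
   with a record size of 41 and 113 octets with a record size of 16,
   which agrees with the Content-Length values shown there.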

210	   Constructing a message with the "mi-sha256" content encoding requires
211	   processing of the records in reverse order, inserting the proof
212	   derived from each record before that record.
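
   A minimal encoder sketch in Python illustrates the reverse-order
   construction.  The function name and return shape are assumptions
   made for this example, and the record size is written in network
   byte order (big-endian), which is an assumption consistent with the
   notation used in Section 4.

```python
import hashlib
from typing import Tuple


def encode(payload: bytes, rs: int) -> Tuple[bytes, bytes]:
    """Return (encoded body, proof for the first record)."""
    if rs <= 0:
        raise ValueError("record size must be positive")  # assumption
    if not payload:
        # Special case: an empty payload encodes to an empty message.
        return b"", hashlib.sha256(b"\x00").digest()
    records = [payload[i:i + rs] for i in range(0, len(payload), rs)]
    # Start from the last record and work backwards.
    proof = hashlib.sha256(records[-1] + b"\x00").digest()
    parts = [records[-1]]
    for record in reversed(records[:-1]):
        parts.append(proof)   # proof of the record that follows
        parts.append(record)
        proof = hashlib.sha256(record + proof + b"\x01").digest()
    parts.append(rs.to_bytes(8, "big"))  # 64-bit record size comes first
    return b"".join(reversed(parts)), proof
```

   The second value returned is the proof for the first record, which
   is conveyed separately from the body, for example using the digest
   algorithm defined in Section 3.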

214	   This structure permits the use of range requests [RFC7233].  However,
215	   to validate a given record, a contiguous sequence of records back to
216	   the start of the message is needed.

218	2.2.  Validating Integrity Proofs

220	   A receiver of a message with the "mi-sha256" content-encoding applied
221	   first attempts to acquire the integrity proof for the first record,
222	   "top-proof".  If the Digest header field is present with the mi-
223	   sha256 parameter, "top-proof" can be taken from that value.

225	   The receiver attempts to read the first 8 octets as an unsigned
226	   64-bit integer, "rs".  If 8 octets aren't available then:

228	   o  If 0 octets are available, and "top-proof" is SHA-256("\0") (whose
229	      base64 encoding is
230	      "bjQLnP+zepicpUTmu3gKLHiQHT+zNzh2hRGjBhevoB0="), then return a
231	      0-length decoded payload.

233	   o  Otherwise, validation fails.

235	   The remainder of the message is read into records of size "rs" plus
236	   32 octets.  The last record is between 1 and "rs" octets in length;
237	   otherwise, validation fails.  For each record:

239	   1.  Hash the record using SHA-256 with a single octet appended:

241	       a.  All records other than the last have an octet with a value of
242	       0x1 appended.

244	       b.  The last record has an octet with a value of 0x0 appended.

246	   2.  Compare the hash with the expected value:

248	       a.  For the first record, the expected value is "top-proof".

250	       b.  For records after the first, the expected value is the last
251	       32 octets of the previous record.

253	   3.  If the hash is different, then this record and all subsequent
254	       records do not have integrity protection and this process ends.

256	   4.  If a record is valid, up to "rs" octets are passed on for
257	       processing.  In other words, the trailing 32 octets are removed
258	       from every record other than the last before being used.
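
   The validation steps above can be sketched as a Python generator
   that releases each record only after its proof has been checked.
   The function name is an assumption, the record size is read as
   big-endian (also an assumption), and a record size of zero is
   treated as invalid, which this document does not state.  For
   simplicity the sketch takes a fully buffered body; a streaming
   receiver would apply the same checks as octets arrive.

```python
import hashlib


def validated_records(body, top_proof):
    """Yield each record of an encoded body once its proof has been checked."""
    if len(body) < 8:
        if len(body) == 0 and top_proof == hashlib.sha256(b"\x00").digest():
            return  # empty payload: nothing to yield
        raise ValueError("record size is missing")
    rs = int.from_bytes(body[:8], "big")  # byte order is an assumption
    if rs == 0:
        raise ValueError("record size of zero")  # assumption: invalid
    rest, offset, expected = body[8:], 0, top_proof
    # Every chunk except the last is a record followed by a 32-octet proof.
    while len(rest) - offset > rs:
        chunk = rest[offset:offset + rs + 32]
        if len(chunk) < rs + 32:
            raise ValueError("last record is longer than the record size")
        if hashlib.sha256(chunk + b"\x01").digest() != expected:
            raise ValueError("integrity check failed")
        yield chunk[:rs]      # strip the trailing proof before use
        expected = chunk[rs:]  # proof for the record that follows
        offset += rs + 32
    last = rest[offset:]
    if not last:
        raise ValueError("last record is empty")
    if hashlib.sha256(last + b"\x00").digest() != expected:
        raise ValueError("integrity check failed")
    yield last
```

   Joining the yielded records reproduces the decoded payload.  A
   ValueError raised part-way through means that the current record and
   all subsequent records lack integrity protection.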

260	   If an integrity check fails, the message SHOULD be discarded and the
261	   exchange treated as an error unless explicitly configured otherwise.
262	   For clients, treat this as equivalent to a server error; servers
263	   SHOULD generate a 400 or other 4xx status code.  However, if the
264	   integrity proof for the first record is not known, this check SHOULD
265	   NOT fail unless explicitly configured to do so.

267	3.  The "mi-sha256" Digest Algorithm

269	   [RFC3230] describes digests applying to "the entire instance
270	   associated with the message".  The instance corresponds to the
271	   "representation" in Section 3 of [RFC7231], but unlike the existing
272	   digest algorithms, the "mi-sha256" digest algorithm specifies the
273	   top-level digest at the point when the "mi-sha256" content coding
274	   (Section 2) is applied to or removed from the representation.

276	   When the "mi-sha256" digest algorithm is specified for a
277	   representation, the recipient MUST use the base64-decoding (Section 4
278	   of [RFC4648]) of the "mi-sha256" digest as the "top-proof" for the
279	   "mi-sha256" content encoding (Section 2.2).
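
   As a sketch of how a recipient might obtain "top-proof" from a
   Digest field value such as the ones shown in Section 4, the Python
   below splits the value into "algorithm=value" items and decodes the
   "mi-sha256" item strictly.  The function name, the case-insensitive
   match on the algorithm name, and the 32-octet length check are
   assumptions of this example rather than requirements stated here.

```python
import base64


def top_proof_from_digest(field_value):
    """Return the 32-octet "top-proof" from a Digest field value, or None."""
    for item in field_value.split(","):
        name, sep, encoded = item.strip().partition("=")
        if not sep or name.strip().lower() != "mi-sha256":
            continue
        # validate=True rejects characters outside the base64 alphabet;
        # the error raised is a subclass of ValueError.
        decoded = base64.b64decode(encoded, validate=True)
        # Re-encoding catches improper padding and non-zero padding bits.
        if base64.b64encode(decoded).decode("ascii") != encoded:
            raise ValueError("non-canonical base64 in mi-sha256 digest")
        if len(decoded) != 32:
            raise ValueError("mi-sha256 digest is not 32 octets")
        return decoded
    return None
```

   A ValueError from this function corresponds to rejecting the
   representation; a return value of None means that no "mi-sha256"
   digest was present.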

281	   The recipient MUST behave as described by Section 4.2.9 of
282	   [I-D.ietf-httpbis-header-structure] if it encounters improper
283	   padding, non-zero padding bits, or non-alphabet characters, where
284	   rejecting the data means to reject the representation.

286	   If different mechanisms specify different "top-proof" values for the
287	   "mi-sha256" content encoding, the recipient MUST reject the
288	   representation.

290	   If "mi-sha256" content coding has not been applied to the
291	   representation exactly once (Section 3.1.2.2 of [RFC7231]), the
292	   recipient MUST reject the representation.

294	   When rejecting the representation, clients SHOULD treat this as
295	   equivalent to a server error, and servers SHOULD generate a 400 or
296	   other 4xx status code.

298	   _RFC EDITOR: Please remove the next paragraph before publication._

300	   Implementations of drafts of this specification MUST use a digest
301	   algorithm named the same as the "mi-sha256-##" content encoding they
302	   implement, with the meaning described for "mi-sha256" above.

304	4.  Examples

306	4.1.  Simple Example

308	   The following example contains a short message.  This contains just a
309	   single record, so there are no inline integrity proofs, just a single
310	   value in the mi-sha256 parameter of a Digest header field.  The
311	   record size is prepended to the message body (shown here in angle
312	   brackets).

314	   HTTP/1.1 200 OK
315	   Digest: mi-sha256=dcRDgR2GM35DluAV13PzgnG6+pvQwPywfFvAu1UeFrs=
316	   Content-Encoding: mi-sha256
317	   Content-Length: 49

319	   <0x0000000000000029>When I grow up, I want to be a watermelon

321	4.2.  Example with Multiple Records

323	   This example shows the same message as above, but with a smaller
324	   record size (16 octets).  This results in two integrity proofs being
325	   included in the representation.

327	   PUT /test HTTP/1.1
328	   Host: example.com
329	   Digest: mi-sha256=IVa9shfs0nyKEhHqtB3WVNANJ2Njm5KjQLjRtnbkYJ4=
330	   Content-Encoding: mi-sha256
331	   Content-Length: 113

333	   <0x0000000000000010>When I grow up,
334	   OElbplJlPK+Rv6JNK6p5/515IaoPoZo+2elWL7OQ60A=
335	   I want to be a w
336	   iPMpmgExHPrbEX3/RvwP4d16fWlK4l++p75PUu_KyN0=
337	   atermelon

339	   Since the inline integrity proofs contain non-printing characters,
340	   these are shown here using the base64 encoding [RFC4648] with new
341	   lines between the original text and integrity proofs.  Note that
342	   there is a single trailing space (0x20) on the first line.

344	5.  Security Considerations

346	   The integrity of an entire message body depends on the means by which
347	   the integrity proof for the first record is protected.  If this value
348	   comes from the same place as the message, then this provides only
349	   limited protection against transport-level errors (something that TLS
350	   provides adequate protection against).

352	   Separate protection for header fields might be provided by other
353	   means if the first record retrieved is the first record in the
354	   message, but range requests do not allow for this option.

356	5.1.  Message Truncation

358	   This integrity scheme permits the detection of truncated messages.
359	   However, it enables and even encourages processing of messages prior
360	   to receiving a complete message.  Actions taken on a partial message
361	   can produce incorrect results.  For example, a message could say "I
362	   need some 2mm copper cable, please send 100mm for evaluation
363	   purposes" then be truncated to "I need some 2mm copper cable, please
364	   send 100m".  A network-based attacker might be able to force this
365	   sort of truncation by delaying packets that contain the remainder of
366	   the message.

368	   Whether it is safe to act on partial messages will depend on the
369	   nature of the message and the processing that is performed.

371	5.2.  Algorithm Agility

373	   A new content encoding type is needed in order to define the use of a
374	   hash function other than SHA-256.

376	6.  IANA Considerations

378	6.1.  The "mi-sha256" HTTP Content Encoding

380	   This memo registers the "mi-sha256" HTTP content-coding in the HTTP
381	   Content Codings Registry, as detailed in Section 2.

383	   o  Name: mi-sha256

385	   o  Description: A Merkle Hash Tree based content encoding that
386	      provides progressive integrity.

388	   o  Reference: this specification

390	6.2.  The "mi-sha256" Digest Algorithm

392	   This memo registers the "mi-sha256" digest algorithm in the HTTP
393	   Digest Algorithm Values [3] registry:

395	   o  Digest Algorithm: mi-sha256

397	   o  Description: As specified in Section 3.

399	7.  References

401	7.1.  Normative References

403	   [FIPS180-4]
404	              Department of Commerce, National., "NIST FIPS 180-4,
405	              Secure Hash Standard", March 2012,
406	              <http://csrc.nist.gov/publications/fips/fips180-4/
407	              fips-180-4.pdf>.

409	   [I-D.ietf-httpbis-header-structure]
410	              Nottingham, M. and P. Kamp, "Structured Headers for HTTP",
411	              draft-ietf-httpbis-header-structure-07 (work in progress),
412	              July 2018.

414	   [MERKLE]   Merkle, R., "A Digital Signature Based on a Conventional
415	              Encryption Function", International Cryptology Conference -
416	              CRYPTO, 1987.

418	   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
419	              Requirement Levels", BCP 14, RFC 2119,
420	              DOI 10.17487/RFC2119, March 1997,
421	              <https://www.rfc-editor.org/info/rfc2119>.

423	   [RFC3230]  Mogul, J. and A. Van Hoff, "Instance Digests in HTTP",
424	              RFC 3230, DOI 10.17487/RFC3230, January 2002,
425	              <https://www.rfc-editor.org/info/rfc3230>.

427	   [RFC4648]  Josefsson, S., "The Base16, Base32, and Base64 Data
428	              Encodings", RFC 4648, DOI 10.17487/RFC4648, October 2006,
429	              <https://www.rfc-editor.org/info/rfc4648>.

431	   [RFC7231]  Fielding, R., Ed. and J. Reschke, Ed., "Hypertext Transfer
432	              Protocol (HTTP/1.1): Semantics and Content", RFC 7231,
433	              DOI 10.17487/RFC7231, June 2014,
434	              <https://www.rfc-editor.org/info/rfc7231>.

436	7.2.  Informative References

438	   [RFC2818]  Rescorla, E., "HTTP Over TLS", RFC 2818,
439	              DOI 10.17487/RFC2818, May 2000,
440	              <https://www.rfc-editor.org/info/rfc2818>.

442	   [RFC6962]  Laurie, B., Langley, A., and E. Kasper, "Certificate
443	              Transparency", RFC 6962, DOI 10.17487/RFC6962, June 2013,
444	              <https://www.rfc-editor.org/info/rfc6962>.

446	   [RFC7233]  Fielding, R., Ed., Lafon, Y., Ed., and J. Reschke, Ed.,
447	              "Hypertext Transfer Protocol (HTTP/1.1): Range Requests",
448	              RFC 7233, DOI 10.17487/RFC7233, June 2014,
449	              <https://www.rfc-editor.org/info/rfc7233>.

451	   [RFC8446]  Rescorla, E., "The Transport Layer Security (TLS) Protocol
452	              Version 1.3", RFC 8446, DOI 10.17487/RFC8446, August 2018,
453	              <https://www.rfc-editor.org/info/rfc8446>.

455	   [SRI]      Akhawe, D., Braun, F., Marier, F., and J. Weinberger,
456	              "Subresource Integrity", W3C CR, November 2015,
457	              <https://w3c.github.io/webappsec-subresource-integrity/>.

459	7.3.  URIs

461	   [1] https://lists.w3.org/Archives/Public/ietf-http-wg/

463	   [2] https://github.com/martinthomson/http-mice

465	   [3] https://www.iana.org/assignments/http-dig-alg/http-dig-alg.xhtml

467	Appendix A.  Acknowledgements

469	   David Benjamin and Erik Nygren both separately suggested that
470	   something like this might be valuable.  James Manger and Eric
471	   Rescorla provided useful feedback.

473	Appendix B.  FAQ

475	   1.  Why not include the first proof in the encoding?

477	       The requirements for the integrity proof for the first record
478	       require a great deal more flexibility than this allows for.
479	       Transferring the proof separately is sometimes necessary.
480	       Separating the value out allows for that to happen more easily.

482	   2.  Why do messages have to be processed in reverse to construct
483	       them?

485	       The final integrity value, no matter how it is derived, has to
486	       depend on every bit of the message.  That means that there are
487	       three choices: both sender and receiver have to process the whole
488	       message, the sender has to work backwards, or the receiver has to
489	       work backwards.  The current form is the best option of the
490	       three.  The expectation is that this will be useful for content
491	       that is generated once and sent multiple times, since the onerous
492	       backwards processing requirement can be amortized.

494	   3.  Why not just generate a table of hashes?

496	       An alternative design includes a header that comprises hashes of
497	       every block of the message.  The final proof is a hash of that
498	       table.  This has the advantage that the table can be built in any
499	       order.  The disadvantage is that a receiver needs to store the
500	       table while processing content, whereas a chained hash can be
501	       processed with a single stored hash worth of state no matter how
502	       many blocks are present.  The chained hash is also smaller by 32
503	       octets.

505	Authors' Addresses

507	   Martin Thomson
508	   Mozilla

510	   Email: martin.thomson@gmail.com

511	   Jeffrey Yasskin
512	   Google

514	   Email: jyasskin@chromium.org