Network Working Group                                         M. Thomson
Internet-Draft                                                   Mozilla
Intended status: Standards Track                              J. Yasskin
Expires: February 15, 2019                                        Google
                                                         August 14, 2018


                   Merkle Integrity Content Encoding
                       draft-thomson-http-mice-03

Abstract

   This memo introduces a content-coding for HTTP that provides
   progressive integrity for message contents. This integrity
   protection can be evaluated on a partial representation, allowing a
   recipient to process a message as it is delivered while retaining
   strong integrity protection.

Note to Readers

   _RFC EDITOR: please remove this section before publication_

   Discussion of this draft takes place on the HTTP working group
   mailing list (ietf-http-wg@w3.org), which is archived at
   https://lists.w3.org/Archives/Public/ietf-http-wg/ [1].

   The source code and issues list for this draft can be found at
   https://github.com/martinthomson/http-mice [2].

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on February 15, 2019.

Copyright Notice

   Copyright (c) 2018 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Notational Conventions
   2.  The "mi-sha256" HTTP Content Encoding
     2.1.  Content Encoding Structure
     2.2.  Validating Integrity Proofs
   3.  The "mi-sha256" Digest Algorithm
   4.  Examples
     4.1.  Simple Example
     4.2.  Example with Multiple Records
   5.  Security Considerations
     5.1.  Message Truncation
     5.2.  Algorithm Agility
   6.  IANA Considerations
     6.1.  The "mi-sha256" HTTP Content Encoding
     6.2.  The "mi-sha256" Digest Algorithm
   7.  References
     7.1.  Normative References
     7.2.  Informative References
     7.3.  URIs
   Appendix A.  Acknowledgements
   Appendix B.  FAQ
   Authors' Addresses

1.  Introduction

   Integrity protection for HTTP content is highly valuable. HTTPS
   [RFC2818] is the most common form of integrity protection deployed,
   but that requires a direct TLS [RFC8446] connection to a host.
   However, additional integrity protection might be desirable for some
   use cases. This might be for additional protection against failures
   or attack (see [SRI]) or because content needs to remain unmodified
   throughout multiple HTTPS-protected exchanges.

   This document describes a "mi-sha256" content-encoding (see
   Section 2) that is a progressive, hash-based integrity check based on
   Merkle Hash Trees [MERKLE].

   The means of conveying the root integrity proof used by this content
   encoding will depend on deployment requirements. This document
   defines a digest algorithm (see Section 3) that can carry an
   integrity proof.

1.1.  Notational Conventions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

2.  The "mi-sha256" HTTP Content Encoding

   A Merkle Hash Tree [MERKLE] is a structured integrity mechanism that
   collates multiple integrity checks into a tree. The leaf nodes of
   the tree contain data (or hashes of data) and non-leaf nodes contain
   hashes of the nodes below them.

   A balanced Merkle Hash Tree is used to efficiently prove membership
   in large sets (such as in [RFC6962]). However, in this case, a
   right-skewed tree is used to provide a progressive integrity proof.
   This integrity proof is used to establish that a given record is part
   of a message.

   The hash function used for "mi-sha256" content encoding is SHA-256
   [FIPS180-4]. The integrity proof for all records other than the last
   is the hash of the concatenation of the record, the integrity proof
   of all subsequent records, and a single octet with a value of 0x1:

      proof(r[i]) = SHA-256(r[i] || proof(r[i+1]) || 0x1)

   The integrity proof for the final record is the hash of the record
   with a single octet with a value 0x0 appended:

      proof(r[last]) = SHA-256(r[last] || 0x0)

   Figure 1 shows the structure of the integrity proofs for a message
   that is split into 4 blocks (A, B, C, D). As shown, the integrity
   proof for the entire message (that is, "proof(A)") is derived from
   the content of the first block (A), plus the value of the proof for
   the second and subsequent blocks.

                          proof(A)
                            /\
                           /  \
                          /    \
                         A    proof(B)
                               /\
                              /  \
                             /    \
                            B    proof(C)
                                   /\
                                  /  \
                                 /    \
                                C    proof(D)
                                       |
                                       |
                                       D

          Figure 1: Proof structure for a message with 4 blocks

   The final encoded message is formed from the record size and first
   record, followed by an arbitrary number of tuples of the integrity
   proof of the next record and then the record itself. Thus, in
   Figure 1, the body is:

      rs || A || proof(B) || B || proof(C) || C || proof(D) || D

   Note: The "||" operator is used to represent concatenation.

   A message that has a content length less than or equal to the record
   size does not include any inline proofs. The proof for a message
   with a single record is simply the hash of the body plus a trailing
   zero octet.

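   The proof chain and body layout described above can be sketched in
   Python. This is a minimal, non-normative illustration under stated
   assumptions: the function name "mi_sha256_encode" is hypothetical,
   and the record size is assumed to be serialized in network byte
   order (big-endian), which is what the examples in Section 4 suggest.
   The empty-payload branch follows the special case stated in this
   section.

```python
import base64
import hashlib
import struct
from typing import Tuple


def mi_sha256_encode(payload: bytes, rs: int) -> Tuple[bytes, bytes]:
    """Return (encoded_body, top_proof) for payload split into rs-octet records.

    Hypothetical helper for illustration only; not part of the draft.
    """
    if rs <= 0:
        raise ValueError("record size must be positive")
    if not payload:
        # Special case: an empty payload encodes to an empty body and its
        # proof is SHA-256 over a single zero octet.
        return b"", hashlib.sha256(b"\x00").digest()

    records = [payload[i:i + rs] for i in range(0, len(payload), rs)]

    # Work backwards. The last record is hashed with a trailing 0x0 octet.
    proof = hashlib.sha256(records[-1] + b"\x00").digest()
    chunks = [records[-1]]
    for record in reversed(records[:-1]):
        # The proof of the following record is placed right after this record.
        chunks.append(proof)
        chunks.append(record)
        # Earlier records are hashed with the next proof and a trailing 0x1.
        proof = hashlib.sha256(record + proof + b"\x01").digest()

    # Assumption: "rs" is an unsigned 64-bit big-endian integer.
    chunks.append(struct.pack(">Q", rs))
    return b"".join(reversed(chunks)), proof


message = b"When I grow up, I want to be a watermelon"
body, top_proof = mi_sha256_encode(message, 16)
print(len(body))  # 8 octets of rs + 41 octets of content + 2 inline proofs
print("mi-sha256=" + base64.b64encode(top_proof).decode("ascii"))
```

   The sender has to see the last record before it can emit the first
   proof, which is why the sketch builds the body from the end of the
   message towards the start.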
   As a special case, the encoding of an empty payload is itself an
   empty message (i.e., it omits the initial record size), and its
   integrity proof is SHA-256("\0").

   _RFC EDITOR: Please remove the next paragraph before publication._

   Implementations of drafts of this specification MUST implement a
   content encoding named "mi-sha256-##" instead of the "mi-sha256"
   content encoding specified by the final RFC, with "##" replaced by
   the draft number being implemented. For example, implementations of
   draft-thomson-http-mice-03 would implement "mi-sha256-03".

2.1.  Content Encoding Structure

   In order to produce the final content encoding, the content of the
   message is split into equal-sized records. The final record can
   contain fewer octets than the defined record size.

   For non-empty payloads, the record size is included in the first 8
   octets of the message as an unsigned 64-bit integer. This refers to
   the length of each data block.

   The final encoded stream comprises the record size ("rs"), plus a
   sequence of records, each "rs" octets in length. Each record, other
   than the last, is followed by a 32-octet proof for the record that
   follows. This allows a receiver to validate and act upon each record
   after receiving the proof that precedes it. The final record is not
   followed by a proof.

   Note: This content encoding increases the size of a message by 8
      plus 32 octets times the length of the message divided by the
      record size, rounded up, less one. That is, 8 + 32 * (ceil(length
      / rs) - 1).

   Constructing a message with the "mi-sha256" content encoding requires
   processing of the records in reverse order, inserting the proof
   derived from each record before that record.

   This structure permits the use of range requests [RFC7233]. However,
   to validate a given record, a contiguous sequence of records back to
   the start of the message is needed.

2.2.  Validating Integrity Proofs

   A receiver of a message with the "mi-sha256" content-encoding applied
   first attempts to acquire the integrity proof for the first record,
   "top-proof". If the Digest header field is present with the
   mi-sha256 parameter, a value might be included there.

   The receiver attempts to read the first 8 octets as an unsigned
   64-bit integer, "rs". If 8 octets aren't available then:

   o  If 0 octets are available, and "top-proof" is SHA-256("\0") (whose
      base64 encoding is
      "bjQLnP+zepicpUTmu3gKLHiQHT+zNzh2hRGjBhevoB0="), then return a
      0-length decoded payload.

   o  Otherwise, validation fails.

   The remainder of the message is read into records of size "rs" plus
   32 octets. The last record is between 1 and "rs" octets in length;
   if it is not, validation fails. For each record:

   1.  Hash the record using SHA-256 with a single octet appended:

       a.  All records other than the last have an octet with a value of
           0x1 appended.

       b.  The last record has an octet with a value of 0x0 appended.

   2.  Compare the hash with the expected value:

       a.  For the first record, the expected value is "top-proof".

       b.  For records after the first, the expected value is the last
           32 octets of the previous record.

   3.  If the hash is different, then this record and all subsequent
       records do not have integrity protection and this process ends.

   4.  If a record is valid, up to "rs" octets are passed on for
       processing. In other words, the trailing 32 octets are removed
       from every record other than the last before being used.

   If an integrity check fails, the message SHOULD be discarded and the
   exchange treated as an error unless explicitly configured otherwise.
   For clients, treat this as equivalent to a server error; servers
   SHOULD generate a 400 or other 4xx status code. However, if the
   integrity proof for the first record is not known, this check SHOULD
   NOT fail unless explicitly configured to do so.

3.  The "mi-sha256" Digest Algorithm

   [RFC3230] describes digests applying to "the entire instance
   associated with the message". The instance corresponds to the
   "representation" in Section 3 of [RFC7231], but unlike the existing
   digest algorithms, the "mi-sha256" digest algorithm specifies the
   top-level digest at the point when the "mi-sha256" content coding
   (Section 2) is applied or removed from the representation.

   When the "mi-sha256" digest algorithm is specified for a
   representation, the recipient MUST use the base64-decoding (Section 4
   of [RFC4648]) of the "mi-sha256" digest as the "top-proof" for the
   "mi-sha256" content encoding (Section 2.2).

   The recipient MUST behave as described by Section 4.2.9 of
   [I-D.ietf-httpbis-header-structure] if it encounters improper
   padding, non-zero padding bits, or non-alphabet characters, where
   rejecting the data means to reject the representation.

   If different mechanisms specify different "top-proof" values for the
   "mi-sha256" content encoding, the recipient MUST reject the
   representation.

   If "mi-sha256" content coding has not been applied to the
   representation exactly once (Section 3.1.2.2 of [RFC7231]), the
   recipient MUST reject the representation.

   When rejecting the representation, clients SHOULD treat this as
   equivalent to a server error, and servers SHOULD generate a 400 or
   other 4xx status code.

   _RFC EDITOR: Please remove the next paragraph before publication._

   Implementations of drafts of this specification MUST use a digest
   algorithm named the same as the "mi-sha256-##" content encoding they
   implement, with the meaning described for "mi-sha256" above.

4.  Examples

4.1.  Simple Example

   The following example contains a short message. This contains just a
   single record, so there are no inline integrity proofs, just a single
   value in the mi-sha256 parameter of a Digest header field. The
   record size is prepended to the message body (shown here in angle
   brackets).

   HTTP/1.1 200 OK
   Digest: mi-sha256=dcRDgR2GM35DluAV13PzgnG6+pvQwPywfFvAu1UeFrs=
   Content-Encoding: mi-sha256
   Content-Length: 49

   <0x0000000000000029>When I grow up, I want to be a watermelon

4.2.  Example with Multiple Records

   This example shows the same message as above, but with a smaller
   record size (16 octets). This results in two integrity proofs being
   included in the representation.

   PUT /test HTTP/1.1
   Host: example.com
   Digest: mi-sha256=IVa9shfs0nyKEhHqtB3WVNANJ2Njm5KjQLjRtnbkYJ4=
   Content-Encoding: mi-sha256
   Content-Length: 113

   <0x0000000000000010>When I grow up, 
   OElbplJlPK+Rv6JNK6p5/515IaoPoZo+2elWL7OQ60A=
   I want to be a w
   iPMpmgExHPrbEX3/RvwP4d16fWlK4l++p75PUu_KyN0=
   atermelon

   Since the inline integrity proofs contain non-printing characters,
   these are shown here using the base64 encoding [RFC4648] with new
   lines between the original text and integrity proofs. Note that
   there is a single trailing space (0x20) on the first line.

5.  Security Considerations

   The integrity of an entire message body depends on the means by which
   the integrity proof for the first record is protected. If this value
   comes from the same place as the message, then this provides only
   limited protection against transport-level errors (something that TLS
   provides adequate protection against).

   Separate protection for header fields might be provided by other
   means if the first record retrieved is the first record in the
   message, but range requests do not allow for this option.

5.1.  Message Truncation

   This integrity scheme permits the detection of truncated messages.
   However, it enables and even encourages processing of messages prior
   to receiving a complete message. Actions taken on a partial message
   can produce incorrect results. For example, a message could say "I
   need some 2mm copper cable, please send 100mm for evaluation
   purposes" then be truncated to "I need some 2mm copper cable, please
   send 100m". A network-based attacker might be able to force this
   sort of truncation by delaying packets that contain the remainder of
   the message.

   Whether it is safe to act on partial messages will depend on the
   nature of the message and the processing that is performed.

5.2.  Algorithm Agility

   A new content encoding type is needed in order to define the use of a
   hash function other than SHA-256.

6.  IANA Considerations

6.1.  The "mi-sha256" HTTP Content Encoding

   This memo registers the "mi-sha256" HTTP content-coding in the HTTP
   Content Codings Registry, as detailed in Section 2.

   o  Name: mi-sha256

   o  Description: A Merkle Hash Tree based content encoding that
      provides progressive integrity.

   o  Reference: this specification

6.2.  The "mi-sha256" Digest Algorithm

   This memo registers the "mi-sha256" digest algorithm in the HTTP
   Digest Algorithm Values [3] registry:

   o  Digest Algorithm: mi-sha256

   o  Description: As specified in Section 3.

7.  References

7.1.  Normative References

   [FIPS180-4]
              Department of Commerce, National., "NIST FIPS 180-4,
              Secure Hash Standard", March 2012.

   [I-D.ietf-httpbis-header-structure]
              Nottingham, M. and P. Kamp, "Structured Headers for HTTP",
              draft-ietf-httpbis-header-structure-07 (work in progress),
              July 2018.

   [MERKLE]   Merkle, R., "A Digital Signature Based on a Conventional
              Encryption Function", International Cryptology Conference
              - CRYPTO, 1987.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC3230]  Mogul, J. and A. Van Hoff, "Instance Digests in HTTP",
              RFC 3230, DOI 10.17487/RFC3230, January 2002.

   [RFC4648]  Josefsson, S., "The Base16, Base32, and Base64 Data
              Encodings", RFC 4648, DOI 10.17487/RFC4648, October 2006.

   [RFC7231]  Fielding, R., Ed. and J. Reschke, Ed., "Hypertext Transfer
              Protocol (HTTP/1.1): Semantics and Content", RFC 7231,
              DOI 10.17487/RFC7231, June 2014.

7.2.  Informative References

   [RFC2818]  Rescorla, E., "HTTP Over TLS", RFC 2818,
              DOI 10.17487/RFC2818, May 2000.

   [RFC6962]  Laurie, B., Langley, A., and E. Kasper, "Certificate
              Transparency", RFC 6962, DOI 10.17487/RFC6962, June 2013.

   [RFC7233]  Fielding, R., Ed., Lafon, Y., Ed., and J. Reschke, Ed.,
              "Hypertext Transfer Protocol (HTTP/1.1): Range Requests",
              RFC 7233, DOI 10.17487/RFC7233, June 2014.

   [RFC8446]  Rescorla, E., "The Transport Layer Security (TLS) Protocol
              Version 1.3", RFC 8446, DOI 10.17487/RFC8446, August 2018.

   [SRI]      Akhawe, D., Braun, F., Marier, F., and J. Weinberger,
              "Subresource Integrity", W3C CR, November 2015.

7.3.  URIs

   [1] https://lists.w3.org/Archives/Public/ietf-http-wg/

   [2] https://github.com/martinthomson/http-mice

   [3] https://www.iana.org/assignments/http-dig-alg/http-dig-alg.xhtml

Appendix A.  Acknowledgements

   David Benjamin and Erik Nygren both separately suggested that
   something like this might be valuable. James Manger and Eric
   Rescorla provided useful feedback.

Appendix B.  FAQ

   1.  Why not include the first proof in the encoding?

       The requirements for the integrity proof for the first record
       require a great deal more flexibility than this allows for.
       Transferring the proof separately is sometimes necessary.
       Separating the value out allows for that to happen more easily.

   2.  Why do messages have to be processed in reverse to construct
       them?

       The final integrity value, no matter how it is derived, has to
       depend on every bit of the message. That means that there are
       three choices: both sender and receiver have to process the whole
       message, the sender has to work backwards, or the receiver has to
       work backwards. The current form is the best option of the
       three. The expectation is that this will be useful for content
       that is generated once and sent multiple times, since the onerous
       backwards processing requirement can be amortized.

   3.  Why not just generate a table of hashes?

       An alternative design includes a header that comprises hashes of
       every block of the message. The final proof is a hash of that
       table. This has the advantage that the table can be built in any
       order. The disadvantage is that a receiver needs to store the
       table while processing content, whereas a chained hash can be
       processed with a single stored hash worth of state no matter how
       many blocks are present. The chained hash is also smaller by 32
       octets.

Authors' Addresses

   Martin Thomson
   Mozilla

   Email: martin.thomson@gmail.com


   Jeffrey Yasskin
   Google

   Email: jyasskin@chromium.org