idnits 2.17.1

draft-ietf-httpbis-cache-digest-02.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (May 30, 2017) is 2520 days in the past.  Is this
     intentional?

  Checking references for intended status: Experimental
  ----------------------------------------------------------------------------

  ** Obsolete normative reference: RFC 7230 (Obsoleted by RFC 9110, RFC 9112)

  ** Obsolete normative reference: RFC 7232 (Obsoleted by RFC 9110)

  ** Obsolete normative reference: RFC 7234 (Obsoleted by RFC 9111)

  ** Obsolete normative reference: RFC 7540 (Obsoleted by RFC 9113)

  == Outdated reference: A later version (-28) exists of
     draft-ietf-tls-tls13-20

     Summary: 4 errors (**), 0 flaws (~~), 2 warnings (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

HTTP Working Group                                                K. Oku
Internet-Draft                                             DeNA Co, Ltd.
Intended status: Experimental                              M. Nottingham
Expires: December 1, 2017                                   May 30, 2017


                        Cache Digests for HTTP/2
                   draft-ietf-httpbis-cache-digest-02

Abstract

   This specification defines an HTTP/2 frame type to allow clients to
   inform the server of their cache's contents.
   Servers can then use this to inform their choices of what to push to
   clients.

Note to Readers

   Discussion of this draft takes place on the HTTP working group
   mailing list (ietf-http-wg@w3.org), which is archived at
   https://lists.w3.org/Archives/Public/ietf-http-wg/ .

   Working Group information can be found at http://httpwg.github.io/ ;
   source code and issues list for this draft can be found at
   https://github.com/httpwg/http-extensions/labels/cache-digest .

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on December 1, 2017.

Copyright Notice

   Copyright (c) 2017 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.
Notational Conventions
   2.  The CACHE_DIGEST Frame
     2.1.  Client Behavior
       2.1.1.  Computing the Digest-Value
       2.1.2.  Computing a Hash Value
     2.2.  Server Behavior
       2.2.1.  Querying the Digest for a Value
   3.  The ACCEPT_CACHE_DIGEST SETTINGS Parameter
   4.  IANA Considerations
   5.  Security Considerations
   6.  References
     6.1.  Normative References
     6.2.  Informative References
   Appendix A.  Encoding the CACHE_DIGEST frame as an HTTP Header
   Appendix B.  Acknowledgements
   Appendix C.  Changes
     C.1.  Since draft-ietf-httpbis-cache-digest-01
     C.2.  Since draft-ietf-httpbis-cache-digest-00
   Authors' Addresses

1.  Introduction

   HTTP/2 [RFC7540] allows a server to "push" synthetic request/response
   pairs into a client's cache optimistically.  While there is strong
   interest in using this facility to improve perceived Web browsing
   performance, it is sometimes counterproductive because the client
   might already have cached the "pushed" response.

   When this is the case, the bandwidth used to "push" the response is
   effectively wasted, and represents an opportunity cost, because it
   could be used by other, more relevant responses.
   HTTP/2 allows a stream to be cancelled by a client using a RST_STREAM
   frame in this situation, but there is still at least one round trip
   of potentially wasted capacity even then.

   This specification defines an HTTP/2 frame type to allow clients to
   inform the server of their cache's contents using a Golomb-Rice Coded
   Set [Rice].  Servers can then use this to inform their choices of
   what to push to clients.

1.1.  Notational Conventions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

2.  The CACHE_DIGEST Frame

   The CACHE_DIGEST frame type is 0xd (decimal 13).

   +-------------------------------+-------------------------------+
   |         Origin-Len (16)       |         Origin? (*)         ...
   +-------------------------------+-------------------------------+
   |                   Digest-Value? (*)                         ...
   +---------------------------------------------------------------+

   The CACHE_DIGEST frame payload has the following fields:

   Origin-Len:  An unsigned, 16-bit integer indicating the length, in
      octets, of the Origin field.

   Origin:  A sequence of characters containing the ASCII serialization
      of an origin ([RFC6454], Section 6.2) that the Digest-Value
      applies to.

   Digest-Value:  A sequence of octets containing the digest as computed
      in Section 2.1.1.

   The CACHE_DIGEST frame defines the following flags:

   o  *RESET* (0x1): When set, indicates that any and all cache digests
      for the applicable origin held by the recipient MUST be considered
      invalid.

   o  *COMPLETE* (0x2): When set, indicates that the currently valid set
      of cache digests held by the server constitutes a complete
      representation of the cache's state regarding that origin, for the
      type of cached response indicated by the "STALE" flag.

   o  *VALIDATORS* (0x4): When set, indicates that the "validators"
      boolean in Section 2.1.1 is true.

   o  *STALE* (0x8): When set, indicates that all cached responses
      represented in the digest-value are stale [RFC7234] at the point
      in time that the digest was generated; otherwise, all are fresh.

2.1.  Client Behavior

   A CACHE_DIGEST frame MUST be sent from a client to a server on stream
   0, and conveys a digest of the contents of the client's cache for the
   indicated origin.

   In typical use, a client will send one or more CACHE_DIGESTs
   immediately after the first request on a connection for a given
   origin, on the same stream, because there is usually a short period
   of inactivity then, and servers can benefit most when they understand
   the state of the cache before they begin pushing associated assets
   (e.g., CSS, JavaScript and images).  Clients MAY send CACHE_DIGEST at
   other times.

   If the cache's state is cleared, lost, or the client otherwise wishes
   the server to stop using previously sent CACHE_DIGESTs, it can send a
   CACHE_DIGEST with the RESET flag set.

   When generating CACHE_DIGEST, a client MUST NOT include cached
   responses whose URLs do not share origins [RFC6454] with the
   indicated origin.  Clients MUST NOT send CACHE_DIGEST frames on
   connections that are not authoritative (as defined in [RFC7540],
   10.1) for the indicated origin.

   CACHE_DIGEST allows the client to indicate whether the set of URLs
   used to compute the digest represent fresh or stale stored responses,
   using the STALE flag.  Clients MAY decide whether to only send
   CACHE_DIGEST frames representing their fresh stored responses, their
   stale stored responses, or both.

   Clients can choose to only send a subset of the suitable stored
   responses of each type (fresh or stale).
   However, when the CACHE_DIGEST frames sent represent the complete set
   of stored responses of a given type, the last such frame SHOULD have
   a COMPLETE flag set, to indicate to the server that it has all
   relevant state of that type.  Note that for the purposes of COMPLETE,
   responses cached since the beginning of the connection or the last
   RESET flag on a CACHE_DIGEST frame need not be included.

   CACHE_DIGEST can be computed to include cached responses' ETags, as
   indicated by the VALIDATORS flag.  This information can be used by
   servers to decide what kinds of responses to push to clients; for
   example, a stale response that hasn't changed could be refreshed with
   a 304 (Not Modified) response; one that has changed can be replaced
   with a 200 (OK) response, whether the cached response was fresh or
   stale.

   CACHE_DIGEST has no defined meaning when sent from servers, and
   SHOULD be ignored by clients.

2.1.1.  Computing the Digest-Value

   Given the following inputs:

   o  "validators", a boolean indicating whether validators ([RFC7232])
      are to be included in the digest;

   o  "URLs", an array of (string "URL", string "ETag") tuples, each
      corresponding to the Effective Request URI ([RFC7230],
      Section 5.5) of a cached response [RFC7234] and its entity-tag
      [RFC7232] (if "validators" is true and if the ETag is available;
      otherwise, null);

   o  "P", an integer that MUST be a power of 2 smaller than 2**32, that
      indicates the probability of a false positive that is acceptable,
      expressed as "1/P".

   "digest-value" can be computed using the following algorithm:

   1.   Let "N" be the count of "URLs"' members, rounded to the nearest
        power of 2 smaller than 2**32.

   2.   Let "hash-values" be an empty array of integers.

   3.   For each ("URL", "ETag") in "URLs", compute a hash value
        (Section 2.1.2) and append the result to "hash-values".

   4.   Sort "hash-values" in ascending order.

   5.
        Let "digest-value" be an empty array of bits.

   6.   Write log base 2 of "N" to "digest-value" using 5 bits.

   7.   Write log base 2 of "P" to "digest-value" using 5 bits.

   8.   Let "C" be -1.

   9.   For each "V" in "hash-values":

        1.  If "V" is equal to "C", continue to the next "V".

        2.  Let "D" be the result of "V - C - 1".

        3.  Let "Q" be the integer result of "D / P".

        4.  Let "R" be the result of "D modulo P".

        5.  Write "Q" '0' bits to "digest-value".

        6.  Write 1 '1' bit to "digest-value".

        7.  Write "R" to "digest-value" as binary, using log2("P") bits.

        8.  Let "C" be "V".

   10.  If the length of "digest-value" is not a multiple of 8, pad it
        with 0s until it is.

2.1.2.  Computing a Hash Value

   Given:

   o  "URL", an array of characters

   o  "ETag", an array of characters

   o  "validators", a boolean

   o  "N", an integer

   o  "P", an integer

   "hash-value" can be computed using the following algorithm:

   1.  Let "key" be "URL" converted to an ASCII string by
       percent-encoding as appropriate [RFC3986].

   2.  If "validators" is true and "ETag" is not null:

       1.  Append "ETag" to "key" as an ASCII string, including both the
           "weak" indicator (if present) and double quotes, as per
           [RFC7232] Section 2.3.

   3.  Let "hash-value" be the SHA-256 message digest [RFC6234] of
       "key", expressed as an integer.

   4.  Truncate "hash-value" to log2( "N" * "P" ) bits.

2.2.
Server Behavior

   In typical use, a server will query (as per Section 2.2.1) the
   CACHE_DIGESTs received on a given connection to inform what it pushes
   to that client:

   o  If a given URL has a match in a current CACHE_DIGEST with the
      STALE flag unset, it need not be pushed, because it is fresh in
      cache;

   o  If a given URL and ETag combination has a match in a current
      CACHE_DIGEST with the STALE flag set, the client has a stale copy
      in cache, and a validating response can be pushed;

   o  If a given URL has no match in any current CACHE_DIGEST, the
      client does not have a cached copy, and a complete response can be
      pushed.

   Servers MAY use all CACHE_DIGESTs received for a given origin as
   current, as long as they do not have the RESET flag set; a
   CACHE_DIGEST frame with the RESET flag set MUST clear any previously
   stored CACHE_DIGESTs for its origin.  Servers MUST treat an empty
   Digest-Value with a RESET flag set as effectively clearing all stored
   digests for that origin.

   Clients are not likely to send updates to CACHE_DIGEST over the
   lifetime of a connection; it is expected that servers will separately
   track what cacheable responses have been sent previously on the same
   connection, using that knowledge in conjunction with that provided by
   CACHE_DIGEST.

   Servers MUST ignore CACHE_DIGEST frames sent on a stream other than
   0.

2.2.1.  Querying the Digest for a Value

   Given:

   o  "digest-value", an array of bits

   o  "URL", an array of characters

   o  "ETag", an array of characters

   o  "validators", a boolean

   we can determine whether there is a match in the digest using the
   following algorithm:

   1.   Read the first 5 bits of "digest-value" as an integer; let "N"
        be two raised to the power of that value.

   2.   Read the next 5 bits of "digest-value" as an integer; let "P" be
        two raised to the power of that value.

   3.
        Let "hash-value" be the result of computing a hash value
        (Section 2.1.2).

   4.   Let "C" be -1.

   5.   Read '0' bits from "digest-value" until a '1' bit is found; let
        "Q" be the number of '0' bits.  Discard the '1'.

   6.   Read log2("P") bits from "digest-value" after the '1' as an
        integer; let "R" be its value.

   7.   Let "D" be "Q" * "P" + "R".

   8.   Increment "C" by "D" + 1.

   9.   If "C" is equal to "hash-value", return 'true'.

   10.  Otherwise, return to step 5 and continue processing; if no match
        is found before "digest-value" is exhausted, return 'false'.

3.  The ACCEPT_CACHE_DIGEST SETTINGS Parameter

   A server can signal its support for the CACHE_DIGEST frame by sending
   the ACCEPT_CACHE_DIGEST (0x7) SETTINGS parameter.  If the server
   intends to make optimizations based on CACHE_DIGEST frames, it
   SHOULD send the SETTINGS parameter immediately after the connection
   is established.

   The value of the parameter is a bit-field of which the following bits
   are defined:

   FRESH (0x1):  When set, it indicates that the server is willing to
      make use of a digest of freshly-cached responses.

   STALE (0x2):  When set, it indicates that the server is willing to
      make use of a digest of stale-cached responses.

   The remaining bits MUST be ignored on receipt and MUST be left unset
   when sending.

   The initial value of the parameter is zero (0x0) meaning that the
   server is not interested in seeing a CACHE_DIGEST frame.

   Some underlying transports allow the server's first flight of
   application data to reach the client at around the same time as the
   client sends its first flight of data.  When such a transport (e.g.,
   TLS 1.3 [I-D.ietf-tls-tls13] in full-handshake mode) is used, a
   client can postpone sending the CACHE_DIGEST frame until it receives
   an ACCEPT_CACHE_DIGEST settings value.
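
   As a non-normative illustration, the bit-field described in this
   section could be built and interpreted roughly as follows.  This is
   only a sketch: the function and constant names are hypothetical and
   do not come from this document; only the numeric values (0x7, 0x1,
   0x2, and the initial value of zero) are taken from the text above.

```python
# Non-normative sketch; all names below are illustrative assumptions.
SETTINGS_ACCEPT_CACHE_DIGEST = 0x7  # SETTINGS identifier from this section
FRESH = 0x1  # server will use digests of fresh responses
STALE = 0x2  # server will use digests of stale responses


def encode_accept_cache_digest(fresh: bool, stale: bool) -> int:
    """Build the parameter value a server would advertise."""
    value = 0
    if fresh:
        value |= FRESH
    if stale:
        value |= STALE
    return value  # all undefined bits are left unset


def decode_accept_cache_digest(value: int = 0) -> dict:
    """Interpret a received value; undefined bits are ignored.

    The default of 0 mirrors the parameter's initial value, meaning the
    server has not expressed interest in any CACHE_DIGEST frame.
    """
    return {"fresh": bool(value & FRESH), "stale": bool(value & STALE)}
```

   Under these assumptions, a client that has not yet seen the
   parameter would treat both kinds of digest as unwanted, and a value
   with unknown high bits set is read the same way as one without them.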

   When the underlying transport does not have such a property (e.g.,
   TLS 1.3 in 0-RTT mode), a client can reuse the settings value found
   in previous connections to that origin [RFC6454] to make assumptions.

4.  IANA Considerations

   This document registers the following entry in the Permanent Message
   Headers Registry, as per [RFC3864]:

   o  Header field name: Cache-Digest

   o  Applicable protocol: http

   o  Status: experimental

   o  Author/Change controller: IESG

   o  Specification document(s): [this document]

   This document registers the following entry in the HTTP/2 Frame Type
   Registry, as per [RFC7540]:

   o  Frame Type: CACHE_DIGEST

   o  Code: 0xd

   o  Specification: [this document]

   This document registers the following entry in the HTTP/2 Settings
   Registry, as per [RFC7540]:

   o  Code: 0x7

   o  Name: ACCEPT_CACHE_DIGEST

   o  Initial Value: 0x0

   o  Reference: [this document]

5.  Security Considerations

   The contents of a User Agent's cache can be used to re-identify or
   "fingerprint" the user over time, even when other identifiers (e.g.,
   Cookies [RFC6265]) are cleared.

   CACHE_DIGEST allows such cache-based fingerprinting to become
   passive, since it allows the server to discover the state of the
   client's cache without any visible change in server behaviour.

   As a result, clients MUST mitigate this threat when the user
   attempts to remove identifiers (e.g., "clearing cookies").  This
   could be achieved in a number of ways; for example: by clearing the
   cache, by changing one or both of N and P, or by adding new,
   synthetic entries to the digest to change its contents.

   TODO: discuss how effective the suggested mitigations actually would
   be.

   Additionally, User Agents SHOULD NOT send CACHE_DIGEST when in
   "privacy mode."

6.  References

6.1.
Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC3986]  Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
              Resource Identifier (URI): Generic Syntax", STD 66,
              RFC 3986, DOI 10.17487/RFC3986, January 2005.

   [RFC6234]  Eastlake 3rd, D. and T. Hansen, "US Secure Hash Algorithms
              (SHA and SHA-based HMAC and HKDF)", RFC 6234,
              DOI 10.17487/RFC6234, May 2011.

   [RFC6454]  Barth, A., "The Web Origin Concept", RFC 6454,
              DOI 10.17487/RFC6454, December 2011.

   [RFC7230]  Fielding, R., Ed. and J. Reschke, Ed., "Hypertext Transfer
              Protocol (HTTP/1.1): Message Syntax and Routing",
              RFC 7230, DOI 10.17487/RFC7230, June 2014.

   [RFC7232]  Fielding, R., Ed. and J. Reschke, Ed., "Hypertext Transfer
              Protocol (HTTP/1.1): Conditional Requests", RFC 7232,
              DOI 10.17487/RFC7232, June 2014.

   [RFC7234]  Fielding, R., Ed., Nottingham, M., Ed., and J. Reschke,
              Ed., "Hypertext Transfer Protocol (HTTP/1.1): Caching",
              RFC 7234, DOI 10.17487/RFC7234, June 2014.

   [RFC7540]  Belshe, M., Peon, R., and M. Thomson, Ed., "Hypertext
              Transfer Protocol Version 2 (HTTP/2)", RFC 7540,
              DOI 10.17487/RFC7540, May 2015.

6.2.  Informative References

   [Fetch]    "Fetch Standard", n.d.

   [I-D.ietf-tls-tls13]
              Rescorla, E., "The Transport Layer Security (TLS) Protocol
              Version 1.3", draft-ietf-tls-tls13-20 (work in progress),
              April 2017.

   [RFC3864]  Klyne, G., Nottingham, M., and J. Mogul, "Registration
              Procedures for Message Header Fields", BCP 90, RFC 3864,
              DOI 10.17487/RFC3864, September 2004.

   [RFC4648]  Josefsson, S., "The Base16, Base32, and Base64 Data
              Encodings", RFC 4648, DOI 10.17487/RFC4648, October 2006.

   [RFC5234]  Crocker, D., Ed. and P.
              Overell, "Augmented BNF for Syntax
              Specifications: ABNF", STD 68, RFC 5234,
              DOI 10.17487/RFC5234, January 2008.

   [RFC6265]  Barth, A., "HTTP State Management Mechanism", RFC 6265,
              DOI 10.17487/RFC6265, April 2011.

   [Rice]     Rice, R. and J. Plaunt, "Adaptive variable-length coding
              for efficient compression of spacecraft television data",
              IEEE Transactions on Communication Technology 19.6, 1971.

   [Service-Workers]
              Russell, A., Song, J., Archibald, J., and M.
              Kruisselbrink, "Service Workers 1", October 2016.

Appendix A.  Encoding the CACHE_DIGEST frame as an HTTP Header

   On some web browsers that support Service Workers [Service-Workers]
   but not Cache Digests (yet), it is possible to achieve the benefit of
   using Cache Digests by emulating the frame using HTTP Headers.

   For the sake of interoperability with such clients, this appendix
   defines how a CACHE_DIGEST frame can be encoded as an HTTP header
   named "Cache-Digest".

   The definition uses the Augmented Backus-Naur Form (ABNF) notation of
   [RFC5234] with the list rule extension defined in [RFC7230],
   Appendix B.

     Cache-Digest  = 1#digest-entity
     digest-entity = digest-value *(OWS ";" OWS digest-flag)
     digest-value  = <Digest-Value encoded using base64url>
     digest-flag   = token

   A Cache-Digest request header is defined as a list construct of
   digest-entities.  Each digest-entity corresponds to a CACHE_DIGEST
   frame.

   Digest-Value is encoded using base64url [RFC4648], Section 5.  Flags
   that are set are encoded as digest-flags by their names that are
   compared case-insensitively.

   Origin is omitted in the header form.  The value is implied from the
   value of the ":authority" pseudo header.  Clients MUST only send
   Cache-Digest headers containing digests that belong to the origin
   specified by the HTTP request.

   The example below contains one digest of fresh resources and has only
   the "COMPLETE" flag set.

     Cache-Digest: AfdA; complete

   Clients MUST associate Cache-Digest headers with every HTTP request,
   since Fetch [Fetch] - the HTTP API supported by Service Workers -
   does not define the order in which the issued requests will be sent
   to the server, nor does it guarantee that all the requests will be
   transmitted using a single HTTP/2 connection.

   Also, because any header that is supplied to Fetch is required to be
   end-to-end, there is an ambiguity in what a Cache-Digest header
   represents when a request is transmitted through a proxy.  The header
   may represent the cache state of a client or that of a proxy,
   depending on how the proxy handles the header.

Appendix B.  Acknowledgements

   Thanks to Adam Langley and Giovanni Bajo for their explorations of
   Golomb-coded sets.  In particular, see
   http://giovanni.bajo.it/post/47119962313/golomb-coded-sets-smaller-than-bloom-filters ,
   which refers to sample code.

   Thanks to Stefan Eissing for his suggestions.

Appendix C.  Changes

C.1.  Since draft-ietf-httpbis-cache-digest-01

   o  Added definition of the Cache-Digest header.

   o  Introduce ACCEPT_CACHE_DIGEST SETTINGS parameter.

   o  Change intended status from Standard to Experimental.

C.2.  Since draft-ietf-httpbis-cache-digest-00

   o  Make the scope of a digest frame explicit and shift to stream 0.

Authors' Addresses

   Kazuho Oku
   DeNA Co, Ltd.

   Email: kazuhooku@gmail.com


   Mark Nottingham

   Email: mnot@mnot.net
   URI:   https://www.mnot.net/
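
   As a non-normative illustration of the algorithms in Sections 2.1.1,
   2.1.2 and 2.2.1, the following sketch encodes a set of cached
   responses as a Golomb-Rice coded digest and queries it.  It rests on
   assumptions that this document does not settle: the function names
   are hypothetical, URLs are taken to be percent-encoded ASCII already,
   the entry count is rounded up to a power of 2 (the text says
   "nearest"), and "truncate" is read as keeping the most significant
   bits of the SHA-256 value.

```python
import hashlib


def hash_value(url, etag, validators, n, p):
    """Section 2.1.2: hash a URL (and optionally its ETag)."""
    key = url  # assumed to be percent-encoded ASCII already
    if validators and etag is not None:
        key += etag
    full = int.from_bytes(hashlib.sha256(key.encode("ascii")).digest(), "big")
    bits = (n * p).bit_length() - 1  # log2(N * P); both are powers of two
    # Assumption: "truncate" keeps the most significant bits.
    return full >> (256 - bits)


def compute_digest(urls, validators, p):
    """Section 2.1.1: encode (URL, ETag) tuples as a coded set."""
    n = 1
    while n < len(urls):  # assumption: round the count up to a power of 2
        n *= 2
    log_n, log_p = n.bit_length() - 1, p.bit_length() - 1
    hashes = sorted(hash_value(u, e, validators, n, p) for u, e in urls)
    bits = []

    def write(value, width):
        bits.extend((value >> i) & 1 for i in reversed(range(width)))

    write(log_n, 5)
    write(log_p, 5)
    c = -1
    for v in hashes:
        if v == c:
            continue  # skip duplicate hash values
        q, r = divmod(v - c - 1, p)
        bits.extend([0] * q)  # quotient in unary ('0' bits)
        bits.append(1)  # terminating '1' bit
        write(r, log_p)  # remainder in binary
        c = v
    bits.extend([0] * (-len(bits) % 8))  # pad to a whole number of octets
    return bytes(
        int("".join(map(str, bits[i:i + 8])), 2) for i in range(0, len(bits), 8)
    )


def query_digest(digest, url, etag, validators):
    """Section 2.2.1: test whether a URL (and ETag) matches the digest."""
    bits = [(octet >> i) & 1 for octet in digest for i in range(7, -1, -1)]
    if len(bits) < 10:
        return False  # too short to hold the N and P headers
    pos = 0

    def read(width):
        nonlocal pos
        value = 0
        for bit in bits[pos:pos + width]:
            value = (value << 1) | bit
        pos += width
        return value

    log_n, log_p = read(5), read(5)
    n, p = 1 << log_n, 1 << log_p
    target = hash_value(url, etag, validators, n, p)
    c = -1
    while True:
        q = 0
        while pos < len(bits) and bits[pos] == 0:
            q += 1
            pos += 1
        if pos + 1 + log_p > len(bits):
            return False  # only padding (or a truncated entry) remains
        pos += 1  # discard the terminating '1' bit
        c += q * p + read(log_p) + 1
        if c == target:
            return True
```

   With these assumptions, every (URL, ETag) pair passed to
   compute_digest() is reported as a match by query_digest(), while
   other inputs match only with a probability on the order of 1/P.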