Network Working Group                                             K. Oku
Internet-Draft                                             DeNA Co, Ltd.
Intended status: Standards Track                                      M.
Nottingham
Expires: January 10, 2017                                   July 9, 2016

                        Cache Digests for HTTP/2
                   draft-ietf-httpbis-cache-digest-00

Abstract

   This specification defines an HTTP/2 frame type to allow clients to
   inform the server of their cache's contents. Servers can then use
   this to inform their choices of what to push to clients.

Note to Readers

   Discussion of this draft takes place on the HTTP working group
   mailing list (ietf-http-wg@w3.org), which is archived at
   https://lists.w3.org/Archives/Public/ietf-http-wg/ .

   Working Group information can be found at http://httpwg.github.io/ ;
   source code and issues list for this draft can be found at
   https://github.com/httpwg/http-extensions/labels/cache-digest .

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on January 10, 2017.

Copyright Notice

   Copyright (c) 2016 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.
   Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Notational Conventions
   2.  The CACHE_DIGEST Frame
     2.1.  Client Behavior
       2.1.1.  Computing the Digest-Value
       2.1.2.  Computing a Hash Value
     2.2.  Server Behavior
       2.2.1.  Querying the Digest for a Value
   3.  IANA Considerations
   4.  Security Considerations
   5.  References
     5.1.  Normative References
     5.2.  Informative References
   Appendix A.  Acknowledgements
   Authors' Addresses

1.  Introduction

   HTTP/2 [RFC7540] allows a server to "push" synthetic request/response
   pairs into a client's cache optimistically. While there is strong
   interest in using this facility to improve perceived Web browsing
   performance, it is sometimes counterproductive because the client
   might already have cached the "pushed" response.

   When this is the case, the bandwidth used to "push" the response is
   effectively wasted, and represents opportunity cost, because it could
   be used by other, more relevant responses.
   HTTP/2 allows a stream to
   be cancelled by a client using a RST_STREAM frame in this situation,
   but there is still at least one round trip of potentially wasted
   capacity even then.

   This specification defines an HTTP/2 frame type to allow clients to
   inform the server of their cache's contents using a Golomb-Rice coded
   set. Servers can then use this to inform their choices of what to
   push to clients.

1.1.  Notational Conventions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

2.  The CACHE_DIGEST Frame

   The CACHE_DIGEST frame type is 0xf1. NOTE: This is an experimental
   value; if standardised, a permanent value will be assigned.

   +-----------------------------------------------+
   |         Digest-Value? (*)                    ...
   +-----------------------------------------------+

   The CACHE_DIGEST frame payload has the following fields:

   o  *Digest-Value*: A sequence of octets containing the digest as
      computed in Section 2.1.1.

   The CACHE_DIGEST frame defines the following flags:

   o  *RESET* (0x1): When set, indicates that any and all cache digests
      for the applicable origin held by the recipient MUST be considered
      invalid.

   o  *COMPLETE* (0x2): When set, indicates that the currently valid set
      of cache digests held by the server constitutes a complete
      representation of the cache's state regarding that origin, for the
      type of cached response indicated by the "STALE" flag.

   o  *VALIDATORS* (0x4): When set, indicates that the "validators"
      boolean in Section 2.1.1 is true.

   o  *STALE* (0x8): When set, indicates that all cached responses
      represented in the digest-value are stale [RFC7234] at the point
      in time that the digest was generated; otherwise, all are fresh.

2.1.
Client Behavior

   A CACHE_DIGEST frame can be sent from a client to a server on any
   stream in the "open" state, and conveys a digest of the contents of
   the client's cache for the associated stream.

   In typical use, a client will send one or more CACHE_DIGESTs
   immediately after the first request on a connection for a given
   origin, on the same stream, because there is usually a short period
   of inactivity then, and servers can benefit most when they understand
   the state of the cache before they begin pushing associated assets
   (e.g., CSS, JavaScript and images). Clients MAY send CACHE_DIGEST at
   other times.

   If the cache's state is cleared or lost, or the client otherwise
   wishes the server to stop using previously sent CACHE_DIGESTs, it can
   send a CACHE_DIGEST with the RESET flag set.

   When generating CACHE_DIGEST, a client MUST NOT include cached
   responses whose URLs do not share origins [RFC6454] with the request
   of the stream that the frame is sent upon.

   CACHE_DIGEST allows the client to indicate whether the set of URLs
   used to compute the digest represent fresh or stale stored responses,
   using the STALE flag. Clients MAY decide whether to send
   CACHE_DIGEST frames representing only their fresh stored responses,
   only their stale stored responses, or both.

   Clients can choose to only send a subset of the suitable stored
   responses of each type (fresh or stale). However, when the
   CACHE_DIGEST frames sent represent the complete set of stored
   responses of a given type, the last such frame SHOULD have a COMPLETE
   flag set, to indicate to the server that it has all relevant state of
   that type. Note that for the purposes of COMPLETE, responses cached
   since the beginning of the connection or the last RESET flag on a
   CACHE_DIGEST frame need not be included.

   CACHE_DIGEST can be computed to include cached responses' ETags, as
   indicated by the VALIDATORS flag.
   This information can be used by
   servers to decide what kinds of responses to push to clients; for
   example, a stale response that hasn't changed could be refreshed with
   a 304 (Not Modified) response; one that has changed can be replaced
   with a 200 (OK) response, whether the cached response was fresh or
   stale.

   CACHE_DIGEST has no defined meaning when sent from servers, and
   SHOULD be ignored by clients.

2.1.1.  Computing the Digest-Value

   Given the following inputs:

   o  "validators", a boolean indicating whether validators ([RFC7232])
      are to be included in the digest;

   o  "URLs", an array of (string "URL", string "ETag") tuples, each
      corresponding to the Effective Request URI ([RFC7230],
      Section 5.5) of a cached response [RFC7234] and its entity-tag
      [RFC7232] (if "validators" is true and if the ETag is available;
      otherwise, null);

   o  "P", an integer that MUST be a power of 2 smaller than 2**32, that
      indicates the probability of a false positive that is acceptable,
      expressed as "1/P".

   "digest-value" can be computed using the following algorithm:

   1.   Let N be the number of members in "URLs", rounded to the nearest
        power of 2 smaller than 2**32.

   2.   Let "hash-values" be an empty array of integers.

   3.   For each ("URL", "ETag") in "URLs", compute a hash value
        (Section 2.1.2) and append the result to "hash-values".

   4.   Sort "hash-values" in ascending order.

   5.   Let "digest-value" be an empty array of bits.

   6.   Write log base 2 of "N" to "digest-value" using 5 bits.

   7.   Write log base 2 of "P" to "digest-value" using 5 bits.

   8.   Let "C" be -1.

   9.   For each "V" in "hash-values":

        1.  If "V" is equal to "C", continue to the next "V".

        2.  Let "D" be the result of "V - C - 1".

        3.  Let "Q" be the integer result of "D / P".

        4.  Let "R" be the result of "D modulo P".

        5.  Write "Q" '0' bits to "digest-value".

        6.  Write 1 '1' bit to "digest-value".

        7.
            Write "R" to "digest-value" as binary, using log2("P") bits.

        8.  Let "C" be "V".

   10.  If the length of "digest-value" is not a multiple of 8, pad it
        with 0s until it is.

2.1.2.  Computing a Hash Value

   Given:

   o  "URL", an array of characters

   o  "ETag", an array of characters

   o  "validators", a boolean

   o  "N", an integer

   o  "P", an integer

   "hash-value" can be computed using the following algorithm:

   1.  Let "key" be "URL" converted to an ASCII string by
       percent-encoding as appropriate [RFC3986].

   2.  If "validators" is true and "ETag" is not null:

       1.  Append "ETag" to "key" as an ASCII string, including both the
           "weak" indicator (if present) and double quotes, as per
           [RFC7232] Section 2.3.

   3.  Let "hash-value" be the SHA-256 message digest [RFC6234] of
       "key", expressed as an integer.

   4.  Truncate "hash-value" to log2( "N" * "P" ) bits.

2.2.  Server Behavior

   In typical use, a server will query (as per Section 2.2.1) the
   CACHE_DIGESTs received on a given connection to inform what it pushes
   to that client:

   o  If a given URL has a match in a current CACHE_DIGEST with the
      STALE flag unset, it need not be pushed, because it is fresh in
      cache;

   o  If a given URL and ETag combination has a match in a current
      CACHE_DIGEST with the STALE flag set, the client has a stale copy
      in cache, and a validating response can be pushed;

   o  If a given URL has no match in any current CACHE_DIGEST, the
      client does not have a cached copy, and a complete response can be
      pushed.

   Servers MAY use all CACHE_DIGESTs received for a given origin as
   current, as long as they do not have the RESET flag set; a
   CACHE_DIGEST frame with the RESET flag set MUST clear any previously
   stored CACHE_DIGESTs for its origin. Servers MUST treat an empty
   Digest-Value with a RESET flag set as effectively clearing all stored
   digests for that origin.
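
   As a non-normative illustration, the client-side computation in
   Sections 2.1.1 and 2.1.2 and the server-side lookup in Section 2.2.1
   might be sketched in Python as follows. The function names are
   hypothetical and are not defined by this specification. The sketch
   also makes several assumptions where the text leaves room for
   interpretation: URLs are already percent-encoded ASCII strings,
   "truncate" keeps the most significant bits of the SHA-256 value,
   "nearest power of 2" rounds ties upward, and an empty set of URLs
   yields N = 1.

```python
import hashlib


def hash_value(url, etag, validators, n, p):
    """Section 2.1.2: hash a URL (and optionally its ETag) to log2(N*P) bits."""
    key = url  # assumption: url is already an ASCII, percent-encoded string
    if validators and etag is not None:
        key += etag  # ETag as stored, including any W/ prefix and the quotes
    digest = hashlib.sha256(key.encode("ascii")).digest()
    bits = (n * p).bit_length() - 1  # log2(N * P); both are powers of two
    # Assumption: "truncate" keeps the most significant bits of the hash.
    return int.from_bytes(digest, "big") >> (256 - bits)


def nearest_power_of_two(count):
    """Assumption: ties round upward and an empty set gives N = 1."""
    if count < 1:
        return 1
    lower = 1 << (count.bit_length() - 1)
    upper = lower << 1
    return min(lower if count - lower < upper - count else upper, 1 << 31)


def compute_digest(urls, validators, p):
    """Section 2.1.1: Golomb-Rice code the sorted hash values into octets."""
    n = nearest_power_of_two(len(urls))
    log_n, log_p = n.bit_length() - 1, p.bit_length() - 1
    hashes = sorted(hash_value(u, e, validators, n, p) for u, e in urls)
    bits = [int(b) for b in format(log_n, "05b") + format(log_p, "05b")]
    c = -1
    for v in hashes:
        if v == c:
            continue  # duplicate hash value, already encoded
        q, r = divmod(v - c - 1, p)
        bits += [0] * q + [1]  # unary quotient, then the '1' terminator
        bits += [(r >> i) & 1 for i in reversed(range(log_p))]
        c = v
    bits += [0] * (-len(bits) % 8)  # pad to a whole number of octets
    return bytes(
        int("".join(map(str, bits[i:i + 8])), 2) for i in range(0, len(bits), 8)
    )


def query_digest(digest, url, etag, validators):
    """Section 2.2.1: True if the URL (and ETag) matches an encoded value."""
    bits = [(octet >> i) & 1 for octet in digest for i in reversed(range(8))]
    if len(bits) < 10:
        return False  # too short to hold log2(N) and log2(P)
    log_n = int("".join(map(str, bits[0:5])), 2)
    log_p = int("".join(map(str, bits[5:10])), 2)
    n, p = 1 << log_n, 1 << log_p
    target = hash_value(url, etag, validators, n, p)
    pos, c = 10, -1
    while True:
        q = 0
        while pos < len(bits) and bits[pos] == 0:
            q += 1
            pos += 1
        if pos + 1 + log_p > len(bits):
            return False  # digest exhausted; only padding remained
        pos += 1  # discard the '1' terminator
        r = 0
        for bit in bits[pos:pos + log_p]:
            r = (r << 1) | bit
        pos += log_p
        c += q * p + r + 1
        if c == target:
            return True
```

   Under these assumptions, every (URL, ETag) pair passed to
   compute_digest() is reported as a match by query_digest(), while a
   URL that was not included is reported as a match with a probability
   of roughly 1/P.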

   Clients are not likely to send updates to CACHE_DIGEST over the
   lifetime of a connection; it is expected that servers will separately
   track what cacheable responses have been sent previously on the same
   connection, using that knowledge in conjunction with that provided by
   CACHE_DIGEST.

2.2.1.  Querying the Digest for a Value

   Given:

   o  "digest-value", an array of bits

   o  "URL", an array of characters

   o  "ETag", an array of characters

   o  "validators", a boolean

   we can determine whether there is a match in the digest using the
   following algorithm:

   1.   Read the first 5 bits of "digest-value" as an integer; let "N"
        be two raised to the power of that value.

   2.   Read the next 5 bits of "digest-value" as an integer; let "P" be
        two raised to the power of that value.

   3.   Let "hash-value" be the result of computing a hash value
        (Section 2.1.2).

   4.   Let "C" be -1.

   5.   Read '0' bits from "digest-value" until a '1' bit is found; let
        "Q" be the number of '0' bits. Discard the '1'.

   6.   Read log2("P") bits from "digest-value" after the '1' as an
        integer; let "R" be its value.

   7.   Let "D" be "Q" * "P" + "R".

   8.   Increment "C" by "D" + 1.

   9.   If "C" is equal to "hash-value", return 'true'.

   10.  Otherwise, return to step 5 and continue processing; if no match
        is found before "digest-value" is exhausted, return 'false'.

3.  IANA Considerations

   This draft currently has no requirements for IANA. If the
   CACHE_DIGEST frame is standardised, it will need to be assigned a
   frame type.

4.  Security Considerations

   The contents of a User Agent's cache can be used to re-identify or
   "fingerprint" the user over time, even when other identifiers (e.g.,
   Cookies [RFC6265]) are cleared.

   CACHE_DIGEST allows such cache-based fingerprinting to become
   passive, since it allows the server to discover the state of the
   client's cache without any visible change in server behaviour.

   As a result, clients MUST mitigate this threat when the user
   attempts to remove identifiers (e.g., "clearing cookies"). This
   could be achieved in a number of ways; for example: by clearing the
   cache, by changing one or both of N and P, or by adding new,
   synthetic entries to the digest to change its contents.

   TODO: discuss how effective the suggested mitigations actually would
   be.

   Additionally, User Agents SHOULD NOT send CACHE_DIGEST when in
   "privacy mode."

5.  References

5.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC3986]  Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
              Resource Identifier (URI): Generic Syntax", STD 66,
              RFC 3986, DOI 10.17487/RFC3986, January 2005.

   [RFC6234]  Eastlake 3rd, D. and T. Hansen, "US Secure Hash Algorithms
              (SHA and SHA-based HMAC and HKDF)", RFC 6234,
              DOI 10.17487/RFC6234, May 2011.

   [RFC6454]  Barth, A., "The Web Origin Concept", RFC 6454,
              DOI 10.17487/RFC6454, December 2011.

   [RFC7230]  Fielding, R., Ed. and J. Reschke, Ed., "Hypertext Transfer
              Protocol (HTTP/1.1): Message Syntax and Routing",
              RFC 7230, DOI 10.17487/RFC7230, June 2014.

   [RFC7232]  Fielding, R., Ed. and J. Reschke, Ed., "Hypertext Transfer
              Protocol (HTTP/1.1): Conditional Requests", RFC 7232,
              DOI 10.17487/RFC7232, June 2014.

   [RFC7234]  Fielding, R., Ed., Nottingham, M., Ed., and J. Reschke,
              Ed., "Hypertext Transfer Protocol (HTTP/1.1): Caching",
              RFC 7234, DOI 10.17487/RFC7234, June 2014.

   [RFC7540]  Belshe, M., Peon, R., and M.
              Thomson, Ed., "Hypertext
              Transfer Protocol Version 2 (HTTP/2)", RFC 7540,
              DOI 10.17487/RFC7540, May 2015.

5.2.  Informative References

   [RFC6265]  Barth, A., "HTTP State Management Mechanism", RFC 6265,
              DOI 10.17487/RFC6265, April 2011.

Appendix A.  Acknowledgements

   Thanks to Adam Langley and Giovanni Bajo for their explorations of
   Golomb-coded sets. In particular, see
   http://giovanni.bajo.it/post/47119962313/golomb-coded-sets-smaller-than-bloom-filters ,
   which refers to sample code.

   Thanks to Stefan Eissing for his suggestions.

Authors' Addresses

   Kazuho Oku
   DeNA Co, Ltd.

   Email: kazuhooku@gmail.com


   Mark Nottingham

   Email: mnot@mnot.net
   URI:   https://www.mnot.net/