Network Working Group                                         J. Yasskin
Internet-Draft                                                    Google
Intended status: Informational                            March 05, 2018
Expires: September 6, 2018


              Use Cases and Requirements for Web Packages
                 draft-yasskin-webpackage-use-cases-01

Abstract

   This document lists use cases for signing and/or bundling collections
   of web pages, and extracts a set of requirements from them.

Note to Readers

   Discussion of this draft takes place on the ART area mailing list
   (art@ietf.org), which is archived at
   https://mailarchive.ietf.org/arch/search/?email_list=art [1].

   The source code and issues list for this draft can be found in
   https://github.com/WICG/webpackage [2].

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on September 6, 2018.

Copyright Notice

   Copyright (c) 2018 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Use cases
     2.1.  Essential
       2.1.1.  Offline installation
       2.1.2.  Offline browsing
       2.1.3.  Save and share a web page
     2.2.  Nice-to-have
       2.2.1.  Packaged Web Publications
       2.2.2.  Avoiding Censorship
       2.2.3.  Third-party security review
       2.2.4.  Building packages from multiple libraries
       2.2.5.  Privacy-preserving prefetch
       2.2.6.  Cross-CDN Serving
       2.2.7.  Installation from a self-extracting executable
       2.2.8.  Packages in version control
       2.2.9.  Subresource bundling
       2.2.10. Archival
   3.  Requirements
     3.1.  Essential
       3.1.1.  Indexed by URL
       3.1.2.  Request headers
       3.1.3.  Response headers
       3.1.4.  Signing as an origin
       3.1.5.  Random access
       3.1.6.  Resources from multiple origins in a package
       3.1.7.  Cryptographic agility
       3.1.8.  Unsigned content
       3.1.9.  Certificate revocation
       3.1.10. Downgrade prevention
       3.1.11. Metadata
       3.1.12. Implementations are hard to get wrong
     3.2.  Nice to have
       3.2.1.  Streamed loading
       3.2.2.  Additional signatures
       3.2.3.  Binary
       3.2.4.  Deduplication of diamond dependencies
       3.2.5.  Old crypto can be removed
       3.2.6.  Compress transfers
       3.2.7.  Compress stored packages
       3.2.8.  Subsetting and reordering
       3.2.9.  Packaged validity information
       3.2.10. Signing uses existing TLS certificates
       3.2.11. External dependencies
       3.2.12. Trailing length
       3.2.13. Time-shifting execution
   4.  Non-goals
     4.1.  Store confidential data
     4.2.  Generate packages on the fly
     4.3.  Non-origin identity
     4.4.  DRM
     4.5.  Ergonomic replacement for HTTP/2 PUSH
   5.  Security Considerations
   6.  IANA Considerations
   7.  References
     7.1.  Informative References
     7.2.  URIs
   Appendix A.  Acknowledgements
   Author's Address

1.  Introduction

   People would like to use content offline and in other situations
   where there isn't a direct connection to the server where the content
   originates. However, it's difficult to distribute and verify the
   authenticity of applications and content without a connection to the
   network. The W3C has addressed running applications offline with
   Service Workers ([ServiceWorkers]), but not the problem of
   distribution.

   Previous attempts at packaging web resources (e.g. Resource Packages
   [3] and the W3C TAG's packaging proposal [4]) were motivated by
   speeding up the download of resources from a single server, which is
   probably better achieved through other mechanisms like HTTP/2 PUSH,
   possibly augmented with a simple manifest of URLs a page plans to use
   [5]. This attempt is instead motivated by avoiding a connection to
   the origin server at all. It may still be useful for the earlier use
   cases, so they're still listed, but they're not primary.

2.  Use cases

   These use cases are in rough descending priority order. If use cases
   have conflicting requirements, the design should favor the more
   important use cases.

2.1.  Essential

2.1.1.  Offline installation

   Alex can download a file containing a website (a PWA [6]), including
   a Service Worker, from origin "O" and transmit it to their peer
   Bailey. Bailey can then install the Service Worker with a proof that
   it came from "O". This saves Bailey the bandwidth costs of
   transferring the website.

   There are roughly two ways to accomplish this:

   1.  Package just the Service Worker JavaScript and any other
       JavaScript that it loads with importScripts() [7], with their
       URLs and enough metadata to synthesize a
       navigator.serviceWorker.register(scriptURL, options) call [8],
       along with an uninterpreted but signature-checked blob of data
       that the Service Worker can interpret to fill in its caches.

   2.  Package the resources so that the Service Worker can fetch() them
       to populate its cache.

   Associated requirements for just the Service Worker:

   o  Indexed by URL: The "register()" and "importScripts()" calls have
      semantics that depend on the URL.

   o  Signing as an origin: To prove that the file came from "O".

   o  Signing uses existing TLS certificates: So "O" doesn't have to
      spend lots of money buying a specialized certificate.

   o  Cryptographic agility: Today's algorithms will eventually be
      obsolete and will need to be replaced.

   o  Certificate revocation: "O"'s certificate might be compromised or
      mis-issued, and the attacker shouldn't then get an infinite
      ability to mint packages.

   o  Downgrade prevention: "O"'s site might have an XSS vulnerability,
      and attackers with an old signed package shouldn't be able to take
      advantage of the XSS forever.

   o  Metadata: Just enough to generate the "register()" call, which is
      less than a full W3C Application Manifest.

   Additional associated requirements for packaged resources:

   o  Indexed by URL: Resources on the web are addressed by URL.

   o  Request headers: If Bailey is running a different browser from
      Alex or has a different language configured, the "accept*" headers
      are important for selecting which resource to use at each URL.

   o  Response headers: The meaning of a resource is heavily influenced
      by its HTTP response headers.

   o  Resources from multiple origins in a package: So the site can be
      built from multiple components (Section 2.2.4).

   o  Metadata: The browser needs to know which resource within a
      package file to treat as its Service Worker and/or initial HTML
      page.

2.1.1.1.  Online use

   Bailey may have an internet connection through which they can, in
   real time, fetch updates to the package they received from Alex.

2.1.1.2.  Fully offline use

   Alternatively, Bailey may lack an internet connection a significant
   fraction of the time: because they have no internet access at all,
   because they turn off internet access except when intentionally
   downloading content, or because they use up their data plan partway
   through each month.

   Associated requirements beyond Offline installation:

   o  Packaged validity information: Even without a direct internet
      connection, Bailey should be able to check that their package is
      still valid.

2.1.2.  Offline browsing

   Alex can download a file containing a large website (e.g. Wikipedia)
   from its origin "O", save it to transferable storage (e.g. an SD
   card), and hand it to their peer Bailey. Then Bailey can browse the
   website with a proof that it came from "O". Bailey may not have the
   storage space to copy the website before browsing it.

   This use case is harder for publishers to support if we specialize
   Section 2.1.1 for Service Workers, since that would require
   publishers to adopt Service Workers before they can sign their sites.

   Associated requirements beyond Offline installation:

   o  Random access: To avoid needing a long linear scan before using
      the content.

   o  Compress stored packages: So that more content can fit on the same
      storage device.

2.1.3.  Save and share a web page

   Casey is viewing a web page and wants to save it either for offline
   use or to show it to their friend Dakota. Since Casey isn't the web
   page's author, they don't have the private key needed to sign the
   page. Browsers currently allow their users to save pages, but each
   browser uses a different format (MHTML, Web Archive, or files in a
   directory), so Dakota and Casey would need to be using the same
   browser. Casey could also take a screenshot, at the cost of losing
   links and accessibility.

   Associated requirements:

   o  Unsigned content: A client can't sign content as another origin.

   o  Resources from multiple origins in a package: General web pages
      include resources from multiple origins.

   o  Indexed by URL: Resources on the web are addressed by URL.

   o  Response headers: The meaning of a resource is heavily influenced
      by its HTTP response headers.

2.2.  Nice-to-have

2.2.1.  Packaged Web Publications

   The W3C's Publishing Working Group [9], merged from the International
   Digital Publishing Forum (IDPF) and in charge of EPUB maintenance,
   wants to be able to create publications on the web and then let them
   be copied to different servers or to other users via arbitrary
   protocols. See their Packaged Web Publications use cases [10] for
   more details.

   Associated requirements:

   o  Indexed by URL: Resources on the web are addressed by URL.

   o  Signing as an origin: So that readers can be sure their copy is
      authentic and so that copying the package preserves the URLs of
      the content inside it.

   o  Downgrade prevention: An early version of a publication might
      contain incorrect content, and a publisher should be able to
      update that without worrying that an attacker can still show the
      old content to users.

   o  Metadata: A publication can have copyright and licensing concerns;
      a title, author, and cover image; an ISBN or DOI name; etc., all
      of which should be included when that publication is packaged.

   Other requirements are similar to those from Offline installation:

   o  Random access: To avoid needing a long linear scan before using
      the content.

   o  Compress stored packages: So that more content can fit on the same
      storage device.

   o  Request headers: If different users' browsers have different
      capabilities or preferences, the "accept*" headers are important
      for selecting which resource to use at each URL.

   o  Response headers: The meaning of a resource is heavily influenced
      by its HTTP response headers.

   o  Signing uses existing TLS certificates: So a publisher doesn't
      have to spend lots of money buying a specialized certificate.

   o  Cryptographic agility: Today's algorithms will eventually be
      obsolete and will need to be replaced.

   o  Certificate revocation: The publisher's certificate might be
      compromised or mis-issued, and an attacker shouldn't then get an
      infinite ability to mint packages.

2.2.2.  Avoiding Censorship

   Some users want to retrieve resources that their governments or
   network providers don't want them to see. Right now, it's
   straightforward for someone in a privileged network position to block
   access to particular hosts, but TLS makes it difficult to block
   access to particular resources on those hosts.

   Today it's straightforward to retrieve blocked content from a third
   party, but there's no guarantee that the third party has sent the
   user an accurate representation of the content: the user has to trust
   the third party.

   With signed web packages, the user can regain assurance that the
   content is authentic, while still bypassing the censorship. Packages
   don't do anything to help discover this content.

   Systems that make censorship more difficult can also make legitimate
   content filtering more difficult. Because the client that processes
   a web package always knows the true URL, this forces content
   filtering to happen on the client instead of on the network.

   Associated requirements:

   o  Indexed by URL: So the user can see that they're getting the
      content they expected.

   o  Signing as an origin: So that readers can be sure their copy is
      authentic and so that copying the package preserves the URLs of
      the content inside it.

2.2.3.  Third-party security review

   Some users may want to grant certain permissions only to applications
   that have been reviewed for security by a trusted third party. These
   third parties could provide guarantees similar to those provided by
   the iOS, Android, or Chrome OS app stores, which might allow browsers
   to offer more powerful capabilities than have been deemed safe for
   unaudited websites.

   Binary transparency for websites is similar: as with Certificate
   Transparency [RFC6962], the transparency logs would sign the content
   of the package to provide assurance that experts had a chance to
   audit the exact package a client received.

   Associated requirements:

   o  Additional signatures

2.2.4.  Building packages from multiple libraries

   Large programs are built from smaller components. In the case of the
   web, components can be included either as JavaScript files or as
   "