idnits 2.17.1
draft-yasskin-webpackage-use-cases-02.txt:
Checking boilerplate required by RFC 5378 and the IETF Trust (see
https://trustee.ietf.org/license-info):
----------------------------------------------------------------------------
No issues found here.
Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
----------------------------------------------------------------------------
No issues found here.
Checking nits according to https://www.ietf.org/id-info/checklist :
----------------------------------------------------------------------------
** The abstract seems to contain references ([2], [1]), which it shouldn't.
Please replace those with straight textual mentions of the documents in
question.
Miscellaneous warnings:
----------------------------------------------------------------------------
== The copyright year in the IETF Trust and authors Copyright Line does not
match the current year
-- The document date (October 30, 2019) is 1637 days in the past. Is this
intentional?
Checking references for intended status: Informational
----------------------------------------------------------------------------
-- Looks like a reference, but probably isn't: '1' on line 1035
-- Looks like a reference, but probably isn't: '2' on line 1037
-- Looks like a reference, but probably isn't: '3' on line 1039
-- Looks like a reference, but probably isn't: '4' on line 1041
-- Looks like a reference, but probably isn't: '5' on line 1043
-- Looks like a reference, but probably isn't: '6' on line 1046
-- Looks like a reference, but probably isn't: '7' on line 1048
-- Looks like a reference, but probably isn't: '8' on line 1050
-- Looks like a reference, but probably isn't: '9' on line 1053
-- Looks like a reference, but probably isn't: '10' on line 1055
-- Looks like a reference, but probably isn't: '11' on line 1057
-- Looks like a reference, but probably isn't: '12' on line 1059
-- Looks like a reference, but probably isn't: '13' on line 1061
-- Looks like a reference, but probably isn't: '14' on line 1063
-- Looks like a reference, but probably isn't: '15' on line 1065
-- Looks like a reference, but probably isn't: '16' on line 1067
-- Looks like a reference, but probably isn't: '17' on line 1069
-- Looks like a reference, but probably isn't: '18' on line 1072
-- Looks like a reference, but probably isn't: '19' on line 1074
-- Looks like a reference, but probably isn't: '20' on line 1077
-- Looks like a reference, but probably isn't: '21' on line 1080
-- Looks like a reference, but probably isn't: '22' on line 1083
-- Looks like a reference, but probably isn't: '23' on line 1085
-- Looks like a reference, but probably isn't: '24' on line 1088
-- Looks like a reference, but probably isn't: '25' on line 1091
-- Looks like a reference, but probably isn't: '26' on line 1093
-- Looks like a reference, but probably isn't: '27' on line 1095
== Outdated reference: A later version (-06) exists of
draft-ietf-httpbis-http2-secondary-certs-04
-- Obsolete informational reference (is this intentional?): RFC 6962
(Obsoleted by RFC 9162)
-- Obsolete informational reference (is this intentional?): RFC 7540
(Obsoleted by RFC 9113)
Summary: 1 error (**), 0 flaws (~~), 2 warnings (==), 30 comments (--).
Run idnits with the --verbose option for more detailed information about
the items above.
--------------------------------------------------------------------------------
2 Network Working Group J. Yasskin
3 Internet-Draft Google
4 Intended status: Informational October 30, 2019
5 Expires: May 2, 2020
7 Use Cases and Requirements for Web Packages
8 draft-yasskin-webpackage-use-cases-02
10 Abstract
12 This document lists use cases for signing and/or bundling collections
13 of web pages, and extracts a set of requirements from them.
15 Note to Readers
17 Discussion of this draft takes place on the ART area mailing list
18 (art@ietf.org), which is archived at
19 https://mailarchive.ietf.org/arch/search/?email_list=art [1].
21 The source code and issues list for this draft can be found in
22 https://github.com/WICG/webpackage [2].
24 Status of This Memo
26 This Internet-Draft is submitted in full conformance with the
27 provisions of BCP 78 and BCP 79.
29 Internet-Drafts are working documents of the Internet Engineering
30 Task Force (IETF). Note that other groups may also distribute
31 working documents as Internet-Drafts. The list of current Internet-
32 Drafts is at https://datatracker.ietf.org/drafts/current/.
34 Internet-Drafts are draft documents valid for a maximum of six months
35 and may be updated, replaced, or obsoleted by other documents at any
36 time. It is inappropriate to use Internet-Drafts as reference
37 material or to cite them other than as "work in progress."
39 This Internet-Draft will expire on May 2, 2020.
41 Copyright Notice
43 Copyright (c) 2019 IETF Trust and the persons identified as the
44 document authors. All rights reserved.
46 This document is subject to BCP 78 and the IETF Trust's Legal
47 Provisions Relating to IETF Documents
48 (https://trustee.ietf.org/license-info) in effect on the date of
49 publication of this document. Please review these documents
50 carefully, as they describe your rights and restrictions with respect
51 to this document. Code Components extracted from this document must
52 include Simplified BSD License text as described in Section 4.e of
53 the Trust Legal Provisions and are provided without warranty as
54 described in the Simplified BSD License.
56 Table of Contents
58 1. Introduction
59 2. Use cases
60 2.1. Essential
61 2.1.1. Offline installation
62 2.1.2. Offline browsing
63 2.1.3. Save and share a web page
64 2.1.4. Privacy-preserving prefetch
65 2.2. Nice-to-have
66 2.2.1. Packaged Web Publications
67 2.2.2. Avoiding Censorship
68 2.2.3. Third-party security review
69 2.2.4. Building packages from multiple libraries
70 2.2.5. Cross-CDN Serving
71 2.2.6. Pre-installed applications
72 2.2.7. Protecting Users from a Compromised Frontend
73 2.2.8. Installation from a self-extracting executable
74 2.2.9. Packages in version control
75 2.2.10. Subresource bundling
76 2.2.11. Archival
77 3. Requirements
78 3.1. Essential
79 3.1.1. Indexed by URL
80 3.1.2. Request headers
81 3.1.3. Response headers
82 3.1.4. Signing as an origin
83 3.1.5. Random access
84 3.1.6. Resources from multiple origins in a package
85 3.1.7. Cryptographic agility
86 3.1.8. Unsigned content
87 3.1.9. Certificate revocation
88 3.1.10. Downgrade prevention
89 3.1.11. Metadata
90 3.1.12. Implementations are hard to get wrong
91 3.2. Nice to have
92 3.2.1. Streamed loading
93 3.2.2. Signing without origin trust
94 3.2.3. Additional signatures
95 3.2.4. Binary
96 3.2.5. Deduplication of diamond dependencies
97 3.2.6. Old crypto can be removed
98 3.2.7. Compress transfers
99 3.2.8. Compress stored packages
100 3.2.9. Subsetting and reordering
101 3.2.10. Packaged validity information
102 3.2.11. Signing uses existing TLS certificates
103 3.2.12. External dependencies
104 3.2.13. Trailing length
105 3.2.14. Time-shifting execution
106 3.2.15. Service Worker integration
107 4. Non-goals
108 4.1. Store confidential data
109 4.2. Generate packages on the fly
110 4.3. Non-origin identity
111 4.4. DRM
112 4.5. Ergonomic replacement for HTTP/2 PUSH
113 5. Security Considerations
114 6. IANA Considerations
115 7. References
116 7.1. Informative References
117 7.2. URIs
118 Appendix A. Acknowledgements
119 Author's Address
121 1. Introduction
123 People would like to use content offline and in other situations
124 where there isn't a direct connection to the server where the content
125 originates. However, it's difficult to distribute and verify the
126 authenticity of applications and content without a connection to the
127 network. The W3C has addressed running applications offline with
128 Service Workers ([ServiceWorkers]), but not the problem of
129 distribution.
131 Previous attempts at packaging web resources (e.g. Resource Packages
132 [3] and the W3C TAG's packaging proposal [4]) were motivated by
133 speeding up the download of resources from a single server, which is
134 probably better achieved through other mechanisms like HTTP/2 PUSH,
135 possibly augmented with a simple manifest of URLs a page plans to use
136 [5]. This attempt is instead motivated by avoiding a connection to
137 the origin server at all. It may still be useful for the earlier use
138 cases, so they're still listed, but they're not primary.
140 2. Use cases
142 These use cases are in rough descending priority order. If use cases
143 have conflicting requirements, the design should enable more
144 important use cases.
146 2.1. Essential
148 2.1.1. Offline installation
150 Alex can download a file containing a website (a PWA [6]) including a
151 Service Worker from origin "O", and transmit it to their peer Bailey,
152 and then Bailey can install the Service Worker with a proof that it
153 came from "O". This saves Bailey the bandwidth costs of transferring
154 the website.
156 There are roughly two ways to accomplish this:
158 1. Package just the Service Worker JavaScript and any other
159 JavaScript that it loads via importScripts() [7], with their URLs and
160 enough metadata to synthesize a
161 navigator.serviceWorker.register(scriptURL, options) call [8],
162 along with an uninterpreted but signature-checked blob of data
163 that the Service Worker can interpret to fill in its caches.
165 2. Package the resources so that the Service Worker can fetch() them
166 to populate its cache.
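As a rough illustration of the first approach, the sketch below shows how a client might turn packaged metadata into the arguments of a "register()" call. The "PackagedWorkerMetadata" shape, its field names, and the "synthesizeRegisterArgs" helper are hypothetical and are not defined by this document; only the "register(scriptURL, options)" call and its "scope" option come from the Service Workers specification. The origin check stands in for real signature verification, which is out of scope here.

```typescript
// Hypothetical metadata a package might carry for its Service Worker.
// These field names are assumptions made for illustration only.
interface PackagedWorkerMetadata {
  scriptUrl: string; // absolute URL of the Service Worker script
  scope?: string;    // optional registration scope
}

interface RegisterArgs {
  scriptURL: string;
  options: { scope: string };
}

// Returns the scheme://host[:port] prefix of an absolute URL, or "" if none.
function originOf(url: string): string {
  const match = /^([a-z][a-z0-9+.-]*:\/\/[^\/?#]+)/i.exec(url);
  return match ? match[1] : "";
}

// Builds the arguments for navigator.serviceWorker.register(), rejecting
// metadata whose script does not belong to the origin that signed the package.
function synthesizeRegisterArgs(
  meta: PackagedWorkerMetadata,
  signedOrigin: string
): RegisterArgs {
  if (originOf(meta.scriptUrl) !== signedOrigin) {
    throw new Error("Service Worker script is not from the signing origin");
  }
  // Default scope: the directory containing the script, mirroring the
  // default used by register() when no scope is given.
  const defaultScope = meta.scriptUrl.slice(
    0,
    meta.scriptUrl.lastIndexOf("/") + 1
  );
  const scope = meta.scope !== undefined ? meta.scope : defaultScope;
  return { scriptURL: meta.scriptUrl, options: { scope: scope } };
}
```

A browser would then pass the result on, roughly as "navigator.serviceWorker.register(args.scriptURL, args.options)", once the package signature had been checked.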
168 Associated requirements for just the Service Worker:
170 o Indexed by URL: The "register()" and "importScripts()" calls have
171 semantics that depend on the URL.
173 o Signing as an origin: To prove that the file came from "O".
175 o Signing uses existing TLS certificates: So "O" doesn't have to
176 spend lots of money buying a specialized certificate.
178 o Cryptographic agility: Today's algorithms will eventually be
179 obsolete and will need to be replaced.
181 o Certificate revocation: "O"'s certificate might be compromised or
182 mis-issued, and the attacker shouldn't then get an infinite
183 ability to mint packages.
185 o Downgrade prevention: "O"'s site might have an XSS vulnerability,
186 and attackers with an old signed package shouldn't be able to take
187 advantage of the XSS forever.
189 o Metadata: Just enough to generate the "register()" call, which is
190 less than a full W3C Application Manifest.
192 Additional associated requirements for packaged resources:
194 o Indexed by URL: Resources on the web are addressed by URL.
196 o Request headers: If Bailey's running a different browser from Alex
197 or has a different language configured, the "accept*" headers are
198 important for selecting which resource to use at each URL.
200 o Response headers: The meaning of a resource is heavily influenced
201 by its HTTP response headers.
203 o Resources from multiple origins in a package: So the site can be
204 built from multiple components (Section 2.2.4).
206 o Metadata: The browser needs to know which resource within a
207 package file to treat as its Service Worker and/or initial HTML
208 page.
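The "Indexed by URL" and "Request headers" requirements can be sketched together: a package lookup keyed by URL that uses the client's "Accept-Language" value to pick among stored variants. This is a minimal sketch under stated assumptions. The "StoredVariant" and "PackageIndex" types and both helper functions are hypothetical, and the matching is a simplified exact-tag comparison rather than the full HTTP content negotiation algorithm.

```typescript
// One stored response variant; the shape is an assumption for illustration.
interface StoredVariant {
  contentLanguage: string; // e.g. "en" or "fr"
  body: string;
}

// A hypothetical package index keyed by URL; each URL may have several variants.
type PackageIndex = { [url: string]: StoredVariant[] };

// Parses an Accept-Language value into language tags, most preferred first.
function parseAcceptLanguage(header: string): string[] {
  const entries = header.split(",").map(function (part) {
    const pieces = part.trim().split(";");
    let q = 1; // a missing q parameter means full preference
    for (let i = 1; i < pieces.length; i++) {
      const param = pieces[i].trim();
      if (param.slice(0, 2) === "q=") {
        q = parseFloat(param.slice(2));
      }
    }
    return { tag: pieces[0].trim().toLowerCase(), q: q };
  });
  // Drop empty tags and tags the client explicitly refuses (q=0 or invalid q).
  const usable = entries.filter(function (e) {
    return e.tag !== "" && e.q > 0;
  });
  usable.sort(function (a, b) {
    return b.q - a.q;
  });
  return usable.map(function (e) {
    return e.tag;
  });
}

// Looks up a URL and returns the variant that best matches the request.
function selectVariant(
  index: PackageIndex,
  url: string,
  acceptLanguage: string
): StoredVariant | undefined {
  const variants = index[url];
  if (variants === undefined || variants.length === 0) {
    return undefined; // the package has nothing stored at this URL
  }
  const preferred = parseAcceptLanguage(acceptLanguage);
  for (let i = 0; i < preferred.length; i++) {
    for (let j = 0; j < variants.length; j++) {
      if (
        preferred[i] === "*" ||
        variants[j].contentLanguage.toLowerCase() === preferred[i]
      ) {
        return variants[j];
      }
    }
  }
  return variants[0]; // no match: fall back to the first stored variant
}
```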
210 2.1.1.1. Online use
212 Bailey may have an internet connection through which they can, in
213 real time, fetch updates to the package they received from Alex.
215 2.1.1.2. Fully offline use
217 Alternatively, Bailey may lack an internet connection for a significant
218 fraction of the time, because they have no internet access at all,
219 because they turn it off except when intentionally downloading content,
220 or because they use up their data plan partway through each month.
222 Associated requirements beyond Offline installation:
224 o Packaged validity information: Even without a direct internet
225 connection, Bailey should be able to check that their package is
226 still valid.
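One way to picture packaged validity information is a time window shipped inside the package that the client compares against its local clock, with no network access. The sketch below assumes a hypothetical "ValidityInfo" record and a caller-supplied clock skew allowance; this document does not define either, and a real design would also need to authenticate the validity information itself.

```typescript
// Hypothetical validity window bundled with a package (an assumption, not a
// format defined by this document). Times are seconds since the Unix epoch.
interface ValidityInfo {
  notBefore: number;
  notAfter: number;
}

type ValidityResult = "valid" | "not-yet-valid" | "expired";

// Checks the packaged window against the local clock, tolerating a
// configurable amount of clock skew on an offline device.
function checkValidity(
  info: ValidityInfo,
  nowSeconds: number,
  allowedSkewSeconds: number
): ValidityResult {
  if (nowSeconds + allowedSkewSeconds < info.notBefore) {
    return "not-yet-valid";
  }
  if (nowSeconds - allowedSkewSeconds > info.notAfter) {
    return "expired";
  }
  return "valid";
}
```

A client might call this as "checkValidity(info, Math.floor(Date.now() / 1000), 300)" before trusting the package's signature.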
228 2.1.2. Offline browsing
230 Alex can download a file containing a large website (e.g. Wikipedia)
231 from its origin "O", save it to transferable storage (e.g. an SD card),
232 and hand it to their peer Bailey. Then Bailey can browse the website
233 with a proof that it came from "O". Bailey may not have the storage
234 space to copy the website before browsing it.
236 This use case is harder for publishers to support if we specialize
237 Section 2.1.1 for Service Workers, since publishers would then have to
238 adopt Service Workers before they could sign their sites.
240 Associated requirements beyond Offline installation:
242 o Random access: To avoid needing a long linear scan before using
243 the content.
245 o Compress stored packages: So that more content can fit on the same
246 storage device.
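The random access requirement can be illustrated with an index that maps each URL to the position of its bytes, so that a client reads one resource without scanning the rest of the package. This is a sketch under stated assumptions: the "IndexEntry" layout and the "readResource" helper are hypothetical, and an in-memory "Uint8Array" stands in for what would, on an SD card, be a seek and a bounded read on a file.

```typescript
// Hypothetical index entry: where a resource's bytes live inside the package.
interface IndexEntry {
  offset: number;
  byteLength: number;
}

// Hypothetical index keyed by URL, assumed to be stored at a known place
// in the package so it can be read before any resource body.
type ResourceIndex = { [url: string]: IndexEntry };

// Returns the bytes of one resource without touching any other resource.
function readResource(
  packageBytes: Uint8Array,
  index: ResourceIndex,
  url: string
): Uint8Array | undefined {
  const entry = index[url];
  if (entry === undefined) {
    return undefined; // URL is not in this package
  }
  const end = entry.offset + entry.byteLength;
  if (entry.offset < 0 || entry.byteLength < 0 || end > packageBytes.length) {
    throw new Error("Index entry points outside the package");
  }
  // subarray() creates a view over the same buffer; nothing is copied.
  return packageBytes.subarray(entry.offset, end);
}
```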
248 2.1.3. Save and share a web page
250 Casey is viewing a web page and wants to save it either for offline
251 use or to show it to their friend Dakota. Since Casey isn't the web
252 page's publisher, they don't have the private key needed to sign the
253 page. Browsers currently allow their users to save pages, but each
254 browser uses a different format (MHTML, Web Archive, or files in a
255 directory), so Dakota and Casey would need to be using the same
256 browser. Casey could also take a screenshot, at the cost of losing
257 links and accessibility.
259 Associated requirements:
261 o Unsigned content: A client can't sign content as another origin.
263 o Resources from multiple origins in a package: General web pages
264 include resources from multiple origins.
266 o Indexed by URL: Resources on the web are addressed by URL.
268 o Response headers: The meaning of a resource is heavily influenced
269 by its HTTP response headers.
271 2.1.4. Privacy-preserving prefetch
273 Lots of websites link to other websites. Many of these source sites
274 would like the targets of these links to load quickly. The source
275 could use "<link rel="prefetch">" to prefetch the target of a link,
276 but if the user doesn't actually click that link, that leaks the fact
277 that the user saw a page that linked to the target. This can be true
278 even if the prefetch is made without browser credentials because of
279 mechanisms like TLS session IDs.
281 Because clients have limited data budgets to prefetch link targets,
282 this use case is probably limited to sites that can accurately
283 predict which link their users are most likely to click. For
284 example, search engines can predict that their users will click one
285 of the first couple of results, and news aggregation sites like Reddit
286 or Slashdot can hope that users will read the article if they've
287 navigated to its discussion.
289 Two search engines have built systems to do this with today's
290 technology: Google's AMP [9] and Baidu's MIP [10] formats and caches
291 allow them to prefetch search results while preserving privacy, at
292 the cost of showing the wrong URLs for the results once the user has
293 clicked. A good solution to this problem would show the right URLs
294 but still avoid a request to the publishing origin until after the
295 user clicks.
297 Associated requirements:
299 o Signing as an origin: To prove the content came from the original
300 origin.
302 o Streamed loading: If the user clicks before the target page is
303 fully transferred, the browser should be able to start loading
304 early parts before the source site finishes sending the whole
305 page.
307 o Compress transfers
309 o Subsetting and reordering: If a prefetched page includes
310 subresources, its publisher might want to provide and sign both
311 WebP and PNG versions of an image, but the source site should be
312 able to transfer only the best one for each client.
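The subsetting requirement can be sketched as a filter that a source site applies before forwarding a package: for each URL, keep only the variant the client prefers. The sketch assumes, hypothetically, that each resource carries its own signature so that dropping siblings does not invalidate it. The "SignedResource" shape and the "subsetForClient" helper are illustrative and not defined by this document.

```typescript
// Hypothetical per-resource record. The signature is treated as opaque and
// is assumed to cover this one resource only, so it survives subsetting.
interface SignedResource {
  url: string;
  contentType: string; // e.g. "image/webp"
  signature: string;
}

// Keeps, for each URL, the single variant whose content type ranks highest
// in acceptedTypes (listed most preferred first). Variants the client cannot
// use are dropped. Output order follows first appearance of each URL.
function subsetForClient(
  resources: SignedResource[],
  acceptedTypes: string[]
): SignedResource[] {
  const best: { [url: string]: SignedResource } = {};
  const urls: string[] = [];
  for (let i = 0; i < resources.length; i++) {
    const candidate = resources[i];
    const rank = acceptedTypes.indexOf(candidate.contentType);
    if (rank === -1) {
      continue; // the client cannot use this variant
    }
    const current = best[candidate.url];
    if (current === undefined) {
      urls.push(candidate.url);
      best[candidate.url] = candidate;
    } else if (rank < acceptedTypes.indexOf(current.contentType)) {
      best[candidate.url] = candidate; // a more preferred variant
    }
  }
  return urls.map(function (u) {
    return best[u];
  });
}
```

Because the chosen records are forwarded unchanged, the publisher's signatures remain verifiable by the client even though the source site removed the other variants.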
314 2.2. Nice-to-have
316 2.2.1. Packaged Web Publications
318 The W3C's Publishing Working Group [11], merged from the
319 International Digital Publishing Forum (IDPF) and in charge of EPUB
320 maintenance, wants to be able to create publications on the web and
321 then let them be copied to different servers or to other users via
322 arbitrary protocols. See their Packaged Web Publications use cases
323 [12] for more details.
325 Associated requirements:
327 o Indexed by URL: Resources on the web are addressed by URL.
329 o Signing as an origin: So that readers can be sure their copy is
330 authentic and so that copying the package preserves the URLs of
331 the content inside it.
333 o Downgrade prevention: An early version of a publication might
334 contain incorrect content, and a publisher should be able to
335 update that without worrying that an attacker can still show the
336 old content to users.
338 o Metadata: A publication can have copyright and licensing concerns;
339 a title, author, and cover image; an ISBN or DOI name; etc.; which
340 should be included when that publication is packaged.
342 Other requirements are similar to those from Offline installation:
344 o Random access: To avoid needing a long linear scan before using
345 the content.
347 o Compress stored packages: So that more content can fit on the same
348 storage device.
350 o Request headers: If different users' browsers have different
351 capabilities or preferences, the "accept*" headers are important
352 for selecting which resource to use at each URL.
354 o Response headers: The meaning of a resource is heavily influenced
355 by its HTTP response headers.
357 o Signing uses existing TLS certificates: So a publisher doesn't
358 have to spend lots of money buying a specialized certificate.
360 o Cryptographic agility: Today's algorithms will eventually be
361 obsolete and will need to be replaced.
363 o Certificate revocation: The publisher's certificate might be
364 compromised or mis-issued, and an attacker shouldn't then get an
365 infinite ability to mint packages.
367 2.2.2. Avoiding Censorship
369 Some users want to retrieve resources that their governments or
370 network providers don't want them to see. Right now, it's
371 straightforward for someone in a privileged network position to block
372 access to particular hosts, but TLS makes it difficult to block
373 access to particular resources on those hosts.
375 Today it's straightforward to retrieve blocked content from a third
376 party, but there's no guarantee that the third party has sent the
377 user an accurate representation of the content: the user has to trust
378 the third party.
380 With signed web packages, the user can regain assurance that the
381 content is authentic, while still bypassing the censorship. Packages
382 don't do anything to help discover this content.
384 Systems that make censorship more difficult can also make legitimate
385 content filtering more difficult. Because the client that processes
386 a web package always knows the true URL, this forces content
387 filtering to happen on the client instead of on the network.
389 Associated requirements:
391 o Indexed by URL: So the user can see that they're getting the
392 content they expected.
394 o Signing as an origin: So that readers can be sure their copy is
395 authentic and so that copying the package preserves the URLs of
396 the content inside it.
398 2.2.3. Third-party security review
400 Some users may want to grant certain permissions only to applications
401 that have been reviewed for security by a trusted third party. These
402 third parties could provide guarantees similar to those provided by
403 the iOS, Android, or Chrome OS app stores, which might allow browsers
404 to offer more powerful capabilities than have been deemed safe for
405 unaudited websites.
407 Binary transparency for websites is similar: as with Certificate
408 Transparency [RFC6962], the transparency logs would sign the content
409 of the package to provide assurance that experts had a chance to
410 audit the exact package a client received.
412 Associated requirements:
414 o Additional signatures
416 2.2.4. Building packages from multiple libraries
418 Large programs are built from smaller components. In the case of the
419 web, components can be included either as Javascript files or as
420 "