idnits 2.17.1

draft-bryan-metalinkhttp-02.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** The document seems to lack a License Notice according IETF Trust
     Provisions of 28 Dec 2009, Section 6.b.ii or Provisions of 12 Sep 2009
     Section 6.b -- however, there's a paragraph with a matching beginning.
     Boilerplate error?

     (You're using the IETF Trust Provisions' Section 6.b License Notice from
     12 Feb 2009 rather than one of the newer Notices.  See
     https://trustee.ietf.org/license-info/.)


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an IANA Considerations section.  (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == The document seems to contain a disclaimer for pre-RFC5378 work, but was
     first submitted on or after 10 November 2008.  The disclaimer is usually
     necessary only for documents that revise or obsolete older RFCs, and that
     take significant amounts of text from those RFCs.  If you can contact all
     authors of the source material and they are willing to grant the BCP78
     rights to the IETF Trust, you can and should remove the disclaimer.
     Otherwise, the disclaimer is needed and you can ignore this comment.
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (September 7, 2009) is 5345 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  -- Possible downref: Non-RFC (?) normative reference: ref. 'BITTORRENT'

  ** Obsolete normative reference: RFC 2616 (Obsoleted by RFC 7230, RFC 7231,
     RFC 7232, RFC 7233, RFC 7234, RFC 7235)

  ** Downref: Normative reference to an Informational RFC: RFC 3174

  ** Obsolete normative reference: RFC 3230 (Obsoleted by RFC 9530)


     Summary: 5 errors (**), 0 flaws (~~), 2 warnings (==), 2 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

Network Working Group                                      A. Bryan, Ed.
Internet-Draft                                        Metalinker Project
Intended status: Standards Track                       September 7, 2009
Expires: March 11, 2010

         MetaLinkHeader: Mirrors and Checksums in HTTP Headers
                      draft-bryan-metalinkhttp-02

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.  This document may contain material
   from IETF Documents or IETF Contributions published or made publicly
   available before November 10, 2008.  The person(s) controlling the
   copyright in some of this material may not have granted the IETF
   Trust the right to allow modifications of such material outside the
   IETF Standards Process.  Without obtaining an adequate license from
   the person(s) controlling the copyright in such materials, this
   document may not be modified outside the IETF Standards Process, and
   derivative works of it may not be created outside the IETF Standards
   Process, except to format it for publication as an RFC or to
   translate it into languages other than English.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on March 11, 2010.

Copyright Notice

   Copyright (c) 2009 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents in effect on the date of
   publication of this document (http://trustee.ietf.org/license-info).
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.

Abstract

   This document specifies MetaLinkHeader: Mirrors and Checksums in HTTP
   Headers, an alternative representation of Metalink, instead of the
   usual XML-based download description format.  MetaLinkHeader
   describes multiple download locations (mirrors), Peer-to-Peer,
   checksums, digital signatures, and other information using existing
   standards.  Clients can transparently use this information to make
   file transfers more robust and reliable.

Table of Contents

   1.  Introduction
     1.1.  Examples
     1.2.  Notational Conventions
   2.  Requirements
   3.  Mirrors / Multiple Download Locations
   4.  Peer-to-Peer
   5.  OpenPGP Signatures
   6.  Checksums
     6.1.  Checksums of Whole Files
     6.2.  Checksums of Chunks of Files
   7.  Client / Server Multi-source Download Interaction
   8.  Link Relation Type Registration: "duplicate"
   9.  Security Considerations
     9.1.  URIs and IRIs
     9.2.  Spoofing
     9.3.  Cryptographic Hashes
     9.4.  Signing
   10. References
     10.1. Normative References
     10.2. Informative References
   Appendix A.  Acknowledgements and Contributors
   Appendix B.  What's different...?! (to be removed by RFC Editor
                before publication)
   Appendix C.  Document History (to be removed by RFC Editor before
                publication)
   Author's Address

1.  Introduction

   MetaLinkHeader is an alternative to Metalink, usually represented in
   an XML-based document format [draft-bryan-metalink].  MetaLinkHeader
   attempts to provide as much functionality as the Metalink XML format
   by using existing standards such as Web Linking
   [draft-nottingham-http-link-header], Instance Digests in HTTP
   [RFC3230], and Content-MD5 [RFC1864].  MetaLinkHeader is used to list
   information about a file to be downloaded.
   This includes lists of multiple URIs (mirrors), Peer-to-Peer
   information, checksums, and digital signatures.

   Identical copies of a file are frequently accessible in multiple
   locations on the Internet over a variety of protocols (FTP, HTTP, and
   Peer-to-Peer).  In some cases, users are shown a list of these
   multiple download locations (mirrors) and must manually select a
   single one on the basis of geographical location, priority, or
   bandwidth.  This distributes the load across multiple servers.  At
   times, individual servers can be slow, outdated, or unreachable, but
   this cannot be determined until the download has been initiated.
   This can lead to the user canceling the download and needing to
   restart it.  During downloads, errors in transmission can corrupt the
   file.  There are no easy ways to repair these files.  For large
   downloads this can be extremely troublesome.  Any of the problems
   that can occur during a download leads to frustration on the part of
   users.

   All the information about a download, including mirrors, checksums,
   digital signatures, and more can be transferred in coordinated HTTP
   Headers.  This Metalink transfers the knowledge of the download
   server (and mirror database) to the client.  Clients can fall back to
   other mirrors if the current one has an issue.  With this knowledge,
   the client can work its way to a successful download even under
   adverse circumstances.  All this is done transparently to the user
   and the download is much more reliable and efficient.  In contrast, a
   traditional HTTP redirect to a mirror conveys only extremely minimal
   information - one link to one server, and there is no provision in
   the HTTP protocol to handle failures.
   Other features that some clients provide include multi-source
   downloads, where chunks of a file are downloaded from multiple
   mirrors (and optionally, Peer-to-Peer) simultaneously, which
   frequently results in a faster download.

   [[ Discussion of this draft should take place on the IETF HTTP WG
   mailing list at ietf-http-wg@w3.org or the Metalink discussion
   mailing list located at metalink-discussion@googlegroups.com.  To
   join the list, visit
   http://groups.google.com/group/metalink-discussion . ]]

1.1.  Examples

   A brief Metalink server response with checksum, mirrors, .torrent,
   and OpenPGP signature:

   Link: ; rel="duplicate";
   Link: ; rel="duplicate";
   Link: ; rel="describedby";
   type="torrent";
   Link: ; rel="describedby";
   type="application/pgp-signature";
   Digest: SHA=thvDyvhfIqlvFe+A9MYgxAfm1q5=

1.2.  Notational Conventions

   This specification describes conformance of MetaLinkHeader.

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in BCP 14, [RFC2119], as
   scoped to those conformance targets.

2.  Requirements

   In this context, "Metalink" refers to a MetaLinkHeader which consists
   of mirrors and checksums in HTTP Headers as described in this
   document.  "Metalink XML" refers to the XML format described in
   [draft-bryan-metalink].

   Metalink servers are HTTP servers that MUST have lists of mirrors and
   use the Link header [draft-nottingham-http-link-header] to indicate
   them.  They also MUST provide checksums of files via Instance Digests
   in HTTP [RFC3230].  Mirror and checksum information provided by the
   originating Metalink server is considered authoritative.

   Mirror servers are typically FTP or HTTP servers that "mirror"
   another server.
   That is, they provide identical copies of (at least some) files that
   are also on the mirrored server.  Mirror servers MAY be Metalink
   servers.  Mirror servers MUST support serving partial content.
   Mirror servers SHOULD support Instance Digests in HTTP [RFC3230].

   Metalink clients use the mirrors provided by a Metalink server with
   the Link header [draft-nottingham-http-link-header].  Metalink
   clients MUST support HTTP and MAY support FTP, BitTorrent, or other
   download methods.  Metalink clients MUST switch downloads from one
   mirror to another if the current mirror becomes unreachable.
   Metalink clients are RECOMMENDED to support multi-source, or
   parallel, downloads, where chunks of a file are downloaded from
   multiple mirrors simultaneously (and optionally, Peer-to-Peer).
   Metalink clients MUST support Instance Digests in HTTP [RFC3230] by
   requesting and verifying checksums.  Metalink clients MAY make use of
   digital signatures if they are offered.

3.  Mirrors / Multiple Download Locations

   Mirrors are specified with the Link header
   [draft-nottingham-http-link-header] and a relation type of
   "duplicate" as defined in Section 8.

   A brief Metalink server response with two mirrors only:

   Link: ; rel="duplicate";
   Link: ; rel="duplicate";

   Mirror servers are listed in order of priority.

4.  Peer-to-Peer

   Ways to download a file over Peer-to-Peer networks are specified with
   the Link header [draft-nottingham-http-link-header], a relation type
   of "describedby", and a type parameter of "torrent" for .torrent
   [BITTORRENT] files.

   A brief Metalink server response with .torrent only:

   Link: ; rel="describedby";
   type="torrent";

5.  OpenPGP Signatures

   OpenPGP signatures are specified with the Link header
   [draft-nottingham-http-link-header], a relation type of
   "describedby", and a type parameter of "application/pgp-signature".
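
   As an illustration only, the Link header conventions of Sections 3
   through 5 could be handled by a client roughly as in the following
   Python sketch.  It is not part of this specification.  The function
   names parse_link and classify are hypothetical, the example URIs are
   placeholders, and the parsing is deliberately simplified: it assumes
   one link per header value and ignores quoted semicolons and multiple
   relation types in a single rel parameter.

```python
def parse_link(value):
    """Split one Link header value into (target URI, parameter dict).

    Simplifying assumption: the value holds exactly one link of the
    form <URI>; name="value"; ... with no quoted semicolons.
    """
    target, _, rest = value.partition(">")
    target = target.strip().lstrip("<")
    params = {}
    for part in rest.split(";"):
        if "=" in part:
            name, _, val = part.partition("=")
            params[name.strip().lower()] = val.strip().strip('"')
    return target, params


def classify(link_values):
    """Group Link header values into mirrors, torrents, and signatures."""
    found = {"mirrors": [], "torrents": [], "signatures": []}
    for value in link_values:
        target, params = parse_link(value)
        rel = params.get("rel")
        mtype = params.get("type")
        if rel == "duplicate":
            # Order is preserved, so earlier entries have higher priority.
            found["mirrors"].append(target)
        elif rel == "describedby" and mtype == "torrent":
            found["torrents"].append(target)
        elif rel == "describedby" and mtype == "application/pgp-signature":
            found["signatures"].append(target)
    return found


# Placeholder header values; the URIs are assumptions for illustration.
example = [
    '<http://www2.example.com/example.ext>; rel="duplicate";',
    '<ftp://ftp.example.com/example.ext>; rel="duplicate";',
    '<http://example.com/example.ext.torrent>; rel="describedby"; '
    'type="torrent";',
    '<http://example.com/example.ext.asc>; rel="describedby"; '
    'type="application/pgp-signature";',
]
result = classify(example)
```

   In this sketch the mirrors keep the order in which the server listed
   them, which Section 3 defines as the order of priority.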

   A brief Metalink server response with OpenPGP signature only:

   Link: ; rel="describedby";
   type="application/pgp-signature";

6.  Checksums

6.1.  Checksums of Whole Files

   Instance Digests in HTTP [RFC3230] are used to request and retrieve
   whole file checksums.

   A brief Metalink client request that prefers SHA-1 checksums over
   MD5:

   Want-Digest: MD5;q=0.3, SHA;q=0.8

   A brief Metalink server response with checksum:

   Digest: SHA=thvDyvhfIqlvFe+A9MYgxAfm1q5=

   [[Some publishers will probably desire stronger hashes.]]

6.2.  Checksums of Chunks of Files

   The Content-MD5 header [RFC1864] provides checksums for a chunk, or
   portion, of a file, when requested with a Range header field.

   Negotiation of Content-MD5 is described in [RFC3230].  A checksum for
   a chunk of a file can determine if there has been an error in
   transmission, which means the file is corrupt.  If an error is
   detected in a chunk, then just that chunk can be requested again from
   the current mirror, or a different mirror.

   A brief Metalink client request for Content-MD5 of a portion of a
   file:

   Range: bytes=7433802-
   Want-Digest: contentMD5;q=0.8

   A brief Metalink server response with checksum:

   HTTP/1.1 206 Partial Content
   Accept-Ranges: bytes
   Content-Length: 7433801
   Content-Range: bytes 7433802-14867602/14867603
   Content-MD5: Q2hlY2sgSW50ZWdyaXR5IQ==

   [[Content-MD5 for chunk checksums could lead to many random size
   chunk checksum requests.  Use consistent chunk sizes?  Could we get
   all chunk checksums from the referring Metalink server with
   Content-MD5?  Otherwise, this could also be a lot to ask on a mirror
   network if you don't control it and most servers might not have this
   feature enabled.]]

   [[Alternatively, Metalink XML could be used for chunk checksums but
   that complicates things.]]

7.  Client / Server Multi-source Download Interaction

   Metalink clients begin a download with a standard HTTP [RFC2616] GET
   request to the Metalink server.  Here the client prefers SHA-1
   checksums over MD5:

   GET /distribution/example.ext HTTP/1.1
   Host: www.example.com
   Want-Digest: MD5;q=0.3, SHA;q=0.8

   The Metalink server responds with this:

   HTTP/1.1 200 OK
   Accept-Ranges: bytes
   Content-Length: 14867603
   Content-Type: application/x-cd-image
   Link: ; rel="duplicate";
   Link: ; rel="duplicate";
   Link: ; rel="describedby";
   type="torrent";
   Link: ; rel="describedby";
   type="application/pgp-signature";
   Digest: SHA=thvDyvhfIqlvFe+A9MYgxAfm1q5=

   The Metalink client then contacts the other mirrors requesting a
   portion of the file with the "Range" header field, and using the
   location of the original GET request in the "Referer" header field.
   One of the client requests to a mirror server:

   GET /example.ext HTTP/1.1
   Host: www2.example.com
   Range: bytes=7433802-
   Referer: http://www.example.com/distribution/example.ext

   The mirror servers respond with a 206 Partial Content HTTP status
   code and appropriate "Content-Length" and "Content-Range" header
   fields.  The mirror response to the above request:

   HTTP/1.1 206 Partial Content
   Accept-Ranges: bytes
   Content-Length: 7433801
   Content-Range: bytes 7433802-14867602/14867603

   Once the download has completed, the Metalink client MUST verify the
   checksum of the file.

8.  Link Relation Type Registration: "duplicate"

   o  Relation Name: duplicate

   o  Description: Refers to an identical resource that is a
      byte-for-byte equivalence of representations.

   o  Reference: This specification.

   o  Notes: This relation is for static resources.  That is, an HTTP GET
      request on any duplicate will return the same representation.
      It does not make sense for dynamic or POSTable resources and should
      not be used for them.

9.  Security Considerations

9.1.  URIs and IRIs

   Metalink clients handle URIs and IRIs.  See Section 7 of [RFC3986]
   and Section 8 of [RFC3987] for security considerations related to
   their handling and use.

9.2.  Spoofing

   There is potential for spoofing attacks where the attacker publishes
   Metalinks with false information.  This could deceive unaware
   downloaders into downloading a malicious or worthless file.  Also,
   malicious publishers could attempt a distributed denial of service
   attack by inserting unrelated IRIs into Metalinks.

9.3.  Cryptographic Hashes

   Currently, some of the hash types defined in the IANA registry named
   "Hash Function Textual Names", Instance Digests in HTTP [RFC3230],
   and Content-MD5 header [RFC1864] are considered insecure.  These
   include the whole Message Digest family of algorithms, which are not
   suitable for cryptographically strong verification.  Malicious people
   could provide files that appear to be identical to another file
   because of a collision, i.e. the weak cryptographic hashes of the
   intended file and a substituted malicious file could match.

   If a Metalink contains hashes as described in Section 6, it SHOULD
   include "sha" which is SHA-1, as specified in [RFC3174].  It MAY also
   include other hashes.

9.4.  Signing

   Metalinks should include digital signatures, as described in
   Section 5.

   Digital signatures provide authentication, message integrity, and
   non-repudiation with proof of origin.

10.  References

10.1.  Normative References

   [BITTORRENT]
              Cohen, B., "The BitTorrent Protocol Specification",
              BITTORRENT 11031, February 2008,
              .

   [RFC1864]  Myers, J. and M. Rose, "The Content-MD5 Header Field",
              RFC 1864, October 1995.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC2616]  Fielding, R., Gettys, J., Mogul, J., Frystyk, H.,
              Masinter, L., Leach, P., and T. Berners-Lee, "Hypertext
              Transfer Protocol -- HTTP/1.1", RFC 2616, June 1999.

   [RFC3174]  Eastlake, D. and P. Jones, "US Secure Hash Algorithm 1
              (SHA1)", RFC 3174, September 2001.

   [RFC3230]  Mogul, J. and A. Van Hoff, "Instance Digests in HTTP",
              RFC 3230, January 2002.

   [RFC3986]  Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
              Resource Identifier (URI): Generic Syntax", STD 66,
              RFC 3986, January 2005.

   [RFC3987]  Duerst, M. and M. Suignard, "Internationalized Resource
              Identifiers (IRIs)", RFC 3987, January 2005.

   [draft-nottingham-http-link-header]
              Nottingham, M., "Web Linking",
              draft-nottingham-http-link-header-06 (work in progress),
              July 2009.

10.2.  Informative References

   [draft-bryan-metalink]
              Bryan, A., Ed., Tsujikawa, T., McNab, N., and P. Poeml,
              "The Metalink Download Description Format",
              draft-bryan-metalink-16 (work in progress), August 2009.

Appendix A.  Acknowledgements and Contributors

   Thanks to Daniel Stenberg and Mark Nottingham.

Appendix B.  What's different...?! (to be removed by RFC Editor before
             publication)

   ...or missing, compared to the Metalink XML format
   [draft-bryan-metalink] :

   o  (+) Reuses existing standards without defining much new stuff.
      It's more of a collection/coordinated feature set.
   o  (+) No XML dependency.
   o  (-?) Tied to HTTP, not as generic.  FTP/P2P clients won't be
      using it unless they also support HTTP, unlike Metalink XML.
   o  (---) Requires changes to server software.
   o  (-?) Could require some coordination of all mirror servers for
      all features, which may be difficult or impossible unless you are
      in control of all servers on the mirror network.
   o  (-) Metalink XML can be created by user (or server, but server
      component/changes not required).
   o  (-) Also, Metalink XML files are easily mirrored on all servers.
      Even if usage in that case is not as transparent, it still gives
      access to users at all mirrors (FTP included) to all download
      information with no changes needed to the server.
   o  (-) Not portable/archivable/emailable.  Not as easy for search
      engines to index?
   o  (-) No way to show mirror/p2p geographical location (yet).
   o  (-) No checksums besides MD5/SHA-1 (yet).
   o  (-) Not as rich metadata.
   o  (-) Not able to add multiple files to a download queue or create
      directory structure.

Appendix C.  Document History (to be removed by RFC Editor before
             publication)

   [[ to be removed by the RFC editor before publication as an RFC. ]]

   Known issues concerning this draft:
   o  None.

   -02 : September 6, 2009.
   o  Content-MD5 for chunk checksums.

   -01 : September 1, 2009.
   o  Link Relation Type Registration: "duplicate"

   -00 : August 24, 2009.
   o  Initial draft.

Author's Address

   Anthony Bryan (editor)
   Metalinker Project

   Email: anthonybryan@gmail.com
   URI:   http://www.metalinker.org