Network Working Group                                      A. Bryan, Ed.
Internet-Draft                                        Metalinker Project
Intended status: Standards Track                      September 16, 2009
Expires: March 20, 2010


         MetaLinkHeader: Mirrors and Checksums in HTTP Headers
                      draft-bryan-metalinkhttp-03

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79. This document may contain material
   from IETF Documents or IETF Contributions published or made publicly
   available before November 10, 2008. The person(s) controlling the
   copyright in some of this material may not have granted the IETF
   Trust the right to allow modifications of such material outside the
   IETF Standards Process. Without obtaining an adequate license from
   the person(s) controlling the copyright in such materials, this
   document may not be modified outside the IETF Standards Process, and
   derivative works of it may not be created outside the IETF Standards
   Process, except to format it for publication as an RFC or to
   translate it into languages other than English.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on March 20, 2010.

Copyright Notice

   Copyright (c) 2009 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents in effect on the date of
   publication of this document (http://trustee.ietf.org/license-info).
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.

Abstract

   This document specifies MetaLinkHeader: Mirrors and Checksums in HTTP
   Headers, an alternative to the Metalink XML-based download
   description format. MetaLinkHeader describes multiple download
   locations (mirrors), Peer-to-Peer, checksums, digital signatures, and
   other information using existing standards. Clients can
   transparently use this information to make file transfers more robust
   and reliable.

Table of Contents

   1.  Introduction
     1.1.  Examples
     1.2.  Notational Conventions
   2.  Requirements
   3.  Mirrors / Multiple Download Locations
   4.  Peer-to-Peer
   5.  OpenPGP Signatures
   6.  Checksums
     6.1.  Checksums of Whole Files
     6.2.  Checksums of Chunks of Files
   7.  Client / Server Multi-source Download Interaction
   8.  Link Relation Type Registration: "duplicate"
   9.  Security Considerations
     9.1.  URIs and IRIs
     9.2.  Spoofing
     9.3.  Cryptographic Hashes
     9.4.  Signing
   10. References
     10.1. Normative References
     10.2. Informative References
   Appendix A.  Acknowledgements and Contributors
   Appendix B.  What's different...?! (to be removed by RFC Editor
                before publication)
   Appendix C.  Document History (to be removed by RFC Editor before
                publication)
   Author's Address

1.  Introduction

   MetaLinkHeader is an alternative to Metalink, usually an XML-based
   document format [draft-bryan-metalink]. MetaLinkHeader attempts to
   provide as much functionality as the Metalink XML format by using
   existing standards such as Web Linking
   [draft-nottingham-http-link-header], Instance Digests in HTTP
   [RFC3230], and Content-MD5 [RFC1864]. MetaLinkHeader is used to list
   information about a file to be downloaded.
   This includes lists of multiple URIs (mirrors), Peer-to-Peer
   information, checksums, and digital signatures.

   Identical copies of a file are frequently accessible in multiple
   locations on the Internet over a variety of protocols (FTP, HTTP, and
   Peer-to-Peer). In some cases, users are shown a list of these
   multiple download locations (mirrors) and must manually select a
   single one on the basis of geographical location, priority, or
   bandwidth. This distributes the load across multiple servers. At
   times, individual servers can be slow, outdated, or unreachable, but
   this cannot be determined until the download has been initiated.
   This can lead to the user canceling the download and needing to
   restart it. During downloads, errors in transmission can corrupt the
   file. There are no easy ways to repair these files. For large
   downloads this can be extremely troublesome. Any of the problems
   that can occur during a download leads to frustration on the part of
   users.

   All the information about a download, including mirrors, checksums,
   digital signatures, and more, can be transferred in coordinated HTTP
   Headers. This Metalink transfers the knowledge of the download
   server (and mirror database) to the client. Clients can fall back to
   other mirrors if the current one has an issue. With this knowledge,
   the client is enabled to work its way to a successful download even
   under adverse circumstances. All this is done transparently to the
   user, and the download is much more reliable and efficient. In
   contrast, a traditional HTTP redirect to a mirror conveys only
   minimal information, namely one link to one server, and there is
   no provision in the HTTP protocol to handle failures.
   Other features that some clients provide include multi-source
   downloads, where chunks of a file are downloaded from multiple
   mirrors (and optionally, Peer-to-Peer) simultaneously, which
   frequently results in a faster download.

   [[ Discussion of this draft should take place on IETF HTTP WG mailing
   list at ietf-http-wg@w3.org or the Metalink discussion mailing list
   located at metalink-discussion@googlegroups.com. To join the list,
   visit http://groups.google.com/group/metalink-discussion . ]]

1.1.  Examples

   A brief Metalink server response with checksum, mirrors, .torrent,
   and OpenPGP signature:

   Link: ; rel="duplicate";
   Link: ; rel="duplicate";
   Link: ; rel="describedby";
    type="torrent";
   Link: ; rel="describedby";
    type="application/pgp-signature";
   Digest: SHA=thvDyvhfIqlvFe+A9MYgxAfm1q5=

1.2.  Notational Conventions

   This specification describes conformance of MetaLinkHeader.

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in BCP 14, [RFC2119], as
   scoped to those conformance targets.

2.  Requirements

   In this context, "MetaLink" refers to a MetaLinkHeader which consists
   of mirrors and checksums in HTTP Headers as described in this
   document. "Metalink XML" refers to the XML format described in
   [draft-bryan-metalink].

   Metalink servers are HTTP servers that MUST have lists of mirrors and
   use the Link header [draft-nottingham-http-link-header] to indicate
   them. They also MUST provide checksums of files via Instance Digests
   in HTTP [RFC3230]. Mirror and checksum information provided by the
   originating Metalink server is considered authoritative.

   Mirror servers are typically FTP or HTTP servers that "mirror"
   another server.
   That is, they provide identical copies of (at least some) files that
   are also on the mirrored server. Mirror servers MAY be Metalink
   servers. Mirror servers MUST support serving partial content.
   Mirror servers SHOULD support Instance Digests in HTTP [RFC3230].

   Metalink clients use the mirrors provided by a Metalink server with
   the Link header [draft-nottingham-http-link-header]. Metalink clients
   MUST support HTTP and MAY support FTP, BitTorrent, or other download
   methods. Metalink clients MUST switch downloads from one mirror to
   another if that mirror becomes unreachable. Metalink clients are
   RECOMMENDED to support multi-source, or parallel, downloads, where
   chunks of a file are downloaded from multiple mirrors simultaneously
   (and optionally, Peer-to-Peer). Metalink clients MUST support
   Instance Digests in HTTP [RFC3230] by requesting and verifying
   checksums. Metalink clients MAY make use of digital signatures if
   they are offered.

3.  Mirrors / Multiple Download Locations

   Mirrors are specified with the Link header
   [draft-nottingham-http-link-header] and a relation type of
   "duplicate" as defined in Section 8.

   A brief Metalink server response with two mirrors only:

   Link: ; rel="duplicate";
   Link: ; rel="duplicate";

   Mirror servers are listed in order of priority.

   [[Some organizations have many mirrors. Only send a few mirrors, or
   only use the Link header if Want-Digest is used?]]

4.  Peer-to-Peer

   Ways to download a file over Peer-to-Peer networks are specified with
   the Link header [draft-nottingham-http-link-header], a relation
   type of "describedby", and a type parameter of "torrent" for .torrent
   [BITTORRENT] files.

   A brief Metalink server response with .torrent only:

   Link: ; rel="describedby";
    type="torrent";

5.  OpenPGP Signatures

   OpenPGP signatures are specified with the Link header
   [draft-nottingham-http-link-header], a relation type of
   "describedby", and a type parameter of "application/pgp-signature".

   A brief Metalink server response with OpenPGP signature only:

   Link: ; rel="describedby";
    type="application/pgp-signature";

6.  Checksums

6.1.  Checksums of Whole Files

   Instance Digests in HTTP [RFC3230] are used to request and retrieve
   whole file checksums.

   A brief Metalink client request that prefers SHA-1 checksums over
   MD5:

   Want-Digest: MD5;q=0.3, SHA;q=0.8

   A brief Metalink server response with checksum:

   Digest: SHA=thvDyvhfIqlvFe+A9MYgxAfm1q5=

   [[Some publishers will probably desire stronger hashes.]]

6.2.  Checksums of Chunks of Files

   The Content-MD5 header [RFC1864] provides checksums for a chunk, or
   portion, of a file, when requested with a Range header field.

   Negotiation of Content-MD5 is described in [RFC3230]. A checksum for
   a chunk of a file can determine if there has been an error in
   transmission, which means the file is corrupt. If an error is
   detected in a chunk, then just that chunk can be requested again from
   the current mirror, or a different mirror.

   A brief Metalink client request for Content-MD5 of a portion of a
   file:

   Range: bytes=7433802-
   Want-Digest: contentMD5;q=0.8

   A brief Metalink server response with checksum:

   HTTP/1.1 206 Partial Content
   Accept-Ranges: bytes
   Content-Length: 7433801
   Content-Range: bytes 7433802-14867602/14867603
   Content-MD5: Q2hlY2sgSW50ZWdyaXR5IQ==

   [[Content-MD5 for chunk checksums could lead to many random size
   chunk checksum requests. Use consistent chunk sizes? Could we get
   all chunk checksums from the referring Metalink server with
   Content-MD5?
   Otherwise, this could also be a lot to ask on a mirror network
   if you don't control it and most servers might not have this feature
   enabled.]]

   [[Alternatively, Metalink XML could be used for chunk checksums but
   that complicates things.]]

7.  Client / Server Multi-source Download Interaction

   Metalink clients begin a download with a standard HTTP [RFC2616] GET
   request to the Metalink server. Here the client prefers SHA-1
   checksums over MD5:

   GET /distribution/example.ext HTTP/1.1
   Host: www.example.com
   Want-Digest: MD5;q=0.3, SHA;q=0.8

   Alternatively, Metalink clients can use a HEAD request to discover
   mirrors via Link headers. After that, they follow with a GET request
   as usual.

   The Metalink server responds with this:

   HTTP/1.1 200 OK
   Accept-Ranges: bytes
   Content-Length: 14867603
   Content-Type: application/x-cd-image
   Link: ; rel="duplicate";
   Link: ; rel="duplicate";
   Link: ; rel="describedby";
    type="torrent";
   Link: ; rel="describedby";
    type="application/pgp-signature";
   Digest: SHA=thvDyvhfIqlvFe+A9MYgxAfm1q5=

   The Metalink client then contacts the other mirrors requesting a
   portion of the file with the "Range" header field, and using the
   location of the original GET request in the "Referer" header field.
   One of the client requests to a mirror server:

   GET /example.ext HTTP/1.1
   Host: www2.example.com
   Range: bytes=7433802-
   Referer: http://www.example.com/distribution/example.ext

   The mirror servers respond with a 206 Partial Content HTTP status
   code and appropriate "Content-Length" and "Content-Range" header
   fields. The mirror response to the above request:

   HTTP/1.1 206 Partial Content
   Accept-Ranges: bytes
   Content-Length: 7433801
   Content-Range: bytes 7433802-14867602/14867603

   Once the download has completed, the Metalink client MUST verify the
   checksum of the file.

8.  Link Relation Type Registration: "duplicate"

   o  Relation Name: duplicate

   o  Description: Refers to a resource whose available representations
      are byte-for-byte identical with the corresponding representations
      of the context IRI.

   o  Reference: This specification.

   o  Notes: This relation is for static resources. That is, an HTTP GET
      request on any duplicate will return the same representation. It
      does not make sense for dynamic or POSTable resources and should
      not be used for them.

9.  Security Considerations

9.1.  URIs and IRIs

   Metalink clients handle URIs and IRIs. See Section 7 of [RFC3986]
   and Section 8 of [RFC3987] for security considerations related to
   their handling and use.

9.2.  Spoofing

   There is potential for spoofing attacks where the attacker publishes
   Metalinks with false information. This could deceive unaware
   downloaders into downloading a malicious or worthless file. Also,
   malicious publishers could attempt a distributed denial of service
   attack by inserting unrelated IRIs into Metalinks.

9.3.  Cryptographic Hashes

   Currently, some of the hash types defined in Instance Digests in HTTP
   [RFC3230] and Content-MD5 header [RFC1864] are considered insecure.
   These include the whole Message Digest family of algorithms, which
   are not suitable for cryptographically strong verification. Malicious
   people could provide files that appear to be identical to another
   file because of a collision, i.e. the weak cryptographic hashes of
   the intended file and a substituted malicious file could match.

   If a Metalink contains hashes as described in Section 6, it SHOULD
   include "sha" which is SHA-1, as specified in [RFC3174]. It MAY also
   include other hashes.

9.4.  Signing

   Metalinks should include digital signatures, as described in
   Section 5.
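
   Before a client can act on a signature or a whole-file checksum, it
   has to pick both out of the response headers. The following is a
   minimal Python sketch of that step, not a definitive implementation.
   It assumes the Link values carry the target URI in angle brackets as
   in Web Linking, and that the Digest value is the base64-encoded SHA-1
   of the file as in Section 6.1. The function names and the example
   URIs are hypothetical and do not come from this specification.

```python
import base64
import hashlib
import re


def parse_link(value):
    """Split one Link header value into (uri, params)."""
    match = re.match(r'\s*<([^>]*)>(.*)', value)
    if match is None:
        raise ValueError("Link value must start with <URI>")
    uri, rest = match.groups()
    # Collect ;name="value" and ;name=value parameters.
    pairs = re.findall(r';\s*([\w-]+)\s*=\s*(?:"([^"]*)"|([^;\s]+))', rest)
    params = {name.lower(): quoted or bare for name, quoted, bare in pairs}
    return uri, params


def signature_links(link_values):
    """Return the URIs of links marked as OpenPGP signatures (Section 5)."""
    found = []
    for value in link_values:
        uri, params = parse_link(value)
        if (params.get("rel") == "describedby"
                and params.get("type") == "application/pgp-signature"):
            found.append(uri)
    return found


def sha1_digest_matches(data, digest_value):
    """Compare received bytes with a 'SHA=<base64>' Digest header value."""
    for item in digest_value.split(","):
        algorithm, _, encoded = item.strip().partition("=")
        if algorithm.upper() == "SHA":
            return hashlib.sha1(data).digest() == base64.b64decode(encoded)
    return False  # no SHA-1 digest was offered


# Hypothetical header values, for illustration only.
links = [
    '<http://www2.example.com/example.ext>; rel="duplicate";',
    '<http://example.com/example.ext.asc>; rel="describedby"; '
    'type="application/pgp-signature";',
]
body = b"example file contents"
header = "SHA=" + base64.b64encode(hashlib.sha1(body).digest()).decode("ascii")

print(signature_links(links))
print(sha1_digest_matches(body, header))
```

   The sketch only locates the signature and compares the digest.
   Checking the OpenPGP signature itself requires an OpenPGP
   implementation and a trusted public key, which are outside its scope.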

   Digital signatures provide authentication, message integrity, and
   non-repudiation with proof of origin.

10.  References

10.1.  Normative References

   [BITTORRENT]
              Cohen, B., "The BitTorrent Protocol Specification",
              BITTORRENT 11031, February 2008,
              .

   [RFC1864]  Myers, J. and M. Rose, "The Content-MD5 Header Field",
              RFC 1864, October 1995.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC2616]  Fielding, R., Gettys, J., Mogul, J., Frystyk, H.,
              Masinter, L., Leach, P., and T. Berners-Lee, "Hypertext
              Transfer Protocol -- HTTP/1.1", RFC 2616, June 1999.

   [RFC3174]  Eastlake, D. and P. Jones, "US Secure Hash Algorithm 1
              (SHA1)", RFC 3174, September 2001.

   [RFC3230]  Mogul, J. and A. Van Hoff, "Instance Digests in HTTP",
              RFC 3230, January 2002.

   [RFC3986]  Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
              Resource Identifier (URI): Generic Syntax", STD 66,
              RFC 3986, January 2005.

   [RFC3987]  Duerst, M. and M. Suignard, "Internationalized Resource
              Identifiers (IRIs)", RFC 3987, January 2005.

   [draft-nottingham-http-link-header]
              Nottingham, M., "Web Linking",
              draft-nottingham-http-link-header-06 (work in progress),
              July 2009.

10.2.  Informative References

   [draft-bryan-metalink]
              Bryan, A., Ed., Tsujikawa, T., McNab, N., and P. Poeml,
              "The Metalink Download Description Format",
              draft-bryan-metalink-16 (work in progress), August 2009.

Appendix A.  Acknowledgements and Contributors

   Thanks to Daniel Stenberg, Mark Nottingham, and Neil McNab.

Appendix B.  What's different...?! (to be removed by RFC Editor before
             publication)

   ...or missing, compared to the Metalink XML format
   [draft-bryan-metalink] :

   o  (+) Reuses existing standards without defining much new stuff.
      It's more of a collection/coordinated feature set.
   o  (+) No XML dependency.
   o  (-?) Tied to HTTP, not as generic. FTP/P2P clients won't be
      using it unless they also support HTTP, unlike Metalink XML.
   o  (---) Requires changes to server software.
   o  (-?) Could require some coordination of all mirror servers for
      all features, which may be difficult or impossible unless you are
      in control of all servers on the mirror network.
   o  (-) Metalink XML can be created by user (or server, but server
      component/changes not required).
   o  (-) Also, Metalink XML files are easily mirrored on all servers.
      Even if usage in that case is not as transparent, it still gives
      access to users at all mirrors (FTP included) to all download
      information with no changes needed to the server.
   o  (-) Not portable/archivable/emailable. Not as easy for search
      engines to index?
   o  (-) No way to show mirror/p2p geographical location (yet).
   o  (-) No checksums besides MD5/SHA-1 (yet).
   o  (-) Not as rich metadata.
   o  (-) Not able to add multiple files to a download queue or create
      directory structure.

Appendix C.  Document History (to be removed by RFC Editor before
             publication)

   [[ to be removed by the RFC editor before publication as an RFC. ]]

   Known issues concerning this draft:
   o  None.

   -03 : September 16, 2009.
   o  Mention HEAD request, negotiate mirrors if Want-Digest is used.

   -02 : September 6, 2009.
   o  Content-MD5 for chunk checksums.

   -01 : September 1, 2009.
   o  Link Relation Type Registration: "duplicate"

   -00 : August 24, 2009.
   o  Initial draft.

Author's Address

   Anthony Bryan (editor)
   Metalinker Project

   Email: anthonybryan@gmail.com
   URI:   http://www.metalinker.org