idnits 2.17.1

draft-bryan-metalinkhttp-05.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** The document seems to lack a License Notice according IETF Trust
     Provisions of 28 Dec 2009, Section 6.b.ii or Provisions of 12 Sep 2009
     Section 6.b -- however, there's a paragraph with a matching beginning.
     Boilerplate error?

     (You're using the IETF Trust Provisions' Section 6.b License Notice from
     12 Feb 2009 rather than one of the newer Notices. See
     https://trustee.ietf.org/license-info/.)

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an IANA Considerations section. (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == The document seems to contain a disclaimer for pre-RFC5378 work, but was
     first submitted on or after 10 November 2008. The disclaimer is usually
     necessary only for documents that revise or obsolete older RFCs, and that
     take significant amounts of text from those RFCs. If you can contact all
     authors of the source material and they are willing to grant the BCP78
     rights to the IETF Trust, you can and should remove the disclaimer.
     Otherwise, the disclaimer is needed and you can ignore this comment.
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)
  -- The document date (September 19, 2009) is 5327 days in the past. Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  ** Obsolete normative reference: RFC 2616 (Obsoleted by RFC 7230, RFC 7231,
     RFC 7232, RFC 7233, RFC 7234, RFC 7235)

  ** Downref: Normative reference to an Informational RFC: RFC 3174

  ** Obsolete normative reference: RFC 3230 (Obsoleted by RFC 9530)

     Summary: 5 errors (**), 0 flaws (~~), 2 warnings (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

Network Working Group                                      A. Bryan, Ed.
Internet-Draft                                                  N. McNab
Intended status: Standards Track                      Metalinker Project
Expires: March 23, 2010                               September 19, 2009

         MetaLinkHeader: Mirrors and Checksums in HTTP Headers
                      draft-bryan-metalinkhttp-05

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79. This document may contain material
   from IETF Documents or IETF Contributions published or made publicly
   available before November 10, 2008. The person(s) controlling the
   copyright in some of this material may not have granted the IETF
   Trust the right to allow modifications of such material outside the
   IETF Standards Process. Without obtaining an adequate license from
   the person(s) controlling the copyright in such materials, this
   document may not be modified outside the IETF Standards Process, and
   derivative works of it may not be created outside the IETF Standards
   Process, except to format it for publication as an RFC or to
   translate it into languages other than English.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on March 23, 2010.

Copyright Notice

   Copyright (c) 2009 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents in effect on the date of
   publication of this document (http://trustee.ietf.org/license-info).
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.

Abstract

   This document specifies MetaLinkHeader: Mirrors and Checksums in HTTP
   Headers, an alternative to the Metalink XML-based download
   description format. MetaLinkHeader describes multiple download
   locations (mirrors), Peer-to-Peer, checksums, digital signatures, and
   other information using existing standards. Clients can
   transparently use this information to make file transfers more robust
   and reliable.

Table of Contents

   1.  Introduction
     1.1.  Examples
     1.2.  Notational Conventions
   2.  Requirements
   3.  Mirrors / Multiple Download Locations
   4.  Peer-to-Peer Descriptions / Metainfo
   5.  OpenPGP Signatures
   6.  Checksums
     6.1.  Checksums of Whole Files
     6.2.  Checksums of Chunks of Files
   7.  Client / Server Multi-source Download Interaction
   8.  Link Relation Type Registration: "duplicate"
   9.  Security Considerations
     9.1.  URIs and IRIs
     9.2.  Spoofing
     9.3.  Cryptographic Hashes
     9.4.  Signing
   10. References
     10.1. Normative References
     10.2. Informative References
   Appendix A.  Acknowledgements and Contributors
   Appendix B.  Comparisons to Similar Options (to be removed by
                RFC Editor before publication)
   Appendix C.  Document History (to be removed by RFC Editor
                before publication)
   Authors' Addresses

1.  Introduction

   MetaLinkHeader is an alternative to Metalink, usually an XML-based
   document format [draft-bryan-metalink]. MetaLinkHeader attempts to
   provide as much functionality as the Metalink XML format by using
   existing standards such as Web Linking
   [draft-nottingham-http-link-header], Instance Digests in HTTP
   [RFC3230], and Content-MD5 [RFC1864]. MetaLinkHeader is used to list
   information about a file to be downloaded.
   This includes lists of multiple URIs (mirrors), Peer-to-Peer
   information, checksums, and digital signatures.

   Identical copies of a file are frequently accessible in multiple
   locations on the Internet over a variety of protocols (FTP, HTTP, and
   Peer-to-Peer). In some cases, users are shown a list of these
   multiple download locations (mirrors) and must manually select a
   single one on the basis of geographical location, priority, or
   bandwidth. This distributes the load across multiple servers. At
   times, individual servers can be slow, outdated, or unreachable, but
   this cannot be determined until the download has been initiated.
   This can lead to the user canceling the download and needing to
   restart it. During downloads, errors in transmission can corrupt the
   file. There are no easy ways to repair these files. For large
   downloads this can be extremely troublesome. Any of the problems
   that can occur during a download leads to frustration on the part of
   users.

   All the information about a download, including mirrors, checksums,
   digital signatures, and more can be transferred in coordinated HTTP
   Headers. This Metalink transfers the knowledge of the download
   server (and mirror database) to the client. Clients can fall back to
   other mirrors if the current one has an issue. With this knowledge,
   the client is enabled to work its way to a successful download even
   under adverse circumstances. All this is done transparently to the
   user and the download is much more reliable and efficient. In
   contrast, a traditional HTTP redirect to a mirror conveys only
   extremely minimal information - one link to one server, and there is
   no provision in the HTTP protocol to handle failures.
   Other features that some clients provide include multi-source
   downloads, where chunks of a file are downloaded from multiple
   mirrors (and optionally, Peer-to-Peer) simultaneously, which
   frequently results in a faster download.

   [[ Discussion of this draft should take place on IETF HTTP WG mailing
   list at ietf-http-wg@w3.org or the Metalink discussion mailing list
   located at metalink-discussion@googlegroups.com. To join the list,
   visit http://groups.google.com/group/metalink-discussion . ]]

1.1.  Examples

   A brief Metalink server response with checksum, mirrors, .metalink,
   and OpenPGP signature:

   Link: ; rel="duplicate";
   Link: ; rel="duplicate";
   Link: ; rel="describedby";
      type="application/metalink4+xml";
   Link: ; rel="describedby";
      type="application/pgp-signature";
   Digest: SHA=thvDyvhfIqlvFe+A9MYgxAfm1q5=

1.2.  Notational Conventions

   This specification describes conformance of MetaLinkHeader.

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in BCP 14, [RFC2119], as
   scoped to those conformance targets.

2.  Requirements

   In this context, "MetaLink" refers to a MetaLinkHeader which consists
   of mirrors and checksums in HTTP Headers as described in this
   document. "Metalink XML" refers to the XML format described in
   [draft-bryan-metalink].

   Metalink servers are HTTP servers that MUST have lists of mirrors and
   use the Link header [draft-nottingham-http-link-header] to indicate
   them. They also MUST provide checksums of files via Instance Digests
   in HTTP [RFC3230]. Mirror and checksum information provided by the
   originating Metalink server MUST be considered authoritative.
   Metalink servers and their associated mirror servers SHOULD all share
   the same ETag policy, i.e.
   base it on the file contents (checksum) and not server-unique
   filesystem metadata. The emitted ETag may be implemented the same as
   the Instance Digest for simplicity.

   Mirror servers are typically FTP or HTTP servers that "mirror"
   another server. That is, they provide identical copies of (at least
   some) files that are also on the mirrored server. Mirror servers MAY
   be Metalink servers. Mirror servers MUST support serving partial
   content. Mirror servers SHOULD support Instance Digests in HTTP
   [RFC3230].

   Metalink clients use the mirrors provided by a Metalink server with
   the Link header [draft-nottingham-http-link-header]. Metalink clients
   MUST support HTTP and MAY support FTP, BitTorrent, or other download
   methods. Metalink clients MUST switch downloads from one mirror to
   another if the current mirror becomes unreachable. Metalink clients
   are RECOMMENDED to support multi-source, or parallel, downloads,
   where chunks of a file are downloaded from multiple mirrors
   simultaneously (and optionally, from Peer-to-Peer sources). Metalink
   clients MUST support Instance Digests in HTTP [RFC3230] by requesting
   and verifying checksums. Metalink clients MAY make use of digital
   signatures if they are offered.

3.  Mirrors / Multiple Download Locations

   Mirrors are specified with the Link header
   [draft-nottingham-http-link-header] and a relation type of
   "duplicate" as defined in Section 8.

   A brief Metalink server response with two mirrors only:

   Link: ; rel="duplicate";
   Link: ; rel="duplicate";

   Mirror servers are listed in order of priority.

   [[Some organizations have many mirrors. Only send a few mirrors, or
   only use the Link header if Want-Digest is used?]]

4.  Peer-to-Peer Descriptions / Metainfo

   Ways to download a file over Peer-to-Peer networks are specified with
   the Link header [draft-nottingham-http-link-header] and a relation
   type of "describedby" and a type parameter that indicates the MIME
   type of the metadata available at the IRI.

   A brief Metalink server response with .metalink only:

   Link: ; rel="describedby";
      type="application/metalink4+xml";

5.  OpenPGP Signatures

   OpenPGP signatures are specified with the Link header
   [draft-nottingham-http-link-header] and a relation type of
   "describedby" and a type parameter of "application/pgp-signature".

   A brief Metalink server response with OpenPGP signature only:

   Link: ; rel="describedby";
      type="application/pgp-signature";

6.  Checksums

6.1.  Checksums of Whole Files

   Instance Digests in HTTP [RFC3230] are used to request and retrieve
   whole file checksums.

   A brief Metalink client request that prefers SHA-1 checksums over
   MD5:

   Want-Digest: MD5;q=0.3, SHA;q=0.8

   A brief Metalink server response with checksum:

   Digest: SHA=thvDyvhfIqlvFe+A9MYgxAfm1q5=

6.2.  Checksums of Chunks of Files

   The Content-MD5 header [RFC1864] provides checksums for a chunk, or
   portion, of a file, when requested with a Range header field.

   Negotiation of Content-MD5 is described in [RFC3230]. A checksum for
   a chunk of a file can determine if there has been an error in
   transmission, which means the file is corrupt. If an error is
   detected in a chunk, then just that chunk can be requested again from
   the current mirror, or a different mirror.
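
   As a rough illustration of the checksum handling described in
   Section 6, the sketch below computes the values a client would
   compare against the Digest and Content-MD5 header fields. It is
   written in Python using only the standard library. The function
   names are hypothetical and are not defined by this document, and
   the sketch assumes both header values carry the base64 encoding of
   the raw hash, as in the examples in this section.

```python
import base64
import hashlib


def digest_header_value(data: bytes) -> str:
    """Return an RFC 3230 style Digest value using SHA-1, e.g. 'SHA=...'."""
    raw = hashlib.sha1(data).digest()
    return "SHA=" + base64.b64encode(raw).decode("ascii")


def content_md5_value(chunk: bytes) -> str:
    """Return the base64-encoded MD5 of a chunk, as carried in Content-MD5."""
    raw = hashlib.md5(chunk).digest()
    return base64.b64encode(raw).decode("ascii")


def chunk_is_intact(chunk: bytes, content_md5_header: str) -> bool:
    """Compare a received chunk against the Content-MD5 value from the server."""
    return content_md5_value(chunk) == content_md5_header.strip()


def file_is_intact(data: bytes, digest_header: str) -> bool:
    """Compare a completed download against a 'SHA=...' Digest header value."""
    return digest_header_value(data) == digest_header.strip()
```

   A client following this approach would call chunk_is_intact() on
   each received range and request only a failed chunk again, then
   call file_is_intact() once the whole file has been assembled.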

   A brief Metalink client request for Content-MD5 of a portion of a
   file:

   Range: bytes=7433802-
   Want-Digest: contentMD5;q=0.8

   A brief Metalink server response with checksum:

   HTTP/1.1 206 Partial Content
   Accept-Ranges: bytes
   Content-Length: 7433801
   Content-Range: bytes 7433802-14867602/14867603
   Content-MD5: Q2hlY2sgSW50ZWdyaXR5IQ==

   [[Content-MD5 for chunk checksums could lead to many random size
   chunk checksum requests. Use consistent chunk sizes? Could we get
   all chunk checksums from the referring Metalink server with
   Content-MD5? Otherwise, this could also be a lot to ask on a mirror
   network if you don't control it and most servers might not have this
   feature enabled.]]

7.  Client / Server Multi-source Download Interaction

   Metalink clients begin a download with a standard HTTP [RFC2616] GET
   request to the Metalink server. Here the client prefers SHA-1
   checksums over MD5:

   GET /distribution/example.ext HTTP/1.1
   Host: www.example.com
   Want-Digest: MD5;q=0.3, SHA;q=0.8

   Alternatively, a Metalink client can use a HEAD request to discover
   mirrors via Link headers. The client then follows up with a GET
   request as usual.

   The Metalink server responds with this:

   HTTP/1.1 200 OK
   Accept-Ranges: bytes
   Content-Length: 14867603
   Content-Type: application/x-cd-image
   Link: ; rel="duplicate";
   Link: ; rel="duplicate";
   Link: ; rel="describedby";
      type="application/metalink4+xml";
   Link: ; rel="describedby";
      type="application/pgp-signature";
   Digest: SHA=thvDyvhfIqlvFe+A9MYgxAfm1q5=

   The Metalink client then contacts the other mirrors requesting a
   portion of the file with the "Range" header field, and using the
   location of the original GET request in the "Referer" header field.
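
   How a client divides a file among sources is left to the
   implementation. The following is a minimal sketch in Python,
   assuming an even split of the Content-Length across the available
   sources. The helper names plan_ranges and range_header are
   hypothetical and are not part of this document. The sketch emits
   closed ranges, whereas the requests in this document use the
   open-ended form "bytes=7433802-" for the final portion.

```python
from typing import List, Tuple


def plan_ranges(total_length: int, sources: int) -> List[Tuple[int, int]]:
    """Split total_length bytes into at most `sources` contiguous ranges.

    Each tuple is (first_byte, last_byte), inclusive, as used in Range.
    """
    if total_length <= 0 or sources <= 0:
        return []
    chunk = -(-total_length // sources)  # ceiling division
    ranges = []
    for start in range(0, total_length, chunk):
        end = min(start + chunk, total_length) - 1
        ranges.append((start, end))
    return ranges


def range_header(start: int, end: int) -> str:
    """Format one planned range as a Range header field value."""
    return f"bytes={start}-{end}"
```

   With the 14867603-byte file used in this document and two sources,
   plan_ranges(14867603, 2) gives a second range that starts at byte
   7433802 and ends at byte 14867602.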
   One of the client requests to a mirror server:

   GET /example.ext HTTP/1.1
   Host: www2.example.com
   Range: bytes=7433802-
   Referer: http://www.example.com/distribution/example.ext

   The mirror servers respond with a 206 Partial Content HTTP status
   code and appropriate "Content-Length" and "Content-Range" header
   fields. The mirror response to the above request:

   HTTP/1.1 206 Partial Content
   Accept-Ranges: bytes
   Content-Length: 7433801
   Content-Range: bytes 7433802-14867602/14867603

   Once the download has completed, the Metalink client MUST verify the
   checksum of the file.

8.  Link Relation Type Registration: "duplicate"

   o  Relation Name: duplicate

   o  Description: Refers to a resource whose available representations
      are byte-for-byte identical with the corresponding representations
      of the context IRI.

   o  Reference: This specification.

   o  Notes: This relation is for static resources. That is, an HTTP GET
      request on any duplicate will return the same representation. It
      does not make sense for dynamic or POSTable resources and should
      not be used for them.

9.  Security Considerations

9.1.  URIs and IRIs

   Metalink clients handle URIs and IRIs. See Section 7 of [RFC3986]
   and Section 8 of [RFC3987] for security considerations related to
   their handling and use.

9.2.  Spoofing

   There is potential for spoofing attacks where the attacker publishes
   Metalinks with false information. This could deceive unaware
   downloaders into downloading a malicious or worthless file. Also,
   malicious publishers could attempt a distributed denial of service
   attack by inserting unrelated IRIs into Metalinks.

9.3.  Cryptographic Hashes

   Currently, some of the hash types defined in Instance Digests in HTTP
   [RFC3230] and Content-MD5 header [RFC1864] are considered insecure.

   These include the whole Message Digest family of algorithms which are
   not suitable for cryptographically strong verification. Malicious
   people could provide files that appear to be identical to another
   file because of a collision, i.e. the weak cryptographic hashes of
   the intended file and a substituted malicious file could match.

   If a Metalink contains hashes as described in Section 6, it SHOULD
   include "sha" which is SHA-1, as specified in [RFC3174]. It MAY also
   include other hashes.

9.4.  Signing

   Metalinks should include digital signatures, as described in
   Section 5.

   Digital signatures provide authentication, message integrity, and
   non-repudiation with proof of origin.

10.  References

10.1.  Normative References

   [RFC1864]  Myers, J. and M. Rose, "The Content-MD5 Header Field",
              RFC 1864, October 1995.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC2616]  Fielding, R., Gettys, J., Mogul, J., Frystyk, H.,
              Masinter, L., Leach, P., and T. Berners-Lee, "Hypertext
              Transfer Protocol -- HTTP/1.1", RFC 2616, June 1999.

   [RFC3174]  Eastlake, D. and P. Jones, "US Secure Hash Algorithm 1
              (SHA1)", RFC 3174, September 2001.

   [RFC3230]  Mogul, J. and A. Van Hoff, "Instance Digests in HTTP",
              RFC 3230, January 2002.

   [RFC3986]  Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
              Resource Identifier (URI): Generic Syntax", STD 66,
              RFC 3986, January 2005.

   [RFC3987]  Duerst, M. and M. Suignard, "Internationalized Resource
              Identifiers (IRIs)", RFC 3987, January 2005.

   [draft-nottingham-http-link-header]
              Nottingham, M., "Web Linking",
              draft-nottingham-http-link-header-06 (work in progress),
              July 2009.

10.2.  Informative References

   [draft-bryan-metalink]
              Bryan, A., Ed., Tsujikawa, T., McNab, N., and P. Poeml,
              "The Metalink Download Description Format",
              draft-bryan-metalink-16 (work in progress), August 2009.

Appendix A.  Acknowledgements and Contributors

   Thanks to Mark Nottingham, Daniel Stenberg and Henrik Nordstrom.

Appendix B.  Comparisons to Similar Options (to be removed by RFC Editor
             before publication)

   ...or missing, compared to the Metalink XML format
   [draft-bryan-metalink] :

   o  (+) Reuses existing HTTP standards without defining anything new
      besides a Link Relation Type. It's more of a
      collection/coordinated feature set.
   o  (+) No XML dependency.
   o  (-?) Tied to HTTP, not as generic. FTP/P2P clients won't be
      using it unless they also support HTTP, unlike Metalink XML.
   o  (---) Requires changes to server software.
   o  (-?) Could require some coordination of all mirror servers for
      all features, which may be difficult or impossible unless you are
      in control of all servers on the mirror network.
   o  (-) Metalink XML can be created by user (or server, but server
      component/changes not required).
   o  (-) Also, Metalink XML files are easily mirrored on all servers.
      Even if usage in that case is not as transparent, it still gives
      access to users at all mirrors (FTP included) to all download
      information with no changes needed to the server.
   o  (-) Not portable/archivable/emailable. Metalink XML is used to
      import/export transfer queues. Not as easy for search engines to
      index?
   o  (-) No way to show mirror/p2p geographical location (yet).
   o  (-) No checksums besides MD5/SHA-1 (yet).
   o  (-) Not as rich metadata.
   o  (-) Not able to add multiple files to a download queue or create
      directory structure.

   ...or missing, draft-ford-http-multi-server compared to this draft :

   o  (+) Can define mirrors for whole directories.
   o  (-?) Defines new headers. Doesn't reuse existing standards.
   o  (---) Requires changes to server software.
   o  (---) Requires coordination of all mirror servers for all
      features, which may be difficult or impossible unless you are in
      control of all servers on the mirror network.
   o  (---) Doesn't tie in p2p.
   o  (-) No way to show mirror/p2p priority or geographical location
      (yet).

Appendix C.  Document History (to be removed by RFC Editor before
             publication)

   [[ to be removed by the RFC editor before publication as an RFC. ]]

   Known issues concerning this draft:
   o  Mirror negotiation. Only send a few mirrors, or only send them if
      Want-Digest is used? Some organizations have many mirrors.
   o  Some publishers desire stronger hashes than MD5 and SHA-1.
   o  Content-MD5 for chunk checksums could lead to many random size
      chunk checksum requests. Use consistent chunk sizes?
   o  Do we want a way to show that whole directories are mirrored,
      instead of individual files?
   o  Need an official MIME type for .torrent files.

   -05 : September , 2009.
   o  ETags, preferably matching the Instance Digests.

   -04 : September 17, 2009.
   o  Remove .torrent until it has an official MIME type.

   -03 : September 16, 2009.
   o  Mention HEAD request, negotiate mirrors if Want-Digest is used.

   -02 : September 6, 2009.
   o  Content-MD5 for chunk checksums.

   -01 : September 1, 2009.
   o  Link Relation Type Registration: "duplicate"

   -00 : August 24, 2009.
   o  Initial draft.

Authors' Addresses

   Anthony Bryan (editor)
   Metalinker Project

   Email: anthonybryan@gmail.com
   URI:   http://www.metalinker.org

   Neil McNab
   Metalinker Project

   Email: nabber00@gmail.com
   URI:   http://www.nabber.org