Network Working Group                                           A. Bryan
Internet-Draft                                                  N. McNab
Intended status: Standards Track                            T. Tsujikawa
Expires: August 31, 2011
                                                                P. Poeml
                                                             MirrorBrain
                                                            H. Nordstrom
                                                       February 27, 2011


 Metalink/HTTP: Mirrors and Cryptographic Hashes in HTTP Header Fields
                      draft-bryan-metalinkhttp-21

Abstract

   This document specifies Metalink/HTTP: Mirrors and Cryptographic
   Hashes in HTTP header fields, a different way to get information that
   is usually contained in the Metalink XML-based download description
   format. Metalink/HTTP describes multiple download locations
   (mirrors), Peer-to-Peer, cryptographic hashes, digital signatures,
   and other information using existing standards for HTTP header
   fields. Clients can use this information to make file transfers more
   robust and reliable.

Editorial Note (To be removed by RFC Editor)

   Discussion of this draft should take place on the HTTPBIS working
   group mailing list (ietf-http-wg@w3.org), although this draft is not
   a WG item.

   The changes in this draft are summarized in Appendix C.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on August 31, 2011.

Copyright Notice

   Copyright (c) 2011 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.
   Please review these documents carefully, as they describe your
   rights and restrictions with respect to this document. Code
   Components extracted from this document must include Simplified BSD
   License text as described in Section 4.e of the Trust Legal
   Provisions and are provided without warranty as described in the
   Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Example Metalink Server Response
     1.2.  Notational Conventions
   2.  Requirements
   3.  Mirrors / Multiple Download Locations
     3.1.  Mirror Priority
     3.2.  Mirror Geographical Location
     3.3.  Coordinated Mirror Policies
     3.4.  Mirror Depth
   4.  Peer-to-Peer / Metainfo
     4.1.  Metalink/XML Files
   5.  Signatures
     5.1.  OpenPGP Signatures
     5.2.  S/MIME Signatures
   6.  Cryptographic Hashes of Whole Documents
   7.  Client / Server Multi-source Download Interaction
     7.1.  Error Prevention, Detection, and Correction
       7.1.1.  Error Prevention (Early File Mismatch Detection)
       7.1.2.  Error Correction
   8.  IANA Considerations
   9.  Security Considerations
     9.1.  URIs and IRIs
     9.2.  Spoofing
     9.3.  Cryptographic Hashes
     9.4.  Signing
   10. References
     10.1.  Normative References
     10.2.  Informative References
   Appendix A.  Acknowledgements and Contributors
   Appendix B.  Comparisons to Similar Options
   Appendix C.  Document History
   Authors' Addresses

1.  Introduction

   Metalink/HTTP is an alternative and complementary representation of
   Metalink information, which is usually presented as an XML-based
   document format [RFC5854]. Metalink/HTTP attempts to provide as much
   functionality as the Metalink/XML format by using existing standards
   such as Web Linking [RFC5988], Instance Digests in HTTP [RFC3230],
   and Entity Tags (also known as ETags) [RFC2616]. Metalink/HTTP is
   used to list information about a file to be downloaded. This can
   include lists of multiple URIs (mirrors), Peer-to-Peer information,
   cryptographic hashes, and digital signatures.

   Identical copies of a file are frequently accessible in multiple
   locations on the Internet over a variety of protocols (such as FTP,
   HTTP, and Peer-to-Peer). In some cases, users are shown a list of
   these multiple download locations (mirrors) and must manually select
   a single one on the basis of geographical location, priority, or
   bandwidth. This distributes the load across multiple servers, and
   should also increase throughput and resilience. At times, however,
   individual servers can be slow, outdated, or unreachable, but this
   cannot be determined until the download has been initiated.
   Users rarely have sufficient information to choose the most
   appropriate server and often choose the first in a list, which might
   not be optimal for their needs and leads to a particular server
   getting a disproportionate share of load. The use of suboptimal
   mirrors can lead to the user canceling and restarting the download to
   try to manually find a better source. During downloads, errors in
   transmission can corrupt the file. There are no easy ways to repair
   these files. For large downloads this can be extremely troublesome.
   Any of the problems that can occur during a download can frustrate
   users.

   Some popular sites automate the process of selecting mirrors using
   DNS load balancing, both to approximately balance load between
   servers, and to direct clients to nearby servers with the hope that
   this improves throughput. Indeed, DNS load balancing can balance
   long-term server load fairly effectively, but it is less effective at
   delivering the best throughput to users when the bottleneck is not
   the server but the network.

   This document describes a mechanism by which the benefit of mirrors
   can be automatically and more effectively realized. All the
   information about a download, including mirrors, cryptographic
   hashes, digital signatures, and more can be transferred in
   coordinated HTTP header fields hereafter referred to as a Metalink.
   This Metalink transfers the knowledge of the download server (and
   mirror database) to the client. Clients can fall back to other
   mirrors if the current one has an issue. With this knowledge, the
   client is enabled to work its way to a successful download even under
   adverse circumstances. All this can be done without complicated user
   interaction and the download can be much more reliable and efficient.
   In contrast, a traditional HTTP redirect to a mirror conveys only
   minimal information: one link to one server, and there is no
   provision in the HTTP protocol to handle failures. Furthermore,
   in order to provide better load distribution across servers and
   potentially faster downloads to users, Metalink/HTTP facilitates
   multi-source downloads, where portions of a file are downloaded from
   multiple mirrors (and optionally, Peer-to-Peer) simultaneously.

   Upon connection to a Metalink/HTTP server, a client will receive
   information about other sources of the same resource and a
   cryptographic hash of the whole resource. The client will then be
   able to request chunks of the file from the various sources,
   scheduling appropriately in order to maximize the download rate.

1.1.  Example Metalink Server Response

   This example shows a brief Metalink server response with ETag,
   mirrors, .meta4, OpenPGP signature, and a cryptographic hash of the
   whole file:

   Etag: "thvDyvhfIqlvFe+A9MYgxAfm1q5="
   Link: ; rel=duplicate
   Link: ; rel=duplicate
   Link: ; rel=describedby;
   type="application/x-bittorrent"
   Link: ; rel=describedby;
   type="application/metalink4+xml"
   Link: ; rel=describedby;
   type="application/pgp-signature"
   Digest: SHA-256=MWVkMWQxYTRiMzk5MDQ0MzI3NGU5NDEyZTk5OWY1ZGFmNzgyZTJlO
   DYzYjRjYzFhOTlmNTQwYzI2M2QwM2U2MQ==

1.2.  Notational Conventions

   This specification describes conformance of Metalink/HTTP.

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in BCP 14, [RFC2119], as
   scoped to those conformance targets.

2.  Requirements

   In this context, "Metalink" refers to Metalink/HTTP which consists of
   mirrors and cryptographic hashes in HTTP header fields as described
   in this document. "Metalink/XML" refers to the XML format described
   in [RFC5854].

   Metalink resources include Link header fields [RFC5988] to present a
   list of mirrors in the response to a client request for the resource.
   Metalink servers MUST include the cryptographic hash of a resource
   via Instance Digests in HTTP [RFC3230]. Algorithms used in the
   Instance Digest field are registered in the IANA registry named
   "Hypertext Transfer Protocol (HTTP) Digest Algorithm Values" at
   .
   This document restricts the use of these algorithms. SHA-256 and
   SHA-512 were added to the registry by [RFC5843]. If a Metalink
   contains whole file hashes as described in Section 6, it SHOULD
   include SHA-256, as specified in [FIPS-180-3], or stronger. It MAY
   also include other hashes.

   Metalink servers are HTTP servers with one or more Metalink
   resources. Metalink servers MUST support the Link header fields for
   listing mirrors and MUST support Instance Digests in HTTP [RFC3230].
   Metalink servers MUST return the same Link header fields and Instance
   Digests on HEAD requests. Metalink servers and their associated
   mirror servers SHOULD all share the same ETag policy. It is up to
   the administrator of the Metalink server to communicate the details
   of the shared ETag policy to the administrators of the mirror servers
   so that the mirror servers can be configured with the same ETag
   policy. To have the same ETag policy means that ETags are
   synchronized across servers for resources that are mirrored, i.e.
   byte-for-byte identical files will have the same ETag on mirrors that
   they have on the Metalink server. For example, it would be better to
   derive an ETag from a cryptographic hash of the file contents than
   from server-unique filesystem metadata. Metalink servers SHOULD offer
   Metalink/XML documents that contain cryptographic hashes of parts of
   the file (and other information) if error recovery is desirable.

   Mirror servers are typically FTP or HTTP servers that "mirror"
   another server. That is, they provide identical copies of (at least
   some) files that are also on the mirrored server. Mirror servers
   SHOULD support serving partial content. HTTP mirror servers SHOULD
   share the same ETag policy as the originating Metalink server. HTTP
   mirror servers SHOULD support Instance Digests in HTTP [RFC3230]
   using the same algorithm as the Metalink server. Optimally, mirror
   servers will share the same ETag policy and support Instance Digests
   in HTTP. Mirror servers that share the same ETag policy and/or
   support Instance Digests in HTTP using the same algorithm as a
   Metalink server are known as preferred mirror servers.

   Metalink clients use the mirrors provided by a Metalink server in
   Link header fields [RFC5988], but only those provided by the initial
   Metalink server they contacted. If Metalink clients find Link header
   fields [RFC5988] for listing mirrors from mirrors, they MUST discard
   such Link header fields [RFC5988] to prevent a possible infinite
   loop. Metalink clients MUST support HTTP and SHOULD support FTP
   [RFC0959]. Metalink clients MAY support BitTorrent [BITTORRENT], or
   other download methods. Metalink clients SHOULD switch downloads
   from one mirror to another if a mirror becomes unreachable. Metalink
   clients MAY support multi-source, or parallel, downloads, where
   portions of a file can be downloaded from multiple mirrors
   simultaneously (and optionally, from Peer-to-Peer sources). Metalink
   clients MUST support Instance Digests in HTTP [RFC3230] by requesting
   and verifying cryptographic hashes. Metalink clients SHOULD support
   error recovery by using the cryptographic hashes of parts of the file
   listed in Metalink/XML files. Metalink clients SHOULD support
   checking digital signatures.

3.  Mirrors / Multiple Download Locations

   Mirrors are specified with the Link header fields [RFC5988] and a
   relation type of "duplicate" as defined in Section 8.

   The following OPTIONAL attributes are defined:
   o  "depth" : mirror depth in Section 3.4.
   o  "geo" : mirror geographical location in Section 3.2.
   o  "pref" : a preferred mirror server in Section 3.3.
   o  "pri" : mirror priority in Section 3.1.

   This example shows a brief Metalink server response with two mirrors
   only:

   Link: ; rel=duplicate;
   pri=1; pref
   Link: ; rel=duplicate;
   pri=2; geo=gb; depth=1

   As some organizations can have many mirrors, it is up to the
   organization to configure the number of Link header fields the
   Metalink server will provide. Such a decision could be a random
   selection or a hard-coded limit based on network proximity, file
   size, server load, or other factors.

3.1.  Mirror Priority

   Entries for mirror servers are listed in order of priority (from most
   preferred to least) or have a "pri" value, where mirrors with lower
   values are used first.

   This is purely an expression of the server's preferences; it is up to
   the client what it does with this information, particularly with
   reference to how many servers to use at any one time.

3.2.  Mirror Geographical Location

   Entries for a mirror server MAY have a "geo" value, which is an
   [ISO3166-1] alpha-2 two-letter country code for the geographical
   location of the physical server the URI is used to access. A client
   MAY use this information to select a mirror, or set of mirrors, that
   are geographically near (if the client has access to such
   information), with the aim of reducing network load at inter-country
   bottlenecks.

3.3.  Coordinated Mirror Policies

   There are two types of mirror servers: preferred and normal.
   Preferred mirror servers are HTTP mirror servers that MUST share the
   same ETag policy as the originating Metalink server and/or MUST
   provide Instance Digests using the same algorithm as the Metalink
   server. Preferred mirrors make it possible for Metalink clients to
   detect early on, before data is transferred, if the file requested
   matches the desired file. This early file mismatch detection is
   described in Section 7.1.1. Entries for preferred HTTP mirror
   servers have a "pref" value. By default, if "pref" is unspecified,
   mirrors are considered "normal" and do not necessarily share the same
   ETag policy or support Instance Digests using the same algorithm as
   the Metalink server. FTP mirrors are considered "normal", as they do
   not emit ETags or support Instance Digests.

3.4.  Mirror Depth

   Some mirrors can mirror single files, whole directories, or multiple
   directories.

   Entries for mirror servers can have a "depth" value, where "depth=0"
   is the default. A value of 0 means only that file is mirrored and
   that other URI path segments are not. A value of 1 means that file
   and all other files and URI path segments contained in the rightmost
   URI path segment are mirrored. For values of N, N-1 URI path
   segments closer to the Host are mirrored. A value of 2 means one URI
   path segment closer to the Host is mirrored, and all files and URI
   path segments contained are mirrored. For each higher value, another
   URI path segment closer to the Host is mirrored.

   This example shows a mirror with a depth value of 4:

   Link: ;
   rel=duplicate; pri=1; pref; depth=4

   In the above example, 4 URI path segments closer to the Host are
   mirrored, from /dir2/ and all files and directories included.

4.  Peer-to-Peer / Metainfo

   Entries for metainfo files, which describe ways to download a file
   over Peer-to-Peer networks or otherwise, are specified with the Link
   header fields [RFC5988] and a relation type of "describedby" and a
   type parameter that indicates the MIME type of the metadata available
   at the URI. Since metainfo files can sometimes describe multiple
   files, or the filename might not be the same on the Metalink server
   and in the metainfo file but still have the same content, an OPTIONAL
   "name" attribute can be used.

   The following OPTIONAL attribute is defined:
   o  "name" : a file described within the metainfo file.

   This example shows a brief Metalink server response with .torrent and
   .meta4:

   Link: ; rel=describedby;
   type="application/x-bittorrent"; name="differentname.ext"
   Link: ; rel=describedby;
   type="application/metalink4+xml"

   Metalink clients MAY support the use of metainfo files for
   downloading files.

4.1.  Metalink/XML Files

   Metalink/XML files for a given resource MAY be provided in a Link
   header field as shown in the example in Section 4. Metalink/XML
   files are specified in [RFC5854] and they are particularly useful for
   providing metadata such as cryptographic hashes of parts of a file,
   allowing a client to recover from errors (see Section 7.1.2).
   Metalink servers SHOULD provide Metalink/XML files with partial file
   hashes in Link header fields and Metalink clients SHOULD use them for
   error recovery.

5.  Signatures

5.1.  OpenPGP Signatures

   OpenPGP signatures [RFC3156] of requested files are specified with
   the Link header fields [RFC5988] and a relation type of "describedby"
   and a type parameter of "application/pgp-signature".

   This example shows a brief Metalink server response with OpenPGP
   signature only:

   Link: ; rel=describedby;
   type="application/pgp-signature"

   Metalink clients SHOULD support the use of OpenPGP signatures.

5.2.  S/MIME Signatures

   S/MIME signatures [RFC5751] of requested files are specified with the
   Link header fields [RFC5988] and a relation type of "describedby" and
   a type parameter of "application/pkcs7-mime".

   This example shows a brief Metalink server response with S/MIME
   signature only:

   Link: ; rel=describedby;
   type="application/pkcs7-mime"

   Metalink clients SHOULD support the use of S/MIME signatures.

6.  Cryptographic Hashes of Whole Documents

   If Instance Digests are not provided by the Metalink servers, the
   Link header fields pertaining to this specification MUST be ignored.

   This example shows a brief Metalink server response with ETag,
   mirror, and cryptographic hash:

   Etag: "thvDyvhfIqlvFe+A9MYgxAfm1q5="
   Link: ; rel=duplicate
   Digest: SHA-256=MWVkMWQxYTRiMzk5MDQ0MzI3NGU5NDEyZTk5OWY1ZGFmNzgyZTJlO
   DYzYjRjYzFhOTlmNTQwYzI2M2QwM2U2MQ==

7.  Client / Server Multi-source Download Interaction

   Metalink clients begin a download with a standard HTTP [RFC2616] GET
   request to the Metalink server. Metalink clients MAY use a Range
   limit if desired.

   GET /distribution/example.ext HTTP/1.1
   Host: www.example.com

   The Metalink server responds with the data and these header fields:

   HTTP/1.1 200 OK
   Accept-Ranges: bytes
   Content-Length: 14867603
   Content-Type: application/x-cd-image
   Etag: "thvDyvhfIqlvFe+A9MYgxAfm1q5="
   Link: ; rel=duplicate; pref
   Link: ; rel=duplicate
   Link: ; rel=describedby;
   type="application/x-bittorrent"
   Link: ; rel=describedby;
   type="application/metalink4+xml"
   Link: ; rel=describedby;
   type="application/pgp-signature"
   Digest: SHA-256=MWVkMWQxYTRiMzk5MDQ0MzI3NGU5NDEyZTk5OWY1ZGFmNzgyZTJlO
   DYzYjRjYzFhOTlmNTQwYzI2M2QwM2U2MQ==

   Alternatively, Metalink clients can begin with a HEAD request to the
   Metalink server to discover mirrors via Link header fields, and then
   skip to making the following decisions on every available mirror
   server found via the Link header fields.

   After that, the client follows with a GET request to the desired
   mirrors.

   From the Metalink server response the client learns some or all of
   the following metadata about the requested object, in addition to
   also starting to receive the object:

   o  Mirror profile link, which can describe the mirror's priority,
      whether it shares the ETag policy of the originating Metalink
      server, geographical location, and mirror depth.
   o  Instance Digest, which is the whole file cryptographic hash.
   o  ETag.
   o  Object size from the Content-Length header field.
   o  Metalink/XML, which can include partial file cryptographic hashes
      to repair a file.
   o  Peer-to-peer information.
   o  Digital signature.
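
   The mirror metadata listed above can be pulled out of the response
   header fields roughly as follows. This Python sketch is not part of
   this specification. The function names, the example.com URIs in the
   usage section, and the NO_PRI fallback value are assumptions made
   for illustration. The parser is deliberately simple: it expects one
   link per Link header field and no ";" inside quoted values.

```python
from typing import Dict, List, Tuple, Union

ParamValue = Union[str, bool]
Link = Tuple[str, Dict[str, ParamValue]]

# Assumed sort key for mirrors that carry no "pri" attribute.
NO_PRI = 10**6


def parse_link(value: str) -> Link:
    """Split one Link field value into (target URI, attributes)."""
    target, _, rest = value.partition(">")
    uri = target.strip().lstrip("<")
    params: Dict[str, ParamValue] = {}
    for part in rest.split(";"):
        part = part.strip()
        if not part:
            continue
        name, has_value, raw = part.partition("=")
        # Valueless attributes such as "pref" are recorded as True.
        value_or_flag = raw.strip().strip('"') if has_value else True
        params[name.strip().lower()] = value_or_flag
    return uri, params


def mirrors_by_priority(headers: List[Tuple[str, str]]) -> List[Link]:
    """Return rel=duplicate links, lowest "pri" value first."""
    links = [parse_link(v) for k, v in headers if k.lower() == "link"]
    duplicates = [link for link in links if link[1].get("rel") == "duplicate"]
    # sorted() is stable: mirrors without "pri" keep their header order
    # and follow every mirror that declares a priority.
    return sorted(duplicates, key=lambda link: int(link[1].get("pri", NO_PRI)))


if __name__ == "__main__":
    # Hypothetical header fields; the URIs are placeholders.
    response_headers = [
        ("Link", "<http://www3.example.com/example.ext>; rel=duplicate;"
                 " pri=2; geo=gb; depth=1"),
        ("Link", "<http://www2.example.com/example.ext>; rel=duplicate;"
                 " pri=1; pref"),
        ("Link", "<http://www.example.com/example.ext.meta4>;"
                 ' rel=describedby; type="application/metalink4+xml"'),
    ]
    for uri, attrs in mirrors_by_priority(response_headers):
        print(uri, attrs)
```

   A client would typically try mirrors carrying "pref" first among
   those of equal priority, since only those allow the If-Match check
   described in this section.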

   Next, the Metalink client requests a Range of the object from a
   preferred mirror server, so it can use If-Match conditions:

   GET /example.ext HTTP/1.1
   Host: www2.example.com
   Range: bytes=7433802-
   If-Match: "thvDyvhfIqlvFe+A9MYgxAfm1q5="
   Referer: http://www.example.com/distribution/example.ext

   Metalink clients SHOULD use preferred mirrors, if possible, as they
   allow early file mismatch detection as described in Section 7.1.1.
   Preferred mirrors have coordinated ETags, as described in
   Section 3.3, and Metalink clients SHOULD use If-Match conditions
   based on the ETag to quickly detect out-of-date mirrors by using the
   ETag from the Metalink server response. Metalink clients SHOULD use
   partial file cryptographic hashes as described in Section 7.1.2, if
   available, to detect if the mirror server returned the correct data.

   Optimally, the mirror server also will include an Instance Digest in
   the mirror response to the client GET request, which the client can
   also use to detect a mismatch early. Metalink clients MUST reject
   individual downloads from mirrors that support Instance Digests if
   the Instance Digest from the mirror does not match the Instance
   Digest as reported by the Metalink server and the same algorithm is
   used. If normal mirrors are used, then a mismatch cannot be
   detected until the completed object is verified. Errors in
   transmission and substitutions of incorrect data on mirrors, whether
   deliberate or accidental, can be detected with error correction as
   described in Section 7.1.2.

   Here, the preferred mirror server has the correct file (the If-Match
   conditions match) and responds with a 206 Partial Content HTTP status
   code and appropriate "Content-Length", "Content-Range", ETag, and
   Instance Digest header fields.
   In this example, the mirror server
   responds, with data, to the above request:

   HTTP/1.1 206 Partial Content
   Accept-Ranges: bytes
   Content-Length: 7433801
   Content-Range: bytes 7433802-14867602/14867603
   Etag: "thvDyvhfIqlvFe+A9MYgxAfm1q5="
   Digest: SHA-256=MWVkMWQxYTRiMzk5MDQ0MzI3NGU5NDEyZTk5OWY1ZGFmNzgyZTJlO
   DYzYjRjYzFhOTlmNTQwYzI2M2QwM2U2MQ==

   If the object is large and is delivered more slowly than expected,
   then the Metalink client MAY start a number of parallel ranged
   downloads (one per selected mirror server other than the first) using
   mirrors provided by the Link header fields with "duplicate" relation
   type. Metalink clients MUST limit the number of parallel connections
   to mirror servers, ideally based on observing how the aggregate
   throughput changes as connections are opened. It would be pointless
   to blindly open connections once the path bottleneck is filled.
   Metalink clients SHOULD use the location of the original GET request
   in the "Referer" header field for these ranged requests.

   The Metalink client can determine the size and number of ranges
   requested from each server, based upon the type and number of mirrors
   and performance observed from each mirror. Note that Range requests
   impose an overhead on servers and clients need to be aware of that
   and not abuse them. Metalink clients SHOULD NOT make more than one
   concurrent Range request to each mirror server that they download
   from.

   Metalink clients SHOULD close all but the fastest connection if any
   Ranged requests generated after the first request end up with a
   complete response, instead of a partial response (as some mirrors
   might not support HTTP ranges), if the goal is the fastest transfer.
   Metalink clients MAY monitor mirror conditions and dynamically switch
   between mirrors to achieve the fastest download possible.
   Similarly,
   Metalink clients SHOULD abort extremely slow or stalled range
   requests and finish the request on other mirrors. If all ranges have
   finished except for the final one, the Metalink client can split the
   final range into multiple range requests to other mirrors so the
   transfer finishes faster.

   If the first request was GET and no Range header field was sent and
   the client determines later that it will issue a Range request, then
   the client SHOULD close the first connection when it catches up with
   the other parallel ranged downloads of the same object. This means
   the first connection was sacrificed. Metalink clients can use a HEAD
   request first, if possible, so that the client can find out if there
   are any Link header fields, and then Range-based requests are
   undertaken to the mirror servers without sacrificing a first
   connection.

   Metalink clients MUST reject individual downloads from mirrors where
   the file size does not match the file size as reported by the
   Metalink server.

   If a Metalink client does not support certain download methods (such
   as FTP or BitTorrent) that a file is available from, and there are no
   available download methods that the client supports, then the
   download will have no way to complete.

   Metalink clients MUST verify the cryptographic hash of the file once
   the download has completed. If the cryptographic hash offered by the
   Metalink server with Instance Digests does not match the
   cryptographic hash of the downloaded file, see Section 7.1.2 for a
   possible way to repair errors.

   If the download cannot be repaired, it is considered corrupt. The
   client can attempt to re-download the file.

   Metalink clients that support verifying digital signatures MUST
   verify digital signatures of requested files if they are included.
   Digital signatures MUST validate back to a trust anchor as described
   in the validation rules in [RFC3156] and [RFC5280].

7.1.  Error Prevention, Detection, and Correction

   Error prevention, or early file mismatch detection, is possible
   before file transfers with the use of file sizes, ETags, and Instance
   Digests provided by Metalink servers. Error detection requires
   Instance Digests to detect errors in transfer after the transfers
   have completed. Error correction, or download repair, is possible
   with partial file cryptographic hashes.

   Note that cryptographic hashes obtained from Instance Digests are in
   base64 encoding, while those from Metalink/XML are in hexadecimal.

7.1.1.  Error Prevention (Early File Mismatch Detection)

   In HTTP terms, the merging of ranges from multiple responses SHOULD
   be verified with a strong validator, which in this context is either
   an Instance Digest or a shared ETag from that Metalink server that
   matches the one provided by a preferred mirror server. In most
   cases, it is sufficient that the Metalink server provides mirrors and
   Instance Digest information, but operation will be more robust and
   efficient if the mirror servers do implement a shared ETag policy or
   Instance Digests as well. There is no need to specify how the ETag
   is generated, just that it needs to be shared between the Metalink
   server and the mirror servers. The benefit of having mirror servers
   return an Instance Digest is that the client then can detect
   mismatches early even if ETags are not used. Mirrors that support
   both a shared ETag and Instance Digests do provide value, but just
   one is sufficient for early detection of mismatches. If the mirror
   server provides neither shared ETag nor Instance Digest, then early
   detection of mismatches is not possible unless file length also
   differs.
   Finally, errors are still detectable after the download has
   completed, when the cryptographic hash of the merged response is
   verified.

   ETags cannot be used for verifying the integrity of the received
   content. If the ETag given by the mirror server matches the ETag
   given by the Metalink server, then the Metalink client assumes the
   responses are valid for that object.

   This guarantees that a mismatch will be detected by using only the
   shared ETag from a Metalink server and mirror server. Mirror servers
   will respond with an error if ETags do not match, which will prevent
   accidental merges of ranges from different versions of files with the
   same name.

   A shared ETag or Instance Digest cannot strictly protect against
   malicious attacks or server or network errors replacing content. An
   attacker can make a mirror server seemingly respond with the expected
   Instance Digest or ETags even if the file contents have been
   modified. The same goes for various system failures which would also
   cause bad data (i.e. corrupted files) to be returned. The Metalink
   client has to rely on the Instance Digest returned by the Metalink
   server in the first response for the verification of the downloaded
   object as a whole. To verify the individual ranges, which might have
   been requested from different sources, see Section 7.1.2.

7.1.2.  Error Correction

   Partial file cryptographic hashes can be used to detect errors during
   the download. Metalink servers SHOULD provide Metalink/XML files
   with partial file hashes in Link header fields as specified in
   Section 4.1, and Metalink clients SHOULD use them for error
   correction.

   An error in transfer or a substitution attack will be detected by a
   cryptographic hash of the object not matching the Instance Digest
   from the Metalink server.
   If the cryptographic hash of the object
   does not match the Instance Digest from the Metalink server, then the
   client SHOULD fetch the Metalink/XML (if available). This may
   contain partial file cryptographic hashes which will allow detection
   of which mirror server returned incorrect data. Metalink clients
   SHOULD use the Metalink/XML data to figure out what ranges of the
   downloaded data can be recovered and what needs to be fetched again.

   Other methods can be used for error correction. For example, some
   other metainfo files also include partial file hashes that can be
   used to check for errors.

8. IANA Considerations

   Accordingly, IANA will make the following registration to the Link
   Relation Type registry at .

   o Relation Name: duplicate

   o Description: Refers to a resource whose available representations
     are byte-for-byte identical with the corresponding representations
     of the context IRI.

   o Reference: This specification.

   o Notes: This relation is for static resources. That is, an HTTP GET
     request on any duplicate will return the same representation. It
     does not make sense for dynamic or POSTable resources and should
     not be used for them.

9. Security Considerations

9.1. URIs and IRIs

   Metalink clients handle URIs and IRIs. See Section 7 of [RFC3986]
   and Section 8 of [RFC3987] for security considerations related to
   their handling and use.

9.2. Spoofing

   There is potential for spoofing attacks where the attacker publishes
   Metalinks with false information. In that case, this could deceive
   unaware downloaders into downloading a malicious or worthless file.
   As with all downloads, users should only download from trusted
   sources. Also, malicious publishers could attempt a distributed
   denial of service attack by inserting unrelated URIs into Metalinks.
   [RFC4732] contains information on amplification attacks and denial of
   service attacks.
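
   As a non-normative illustration of how a client might apply the
   whole-object verification from Section 7.1 to guard against
   substituted files, the following Python sketch checks downloaded
   data against a base64 Instance Digest and converts that digest to
   the hexadecimal form used by Metalink/XML. The function names are
   hypothetical and are not defined by this document, and the sketch
   assumes a single SHA-256 value of the form "SHA-256=<base64>".

```python
import base64
import hashlib
import hmac


def digest_b64_to_hex(b64_value: str) -> str:
    """Convert a base64 Instance Digest value to lowercase hexadecimal,
    the encoding used for hashes in Metalink/XML."""
    return base64.b64decode(b64_value, validate=True).hex()


def matches_instance_digest(data: bytes, digest_value: str) -> bool:
    """Check data against a Digest value such as 'SHA-256=<base64>'.

    Assumption: only a single SHA-256 value is handled in this sketch;
    any other algorithm, or a malformed value, yields False.
    """
    # Split on the first "=" only, so base64 padding stays in the value.
    algorithm, _, b64_value = digest_value.strip().partition("=")
    if algorithm.upper() != "SHA-256" or not b64_value:
        return False
    try:
        expected = base64.b64decode(b64_value, validate=True)
    except ValueError:
        # binascii.Error (invalid base64) is a subclass of ValueError.
        return False
    actual = hashlib.sha256(data).digest()
    # Constant-time comparison of the raw digest bytes.
    return hmac.compare_digest(expected, actual)
```

   A client following this approach would treat a False result as a
   failed download and could then fall back to the repair procedure in
   Section 7.1.2.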

9.3. Cryptographic Hashes

   Currently, some of the digest values defined in Instance Digests in
   HTTP [RFC3230] are considered insecure. These include the whole
   Message Digest family of algorithms which are not suitable for
   cryptographically strong verification. Malicious people could
   provide files that appear to be identical to another file because of
   a collision, i.e. the weak cryptographic hashes of the intended file
   and a substituted malicious file could match.

9.4. Signing

   Metalinks SHOULD include digital signatures, as described in
   Section 5.

   Digital signatures provide authentication, message integrity, and
   enable non-repudiation with proof of origin.

10. References

10.1. Normative References

   [BITTORRENT]
              Cohen, B., "The BitTorrent Protocol Specification",
              BITTORRENT 11031, February 2008,
              .

   [FIPS-180-3]
              National Institute of Standards and Technology (NIST),
              "Secure Hash Standard (SHS)", FIPS PUB 180-3,
              October 2008.

   [RFC0959]  Postel, J. and J. Reynolds, "File Transfer Protocol",
              STD 9, RFC 0959, October 1985.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC2616]  Fielding, R., Gettys, J., Mogul, J., Frystyk, H.,
              Masinter, L., Leach, P., and T. Berners-Lee, "Hypertext
              Transfer Protocol -- HTTP/1.1", RFC 2616, June 1999.

   [RFC3156]  Elkins, M., Del Torto, D., Levien, R., and T. Roessler,
              "MIME Security with OpenPGP", RFC 3156, August 2001.

   [RFC3230]  Mogul, J. and A. Van Hoff, "Instance Digests in HTTP",
              RFC 3230, January 2002.

   [RFC3986]  Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
              Resource Identifier (URI): Generic Syntax", STD 66,
              RFC 3986, January 2005.

   [RFC3987]  Duerst, M. and M. Suignard, "Internationalized Resource
              Identifiers (IRIs)", RFC 3987, January 2005.

   [RFC5280]  Cooper, D., Santesson, S., Farrell, S., Boeyen, S.,
              Housley, R., and W. Polk, "Internet X.509 Public Key
              Infrastructure Certificate and Certificate Revocation List
              (CRL) Profile", RFC 5280, May 2008.

   [RFC5751]  Ramsdell, B. and S. Turner, "Secure/Multipurpose Internet
              Mail Extensions (S/MIME) Version 3.2 Message
              Specification", RFC 5751, January 2010.

   [RFC5854]  Bryan, A., Tsujikawa, T., McNab, N., and P. Poeml, "The
              Metalink Download Description Format", RFC 5854,
              June 2010.

   [RFC5988]  Nottingham, M., "Web Linking", RFC 5988, October 2010.

10.2. Informative References

   [ISO3166-1]
              International Organization for Standardization,
              "ISO 3166-1:2006. Codes for the representation of names
              of countries and their subdivisions -- Part 1: Country
              codes", November 2006.

   [RFC4732]  Handley, M., Rescorla, E., and IAB, "Internet Denial-of-
              Service Considerations", RFC 4732, December 2006.

   [RFC5843]  Bryan, A., "Additional Hash Algorithms for HTTP Instance
              Digests", RFC 5843, April 2010.

Appendix A. Acknowledgements and Contributors

   Thanks to the Metalink community, Alexey Melnikov, Julian Reschke,
   Mark Nottingham, Daniel Stenberg, Matt Domsch, Micah Cowan, David
   Morris, Yves Lafon, Juergen Schoenwaelder, Ben Campbell, Lars Eggert,
   Sean Turner, Robert Sparks, and the HTTPBIS Working Group.

   Thanks to Alan Ford and Mark Handley for spurring us on to publish
   this document.

   This document is dedicated to Zimmy Bryan and Juanita Anthony.

Appendix B. Comparisons to Similar Options

   [[ to be removed by the RFC editor before publication as an RFC. ]]

   This draft, compared to the Metalink/XML format [RFC5854] :

   o (+) Reuses existing HTTP standards without much new besides a Link
     Relation Type. It's more of a collection/coordinated feature set.
   o (?) The existing standards don't seem to be widely implemented.
   o (+) No XML dependency, except for Metalink/XML for partial file
     cryptographic hashes.
   o (+) Existing Metalink/XML clients can be easily converted to
     support this as well.
   o (+) Coordination of mirror servers is preferred, but not required.
     Coordination could be difficult or impossible unless one group is
     in control of all servers on the mirror network.
   o (-) Requires software or configuration changes to originating
     server.
   o (-?) Tied to HTTP, not as generic. FTP/P2P clients won't be
     using it unless they also support HTTP, unlike Metalink/XML.
   o (-) Requires server-side support. Metalink/XML can be created by
     user (or server, but server component/changes not required).
   o (-) Also, Metalink/XML files are easily mirrored on all servers.
     Even if usage in that case is not as transparent, this method
     still gives access to all download information (with no changes
     needed to servers) from all mirrors (FTP included).
   o (-) Not portable/archivable/emailable. Metalink/XML is used to
     import/export transfer queues. Not as easy for search engines to
     index?
   o (-) Not as rich metadata.
   o (-) Not able to add multiple files to a download queue or create
     directory structure.

Appendix C. Document History

   [[ to be removed by the RFC editor before publication as an RFC. ]]

   Known issues concerning this draft:
   o None.

   -21 : February 27, 2011.
   o IESG review.

   -20 : February 14, 2011.
   o Yves Lafon's apps-team review, Juergen Schoenwaelder's secdir
     review, Ben Campbell's Gen-ART review.

   -19 : January 20, 2011.
   o Julian Reschke's review.

   -18 : January 1, 2010.
   o AD review by Alexey Melnikov.

   -17 : September 13, 2010.
   o RFC 5854 Metalink/XML.

   -16 : April 16, 2010.
   o Add draft-ietf-ftpext2-hash reference and FTP mirror coordination.

   -15 : February 20, 2010.
   o Update references and terminology.

   -14 : December 31, 2009.
   o Baseline file hash: SHA-256.

   -13 : November 22, 2009.
   o Metalink/XML for partial file cryptographic hashes.

   -12 : November 11, 2009.
   o Clarifications.

   -11 : October 23, 2009.
   o Mirror changes.

   -10 : October 15, 2009.
   o Mirror coordination changes.

   -09 : October 13, 2009.
   o Mirror location, coordination, and depth.
   o Split HTTP Digest Algorithm Values Registration into
     draft-bryan-http-digest-algorithm-values-update.

   -08 : October 4, 2009.
   o Clarifications.

   -07 : September 29, 2009.
   o Preferred mirror servers.

   -06 : September 24, 2009.
   o Add Mismatch Detection, Error Recovery, and Digest Algorithm
     values.
   o Remove Content-MD5 and Want-Digest.

   -05 : September 19, 2009.
   o ETags, preferably matching the Instance Digests.

   -04 : September 17, 2009.
   o Temporarily remove .torrent.

   -03 : September 16, 2009.
   o Mention HEAD request, negotiate mirrors if Want-Digest is used.

   -02 : September 7, 2009.
   o Content-MD5 for partial file cryptographic hashes.

   -01 : September 1, 2009.
   o Link Relation Type Registration: "duplicate"

   -00 : August 24, 2009.
   o Initial draft.

Authors' Addresses

   Anthony Bryan
   Pompano Beach, FL
   USA

   Email: anthonybryan@gmail.com
   URI: http://www.metalinker.org

   Neil McNab

   Email: neil@nabber.org
   URI: http://www.nabber.org

   Tatsuhiro Tsujikawa
   Shiga
   Japan

   Email: tatsuhiro.t@gmail.com
   URI: http://aria2.sourceforge.net

   Dr. med. Peter Poeml
   MirrorBrain
   Venloer Str. 317
   Koeln 50823
   DE

   Phone: +49 221 6778 333 8
   Email: peter@poeml.de
   URI: http://mirrorbrain.org/~poeml/

   Henrik Nordstrom

   Email: henrik@henriknordstrom.net
   URI: http://www.henriknordstrom.net/