idnits 2.17.1 

draft-ford-http-multi-server-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** The document seems to lack a License Notice according IETF Trust
     Provisions of 28 Dec 2009, Section 6.b.ii or Provisions of 12 Sep 2009
     Section 6.b -- however, there's a paragraph with a matching beginning.
     Boilerplate error?

     (You're using the IETF Trust Provisions' Section 6.b License Notice from
     12 Feb 2009 rather than one of the newer Notices.  See
     https://trustee.ietf.org/license-info/.)


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  == There are 1 instance of lines with non-RFC2606-compliant FQDNs in the
     document.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD',
     or 'RECOMMENDED' is not an accepted usage according to RFC 2119.  Please
     use uppercase 'NOT' together with RFC 2119 keywords (if that is what you
     mean).
     
     Found 'SHOULD not' in this paragraph:
     
     The cache lifetime is the time which this mirrored information can
     be cached for, before it SHOULD be considered stale and SHOULD not be
     used.  Caching the mirror list is most important for mirrors of
     directories, where small files can be requested directly from the mirrors
     without needing to request the first chunk from the initial server.

  -- The document date (July 6, 2009) is 5401 days in the past.  Is this
     intentional?


  Checking references for intended status: Experimental
  ----------------------------------------------------------------------------

  -- Obsolete informational reference (is this intentional?): RFC 2616
     (Obsoleted by RFC 7230, RFC 7231, RFC 7232, RFC 7233, RFC 7234, RFC 7235)


     Summary: 1 error (**), 0 flaws (~~), 3 warnings (==), 2 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Internet Engineering Task Force                                  A. Ford
3	Internet-Draft                                       Roke Manor Research
4	Intended status: Experimental                                 M. Handley
5	Expires: January 7, 2010                       University College London
6	                                                            July 6, 2009

8	    HTTP Extensions for Simultaneous Download from Multiple Mirrors
9	                    draft-ford-http-multi-server-00

11	Status of this Memo

13	   This Internet-Draft is submitted to IETF in full conformance with the
14	   provisions of BCP 78 and BCP 79.

16	   Internet-Drafts are working documents of the Internet Engineering
17	   Task Force (IETF), its areas, and its working groups.  Note that
18	   other groups may also distribute working documents as Internet-
19	   Drafts.

21	   Internet-Drafts are draft documents valid for a maximum of six months
22	   and may be updated, replaced, or obsoleted by other documents at any
23	   time.  It is inappropriate to use Internet-Drafts as reference
24	   material or to cite them other than as "work in progress."

26	   The list of current Internet-Drafts can be accessed at
27	   http://www.ietf.org/ietf/1id-abstracts.txt.

29	   The list of Internet-Draft Shadow Directories can be accessed at
30	   http://www.ietf.org/shadow.html.

32	   This Internet-Draft will expire on January 7, 2010.

34	Copyright Notice

36	   Copyright (c) 2009 IETF Trust and the persons identified as the
37	   document authors.  All rights reserved.

39	   This document is subject to BCP 78 and the IETF Trust's Legal
40	   Provisions Relating to IETF Documents in effect on the date of
41	   publication of this document (http://trustee.ietf.org/license-info).
42	   Please review these documents carefully, as they describe your rights
43	   and restrictions with respect to this document.

45	Abstract

47	   This document describes an extension to HTTP by which servers can
48	   automatically inform clients of mirrors of resources.  Clients can
49	   then simultaneously request segments of the resource from different
50	   servers, enhancing both network and server utilisation, download
51	   speeds, and thus user experience.

53	Table of Contents

55	   1.  Introduction and Motivation
56	     1.1.  Operation Overview
57	     1.2.  Requirements Language
58	   2.  HTTP Extension Headers
59	     2.1.  Overview
60	     2.2.  Requests
61	       2.2.1.  X-Multiserver-Version
62	       2.2.2.  X-If-Checksum-Match
63	     2.3.  Responses
64	       2.3.1.  X-Multiserver-Version
65	       2.3.2.  X-Checksum
66	       2.3.3.  X-Mirrors
67	   3.  Example Operation
68	   4.  Full Interaction Specification
69	     4.1.  Error Handling
70	       4.1.1.  Checksum Failure
71	   5.  Alternative Approaches
72	   6.  Heuristics and Optimizations
73	   7.  Managing Server Load
74	   8.  Security Considerations
75	   9.  Acknowledgements
76	   10. IANA Considerations
77	   11. References
78	     11.1. Normative References
79	     11.2. Informative References
80	   Authors' Addresses

82	1.  Introduction and Motivation

84	   Mirrored HTTP servers are regularly used for software downloads,
85	   whereby copies of data to be downloaded are duplicated on many
86	   servers distributed around the Internet.  Users are encouraged to
87	   manually choose a nearby mirror from which to download.  This is
88	   intended to increase both throughput and resilience, and reduce load
89	   on individual servers.  Manual mirror choice rarely works well; users
90	   do not wish to make a choice, but if they are not forced to, then the
91	   default server takes a disproportionate share of the load.  Even when
92	   they are forced to choose, they rarely have enough information to
93	   choose the server that will provide the best performance.

95	   Some popular sites automate this process using DNS load balancing,
96	   both to approximately balance load between servers, and to direct
97	   clients to nearby servers with the hope that this improves
98	   throughput.  Indeed, DNS load balancing can balance long-term server
99	   load fairly effectively, but it is less effective at delivering the
100	   best throughput to users when the bottleneck is not the server but
101	   the network.

103	   This document specifies an alternative mechanism by which the benefit
104	   of mirrors can be automatically and more efficiently realised.  These
105	   benefits are achieved using a number of extensions to HTTP which
106	   allow the discovery of mirrors, the verification of the integrity of
107	   files on each mirror, and the simultaneous downloading of chunks from
108	   multiple mirrors.  The use of this mechanism allows greater
109	   efficiency in resource utilisation in the Internet as a whole,
110	   balances server utilization, even on short timescales, and enhances
111	   user experience through faster downloads.

113	1.1.  Operation Overview

115	   In an HTTP request a client will first declare that it conforms to
116	   multi-server HTTP extensions.  The capable server that wishes to take
117	   advantage of mirroring for the file requested will respond in its
118	   headers with a list of mirrors for the given file, and it will also
119	   return a checksum for the file.

121	   Multi-server HTTP extensions require that the client and server
122	   support persistent connections.  A server responding with a mirror
123	   list uses chunked encoding for the response to permit the client to
124	   request ranges of bytes from the file from different servers.  It is
125	   normally up to the client to decide on the ranges requested from each
126	   mirror.  However, to avoid wasted time when talking to the initial
127	   server, this server starts to return data immediately, and it is up
128	   to the server to determine this initial chunk size.

130	   The server will make an estimate of a sensible initial chunk size,
131	   and immediately respond with this amount of the file (a Content-
132	   Range: header declares how much is sent in this chunk).  The
133	   mechanism for determining the initial chunk size is up to the server
134	   implementation.  However, some suggestions include: it could be
135	   determined by a bounded proportion of the file size, or based on the
136	   RTT of the initial handshake, or based on cached information on
137	   previous throughput to the same client, or based on current
138	   throughput achieved by the server, or a combination of all of these.
139	   We will discuss these heuristics in more detail below.
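
As a rough illustration of how a server might combine these suggestions, the following Python sketch takes a bounded proportion of the file size and, when RTT and throughput measurements are available, raises the estimate to cover a couple of round trips. The function name, the 5% proportion, the two-round-trip factor and the 10 KByte and 1 MByte bounds are all assumptions made for this example; the draft leaves the mechanism to the server implementation.

```python
def initial_chunk_size(file_size, rtt_seconds=None, throughput_bps=None,
                       proportion=0.05, floor=10 * 1024, ceiling=1024 * 1024):
    """Estimate how many bytes the initial server sends unprompted.

    All defaults are illustrative assumptions, not values from the draft.
    """
    # Start from a bounded proportion of the file size.
    size = file_size * proportion

    # If measurements exist, send enough to keep the link busy for roughly
    # two round trips while the client sets up its range requests.
    if rtt_seconds is not None and throughput_bps is not None:
        size = max(size, 2 * rtt_seconds * throughput_bps / 8)

    # Clamp to the assumed bounds, and never exceed the file itself.
    size = min(max(size, floor), ceiling)
    return int(min(size, file_size))
```

With these assumed defaults, a small file is sent whole in the initial chunk, while a very large file on a fast link is capped at the ceiling.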

141	   It is then up to the client to work out appropriate scheduling for
142	   retrieving further chunks of this file from the original server and
143	   the mirrors.  The initial server will continue sending its own
144	   scheduled chunks until a GET request is received containing the
145	   standard HTTP Range: header, and at this point the server defers to
146	   the client's requests.  It is up to the client to choose how many
147	   servers to use, and what size chunks to request from each.  An
148	   advanced client is expected to adapt to relative speeds of servers
149	   and utilise the bandwidth most effectively, and to pipeline range
150	   requests to these servers to avoid idle time on the connections.  The
151	   client will also include a new checksum verification header to ensure
152	   the file on the mirror is the same as from the initial server.

154	1.2.  Requirements Language

156	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
157	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
158	   document are to be interpreted as described in RFC 2119 [RFC2119].

160	2.  HTTP Extension Headers

162	2.1.  Overview

164	   The additional functionality required at the HTTP level can be broken
165	   down as follows:

167	   o  Ensure client and server operate the same version of Multi-Server HTTP

169	   o  Retrieve and verify a checksum for a resource

171	   o  Identify alternative servers that provide access to the same
172	      resource

174	   o  Retrieve segments of a resource

176	   From these requirements, the additional headers required can be
177	   derived, and are described in the following sections.

179	   NOTE WELL: The names of the extension headers given below are
180	   TEMPORARY and FOR DISCUSSION ONLY.  These will evolve to meet
181	   [RFC2774] if appropriate.

183	   The formal definitions below make use of the conventions defined in
184	   Section 2.2 of [RFC2616].

186	2.2.  Requests

188	2.2.1.  X-Multiserver-Version

190	   This is used in both request and response to indicate that the host
191	   has multi-server HTTP capability and what version it operates.
192	   For this specification, this version field MUST read "0.1".

194	   Formally,

196	             X-Multiserver-Version =
197	               "X-Multiserver-Version" ":" 1*DIGIT "." 1*DIGIT

199	   Example:

201	             X-Multiserver-Version: 0.1

203	2.2.2.  X-If-Checksum-Match

205	   This option is used in a request as a conditional, where the request
206	   is only valid if the checksum of the resource matches that specified
207	   in this header.

209	   The checksum is that already provided by the X-Checksum header in the
210	   response (see below).

212	   Two checksum functions are provided in this specification, and a
213	   compliant implementation MUST implement both.  These are MD5
214	   [RFC1321] and SHA-256 [FIPS180-3].

216	   Formally,

218	             checksum = quoted-string
219	             sum-type = "MD5" | "SHA-256"
220	             X-If-Checksum-Match =
221	               "X-If-Checksum-Match" ":" 1 sum-type 1 checksum

223	   Example:

225	             X-If-Checksum-Match: MD5 \
226	               "b1946ac92492d2347c6235b4d2611184"
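
The checksum header value can be produced and parsed along the following lines. This Python sketch assumes the quoted string carries the digest as hexadecimal text, as the example above suggests; the draft does not state the encoding explicitly. The function names are hypothetical.

```python
import hashlib
import re

# Maps the draft's sum-type tokens to hashlib algorithm names.
_ALGORITHMS = {"MD5": "md5", "SHA-256": "sha256"}

# sum-type, whitespace, then a quoted hex digest (hex is an assumption).
_HEADER_VALUE = re.compile(r'^(MD5|SHA-256)\s+"([0-9a-fA-F]+)"$')


def format_checksum(sum_type: str, data: bytes) -> str:
    """Return a header value of the form: MD5 "<hex digest>"."""
    digest = hashlib.new(_ALGORITHMS[sum_type], data).hexdigest()
    return f'{sum_type} "{digest}"'


def parse_checksum(value: str) -> tuple[str, str]:
    """Split a header value into (sum_type, lowercase hex digest)."""
    match = _HEADER_VALUE.match(value.strip())
    if match is None:
        raise ValueError(f"malformed checksum header value: {value!r}")
    return match.group(1), match.group(2).lower()
```

The same value format is shared by X-If-Checksum-Match and X-Checksum, so one pair of helpers would serve both headers.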

228	2.3.  Responses

230	2.3.1.  X-Multiserver-Version

232	   As for a request, this handshake ensures both hosts will speak the
233	   same protocol.

235	2.3.2.  X-Checksum

237	   This option is presented by the server to provide a checksum of the
238	   whole file, which MUST be used in the X-If-Checksum-Match option to
239	   other mirrors.  It SHOULD be used by a client to verify the file once
240	   all segments have been downloaded.

242	   Formally,

244	             X-Checksum = "X-Checksum" ":" 1 sum-type 1 checksum

246	   Example:

248	             X-Checksum: MD5 \
249	               "b1946ac92492d2347c6235b4d2611184"

251	2.3.3.  X-Mirrors

253	   This option lists the original file, the cache lifetime in seconds,
254	   and the list of all mirrors available for the requested URI.  This
255	   header once per mirror.  If the original file ends in a slash ("/"),
256	   it indicates that the mirrors will provide all resources under this
257	   directory.  If so, then all the mirrors MUST also end in a slash.  If
258	   it does not specify a directory, but instead ends in a file name,
259	   then this declares that the mirror exists only for this file.  Any
260	   further requests for different files must first go to this server to
261	   acquire a mirror list, if it exists.

263	   The cache lifetime is the time for which this mirror information can
264	   be cached before it SHOULD be considered stale and SHOULD NOT be
265	   used.  Caching the mirror list is most important for mirrors of
266	   directories, where small files can be requested directly from the
267	   mirrors without needing to request the first chunk from the initial
268	   server.

270	   Formally:

272	             X-Mirrors="X-Mirrors" ":" filename cachetime 1*absoluteURI

274	   Example:

276	             X-Mirrors: /wibble/download.zip 3600 \
277	               http://www.example2.com/wibble/download.zip \
278	               http://www.example3.com/wibble/download.zip
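
A client might parse this header value roughly as follows. The Python sketch assumes the value has already been unfolded onto one line (the backslashes in the example are taken to be presentational line continuations) and that fields are separated by whitespace. The MirrorList type and parse_x_mirrors function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass
class MirrorList:
    path: str            # original file or directory path
    lifetime: int        # cache lifetime in seconds
    mirrors: list[str]   # absolute mirror URIs

    @property
    def is_directory(self) -> bool:
        # A trailing slash means the mirrors cover the whole directory.
        return self.path.endswith("/")


def parse_x_mirrors(value: str) -> MirrorList:
    """Parse an unfolded X-Mirrors header value."""
    fields = value.split()
    if len(fields) < 3:
        raise ValueError("expected a path, a lifetime and at least one mirror")
    path, lifetime, mirrors = fields[0], int(fields[1]), fields[2:]
    for uri in mirrors:
        parts = urlsplit(uri)
        if not parts.scheme or not parts.netloc:
            raise ValueError(f"mirror is not an absolute URI: {uri!r}")
        # Directory mirrors MUST also end in a slash.
        if path.endswith("/") and not uri.endswith("/"):
            raise ValueError(f"directory mirror must end in a slash: {uri!r}")
    return MirrorList(path, lifetime, mirrors)
```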

280	3.  Example Operation

282	   So far, this document has defined the functional components that
283	   enable multi-server HTTP.  This section will draw out an example of
284	   how these fit together and utilise existing HTTP/1.1 features.

286	   First, a Multi-Server HTTP client will connect to a server and
287	   request a file, while declaring itself multi-server HTTP capable:

289	         GET /wibble/download.zip HTTP/1.1
290	         Host: www.example.com
291	         X-Multiserver-Version: 0.1

293	   The server will respond with details about this resource, and begin
294	   delivering some of the file:

296	         HTTP/1.1 200 OK
297	         Accept-Ranges: bytes
298	         Content-Length: 10240
299	         Content-Type: application/zip
300	         Content-Range: bytes 1-10240/2025121
301	         X-Multiserver-Version: 0.1
302	         X-Checksum: MD5 "d6862c992a3d6736ad678cc865dee67f"
303	         X-Mirrors: /wibble/download.zip 3600 \
304	           http://www.example2.com/wibble/download.zip \
305	           http://www.example3.com/wibble/download.zip

307	         ...data...

309	   The client will then continue requesting more blocks from this
310	   server:

312	         GET /wibble/download.zip HTTP/1.1
313	         Host: www.example.com
314	         X-Multiserver-Version: 0.1
315	         Range: bytes=10241-20480

317	   The server will then respond with further blocks of the file:

319	         HTTP/1.1 200 OK
320	         Accept-Ranges: bytes
321	         Content-Length: 10240
322	         Content-Type: application/zip
323	         Content-Range: bytes 10241-20480/2025121
324	         X-Multiserver-Version: 0.1

326	         ...data...

328	   The client will then also connect to the mirrors and request more blocks,
329	   such as the following:

331	         GET /wibble/download.zip HTTP/1.1
332	         Host: www.example2.com
333	         X-Multiserver-Version: 0.1
334	         X-If-Checksum-Match: MD5 "d6862c992a3d6736ad678cc865dee67f"
335	         Range: bytes=20481-30720

337	   Assuming the checksum matches correctly, the server will respond as
338	   for a normal multiserver HTTP continuation request from the original
339	   server.

341	   When finished, the client can close all its connections.
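
The mirror request in this exchange could be assembled along these lines. This is a minimal Python sketch with a hypothetical function name; it writes the range using the "bytes=" unit of the standard HTTP/1.1 Range header and leaves connection handling and pipelining to the caller.

```python
def mirror_request(path, host, checksum, first_byte, last_byte):
    """Build the text of a range request to a mirror.

    `checksum` is the X-Checksum value received from the initial server,
    passed back unchanged in X-If-Checksum-Match.
    """
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "X-Multiserver-Version: 0.1",
        f"X-If-Checksum-Match: {checksum}",
        f"Range: bytes={first_byte}-{last_byte}",
    ]
    # HTTP header lines end in CRLF; a blank line ends the header block.
    return "\r\n".join(lines) + "\r\n\r\n"
```

Because persistent connections are required, a client could write several such requests back to back on one connection to keep it from going idle.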

343	4.  Full Interaction Specification

345	   TBD.  This section will define permitted HTTP status codes, behaviour
346	   upon errors, etc.

348	4.1.  Error Handling

350	   Multi-server HTTP introduces a number of failure modes that do not
351	   exist in conventional HTTP.

353	4.1.1.  Checksum Failure

355	   It should not be the case that the data checksum fails when the file
356	   has finished downloading, as the mirrors MUST NOT return data if it
357	   does not correspond to the version of the file specified in the
358	   checksum.  However, if this occurs due to bugs or hardware data
359	   corruption, there is no way to determine which chunks are errored.
360	   It would be possible to add a mechanism to check with the original
361	   server to determine what the checksum would be for a range, but this
362	   would add extra complexity for a case that is expected to be
363	   extremely rare.  In the case of a checksum failure, a client SHOULD
364	   re-request from the original server any chunks it received from
365	   mirrors, recalculating the file checksum periodically to determine
366	   whether the errored chunk has now been replaced.
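
The recovery procedure can be sketched as follows, assuming the client keeps its chunks in memory in file order and remembers which of them came from mirrors. The function and parameter names are hypothetical, MD5 is used only to match the examples in this document, and the checksum is recalculated after every replacement, which is one possible reading of "periodically".

```python
import hashlib


def repair(chunks, from_mirror, expected_md5, refetch):
    """Replace mirror-supplied chunks until the file checksum matches.

    chunks:       list of bytes objects in file order (modified in place)
    from_mirror:  indexes of chunks that were fetched from mirrors
    expected_md5: hex digest taken from the X-Checksum header
    refetch:      callable(index) -> bytes, fetching from the original server
    """
    def checksum_ok():
        return hashlib.md5(b"".join(chunks)).hexdigest() == expected_md5

    for index in from_mirror:
        # Stop as soon as the errored chunk has been replaced.
        if checksum_ok():
            return True
        chunks[index] = refetch(index)
    return checksum_ok()
```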

368	5.  Alternative Approaches

370	   It is debatable whether ETags could be used in place of checksums,
371	   along with the If-Match header in requests.  We elected not to do
372	   so because an ETag is defined to be unique to a resource, whereas in
373	   this case we are accessing multiple resources (URIs), even though
374	   they should hold the same data.  We do not feel comfortable using
375	   ETags in this way, as it would require the semantics of the field to
376	   be changed.

378	6.  Heuristics and Optimizations

380	   The performance of Multi-server HTTP depends heavily on several factors:

382	   o  The number of servers used simultaneously.

384	   o  The ability to pipeline sufficient or sufficiently large range
385	      requests to each server so as to avoid connections going idle.

387	   o  The ability to pipeline sufficiently few or sufficiently small
388	      range requests to servers so that all the servers finish their
389	      final chunks simultaneously.

391	   o  The ability to switch between mirrors dynamically so as to use the
392	      fastest mirrors at any moment in time.

394	   Obviously we do not want to use too many simultaneous connections, or
395	   other traffic sharing a bottleneck link will be starved.  But at the
396	   same time, good performance requires that the client can
397	   simultaneously download from at least one fast mirror while exploring
398	   whether any other mirror is faster.  Based on laboratory experiments,
399	   we suggest a good default number of simultaneous connections is
400	   probably four, with three of these being used for the best three
401	   mirrors found so far, and one being used to evaluate whether any
402	   other mirror might offer better performance.

404	   The size of chunks chosen by the client should be sufficiently large
405	   that the chunk request headers and response headers represent negligible
406	   overhead, and sufficiently large that they can be pipelined
407	   effectively without needing a very high rate of chunk requests.  At
408	   the same time, the amount of time wasted waiting for the last chunk
409	   to download from the last server after all the other servers have
410	   finished should be minimized.  Thus we currently recommend that a
411	   chunk size of at least 10KBytes should be used.  If the file being
412	   transferred is very large, or the download speed very high, this can
413	   be increased to perhaps 1MByte.  As network bandwidths increase, we
414	   expect these numbers to increase appropriately, so that the time to
415	   transfer a chunk remains significantly larger than the latency of
416	   requesting a chunk from a server.
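
The two client-side decisions described in this section, how to cut the file into chunks and which servers to keep connections to, might be sketched in Python as below. The function names and the dictionary of measured throughputs are assumptions for illustration; byte offsets are zero-based and inclusive, as in HTTP/1.1 byte ranges.

```python
def plan_ranges(file_size, start, chunk_size):
    """Yield inclusive (first, last) byte offsets covering [start, file_size)."""
    first = start
    while first < file_size:
        last = min(first + chunk_size, file_size) - 1
        yield first, last
        first = last + 1


def pick_mirrors(measured, untried, best=3):
    """Return up to `best` fastest measured mirrors plus one to evaluate.

    measured: dict mapping mirror URL -> observed throughput
    untried:  list of mirror URLs not yet measured
    """
    ranked = sorted(measured, key=measured.get, reverse=True)
    chosen = ranked[:best]
    if untried:
        # Spend the extra connection exploring an unmeasured mirror.
        chosen.append(untried[0])
    elif len(ranked) > best:
        # Nothing left to explore: re-test the next fastest known mirror.
        chosen.append(ranked[best])
    return chosen
```

With the default of three best mirrors plus one exploratory connection, this gives the four simultaneous connections suggested above. A fuller client would also shrink the chunk size towards the end of the transfer so that all servers finish at about the same time.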

418	7.  Managing Server Load

420	   A goal of mirroring is to manage server load, not just network
421	   bandwidth.  Thus server operators need simple mechanisms they can use
422	   to handle overload conditions gracefully and balance load.  Multi-
423	   server HTTP spreads load automatically across heterogeneous
424	   servers so that the fastest servers get to serve the most data.
425	   However, if a server is approaching an overload condition, Multi-
426	   server HTTP can increase the overall load a little due to the
427	   additional processing of chunk requests.  Thus, a way for such a
428	   server to discard load is desirable.  The easiest way to shed load is
429	   to use two ports for HTTP; one is used for the initial requests to a
430	   server, and the second is used to handle requests for subsequent
431	   chunks redirected from other servers.  This second port would then be
432	   explicitly reported in the URLs specified in the X-Mirrors option.  A
433	   server that is approaching overload can continue to report mirrors in
434	   initial requests sent to it (thus directing load to other mirrors),
435	   but simply stop listening on the second port.  This avoids it having
436	   to handle requests redirected to it from other servers until its load
437	   returns to a more manageable level.  In the event that all the
438	   mirrors are overloaded, this effectively falls back to conventional
439	   HTTP, with only requests sent to the original port being serviced.

441	8.  Security Considerations

443	   At present, we do not believe there should be any new security issues
444	   with this approach.  The use of checksums will ensure that the
445	   resource on every server is the same, and the final result can also
446	   be verified.  One concern could be regarding whether we should permit
447	   redirects to be used as responses on a mirror request.  Although this
448	   is deviating from the mirrored resource as originally specified by
449	   the initial server, the use of checksums will allow the final file to
450	   be verified.

452	9.  Acknowledgements

454	   This work builds upon work undertaken by Javier Vela Diago, who also
455	   provided validation of the benefits of this approach.  It is based
456	   loosely on the BitTorrent algorithm, but adapted to the HTTP client/
457	   server architecture.

459	   The authors are supported by Trilogy
460	   (http://www.trilogy-project.org), a research project (ICT-216372)
461	   partially funded by the European Community under its Seventh
462	   Framework Program.  The views expressed here are those of the
463	   author(s) only.  The European Commission is not liable for any use
464	   that may be made of the information in this document.

466	10.  IANA Considerations

468	   None.

470	11.  References

472	11.1.  Normative References

474	   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
475	              Requirement Levels", BCP 14, RFC 2119, March 1997.

477	11.2.  Informative References

479	   [FIPS180-3]
480	              National Institute of Standards and Technology (NIST),
481	              "FIPS 180-3 Secure Hash Standard (SHS)", October 2008,
482	              <http://csrc.nist.gov/publications/fips/fips180-3/fips180-3_final.pdf>.

485	   [RFC1321]  Rivest, R., "The MD5 Message-Digest Algorithm", RFC 1321,
486	              April 1992.

488	   [RFC2616]  Fielding, R., Gettys, J., Mogul, J., Frystyk, H.,
489	              Masinter, L., Leach, P., and T. Berners-Lee, "Hypertext
490	              Transfer Protocol -- HTTP/1.1", RFC 2616, June 1999.

492	   [RFC2774]  Nielsen, H., Leach, P., and S. Lawrence, "An HTTP
493	              Extension Framework", RFC 2774, February 2000.

495	Authors' Addresses

497	   Alan Ford
498	   Roke Manor Research
499	   Old Salisbury Lane
500	   Romsey, Hampshire  SO51 0ZN
501	   UK

503	   Phone: +44 1794 833 465
504	   Email: alan.ford@roke.co.uk

506	   Mark Handley
507	   University College London
508	   Gower Street
509	   London  WC1E 6BT
510	   UK

512	   Email: m.handley@cs.ucl.ac.uk