idnits 2.17.1
draft-vandesompel-memento-02.txt:
Checking boilerplate required by RFC 5378 and the IETF Trust (see
https://trustee.ietf.org/license-info):
----------------------------------------------------------------------------
No issues found here.
Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
----------------------------------------------------------------------------
No issues found here.
Checking nits according to https://www.ietf.org/id-info/checklist :
----------------------------------------------------------------------------
== There are 1 instance of lines with non-RFC2606-compliant FQDNs in the
document.
Miscellaneous warnings:
----------------------------------------------------------------------------
== The copyright year in the IETF Trust and authors Copyright Line does not
match the current year
== Line 1686 has weird spacing: '...ts that respe...'
-- The document date (June 3, 2011) is 4711 days in the past. Is this
intentional?
Checking references for intended status: Informational
----------------------------------------------------------------------------
== Unused Reference: 'RFC4151' is defined on line 1792, but no explicit
reference was found in the text
== Outdated reference: A later version (-14) exists of
draft-ietf-core-link-format-03
** Obsolete normative reference: RFC 2616 (Obsoleted by RFC 7230, RFC 7231,
RFC 7232, RFC 7233, RFC 7234, RFC 7235)
** Obsolete normative reference: RFC 5988 (Obsoleted by RFC 8288)
== Outdated reference: A later version (-10) exists of
draft-masinter-dated-uri-08
Summary: 2 errors (**), 0 flaws (~~), 6 warnings (==), 1 comment (--).
Run idnits with the --verbose option for more detailed information about
the items above.
--------------------------------------------------------------------------------
2 Internet Engineering Task Force H. VandeSompel
3 Internet-Draft Los Alamos National Laboratory
4 Intended status: Informational M. Nelson
5 Expires: December 5, 2011 Old Dominion University
6 R. Sanderson
7 Los Alamos National Laboratory
8 June 3, 2011
10 HTTP framework for time-based access to resource states -- Memento
11 draft-vandesompel-memento-02
13 Abstract
15 The HTTP-based Memento framework bridges the present and past Web by
16 interlinking current resources with resources that encapsulate their
17 past. It facilitates obtaining representations of prior states of a
18 resource, available from archival resources in Web archives or
19 version resources in content management systems, by leveraging the
20 resource's URI and a preferred datetime. To this end, the framework
21 introduces datetime negotiation (a variation on content negotiation),
22 and new Relation Types for the HTTP Link header aimed at interlinking
23 resources with their archival/version resources. It also introduces
24 various discovery mechanisms that further support bridging the
25 present and past Web.
27 Status of this Memo
29 This Internet-Draft is submitted in full conformance with the
30 provisions of BCP 78 and BCP 79.
32 Internet-Drafts are working documents of the Internet Engineering
33 Task Force (IETF). Note that other groups may also distribute
34 working documents as Internet-Drafts. The list of current Internet-
35 Drafts is at http://datatracker.ietf.org/drafts/current/.
37 Internet-Drafts are draft documents valid for a maximum of six months
38 and may be updated, replaced, or obsoleted by other documents at any
39 time. It is inappropriate to use Internet-Drafts as reference
40 material or to cite them other than as "work in progress."
42 This Internet-Draft will expire on December 5, 2011.
44 Copyright Notice
46 Copyright (c) 2011 IETF Trust and the persons identified as the
47 document authors. All rights reserved.
49 This document is subject to BCP 78 and the IETF Trust's Legal
50 Provisions Relating to IETF Documents
51 (http://trustee.ietf.org/license-info) in effect on the date of
52 publication of this document. Please review these documents
53 carefully, as they describe your rights and restrictions with respect
54 to this document. Code Components extracted from this document must
55 include Simplified BSD License text as described in Section 4.e of
56 the Trust Legal Provisions and are provided without warranty as
57 described in the Simplified BSD License.
59 Table of Contents
61 1. Introduction
62 1.1. Terminology
63 1.2. Purpose
64 1.3. Notational Conventions
65 2. The Memento Framework, Datetime Negotiation component:
66 HTTP headers, HTTP Link Relation Types
67 2.1. HTTP Headers
68 2.1.1. Accept-Datetime, Memento-Datetime
69 2.1.1.1. Values for Accept-Datetime
70 2.1.1.2. Values for Memento-Datetime
71 2.1.2. Vary
72 2.1.3. Location
73 2.1.4. Link
74 2.2. Link Header Relation Types
75 2.2.1. Memento Framework Relation Types
76 2.2.1.1. Relation Type "original"
77 2.2.1.2. Relation Type "timegate"
78 2.2.1.3. Relation Type "timemap"
79 2.2.1.4. Relation Type "memento"
80 2.2.2. Other Relation Types
81 3. The Memento Framework, Datetime Negotiation component:
82 HTTP Interactions
83 3.1. Interactions with an Original Resource
84 3.1.1. Step 1: User Agent Requests an Original Resource
85 3.1.2. Step 2: Server Responds to a Request for an
86 Original Resource
87 3.1.2.1. Original Resource is an Appropriate Memento
88 3.1.2.2. Server Exists and Original Resource Used to
89 Exist
90 3.1.2.3. Missing or Inadequate "timegate" Link in
91 Original Server's Response
92 3.2. Interactions with a TimeGate
93 3.2.1. Step 3: User Agent Negotiates with a TimeGate
94 3.2.2. Step 4: Server Responds to Negotiation with
95 TimeGate
97 3.2.2.1. Successful Scenario
98 3.2.2.2. Accept-Datetime with Interval Indicator
99 Provided
100 3.2.2.3. Multiple Matching Mementos
101 3.2.2.4. TimeGate Redirects to another TimeGate
102 3.2.2.5. Accept-Datetime and other Accept Headers
103 Provided
104 3.2.2.6. Accept-Datetime Unparseable
105 3.2.2.7. Accept-Datetime Not Provided
106 3.2.2.8. TimeGate Does Not Exist
107 3.2.2.9. HTTP Methods other than HEAD/GET
108 3.2.3. Recognizing a TimeGate
109 3.3. Interactions with a Memento
110 3.3.1. Step 5: User Agent Requests a Memento
111 3.3.2. Step 6: Server Responds to a Request for a Memento
112 3.3.2.1. Memento Does not Exist
113 3.3.2.2. Mementos Without a TimeGate
114 3.3.3. Recognizing a Memento
115 3.4. Interactions with a TimeMap
116 3.4.1. User Agent Requests a TimeMap
117 3.4.2. Server Responds to a Request for a TimeMap
118 4. The Memento Framework, Discovery Component
119 4.1. Discovering TimeGates Via Robots Exclusion Protocol
120 4.2. Discovering Mementos via Robots Exclusion Protocol
121 5. IANA Considerations
122 6. Security Considerations
123 7. Changelog
124 8. Acknowledgements
125 9. References
126 9.1. Normative References
127 9.2. Informative References
128 Appendix A. Appendix B: A Sample, Successful Memento
129 Request/Response cycle
130 Authors' Addresses
132 1. Introduction
134 1.1. Terminology
136 This specification uses the terms "resource", "request", "response",
137 "entity", "entity-body", "entity-header", "content negotiation",
138 "client", "user agent", "server" as described in RFC 2616 [RFC2616],
139 and it uses the terms "representation" and "resource state" as
140 described in W3C.REC-aww-20041215 [W3C.REC-aww-20041215].
142 In addition, the following terms specific to the Memento framework
143 are introduced:
145 o Original Resource: An Original Resource is a resource that exists
146 or used to exist, and for which access to one of its prior states
147 is desired.
149 o Memento: A Memento for an Original Resource is a resource that
150 encapsulates a prior state of the Original Resource. A Memento
151 for an Original Resource as it existed at time Tj is a resource
152 that encapsulates the state that the Original Resource had at time
153 Tj.
155 o TimeGate: A TimeGate for an Original Resource is a resource that
156 is capable of negotiation to allow selective, datetime-based,
157 access to prior states of the Original Resource.
159 o TimeMap: A TimeMap for an Original Resource is a resource from
160 which a list of URIs of Mementos of the Original Resource is
161 available.
163 1.2. Purpose
165 The state of an Original Resource may change over time.
166 Dereferencing its URI at any specific moment in time during its
167 existence yields a representation of its then current state.
168 Dereferencing its URI at any time past its existence no longer yields
169 a meaningful representation, if any. Still, in both cases, resources
170 may exist that encapsulate prior states of the Original Resource.
171 Each such resource, named a Memento, has its own URI that, when
172 dereferenced, returns a representation of a prior state of the
173 Original Resource. Mementos may, for example, exist in Web archives,
174 Content Management Systems, or Revision Control Systems.
176 Examples are:
178 Mementos for Original Resource http://www.ietf.org/ :
180 o http://web.archive.org/web/19970107171109/http://www.ietf.org/
182 o http://webarchive.nationalarchives.gov.uk/20080906200044/http://
183 www.ietf.org/
185 Mementos for Original Resource
186 http://en.wikipedia.org/wiki/Hypertext_Transfer_Protocol :
188 o http://en.wikipedia.org/w/
189 index.php?title=Hypertext_Transfer_Protocol&oldid=366806574
191 o http://en.wikipedia.org/w/
192 index.php?title=Hypertext_Transfer_Protocol&oldid=33912
194 o http://web.archive.org/web/20071011153017/http://en.wikipedia.org/
195 wiki/Hypertext_Transfer_Protocol
197 Mementos for Original Resource http://www.w3.org/TR/webarch/ :
199 o http://www.w3.org/TR/2004/PR-webarch-20041105/
201 o http://www.w3.org/TR/2002/WD-webarch-20020830/
203 o http://webarchive.nationalarchives.gov.uk/20100304163140/http://
204 www.w3.org/TR/webarch/
206 In the abstract, Memento introduces a mechanism to access versions of
207 Web resources that:
209 o Is fully distributed in the sense that resource versions may
210 reside on multiple hosts, and that any such host is likely only
211 aware of the versions it holds;
213 o Uses the global notion of datetime as a resource version indicator
214 and access key;
216 o Leverages the following primitives of W3C.REC-aww-20041215
217 [W3C.REC-aww-20041215]: resource, resource state, representation,
218 content negotiation, and link.
220 The core components of Memento's mechanism to access resource
221 versions are:
223 1. The abstract notion of the state of a resource identified by
224 URI-R as it existed at some time Tj. Note the relationship with the
225 ability to identify the state of a resource at some datetime Tj by
226 means of a URI as intended by the proposed Dated URI scheme
227 I-D.masinter-dated-uri [I-D.masinter-dated-uri].
229 2. A bridge from the present to the past, consisting of:
231 o An appropriately typed link from a resource identified by URI-R to
232 an associated TimeGate identified by URI-G, which is aware of (at
233 least part of the) version history of the resource identified by
234 URI-R;
236 o The ability to content negotiate in the datetime dimension with
237 the TimeGate identified by URI-G, as a means to obtain a
238 representation of the state that the resource identified by URI-R
239 had at some datetime Tj.
241 3. A bridge from the past to the present, consisting of an
242 appropriately typed link from a resource identified by URI-M, which
243 encapsulates the state a resource identified by URI-R had at some
244 datetime Tj, to the resource identified by URI-R.
246 Section 2 and Section 3 of this document are concerned with
247 specifying an instantiation of these abstractions for resources that
248 are identified by HTTP(S) URIs, whereas Section 4 details approaches
249 to discover TimeGates, TimeMaps, and Mementos on the HTTP(S) Web by
250 other means than typed links.
252 1.3. Notational Conventions
254 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
255 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
256 document are to be interpreted as described in RFC 2119 [RFC2119].
258 When needed for extra clarity, the following conventions are used:
260 o URI-R is used to denote the URI of an Original Resource.
262 o URI-G is used to denote the URI of a TimeGate.
264 o URI-M is used to denote the URI of a Memento.
266 o URI-T is used to denote the URI of a TimeMap.
268 o When scenarios are described that involve multiple Mementos,
269 URI-M0 denotes the URI of the first Memento known to the
270 responding server, URI-Mn denotes the URI of the most recent known
271 Memento, URI-Mj denotes the URI of the selected Memento, URI-Mi
272 denotes the URI of the Memento that is temporally previous to the
273 selected Memento, and URI-Mk denotes the URI of the Memento that
274 is temporally after the selected Memento. The respective
275 datetimes for these Mementos are T0, Tn, Tj, Ti, and Tk; it holds
276 that T0 <= Ti <= Tj <= Tk <= Tn.
278 2. The Memento Framework, Datetime Negotiation component: HTTP headers,
279 HTTP Link Relation Types
281 The Memento framework is concerned with Original Resources,
282 TimeGates, Mementos, and TimeMaps that are identified by HTTP or
283 HTTPS URIs. Details are only provided for resources identified by
284 HTTP URIs but apply similarly to those with HTTPS URIs.
286 2.1. HTTP Headers
288 The Memento framework operates at the level of HTTP request and
289 response headers. It introduces two new headers ("Accept-Datetime",
290 "Memento-Datetime"), introduces new values for two existing headers
291 ("Vary", "Link"), and uses an existing header ("Location") without
292 modification. All these headers are described below. Other HTTP
293 headers are present or absent in Memento response/request cycles as
294 specified by RFC 2616 [RFC2616].
296 2.1.1. Accept-Datetime, Memento-Datetime
298 The "Accept-Datetime" request header is used by a user agent to
299 indicate it wants to retrieve a representation of a Memento that
300 encapsulates a past state of an Original Resource. To that end, the
301 "Accept-Datetime" header is conveyed in an HTTP GET/HEAD request
302 issued against a TimeGate for an Original Resource, and its value
303 indicates the datetime of the desired past state of the Original
304 Resource. The "Accept-Datetime" request header has no defined
305 meaning for HTTP methods other than HEAD and GET.
307 The "Memento-Datetime" response header is used by a server to
308 indicate that the response contains a representation of a Memento,
309 and its value expresses the datetime of the state of an Original
310 Resource that is encapsulated in that Memento. The URI of that
311 Original Resource is provided in the response, as the Target IRI (see
312 RFC5988 [RFC5988]) of a link provided in the HTTP "Link" header that
313 has a Relation Type of "original" (see Section 2.2).
315 The presence of a "Memento-Datetime" header and associated value for
316 a given resource constitutes a promise that the resource is stable
317 and that its state will no longer change. This means that, in terms
318 of the Ontology for Relating Generic and Specific Information
319 Resources (see W3C.gen-ont-20090420 [W3C.gen-ont-20090420]), a
320 Memento is a FixedResource.
322 As a consequence, "Memento-Datetime" headers associated with a
323 Memento MUST be "sticky" in the following ways:
325 o The server that originally assigns the "Memento-Datetime" header
326 and value MUST retain that header in all responses to HTTP HEAD/
327 GET requests (with or without "Accept-Datetime" header) that occur
328 against the Memento after the time of the original assignment of
329 the header, and it MUST NOT change its associated value.
331 o Applications that mirror Mementos at a different URI MUST NOT
332 change the "Memento-Datetime" header and value of those Mementos
333 unless mirroring involves a meaningful state change. This allows,
334 for example, duplicating a Web archive at a new location while
335 preserving the value of the "Memento-Datetime" header of the
336 archived resources. In this example, the "Last-Modified" header
337 will be updated to reflect the time of mirroring at the new URI,
338 whereas the value for "Memento-Datetime" will be sticky.
340 2.1.1.1. Values for Accept-Datetime
342 Values for the "Accept-Datetime" header consist of a MANDATORY
343 datetime expressed according to the RFC 1123 [RFC1123] format, which
344 is formalized by the rfc1123-date construction rule of the BNF in
345 Figure 1, and an OPTIONAL interval indicator expressed according to
346 the iso8601-interval rule of the BNF in Figure 1. The datetime MUST
347 be represented in Greenwich Mean Time (GMT).
349 Examples of "Accept-Datetime" request headers with and without an
350 interval indicator:
352 Accept-Datetime: Thu, 31 May 2007 20:35:00 GMT
353 Accept-Datetime: Thu, 31 May 2007 20:35:00 GMT; -P3DT5H;+P2DT6H
355 The user agent uses the MANDATORY datetime value to convey its
356 preferred datetime for a Memento; it uses the OPTIONAL interval
357 indicator to convey it is interested in retrieving Mementos that
358 reside within this interval around the preferred datetime, and not
359 interested in Mementos that reside outside of it. Not using an
360 interval indicator is equivalent to expressing an infinite interval
361 around the preferred datetime.
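
As a non-normative illustration, the interval semantics described above
can be sketched in Python. The function name within_interval, the
inclusive treatment of the interval boundaries, and the candidate
datetimes are assumptions made for this sketch; none of them is defined
by this specification.

```python
from datetime import datetime, timedelta, timezone


def within_interval(memento_dt, preferred_dt, before=None, after=None):
    """Return True if memento_dt lies in [preferred - before, preferred + after].

    before and after are timedelta values. None models an absent interval
    indicator, that is, an unbounded interval on that side.
    Boundaries are treated as inclusive (an assumption of this sketch).
    """
    if before is not None and memento_dt < preferred_dt - before:
        return False
    if after is not None and memento_dt > preferred_dt + after:
        return False
    return True


# Accept-Datetime: Thu, 31 May 2007 20:35:00 GMT; -P3DT5H;+P2DT6H
preferred = datetime(2007, 5, 31, 20, 35, 0, tzinfo=timezone.utc)
before = timedelta(days=3, hours=5)
after = timedelta(days=2, hours=6)

# Hypothetical Memento datetimes known to a server.
candidates = [
    datetime(2007, 5, 27, 12, 0, 0, tzinfo=timezone.utc),    # before the interval
    datetime(2007, 5, 30, 18, 47, 52, tzinfo=timezone.utc),  # inside the interval
    datetime(2007, 6, 3, 9, 0, 0, tzinfo=timezone.utc),      # after the interval
]
acceptable = [c for c in candidates if within_interval(c, preferred, before, after)]
```

Calling within_interval without the before and after arguments accepts
every candidate, which mirrors the infinite interval that applies when no
interval indicator is sent.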
363 The interval mechanism can be regarded as an implementation of the
364 functionality intended by the q-value approach that is used in
365 regular content negotiation. The q-value approach is not supported
366 for Memento's datetime negotiation because it is well-suited for
367 negotiation over a discrete space of mostly predictable values, not
368 for negotiation over a continuum of unpredictable datetime values.
370 accept-dt-value = rfc1123-date *SP [ iso8601-interval ]
371 rfc1123-date = wkday "," SP date1 SP time SP "GMT"
372 date1 = 2DIGIT SP month SP 4DIGIT
373 ; day month year (e.g., 20 Mar 1957)
374 time = 2DIGIT ":" 2DIGIT ":" 2DIGIT
375 ; 00:00:00 - 23:59:59 (e.g., 14:33:22)
376 wkday = "Mon" | "Tue" | "Wed" | "Thu" | "Fri" | "Sat" |
377 "Sun"
378 month = "Jan" | "Feb" | "Mar" | "Apr" | "May" | "Jun" |
379 "Jul" | "Aug" | "Sep" | "Oct" | "Nov" | "Dec"
380 iso8601-interval = ";" *SP "-" duration *SP ";" *SP "+" duration
381 duration = "P" ( dur-date | dur-week )
382 dur-date = ( dur-day | dur-month | dur-year ) [ dur-time ]
383 dur-year = 1*DIGIT "Y" [ dur-month ] [ dur-day ]
384 dur-month = 1*DIGIT "M" [ dur-day ]
385 dur-day = 1*DIGIT "D"
386 dur-time = "T" ( dur-hour | dur-minute | dur-second )
387 dur-hour = 1*DIGIT "H" [ dur-minute ] [ dur-second ]
388 dur-minute = 1*DIGIT "M" [ dur-second ]
389 dur-second = 1*DIGIT "S"
390 dur-week = 1*DIGIT "W"
392 Figure 1: BNF for the datetime format
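
The grammar in Figure 1 can be exercised with a small parser. The
following Python sketch is illustrative only. The names parse_duration
and parse_accept_datetime are hypothetical, and the sketch deliberately
accepts only week, day, hour, minute, and second components, because
year and month durations have no fixed length and their handling is not
specified here. It also relies on datetime.strptime, whose weekday and
month names depend on the process locale (an English or C locale is
assumed).

```python
import re
from datetime import datetime, timedelta, timezone

# "P" followed by either weeks, or days with an optional non-empty time part.
# Year and month components are rejected on purpose (assumption of this sketch).
_DURATION = re.compile(
    r"P(?:(\d+)W|(\d+)D(?:T(?=\d)(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?)?)"
)


def parse_duration(text):
    """Convert a supported ISO 8601 duration such as 'P3DT5H' to a timedelta."""
    match = _DURATION.fullmatch(text)
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    weeks, days, hours, minutes, seconds = (
        int(group) if group else 0 for group in match.groups()
    )
    return timedelta(weeks=weeks, days=days, hours=hours,
                     minutes=minutes, seconds=seconds)


def parse_accept_datetime(value):
    """Return (preferred datetime, before, after); the last two may be None."""
    parts = [part.strip() for part in value.split(";")]
    preferred = datetime.strptime(
        parts[0], "%a, %d %b %Y %H:%M:%S GMT"
    ).replace(tzinfo=timezone.utc)
    if len(parts) == 1:
        return preferred, None, None
    if len(parts) != 3 or not parts[1].startswith("-") or not parts[2].startswith("+"):
        raise ValueError(f"malformed interval indicator: {value!r}")
    return preferred, parse_duration(parts[1][1:]), parse_duration(parts[2][1:])
```

A server following Section 3.2.2.6 would treat a ValueError raised here
as an unparseable "Accept-Datetime" value.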
394 2.1.1.2. Values for Memento-Datetime
396 Values for the "Memento-Datetime" headers MUST be datetimes expressed
397 according to the rfc1123-date construction rule of the BNF in
398 Figure 1; they MUST be represented in Greenwich Mean Time (GMT).
400 An example "Memento-Datetime" response header:
402 Memento-Datetime: Wed, 30 May 2007 18:47:52 GMT
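
A server could produce such a header value from a stored capture time
roughly as follows. This is a minimal Python sketch. The function name
memento_datetime_header is hypothetical, and the sketch assumes the
capture time is held as a timezone-aware datetime.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def memento_datetime_header(captured_at):
    """Format a timezone-aware datetime as a Memento-Datetime header line."""
    if captured_at.tzinfo is None:
        raise ValueError("captured_at must be timezone-aware")
    # Convert to UTC first so that the value is always expressed in GMT.
    as_utc = captured_at.astimezone(timezone.utc)
    return "Memento-Datetime: " + format_datetime(as_utc, usegmt=True)


example = memento_datetime_header(
    datetime(2007, 5, 30, 18, 47, 52, tzinfo=timezone.utc)
)
```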
404 2.1.2. Vary
406 The "Vary" response header is used in responses to indicate the
407 dimensions in which content negotiation was successfully applied.
408 This header is used in the Memento framework to indicate whether
409 datetime negotiation was applied or is supported by the responding
410 server.
412 For example, this use of the "Vary" header indicates that datetime is
413 the only dimension in which negotiation was applied:
415 Vary: negotiate, accept-datetime
416 The use of the "Vary" header in this example shows that both datetime
417 negotiation, and media type content negotiation were applied:
419 Vary: negotiate, accept-datetime, accept
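
A user agent might inspect a "Vary" value along the following lines.
The Python sketch below is illustrative; the helper names are
hypothetical, and the case-insensitive comparison is an assumption based
on HTTP header field names being case-insensitive.

```python
def negotiated_dimensions(vary_value):
    """Return the lower-cased Vary tokens, excluding the 'negotiate' marker."""
    tokens = [token.strip().lower() for token in vary_value.split(",")]
    return [token for token in tokens if token and token != "negotiate"]


def datetime_negotiation_signalled(vary_value):
    """True if the Vary value lists accept-datetime as a negotiation dimension."""
    return "accept-datetime" in negotiated_dimensions(vary_value)


only_datetime = negotiated_dimensions("negotiate, accept-datetime")
datetime_and_media_type = negotiated_dimensions("negotiate, accept-datetime, accept")
```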
421 2.1.3. Location
423 The "Location" header is used as defined in RFC 2616 [RFC2616].
424 Examples are given in Section 3 below.
426 2.1.4. Link
428 The "Link" response header is specified in RFC5988 [RFC5988]. The
429 Memento framework introduces new Relation Types to convey typed links
430 among Original Resources, TimeGates, Mementos, and TimeMaps. Already
431 existing Relation Types, among others, aimed at supporting navigation
432 among a series of ordered resources may also be used in the Memento
433 framework. This is detailed in Link Header Relation Types
434 (Section 2.2), below.
436 2.2. Link Header Relation Types
438 The "Link" header specified in RFC5988 [RFC5988] is semantically
439 equivalent to the "<LINK>" element in HTML, as well as the "atom:link"
440 feed-level element in Atom RFC 4287 [RFC4287]. By default, the
441 origin of a link expressed by an entry in a "Link" header (named
442 Context IRI in RFC5988 [RFC5988]) is the IRI of the requested
443 resource. This default can be overridden using the "anchor"
444 attribute in the entry.
446 2.2.1. Memento Framework Relation Types
448 The Relation Types used in the Memento framework are listed in the
449 remainder of this section, and their use is summarized in the table
450 below. Appendix A shows a Memento request/response cycle that uses
451 all the Relation Types that are introduced here.
453 +----------+-------------------+---------------------+--------------+
454 | Relation | Original Resource | TimeGate            | Memento      |
455 | Type     |                   |                     |              |
456 +----------+-------------------+---------------------+--------------+
457 | original | NA, except see    | REQUIRED, 1         | REQUIRED, 1  |
458 |          | Section 3.1.2.1   |                     |              |
459 | timegate | RECOMMENDED, 0 or | REQUIRED, 1 in case | RECOMMENDED, |
460 |          | more              | of Section 3.2.2.4  | 0 or more    |
461 | timemap  | NA                | RECOMMENDED, 0 or   | RECOMMENDED, |
462 |          |                   | more                | 0 or more    |
463 | memento  | NA, except see    | REQUIRED, 1 or more | REQUIRED, 1  |
464 |          | Section 3.1.2.1   |                     | or more      |
465 +----------+-------------------+---------------------+--------------+
467 Table 1: The use of Relation Types
469 2.2.1.1. Relation Type "original"
471 "original" -- A "Link" header entry with a Relation Type of
472 "original" is used to point from a TimeGate or a Memento to their
473 associated Original Resource. In both cases, an entry with the
474 "original" Relation Type MUST occur exactly once in a "Link" header.
475 Details for the entry are as follows:
477 o Context IRI: URI-G, URI-M
479 o Target IRI: URI-R
481 o Relation Type: "original"
483 o Use: REQUIRED
485 o Cardinality: 1
487 2.2.1.2. Relation Type "timegate"
489 "timegate" -- A "Link" header entry with a Relation Type of
490 "timegate" is used to point from either an Original Resource or a
491 Memento to a TimeGate for the Original Resource. In both cases, the
492 use of an entry with the "timegate" Relation Type is RECOMMENDED.
493 Since more than one TimeGate can exist for any Original Resource,
494 multiple entries with a "timegate" Relation Type MAY occur, each with
495 a distinct Target IRI. Since a TimeGate has no mime type, the "type"
496 attribute MUST NOT be used on Links with a "timegate" Relation Type.
497 Details for the entry are as follows:
499 o Context IRI: URI-R or URI-Mj
501 o Target IRI: URI-G
503 o Relation Type: "timegate"
505 o Use: RECOMMENDED
507 o Cardinality: 0 or more
509 In the special case (see Section 3.2.2.4) where a TimeGate redirects
510 to another TimeGate for the Original Resource, a "Link" header entry
511 with a Relation Type of "timegate" MUST be used to point from the
512 former to the latter.
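
To illustrate how a user agent might locate "timegate" links, the
following Python sketch extracts them from a "Link" header value. It is
a simplified, regex-based reading of the RFC 5988 syntax and does not
handle every legal header (for example, escaped quotes inside quoted
strings). The function names and the example.org / example.net URIs are
hypothetical.

```python
import re

# One link-value: <target> followed by zero or more ;name=value parameters.
_LINK = re.compile(r'<([^>]*)>((?:\s*;\s*[\w*-]+\s*=\s*(?:"[^"]*"|[^\s;,"]+))*)')
_PARAM = re.compile(r';\s*([\w*-]+)\s*=\s*(?:"([^"]*)"|([^\s;,"]+))')


def parse_link_header(value):
    """Return a list of (target URI, {attribute: value}) tuples."""
    links = []
    for target, params in _LINK.findall(value):
        attrs = {}
        for name, quoted, bare in _PARAM.findall(params):
            # Keep the first occurrence of each attribute.
            attrs.setdefault(name.lower(), quoted or bare)
        links.append((target, attrs))
    return links


def timegate_uris(value):
    """Return the Target IRIs of all links whose rel includes 'timegate'."""
    return [
        target
        for target, attrs in parse_link_header(value)
        if "timegate" in attrs.get("rel", "").lower().split()
    ]


sample_link_header = (
    '<http://arxiv.example.net/timegate/http://a.example.org>; rel="timegate", '
    '<http://arxiv.example.net/web/20010911203610/http://a.example.org>; '
    'rel="memento first"; datetime="Tue, 11 Sep 2001 20:36:10 GMT"'
)
found = timegate_uris(sample_link_header)
```

Because several TimeGates may be advertised, the sketch returns a list
and leaves the choice among them to the caller.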
514 2.2.1.3. Relation Type "timemap"
516 "timemap" -- A "Link" header entry with a Relation Type of "timemap"
517 is used to point from either a TimeGate or a Memento to a TimeMap
518 resource from which a list of Mementos known to the responding server
519 is available. Use of an entry with the "timemap" Relation Type is
520 RECOMMENDED, and, since multiple serializations of a TimeMap are
521 possible, multiple entries with a "timemap" Relation Type MAY occur,
522 each with a distinct Target IRI, and each with a MANDATORY "type"
523 attribute to convey the mime type of the TimeMap serialization.
524 Details for the entry are as follows:
526 o Context IRI: URI-G or URI-Mi
528 o Target IRI: URI-T
530 o Relation Type: "timemap"
532 o Target Attribute: "type"
534 o Use: RECOMMENDED
536 o Cardinality: 0 or more
538 Further details about TimeMap serializations are provided in
539 Section 3.4.
541 2.2.1.4. Relation Type "memento"
543 "memento" -- A "Link" header entry with a Relation Type of "memento"
544 is used to point from both a TimeGate and a Memento to various
545 Mementos for an Original Resource. This link MUST include a
546 "datetime" attribute with a value that matches the "Memento-Datetime"
547 of the Memento that is the target of the link; that is, the value of
548 the "Memento-Datetime" header that is returned when the URI of the
549 linked Memento is dereferenced. In addition, the link MAY include an
550 "embargo" attribute to convey the datetime until which the Memento
551 will remain inaccessible. The value for both the "datetime" and
552 "embargo" attributes MUST be a datetime expressed according to the
553 rfc1123-date construction rule of the BNF in Figure 1 and it MUST be
554 represented in Greenwich Mean Time (GMT). This link MAY also include
555 a "license" attribute to associate a license with the Memento; the
556 value for the "license" attribute SHOULD be a URI. The link SHOULD
557 also include a "type" attribute to convey the mime type of the
558 Memento that is the target of the link. Use of entries with the
559 "memento" Relation Type is REQUIRED and it MUST be as follows:
561 For all responses to HTTP HEAD/GET requests issued against a TimeGate
562 or a Memento in which a Memento is selected or served by the
563 responding server:
565 o One "memento" link MUST be included that has as Target IRI the URI
566 of the Memento that was selected or served;
568 o One "memento" link MUST be included that has as Target IRI the URI
569 of the temporally first Memento known to the responding server;
571 o One "memento" link MUST be included that has as Target IRI the URI
572 of the temporally most recent Memento known to the responding
573 server.
575 o One "memento" link SHOULD be included that has as Target IRI the
576 URI of the Memento that is previous to the selected Memento in the
577 temporal series of all Mementos (sorted by ascending "Memento-
578 Datetime" values) known to the server;
580 o One "memento" link SHOULD be included that has as Target IRI the
581 URI of the Memento that is next to the selected Memento in the
582 temporal series of all Mementos (sorted by ascending "Memento-
583 Datetime" values) known to the server.
585 o Other "memento" links MAY only be included if both the
586 aforementioned previous and next links are provided. Each of
587 these OPTIONAL "memento" links MUST have as Target IRI the URI of
588 a Memento other than the ones listed above.
590 For all responses to HTTP HEAD/GET requests issued against an
591 existing TimeGate or Memento in which no Memento is selected or
592 served by the responding server:
594 o One "memento" link MUST be included that has as Target IRI the URI
595 of the temporally first Memento known to the responding server;
597 o One "memento" link MUST be included that has as Target IRI the URI
598 of the temporally most recent Memento known to the responding
599 server.
601 o Other "memento" links MAY be included, and each of these OPTIONAL
602 links MUST have as Target IRI the URI of a Memento other than the
603 two listed above.
605 Note that the Target IRI of some of these links may coincide. For
606 example, if the selected Memento actually is the first Memento known
607 to the server, only three distinct "memento" links may result. The
608 value for the "datetime" attribute of these links would be the
609 datetimes of the first (equal to selected), next, and most recent
610 Memento known to the responding server.
612 The summary is as follows:
614 o Context IRI: URI-G, URI-Mj
616 o Target IRI: URI-M
618 o Relation Type: "memento"
620 o Target Attributes: "datetime", "embargo", "license"
622 o Use: REQUIRED
624 o Cardinality: 1 or more
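
The rules above for the case where a Memento is selected or served can
be sketched as follows. This Python example is non-normative. The
function name memento_link_entries, the in-memory list of Mementos, and
the example.org / example.net URIs are assumptions made for
illustration. The sketch emits only the "datetime" attribute and omits
the optional "embargo", "license", and "type" attributes.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def memento_link_entries(original_uri, mementos, selected):
    """Build Link header entries for a response that selects mementos[selected].

    mementos is a list of (URI, timezone-aware datetime) pairs sorted by
    ascending datetime; selected is an index into that list.
    """
    last = len(mementos) - 1
    wanted = [selected, 0, last]          # MUST: selected, first, most recent
    if selected > 0:
        wanted.append(selected - 1)       # SHOULD: previous
    if selected < last:
        wanted.append(selected + 1)       # SHOULD: next
    entries = [f'<{original_uri}>; rel="original"']
    # dict.fromkeys drops coinciding targets while keeping the order above.
    for index in dict.fromkeys(wanted):
        uri, when = mementos[index]
        stamp = format_datetime(when.astimezone(timezone.utc), usegmt=True)
        entries.append(f'<{uri}>; rel="memento"; datetime="{stamp}"')
    return entries


# Hypothetical archive holdings for http://a.example.org
sample = [
    ("http://arxiv.example.net/web/%d0101000000/http://a.example.org" % year,
     datetime(year, 1, 1, tzinfo=timezone.utc))
    for year in (2000, 2001, 2002, 2003)
]
first_selected = memento_link_entries("http://a.example.org", sample, 0)
link_header_value = ", ".join(first_selected)
```

When the selected Memento is also the first one known to the server, the
sketch yields three distinct "memento" links (selected, most recent, and
next), in line with the note on coinciding Target IRIs.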
626 2.2.2. Other Relation Types
628 Web Linking RFC5988 [RFC5988] allows for the inclusion of links with
629 different Relation Types but the same Target IRI, and hence the
630 Relation Types introduced by the Memento framework MAY be combined
631 with others as deemed necessary. As the "memento" Relation Type
632 focuses on conveying the datetime of a linked Memento, Relation Types
633 that allow navigating among the temporally ordered series of Mementos
634 known to a server are of particular importance. In this regard,
635 the Relation Types listed in the table below SHOULD be considered for
636 combination with the "memento" Relation Type. A distinction is made
637 between responding servers that can be categorized as systems that
638 are the focus of RFC5829 [RFC5829] (such as version control systems)
639 and others that cannot (such as Web archives). Note that, in terms
640 of RFC5829 [RFC5829], the last Memento (URI-Mn) is the version prior
641 to the latest (i.e. current) version.
643 +-----------------------------+---------------------+---------------+
644 | Memento Type                | RFC5829 system      | non RFC5829   |
645 |                             |                     | system        |
646 +-----------------------------+---------------------+---------------+
647 | First Memento (URI-M0)      | first               | first         |
648 | Last Memento (URI-Mn)       | last                | last          |
649 | Selected Memento (URI-Mj)   | NA                  | NA            |
650 | Memento prior to selected   | predecessor-version | prev          |
651 | Memento (URI-Mi)            |                     |               |
652 | Memento next to selected    | successor-version   | next          |
653 | Memento (URI-Mk)            |                     |               |
654 +-----------------------------+---------------------+---------------+
655 Table 2: The use of Relation Types
657 3. The Memento Framework, Datetime Negotiation component: HTTP
658 Interactions
660 This section describes the HTTP interactions of the Memento framework
661 for a variety of scenarios. First, Figure 2 provides a schematic
662 overview of a successful request/response chain that involves
663 datetime negotiation. Dashed lines depict HTTP transactions between
664 user agent and server. Appendix A shows these HTTP interactions in
665 detail for the case where the Original Resource resides on one
666 server, whereas both the TimeGate and the Mementos reside on another.
667 Scenarios also exist in which all these resources are on the same
668 server (for example, Content Management Systems) or on different
669 servers (for example, an aggregator of TimeGates). Note that, in
670 Step 2 and Step 6, the HTTP status code of the response is shown as
671 "200 OK", but a series of "206 Partial Content" responses could be
672 substituted without loss of generality.
674 1: UA --- HTTP GET/HEAD; Accept-Datetime: Tj ---------------> URI-R
675 2: UA <-- HTTP 200; Link: URI-G ----------------------------- URI-R
676 3: UA --- HTTP GET/HEAD; Accept-Datetime: Tj ---------------> URI-G
677 4: UA <-- HTTP 302; Location: URI-Mj; Vary; Link:
678 URI-R,URI-T,URI-M0,URI-Mn,URI-Mi,URI-Mj,URI-Mk -------- URI-G
679 5: UA --- HTTP GET URI-Mj; Accept-Datetime: Tj -------------> URI-Mj
680 6: UA <-- HTTP 200; Memento-Datetime: Tj; Link:
681 URI-R,URI-T,URI-G,URI-M0,URI-Mn,URI-Mi,URI-Mj,URI-Mk -- URI-Mj
683 Figure 2: Typical Memento request/response chain
685 o Step 1: To determine the URI of a TimeGate for an Original
686 Resource, the user agent issues an HTTP HEAD/GET request against
687 the URI of the Original Resource (URI-R).
689 o Step 2: The entity-header of the response from URI-R includes an
690 HTTP "Link" header with a Relation Type of "timegate" pointing at
691 a TimeGate (URI-G) for the Original Resource.
693 o Step 3: The user agent starts the datetime negotiation process
694 with the TimeGate by issuing an HTTP GET request against URI-G
695 that includes an "Accept-Datetime" HTTP header whose value is
696 the datetime of the desired prior state of the Original Resource.
698 o Step 4: The entity-header of the response from URI-G includes a
699 "Location" header pointing at the URI of a Memento (URI-Mj) for
700 the Original Resource. In addition, the entity-header contains an
701 HTTP "Link" header with a Relation Type of "original" pointing at
702 the Original Resource, and an HTTP "Link" header with a Relation
703 Type of "timemap" pointing at a TimeMap (URI-T). Also HTTP Links
704 pointing at various Mementos are provided using the "memento"
705 Relation Type, as specified in Section 2.2.1.4.
707 o Step 5: The user agent issues an HTTP GET request against the
708 URI-Mj of a Memento, obtained in Step 4.
710 o Step 6: The entity-header of the response from URI-Mj includes a
711 "Memento-Datetime" HTTP header with a value of the datetime of the
712 Memento. It also contains an HTTP "Link" header with a Relation
713 Type of "original" pointing at the Original Resource, with a
714 Relation Type of "timegate" pointing at a TimeGate associated with
715 the Original Resource, and with a Relation Type of "timemap"
716 pointing at a TimeMap. The state that is expressed by the
717 representation provided in the response is the state the Original
718 Resource had at the datetime expressed in the "Memento-Datetime"
719 header. This response also includes HTTP Links with a "memento"
720 Relation Type pointing at various Mementos, as specified in
721 Section 2.2.1.4.
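The six steps above can be sketched from the user agent's point of view
as follows. This is a minimal Python sketch under stated assumptions,
not a complete client: resolve_memento and rel_targets are hypothetical
names, the fetch argument is a caller-supplied function that performs
the HTTP request and returns a (status, headers) pair, and the Link
header handling is deliberately simplistic.

```python
import re


def rel_targets(link_header, rel):
    """Return Target IRIs of links whose rel value contains rel.

    Simplified parsing: assumes each link carries a quoted rel attribute.
    """
    found = []
    for target, rels in re.findall(r'<([^>]*)>[^<]*?rel="([^"]*)"', link_header):
        if rel in rels.split():
            found.append(target)
    return found


def resolve_memento(uri_r, accept_datetime, fetch):
    """Walk Steps 1 to 6; return (URI-Mj, Memento-Datetime) or None."""
    headers = {"Accept-Datetime": accept_datetime}

    # Steps 1 and 2: ask the Original Resource for a TimeGate.
    _, response = fetch("HEAD", uri_r, headers)
    timegates = rel_targets(response.get("Link", ""), "timegate")
    if not timegates:
        return None  # another discovery mechanism would be needed

    # Steps 3 and 4: negotiate with the TimeGate, expect a redirect.
    status, response = fetch("GET", timegates[0], headers)
    if status != 302 or "Location" not in response:
        return None
    uri_m = response["Location"]

    # Steps 5 and 6: retrieve the Memento and check that it says so.
    status, response = fetch("GET", uri_m, headers)
    if status == 200 and "Memento-Datetime" in response:
        return uri_m, response["Memento-Datetime"]
    return None
```

Passing the transport in as a function keeps the sketch independent of
any particular HTTP library and allows it to be exercised with canned
responses.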
723 The following sections detail the specifics of HTTP interactions with
724 Original Resources, TimeGates, Mementos, and TimeMaps under various
725 conditions.
727 3.1. Interactions with an Original Resource
729 This section details HTTP GET/HEAD requests targeted at an Original
730 Resource (URI-R).
732 3.1.1. Step 1: User Agent Requests an Original Resource
734 In order to try and discover a TimeGate for the Original Resource,
735 the user agent SHOULD issue an HTTP HEAD or GET request against the
736 Original Resource's URI. Use of the "Accept-Datetime" header in the
737 HTTP HEAD/GET request is OPTIONAL.
739 Figure 3 shows the use of HTTP HEAD indicating the user agent is not
740 interested in retrieving a representation of the Original Resource,
741 but only in determining a TimeGate for it. It also shows the use of
742 the "Accept-Datetime" header anticipating that the user agent will
743 set it for the entire duration of a Memento request/response cycle.
745 HEAD / HTTP/1.1
746 Host: a.example.org
747 Accept-Datetime: Tue, 11 Sep 2001 20:35:00 GMT
748 Connection: close
749 Figure 3: User Agent Requests Original Resource
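A user agent needs to serialize its preferred datetime in the same
format as shown in Figure 3. The following Python sketch uses the
standard library for this; the function names accept_datetime_value and
head_request_headers are hypothetical, and the sketch assumes the
caller supplies a timezone-aware datetime.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def accept_datetime_value(moment):
    """Format an aware datetime as an RFC 1123 date expressed in GMT."""
    return format_datetime(moment.astimezone(timezone.utc), usegmt=True)


def head_request_headers(host, moment):
    """Build the request headers used in Figure 3 (illustrative only)."""
    return {
        "Host": host,
        "Accept-Datetime": accept_datetime_value(moment),
        "Connection": "close",
    }


# The datetime used in Figure 3.
preferred = datetime(2001, 9, 11, 20, 35, 0, tzinfo=timezone.utc)
headers = head_request_headers("a.example.org", preferred)
```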
751 3.1.2. Step 2: Server Responds to a Request for an Original Resource
753 The response of the Original Resource's server to the user agent's
754 HTTP HEAD/GET request of Step 1, for the case where the Original
755 Resource exists, is as it would be in a regular HTTP request/response
756 cycle, but in addition MAY include an HTTP "Link" header with a
757 Relation Type of "timegate" that conveys the URI of the Original
758 Resource's TimeGate as the Target IRI of the Link. Multiple HTTP
759 Links with a relation type of "timegate" MAY be provided to
760 accommodate situations in which the server is aware of multiple
761 TimeGates for an Original Resource. The actual Target IRI provided
762 in the "timegate" Link may depend on several factors including the
763 datetime provided in the "Accept-Datetime" header, and the IP address
764 of the user agent. A response for this case is illustrated in
765 Figure 4.
767 HTTP/1.1 200 OK
768 Date: Thu, 21 Jan 2010 00:02:12 GMT
769 Server: Apache
770 Link:
771 ; rel="timegate"
772 Content-Length: 255
773 Connection: close
774 Content-Type: text/html; charset=iso-8859-1
776 Figure 4: Server of Original Resource Responds
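To act on such a response, a user agent has to extract the Target IRIs
of "timegate" links from the "Link" header. The Python sketch below
shows one way to do so. It is a simplified parser written for this
illustration, not a complete implementation of the Web Linking syntax;
parse_link_header and timegate_uris are hypothetical names, and the
TimeGate URI in the usage line is only an example value.

```python
import re

# One link: <target> followed by zero or more ;name=value parameters.
_LINK_RE = re.compile(
    r'<([^>]*)>((?:\s*;\s*[\w-]+\s*=\s*(?:"[^"]*"|[^\s;,]+))*)'
)
_PARAM_RE = re.compile(r';\s*([\w-]+)\s*=\s*(?:"([^"]*)"|([^\s;,]+))')


def parse_link_header(value):
    """Return a list of (target, attributes) pairs.

    Quoted attribute values may contain commas, as datetime values do.
    """
    links = []
    for target, params in _LINK_RE.findall(value):
        attrs = {}
        for name, quoted, bare in _PARAM_RE.findall(params):
            attrs[name.lower()] = quoted or bare
        links.append((target, attrs))
    return links


def timegate_uris(link_header):
    """Return the Target IRIs of all links with a "timegate" rel."""
    return [
        target
        for target, attrs in parse_link_header(link_header)
        if "timegate" in attrs.get("rel", "").split()
    ]


example = '<http://arxiv.example.net/timegate/http://a.example.org>; rel="timegate"'
uris = timegate_uris(example)
```

Because a server may provide several "timegate" links, the helper
returns a list and leaves the choice among them to the caller.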
778 Servers that actively maintain archives of their resources SHOULD
779 include the "timegate" HTTP "Link" header because this link is an
780 important way for a user agent to discover TimeGates for those
781 resources. This includes servers such as Content Management Systems,
782 Version Control Systems, and Web servers with associated
783 transactional archives Fitch [Fitch]. Servers that do not actively
784 maintain archives of their resources MAY include the "timegate" HTTP
785 "Link" header as a way to convey a preference for TimeGates for their
786 resources exposed by a third party archive. This includes servers
787 that rely on Web archives such as the Internet Archive to archive
788 their resources.
790 The server of the Original Resource MUST treat requests with and
791 without an "Accept-Datetime" header in the same way:
793 o The response MUST either always or never include an HTTP "Link"
794 header with an entry that has a "timegate" Relation Type and the
795 URI of a TimeGate as the Target IRI.
797 o The entity-body of the response MUST be the same, for user agent
798 requests with or without an "Accept-Datetime" header.
800 3.1.2.1. Original Resource is an Appropriate Memento
802 The "Memento-Datetime" header MAY be applied to an Original Resource
803 directly to indicate it is a FixedResource (see W3C.gen-ont-20090420
804 [W3C.gen-ont-20090420]), meaning that the state of the Original
805 Resource has not changed since the datetime conveyed in the
806 "Memento-Datetime" header, and as a promise that it will not change
807 after it. This may occur, for example, for certain stable media
808 resources on news sites. In case the user agent's preferred datetime
809 is equal to or more recent than the datetime conveyed as the value of
810 "Memento-Datetime" in the server's response in Step 2, the user agent
811 SHOULD conclude it has located an appropriate Memento, and it SHOULD
812 NOT continue to Step 3.
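The user agent's decision described above amounts to a comparison of
two datetimes. The following Python sketch illustrates it; the function
name original_is_appropriate_memento is hypothetical, and the response
headers are assumed to be available as a plain dictionary keyed by the
exact header name.

```python
from email.utils import parsedate_to_datetime


def original_is_appropriate_memento(response_headers, preferred):
    """Decide whether the Step 2 response already is a suitable Memento.

    preferred is the user agent's "Accept-Datetime" value as a string.
    Returns True when the Original Resource carries a "Memento-Datetime"
    header whose value is not later than the preferred datetime.
    """
    memento_datetime = response_headers.get("Memento-Datetime")
    if memento_datetime is None:
        return False  # a regular Original Resource: continue to Step 3
    return parsedate_to_datetime(preferred) >= parsedate_to_datetime(
        memento_datetime
    )
```

With a "Memento-Datetime" of Fri, 20 Mar 2009 11:00:00 GMT, a preferred
datetime in 2010 would end the cycle at Step 2, whereas a preferred
datetime in 2001 would not.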
814 Figure 5 illustrates such a response to a request for the resource
815 with URI http://a.example.org/pic that has been stable since it was
816 created. Note the use of both the "memento" and "original" Relation
817 Types for links that have as Target IRI the URI of the Original
818 Resource.
820 HTTP/1.1 200 OK
821 Date: Thu, 21 Jan 2010 00:02:12 GMT
822 Server: Apache
823 Link:
824
825 ; rel="original memento"
826 ; datetime="Fri, 20 Mar 2009 11:00:00 GMT"
827 Memento-Datetime: Fri, 20 Mar 2009 11:00:00 GMT
828 Content-Length: 255
829 Connection: close
830 Content-Type: text/html; charset=iso-8859-1
832 Figure 5: Response to a request for an Original Resource that was
833 created as a FixedResource
835 Cases may also exist in which a resource becomes stable at a certain
836 point in its existence, but changed previously. In such cases, the
837 Original Resource may know about a TimeGate that is aware of its
838 prior history and hence MAY also include a link with a "timegate"
839 Relation Type. This is illustrated in Figure 6, where the "memento"
840 and "original" Relation Types are used as in Figure 5, and the
841 existence of a TimeGate to negotiate for Mementos with datetimes
842 prior to Fri, 20 Mar 2009 11:00:00 GMT is indicated.
844 HTTP/1.1 200 OK
845 Date: Thu, 21 Jan 2010 00:02:12 GMT
846 Server: Apache
847 Link:
848
849 ; rel="original memento"
850 ; datetime="Fri, 20 Mar 2009 11:00:00 GMT",
851
852 ; rel="timegate"
853 Memento-Datetime: Fri, 20 Mar 2009 11:00:00 GMT
854 Content-Length: 255
855 Connection: close
856 Content-Type: text/html; charset=iso-8859-1
858 Figure 6: Response to a request for an Original Resource that became
859 a FixedResource
861 3.1.2.2. Server Exists and Original Resource Used to Exist
863 Servers SHOULD also provide a "timegate" HTTP "Link" header in
864 responses to requests for an Original Resource that the server knows
865 used to exist, but no longer does. This allows the use of an
866 Original Resource's URI as an entry point to representations of its
867 prior states even if the resource itself no longer exists. A
868 server's response for this case is illustrated in Figure 7.
870 HTTP/1.1 404 Not Found
871 Date: Thu, 21 Jan 2010 00:02:12 GMT
872 Server: Apache
873 Link:
874
875 ; rel="timegate"
876 Content-Length: 255
877 Connection: close
878 Content-Type: text/html; charset=iso-8859-1
880 Figure 7: Response to a request for an Original Resource that no
881 longer exists
883 In case the server is not aware of the prior existence of the
884 Original Resource, its response SHOULD NOT include a "timegate" HTTP
885 Link. Section 3.1.2.3 details what the user agent's behavior should
886 be in such cases.
888 3.1.2.3. Missing or Inadequate "timegate" Link in Original Server's
889 Response
891 A user agent MAY ignore the TimeGate returned in Step 2. However,
892 when engaging in a Memento request/response cycle, a user agent
893 SHOULD NOT proceed immediately to Step 3 by using a TimeGate of its
894 own preference but rather SHOULD always start the cycle by issuing an
895 HTTP GET/HEAD against the Original Resource (Step 1, Figure 3) as it
896 is an important way to learn about dedicated or preferred TimeGates
897 for the Original Resource. Also, cases exist in which the response
898 in Step 2 will not provide a "timegate" link, including:
900 o The Original Resource's server does not support the Memento
901 framework;
903 o The Original Resource no longer exists and the responding server
904 is not aware of its prior existence;
906 o The server that hosted the Original Resource no longer exists.
908 In all these cases, the user agent SHOULD attempt to determine an
909 appropriate TimeGate for the Original Resource, either automatically
910 or interactively supported by the user. The discovery mechanisms
911 described in Section 4 can support the user agent in this regard.
913 3.2. Interactions with a TimeGate
915 This section details HTTP GET/HEAD requests targeted at a TimeGate
916 (URI-G).
918 3.2.1. Step 3: User Agent Negotiates with a TimeGate
920 In order to negotiate with a TimeGate, the user agent MUST issue an
921 HTTP HEAD or GET against its URI, its request MUST include the
922 "Accept-Datetime" header to express its datetime preference, and the
923 use of that header MUST be as described in Section 2.1.1.1. The URI
924 of the TimeGate may have been provided as the Target IRI of a
925 "timegate" HTTP "Link" header in the response from the Original
926 Resource (Step 2, Figure 4), or may have resulted from another
927 discovery mechanism (see Section 4) or user interaction. Such a
928 request is illustrated in Figure 8.
930 GET /timegate/http://a.example.org HTTP/1.1
931 Host: arxiv.example.net
932 Accept-Datetime: Tue, 11 Sep 2001 20:35:00 GMT
933 Connection: close
935 Figure 8: User agent negotiates with TimeGate
937 3.2.2. Step 4: Server Responds to Negotiation with TimeGate
939 In order to respond to a datetime negotiation request (Step 3,
940 Section 3.2.1), the server uses an internal algorithm to select the
941 Memento that best meets the user agent's datetime preference, and
942 redirects to it. The exact nature of the selection algorithm is at
943 the server's discretion but SHOULD be consistent. A variety of
944 approaches can be used including selecting the Memento that is
945 nearest in time (either past or future) or nearest in the past
946 relative to the requested datetime. Special cases for datetime
947 negotiation with a TimeGate exist, and they are addressed in
948 Section 3.2.2.2 through Section 3.2.2.9.
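The two selection approaches mentioned above can be sketched in Python
as follows. The selection algorithm is at the server's discretion, so
this is only one possible reading: select_nearest and
select_nearest_past are hypothetical names, ties are broken in favor of
the earlier Memento by assumption, and the fallback to the first
Memento for a preferred datetime that precedes the whole series is
likewise an assumption of this sketch.

```python
from bisect import bisect_right
from datetime import datetime, timezone


def select_nearest(datetimes, preferred):
    """Nearest in time, past or future; ties go to the earlier Memento."""
    return min(datetimes, key=lambda d: (abs(d - preferred), d))


def select_nearest_past(datetimes, preferred):
    """Latest Memento not after preferred, else the first Memento."""
    ordered = sorted(datetimes)
    index = bisect_right(ordered, preferred)
    return ordered[index - 1] if index else ordered[0]


def utc(*fields):
    """Shorthand for building aware datetimes in this example."""
    return datetime(*fields, tzinfo=timezone.utc)


# Three Memento datetimes around the preferred datetime 20:35:00.
known = [
    utc(2001, 9, 11, 20, 30, 51),
    utc(2001, 9, 11, 20, 36, 10),
    utc(2001, 9, 11, 20, 47, 33),
]
preferred = utc(2001, 9, 11, 20, 35, 0)
nearest = select_nearest(known, preferred)         # 70 s after preferred
nearest_past = select_nearest_past(known, preferred)  # 249 s before
```

Whichever approach a server adopts, applying it uniformly to every
request is what makes the selection consistent.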
950 3.2.2.1. Successful Scenario
952 In cases where the TimeGate exists, and the datetime provided in the
953 user agent's "Accept-Datetime" header can be parsed and does not
954 contain an interval indicator, the server selects a Memento based on
955 the user agent's datetime preference. The response MUST have a "302
956 Found" HTTP status code, and the "Location" header MUST be used to
957 convey the URI of the selected Memento. The "Vary" header MUST be
958 provided and it MUST include the "negotiate" and "accept-datetime"
959 values to indicate that datetime negotiation has taken place. The
960 "Link" header MUST be provided and contain links with Relation Types
961 subject to the considerations described in Section 2.2. Such a
962 response is illustrated in Figure 9.
964 HTTP/1.1 302 Found
965 Date: Thu, 21 Jan 2010 00:06:50 GMT
966 Server: Apache
967 Vary: negotiate, accept-datetime
968 Location:
969 http://arxiv.example.net/web/20010911203610/http://a.example.org
970 Link: ; rel="original",
971
972 ; rel="timemap"; type="application/link-format",
973
974 ; rel="first memento"; datetime="Fri, 15 Sep 2000 11:28:26 GMT",
975
976 ; rel="last memento"; datetime="Tue, 08 Jul 2008 09:34:33 GMT",
977
978 ; rel="memento"; datetime="Tue, 11 Sep 2001 20:36:10 GMT",
979
980 ; rel="prev memento"; datetime="Tue, 11 Sep 2001 20:30:51 GMT",
981
982 ; rel="next memento"; datetime="Tue, 11 Sep 2001 20:47:33 GMT"
983 Content-Length: 0
984 Content-Type: text/plain; charset=UTF-8
985 Connection: close
987 Figure 9: Server of TimeGate responds
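On the server side, a response of this kind can be assembled from the
selected Memento and the links the TimeGate wishes to convey. The
Python sketch below is illustrative only; link_value and
timegate_redirect are hypothetical names, and the URIs used in the
usage lines are example values.

```python
def link_value(target, rel, **attrs):
    """Serialize one link, e.g. <uri>; rel="memento"; datetime="..."."""
    parts = [f"<{target}>", f'rel="{rel}"']
    parts += [f'{name}="{value}"' for name, value in attrs.items()]
    return "; ".join(parts)


def timegate_redirect(selected_uri, links):
    """Return (status, headers) for a successful datetime negotiation."""
    return 302, {
        # Both values signal that datetime negotiation has taken place.
        "Vary": "negotiate, accept-datetime",
        "Location": selected_uri,
        "Link": ", ".join(links),
        "Content-Length": "0",
    }


selected = "http://arxiv.example.net/web/20010911203610/http://a.example.org"
status, headers = timegate_redirect(
    selected,
    [
        link_value("http://a.example.org", "original"),
        link_value(selected, "memento", datetime="Tue, 11 Sep 2001 20:36:10 GMT"),
    ],
)
```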
989 Note that if a user agent's "Accept-Datetime" header does not convey
990 an interval indicator, and conveys a datetime that is either earlier
991 than the datetime of the first Memento or later than the datetime of
992 the most recent Memento known to the server, the server's response is
993 as just described yet entails the selection of the first or most
994 recent Memento, respectively. This approach is consistent with
995 interpreting the absence of an interval indicator in the user
996 agent's request as an indication of an infinite interval around its
997 preferred datetime (see Section 2.1.1.1).
999 This is illustrated in Figure 10, which shows the response from a
1000 TimeGate exposed by a MediaWiki server to a request by a user agent
1001 that has an "Accept-Datetime: Mon, 31 May 1999 00:00:00 GMT" header.
1002 Note that a link is provided with a "successor-version" Relation Type
1003 but not with a "predecessor-version" Relation Type.
1005 HTTP/1.1 302 Found
1006 Server: Apache
1007 Content-Length: 709
1008 Content-Type: text/html; charset=utf-8
1009 Date: Thu, 21 Jan 2010 00:09:40 GMT
1010 Location:
1011 http://a.example.org/w/index.php?title=Clock&oldid=1493688
1012 Vary: negotiate, accept-datetime
1013 Link: ; rel="original",
1014
1015 ; rel="timemap",
1016
1017 ; rel="first memento"; datetime="Sun, 28 Sep 2003 01:42:00 GMT",
1018
1019 ; rel="successor-version memento"
1020 ; datetime="Tue, 30 Sep 2003 14:28:00 GMT",
1021
1022 ; rel="last memento"; datetime="Tue, 12 Jan 2010 19:55:00 GMT"
1023 Connection: close
1025 Figure 10: A TimeGate's response to a request for a Memento with a
1026 datetime earlier than that of the first Memento
1028 3.2.2.2. Accept-Datetime with Interval Indicator Provided
1030 In case, in Step 3, the datetime provided in the user agent's
1031 "Accept-Datetime" header can be parsed, and contains an interval
1032 indicator, the response depends on whether the server is or is not
1033 aware of Mementos with datetimes within the expressed interval. If
1034 the server is aware of such Mementos, the server's response MUST be
1035 as in Section 3.2.2.1.
1037 However, if the responding server is not aware of any Mementos with
1038 "Memento-Datetime" values within the expressed interval, the server's
1039 response MUST have a "406 Not Acceptable" HTTP status code. The use
1040 of the "Vary" header MUST be as described in Section 3.2.2.1. The
1041 use of the "Link" header MUST be as described in Section 2.2.
1042 Specifically, the use of links with a "memento" Relation Type MUST
1043 follow the rules for the case where no Memento is selected by the
1044 responding server (Section 2.2.1.4) and it is RECOMMENDED that the
1045 server provides "memento" links pointing at Mementos that have
1046 "Memento-Datetime" values in the temporal vicinity of the interval
1047 expressed by the client.
1049 As a result, a user agent that allows for the provision of an
1050 interval indicator in requests SHOULD anticipate possible "406 Not
1051 Acceptable" responses and provide the capability for their
1052 resolution. For example, the client can leverage the "memento" links
1053 returned by the responding server, can extend its preferred interval,
1054 or can remove it from further requests.
1056 Figure 11 shows a user agent using an "Accept-Datetime" header
1057 conveying an interval of interest starting 5 hours before and ending
1058 6 hours after Tue, 11 Sep 2001 20:35:00 GMT. Figure 12 shows the
1059 "406 Not Acceptable" response from the TimeGate that has links to the
1060 first and last Memento, as well as to two Mementos, one on either
1061 temporal side of the user agent's preferred interval.
1063 GET /timegate/http://a.example.org HTTP/1.1
1064 Host: arxiv.example.net
1065 Accept-Datetime: Tue, 11 Sep 2001 20:35:00 GMT; -PT5H;+PT6H
1066 Connection: close
1068 Figure 11: User agent expresses interval of interest in Accept-
1069 Datetime header
1071 HTTP/1.1 406 Not Acceptable
1072 Date: Thu, 21 Jan 2010 00:06:50 GMT
1073 Server: Apache
1074 Vary: negotiate, accept-datetime
1075 Link: ; rel="original",
1076
1077 ; rel="timemap";type="application/link-format",
1078
1079 ; rel="memento first"; datetime="Fri, 15 Sep 2000 11:28:26 GMT",
1080
1081 ; rel="memento last"; datetime="Tue, 08 Jul 2008 09:34:33 GMT",
1082
1083 ; rel="memento"; datetime="Mon, 10 Sep 2001 08:22:00 GMT",
1084
1085 ; rel="memento"; datetime="Wed, 12 Sep 2001 03:41:00 GMT"
1086 Content-Length: 1732
1087 Connection: close
1088 Content-Type: text/plain; charset=UTF-8
1090 Figure 12: A TimeGate's response indicating it has no Mementos within
1091 the interval of interest
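The choice between a "302 Found" and a "406 Not Acceptable" response
can be sketched in Python as follows. The sketch assumes the interval
syntax seen in Figure 11 (a datetime followed by a "-" duration and a
"+" duration, separated by semicolons) and supports only a limited
subset of duration forms; parse_duration, parse_accept_datetime, and
negotiate_status are hypothetical names.

```python
import re
from datetime import timedelta
from email.utils import parsedate_to_datetime

# Limited duration subset assumed here: days, hours, minutes, seconds.
_DURATION_RE = re.compile(
    r"^P(?:(\d+)D)?(?:T(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?)?$"
)


def parse_duration(text):
    """Parse a duration such as PT5H or P1DT12H into a timedelta."""
    match = _DURATION_RE.match(text)
    if not match or text in ("P", "PT"):
        raise ValueError(f"unsupported duration: {text!r}")
    days, hours, minutes, seconds = (int(g or 0) for g in match.groups())
    return timedelta(days=days, hours=hours, minutes=minutes, seconds=seconds)


def parse_accept_datetime(value):
    """Return (preferred, start, end); start and end are None when the
    header carries no interval indicator."""
    parts = [part.strip() for part in value.split(";")]
    preferred = parsedate_to_datetime(parts[0])
    if len(parts) == 1:
        return preferred, None, None
    if len(parts) != 3 or parts[1][:1] != "-" or parts[2][:1] != "+":
        raise ValueError("malformed interval indicator")
    before = parse_duration(parts[1][1:])
    after = parse_duration(parts[2][1:])
    return preferred, preferred - before, preferred + after


def negotiate_status(value, memento_datetimes):
    """Return 302 when a Memento can be selected, 406 otherwise."""
    _, start, end = parse_accept_datetime(value)
    if start is None:
        return 302  # no interval: treated as an infinite interval
    inside = any(start <= d <= end for d in memento_datetimes)
    return 302 if inside else 406
```

Applied to the request of Figure 11 and the two nearby Mementos listed
in Figure 12, the interval runs from 15:35:00 on 11 Sep 2001 to
02:35:00 on 12 Sep 2001, and neither Memento falls inside it.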
1093 3.2.2.3. Multiple Matching Mementos
1095 Because the finest datetime granularity expressible using the RFC
1096 1123 [RFC1123] format used in HTTP is seconds level, cases may occur
1097 in which a TimeGate server is aware of multiple Mementos that meet
1098 the user agent's datetime preference. This may occur in Content
1099 Management Systems with very high update rates. The response in this
1100 case MUST be handled as in Section 3.2.2.1, with the selection of one
1101 of the matching Mementos.
1103 As an example, Figure 13 shows a hypothetical response from a
1104 TimeGate on a MediaWiki server to a request for a Memento for the
1105 Original Resource http://a.example.org/w/Clock for which two Mementos
1106 exist for the user agent's preferred datetime.
1108 HTTP/1.1 302 Found
1109 Server: Apache
1110 Content-Length: 705
1111 Content-Type: text/html; charset=utf-8
1112 Date: Thu, 21 Jan 2010 00:09:40 GMT
1113 Vary: negotiate, accept-datetime
1114 Location:
1115 http://a.example.org/w/index.php?title=Clock&oldid=322586071
1116 Link: ; rel="original",
1117
1118 ; rel="timemap";type="application/link-format",
1119
1120 ; rel="first memento"; datetime="Sun, 28 Sep 2003 01:42:00 GMT",
1121
1122 ; rel="last memento"; datetime="Tue, 12 Jan 2010 19:55:00 GMT",
1123
1124 ; rel="memento"; datetime="Sun, 31 May 2009 15:43:00 GMT",
1125
1126 ; rel="memento successor-version"
1127 ; datetime="Sun, 31 May 2009 15:43:00 GMT",
1128
1129 ; rel="memento predecessor-version"
1130 ; datetime="Sun, 31 May 2009 15:41:24 GMT"
1131 Connection: close
1133 Figure 13: A TimeGate's response to a request that has multiple
1134 Mementos with a matching datetime
1136 3.2.2.4. TimeGate Redirects to another TimeGate
1138 Cases may exist in which a TimeGate's response entails a redirect to
1139 another TimeGate, for example, because the responding TimeGate is
1140 aware that the other TimeGate is able to more precisely respond to a
1141 client's datetime preference. In such cases, the TimeGate's response
1142 MUST have a "302 Found" HTTP status code, and the "Location" header
1143 MUST be used to convey the URI of the other TimeGate. The "Vary"
1144 header MUST be provided and it MUST include the "negotiate" and
1145 "accept-datetime" values to indicate that, although datetime
1146 negotiation has not taken place, the responding TimeGate is capable
1147 of it. The "Link" header MUST be provided and contain links with
1148 Relation Types subject to the considerations described in
1149 Section 2.2. Specifically, the use of links with a "memento"
1150 Relation Type MUST follow the rules for the case where no Memento is
1151 selected by the responding server (Section 2.2.1.4). Also, a link
1152 with a "timegate" Relation Type MUST be provided that has as Target
1153 IRI the URI of the TimeGate to which the current TimeGate is
1154 redirecting the client.
1156 A response in which the client is redirected by TimeGate
1157 http://arxiv.example.net/timegate/http://a.example.org to TimeGate
1158 http://otherarxiv.example.com/timegate/http://a.example.org for the
1159 Original Resource http://a.example.org is illustrated in Figure 14.
1160 Note the URI of the latter TimeGate in both the "Location" and "Link"
1161 header, in the latter case as the Target IRI of a "timegate" link.
1162 Note also that the "memento" and "timemap" links in this response
1163 reflect the knowledge of the responding TimeGate, not of the remote
1164 TimeGate.
1166 HTTP/1.1 302 Found
1167 Date: Thu, 21 Jan 2010 00:06:50 GMT
1168 Server: Apache
1169 Vary: negotiate, accept-datetime
1170 Location:
1171 http://otherarxiv.example.com/timegate/http://a.example.org
1172 Link: ; rel="original",
1173
1174 ; rel="timemap"; type="application/link-format",
1175
1176 ; rel="first memento"; datetime="Fri, 15 Sep 2000 11:28:26 GMT",
1177
1178 ; rel="last memento"; datetime="Tue, 08 Jul 2008 09:34:33 GMT",
1179
1180 ; rel="timegate"
1181 Content-Length: 0
1182 Content-Type: text/plain; charset=UTF-8
1183 Connection: close
1185 Figure 14: TimeGate redirects to another TimeGate
1187 3.2.2.5. Accept-Datetime and other Accept Headers Provided
1189 When interacting with a TimeGate, the regular content negotiation
1190 dimensions (media type, character encoding, language, and
1191 compression) remain available. It is the TimeGate server's
1192 responsibility to honor (or not) such content negotiation, and in
1193 doing so it MUST always first select a Memento that meets the user
1194 agent's datetime preference, and then consider honoring regular
1195 content negotiation for it. As a result of this approach, the
1196 returned Memento will not necessarily meet the user agent's regular
1197 content negotiation preferences. Therefore, it is RECOMMENDED that
1198 the server provides HTTP Links with a "memento" Relation Type
1199 pointing at Mementos that do meet the user agent's regular content
1200 negotiation requests and that have a Memento-Datetime value in the
1201 temporal vicinity of the user agent's preferred datetime value.
1203 3.2.2.6. Accept-Datetime Unparseable
1205 In case, in Step 3, a user agent conveys a value for the "Accept-
1206 Datetime" request header that does not conform to the accept-dt-value
1207 construction rule of the BNF in Figure 1, the TimeGate server's
1208 response MUST have a "400 Bad Request" HTTP status code. In all
1209 other respects, responses in this case MUST be handled as described
1210 in Section 3.2.2.2.
1212 3.2.2.7. Accept-Datetime Not Provided
1214 In case, in Step 3, a user agent issues a request to a TimeGate and
1215 fails to include an "Accept-Datetime" request header, the response
1216 MUST be handled as in Section 3.2.2.1, with a selection of the most
1217 recent Memento known to the responding server.
1219 3.2.2.8. TimeGate Does Not Exist
1221 Cases may occur in which a user agent issues a request against a
1222 TimeGate that does not exist. This may, for example, occur when a
1223 user agent uses internal knowledge to construct the URI of an
1224 assumed, yet non-existent TimeGate. In these cases, the response
1225 from the target server MUST have a "404 Not Found" HTTP status code,
1226 and SHOULD include a "Vary" header that includes the "negotiate" and
1227 "accept-datetime" values as an indication that, generally, the server
1228 is capable of datetime negotiation. The response MUST NOT include a
1229 "Link" header with any of the Relation Types introduced in
1230 Section 2.2.1.
1232 3.2.2.9. HTTP Methods other than HEAD/GET
1234 In the above, the safe HTTP methods GET and HEAD are described for
1235 TimeGates. TimeGates MAY support the safe HTTP methods OPTIONS and
1236 TRACE in the way described in RFC 2616 [RFC2616]. Unsafe HTTP
1237 methods (e.g. PUT, POST, DELETE) MUST NOT be supported by a
1238 TimeGate. Such requests MUST yield a response with a "405 Method Not
1239 Allowed" HTTP status code, and MUST include an "Allow" header to
1240 convey that only the HEAD and GET (and optionally the OPTIONS and
1241 TRACE) methods are supported. In addition, the response MUST have a
1242 "Vary" header that includes the "negotiate" and "accept-datetime"
1243 values to indicate the TimeGate supports datetime negotiation.
1245 Figure 15 shows such a response.
1247 HTTP/1.1 405 Method Not Allowed
1248 Date: Thu, 21 Jan 2010 00:02:12 GMT
1249 Server: Apache
1250 Vary: negotiate, accept-datetime
1251 Allow: HEAD, GET
1252 Content-Length: 255
1253 Connection: close
1254 Content-Type: text/html; charset=iso-8859-1
1256 Figure 15: Response from a TimeGate accessed with HTTP method other
1257 than HEAD/GET
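A TimeGate implementation can enforce this rule with a small check
performed before any other processing. The Python sketch below is
illustrative; check_method and SAFE_METHODS are hypothetical names, and
the extra_allowed parameter is an assumption of this sketch for servers
that also choose to support OPTIONS or TRACE.

```python
SAFE_METHODS = ("HEAD", "GET")


def check_method(method, extra_allowed=()):
    """Return None when the method is supported, otherwise the status
    code and headers of the "405 Method Not Allowed" response."""
    allowed = SAFE_METHODS + tuple(extra_allowed)
    if method.upper() in allowed:
        return None
    return 405, {
        "Allow": ", ".join(allowed),
        # Signals that the resource supports datetime negotiation.
        "Vary": "negotiate, accept-datetime",
    }
```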
1259 3.2.3. Recognizing a TimeGate
1261 When a user agent issues an HTTP HEAD/GET request against a resource
1262 whose URI it found as the Target IRI of an entry in the "Link"
1263 header with a "timegate" Relation Type, it SHOULD NOT assume that the
1264 targeted resource actually is a TimeGate and hence will behave as
1265 described in Section 3.2.2.
1267 A user agent MUST decide it has reached a TimeGate if the response to
1268 an HTTP HEAD/GET request against the resource's URI contains a "Vary"
1269 header that includes the "negotiate" and "accept-datetime" values.
1270 If the response does not, the user agent MUST decide it has not
1271 reached a TimeGate and proceed as follows:
1273 o If the response contains a redirection, the user agent SHOULD
1274 follow it. Note that even a chain of redirections is possible,
1275 e.g. URI-R -> URI-1 -> URI-2 -> ... -> URI-G
1277 o If the response does not contain a redirection, or if the
1278 redirection (chain) does not lead to a TimeGate, the user agent
1279 SHOULD attempt to determine an appropriate TimeGate for the
1280 Original Resource, either automatically or interactively supported
1281 by the user. The discovery mechanisms described in Section 4 can
1282 support the user agent in this regard.
1284 Resources that are not TimeGates (i.e. do not behave as described in
1285 Section 3.2.2) MUST NOT use a "Vary" header that includes the
1286 "accept-datetime" value.
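The recognition rule and the redirect handling described in this
section can be sketched in Python as follows. The names is_timegate and
find_timegate are hypothetical, the fetch argument is a caller-supplied
function returning a (status, headers) pair for a URI, and the limit on
the number of redirections is an assumption of this sketch rather than
a requirement of the framework.

```python
REDIRECT_CODES = (301, 302, 303, 307, 308)


def is_timegate(headers):
    """True when "Vary" includes both "negotiate" and "accept-datetime"."""
    tokens = {token.strip().lower() for token in headers.get("Vary", "").split(",")}
    return {"negotiate", "accept-datetime"} <= tokens


def find_timegate(uri, fetch, max_hops=5):
    """Follow redirections from uri until a TimeGate is reached.

    Returns the URI of the TimeGate, or None when the chain ends at a
    resource that is not a TimeGate or exceeds max_hops.
    """
    for _ in range(max_hops):
        status, headers = fetch(uri)
        if is_timegate(headers):
            return uri
        if status in REDIRECT_CODES and "Location" in headers:
            uri = headers["Location"]
            continue
        return None
    return None
```

When find_timegate returns None, the user agent would fall back to
another way of determining a TimeGate.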
1288 3.3. Interactions with a Memento
1290 This section details HTTP GET/HEAD requests targeted at a Memento
1291 (URI-M).
1293 3.3.1. Step 5: User Agent Requests a Memento
1295 In Step 5, the user agent issues an HTTP GET request against the URI
1296 of a Memento. The user agent MAY include an "Accept-Datetime" header
1297 in this request, but the existence or absence of this header MUST NOT
1298 affect the server's response. The URI of the Memento may have
1299 resulted from a response in Step 4, or the user agent may simply have
1300 happened upon it. Such a request is illustrated in Figure 16.
1302 GET /web/20010911203610/http://a.example.org HTTP/1.1
1303 Host: arxiv.example.net
1304 Accept-Datetime: Tue, 11 Sep 2001 20:35:00 GMT
1305 Connection: close
1307 Figure 16: User agent requests Memento
1309 3.3.2. Step 6: Server Responds to a Request for a Memento
1311 If the Memento requested by the user agent in Step 5 exists, the
1312 server's response MUST have a "200 OK" HTTP status code (or "206
1313 Partial Content", where appropriate), and it MUST include a "Memento-
1314 Datetime" header with a value equal to the archival datetime of the
1315 Memento, that is, the datetime of the state of the Original Resource
1316 that is encapsulated in the Memento. The "Link" header MUST be
1317 provided and contain links subject to the considerations described in
1318 Section 2.2. The Target IRI and, when applicable, the datetime
1319 values in the "Link" header associated with the "memento" Relation
1320 Type SHOULD be the same as conveyed in Step 4, in case the TimeGate
1321 and the selected Memento reside on the same server. However, they
1322 MAY be different in case the TimeGate and the selected Memento reside
1323 on different servers.
1325 Figure 17 illustrates the server's response to the request issued
1326 against a Memento in Step 5 (Figure 16).
1328 HTTP/1.1 200 OK
1329 Date: Thu, 21 Jan 2010 00:09:40 GMT
1330 Server: Apache-Coyote/1.1
1331 Memento-Datetime: Tue, 11 Sep 2001 20:36:10 GMT
1332 Link: ; rel="original",
1333
1334 ; rel="timemap"; type="application/link-format",
1335
1336 ; rel="timegate",
1337
1338 ; rel="first memento"; datetime="Fri, 15 Sep 2000 11:28:26 GMT",
1339
1340 ; rel="last memento"; datetime="Tue, 08 Jul 2008 09:34:33 GMT",
1341
1342 ; rel="memento"; datetime="Tue, 11 Sep 2001 20:36:10 GMT",
1343
1344 ; rel="prev memento"; datetime="Tue, 11 Sep 2001 20:30:51 GMT",
1345
1346 ; rel="next memento"; datetime="Tue, 11 Sep 2001 20:47:33 GMT"
1347 Content-Length: 23364
1348 Content-Type: text/html;charset=utf-8
1349 Connection: close
1351 Figure 17: Server of Memento responds
1353 The server's response MUST include the "Memento-Datetime" header
1354 regardless of whether the user agent's request contained an
1355 "Accept-Datetime" header. This is the way by which resources make
1356 explicit that they are Mementos. Due to the sparseness of Mementos
1357 in most archives, the value of the "Memento-Datetime" header returned
1358 by a server may differ (significantly) from the value conveyed by the
1359 user agent in "Accept-Datetime".
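A user agent can use the "Memento-Datetime" header both to recognize a
Memento and to report how far the returned state lies from the
requested datetime. The Python sketch below illustrates this;
memento_offset is a hypothetical name, and the response headers are
assumed to be available as a plain dictionary keyed by the exact header
name.

```python
from email.utils import parsedate_to_datetime


def memento_offset(accept_datetime, response_headers):
    """Return Memento-Datetime minus Accept-Datetime as a timedelta,
    or None when the response does not identify itself as a Memento."""
    value = response_headers.get("Memento-Datetime")
    if value is None:
        return None
    return parsedate_to_datetime(value) - parsedate_to_datetime(accept_datetime)


# Values taken from the request in Figure 16 and the response in Figure 17.
offset = memento_offset(
    "Tue, 11 Sep 2001 20:35:00 GMT",
    {"Memento-Datetime": "Tue, 11 Sep 2001 20:36:10 GMT"},
)
```

A positive offset means the Memento is more recent than the preferred
datetime; a negative offset means it is older.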
1361 Although a Memento encapsulates a prior state of an Original
1362 Resource, the entity-body returned in response to an HTTP GET request
1363 issued against a Memento may very well not be byte-for-byte the same
1364 as an entity-body that was previously returned by that Original
1365 Resource. There are various reasons why the two are likely to differ
1366 while still conveying substantially the same information. These
1367 include format migrations as part of a digital
1368 preservation strategy, URI-rewriting as applied by some Web archives,
1369 and the addition of banners as a means to brand Web archives.
1371 3.3.2.1. Memento Does not Exist
1373 Cases may occur in which a TimeGate's response (Step 4) points at a
1374 Memento that actually does not exist, resulting in a user agent's
1375 request (Step 5) for a non-existent Memento. In this case, the
1376 server's response MUST have the expected "404 Not Found" HTTP Status
1377 Code and it MUST NOT contain a "Memento-Datetime" header.
1379 3.3.2.2. Mementos Without a TimeGate
1381 Cases may occur in which a server that hosts Mementos does not expose
1382 a TimeGate for those Mementos. This can, for example, be the case if
1383 the server's Mementos result from taking a snapshot of the state of a
1384 set of Original Resources from another server at the time this other
1385 server is being retired. As a result, only a single Memento per
1386 Original Resource is hosted, making the introduction of a TimeGate
1387 unnecessary. But it may also be the case for servers that host
1388 multiple Mementos for an Original Resource but consider exposing
1389 TimeGates too expensive.
1391 In cases of Mementos without associated TimeGates, the response to a
1392 user agent's request for a Memento MUST be as described in
1393 Section 3.3.2 with the exception that it will not contain an HTTP Link
1394 with a "timegate" Relation Type pointing at a TimeGate exposed by the
1395 responding server. It MAY still contain such a Link pointing at a
1396 TimeGate exposed elsewhere. Depending on whether one or more
1397 Mementos are hosted for an Original Resource, the response may or may
1398 not have an HTTP Link with a "timemap" Relation Type. However, the
1399 response MUST still contain a "Memento-Datetime" response header with
1400 a value that corresponds to the archival datetime of the Memento.
1402 Figure 18 illustrates the server's response to the request issued
1403 against a Memento in Step 5 (Figure 16) for the case that Memento has
1404 no associated TimeGate. In this example, it is also assumed there is
1405 only one Memento for the Original Resource, and hence the Links with
1406 Relation Types "memento", "first", "last" all point at the same -
1407 responding - Memento.
1409 HTTP/1.1 200 OK
1410 Date: Thu, 21 Jan 2010 00:09:40 GMT
1411 Server: Apache-Coyote/1.1
1412 Memento-Datetime: Tue, 11 Sep 2001 20:36:10 GMT
1413 Link: <http://a.example.org>; rel="original",
1414 <http://arxiv.example.net/web/20010911203610/http://a.example.org>
1415 ; rel="first last memento"
1416 ; datetime="Tue, 11 Sep 2001 20:36:10 GMT"
1417 Content-Length: 23364
1418 Content-Type: text/html;charset=utf-8
1419 Connection: close
1421 Figure 18: Server of Memento without TimeGate responds
1423 Note that a server issuing a response similar to that of Figure 18
1424 does not imply that there is no server whatsoever that exposes a
1425 TimeGate; it merely means that the responding server neither provides
1426 nor is aware of the location of a TimeGate.
1428 3.3.3. Recognizing a Memento
1430 When following the redirection provided by a confirmed TimeGate (see
1431 Section 3.2.3), a user agent SHOULD NOT assume that the targeted
1432 resource is in fact a Memento and will therefore behave as described
1433 in Section 3.3.2.
1435 A user agent MUST decide it has reached a Memento if the response to
1436 a HTTP HEAD/GET request against the resource's URI contains a
1437 "Memento-Datetime" header with a legitimate value. If the response
1438 does not, the following applies:
1440 o If the response contains a redirection, the user agent SHOULD
1441 follow it. Even a chain of redirections is possible, e.g. URI-G
1442 -> URI-X -> URI-Y -> ... -> URI-M.
1444 o If the response by a confirmed TimeGate does not contain a
1445 redirection, or if the redirection (chain) that started at a
1446 confirmed TimeGate does not lead to a resource that provides a
1447 "Memento-Datetime" header, the user agent MAY still conclude that
1448 it has likely arrived at a Memento. That is because cases exist
1449 in which Web archives and CMS are made compliant with the Memento
1450 framework "by proxy". In these cases TimeGates will redirect to
1451 Mementos in such systems, but the responses from these Mementos
1452 will not (yet) include a "Memento-Datetime" header.
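
The header test described above can be sketched in Python as follows.
This is a minimal sketch under two assumptions that are not part of
this document: the function name is hypothetical, and a "legitimate
value" is taken to mean any value that parses as an HTTP date.

```python
from email.utils import parsedate_to_datetime


def is_memento(headers: dict) -> bool:
    """True if the response headers carry a parseable Memento-Datetime value."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    value = lowered.get("memento-datetime")
    if value is None:
        return False
    try:
        parsedate_to_datetime(value)
    except (TypeError, ValueError):
        # Unparseable values raise TypeError or ValueError,
        # depending on the Python version.
        return False
    return True


print(is_memento({"Memento-Datetime": "Tue, 11 Sep 2001 20:36:10 GMT"}))  # True
print(is_memento({"Content-Type": "text/html"}))                          # False
```

A user agent that receives False at the end of a redirection chain
started at a confirmed TimeGate may still treat the resource as a
likely Memento, as explained in the second bullet above.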
1454 3.4. Interactions with a TimeMap
1456 A TimeMap is introduced to support retrieving a comprehensive list of
1457 all Mementos for a specific Original Resource, known to a responding
1458 server. The entity-body of a response to an HTTP GET request issued
1459 against a TimeMap's URI:
1461 o MUST list the URI of the Original Resource that the response lists
1462 Mementos for;
1464 o MUST list the URI and datetime of each Memento for the Original
1465 Resource known to the responding server;
1467 o MUST list the URI of one or more TimeGates for the Original
1468 Resource except when no TimeGate exists (see Section 3.3.2.2);
1470 o SHOULD, for self-containment, list the URI of the TimeMap itself;
1471 o MUST unambiguously type listed resources as being Original
1472 Resource, TimeGate, Memento, or TimeMap.
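
The listing rules above lend themselves to a mechanical check. The
following Python sketch is illustrative only: the function name and
the dictionary layout used for a link are assumptions, not something
this document defines. A missing TimeGate is not reported because
Section 3.3.2.2 allows that case.

```python
def timemap_problems(links: list) -> list:
    """Return violations of the listing rules; an empty list means none found.

    Each link is assumed to be a dict of the form
    {"uri": str, "rel": [str, ...], "datetime": str or None}.
    """
    known = {"original", "timegate", "memento", "timemap"}
    problems = []
    if not any("original" in link["rel"] for link in links):
        problems.append("no link typed 'original'")
    if not any("timemap" in link["rel"] for link in links):
        problems.append("no link typed 'timemap' (SHOULD, not MUST)")
    for link in links:
        # Every listed resource must carry at least one of the four types.
        if not known.intersection(link["rel"]):
            problems.append(f"untyped link: {link['uri']}")
        # Every Memento must be listed together with its datetime.
        if "memento" in link["rel"] and not link.get("datetime"):
            problems.append(f"memento without datetime: {link['uri']}")
    return problems


good = [
    {"uri": "http://a.example.org", "rel": ["original"], "datetime": None},
    {"uri": "http://arxiv.example.net/timemap/http://a.example.org",
     "rel": ["timemap"], "datetime": None},
    {"uri": "http://arxiv.example.net/web/20000620180259/http://a.example.org",
     "rel": ["first", "memento"], "datetime": "Tue, 20 Jun 2000 18:02:59 GMT"},
]
print(timemap_problems(good))  # []
```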
1474 The entity-body of a response from a TimeMap MAY be serialized in
1475 various ways, but the link-value format serialization MUST be
1476 supported. In this serialization, the entity-body MUST be formatted
1477 in the same way as the value of a HTTP "Link" header, and hence MUST
1478 comply with the "link-value" construction rule of "Section 5. The Link
1479 Header Field" of RFC5988 [RFC5988], and the media type of the entity-
1480 body MUST be "application/link-format" as introduced in I-D.ietf-
1481 core-link-format [I-D.ietf-core-link-format]. All links conveyed in
1482 this serialization MUST be interpreted as having the URI of the
1483 Original Resource as their Context IRI. The URI of the Original
1484 Resource is provided in the entity-body as the Target IRI of the link
1485 with an "original" Relation Type.
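
To make the serialization concrete, the following Python sketch reads
a link-value formatted entity-body into a list of links. It is a
simplified reader, not a full implementation of the RFC5988 grammar:
it assumes quoted values contain no escaped quotes and it ignores
extended parameters. The names parse_timemap, LINK_VALUE and PARAM
are hypothetical, and the sample URIs merely follow the URI patterns
used in the examples of this document.

```python
import re

# One link-value: <URI> followed by zero or more ;name=value parameters.
LINK_VALUE = re.compile(
    r'<([^>]*)>((?:\s*;\s*[\w-]+\s*=\s*(?:"[^"]*"|[^\s;,"]+))*)'
)
PARAM = re.compile(r';\s*([\w-]+)\s*=\s*(?:"([^"]*)"|([^\s;,"]+))')


def parse_timemap(body: str) -> list:
    """Parse a link-value serialization into a list of dicts."""
    links = []
    for uri, raw_params in LINK_VALUE.findall(body):
        params = {}
        for name, quoted, bare in PARAM.findall(raw_params):
            params[name.lower()] = quoted or bare
        links.append({
            "uri": uri,
            # rel may hold several space-separated types, e.g. "first memento".
            "rel": params.pop("rel", "").split(),
            "params": params,
        })
    return links


sample = (
    '<http://a.example.org>;rel="original",\n'
    '<http://arxiv.example.net/timegate/http://a.example.org>\n'
    '  ; rel="timegate",\n'
    '<http://arxiv.example.net/web/20000620180259/http://a.example.org>\n'
    '  ; rel="first memento";datetime="Tue, 20 Jun 2000 18:02:59 GMT"'
)
links = parse_timemap(sample)
mementos = [link for link in links if "memento" in link["rel"]]
print(len(links), len(mementos))  # 3 1
```

Matching whole link-values with a regular expression, rather than
splitting the body on commas, matters here because datetime values
such as "Tue, 20 Jun 2000 18:02:59 GMT" contain commas themselves.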
1487 3.4.1. User Agent Requests a TimeMap
1489 In order to retrieve the link-value serialization of a TimeMap, a
1490 user agent SHOULD use an "Accept" request header with a value set to
1491 "application/link-format". This is shown in Figure 19.
1493 GET /timemap/http://a.example.org HTTP/1.1
1494 Host: arxiv.example.net
1495 Accept: application/link-format;q=1.0
1496 Connection: close
1498 Figure 19: Request for a TimeMap
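
A user agent written in Python could prepare the request of Figure 19
roughly as follows. This is a sketch using the standard library; the
function name is hypothetical and the request is only built, not
sent.

```python
from urllib.request import Request


def timemap_request(timemap_uri: str) -> Request:
    """Build (but do not send) a GET request for the link-value serialization."""
    return Request(
        timemap_uri,
        headers={"Accept": "application/link-format"},
        method="GET",
    )


req = timemap_request("http://arxiv.example.net/timemap/http://a.example.org")
print(req.get_method(), req.selector)
print(req.host, req.get_header("Accept"))
```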
1500 3.4.2. Server Responds to a Request for a TimeMap
1502 If the TimeMap requested by the user agent exists, the server's
1503 response MUST have a "200 OK" HTTP status code (or "206 Partial
1504 Content", where appropriate). Note that a TimeMap is itself an
1505 Original Resource for which Mementos may exist. For example, a
1506 response from a TimeMap could provide a "timegate" Link to a TimeGate
1507 via which prior TimeMap versions are available. In this case, the
1508 use of the "Link" header is subject to all considerations described
1509 in Section 2.2, with the TimeMap acting as the Original Resource.
1511 However, in case a TimeMap wants to explicitly indicate in its
1512 response headers for which Original Resource it is a TimeMap, it MUST
1513 do so by including a HTTP "Link" header with the following
1514 characteristics:
1516 o The Context IRI for the HTTP Link is the URI of the Original
1517 Resource;
1519 o The Relation Type is "timemap";
1521 o The Target IRI for the HTTP Link is the URI of the TimeMap.
1523 Because the Context IRI of this HTTP Link is not the URI of the
1524 TimeMap, as per RFC5988 [RFC5988], the default Context IRI must be
1525 overridden by using the "anchor" attribute with a value of the URI
1526 of the Original Resource.
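
As a small illustration, a server could assemble such a header value
as sketched below. The function name is hypothetical; the attribute
order is a choice made for this example.

```python
def timemap_link_header(timemap_uri: str, original_uri: str) -> str:
    """Value of a Link header in which a TimeMap names its Original Resource."""
    return (
        f'<{timemap_uri}>'              # Target IRI: the TimeMap itself
        f'; anchor="{original_uri}"'    # Context IRI: the Original Resource
        '; rel="timemap"'
        '; type="application/link-format"'
    )


print(timemap_link_header(
    "http://arxiv.example.net/timemap/http://a.example.org",
    "http://a.example.org",
))
```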
1528 The response from the TimeMap to the request of Figure 19 is shown in
1529 Figure 20. The response header shows the TimeMap explicitly
1530 conveying the URI of the Original Resource for which it is a TimeMap;
1531 for practical reasons the entity-body in the example has been
1532 abbreviated. Notice also the use of the "license" and "embargo"
1533 attributes introduced in Section 2.2.1.4 on the "memento" links in
1534 the TimeMap.
1536 HTTP/1.1 200 OK
1537 Date: Thu, 21 Jan 2010 00:06:50 GMT
1538 Server: Apache
1539 Link: <http://arxiv.example.net/timemap/http://a.example.org>
1540 ; anchor="http://a.example.org"; rel="timemap"
1541 ; type="application/link-format"
1542 Content-Length: 4883
1543 Content-Type: application/link-format
1544 Connection: close
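1546 <http://a.example.org>;rel="original",
1547 <http://arxiv.example.net/timemap/http://a.example.org>
1548 ; rel="timemap";type="application/link-format",
1549 <http://arxiv.example.net/timegate/http://a.example.org>
1550 ; rel="timegate",
1551 <http://arxiv.example.net/web/20000620180259/http://a.example.org>
1552 ; rel="first memento";datetime="Tue, 20 Jun 2000 18:02:59 GMT"
1553 ; license="http://creativecommons.org/publicdomain/zero/1.0/",
1554 <http://arxiv.example.net/web/20091027204954/http://a.example.org>
1555 ; rel="last memento";datetime="Tue, 27 Oct 2009 20:49:54 GMT"
1556 ; license="http://creativecommons.org/publicdomain/zero/1.0/"
1557 ; embargo="Tue, 19 Apr 2011 00:00:00 GMT",
1558 <http://arxiv.example.net/web/20000621011731/http://a.example.org>
1559 ; rel="memento";datetime="Wed, 21 Jun 2000 01:17:31 GMT"
1560 ; license="http://creativecommons.org/publicdomain/zero/1.0/",
1561 <http://arxiv.example.net/web/20000621044156/http://a.example.org>
1562 ; rel="memento";datetime="Wed, 21 Jun 2000 04:41:56 GMT"
1563 ; license="http://creativecommons.org/publicdomain/zero/1.0/",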
1564 ...
1566 Figure 20: Response from a TimeMap
1568 4. The Memento Framework, Discovery Component
1570 Section 3 describes how TimeGates, Mementos, Original Resources, and
1571 TimeMaps can be discovered by following HTTP Links with Relation
1572 Types "timegate", "memento", "original", and "timemap", respectively.
1574 Naturally, some of these links can also be embedded into
1575 representations of resources that have a media type that allows
1576 embedding of typed links. For example, an Original Resource that has
1577 an HTML representation can include a "timegate" link by using HTML's
1578 LINK element, e.g. <link rel="timegate"
1579 href="http://arxiv.example.net/timegate/http://a.example.org" />.
1580 The use of such embedded links is also subject to
1581 the considerations of Section 2.2.
1583 In this section additional approaches are introduced that support
1584 batch discovery of TimeGates and Mementos. The approaches leverage
1585 the Robots Exclusion Protocol.
1587 4.1. Discovering TimeGates Via Robots Exclusion Protocol
1589 The Robots Exclusion Protocol's robots.txt file [robotstxt.org] is
1590 commonly used by Web site owners to give instructions about their
1591 site to Web robots. It is used both to protect resources hosted by a
1592 server from crawling and to facilitate discovering them. This
1593 document introduces the "TimeGate" and "Archived" directives for
1594 robots.txt to provide a server-wide mechanism to support TimeGate
1595 discovery that SHOULD be used by:
1597 o Servers of Original Resources;
1599 o Servers that provide access to Mementos of Original Resources by
1600 exposing TimeGates.
1602 A robots.txt file MAY contain zero or more occurrences of the
1603 "TimeGate" directive, and each occurrence MUST be followed by one or
1604 more associated "Archived" directives. The meaning of the directives
1605 is as follows:
1607 o TimeGate: Conveys the base URL (that is, URI scheme, host, and path
1608 component) that is shared by all URIs of TimeGates of a set of
1609 Original Resources.
1611 o Archived: Indicates - by means of mandatory host and optional path
1612 parts of a URI - for which set of Original Resources actual
1613 TimeGates are available that have the base URL conveyed in the
1614 associated TimeGate directive.
1616 For example, consider a wiki at http://a.example.org/w/ that supports
1617 the Memento framework and exposes TimeGates to access the wiki's
1618 history pages at base URL
1619 http://a.example.org/w/index.php/Special:TimeGate/. An actual
1620 TimeGate for the wiki's http://a.example.org/w/My_Title page would
1621 then be at http://a.example.org/w/index.php/Special:TimeGate/http://
1622 a.example.org/w/My_Title. This wiki SHOULD make its TimeGates
1623 discoverable by using the directives shown in Figure 21 in its
1624 robots.txt file.
1626 TimeGate: http://a.example.org/w/index.php/Special:TimeGate/
1627 Archived: a.example.org/w/
1629 Figure 21: robots.txt for a wiki, host of Original Resources,
1630 TimeGates, and Mementos
1632 As another example, consider a server of Original Resources at
1633 http://a.example.org/ and http://www.a.example.org/ that is aware
1634 that its resources are regularly crawled by a Web archive that
1635 generally exposes TimeGates at base URL
1636 http://arxiv.example.net/timegate/ and hence has TimeGate
1637 http://arxiv.example.net/timegate/http://a.example.org/ to access
1638 Mementos for http://a.example.org/. This server SHOULD make the
1639 remote TimeGates discoverable by including the directives shown in
1640 Figure 22 in its robots.txt file:
1642 TimeGate: http://arxiv.example.net/timegate/
1643 Archived: a.example.org/
1644 Archived: www.a.example.org/
1646 Figure 22: robots.txt for a server of Original Resources aware of
1647 remote TimeGates
1649 Finally, consider a Web archive that crawls a wide range of Original
1650 Resources, and exposes TimeGates to access the resulting Mementos at
1651 base URL http://arxiv.example.net/timegate/. In order to make its
1652 TimeGates discoverable, this Web archive SHOULD include the
1653 directives shown in Figure 23 in its robots.txt file:
1655 TimeGate: http://arxiv.example.net/timegate/
1656 Archived: *
1658 Figure 23: robots.txt for a Web Archive that hosts Mementos for a
1659 wide range of Original Resources
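
The robots.txt examples above can be processed mechanically. The
following Python sketch reads "TimeGate" and "Archived" directives and
derives the TimeGate URI for a given Original Resource. Two points are
assumptions made for this sketch rather than rules stated in this
document: an "Archived" value is matched as a prefix of the host and
path of the Original Resource's URI, and the TimeGate URI is formed by
appending the Original Resource's URI to the base URL, as in the wiki
example. Both function names are hypothetical.

```python
from urllib.parse import urlsplit


def parse_timegate_directives(robots_txt: str) -> list:
    """Return [(timegate_base_url, [archived_pattern, ...]), ...]."""
    groups = []
    for line in robots_txt.splitlines():
        name, sep, value = line.partition(":")  # split at the first colon only
        if not sep:
            continue
        name, value = name.strip().lower(), value.strip()
        if name == "timegate":
            groups.append((value, []))
        elif name == "archived" and groups:
            # An Archived directive belongs to the preceding TimeGate directive.
            groups[-1][1].append(value)
    return groups


def timegate_for(original_uri: str, groups: list):
    """Return the TimeGate URI for original_uri, or None if nothing matches."""
    parts = urlsplit(original_uri)
    host_and_path = parts.netloc + (parts.path or "/")
    for base, patterns in groups:
        for pattern in patterns:
            if pattern == "*" or host_and_path.startswith(pattern):
                return base + original_uri
    return None


wiki_robots = (
    "TimeGate: http://a.example.org/w/index.php/Special:TimeGate/\n"
    "Archived: a.example.org/w/\n"
)
groups = parse_timegate_directives(wiki_robots)
print(timegate_for("http://a.example.org/w/My_Title", groups))
```

For the wiki of Figure 21 this prints the TimeGate URI given in the
text for the My_Title page; for a URI on a host not covered by any
"Archived" directive the function returns None.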
1661 4.2. Discovering Mementos via Robots Exclusion Protocol
1663 Servers can support discovery of their Mementos by crawlers through
1664 the use of the Robots Exclusion Protocol, but SHOULD do so in a
1665 manner that conveys to crawlers and mirroring applications that the
1666 sticky Memento-Datetime behavior (see Section 2.1.1) MUST be
1667 respected. To that end, servers SHOULD use the "User-agent" and
1668 "Allow" directives of the Robots Exclusion Protocol in the following
1669 manner:
1671 o User-agent: Has "memento" as its value;
1673 o Allow: Lists the path that contains Mementos that can be crawled,
1674 and for which content can be mirrored subject to the sticky
1675 Memento-Datetime behavior.
1677 Figure 24 shows the robots.txt for a server that generally disallows
1678 crawling, yet allows agents that respect the sticky Memento-Datetime
1679 behavior to crawl Mementos in the /web/ path.
1681 User-agent: *
1682 Disallow: /
1683 User-agent: memento
1684 Allow: /web/
1686 Figure 24: Restricting crawling to agents that respect sticky
1687 Memento-Datetime behavior
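
How an existing robots.txt parser treats the rules of Figure 24 can be
checked with the Python standard library, as sketched below. The agent
name "otherbot" and the Memento URI are illustrative. Note that the
parser only matches the "memento" token; whether an agent using that
name actually respects the sticky Memento-Datetime behavior cannot be
verified this way.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: *
Disallow: /
User-agent: memento
Allow: /web/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

memento_url = "http://arxiv.example.net/web/20010911203610/http://a.example.org"
print(parser.can_fetch("memento", memento_url))   # True
print(parser.can_fetch("otherbot", memento_url))  # False
```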
1689 5. IANA Considerations
1691 This memo requires IANA to register the Accept-Datetime and Memento-
1692 Datetime HTTP headers defined in Section 2.1.1 in the appropriate
1693 IANA registry.
1695 This memo requires IANA to register the "Link" header Relation Types
1696 "original", "timegate", "timemap", and "memento" defined in
1697 Section 2.2.1 in the appropriate IANA registry.
1699 This memo requires IANA to register the "datetime", "license", and
1700 "embargo" attributes for Link headers with a "memento" Relation Type,
1701 as defined in Section 2.2.1.4 in the appropriate IANA registry.
1703 6. Security Considerations
1705 Provision of a "timegate" HTTP "Link" header in responses to requests
1706 for an Original Resource that is protected (e.g., 401 or 403 HTTP
1707 response codes) is OPTIONAL. The inclusion of this Link when
1708 requesting authentication is at the server's discretion; cases may
1709 exist in which a server protects the current state of a resource, but
1710 supports open access to prior states and thus chooses to supply a
1711 "timegate" HTTP "Link" header. Conversely, the server may choose to
1712 not advertise the TimeGate URIs (e.g., they exist in an intranet
1713 archive) for unauthenticated requests.
1715 Authentication, encryption and other security related issues are
1716 otherwise orthogonal to Memento.
1718 7. Changelog
1720 v03 2011-05-11 HVDS MLN RS draft-vandesompel-memento-02
1722 o Added scenario in which a TimeGate redirects to another TimeGate.
1724 o Reorganized TimeGate section to better reflect the difference
1725 between requests with and without interval indicator.
1727 o Added recommendation to provide "memento" links to Mementos in the
1728 vicinity of the preferred interval provided by the client, in case
1729 of a 406 response.
1731 o Removed TimeMap Feed material from the Discovery section as a
1732 result of discussions regarding (lack of) scalability of the
1733 approach with representatives of the International Internet
1734 Preservation Consortium. An alternative approach to support batch
1735 discovery of Mementos will be specified.
1737 v02 2011-04-28 HVDS MLN RS draft-vandesompel-memento-01
1739 o Introduced wording and reference to indicate a Memento is a
1740 FixedResource.
1742 o Introduced "Sticky Memento-Datetime" notion and clarified wording
1743 about retaining "Memento-Datetime" headers and values when a
1744 Memento is mirrored at different URI.
1746 o Introduced section about handling both datetime and regular
1747 negotiation.
1749 o Introduced section about Mementos Without TimeGate.
1751 o Made various changes in the section Relation Type "memento",
1752 including addition of "license" and "embargo" attributes, and
1753 clarification of rules regarding the use of "memento" links.
1755 o Moved section about TimeMaps inside the Datetime Negotiation
1756 section, and updated it.
1758 o Restarted the Discovery section from scratch.
1760 v01 2010-11-11 HVDS MLN RS First public version
1761 draft-vandesompel-memento-00
1763 v00 2010-10-19 HVDS MLN RS Limited circulation version
1765 2010-07-22 HVDS MLN First internal version
1767 8. Acknowledgements
1769 The Memento effort is funded by the Library of Congress. Many thanks
1770 to Kris Carpenter Negulescu, Michael Hausenblas, Erik Hetzner, Larry
1771 Masinter, Gordon Mohr, Mark Nottingham, David Rosenthal, and Ed Summers
1772 for early feedback. Many thanks to Samuel Adams, Scott Ainsworth,
1773 Lyudmilla Balakireva, Frank McCown, Harihar Shankar, and Brad Tofel for
1774 early implementations.
1776 9. References
1778 9.1. Normative References
1780 [I-D.ietf-core-link-format]
1781 Shelby, Z., "CoRE Link Format",
1782 draft-ietf-core-link-format-03 (work in progress),
1783 March 2011.
1785 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
1786 Requirement Levels", BCP 14, RFC 2119, March 1997.
1788 [RFC2616] Fielding, R., Gettys, J., Mogul, J., Frystyk, H.,
1789 Masinter, L., Leach, P., and T. Berners-Lee, "Hypertext
1790 Transfer Protocol -- HTTP/1.1", RFC 2616, June 1999.
1792 [RFC4151] Kindberg, T. and S. Hawke, "The 'tag' URI Scheme",
1793 RFC 4151, October 2005.
1795 [RFC4287] Nottingham, M., Ed. and R. Sayre, Ed., "The Atom
1796 Syndication Format", RFC 4287, December 2005.
1798 [RFC5829] Brown, A., Clemm, G., and J. Reschke, "Link Relation Types
1799 for Simple Version Navigation between Web Resources",
1800 RFC 5829, April 2010.
1802 [RFC5988] Nottingham, M., "Web Linking", RFC 5988, October 2010.
1804 9.2. Informative References
1806 [Fitch] Fitch, "Web site archiving - an approach to recording
1807 every materially different response produced by a
1808 website", July 2003,
1809 .
1811 [I-D.masinter-dated-uri]
1812 Masinter, L., "The 'tdb' and 'duri' URI schemes, based on
1813 dated URIs", draft-masinter-dated-uri-08 (work in
1814 progress), January 2011.
1816 [RFC1123] Braden, R., "Requirements for Internet Hosts - Application
1817 and Support", STD 3, RFC 1123, October 1989.
1819 [W3C.REC-aww-20041215]
1820 Jacobs and Walsh, "Architecture of the World Wide Web",
1821 December 2004, .
1823 [W3C.gen-ont-20090420]
1824 Berners-Lee, "Architecture of the World Wide Web",
1825 April 2009, .
1827 [robotstxt.org]
1828 "Robots Exclusion Protocol", August 2010,
1829 .
1831 Appendix A. A Sample, Successful Memento Request/Response
1832 Cycle
1834 Step 1 : UA --- HTTP GET/HEAD; Accept-Datetime: Tj ---------> URI-R
1836 HEAD / HTTP/1.1
1837 Host: a.example.org
1838 Accept-Datetime: Tue, 11 Sep 2001 20:35:00 GMT
1839 Connection: close
1841 Step 2 : UA <-- HTTP 200; Link: URI-G ----------------------- URI-R
1843 HTTP/1.1 200 OK
1844 Date: Thu, 21 Jan 2010 00:02:12 GMT
1845 Server: Apache
1846 Link: <http://arxiv.example.net/timegate/http://a.example.org>
1847 ; rel="timegate"
1848 Content-Length: 255
1849 Connection: close
1850 Content-Type: text/html; charset=iso-8859-1
1852 Step 3 : UA --- HTTP GET/HEAD; Accept-Datetime: Tj ---------> URI-G
1854 GET /timegate/http://a.example.org HTTP/1.1
1856 Host: arxiv.example.net
1857 Accept-Datetime: Tue, 11 Sep 2001 20:35:00 GMT
1858 Connection: close
1860 Step 4 : UA <-- HTTP 302; Location: URI-Mj; Vary; Link:
1861 URI-R, URI-T, URI-M0, URI-Mn, URI-Mi, URI-Mj, URI-Mk ---- URI-G
1863 HTTP/1.1 302 Found
1864 Date: Thu, 21 Jan 2010 00:06:50 GMT
1865 Server: Apache
1866 Vary: negotiate, accept-datetime
1867 Location:
1868 http://arxiv.example.net/web/20010911203610/http://a.example.org
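1869 Link: <http://a.example.org>; rel="original",
1870 <http://arxiv.example.net/web/20000915112826/http://a.example.org>
1871 ; rel="first memento"; datetime="Fri, 15 Sep 2000 11:28:26 GMT",
1872 <http://arxiv.example.net/web/20080708093433/http://a.example.org>
1873 ; rel="last memento"; datetime="Tue, 08 Jul 2008 09:34:33 GMT",
1874 <http://arxiv.example.net/timemap/http://a.example.org>
1875 ; rel="timemap"; type="application/link-format",
1876 <http://arxiv.example.net/web/20010911203610/http://a.example.org>
1877 ; rel="memento"; datetime="Tue, 11 Sep 2001 20:36:10 GMT",
1878 <http://arxiv.example.net/web/20010911203051/http://a.example.org>
1879 ; rel="prev memento"; datetime="Tue, 11 Sep 2001 20:30:51 GMT",
1880 <http://arxiv.example.net/web/20010911204733/http://a.example.org>
1881 ; rel="next memento"; datetime="Tue, 11 Sep 2001 20:47:33 GMT"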
1882 Content-Length: 0
1883 Content-Type: text/plain; charset=UTF-8
1884 Connection: close
1886 Step 5 : UA --- HTTP GET URI-Mj; Accept-Datetime: Tj -------> URI-Mj
1888 GET /web/20010911203610/http://a.example.org HTTP/1.1
1890 Host: arxiv.example.net
1891 Accept-Datetime: Tue, 11 Sep 2001 20:35:00 GMT
1892 Connection: close
1894 Step 6 : UA <-- HTTP 200; Memento-Datetime: Tj; Link: URI-R,
1895 URI-T, URI-G, URI-M0, URI-Mn, URI-Mi, URI-Mj, URI-Mk ---- URI-Mj
1897 HTTP/1.1 200 OK
1898 Date: Thu, 21 Jan 2010 00:09:40 GMT
1899 Server: Apache-Coyote/1.1
1900 Memento-Datetime: Tue, 11 Sep 2001 20:36:10 GMT
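1901 Link: <http://a.example.org>; rel="original",
1902 <http://arxiv.example.net/web/20000915112826/http://a.example.org>
1903 ; rel="first memento"; datetime="Fri, 15 Sep 2000 11:28:26 GMT",
1904 <http://arxiv.example.net/web/20080708093433/http://a.example.org>
1905 ; rel="last memento"; datetime="Tue, 08 Jul 2008 09:34:33 GMT",
1906 <http://arxiv.example.net/timemap/http://a.example.org>
1907 ; rel="timemap"; type="application/link-format",
1908 <http://arxiv.example.net/timegate/http://a.example.org>
1909 ; rel="timegate",
1910 <http://arxiv.example.net/web/20010911203610/http://a.example.org>
1911 ; rel="memento"; datetime="Tue, 11 Sep 2001 20:36:10 GMT",
1912 <http://arxiv.example.net/web/20010911203051/http://a.example.org>
1913 ; rel="prev memento"; datetime="Tue, 11 Sep 2001 20:30:51 GMT",
1914 <http://arxiv.example.net/web/20010911204733/http://a.example.org>
1915 ; rel="next memento"; datetime="Tue, 11 Sep 2001 20:47:33 GMT"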
1916 Content-Length: 23364
1917 Content-Type: text/html;charset=utf-8
1918 Connection: close
1920 A successful flow with TimeGate and Mementos on the same server
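
The six steps above can be summarized from the user agent's side in a
short Python sketch. It is not a complete client. The following are
assumptions made for illustration: the function names, the
caller-supplied fetch function (which returns a status code and a
dictionary of lower-case header names), the set of redirect status
codes that are followed, and the limit on the number of hops. The
canned responses in fake_fetch stand in for the servers of the
example and carry only the headers the sketch needs.

```python
import re
from email.utils import parsedate_to_datetime

# A link-value whose rel attribute contains the "timegate" Relation Type.
TIMEGATE_LINK = re.compile(r'<([^>]*)>\s*;[^<]*\brel="[^"]*\btimegate\b[^"]*"')


def find_timegate(link_header):
    """Return the URI of the first "timegate" link, or None."""
    match = TIMEGATE_LINK.search(link_header or "")
    return match.group(1) if match else None


def resolve_memento(fetch, uri_r, accept_datetime, max_hops=5):
    """Walk Steps 1-6; return (memento_uri, memento_datetime) or None."""
    request_headers = {"Accept-Datetime": accept_datetime}

    # Steps 1-2: ask the Original Resource for its TimeGate.
    _, headers = fetch("HEAD", uri_r, request_headers)
    uri = find_timegate(headers.get("link"))
    if uri is None:
        return None

    # Steps 3-6: negotiate with the TimeGate, then follow redirections
    # until a response carries a Memento-Datetime header.
    for _ in range(max_hops):
        status, headers = fetch("GET", uri, request_headers)
        if "memento-datetime" in headers:
            return uri, parsedate_to_datetime(headers["memento-datetime"])
        if status in (301, 302, 303, 307) and "location" in headers:
            uri = headers["location"]
            continue
        return None
    return None


def fake_fetch(method, uri, headers):
    """Canned responses mirroring Steps 2, 4 and 6 (illustrative only)."""
    timegate = "http://arxiv.example.net/timegate/http://a.example.org"
    memento = "http://arxiv.example.net/web/20010911203610/http://a.example.org"
    responses = {
        "http://a.example.org": (200, {"link": f'<{timegate}>; rel="timegate"'}),
        timegate: (302, {"location": memento,
                         "vary": "negotiate, accept-datetime"}),
        memento: (200, {"memento-datetime": "Tue, 11 Sep 2001 20:36:10 GMT"}),
    }
    return responses[uri]


result = resolve_memento(fake_fetch, "http://a.example.org",
                         "Tue, 11 Sep 2001 20:35:00 GMT")
print(result[0])
print(result[1].isoformat())  # 2001-09-11T20:36:10+00:00
```

Passing the fetch function in as a parameter keeps the sketch
independent of any particular HTTP library and lets the flow be
exercised without network access.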
1922 Authors' Addresses
1924 Herbert VandeSompel
1925 Los Alamos National Laboratory
1926 PO Box 1663
1927 Los Alamos, New Mexico 87545
1928 USA
1930 Phone: +1 505 667 1267
1931 Email: hvdsomp@gmail.com
1932 URI: http://public.lanl.gov/herbertv/
1934 Michael Nelson
1935 Old Dominion University
1936 Norfolk, Virginia 23529
1937 USA
1939 Phone: +1 757 683 6393
1940 Email: mln@cs.odu.edu
1941 URI: http://www.cs.odu.edu/~mln/
1942 Robert Sanderson
1943 Los Alamos National Laboratory
1944 PO Box 1663
1945 Los Alamos, New Mexico 87545
1946 USA
1948 Phone: +1 505 665 5804
1949 Email: azaroth42@gmail.com