Internet Engineering Task Force                          H. VandeSompel
Internet-Draft                           Los Alamos National Laboratory
Intended status: Informational                                M. Nelson
Expires: October 30, 2011                       Old Dominion University
                                                           R. Sanderson
                                         Los Alamos National Laboratory
                                                         April 28, 2011


  HTTP framework for time-based access to resource states -- Memento
                      draft-vandesompel-memento-01

Abstract

   The HTTP-based Memento framework bridges the present and past Web by
   interlinking current resources with resources that encapsulate their
   past.  It facilitates obtaining representations of prior states of a
   resource, available from archival resources in Web archives or
   version resources in content management systems, by leveraging the
   resource's URI and a preferred datetime.  To this end, the framework
   introduces datetime negotiation (a variation on content negotiation),
   and new Relation Types for the HTTP Link header aimed at interlinking
   resources with their archival/version resources.  It also introduces
   various discovery mechanisms that further support bridging the
   present and past Web.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on October 30, 2011.

Copyright Notice

   Copyright (c) 2011 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.
   Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Terminology
     1.2.  Purpose
     1.3.  Notational Conventions
   2.  The Memento Framework, Datetime Negotiation component:
       HTTP headers, HTTP Link Relation Types
     2.1.  HTTP Headers
       2.1.1.  Accept-Datetime, Memento-Datetime
         2.1.1.1.  Values for Accept-Datetime
         2.1.1.2.  Values for Memento-Datetime
       2.1.2.  Vary
       2.1.3.  Location
       2.1.4.  Link
     2.2.  Link Header Relation Types
       2.2.1.  Memento Framework Relation Types
         2.2.1.1.  Relation Type "original"
         2.2.1.2.  Relation Type "timegate"
         2.2.1.3.  Relation Type "timemap"
         2.2.1.4.  Relation Type "memento"
       2.2.2.  Other Relation Types
   3.  The Memento Framework, Datetime Negotiation component:
       HTTP Interactions
     3.1.  Interactions with an Original Resource
       3.1.1.  Step 1: User Agent Requests an Original Resource
       3.1.2.  Step 2: Server Responds to a Request for an
               Original Resource
         3.1.2.1.  Original Resource is an Appropriate Memento
         3.1.2.2.  Server Exists and Original Resource Used to Exist
         3.1.2.3.  Missing or Inadequate "timegate" Link in
                   Original Server's Response
     3.2.  Interactions with a TimeGate
       3.2.1.  Step 3: User Agent Negotiates with a TimeGate
       3.2.2.  Step 4: Server Responds to Negotiation with TimeGate
         3.2.2.1.  Successful Scenario
         3.2.2.2.  Accept-Datetime and other Accept Headers Provided
         3.2.2.3.  Accept-Datetime Not Provided
         3.2.2.4.  Multiple Matching Mementos
         3.2.2.5.  Datetime Out of the User Agent's Range
         3.2.2.6.  Accept-Datetime Unparseable
         3.2.2.7.  TimeGate Does Not Exist
         3.2.2.8.  HTTP Methods other than HEAD/GET
       3.2.3.  Recognizing a TimeGate
     3.3.  Interactions with a Memento
       3.3.1.  Step 5: User Agent Requests a Memento
       3.3.2.  Step 6: Server Responds to a Request for a Memento
         3.3.2.1.  Memento Does not Exist
         3.3.2.2.  Mementos Without a TimeGate
       3.3.3.  Recognizing a Memento
     3.4.  Interactions with a TimeMap
       3.4.1.  User Agent Requests a TimeMap
       3.4.2.  Server Responds to a Request for a TimeMap
   4.  The Memento Framework, Discovery Component
     4.1.  Discovery of TimeGates Via Robots Exclusion Protocol
     4.2.  Discovery of TimeMaps Via TimeMap Feeds and Robots
           Exclusion Protocol
       4.2.1.  TimeMap Feeds
         4.2.1.1.  TimeMap Feeds: Feed-level Elements
         4.2.1.2.  TimeMap Feeds: Entry-level Elements
       4.2.2.  Discovering TimeMap Feeds via Robots Exclusion Protocol
     4.3.  Discovering Mementos via Robots Exclusion Protocol
   5.  IANA Considerations
   6.  Security Considerations
   7.  Changelog
   8.  Acknowledgements
   9.  References
     9.1.  Normative References
     9.2.  Informative References
   Appendix A.  Appendix B: A Sample, Successful Memento
                Request/Response cycle
   Authors' Addresses

1.  Introduction

1.1.  Terminology

   This specification uses the terms "resource", "request", "response",
   "entity", "entity-body", "entity-header", "content negotiation",
   "client", "user agent", "server" as described in RFC 2616 [RFC2616],
   and it uses the terms "representation" and "resource state" as
   described in W3C.REC-aww-20041215 [W3C.REC-aww-20041215].

   In addition, the following terms specific to the Memento framework
   are introduced:

   o  Original Resource: An Original Resource is a resource that exists
      or used to exist, and for which access to one of its prior states
      is desired.

   o  Memento: A Memento for an Original Resource is a resource that
      encapsulates a prior state of the Original Resource.  A Memento
      for an Original Resource as it existed at time Tj is a resource
      that encapsulates the state that the Original Resource had at time
      Tj.

   o  TimeGate: A TimeGate for an Original Resource is a resource that
      supports negotiation to allow selective, datetime-based, access to
      prior states of the Original Resource.

   o  TimeMap: A TimeMap for an Original Resource is a resource from
      which a list of URIs of Mementos of the Original Resource is
      available.

1.2.  Purpose

   The state of an Original Resource may change over time.
   Dereferencing its URI at any specific moment in time during its
   existence yields a representation of its then current state.
   Dereferencing its URI at any time past its existence no longer yields
   a meaningful representation, if any.  Still, in both cases, resources
   may exist that encapsulate prior states of the Original Resource.
   Each such resource, named a Memento, has its own URI that, when
   dereferenced, returns a representation of a prior state of the
   Original Resource.  Mementos may, for example, exist in Web archives,
   Content Management Systems, or Revision Control Systems.

   Examples are:

   Mementos for Original Resource http://www.ietf.org/ :

   o  http://web.archive.org/web/19970107171109/http://www.ietf.org/

   o  http://webarchive.nationalarchives.gov.uk/20080906200044/http://www.ietf.org/

   Mementos for Original Resource
   http://en.wikipedia.org/wiki/Hypertext_Transfer_Protocol :

   o  http://en.wikipedia.org/w/index.php?title=Hypertext_Transfer_Protocol&oldid=366806574

   o  http://en.wikipedia.org/w/index.php?title=Hypertext_Transfer_Protocol&oldid=33912

   o  http://web.archive.org/web/20071011153017/http://en.wikipedia.org/wiki/Hypertext_Transfer_Protocol

   Mementos for Original Resource http://www.w3.org/TR/webarch/ :

   o  http://www.w3.org/TR/2004/PR-webarch-20041105/

   o  http://www.w3.org/TR/2002/WD-webarch-20020830/

   o  http://webarchive.nationalarchives.gov.uk/20100304163140/http://www.w3.org/TR/webarch/

   In the abstract, Memento introduces a mechanism to access versions of
   Web resources that:

   o  Is fully distributed in the sense that resource versions may
      reside on multiple hosts, and that any such host is likely only
      aware of the versions it holds;

   o  Uses the global notion of datetime as a resource version indicator
      and access key;

   o  Leverages the following primitives of W3C.REC-aww-20041215
      [W3C.REC-aww-20041215]: resource, resource state, representation,
      content negotiation, and link.

   The core components of Memento's mechanism to access resource
   versions are:

   1.  The abstract notion of the state of a resource identified by
       URI-R as it existed at some time Tj.  Note the relationship with
       the ability to identify the state of a resource at some datetime
       Tj by means of a URI, as intended by the proposed Dated URI
       scheme I-D.masinter-dated-uri [I-D.masinter-dated-uri].

   2.  A bridge from the present to the past, consisting of:

       o  An appropriately typed link from a resource identified by
          URI-R to an associated TimeGate identified by URI-G, which is
          aware of (at least part of) the version history of the
          resource identified by URI-R;

       o  The ability to content negotiate in the datetime dimension
          with the TimeGate identified by URI-G, as a means to obtain a
          representation of the state that the resource identified by
          URI-R had at some datetime Tj.

   3.  A bridge from the past to the present, consisting of an
       appropriately typed link from a resource identified by URI-M,
       which encapsulates the state a resource identified by URI-R had
       at some datetime Tj, to the resource identified by URI-R.

   Section 2 and Section 3 of this document are concerned with
   specifying an instantiation of these abstractions for resources that
   are identified by HTTP(S) URIs, whereas Section 4 details approaches
   to discover TimeGates, TimeMaps, and Mementos on the HTTP(S) Web by
   means other than typed links.

1.3.  Notational Conventions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

   When needed for extra clarity, the following conventions are used:

   o  URI-R is used to denote the URI of an Original Resource.

   o  URI-G is used to denote the URI of a TimeGate.

   o  URI-M is used to denote the URI of a Memento.

   o  URI-T is used to denote the URI of a TimeMap.

   o  When scenarios are described that involve multiple Mementos,
      URI-M0 denotes the URI of the first Memento known to the
      responding server, URI-Mn denotes the URI of the most recent known
      Memento, URI-Mj denotes the URI of the selected Memento, URI-Mi
      denotes the URI of the Memento that is temporally previous to the
      selected Memento, and URI-Mk denotes the URI of the Memento that
      is temporally after the selected Memento.  The respective
      datetimes for these Mementos are T0, Tn, Tj, Ti, and Tk; it holds
      that T0 <= Ti <= Tj <= Tk <= Tn.

2.  The Memento Framework, Datetime Negotiation component: HTTP headers,
    HTTP Link Relation Types

   The Memento framework is concerned with Original Resources,
   TimeGates, Mementos, and TimeMaps that are identified by HTTP or
   HTTPS URIs.  Details are only provided for resources identified by
   HTTP URIs but apply similarly to those with HTTPS URIs.

2.1.  HTTP Headers

   The Memento framework operates at the level of HTTP request and
   response headers.  It introduces two new headers ("Accept-Datetime",
   "Memento-Datetime"), introduces new values for two existing headers
   ("Vary", "Link"), and uses an existing header ("Location") without
   modification.  All these headers are described below.  Other HTTP
   headers are present or absent in Memento response/request cycles as
   specified by RFC 2616 [RFC2616].

2.1.1.  Accept-Datetime, Memento-Datetime

   The "Accept-Datetime" request header is used by a user agent to
   indicate it wants to retrieve a representation of a Memento that
   encapsulates a past state of an Original Resource.  To that end, the
   "Accept-Datetime" header is conveyed in an HTTP GET/HEAD request
   issued against a TimeGate for an Original Resource, and its value
   indicates the datetime of the desired past state of the Original
   Resource.  The "Accept-Datetime" request header has no defined
   meaning for HTTP methods other than HEAD and GET.

   The "Memento-Datetime" response header is used by a server to
   indicate that the response contains a representation of a Memento,
   and its value expresses the datetime of the state of an Original
   Resource that is encapsulated in that Memento.  The URI of that
   Original Resource is provided in the response, as the Target IRI (see
   RFC5988 [RFC5988]) of a link provided in the HTTP "Link" header that
   has a Relation Type of "original" (see Section 2.2).

   The presence of a "Memento-Datetime" header and associated value for
   a given resource constitutes a promise that the resource is stable
   and that its state will no longer change.  This means that, in terms
   of the Ontology for Relating Generic and Specific Information
   Resources (see W3C.gen-ont-20090420 [W3C.gen-ont-20090420]), a
   Memento is a FixedResource.

   As a consequence, "Memento-Datetime" headers associated with a
   Memento MUST be "sticky" in the following ways:

   o  The server that originally assigns the "Memento-Datetime" header
      and value MUST retain that header in all responses to HTTP
      HEAD/GET requests (with or without "Accept-Datetime" header) that
      occur against the Memento after the time of the original
      assignment of the header, and it MUST NOT change its associated
      value.

   o  Applications that mirror Mementos at a different URI MUST NOT
      change the "Memento-Datetime" header and value of those Mementos
      unless mirroring involves a meaningful state change.  This allows,
      for example, duplicating a Web archive at a new location while
      preserving the value of the "Memento-Datetime" header of the
      archived resources.  In this example, the "Last-Modified" header
      will be updated to reflect the time of mirroring at the new URI,
      whereas the value for "Memento-Datetime" will be sticky.

2.1.1.1.  Values for Accept-Datetime

   Values for the "Accept-Datetime" header consist of a MANDATORY
   datetime expressed according to the RFC 1123 [RFC1123] format, which
   is formalized by the rfc1123-date construction rule of the BNF in
   Figure 1, and an OPTIONAL interval indicator expressed according to
   the iso8601-interval rule of the BNF in Figure 1.  The datetime MUST
   be represented in Greenwich Mean Time (GMT).

   Examples of "Accept-Datetime" request headers with and without an
   interval indicator:

   Accept-Datetime: Thu, 31 May 2007 20:35:00 GMT
   Accept-Datetime: Thu, 31 May 2007 20:35:00 GMT; -P3DT5H;+P2DT6H

   The user agent uses the MANDATORY datetime value to convey its
   preferred datetime for a Memento; it uses the OPTIONAL interval
   indicator to convey it is interested in retrieving Mementos that
   reside within this interval around the preferred datetime, and not
   interested in Mementos that reside outside of it.  Not using an
   interval indicator is equivalent to expressing an infinite interval
   around the preferred datetime.

   The interval mechanism can be regarded as an implementation of the
   functionality intended by the q-value approach that is used in
   regular content negotiation.  The q-value approach is not supported
   for Memento's datetime negotiation because it is well-suited for
   negotiation over a discrete space of mostly predictable values, not
   for negotiation over a continuum of unpredictable datetime values.
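
   As a non-normative illustration of the "Accept-Datetime" values
   described in this section, the following Python sketch parses a
   header value and derives the bounds of the interval around the
   preferred datetime.  The names parse_duration and
   parse_accept_datetime are hypothetical and are not defined by this
   specification.  The sketch assumes that only week, day, hour,
   minute, and second designators occur in the interval indicator,
   because year and month durations have no fixed length, and it is
   more lenient than the grammar in that it accepts a time-only
   duration such as PT5H.

```python
import re
from datetime import datetime, timedelta, timezone

# RFC 1123 datetime, always expressed in GMT.
RFC1123_FORMAT = "%a, %d %b %Y %H:%M:%S GMT"

# Assumption: year and month designators are rejected, since their
# length in days is not fixed.
_DURATION = re.compile(
    r"P(?:(?P<weeks>\d+)W"
    r"|(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?)?)"
)


def parse_duration(text):
    """Convert a duration such as 'P3DT5H' into a timedelta."""
    match = _DURATION.fullmatch(text)
    parts = {}
    if match:
        parts = {k: int(v) for k, v in match.groupdict().items() if v is not None}
    if not parts or text.endswith("T"):
        raise ValueError("unsupported duration: %r" % text)
    return timedelta(**parts)


def parse_accept_datetime(value):
    """Return (preferred, earliest, latest) for an Accept-Datetime value.

    earliest and latest are None when no interval indicator is given,
    which is equivalent to an infinite interval.
    """
    fields = [field.strip() for field in value.split(";")]
    preferred = datetime.strptime(fields[0], RFC1123_FORMAT)
    preferred = preferred.replace(tzinfo=timezone.utc)
    if len(fields) == 1:
        return preferred, None, None
    if (len(fields) != 3
            or not fields[1].startswith("-")
            or not fields[2].startswith("+")):
        raise ValueError("malformed interval indicator")
    earliest = preferred - parse_duration(fields[1][1:])
    latest = preferred + parse_duration(fields[2][1:])
    return preferred, earliest, latest
```

   For the second example header above, this sketch yields an interval
   that runs from 28 May 2007 15:35:00 GMT to 3 June 2007 02:35:00 GMT.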

   accept-dt-value  = rfc1123-date *SP [ iso8601-interval ]
   rfc1123-date     = wkday "," SP date1 SP time SP "GMT"
   date1            = 2DIGIT SP month SP 4DIGIT
                      ; day month year (e.g., 20 Mar 1957)
   time             = 2DIGIT ":" 2DIGIT ":" 2DIGIT
                      ; 00:00:00 - 23:59:59 (e.g., 14:33:22)
   wkday            = "Mon" | "Tue" | "Wed" | "Thu" | "Fri" | "Sat" |
                      "Sun"
   month            = "Jan" | "Feb" | "Mar" | "Apr" | "May" | "Jun" |
                      "Jul" | "Aug" | "Sep" | "Oct" | "Nov" | "Dec"
   iso8601-interval = ";" *SP "-" duration *SP ";" *SP "+" duration
   duration         = "P" ( dur-date | dur-week )
   dur-date         = ( dur-day | dur-month | dur-year ) [ dur-time ]
   dur-year         = 1*DIGIT "Y" [ dur-month ] [ dur-day ]
   dur-month        = 1*DIGIT "M" [ dur-day ]
   dur-day          = 1*DIGIT "D"
   dur-time         = "T" ( dur-hour | dur-minute | dur-second )
   dur-hour         = 1*DIGIT "H" [ dur-minute ] [ dur-second ]
   dur-minute       = 1*DIGIT "M" [ dur-second ]
   dur-second       = 1*DIGIT "S"
   dur-week         = 1*DIGIT "W"

                 Figure 1: BNF for the datetime format

2.1.1.2.  Values for Memento-Datetime

   Values for the "Memento-Datetime" headers MUST be datetimes expressed
   according to the rfc1123-date construction rule of the BNF in
   Figure 1; they MUST be represented in Greenwich Mean Time (GMT).

   An example "Memento-Datetime" response header:

   Memento-Datetime: Wed, 30 May 2007 18:47:52 GMT

2.1.2.  Vary

   The "Vary" response header is used in responses to indicate the
   dimensions in which content negotiation was successfully applied.
   This header is used in the Memento framework to indicate whether
   datetime negotiation was applied or is supported by the responding
   server.

   For example, this use of the "Vary" header indicates that datetime is
   the only dimension in which negotiation was applied:

   Vary: negotiate, accept-datetime

   The use of the "Vary" header in this example shows that both datetime
   negotiation and media type content negotiation were applied:

   Vary: negotiate, accept-datetime, accept

2.1.3.  Location

   The "Location" header is used as defined in RFC 2616 [RFC2616].
   Examples are given in Section 3 below.

2.1.4.  Link

   The "Link" response header is specified in RFC5988 [RFC5988].  The
   Memento framework introduces new Relation Types to convey typed links
   among Original Resources, TimeGates, Mementos, and TimeMaps.
   Existing Relation Types, such as those that support navigation among
   a series of ordered resources, may also be used in the Memento
   framework.  This is detailed in Link Header Relation Types
   (Section 2.2), below.

2.2.  Link Header Relation Types

   The "Link" header specified in RFC5988 [RFC5988] is semantically
   equivalent to the "<LINK>" element in HTML, as well as the
   "atom:link" feed-level element in Atom RFC 4287 [RFC4287].  By
   default, the origin of a link expressed by an entry in a "Link"
   header (named Context IRI in RFC5988 [RFC5988]) is the IRI of the
   requested resource.  This default can be overridden using the
   "anchor" attribute in the entry.

2.2.1.  Memento Framework Relation Types

   The Relation Types used in the Memento framework are listed in the
   remainder of this section, and their use is summarized in the table
   below.  Appendix A shows a Memento request/response cycle that uses
   all the Relation Types that are introduced here.
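
   As a non-normative illustration of the Context IRI default and the
   "anchor" attribute described in Section 2.2, the following Python
   sketch splits a "Link" header value into its entries.  The name
   parse_link_header and all URIs in the example are hypothetical.  The
   sketch assumes well-formed input and does not implement the full
   grammar of RFC5988 [RFC5988].

```python
import re

# One ";name=value" parameter; the value is a quoted string or a token.
_PARAM = r'\s*;\s*([A-Za-z0-9*_-]+)\s*=\s*(?:"([^"]*)"|([^",;\s]*))'
# One entry: "<target>" followed by any number of parameters.
_LINK_VALUE = re.compile(r'<([^>]*)>((?:' + _PARAM + r')*)')


def parse_link_header(value, request_uri):
    """Split a Link header value into a list of link dictionaries.

    request_uri is the URI of the requested resource; it serves as the
    Context IRI unless an entry carries an "anchor" attribute.
    """
    links = []
    for match in _LINK_VALUE.finditer(value):
        target, raw_params = match.group(1), match.group(2)
        params = {}
        for name, quoted, token in re.findall(_PARAM, raw_params):
            # Keep the first occurrence of a repeated parameter.
            params.setdefault(name.lower(), quoted or token)
        links.append({
            "context": params.pop("anchor", request_uri),
            "target": target,
            "rel": params.pop("rel", "").split(),
            "attrs": params,
        })
    return links


# Hypothetical header, as it might be returned by a TimeGate.
EXAMPLE = (
    '<http://a.example.org/>; rel="original", '
    '<http://archive.example.net/20010911203610/http://a.example.org/>; '
    'rel="memento"; datetime="Tue, 11 Sep 2001 20:36:10 GMT", '
    '<http://archive.example.net/timemap/http://a.example.org/>; '
    'rel="timemap"; type="application/link-format"; '
    'anchor="http://archive.example.net/other"'
)
TIMEGATE = "http://archive.example.net/timegate/http://a.example.org/"
LINKS = parse_link_header(EXAMPLE, TIMEGATE)
```

   In this sketch the first two entries take the URI of the requested
   resource as their Context IRI, whereas the third entry takes the
   value of its "anchor" attribute.  Quoted attribute values may contain
   commas, as the "datetime" attribute does, which is why the entries
   are matched with a pattern rather than split on commas.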

   +-----------+-------------------+-----------------+-----------------+
   | Relation  | Original Resource | TimeGate        | Memento         |
   | Type      |                   |                 |                 |
   +-----------+-------------------+-----------------+-----------------+
   | original  | NA, except see    | REQUIRED, 1     | REQUIRED, 1     |
   |           | Section 3.1.2.1   |                 |                 |
   | timegate  | RECOMMENDED, 0 or | NA              | RECOMMENDED, 0  |
   |           | more              |                 | or more         |
   | timemap   | NA                | RECOMMENDED, 0  | RECOMMENDED, 0  |
   |           |                   | or more         | or more         |
   | memento   | NA, except see    | REQUIRED, 1 or  | REQUIRED, 1 or  |
   |           | Section 3.1.2.1   | more            | more            |
   +-----------+-------------------+-----------------+-----------------+

                  Table 1: The use of Relation Types

2.2.1.1.  Relation Type "original"

   "original" -- A "Link" header entry with a Relation Type of
   "original" is used to point from a TimeGate or a Memento to its
   associated Original Resource.  In both cases, an entry with the
   "original" Relation Type MUST occur exactly once in a "Link" header.
   Details for the entry are as follows:

   o  Context IRI: URI-G, URI-M

   o  Target IRI: URI-R

   o  Relation Type: "original"

   o  Use: REQUIRED

   o  Cardinality: 1

2.2.1.2.  Relation Type "timegate"

   "timegate" -- A "Link" header entry with a Relation Type of
   "timegate" is used to point from either an Original Resource or a
   Memento to a TimeGate for the Original Resource.  In both cases, the
   use of an entry with the "timegate" Relation Type is RECOMMENDED.
   Since more than one TimeGate can exist for any Original Resource,
   multiple entries with a "timegate" Relation Type MAY occur, each with
   a distinct Target IRI.  Since a TimeGate has no mime type, the "type"
   attribute MUST NOT be used on Links with a "timegate" Relation Type.
   Details for the entry are as follows:

   o  Context IRI: URI-R or URI-Mj

   o  Target IRI: URI-G

   o  Relation Type: "timegate"

   o  Use: RECOMMENDED

   o  Cardinality: 0 or more

2.2.1.3.  Relation Type "timemap"

   "timemap" -- A "Link" header entry with a Relation Type of "timemap"
   is used to point from either a TimeGate or a Memento to a TimeMap
   resource from which a list of Mementos known to the responding server
   is available.  Use of an entry with the "timemap" Relation Type is
   RECOMMENDED, and, since multiple serializations of a TimeMap are
   possible, multiple entries with a "timemap" Relation Type MAY occur,
   each with a distinct Target IRI, and each with a MANDATORY "type"
   attribute to convey the mime type of the TimeMap serialization.
   Details for the entry are as follows:

   o  Context IRI: URI-G or URI-Mi

   o  Target IRI: URI-T

   o  Relation Type: "timemap"

   o  Target Attribute: "type"

   o  Use: RECOMMENDED

   o  Cardinality: 0 or more

   Further details about TimeMap serializations are provided in
   Section 3.4.

2.2.1.4.  Relation Type "memento"

   "memento" -- A "Link" header entry with a Relation Type of "memento"
   is used to point from either a TimeGate or a Memento to various
   Mementos for an Original Resource.  This link MUST include a
   "datetime" attribute with a value that matches the "Memento-Datetime"
   of the Memento that is the target of the link; that is, the value of
   the "Memento-Datetime" header that is returned when the URI of the
   linked Memento is dereferenced.  In addition, the link MAY include an
   "embargo" attribute to convey the datetime until which the Memento
   will remain inaccessible.  The value for both the "datetime" and
   "embargo" attributes MUST be a datetime expressed according to the
   rfc1123-date construction rule of the BNF in Figure 1 and it MUST be
   represented in Greenwich Mean Time (GMT).  This link MAY also include
   a "license" attribute to associate a license with the Memento; the
   value for the "license" attribute SHOULD be a URI.
   The link SHOULD
   also include a "type" attribute to convey the mime type of the
   Memento that is the target of the link.  Use of entries with the
   "memento" Relation Type is REQUIRED and it MUST be as follows:

   For all responses to HTTP HEAD/GET requests issued against a TimeGate
   or a Memento in which a Memento is selected or served by the
   responding server:

   o  One "memento" link MUST be included that has as Target IRI the URI
      of the Memento that was selected or served;

   o  One "memento" link MUST be included that has as Target IRI the URI
      of the temporally first Memento known to the responding server;

   o  One "memento" link MUST be included that has as Target IRI the URI
      of the temporally most recent Memento known to the responding
      server;

   o  One "memento" link SHOULD be included that has as Target IRI the
      URI of the Memento that is previous to the selected Memento in the
      temporal series of all Mementos (sorted by ascending
      "Memento-Datetime" values) known to the server;

   o  One "memento" link SHOULD be included that has as Target IRI the
      URI of the Memento that is next to the selected Memento in the
      temporal series of all Mementos (sorted by ascending
      "Memento-Datetime" values) known to the server;

   o  Other "memento" links MAY only be included if both the
      aforementioned previous and next links are provided.  Each of
      these OPTIONAL "memento" links MUST have as Target IRI the URI of
      a Memento other than the ones listed above.

   For all responses to HTTP HEAD/GET requests issued against an
   existing TimeGate or Memento in which no Memento is selected or
   served by the responding server:

   o  One "memento" link MUST be included that has as Target IRI the URI
      of the temporally first Memento known to the responding server;

   o  One "memento" link MUST be included that has as Target IRI the URI
      of the temporally most recent Memento known to the responding
      server;

   o  Other "memento" links MAY be included, and each of these OPTIONAL
      links MUST have as Target IRI the URI of a Memento other than the
      two listed above.

   Note that the Target IRI of some of these links may coincide.  For
   example, if the selected Memento actually is the first Memento known
   to the server, only three distinct "memento" links may result.  The
   values for the "datetime" attribute of these links would be the
   datetimes of the first (equal to selected), next, and most recent
   Memento known to the responding server.

   The summary is as follows:

   o  Context IRI: URI-G, URI-Mj

   o  Target IRI: URI-M

   o  Relation Type: "memento"

   o  Target Attributes: "datetime", "embargo", "license"

   o  Use: REQUIRED

   o  Cardinality: 1 or more

2.2.2.  Other Relation Types

   Web Linking RFC5988 [RFC5988] allows for the inclusion of links with
   different Relation Types but the same Target IRI, and hence the
   Relation Types introduced by the Memento framework MAY be combined
   with others as deemed necessary.  As the "memento" Relation Type
   focuses on conveying the datetime of a linked Memento, Relation Types
   that allow navigating among the temporally ordered series of Mementos
   known to a server are of particular importance.  In this regard, the
   Relation Types listed in the table below SHOULD be considered for
   combination with the "memento" Relation Type.  A distinction is made
   between responding servers that can be categorized as systems that
   are the focus of RFC5829 [RFC5829] (such as version control systems)
   and others that cannot (such as Web archives).  Note that, in terms
   of RFC5829 [RFC5829], the last Memento (URI-Mn) is the version prior
   to the latest (i.e., current) version.
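
   As a non-normative illustration of the "memento" link requirements
   of Section 2.2.1.4, combined with the navigation Relation Types
   discussed in this section, the following Python sketch assembles the
   "Link" header entries for a response in which a Memento is selected.
   The name memento_links and the example URIs are hypothetical.  The
   sketch assumes a server that is not an RFC5829 system and therefore
   uses the "first", "last", "prev", and "next" Relation Types, and it
   assumes that combined Relation Types are written as a space-separated
   list inside a single "rel" attribute.

```python
from datetime import datetime, timezone

RFC1123_FORMAT = "%a, %d %b %Y %H:%M:%S GMT"


def memento_links(mementos, selected):
    """Build the "memento" entries of a Link header.

    mementos is a list of (uri, datetime) pairs sorted by ascending
    datetime; selected is the index of the selected Memento.
    """
    roles = {}  # index of a Memento -> navigation Relation Types

    def add(index, rel):
        roles.setdefault(index, []).append(rel)

    add(0, "first")                      # MUST: first known Memento
    add(len(mementos) - 1, "last")       # MUST: most recent known Memento
    if selected > 0:
        add(selected - 1, "prev")        # SHOULD: previous Memento
    if selected < len(mementos) - 1:
        add(selected + 1, "next")        # SHOULD: next Memento
    roles.setdefault(selected, [])       # MUST: the selected Memento

    entries = []
    for index in sorted(roles):
        uri, when = mementos[index]
        rel = " ".join(roles[index] + ["memento"])
        stamp = when.astimezone(timezone.utc).strftime(RFC1123_FORMAT)
        entries.append('<%s>; rel="%s"; datetime="%s"' % (uri, rel, stamp))
    return ", ".join(entries)


# Hypothetical archive holding four Mementos of one Original Resource.
BASE = "http://archive.example.net/%s/http://a.example.org/"
MEMENTOS = [
    (BASE % "20000915112826", datetime(2000, 9, 15, 11, 28, 26, tzinfo=timezone.utc)),
    (BASE % "20010911203610", datetime(2001, 9, 11, 20, 36, 10, tzinfo=timezone.utc)),
    (BASE % "20050101000000", datetime(2005, 1, 1, 0, 0, 0, tzinfo=timezone.utc)),
    (BASE % "20091030014351", datetime(2009, 10, 30, 1, 43, 51, tzinfo=timezone.utc)),
]
FIRST_SELECTED = memento_links(MEMENTOS, 0)
SECOND_SELECTED = memento_links(MEMENTOS, 1)
```

   When the selected Memento is also the first one known to the server,
   the sketch produces three distinct entries, in line with the note in
   Section 2.2.1.4 that the Target IRIs of some links may coincide.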

   +-----------------------------+---------------------+---------------+
   | Memento Type                | RFC5829 system      | non RFC5829   |
   |                             |                     | system        |
   +-----------------------------+---------------------+---------------+
   | First Memento (URI-M0)      | first               | first         |
   | Last Memento (URI-Mn)       | last                | last          |
   | Selected Memento (URI-Mj)   | NA                  | NA            |
   | Memento prior to selected   | predecessor-version | prev          |
   | Memento (URI-Mi)            |                     |               |
   | Memento next to selected    | successor-version   | next          |
   | Memento (URI-Mk)            |                     |               |
   +-----------------------------+---------------------+---------------+

                  Table 2: The use of Relation Types

3.  The Memento Framework, Datetime Negotiation component: HTTP
    Interactions

   This section describes the HTTP interactions of the Memento framework
   for a variety of scenarios.  First, Figure 2 provides a schematic
   overview of a successful request/response chain that involves
   datetime negotiation.  Dashed lines depict HTTP transactions between
   user agent and server.  Appendix A shows these HTTP interactions in
   detail for the case where the Original Resource resides on one
   server, whereas both the TimeGate and the Mementos reside on another.
   Scenarios also exist in which all these resources are on the same
   server (for example, Content Management Systems) or on different
   servers (for example, an aggregator of TimeGates).  Note that, in
   Step 2 and Step 6, the HTTP status code of the response is shown as
   "200 OK", but a series of "206 Partial Content" responses could be
   substituted without loss of generality.
675 1: UA --- HTTP GET/HEAD; Accept-Datetime: Tj ---------------> URI-R
676 2: UA <-- HTTP 200; Link: URI-G ----------------------------- URI-R
677 3: UA --- HTTP GET/HEAD; Accept-Datetime: Tj ---------------> URI-G
678 4: UA <-- HTTP 302; Location: URI-Mj; Vary; Link:
679    URI-R,URI-T,URI-M0,URI-Mn,URI-Mi,URI-Mj,URI-Mk -------- URI-G
680 5: UA --- HTTP GET URI-Mj; Accept-Datetime: Tj -------------> URI-Mj
681 6: UA <-- HTTP 200; Memento-Datetime: Tj; Link:
682    URI-R,URI-T,URI-G,URI-M0,URI-Mn,URI-Mi,URI-Mj,URI-Mk -- URI-Mj

684 Figure 2: Typical Memento request/response chain

686 o Step 1: In order to determine what the URI is of a TimeGate for an
687   Original Resource, the user agent issues an HTTP HEAD/GET request
688   against the URI of the Original Resource (URI-R).

690 o Step 2: The entity-header of the response from URI-R includes an
691   HTTP "Link" header with a Relation Type of "timegate" pointing at
692   a TimeGate (URI-G) for the Original Resource.

694 o Step 3: The user agent starts the datetime negotiation process
695   with the TimeGate by issuing an HTTP GET request against its URI-G
696   thereby including an "Accept-Datetime" HTTP header with a value of
697   the datetime of the desired prior state of the Original Resource.

699 o Step 4: The entity-header of the response from URI-G includes a
700   "Location" header pointing at the URI of a Memento (URI-Mj) for
701   the Original Resource. In addition, the entity-header contains an
702   HTTP "Link" header with a Relation Type of "original" pointing at
703   the Original Resource, and an HTTP "Link" header with a Relation
704   Type of "timemap" pointing at a TimeMap (URI-T). Also HTTP Links
705   pointing at various Mementos are provided using the "memento"
706   Relation Type, as specified in Section 2.2.1.4.

708 o Step 5: The user agent issues an HTTP GET request against the
709   URI-Mj of a Memento, obtained in Step 4.
711 o Step 6: The entity-header of the response from URI-Mj includes a
712   "Memento-Datetime" HTTP header with a value of the datetime of the
713   Memento. It also contains an HTTP "Link" header with a Relation
714   Type of "original" pointing at the Original Resource, with a
715   Relation Type of "timegate" pointing at a TimeGate associated with
716   the Original Resource, and with a Relation Type of "timemap"
717   pointing at a TimeMap. The state that is expressed by the
718   representation provided in the response is the state the Original
719   Resource had at the datetime expressed in the "Memento-Datetime"
720   header. This response also includes HTTP Links with a "memento"
721   Relation Type pointing at various Mementos, as specified in
722   Section 2.2.1.4.

724 The following sections detail the specifics of HTTP interactions with
725 Original Resources, TimeGates, Mementos, and TimeMaps under various
726 conditions.

728 3.1. Interactions with an Original Resource

730 This section details HTTP GET/HEAD requests targeted at an Original
731 Resource (URI-R).

733 3.1.1. Step 1: User Agent Requests an Original Resource

735 In order to try and discover a TimeGate for the Original Resource,
736 the user agent SHOULD issue an HTTP HEAD or GET request against the
737 Original Resource's URI. Use of the "Accept-Datetime" header in the
738 HTTP HEAD/GET request is OPTIONAL.

740 Figure 3 shows the use of HTTP HEAD indicating the user agent is not
741 interested in retrieving a representation of the Original Resource,
742 but only in determining a TimeGate for it. It also shows the use of
743 the "Accept-Datetime" header anticipating that the user agent will
744 set it for the entire duration of a Memento request/response cycle.

746 HEAD / HTTP/1.1
747 Host: a.example.org
748 Accept-Datetime: Tue, 11 Sep 2001 20:35:00 GMT
749 Connection: close

751 Figure 3: User Agent Requests Original Resource

753 3.1.2.
Step 2: Server Responds to a Request for an Original Resource

755 The response of the Original Resource's server to the user agent's
756 HTTP HEAD/GET request of Step 1, for the case where the Original
757 Resource exists, is as it would be in a regular HTTP request/response
758 cycle, but in addition MAY include an HTTP "Link" header with a
759 Relation Type of "timegate" that conveys the URI of the Original
760 Resource's TimeGate as the Target IRI of the Link. Multiple HTTP
761 Links with a relation type of "timegate" MAY be provided to
762 accommodate situations in which the server is aware of multiple
763 TimeGates for an Original Resource. The actual Target IRI provided
764 in the "timegate" Link may depend on several factors including the
765 datetime provided in the "Accept-Datetime" header, and the IP address
766 of the user agent. A response for this case is illustrated in
767 Figure 4.

769 HTTP/1.1 200 OK
770 Date: Thu, 21 Jan 2010 00:02:12 GMT
771 Server: Apache
772 Link:
773   ; rel="timegate"
774 Content-Length: 255
775 Connection: close
776 Content-Type: text/html; charset=iso-8859-1

778 Figure 4: Server of Original Resource Responds

780 Servers that actively maintain archives of their resources SHOULD
781 include the "timegate" HTTP "Link" header because this link is an
782 important way for a user agent to discover TimeGates for those
783 resources. This includes servers such as Content Management Systems,
784 Version Control Systems, and Web servers with associated
785 transactional archives Fitch [Fitch]. Servers that do not actively
786 maintain archives of their resources MAY include the "timegate" HTTP
787 "Link" header as a way to convey a preference for TimeGates for their
788 resources exposed by a third party archive. This includes servers
789 that rely on Web archives such as the Internet Archive to archive
790 their resources.
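
On the user agent side, Step 2 amounts to scanning the "Link" header of the response for entries whose Relation Type includes "timegate". The following Python sketch shows one way this could be done. The function names and the simplified regular expressions are illustrative assumptions and not part of the framework; they handle the header shapes shown in this document, not every construct RFC5988 allows. The sample header value is likewise an assumption, assembled from the host and path shown in Figure 8.

```python
import re

# One link-value: "<target>" followed by zero or more ";name=value" parameters.
# Quoted values may contain commas (as HTTP dates do), so they are matched first.
_LINK = re.compile(r'<([^>]*)>((?:\s*;\s*[^;,="]+(?:=(?:"[^"]*"|[^;,"]*))?)*)')
_PARAM = re.compile(r';\s*([^;,="\s]+)\s*=\s*(?:"([^"]*)"|([^;,"\s]*))')


def parse_link_header(value):
    """Split a Link header value into a list of (target, params) pairs."""
    links = []
    for match in _LINK.finditer(value):
        params = {}
        for name, quoted, bare in _PARAM.findall(match.group(2)):
            params[name.lower()] = quoted or bare
        links.append((match.group(1), params))
    return links


def find_timegates(link_header):
    """Return the Target IRIs of all links whose rel includes "timegate"."""
    return [
        target
        for target, params in parse_link_header(link_header)
        if "timegate" in params.get("rel", "").split()
    ]


# Hypothetical header value; the URIs are assumptions for illustration.
sample = (
    '<http://arxiv.example.net/timegate/http://a.example.org>; rel="timegate", '
    '<http://a.example.org/pic>; rel="original memento"'
    '; datetime="Fri, 20 Mar 2009 11:00:00 GMT"'
)
print(find_timegates(sample))
```

Because a server MAY return several "timegate" links, the sketch returns a list and leaves the choice among them to the caller.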
792 The server of the Original Resource MUST treat requests with and
793 without an "Accept-Datetime" header in the same way:

795 o The response MUST either always or never include an HTTP "Link"
796   header with an entry that has a "timegate" Relation Type and the
797   URI of a TimeGate as the Target IRI.

799 o The entity-body of the response MUST be the same, for user agent
800   requests with or without an "Accept-Datetime" header.

802 3.1.2.1. Original Resource is an Appropriate Memento

804 The "Memento-Datetime" header MAY be applied to an Original Resource
805 directly to indicate it is a FixedResource (see W3C.gen-ont-20090420
806 [W3C.gen-ont-20090420]), meaning that the state of the Original
807 Resource has not changed since the datetime conveyed in the "Memento-
808 Datetime" header, and as a promise that it will not change
809 thereafter. This may occur, for example, for certain stable media
810 resources on news sites. In case the user agent's preferred datetime
811 is equal to or more recent than the datetime conveyed as the value of
812 "Memento-Datetime" in the server's response in Step 2, the user agent
813 SHOULD conclude it has located an appropriate Memento, and it SHOULD
814 NOT continue to Step 3.

816 Figure 5 illustrates such a response to a request for the resource
817 with URI http://a.example.org/pic that has been stable since it was
818 created. Note the use of both the "memento" and "original" Relation
819 Types for links that have as Target IRI the URI of the Original
820 Resource.
822 HTTP/1.1 200 OK
823 Date: Thu, 21 Jan 2010 00:02:12 GMT
824 Server: Apache
825 Link:
826
827   ; rel="original memento"
828   ; datetime="Fri, 20 Mar 2009 11:00:00 GMT"
829 Memento-Datetime: Fri, 20 Mar 2009 11:00:00 GMT
830 Content-Length: 255
831 Connection: close
832 Content-Type: text/html; charset=iso-8859-1

834 Figure 5: Response to a request for an Original Resource that was
835 created as a FixedResource

837 Cases may also exist in which a resource becomes stable at a certain
838 point in its existence, but changed previously. In such cases, the
839 Original Resource may know about a TimeGate that is aware of its
840 prior history and hence MAY also include a link with a "timegate"
841 Relation Type. This is illustrated in Figure 6, where the "memento"
842 and "original" Relation Types are used as in Figure 5, and the
843 existence of a TimeGate to negotiate for Mementos with datetimes
844 prior to Fri, 20 Mar 2009 11:00:00 GMT is indicated.

846 HTTP/1.1 200 OK
847 Date: Thu, 21 Jan 2010 00:02:12 GMT
848 Server: Apache
849 Link:
850
851   ; rel="original memento"
852   ; datetime="Fri, 20 Mar 2009 11:00:00 GMT",
853
854   ; rel="timegate"
855 Memento-Datetime: Fri, 20 Mar 2009 11:00:00 GMT
856 Content-Length: 255
857 Connection: close
858 Content-Type: text/html; charset=iso-8859-1

860 Figure 6: Response to a request for an Original Resource that became
861 a FixedResource

863 3.1.2.2. Server Exists and Original Resource Used to Exist

865 Servers SHOULD also provide a "timegate" HTTP "Link" header in
866 responses to requests for an Original Resource that the server knows
867 used to exist, but no longer does. This allows the use of an
868 Original Resource's URI as an entry point to representations of its
869 prior states even if the resource itself no longer exists. A
870 server's response for this case is illustrated in Figure 7.
872 HTTP/1.1 404 Not Found
873 Date: Thu, 21 Jan 2010 00:02:12 GMT
874 Server: Apache
875 Link:
876
877   ; rel="timegate"
878 Content-Length: 255
879 Connection: close
880 Content-Type: text/html; charset=iso-8859-1

882 Figure 7: Response to a request for an Original Resource that no
883 longer exists

885 In case the server is not aware of the prior existence of the
886 Original Resource, its response SHOULD NOT include a "timegate" HTTP
887 Link. Section 3.1.2.3 details what the user agent's behavior should
888 be in such cases.

890 3.1.2.3. Missing or Inadequate "timegate" Link in Original Server's
891 Response

893 A user agent MAY ignore the TimeGate returned in Step 2. However,
894 when engaging in a Memento request/response cycle, a user agent
895 SHOULD NOT proceed immediately to Step 3 by using a TimeGate of its
896 own preference but rather SHOULD always start the cycle by issuing an
897 HTTP GET/HEAD against the Original Resource (Step 1, Figure 3) as it
898 is an important way to learn about dedicated or preferred TimeGates
899 for the Original Resource. Also, cases exist in which the response
900 in Step 2 will not provide a "timegate" link, including:

902 o The Original Resource's server does not support the Memento
903   framework;

905 o The Original Resource no longer exists and the responding server
906   is not aware of its prior existence;

908 o The server that hosted the Original Resource no longer exists.

910 In all these cases, the user agent SHOULD attempt to determine an
911 appropriate TimeGate for the Original Resource, either automatically
912 or interactively supported by the user. The discovery mechanisms
913 described in Section 4 can support the user agent in this regard.

915 3.2. Interactions with a TimeGate

917 This section details HTTP GET/HEAD requests targeted at a TimeGate
918 (URI-G).

920 3.2.1.
Step 3: User Agent Negotiates with a TimeGate

922 In order to negotiate with a TimeGate, the user agent MUST issue an
923 HTTP HEAD or GET against its URI; its request MUST include the
924 "Accept-Datetime" header to express its datetime preference, and the
925 use of that header MUST be as described in Section 2.1.1.1. The URI
926 of the TimeGate may have been provided as the Target IRI of a
927 "timegate" HTTP "Link" header in the response from the Original
928 Resource (Step 2, Figure 4), or may have resulted from another
929 discovery mechanism (see Section 4) or user interaction. Such a
930 request is illustrated in Figure 8.

932 GET /timegate/http://a.example.org HTTP/1.1
933 Host: arxiv.example.net
934 Accept-Datetime: Tue, 11 Sep 2001 20:35:00 GMT
935 Connection: close

937 Figure 8: User agent negotiates with TimeGate

939 3.2.2. Step 4: Server Responds to Negotiation with TimeGate

941 In order to respond to a datetime negotiation request (Step 3,
942 Section 3.2.1), the server uses an internal algorithm to select the
943 Memento that best meets the user agent's datetime preference, and
944 redirects to it. The exact nature of the selection algorithm is at
945 the server's discretion but SHOULD be consistent. A variety of
946 approaches can be used including selecting the Memento that is
947 nearest in time (either past or future) or nearest in the past
948 relative to the requested datetime. Special cases for datetime
949 negotiation with a TimeGate exist, and they are addressed in
950 Section 3.2.2.3 through Section 3.2.2.7.

952 3.2.2.1. Successful Scenario

954 In cases where the TimeGate exists, and the datetime provided in the
955 user agent's "Accept-Datetime" header can be parsed and is not out of
956 the user agent's range (see Section 3.2.2.5), the server selects a
957 Memento based on the user agent's datetime preference.
The response
958 MUST have a "302 Found" HTTP status code, and the "Location" header
959 MUST be used to convey the URI of the selected Memento. The "Vary"
960 header MUST be provided and it MUST include the "negotiate" and
961 "accept-datetime" values to indicate that datetime negotiation has
962 taken place. The "Link" header MUST be provided and contain links
963 with Relation Types subject to the considerations described in
964 Section 2.2. Such a response is illustrated in Figure 9.

966 HTTP/1.1 302 Found
967 Date: Thu, 21 Jan 2010 00:06:50 GMT
968 Server: Apache
969 Vary: negotiate, accept-datetime
970 Location:
971   http://arxiv.example.net/web/20010911203610/http://a.example.org
972 Link: ; rel="original",
973
974   ; rel="timemap"; type="application/link-format",
975
976   ; rel="first memento"; datetime="Fri, 15 Sep 2000 11:28:26 GMT",
977
978   ; rel="last memento"; datetime="Tue, 08 Jul 2008 09:34:33 GMT",
979
980   ; rel="memento"; datetime="Tue, 11 Sep 2001 20:36:10 GMT",
981
982   ; rel="prev memento"; datetime="Tue, 11 Sep 2001 20:30:51 GMT",
983
984   ; rel="next memento"; datetime="Tue, 11 Sep 2001 20:47:33 GMT"
985 Content-Length: 0
986 Content-Type: text/plain; charset=UTF-8
987 Connection: close

989 Figure 9: Server of TimeGate responds

991 Note that if a user agent's "Accept-Datetime" header does not convey
992 an interval indicator, and conveys a datetime that is either earlier
993 than the datetime of the first Memento or later than the datetime of
994 the most recent Memento known to the server, the server's response is
995 as just described yet entails the selection of the first or most
996 recent Memento, respectively. This approach is consistent with
997 interpreting the absence of an interval indicator in the user
998 agent's request as an indication of an infinite interval around its
999 preferred datetime (see Section 2.1.1.1).
1001 This is illustrated in Figure 10, which shows the response from a
1002 TimeGate exposed by a MediaWiki server to a request by a user agent
1003 that has an "Accept-Datetime: Mon, 31 May 1999 00:00:00 GMT" header.
1004 Note that a link is provided with a "successor-version" Relation Type
1005 but not with a "predecessor-version" Relation Type.

1007 HTTP/1.1 302 Found
1008 Server: Apache
1009 Content-Length: 709
1010 Content-Type: text/html; charset=utf-8
1011 Date: Thu, 21 Jan 2010 00:09:40 GMT
1012 Location:
1013   http://a.example.org/w/index.php?title=Clock&oldid=1493688
1014 Vary: negotiate, accept-datetime
1015 Link: ; rel="original",
1016
1017   ; rel="timemap",
1018
1019   ; rel="first memento"; datetime="Sun, 28 Sep 2003 01:42:00 GMT",
1020
1021   ; rel="successor-version memento"
1022   ; datetime="Tue, 30 Sep 2003 14:28:00 GMT",
1023
1024   ; rel="last memento"; datetime="Tue, 12 Jan 2010 19:55:00 GMT"
1025 Connection: close

1027 Figure 10: A TimeGate's response to a request for a Memento with a
1028 datetime earlier than that of the first Memento

1030 3.2.2.2. Accept-Datetime and other Accept Headers Provided

1032 When interacting with a TimeGate, the regular content negotiation
1033 dimensions (media type, character encoding, language, and
1034 compression) remain available. It is the TimeGate server's
1035 responsibility to honor (or not) such content negotiation, and in
1036 doing so it MUST always first select a Memento that meets the user
1037 agent's datetime preference, and then consider honoring regular
1038 content negotiation for it. As a result of this approach, the
1039 returned Memento will not necessarily meet the user agent's regular
1040 content negotiation preferences.
Therefore, it is RECOMMENDED that
1041 the server provide HTTP Links with a "memento" Relation Type
1042 pointing at Mementos that do meet the user agent's regular content
1043 negotiation requests and that have a Memento-Datetime value in the
1044 temporal vicinity of the user agent's preferred datetime value.

1046 3.2.2.3. Accept-Datetime Not Provided

1048 In case, in Step 3, a user agent issues a request to a TimeGate and
1049 fails to include an "Accept-Datetime" request header, the response
1050 MUST be handled as in Section 3.2.2.1, with a selection of the most
1051 recent Memento known to the responding server.

1053 3.2.2.4. Multiple Matching Mementos

1055 Because the finest datetime granularity expressible using the RFC
1056 1123 [RFC1123] format used in HTTP is seconds level, cases may occur
1057 in which a TimeGate server is aware of multiple Mementos that meet
1058 the user agent's datetime preference. This may occur in Content
1059 Management Systems with very high update rates. The response in this
1060 case MUST be handled as in Section 3.2.2.1, with the selection of one
1061 of the matching Mementos.

1063 As an example, Figure 11 shows a hypothetical response from a
1064 TimeGate on a MediaWiki server to a request for a Memento for the
1065 Original Resource http://a.example.org/w/Clock for which two Mementos
1066 exist for the user agent's preferred datetime.
1068 HTTP/1.1 302 Found
1069 Server: Apache
1070 Content-Length: 705
1071 Content-Type: text/html; charset=utf-8
1072 Date: Thu, 21 Jan 2010 00:09:40 GMT
1073 Vary: negotiate, accept-datetime
1074 Location:
1075   http://a.example.org/w/index.php?title=Clock&oldid=322586071
1076 Link: ; rel="original",
1077
1078   ; rel="timemap";type="application/link-format",
1079
1080   ; rel="first memento"; datetime="Sun, 28 Sep 2003 01:42:00 GMT",
1081
1082   ; rel="last memento"; datetime="Tue, 12 Jan 2010 19:55:00 GMT",
1083
1084   ; rel="memento"; datetime="Sun, 31 May 2009 15:43:00 GMT",
1085
1086   ; rel="memento successor-version"
1087   ; datetime="Sun, 31 May 2009 15:43:00 GMT",
1088
1089   ; rel="memento predecessor-version"
1090   ; datetime="Sun, 31 May 2009 15:41:24 GMT"
1091 Connection: close

1093 Figure 11: A TimeGate's response to a request that has multiple
1094 Mementos with a matching datetime

1096 3.2.2.5. Datetime Out of the User Agent's Range

1098 In case, in Step 3, a user agent conveys an interval indicator, and
1099 the responding server is not aware of any Mementos with datetimes
1100 within the expressed interval, the server's response MUST have a "406
1101 Not Acceptable" HTTP status code. The use of the "Vary" header MUST
1102 be as described in Section 3.2.2.1. The use of the "Link" header
1103 MUST be as described in Section 2.2. Specifically, the use of links
1104 with a "memento" Relation Type MUST follow the rules for the case
1105 where no Memento is selected by the responding server
1106 (Section 2.2.1.4).

1108 Figure 12 shows a user agent using an "Accept-Datetime" header
1109 conveying an interval of interest starting 5 hours before and ending
1110 6 hours after Tue, 11 Sep 2001 20:35:00 GMT. Figure 13 shows the
1111 "406 Not Acceptable" response from the TimeGate that has links to the
1112 first and last Memento, as well as to a Memento outside of the user
1113 agent's interval yet in the temporal vicinity of its preferred
1114 datetime.
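
The selection behavior of Sections 3.2.2.1, 3.2.2.3 and 3.2.2.5 can be summarized in a short Python sketch. It is only one possible realization: the selection algorithm is at the server's discretion, and "nearest in time" is assumed here as one of the approaches named in Section 3.2.2. The function name, the tuple layout, and the labels URI-M0 through URI-Mn and URI-Mx are illustrative assumptions; the interval is passed as already-parsed durations rather than in "Accept-Datetime" syntax, and at least one Memento is assumed to be known.

```python
from datetime import datetime, timedelta, timezone


def negotiate(mementos, preferred=None, before=None, after=None):
    """Return (status, memento) for a datetime negotiation request.

    mementos  -- non-empty list of (uri, datetime) pairs known to the server
    preferred -- datetime taken from Accept-Datetime, or None if absent
    before, after -- optional timedeltas bounding the interval of interest
    """
    ordered = sorted(mementos, key=lambda m: m[1])
    if preferred is None:
        # No Accept-Datetime: select the most recent Memento.
        return 302, ordered[-1]
    candidates = ordered
    if before is not None and after is not None:
        low, high = preferred - before, preferred + after
        candidates = [m for m in ordered if low <= m[1] <= high]
        if not candidates:
            # Nothing within the interval of interest.
            return 406, None
    # Assumed policy: nearest in time, past or future. Without an interval
    # this yields the first or last Memento for out-of-range datetimes.
    return 302, min(candidates, key=lambda m: abs(m[1] - preferred))


utc = timezone.utc
# Datetimes follow Figure 9; the URI labels are placeholders.
archive = [
    ("URI-M0", datetime(2000, 9, 15, 11, 28, 26, tzinfo=utc)),
    ("URI-Mi", datetime(2001, 9, 11, 20, 30, 51, tzinfo=utc)),
    ("URI-Mj", datetime(2001, 9, 11, 20, 36, 10, tzinfo=utc)),
    ("URI-Mk", datetime(2001, 9, 11, 20, 47, 33, tzinfo=utc)),
    ("URI-Mn", datetime(2008, 7, 8, 9, 34, 33, tzinfo=utc)),
]
# Datetimes follow Figure 13: no Memento near the preferred datetime.
sparse = [
    archive[0],
    ("URI-Mx", datetime(2001, 9, 10, 8, 22, 0, tzinfo=utc)),
    archive[-1],
]
tj = datetime(2001, 9, 11, 20, 35, 0, tzinfo=utc)

print(negotiate(archive, tj))
print(negotiate(sparse, tj, timedelta(hours=5), timedelta(hours=6)))
```

A server preferring "nearest in the past" would only need to change the final selection step; the 406 branch is independent of that choice.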
1116 GET /timegate/http://a.example.org HTTP/1.1
1117 Host: arxiv.example.net
1118 Accept-Datetime: Tue, 11 Sep 2001 20:35:00 GMT; -P5H;+P6H
1119 Connection: close

1121 Figure 12: User agent expresses interval of interest in Accept-
1122 Datetime header

1124 HTTP/1.1 406 Not Acceptable
1125 Date: Thu, 21 Jan 2010 00:06:50 GMT
1126 Server: Apache
1127 Vary: negotiate, accept-datetime
1128 Link: ; rel="original",
1129
1130   ; rel="timemap";type="application/link-format",
1131
1132   ; rel="memento first"; datetime="Fri, 15 Sep 2000 11:28:26 GMT",
1133
1134   ; rel="memento last"; datetime="Tue, 08 Jul 2008 09:34:33 GMT",
1135
1136   ; rel="memento"; datetime="Mon, 10 Sep 2001 08:22:00 GMT"
1137 Content-Length: 1732
1138 Connection: close
1139 Content-Type: text/plain; charset=UTF-8

1141 Figure 13: A TimeGate's response indicating it has no Mementos within
1142 the interval of interest

1144 3.2.2.6. Accept-Datetime Unparseable

1146 In case, in Step 3, a user agent conveys a value for the "Accept-
1147 Datetime" request header that does not conform to the accept-dt-value
1148 construction rule of the BNF in Figure 1, the TimeGate server's
1149 response MUST have a "400 Bad Request" HTTP status code. In all
1150 other respects, responses in this case MUST be handled as described
1151 in Section 3.2.2.5.

1153 3.2.2.7. TimeGate Does Not Exist

1155 Cases may occur in which a user agent issues a request against a
1156 TimeGate that does not exist. This may, for example, occur when a
1157 user agent uses internal knowledge to construct the URI of an
1158 assumed, yet non-existent TimeGate. In these cases, the response
1159 from the target server MUST have a "404 Not Found" HTTP status code,
1160 and SHOULD include a "Vary" header that includes the "negotiate" and
1161 "accept-datetime" values as an indication that, generally, the server
1162 is capable of datetime negotiation.
The response MUST NOT include a
1163 "Link" header with any of the Relation Types introduced in
1164 Section 2.2.1.

1166 3.2.2.8. HTTP Methods other than HEAD/GET

1168 In the above, the safe HTTP methods GET and HEAD are described for
1169 TimeGates. TimeGates MAY support the safe HTTP methods OPTIONS and
1170 TRACE in the way described in RFC 2616 [RFC2616]. Unsafe HTTP
1171 methods (i.e. PUT, POST, DELETE) MUST NOT be supported by a
1172 TimeGate. Such requests MUST yield a response with a "405 Method Not
1173 Allowed" HTTP status code, and MUST include an "Allow" header to
1174 convey that only the HEAD and GET (and optionally the OPTIONS and
1175 TRACE) methods are supported. In addition, the response MUST have a
1176 "Vary" header that includes the "negotiate" and "accept-datetime"
1177 values to indicate the TimeGate supports datetime negotiation.
1178 Figure 14 shows such a response.

1180 HTTP/1.1 405 Method Not Allowed
1181 Date: Thu, 21 Jan 2010 00:02:12 GMT
1182 Server: Apache
1183 Vary: negotiate, accept-datetime
1184 Allow: HEAD, GET
1185 Content-Length: 255
1186 Connection: close
1187 Content-Type: text/html; charset=iso-8859-1

1189 Figure 14: Response from a TimeGate accessed with HTTP method other
1190 than HEAD/GET

1192 3.2.3. Recognizing a TimeGate

1194 When a user agent issues an HTTP HEAD/GET request against a resource
1195 of which it found the URI as the Target IRI of an entry in the "Link"
1196 header with a "timegate" Relation Type, it SHOULD NOT assume that the
1197 targeted resource effectively is a TimeGate and hence will behave as
1198 described in Section 3.2.2.

1200 A user agent MUST decide it has reached a TimeGate if the response to
1201 an HTTP HEAD/GET request against the resource's URI contains a "Vary"
1202 header that includes the "negotiate" and "accept-datetime" values.
1203 If the response does not, the user agent MUST decide it has not
1204 reached a TimeGate and proceed as follows:

1206 o If the response contains a redirection, the user agent SHOULD
1207   follow it. Note that even a chain of redirections is possible,
1208   e.g. URI-R -> URI-1 -> URI-2 -> ... -> URI-G

1210 o If the response does not contain a redirection, or if the
1211   redirection (chain) does not lead to a TimeGate, the user agent
1212   SHOULD attempt to determine an appropriate TimeGate for the
1213   Original Resource, either automatically or interactively supported
1214   by the user. The discovery mechanisms described in Section 4 can
1215   support the user agent in this regard.

1217 Resources that are not TimeGates (i.e. do not behave as described in
1218 Section 3.2.2) MUST NOT use a "Vary" header that includes the
1219 "accept-datetime" value.

1221 3.3. Interactions with a Memento

1223 This section details HTTP GET/HEAD requests targeted at a Memento
1224 (URI-M).

1226 3.3.1. Step 5: User Agent Requests a Memento

1228 In Step 5, the user agent issues an HTTP GET request against the URI
1229 of a Memento. The user agent MAY include an "Accept-Datetime" header
1230 in this request, but the existence or absence of this header MUST NOT
1231 affect the server's response. The URI of the Memento may have
1232 resulted from a response in Step 4, or the user agent may simply have
1233 happened upon it. Such a request is illustrated in Figure 15.

1235 GET /web/20010911203610/http://a.example.org HTTP/1.1
1236 Host: arxiv.example.net
1237 Accept-Datetime: Tue, 11 Sep 2001 20:35:00 GMT
1238 Connection: close

1239 Figure 15: User agent requests Memento

1241 3.3.2.
Step 6: Server Responds to a Request for a Memento

1243 If the Memento requested by the user agent in Step 5 exists, the
1244 server's response MUST have a "200 OK" HTTP status code (or "206
1245 Partial Content", where appropriate), and it MUST include a "Memento-
1246 Datetime" header with a value equal to the archival datetime of the
1247 Memento, that is, the datetime of the state of the Original Resource
1248 that is encapsulated in the Memento. The "Link" header MUST be
1249 provided and contain links subject to the considerations described in
1250 Section 2.2. The Target IRI and, when applicable, the datetime
1251 values in the "Link" header associated with the "memento" Relation
1252 Type SHOULD be the same as conveyed in Step 4, in case the TimeGate
1253 and the selected Memento reside on the same server. However, they
1254 MAY be different in case the TimeGate and the selected Memento reside
1255 on different servers.

1257 Figure 16 illustrates the server's response to the request issued
1258 against a Memento in Step 5 (Figure 15).

1260 HTTP/1.1 200 OK
1261 Date: Thu, 21 Jan 2010 00:09:40 GMT
1262 Server: Apache-Coyote/1.1
1263 Memento-Datetime: Tue, 11 Sep 2001 20:36:10 GMT
1264 Link: ; rel="original",
1265
1266   ; rel="timemap"; type="application/link-format",
1267
1268   ; rel="timegate",
1269
1270   ; rel="first memento"; datetime="Fri, 15 Sep 2000 11:28:26 GMT",
1271
1272   ; rel="last memento"; datetime="Tue, 08 Jul 2008 09:34:33 GMT",
1273
1274   ; rel="memento"; datetime="Tue, 11 Sep 2001 20:36:10 GMT",
1275
1276   ; rel="prev memento"; datetime="Tue, 11 Sep 2001 20:30:51 GMT",
1277
1278   ; rel="next memento"; datetime="Tue, 11 Sep 2001 20:47:33 GMT"
1279 Content-Length: 23364
1280 Content-Type: text/html;charset=utf-8
1281 Connection: close

1283 Figure 16: Server of Memento responds

1285 The server's response MUST include the "Memento-Datetime" header
1286 regardless of whether the user agent's request contained an "Accept-
1287 Datetime" header or not.
This is the way by which resources make
1288 explicit that they are Mementos. Due to the sparseness of Mementos
1289 in most archives, the value of the "Memento-Datetime" header returned
1290 by a server may differ (significantly) from the value conveyed by the
1291 user agent in "Accept-Datetime".

1293 Although a Memento encapsulates a prior state of an Original
1294 Resource, the entity-body returned in response to an HTTP GET request
1295 issued against a Memento may very well not be byte-for-byte the same
1296 as an entity-body that was previously returned by that Original
1297 Resource. Various reasons exist why there are significant chances
1298 these would be different yet do convey substantially the same
1299 information. These include format migrations as part of a digital
1300 preservation strategy, URI-rewriting as applied by some Web archives,
1301 and the addition of banners as a means to brand Web archives.

1303 3.3.2.1. Memento Does not Exist

1305 Cases may occur in which a TimeGate's response (Step 4) points at a
1306 Memento that actually does not exist, resulting in a user agent's
1307 request (Step 5) for a non-existent Memento. In this case, the
1308 server's response MUST have the expected "404 Not Found" HTTP Status
1309 Code and it MUST NOT contain a "Memento-Datetime" header.

1311 3.3.2.2. Mementos Without a TimeGate

1313 Cases may occur in which a server that hosts Mementos does not expose
1314 a TimeGate for those Mementos. This can, for example, be the case if
1315 the server's Mementos result from taking a snapshot of the state of a
1316 set of Original Resources from another server at the time this other
1317 server is being retired. As a result, only a single Memento per
1318 Original Resource is hosted, making the introduction of a TimeGate
1319 unnecessary. But it may also be the case for servers that host
1320 multiple Mementos for an Original Resource but consider exposing
1321 TimeGates too expensive.
1323 In cases of Mementos without associated TimeGates, responses to a
1324 request for a Memento by a user agent MUST be as described in
1325 Section 3.3.2 with the exception that it will not contain an HTTP Link
1326 with a "timegate" Relation Type pointing at a TimeGate exposed by the
1327 responding server. It MAY still contain such a Link pointing at a
1328 TimeGate exposed elsewhere. Depending on whether one or more
1329 Mementos are hosted for an Original Resource, the response may or may
1330 not have an HTTP Link with a "timemap" Relation Type. However, the
1331 response MUST still contain a "Memento-Datetime" response header with
1332 a value that corresponds to the archival datetime of the Memento.

1334 Figure 17 illustrates the server's response to the request issued
1335 against a Memento in Step 5 (Figure 15) for the case that Memento has
1336 no associated TimeGate. In this example, it is also assumed there is
1337 only one Memento for the Original Resource, and hence the Links with
1338 Relation Types "memento", "first", "last" all point at the same -
1339 responding - Memento.

1341 HTTP/1.1 200 OK
1342 Date: Thu, 21 Jan 2010 00:09:40 GMT
1343 Server: Apache-Coyote/1.1
1344 Memento-Datetime: Tue, 11 Sep 2001 20:36:10 GMT
1345 Link: ; rel="original",
1346
1347   ; rel="first last memento"
1348   ; datetime="Tue, 11 Sep 2001 20:36:10 GMT"
1349 Content-Length: 23364
1350 Content-Type: text/html;charset=utf-8
1351 Connection: close

1353 Figure 17: Server of Memento without TimeGate responds

1355 Note that a server issuing a response similar to that of Figure 17
1356 does not imply that there is no server whatsoever that exposes a
1357 TimeGate; it merely means that the responding server neither provides
1358 nor is aware of the location of a TimeGate.

1360 3.3.3.
Recognizing a Memento

1362 When following the redirection provided by a confirmed TimeGate (see
1363 Section 3.2.3), a user agent SHOULD NOT assume that the targeted
1364 resource effectively is a Memento and hence will behave as described
1365 in Section 3.3.2.

1367 A user agent MUST decide it has reached a Memento if the response to
1368 an HTTP HEAD/GET request against the resource's URI contains a
1369 "Memento-Datetime" header with a legitimate value. If the response
1370 does not, the following applies:

1372 o If the response contains a redirection, the user agent SHOULD
1373   follow it. Even a chain of redirections is possible, e.g. URI-G
1374   -> URI-X -> URI-Y -> ... -> URI-M.

1376 o If the response by a confirmed TimeGate does not contain a
1377   redirection, or if the redirection (chain) that started at a
1378   confirmed TimeGate does not lead to a resource that provides a
1379   "Memento-Datetime" header, the user agent MAY still conclude that
1380   it has likely arrived at a Memento. That is because cases exist
1381   in which Web archives and CMS are made compliant with the Memento
1382   framework "by proxy". In these cases TimeGates will redirect to
1383   Mementos in such systems, but the responses from these Mementos
1384   will not (yet) include a "Memento-Datetime" header.

1386 3.4. Interactions with a TimeMap

1388 A TimeMap is introduced to support retrieving a comprehensive list of
1389 all Mementos for a specific Original Resource, known to a responding
1390 server.
   The entity-body of a response to an HTTP GET request issued
   against a TimeMap's URI:

   o  MUST list the URI of the Original Resource that the response lists
      Mementos for;

   o  MUST list the URI and datetime of each Memento for the Original
      Resource known to the responding server;

   o  MUST list the URI of one or more TimeGates for the Original
      Resource except when no TimeGate exists (see Section 3.3.2.2);

   o  SHOULD, for self-containment, list the URI of the TimeMap itself;

   o  MUST unambiguously type listed resources as being Original
      Resource, TimeGate, Memento, or TimeMap.

   The entity-body of a response from a TimeMap MAY be serialized in
   various ways, but the link-value format serialization MUST be
   supported.  In this serialization, the entity-body MUST be formatted
   in the same way as the value of an HTTP "Link" header, and hence MUST
   comply with the "link-value" construction rule of "Section 5.  The
   Link Header Field" of RFC5988 [RFC5988], and the media type of the
   entity-body MUST be "application/link-format" as introduced in
   I-D.ietf-core-link-format [I-D.ietf-core-link-format].  All links
   conveyed in this serialization MUST be interpreted as having the URI
   of the Original Resource as their Context IRI.  The URI of the
   Original Resource is provided in the entity-body as the Target IRI of
   the link with an "original" Relation Type.

3.4.1.  User Agent Requests a TimeMap

   In order to retrieve the link-value serialization of a TimeMap, a
   user agent SHOULD use an "Accept" request header with a value set to
   "application/link-format".  This is shown in Figure 18.

   GET /timemap/http://a.example.org HTTP/1.1
   Host: arxiv.example.net
   Accept: application/link-format;q=1.0
   Connection: close

                    Figure 18: Request for a TimeMap

3.4.2.  Server Responds to a Request for a TimeMap

   If the TimeMap requested by the user agent exists, the server's
   response MUST have a "200 OK" HTTP status code (or "206 Partial
   Content", where appropriate).  Note that a TimeMap is itself an
   Original Resource for which Mementos may exist.  For example, a
   response from a TimeMap could provide a "timegate" Link to a TimeGate
   via which prior TimeMap versions are available.  In this case, the
   use of the "Link" header is subject to all considerations described
   in Section 2.2, with the TimeMap acting as the Original Resource.

   However, in case a TimeMap wants to explicitly indicate in its
   response headers for which Original Resource it is a TimeMap, it MUST
   do so by including an HTTP "Link" header with the following
   characteristics:

   o  The Context IRI for the HTTP Link is the URI of the Original
      Resource;

   o  The Relation Type is "timemap";

   o  The Target IRI for the HTTP Link is the URI of the TimeMap.

   Because the Context IRI of this HTTP Link is not the URI of the
   TimeMap, as per RFC5988 [RFC5988], the default Context IRI must be
   overridden by using the "anchor" attribute with a value of the URI
   of the Original Resource.

   The response from the TimeMap to the request of Figure 18 is shown in
   Figure 19.  The response header shows the TimeMap explicitly
   conveying the URI of the Original Resource for which it is a TimeMap;
   for practical reasons the entity-body in the example has been
   abbreviated.  Notice also the use of the "license" and "embargo"
   attributes introduced in Section 2.2.1.4 on the "memento" links in
   the TimeMap.
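   As an illustration only, the following Python sketch shows one way a
   user agent might read a TimeMap entity-body in the link-value
   serialization described in Section 3.4.  It is not part of the
   framework.  The function names (parse_timemap, mementos), the
   simplified regular expressions, and the Memento URIs in SAMPLE are
   assumptions made for this example; the sketch does not implement the
   full "link-value" grammar of RFC5988.

```python
import re
from email.utils import parsedate_to_datetime

# Simplified matcher (an assumption, not the full RFC 5988 grammar):
# "<target>" followed by zero or more ;name="value" or ;name=value pairs.
_LINK_RE = re.compile(
    r'<([^>]*)>((?:\s*;\s*[\w-]+\s*=\s*(?:"[^"]*"|[^;,\s]+))*)'
)
_PARAM_RE = re.compile(r';\s*([\w-]+)\s*=\s*(?:"([^"]*)"|([^;,\s]+))')


def parse_timemap(body):
    """Return one dict per link found in a link-format TimeMap body."""
    links = []
    for target, raw_params in _LINK_RE.findall(body):
        params = {}
        for name, quoted, bare in _PARAM_RE.findall(raw_params):
            params[name.lower()] = quoted or bare
        link = {
            "uri": target,
            # "rel" may carry several Relation Types, e.g. "first memento".
            "rels": params.get("rel", "").split(),
            "params": params,
        }
        if "datetime" in params:
            link["datetime"] = parsedate_to_datetime(params["datetime"])
        links.append(link)
    return links


def mementos(links):
    """Return the links typed as "memento", oldest first."""
    found = [link for link in links if "memento" in link["rels"]]
    return sorted(found, key=lambda link: link["datetime"])


# Hypothetical TimeMap body; the Memento URIs are invented for the sketch.
SAMPLE = (
    '<http://a.example.org>;rel="original",\n'
    ' <http://arxiv.example.net/timemap/http://a.example.org>\n'
    '   ; rel="timemap";type="application/link-format",\n'
    ' <http://arxiv.example.net/timegate/http://a.example.org>\n'
    '   ; rel="timegate",\n'
    ' <http://arxiv.example.net/web/20091027204954/http://a.example.org>\n'
    '   ; rel="last memento";datetime="Tue, 27 Oct 2009 20:49:54 GMT",\n'
    ' <http://arxiv.example.net/web/20000620180259/http://a.example.org>\n'
    '   ; rel="first memento";datetime="Tue, 20 Jun 2000 18:02:59 GMT"\n'
)

if __name__ == "__main__":
    for link in mementos(parse_timemap(SAMPLE)):
        print(link["datetime"].isoformat(), link["uri"])
```

   Splitting the "rel" value on whitespace keeps links that carry
   several Relation Types (such as "first memento") discoverable under
   each of them, which is why the sketch stores a list rather than a
   single string.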
   HTTP/1.1 200 OK
   Date: Thu, 21 Jan 2010 00:06:50 GMT
   Server: Apache
   Link:
      ; anchor="http://a.example.org"; rel="timemap"
      ; type="application/link-format"
   Content-Length: 4883
   Content-Type: application/link-format
   Connection: close

      ;rel="original",
      ; rel="timemap";type="application/link-format",
      ; rel="timegate",
      ; rel="first memento";datetime="Tue, 20 Jun 2000 18:02:59 GMT"
      ; license="http://creativecommons.org/publicdomain/zero/1.0/",
      ; rel="last memento";datetime="Tue, 27 Oct 2009 20:49:54 GMT"
      ; license="http://creativecommons.org/publicdomain/zero/1.0/"
      ; embargo="Tue, 19 Apr 2011 00:00:00 GMT",
      ; rel="memento";datetime="Wed, 21 Jun 2000 01:17:31 GMT"
      ; license="http://creativecommons.org/publicdomain/zero/1.0/",
      ; rel="memento";datetime="Wed, 21 Jun 2000 04:41:56 GMT"
      ; license="http://creativecommons.org/publicdomain/zero/1.0/",
   ...

                   Figure 19: Response from a TimeMap

4.  The Memento Framework, Discovery Component

   Section 3 describes how TimeGates, Mementos, Original Resources, and
   TimeMaps can be discovered by following HTTP Links with Relation
   Types "timegate", "memento", "original", and "timemap", respectively.

   Naturally, some of these links can also be embedded into
   representations of resources that have a media type that allows
   embedding of typed links.  For example, an Original Resource that has
   an HTML representation can include a "timegate" link by using HTML's
   LINK element, e.g. .  The use of such embedded links is also subject
   to the considerations of Section 2.2.

   In this section additional approaches are introduced that support
   batch discovery of TimeGates, TimeMaps, and Mementos.  The approaches
   leverage the Robots Exclusion Protocol and a special-purpose profile
   of Atom Feeds named TimeMap Feeds.

4.1.  Discovery of TimeGates Via Robots Exclusion Protocol

   The Robots Exclusion Protocol's robots.txt file [robotstxt.org] is
   commonly used by Web site owners to give instructions about their
   site to Web robots.  It is used both to protect resources hosted by a
   server from crawling and to facilitate discovering them.  This
   document introduces the "TimeGate" and "Archived" directives for
   robots.txt to provide a server-wide mechanism to support TimeGate
   discovery that SHOULD be used by:

   o  Servers of Original Resources;

   o  Servers that provide access to Mementos of Original Resources by
      exposing TimeGates.

   A robots.txt file MAY contain zero or more occurrences of the
   "TimeGate" directive, and each occurrence MUST be followed by one or
   more associated "Archived" directives.  The meaning of the directives
   is as follows:

   o  TimeGate: Conveys the base URL (that is, the URI scheme, host, and
      path component) that is shared by all URIs of TimeGates of a set
      of Original Resources.

   o  Archived: Indicates, by means of the mandatory host and optional
      path parts of a URI, for which set of Original Resources actual
      TimeGates are available that have the base URL conveyed in the
      associated TimeGate directive.

   For example, consider a wiki at http://a.example.org/w/ that supports
   the Memento framework and exposes TimeGates to access the wiki's
   history pages at base URL
   http://a.example.org/w/index.php/Special:TimeGate/.  An actual
   TimeGate for the wiki's http://a.example.org/w/My_Title page would
   then be at http://a.example.org/w/index.php/Special:TimeGate/http://
   a.example.org/w/My_Title.  This wiki SHOULD make its TimeGates
   discoverable by using the directives shown in Figure 20 in its
   robots.txt file.
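   As an illustration only, the following Python sketch shows how a
   client might read "TimeGate" and "Archived" directives and derive
   the TimeGate URI for an Original Resource, using the wiki example
   above.  The function names are hypothetical.  Treating an "Archived"
   value as a prefix of the Original Resource's host and path is an
   assumption of this sketch; this section does not define a matching
   algorithm.

```python
from urllib.parse import urlsplit


def parse_timegate_directives(robots_txt):
    """Return a list of (timegate_base_url, [archived_values]) pairs."""
    groups = []
    for raw_line in robots_txt.splitlines():
        line = raw_line.strip()
        if ":" not in line:
            continue
        name, value = (part.strip() for part in line.split(":", 1))
        if name.lower() == "timegate":
            groups.append((value, []))
        elif name.lower() == "archived" and groups:
            # An "Archived" directive belongs to the preceding "TimeGate".
            groups[-1][1].append(value)
    return groups


def timegate_for(original_uri, groups):
    """Return a TimeGate URI for original_uri, or None if nothing matches.

    Assumption: an "Archived" value of "*" matches everything, and any
    other value matches when it is a prefix of "host/path".
    """
    parts = urlsplit(original_uri)
    host_and_path = parts.netloc + (parts.path or "/")
    for base_url, archived_values in groups:
        for archived in archived_values:
            if archived == "*" or host_and_path.startswith(archived):
                # The TimeGate URI is the base URL followed by the URI
                # of the Original Resource, as in the wiki example.
                return base_url + original_uri
    return None


WIKI_ROBOTS_TXT = """\
TimeGate: http://a.example.org/w/index.php/Special:TimeGate/
Archived: a.example.org/w/
"""

if __name__ == "__main__":
    wiki_groups = parse_timegate_directives(WIKI_ROBOTS_TXT)
    print(timegate_for("http://a.example.org/w/My_Title", wiki_groups))
```

   Returning None for an unmatched Original Resource lets a caller fall
   back to other discovery mechanisms, such as a "timegate" Link in the
   response from the Original Resource itself.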
   TimeGate: http://a.example.org/w/index.php/Special:TimeGate/
   Archived: a.example.org/w/

      Figure 20: robots.txt for a wiki, host of Original Resources,
                        TimeGates, and Mementos

   As another example, consider a server of Original Resources at
   http://a.example.org/ and http://www.a.example.org/ that is aware
   that its resources are regularly crawled by a Web archive that
   generally exposes TimeGates at base URL
   http://arxiv.example.net/timegate/ and hence has TimeGate
   http://arxiv.example.net/timegate/http://a.example.org/ to access
   Mementos for http://a.example.org/.  This server SHOULD make the
   remote TimeGates discoverable by including the directives shown in
   Figure 21 in its robots.txt file:

   TimeGate: http://arxiv.example.net/timegate/
   Archived: a.example.org/
   Archived: www.a.example.org/

     Figure 21: robots.txt for a server of Original Resources aware of
                            remote TimeGates

   Finally, consider a Web archive that crawls a wide range of Original
   Resources, and exposes TimeGates to access the resulting Mementos at
   base URL http://arxiv.example.net/timegate/.  In order to make its
   TimeGates discoverable, this Web archive SHOULD include the
   directives shown in Figure 22 in its robots.txt file:

   TimeGate: http://arxiv.example.net/timegate/
   Archived: *

     Figure 22: robots.txt for a Web Archive that hosts Mementos for a
                   wide range of Original Resources

4.2.  Discovery of TimeMaps Via TimeMap Feeds and Robots Exclusion
      Protocol

   Atom Feeds [RFC4287] are commonly used to support discovery of news
   items by humans and are also frequently used for automated discovery
   by a variety of applications.  This section introduces a profile of
   Atom Feeds named TimeMap Feeds intended to support batch discovery of
   TimeMaps.  The discovery of TimeMap Feeds is in turn supported by
   the new "TimeMapFeed" directive for robots.txt.

4.2.1.  TimeMap Feeds

   TimeMap Feeds are special-purpose Atom Feeds that SHOULD be published
   by servers to support batch discovery of their Mementos.  The
   following are the essential characteristics of a TimeMap Feed:

   o  The feed has one entry per Original Resource for which the server
      hosts Mementos;

   o  Each entry conveys a TimeMap that lists the Mementos the server
      hosts for the associated Original Resource;

   o  An entry can convey a TimeMap by-value (as inline content) or
      by-reference (as linked content);

   o  Entries in the feed are provided in reverse chronological order:
      the most recently updated entry MUST be listed first;

   o  Any change to a TimeMap conveyed by an entry MUST result in a
      change of the updated time of the entry, irrespective of the
      TimeMap being conveyed by-value or by-reference;

   o  Depending on the size of the collection of Mementos, a server may
      need to publish multiple TimeMap Feeds.  In this case, the server
      SHOULD also publish an Index of TimeMap Feeds that has an entry
      per published TimeMap Feed.

   Further details about the use of feed-level and entry-level elements
   in a TimeMap Feed are provided in Section 4.2.1.1 and
   Section 4.2.1.2, respectively.

4.2.1.1.  TimeMap Feeds: Feed-level Elements

   This section discusses the use of feed-level Atom elements in TimeMap
   Feeds.  All elements are as specified in Atom [RFC4287], yet
   additional constraints or guidelines apply to some when used in
   TimeMap Feeds.

4.2.1.1.1.  Feed-level element: atom:id

   As the content of the atom:id element, a tag URI [RFC4151] or an HTTP
   URI equal to the one provided as the value of the "href" attribute of
   the MANDATORY feed-level "atom:link" element with a "rel" attribute
   equal to "self" is RECOMMENDED.

4.2.1.1.2.  Feed-level element: atom:author

   The atom:author element MUST occur exactly once, and is constructed
   as follows:

   o  It MUST contain exactly one "atom:name" element, which contains
      the name of the server that publishes the TimeMap Feed.

   o  It MUST contain exactly one "atom:uri" element, which contains the
      URI of the server that publishes the TimeMap Feed.

   o  It SHOULD contain exactly one "atom:email" element, which contains
      the email address of a technical contact for the server that
      publishes the TimeMap Feed.

4.2.1.1.3.  Feed-level element: atom:category

   The atom:category element MUST occur at least once, and its use is as
   follows:

   o  One atom:category element MUST be included to assert that the feed
      is a TimeMap Feed.  It MUST have
      "http://purl.org/memento/categories/feedtype" as the value of the
      "scheme" attribute and it MUST have "TimeMapFeed" as the value of
      the "term" attribute.

   o  In case the feed is an Index of TimeMap Feeds, an atom:category
      element SHOULD be included that has
      "http://purl.org/memento/categories/feedtype" as the value of the
      "scheme" attribute and "IndexFeed" as the value of the "term"
      attribute.

   o  One or more atom:category elements SHOULD be included to coarsely
      convey information about the Original Resources for which the feed
      provides TimeMaps.  Each MUST have
      "http://purl.org/memento/categories/archived" as the value of the
      "scheme" attribute and the value of the "term" attribute MUST be
      as specified for the "Archived" directive of the Memento extension
      for robots.txt (see Section 4.1).

   o  One atom:category element SHOULD be included to convey the type of
      archival system for which the TimeMap Feed provides an inventory.
      It MUST have "http://purl.org/memento/categories/class" as the
      value of the "scheme" attribute.
      Values of the "term" attribute MUST be taken from the vocabulary at
      http://purl.org/memento/categories/class.

4.2.1.1.4.  Feed-level elements: Example

   Figure 23 shows the use of feed-level elements for a TimeMap Feed
   published by the server http://arxiv.example.net/.

   http://arxiv.example.net/timemapfeeds/feed1
   Feed 1 of arxiv.example.net TimeMaps
   2011-05-01 12:34:00 GMT
     Example Web Archive
     http://arxiv.example.net/
     admin@arxiv.example.net
   Content of this feed is public domain.
   http://arxiv.example.net/images/icon.png

   ... entries go here ...

             Figure 23: Feed-level elements for a TimeMap Feed

4.2.1.2.  TimeMap Feeds: Entry-level Elements

   This section discusses the use of entry-level Atom elements in
   TimeMap Feeds.  All elements are as specified in Atom [RFC4287], yet
   additional constraints or guidelines apply to some when used in
   TimeMap Feeds.

4.2.1.2.1.  Entry-level element: atom:id

   The content of the atom:id element MUST be a tag URI [RFC4151] as
   specified by the "timemap-tagURI" construction rule of Figure 24, and
   it MUST have the URI of the Original Resource as the value for the
   "or-uri" component of that rule.  If the feed is moved or copied, the
   tag URI that is provided as the value of the atom:id element MUST
   remain the same.

   timemap-tagURI = "tag:" taggingEntity ":" or-uri
   taggingEntity  = DNSname "," "2011"
   DNSname        = DNScomp *( "." DNScomp ) ; see RFC 1035
   DNScomp        = alphaNum [*(alphaNum /"-") alphaNum]
   alphaNum       = DIGIT / ALPHA
   or-uri         = scheme ":" hier-part [ "?" query ] ; see RFC 3986
   scheme         = "http" | "https"

        Figure 24: BNF for a tag URI for entries in TimeMap feeds

4.2.1.2.2.  Entry-level element: atom:author

   The atom:author element MUST NOT be used.
   Authorship information for an entry is inherited as follows:

   o  From the atom:source child element of the entry, if it exists and
      contains authorship information;

   o  From the feed-level atom:author element otherwise.

4.2.1.2.3.  Entry-level element: atom:updated

   The atom:updated element MUST be used and its value MUST change
   whenever the entry changes, including when the TimeMap conveyed by
   the entry (by-value or by-reference) changes.

4.2.1.2.4.  Entry-level element: atom:link

   The atom:link element MUST occur at least once and its use is as
   follows:

   o  One atom:link element MUST be included to convey the URI of the
      Original Resource for which the entry conveys a TimeMap.  This
      atom:link MUST have "original" as the value of the "rel"
      attribute, and the URI of the Original Resource as the value of
      the "href" attribute.  This atom:link MAY have a "type" attribute
      to convey a MIME type.

   o  If the TimeMap is not embedded within the atom:content child
      element of the entry (see below), there MUST be an atom:link
      element that has both "alternate" and "timemap" as values for the
      "rel" attribute, and that has the URI of a TimeMap serialized
      according to the link-value format (see Section 3.4) as the value
      of the "href" attribute.  The value of the "type" attribute for
      this link MUST be "application/link-format".

   o  Additional atom:link elements MAY be provided, each linking to
      another TimeMap serialization (such as RDF/XML).  These atom:link
      elements MUST have "timemap" as the value for the "rel" attribute
      and MUST have a "type" attribute that conveys the MIME type of the
      serialization.

   o  An atom:link element MAY be provided that links to a TimeGate for
      the Original Resource.  This atom:link MUST have "timegate" as the
      value for the "rel" attribute, and it MUST NOT have a "type"
      attribute.

4.2.1.2.5.  Entry-level element: atom:content

   If the entry does not contain an atom:link element pointing to a
   TimeMap serialized according to the link-value format, then the
   atom:content element MUST be used to directly contain such a TimeMap
   wrapped in a CDATA section.  The "type" attribute of this
   atom:content element MUST be used and it MUST have the value
   "application/link-format" (see Section 3.4).

4.2.1.2.6.  Entry-level elements: Example

   Figure 25 shows the use of entry-level elements for a TimeMap Feed
   published by the server http://arxiv.example.net/.

   ... feed information ...

    tag:arxiv.example.net,2011:http://a.example.org/
    <updated>2011-05-01 12:34:00 GMT</updated>
    <link rel="alternate timemap"
          type="application/link-format"
          href="http://arxiv.example.net/timemap/http://a.example.org"/>
    <link rel="original" href="http://a.example.org/" />
    <link rel="timegate"
          href="http://arxiv.example.net/timegate/http://a.example.org"/>
   </entry>

   ... more entries ...

   </feed>

            Figure 25: Entry-level elements for a TimeMap Feed

4.2.2.  Discovering TimeMap Feeds via Robots Exclusion Protocol

   Servers that publish TimeMap Feeds SHOULD make them discoverable by
   using the "TimeMapFeed" directive for robots.txt that is introduced
   here.

   A robots.txt file MAY contain zero or more occurrences of the
   "TimeMapFeed" directive, and its meaning is as follows:

   o  TimeMapFeed: Conveys the HTTP(S) URI of a TimeMap Feed or of an
      Index of TimeMap Feeds.

   Figure 26 shows an excerpt of the robots.txt file of the server at
   http://arxiv.example.net/ that hosts two TimeMap Feeds to make its
   Mementos discoverable.

   TimeMapFeed: http://arxiv.example.net/timemapfeeds/feed1
   TimeMapFeed: http://arxiv.example.net/timemapfeeds/feed2

       Figure 26: robots.txt to support discovery of TimeMap Feeds

4.3.  Discovering Mementos via Robots Exclusion Protocol

   Servers can support discovery of their Mementos by crawlers through
   the use of the Robots Exclusion Protocol, but SHOULD do so in a
   manner that conveys to crawlers and mirroring applications that the
   sticky Memento-Datetime behavior (see Section 2.1.1) MUST be
   respected.  To that end, servers SHOULD use the "User-agent" and
   "Allow" directives of the Robots Exclusion Protocol in the following
   manner:

   o  User-agent: Has "memento" as its value;

   o  Allow: Lists the path that contains Mementos that can be crawled,
      and for which content can be mirrored subject to the sticky
      Memento-Datetime behavior.

   Figure 27 shows the robots.txt for a server that generally disallows
   crawling, yet allows agents that respect the sticky Memento-Datetime
   behavior to crawl Mementos in the /web/ path.

   User-agent: *
   Disallow: /
   User-agent: memento
   Allow: /web/

      Figure 27: Restricting crawling to agents that respect sticky
                        Memento-Datetime behavior

5.  IANA Considerations

   This memo requires IANA to register the Accept-Datetime and
   Memento-Datetime HTTP headers defined in Section 2.1.1 in the
   appropriate IANA registry.

   This memo requires IANA to register the "Link" header Relation Types
   "original", "timegate", "timemap", and "memento" defined in
   Section 2.2.1 in the appropriate IANA registry.

   This memo requires IANA to register the "datetime", "license", and
   "embargo" attributes for Link headers with a "memento" Relation Type,
   as defined in Section 2.2.1.4, in the appropriate IANA registry.

6.  Security Considerations

   Provision of a "timegate" HTTP "Link" header in responses to requests
   for an Original Resource that is protected (e.g., 401 or 403 HTTP
   response codes) is OPTIONAL.
   The inclusion of this Link when requesting authentication is at the
   server's discretion; cases may exist in which a server protects the
   current state of a resource, but supports open access to prior states
   and thus chooses to supply a "timegate" HTTP "Link" header.
   Conversely, the server may choose not to advertise the TimeGate URIs
   (e.g., because they exist in an intranet archive) for unauthenticated
   requests.

   Authentication, encryption, and other security-related issues are
   otherwise orthogonal to Memento.

7.  Changelog

   v02 2011-05-11 HVDS MLN RS draft-vandesompel-memento-01

   o  Introduced wording and reference to indicate a Memento is a
      FixedResource.

   o  Introduced "Sticky Memento-Datetime" notion and clarified wording
      about retaining "Memento-Datetime" headers and values when a
      Memento is mirrored at a different URI.

   o  Introduced section about handling both datetime and regular
      negotiation.

   o  Introduced section about Mementos Without TimeGate.

   o  Made various changes in the section Relation Type "memento",
      including addition of "license" and "embargo" attributes, and
      clarification of rules regarding the use of "memento" links.

   o  Moved section about TimeMaps inside the Datetime Negotiation
      section, and updated it.

   o  Restarted the Discovery section from scratch.

   v01 2010-11-11 HVDS MLN RS First public version
   draft-vandesompel-memento-00

   v00 2010-10-19 HVDS MLN RS Limited circulation version

   2010-07-22 HVDS MLN First internal version

8.  Acknowledgements

   The Memento effort is funded by the Library of Congress.  Many thanks
   to Kris Carpenter Negulescu, Michael Hausenblas, Erik Hetzner, Larry
   Masinter, Gordon Mohr, Mark Nottingham, David Rosenthal, and Ed
   Summers for early feedback.
   Many thanks to Samuel Adams, Scott Ainsworth, Lyudmilla Balakireva,
   Frank McCown, Harihar Shankar, and Brad Tofel for early
   implementations.

9.  References

9.1.  Normative References

   [I-D.ietf-core-link-format]
              Shelby, Z., "CoRE Link Format",
              draft-ietf-core-link-format-03 (work in progress),
              March 2011.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC2616]  Fielding, R., Gettys, J., Mogul, J., Frystyk, H.,
              Masinter, L., Leach, P., and T. Berners-Lee, "Hypertext
              Transfer Protocol -- HTTP/1.1", RFC 2616, June 1999.

   [RFC4151]  Kindberg, T. and S. Hawke, "The 'tag' URI Scheme",
              RFC 4151, October 2005.

   [RFC4287]  Nottingham, M., Ed. and R. Sayre, Ed., "The Atom
              Syndication Format", RFC 4287, December 2005.

   [RFC5829]  Brown, A., Clemm, G., and J. Reschke, "Link Relation Types
              for Simple Version Navigation between Web Resources",
              RFC 5829, April 2010.

   [RFC5988]  Nottingham, M., "Web Linking", RFC 5988, October 2010.

9.2.  Informative References

   [Fitch]    Fitch, "Web site archiving - an approach to recording
              every materially different response produced by a
              website", July 2003,
              <http://ausweb.scu.edu.au/aw03/papers/fitch/paper.html>.

   [I-D.masinter-dated-uri]
              Masinter, L., "The 'tdb' and 'duri' URI schemes, based on
              dated URIs", draft-masinter-dated-uri-08 (work in
              progress), January 2011.

   [RFC1123]  Braden, R., "Requirements for Internet Hosts - Application
              and Support", STD 3, RFC 1123, October 1989.

   [W3C.REC-aww-20041215]
              Jacobs and Walsh, "Architecture of the World Wide Web",
              December 2004, <http://www.w3.org/TR/webarch/>.

   [W3C.gen-ont-20090420]
              Berners-Lee, "Architecture of the World Wide Web",
              April 2009, <http://www.w3.org/2006/gen/ont>.

   [robotstxt.org]
              "Robots Exclusion Protocol", August 2010,
              <http://www.robotstxt.org/robotstxt.html>.

Appendix A.  Appendix B: A Sample, Successful Memento Request/Response
             cycle

   Step 1 : UA --- HTTP GET/HEAD; Accept-Datetime: Tj ---------> URI-R

   HEAD / HTTP/1.1
   Host: a.example.org
   Accept-Datetime: Tue, 11 Sep 2001 20:35:00 GMT
   Connection: close

   Step 2 : UA <-- HTTP 200; Link: URI-G ----------------------- URI-R

   HTTP/1.1 200 OK
   Date: Thu, 21 Jan 2010 00:02:12 GMT
   Server: Apache
   Link: <http://arxiv.example.net/timegate/http://a.example.org>
      ; rel="timegate"
   Content-Length: 255
   Connection: close
   Content-Type: text/html; charset=iso-8859-1

   Step 3 : UA --- HTTP GET/HEAD; Accept-Datetime: Tj ---------> URI-G

   GET /timegate/http://a.example.org HTTP/1.1
   Host: arxiv.example.net
   Accept-Datetime: Tue, 11 Sep 2001 20:35:00 GMT
   Connection: close

   Step 4 : UA <-- HTTP 302; Location: URI-Mj; Vary; Link:
   URI-R, URI-T, URI-M0, URI-Mn, URI-Mi, URI-Mj, URI-Mk ---- URI-G

   HTTP/1.1 302 Found
   Date: Thu, 21 Jan 2010 00:06:50 GMT
   Server: Apache
   Vary: negotiate, accept-datetime
   Location:
    http://arxiv.example.net/web/20010911203610/http://a.example.org
   Link: <http://a.example.org>; rel="original",
    <http://arxiv.example.net/web/20000915112826/http://a.example.org>
      ; rel="first memento"; datetime="Tue, 15 Sep 2000 11:28:26 GMT",
    <http://arxiv.example.net/web/20080708093433/http://a.example.org>
      ; rel="last memento"; datetime="Tue, 08 Jul 2008 09:34:33 GMT",
    <http://arxiv.example.net/timemap/http://a.example.org>
      ; rel="timemap"; type="application/link-format",
    <http://arxiv.example.net/web/20010911203610/http://a.example.org>
      ; rel="memento"; datetime="Tue, 11 Sep 2001 20:36:10 GMT",
    <http://arxiv.example.net/web/20010911203610/http://a.example.org>
      ; rel="prev memento"; datetime="Tue, 11 Sep 2001 20:30:51 GMT",
    <http://arxiv.example.net/web/20010911203610/http://a.example.org>
      ; rel="next memento"; datetime="Tue, 11 Sep 2001 20:47:33 GMT"
   Content-Length: 0
   Content-Type: text/plain; charset=UTF-8
   Connection: close

   Step 5 : UA --- HTTP GET URI-Mj; Accept-Datetime: Tj -------> URI-Mj

   GET /web/20010911203610/http://a.example.org HTTP/1.1
   Host: arxiv.example.net
   Accept-Datetime: Tue, 11 Sep 2001 20:35:00 GMT
   Connection: close

   Step 6 : UA <-- HTTP 200; Memento-Datetime: Tj; Link: URI-R,
   URI-T, URI-G, URI-M0, URI-Mn, URI-Mi, URI-Mj, URI-Mk ---- URI-Mj

   HTTP/1.1 200 OK
   Date: Thu, 21 Jan 2010 00:09:40 GMT
   Server: Apache-Coyote/1.1
   Memento-Datetime: Tue, 11 Sep 2001 20:36:10 GMT
   Link: <http://a.example.org>; rel="original",
    <http://arxiv.example.net/web/20000915112826/http://a.example.org>
      ; rel="first memento"; datetime="Tue, 15 Sep 2000 11:28:26 GMT",
    <http://arxiv.example.net/web/20080708093433/http://a.example.org>
      ; rel="last memento"; datetime="Tue, 08 Jul 2008 09:34:33 GMT",
    <http://arxiv.example.net/timemap/http://a.example.org>
      ; rel="timemap"; type="application/link-format",
    <http://arxiv.example.net/timegate/http://a.example.org>
      ; rel="timegate",
    <http://arxiv.example.net/web/20010911203610/http://a.example.org>
      ; rel="memento"; datetime="Tue, 11 Sep 2001 20:36:10 GMT",
    <http://arxiv.example.net/web/20010911203610/http://a.example.org>
      ; rel="prev memento"; datetime="Tue, 11 Sep 2001 20:30:51 GMT",
    <http://arxiv.example.net/web/20010911203610/http://a.example.org>
      ; rel="next memento"; datetime="Tue, 11 Sep 2001 20:47:33 GMT"
   Content-Length: 23364
   Content-Type: text/html;charset=utf-8
   Connection: close

     A successful flow with TimeGate and Mementos on the same server

Authors' Addresses

   Herbert VandeSompel
   Los Alamos National Laboratory
   PO Box 1663
   Los Alamos, New Mexico  87545
   USA

   Phone: +1 505 667 1267
   Email: hvdsomp@gmail.com
   URI:   http://public.lanl.gov/herbertv/

   Michael Nelson
   Old Dominion University
   Norfolk, Virginia  23529
   USA

   Phone: +1 757 683 6393
   Email: mln@cs.odu.edu
   URI:   http://www.cs.odu.edu/~mln/

   Robert Sanderson
   Los Alamos National Laboratory
   PO Box 1663
   Los Alamos, New Mexico  87545
   USA

   Phone: +1 505 665 5804
   Email: azaroth42@gmail.com