Network Working Group                                   H. Van de Sompel
Internet-Draft                            Los Alamos National Laboratory
Intended status: Informational                                 M. Nelson
Expires: August 4, 2018                          Old Dominion University
                                                               G. Bilder
                                                                Crossref
                                                                J. Kunze
                                              California Digital Library
                                                               S. Warner
                                                      Cornell University
                                                        January 31, 2018


   cite-as: A Link Relation to Convey a Preferred URI for Referencing
                      draft-vandesompel-citeas-02

Abstract

   This specification defines a link relation type that is intended to
   convey that a URI, other than the URI that provides a link with the
   relation type, is preferred for the purpose of referencing.

Note to Readers

   Please discuss this draft on the ART mailing list.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on August 4, 2018.

Copyright Notice

   Copyright (c) 2018 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  Scenarios
     3.1.  Persistent Identifiers
     3.2.  Version Identifiers
     3.3.  Preferred Social Identifier
     3.4.  Multi-Resource Publications
   4.  The "cite-as" Relation Type for Expressing a Preferred URI
       for the Purpose of Referencing
   5.  Distinction with Other Relation Types
   6.  Examples
     6.1.  Persistent HTTP URI
     6.2.  Preferred Profile URI
   7.  IANA Considerations
     7.1.  Link Relation Type: cite-as
   8.  Security Considerations
   9.  References
     9.1.  Normative References
     9.2.  Informative References
   Appendix A.  Acknowledgements
   Authors' Addresses

1.  Introduction

   A web resource is routinely referenced (e.g., linked or bookmarked)
   by means of the URI where it is directly accessed.  But cases exist
   where referencing a resource by means of a different URI is
   preferred, for example because the latter URI is intended to be more
   persistent over time.  Currently, there is no link relation type to
   convey such an alternative referencing preference; this specification
   addresses this deficit by introducing a link relation type intended
   for that purpose.

2.  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

   This specification uses the terms "link context" and "link target" as
   defined in [RFC8288].  These terms respectively correspond with
   "Context IRI" and "Target IRI" as used in [RFC5988].  Although
   defined as IRIs, in common scenarios they are also URIs.

   Additionally, this specification uses the following terms:

   o  "access URI": A URI at which a user agent accesses a web resource.

   o  "reference URI": A URI, other than the access URI, that should
      preferentially be used for referencing.

   By interacting with the access URI, the user agent may discover typed
   links.  For such links, the access URI is the link context.

3.  Scenarios

3.1.  Persistent Identifiers

   Despite sound advice regarding the design of Cool URIs [CoolURIs],
   link rot ("HTTP 404 Not Found") is a common phenomenon when following
   links on the web.  Certain communities of practice have introduced
   solutions to combat this problem that typically consist of:

   o  Accepting the reality that the web location of a resource - the
      access URI - may change over time.

   o  Minting an additional URI for the resource - the reference URI -
      that is specifically intended to remain persistent over time.

   o  Redirecting (typically "HTTP 301 Moved Permanently", "HTTP 302
      Found", or "HTTP 303 See Other") from the reference URI to the
      access URI.

   o  As a community, committing to adjust that redirection whenever the
      access URI changes over time.

   This approach is, for example, used by:

   o  Scholarly publishers that use DOIs [DOIs] to identify articles and
      DOI URLs [DOI-URLs] as a means to keep cross-publisher
      article-to-article links operational, even when the journals in
      which the articles are published change hands from one publisher
      to another, for example, as a result of an acquisition.

   o  Authors of controlled vocabularies that use PURLs [PURLs] for
      vocabulary terms to ensure that the term URIs remain stable even
      if management of the vocabulary is transferred to a new custodian.

   o  A variety of organizations, including libraries, archives, and
      museums that assign ARK URLs [draft-kunze-ark-18] to information
      objects in order to support long-term access.

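
   The redirection pattern listed at the start of this section can be
   illustrated with a small sketch.  It is not part of this
   specification: the redirect table, the example.org and example.com
   URIs, and the function name "resolve" are all hypothetical, and an
   in-memory dictionary stands in for real HTTP redirects.  The point
   it shows is that the reference URI stays stable while the community
   adjusts where it redirects to.

```python
# Hypothetical redirect table: URI -> (HTTP status code, Location).
# Every URI here is an illustrative placeholder, not a real identifier.
REDIRECTS = {
    "https://pid.example.org/abc123":
        (302, "https://publisher-a.example.com/articles/42"),
}

# Status codes the scenario names as typical for persistent identifiers.
REDIRECT_CODES = {301, 302, 303}


def resolve(reference_uri, redirects, max_hops=10):
    """Follow redirects from a reference URI; return the access URI."""
    uri = reference_uri
    seen = set()
    for _ in range(max_hops + 1):
        if uri in seen:
            raise ValueError("redirect loop at " + uri)
        seen.add(uri)
        entry = redirects.get(uri)
        # No redirect recorded for this URI: it is the access URI.
        if entry is None or entry[0] not in REDIRECT_CODES:
            return uri
        uri = entry[1]
    raise ValueError("too many redirects")


reference_uri = "https://pid.example.org/abc123"
table = dict(REDIRECTS)
before_move = resolve(reference_uri, table)

# The resource changes hands; only the redirection is adjusted.
table[reference_uri] = (302, "https://publisher-b.example.com/content/42")
after_move = resolve(reference_uri, table)

print(before_move)
print(after_move)
```

   In this sketch a reference made with the reference URI keeps working
   after the move, whereas a reference made with the first access URI
   would not.
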
   In order for the investments in infrastructure involved in these
   approaches to pay off, and hence for links to effectively remain
   operational as intended, it is crucial that a resource be referenced
   by means of its reference URI.  However, the access URI is where a
   user agent actually accesses the resource (e.g., it is the URI in the
   browser's address bar).  As such, there is a considerable risk that
   the access URI instead of the reference URI is used for referencing
   [PIDs-must-be-used].

   The link relation type defined in this specification makes it
   possible to convey to user agents that the reference URI is the
   preferred URI for referencing.

3.2.  Version Identifiers

   Resource versioning systems often use a naming approach whereby:

   o  the most recent version of a resource is at any time available at
      the same, generic URI; and

   o  each version of the resource - including the most recent one - has
      a distinct version URI.

   For example, Wikipedia uses one URI form for its generic URIs and a
   different URI form for its version URIs.

   While the current version of a resource is accessed at the generic
   URI, some versioning systems adhere to a policy that favors linking
   and referencing by means of the version URI that was minted for the
   current version.  To express this using the terminology of Section 2,
   these policies intend that the generic URI is the access URI, and
   that the version URI is the reference URI.  These policies are
   informed by the understanding that the content at the generic URI is
   likely to evolve over time, and that accurate links or references
   should lead to the content as it was at the time of referencing.  To
   that end, Wikipedia's "Permanent link" and "Cite this page"
   functionalities promote the version URI, not the generic URI.

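
   As an illustration of such a policy, a versioning system could
   advertise the version URI of the current version in an HTTP Link
   header sent with responses for the generic URI.  The sketch below is
   an assumption-laden example rather than a prescribed implementation:
   the version store, the example.org URIs, and both function names are
   hypothetical.  Only the header field syntax, a URI in angle brackets
   followed by a "rel" parameter, is taken from [RFC8288].

```python
# Hypothetical version store: generic URI -> version URIs, oldest first.
# The example.org URIs are placeholders, not real resources.
VERSIONS = {
    "https://wiki.example.org/page/Foo": [
        "https://wiki.example.org/page/Foo/v1",
        "https://wiki.example.org/page/Foo/v2",
    ],
}


def cite_as_link_header(reference_uri):
    """Format a Link header field value that carries rel="cite-as"."""
    # Reject values that would break the <URI> part of the field.
    if not reference_uri or any(ch in reference_uri for ch in '<> "'):
        raise ValueError("not usable as a link target")
    return '<{}>; rel="cite-as"'.format(reference_uri)


def headers_for_generic_uri(generic_uri, versions):
    """Build response headers for a GET on the generic (access) URI."""
    # The last entry is assumed to be the current version.
    current_version_uri = versions[generic_uri][-1]
    return {
        "Content-Type": "text/html;charset=utf-8",
        "Link": cite_as_link_header(current_version_uri),
    }


headers = headers_for_generic_uri(
    "https://wiki.example.org/page/Foo", VERSIONS)
print(headers["Link"])
```

   Here the generic URI plays the role of the access URI and is the
   link context, while the version URI is the link target that a user
   agent would record as the reference URI.
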
   The link relation type defined in this specification makes it
   possible to convey to user agents that the version URI is preferred
   over the generic URI for referencing.

3.3.  Preferred Social Identifier

   A web user commonly has multiple profiles on the web, for example,
   one per social network she takes part in, a personal homepage, a
   professional homepage, a FOAF profile [FOAF], etc.  Each of these
   profiles is accessible at a distinct URI.  But the user may have a
   preference for one of those profiles, for example, because it is the
   most complete, is kept up-to-date, or is expected to be long-lived.

   The link relation type defined in this specification makes it
   possible to convey to user agents that a profile URI - the reference
   URI - other than the one the agent is accessing - the access URI - is
   preferred for referencing.

3.4.  Multi-Resource Publications

   When publishing on the web, it is not uncommon to make distinct
   components of a publication available as different web resources,
   each with its own URI.  For example:

   o  Contemporary scholarly publications routinely consist of a
      traditional article as well as additional materials that are
      considered an integral part of the publication, such as
      supplementary information, high-resolution images, or a video
      recording of an experiment.

   o  Scientific or governmental open data sets frequently consist of
      multiple files.

   o  Online books typically consist of multiple chapters.

   While each of these components is accessible at its own distinct URI
   - the access URI - the components often also share a URI assigned to
   the intellectual publication of which they are components - the
   reference URI.

   The link relation type defined in this specification makes it
   possible to convey to user agents that, for the purpose of
   referencing, the reference URI of the intellectual publication is
   preferred over an access URI of a component of the publication.

4.  The "cite-as" Relation Type for Expressing a Preferred URI for the
    Purpose of Referencing

   A link with the "cite-as" relation type indicates that the link
   target is preferred over the link context for the purpose of
   referencing.

   The link target of a "cite-as" link SHOULD support protocol-based
   access as a means to ensure that applications that store it can
   effectively reuse it for access.

   The link target of a "cite-as" link SHOULD provide the ability for a
   user agent to follow its nose back to the context of the link, e.g.,
   by following redirects and/or links.  This helps a user agent to
   establish trust in the target URI.

   Because a link with the "cite-as" relation type expresses a preferred
   URI for the purpose of referencing, the access URI SHOULD provide
   only one link with that relation type.  If more than one "cite-as"
   link is provided, the user agent may decide to select one (e.g., an
   HTTP URI over a mailto URI), for example, based on the purpose that
   the reference URI will serve.

   Providing a link with the "cite-as" relation type does not prevent
   using the access URI for the purpose of referencing if such
   specificity is needed for the application at hand.  For example, in
   the scenario of Section 3.4 the access URI is likely required for the
   purpose of annotating a specific component of an intellectual
   publication.  Yet, the annotation application may also want to
   appropriately include the reference URI in the annotation.

   Applications can leverage the information provided by a "cite-as"
   link in a variety of ways, for example:

   o  Bookmarking tools and citation managers can take this preference
      into account when recording a URI.

   o  Webometrics applications that trace URIs can trace both the access
      URI and the reference URI.

   o  Discovery tools can support look-up by means of both the access
      and the reference URI.
      This includes web archives that typically make archived versions
      of web resources discoverable by means of the original access URI
      of the archived resource; they can additionally make these
      archived resources discoverable by means of the associated
      reference URI.

5.  Distinction with Other Relation Types

   The following existing IANA-registered relationships may intuitively
   resemble the relationship that "cite-as" is intended to convey, but
   are not appropriate for various reasons:

   o  "alternate" [RFC4287], used to link to an alternate version of the
      content at the link context, for example the same content with
      varying Content-Type (e.g., application/pdf vs. text/html) and/or
      Content-Language (e.g., en vs. fr).

   o  "bookmark" [W3C.REC-html5-20151028], used to convey a permanent
      link to use for bookmarking purposes.

   o  "canonical" [RFC6596], used to identify content that is either
      duplicative or a superset of the content at the link context, for
      example a single page version of a magazine article, provided for
      indexing by search engines, of an article that is spread over
      several pages for human use.

   o  "duplicate" [RFC6249], used to link to a resource whose available
      representations are byte-for-byte identical with the corresponding
      representations of the link context, for example, an identical
      file on a mirror site.

   o  "related" [RFC4287], used to link to a related resource.

   A closer inspection of these candidates [identifier-blog] shows that
   they are not appropriate and that a new relation type is required.

   In the scenario of Section 3.1 there is no content available at the
   reference URI as it merely redirects to the access URI.  In the
   scenario of Section 3.3, the content at the reference URI is a
   profile that is different from the profile at the access URI.
   In the scenario of Section 3.4 the content at the reference URI, if
   any, would typically be a sort of table of contents with links to
   component resources and possibly a summary.  These considerations
   exclude "alternate", "canonical", and "duplicate" as possible
   relation types.

   The meaning of "canonical" is commonly misunderstood on the basis of
   its brief definition as being "the preferred version of a resource."
   A more detailed reading of [RFC6596] clarifies that the intended
   meaning is preferred for the purpose of content indexing
   [canonical-blog].  In contrast, for "cite-as" it is preferred for
   the purpose of referencing.

   The intent of "bookmark" is closest to that of "cite-as" in that the
   link target of a link with the "bookmark" relation type is intended
   "to give a permanent link to use for bookmarking purposes."
   However, for reasons related to its original intent [bookmark-blog],
   "bookmark" is specifically defined for use in conjunction with
   specific HTML elements and is explicitly excluded from use in the
   HTML "link" element.  Since a link in an HTML "link" element and a
   link in the HTTP Link header are semantically equivalent, "bookmark"
   is also excluded from use in the HTTP Link header.

   While "related" could be used, its semantics are too vague to convey
   the specific nature of "cite-as" as a means to convey a URI for the
   purpose of referencing.

6.  Examples

   Sections 6.1 and 6.2 show examples of the use of links with the
   "cite-as" relation type.  One example shows its use in a response
   header and body, the other in a response body only.

6.1.  Persistent HTTP URI

   If the access URI is a landing page for a scholarly article for which
   a persistent HTTP URI was minted, then the response to an HTTP GET on
   the landing page's URI could be as shown in Figure 1.

   HTTP/1.1 200 OK
   Link: ; rel="cite-as"
   Content-Type: text/html;charset=utf-8

   ...

    Figure 1: Response to HTTP GET on the URI of the landing page of a
                             scholarly article

6.2.  Preferred Profile URI

   If the access URI is the home page of John Doe, John can add a link
   with the "cite-as" relation type to it, as a means to convey that he
   would preferably be referenced by means of the URI of his FOAF
   profile.  Figure 2 shows the response to an HTTP GET on the URI of
   John's home page.

   HTTP/1.1 200 OK
   Content-Type: text/html;charset=utf-8

   ...

     Figure 2: Response to HTTP GET on the URI of John Doe's home page

7.  IANA Considerations

7.1.  Link Relation Type: cite-as

   The link relation type below has been registered by IANA per
   Section 2.1.1 of [RFC8288]:

      Relation Name: cite-as

      Description: A link with the "cite-as" relation type indicates
      that the link target is preferred over the link context for the
      purpose of referencing.

      Reference: [[ This document ]]

8.  Security Considerations

   In cases where there is no way for the agent to automatically verify
   the correctness of the reference URI (cf. Section 4), out-of-band
   mechanisms might be required to establish trust.

   If a trusted site is compromised, the "cite-as" link relation could
   be used with malicious intent to supply misleading URIs for
   referencing.  Use of these links might direct user agents to an
   attacker's site, break the referencing record they are intended to
   support, or corrupt algorithmic interpretation of referencing data.

9.  References

9.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC4287]  Nottingham, M., Ed. and R. Sayre, Ed., "The Atom
              Syndication Format", RFC 4287, DOI 10.17487/RFC4287,
              December 2005.

   [RFC5988]  Nottingham, M., "Web Linking", RFC 5988,
              DOI 10.17487/RFC5988, October 2010.

   [RFC6249]  Bryan, A., McNab, N., Tsujikawa, T., Poeml, P., and H.
              Nordstrom, "Metalink/HTTP: Mirrors and Hashes", RFC 6249,
              DOI 10.17487/RFC6249, June 2011.

   [RFC6596]  Ohye, M. and J. Kupke, "The Canonical Link Relation",
              RFC 6596, DOI 10.17487/RFC6596, April 2012.

   [RFC8288]  Nottingham, M., "Web Linking", RFC 8288,
              DOI 10.17487/RFC8288, October 2017.

   [W3C.REC-html5-20151028]
              Hickson, I., Berjon, R., Faulkner, S., Leithead, T., Doyle
              Navara, E., O'Connor, E., and S. Pfeiffer, "HTML5", World
              Wide Web Consortium Recommendation REC-HTML5-20141028,
              October 2014.

9.2.  Informative References

   [bookmark-blog]
              Nelson, M. and H. Van de Sompel, "rel=bookmark also does
              not mean what you think it means", August 2017.

   [canonical-blog]
              Nelson, M. and H. Van de Sompel, "rel=canonical does not
              mean what you think it means", August 2017.

   [CoolURIs]
              Berners-Lee, T., "Cool URIs don't change", World Wide Web
              Consortium Style, 1998.

   [DOI-URLs]
              Hendricks, G., "Display guidelines for Crossref DOIs",
              June 2017.

   [DOIs]     "Information and documentation - Digital object identifier
              system", ISO 26324:2012(en), 2012.

   [draft-kunze-ark-18]
              Kunze, J. and R. Rodgers, "The ARK Identifier Scheme",
              Internet Draft draft-kunze-ark-18, April 2013.

   [FOAF]     Brickley, D. and L. Miller, "FOAF Vocabulary Specification
              0.99", January 2014.

   [identifier-blog]
              Nelson, M. and H. Van de Sompel, "Linking to Persistent
              Identifiers with rel=identifier", July 2016.

   [PIDs-must-be-used]
              Van de Sompel, H., Klein, M., and S. Jones, "Persistent
              URIs Must Be Used To Be Persistent", February 2016.

   [PURLs]    "Persistent uniform resource locator", April 2017.

Appendix A.  Acknowledgements

   Thanks for comments and suggestions provided by Martin Klein, Harihar
   Shankar, Peter Williams, John Howard, Mark Nottingham, and Graham
   Klyne.

Authors' Addresses

   Herbert Van de Sompel
   Los Alamos National Laboratory

   Email: herbertv@lanl.gov
   URI:   http://public.lanl.gov/herbertv/


   Michael Nelson
   Old Dominion University

   Email: mln@cs.odu.edu
   URI:   http://www.cs.odu.edu/~mln/


   Geoffrey Bilder
   Crossref

   Email: gbilder@crossref.org
   URI:   https://www.crossref.org/authors/geoffrey-bilder/


   John Kunze
   California Digital Library

   Email: jak@ucop.edu
   URI:   http://www.cdlib.org/contact/staff_directory/jkunze.html


   Simeon Warner
   Cornell University

   Email: simeon.warner@cornell.edu
   URI:   https://orcid.org/0000-0002-7970-7855