idnits 2.17.1

draft-vandesompel-citeas-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** There is 1 instance of too long lines in the document, the longest one
     being 6 characters in excess of 72.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (October 4, 2017) is 2368 days in the past. Is this
     intentional?

  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  ** Obsolete normative reference: RFC 5988 (Obsoleted by RFC 8288)

     Summary: 2 errors (**), 0 flaws (~~), 1 warning (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

Network Working Group                                   H. Van de Sompel
Internet-Draft                            Los Alamos National Laboratory
Intended status: Informational                                 M. Nelson
Expires: April 7, 2018                           Old Dominion University
                                                               G. Bilder
                                                                Crossref
                                                                J. Kunze
                                              California Digital Library
                                                               S.
                                                                  Warner
                                                      Cornell University
                                                         October 4, 2017

   cite-as: A Link Relation to Convey a Preferred URI for Referencing
                      draft-vandesompel-citeas-00

Abstract

   This specification defines a link relation type that is intended to
   convey that a URI, other than the URI that provides a link with the
   relation type, is preferred for the purpose of referencing.

Note to Readers

   Please discuss this draft on the ART mailing list
   ().

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on April 7, 2018.

Copyright Notice

   Copyright (c) 2017 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  Scenarios
     3.1.  Persistent Identifiers
     3.2.  Version Identifiers
     3.3.  Preferred Social Identifier
     3.4.  Multi-Resource Publications
   4.  The "cite-as" Relation Type for Expressing a Preferred URI
       for the Purpose of Referencing
   5.  Distinction with Other Relation Types
   6.  Examples
     6.1.  Persistent HTTP URI
     6.2.  Preferred Profile URI
   7.  IANA Considerations
     7.1.  Link Relation Type: cite-as
   8.  Security Considerations
   9.  References
     9.1.  Normative References
     9.2.  Informative References
   Appendix A.  Acknowledgements
   Authors' Addresses

1.  Introduction

   A web resource is routinely referenced (e.g., linked, bookmarked) by
   means of the URI where it is directly accessed. But cases exist
   where referencing a resource by means of a different URI is
   preferred, for example because the latter URI is intended to be more
   persistent over time. Currently, there is no link relation type to
   convey such an alternative referencing preference; this specification
   addresses this deficit by introducing a link relation type intended
   for that purpose.

2.
Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

   This specification uses the terms "link context" and "link target" as
   defined in [I-D.nottingham-rfc5988bis]. These terms respectively
   correspond with "Context IRI" and "Target IRI" as used in [RFC5988].
   Although defined as IRIs, in common scenarios they are also URIs.

   Additionally, this specification uses the following terms:

   o  "access URI": A URI at which a user agent accesses a web resource.

   o  "reference URI": A URI, other than the access URI, that should
      preferentially be used for referencing.

   By interacting with the access URI, the user agent may discover typed
   links. For such links, the access URI is the link context.

3.  Scenarios

3.1.  Persistent Identifiers

   Despite sound advice regarding the design of Cool URIs [CoolURIs],
   link rot ("HTTP 404 Not Found") is a common phenomenon when following
   links on the web. Certain communities of practice have introduced
   solutions to combat this problem that typically consist of:

   o  Accepting the reality that the web location of a resource - the
      access URI - may change over time.

   o  Minting an additional URI for the resource - the reference URI -
      that is specifically intended to remain persistent over time.

   o  Redirecting (typically "HTTP 301 Moved Permanently", "HTTP 302
      Found", or "HTTP 303 See Other") from the reference URI to the
      access URI.

   o  As a community, committing to adjust that redirection whenever the
      access URI changes over time.
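
   As a non-normative illustration, the redirection arrangement listed
   above can be sketched as a small lookup table that maps a persistent
   reference URI to the current access URI. The Resolver class, its
   method names, and the example.org and example.com URIs are
   hypothetical and serve illustration only; the sketch does not
   describe how actual DOI, PURL, or ARK resolvers are built. The use
   of status 302 is likewise an assumption; 301 and 303 are equally
   plausible choices.

```python
class Resolver:
    """Hypothetical in-memory resolver for persistent reference URIs."""

    def __init__(self):
        # Maps each reference URI to the access URI it currently
        # redirects to.
        self._table = {}

    def register(self, reference_uri, access_uri):
        """Mint a reference URI, or adjust where an existing one points."""
        self._table[reference_uri] = access_uri

    def resolve(self, reference_uri):
        """Return (status, headers) as a redirecting server might."""
        access_uri = self._table.get(reference_uri)
        if access_uri is None:
            return 404, {}
        return 302, {"Location": access_uri}


resolver = Resolver()
reference = "https://id.example.org/article/123"  # hypothetical reference URI
resolver.register(reference, "https://publisher-a.example.com/articles/123")
before = resolver.resolve(reference)

# The resource moves to a new location; only the table entry changes,
# while the reference URI stays the same.
resolver.register(reference, "https://publisher-b.example.com/content/123")
after = resolver.resolve(reference)
```

   In this sketch, a reference recorded as the reference URI still
   leads to the resource after the move, whereas a reference recorded
   as the first access URI may no longer do so.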

   This approach is, for example, used by:

   o  Scholarly publishers that use DOIs [DOIs] to identify articles and
      DOI URLs [DOI-URLs] as a means to keep cross-publisher
      article-to-article links operational, even when the journals in
      which the articles are published change hands from one publisher
      to another, for example, as a result of an acquisition.

   o  Authors of controlled vocabularies that use PURLs [PURLs] for
      vocabulary terms to ensure that the term URIs remain stable even
      if management of the vocabulary is transferred to a new custodian.

   o  A variety of organizations, including libraries, archives, and
      museums that assign ARK URLs [draft-kunze-ark-18] to information
      objects in order to support long-term access.

   In order for the investments in infrastructure involved in these
   approaches to pay off, and hence for links to effectively remain
   operational as intended, it is crucial that a resource be referenced
   by means of its reference URI. However, the access URI is where a
   user agent actually accesses the resource (e.g., it is the URI in the
   browser's address bar). As such, there is a considerable risk that
   the access URI is used for referencing instead of the reference URI
   [PIDs-must-be-used].

   The link relation type defined in this specification makes it
   possible to convey to user agents that the reference URI is the
   preferred URI for referencing. Applications such as bookmarking
   tools, citation managers, and webometrics applications can take this
   preference into account when recording a URI.

3.2.  Version Identifiers

   Resource versioning systems often use a naming approach whereby:

   o  the most recent version of a resource is at any time available at
      the same, generic URI

   o  each version of the resource - including the most recent one - has
      a distinct version URI.

   For example, Wikipedia uses generic URIs of the form
   and version URIs of the form
   .

   While the current version of a resource is accessed at the generic
   URI, some versioning systems adhere to a policy that favors linking
   and referencing by means of the version URI that was minted for the
   current version. To express this using the terminology of Section 2,
   these policies intend that the generic URI is the access URI, and
   that the version URI is the reference URI. These policies are
   informed by the understanding that the content at the generic URI is
   likely to evolve over time, and that accurate links or references
   should lead to the content as it was at the time of referencing. To
   that end, Wikipedia's "Permanent link" and "Cite this page"
   functionalities promote the version URI, not the generic URI.

   The link relation type defined in this specification makes it
   possible to convey to user agents that the version URI is preferred
   over the generic URI for referencing.

3.3.  Preferred Social Identifier

   A web user commonly has multiple profiles on the web, for example,
   one per social network she takes part in, a personal homepage, a
   professional homepage, a FOAF profile [FOAF], etc. Each of these
   profiles is accessible at a distinct URI. But the user may have a
   preference for one of those profiles, for example, because it is most
   complete, kept up-to-date, or expected to be long-lived.

   The link relation type defined in this specification makes it
   possible to convey to user agents that a profile URI - the reference
   URI - other than the one the agent is accessing - the access URI - is
   preferred for referencing.

3.4.  Multi-Resource Publications

   When publishing on the web, it is not uncommon to make distinct
   components of a publication available as different web resources,
   each with its own URI.
   For example:

   o  Contemporary scholarly publications routinely consist of a
      traditional article as well as additional materials that are
      considered an integral part of the publication, such as
      supplementary information, high-resolution images, or a video
      recording of an experiment.

   o  Scientific or governmental open data sets frequently consist of
      multiple files.

   o  Online books typically consist of multiple chapters.

   While each of these components is accessible at its own distinct
   URI - the access URI - they often also share a URI assigned to the
   intellectual publication of which they are components - the reference
   URI.

   The link relation type defined in this specification makes it
   possible to convey to user agents that, for the purpose of
   referencing, the reference URI of the intellectual publication is
   preferred over an access URI of a component of the publication.

4.  The "cite-as" Relation Type for Expressing a Preferred URI for the
    Purpose of Referencing

   A link with the "cite-as" relation type indicates that the link
   target is preferred over the link context for the purpose of
   referencing.

   The link target of a "cite-as" link SHOULD support protocol-based
   access as a means to ensure that applications that store it can
   effectively re-use it for access.

   The link target of a "cite-as" link SHOULD provide the ability for a
   user agent to follow its nose back to the context of the link, e.g.,
   by following redirects and/or links. This helps a user agent to
   establish trust in the target URI.

   Because a link with the "cite-as" relation type expresses a preferred
   URI for the purpose of referencing, the access URI SHOULD provide
   only one link with that relation type. If more than one "cite-as"
   link is provided, the user agent may decide to select one (e.g., an
   HTTP URI over a mailto URI), for example, based on the purpose that
   the reference URI will serve.
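
   As a non-normative illustration, a user agent that discovers
   "cite-as" links in an HTTP Link header might select a reference URI
   as sketched below. The function names, the preference for "https"
   and "http" targets, and the example URIs are assumptions made for
   this sketch; the regular expressions handle only simple header
   values and are not a complete parser for the Link header syntax.

```python
import re
from urllib.parse import urlsplit

# Matches one link-value of the form <target>; param=value; ...
# Simplification: assumes targets contain no ">" and parameter values
# contain no ";" or ",".
_LINK_VALUE = re.compile(r'<([^>]*)>((?:\s*;\s*[^;,]+)*)')
_REL_PARAM = re.compile(r';\s*rel\s*=\s*(?:"([^"]*)"|([^\s;,]+))',
                        re.IGNORECASE)


def cite_as_targets(link_header):
    """Return the targets of all links whose rel includes "cite-as"."""
    targets = []
    for target, params in _LINK_VALUE.findall(link_header):
        match = _REL_PARAM.search(params)
        if not match:
            continue
        rel_value = match.group(1) if match.group(1) is not None else match.group(2)
        # A rel value may list several space-separated relation types.
        if "cite-as" in rel_value.lower().split():
            targets.append(target)
    return targets


def choose_reference_uri(link_header, preferred_schemes=("https", "http")):
    """Pick one "cite-as" target, preferring the given URI schemes."""
    targets = cite_as_targets(link_header)
    for scheme in preferred_schemes:
        for target in targets:
            if urlsplit(target).scheme.lower() == scheme:
                return target
    # No target uses a preferred scheme: fall back to the first one, if any.
    return targets[0] if targets else None


# Hypothetical header value carrying two "cite-as" links.
header = ('<mailto:jdoe@example.org>; rel="cite-as", '
          '<https://example.org/id/123>; rel="cite-as", '
          '<https://example.org/style.css>; rel="stylesheet"')
chosen = choose_reference_uri(header)
```

   When no "cite-as" link is present, choose_reference_uri returns None
   and the application can fall back to the access URI.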

   Providing a link with the "cite-as" relation type does not prevent
   using the access URI for the purpose of referencing if such
   specificity is needed for the application at hand. For example, in
   the scenario of Section 3.4, the access URI is likely required for
   the purpose of annotating a specific component of an intellectual
   publication. Yet, the annotation application may also want to
   appropriately include the reference URI in the annotation.

5.  Distinction with Other Relation Types

   The following existing IANA-registered relationships may intuitively
   resemble the relationship that "cite-as" is intended to convey, but
   are not appropriate for various reasons:

   o  "alternate" [RFC4287], used to link to an alternate version of the
      content at the link context, for example the same content with
      varying Content-Type (e.g., application/pdf vs. text/html) and/or
      Content-Language (e.g., en vs. fr).

   o  "bookmark" [W3C.REC-html5-20151028], used to convey a permanent
      link to use for bookmarking purposes.

   o  "canonical" [RFC6596], used to identify content that is either
      duplicative or a superset of the content at the link context, for
      example a single page version of a magazine article, provided for
      indexing by search engines, of an article that is spread over
      several pages for human use.

   o  "duplicate" [RFC6249], used to link to a resource whose available
      representations are byte-for-byte identical with the corresponding
      representations of the link context, for example, an identical
      file on a mirror site.

   o  "related" [RFC4287], used to link to a related resource.

   A closer inspection of these candidates [identifier-blog] shows that
   they are not appropriate and that a new relation type is required.

   In the scenario of Section 3.1 there is no content available at the
   reference URI as it merely redirects to the access URI.
   In the scenario of Section 3.3, the content at the reference URI is a
   profile that is different from the profile at the access URI. In the
   scenario of Section 3.4 the content at the reference URI, if any,
   would typically be a sort of table of contents with links to
   component resources and possibly a summary. These considerations
   exclude "alternate", "canonical", and "duplicate" as possible
   relation types.

   The meaning of "canonical" is commonly misunderstood on the basis of
   its brief definition as being "the preferred version of a resource."
   A more detailed reading of [RFC6596] clarifies that the intended
   meaning is preferred for the purpose of content indexing
   [canonical-blog]. In contrast, for "cite-as" it is preferred for
   the purpose of referencing.

   The intent of "bookmark" is closest to that of "cite-as" in that the
   link target of a link with the "bookmark" relation type is intended
   "to give a permanent link to use for bookmarking purposes."
   However, for reasons related to its original intent [bookmark-blog],
   "bookmark" is specifically defined for use in conjunction with the
   HTML
   element and is explicitly excluded from use in the
   element in HTML . Since a link in and a link in
   the HTTP Link header are semantically equivalent, "bookmark" is also
   excluded from use in HTTP Link.

   While "related" could be used, its semantics are too vague to convey
   the specific nature of "cite-as" as a means to convey a URI for the
   purpose of referencing.

6.  Examples

   Sections 6.1 and 6.2 show examples of the use of links with the
   "cite-as" relation type. One example shows its use in a response
   header and body, the other in a response body only.

6.1.  Persistent HTTP URI

   If the access URI is a landing page for a scholarly article for which
   the persistent HTTP URI
   was minted, then the response to an HTTP GET on the landing page's
   URI could be as shown in Figure 1.

   HTTP/1.1 200 OK
   Link: ; rel="cite-as"
   Content-Type: text/html;charset=utf-8

   ...
   ...
   ...

    Figure 1: Response to HTTP GET on the URI of the landing page of a
                            scholarly article

6.2.  Preferred Profile URI

   If the access URI is the home page of John Doe, John can add a link
   with the "cite-as" relation type to it, as a means to convey that he
   would preferably be referenced by means of the URI of his FOAF
   profile. Figure 2 shows the response to an HTTP GET on the URI of
   John's home page.

   HTTP/1.1 200 OK
   Content-Type: text/html;charset=utf-8

   ...
   ...
   ...

    Figure 2: Response to HTTP GET on the URI of John Doe's home page

7.  IANA Considerations

7.1.  Link Relation Type: cite-as

   The link relation type below has been registered by IANA per
   Section 2.1.1 of [I-D.nottingham-rfc5988bis]:

      Relation Name: cite-as

      Description: A link with the "cite-as" relation type indicates
      that the link target is preferred over the link context for the
      purpose of referencing.

      Reference: [[ This document ]]

8.  Security Considerations

   In cases where there is no way for the agent to automatically verify
   the correctness of the reference URI (cf. Section 4), out-of-band
   mechanisms might be required to establish trust.

   If a trusted site is compromised, the "cite-as" link relation could
   be used with malicious intent to supply misleading URIs for
   referencing. Use of these links might direct user agents to an
   attacker's site, break the referencing record they are intended to
   support, or corrupt algorithmic interpretation of referencing data.

9.  References

9.1.  Normative References

   [I-D.nottingham-rfc5988bis]
              Nottingham, M., "Web Linking",
              draft-nottingham-rfc5988bis-08 (work in progress),
              August 2017.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC4287]  Nottingham, M., Ed. and R. Sayre, Ed., "The Atom
              Syndication Format", RFC 4287, DOI 10.17487/RFC4287,
              December 2005.

   [RFC5988]  Nottingham, M., "Web Linking", RFC 5988,
              DOI 10.17487/RFC5988, October 2010.

   [RFC6249]  Bryan, A., McNab, N., Tsujikawa, T., Poeml, P., and H.
              Nordstrom, "Metalink/HTTP: Mirrors and Hashes", RFC 6249,
              DOI 10.17487/RFC6249, June 2011.

   [RFC6596]  Ohye, M. and J. Kupke, "The Canonical Link Relation",
              RFC 6596, DOI 10.17487/RFC6596, April 2012.

   [W3C.REC-html5-20151028]
              Hickson, I., Berjon, R., Faulkner, S., Leithead, T., Doyle
              Navara, E., O'Connor, E., and S. Pfeiffer, "HTML5", World
              Wide Web Consortium Recommendation REC-HTML5-20141028,
              October 2014.

9.2.  Informative References

   [bookmark-blog]
              Nelson, M. and H. Van de Sompel, "rel=bookmark also does
              not mean what you think it means", August 2017.

   [canonical-blog]
              Nelson, M. and H.
              Van de Sompel, "rel=canonical does not
              mean what you think it means", August 2017.

   [CoolURIs]
              Berners-Lee, T., "Cool URIs don't change", World Wide Web
              Consortium Style, 1998.

   [DOI-URLs]
              Hendricks, G., "Display guidelines for Crossref DOIs",
              June 2017.

   [DOIs]     "Information and documentation - Digital object identifier
              system", ISO 26324:2012(en), 2012.

   [draft-kunze-ark-18]
              Kunze, J. and R. Rodgers, "The ARK Identifier Scheme",
              Internet Draft draft-kunze-ark-18, April 2013.

   [FOAF]     Brickley, D. and L. Miller, "FOAF Vocabulary Specification
              0.99", January 2014.

   [identifier-blog]
              Nelson, M. and H. Van de Sompel, "Linking to Persistent
              Identifiers with rel=identifier", July 2016.

   [PIDs-must-be-used]
              Van de Sompel, H., Klein, M., and S. Jones, "Persistent
              URIs Must Be Used To Be Persistent", February 2016.

   [PURLs]    "Persistent uniform resource locator", April 2017.

Appendix A.  Acknowledgements

   Thanks to Martin Klein, Harihar Shankar, Peter Williams, John Howard,
   and Mark Nottingham for their comments and suggestions.

Authors' Addresses

   Herbert Van de Sompel
   Los Alamos National Laboratory

   Email: herbertv@lanl.gov
   URI:   http://public.lanl.gov/herbertv/

   Michael Nelson
   Old Dominion University

   Email: mln@cs.odu.edu
   URI:   http://www.cs.odu.edu/~mln/

   Geoffrey Bilder
   Crossref

   Email: gbilder@crossref.org
   URI:   https://www.crossref.org/authors/geoffrey-bilder/

   John Kunze
   California Digital Library

   Email: jak@ucop.edu
   URI:   http://www.cdlib.org/contact/staff_directory/jkunze.html

   Simeon Warner
   Cornell University

   Email: simeon.warner@cornell.edu
   URI:   https://orcid.org/0000-0002-7970-7855