idnits 2.17.1

draft-vandesompel-identifier-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** There are 2 instances of too long lines in the document, the longest one
     being 9 characters in excess of 72.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (August 2, 2017) is 2453 days in the past.  Is this
     intentional?

  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  == Outdated reference: A later version (-08) exists of
     draft-nottingham-rfc5988bis-06

  ** Obsolete normative reference: RFC 5988 (Obsoleted by RFC 8288)

     Summary: 2 errors (**), 0 flaws (~~), 2 warnings (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

Network Working Group                                   H. Van de Sompel
Internet-Draft                            Los Alamos National Laboratory
Intended status: Informational                                 M. Nelson
Expires: February 3, 2018                        Old Dominion University
                                                               G. Bilder
                                                                Crossref
                                                                J. Kunze
                                              California Digital Library
                                                               S. Warner
                                                      Cornell University
                                                          August 2, 2017

  Identifier: A Link Relation to Convey a Preferred URI for Referencing
                    draft-vandesompel-identifier-00

Abstract

   This specification defines a link relation type that is intended to
   convey that a URI, other than the URI that provides a link with the
   relation type, is preferred for the purpose of referencing.

Note to Readers

   Please discuss this draft on the ART mailing list.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on February 3, 2018.

Copyright Notice

   Copyright (c) 2017 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  Scenarios
     3.1.  Persistent Identifiers
     3.2.  Version Identifiers
     3.3.  Preferred Social Identifier
     3.4.  Multi-Resource Publications
   4.  The "identifier" Relation Type for Expressing a Preferred URI
       for the Purpose of Referencing
   5.  Distinction with Other Relation Types
   6.  Examples
     6.1.  Persistent HTTP URI
     6.2.  Preferred Profile URI
   7.  IANA Considerations
     7.1.  Link Relation Type: identifier
   8.  Security Considerations
   9.  References
     9.1.  Normative References
     9.2.  Informative References
   Appendix A.  Acknowledgements
   Authors' Addresses

1.  Introduction

   A web resource is routinely referenced (e.g., linked, bookmarked) by
   means of the URI where it is directly accessed.  But cases exist
   where referencing a resource by means of a different URI is
   preferred, for example because the latter URI is intended to be more
   persistent over time.  Currently, there is no link relation type to
   convey such an alternative referencing preference; this specification
   addresses this deficit by introducing a link relation type intended
   for that purpose.

2.  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

   This specification uses the terms "link context" and "link target" as
   defined in [I-D.nottingham-rfc5988bis].  These terms respectively
   correspond with "Context IRI" and "Target IRI" as used in [RFC5988].
   Although defined as IRIs, in common scenarios they are also URIs.

   Additionally, this specification uses the following terms:

   o  "access URI": A URI at which a user agent accesses a web resource.

   o  "identifying URI": A URI, other than the access URI, that should
      preferentially be used for referencing.

   By interacting with the access URI, the user agent may discover typed
   links.  For such links, the access URI is the link context.

3.  Scenarios

3.1.  Persistent Identifiers

   Despite sound advice regarding the design of Cool URIs [CoolURIs],
   link rot ("HTTP 404 Not Found") is a common phenomenon when following
   links on the web.  Certain communities of practice have introduced
   solutions to combat this problem that typically consist of:

   o  Accepting the reality that the web location of a resource - the
      access URI - may change over time.

   o  Minting an additional URI for the resource - the identifying URI -
      that is specifically intended to remain persistent over time.

   o  Redirecting (typically "HTTP 301 Moved Permanently", "HTTP 302
      Found", or "HTTP 303 See Other") from the identifying URI to the
      access URI.

   o  As a community, committing to adjust that redirection whenever the
      access URI changes over time.

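   The redirection arrangement in the list above can be illustrated
   with a small sketch.  It is not part of this specification: the
   class name PersistentResolver, its methods, and the example.org URIs
   are hypothetical, and a real resolver would sit behind an HTTP
   server rather than return tuples.  The sketch only shows that a
   reference keeps working because the identifying URI stays fixed
   while its redirection target is adjusted.

```python
# Illustrative sketch only; all names and URIs are hypothetical and are
# not defined by this draft.

class PersistentResolver:
    """Maps identifying URIs to the current access URI of each resource."""

    def __init__(self):
        self._targets = {}

    def mint(self, identifying_uri, access_uri):
        """Register a new identifying URI for a resource."""
        if identifying_uri in self._targets:
            raise ValueError("identifying URI already minted")
        self._targets[identifying_uri] = access_uri

    def move(self, identifying_uri, new_access_uri):
        """Adjust the redirection when the access URI changes."""
        if identifying_uri not in self._targets:
            raise KeyError(identifying_uri)
        self._targets[identifying_uri] = new_access_uri

    def handle_get(self, identifying_uri):
        """Return (status, headers) as a resolver might answer an HTTP GET."""
        access_uri = self._targets.get(identifying_uri)
        if access_uri is None:
            return 404, {}
        # 302 is one of the redirect status codes named in the list above.
        return 302, {"Location": access_uri}


resolver = PersistentResolver()
resolver.mint("http://id.example.org/abc",
              "http://publisher-a.example.org/article/1")
first = resolver.handle_get("http://id.example.org/abc")

# The resource changes location; only the redirection target is updated.
resolver.move("http://id.example.org/abc",
              "http://publisher-b.example.org/content/1")
second = resolver.handle_get("http://id.example.org/abc")
```

   In this sketch, a reference recorded as the identifying URI resolves
   both before and after the move, whereas a reference recorded as the
   first access URI would no longer lead to the resource.
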
   This approach is, for example, used by:

   o  Scholarly publishers that use DOIs [DOIs] to identify articles and
      DOI URLs [DOI-URLs] as a means to keep cross-publisher
      article-to-article links operational, even when the journals in
      which the articles are published change hands from one publisher
      to another, for example, as a result of an acquisition.

   o  Authors of controlled vocabularies that use PURLs [PURLs] for
      vocabulary terms to ensure that the term URIs remain stable even
      if management of the vocabulary is transferred to a new custodian.

   o  A variety of organizations, including libraries, archives, and
      museums that assign ARK URLs [draft-kunze-ark-18] to information
      objects in order to support long-term access.

   In order for the investments in infrastructure involved in these
   approaches to pay off, and hence for links to effectively remain
   operational as intended, it is crucial that a resource be referenced
   by means of its identifying URI.  However, the access URI is where a
   user agent actually accesses the resource (e.g., it is the URI in the
   browser's address bar).  As such, there is a considerable risk that
   the access URI instead of the identifying URI is used for referencing
   [PIDs-must-be-used].

   The link relation type defined in this specification makes it
   possible to convey to user agents that the identifying URI is the
   preferred URI for referencing.  Applications such as bookmarking
   tools, citation managers, and webometrics applications can take this
   preference into account when recording a URI.

3.2.  Version Identifiers

   Resource versioning systems often use a naming approach whereby:

   o  The most recent version of a resource is at any time available at
      the same, generic URI.

   o  Each version of the resource - including the most recent one - has
      a distinct version URI.

   For example, Wikipedia makes the current version of an article
   available at a generic URI and assigns a distinct version URI to each
   revision of that article.

   While the current version of a resource is accessed at the generic
   URI, some versioning systems adhere to a policy that favors linking
   and referencing by means of the version URI that was minted for the
   current version.  To express this using the terminology of Section 2,
   these policies intend that the generic URI is the access URI, and
   that the version URI is the identifying URI.  These policies are
   informed by the understanding that the content at the generic URI is
   likely to evolve over time, and that accurate links or references
   should lead to the content as it was at the time of referencing.  To
   that end, Wikipedia's "Permanent link" and "Cite this page"
   functionalities use the version URI, not the generic URI.

   The link relation type defined in this specification makes it
   possible to convey to user agents that the version URI is preferred
   over the generic URI for referencing.

3.3.  Preferred Social Identifier

   A web user commonly has multiple profiles on the web, for example,
   one per social network she takes part in, a personal homepage, a
   professional homepage, a FOAF profile [FOAF], etc.  Each of these
   profiles is accessible at a distinct URI.  But the user may have a
   preference for one of those profiles, for example, because it is the
   most complete, is kept up to date, or is expected to be long-lived.

   The link relation type defined in this specification makes it
   possible to convey to user agents that a profile URI - the
   identifying URI - other than the one the agent is accessing - the
   access URI - is preferred for referencing.

3.4.  Multi-Resource Publications

   When publishing on the web, it is not uncommon to make distinct
   components of a publication available as different web resources,
   each with its own URI.
   For example:

   o  Contemporary scholarly publications routinely consist of a
      traditional article as well as additional materials that are
      considered an integral part of the publication, such as
      supplementary information, high-resolution images, and a video
      recording of an experiment.

   o  Scientific or governmental open data sets frequently consist of
      multiple files.

   o  Online books typically consist of multiple chapters.

   While each of these components is accessible at its own distinct
   URI - the access URI - they often also share a URI assigned to the
   intellectual publication of which they are components - the
   identifying URI.

   The link relation type defined in this specification makes it
   possible to convey to user agents that, for the purpose of
   referencing, the identifying URI of the intellectual publication is
   preferred over an access URI of a component of the publication.

4.  The "identifier" Relation Type for Expressing a Preferred URI for
    the Purpose of Referencing

   A link with the "identifier" relation type indicates that the link
   target - the identifying URI - is preferred over the link context for
   the purpose of referencing.

   An identifying URI SHOULD support protocol-based access as a means to
   ensure that applications that store identifying URIs can effectively
   re-use them for access.

   An identifying URI SHOULD provide the ability for a user agent to
   follow its nose back to the access URI, e.g., by following redirects
   and/or links.  This helps a user agent to establish trust in the
   identifying URI.

   Because a link with the "identifier" relation type expresses a
   preferred URI for the purpose of referencing, the access URI SHOULD
   provide only one link with that relation type.  If more than one
   "identifier" link is provided, the user agent may select one (e.g.,
   an HTTP URI over a mailto URI) based, for example, on the purpose
   that the identifying URI will serve.

   Providing a link with the "identifier" relation type does not prevent
   using the access URI for the purpose of referencing if such
   specificity is needed for the application at hand.  For example, in
   the scenario of Section 3.4, the access URI is likely required for
   the purpose of annotating a specific component of an intellectual
   publication.  Yet, the annotation application may also want to
   include the identifying URI in the annotation.

5.  Distinction with Other Relation Types

   The following existing IANA-registered relationships are similar to
   the relationship that "identifier" is intended to convey, but are not
   appropriate for various reasons:

   o  "alternate" [RFC4287], used to link to an alternate version of the
      content at the link context, for example the same content with
      varying Content-Type (e.g., application/pdf vs. text/html) and/or
      Content-Language (e.g., en vs. fr).

   o  "bookmark" [W3C.REC-html5-20151028], used to convey a permanent
      link to use for bookmarking purposes.

   o  "canonical" [RFC6596], used to identify content that is either
      duplicative or a superset of the content at the link context, for
      example a single page version of a magazine article, provided for
      indexing by search engines, of an article that is spread over
      several pages for human use.

   o  "duplicate" [RFC6249], used to link to a resource whose available
      representations are byte-for-byte identical with the corresponding
      representations of the link context, for example, an identical
      file on a mirror site.

   o  "related" [RFC4287], used to link to a related resource.

   A closer inspection of these candidates [identifier-blog] shows that
   they are not appropriate and that a new relation type is required.

   In the scenario of Section 3.1, there is no content available at the
   identifying URI as it merely redirects to the access URI.  In the
   scenario of Section 3.3, the content at the identifying URI is a
   profile that is different from the profile at the access URI.  In the
   scenario of Section 3.4, the content at the identifying URI, if any,
   would typically be a sort of table of contents with links to
   component resources and possibly a summary.  These considerations
   exclude "alternate", "canonical", and "duplicate" as possible
   relation types.

   The intent of "bookmark" is closest to that of "identifier" in that
   the link target of a link with this relation type is intended for
   bookmarking, which is a case of referencing.  However, "bookmark" is
   specifically defined for use in conjunction with the HTML "article"
   element and is explicitly excluded from use in the "link" element in
   the HTML "head".  Since a link in the HTML "head" and a link in the
   HTTP Link header are semantically equivalent, "bookmark" is also
   excluded from use in HTTP Link.

   While "related" could be used, its semantics are too vague to convey
   the specific nature of "identifier" as a means to convey a URI for
   the purpose of referencing.

6.  Examples

   Sections 6.1 and 6.2 show examples of the use of links with the
   "identifier" relation type.  One example shows its use in a response
   header and body, the other in a response body only.

6.1.  Persistent HTTP URI

   If the access URI is a landing page for a scholarly article for which
   a persistent HTTP URI was minted, then the response to an HTTP GET on
   the landing page's URI could be as shown in Figure 1.

   HTTP/1.1 200 OK
   Link: ; rel="identifier"
   Content-Type: text/html;charset=utf-8

   ...
   ...
   ...

     Figure 1: Response to HTTP GET on the URI of the landing page of a
                             scholarly article

6.2.  Preferred Profile URI

   If the access URI is the home page of John Doe, John can add a link
   with the "identifier" relation type to it, as a means to convey that
   he would preferably be referenced by means of the URI of his FOAF
   profile.  Figure 2 shows the response to an HTTP GET on the URI of
   John's home page.

   HTTP/1.1 200 OK
   Content-Type: text/html;charset=utf-8

   ...
   ...
   ...

     Figure 2: Response to HTTP GET on the URI of John Doe's home page

7.  IANA Considerations

7.1.  Link Relation Type: identifier

   The link relation type below has been registered by IANA per
   Section 2.1.1 of [I-D.nottingham-rfc5988bis]:

      Relation Name: identifier

      Description: A link with the "identifier" relation type indicates
      that the link target is preferred over the link context for the
      purpose of referencing.

      Reference: [[ This document ]]

8.  Security Considerations

   In cases where there is no way for the agent to automatically verify
   the correctness of the identifying URI (cf. Section 4), out-of-band
   mechanisms might be required to establish trust.

   If a trusted site is compromised, the "identifier" link relation
   could be used with malicious intent to supply misleading URIs for
   referencing.  Use of these links might direct user agents to an
   attacker's site, break the referencing record they are intended to
   support, or corrupt algorithmic interpretation of referencing data.

9.  References

9.1.  Normative References

   [I-D.nottingham-rfc5988bis]
              Nottingham, M., "Web Linking",
              draft-nottingham-rfc5988bis-06 (work in progress),
              June 2017.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC4287]  Nottingham, M., Ed. and R. Sayre, Ed., "The Atom
              Syndication Format", RFC 4287, DOI 10.17487/RFC4287,
              December 2005.

   [RFC5988]  Nottingham, M., "Web Linking", RFC 5988,
              DOI 10.17487/RFC5988, October 2010.

   [RFC6249]  Bryan, A., McNab, N., Tsujikawa, T., Poeml, P., and H.
              Nordstrom, "Metalink/HTTP: Mirrors and Hashes", RFC 6249,
              DOI 10.17487/RFC6249, June 2011.

   [RFC6596]  Ohye, M. and J. Kupke, "The Canonical Link Relation",
              RFC 6596, DOI 10.17487/RFC6596, April 2012.

   [W3C.REC-html5-20151028]
              Hickson, I., Berjon, R., Faulkner, S., Leithead, T., Doyle
              Navara, E., O'Connor, E., and S. Pfeiffer, "HTML5", World
              Wide Web Consortium Recommendation REC-HTML5-20141028,
              October 2014.

9.2.  Informative References

   [CoolURIs]
              Berners-Lee, T., "Cool URIs don't change", World Wide Web
              Consortium Style, 1998.

   [DOI-URLs]
              Hendricks, G., "Display guidelines for Crossref DOIs",
              June 2017.

   [DOIs]     "Information and documentation - Digital object identifier
              system", ISO 26324:2012(en), 2012.

   [draft-kunze-ark-18]
              Kunze, J. and R. Rodgers, "The ARK Identifier Scheme",
              Internet Draft draft-kunze-ark-18, April 2013.

   [FOAF]     Brickley, D. and L. Miller, "FOAF Vocabulary Specification
              0.99", January 2014.

   [identifier-blog]
              Nelson, M., "Linking to Persistent Identifiers with
              rel="identifier"", July 2016.

   [PIDs-must-be-used]
              Van de Sompel, H., Klein, M., and S. Jones, "Persistent
              URIs Must Be Used To Be Persistent", February 2016.

   [PURLs]    "Persistent uniform resource locator", April 2017.

Appendix A.  Acknowledgements

   Thanks to Martin Klein and Harihar Shankar for their comments and
   suggestions.

Authors' Addresses

   Herbert Van de Sompel
   Los Alamos National Laboratory

   Email: herbertv@lanl.gov
   URI:   http://public.lanl.gov/herbertv/

   Michael Nelson
   Old Dominion University

   Email: mln@cs.odu.edu
   URI:   http://www.cs.odu.edu/~mln/

   Geoffrey Bilder
   Crossref

   Email: gbilder@crossref.org
   URI:   https://www.crossref.org/authors/geoffrey-bilder/

   John Kunze
   California Digital Library

   Email: jak@ucop.edu
   URI:   http://www.cdlib.org/contact/staff_directory/jkunze.html

   Simeon Warner
   Cornell University

   Email: simeon.warner@cornell.edu
   URI:   https://orcid.org/0000-0002-7970-7855