Network Working Group                                   H. Van de Sompel
Internet-Draft                     Data Archiving and Networked Services
Intended status: Informational                                 M. Nelson
Expires: June 21, 2019                           Old Dominion University
                                                               G. Bilder
                                                                Crossref
                                                                J. Kunze
                                              California Digital Library
                                                               S. Warner
                                                      Cornell University
                                                       December 18, 2018

   cite-as: A Link Relation to Convey a Preferred URI for Referencing
                      draft-vandesompel-citeas-04

Abstract

   A web resource is routinely referenced by means of the URI with which
   it is directly accessed. But cases exist where referencing a
   resource by means of a URI different from that access URI is
   preferred. This specification defines a link relation type that can
   be used to convey such a preference.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on June 21, 2019.

Copyright Notice

   Copyright (c) 2018 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  Scenarios
     3.1.  Persistent Identifiers
     3.2.  Version Identifiers
     3.3.  Preferred Social Identifier
     3.4.  Multi-Resource Publications
   4.  The "cite-as" Relation Type for Expressing a Preferred URI
       for the Purpose of Referencing
   5.  Distinction with Other Relation Types
     5.1.  bookmark
     5.2.  canonical
   6.  Examples
     6.1.  Persistent HTTP URI
     6.2.  Version URIs
     6.3.  Preferred Profile URI
     6.4.  Multi-Resource Publication
   7.  IANA Considerations
     7.1.  Link Relation Type: cite-as
   8.  Security Considerations
   9.  References
     9.1.  Normative References
     9.2.  Informative References
   Appendix A.  Acknowledgements
   Authors' Addresses

1.  Introduction

   A web resource is routinely referenced (e.g., linked, bookmarked) by
   means of the URI with which it is directly accessed. But cases exist
   where referencing a resource by means of a different URI is
   preferred, for example because the latter URI is intended to be more
   persistent over time. Currently, there is no link relation type to
   convey such an alternative referencing preference; this
   specification addresses this deficit by introducing a link relation
   type intended for that purpose.

2.  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

   This specification uses the terms "link context" and "link target" as
   defined in [RFC8288]. These terms respectively correspond with
   "Context IRI" and "Target IRI" as used in [RFC5988]. Although
   defined as IRIs, in common scenarios they are also URIs.

   Additionally, this specification uses the following terms:

   o  "access URI": A URI at which a user agent accesses a web resource.

   o  "reference URI": A URI, other than the access URI, that should
      preferentially be used for referencing.

   By interacting with the access URI, the user agent may discover typed
   links. For such links, the access URI is the link context.

3.  Scenarios

3.1.  Persistent Identifiers

   Despite sound advice regarding the design of Cool URIs [CoolURIs],
   link rot ("HTTP 404 Not Found") is a common phenomenon when
   following links on the web. Certain communities of practice (see
   examples below) have introduced solutions to combat this problem
   that typically consist of:

   o  Accepting the reality that the web location of a resource - the
      access URI - may change over time.

   o  Minting an additional URI for the resource - the reference URI -
      that is specifically intended to remain persistent over time.

   o  Redirecting (typically "HTTP 301 Moved Permanently", "HTTP 302
      Found", or "HTTP 303 See Other") from the reference URI to the
      access URI.

   o  As a community of practice, committing to adjust that redirection
      whenever the access URI changes over time.
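
   The steps listed above can be sketched as a small resolver table.
   The following Python fragment is an illustration only and is not
   part of this specification. The class name PersistentResolver, its
   method names, and all example.org, example.com, and example.net
   URIs are hypothetical, and the choice of "302 Found" is just one of
   the redirect statuses mentioned above.

```python
from typing import Dict, Tuple


class PersistentResolver:
    """Hypothetical table mapping reference URIs to current access URIs."""

    def __init__(self) -> None:
        self._table: Dict[str, str] = {}

    def mint(self, reference_uri: str, access_uri: str) -> None:
        """Mint a reference URI that is meant to stay stable over time."""
        self._table[reference_uri] = access_uri

    def update(self, reference_uri: str, new_access_uri: str) -> None:
        """Adjust the redirection after the resource has moved."""
        if reference_uri not in self._table:
            raise KeyError(reference_uri)
        self._table[reference_uri] = new_access_uri

    def resolve(self, reference_uri: str) -> Tuple[int, Dict[str, str]]:
        """Return (status code, headers) for a request to a reference URI."""
        access_uri = self._table.get(reference_uri)
        if access_uri is None:
            # Unknown identifier: this is the link-rot case.
            return 404, {}
        # Assumption: 302 is used here; 301 or 303 are equally typical.
        return 302, {"Location": access_uri}
```

   In this sketch, a reference held by a third party keeps working
   after a move because only the table entry changes: calling update()
   with the new access URI makes resolve() redirect to the new
   location, while the reference URI itself never changes.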

   This approach is, for example, used by:

   o  Scholarly publishers that use DOIs [DOIs] to identify articles and
      DOI URLs [DOI-URLs] as a means to keep cross-publisher
      article-to-article links operational, even when the journals in
      which the articles are published change hands from one publisher
      to another, for example, as a result of an acquisition.

   o  Authors of controlled vocabularies that use PURLs (Persistent
      Uniform Resource Locators) [PURLs] for vocabulary terms to ensure
      that the URIs they assign to vocabulary terms remain stable even
      if management of the vocabulary is transferred to a new custodian.

   o  A variety of organizations, including libraries, archives, and
      museums that assign ARK URLs [draft-kunze-ark-18] to information
      objects in order to support long-term access.

   In order for the investments in infrastructure involved in these
   approaches to pay off, and hence for links to effectively remain
   operational as intended, it is crucial that a resource be referenced
   by means of its reference URI. However, the access URI is where a
   user agent actually accesses the resource (e.g., it is the URI in the
   browser's address bar). As such, there is a considerable risk that
   the access URI instead of the reference URI is used for referencing
   [PIDs-must-be-used].

   The link relation type defined in this document makes it possible for
   user agents to differentiate the reference URI from the access URI.

3.2.  Version Identifiers

   Resource versioning systems often use a naming approach whereby:

   o  The most recent version of a resource is at any time available at
      the same, generic URI.

   o  Each version of the resource - including the most recent one - has
      a distinct version URI.

   For example, Wikipedia uses generic URIs of the form
   https://en.wikipedia.org/wiki/John_Doe and version URIs of the form
   https://en.wikipedia.org/w/index.php?title=John_Doe&oldid=776253882.

   While the current version of a resource is accessed at the generic
   URI, some versioning systems adhere to a policy that favors linking
   and referencing a specific version URI. To express this using the
   terminology of Section 2, these policies intend that the generic URI
   is the access URI, and that the version URI is the reference URI.
   These policies are informed by the understanding that the content at
   the generic URI is likely to evolve over time, and that accurate
   links or references should lead to the content as it was at the time
   of referencing. To that end, Wikipedia's "Permanent link" and "Cite
   this page" functionalities promote the version URI, not the generic
   URI.

   The link relation type defined in this document makes it possible for
   user agents to differentiate the version URI from the generic URI.

3.3.  Preferred Social Identifier

   A web user commonly has multiple profiles on the web, for example,
   one per social network, a personal homepage, a professional homepage,
   a FOAF (Friend Of A Friend) profile [FOAF], etc. Each of these
   profiles is accessible at a distinct URI. But the user may have a
   preference for one of those profiles, for example, because it is most
   complete, kept up-to-date, or expected to be long-lived. As an
   example, the first author of this document has, among others, the
   following profile URIs:

   o  https://hvdsomp.info

   o  https://twitter.com/hvdsomp

   o  https://www.linkedin.com/in/herbertvandesompel/

   o  https://orcid.org/0000-0002-0715-6126

   Of these, from the perspective of the person described by these
   profiles, the first URI may be the preferred profile URI for the
   purpose of referencing because the domain is not under the
   custodianship of a third party. When an agent accesses another
   profile URI, such as https://orcid.org/0000-0002-0715-6126, this
   preference for referencing by means of the first URI could be
   expressed.

   The link relation type defined in this specification makes it
   possible for user agents to differentiate the preferred profile URI
   from the accessed profile URI.

3.4.  Multi-Resource Publications

   When publishing on the web, it is not uncommon to make distinct
   components of a publication available as different web resources,
   each with its own URI. For example:

   o  Contemporary scholarly publications routinely consist of a
      traditional article as well as additional materials that are
      considered an integral part of the publication such as
      supplementary information, high-resolution images, or a video
      recording of an experiment.

   o  Scientific or governmental open data sets frequently consist of
      multiple files.

   o  Online books typically consist of multiple chapters.

   While each of these components is accessible at its distinct URI -
   the access URI - they often also share a URI assigned to the
   intellectual publication of which they are components - the reference
   URI.

   The link relation type defined in this document makes it possible for
   user agents to differentiate the URI of the intellectual publication
   from the access URI of a component of the publication.

4.  The "cite-as" Relation Type for Expressing a Preferred URI for the
    Purpose of Referencing

   A link with the "cite-as" relation type indicates that, for
   referencing the link context, use of the URI of the link target is
   preferred over use of the URI of the link context. It allows the
   resource identified by the access URI (link context) to unambiguously
   link to its corresponding reference URI (link target), thereby
   expressing that the link target is preferred over the link context
   for the purpose of permanent citation.

   The link target of a "cite-as" link SHOULD support protocol-based
   access as a means to ensure that applications that store it can
   effectively re-use it for access.

   The link target of a "cite-as" link SHOULD provide the ability for a
   user agent to follow its nose back to the context of the link, e.g.,
   by following redirects and/or links. This helps a user agent to
   establish trust in the target URI.

   Because a link with the "cite-as" relation type expresses a preferred
   URI for the purpose of referencing, the access URI SHOULD only
   provide one link with that relation type. If more than one "cite-as"
   link is provided, the user agent may decide to select one (e.g., an
   HTTP URI over a mailto URI), for example, based on the purpose that
   the reference URI will serve.

   Providing a link with the "cite-as" relation type does not prevent
   using the access URI for the purpose of referencing if such
   specificity is needed for the application at hand. For example, in
   the scenario of Section 3.4, the access URI is likely required for
   the purpose of annotating a specific component of an intellectual
   publication. Yet, the annotation application may also want to
   appropriately include the reference URI in the annotation.
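
   The behavior described above can be sketched from the perspective
   of a user agent that received an HTTP "Link" header at the access
   URI. The following Python fragment is a minimal sketch under stated
   assumptions, not a definitive implementation: the function names
   are hypothetical, the parser is deliberately simplified and is not
   a complete [RFC8288] parser (it ignores relative link targets and
   "<" inside quoted parameter values), and falling back to the access
   URI when no "cite-as" link is present is an assumption of this
   example, not a requirement of this specification.

```python
import re
from typing import List, Sequence
from urllib.parse import urlsplit

# Simplified patterns (assumption): "<target>" followed by its parameters.
_LINK_RE = re.compile(r'<([^>]*)>([^<]*)')
_REL_RE = re.compile(r';\s*rel\s*=\s*(?:"([^"]*)"|([^\s;,]+))', re.IGNORECASE)


def cite_as_targets(link_header: str) -> List[str]:
    """Return the targets of all links whose rel includes "cite-as"."""
    targets = []
    for match in _LINK_RE.finditer(link_header):
        uri, params = match.group(1), match.group(2)
        rel = _REL_RE.search(params)
        if rel is None:
            continue
        # A rel value is a space-separated, case-insensitive list.
        rel_types = (rel.group(1) or rel.group(2) or "").lower().split()
        if "cite-as" in rel_types:
            targets.append(uri)
    return targets


def choose_reference_uri(
    access_uri: str,
    link_header: str,
    preferred_schemes: Sequence[str] = ("https", "http"),
) -> str:
    """Pick one URI to record when referencing the resource."""
    targets = cite_as_targets(link_header)
    # If several "cite-as" links exist, prefer by URI scheme (hypothetical
    # policy, mirroring the "HTTP URI over mailto URI" remark above).
    for scheme in preferred_schemes:
        for uri in targets:
            if urlsplit(uri).scheme.lower() == scheme:
                return uri
    # No preferred scheme found: use the first target, or fall back to
    # the access URI when no "cite-as" link was provided (assumption).
    return targets[0] if targets else access_uri
```

   A citation manager could, for instance, call
   choose_reference_uri() with the access URI and the received "Link"
   header value, record the returned URI as the reference, and keep
   the access URI alongside it when component-level specificity is
   needed.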

   Applications can leverage the information provided by a "cite-as"
   link in a variety of ways, for example:

   o  Bookmarking tools and citation managers can take this preference
      into account when recording a URI.

   o  Webometrics applications that trace URIs can trace both the access
      URI and the reference URI.

   o  Discovery tools can support look-up by means of both the access
      and the reference URI. This includes web archives that typically
      make archived versions of web resources discoverable by means of
      the original access URI of the archived resource; they can
      additionally make these archived resources discoverable by means
      of the associated reference URI.

5.  Distinction with Other Relation Types

   Some existing IANA-registered relationships intuitively resemble the
   relationship that "cite-as" is intended to convey. But a closer
   inspection of these candidates provided in the blog posts
   [identifier-blog], [canonical-blog], and [bookmark-blog] shows that
   they are not appropriate for various reasons and that a new relation
   type is required. The remainder of this section provides a summary
   of the detailed explanations provided in the referenced blog posts.

   It can readily be seen that the following relation types do not
   address the requirements described in Section 3:

   o  "alternate" [RFC4287]: The link target provides an alternate
      version of the content at the link context. These are typically
      variants according to dimensions that are subject to content
      negotiation, for example the same content with varying
      Content-Type (e.g., application/pdf vs. text/html) and/or
      Content-Language (e.g., en vs. fr). The representations provided
      by the context URIs and target URIs in the scenarios of
      Section 3.1 through Section 3.4 are not variants in the sense
      intended by [RFC4287], and, as such, the use of "alternate" is
      not appropriate.

   o  "duplicate" [RFC6249]: The link target is a resource whose
      available representations are byte-for-byte identical with the
      corresponding representations of the link context, for example, an
      identical file on a mirror site. In none of the above scenarios
      do the link context and the link target provide identical content.
      As such, the use of "duplicate" is not appropriate.

   o  "related" [RFC4287]: The link target is a resource that is related
      to the link context. While "related" could be used in all of the
      above scenarios, its semantics are too vague to convey the
      specific semantics intended by "cite-as".

   Two existing IANA-registered relationships deserve closer attention
   and are discussed in the remainder of this section.

5.1.  bookmark

   "bookmark" [W3C.REC-html5-20151028]: The link target provides a URI
   for the purpose of bookmarking the link context.

   The intent of "bookmark" is closest to that of "cite-as" in that the
   link target is intended to be a permalink for the link context, for
   bookmarking purposes. The relation type dates back to the earliest
   days of news syndication, before blogs and news feeds had permalinks
   to identify individual resources that were aggregated into a single
   page. As such, its intent is to provide permalinks for different
   sections of an HTML document. It was originally used with HTML
   elements such as