idnits 2.17.1 draft-paskin-doi-uri-04.txt: -(101): Line appears to be too long, but this could be caused by non-ascii characters in UTF-8 encoding Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** Looks like you're using RFC 2026 boilerplate. This must be updated to follow RFC 3978/3979, as updated by RFC 4748. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- == There are 7 instances of lines with non-ascii characters in the document. == No 'Intended status' indicated for this document; assuming Proposed Standard Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack an IANA Considerations section. (See Section 2.2 of https://www.ietf.org/id-info/checklist for how to handle the case when there are no actions for IANA.) ** The document seems to lack separate sections for Informative/Normative References. All references will be assumed normative when checking for downward references. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not match the current year == Line 75 has weird spacing: '... or perfo...' -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) 
-- The document date (June 2003) is 7613 days in the past. Is this intentional? Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) -- Missing reference section? '1' on line 126 looks like a reference -- Missing reference section? '2' on line 133 looks like a reference -- Missing reference section? '3' on line 175 looks like a reference -- Missing reference section? '4' on line 185 looks like a reference -- Missing reference section? '5' on line 186 looks like a reference -- Missing reference section? '6' on line 189 looks like a reference -- Missing reference section? '7' on line 321 looks like a reference -- Missing reference section? '8' on line 349 looks like a reference -- Missing reference section? '9' on line 354 looks like a reference Summary: 3 errors (**), 0 flaws (~~), 4 warnings (==), 11 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Internet-Draft Norman Paskin 3 Document: draft-paskin-doi-uri-04.txt International DOI 4 Expires: December 2003 Foundation 5 Eamonn Neylon 6 Manifest Solutions 7 Tony Hammond 8 Elsevier 9 Sam Sun 10 CNRI 11 June 2003 13 The "doi" URI Scheme for the Digital Object Identifier (DOI) 15 Status of this Memo 17 This document is an Internet-Draft and is in full conformance with 18 all provisions of Section 10 of RFC 2026. 20 Internet-Drafts are working documents of the Internet Engineering 21 Task Force (IETF), its areas, and its working groups. Note that 22 other groups may also distribute working documents as Internet- 23 Drafts. 25 Internet-Drafts are draft documents valid for a maximum of six 26 months and may be updated, replaced, or obsoleted by other 27 documents at any time. 
It is inappropriate to use Internet-Drafts 28 as reference material or to cite them other than as "work in 29 progress." 31 The list of current Internet-Drafts can be accessed at 32 http://www.ietf.org/ietf/1id-abstracts.txt 33 The list of Internet-Draft Shadow Directories can be accessed at 34 http://www.ietf.org/shadow.html. 36 Abstract 38 This document defines the "doi" Uniform Resource Identifier (URI) 39 scheme for the Digital Object Identifier (DOI). DOIs are 40 identifiers for entities of significance to the content 41 industries. The "doi" URI scheme allows a resource associated with 42 an entity identified by a DOI to be referenced by a URI for 43 Internet applications. A "doi" URI is dereferenced to a set of 44 service descriptions through discoverable resolution mechanisms. 46 Table of Contents 48 1 Introduction..................................................2 49 2 Terminology...................................................3 50 3 The "doi" URI Scheme..........................................3 51 4 Normalization and Comparison of "doi" URIs....................5 52 5 DOI Administration............................................6 53 6 DOI Resolution................................................6 54 7 Rationale.....................................................7 55 8 Security Considerations.......................................8 56 9 Acknowledgements..............................................8 57 10 References..................................................8 58 11 Authors' Addresses..........................................9 59 12 Full Copyright Statement....................................9 61 1 Introduction 63 This document defines the "doi" Uniform Resource Identifier (URI) 64 scheme for the Digital Object Identifier (DOI). DOIs are 65 identifiers for entities of significance to the content 66 industries. 
The "doi" URI scheme allows a resource associated with 67 an entity identified by a DOI to be referenced by a URI for 68 Internet applications. A "doi" URI is dereferenced to a set of 69 service descriptions through discoverable resolution mechanisms. 71 The term "Digital Object Identifier" should be construed as 72 meaning an identifier ("Identifier") of an entity ("Object") for 73 use in networked environments ("Digital"). In this sense an 74 "Object" can be any entity - any digital or physical manifestation 75 or performance, or any abstract work or concept - that is 76 identified by a DOI. 78 Some concepts relevant to DOI follow: 80 International DOI Foundation (IDF) - The International DOI 81 Foundation, Inc. is a non-stock membership corporation 82 organized in 1997 and existing under and by virtue of the 83 General Corporation Law of the State of Delaware, USA. The 84 Foundation is controlled by a Board elected by the members of 85 the Foundation. The Corporation is a "not-for-profit" 86 organization, i.e. prohibited from activities not permitted to 87 be carried on by a corporation exempt from US federal income 88 tax under Section 501(c)(6) of the Internal Revenue Code of 89 1986 et seq. 91 The activities of the Foundation are controlled by its members, 92 operating under a legal Charter and formal By-laws. Membership 93 is open to all organizations with an interest in electronic 94 publishing, content distribution, rights management, and 95 related enabling technologies. 97 The Foundation was founded to develop a framework of 98 infrastructure, policies and procedures to support the 99 identification needs of the content industries. 101 DOI Prefix Holder - Any network user who has been assigned the use 102 of a DOI naming authority under which DOIs may be created. 104 DOI Registration Agency - An IDF-appointed body that provides 105 administration facilities to DOI Prefix Holders.
107 DOI Resolution - A process of service indirection whereby a 108 service is selected from a set of service descriptions returned 109 on dereference of a "doi" URI and this service subsequently 110 activated. 112 DOI Service - One or more network services accessible on 113 resolution of a DOI. 115 DOI Metadata - A set of data associated with a DOI which is 116 deposited into a repository at time of creation by a DOI 117 Registration Agency and thereafter maintained. 119 2 Terminology 121 In this document the key words "must", "must not", "required", 122 "shall", "shall not", "should", "should not", "recommended", 123 "may", and "optional" are to be interpreted as described in RFC 124 2119 [1] and indicate requirement levels for compliant 125 implementations. 127 3 The "doi" URI Scheme 129 3.1 Definition of "doi" URI Syntax 131 The "doi" URI syntax defined in this document conforms to the 132 generic URI syntax. This specification uses the Augmented Backus- 133 Naur Form (ABNF) notation of RFC 2234 [2] to define the URI. The 134 following core ABNF productions are used by this specification as 135 defined by Section 6.1 of RFC 2234: ALPHA, DIGIT, HEXDIG. The 136 complete "doi" URI syntax is as follows: 138 doi-uri = scheme ":" encoded-doi [ "?" query ] 139 [ "#" fragment ] 141 scheme = "doi" 143 encoded-doi = prefix "/" suffix 145 prefix = segment 147 suffix = segment *( "/" segment ) 148 segment = *pchar 150 query = *( pchar / "/" / "?" ) 152 fragment = *( pchar / "/" / "?" ) 154 pchar = unreserved / escaped / ";" / 155 ":" / "@" / "&" / "=" / "+" / "$" / "," 157 unreserved = ALPHA / DIGIT / mark 159 escaped = "%" HEXDIG HEXDIG 161 mark = "-" / "_" / "." / "!" / "~" / "*" / "'" / 162 "(" / ")" 164 A "doi" URI has an (encoded) DOI as its scheme-specific part 165 followed by an optional query component followed by an optional 166 fragment identifier.
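As an illustration only, the ABNF above can be transcribed into a regular expression and used to split a "doi" URI into its components. The following Python sketch is not part of the draft: the names parse_doi_uri and DoiUri are hypothetical, the scheme is matched case-insensitively on the assumption that ABNF string literals are case-insensitive, and the non-empty check on prefix and suffix reflects the minimum validation constraint stated in Section 3.1 rather than the grammar itself (which permits empty segments).

```python
import re
from typing import NamedTuple, Optional

# pchar = unreserved / escaped / ";" / ":" / "@" / "&" / "=" / "+" / "$" / ","
_PCHAR = r"(?:[A-Za-z0-9\-_.!~*'();:@&=+$,]|%[0-9A-Fa-f]{2})"
_SEGMENT = rf"{_PCHAR}*"          # segment = *pchar
_TAIL = rf"(?:{_PCHAR}|[/?])*"    # query / fragment = *( pchar / "/" / "?" )

# doi-uri = scheme ":" prefix "/" suffix [ "?" query ] [ "#" fragment ]
_DOI_URI = re.compile(
    rf"(?i:doi):"
    rf"(?P<prefix>{_SEGMENT})/(?P<suffix>{_SEGMENT}(?:/{_SEGMENT})*)"
    rf"(?:\?(?P<query>{_TAIL}))?"
    rf"(?:#(?P<fragment>{_TAIL}))?"
)


class DoiUri(NamedTuple):
    """Components of a "doi" URI (hypothetical helper type)."""
    prefix: str
    suffix: str
    query: Optional[str]
    fragment: Optional[str]


def parse_doi_uri(uri: str) -> DoiUri:
    """Split a "doi" URI into components, or raise ValueError."""
    match = _DOI_URI.fullmatch(uri)
    if match is None:
        raise ValueError(f"not a syntactically valid doi URI: {uri!r}")
    # The grammar allows empty segments; the prose requires that both
    # the prefix and the suffix be non-empty.
    if not match["prefix"] or not match["suffix"]:
        raise ValueError(f"prefix and suffix must be non-empty: {uri!r}")
    return DoiUri(match["prefix"], match["suffix"],
                  match["query"], match["fragment"])
```

Under these assumptions, parse_doi_uri("doi:10.abc/ab-cd-ef") yields the prefix "10.abc" and the suffix "ab-cd-ef", with no query or fragment. Characters outside the allowed set, such as an unescaped space, cause a ValueError.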
A DOI is constructed by appending a unique 167 suffix string to an assigned prefix string separated by a slash 168 "/" character. The prefix is always assigned to a DOI Prefix 169 Holder by a DOI Registration Agency. The DOI Prefix Holder is 170 responsible for the creation of a valid suffix. The prefix in a 171 DOI corresponds to the naming authority. The administration of any 172 particular DOI may be transferred to another party at any time. 173 The prefix does not denote the owner of a DOI. 175 ANSI/NISO Z39.84-2000 [3] is the authoritative reference that 176 specifies the rules for constructing a DOI. Once constructed, a 177 DOI may be regarded as an opaque identifier with no internal 178 structure. The minimum constraints for validation of a DOI string 179 are that the prefix and suffix components be non-empty. 181 3.2 Allowed Characters Under the "doi" URI Scheme 183 The syntax for a DOI is defined in accordance with the ANSI/NISO 184 Z39.84-2000 standard "Syntax for the Digital Object 185 Identifier". A DOI is represented using the Unicode [4] character set 186 and is encoded in UTF-8 [5]. 188 The "doi" URI syntax uses the same set of allowed US-ASCII 189 characters as specified in RFC 2396 [6] for a generic URI. 190 Reserved characters as well as excluded US-ASCII characters and 191 non-US-ASCII characters must be escaped before forming the URI. 192 Details of the escape encoding can be found in RFC 2396, section 193 2.4. 195 3.3 Examples of "doi" URIs 197 Some examples of syntactically valid "doi" URIs are given below: 199 (a) doi:alpha-beta/182.342-24 201 where "alpha-beta" is the prefix and "182.342-24" is the suffix. 203 (b) doi:10.abc/ab-cd-ef 205 where "10.abc" is the prefix and "ab-cd-ef" is the suffix. 207 (c) 209 where "10.23" is the prefix and "2002/january/21/4690" is the 210 suffix.
212 (d) doi:11.a.7/0363-0277(19950315)120%3A5%3C%3E1.0.TX%3B2-V 214 where "11.a.7" is the prefix and "0363- 215 0277(19950315)120%3A5%3C%3E1.0.TX%3B2-V" is the suffix. Note that 216 in unescaped form this DOI is represented in UTF-8 as 217 "11.a.7/0363-0277(19950315)120:5<>1.0.TX;2-V". 219 (e) doi:dk/P%C3%A6dagogi%2037(2),%20562 221 where "dk" is the prefix and "P%C3%A6dagogi%2037(2),%20562" is the 222 suffix. Note that in unescaped form this DOI is represented in 223 UTF-8 as "dk/Pædagogi 37(2), 562" and in ISO-Latin-1 as 224 "dk/Pædagogi 37(2), 562". 226 4 Normalization and Comparison of "doi" URIs 228 In order to facilitate comparison of "doi" URIs and to reduce the 229 risk of false negatives, normalization to the canonical form 230 should be applied to minimize the amount of software processing 231 for such comparisons. 233 The following normalization steps should be applied: 235 1. Normalize the case of the leading "doi:" token to be 236 lowercase 237 2. Unescape all unreserved %-escaped characters 238 3. Normalize the case of the scheme-specific part 239 including any %-escaped characters to be uppercase 241 The following forms of a "doi" URI 242 1. DOI:dk/P%C3%A6dagogi%2037(2),%20562 243 2. doi:DK/P%C3%A6dagogi%2037(2),%20562 244 3. doi:dk/P%c3%a6dagogi%2037(2),%20562 245 4. doi:dk/p%c3%a6dagogi%2037(2),%20562 246 5. doi:dk%2FP%C3%A6dagogi%2037%282%29%2C%20562 248 are normalized to the canonical form 250 doi:DK/P%C3%A6DAGOGI%2037(2),%20562 252 5 DOI Administration 254 The International DOI Foundation (IDF) is a not-for-profit 255 membership-based organization founded to develop a framework of 256 infrastructure, policies and procedures to support the 257 identification needs of the content industries. 258 The IDF is the maintenance agency for DOI and appoints DOI 259 Registration Agencies. 261 DOIs are created by DOI Prefix Holders and must be registered via 262 a DOI Registration Agency.
Any network user can become a DOI 263 Prefix Holder by agreement with a DOI Registration Agency. 265 DOI Registration Agencies perform the following functions: 266 allocating DOI prefixes, registering DOIs, and providing the 267 necessary infrastructure to allow DOI Prefix Holders to declare 268 and maintain the metadata associated with a particular DOI. DOI 269 Registration Agencies also maintain knowledge of the current owner 270 of each individual DOI to ensure administrative updates. 272 The IDF maintains the DOI system (to allow registration and ensure 273 resolution of DOIs) and provides governance to ensure appropriate 274 use. DOI assignment requires a fee to ensure that the system costs 275 are met. This allows the system to be managed and supports 276 persistence as a function of organization rather than technology. 277 The fee is for the registering of DOIs (and may optionally be 278 passed on to registrants, waived or subsidized by a DOI 279 Registration Agency), but not for the resolution of a DOI. 281 The DOI system relies on copyright and trademark law to protect 282 the DOI brand and reputation. DOI is not a patented system; the 283 IDF has not developed any patent claims on the DOI system and does 284 not rely on patent law for remedy. 286 6 DOI Resolution 288 A "doi" URI references a set of service descriptions which is 289 returned on dereference of the URI. Following such a dereference a 290 service description is typically selected and the corresponding 291 service activated. This process of service indirection is commonly 292 referred to as "resolution" of a DOI. Examples of services that can 293 be accessed by the resolution of a DOI include redirection to 294 another network resource, return of a metadata record describing 295 the entity identified by the DOI, etc. A discussion of such 296 services is beyond the scope of this document. 298 Resolution of a DOI can be accomplished using a variety of network 299 protocols.
The combination of a network protocol, an access method 300 defined by that protocol and a service endpoint provides the means 301 of access to a resolution mechanism. As the maintenance agency for 302 DOI, the IDF will publish the means of access for known resolution 303 mechanisms of DOI. For the use of other resolution mechanisms 304 prior knowledge of the means of access is required. 306 As such a "doi" URI can be classified both as a name and a 307 locator. The locator references a set of service descriptions. 308 Note that this locator must not be confused with the locator used 309 to retrieve the ultimate representation that may be returned as a 310 result of activating a service. The "doi" URI is thus an instance 311 of an application-level URI and requires a methodology for mapping 312 from the "doi" URI to a proxy locator URI in order to realize its 313 locator role. These mapping methodologies provide the resolution 314 mechanisms that enable a "doi" URI to function as a locator of a 315 set of services. 317 7 Rationale 319 7.1 Why Create a New URI Scheme for DOI? 321 Under RFC 2718, "Guidelines for new URL Schemes" [7], it is stated 322 that a URI scheme should have a "demonstrated utility", and in 323 particular should be applied to "things that cannot be referred to 324 in any other way". DOI meets both of these criteria in that it is 325 a well established identifier (see ) for 326 entities of significance to the content industries, with some 10 327 million examples in current use on the Internet, and is being 328 widely embraced by the content industries. DOI is not bound to any 329 Internet protocol and so requires its own dedicated URI scheme. 331 The administration granularity of existing URI schemes typically 332 operates at the authority component level. By contrast DOIs are 333 managed at the individual identifier level. 
It is for this reason 334 that the DOI prefix is not to be interpreted as an "owner" 335 authority but rather as the "creator" authority. Once created the 336 "doi" URI may be regarded as an opaque identifier with no internal 337 structure. 339 7.2 Why Not Use a URN Namespace ID for DOI? 341 RFC 2396 states that a "URN differs from a URL in that it's [sic] 342 primary purpose is persistent labeling of a resource with an 343 identifier". A "doi" URI on the other hand has a dual purpose: 344 both to allow a resource associated with an entity identified by a 345 DOI to be referenced by a URI for Internet applications, as well 346 as to enable access to a set of service descriptions. In this 347 regard a "doi" URI scheme should be considered as being similar to 348 the "tel", "fax" and "modem" URI schemes documented in RFC 2806 349 [8]. 351 Further the syntactic requirements of the "doi" URI scheme are 352 incompatible with the URN syntax. Specifically the use of optional 353 query component and/or fragment identifier cannot be accommodated 354 by the URN syntax (cf. Sect. 2.3.2, RFC 2141 [9]). 356 8 Security Considerations 358 The "doi" URI scheme is subject to the same security 359 considerations as the general URI scheme described in RFC 2396. 361 Dereference of a "doi" URI to access a set of service descriptions 362 will be subject to the security considerations of the underlying 363 protocol used to access the resource referenced by the "doi" URI. 365 9 Acknowledgements 367 The authors acknowledge the contributions of Larry Lannom and 368 Jason Petrone, of the Corporation for National Research 369 Initiatives, to this specification. 371 The authors are also grateful to Larry Masinter and Martin Duerst 372 for their constructive comments on this specification. 374 10 References 376 1. Bradner, S., "Key Words for Use in RFCs to Indicate Requirement 377 Levels", BCP 14, RFC 2119, March 1997. 379 2. Crocker, D.H. 
and Overell, P., "Augmented BNF for Syntax 380 Specifications: ABNF", RFC 2234, November 1997. 382 3. ANSI/NISO Z39.84-2000, "Syntax for the Digital Object 383 Identifier", ISBN 1-880124-47-5. 385 4. The Unicode Consortium, "The Unicode Standard", Version 3, ISBN 386 0-201-61633-5, as updated from time to time by the publication of 387 new versions. (See 388 http://www.unicode.org/unicode/standard/versions for the latest 389 version and additional information on versions of the standard and 390 of the Unicode Character Database). 392 5. Yergeau, F., "UTF-8, A Transformation Format for Unicode and 393 ISO10646", RFC 2279, October 1996. 395 6. Berners-Lee, T., R. Fielding and L. Masinter, "Uniform Resource 396 Identifiers (URI): Generic Syntax", RFC 2396, August 1998. 398 7. Masinter, L., H. Alvestrand, D. Zigmond and P. Petke, 399 "Guidelines for new URL Schemes", RFC 2718, November 1999. 401 8. Vaha-Sipila, A., "URLs for Telephone Calls", RFC 2806, April 402 2000. 404 9. Moats, R., "URN Syntax", RFC 2141, May 1997. 406 11 Authors' Addresses 408 Norman Paskin 409 The International DOI Foundation 410 Linacre House, Jordan Hill 411 Oxford, OX2 8DP, UK 412 n.paskin@doi.org 414 Eamonn Neylon 415 Manifest Solutions 416 Bicester 417 Oxfordshire, OX26 2HX, UK 418 eneylon@manifestsolutions.com 420 Tony Hammond 421 Elsevier Ltd 422 32 Jamestown Road 423 London, NW1 7BY, UK 424 t.hammond@elsevier.com 426 Sam Sun 427 Corporation for National Research Initiatives 428 1805 Preston White Dr., Suite 100 429 Reston, VA 20191, USA 430 ssun@cnri.reston.va.us 432 12 Full Copyright Statement 434 Copyright (C) The Internet Society (2003). All Rights Reserved.
436 This document and translations of it may be copied and furnished 437 to others, and derivative works that comment on or otherwise 438 explain it or assist in its implementation may be prepared, copied, 439 published and distributed, in whole or in part, without 440 restriction of any kind, provided that the above copyright notice 441 and this paragraph are included on all such copies and derivative 442 works. However, this document itself may not be modified in any 443 way, such as by removing the copyright notice or references to the 444 Internet Society or other Internet organizations, except as needed 445 for the purpose of developing Internet standards in which case the 446 procedures for copyrights defined in the Internet Standards 447 process must be followed, or as required to translate it into 448 languages other than English. 450 The limited permissions granted above are perpetual and will not 451 be revoked by the Internet Society or its successors or assigns. 453 This document and the information contained herein is provided on 454 an "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET 455 ENGINEERING TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR 456 IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF 457 THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED 458 WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.