Network Working Group                                        L. Masinter
Internet-Draft                                                     Adobe
Intended status: Informational                             July 12, 2009
Expires: January 13, 2010


           The "tdb" URI scheme: denoting described resources
                      draft-masinter-dated-uri-06

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on January 13, 2010.

Copyright Notice

   Copyright (c) 2009 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents in effect on the date of
   publication of this document (http://trustee.ietf.org/license-info).
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.

Abstract

   This document defines a URI scheme, "tdb" (standing for "Thing
   Described By"). It provides a semantic hook for allowing anyone at
   any time to mint a URI for anything that they can describe. Such
   URIs may include a timestamp to fix the description at a given date
   or time.

   This URI scheme may reduce the need to define new URN namespaces
   merely for the purpose of creating stable identifiers. In addition,
   such URIs provide a ready means for identifying "non-information
   resources" by semantic indirection -- a way of creating a URI for
   anything.

Note

   This document is not a product of any working group. Many of the
   ideas here have been discussed since 2001. This document has been
   discussed on the mailing list . Previous versions couched "tdb" as a
   URN namespace, and included a "duri" scheme for fixing the date
   without indirection, which seems unnecessary. It was originally
   written as a thought experiment, as a way of resolving the
   use/mention problem in semantic web applications, but may have other
   uses.

Table of Contents

   1. Overview and Requirements
      1.1. Easy assignment of permanent identifiers
      1.2. Persistent identifiers
      1.3. URIs for abstractions
   2. Syntax
   3. Semantics
   4. Use as a Locator
   5. Hierarchy
   6. Timestamps in tdb URIs
   7. Additional Considerations
      7.1. URI schemes for the description resource
      7.2. Useful timestamps
      7.3. Free assignment
      7.4. Resolution
      7.5. Why Names with Semantics?
      7.6. Avoiding MetaData
      7.7. Avoiding tdb
      7.8. tdb and levels of indirection
   8. URI Specification Template
   9. IANA considerations
   10. Security Considerations
   11. Acknowledgements
   12. References
      12.1. Normative References
      12.2. Informative References
   Author's Address

1. Overview and Requirements

   The tdb URI scheme defined here solves several related problems:

1.1. Easy assignment of permanent identifiers

   The URN specification [RFC1737] allows for many URN namespaces, and
   many have been registered. However, obtaining an appropriate URN in
   any of the currently defined URN namespaces may be difficult: a
   number of URN namespace registrations have been accompanied by
   comments that no other URN namespace was available for the class of
   documents for which identifiers were wanted.

1.2. Persistent identifiers

   [RFC1737] defines several requirements for Uniform Resource Names.
   In particular, it requires "persistence":

      Persistence: It is intended that the lifetime of a URN be
      permanent.
      That is, the URN will be globally unique forever, and
      may well be used as a reference to a resource well beyond the
      lifetime of the resource it identifies or of any naming authority
      involved in the assignment of its name.

   Many people have wondered how to create globally unique and
   persistent identifiers. There are a number of URI schemes and URN
   namespaces already registered. However, an absolute guarantee of
   both uniqueness and persistence is very difficult.

   In some cases, the guarantee of persistence comes through a promise
   of good management practice, such as is encouraged in "Cool URIs
   don't change" [COOL]. However, relying on a promise of good
   management practice is not the same as having a design that
   guarantees reliability independent of actual administrative practice.

   A primary design goal for URIs is that they are intended to mean the
   same thing, no matter in what context they appear: a "Uniform" way to
   Identify a Resource. However, even when URIs have Uniform meaning
   from the point of view of the source of the reference, they don't
   guarantee stability over time. Despite best efforts and intentions,
   identifying information can change in unpredictable ways: domain
   names can disappear or be reassigned, and name-assigning
   organizations can change structure or responsibility, disappear,
   merge, or otherwise change in unpredictable ways.

   The interpretation of many URNs depends significantly on the concept
   of a "naming authority". The authority is presumably some individual
   or organization that serves both to ensure uniqueness of assignment
   and to help with understanding the meaning of the link between the
   name and the named.

   However, authorities, whether individuals or organizations, have a
   lifetime, and must be consulted at some point to understand the
   bindings.
   The functioning of names as unique identifiers and holders
   of meaning depends on having a reliable infrastructure for consulting
   the authority or the authority's records to determine the thing
   referenced.

1.3. URIs for abstractions

   The description of URIs [RFC3986] describes a range for 'Resource'
   that is quite broad:

      This specification does not limit the scope of what might be a
      resource; rather, the term "resource" is used in a general sense
      for whatever might be identified by a URI. Familiar examples
      include an electronic document, an image, a source of information
      with a consistent purpose (e.g., "today's weather report for Los
      Angeles"), a service (e.g., an HTTP-to-SMS gateway), and a
      collection of other resources. A resource is not necessarily
      accessible via the Internet; e.g., human beings, corporations, and
      bound books in a library can also be resources. Likewise,
      abstract concepts can be resources, such as the operators and
      operands of a mathematical equation, the types of a relationship
      (e.g., "parent" or "employee"), or numeric values (e.g., zero,
      one, and infinity).

   One might use a URI such as a "mailto:" email address to identify a
   person, or a "http:" URI to identify an abstract concept. However,
   this leaves the question of how one might identify, within the same
   context, both the system mailbox and the person to which it is
   assigned, or the web page at a http URI and the concept it describes.
   The "tdb" URI scheme allows ready assignment of URIs for abstractions
   that are distinguished from the media content that describes them.

   The goal, then, of the "tdb" URI scheme is to provide a mechanism
   which is, at the same time:

   permanent: The identity of the resource identified is not subject
      to reinterpretation over time.

   explicitly bound: The mechanism by which the identified resource
      can be determined is explicitly included in the URI.

   useful for non-networked items: Allows identification of resources
      outside the network: people, organizations, abstract concepts.

   no administration: The mechanism does not depend on reliable
      administrative processes of authorities for either assignment or
      interpretation.

2. Syntax

   A tdb URI takes the form:

      tdb:<timestamp>:<URI>

   Where <timestamp> is a sequence of digits representing a date and
   time (Section 6) and <URI> is any valid URI.

3. Semantics

   The tdb URI scheme is intended to be useful for describing entities,
   concepts, abstractions, and other items which may not themselves be
   network accessible resources, but have been at some point described
   by network accessible resources.

   The meaning of a tdb URI is "the thing described by the resource (or
   fragment) that was identified by the <URI> (after hex decoding) at
   the very last instant of the date(time) given".

   The intent is to use the inversion of "is a document about". It is
   common practice to give a reference for a concept by including a
   pointer to a document, segment, or phrase that defines the concept.
   "tdb" attempts to capture this practice in URI space.

   For example, one might use "tdb:2008:http://www.ietf.org" as a
   persistent identifier for the Internet Engineering Task Force, as
   described by "http://www.ietf.org" as of the very last instant of
   the year 2008.

   The "tdb" scheme differs from the URN methods for identifying
   abstractions because the designation of what is actually identified
   by the tdb URI doesn't depend on knowing the intention of the
   "assigner" of the identifier. Unlike "tag", "info", "cid", "mid" or
   related schemes, the identification is not dependent on the context
   of use.

   The "tdb" scheme can be thought of as adding a level of semantic
   indirection to URI resolution.

4. Use as a Locator

   A tdb URI is not a resource locator in a practical sense. It allows
   one to know that a resource was described at some point in time, but
   whether the description is still available, or whether that
   description is still meaningful, is uncertain.

5. Hierarchy

   The "thing described by" a network resource may bear little
   relationship to the "thing described by" a relative pointer, so the
   "tdb" URI scheme seems to have no use cases for using "/" as a
   hierarchical delimiter.

6. Timestamps in tdb URIs

   It is traditional in conventional references and citations in printed
   works to include the date of publication; this practice serves the
   important purpose that the context of the naming can be determined.

   While one could imagine using tdb without a timestamp, it would leave
   the possibility that a reference that is unambiguous at one time
   might become ambiguous at some other time. There are two ways that
   the date is useful for "tdb": it fixes the time of access of the
   resource, for variable descriptions, and it fixes the time of
   interpretation, for descriptions whose meaning (in natural language)
   might vary.

   A timestamp SHOULD be supplied, since the network resources which
   provide descriptions can also change over time. The timestamp is
   allowed to be quite broad -- only a year -- or with as much precision
   as needed. This keeps "tdb" URIs relatively short. To avoid
   ambiguity, a single instant has been chosen -- for tdb this is "the
   last possible instant of the indicated range".

   A timestamp in the tdb scheme is a simple expression of date and
   optional time, with arbitrary precision. The goal is to allow
   relatively short expressions with no ambiguity, but also with
   arbitrary precision. (Other date formats were considered, but they
   did not offer arbitrary precision with the syntactic simplicity of
   using only digits and no time zones.)
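
   The "last possible instant" rule can be illustrated with a short
   Python sketch. It is not part of the scheme definition: the function
   name range_end and the regular expression are hypothetical, and the
   digit layout is taken from the grammar in this section (a 4-digit
   year followed by optional 2-digit month, day, hour, minute, and
   second, then optional fraction digits). The sketch assumes that a
   year is present, keeps at most microsecond precision, and uses naive
   datetime values with no time zone or leap-second handling. It returns
   only the exclusive upper bound of the range; the instant denoted is
   the one just before that bound, and any ordering among timestamps of
   different resolution is not modeled.

```python
import re
from datetime import datetime, timedelta

# Assumed layout: YYYY[MM[DD[hh[mm[ss[fraction]]]]]], digits only.
_TIMESTAMP = re.compile(
    r"(\d{4})(?:(\d{2})(?:(\d{2})(?:(\d{2})(?:(\d{2})(?:(\d{2})(\d*))?)?)?)?)?"
)


def range_end(timestamp: str) -> datetime:
    """Return the exclusive end of the range named by a tdb timestamp.

    The instant a tdb URI refers to is the one just before this value.
    """
    m = _TIMESTAMP.fullmatch(timestamp)
    if m is None:
        raise ValueError(f"not a digits-only timestamp: {timestamp!r}")
    year, month, day, hour, minute, second, fraction = m.groups()

    if month is None:
        # Only a year was given: the range ends when the next year starts.
        return datetime(int(year) + 1, 1, 1)
    if day is None:
        # Year and month: the range ends when the next month starts.
        y, mo = int(year), int(month)
        return datetime(y + 1, 1, 1) if mo == 12 else datetime(y, mo + 1, 1)

    # Missing finer fields default to zero to get the start of the range.
    start = datetime(int(year), int(month), int(day),
                     int(hour or 0), int(minute or 0), int(second or 0))
    if hour is None:
        return start + timedelta(days=1)
    if minute is None:
        return start + timedelta(hours=1)
    if second is None:
        return start + timedelta(minutes=1)
    if not fraction:
        return start + timedelta(seconds=1)
    if len(fraction) > 6:
        raise ValueError("this sketch keeps at most microsecond precision")
    unit = 10 ** (6 - len(fraction))  # microseconds per fraction step
    return start + timedelta(microseconds=(int(fraction) + 1) * unit)
```

   For example, range_end("2008") yields the first instant of 2009, so
   "tdb:2008:http://www.ietf.org" refers to the description as it stood
   just before that instant.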

   date = [ year [ month [ day [ hour [ minute [ second [ fraction ]]]]]]]

   year     = 4digit
   month    = 2digit
   day      = 2digit
   hour     = 2digit
   minute   = 2digit
   second   = 2digit
   fraction = *digit

   The representation of a date or time refers to the (open interval)
   instant just before the end of the given date/time range at the
   resolution supplied. 199912 is "just before" 1999, but 19991231 falls
   between them. If necessary, timestamps can include times and even
   fractional times, so that a generator of tdb URIs can be arbitrarily
   precise.

   Timestamps are interpreted relative to International Atomic Time
   (TAI) [TAI]. The syntax and semantics are similar to those in
   [RFC2550]; in particular, using TAI avoids ambiguity about time zones
   and difficulties with leap seconds.

   There are actually two dates to consider with "tdb". There is the
   date that the resource is obtained, and there is the date that the
   description it makes is read, understood, and used to denote.
   Normally in a literary work in natural language which makes a
   reference to another work, both the reference itself and the work
   referenced are dated, e.g., a footnote in an article written in 1967
   might talk about a "private communication" which itself had a date.
   The difference between a URI and a conventional literary reference is
   the desire to be able to extract the URI from its context and still
   retain its meaning.

7. Additional Considerations

7.1. URI schemes for the description resource

   The "tdb" scheme is intended for use with resources which have
   retrievable resources that describe something else -- these
   "description resources" are intended as "information resources".

   For example, a tdb URI built on a "http" URI can be used to refer to
   the subject of a web page (as it was described at the given time).
   This can be a way of referring to a web site at some time in the
   past, or an organization that has changed, merged, split, or
   disappeared.

   Local systems that have known-to-be unique host names can use "file"
   URIs with "tdb", for example,

      tdb:20010814142327:file://this.example.com/c|/temp/test.txt

   since this use is primarily focused on providing a unique way of
   identifying an abstraction, even if the referent of the abstraction
   is not widely known. (Using "file" URIs in this way without a fully
   qualified domain name would not be appropriate, because the
   interpretation is not uniform.)

   One might consider using "tdb" with "data" to designate concepts that
   can be described uniquely and briefly inline. For example,

      tdb:2001:data:,The%20US%20president

   names the concept described by the (text/plain) string "The US
   president" at the very last instant of 2001. Of course, this
   practice is only useful if the referent of the data is (or was at the
   time) completely unique. Since "data" does not contain a way to
   designate content-language, the string in question would have to be
   unambiguous as to its language. In the case of "data", there is no
   assigning authority at all; the interpretation of the tdb URI depends
   on the interpreting community.

   Many URIs identify resources which do not clearly describe anything
   at all. The "home page" for an organization isn't nearly as good a
   resource to use to describe an organization as the organization's
   "about" page. But it is up to the minter of the tdb URI to choose
   wisely.

7.2. Useful timestamps

   Timestamps far in the future are suspect, because the future content
   of a description resource cannot usually be reliably predicted.
   Timestamps which precede the availability of the description resource
   should not be used either.
   For example, using a http URI with a
   timestamp before the description resource was available is also not
   recommended.

   However, although these practices are not recommended, there is no
   assurance that they haven't been used; by itself, a tdb URI does not
   constitute an assertion that the description resource was available
   or assigned at the date specified.

   Note that the use of the "very last instant" allows for the
   bibliographic convention that a work published in 2009 can use "2009"
   as the date string, to refer to the work in the year of publication.

7.3. Free assignment

   Because of the many possible schemes that can be used in the <URI>
   portion, there should be no difficulty in almost any computational
   process being able to assign tdb URIs at will. Of course, it is
   necessary for there to be some resource which is available at some
   point in time, and to have a clock which is accurate to the
   granularity of the frequency of assignment.

7.4. Resolution

   There are no resolution servers or processes for tdb URIs. However,
   a tdb URI might be "resolvable" in the sense that a resource that was
   accessed at a point in time might have the result of that access
   cached or archived in an Internet archive service. See, for example,
   the "Internet Archive" project [archive]. And the "tdb" is
   "resolvable" in the sense that the description resource can be
   accessed and interpreted.

7.5. Why Names with Semantics?

   There are a number of URI and URN schemes that create otherwise
   unbound "names", where the scheme only provides for uniqueness, with
   some other agent or process or context providing the authority to
   interpret the meaning of the identifier at some point in the future.
   "tdb" is different, in that the describer (the agent creating the tdb
   URI) and the receiver of the URI (the agent interpreting the tdb URI)
   agree upon the semantics without any reference to any third party.

7.6. Avoiding MetaData

   One might consider the date in a tdb URI to be just one piece of
   additional metadata about the URI, and consider adding other pieces
   of metadata as annotation.

   However, the use of the date in a tdb URI is intended primarily as a
   mechanism for accomplishing uniqueness over time. No other bit of
   metadata or description readily fills that purpose. Further, the
   date is not descriptive (an assertion about the URI) but merely
   refining.

7.7. Avoiding tdb

   Many applications of URIs already provide a context of timestamp.
   For example, one could imagine a hypertext system where the URIs
   contained within a document were intended to refer to the resources
   as of the date of the enclosing document. This would be a reasonable
   interpretation of URIs within an Internet archive system, for
   example.

   And some applications of URIs arguably already contain the level of
   interpretive indirection that is explicit with "tdb". For example,
   one might consider the use of URIs as namespace names within XML
   [namespaces] as a reference to the "thing described by" the URI used.

7.8. tdb and levels of indirection

   The "tdb" scheme introduces a level of semantic indirection. The
   puzzles and confusions about use and mention, name and reference, and
   levels of indirection have puzzled and amused people for quite a
   while.

      "It's long," said the Knight, "but it's very, very beautiful.
      Everybody that hears me sing it--either it brings tears into their
      eyes, or else--"
      "Or else what?" said Alice, for the Knight had made a sudden
      pause.
      "Or else it doesn't, you know.
      The name of the song is called
      'Haddock's Eyes.'"
      "Oh, that's the name of the song, is it?" Alice said, trying to
      feel interested.
      "No, you don't understand," the Knight said, looking a little
      vexed. "That's what the name is called. The name really is 'The
      Aged Aged Man.'"
      "Then I ought to have said 'That's what the song is called'?"
      Alice corrected herself.
      "No, you oughtn't: that's quite another thing! The song is called
      'Ways and Means': but that's only what it's called, you know!"
      "Well, what is the song, then?" said Alice, who was by this time
      completely bewildered.
      "I was coming to that," the Knight said. "The song really is
      'A-sitting On A Gate': and the tune's my own invention." [LOOK]

8. URI Specification Template

   URI scheme name: tdb

   Status: permanent

   URI scheme syntax: Briefly, the syntax is
         tdb:<timestamp>:<URI>
      The syntax is described in this document.

   URI scheme semantics: Semantic indirection at indicated date.
      Semantics are described in detail in this document.

   Encoding considerations: tdb URIs consist of a prefix followed by
      another URI, and should have the same encoding considerations as
      others.

   Applications/protocols that use this URI scheme name: This scheme
      was designed to resolve some of the use/mention ambiguities in
      semantic web applications that wish to "denote" concepts and other
      ideas and not just access resources over the Internet.

   Interoperability considerations: Existing semantic web applications
      may have other means of fixing meaning at a particular time or of
      semantic indirection, but this should not in itself cause
      interoperability difficulties.

   Security considerations: See Section 10 of this document.

   Contact: Larry Masinter tdb:2009:http://larry.masinter.net

   Author/Change controller: as above

   References: See References of this document.

9. IANA considerations

   This document includes a URI scheme registration (Section 8) that
   should be entered into the IANA registry of URI schemes as a
   permanent registration (once approved).

10. Security Considerations

   "tdb" identifiers are not any more reliable because they have dates.
   URIs don't contain enough information to supply the authority for
   deciding what was or wasn't at a given URI at a given date.

11. Acknowledgements

   There have been many discussions over several years on the
   relationship of URLs, URNs, URIs, resources and resource identifiers,
   with many contributions. Particular thanks to Al Gilman, Aaron
   Swartz, Brian McBride, Stuart Williams, Michael Mealling, Ray
   Denenberg and Pat Hayes.

12. References

12.1. Normative References

   [RFC3986]  Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
              Resource Identifier (URI): Generic Syntax", RFC 3986,
              January 2005.

   [TAI]      Bureau International des Poids et Mesures, "International
              Atomic Time".

   [namespaces]
              Bray, T., Hollander, D., and A. Layman, "Namespaces in
              XML", W3C Recommendation REC-xml-names, January 1999.

12.2. Informative References

   [COOL]     Berners-Lee, T., "Cool URIs don't change", 1998.

   [LOOK]     Carroll, L., "Through the Looking Glass", 1872.

   [RFC1737]  Sollins, K., "Functional Requirements for Uniform Resource
              Names", RFC 1737, December 1994.

   [RFC2550]  Glassman, S., Manasse, M., and J. Mogul, "Y10K and
              Beyond", RFC 2550, April 1 1999.

   [archive]  Kahle, B., "Preserving the Internet", Scientific
              American, March 1997.

Author's Address

   Larry Masinter
   Adobe
   345 Park Ave
   San Jose, CA 95110
   US

   Phone: +1 408 536 3024
   Email: LMM@acm.org
   URI:   http://larry.masinter.net