idnits 2.17.1 draft-ietf-find-cip-soif-02.txt: ** The Abstract section seems to be numbered Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** Cannot find the required boilerplate sections (Copyright, IPR, etc.) in this document. Expected boilerplate is as follows today (2024-03-29) according to https://trustee.ietf.org/license-info : IETF Trust Legal Provisions of 28-dec-2009, Section 6.a: This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79. IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 2: Copyright (c) 2024 IETF Trust and the persons identified as the document authors. All rights reserved. IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 3: This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- ** Missing expiration date. The document expiration date should appear on the first and last page. ** The document seems to lack a 1id_guidelines paragraph about Internet-Drafts being working documents. ** The document seems to lack a 1id_guidelines paragraph about 6 months document validity -- however, there's a paragraph with a matching beginning. Boilerplate error? 
** The document seems to lack a 1id_guidelines paragraph about the list of current Internet-Drafts. ** The document seems to lack a 1id_guidelines paragraph about the list of Shadow Directories. == No 'Intended status' indicated for this document; assuming Proposed Standard == The page length should not exceed 58 lines per page, but there was 1 longer page, the longest (page 1) being 763 lines Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack an Introduction section. ** The document seems to lack an IANA Considerations section. (See Section 2.2 of https://www.ietf.org/id-info/checklist for how to handle the case when there are no actions for IANA.) ** The document seems to lack separate sections for Informative/Normative References. All references will be assumed normative when checking for downward references. ** There are 13 instances of too long lines in the document, the longest one being 11 characters in excess of 72. ** There are 37 instances of lines with control characters in the document. ** The abstract seems to contain references ([Attribute-Identifier]), which it shouldn't. Please replace those with straight textual mentions of the documents in question. ** The document seems to lack a both a reference to RFC 2119 and the recommended RFC 2119 boilerplate, even if it appears to use RFC 2119 keywords. RFC 2119 keyword, line 527: '... REQUIRED field, no default....' Miscellaneous warnings: ---------------------------------------------------------------------------- -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. 
(See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- Couldn't find a document date in the document -- date freshness check skipped. Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) -- Looks like a reference, but probably isn't: 'TBD' on line 255 == Unused Reference: '1' is defined on line 347, but no explicit reference was found in the text == Unused Reference: '2' is defined on line 351, but no explicit reference was found in the text == Unused Reference: '4' is defined on line 358, but no explicit reference was found in the text == Unused Reference: '5' is defined on line 361, but no explicit reference was found in the text == Unused Reference: '6' is defined on line 365, but no explicit reference was found in the text == Unused Reference: '7' is defined on line 368, but no explicit reference was found in the text == Unused Reference: '8' is defined on line 373, but no explicit reference was found in the text == Unused Reference: '9' is defined on line 377, but no explicit reference was found in the text -- Possible downref: Non-RFC (?) normative reference: ref. '1' -- Possible downref: Non-RFC (?) normative reference: ref. '2' -- Possible downref: Non-RFC (?) normative reference: ref. '3' -- Possible downref: Non-RFC (?) normative reference: ref. '4' ** Obsolete normative reference: RFC 1738 (ref. '5') (Obsoleted by RFC 4248, RFC 4266) -- Possible downref: Non-RFC (?) normative reference: ref. '6' -- Possible downref: Non-RFC (?) normative reference: ref. '7' -- Possible downref: Non-RFC (?) normative reference: ref. '8' -- Possible downref: Non-RFC (?) normative reference: ref. '9' Summary: 15 errors (**), 0 flaws (~~), 10 warnings (==), 11 comments (--). 
Run idnits with the --verbose option for more detailed information about the items above.

--------------------------------------------------------------------------------

1 INTERNET-DRAFT Edward Hardie
2 Expires: April, 1998 NASA NIC
3 Mic Bowman
4 Transarc
5 Darren Hardy
6 Netscape
7 Mike Schwartz
8 @Home
9 Duane Wessels
10 NLANR
11 January, 1997

13 CIP Index Object Format for SOIF Objects

15 1. Status of this Memo

17 This document is an Internet-Draft. Internet-Drafts are working
18 documents of the Internet Engineering Task Force (IETF), its areas, and
19 its working groups. Note that other groups may also distribute working
20 documents as Internet-Drafts.

22 Internet-Drafts are draft documents valid for a maximum of six months
23 and may be updated, replaced, or obsoleted by other documents at any
24 time. It is inappropriate to use Internet-Drafts as reference material
25 or to cite them other than as ``work in progress.''

27 To learn the current status of any Internet-Draft, please check the
28 "1id-abstracts.txt" listing contained in the Internet-Drafts Shadow
29 Directories on ds.internic.net (US East Coast), nic.nordu.net (Europe),
30 ftp.isi.edu (US West Coast), or munnari.oz.au (Pacific Rim).

32 Distribution of this memo is unlimited. Please send comments to the
33 authors.

35 2. Abstract

37 The Common Indexing Protocol (CIP) allows servers to form a referral
38 mesh for query handling by defining a mechanism by which cooperating
39 servers exchange hints about the searchable indices they maintain. The
40 structure and transport of CIP are described in (Ref. 1), as are general
41 rules for the definition of index object types. This document describes
42 SOIF, the Summary Object Interchange Format, as an index object type in
43 the context of the CIP framework. SOIF is a machine-readable syntax for
44 transmitting structured summary objects, currently used primarily in the
45 context of the World Wide Web.
47 Query referral has often been dismissed as an ineffective strategy for
48 handling searches of Web resources, and Web resources certainly present
49 challenges not present in structured directory services like Rwhois. In
50 situations where a keyword-based free text search is desired, query
51 referral is not likely to be effective because the query will probably
52 be routed to every server participating in the referral mesh. Where a
53 search can be limited by reference to a specific resource attribute,
54 however, query referral is an effective tool. SOIF can be used to
55 create such a known-attribute query mesh because it provides a method
56 for associating attributes with net-addressable resources.

58 Mic Bowman, Darren Hardy, Mike Schwartz, and Duane Wessels each
59 contributed to the creation of the SOIF format and to the descriptions
60 from which this draft is drawn; errors in this description of their work
61 are the responsibility of Edward Hardie and corrections should be
62 directed accordingly.

64 2.1 History

66 SOIF was first defined by the Harvest project [Ref 2.] in January 1994.
67 SOIF was derived from a combination of the Internet Anonymous FTP
68 Archives IETF Working Group (IAFA) templates [Ref 3.] and the BibTeX
69 bibliography format [Ref 4.]. The combination was originally noted for
70 its advantages of providing a convenient and intuitive way for
71 delimiting objects within a stream, and setting apart the URL for easy
72 object access or invocation, while still preserving compatibility with
73 IAFA templates.

75 3. Name

77 The index object described below will have the MIME type of
78 application/index.obj.HARVEST-SOIF-1 .

80 4. Payload Format

82 Each summary object has 3 fundamental components: a template type, a
83 URL, and zero or more ATTRIBUTE-VALUE pairs. Because the VALUEs in
84 the ATTRIBUTE-VALUE pairs may contain arbitrary data (cf.
Section
85 4.5), SOIF objects should be encoded in Base64 unless the template
86 type unambiguously establishes that the VALUEs do not contain binary
87 data.

89 4.1 Template Type.

91 The Template type is used to identify the set of ATTRIBUTEs contained
92 within a particular SOIF object. SOIF does not define the template
93 types themselves; it only provides a way to associate the summary
94 object with a predefined template type name. Template types may be
95 registered or unregistered. Unregistered template types provide an
96 indication of available ATTRIBUTE-VALUE pairs, but these may vary both
97 according to the original resource and the method by which the summary
98 object was generated. Registered template types must refer to a
99 formally specified description of all mandatory and optional
100 ATTRIBUTE-VALUE pairs available for that type. See [TBD] for a
101 description of the process of registering template types with the
102 IANA.

104 Historically, the template types used by SOIF were derived from IAFA
105 template types (Ref. 3). SOIF objects generated by the Harvest system
106 have a "FILE" template type; in current practice this is the most
107 common template type. The "FILE" template type is a generic template
108 type meant to handle a large variety of web-based resources. No
109 formal specification of it is available, though a list of
110 ATTRIBUTE-VALUE pairs common to the "FILE" template type is found in
111 Appendix A. "DOCUMENT" and "OBJECT" are other generic template-types.

113 The use of unregistered template types obviously presents some
114 problems to the correct operation of query referral. Two efforts have
115 been mounted to allow peer-to-peer agreement on the association of
116 template types with specific attribute sets: Netscape's RDM (Ref. 6)
117 and the STARTS project (Ref. 7).
Initially, CIP meshes based on
118 systems which use unregistered template types may need to
119 use these or similar methods to associate template types with specific
120 attribute sets.

122 Mesh operators are strongly encouraged, however, to migrate to
123 registered template types as soon as is practical. Registered
124 template types allow CIP meshes to derive the definitions of
125 attributes, which enables multiple-language interfaces to the base
126 attributes. In addition, registered template types allow CIP meshes
127 and other users of SOIF to establish the permitted data types and
128 encodings of the VALUEs associated with each ATTRIBUTE. This makes
129 deriving the appropriate matching semantics for a particular VALUE
130 much more straightforward and eliminates the limitations of the
131 default octet-by-octet matching (cf. Section 5.).

133 4.2 URL

135 Uniform Resource Locators (URLs) (Ref 5.) are used by SOIF as object
136 IDENTIFIERs. SOIF associates its summary objects with net-addressable
137 resources by using the URL by which the resource was addressed as the
138 initial field of the object body. See section 4.4 for the formal
139 grammar associated with SOIF objects.

141 This association allows the same resource to have multiple summary
142 objects, differentiated only by the URL by which the resource was
143 accessed. This possibility does not, however, impact the usability of
144 the URL as an object IDENTIFIER. Furthermore, since it can be argued
145 that the net address is a salient part of the metadata, there may be
146 compensating benefits to using the URL as an object IDENTIFIER.

148 As noted in Appendix A, the Harvest project used several additional
149 identity attributes ("Gatherer-Name", "Gatherer-Host", "Gatherer-Port"
150 and "Gatherer-Version") to further identify the provenance of a
151 particular object.
Within the context of CIP, it may be useful to
152 identify the base sources of particular index objects; see Appendix B
153 for one example of how a SOIF-based CIP hint could use the base source
154 URL.

156 4.3 ATTRIBUTE-VALUE pairs.

158 Each summary object has zero or more ATTRIBUTE-VALUE pairs, which
159 contain metadata about the net-addressable resource referenced by the
160 URL. Pairs are composed of an ATTRIBUTE IDENTIFIER, the length of the
161 VALUE, a delimiter, and the VALUE. It should be stressed that
162 ATTRIBUTE-VALUE pairs are not CR/LF terminated, but parsed according
163 to the grammar set out in section 4.4. In the examples in Section 7 and
164 in many other representations of SOIF objects, ATTRIBUTE-VALUE pairs
165 are represented on individual lines to enhance readability. VALUEs may
166 contain CR/LF, however, and implementors must be careful to parse the
167 full VALUE. Implementors of SOIF parsers should ignore
168 any whitespace found between the VALUE
169 of an ATTRIBUTE-VALUE pair and the ATTRIBUTE IDENTIFIER of the
170 subsequent pair.

172 The SOIF syntax does not explicitly allow for a single ATTRIBUTE to have
173 multiple VALUEs. To handle multiple VALUEs for the same ATTRIBUTE, SOIF
174 uses an ATTRIBUTE naming convention; a hyphen and positive integer are
175 appended to the ATTRIBUTE name to create a distinct ATTRIBUTE IDENTIFIER
176 for each VALUE associated with a specific ATTRIBUTE. For example, the ATTRIBUTE
177 IDENTIFIERs "Author-1", "Author-2", and "Author-3" can be used to
178 represent three VALUEs associated with the ATTRIBUTE "Author" where a
179 specific resource has three authors. See section 5 for the implications
180 of this strategy on matching semantics.
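The pair layout described in this section can be illustrated with a short parser. The following is a minimal Python sketch, not a reference implementation. It assumes the colon-plus-tab DELIMITER given in Section 4.5, reads exactly VALUE-SIZE octets for each VALUE (so embedded CR/LF is preserved), and skips whitespace between pairs. The function names (parse_soif_object, base_attribute, attribute_matches, group_multivalued) are hypothetical and do not come from the SOIF or Harvest specifications.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical helper patterns; the names are illustrative only.
_HEADER = re.compile(rb"@([A-Za-z0-9_-]+)[ \t]*\{[ \t]*([^\s{}]+)")
_ATTR = re.compile(rb"\s*([A-Za-z0-9_-]+)\{(\d+)\}:\t")
_SUFFIX = re.compile(r"-[1-9][0-9]*$")


def parse_soif_object(data: bytes) -> Tuple[str, str, List[Tuple[str, bytes]]]:
    """Parse one SOIF object into (template_type, url, [(identifier, value)])."""
    header = _HEADER.match(data)
    if header is None:
        raise ValueError("missing '@TEMPLATE-TYPE { URL' header")
    template_type = header.group(1).decode("ascii")
    url = header.group(2).decode("ascii")
    pos = header.end()
    pairs: List[Tuple[str, bytes]] = []
    while True:
        # Whitespace between a VALUE and the next IDENTIFIER is skipped here.
        attr = _ATTR.match(data, pos)
        if attr is None:
            break
        size = int(attr.group(2))
        start = attr.end()
        # The VALUE is taken by length, never by line ending.
        value = data[start:start + size]
        if len(value) != size:
            raise ValueError("VALUE is shorter than its declared VALUE-SIZE")
        pairs.append((attr.group(1).decode("ascii"), value))
        pos = start + size
    if not data[pos:].lstrip().startswith(b"}"):
        raise ValueError("missing closing brace")
    return template_type, url, pairs


def base_attribute(identifier: str) -> str:
    """Strip a trailing hyphen-integer suffix such as '-2'."""
    return _SUFFIX.sub("", identifier)


def attribute_matches(query: str, identifier: str) -> bool:
    """Case-insensitive match on the part before any hyphen-integer suffix."""
    return base_attribute(identifier).lower() == query.lower()


def group_multivalued(pairs: List[Tuple[str, bytes]]) -> Dict[str, List[bytes]]:
    """Collect 'Author-1', 'Author-2', ... under the single key 'Author'."""
    grouped: Dict[str, List[bytes]] = {}
    for identifier, value in pairs:
        grouped.setdefault(base_attribute(identifier), []).append(value)
    return grouped
```

Because the VALUE is sliced by its declared size, a parser built this way does not need to treat CR/LF inside a VALUE specially. A parser that split the input on line endings would truncate such VALUEs.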
182 4.4 SOIF Grammar

184 The SOIF syntax is defined by the following grammar:

186 SOIF            ::=  OBJECT SOIF |
187                      OBJECT
188 OBJECT          ::=  @ TEMPLATE-TYPE { URL ATTRIBUTE-LIST }
189 TEMPLATE-TYPE   ::=  IDENTIFIER
190 ATTRIBUTE-LIST  ::=  ATTRIBUTE ATTRIBUTE-LIST |
191                      ATTRIBUTE |
192                      NULL
193 ATTRIBUTE       ::=  IDENTIFIER {VALUE-SIZE} DELIMITER VALUE
194 URL             ::=  RFC1738-URL-Syntax | "-"
195 IDENTIFIER      ::=  ALPHA-NUMERIC-STRING
196 VALUE           ::=  ARBITRARY-DATA
197 VALUE-SIZE      ::=  NUMERIC-STRING
198 DELIMITER       ::=  ":"

200 4.5 Grammar Description

202 URL
203     a Uniform Resource Locator encoded in the syntax defined by RFC
204     1738 [5]. If the summary object has no URL associated with it,
205     then a Latin-1 hyphen (octal \055) is used instead.

207 IDENTIFIER
208     an ASCII character string that only contains alphanumeric charac-
209     ters and hyphens or underscores. IDENTIFIERs should avoid including
210     hyphens followed by positive integers except when constructing
211     multiple-VALUE ATTRIBUTE IDENTIFIERs.

213 VALUE
214     a buffer of VALUE-SIZE octets containing the VALUE. The
215     VALUE may contain data in arbitrary formats or encodings, which
216     recipients recognize based on Template-Type.

218 VALUE-SIZE
219     a non-negative integer encoded as an ASCII character string. The
220     integer indicates how many octets the VALUE occupies after the
221     DELIMITER.

223 DELIMITER
224     a two octet delimiter which is a Latin-1 colon (:) and a tab (\t),
225     (octal \072\011).

227 { } the Latin-1 curly braces (octal \173 and \175) are used to wrap the
228     VALUE-SIZE (no spaces) as well as the URL and ATTRIBUTE-LIST combi-
229     nation.

231 @TEMPLATE-TYPE
232     the Latin-1 @ (octal \100) and TEMPLATE-TYPE (no space between
233     them) are used to mark the beginning of the SOIF object.

235 NUMERIC-STRING
236     Zero or more ASCII numerals.

238 ALPHA-NUMERIC-STRING
239     Zero or more ASCII letters or numerals, plus hyphens or underscores.
240     [a-z,A-Z,0-9,- and _].
242 ARBITRARY-DATA
243     Octets of data in arbitrary formats or encodings.

245 5. Matching Semantics

247 As was discussed in Section 2, query referral of SOIF objects will be
248 most effective when a query identifies a particular ATTRIBUTE or set
249 of ATTRIBUTEs as the target of the query match. A query-identified
250 ATTRIBUTE should be considered to match a SOIF ATTRIBUTE when a
251 case-insensitive character-by-character comparison matches that portion
252 of the ATTRIBUTE IDENTIFIER prior to any hyphen-integer suffix. For
253 example, a query which asks for a match on the ATTRIBUTE "author"
254 should match the IDENTIFIERs "author", "Author", "AUTHOR", and
255 "Author-1". [TBD] discourages the registration of template types
256 containing ATTRIBUTEs which have previously been registered with
257 substantially different definitions. This will help eliminate
258 mis-referral, but a CIP mesh may nonetheless need to maintain a
259 thesaurus matching ATTRIBUTEs from particular template-types to those
260 of other, especially unregistered, template-types.

262 The matching semantics appropriate for a particular VALUE are derived
263 from its data type and encoding. For VALUEs associated with
264 ATTRIBUTEs which are part of a registered template type, the data
265 type and encoding are readily available. For VALUEs associated with
266 ATTRIBUTEs associated with unregistered template-types, an
267 octet-by-octet comparison is the default. In cases where previous
268 experience has demonstrated that a particular ATTRIBUTE contains
269 string data, a case-insensitive substring match may be used. For
270 example, in a query against the "AUTHOR" ATTRIBUTE of the generic
271 "DOCUMENT" template type, the query VALUE "Garcia" should match the
272 SOIF VALUEs "Garcia", "GARCIA", and "Jose Garcia y Montes".

274 Over time, there may well emerge an understanding of which attributes
275 tend to produce correct query referrals within a mesh.
As such
276 understandings emerge, mesh maintainers may wish to define a particular
277 SOIF TEMPLATE-TYPE which restricts included ATTRIBUTEs to those likely
278 to foster correct referrals.

280 6. Internationalization

282 The internationalization of SOIF depends on the registration of
283 template-types. Since TEMPLATE-TYPEs and ATTRIBUTE IDENTIFIERs must
284 be in ASCII characters, only languages which use the ASCII character
285 set are fully supported for unregistered TEMPLATE-TYPEs. For
286 registered template types, in contrast, the specification of an
287 ATTRIBUTE's definition will allow UI designers to present a
288 native-language mapping of the ATTRIBUTE to the end user. Further,
289 the inclusion of data type and encoding information in the description
290 of VALUEs means that any language encoding or character set required
291 by a particular application may be supported. For unregistered
292 template types, the ability of peer servers to pass schema definitions
293 may provide a form of "private registration" which could provide some
294 of the facilities for internationalization available to registered
295 template types. (See above, section 4.1 and Refs. 6 and 7.)

297 7. Example Summary Objects

299 The appendices contain example summary objects encoded using specific
300 template types. The following are some example summary objects using
301 the generic "DOCUMENT" SOIF template-type:

303 @DOCUMENT { http://home.netscape.com:80/
304 Title{19}: Welcome to Netscape
305 Content-Type{9}: text/html
306 Content-Length{5}: 33262
307 }

309 @DOCUMENT { http://home.netscape.com/eng/ssl3/ssl-toc.html
310 Title{19}: SSL Protocol V. 3.0
311 Content-Type{9}: text/html
312 Content-Length{4}: 5870
313 Author-1{14}: Alan O. Freier
314 Author-2{14}: Philip Karlton
315 Author-3{14}: Paul C.
Kocher
316 Abstract{318}: This document specifies Version 3.0 of the Secure
317 Sockets Layer (SSL V3.0) protocol, a security protocol that
318 provides communications privacy over the Internet. The protocol allows
319 client/server applications to communicate in a way that is designed
320 to prevent eavesdropping, tampering, or message forgery.
321 }

323 @DOCUMENT { http://www.nissanmotors.com/1996/300ZX/pictures/300zx.jpg
324 Content-Type{10}: image/jpeg
325 Content-Length{5}: 25940
326 Last-Modified{31}: Tuesday, 11-Jun-96 19:18:44 GMT
327 Thumbnail{259}: ..................
328 }

330 8. Security

332 Please see (Ref. 1) for a general discussion of Security concerns for
333 the CIP framework.

335 SOIF currently contains no requirement that any template type contain an
336 authentication ATTRIBUTE. SOIF summary objects lacking authentication
337 ATTRIBUTEs must, therefore, be treated as unreliable indicators of the
338 referenced resource's content. A hostile party could create a summary
339 object which significantly misrepresented a resource's content. As part
340 of a CIP mesh, this data could either channel a large number of
341 requestors to a resource (possibly resulting in a denial of service) or
342 away from a resource (possibly resulting in a loss of appropriate
343 visibility).

345 9. References

347 [1] The Common Indexing Protocol:
348

351 [2] The Harvest Information Discovery and Access System:
352 .

354 [3] D. Beckett, IAFA Templates in Use as Internet Metadata, 4th Int'l
355 WWW Conference, December 1995,
356

358 [4] L. Lamport, LaTeX: A Document Preparation System, Addison-Wesley,
359 Reading, Mass., 1986.

361 [5] T. Berners-Lee, L. Masinter, and M. McCahill, Uniform Resource
362 Locators (URL), RFC 1738, December 1994,
363

365 [6] D. Hardy, Resource Description Messages (RDM), W3C Note-rdm-960724,
366 July 24, 1996,

368 [7] L. Gravano, K. Chang, H. Garcia-Molina, C. Lagoze, A.
Paepcke,
369 STARTS: Stanford Protocol Proposal for Internet Retrieval and
370 Search, January 1997,
371

373 [8] S. Weibel, J. Kunze, C. Lagoze, Dublin Core Metadata for Simple
374 Resource Description, February 1997,
375

377 [9] E. Miller, Dublin Core Element Set Crosswalk, January 1997,
378

380 10. Authors' Addresses

382 Edward Hardie
383 NASA Network Information Center
384 MS 204-14
385 Moffett Field, CA 94035-1000 USA
386 +1 415 604 0134
387 hardie@nasa.gov

389 Mic Bowman
390 Transarc Corporation
391 The Gulf Tower
392 707 Grant Street
393 Pittsburgh, PA 15219 USA
394 +1 412 338 4400
395 mic@transarc.com

397 Darren Hardy
398 Netscape Communications Corp.
399 685 E. Middlefield Road
400 Mountain View, CA 94043 USA
401 +1 415 937 2555
402 dhardy@netscape.com

404 Mike Schwartz
405 @Home Network
406 385 Ravendale Drive
407 Mountain View, CA 94043 USA
408 +1 415 944 7200
409 schwartz@home.net

411 Duane Wessels
412 National Laboratory for Applied Network Research
413 +1 303 497 1822
414 wessels@nlanr.net

416 Appendix A.

418 Common Attributes for "FILE" Template-type Summary Objects
419 created by Harvest:

421 Abstract
422     Brief abstract about the object.

424 Author
425     Author(s) of the object.

427 Description
428     Brief description about the object.

430 File-Size
431     Number of bytes in the object.

433 Full-Text
434     Entire contents of the object.

436 Gatherer-Host
437     Host on which the Gatherer ran to extract information from the
438     object.

440 Gatherer-Name
441     Name of the Gatherer that extracted information from the
442     object (e.g., Full-Text, Selected-Text, or Terse).

444 Gatherer-Port
445     Port number on the Gatherer-Host that serves the Gatherer's
446     information.

448 Gatherer-Version
449     Version number of the Gatherer.

451 Update-Time
452     The time that Gatherer updated the content summary for the object.

454 Keywords
455     Searchable keywords extracted from the object.

457 Last-Modification-Time
458     The time that the object was last modified.
460 MD5
461     MD5 16-byte checksum of the object.

463 Refresh-Rate
464     The number of seconds after Update-Time when the summary object is
465     to be re-generated. Defaults to 1 month.

467 Time-to-Live
468     The number of seconds after Update-Time when the summary object is
469     no longer valid. Defaults to 6 months.

471 Title
472     Title of the object.

474 Type
475     The object's type. Some example types are:

477         Archive
478         Audio
479         Awk
480         Backup
481         Binary
482         C
483         CHeader
484         Command
485         Compressed
486         CompressedTar
487         Configuration
488         Data
489         Directory
490         DotFile
491         Dvi
492         FAQ
493         FYI
494         Font
495         FormattedText
496         GDBM
497         GNUCompressed
498         GNUCompressedTar
499         HTML
500         Image
501         Internet-Draft
502         MacCompressed
503         Mail
504         Makefile
505         ManPage
506         Object
507         OtherCode
508         PCCompressed
509         Patch
510         Perl
511         PostScript
512         RCS
513         README
514         RFC
515         SCCS
516         ShellArchive
517         Tar
518         Tcl
519         Tex
520         Text
521         Troff
522         Uuencoded
523         WaisSource

525 Update-Time
526     The time that the summary object was last updated.
527     REQUIRED field, no default.

529 URL-References
530     Any URL references present within HTML objects.

532 Appendix B.

534 Proposed Attributes for a "CIP-HINT" Template Type

536 Attribute-Identifier-List
537     A comma-delimited list whose entries take the form
538     Template-Type:Attribute . This list identifies the
539     attributes against which queries are supported. Because
540     of the current limitation on Identifiers, this list
541     must be in ASCII.

543 Source
544     The URI of the service which created some or all of the
545     index objects to which this hint applies. Note that this
546     service may be and often is distinct from the server which
547     provides query access to those objects.

549 Total-Object-Count
550     The total number of index objects in the collection for
551     which the Hint applies. This should be a positive integer.
553 Weightlist-[Attribute-Identifier]
554     This construction allows the HINT to contain a weighted
555     list of values for a specific Attribute-Identifier. There
556     may be as many Weightlist entries as there are Attribute-Identifiers
557     in the Attribute-Identifier-List. Each Weightlist entry takes
558     the form of Value;Object-Count, where the object count is
559     a positive integer representing the number of objects within
560     the collection which contain that value. Weightlists are comma-
561     delimited. Should a Value contain a comma, it should be escaped
562     when incorporated into the weightlist.

564 Threshold-[Attribute-Identifier]
565     If a server wishes not to report infrequently occurring Values in
566     a specific Weightlist, it may declare a threshold under which it
567     will not report Values.

569 Certification-Type
570     The type of Certification used for this object.

572 Certification
573     The Value of the Certification.

575 Date
576     The Date at which the hint was generated.

578 Example:

580 @CIP-HINT{ http://nic.nasa.gov:80/Harvest/brokers/NASA/
581 Attribute-Identifier-list{49}: DOCUMENT:Author, DOCUMENT:Keywords, IMAGE:Subject
582 Source-1{45}: http://nic.nasa.gov/Harvest/gatherers/Eureka/
583 Source-2{46}: http://techreports.larc.nasa.gov/cgi-bin/NTRS/
584 Total-Object-Count{5}: 10000
585 Weightlist-[IMAGE:Subject]{40}: Shuttle;100, Planet;227, Moon;15, Sun;33
586 Threshold-[IMAGE:Subject]{2}: 10
587 Weightlist-[DOCUMENT:Author]{49}: Grizzard;12, Aldrin\, Buzz;15, Aldrin\, James;45,
588 Threshold-[DOCUMENT:Author]{1}: 5
589 Certification-Type{13}: PGP-Signature
590 Certification{51}: mQCNAzFNm5QAAEEALUBOolOWKpby+=YtmtBxUZWQgSGFyZGllID
591 Date{29}: Sun, 05 Jan 1997 08:33:33 GMT
592 }

594 Appendix C.

596 A "Dublin-Core" Template Type [Ref. 8,9]

598 TITLE
599     The name given to the resource by the CREATOR or PUBLISHER.

601 CREATOR
602     The person(s) or organization(s) primarily responsible for the
603     intellectual content of the resource.
    For example, authors in the
604     case of written documents, artists, photographers, or illustrators
605     in the case of visual resources.

607 SUBJECT
608     The topic of the resource, or keywords or phrases that describe
609     the subject or content of the resource. The intent of the
610     specification of this element is to promote the use of controlled
611     vocabularies and keywords. This element might well include
612     scheme-qualified classification data (for example, Library of
613     Congress Classification Numbers or Dewey Decimal numbers) or
614     scheme-qualified controlled vocabularies (such as Medical Subject
615     Headings or Art and Architecture Thesaurus descriptors) as well.

617 DESCRIPTION
618     A textual description of the content of the resource, including
619     abstracts in the case of document-like objects or content
620     descriptions in the case of visual resources. Future metadata
621     collections might well include computational content description
622     (spectral analysis of a visual resource, for example) that may not
623     be embeddable in current network systems. In such a case this
624     field might contain a link to such a description rather than the
625     description itself.

627 PUBLISHER
628     The entity responsible for making the resource available in its
629     present form, such as a publisher, a university department, or a
630     corporate entity. The intent of specifying this field is to
631     identify the entity that provides access to the resource.

633 CONTRIBUTOR
634     Person(s) or organization(s) in addition to those specified in the
635     CREATOR element who have made significant intellectual contributions
636     to the resource but whose contribution is secondary to the
637     individuals or entities specified in the CREATOR element (for
638     example, editors, transcribers, illustrators, and convenors).

640 DATE
641     The date the resource was made available in its present form.
    The
642     recommended best practice is an 8 digit number in the form YYYYMMDD
643     as defined by ANSI X3.30-1985. In this scheme, the date element for
644     the day this is written would be 19961203, or December 3, 1996.
645     Many other schemas are possible, but if used, they should be
646     identified in an unambiguous manner.

648 TYPE
649     The category of the resource, such as home page, novel, poem, working
650     paper, technical report, essay, dictionary. It is expected that
651     RESOURCE TYPE will be chosen from an enumerated list of types.

653 FORMAT
654     The data representation of the resource, such as text/html, ASCII,
655     Postscript file, executable application, or JPEG image. The intent
656     of specifying this element is to provide information necessary to
657     allow people or machines to make decisions about the usability of
658     the encoded data (what hardware and software might be required to
659     display or execute it, for example). As with RESOURCE TYPE, FORMAT
660     will be assigned from enumerated lists such as registered Internet
661     Media Types (MIME types). In principle, formats can include
662     physical media such as books, serials, or other non-electronic media.

664 IDENTIFIER
665     String or number used to uniquely identify the resource. Examples
666     for networked resources include URLs and URNs (when implemented).
667     Other globally-unique identifiers, such as International Standard
668     Book Numbers (ISBN) or other formal names would also be candidates
669     for this element.

671 SOURCE
672     The work, either print or electronic, from which this resource
673     is derived, if applicable. For example, an html encoding of a
674     Shakespearean sonnet might identify the paper version of the
675     sonnet from which the electronic version was transcribed.

677 LANGUAGE
678     Language(s) of the intellectual content of the resource. Where
679     practical, the content of this field should coincide with the
680     NISO Z39.53 three character codes for written languages.
682 RELATION
683     Relationship to other resources. The intent of specifying this
684     element is to provide a means to express relationships among
685     resources that have formal relationships to others, but exist as
686     discrete resources themselves. For example, images in a document,
687     chapters in a book, or items in a collection. A formal
688     specification of RELATION is currently under development. Users
689     and developers should understand that use of this element should
690     be currently considered experimental.

692 COVERAGE
693     The spatial locations and temporal durations characteristic of the
694     resource. Formal specification of COVERAGE is currently under
695     development. Users and developers should understand that use of
696     this element should be currently considered experimental.

698 RIGHTS
699     The content of this element is intended to be a link (a URL or
700     other suitable URI as appropriate) to a copyright notice, a
701     rights-management statement, or perhaps a server that would
702     provide such information in a dynamic way. The intent of
703     specifying this field is to allow providers a means to associate
704     terms and conditions or copyright statements with a resource or
705     collection of resources. No assumptions should be made by users
706     if such a field is empty or not present.

708 Example:

710 @Dublin-Core-1 { ftp://ds.internic.net/internet-drafts/draft-kunze-dc-00.txt
711 TITLE{52}: Dublin Core Metadata for Simple Resource Description
712 CREATOR-1{9}: S. Weibel
713 CREATOR-2{8}: J. Kunze
714 CREATOR-3{9}: C. Lagoze
715 SUBJECT{44}: The Dublin Core Set of Elements for Metadata
716 DESCRIPTION{46}: Reference description of Dublin Core elements.
717 PUBLISHER{31}: Internet Engineering Task Force
718 CONTRIBUTOR-1{11}: Nick Arnett
719 CONTRIBUTOR-2{15}: Eliot Christian
720 CONTRIBUTOR-3{14}: Martijn Koster
721 CONTRIBUTOR-4{18}: Christian Mogensen
722 CONTRIBUTOR-5{14}: Timothy Niesen
723 CONTRIBUTOR-6{11}: Andrew Wood
724 CONTRIBUTOR-7{10}: Mic Bowman
725 CONTRIBUTOR-8{11}: Dan Connoly
726 CONTRIBUTOR-9{15}: Michael Mauldin
727 CONTRIBUTOR-10{12}: Wick Nichols
728 DATE{16}: February 9, 1997
729 TYPE{14}: Internet draft
730 FORMAT{4}: Text
731 IDENTIFIER{21}: draft-kunze-dc-00.txt
732 SOURCE{41}: http://purl.oclc.org/metadata/dublin_core
733 LANGUAGE{3}: eng
734 RELATION{24}: Draft Reference Standard
735 COVERAGE{22}: Expires August 8, 1997
736 RIGHTS{58}: Unlimited Distribution; readers must not cite as standard.
737 }
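In an object such as the one above, each VALUE-SIZE must equal the octet count of its VALUE, and repeated attributes carry a hyphen-integer suffix. Both are easy to get wrong when written by hand. The following is a minimal Python sketch of a serializer that computes them, offered as an illustration rather than a definitive implementation. The function names (soif_attribute, soif_object) are hypothetical. Encoding str values as UTF-8 and separating pairs with a single LF are assumptions made for this sketch; the format itself permits arbitrary VALUE encodings and only requires that parsers ignore whitespace between pairs.

```python
from typing import Iterable, List, Tuple, Union

Value = Union[str, bytes]


def soif_attribute(identifier: str, value: Value) -> bytes:
    """Emit IDENTIFIER{VALUE-SIZE}:<tab>VALUE with the size counted in octets."""
    # Assumption: text is encoded as UTF-8; bytes are passed through unchanged.
    raw = value.encode("utf-8") if isinstance(value, str) else value
    size = str(len(raw)).encode("ascii")
    return identifier.encode("ascii") + b"{" + size + b"}:\t" + raw


def soif_object(
    template_type: str,
    url: str,
    attributes: Iterable[Tuple[str, Union[Value, List[Value]]]],
) -> bytes:
    """Serialize one summary object; list values become NAME-1, NAME-2, ..."""
    parts = [b"@" + template_type.encode("ascii") + b" { " + url.encode("ascii")]
    for name, value in attributes:
        if isinstance(value, list):
            # Multiple VALUEs use the hyphen-plus-positive-integer convention.
            for n, item in enumerate(value, start=1):
                parts.append(soif_attribute(f"{name}-{n}", item))
        else:
            parts.append(soif_attribute(name, value))
    parts.append(b"}")
    # Assumption: one LF between pairs, purely for readability.
    return b"\n".join(parts) + b"\n"
```

For instance, soif_object("Dublin-Core-1", "-", [("CREATOR", ["S. Weibel", "J. Kunze"])]) yields the pairs CREATOR-1{9} and CREATOR-2{8}, with the hyphen standing in for a missing URL as the grammar allows.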