idnits 2.17.1 

draft-kunze-thump-03.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an IANA Considerations section.  (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** The document seems to lack separate sections for Informative/Normative
     References.  All references will be assumed normative when checking for
     downward references.

  ** The document seems to lack both a reference to RFC 2119 and the
     recommended RFC 2119 boilerplate, even if it appears to use RFC 2119
     keywords. 

     RFC 2119 keyword, line 221: '...   MUST respond to the "help" command ...'
     RFC 2119 keyword, line 254: '...st modifications SHOULD be reported ba...'
     RFC 2119 keyword, line 258: '...erver that declares THUMP support MUST...'
     RFC 2119 keyword, line 297: '...   but SHOULD report the resulting fil...'
     RFC 2119 keyword, line 450: '...sed by the above status codes, it MUST...'


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (May 2, 2017) is 2549 days in the past.  Is this
     intentional?


  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  == Missing Reference: 'TEMPER' is mentioned on line 284, but not defined

  == Unused Reference: 'ARK' is defined on line 590, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC2822' is defined on line 598, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC5013' is defined on line 606, but no explicit
     reference was found in the text

  ** Obsolete normative reference: RFC 2822 (Obsoleted by RFC 5322)


     Summary: 4 errors (**), 0 flaws (~~), 5 warnings (==), 2 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Network Working Group                                           J. Kunze
3	Internet-Draft                                California Digital Library
4	Intended status: Informational                                 N. Nassar
5	Expires: November 3, 2017                                 Index Data ApS
6	                                                             May 2, 2017

8	                  THUMP: The HTTP URL Mapping Protocol
9	                          draft-kunze-thump-03

11	Abstract

13	   The HTTP URL Mapping Protocol (THUMP) is a set of URL-based
14	   conventions for retrieving information and conducting searches.
15	   THUMP can be used for focused retrievals or for broad database
16	   queries.  A THUMP request is a URL containing a query string that
17	   starts with a `?', and can contain one or more THUMP commands.
18	   Returned records are formatted with Dublin Core Kernel metadata as
19	   Electronic Resource Citations, which are similar to blocks of email
20	   headers.

22	Status of This Memo

24	   This Internet-Draft is submitted in full conformance with the
25	   provisions of BCP 78 and BCP 79.

27	   Internet-Drafts are working documents of the Internet Engineering
28	   Task Force (IETF).  Note that other groups may also distribute
29	   working documents as Internet-Drafts.  The list of current Internet-
30	   Drafts is at http://datatracker.ietf.org/drafts/current/.

32	   Internet-Drafts are draft documents valid for a maximum of six months
33	   and may be updated, replaced, or obsoleted by other documents at any
34	   time.  It is inappropriate to use Internet-Drafts as reference
35	   material or to cite them other than as "work in progress."

37	   This Internet-Draft will expire on November 3, 2017.

39	Copyright Notice

41	   Copyright (c) 2017 IETF Trust and the persons identified as the
42	   document authors.  All rights reserved.

44	   This document is subject to BCP 78 and the IETF Trust's Legal
45	   Provisions Relating to IETF Documents
46	   (http://trustee.ietf.org/license-info) in effect on the date of
47	   publication of this document.  Please review these documents
48	   carefully, as they describe your rights and restrictions with respect
49	   to this document.  Code Components extracted from this document must
50	   include Simplified BSD License text as described in Section 4.e of
51	   the Trust Legal Provisions and are provided without warranty as
52	   described in the Simplified BSD License.

54	Table of Contents

56	   1.  Overview
57	   2.  A Sample THUMP Session
58	   3.  Keys and Citations
59	   4.  Key-Request Dualism
60	   5.  Request Summary
61	     5.1.  Key ? help
62	     5.2.  Key ? was(DESCRIPTION) when(DATE) resync
63	     5.3.  Key ? in(DB) find(QUERY) sort([!]ELEMS) list(RANGE)
64	           show(ELEMS) as(FORMAT)
65	     5.4.  Key ?
66	     5.5.  Key ??
67	     5.6.  Key ? get() put() group() apply()
68	   6.  Response Summary
69	   7.  Returned Records
70	     7.1.  Empty values for required elements
71	   8.  FAQ -- Frequently Asked Questions
72	     8.1.  What's the difference between THUMP, OpenSearch, SRU/SRW,
73	           and OpenURL?
74	   9.  Security Considerations
75	   10. References
76	   Authors' Addresses

78	1.  Overview

80	   This document specifies The HTTP URL Mapping Protocol (THUMP), a set
81	   of URL-based conventions for retrieving information and conducting
82	   searches.  THUMP can be used for focused retrievals; e.g., for a
83	   given known-item, asking that a specifically formatted subset of
84	   information about it be returned.  It can also be used for broad
85	   database queries, such as finding all records matching the word,
86	   "monitor".

88	   A THUMP request is a URL containing a query string that starts with a
89	   `?', and can contain one or more THUMP commands.  A request is passed
90	   to a server with HTTP GET (or POST if desired).  The shortest request
91	   is a URL ending in `?', as in,

93	       http://example.com/object321?

95	   which asks the server to return a metadata record describing the
96	   information item identified by the URL.  This is a shorthand for the
97	   common request for a short description of a known-item; the
98	   completely spelled out equivalent in this case would be

100	       http://example.com/object321?show(brief)as(anvl/erc)

102	   An example of a broad database search is,

104	       http://example.com/?in(books)find(war and peace)show(full)
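
As an illustrative sketch only (it is not part of this specification), a client might assemble request URLs like those above by concatenating commands after the Key. The function name `thump_url`, its `encode` flag, and the choice of which characters to leave unescaped are assumptions made for this example; the examples in this document show spaces unencoded for readability.

```python
from urllib.parse import quote


def thump_url(key, commands, encode=False):
    """Append a THUMP query string built from (name, argument) pairs to key."""
    query = "".join(f"{name}({arg})" for name, arg in commands)
    if encode:
        # Assumption: percent-encode characters such as spaces for transport,
        # leaving the THUMP delimiters "(", ")", "|" and "/" untouched.
        query = quote(query, safe="()|/")
    return f"{key}?{query}"


# The three request shapes shown in the Overview.
shortest = thump_url("http://example.com/object321", [])
spelled_out = thump_url("http://example.com/object321",
                        [("show", "brief"), ("as", "anvl/erc")])
search = thump_url("http://example.com/",
                   [("in", "books"), ("find", "war and peace"), ("show", "full")])
```

With `encode=True` the same search is produced with the spaces in the QUERY percent-encoded, which is what an HTTP client would normally need to send.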

106	   Query strings and responses are UTF-8-encoded [RFC3629].  A THUMP
107	   response is an HTTP message body containing one or more records.
108	   Records contain Kernel metadata [Kernel] elements h1-h4.  The
109	   currently defined tag names are summarized below, formatted as
110	   Electronic Resource Citations (ERC), which are similar to blocks of
111	   email headers.  In an ERC each element consists of a label, colon,
112	   and value; long values are continued on indented lines and empty
113	   lines separate records.  It will be possible in a future version of
114	   THUMP to request ERC records formatted in XML.

116	2.  A Sample THUMP Session

118	   THUMP is very simple and follows the classical stateless HTTP
119	   communication model.  This section contains a complete annotated
120	   example of a request and response exchange.  To summarize, the
121	   requester sets up a TCP/HTTP session with the server system, sends a
122	   THUMP request inside an HTTP request, receives an answer inside an
123	   HTTP response, and closes the session.

125	   In the following example THUMP session, each line has been annotated
126	   to include a line number and whether it was the client or server that
127	   sent it.  Without going into depth, the session has three pieces
128	   separated by blank lines: the client's piece (lines 1-3), the
129	   server's HTTP/THUMP response headers (4-7), and the body of the
130	   server's response (8-13).  The first and last lines (1 and 13)
131	   correspond to the client's steps to start the TCP session and the
132	   server's steps to end it, respectively.  The heart of the request is
133	   the known-item metadata request indicated by the URL ending in a
134	   single `?' on line 2.

136	    1  C: [opens session]
137	       C: GET http://ark.cdlib.org/ark:/13030/ft167nb0vq? HTTP/1.1
138	       C:
139	       S: HTTP/1.1 200 OK
140	    5  S: Content-Type: text/plain
141	       S: THUMP-Status: 0.6 200 OK
142	       S:
143	       S: erc:
144	       S: who:   Stanton A. Glantz and Edith D.  Balbach
145	   10  S: what:  Tobacco War: Inside the California Battles
146	       S: when:  20000510
147	       S: where: http://ark.cdlib.org/ark:/13030/ft167nb0vq
148	       S: [closes session]

150	   The first two server response lines (4-5) above are typical with
151	   HTTP.  The next line (6) is peculiar to THUMP, and indicates the
152	   THUMP version and a normal return status.  The balance of the
153	   response consists of a single metadata record (8-12) that comprises
154	   the service response.

156	   The returned record (8-12) in this case is in the ERC format (other
157	   formats are possible).  It contains four elements that answer high
158	   priority questions regarding an expression of the object: who played
159	   a major role in expressing it, what the expression was called, when
160	   it was created, and where the expression may be found.

162	3.  Keys and Citations

164	   A THUMP request is a command sequence operating on a Key, which is a
165	   base URL for a service point that supports THUMP.  It is expected,
166	   however, that the Key may generalize to service points in client-
167	   server computation contexts other than today's WWW.

169	   The Key uses a "citation-centered" system of reference.  This means
170	   that data elements are addressed relative to an abstract object
171	   surrogate, or "citation".

173	   While some systems have stored metadata-based surrogates (e.g.,
174	   library catalog records for books), many other systems do not.  This
175	   is not an obstacle to using THUMP.  The latter usually support the
176	   display or delivery of dynamically generated object citations, each
177	   consisting of such things as an access URL, a size, a date, a title,
178	   a snippet of relevant text (e.g., matching a query), plus links to
179	   related materials.

181	   Non-surrogate information objects in this model are, loosely
182	   speaking, the priority objects for end users, and include documents,
183	   articles, books, films, recordings, etc.  Surrogates, whether static
184	   or dynamically generated, are important temporary stand-ins during
185	   discovery, filtering, and selection processes.  They are easy to
186	   manipulate in large numbers because they are much more homogeneous
187	   than the objects they represent.  Those objects are often too large,
188	   unwieldy, or rights-encumbered to be dealt with directly during
189	   discovery.  Surrogates are also valuable in preservation since they
190	   can provide useful information about the original context,
191	   dependencies, and provenance of an object.

193	4.  Key-Request Dualism

195	   Although THUMP does not specify anything about the structure of the
196	   Key, it is possible for a given Key string to express, often in an ad
197	   hoc manner, information similar to that expressed in the Request
198	   query string.  The more intuitive the Key structure, the greater the
199	   chance for it to carry information that might appear to repeat or
200	   even contradict commands in the Request.  For example, one server
201	   might require

203	       http://example.org/?in(books)find(war and peace)show(full)

205	   while another server required

207	       http://example.com/in=books/find=war+and+peace?show(full)

209	   and a third server required

211	       http://example.net/books/full/war_and_peace

213	   There is a natural dualism that servers may exploit by permitting or
214	   proposing (e.g., by returning) such semantically-laden Keys.  Any
215	   conventions for re-expressing THUMP commands within the Key or for
216	   resolving apparent contradictions, however, are up to individual
217	   servers and are out of scope for this document.

219	   This document recognizes the dualism but does not constrain it except
220	   to say that for a given Key, a server that declares THUMP support
221	   MUST respond to the "help" command by listing all the commands
222	   (methods) valid for that Key.  As a foundation requirement, the
223	   "help" command is a common way to ping a THUMP server to see if it is
224	   alive.  As an edge case, a THUMP response might be returned even for
225	   a URL that has no request at all (not even a `?'); this might make
226	   sense, for example, when the URL serves as the base Key for an entire
227	   service.

229	   There are cases when a server may wish to generate a temporary Key as
230	   a stand-in for a long or complex request and return it along with a
231	   subset of found records.  For example, the request,
232	       http://example.com/?in(books)find(war and peace)list(10|1)

234	   might return the first 10 records along with a Key that could be used
235	   in subsequent requests to return the next 10 records:

237	       http://example.com/req98765?list(10|11)

239	   Note that this document makes no assumption about the dynamicity of
240	   queries, whether expressed partially or entirely in the Key or in the
241	   request.  In either form, returned records might come from cached
242	   results or from results freshly computed upon each access.  THUMP
243	   support does not constrain servers in this regard.

245	5.  Request Summary

247	   There are several request forms described below, with output formats
248	   listed in a later section.  Spaces have been inserted for readability
249	   in the forms below; usually, inter-command spaces would not be
250	   present.  It is normal to formulate THUMP queries using only a subset
251	   of the commands specified.  With a few important exceptions, this
252	   document is silent on how servers supply defaults or whether they
253	   signal errors for missing commands.  All default actions and server-
254	   side request modifications SHOULD be reported back to the client.

256	5.1.  Key ? help

258	   This form is required.  A server that declares THUMP support MUST
259	   respond to the "help" command by listing all the commands (methods)
260	   valid for that Key.  As a foundation requirement, the "help" command
261	   is a common way to ping a THUMP server to see if it is alive.

263	5.2.  Key ? was(DESCRIPTION) when(DATE) resync

265	   This "metadata" command form provides nothing more than a way to
266	   carry a Key along with its description.  The form is a "no-op"
267	   (except when "resync" is present) in the sense that the Key is
268	   treated as an adorned URL (as if no THUMP request were present).
269	   This form is designed as a passive data structure that pairs a
270	   hyperlink with its metadata so that a formatted description might be
271	   surfaced by a client-side trigger event such as a "mouse-over".  It
272	   is passive in the sense that selecting ("clicking on") the URL should
273	   result in ordinary access via the Key-as-pure-link as if no THUMP
274	   request were present.  The form is effectively a metadata cache, and
275	   the DATE of last extraction tells how fresh it is.

277	   The "was" pseudo-command takes multiple arguments separated by "|",
278	   the first argument identifying the kind of DESCRIPTION that follows,
279	   e.g.,
280	   was(erc|Tolstoy, L|War and Peace|1863|http://example.org/etext/2600)

282	   The "when" pseudo-command (optional) takes one argument that is the
283	   date that the immediately preceding DESCRIPTION was extracted.  The date,
284	   conforming to the [TEMPER] specification, looks like YYYYMMDDhhmmss.
285	   The "was" and "when" pseudo-commands can harmlessly accompany any
286	   THUMP request.
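
The argument handling described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, no escaping rule for a literal "|" inside a field is assumed (none is given here), and the date is parsed only according to the YYYYMMDDhhmmss shape mentioned above rather than the full [TEMPER] specification.

```python
from datetime import datetime


def parse_was(argument):
    """Split the argument of was(...) into (kind, fields)."""
    # The first "|"-separated argument names the kind of DESCRIPTION.
    kind, *fields = argument.split("|")
    return kind, fields


def parse_when(argument):
    """Parse a YYYYMMDDhhmmss date into a naive datetime."""
    return datetime.strptime(argument, "%Y%m%d%H%M%S")


kind, fields = parse_was(
    "erc|Tolstoy, L|War and Peace|1863|http://example.org/etext/2600")
extracted = parse_when("20000510120000")
```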

288	   The "resync" command, however, is a request to update the metadata.
289	   It returns a "metadata" form similar to the one submitted, but with
290	   refreshed metadata and no "resync" at the end.

292	5.3.  Key ? in(DB) find(QUERY) sort([!]ELEMS) list(RANGE) show(ELEMS)
293	      as(FORMAT)

295	   This form is used for generalized queries.  The server is permitted
296	   to modify commands, such as by supplying missing commands (defaults),
297	   but SHOULD report the resulting filled-out command xxx.

299	   *in(DB)*
300	   The "in" command specifies one or more dataset names separated by
301	   "|".  If no "in" command is present, the server picks a suitable
302	   default dataset or returns an error.  If no other commands are
303	   present, the server may treat the dataset as a result set or return
304	   an error.  Dataset names originating in relational databases are
305	   assumed to name a table in a default database, but may be structured
306	   into database, schema, and table names using the reserved characters
307	   '/' and '.' as per the following forms:

309	       database/schema.table
310	       database/table
311	       schema.table
312	       table
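
The four forms above can be told apart mechanically. The following sketch, with the hypothetical helper names `parse_dataset` and `parse_in`, splits a dataset name on the reserved characters '/' and '.' and reports missing parts as None; how a server maps those parts onto real databases is outside what this example assumes.

```python
def parse_dataset(name):
    """Split a dataset name into (database, schema, table)."""
    database, slash, rest = name.partition("/")
    if not slash:
        # No '/' present: there is no database part.
        database, rest = None, name
    schema, dot, table = rest.partition(".")
    if not dot:
        # No '.' present: there is no schema part.
        schema, table = None, rest
    return database, schema, table


def parse_in(argument):
    """Parse the argument of in(...), which may name several datasets."""
    return [parse_dataset(name) for name in argument.split("|")]
```

For example, `parse_in("books|lib/main.serials")` yields one tuple for the table "books" and one for the database "lib", schema "main", table "serials".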

314	   *find(QUERY)*
315	   The "find" command specifies a QUERY that should produce a result set
316	   of matching records or an error.  The result set is modeled as a
317	   numbered sequence of records that is returned "by reference" with a
318	   generated Key (see the "results" tag later) or as one or more
319	   returned subsequences of records, known as returned sets.  If no
320	   "find" command is present, Key is expected to imply either a single
321	   record or a set of records.  THUMP distinguishes between a result set
322	   and a returned set, which is a subsequence of the result set included
323	   in a given response.

325	   The QUERY consists of free text words separated by spaces.  Reserved
326	   words begin with a ":" (colon), such as the :and, :or, and :not
327	   boolean operators.  Parentheses can be used for grouping.  Prepending
328	   "+" ("-") to a word is done when the requester desires that the word
329	   be present (absent) from search results.  The double-quote character
330	   can be used to join words in a phrase or to turn off the special
331	   meanings of parentheses or ":+-" in front of words.

333	   *sort([!]ELEMS)*
334	   The "sort" command is used to request ordering according to the ELEMS
335	   specification (descending order if preceded by '!').  If no "sort"
336	   command is present, it is up to the server to determine record
337	   ordering.  ELEMS is one or more element or element subset names
338	   separated by "|".

340	   *list(RANGE)*
341	   The "list" command is used to request that a specific subsequence or
342	   RANGE of records be returned.  The server should always use the
343	   starting point of the requested RANGE, but is free to return fewer
344	   records (or a partial record).  In all cases the server must report
345	   what records or record fragment it has returned.  If no "list"
346	   command is present, it is up to the server whether to return records,
347	   and if so, which records.

349	   RANGE is a pair of arguments, "LENGTH|START", indicating the number
350	   of records and starting record in the requested sequence.  For
351	   example, a RANGE of "10|81" requests 10 records beginning with result
352	   set record 81.  If both arguments are missing, as in "list()", it is
353	   considered a request for all records.  If given as just "list(0)", it
354	   is a request that no records be returned directly, but that the
355	   result set be returned by reference to a generated Key listed in the
356	   "results" tag of the returned set header.  If LENGTH is positive and
357	   START is 0, the server should send LENGTH randomly selected result
358	   set records.  If START is missing it defaults to 1; if LENGTH is
359	   missing, it is considered a request for all records starting from
360	   START.

362	   RANGE may also be used to request record fragments.  A returned
363	   record set consists of either one or more entire (whole) records, or
364	   of exactly one fragment of one record.  When a fragment is returned,
365	   the start position in the set header (described later) is indicated
366	   with S_F, where S is the record number and F is the fragment sequence
367	   number.  To request the next fragment, a START is formulated by
368	   adding 1 to F.  For example, "10|45_3" requests 10 records starting
369	   at fragment 3 of record 45 (only one fragment can be returned).
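
The RANGE rules above can be sketched as a small parser. The names `parse_range` and `next_fragment_start` are hypothetical, and the sketch only decodes the arguments; acting on special cases such as a START of 0 (random selection) or a LENGTH of 0 (return by reference) is left to the caller.

```python
def parse_range(text):
    """Parse 'LENGTH|START' into (length, start, fragment).

    length is None when all records are requested; fragment is None
    unless START uses the S_F form.
    """
    length_text, _, start_text = text.partition("|")
    length = int(length_text) if length_text else None
    # A missing START defaults to record 1.
    record_text, _, fragment_text = (start_text or "1").partition("_")
    start = int(record_text)
    fragment = int(fragment_text) if fragment_text else None
    return length, start, fragment


def next_fragment_start(start, fragment):
    """Formulate the START for the fragment after S_F by adding 1 to F."""
    return f"{start}_{fragment + 1}"
```

For instance, a client that received fragment 3 of record 45 could request the following fragment with a RANGE of `"10|" + next_fragment_start(45, 3)`.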

371	   *show(ELEMS)*
372	   The "show" command is used to request that returned records be
373	   constituted with ELEMS elements.  ELEMS is one or more element or
374	   element subset names separated by "|".  It can be used by users to
375	   define the composition and element order of a returned record set;
376	   element names are discovered by XXX.

378	   Element subset names can also be used.  Common subset names are
379	   "brief", "full", and "support" (a record that is complete enough to
380	   show the server's commitment to the object).  If no "show" command is
381	   present, it is up to the server which elements to return.

383	   *as(FORMAT)*
384	   The "as" command is used to request that returned records be
385	   formatted according to FORMAT.  Common format names are "anvl/erc",
386	   "anvl/qdc", and "xml/marc".  If no "as" command is present, the
387	   default format is usually "anvl/erc" (a plain text format that is
388	   eye-readable and machine-readable), although a service may define
389	   defaults in its own way.

391	5.4.  Key ?

393	   This is a shorthand for

395	       Key ? show(brief) as(anvl/erc)

397	   which returns a brief object (identified by Key) description.
398	   Support for this shorthand is required.

400	5.5.  Key ??

402	   This is a shorthand for

404	       Key ? show(support) as(anvl/erc)

406	   which returns an object description full enough to contain the server
407	   provider's commitment statement.  Support for this shorthand is
408	   required.
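
A server or client might normalize the two required shorthands before further processing. The sketch below is one way to do so, assuming a hypothetical function `expand_request` that receives the full request URL and that the first `?' marks the start of the request.

```python
def expand_request(url):
    """Rewrite the 'Key?' and 'Key??' shorthands in their spelled-out form."""
    key, mark, query = url.partition("?")
    if not mark:
        # No `?' at all: no THUMP request is present.
        return url
    if query == "":
        query = "show(brief)as(anvl/erc)"
    elif query == "?":
        query = "show(support)as(anvl/erc)"
    return f"{key}?{query}"
```

Any other request is returned unchanged, so the function can be applied to every incoming URL.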

410	5.6.  Key ? get() put() group() apply()

412	   These commands are currently undefined and reserved by THUMP for
413	   future use.

415	6.  Response Summary

417	   A THUMP response consists of a block of HTTP and extension headers, a
418	   blank line, and, if the THUMP-Status extension header was 200, a
419	   returned set of records.  The Content-Type HTTP header is normally
420	   returned as

422	       Content-Type: text/plain

424	   so that the results will display correctly in a web browser.  The
425	   THUMP content types "text/xml" and "text/html" are
426	   being considered.

428	   The rest of this section describes the THUMP extension headers and
429	   the structure of the returned record set.  Extension headers are
430	   inserted in the block of HTTP response headers, usually near the end.
431	   Currently, one extension header, THUMP-Status, is defined, and it is
432	   required:

434	       THUMP-Status: THUMPVersion StatusCode ReasonPhrase

436	   It includes the version, a 3-digit integer result code indicating
437	   the status of the attempt to execute the request, and a short
438	   human-readable phrase.  Defined StatusCodes and ReasonPhrases are:

440	       200: OK
441	       400: Bad Request
442	       402: Payment Required
443	       403: Forbidden
444	       404: Not Found
445	       405: Method Not Allowed
446	       408: Request Time-out

448	   If the status code is other than 200, no record set should be sent.
449	   If the server wishes to convey any more detailed diagnostic or error
450	   information than may be expressed by the above status codes, it MUST
451	   set the code to 200 and use "error" or "warning" element tags within
452	   the returned record set.
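
As a minimal sketch of how a client might read the extension header, the following splits a THUMP-Status value into its three parts. The function name `parse_thump_status` and the `has_record_set` helper are assumptions for illustration; the value is assumed to use single spaces between fields, as in the sample session.

```python
def parse_thump_status(value):
    """Split a THUMP-Status value into (version, status_code, reason_phrase)."""
    # The ReasonPhrase may itself contain spaces, so split at most twice.
    version, code, reason = value.split(" ", 2)
    return version, int(code), reason


def has_record_set(status_code):
    """A returned record set is expected only when the status code is 200."""
    return status_code == 200


version, code, reason = parse_thump_status("0.6 200 OK")
```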

454	   A blank line separates the HTTP response and THUMP-Status headers
455	   from the returned set that is the body of the response.  The returned
456	   record set consists of a set-start header record followed by a
457	   sequence of records, each separated by one or more blank lines,
458	   until end of stream (file) is reached.  A set-end header record is
459	   optional.

461	   The format of the records is normally "anvl/erc", which specifies a
462	   serialization syntax [ANVL] with ERC semantics [Kernel].  In a future
463	   version of THUMP it will be possible to request ERC semantics with
464	   "xml/erc".  The next sections describe the special ANVL record used
465	   to introduce a record set and then the ERC records.

467	7.  Returned Records

469	   This section describes how a record in the sequence of returned
470	   records is encoded in the anvl/erc format.  ANVL (A Name Value
471	   Language) defines the syntax and the ERC (Electronic Resource
472	   Citation) defines semantics.  The URI for the ERC [Kernel] reference
473	   should be included in the record set header.  While a comprehensive
474	   description of the ERC record is out of scope for this document, some
475	   details are given below that may suffice for simple implementations.

477	   An ERC record is a sequence of tagged elements.  It has the form,

479	       erc:
480	       who:   WHO_EXPRESSED_THIS_ITEM
481	       what:  WHAT_THE_EXPRESSION_WAS_CALLED
482	       when:  WHEN_IT_WAS_EXPRESSED
483	       where: WHERE_THE_EXPRESSION_CAN_BE_FOUND
484	       how:   DESCRIPTION_OR_SUMMARY_OF_ITEM              <optional>
485	       why:   COPYRIGHT_DISCLAIMER_AUDIENCE_STATEMENT     <optional>
486	       note:  ANY_TEXT                                    <optional>
487	              .......
488	       <any other tagged elements>                        <optional>

490	   The first five tagged elements are required.  The required elements
491	   may be thought to answer questions about an "expression" of a
492	   resource (an item).

494	   All other elements are optional.  The next ERC element shown above
495	   ("how") is concerned with the content of an item and the element
496	   after that ("why") with any high priority information that comes from
497	   the lawyerly domain -- the really hard questions.

499	   A short form of the ERC is also possible that relies on the above
500	   ordering for the first 6 elements.  It has the form,

502	       erc: WHO | WHAT | WHEN
503	            | WHERE
504	            | HOW                                         <optional>
505	            | WHY                                         <optional>
506	       note:  ANY_TEXT                                    <optional>
507	              .......
508	       <any other tagged elements>                        <optional>

510	   The line breaks among the first 6 elements are arbitrary.  Together
511	   they are considered to be part of one long value for the "erc:" as
512	   long as they are continued on indented lines.  In either form of the
513	   ERC, arbitrary additional elements are possible.
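
For simple implementations, both ERC forms can be read with a few lines of code. The sketch below relies only on the rules stated in this document (label, colon, value; continuation on indented lines; empty lines between records) and does not attempt to cover the full [ANVL] syntax. The function names are hypothetical.

```python
def parse_anvl(text):
    """Parse ANVL text into a list of records of (label, value) pairs."""
    records, current = [], []
    for line in text.splitlines():
        if not line.strip():
            # An empty line ends the current record.
            if current:
                records.append(current)
                current = []
        elif line[0] in " \t" and current:
            # An indented line continues the previous element's value.
            label, value = current[-1]
            current[-1] = (label, f"{value} {line.strip()}".strip())
        else:
            # Split at the first colon only; values may contain colons.
            label, _, value = line.partition(":")
            current.append((label.strip(), value.strip()))
    if current:
        records.append(current)
    return records


def expand_short_erc(record):
    """Expand a short-form 'erc:' value into separately tagged elements."""
    label, value = record[0]
    if label != "erc" or not value:
        return record
    names = ["who", "what", "when", "where", "how", "why"]
    parts = [part.strip() for part in value.split("|")]
    return [("erc", "")] + list(zip(names, parts)) + record[1:]


short = ("erc: Tolstoy, L | War and Peace | 1863\n"
         "     | http://example.org/etext/2600\n")
expanded = expand_short_erc(parse_anvl(short)[0])
```

Here `expanded` holds the same elements that the long form would carry: an empty "erc" element followed by "who", "what", "when", and "where".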

515	7.1.  Empty values for required elements

517	   Although they are required, if no suitable element value can be
518	   found, a controlled code value for "empty" of the form
519	       (:ccode)

521	   should be used, drawing from the following reserved values:

523	      (:unac) temporarily inaccessible

525	      (:unal) unallowed, suppressed intentionally

527	      (:unap) not applicable, makes no sense

529	      (:unas) value unassigned (e.g., Untitled)

531	      (:unav) value unavailable, possibly unknown

533	      (:unkn) known to be unknown (e.g., Anonymous, Inconnue)

535	      (:none) never had a value, never will

537	      (:null) explicitly and meaningfully empty

539	      (:tba) to be assigned or announced later

541	      (:etal) too numerous to list (et alia).

543	      (:at) the real value is at the given URL or identifier.
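
A consumer of returned records might detect these reserved values as follows. This is a sketch under assumptions: the name `empty_code` is hypothetical, and it assumes the code appears at the start of the element value and may be followed by further text (as the descriptions of "(:unas)" and "(:at)" suggest).

```python
import re

# Reserved "empty" codes and their meanings, as listed above.
EMPTY_CODES = {
    "unac": "temporarily inaccessible",
    "unal": "unallowed, suppressed intentionally",
    "unap": "not applicable, makes no sense",
    "unas": "value unassigned",
    "unav": "value unavailable, possibly unknown",
    "unkn": "known to be unknown",
    "none": "never had a value, never will",
    "null": "explicitly and meaningfully empty",
    "tba": "to be assigned or announced later",
    "etal": "too numerous to list (et alia)",
    "at": "the real value is at the given URL or identifier",
}

_CODE = re.compile(r"^\(:([a-z]+)\)")


def empty_code(value):
    """Return the reserved code that starts value, or None if there is none."""
    match = _CODE.match(value.strip())
    if match and match.group(1) in EMPTY_CODES:
        return match.group(1)
    return None
```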

545	8.  FAQ -- Frequently Asked Questions

547	8.1.  What's the difference between THUMP, OpenSearch, SRU/SRW, and
548	      OpenURL?

550	   All of these protocols are capable of expressing a parameter package
551	   on the right-hand side of a URL, and all of them reserve specific
552	   parameter names as having defined meanings.  In theory, these
553	   packages can be extended arbitrarily to express any functionality
554	   with any level of complexity.  There's no syntactic limitation to
555	   these protocols' expressiveness.  The difference lies in how.

557	   THUMP uses a classic parenthesized argument list syntax while the
558	   others use the flat argument-value list syntax traditional on the web
559	   since 1995.  OpenSearch and SRU/SRW are logical descendants of the
560	   complex Z39.50 search and retrieve protocol, but with restricted
561	   functionality and a text-based syntax.  SRW and OpenURL define an
562	   XML-encoding for request parameters.  OpenURL tends to be used for
563	   known-item linking.  THUMP aims to be a more concise specification
564	   for key-based requests.

566	9.  Security Considerations

568	   The THUMP protocol poses no direct risk to computers and networks.
569	   Implementors of THUMP services need to be aware of security issues
570	   when querying networks and filesystems, and the concomitant risks
571	   from spoofing and obtaining incorrect information.  These risks are
572	   no greater for THUMP than for any other kind of HTTP-based
573	   application.  For example, recipients of a URL with embedded THUMP
574	   commands should treat it like a URL and be aware that the identified
575	   service may no longer be operational.

577	   THUMP clients and servers subject themselves to all the risks that
578	   accompany normal operation of the protocols underlying mapping
579	   services (e.g., HTTP, Z39.50).  As a specialization of such protocols,
580	   a THUMP service may limit exposure to the usual risks.  Indeed, THUMP
581	   services may enhance a kind of security by helping users identify
582	   long-term reliable references to information objects.

584	10.  References

586	   [ANVL]     Kunze, J., Kahle, B., Masanes, J., and G. Mohr, "A Name-
587	              Value Language", August 2005,
588	              <http://www.cdlib.org/inside/diglib/ark/anvlspec.pdf>.

590	   [ARK]      Kunze, J. and R. Rodgers, "The ARK Persistent Identifier
591	              Scheme", July 2007,
592	              <http://www.cdlib.org/inside/diglib/ark/arkspec.pdf>.

594	   [Kernel]   Kunze, J. and A. Turner, "Kernel Metadata and Electronic
595	              Resource Citations (ERCs)", October 2007,
596	              <http://www.cdlib.org/inside/diglib/ark/ercspec.html>.

598	   [RFC2822]  Resnick, P., Ed., "Internet Message Format", RFC 2822,
599	              DOI 10.17487/RFC2822, April 2001,
600	              <http://www.rfc-editor.org/info/rfc2822>.

602	   [RFC3629]  Yergeau, F., "UTF-8, a transformation format of ISO
603	              10646", STD 63, RFC 3629, DOI 10.17487/RFC3629, November
604	              2003, <http://www.rfc-editor.org/info/rfc3629>.

606	   [RFC5013]  Kunze, J. and T. Baker, "The Dublin Core Metadata Element
607	              Set", RFC 5013, DOI 10.17487/RFC5013, August 2007,
608	              <http://www.rfc-editor.org/info/rfc5013>.

610	Authors' Addresses

612	   John Kunze
613	   California Digital Library
614	   415 20th St, #406
615	   Oakland, CA  94612
616	   USA

618	   Email: jak@ucop.edu

620	   Nassib Nassar
621	   Index Data ApS
622	   Njalsgade 76, 13
623	   Copenhagen  2300
624	   Denmark

626	   Email: nassib@indexdata.com