idnits 2.17.1 

draft-henderson-dasl-scenarios-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Cannot find the required boilerplate sections (Copyright, IPR, etc.) in
     this document.

     Expected boilerplate is as follows today (2024-04-26) according to
     https://trustee.ietf.org/license-info :

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.a:
        This Internet-Draft is submitted in full conformance with the provisions
        of BCP 78 and BCP 79.

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 2:
        Copyright (c) 2024 IETF Trust and the persons identified as the document
        authors.  All rights reserved.

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 3:
        This document is subject to BCP 78 and the IETF Trust's Legal Provisions
        Relating to IETF Documents
        (https://trustee.ietf.org/license-info) in effect on the date of
        publication of this document.  Please review these documents
        carefully, as they describe your rights and restrictions with
        respect to this document.  Code Components extracted from this
        document must include Simplified BSD License text as described in
        Section 4.e of the Trust Legal Provisions and are provided
        without warranty as described in the Simplified BSD License.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  ** Missing expiration date.  The document expiration date should appear on
     the first and last page.

  ** The document seems to lack a 1id_guidelines paragraph about
     Internet-Drafts being working documents. 

  ** The document seems to lack a 1id_guidelines paragraph about 6 months
     document validity. 

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     current Internet-Drafts. 

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     Shadow Directories. 

  ** Bad filename characters: the document name given in the document,
     'draft-henderson-dasl-scenarios-00.html', contains other characters than
     digits, lowercase letters and dash.

  ** Missing revision: the document name given in the document,
     'draft-henderson-dasl-scenarios-00.html', does not give the document
     revision number

  == Mismatching filename: the document gives the document name as
     'draft-henderson-dasl-scenarios-00.html', but the file name used is
     'draft-henderson-dasl-scenarios-00'

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack a Security Considerations section.

  ** The document seems to lack an IANA Considerations section.  (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** The document seems to lack an Authors' Addresses Section.

  ** The document seems to lack separate sections for Informative/Normative
     References.  All references will be assumed normative when checking for
     downward references.

  ** The abstract seems to contain references ([WEBDAV]), which it shouldn't.
      Please replace those with straight textual mentions of the documents in
     question.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == Couldn't figure out when the document was first submitted -- there may
     be comments or warnings related to the use of a disclaimer for pre-RFC5378
     work that could not be issued because of this.  Please check the Legal
     Provisions document at https://trustee.ietf.org/license-info to determine
     if you need the pre-RFC5378 disclaimer.

  -- The document date (Mar 23, 1999) is 9166 days in the past.  Is this
     intentional?

  -- Found something which looks like a code comment -- if you have code
     sections in the document, please surround them with '<CODE BEGINS>' and
     '<CODE ENDS>' lines.


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Outdated reference: A later version (-10) exists of
     draft-ietf-webdav-protocol-08


     Summary: 13 errors (**), 0 flaws (~~), 4 warnings (==), 2 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

1	INTERNET-DRAFT                                        Rick Henderson
2	draft-henderson-dasl-scenarios-00.html       Netscape Communications
3	September 18, 1998
4	Expires Mar 23, 1999

6	Scenarios for DASL

8	Status of this Memo

10	  This document is an Internet draft. Internet drafts are working
11	  documents of the Internet Engineering Task Force (IETF), its areas and
12	  its working groups. Note that other groups may also distribute working
13	  information as Internet drafts.

15	  Internet Drafts are draft documents valid for a maximum of six months
16	  and can be updated, replaced or obsoleted by other documents at any
17	  time. It is inappropriate to use Internet drafts as reference material
18	  or to cite them other than as "work in progress".

20	  To learn the current status of any Internet draft please check the
21	  "1id-abstracts.txt" listing contained in the Internet drafts shadow
22	  directories on ftp.is.co.za (Africa), nic.nordu.net (Europe),
23	  munnari.oz.au (Pacific Rim), ds.internic.net (US East coast) or
24	  ftp.isi.edu (US West coast). Further information about the IETF can be
25	  found at URL: http://www.ietf.org/.

27	  Distribution of this document is unlimited. Please send comments to
28	  the mailing list at www-webdav-dasl@w3.org, which may be joined by
29	  sending a message with subject "subscribe" to
30	  www-webdav-dasl-request@w3.org.

32	  Discussions on the list are archived at
33	  http://www.w3.org/pub/WWW/Archives/Public/www-webdav-dasl.

35	Abstract

37	  The Distributed Authoring and Versioning protocol [WEBDAV] defines
38	  simple mechanisms to assign and retrieve values for properties. This
39	  document presents scenarios for a WebDAV extension to support
40	  efficient searching for resources based on WebDAV properties and
41	  content. These scenarios are intended to suggest some of the uses that
42	  DASL could be put to. This may in turn motivate decisions on what is
43	  essential to DASL and what may be considered extra.

45	1. Introduction

47	  The scenarios below are intended to provoke discussion of what DASL
48	  should and shouldn't do. DASL need not support all of them. For each
49	  scenario it remains open how much DASL itself should support and how
50	  much DASL is only a small piece of what it would take to support it.
51	  At least one is probably impossible. These scenarios should
52	  encompass most of the sorts of things that we expect DASL to play a
53	  part in.

55	2. Scenarios

57	  The scenarios below are roughly grouped into scenarios dealing with
58	  the following topics: Resource Management, Seeking Information,
59	  Navigation, and "Search Isn't Always Enough".

61	2.1 Resource Management

63	  Search could be used to help keep track of what is going on with a set
64	  of DAV resources. Some DASL queries that might help with this:
65	     * Find all locked resources.
66	     * Search for all the owners of locked resources.
67	     * Search for resources that have been locked for more than 1 week.
68	       [Though desirable, this is impossible since DAV does not record
69	       the time when a resource was locked.]
70	     * Search for resources that have not changed in the last year.
71	  These queries could help determine which resources are likely to be
72	  undergoing changes, who is changing them, which resources have been
73	  locked for too long, and which resources are no longer changing.
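
	  As a minimal sketch of the feasible queries above, the following
	  filters an in-memory list of resources. The Resource class, its
	  field names, and the sample data are hypothetical stand-ins for DAV
	  properties; they are not part of DAV or DASL.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Resource:
    # Hypothetical stand-in for a DAV resource and two of its properties.
    href: str
    lock_owner: Optional[str]  # None when the resource is not locked
    last_modified: datetime


def locked_resources(resources):
    """Resources that currently hold a lock."""
    return [r for r in resources if r.lock_owner is not None]


def lock_owners(resources):
    """Distinct owners of locked resources, in sorted order."""
    return sorted({r.lock_owner for r in resources if r.lock_owner is not None})


def unchanged_since(resources, cutoff):
    """Resources whose last modification is older than the cutoff."""
    return [r for r in resources if r.last_modified < cutoff]


now = datetime(1998, 9, 18)
resources = [
    Resource("/specs/design.html", "alice", datetime(1998, 9, 1)),
    Resource("/memos/old.html", None, datetime(1997, 3, 2)),
    Resource("/plans/test.html", "bob", datetime(1998, 8, 15)),
]

print([r.href for r in locked_resources(resources)])
print(lock_owners(resources))
print([r.href for r in unchanged_since(resources, now - timedelta(days=365))])
```

	  The "locked for more than 1 week" query has no counterpart here
	  because, as noted, no lock time is available to compare against.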

75	2.2 Seeking Information

77	2.2.1 Finding a specific resource using content search

79	  A user's information need may be like this: "I need that article
80	  I saw a while back that made a connection between epilepsy, migraines,
81	  and zinc." They can do a content-based search for resources that
82	  contain all of the words epilepsy, migraine, and zinc.
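
	  A conjunctive content search of this kind might be sketched as
	  follows. The tokenizer, the function names, and the sample
	  resources are assumptions made for illustration; a real server
	  would use an index rather than scanning text.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def contains_all(text, words):
    """True when every search word occurs somewhere in the text."""
    tokens = set(tokenize(text))
    return all(word.lower() in tokens for word in words)


# Hypothetical resource contents keyed by href.
docs = {
    "/a.html": "Zinc levels may link epilepsy and migraine in some patients.",
    "/b.html": "A migraine diary can help track triggers.",
}

hits = [href for href, text in docs.items()
        if contains_all(text, ["epilepsy", "migraine", "zinc"])]
print(hits)
```

	  Note that this exact-word match would miss a resource that only
	  says "migraines", which is the gap that stemming addresses.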

84	2.2.2 Finding a specific resource by phrase

86	  A user remembers a resource that they liked and wants to see it again
87	  but doesn't have it bookmarked and doesn't remember the location. They
88	  do remember a key phrase from the content, though. They can search for
89	  a phrase such as "invisible car" and find the resource without picking
90	  through a large number of irrelevant resources. Phrase search matters
91	  here: "invisible" and "car" are common enough words that far more
92	  resources contain both of them than contain the phrase "invisible
93	  car".
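
	  The difference between phrase search and a search for both words
	  can be sketched as below. The helper names and sample texts are
	  hypothetical, and the matching is a plain scan over tokens.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def contains_all(text, words):
    """Conjunctive match: every word occurs somewhere in the text."""
    tokens = set(tokenize(text))
    return all(word.lower() in tokens for word in words)


def contains_phrase(text, phrase):
    """Phrase match: the words occur adjacently and in order."""
    tokens = tokenize(text)
    wanted = tokenize(phrase)
    n = len(wanted)
    return n > 0 and any(tokens[i:i + n] == wanted
                         for i in range(len(tokens) - n + 1))


docs = {
    "/a.html": "The invisible car was parked outside.",
    "/b.html": "The car was nearly invisible in the fog.",
}

both_words = [h for h, t in docs.items() if contains_all(t, ["invisible", "car"])]
phrase = [h for h, t in docs.items() if contains_phrase(t, "invisible car")]
print(both_words)
print(phrase)
```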

95	2.2.3 Finding a specific resource by author and date range

97	  A user's information need may be expressed something like this: "I
98	  need that trip report that John Doe wrote last spring." They don't
99	  know its location or its title. They can search for resources with
100	  author equal to "John Doe" and create date greater than 1998/01/01 and
101	  less than 1998/06/01. This may yield few enough resources to easily
102	  find the one of interest.

104	2.2.4 Finding a specific resource using both content and property search

106	  The user who wanted to find the trip report that John Doe wrote last
107	  spring may find that John Doe was very prolific and wrote several
108	  hundred things last spring. The user may do better using both content
109	  and property search. They can search for resources with author equal
110	  to "John Doe" and create date greater than 1998/01/01 and less than
111	  1998/06/01 that contain some of the words IETF, Redmond, and DASL.
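
	  The property search of 2.2.3 and the combined search of 2.2.4 can
	  be sketched together. The dictionary keys ("author", "created",
	  "text") and the sample resources are assumptions for illustration
	  only; DASL does not define them.

```python
import re
from datetime import date

# Hypothetical resources with two properties and their text content.
docs = [
    {"href": "/trips/redmond.html", "author": "John Doe",
     "created": date(1998, 3, 12),
     "text": "Notes from the DASL meeting in Redmond."},
    {"href": "/trips/paris.html", "author": "John Doe",
     "created": date(1998, 4, 2),
     "text": "Customer visit in Paris."},
    {"href": "/trips/ietf.html", "author": "Jane Roe",
     "created": date(1998, 4, 9),
     "text": "IETF meeting summary."},
    {"href": "/old.html", "author": "John Doe",
     "created": date(1997, 11, 1),
     "text": "DASL kickoff notes."},
]


def property_match(doc):
    """Author equals "John Doe" and creation date is inside the range."""
    return (doc["author"] == "John Doe"
            and date(1998, 1, 1) < doc["created"] < date(1998, 6, 1))


def content_match(doc, words):
    """True when the text contains at least one of the words."""
    tokens = set(re.findall(r"[a-z0-9]+", doc["text"].lower()))
    return any(word.lower() in tokens for word in words)


by_property = [d["href"] for d in docs if property_match(d)]
by_both = [d["href"] for d in docs
           if property_match(d) and content_match(d, ["IETF", "Redmond", "DASL"])]
print(by_property)
print(by_both)
```

	  Adding the content condition narrows the property-only result
	  from two resources to one in this sample.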

113	2.2.5 Finding resources of a particular kind

115	  DASL could be used to find resources of a particular kind such as
116	  images. This could be used directly by an end user looking for
117	  interesting images, or by a program that does some kind of processing
118	  on the images, such as selecting GIF images that are portraits. A
119	  query that asked for mime-type = image/* could gather that data.
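
	  A wildcard match on the media type might look like the sketch
	  below. It uses shell-style pattern matching from the Python
	  standard library as a stand-in; how DASL would actually express
	  such a wildcard is not specified here, and the resource table is
	  hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical mapping from href to media type.
resources = {
    "/logo.gif": "image/gif",
    "/index.html": "text/html",
    "/photo.jpg": "image/jpeg",
}


def by_media_type(resources, pattern):
    """Hrefs whose media type matches a pattern such as "image/*"."""
    return sorted(href for href, media_type in resources.items()
                  if fnmatchcase(media_type.lower(), pattern.lower()))


print(by_media_type(resources, "image/*"))
```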

121	2.2.6 Finding resources in a particular language

123	  Assuming that a language attribute is set, a search could be
124	  restricted to resources that are in a particular language, say German.
125	  A site could set this attribute automatically using language
126	  recognition technology.

128	2.2.7 Searching for information on multiple servers

130	  A user seeking information of some sort may not know what server(s)
131	  contain the information they are seeking. The DASL client program can
132	  send the content-based query to several servers without having to
133	  translate the query into a different query syntax for each server. For
134	  property queries, the DASL client can query the attribute schema on
135	  the DASL servers and send a property query or a mixed property and
136	  content query to a set of DASL servers that share a common property
137	  schema. The results from such a cross-server search can be sorted
138	  according to property values or according to relevance score.
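
	  A client-side merge of results from several servers could be
	  sketched as follows. Each server is modeled as a hypothetical
	  callable that returns (href, relevance score) pairs; no real
	  network protocol is shown. The sketch also assumes that relevance
	  scores from different servers are comparable, which may not hold
	  in practice.

```python
from typing import Callable, Dict, List, Tuple

Hit = Tuple[str, float]  # (href, relevance score)


def search_all(servers: Dict[str, Callable[[str], List[Hit]]], query: str):
    """Send the same query to every server and merge by descending score."""
    merged = []
    for name, search in servers.items():
        for href, score in search(query):
            merged.append((score, name, href))
    merged.sort(key=lambda hit: hit[0], reverse=True)
    return merged


# Hypothetical servers that return canned results.
servers = {
    "server-a": lambda query: [("/a1.html", 0.9), ("/a2.html", 0.4)],
    "server-b": lambda query: [("/b1.html", 0.7)],
}

results = search_all(servers, "epilepsy migraine zinc")
for score, name, href in results:
    print(score, name, href)
```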

140	2.2.8 Stemming

142	  If a user is searching for information about the hobby of building
143	  model cars, relevant resources are likely to contain various forms of
144	  those words: model, models, modeling, as well as car and cars.
145	  Stemming saves the user from entering all the various forms of the
146	  words they may want to match. Stemming is sometimes confused with right
147	  truncation, but it is quite different. In languages such as English
148	  one can approximate stemming by right truncation of words, e.g.
149	  "model*" matches "model", "models", "modeling", "modeler" etc. This
150	  doesn't work well for shorter words. "car*" matches not only car and
151	  cars, but also carbon, carcinoma, card etc. For many languages right
152	  truncation doesn't work well since the forms of a word are changed by
153	  changing something in the middle or the beginning of the word.
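
	  The contrast between right truncation and stemming can be seen in
	  a small sketch. The suffix list below is a deliberately tiny toy
	  for English and is an assumption of this example; real stemmers
	  are considerably more elaborate.

```python
def truncation_match(word, prefix):
    """Right truncation: "car*" matches any word that starts with "car"."""
    return word.startswith(prefix)


# Toy English suffix list, longest suffixes first (illustrative only).
SUFFIXES = ("ing", "ers", "er", "s")


def toy_stem(word):
    """Strip one known suffix, keeping at least three leading letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[:-len(suffix)]
    return word


words = ["car", "cars", "carbon", "carcinoma", "card"]
print([w for w in words if truncation_match(w, "car")])
print([w for w in words if toy_stem(w) == "car"])
print([toy_stem(w) for w in ["model", "models", "modeling", "modeler"]])
```

	  Truncation accepts all five words, while the toy stemmer keeps
	  only "car" and "cars" and still maps every form of "model" to the
	  same stem.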

155	2.2.9 Word proximity

157	  In the stemming example our user was searching for fairly common
158	  words, car and model, in an effort to find information on building
159	  model cars. Many resources that have nothing to do with model cars or
160	  building models of cars might contain both words. What the user wants
161	  is resources where model and car are close together. A search that
162	  takes into account the proximity of the search terms would help filter
163	  out the irrelevant resources. This is distinct from phrase search as
164	  described in 2.2.2 and the conjunctive content search in 2.2.1. It is
165	  different from phrase search in that the user here is probably also
166	  interested in "car models", "model cars", and "model of a car". It is
167	  also different from conjunctive search in that the user has a
168	  reasonable expectation that the words are likely to occur together in
169	  a relevant resource.
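
	  A proximity test could be sketched as a bound on the distance
	  between word positions. The function name, the distance limit of
	  three words, and the sample sentences are assumptions chosen for
	  illustration.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def within_proximity(text, first, second, max_distance):
    """True when the two words occur at most max_distance positions apart,
    in either order."""
    tokens = tokenize(text)
    first_positions = [i for i, t in enumerate(tokens) if t == first]
    second_positions = [i for i, t in enumerate(tokens) if t == second]
    return any(abs(i - j) <= max_distance
               for i in first_positions for j in second_positions)


texts = [
    "Building a model car is fun.",
    "A scale model of a car.",
    "The new car was shown at an event hosted by a famous fashion model.",
]

for text in texts:
    print(within_proximity(text, "model", "car", 3), text)
```

	  All three texts satisfy a conjunctive search for both words, and
	  only the first contains the phrase "model car", but the proximity
	  test accepts the first two and rejects the third.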

171	2.2.10 Query By Example

173	  A user has done a search and found some relevant or nearly relevant
174	  resources and some clearly irrelevant resources. Desiring a broader
175	  and more specific set of resources, they specify one or more of the
176	  relevant result resources and one or more of the irrelevant resources
177	  to a query-by-example type operator. The result is a new set of
178	  resources whose keywords overlap more with the relevant examples than
179	  with the irrelevant ones. This type of operator saves the user the
180	  considerable trouble of constructing a new query that will filter out
181	  the irrelevant resources while expanding the set of keywords from the
182	  relevant resources.
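
	  One possible way to realize such an operator is a simplified
	  Rocchio-style term weighting, sketched below. This technique is
	  swapped in for illustration and is not something this document
	  prescribes; the function names and sample texts are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def example_weights(relevant, irrelevant):
    """Each word gains 1 per relevant example containing it and loses 1
    per irrelevant example containing it."""
    weights = Counter()
    for text in relevant:
        weights.update(set(tokenize(text)))
    for text in irrelevant:
        weights.subtract(set(tokenize(text)))
    return weights


def score(text, weights):
    """Sum the weights of the distinct words in a candidate resource."""
    return sum(weights[token] for token in set(tokenize(text)))


weights = example_weights(
    relevant=["building model cars from kits"],
    irrelevant=["fashion model agency news"],
)

candidates = {
    "/runway.html": "fashion model on the runway",
    "/kits.html": "plastic kits for building cars",
}
ranked = sorted(candidates, key=lambda h: score(candidates[h], weights),
                reverse=True)
print(ranked)
```

	  The word "model" appears in both examples and so carries no
	  weight, while words unique to the relevant example pull similar
	  resources to the top.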

184	2.3 Navigation

186	2.3.1 Site Navigation

188	  While DAV itself is sufficient for basic site navigation, DASL can
189	  support fancier site navigation, where resources are sorted on the
190	  server, or filtered out on the server.

192	2.3.2 Browse Tree for exploring a resource space

194	  A DASL application could present a browse tree for a set of resources.
195	  In a browse tree some property is selected at each level of the tree
196	  to branch on. Thus if the top level property selected were resource
197	  type, then the unique values of the resource type property for all the
198	  resources would be the branches of the tree and would be presented to
199	  the user. So the user might see a list of resource types, say
200	  "Administrative memo", "Design spec", "Requirements spec", "Test
201	  plan", "Project schedule". Beneath that another property could be
202	  selected, say Project, which might display project names with values
203	  such as "Tuolumne", "Calaveras", "Russian", "Sacramento", "American",
204	  "Merced". At that point the user might want to view the list of
205	  resources within these categories and there might be only a few or
206	  just one project schedule for project Russian. The same resource space
207	  might also be explored using properties like Date and Author. (Note:
208	  DASL will most likely not explicitly support browse trees, but
209	  searches like 'docType = "Design spec" AND project = "Tuolumne" sorted
210	  by date' could be used to gather the raw data to generate the
211	  information for a node in the browse tree.)
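
	  Given the raw data from such searches, an application could build
	  the browse tree itself. The sketch below groups resources by one
	  property per level; the property names "docType" and "project"
	  follow the example query, while the function and the sample
	  resources are hypothetical.

```python
from collections import defaultdict


def browse_tree(resources, properties):
    """Group resources by the first property, then recurse on the rest.
    Leaves are sorted lists of hrefs."""
    if not properties:
        return sorted(r["href"] for r in resources)
    first, rest = properties[0], properties[1:]
    groups = defaultdict(list)
    for resource in resources:
        groups[resource[first]].append(resource)
    return {value: browse_tree(members, rest)
            for value, members in sorted(groups.items())}


resources = [
    {"href": "/r/sched.html", "docType": "Project schedule", "project": "Russian"},
    {"href": "/m/design.html", "docType": "Design spec", "project": "Merced"},
    {"href": "/r/design.html", "docType": "Design spec", "project": "Russian"},
]

tree = browse_tree(resources, ["docType", "project"])
for doc_type, projects in tree.items():
    print(doc_type)
    for project, hrefs in projects.items():
        print("   ", project, hrefs)
```

	  Passing the properties in a different order, or using properties
	  like date and author, yields a different view of the same
	  resources.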

213	2.3.3 Finding information on a particular topic in an organized
214	collection

216	  A collection may have been organized according to some taxonomy and
217	  the keywords chosen accordingly. The user, knowing or having scanned
218	  the taxonomy, presents a query for general subject equal to gardening
219	  and subordinate subject equal to bonsai.

221	2.3.4 Finding information on a particular topic in an unorganized
222	collection

224	  A collection may not have been organized according to some taxonomy or
225	  the taxonomy may not be detailed enough for the user's purposes, or
226	  may be irrelevant to the user's interest. In this case content based
227	  search becomes crucial. A user could search for resources containing
228	  all three of the words "small", "Japanese", and "trees", and likely
229	  obtain articles on bonsai. If the collection were organized with a
230	  taxonomy that the user didn't know about they could then discover the
231	  keywords from the resource found and use that to find other resources
232	  with the same categorization.

234	2.3.5 External taxonomy to view a DASL collection

236	  A user could view various DASL supporting collections according to the
237	  user's own taxonomy. Here we assume that the user has a taxonomy where
238	  for each category there is a complex query for which the relevance
239	  score returned establishes a resource's degree of membership in the
240	  category. A DASL application could issue a series of these queries on
241	  a collection resource and thus categorize the resources within the
242	  collection.

244	2.4 Search Isn't Always Enough

246	  The following scenarios deal with uses of search where the initial
247	  search or the basic result list isn't enough by itself to solve the
248	  user's information need.

250	2.4.1 Finding the right information by looking at the hit highlights

252	  Because natural language is so context dependent, a content-based
253	  search that retrieves most of the true positives inevitably retrieves
254	  false positives as well. The user is left to pick through the resources
255	  returned to find the ones that are actually relevant. Highlight
256	  information can be used to make this easier. A DASL application could
257	  present a list of the sentences that had the hit words in them. This
258	  is likely to allow the user to discard most of the false positives
259	  without having to view the whole resource.
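
	  Extracting the sentences that contain hit words might be sketched
	  as below. The sentence splitter is a naive regular expression and
	  the function name is hypothetical; both are assumptions of this
	  example.

```python
import re


def hit_sentences(text, words):
    """Return the sentences that contain at least one of the hit words."""
    wanted = {word.lower() for word in words}
    # Naive split: a sentence ends at ".", "!" or "?" followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [s for s in sentences
            if wanted & set(re.findall(r"[a-z0-9]+", s.lower()))]


text = ("Zinc is a trace element. It is found in many foods. "
        "Some studies link zinc to migraine relief.")

for sentence in hit_sentences(text, ["zinc", "migraine"]):
    print(sentence)
```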

261	2.4.2 Finding the information in a large resource

263	  The user may do a content-based search that returns a large resource
264	  of many pages, where the relevant material is in only one or a few
265	  parts of the resource. Hit highlighting will help the user find
266	  those parts. A smart DASL application could present links to jump to
267	  the next hit or concentration of hits.

269	2.4.3 Saved query result

271	  A user does a search and gets a very large set of results. The user
272	  then progressively narrows the search down by adding constraints to
273	  the previous search.
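
	  Progressive narrowing could be sketched as repeated filtering of
	  a saved result rather than re-running the whole search. The
	  function, the property table, and its keys are hypothetical; how
	  a server would actually store a result set is left open.

```python
def narrow(saved_result, properties, predicate):
    """Filter a saved list of hrefs without repeating the original search."""
    return [href for href in saved_result if predicate(properties[href])]


# Hypothetical property values keyed by href.
properties = {
    "/a.html": {"author": "John Doe", "year": 1998},
    "/b.html": {"author": "John Doe", "year": 1997},
    "/c.html": {"author": "Jane Roe", "year": 1998},
}

saved = list(properties)  # stands in for a broad initial search result
saved = narrow(saved, properties, lambda p: p["author"] == "John Doe")
saved = narrow(saved, properties, lambda p: p["year"] == 1998)
print(saved)
```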

275	2.4.4 Saved query result II

277	  A user does a search and spends some time improving the query so that
278	  it catches a large set of information on a particular topic without
279	  bringing in much noise. The query is made available to other users
280	  with similar information needs. The others are likely to combine that
281	  query with their own more temporary constraints to achieve their own
282	  information needs. If saved searches are explicitly part of the DASL
283	  protocol, it may be easier for servers to recognize repeated queries
284	  and avoid full re-execution of a search.

286	3. Internationalization

288	  The queries described above should work equally well for resources or
289	  properties in any language that can be expressed with Unicode. In
290	  particular, this means that when two strings are compared for equality
291	  or ordering, the customary language specific rules should be used.
292	  These rules will typically include rules for how case sensitivity is
293	  determined, the significance of diacritics, ordering of base
294	  characters, and sorting rules for strings. For example, in the Dutch
295	  language, a name such as "van Bree" sorts under "B" not "v". HTTP
296	  provides means of indicating the language of an entity, and XML
297	  provides means of indicating the language of an XML resource (the
298	  xml:lang attribute), and these should be used in DASL. Note that
299	  comparison of strings from different languages is out of scope for
300	  DASL.
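
	  The "van Bree" example can be illustrated with a sort key that
	  sets aside a leading name particle. The particle list below is a
	  small illustrative assumption, not a complete statement of Dutch
	  sorting rules; a real implementation would more likely rely on a
	  language-aware collation library.

```python
# Illustrative, incomplete list of Dutch name particles (an assumption).
DUTCH_PARTICLES = ("van der ", "van den ", "van ", "de ")


def dutch_sort_key(name):
    """Sort on the main part of the name, using the particle as a tiebreak."""
    lowered = name.casefold()
    for particle in DUTCH_PARTICLES:
        if lowered.startswith(particle):
            return (lowered[len(particle):], particle)
    return (lowered, "")


names = ["van Bree", "Visser", "Bakker", "de Jong", "Cornelissen"]

print(sorted(names))                      # plain code point order
print(sorted(names, key=dutch_sort_key))  # particle-aware order
```

	  With plain code point ordering "van Bree" lands at the end of the
	  list; with the particle-aware key it sorts between "Bakker" and
	  "Cornelissen".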

302	4. References

304	  [WEBDAV] Y. Y. Goland, E. J. Whitehead, Jr., A. Faizi, S. R. Carter,
305	  D. Jensen, "Extensions for Distributed Authoring and
306	  Versioning on the World Wide Web", April 1998. Internet-Draft,
307	  work in progress, draft-ietf-webdav-protocol-08.txt.

309	5. Authors' Addresses

311	  Rick Henderson
312	  Netscape Communications
313	  501 E. Middlefield Road
314	  Mountain View CA 94043