idnits 2.17.1

draft-henderson-dasl-scenarios-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Cannot find the required boilerplate sections (Copyright, IPR, etc.) in
     this document.
     Expected boilerplate is as follows today (2024-04-26) according to
     https://trustee.ietf.org/license-info :

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.a:

        This Internet-Draft is submitted in full conformance with the
        provisions of BCP 78 and BCP 79.

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 2:

        Copyright (c) 2024 IETF Trust and the persons identified as the
        document authors.  All rights reserved.

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 3:

        This document is subject to BCP 78 and the IETF Trust's Legal
        Provisions Relating to IETF Documents
        (https://trustee.ietf.org/license-info) in effect on the date of
        publication of this document.  Please review these documents
        carefully, as they describe your rights and restrictions with respect
        to this document.  Code Components extracted from this document must
        include Simplified BSD License text as described in Section 4.e of
        the Trust Legal Provisions and are provided without warranty as
        described in the Simplified BSD License.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  ** Missing expiration date.  The document expiration date should appear on
     the first and last page.

  ** The document seems to lack a 1id_guidelines paragraph about
     Internet-Drafts being working documents.

  ** The document seems to lack a 1id_guidelines paragraph about 6 months
     document validity.

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     current Internet-Drafts.

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     Shadow Directories.

  ** Bad filename characters: the document name given in the document,
     'draft-henderson-dasl-scenarios-00.html', contains other characters
     than digits, lowercase letters and dash.

  ** Missing revision: the document name given in the document,
     'draft-henderson-dasl-scenarios-00.html', does not give the document
     revision number

  == Mismatching filename: the document gives the document name as
     'draft-henderson-dasl-scenarios-00.html', but the file name used is
     'draft-henderson-dasl-scenarios-00'

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack a Security Considerations section.

  ** The document seems to lack an IANA Considerations section.  (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** The document seems to lack an Authors' Addresses Section.

  ** The document seems to lack separate sections for Informative/Normative
     References.  All references will be assumed normative when checking for
     downward references.

  ** The abstract seems to contain references ([WEBDAV]), which it shouldn't.
     Please replace those with straight textual mentions of the documents in
     question.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == Couldn't figure out when the document was first submitted -- there may
     comments or warnings related to the use of a disclaimer for pre-RFC5378
     work that could not be issued because of this.  Please check the Legal
     Provisions document at https://trustee.ietf.org/license-info to
     determine if you need the pre-RFC5378 disclaimer.

  -- The document date (Mar 23, 1999) is 9166 days in the past.  Is this
     intentional?

  -- Found something which looks like a code comment -- if you have code
     sections in the document, please surround them with '<CODE BEGINS>' and
     '<CODE ENDS>' lines.


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Outdated reference: A later version (-10) exists of
     draft-ietf-webdav-protocol-08


     Summary: 13 errors (**), 0 flaws (~~), 4 warnings (==), 2 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

INTERNET-DRAFT                                          Rick Henderson
draft-henderson-dasl-scenarios-00.html         Netscape Communications
                                                    September 18, 1998
                                                  Expires Mar 23, 1999

                          Scenarios for DASL

Status of this Memo

This document is an Internet draft. Internet drafts are working
documents of the Internet Engineering Task Force (IETF), its areas and
its working groups. Note that other groups may also distribute working
information as Internet drafts.

Internet Drafts are draft documents valid for a maximum of six months
and can be updated, replaced or obsoleted by other documents at any
time. It is inappropriate to use Internet drafts as reference material
or to cite them other than as "work in progress".

To learn the current status of any Internet draft please check the
"1id-abstracts.txt" listing contained in the Internet drafts shadow
directories on ftp.is.co.za (Africa), nic.nordu.net (Europe),
munnari.oz.au (Pacific Rim), ds.internic.net (US East coast) or
ftp.isi.edu (US West coast). Further information about the IETF can be
found at URL: http://www.ietf.org/.

Distribution of this document is unlimited.
Please send comments to the mailing list at www-webdav-dasl@w3.org,
which may be joined by sending a message with subject "subscribe" to
www-webdav-dasl-request@w3.org.

Discussions on the list are archived at
http://www.w3.org/pub/WWW/Archives/Public/www-webdav-dasl.

Abstract

The Distributed Authoring and Versioning protocol [WEBDAV] defines
simple mechanisms to assign and retrieve values for properties. This
document presents scenarios for a WebDAV extension to support
efficient searching for resources based on WebDAV properties and
content. These scenarios are intended to suggest some of the uses that
DASL could be put to. This may in turn motivate decisions on what is
essential to DASL and what may be considered extra.

1. Introduction

The scenarios below are intended to provoke discussion of what DASL
should and shouldn't do. DASL need not support all of them. For each
scenario, the open questions are to what extent DASL should support it
and to what extent DASL is only a small piece of what it would take to
support it. At least one is probably impossible. These scenarios should
encompass most of the sorts of things that we expect DASL to play a
part in.

2. Scenarios

The scenarios below are roughly grouped into the following topics:
Resource Management, Seeking Information, Navigation, and "Search
Isn't Always Enough".

2.1 Resource Management

Search could be used to help keep track of what is going on with a set
of DAV resources. Some DASL queries that might help with this:
   * Search for all locked resources.
   * Search for all the owners of locked resources.
   * Search for resources that have been locked for more than 1 week.
     [Though desirable, this is impossible since DAV does not record
     the time when a resource was locked.]
   * Search for resources that have not changed in the last year.
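
As a rough illustration only (this is not DASL syntax), the queries
listed above can be read as filters over resource properties. The
sketch below assumes a hypothetical in-memory record per resource; the
class, field, and function names are invented for this example. There
is deliberately no "locked for more than 1 week" filter, because the
lock time is not available.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional, Set


@dataclass
class Resource:
    # Hypothetical record; field names are illustrative, not DAV or DASL names.
    href: str
    last_modified: datetime
    lock_owner: Optional[str] = None  # None means the resource is not locked


def locked_resources(resources: List[Resource]) -> List[Resource]:
    """Resources that currently hold a lock."""
    return [r for r in resources if r.lock_owner is not None]


def lock_owners(resources: List[Resource]) -> Set[str]:
    """Distinct owners of locked resources."""
    return {r.lock_owner for r in resources if r.lock_owner is not None}


def unchanged_since(resources: List[Resource], cutoff: datetime) -> List[Resource]:
    """Resources whose last modification is older than the cutoff."""
    return [r for r in resources if r.last_modified < cutoff]


# Made-up sample data for the sketch.
now = datetime(1998, 9, 18)
sample = [
    Resource("/specs/design.html", datetime(1998, 9, 1), lock_owner="jdoe"),
    Resource("/memos/old.html", datetime(1997, 3, 2)),
    Resource("/plans/test.html", datetime(1998, 8, 20), lock_owner="asmith"),
]

print(sorted(lock_owners(sample)))
print([r.href for r in unchanged_since(sample, now - timedelta(days=365))])
```

In a real deployment these filters would be evaluated on the server as
part of the search, rather than on the client after fetching every
resource's properties.
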
These queries could help find resources that are likely to be
undergoing changes, who is changing them, which resources have been
locked for too long, and which resources are no longer changing.

2.2 Seeking Information

2.2.1 Finding a specific resource using content search

A user's information need may be like this: "I need that article I saw
a while back that made a connection between epilepsy, migraines, and
zinc." They can do a content-based search seeking resources that
contain all of the words epilepsy, migraine, and zinc.

2.2.2 Finding a specific resource by phrase

A user remembers a resource that they liked and wants to see it again,
but doesn't have it bookmarked and doesn't remember its location. They
do remember a key phrase from the content, though. They can search for
the phrase, such as "invisible car", and find the resource without
picking through a large number of irrelevant resources. Here it is
important to use phrase search instead of just finding resources with
both "invisible" and "car": these words are common enough that far
more resources contain both of them than contain the phrase "invisible
car".

2.2.3 Finding a specific resource by author and date range

A user's information need may be expressed something like this: "I
need that trip report that John Doe wrote last spring." They don't
know its location or its title. They can search for resources with
author equal to "John Doe" and create date greater than 1998/01/01 and
less than 1998/06/01. This may yield few enough resources to easily
find the one of interest.

2.2.4 Finding a specific resource using both content and property search

The user who wanted to find the trip report that John Doe wrote last
spring may find that John Doe was very prolific and wrote several
hundred things last spring. The user may do better using both content
and property search.
They can search for resources with author equal to "John Doe" and
create date greater than 1998/01/01 and less than 1998/06/01 that
contain some of the words IETF, Redmond, and DASL.

2.2.5 Finding resources of a particular kind

DASL could be used to find resources of a particular kind, such as
images. This could be used directly by an end user looking for
interesting images, or by a program that does some kind of processing
on the images, such as selecting GIF images that are portraits. A query
that asked for mime-type = image/* could gather that data.

2.2.6 Finding resources in a particular language

Assuming that a language attribute is set, a search could be
restricted to resources that are in a particular language, say German.
It would be possible for a site to set this attribute automatically
using language recognition technology.

2.2.7 Searching for information on multiple servers

A user seeking information of some sort may not know what server(s)
contain the information they are seeking. The DASL client program can
send the content-based query to several servers without having to
translate the query into a different query syntax for each server. For
property queries, the DASL client can query the attribute schema on
the DASL servers and send a property query, or a mixed property and
content query, to a set of DASL servers that share a common property
schema. The results from such a cross-server search can be sorted
according to property values or according to relevance score.

2.2.8 Stemming

If a user is searching for information about the hobby of building
model cars, relevant resources are likely to contain various forms of
those words: model, models, modeling, as well as car and cars.
Stemming saves them from entering all the various forms of the words
they may want to match.
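
A minimal sketch of the idea, assuming a toy English suffix list that
is invented for this example and is far cruder than a real stemmer.
For comparison it also includes plain prefix matching (right
truncation), which matches any word that merely begins with the query.

```python
# Toy suffix list for English; purely illustrative, not a real stemming algorithm.
SUFFIXES = ("ing", "ers", "er", "es", "s")


def toy_stem(word: str) -> str:
    """Strip one known suffix, keeping at least three characters of the stem."""
    w = word.lower()
    for suffix in SUFFIXES:
        if w.endswith(suffix) and len(w) - len(suffix) >= 3:
            return w[: -len(suffix)]
    return w


def stem_match(query: str, word: str) -> bool:
    """True when the query and the word reduce to the same toy stem."""
    return toy_stem(query) == toy_stem(word)


def truncation_match(prefix: str, word: str) -> bool:
    """Right truncation: true when the word starts with the given prefix."""
    return word.lower().startswith(prefix.lower())


words = ["car", "cars", "carbon", "card", "carcinoma"]
print([w for w in words if stem_match("car", w)])
print([w for w in words if truncation_match("car", w)])
```

With this word list the stem comparison keeps only "car" and "cars",
while the prefix comparison accepts all five words.
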

Stemming is sometimes confused with right truncation, but it is quite
different. In languages such as English one can approximate stemming
by right truncation of words, e.g. "model*" matches "model", "models",
"modeling", "modeler", etc. This doesn't work well for shorter words:
"car*" matches not only car and cars, but also carbon, carcinoma,
card, etc. For many languages right truncation doesn't work well at
all, since the forms of a word are produced by changing something in
the middle or at the beginning of the word.

2.2.9 Word proximity

In the stemming example our user was searching for fairly common
words, car and model, in an effort to find information on building
model cars. Many resources that have nothing to do with model cars or
building models of cars might contain both words. What the user wants
are resources where model and car are close together. A search that
takes into account the proximity of the search terms would help filter
out the irrelevant resources. This is distinct from phrase search as
described in 2.2.2 and from the conjunctive content search in 2.2.1.
It differs from phrase search in that the user here is probably also
interested in "car models", "model cars", and "model of a car". It
differs from conjunctive search in that the user has a reasonable
expectation that the words are likely to occur close together in a
relevant resource.

2.2.10 Query By Example

A user has done a search and found some relevant or nearly relevant
resources and some clearly irrelevant resources. Desiring a broader
and more specific set of resources, they specify one or more of the
relevant result resources and one or more of the irrelevant resources
to a query-by-example type operator. The result is a new set of
resources whose keywords overlap more with the relevant resources than
with the irrelevant resources.
This type of operator saves the user the considerable trouble of
constructing a new query that will filter out the irrelevant resources
while expanding the set of keywords from the relevant resources.

2.3 Navigation

2.3.1 Site Navigation

While DAV itself is sufficient for basic site navigation, DASL can
support fancier site navigation, where resources are sorted or
filtered on the server.

2.3.2 Browse Tree for exploring a resource space

A DASL application could present a browse tree for a set of resources.
In a browse tree some property is selected at each level of the tree
to branch on. Thus if the top-level property selected were resource
type, then the unique values of the resource type property for all the
resources would be the branches of the tree and would be presented to
the user. So the user might see a list of resource types, say
"Administrative memo", "Design spec", "Requirements spec", "Test
plan", "Project schedule". Beneath that another property could be
selected, say Project, which might display project names with values
such as "Tuolumne", "Calaveras", "Russian", "Sacramento", "American",
"Merced". At that point the user might want to view the list of
resources within these categories, and there might be only a few or
just one project schedule for project Russian. The same resource space
might also be explored using properties like Date and Author. (Note:
DASL will most likely not explicitly support browse trees, but
searches like 'docType = "Design spec" AND project = "Tuolumne" sorted
by date' could be used to gather the raw data to generate the
information for a node in the browse tree.)

2.3.3 Finding information on a particular topic in an organized
      collection

A collection may have been organized according to some taxonomy and
the keywords chosen accordingly.
The user, knowing or having scanned the taxonomy, presents a query for
general subject equal to gardening and subordinate subject equal to
bonsai.

2.3.4 Finding information on a particular topic in an unorganized
      collection

A collection may not have been organized according to some taxonomy,
or the taxonomy may not be detailed enough for the user's purposes, or
it may be irrelevant to the user's interest. In this case
content-based search becomes crucial. A user could search for
resources containing all three of the words "small", "Japanese", and
"trees", and likely obtain articles on bonsai. If the collection were
organized with a taxonomy that the user didn't know about, they could
then discover the keywords from the resources found and use them to
find other resources with the same categorization.

2.3.5 External taxonomy to view a DASL collection

A user could view various DASL-supporting collections according to the
user's own taxonomy. Here we assume that the user has a taxonomy
where, for each category, there is a complex query for which the
relevance score returned establishes a resource's degree of membership
in the category. A DASL application could issue a series of these
queries on a collection resource and thus categorize the resources
within the collection.

2.4 Search Isn't Always Enough

The following scenarios deal with uses of search where the initial
search or the basic result list isn't enough by itself to satisfy the
user's information need.

2.4.1 Finding the right information by looking at the hit highlights

Because natural language is so context dependent, content-based search
inevitably retrieves false positives if it retrieves most of the true
positives. The user is left to pick through the resources returned to
find the ones that are actually relevant. Highlight information can be
used to make this easier.
A DASL application could present a list of the sentences that had the
hit words in them. This is likely to allow the user to discard most of
the false positives without having to view the whole resource.

2.4.2 Finding the information in a large resource

The user may do a content-based search that returns a large resource
of many pages, but the relevant part of the resource is in only one or
a few parts of the resource. Hit highlighting will help the user find
those parts. A smart DASL application could present links to jump to
the next hit or concentration of hits.

2.4.3 Saved query result

A user does a search and gets a very large set of results. The user
then progressively narrows the search down by adding constraints to
the previous search.

2.4.4 Saved query result II

A user does a search and spends some time improving the query so that
it catches a large set of information on a particular topic without
bringing in much noise. The query is made available to other users
with similar information needs. The others are likely to combine that
query with their own more temporary constraints to meet their own
information needs. If saved searches are explicitly part of the DASL
protocol, it may be easier for servers to recognize repeated queries
and avoid full re-execution of a search.

3. Internationalization

The queries described above should work equally well for resources or
properties in any language that can be expressed with Unicode. In
particular, this means that when two strings are compared for equality
or ordering, the customary language-specific rules should be used.
These rules will typically include rules for how case sensitivity is
determined, the significance of diacritics, ordering of base
characters, and sorting rules for strings. For example, in the Dutch
language, a name such as "van Bree" sorts under "B", not "v".
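
One way an implementation might approximate such a rule is with a
language-specific sort key. The sketch below is an assumption made for
illustration: the prefix list is hypothetical and incomplete, and the
key ignores the other collation rules mentioned above (diacritics,
ordering of base characters).

```python
# Hypothetical, incomplete list of Dutch surname prefixes used only for this sketch.
# Longer prefixes come first so that "van der" is tried before "van".
DUTCH_PREFIXES = ("van der", "van den", "van de", "van", "den", "de", "ter")


def dutch_sort_key(name: str) -> str:
    """Return a key that skips a leading prefix so "van Bree" sorts under "B"."""
    lowered = name.lower()
    for prefix in DUTCH_PREFIXES:
        if lowered.startswith(prefix + " "):
            return lowered[len(prefix) + 1:]
    return lowered


names = ["van Bree", "Visser", "Bakker", "de Jong", "Cornelissen"]
print(sorted(names, key=dutch_sort_key))
```

A plain code-point sort of the same list would place "de Jong" and
"van Bree" after every capitalized name, which is not what a Dutch
reader expects.
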

HTTP provides means of indicating the language of an entity, and XML
provides means of indicating the language of an XML resource (the
xml:lang attribute), and these should be used in DASL. Note that
comparison of strings from different languages is out of scope for
DASL.

4. References

[WEBDAV] Y. Y. Goland, E. J. Whitehead, Jr., A. Faizi, S. R. Carter,
         D. Jensen, "Extensions for Distributed Authoring and
         Versioning on the World Wide Web", April 1998. Internet-draft,
         work in progress, draft-ietf-webdav-protocol-08.txt.

5. Authors' Addresses

Rick Henderson
Netscape Communications
501 E. Middlefield Road
Mountain View CA 94043