IETF NNTP Working Group                            N. Ballou, Microsoft
Internet Draft                                    B. Hernacki, Netscape
                                                   S. Waters, Microsoft
Document: draft-ietf-nntpext-srch-00.txt                  January, 1998

                    NNTP Full-text Search Extension

Status of this Memo

   This document is an Internet-Draft.  Internet-Drafts are working
   documents of the Internet Engineering Task Force (IETF), its areas,
   and its working groups.  Note that other groups may also distribute
   working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other
   documents at any time.  It is inappropriate to use Internet-Drafts
   as reference material or to cite them other than as "work in
   progress."

   To learn the current status of any Internet-Draft, please check
   the "1id-abstracts.txt" listing contained in the Internet-Drafts
   Shadow Directories on ds.internic.net (US East Coast),
   nic.nordu.net (Europe), ftp.isi.edu (US West Coast), or
   munnari.oz.au (Pacific Rim).

   A revised version of this draft document will be submitted to the
   RFC editor as a Proposed Standard for the Internet Community.
   Discussion and suggestions for improvement are requested.  This
   document will expire before July 1998.  Distribution of this
   draft is unlimited.

1. Abstract

   This document describes a set of enhancements to the Network News
   Transfer Protocol [NNTP-977] that allows full-text searching of
   news articles in multiple newsgroups.  The proposed SEARCH command
   supports functionality similar to the [IMAP4] SEARCH command,
   minus user-specific search keys (e.g., ANSWERED, DRAFT, FLAGGED,
   KEYWORD, NEW, OLD, RECENT, SEEN) and minus search keys based on
   headers that do not exist in news (e.g., CC, BCC, TO).

   The availability of the extensions described here will be
   advertised by the server using the extension-negotiation mechanism
   described in the new NNTP protocol specification currently being
   developed [NNTP-NEW].

2. Conventions used in this document

   In examples, "C:" and "S:" indicate lines sent by the client and
   server respectively.

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
   NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL"
   in this document are to be interpreted as described in [RFC-2119].

3. Introduction

   The NNTP SEARCH command is sent from the client to the server to
   specify and initiate a full-text search on articles in one or more
   newsgroups.  The NNTP SEARCH command is similar to the [IMAP4]
   SEARCH command, except that search keys for user properties and
   mail-specific headers are not present in NNTP SEARCH.  The result
   of an NNTP SEARCH is OVER data, as specified in [NNTP-NEW], for
   each article that satisfies the search criteria.

   In addition, the PAT command is extended so that it can be used to
   full-text search articles within a single newsgroup.  Both the
   headers and the body of the articles are searched.

3.1. New and Enhanced NNTP Commands

   This document defines one new NNTP command, two new options to the
   existing LIST command, and an enhancement to one existing command:

      * SEARCH
      * LIST SRCHFIELDS
      * LIST SEARCHABLE
      * PAT

   The SEARCH command runs a one-time search, returning overview-like
   data.

   The LIST SRCHFIELDS command returns the fields that the server
   allows in full-text searches.

   The LIST SEARCHABLE command allows the client to determine which
   newsgroups are full-text searchable.

   The PAT command accepts the pseudo-header ":TEXT", which specifies
   a full-text (headers and body) search of the articles in a single
   newsgroup.

4. Use of NNTP Extension Mechanism

   The NNTP extension mechanism allows a server to describe its
   capabilities.  The following extensions advertise the capabilities
   described in this document.

4.1. SRCH Extension

   The SRCH extension means that the server supports the following
   commands: SEARCH, LIST SEARCHABLE, LIST SRCHFIELDS.

4.2. PATTEXT Extension

   The PATTEXT extension means that the server supports the :TEXT
   header in the PAT command, as described by this document.

5. SEARCH Command

   Arguments:  optional newsgroup specification
               searching criteria (one or more)

   Responses:  224 overview information follows
               412 no newsgroup selected
               462 error performing search
               463 too many hits
               480 authentication required
               501 command syntax error
               502 no permission

   The SEARCH command searches the newsgroups for articles that match
   the given searching criteria.  Searching criteria consist of one
   or more search keys.  If there are articles that match the search
   criteria, the server responds with code 224 and returns OVER data
   for each matching article in a format similar to that described in
   [NNTP-NEW], with one exception: the article number field is
   changed to a format that supports searches over multiple
   newsgroups.  The article ID field for SEARCH OVER data uses the
   format newsgroup:art-ID rather than just an article number as
   defined in [NNTP-NEW] (note: this is the same format used by the
   Xref header).
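
   The newsgroup:art-ID field described above can be split on the
   client side with very little code.  The following is a minimal
   sketch and not part of the protocol definition: the function names
   are hypothetical, and it assumes that newsgroup names never
   contain a colon and that fields in a response line are separated
   by literal tab characters.

```python
def parse_search_article_id(field: str) -> tuple[str, int]:
    """Split a SEARCH article ID of the form newsgroup:art-ID."""
    # rpartition splits on the last colon; the part after it must be digits.
    newsgroup, sep, art_id = field.rpartition(":")
    if not sep or not newsgroup or not (art_id.isascii() and art_id.isdigit()):
        raise ValueError(f"not a newsgroup:art-ID field: {field!r}")
    return newsgroup, int(art_id)


def article_id_from_line(line: str) -> tuple[str, int]:
    """Extract the article ID from the first tab-separated field of a line."""
    first_field = line.split("\t", 1)[0]
    return parse_search_article_id(first_field.strip())


if __name__ == "__main__":
    print(article_id_from_line("comp.object:573\tRE: object-oriented langs"))
```

   A plain article number with no newsgroup prefix, as used by the
   [NNTP-NEW] OVER format, is rejected by this sketch with a
   ValueError.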

   A response of 501 indicates a syntax error in the search criteria.
   A response of 502 indicates that the user does not have permission
   to search one or more of the specified newsgroups.  If the search
   criteria did not specify a newsgroup, and there is no current
   newsgroup (i.e., one set using the NNTP GROUP command), then the
   server returns the error code 412, indicating that no newsgroup
   has been specified.  A response of 462 indicates that the server
   encountered an error when processing the search.  Some
   implementations may wish to limit the maximum number of articles
   that can match a search.  A response of 463 indicates that the
   maximum number of hits has been exceeded.  This response may also
   be used to indicate that a search request is not being performed
   because it is anticipated to produce too many matches.  An example
   would be searching for a single character.

   When multiple keys are specified, the result is the intersection
   (AND function) of all the articles that match those keys.  For
   example, the criteria FROM "SMITH" SINCE 1-Feb-1994 refers to all
   articles from Smith that arrived on the server on or after
   February 1, 1994.  A search key may also be a parenthesized list
   of one or more search keys (e.g., for use with the OR and NOT
   keys).

   Server implementations MAY exclude [MIME-1] body parts with
   terminal content types other than TEXT and MESSAGE from
   consideration in SEARCH matching.

   The optional newsgroup specification consists of the word _IN_
   followed by either a wildcard character _*_ - indicating a search
   over all newsgroups - or a comma-separated list of newsgroup
   names.  A newsgroup name can end with the wildcard string _.*_,
   indicating a search over a sub-hierarchy of the newsgroup name
   space.  If no newsgroup specification is given, the search is over
   the current newsgroup.
   If there is no current newsgroup, the server returns the 412 error
   code.

   The ON, BEFORE, and SINCE search criteria use the same date as
   used in the NNTP NEWNEWS command in [NNTP-NEW] - the date the
   article arrived on the server.  A server indicates support for the
   ON, BEFORE, and SINCE search criteria by listing :Date in the LIST
   SRCHFIELDS response.

   The defined search keys are as follows.  Refer to the Formal
   Syntax section for the precise syntactic definitions of the
   arguments.

   <range>        Articles with article numbers corresponding to
                  the specified range.

   ALL            All articles in the current newsgroup; the
                  default initial key for ANDing.

   BEFORE <date>  Articles whose server arrival date is earlier
                  than the specified date.

   BODY <string>  Articles that contain the specified string in
                  the body of the message.

   FROM <string>  Articles that contain the specified string in
                  the article structure's FROM field.

   HEADER <field-name> <string>
                  Articles that have a header with the specified
                  field-name (as defined in [RFC-822]) and that
                  contains the specified string in the [RFC-822]
                  field-body.  If the <string> is the empty string
                  (""), then a match implies that the specified
                  header does not exist.

   LARGER <number>
                  Articles with a size larger than the specified
                  number of octets.

   NOT <search-key>
                  Articles that do not match the specified search
                  key.

   ON <date>      Articles whose server arrival date is within the
                  specified date.

   OR <search-key> <search-key>
                  Articles that match either search key.

   SENTBEFORE <date>
                  Articles whose [RFC-822] Date: header is earlier
                  than the specified date.

   SENTON <date>  Articles whose [RFC-822] Date: header is within
                  the specified date.

   SENTSINCE <date>
                  Articles whose [RFC-822] Date: header is within
                  or later than the specified date.

   SINCE <date>   Articles whose server arrival date is within or
                  later than the specified date.

   SMALLER <number>
                  Articles with a size smaller than the specified
                  number of octets.

   SUBJECT <string>
                  Articles that contain the specified string in
                  the article structure's SUBJECT field.

   TEXT <string>  Articles that contain the specified string in
                  the header or body of the message.

   Example:
      C: SEARCH FROM "Smith" SINCE 1-Feb-1994
      S: 224 overview information follows
      S: comp.object:573 \t RE: object-oriented langs \t \
         "John Smith" \t Sun, 03 Nov 1996 \
         14:25:05 -0800 \t <01cbc9d5f3c70$eab9a2cd@xyz.com> \
         \t 4080 \t 33
      S: .

   Note: each field in the OVER response is separated by a tab,
   shown as \t in the example above.

5.1. Search Formal Syntax

   The search query syntax is derived from the search syntax defined
   for the IMAP4 protocol.  It is somewhat different because of the
   way international character sets need to be encoded.  The following
   syntax specification uses the augmented Backus-Naur Form (ABNF) as
   described in [ABNF].

   Except as noted otherwise, all alphabetic characters are
   case-insensitive.  The use of upper or lower case characters to
   define token strings is for editorial clarity only.
   Implementations MUST accept these strings in a case-insensitive
   fashion.
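
   The case-insensitivity requirement above can be met by normalizing
   each keyword token before comparing it with the defined search
   keys.  The following is a minimal sketch under stated assumptions,
   not a definitive implementation: SEARCH_KEYS and
   normalize_search_key are hypothetical names, the key set is copied
   from the list of defined search keys in this document, and the
   token is assumed to have already been separated from its
   arguments.

```python
# Search keys defined by this document, stored in canonical upper case.
SEARCH_KEYS = frozenset({
    "ALL", "BEFORE", "BODY", "FROM", "HEADER", "LARGER", "NOT", "ON",
    "OR", "SENTBEFORE", "SENTON", "SENTSINCE", "SINCE", "SMALLER",
    "SUBJECT", "TEXT",
})


def normalize_search_key(token: str) -> str:
    """Return the canonical form of a search key, ignoring letter case."""
    key = token.upper()
    if key not in SEARCH_KEYS:
        raise ValueError(f"unknown search key: {token!r}")
    return key


if __name__ == "__main__":
    for token in ("from", "SentSince", "TEXT"):
        print(token, "->", normalize_search_key(token))
```

   A server following this approach might map the ValueError to the
   501 (command syntax error) response.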

   astring         ::= atom / string

   atom            ::= 1*ATOM_CHAR

   ATOM_CHAR       ::=

   atom_specials   ::= "," / "(" / ")" / SPACE / CTL / "*" /
                       quoted_specials

   CHAR            ::=

   CTL             ::=

   date            ::= date_text / <"> date_text <">

   date_day        ::= 1*2digit
                       ;; Day of month

   date_month      ::= "Jan" / "Feb" / "Mar" / "Apr" / "May" / "Jun"
                       / "Jul" / "Aug" / "Sep" / "Oct" / "Nov" /
                       "Dec"

   date_text       ::= date_day "-" date_month "-" date_year

   date_year       ::= 4digit

   digit           ::= "0" / digit_nz

   digit_nz        ::= "1" / "2" / "3" / "4" / "5" / "6" / "7" / "8"
                       / "9"

   header_fld_name ::= sstring

   mstring         ::= A MIME encoded string surrounded by double
                       quotes

   newsgroup       ::= atom [ ".*"]

   newsgroups      ::= "*" / newsgroup_list

   newsgroup_list  ::= newsgroup [ "," newsgroup_list]

   number          ::= 1*digit
                       ;; Unsigned 32-bit integer
                       ;; (0 <= n < 4,294,967,296)

   nz_number       ::= digit_nz *digit
                       ;; Non-zero unsigned 32-bit integer
                       ;; (0 < n < 4,294,967,296)

   QUOTED_CHAR     ::= / "\"
                       quoted_specials

   quoted_specials ::= <"> / "\"

   range           ::= nz_number / nz_number "-" [ nz_number ]
                       ;; Identifies a range of articles.

   search          ::= "SEARCH" SPACE
                       ["IN" SPACE newsgroups SPACE]
                       1#search_key

   search_key      ::= "ALL" / "BODY" SPACE sstring / "FROM" SPACE
                       sstring / "ON" SPACE date / "SINCE" SPACE date
                       / "BEFORE" SPACE date / "SUBJECT" SPACE
                       sstring / "TEXT" SPACE sstring / "HEADER"
                       SPACE header_fld_name SPACE sstring / "LARGER"
                       SPACE number / "NOT" SPACE search_key / "OR"
                       SPACE search_key SPACE search_key /
                       "SENTBEFORE" SPACE date / "SENTON" SPACE date
                       / "SENTSINCE" SPACE date / "SMALLER" SPACE
                       number / range / "(" 1#search_key ")"

   SPACE           ::= 1*

   sstring         ::= astring / mstring

   string          ::= <"> *QUOTED_CHAR <">

   TEXT_CHAR       ::=

5.2. LIST SRCHFIELDS Command

   Arguments:  none

   Responses:  224 data follows

   The LIST SRCHFIELDS command returns a list of the fields that can
   be specified in full-text search queries on the server.  The
   response is a list of searchable fields, one per line.  A _._ on
   its own line terminates the list.  The fields are either news
   article headers or non-header fields supported by the query
   syntax.

   The three currently defined non-header fields are _:Body_,
   _:Text_, and _:Date_.  _:Text_ means all the searchable text in the
   article, and indicates that the _TEXT_ keyword is supported in the
   search query language.  _:Body_ means the body of the article,
   excluding the headers, and indicates that the _BODY_ keyword is
   supported in the search query language.  _:Date_ means the date on
   which an article arrived on a server - similar to the date used in
   the NNTP NEWNEWS command - and indicates that the _ON_, _SINCE_,
   and _BEFORE_ keywords are supported in the search query language.

   The _TEXT_ and _BODY_ search query fields are optional, but the
   server must indicate whether they are supported or not in the LIST
   SRCHFIELDS response.

   Example:
      C: LIST SRCHFIELDS
      S: 224 Data follows.
      S: From
      S: Date
      S: Subject
      S: :Text
      S: .

5.3. LIST SEARCHABLE Command

   Arguments:  none

   Responses:  224 Data Follows

   The LIST SEARCHABLE command returns a list of strings that define
   which newsgroups are being indexed by the news server and are thus
   available for searching.  In addition, the character sets allowed
   for each group are returned.

   When newsgroups are indexed, the server returns 224, followed by
   each portion of the tree that is indexed.  If all groups are
   indexed, a line with "*" is returned.  If only some parts of the
   newsgroup hierarchy are indexed, each part is identified by its
   hierarchy prefix followed by ".*" (for example, comp.lang.*).
   Clients should not assume that these will always be top-level
   hierarchies.
   A "." on its own line terminates the list.

   Example:
      C: LIST SEARCHABLE
      S: 224 Data follows.
      S: alt.*
      S: comp.lang.*
      S: mcom.*
      S: .

5.4. PAT Command Enhancement

   Arguments:  header range|<message-id> [pat [pat...]]

   Responses:

   The PAT command is enhanced in a simple way: the new value _:TEXT_
   is supported as a header when invoking the command.  The :TEXT
   header requests a full-text search of the body and all headers of
   the specified articles.  Other than the addition of a new header
   name, the PAT command arguments are the same as specified in
   [NNTP-NEW].

   If :TEXT is not specified as the header, the response is unchanged
   from the existing PAT command: each result line contains the
   article number and the value of the header that matched the
   pattern.

   If the :TEXT header is specified, the constant string _TEXT_ is
   returned in place of the value of the header that matched the
   pattern.

   Example:
      C: PAT :TEXT 1000-2000 searchtext
      S: 221 Header follows
      S: 1021 TEXT
      S: 1024 TEXT
      S: .

6. Security Considerations

   The search commands must be implemented in a way that does not
   allow access to articles in newsgroups that a client is otherwise
   restricted from reading due to access control rules.

7. References

   [ABNF]      DRUMS working group, Dave Crocker, Editor, _Augmented
               BNF for Syntax Specifications: ABNF_,
               draft-drums-abnf-02.txt (work in progress), Internet
               Mail Consortium, April 1997.

   [IMAP4]     Crispin, M., Internet Message Access Protocol -
               Version 4rev1, Request for Comments (RFC) 2060,
               December 1996.

   [MIME-1]    Freed, N., and N. Borenstein, MIME (Multipurpose
               Internet Mail Extensions) Part One: Format of Internet
               Message Bodies, Request for Comments (RFC) 2045,
               November 1996.

   [NNTP-977]  Kantor, B., and P. Lapsley, Network News Transfer
               Protocol, Request for Comments (RFC) 977, February
               1986.

   [NNTP-NEW]  Barber, S., Network News Transfer Protocol, Internet
               Draft, draft-ietf-nntpext-base-02.txt, September 1997.

   [RFC-2119]  Bradner, S., _Key words for use in RFCs to Indicate
               Requirement Levels_, RFC 2119, Harvard University,
               March 1997.

8. Acknowledgments

   TBD

9. Authors' Addresses

   Nathaniel Ballou
   Microsoft Corporation
   One Microsoft Way
   Redmond, WA 98052-6399
   Phone: 425-703-0574
   Email: natba@microsoft.com

   Brian Hernacki
   Netscape Communications
   501 E. Middlefield Rd.
   Mountain View, CA 94043-4042
   Phone: 650-937-6738
   Email: bhern@netscape.com

   Stephen Waters
   Microsoft Corporation
   One Microsoft Way
   Redmond, WA 98052-6399
   Phone: 425-703-4972
   Email: swater@microsoft.com