idnits 2.17.1

draft-ietf-imapext-sort-04.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Looks like you're using RFC 2026 boilerplate. This must be updated to
     follow RFC 3978/3979, as updated by RFC 4748.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  ** Missing expiration date. The document expiration date should appear on
     the first and last page.

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     Shadow Directories.

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard

  == It seems as if not all pages are separated by form feeds - found 0 form
     feeds but 10 pages


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an Introduction section.

  ** The document seems to lack an IANA Considerations section. (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** The document seems to lack both a reference to RFC 2119 and the
     recommended RFC 2119 boilerplate, even if it appears to use RFC 2119
     keywords.

     RFC 2119 keyword, line 48: '...ient implementations SHOULD accept any...'

     RFC 2119 keyword, line 94: '...onnected clients MUST use exactly this...'

     RFC 2119 keyword, line 126: '...nd UTF-8 charsets MUST be implemented....'

     RFC 2119 keyword, line 267: '...ementations of SORT MUST implement the...'


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.
     If you have contacted all the original authors and they are all willing
     to grant the BCP78 rights to the IETF Trust, then this is fine, and you
     can ignore this comment. If not, you may need to add the pre-RFC5378
     disclaimer. (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (August 2000) is 8655 days in the past. Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

     No issues found here.

     Summary: 6 errors (**), 0 flaws (~~), 2 warnings (==), 2 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

2    IMAP Extensions Working Group                                 M. Crispin
3    INTERNET-DRAFT: IMAP SORT                                   K. Murchison
4    Document: internet-drafts/draft-ietf-imapext-sort-04.txt     August 2000

6               INTERNET MESSAGE ACCESS PROTOCOL - SORT EXTENSION

8    Status of this Memo

10      This document is an Internet-Draft and is in full conformance with
11      all provisions of Section 10 of RFC 2026.

13      Internet-Drafts are working documents of the Internet Engineering
14      Task Force (IETF), its areas, and its working groups. Note that
15      other groups may also distribute working documents as
16      Internet-Drafts.

18      Internet-Drafts are draft documents valid for a maximum of six months
19      and may be updated, replaced, or obsoleted by other documents at any
20      time. It is inappropriate to use Internet-Drafts as reference
21      material or to cite them other than as "work in progress."

23      The list of current Internet-Drafts can be accessed at
24      http://www.ietf.org/ietf/1id-abstracts.txt

26      To view the list Internet-Draft Shadow Directories, see
27      http://www.ietf.org/shadow.html.

29      A revised version of this document will be submitted to the RFC
30      editor as an Informational Document for the Internet Community.

32      A revised version of this draft document, describing an expanded
33      version of this protocol extension, will be submitted to the RFC
34      editor as a Proposed Standard for the Internet Community.

36      Discussion and suggestions for improvement are requested, and should
37      be sent to ietf-imapext@IMC.ORG. This document will expire before 10
38      January 2001. Distribution of this memo is unlimited.

40   Abstract

42      This document describes an experimental server-based sorting
43      extension to the IMAP4rev1 protocol, as implemented by the University
44      of Washington's IMAP toolkit. This extension provides substantial
45      performance improvements for IMAP clients which offer sorted views.

47      A server which supports this extension indicates this with a
48      capability name of "SORT". Client implementations SHOULD accept any
49      capability name which begins with "SORT" as indicating support for
50      the extension described in this document. This provides for future
51      upwards-compatible extensions.

53      At the time this document was written, the IMAP Extensions Working
54      Group (IETF-IMAPEXT) was considering upwards-compatible additions to
55      the SORT extension described in this document, tentatively called the
56      SORT2 extension.

58   Extracted Subject Text

60      The "SUBJECT" SORT criterion uses a version of the subject which has
61      specific subject artifacts of deployed Internet mail software
62      removed. Due to the complexity of these artifacts, the formal syntax
63      for the subject extraction rules is ambiguous. The following
64      procedure is followed to determine the actual "base subject" which is
65      used to sort by subject:

67           (1) Convert any RFC 2047 encoded-words in the subject to
68           UTF-8. Convert all tabs and continuations to space.
69           Convert all multiple spaces to a single space.

71           (2) Remove all trailing text of the subject that matches
72           the subj-trailer ABNF, repeat until no more matches are
73           possible.

75           (3) Remove all prefix text of the subject that matches the
76           subj-leader ABNF.

78           (4) If there is prefix text of the subject that matches the
79           subj-blob ABNF, and removing that prefix leaves a non-empty
80           subj-base, then remove the prefix text.

82           (5) Repeat (3) and (4) until no matches remain.

84      Note: it is possible to defer step (2) until step (6), but this
85      requires checking for subj-trailer in step (4).

87           (6) If the resulting text begins with the subj-fwd-hdr ABNF
88           and ends with the subj-fwd-trl ABNF, remove the
89           subj-fwd-hdr and subj-fwd-trl and repeat from step (2).

91           (7) The resulting text is the "base subject" used in the
92           SORT.

94      All servers and disconnected clients MUST use exactly this algorithm
95      when sorting by subject. Otherwise there is potential for a user to
96      get inconsistent results based on whether they are running in
97      connected or disconnected IMAP mode.

99   Additional Commands

101     This command is an extension to the IMAP4rev1 base protocol.

103     The section header is intended to correspond with where it would be
104     located in the main document if it was part of the base
105     specification.

107  6.3.SORT. SORT Command

109     Arguments:  sort program
110                 charset specification
111                 searching criteria (one or more)

113     Data:       untagged responses: SORT

115     Result:     OK - sort completed
116                 NO - sort error: can't sort that charset or
117                      criteria
118                 BAD - command unknown or arguments invalid

120        The SORT command is a variant of SEARCH with sorting semantics for
121        the results. SORT has two arguments before the searching criteria
122        argument: a parenthesized list of sort criteria, and the searching
123        charset.

125        Note that unlike SEARCH, the searching charset argument is
126        mandatory. The US-ASCII and UTF-8 charsets MUST be implemented.
127        All other charsets are optional.
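
Steps (1) through (7) of the Extracted Subject Text procedure can be sketched in Python as below. This is only an illustrative sketch, not part of the draft: the function name base_subject is hypothetical, the input is assumed to be already decoded from RFC 2047 encoded-words into Unicode text, whitespace folding is approximated by collapsing runs of space, tab, CR and LF, and the blob pattern is slightly looser than the BLOBCHAR rule because it does not restrict characters to 7-bit values.

```python
import re

# Building blocks mirroring the ABNF names (assumptions noted above).
_WSP = r"[ \t]"
_BLOB = r"\[[^\[\]]*\]" + _WSP + r"*"                        # subj-blob
_REFWD = r"(?:re|fwd?)" + _WSP + r"*(?:" + _BLOB + r")?:"    # subj-refwd
_LEADER = re.compile(
    r"(?:(?:" + _BLOB + r")*" + _REFWD + r"|" + _WSP + r")", re.I
)                                                            # subj-leader
_BLOB_RE = re.compile(_BLOB)
_TRAILER = re.compile(r"(?:\(fwd\)|" + _WSP + r")+$", re.I)  # subj-trailer


def base_subject(subject: str) -> str:
    """Return the base subject of an already-decoded subject string."""
    # Step (1), partially: fold tabs/continuations and collapse spaces.
    s = re.sub(r"[ \t\r\n]+", " ", subject)
    while True:
        # Step (2): strip every trailing "(fwd)" and whitespace.
        s = _TRAILER.sub("", s)
        # Steps (3)-(5): strip leaders and blobs until nothing changes.
        changed = True
        while changed:
            changed = False
            m = _LEADER.match(s)
            while m:
                s = s[m.end():]
                changed = True
                m = _LEADER.match(s)
            m = _BLOB_RE.match(s)
            # Only drop the blob if a non-empty base would remain.
            if m and m.end() < len(s):
                s = s[m.end():]
                changed = True
        # Step (6): unwrap "[fwd: ... ]" and start over from step (2).
        if s[:5].lower() == "[fwd:" and s.endswith("]"):
            s = s[5:-1]
        else:
            # Step (7): what is left is the base subject.
            return s
```

For instance, under these assumptions base_subject("Re: [list] Fwd: hello (fwd)") yields "hello", while a subject consisting only of "[blob]" is left unchanged because removing the blob would leave an empty base.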

129        There is also a UID SORT command which corresponds to SORT the way
130        that UID SEARCH corresponds to SEARCH.

132        The SORT command first searches the mailbox for messages that
133        match the given searching criteria using the charset argument for
134        the interpretation of strings in the searching criteria. It then
135        returns the matching messages in an untagged SORT response, sorted
136        according to one or more sort criteria.

138        If two or more messages exactly match according to the sorting
139        criteria, these messages are sorted according to the order in
140        which they appear in the mailbox. In other words, there is an
141        implicit sort criterion of "sequence number".

143        When multiple sort criteria are specified, the result is sorted in
144        the priority order that the criteria appear. For example,
145        (SUBJECT DATE) will sort messages in order by their subject text
146        and, for messages with the same subject text, will sort by their
147        sent date.

149        Untagged EXPUNGE responses are not permitted while the server is
150        responding to a SORT command, but are permitted during a UID SORT
151        command.

153        The defined sort criteria are as follows. Refer to the Formal
154        Syntax section for the precise syntactic definitions of the
155        arguments. If the associated RFC-822 header for a particular
156        criterion is absent, it is treated as the empty string. The empty
157        string always collates before non-empty strings.

159        ARRIVAL
160           Internal date and time of the message. This differs from the
161           ON criterion in SEARCH, which uses just the internal date.

163        CC
164           RFC-822 local-part of the first "cc" address.

166        DATE
167           Sent date and time from the Date: header, adjusted by time
168           zone. This differs from the SENTON criterion in SEARCH, which
169           uses just the date and not the time, nor adjusts by time zone.

171        FROM
172           RFC-822 local-part of the "From" address.

174        REVERSE
175           Followed by another sort criterion, has the effect of that
176           criterion but in reverse order.

178        SIZE
179           Size of the message in octets.

181        SUBJECT
182           Extracted subject text.

184        TO
185           RFC-822 local-part of the first "To" address.

187     Example:    C: A282 SORT (SUBJECT) UTF-8 SINCE 1-Feb-1994
188                 S: * SORT 2 84 882
189                 S: A282 OK SORT completed
190                 C: A283 SORT (SUBJECT REVERSE DATE) UTF-8 ALL
191                 S: * SORT 5 3 4 1 2
192                 S: A283 OK SORT completed
193                 C: A284 SORT (SUBJECT) US-ASCII TEXT "not in mailbox"
194                 S: * SORT
195                 S: A284 OK SORT completed

197  Additional Responses

199     This response is an extension to the IMAP4rev1 base protocol.

201     The section heading of this response is intended to correspond with
202     where it would be located in the main document.

204  7.2.SORT. SORT Response

206     Data:       one or more numbers

208        The SORT response occurs as a result of a SORT or UID SORT
209        command. The number(s) refer to those messages that match the
210        search criteria. For SORT, these are message sequence numbers;
211        for UID SORT, these are unique identifiers. Each number is
212        delimited by a space.
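
A client might read an untagged SORT response along the following lines. This is a minimal sketch rather than anything the draft prescribes: the function name parse_sort_response is hypothetical, the "S: " annotation used in the examples is assumed to have been removed already, and the caller is assumed to know whether the numbers are sequence numbers (SORT) or unique identifiers (UID SORT).

```python
from typing import List


def parse_sort_response(line: str) -> List[int]:
    """Parse one untagged SORT response line into its list of numbers."""
    parts = line.strip().split(" ")
    # The response must start with the untagged marker and the SORT atom.
    if len(parts) < 2 or parts[0] != "*" or parts[1].upper() != "SORT":
        raise ValueError("not an untagged SORT response: %r" % line)
    numbers = []
    for token in parts[2:]:
        # Each number is a non-zero decimal number (nz-number).
        if not (token.isascii() and token.isdigit()) or token[0] == "0":
            raise ValueError("bad number in SORT response: %r" % token)
        numbers.append(int(token))
    return numbers
```

The order of the returned list is the sort order chosen by the server, so a client should preserve it as-is. An empty list corresponds to a bare "* SORT" line, as in the A284 example where nothing matched.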

214     Example:    S: * SORT 2 3 6

216  Formal Syntax of SORT Commands and Responses

218     sort-data       = "SORT" *(SP nz-number)

220     sort            = ["UID" SP] "SORT" SP
221                       "(" sort-criterion *(SP sort-criterion) ")"
222                       SP search_charset 1*(SP search_key)

224     sort-criterion  = ["REVERSE" SP] sort-key

226     sort-key        = "ARRIVAL" / "CC" / "DATE" / "FROM" / "SIZE" /
227                       "SUBJECT" / "TO"

229     The following syntax describes subject extraction rules (2)-(6):

231     subject         = *subj-leader [subj-middle] *subj-trailer

233     subj-refwd      = ("re" / ("fw" ["d"])) *WSP [subj-blob] ":"

235     subj-blob       = "[" *BLOBCHAR "]" *WSP

237     subj-fwd        = subj-fwd-hdr subject subj-fwd-trl

239     subj-fwd-hdr    = "[fwd:"

241     subj-fwd-trl    = "]"

243     subj-leader     = (*subj-blob subj-refwd) / WSP

245     subj-middle     = *subj-blob (subj-base / subj-fwd)
246                         ; last subj-blob is subj-base if subj-base would
247                         ; otherwise be empty

249     subj-trailer    = "(fwd)" / WSP

251     subj-base       = NONWSP *([*WSP] NONWSP)
252                         ; can be a subj-blob

254     BLOBCHAR        = %x01-5a / %x5c / %x5e-7f
255                         ; any CHAR except '[' and ']'

257     NONWSP          = %x01-08 / %x0a-1f / %x21-7f
258                         ; any CHAR other than WSP

260  Security Considerations

262     Security issues are not discussed in this memo.

264  Internationalization Considerations

266     By default, strings are sorted according to the "minimum sorting
267     collation algorithm". All implementations of SORT MUST implement the
268     minimum sorting collation algorithm.

270     In the minimum sorting collation algorithm, the Basic Latin
271     alphabetics (U+0041 to U+005A uppercase, U+0061 to U+007A lowercase)
272     are sorted in a case-insensitive fashion; that is, "A" (U+0041) and
273     "a" (U+0061) are treated as exact equals. All other characters are
274     sorted according to their octet values, as expressed in UTF-8. No
275     attempt is made to treat composed characters specially, or to do
276     case-insensitive comparisons of composed characters.
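
One way to picture the minimum sorting collation algorithm is as a sort key. The sketch below is illustrative only: the name min_collation_key is hypothetical, and folding lowercase to uppercase (rather than the reverse) is an assumption, since the text above only requires that the two cases compare as equal and does not say where the folded letters fall relative to the characters between "Z" and "a".

```python
def min_collation_key(text: str) -> bytes:
    """Sort key: fold only Basic Latin a-z to A-Z, then compare UTF-8 octets."""
    folded = "".join(
        # Assumption: fold to uppercase; only U+0061..U+007A are touched.
        chr(ord(ch) - 32) if "a" <= ch <= "z" else ch
        for ch in text
    )
    # Every other character is ordered by its UTF-8 octet values.
    return folded.encode("utf-8")


# Usage: Python's sort is stable, so equal keys keep their original order,
# which matches the implicit tie-break on mailbox order.
subjects = ["b", "A", "a", "B", ""]
ordered = sorted(subjects, key=min_collation_key)
```

Because the key of the empty string is the empty byte string, it compares lower than any non-empty key, which agrees with the rule that the empty string collates before non-empty strings. Characters with diacriticals are not folded, so an uppercase and a lowercase E with acute accent produce different keys.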

278          Note: this means, among other things, that the composed
279          characters in the Latin-1 Supplement are not compared in
280          what would be considered an ISO 8859-1 "case-insensitive"
281          fashion. Case comparison rules for characters with
282          diacriticals differ between languages; the minimum sorting
283          collation does not attempt to deal with this at all. This
284          is reserved for other sorting collations, which may be
285          language-specific.

287     Other sorting collations, and the ability to change the sorting
288     collation, will be defined in a separate document dealing with IMAP
289     internationalization.

291     It is anticipated that there will be a generic Unicode sorting
292     collation, which will provide generic case-insensitivity for
293     alphabetic scripts, specification of composed character handling, and
294     language-specific sorting collations. A server which implements
295     non-default sorting collations will modify its sorting behavior
296     according to the selected sorting collation.

298     Non-English translations of "Re" or "Fw"/"Fwd" are not specified for
299     removal in the extracted subject text process. By specifying that
300     only the English forms of the prefixes are used, it becomes a simple
301     display time task to localize the prefix language for the user. If,
302     on the other hand, prefixes in multiple languages are permitted, the
303     result is a geometrically complex, and ultimately unimplementable,
304     task. In order to improve the ability to support non-English display
305     in Internet mail clients, only the English form of these prefixes
306     should be transmitted in Internet mail messages.

308  Authors' Addresses

310     Mark R. Crispin
311     Networks and Distributed Computing
312     University of Washington
313     4545 15th Avenue NE
314     Seattle, WA 98105-4527

316     Phone: (206) 543-5762

318     EMail: MRC@CAC.Washington.EDU

320     Kenneth Murchison
321     Oceana Matrix Ltd.
322     21 Princeton Place
323     Orchard Park, NY 14127

325     Phone: (716) 662-8973 x26

327     EMail: ken@oceana.com