idnits 2.17.1

draft-ietf-imapext-sort-10.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Looks like you're using RFC 2026 boilerplate.  This must be updated to
     follow RFC 3978/3979, as updated by RFC 4748.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  ** Missing expiration date.  The document expiration date should appear on
     the first and last page.

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     Shadow Directories.

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard

  == It seems as if not all pages are separated by form feeds - found 0 form
     feeds but 10 pages


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an Introduction section.

  ** The document seems to lack an IANA Considerations section.  (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** The document seems to lack a both a reference to RFC 2119 and the
     recommended RFC 2119 boilerplate, even if it appears to use RFC 2119
     keywords.

     RFC 2119 keyword, line 46: '...ient implementations SHOULD accept any...'

     RFC 2119 keyword, line 92: '...onnected clients MUST use exactly this...'

     RFC 2119 keyword, line 124: '...nd UTF-8 charsets MUST be implemented....'

     RFC 2119 keyword, line 281: '...ementations of SORT MUST implement the...'


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.
     If you have contacted all the original authors and they are all willing
     to grant the BCP78 rights to the IETF Trust, then this is fine, and you
     can ignore this comment.  If not, you may need to add the pre-RFC5378
     disclaimer.  (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (June 2002) is 7984 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

     No issues found here.

     Summary: 6 errors (**), 0 flaws (~~), 2 warnings (==), 2 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

2    IMAP Extensions Working Group                                 M. Crispin
3    INTERNET-DRAFT: IMAP SORT                                   K. Murchison
4    Document: internet-drafts/draft-ietf-imapext-sort-10.txt       June 2002

6                 INTERNET MESSAGE ACCESS PROTOCOL - SORT EXTENSION

8    Status of this Memo

10      This document is an Internet-Draft and is in full conformance with
11      all provisions of Section 10 of RFC 2026.

13      Internet-Drafts are working documents of the Internet Engineering
14      Task Force (IETF), its areas, and its working groups.  Note that
15      other groups may also distribute working documents as
16      Internet-Drafts.

18      Internet-Drafts are draft documents valid for a maximum of six months
19      and may be updated, replaced, or obsoleted by other documents at any
20      time.  It is inappropriate to use Internet-Drafts as reference
21      material or to cite them other than as "work in progress."

23      The list of current Internet-Drafts can be accessed at
24      http://www.ietf.org/ietf/1id-abstracts.txt

26      To view the list of Internet-Draft Shadow Directories, see
27      http://www.ietf.org/shadow.html.

29      A revised version of this document will be submitted to the RFC
30      editor as an Informational Document for the Internet Community.

32      A revised version of this draft document will be submitted to the RFC
33      editor as a Proposed Standard for the Internet Community.  Discussion
34      and suggestions for improvement are requested, and should be sent to
35      ietf-imapext@IMC.ORG.  This document will expire before 22 December
36      2002.  Distribution of this memo is unlimited.

38   Abstract

40      This document describes an experimental server-based sorting
41      extension to the IMAP4rev1 protocol, as implemented by the University
42      of Washington's IMAP toolkit.  This extension provides substantial
43      performance improvements for IMAP clients which offer sorted views.

45      A server which supports this extension indicates this with a
46      capability name of "SORT".  Client implementations SHOULD accept any
47      capability name which begins with "SORT" as indicating support for
48      the extension described in this document.  This provides for future
49      upwards-compatible extensions.

51      At the time this document was written, the IMAP Extensions Working
52      Group (IETF-IMAPEXT) was considering upwards-compatible additions to
53      the SORT extension described in this document, tentatively called the
54      SORT2 extension.

56   Base Subject Text

58      The "SUBJECT" SORT criterion uses the "base subject," which has
59      specific subject artifacts of deployed Internet mail software removed.
60      Due to the complexity of these artifacts, the formal syntax for the
61      subject extraction rules is ambiguous.  The following procedure is
62      followed to determine the actual "base subject" which is used to sort
63      by subject:

65           (1) Convert any RFC 2047 encoded-words in the subject to
66           UTF-8.  Convert all tabs and continuations to space.
67           Convert all multiple spaces to a single space.

69           (2) Remove all trailing text of the subject that matches
70           the subj-trailer ABNF, repeat until no more matches are
71           possible.

73           (3) Remove all prefix text of the subject that matches the
74           subj-leader ABNF.

76           (4) If there is prefix text of the subject that matches the
77           subj-blob ABNF, and removing that prefix leaves a non-empty
78           subj-base, then remove the prefix text.

80           (5) Repeat (3) and (4) until no matches remain.

82      Note: it is possible to defer step (2) until step (6), but this
83      requires checking for subj-trailer in step (4).

85           (6) If the resulting text begins with the subj-fwd-hdr ABNF
86           and ends with the subj-fwd-trl ABNF, remove the
87           subj-fwd-hdr and subj-fwd-trl and repeat from step (2).

89           (7) The resulting text is the "base subject" used in the
90           SORT.

92      All servers and disconnected clients MUST use exactly this algorithm
93      when sorting by subject.  Otherwise there is potential for a user to
94      get inconsistent results based on whether they are running in
95      connected or disconnected IMAP mode.

97   Additional Commands

99      This command is an extension to the IMAP4rev1 base protocol.

101     The section header is intended to correspond with where it would be
102     located in the main document if it were part of the base
103     specification.

105  6.3.SORT. SORT Command

107     Arguments:  sort program
108                 charset specification
109                 searching criteria (one or more)

111     Data:       untagged responses: SORT

113     Result:     OK - sort completed
114                 NO - sort error: can't sort that charset or
115                      criteria
116                 BAD - command unknown or arguments invalid

118        The SORT command is a variant of SEARCH with sorting semantics for
119        the results.  SORT has two arguments before the searching criteria
120        argument: a parenthesized list of sort criteria, and the searching
121        charset.

123        Note that unlike SEARCH, the searching charset argument is
124        mandatory.  The US-ASCII and UTF-8 charsets MUST be implemented.
125        All other charsets are optional.

127        There is also a UID SORT command which corresponds to SORT the way
128        that UID SEARCH corresponds to SEARCH.
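
Returning to the base subject procedure in steps (1) through (7): it can
be illustrated with a short Python sketch.  This is not taken from any
implementation.  The function name base_subject and the regular
expressions are assumptions that transcribe the subj-leader, subj-blob,
and subj-trailer rules from the Formal Syntax section, matched
case-insensitively.  Step (1) relies on the standard library's
email.header module, and the "non-empty subj-base" test in step (4) is
approximated by checking that some non-whitespace text remains.

```python
import re
from email.header import decode_header, make_header

# Hypothetical regex transcriptions of the draft's ABNF rules.
_WSP = r"[ \t]"
_BLOB = r"\[[^\[\]]*\]" + _WSP + r"*"                       # subj-blob
_REFWD = r"(?:re|fwd?)" + _WSP + r"*(?:" + _BLOB + r")?:"   # subj-refwd
_LEADER_RE = re.compile(
    r"(?:(?:" + _BLOB + r")*" + _REFWD + r"|" + _WSP + r")", re.IGNORECASE
)                                                            # subj-leader
_BLOB_RE = re.compile(_BLOB)
_TRAILER_RE = re.compile(r"(?:\(fwd\)|" + _WSP + r")\Z", re.IGNORECASE)


def base_subject(raw):
    """Return the base subject of a raw Subject header value (sketch)."""
    # Step 1: decode RFC 2047 encoded-words, then normalize whitespace.
    text = str(make_header(decode_header(raw)))
    text = re.sub(r"\r?\n[ \t]", " ", text)   # header continuations
    text = text.replace("\t", " ")
    text = re.sub(r" {2,}", " ", text)

    while True:
        # Step 2: strip every trailing "(fwd)" or whitespace character.
        match = _TRAILER_RE.search(text)
        while match:
            text = text[:match.start()]
            match = _TRAILER_RE.search(text)

        # Steps 3-5: strip leaders and blobs until nothing more matches.
        while True:
            match = _LEADER_RE.match(text)
            if match:
                text = text[match.end():]
                continue
            match = _BLOB_RE.match(text)
            # Step 4: only drop the blob if a non-empty base would remain.
            if match and text[match.end():].strip(" \t"):
                text = text[match.end():]
                continue
            break

        # Step 6: unwrap "[fwd: ... ]" and repeat from step 2.
        if text[:5].lower() == "[fwd:" and text.endswith("]"):
            text = text[5:-1]
            continue

        # Step 7: what remains is the base subject.
        return text
```

For example, base_subject("Re: [list] Re: hello (fwd)") yields "hello",
while a subject consisting only of "[PATCH]" is left unchanged, because
removing the blob would leave an empty subj-base.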

130        The SORT command first searches the mailbox for messages that
131        match the given searching criteria using the charset argument for
132        the interpretation of strings in the searching criteria.  It then
133        returns the matching messages in an untagged SORT response, sorted
134        according to one or more sort criteria.

136        Sorting is in ascending order.  Earlier dates sort before later
137        dates; smaller sizes sort before larger sizes; and strings are
138        sorted according to ascending values established by their
139        collation algorithm (see under "Internationalization
140        Considerations").

142        If two or more messages exactly match according to the sorting
143        criteria, these messages are sorted according to the order in
144        which they appear in the mailbox.  In other words, there is an
145        implicit sort criterion of "sequence number".

147        When multiple sort criteria are specified, the result is sorted in
148        the priority order in which the criteria appear.  For example,
149        (SUBJECT DATE) will sort messages in order by their base subject
150        text and, for messages with the same base subject text, by their
151        sent date.

153        Untagged EXPUNGE responses are not permitted while the server is
154        responding to a SORT command, but are permitted during a UID SORT
155        command.

157        The defined sort criteria are as follows.  Refer to the Formal
158        Syntax section for the precise syntactic definitions of the
159        arguments.  If the associated RFC-822 header for a particular
160        criterion is absent, it is treated as the empty string.  The empty
161        string always collates before non-empty strings.

163        ARRIVAL
164           Internal date and time of the message.  This differs from the
165           ON criterion in SEARCH, which uses just the internal date.

167        CC
168           RFC-822 local-part of the first "cc" address.

170        DATE
171           Sent date and time from the Date: header, adjusted by time
172           zone.  This differs from the SENTON criterion in SEARCH, which
173           uses just the date and not the time, nor adjusts by time zone.

175        FROM
176           RFC-822 local-part of the first "From" address.

178        REVERSE
179           Followed by another sort criterion, has the effect of that
180           criterion but in reverse (descending) order.
181             Note: REVERSE only reverses a single criterion, and does not
182             affect the implicit "sequence number" sort criterion if all
183             other criteria are identical.  Consequently, a sort of
184             REVERSE SUBJECT is not the same as a reverse ordering of a
185             SUBJECT sort.
186             This can be avoided by use of additional criteria, e.g.
187             SUBJECT DATE vs. REVERSE SUBJECT REVERSE DATE.  In general,
188             however, it's better (and faster, if the client has a
189             "reverse current ordering" command) to reverse the results
190             in the client instead of issuing a new SORT.

192        SIZE
193           Size of the message in octets.

195        SUBJECT
196           Base subject text.

198        TO
199           RFC-822 local-part of the first "To" address.

201     Example:    C: A282 SORT (SUBJECT) UTF-8 SINCE 1-Feb-1994
202                 S: * SORT 2 84 882
203                 S: A282 OK SORT completed
204                 C: A283 SORT (SUBJECT REVERSE DATE) UTF-8 ALL
205                 S: * SORT 5 3 4 1 2
206                 S: A283 OK SORT completed
207                 C: A284 SORT (SUBJECT) US-ASCII TEXT "not in mailbox"
208                 S: * SORT
209                 S: A284 OK SORT completed

211  Additional Responses

213     This response is an extension to the IMAP4rev1 base protocol.

215     The section heading of this response is intended to correspond with
216     where it would be located in the main document.

218  7.2.SORT. SORT Response

220     Data:       zero or more numbers

222        The SORT response occurs as a result of a SORT or UID SORT
223        command.  The number(s) refer to those messages that match the
224        search criteria.  For SORT, these are message sequence numbers;
225        for UID SORT, these are unique identifiers.  Each number is
226        delimited by a space.
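
A client might read such a response line along the following lines.
This is a minimal Python sketch under stated assumptions: the function
name parse_sort_response is hypothetical, the input is a single untagged
response line that has already been split off the connection, and the
number check follows the nz-number form used by the sort-data rule.
Whether the numbers are message sequence numbers or unique identifiers
depends on whether SORT or UID SORT was issued, which the caller has to
track.

```python
def parse_sort_response(line):
    """Return the list of numbers in an untagged SORT response line."""
    tokens = line.strip().split(" ")
    if len(tokens) < 2 or tokens[0] != "*" or tokens[1].upper() != "SORT":
        raise ValueError("not an untagged SORT response: %r" % line)

    numbers = []
    for token in tokens[2:]:
        # nz-number: ASCII decimal digits with no leading zero.
        if not (token.isascii() and token.isdigit()) or token[0] == "0":
            raise ValueError("invalid number in SORT response: %r" % token)
        numbers.append(int(token))
    return numbers
```

An empty result, as in the "* SORT" line returned when nothing matches,
comes back as an empty list, and the order of the numbers is preserved
because it carries the sort result.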

228     Example:    S: * SORT 2 3 6

230  Formal Syntax of SORT Commands and Responses

232     sort-data       = "SORT" *(SP nz-number)

234     sort            = ["UID" SP] "SORT" SP
235                       "(" sort-criterion *(SP sort-criterion) ")"
236                       SP search-charset 1*(SP search-key)

238     sort-criterion  = ["REVERSE" SP] sort-key

240     sort-key        = "ARRIVAL" / "CC" / "DATE" / "FROM" / "SIZE" /
241                       "SUBJECT" / "TO"

243     The following syntax describes base subject extraction rules (2)-(6):

245     subject         = *subj-leader [subj-middle] *subj-trailer

247     subj-refwd      = ("re" / ("fw" ["d"])) *WSP [subj-blob] ":"

249     subj-blob       = "[" *BLOBCHAR "]" *WSP

251     subj-fwd        = subj-fwd-hdr subject subj-fwd-trl

253     subj-fwd-hdr    = "[fwd:"

255     subj-fwd-trl    = "]"

257     subj-leader     = (*subj-blob subj-refwd) / WSP

259     subj-middle     = *subj-blob (subj-base / subj-fwd)
260                       ; last subj-blob is subj-base if subj-base would
261                       ; otherwise be empty

263     subj-trailer    = "(fwd)" / WSP

265     subj-base       = NONWSP *([*WSP] NONWSP)
266                       ; can be a subj-blob

268     BLOBCHAR        = %x01-5a / %x5c / %x5e-7f
269                       ; any CHAR except '[' and ']'

271     NONWSP          = %x01-08 / %x0a-1f / %x21-7f
272                       ; any CHAR other than WSP

274  Security Considerations

276     Security issues are not discussed in this memo.

278  Internationalization Considerations

280     By default, strings are sorted according to the "minimum sorting
281     collation algorithm".  All implementations of SORT MUST implement the
282     minimum sorting collation algorithm.

284     In the minimum sorting collation algorithm, the Basic Latin
285     alphabetics (U+0041 to U+005A uppercase, U+0061 to U+007A lowercase)
286     are sorted in a case-insensitive fashion; that is, "A" (U+0041) and
287     "a" (U+0061) are treated as exact equals.  The characters U+005B to
288     U+0060 are sorted after the Basic Latin alphabetics; for example,
289     U+005E is sorted after U+005A and U+007A.  All other characters are
290     sorted according to their octet values, as expressed in UTF-8.  No
291     attempt is made to treat composed characters specially, or to do
292     case-insensitive comparisons of composed characters.

294          Note: this means, among other things, that the composed
295          characters in the Latin-1 Supplement are not compared in
296          what would be considered an ISO 8859-1 "case-insensitive"
297          fashion.  Case comparison rules for characters with
298          diacriticals differ between languages; the minimum sorting
299          collation does not attempt to deal with this at all.  This
300          is reserved for other sorting collations, which may be
301          language-specific.

303  ;;; *** ITEM FOR DISCUSSION ***
304  ;;; THERE IS SOME CONCERN THAT THIS MINIMUM COLLATION IS TOO MINIMAL,
305  ;;; AND THAT THE "GENERIC UNICODE SORTING COLLATION" DISCUSSED BELOW
306  ;;; NEEDS TO BE THE MINIMUM.  ONE SUGGESTION IS UNICODE TECHNICAL
307  ;;; STANDARD 10 (TR-10).  IF THIS IS THE MINIMUM, THAT REQUIRES THAT
308  ;;; ALL IMPLEMENTATIONS OF SORT AND THREAD BE UNICODE-SAVVY AT LEAST
309  ;;; TO THE POINT OF IMPLEMENTING TR-10.  IS THIS REALISTIC?  DOES
310  ;;; THIS RAISE EXCESSIVE IMPLEMENTATION BARRIERS?
311     Other sorting collations, and the ability to change the sorting
312     collation, will be defined in a separate document dealing with IMAP
313     internationalization.

315     It is anticipated that there will be a generic Unicode sorting
316     collation, which will provide generic case-insensitivity for
317     alphabetic scripts, specification of composed character handling, and
318     language-specific sorting collations.  A server which implements
319     non-default sorting collations will modify its sorting behavior
320     according to the selected sorting collation.

322     Non-English translations of "Re" or "Fw"/"Fwd" are not specified for
323     removal in the base subject extraction process.  By specifying that
324     only the English forms of the prefixes are used, it becomes a simple
325     display time task to localize the prefix language for the user.  If,
326     on the other hand, prefixes in multiple languages are permitted, the
327     result is a geometrically complex, and ultimately unimplementable,
328     task.  In order to improve the ability to support non-English display
329     in Internet mail clients, only the English form of these prefixes
330     should be transmitted in Internet mail messages.

332  Authors' Addresses

334     Mark R. Crispin
335     Networks and Distributed Computing
336     University of Washington
337     4545 15th Avenue NE
338     Seattle, WA  98105-4527

340     Phone: (206) 543-5762

342     EMail: MRC@CAC.Washington.EDU

344     Kenneth Murchison
345     Oceana Matrix Ltd.
346     21 Princeton Place
347     Orchard Park, NY 14127

349     Phone: (716) 662-8973 x26

351     EMail: ken@oceana.com
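
As an illustration of the minimum sorting collation algorithm described
under "Internationalization Considerations", the following Python sketch
shows one possible reading of it.  The function name min_collation_key is
hypothetical, and folding lowercase Basic Latin letters onto their
uppercase octets is an assumption chosen because it places U+005B to
U+0060 after all alphabetics while leaving every other character ordered
by its UTF-8 octet value.

```python
def min_collation_key(text):
    """Build a byte-string sort key for the minimum collation (sketch)."""
    data = text.encode("utf-8")
    # Fold a-z (0x61-0x7A) onto A-Z (0x41-0x5A).  Octets of multi-byte
    # UTF-8 sequences are all >= 0x80, so they are never touched.
    return bytes(b - 0x20 if 0x61 <= b <= 0x7A else b for b in data)


# Strings listed in mailbox order.  sorted() is stable, so strings with
# equal keys keep that order, mirroring the implicit "sequence number"
# criterion of the SORT command.
subjects = ["b", "A", "^", "a", ""]
ordered = sorted(subjects, key=min_collation_key)
```

With this key, the empty string collates first, "A" and "a" compare as
equal and keep their original relative order, and "^" (U+005E) sorts
after every Basic Latin letter.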