IMAP Extensions Working Group                                 M. Crispin
INTERNET-DRAFT: IMAP SORT                                   K. Murchison
Document: internet-drafts/draft-ietf-imapext-sort-08.txt    January 2002

            INTERNET MESSAGE ACCESS PROTOCOL - SORT EXTENSION

Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC 2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   To view the list of Internet-Draft Shadow Directories, see
   http://www.ietf.org/shadow.html.

   A revised version of this document will be submitted to the RFC
   editor as an Informational Document for the Internet Community.

   A revised version of this draft document will be submitted to the RFC
   editor as a Proposed Standard for the Internet Community.  Discussion
   and suggestions for improvement are requested, and should be sent to
   ietf-imapext@IMC.ORG.  This document will expire before 4 July 2002.
   Distribution of this memo is unlimited.

Abstract

   This document describes an experimental server-based sorting
   extension to the IMAP4rev1 protocol, as implemented by the University
   of Washington's IMAP toolkit.  This extension provides substantial
   performance improvements for IMAP clients which offer sorted views.

   A server which supports this extension indicates this with a
   capability name of "SORT".  Client implementations SHOULD accept any
   capability name which begins with "SORT" as indicating support for
   the extension described in this document.  This provides for future
   upwards-compatible extensions.

   At the time this document was written, the IMAP Extensions Working
   Group (IETF-IMAPEXT) was considering upwards-compatible additions to
   the SORT extension described in this document, tentatively called the
   SORT2 extension.

Extracted Subject Text

   The "SUBJECT" SORT criterion uses a version of the subject which has
   specific subject artifacts of deployed Internet mail software
   removed.  Due to the complexity of these artifacts, the formal syntax
   for the subject extraction rules is ambiguous.  The following
   procedure is followed to determine the actual "base subject" which is
   used to sort by subject:

        (1) Convert any RFC 2047 encoded-words in the subject to
        UTF-8.  Convert all tabs and continuations to space.
        Convert all multiple spaces to a single space.

        (2) Remove all trailing text of the subject that matches
        the subj-trailer ABNF; repeat until no more matches are
        possible.

        (3) Remove all prefix text of the subject that matches the
        subj-leader ABNF.

        (4) If there is prefix text of the subject that matches the
        subj-blob ABNF, and removing that prefix leaves a non-empty
        subj-base, then remove the prefix text.

        (5) Repeat (3) and (4) until no matches remain.

        Note: it is possible to defer step (2) until step (6), but this
        requires checking for subj-trailer in step (4).

        (6) If the resulting text begins with the subj-fwd-hdr ABNF
        and ends with the subj-fwd-trl ABNF, remove the
        subj-fwd-hdr and subj-fwd-trl and repeat from step (2).

        (7) The resulting text is the "base subject" used in the
        SORT.

   All servers and disconnected clients MUST use exactly this algorithm
   when sorting by subject.  Otherwise there is potential for a user to
   get inconsistent results based on whether they are running in
   connected or disconnected IMAP mode.

Additional Commands

   This command is an extension to the IMAP4rev1 base protocol.

   The section header is intended to correspond with where it would be
   located in the main document if it were part of the base
   specification.

6.3.SORT. SORT Command

   Arguments:  sort program
               charset specification
               searching criteria (one or more)

   Data:       untagged responses: SORT

   Result:     OK - sort completed
               NO - sort error: can't sort that charset or
                    criteria
               BAD - command unknown or arguments invalid

      The SORT command is a variant of SEARCH with sorting semantics for
      the results.  SORT has two arguments before the searching criteria
      argument: a parenthesized list of sort criteria, and the searching
      charset.

      Note that unlike SEARCH, the searching charset argument is
      mandatory.  The US-ASCII and UTF-8 charsets MUST be implemented.
      All other charsets are optional.

      There is also a UID SORT command which corresponds to SORT the way
      that UID SEARCH corresponds to SEARCH.

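      As a non-normative illustration of the argument order described
      above, a client might assemble a SORT or UID SORT command line as
      sketched below.  The helper name build_sort_command and its
      parameters are assumptions made for this example and are not
      defined by this document; search keys are passed through verbatim
      and are not validated.

```python
from typing import Iterable

# Sort keys as listed in the Formal Syntax section of this document.
SORT_KEYS = {"ARRIVAL", "CC", "DATE", "FROM", "SIZE", "SUBJECT", "TO"}


def build_sort_command(tag: str, criteria: Iterable[str], charset: str,
                       search_keys: Iterable[str], uid: bool = False) -> str:
    """Assemble one SORT (or UID SORT) command line, without the CRLF.

    Hypothetical helper: each criterion is a sort key, optionally
    preceded by REVERSE, for example "SUBJECT" or "REVERSE DATE".
    """
    search_keys = list(search_keys)
    if not charset:
        raise ValueError("the charset argument is mandatory for SORT")
    if not search_keys:
        raise ValueError("at least one searching criterion is required")

    program = []
    for criterion in criteria:
        words = criterion.upper().split()
        key = words[1:] if words[:1] == ["REVERSE"] else words
        if len(key) != 1 or key[0] not in SORT_KEYS:
            raise ValueError(f"invalid sort criterion: {criterion!r}")
        program.append(" ".join(words))
    if not program:
        raise ValueError("at least one sort criterion is required")

    command = "UID SORT" if uid else "SORT"
    return f"{tag} {command} ({' '.join(program)}) {charset} {' '.join(search_keys)}"
```

      For instance, build_sort_command("A283", ["SUBJECT", "REVERSE
      DATE"], "UTF-8", ["ALL"]) yields the command line
      "A283 SORT (SUBJECT REVERSE DATE) UTF-8 ALL".
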
      The SORT command first searches the mailbox for messages that
      match the given searching criteria using the charset argument for
      the interpretation of strings in the searching criteria.  It then
      returns the matching messages in an untagged SORT response, sorted
      according to one or more sort criteria.

      If two or more messages exactly match according to the sorting
      criteria, these messages are sorted according to the order in
      which they appear in the mailbox.  In other words, there is an
      implicit sort criterion of "sequence number".

      When multiple sort criteria are specified, the result is sorted in
      the priority order in which the criteria appear.  For example,
      (SUBJECT DATE) will sort messages in order by their subject text,
      and messages with the same subject text by their sent date.

      Untagged EXPUNGE responses are not permitted while the server is
      responding to a SORT command, but are permitted during a UID SORT
      command.

      The defined sort criteria are as follows.  Refer to the Formal
      Syntax section for the precise syntactic definitions of the
      arguments.  If the associated RFC-822 header for a particular
      criterion is absent, it is treated as the empty string.  The empty
      string always collates before non-empty strings.

      ARRIVAL
         Internal date and time of the message.  This differs from the
         ON criterion in SEARCH, which uses just the internal date.

      CC
         RFC-822 local-part of the first "cc" address.

      DATE
         Sent date and time from the Date: header, adjusted by time
         zone.  This differs from the SENTON criterion in SEARCH, which
         uses just the date and not the time, and does not adjust by
         time zone.

      FROM
         RFC-822 local-part of the first "From" address.

      REVERSE
         Followed by another sort criterion, has the effect of that
         criterion but in reverse order.
            Note: REVERSE only reverses a single criterion, and does not
            affect the implicit "sequence number" sort criterion if all
            other criteria are identical.  Consequently, a sort of
            REVERSE SUBJECT is not the same as a reverse ordering of a
            SUBJECT sort.
            This can be avoided by use of additional criteria, e.g.
            SUBJECT DATE vs. REVERSE SUBJECT REVERSE DATE.  In general,
            however, it's better (and faster, if the client has a
            "reverse current ordering" command) to reverse the results
            in the client instead of issuing a new SORT.

      SIZE
         Size of the message in octets.

      SUBJECT
         Extracted subject text.

      TO
         RFC-822 local-part of the first "To" address.

   Example:    C: A282 SORT (SUBJECT) UTF-8 SINCE 1-Feb-1994
               S: * SORT 2 84 882
               S: A282 OK SORT completed
               C: A283 SORT (SUBJECT REVERSE DATE) UTF-8 ALL
               S: * SORT 5 3 4 1 2
               S: A283 OK SORT completed
               C: A284 SORT (SUBJECT) US-ASCII TEXT "not in mailbox"
               S: * SORT
               S: A284 OK SORT completed

Additional Responses

   This response is an extension to the IMAP4rev1 base protocol.

   The section heading of this response is intended to correspond with
   where it would be located in the main document.

7.2.SORT. SORT Response

   Data:       zero or more numbers

      The SORT response occurs as a result of a SORT or UID SORT
      command.  The number(s) refer to those messages that match the
      search criteria.  For SORT, these are message sequence numbers;
      for UID SORT, these are unique identifiers.  Each number is
      delimited by a space.

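      As a non-normative sketch, a server might produce the SORT
      response for an already-searched set of messages as shown below.
      It illustrates the priority order of the sort criteria, REVERSE
      applying to a single criterion, and the implicit "sequence
      number" criterion for ties.  The record layout and the function
      names are assumptions made for this example; the sort values are
      assumed to be already extracted and directly comparable, and the
      collation rules of this document are not applied.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical in-memory message record: "seq" is the message sequence
# number; the other entries hold already-extracted, comparable sort
# values keyed by sort-key name (for example "SUBJECT" or "DATE").
Message = Dict[str, object]


def parse_criteria(program: str) -> List[Tuple[str, bool]]:
    """Turn "SUBJECT REVERSE DATE" into [("SUBJECT", False), ("DATE", True)]."""
    criteria = []
    reverse = False
    for word in program.upper().split():
        if word == "REVERSE":
            reverse = True
        else:
            criteria.append((word, reverse))
            reverse = False
    if reverse:
        raise ValueError("REVERSE must be followed by a sort key")
    return criteria


def sort_response(messages: Sequence[Message], program: str) -> str:
    """Return the untagged SORT response line for the given messages."""
    # Start from mailbox order so that ties fall back to sequence number.
    ordered = sorted(messages, key=lambda m: m["seq"])
    # Python's sort is stable, also with reverse=True, so sorting by the
    # lowest-priority criterion first and the highest-priority one last
    # gives the priority ordering, with sequence number as the final tie
    # breaker regardless of REVERSE.
    for key, reverse in reversed(parse_criteria(program)):
        ordered.sort(key=lambda m, k=key: m[k], reverse=reverse)
    return "* SORT" + "".join(f" {m['seq']}" for m in ordered)
```

      With four messages whose (seq, SUBJECT, DATE) values are
      (1, "b", 1), (2, "b", 1), (3, "a", 2) and (4, "a", 3), the program
      "SUBJECT REVERSE DATE" gives "* SORT 4 3 1 2", while
      "REVERSE SUBJECT" gives "* SORT 1 2 3 4", which is not the
      reverse of the "* SORT 3 4 1 2" produced by "SUBJECT".
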
   Example:    S: * SORT 2 3 6

Formal Syntax of SORT Commands and Responses

   sort-data       = "SORT" *(SP nz-number)

   sort            = ["UID" SP] "SORT" SP
                     "(" sort-criterion *(SP sort-criterion) ")"
                     SP search_charset 1*(SP search_key)

   sort-criterion  = ["REVERSE" SP] sort-key

   sort-key        = "ARRIVAL" / "CC" / "DATE" / "FROM" / "SIZE" /
                     "SUBJECT" / "TO"

   The following syntax describes subject extraction rules (2)-(6):

   subject         = *subj-leader [subj-middle] *subj-trailer

   subj-refwd      = ("re" / ("fw" ["d"])) *WSP [subj-blob] ":"

   subj-blob       = "[" *BLOBCHAR "]" *WSP

   subj-fwd        = subj-fwd-hdr subject subj-fwd-trl

   subj-fwd-hdr    = "[fwd:"

   subj-fwd-trl    = "]"

   subj-leader     = (*subj-blob subj-refwd) / WSP

   subj-middle     = *subj-blob (subj-base / subj-fwd)
                       ; last subj-blob is subj-base if subj-base would
                       ; otherwise be empty

   subj-trailer    = "(fwd)" / WSP

   subj-base       = NONWSP *([*WSP] NONWSP)
                       ; can be a subj-blob

   BLOBCHAR        = %x01-5a / %x5c / %x5e-7f
                       ; any CHAR except '[' and ']'

   NONWSP          = %x01-08 / %x0a-1f / %x21-7f
                       ; any CHAR other than WSP

Security Considerations

   Security issues are not discussed in this memo.

Internationalization Considerations

   By default, strings are sorted according to the "minimum sorting
   collation algorithm".  All implementations of SORT MUST implement the
   minimum sorting collation algorithm.

   In the minimum sorting collation algorithm, the Basic Latin
   alphabetics (U+0041 to U+005A uppercase, U+0061 to U+007A lowercase)
   are sorted in a case-insensitive fashion; that is, "A" (U+0041) and
   "a" (U+0061) are treated as exact equals.  The characters U+005B to
   U+0060 are sorted after the Basic Latin alphabetics; for example,
   U+005E is sorted after U+005A and U+007A.  All other characters are
   sorted according to their octet values, as expressed in UTF-8.  No
   attempt is made to treat composed characters specially, or to do
   case-insensitive comparisons of composed characters.

        Note: this means, among other things, that the composed
        characters in the Latin-1 Supplement are not compared in
        what would be considered an ISO 8859-1 "case-insensitive"
        fashion.  Case comparison rules for characters with
        diacriticals differ between languages; the minimum sorting
        collation does not attempt to deal with this at all.  This
        is reserved for other sorting collations, which may be
        language-specific.

;;; *** ITEM FOR DISCUSSION ***
;;; THERE IS SOME CONCERN THAT THIS MINIMUM COLLATION IS TOO MINIMAL,
;;; AND THAT THE "GENERIC UNICODE SORTING COLLATION" DISCUSSED BELOW
;;; NEEDS TO BE THE MINIMUM.  ONE SUGGESTION IS UNICODE TECHNICAL
;;; STANDARD 10 (TR-10).  IF THIS IS THE MINIMUM, THAT REQUIRES THAT
;;; ALL IMPLEMENTATIONS OF SORT AND THREAD BE UNICODE-SAVVY AT LEAST
;;; TO THE POINT OF IMPLEMENTING TR-10.  IS THIS REALISTIC?  DOES
;;; THIS RAISE EXCESSIVE IMPLEMENTATION BARRIERS?

   Other sorting collations, and the ability to change the sorting
   collation, will be defined in a separate document dealing with IMAP
   internationalization.

   It is anticipated that there will be a generic Unicode sorting
   collation, which will provide generic case-insensitivity for
   alphabetic scripts, specification of composed character handling, and
   language-specific sorting collations.  A server which implements
   non-default sorting collations will modify its sorting behavior
   according to the selected sorting collation.

   Non-English translations of "Re" or "Fw"/"Fwd" are not specified for
   removal in the extracted subject text process.  By specifying that
   only the English forms of the prefixes are used, it becomes a simple
   display time task to localize the prefix language for the user.  If,
   on the other hand, prefixes in multiple languages are permitted, the
   result is a geometrically complex, and ultimately unimplementable,
   task.  In order to improve the ability to support non-English display
   in Internet mail clients, only the English form of these prefixes
   should be transmitted in Internet mail messages.

Authors' Addresses

   Mark R. Crispin
   Networks and Distributed Computing
   University of Washington
   4545 15th Avenue NE
   Seattle, WA  98105-4527

   Phone: (206) 543-5762

   EMail: MRC@CAC.Washington.EDU

   Kenneth Murchison
   Oceana Matrix Ltd.
   21 Princeton Place
   Orchard Park, NY  14127

   Phone: (716) 662-8973 x26

   EMail: ken@oceana.com