idnits 2.17.1

draft-ietf-imapext-sort-06.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Looks like you're using RFC 2026 boilerplate. This must be updated to
     follow RFC 3978/3979, as updated by RFC 4748.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  ** Missing expiration date. The document expiration date should appear on
     the first and last page.

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     Shadow Directories.

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard

  == It seems as if not all pages are separated by form feeds - found 0 form
     feeds but 10 pages


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an Introduction section.

  ** The document seems to lack an IANA Considerations section. (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** The document seems to lack a both a reference to RFC 2119 and the
     recommended RFC 2119 boilerplate, even if it appears to use RFC 2119
     keywords.

     RFC 2119 keyword, line 48: '...ient implementations SHOULD accept any...'

     RFC 2119 keyword, line 94: '...onnected clients MUST use exactly this...'

     RFC 2119 keyword, line 126: '...nd UTF-8 charsets MUST be implemented....'

     RFC 2119 keyword, line 277: '...ementations of SORT MUST implement the...'


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008. If you
     have contacted all the original authors and they are all willing to
     grant the BCP78 rights to the IETF Trust, then this is fine, and you can
     ignore this comment. If not, you may need to add the pre-RFC5378
     disclaimer. (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (December 2000) is 8525 days in the past. Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

     No issues found here.

     Summary: 6 errors (**), 0 flaws (~~), 2 warnings (==), 2 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

2    IMAP Extensions Working Group                                 M. Crispin
3    INTERNET-DRAFT: IMAP SORT                                   K. Murchison
4    Document: internet-drafts/draft-ietf-imapext-sort-06.txt   December 2000

6              INTERNET MESSAGE ACCESS PROTOCOL - SORT EXTENSION

8    Status of this Memo

10      This document is an Internet-Draft and is in full conformance with
11      all provisions of Section 10 of RFC 2026.

13      Internet-Drafts are working documents of the Internet Engineering
14      Task Force (IETF), its areas, and its working groups. Note that
15      other groups may also distribute working documents as
16      Internet-Drafts.

18      Internet-Drafts are draft documents valid for a maximum of six months
19      and may be updated, replaced, or obsoleted by other documents at any
20      time. It is inappropriate to use Internet-Drafts as reference
21      material or to cite them other than as "work in progress."

23      The list of current Internet-Drafts can be accessed at
24      http://www.ietf.org/ietf/1id-abstracts.txt

26      To view the list of Internet-Draft Shadow Directories, see
27      http://www.ietf.org/shadow.html.

29      A revised version of this document will be submitted to the RFC
30      editor as an Informational Document for the Internet Community.

32      A revised version of this draft document, describing an expanded
33      version of this protocol extension, will be submitted to the RFC
34      editor as a Proposed Standard for the Internet Community.

36      Discussion and suggestions for improvement are requested, and should
37      be sent to ietf-imapext@IMC.ORG. This document will expire before 29
38      June 2001. Distribution of this memo is unlimited.

40   Abstract

42      This document describes an experimental server-based sorting
43      extension to the IMAP4rev1 protocol, as implemented by the University
44      of Washington's IMAP toolkit. This extension provides substantial
45      performance improvements for IMAP clients which offer sorted views.

47      A server which supports this extension indicates this with a
48      capability name of "SORT". Client implementations SHOULD accept any
49      capability name which begins with "SORT" as indicating support for
50      the extension described in this document. This provides for future
51      upwards-compatible extensions.

53      At the time this document was written, the IMAP Extensions Working
54      Group (IETF-IMAPEXT) was considering upwards-compatible additions to
55      the SORT extension described in this document, tentatively called the
56      SORT2 extension.

58   Extracted Subject Text

60      The "SUBJECT" SORT criteria uses a version of the subject which has
61      specific subject artifacts of deployed Internet mail software
62      removed. Due to the complexity of these artifacts, the formal syntax
63      for the subject extraction rules is ambiguous. The following
64      procedure is followed to determine the actual "base subject" which is
65      used to sort by subject:

67           (1) Convert any RFC 2047 encoded-words in the subject to
68           UTF-8. Convert all tabs and continuations to space.
69           Convert all multiple spaces to a single space.

71           (2) Remove all trailing text of the subject that matches
72           the subj-trailer ABNF, repeat until no more matches are
73           possible.

75           (3) Remove all prefix text of the subject that matches the
76           subj-leader ABNF.

78           (4) If there is prefix text of the subject that matches the
79           subj-blob ABNF, and removing that prefix leaves a non-empty
80           subj-base, then remove the prefix text.

82           (5) Repeat (3) and (4) until no matches remain.

84           Note: it is possible to defer step (2) until step (6), but this
85           requires checking for subj-trailer in step (4).

87           (6) If the resulting text begins with the subj-fwd-hdr ABNF
88           and ends with the subj-fwd-trl ABNF, remove the
89           subj-fwd-hdr and subj-fwd-trl and repeat from step (2).

91           (7) The resulting text is the "base subject" used in the
92           SORT.

94      All servers and disconnected clients MUST use exactly this algorithm
95      when sorting by subject. Otherwise there is potential for a user to
96      get inconsistent results based on whether they are running in
97      connected or disconnected IMAP mode.

99   Additional Commands

101     This command is an extension to the IMAP4rev1 base protocol.

103     The section header is intended to correspond with where it would be
104     located in the main document if it was part of the base
105     specification.

107  6.3.SORT. SORT Command

109     Arguments:  sort program
110                 charset specification
111                 searching criteria (one or more)

113     Data:       untagged responses: SORT

115     Result:     OK - sort completed
116                 NO - sort error: can't sort that charset or
117                      criteria
118                 BAD - command unknown or arguments invalid

120        The SORT command is a variant of SEARCH with sorting semantics for
121        the results. Sort has two arguments before the searching criteria
122        argument; a parenthesized list of sort criteria, and the searching
123        charset.

125        Note that unlike SEARCH, the searching charset argument is
126        mandatory. The US-ASCII and UTF-8 charsets MUST be implemented.
127        All other charsets are optional.
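
The base-subject procedure given under "Extracted Subject Text" (steps 1 through 7) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not part of the specification: the function name base_subject and the regular expressions are inventions of this sketch, RFC 2047 decoding is delegated to Python's email.header module, and the "non-empty subj-base" test in step (4) is approximated as "some text remains after the blob".

```python
import re
from email.header import decode_header, make_header

# subj-blob = "[" *BLOBCHAR "]" *WSP   (BLOBCHAR excludes "[" and "]")
_BLOB = r"\[[\x01-\x5a\x5c\x5e-\x7f]*\][ \t]*"
# subj-leader = (*subj-blob subj-refwd) / WSP
_LEADER = re.compile(
    rf"(?:{_BLOB})*(?:re|fwd?)[ \t]*(?:{_BLOB})?:|[ \t]", re.IGNORECASE
)
# subj-trailer = "(fwd)" / WSP, anchored at the end of the text
_TRAILER = re.compile(r"(?:\(fwd\)|[ \t])\Z", re.IGNORECASE)
_BLOB_RE = re.compile(_BLOB)


def base_subject(raw_subject: str) -> str:
    """Return the "base subject" of a Subject header value (hypothetical helper)."""
    # Step 1: decode encoded-words, turn tabs and continuations into spaces,
    # and collapse runs of spaces.
    text = str(make_header(decode_header(raw_subject)))
    text = re.sub(r"\r?\n[ \t]|\t", " ", text)
    text = re.sub(r" +", " ", text)
    while True:
        # Step 2: strip subj-trailer matches until none remain.
        while (m := _TRAILER.search(text)):
            text = text[: m.start()]
        # Steps 3-5: strip leaders, then blobs that leave something behind.
        while True:
            m = _LEADER.match(text)
            if m is None:
                m = _BLOB_RE.match(text)
                if m is None or m.end() == len(text):
                    break  # no blob, or the blob is the whole subject
            text = text[m.end():]
        # Step 6: unwrap "[fwd: ... ]" and start again from step 2.
        if text[:5].lower() == "[fwd:" and text.endswith("]"):
            text = text[5:-1]
            continue
        # Step 7: what is left is the base subject.
        return text
```

For example, base_subject("Re: [list] Re: hello (fwd)") and base_subject("[fwd: Fw: hello]") both reduce to "hello", while a subject consisting only of "[blob]" is left unchanged because removing the blob would leave nothing.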

129        There is also a UID SORT command which corresponds to SORT the way
130        that UID SEARCH corresponds to SEARCH.

132        The SORT command first searches the mailbox for messages that
133        match the given searching criteria using the charset argument for
134        the interpretation of strings in the searching criteria. It then
135        returns the matching messages in an untagged SORT response, sorted
136        according to one or more sort criteria.

138        If two or more messages exactly match according to the sorting
139        criteria, these messages are sorted according to the order in
140        which they appear in the mailbox. In other words, there is an
141        implicit sort criterion of "sequence number".

143        When multiple sort criteria are specified, the result is sorted in
144        the priority order that the criteria appear. For example,
145        (SUBJECT DATE) will sort messages in order by their subject text;
146        and for messages with the same subject text will sort by their
147        sent date.

149        Untagged EXPUNGE responses are not permitted while the server is
150        responding to a SORT command, but are permitted during a UID SORT
151        command.

153        The defined sort criteria are as follows. Refer to the Formal
154        Syntax section for the precise syntactic definitions of the
155        arguments. If the associated RFC-822 header for a particular
156        criterion is absent, it is treated as the empty string. The empty
157        string always collates before non-empty strings.

159        ARRIVAL
160           Internal date and time of the message. This differs from the
161           ON criteria in SEARCH, which uses just the internal date.

163        CC
164           RFC-822 local-part of the first "cc" address.

166        DATE
167           Sent date and time from the Date: header, adjusted by time
168           zone. This differs from the SENTON criteria in SEARCH, which
169           uses just the date and not the time, nor adjusts by time zone.

171        FROM
172           RFC-822 local-part of the "From" address.

174        REVERSE
175           Followed by another sort criterion, has the effect of that
176           criterion but in reverse order.
177             Note: REVERSE only reverses a single criterion, and does not
178             affect the implicit "sequence number" sort criterion if all
179             other criteria are identical. Consequently, a sort of
180             REVERSE SUBJECT is not the same as a reverse ordering of a
181             SUBJECT sort.
182             This can be avoided by use of additional criteria, e.g.
183             SUBJECT DATE vs. REVERSE SUBJECT REVERSE DATE. In general,
184             however, it's better (and faster, if the client has a
185             "reverse current ordering" command) to reverse the results
186             in the client instead of issuing a new SORT.

188        SIZE
189           Size of the message in octets.

191        SUBJECT
192           Extracted subject text.

194        TO
195           RFC-822 local-part of the first "To" address.

197     Example:    C: A282 SORT (SUBJECT) UTF-8 SINCE 1-Feb-1994
198                 S: * SORT 2 84 882
199                 S: A282 OK SORT completed
200                 C: A283 SORT (SUBJECT REVERSE DATE) UTF-8 ALL
201                 S: * SORT 5 3 4 1 2
202                 S: A283 OK SORT completed
203                 C: A284 SORT (SUBJECT) US-ASCII TEXT "not in mailbox"
204                 S: * SORT
205                 S: A284 OK SORT completed

207  Additional Responses

209     This response is an extension to the IMAP4rev1 base protocol.

211     The section heading of this response is intended to correspond with
212     where it would be located in the main document.

214  7.2.SORT. SORT Response

216     Data:       zero or more numbers

218        The SORT response occurs as a result of a SORT or UID SORT
219        command. The number(s) refer to those messages that match the
220        search criteria. For SORT, these are message sequence numbers;
221        for UID SORT, these are unique identifiers. Each number is
222        delimited by a space.
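
A client might read the untagged SORT response described above along these lines. This is a small sketch, not part of the specification; the function name parse_sort_response is hypothetical, and the caller is assumed to know whether the numbers are message sequence numbers (SORT) or unique identifiers (UID SORT), since the response itself does not say.

```python
def parse_sort_response(line: str) -> list[int]:
    """Parse an untagged SORT response line into its list of numbers."""
    tokens = line.strip().split(" ")
    if len(tokens) < 2 or tokens[0] != "*" or tokens[1].upper() != "SORT":
        raise ValueError(f"not an untagged SORT response: {line!r}")
    numbers = []
    for token in tokens[2:]:
        # nz-number: ASCII digits only, non-zero, no leading zero.
        if not (token.isascii() and token.isdigit()) or token.startswith("0"):
            raise ValueError(f"invalid number in SORT response: {token!r}")
        numbers.append(int(token))
    return numbers
```

Applied to the responses in the earlier example, "* SORT 2 84 882" yields [2, 84, 882], and the empty response "* SORT" yields an empty list, meaning no messages matched.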

224     Example:    S: * SORT 2 3 6

226  Formal Syntax of SORT commands and Responses

228     sort-data       = "SORT" *(SP nz-number)

230     sort            = ["UID" SP] "SORT" SP
231                       "(" sort-criterion *(SP sort-criterion) ")"
232                       SP search_charset 1*(SP search_key)

234     sort-criterion  = ["REVERSE" SP] sort-key

236     sort-key        = "ARRIVAL" / "CC" / "DATE" / "FROM" / "SIZE" /
237                       "SUBJECT" / "TO"

239     The following syntax describes subject extraction rules (2)-(6):

241     subject         = *subj-leader [subj-middle] *subj-trailer

243     subj-refwd      = ("re" / ("fw" ["d"])) *WSP [subj-blob] ":"

245     subj-blob       = "[" *BLOBCHAR "]" *WSP

247     subj-fwd        = subj-fwd-hdr subject subj-fwd-trl

249     subj-fwd-hdr    = "[fwd:"

251     subj-fwd-trl    = "]"

253     subj-leader     = (*subj-blob subj-refwd) / WSP

255     subj-middle     = *subj-blob (subj-base / subj-fwd)
256                         ; last subj-blob is subj-base if subj-base would
257                         ; otherwise be empty

259     subj-trailer    = "(fwd)" / WSP

261     subj-base       = NONWSP *([*WSP] NONWSP)
262                         ; can be a subj-blob

264     BLOBCHAR        = %x01-5a / %x5c / %x5e-7f
265                         ; any CHAR except '[' and ']'

267     NONWSP          = %x01-08 / %x0a-1f / %x21-7f
268                         ; any CHAR other than WSP

270  Security Considerations

272     Security issues are not discussed in this memo.

274  Internationalization Considerations

276     By default, strings are sorted according to the "minimum sorting
277     collation algorithm". All implementations of SORT MUST implement the
278     minimum sorting collation algorithm.

280     In the minimum sorting collation algorithm, the Basic Latin
281     alphabetics (U+0041 to U+005A uppercase, U+0061 to U+007A lowercase)
282     are sorted in a case-insensitive fashion; that is, "A" (U+0041) and
283     "a" (U+0061) are treated as exact equals. The characters U+005B to
284     U+0060 are sorted after the Basic Latin alphabetics; for example,
285     U+005E is sorted after U+005A and U+007A. All other characters are
286     sorted according to their octet values, as expressed in UTF-8. No
287     attempt is made to treat composed characters specially, or to do
288     case-insensitive comparisons of composed characters.

290          Note: this means, among other things, that the composed
291          characters in the Latin-1 Supplement are not compared in
292          what would be considered an ISO 8859-1 "case-insensitive"
293          fashion. Case comparison rules for characters with
294          diacriticals differ between languages; the minimum sorting
295          collation does not attempt to deal with this at all. This
296          is reserved for other sorting collations, which may be
297          language-specific.

299     Other sorting collations, and the ability to change the sorting
300     collation, will be defined in a separate document dealing with IMAP
301     internationalization.

303     It is anticipated that there will be a generic Unicode sorting
304     collation, which will provide generic case-insensitivity for
305     alphabetic scripts, specification of composed character handling, and
306     language-specific sorting collations. A server which implements
307     non-default sorting collations will modify its sorting behavior
308     according to the selected sorting collation.

310     Non-English translations of "Re" or "Fw"/"Fwd" are not specified for
311     removal in the extracted subject text process. By specifying that
312     only the English forms of the prefixes are used, it becomes a simple
313     display time task to localize the prefix language for the user. If,
314     on the other hand, prefixes in multiple languages are permitted, the
315     result is a geometrically complex, and ultimately unimplementable,
316     task. In order to improve the ability to support non-English display
317     in Internet mail clients, only the English form of these prefixes
318     should be transmitted in Internet mail messages.

320  Author's Address

322     Mark R. Crispin
323     Networks and Distributed Computing
324     University of Washington
325     4545 15th Avenue NE
326     Seattle, WA 98105-4527

328     Phone: (206) 543-5762

330     EMail: MRC@CAC.Washington.EDU

332     Kenneth Murchison
333     Oceana Matrix Ltd.
334     21 Princeton Place
335     Orchard Park, NY 14127

337     Phone: (716) 662-8973 x26

339     EMail: ken@oceana.com
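
The minimum sorting collation algorithm described under Internationalization Considerations can be sketched as a sort key in Python. This is an illustration only; the name collation_key is hypothetical, and folding lowercase to uppercase is a choice made by this sketch. It has the required effect because, once "a" through "z" are mapped onto U+0041 to U+005A, the characters U+005B to U+0060 compare after every Basic Latin letter, and all remaining characters compare by their UTF-8 octet values.

```python
def collation_key(text: str) -> bytes:
    """Sort key approximating the minimum sorting collation algorithm."""
    octets = text.encode("utf-8")
    # Octets 0x61-0x7A occur in UTF-8 only as the ASCII letters a-z, so
    # folding them to 0x41-0x5A never disturbs a multi-octet character.
    return bytes(b - 0x20 if 0x61 <= b <= 0x7A else b for b in octets)


# "A" and "a" compare equal, so the stable sort keeps their input order.
ordered = sorted(["b", "A", "a", "^", "Z", "é"], key=collation_key)
```

The empty string produces an empty key, which compares before every non-empty key, matching the rule that the empty string always collates first.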