idnits 2.17.1 

draft-ietf-imapext-sort-10.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Looks like you're using RFC 2026 boilerplate.  This must be updated to
     follow RFC 3978/3979, as updated by RFC 4748.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  ** Missing expiration date.  The document expiration date should appear on
     the first and last page.

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     Shadow Directories. 

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard

  == It seems as if not all pages are separated by form feeds - found 0 form
     feeds but 10 pages


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an Introduction section.

  ** The document seems to lack an IANA Considerations section.  (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** The document seems to lack both a reference to RFC 2119 and the
     recommended RFC 2119 boilerplate, even if it appears to use RFC 2119
     keywords. 

     RFC 2119 keyword, line 46: '...ient implementations SHOULD accept any...'
     RFC 2119 keyword, line 92: '...onnected clients MUST use exactly this...'
     RFC 2119 keyword, line 124: '...nd UTF-8 charsets MUST be implemented....'
     RFC 2119 keyword, line 281: '...ementations of SORT MUST implement the...'


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (June 2002) is 7984 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

     No issues found here.

     Summary: 6 errors (**), 0 flaws (~~), 2 warnings (==), 2 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	IMAP Extensions Working Group                                 M. Crispin
3	INTERNET-DRAFT: IMAP SORT                                   K. Murchison
4	Document: internet-drafts/draft-ietf-imapext-sort-10.txt       June 2002

6	           INTERNET MESSAGE ACCESS PROTOCOL - SORT EXTENSION

8	Status of this Memo

10	   This document is an Internet-Draft and is in full conformance with
11	   all provisions of Section 10 of RFC 2026.

13	   Internet-Drafts are working documents of the Internet Engineering
14	   Task Force (IETF), its areas, and its working groups.  Note that
15	   other groups may also distribute working documents as
16	   Internet-Drafts.

18	   Internet-Drafts are draft documents valid for a maximum of six months
19	   and may be updated, replaced, or obsoleted by other documents at any
20	   time.  It is inappropriate to use Internet-Drafts as reference
21	   material or to cite them other than as "work in progress."

23	   The list of current Internet-Drafts can be accessed at
24	   http://www.ietf.org/ietf/1id-abstracts.txt

26	   To view the list of Internet-Draft Shadow Directories, see
27	   http://www.ietf.org/shadow.html.

29	   A revised version of this document will be submitted to the RFC
30	   editor as an Informational Document for the Internet Community.

32	   A revised version of this draft document will be submitted to the RFC
33	   editor as a Proposed Standard for the Internet Community.  Discussion
34	   and suggestions for improvement are requested, and should be sent to
35	   ietf-imapext@IMC.ORG.  This document will expire before 22 December
36	   2002.  Distribution of this memo is unlimited.

38	Abstract

40	   This document describes an experimental server-based sorting
41	   extension to the IMAP4rev1 protocol, as implemented by the University
42	   of Washington's IMAP toolkit.  This extension provides substantial
43	   performance improvements for IMAP clients which offer sorted views.

45	   A server which supports this extension indicates this with a
46	   capability name of "SORT".  Client implementations SHOULD accept any
47	   capability name which begins with "SORT" as indicating support for
48	   the extension described in this document.  This provides for future
49	   upwards-compatible extensions.

51	   At the time this document was written, the IMAP Extensions Working
52	   Group (IETF-IMAPEXT) was considering upwards-compatible additions to
53	   the SORT extension described in this document, tentatively called the
54	   SORT2 extension.

56	Base Subject Text

58	   The "SUBJECT" SORT criterion uses the "base subject," which has specific
59	   subject artifacts of deployed Internet mail software removed.  Due to
60	   the complexity of these artifacts, the formal syntax for the subject
61	   extraction rules is ambiguous.  The following procedure is followed
62	   to determine the actual "base subject" which is used to sort by
63	   subject:

65	        (1) Convert any RFC 2047 encoded-words in the subject to
66	        UTF-8.  Convert all tabs and continuations to space.
67	        Convert all multiple spaces to a single space.

69	        (2) Remove all trailing text of the subject that matches
70	        the subj-trailer ABNF; repeat until no more matches are
71	        possible.

73	        (3) Remove all prefix text of the subject that matches the
74	        subj-leader ABNF.

76	        (4) If there is prefix text of the subject that matches the
77	        subj-blob ABNF, and removing that prefix leaves a non-empty
78	        subj-base, then remove the prefix text.

80	        (5) Repeat (3) and (4) until no matches remain.

82	   Note: it is possible to defer step (2) until step (6), but this
83	   requires checking for subj-trailer in step (4).

85	        (6) If the resulting text begins with the subj-fwd-hdr ABNF
86	        and ends with the subj-fwd-trl ABNF, remove the
87	        subj-fwd-hdr and subj-fwd-trl and repeat from step (2).

89	        (7) The resulting text is the "base subject" used in the
90	        SORT.

92	   All servers and disconnected clients MUST use exactly this algorithm
93	   when sorting by subject.  Otherwise there is potential for a user to
94	   get inconsistent results based on whether they are running in
95	   connected or disconnected IMAP mode.
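
   The steps above can be sketched in Python as follows.  This is an
   illustrative reading of the procedure, not a reference
   implementation: the function name base_subject and the regular
   expressions are assumptions made here, the blob pattern accepts any
   character other than "[" and "]" rather than only 7-bit BLOBCHAR,
   and step (4) only checks that some text remains after the blob.

```python
import re
from email.header import decode_header, make_header

# Regular-expression renderings of the ABNF rules (assumed equivalents).
_BLOB = r"\[[^\[\]]*\][ \t]*"                        # subj-blob
_REFWD = r"(?:re|fwd?)[ \t]*(?:" + _BLOB + r")?:"    # subj-refwd
_LEADER = re.compile(r"(?:" + _BLOB + r")*" + _REFWD + r"|[ \t]",
                     re.IGNORECASE)                  # subj-leader
_TRAILER = re.compile(r"(?:\(fwd\)|[ \t])\Z", re.IGNORECASE)  # subj-trailer
_BLOB_RE = re.compile(_BLOB)


def base_subject(raw):
    """Return the base subject of a raw Subject header value (sketch)."""
    # Step 1: decode RFC 2047 encoded-words, then normalize white space.
    text = str(make_header(decode_header(raw)))
    text = re.sub(r"[\r\n\t]+", " ", text)
    text = re.sub(r" {2,}", " ", text)
    while True:
        # Step 2: strip trailers until none match.
        while True:
            match = _TRAILER.search(text)
            if not match:
                break
            text = text[:match.start()]
        # Steps 3-5: strip leaders, then a leading blob if text remains.
        while True:
            match = _LEADER.match(text)
            if match:
                text = text[match.end():]
                continue
            match = _BLOB_RE.match(text)
            if match and text[match.end():]:
                text = text[match.end():]
                continue
            break
        # Step 6: unwrap "[fwd: ... ]" and repeat from step 2.
        lowered = text.lower()
        if lowered.startswith("[fwd:") and lowered.endswith("]"):
            text = text[5:-1]
            continue
        # Step 7: what remains is the base subject.
        return text
```

   For example, base_subject("Re: [list] Re: Hello world (fwd)") yields
   "Hello world", and a subject consisting only of "[blob]" is kept as
   is because removing the blob would leave an empty base.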

97	Additional Commands

99	   This command is an extension to the IMAP4rev1 base protocol.

101	   The section header is intended to correspond with where it would be
102	   located in the main document if it were part of the base
103	   specification.

105	6.3.SORT.       SORT Command

107	   Arguments:  sort program
108	               charset specification
109	               searching criteria (one or more)

111	   Data:       untagged responses: SORT

113	   Result:     OK - sort completed
114	               NO - sort error: can't sort that charset or
115	                    criteria
116	               BAD - command unknown or arguments invalid

118	      The SORT command is a variant of SEARCH with sorting semantics for
119	      the results.  SORT has two arguments before the searching criteria
120	      argument: a parenthesized list of sort criteria, and the searching
121	      charset.

123	      Note that unlike SEARCH, the searching charset argument is
124	      mandatory.  The US-ASCII and UTF-8 charsets MUST be implemented.
125	      All other charsets are optional.

127	      There is also a UID SORT command which corresponds to SORT the way
128	      that UID SEARCH corresponds to SEARCH.

130	      The SORT command first searches the mailbox for messages that
131	      match the given searching criteria using the charset argument for
132	      the interpretation of strings in the searching criteria.  It then
133	      returns the matching messages in an untagged SORT response, sorted
134	      according to one or more sort criteria.

136	      Sorting is in ascending order.  Earlier dates sort before later
137	      dates; smaller sizes sort before larger sizes; and strings are
138	      sorted according to ascending values established by their
139	      collation algorithm (see under "Internationalization
140	      Considerations").

142	      If two or more messages exactly match according to the sorting
143	      criteria, these messages are sorted according to the order in
144	      which they appear in the mailbox.  In other words, there is an
145	      implicit sort criterion of "sequence number".

147	      When multiple sort criteria are specified, the result is sorted in
148	      the priority order in which the criteria appear.  For example,
149	      (SUBJECT DATE) will sort messages by their base subject text;
150	      messages with the same base subject text are then sorted by
151	      their sent date.
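
   The ordering rules above can be illustrated with a short Python
   sketch.  The message dictionaries, the field names, and the function
   sort_by_criteria are hypothetical; the sort keys are assumed to have
   been extracted already (base subject text, a comparable sent date),
   and all criteria are ascending.

```python
def sort_by_criteria(messages, criteria):
    """Return sequence numbers ordered by the given ascending criteria."""
    def sort_key(msg):
        # Criteria in priority order, then the implicit sequence number.
        return tuple(msg[name] for name in criteria) + (msg["seq"],)
    return [msg["seq"] for msg in sorted(messages, key=sort_key)]


# Hypothetical mailbox: dates are plain integers for brevity.
messages = [
    {"seq": 1, "SUBJECT": "lunch", "DATE": 20020603},
    {"seq": 2, "SUBJECT": "budget", "DATE": 20020605},
    {"seq": 3, "SUBJECT": "lunch", "DATE": 20020601},
    {"seq": 4, "SUBJECT": "budget", "DATE": 20020605},
]
print(sort_by_criteria(messages, ["SUBJECT", "DATE"]))  # prints [2, 4, 3, 1]
```

   Messages 2 and 4 match on both criteria, so they fall back to
   mailbox order.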

153	      Untagged EXPUNGE responses are not permitted while the server is
154	      responding to a SORT command, but are permitted during a UID SORT
155	      command.

157	      The defined sort criteria are as follows.  Refer to the Formal
158	      Syntax section for the precise syntactic definitions of the
159	      arguments.  If the associated RFC-822 header for a particular
160	      criterion is absent, it is treated as the empty string.  The empty
161	      string always collates before non-empty strings.

163	      ARRIVAL
164	         Internal date and time of the message.  This differs from the
165	         ON criteria in SEARCH, which uses just the internal date.

167	      CC
168	         RFC-822 local-part of the first "cc" address.

170	      DATE
171	         Sent date and time from the Date: header, adjusted by time
172	         zone.  This differs from the SENTON criteria in SEARCH, which
173	         uses just the date and not the time, nor adjusts by time zone.

175	      FROM
176	         RFC-822 local-part of the first "From" address.

178	      REVERSE
179	         Followed by another sort criterion, has the effect of that
180	         criterion but in reverse (descending) order.
181	            Note: REVERSE only reverses a single criterion, and does not
182	            affect the implicit "sequence number" sort criterion if all
183	            other criteria are identical.  Consequently, a sort of
184	            REVERSE SUBJECT is not the same as a reverse ordering of a
185	            SUBJECT sort.
186	            This can be avoided by use of additional criteria, e.g.
187	            SUBJECT DATE vs. REVERSE SUBJECT REVERSE DATE.  In general,
188	            however, it's better (and faster, if the client has a
189	            "reverse current ordering" command) to reverse the results
190	            in the client instead of issuing a new SORT.

192	      SIZE
193	         Size of the message in octets.

195	      SUBJECT
196	         Base subject text.

198	      TO
199	         RFC-822 local-part of the first "To" address.
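
   As an illustration of the address and date criteria above, the
   following Python sketch derives sort keys from raw header values
   using the standard email.utils module.  The helper names
   first_local_part and sent_date_utc are assumptions made here, and
   treating a Date: header without usable zone information as UTC is a
   choice of this sketch, not something this document specifies.

```python
from datetime import timezone
from email.utils import getaddresses, parsedate_to_datetime


def first_local_part(header_value):
    """Local-part of the first address, or "" if the header is absent."""
    if not header_value:
        return ""  # an absent header is treated as the empty string
    addresses = getaddresses([header_value])
    if not addresses:
        return ""
    addr = addresses[0][1]
    # Everything before the last "@" is the local-part.
    return addr.rpartition("@")[0] if "@" in addr else addr


def sent_date_utc(date_header):
    """Sent date and time, adjusted by time zone to UTC."""
    parsed = parsedate_to_datetime(date_header)
    if parsed.tzinfo is None:
        # Assumption of this sketch: no zone information means UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)
```

   With these keys, a message sent at 18:30 +0200 sorts before one sent
   at 10:00 -0700 on the same calendar day, because the comparison is
   made after adjusting by time zone.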

201	   Example:    C: A282 SORT (SUBJECT) UTF-8 SINCE 1-Feb-1994
202	               S: * SORT 2 84 882
203	               S: A282 OK SORT completed
204	               C: A283 SORT (SUBJECT REVERSE DATE) UTF-8 ALL
205	               S: * SORT 5 3 4 1 2
206	               S: A283 OK SORT completed
207	               C: A284 SORT (SUBJECT) US-ASCII TEXT "not in mailbox"
208	               S: * SORT
209	               S: A284 OK SORT completed
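
   The interaction between REVERSE and the implicit "sequence number"
   criterion can be shown with a small Python sketch.  The function
   sort_with_reverse and the message dictionaries are hypothetical; the
   sketch relies on Python's sort being stable, applying the criteria
   from lowest to highest priority.

```python
def sort_with_reverse(messages, criteria):
    """criteria: list of (field, reverse) pairs, highest priority first."""
    # Start from mailbox order: the implicit, always-ascending criterion.
    ordered = sorted(messages, key=lambda msg: msg["seq"])
    # Apply criteria from lowest to highest priority; each stable sort
    # keeps the order of messages that compare equal on its own key.
    for field, reverse in reversed(criteria):
        ordered = sorted(ordered, key=lambda msg: msg[field], reverse=reverse)
    return [msg["seq"] for msg in ordered]


# Hypothetical mailbox with two messages sharing a base subject.
messages = [
    {"seq": 1, "SUBJECT": "alpha"},
    {"seq": 2, "SUBJECT": "beta"},
    {"seq": 3, "SUBJECT": "alpha"},
]
ascending = sort_with_reverse(messages, [("SUBJECT", False)])
reverse_subject = sort_with_reverse(messages, [("SUBJECT", True)])
print(ascending)        # prints [1, 3, 2]
print(reverse_subject)  # prints [2, 1, 3]
```

   Reversing the ascending result in the client would give 2 3 1, which
   differs from the REVERSE SUBJECT result 2 1 3 in the order of the two
   messages with identical subjects.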

211	Additional Responses

213	   This response is an extension to the IMAP4rev1 base protocol.

215	   The section heading of this response is intended to correspond with
216	   where it would be located in the main document.

218	7.2.SORT.       SORT Response

220	   Data:       zero or more numbers

222	      The SORT response occurs as a result of a SORT or UID SORT
223	      command.  The number(s) refer to those messages that match the
224	      search criteria.  For SORT, these are message sequence numbers;
225	      for UID SORT, these are unique identifiers.  Each number is
226	      delimited by a space.

228	   Example:    S: * SORT 2 3 6
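
   A client might read the untagged response along the following lines.
   This is a minimal Python sketch; the function name
   parse_sort_response is an assumption made here, and the input is the
   response line without the "S: " transcript prefix.

```python
def parse_sort_response(line):
    """Parse an untagged SORT response into a list of numbers."""
    tokens = line.split()
    if len(tokens) < 2 or tokens[0] != "*" or tokens[1].upper() != "SORT":
        raise ValueError("not an untagged SORT response: %r" % line)
    numbers = []
    for token in tokens[2:]:
        # Sequence numbers and unique identifiers are non-zero numbers.
        if not token.isdigit() or int(token) == 0:
            raise ValueError("invalid number in SORT response: %r" % token)
        numbers.append(int(token))
    return numbers


print(parse_sort_response("* SORT 2 3 6"))  # prints [2, 3, 6]
print(parse_sort_response("* SORT"))        # prints []
```

   The order of the returned list is the sort order and must be
   preserved by the caller.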

230	Formal Syntax of SORT commands and Responses

232	   sort-data       = "SORT" *(SP nz-number)

234	   sort            = ["UID" SP] "SORT" SP
235	                     "(" sort-criterion *(SP sort-criterion) ")"
236	                     SP search-charset 1*(SP search-key)

238	   sort-criterion  = ["REVERSE" SP] sort-key

240	   sort-key        = "ARRIVAL" / "CC" / "DATE" / "FROM" / "SIZE" /
241	                     "SUBJECT" / "TO"
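
   The command grammar above can be exercised with a small builder that
   validates the sort criteria before composing the command line.  This
   Python sketch is illustrative only: build_sort_command and its
   parameters are hypothetical, the search keys are passed through
   without validation, and the tag is included because a full command
   line carries one even though the "sort" rule does not.

```python
SORT_KEYS = {"ARRIVAL", "CC", "DATE", "FROM", "SIZE", "SUBJECT", "TO"}


def build_sort_command(tag, criteria, charset, search_keys, uid=False):
    """Compose a SORT (or UID SORT) command line from its parts."""
    if not criteria:
        raise ValueError("at least one sort criterion is required")
    if not search_keys:
        raise ValueError("at least one search key is required")
    checked = []
    for criterion in criteria:
        words = criterion.upper().split()
        # sort-criterion = ["REVERSE" SP] sort-key
        key = words[-1] if words else ""
        prefix_ok = len(words) == 1 or (len(words) == 2 and words[0] == "REVERSE")
        if key not in SORT_KEYS or not prefix_ok:
            raise ValueError("invalid sort criterion: %r" % criterion)
        checked.append(" ".join(words))
    command = "UID SORT" if uid else "SORT"
    return "%s %s (%s) %s %s" % (
        tag, command, " ".join(checked), charset, " ".join(search_keys))


print(build_sort_command("A283", ["SUBJECT", "REVERSE DATE"], "UTF-8", ["ALL"]))
# prints A283 SORT (SUBJECT REVERSE DATE) UTF-8 ALL
```

   The charset argument is always emitted, reflecting that it is
   mandatory for SORT.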

243	   The following syntax describes base subject extraction rules (2)-(6):

245	   subject         = *subj-leader [subj-middle] *subj-trailer

247	   subj-refwd      = ("re" / ("fw" ["d"])) *WSP [subj-blob] ":"

249	   subj-blob       = "[" *BLOBCHAR "]" *WSP

251	   subj-fwd        = subj-fwd-hdr subject subj-fwd-trl

253	   subj-fwd-hdr    = "[fwd:"

255	   subj-fwd-trl    = "]"

257	   subj-leader     = (*subj-blob subj-refwd) / WSP

259	   subj-middle     = *subj-blob (subj-base / subj-fwd)
260	                   ; last subj-blob is subj-base if subj-base would
261	                   ; otherwise be empty

263	   subj-trailer    = "(fwd)" / WSP

265	   subj-base       = NONWSP *([*WSP] NONWSP)
266	                   ; can be a subj-blob

268	   BLOBCHAR        = %x01-5a / %x5c / %x5e-7f
269	                   ; any CHAR except '[' and ']'

271	   NONWSP          = %x01-08 / %x0a-1f / %x21-7f
272	                   ; any CHAR other than WSP

274	Security Considerations

276	   Security issues are not discussed in this memo.

278	Internationalization Considerations

280	   By default, strings are sorted according to the "minimum sorting
281	   collation algorithm".  All implementations of SORT MUST implement the
282	   minimum sorting collation algorithm.

284	   In the minimum sorting collation algorithm, the Basic Latin
285	   alphabetics (U+0041 to U+005A uppercase, U+0061 to U+007A lowercase)
286	   are sorted in a case-insensitive fashion; that is, "A" (U+0041) and
287	   "a" (U+0061) are treated as exact equals.  The characters U+005B to
288	   U+0060 are sorted after the Basic Latin alphabetics; for example,
289	   U+005E is sorted after U+005A and U+007A.  All other characters are
290	   sorted according to their octet values, as expressed in UTF-8.  No
291	   attempt is made to treat composed characters specially, or to do
292	   case-insensitive comparisons of composed characters.

294	        Note: this means, among other things, that the composed
295	        characters in the Latin-1 Supplement are not compared in
296	        what would be considered an ISO 8859-1 "case-insensitive"
297	        fashion.  Case comparison rules for characters with
298	        diacriticals differ between languages; the minimum sorting
299	        collation does not attempt to deal with this at all.  This
300	        is reserved for other sorting collations, which may be
301	        language-specific.
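
   One way to realize the minimum sorting collation algorithm is to
   fold the Basic Latin lowercase letters to uppercase and then compare
   UTF-8 octets.  The Python sketch below takes that approach; the
   function name minimum_collation_key is an assumption made here.
   Folding to uppercase (rather than lowercase) is what places U+005B
   to U+0060 after all Basic Latin alphabetics.

```python
def minimum_collation_key(text):
    """Sort key for the minimum sorting collation (sketch)."""
    # Fold only U+0061..U+007A onto U+0041..U+005A; composed characters
    # and all other characters are left untouched.
    folded = "".join(
        chr(ord(ch) - 0x20) if "a" <= ch <= "z" else ch for ch in text
    )
    # All remaining ordering is by octet value in UTF-8.
    return folded.encode("utf-8")


print(sorted(["b", "A", "^", "a"], key=minimum_collation_key))
# prints ['A', 'a', 'b', '^']
```

   Because the key of the empty string is an empty octet sequence, it
   compares lower than the key of any non-empty string.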

303	   ;;;   *** ITEM FOR DISCUSSION ***
304	   ;;; THERE IS SOME CONCERN THAT THIS MINIMUM COLLATION IS TOO MINIMAL,
305	   ;;; AND THAT THE "GENERIC UNICODE SORTING COLLATION" DISCUSSED BELOW
306	   ;;; NEEDS TO BE THE MINIMUM.  ONE SUGGESTION IS UNICODE TECHNICAL
307	   ;;; STANDARD 10 (TR-10).  IF THIS IS THE MINIMUM, THAT REQUIRES THAT
308	   ;;; ALL IMPLEMENTATIONS OF SORT AND THREAD BE UNICODE-SAVVY AT LEAST
309	   ;;; TO THE POINT OF IMPLEMENTING TR-10.  IS THIS REALISTIC?  DOES
310	   ;;; THIS RAISE EXCESSIVE IMPLEMENTATION BARRIERS?
311	   Other sorting collations, and the ability to change the sorting
312	   collation, will be defined in a separate document dealing with IMAP
313	   internationalization.

315	   It is anticipated that there will be a generic Unicode sorting
316	   collation, which will provide generic case-insensitivity for
317	   alphabetic scripts, specification of composed character handling, and
318	   language-specific sorting collations.  A server which implements
319	   non-default sorting collations will modify its sorting behavior
320	   according to the selected sorting collation.

322	   Non-English translations of "Re" or "Fw"/"Fwd" are not specified for
323	   removal in the base subject extraction process.  By specifying that
324	   only the English forms of the prefixes are used, it becomes a simple
325	   display time task to localize the prefix language for the user.  If,
326	   on the other hand, prefixes in multiple languages are permitted, the
327	   result is a geometrically complex, and ultimately unimplementable,
328	   task.  In order to improve the ability to support non-English display
329	   in Internet mail clients, only the English form of these prefixes
330	   should be transmitted in Internet mail messages.

332	Authors' Addresses

334	   Mark R. Crispin
335	   Networks and Distributed Computing
336	   University of Washington
337	   4545 15th Avenue NE
338	   Seattle, WA  98105-4527

340	   Phone: (206) 543-5762

342	   EMail: MRC@CAC.Washington.EDU

344	   Kenneth Murchison
345	   Oceana Matrix Ltd.
346	   21 Princeton Place
347	   Orchard Park, NY 14127

349	   Phone: (716) 662-8973 x26

351	   EMail: ken@oceana.com