Network Working Group                                           B. Leiba
Internet Draft                           IBM T.J. Watson Research Center
Document: draft-leiba-imap-implement-guide-10.txt              July 1999
                                                    Expires January 2000

                  IMAP4 Implementation Recommendations

Status of this Document

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026. Internet-Drafts are working
   documents of the Internet Engineering Task Force (IETF), its areas,
   and its working groups. Note that other groups may also distribute
   working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This document provides information for the Internet community. This
   document does not specify an Internet standard of any kind.
   Distribution of this document is unlimited. A revised version of
   this draft document will be submitted to the RFC editor. Discussion
   and suggestions for improvement are requested; discussion should be
   held on the IMAP mailing list, "imap@u.washington.edu" (subscription
   requests go to "imap-request@u.washington.edu").

1. Abstract

   The IMAP4 specification [RFC-2060] describes a rich protocol for use
   in building clients and servers for storage, retrieval, and
   manipulation of electronic mail. Because the protocol is so rich and
   has so many implementation choices, there are often trade-offs that
   must be made and issues that must be considered when designing such
   clients and servers. This document attempts to outline these issues
   and to make recommendations in order to make the end products as
   interoperable as possible.

2. Conventions used in this document

   In examples, "C:" indicates lines sent by a client that is connected
   to a server. "S:" indicates lines sent by the server to the client.

   The words "must", "must not", "should", "should not", and "may" are
   used with specific meaning in this document; since their meaning is
   somewhat different from that specified in RFC 2119, we do not put
   them in all caps here. Their meaning is as follows:

   must --
      This word means that the action described is necessary to ensure
      interoperability. The recommendation should not be ignored.
   must not --
      This phrase means that the action described will be almost
      certain to hurt interoperability. The recommendation should not
      be ignored.
   should --
      This word means that the action described is strongly
      recommended and will enhance interoperability or usability. The
      recommendation should not be ignored without careful
      consideration.
   should not --
      This phrase means that the action described is strongly
      recommended against, and might hurt interoperability or
      usability. The recommendation should not be ignored without
      careful consideration.
   may --
      This word means that the action described is an acceptable
      implementation choice. No specific recommendation is implied;
      this word is used to point out a choice that might not be
      obvious, or to let implementors know what choices have been made
      by existing implementations.

3. Interoperability Issues and Recommendations

3.1. Accessibility

   This section describes the issues related to access to servers and
   server resources. Concerns here include data sharing and maintenance
   of client/server connections.

3.1.1. Multiple Accesses of the Same Mailbox

   One strong point of IMAP4 is that, unlike POP3, it allows for
   multiple simultaneous access to a single mailbox. A user can, thus,
   read mail from a client at home while the client in the office is
   still connected; or the help desk staff can all work out of the same
   inbox, all seeing the same pool of questions. An important point
   about this capability, though, is that NO SERVER IS GUARANTEED TO
   SUPPORT THIS. If you are selecting an IMAP server and this facility
   is important to you, be sure that the server you choose to install,
   in the configuration you choose to use, supports it.

   If you are designing a client, you must not assume that you can
   access the same mailbox more than once at a time. That means:
   1. you must handle gracefully the failure of a SELECT command if
      the server refuses the second SELECT,
   2. you must handle reasonably the severing of your connection (see
      "Severed Connections", below) if the server chooses to allow the
      second SELECT by forcing the first off,
   3. you must avoid making multiple connections to the same mailbox
      in your own client (for load balancing or other such reasons),
      and
   4. you must avoid using the STATUS command on a mailbox that you
      have selected (with some server implementations the STATUS
      command has the same problems with multiple access as do the
      SELECT and EXAMINE commands).

   A further note about STATUS: The STATUS command is sometimes used to
   check a non-selected mailbox for new mail. This mechanism must not
   be used to check for new mail in the selected mailbox; section 5.2 of
   [RFC-2060] specifically forbids this in its last paragraph. Further,
   since STATUS takes a mailbox name it is an independent operation, not
   operating on the selected mailbox. Because of this, the information
   it returns is not necessarily in synchronization with the selected
   mailbox state.

3.1.2. Severed Connections

   The client/server connection may be severed for one of three reasons:
   the client severs the connection, the server severs the connection,
   or the connection is severed by outside forces beyond the control of
   the client and the server (a telephone line drops, for example).
   Clients and servers must both deal with these situations.

   When the client wants to sever a connection, it's usually because it
   has finished the work it needed to do on that connection. The client
   should send a LOGOUT command, wait for the tagged response, and then
   close the socket. But note that, while this is what's intended in
   the protocol design, there isn't universal agreement here. Some
   contend that sending the LOGOUT and waiting for the two responses
   (untagged BYE and tagged OK) is wasteful and unnecessary, and that
   the client can simply close the socket. The server should interpret
   the closed socket as a log out by the client. The counterargument is
   that it's useful from the standpoint of cleanup, problem
   determination, and the like, to have an explicit client log out,
   because otherwise there is no way for the server to tell the
   difference between "closed socket because of log out" and "closed
   socket because communication was disrupted". If there is a
   client/server interaction problem, a client that routinely
   terminates a session by breaking the connection without a LOGOUT will
   make it much more difficult to determine the problem.

   Because of this disagreement, server designers must be aware that
   some clients might close the socket without sending a LOGOUT. In any
   case, whether or not a LOGOUT was sent, the server should not
   implicitly expunge any messages from the selected mailbox. If a
   client wants the server to do so, it must send a CLOSE or EXPUNGE
   command explicitly.
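
   The client-side sequence recommended above (send LOGOUT, wait for
   the tagged response, then close the socket) can be sketched as
   follows. This is only an illustration under stated assumptions: the
   function name "logout", the default tag "A999", and the use of
   separate file-like reader and writer objects are hypothetical and
   are not taken from the IMAP4 specification.

```python
import io


def logout(reader, writer, tag="A999"):
    """Send LOGOUT and read until the tagged response or end of stream.

    `reader` and `writer` are assumed to be binary file-like objects
    (for example, the result of socket.makefile). Returns a tuple of
    (untagged_lines, tagged_line); tagged_line is None if the
    connection was severed before the tagged response arrived.
    """
    writer.write(tag.encode("ascii") + b" LOGOUT\r\n")
    writer.flush()
    untagged = []
    while True:
        raw = reader.readline()
        if not raw:
            # Connection closed before the tagged response was seen.
            return untagged, None
        line = raw.decode("ascii", "replace").rstrip("\r\n")
        if line.startswith(tag + " "):
            return untagged, line
        # Typically the untagged BYE, possibly preceded by other data.
        untagged.append(line)


# Usage with in-memory streams standing in for a real connection.
server_reply = io.BytesIO(
    b"* BYE IMAP4rev1 server logging out\r\nA999 OK LOGOUT completed\r\n"
)
sent = io.BytesIO()
untagged_lines, tagged_line = logout(server_reply, sent)
```

   Only after this function returns would the caller close the
   underlying socket; a None tagged line tells the caller that the
   connection was already gone.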

   When the server wants to sever a connection it's usually due to an
   inactivity timeout or because a situation has arisen that has
   changed the state of the mail store in a way that the server cannot
   communicate to the client. The server should send an untagged BYE
   response to the client and then close the socket. Sending an
   untagged BYE response before severing allows the server to send a
   human-readable explanation of the problem to the client, which the
   client may then log, display to the user, or both (see section 7.1.5
   of [RFC-2060]).

   Regarding inactivity timeouts, there is some controversy. Unlike
   POP, for which the design is for a client to connect, retrieve mail,
   and log out, IMAP's design encourages long-lived (and mostly
   inactive) client/server sessions. As the number of users grows, this
   can use up a lot of server resources, especially with clients that
   are designed to maintain sessions for mailboxes that the user has
   finished accessing. To alleviate this, a server may implement an
   inactivity timeout, unilaterally closing a session (after first
   sending an untagged BYE, as noted above). Some server operators have
   reported dramatic improvements in server performance after doing
   this. As specified in [RFC-2060], if such a timeout is done it must
   not be until at least 30 minutes of inactivity. The reason for this
   specification is to prevent clients from sending commands (such as
   NOOP) to the server at frequent intervals simply to avert a too-early
   timeout. If the client knows that the server may not time out the
   session for at least 30 minutes, then the client need not poll at
   intervals more frequent than, say, 25 minutes.

3.2. Scaling

   IMAP4 has many features that allow for scalability, as mail stores
   become larger and more numerous. Large numbers of users, mailboxes,
   and messages, and very large messages require thought to handle
   efficiently. This document will not address the administrative
   issues involved in large numbers of users, but we will look at the
   other items.

3.2.1. Flood Control

   There are three situations when a client can make a request that will
   result in a very large response - too large for the client reasonably
   to deal with: there are a great many mailboxes available, there are a
   great many messages in the selected mailbox, or there is a very large
   message part. The danger here is that the end user will be stuck
   waiting while the server sends (and the client processes) an enormous
   response. In all of these cases there are things a client can do to
   reduce that danger.

   There is also the case where a client can flood a server, by sending
   an arbitrarily long command. We'll discuss that issue, too, in this
   section.

3.2.1.1. Listing Mailboxes

   Some servers present Usenet newsgroups to IMAP users. Newsgroups,
   and other such hierarchical mailbox structures, can be very numerous
   but may have only a few entries at the top level of hierarchy. Also,
   some servers are built against mail stores that can, unbeknownst to
   the server, have circular hierarchies - that is, it's possible for
   "a/b/c/d" to resolve to the same file structure as "a", which would
   then mean that "a/b/c/d/b" is the same as "a/b", and the hierarchy
   will never end. The LIST response in this case will be unlimited.

   Clients that will have trouble with this are those that use
        C: 001 LIST "" *
   to determine the mailbox list. Because of this, clients should not
   use an unqualified "*" that way in the LIST command. A safer
   approach is to list each level of hierarchy individually, allowing
   the user to traverse the tree one limb at a time, thus:

        C: 001 LIST "" %
        S: * LIST () "/" Banana
        S: * LIST ...etc...
        S: 001 OK done
   and then
        C: 002 LIST "" Banana/%
        S: * LIST () "/" Banana/Apple
        S: * LIST ...etc...
        S: 002 OK done

   Using this technique the client's user interface can give the user
   full flexibility without choking on the voluminous reply to "LIST *".

   Of course, it is still possible that the reply to
        C: 005 LIST "" alt.fan.celebrity.%
   may be thousands of entries long, and there is, unfortunately,
   nothing the client can do to protect itself from that. This has not
   yet been a notable problem.

   Servers that may export circular hierarchies (any server that
   directly presents a UNIX file system, for instance) should limit the
   hierarchy depth to prevent unlimited LIST responses. A suggested
   depth limit is 20 hierarchy levels.

3.2.1.2. Fetching the List of Messages

   When a client selects a mailbox, it is given a count, in the untagged
   EXISTS response, of the messages in the mailbox. This number can be
   very large. In such a case it might be unwise to use
        C: 004 FETCH 1:* ALL
   to populate the user's view of the mailbox. One good method to avoid
   problems with this is to batch the requests, thus:

        C: 004 FETCH 1:50 ALL
        S: * 1 FETCH ...etc...
        S: 004 OK done
        C: 005 FETCH 51:100 ALL
        S: * 51 FETCH ...etc...
        S: 005 OK done
        C: 006 FETCH 101:150 ALL
        ...etc...

   Using this method, another command, such as "FETCH 6 BODY[1]" can be
   inserted as necessary, and the client will not have its access to the
   server blocked by a storm of FETCH replies. (Such a method could be
   reversed to fetch the LAST 50 messages first, then the 50 prior to
   that, and so on.)
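
   The batching just described, including the reversed variant, can be
   sketched as a small generator. The name "fetch_batches" and its
   parameters are illustrative assumptions; only the batch size of 50
   comes from the example above.

```python
def fetch_batches(exists, size=50, newest_first=False):
    """Yield sequence ranges ("1:50", "51:100", ...) covering 1..exists.

    `exists` is the message count from the untagged EXISTS response.
    With newest_first=True the ranges start from the end of the mailbox.
    """
    if exists < 1 or size < 1:
        return
    if newest_first:
        high = exists
        while high >= 1:
            low = max(1, high - size + 1)
            yield f"{low}:{high}"
            high = low - 1
    else:
        low = 1
        while low <= exists:
            high = min(exists, low + size - 1)
            yield f"{low}:{high}"
            low = high + 1


# Build the FETCH commands for a 120-message mailbox, oldest first.
commands = [f"FETCH {rng} ALL" for rng in fetch_batches(120)]
```

   Because each range is a separate command, the client can interleave
   another request between batches instead of waiting for one very
   large response.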

   As a smart extension of this, a well designed client, prepared for
   very large mailboxes, will not automatically fetch data for all
   messages AT ALL. Rather, the client will populate the user's view
   only as the user sees it, possibly pre-fetching selected information,
   and only fetching other information as the user scrolls to it. For
   example, to select only those messages beginning with the first
   unseen one:

        C: 003 SELECT INBOX
        S: * 10000 EXISTS
        S: * 80 RECENT
        S: * FLAGS (\Answered \Flagged \Deleted \Draft \Seen)
        S: * OK [UIDVALIDITY 824708485] UID validity status
        S: * OK [UNSEEN 9921] First unseen message
        S: 003 OK [READ-WRITE] SELECT completed
        C: 004 FETCH 9921:* ALL
        ... etc...

   If the server does not return an OK [UNSEEN] response, the client may
   use SEARCH UNSEEN to obtain that value.

   This mechanism is good as a default presentation method, but only
   works well if the default message order is acceptable. A client may
   want to present various sort orders to the user (by subject, by date
   sent, by sender, and so on) and in that case (lacking a SORT
   extension on the server side) the client WILL have to retrieve all
   message descriptors. A client that provides this service should not
   do it by default and should inform the user of the costs of choosing
   this option for large mailboxes.

3.2.1.3. Fetching a Large Body Part

   The issue here is similar to the one for a list of messages. In the
   BODYSTRUCTURE response the client knows the size, in bytes, of the
   body part it plans to fetch. Suppose this is a 70 MB video clip.
   The client can use partial fetches to retrieve the body part in
   pieces, avoiding the problem of an uninterruptible 70 MB literal
   coming back from the server. The starting offset of a partial fetch
   is zero-based, so each request begins at the octet immediately
   following the previous piece:

        C: 022 FETCH 3 BODY[1]<0.20000>
        S: * 3 FETCH (FLAGS (\Seen) BODY[1]<0> {20000}
        S: ...data...)
        S: 022 OK done
        C: 023 FETCH 3 BODY[1]<20000.20000>
        S: * 3 FETCH (BODY[1]<20000> {20000}
        S: ...data...)
        S: 023 OK done
        C: 024 FETCH 3 BODY[1]<40000.20000>
        ...etc...

3.2.1.4. BODYSTRUCTURE vs. Entire Messages

   Because FETCH BODYSTRUCTURE is necessary in order to determine the
   number of body parts, and, thus, whether a message has "attachments",
   clients often use FETCH FULL as their normal method of populating the
   user's view of a mailbox. The benefit is that the client can display
   a paperclip icon or some such indication along with the normal
   message summary. However, this comes at a significant cost with some
   server configurations. The parsing needed to generate the FETCH
   BODYSTRUCTURE response may be time-consuming compared with that
   needed for FETCH ENVELOPE. The client developer should consider this
   issue when deciding whether the ability to add a paperclip icon is
   worth the tradeoff in performance, especially with large mailboxes.

   Some clients, rather than using FETCH BODYSTRUCTURE, use FETCH BODY[]
   (or the equivalent FETCH RFC822) to retrieve the entire message.
   They then do the MIME parsing in the client. This may give the
   client slightly more flexibility in some areas (access, for instance,
   to header fields that aren't returned in the BODYSTRUCTURE and
   ENVELOPE responses), but it can cause severe performance problems by
   forcing the transfer of all body parts when the user might only want
   to see some of them - a user logged on by modem and reading a small
   text message with a large ZIP file attached may prefer to read the
   text only and save the ZIP file for later. Therefore, a client
   should not normally retrieve entire messages and should retrieve
   message body parts selectively.

3.2.1.5. Long Command Lines

   A client can wind up building a very long command line in an effort
   to try to be efficient about requesting information from a server.
   This can typically happen when a client builds a message set from
   selected messages and doesn't recognize that contiguous blocks of
   messages may be grouped in a range. Suppose a user selects all 10,000
   messages in a large mailbox and then unselects message 287. The
   client could build that message set as "1:286,288:10000", but a
   client that doesn't handle that might try to enumerate each message
   individually and build "1,2,3,4, [and so on] ,9999,10000". Adding
   that to the fetch command results in a command line that's almost
   49,000 octets long, and, clearly, one can construct a command line
   that's even longer.

   A client should limit the length of the command lines it generates to
   approximately 1000 octets (including all quoted strings but not
   including literals). If the client is unable to group things into
   ranges so that the command line is within that length, it should
   split the request into multiple commands. The client should use
   literals instead of long quoted strings, in order to keep the command
   length down.

   For its part, a server should allow for a command line of at least
   8000 octets. This provides plenty of leeway for accepting reasonable
   length commands from clients. The server should send a BAD response
   to a command that does not end within the server's maximum accepted
   command length.

3.2.2. Subscriptions

   The client isn't the only entity that can get flooded: the end user,
   too, may need some flood control. The IMAP4 protocol provides such
   control in the form of subscriptions. Most servers support the
   SUBSCRIBE, UNSUBSCRIBE, and LSUB commands, and many users choose to
   narrow down a large list of available mailboxes by subscribing to the
   ones that they usually want to see. Clients, with this in mind,
   should give the user a way to see only subscribed mailboxes. A
   client that never uses the LSUB command takes a significant usability
   feature away from the user. Of course, the client would not want to
   hide the LIST command completely; the user needs to have a way to
   choose between LIST and LSUB. The usual way to do this is to provide
   a setting like "show which mailboxes?: [] all [] subscribed only".

3.2.3. Searching

   IMAP SEARCH commands can become particularly troublesome (that is,
   slow) on mailboxes containing a large number of messages. So let's
   put a few things in perspective in that regard.

   The flag searches should be fast. The flag searches (ALL, [UN]SEEN,
   [UN]ANSWERED, [UN]DELETED, [UN]DRAFT, [UN]FLAGGED, NEW, OLD, RECENT)
   are known to be used by clients for the client's own use (for
   instance, some clients use "SEARCH UNSEEN" to find unseen mail and
   "SEARCH DELETED" to warn the user before expunging messages).

   Other searches, particularly the text searches (HEADER, TEXT, BODY)
   are initiated by the user, rather than by the client itself, and
   somewhat slower performance can be tolerated, since the user is aware
   that the search is being done (and is probably aware that it might be
   time-consuming). A smart server might use dynamic indexing to speed
   commonly used text searches.

   The client may allow other commands to be sent to the server while a
   SEARCH is in progress, but at the time of this writing there is
   little or no server support for parallel processing of multiple
   commands in the same session (and see "Multiple Accesses of the Same
   Mailbox" above for a description of the dangers of trying to work
   around this by doing your SEARCH in another session).

   Another word about text searches: some servers, built on database
   back-ends with indexed search capabilities, may return search results
   that do not match the IMAP spec's "case-insensitive substring"
   requirements. While these servers are in violation of the protocol,
   there is little harm in the violation as long as the search results
   are used only in response to a user's request. Still, developers of
   such servers should be aware that they ARE violating the protocol,
   should think carefully about that behaviour, and must be certain that
   their servers respond accurately to the flag searches for the reasons
   outlined above.

   In addition, servers should support CHARSET UTF-8 [UTF-8] in
   searches.

3.3. Avoiding Invalid Requests

   IMAP4 provides ways for a server to tell a client in advance what is
   and isn't permitted in some circumstances. Clients should use these
   features to avoid sending requests that a well designed client would
   know to be invalid. This section explains this in more detail.

3.3.1. The CAPABILITY Command

   All IMAP4 clients should use the CAPABILITY command to determine what
   version of IMAP and what optional features a server supports. The
   client should not send IMAP4rev1 commands and arguments to a server
   that does not advertise IMAP4rev1 in its CAPABILITY response.
   Similarly, the client should not send IMAP4 commands that no longer
   exist in IMAP4rev1 to a server that does not advertise IMAP4 in its
   CAPABILITY response. An IMAP4rev1 server is NOT required to support
   obsolete IMAP4 or IMAP2bis commands (though some do; do not let this
   fact lull you into thinking that it's valid to send such commands to
   an IMAP4rev1 server).

   A client should not send commands to probe for the existence of
   certain extensions. All standard and standards-track extensions
   include CAPABILITY tokens indicating their presence. All private and
   experimental extensions should do the same, and clients that take
   advantage of them should use the CAPABILITY response to determine
   whether they may be used or not.

3.3.2. Don't Do What the Server Says You Can't

   In many cases, the server, in response to a command, will tell the
   client something about what can and can't be done with a particular
   mailbox. The client should pay attention to this information and
   should not try to do things that it's been told it can't do.
   Examples:
   * Do not try to SELECT a mailbox that has the \Noselect flag set.
   * Do not try to CREATE a sub-mailbox in a mailbox that has the
     \Noinferiors flag set.
   * Do not respond to a failing COPY or APPEND command by trying to
     CREATE the target mailbox if the server does not respond with a
     [TRYCREATE] response code.
   * Do not try to expunge a mailbox that has been selected with the
     [READ-ONLY] response code.

3.4. Miscellaneous Protocol Considerations

   We describe here a number of important protocol-related issues, the
   misunderstanding of which has caused significant interoperability
   problems in IMAP4 implementations. One general item is that every
   implementer should be certain to take note of and to understand
   section 2.2.2 and the preamble to section 7 of the IMAP4rev1 spec
   [RFC-2060].

3.4.1. Well Formed Protocol

   We cannot stress enough the importance of adhering strictly to the
   protocol grammar. The specification of the protocol is quite rigid;
   do not assume that you can insert blank space for "readability" if
   none is called for. Keep in mind that there are parsers out there
   that will crash if there are protocol errors. There are clients that
   will report every parser burp to the user. And in any case,
   information that cannot be parsed is information that is lost. Be
   careful in your protocol generation. And see "A Word About Testing",
   below.

   In particular, note that the string in the INTERNALDATE response is
   NOT an RFC-822 date string - that is, it is not in the same format as
   the first string in the ENVELOPE response. Since most clients will,
   in fact, accept an RFC-822 date string in the INTERNALDATE response,
   it's easy to miss this in your interoperability testing. But it will
   cause a problem with some client, so be sure to generate the correct
   string for this field.

3.4.2. Special Characters

   Certain characters, currently the double-quote and the backslash, may
   not be sent as-is inside a quoted string. These characters must be
   preceded by the escape character if they are in a quoted string, or
   else the string must be sent as a literal. Both clients and servers
   must handle this, both on output (they must send these characters
   properly) and on input (they must be able to receive escaped
   characters in quoted strings). Example:

        C: 001 LIST "" %
        S: * LIST () "" INBOX
        S: * LIST () "\\" TEST
        S: * LIST () "\\" {12}
        S: "My" mailbox
        S: 001 OK done
        C: 002 LIST "" "\"My\" mailbox\\%"
        S: * LIST () "\\" {17}
        S: "My" mailbox\Junk
        S: 002 OK done

   Note that in the example the server sent the hierarchy delimiter as
   an escaped character in the quoted string and sent the mailbox name
   containing embedded double-quotes as a literal. The client used only
   quoted strings, escaping both the backslash and the double-quote
   characters.

   The CR and LF characters may be sent ONLY in literals; they are not
   allowed, even if escaped, inside quoted strings.

   And while we're talking about special characters: the IMAP spec, in
   the section titled "Mailbox International Naming Convention",
   describes how to encode mailbox names in modified UTF-7 [UTF-7 and
   RFC-2060]. Implementations must adhere to this in order to be
   interoperable in the international market, and servers should
   validate mailbox names sent by clients and reject names that do not
   conform.

   As to special characters in userids and passwords: clients must not
   restrict what a user may type in for a userid or a password. The
   formal grammar specifies that these are "astrings", and an astring
   can be a literal. A literal, in turn, can contain any 8-bit
   character, and clients must allow users to enter all 8-bit characters
   here, and must pass them, unchanged, to the server (being careful to
   send them as literals when necessary). In particular, some server
   configurations use "@" in user names, and some clients do not allow
   that character to be entered; this creates a severe interoperability
   problem.

3.4.3. UIDs and UIDVALIDITY

   Servers that support existing back-end mail stores often have no good
   place to save UIDs for messages. Often the existing mail store will
   not have the concept of UIDs in the sense that IMAP has: strictly
   increasing, never re-issued, 32-bit integers. Some servers solve
   this by storing the UIDs in a place that's accessible to end users,
   allowing for the possibility that the users will delete them. Others
   solve it by re-assigning UIDs every time a mailbox is selected.

   The server should maintain UIDs permanently for all messages if it
   can. If that's not possible, the server must change the UIDVALIDITY
   value for the mailbox whenever any of the UIDs may have become
   invalid. Clients must recognize that the UIDVALIDITY has changed and
   must respond to that condition by throwing away any information that
   they have saved about UIDs in that mailbox. There have been many
   problems in this area when clients have failed to do this; in the
   worst case it will result in loss of mail when a client deletes the
   wrong piece of mail by using a stale UID.
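
   The client-side rule above (discard everything saved about UIDs in a
   mailbox when its UIDVALIDITY changes) might be sketched as follows.
   The class "UidCache" and its method names are hypothetical and exist
   only for this illustration; a real client would also persist this
   state between sessions.

```python
class UidCache:
    """Per-mailbox cache of UID-keyed data, guarded by UIDVALIDITY."""

    def __init__(self):
        self._validity = {}  # mailbox name -> last UIDVALIDITY seen
        self._entries = {}   # mailbox name -> {uid: cached data}

    def on_uidvalidity(self, mailbox, uidvalidity):
        """Record the server's UIDVALIDITY for `mailbox`.

        Returns True if cached entries had to be thrown away because
        the value differs from the one the cache was built under.
        """
        previous = self._validity.get(mailbox)
        self._validity[mailbox] = uidvalidity
        if previous != uidvalidity:
            discarded = bool(self._entries.get(mailbox))
            self._entries[mailbox] = {}
            return discarded
        return False

    def put(self, mailbox, uid, data):
        self._entries.setdefault(mailbox, {})[uid] = data

    def get(self, mailbox, uid):
        return self._entries.get(mailbox, {}).get(uid)


# Usage: the cache survives a re-select with the same UIDVALIDITY and
# is emptied when the server reports a different one.
cache = UidCache()
cache.on_uidvalidity("INBOX", 824708485)
cache.put("INBOX", 382, "cached envelope")
same = cache.on_uidvalidity("INBOX", 824708485)
changed = cache.on_uidvalidity("INBOX", 12345)
```

   Keying the cache by mailbox name and checking the value on every
   SELECT keeps a stale UID from ever being used in a UID command.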
573 It seems to be a common misunderstanding that "the UIDVALIDITY and 574 the UID, taken together, form a 64-bit identifier that uniquely 575 identifies a message on a server". This is absolutely NOT TRUE. 576 There is no assurance that the UIDVALIDITY values of two mailboxes be 577 different, so the UIDVALIDITY in no way identifies a mailbox. The 578 ONLY purpose of UIDVALIDITY is, as its name indicates, to give the 579 client a way to check the validity of the UIDs it has cached. While 580 it is a valid implementation choice to put these values together to 581 make a 64-bit identifier for the message, the important concept here 582 is that UIDs are not unique between mailboxes; they are only unique 583 WITHIN a given mailbox. 585 Some server implementations have attempted to make UIDs unique across 586 the entire server. This is inadvisable, in that it limits the life 587 of UIDs unnecessarily. The UID is a 32-bit number and will run out 588 in reasonably finite time if it's global across the server. If you 589 assign UIDs sequentially in one mailbox, you will not have to start 590 re-using them until you have had, at one time or another, 2**32 591 different messages in that mailbox. In the global case, you will 592 have to reuse them once you have had, at one time or another, 2**32 593 different messages in the entire mail store. Suppose your server has 594 around 8000 users registered (2**13). That gives an average of 2**19 595 UIDs per user. Suppose each user gets 32 messages (2**5) per day. 596 That gives you 2**14 days (16000+ days = about 45 years) before you 597 run out. That may seem like enough, but multiply the usage just a 598 little (a lot of spam, a lot of mailing list subscriptions, more 599 users) and you limit yourself too much. 
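The arithmetic above can be checked with a short calculation. The function name and parameters below are illustrative only; the figures (2**13 users, 2**5 messages per user per day) are the ones assumed in the text.

```python
UID_SPACE = 2**32  # UIDs are 32-bit integers


def days_until_exhaustion(users: int, msgs_per_user_per_day: int) -> int:
    """Whole days before a server-wide UID pool is used up."""
    return UID_SPACE // (users * msgs_per_user_per_day)


days = days_until_exhaustion(users=2**13, msgs_per_user_per_day=2**5)
years = days / 365.25
print(days, round(years, 1))   # 16384 days, roughly 44.9 years

# Ten times the daily mail volume shrinks the lifetime tenfold.
print(days_until_exhaustion(users=2**13, msgs_per_user_per_day=320))  # 1638 days
```

With per-mailbox UIDs the same 2**32 values are available to each mailbox separately, which is why the global scheme runs out so much sooner.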
601 What's worse is that if you have to wrap the UIDs, and, thus, you 602 have to change UIDVALIDITY and invalidate the UIDs in the mailbox, 603 you have to do it for EVERY mailbox in the system, since they all 604 share the same UID pool. If you assign UIDs per mailbox and you have 605 a problem, you only have to kill the UIDs for that one mailbox. 607 Under extreme circumstances (and this is extreme, indeed), the server 608 may have to invalidate UIDs while a mailbox is in use by a client - 609 that is, the UIDs that the client knows about in its active mailbox 610 are no longer valid. In that case, the server must immediately 611 change the UIDVALIDITY and must communicate this to the client. The 612 server may do this by sending an unsolicited UIDVALIDITY message, in 613 the same form as in response to the SELECT command. Clients must be 614 prepared to handle such a message and the possibly coincident failure 615 of the command in process. For example: 617 C: 032 UID STORE 382 +Flags.silent \Deleted 618 S: * OK [UIDVALIDITY 12345] New UIDVALIDITY value! 619 S: 032 NO UID command rejected because UIDVALIDITY changed! 620 C: ...invalidates local information and re-fetches... 621 C: 033 FETCH 1:* UID 622 ...etc... 624 At the time of the writing of this document, the only server known to 625 do this does so only under the following condition: the client 626 selects INBOX, but there is not yet a physical INBOX file created. 627 Nonetheless, the SELECT succeeds, exporting an empty INBOX with a 628 temporary UIDVALIDITY of 1. While the INBOX remains selected, mail 629 is delivered to the user, which creates the real INBOX file and 630 assigns a permanent UIDVALIDITY (that is likely not to be 1). The 631 server reports the change of UIDVALIDITY, but since there were no 632 messages before, no UIDs have actually changed, and all the client 633 must do is accept the change in UIDVALIDITY.
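A client can only react to an unsolicited UIDVALIDITY message if it looks for the response code in every untagged OK line, not just in those that follow a SELECT. A minimal sketch of that check, assuming the client already has each server line as a string, is given below; the function names are hypothetical and the regular expression covers only this one response code, not the full IMAP grammar.

```python
import re
from typing import Optional

# Matches an untagged OK response carrying a UIDVALIDITY response code,
# e.g. "* OK [UIDVALIDITY 12345] New UIDVALIDITY value!"
_UIDVALIDITY_RE = re.compile(r"^\* OK \[UIDVALIDITY (\d+)\]", re.IGNORECASE)


def parse_uidvalidity(line: str) -> Optional[int]:
    """Return the UIDVALIDITY value carried by the line, or None."""
    match = _UIDVALIDITY_RE.match(line)
    return int(match.group(1)) if match else None


def uidvalidity_changed(known: int, line: str) -> bool:
    """True if the line announces a UIDVALIDITY different from the known one."""
    value = parse_uidvalidity(line)
    return value is not None and value != known


line = "* OK [UIDVALIDITY 12345] New UIDVALIDITY value!"
if uidvalidity_changed(1, line):
    print("discard cached UIDs; new UIDVALIDITY is", parse_uidvalidity(line))
```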
635 Alternatively, a server may force the client to re-select the 636 mailbox, at which time it will obtain a new UIDVALIDITY value. To do 637 this, the server closes this client session (see "Severed 638 Connections" above) and the client then reconnects and gets back in 639 synch. Clients must be prepared for either of these behaviours. 641 We do not know of, nor do we anticipate the future existence of, a 642 server that changes UIDVALIDITY while there are existing messages, 643 but clients must be prepared to handle this eventuality. 645 3.4.4. FETCH Responses 647 When a client asks for certain information in a FETCH command, the 648 server may return the requested information in any order, not 649 necessarily in the order that it was requested. Further, the server 650 may return the information in separate FETCH responses and may also 651 return information that was not explicitly requested (to reflect to 652 the client changes in the state of the subject message). Some 653 examples: 655 C: 001 FETCH 1 UID FLAGS INTERNALDATE 656 S: * 5 FETCH (FLAGS (\Deleted)) 657 S: * 1 FETCH (FLAGS (\Seen) INTERNALDATE "..." UID 345) 658 S: 001 OK done 659 (In this case, the responses are in a different order. Also, the 660 server returned a flag update for message 5, which wasn't part of the 661 client's request.) 663 C: 002 FETCH 2 UID FLAGS INTERNALDATE 664 S: * 2 FETCH (INTERNALDATE "...") 665 S: * 2 FETCH (UID 399) 666 S: * 2 FETCH (FLAGS ()) 667 S: 002 OK done 668 (In this case, the responses are in a different order and were 669 returned in separate responses.) 671 C: 003 FETCH 2 BODY[1] 672 S: * 2 FETCH (FLAGS (\Seen) BODY[1] {14} 673 S: Hello world! 674 S: ) 675 S: 003 OK done 676 (In this case, the FLAGS response was added by the server, since 677 fetching the body part caused the server to set the \Seen flag.)
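One way for a client to cope with the behaviour shown in these examples is to treat every FETCH response as an independent update to its local state, whatever command happens to be outstanding. The sketch below assumes the responses have already been parsed into a message sequence number and a dictionary of data items; the apply_fetch and missing_items names, and the dictionary layout, are hypothetical.

```python
from typing import Any, Dict, Iterable, Set

MessageState = Dict[str, Any]


def apply_fetch(store: Dict[int, MessageState], seq: int,
                items: MessageState) -> None:
    """Merge the data items of one FETCH response into the local store."""
    store.setdefault(seq, {}).update(items)


def missing_items(store: Dict[int, MessageState], seq: int,
                  requested: Iterable[str]) -> Set[str]:
    """Requested items that have not arrived for this message so far."""
    return set(requested) - set(store.get(seq, {}))


store: Dict[int, MessageState] = {}

# Responses to "001 FETCH 1 UID FLAGS INTERNALDATE" from the first example.
apply_fetch(store, 5, {"FLAGS": ["\\Deleted"]})  # unrequested flag update
apply_fetch(store, 1, {"FLAGS": ["\\Seen"], "INTERNALDATE": "...", "UID": 345})

# Responses to "002 FETCH 2 UID FLAGS INTERNALDATE", split across three lines.
apply_fetch(store, 2, {"INTERNALDATE": "..."})
apply_fetch(store, 2, {"UID": 399})
apply_fetch(store, 2, {"FLAGS": []})

print(missing_items(store, 2, ["UID", "FLAGS", "INTERNALDATE"]))  # set()
```

Because the merge does not depend on arrival order or on which command was sent, the unrequested update for message 5 and the split responses for message 2 are handled by the same code path.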
679 Because of this characteristic a client must be ready to receive any 680 FETCH response at any time and should use that information to update 681 its local information about the message to which the FETCH response 682 refers. A client must not assume that any FETCH responses will come 683 in any particular order, or even that any will come at all. If after 684 receiving the tagged response for a FETCH command the client finds 685 that it did not get all of the information requested, the client 686 should send a NOOP command to the server to ensure that the server 687 has an opportunity to send any pending EXPUNGE responses to the 688 client (see [RFC-2180]). 690 3.4.5. RFC822.SIZE 692 Some back-end mail stores keep the mail in a canonical form, rather 693 than retaining the original MIME format of the messages. This means 694 that the server must reassemble the message to produce a MIME stream 695 when a client does a fetch such as RFC822 or BODY[], requesting the 696 entire message. It also may mean that the server has no convenient 697 way to know the RFC822.SIZE of the message. Often, such a server 698 will actually have to build the MIME stream to compute the size, only 699 to throw the stream away and report the size to the client. 701 When this is the case, some servers have chosen to estimate the size, 702 rather than to compute it precisely. Such an estimate allows the 703 client to display an approximate size to the user and to use the 704 estimate in flood control considerations (q.v.), but requires that 705 the client not use the size for things such as allocation of buffers, 706 because those buffers might then be too small to hold the actual MIME 707 stream. Instead, a client should use the size that's returned in the 708 literal when you fetch the data. 710 The protocol requires that the RFC822.SIZE value returned by the 711 server be EXACT. 
Estimating the size is a protocol violation, and 712 server designers must be aware that, despite the performance savings 713 they might realize in using an estimate, this practice will cause 714 some clients to fail in various ways. If possible, the server should 715 compute the RFC822.SIZE for a particular message once, and then save 716 it for later retrieval. If that's not possible, the server must 717 compute the value exactly every time. Incorrect estimates do cause 718 severe interoperability problems with some clients. 720 3.4.6. Expunged Messages 722 If the server allows multiple connections to the same mailbox, it is 723 often possible for messages to be expunged in one client unbeknownst 724 to another client. Since the server is not allowed to tell the 725 client about these expunged messages in response to a FETCH command, 726 the server may have to deal with the issue of how to return 727 information about an expunged message. There was extensive 728 discussion about this issue, and the results of that discussion are 729 summarized in [RFC-2180]. See that reference for a detailed 730 explanation and for recommendations. 732 3.4.7. The Namespace Issue 734 Namespaces are a very muddy area in IMAP4 implementation right now 735 (see [NAMESPACE] for a proposal to clear the water a bit). Until the 736 issue is resolved, the important thing for client developers to 737 understand is that some servers provide access through IMAP to more 738 than just the user's personal mailboxes, and, in fact, the user's 739 personal mailboxes may be "hidden" somewhere in the user's default 740 hierarchy. The client, therefore, should provide a setting wherein 741 the user can specify a prefix to be used when accessing mailboxes. 742 If the user's mailboxes are all in "~/mail/", for instance, then the 743 user can put that string in the prefix. 
The client would then put 744 the prefix in front of any name pattern in the LIST and LSUB 745 commands: 746 C: 001 LIST "" ~/mail/% 747 (See also "Reference Names in the LIST Command" below.) 749 3.4.8. Creating Special-Use Mailboxes 751 It may seem at first that this is part of the namespace issue; it is 752 not, and is only indirectly related to it. A number of clients like 753 to create special-use mailboxes with particular names. Most 754 commonly, clients with a "trash folder" model of message deletion 755 want to create a mailbox with the name "Trash" or "Deleted". Some 756 clients want to create a "Drafts" mailbox, an "Outbox" mailbox, or a 757 "Sent Mail" mailbox. And so on. There are two major 758 interoperability problems with this practice: 759 1. different clients may use different names for mailboxes with 760 similar functions (such as "Trash" and "Deleted"), or may manage the 761 same mailboxes in different ways, causing problems if a user switches 762 between clients and 763 2. there is no guarantee that the server will allow the creation of 764 the desired mailbox. 766 The client developer is, therefore, well advised to consider 767 carefully the creation of any special-use mailboxes on the server, 768 and, further, the client must not require such mailbox creation - 769 that is, if you do decide to do this, you must handle gracefully the 770 failure of the CREATE command and behave reasonably when your 771 special-use mailboxes do not exist and can not be created. 773 In addition, the client developer should provide a convenient way for 774 the user to select the names for any special-use mailboxes, allowing 775 the user to make these names the same in all clients used and to put 776 them where the user wants them. 778 3.4.9. Reference Names in the LIST Command 780 Many implementers of both clients and servers are confused by the 781 "reference name" on the LIST command. 
The reference name is intended 782 to be used in much the way a "cd" (change directory) command is used 783 on Unix, PC DOS, Windows, and OS/2 systems. That is, the mailbox 784 name is interpreted in much the same way as a file of that name would 785 be found if one had done a "cd" command into the directory specified 786 by the reference name. For example, in Unix we have the following: 788 > cd /u/jones/junk 789 > vi banana [file is "/u/jones/junk/banana"] 790 > vi stuff/banana [file is "/u/jones/junk/stuff/banana"] 791 > vi /etc/hosts [file is "/etc/hosts"] 793 In the past, there have been several interoperability problems with 794 this. First, while some IMAP servers are built on Unix or PC file 795 systems, many others are not, and the file system semantics do not 796 make sense in those configurations. Second, while some IMAP servers 797 expose the underlying file system to the clients, others allow access 798 only to the user's personal mailboxes, or to some other limited set 799 of files, making such file-system-like semantics less meaningful. 800 Third, because the IMAP spec leaves the interpretation of the 801 reference name as "implementation-dependent", in the past the various 802 server implementations handled it in vastly differing ways. 804 The following recommendations are the result of significant 805 operational experience, and are intended to maximize 806 interoperability. 808 Server implementations must implement the reference argument in a way 809 that matches the intended "change directory" operation as closely as 810 possible. As a minimum implementation, the reference argument may be 811 prepended to the mailbox name (while suppressing double delimiters; 812 see the next paragraph). 
Even servers that do not provide a way to 813 break out of the current hierarchy (see "breakout facility" below) 814 must provide a reasonable implementation of the reference argument, 815 as described here, so that they will interoperate with clients that 816 use it. 818 Server implementations that prepend the reference argument to the 819 mailbox name should insert a hierarchy delimiter between them, and 820 must not insert a second if one is already present: 822 C: A001 LIST ABC DEF 823 S: * LIST () "/" ABC/DEF <=== should do this 824 S: A001 OK done 826 C: A002 LIST ABC/ /DEF 827 S: * LIST () "/" ABC//DEF <=== must not do this 828 S: A002 OK done 830 On clients, the reference argument is chiefly used to implement a 831 "breakout facility", wherein the user may directly access a mailbox 832 outside the "current directory" hierarchy. Client implementations 833 should have an operational mode that does not use the reference 834 argument. This is to interoperate with older servers that did not 835 implement the reference argument properly. While it's a good idea to 836 give the user access to a breakout facility, clients that do not 837 intend to do so should not use the reference argument at all. 839 Client implementations should always place a trailing hierarchy 840 delimiter on the reference argument. This is because some servers 841 prepend the reference argument to the mailbox name without inserting 842 a hierarchy delimiter, while others do insert a hierarchy delimiter 843 if one is not already present. A client that puts the delimiter in 844 will work with both varieties of server. 846 Client implementations that implement a breakout facility should 847 allow the user to choose whether or not to use a leading hierarchy 848 delimiter on the mailbox argument. This is because the handling of a 849 leading mailbox hierarchy delimiter also varies from server to 850 server, and even between different mailstores on the same server. 
In 851 some cases, a leading hierarchy delimiter means "discard the 852 reference argument" (implementing the intended breakout facility), 853 thus: 855 C: A001 LIST ABC/ /DEF 856 S: * LIST () "/" /DEF 857 S: A001 OK done 859 In other cases, however, the two are catenated and the extra 860 hierarchy delimiter is discarded, thus: 862 C: A001 LIST ABC/ /DEF 863 S: * LIST () "/" ABC/DEF 864 S: A001 OK done 866 Client implementations must not assume that the server supports a 867 breakout facility, but may provide a way for the user to use one if 868 it is available. Any breakout facility should be exported to the 869 user interface. Note that there may be other "breakout" characters 870 besides the hierarchy delimiter (for instance, UNIX filesystem 871 servers are likely to use a leading "~" as well), and that their 872 interpretation is server-dependent. 874 3.4.10. Mailbox Hierarchy Delimiters 876 The server's selection of what to use as a mailbox hierarchy 877 delimiter is a difficult one, involving several issues: What 878 characters do users expect to see? What characters can they enter 879 for a hierarchy delimiter if it is desired (or required) that the 880 user enter it? What character can be used for the hierarchy 881 delimiter, noting that the chosen character can not otherwise be used 882 in the mailbox name? 884 Because some interfaces show users the hierarchy delimiters or allow 885 users to enter qualified mailbox names containing them, server 886 implementations should use delimiter characters that users generally 887 expect to see as name separators. The most common characters used 888 for this are "/" (as in Unix file names), "\" (as in OS/2 and Windows 889 file names), and "." (as in news groups). There is little to choose 890 among these apart from what users may expect or what is dictated by 891 the underlying file system, if any. One consideration about using 892 "\" is that it's also a special character in the IMAP protocol.
893 While the use of other hierarchy delimiter characters is permissible, 894 A DESIGNER IS WELL ADVISED TO STAY WITH ONE FROM THIS SET unless the 895 server is intended for special purposes only. Implementers might be 896 thinking about using characters such as "-", "_", ";", "&", "#", "@", 897 and "!", but they should be aware of the surprise to the user as well 898 as of the effect on URLs and other external specifications (since 899 some of these characters have special meanings there). Also, a 900 server that uses "\" (and clients of such a server) must remember to 901 escape that character in quoted strings or to send literals instead. 902 Literals are recommended over escaped characters in quoted strings in 903 order to maintain compatibility with older IMAP versions that did not 904 allow escaped characters in quoted strings (but check the grammar to 905 see where literals are allowed): 906 C: 001 LIST "" {13} 907 S: + send literal 908 C: this\%\%\%\h* 909 S: * LIST () "\\" {27} 910 S: this\is\a\mailbox\hierarchy 911 S: 001 OK LIST complete 913 In any case, a server should not use normal alpha-numeric characters 914 (such as "X" or "0") as delimiters; a user would be very surprised to 915 find that "EXPENDITURES" actually represented a two-level hierarchy. 916 And a server should not use characters that are non-printable or 917 difficult or impossible to enter on a standard US keyboard. Control 918 characters, box-drawing characters, and characters from non-US 919 alphabets fit into this category. Their use presents 920 interoperability problems that are best avoided. 922 The UTF-7 encoding of mailbox names also raises questions about what 923 to do with the hierarchy delimiters in encoded names: do we encode 924 each hierarchy level and separate them with delimiters, or do we 925 encode the fully qualified name, delimiters and all? The answer for 926 IMAP is the former: encode each hierarchy level separately, and 927 insert delimiters between. 
This makes it particularly important not 928 to use as a hierarchy delimiter a character that might cause 929 confusion with IMAP's modified UTF-7 [UTF-7 and RFC-2060] encoding. 931 To repeat: a server should use "/", "\", or "." as its hierarchy 932 delimiter. The use of any other character is likely to cause 933 problems and is STRONGLY DISCOURAGED. 935 3.4.11. ALERT Response Codes 937 The protocol spec is very clear on the matter of what to do with 938 ALERT response codes, and yet there are many clients that violate it 939 so it needs to be said anyway: "The human-readable text contains a 940 special alert that must be presented to the user in a fashion that 941 calls the user's attention to the message." That should be clear 942 enough, but I'll repeat it here: Clients must present ALERT text 943 clearly to the user. 945 3.4.12. Deleting Mailboxes 947 The protocol does not guarantee that a client may delete a mailbox 948 that is not empty, though on some servers it is permissible and is, 949 in fact, much faster than the alternative of deleting all the 950 messages from the client. If the client chooses to try to take 951 advantage of this possibility it must be prepared to use the other 952 method in the event that the more convenient one fails. Further, a 953 client should not try to delete the mailbox that it has selected, but 954 should first close that mailbox; some servers do not permit the 955 deletion of the selected mailbox. 957 That said, a server should permit the deletion of a non-empty 958 mailbox; there's little reason to pass this work on to the client. 959 Moreover, forbidding this prevents the deletion of a mailbox that for 960 some reason can not be opened or expunged, leading to possible 961 denial-of-service problems. 963 Example: 964 [User tells the client to delete mailbox BANANA, which is 965 currently selected...] 966 C: 008 CLOSE 967 S: 008 OK done 968 C: 009 DELETE BANANA 969 S: 009 NO Delete failed; mailbox is not empty.
970 C: 010 SELECT BANANA 971 S: * ... untagged SELECT responses 972 S: 010 OK done 973 C: 011 STORE 1:* +FLAGS.SILENT \DELETED 974 S: 011 OK done 975 C: 012 CLOSE 976 S: 012 OK done 977 C: 013 DELETE BANANA 978 S: 013 OK done 980 3.5. A Word About Testing 982 Since the whole point of IMAP is interoperability, and since 983 interoperability can not be tested in a vacuum, the final 984 recommendation of this treatise is, "Test against EVERYTHING." Test 985 your client against every server you can get an account on. Test 986 your server with every client you can get your hands on. Many 987 clients make limited test versions available on the Web for the 988 downloading. Many server owners will give serious client developers 989 guest accounts for testing. Contact them and ask. NEVER assume that 990 because your client works with one or two servers, or because your 991 server does fine with one or two clients, you will interoperate well 992 in general. 994 In particular, in addition to everything else, be sure to test 995 against the reference implementations: the PINE client, the 996 University of Washington server, and the Cyrus server. 998 See the following URLs on the web for more information here: 999 IMAP Products and Sources: http://www.imap.org/products.html 1000 IMC MailConnect: http://www.imc.org/imc-mailconnect 1002 4. Security Considerations 1004 This document describes behaviour of clients and servers that use the 1005 IMAP4 protocol, and as such, has the same security considerations as 1006 described in [RFC-2060]. 1008 5. References 1010 [RFC-2060]; Crispin, M.; "Internet Message Access Protocol - Version 1011 4rev1"; RFC 2060; University of Washington; December 1996. 1013 [RFC-2119]; Bradner, S.; "Key words for use in RFCs to Indicate 1014 Requirement Levels"; RFC 2119; Harvard University; March 1997. 1016 [RFC-2180]; Gahrns, M.; "IMAP4 Multi-Accessed Mailbox Practice"; RFC 1017 2180; Microsoft; July 1997. 
1019 [UTF-8]; Yergeau, F.; "UTF-8, a transformation format of Unicode and 1020 ISO 10646"; RFC 2044; Alis Technologies; October 1996. 1022 [UTF-7]; Goldsmith, D. & Davis, M.; "UTF-7, a Mail-Safe 1023 Transformation Format of Unicode"; RFC 2152; Apple Computer, Inc. & 1024 Taligent, Inc.; May 1997. 1026 [NAMESPACE]; Gahrns, M. & Newman, C.; "IMAP4 Namespace"; draft 1027 document ; Microsoft & Innosoft; 1028 June 1997. 1030 6. Author's Address 1032 Barry Leiba 1033 IBM T.J. Watson Research Center 1034 30 Saw Mill River Road 1035 Hawthorne, NY 10532 1037 Phone: 1-914-784-7941 1038 Email: leiba@watson.ibm.com 1040 This document will expire at the end of January 2000.