Network Working Group                                       Chris Newman
Internet-Draft                                          Sun Microsystems
Intended Status: Proposed Standard                      Arnt Gulbrandsen
                                                  Oryx Mail Systems GmbH
                                                         Alexey Melnikov
                                                           Isode Limited
                                                        February 1, 2008

         Internet Message Access Protocol Internationalization
                     draft-ietf-imapext-i18n-15.txt

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.
   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress".

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt. The list of
   Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft expires in August 2008.

Copyright Notice

   Copyright (C) The IETF Trust (2008).

Abstract

   Internet Message Access Protocol (IMAP) version 4rev1 has basic
   support for non-ASCII characters in mailbox names and search
   substrings. It also supports non-ASCII message headers and content
   encoded as specified by Multipurpose Internet Mail Extensions
   (MIME). This specification defines a collection of IMAP extensions
   which improve international support including comparator negotiation
   for search, sort and thread, language negotiation for international
   error text, and translations for namespace prefixes.

Table of Contents

   1.   Conventions Used in This Document
   2.   Introduction
   3.   LANGUAGE Extension
   3.1  LANGUAGE Extension Requirements
   3.2  LANGUAGE Command
   3.3  LANGUAGE Response
   3.4  TRANSLATION Extension to the NAMESPACE Response
   3.5  Formal Syntax
   4.   I18NLEVEL=1 and I18NLEVEL=2 Extensions
   4.1  Introduction and Overview
   4.2  Requirements common to both I18NLEVEL=1 and I18NLEVEL=2
   4.3  I18NLEVEL=1 Extension Requirements
   4.4  I18NLEVEL=2 Extension Requirements
   4.5  Compatibility Notes
   4.6  Comparators and Character Encodings
   4.7  COMPARATOR Command
   4.8  COMPARATOR Response
   4.9  BADCOMPARATOR Response Code
   4.10 Formal Syntax
   5.   Other IMAP Internationalization Issues
   5.1  Unicode Userids and Passwords
   5.2  UTF-8 Mailbox Names
   5.3  UTF-8 Domains, Addresses and Mail Headers
   6.   IANA Considerations
   7.   Security Considerations
   8.   Acknowledgements
   9.   Relevant Standards for i18n IMAP Implementations
        Normative References
        Informative References
        Authors' Addresses
        Intellectual Property and Copyright Statements

1. Conventions Used in This Document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

   The formal syntax uses the Augmented Backus-Naur Form (ABNF)
   [RFC4234] notation including the core rules defined in Appendix A.
   The UTF8-related productions are defined in [RFC3629].

   In examples, "C:" and "S:" indicate lines sent by the client and
   server respectively.
   If a single "C:" or "S:" label applies to
   multiple lines, then the line breaks between those lines are for
   editorial clarity only and are not part of the actual protocol
   exchange.

2. Introduction

   This specification defines two IMAP4rev1 [RFC3501] extensions to
   enhance international support. These extensions can be advertised
   and implemented separately.

   The LANGUAGE extension allows the client to request a suitable
   language for protocol error messages and in combination with the
   NAMESPACE extension [RFC2342] enables namespace translations.

   The I18NLEVEL=2 extension allows the client to request a suitable
   collation which will modify the behavior of the base specification's
   SEARCH command as well as the SORT and THREAD extensions [SORT].
   This leverages the collation registry [RFC4790].

3. LANGUAGE Extension

   IMAP allows server responses to include human-readable text that in
   many cases needs to be presented to the user. But that text is
   limited to US-ASCII by the IMAP specification [RFC3501] in order to
   preserve backwards compatibility with deployed IMAP implementations.
   This section specifies a way for an IMAP client to negotiate which
   language the server should use when sending human-readable text.

   The LANGUAGE extension only provides a mechanism for altering fixed
   server strings such as response text and NAMESPACE folder names.
   Assigning localized language aliases to shared mailboxes would be
   done with a separate mechanism such as the proposed METADATA
   extension (see [METADATA]).

3.1 LANGUAGE Extension Requirements

   IMAP servers that support this extension MUST list the keyword
   LANGUAGE in their CAPABILITY response as well as in the greeting
   CAPABILITY data.
   A server that advertises this extension MUST use the language
   "i-default" as described in [RFC2277] as its default language until
   another supported language is negotiated by the client. A server
   MUST include "i-default" as one of its supported languages.

   Clients and servers that support this extension MUST also support
   the NAMESPACE extension [RFC2342].

   The LANGUAGE command is valid in all states. Clients are urged to
   issue LANGUAGE before authentication, since some servers send
   valuable user information as part of authentication (e.g. "password
   is correct, but expired"). If a security layer (such as SASL or
   TLS) is subsequently negotiated by the client, it MUST re-issue the
   LANGUAGE command in order to make sure that no previous active
   attack (if any) on LANGUAGE negotiation has any effect on subsequent
   error messages. (See Section 7 for a more detailed explanation of
   the attack.)

3.2 LANGUAGE Command

   Arguments: Optional language range arguments.

   Response:  A possible LANGUAGE response (see section 3.3).
              A possible NAMESPACE response (see section 3.4).

   Result:    OK - Command completed
              NO - Could not complete command
              BAD - arguments invalid

   The LANGUAGE command requests that human-readable text emitted by
   the server be localized to a language matching one of the language
   range arguments as described by section 2 of [RFC4647].

   If the command succeeds, the server will return human-readable
   responses in the first supported language specified. These
   responses will be in UTF-8 [RFC3629]. The server MUST send a
   LANGUAGE response specifying the language used, and the change takes
   effect immediately after the LANGUAGE response.

   If the command fails, the server continues to return human-readable
   responses in the language it was previously using.
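   The selection of "the first supported language specified" can be
   sketched as follows. This is a minimal illustration, not part of the
   protocol: it assumes the server applies the Lookup scheme of
   [RFC4647] (this document does not mandate a particular matching
   scheme), and the names select_language, supported and server_default
   are hypothetical.

```python
def select_language(ranges, supported, server_default="i-default"):
    """Return the first supported tag matched by the ranges, or None.

    ranges         -- language range arguments, in client preference order
    supported      -- language tags the server can produce (assumed list)
    server_default -- language used for the special "default" argument
    """
    by_lower = {tag.lower(): tag for tag in supported}
    for language_range in ranges:
        candidate = language_range.lower()
        if candidate == "default":
            # Special argument: the administrator's preferred language.
            return server_default
        while candidate:
            if candidate in by_lower:
                return by_lower[candidate]
            # Lookup fallback: drop the last subtag, and also a
            # single-letter subtag (such as "x") left at the end.
            candidate = candidate.rpartition("-")[0]
            if len(candidate) >= 2 and candidate[-2] == "-":
                candidate = candidate[:-2]
    return None  # the server would answer with a tagged NO


SUPPORTED = ["EN", "DE", "IT", "i-default"]
print(select_language(["FR-CA", "EN-CA"], SUPPORTED))  # EN
print(select_language(["MUL"], SUPPORTED))             # None
```

   With the supported list shown, a client asking for FR-CA and then
   EN-CA ends up with EN, and a request for MUL matches nothing.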
   The special "default" language range argument indicates a request to
   use a language designated as preferred by the server administrator.
   The preferred language MAY vary based on the currently active user.

   If a language range does not match a known language tag exactly but
   does match a language by the rules of [RFC4647], the server MUST
   send an untagged LANGUAGE response indicating the language selected.

   If there aren't any arguments, the server SHOULD send an untagged
   LANGUAGE response listing the languages it supports. If the server
   is unable to enumerate the list of languages it supports, it MAY
   return a tagged NO response to the enumeration request.

      < The server defaults to using English i-default responses until
        the user explicitly changes the language. >

      C: A001 LOGIN KAREN PASSWORD
      S: A001 OK LOGIN completed

      < Client requested MUL language, which no server supports. >

      C: A002 LANGUAGE MUL
      S: A002 NO Unsupported language MUL

      < A LANGUAGE command with no arguments is a request to enumerate
        the list of languages the server supports. >

      C: A003 LANGUAGE
      S: * LANGUAGE (EN DE IT i-default)
      S: A003 OK Supported languages have been enumerated

      C: B001 LANGUAGE
      S: B001 NO Server is unable to enumerate supported languages

      < Once the client changes the language, all responses will be in
        that language starting after the LANGUAGE response. Note that
        this includes the NAMESPACE response. Because RFCs are in
        US-ASCII, this document uses an ASCII transcription rather than
        UTF-8 text, e.g.
        ue in the word "ausgefuehrt" >

      C: C001 LANGUAGE DE
      S: * LANGUAGE (DE)
      S: * NAMESPACE (("" "/")) (("Other Users/" "/" "TRANSLATION"
            ("Andere Ben&APw-tzer/"))) (("Public Folders/" "/"
            "TRANSLATION" ("Gemeinsame Postf&AOQ-cher/")))
      S: C001 OK Sprachwechsel durch LANGUAGE-Befehl ausgefuehrt

      < If a server does not support the requested primary language,
        responses will continue to be returned in the current language
        the server is using. >

      C: D001 LANGUAGE FR
      S: D001 NO Diese Sprache ist nicht unterstuetzt
      C: D002 LANGUAGE DE-IT
      S: * LANGUAGE (DE-IT)
      S: * NAMESPACE (("" "/")) (("Other Users/" "/" "TRANSLATION"
            ("Andere Ben&APw-tzer/"))) (("Public Folders/" "/"
            "TRANSLATION" ("Gemeinsame Postf&AOQ-cher/")))
      S: D002 OK Sprachwechsel durch LANGUAGE-Befehl ausgefuehrt
      C: D003 LANGUAGE "default"
      S: * LANGUAGE (DE)
      S: D003 OK Sprachwechsel durch LANGUAGE-Befehl ausgefuehrt

      < Server does not speak French, but does speak English. User
        speaks Canadian French and Canadian English. >

      C: E001 LANGUAGE FR-CA EN-CA
      S: * LANGUAGE (EN)
      S: E001 OK Now speaking English

3.3 LANGUAGE Response

   Contents: A list of one or more language tags.

   The LANGUAGE response occurs as a result of a LANGUAGE command. A
   LANGUAGE response with a list containing a single language tag
   indicates that the server is now using that language. A LANGUAGE
   response with a list containing multiple language tags indicates the
   server is communicating a list of available languages to the client,
   and no change in the active language has been made.

3.4 TRANSLATION Extension to the NAMESPACE Response

   If localized representations of the namespace prefixes are available
   in the selected language, the server SHOULD include these in the
   TRANSLATION extension to the NAMESPACE response.
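   A client has to decode the TRANSLATION string before showing it to
   the user. The following is a minimal sketch of decoding the modified
   UTF-7 mailbox name encoding of [RFC3501] section 5.1.3, applied to
   one of the prefixes from the examples above. The function name
   decode_modified_utf7 is hypothetical, and the sketch assumes
   well-formed input (every "&" shift sequence is closed by "-").

```python
import base64


def decode_modified_utf7(name: str) -> str:
    """Decode an IMAP modified UTF-7 string into a Unicode string."""
    out = []
    i = 0
    while i < len(name):
        ch = name[i]
        if ch != "&":
            out.append(ch)            # printable US-ASCII represents itself
            i += 1
            continue
        end = name.index("-", i)      # raises ValueError if unterminated
        chunk = name[i + 1:end]
        if chunk == "":
            out.append("&")           # "&-" stands for a literal "&"
        else:
            # Modified base64: "," replaces "/", and padding is omitted.
            b64 = chunk.replace(",", "/")
            b64 += "=" * (-len(b64) % 4)
            out.append(base64.b64decode(b64).decode("utf-16-be"))
        i = end + 1
    return "".join(out)


print(decode_modified_utf7("Andere Ben&APw-tzer/"))  # Andere Benützer/
```

   The decoded prefix is what the client would display in place of
   "Other Users/" while still using the untranslated prefix on the wire.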
   The TRANSLATION extension to the NAMESPACE response returns a single
   string, containing the modified UTF-7 [RFC3501] encoded translation
   of the namespace prefix. It is the responsibility of the client to
   convert between the namespace prefix and the translation of the
   namespace prefix when presenting mailbox names to the user.

   In this example a server supports the IMAP4 NAMESPACE command. It
   uses no prefix to the user's Personal Namespace, a prefix of "Other
   Users" to its Other Users' Namespace and a prefix of "Public
   Folders" to its only Shared Namespace. Since a client will often
   display these prefixes to the user, the server includes a
   translation of them that can be presented to the user.

      C: A001 LANGUAGE DE-IT
      S: * NAMESPACE (("" "/")) (("Other Users/" "/" "TRANSLATION"
            ("Andere Ben&APw-tzer/"))) (("Public Folders/" "/"
            "TRANSLATION" ("Gemeinsame Postf&AOQ-cher/")))
      S: A001 OK LANGUAGE-Befehl ausgefuehrt

3.5 Formal Syntax

   The following syntax specification inherits ABNF [RFC4234] rules
   from IMAP4rev1 [RFC3501], IMAP4 Namespace [RFC2342], Tags for
   Identifying Languages [RFC4646], UTF-8 [RFC3629] and Collected
   Extensions to IMAP4 ABNF [RFC4466].

      command-any       =/ language-cmd
          ; LANGUAGE command is valid in all states

      language-cmd      = "LANGUAGE" *(SP lang-range-quoted)

      response-payload  =/ language-data

      language-data     = "LANGUAGE" SP "(" lang-tag-quoted *(SP
                          lang-tag-quoted) ")"

      namespace-trans   = SP DQUOTE "TRANSLATION" DQUOTE SP "(" string ")"
          ; the string is encoded in Modified UTF-7.
          ; this is a subset of the syntax permitted by
          ; the Namespace-Response-Extension rule in [RFC4466]

      lang-range-quoted = astring
          ; Once any literal wrapper or quoting is removed, this
          ; follows the language-range rule in [RFC4647]

      lang-tag-quoted   = astring
          ; Once any literal wrapper or quoting is removed, this follows
          ; the Language-Tag rule in [RFC4646]

      resp-text         = ["[" resp-text-code "]" SP ] UTF8-TEXT-CHAR
                          *(UTF8-TEXT-CHAR / "[")
          ; After the server is changed to a language other than
          ; i-default, this resp-text rule replaces the resp-text
          ; rule from [RFC3501].

      UTF8-TEXT-CHAR    = %x20-5A / %x5C-7E / UTF8-2 / UTF8-3 / UTF8-4
          ; UTF-8 excluding 7-bit control characters and "["

4. I18NLEVEL=1 and I18NLEVEL=2 Extensions

4.1 Introduction and Overview

   IMAP4rev1 [RFC3501] includes the SEARCH command, which can be used
   to locate messages matching criteria including human-readable text.
   The SORT extension [SORT] to IMAP allows the client to ask the
   server to determine the order of messages based on criteria
   including human-readable text. These mechanisms require the ability
   to support non-English search and sort functions.

   Section 4 defines two IMAP extensions for internationalizing IMAP
   SEARCH, SORT and THREAD [SORT] using the comparator framework
   [RFC4790].

   The I18NLEVEL=1 extension updates SEARCH/SORT/THREAD to use the
   i;unicode-casemap comparator, as defined in [UCM]. See Sections 4.2
   and 4.3 for more details.

   The I18NLEVEL=2 extension is a superset of the I18NLEVEL=1
   extension. It adds to the I18NLEVEL=1 extension the ability to
   determine the active comparator (see definition below) and negotiate
   use of comparators using the COMPARATOR command. It also adds the
   COMPARATOR response that indicates the active comparator and
   possibly other available comparators. See Sections 4.2 and 4.4 for
   more details.
4.2 Requirements common to both I18NLEVEL=1 and I18NLEVEL=2

   The term "default comparator" refers to the comparator which is used
   by SEARCH and SORT absent any negotiation using the COMPARATOR
   command (see Section 4.7). The term "active comparator" refers to
   the comparator which will be used within a session, e.g. by SEARCH
   and SORT. The COMPARATOR command is used to change the active
   comparator.

   The active comparator applies to the following SEARCH keys: "BCC",
   "BODY", "CC", "FROM", "SUBJECT", "TEXT", "TO" and "HEADER". If the
   server also advertises the "SORT" extension, then the active
   comparator applies to the following SORT keys: "CC", "FROM",
   "SUBJECT" and "TO". If the server advertises THREAD=ORDEREDSUBJECT,
   then the active comparator applies to the ORDEREDSUBJECT threading
   algorithm. If the server advertises THREAD=REFERENCES, then the
   active comparator applies to the subject field comparisons done by
   the REFERENCES threading algorithm. Future extensions may choose to
   apply the active comparator to their SEARCH keys.

   For SORT and THREAD, the pre-processing necessary to extract the
   base subject text from a Subject header occurs prior to the
   application of a comparator.

   A server that advertises the I18NLEVEL=1 or I18NLEVEL=2 extension
   MUST implement the i;unicode-casemap comparator, as defined in
   [UCM].

   A server that advertises the I18NLEVEL=1 or I18NLEVEL=2 extension
   MUST support UTF-8 as a SEARCH charset.

4.3 I18NLEVEL=1 Extension Requirements

   An IMAP server that satisfies all requirements specified in sections
   4.2 and 4.6 (and doesn't support/advertise any other I18NLEVEL=<n>
   extension, where n > 1) MUST list the keyword I18NLEVEL=1 in its
   CAPABILITY data once IMAP enters the authenticated state, and MAY
   list that keyword in other states.
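   A client can discover which level a server offers by scanning the
   CAPABILITY data for an I18NLEVEL keyword. The sketch below is
   illustrative only; the function name i18n_level is hypothetical and
   the input is assumed to be the capability atoms already split into a
   list.

```python
def i18n_level(capabilities):
    """Return the highest advertised I18NLEVEL, or 0 if none is listed."""
    level = 0
    for capability in capabilities:
        name, sep, value = capability.upper().partition("=")
        if name == "I18NLEVEL" and sep and value.isdigit():
            level = max(level, int(value))
    return level


caps = ["IMAP4rev1", "SORT", "THREAD=REFERENCES", "I18NLEVEL=1"]
print(i18n_level(caps))  # 1
```

   A result of 0 corresponds to a legacy server, for which the client
   cannot tell which comparator is in use.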
4.4 I18NLEVEL=2 Extension Requirements

   An IMAP server that satisfies all requirements specified in sections
   4.2, 4.4, 4.6-4.10 (and doesn't support/advertise any other
   I18NLEVEL=<n> extension, where n > 2) MUST list the keyword
   I18NLEVEL=2 in its CAPABILITY data once IMAP enters the
   authenticated state, and MAY list that keyword in other states.

   A server that advertises this extension MUST implement the
   i;unicode-casemap comparator, as defined in [UCM]. It MAY implement
   other comparators from the IANA registry established by [RFC4790].
   See also section 4.5 of this document.

   A server that advertises this extension SHOULD use i;unicode-casemap
   as the default comparator. (Note that i;unicode-casemap is the
   default comparator for I18NLEVEL=1, but not necessarily the default
   for I18NLEVEL=2.) The selection of the default comparator MAY be
   adjustable by the server administrator, and MAY be sensitive to the
   current user. Once the IMAP connection enters authenticated state,
   the default comparator MUST remain static for the remainder of that
   connection.

   Note that since SEARCH uses the substring operation, IMAP servers
   can only implement collations that offer the substring operation
   (see [RFC4790], section 4.2.2). Since SORT uses the ordering
   operation (and by implication equality), IMAP servers which
   advertise the SORT extension can only implement collations that
   offer all three operations (see [RFC4790], sections 4.2.2-4).

   If the active collation does not provide the operations needed by an
   IMAP command, the server MUST respond with a tagged BAD.

4.5 Compatibility Notes

   Several server implementations deployed prior to the publication of
   this specification comply with I18NLEVEL=1 (see section 4.3), but do
   not advertise that. Other legacy servers use the i;ascii-casemap
   comparator (see [RFC4790]).
   There is no good way for a client to know which comparator a
   legacy server uses. If the client has to assume the worst, it may
   end up doing expensive local operations to obtain i;unicode-casemap
   comparisons even though the server implements it.

   Legacy server implementations which comply with I18NLEVEL=1 should
   be updated to advertise I18NLEVEL=1. All server implementations
   should eventually be updated to comply with the I18NLEVEL=2
   extension.

4.6 Comparators and Character Encodings

   RFC 3501, section 6.4.4 says:

      In all search keys that use strings, a message matches
      the key if the string is a substring of the field. The
      matching is case-insensitive.

   When performing the SEARCH operation, the active comparator is
   applied instead of the case-insensitive matching specified above.

   An IMAP server which performs collation operations (e.g., as part of
   commands such as SEARCH, SORT, THREAD) does so according to the
   following procedure:

   (a) MIME encoding (for example see [RFC2047] for headers and
       [RFC2045] for body parts) MUST be removed from the texts being
       collated.

       If MIME encoding removal fails for a message (e.g., a body part
       of the message has an unsupported Content-Transfer-Encoding,
       uses characters not allowed by the Content-Transfer-Encoding,
       etc.), the collation of this message is undefined by this
       specification, and is handled in an implementation-dependent
       manner.

   (b) The decoded text from (a) MUST be converted to the charset
       expected by the active comparator.

   (c) For the substring operation:

       If step (b) failed (e.g., the text is in an unknown charset,
       contains a sequence which is not valid in that charset, etc.),
       the original decoded text from (a) (i.e., before the charset
       conversion attempt) is collated using the i;octet comparator
       (see [RFC4790]).
       If step (b) was successful, the converted text from (b) is
       collated according to the active comparator.

       For the ordering operation:

       All strings that were successfully converted by step (b) are
       separated from all strings that failed step (b). Strings in
       each group are collated independently. All strings successfully
       converted by step (b) are then validated by the active
       comparator. Strings that pass validation are collated using the
       active comparator. All strings that either fail step (b) or fail
       the active collation's validity operation are collated (after
       applying step (a)) using the i;octet comparator (see [RFC4790]).
       The resulting sorted list is produced by appending all collated
       "failed" strings after all strings collated using the active
       comparator.

       Example: The following example demonstrates ordering of 4
       different strings using the i;unicode-casemap [UCM] comparator.
       Strings are represented using hexadecimal notation used by
       ABNF [RFC4234].

       (1) %xD0 %xC0 %xD0 %xBD %xD0 %xB4 %xD1 %x80 %xD0 %xB5
           %xD0 %xB9 (labeled with charset=UTF-8)
       (2) %xD1 %x81 %xD0 %x95 %xD0 %xA0 %xD0 %x93 %xD0 %x95
           %xD0 %x99 (labeled with charset=UTF-8)
       (3) %xD0 %x92 %xD0 %xB0 %xD1 %x81 %xD0 %xB8 %xD0 %xBB
           %xD0 %xB8 %xFF %xB9 (labeled with charset=UTF-8)
       (4) %xE1 %xCC %xC5 %xCB %xD3 %xC5 %xCA (labeled with
           charset=KOI8-R)

       Step (b) will convert string (4) to the following
       sequence of octets (in UTF-8):

           %xD0 %x90 %xD0 %xBB %xD0 %xB5 %xD0 %xBA %xD1 %x81 %xD0
           %xB5 %xD0 %xB9

       and will reject strings (1) and (3), as they contain
       octets not allowed in charset=UTF-8.
       After that, using the i;unicode-casemap collation,
       string (4) will collate before string (2). Using the
       i;octet collation on the original strings, string (3)
       will collate before string (1). So the final ordering
       is as follows: (4) (2) (3) (1).
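   The ordering procedure and the example above can be reproduced with
   the following minimal sketch. It rests on stated assumptions: step
   (a) has already been applied, Python's str.upper() stands in for the
   i;unicode-casemap comparator (the real comparator is defined in
   [UCM]), and the collation's validity operation is not modelled. The
   name sort_order is hypothetical.

```python
def sort_order(strings):
    """Order (label, octets, charset) triples as described in step (c).

    Strings that convert cleanly are collated with the stand-in for the
    active comparator; the rest are collated with i;octet (a plain byte
    comparison) and appended afterwards.
    """
    converted, failed = [], []
    for label, octets, charset in strings:
        try:
            text = octets.decode(charset)      # step (b), strict decoding
        except (UnicodeDecodeError, LookupError):
            failed.append((octets, label))     # invalid octets or charset
        else:
            converted.append((text.upper(), label))
    return ([label for _, label in sorted(converted)]
            + [label for _, label in sorted(failed)])


EXAMPLE = [
    (1, bytes.fromhex("D0C0D0BDD0B4D180D0B5D0B9"), "utf-8"),
    (2, bytes.fromhex("D181D095D0A0D093D095D099"), "utf-8"),
    (3, bytes.fromhex("D092D0B0D181D0B8D0BBD0B8FFB9"), "utf-8"),
    (4, bytes.fromhex("E1CCC5CBD3C5CA"), "koi8-r"),
]
print(sort_order(EXAMPLE))  # [4, 2, 3, 1]
```

   Strings (1) and (3) fail strict UTF-8 decoding and therefore land in
   the i;octet group, which is why they follow (4) and (2).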
   If the substring operation (e.g., IMAP SEARCH) of the active
   comparator returns the "undefined" result (see section 4.2.3 of
   [RFC4790]) for either the text specified in the SEARCH command or
   the message text, then the operation is repeated on the result of
   step (a) using the i;octet comparator.

   The ordering operation (e.g., IMAP SORT and THREAD) SHOULD collate
   the following together: strings encoded using unknown or invalid
   character encodings, strings in unrecognized charsets, and invalid
   input (as defined by the active collation).

4.7 COMPARATOR Command

   Arguments: Optional comparator order arguments.

   Response:  A possible COMPARATOR response (see Section 4.8).

   Result:    OK - Command completed
              NO - No matching comparator found
              BAD - arguments invalid

   The COMPARATOR command is valid in authenticated and selected
   states.

   The COMPARATOR command is used to determine or change the active
   comparator. When issued with no arguments, it results in a
   COMPARATOR response indicating the currently active comparator.

   When issued with one or more comparator arguments, it changes the
   active comparator as directed. (If more than one installed
   comparator is matched by an argument, the first argument wins.) The
   COMPARATOR response lists all matching comparators if more than one
   matches the specified patterns.

   The argument "default" refers to the server's default comparator.
   Otherwise each argument is a collation specification as defined in
   the Internet Application Protocol Comparator Registry [RFC4790].

      < The client requests activating a Czech comparator if possible,
        or else a generic international comparator which it considers
        suitable for Czech. The server picks the first supported
        comparator. >

      C: A001 COMPARATOR "cz;*" i;basic
      S: * COMPARATOR i;basic
      S: A001 OK Will use i;basic for collation

4.8 COMPARATOR Response

   Contents: The active comparator.
             An optional list of available matching comparators

   The COMPARATOR response occurs as a result of a COMPARATOR command.
   The first argument in the comparator response is the name of the
   active comparator. The second argument is a list of comparators
   which matched any of the arguments to the COMPARATOR command and is
   present only if more than one match is found.

4.9 BADCOMPARATOR Response Code

   This response code SHOULD be returned as a result of the server
   failing an IMAP command (returning NO), when the server knows that
   none of the specified comparators match the requested comparator(s).

4.10 Formal Syntax

   The following syntax specification inherits ABNF [RFC4234] rules
   from IMAP4rev1 [RFC3501], and Internet Application Protocol
   Comparator Registry [RFC4790].

      command-auth      =/ comparator-cmd

      resp-text-code    =/ "BADCOMPARATOR"

      comparator-cmd    = "COMPARATOR" *(SP comp-order-quoted)

      response-payload  =/ comparator-data

      comparator-data   = "COMPARATOR" SP comp-sel-quoted [SP "("
                          comp-id-quoted *(SP comp-id-quoted) ")"]

      comp-id-quoted    = astring
          ; Once any literal wrapper or quoting is removed, this
          ; follows the collation-id rule from [RFC4790]

      comp-order-quoted = astring
          ; Once any literal wrapper or quoting is removed, this
          ; follows the collation-order rule from [RFC4790]

      comp-sel-quoted   = astring
          ; Once any literal wrapper or quoting is removed, this
          ; follows the collation-selected rule from [RFC4790]

5. Other IMAP Internationalization Issues

   The following sections provide an overview of various other IMAP
   internationalization issues.
   These issues are not resolved by this
   specification, but could be resolved by other standards work, such
   as that being done by the EAI group (see [IMAP-EAI]).

5.1 Unicode Userids and Passwords

   IMAP4rev1 currently restricts the "userid" and "password" fields of
   the LOGIN command to US-ASCII, and that restriction stands until a
   future standards track RFC states otherwise. Servers are encouraged
   to validate both fields to make sure they conform to the formal
   syntax of UTF-8 and to reject the LOGIN command if that syntax is
   violated. Servers MAY reject the use of any 8-bit character in the
   "userid" or "password" field.

   When AUTHENTICATE is used, some servers may support userids and
   passwords in Unicode [RFC3490] since SASL (see [RFC4422]) allows
   that. However, such userids cannot be used as part of email
   addresses.

5.2 UTF-8 Mailbox Names

   The modified UTF-7 mailbox naming convention described in section
   5.1.3 of RFC 3501 is best viewed as a transition from the status
   quo in 1996 when modified UTF-7 was first specified. At that time,
   there was widespread unofficial use of local character sets such as
   ISO-8859-1 and Shift-JIS for non-ASCII mailbox names, with resultant
   non-interoperability.

   The requirements in section 5.1 of RFC 3501 are very important if
   we're ever going to be able to deploy UTF-8 mailbox names. Servers
   are encouraged to enforce them.

5.3 UTF-8 Domains, Addresses and Mail Headers

   There is now an IETF standard for Internationalizing Domain Names in
   Applications [RFC3490]. While IMAP clients are free to support this
   standard, an argument can be made that it would be helpful to simple
   clients if the IMAP server could perform this conversion (the same
   argument would apply to MIME header encoding [RFC2047]).
However,
693 it would be unwise to move forward with such work until the work in
694 progress to define the format of international email addresses is
695 complete.

697 6. IANA Considerations

699 The IANA is requested to add LANGUAGE, I18NLEVEL=1 and I18NLEVEL=2
700 to the IMAP4 Capabilities Registry. [Note to IANA:
701 http://www.iana.org/assignments/imap4-capabilities]

703 7. Security Considerations

705 The LANGUAGE extension makes a new command available in "Not
706 Authenticated" state in IMAP. Some IMAP implementations run with
707 root privilege when the server is in "Not Authenticated" state and
708 do not revoke that privilege until after authentication is complete.
709 Such implementations are particularly vulnerable to buffer overflow
710 security errors at this stage and need to implement parsing of this
711 command with extra care.

713 A LANGUAGE command issued prior to activation of a security layer is
714 subject to an active attack which suppresses or modifies the
715 negotiation and thus makes STARTTLS or authentication error messages
716 more difficult to interpret. This is not a new attack as the error
717 messages themselves are subject to active attack. Clients MUST
718 re-issue the LANGUAGE command once a security layer is active, so this
719 does not impact subsequent protocol operations.

721 The LANGUAGE, I18NLEVEL=1 and I18NLEVEL=2 extensions use the UTF-8
722 charset, thus the security considerations for UTF-8 [RFC3629] are
723 relevant. However, none of them uses UTF-8 for identifiers so the most
724 serious concerns do not apply.

726 8. Acknowledgements

730 The LANGUAGE extension is based on a previous Internet draft by Mike
731 Gahrns; a substantial portion of the text in that section was
732 written by him. Many people have participated in discussions about
733 an IMAP Language extension in the various fora of the IETF and
734 Internet working groups, so any list of contributors is bound to be
735 incomplete.
However, the authors would like to thank Andrew McCown
736 for early work on the original proposal, John Myers for suggestions
737 regarding the namespace issue, along with Jutta Degener, Mark
738 Crispin, Mark Pustilnik, Larry Osterman, Cyrus Daboo, Martin Duerst,
739 Timo Sirainen, Ben Campbell and Magnus Nystrom for their many
740 suggestions that have been incorporated into this document.

742 Initial discussion of the I18NLEVEL=2 extension involved input from
743 Mark Crispin and other participants of the IMAP Extensions WG.

745 9. Relevant Standards for i18n IMAP Implementations

747 This is a non-normative list of standards to consider when
748 implementing i18n-aware IMAP software.

750 o The LANGUAGE and I18NLEVEL=2 extensions to IMAP (this
751   specification).
752 o The 8-bit rules for mailbox naming in section 5.1 of RFC 3501.
753 o The Mailbox International Naming Convention in section 5.1.3 of
754   RFC 3501.
755 o MIME [RFC2045] for message bodies.
756 o MIME header encoding [RFC2047] for message headers.
757 o The IETF EAI working group.
758 o MIME Parameter Value and Encoded Word Extensions [RFC2231] for
759   filenames. Quality IMAP server implementations will
760   automatically combine multipart parameters when generating the
761   BODYSTRUCTURE. There is also some deployed non-standard use of
762   MIME header encoding inside double-quotes for filenames.
763 o IDNA [RFC3490] and punycode [RFC3492] for domain names
764   (currently only relevant to IMAP clients).
765 o The UTF-8 charset [RFC3629].
766 o The IETF policy on Character Sets and Languages [RFC2277].

768 Normative References

770 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
771           Requirement Levels", BCP 14, RFC 2119, March 1997.

773 [RFC2277] Alvestrand, "IETF Policy on Character Sets and
774           Languages", BCP 18, RFC 2277, January 1998.

778 [RFC2342] Gahrns, Newman, "IMAP4 Namespace", RFC 2342, May 1998.
780 [RFC3501] Crispin, "INTERNET MESSAGE ACCESS PROTOCOL - VERSION
781           4rev1", RFC 3501, March 2003.

783 [RFC3629] Yergeau, "UTF-8, a transformation format of ISO 10646",
784           STD 63, RFC 3629, November 2003.

786 [RFC4234] Crocker, Overell, "Augmented BNF for Syntax
787           Specifications: ABNF", RFC 4234, Brandenburg
788           Internetworking, Demon Internet Ltd, October 2005.

790 [RFC4422] Melnikov, Zeilenga, "Simple Authentication and Security
791           Layer (SASL)", RFC 4422, June 2006.

793 [RFC4466] Melnikov, Daboo, "Collected Extensions to IMAP4 ABNF",
794           RFC 4466, Isode Ltd., April 2006.

796 [RFC4646] Phillips, Davis, "Tags for Identifying Languages", BCP 47,
797           RFC 4646, September 2006.

799 [RFC4647] Phillips, Davis, "Matching of Language Tags", BCP 47, RFC
800           4647, September 2006.

802 [RFC4790] Newman, Duerst, Gulbrandsen, "Internet Application
803           Protocol Comparator Registry", RFC 4790, February 2007.

805 [SORT] Crispin, M. and K. Murchison, "INTERNET MESSAGE ACCESS
806        PROTOCOL - SORT AND THREAD EXTENSION",
807        draft-ietf-imapext-sort-19 (work in progress), November 2006.

809 [UCM] Crispin, "i;unicode-casemap - Simple Unicode Collation
810       Algorithm", RFC 5051, October 2007.

812 [RFC2045] Freed, Borenstein, "Multipurpose Internet Mail Extensions
813           (MIME) Part One: Format of Internet Message Bodies", RFC
814           2045, November 1996.

816 [RFC2047] Moore, "MIME (Multipurpose Internet Mail Extensions) Part
817           Three: Message Header Extensions for Non-ASCII Text", RFC
818           2047, November 1996.

820 Informative References

822 [RFC2231] Freed, Moore, "MIME Parameter Value and Encoded Word
823           Extensions: Character Sets, Languages, and
827           Continuations", RFC 2231, November 1997.

829 [RFC3490] Faltstrom, Hoffman, Costello, "Internationalizing Domain
830           Names in Applications (IDNA)", RFC 3490, March 2003.
832 [RFC3492] Costello, "Punycode: A Bootstring encoding of Unicode for
833           Internationalized Domain Names in Applications (IDNA)",
834           RFC 3492, March 2003.

836 [METADATA] Daboo, C., "IMAP METADATA Extension",
837            draft-daboo-imap-annotatemore-12 (work in progress), December 2007.

839 [IMAP-EAI] Resnick, Newman, "IMAP Support for UTF-8",
840            draft-ietf-eai-imap-utf8 (work in progress), May 2006.

842 Authors' Addresses

844 Chris Newman
845 Sun Microsystems
846 3401 Centrelake Dr., Suite 410
847 Ontario, CA 91761
848 US

850 Email: chris.newman@sun.com

852 Arnt Gulbrandsen
853 Oryx Mail Systems GmbH
854 Schweppermannstr. 8
855 D-81671 Muenchen
856 Germany

858 Email: arnt@oryx.com

860 Fax: +49 89 4502 9758

862 Alexey Melnikov
863 Isode Limited
864 5 Castle Business Village, 36 Station Road,
865 Hampton, Middlesex, TW12 2BX, UK

867 Email: Alexey.Melnikov@isode.com

871 Intellectual Property Statement

873 The IETF takes no position regarding the validity or scope of any
874 Intellectual Property Rights or other rights that might be claimed to
875 pertain to the implementation or use of the technology described in
876 this document or the extent to which any license under such rights
877 might or might not be available; nor does it represent that it has
878 made any independent effort to identify any such rights. Information
879 on the procedures with respect to rights in RFC documents can be found
880 in BCP 78 and BCP 79.

882 Copies of IPR disclosures made to the IETF Secretariat and any
883 assurances of licenses to be made available, or the result of an
884 attempt made to obtain a general license or permission for the use of
885 such proprietary rights by implementers or users of this specification
886 can be obtained from the IETF on-line IPR repository at
887 http://www.ietf.org/ipr.
889 The IETF invites any interested party to bring to its attention any
890 copyrights, patents or patent applications, or other proprietary
891 rights that may cover technology that may be required to implement
892 this standard. Please address the information to the IETF at
893 ietf-ipr@ietf.org.

895 Full Copyright Statement

897 Copyright (C) The IETF Trust (2008). This document is subject to
898 the rights, licenses and restrictions contained in BCP 78, and
899 except as set forth therein, the authors retain all their rights.

901 This document and the information contained herein are provided on
902 an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE
903 REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE
904 IETF TRUST AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL
905 WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY
906 WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE
907 ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS
908 FOR A PARTICULAR PURPOSE.

910 Acknowledgment

912 Funding for the RFC Editor function is currently provided by the
913 Internet Society.