idnits 2.17.1

draft-ietf-eai-5738bis-07.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  == The 'Obsoletes: ' line in the draft header should list only the
     _numbers_ of the RFCs which will be obsoleted by this document (if
     approved); it should not include the word 'RFC' in the list.

  -- The abstract seems to indicate that this document obsoletes RFC5738, but
     the header doesn't have an 'Obsoletes:' line to match this.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == The document seems to lack the recommended RFC 2119 boilerplate, even if
     it appears to use RFC 2119 keywords -- however, there's a paragraph with
     a matching beginning. Boilerplate error?

     (The document does seem to have the reference to RFC 2119 which the
     ID-Checklist requires).

  -- The document date (August 1, 2012) is 4279 days in the past. Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  ** Obsolete normative reference: RFC 3501 (Obsoleted by RFC 9051)

  ** Obsolete normative reference: RFC 4013 (Obsoleted by RFC 7613)

  -- Obsolete informational reference (is this intentional?): RFC 2088
     (Obsoleted by RFC 7888)

  -- Obsolete informational reference (is this intentional?): RFC 5738
     (Obsoleted by RFC 6855)

  == Outdated reference: A later version (-07) exists of
     draft-ietf-eai-simpledowngrade-05

     Summary: 2 errors (**), 0 flaws (~~), 4 warnings (==), 4 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

Internet Engineering Task Force                          P. Resnick, Ed.
Internet-Draft                                     Qualcomm Incorporated
Obsoletes: RFC5738 (if approved)                          C. Newman, Ed.
Intended status: Standards Track                                  Oracle
Expires: February 2, 2013                                   S. Shen, Ed.
                                                                   CNNIC
                                                          August 1, 2012

                         IMAP Support for UTF-8
                       draft-ietf-eai-5738bis-07

Abstract

   This specification extends the Internet Message Access Protocol
   version 4rev1 (IMAP4rev1) to support UTF-8 encoded international
   characters in user names, mail addresses and message headers. This
   specification replaces RFC 5738.

Status of This Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on February 2, 2013.

Copyright Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Conventions Used in this Document
   3.  UTF8=ACCEPT IMAP Capability and UTF-8 in IMAP Quoted Strings
   4.  IMAP UTF8 Append Data Extension
   5.  LOGIN Command and UTF-8
   6.  UTF8=ONLY Capability
   7.  Dealing With Legacy Clients
   8.  Issues with UTF-8 Header Mailstore
   9.  IANA Considerations
   10. Security Considerations
   11. References
       11.1. Normative References
       11.2. Informative References
   Appendix A. Design Rationale
   Appendix B. Acknowledgments

1. Introduction

   This specification forms part of the Email Address
   Internationalization protocols described in the Email Address
   Internationalization Framework document [RFC6530]. It extends
   IMAP4rev1 [RFC3501] to permit UTF-8 [RFC3629] in headers as described
   in "Internationalized Email Headers" [RFC6532]. It also adds a
   mechanism to support mailbox names using the UTF-8 charset. This
   specification creates two new IMAP capabilities to allow servers to
   advertise these new extensions.

   Most of this specification assumes that the IMAP server will be
   operating in a fully internationalized environment, i.e., one in
   which all clients accessing the server will be able to accept non-
   ASCII message header fields and other information as specified in
   Section 3. At least during a transition period, that assumption will
   not be realistic for many environments; the issues involved are
   discussed in Section 7 below.

   This specification replaces an earlier, experimental, approach to the
   same problem [RFC5738].

2. Conventions Used in this Document

   The key words "MUST", "MUST NOT", "SHOULD", "SHOULD NOT", and "MAY"
   in this document are to be interpreted as defined in "Key words for
   use in RFCs to Indicate Requirement Levels" [RFC2119].

   The formal syntax uses the Augmented Backus-Naur Form (ABNF)
   [RFC5234] notation. In addition, rules from IMAP4rev1 [RFC3501],
   UTF-8 [RFC3629], "Collected Extensions to IMAP4 ABNF" [RFC4466], and
   IMAP4 LIST Command Extensions [RFC5258] are also referenced. This
   document assumes that the reader will have a reasonably good
   understanding of the RFCs above and their updates.

   In examples, "C:" and "S:" indicate lines sent by the client and
   server, respectively.
   If a single "C:" or "S:" label applies to
   multiple lines, then the line breaks between those lines are for
   editorial clarity only and are not part of the actual protocol
   exchange.

3. UTF8=ACCEPT IMAP Capability and UTF-8 in IMAP Quoted Strings

   The "UTF8=ACCEPT" capability indicates that the server supports the
   ability to open mailboxes containing internationalized messages with
   SELECT and EXAMINE, and UTF-8 responses from the LIST and LSUB
   commands.

   A client MUST use the "ENABLE" command (defined in [RFC5161]) with
   the "UTF8=ACCEPT" option (defined in Section 4 below) to indicate to
   the server that the client accepts UTF-8 in quoted-strings. The
   "ENABLE UTF8=ACCEPT" command MUST only be used in the authenticated
   state. (Note that the "UTF8=ONLY" capability described in Section 6
   implies the "UTF8=ACCEPT" capability. See additional information in
   that section.)

   The IMAP4rev1 [RFC3501] base specification forbids the use of 8-bit
   characters in atoms or quoted strings. Thus, a UTF-8 string can only
   be sent as a literal. This can be inconvenient from a coding
   standpoint, and unless the server offers IMAP4 non-synchronizing
   literals [RFC2088], this requires an extra round trip for each UTF-8
   string sent by the client. When the IMAP server advertises the
   "UTF8=ACCEPT" capability, it informs the client that it supports
   UTF-8 in quoted-strings with the following syntax:

   quoted        =/ DQUOTE *uQUOTED-CHAR DQUOTE
          ; QUOTED-CHAR is not modified, as it will affect
          ; other RFC 3501 ABNF non-terminals.

   uQUOTED-CHAR  = QUOTED-CHAR / UTF8-2 / UTF8-3 / UTF8-4

   UTF8-2        = <Defined in Section 4 of RFC 3629>

   UTF8-3        = <Defined in Section 4 of RFC 3629>

   UTF8-4        = <Defined in Section 4 of RFC 3629>

   When this extended quoting mechanism is used by the client, then the
   server MUST reject octet sequences with the high bit set that fail to
   comply with the formal syntax in [RFC3629] with a BAD response.
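
   The check described above can be sketched as follows. This is a
   minimal, non-normative illustration in Python, not part of the
   protocol definition. The function name parse_utf8_quoted is
   hypothetical, and the sketch assumes the caller has already isolated
   the octets of one quoted string, including both DQUOTE delimiters.
   It relies on Python's strict UTF-8 decoder, which rejects overlong
   forms, surrogates, and values above U+10FFFF. A ValueError marks the
   cases in which a server would answer BAD.

```python
def parse_utf8_quoted(raw: bytes) -> str:
    """Parse one extended quoted string (hypothetical helper).

    `raw` must include the opening and closing DQUOTE. Returns the
    unescaped text, or raises ValueError where a server following the
    rule above would respond with BAD.
    """
    if len(raw) < 2 or raw[:1] != b'"' or raw[-1:] != b'"':
        raise ValueError("not a quoted string")
    body = raw[1:-1]
    out = bytearray()
    i = 0
    while i < len(body):
        octet = body[i]
        if octet == 0x5C:  # "\" may only escape a quoted-special
            if i + 1 >= len(body) or body[i + 1] not in (0x22, 0x5C):
                raise ValueError("invalid escape in quoted string")
            out.append(body[i + 1])
            i += 2
            continue
        # NUL, LF, CR and an unescaped DQUOTE are not QUOTED-CHARs.
        if octet in (0x00, 0x0A, 0x0D, 0x22):
            raise ValueError("octet not permitted in quoted string")
        out.append(octet)  # ASCII or part of a UTF-8 sequence
        i += 1
    try:
        # UTF-8 lead and continuation octets are all >= 0x80, so the
        # byte-wise scan above cannot split a multi-octet character.
        return out.decode("utf-8")
    except UnicodeDecodeError as exc:
        raise ValueError("octets do not conform to RFC 3629") from exc
```

   Scanning for escapes before decoding keeps the sketch short; an
   implementation that needs to report the position of the offending
   octet would probably validate while scanning instead.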
   The IMAP server MUST NOT send UTF-8 in quoted strings to the client
   unless the client has indicated support for that syntax by using the
   "ENABLE UTF8=ACCEPT" command.

   If the server advertises the "UTF8=ACCEPT" capability, the client MAY
   use extended quoted syntax with any IMAP argument that permits a
   string (including astring and nstring). However, if characters
   outside the US-ASCII repertoire are used in an inappropriate place,
   the results would be the same as if other syntactically valid but
   semantically invalid characters were used. Specific cases where
   UTF-8 characters are permitted or not permitted are described in the
   following paragraphs.

   All IMAP servers that advertise the "UTF8=ACCEPT" capability SHOULD
   accept UTF-8 in mailbox names, and those that also support the
   "Mailbox International Naming Convention" described in RFC 3501,
   Section 5.1.3 MUST accept utf8-quoted mailbox names and convert them
   to the appropriate internal format. Mailbox names MUST comply with
   the Net-Unicode Definition (Section 2 of [RFC5198]) with the specific
   exception that they MUST NOT contain control characters (0000-001F,
   0080-009F), delete (007F), line separator (2028), or paragraph
   separator (2029).

   An IMAP client MUST NOT issue a SEARCH command that contains a
   CHARSET specification after it has issued the "ENABLE UTF8=ACCEPT"
   command. If an IMAP server receives such a SEARCH command, it
   SHOULD reject the command with a BAD response (due to the conflicting
   charset labels).

4. IMAP UTF8 Append Data Extension

   If the "UTF8=ACCEPT" capability is advertised, then the server
   accepts UTF-8 headers in the APPEND command message argument. A
   client that sends a message with UTF-8 headers to the server MUST
   send them using the "UTF8" APPEND data extension.
   If the server also
   advertises the CATENATE capability (as specified in [RFC4469]), the
   client can use the same data extension to include such a message in a
   CATENATE message part. The ABNF for the APPEND data extension and
   CATENATE extension follows:

   utf8-literal   = "UTF8" SP "(" literal8 ")"

   literal8       = <Defined in RFC 4466>

   append-data    =/ utf8-literal

   cat-part       =/ utf8-literal

   IMAP servers that advertise support for "UTF8=ACCEPT" or "UTF8=ONLY"
   MUST reject an APPEND command that includes any 8-bit character in
   the message headers with a "NO" response if the IMAP client has not
   issued "ENABLE UTF8=ACCEPT" or "ENABLE UTF8=ONLY".

   Note that the "UTF8=ONLY" capability described in Section 6 implies
   the "UTF8=ACCEPT" capability. See additional information in that
   section.

5. LOGIN Command and UTF-8

   This specification doesn't extend the IMAP LOGIN command [RFC3501] to
   support UTF-8 usernames and passwords. Whenever a client needs to
   use UTF-8 user names or passwords, it MUST use the IMAP AUTHENTICATE
   command, which is already capable of passing UTF-8 user names and
   credentials.

   Although the use of the IMAP AUTHENTICATE command in this way makes
   it syntactically legal to have a UTF-8 user name or password, there
   is no guarantee the user provisioning system used by the IMAP server
   will allow such identities. This is an implementation decision and
   MAY depend on what identity system the IMAP server is configured to
   use.

6. UTF8=ONLY Capability

   The "UTF8=ONLY" capability permits an IMAP server to advertise that
   it does not support the international mailbox name convention
   (modified UTF-7), allows all of the capabilities that are allowed by
   "UTF8=ACCEPT" (see Section 4), and does not permit the international
   mailbox name convention (modified UTF-7). As this is an incompatible
   change to IMAP, a clear warning is necessary.
   IMAP clients that find
   implementation of the "UTF8=ONLY" capability problematic are
   encouraged to at least detect the "UTF8=ONLY" capability and provide
   an informative error message to the end-user.

   If the "UTF8=ONLY" capability is specified, UTF-8 must be accepted as
   if the "UTF8=ACCEPT" had been specified. For convenience, the
   explicit combination of "UTF8=ONLY" and "UTF8=ACCEPT" is not allowed.

7. Dealing With Legacy Clients

   In most situations, it will be difficult or impossible for the
   implementer or operator of an IMAP (or POP) server to know whether
   all of the clients that might access it, or the associated mail store
   more generally, will be able to support the facilities defined in
   this document. In almost all cases, servers that conform to this
   specification will have to be prepared to deal with clients that do
   not enable the relevant capabilities. Unfortunately, there is no
   completely satisfactory way to do so other than for systems that wish
   to receive email that requires SMTPUTF8 capabilities to be sure that
   all components of those systems -- including IMAP and other clients
   selected by users -- are upgraded appropriately.

   Choices available to the server when a message that requires SMTPUTF8
   is encountered and the client doesn't enable UTF-8 capability include
   hiding the problematic message(s), creating in-band or out-of-band
   notifications or error messages, or somehow trying to create a
   variation on the message with the intention of providing useful
   information to that client about what has occurred.
   Such variant
   messages cannot be actual substitutes for the original message: it
   will rarely be possible to reply to them (either at all or without
   loss of information), and new header fields or specialized constructs
   for server-client communication may go beyond the requirements of,
   e.g., RFC 5322 and may consequently confuse some legacy mail user
   agents (including IMAP clients) or otherwise not provide the expected
   information to users. There are also tradeoffs in constructing
   variants of the original message between accepting complexity and
   additional computation costs in order to try to preserve as much
   information as possible (for example, in [popimap-downgrade]) and
   trying to minimize those costs while still providing useful
   information (for example, in [I-D.ietf-eai-simpledowngrade]).

   Because such messages are really variations on the original ones, not
   really "downgraded ones" (although that terminology is often used for
   convenience), they inevitably have relationships to the original ones
   that the IMAP specification [RFC3501] did not anticipate. In
   particular, digital signatures computed over the original message
   will often not be applicable to the variant version, and servers that
   may be accessed by the same user with different clients or methods
   (e.g., POP or webmail systems in addition to IMAP or IMAP clients
   with different capabilities) will need to exert extreme care to be
   sure that UIDVALIDITY behaves as the user would expect. Those issues
   may be especially sensitive if the server caches the variant message
   or computes and stores it when the message arrives with the intent of
   making either form available depending on client capabilities.

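   The server-side choices listed in this section can be pictured as a
   small dispatch step. The following Python sketch is illustrative
   only and non-normative. The names Action, needs_utf8_support, and
   choose_action are hypothetical, and treating "any octet with the
   high bit set in the header block" as the test for a message that
   requires SMTPUTF8 is a simplifying assumption, not a rule taken from
   this specification.

```python
from enum import Enum


class Action(Enum):
    """Hypothetical labels for the options discussed in Section 7."""
    SEND_ORIGINAL = "send the stored message unchanged"
    HIDE = "hide the message from this client"
    NOTIFY = "send an in-band or out-of-band notification"
    SEND_VARIANT = "send a variant built from the original"


def needs_utf8_support(header_block: bytes) -> bool:
    # Assumption: 8-bit octets in the headers mean the message can only
    # be shown as-is to a client that has enabled UTF-8 support.
    return any(octet > 0x7F for octet in header_block)


def choose_action(header_block: bytes, utf8_enabled: bool,
                  legacy_policy: Action) -> Action:
    """Pick what to do for one message and one client session.

    `legacy_policy` is the operator's local choice for clients that
    have not issued "ENABLE UTF8=ACCEPT".
    """
    if utf8_enabled or not needs_utf8_support(header_block):
        return Action.SEND_ORIGINAL
    if legacy_policy is Action.SEND_ORIGINAL:
        # Sending raw UTF-8 headers to such a client is not an option.
        raise ValueError("legacy policy must not be SEND_ORIGINAL")
    return legacy_policy
```

   The sketch deliberately leaves out how a variant is built and how
   its UID relates to that of the original, since those are the parts
   that depend most on local conditions.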
   The best (or "least bad") approach for any given environment will
   depend on local conditions, local assumptions about user behavior,
   the degree of control the server operator has over client usage and
   upgrading, the options that are actually available, and so on. It is
   impossible, at least at the time of publication of this
   specification, to give good advice that will apply to all situations,
   or even particular profiles of situations, other than "upgrade legacy
   clients as soon as possible".

8. Issues with UTF-8 Header Mailstore

   When an IMAP server uses a mailbox format that supports UTF-8 headers
   and it permits selection or examination of that mailbox without the
   "UTF8" parameter, it is the responsibility of the server to comply
   with the IMAP4rev1 base specification [RFC3501] and [RFC5322] with
   respect to all header information transmitted over the wire. The
   issue of handling messages containing non-ASCII characters in legacy
   environments is discussed in Section 7.

9. IANA Considerations

   This document adds two new capabilities ("UTF8=ACCEPT" and
   "UTF8=ONLY") to the IMAP4rev1 Capabilities registry [RFC3501]. Three
   other IMAP capabilities that were described in the experimental
   predecessor to this document (UTF8=ALL, UTF8=APPEND, UTF8=USER) are
   to be marked OBSOLETE in the registry.

10. Security Considerations

   The security considerations of UTF-8 [RFC3629] and SASLprep [RFC4013]
   apply to this specification, particularly with respect to use of
   UTF-8 in user names and passwords. Otherwise, this is not believed
   to alter the security considerations of IMAP4rev1.

   Special considerations, some of them with security implications,
   occur if a server that conforms to this specification is accessed by
   a client that does not and in some more complex situations in which a
   given message is accessed by multiple clients that might use
   different protocols and/or support different capabilities. Those
   issues are discussed in Section 7 above.

11. References

11.1. Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC3501]  Crispin, M., "INTERNET MESSAGE ACCESS PROTOCOL - VERSION
              4rev1", RFC 3501, March 2003.

   [RFC3629]  Yergeau, F., "UTF-8, a transformation format of ISO
              10646", STD 63, RFC 3629, November 2003.

   [RFC4013]  Zeilenga, K., "SASLprep: Stringprep Profile for User Names
              and Passwords", RFC 4013, February 2005.

   [RFC4466]  Melnikov, A. and C. Daboo, "Collected Extensions to IMAP4
              ABNF", RFC 4466, April 2006.

   [RFC4469]  Resnick, P., "Internet Message Access Protocol (IMAP)
              CATENATE Extension", RFC 4469, April 2006.

   [RFC5161]  Gulbrandsen, A. and A. Melnikov, "The IMAP ENABLE
              Extension", RFC 5161, March 2008.

   [RFC5198]  Klensin, J. and M. Padlipsky, "Unicode Format for Network
              Interchange", RFC 5198, March 2008.

   [RFC5234]  Crocker, D. and P. Overell, "Augmented BNF for Syntax
              Specifications: ABNF", STD 68, RFC 5234, January 2008.

   [RFC5258]  Leiba, B. and A. Melnikov, "Internet Message Access
              Protocol version 4 - LIST Command Extensions", RFC 5258,
              June 2008.

   [RFC6532]  Yang, A., Steele, S., and N. Freed, "Internationalized
              Email Headers", RFC 6532, February 2012.

   [RFC5322]  Resnick, P., Ed., "Internet Message Format", RFC 5322,
              October 2008.

   [RFC6530]  Klensin, J. and Y. Ko, "Overview and Framework for
              Internationalized Email", RFC 6530, February 2012.

11.2. Informative References

   [RFC2088]  Myers, J., "IMAP4 non-synchronizing literals", RFC 2088,
              January 1997.

   [RFC5738]  Resnick, P. and C. Newman, "IMAP Support for UTF-8",
              RFC 5738, March 2010.

   [I-D.ietf-eai-simpledowngrade]
              Gulbrandsen, A., "EAI: Simplified POP/IMAP downgrading",
              draft-ietf-eai-simpledowngrade-05 (work in progress),
              June 2012.

   [popimap-downgrade]
              Fujiwara, K., "Post-delivery Message Downgrading for
              Internationalized Email Messages",
              draft-ietf-eai-popimap-downgrade-06 (work in progress),
              July 2012.

Appendix A. Design Rationale

   This non-normative section discusses the reasons behind some of the
   design choices in the above specification.

   The basic approach of advertising the ability to access a mailbox in
   UTF-8 mode is intended to permit graceful upgrade, including servers
   that support multiple mailbox formats. In particular, it would be
   undesirable to force conversion of an entire server mailstore to
   UTF-8 headers, so being able to phase-in support for new mailboxes
   and gradually migrate old mailboxes is permitted by this design.

   The "UTF8=ONLY" mechanism simplifies diagnosis of interoperability
   problems when legacy support goes away. In the situation where
   backwards compatibility is broken anyway, just-send-UTF-8 IMAP has
   the advantage that it might work with some legacy clients. However,
   the difficulty of diagnosing interoperability problems caused by a
   just-send-UTF-8 IMAP mechanism is the reason the "UTF8=ONLY"
   capability mechanism was chosen.

Appendix B. Acknowledgments

   The authors wish to thank the participants of the EAI working group
   for their contributions to this document with particular thanks to
   Harald Alvestrand, David Black, Randall Gellens, Arnt Gulbrandsen,
   Kari Hurtta, John Klensin, Xiaodong Lee, Charles Lindsey, Alexey
   Melnikov, Subramanian Moonesamy, Shawn Steele, Daniel Taharlev, and
   Joseph Yee for their specific contributions to the discussion.

Authors' Addresses

   Pete Resnick (editor)
   Qualcomm Incorporated
   5775 Morehouse Drive
   San Diego, CA 92121-1714
   US

   Phone: +1 858 651 4478
   EMail: presnick@qualcomm.com

   Chris Newman (editor)
   Oracle
   800 Royal Oaks
   Monrovia, CA 91016
   USA

   Phone:
   EMail: chris.newman@oracle.com

   Sean Shen (editor)
   CNNIC
   No.4 South 4th Zhongguancun Street
   Beijing, 100190
   China

   Phone: +86 10-58813038
   EMail: shenshuo@cnnic.cn