idnits 2.17.1

draft-ietf-eai-rfc5721bis-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  == The 'Obsoletes: ' line in the draft header should list only the _numbers_
     of the RFCs which will be obsoleted by this document (if approved); it
     should not include the word 'RFC' in the list.

  == The 'Updates: ' line in the draft header should list only the _numbers_
     of the RFCs which will be updated by this document (if approved); it
     should not include the word 'RFC' in the list.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == The document seems to contain a disclaimer for pre-RFC5378 work, but was
     first submitted on or after 10 November 2008.  The disclaimer is usually
     necessary only for documents that revise or obsolete older RFCs, and that
     take significant amounts of text from those RFCs.  If you can contact all
     authors of the source material and they are willing to grant the BCP78
     rights to the IETF Trust, you can and should remove the disclaimer.
     Otherwise, the disclaimer is needed and you can ignore this comment.
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (September 28, 2010) is 4958 days in the past.  Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Unused Reference: 'RFC4952' is defined on line 498, but no explicit
     reference was found in the text

  == Outdated reference: A later version (-12) exists of
     draft-ietf-eai-frmwrk-4952bis-07

  == Outdated reference: A later version (-13) exists of
     draft-ietf-eai-rfc5335bis-02

  ** Obsolete normative reference: RFC 4013 (Obsoleted by RFC 7613)

  -- Obsolete informational reference (is this intentional?): RFC 4952
     (Obsoleted by RFC 6530)

     Summary: 1 error (**), 0 flaws (~~), 7 warnings (==), 2 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

2    Network Working Group                                         R. Gellens
3    Internet-Draft                                     QUALCOMM Incorporated
4    Obsoletes: RFC5721                                             C. Newman
5    (if approved)                                                     Oracle
6    Updates: RFC1939                                           Jiankang. Yao
7    (if approved)                                                      CNNIC
8    Intended status: Standards Track                      Kazunori. Fujiwara
9    Expires: March 14, 2011                                             JPRS
10                                                         September 28, 2010

12                            POP3 Support for UTF-8
13                       draft-ietf-eai-rfc5721bis-00.txt

15   Abstract

17      This specification extends the Post Office Protocol version 3 (POP3)
18      to support un-encoded international characters in user names,
19      passwords, mail addresses, message headers, and protocol-level
20      textual error strings.

22   Status of This Memo

24      This Internet-Draft is submitted in full conformance with the
25      provisions of BCP 78 and BCP 79.

27      Internet-Drafts are working documents of the Internet Engineering
28      Task Force (IETF).  Note that other groups may also distribute
29      working documents as Internet-Drafts.  The list of current Internet-
30      Drafts is at http://datatracker.ietf.org/drafts/current/.

32      Internet-Drafts are draft documents valid for a maximum of six months
33      and may be updated, replaced, or obsoleted by other documents at any
34      time.  It is inappropriate to use Internet-Drafts as reference
35      material or to cite them other than as "work in progress."

37      This Internet-Draft will expire on March 14, 2011.

39   Copyright Notice

41      Copyright (c) 2010 IETF Trust and the persons identified as the
42      document authors.  All rights reserved.

44      This document is subject to BCP 78 and the IETF Trust's Legal
45      Provisions Relating to IETF Documents
46      (http://trustee.ietf.org/license-info) in effect on the date of
47      publication of this document.  Please review these documents
48      carefully, as they describe your rights and restrictions with respect
49      to this document.  Code Components extracted from this document must
50      include Simplified BSD License text as described in Section 4.e of
51      the Trust Legal Provisions and are provided without warranty as
52      described in the Simplified BSD License.

54      This document may contain material from IETF Documents or IETF
55      Contributions published or made publicly available before November
56      10, 2008.  The person(s) controlling the copyright in some of this
57      material may not have granted the IETF Trust the right to allow
58      modifications of such material outside the IETF Standards Process.
59      Without obtaining an adequate license from the person(s) controlling
60      the copyright in such materials, this document may not be modified
61      outside the IETF Standards Process, and derivative works of it may
62      not be created outside the IETF Standards Process, except to format
63      it for publication as an RFC or to translate it into languages other
64      than English.

66   Table of Contents

68      1.  Introduction
69        1.1.  Conventions Used in This Document
70      2.  LANG Capability
71      3.  UTF8 Capability
72        3.1.  The UTF8 Command
73        3.2.  USER Argument to UTF8 Capability
74      4.  Native UTF-8 Maildrops
75      5.  IANA Considerations
76      6.  Security Considerations
77      7.  References
78        7.1.  Normative References
79        7.2.  Informative References
80      Appendix A.  Design Rationale
81      Appendix B.  Acknowledgments

83   1.  Introduction

85      This document forms part of the Email Address Internationalization
86      (EAI) protocols described in the EAI Framework document
87      [I-D.ietf-eai-frmwrk-4952bis].  As part of the overall EAI work,
88      email messages may be transmitted and delivered containing un-encoded
89      UTF-8 characters, and mail drops that are accessed using POP3
90      [RFC1939] might natively store UTF-8.

92      This specification extends POP3 [RFC1939] using the POP3 extension
93      mechanism [RFC2449] to permit un-encoded UTF-8 [RFC3629] in headers,
94      as described in "Internationalized Email Headers"
95      [I-D.ietf-eai-rfc5335bis].  It also adds a mechanism to support login
96      names and passwords outside the ASCII character set, and a mechanism
97      to support UTF-8 protocol-level error strings in a language
98      appropriate for the user.

100     Within this specification, the term "down-conversion" refers to the
101     process of modifying a message containing UTF-8 headers
102     [I-D.ietf-eai-rfc5335bis] or body parts with 8bit content-transfer-
103     encoding, as defined in MIME Section 2.8 [RFC2045], into conforming
104     7-bit Internet Message Format [RFC5322] with message header
105     extensions for non-ASCII text [RFC2047] and other 7-bit encodings.
106     Down-conversion is specified by "Message-Downgrading for Email
107     Address Internationalization" [message-downgrade].

109  1.1.  Conventions Used in This Document

111     The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
112     "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
113     document are to be interpreted as described in "Key words for use in
114     RFCs to Indicate Requirement Levels" [RFC2119].

116     The formal syntax uses the Augmented Backus-Naur Form (ABNF)
117     [RFC5234] notation, including the core rules defined in Appendix B of
118     RFC 5234.

120     In examples, "C:" and "S:" indicate lines sent by the client and
121     server, respectively.  If a single "C:" or "S:" label applies to
122     multiple lines, then the line breaks between those lines are for
123     editorial clarity only and are not part of the actual protocol
124     exchange.

126     Note that examples always use 7-bit ASCII characters due to
127     limitations of this document format; in particular, some examples for
128     the "LANG" command may appear silly as a result.

130  2.  LANG Capability

132     Per "POP3 Extension Mechanism" [RFC2449], this document adds a new
133     capability response tag to indicate support for a new command: LANG.
134     The capability tag and new command are described below.

136     CAPA tag:
137        LANG

139     Arguments with CAPA tag:
140        none

142     Added Commands:
143        LANG

145     Standard commands affected:
146        All

148     Announced states / possible differences:
149        both / no

151     Commands valid in states:
152        AUTHORIZATION, TRANSACTION

154     Specification reference:
155        this document

157     Discussion:

159     POP3 allows most +OK and -ERR server responses to include human-
160     readable text that, in some cases, might be presented to the user.
161     But that text is limited to ASCII by the POP3 specification
162     [RFC1939].  The LANG capability and command permit a POP3 client to
163     negotiate which language the server should use when sending human-
164     readable text.

166     A server that advertises the LANG extension MUST use the language
167     "i-default" as described in [RFC2277] as its default language until
168     another supported language is negotiated by the client.  A server
169     MUST include "i-default" as one of its supported languages.

171     The LANG command requests that human-readable text included in all
172     subsequent +OK and -ERR responses be localized to a language matching
173     the language range argument (the "Basic Language Range" as described
174     by [RFC4647]).  If the command succeeds, the server returns a +OK
175     response followed by a single space, the exact language tag selected,
176     another space, and human-readable text in the appropriate language
177     that fills the rest of the line.  This and subsequent protocol-level
178     human-readable text is encoded in the UTF-8 charset.

180     If the command fails, the server returns an -ERR response and
181     subsequent human-readable response text continues to use the language
182     that was previously active (typically i-default).

184     The special "*" language range argument indicates a request to use a
185     language designated as preferred by the server administrator.  The
186     preferred language MAY vary based on the currently active user.

188     If no argument is given and the POP3 server issues a positive
189     response, then the response given is multi-line.  After the initial
190     +OK, for each language tag the server supports, the POP3 server
191     responds with a line for that language.  This line is called a
192     "language listing".

194     In order to simplify parsing, all POP3 servers are required to use a
195     certain format for language listings.  A language listing consists of
196     the language tag [RFC5646] of the message, optionally followed by a
197     single space and a human-readable description of the language in the
198     language itself, using the UTF-8 charset.
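
The response formats described above can be illustrated with a short
client-side sketch in Python.  It is not part of the specification.  The
names LanguageListing, parse_language_listing, and parse_lang_success are
hypothetical, and the sketch assumes the caller has already stripped the
trailing CRLF and handled the multi-line terminator (".").

```python
from typing import NamedTuple, Optional, Tuple


class LanguageListing(NamedTuple):
    """One line of a multi-line LANG response (hypothetical helper type)."""
    tag: str                    # language tag, e.g. "en" or "i-default"
    description: Optional[str]  # human-readable name, if the server sent one


def parse_language_listing(line: bytes) -> LanguageListing:
    """Parse a single language listing line (CRLF already removed)."""
    text = line.decode("utf-8")  # listings use the UTF-8 charset
    # The tag is everything up to the first space; the description is optional.
    tag, sep, description = text.partition(" ")
    return LanguageListing(tag, description if sep else None)


def parse_lang_success(line: bytes) -> Tuple[str, str]:
    """Split a positive reply to 'LANG <range>' into (tag, human_text)."""
    text = line.decode("utf-8")
    status, _, rest = text.partition(" ")
    if status != "+OK":
        raise ValueError("not a positive response")
    # After "+OK " comes the selected tag, a space, then localized text.
    tag, _, human_text = rest.partition(" ")
    return tag, human_text
```

Splitting on the first space only keeps any further spaces inside the
human-readable description or response text intact.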

200     Examples:

202        < Note that some examples do not include the correct character
203        accents due to limitations of this document format. >

205        < The server defaults to using English i-default responses until
206        the client explicitly changes the language. >

208        C: USER karen
209        S: +OK Hello, karen
210        C: PASS password
211        S: +OK karen's maildrop contains 2 messages (320 octets)

213        < Client requests deprecated MUL language.  Server replies
214        with -ERR response. >

216        C: LANG MUL
217        S: -ERR invalid language MUL

219        < A LANG command with no parameters is a request for
220        a language listing. >

222        C: LANG
223        S: +OK Language listing follows:
224        S: en English
225        S: en-boont English Boontling dialect
226        S: de Deutsch
227        S: it Italiano
228        S: es Espanol
229        S: sv Svenska
230        S: i-default Default language
231        S: .

233        < A request for a language listing might fail. >

235        C: LANG
236        S: -ERR Server is unable to list languages

238        < Once the client changes the language, all responses will be in
239        that language, starting with the response to the LANG command. >

241        C: LANG es
242        S: +OK es Idioma cambiado

244        < If a server does not support the requested primary language,
245        responses will continue to be returned in the current language
246        the server is using. >

248        C: LANG uga
249        S: -ERR es Idioma <<UGA>> no es conocido

251        C: LANG sv
252        S: +OK sv Kommandot "LANG" lyckades

254        C: LANG *
255        S: +OK es Idioma cambiado

257  3.  UTF8 Capability

259     Per "POP3 Extension Mechanism" [RFC2449], this document adds a new
260     capability response tag to indicate support for new server
261     functionality, including a new command: UTF8.  The capability tag and
262     new command and functionality are described below.

264     CAPA tag:
265        UTF8

267     Arguments with CAPA tag:
268        USER

270     Added Commands:
271        UTF8

273     Standard commands affected:
274        USER, PASS, APOP, LIST, TOP, RETR

276     Announced states / possible differences:
277        both / no

279     Commands valid in states:
280        AUTHORIZATION

282     Specification reference:
283        this document

285     Discussion:

287     This capability adds the "UTF8" command to POP3.  The UTF8 command
288     switches the session from ASCII to UTF-8 mode.

290  3.1.  The UTF8 Command

292     The UTF8 command enables UTF-8 mode.  The UTF8 command has no
293     parameters.

295     Maildrops can natively store UTF-8 or be limited to ASCII.  UTF-8
296     mode has no effect on messages in an ASCII-only maildrop.  Messages
297     in native UTF-8 maildrops can be ASCII or UTF-8 using
298     internationalized headers [I-D.ietf-eai-rfc5335bis] and/or 8bit
299     content-transfer-encoding, as defined in MIME Section 2.8 [RFC2045].
300     In UTF-8 mode, both UTF-8 and ASCII messages are sent to the client
301     as-is (without conversion).  When not in UTF-8 mode, UTF-8 messages
302     in a native UTF-8 maildrop MUST NOT be sent to the client as-is.
303     UTF-8 messages in a native UTF-8 maildrop MUST be down-converted
304     (downgraded) to comply with unextended POP and Internet Mail Format
305     without UTF-8 mode support.

307     Note that even in UTF-8 mode, MIME binary content-transfer-encoding
308     is still not permitted.

310     The octet count (size) of a message reported in a response to the
311     LIST command SHOULD match the actual number of octets sent in a RETR
312     response (not counting byte-stuffing).  Sizes reported elsewhere,
313     such as in STAT responses and non-standardized, free-form text in
314     positive status indicators (following "+OK"), need not be accurate,
315     but it is preferable if they are.

317     Mail stores are either ASCII or native UTF-8, and clients either
318     issue the UTF8 command or not.  The message needs converting only
319     when it is native UTF-8 and the client has not issued the UTF8
320     command, in which case the server must down-convert it.  The down-
321     converted message may be larger.  The server may choose various
322     strategies regarding down-conversion, which include when to down-
323     convert, whether to cache or store the down-converted form of a
324     message (and if so, for how long), and whether to calculate or retain
325     the size of a down-converted message independently of the down-
326     converted content.  If the server does not have immediate access to
327     the accurate down-converted size, it may be faster to estimate rather
328     than calculate it.  Servers are expected to normally follow the RFC
329     1939 [RFC1939] text on using the "exact size" in a scan listing, but
330     there may be situations with maildrops containing very large numbers
331     of messages in which this might be a problem.  If the server does
332     estimate, reporting a scan listing size smaller than what it turns
333     out to be could be a problem for some clients.  In summary, it is
334     better for servers to report accurate sizes, but if this is not
335     possible, high guesses are better than small ones.  Some POP servers
336     include the message size in the non-standardized text response
337     following "+OK" (the 'text' production of RFC 2449 [RFC2449]), in a
338     RETR or TOP response (possibly because some examples in POP3
339     [RFC1939] do so).  There has been at least one known case of a client
340     relying on this to know when it had received all of the message
341     rather than following the POP3 [RFC1939] rule of looking for a line
342     consisting of a termination octet (".") and a CRLF pair.  While any
343     such client is non-compliant, if a server does include the size in
344     such text, it is better if it is accurate.
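
The conversion rule and the size-reporting guidance in the paragraph above
might be sketched as follows.  This is an illustration only, written in
Python.  The function names and the pad_factor parameter are hypothetical,
and the value 1.5 is an arbitrary placeholder: this document does not
define how a server should estimate a down-converted size, only that high
guesses are better than low ones.

```python
import math
from typing import Optional


def needs_down_conversion(maildrop_is_utf8: bool,
                          message_is_utf8: bool,
                          utf8_mode: bool) -> bool:
    """True only for a UTF-8 message in a native UTF-8 maildrop when the
    client has not issued the UTF8 command."""
    return maildrop_is_utf8 and message_is_utf8 and not utf8_mode


def scan_listing_size(stored_size: int,
                      needs_conversion: bool,
                      exact_converted_size: Optional[int] = None,
                      pad_factor: float = 1.5) -> int:
    """Pick the size to report in a scan listing.

    pad_factor is an assumed, illustrative safety margin and is not taken
    from this specification.
    """
    if not needs_conversion:
        return stored_size            # message is sent as-is
    if exact_converted_size is not None:
        return exact_converted_size   # accurate size is always preferred
    # No accurate size at hand: round the estimate up, since an estimate
    # that is too small can cause problems for some clients.
    return math.ceil(stored_size * pad_factor)
```

A server that caches the down-converted form of a message can pass its
exact size as exact_converted_size and never reach the estimate branch.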

346     Clients MUST NOT issue the STLS command [RFC2595] after issuing UTF8;
347     servers MAY (but are not required to) enforce this by rejecting with
348     an "-ERR" response an STLS command issued subsequent to a successful
349     UTF8 command.  (Because this is a protocol error as opposed to a
350     failure based on conditions, an extended response code [RFC2449] is
351     not specified.)

353  3.2.  USER Argument to UTF8 Capability

355     If the USER argument is included with this capability, it indicates
356     that the server accepts UTF-8 user names and passwords.

358     Servers that include the USER argument in the UTF8 capability
359     response SHOULD apply SASLprep [RFC4013] to the arguments of the USER
360     and PASS commands.

362     A client or server that supports APOP and permits UTF-8 in user names
363     or passwords MUST apply SASLprep [RFC4013] to the user name and
364     password used to compute the APOP digest.

366     When applying SASLprep [RFC4013], servers MUST reject UTF-8 user
367     names or passwords that contain a Unicode character listed in Section
368     2.3 of SASLprep [RFC4013].  When applying SASLprep to the USER
369     argument, the PASS argument, or the APOP username argument, a
370     compliant server or client MUST treat them as a query string (i.e.,
371     unassigned Unicode code points are allowed).  When applying SASLprep
372     to the APOP password argument, a compliant server or client MUST
373     treat it as a stored string (i.e., unassigned Unicode code points
374     are prohibited).

376     The client does not need to issue the UTF8 command prior to using
377     UTF-8 in authentication.  However, clients MUST NOT use UTF-8
378     characters in USER, PASS, or APOP commands unless the USER argument
379     is included in the UTF8 capability response.

381     The server MUST reject UTF-8 user names or passwords that fail to
382     comply with the formal syntax in UTF-8 [RFC3629].

384     Use of UTF-8 characters in the AUTH command is governed by the POP3
385     SASL [RFC5034] mechanism.

387  4.  Native UTF-8 Maildrops

389     When a POP3 server uses a native UTF-8 maildrop, it is the
390     responsibility of the server to comply with the POP3 base
391     specification [RFC1939] and Internet Message Format [RFC5322] when
392     not in UTF-8 mode.  Mechanisms for 7-bit downgrading to help comply
393     with the standards are described in [message-downgrade].

395  5.  IANA Considerations

397     This specification adds two new capabilities ("UTF8" and "LANG") to
398     the POP3 capability registry [RFC2449].

400  6.  Security Considerations

402     The security considerations of UTF-8 [RFC3629] and SASLprep [RFC4013]
403     apply to this specification, particularly with respect to use of
404     UTF-8 in user names and passwords.

406     The "LANG *" command might reveal the existence and preferred
407     language of a user to an active attacker probing the system if the
408     active language changes in response to the USER, PASS, or APOP
409     commands prior to validating the user's credentials.  Servers MUST
410     implement a configuration to prevent this exposure.

412     It is possible for a man-in-the-middle attacker to insert a LANG
413     command in the command stream, thus making protocol-level diagnostic
414     responses unintelligible to the user.  A mechanism to integrity-
415     protect the session, such as Transport Layer Security (TLS) [RFC2595],
416     can be used to defeat such attacks.

418     Modifying server authentication code (in this case, to support the UTF8
419     command) needs to be done with care to avoid introducing
420     vulnerabilities (for example, in string parsing).

422     The UTF8 command description (Section 3.1) contains a discussion on
423     reporting inaccurate sizes.  An additional risk to doing so is that,
424     if a client allocates buffers based on the reported size, it may
425     overrun the buffer, crash, or have other problems if the message data
426     is larger than reported.

428  7.  References

430  7.1.  Normative References

432     [I-D.ietf-eai-frmwrk-4952bis]  Klensin, J. and Y. Ko, "Overview and
433                                    Framework for Internationalized
434                                    Email",
435                                    draft-ietf-eai-frmwrk-4952bis-07 (work
436                                    in progress), August 2010.

438     [I-D.ietf-eai-rfc5335bis]      Yang, A. and S. Steele,
439                                    "Internationalized Email Headers",
440                                    draft-ietf-eai-rfc5335bis-02 (work in
441                                    progress), August 2010.

443     [RFC1939]                      Myers, J. and M. Rose, "Post Office
444                                    Protocol - Version 3", STD 53,
445                                    RFC 1939, May 1996.

447     [RFC2045]                      Freed, N. and N. Borenstein,
448                                    "Multipurpose Internet Mail Extensions
449                                    (MIME) Part One: Format of Internet
450                                    Message Bodies", RFC 2045,
451                                    November 1996.

453     [RFC2047]                      Moore, K., "MIME (Multipurpose
454                                    Internet Mail Extensions) Part Three:
455                                    Message Header Extensions for Non-
456                                    ASCII Text", RFC 2047, November 1996.

458     [RFC2119]                      Bradner, S., "Key words for use in
459                                    RFCs to Indicate Requirement Levels",
460                                    BCP 14, RFC 2119, March 1997.

462     [RFC2277]                      Alvestrand, H., "IETF Policy on
463                                    Character Sets and Languages", BCP 18,
464                                    RFC 2277, January 1998.

466     [RFC2449]                      Gellens, R., Newman, C., and L.
467                                    Lundblade, "POP3 Extension Mechanism",
468                                    RFC 2449, November 1998.

470     [RFC3629]                      Yergeau, F., "UTF-8, a transformation
471                                    format of ISO 10646", STD 63,
472                                    RFC 3629, November 2003.

474     [RFC4013]                      Zeilenga, K., "SASLprep: Stringprep
475                                    Profile for User Names and Passwords",
476                                    RFC 4013, February 2005.

478     [RFC4647]                      Phillips, A. and M. Davis, "Matching
479                                    of Language Tags", BCP 47, RFC 4647,
480                                    September 2006.

482     [RFC5234]                      Crocker, D. and P. Overell, "Augmented
483                                    BNF for Syntax Specifications: ABNF",
484                                    STD 68, RFC 5234, January 2008.

486     [RFC5322]                      Resnick, P., Ed., "Internet Message
487                                    Format", RFC 5322, October 2008.

489     [RFC5646]                      Phillips, A. and M. Davis, "Tags for
490                                    Identifying Languages", BCP 47,
491                                    RFC 5646, September 2009.

493  7.2.  Informative References

495     [RFC2595]                      Newman, C., "Using TLS with IMAP, POP3
496                                    and ACAP", RFC 2595, June 1999.

498     [RFC4952]                      Klensin, J. and Y. Ko, "Overview and
499                                    Framework for Internationalized
500                                    Email", RFC 4952, July 2007.

502     [RFC5034]                      Siemborski, R. and A. Menon-Sen, "The
503                                    Post Office Protocol (POP3) Simple
504                                    Authentication and Security Layer
505                                    (SASL) Authentication Mechanism",
506                                    RFC 5034, July 2007.

508     [message-downgrade]            Fujiwara, K. and Y. Yoneya, "Message
509                                    Downgrading for Email Address
510                                    Internationalization (EAI) Maildrops",
511                                    draft-ietf-eai-rfc5504bis-00 (work in
512                                    progress), September 2010.

514  Appendix A.  Design Rationale

516     This non-normative section discusses the reasons behind some of the
517     design choices in the above specification.

519     Due to interoperability problems with RFC 2047 and limited deployment
520     of RFC 2231, it is hoped these 7-bit encoding mechanisms can be
521     deprecated in the future when UTF-8 header support becomes prevalent.

523     USER is optional because the implementation burden of SASLprep
524     [RFC4013] is not well understood, and mandating such support in all
525     cases could negatively impact deployment.

527     While it is possible to provide useful examples for language
528     negotiation without support for non-ASCII characters, it is difficult
529     to provide useful examples for commands specifically designed to use
530     the UTF-8 charset un-encoded when the document format is limited to
531     ASCII.  As a result, there are no plans to provide examples for that
532     part of the specification as long as this remains an experimental
533     proposal.  However, implementers of this specification are encouraged
534     to provide examples to the document authors for a future revision.

536  Appendix B.  Acknowledgments

538     Thanks to John Klensin, Tony Hansen, and other EAI working group
539     participants who provided helpful suggestions and interesting debate
540     that improved this specification.

542  Authors' Addresses

544     Randall Gellens
545     QUALCOMM Incorporated
546     5775 Morehouse Drive
547     San Diego, CA  92651
548     US

550     EMail: rg+ietf@qualcomm.com

552     Chris Newman
553     Oracle
554     800 Royal Oaks
555     Monrovia, CA  91016-6347
556     US

558     EMail: chris.newman@oracle.com

559     Jiankang YAO
560     CNNIC
561     No.4 South 4th Street, Zhongguancun
562     Beijing

564     Phone: +86 10 58813007
565     EMail: yaojk@cnnic.cn

567     Kazunori Fujiwara
568     Japan Registry Services Co., Ltd.
569     Chiyoda First Bldg. East 13F, 3-8-1 Nishi-Kanda
570     Tokyo

572     Phone: +81 3 5215 8451
573     EMail: fujiwara@jprs.co.jp