idnits 2.17.1

draft-yeh-ima-utf8headers-01.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** It looks like you're using RFC 3978 boilerplate. You should update this
     to the boilerplate described in the IETF Trust License Policy document
     (see https://trustee.ietf.org/license-info), which is required now.

  -- Found old boilerplate from RFC 3978, Section 5.1 on line 14.

  -- Found old boilerplate from RFC 3978, Section 5.5 on line 438.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 415.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 422.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 428.

  ** This document has an original RFC 3978 Section 5.4 Copyright Line,
     instead of the newer IETF Trust Copyright according to RFC 4748.

  ** This document has an original RFC 3978 Section 5.5 Disclaimer, instead of
     the newer disclaimer which includes the IETF Trust according to RFC 4748.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not
     match the current year

  == The document seems to lack the recommended RFC 2119 boilerplate, even if
     it appears to use RFC 2119 keywords.

     (The document does seem to have the reference to RFC 2119 which the
     ID-Checklist requires).
  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008. If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment. If not, you may need to add the pre-RFC5378 disclaimer.
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (February 27, 2006) is 6632 days in the past. Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Missing Reference: 'CFWS' is mentioned on line 266, but not defined

  == Unused Reference: 'ASCII' is defined on line 337, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC3066' is defined on line 366, but no explicit
     reference was found in the text

  -- Possible downref: Non-RFC (?) normative reference: ref. 'ASCII'

  == Outdated reference: A later version (-02) exists of
     draft-yao-ima-smtpext-00

  -- Possible downref: Normative reference to a draft: ref.
     'IMA-SMTP-extension'

  == Outdated reference: A later version (-01) exists of
     draft-klensin-ima-framework-00

  -- Possible downref: Normative reference to a draft: ref. 'IMA-overview'

  ** Obsolete normative reference: RFC 2821 (Obsoleted by RFC 5321)

  ** Obsolete normative reference: RFC 2822 (Obsoleted by RFC 5322)

  ** Obsolete normative reference: RFC 3066 (Obsoleted by RFC 4646, RFC 4647)

     Summary: 6 errors (**), 0 flaws (~~), 8 warnings (==), 10 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

2    Network Working Group                                        J. Yeh, Ed.
3    Internet-Draft                                                     TWNIC
4    Expires: August 31, 2006                               February 27, 2006

6                        Internationalized Email Headers
7                        draft-yeh-ima-utf8headers-01.txt

9    Status of this Memo

11      By submitting this Internet-Draft, each author represents that any
12      applicable patent or other IPR claims of which he or she is aware
13      have been or will be disclosed, and any of which he or she becomes
14      aware will be disclosed, in accordance with Section 6 of BCP 79.

16      Internet-Drafts are working documents of the Internet Engineering
17      Task Force (IETF), its areas, and its working groups. Note that
18      other groups may also distribute working documents as Internet-
19      Drafts.

21      Internet-Drafts are draft documents valid for a maximum of six months
22      and may be updated, replaced, or obsoleted by other documents at any
23      time. It is inappropriate to use Internet-Drafts as reference
24      material or to cite them other than as "work in progress."

26      The list of current Internet-Drafts can be accessed at
27      http://www.ietf.org/ietf/1id-abstracts.txt.

29      The list of Internet-Draft Shadow Directories can be accessed at
30      http://www.ietf.org/shadow.html.

32      This Internet-Draft will expire on August 31, 2006.

34   Copyright Notice

36      Copyright (C) The Internet Society (2006).

38   Abstract

40      Full internationalization of electronic mail requires not only the
41      capability to transmit non-ASCII content, to encode selected
42      information in specific header fields, and to use international
43      characters in envelope addresses. It also requires being able to
44      express those addresses and information based on them in mail header
45      fields. This document specifies the use of Unicode encoded in UTF-8,
46      rather than ASCII, as the base form for Internet email header fields.
47      This form is permitted in transmission only if authorized by an SMTP
48      extension, as specified in an associated specification.

50   Table of Contents

52      1.  Introduction
53        1.1.  Role of this specification
54      2.  Background and History
55      3.  Terminology
56      4.  Pre-requirement
57      5.  Identification of internationalized email
58      6.  Impact on Message Header Fields
59      7.  Additional issue
60        7.1.  POP3/IMAP
61        7.2.  Mailing list header fields
62        7.3.  URI/IRI
63      8.  Security Considerations
64      9.  IANA considerations
65      10. Acknowledgements
66      11. References
67        11.1.  Normative References
68        11.2.  Informative References
69      Author's Address
70      Intellectual Property and Copyright Statements

72   1.  Introduction

74   1.1.  Role of this specification

76      Full internationalization of electronic mail requires several
77      capabilities:

79      o  The capability to transmit non-ASCII content, provided for as part
80         of the basic MIME specification [RFC2045], [RFC2046].
81      o  The capability to encode selected information in specific header
82         fields, provided for as another part of the MIME specification
83         [RFC2047].
84      o  The capability to use international characters in envelope
85         addresses, discussed in [IMA-overview] and specified in [IMA-SMTP-
86         extension]. And, finally,
87      o  The capability to express those addresses, and information related
88         to and based on them, in mail header fields, defined in this
89         document.
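
The difference between the [RFC2047] capability and the UTF-8 header form
listed above can be sketched roughly as follows. This is an illustration
only, not part of the specification: the display name and address are
hypothetical, and the sketch relies on Python's standard email.header
module to build the encoded-word form.

```python
from email.header import Header, decode_header, make_header

display_name = "Jürgen"          # hypothetical non-ASCII display name
address = "user@example.org"     # hypothetical all-ASCII address

# RFC 2047 form: the display name travels as an ASCII-only encoded-word.
encoded_word = Header(display_name, "utf-8").encode()
mime_field = "From: " + encoded_word + " <" + address + ">"

# UTF-8 header form: the same field carries raw UTF-8 octets instead.
utf8_field = ("From: " + display_name + " <" + address + ">").encode("utf-8")

# A receiver of the RFC 2047 form has to decode the encoded-word again.
decoded = str(make_header(decode_header(encoded_word)))

print(mime_field)
print(utf8_field)
print(decoded)
```

The encoded-word form stays within ASCII but has to be decoded before
display; the UTF-8 form needs no such step but contains octets above 0x7F,
which is why its transmission depends on the SMTP extension.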
91      This document specifies the use of Unicode encoded in UTF-8
92      [RFC3629], rather than ASCII, as the base form for Internet email
93      header fields. This form is permitted in transmission only if
94      authorized by the SMTP extension specified in [IMA-SMTP-extension].

96   2.  Background and History

98      Mailbox names often represent the names of human users. Many of
99      these users throughout the world have names that are not normally
100     represented with just the ASCII repertoire of characters, and would
101     like to use their real names in their mailbox names.
102     These users are also likely to use non-ASCII text in their common
103     names and subjects of email messages, both in what they send and what
104     they receive. This protocol specifies UTF-8 as the encoding to
105     represent email message header fields.

107     The traditional format of email messages [RFC2822] only allows ASCII
108     characters in the header fields of messages. This prevents users
109     from having email addresses that contain non-ASCII characters. It
110     further forces non-ASCII text in common names, comments, and in free
111     text (such as in the Subject: field) to be in MIME format [RFC2047].
112     This specification describes a change to the email message format
113     that is connected to the SMTP message transport change described in
114     the associated specifications [IMA-overview] and [IMA-SMTP-
115     extension], and that allows non-ASCII characters throughout email
116     header fields. These changes affect SMTP clients, SMTP servers, and
117     mail user agents (MUAs).

119     As specified in [IMA-SMTP-extension], an SMTP protocol extension
120     [RFC2821] is used to prevent the transmission of messages with UTF-8
121     header fields to systems that cannot handle such messages.

123     Use of this SMTP extension helps prevent the introduction of
124     such messages into message stores that might misrepresent or mangle
125     them. It should be noted that using an ESMTP extension does
126     not prevent the transfer of email messages with UTF-8 header
127     fields to other systems that use the email format for messages and
128     that may not be upgraded, such as the POP and IMAP protocols. Those
129     protocols will need to be changed in order to handle stored messages
130     that have UTF-8 header fields.

132     The objective for this protocol is to allow UTF-8 in email header
133     fields. Issues about how to handle messages that contain UTF-8
134     header fields but are proposed to be delivered to systems that have
135     not been upgraded to support this capability are discussed elsewhere,
136     particularly in [IMA-downgrading].

138     This protocol is workable even if IMA mailbox names are not
139     present. For example, the protocol might still be used if just the
140     subject header has non-ASCII characters, but the protocol MUST be
141     used if other header fields (particularly trace header fields such as
142     "Received:") contain non-ASCII characters.

144  3.  Terminology

146     In this document, header fields are "UTF-8 headers" if the bodies of
147     the header fields contain UTF-8 characters.

149     Unless otherwise noted, all terms used here are defined in [RFC2821]
150     or [RFC2822] or in [IMA-overview].

152     The key words "MUST", "SHALL", "REQUIRED", "SHOULD", "RECOMMENDED",
153     and "MAY" in this document are to be interpreted as described in RFC
154     2119 [RFC2119].

156     This document is being discussed on the ima mailing list. See
157     https://www1.ietf.org/mailman/listinfo/ima for information about
158     subscribing. The list's archive is at
159     http://www1.ietf.org/mail-archive/web/ima/index.html.

161  4.  Pre-requirement

163     The use of UTF-8 header fields is dependent on the use of an SMTP
164     extension named "IMA".

166     That protocol is defined in [IMA-SMTP-extension]. If that extension
167     is not supported, UTF-8 header fields MUST NOT be transmitted.
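
The rule above can be sketched as a client-side check. This is a minimal
sketch under stated assumptions, not the behaviour defined in
[IMA-SMTP-extension]: the function names are hypothetical, the EHLO keyword
is assumed to be "IMA" as named in this section, and "UTF-8 header" is
approximated as "any octet above 0x7F in the header block".

```python
def advertised_extensions(ehlo_reply):
    """Collect the extension keywords from a multi-line EHLO reply."""
    keywords = set()
    # The first reply line carries the server greeting, not a keyword.
    for line in ehlo_reply.splitlines()[1:]:
        if line.startswith("250"):
            parts = line[4:].split()
            if parts:
                keywords.add(parts[0].upper())
    return keywords


def has_utf8_headers(raw_headers):
    """Approximation: any octet above 0x7F marks a UTF-8 header block."""
    return any(octet > 0x7F for octet in raw_headers)


def may_transmit(raw_headers, ehlo_reply):
    """Return True if the header block may be sent to this server."""
    if not has_utf8_headers(raw_headers):
        return True  # all-ASCII headers need no extension
    # Assumed keyword "IMA"; the real name is set by [IMA-SMTP-extension].
    return "IMA" in advertised_extensions(ehlo_reply)


reply_with_ima = "250-mail.example.org\r\n250-PIPELINING\r\n250 IMA"
reply_without = "250-mail.example.org\r\n250 PIPELINING"
utf8_headers = "Subject: 你好\r\n".encode("utf-8")

print(may_transmit(utf8_headers, reply_with_ima))
print(may_transmit(utf8_headers, reply_without))
```

When the check fails, the client would have to bounce or downgrade the
message rather than transmit it; those paths are outside this sketch.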
169     Sending MUAs that follow this protocol MUST create all header fields
170     encoded in UTF-8. No other direct encodings are allowed. MUAs MAY
171     continue to use MIME to specify some text in other encodings; however
172     this is not recommended because it is likely that this will not
173     interoperate well with MUAs that follow this specification.

175  5.  Identification of internationalized email

177     When an SMTP client tries to send a message to an SMTP server that
178     does not support IMA, the client should know whether the message
179     requires IMA support or not. In addition, identification of
180     internationalized email is also required when a message is stored and
181     presented. Checking for the presence of UTF-8 characters in the header
182     whenever such an identification is required would also achieve this
183     goal. However, this type of repeated processing wastes the time and
184     processing power of the systems involved. It is preferable to have a
185     mechanism (such as a self-label) or some indicator that identifies
186     whether the message is in the new format (i.e. IMA compliant) or the
187     old one (i.e. RFC 2822 compliant).

189     To make this possible, the sending MUA should insert a new header
190     field to identify the presence of i18n information (particularly UTF-8
191     headers) in the message. The new header field is named "i18n-email",
192     and its body is the version number of i18n email. The syntax of the
193     i18n header field is:

195        i18n-email: 1.0

197     [Note in draft: More useful information could be placed in the new
198     header field. ]

200     While we can't require ordering of headers, it would be good to have
201     it appear as near the top of the headers as possible. It would also
202     be good to be able to guarantee that it will be there when the
203     message is dropped into a mail store. Thus, when an i18n email is
204     delivered:

206     o  The "i18n-email" header field MUST be inserted by the originating
207        MUA.
208     o  The "i18n-email" header field MUST be inserted, along with Return-
209        path, by the final delivery MTA if not present.
210     o  The "i18n-email" header field, if present, MUST be removed as part
211        of any downgrading process that eliminates the UTF-8 header
212        information.

214     o  MTAs MAY check for duplicates of the "i18n-email" header field and
215        eliminate all but one of them. However, if a receiving MUA
216        encounters more than one of these headers, it SHOULD simply ignore
217        any excess ones.

219     This combination guarantees that the header will be present on
220     delivery even if it is deleted in transit.

222  6.  Impact on Message Header Fields

224     This protocol does NOT change the definition of header field names.
225     That is, only the bodies of header fields are allowed to have UTF-8
226     characters; the rules in RFC 2822 for header names are not changed.

228     An SMTP client can send header fields in UTF-8 format if the IEmail
229     extension is advertised by the SMTP server. However, the
230     Message-ID is the unique identifier of a single email. [Note in
231     draft: Extension name depends on the SMTP extension defined in [IMA-
232     SMTP-extension]] In order to maintain the identity, message
233     identifiers of the Message-ID fields MUST be created in all ASCII.

235     To be specific, when the IEE SMTP extension is advertised:
236     o  , and are allowed to use UTF-8.
237     o  , remains the same definition as in RFC2822.

239     In this specification, internationalized email addresses will be
240     presented in UTF-8. Thus, all header fields involving es
241     may be different from traditional ones. There might be IMA-unaware
242     MTAs in the mail routing path. In that case, an MTA may bounce
243     the message with reply code 558, or downgrade the non-ASCII contents
244     of all header bodies before continuing to send the message, as
245     described in [IMA-downgrading]. However, an MTA cannot know whether
246     data or instructions are embedded in an email address; some
247     addresses contain embedded operations and others do not. The only
248     way to decide is to let the owner of the mail address state whether
249     the address can safely be downgraded. Hence, the ATOMIC and
250     ALT-ADDRESS options are introduced. Details of the ATOMIC and
251     ALT-ADDRESS options can be found in [IMA-SMTP-extension]. With these
252     two different cases, there are two possible representations of .
253     o  ATOMIC:
254        ATOMIC means that the email address can be downgraded safely
255        without damage to the mail delivery. In this case, the
256        syntax remains the same as in RFC2822. The only difference is that
257        the and of allows UTF-8
258        characters.

260     o  ALT-ADDRESS:
261        The user provides an alternative address for the internationalized
262        email address for mail delivery. The syntax will be:

264        mailbox = new-name-addr / new-addr-spec
265        new-name-addr = [display-name] new-angle-addr
266        new-angle-addr = [CFWS] "<" new-addr-spec ">" [CFWS]
267        new-angle-addr =/ obs-angle-addr
268        new-addr-spec = [addr-spec] non-ASCII-addr-spec
269        new-addr-spec =/ addr-spec

271     At any time, an SMTP server can reject with a reply code of 558
272     whenever ALT-ADDRESS is not provided and downgrade is not feasible.

274     [Note in draft: The detailed ABNF will need to be prepared in this
275     document once a proper WG is established.]

277  7.  Additional issue

279     This section identifies issues that are not covered as part of this
280     set of specifications, but that will need to be considered as part of
281     IEE deployment.

283  7.1.  POP3/IMAP

285     Receiving MUAs that follow this protocol MUST be able to handle email
286     header fields encoded in UTF-8. This means that email fetching
287     protocols such as POP3 or IMAP MAY need to be updated.

289  7.2.  Mailing list header fields

291     All mailing list and mail redistribution related header fields may
292     need further investigation.

294  7.3.  URI/IRI

296     The mailto scheme in URI/IRI may need further investigation.

298  8.  Security Considerations

300     If a user has a non-ASCII mailbox address and an all-ASCII mailbox
301     address, a digital certificate that identifies that user SHOULD have
302     both addresses in the identity. Having multiple email addresses as
303     identities in a single certificate is already supported in PKIX and
304     OpenPGP.

306     Because UTF-8 often requires several octets to encode a single
307     character, internationalized local parts may cause mail addresses to
308     become longer. This may make it harder to keep lines in a
309     header under 78 octets. Lines that are longer than 78 octets (which
310     is a SHOULD specification, not a MUST specification, in RFC 2822)
311     could possibly cause mail user agents to fail in ways that affect
312     security.

314  9.  IANA considerations

316     The ESMTP extension needed to support this specification is specified
317     in [IMA-SMTP-extension]. This specification does not require any
318     additional IANA actions in that regard.

320  10.  Acknowledgements

322     This document was created by incorporating a good deal of material
323     from an old Internet Draft by Paul Hoffman [Hoffman-utf8-headers].
324     While many of the concepts and details have changed, the
325     contributions from that draft are greatly appreciated.

327     Most of the content of this document was provided by John C Klensin.
328     Significant comments and suggestions were also received from
329     Charles H. Lindsey, Yangwoo KO, Yoshiro YONEYA, and other members of
330     the JET team and were incorporated into the document. The editor
331     sincerely thanks them for their contributions.

333  11.  References

335  11.1.  Normative References

337     [ASCII]    American National Standards Institute (formerly United
338                States of America Standards Institute), "USA Code for
339                Information Interchange", ANSI X3.4-1968, 1968.

341                ANSI X3.4-1968 has been replaced by newer versions with
342                slight modifications, but the 1968 version remains
343                definitive for the Internet.
345     [IMA-SMTP-extension]
346                Yao, J., Ed. and X. LEE, "SMTP extension for
347                internationalized email address",
348                draft-yao-ima-smtpext-00.txt (work in progress),
349                January 2006.

351     [IMA-overview]
352                Klensin, J. and Y. Ko, "Overview and Framework of
353                Internationalized Email Address Delivery",
354                draft-klensin-ima-framework-00.txt (work in progress),
355                September 2005.

357     [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
358                Requirement Levels", BCP 14, RFC 2119, March 1997.

360     [RFC2821]  Klensin, J., "Simple Mail Transfer Protocol", RFC 2821,
361                April 2001.

363     [RFC2822]  Resnick, P., "Internet Message Format", RFC 2822,
364                April 2001.

366     [RFC3066]  Alvestrand, H., "Tags for the Identification of
367                Languages", BCP 47, RFC 3066, January 2001.

369     [RFC3629]  Yergeau, F., "UTF-8, a transformation format of ISO
370                10646", STD 63, RFC 3629, November 2003.

372  11.2.  Informative References

374     [Hoffman-utf8-headers]
375                Hoffman, P., "SMTP Service Extensions for Transmission of
376                Headers in UTF-8 Encoding",
377                draft-hoffman-utf8headers-00.txt (work in progress),
378                December 2003.

380     [IMA-downgrading]
381                "whatever we call the downgrading document", 2005.

383     [RFC2045]  Freed, N. and N. Borenstein, "Multipurpose Internet Mail
384                Extensions (MIME) Part One: Format of Internet Message
385                Bodies", RFC 2045, November 1996.

387     [RFC2046]  Freed, N. and N. Borenstein, "Multipurpose Internet Mail
388                Extensions (MIME) Part Two: Media Types", RFC 2046,
389                November 1996.

391     [RFC2047]  Moore, K., "MIME (Multipurpose Internet Mail Extensions)
392                Part Three: Message Header Extensions for Non-ASCII Text",
393                RFC 2047, November 1996.

395  Author's Address

397     Jeff Yeh (editor)
398     TWNIC
399     4F-2, No. 9, Sec 2, Roosvelt Rd.
400     Taipei, 100
401     Taiwan

403     Phone: +886 2 23411313 ext 506
404     Email: jeff@twnic.net.tw

406  Intellectual Property Statement

408     The IETF takes no position regarding the validity or scope of any
409     Intellectual Property Rights or other rights that might be claimed to
410     pertain to the implementation or use of the technology described in
411     this document or the extent to which any license under such rights
412     might or might not be available; nor does it represent that it has
413     made any independent effort to identify any such rights. Information
414     on the procedures with respect to rights in RFC documents can be
415     found in BCP 78 and BCP 79.

417     Copies of IPR disclosures made to the IETF Secretariat and any
418     assurances of licenses to be made available, or the result of an
419     attempt made to obtain a general license or permission for the use of
420     such proprietary rights by implementers or users of this
421     specification can be obtained from the IETF on-line IPR repository at
422     http://www.ietf.org/ipr.

424     The IETF invites any interested party to bring to its attention any
425     copyrights, patents or patent applications, or other proprietary
426     rights that may cover technology that may be required to implement
427     this standard. Please address the information to the IETF at
428     ietf-ipr@ietf.org.

430  Disclaimer of Validity

432     This document and the information contained herein are provided on an
433     "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
434     OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
435     ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
436     INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
437     INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
438     WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

440  Copyright Statement

442     Copyright (C) The Internet Society (2006). This document is subject
443     to the rights, licenses and restrictions contained in BCP 78, and
444     except as set forth therein, the authors retain all their rights.

446  Acknowledgment

448     Funding for the RFC Editor function is currently provided by the
449     Internet Society.