idnits 2.17.1

draft-duerst-eai-mailto-04.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  == There are 2 instances of lines with non-RFC2606-compliant FQDNs in the
     document.

  -- The draft header indicates that this document obsoletes RFC6068, but the
     abstract doesn't seem to directly say this. It does mention RFC6068
     though, so this could be OK.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == The document seems to use 'NOT RECOMMENDED' as an RFC 2119 keyword, but
     does not include the phrase in its RFC 2119 key words list.

  == The document seems to contain a disclaimer for pre-RFC5378 work, but was
     first submitted on or after 10 November 2008. The disclaimer is usually
     necessary only for documents that revise or obsolete older RFCs, and that
     take significant amounts of text from those RFCs. If you can contact all
     authors of the source material and they are willing to grant the BCP78
     rights to the IETF Trust, you can and should remove the disclaimer.
     Otherwise, the disclaimer is needed and you can ignore this comment.
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (September 25, 2012) is 4225 days in the past. Is this
     intentional?
  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

     No issues found here.

     Summary: 0 errors (**), 0 flaws (~~), 4 warnings (==), 2 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

2    Network Working Group                                          M. Duerst
3    Internet-Draft                                  Aoyama Gakuin University
4    Obsoletes: 6068 (if approved)                                L. Masinter
5    Intended status: Standards Track              Adobe Systems Incorporated
6    Expires: March 29, 2013                                      J. Zawinski
7                                                                  DNA Lounge
8                                                          September 25, 2012

10                         The 'mailto' URI/IRI Scheme
11                          draft-duerst-eai-mailto-04

13   Abstract

15      This document defines the format of Uniform Resource Identifiers
16      (URIs) and Internationalized Resource Identifiers (IRIs) to identify
17      resources that are reached using Internet mail. It adds the
18      possibility to use Email Address Internationalization (EAI) email
19      addresses (RFC6530) to the previous syntax of 'mailto' URIs (RFC
20      6068).

22   Status of this Memo

24      This Internet-Draft is submitted in full conformance with the
25      provisions of BCP 78 and BCP 79.

27      Internet-Drafts are working documents of the Internet Engineering
28      Task Force (IETF). Note that other groups may also distribute
29      working documents as Internet-Drafts. The list of current Internet-
30      Drafts is at http://datatracker.ietf.org/drafts/current/.

32      Internet-Drafts are draft documents valid for a maximum of six months
33      and may be updated, replaced, or obsoleted by other documents at any
34      time. It is inappropriate to use Internet-Drafts as reference
35      material or to cite them other than as "work in progress."

37      This Internet-Draft will expire on March 29, 2013.
39 Copyright Notice 41 Copyright (c) 2012 IETF Trust and the persons identified as the 42 document authors. All rights reserved. 44 This document is subject to BCP 78 and the IETF Trust's Legal 45 Provisions Relating to IETF Documents 46 (http://trustee.ietf.org/license-info) in effect on the date of 47 publication of this document. Please review these documents 48 carefully, as they describe your rights and restrictions with respect 49 to this document. Code Components extracted from this document must 50 include Simplified BSD License text as described in Section 4.e of 51 the Trust Legal Provisions and are provided without warranty as 52 described in the Simplified BSD License. 54 This document may contain material from IETF Documents or IETF 55 Contributions published or made publicly available before November 56 10, 2008. The person(s) controlling the copyright in some of this 57 material may not have granted the IETF Trust the right to allow 58 modifications of such material outside the IETF Standards Process. 59 Without obtaining an adequate license from the person(s) controlling 60 the copyright in such materials, this document may not be modified 61 outside the IETF Standards Process, and derivative works of it may 62 not be created outside the IETF Standards Process, except to format 63 it for publication as an RFC or to translate it into languages other 64 than English. 66 Table of Contents 68 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 69 2. Syntax of a 'mailto' URI . . . . . . . . . . . . . . . . . . . 4 70 2.1. Syntax Rules . . . . . . . . . . . . . . . . . . . . . . . 4 71 2.2. Additional Details about . . . . . . . . . 5 72 2.3. Additional Details about and . . . . . 6 73 3. Semantics and Operations . . . . . . . . . . . . . . . . . . . 8 74 4. Unsafe Header Fields . . . . . . . . . . . . . . . . . . . . . 9 75 5. Encoding . . . . . . . . . . . . . . . . . . . . . . . . . . . 10 76 6. Examples . . . . . . . . . . . . . . . . 
. . . . . . . . . . . 10 77 6.1. Conventions Used . . . . . . . . . . . . . . . . . . . . . 11 78 6.2. Basic Examples . . . . . . . . . . . . . . . . . . . . . . 11 79 6.3. Examples of Complicated Email Addresses . . . . . . . . . 12 80 6.4. Examples Using UTF-8-Based Percent-Encoding usable 81 with RFC 5322 . . . . . . . . . . . . . . . . . . . . . . 13 82 6.5. Examples Using UTF-8-Based Percent-Encoding usable 83 only with EAI . . . . . . . . . . . . . . . . . . . . . . 15 84 7. Security Considerations . . . . . . . . . . . . . . . . . . . 16 85 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 17 86 8.1. Update of the Registration of the 'mailto' URI/IRI 87 Scheme . . . . . . . . . . . . . . . . . . . . . . . . . . 17 88 8.2. Registration of the Body Header Field . . . . . . . . . . 19 89 9. Main Changes from RFC 6068 . . . . . . . . . . . . . . . . . . 19 90 10. Change Log . . . . . . . . . . . . . . . . . . . . . . . . . . 19 91 10.1. Changes from -03 to -04 . . . . . . . . . . . . . . . . . 19 92 10.2. Changes from -02 to -03 . . . . . . . . . . . . . . . . . 20 93 10.3. Changes from -01 to -02 . . . . . . . . . . . . . . . . . 20 94 10.4. Changes from -00 to -01 . . . . . . . . . . . . . . . . . 20 95 10.5. Changes from RFC 6068 to -00 . . . . . . . . . . . . . . . 20 96 11. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 21 97 12. References . . . . . . . . . . . . . . . . . . . . . . . . . . 21 98 12.1. Normative References . . . . . . . . . . . . . . . . . . . 21 99 12.2. Informative References . . . . . . . . . . . . . . . . . . 22 100 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 22 102 1. Introduction 104 The 'mailto' URI/IRI scheme is a URI/IRI scheme [RFC4395bis] used to 105 identify resources that are reached using Internet mail. In its 106 simplest form, a 'mailto' URI/IRI contains an Internet mail address. 
107     For interactions that require message headers or message bodies to be
108     specified, the 'mailto' URI/IRI scheme also allows providing mail
109     header fields and a message body.

111     This specification extends the previous scheme definition ([RFC6068])
112     to also allow non-ASCII characters in the left-hand sides (LHSs) of
113     email addresses. To work seamlessly with Internationalized Resource
114     Identifiers (IRIs, [RFC3987]) and Email Address Internationalization
115     (EAI, [RFC6530]), these LHSs are percent-encoded based on UTF-8
116     [STD63] when used in URIs.

118     This document is available in (line-printer ready) plaintext ASCII
119     and PDF. It is also available in HTML from http://
120     www.sw.it.aoyama.ac.jp/2012/pub/draft-duerst-eai-mailto-04.html, and
121     in UTF-8 plaintext from http://www.sw.it.aoyama.ac.jp/2012/pub/
122     draft-ietf-duerst-eai-mailto-04.utf8.txt. While all these versions
123     are identical in their technical content, the HTML, PDF, and UTF-8
124     plaintext versions show non-ASCII characters directly. This often
125     makes it easier to understand examples, and readers are therefore
126     advised to consult these versions in preference or as a supplement to
127     the ASCII version.

129     Example URIs and IRIs are enclosed in '<' and '>' as described in
130     Appendix C of [STD66]. Extra whitespace and line breaks are added to
131     present long URIs -- they are not part of the actual URI.

133     The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
134     "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
135     document are to be interpreted as described in [RFC2119].

137  2. Syntax of a 'mailto' URI

139  2.1. Syntax Rules

141     The syntax of a 'mailto' URI is described using the ABNF of [STD68].
142     The syntax of a 'mailto' IRI can be obtained from this definition by
143     allowing <iunreserved> characters wherever <unreserved> characters
144     are allowed.
The syntax below also uses non-terminal definitions 145 from [STD66] (unreserved, pct-encoded): 147 mailtoURI = "mailto:" [ to ] [ hfields ] 148 to = addr-spec-enc *("," addr-spec-enc ) 149 hfields = "?" hfield *( "&" hfield ) 150 hfield = hfname "=" hfvalue 151 hfname = *qchar 152 hfvalue = *qchar 153 addr-spec-enc = local-part-enc "@" domain-enc 154 local-part-enc = dot-atom-text-enc / quoted-string-enc 155 domain-enc = dot-atom-text-enc / "[" *dtext-no-obs "]" 156 dtext-no-obs = %d33-90 ; Printable US-ASCII 157 / %d94-126 ; characters not including 158 ; "[", "]", or "\" 159 dot-atom-text-enc = 161 quoted-string-enc = 163 qchar = unreserved / pct-encoded / some-delims 164 some-delims = "!" / "$" / "'" / "(" / ")" / "*" 165 / "+" / "," / ";" / ":" / "@" / "/" / "?" 167 In addition to the above syntax rules, the details given in the next 168 two subsections are relevant. 170 2.2. Additional Details about 172 is a mail address as specified by in 173 [RFC5322] or in [RFC6532], but excluding , with 174 the following changes: 176 1. A number of characters that can appear in MUST be 177 percent-encoded. These are the characters that cannot appear in 178 a URI according to [STD66] as well as "%" (because it is used for 179 percent-encoding) and all the characters in gen-delims except "@" 180 and ":" (i.e., "/", "?", "#", "[", and "]"). Of the characters 181 in sub-delims, at least the following also have to be percent- 182 encoded: "&", ";", and "=". Care has to be taken both when 183 encoding as well as when decoding to make sure these operations 184 are applied only once. 186 2. and as defined in [RFC5322] MUST NOT 187 be used. 189 3. Whitespace and comments within and 190 MUST NOT be used. They would not have any operational semantics. 192 4. Percent-encoding can be used in the part of an 193 , in order to denote an internationalized domain 194 name. The considerations for in [STD66] apply. 
In 195 particular, non-ASCII characters MUST first be encoded according 196 to UTF-8 [STD63], and then each octet of the corresponding UTF-8 197 sequence MUST be percent-encoded to be represented as URI 198 characters. URI-producing applications MUST NOT use percent- 199 encoding in domain names unless it is used to represent a UTF-8 200 character sequence. When the internationalized domain name is 201 used to compose a message, the name MUST be transformed to the 202 Internationalizing Domain Names in Applications (IDNA) encoding 203 [RFC5891] where appropriate. URI producers SHOULD provide these 204 domain names in the IDNA encoding, rather than percent-encoded, 205 if they wish to maximize interoperability with legacy 'mailto' 206 URI interpreters. 208 5. Percent-encoding of non-ASCII octets in the of 209 an is used for the internationalization of the 210 according to Email Address Internationalization 211 (EAI; [RFC6532]). Non-ASCII characters MUST first be encoded 212 according to UTF-8 [STD63], and then each octet of the 213 corresponding UTF-8 sequence MUST be percent-encoded to be 214 represented as URI characters. Any other percent-encoding of 215 non-ASCII characters is prohibited. When a 216 containing non-ASCII characters will be used to compose a 217 message, the MUST be transformed back to UTF-8 218 in order to conform to EAI. 220 is the percent-encoded version of 221 in [RFC5322] or in [RFC6532]. is 222 the percent-encoded version of in [RFC5322] or 223 in [RFC6532]. 225 2.3. Additional Details about and 227 and are encodings of an [RFC5322] header field 228 name and value, respectively. Percent-encoding is needed for the 229 same characters as listed above for . is 230 case-insensitive, but in general is case-sensitive. Note 231 that [RFC5322] allows all US-ASCII printable characters except ":" in 232 optional header field names (Section 3.6.8), which is the reason why 233 is part of the header field name production. 
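The UTF-8-based percent-encoding that Section 2.2 requires for non-ASCII characters in the domain and local part can be illustrated with a short Python sketch. It is not part of the specification: the function names and the set of characters left unescaped (LOCAL_SAFE) are assumptions made for this illustration, chosen conservatively so that "%", "/", "?", "#", "[", "]", "&", ";", and "=" are always escaped. Conversion of domain names to the IDNA encoding and domain literals in square brackets are not covered.

```python
from urllib.parse import quote, unquote

# Hypothetical choice for this sketch: besides the unreserved characters
# (which quote() never escapes), leave only these sub-delims unescaped.
LOCAL_SAFE = "!$'()*+,"


def encode_local_part(local_part: str) -> str:
    """Encode as UTF-8, then percent-encode every octet that needs it."""
    return quote(local_part, safe=LOCAL_SAFE, encoding="utf-8")


def encode_domain(domain: str) -> str:
    """Percent-encode the UTF-8 octets of non-ASCII domain name labels."""
    return quote(domain, safe="", encoding="utf-8")


def decode_component(component: str) -> str:
    """Undo percent-encoding exactly once; invalid UTF-8 raises an error."""
    return unquote(component, encoding="utf-8", errors="strict")


print(encode_local_part("café"))           # caf%C3%A9
print(encode_local_part("gorby%kremvax"))  # gorby%25kremvax
print(encode_domain("納豆.example.org"))    # %E7%B4%8D%E8%B1%86.example.org
print(decode_component("caf%C3%A9"))       # café
```

Decoding is kept in a single function so that it is applied only once per component; decoding twice would turn "%2525" into "%" instead of "%25".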
235     The special <hfname> "body" indicates that the associated <hfvalue>
236     is the body of the message. The "body" field value is intended to
237     contain the content for the first text/plain body part of the
238     message. The "body" pseudo header field is primarily intended for
239     the generation of short text messages for automatic processing (such
240     as "subscribe" messages for mailing lists), not for general MIME
241     bodies. Except for the encoding of characters based on UTF-8 and
242     percent-encoding, no additional encoding (such as base64 or
243     quoted-printable; see [RFC2045]) is used for the "body" field value.
244     As a consequence, header fields related to message encoding (e.g.,
245     Content-Transfer-Encoding) in a 'mailto' URI are irrelevant and MUST
246     be ignored. The "body" pseudo header field name has been registered
247     with IANA for this special purpose (see Section 8.2).

249     Within 'mailto' URIs, the characters "?", "=", and "&" are reserved,
250     serving as delimiters. They have to be escaped (as "%3F", "%3D", and
251     "%26", respectively) when not serving as delimiters.

253     Additional restrictions on what characters are allowed might apply
254     depending on the context where the URI is used. Such restrictions
255     can be addressed by context-specific escaping mechanisms. For
256     example, because the "&" (ampersand) character is reserved in HTML
257     and XML, any 'mailto' URI that contains an ampersand has to be
258     written with an HTML/XML entity ("&amp;") or numeric character
259     reference ("&#x26;" or "&#38;").

261     Non-ASCII characters can be encoded in <hfvalue> as follows:

263     1. MIME encoded words (as defined in [RFC2047]) are permitted in
264        header field values, but not in an <hfvalue> of a "body"
265        <hfname>. Sequences of characters that look like MIME encoded
266        words can appear in an <hfvalue> of a "body" <hfname>, but in
267        that case have no special meaning. Please note that the '=' and
268        '?' characters used as delimiters in MIME encoded words have to
269        be percent-encoded.
           Also note that the use of MIME encoded words
270        differs slightly for so-called structured and unstructured header
271        fields.

273     2. Non-ASCII characters MUST be encoded according to UTF-8 [STD63],
274        and then each octet of the corresponding UTF-8 sequence is
275        percent-encoded to be represented as URI characters. When header
276        field values encoded in this way are used to compose a message
277        conforming to [RFC5322], the <hfvalue> has to be suitably encoded
278        (transformed into MIME encoded words [RFC2047]), except for an
279        <hfvalue> of a "body" <hfname>, which has to be encoded according
280        to [RFC2045]. Please note that for MIME encoded words and for
281        bodies in composed email messages, encodings other than UTF-8 MAY
282        be used as long as the characters are properly transcoded. When
283        header field values encoded in this way are used to compose a
284        message conforming to [RFC6532], percent-encoding (including
285        reserved characters) has to be decoded. The header field values
286        can then be used directly because EAI allows UTF-8 in header
287        field values.

289     Note that it is syntactically valid to specify both <to> and an
290     <hfname> whose value is "to". That is,

291

293     is equivalent to

295

297     is equivalent to

299

301     However, the latter two forms are NOT RECOMMENDED because different
302     user agents handle this case differently. In particular, some
303     existing clients ignore "to" <hfvalue>s.

305     Implementations MUST NOT produce two "To:" header fields in a
306     message; the "To:" header field may occur at most once in a message
307     ([RFC5322], Section 3.6). Also, creators of 'mailto' URIs MUST NOT
308     include other message header fields multiple times if these header
309     fields can only be used once in a message.

311     To avoid interoperability problems, creators of 'mailto' URIs SHOULD
312     NOT use the same <hfname> multiple times in the same URI. If the
313     same <hfname> appears multiple times in a URI, behavior varies widely
314     for different user agents, and for each <hfname>.
        Examples include
315     using only the first or last <hfname>/<hfvalue> pair, creating
316     multiple header fields, and combining each <hfvalue> by simple
317     concatenation or in a way appropriate for the corresponding header
318     field.

320     Note that this specification, like any URI/IRI scheme specification,
321     does not define syntax or meaning of a fragment identifier (see
322     [STD66]), because these depend on the type of a retrieved
323     representation. In the currently known usage scenarios, a 'mailto'
324     URI cannot be used to retrieve such representations. The character
325     "#" in <hfvalue>s MUST be escaped as %23.

327  3. Semantics and Operations

329     A 'mailto' URI/IRI designates an "Internet resource", which is the
330     mailbox specified in the address. When additional header fields are
331     supplied, the resource designated is the same address but with an
332     additional profile for accessing the resource. While there are
333     Internet resources that can only be accessed via electronic mail, the
334     'mailto' URI is not intended as a way of retrieving such objects
335     automatically.

337     The operation of how any URI/IRI scheme is resolved is not mandated
338     by the URI specifications. In current practice, resolving URIs/IRIs
339     such as those in the 'http' URI/IRI scheme causes an immediate
340     interaction between client software and a host running an interactive
341     server. The 'mailto' URI/IRI has unusual semantics because resolving
342     such a URI/IRI does not necessarily cause an immediate interaction
343     with a server. Instead, the client creates a message to the
344     designated address with the various header fields set as default.
345     The user can edit the message, send the message unedited, or choose
346     not to send the message.

348     Note that with the introduction of the possibility to register
349     handlers of URI/IRI schemes to web applications, there is no longer a
350     guarantee that the resolution of a 'mailto' URI/IRI is purely local.
351     Registering a web mail service as a handler of 'mailto' URIs/IRIs
352     means that the creation of a message to the designated address is
353     done with the help and knowledge of that web mail service.

355     The <hfname>/<hfvalue> pairs in a 'mailto' URI/IRI, although
356     syntactically equivalent to header fields in a mail message, do not
357     directly correspond to the header fields in a mail message. In
358     particular, the To, Cc, and Bcc <hfvalue>s don't necessarily result
359     in a header field containing the specified value. Mail client
360     software MAY eliminate duplicate addresses. Creators of 'mailto'
361     URIs SHOULD avoid using the same address twice in a 'mailto' URI/IRI.

363     Originator fields like From and Date, fields related to routing
364     (Apparently-To, Resent-*, etc.), trace fields, and MIME header fields
365     (MIME-Version, Content-*), when present in the URI/IRI, MUST be
366     ignored. The mail client MUST create new fields when necessary, as
367     it would for any new message. Unrecognized header fields and header
368     fields with values inconsistent with those the mail client would
369     normally send SHOULD be treated as especially suspect. For example,
370     there may be header fields that are totally safe but not known to the
371     MUA, so the MUA MAY choose to show them to the user.

373  4. Unsafe Header Fields

375     The user agent interpreting a 'mailto' URI/IRI SHOULD NOT create a
376     message if any of the header fields are considered dangerous; it MAY
377     also choose to create a message with only a subset of the header
378     fields given in the URI/IRI. Only a limited set of header fields
379     such as Subject and Keywords, as well as Body, are believed to be
380     both safe and useful in the general case. In cases where the source
381     of a URI/IRI is well known, and/or specific header fields are limited
382     to specific well-known values, other header fields MAY be considered
383     safe, too.
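One possible way to apply the policy above is an allow-list that is consulted while the header fields of a 'mailto' URI are parsed. The following Python sketch is an illustration only: the names SAFE_FIELDS and split_hfields are hypothetical, and the allow-list contains just the three fields the text names as generally safe. A real user agent would handle address fields such as "to" and "cc" separately and would present the rejected names to the user rather than drop them silently.

```python
from urllib.parse import unquote

# Assumption for this sketch: accept only the fields named in the text.
SAFE_FIELDS = frozenset({"subject", "keywords", "body"})


def split_hfields(uri: str, safe_fields=SAFE_FIELDS):
    """Return (addresses, accepted, rejected) for a 'mailto' URI.

    accepted holds (name, value) pairs, percent-decoded exactly once;
    rejected holds the names of header fields not on the allow-list.
    """
    scheme, sep, rest = uri.partition(":")
    if not sep or scheme.lower() != "mailto":
        raise ValueError("not a 'mailto' URI")
    to_part, _, query = rest.partition("?")
    # Split on the delimiters first and decode afterwards, so that an
    # escaped "," or "&" is never mistaken for a delimiter.
    addresses = [unquote(addr) for addr in to_part.split(",") if addr]
    accepted, rejected = [], []
    for hfield in filter(None, query.split("&")):
        name, _, value = hfield.partition("=")
        name = unquote(name).lower()  # header field names are case-insensitive
        if name in safe_fields:
            accepted.append((name, unquote(value)))  # "+" stays a literal "+"
        else:
            rejected.append(name)
    return addresses, accepted, rejected


uri = ("mailto:joe@example.com?Subject=Hello%20there"
       "&From=mallory@example.net&body=1%2B1")
print(split_hfields(uri))
```

The sketch deliberately avoids form-style query parsers, because those treat "+" as a space, which is the ambiguity Section 5 warns about.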
385     The creator of a 'mailto' URI/IRI cannot expect the resolver of a
386     URI/IRI to understand more than the "subject" header field and
387     "body". Clients that resolve 'mailto' URIs/IRIs into mail messages
388     MUST be able to correctly create [RFC5322]-compliant mail messages
389     using the "subject" header field and "body".

391  5. Encoding

393     [STD66] requires that many characters in URIs/IRIs be encoded. This
394     affects the 'mailto' URI/IRI scheme for some common characters that
395     might appear in addresses, header fields, or message contents. One
396     such character is space (" ", ASCII hex 20). Note the examples below
397     that use "%20" for space in the message body. Also note that line
398     breaks in the body of a message MUST be encoded with "%0D%0A".
399     Implementations MAY add a final line break to the body of a message
400     even if there is no trailing "%0D%0A" in the body <hfvalue> of the
401     'mailto' URI/IRI. Line breaks in other <hfvalue>s SHOULD NOT be used.

403     When creating 'mailto' URIs/IRIs, any reserved characters that are
404     used in the URIs/IRIs MUST be encoded so that properly written URI/
405     IRI interpreters can read them. Also, client software that reads
406     URIs/IRIs MUST decode strings before creating the mail message so
407     that the mail message appears in a form that the recipient software
408     will understand. These strings SHOULD be decoded before showing the
409     message to the sending user.

411     Software creating 'mailto' URIs/IRIs likewise has to be careful to
412     encode any reserved characters that are used. HTML forms are one
413     kind of software that creates 'mailto' URIs/IRIs. Current
414     implementations encode a space as '+', but this creates problems
415     because such a '+' standing for a space cannot be distinguished from
416     a real '+' in a 'mailto' URI/IRI. When producing 'mailto' URIs/IRIs,
417     all spaces SHOULD be encoded as %20, and '+' characters MAY be
418     encoded as %2B.
        Please note that '+' characters are frequently used
419     as part of an email address to indicate a subaddress, as for example
420     in .

422     The 'mailto' URI/IRI scheme is limited in that it does not provide
423     for substitution of variables. Thus, it is impossible to create a
424     'mailto' URI/IRI that includes a user's email address in the message
425     body. This limitation also prevents 'mailto' URIs/IRIs that are
426     signed with public keys and other such variable information.

428  6. Examples
429  6.1. Conventions Used

431     To represent characters outside US-ASCII in a document format that is
432     limited to US-ASCII, this document uses 'XML Notation'. A non-ASCII
433     character is denoted by a leading '&#x', a trailing ';', and the
434     hexadecimal number of the character in the UCS in between. For
435     example, &#x42F; stands for CYRILLIC CAPITAL LETTER YA. An actual
436     '&' is denoted by '&amp;'. This notation is only used in the ASCII
437     version(s) of this document, because in the other versions, non-ASCII
438     characters are used directly.

440     Where the IRI form of an example is identical to the URI form, only
441     one form is given. If the IRI form is different, then both forms are
442     given.

444  6.2. Basic Examples

446     A URI for an ordinary individual mailing address:

448

450     A URI for a mail response system that requires the name of the file
451     to be sent back in the subject:

453

455     A mail response system that requires a "send" request in the body:

457

459     A similar URI, with two lines with different "send" requests (in this
460     case, "send current-issue" and, on the next line, "send index"):

462

465     An interesting use of 'mailto' URIs occurs when browsing archives of
466     messages. A link can be provided that allows replying to a message
467     and conserving threading information.
        This is done by adding an In-
468     Reply-To header field containing the Message-ID of the message where
469     the link is added, for example:

471

474     A request to subscribe to a mailing list:

476
477     A URI that is for a single user and that includes a CC of another
478     user:

480

482     Note the use of the "&" reserved character above. The following
483     example, using "?" twice, is incorrect:

485     ; WRONG!

487     According to [RFC5322], the characters "?", "&", and even "%" may
488     occur in <addr-spec>s. The fact that they are reserved characters is
489     not a problem: those characters may appear in 'mailto' URIs -- they
490     just may not appear in unencoded form. The standard URI encoding
491     mechanisms ("%" followed by a two-digit hex number) MUST be used in
492     these cases.

494     To indicate the address "gorby%kremvax@example.com" one would use:

496       <mailto:gorby%25kremvax@example.com>

498     To indicate the address "unlikely?address@example.com", and include
499     another header field, one would use:

501

503     As described above, the "&" (ampersand) character is reserved in HTML
504     and has to be replaced, e.g., with "&amp;". Thus, in an HTML context
505     a URI with an internal ampersand might look like:
506       Click
507       <a href="mailto:joe@an.example?cc=bob@an.example&amp;body=hello"
508       >mailto:joe@an.example?cc=bob@an.example&amp;body=hello</a>
509       to send a greeting message to Joe and Bob.

511     When an email address itself includes an "&" (ampersand) character,
512     that character has to be percent-encoded. For example, the 'mailto'
513     URI to send mail to "Mike&family@example.org" is
514     <mailto:Mike%26family@example.org>.

516  6.3. Examples of Complicated Email Addresses

518     Following are a few examples of how to treat email addresses that
519     contain complicated escaping syntax.

521     Email address: "not@me"@example.org; corresponding 'mailto' URI:

523     .

525     Email address: "oh\\no"@example.org; corresponding 'mailto' URI:

527     .

529     Email address: "\\\"it's\ ugly\\\""@example.org; corresponding
530     'mailto' URI:

532     .

534  6.4.
Examples Using UTF-8-Based Percent-Encoding usable with RFC 5322 536 Sending a mail with the subject "coffee" in French, i.e., "cafe" 537 where the final e is an e-acute, using UTF-8 and percent-encoding, as 538 an URI: 540 542 The same as an IRI: 544 546 The same subject, this time using an encoded-word (escaping the "=" 547 and "?" characters used in the encoded-word syntax, because they are 548 reserved): 550 553 The same subject, this time encoded as iso-8859-1: 555 558 Going back to straight UTF-8 and adding a body with the same value, 559 as an URI: 561 563 The same as an IRI: 565 566 This 'mailto' URI may result in an [RFC5322] message looking like 567 this: 568 From: sender@example.net 569 To: user@example.org 570 Subject: =?utf-8?Q?caf=C3=A9?= 571 Content-Type: text/plain;charset=utf-8 572 Content-Transfer-Encoding: quoted-printable 574 caf=C3=A9 576 The software sending the email is not restricted to UTF-8, but can 577 use other encodings. The following shows the same email using iso- 578 8859-1 two times: 579 From: sender@example.net 580 To: user@example.org 581 Subject: =?iso-8859-1?Q?caf=E9?= 582 Content-Type: text/plain;charset=iso-8859-1 583 Content-Transfer-Encoding: quoted-printable 585 caf=E9 587 Different content transfer encodings (i.e., "8bit" or "base64" 588 instead of "quoted-printable") and different encodings in encoded 589 words (i.e., "B" instead of "Q") can also be used. 591 In a context where EAI is supported, this 'mailto' URI can result in 592 an [RFC6532] message looking like this (encoded as UTF-8 on the 593 wire): 594 From: sender@example.net 595 To: user@example.org 596 Subject: café 597 Content-Type: text/plain;charset=utf-8 598 Content-Transfer-Encoding: 8bit 600 café 602 For more examples of encoding the word coffee in different languages, 603 see [RFC2324]. 
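The two ways of carrying the subject shown above (UTF-8 with percent-encoding, and a percent-encoded MIME encoded word) can be decoded with the same few steps. The Python sketch below is an illustration under stated assumptions: the function name subject_from_hfvalue is hypothetical, and the percent-encoded input strings were derived for this example from the "Subject:" lines of the sample messages by escaping "=" and "?". Header values that mix raw non-ASCII text with encoded words are not handled.

```python
from email.header import decode_header
from urllib.parse import unquote


def subject_from_hfvalue(hfvalue: str) -> str:
    """Turn a "subject" <hfvalue> from a 'mailto' URI into Unicode text."""
    # Step 1: undo percent-encoding once, interpreting the octets as UTF-8.
    raw = unquote(hfvalue, encoding="utf-8", errors="strict")
    # Step 2: resolve MIME encoded words [RFC2047], if there are any.
    parts = []
    for data, charset in decode_header(raw):
        if isinstance(data, bytes):
            data = data.decode(charset or "ascii")
        parts.append(data)
    return "".join(parts)


# Plain UTF-8-based percent-encoding.
print(subject_from_hfvalue("caf%C3%A9"))
# Encoded word in UTF-8, with "=" and "?" percent-encoded.
print(subject_from_hfvalue("%3D%3Futf-8%3FQ%3Fcaf%3DC3%3DA9%3F%3D"))
# Encoded word in iso-8859-1.
print(subject_from_hfvalue("%3D%3Fiso-8859-1%3FQ%3Fcaf%3DE9%3F%3D"))
```

All three calls yield the same four-character string, which a mail client can then re-encode as needed for an [RFC5322] message or use directly for an [RFC6532] message.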
605 The following example uses the Japanese word "natto" (Unicode 606 characters U+7D0D U+8C46) as a domain name label, sending a mail to a 607 user at 納豆.example.org, as an URI: 609 612 The same as an IRI: 614 617 When constructing the email for use with [RFC5322], the domain name 618 label is converted to punycode. The resulting message might look as 619 follows: 620 From: sender@example.net 621 To: user@xn--99zt52a.example.org 622 Subject: Test 623 Content-Type: text/plain;charset=utf-8 624 Content-Transfer-Encoding: base64 626 57SN6LGG 628 The same message using EAI ([RFC6532]) can look as follows (encoded 629 as UTF-8 on the wire): 630 From: sender@example.net 631 To: user@納豆.example.org 632 Subject: Test 633 Content-Type: text/plain;charset=utf-8 634 Content-Transfer-Encoding: 8bit 636 納豆 638 6.5. Examples Using UTF-8-Based Percent-Encoding usable only with EAI 640 All the previous 'mailto' URIs can be used with EAI. When used with 641 EAI, there is no need to use punycode in domain names, and no need to 642 use MIME encoding in headers and bodies. After decoding percent- 643 encoding, UTF-8 can be used directly. This subsection gives a few 644 additional examples of 'mailto' URI and IRIs which can only be used 645 with EAI. 647 Please note that the choice of URI vs. IRI is independent of whether 648 EAI can be used or not. 
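Because the URI and IRI forms differ only in how non-ASCII characters are written, the mapping from an IRI to the corresponding URI can be sketched in a few lines of Python. The function name iri_to_uri is hypothetical, and the sketch assumes its input is already a syntactically valid 'mailto' IRI: it leaves every ASCII character (including existing percent-encoding) untouched and does not convert domain names to the IDNA encoding.

```python
from urllib.parse import quote


def iri_to_uri(iri: str) -> str:
    """Percent-encode the UTF-8 octets of each non-ASCII character."""
    return "".join(ch if ord(ch) < 128 else quote(ch) for ch in iri)


print(iri_to_uri("mailto:user@納豆.example.org?subject=Test"))
# mailto:user@%E7%B4%8D%E8%B1%86.example.org?subject=Test
```

The same function applies unchanged to IRIs whose local part is non-ASCII; whether the resulting address can be used then depends on EAI support, not on the choice between URI and IRI.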
650     A hypothetical 'mailto' URI for ordering coffee from a French coffee
651     pot:

653        mailto:caf%C3%A9@pot.example?Subject=Espresso,%20please

655     The same as an IRI:

657        mailto:café@pot.example?Subject=Espresso,%20please

659     A hypothetical 'mailto' URI for sending a potential erratum to the
660     first author of this memo ("%C3%BC" represents a u-umlaut, "%E9%9D%
661     92%E5%B1%B1" represents the Unicode characters U+9752 (blue) and
662     U+5C71 (mountain)):

664        mailto:Martin.D%C3%BCrst@%E9%9D%92%E5%B1%
665        B1.example.net?Subject=Error%20in%20RFC6068bis

667     The same as an IRI:

669        mailto:Martin.Dürst@青山.example.net?Subject=Error%20in%20RFC6068bis

672  7. Security Considerations

674     The 'mailto' URI/IRI scheme can be used to send a message from one
675     user to another, and thus can introduce many security concerns. Mail
676     messages can be logged at the originating site, the recipient site,
677     and intermediary sites along the delivery path. If the messages are
678     not encrypted, they can also be read at any of those sites.

680     Also, if a web mail service is registered as a handler of 'mailto'
681     URIs/IRIs, this means that the creation of a message to the
682     designated address is done with the knowledge of that web mail
683     service, even if the message is actually never sent.

685     A 'mailto' URI/IRI gives a template for a message that can be sent by
686     mail client software. The contents of that template may be opaque or
687     difficult to read by the user at the time of specifying the URI/IRI,
688     as well as being hidden in the user interface (for example, a link on
689     an HTML Web page might display something other than the content of
690     the corresponding 'mailto' URI/IRI that would be used when clicked).
691 Thus, a mail client SHOULD NOT send a message based on a 'mailto' 692 URI/IRI without first disclosing and showing to the user the full 693 message that will be sent (including all header fields that were 694 specified by the 'mailto' URI/IRI), fully decoded, and asking the 695 user for approval to send the message as electronic mail. The mail 696 client SHOULD also make it clear that the user is about to send an 697 electronic mail message, since the user may not be aware that this is 698 the result of a 'mailto' URI/IRI. Users are strongly encouraged to 699 ensure that the 'mailto' URI/IRI presented to them matches the 700 address included in the "To:" line of the email message. 702 Some header fields are inherently unsafe to include in a message 703 generated from a URI/IRI. For details, please see Section 3. In 704 general, the fewer header fields interpreted from the URI/IRI, the 705 less likely it is that a sending agent will create an unsafe message. 707 Examples of problems with sending unapproved mail include: 709 mail that breaks laws upon delivery, such as making illegal 710 threats; 712 mail that identifies the sender as someone interested in breaking 713 laws; 715 mail that identifies the sender to an unwanted third party; 717 mail that causes a financial charge to be incurred by the sender; 719 mail that causes an action on the recipient machine that causes 720 damage that might be attributed to the sender. 722 Programs that interpret 'mailto' URIs/IRIs SHOULD ensure that the 723 SMTP envelope return path address, which is given as an argument to 724 the SMTP MAIL FROM command, is set and correct, and that the 725 resulting email is a complete, workable message. 727 'mailto' URIs/IRIs on public Web pages expose mail addresses for 728 harvesting. This applies to all mail addresses that are part of the 729 'mailto' URI/IRI, including the addresses in a "bcc" . 
730 Those addresses will not be sent to the recipients in the 'to' field 731 and in the "to" and "cc" <hfvalue>s, but will still be publicly 732 visible in the URI/IRI. Addresses in a "bcc" <hfvalue> may also leak 733 to other addresses in the same <hfvalue> or become known otherwise, 734 depending on the mail user agent used. 736 Programs manipulating 'mailto' URIs/IRIs have to take great care to 737 not inadvertently double-escape or double-unescape 'mailto' URIs/ 738 IRIs, and to make sure that escaping and unescaping conventions 739 relating to URIs/IRIs and relating to mail addresses are applied in 740 the right order. 742 Implementations parsing 'mailto' URIs/IRIs must take care to sanity 743 check 'mailto' URIs/IRIs in order to avoid buffer overflows and 744 problems resulting from them (e.g., execution of code specified by 745 the attacker). 747 The security considerations for URIs ([STD66]), IRIs ([RFC3987]), 748 IDNA ([RFC5890] and [RFC5891]), and EAI ([RFC6530] and [RFC6532]) 749 also apply. Implementers and users are advised to check them 750 carefully. 752 8. IANA Considerations 754 8.1. Update of the Registration of the 'mailto' URI/IRI Scheme 755 This document changes the definition of the 'mailto' URI/IRI scheme; 756 the registry of URI/IRI schemes should be updated to refer to this 757 document rather than its predecessor [RFC6068]. The registration 758 template is as follows: 759 Resource Identifier (RI) Scheme name: 760 'mailto' 762 Status: 763 permanent 765 Scheme syntax: 766 See the syntax section of RFC YYYY. 767 [RFC Editor: Please replace with actual RFC number.] 769 Scheme semantics: 770 See the semantics section of RFC YYYY. 771 [RFC Editor: Please replace with actual RFC number.] 773 Encoding considerations: 774 See the syntax and encoding sections of RFC YYYY. 775 [RFC Editor: Please replace with actual RFC number.] 777 Applications/protocols that use this scheme name: 778 The 'mailto' URI/IRI scheme is widely used since 779 the start of the Web.
781 Interoperability considerations: 782 Interoperability for 'mailto' URIs/IRIs with UTF-8-based 783 percent-encoding might be somewhat lower than interoperability 784 for 'mailto' URIs with US-ASCII only. In particular, 785 interoperability for 'mailto' URIs/IRIs with UTF-8-based 786 percent-encoding in the LHS of email addresses requires 787 support of EAI [RFC6530]. 789 Security considerations: 790 See the security considerations section of RFC YYYY. 791 [RFC Editor: Please replace with actual RFC number.] 793 Contact: 794 IETF 796 Author/Change controller: 797 IETF 799 References: 800 Duerst, M., Masinter, L., and J. Zawinski, 801 "The 'mailto' URI/IRI Scheme", RFC YYYY, ???? 201?. 802 [RFC Editor: Please replace with actual RFC number and date.] 804 8.2. Registration of the Body Header Field 806 IANA is herewith requested to update the reference for the 807 registration of the Body header field in the Message Header Fields 808 Registry ([RFC3864]) from [RFC6068] to this document (there are no 809 changes to the specification of the Body header field itself). 811 9. Main Changes from RFC 6068 813 The main changes from [RFC6068] are as follows: 815 o Allowed UTF-8/percent-encoding in <local-part>, to be used for 816 EAI email addresses. 818 o Added "/" and "?" back to some-delims, because they are allowed in 819 query parts. 821 o Added suffix "-enc" to some ABNF rule names to distinguish them 822 from their counterparts without percent-encoding. 824 o Added a MUST for using UTF-8 in . 826 o Added examples as IRIs where there's a difference to the URI form. 828 o Added non-ASCII examples in HTML and PDF versions for better 829 understanding. 831 10. Change Log 833 RFC Editor: Please remove this section before publication. 835 10.1. Changes from -03 to -04 837 Added explanation of consequences of registration of URI/IRI to 838 web mail service, both in Semantics section and in Security 839 Considerations.
841 Aligned registration template with the one in 842 draft-ietf-iri-4395bis-irireg-04. 844 Added EAI references and acronyms to security section. 846 Removed sentence "Therefore, fragment identifiers are meaningless, 847 SHOULD NOT be used on 'mailto' URIs, and SHOULD be ignored upon 848 resolution." because fragments are outside of the scope of a URI/ 849 IRI scheme definition. 851 Various minor tweaks and fixes. 853 Fixed spelling of Pete Resnick's name (see 854 http://www.rfc-editor.org/errata_search.php?rfc=6068&eid=3265). 856 10.2. Changes from -02 to -03 858 Introduced non-ASCII text in author names and examples for better 859 understanding and as a trial for future draft/rfc formats. 861 Split "Main Changes" and changes by draft number so that the 862 former can be kept, but the latter removed when moving to 863 publication. 865 Fixed title of RFC 6068. 867 Various minor tweaks and fixes. 869 10.3. Changes from -01 to -02 871 TODO: Change syntax definition to be in terms of IRI syntax, not 872 URI syntax. 874 Split up the Syntax section into subsections. 876 Added "/" and "?" back to some-delims, because they are allowed in 877 query parts. 879 Updated references. 881 10.4. Changes from -00 to -01 883 Updated references. 885 Removed RFC Editor note for updating reference to RFC3987. 886 Depending on how the documents progress, this will be unnecessary 887 or will happen automatically. 889 Minor editorial tweaks. 891 10.5. Changes from RFC 6068 to -00 893 Changed title and various other places to also refer to IRIs. 895 Allowed UTF-8/percent-encoding in <local-part>, to be used for 896 EAI email addresses. 898 Updated syntax to use "-enc" suffix in some places. 900 Added MUST for using UTF-8 in . 902 Added a new subsection with EAI-only examples. 904 Updated references. 906 Updated first author's address. 908 11. Acknowledgments 910 This document was derived from [RFC6068]; the acknowledgments from 911 that specification and its predecessor still apply.
913 Valuable input on this document was received from (in no particular 914 order): Shawn Steele, Frank Ellermann, John Klensin, Yangwoo Ko, John 915 Levine, and Roy Fielding. 917 12. References 919 12.1. Normative References 921 [RFC2045] Freed, N. and N. Borenstein, "Multipurpose Internet Mail 922 Extensions (MIME) Part One: Format of Internet Message 923 Bodies", RFC 2045, November 1996. 925 [RFC2047] Moore, K., "MIME Part Three: Message Header Extensions for 926 Non-ASCII Text", RFC 2047, November 1996. 928 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 929 Requirement Levels", BCP 14, RFC 2119, March 1997. 931 [RFC3864] Klyne, G., Nottingham, M., and J. Mogul, "Registration 932 Procedures for Message Header Fields", BCP 90, RFC 3864, 933 September 2004. 935 [RFC3987] Duerst, M. and M. Suignard, "Internationalized Resource 936 Identifiers (IRIs)", RFC 3987, January 2005. 938 [RFC5322] Resnick, P., "Internet Message Format", RFC 5322, 939 October 2008. 941 [RFC5890] Klensin, J., "Internationalized Domain Names for 942 Applications (IDNA): Definitions and Document Framework", 943 RFC 5890, August 2010. 945 [RFC5891] Klensin, J., "Internationalized Domain Names in 946 Applications (IDNA): Protocol", RFC 5891, August 2010. 948 [RFC6532] Yang, A., Steele, S., and N. Freed, "Internationalized 949 Email Headers", RFC 6532, February 2012. 951 [STD63] Yergeau, F., "UTF-8, a transformation format of ISO 952 10646", STD 63, RFC 3629, November 2003. 954 [STD66] Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform 955 Resource Identifier (URI): Generic Syntax", STD 66, 956 RFC 3986, January 2005. 958 [STD68] Crocker, D. and P. Overell, "Augmented BNF for Syntax 959 Specifications: ABNF", STD 68, RFC 5234, January 2008. 961 12.2. Informative References 963 [RFC2324] Masinter, L., "Hyper Text Coffee Pot Control Protocol 964 (HTCPCP/1.0)", RFC 2324, April 1998. 966 [RFC4395bis] 967 Hansen, T., Hardie, T., and L. 
Masinter, "Guidelines and 968 Registration Procedures for New URI/IRI Schemes", 969 draft-ietf-iri-4395bis-irireg-04, December 2011. 971 [RFC6068] Duerst, M., Masinter, L., and J. Zawinski, "The 'mailto' 972 URI Scheme", RFC 6068, October 2010. 974 [RFC6530] Klensin, J. and Y. Ko, "Overview and Framework for 975 Internationalized Email", RFC 6530, February 2012. 977 Authors' Addresses 979 Martin J. Duerst (Note: Please write "Duerst" with u-umlaut wherever 980 possible, for example as "Dürst" in XML and HTML.) 981 Aoyama Gakuin University 982 5-10-1 Fuchinobe 983 Chuo-ku 984 Sagamihara, Kanagawa 252-5258 985 Japan 987 Phone: +81 42 759 6329 988 Fax: +81 42 759 6495 989 Email: duerst@it.aoyama.ac.jp 990 URI: http://www.sw.it.aoyama.ac.jp/D%C3%BCrst/ 991 Larry Masinter 992 Adobe Systems Incorporated 993 345 Park Ave 994 San Jose, CA 95110 995 USA 997 Phone: +1-408-536-3024 998 Email: LMM@acm.org 999 URI: http://larry.masinter.net/ 1001 Jamie Zawinski 1002 DNA Lounge 1003 375 Eleventh Street 1004 San Francisco, CA 94103 1005 USA 1007 Email: jwz@jwz.org