idnits 2.17.1

draft-tomkinson-slim-multilangcontent-02.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (October 15, 2015) is 3116 days in the past.  Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Outdated reference: A later version (-03) exists of
     draft-gellens-slim-negotiating-human-language-02

     Summary: 0 errors (**), 0 flaws (~~), 2 warnings (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

IETF                                                        N. Tomkinson
Internet-Draft                                             N. Borenstein
Intended status: Standards Track                            Mimecast Ltd
Expires: April 17, 2016                                 October 15, 2015

                     Multiple Language Content Type
                draft-tomkinson-slim-multilangcontent-02

Abstract

   This document defines an addition to the Multipurpose Internet Mail
   Extensions (MIME) standard to make it possible to send one message
   that contains multiple language versions of the same information.
   The translations would be identified by a language code and selected
   by the email client based on a user's language settings or locale.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on April 17, 2016.

Copyright Notice

   Copyright (c) 2015 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

1.  Introduction

   Since the invention of email and the rapid spread of the Internet,
   more and more people have been able to communicate in more and more
   countries and in more and more languages.  But during this time of
   technological evolution, email has remained a single-language
   communication tool, whether it is English to English, Spanish to
   Spanish or Japanese to Japanese.

   Also during this time, many corporations have established their
   offices in multi-cultural cities and formed departments and teams
   that span continents, cultures and languages, so the need to
   communicate efficiently with little margin for miscommunication has
   grown exponentially.

   The objective of this document is to define an addition to the
   Multipurpose Internet Mail Extensions (MIME) standard, to make it
   possible to send a single message to a group of people in such a way
   that all of the recipients can read the email in their preferred
   language.  The methods of translation of the message content are
   beyond the scope of this document, but the structure of the email
   itself is defined herein.

   Whilst this document depends on identification of language in message
   parts for non-real-time communication, there is a companion document
   that is concerned with a similar problem for real-time communication:
   [I-D.gellens-slim-negotiating-human-language].

1.1.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

2.  The Content-Type Header Field

   The "multipart/multilingual" MIME subtype allows the sending of a
   message in a number of different languages with the translations
   embedded in the same message.  This MIME subtype helps the receiving
   email client make sense of the message structure.

   The multipart subtype "multipart/multilingual" has similar semantics
   to "multipart/alternative" (as discussed in RFC 2046 [RFC2046]) in
   that each of the message parts is an alternative version of the same
   information.
   The primary difference between "multipart/multilingual" and
   "multipart/alternative" is that when using "multipart/multilingual",
   the message part to select for rendering is chosen based on the
   values of the Content-Language field and optionally the
   Translation-Type parameter of the Content-Language field instead of
   the ordering of the parts and the Content-Types.

   The syntax for this multipart subtype conforms to the common syntax
   for subtypes of multipart given in Section 5.1.1 of RFC 2046
   [RFC2046].  An example "multipart/multilingual" Content-Type header
   field would look like this:

   Content-Type: multipart/multilingual; boundary=01189998819991197253

3.  The Message Parts

   A multipart/multilingual message will have a number of message parts:
   exactly one multilingual preface, one or more language message parts
   and zero or one unmatched message part.  The details of these are
   described below.

3.1.  The Multilingual Preface

   In order for the message to be received and displayed in
   non-conforming email clients, the message SHOULD contain an
   explanatory message part which MUST NOT be marked with a
   Content-Language field and MUST be the first of the message parts.
   Because non-conforming email clients are expected to treat the
   message as multipart/mixed (in accordance with Sections 5.1.3 and
   5.1.7 of RFC 2046 [RFC2046]) they may show all of the message parts
   sequentially or as attachments.  Including and showing this
   explanatory part will help the message recipient understand the
   message structure.

   This initial message part SHOULD explain briefly to the recipient
   that the message contains multiple languages and the parts may be
   rendered sequentially or as attachments.  This SHOULD be presented in
   the same languages that are provided in the subsequent language
   message parts.
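
   As a rough illustration of the structure described so far, the
   following Python sketch assembles a multipart/multilingual message
   using the standard library's email package.  The helper name
   build_multilingual and its argument layout are assumptions made for
   this illustration only; they are not defined by this document.

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_multilingual(subject, preface, translations, boundary=None):
    """Assemble a multipart/multilingual message (illustrative helper).

    `translations` is a list of (language_tag, body) tuples.
    """
    msg = MIMEMultipart("multilingual", boundary=boundary)
    msg["Subject"] = subject

    # The multilingual preface is the first part and deliberately
    # carries no Content-Language field.
    msg.attach(MIMEText(preface))

    # One language message part per translation, each tagged with the
    # language it is written in.
    for tag, body in translations:
        part = MIMEText(body)
        part["Content-Language"] = tag
        msg.attach(part)
    return msg


message = build_multilingual(
    "example of a message in Spanish and English",
    "This is a message in multiple languages.\n"
    "Este es un mensaje en varios idiomas.\n",
    [
        ("en", "Hello, this message content is provided in your language."),
        ("es", "Hola, el contenido de este mensaje esta disponible en su idioma."),
    ],
    boundary="01189998819991197253",
)
print(message.as_string())
```

   Attaching the preface before any language part keeps it first in the
   serialized message, which is what a non-conforming client treating
   the message as multipart/mixed would show first.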

   Whilst this section of the message is useful for backward
   compatibility, it will normally only be shown when rendered by a
   non-conforming email client, because conforming email clients SHOULD
   only show the single language message part identified by the user's
   preferred language (or locale) and the language message part's
   Content-Language.

   For the correct display of the multilingual preface in a
   non-conforming email client, the sender MAY use the
   Content-Disposition field with a value of 'inline' in conformance
   with RFC 2183 [RFC2183] (which defines the Content-Disposition
   field).  If provided, this SHOULD be placed at the
   multipart/multilingual level and in the multilingual preface.  This
   makes it clear to a non-conforming email client that the multilingual
   preface should be displayed immediately to the recipient, followed by
   any subsequent parts marked as 'inline'.

   For an example of a multilingual preface, see the examples in
   Section 8.

3.2.  The Language Message Parts

   The language message parts are translations of the same message
   content.  These message parts MAY be ordered so that the first part
   after the multilingual preface is in the language believed to be the
   most likely to be recognised by the recipient.  All of the language
   message parts MUST have a Content-Language field and a Content-Type
   field; they SHOULD have a Subject field and MAY have a
   Translation-Type parameter applied to the Content-Language field.

   The Content-Type for each individual language part MAY be any MIME
   type (including multipart subtypes such as multipart/alternative).
   However, it is RECOMMENDED that the Content-Type of the language
   parts is kept as simple as possible for interoperability with
   existing email clients.  The language parts are not required to have
   matching Content-Types or multipart structures.
   For example, there might be an English part of type "text/html"
   followed by a Spanish part of type "application/pdf" followed by a
   Chinese part of type "image/jpeg".  Whatever the content-type, the
   contents SHOULD be composed for optimal viewing in the specified
   language.

   For a non-multipart type, it is RECOMMENDED that the sender applies a
   Name parameter to the Content-Type field.  This will help the
   recipient identify the translations when the translations are
   rendered as attachments by a non-conforming email client.

   An example of this parameter is as follows:

   Content-Type: text/plain; name="english.txt"

3.3.  The Unmatched Message Part

   If there is content intended for recipients whose preferred language
   is not among those specified in the language parts, another part MAY
   be provided.  This would also be useful when a language-independent
   graphic is available.  When this unmatched part is present, it MUST
   be the last part, MUST NOT have a Content-Language field and SHOULD
   NOT have a Subject field.

4.  Message Part Selection

   The logic for selecting the message part to render and present to the
   recipient is quite straightforward and is summarised in the next few
   paragraphs.

   Firstly, if the email client does not understand
   multipart/multilingual then it SHOULD treat the message as if it were
   multipart/mixed and render message parts accordingly.

   If the email client does understand multipart/multilingual then it
   SHOULD ignore the multilingual preface and select the best match for
   the user's preferred language from the language message parts
   available.  Also, the user may prefer to see the original message
   content in their second language over a machine translation in their
   first language.  The Translation-Type parameter of the
   Content-Language field value can be used for further selection based
   on this preference.
   The selection of language part may be implemented in a variety of
   ways and is a matter for the email client and its user preferences.
   The goal is to render the most appropriate translation for the user.
   Similarly, the subject to display (for example in a message listing)
   should be chosen from the selected language message part if it is
   available.

   If there is no match for the user's preferred language (or there is
   no preferred language information available) the email client SHOULD
   select the unmatched part (if one exists) or the first language part
   (directly after the multilingual preface) if an unmatched part does
   not exist.  The top-level Subject header field value should be used
   whenever a suitable translation cannot be identified.

   If there is no translation type preference information available, the
   values of the Translation-Type parameter may be ignored.

   Additionally, interactive implementations MAY offer the user a choice
   from among the available languages.

5.  The Content-Language Field

   The Content-Language field in the individual language message parts
   is used to identify the language in which the message part is
   written.  Based on the value of this field, a conforming email client
   can determine which message part to display (given the user's
   language settings or locale).

   The Content-Language MUST comply with RFC 3282 [RFC3282] (which
   defines the Content-Language field) and BCP 47/RFC 5646 [RFC5646]
   (which defines the structure and semantics for the language code
   values).
   While RFC 5646 provides a mechanism accommodating increasingly
   fine-grained distinctions, in the interest of maximum
   interoperability, each Content-Language value SHOULD be restricted to
   the largest granularity of language tags; in other words, it is
   RECOMMENDED to specify only a Primary-subtag and NOT to include
   subtags (e.g., for region or dialect) unless the languages might be
   mutually incomprehensible without them.  Examples of this field for
   English, for German, and for an instruction manual in Spanish and
   French could look like the following:

   Content-Language: en

   Content-Language: de

   Content-Language: es, fr

6.  The Translation-Type Parameter

   The Translation-Type parameter can be applied to the Content-Language
   field in the individual language message parts and is used to
   identify the type of translation.  Based on the value of this
   parameter and the user's preferences, a conforming email client can
   determine which message part to display.

   This parameter can have one of three possible values: 'original',
   'human' or 'automated', although other values may be added in the
   future.  A value of 'original' is given in the language message part
   that is in the original language.  A value of 'human' is used when a
   language message part is translated by a human translator or a human
   has checked and corrected an automated translation.  A value of
   'automated' is used when a language message part has been translated
   by an electronic agent without proofreading or subsequent correction.

   Examples of this parameter include:

   Content-Language: en; translation-type=original

   Content-Language: fr; translation-type=human

7.  The Subject Field in the Language Message parts

   On receipt of the message, conforming email clients will need to
   render the subject in the correct language for the recipient.
   To enable this, the Subject field SHOULD be provided in each language
   message part.  The value for this field should be a translation of
   the email subject.

   US-ASCII and 'encoded-word' examples of this field include:

   Subject: A really simple email subject

   Subject: =?iso-8859-1?Q?un_asunto_de_correo_electr=F3nico_sencillo?=

   See RFC 2047 [RFC2047] for the specification of 'encoded-word'.

8.  Examples

8.1.  An Example of a Simple Multiple language email message

   From: Nik
   To: Nathaniel
   Subject: example of a message in Spanish and English
   Content-Type: multipart/multilingual; boundary=01189998819991197253
   Content-Disposition: inline

   --01189998819991197253
   Content-Disposition: inline

   This is a message in multiple languages.  It says the
   same thing in each language.  If you can read it in one language,
   you can ignore the other translations.  The other translations may be
   presented as attachments or grouped together.

   Este es un mensaje en varios idiomas.  Dice lo mismo en
   cada idioma.  Si puede leerlo en un idioma, puede ignorar las otras
   traducciones.  Las otras traducciones pueden presentarse como archivos
   adjuntos o agrupados.

   --01189998819991197253
   Content-Language: en; translation-type=original
   Content-Type: text/plain; name="english.txt"
   Content-Disposition: inline
   Subject: example of a message in Spanish and English

   Hello, this message content is provided in your language.

   --01189998819991197253
   Content-Language: es; translation-type=human
   Content-Type: text/plain; name="espanol.txt"
   Content-Disposition: inline
   Subject: =?iso-8859-1?Q?ejemplo_pr=E1ctico_de_mensaje_
    en_espa=F1ol_e_ingl=E9s?=

   Hola, el contenido de este mensaje esta disponible en su idioma.

   --01189998819991197253
   Content-Type: image/gif
   Content-Disposition: inline

   ..GIF image showing iconic or language-independent content here..

   --01189998819991197253--

8.2.  An Example of a Complex Multiple language email message

   Below is an example of a more complex multiple language email message
   formatted using the method detailed in this document.  Note that the
   language message parts have multipart contents and would therefore
   require further processing to determine the content to display.

   From: Nik
   To: Nathaniel
   Subject: example of a message in Spanish and English
   Content-Type: multipart/multilingual; boundary=01189998819991197253
   Content-Disposition: inline

   --01189998819991197253
   Content-Disposition: inline

   This is a message in multiple languages.  It says the
   same thing in each language.  If you can read it in one language,
   you can ignore the other translations.  The other translations may be
   presented as attachments or grouped together.

   Este es un mensaje en varios idiomas.  Dice lo mismo en
   cada idioma.  Si puede leerlo en un idioma, puede ignorar las otras
   traducciones.  Las otras traducciones pueden presentarse como archivos
   adjuntos o agrupados.

   --01189998819991197253
   Content-Language: en; translation-type=original
   Content-Type: multipart/alternative; boundary=multipartaltboundary
   Subject: example of a message in Spanish and English

   --multipartaltboundary
   Content-Type: text/plain; name="english.txt"

   Hello, this message content is provided in your language.

   --multipartaltboundary
   Content-Type: text/html; name="english.html"

   Hello, this message content is provided in your language.

   --multipartaltboundary--

   --01189998819991197253
   Content-Language: es; translation-type=human
   Content-Type: multipart/mixed; boundary=multipartmixboundary
   Subject: =?iso-8859-1?Q?ejemplo_pr=E1ctico_de_mensaje_
    en_espa=F1ol_e_ingl=E9s?=

   --multipartmixboundary
   Content-Type:application/pdf; name="espanol.pdf"

   ..PDF file in Spanish here..

   --multipartmixboundary
   Content-Type:image/jpeg; name="espanol.jpg"

   ..JPEG image showing Spanish content here..

   --multipartmixboundary--

   --01189998819991197253
   Content-Type: image/gif
   Content-Disposition: inline

   ..GIF image showing iconic or language-independent content here..

   --01189998819991197253--

9.  Changes from Previous Versions

9.1.  Changes from draft-tomkinson-multilangcontent-01 to
      draft-tomkinson-slim-multilangcontent-00

   o  File name and version number changed to reflect the proposed WG
      name SLIM (Selection of Language for Internet Media).

   o  Replaced the Subject-Translation field in the language message
      parts with Subject and provided US-ASCII and non-US-ASCII
      examples.

   o  Introduced the language-independent unmatched message part.

   o  Many wording improvements and clarifications throughout the
      document.

9.2.  Changes from draft-tomkinson-slim-multilangcontent-00 to
      draft-tomkinson-slim-multilangcontent-01

   o  Added Translation-Type in each language message part to identify
      the source of the translation (original/human/automated).

9.3.  Changes from draft-tomkinson-slim-multilangcontent-01 to
      draft-tomkinson-slim-multilangcontent-02

   o  Changed Translation-Type to be a parameter for the
      Content-Language field rather than a new separate field.

   o  Added a paragraph about using Content-Disposition field to help
      non-conforming mail clients correctly render the multilingual
      preface.

   o  Recommended using a Name parameter on the language part
      Content-Type to help the recipient identify the translations in
      non-conforming mail clients.

   o  Many wording improvements and clarifications throughout the
      document.

10.  Acknowledgements

   The authors are grateful for the helpful input received from many
   people but would especially like to acknowledge the help of Harald
   Alvestrand, Stephane Bortzmeyer, Eric Burger, Mark Davis, Doug Ewell,
   Randall Gellens, Gunnar Hellstrom, Alexey Melnikov, Fiona Tomkinson,
   Simon Tyler and Daniel Vargha.  The authors would also like to thank
   Fernando Alvaro and Luis de Pablo for their work on the Spanish
   translations.

11.  IANA Considerations

   The multipart/multilingual MIME type will be registered with IANA.

12.  Security Considerations

   This document has no additional security considerations beyond those
   that apply to the standards and procedures on which it is built.

13.  References

13.1.  Normative References

   [RFC2046]  Freed, N. and N. Borenstein, "Multipurpose Internet Mail
              Extensions (MIME) Part Two: Media Types", RFC 2046,
              DOI 10.17487/RFC2046, November 1996.

   [RFC2047]  Moore, K., "MIME (Multipurpose Internet Mail Extensions)
              Part Three: Message Header Extensions for Non-ASCII Text",
              RFC 2047, DOI 10.17487/RFC2047, November 1996.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC2183]  Troost, R., Dorner, S., and K. Moore, Ed., "Communicating
              Presentation Information in Internet Messages: The
              Content-Disposition Header Field", RFC 2183,
              DOI 10.17487/RFC2183, August 1997.

   [RFC3282]  Alvestrand, H., "Content Language Headers", RFC 3282,
              DOI 10.17487/RFC3282, May 2002.

   [RFC5646]  Phillips, A., Ed. and M. Davis, Ed., "Tags for Identifying
              Languages", BCP 47, RFC 5646, DOI 10.17487/RFC5646,
              September 2009.

13.2.  Informational References

   [I-D.gellens-slim-negotiating-human-language]
              Gellens, R., "Negotiating Human Language in Real-Time
              Communications",
              draft-gellens-slim-negotiating-human-language-02 (work in
              progress), July 2015.

Authors' Addresses

   Nik Tomkinson
   Mimecast Ltd
   CityPoint, One Ropemaker Street
   London  EC2Y 9AW
   United Kingdom

   Email: rfc.nik.tomkinson@gmail.com

   Nathaniel Borenstein
   Mimecast Ltd
   480 Pleasant Street
   Watertown  MA 02472
   North America

   Email: nsb@mimecast.com