IETF                                                        N. Tomkinson
Internet-Draft                                             N. Borenstein
Intended status: Standards Track                            Mimecast Ltd
Expires: January 7, 2016                                   July 06, 2015

                     Multiple Language Content Type
                draft-tomkinson-slim-multilangcontent-01

Abstract

This document defines an addition to the Multipurpose Internet Mail
Extensions (MIME) standard to make it possible to send one message
that contains multiple language versions of the same information.
The translations would be identified by a language code and selected
by the email client based on a user's language settings or locale.

Status of This Memo

This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current
Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."

This Internet-Draft will expire on January 7, 2016.

Copyright Notice

Copyright (c) 2015 IETF Trust and the persons identified as the
document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.

1. Introduction

Since the invention of email and the rapid spread of the internet,
more and more people have been able to communicate in more and more
countries and in more and more languages. But during this time of
technological evolution, email has remained a single language
communication tool, whether it is English to English, Spanish to
Spanish or Japanese to Japanese.

Also during this time, many corporations have established their
offices in multi-cultural cities and formed departments and teams
that span continents, cultures and languages, so the need to
communicate efficiently with little margin for miscommunication has
grown exponentially.

The objective of this document is to define an addition to the
Multipurpose Internet Mail Extensions (MIME) standard, to make it
possible to send a single message to a group of people in such a way
that all of the recipients can read the email in their preferred
language. The methods of translation of the message content are
beyond the scope of this document, but the structure of the email
itself is defined herein.

Whilst this document depends on identification of language in message
parts for non-real-time communication, there is a companion document
that is concerned with a similar problem for real-time communication:
[I-D.gellens-slim-negotiating-human-language].

1.1. Requirements Language

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [RFC2119].

2. The Content-Type Header Field

When there is a requirement to send a message in a number of
different languages and the translations are to be embedded in the
same message, the multipart subtype "multipart/multilingual" SHOULD
be used to help the receiving email client make sense of the message
structure.

The suggested multipart subtype "multipart/multilingual" has similar
semantics to "multipart/alternative" (as discussed in RFC 2046
[RFC2046]) in that each of the message parts is an alternative
version of the same information.
The primary difference between
"multipart/multilingual" and "multipart/alternative" is that when
using "multipart/multilingual", the message part to select for
rendering is chosen based on the values of the Content-Language field
and the Translation-Type field instead of the ordering of the parts
and the Content-Types.

The syntax for this multipart subtype conforms to the common syntax
for subtypes of multipart given in Section 5.1.1 of RFC 2046
[RFC2046]. An example "multipart/multilingual" Content-Type header
field would look like this:

    Content-Type: multipart/multilingual; boundary=01189998819991197253

3. The Message Parts

A multipart/multilingual message will have a number of message parts:
exactly one multilingual preface, one or more language message parts
and zero or one unmatched message part. The details of these are
described below.

3.1. The Multilingual Preface

In order for the message to be received and displayed in
non-conforming email clients, the message SHOULD contain an
explanatory message part which MUST NOT be marked with a
Content-Language field and MUST be the first of the message parts.
Because non-conforming email clients are expected to treat the
message as multipart/mixed (in accordance with Sections 5.1.3 and
5.1.7 of RFC 2046 [RFC2046]), they may show all of the message parts
sequentially or as attachments. Including and showing this
explanatory part will help the message recipient understand the
message structure.

This initial message part SHOULD explain briefly to the message
recipient that the message contains multiple languages and the parts
may be rendered sequentially or as attachments. This SHOULD be
presented in the same languages that are provided in the subsequent
language message parts.
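
As a non-normative illustration, the structure described so far (a
multipart/multilingual container whose first part is a preface with
no Content-Language field, followed by language message parts) could
be assembled with Python's standard email package roughly as follows.
This is only a sketch under stated assumptions: the helper name
build_multilingual and the Spanish sample subject are invented for the
illustration, and the library adds its own MIME-Version and
Content-Transfer-Encoding fields that this document does not discuss.

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_multilingual(preface, translations):
    """Assemble a multipart/multilingual message (hypothetical helper).

    preface: explanatory text for non-conforming email clients.
    translations: list of (language, translation_type, subject, body).
    """
    msg = MIMEMultipart("multilingual")

    # The preface is the first part and has no Content-Language field.
    msg.attach(MIMEText(preface, "plain", "us-ascii"))

    # Each language part carries the fields named in this document.
    for language, translation_type, subject, body in translations:
        part = MIMEText(body, "plain", "us-ascii")
        part["Content-Language"] = language
        part["Translation-Type"] = translation_type
        part["Subject"] = subject
        msg.attach(part)
    return msg


message = build_multilingual(
    "This is a message in two languages: English and Spanish.\n"
    "Este es un mensaje en dos idiomas: Ingles y Espanol.\n",
    [
        ("en", "original",
         "example of a message in Spanish and English",
         "Hello, this message content is provided in your language.\n"),
        # The Spanish subject below is ASCII-only sample text (assumption).
        ("es", "human",
         "ejemplo de mensaje en dos idiomas",
         "Hola, el contenido de este mensaje esta disponible en su idioma.\n"),
    ],
)
message["Subject"] = "example of a message in Spanish and English"
print(message.as_string())
```

The sketch omits the optional unmatched message part; if one were
added, it would be attached last and without a Content-Language field.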

Whilst this section of the message is useful for backward
compatibility, it SHOULD only be shown when rendered by a
non-conforming email client, because conforming email clients SHOULD
only show the single language message part identified by the user's
preferred language (or locale) and the language message part's
Content-Language.

For an example of a Multilingual Preface, see the examples in
Section 8.

3.2. The Language Message Parts

The language message parts are translations of the same message
content. These message parts MAY be ordered so that the first part
after the multilingual preface is in the language believed to be the
most likely to be recognised by the recipient. All of the language
message parts MUST have a Content-Language field and a Content-Type
field; they SHOULD have a Subject field and MAY have a
Translation-Type field.

The Content-Type for each individual language part MAY be any MIME
type (including multipart subtypes such as multipart/alternative).
However, it is recommended that the Content-Type of the language
parts is kept as simple as possible for interoperability with
existing email clients. The language parts are not required to have
matching Content-Types or multipart structures. For example, there
might be an English part of type "text/html" followed by a Spanish
part of type "application/pdf" followed by a Chinese part of type
"image/jpeg". Whatever the Content-Type, the contents SHOULD be
composed for optimal viewing in the specified language.

3.3. The Unmatched Message Part

If there is content intended for the recipient to see if they have a
preferred language other than one of those specified in the language
parts, another part MAY be provided. This would be useful when a
language-independent graphic is available.
When this unmatched part
is present, it MUST be the last part, MUST NOT have a
Content-Language field and SHOULD NOT have a Subject field.

4. Message Part Selection

The logic for selecting the message part to render and present to the
recipient is quite straightforward and is summarised in the next few
paragraphs.

Firstly, if the email client does not understand
multipart/multilingual then it SHOULD treat the message as if it were
multipart/mixed and render message parts accordingly.

If the email client does understand multipart/multilingual then it
SHOULD ignore the multilingual preface and select the best match for
the user's preferred language from the language message parts
available. Also, the user may prefer to see the original message
content in their second language over a machine translation in their
first language. The Translation-Type field value can be used for
further selection based on this preference. The selection of
language part may be implemented in a variety of ways and is
dependent on how the email client manages its user preferences. The
ultimate goal is to render the most appropriate translation for the
user. Similarly, the subject should be chosen from the selected
language message part.

If there is no match for the user's preferred language (or there is
no preferred language information available) the email client SHOULD
select the unmatched part (if one exists) or the first language part
(directly after the multilingual preface) if an unmatched part does
not exist. The Subject header field value should be used whenever a
suitable translation cannot be identified.

If there is no translation type preference information available, the
values of the Translation-Type field may be ignored.

Additionally, interactive implementations MAY offer the user a choice
from among the available languages.

5. The Content-Language Field

The Content-Language field in the individual language message parts
is used to identify the language in which the message part is
written. Based on the value of this field, a conforming email client
can determine which message part to display (given the user's
language settings or locale).

The Content-Language MUST comply with RFC 3282 [RFC3282] (which
defines the Content-Language field) and BCP 47/RFC 5646 [RFC5646]
(which defines the structure and semantics for the language code
values). While RFC 5646 provides a mechanism accommodating
increasingly fine-grained distinctions, in the interest of maximum
interoperability, each Content-Language value SHOULD be restricted to
the largest granularity of language tags; in other words, it is
RECOMMENDED to specify only a Primary-subtag and NOT to include
subtags (e.g., for region or dialect) unless the languages might be
mutually incomprehensible without them. Examples of this field for
English, German and an instruction manual in Spanish and French
could look like the following:

    Content-Language: en

    Content-Language: de

    Content-Language: es, fr

6. The Translation-Type Field

The Translation-Type field in the individual language message parts
is used to identify the type of translation. Based on the value of
this field and the user's preferences, a conforming email client can
determine which message part to display.

This field can have one of three possible values: 'original', 'human'
or 'automated', although other values may be added in the future. A
value of 'original' is given in the language message part that is in
the original language. A value of 'human' is used when a language
message part is translated by a human translator.
A value of
'automated' is used when a language message part has been translated
by an electronic agent without proofreading or subsequent correction.

Examples of this field may look like this:

    Translation-Type: original

    Translation-Type: human

7. The Subject Field in the Language Message Parts

On receipt of the message, conforming email clients will need to
render the subject in the correct language for the recipient. To
enable this, the Subject field SHOULD be provided in each language
message part. The value for this field should be a translation of
the email subject.

US-ASCII and 'encoded-word' examples of this field may look like
this:

    Subject: A really simple email subject

    Subject: =?iso-8859-1?Q?un_asunto_de_correo_electr=F3nico_sencillo?=

See RFC 2047 [RFC2047] for the specification of 'encoded-word'.

8. Examples

8.1. An Example of a Simple Multiple Language Email Message

Below is an example of a simple multiple language email message
formatted using the method detailed in this document.

    From: Nik
    To: Nathaniel
    Subject: example of a message in Spanish and English
    Content-Type: multipart/multilingual; boundary=01189998819991197253

    --01189998819991197253

    This is a message in two languages: English and Spanish. It says the
    same thing in each language. If you can read it in one language,
    you can ignore the other translations. The other translations may be
    presented as attachments or grouped together.

    Este es un mensaje en dos idiomas: Ingles y Espanol. Dice lo mismo en
    cada idioma. Si puede leerlo en un idioma, puede ignorar las otras
    traducciones. Las otras traducciones pueden presentarse como archivos
    adjuntos o agrupados.

    --01189998819991197253
    Content-Language: en
    Translation-Type: original
    Content-Type: text/plain
    Subject: example of a message in Spanish and English

    Hello, this message content is provided in your language.

    --01189998819991197253
    Content-Language: es
    Translation-Type: human
    Content-Type: text/plain
    Subject: =?iso-8859-1?Q?ejemplo_pr=E1ctico_de_mensaje_?=
     =?iso-8859-1?Q?en_espa=F1ol_e_ingl=E9s?=

    Hola, el contenido de este mensaje esta disponible en su idioma.

    --01189998819991197253
    Content-Type: image/gif

    ..GIF image showing iconic or language-independent content here..

    --01189998819991197253--

8.2. An Example of a Complex Multiple Language Email Message

Below is an example of a more complex multiple language email message
formatted using the method detailed in this document. Note that the
language message parts have multipart contents and would therefore
require further processing to determine the content to display.

    From: Nik
    To: Nathaniel
    Subject: example of a message in Spanish and English
    Content-Type: multipart/multilingual; boundary=01189998819991197253

    --01189998819991197253

    This is a message in two languages: English and Spanish. It says the
    same thing in each language. If you can read it in one language,
    you can ignore the other translations. The other translations may be
    presented as attachments or grouped together.

    Este es un mensaje en dos idiomas: Ingles y Espanol. Dice lo mismo en
    cada idioma. Si puede leerlo en un idioma, puede ignorar las otras
    traducciones. Las otras traducciones pueden presentarse como archivos
    adjuntos o agrupados.

    --01189998819991197253
    Content-Language: en
    Translation-Type: original
    Content-Type: multipart/alternative; boundary=multipartaltboundary
    Subject: example of a message in Spanish and English

    --multipartaltboundary
    Content-Type: text/plain

    Hello, this message content is provided in your language.

    --multipartaltboundary
    Content-Type: text/html

    <html><body>Hello, this message content is provided in your
    language.</body></html>

    --multipartaltboundary--

    --01189998819991197253
    Content-Language: es
    Translation-Type: human
    Content-Type: multipart/mixed; boundary=multipartmixboundary
    Subject: =?iso-8859-1?Q?ejemplo_pr=E1ctico_de_mensaje_?=
     =?iso-8859-1?Q?en_espa=F1ol_e_ingl=E9s?=

    --multipartmixboundary
    Content-Type: application/pdf

    ..PDF file in Spanish here..

    --multipartmixboundary
    Content-Type: image/jpeg

    ..JPEG image showing Spanish content here..

    --multipartmixboundary--

    --01189998819991197253
    Content-Type: image/gif

    ..GIF image showing iconic or language-independent content here..

    --01189998819991197253--

9. Changes from Previous Versions

9.1. Changes from draft-tomkinson-multilangcontent-01 to
     draft-tomkinson-slim-multilangcontent-00

o  File name and version number changed to reflect the proposed WG
   name SLIM (Selection of Language for Internet Media).

o  Replaced the Subject-Translation field in the language message
   parts with Subject and provided US-ASCII and non-US-ASCII
   examples.

o  Introduced the language-independent unmatched message part.

o  Many wording improvements and clarifications throughout the
   document.

9.2. Changes from draft-tomkinson-slim-multilangcontent-00 to
     draft-tomkinson-slim-multilangcontent-01

o  Added Translation-Type in each language message part to identify
   the source of the translation (original/human/automated).

10. Acknowledgements

The authors are grateful for the helpful input received from many
people but would especially like to acknowledge the help of Harald
Alvestrand, Stephane Bortzmeyer, Mark Davis, Doug Ewell, Randall
Gellens, Gunnar Hellstrom, Alexey Melnikov, Fiona Tomkinson, Simon
Tyler and Daniel Vargha. The authors would also like to thank Luis
de Pablo for his work on the Spanish translations.

11. IANA Considerations

The multipart/multilingual MIME type will be registered with IANA.

12. Security Considerations

This document has no additional security considerations beyond those
that apply to the standards and procedures on which it is built.

13. References

13.1. Normative References

[RFC2046]  Freed, N. and N. Borenstein, "Multipurpose Internet Mail
           Extensions (MIME) Part Two: Media Types", RFC 2046,
           November 1996.

[RFC2047]  Moore, K., "MIME (Multipurpose Internet Mail Extensions)
           Part Three: Message Header Extensions for Non-ASCII Text",
           RFC 2047, November 1996.

[RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
           Requirement Levels", BCP 14, RFC 2119, March 1997.

[RFC3282]  Alvestrand, H., "Content Language Headers", RFC 3282,
           May 2002.

[RFC5646]  Phillips, A. and M. Davis, "Tags for Identifying
           Languages", BCP 47, RFC 5646, September 2009.

13.2. Informational References

[I-D.gellens-slim-negotiating-human-language]
           Gellens, R., "Negotiating Human Language in Real-Time
           Communications",
           draft-gellens-slim-negotiating-human-language-02 (work in
           progress), July 2015.

Authors' Addresses

Nik Tomkinson
Mimecast Ltd
CityPoint, One Ropemaker Street
London EC2Y 9AW
United Kingdom

Email: rfc.nik.tomkinson@gmail.com

Nathaniel Borenstein
Mimecast Ltd
480 Pleasant Street
Watertown MA 02472
North America

Email: nsb@mimecast.com