IETF                                                        N. Tomkinson
Internet-Draft                                             N. Borenstein
Intended status: Standards Track                            Mimecast Ltd
Expires: May 14, 2015                                  November 10, 2014


                     Multiple Language Content Type
                draft-tomkinson-slim-multilangcontent-00

Abstract

   This document defines an addition to the Multipurpose Internet Mail
   Extensions (MIME) standard to make it possible to send one message
   that contains multiple language versions of the same information.
   The translations would be identified by a language code and selected
   by the email client based on a user's language settings or locale.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on May 14, 2015.

Copyright Notice

   Copyright (c) 2014 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

1.  Introduction

   Since the invention of email and the rapid spread of the Internet,
   more and more people have been able to communicate in more and more
   countries and in more and more languages.  But during this time of
   technological evolution, email has remained a single-language
   communication tool, whether it is English to English, Spanish to
   Spanish or Japanese to Japanese.

   Also during this time, many corporations have established their
   offices in multi-cultural cities and formed departments and teams
   that span continents, cultures and languages, so the need to
   communicate efficiently with little margin for miscommunication has
   grown exponentially.

   The objective of this document is to define an addition to the
   Multipurpose Internet Mail Extensions (MIME) standard, to make it
   possible to send a single message to a group of people in such a way
   that all of the recipients can read the email in their preferred
   language.  The methods of translation of the message content are
   beyond the scope of this document, but the structure of the email
   itself is defined herein.

   Whilst this document depends on identification of language in message
   parts for non-real-time communication, there is a companion document
   that is concerned with a similar problem for real-time communication:
   [I-D.gellens-slim-negotiating-human-language].

1.1.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

2.  The Content-Type Header Field

   When there is a requirement to send a message in a number of
   different languages and the translations are to be embedded in the
   same message, the multipart subtype "multipart/multilingual" SHOULD
   be used to help the receiving email client make sense of the message
   structure.

   The suggested multipart subtype "multipart/multilingual" has similar
   semantics to "multipart/alternative" (as discussed in RFC 2046
   [RFC2046]) in that each of the message parts is an alternative
   version of the same information.
   The primary difference between "multipart/multilingual" and
   "multipart/alternative" is that when using "multipart/multilingual",
   the message part to select for rendering is chosen based on the value
   of the Content-Language header field instead of the ordering of the
   parts and the Content-Types.

   The syntax for this multipart subtype conforms to the common syntax
   for subtypes of multipart given in Section 5.1.1 of RFC 2046
   [RFC2046].  An example "multipart/multilingual" Content-Type header
   field would look like this:

   Content-type: multipart/multilingual; boundary=01189998819991197253

3.  The Message Parts

   A multipart/multilingual message will have a number of message parts:
   exactly one multilingual preface, one or more language message parts
   and zero or one unmatched message part.  The details of these are
   described below.

3.1.  The Multilingual Preface

   In order for the message to be received and displayed in
   non-conforming email clients, the message SHOULD contain an
   explanatory message part which MUST NOT be marked with a
   Content-Language field and MUST be the first of the message parts.
   Because non-conforming email clients are expected to treat the
   message as multipart/mixed (in accordance with Sections 5.1.3 and
   5.1.7 of RFC 2046 [RFC2046]), they may show all of the message parts
   sequentially or as attachments.  Including and showing this
   explanatory part will help the message recipient understand the
   message structure.

   This initial message part SHOULD explain briefly to the message
   recipient that the message contains multiple languages and that the
   parts may be rendered sequentially or as attachments.  This SHOULD be
   presented in the same languages that are provided in the subsequent
   language message parts.
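
   As an illustration only, the structure described so far (a
   "multipart/multilingual" container whose first part is a preface with
   no Content-Language field, followed by parts that each carry one)
   could be assembled with Python's standard email package.  The
   function name build_multilingual and its parameters are hypothetical
   and are not defined by this document; this is a sketch under those
   assumptions, not a reference implementation.

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_multilingual(subject, preface, translations):
    """Assemble a multipart/multilingual message (illustrative sketch).

    subject      -- top-level Subject value
    preface      -- explanatory text shown by non-conforming clients
    translations -- list of (language_tag, body_text) pairs, in order
    """
    # The container uses the proposed "multilingual" multipart subtype.
    msg = MIMEMultipart("multilingual")
    msg["Subject"] = subject

    # The preface is the first part and carries no Content-Language field.
    msg.attach(MIMEText(preface, "plain"))

    # Each language part is marked with its own Content-Language field.
    for language_tag, body_text in translations:
        part = MIMEText(body_text, "plain")
        part["Content-Language"] = language_tag
        msg.attach(part)

    return msg
```

   The per-part Subject field and the optional unmatched part are
   omitted from this sketch for brevity.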

   Whilst this section of the message is useful for backward
   compatibility, it SHOULD only be shown when rendered by a
   non-conforming email client because conforming email clients SHOULD
   only show the single language message part identified by the user's
   preferred language (or locale) and the language message part's
   Content-Language.

   For an example of a Multilingual Preface, see the examples in
   Section 7.

3.2.  The Language Message Parts

   The language message parts are translations of the same message
   content.  These message parts MAY be ordered so that the first part
   after the multilingual preface is in the language believed to be the
   most likely to be recognised by the recipient.  All of the language
   message parts MUST have a Content-Language field and a Content-Type
   field and SHOULD have a Subject field.

   The Content-Type for each individual language part MAY be any MIME
   type (including multipart subtypes such as multipart/alternative).
   However, it is recommended that the Content-Type of the language
   parts be kept as simple as possible for interoperability with
   existing email clients.  The language parts are not required to have
   matching Content-Types or multipart structures.  For example, there
   might be an English part of type "text/html" followed by a Spanish
   part of type "application/pdf" followed by a Chinese part of type
   "image/jpeg".  Whatever the Content-Type, the contents SHOULD be
   composed for optimal viewing in the specified language.

3.3.  The Unmatched Message Part

   If there is content intended for the recipient to see if they have a
   preferred language other than one of those specified in the language
   parts, another part MAY be provided.  This would be useful when a
   language-independent graphic is available.
   When this unmatched part is present, it MUST be the last part, MUST
   NOT have a Content-Language field and SHOULD NOT have a Subject
   field.

4.  Message Part Selection

   The logic for selecting the message part to render and present to the
   recipient is straightforward and is summarised in the next few
   paragraphs.

   Firstly, if the email client does not understand
   multipart/multilingual then it SHOULD treat the message as if it were
   multipart/mixed and render message parts accordingly.

   If the email client does understand multipart/multilingual then it
   SHOULD ignore the multilingual preface and select the best match for
   the user's preferred language from the language message parts
   available.  This may be implemented in a variety of ways and is
   dependent on how the email client manages its preferred language
   data.  The ultimate goal is to render the most appropriate
   translation for the user.  Similarly, the subject should be chosen
   from the matched language message part.

   If there is no match for the user's preferred language (or there is
   no preferred language information available) the email client SHOULD
   select the unmatched part (if one exists) or the first language part
   (directly after the multilingual preface) if an unmatched part does
   not exist.  The Subject header field value should be used whenever a
   suitable translation cannot be identified.

   Additionally, interactive implementations MAY offer the user a choice
   from among the available languages.

5.  The Content-Language Field

   The Content-Language field in the individual language message parts
   is used to identify the language in which the message part is
   written.  Based on the value of this field, a conforming email client
   can determine which message part to display (given the user's
   language settings or locale).
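
   The selection rules in Section 4, driven by the Content-Language
   field described above, might be sketched as follows.  This document
   leaves the matching strategy to the implementation, so comparing only
   the primary subtag is an assumption made for this sketch, and the
   names primary_subtag, select_part and make_part are hypothetical.

```python
from email.message import Message


def primary_subtag(tag):
    """Return the lower-cased primary subtag, e.g. 'en' for 'en-GB'."""
    return tag.strip().split("-")[0].lower()


def select_part(parts, preferred_languages):
    """Choose the part of a multipart/multilingual body to render.

    parts               -- message parts in order (preface first)
    preferred_languages -- user's language tags, most preferred first
    """
    # Language parts are those carrying a Content-Language field; the
    # preface and the unmatched part do not carry one.
    language_parts = [p for p in parts if p["Content-Language"]]

    # Assumption: a part matches when any of its tags shares the
    # primary subtag of a preferred language.
    for wanted in preferred_languages:
        for part in language_parts:
            offered = part["Content-Language"].split(",")
            if primary_subtag(wanted) in {primary_subtag(t) for t in offered}:
                return part

    # No match: use the unmatched part if present (last part, with no
    # Content-Language field), otherwise the first language part.
    if len(parts) > 1 and not parts[-1]["Content-Language"]:
        return parts[-1]
    return language_parts[0] if language_parts else None


def make_part(language_tag=None):
    """Hypothetical helper that builds a bare part for experimentation."""
    part = Message()
    if language_tag:
        part["Content-Language"] = language_tag
    return part
```

   A client could equally apply a fuller language-range matching scheme;
   the fallback order shown (unmatched part, then first language part)
   follows Section 4.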

   The Content-Language field MUST comply with RFC 3282 [RFC3282] (which
   defines the Content-Language field) and BCP 47/RFC 5646 [RFC5646]
   (which defines the structure and semantics for the language code
   values).  While RFC 5646 provides a mechanism accommodating
   increasingly fine-grained distinctions, in the interest of maximum
   interoperability, each Content-Language value SHOULD be restricted to
   the largest granularity of language tags; in other words, it is
   RECOMMENDED to specify only a Primary-subtag and NOT to include
   subtags (e.g., for region or dialect) unless the languages might be
   mutually incomprehensible without them.  Examples of this field for
   English, German and an instruction manual in Spanish and French could
   look like the following:

   Content-Language: en

   Content-Language: de

   Content-Language: es, fr

6.  The Subject Field in the Language Message Parts

   On receipt of the message, conforming email clients will need to
   render the subject in the correct language for the recipient.  To
   enable this, the Subject field SHOULD be provided in each language
   message part.  The value for this field should be a translation of
   the email subject.

   US-ASCII and 'encoded-word' examples of this field may look like
   this:

   Subject: A really simple email subject

   Subject: =?iso-8859-1?Q?un_asunto_de_correo_electr=F3nico_sencillo?=

   See RFC 2047 [RFC2047] for the specification of 'encoded-word'.

7.  Examples

7.1.  An Example of a Simple Multiple Language Email Message

   Below is an example of a simple multiple language email message
   formatted using the method detailed in this document.

   From: Nik
   To: Nathaniel
   Subject: example of a message in Spanish and English
   Content-type: multipart/multilingual; boundary=01189998819991197253

   --01189998819991197253

   This is a message in two languages: English and Spanish. It says the
   same thing in each language.
   If you can read it in one language,
   you can ignore the other translations. The other translations may be
   presented as attachments or grouped together.

   Este es un mensaje en dos idiomas: Ingles y Espanol. Dice lo mismo en
   cada idioma. Si puede leerlo en un idioma, puede ignorar las otras
   traducciones. Las otras traducciones pueden presentarse como archivos
   adjuntos o agrupados.

   --01189998819991197253
   Content-Language: en
   Content-Type: text/plain
   Subject: example of a message in Spanish and English

   Hello, this message content is provided in your language.

   --01189998819991197253
   Content-Language: es
   Content-Type: text/plain
   Subject: =?iso-8859-1?Q?ejemplo_pr=E1ctico_de_mensaje_?=
    =?iso-8859-1?Q?en_espa=F1ol_e_ingl=E9s?=

   Hola, el contenido de este mensaje esta disponible en su idioma.

   --01189998819991197253
   Content-Type: image/gif

   ..GIF image showing iconic or language-independent content here..

   --01189998819991197253--

7.2.  An Example of a Complex Multiple Language Email Message

   Below is an example of a more complex multiple language email message
   formatted using the method detailed in this document.  Note that the
   language message parts have multipart contents and would therefore
   require further processing to determine the content to display.

   From: Nik
   To: Nathaniel
   Subject: example of a message in Spanish and English
   Content-type: multipart/multilingual; boundary=01189998819991197253

   --01189998819991197253

   This is a message in two languages: English and Spanish. It says the
   same thing in each language. If you can read it in one language,
   you can ignore the other translations. The other translations may be
   presented as attachments or grouped together.

   Este es un mensaje en dos idiomas: Ingles y Espanol. Dice lo mismo en
   cada idioma. Si puede leerlo en un idioma, puede ignorar las otras
   traducciones.
   Las otras traducciones pueden presentarse como archivos
   adjuntos o agrupados.

   --01189998819991197253
   Content-Language: en
   Content-Type: multipart/alternative; boundary=multipartaltboundary
   Subject: example of a message in Spanish and English

   --multipartaltboundary
   Content-Type: text/plain

   Hello, this message content is provided in your language.

   --multipartaltboundary
   Content-Type: text/html

   Hello, this message content is provided in your language.

   --multipartaltboundary--

   --01189998819991197253
   Content-Language: es
   Content-Type: multipart/mixed; boundary=multipartmixboundary
   Subject: =?iso-8859-1?Q?ejemplo_pr=E1ctico_de_mensaje_?=
    =?iso-8859-1?Q?en_espa=F1ol_e_ingl=E9s?=

   --multipartmixboundary
   Content-Type: application/pdf

   ..PDF file in Spanish here..

   --multipartmixboundary
   Content-Type: image/jpeg

   ..JPEG image showing Spanish content here..

   --multipartmixboundary--

   --01189998819991197253
   Content-Type: image/gif

   ..GIF image showing iconic or language-independent content here..

   --01189998819991197253--

8.  Changes from Previous Versions

8.1.  Changes from draft-tomkinson-multilangcontent-01 to
      draft-tomkinson-slim-multilangcontent-00

   o  File name and version number changed to reflect the proposed WG
      name SLIM (Selection of Language for Internet Media).

   o  Replaced the Subject-Translation field in the language message
      parts with Subject and provided US-ASCII and non-US-ASCII
      examples.

   o  Introduced the language-independent unmatched message part.

   o  Many wording improvements and clarifications throughout the
      document.

9.  Acknowledgements

   The authors are grateful for the helpful input received from many
   people but would especially like to acknowledge the help of Harald
   Alvestrand, Mark Davis, Doug Ewell, Randall Gellens, Alexey Melnikov,
   Fiona Tomkinson, Simon Tyler and Daniel Vargha.  The authors would
   also like to thank Luis de Pablo for his work on the Spanish
   translations.

10.  IANA Considerations

   The multipart/multilingual MIME type will be registered with IANA.

11.  Security Considerations

   This document has no additional security considerations beyond those
   that apply to the standards and procedures on which it is built.

12.  References

12.1.  Normative References

   [RFC2046]  Freed, N. and N.
              Borenstein, "Multipurpose Internet Mail
              Extensions (MIME) Part Two: Media Types", RFC 2046,
              November 1996.

   [RFC2047]  Moore, K., "MIME (Multipurpose Internet Mail Extensions)
              Part Three: Message Header Extensions for Non-ASCII Text",
              RFC 2047, November 1996.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC3282]  Alvestrand, H., "Content Language Headers", RFC 3282, May
              2002.

   [RFC5646]  Phillips, A. and M. Davis, "Tags for Identifying
              Languages", BCP 47, RFC 5646, September 2009.

12.2.  Informative References

   [I-D.gellens-slim-negotiating-human-language]
              Gellens, R., "Negotiating Human Language in Real-Time
              Communications", draft-gellens-slim-negotiating-human-
              language-00 (work in progress), October 2014.

Authors' Addresses

   Nik Tomkinson
   Mimecast Ltd
   CityPoint, One Ropemaker Street
   London  EC2Y 9AW
   United Kingdom

   Email: rfc.nik.tomkinson@gmail.com


   Nathaniel Borenstein
   Mimecast Ltd
   480 Pleasant Street
   Watertown  MA 02472
   North America

   Email: nsb@mimecast.com