IETF                                                        N. Tomkinson
Internet-Draft                                             N. Borenstein
Intended status: Standards Track                            Mimecast Ltd
Expires: May 5, 2016                                    November 2, 2015

                     Multiple Language Content Type
                  draft-ietf-slim-multilangcontent-00

Abstract

   This document defines an addition to the Multipurpose Internet Mail
   Extensions (MIME) standard to make it possible to send one message
   that contains multiple language versions of the same information.
   The translations would be identified by a language tag and selected
   by the email client based on a user's language settings.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on May 5, 2016.

Copyright Notice

   Copyright (c) 2015 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

1.  Introduction

   Since the invention of email and the rapid spread of the Internet,
   more and more people have been able to communicate in more and more
   countries and in more and more languages. But during this time of
   technological evolution, email has remained a single-language
   communication tool, whether it is English to English, Spanish to
   Spanish or Japanese to Japanese.

   Also during this time, many corporations have established their
   offices in multi-cultural cities and formed departments and teams
   that span continents, cultures and languages, so the need to
   communicate efficiently with little margin for miscommunication has
   grown exponentially.

   The objective of this document is to define an addition to the
   Multipurpose Internet Mail Extensions (MIME) standard, to make it
   possible to send a single message to a group of people in such a way
   that all of the recipients can read the email in their preferred
   language. The methods of translation of the message content are
   beyond the scope of this document, but the structure of the email
   itself is defined herein.

   Whilst this document depends on identification of language in message
   parts for non-real-time communication, there is a companion document
   that is concerned with a similar problem for real-time communication:
   [I-D.gellens-slim-negotiating-human-language]

1.1.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

2.  The Content-Type Header Field

   The "multipart/multilingual" MIME subtype allows the sending of a
   message in a number of different languages with the translations
   embedded in the same message. This MIME subtype helps the receiving
   email client make sense of the message structure.

   The multipart subtype "multipart/multilingual" has similar semantics
   to "multipart/alternative" (as discussed in RFC 2046 [RFC2046]) in
   that each of the message parts is an alternative version of the same
   information.
   The primary difference between "multipart/multilingual" and
   "multipart/alternative" is that when using "multipart/multilingual",
   the message part to select for rendering is chosen based on the
   values of the Content-Language field and optionally the
   Translation-Type parameter of the Content-Language field instead of
   the ordering of the parts and the Content-Types.

   The syntax for this multipart subtype conforms to the common syntax
   for subtypes of multipart given in Section 5.1.1 of RFC 2046
   [RFC2046]. An example "multipart/multilingual" Content-Type header
   field would look like this:

   Content-Type: multipart/multilingual; boundary=01189998819991197253

3.  The Message Parts

   A multipart/multilingual message will have a number of message parts:
   exactly one multilingual preface, one or more language message parts
   and zero or one language independent message part. The details of
   these are described below.

3.1.  The Multilingual Preface

   In order for the message to be received and displayed in
   non-conforming email clients, the message SHOULD contain an
   explanatory message part which MUST NOT be marked with a
   Content-Language field and MUST be the first of the message parts.
   Because non-conforming email clients are expected to treat the
   message as multipart/mixed (in accordance with Sections 5.1.3 and
   5.1.7 of RFC 2046 [RFC2046]) they may show all of the message parts
   sequentially or as attachments. Including and showing this
   explanatory part will help the message recipient understand the
   message structure.

   This initial message part SHOULD explain briefly to the recipient
   that the message contains multiple languages and the parts may be
   rendered sequentially or as attachments. This SHOULD be presented in
   the same languages that are provided in the subsequent language
   message parts.
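   As a non-normative illustration, the container and preface described
   so far might be assembled as in the following minimal sketch. It
   assumes Python's standard email package; the helper names
   build_container and preface_is_well_formed are hypothetical and are
   not defined by this document.

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_container(preface_text, boundary="01189998819991197253"):
    """Hypothetical helper: create a multipart/multilingual container
    whose first part is the multilingual preface."""
    container = MIMEMultipart("multilingual", boundary=boundary)
    container["Content-Disposition"] = "inline"

    # The preface is UTF-8 text and deliberately carries no
    # Content-Language field. Python chooses the transfer encoding for
    # UTF-8 text itself (base64 by default), which differs from the
    # quoted-printable encoding a sender might prefer.
    preface = MIMEText(preface_text, "plain", "utf-8")
    preface["Content-Disposition"] = "inline"
    container.attach(preface)
    return container


def preface_is_well_formed(container):
    """Hypothetical check: the first part exists and has no
    Content-Language field."""
    parts = container.get_payload()
    return bool(parts) and parts[0]["Content-Language"] is None
```

   Language message parts would then be attached after the preface, each
   with its own Content-Language field.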

   As this explanatory section is likely to contain languages using
   scripts that require non-US-ASCII characters, it is RECOMMENDED that
   UTF-8 encoding is used for this message part.

   Whilst this section of the message is useful for backward
   compatibility, it will normally only be shown when rendered by a
   non-conforming email client, because conforming email clients SHOULD
   only show the single language message part identified by the user's
   preferred language and the language message part's Content-Language.

   For the correct display of the multilingual preface in a
   non-conforming email client, the sender MAY use the
   Content-Disposition field with a value of 'inline' in conformance
   with RFC 2183 [RFC2183] (which defines the Content-Disposition
   field). If provided, this SHOULD be placed at the
   multipart/multilingual level and in the multilingual preface. This
   makes it clear to a non-conforming email client that the multilingual
   preface should be displayed immediately to the recipient, followed by
   any subsequent parts marked as 'inline'.

   For an example of a multilingual preface, see the examples in
   Section 8.

3.2.  The Language Message Parts

   The language message parts are typically translations of the same
   message content. These message parts SHOULD be ordered so that the
   first part after the multilingual preface is in the language believed
   to be the most likely to be recognised by the recipient, as this will
   constitute the default part when language negotiation fails and there
   is no Language Independent part. All of the language message parts
   MUST have a Content-Language field and a Content-Type field; they
   SHOULD have a Subject field and MAY have a Translation-Type parameter
   applied to the Content-Language field.

   The Content-Type for each individual language part MAY be any MIME
   type (including multipart subtypes such as multipart/alternative).
   However, it is RECOMMENDED that the Content-Type of the language
   parts is kept as simple as possible for interoperability with
   existing email clients. The language parts are not required to have
   matching Content-Types or multipart structures. For example, there
   might be an English part of type "text/html" followed by a Spanish
   part of type "application/pdf" followed by a Chinese part of type
   "image/jpeg". Whatever the Content-Type, the contents SHOULD be
   composed for optimal viewing in the specified language.

   For a non-multipart type, it is RECOMMENDED that the sender applies a
   Name parameter to the Content-Type field. This will help the
   recipient identify the translations when the translations are
   rendered as attachments by a non-conforming email client.

   Examples of this parameter include:

   Content-Type: text/plain; name="english.txt"

   Content-Type: text/plain; name="espanol.txt"

   Content-Type: application/pdf; name="hellenic.pdf"

3.3.  The Language Independent Message Part

   If there is language independent content intended for the recipient
   to see if they have a preferred language other than one of those
   specified in the language message parts and the default language
   message part is unlikely to be understood, another part MAY be
   provided. This could typically be a language independent graphic.
   When this part is present, it MUST be the last part, MUST have a
   Content-Language field with a value of "zxx" (as described in
   BCP 47/RFC 5646 [RFC5646]) and SHOULD NOT have a Subject field.

4.  Message Part Selection

   The logic for selecting the message part to render and present to the
   recipient is summarised in the next few paragraphs.

   Firstly, if the email client does not understand
   multipart/multilingual then it SHOULD treat the message as if it were
   multipart/mixed and render message parts accordingly.
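   As a non-normative illustration of the selection summarised in this
   section, a conforming client might proceed roughly as in the sketch
   below. It assumes Python's standard email package and an already
   parsed multipart/multilingual message. The names language_tags,
   select_part and EXAMPLE_RAW are hypothetical, the matching is only a
   simplified form of the RFC 4647 Lookup scheme, and the
   Translation-Type parameter is ignored.

```python
from email import message_from_string


def language_tags(part):
    """Return the lowercased language tags from a part's Content-Language
    field, dropping any parameters such as translation-type."""
    raw = part.get("Content-Language", "")
    tag_list = raw.split(";", 1)[0]
    return [tag.strip().lower() for tag in tag_list.split(",") if tag.strip()]


def select_part(message, preferred_languages):
    """Pick the part to render from a parsed multipart/multilingual
    message, given the user's languages in order of preference."""
    language_parts = message.get_payload()[1:]  # skip the multilingual preface

    # Simplified RFC 4647 Lookup: try each preference as given, then
    # progressively drop its trailing subtag ("es-MX" -> "es").
    for preference in preferred_languages:
        language_range = preference.lower()
        while language_range:
            for part in language_parts:
                if language_range in language_tags(part):
                    return part
            language_range = language_range.rpartition("-")[0]

    # No match: prefer the language independent part ("zxx") if present,
    for part in language_parts:
        if language_tags(part) == ["zxx"]:
            return part
    # otherwise fall back to the first language part after the preface.
    return language_parts[0] if language_parts else None


# A small illustrative message; bodies are plain text for brevity.
EXAMPLE_RAW = "\n".join([
    "Content-Type: multipart/multilingual; boundary=example-boundary",
    "",
    "--example-boundary",
    "Content-Type: text/plain",
    "",
    "This is a message in multiple languages.",
    "--example-boundary",
    "Content-Language: en; translation-type=original",
    "Content-Type: text/plain",
    "",
    "Hello",
    "--example-boundary",
    "Content-Language: es; translation-type=human",
    "Content-Type: text/plain",
    "",
    "Hola",
    "--example-boundary",
    "Content-Language: zxx",
    "Content-Type: text/plain",
    "",
    "(language independent content)",
    "--example-boundary--",
    "",
])
```

   For instance, select_part(message_from_string(EXAMPLE_RAW),
   ["es-MX"]) would be expected to return the Spanish part, because the
   range "es-MX" is truncated to "es" before it matches.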

   If the email client does understand multipart/multilingual then it
   SHOULD ignore the multilingual preface and select the best match for
   the user's preferred language from the language message parts
   available. Also, the user may prefer to see the original message
   content in their second language over a machine translation in their
   first language. The Translation-Type parameter of the
   Content-Language field value can be used for further selection based
   on this preference. The selection of language part may be implemented
   in a variety of ways, although the matching schemes detailed in
   RFC 4647 [RFC4647] are RECOMMENDED as a starting point for an
   implementation. The goal is to render the most appropriate
   translation for the user.

   If there is no match for the user's preferred language (or there is
   no preferred language information available) the email client SHOULD
   select the language independent part (if one exists) or the first
   language part (directly after the multilingual preface) if a language
   independent part does not exist.

   If there is no translation type preference information available, the
   values of the Translation-Type parameter may be ignored.

   Additionally, interactive implementations MAY offer the user a choice
   from among the available languages.

5.  The Content-Language Field

   The Content-Language field in the individual language message parts
   is used to identify the language in which the message part is
   written. Based on the value of this field, a conforming email client
   can determine which message part to display (given the user's
   language settings).

   The Content-Language MUST comply with RFC 3282 [RFC3282] (which
   defines the Content-Language field) and BCP 47/RFC 5646 [RFC5646]
   (which defines the structure and semantics for the language code
   values).
   While RFC 5646 provides a mechanism accommodating increasingly
   fine-grained distinctions, in the interest of maximum
   interoperability, each Content-Language value SHOULD be restricted to
   the largest granularity of language tags; in other words, it is
   RECOMMENDED to specify only a Primary-subtag and NOT to include
   subtags (e.g., for region or dialect) unless the languages might be
   mutually incomprehensible without them. Examples of this field for
   English, German, and an instruction manual in Spanish and French
   could look like the following:

   Content-Language: en

   Content-Language: de

   Content-Language: es, fr

6.  The Translation-Type Parameter

   The Translation-Type parameter can be applied to the Content-Language
   field in the individual language message parts and is used to
   identify the type of translation. Based on the value of this
   parameter and the user's preferences, a conforming email client can
   determine which message part to display.

   This parameter can have one of three possible values: 'original',
   'human' or 'automated', although other values may be added in the
   future. A value of 'original' is given in the language message part
   that is in the original language. A value of 'human' is used when a
   language message part is translated by a human translator or a human
   has checked and corrected an automated translation. A value of
   'automated' is used when a language message part has been translated
   by an electronic agent without proofreading or subsequent correction.

   Examples of this parameter include:

   Content-Language: en; translation-type=original
   Content-Language: fr; translation-type=human

7.  The Subject Field in the Language Message parts

   On receipt of the message, conforming email clients will need to
   render the subject in the correct language for the recipient.
   To enable this, the Subject field SHOULD be provided in each language
   message part. The value for this field should be a translation of
   the email subject.

   US-ASCII and 'encoded-word' examples of this field include:

   Subject: A really simple email subject

   Subject: =?UTF-8?Q?Un_asunto_de_correo_electr=C3=b3nico_
    realmente_sencillo?=

   See RFC 2047 [RFC2047] for the specification of 'encoded-word'.

   The subject to be presented to the recipient should be selected from
   the message part identified during the message part selection stage.
   If no Subject field is found (for example if the language independent
   part is selected) the top-level Subject header field value should be
   used.

8.  Examples

8.1.  An Example of a Simple Multiple language email message

   From: Nik
   To: Nathaniel
   Subject: example of a message in Spanish and English
   Content-Type: multipart/multilingual; boundary=01189998819991197253
   Content-Disposition: inline

   --01189998819991197253
   Content-type: text/plain; charset="UTF-8"
   Content-transfer-encoding: quoted-printable
   Content-Disposition: inline

   This is a message in multiple languages. It says the
   same thing in each language. If you can read it in one language,
   you can ignore the other translations. The other translations may be
   presented as attachments or grouped together.

   Este es un mensaje en varios idiomas. Dice lo mismo en
   cada idioma. Si puede leerlo en un idioma, puede ignorar las otras
   traducciones. Cualquier otra traducci=C3=B3n puede presentarse como
   un archivo adjunto o agrupado.

   --01189998819991197253
   Content-Language: en; translation-type=original
   Content-Type: text/plain; name="english.txt"
   Content-Disposition: inline
   Subject: example of a message in Spanish and English

   Hello, this message content is provided in your language.

   --01189998819991197253
   Content-Language: es; translation-type=human
   Content-Type: text/plain; name="espanol.txt"
   Content-Disposition: inline
   Subject: =?UTF-8?Q?ejemplo_pr=C3=A1ctico_de_mensaje_
    en_espa=C3=B1ol_e_ingl=C3=A9s?=

   Hola, el contenido de este mensaje esta disponible en su idioma.

   --01189998819991197253
   Content-Language: zxx
   Content-Type: image/gif
   Content-Disposition: inline

   ..GIF image showing iconic or language-independent content here..

   --01189998819991197253--

8.2.  An Example of a Complex Multiple language email message

   Below is an example of a more complex multiple language email message
   formatted using the method detailed in this document. Note that the
   language message parts have multipart contents and would therefore
   require further processing to determine the content to display.

   From: Nik
   To: Nathaniel
   Subject: example of a message in Spanish and English
   Content-Type: multipart/multilingual; boundary=01189998819991197253
   Content-Disposition: inline

   --01189998819991197253
   Content-type: text/plain; charset="UTF-8"
   Content-transfer-encoding: quoted-printable
   Content-Disposition: inline

   This is a message in multiple languages. It says the
   same thing in each language. If you can read it in one language,
   you can ignore the other translations. The other translations may be
   presented as attachments or grouped together.

   Este es un mensaje en varios idiomas. Dice lo mismo en
   cada idioma. Si puede leerlo en un idioma, puede ignorar las otras
   traducciones. Cualquier otra traducci=C3=B3n puede presentarse como
   un archivo adjunto o agrupado.

   --01189998819991197253
   Content-Language: en; translation-type=original
   Content-Type: multipart/alternative; boundary=multipartaltboundary
   Subject: example of a message in Spanish and English

   --multipartaltboundary
   Content-Type: text/plain; name="english.txt"

   Hello, this message content is provided in your language.

   --multipartaltboundary
   Content-Type: text/html; name="english.html"

   Hello, this message content is provided in your language.

   --multipartaltboundary--

   --01189998819991197253
   Content-Language: es; translation-type=human
   Content-Type: multipart/mixed; boundary=multipartmixboundary
   Subject: =?UTF-8?Q?ejemplo_pr=C3=A1ctico_de_mensaje_
    en_espa=C3=B1ol_e_ingl=C3=A9s?=

   --multipartmixboundary
   Content-Type: application/pdf; name="espanol.pdf"

   ..PDF file in Spanish here..

   --multipartmixboundary
   Content-Type: image/jpeg; name="espanol.jpg"

   ..JPEG image showing Spanish content here..

   --multipartmixboundary--

   --01189998819991197253
   Content-Language: zxx
   Content-Type: image/gif
   Content-Disposition: inline

   ..GIF image showing iconic or language-independent content here..

   --01189998819991197253--

9.  Changes from Previous Versions

9.1.  Changes from draft-tomkinson-multilangcontent-01 to
      draft-tomkinson-slim-multilangcontent-00

   o  File name and version number changed to reflect the proposed WG
      name SLIM (Selection of Language for Internet Media).

   o  Replaced the Subject-Translation field in the language message
      parts with Subject and provided US-ASCII and non-US-ASCII
      examples.

   o  Introduced the language-independent message part.

   o  Many wording improvements and clarifications throughout the
      document.

9.2.  Changes from draft-tomkinson-slim-multilangcontent-00 to
      draft-tomkinson-slim-multilangcontent-01

   o  Added Translation-Type in each language message part to identify
      the source of the translation (original/human/automated).

9.3.  Changes from draft-tomkinson-slim-multilangcontent-01 to
      draft-tomkinson-slim-multilangcontent-02

   o  Changed Translation-Type to be a parameter for the
      Content-Language field rather than a new separate field.

   o  Added a paragraph about using Content-Disposition field to help
      non-conforming mail clients correctly render the multilingual
      preface.

   o  Recommended using a Name parameter on the language part
      Content-Type to help the recipient identify the translations in
      non-conforming mail clients.

   o  Many wording improvements and clarifications throughout the
      document.

9.4.  Changes from draft-tomkinson-slim-multilangcontent-02 to
      draft-ietf-slim-multilangcontent-00

   o  Name change to reflect the draft being accepted into SLIM as a
      working group document.

   o  Updated examples to use UTF-8 encoding where required.

   o  Removed references to 'locale' for identifying language
      preference.

   o  Recommended language matching schemes from RFC 4647 [RFC4647].

   o  Renamed the unmatched part to language independent part to
      reinforce its intended purpose.

   o  Added requirement for using Content-Language: zxx in the language
      independent part.

   o  Many wording improvements and clarifications throughout the
      document.

10.  Acknowledgements

   The authors are grateful for the helpful input received from many
   people but would especially like to acknowledge the help of Harald
   Alvestrand, Stephane Bortzmeyer, Eric Burger, Mark Davis, Doug Ewell,
   Randall Gellens, Gunnar Hellstrom, Alexey Melnikov, Addison Phillips,
   Fiona Tomkinson, Simon Tyler and Daniel Vargha. The authors would
   also like to thank Fernando Alvaro and Luis de Pablo for their work
   on the Spanish translations.

11.  IANA Considerations

   The multipart/multilingual MIME type will be registered with IANA.

12.  Security Considerations

   This document has no additional security considerations beyond those
   that apply to the standards and procedures on which it is built.

13.  References

13.1.  Normative References

   [RFC2046]  Freed, N. and N. Borenstein, "Multipurpose Internet Mail
              Extensions (MIME) Part Two: Media Types", RFC 2046,
              DOI 10.17487/RFC2046, November 1996.

   [RFC2047]  Moore, K., "MIME (Multipurpose Internet Mail Extensions)
              Part Three: Message Header Extensions for Non-ASCII Text",
              RFC 2047, DOI 10.17487/RFC2047, November 1996.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC2183]  Troost, R., Dorner, S., and K. Moore, Ed., "Communicating
              Presentation Information in Internet Messages: The
              Content-Disposition Header Field", RFC 2183,
              DOI 10.17487/RFC2183, August 1997.

   [RFC3282]  Alvestrand, H., "Content Language Headers", RFC 3282,
              DOI 10.17487/RFC3282, May 2002.

   [RFC4647]  Phillips, A. and M. Davis, "Matching of Language Tags",
              BCP 47, RFC 4647, DOI 10.17487/RFC4647, September 2006.

   [RFC5646]  Phillips, A., Ed. and M. Davis, Ed., "Tags for Identifying
              Languages", BCP 47, RFC 5646, DOI 10.17487/RFC5646,
              September 2009.

13.2.  Informational References

   [I-D.gellens-slim-negotiating-human-language]
              Gellens, R., "Negotiating Human Language in Real-Time
              Communications",
              draft-gellens-slim-negotiating-human-language-02 (work in
              progress), July 2015.

Authors' Addresses

   Nik Tomkinson
   Mimecast Ltd
   CityPoint, One Ropemaker Street
   London EC2Y 9AW
   United Kingdom

   Email: rfc.nik.tomkinson@gmail.com

   Nathaniel Borenstein
   Mimecast Ltd
   480 Pleasant Street
   Watertown MA 02472
   North America

   Email: nsb@mimecast.com