idnits 2.17.1

draft-ietf-mailext-lang-tag-01.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Cannot find the required boilerplate sections (Copyright, IPR, etc.) in
     this document.
     Expected boilerplate is as follows today (2024-04-25) according to
     https://trustee.ietf.org/license-info :

       IETF Trust Legal Provisions of 28-dec-2009, Section 6.a:

          This Internet-Draft is submitted in full conformance with the
          provisions of BCP 78 and BCP 79.

       IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 2:

          Copyright (c) 2024 IETF Trust and the persons identified as the
          document authors. All rights reserved.

       IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 3:

          This document is subject to BCP 78 and the IETF Trust's Legal
          Provisions Relating to IETF Documents
          (https://trustee.ietf.org/license-info) in effect on the date of
          publication of this document. Please review these documents
          carefully, as they describe your rights and restrictions with respect
          to this document. Code Components extracted from this document must
          include Simplified BSD License text as described in Section 4.e of
          the Trust Legal Provisions and are provided without warranty as
          described in the Simplified BSD License.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  ** Missing document type: Expected "INTERNET-DRAFT" in the upper left hand
     corner of the first page

  ** Missing expiration date. The document expiration date should appear on
     the first and last page.

  ** The document seems to lack a 1id_guidelines paragraph about
     Internet-Drafts being working documents.

  ** The document seems to lack a 1id_guidelines paragraph about 6 months
     document validity.

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     current Internet-Drafts.

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     Shadow Directories.

  ** Expected the document's filename to be given on the first page, but
     didn't find any

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard

  == It seems as if not all pages are separated by form feeds - found 0 form
     feeds but 12 pages

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an Introduction section.
     (A line matching the expected section header was found, but with an
     unexpected indentation: ' 1. Introduction' )

  ** The document seems to lack an IANA Considerations section. (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** The document seems to lack an Authors' Addresses Section.

  ** There are 3 instances of too long lines in the document, the longest one
     being 2 characters in excess of 72.

  ** The abstract seems to contain references ([ISO639], [ISO3166], [RFC1521],
     [RFC1327]), which it shouldn't. Please replace those with straight
     textual mentions of the documents in question.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == Couldn't figure out when the document was first submitted -- there may
     comments or warnings related to the use of a disclaimer for pre-RFC5378
     work that could not be issued because of this. Please check the Legal
     Provisions document at https://trustee.ietf.org/license-info to determine
     if you need the pre-RFC5378 disclaimer.

  -- The document date () is 739383 days in the past. Is this intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  -- Missing reference section? 'RFC 1521' on line 452 looks like a reference

  -- Missing reference section? 'ISO 639' on line 441 looks like a reference

  -- Missing reference section? 'ISO 3166' on line 447 looks like a reference

  -- Missing reference section? 'RFC 1327' on line 457 looks like a reference

     Summary: 13 errors (**), 0 flaws (~~), 3 warnings (==), 5 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

draft                     Language Tag                    August 94

Tags for the identification of languages

Fri Aug 5 14:21:43 MET DST 1994

Harald Tveit Alvestrand
UNINETT
Harald.T.Alvestrand@uninett.no

Abstract

This document describes a language tag for use in cases where it
is desired to indicate the language used in an information object.

It also defines a Content-language: header, for use in the case
where one desires to indicate the language of something that has
RFC-822-like headers, like MIME body parts or Web documents, and a
new parameter to the Multipart/Alternative type, to aid in the
usage of the Content-Language: header.

Status of this Memo

This draft document is being circulated for comment.

If consensus is reached it may be submitted to the RFC editor as a
Proposed Standard protocol specification.

Please send comments to the author, or to the MAILEXT mailing list

The following text is required by the Internet-draft rules:

This document is an Internet Draft. Internet Drafts are working
documents of the Internet Engineering Task Force (IETF), its
Areas, and its Working Groups.
Note that other groups may also distribute working documents as
Internet Drafts.

Internet Drafts are draft documents valid for a maximum of six
months. Internet Drafts may be updated, replaced, or obsoleted by
other documents at any time. It is not appropriate to use
Internet Drafts as reference material or to cite them other than
as a "working draft" or "work in progress."

Please check the I-D abstract listing contained in each Internet
Draft directory to learn the current status of this or any other
Internet Draft.

The filename of this document is draft-mailext-lang-tag-01.txt

1. Introduction

There are a number of languages spoken by human beings in this
world.

A great number of these people would prefer to have information
presented in a language that they understand.

In some contexts, it is possible to have information in more than
one language, or it might be possible to provide tools for
assisting in the understanding of a language (like dictionaries).

A prerequisite for any such function is a means of labelling the
information content with an identifier for the language in which
it is written.

In the tradition of solving only problems that we think we
understand, this document specifies an identifier mechanism, and
one possible use for it.

2. The Language tag

The language tag is composed of 1 or more parts: A main language
tag and a (possibly empty) series of subtags.

The syntax of this tag in RFC-822 EBNF is:

    Language-Tag = Tag-List
    Tag-List = Tag-Component *[ '-', Tag-List ]
    Tag-Component = 1*8ALPHA

Whitespace is not allowed within the tag.

All tags are to be treated as case insensitive; there exist
conventions for capitalization of some of them, but these should
not be taken to carry meaning.

The namespace of language tags and subtags is administered by the
IANA.
The following registrations are predefined:

In the language tag:

- All 2-letter codes are interpreted according to ISO 639.

- The value "i" is reserved for IANA-defined registrations.

- The value "x" is reserved for private use. Subtags of "X"
  will not be registered by the IANA.

- Other values cannot be assigned except by updating this
  standard.

The reason for reserving all other tags is to be open towards new
revisions of ISO 639; the use of "i" and "x" is the minimum we can
do here to be able to extend the mechanism to meet our
requirements.

In the first subtag:

- All 2-letter codes are interpreted as ISO 3166 country codes,
  according to the rules laid down in ISO 639.

- Codes of 3 to 8 letters may be registered with the IANA by
  anyone who feels a need for it. IANA has the right to reject
  registrations that are felt to be misleading.

The information in the subtag may for instance be:

- Country identification, such as en-US (this usage is
  described in ISO 639)

- Dialect or variant information, such as no-NYNORSK or
  en-COCKNEY

- Languages not listed in ISO 639 that are not variants of any
  listed language, which can be registered with the i- prefix,
  such as i-cherokee

- Script variations, such as az-arabic and az-cyrillic

In the second and subsequent subtags, any value can be registered.

NOTE: The ISO 639/ISO 3166 convention is that language names are
written in lower case, while country codes are written in upper
case. This convention is recommended, but not enforced; the tags
are case insensitive.

NOTE: ISO 639 defines a registration authority for additions to
and changes in the list of languages in ISO 639. This authority
is:

    International Information Centre for Terminology (Infoterm)
    P.O. Box 130
    A-1021 Wien
    Austria
    Phone: +43 1 26 75 35 Ext. 312
    Fax: +43 1 216 32 72

The following codes have been added in 1989 (nothing later): ug
(Uigur), iu (Inuktitut, also called Eskimo), za (Zhuang), he
(Hebrew, replacing iw), yi (Yiddish, replacing ji), and id
(Indonesian, replacing in).

NOTE: The registration agency for ISO 3166 (country codes) is:

    ISO 3166 Maintenance Agency Secretariat
    c/o DIN Deutsches Institut für Normung
    Burggrafenstrasse 6
    Postfach 1107
    D-10787 Berlin
    Germany
    Phone: +49 30 26 01 320
    Fax: +49 30 26 01 231

The codes AA, QM-QZ, XA-XZ and ZZ are reserved by ISO 3166 as
user-assigned codes.

2.1. Meaning of the language tag

The language tag always defines a language as spoken (or written)
by human beings for communication of information to other human
beings. Computer languages are explicitly excluded.

There is no guaranteed relationship between languages that start
out with the same series of tags; especially, they are NOT
guaranteed to be mutually comprehensible, although this will
sometimes be the case.

Applications should always treat language tags as a single token;
the division into subtags is an administrative mechanism, not a
navigation aid.

The relationship between the tag and the information it relates to
is defined by the standard describing the context in which it
appears. So, this section can only give possible examples of its
usage.

- For a single information object, it should be taken as the
  set of languages that is required for a complete
  comprehension of the complete object. Example: Simple text.

- For an aggregation of information objects, it should be taken
  as the set of languages used inside components of that
  aggregation. Examples: Document stores and libraries.

- For information objects whose purpose in life is providing
  alternatives, it should be regarded as a hint that the
  material inside is provided in several languages, and that
  one has to inspect each of the alternatives in order to find
  its language or languages. In this case, multiple languages
  need not mean that one needs to be multilingual to get
  complete understanding of the document. Example: MIME
  multipart/alternative.

- It would be possible to define (for instance) an SGML DTD
  that defines a tag for indicating that following or
  contained text is written in this language, such that one
  could write "C'est la vie"; the Norwegian-speaking user
  could then access a French-Norwegian dictionary to find out
  what the quote meant.

3. The Content-language header

The RFC-822 ABNF of the Language header is:

    Language-Header = "Content-Language" ":" 1#Language-tag

Note that the Language-Header is allowed to list several languages
in a comma-separated list.

Whitespace is allowed, which means also that one can place
parenthesized comments anywhere in the language sequence.

3.1. Examples of Content-language values

NOTE: NONE of the subtags shown in this document have actually
been assigned; they are used for illustration purposes only.

Norwegian official document, with parallel text in both official
versions of Norwegian. (Both versions are readable by all
Norwegians).

    Content-Type: multipart/alternative; differences=content-language
    Content-Language: no-nynorsk, no-bokmaal

Voice recording from the London docks

    Content-type: audio/basic
    Content-Language: en-cockney

Document in Sami, which does not have an ISO 639 code, and is
spoken in several countries, but with about half the speakers in
Norway, with six different, mutually incomprehensible dialects:

    Content-type: text/plain; charset=iso-8859-10
    Content-Language: i-sami-no (North Sami)

An English-French dictionary

    Content-type: application/dictionary
    Content-Language: en, fr (This is a dictionary)

An official EC document (in a few of its official languages)

    Content-type: multipart/alternative
    Content-Language: en, fr, de, da, el, it

An excerpt from Star Trek

    Content-type: video/mpeg
    Content-Language: x-klingon

4. Use of Content-Language with Multipart/Alternative

When using the Multipart/Alternative body part of MIME, it is
possible to have the body parts giving the same information
content in different languages. In this case, one should put a
Content-Language header on each of the body parts, and a summary
Content-Language header onto the Multipart/Alternative itself.

4.1. The differences parameter to multipart/alternative

As defined in RFC 1521, Multipart/Alternative only has one
parameter: boundary.

The common usage of Multipart/Alternative is to have more than one
format of the same message (for example, PostScript and ASCII).

The use of language tags to differentiate between different
alternatives will certainly not lead all MIME UAs to present the
most sensible body part as default.

Therefore, a new parameter is defined, to allow the configuration
of MIME readers to handle language differences in a sensible
manner.

    Name: Differences
    Value: One or more of
        Content-Type
        Content-Language

Further values can be registered with IANA; each must be the name
of a header for which a definition exists in a published document.
If not present, Differences=Content-Type is assumed.

The intent is that the MIME reader can look at these headers of
the message component to make an intelligent choice of what to
present to the user, based on knowledge about the user preferences
and capabilities.

(The intent of having registration with IANA of the fields used in
this context is to maintain a list of usages that a mail UA may
expect to see, not to reject usages)

(NOTE: The MIME specification [RFC 1521], section 7.2, states that
headers not beginning with "Content-" are generally to be ignored
in body parts. People defining a header for use with
"differences=" should take note of this)

The mechanism for deciding which body part to present is outside
the scope of this document.

MIME EXAMPLE:

    Content-Type: multipart/alternative; differences=Content-Language;
              boundary="limit"
    Content-Language: en, fr, de

    --limit
    Content-Language: fr

    Le renard brun et agile saute par dessus le chien paresseux
    --limit
    Content-Language: de
    Content-Type: text/plain; charset=iso-8859-1
    Content-Transfer-encoding: 8bit

    Der schnelle braune Fuchs hüpft über den faulen Hund
    --limit
    Content-Language: en

    The quick brown fox jumps over the lazy dog
    --limit--

When composing a message, the choice of sequence may be somewhat
arbitrary. However, non-MIME mail readers will show the first body
part first, meaning that this should most likely be the language
understood by most of the recipients.

5. IANA registration procedure for language tags

Any language tag must start with an existing tag, and extend it.
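
The rule that a new tag must start with an existing tag and extend it
can be illustrated with a short Python sketch. This is only one
reading of the rule, assuming that "extend" means appending at least
one further subtag to a tag that is already known. The function name
extends_existing and the sample set EXISTING are hypothetical; the set
is not a real registry.

```python
def extends_existing(proposed, existing):
    """True if `proposed` is a known tag plus at least one more subtag."""
    parts = proposed.lower().split("-")
    known = {tag.lower() for tag in existing}
    # Try every proper prefix, e.g. "i-sami-no" -> "i", "i-sami".
    return any("-".join(parts[:n]) in known for n in range(1, len(parts)))


# Hypothetical starting set: two ISO 639 codes plus the reserved
# values "i" and "x" from section 2.
EXISTING = {"no", "en", "i", "x"}
```

Under this reading, extends_existing("no-nynorsk", EXISTING) holds,
while a bare "nynorsk" does not qualify because it has no known tag in
front of it.
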

This registration form should be used by anyone who wants to use a
language tag not defined by ISO or IANA.

----------------------------------------------------------------------
LANGUAGE TAG REGISTRATION FORM

Name of requester          :
E-mail address of requester:
Tag to be registered       :

English name of language   :

Native name of language (in ASCII):

Reference to published description of the language (book or article):
----------------------------------------------------------------------

The language form must be sent to language-review@uninett.no for a
2-week review period before submitting it to IANA. (This is an
open list. Requests to be added should be sent to
language-review-request@uninett.no. General language discussions
are not appropriate for this list)

The completed form should then be sent to IANA@ISI.EDU; all
registered forms are available online in the directory
ftp://ftp.iana.isi.edu/registrations/languages/

(NOTE: The IANA may suggest alternative text here).

The IANA is free to reject registrations where it feels, based on
list feedback, that information is lacking, or that the tag name
suggests something different from the language referenced.

6. Security considerations

Security considerations are not discussed in this memo.

7. Character set considerations

Codes are always expressed using US-ASCII (a-z).

The issue of deciding upon the rendering of a character set based
on the language encoding is not addressed in this memo; however,
the author cautions against thinking that such a decision can be
made correctly for all cases unless means of switching language in
the middle of a text are defined (for example, a rendering engine
that decides font based on Japanese or Chinese language will fail
to work when a mixed Japanese-Chinese text is encountered)

8. Gatewaying considerations

RFC 1327 defines a Language: header. This header is not
recommended now, because it is defined to be a single 2-letter
language code, and the X.400 header it is supposed to gateway is a
list of language codes.

It is suggested that RFC 1327 be updated to produce the
Content-Language: header, and to turn this header into the
ISO/CCITT specified Language components rather than the
RFC-822-headers heading extension.

9. References

[ISO 639]
    ISO 639:1988 (E/F) - Code for the representation of names of
    languages - The International Organization for
    Standardization, 1st edition, 1988 17 pages Prepared by
    ISO/TC 37 - Terminology (principles and coordination)

[ISO 3166]
    ISO 3166:1988 (E/F) - Codes for the representation of names
    of countries - The International Organization for
    Standardization, 3rd edition, 1988-08-15

[RFC 1521]
    MIME Part One: Mechanisms for Specifying and Describing the
    Format of Internet Message Bodies - Borenstein and Freed -
    September 1993

[RFC 1327]
    Mapping between X.400(1988) / ISO 10021 and RFC 822 - Kille -
    May 1992

10. Change log

Changes from draft-ietf-lang-tag-02.txt:

    Clarified that a language tag is a single token

Changes from draft-alvestrand-language-tag-00:

    IANA registration form added

    IANA-reserved tag changed from "IANA" to "I", in order to
    avoid clashing with possible ISO 4-letter codes

    Separated "tag" definition from "header" definition

    Info on ISO 639 registration office added

    Created a multi-level tag, rather than strict two-level

    Added examples of SGML usage

    Lots of small nits fixed
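
As an illustration of sections 3 and 4.1, the following Python sketch
shows how a mail reader might use the Content-Language headers of a
multipart/alternative message to choose a body part. The draft leaves
the selection mechanism out of scope, so the policy here (first match
in the user's order of preference) is an assumption, as are the names
content_languages and pick_part. The sample message is an abbreviated,
ASCII-only version of the MIME example in section 4.1. Parsing uses
the standard Python email package.

```python
import re
from email import message_from_string


def content_languages(value):
    """Split a Content-Language value into lower-case tags."""
    # Drop parenthesized comments (this sketch ignores nested comments).
    without_comments = re.sub(r"\([^()]*\)", "", value)
    return [t.strip().lower() for t in without_comments.split(",") if t.strip()]


def pick_part(msg, preferred):
    """Return the first alternative matching the preferred tags, in order."""
    if msg.get_content_type() != "multipart/alternative":
        return None
    # An absent parameter means differences=Content-Type (section 4.1).
    differences = str(msg.get_param("differences", "Content-Type")).lower()
    if "content-language" not in differences:
        return None
    for wanted in preferred:
        for part in msg.get_payload():
            # Tags are compared as whole tokens, case-insensitively.
            if wanted.lower() in content_languages(part.get("Content-Language", "")):
                return part
    return None


# Abbreviated sample message, assumed for illustration only.
RAW = (
    "MIME-Version: 1.0\n"
    "Content-Type: multipart/alternative; differences=Content-Language;\n"
    ' boundary="limit"\n'
    "Content-Language: en, fr\n"
    "\n"
    "--limit\n"
    "Content-Language: fr\n"
    "\n"
    "Le renard brun et agile saute par dessus le chien paresseux\n"
    "--limit\n"
    "Content-Language: en\n"
    "\n"
    "The quick brown fox jumps over the lazy dog\n"
    "--limit--\n"
)

message = message_from_string(RAW)
chosen = pick_part(message, ["da", "fr"])  # Danish first, then French
print(chosen.get_payload().strip())
```

Matching on the whole tag, rather than on a shared prefix, follows the
statement in section 2.1 that applications should treat a language tag
as a single token.
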