idnits 2.17.1 draft-freed-charset-regist-03.txt: ** The Abstract section seems to be numbered Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** Looks like you're using RFC 2026 boilerplate. This must be updated to follow RFC 3978/3979, as updated by RFC 4748. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- ** Missing expiration date. The document expiration date should appear on the first and last page. == No 'Intended status' indicated for this document; assuming Proposed Standard Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack an Introduction section. ** The document seems to lack an IANA Considerations section. (See Section 2.2 of https://www.ietf.org/id-info/checklist for how to handle the case when there are no actions for IANA.) ** The document seems to lack separate sections for Informative/Normative References. All references will be assumed normative when checking for downward references. ** The abstract seems to contain references ([RFC-2045,RFC-2046,, RFC-2047,, RFC-2184]), which it shouldn't. Please replace those with straight textual mentions of the documents in question. -- The draft header indicates that this document obsoletes RFC2278, but the abstract doesn't seem to mention this, which it should. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not match the current year == Line 498 has weird spacing: '...rnished to ot...' == Line 499 has weird spacing: '...herwise expla...' == Line 501 has weird spacing: '...without restr...' == Line 502 has weird spacing: '... 
notice and t...' == Line 503 has weird spacing: '...ivative works...' == (4 more instances...) == The document seems to lack the recommended RFC 2119 boilerplate, even if it appears to use RFC 2119 keywords. (The document does seem to have the reference to RFC 2119 which the ID-Checklist requires). -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (July 2000) is 8685 days in the past. Is this intentional? Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Missing Reference: 'RFC-1543' is mentioned on line 312, but not defined ** Obsolete undefined reference: RFC 1543 (Obsoleted by RFC 2223) == Unused Reference: 'ISO-2022' is defined on line 381, but no explicit reference was found in the text == Unused Reference: 'RFC-1590' is defined on line 423, but no explicit reference was found in the text == Unused Reference: 'RFC-1700' is defined on line 427, but no explicit reference was found in the text == Unused Reference: 'RFC-2130' is defined on line 456, but no explicit reference was found in the text == Unused Reference: 'RFC-2278' is defined on line 469, but no explicit reference was found in the text -- Possible downref: Non-RFC (?) normative reference: ref. 'ISO-2022' -- Possible downref: Non-RFC (?) normative reference: ref. 'ISO-8859' -- Possible downref: Non-RFC (?) normative reference: ref. 
'ISO-10646' ** Obsolete normative reference: RFC 1590 (Obsoleted by RFC 2045, RFC 2046, RFC 2047, RFC 2048, RFC 2049) ** Obsolete normative reference: RFC 1700 (Obsoleted by RFC 3232) ** Obsolete normative reference: RFC 1759 (Obsoleted by RFC 3805) ** Downref: Normative reference to an Informational RFC: RFC 2130 ** Obsolete normative reference: RFC 2184 (Obsoleted by RFC 2231) ** Downref: Normative reference to an Informational RFC: RFC 2468 ** Obsolete normative reference: RFC 2278 (Obsoleted by RFC 2978) -- Possible downref: Non-RFC (?) normative reference: ref. 'US-ASCII' Summary: 15 errors (**), 0 flaws (~~), 15 warnings (==), 7 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 1 Network Working Group Ned Freed, Innosoft 2 Internet Draft Jon Postel, ISI 3 Obsoletes: 2278 5 IANA Charset 6 Registration Procedures 8 July 2000 10 Status of this Memo 12 This document is an Internet-Draft and is in full conformance 13 with all provisions of Section 10 of RFC 2026. 15 Internet-Drafts are working documents of the Internet 16 Engineering Task Force (IETF), its areas, and its working 17 groups. Note that other groups may also distribute working 18 documents as Internet-Drafts. 20 Internet-Drafts are draft documents valid for a maximum of six 21 months and may be updated, replaced, or obsoleted by other 22 documents at any time. It is inappropriate to use Internet- 23 Drafts as reference material or to cite them other than as 24 "work in progress." 26 The list of current Internet-Drafts can be accessed at 27 http://www.ietf.org/ietf/1id-abstracts.txt 29 The list of Internet-Draft Shadow Directories can be accessed 30 at http://www.ietf.org/shadow.html. 32 Copyright Notice 34 Copyright (C) The Internet Society (2000). All Rights 35 Reserved. 37 1. 
Abstract 39 MIME [RFC-2045, RFC-2046, RFC-2047, RFC-2184] and various 40 other Internet protocols are capable of using many different 41 charsets. This in turn means that the ability to label 42 different charsets is essential. 44 Note: The charset registration procedure exists solely to 45 associate a specific name or names with a given charset and to 46 give an indication of whether or not a given charset can be 47 used in MIME text objects. In particular, the general 48 applicability and appropriateness of a given registered 49 charset to a particular application is a protocol issue, not a 50 registration issue, and is not dealt with by this registration 51 procedure. 53 2. Definitions and Notation 55 The following sections define terms used in this document. 57 2.1. Requirements Notation 59 This document occasionally uses terms that appear in capital 60 letters. When the terms "MUST", "SHOULD", "MUST NOT", "SHOULD 61 NOT", and "MAY" appear capitalized, they are being used to 62 indicate particular requirements of this specification. A 63 discussion of the meanings of these terms appears in 64 [RFC-2119]. 66 2.2. Character 68 A member of a set of elements used for the organisation, 69 control, or representation of data. 71 2.3. Charset 73 The term "charset" (referred to as a "character set" in 74 previous versions of this document) is used here to refer to a 75 method of converting a sequence of octets into a sequence of 76 characters. This conversion may also optionally produce 77 additional control information such as directionality 78 indicators. 80 Note that unconditional and unambiguous conversion in the 81 other direction is not required, in that not all characters 82 may be representable by a given charset and a charset may 83 provide more than one sequence of octets to represent a 84 particular sequence of characters. 
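The asymmetry described above can be illustrated with a minimal sketch. It assumes Python's built-in codecs as stand-ins for registered charsets; the helper names decode and representable are hypothetical, and UTF-7 is used only as a convenient charset with more than one octet sequence per character, not because this document discusses it.

```python
# Illustrative sketch only: Python codecs stand in for registered charsets.
# The helper names below are hypothetical and not part of any registration.

def decode(octets: bytes, charset: str) -> str:
    """Octets -> characters: the direction every charset must define."""
    return octets.decode(charset)


def representable(text: str, charset: str) -> bool:
    """Report whether every character in text can be encoded in charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


# The forward direction is well defined for a simple single-table charset.
latin1_text = decode(b"caf\xe9", "iso-8859-1")

# Two different octet sequences may decode to the same characters.
direct = decode(b"A", "utf-7")
shifted = decode(b"+AEE-", "utf-7")

# Not every character is representable in a given charset.
euro_in_ascii = representable("\u20ac", "us-ascii")
```

Here decode always succeeds for valid input, while the reverse direction may fail (representable returns False) or have several valid answers, which is why only the octets-to-characters conversion is required.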
86 This definition is intended to allow charsets to be defined in 87 a variety of different ways, from simple single-table mappings 88 such as US-ASCII to complex table switching methods such as 89 those that use ISO 2022's techniques. However, the definition 90 associated with a charset name must fully specify the mapping 91 to be performed. In particular, use of external profiling 92 information to determine the exact mapping is not permitted. 94 HISTORICAL NOTE: The term "character set" was originally used 95 in MIME to describe such straightforward schemes as US-ASCII 96 and ISO-8859-1 which consist of a small set of characters and 97 a simple one-to-one mapping from single octets to single 98 characters. Multi-octet character encoding schemes and 99 switching techniques make the situation much more complex. As 100 such, the definition of this term was revised to emphasize 101 the conversion aspect of the process, and the term itself 102 has been changed to "charset" to emphasize that it is not, 103 after all, just a set of characters. A discussion of these 104 issues as well as specification of standard terminology for 105 use in the IETF appears in RFC 2130. 107 2.4. Coded Character Set 109 A Coded Character Set (CCS) is a one-to-one mapping from a set 110 of abstract characters to a set of integers. Examples of coded 111 character sets are ISO 10646 [ISO-10646], US-ASCII [US-ASCII], 112 and the ISO-8859 series [ISO-8859]. 114 2.5. Character Encoding Scheme 116 A Character Encoding Scheme (CES) is a mapping from a Coded 117 Character Set or several coded character sets to a set of 118 octet sequences. A given CES is sometimes associated with a 119 single CCS; for example, UTF-8 applies only to ISO 10646. 121 3. Charset Registration Requirements 123 Registered charsets are expected to conform to a number of 124 requirements as described below. 126 3.1. 
Required Characteristics 128 Registered charsets MUST conform to the definition of a 129 "charset" given above. In addition, charsets intended for use 130 in MIME content types under the "text" top-level type MUST 131 conform to the restrictions on that type described in RFC 132 2045. All registered charsets MUST note whether or not they 133 are suitable for use in MIME text. 135 All charsets which are constructed as a composition of one or 136 more CCS's and a CES MUST either include the CCS's and CES 137 they are based on in their registration or else cite a 138 definition of their CCS's and CES that appears elsewhere. 140 All registered charsets MUST be specified in a stable, openly 141 available specification. Registration of charsets whose 142 specifications aren't stable and openly available is 143 forbidden. 145 3.2. New Charsets 147 This registration mechanism is not intended to be a vehicle 148 for the design and definition of entirely new charsets. This 149 is due to the fact that the registration process does NOT 150 contain adequate review mechanisms for such undertakings. 152 As such, only charsets defined by other processes and 153 standards bodies, or specific profiles or combinations of such 154 charsets, are eligible for registration. 156 3.3. Naming Requirements 158 One or more names MUST be assigned to all registered charsets. 159 Multiple names for the same charset are permitted, but if 160 multiple names are assigned, a single primary name for the 161 charset MUST be identified. All other names are considered to 162 be aliases for the primary name and use of the primary name is 163 preferred over use of any of the aliases. 165 Each assigned name MUST uniquely identify a single charset. 166 All charset names MUST be suitable for use as the value of a 167 MIME content type charset parameter and hence MUST conform to 168 MIME parameter value syntax. 
This applies even if the specific 169 charset being registered is not suitable for use with the 170 "text" media type. 172 All charsets MUST be assigned a name that provides a display 173 string for the associated "MIBenum" value defined below. These 174 "MIBenum" values are defined by and used in the Printer MIB 175 [RFC-1759]. Such names MUST begin with the letters "cs" and 176 MUST contain no more than 40 characters (including the "cs" 177 prefix) chosen from the printable subset of US-ASCII. 178 Only one name beginning with "cs" may be assigned to a single 179 charset. If no name of this form is explicitly defined, IANA 180 will assign an alias consisting of "cs" prepended to the 181 primary charset name. 183 Finally, charsets being registered for use with the "text" 184 media type MUST have a primary name that conforms to the more 185 restrictive syntax of the charset field in MIME encoded-words 186 [RFC-2047, RFC-2184] and MIME extended parameter values 187 [RFC-2184]. A combined ABNF definition for such names is as 188 follows: 190 mime-charset = 1*mime-charset-chars 191 mime-charset-chars = ALPHA / DIGIT / 192 "!" / "#" / "$" / "%" / "&" / 193 "'" / "+" / "-" / "^" / "_" / 194 "`" / "{" / "}" / "~" 195 ALPHA = "A".."Z" ; Case insensitive ASCII Letter 196 DIGIT = "0".."9" ; Numeric digit 198 3.4. Functionality Requirement 200 Charsets MUST function as actual charsets: Registration of 201 things that are better thought of as a transfer encoding, as a 202 media type, or as a collection of separate entities of another 203 type, is not allowed. For example, although HTML could 204 theoretically be thought of as a charset, it is really better 205 thought of as a media type and as such it cannot be registered 206 as a charset. 208 3.5. Usage and Implementation Requirements 210 Use of a large number of charsets in a given protocol may 211 hamper interoperability. 
However, the use of a large number of 212 undocumented and/or unlabelled charsets hampers 213 interoperability even more. 215 A charset should therefore be registered ONLY if it adds 216 significant functionality that is valuable to a large 217 community, OR if it documents existing practice in a large 218 community. Note that charsets registered for the second reason 219 should be explicitly marked as being of limited or specialized 220 use and should only be used in Internet messages with prior 221 bilateral agreement. 223 3.6. Publication Requirements 225 Charset registrations MAY be published in RFCs; however, RFC 226 publication is not required to register a new charset. 228 The registration of a charset does not imply endorsement, 229 approval, or recommendation by the IANA, IESG, or IETF, or 230 even certification that the specification is adequate. It is 231 expected that applicability statements for particular 232 applications will be published from time to time that 233 recommend implementation of, and support for, charsets that 234 have proven particularly useful in those contexts. 236 Charset registrations SHOULD include a specification of 237 mapping from the charset into ISO 10646 if specification of 238 such a mapping is feasible. 240 3.7. MIBenum Requirements 242 Each registered charset MUST also be assigned a unique 243 enumerated integer value. These "MIBenum" values are defined 244 by and used in the Printer MIB [RFC-1759]. 246 A MIBenum value for each charset will be assigned by IANA at 247 the time of registration. MIBenum values are not assigned by 248 the person registering the charset. 250 4. Charset Registration Procedure 252 The following procedure has been implemented by the IANA for 253 review and approval of new charsets. This is not a formal 254 standards process, but rather an administrative procedure 255 intended to allow community comment and sanity checking 256 without excessive time delay. 258 4.1. 
Present the Charset to the Community 260 Send the proposed charset registration to the "ietf- 261 charsets@iana.org" mailing list. (Information about joining 262 this list is available on the IANA Website, 263 http://www.iana.org.) This mailing list has been established 264 for the sole purpose of reviewing proposed charset 265 registrations. Proposed charsets are not formally registered 266 and must not be used; the "x-" prefix specified in RFC 2045 267 can be used until registration is complete. 269 The posting of a charset to the list initiates a two-week 270 public review process. 272 The intent of the public posting is to solicit comments and 273 feedback on the definition of the charset and the name chosen 274 for it. 276 4.2. Charset Reviewer 278 When the two-week period has passed and the registration 279 proposer is convinced that consensus has been achieved, the 280 registration application should be submitted to IANA and the 281 charset reviewer. The charset reviewer, who is appointed by 282 the IETF Applications Area Director(s), either approves the 283 request for registration or rejects it. Rejection may occur 284 because of significant objections raised on the list or 285 objections raised externally. If the charset reviewer 286 considers the registration sufficiently important and 287 controversial, a last call for comments may be issued to the 288 full IETF. The charset reviewer may also recommend standards 289 track processing (before or after registration) when that 290 appears appropriate and the level of specification of the 291 charset is adequate. 293 The charset reviewer must reach a decision and post it to the 294 ietf-charsets mailing list within two weeks. Decisions made by 295 the reviewer may be appealed to the IESG. 297 4.3. 
IANA Registration 299 Provided that the charset registration has either passed 300 review or has been successfully appealed to the IESG, the IANA 301 will register the charset, assign a MIBenum value, and make 302 its registration available to the community. 304 5. Location of Registered Charset List 306 Charset registrations will be posted in the anonymous FTP file 307 "ftp://ftp.isi.edu/in-notes/iana/assignments/character-sets" 308 and all registered charsets will be listed in the periodically 309 issued "Assigned Numbers" RFC [currently RFC-1700]. The 310 description of the charset MAY also be published as an 311 Informational RFC by sending it to "rfc-editor@isi.edu" 312 (please follow the instructions to RFC authors [RFC-1543]). 314 6. Charset Registration Template 316 To: ietf-charsets@iana.org 317 Subject: Registration of new charset [names] 319 Charset name: 321 (All names must be suitable for use as the value of a 322 MIME content-type parameter.) 324 Charset aliases: 326 (All aliases must also be suitable for use as the value of 327 a MIME content-type parameter.) 329 Suitability for use in MIME text: 331 Published specification(s): 333 (A specification for the charset MUST be 334 openly available that accurately describes what 335 is being registered. If a charset is defined as 336 a composition of one or more CCS's and a CES then these 337 definitions MUST either be included or referenced.) 339 ISO 10646 equivalency table: 341 (A URI to a specification of how to translate from 342 this charset to ISO 10646 and vice versa SHOULD be 343 provided.) 345 Additional information: 347 Person & email address to contact for further information: 349 Intended usage: 351 (One of COMMON, LIMITED USE or OBSOLETE) 353 7. Security Considerations 355 This registration procedure is not known to raise any sort of 356 security considerations that are appreciably different from 357 those already existing in the protocols that employ registered 358 charsets. 360 8. 
Changes made since RFC 2278 362 Inclusion of a mapping to ISO 10646 is now recommended for all 363 registered charsets. The registration template has been 364 updated to include this as well as a place to indicate whether 365 or not the charset is suitable for use in MIME text. 367 9. IANA Actions 369 (THIS SECTION SHOULD BE REMOVED BEFORE PUBLICATION.) The IANA 370 Web site needs to be updated with information about the ietf- 371 charsets mailing list. In particular, it needs to specify the 372 list address (ietf-charsets@iana.org), the subscription 373 address (ietf-charsets-request@iana.org), the subscription 374 methodology (send a message with one line in the body saying 375 "SUBSCRIBE IETF-CHARSETS"), and the location of the list 376 archives (currently ftp://ftp.innosoft.com/ietf-charsets, but 377 likely to change in the near future). 379 10. References 381 [ISO-2022] 382 International Standard -- Information Processing -- 383 Character Code Structure and Extension Techniques, 384 ISO/IEC 2022:1994, 4th ed. 386 [ISO-8859] 387 International Standard -- Information Processing -- 8-bit 388 Single-Byte Coded Graphic Character Sets 389 - Part 1: Latin Alphabet No. 1, ISO 8859-1:1998, 1st ed. 390 - Part 2: Latin Alphabet No. 2, ISO 8859-2:1999, 1st ed. 391 - Part 3: Latin Alphabet No. 3, ISO 8859-3:1999, 1st ed. 392 - Part 4: Latin Alphabet No. 4, ISO 8859-4:1998, 1st ed. 393 - Part 5: Latin/Cyrillic Alphabet, ISO 8859-5:1999, 2nd 394 ed. 395 - Part 6: Latin/Arabic Alphabet, ISO 8859-6:1999, 1st ed. 396 - Part 7: Latin/Greek Alphabet, ISO 8859-7:1987, 1st ed. 397 - Part 8: Latin/Hebrew Alphabet, ISO 8859-8:1999, 1st ed. 398 - Part 9: Latin Alphabet No. 5, ISO/IEC 8859-9:1999, 2nd 399 ed. 400 International Standard -- Information Technology -- 8-bit 401 Single-Byte Coded Graphic Character Sets 402 - Part 10: Latin Alphabet No. 6, ISO/IEC 8859-10:1998, 403 2nd ed. 
404 International Standard -- Information Technology -- 8-bit 405 Single-Byte Coded Graphic Character Sets 406 - Part 13: Latin Alphabet No. 7, ISO/IEC 8859-13:1998, 407 1st ed. 408 International Standard -- Information Technology -- 8-bit 409 Single-Byte Coded Graphic Character Sets 410 - Part 14: Latin Alphabet No. 8 (Celtic), ISO/IEC 411 8859-14:1998, 1st ed. 412 International Standard -- Information Technology -- 8-bit 413 Single-Byte Coded Graphic Character Sets 414 - Part 15: Latin Alphabet No. 9, ISO/IEC 8859-15:1999, 415 1st ed. 417 [ISO-10646] 418 ISO/IEC 10646-1:1993(E), "Information technology -- 419 Universal Multiple-Octet Coded Character Set (UCS) -- 420 Part 1: Architecture and Basic Multilingual Plane", 421 JTC1/SC2, 1993. 423 [RFC-1590] 424 Postel, J., "Media Type Registration Procedure", RFC 425 1590, USC/Information Sciences Institute, March 1994. 427 [RFC-1700] 428 Reynolds, J. and Postel, J., "Assigned Numbers", STD 2, 429 RFC 1700, USC/Information Sciences Institute, October 430 1994. 432 [RFC-1759] 433 Smith, R., Wright, F., Hastings, T., Zilles, S., 434 Gyllenskog, J., "Printer MIB", RFC 1759, March 1995. 436 [RFC-2045] 437 Freed, N. and Borenstein, N., "Multipurpose Internet Mail 438 Extensions (MIME) Part One: Format of Internet Message 439 Bodies", RFC 2045, Bellcore, Innosoft, November 1996. 441 [RFC-2046] 442 Freed, N. and Borenstein, N., "Multipurpose Internet Mail 443 Extensions (MIME) Part Two: Media Types", RFC 2046, 444 Bellcore, Innosoft, November 1996. 446 [RFC-2047] 447 Moore, K., "Multipurpose Internet Mail Extensions (MIME) 448 Part Three: Representation of Non-Ascii Text in Internet 449 Message Headers", RFC 2047, University of Tennessee, 450 November 1996. 452 [RFC-2119] 453 Bradner, S., "Key words for use in RFCs to Indicate 454 Requirement Levels", RFC 2119, March 1997. 
456 [RFC-2130] 457 Weider, C., Preston, C., Simonsen, K., Alvestrand, H., 458 Atkinson, R., Crispin, M., Svanberg, P., "Report from the 459 IAB Character Set Workshop", RFC 2130, April 1997. 461 [RFC-2184] 462 Freed, N., Moore, K., "MIME Parameter Value and Encoded 463 Word Extensions: Character Sets, Languages, and 464 Continuations", RFC 2184, August 1997. 466 [RFC-2468] 467 Cerf, V., "I Remember IANA", RFC 2468, October 1998. 469 [RFC-2278] 470 Freed, N., Postel, J., "IANA Charset Registration 471 Procedures", RFC 2278, January 1998. 473 [US-ASCII] 474 Coded Character Set -- 7-Bit American Standard Code for 475 Information Interchange, ANSI X3.4-1986. 477 11. Authors' Addresses 479 Ned Freed 481 Innosoft International, Inc. 1050 Lakes Drive West Covina, CA 482 91790 USA 483 tel: +1 626 919 3600 fax: +1 626 919 3614 484 email: ned.freed@innosoft.com 486 Jon Postel 488 Sadly, Jon Postel, the co-author of this document, passed away 489 on October 16, 1998 [RFC-2468]. Any omissions or errors are 490 solely the responsibility of the remaining co-author. 492 12. Full Copyright Statement 494 Copyright (C) The Internet Society (2000). All Rights 495 Reserved. 497 This document and translations of it may be copied and 498 furnished to others, and derivative works that comment on or 499 otherwise explain it or assist in its implementation may be 500 prepared, copied, published and distributed, in whole or in 501 part, without restriction of any kind, provided that the 502 above copyright notice and this paragraph are included on all 503 such copies and derivative works. 
However, this document 504 itself may not be modified in any way, such as by removing 505 the copyright notice or references to the Internet Society or 506 other Internet organizations, except as needed for the purpose 507 of developing Internet standards in which case the procedures 508 for copyrights defined in the Internet Standards process must 509 be followed, or as required to translate it into languages 510 other than English. 512 The limited permissions granted above are perpetual and will 513 not be revoked by the Internet Society or its successors or 514 assigns. 516 This document and the information contained herein is provided 517 on an "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET 518 ENGINEERING TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR 519 IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE 520 USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR 521 ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A 522 PARTICULAR PURPOSE.