| < draft-freed-charset-regist-02.txt | draft-freed-charset-regist-03.txt > | |||
|---|---|---|---|---|
| Network Working Group Ned Freed, Innosoft | Network Working Group Ned Freed, Innosoft | |||
| Internet Draft Jon Postel, ISI | Internet Draft Jon Postel, ISI | |||
| Obsoletes: 2278 <draft-freed-charset-regist-02.txt> | Obsoletes: 2278 <draft-freed-charset-regist-03.txt> | |||
| IANA Charset | IANA Charset | |||
| Registration Procedures | Registration Procedures | |||
| May 2000 | July 2000 | |||
| Status of this Memo | Status of this Memo | |||
| This document is an Internet-Draft and is in full conformance | This document is an Internet-Draft and is in full conformance | |||
| with all provisions of Section 10 of RFC 2026. | with all provisions of Section 10 of RFC 2026. | |||
| Internet-Drafts are working documents of the Internet | Internet-Drafts are working documents of the Internet | |||
| Engineering Task Force (IETF), its areas, and its working | Engineering Task Force (IETF), its areas, and its working | |||
| groups. Note that other groups may also distribute working | groups. Note that other groups may also distribute working | |||
| documents as Internet-Drafts. | documents as Internet-Drafts. | |||
| Internet-Drafts are draft documents valid for a maximum of six | Internet-Drafts are draft documents valid for a maximum of six | |||
| skipping to change at page 2, line ? ¶ | skipping to change at page 2, line ? ¶ | |||
| documents at any time. It is inappropriate to use Internet- | documents at any time. It is inappropriate to use Internet- | |||
| Drafts as reference material or to cite them other than as | Drafts as reference material or to cite them other than as | |||
| "work in progress." | "work in progress." | |||
| The list of current Internet-Drafts can be accessed at | The list of current Internet-Drafts can be accessed at | |||
| http://www.ietf.org/ietf/1id-abstracts.txt | http://www.ietf.org/ietf/1id-abstracts.txt | |||
| The list of Internet-Draft Shadow Directories can be accessed | The list of Internet-Draft Shadow Directories can be accessed | |||
| at http://www.ietf.org/shadow.html. | at http://www.ietf.org/shadow.html. | |||
| Copyright Notice | Copyright Notice | |||
| Copyright (C) The Internet Society (2000). All Rights | Copyright (C) The Internet Society (2000). All Rights | |||
| Reserved. | Reserved. | |||
| 1. Abstract | 1. Abstract | |||
| MIME [RFC-2045, RFC-2046, RFC-2047, RFC-2184] and various | MIME [RFC-2045, RFC-2046, RFC-2047, RFC-2184] and various | |||
| other Internet protocols are capable of using many different | other Internet protocols are capable of using many different | |||
| charsets. This in turn means that the ability to label | charsets. This in turn means that the ability to label | |||
| different charsets is essential. | different charsets is essential. | |||
| skipping to change at page 2, line ? ¶ | skipping to change at page 2, line ? ¶ | |||
| 2. Definitions and Notation | 2. Definitions and Notation | |||
| The following sections define terms used in this document. | The following sections define terms used in this document. | |||
| 2.1. Requirements Notation | 2.1. Requirements Notation | |||
| This document occasionally uses terms that appear in capital | This document occasionally uses terms that appear in capital | |||
| letters. When the terms "MUST", "SHOULD", "MUST NOT", "SHOULD | letters. When the terms "MUST", "SHOULD", "MUST NOT", "SHOULD | |||
| NOT", and "MAY" appear capitalized, they are being used to | NOT", and "MAY" appear capitalized, they are being used to | |||
| indicate particular requirements of this specification. A | indicate particular requirements of this specification. A | |||
| discussion of the meanings of these terms appears in [RFC- | discussion of the meanings of these terms appears in | |||
| 2119]. | [RFC-2119]. | |||
| 2.2. Character | 2.2. Character | |||
| A member of a set of elements used for the organisation, | A member of a set of elements used for the organisation, | |||
| control, or representation of data. | control, or representation of data. | |||
| 2.3. Charset | 2.3. Charset | |||
| The term "charset" (referred to as a "character set" in | The term "charset" (referred to as a "character set" in | |||
| previous versions of this document) is used here to refer to a | previous versions of this document) is used here to refer to a | |||
| skipping to change at page 3, line 8 ¶ | skipping to change at page 3, line 8 ¶ | |||
| Note that unconditional and unambiguous conversion in the | Note that unconditional and unambiguous conversion in the | |||
| other direction is not required, in that not all characters | other direction is not required, in that not all characters | |||
| may be representable by a given charset and a charset may | may be representable by a given charset and a charset may | |||
| provide more than one sequence of octets to represent a | provide more than one sequence of octets to represent a | |||
| particular sequence of characters. | particular sequence of characters. | |||
| This definition is intended to allow charsets to be defined in | This definition is intended to allow charsets to be defined in | |||
| a variety of different ways, from simple single-table mappings | a variety of different ways, from simple single-table mappings | |||
| such as US-ASCII to complex table switching methods such as | such as US-ASCII to complex table switching methods such as | |||
| those that use ISO 2022's techniques, to be used as charsets. | those that use ISO 2022's techniques. However, the definition | |||
| However, the definition associated with a charset name must | associated with a charset name must fully specify the mapping | |||
| fully specify the mapping to be performed. In particular, use | to be performed. In particular, use of external profiling | |||
| of external profiling information to determine the exact | information to determine the exact mapping is not permitted. | |||
| mapping is not permitted. | ||||
| HISTORICAL NOTE: The term "character set" was originally used | HISTORICAL NOTE: The term "character set" was originally used | |||
| in MIME to describe such straightforward schemes as US-ASCII | in MIME to describe such straightforward schemes as US-ASCII | |||
| and ISO-8859-1 which consist of a small set of characters and | and ISO-8859-1 which consist of a small set of characters and | |||
| a simple one-to-one mapping from single octets to single | a simple one-to-one mapping from single octets to single | |||
| characters. Multi-octet character encoding schemes and | characters. Multi-octet character encoding schemes and | |||
| switching techniques make the situation much more complex. As | switching techniques make the situation much more complex. As | |||
| such, the definition of this term was revised to emphasize | such, the definition of this term was revised to emphasize | |||
| both the conversion aspect of the process, and the term itself | both the conversion aspect of the process, and the term itself | |||
| has been changed to "charset" to emphasize that it is not, | has been changed to "charset" to emphasize that it is not, | |||
| skipping to change at page 3, line 38 ¶ | skipping to change at page 3, line 37 ¶ | |||
| A Coded Character Set (CCS) is a one-to-one mapping from a set | A Coded Character Set (CCS) is a one-to-one mapping from a set | |||
| of abstract characters to a set of integers. Examples of coded | of abstract characters to a set of integers. Examples of coded | |||
| character sets are ISO 10646 [ISO-10646], US-ASCII [US-ASCII], | character sets are ISO 10646 [ISO-10646], US-ASCII [US-ASCII], | |||
| and the ISO-8859 series [ISO-8859]. | and the ISO-8859 series [ISO-8859]. | |||
| 2.5. Character Encoding Scheme | 2.5. Character Encoding Scheme | |||
| A Character Encoding Scheme (CES) is a mapping from a Coded | A Character Encoding Scheme (CES) is a mapping from a Coded | |||
| Character Set or several coded character sets to a set of | Character Set or several coded character sets to a set of | |||
| octet sequences. A given CES is typically associated with a | octet sequences. A given CES is sometimes associated with a | |||
| single CCS; for example, UTF-8 applies only to ISO 10646. | single CCS; for example, UTF-8 applies only to ISO 10646. | |||
| 3. Charset Registration Requirements | 3. Charset Registration Requirements | |||
| Registered charsets are expected to conform to a number of | Registered charsets are expected to conform to a number of | |||
| requirements as described below. | requirements as described below. | |||
| 3.1. Required Characteristics | 3.1. Required Characteristics | |||
| Registered charsets MUST conform to the definition of a | Registered charsets MUST conform to the definition of a | |||
| "charset" given above. In addition, charsets intended for use | "charset" given above. In addition, charsets intended for use | |||
| in MIME content types under the "text" top-level type MUST | in MIME content types under the "text" top-level type MUST | |||
| conform to the restrictions on that type described in RFC | conform to the restrictions on that type described in RFC | |||
| 2045. All registered charsets MUST note whether or not they | 2045. All registered charsets MUST note whether or not they | |||
| are suitable for use in MIME text. | are suitable for use in MIME text. | |||
| All charsets which are constructed as a composition of a CCS | All charsets which are constructed as a composition of one or | |||
| and a CES MUST either include the CCS and CES they are based | more CCS's and a CES MUST either include the CCS's and CES | |||
| on in their registration or else cite a definition of their | they are based on in their registration or else cite a | |||
| CCS and CES that appears elsewhere. | definition of their CCS's and CES that appears elsewhere. | |||
| All registered charsets MUST be specified in a stable, openly | All registered charsets MUST be specified in a stable, openly | |||
| available specification. Registration of charsets whose | available specification. Registration of charsets whose | |||
| specifications aren't stable and openly available is | specifications aren't stable and openly available is | |||
| forbidden. | forbidden. | |||
| 3.2. New Charsets | 3.2. New Charsets | |||
| This registration mechanism is not intended to be a vehicle | This registration mechanism is not intended to be a vehicle | |||
| for the design and definition of entirely new charsets. This | for the design and definition of entirely new charsets. This | |||
| skipping to change at page 5, line 21 ¶ | skipping to change at page 5, line 21 ¶ | |||
| MUST contain no more than 40 characters (including the "cs" | MUST contain no more than 40 characters (including the "cs" | |||
| prefix) chosen from the printable subset of US-ASCII. | prefix) chosen from the printable subset of US-ASCII. | |||
| Only one name beginning with "cs" may be assigned to a single | Only one name beginning with "cs" may be assigned to a single | |||
| charset. If no name of this form is explicitly defined IANA | charset. If no name of this form is explicitly defined IANA | |||
| will assign an alias consisting of "cs" prepended to the | will assign an alias consisting of "cs" prepended to the | |||
| primary charset name. | primary charset name. | |||
| Finally, charsets being registered for use with the "text" | Finally, charsets being registered for use with the "text" | |||
| media type MUST have a primary name that conforms to the more | media type MUST have a primary name that conforms to the more | |||
| restrictive syntax of the charset field in MIME encoded-words | restrictive syntax of the charset field in MIME encoded-words | |||
| [RFC-2047, RFC-2184] and MIME extended parameter values [RFC- | [RFC-2047, RFC-2184] and MIME extended parameter values | |||
| 2184]. A combined ABNF definition for such names is as | [RFC-2184]. A combined ABNF definition for such names is as | |||
| follows: | follows: | |||
| mime-charset = 1*mime-charset-chars | mime-charset = 1*mime-charset-chars | |||
| mime-charset-chars = ALPHA / DIGIT / | mime-charset-chars = ALPHA / DIGIT / | |||
| "!" / "#" / "$" / "%" / "&" / | "!" / "#" / "$" / "%" / "&" / | |||
| "'" / "+" / "-" / "^" / "_" / | "'" / "+" / "-" / "^" / "_" / | |||
| "`" / "{" / "}" / "~" | "`" / "{" / "}" / "~" | |||
| ALPHA = "A".."Z" ; Case insensitive ASCII Letter | ALPHA = "A".."Z" ; Case insensitive ASCII Letter | |||
| DIGIT = "0".."9" ; Numeric digit | DIGIT = "0".."9" ; Numeric digit | |||
| 3.4. Functionality Requirement | 3.4. Functionality Requirement | |||
| Charsets MUST function as actual charsets: Registration of | Charsets MUST function as actual charsets: Registration of | |||
| things that are better thought of as a transfer encoding, as a | things that are better thought of as a transfer encoding, as a | |||
| media type, or as a collection of separate entities of another | media type, or as a collection of separate entities of another | |||
| type, is not allowed. For example, although HTML could | type, is not allowed. For example, although HTML could | |||
| theoretically be thought of as a charset, it is really better | theoretically be thought of as a charset, it is really better | |||
| skipping to change at page 7, line 18 ¶ | skipping to change at page 7, line 18 ¶ | |||
| Send the proposed charset registration to the "ietf- | Send the proposed charset registration to the "ietf- | |||
| charsets@iana.org" mailing list. (Information about joining | charsets@iana.org" mailing list. (Information about joining | |||
| this list is available on the IANA Website, | this list is available on the IANA Website, | |||
| http://www.iana.org.) This mailing list has been established | http://www.iana.org.) This mailing list has been established | |||
| for the sole purpose of reviewing proposed charset | for the sole purpose of reviewing proposed charset | |||
| registrations. Proposed charsets are not formally registered | registrations. Proposed charsets are not formally registered | |||
| and must not be used; the "x-" prefix specified in RFC 2045 | and must not be used; the "x-" prefix specified in RFC 2045 | |||
| can be used until registration is complete. | can be used until registration is complete. | |||
| The posting of a charset to the list initiates a two week | ||||
| public review process. | ||||
| The intent of the public posting is to solicit comments and | The intent of the public posting is to solicit comments and | |||
| feedback on the definition of the charset and the name chosen | feedback on the definition of the charset and the name chosen | |||
| for it over a two week period. | for it. | |||
| 4.2. Charset Reviewer | 4.2. Charset Reviewer | |||
| When the two week period has passed and the registration | When the two week period has passed and the registration | |||
| proposer is convinced that consensus has been achieved, the | proposer is convinced that consensus has been achieved, the | |||
| registration application should be submitted to IANA and the | registration application should be submitted to IANA and the | |||
| charset reviewer. The charset reviewer, who is appointed by | charset reviewer. The charset reviewer, who is appointed by | |||
| the IETF Applications Area Director(s), either approves the | the IETF Applications Area Director(s), either approves the | |||
| request for registration or rejects it. Rejection may occur | request for registration or rejects it. Rejection may occur | |||
| because of significant objections raised on the list or | because of significant objections raised on the list or | |||
| objections raised externally. If the charset reviewer | objections raised externally. If the charset reviewer | |||
| considers the registration sufficiently important and | considers the registration sufficiently important and | |||
| controversial, a last call for comments may be issued to the | controversial, a last call for comments may be issued to the | |||
| full IETF. The charset reviewer may also recommend standards | full IETF. The charset reviewer may also recommend standards | |||
| track processing (before or after registration) when that | track processing (before or after registration) when that | |||
| appears appropriate and the level of specification of the | appears appropriate and the level of specification of the | |||
| charset is adequate. | charset is adequate. | |||
| Decisions made by the reviewer must be posted to the ietf- | The charset reviewer must reach a decision and post it to the | |||
| charsets mailing list within 14 days. Decisions made by the | ietf-charsets mailing list within two weeks. Decisions made by | |||
| reviewer may be appealed to the IESG. | the reviewer may be appealed to the IESG. | |||
| 4.3. IANA Registration | 4.3. IANA Registration | |||
| Provided that the charset registration has either passed | Provided that the charset registration has either passed | |||
| review or has been successfully appealed to the IESG, the IANA | review or has been successfully appealed to the IESG, the IANA | |||
| will register the charset, assign a MIBenum value, and make | will register the charset, assign a MIBenum value, and make | |||
| its registration available to the community. | its registration available to the community. | |||
| 5. Location of Registered Charset List | 5. Location of Registered Charset List | |||
| skipping to change at page 8, line 44 ¶ | skipping to change at page 8, line 44 ¶ | |||
| (All aliases must also be suitable for use as the value of | (All aliases must also be suitable for use as the value of | |||
| a MIME content-type parameter.) | a MIME content-type parameter.) | |||
| Suitability for use in MIME text: | Suitability for use in MIME text: | |||
| Published specification(s): | Published specification(s): | |||
| (A specification for the charset MUST be | (A specification for the charset MUST be | |||
| openly available that accurately describes what | openly available that accurately describes what | |||
| is being registered. If a charset is defined as | is being registered. If a charset is defined as | |||
| a composition of a CCS and a CES then these definitions | a composition of one or more CCS's and a CES then these | |||
| MUST either be included or referenced.) | definitions MUST either be included or referenced.) | |||
| ISO 10646 equivalency table: | ISO 10646 equivalency table: | |||
| (A URL to a specification of how to translate from | (A URI to a specification of how to translate from | |||
| this charset to ISO 10646 and vice versa SHOULD be | this charset to ISO 10646 and vice versa SHOULD be | |||
| provided.) | provided.) | |||
| Additional information: | Additional information: | |||
| Person & email address to contact for further information: | Person & email address to contact for further information: | |||
| Intended usage: | Intended usage: | |||
| (One of COMMON, LIMITED USE or OBSOLETE) | (One of COMMON, LIMITED USE or OBSOLETE) | |||
| skipping to change at page 10, line 15 ¶ | skipping to change at page 10, line 15 ¶ | |||
| 10. References | 10. References | |||
| [ISO-2022] | [ISO-2022] | |||
| International Standard -- Information Processing -- | International Standard -- Information Processing -- | |||
| Character Code Structure and Extension Techniques, | Character Code Structure and Extension Techniques, | |||
| ISO/IEC 2022:1994, 4th ed. | ISO/IEC 2022:1994, 4th ed. | |||
| [ISO-8859] | [ISO-8859] | |||
| International Standard -- Information Processing -- 8-bit | International Standard -- Information Processing -- 8-bit | |||
| Single-Byte Coded Graphic Character Sets | Single-Byte Coded Graphic Character Sets | |||
| - Part 1: Latin Alphabet No. 1, ISO 8859-1:1987, 1st ed. | - Part 1: Latin Alphabet No. 1, ISO 8859-1:1998, 1st ed. | |||
| - Part 2: Latin Alphabet No. 2, ISO 8859-2:1987, 1st ed. | - Part 2: Latin Alphabet No. 2, ISO 8859-2:1999, 1st ed. | |||
| - Part 3: Latin Alphabet No. 3, ISO 8859-3:1988, 1st ed. | - Part 3: Latin Alphabet No. 3, ISO 8859-3:1999, 1st ed. | |||
| - Part 4: Latin Alphabet No. 4, ISO 8859-4:1988, 1st ed. | - Part 4: Latin Alphabet No. 4, ISO 8859-4:1998, 1st ed. | |||
| - Part 5: Latin/Cyrillic Alphabet, ISO 8859-5:1988, 1st | - Part 5: Latin/Cyrillic Alphabet, ISO 8859-5:1999, 2nd | |||
| ed. | ed. | |||
| - Part 6: Latin/Arabic Alphabet, ISO 8859-6:1987, 1st ed. | - Part 6: Latin/Arabic Alphabet, ISO 8859-6:1999, 1st ed. | |||
| - Part 7: Latin/Greek Alphabet, ISO 8859-7:1987, 1st ed. | - Part 7: Latin/Greek Alphabet, ISO 8859-7:1987, 1st ed. | |||
| - Part 8: Latin/Hebrew Alphabet, ISO 8859-8:1988, 1st ed. | - Part 8: Latin/Hebrew Alphabet, ISO 8859-8:1999, 1st ed. | |||
| - Part 9: Latin Alphabet No. 5, ISO/IEC 8859-9:1989, 1st | - Part 9: Latin Alphabet No. 5, ISO/IEC 8859-9:1999, 2nd | |||
| ed. | ed. | |||
| International Standard -- Information Technology -- 8-bit | International Standard -- Information Technology -- 8-bit | |||
| Single-Byte Coded Graphic Character Sets | Single-Byte Coded Graphic Character Sets | |||
| - Part 10: Latin Alphabet No. 6, ISO/IEC 8859-10:1992, | - Part 10: Latin Alphabet No. 6, ISO/IEC 8859-10:1998, | |||
| 2nd ed. | ||||
| International Standard -- Information Technology -- 8-bit | ||||
| Single-Byte Coded Graphic Character Sets | ||||
| - Part 13: Latin Alphabet No. 7, ISO/IEC 8859-10:1998, | ||||
| 1st ed. | ||||
| International Standard -- Information Technology -- 8-bit | ||||
| Single-Byte Coded Graphic Character Sets | ||||
| - Part 14: Latin Alphabet No. 8 (Celtic), ISO/IEC | ||||
| 8859-10:1998, 1st ed. | ||||
| International Standard -- Information Technology -- 8-bit | ||||
| Single-Byte Coded Graphic Character Sets | ||||
| - Part 15: Latin Alphabet No. 9, ISO/IEC 8859-10:1999, | ||||
| 1st ed. | 1st ed. | |||
| [ISO-10646] | [ISO-10646] | |||
| ISO/IEC 10646-1:1993(E), "Information technology -- | ISO/IEC 10646-1:1993(E), "Information technology -- | |||
| Universal Multiple-Octet Coded Character Set (UCS) -- | Universal Multiple-Octet Coded Character Set (UCS) -- | |||
| Part 1: Architecture and Basic Multilingual Plane", | Part 1: Architecture and Basic Multilingual Plane", | |||
| JTC1/SC2, 1993. | JTC1/SC2, 1993. | |||
| [RFC-1590] | [RFC-1590] | |||
| Postel, J., "Media Type Registration Procedure", RFC | Postel, J., "Media Type Registration Procedure", RFC | |||
| End of changes. 22 change blocks. | ||||
| 41 lines changed or deleted | 54 lines changed or added | |||
This html diff was produced by rfcdiff 1.48. The latest version is available from http://tools.ietf.org/tools/rfcdiff/ | ||||
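
The naming rules quoted in the section 3.3 hunk (the mime-charset ABNF and the 40-character limit on "cs" names) can be checked mechanically. The following is a minimal sketch, not part of either draft. The function names are hypothetical, and two points are assumptions: that "printable subset of US-ASCII" means octets 0x21 through 0x7E (space excluded), and that the "cs" prefix is matched without regard to case.

```python
import re

# Characters allowed by the mime-charset-chars rule in the quoted ABNF.
# ALPHA is case insensitive there, so both letter cases are accepted.
_MIME_CHARSET_RE = re.compile(r"[A-Za-z0-9!#$%&'+\-^_`{}~]+")


def is_mime_charset(name: str) -> bool:
    """Return True if name matches mime-charset = 1*mime-charset-chars."""
    return _MIME_CHARSET_RE.fullmatch(name) is not None


def is_cs_alias(name: str) -> bool:
    """Check a "cs" name: "cs" prefix, at most 40 characters in total.

    Assumption: "printable" is taken as 0x21..0x7E, and the prefix is
    compared case-insensitively. Neither detail is stated in the hunk.
    """
    return (
        name[:2].lower() == "cs"
        and len(name) <= 40
        and all(0x21 <= ord(ch) <= 0x7E for ch in name)
    )


def default_cs_alias(primary_name: str) -> str:
    """Build the fallback alias: "cs" prepended to the primary name."""
    return "cs" + primary_name
```

For example, `is_mime_charset("ISO-8859-1")` accepts the name, while a name containing a space or `*` is rejected because neither character appears in the ABNF. A fallback alias built by `default_cs_alias` can still fail `is_cs_alias` if the primary name is longer than 38 characters.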