Network Working Group                                  P. Faltstrom, Ed.
Internet-Draft                                                     Cisco
Intended status: Standards Track                       November 18, 2007
Expires: May 21, 2008


                     The Unicode Codepoints and IDN
                 draft-faltstrom-idnabis-tables-03.txt

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on May 21, 2008.

Copyright Notice

   Copyright (C) The IETF Trust (2007).

Abstract

   This document specifies rules for deciding whether a codepoint,
   considered in isolation, is a candidate for inclusion in an
   Internationalized Domain Name.

   It is part of the specification of IDNA200X.

Faltstrom                 Expires May 21, 2008                  [Page 1]
Internet-Draft             Unicode Codepoints              November 2007

Table of Contents

   1.  Introduction
   2.  Category definitions Used to Calculate Derived Property Value
     2.1.  Categories Based on Unicode Properties
       2.1.1.  Category A - Classes of Codepoints
       2.1.2.  Category B - Normalization
       2.1.3.  Category C - Casefolding
       2.1.4.  Category D - Ignorables
       2.1.5.  Category E - Historical Scripts
       2.1.6.  Category F - Blocks of Characters
     2.2.  Other Categories
       2.2.1.  Category G - ASCII LDH
       2.2.2.  Category H - Exceptions
       2.2.3.  Category I - CJK Subsetting
       2.2.4.  Category J - Character Groups Requiring Special
               Treatment
       2.2.5.  Category K - Unassigned codepoints
   3.  Calculation of the Derived Property
     3.1.  Classified Scripts
       3.1.1.  Scripts derived from Common European Scripts
       3.1.2.  Scripts derived from Han
     3.2.  Scripts not classified for IDN so far
   4.  Codepoints
   5.  IANA Considerations
   6.  Security Considerations
   7.  Contributors
   8.  Acknowledgements
   9.  References
     9.1.  Normative References
     9.2.  Informative References
   Appendix A.  Codepoints 0x0000 - 0x10FFFD
     A.1.  Codepoints in Unicode Character Database (UCD) format
   Author's Address
   Intellectual Property and Copyright Statements

1.  Introduction

   RFC 4690 [RFC4690] suggests an inclusion-based approach for
   selecting the codepoints from The Unicode Standard [Unicode5] that
   should be included in the list of codepoints that may be used in
   Internationalized Domain Names.  Specifically, RFC 4690 [RFC4690]
   says the following:

      The IAB has concluded that there is a consensus within the
      broader community that lists of code points should be specified
      by the use of an inclusion-based mechanism (i.e., identifying the
      characters that are permitted), rather than by excluding a small
      number of characters from the total Unicode set as Stringprep
      [RFC3454] and Nameprep [RFC3491] do today.  That conclusion
      should be reviewed by the IETF community and action taken as
      appropriate.

   This document reviews and classifies the collections of codepoints
   in the Unicode character set by examining various properties of the
   codepoints.  It then defines an algorithm for determining a derived
   property value.  It specifies a procedure and not a table of
   codepoints so that the algorithm can be used to determine code point
   sets independent of the version of Unicode that is in use.

   This document is not intended to specify precisely how these
   property values are to be applied in IDN labels.  That information
   appears in [IDNA200X-protocol], but it is important to understand
   that the assignment of a value of this property to a particular
   character is not sufficient to determine whether it can be used in a
   given label.  In particular, some combinations of allowed codepoints
   are not advisable for use in IDNs due to rules specific to a script
   or class of characters.  The requirement for such rules is linked to
   the "CONTEXT" property value described below; the rules are
   described in other documents.

   The value of the property is to be interpreted approximately as
   follows.

   o  ALWAYS: Those that should clearly be included in IDNs.
      Codepoints with this property value are permitted for general use
      in IDNs.
      Once assigned to this category, a character is never removed from
      it unless it is removed from Unicode.

   o  MAYBE YES: These characters have not yet been completely
      classified, possibly because further information is needed about
      permitted context.  The categories to which they belong generally
      predict that they will be acceptable.  Over time, characters may
      be reclassified from this category to others.

   o  MAYBE NOT: These characters have not yet been completely
      classified, possibly because further information is needed about
      permitted context.  The categories to which they belong generally
      predict that they will not be acceptable.  Over time, characters
      may be reclassified from this category to others.

   o  CONTEXT: (Contextual Rule Required) Some characteristic of the
      character, such as it being invisible in certain contexts or
      problematic in others, requires that it not be used in labels
      unless specific other characters or properties are present.  So
      the character must not be used unless an appropriate rule has
      been established and the context of the character is consistent
      with that rule.

   o  NEVER: Those that should clearly not be included in IDNs.
      Codepoints with this property value will never be permitted in
      IDNs.  Once a codepoint is assigned to this category, it is never
      removed or reclassified.

   o  UNASSIGNED: Those codepoints that are unassigned in the Unicode
      Standard.

   The category name MAYBE can be used for the union of the categories
   MAYBE YES and MAYBE NOT.

   There is a need for more than two categories because there are
   complex trade-offs involved.  In many cases, there is just not
   sufficient information available at present, especially with regard
   to considerations of appropriateness of use of the character in IDNs
   that are not related to formal Unicode character properties.
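   As a reading aid, the six property values listed above can be
   written down as a small enumeration.  This is only a sketch: the
   names DerivedProperty, MAYBE and is_maybe are hypothetical and do
   not appear in this document or in any IDNA specification.

```python
from enum import Enum


class DerivedProperty(Enum):
    """The six values of the derived IDN property named in Section 1."""

    ALWAYS = "ALWAYS"
    MAYBE_YES = "MAYBE YES"
    MAYBE_NOT = "MAYBE NOT"
    CONTEXT = "CONTEXT"
    NEVER = "NEVER"
    UNASSIGNED = "UNASSIGNED"


# "MAYBE" is shorthand for the union of the two not-yet-decided values.
MAYBE = frozenset({DerivedProperty.MAYBE_YES, DerivedProperty.MAYBE_NOT})


def is_maybe(value: DerivedProperty) -> bool:
    """Return True if the value is MAYBE YES or MAYBE NOT."""
    return value in MAYBE
```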
   The (non-normative) table in Appendix A is derived from data in
   Unicode 5.0, rather than the earlier Unicode 3.2; this is in order
   to take advantage of the expanded character repertoire and better
   definitions in the newer version.  The mechanisms described here
   allow determination of the value of the property for future versions
   of Unicode (including characters added after Unicode 5.0).  They
   should be suitable for newer revisions of Unicode, as long as the
   Unicode properties on which they are based remain stable.

   Some codepoints need to be allowed in exceptional circumstances, but
   should be excluded in all other cases; these rules are also
   described in other documents.  The most notable of these are the
   ZERO WIDTH JOINER (U+200D) and ZERO WIDTH NON-JOINER (U+200C).  Both
   of them have the derived property value CONTEXT.  It is invalid to
   either register a string containing these characters or even to look
   one up unless the rule is found and satisfied.

   This document is part of a series that, together, constitute a
   preliminary proposal for updating the IDNA standards to resolve
   issues uncovered in recent years, cover a broader range of scripts,
   and provide for migration to newer versions of Unicode.  See
   [IDNA200X-issues] for a broader discussion.

2.  Category definitions Used to Calculate Derived Property Value

2.1.  Categories Based on Unicode Properties

   The derived property obtains its value based on a two-step
   procedure.

   First, characters are placed in one or more character categories
   based on core properties defined by the Unicode Standard.  These
   categories are not mutually exclusive.  For some of the categories,
   the explicit codepoint is tested rather than a Unicode property.
   One of them is for compatibility with "traditional DNS" hostnames.
   Another involves an exception rule for specific codepoints.
   In the second step, set operations are used with these categories to
   determine the values for an IDN-specific property.  Those operations
   are specified in Section 3.

   In many cases aliases are used in the data in the Unicode Standard.
   This document uses both the alias and the spelled out terms (for
   example alias Ll for the General Category Lowercase_Letter).

2.1.1.  Category A - Classes of Codepoints

   A: generalCategory(cp) is in {Ll, Lu, Lo, Nd, Lm, Mn, Mc}

   These rules identify characters commonly used in mnemonics and often
   informally described as "language characters".  In general, only
   codepoints assigned to this category are suitable for use in IDN
   (but not all of them).

   The generalCategory of a codepoint is found in UnicodeData.txt [1]
   in the third column.  The mapping between the alias (for example Ll)
   and the name of the general category is found in
   PropertyValueAliases.txt [2] under the heading General_Category.

   The categories used in this rule are:

   o  Ll - Lowercase_Letter

   o  Lu - Uppercase_Letter

   o  Lo - Other_Letter

   o  Nd - Decimal_Number

   o  Lm - Modifier_Letter

   o  Mn - Nonspacing_Mark

   o  Mc - Spacing_Mark

2.1.2.  Category B - Normalization

   B: NFKC(cp) != cp

   The category is used to group the characters that change under NFKC
   normalization.  In general, these codepoints are not suitable for
   use for IDN.

   The normalization algorithm NFKC is defined as NFKD (compatibility)
   decomposition followed by canonical composition.  Decomposition
   mapping is found in the sixth column in UnicodeData.txt [1] that
   together with UAX #15 [3] define the various normalization forms.
   The data (sixth column) include the decomposition mapping.  In the
   case of a compatibility, rather than canonical, mapping, it also
   shows the decomposition type within angle brackets.  An exclusion
   table for composition exists in CompositionExclusions.txt [4].
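   The Category A and Category B tests can be sketched with Python's
   standard unicodedata module.  This is an illustration under stated
   assumptions, not the procedure used to build Appendix A: the
   function names are hypothetical, and the module reflects whatever
   Unicode version ships with the Python interpreter rather than
   Unicode 5.0, so individual results may differ from the appendix.

```python
import unicodedata

# General Category aliases that make up Category A (Section 2.1.1).
CATEGORY_A = frozenset({"Ll", "Lu", "Lo", "Nd", "Lm", "Mn", "Mc"})


def in_category_a(cp: int) -> bool:
    """A: generalCategory(cp) is in {Ll, Lu, Lo, Nd, Lm, Mn, Mc}."""
    return unicodedata.category(chr(cp)) in CATEGORY_A


def in_category_b(cp: int) -> bool:
    """B: NFKC(cp) != cp, i.e. the codepoint changes under NFKC."""
    ch = chr(cp)
    return unicodedata.normalize("NFKC", ch) != ch


if __name__ == "__main__":
    for cp in (0x0061, 0x002D, 0x00E0, 0x0140, 0xFF21):
        print(f"U+{cp:04X} A={in_category_a(cp)} B={in_category_b(cp)}")
```

   U+0140 is in Category B because NFKC maps it to the two-codepoint
   sequence U+006C U+00B7, which is the case discussed in this section.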
   Singleton decompositions, those that map a single codepoint into
   another single codepoint, are never composed.  Hangul is decomposed
   and composed according to an algorithm specified in the Unicode
   Standard.

   The rules that form this category may include the composed form of a
   codepoint, and because of step 2 the codepoint is not suitable for
   use in IDN.  This implies only the associated decomposed form could
   be used for IDN (as is the case with U+0140).  However, other rules
   involving the General Category might then cause rejection of some of
   the codepoints that make up the decomposed form.  If that
   combination occurs, neither the composed nor the decomposed forms
   are permitted in IDN labels unless there is a Category H
   (Section 2.2.2) Exception (U+00B7 is an example of this case).

2.1.3.  Category C - Casefolding

   C: casefold(cp) != cp

   This category is used to group all characters that change when
   folded to lower case.  In general, these codepoints are not suitable
   for use for IDN.

   Case folding rules can in general be found in UnicodeData.txt [1] in
   the 14th column.  A table of special cases can be found in
   SpecialCasing.txt [5].  The SpecialCasing.txt [5] file includes both
   general casing rules and conditional mappings.  Only transformations
   associated with unconditional mappings are included in this
   category.  One can see whether the mapping is conditional or not in
   the 5th column of SpecialCasing.txt [5].  If the column is empty,
   the mapping is unconditional.

   Known codepoints with difficult case folding include U+00DF (LATIN
   SMALL LETTER SHARP S) and U+0130 (LATIN CAPITAL LETTER I WITH DOT
   ABOVE).

2.1.4.  Category D - Ignorables

   D: property(cp) is in {Other_Default_Ignorable_Code_Point,
   Noncharacter_Code_Point}

   In general, these codepoints are not suitable for use for IDN.

   The required properties are associated with codepoints in
   PropList.txt [6].
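   The Category C test and the PropList.txt lookup behind Category D
   can be sketched as follows.  Several things here are assumptions:
   Python's str.casefold() (Unicode full case folding, in the
   interpreter's Unicode version) stands in for the folding data
   described above and may classify some codepoints differently, the
   function names are hypothetical, and PROPLIST_SAMPLE is a short
   illustrative excerpt in the PropList.txt line format rather than the
   full file.

```python
# Properties that define Category D (Section 2.1.4).
CATEGORY_D_PROPERTIES = frozenset({
    "Other_Default_Ignorable_Code_Point",
    "Noncharacter_Code_Point",
})

# Illustrative excerpt only; a real run would read the whole file.
PROPLIST_SAMPLE = """\
0009..000D    ; White_Space # Cc   [5] <control-0009>..<control-000D>
034F          ; Other_Default_Ignorable_Code_Point # Mn       COMBINING GRAPHEME JOINER
115F..1160    ; Other_Default_Ignorable_Code_Point # Lo   [2] HANGUL CHOSEONG FILLER..HANGUL JUNGSEONG FILLER
FDD0..FDEF    ; Noncharacter_Code_Point # Cn  [32] <noncharacter-FDD0>..<noncharacter-FDEF>
"""


def in_category_c(cp: int) -> bool:
    """C: casefold(cp) != cp (str.casefold() used as a stand-in)."""
    ch = chr(cp)
    return ch.casefold() != ch


def parse_category_d(text: str) -> set:
    """Collect codepoints whose listed property belongs to Category D."""
    members = set()
    for line in text.splitlines():
        data = line.split("#", 1)[0].strip()  # drop the trailing comment
        if not data:
            continue
        fields = [field.strip() for field in data.split(";")]
        cp_range, prop = fields[0], fields[1]
        if prop not in CATEGORY_D_PROPERTIES:
            continue
        first, _, last = cp_range.partition("..")
        low = int(first, 16)
        high = int(last, 16) if last else low
        members.update(range(low, high + 1))
    return members


if __name__ == "__main__":
    category_d = parse_category_d(PROPLIST_SAMPLE)
    print(len(category_d), in_category_c(0x00DF))
```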
   Note that there are also derived properties in
   DerivedCoreProperties.txt [7].  These are not used for reasons
   explained above.  It has been discussed whether
   Default_Ignorable_Code_Point should be used, but as that is a
   derived property, it is not used.  The definition for
   Default_Ignorable_Code_Point can be found in
   DerivedCoreProperties.txt [7] (and erratum of 2007-January-25 [8])
   and is

      Other_Default_Ignorable_Code_Point
      + Cf
      + Cc
      + Cs
      + Noncharacter_Code_Point
      + Variation_Selector
      - White_Space
      - FFF9..FFFB (Annotation Characters)

2.1.5.  Category E - Historical Scripts

   E: script(cp) in {Cuneiform, Ugaritic, Old_Persian, Gothic,
   Old_Italic, Cypriot, Linear_B, Phoenician, Kharoshthi, Phags_Pa,
   Glagolitic, Shavian, Deseret, Osmanya, Ogham}

   This category is used for identifying codepoints in historical
   scripts.  In general, these codepoints are inappropriate for use in
   IDNs because no living script-user communities are known.

   The script to which a codepoint belongs is listed in Scripts.txt
   [9].  Note that aliases for scripts can be found in
   PropertyValueAliases.txt [2], so Xsux is for example an alias for
   the script Cuneiform.

2.1.6.  Category F - Blocks of Characters

   F: block(cp) in {Combining_Diacritical_Marks_for_Symbols,
   Musical_Symbols, Ancient_Greek_Musical_Notation}

   This category rule is used to identify codepoints that are not
   useful in mnemonics or that are otherwise impractical for IDN use.
   In general, these codepoints are not suitable for use for IDN.

   The block to which a codepoint belongs is listed in Blocks.txt [10].

2.2.  Other Categories

2.2.1.  Category G - ASCII LDH

   G: cp is in {0061..007A, 0030..0039, 002D}

   This category is used in the second step to preserve the traditional
   "hostname" (LDH) characters (a-z, 0-9 and '-').  In general, these
   codepoints are suitable for use for IDN.
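   Categories F and G are plain range tests, which a short sketch can
   make concrete.  The block boundaries below are the ranges listed for
   those three blocks in Blocks.txt; the function and constant names
   are hypothetical and chosen only for this illustration.

```python
# Blocks that make up Category F (Section 2.1.6), as (first, last).
CATEGORY_F_BLOCKS = {
    "Combining_Diacritical_Marks_for_Symbols": (0x20D0, 0x20FF),
    "Musical_Symbols": (0x1D100, 0x1D1FF),
    "Ancient_Greek_Musical_Notation": (0x1D200, 0x1D24F),
}

# Category G (Section 2.2.1): a-z, 0-9 and HYPHEN-MINUS.
CATEGORY_G_RANGES = ((0x0061, 0x007A), (0x0030, 0x0039), (0x002D, 0x002D))


def in_category_f(cp: int) -> bool:
    """F: the codepoint lies in one of the three listed blocks."""
    return any(lo <= cp <= hi for lo, hi in CATEGORY_F_BLOCKS.values())


def in_category_g(cp: int) -> bool:
    """G: cp is in {0061..007A, 0030..0039, 002D}."""
    return any(lo <= cp <= hi for lo, hi in CATEGORY_G_RANGES)


if __name__ == "__main__":
    for cp in (0x0061, 0x0041, 0x002D, 0x20D0, 0x1D11E):
        print(f"U+{cp:04X} F={in_category_f(cp)} G={in_category_g(cp)}")
```

   Note that upper-case ASCII letters are not in Category G; only the
   lower-case forms are listed.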
   The characters in this category are defined by a list of specific
   permitted ASCII [11] characters.

2.2.2.  Category H - Exceptions

   [[anchor3: Note in Draft: This list may be further modified as the
   implications of Category J (Section 2.2.4) are examined.]]

   H: cp in {00B7, 05F3, 05F4, 3005, 3007, 303B, 30FB}

   This category explicitly lists codepoints for which the category
   cannot be assigned using only the core property values that exist in
   the Unicode standard.  The values are according to the table below:

   00B7; MAYBE YES # MIDDLE DOT
   05F3; MAYBE YES # HEBREW PUNCTUATION GERESH
   05F4; MAYBE YES # HEBREW PUNCTUATION GERSHAYIM
   3005; ALWAYS    # IDEOGRAPHIC ITERATION MARK
   3007; ALWAYS    # IDEOGRAPHIC NUMBER ZERO
   303B; MAYBE YES # VERTICAL IDEOGRAPHIC ITERATION MARK
   30FB; MAYBE YES # KATAKANA MIDDLE DOT

2.2.3.  Category I - CJK Subsetting

   I: script(cp) is in {Han}

   [[anchor4: Note in Draft: The actual rule here should not be a test
   for the script, as that is already part of the definition of
   Category Y (Section 3.1.2).  This is only a placeholder to make the
   overall algorithm complete.]]

2.2.4.  Category J - Character Groups Requiring Special Treatment

   [[anchor5: Note in Draft: This list may be increased (for example
   with Cc) as the implications of classifying more scripts
   (Section 3.2) are examined.]]

   J: generalCategory(cp) is in {Cf}

   This category consists of non-language characters (i.e., those that
   are not in Category A (Section 2.1.1)) but that may still be
   required in IDN labels under some circumstances.  In general, the
   second step uses this category to either require contextual rules or
   to exclude the characters.

2.2.5.  Category K - Unassigned codepoints

   K: cp is unassigned

   This category consists of codepoints in the Unicode character set
   that are not (yet) assigned.

3.  Calculation of the Derived Property

   As described above (Section 1) and in more detail in the "IDNA
   Protocol" document [IDNA200X-protocol], possible values of the IDN
   property are:

   o  ALWAYS

   o  MAYBE YES

   o  MAYBE NOT

   o  CONTEXT

   o  NEVER

   o  UNASSIGNED

   The algorithm to calculate the value of the derived property is as
   follows.

   The calculation is twofold.  We first take care of exceptions and
   then look at multiple categories on a per-script basis.

   First the special cases.  If there is a match, do not go to the
   second phase.

   o  If the codepoint is in Category G (Section 2.2.1), the value is
      ALWAYS.

   o  If the codepoint is in Category J (Section 2.2.4), the value is
      CONTEXT.

   o  If the codepoint is in Category K (Section 2.2.5), the value is
      UNASSIGNED.

   o  If the codepoint is in Category H (Section 2.2.2), the value is
      according to the table in Section 2.2.2.

   If the special cases do not yield a final value for the property,
   the value depends on the script to which the codepoint belongs.
   This step is therefore organized by script, with the expectation
   that the list will grow over time, using criteria discussed in
   [IDNA200X-protocol].  Once a script is added to this list, it is
   never removed.  The order of scripts below is historical and has no
   substantive implications.

   [[anchor6: Note in Draft: A pointer to IANA Considerations and some
   discussion are probably needed here.]]

   The two sets of scripts chosen as initial cases below were selected
   because extensive specific work has been done on the applicability
   and practical use of characters in those scripts in IDNs and,
   consequently, because we feel confident that we understand these
   scripts well enough to be sure that the characters placed into the
   NEVER category by these rules cannot possibly be candidates for
   reassignment to the ALWAYS category (or vice versa).
   As our confidence grows with regard to other scripts, we expect that
   they will be effectively added to this list, permitting the ALWAYS
   category to be used.

3.1.  Classified Scripts

3.1.1.  Scripts derived from Common European Scripts

   Z: script(cp) in {Latin, Greek, Cyrillic}

   o  If the codepoint is not in Category A (Section 2.1.1), the value
      is NEVER.

   o  If the codepoint is in Category A (Section 2.1.1), the following
      applies:

      *  If the codepoint does not appear in any of the categories B
         (Section 2.1.2), C (Section 2.1.3), D (Section 2.1.4), E
         (Section 2.1.5) or F (Section 2.1.6), the value is ALWAYS.

      *  If the codepoint is in any of the categories B
         (Section 2.1.2), C (Section 2.1.3), D (Section 2.1.4), E
         (Section 2.1.5) or F (Section 2.1.6), the value is NEVER.

3.1.2.  Scripts derived from Han

   Y: script(cp) in {Han}

   o  If the codepoint is in Category B (Section 2.1.2), then the value
      is NEVER.

   o  If the codepoint is not in Category B (Section 2.1.2), then the
      value is MAYBE YES.

3.2.  Scripts not classified for IDN so far

   Other: script(cp) not classified so far

   Scripts not classified above, or in successor documents, are never
   associated with the ALWAYS property value.  They are given values as
   follows:

   o  If the codepoint is not in Category A (Section 2.1.1), the value
      is NEVER.

   o  If the codepoint is in Category A (Section 2.1.1), the following
      applies:

      *  If the codepoint does not appear in any of the Categories B
         (Section 2.1.2), C (Section 2.1.3), D (Section 2.1.4), E
         (Section 2.1.5) or F (Section 2.1.6), the value is MAYBE YES.

      *  If the codepoint is in any of the categories B
         (Section 2.1.2), C (Section 2.1.3), D (Section 2.1.4), E
         (Section 2.1.5) or F (Section 2.1.6), the value is NEVER.

4.  Codepoints

   The Categories and Rules defined in Section 2 and Section 3 apply to
   all assigned Unicode characters.
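   The two-phase calculation of Section 3 can be sketched end to end as
   below.  This is a minimal sketch under loud assumptions, not a
   conforming implementation.  Python's standard library has no Script
   property, so toy_script guesses the script from the character name
   and is only a placeholder for a lookup in Scripts.txt; it never
   reports a historical script, so Category E only takes effect when a
   real script function is passed in.  The ignorables argument is
   expected to hold the Category D codepoints taken from PropList.txt
   and is empty by default.  str.casefold() stands in for the Category
   C data.  The Category H values follow the table in Section 2.2.2.
   Results depend on the interpreter's Unicode version and may differ
   from Appendix A.

```python
import unicodedata

CATEGORY_A = frozenset({"Ll", "Lu", "Lo", "Nd", "Lm", "Mn", "Mc"})
CATEGORY_E = frozenset({
    "Cuneiform", "Ugaritic", "Old_Persian", "Gothic", "Old_Italic",
    "Cypriot", "Linear_B", "Phoenician", "Kharoshthi", "Phags_Pa",
    "Glagolitic", "Shavian", "Deseret", "Osmanya", "Ogham",
})
CATEGORY_F = ((0x20D0, 0x20FF), (0x1D100, 0x1D1FF), (0x1D200, 0x1D24F))
CATEGORY_G = ((0x0061, 0x007A), (0x0030, 0x0039), (0x002D, 0x002D))
CATEGORY_H = {
    0x00B7: "MAYBE YES", 0x05F3: "MAYBE YES", 0x05F4: "MAYBE YES",
    0x3005: "ALWAYS", 0x3007: "ALWAYS",
    0x303B: "MAYBE YES", 0x30FB: "MAYBE YES",
}
EUROPEAN = frozenset({"Latin", "Greek", "Cyrillic"})  # rule Z


def _in_ranges(cp, ranges):
    return any(lo <= cp <= hi for lo, hi in ranges)


def toy_script(cp):
    """ASSUMPTION: crude name-based guess, not the Scripts.txt value."""
    name = unicodedata.name(chr(cp), "")
    prefixes = (("LATIN ", "Latin"), ("GREEK ", "Greek"),
                ("CYRILLIC ", "Cyrillic"), ("CJK UNIFIED IDEOGRAPH", "Han"))
    for prefix, script in prefixes:
        if name.startswith(prefix):
            return script
    return "Other"


def derived_property(cp, script=toy_script, ignorables=frozenset()):
    ch = chr(cp)
    gc = unicodedata.category(ch)
    # Phase 1: special cases, tested in the order given in Section 3.
    if _in_ranges(cp, CATEGORY_G):
        return "ALWAYS"
    if gc == "Cf":                       # Category J
        return "CONTEXT"
    if gc == "Cn":                       # Category K
        return "UNASSIGNED"
    if cp in CATEGORY_H:
        return CATEGORY_H[cp]
    # Phase 2: per-script rules.
    sc = script(cp)
    in_b = unicodedata.normalize("NFKC", ch) != ch
    if sc == "Han":                      # rule Y (Section 3.1.2)
        return "NEVER" if in_b else "MAYBE YES"
    if gc not in CATEGORY_A:
        return "NEVER"
    excluded = (in_b                                  # Category B
                or ch.casefold() != ch                # Category C
                or cp in ignorables                   # Category D
                or sc in CATEGORY_E                   # Category E
                or _in_ranges(cp, CATEGORY_F))        # Category F
    if excluded:
        return "NEVER"
    # Classified European scripts may reach ALWAYS; others stop at MAYBE YES.
    return "ALWAYS" if sc in EUROPEAN else "MAYBE YES"


if __name__ == "__main__":
    for cp in (0x002D, 0x00AD, 0x00B7, 0x00E0, 0x0100, 0x4E2D, 0x05D0):
        print(f"{cp:04X} ; {derived_property(cp)}")
```

   Passing the script lookup in as a parameter keeps the sketch
   independent of how Scripts.txt is loaded; swapping in a table-driven
   function changes no other part of the calculation.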
   The table in Appendix A shows, for illustrative purposes, the
   consequences of the categories and classification rules, and the
   resulting property values.

   The list of codepoints that can be found in Appendix A is
   non-normative.  Section 2 and Section 3 are normative.

5.  IANA Considerations

   ...To be supplied.  This work will ultimately require registries of
   characters that are acceptable for use in IDNs.

6.  Security Considerations

   The security issues associated with this work are discussed in
   [IDNA200X-issues] and [IDNA200X-protocol].

7.  Contributors

   While the listed editor held the pen, this document represents the
   joint work and conclusions of an ad hoc design team.  In addition to
   the editor, the team consisted of Harald Alvestrand, Tina Dam, Cary
   Karp, and John Klensin.

8.  Acknowledgements

   This document would not have been possible to produce without input
   from many people.  The main contributors are (in alphabetical order)
   Harald Alvestrand, Vint Cerf, Tina Dam, Mark Davis, Mouhammet Diop,
   Michael Everson, Asmus Freytag, Debbie Garside, Paul Hoffman, Cary
   Karp, John Klensin, Olaf Kolkman, Lisa Moore, Yngve Pettersen,
   Hualin Qian, Rick Reed, Michel Suignard and Kenneth Whistler.

9.  References

9.1.  Normative References

   [RFC4690]  Klensin, J., Faltstrom, P., and Karp, C., "Review and
              Recommendations for Internationalized Domain Names
              (IDNs)", RFC 4690, September 2006.

   [Unicode5]
              The Unicode Consortium, "The Unicode Standard, Version
              5.0", Boston, MA, Addison-Wesley ISBN 0-321-48091-0,
              2007.

9.2.  Informative References

   [IDNA200X-Bidi]
              Alvestrand, H. and C. Karp, "An IDNA problem in
              right-to-left scripts", July 2007.

   [IDNA200X-issues]
              Klensin, J., Ed., "Internationalizing Domain Names for
              Applications (IDNA): Issues and Rationale",
              November 2007.

   [IDNA200X-protocol]
              Klensin, J., "Internationalizing Domain Names in
              Applications (IDNA): Protocol", November 2007.
   [RFC1035]  Mockapetris, P., "Domain names - implementation and
              specification", STD 13, RFC 1035, November 1987.

   [RFC3454]  Hoffman, P. and M. Blanchet, "Preparation of
              Internationalized Strings ("stringprep")", RFC 3454,
              December 2002.

   [RFC3491]  Hoffman, P. and M. Blanchet, "Nameprep: A Stringprep
              Profile for Internationalized Domain Names (IDN)",
              RFC 3491, March 2003.

   [RFC4713]  Lee, X., Mao, W., Chen, E., Hsu, N., and J. Klensin,
              "Registration and Administration Recommendations for
              Chinese Domain Names", RFC 4713, October 2006.

URIs

   [1]
   [2]
   [3]
   [4]
   [5]
   [6]
   [7]
   [8]
   [9]
   [10]
   [11]

Appendix A.  Codepoints 0x0000 - 0x10FFFD

   If one applies the rules (Section 3) to the codepoints 0x0000 to
   0x10FFFD, the result is as follows.

   This list is non-normative, as explained in Section 4, and only
   included for illustrative purposes.

A.1.  Codepoints in Unicode Character Database (UCD) format

   0000..002C ; NEVER     # <control>..COMMA
   002D       ; ALWAYS    # HYPHEN-MINUS
   002E..002F ; NEVER     # FULL STOP..SOLIDUS
   0030..0039 ; ALWAYS    # DIGIT ZERO..DIGIT NINE
   003A..0060 ; NEVER     # COLON..GRAVE ACCENT
   0061..007A ; ALWAYS    # LATIN SMALL LETTER A..LATIN SMALL LETTER Z
   007B..00AC ; NEVER     # LEFT CURLY BRACKET..NOT SIGN
   00AD       ; CONTEXT   # SOFT HYPHEN
   00AE..00B6 ; NEVER     # REGISTERED SIGN..PILCROW SIGN
   00B7       ; MAYBE YES # MIDDLE DOT
   00B8..00DF ; NEVER     # CEDILLA..LATIN SMALL LETTER SHARP S
   00E0..00F6 ; ALWAYS    # LATIN SMALL LETTER A WITH GRAVE..LATIN SMALL LE
   00F7       ; NEVER     # DIVISION SIGN
   00F8..00FF ; ALWAYS    # LATIN SMALL LETTER O WITH STROKE..LATIN SMALL L
   0100       ; NEVER     # LATIN CAPITAL LETTER A WITH MACRON
   0101       ; ALWAYS    # LATIN SMALL LETTER A WITH MACRON
   0102       ; NEVER     # LATIN CAPITAL LETTER A WITH BREVE
   0103       ; ALWAYS    # LATIN SMALL LETTER A WITH BREVE
   0104       ; NEVER     # LATIN CAPITAL LETTER A WITH OGONEK
   0105       ; ALWAYS    # LATIN SMALL LETTER A WITH OGONEK
   0106       ; NEVER     # LATIN CAPITAL LETTER C WITH ACUTE
   0107       ; ALWAYS    # LATIN SMALL LETTER C WITH ACUTE
   0108       ; NEVER     # LATIN CAPITAL LETTER C WITH CIRCUMFLEX
   0109       ; ALWAYS    # LATIN SMALL LETTER C WITH CIRCUMFLEX
   010A       ; NEVER     # LATIN CAPITAL LETTER C WITH DOT ABOVE
   010B       ; ALWAYS    # LATIN SMALL LETTER C WITH DOT ABOVE
   010C       ; NEVER     # LATIN CAPITAL LETTER C WITH CARON
   010D       ; ALWAYS    # LATIN SMALL LETTER C WITH CARON
   010E       ; NEVER     # LATIN CAPITAL LETTER D WITH CARON
   010F       ; ALWAYS    # LATIN SMALL LETTER D WITH CARON
   0110       ; NEVER     # LATIN CAPITAL LETTER D WITH STROKE
   0111       ; ALWAYS    # LATIN SMALL LETTER D WITH STROKE
   0112       ; NEVER     # LATIN CAPITAL LETTER E WITH MACRON
   0113       ; ALWAYS    # LATIN SMALL LETTER E WITH MACRON
   0114       ; NEVER     # LATIN CAPITAL LETTER E WITH BREVE
   0115       ; ALWAYS    # LATIN SMALL LETTER E WITH BREVE
   0116       ; NEVER     # LATIN CAPITAL LETTER E WITH DOT ABOVE
   0117       ; ALWAYS    # LATIN SMALL LETTER E WITH DOT ABOVE
   0118       ; NEVER     # LATIN CAPITAL LETTER E WITH OGONEK
   0119       ; ALWAYS    # LATIN SMALL LETTER E WITH OGONEK
   011A       ; NEVER     # LATIN CAPITAL LETTER E WITH CARON
   011B       ; ALWAYS    # LATIN SMALL LETTER E WITH CARON
   011C       ; NEVER     # LATIN CAPITAL LETTER G WITH CIRCUMFLEX
   011D       ; ALWAYS    # LATIN SMALL LETTER G WITH CIRCUMFLEX
   011E       ; NEVER     # LATIN CAPITAL LETTER G WITH BREVE
   011F       ; ALWAYS    # LATIN SMALL LETTER G WITH BREVE
   0120       ; NEVER     # LATIN CAPITAL LETTER G WITH DOT ABOVE
   0121       ; ALWAYS    # LATIN SMALL LETTER G WITH DOT ABOVE
   0122       ; NEVER     # LATIN CAPITAL LETTER G WITH CEDILLA
   0123       ; ALWAYS    # LATIN SMALL LETTER G WITH CEDILLA
   0124       ; NEVER     # LATIN CAPITAL LETTER H WITH CIRCUMFLEX
   0125       ; ALWAYS    # LATIN SMALL LETTER H WITH CIRCUMFLEX
   0126       ; NEVER     # LATIN CAPITAL LETTER H WITH STROKE
   0127       ; ALWAYS    # LATIN SMALL LETTER H WITH STROKE
   0128       ; NEVER     # LATIN CAPITAL LETTER I WITH TILDE
   0129       ; ALWAYS    # LATIN SMALL LETTER I WITH TILDE
   012A       ; NEVER     # LATIN CAPITAL LETTER I WITH MACRON
   012B       ; ALWAYS    # LATIN SMALL LETTER I WITH MACRON
   012C       ; NEVER     # LATIN CAPITAL LETTER I WITH BREVE
   012D       ; ALWAYS    # LATIN SMALL LETTER I WITH BREVE
   012E       ; NEVER     # LATIN CAPITAL LETTER I WITH OGONEK
   012F..0131 ; ALWAYS    # LATIN SMALL LETTER I WITH OGONEK..LATIN SMALL L
   0132..0134 ; NEVER     # LATIN CAPITAL LIGATURE IJ..LATIN CAPITAL LETTER
   0135       ; ALWAYS    # LATIN SMALL LETTER J WITH CIRCUMFLEX
   0136       ; NEVER     # LATIN CAPITAL LETTER K WITH CEDILLA
   0137..0138 ; ALWAYS    # LATIN SMALL LETTER K WITH CEDILLA..LATIN SMALL
   0139       ; NEVER     # LATIN CAPITAL LETTER L WITH ACUTE
   013A       ; ALWAYS    # LATIN SMALL LETTER L WITH ACUTE
   013B       ; NEVER     # LATIN CAPITAL LETTER L WITH CEDILLA
   013C       ; ALWAYS    # LATIN SMALL LETTER L WITH CEDILLA
   013D       ; NEVER     # LATIN CAPITAL LETTER L WITH CARON
   013E       ; ALWAYS    # LATIN SMALL LETTER L WITH CARON
   013F..0141 ; NEVER     # LATIN CAPITAL LETTER L WITH MIDDLE DOT..LATIN C
   0142       ; ALWAYS    # LATIN SMALL LETTER L WITH STROKE
   0143       ; NEVER     # LATIN CAPITAL LETTER N WITH ACUTE
   0144       ; ALWAYS    # LATIN SMALL LETTER N WITH ACUTE
   0145       ; NEVER     # LATIN CAPITAL LETTER N WITH CEDILLA
   0146       ; ALWAYS    # LATIN SMALL LETTER N WITH CEDILLA
   0147       ; NEVER     # LATIN CAPITAL LETTER N WITH CARON
   0148       ; ALWAYS    # LATIN SMALL LETTER N WITH CARON
   0149..014A ; NEVER     # LATIN SMALL LETTER N PRECEDED BY APOSTROPHE..LA
   014B       ; ALWAYS    # LATIN SMALL LETTER ENG
   014C       ; NEVER     # LATIN CAPITAL LETTER O WITH MACRON
   014D       ; ALWAYS    # LATIN SMALL LETTER O WITH MACRON
   014E       ; NEVER     # LATIN CAPITAL LETTER O WITH BREVE
   014F       ; ALWAYS    # LATIN SMALL LETTER O WITH BREVE
   0150       ; NEVER     # LATIN CAPITAL LETTER O WITH DOUBLE ACUTE
   0151       ; ALWAYS    # LATIN SMALL LETTER O WITH DOUBLE ACUTE
   0152       ; NEVER     # LATIN CAPITAL LIGATURE OE
   0153       ; ALWAYS    # LATIN SMALL LIGATURE OE
   0154       ; NEVER     # LATIN CAPITAL LETTER R WITH ACUTE
   0155       ; ALWAYS    # LATIN SMALL LETTER R WITH ACUTE
   0156       ; NEVER     # LATIN CAPITAL LETTER R WITH CEDILLA
   0157       ; ALWAYS    # LATIN SMALL LETTER R WITH CEDILLA
   0158       ; NEVER     # LATIN CAPITAL LETTER R WITH CARON
   0159       ; ALWAYS    # LATIN SMALL LETTER R WITH CARON
   015A       ; NEVER     # LATIN CAPITAL LETTER S WITH ACUTE
   015B       ; ALWAYS    # LATIN SMALL LETTER S WITH ACUTE
   015C       ; NEVER     # LATIN CAPITAL LETTER S WITH CIRCUMFLEX
   015D       ; ALWAYS    # LATIN SMALL LETTER S WITH CIRCUMFLEX
   015E       ; NEVER     # LATIN CAPITAL LETTER S WITH CEDILLA
   015F       ; ALWAYS    # LATIN SMALL LETTER S WITH CEDILLA
   0160       ; NEVER     # LATIN CAPITAL LETTER S WITH CARON
   0161       ; ALWAYS    # LATIN SMALL LETTER S WITH CARON
   0162       ; NEVER     # LATIN CAPITAL LETTER T WITH CEDILLA
   0163       ; ALWAYS    # LATIN SMALL LETTER T WITH CEDILLA
   0164       ; NEVER     # LATIN CAPITAL LETTER T WITH CARON
   0165       ; ALWAYS    # LATIN SMALL LETTER T WITH CARON
   0166       ; NEVER     # LATIN CAPITAL LETTER T WITH STROKE
   0167       ; ALWAYS    # LATIN SMALL LETTER T WITH STROKE
   0168       ; NEVER     # LATIN CAPITAL LETTER U WITH TILDE
   0169       ; ALWAYS    # LATIN SMALL LETTER U WITH TILDE
   016A       ; NEVER     # LATIN CAPITAL LETTER U WITH MACRON
   016B       ; ALWAYS    # LATIN SMALL LETTER U WITH MACRON
   016C       ; NEVER     # LATIN CAPITAL LETTER U WITH BREVE
   016D       ; ALWAYS    # LATIN SMALL LETTER U WITH BREVE
   016E       ; NEVER     # LATIN CAPITAL LETTER U WITH RING ABOVE
   016F       ; ALWAYS    # LATIN SMALL LETTER U WITH RING ABOVE
   0170       ; NEVER     # LATIN CAPITAL LETTER U WITH DOUBLE ACUTE
   0171       ; ALWAYS    # LATIN SMALL LETTER U WITH DOUBLE ACUTE
   0172       ; NEVER     # LATIN CAPITAL LETTER U WITH OGONEK
   0173       ; ALWAYS    # LATIN SMALL LETTER U WITH OGONEK
   0174       ; NEVER     # LATIN CAPITAL LETTER W WITH CIRCUMFLEX
   0175       ; ALWAYS    # LATIN SMALL LETTER W WITH CIRCUMFLEX
   0176       ; NEVER     # LATIN CAPITAL LETTER Y WITH CIRCUMFLEX
   0177       ; ALWAYS    # LATIN SMALL LETTER Y WITH CIRCUMFLEX
   0178..0179 ; NEVER     # LATIN CAPITAL LETTER Y WITH DIAERESIS..LATIN CA
   017A       ; ALWAYS    # LATIN SMALL LETTER Z WITH ACUTE
   017B       ; NEVER     # LATIN CAPITAL LETTER Z WITH DOT ABOVE
   017C       ; ALWAYS    # LATIN SMALL LETTER Z WITH DOT ABOVE
   017D       ; NEVER     # LATIN CAPITAL LETTER Z WITH CARON
   017E       ; ALWAYS    # LATIN SMALL LETTER Z WITH CARON
   017F       ; NEVER     # LATIN SMALL LETTER LONG S
   0180       ; ALWAYS    # LATIN SMALL LETTER B WITH STROKE
   0181..0182 ; NEVER     # LATIN CAPITAL LETTER B WITH HOOK..LATIN CAPITAL
   0183       ; ALWAYS    # LATIN SMALL LETTER B WITH TOPBAR
   0184       ; NEVER     # LATIN CAPITAL LETTER TONE SIX
   0185       ; ALWAYS    # LATIN SMALL LETTER TONE SIX
   0186..0187 ; NEVER     # LATIN CAPITAL LETTER OPEN O..LATIN CAPITAL LETT
   0188       ; ALWAYS    # LATIN SMALL LETTER C WITH HOOK
   0189..018B ; NEVER     # LATIN CAPITAL LETTER AFRICAN D..LATIN CAPITAL L
   018C..018D ; ALWAYS    # LATIN SMALL LETTER D WITH TOPBAR..LATIN SMALL L
   018E..0191 ; NEVER     # LATIN CAPITAL LETTER REVERSED E..LATIN CAPITAL
   0192       ; ALWAYS    # LATIN SMALL LETTER F WITH HOOK
   0193..0194 ; NEVER     # LATIN CAPITAL LETTER G WITH HOOK..LATIN CAPITAL
   0195       ; ALWAYS    # LATIN SMALL LETTER HV
   0196..0198 ; NEVER     # LATIN CAPITAL LETTER IOTA..LATIN CAPITAL LETTER
   0199..019B ; ALWAYS    # LATIN SMALL LETTER K WITH HOOK..LATIN SMALL LET
   019C..019D ; NEVER     # LATIN CAPITAL LETTER TURNED M..LATIN CAPITAL LE
   019E       ; ALWAYS    # LATIN SMALL LETTER N WITH LONG RIGHT LEG
   019F..01A0 ; NEVER     # LATIN CAPITAL LETTER O WITH MIDDLE TILDE..LATIN
   01A1       ; ALWAYS    # LATIN SMALL LETTER O WITH HORN
   01A2       ; NEVER     # LATIN CAPITAL LETTER OI
   01A3       ; ALWAYS    # LATIN SMALL LETTER OI
   01A4       ; NEVER     # LATIN CAPITAL LETTER P WITH HOOK
   01A5       ; ALWAYS    # LATIN SMALL LETTER P WITH HOOK
   01A6..01A7 ; NEVER     # LATIN LETTER YR..LATIN CAPITAL LETTER TONE TWO
   01A8       ; ALWAYS    # LATIN SMALL LETTER TONE TWO
   01A9       ; NEVER     # LATIN CAPITAL LETTER ESH
   01AA..01AB ; ALWAYS    # LATIN LETTER REVERSED ESH LOOP..LATIN SMALL LET
   01AC       ; NEVER     # LATIN CAPITAL LETTER T WITH HOOK
   01AD       ; ALWAYS    # LATIN SMALL LETTER T WITH HOOK
   01AE..01AF ; NEVER     # LATIN CAPITAL LETTER T WITH RETROFLEX HOOK..LAT
   01B0       ; ALWAYS    # LATIN SMALL LETTER U WITH HORN
   01B1..01B3 ; NEVER     # LATIN CAPITAL LETTER UPSILON..LATIN CAPITAL LET
   01B4       ; ALWAYS    # LATIN SMALL LETTER Y WITH HOOK
   01B5       ; NEVER     # LATIN CAPITAL LETTER Z WITH STROKE
   01B6       ; ALWAYS    # LATIN SMALL LETTER Z WITH STROKE
   01B7..01B8 ; NEVER     # LATIN CAPITAL LETTER EZH..LATIN CAPITAL LETTER
   01B9..01BB ; ALWAYS    # LATIN SMALL LETTER EZH REVERSED..LATIN LETTER T
   01BC       ; NEVER     # LATIN CAPITAL LETTER TONE FIVE
   01BD..01C3 ; ALWAYS    # LATIN SMALL LETTER TONE FIVE..LATIN LETTER RETR
   01C4..01CD ; NEVER     # LATIN CAPITAL LETTER DZ WITH CARON..LATIN CAPIT
   01CE       ; ALWAYS    # LATIN SMALL LETTER A WITH CARON
   01CF       ; NEVER     # LATIN CAPITAL LETTER I WITH CARON
   01D0       ; ALWAYS    # LATIN SMALL LETTER I WITH CARON
   01D1       ; NEVER     # LATIN CAPITAL LETTER O WITH CARON
   01D2       ; ALWAYS    # LATIN SMALL LETTER O WITH CARON
   01D3       ; NEVER     # LATIN CAPITAL LETTER U WITH CARON
   01D4       ; ALWAYS    # LATIN SMALL LETTER U WITH CARON
   01D5..01DC ; NEVER     # LATIN CAPITAL LETTER U WITH DIAERESIS AND MACRO
   01DD       ; ALWAYS    # LATIN SMALL LETTER TURNED E
   01DE..01E2 ; NEVER     # LATIN CAPITAL LETTER A WITH DIAERESIS AND MACRO
   01E3       ; ALWAYS    # LATIN SMALL LETTER AE WITH MACRON
   01E4       ; NEVER     # LATIN CAPITAL LETTER G WITH STROKE
   01E5       ; ALWAYS    # LATIN SMALL LETTER G WITH STROKE
   01E6       ; NEVER     # LATIN CAPITAL LETTER G WITH CARON
   01E7       ; ALWAYS    # LATIN SMALL LETTER G WITH CARON
   01E8       ; NEVER     # LATIN CAPITAL LETTER K WITH CARON
   01E9       ; ALWAYS    # LATIN SMALL LETTER K WITH CARON
   01EA       ; NEVER     # LATIN CAPITAL LETTER O WITH OGONEK
   01EB       ; ALWAYS    # LATIN SMALL LETTER O WITH OGONEK
   01EC..01EE ; NEVER     # LATIN CAPITAL LETTER O WITH OGONEK AND MACRON..
   01EF       ; ALWAYS    # LATIN SMALL LETTER EZH WITH CARON
   01F0..01F4 ; NEVER     # LATIN SMALL LETTER J WITH CARON..LATIN CAPITAL
   01F5       ; ALWAYS    # LATIN SMALL LETTER G WITH ACUTE
   01F6..01F8 ; NEVER     # LATIN CAPITAL LETTER HWAIR..LATIN CAPITAL LETTE
   01F9       ; ALWAYS    # LATIN SMALL LETTER N WITH GRAVE
   01FA..01FC ; NEVER     # LATIN CAPITAL LETTER A WITH RING ABOVE AND ACUT
   01FD       ; ALWAYS    # LATIN SMALL LETTER AE WITH ACUTE
   01FE       ; NEVER     # LATIN CAPITAL LETTER O WITH STROKE AND ACUTE
   01FF       ; ALWAYS    # LATIN SMALL LETTER O WITH STROKE AND ACUTE
   0200       ; NEVER     # LATIN CAPITAL LETTER A WITH DOUBLE GRAVE
   0201       ; ALWAYS    # LATIN SMALL LETTER A WITH DOUBLE GRAVE
   0202       ; NEVER     # LATIN CAPITAL LETTER A WITH INVERTED BREVE
   0203       ; ALWAYS    # LATIN SMALL LETTER A WITH INVERTED BREVE
   0204       ; NEVER     # LATIN CAPITAL LETTER E WITH DOUBLE GRAVE
   0205       ; ALWAYS    # LATIN SMALL LETTER E WITH DOUBLE GRAVE
   0206       ; NEVER     # LATIN CAPITAL LETTER E WITH INVERTED BREVE
   0207       ; ALWAYS    # LATIN SMALL LETTER E WITH INVERTED BREVE
   0208       ; NEVER     # LATIN CAPITAL LETTER I WITH DOUBLE GRAVE
   0209       ; ALWAYS    # LATIN SMALL LETTER I WITH DOUBLE GRAVE
   020A       ; NEVER     # LATIN CAPITAL LETTER I WITH INVERTED BREVE
   020B       ; ALWAYS    # LATIN SMALL LETTER I WITH INVERTED BREVE
   020C       ; NEVER     # LATIN CAPITAL LETTER O WITH DOUBLE GRAVE
   020D       ; ALWAYS    # LATIN SMALL LETTER O WITH DOUBLE GRAVE
   020E       ; NEVER     # LATIN CAPITAL LETTER O WITH INVERTED BREVE
   020F       ; ALWAYS    # LATIN SMALL LETTER O WITH INVERTED BREVE
   0210       ; NEVER     # LATIN CAPITAL LETTER R WITH DOUBLE GRAVE
   0211       ; ALWAYS    # LATIN SMALL LETTER R WITH DOUBLE GRAVE
   0212       ; NEVER     # LATIN CAPITAL LETTER R WITH INVERTED BREVE
   0213       ;
ALWAYS # LATIN SMALL LETTER R WITH INVERTED BREVE 0214 ; NEVER # LATIN CAPITAL LETTER U WITH DOUBLE GRAVE 0215 ; ALWAYS # LATIN SMALL LETTER U WITH DOUBLE GRAVE 0216 ; NEVER # LATIN CAPITAL LETTER U WITH INVERTED BREVE 0217 ; ALWAYS # LATIN SMALL LETTER U WITH INVERTED BREVE 0218 ; NEVER # LATIN CAPITAL LETTER S WITH COMMA BELOW 0219 ; ALWAYS # LATIN SMALL LETTER S WITH COMMA BELOW 021A ; NEVER # LATIN CAPITAL LETTER T WITH COMMA BELOW 021B ; ALWAYS # LATIN SMALL LETTER T WITH COMMA BELOW 021C ; NEVER # LATIN CAPITAL LETTER YOGH 021D ; ALWAYS # LATIN SMALL LETTER YOGH 021E ; NEVER # LATIN CAPITAL LETTER H WITH CARON 021F ; ALWAYS # LATIN SMALL LETTER H WITH CARON 0220 ; NEVER # LATIN CAPITAL LETTER N WITH LONG RIGHT LEG 0221 ; ALWAYS # LATIN SMALL LETTER D WITH CURL 0222 ; NEVER # LATIN CAPITAL LETTER OU 0223 ; ALWAYS # LATIN SMALL LETTER OU Faltstrom Expires May 21, 2008 [Page 18] Internet-Draft Unicode Codepoints November 2007 0224 ; NEVER # LATIN CAPITAL LETTER Z WITH HOOK 0225 ; ALWAYS # LATIN SMALL LETTER Z WITH HOOK 0226 ; NEVER # LATIN CAPITAL LETTER A WITH DOT ABOVE 0227 ; ALWAYS # LATIN SMALL LETTER A WITH DOT ABOVE 0228 ; NEVER # LATIN CAPITAL LETTER E WITH CEDILLA 0229 ; ALWAYS # LATIN SMALL LETTER E WITH CEDILLA 022A..022E ; NEVER # LATIN CAPITAL LETTER O WITH DIAERESIS AND MACRO 022F ; ALWAYS # LATIN SMALL LETTER O WITH DOT ABOVE 0230..0232 ; NEVER # LATIN CAPITAL LETTER O WITH DOT ABOVE AND MACRO 0233..0239 ; ALWAYS # LATIN SMALL LETTER Y WITH MACRON..LATIN SMALL L 023A..023B ; NEVER # LATIN CAPITAL LETTER A WITH STROKE..LATIN CAPIT 023C ; ALWAYS # LATIN SMALL LETTER C WITH STROKE 023D..023E ; NEVER # LATIN CAPITAL LETTER L WITH BAR..LATIN CAPITAL 023F..0240 ; ALWAYS # LATIN SMALL LETTER S WITH SWASH TAIL..LATIN SMA 0241 ; NEVER # LATIN CAPITAL LETTER GLOTTAL STOP 0242 ; ALWAYS # LATIN SMALL LETTER GLOTTAL STOP 0243..0246 ; NEVER # LATIN CAPITAL LETTER B WITH STROKE..LATIN CAPIT 0247 ; ALWAYS # LATIN SMALL LETTER E WITH STROKE 0248 ; NEVER # LATIN 
CAPITAL LETTER J WITH STROKE 0249 ; ALWAYS # LATIN SMALL LETTER J WITH STROKE 024A ; NEVER # LATIN CAPITAL LETTER SMALL Q WITH HOOK TAIL 024B ; ALWAYS # LATIN SMALL LETTER Q WITH HOOK TAIL 024C ; NEVER # LATIN CAPITAL LETTER R WITH STROKE 024D ; ALWAYS # LATIN SMALL LETTER R WITH STROKE 024E ; NEVER # LATIN CAPITAL LETTER Y WITH STROKE 024F..02AF ; ALWAYS # LATIN SMALL LETTER Y WITH STROKE..LATIN SMALL L 02B0..02B8 ; NEVER # MODIFIER LETTER SMALL H..MODIFIER LETTER SMALL 02B9..02C1 ; MAYBE YES # MODIFIER LETTER PRIME..MODIFIER LETTER REVERSED 02C2..02C5 ; NEVER # MODIFIER LETTER LEFT ARROWHEAD..MODIFIER LETTER 02C6..02D1 ; MAYBE YES # MODIFIER LETTER CIRCUMFLEX ACCENT..MODIFIER LET 02D2..02ED ; NEVER # MODIFIER LETTER CENTRED RIGHT HALF RING..MODIFI 02EE ; MAYBE YES # MODIFIER LETTER DOUBLE APOSTROPHE 02EF..02FF ; NEVER # MODIFIER LETTER LOW DOWN ARROWHEAD..MODIFIER LE 0300..033F ; MAYBE YES # COMBINING GRAVE ACCENT..COMBINING DOUBLE OVERLI 0340..0341 ; NEVER # COMBINING GRAVE TONE MARK..COMBINING ACUTE TONE 0342 ; MAYBE YES # COMBINING GREEK PERISPOMENI 0343..0344 ; NEVER # COMBINING GREEK KORONIS..COMBINING GREEK DIALYT 0345..034E ; MAYBE YES # COMBINING GREEK YPOGEGRAMMENI..COMBINING UPWARD 034F ; NEVER # COMBINING GRAPHEME JOINER 0350..036F ; MAYBE YES # COMBINING RIGHT ARROWHEAD ABOVE..COMBINING LATI 0370..0373 ; UNASSIGNED# .. 0374..0375 ; NEVER # GREEK NUMERAL SIGN..GREEK LOWER NUMERAL SIGN 0376..0379 ; UNASSIGNED# .. 037A..037D ; ALWAYS # GREEK YPOGEGRAMMENI..GREEK SMALL REVERSED DOTTE 037E ; NEVER # GREEK QUESTION MARK 037F..0383 ; UNASSIGNED# .. 
0384..038A ; NEVER # GREEK TONOS..GREEK CAPITAL LETTER IOTA WITH TON
038B ; UNASSIGNED#
038C ; NEVER # GREEK CAPITAL LETTER OMICRON WITH TONOS
038D ; UNASSIGNED#
038E..03A1 ; NEVER # GREEK CAPITAL LETTER UPSILON WITH TONOS..GREEK
03A2 ; UNASSIGNED#
03A3..03AB ; NEVER # GREEK CAPITAL LETTER SIGMA..GREEK CAPITAL LETTE
03AC..03AF ; ALWAYS # GREEK SMALL LETTER ALPHA WITH TONOS..GREEK SMAL
03B0 ; NEVER # GREEK SMALL LETTER UPSILON WITH DIALYTIKA AND T
03B1..03CE ; ALWAYS # GREEK SMALL LETTER ALPHA..GREEK SMALL LETTER OM
03CF ; UNASSIGNED#
03D0..03D6 ; NEVER # GREEK BETA SYMBOL..GREEK PI SYMBOL
03D7 ; ALWAYS # GREEK KAI SYMBOL
03D8 ; NEVER # GREEK LETTER ARCHAIC KOPPA
03D9 ; ALWAYS # GREEK SMALL LETTER ARCHAIC KOPPA
03DA ; NEVER # GREEK LETTER STIGMA
03DB ; ALWAYS # GREEK SMALL LETTER STIGMA
03DC ; NEVER # GREEK LETTER DIGAMMA
03DD ; ALWAYS # GREEK SMALL LETTER DIGAMMA
03DE ; NEVER # GREEK LETTER KOPPA
03DF ; ALWAYS # GREEK SMALL LETTER KOPPA
03E0 ; NEVER # GREEK LETTER SAMPI
03E1 ; ALWAYS # GREEK SMALL LETTER SAMPI
03E2 ; NEVER # COPTIC CAPITAL LETTER SHEI
03E3 ; MAYBE YES # COPTIC SMALL LETTER SHEI
03E4 ; NEVER # COPTIC CAPITAL LETTER FEI
03E5 ; MAYBE YES # COPTIC SMALL LETTER FEI
03E6 ; NEVER # COPTIC CAPITAL LETTER KHEI
03E7 ; MAYBE YES # COPTIC SMALL LETTER KHEI
03E8 ; NEVER # COPTIC CAPITAL LETTER HORI
03E9 ; MAYBE YES # COPTIC SMALL LETTER HORI
03EA ; NEVER # COPTIC CAPITAL LETTER GANGIA
03EB ; MAYBE YES # COPTIC SMALL LETTER GANGIA
03EC ; NEVER # COPTIC CAPITAL LETTER SHIMA
03ED ; MAYBE YES # COPTIC SMALL LETTER SHIMA
03EE ; NEVER # COPTIC CAPITAL LETTER DEI
03EF ; MAYBE YES # COPTIC SMALL LETTER DEI
03F0..03F2 ; NEVER # GREEK KAPPA SYMBOL..GREEK LUNATE SIGMA SYMBOL
03F3 ; ALWAYS # GREEK LETTER YOT
03F4..03F7 ; NEVER # GREEK CAPITAL THETA SYMBOL..GREEK CAPITAL LETTE
03F8 ; ALWAYS # GREEK SMALL LETTER SHO
03F9..03FA ; NEVER # GREEK CAPITAL LUNATE SIGMA SYMBOL..GREEK CAPITA
03FB..03FC ; ALWAYS # GREEK SMALL LETTER SAN..GREEK RHO WITH STROKE S
03FD..042F ; NEVER # GREEK CAPITAL REVERSED LUNATE SIGMA SYMBOL..CYR
0430..045F ; ALWAYS # CYRILLIC SMALL LETTER A..CYRILLIC SMALL LETTER
0460 ; NEVER # CYRILLIC CAPITAL LETTER OMEGA
0461 ; ALWAYS # CYRILLIC SMALL LETTER OMEGA
0462 ; NEVER # CYRILLIC CAPITAL LETTER YAT
0463 ; ALWAYS # CYRILLIC SMALL LETTER YAT
0464 ; NEVER # CYRILLIC CAPITAL LETTER IOTIFIED E
0465 ; ALWAYS # CYRILLIC SMALL LETTER IOTIFIED E
0466 ; NEVER # CYRILLIC CAPITAL LETTER LITTLE YUS
0467 ; ALWAYS # CYRILLIC SMALL LETTER LITTLE YUS
0468 ; NEVER # CYRILLIC CAPITAL LETTER IOTIFIED LITTLE YUS
0469 ; ALWAYS # CYRILLIC SMALL LETTER IOTIFIED LITTLE YUS
046A ; NEVER # CYRILLIC CAPITAL LETTER BIG YUS
046B ; ALWAYS # CYRILLIC SMALL LETTER BIG YUS
046C ; NEVER # CYRILLIC CAPITAL LETTER IOTIFIED BIG YUS
046D ; ALWAYS # CYRILLIC SMALL LETTER IOTIFIED BIG YUS
046E ; NEVER # CYRILLIC CAPITAL LETTER KSI
046F ; ALWAYS # CYRILLIC SMALL LETTER KSI
0470 ; NEVER # CYRILLIC CAPITAL LETTER PSI
0471 ; ALWAYS # CYRILLIC SMALL LETTER PSI
0472 ; NEVER # CYRILLIC CAPITAL LETTER FITA
0473 ; ALWAYS # CYRILLIC SMALL LETTER FITA
0474 ; NEVER # CYRILLIC CAPITAL LETTER IZHITSA
0475 ; ALWAYS # CYRILLIC SMALL LETTER IZHITSA
0476 ; NEVER # CYRILLIC CAPITAL LETTER IZHITSA WITH DOUBLE GRA
0477 ; ALWAYS # CYRILLIC SMALL LETTER IZHITSA WITH DOUBLE GRAVE
0478 ; NEVER # CYRILLIC CAPITAL LETTER UK
0479 ; ALWAYS # CYRILLIC SMALL LETTER UK
047A ; NEVER # CYRILLIC CAPITAL LETTER ROUND OMEGA
047B ; ALWAYS # CYRILLIC SMALL LETTER ROUND OMEGA
047C ; NEVER # CYRILLIC CAPITAL LETTER OMEGA WITH TITLO
047D ; ALWAYS # CYRILLIC SMALL LETTER OMEGA WITH TITLO
047E ; NEVER # CYRILLIC CAPITAL LETTER OT
047F ; ALWAYS # CYRILLIC SMALL LETTER OT
0480 ; NEVER # CYRILLIC CAPITAL LETTER KOPPA
0481 ; ALWAYS # CYRILLIC SMALL LETTER KOPPA
0482 ; NEVER # CYRILLIC THOUSANDS SIGN
0483..0486 ; ALWAYS # COMBINING CYRILLIC TITLO..COMBINING CYRILLIC PS
0487 ; UNASSIGNED#
0488..048A ; NEVER # COMBINING CYRILLIC HUNDRED THOUSANDS SIGN..CYRI
048B ; ALWAYS # CYRILLIC SMALL LETTER SHORT I WITH TAIL
048C ; NEVER # CYRILLIC CAPITAL LETTER SEMISOFT SIGN
048D ; ALWAYS # CYRILLIC SMALL LETTER SEMISOFT SIGN
048E ; NEVER # CYRILLIC CAPITAL LETTER ER WITH TICK
048F ; ALWAYS # CYRILLIC SMALL LETTER ER WITH TICK
0490 ; NEVER # CYRILLIC CAPITAL LETTER GHE WITH UPTURN
0491 ; ALWAYS # CYRILLIC SMALL LETTER GHE WITH UPTURN
0492 ; NEVER # CYRILLIC CAPITAL LETTER GHE WITH STROKE
0493 ; ALWAYS # CYRILLIC SMALL LETTER GHE WITH STROKE
0494 ; NEVER # CYRILLIC CAPITAL LETTER GHE WITH MIDDLE HOOK
0495 ; ALWAYS # CYRILLIC SMALL LETTER GHE WITH MIDDLE HOOK
0496 ; NEVER # CYRILLIC CAPITAL LETTER ZHE WITH DESCENDER
0497 ; ALWAYS # CYRILLIC SMALL LETTER ZHE WITH DESCENDER
0498 ; NEVER # CYRILLIC CAPITAL LETTER ZE WITH DESCENDER
0499 ; ALWAYS # CYRILLIC SMALL LETTER ZE WITH DESCENDER
049A ; NEVER # CYRILLIC CAPITAL LETTER KA WITH DESCENDER
049B ; ALWAYS # CYRILLIC SMALL LETTER KA WITH DESCENDER
049C ; NEVER # CYRILLIC CAPITAL LETTER KA WITH VERTICAL STROKE
049D ; ALWAYS # CYRILLIC SMALL LETTER KA WITH VERTICAL STROKE
049E ; NEVER # CYRILLIC CAPITAL LETTER KA WITH STROKE
049F ; ALWAYS # CYRILLIC SMALL LETTER KA WITH STROKE
04A0 ; NEVER # CYRILLIC CAPITAL LETTER BASHKIR KA
04A1 ; ALWAYS # CYRILLIC SMALL LETTER BASHKIR KA
04A2 ; NEVER # CYRILLIC CAPITAL LETTER EN WITH DESCENDER
04A3 ; ALWAYS # CYRILLIC SMALL LETTER EN WITH DESCENDER
04A4 ; NEVER # CYRILLIC CAPITAL LIGATURE EN GHE
04A5 ; ALWAYS # CYRILLIC SMALL LIGATURE EN GHE
04A6 ; NEVER # CYRILLIC CAPITAL LETTER PE WITH MIDDLE HOOK
04A7 ; ALWAYS # CYRILLIC SMALL LETTER PE WITH MIDDLE HOOK
04A8 ; NEVER # CYRILLIC CAPITAL LETTER ABKHASIAN HA
04A9 ; ALWAYS # CYRILLIC SMALL LETTER ABKHASIAN HA
04AA ; NEVER # CYRILLIC CAPITAL LETTER ES WITH DESCENDER
04AB ; ALWAYS # CYRILLIC SMALL LETTER ES WITH DESCENDER
04AC ; NEVER # CYRILLIC CAPITAL LETTER TE WITH DESCENDER
04AD ; ALWAYS # CYRILLIC SMALL LETTER TE WITH DESCENDER
04AE ; NEVER # CYRILLIC CAPITAL LETTER STRAIGHT U
04AF ; ALWAYS # CYRILLIC SMALL LETTER STRAIGHT U
04B0 ; NEVER # CYRILLIC CAPITAL LETTER STRAIGHT U WITH STROKE
04B1 ; ALWAYS # CYRILLIC SMALL LETTER STRAIGHT U WITH STROKE
04B2 ; NEVER # CYRILLIC CAPITAL LETTER HA WITH DESCENDER
04B3 ; ALWAYS # CYRILLIC SMALL LETTER HA WITH DESCENDER
04B4 ; NEVER # CYRILLIC CAPITAL LIGATURE TE TSE
04B5 ; ALWAYS # CYRILLIC SMALL LIGATURE TE TSE
04B6 ; NEVER # CYRILLIC CAPITAL LETTER CHE WITH DESCENDER
04B7 ; ALWAYS # CYRILLIC SMALL LETTER CHE WITH DESCENDER
04B8 ; NEVER # CYRILLIC CAPITAL LETTER CHE WITH VERTICAL STROK
04B9 ; ALWAYS # CYRILLIC SMALL LETTER CHE WITH VERTICAL STROKE
04BA ; NEVER # CYRILLIC CAPITAL LETTER SHHA
04BB ; ALWAYS # CYRILLIC SMALL LETTER SHHA
04BC ; NEVER # CYRILLIC CAPITAL LETTER ABKHASIAN CHE
04BD ; ALWAYS # CYRILLIC SMALL LETTER ABKHASIAN CHE
04BE ; NEVER # CYRILLIC CAPITAL LETTER ABKHASIAN CHE WITH DESC
04BF ; ALWAYS # CYRILLIC SMALL LETTER ABKHASIAN CHE WITH DESCEN
04C0..04C1 ; NEVER # CYRILLIC LETTER PALOCHKA..CYRILLIC CAPITAL LETT
04C2 ; ALWAYS # CYRILLIC SMALL LETTER ZHE WITH BREVE
04C3 ; NEVER # CYRILLIC CAPITAL LETTER KA WITH HOOK
04C4 ; ALWAYS # CYRILLIC SMALL LETTER KA WITH HOOK
04C5 ; NEVER # CYRILLIC CAPITAL LETTER EL WITH TAIL
04C6 ; ALWAYS # CYRILLIC SMALL LETTER EL WITH TAIL
04C7 ; NEVER # CYRILLIC CAPITAL LETTER EN WITH HOOK
04C8 ; ALWAYS # CYRILLIC SMALL LETTER EN WITH HOOK
04C9 ; NEVER # CYRILLIC CAPITAL LETTER EN WITH TAIL
04CA ; ALWAYS # CYRILLIC SMALL LETTER EN WITH TAIL
04CB ; NEVER # CYRILLIC CAPITAL LETTER KHAKASSIAN CHE
04CC ; ALWAYS # CYRILLIC SMALL LETTER KHAKASSIAN CHE
04CD ; NEVER # CYRILLIC CAPITAL LETTER EM WITH TAIL
04CE..04CF ; ALWAYS # CYRILLIC SMALL LETTER EM WITH TAIL..CYRILLIC SM
04D0 ; NEVER # CYRILLIC CAPITAL LETTER A WITH BREVE
04D1 ; ALWAYS # CYRILLIC SMALL LETTER A WITH BREVE
04D2 ; NEVER # CYRILLIC CAPITAL LETTER A WITH DIAERESIS
04D3 ; ALWAYS # CYRILLIC SMALL LETTER A WITH DIAERESIS
04D4 ; NEVER # CYRILLIC CAPITAL LIGATURE A IE
04D5 ; ALWAYS # CYRILLIC SMALL LIGATURE A IE
04D6 ; NEVER # CYRILLIC CAPITAL LETTER IE WITH BREVE
04D7 ; ALWAYS # CYRILLIC SMALL LETTER IE WITH BREVE
04D8 ; NEVER # CYRILLIC CAPITAL LETTER SCHWA
04D9 ; ALWAYS # CYRILLIC SMALL LETTER SCHWA
04DA ; NEVER # CYRILLIC CAPITAL LETTER SCHWA WITH DIAERESIS
04DB ; ALWAYS # CYRILLIC SMALL LETTER SCHWA WITH DIAERESIS
04DC ; NEVER # CYRILLIC CAPITAL LETTER ZHE WITH DIAERESIS
04DD ; ALWAYS # CYRILLIC SMALL LETTER ZHE WITH DIAERESIS
04DE ; NEVER # CYRILLIC CAPITAL LETTER ZE WITH DIAERESIS
04DF ; ALWAYS # CYRILLIC SMALL LETTER ZE WITH DIAERESIS
04E0 ; NEVER # CYRILLIC CAPITAL LETTER ABKHASIAN DZE
04E1 ; ALWAYS # CYRILLIC SMALL LETTER ABKHASIAN DZE
04E2 ; NEVER # CYRILLIC CAPITAL LETTER I WITH MACRON
04E3 ; ALWAYS # CYRILLIC SMALL LETTER I WITH MACRON
04E4 ; NEVER # CYRILLIC CAPITAL LETTER I WITH DIAERESIS
04E5 ; ALWAYS # CYRILLIC SMALL LETTER I WITH DIAERESIS
04E6 ; NEVER # CYRILLIC CAPITAL LETTER O WITH DIAERESIS
04E7 ; ALWAYS # CYRILLIC SMALL LETTER O WITH DIAERESIS
04E8 ; NEVER # CYRILLIC CAPITAL LETTER BARRED O
04E9 ; ALWAYS # CYRILLIC SMALL LETTER BARRED O
04EA ; NEVER # CYRILLIC CAPITAL LETTER BARRED O WITH DIAERESIS
04EB ; ALWAYS # CYRILLIC SMALL LETTER BARRED O WITH DIAERESIS
04EC ; NEVER # CYRILLIC CAPITAL LETTER E WITH DIAERESIS
04ED ; ALWAYS # CYRILLIC SMALL LETTER E WITH DIAERESIS
04EE ; NEVER # CYRILLIC CAPITAL LETTER U WITH MACRON
04EF ; ALWAYS # CYRILLIC SMALL LETTER U WITH MACRON
04F0 ; NEVER # CYRILLIC CAPITAL LETTER U WITH DIAERESIS
04F1 ; ALWAYS # CYRILLIC SMALL LETTER U WITH DIAERESIS
04F2 ; NEVER # CYRILLIC CAPITAL LETTER U WITH DOUBLE ACUTE
04F3 ; ALWAYS # CYRILLIC SMALL LETTER U WITH DOUBLE ACUTE
04F4 ; NEVER # CYRILLIC CAPITAL LETTER CHE WITH DIAERESIS
04F5 ; ALWAYS # CYRILLIC SMALL LETTER CHE WITH DIAERESIS
04F6 ; NEVER # CYRILLIC CAPITAL LETTER GHE WITH DESCENDER
04F7 ; ALWAYS # CYRILLIC SMALL LETTER GHE WITH DESCENDER
04F8 ; NEVER # CYRILLIC CAPITAL LETTER YERU WITH DIAERESIS
04F9 ; ALWAYS # CYRILLIC SMALL LETTER YERU WITH DIAERESIS
04FA ; NEVER # CYRILLIC CAPITAL LETTER GHE WITH STROKE AND HOO
04FB ; ALWAYS # CYRILLIC SMALL LETTER GHE WITH STROKE AND HOOK
04FC ; NEVER # CYRILLIC CAPITAL LETTER HA WITH HOOK
04FD ; ALWAYS # CYRILLIC SMALL LETTER HA WITH HOOK
04FE ; NEVER # CYRILLIC CAPITAL LETTER HA WITH STROKE
04FF ; ALWAYS # CYRILLIC SMALL LETTER HA WITH STROKE
0500 ; NEVER # CYRILLIC CAPITAL LETTER KOMI DE
0501 ; ALWAYS # CYRILLIC SMALL LETTER KOMI DE
0502 ; NEVER # CYRILLIC CAPITAL LETTER KOMI DJE
0503 ; ALWAYS # CYRILLIC SMALL LETTER KOMI DJE
0504 ; NEVER # CYRILLIC CAPITAL LETTER KOMI ZJE
0505 ; ALWAYS # CYRILLIC SMALL LETTER KOMI ZJE
0506 ; NEVER # CYRILLIC CAPITAL LETTER KOMI DZJE
0507 ; ALWAYS # CYRILLIC SMALL LETTER KOMI DZJE
0508 ; NEVER # CYRILLIC CAPITAL LETTER KOMI LJE
0509 ; ALWAYS # CYRILLIC SMALL LETTER KOMI LJE
050A ; NEVER # CYRILLIC CAPITAL LETTER KOMI NJE
050B ; ALWAYS # CYRILLIC SMALL LETTER KOMI NJE
050C ; NEVER # CYRILLIC CAPITAL LETTER KOMI SJE
050D ; ALWAYS # CYRILLIC SMALL LETTER KOMI SJE
050E ; NEVER # CYRILLIC CAPITAL LETTER KOMI TJE
050F ; ALWAYS # CYRILLIC SMALL LETTER KOMI TJE
0510 ; NEVER # CYRILLIC CAPITAL LETTER REVERSED ZE
0511 ; ALWAYS # CYRILLIC SMALL LETTER REVERSED ZE
0512 ; NEVER # CYRILLIC CAPITAL LETTER EL WITH HOOK
0513 ; ALWAYS # CYRILLIC SMALL LETTER EL WITH HOOK
0514..0530 ; UNASSIGNED# ..
0531..0556 ; NEVER # ARMENIAN CAPITAL LETTER AYB..ARMENIAN CAPITAL L
0557..0558 ; UNASSIGNED# ..
0559 ; MAYBE YES # ARMENIAN MODIFIER LETTER LEFT HALF RING
055A..055F ; NEVER # ARMENIAN APOSTROPHE..ARMENIAN ABBREVIATION MARK
0560 ; UNASSIGNED#
0561..0586 ; MAYBE YES # ARMENIAN SMALL LETTER AYB..ARMENIAN SMALL LETTE
0587 ; NEVER # ARMENIAN SMALL LIGATURE ECH YIWN
0588 ; UNASSIGNED#
0589..058A ; NEVER # ARMENIAN FULL STOP..ARMENIAN HYPHEN
058B..0590 ; UNASSIGNED# ..
0591..05BD ; MAYBE YES # HEBREW ACCENT ETNAHTA..HEBREW POINT METEG
05BE ; NEVER # HEBREW PUNCTUATION MAQAF
05BF ; MAYBE YES # HEBREW POINT RAFE
05C0 ; NEVER # HEBREW PUNCTUATION PASEQ
05C1..05C2 ; MAYBE YES # HEBREW POINT SHIN DOT..HEBREW POINT SIN DOT
05C3 ; NEVER # HEBREW PUNCTUATION SOF PASUQ
05C4..05C5 ; MAYBE YES # HEBREW MARK UPPER DOT..HEBREW MARK LOWER DOT
05C6 ; NEVER # HEBREW PUNCTUATION NUN HAFUKHA
05C7 ; MAYBE YES # HEBREW POINT QAMATS QATAN
05C8..05CF ; UNASSIGNED# ..
05D0..05EA ; MAYBE YES # HEBREW LETTER ALEF..HEBREW LETTER TAV
05EB..05EF ; UNASSIGNED# ..
05F0..05F4 ; MAYBE YES # HEBREW LIGATURE YIDDISH DOUBLE VAV..HEBREW PUNC
05F5..05FF ; UNASSIGNED# ..
0600..0603 ; CONTEXT # ARABIC NUMBER SIGN..ARABIC SIGN SAFHA
0604..060A ; UNASSIGNED# ..
060B..060F ; NEVER # AFGHANI SIGN..ARABIC SIGN MISRA
0610..0615 ; MAYBE YES # ARABIC SIGN SALLALLAHOU ALAYHE WASSALLAM..ARABI
0616..061A ; UNASSIGNED# ..
061B ; NEVER # ARABIC SEMICOLON
061C..061D ; UNASSIGNED# ..
061E..061F ; NEVER # ARABIC TRIPLE DOT PUNCTUATION MARK..ARABIC QUES
0620 ; UNASSIGNED#
0621..063A ; MAYBE YES # ARABIC LETTER HAMZA..ARABIC LETTER GHAIN
063B..063F ; UNASSIGNED# ..
0640..065E ; MAYBE YES # ARABIC TATWEEL..ARABIC FATHA WITH TWO DOTS
065F ; UNASSIGNED#
0660..0669 ; MAYBE YES # ARABIC-INDIC DIGIT ZERO..ARABIC-INDIC DIGIT NIN
066A..066D ; NEVER # ARABIC PERCENT SIGN..ARABIC FIVE POINTED STAR
066E..0674 ; MAYBE YES # ARABIC LETTER DOTLESS BEH..ARABIC LETTER HIGH H
0675..0678 ; NEVER # ARABIC LETTER HIGH HAMZA ALEF..ARABIC LETTER HI
0679..06D3 ; MAYBE YES # ARABIC LETTER TTEH..ARABIC LETTER YEH BARREE WI
06D4 ; NEVER # ARABIC FULL STOP
06D5..06DC ; MAYBE YES # ARABIC LETTER AE..ARABIC SMALL HIGH SEEN
06DD ; CONTEXT # ARABIC END OF AYAH
06DE ; NEVER # ARABIC START OF RUB EL HIZB
06DF..06E8 ; MAYBE YES # ARABIC SMALL HIGH ROUNDED ZERO..ARABIC SMALL HI
06E9 ; NEVER # ARABIC PLACE OF SAJDAH
06EA..06FC ; MAYBE YES # ARABIC EMPTY CENTRE LOW STOP..ARABIC LETTER GHA
06FD..06FE ; NEVER # ARABIC SIGN SINDHI AMPERSAND..ARABIC SIGN SINDH
06FF ; MAYBE YES # ARABIC LETTER HEH WITH INVERTED V
0700..070D ; NEVER # SYRIAC END OF PARAGRAPH..SYRIAC HARKLEAN ASTERI
070E ; UNASSIGNED#
070F ; CONTEXT # SYRIAC ABBREVIATION MARK
0710..074A ; MAYBE YES # SYRIAC LETTER ALAPH..SYRIAC BARREKH
074B..074C ; UNASSIGNED# ..
074D..076D ; MAYBE YES # SYRIAC LETTER SOGDIAN ZHAIN..ARABIC LETTER SEEN
076E..077F ; UNASSIGNED# ..
0780..07B1 ; MAYBE YES # THAANA LETTER HAA..THAANA LETTER NAA
07B2..07BF ; UNASSIGNED# ..
07C0..07F5 ; MAYBE YES # NKO DIGIT ZERO..NKO LOW TONE APOSTROPHE
07F6..07F9 ; NEVER # NKO SYMBOL OO DENNEN..NKO EXCLAMATION MARK
07FA ; MAYBE YES # NKO LAJANYALAN
07FB..0900 ; UNASSIGNED# ..
0901..0939 ; MAYBE YES # DEVANAGARI SIGN CANDRABINDU..DEVANAGARI LETTER
093A..093B ; UNASSIGNED# ..
093C..094D ; MAYBE YES # DEVANAGARI SIGN NUKTA..DEVANAGARI SIGN VIRAMA
094E..094F ; UNASSIGNED# ..
0950..0954 ; MAYBE YES # DEVANAGARI OM..DEVANAGARI ACUTE ACCENT
0955..0957 ; UNASSIGNED# ..
0958..095F ; NEVER # DEVANAGARI LETTER QA..DEVANAGARI LETTER YYA
0960..0963 ; MAYBE YES # DEVANAGARI LETTER VOCALIC RR..DEVANAGARI VOWEL
0964..0965 ; NEVER # DEVANAGARI DANDA..DEVANAGARI DOUBLE DANDA
0966..096F ; MAYBE YES # DEVANAGARI DIGIT ZERO..DEVANAGARI DIGIT NINE
0970 ; NEVER # DEVANAGARI ABBREVIATION SIGN
0971..097A ; UNASSIGNED# ..
097B..097F ; MAYBE YES # DEVANAGARI LETTER GGA..DEVANAGARI LETTER BBA
0980 ; UNASSIGNED#
0981..0983 ; MAYBE YES # BENGALI SIGN CANDRABINDU..BENGALI SIGN VISARGA
0984 ; UNASSIGNED#
0985..098C ; MAYBE YES # BENGALI LETTER A..BENGALI LETTER VOCALIC L
098D..098E ; UNASSIGNED# ..
098F..0990 ; MAYBE YES # BENGALI LETTER E..BENGALI LETTER AI
0991..0992 ; UNASSIGNED# ..
0993..09A8 ; MAYBE YES # BENGALI LETTER O..BENGALI LETTER NA
09A9 ; UNASSIGNED#
09AA..09B0 ; MAYBE YES # BENGALI LETTER PA..BENGALI LETTER RA
09B1 ; UNASSIGNED#
09B2 ; MAYBE YES # BENGALI LETTER LA
09B3..09B5 ; UNASSIGNED# ..
09B6..09B9 ; MAYBE YES # BENGALI LETTER SHA..BENGALI LETTER HA
09BA..09BB ; UNASSIGNED# ..
09BC..09C4 ; MAYBE YES # BENGALI SIGN NUKTA..BENGALI VOWEL SIGN VOCALIC
09C5..09C6 ; UNASSIGNED# ..
09C7..09C8 ; MAYBE YES # BENGALI VOWEL SIGN E..BENGALI VOWEL SIGN AI
09C9..09CA ; UNASSIGNED# ..
09CB..09CC ; NEVER # BENGALI VOWEL SIGN O..BENGALI VOWEL SIGN AU
09CD..09CE ; MAYBE YES # BENGALI SIGN VIRAMA..BENGALI LETTER KHANDA TA
09CF..09D6 ; UNASSIGNED# ..
09D7 ; MAYBE YES # BENGALI AU LENGTH MARK
09D8..09DB ; UNASSIGNED# ..
09DC..09DD ; NEVER # BENGALI LETTER RRA..BENGALI LETTER RHA
09DE ; UNASSIGNED#
09DF ; NEVER # BENGALI LETTER YYA
09E0..09E3 ; MAYBE YES # BENGALI LETTER VOCALIC RR..BENGALI VOWEL SIGN V
09E4..09E5 ; UNASSIGNED# ..
09E6..09F1 ; MAYBE YES # BENGALI DIGIT ZERO..BENGALI LETTER RA WITH LOWE
09F2..09FA ; NEVER # BENGALI RUPEE MARK..BENGALI ISSHAR
09FB..0A00 ; UNASSIGNED# ..
0A01..0A03 ; MAYBE YES # GURMUKHI SIGN ADAK BINDI..GURMUKHI SIGN VISARGA
0A04 ; UNASSIGNED#
0A05..0A0A ; MAYBE YES # GURMUKHI LETTER A..GURMUKHI LETTER UU
0A0B..0A0E ; UNASSIGNED# ..
0A0F..0A10 ; MAYBE YES # GURMUKHI LETTER EE..GURMUKHI LETTER AI
0A11..0A12 ; UNASSIGNED# ..
0A13..0A28 ; MAYBE YES # GURMUKHI LETTER OO..GURMUKHI LETTER NA
0A29 ; UNASSIGNED#
0A2A..0A30 ; MAYBE YES # GURMUKHI LETTER PA..GURMUKHI LETTER RA
0A31 ; UNASSIGNED#
0A32 ; MAYBE YES # GURMUKHI LETTER LA
0A33 ; NEVER # GURMUKHI LETTER LLA
0A34 ; UNASSIGNED#
0A35 ; MAYBE YES # GURMUKHI LETTER VA
0A36 ; NEVER # GURMUKHI LETTER SHA
0A37 ; UNASSIGNED#
0A38..0A39 ; MAYBE YES # GURMUKHI LETTER SA..GURMUKHI LETTER HA
0A3A..0A3B ; UNASSIGNED# ..
0A3C ; MAYBE YES # GURMUKHI SIGN NUKTA
0A3D ; UNASSIGNED#
0A3E..0A42 ; MAYBE YES # GURMUKHI VOWEL SIGN AA..GURMUKHI VOWEL SIGN UU
0A43..0A46 ; UNASSIGNED# ..
0A47..0A48 ; MAYBE YES # GURMUKHI VOWEL SIGN EE..GURMUKHI VOWEL SIGN AI
0A49..0A4A ; UNASSIGNED# ..
0A4B..0A4D ; MAYBE YES # GURMUKHI VOWEL SIGN OO..GURMUKHI SIGN VIRAMA
0A4E..0A58 ; UNASSIGNED# ..
0A59..0A5B ; NEVER # GURMUKHI LETTER KHHA..GURMUKHI LETTER ZA
0A5C ; MAYBE YES # GURMUKHI LETTER RRA
0A5D ; UNASSIGNED#
0A5E ; NEVER # GURMUKHI LETTER FA
0A5F..0A65 ; UNASSIGNED# ..
0A66..0A74 ; MAYBE YES # GURMUKHI DIGIT ZERO..GURMUKHI EK ONKAR
0A75..0A80 ; UNASSIGNED# ..
0A81..0A83 ; MAYBE YES # GUJARATI SIGN CANDRABINDU..GUJARATI SIGN VISARG
0A84 ; UNASSIGNED#
0A85..0A8D ; MAYBE YES # GUJARATI LETTER A..GUJARATI VOWEL CANDRA E
0A8E ; UNASSIGNED#
0A8F..0A91 ; MAYBE YES # GUJARATI LETTER E..GUJARATI VOWEL CANDRA O
0A92 ; UNASSIGNED#
0A93..0AA8 ; MAYBE YES # GUJARATI LETTER O..GUJARATI LETTER NA
0AA9 ; UNASSIGNED#
0AAA..0AB0 ; MAYBE YES # GUJARATI LETTER PA..GUJARATI LETTER RA
0AB1 ; UNASSIGNED#
0AB2..0AB3 ; MAYBE YES # GUJARATI LETTER LA..GUJARATI LETTER LLA
0AB4 ; UNASSIGNED#
0AB5..0AB9 ; MAYBE YES # GUJARATI LETTER VA..GUJARATI LETTER HA
0ABA..0ABB ; UNASSIGNED# ..
0ABC..0AC5 ; MAYBE YES # GUJARATI SIGN NUKTA..GUJARATI VOWEL SIGN CANDRA
0AC6 ; UNASSIGNED#
0AC7..0AC9 ; MAYBE YES # GUJARATI VOWEL SIGN E..GUJARATI VOWEL SIGN CAND
0ACA ; UNASSIGNED#
0ACB..0ACD ; MAYBE YES # GUJARATI VOWEL SIGN O..GUJARATI SIGN VIRAMA
0ACE..0ACF ; UNASSIGNED# ..
0AD0 ; MAYBE YES # GUJARATI OM
0AD1..0ADF ; UNASSIGNED# ..
0AE0..0AE3 ; MAYBE YES # GUJARATI LETTER VOCALIC RR..GUJARATI VOWEL SIGN
0AE4..0AE5 ; UNASSIGNED# ..
0AE6..0AEF ; MAYBE YES # GUJARATI DIGIT ZERO..GUJARATI DIGIT NINE
0AF0 ; UNASSIGNED#
0AF1 ; NEVER # GUJARATI RUPEE SIGN
0AF2..0B00 ; UNASSIGNED# ..
0B01..0B03 ; MAYBE YES # ORIYA SIGN CANDRABINDU..ORIYA SIGN VISARGA
0B04 ; UNASSIGNED#
0B05..0B0C ; MAYBE YES # ORIYA LETTER A..ORIYA LETTER VOCALIC L
0B0D..0B0E ; UNASSIGNED# ..
0B0F..0B10 ; MAYBE YES # ORIYA LETTER E..ORIYA LETTER AI
0B11..0B12 ; UNASSIGNED# ..
0B13..0B28 ; MAYBE YES # ORIYA LETTER O..ORIYA LETTER NA
0B29 ; UNASSIGNED#
0B2A..0B30 ; MAYBE YES # ORIYA LETTER PA..ORIYA LETTER RA
0B31 ; UNASSIGNED#
0B32..0B33 ; MAYBE YES # ORIYA LETTER LA..ORIYA LETTER LLA
0B34 ; UNASSIGNED#
0B35..0B39 ; MAYBE YES # ORIYA LETTER VA..ORIYA LETTER HA
0B3A..0B3B ; UNASSIGNED# ..
0B3C..0B43 ; MAYBE YES # ORIYA SIGN NUKTA..ORIYA VOWEL SIGN VOCALIC R
0B44..0B46 ; UNASSIGNED# ..
0B47 ; MAYBE YES # ORIYA VOWEL SIGN E
0B48 ; NEVER # ORIYA VOWEL SIGN AI
0B49..0B4A ; UNASSIGNED# ..
0B4B..0B4C ; NEVER # ORIYA VOWEL SIGN O..ORIYA VOWEL SIGN AU
0B4D ; MAYBE YES # ORIYA SIGN VIRAMA
0B4E..0B55 ; UNASSIGNED# ..
0B56..0B57 ; MAYBE YES # ORIYA AI LENGTH MARK..ORIYA AU LENGTH MARK
0B58..0B5B ; UNASSIGNED# ..
0B5C..0B5D ; NEVER # ORIYA LETTER RRA..ORIYA LETTER RHA
0B5E ; UNASSIGNED#
0B5F..0B61 ; MAYBE YES # ORIYA LETTER YYA..ORIYA LETTER VOCALIC LL
0B62..0B65 ; UNASSIGNED# ..
0B66..0B6F ; MAYBE YES # ORIYA DIGIT ZERO..ORIYA DIGIT NINE
0B70 ; NEVER # ORIYA ISSHAR
0B71 ; MAYBE YES # ORIYA LETTER WA
0B72..0B81 ; UNASSIGNED# ..
0B82..0B83 ; MAYBE YES # TAMIL SIGN ANUSVARA..TAMIL SIGN VISARGA
0B84 ; UNASSIGNED#
0B85..0B8A ; MAYBE YES # TAMIL LETTER A..TAMIL LETTER UU
0B8B..0B8D ; UNASSIGNED# ..
0B8E..0B90 ; MAYBE YES # TAMIL LETTER E..TAMIL LETTER AI
0B91 ; UNASSIGNED#
0B92..0B93 ; MAYBE YES # TAMIL LETTER O..TAMIL LETTER OO
0B94 ; NEVER # TAMIL LETTER AU
0B95 ; MAYBE YES # TAMIL LETTER KA
0B96..0B98 ; UNASSIGNED# ..
0B99..0B9A ; MAYBE YES # TAMIL LETTER NGA..TAMIL LETTER CA
0B9B ; UNASSIGNED#
0B9C ; MAYBE YES # TAMIL LETTER JA
0B9D ; UNASSIGNED#
0B9E..0B9F ; MAYBE YES # TAMIL LETTER NYA..TAMIL LETTER TTA
0BA0..0BA2 ; UNASSIGNED# ..
0BA3..0BA4 ; MAYBE YES # TAMIL LETTER NNA..TAMIL LETTER TA
0BA5..0BA7 ; UNASSIGNED# ..
0BA8..0BAA ; MAYBE YES # TAMIL LETTER NA..TAMIL LETTER PA
0BAB..0BAD ; UNASSIGNED# ..
0BAE..0BB9 ; MAYBE YES # TAMIL LETTER MA..TAMIL LETTER HA
0BBA..0BBD ; UNASSIGNED# ..
0BBE..0BC2 ; MAYBE YES # TAMIL VOWEL SIGN AA..TAMIL VOWEL SIGN UU
0BC3..0BC5 ; UNASSIGNED# ..
0BC6..0BC8 ; MAYBE YES # TAMIL VOWEL SIGN E..TAMIL VOWEL SIGN AI
0BC9 ; UNASSIGNED#
0BCA..0BCC ; NEVER # TAMIL VOWEL SIGN O..TAMIL VOWEL SIGN AU
0BCD ; MAYBE YES # TAMIL SIGN VIRAMA
0BCE..0BD6 ; UNASSIGNED# ..
0BD7 ; MAYBE YES # TAMIL AU LENGTH MARK
0BD8..0BE5 ; UNASSIGNED# ..
0BE6..0BEF ; MAYBE YES # TAMIL DIGIT ZERO..TAMIL DIGIT NINE
0BF0..0BFA ; NEVER # TAMIL NUMBER TEN..TAMIL NUMBER SIGN
0BFB..0C00 ; UNASSIGNED# ..
0C01..0C03 ; MAYBE YES # TELUGU SIGN CANDRABINDU..TELUGU SIGN VISARGA
0C04 ; UNASSIGNED#
0C05..0C0C ; MAYBE YES # TELUGU LETTER A..TELUGU LETTER VOCALIC L
0C0D ; UNASSIGNED#
0C0E..0C10 ; MAYBE YES # TELUGU LETTER E..TELUGU LETTER AI
0C11 ; UNASSIGNED#
0C12..0C28 ; MAYBE YES # TELUGU LETTER O..TELUGU LETTER NA
0C29 ; UNASSIGNED#
0C2A..0C33 ; MAYBE YES # TELUGU LETTER PA..TELUGU LETTER LLA
0C34 ; UNASSIGNED#
0C35..0C39 ; MAYBE YES # TELUGU LETTER VA..TELUGU LETTER HA
0C3A..0C3D ; UNASSIGNED# ..
0C3E..0C44 ; MAYBE YES # TELUGU VOWEL SIGN AA..TELUGU VOWEL SIGN VOCALIC
0C45 ; UNASSIGNED#
0C46..0C48 ; MAYBE YES # TELUGU VOWEL SIGN E..TELUGU VOWEL SIGN AI
0C49 ; UNASSIGNED#
0C4A..0C4D ; MAYBE YES # TELUGU VOWEL SIGN O..TELUGU SIGN VIRAMA
0C4E..0C54 ; UNASSIGNED# ..
0C55..0C56 ; MAYBE YES # TELUGU LENGTH MARK..TELUGU AI LENGTH MARK
0C57..0C5F ; UNASSIGNED# ..
0C60..0C61 ; MAYBE YES # TELUGU LETTER VOCALIC RR..TELUGU LETTER VOCALIC
0C62..0C65 ; UNASSIGNED# ..
0C66..0C6F ; MAYBE YES # TELUGU DIGIT ZERO..TELUGU DIGIT NINE
0C70..0C81 ; UNASSIGNED# ..
0C82..0C83 ; MAYBE YES # KANNADA SIGN ANUSVARA..KANNADA SIGN VISARGA
0C84 ; UNASSIGNED#
0C85..0C8C ; MAYBE YES # KANNADA LETTER A..KANNADA LETTER VOCALIC L
0C8D ; UNASSIGNED#
0C8E..0C90 ; MAYBE YES # KANNADA LETTER E..KANNADA LETTER AI
0C91 ; UNASSIGNED#
0C92..0CA8 ; MAYBE YES # KANNADA LETTER O..KANNADA LETTER NA
0CA9 ; UNASSIGNED#
0CAA..0CB3 ; MAYBE YES # KANNADA LETTER PA..KANNADA LETTER LLA
0CB4 ; UNASSIGNED#
0CB5..0CB9 ; MAYBE YES # KANNADA LETTER VA..KANNADA LETTER HA
0CBA..0CBB ; UNASSIGNED# ..
0CBC..0CBF ; MAYBE YES # KANNADA SIGN NUKTA..KANNADA VOWEL SIGN I
0CC0 ; NEVER # KANNADA VOWEL SIGN II
0CC1..0CC4 ; MAYBE YES # KANNADA VOWEL SIGN U..KANNADA VOWEL SIGN VOCALI
0CC5 ; UNASSIGNED#
0CC6 ; MAYBE YES # KANNADA VOWEL SIGN E
0CC7..0CC8 ; NEVER # KANNADA VOWEL SIGN EE..KANNADA VOWEL SIGN AI
0CC9 ; UNASSIGNED#
0CCA..0CCB ; NEVER # KANNADA VOWEL SIGN O..KANNADA VOWEL SIGN OO
0CCC..0CCD ; MAYBE YES # KANNADA VOWEL SIGN AU..KANNADA SIGN VIRAMA
0CCE..0CD4 ; UNASSIGNED# ..
0CD5..0CD6 ; MAYBE YES # KANNADA LENGTH MARK..KANNADA AI LENGTH MARK
0CD7..0CDD ; UNASSIGNED# ..
0CDE ; MAYBE YES # KANNADA LETTER FA
0CDF ; UNASSIGNED#
0CE0..0CE3 ; MAYBE YES # KANNADA LETTER VOCALIC RR..KANNADA VOWEL SIGN V
0CE4..0CE5 ; UNASSIGNED# ..
0CE6..0CEF ; MAYBE YES # KANNADA DIGIT ZERO..KANNADA DIGIT NINE
0CF0 ; UNASSIGNED#
0CF1..0CF2 ; NEVER # KANNADA SIGN JIHVAMULIYA..KANNADA SIGN UPADHMAN
0CF3..0D01 ; UNASSIGNED# ..
0D02..0D03 ; MAYBE YES # MALAYALAM SIGN ANUSVARA..MALAYALAM SIGN VISARGA
0D04 ; UNASSIGNED#
0D05..0D0C ; MAYBE YES # MALAYALAM LETTER A..MALAYALAM LETTER VOCALIC L
0D0D ; UNASSIGNED#
0D0E..0D10 ; MAYBE YES # MALAYALAM LETTER E..MALAYALAM LETTER AI
0D11 ; UNASSIGNED#
0D12..0D28 ; MAYBE YES # MALAYALAM LETTER O..MALAYALAM LETTER NA
0D29 ; UNASSIGNED#
0D2A..0D39 ; MAYBE YES # MALAYALAM LETTER PA..MALAYALAM LETTER HA
0D3A..0D3D ; UNASSIGNED# ..
0D3E..0D43 ; MAYBE YES # MALAYALAM VOWEL SIGN AA..MALAYALAM VOWEL SIGN V
0D44..0D45 ; UNASSIGNED# ..
0D46..0D48 ; MAYBE YES # MALAYALAM VOWEL SIGN E..MALAYALAM VOWEL SIGN AI
0D49 ; UNASSIGNED#
0D4A..0D4C ; NEVER # MALAYALAM VOWEL SIGN O..MALAYALAM VOWEL SIGN AU
0D4D ; MAYBE YES # MALAYALAM SIGN VIRAMA
0D4E..0D56 ; UNASSIGNED# ..
0D57 ; MAYBE YES # MALAYALAM AU LENGTH MARK
0D58..0D5F ; UNASSIGNED# ..
0D60..0D61 ; MAYBE YES # MALAYALAM LETTER VOCALIC RR..MALAYALAM LETTER V
0D62..0D65 ; UNASSIGNED# ..
0D66..0D6F ; MAYBE YES # MALAYALAM DIGIT ZERO..MALAYALAM DIGIT NINE
0D70..0D81 ; UNASSIGNED# ..
0D82..0D83 ; MAYBE YES # SINHALA SIGN ANUSVARAYA..SINHALA SIGN VISARGAYA
0D84 ; UNASSIGNED#
0D85..0D96 ; MAYBE YES # SINHALA LETTER AYANNA..SINHALA LETTER AUYANNA
0D97..0D99 ; UNASSIGNED# ..
0D9A..0DB1 ; MAYBE YES # SINHALA LETTER ALPAPRAANA KAYANNA..SINHALA LETT
0DB2 ; UNASSIGNED#
0DB3..0DBB ; MAYBE YES # SINHALA LETTER SANYAKA DAYANNA..SINHALA LETTER
0DBC ; UNASSIGNED#
0DBD ; MAYBE YES # SINHALA LETTER DANTAJA LAYANNA
0DBE..0DBF ; UNASSIGNED# ..
0DC0..0DC6 ; MAYBE YES # SINHALA LETTER VAYANNA..SINHALA LETTER FAYANNA
0DC7..0DC9 ; UNASSIGNED# ..
0DCA ; MAYBE YES # SINHALA SIGN AL-LAKUNA
0DCB..0DCE ; UNASSIGNED# ..
0DCF..0DD4 ; MAYBE YES # SINHALA VOWEL SIGN AELA-PILLA..SINHALA VOWEL SI
0DD5 ; UNASSIGNED#
0DD6 ; MAYBE YES # SINHALA VOWEL SIGN DIGA PAA-PILLA
0DD7 ; UNASSIGNED#
0DD8..0DDB ; MAYBE YES # SINHALA VOWEL SIGN GAETTA-PILLA..SINHALA VOWEL
0DDC..0DDE ; NEVER # SINHALA VOWEL SIGN KOMBUVA HAA AELA-PILLA..SINH
0DDF ; MAYBE YES # SINHALA VOWEL SIGN GAYANUKITTA
0DE0..0DF1 ; UNASSIGNED# ..
0DF2..0DF3 ; MAYBE YES # SINHALA VOWEL SIGN DIGA GAETTA-PILLA..SINHALA V
0DF4 ; NEVER # SINHALA PUNCTUATION KUNDDALIYA
0DF5..0E00 ; UNASSIGNED# ..
0E01..0E32 ; MAYBE YES # THAI CHARACTER KO KAI..THAI CHARACTER SARA AA
0E33 ; NEVER # THAI CHARACTER SARA AM
0E34..0E3A ; MAYBE YES # THAI CHARACTER SARA I..THAI CHARACTER PHINTHU
0E3B..0E3E ; UNASSIGNED# ..
0E3F ; NEVER # THAI CURRENCY SYMBOL BAHT
0E40..0E4E ; MAYBE YES # THAI CHARACTER SARA E..THAI CHARACTER YAMAKKAN
0E4F ; NEVER # THAI CHARACTER FONGMAN
0E50..0E59 ; MAYBE YES # THAI DIGIT ZERO..THAI DIGIT NINE
0E5A..0E5B ; NEVER # THAI CHARACTER ANGKHANKHU..THAI CHARACTER KHOMU
0E5C..0E80 ; UNASSIGNED# ..
0E81..0E82 ; MAYBE YES # LAO LETTER KO..LAO LETTER KHO SUNG
0E83 ; UNASSIGNED#
0E84 ; MAYBE YES # LAO LETTER KHO TAM
0E85..0E86 ; UNASSIGNED# ..
0E87..0E88 ; MAYBE YES # LAO LETTER NGO..LAO LETTER CO
0E89 ; UNASSIGNED#
0E8A ; MAYBE YES # LAO LETTER SO TAM
0E8B..0E8C ; UNASSIGNED# ..
0E8D ; MAYBE YES # LAO LETTER NYO
0E8E..0E93 ; UNASSIGNED# ..
0E94..0E97 ; MAYBE YES # LAO LETTER DO..LAO LETTER THO TAM
0E98 ; UNASSIGNED#
0E99..0E9F ; MAYBE YES # LAO LETTER NO..LAO LETTER FO SUNG
0EA0 ; UNASSIGNED#
0EA1..0EA3 ; MAYBE YES # LAO LETTER MO..LAO LETTER LO LING
0EA4 ; UNASSIGNED#
0EA5 ; MAYBE YES # LAO LETTER LO LOOT
0EA6 ; UNASSIGNED#
0EA7 ; MAYBE YES # LAO LETTER WO
0EA8..0EA9 ; UNASSIGNED# ..
0EAA..0EAB ; MAYBE YES # LAO LETTER SO SUNG..LAO LETTER HO SUNG
0EAC ; UNASSIGNED#
0EAD..0EB2 ; MAYBE YES # LAO LETTER O..LAO VOWEL SIGN AA
0EB3 ; NEVER # LAO VOWEL SIGN AM
0EB4..0EB9 ; MAYBE YES # LAO VOWEL SIGN I..LAO VOWEL SIGN UU
0EBA ; UNASSIGNED#
0EBB..0EBD ; MAYBE YES # LAO VOWEL SIGN MAI KON..LAO SEMIVOWEL SIGN NYO
0EBE..0EBF ; UNASSIGNED# ..
0EC0..0EC4 ; MAYBE YES # LAO VOWEL SIGN E..LAO VOWEL SIGN AI
0EC5 ; UNASSIGNED#
0EC6 ; MAYBE YES # LAO KO LA
0EC7 ; UNASSIGNED#
0EC8..0ECD ; MAYBE YES # LAO TONE MAI EK..LAO NIGGAHITA
0ECE..0ECF ; UNASSIGNED# ..
0ED0..0ED9 ; MAYBE YES # LAO DIGIT ZERO..LAO DIGIT NINE
0EDA..0EDB ; UNASSIGNED# ..
0EDC..0EDD ; NEVER # LAO HO NO..LAO HO MO
0EDE..0EFF ; UNASSIGNED# ..
0F00 ; MAYBE YES # TIBETAN SYLLABLE OM
0F01..0F17 ; NEVER # TIBETAN MARK GTER YIG MGO TRUNCATED A..TIBETAN
0F18..0F19 ; MAYBE YES # TIBETAN ASTROLOGICAL SIGN -KHYUD PA..TIBETAN AS
0F1A..0F1F ; NEVER # TIBETAN SIGN RDEL DKAR GCIG..TIBETAN SIGN RDEL
0F20..0F29 ; MAYBE YES # TIBETAN DIGIT ZERO..TIBETAN DIGIT NINE
0F2A..0F34 ; NEVER # TIBETAN DIGIT HALF ONE..TIBETAN MARK BSDUS RTAG
0F35 ; MAYBE YES # TIBETAN MARK NGAS BZUNG NYI ZLA
0F36 ; NEVER # TIBETAN MARK CARET -DZUD RTAGS BZHI MIG CAN
0F37 ; MAYBE YES # TIBETAN MARK NGAS BZUNG SGOR RTAGS
0F38 ; NEVER # TIBETAN MARK CHE MGO
0F39 ; MAYBE YES # TIBETAN MARK TSA -PHRU
0F3A..0F3D ; NEVER # TIBETAN MARK GUG RTAGS GYON..TIBETAN MARK ANG K
0F3E..0F42 ; MAYBE YES # TIBETAN SIGN YAR TSHES..TIBETAN LETTER GA
0F43 ; NEVER # TIBETAN LETTER GHA
0F44..0F47 ; MAYBE YES # TIBETAN LETTER NGA..TIBETAN LETTER JA
0F48 ; UNASSIGNED#
0F49..0F4C ; MAYBE YES # TIBETAN LETTER NYA..TIBETAN LETTER DDA
0F4D ; NEVER # TIBETAN LETTER DDHA
0F4E..0F51 ; MAYBE YES # TIBETAN LETTER NNA..TIBETAN LETTER DA
0F52 ; NEVER # TIBETAN LETTER DHA
0F53..0F56 ; MAYBE YES # TIBETAN LETTER NA..TIBETAN LETTER BA
0F57 ; NEVER # TIBETAN LETTER BHA
0F58..0F5B ; MAYBE YES # TIBETAN LETTER MA..TIBETAN LETTER DZA
0F5C ; NEVER # TIBETAN LETTER DZHA
0F5D..0F68 ; MAYBE YES # TIBETAN LETTER WA..TIBETAN LETTER A
0F69 ; NEVER # TIBETAN LETTER KSSA
0F6A ; MAYBE YES # TIBETAN LETTER FIXED-FORM RA
0F6B..0F70 ; UNASSIGNED# ..
0F71..0F72 ; MAYBE YES # TIBETAN VOWEL SIGN AA..TIBETAN VOWEL SIGN I
0F73 ; NEVER # TIBETAN VOWEL SIGN II
0F74 ; MAYBE YES # TIBETAN VOWEL SIGN U
0F75..0F79 ; NEVER # TIBETAN VOWEL SIGN UU..TIBETAN VOWEL SIGN VOCAL
0F7A..0F80 ; MAYBE YES # TIBETAN VOWEL SIGN E..TIBETAN VOWEL SIGN REVERS
0F81 ; NEVER # TIBETAN VOWEL SIGN REVERSED II
0F82..0F84 ; MAYBE YES # TIBETAN SIGN NYI ZLA NAA DA..TIBETAN MARK HALAN
0F85 ; NEVER # TIBETAN MARK PALUTA
0F86..0F8B ; MAYBE YES # TIBETAN SIGN LCI RTAGS..TIBETAN SIGN GRU MED RG
0F8C..0F8F ; UNASSIGNED# ..
0F90..0F92 ; MAYBE YES # TIBETAN SUBJOINED LETTER KA..TIBETAN SUBJOINED
0F93 ; NEVER # TIBETAN SUBJOINED LETTER GHA
0F94..0F97 ; MAYBE YES # TIBETAN SUBJOINED LETTER NGA..TIBETAN SUBJOINED
0F98 ; UNASSIGNED#
0F99..0F9C ; MAYBE YES # TIBETAN SUBJOINED LETTER NYA..TIBETAN SUBJOINED
0F9D ; NEVER # TIBETAN SUBJOINED LETTER DDHA
0F9E..0FA1 ; MAYBE YES # TIBETAN SUBJOINED LETTER NNA..TIBETAN SUBJOINED
0FA2 ; NEVER # TIBETAN SUBJOINED LETTER DHA
0FA3..0FA6 ; MAYBE YES # TIBETAN SUBJOINED LETTER NA..TIBETAN SUBJOINED
0FA7 ; NEVER # TIBETAN SUBJOINED LETTER BHA
0FA8..0FAB ; MAYBE YES # TIBETAN SUBJOINED LETTER MA..TIBETAN SUBJOINED
0FAC ; NEVER # TIBETAN SUBJOINED LETTER DZHA
0FAD..0FB8 ; MAYBE YES # TIBETAN SUBJOINED LETTER WA..TIBETAN SUBJOINED
0FB9 ; NEVER # TIBETAN SUBJOINED LETTER KSSA
0FBA..0FBC ; MAYBE YES # TIBETAN SUBJOINED LETTER FIXED-FORM WA..TIBETAN
0FBD ; UNASSIGNED#
0FBE..0FC5 ; NEVER # TIBETAN KU RU KHA..TIBETAN SYMBOL RDO RJE
0FC6 ; MAYBE YES # TIBETAN SYMBOL PADMA GDAN
0FC7..0FCC ; NEVER # TIBETAN SYMBOL RDO RJE RGYA GRAM..TIBETAN SYMBO
0FCD..0FCE ; UNASSIGNED# ..
0FCF..0FD1 ; NEVER # TIBETAN SIGN RDEL NAG GSUM..TIBETAN MARK MNYAM
0FD2..0FFF ; UNASSIGNED# ..
1000..1021 ; MAYBE YES # MYANMAR LETTER KA..MYANMAR LETTER A
1022 ; UNASSIGNED#
1023..1025 ; MAYBE YES # MYANMAR LETTER I..MYANMAR LETTER U
1026 ; NEVER # MYANMAR LETTER UU
1027 ; MAYBE YES # MYANMAR LETTER E
1028 ; UNASSIGNED#
1029..102A ; MAYBE YES # MYANMAR LETTER O..MYANMAR LETTER AU
102B ; UNASSIGNED#
102C..1032 ; MAYBE YES # MYANMAR VOWEL SIGN AA..MYANMAR VOWEL SIGN AI
1033..1035 ; UNASSIGNED# ..
1036..1039 ; MAYBE YES # MYANMAR SIGN ANUSVARA..MYANMAR SIGN VIRAMA
103A..103F ; UNASSIGNED# ..
1040..1049 ; MAYBE YES # MYANMAR DIGIT ZERO..MYANMAR DIGIT NINE
104A..104F ; NEVER # MYANMAR SIGN LITTLE SECTION..MYANMAR SYMBOL GEN
1050..1059 ; MAYBE YES # MYANMAR LETTER SHA..MYANMAR VOWEL SIGN VOCALIC
105A..109F ; UNASSIGNED# ..
10A0..10C5 ; NEVER # GEORGIAN CAPITAL LETTER AN..GEORGIAN CAPITAL LE
10C6..10CF ; UNASSIGNED# ..
10D0..10FA ; MAYBE YES # GEORGIAN LETTER AN..GEORGIAN LETTER AIN
10FB..10FC ; NEVER # GEORGIAN PARAGRAPH SEPARATOR..MODIFIER LETTER G
10FD..10FF ; UNASSIGNED# ..
1100..1159 ; MAYBE YES # HANGUL CHOSEONG KIYEOK..HANGUL CHOSEONG YEORINH
115A..115E ; UNASSIGNED# ..
115F..1160 ; NEVER # HANGUL CHOSEONG FILLER..HANGUL JUNGSEONG FILLER
1161..11A2 ; MAYBE YES # HANGUL JUNGSEONG A..HANGUL JUNGSEONG SSANGARAEA
11A3..11A7 ; UNASSIGNED# ..
11A8..11F9 ; MAYBE YES # HANGUL JONGSEONG KIYEOK..HANGUL JONGSEONG YEORI
11FA..11FF ; UNASSIGNED# ..
1200..1248 ; MAYBE YES # ETHIOPIC SYLLABLE HA..ETHIOPIC SYLLABLE QWA
1249 ; UNASSIGNED#
124A..124D ; MAYBE YES # ETHIOPIC SYLLABLE QWI..ETHIOPIC SYLLABLE QWE
124E..124F ; UNASSIGNED# ..
1250..1256 ; MAYBE YES # ETHIOPIC SYLLABLE QHA..ETHIOPIC SYLLABLE QHO
1257 ; UNASSIGNED#
1258 ; MAYBE YES # ETHIOPIC SYLLABLE QHWA
1259 ; UNASSIGNED#
125A..125D ; MAYBE YES # ETHIOPIC SYLLABLE QHWI..ETHIOPIC SYLLABLE QHWE
125E..125F ; UNASSIGNED# ..
1260..1288 ; MAYBE YES # ETHIOPIC SYLLABLE BA..ETHIOPIC SYLLABLE XWA
1289 ; UNASSIGNED#
128A..128D ; MAYBE YES # ETHIOPIC SYLLABLE XWI..ETHIOPIC SYLLABLE XWE
128E..128F ; UNASSIGNED# ..
1290..12B0 ; MAYBE YES # ETHIOPIC SYLLABLE NA..ETHIOPIC SYLLABLE KWA
12B1 ; UNASSIGNED#
12B2..12B5 ; MAYBE YES # ETHIOPIC SYLLABLE KWI..ETHIOPIC SYLLABLE KWE
12B6..12B7 ; UNASSIGNED# ..
12B8..12BE ; MAYBE YES # ETHIOPIC SYLLABLE KXA..ETHIOPIC SYLLABLE KXO
12BF ; UNASSIGNED#
12C0 ; MAYBE YES # ETHIOPIC SYLLABLE KXWA
12C1 ; UNASSIGNED#
12C2..12C5 ; MAYBE YES # ETHIOPIC SYLLABLE KXWI..ETHIOPIC SYLLABLE KXWE
12C6..12C7 ; UNASSIGNED# ..
12C8..12D6 ; MAYBE YES # ETHIOPIC SYLLABLE WA..ETHIOPIC SYLLABLE PHARYNG
12D7 ; UNASSIGNED#
12D8..1310 ; MAYBE YES # ETHIOPIC SYLLABLE ZA..ETHIOPIC SYLLABLE GWA
1311 ; UNASSIGNED#
1312..1315 ; MAYBE YES # ETHIOPIC SYLLABLE GWI..ETHIOPIC SYLLABLE GWE
1316..1317 ; UNASSIGNED# ..
1318..135A ; MAYBE YES # ETHIOPIC SYLLABLE GGA..ETHIOPIC SYLLABLE FYA
135B..135E ; UNASSIGNED# ..
135F ; MAYBE YES # ETHIOPIC COMBINING GEMINATION MARK
1360..137C ; NEVER # ETHIOPIC SECTION MARK..ETHIOPIC NUMBER TEN THOU
137D..137F ; UNASSIGNED# ..
1380..138F ; MAYBE YES # ETHIOPIC SYLLABLE SEBATBEIT MWA..ETHIOPIC SYLLA
1390..1399 ; NEVER # ETHIOPIC TONAL MARK YIZET..ETHIOPIC TONAL MARK
139A..139F ; UNASSIGNED# ..
13A0..13F4 ; MAYBE YES # CHEROKEE LETTER A..CHEROKEE LETTER YV
13F5..1400 ; UNASSIGNED# ..
1401..166C ; MAYBE YES # CANADIAN SYLLABICS E..CANADIAN SYLLABICS CARRIE
166D..166E ; NEVER # CANADIAN SYLLABICS CHI SIGN..CANADIAN SYLLABICS
166F..1676 ; MAYBE YES # CANADIAN SYLLABICS QAI..CANADIAN SYLLABICS NNGA
1677..167F ; UNASSIGNED# ..
1680..169C ; NEVER # OGHAM SPACE MARK..OGHAM REVERSED FEATHER MARK
169D..169F ; UNASSIGNED# ..
16A0..16EA ; MAYBE YES # RUNIC LETTER FEHU FEOH FE F..RUNIC LETTER X
16EB..16F0 ; NEVER # RUNIC SINGLE PUNCTUATION..RUNIC BELGTHOR SYMBOL
16F1..16FF ; UNASSIGNED# ..
1700..170C ; MAYBE YES # TAGALOG LETTER A..TAGALOG LETTER YA
170D ; UNASSIGNED#
170E..1714 ; MAYBE YES # TAGALOG LETTER LA..TAGALOG SIGN VIRAMA
1715..171F ; UNASSIGNED# ..
1720..1734 ; MAYBE YES # HANUNOO LETTER A..HANUNOO SIGN PAMUDPOD
1735..1736 ; NEVER # PHILIPPINE SINGLE PUNCTUATION..PHILIPPINE DOUBL
1737..173F ; UNASSIGNED# ..
1740..1753 ; MAYBE YES # BUHID LETTER A..BUHID VOWEL SIGN U
1754..175F ; UNASSIGNED# ..
1760..176C ; MAYBE YES # TAGBANWA LETTER A..TAGBANWA LETTER YA
176D ; UNASSIGNED#
176E..1770 ; MAYBE YES # TAGBANWA LETTER LA..TAGBANWA LETTER SA
1771 ; UNASSIGNED#
1772..1773 ; MAYBE YES # TAGBANWA VOWEL SIGN I..TAGBANWA VOWEL SIGN U
1774..177F ; UNASSIGNED# ..
1780..17B3 ; MAYBE YES # KHMER LETTER KA..KHMER INDEPENDENT VOWEL QAU
17B4..17B5 ; CONTEXT # KHMER VOWEL INHERENT AQ..KHMER VOWEL INHERENT A
17B6..17D3 ; MAYBE YES # KHMER VOWEL SIGN AA..KHMER SIGN BATHAMASAT
17D4..17D6 ; NEVER # KHMER SIGN KHAN..KHMER SIGN CAMNUC PII KUUH
17D7 ; MAYBE YES # KHMER SIGN LEK TOO
17D8..17DB ; NEVER # KHMER SIGN BEYYAL..KHMER CURRENCY SYMBOL RIEL
17DC..17DD ; MAYBE YES # KHMER SIGN AVAKRAHASANYA..KHMER SIGN ATTHACAN
17DE..17DF ; UNASSIGNED# ..
17E0..17E9 ; MAYBE YES # KHMER DIGIT ZERO..KHMER DIGIT NINE
17EA..17EF ; UNASSIGNED# ..
17F0..17F9 ; NEVER # KHMER SYMBOL LEK ATTAK SON..KHMER SYMBOL LEK AT
17FA..17FF ; UNASSIGNED# ..
1800..180A ; NEVER # MONGOLIAN BIRGA..MONGOLIAN NIRUGU
180B..180D ; MAYBE YES # MONGOLIAN FREE VARIATION SELECTOR ONE..MONGOLIA
180E ; NEVER # MONGOLIAN VOWEL SEPARATOR
180F ; UNASSIGNED#
1810..1819 ; MAYBE YES # MONGOLIAN DIGIT ZERO..MONGOLIAN DIGIT NINE
181A..181F ; UNASSIGNED# ..
1820..1877 ; MAYBE YES # MONGOLIAN LETTER A..MONGOLIAN LETTER MANCHU ZHA
1878..187F ; UNASSIGNED# ..
1880..18A9 ; MAYBE YES # MONGOLIAN LETTER ALI GALI ANUSVARA ONE..MONGOLI
18AA..18FF ; UNASSIGNED# ..
1900..191C ; MAYBE YES # LIMBU VOWEL-CARRIER LETTER..LIMBU LETTER HA
191D..191F ; UNASSIGNED# ..
1920..192B ; MAYBE YES # LIMBU VOWEL SIGN A..LIMBU SUBJOINED LETTER WA
192C..192F ; UNASSIGNED# ..
1930..193B ; MAYBE YES # LIMBU SMALL LETTER KA..LIMBU SIGN SA-I
193C..193F ; UNASSIGNED# ..
1940 ; NEVER # LIMBU SIGN LOO
1941..1943 ; UNASSIGNED# ..
1944..1945 ; NEVER # LIMBU EXCLAMATION MARK..LIMBU QUESTION MARK
1946..196D ; MAYBE YES # LIMBU DIGIT ZERO..TAI LE LETTER AI
196E..196F ; UNASSIGNED# ..
1970..1974 ; MAYBE YES # TAI LE LETTER TONE-2..TAI LE LETTER TONE-6
1975..197F ; UNASSIGNED# ..
1980..19A9 ; MAYBE YES # NEW TAI LUE LETTER HIGH QA..NEW TAI LUE LETTER
19AA..19AF ; UNASSIGNED# ..
19B0..19C9 ; MAYBE YES # NEW TAI LUE VOWEL SIGN VOWEL SHORTENER..NEW TAI
19CA..19CF ; UNASSIGNED# ..
19D0..19D9 ; MAYBE YES # NEW TAI LUE DIGIT ZERO..NEW TAI LUE DIGIT NINE
19DA..19DD ; UNASSIGNED# ..
19DE..19FF ; NEVER # NEW TAI LUE SIGN LAE..KHMER SYMBOL DAP-PRAM ROC
1A00..1A1B ; MAYBE YES # BUGINESE LETTER KA..BUGINESE VOWEL SIGN AE
1A1C..1A1D ; UNASSIGNED# ..
1A1E..1A1F ; NEVER # BUGINESE PALLAWA..BUGINESE END OF SECTION
1A20..1AFF ; UNASSIGNED# ..
1B00..1B05 ; MAYBE YES # BALINESE SIGN ULU RICEM..BALINESE LETTER AKARA
1B06 ; NEVER # BALINESE LETTER AKARA TEDUNG
1B07 ; MAYBE YES # BALINESE LETTER IKARA
1B08 ; NEVER # BALINESE LETTER IKARA TEDUNG
1B09 ; MAYBE YES # BALINESE LETTER UKARA
1B0A ; NEVER # BALINESE LETTER UKARA TEDUNG
1B0B ; MAYBE YES # BALINESE LETTER RA REPA
1B0C ; NEVER # BALINESE LETTER RA REPA TEDUNG
1B0D ; MAYBE YES # BALINESE LETTER LA LENGA
1B0E ; NEVER # BALINESE LETTER LA LENGA TEDUNG
1B0F..1B11 ; MAYBE YES # BALINESE LETTER EKARA..BALINESE LETTER OKARA
1B12 ; NEVER # BALINESE LETTER OKARA TEDUNG
1B13..1B3A ; MAYBE YES # BALINESE LETTER KA..BALINESE VOWEL SIGN RA REPA
1B3B ; NEVER # BALINESE VOWEL SIGN RA REPA TEDUNG
1B3C ; MAYBE YES # BALINESE VOWEL SIGN LA LENGA
1B3D ; NEVER # BALINESE VOWEL SIGN LA LENGA TEDUNG
1B3E..1B3F ; MAYBE YES # BALINESE VOWEL SIGN TALING..BALINESE VOWEL SIGN
1B40..1B41 ; NEVER # BALINESE VOWEL SIGN TALING TEDUNG..BALINESE VOW
1B42 ; MAYBE YES # BALINESE VOWEL SIGN PEPET
1B43 ; NEVER # BALINESE VOWEL SIGN PEPET TEDUNG
1B44..1B4B ; MAYBE YES # BALINESE ADEG ADEG..BALINESE LETTER ASYURA SASA
1B4C..1B4F ; UNASSIGNED# ..
1B50..1B59 ; MAYBE YES # BALINESE DIGIT ZERO..BALINESE DIGIT NINE
1B5A..1B6A ; NEVER # BALINESE PANTI..BALINESE MUSICAL SYMBOL DANG GE
1B6B..1B73 ; MAYBE YES # BALINESE MUSICAL SYMBOL COMBINING TEGEH..BALINE
1B74..1B7C ; NEVER # BALINESE MUSICAL SYMBOL RIGHT-HAND OPEN DUG..BA
1B7D..1CFF ; UNASSIGNED# ..
1D00..1D2B ; ALWAYS # LATIN LETTER SMALL CAPITAL A..CYRILLIC LETTER S
1D2C..1D2E ; NEVER # MODIFIER LETTER CAPITAL A..MODIFIER LETTER CAPI
1D2F ; ALWAYS # MODIFIER LETTER CAPITAL BARRED B
1D30..1D3A ; NEVER # MODIFIER LETTER CAPITAL D..MODIFIER LETTER CAPI
1D3B ; ALWAYS # MODIFIER LETTER CAPITAL REVERSED N
1D3C..1D4D ; NEVER # MODIFIER LETTER CAPITAL O..MODIFIER LETTER SMAL
1D4E ; ALWAYS # MODIFIER LETTER SMALL TURNED I
1D4F..1D6A ; NEVER # MODIFIER LETTER SMALL K..GREEK SUBSCRIPT SMALL
1D6B..1D77 ; ALWAYS # LATIN SMALL LETTER UE..LATIN SMALL LETTER TURNE
1D78 ; NEVER # MODIFIER LETTER CYRILLIC EN
1D79..1D9A ; ALWAYS # LATIN SMALL LETTER INSULAR G..LATIN SMALL LETTE
1D9B..1DBF ; NEVER # MODIFIER LETTER SMALL TURNED ALPHA..MODIFIER LE
1DC0..1DCA ; MAYBE YES # COMBINING DOTTED GRAVE ACCENT..COMBINING LATIN
1DCB..1DFD ; UNASSIGNED# ..
1DFE..1DFF ; MAYBE YES # COMBINING LEFT ARROWHEAD ABOVE..COMBINING RIGHT
1E00 ; NEVER # LATIN CAPITAL LETTER A WITH RING BELOW
1E01 ; ALWAYS # LATIN SMALL LETTER A WITH RING BELOW
1E02 ; NEVER # LATIN CAPITAL LETTER B WITH DOT ABOVE
1E03 ; ALWAYS # LATIN SMALL LETTER B WITH DOT ABOVE
1E04 ; NEVER # LATIN CAPITAL LETTER B WITH DOT BELOW
1E05 ; ALWAYS # LATIN SMALL LETTER B WITH DOT BELOW
1E06 ; NEVER # LATIN CAPITAL LETTER B WITH LINE BELOW
1E07 ; ALWAYS # LATIN SMALL LETTER B WITH LINE BELOW
1E08..1E0A ; NEVER # LATIN CAPITAL LETTER C WITH CEDILLA AND ACUTE..
1E0B ; ALWAYS # LATIN SMALL LETTER D WITH DOT ABOVE
1E0C ; NEVER # LATIN CAPITAL LETTER D WITH DOT BELOW
1E0D ; ALWAYS # LATIN SMALL LETTER D WITH DOT BELOW
1E0E ; NEVER # LATIN CAPITAL LETTER D WITH LINE BELOW
1E0F ; ALWAYS # LATIN SMALL LETTER D WITH LINE BELOW
1E10 ; NEVER # LATIN CAPITAL LETTER D WITH CEDILLA
1E11 ; ALWAYS # LATIN SMALL LETTER D WITH CEDILLA
1E12 ; NEVER # LATIN CAPITAL LETTER D WITH CIRCUMFLEX BELOW
1E13 ; ALWAYS # LATIN SMALL LETTER D WITH CIRCUMFLEX BELOW
1E14..1E18 ; NEVER # LATIN CAPITAL LETTER E WITH MACRON AND GRAVE..L
1E19 ; ALWAYS # LATIN SMALL LETTER E WITH CIRCUMFLEX BELOW
1E1A ; NEVER # LATIN CAPITAL LETTER E WITH TILDE BELOW
1E1B ; ALWAYS # LATIN SMALL LETTER E WITH TILDE BELOW
1E1C..1E1E ; NEVER # LATIN CAPITAL LETTER E WITH CEDILLA AND BREVE..
1E1F ; ALWAYS # LATIN SMALL LETTER F WITH DOT ABOVE
1E20 ; NEVER # LATIN CAPITAL LETTER G WITH MACRON
1E21 ; ALWAYS # LATIN SMALL LETTER G WITH MACRON
1E22 ; NEVER # LATIN CAPITAL LETTER H WITH DOT ABOVE
1E23 ; ALWAYS # LATIN SMALL LETTER H WITH DOT ABOVE
1E24 ; NEVER # LATIN CAPITAL LETTER H WITH DOT BELOW
1E25 ; ALWAYS # LATIN SMALL LETTER H WITH DOT BELOW
1E26 ; NEVER # LATIN CAPITAL LETTER H WITH DIAERESIS
1E27 ; ALWAYS # LATIN SMALL LETTER H WITH DIAERESIS
1E28 ; NEVER # LATIN CAPITAL LETTER H WITH CEDILLA
1E29 ; ALWAYS # LATIN SMALL LETTER H WITH CEDILLA
1E2A ; NEVER # LATIN CAPITAL LETTER H WITH BREVE BELOW
1E2B ; ALWAYS # LATIN SMALL LETTER H WITH BREVE BELOW
1E2C ; NEVER # LATIN CAPITAL LETTER I WITH TILDE BELOW
1E2D ; ALWAYS # LATIN SMALL LETTER I WITH TILDE BELOW
1E2E..1E30 ; NEVER # LATIN CAPITAL LETTER I WITH DIAERESIS AND ACUTE
1E31 ; ALWAYS # LATIN SMALL LETTER K WITH ACUTE
1E32 ; NEVER # LATIN CAPITAL LETTER K WITH DOT BELOW
1E33 ; ALWAYS # LATIN SMALL LETTER K WITH DOT BELOW
1E34 ; NEVER # LATIN CAPITAL LETTER K WITH LINE BELOW
1E35 ; ALWAYS # LATIN SMALL LETTER K WITH LINE BELOW
1E36 ; NEVER # LATIN CAPITAL LETTER L WITH DOT BELOW
1E37 ; ALWAYS # LATIN SMALL LETTER L WITH DOT BELOW
1E38..1E3A ; NEVER # LATIN CAPITAL LETTER L WITH DOT BELOW AND MACRO
1E3B ; ALWAYS # LATIN SMALL LETTER L WITH LINE BELOW
1E3C ; NEVER # LATIN CAPITAL LETTER L WITH CIRCUMFLEX BELOW
1E3D ; ALWAYS # LATIN SMALL LETTER L WITH CIRCUMFLEX BELOW
1E3E ; NEVER # LATIN CAPITAL LETTER M WITH ACUTE
1E3F ; ALWAYS # LATIN SMALL LETTER M WITH ACUTE
1E40 ; NEVER # LATIN CAPITAL LETTER M WITH DOT ABOVE
1E41 ; ALWAYS # LATIN SMALL LETTER M WITH DOT ABOVE
1E42 ; NEVER # LATIN CAPITAL LETTER M WITH DOT BELOW
1E43 ; ALWAYS # LATIN SMALL LETTER M WITH DOT BELOW
1E44 ; NEVER # LATIN CAPITAL LETTER N WITH DOT ABOVE
1E45 ; ALWAYS # LATIN SMALL LETTER N WITH DOT ABOVE
1E46 ; NEVER # LATIN CAPITAL LETTER N WITH DOT BELOW
1E47 ; ALWAYS # LATIN SMALL LETTER N WITH DOT BELOW
1E48 ; NEVER # LATIN CAPITAL LETTER N WITH LINE BELOW
1E49 ; ALWAYS # LATIN SMALL LETTER N WITH LINE BELOW
1E4A ; NEVER # LATIN CAPITAL LETTER N WITH CIRCUMFLEX BELOW
1E4B ; ALWAYS # LATIN SMALL LETTER N WITH CIRCUMFLEX BELOW
1E4C..1E54 ; NEVER # LATIN CAPITAL LETTER O WITH TILDE AND ACUTE..LA
1E55 ; ALWAYS # LATIN SMALL LETTER P WITH ACUTE
1E56 ; NEVER # LATIN CAPITAL LETTER P WITH DOT ABOVE
1E57 ; ALWAYS # LATIN SMALL LETTER P WITH DOT ABOVE
1E58 ; NEVER # LATIN CAPITAL LETTER R WITH DOT ABOVE
1E59 ; ALWAYS # LATIN SMALL LETTER R WITH DOT ABOVE
1E5A ; NEVER # LATIN CAPITAL LETTER R WITH DOT BELOW
1E5B ; ALWAYS # LATIN SMALL LETTER R WITH DOT BELOW
1E5C..1E5E ; NEVER # LATIN CAPITAL LETTER R WITH DOT BELOW AND MACRO
1E5F ; ALWAYS # LATIN SMALL LETTER R WITH LINE BELOW
1E60 ; NEVER # LATIN CAPITAL LETTER S WITH DOT ABOVE
1E61 ; ALWAYS # LATIN SMALL LETTER S WITH DOT ABOVE
1E62 ; NEVER # LATIN CAPITAL LETTER S WITH DOT BELOW
1E63 ; ALWAYS # LATIN SMALL LETTER S WITH DOT BELOW
1E64..1E6A ; NEVER # LATIN CAPITAL LETTER S WITH ACUTE AND DOT ABOVE
1E6B ; ALWAYS # LATIN SMALL LETTER T WITH DOT ABOVE
1E6C ; NEVER # LATIN CAPITAL LETTER T WITH DOT BELOW
1E6D ; ALWAYS # LATIN SMALL LETTER T WITH DOT BELOW
1E6E ; NEVER # LATIN CAPITAL LETTER T WITH LINE BELOW
1E6F ; ALWAYS # LATIN SMALL LETTER T WITH LINE BELOW
1E70 ; NEVER # LATIN CAPITAL LETTER T WITH CIRCUMFLEX BELOW
1E71 ; ALWAYS # LATIN SMALL LETTER T WITH CIRCUMFLEX BELOW
1E72 ; NEVER # LATIN CAPITAL LETTER U WITH DIAERESIS BELOW
1E73 ; ALWAYS # LATIN SMALL LETTER U WITH DIAERESIS BELOW
1E74 ; NEVER # LATIN CAPITAL LETTER U WITH TILDE BELOW
1E75 ; ALWAYS # LATIN SMALL LETTER U WITH TILDE BELOW
1E76 ; NEVER # LATIN CAPITAL LETTER U WITH CIRCUMFLEX BELOW
1E77 ; ALWAYS # LATIN SMALL LETTER U WITH CIRCUMFLEX BELOW
1E78..1E7C ; NEVER # LATIN CAPITAL LETTER U WITH TILDE AND ACUTE..LA
1E7D ; ALWAYS # LATIN SMALL LETTER V WITH TILDE
1E7E ; NEVER # LATIN CAPITAL LETTER V WITH DOT BELOW
1E7F ; ALWAYS # LATIN SMALL LETTER V WITH DOT BELOW
1E80 ; NEVER # LATIN CAPITAL LETTER W WITH GRAVE
1E81 ; ALWAYS # LATIN SMALL LETTER W WITH GRAVE
1E82 ; NEVER # LATIN CAPITAL LETTER W WITH ACUTE
1E83 ; ALWAYS # LATIN SMALL LETTER W WITH ACUTE
1E84 ; NEVER # LATIN CAPITAL LETTER W WITH DIAERESIS
1E85 ; ALWAYS # LATIN SMALL LETTER W WITH DIAERESIS
1E86 ; NEVER # LATIN CAPITAL LETTER W WITH DOT ABOVE
1E87 ; ALWAYS # LATIN SMALL LETTER W WITH DOT ABOVE
1E88 ; NEVER # LATIN CAPITAL LETTER W WITH DOT BELOW
1E89 ; ALWAYS # LATIN SMALL LETTER W WITH DOT BELOW
1E8A ; NEVER # LATIN CAPITAL LETTER X WITH DOT ABOVE
1E8B ; ALWAYS # LATIN SMALL LETTER X WITH DOT ABOVE
1E8C ; NEVER # LATIN CAPITAL LETTER X WITH DIAERESIS
1E8D ; ALWAYS # LATIN SMALL LETTER X WITH DIAERESIS
1E8E ; NEVER # LATIN CAPITAL LETTER Y WITH DOT ABOVE
1E8F ; ALWAYS # LATIN SMALL LETTER Y WITH DOT ABOVE
1E90 ; NEVER # LATIN CAPITAL LETTER Z WITH CIRCUMFLEX
1E91 ; ALWAYS # LATIN SMALL LETTER Z WITH CIRCUMFLEX
1E92 ; NEVER # LATIN CAPITAL LETTER Z WITH DOT BELOW
1E93 ; ALWAYS # LATIN SMALL LETTER Z WITH DOT BELOW
1E94 ; NEVER # LATIN CAPITAL LETTER Z WITH LINE BELOW
1E95 ; ALWAYS # LATIN SMALL LETTER Z WITH LINE BELOW
1E96..1E9B ; NEVER # LATIN SMALL LETTER H WITH LINE BELOW..LATIN SMA
1E9C..1E9F ; UNASSIGNED# ..
1EA0 ; NEVER # LATIN CAPITAL LETTER A WITH DOT BELOW
1EA1 ; ALWAYS # LATIN SMALL LETTER A WITH DOT BELOW
1EA2 ; NEVER # LATIN CAPITAL LETTER A WITH HOOK ABOVE
1EA3 ; ALWAYS # LATIN SMALL LETTER A WITH HOOK ABOVE
1EA4..1EB8 ; NEVER # LATIN CAPITAL LETTER A WITH CIRCUMFLEX AND ACUT
1EB9 ; ALWAYS # LATIN SMALL LETTER E WITH DOT BELOW
1EBA ; NEVER # LATIN CAPITAL LETTER E WITH HOOK ABOVE
1EBB ; ALWAYS # LATIN SMALL LETTER E WITH HOOK ABOVE
1EBC ; NEVER # LATIN CAPITAL LETTER E WITH TILDE
1EBD ; ALWAYS # LATIN SMALL LETTER E WITH TILDE
1EBE..1EC8 ; NEVER # LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND ACUT
1EC9 ; ALWAYS # LATIN SMALL LETTER I WITH HOOK ABOVE
1ECA ; NEVER # LATIN CAPITAL LETTER I WITH DOT BELOW
1ECB ; ALWAYS # LATIN SMALL LETTER I WITH DOT BELOW
1ECC ; NEVER # LATIN CAPITAL LETTER O WITH DOT BELOW
1ECD ; ALWAYS # LATIN SMALL LETTER O WITH DOT BELOW
1ECE ; NEVER # LATIN CAPITAL LETTER O WITH HOOK ABOVE
1ECF ; ALWAYS # LATIN SMALL LETTER O WITH HOOK ABOVE
1ED0..1EE4 ; NEVER # LATIN CAPITAL LETTER O WITH CIRCUMFLEX AND ACUT
1EE5 ; ALWAYS # LATIN SMALL LETTER U WITH DOT BELOW
1EE6 ; NEVER # LATIN CAPITAL LETTER U WITH HOOK ABOVE
1EE7 ; ALWAYS # LATIN SMALL LETTER U WITH HOOK ABOVE
1EE8..1EF2 ; NEVER # LATIN CAPITAL LETTER U WITH HORN AND ACUTE..LAT
1EF3 ; ALWAYS # LATIN SMALL LETTER Y WITH GRAVE
1EF4 ; NEVER # LATIN CAPITAL LETTER Y WITH DOT BELOW
1EF5 ; ALWAYS # LATIN SMALL LETTER Y WITH DOT BELOW
1EF6 ; NEVER # LATIN CAPITAL LETTER Y WITH HOOK ABOVE
1EF7 ; ALWAYS # LATIN SMALL LETTER Y WITH HOOK ABOVE
1EF8 ; NEVER # LATIN CAPITAL LETTER Y WITH TILDE
1EF9 ; ALWAYS # LATIN SMALL LETTER Y WITH TILDE
1EFA..1EFF ; UNASSIGNED# ..
1F00..1F01 ; ALWAYS # GREEK SMALL LETTER ALPHA WITH PSILI..GREEK SMAL
1F02..1F0F ; NEVER # GREEK SMALL LETTER ALPHA WITH PSILI AND VARIA..
1F10..1F11 ; ALWAYS # GREEK SMALL LETTER EPSILON WITH PSILI..GREEK SM
1F12..1F15 ; NEVER # GREEK SMALL LETTER EPSILON WITH PSILI AND VARIA
1F16..1F17 ; UNASSIGNED# ..
1F18..1F1D ; NEVER # GREEK CAPITAL LETTER EPSILON WITH PSILI..GREEK
1F1E..1F1F ; UNASSIGNED# ..
1F20..1F21 ; ALWAYS # GREEK SMALL LETTER ETA WITH PSILI..GREEK SMALL
1F22..1F2F ; NEVER # GREEK SMALL LETTER ETA WITH PSILI AND VARIA..GR
1F30..1F31 ; ALWAYS # GREEK SMALL LETTER IOTA WITH PSILI..GREEK SMALL
1F32..1F3F ; NEVER # GREEK SMALL LETTER IOTA WITH PSILI AND VARIA..G
1F40..1F41 ; ALWAYS # GREEK SMALL LETTER OMICRON WITH PSILI..GREEK SM
1F42..1F45 ; NEVER # GREEK SMALL LETTER OMICRON WITH PSILI AND VARIA
1F46..1F47 ; UNASSIGNED# ..
1F48..1F4D ; NEVER # GREEK CAPITAL LETTER OMICRON WITH PSILI..GREEK
1F4E..1F4F ; UNASSIGNED# ..
1F50 ; NEVER # GREEK SMALL LETTER UPSILON WITH PSILI
1F51 ; ALWAYS # GREEK SMALL LETTER UPSILON WITH DASIA
1F52..1F57 ; NEVER # GREEK SMALL LETTER UPSILON WITH PSILI AND VARIA
1F58 ; UNASSIGNED#
1F59 ; NEVER # GREEK CAPITAL LETTER UPSILON WITH DASIA
1F5A ; UNASSIGNED#
1F5B ; NEVER # GREEK CAPITAL LETTER UPSILON WITH DASIA AND VAR
1F5C ; UNASSIGNED#
1F5D ; NEVER # GREEK CAPITAL LETTER UPSILON WITH DASIA AND OXI
1F5E ; UNASSIGNED#
1F5F ; NEVER # GREEK CAPITAL LETTER UPSILON WITH DASIA AND PER
1F60..1F61 ; ALWAYS # GREEK SMALL LETTER OMEGA WITH PSILI..GREEK SMAL
1F62..1F6F ; NEVER # GREEK SMALL LETTER OMEGA WITH PSILI AND VARIA..
1F70 ; ALWAYS # GREEK SMALL LETTER ALPHA WITH VARIA
1F71 ; NEVER # GREEK SMALL LETTER ALPHA WITH OXIA
1F72 ; ALWAYS # GREEK SMALL LETTER EPSILON WITH VARIA
1F73 ; NEVER # GREEK SMALL LETTER EPSILON WITH OXIA
1F74 ; ALWAYS # GREEK SMALL LETTER ETA WITH VARIA
1F75 ; NEVER # GREEK SMALL LETTER ETA WITH OXIA
1F76 ; ALWAYS # GREEK SMALL LETTER IOTA WITH VARIA
1F77 ; NEVER # GREEK SMALL LETTER IOTA WITH OXIA
1F78 ; ALWAYS # GREEK SMALL LETTER OMICRON WITH VARIA
1F79 ; NEVER # GREEK SMALL LETTER OMICRON WITH OXIA
1F7A ; ALWAYS # GREEK SMALL LETTER UPSILON WITH VARIA
1F7B ; NEVER # GREEK SMALL LETTER UPSILON WITH OXIA
1F7C ; ALWAYS # GREEK SMALL LETTER OMEGA WITH VARIA
1F7D ; NEVER # GREEK SMALL LETTER OMEGA WITH OXIA
1F7E..1F7F ; UNASSIGNED# ..
1F80..1FAF ; NEVER # GREEK SMALL LETTER ALPHA WITH PSILI AND YPOGEGR
1FB0..1FB1 ; ALWAYS # GREEK SMALL LETTER ALPHA WITH VRACHY..GREEK SMA
1FB2..1FB4 ; NEVER # GREEK SMALL LETTER ALPHA WITH VARIA AND YPOGEGR
1FB5 ; UNASSIGNED#
1FB6..1FC4 ; NEVER # GREEK SMALL LETTER ALPHA WITH PERISPOMENI..GREE
1FC5 ; UNASSIGNED#
1FC6..1FCF ; NEVER # GREEK SMALL LETTER ETA WITH PERISPOMENI..GREEK
1FD0..1FD1 ; ALWAYS # GREEK SMALL LETTER IOTA WITH VRACHY..GREEK SMAL
1FD2..1FD3 ; NEVER # GREEK SMALL LETTER IOTA WITH DIALYTIKA AND VARI
1FD4..1FD5 ; UNASSIGNED# ..
1FD6..1FDB ; NEVER # GREEK SMALL LETTER IOTA WITH PERISPOMENI..GREEK
1FDC ; UNASSIGNED#
1FDD..1FDF ; NEVER # GREEK DASIA AND VARIA..GREEK DASIA AND PERISPOM
1FE0..1FE1 ; ALWAYS # GREEK SMALL LETTER UPSILON WITH VRACHY..GREEK S
1FE2..1FE4 ; NEVER # GREEK SMALL LETTER UPSILON WITH DIALYTIKA AND V
1FE5 ; ALWAYS # GREEK SMALL LETTER RHO WITH DASIA
1FE6..1FEF ; NEVER # GREEK SMALL LETTER UPSILON WITH PERISPOMENI..GR
1FF0..1FF1 ; UNASSIGNED# ..
1FF2..1FF4 ; NEVER # GREEK SMALL LETTER OMEGA WITH VARIA AND YPOGEGR
1FF5 ; UNASSIGNED#
1FF6..1FFE ; NEVER # GREEK SMALL LETTER OMEGA WITH PERISPOMENI..GREE
1FFF ; UNASSIGNED#
2000..200A ; NEVER # EN QUAD..HAIR SPACE
200B..200F ; CONTEXT # ZERO WIDTH SPACE..RIGHT-TO-LEFT MARK
2010..2029 ; NEVER # HYPHEN..PARAGRAPH SEPARATOR
202A..202E ; CONTEXT # LEFT-TO-RIGHT EMBEDDING..RIGHT-TO-LEFT OVERRIDE
202F..205F ; NEVER # NARROW NO-BREAK SPACE..MEDIUM MATHEMATICAL SPAC
2060..2063 ; CONTEXT # WORD JOINER..INVISIBLE SEPARATOR
2064..2069 ; UNASSIGNED# ..
206A..206F ; CONTEXT # INHIBIT SYMMETRIC SWAPPING..NOMINAL DIGIT SHAPE
2070..2071 ; NEVER # SUPERSCRIPT ZERO..SUPERSCRIPT LATIN SMALL LETTE
2072..2073 ; UNASSIGNED# ..
2074..208E ; NEVER # SUPERSCRIPT FOUR..SUBSCRIPT RIGHT PARENTHESIS
208F ; UNASSIGNED#
2090..2094 ; NEVER # LATIN SUBSCRIPT SMALL LETTER A..LATIN SUBSCRIPT
2095..209F ; UNASSIGNED# ..
20A0..20B5 ; NEVER # EURO-CURRENCY SIGN..CEDI SIGN
20B6..20CF ; UNASSIGNED# ..
20D0..20EF ; NEVER # COMBINING LEFT HARPOON ABOVE..COMBINING RIGHT A
20F0..20FF ; UNASSIGNED# ..
2100..214D ; NEVER # ACCOUNT OF..AKTIESELSKAB
214E ; ALWAYS # TURNED SMALL F
214F..2152 ; UNASSIGNED# ..
2153..2183 ; NEVER # VULGAR FRACTION ONE THIRD..ROMAN NUMERAL REVERS
2184 ; ALWAYS # LATIN SMALL LETTER REVERSED C
2185..218F ; UNASSIGNED# ..
2190..23E7 ; NEVER # LEFTWARDS ARROW..ELECTRICAL INTERSECTION
23E8..23FF ; UNASSIGNED# ..
2400..2426 ; NEVER # SYMBOL FOR NULL..SYMBOL FOR SUBSTITUTE FORM TWO
2427..243F ; UNASSIGNED# ..
2440..244A ; NEVER # OCR HOOK..OCR DOUBLE BACKSLASH
244B..245F ; UNASSIGNED# ..
2460..269C ; NEVER # CIRCLED DIGIT ONE..FLEUR-DE-LIS
269D..269F ; UNASSIGNED# ..
26A0..26B2 ; NEVER # WARNING SIGN..NEUTER
26B3..2700 ; UNASSIGNED# ..
2701..2704 ; NEVER # UPPER BLADE SCISSORS..WHITE SCISSORS
2705 ; UNASSIGNED#
2706..2709 ; NEVER # TELEPHONE LOCATION SIGN..ENVELOPE
270A..270B ; UNASSIGNED# ..
270C..2727 ; NEVER # VICTORY HAND..WHITE FOUR POINTED STAR
2728 ; UNASSIGNED#
2729..274B ; NEVER # STRESS OUTLINED WHITE STAR..HEAVY EIGHT TEARDRO
274C ; UNASSIGNED#
274D ; NEVER # SHADOWED WHITE CIRCLE
274E ; UNASSIGNED#
274F..2752 ; NEVER # LOWER RIGHT DROP-SHADOWED WHITE SQUARE..UPPER R
2753..2755 ; UNASSIGNED# ..
2756 ; NEVER # BLACK DIAMOND MINUS WHITE X
2757 ; UNASSIGNED#
2758..275E ; NEVER # LIGHT VERTICAL BAR..HEAVY DOUBLE COMMA QUOTATIO
275F..2760 ; UNASSIGNED# ..
2761..2794 ; NEVER # CURVED STEM PARAGRAPH SIGN ORNAMENT..HEAVY WIDE
2795..2797 ; UNASSIGNED# ..
2798..27AF ; NEVER # HEAVY SOUTH EAST ARROW..NOTCHED LOWER RIGHT-SHA
27B0 ; UNASSIGNED#
27B1..27BE ; NEVER # NOTCHED UPPER RIGHT-SHADOWED WHITE RIGHTWARDS A
27BF ; UNASSIGNED#
27C0..27CA ; NEVER # THREE DIMENSIONAL ANGLE..VERTICAL BAR WITH HORI
27CB..27CF ; UNASSIGNED# ..
27D0..27EB ; NEVER # WHITE DIAMOND WITH CENTRED DOT..MATHEMATICAL RI
27EC..27EF ; UNASSIGNED# ..
27F0..2B1A ; NEVER # UPWARDS QUADRUPLE ARROW..DOTTED SQUARE
2B1B..2B1F ; UNASSIGNED# ..
2B20..2B23 ; NEVER # WHITE PENTAGON..HORIZONTAL BLACK HEXAGON
2B24..2BFF ; UNASSIGNED# ..
2C00..2C2E ; NEVER # GLAGOLITIC CAPITAL LETTER AZU..GLAGOLITIC CAPIT
2C2F ; UNASSIGNED#
2C30..2C5E ; NEVER # GLAGOLITIC SMALL LETTER AZU..GLAGOLITIC SMALL L
2C5F ; UNASSIGNED#
2C60 ; NEVER # LATIN CAPITAL LETTER L WITH DOUBLE BAR
2C61 ; ALWAYS # LATIN SMALL LETTER L WITH DOUBLE BAR
2C62..2C64 ; NEVER # LATIN CAPITAL LETTER L WITH MIDDLE TILDE..LATIN
2C65..2C66 ; ALWAYS # LATIN SMALL LETTER A WITH STROKE..LATIN SMALL L
2C67 ; NEVER # LATIN CAPITAL LETTER H WITH DESCENDER
2C68 ; ALWAYS # LATIN SMALL LETTER H WITH DESCENDER
2C69 ; NEVER # LATIN CAPITAL LETTER K WITH DESCENDER
2C6A ; ALWAYS # LATIN SMALL LETTER K WITH DESCENDER
2C6B ; NEVER # LATIN CAPITAL LETTER Z WITH DESCENDER
2C6C ; ALWAYS # LATIN SMALL LETTER Z WITH DESCENDER
2C6D..2C73 ; UNASSIGNED# ..
2C74 ; ALWAYS # LATIN SMALL LETTER V WITH CURL
2C75 ; NEVER # LATIN CAPITAL LETTER HALF H
2C76..2C77 ; ALWAYS # LATIN SMALL LETTER HALF H..LATIN SMALL LETTER T
2C78..2C7F ; UNASSIGNED# ..
2C80 ; NEVER # COPTIC CAPITAL LETTER ALFA
2C81 ; MAYBE YES # COPTIC SMALL LETTER ALFA
2C82 ; NEVER # COPTIC CAPITAL LETTER VIDA
2C83 ; MAYBE YES # COPTIC SMALL LETTER VIDA
2C84 ; NEVER # COPTIC CAPITAL LETTER GAMMA
2C85 ; MAYBE YES # COPTIC SMALL LETTER GAMMA
2C86 ; NEVER # COPTIC CAPITAL LETTER DALDA
2C87 ; MAYBE YES # COPTIC SMALL LETTER DALDA
2C88 ; NEVER # COPTIC CAPITAL LETTER EIE
2C89 ; MAYBE YES # COPTIC SMALL LETTER EIE
2C8A ; NEVER # COPTIC CAPITAL LETTER SOU
2C8B ; MAYBE YES # COPTIC SMALL LETTER SOU
2C8C ; NEVER # COPTIC CAPITAL LETTER ZATA
2C8D ; MAYBE YES # COPTIC SMALL LETTER ZATA
2C8E ; NEVER # COPTIC CAPITAL LETTER HATE
2C8F ; MAYBE YES # COPTIC SMALL LETTER HATE
2C90 ; NEVER # COPTIC CAPITAL LETTER THETHE
2C91 ; MAYBE YES # COPTIC SMALL LETTER THETHE
2C92 ; NEVER # COPTIC CAPITAL LETTER IAUDA
2C93 ; MAYBE YES # COPTIC SMALL LETTER IAUDA
2C94 ; NEVER # COPTIC CAPITAL LETTER KAPA
2C95 ; MAYBE YES # COPTIC SMALL LETTER KAPA
2C96 ; NEVER # COPTIC CAPITAL LETTER LAULA
2C97 ; MAYBE YES # COPTIC SMALL LETTER LAULA
2C98 ; NEVER # COPTIC CAPITAL LETTER MI
2C99 ; MAYBE YES # COPTIC SMALL LETTER MI
2C9A ; NEVER # COPTIC CAPITAL LETTER NI
2C9B ; MAYBE YES # COPTIC SMALL LETTER NI
2C9C ; NEVER # COPTIC CAPITAL LETTER KSI
2C9D ; MAYBE YES # COPTIC SMALL LETTER KSI
2C9E ; NEVER # COPTIC CAPITAL LETTER O
2C9F ; MAYBE YES # COPTIC SMALL LETTER O
2CA0 ; NEVER # COPTIC CAPITAL LETTER PI
2CA1 ; MAYBE YES # COPTIC SMALL LETTER PI
2CA2 ; NEVER # COPTIC CAPITAL LETTER RO
2CA3 ; MAYBE YES # COPTIC SMALL LETTER RO
2CA4 ; NEVER # COPTIC CAPITAL LETTER SIMA
2CA5 ; MAYBE YES # COPTIC SMALL LETTER SIMA
2CA6 ; NEVER # COPTIC CAPITAL LETTER TAU
2CA7 ; MAYBE YES # COPTIC SMALL LETTER TAU
2CA8 ; NEVER # COPTIC CAPITAL LETTER UA
2CA9 ; MAYBE YES # COPTIC SMALL LETTER UA
2CAA ; NEVER # COPTIC CAPITAL LETTER FI
2CAB ; MAYBE YES # COPTIC SMALL LETTER FI
2CAC ; NEVER # COPTIC CAPITAL LETTER KHI
2CAD ; MAYBE YES # COPTIC SMALL LETTER KHI
2CAE ; NEVER # COPTIC CAPITAL LETTER PSI
2CAF ; MAYBE YES # COPTIC SMALL LETTER PSI
2CB0 ; NEVER # COPTIC CAPITAL LETTER OOU
2CB1 ; MAYBE YES # COPTIC SMALL LETTER OOU
2CB2 ; NEVER # COPTIC CAPITAL LETTER DIALECT-P ALEF
2CB3 ; MAYBE YES # COPTIC SMALL LETTER DIALECT-P ALEF
2CB4 ; NEVER # COPTIC CAPITAL LETTER OLD COPTIC AIN
2CB5 ; MAYBE YES # COPTIC SMALL LETTER OLD COPTIC AIN
2CB6 ; NEVER # COPTIC CAPITAL LETTER CRYPTOGRAMMIC EIE
2CB7 ; MAYBE YES # COPTIC SMALL LETTER CRYPTOGRAMMIC EIE
2CB8 ; NEVER # COPTIC CAPITAL LETTER DIALECT-P KAPA
2CB9 ; MAYBE YES # COPTIC SMALL LETTER DIALECT-P KAPA
2CBA ; NEVER # COPTIC CAPITAL LETTER DIALECT-P NI
2CBB ; MAYBE YES # COPTIC SMALL LETTER DIALECT-P NI
2CBC ; NEVER # COPTIC CAPITAL LETTER CRYPTOGRAMMIC NI
2CBD ; MAYBE YES # COPTIC SMALL LETTER CRYPTOGRAMMIC NI
2CBE ; NEVER # COPTIC CAPITAL LETTER OLD COPTIC OOU
2CBF ; MAYBE YES # COPTIC SMALL LETTER OLD COPTIC OOU
2CC0 ; NEVER # COPTIC CAPITAL LETTER SAMPI
2CC1 ; MAYBE YES # COPTIC SMALL LETTER SAMPI
2CC2 ; NEVER # COPTIC CAPITAL LETTER CROSSED SHEI
2CC3 ; MAYBE YES # COPTIC SMALL LETTER CROSSED SHEI
2CC4 ; NEVER # COPTIC CAPITAL LETTER OLD COPTIC SHEI
2CC5 ; MAYBE YES # COPTIC SMALL LETTER OLD COPTIC SHEI
2CC6 ; NEVER # COPTIC CAPITAL LETTER OLD COPTIC ESH
2CC7 ; MAYBE YES # COPTIC SMALL LETTER OLD COPTIC ESH
2CC8 ; NEVER # COPTIC CAPITAL LETTER AKHMIMIC KHEI
2CC9 ; MAYBE YES # COPTIC SMALL LETTER AKHMIMIC KHEI
2CCA ; NEVER # COPTIC CAPITAL LETTER DIALECT-P HORI
2CCB ; MAYBE YES # COPTIC SMALL LETTER DIALECT-P HORI
2CCC ; NEVER # COPTIC CAPITAL LETTER OLD COPTIC HORI
2CCD ; MAYBE YES # COPTIC SMALL LETTER OLD COPTIC HORI
2CCE ; NEVER # COPTIC CAPITAL LETTER OLD COPTIC HA
2CCF ; MAYBE YES # COPTIC SMALL LETTER OLD COPTIC HA
2CD0 ; NEVER # COPTIC CAPITAL LETTER L-SHAPED HA
2CD1 ; MAYBE YES # COPTIC SMALL LETTER L-SHAPED HA
2CD2 ; NEVER # COPTIC CAPITAL LETTER OLD COPTIC HEI
2CD3 ; MAYBE YES # COPTIC SMALL LETTER OLD COPTIC HEI
2CD4 ; NEVER # COPTIC CAPITAL LETTER OLD COPTIC HAT
2CD5 ; MAYBE YES # COPTIC SMALL LETTER OLD COPTIC HAT
2CD6 ; NEVER # COPTIC CAPITAL LETTER OLD COPTIC GANGIA
2CD7 ; MAYBE YES # COPTIC SMALL LETTER OLD COPTIC GANGIA
2CD8 ; NEVER # COPTIC CAPITAL LETTER OLD COPTIC DJA
2CD9 ; MAYBE YES # COPTIC SMALL LETTER OLD COPTIC DJA
2CDA ; NEVER # COPTIC CAPITAL LETTER OLD COPTIC SHIMA
2CDB ; MAYBE YES # COPTIC SMALL LETTER OLD COPTIC SHIMA
2CDC ; NEVER # COPTIC CAPITAL LETTER OLD NUBIAN SHIMA
2CDD ; MAYBE YES # COPTIC SMALL LETTER OLD NUBIAN SHIMA
2CDE ; NEVER # COPTIC CAPITAL LETTER OLD NUBIAN NGI
2CDF ; MAYBE YES # COPTIC SMALL LETTER OLD NUBIAN NGI
2CE0 ; NEVER # COPTIC CAPITAL LETTER OLD NUBIAN NYI
2CE1 ; MAYBE YES # COPTIC SMALL LETTER OLD NUBIAN NYI
2CE2 ; NEVER # COPTIC CAPITAL LETTER OLD NUBIAN WAU
2CE3..2CE4 ; MAYBE YES # COPTIC SMALL LETTER OLD NUBIAN WAU..COPTIC SYMB
2CE5..2CEA ; NEVER # COPTIC SYMBOL MI RO..COPTIC SYMBOL SHIMA SIMA
2CEB..2CF8 ; UNASSIGNED# ..
2CF9..2CFF ; NEVER # COPTIC OLD NUBIAN FULL STOP..COPTIC MORPHOLOGIC
2D00..2D25 ; MAYBE YES # GEORGIAN SMALL LETTER AN..GEORGIAN SMALL LETTER
2D26..2D2F ; UNASSIGNED# ..
2D30..2D65 ; MAYBE YES # TIFINAGH LETTER YA..TIFINAGH LETTER YAZZ
2D66..2D6E ; UNASSIGNED# ..
2D6F ; NEVER # TIFINAGH MODIFIER LETTER LABIALIZATION MARK
2D70..2D7F ; UNASSIGNED# ..
2D80..2D96 ; MAYBE YES # ETHIOPIC SYLLABLE LOA..ETHIOPIC SYLLABLE GGWE
2D97..2D9F ; UNASSIGNED# ..
2DA0..2DA6 ; MAYBE YES # ETHIOPIC SYLLABLE SSA..ETHIOPIC SYLLABLE SSO
2DA7 ; UNASSIGNED#
2DA8..2DAE ; MAYBE YES # ETHIOPIC SYLLABLE CCA..ETHIOPIC SYLLABLE CCO
2DAF ; UNASSIGNED#
2DB0..2DB6 ; MAYBE YES # ETHIOPIC SYLLABLE ZZA..ETHIOPIC SYLLABLE ZZO
2DB7 ; UNASSIGNED#
2DB8..2DBE ; MAYBE YES # ETHIOPIC SYLLABLE CCHA..ETHIOPIC SYLLABLE CCHO
2DBF ; UNASSIGNED#
2DC0..2DC6 ; MAYBE YES # ETHIOPIC SYLLABLE QYA..ETHIOPIC SYLLABLE QYO
2DC7 ; UNASSIGNED#
2DC8..2DCE ; MAYBE YES # ETHIOPIC SYLLABLE KYA..ETHIOPIC SYLLABLE KYO
2DCF ; UNASSIGNED#
2DD0..2DD6 ; MAYBE YES # ETHIOPIC SYLLABLE XYA..ETHIOPIC SYLLABLE XYO
2DD7 ; UNASSIGNED#
2DD8..2DDE ; MAYBE YES # ETHIOPIC SYLLABLE GYA..ETHIOPIC SYLLABLE GYO
2DDF..2DFF ; UNASSIGNED# ..
2E00..2E17 ; NEVER # RIGHT ANGLE SUBSTITUTION MARKER..DOUBLE OBLIQUE
2E18..2E1B ; UNASSIGNED# ..
2E1C..2E1D ; NEVER # LEFT LOW PARAPHRASE BRACKET..RIGHT LOW PARAPHRA
2E1E..2E7F ; UNASSIGNED# ..
2E80..2E99 ; MAYBE YES # CJK RADICAL REPEAT..CJK RADICAL RAP
2E9A ; UNASSIGNED#
2E9B..2E9E ; MAYBE YES # CJK RADICAL CHOKE..CJK RADICAL DEATH
2E9F ; NEVER # CJK RADICAL MOTHER
2EA0..2EF2 ; MAYBE YES # CJK RADICAL CIVILIAN..CJK RADICAL J-SIMPLIFIED
2EF3 ; NEVER # CJK RADICAL C-SIMPLIFIED TURTLE
2EF4..2EFF ; UNASSIGNED# ..
2F00..2FD5 ; NEVER # KANGXI RADICAL ONE..KANGXI RADICAL FLUTE
2FD6..2FEF ; UNASSIGNED# ..
2FF0..2FFB ; NEVER # IDEOGRAPHIC DESCRIPTION CHARACTER LEFT TO RIGHT
2FFC..2FFF ; UNASSIGNED# ..
3000..3004 ; NEVER # IDEOGRAPHIC SPACE..JAPANESE INDUSTRIAL STANDARD
3005..3007 ; MAYBE YES # IDEOGRAPHIC ITERATION MARK..IDEOGRAPHIC NUMBER
3008..3020 ; NEVER # LEFT ANGLE BRACKET..POSTAL MARK FACE
3021..302F ; MAYBE YES # HANGZHOU NUMERAL ONE..HANGUL DOUBLE DOT TONE MA
3030 ; NEVER # WAVY DASH
3031..3035 ; MAYBE YES # VERTICAL KANA REPEAT MARK..VERTICAL KANA REPEAT
3036..303A ; NEVER # CIRCLED POSTAL MARK..HANGZHOU NUMERAL THIRTY
303B..303C ; MAYBE YES # VERTICAL IDEOGRAPHIC ITERATION MARK..MASU MARK
303D..303F ; NEVER # PART ALTERNATION MARK..IDEOGRAPHIC HALF FILL SP
3040 ; UNASSIGNED#
3041..3096 ; MAYBE YES # HIRAGANA LETTER SMALL A..HIRAGANA LETTER SMALL
3097..3098 ; UNASSIGNED# ..
3099..309A ; MAYBE YES # COMBINING KATAKANA-HIRAGANA VOICED SOUND MARK..
309B..309C ; NEVER # KATAKANA-HIRAGANA VOICED SOUND MARK..KATAKANA-H
309D..309E ; MAYBE YES # HIRAGANA ITERATION MARK..HIRAGANA VOICED ITERAT
309F..30A0 ; NEVER # HIRAGANA DIGRAPH YORI..KATAKANA-HIRAGANA DOUBLE
30A1..30FE ; MAYBE YES # KATAKANA LETTER SMALL A..KATAKANA VOICED ITERAT
30FF ; NEVER # KATAKANA DIGRAPH KOTO
3100..3104 ; UNASSIGNED# ..
3105..312C ; MAYBE YES # BOPOMOFO LETTER B..BOPOMOFO LETTER GN
312D..3130 ; UNASSIGNED# ..
3131..318E ; NEVER # HANGUL LETTER KIYEOK..HANGUL LETTER ARAEAE
318F ; UNASSIGNED#
3190..319F ; NEVER # IDEOGRAPHIC ANNOTATION LINKING MARK..IDEOGRAPHI
31A0..31B7 ; MAYBE YES # BOPOMOFO LETTER BU..BOPOMOFO FINAL LETTER H
31B8..31BF ; UNASSIGNED# ..
31C0..31CF ; NEVER # CJK STROKE T..CJK STROKE N
31D0..31EF ; UNASSIGNED# ..
31F0..31FF ; MAYBE YES # KATAKANA LETTER SMALL KU..KATAKANA LETTER SMALL
3200..321E ; NEVER # PARENTHESIZED HANGUL KIYEOK..PARENTHESIZED KORE
321F ; UNASSIGNED#
3220..3243 ; NEVER # PARENTHESIZED IDEOGRAPH ONE..PARENTHESIZED IDEO
3244..324F ; UNASSIGNED# ..
3250..32FE ; NEVER # PARTNERSHIP SIGN..CIRCLED KATAKANA WO
32FF ; UNASSIGNED#
3300..33FF ; NEVER # SQUARE APAATO..SQUARE GAL
3400..4DB5 ; MAYBE YES # ....
4DC0..4DFF ; NEVER # HEXAGRAM FOR THE CREATIVE HEAVEN..HEXAGRAM FOR
4E00..9FBB ; MAYBE YES # ..
9FBC..9FFF ; UNASSIGNED# ..
A000..A48C ; MAYBE YES # YI SYLLABLE IT..YI SYLLABLE YYR
A48D..A48F ; UNASSIGNED# ..
A490..A4C6 ; NEVER # YI RADICAL QOT..YI RADICAL KE
A4C7..A6FF ; UNASSIGNED# ..
A700..A716 ; NEVER # MODIFIER LETTER CHINESE TONE YIN PING..MODIFIER
A717..A71A ; MAYBE YES # MODIFIER LETTER DOT VERTICAL BAR..MODIFIER LETT
A71B..A71F ; UNASSIGNED# ..
A720..A721 ; NEVER # MODIFIER LETTER STRESS AND HIGH TONE..MODIFIER
A722..A7FF ; UNASSIGNED# ..
A800..A827 ; MAYBE YES # SYLOTI NAGRI LETTER A..SYLOTI NAGRI VOWEL SIGN
A828..A82B ; NEVER # SYLOTI NAGRI POETRY MARK-1..SYLOTI NAGRI POETRY
A82C..A83F ; UNASSIGNED# ..
A840..A877 ; NEVER # PHAGS-PA LETTER KA..PHAGS-PA MARK DOUBLE SHAD
A878..ABFF ; UNASSIGNED# ..
AC00..D7A3 ; MAYBE YES # ..
D7A4..D7FF ; UNASSIGNED# ..
D800..FA0D ; NEVER # ..CJK COM
FA0E..FA0F ; MAYBE YES # CJK COMPATIBILITY IDEOGRAPH-FA0E..CJK COMPATIBI
FA10 ; NEVER # CJK COMPATIBILITY IDEOGRAPH-FA10
FA11 ; MAYBE YES # CJK COMPATIBILITY IDEOGRAPH-FA11
FA12 ; NEVER # CJK COMPATIBILITY IDEOGRAPH-FA12
FA13..FA14 ; MAYBE YES # CJK COMPATIBILITY IDEOGRAPH-FA13..CJK COMPATIBI
FA15..FA1E ; NEVER # CJK COMPATIBILITY IDEOGRAPH-FA15..CJK COMPATIBI
FA1F ; MAYBE YES # CJK COMPATIBILITY IDEOGRAPH-FA1F
FA20 ; NEVER # CJK COMPATIBILITY IDEOGRAPH-FA20
FA21 ; MAYBE YES # CJK COMPATIBILITY IDEOGRAPH-FA21
FA22 ; NEVER # CJK COMPATIBILITY IDEOGRAPH-FA22
FA23..FA24 ; MAYBE YES # CJK COMPATIBILITY IDEOGRAPH-FA23..CJK COMPATIBI
FA25..FA26 ; NEVER # CJK COMPATIBILITY IDEOGRAPH-FA25..CJK COMPATIBI
FA27..FA29 ; MAYBE YES # CJK COMPATIBILITY IDEOGRAPH-FA27..CJK COMPATIBI
FA2A..FA2D ; NEVER # CJK COMPATIBILITY IDEOGRAPH-FA2A..CJK COMPATIBI
FA2E..FA2F ; UNASSIGNED# ..
FA30..FA6A ; NEVER # CJK COMPATIBILITY IDEOGRAPH-FA30..CJK COMPATIBI
FA6B..FA6F ; UNASSIGNED# ..
FA70..FAD9 ; NEVER # CJK COMPATIBILITY IDEOGRAPH-FA70..CJK COMPATIBI
FADA..FAFF ; UNASSIGNED# ..
FB00..FB06 ; NEVER # LATIN SMALL LIGATURE FF..LATIN SMALL LIGATURE S
FB07..FB12 ; UNASSIGNED# ..
FB13..FB17 ; NEVER # ARMENIAN SMALL LIGATURE MEN NOW..ARMENIAN SMALL
FB18..FB1C ; UNASSIGNED# ..
FB1D ; NEVER # HEBREW LETTER YOD WITH HIRIQ
FB1E ; MAYBE YES # HEBREW POINT JUDEO-SPANISH VARIKA
FB1F..FB36 ; NEVER # HEBREW LIGATURE YIDDISH YOD YOD PATAH..HEBREW L
FB37 ; UNASSIGNED#
FB38..FB3C ; NEVER # HEBREW LETTER TET WITH DAGESH..HEBREW LETTER LA
FB3D ; UNASSIGNED#
FB3E ; NEVER # HEBREW LETTER MEM WITH DAGESH
FB3F ; UNASSIGNED#
FB40..FB41 ; NEVER # HEBREW LETTER NUN WITH DAGESH..HEBREW LETTER SA
FB42 ; UNASSIGNED#
FB43..FB44 ; NEVER # HEBREW LETTER FINAL PE WITH DAGESH..HEBREW LETT
FB45 ; UNASSIGNED#
FB46..FBB1 ; NEVER # HEBREW LETTER TSADI WITH DAGESH..ARABIC LETTER
FBB2..FBD2 ; UNASSIGNED# ..
FBD3..FC5A ; NEVER # ARABIC LETTER NG ISOLATED FORM..ARABIC LIGATURE
FC5B..FC5C ; MAYBE YES # ARABIC LIGATURE THAL WITH SUPERSCRIPT ALEF ISOL
FC5D ; NEVER # ARABIC LIGATURE ALEF MAKSURA WITH SUPERSCRIPT A
FC5E..FC63 ; MAYBE YES # ARABIC LIGATURE SHADDA WITH DAMMATAN ISOLATED F
FC64..FC8F ; NEVER # ARABIC LIGATURE YEH WITH HAMZA ABOVE WITH REH F
FC90 ; MAYBE YES # ARABIC LIGATURE ALEF MAKSURA WITH SUPERSCRIPT A
FC91..FCD8 ; NEVER # ARABIC LIGATURE YEH WITH REH FINAL FORM..ARABIC
FCD9 ; MAYBE YES # ARABIC LIGATURE HEH WITH SUPERSCRIPT ALEF INITI
FCDA..FCF1 ; NEVER # ARABIC LIGATURE YEH WITH JEEM INITIAL FORM..ARA
FCF2..FCF4 ; MAYBE YES # ARABIC LIGATURE SHADDA WITH FATHA MEDIAL FORM..
FCF5..FD3C ; NEVER # ARABIC LIGATURE TAH WITH ALEF MAKSURA ISOLATED
FD3D ; MAYBE YES # ARABIC LIGATURE ALEF WITH FATHATAN ISOLATED FOR
FD3E..FD3F ; NEVER # ORNATE LEFT PARENTHESIS..ORNATE RIGHT PARENTHES
FD40..FD4F ; UNASSIGNED# ..
FD50..FD8F ; NEVER # ARABIC LIGATURE TEH WITH JEEM WITH MEEM INITIAL
FD90..FD91 ; UNASSIGNED# ..
FD92..FDC7 ; NEVER # ARABIC LIGATURE MEEM WITH JEEM WITH KHAH INITIA
FDC8..FDEF ; UNASSIGNED# ..
FDF0..FDFD ; NEVER # ARABIC LIGATURE SALLA USED AS KORANIC STOP SIGN
FDFE..FDFF ; UNASSIGNED# ..
FE00..FE0F ; MAYBE YES # VARIATION SELECTOR-1..VARIATION SELECTOR-16
FE10..FE19 ; NEVER # PRESENTATION FORM FOR VERTICAL COMMA..PRESENTAT
FE1A..FE1F ; UNASSIGNED# ..
FE20..FE23 ; MAYBE YES # COMBINING LIGATURE LEFT HALF..COMBINING DOUBLE
FE24..FE2F ; UNASSIGNED# ..
FE30..FE52 ; NEVER # PRESENTATION FORM FOR VERTICAL TWO DOT LEADER..
FE53 ; UNASSIGNED#
FE54..FE66 ; NEVER # SMALL SEMICOLON..SMALL EQUALS SIGN
FE67 ; UNASSIGNED#
FE68..FE6B ; NEVER # SMALL REVERSE SOLIDUS..SMALL COMMERCIAL AT
FE6C..FE6F ; UNASSIGNED# ..
FE70..FE74 ; MAYBE YES # ARABIC FATHATAN ISOLATED FORM..ARABIC KASRATAN
FE75 ; UNASSIGNED#
FE76..FE7F ; MAYBE YES # ARABIC FATHA ISOLATED FORM..ARABIC SUKUN MEDIAL
FE80..FEFC ; NEVER # ARABIC LETTER HAMZA ISOLATED FORM..ARABIC LIGAT
FEFD..FEFE ; UNASSIGNED# ..
FEFF ; CONTEXT # ZERO WIDTH NO-BREAK SPACE
FF00 ; UNASSIGNED#
FF01..FFBE ; NEVER # FULLWIDTH EXCLAMATION MARK..HALFWIDTH HANGUL LE
FFBF..FFC1 ; UNASSIGNED# ..
FFC2..FFC7 ; NEVER # HALFWIDTH HANGUL LETTER A..HALFWIDTH HANGUL LET
FFC8..FFC9 ; UNASSIGNED# ..
FFCA..FFCF ; NEVER # HALFWIDTH HANGUL LETTER YEO..HALFWIDTH HANGUL L
FFD0..FFD1 ; UNASSIGNED# ..
FFD2..FFD7 ; NEVER # HALFWIDTH HANGUL LETTER YO..HALFWIDTH HANGUL LE
FFD8..FFD9 ; UNASSIGNED# ..
FFDA..FFDC ; NEVER # HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGUL LE
FFDD..FFDF ; UNASSIGNED# ..
FFE0..FFE6 ; NEVER # FULLWIDTH CENT SIGN..FULLWIDTH WON SIGN
FFE7 ; UNASSIGNED#
FFE8..FFEE ; NEVER # HALFWIDTH FORMS LIGHT VERTICAL..HALFWIDTH WHITE
FFEF..FFF8 ; UNASSIGNED# ..
FFF9..FFFB ; CONTEXT # INTERLINEAR ANNOTATION ANCHOR..INTERLINEAR ANNO
FFFC..FFFD ; NEVER # OBJECT REPLACEMENT CHARACTER..REPLACEMENT CHARA
FFFE..FFFF ; UNASSIGNED# ..
10000..1000B; NEVER # LINEAR B SYLLABLE B008 A..LINEAR B SYLLABLE B0
1000C ; UNASSIGNED#
1000D..10026; NEVER # LINEAR B SYLLABLE B036 JO..LINEAR B SYLLABLE B
10027 ; UNASSIGNED#
10028..1003A; NEVER # LINEAR B SYLLABLE B060 RA..LINEAR B SYLLABLE B
1003B ; UNASSIGNED#
1003C..1003D; NEVER # LINEAR B SYLLABLE B017 ZA..LINEAR B SYLLABLE B
1003E ; UNASSIGNED#
1003F..1004D; NEVER # LINEAR B SYLLABLE B020 ZO..LINEAR B SYLLABLE B
1004E..1004F; UNASSIGNED# ..
10050..1005D; NEVER # LINEAR B SYMBOL B018..LINEAR B SYMBOL B089
1005E..1007F; UNASSIGNED# ..
10080..100FA; NEVER # LINEAR B IDEOGRAM B100 MAN..LINEAR B IDEOGRAM
100FB..100FF; UNASSIGNED# ..
10100..10102; NEVER # AEGEAN WORD SEPARATOR LINE..AEGEAN CHECK MARK
10103..10106; UNASSIGNED# ..
10107..10133; NEVER # AEGEAN NUMBER ONE..AEGEAN NUMBER NINETY THOUSA
10134..10136; UNASSIGNED# ..
10137..1018A; NEVER # AEGEAN WEIGHT BASE UNIT..GREEK ZERO SIGN
1018B..102FF; UNASSIGNED# ..
10300..1031E; NEVER # OLD ITALIC LETTER A..OLD ITALIC LETTER UU
1031F ; UNASSIGNED#
10320..10323; NEVER # OLD ITALIC NUMERAL ONE..OLD ITALIC NUMERAL FIF
10324..1032F; UNASSIGNED# ..
10330..1034A; NEVER # GOTHIC LETTER AHSA..GOTHIC LETTER NINE HUNDRED
1034B..1037F; UNASSIGNED# ..
10380..1039D; NEVER # UGARITIC LETTER ALPA..UGARITIC LETTER SSU
1039E ; UNASSIGNED#
1039F..103C3; NEVER # UGARITIC WORD DIVIDER..OLD PERSIAN SIGN HA
103C4..103C7; UNASSIGNED# ..
103C8..103D5; NEVER # OLD PERSIAN SIGN AURAMAZDAA..OLD PERSIAN NUMBE
103D6..103FF; UNASSIGNED# ..
10400..1049D; NEVER # DESERET CAPITAL LETTER LONG I..OSMANYA LETTER
1049E..1049F; UNASSIGNED# ..
104A0..104A9; NEVER # OSMANYA DIGIT ZERO..OSMANYA DIGIT NINE
104AA..107FF; UNASSIGNED# ..
10800..10805; NEVER # CYPRIOT SYLLABLE A..CYPRIOT SYLLABLE JA
10806..10807; UNASSIGNED# ..
10808 ; NEVER # CYPRIOT SYLLABLE JO
10809 ; UNASSIGNED#
1080A..10835; NEVER # CYPRIOT SYLLABLE KA..CYPRIOT SYLLABLE WO
10836 ; UNASSIGNED#
10837..10838; NEVER # CYPRIOT SYLLABLE XA..CYPRIOT SYLLABLE XE
10839..1083B; UNASSIGNED# ..
1083C ; NEVER # CYPRIOT SYLLABLE ZA
1083D..1083E; UNASSIGNED# ..
1083F ; NEVER # CYPRIOT SYLLABLE ZO
10840..108FF; UNASSIGNED# ..
10900..10919; NEVER # PHOENICIAN LETTER ALF..PHOENICIAN NUMBER ONE H
1091A..1091E; UNASSIGNED# ..
1091F ; NEVER # PHOENICIAN WORD SEPARATOR
10920..109FF; UNASSIGNED# ..
10A00..10A03; NEVER # KHAROSHTHI LETTER A..KHAROSHTHI VOWEL SIGN VOC
10A04 ; UNASSIGNED#
10A05..10A06; NEVER # KHAROSHTHI VOWEL SIGN E..KHAROSHTHI VOWEL SIGN
10A07..10A0B; UNASSIGNED# ..
10A0C..10A13; NEVER # KHAROSHTHI VOWEL LENGTH MARK..KHAROSHTHI LETTE
10A14 ; UNASSIGNED#
10A15..10A17; NEVER # KHAROSHTHI LETTER CA..KHAROSHTHI LETTER JA
10A18 ; UNASSIGNED#
10A19..10A33; NEVER # KHAROSHTHI LETTER NYA..KHAROSHTHI LETTER TTTHA
10A34..10A37; UNASSIGNED# ..
10A38..10A3A; NEVER # KHAROSHTHI SIGN BAR ABOVE..KHAROSHTHI SIGN DOT
10A3B..10A3E; UNASSIGNED# ..
10A3F..10A47; NEVER # KHAROSHTHI VIRAMA..KHAROSHTHI NUMBER ONE THOUS
10A48..10A4F; UNASSIGNED# ..
10A50..10A58; NEVER # KHAROSHTHI PUNCTUATION DOT..KHAROSHTHI PUNCTUA
10A59..11FFF; UNASSIGNED# ..
12000..1236E; NEVER # CUNEIFORM SIGN A..CUNEIFORM SIGN ZUM
1236F..123FF; UNASSIGNED# ..
12400..12462; NEVER # CUNEIFORM NUMERIC SIGN TWO ASH..CUNEIFORM NUME
12463..1246F; UNASSIGNED# ..
12470..12473; NEVER # CUNEIFORM PUNCTUATION SIGN OLD ASSYRIAN WORD D
12474..1CFFF; UNASSIGNED# ..
1D000..1D0F5; NEVER # BYZANTINE MUSICAL SYMBOL PSILI..BYZANTINE MUSI
1D0F6..1D0FF; UNASSIGNED# ..
1D100..1D126; NEVER # MUSICAL SYMBOL SINGLE BARLINE..MUSICAL SYMBOL
1D127..1D129; UNASSIGNED# ..
1D12A..1D172; NEVER # MUSICAL SYMBOL DOUBLE SHARP..MUSICAL SYMBOL CO
1D173..1D17A; CONTEXT # MUSICAL SYMBOL BEGIN BEAM..MUSICAL SYMBOL END
1D17B..1D1DD; NEVER # MUSICAL SYMBOL COMBINING ACCENT..MUSICAL SYMBO
1D1DE..1D1FF; UNASSIGNED# ..
1D200..1D245; NEVER # GREEK VOCAL NOTATION SYMBOL-1..GREEK MUSICAL L
1D246..1D2FF; UNASSIGNED# ..
1D300..1D356; NEVER # MONOGRAM FOR EARTH..TETRAGRAM FOR FOSTERING
1D357..1D35F; UNASSIGNED# ..
1D360..1D371; NEVER # COUNTING ROD UNIT DIGIT ONE..COUNTING ROD TENS
1D372..1D3FF; UNASSIGNED# ..
1D400..1D454; NEVER # MATHEMATICAL BOLD CAPITAL A..MATHEMATICAL ITAL
1D455 ; UNASSIGNED#
1D456..1D49C; NEVER # MATHEMATICAL ITALIC SMALL I..MATHEMATICAL SCRI
1D49D ; UNASSIGNED#
1D49E..1D49F; NEVER # MATHEMATICAL SCRIPT CAPITAL C..MATHEMATICAL SC
1D4A0..1D4A1; UNASSIGNED# ..
1D4A2 ; NEVER # MATHEMATICAL SCRIPT CAPITAL G
1D4A3..1D4A4; UNASSIGNED# ..
1D4A5..1D4A6; NEVER # MATHEMATICAL SCRIPT CAPITAL J..MATHEMATICAL SC
1D4A7..1D4A8; UNASSIGNED# ..
1D4A9..1D4AC; NEVER # MATHEMATICAL SCRIPT CAPITAL N..MATHEMATICAL SC
1D4AD ; UNASSIGNED#
1D4AE..1D4B9; NEVER # MATHEMATICAL SCRIPT CAPITAL S..MATHEMATICAL SC
1D4BA ; UNASSIGNED#
1D4BB ; NEVER # MATHEMATICAL SCRIPT SMALL F
1D4BC ; UNASSIGNED#
1D4BD..1D4C3; NEVER # MATHEMATICAL SCRIPT SMALL H..MATHEMATICAL SCRI
1D4C4 ; UNASSIGNED#
1D4C5..1D505; NEVER # MATHEMATICAL SCRIPT SMALL P..MATHEMATICAL FRAK
1D506 ; UNASSIGNED#
1D507..1D50A; NEVER # MATHEMATICAL FRAKTUR CAPITAL D..MATHEMATICAL F
1D50B..1D50C; UNASSIGNED# ..
1D50D..1D514; NEVER # MATHEMATICAL FRAKTUR CAPITAL J..MATHEMATICAL F
1D515 ; UNASSIGNED#
1D516..1D51C; NEVER # MATHEMATICAL FRAKTUR CAPITAL S..MATHEMATICAL F
1D51D ; UNASSIGNED#
1D51E..1D539; NEVER # MATHEMATICAL FRAKTUR SMALL A..MATHEMATICAL DOU
1D53A ; UNASSIGNED#
1D53B..1D53E; NEVER # MATHEMATICAL DOUBLE-STRUCK CAPITAL D..MATHEMAT
1D53F ; UNASSIGNED#
1D540..1D544; NEVER # MATHEMATICAL DOUBLE-STRUCK CAPITAL I..MATHEMAT
1D545 ; UNASSIGNED#
1D546 ; NEVER # MATHEMATICAL DOUBLE-STRUCK CAPITAL O
1D547..1D549; UNASSIGNED# ..
1D54A..1D550; NEVER # MATHEMATICAL DOUBLE-STRUCK CAPITAL S..MATHEMAT
1D551 ; UNASSIGNED#
1D552..1D6A5; NEVER # MATHEMATICAL DOUBLE-STRUCK SMALL A..MATHEMATIC
1D6A6..1D6A7; UNASSIGNED# ..
1D6A8..1D7CB; NEVER # MATHEMATICAL BOLD CAPITAL ALPHA..MATHEMATICAL
1D7CC..1D7CD; UNASSIGNED# ..
1D7CE..1D7FF; NEVER # MATHEMATICAL BOLD DIGIT ZERO..MATHEMATICAL MON
1D800..1FFFF; UNASSIGNED# ..
20000..2A6D6; MAYBE YES # ....
2F800..2FA1D; NEVER # CJK COMPATIBILITY IDEOGRAPH-2F800..CJK COMPATI
2FA1E..E0000; UNASSIGNED# ..
E0001 ; CONTEXT # LANGUAGE TAG
E0002..E001F; UNASSIGNED# ..
E0020..E007F; CONTEXT # TAG SPACE..CANCEL TAG
E0080..E00FF; UNASSIGNED# ..
E0100..E01EF; MAYBE YES # VARIATION SELECTOR-17..VARIATION SELECTOR-256
E01F0..EFFFF; UNASSIGNED# ..
F0000..FFFFD; NEVER # ....
100000..10FFFC; NEVER # ..
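
The following is a non-normative sketch, not part of the specification, of how an implementation might load entries of the form used in the table above and look up the derived property value of a single codepoint. It assumes one entry per line in the form "RANGE ; VALUE # comment", tolerates a missing space before ";" or "#", and treats a codepoint not covered by any loaded entry as unknown. The names SAMPLE, parse_table, and lookup are hypothetical and chosen for illustration only; the sample rows are copied from the table.

```python
import bisect

# A few rows copied from the table above; a real loader would read them all.
SAMPLE = """\
2C60 ; NEVER # LATIN CAPITAL LETTER L WITH DOUBLE BAR
2C61 ; ALWAYS # LATIN SMALL LETTER L WITH DOUBLE BAR
2C6D..2C73 ; UNASSIGNED# ..
2D30..2D65 ; MAYBE YES # TIFINAGH LETTER YA..TIFINAGH LETTER YAZZ
FEFF ; CONTEXT # ZERO WIDTH NO-BREAK SPACE
1000D..10026; NEVER # LINEAR B SYLLABLE B036 JO..LINEAR B SYLLABLE B
"""


def parse_table(text):
    """Return a sorted list of (first, last, value) tuples."""
    rows = []
    for line in text.splitlines():
        # Drop the trailing comment, then split range from value.
        body = line.split("#", 1)[0].strip()
        if not body or ";" not in body:
            continue
        span, value = (part.strip() for part in body.split(";", 1))
        first, _, last = span.partition("..")
        lo = int(first, 16)
        hi = int(last, 16) if last else lo  # single codepoint entry
        rows.append((lo, hi, value))
    rows.sort()
    return rows


def lookup(rows, codepoint):
    """Return the value for codepoint, or None if no loaded entry covers it."""
    starts = [lo for lo, _, _ in rows]
    # Find the last entry whose first codepoint is <= the one asked for.
    i = bisect.bisect_right(starts, codepoint) - 1
    if i >= 0 and rows[i][0] <= codepoint <= rows[i][1]:
        return rows[i][2]
    return None


rows = parse_table(SAMPLE)
print(lookup(rows, 0x2C61))
print(lookup(rows, 0x2D40))
```

Because the table's ranges are non-overlapping and listed in ascending order, a binary search over the range starts is sufficient; no interval tree is needed.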