PRECIS                                                    P. Saint-Andre
Internet-Draft                                       Cisco Systems, Inc.
Obsoletes: 3454 (if approved)                                M. Blanchet
Intended status: Standards Track                                Viagenie
Expires: January 11, 2014                                  July 10, 2013


   PRECIS Framework: Preparation and Comparison of Internationalized
                   Strings in Application Protocols
                     draft-ietf-precis-framework-09

Abstract

   Application protocols using Unicode code points in protocol strings
   need to properly prepare such strings in order to perform valid
   comparison operations (e.g., for purposes of authentication or
   authorization). This document defines a framework enabling
   application protocols to perform the preparation and comparison of
   internationalized strings (a.k.a. "PRECIS") in a way that depends on
   the properties of Unicode code points and thus is agile with respect
   to versions of Unicode.
   As a result, this framework provides a more
   sustainable approach to the handling of internationalized strings
   than the previous framework, known as Stringprep (RFC 3454). A
   specification that reuses this framework can either directly use the
   PRECIS string classes or subclass the PRECIS string classes as
   needed. This framework takes an approach similar to the revised
   internationalized domain names (IDNs) in applications (IDNA)
   technology (RFC 5890, RFC 5891, RFC 5892, RFC 5893, RFC 5894) and
   thus adheres to the high-level design goals described in the IAB's
   recommendations regarding IDNs (RFC 4690), albeit for application
   technologies other than the Domain Name System (DNS). This document
   obsoletes RFC 3454.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on January 11, 2014.

Copyright Notice

   Copyright (c) 2013 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.
   Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1. Introduction
   2. Terminology
   3. String Classes
     3.1. Overview
     3.2. Order of Operations
     3.3. IdentifierClass
     3.4. FreeformClass
   4. Use of PRECIS String Classes
     4.1. Principles
     4.2. Subclassing
     4.3. Building Application-Layer Constructs
     4.4. A Note about Spaces
   5. Code Point Properties
   6. Category Definitions Used to Calculate Derived Property Value
     6.1. LetterDigits (A)
     6.2. Unstable (B)
     6.3. IgnorableProperties (C)
     6.4. IgnorableBlocks (D)
     6.5. LDH (E)
     6.6. Exceptions (F)
     6.7. BackwardCompatible (G)
     6.8. JoinControl (H)
     6.9. OldHangulJamo (I)
     6.10. Unassigned (J)
     6.11. ASCII7 (K)
     6.12. Controls (L)
     6.13. PrecisIgnorableProperties (M)
     6.14. Spaces (N)
     6.15. Symbols (O)
     6.16. Punctuation (P)
     6.17. HasCompat (Q)
     6.18. OtherLetterDigits (R)
   7. Calculation of the Derived Property
   8. Code Points
   9. Security Considerations
     9.1. General Issues
     9.2. Use of the IdentifierClass
     9.3. Use of the FreeformClass
     9.4. Local Character Set Issues
     9.5. Visually Similar Characters
     9.6. Security of Passwords
   10. IANA Considerations
     10.1. PRECIS Derived Property Value Registry
     10.2. PRECIS Base Classes Registry
     10.3. PRECIS Subclasses Registry
     10.4. PRECIS Usage Registry
   11. Interoperability Considerations
   12. References
     12.1. Normative References
     12.2. Informative References
   Appendix A. Codepoint Table
   Appendix B. Acknowledgements
   Authors' Addresses

1. Introduction

   As described in the problem statement for the preparation and
   comparison of internationalized strings ("PRECIS") [RFC6885], many
   IETF protocols have used the Stringprep framework [RFC3454] as the
   basis for preparing and comparing protocol strings that contain
   Unicode code points [UNICODE] outside the ASCII range [RFC20]. The
   Stringprep framework was developed during work on the original
   technology for internationalized domain names (IDNs), here called
   "IDNA2003" [RFC3490], and Nameprep [RFC3491] was the Stringprep
   profile for IDNs. At the time, Stringprep was designed as a general
   framework so that other application protocols could define their own
   Stringprep profiles for the preparation and comparison of strings and
   identifiers. Indeed, a number of application protocols defined such
   profiles.

   After the publication of [RFC3454] in 2002, several significant
   issues arose with the use of Stringprep in the IDN case, as
   documented in the IAB's recommendations regarding IDNs [RFC4690]
   (most significantly, Stringprep was tied to Unicode version 3.2).
   Therefore, the newer IDNA specifications, here called "IDNA2008"
   ([RFC5890], [RFC5891], [RFC5892], [RFC5893], [RFC5894]), no longer
   use Stringprep and Nameprep. This migration away from Stringprep for
   IDNs has prompted other "customers" of Stringprep to consider new
   approaches to the preparation and comparison of internationalized
   strings (a.k.a. "PRECIS"), as described in [RFC6885].

   This document defines a framework for a post-Stringprep approach to
   the preparation and comparison of internationalized strings in
   application protocols, based on several principles:

   1. Define a small set of string classes appropriate for common
      application protocol constructs such as usernames and free-form
      strings.
   2. Define each PRECIS string class in terms of Unicode code points
      and their properties so that an algorithm can be used to
      determine whether each code point or character category is valid,
      disallowed, or unassigned.
   3. Define string classes in terms of allowable code points, so that
      any code point not explicitly allowed is forbidden.
   4. Enable application protocols to subclass the PRECIS string
      classes if needed, mainly to disallow particular code points that
      are currently disallowed in the relevant application protocol
      (e.g., characters with special or reserved meaning, such as "@"
      and "/" when used as separators within identifiers).
   5. Leave various mapping operations (e.g., case preservation or
      lowercasing, Unicode normalization, mapping of certain characters
      to other characters or to nothing, handling of full-width and
      half-width characters, handling of right-to-left characters) as
      the responsibility of application protocols, as was done for
      IDNA2008 through an IDNA-specific mapping document [RFC5895].

   It is expected that this framework will yield the following benefits:

   o Application protocols will be more version-agile with regard to
     the Unicode database.
   o Implementers will be able to share code point tables and software
     code across application protocols, most likely by means of
     software libraries.
   o End users will be able to acquire more accurate expectations about
     the code points that are acceptable in various contexts. Given
     this more uniform set of string classes, it is also expected that
     copy/paste operations between software implementing different
     application protocols will be more predictable and coherent.

   Although this framework is similar to IDNA2008 and borrows some of
   the character categories defined in [RFC5892], it defines additional
   string classes and character categories to meet the needs of common
   application protocols.
2. Terminology

   Many important terms used in this document are defined in [RFC5890],
   [RFC6365], [RFC6885], and [UNICODE].

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in
   [RFC2119].

3. String Classes

3.1. Overview

   IDNA2008 essentially defines a string class of internationalized
   domain name (IDN), although it does not use the term "string class".
   (This document does not define a string class for domain names, and
   application protocols are strongly encouraged to use IDNA2008 as the
   appropriate method to prepare domain names and hostnames.) Because
   the IDN string class is designed to meet the particular requirements
   of the Domain Name System (DNS), additional string classes are needed
   for non-DNS applications.

   Starting in 2010, various "customers" of Stringprep began to discuss
   the need to define a post-Stringprep approach to the preparation and
   comparison of internationalized strings. As a result of analyzing
   existing Stringprep profiles, this community concluded that most
   existing uses could be addressed by two string classes:

   IdentifierClass: a sequence of letters, numbers, and symbols that is
      used to identify or address a network entity such as a user
      account, a venue (e.g., a chatroom), an information source (e.g.,
      a data feed), or a collection of data (e.g., a file); the intent
      is that this class will be very safe for use in a wide variety of
      application protocols, with the result that safety has been
      prioritized over inclusiveness for this class.
   FreeformClass: a sequence of letters, numbers, symbols, spaces, and
      other code points that is used for free-form strings, including
      passwords as well as display elements such as human-friendly
      nicknames in chatrooms; the intent is that this class will allow
      nearly any Unicode character, with the result that inclusiveness
      has been prioritized over safety for this class (e.g., protocol
      designers, application developers, service providers, and end
      users might not understand or be able to enter all of the
      characters that can be included in the FreeformClass).

   Although members of the community discussed the possibility of
   defining other PRECIS string classes (e.g., a class falling somewhere
   between the IdentifierClass and the FreeformClass), they concluded
   that the IdentifierClass would be a safe choice meeting the needs of
   many or even most application protocols, and that protocols needing a
   wider range of Unicode characters could use the FreeformClass
   directly or subclass it if needed.

   The following subsections discuss the IdentifierClass and
   FreeformClass in more detail, with reference to the dimensions
   described in Section 3 of [RFC6885]. (Naturally, future documents
   can define PRECIS string classes beyond the IdentifierClass and
   FreeformClass; see Section 10.2.) Each string class (or a particular
   usage thereof) is defined by the following behavioral rules:

   Valid: defines which code points and character categories are
      treated as valid input to the string.
   Disallowed: defines which code points and character categories are
      treated as disallowed for the string.
   Unassigned: defines application behavior in the presence of code
      points that are unassigned, i.e. unknown for the version of
      Unicode the application is built upon.
   Width Mapping: specifies if width mapping is performed on fullwidth
      and halfwidth characters, and how the mapping is done (e.g.,
      mapping fullwidth and halfwidth characters to their decomposition
      equivalents).
   Additional Mappings: specifies whether additional mappings are to be
      applied, such as mapping of delimiter characters, mapping of
      special characters (e.g., non-ASCII space characters to ASCII
      space or certain characters to nothing), and case mapping based on
      language and local context (see [I-D.ietf-precis-mappings]).
   Case Mapping: specifies if case mapping is performed (instead of
      case preservation) on uppercase and titlecase characters, and how
      the mapping is done (e.g., mapping uppercase and titlecase
      characters to their lowercase equivalents).
   Normalization: defines which Unicode normalization form (D, KD, C,
      or KC) is to be applied (see [UAX15]).
   Directionality: defines application behavior in the presence of code
      points that have directionality, in particular right-to-left code
      points as defined in the Unicode database (see [UAX9]).

   This document defines the valid, disallowed, and unassigned rules for
   the IdentifierClass and FreeformClass. Application protocols that
   use these string classes are responsible for defining the
   normalization, case mapping, width mapping, and directionality rules,
   as well as any additional mappings to be applied.

3.2. Order of Operations

   To ensure proper comparison, the following order of operations is
   REQUIRED:

   1. Width mapping
   2. Additional mappings as specified in [I-D.ietf-precis-mappings]:
      1. Delimiter mapping
      2. Special mapping
      3. Local case mapping
   3. Non-local case mapping
   4. Normalization
   5. PRECIS protocol

3.3. IdentifierClass

   Most application technologies need strings that can be used to refer
   to, include, or communicate protocol strings like usernames, file
   names, data feed identifiers, and chatroom names. We group such
   strings into a class called "IdentifierClass" having the following
   features.

3.3.1. Valid

   o Code points traditionally used as letters and numbers in writing
     systems, i.e., the LetterDigits ("A") category first defined in
     [RFC5892] and listed here under Section 6.1.
   o Code points in the range U+0021 through U+007E, i.e., the
     (printable) ASCII7 ("K") rule defined under Section 6.11. These
     code points are "grandfathered" into PRECIS and thus are valid
     even if they would otherwise be disallowed according to the
     property-based rules specified in the next section.

   Although the PRECIS IdentifierClass re-uses the LetterDigits category
   from IDNA2008, the range of characters allowed in the IdentifierClass
   is wider than the range of characters allowed in IDNA2008. The main
   reason is that IDNA2008 applies the Unstable category before the
   LetterDigits category, thus disallowing uppercase characters, whereas
   the IdentifierClass does not apply the Unstable category.

3.3.2. Disallowed

   o Control characters, i.e., the Controls ("L") category defined
     under Section 6.12.
   o Ignorable characters, i.e., the PrecisIgnorableProperties ("M")
     category defined under Section 6.13.
   o Space characters, i.e., the Spaces ("N") category defined under
     Section 6.14.
   o Symbol characters, i.e., the Symbols ("O") category defined under
     Section 6.15.
   o Punctuation characters, i.e., the Punctuation ("P") category
     defined under Section 6.16.
   o Any character that has a compatibility equivalent, i.e., the
     HasCompat ("Q") category defined under Section 6.17.
     These code points are disallowed even if they would otherwise be
     valid according to the property-based rules specified in the
     previous section.
   o Letters and digits other than the "traditional" letters and digits
     allowed in IDNs, i.e., the OtherLetterDigits ("R") category
     defined under Section 6.18.

3.3.3. Unassigned

   Any code points that are not yet assigned in the Unicode character
   set SHALL be considered Unassigned for purposes of the
   IdentifierClass.

3.3.4. Width Mapping

   The width mapping rule MUST be specified by each application protocol
   that uses or subclasses the IdentifierClass.

3.3.5. Additional Mappings

   Additional mapping rules (if any) MUST be specified by each
   application protocol that uses or subclasses the IdentifierClass (see
   [I-D.ietf-precis-mappings]).

3.3.6. Case Mapping

   The case mapping rule MUST be specified by each application protocol
   that uses or subclasses the IdentifierClass.

3.3.7. Normalization

   The Unicode normalization form MUST be specified by each application
   protocol that uses or subclasses the IdentifierClass.

   However, in accordance with [RFC5198], normalization form C (NFC) is
   RECOMMENDED.

3.3.8. Directionality

   The directionality rule MUST be specified by each application
   protocol that uses or subclasses the IdentifierClass.

3.4. FreeformClass

   Some application technologies need strings that can be used in a
   free-form way, e.g., as a password in an authentication exchange (see
   [I-D.ietf-precis-saslprepbis]) or a nickname in a chatroom (see
   [I-D.ietf-precis-nickname]). We group such things into a class
   called "FreeformClass" having the following features.

   Note: Consult Section 9.6 for relevant security considerations when
   strings conforming to the FreeformClass, or a subclass thereof, are
   used as passwords.

3.4.1. Valid

   o Traditional letters and numbers, i.e., the LetterDigits ("A")
     category first defined in [RFC5892] and listed here under
     Section 6.1.
   o Letters and digits other than the "traditional" letters and digits
     allowed in IDNs, i.e., the OtherLetterDigits ("R") category
     defined under Section 6.18.
   o Code points in the range U+0021 through U+007E, i.e., the
     (printable) ASCII7 ("K") rule defined under Section 6.11.
   o Any character that has a compatibility equivalent, i.e., the
     HasCompat ("Q") category defined under Section 6.17.
   o Space characters, i.e., the Spaces ("N") category defined under
     Section 6.14.
   o Symbol characters, i.e., the Symbols ("O") category defined under
     Section 6.15.
   o Punctuation characters, i.e., the Punctuation ("P") category
     defined under Section 6.16.

3.4.2. Disallowed

   o Control characters, i.e., the Controls ("L") category defined
     under Section 6.12.
   o Ignorable characters, i.e., the PrecisIgnorableProperties ("M")
     category defined under Section 6.13.

3.4.3. Unassigned

   Any code points that are not yet assigned in the Unicode character
   set SHALL be considered Unassigned for purposes of the FreeformClass.

3.4.4. Width Mapping

   The width mapping rule MUST be specified by each application protocol
   that uses or subclasses the FreeformClass.

   Because one aspect of Unicode normalization form KC is width mapping,
   a PRECIS usage or subclass that uses NFKC does not need to specify
   width mapping. However, if NFC is used then the usage or subclass
   needs to specify whether to apply width mapping; in this case, width
   mapping is in general RECOMMENDED because allowing fullwidth and
   halfwidth characters to remain unmapped to their decomposition
   equivalents would violate the principle of least user surprise. For
   more information about the concept of width in East Asian scripts
   within Unicode, see for instance [UAX11].
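   As a non-normative illustration of the difference between NFC and
   NFKC with respect to width, the following Python sketch contrasts
   the two normalization forms with a standalone width-mapping step.
   The helper name width_map is hypothetical and is not defined by this
   document; the sketch assumes that "decomposition equivalents" can be
   read from the <wide> and <narrow> decomposition tags exposed by
   Python's unicodedata module, and its results depend on the Unicode
   version shipped with the interpreter.

```python
import unicodedata


def width_map(s: str) -> str:
    """Hypothetical helper: map fullwidth and halfwidth code points to
    their decomposition equivalents and leave everything else alone."""
    out = []
    for ch in s:
        # For U+FF21 the decomposition field is "<wide> 0041".
        fields = unicodedata.decomposition(ch).split()
        if fields and fields[0] in ("<wide>", "<narrow>"):
            out.append("".join(chr(int(cp, 16)) for cp in fields[1:]))
        else:
            out.append(ch)
    return "".join(out)


fullwidth_abc = "\uff21\uff22\uff23"  # FULLWIDTH LATIN CAPITAL LETTERS A, B, C

# NFC alone leaves fullwidth characters unmapped.
nfc_result = unicodedata.normalize("NFC", fullwidth_abc)
# NFKC folds them, along with every other compatibility form.
nfkc_result = unicodedata.normalize("NFKC", fullwidth_abc)
# Explicit width mapping followed by NFC touches only wide/narrow forms.
mapped_result = unicodedata.normalize("NFC", width_map(fullwidth_abc))
```

   Under these assumptions, a compatibility character that is not a
   width variant (such as the ligature U+FB01) is left untouched by
   width_map but is folded by NFKC, so the two approaches are not
   interchangeable for such characters.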
3.4.5. Additional Mappings

   Additional mapping rules (if any) MUST be specified by each
   application protocol that uses or subclasses the FreeformClass (see
   [I-D.ietf-precis-mappings]).

3.4.6. Case Mapping

   The case mapping rule MUST be specified by each application protocol
   that uses or subclasses the FreeformClass.

   In general, the combination of case preservation and case-insensitive
   comparison of internationalized strings is NOT RECOMMENDED; instead,
   application protocols SHOULD either (a) not preserve case but perform
   case-insensitive comparison or (b) preserve case but perform case-
   sensitive comparison.

   In order to maximize entropy and minimize the potential for false
   positives, it is NOT RECOMMENDED for application protocols to map
   uppercase and titlecase code points to their lowercase equivalents
   when strings conforming to the FreeformClass, or a subclass thereof,
   are used in passwords; instead, it is RECOMMENDED to preserve the
   case of all code points contained in such strings and then perform
   case-sensitive comparison. See also the related discussion in
   [I-D.ietf-precis-saslprepbis].

3.4.7. Normalization

   The Unicode normalization form MUST be specified by each application
   protocol that uses or subclasses the FreeformClass.

   However, in accordance with [RFC5198], normalization form C (NFC) is
   RECOMMENDED.

3.4.8. Directionality

   The directionality rule MUST be specified by each application
   protocol that uses or subclasses the FreeformClass.

4. Use of PRECIS String Classes

4.1. Principles

   This document defines the valid, disallowed, and unassigned rules.
   Application protocols that use the PRECIS string classes MUST define
   the width mapping, additional mapping (if any), case mapping,
   normalization, and directionality rules.
   That is, such definitions MUST at a minimum specify the following:

   Width Mapping: Whether fullwidth and halfwidth code points are to be
      mapped to their decomposition equivalents.
   Additional Mappings: Whether additional mappings are to be applied,
      such as mapping of delimiter characters, mapping of special
      characters (e.g., non-ASCII space characters to ASCII space or
      certain characters to nothing), and case mapping based on language
      and local context (see [I-D.ietf-precis-mappings]).
   Case Mapping: Whether uppercase and titlecase code points are to be
      (a) preserved or (b) mapped to lowercase.
   Normalization: Which Unicode normalization form (D, KD, C, or KC) is
      to be applied (see [UAX15] for background information); in
      accordance with [RFC5198], NFC is RECOMMENDED.
   Directionality: Whether any instance of the class that contains a
      right-to-left code point is to be considered a right-to-left
      string, or whether some other rule is to be applied (e.g., the
      "Bidi Rule" from [RFC5893]).

4.2. Subclassing

   Application protocols are allowed to subclass the PRECIS string
   classes specified in this document. As the word "subclass" implies,
   a subclass MUST NOT add as valid any code points or character
   categories that are disallowed by the relevant PRECIS string class.
   However, a subclass MAY do either of the following:

   1. Exclude specific code points that are included in the relevant
      PRECIS string class.
   2. Exclude characters matching certain Unicode properties (e.g.,
      math symbols) that are included in the relevant PRECIS string
      class.

   As a result, code points that are defined as valid for the PRECIS
   string class being subclassed will be defined as disallowed for the
   subclass.

   Application protocols that subclass the PRECIS string classes MUST
   register with the IANA as described under Section 10.3.
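   The rule that a subclass can only narrow its base class can be
   sketched as follows. This is a non-normative Python illustration:
   the names make_subclass, toy_base_is_valid, and localpart_is_valid
   are hypothetical, and the toy base class (printable ASCII only) is
   merely a stand-in for a real PRECIS string class, whose membership
   is determined by the derived property calculation.

```python
from typing import Callable, Iterable

CodePointTest = Callable[[str], bool]


def make_subclass(base_is_valid: CodePointTest,
                  excluded: Iterable[str]) -> CodePointTest:
    """Derive a subclass test that can only narrow the base class."""
    excluded_set = frozenset(excluded)

    def is_valid(ch: str) -> bool:
        # A code point is valid only if the base class allows it AND the
        # subclass has not excluded it; nothing can be added back.
        return base_is_valid(ch) and ch not in excluded_set

    return is_valid


def toy_base_is_valid(ch: str) -> bool:
    """Stand-in for a real PRECIS class: printable ASCII only."""
    return 0x21 <= ord(ch) <= 0x7E


# Hypothetical subclass that excludes two separator characters.
localpart_is_valid = make_subclass(toy_base_is_valid, "@/")
```

   Because the subclass test is the conjunction of the base test and an
   exclusion check, a code point that the base class disallows (here,
   for example, U+0020) can never become valid in the subclass.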
   It is RECOMMENDED for subclass names to be of the form
   "SubclassBaseClass", where the "Subclass" string is a differentiator
   and "BaseClass" is the name of the PRECIS string class being
   subclassed; for example, the subclass of the IdentifierClass used for
   localparts in the Extensible Messaging and Presence Protocol (XMPP)
   is named "LocalpartIdentifierClass" [I-D.ietf-xmpp-6122bis].

4.3. Building Application-Layer Constructs

   Sometimes, an application-layer construct does not map directly to
   one of the PRECIS string classes. Consider, for example, the "simple
   user name" construct in the Simple Authentication and Security Layer
   (SASL) [RFC4422]. Depending on the deployment, a simple user name
   might take the form of a user's full name (e.g., the user's personal
   name followed by a space and then the user's family name). Such a
   simple user name cannot be defined as an instance of the
   IdentifierClass, since space characters are not allowed in the
   IdentifierClass; however, it could be defined using a space-separated
   sequence of IdentifierClass instances, as in the following pseudo-
   ABNF [RFC5234]:

      fullname = namepart [1*(1*SP namepart)]
      namepart = 1*(idpoint)
      ;
      ; an "idpoint" is a UTF-8 encoded Unicode code point
      ; that conforms to the PRECIS IdentifierClass

   Similar techniques could be used to define many application-layer
   constructs, say of the form "user@domain" or "/path/to/file".

4.4. A Note about Spaces

   With regard to the IdentifierClass, the consensus of the PRECIS
   Working Group was that spaces are problematic for many reasons,
   including:

   o Many Unicode characters are confusable with ASCII space.
   o Even if non-ASCII space characters are mapped to ASCII space
     (U+0020), space characters are often not rendered in user
     interfaces, leading to the possibility that a human user might
     consider a string containing spaces to be equivalent to the same
     string without spaces.
   o In some locales, some devices are known to generate a character
     other than ASCII space (such as ZERO WIDTH JOINER, U+200D) when a
     user performs an action such as hitting the space bar on a
     keyboard.

   One consequence of disallowing space characters in the
   IdentifierClass might be to effectively discourage the use of ASCII
   space (or, even more problematically, non-ASCII space characters)
   within identifiers created in newer application protocols; given the
   challenges involved in properly handling space characters in
   identifiers and other protocol strings, the Working Group considered
   this to be a feature, not a bug.

   However, the FreeformClass does allow spaces, which enables
   application protocols to define subclasses of the FreeformClass that
   are more flexible than any profiles of the IdentifierClass.

5. Code Point Properties

   In order to implement the string classes described above, this
   document does the following:

   1. Reviews and classifies the collections of code points in the
      Unicode character set by examining various code point properties.
   2. Defines an algorithm for determining a derived property value,
      which can vary depending on the string class being used by the
      relevant application protocol.

   This document is not intended to specify precisely how derived
   property values are to be applied in protocol strings. That
   information is the responsibility of the protocol specification that
   uses or subclasses a PRECIS string class from this document.

   The value of the property is to be interpreted as follows.
   PROTOCOL VALID  Those code points that are allowed to be used in any
      PRECIS string class (IdentifierClass and FreeformClass). Code
      points with this property value are permitted for general use in
      any string class. The abbreviated term PVALID is used to refer to
      this value in the remainder of this document.
   SPECIFIC CLASS PROTOCOL VALID  Those code points that are allowed to
      be used in specific string classes. Code points with this
      property value are permitted for use in specific string classes.
      In the remainder of this document, the abbreviated term *_PVAL is
      used, where * = (ID | FREE), i.e., either FREE_PVAL or ID_PVAL.
   CONTEXTUAL RULE REQUIRED  Some characteristics of the character, such
      as its being invisible in certain contexts or problematic in
      others, require that it not be used in labels unless specific
      other characters or properties are present. The abbreviated term
      CONTEXT is used to refer to this value in the remainder of this
      document. As in IDNA2008, there are two subdivisions of
      CONTEXTUAL RULE REQUIRED, the first for Join_controls (called
      CONTEXTJ) and the second for other characters (called CONTEXTO).
   DISALLOWED  Those code points that must not be permitted in any
      PRECIS string class.
   SPECIFIC CLASS DISALLOWED  Those code points that are not to be
      included in a specific string class. Code points with this
      property value are not permitted in one of the string classes but
      might be permitted in others. In the remainder of this document,
      the abbreviated term *_DIS is used, where * = (ID | FREE), i.e.,
      either FREE_DIS or ID_DIS.
   UNASSIGNED  Those code points that are not designated (i.e. are
      unassigned) in the Unicode Standard.
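   The property values listed above can be modeled as a simple
   enumeration. The following Python sketch is illustrative only: the
   type name DerivedProperty, the function permitted_by_value_alone,
   and the string class labels "ID" and "FREE" are hypothetical
   choices, and treating UNASSIGNED as "not permitted" here is an
   assumption of the sketch rather than a rule stated in this section.

```python
from enum import Enum


class DerivedProperty(Enum):
    PVALID = "PROTOCOL VALID"
    ID_PVAL = "SPECIFIC CLASS PROTOCOL VALID (IdentifierClass)"
    FREE_PVAL = "SPECIFIC CLASS PROTOCOL VALID (FreeformClass)"
    CONTEXTJ = "CONTEXTUAL RULE REQUIRED (Join_controls)"
    CONTEXTO = "CONTEXTUAL RULE REQUIRED (other characters)"
    DISALLOWED = "DISALLOWED"
    ID_DIS = "SPECIFIC CLASS DISALLOWED (IdentifierClass)"
    FREE_DIS = "SPECIFIC CLASS DISALLOWED (FreeformClass)"
    UNASSIGNED = "UNASSIGNED"


def permitted_by_value_alone(value: DerivedProperty,
                             string_class: str) -> bool:
    """Return True only if the value by itself permits use of the code
    point in the given string class ("ID" or "FREE")."""
    if value is DerivedProperty.PVALID:
        return True
    if value is DerivedProperty.ID_PVAL:
        return string_class == "ID"
    if value is DerivedProperty.FREE_PVAL:
        return string_class == "FREE"
    # Every other value either forbids the code point or needs more
    # information: CONTEXTJ/CONTEXTO require a satisfied contextual
    # rule, and UNASSIGNED is treated as not permitted (an assumption).
    return False
```

   A real implementation would additionally evaluate the contextual
   rules for CONTEXTJ and CONTEXTO code points before rejecting them.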
634 The mechanisms described here allow determination of the value of the 635 property for future versions of Unicode (including characters added 636 after Unicode 5.2 or 6.1 depending on the category, since some 637 categories in this document are reused from IDNA2008 and therefore 638 were defined at the time of Unicode 5.2). Changes in Unicode 639 properties that do not affect the outcome of this process do not 640 affect this framework. For example, a character can have its Unicode 641 General_Category value [UNICODE] change from So to Sm, or from Lo to 642 Ll, without affecting the algorithm results. Moreover, even if such 643 changes were to result, the BackwardCompatible list (Section 6.7) can 644 be adjusted to ensure the stability of the results. 646 Some code points need to be allowed in exceptional circumstances, but 647 ought to be excluded in all other cases; these rules are also 648 described in other documents. The most notable of these are the Join 649 Control characters, U+200D ZERO WIDTH JOINER and U+200C ZERO WIDTH 650 NON-JOINER. Both of them have the derived property value CONTEXTJ. 651 A character with the derived property value CONTEXTJ or CONTEXTO 652 (CONTEXTUAL RULE REQUIRED) is not to be used unless an appropriate 653 rule has been established and the context of the character is 654 consistent with that rule. It is invalid to generate a string 655 containing these characters unless such a contextual rule is found 656 and satisfied. PRECIS does not define its own contextual rules, but 657 instead re-uses the contextual rules defined for IDNA2008; please see 658 Appendix A of [RFC5892] for more information. 660 6. Category Definitions Used to Calculate Derived Property Value 662 The derived property obtains its value based on a two-step procedure: 664 1. 
Characters are placed in one or more character categories either 665 (1) based on core properties defined by the Unicode Standard or 666 (2) by treating the code point as an exception and addressing the 667 code point based on its code point value. These categories are 668 not mutually exclusive. 669 2. Set operations are used with these categories to determine the 670 values for a property that is specific to a given string class. 671 These operations are specified under Section 7. 673 (Note: Unicode property names and property value names might have 674 short abbreviations, such as "gc" for the General_Category property 675 and "Ll" for the Lowercase_Letter property value of the gc property.) 677 In the following specification of character categories, the operation 678 that returns the value of a particular Unicode character property for 679 a code point is designated by using the formal name of that property 680 (from the Unicode PropertyAliases.txt [1]) followed by '(cp)' for 681 "code point". For example, the value of the General_Category 682 property for a code point is indicated by General_Category(cp). 684 The first ten categories (A-J) shown below were previously defined 685 for IDNA2008 and are copied directly from [RFC5892]. Some of these 686 categories are reused in PRECIS and some of them are not; however, 687 the lettering of categories is retained to prevent overlap and to 688 ease implementation of both IDNA2008 and PRECIS in a single software 689 application. The next eight categories (K-R) are specific to PRECIS. 691 6.1. LetterDigits (A) 693 Note: This category is defined in [RFC5892] and copied here for use 694 in PRECIS. 696 A: General_Category(cp) is in {Ll, Lu, Lm, Lo, Mn, Mc, Nd} 698 These rules identify characters commonly used in mnemonics and often 699 informally described as "language characters". 701 For more information, see Section 4.5 of [UNICODE].
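The General_Category(cp) operation and the LetterDigits (A) rule can be sketched in Python with the standard unicodedata module. This is an illustration only: the function names are hypothetical, and the results reflect whichever Unicode version is bundled with the Python interpreter, not necessarily the version referenced by this document.

```python
import unicodedata

# General_Category values that make up the LetterDigits (A) category.
LETTER_DIGITS = frozenset({"Ll", "Lu", "Lm", "Lo", "Mn", "Mc", "Nd"})


def general_category(cp: int) -> str:
    """Return the two-letter General_Category abbreviation for a code point."""
    return unicodedata.category(chr(cp))


def in_letter_digits(cp: int) -> bool:
    """Category A: General_Category(cp) is in {Ll, Lu, Lm, Lo, Mn, Mc, Nd}."""
    return general_category(cp) in LETTER_DIGITS
```

Under this sketch, in_letter_digits(0x0061) holds for LATIN SMALL LETTER A, whereas in_letter_digits(0x0020) does not, because SPACE has General_Category Zs.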
703 The categories used in this rule are: 704 o Ll - Lowercase_Letter 705 o Lu - Uppercase_Letter 706 o Lm - Modifier_Letter 707 o Lo - Other_Letter 708 o Mn - Nonspacing_Mark 709 o Mc - Spacing_Mark 710 o Nd - Decimal_Number 712 6.2. Unstable (B) 714 Note: This category is defined in [RFC5892] but not used in PRECIS. 716 6.3. IgnorableProperties (C) 718 Note: This category is defined in [RFC5892] but not used in PRECIS. 719 See the "PrecisIgnorableProperties (M)" category below for a more 720 inclusive category used in PRECIS identifiers. 722 6.4. IgnorableBlocks (D) 724 Note: This category is defined in [RFC5892] but not used in PRECIS. 726 6.5. LDH (E) 728 Note: This category is defined in [RFC5892] but not used in PRECIS. 729 See the "ASCII7 (K)" category below for a more inclusive category 730 used in PRECIS identifiers. 732 6.6. Exceptions (F) 734 Note: This category is defined in [RFC5892] and used in PRECIS to 735 ensure consistent treatment of the relevant code points. 737 F: cp is in {00B7, 00DF, 0375, 03C2, 05F3, 05F4, 0640, 0660, 738 0661, 0662, 0663, 0664, 0665, 0666, 0667, 0668, 739 0669, 06F0, 06F1, 06F2, 06F3, 06F4, 06F5, 06F6, 740 06F7, 06F8, 06F9, 06FD, 06FE, 07FA, 0F0B, 3007, 741 302E, 302F, 3031, 3032, 3033, 3034, 3035, 303B, 742 30FB} 744 This category explicitly lists code points for which the category 745 cannot be assigned using only the core property values that exist in 746 the Unicode standard. 
The values are according to the table below: 748 PVALID -- Would otherwise have been DISALLOWED 750 00DF; PVALID # LATIN SMALL LETTER SHARP S 751 03C2; PVALID # GREEK SMALL LETTER FINAL SIGMA 752 06FD; PVALID # ARABIC SIGN SINDHI AMPERSAND 753 06FE; PVALID # ARABIC SIGN SINDHI POSTPOSITION MEN 754 0F0B; PVALID # TIBETAN MARK INTERSYLLABIC TSHEG 755 3007; PVALID # IDEOGRAPHIC NUMBER ZERO 757 CONTEXTO -- Would otherwise have been DISALLOWED 759 00B7; CONTEXTO # MIDDLE DOT 760 0375; CONTEXTO # GREEK LOWER NUMERAL SIGN (KERAIA) 761 05F3; CONTEXTO # HEBREW PUNCTUATION GERESH 762 05F4; CONTEXTO # HEBREW PUNCTUATION GERSHAYIM 763 30FB; CONTEXTO # KATAKANA MIDDLE DOT 765 CONTEXTO -- Would otherwise have been PVALID 767 0660; CONTEXTO # ARABIC-INDIC DIGIT ZERO 768 0661; CONTEXTO # ARABIC-INDIC DIGIT ONE 769 0662; CONTEXTO # ARABIC-INDIC DIGIT TWO 770 0663; CONTEXTO # ARABIC-INDIC DIGIT THREE 771 0664; CONTEXTO # ARABIC-INDIC DIGIT FOUR 772 0665; CONTEXTO # ARABIC-INDIC DIGIT FIVE 773 0666; CONTEXTO # ARABIC-INDIC DIGIT SIX 774 0667; CONTEXTO # ARABIC-INDIC DIGIT SEVEN 775 0668; CONTEXTO # ARABIC-INDIC DIGIT EIGHT 776 0669; CONTEXTO # ARABIC-INDIC DIGIT NINE 777 06F0; CONTEXTO # EXTENDED ARABIC-INDIC DIGIT ZERO 778 06F1; CONTEXTO # EXTENDED ARABIC-INDIC DIGIT ONE 779 06F2; CONTEXTO # EXTENDED ARABIC-INDIC DIGIT TWO 780 06F3; CONTEXTO # EXTENDED ARABIC-INDIC DIGIT THREE 781 06F4; CONTEXTO # EXTENDED ARABIC-INDIC DIGIT FOUR 782 06F5; CONTEXTO # EXTENDED ARABIC-INDIC DIGIT FIVE 783 06F6; CONTEXTO # EXTENDED ARABIC-INDIC DIGIT SIX 784 06F7; CONTEXTO # EXTENDED ARABIC-INDIC DIGIT SEVEN 785 06F8; CONTEXTO # EXTENDED ARABIC-INDIC DIGIT EIGHT 786 06F9; CONTEXTO # EXTENDED ARABIC-INDIC DIGIT NINE 788 DISALLOWED -- Would otherwise have been PVALID 790 0640; DISALLOWED # ARABIC TATWEEL 791 07FA; DISALLOWED # NKO LAJANYALAN 792 302E; DISALLOWED # HANGUL SINGLE DOT TONE MARK 793 302F; DISALLOWED # HANGUL DOUBLE DOT TONE MARK 794 3031; DISALLOWED # VERTICAL KANA REPEAT MARK 795 3032; 
DISALLOWED # VERTICAL KANA REPEAT WITH VOICED SOUND MARK 796 3033; DISALLOWED # VERTICAL KANA REPEAT MARK UPPER HALF 797 3034; DISALLOWED # VERTICAL KANA REPEAT WITH VOICED SOUND MARK 798 UPPER HALF 799 3035; DISALLOWED # VERTICAL KANA REPEAT MARK LOWER HALF 800 303B; DISALLOWED # VERTICAL IDEOGRAPHIC ITERATION MARK 802 6.7. BackwardCompatible (G) 804 Note: This category is defined in [RFC5892] and copied here for use 805 in PRECIS. Because of how the PRECIS string classes are defined, 806 only changes that would result in code points being added to or 807 removed from the LetterDigits ("A") category would result in 808 backward-incompatible modifications to code point assignments. 809 Therefore, management of this category is handled via the processes 810 specified in [RFC5892]. 812 G: cp is in {} 814 This category includes the code points for which property values in 815 versions of Unicode after 5.2 have changed in such a way that the 816 derived property value would no longer be PVALID or DISALLOWED. If 817 changes are made to future versions of Unicode so that code points 818 might change property value from PVALID or DISALLOWED, then this 819 table can be updated to keep special exception values so that the 820 property values for code points stay stable. 822 6.8. JoinControl (H) 824 Note: This category is defined in [RFC5892] and copied here for use 825 in PRECIS. 827 H: Join_Control(cp) = True 829 This category consists of Join Control characters (i.e., they are not 830 in LetterDigits (Section 6.1) but are still required in strings under 831 some circumstances). 833 6.9. OldHangulJamo (I) 835 Note: This category is defined in [RFC5892] and copied here for use 836 in PRECIS. 838 I: Hangul_Syllable_Type(cp) is in {L, V, T} 840 This category consists of all conjoining Hangul Jamo (Leading Jamo, 841 Vowel Jamo, and Trailing Jamo).
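Python's unicodedata module does not expose the Join_Control or Hangul_Syllable_Type properties, so a rough sketch of the JoinControl (H) and OldHangulJamo (I) categories has to carry its own data. The code point ranges below are assumptions made for illustration and ought to be checked against PropList.txt and HangulSyllableType.txt for the Unicode version in use; the function names are likewise hypothetical.

```python
# Join_Control: ZERO WIDTH NON-JOINER and ZERO WIDTH JOINER.
JOIN_CONTROL = frozenset({0x200C, 0x200D})

# Assumed ranges (first, last, Hangul_Syllable_Type) for conjoining jamo.
# Verify against HangulSyllableType.txt before relying on them.
JAMO_RANGES = (
    (0x1100, 0x115F, "L"), (0xA960, 0xA97C, "L"),  # leading consonants
    (0x1160, 0x11A7, "V"), (0xD7B0, 0xD7C6, "V"),  # vowels
    (0x11A8, 0x11FF, "T"), (0xD7CB, 0xD7FB, "T"),  # trailing consonants
)


def in_join_control(cp: int) -> bool:
    """Category H: Join_Control(cp) = True."""
    return cp in JOIN_CONTROL


def in_old_hangul_jamo(cp: int) -> bool:
    """Category I: Hangul_Syllable_Type(cp) is in {L, V, T}."""
    return any(first <= cp <= last for first, last, _ in JAMO_RANGES)
```

A precomposed syllable such as U+AC00 falls outside every range above, so in_old_hangul_jamo(0xAC00) is False, while the conjoining jamo U+1100 is matched.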
843 Elimination of conjoining Hangul Jamos from the set of PVALID 844 characters results in restricting the set of Korean PVALID characters 845 just to preformed, modern Hangul syllable characters. Old Hangul 846 syllables, which must be spelled with sequences of conjoining Hangul 847 Jamos, are not PVALID for string classes. 849 6.10. Unassigned (J) 851 Note: This category is defined in [RFC5892] and copied here for use 852 in PRECIS. 854 J: General_Category(cp) is in {Cn} and 855 Noncharacter_Code_Point(cp) = False 857 This category consists of code points in the Unicode character set 858 that are not (yet) assigned. It should be noted that Unicode 859 distinguishes between 'unassigned code points' and 'unassigned 860 characters'. The unassigned code points are all but (Cn - 861 Noncharacters), while the unassigned *characters* are all but (Cn + 862 Cs). 864 6.11. ASCII7 (K) 866 This PRECIS-specific category exempts most characters in the 867 (printable) ASCII-7 range from other rules that might be applied 868 during PRECIS processing, on the assumption that these code points 869 are in such wide use that disallowing them would be counter- 870 productive. 872 K: cp is in {0021..007E} 874 6.12. Controls (L) 876 L: Control(cp) = True 878 6.13. PrecisIgnorableProperties (M) 880 This PRECIS-specific category is used to group code points that are 881 not recommended for use in PRECIS string classes. 883 M: Default_Ignorable_Code_Point(cp) = True or 884 Noncharacter_Code_Point(cp) = True 886 The definition for Default_Ignorable_Code_Point can be found in the 887 DerivedCoreProperties.txt [2] file, and at the time of Unicode 6.1 is 888 as follows: 890 Other_Default_Ignorable_Code_Point 891 + Cf (Format characters) 892 + Variation_Selector 893 - White_Space 894 - FFF9..FFFB (Annotation Characters) 895 - 0600..0604, 06DD, 070F, 110BD (exceptional Cf characters 896 that should be visible) 898 6.14. 
Spaces (N) 900 This PRECIS-specific category is used to group code points that are 901 space characters. 903 N: General_Category(cp) is in {Zs} 905 6.15. Symbols (O) 907 This PRECIS-specific category is used to group code points that are 908 symbols. 910 O: General_Category(cp) is in {Sm, Sc, Sk, So} 912 6.16. Punctuation (P) 914 This PRECIS-specific category is used to group code points that are 915 punctuation characters. 917 P: General_Category(cp) is in {Pc, Pd, Ps, Pe, Pi, Pf, Po} 919 6.17. HasCompat (Q) 921 This PRECIS-specific category is used to group code points that have 922 compatibility equivalents as explained in Chapter 2 and Chapter 3 of 923 [UNICODE]. 925 Q: toNFKC(cp) != cp 927 The toNFKC() operation returns the code point in normalization form 928 KC. For more information, see Section 5 of [UAX15]. 930 6.18. OtherLetterDigits (R) 932 This PRECIS-specific category is used to group code points that are 933 letters and digits other than the "traditional" letters and digits 934 grouped under the LetterDigits (A) class (see Section 6.1). 936 R: General_Category(cp) is in {Lt, Nl, No, Me} 938 7. Calculation of the Derived Property 940 Possible values of the derived property are: 942 o PVALID 943 o ID_PVAL 944 o FREE_PVAL 945 o CONTEXTJ 946 o CONTEXTO 947 o DISALLOWED 948 o ID_DIS 949 o FREE_DIS 950 o UNASSIGNED 952 Note: The value of the derived property calculated can depend on the 953 string class; for example, if an identifier used in an application 954 protocol is defined as using or subclassing the PRECIS 955 IdentifierClass then a space character such as U+0020 would be 956 assigned to ID_DIS, whereas if an identifier is defined as using or 957 subclassing the PRECIS FreeformClass then the character would be 958 assigned to FREE_PVAL. For the sake of brevity, the designation 959 "FREE_PVAL" is used in the code point tables, instead of the longer 960 designation "ID_DIS or FREE_PVAL". 
In practice, the derived 961 properties ID_PVAL and FREE_DIS are not used in this specification, 962 since every ID_PVAL code point is PVALID and every FREE_DIS code 963 point is DISALLOWED. 965 The algorithm to calculate the value of the derived property is as 966 follows. (Note: Use of the name of a rule (such as "Exceptions") 967 implies the set of code points that the rule defines, whereas the 968 same name as a function call (such as "Exceptions(cp)") implies the 969 value that the code point has in the Exceptions table.) 971 If .cp. .in. Exceptions Then Exceptions(cp); 972 Else If .cp. .in. BackwardCompatible Then BackwardCompatible(cp); 973 Else If .cp. .in. Unassigned Then UNASSIGNED; 974 Else If .cp. .in. ASCII7 Then PVALID; 975 Else If .cp. .in. JoinControl Then CONTEXTJ; 976 Else If .cp. .in. OldHangulJamo Then DISALLOWED; 977 Else If .cp. .in. PrecisIgnorableProperties Then DISALLOWED; 978 Else If .cp. .in. Controls Then DISALLOWED; 979 Else If .cp. .in. HasCompat Then ID_DIS or FREE_PVAL; 980 Else If .cp. .in. LetterDigits Then PVALID; 981 Else If .cp. .in. OtherLetterDigits Then ID_DIS or FREE_PVAL; 982 Else If .cp. .in. Spaces Then ID_DIS or FREE_PVAL; 983 Else If .cp. .in. Symbols Then ID_DIS or FREE_PVAL; 984 Else If .cp. .in. Punctuation Then ID_DIS or FREE_PVAL; 985 Else DISALLOWED; 987 8. Code Points 989 The Categories and Rules defined under Section 6 and Section 7 apply 990 to all Unicode code points. The table in Appendix A shows, for 991 illustrative purposes, the consequences of the categories and 992 classification rules, and the resulting property values. 994 The list of code points that can be found in Appendix A is non- 995 normative. Instead, the rules defined by Section 6 and Section 7 are 996 normative, and any tables are derived from the rules. 998 9. Security Considerations 1000 9.1.
General Issues 1002 The security of applications that use this framework can depend in 1003 part on the proper preparation and comparison of internationalized 1004 strings. For example, such strings can be used to make 1005 authentication and authorization decisions, and the security of an 1006 application could be compromised if an entity providing a given 1007 string is connected to the wrong account or online resource based on 1008 different interpretations of the string. 1010 Specifications of application protocols that use this framework are 1011 encouraged to describe how internationalized strings are used in the 1012 protocol, including the security implications of any false positives 1013 and false negatives that might result from various comparison 1014 operations. For some helpful guidelines, refer to [RFC6943], 1015 [RFC5890], [UTR36], and [UTR39]. 1017 9.2. Use of the IdentifierClass 1019 Strings that conform to the IdentifierClass and any subclass thereof 1020 are intended to be relatively safe for use in a broad range of 1021 applications, primarily because they include only letters, digits, 1022 and "grandfathered" non-space characters from the ASCII range; thus 1023 they exclude spaces, characters with compatibility equivalents, and 1024 almost all symbols and punctuation marks. However, because such 1025 strings can still include so-called confusable characters (see 1026 Section 9.5), protocol designers and implementers are encouraged to 1027 pay close attention to the security considerations described 1028 elsewhere in this document. 1030 9.3. Use of the FreeformClass 1032 Strings that conform to the FreeformClass and many subclasses thereof 1033 can include virtually any Unicode character. This makes the 1034 FreeformClass quite expressive, but also problematic from the 1035 perspective of possible user confusion.
Protocol designers are 1036 hereby warned that the FreeformClass contains code points they might 1037 not understand, and are encouraged to use or subclass the 1038 IdentifierClass wherever feasible; however, if an application 1039 protocol requires more code points than are allowed by the 1040 IdentifierClass, protocol designers are encouraged to define a 1041 subclass of the FreeformClass that restricts the allowable code 1042 points as tightly as possible. (The working group considered the 1043 option of allowing superclasses as well as subclasses of PRECIS 1044 string classes, but decided against allowing superclasses to reduce 1045 the likelihood of security and interoperability problems.) 1047 9.4. Local Character Set Issues 1049 When systems use local character sets other than ASCII and Unicode, 1050 these specifications leave the problem of converting between the 1051 local character set and Unicode up to the application or local 1052 system. If different applications (or different versions of one 1053 application) implement different rules for conversions among coded 1054 character sets, they could interpret the same name differently and 1055 contact different application servers or other network entities. 1056 This problem is not solved by security protocols, such as Transport 1057 Layer Security (TLS) [RFC5246] and the Simple Authentication and 1058 Security Layer (SASL) [RFC4422], that do not take local character 1059 sets into account. 1061 9.5. Visually Similar Characters 1063 Some characters are visually similar and thus can cause confusion 1064 among humans. Such characters are often called "confusable 1065 characters" or "confusables". 1067 The problem of confusable characters is not necessarily caused by the 1068 use of Unicode code points outside the ASCII range.
For example, in 1069 some presentations and to some individuals the string "ju1iet" 1070 (spelled with the Arabic numeral one as the third character) might 1071 appear to be the same as "juliet" (spelled with the lowercase version 1072 of the letter "L"), especially on casual visual inspection. This 1073 phenomenon is sometimes called "typejacking". 1075 However, the problem is made more serious by introducing the full 1076 range of Unicode code points into protocol strings. For example, the 1077 characters U+13DA U+13A2 U+13B5 U+13AC U+13A2 U+13AC U+13D2 from the 1078 Cherokee block look similar to the ASCII characters "STPETER" as they 1079 might look when presented using a "creative" font family. 1081 In some examples of confusable characters, it is unlikely that the 1082 average human could tell the difference between the real string and 1083 the fake string. (Indeed, there is no programmatic way to 1084 distinguish with full certainty which is the fake string and which is 1085 the real string; in some contexts, the string formed of Cherokee 1086 characters might be the real string and the string formed of ASCII 1087 characters might be the fake string.) Because PRECIS-compliant 1088 strings can contain almost any properly-encoded Unicode code point, 1089 it can be relatively easy to fake or mimic some strings in systems 1090 that use the PRECIS framework. The fact that some strings are easily 1091 confused introduces security vulnerabilities of the kind that have 1092 also plagued the World Wide Web, specifically the phenomenon known as 1093 phishing. 1095 Despite the fact that some specific suggestions about identification 1096 and handling of confusable characters appear in the Unicode Security 1097 Considerations [UTR36], it is also true (as noted in [RFC5890]) that 1098 "there are no comprehensive technical solutions to the problems of 1099 confusable characters". 
Because it is impossible to map visually 1100 similar characters without a great deal of context (such as knowing 1101 the font families used), the PRECIS framework does nothing to map 1102 similar-looking characters together, nor does it prohibit some 1103 characters because they look like others. 1105 Nevertheless, specifications for application protocols that use this 1106 framework MUST describe how confusable characters can be used to 1107 compromise the security of systems that use the protocol in question, 1108 along with any protocol-specific suggestions for overcoming those 1109 threats. In particular, software implementations and service 1110 deployments that use PRECIS-based technologies are strongly 1111 encouraged to define and implement consistent policies regarding the 1112 registration, storage, and presentation of visually similar 1113 characters. The following recommendations are appropriate: 1115 1. An application service SHOULD define a policy that specifies the 1116 scripts or blocks of characters that the service will allow to be 1117 registered (e.g., in an account name) or stored (e.g., in a file 1118 name). Such a policy SHOULD be informed by the languages and 1119 scripts that are used to write registered account names; in 1120 particular, to reduce confusion, the service SHOULD forbid 1121 registration or storage of strings that contain characters from 1122 more than one script and SHOULD restrict registrations to 1123 characters drawn from a very small number of scripts (e.g., 1124 scripts that are well-understood by the administrators of the 1125 service, to improve manageability). 1126 2. User-oriented application software SHOULD define a policy that 1127 specifies how internationalized strings will be presented to a 1128 human user.
Because every human user of such software has a 1129 preferred language or a small set of preferred languages, the 1130 software SHOULD gather that information either explicitly from 1131 the user or implicitly via the operating system of the user's 1132 device. Furthermore, because most languages are typically 1133 represented by a single script or a small set of scripts, and 1134 because most scripts are typically contained in one or more 1135 blocks of characters, the software SHOULD warn the user when 1136 presenting a string that mixes characters from more than one 1137 script or block, or that uses characters outside the normal range 1138 of the user's preferred language(s). (Such a recommendation is 1139 not intended to discourage communication across different 1140 communities of language users; instead, it recognizes the 1141 existence of such communities and encourages due caution when 1142 presenting unfamiliar scripts or characters to human users.) 1144 9.6. Security of Passwords 1146 Two goals of passwords are to maximize the amount of entropy and to 1147 minimize the potential for false positives. These goals can be 1148 achieved in part by allowing a wide range of code points and by 1149 ensuring that passwords are handled in such a way that code points 1150 are not compared aggressively. Therefore, it is NOT RECOMMENDED for 1151 application protocols to subclass the FreeformClass for use in 1152 passwords in a way that removes entire categories (e.g., by 1153 disallowing symbols or punctuation). Furthermore, it is NOT 1154 RECOMMENDED for application protocols to map uppercase and titlecase 1155 code points to their lowercase equivalents in such strings; instead, 1156 it is RECOMMENDED to preserve the case of all code points contained 1157 in such strings and to compare them in a case-sensitive manner. 1159 That said, software implementers need to be aware that there exist 1160 tradeoffs between entropy and usability. 
For example, allowing a 1161 user to establish a password containing "uncommon" code points might 1162 make it difficult for the user to access a service when using an 1163 unfamiliar or constrained input device. 1165 Some application protocols use passwords directly, whereas others 1166 reuse technologies that themselves process passwords (one example of 1167 such a technology is the Simple Authentication and Security Layer 1168 [RFC4422]). Moreover, passwords are often carried by a sequence of 1169 protocols with backend authentication systems or data storage systems 1170 such as RADIUS [RFC2865] and LDAP [RFC4510]. Developers of 1171 application protocols are encouraged to look into reusing these 1172 profiles instead of defining new ones, so that end-user expectations 1173 about passwords are consistent no matter which application protocol 1174 is used. 1176 10. IANA Considerations 1178 10.1. PRECIS Derived Property Value Registry 1180 IANA is requested to create a PRECIS-specific registry with the 1181 Derived Properties for the versions of Unicode that are released 1182 after (and including) version 6.1. The derived property value is to 1183 be calculated in cooperation with a designated expert [RFC5226] 1184 according to the rules specified under Section 6 and Section 7, not 1185 by copying the non-normative table found under Appendix A. 1187 The IESG is to be notified if backward-incompatible changes to the 1188 table of derived properties are discovered or if other problems arise 1189 during the process of creating the table of derived property values 1190 or during expert review. Changes to the rules defined under 1191 Section 6 and Section 7 require IETF Review, as described in 1192 [RFC5226]. 1194 10.2. PRECIS Base Classes Registry 1196 IANA is requested to create a registry of PRECIS string classes. In 1197 accordance with [RFC5226], the registration policy is "RFC Required".
1199 The registration template is as follows: 1201 Base Class: [the name of the PRECIS string class] 1202 Description: [a brief description of the PRECIS string class and its 1203 intended use, e.g., "A sequence of letters, numbers, and symbols 1204 that is used to identify or address a network entity."] 1205 Width Mapping: [the behavioral rule for handling of width, e.g., 1206 "Map fullwidth and halfwidth characters to their decomposition 1207 equivalents."] 1208 Additional Mappings: [any additional mappings that are required or 1209 recommended, e.g., "Map non-ASCII space characters to ASCII 1210 space."; or "Application Specific" if to be defined by protocols 1211 that use the PRECIS string class] 1212 Case Mapping: [the behavioral rule for handling of case, e.g., "Map 1213 uppercase and titlecase characters to lowercase."; or "Application 1214 Specific" if to be defined by protocols that use the PRECIS string 1215 class] 1216 Normalization: [which Unicode normalization form is applied, e.g., 1217 "NFC"; or "Application Specific" if to be defined by protocols 1218 that use the PRECIS string class] 1219 Directionality: [the behavioral rule for handling of right-to-left 1220 code points, e.g., "The 'Bidi Rule' defined in RFC 5893 applies."; 1221 or "Application Specific" if to be defined by protocols that use 1222 the PRECIS string class] 1223 Specification: [the RFC number] 1225 The initial registrations are as follows: 1227 Base Class: FreeformClass. 1228 Description: A sequence of letters, numbers, symbols, spaces, and 1229 other code points that is used for free-form strings. 1230 Width Mapping: Application Specific. 1231 Additional Mappings: Application Specific. 1232 Case Mapping: Application Specific. 1233 Normalization: Application Specific. 1234 Directionality: Application Specific. 1235 Specification: RFC XXXX. [Note to RFC Editor: please change XXXX to 1236 the number issued for this specification.] 1238 Base Class: IdentifierClass.
1239 Description: A sequence of letters, numbers, and symbols that is 1240 used to identify or address a network entity. 1241 Width Mapping: Application Specific. 1242 Additional Mappings: Application Specific. 1243 Case Mapping: Application Specific. 1244 Normalization: Application Specific. 1245 Directionality: Application Specific. 1246 Specification: RFC XXXX. [Note to RFC Editor: please change XXXX to 1247 the number issued for this specification.] 1249 10.3. PRECIS Subclasses Registry 1251 IANA is requested to create a registry of subclasses that use the 1252 PRECIS string classes. In accordance with [RFC5226], the 1253 registration policy is "Expert Review". This policy was chosen in 1254 order to ensure that "customers" of PRECIS receive appropriate 1255 guidance regarding the sometimes complex and subtle 1256 internationalization issues related to subclassing of PRECIS string 1257 classes. 1259 The registration template is as follows: 1261 Subclass: [the name of the subclass] 1262 Base Class: [which PRECIS string class is being subclassed] 1263 Exclusions: [a brief description of the specific code points that 1264 are excluded or of the properties based on which characters are 1265 excluded, e.g., "Eight legacy characters in the ASCII range" or 1266 "Any character that has a compatibility equivalent, i.e., the 1267 HasCompat category"] 1268 Specification: [a pointer to relevant documentation, such as an RFC 1269 or Internet-Draft] 1271 In order to request a review, the registrant shall send a completed 1272 template to the precis@ietf.org list or its designated successor. 1274 Factors to focus on while reviewing subclass registrations include 1275 the following: 1277 o Is the problem well-defined? 1278 o Is it clear what applications will use this subclass? 1279 o Would an existing PRECIS string class or subclass solve the 1280 problem? 1281 o Are the defined exclusions a reasonable solution to the problem 1282 for the relevant applications? 
1283 o Is the subclass clearly defined? 1284 o Does the subclass reduce the degree to which human users could be 1285 surprised by application behavior (the "principle of least user 1286 surprise")? 1287 o Is the subclass based on an appropriate dividing line between user 1288 interface (culture, context, intent, locale, device limitations, 1289 etc.) and the use of conformant strings in protocol elements? 1290 o Does the subclass introduce any new security concerns (e.g., false 1291 positives for authentication or authorization)? 1293 10.4. PRECIS Usage Registry 1295 IANA is requested to create a registry of application protocols that 1296 use the PRECIS string classes. The registry will include one entry 1297 for each use (e.g., if a protocol uses both the IdentifierClass and 1298 the FreeformClass then the specification for that protocol would 1299 submit two registrations). In accordance with [RFC5226], the 1300 registration policy is "Expert Review". This policy was chosen in 1301 order to ensure that "customers" of PRECIS receive appropriate 1302 guidance regarding the sometimes complex and subtle 1303 internationalization issues related to use of PRECIS string classes. 
1305 The registration template is as follows: 1307 Applicability: [the specific protocol elements to which this usage 1308 applies, e.g., "Localparts in XMPP addresses."] 1309 Base Class: [the PRECIS string class that is being used or 1310 subclassed] 1311 Subclass: [whether the protocol has defined a subclass of the PRECIS 1312 string class and, if so, the name of the subclass, e.g., "Yes, 1313 LocalpartIdentifierClass."] 1314 Replaces: [the Stringprep profile that this PRECIS usage replaces, 1315 if any] 1316 Width Mapping: [the behavioral rule for handling of width, e.g., 1317 "Map fullwidth and halfwidth characters to their decomposition 1318 equivalents."] 1319 Additional Mappings: [any additional mappings that are required or 1320 recommended, e.g., "Map non-ASCII space characters to ASCII 1321 space."] 1322 Case Mapping: [the behavioral rule for handling of case, e.g., "Map 1323 uppercase and titlecase characters to lowercase."] 1324 Normalization: [which Unicode normalization form is applied, e.g., 1325 "NFC"] 1326 Directionality: [the behavioral rule for handling of right-to-left 1327 code points, e.g., "The 'Bidi Rule' defined in RFC 5893 applies."] 1328 Specification: [a pointer to relevant documentation, such as an RFC 1329 or Internet-Draft] 1331 In order to request a review, the registrant shall send a completed 1332 template to the precis@ietf.org list or its designated successor. 1334 Factors to focus on while reviewing usage registrations include the 1335 following: 1337 o Does the specification define what kinds of applications are 1338 involved and the protocol elements to which this usage applies? 1339 o Is there a PRECIS string class or subclass that would be more 1340 appropriate to use? 1341 o Are the normalization, case mapping, width mapping, additional 1342 mapping, and directionality rules appropriate for the intended 1343 use?
1345 o Does the usage reduce the degree to which human users could be 1346 surprised by application behavior (the "principle of least user 1347 surprise")? 1348 o Is the usage based on an appropriate dividing line between user 1349 interface (culture, context, intent, locale, device limitations, 1350 etc.) and the use of conformant strings in protocol elements? 1351 o Does the usage introduce any new security concerns (e.g., false 1352 positives for authentication or authorization)? 1354 11. Interoperability Considerations 1356 Although strings that are consumed in PRECIS-based application 1357 protocols are often encoded using UTF-8 [RFC3629], the exact encoding 1358 is a matter for the application protocol that reuses PRECIS, not for 1359 the PRECIS framework. 1361 It is known that some existing systems are unable to support the full 1362 Unicode character set, or even any characters outside the ASCII 1363 range. If two (or more) applications need to interoperate when 1364 exchanging data (e.g., for the purpose of authenticating a username 1365 or password), they will naturally need to have in common at least one 1366 coded character set (as defined by [RFC6365]). Establishing such a 1367 baseline is a matter for the application protocol that reuses PRECIS, 1368 not for the PRECIS framework. 1370 12. References 1372 12.1. Normative References 1374 [I-D.ietf-precis-mappings] 1375 Yoneya, Y. and T. NEMOTO, "Mapping characters for precis 1376 classes", draft-ietf-precis-mappings-02 (work in 1377 progress), May 2013. 1379 [RFC20] Cerf, V., "ASCII format for network interchange", RFC 20, 1380 October 1969. 1382 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 1383 Requirement Levels", BCP 14, RFC 2119, March 1997. 1385 [RFC5198] Klensin, J. and M. Padlipsky, "Unicode Format for Network 1386 Interchange", RFC 5198, March 2008. 1388 [UNICODE] The Unicode Consortium, "The Unicode Standard, Version 1389 6.2", 2012, 1390 . 1392 12.2. 
Informative References 1394 [I-D.ietf-precis-nickname] 1395 Saint-Andre, P., "Preparation and Comparison of 1396 Nicknames", draft-ietf-precis-nickname-06 (work in 1397 progress), July 2013. 1399 [I-D.ietf-precis-saslprepbis] 1400 Saint-Andre, P. and A. Melnikov, "Username and Password 1401 Preparation Algorithms", draft-ietf-precis-saslprepbis-02 1402 (work in progress), April 2013. 1404 [I-D.ietf-xmpp-6122bis] 1405 Saint-Andre, P., "Extensible Messaging and Presence 1406 Protocol (XMPP): Address Format", 1407 draft-ietf-xmpp-6122bis-07 (work in progress), April 2013. 1409 [RFC2865] Rigney, C., Willens, S., Rubens, A., and W. Simpson, 1410 "Remote Authentication Dial In User Service (RADIUS)", 1411 RFC 2865, June 2000. 1413 [RFC3454] Hoffman, P. and M. Blanchet, "Preparation of 1414 Internationalized Strings ("stringprep")", RFC 3454, 1415 December 2002. 1417 [RFC3490] Faltstrom, P., Hoffman, P., and A. Costello, 1418 "Internationalizing Domain Names in Applications (IDNA)", 1419 RFC 3490, March 2003. 1421 [RFC3491] Hoffman, P. and M. Blanchet, "Nameprep: A Stringprep 1422 Profile for Internationalized Domain Names (IDN)", 1423 RFC 3491, March 2003. 1425 [RFC3629] Yergeau, F., "UTF-8, a transformation format of ISO 1426 10646", STD 63, RFC 3629, November 2003. 1428 [RFC4422] Melnikov, A. and K. Zeilenga, "Simple Authentication and 1429 Security Layer (SASL)", RFC 4422, June 2006. 1431 [RFC4510] Zeilenga, K., "Lightweight Directory Access Protocol 1432 (LDAP): Technical Specification Road Map", RFC 4510, 1433 June 2006. 1435 [RFC4690] Klensin, J., Faltstrom, P., Karp, C., and IAB, "Review and 1436 Recommendations for Internationalized Domain Names 1437 (IDNs)", RFC 4690, September 2006. 1439 [RFC5226] Narten, T. and H. Alvestrand, "Guidelines for Writing an 1440 IANA Considerations Section in RFCs", BCP 26, RFC 5226, 1441 May 2008. 1443 [RFC5234] Crocker, D. and P. Overell, "Augmented BNF for Syntax 1444 Specifications: ABNF", STD 68, RFC 5234, January 2008. 
1446 [RFC5246] Dierks, T. and E. Rescorla, "The Transport Layer Security 1447 (TLS) Protocol Version 1.2", RFC 5246, August 2008. 1449 [RFC5890] Klensin, J., "Internationalized Domain Names for 1450 Applications (IDNA): Definitions and Document Framework", 1451 RFC 5890, August 2010. 1453 [RFC5891] Klensin, J., "Internationalized Domain Names in 1454 Applications (IDNA): Protocol", RFC 5891, August 2010. 1456 [RFC5892] Faltstrom, P., "The Unicode Code Points and 1457 Internationalized Domain Names for Applications (IDNA)", 1458 RFC 5892, August 2010. 1460 [RFC5893] Alvestrand, H. and C. Karp, "Right-to-Left Scripts for 1461 Internationalized Domain Names for Applications (IDNA)", 1462 RFC 5893, August 2010. 1464 [RFC5894] Klensin, J., "Internationalized Domain Names for 1465 Applications (IDNA): Background, Explanation, and 1466 Rationale", RFC 5894, August 2010. 1468 [RFC5895] Resnick, P. and P. Hoffman, "Mapping Characters for 1469 Internationalized Domain Names in Applications (IDNA) 1470 2008", RFC 5895, September 2010. 1472 [RFC6365] Hoffman, P. and J. Klensin, "Terminology Used in 1473 Internationalization in the IETF", BCP 166, RFC 6365, 1474 September 2011. 1476 [RFC6885] Blanchet, M. and A. Sullivan, "Stringprep Revision and 1477 Problem Statement for the Preparation and Comparison of 1478 Internationalized Strings (PRECIS)", RFC 6885, March 2013. 1480 [RFC6943] Thaler, D., "Issues in Identifier Comparison for Security 1481 Purposes", RFC 6943, May 2013. 1483 [UAX9] The Unicode Consortium, "Unicode Standard Annex #9: 1484 Unicode Bidirectional Algorithm", September 2012, 1485 . 1487 [UAX11] The Unicode Consortium, "Unicode Standard Annex #11: East 1488 Asian Width", September 2012, 1489 . 1491 [UAX15] The Unicode Consortium, "Unicode Standard Annex #15: 1492 Unicode Normalization Forms", August 2012, 1493 . 1495 [UTR36] The Unicode Consortium, "Unicode Technical Report #36: 1496 Unicode Security Considerations", July 2012, 1497 . 
1499 [UTR39] The Unicode Consortium, "Unicode Technical Report #39: 1500 Unicode Security Mechanisms", July 2012, 1501 . 1503 URIs 1505 [1] 1507 [2] 1509 Appendix A. Codepoint Table 1511 WARNING: The following table is provisional and is still being 1512 verified! 1514 If one applies the property calculation rules from Section 7 to the 1515 code points 0x0000 to 0x10FFFF in Unicode 6.2, the result is as shown 1516 in the following table, in Unicode Character Database (UCD) format. 1517 The columns of the table are as follows: 1519 1. The code point or codepoint range. 1520 2. The assignment for the code point or range, where the value is 1521 one of PVALID, DISALLOWED, UNASSIGNED, CONTEXTO, CONTEXTJ, or 1522 FREE_PVAL (which includes ID_DIS). 1523 3. The name or names for the code point or range. 1525 This table is non-normative, is included only for illustrative 1526 purposes, and applies only to Unicode 6.2, not to past or future 1527 versions of Unicode. Please note that the strings displayed in the 1528 third column are not necessarily the formal name of the code point 1529 (as defined in [UNICODE]) because the fixed width of the RFC format 1530 necessitated truncation of many names. 1532 0000..001F ; DISALLOWED # 1533 0020 ; FREE_PVAL # SPACE 1534 0021..007E ; PVALID # EXCLAM MARK .. TILDE 1535 007F..009F ; DISALLOWED # 1536 00A0..00AC ; FREE_PVAL # NO-BREAK SPACE .. NOT SIGN 1537 00AD ; DISALLOWED # SOFT HYPH 1538 00AE..00B6 ; FREE_PVAL # REGISTERED SIGN .. 
PILCROW SIGN 1539 00B7 ; CONTEXTO # MIDDLE DOT 1540 00B8..00BF ; FREE_PVAL # CEDILLA..INV QUEST IND 1541 00C0..00D6 ; PVALID # LAT CAP LET A W GRAV..LAT CAP O 1542 00D7 ; FREE_PVAL # MULTIPLICATION SIGN 1543 00D8..00F6 ; PVALID # LAT CAP LET O W STROKE..LAT SM 1544 00F7 ; FREE_PVAL # DIVISION SIGN 1545 00F8..0131 ; PVALID # LAT SM LET O W STROKE..LAT SM LET 1546 0132..0133 ; FREE_PVAL # LAT CAP LIG IJ..LAT SM LIG IJ 1547 0134..013E ; PVALID # LAT CAP LET J W CIRCUM..LAT SM LET 1548 013F..0140 ; FREE_PVAL # LAT CAP LET L W MID DOT..LAT SM LET 1549 0141..0148 ; PVALID # LAT CAP LET L W STROKE..LAT SM LET 1550 0149 ; FREE_PVAL # LAT SM LET N PRECEDED BY APOS 1551 014A..017E ; PVALID # LAT CAP LET ENG..LAT SM LET Z W CA 1552 017F ; FREE_PVAL # LAT SM LET LONG S 1553 0180..01C3 ; PVALID # LAT SM LET B W STROKE..LAT LET RETR 1554 01C4..01CC ; FREE_PVAL # LAT CAP LET DZ W CARON..LAT SM 1555 01CD..01F0 ; PVALID # LAT CAP LET A W CARON..LAT SM LET J 1556 01F1..01F3 ; FREE_PVAL # LAT CAP LET DZ..LAT SM LET DZ 1557 01F4..02AF ; PVALID # LAT CAP LET G W ACUTE..LAT SM 1558 02B0..02B8 ; FREE_PVAL # MOD LET SM H..MOD LET SM Y 1559 02B9..02C1 ; PVALID # MOD LET PRIME..MOD LET REV GLOT ST 1560 02C2..02C5 ; FREE_PVAL # MOD LET L ARROW..MOD LET D ARROW 1561 02C6..02D1 ; PVALID # MOD LET CIRCUM ACC..MOD LET HALF TR 1562 02D2..02EB ; FREE_PVAL # MOD LET CENT R HALF RING..MOD LET Y 1563 02EC ; PVALID # MOD LET VOICING 1564 02ED ; FREE_PVAL # MOD LET UNASPIRATED 1565 02EE ; PVALID # MOD LET DOUBLE APOS 1566 02EF..02FF ; FREE_PVAL # MOD LET LOW D ARR..MOD LET LOW L AR 1567 0300..034E ; PVALID # COMB GRAVE ACCENT..COMB UP ARROW BE 1568 034F ; DISALLOWED # COMB GRAPHEME JOINER 1569 0350..0374 ; PVALID # COMB RIGHT ARROWHEAD..GREEK NUM SIG 1570 0375 ; CONTEXTO # GREEK LOW NUM SIGN 1571 0376..0377 ; PVALID # GR CAP LET PAMPHYLIAN DIGAMMA..GR S 1572 0378..0379 ; UNASSIGNED # .. 
1573 037A ; FREE_PVAL # GREEK YPOGEGRAMMENI 1574 037B..037D ; PVALID # GR SM REV LUN SIG..GR SM REV DOT LU 1575 037E ; FREE_PVAL # GREEK QUEST MARK 1576 037F..0383 ; UNASSIGNED # .. 1577 0384..0385 ; FREE_PVAL # GREEK TONOS..GREEK DIALYTIKA TONOS 1578 0386 ; PVALID # GR CAP LET ALPHA W TONOS 1579 0387 ; FREE_PVAL # GREEK ANO TELEIA 1580 0388..038A ; PVALID # GR CAP LET EPSILON W TONOS..GR CAP 1581 038B ; UNASSIGNED # 1582 038C ; PVALID # GREEK CAP LET OMICRON W TONOS 1583 038D ; UNASSIGNED # 1584 038E..03A1 ; PVALID # GR CAP LET UPSILON W TONOS..GR CAP 1585 03A2 ; UNASSIGNED # 1586 03A3..03CF ; PVALID # GREEK CAP LET SIGMA..GR CAP 1587 03D0..03D2 ; FREE_PVAL # GR BETA SYM..GR UPSILON W HOOK 1588 03D3..03D4 ; PVALID # GR UPSILON W ACUTE AND HOOK..GR UP 1589 03D5..03D6 ; FREE_PVAL # GR PHI SYM..GR PI SYM 1590 03D7..03EF ; PVALID # GR KAI SYM..COPT SM LET DEI 1591 03F0..03F2 ; FREE_PVAL # GR KAPPA SYM..GR LUNATE SIGMA 1592 03F3 ; PVALID # GREEK LET YOT 1593 03F4..03F6 ; FREE_PVAL # GR CAP THETA..GR REV LUNATE EPSILON 1594 03F7..03F8 ; PVALID # GR CAP LET SHO..GR SM LET SHO 1595 03F9 ; FREE_PVAL # GREEK CAP LUNATE SIGMA SYM 1596 03FA..0481 ; PVALID # GR CAP LET SAN..CYR SM LET KOPPA 1597 0482 ; FREE_PVAL # CYR THOUSANDS SIGN 1598 0483..0487 ; PVALID # COMB CYR TITLO..COMB CYR POK 1599 0488..0489 ; FREE_PVAL # COMB CYR HUNDRED THOUSANDS SIGN..C 1600 048A..0527 ; PVALID # CYR CAP LET SH I W TAIL..CYR S 1601 0528..0530 ; UNASSIGNED # .. 1602 0531..0556 ; PVALID # ARM CAP LET AYB..ARM CAP LET FEH 1603 0557..0558 ; UNASSIGNED # .. 1604 0559 ; PVALID # ARM MOD LET LEFT HALF RING 1605 055A..055F ; FREE_PVAL # ARM APOS..ARM ABBREV 1606 0560 ; UNASSIGNED # 1607 0561..0586 ; PVALID # ARM SM LET AYB..ARMENIAN SM LE 1608 0587 ; FREE_PVAL # ARM SM LIG ECH YIWN 1609 0588 ; UNASSIGNED # 1610 0589..058A ; FREE_PVAL # ARMENIAN FULL STOP..ARMENIAN HYPH 1611 058B..058E ; UNASSIGNED # .. 
1612 058F ; FREE_PVAL # ARMENIAN DRAM SIGN 1613 0590 ; UNASSIGNED # 1614 0591..05BD ; PVALID # HEBR ACC ETNAHTA..HEBR PNT ME 1615 05BE ; FREE_PVAL # HEBR PUNCT MAQAF 1616 05BF ; PVALID # HEBR PNT RAFE 1617 05C0 ; FREE_PVAL # HEBR PUNCT PASEQ 1618 05C1..05C2 ; PVALID # HEBR PNT SHIN DOT..HEBR PNT SIN DOT 1619 05C3 ; FREE_PVAL # HEBR PUNCT SOF PASUQ 1620 05C4..05C5 ; PVALID # HEBR MARK UP DOT..HEBR MARK LOW DOT 1621 05C6 ; FREE_PVAL # HEBR PUNCT NUN HAFUKHA 1622 05C7 ; PVALID # HEBR PNT QAMATS QATAN 1623 05C8..05CF ; UNASSIGNED # .. 1624 05D0..05EA ; PVALID # HEBR LET ALEF..HEBR LET TAV 1625 05EB..05EF ; UNASSIGNED # .. 1626 05F0..05F2 ; PVALID # HEBR LIG YIDDISH DOUBLE VAV..HEBR L 1627 05F3..05F4 ; CONTEXTO # HEBR PUNCT GERESH..HEBR PUNCTUATIO 1628 05F5..05FF ; UNASSIGNED # .. 1629 0600..0604 ; DISALLOWED # ARAB NUM SIGN..ARAB SIGN SAM 1630 0605 ; UNASSIGNED # .. 1631 0606..060F ; FREE_PVAL # AR-IND CUBE ROOT..ARAB SIGN MISRA 1632 0610..061A ; PVALID # ARAB SIGN SALLALLAHOU ALAYHE ..AR 1633 061B ; FREE_PVAL # ARAB SEMICOLON 1634 061C..061D ; UNASSIGNED # .. 
1635 061E..061F ; FREE_PVAL # ARAB TRIPLE DOT PUNCT MARK..ARAB Q 1636 0620..063F ; PVALID # ARAB LET KASH..ARAB LET FARSI YEH 1637 0640 ; DISALLOWED # ARAB TATWEEL 1638 0641..065F ; PVALID # ARAB LET FEH..ARAB WAVY HAMZA BEL 1639 0660..0669 ; CONTEXTO # AR-IND DIG ZERO..AR-IND DIG 1640 066A..066D ; FREE_PVAL # ARAB PCT SIGN..ARAB FIVE PNTED STA 1641 066E..0674 ; PVALID # ARAB LET DOTLESS BEH..ARAB LET HIG 1642 0675..0678 ; FREE_PVAL # ARAB LET HIGH HAMZA ALEF..ARAB LET 1643 0679..06D3 ; PVALID # ARAB LET TTEH..ARAB LET YEH BARREE 1644 06D4 ; FREE_PVAL # ARAB FULL STOP 1645 06D5..06DC ; PVALID # ARAB LET AE..ARAB SM HIGH SEEN 1646 06DD ; DISALLOWED # ARAB END OF AYAH 1647 06DE ; FREE_PVAL # ARAB START OF RUB EL HIZB 1648 06DF..06E8 ; PVALID # ARAB SM HIGH ROUNDED ZERO..ARAB SM 1649 06E9 ; FREE_PVAL # ARAB PLACE OF SAJDAH 1650 06EA..06EF ; PVALID # ARAB EMPTY CENTRE LOW STOP..ARAB LET 1651 06F0..06F9 ; CONTEXTO # EXT AR-IND DIG ZERO..EXT A 1652 06FA..06FF ; PVALID # ARAB LET SHEEN W DOT BEL..ARAB 1653 0700..070D ; FREE_PVAL # SYR END OF PARA..SYR HARKLEAN AST 1654 070E ; UNASSIGNED # 1655 070F ; DISALLOWED # SYR ABBR MARK 1656 0710..074A ; PVALID # SYR LET ALAPH..SYR BARREKH 1657 074B..074C ; UNASSIGNED # .. 1658 074D..07B1 ; PVALID # SYR LET SOGDIAN ZHAIN..THAANA LET N 1659 07B2..07BF ; UNASSIGNED # .. 1660 07C0..07F5 ; PVALID # NKO DIG ZERO..NKO LOW TONE APOS 1661 07F6..07F9 ; FREE_PVAL # NKO SYM OO DENNEN..NKO EXCLAMATI 1662 07FA ; DISALLOWED # NKO LAJANYALAN 1663 07FB..07FF ; UNASSIGNED # .. 1664 0800..082D ; PVALID # SAMAR LET ALAF..SAMAR MARK NEQUDA 1665 082E..082F ; UNASSIGNED # .. 1666 0830..083E ; FREE_PVAL # SAMAR PUNCT NEQUDAA..SAMAR PUN 1667 083F ; UNASSIGNED # 1668 0840..085B ; PVALID # MANDAIC LET HALQA..MANDAIC GEM 1669 085C..085D ; UNASSIGNED # .. 1670 085E ; FREE_PVAL # MANDAIC PUNCTUATION 1671 085F..089F ; UNASSIGNED # .. 
1672 08A0 ; PVALID # ARAB LET BEH W SM V BEL 1673 08A1 ; UNASSIGNED # 1674 08A2..08AC ; PVALID # ARAB LET JEEM W 2 DOTS AB..ARAB 1675 08AD..08E3 ; UNASSIGNED # .. 1676 08E4..08FE ; PVALID # ARAB CURLY FATHA..ARAB DAMMA W 1677 08FF ; UNASSIGNED # 1678 0900..0963 ; PVALID # DEVAN SIGN INV CANDRABINDU..DEVAN V 1679 0964..0965 ; FREE_PVAL # DEVAN DANDA..DEVAN DOUBLE DANDA 1680 0966..096F ; PVALID # DEVAN DIG ZERO..DEVAN DIG NINE 1681 0970 ; FREE_PVAL # DEVAN ABBR SIGN 1682 0971..0977 ; PVALID # DEVAN SIGN HIGH SPACING DOT..DEVAN 1683 0978 ; UNASSIGNED # 1684 0979..097F ; PVALID # DEVAN LET ZHA..DEVAN LET BBA 1685 0980 ; UNASSIGNED # 1686 0981..0983 ; PVALID # BENG SIGN CANDRABINDU..BENG SIGN VIS 1687 0984 ; UNASSIGNED # 1688 0985..098C ; PVALID # BENG LET A..BENG LET VOC L 1689 098D..098E ; UNASSIGNED # .. 1690 098F..0990 ; PVALID # BENG LET E..BENG LET AI 1691 0991..0992 ; UNASSIGNED # .. 1692 0993..09A8 ; PVALID # BENG LET O..BENG LET NA 1693 09A9 ; UNASSIGNED # 1694 09AA..09B0 ; PVALID # BENG LET PA..BENG LET RA 1695 09B1 ; UNASSIGNED # 1696 09B2 ; PVALID # BENG LET LA 1697 09B3..09B5 ; UNASSIGNED # .. 1698 09B6..09B9 ; PVALID # BENG LET SHA..BENG LET HA 1699 09BA..09BB ; UNASSIGNED # .. 1700 09BC..09C4 ; PVALID # BENG SIGN NUKTA..BENG VOW SIGN VOCAL 1701 09C5..09C6 ; UNASSIGNED # .. 1702 09C7..09C8 ; PVALID # BENG VOW SIGN E..BENG VOW SIGN AI 1703 09C9..09CA ; UNASSIGNED # .. 1704 09CB..09CE ; PVALID # BENG VOW SIGN O..BENG LET KHANDA 1705 09CF..09D6 ; UNASSIGNED # .. 1706 09D7 ; PVALID # BENG AU LEN MARK 1707 09D8..09DB ; UNASSIGNED # .. 1708 09DC..09DD ; PVALID # BENG LET RRA..BENG LET RHA 1709 09DE ; UNASSIGNED # 1710 09DF..09E3 ; PVALID # BENG LET YYA..BENG VOW SIG 1711 09E4..09E5 ; UNASSIGNED # .. 1712 09E6..09F1 ; PVALID # BENG DIG ZERO..BENG LET RA W L 1713 09F2..09FB ; FREE_PVAL # BENG RUPEE MARK..BENG GANDA MARK 1714 09FC..0A00 ; UNASSIGNED # .. 
1715 0A01..0A03 ; PVALID # GURMUKHI SIGN ADAK BINDI..GURMUKHI 1716 0A04 ; UNASSIGNED # 1717 0A05..0A0A ; PVALID # GURMUKHI LET A..GURMUKHI LET UU 1718 0A0B..0A0E ; UNASSIGNED # .. 1719 0A0F..0A10 ; PVALID # GURMUKHI LET EE..GURMUKHI LET AI 1720 0A11..0A12 ; UNASSIGNED # .. 1721 0A13..0A28 ; PVALID # GURMUKHI LET OO..GURMUKHI LET NA 1722 0A29 ; UNASSIGNED # 1723 0A2A..0A30 ; PVALID # GURMUKHI LET PA..GURMUKHI LET RA 1724 0A31 ; UNASSIGNED # 1725 0A32..0A33 ; PVALID # GURMUKHI LET LA..GURMUKHI LET LLA 1726 0A34 ; UNASSIGNED # 1727 0A35..0A36 ; PVALID # GURMUKHI LET VA..GURMUKHI LET SHA 1728 0A37 ; UNASSIGNED # 1729 0A38..0A39 ; PVALID # GURMUKHI LET SA..GURMUKHI LET HA 1730 0A3A..0A3B ; UNASSIGNED # .. 1731 0A3C ; PVALID # GURMUKHI SIGN NUKTA 1732 0A3D ; UNASSIGNED # 1733 0A3E..0A42 ; PVALID # GURMUKHI VOW SIGN AA..GURMUKHI V 1734 0A43..0A46 ; UNASSIGNED # .. 1735 0A47..0A48 ; PVALID # GURMUKHI VOW SIGN EE..GURMUKHI V 1736 0A49..0A4A ; UNASSIGNED # .. 1737 0A4B..0A4D ; PVALID # GURMUKHI VOW SIGN OO..GURMUKHI S 1738 0A4E..0A50 ; UNASSIGNED # .. 1739 0A51 ; PVALID # GURMUKHI SIGN UDAAT 1740 0A52..0A58 ; UNASSIGNED # .. 1741 0A59..0A5C ; PVALID # GURMUKHI LET KHHA..GURMUKHI LET RRA 1742 0A5D ; UNASSIGNED # 1743 0A5E ; PVALID # GURMUKHI LET FA 1744 0A5F..0A65 ; UNASSIGNED # .. 1745 0A66..0A75 ; PVALID # GURMUKHI DIG ZERO..GURMUKHI SIGN YA 1746 0A76..0A80 ; UNASSIGNED # .. 1747 0A81..0A83 ; PVALID # GUJARATI SIGN CANDRABINDU..GUJARATI 1748 0A84 ; UNASSIGNED # 1749 0A85..0A8D ; PVALID # GUJARATI LET A..GUJARATI VOW CAND 1750 0A8E ; UNASSIGNED # 1751 0A8F..0A91 ; PVALID # GUJARATI LET E..GUJARATI VOW CAND 1752 0A92 ; UNASSIGNED # 1753 0A93..0AA8 ; PVALID # GUJARATI LET O..GUJARATI LET NA 1754 0AA9 ; UNASSIGNED # 1755 0AAA..0AB0 ; PVALID # GUJARATI LET PA..GUJARATI LET RA 1756 0AB1 ; UNASSIGNED # 1757 0AB2..0AB3 ; PVALID # GUJARATI LET LA..GUJARATI LET LLA 1758 0AB4 ; UNASSIGNED # 1759 0AB5..0AB9 ; PVALID # GUJARATI LET VA..GUJARATI LET HA 1760 0ABA..0ABB ; UNASSIGNED # .. 
1761 0ABC..0AC5 ; PVALID # GUJARATI SIGN NUKTA..GUJARATI VOW 1762 0AC6 ; UNASSIGNED # 1763 0AC7..0AC9 ; PVALID # GUJARATI VOW SIGN E..GUJARATI VOW 1764 0ACA ; UNASSIGNED # 1765 0ACB..0ACD ; PVALID # GUJARATI VOW SIGN O..GUJARATI SIG 1766 0ACE..0ACF ; UNASSIGNED # .. 1767 0AD0 ; PVALID # GUJARATI OM 1768 0AD1..0ADF ; UNASSIGNED # .. 1769 0AE0..0AE3 ; PVALID # GUJARATI LET VOC RR..GUJARATI V 1770 0AE4..0AE5 ; UNASSIGNED # .. 1771 0AE6..0AEF ; PVALID # GUJARATI DIG ZERO..GUJARATI DIG NINE 1772 0AF0..0AF1 ; FREE_PVAL # GUJARATI ABBR SIGN..GUJARATI RUPEE S 1773 0AF2..0B00 ; UNASSIGNED # .. 1774 0B01..0B03 ; PVALID # ORIYA SIGN CANDRABINDU..ORIYA SIGN V 1775 0B04 ; UNASSIGNED # 1776 0B05..0B0C ; PVALID # ORIYA LET A..ORIYA LET VOC L 1777 0B0D..0B0E ; UNASSIGNED # .. 1778 0B0F..0B10 ; PVALID # ORIYA LET E..ORIYA LET AI 1779 0B11..0B12 ; UNASSIGNED # .. 1780 0B13..0B28 ; PVALID # ORIYA LET O..ORIYA LET NA 1781 0B29 ; UNASSIGNED # 1782 0B2A..0B30 ; PVALID # ORIYA LET PA..ORIYA LET RA 1783 0B31 ; UNASSIGNED # 1784 0B32..0B33 ; PVALID # ORIYA LET LA..ORIYA LET LLA 1785 0B34 ; UNASSIGNED # 1786 0B35..0B39 ; PVALID # ORIYA LET VA..ORIYA LET HA 1787 0B3A..0B3B ; UNASSIGNED # .. 1788 0B3C..0B44 ; PVALID # ORIYA SIGN NUKTA..ORIYA VOW SIGN 1789 0B45..0B46 ; UNASSIGNED # .. 1790 0B47..0B48 ; PVALID # ORIYA VOW SIGN E..ORIYA VOW SIG 1791 0B49..0B4A ; UNASSIGNED # .. 1792 0B4B..0B4D ; PVALID # ORIYA VOW SIGN O..ORIYA SIGN VIRA 1793 0B4E..0B55 ; UNASSIGNED # .. 1794 0B56..0B57 ; PVALID # ORIYA AI LEN MARK..ORIYA AU LENG 1795 0B58..0B5B ; UNASSIGNED # .. 1796 0B5C..0B5D ; PVALID # ORIYA LET RRA..ORIYA LET RHA 1797 0B5E ; UNASSIGNED # 1798 0B5F..0B63 ; PVALID # ORIYA LET YYA..ORIYA VOW SIGN VOCA 1799 0B64..0B65 ; UNASSIGNED # .. 1800 0B66..0B6F ; PVALID # ORIYA DIG ZERO..ORIYA DIG NINE 1801 0B70 ; FREE_PVAL # ORIYA ISSHAR 1802 0B71 ; PVALID # ORIYA LET WA 1803 0B72..0B77 ; FREE_PVAL # ORIYA FRACT ONE QUART..ORIYA FRACT 1804 0B78..0B81 ; UNASSIGNED # .. 
1805 0B82..0B83 ; PVALID # TAMIL SIGN ANUSVARA..TAMIL SIGN VIS 1806 0B84 ; UNASSIGNED # 1807 0B85..0B8A ; PVALID # TAMIL LET A..TAMIL LET UU 1808 0B8B..0B8D ; UNASSIGNED # .. 1809 0B8E..0B90 ; PVALID # TAMIL LET E..TAMIL LET AI 1810 0B91 ; UNASSIGNED # 1811 0B92..0B95 ; PVALID # TAMIL LET O..TAMIL LET KA 1812 0B96..0B98 ; UNASSIGNED # .. 1813 0B99..0B9A ; PVALID # TAMIL LET NGA..TAMIL LET CA 1814 0B9B ; UNASSIGNED # 1815 0B9C ; PVALID # TAMIL LET JA 1816 0B9D ; UNASSIGNED # 1817 0B9E..0B9F ; PVALID # TAMIL LET NYA..TAMIL LET TTA 1818 0BA0..0BA2 ; UNASSIGNED # .. 1819 0BA3..0BA4 ; PVALID # TAMIL LET NNA..TAMIL LET TA 1820 0BA5..0BA7 ; UNASSIGNED # .. 1821 0BA8..0BAA ; PVALID # TAMIL LET NA..TAMIL LET PA 1822 0BAB..0BAD ; UNASSIGNED # .. 1823 0BAE..0BB9 ; PVALID # TAMIL LET MA..TAMIL LET HA 1824 0BBA..0BBD ; UNASSIGNED # .. 1825 0BBE..0BC2 ; PVALID # TAMIL VOW SIGN AA..TAMIL VOW SI 1826 0BC3..0BC5 ; UNASSIGNED # .. 1827 0BC6..0BC8 ; PVALID # TAMIL VOW SIGN E..TAMIL VOW SIG 1828 0BC9 ; UNASSIGNED # 1829 0BCA..0BCD ; PVALID # TAMIL VOW SIGN O..TAMIL SIGN VIRA 1830 0BCE..0BCF ; UNASSIGNED # .. 1831 0BD0 ; PVALID # TAMIL OM 1832 0BD1..0BD6 ; UNASSIGNED # .. 1833 0BD7 ; PVALID # TAMIL AU LEN MARK 1834 0BD8..0BE5 ; UNASSIGNED # .. 1835 0BE6..0BEF ; PVALID # TAMIL DIG ZERO..TAMIL DIG NINE 1836 0BF0..0BFA ; FREE_PVAL # TAMIL NUM TEN..TAMIL NUM SIGN 1837 0BFB..0C00 ; UNASSIGNED # .. 1838 0C01..0C03 ; PVALID # TELUGU SIGN CANDRABINDU..TELUGU SIG 1839 0C04 ; UNASSIGNED # 1840 0C05..0C0C ; PVALID # TELUGU LET A..TELUGU LET VOC L 1841 0C0D ; UNASSIGNED # 1842 0C0E..0C10 ; PVALID # TELUGU LET E..TELUGU LET AI 1843 0C11 ; UNASSIGNED # 1844 0C12..0C28 ; PVALID # TELUGU LET O..TELUGU LET NA 1845 0C29 ; UNASSIGNED # 1846 0C2A..0C33 ; PVALID # TELUGU LET PA..TELUGU LET LLA 1847 0C34 ; UNASSIGNED # 1848 0C35..0C39 ; PVALID # TELUGU LET VA..TELUGU LET HA 1849 0C3A..0C3C ; UNASSIGNED # .. 
1850 0C3D..0C44 ; PVALID # TELUGU SIGN AVAGRAHA..TELUGU VOW SI 1851 0C45 ; UNASSIGNED # 1852 0C46..0C48 ; PVALID # TELUGU VOW SIGN E..TELUGU VOW SIGN 1853 0C49 ; UNASSIGNED # 1854 0C4A..0C4D ; PVALID # TELUGU VOW SIGN O..TELUGU SIGN VIRA 1855 0C4E..0C54 ; UNASSIGNED # .. 1856 0C55..0C56 ; PVALID # TELUGU LEN MARK..TELUGU AI LEN MARK 1857 0C57 ; UNASSIGNED # 1858 0C58..0C59 ; PVALID # TELUGU LET TSA..TELUGU LET DZA 1859 0C5A..0C5F ; UNASSIGNED # .. 1860 0C60..0C63 ; PVALID # TELUGU LET VOC RR..TELUGU VOW S 1861 0C64..0C65 ; UNASSIGNED # .. 1862 0C66..0C6F ; PVALID # TELUGU DIG ZERO..TELUGU DIG NINE 1863 0C70..0C77 ; UNASSIGNED # .. 1864 0C78..0C7F ; FREE_PVAL # TELUGU FRACTION DIG ZERO..TELUGU S 1865 0C80..0C81 ; UNASSIGNED # .. 1866 0C82..0C83 ; PVALID # KANNADA SIGN ANUSVARA..KANNADA SIGN 1867 0C84 ; UNASSIGNED # 1868 0C85..0C8C ; PVALID # KANNADA LET A..KANNADA LET VOC L 1869 0C8D ; UNASSIGNED # 1870 0C8E..0C90 ; PVALID # KANNADA LET E..KANNADA LET AI 1871 0C91 ; UNASSIGNED # 1872 0C92..0CA8 ; PVALID # KANNADA LET O..KANNADA LET NA 1873 0CA9 ; UNASSIGNED # 1874 0CAA..0CB3 ; PVALID # KANNADA LET PA..KANNADA LET LLA 1875 0CB4 ; UNASSIGNED # 1876 0CB5..0CB9 ; PVALID # KANNADA LET VA..KANNADA LET HA 1877 0CBA..0CBB ; UNASSIGNED # .. 1878 0CBC..0CC4 ; PVALID # KANNADA SIGN NUKTA..KANNADA VOW SIG 1879 0CC5 ; UNASSIGNED # 1880 0CC6..0CC8 ; PVALID # KANNADA VOW SIGN E..KANNADA VOW SIG 1881 0CC9 ; UNASSIGNED # 1882 0CCA..0CCD ; PVALID # KANNADA VOW SIGN O..KANNADA SIGN VI 1883 0CCE..0CD4 ; UNASSIGNED # .. 1884 0CD5..0CD6 ; PVALID # KANNADA LEN MARK..KANNADA AI LEN MA 1885 0CD7..0CDD ; UNASSIGNED # .. 1886 0CDE ; PVALID # KANNADA LET FA 1887 0CDF ; UNASSIGNED # 1888 0CE0..0CE3 ; PVALID # KANNADA LET VOC RR..KANNADA VOW SIG 1889 0CE4..0CE5 ; UNASSIGNED # .. 1890 0CE6..0CEF ; PVALID # KANNADA DIG ZERO..KANNADA DIG NINE 1891 0CF0 ; UNASSIGNED # 1892 0CF1..0CF2 ; DISALLOWED # KANNADA SIGN JIHVAMULIYA..KANNADA S 1893 0CF3..0D01 ; UNASSIGNED # .. 
1894 0D02..0D03 ; PVALID # MALAY SIGN ANUSVARA..MALAY SIGN VIS 1895 0D04 ; UNASSIGNED # 1896 0D05..0D0C ; PVALID # MALAY LET A..MALAY LET VOC 1897 0D0D ; UNASSIGNED # 1898 0D0E..0D10 ; PVALID # MALAY LET E..MALAY LET AI 1899 0D11 ; UNASSIGNED # 1900 0D12..0D3A ; PVALID # MALAY LET O..MALAY LET TTTA 1901 0D3B..0D3C ; UNASSIGNED # .. 1902 0D3D..0D44 ; PVALID # MALAY SIGN AVAGRAHA..MALAY VOW SIG 1903 0D45 ; UNASSIGNED # 1904 0D46..0D48 ; PVALID # MALAY VOW SIGN E..MALAY VOW SIGN 1905 0D49 ; UNASSIGNED # 1906 0D4A..0D4E ; PVALID # MALAY VOW SIGN O..MALAY LET DOT REP 1907 0D4F..0D56 ; UNASSIGNED # .. 1908 0D57 ; PVALID # MALAY AU LEN MARK 1909 0D58..0D5F ; UNASSIGNED # .. 1910 0D60..0D63 ; PVALID # MALAY LET VOC RR..MALAY VOW 1911 0D64..0D65 ; UNASSIGNED # .. 1912 0D66..0D6F ; PVALID # MALAY DIG ZERO..MALAY DIG NINE 1913 0D70..0D75 ; FREE_PVAL # MALAY NUM TEN..MALAY FRACTION THR 1914 0D76..0D78 ; UNASSIGNED # .. 1915 0D79 ; FREE_PVAL # MALAY DATE MARK 1916 0D7A..0D7F ; PVALID # MALAY LET CHILLU NN..MALAY LET 1917 0D80..0D81 ; UNASSIGNED # .. 1918 0D82..0D83 ; PVALID # SINH SIGN ANUSVARAYA..SINH SIGN VIS 1919 0D84 ; UNASSIGNED # 1920 0D85..0D96 ; PVALID # SINH LET AYANNA..SINH LET AUYANN 1921 0D97..0D99 ; UNASSIGNED # .. 1922 0D9A..0DB1 ; PVALID # SINH LET ALPAPRAANA KAYANNA..SINH L 1923 0DB2 ; UNASSIGNED # 1924 0DB3..0DBB ; PVALID # SINH LET SANYAKA DAYANNA..SINH LETT 1925 0DBC ; UNASSIGNED # 1926 0DBD ; PVALID # SINH LET DANTAJA LAYANNA 1927 0DBE..0DBF ; UNASSIGNED # .. 1928 0DC0..0DC6 ; PVALID # SINH LET VAYANNA..SINH LET FAYAN 1929 0DC7..0DC9 ; UNASSIGNED # .. 1930 0DCA ; PVALID # SINH SIGN AL-LAKUNA 1931 0DCB..0DCE ; UNASSIGNED # .. 1932 0DCF..0DD4 ; PVALID # SINH VOW SIGN AELA-PILLA..SINH VOW 1933 0DD5 ; UNASSIGNED # 1934 0DD6 ; PVALID # SINH VOW SIGN DIGA PAA-PILLA 1935 0DD7 ; UNASSIGNED # 1936 0DD8..0DDF ; PVALID # SINH VOW SIGN GAETTA-PILLA..SINH VO 1937 0DE0..0DF1 ; UNASSIGNED # .. 
1938 0DF2..0DF3 ; PVALID # SINH VOW SIGN DIGA GAETTA-PILLA..SI 1939 0DF4 ; FREE_PVAL # SINH PUNCT KUNDDALIYA 1940 0DF5..0E00 ; UNASSIGNED # .. 1941 0E01..0E32 ; PVALID # THAI CHAR KO KAI..THAI CHAR SARA A 1942 0E33 ; FREE_PVAL # THAI CHAR SARA AM 1943 0E34..0E3A ; PVALID # THAI CHAR SARA I..THAI CHAR PHINTH 1944 0E3B..0E3E ; UNASSIGNED # .. 1945 0E3F ; FREE_PVAL # THAI CURRENCY SYM BAHT 1946 0E40..0E4E ; PVALID # THAI CHAR SARA E..THAI CHAR YAMAKK 1947 0E4F ; FREE_PVAL # THAI CHAR FONGMAN 1948 0E50..0E59 ; PVALID # THAI DIG ZERO..THAI DIG NINE 1949 0E5A..0E5B ; FREE_PVAL # THAI CHAR ANGKHANKHU..THAI CHAR KH 1950 0E5C..0E80 ; UNASSIGNED # .. 1951 0E81..0E82 ; PVALID # LAO LET KO..LAO LET KHO SUNG 1952 0E83 ; UNASSIGNED # 1953 0E84 ; PVALID # LAO LET KHO TAM 1954 0E85..0E86 ; UNASSIGNED # .. 1955 0E87..0E88 ; PVALID # LAO LET NGO..LAO LET CO 1956 0E89 ; UNASSIGNED # 1957 0E8A ; PVALID # LAO LET SO TAM 1958 0E8B..0E8C ; UNASSIGNED # .. 1959 0E8D ; PVALID # LAO LET NYO 1960 0E8E..0E93 ; UNASSIGNED # .. 1961 0E94..0E97 ; PVALID # LAO LET DO..LAO LET THO TAM 1962 0E98 ; UNASSIGNED # 1963 0E99..0E9F ; PVALID # LAO LET NO..LAO LET FO SUNG 1964 0EA0 ; UNASSIGNED # 1965 0EA1..0EA3 ; PVALID # LAO LET MO..LAO LET LO LING 1966 0EA4 ; UNASSIGNED # 1967 0EA5 ; PVALID # LAO LET LO LOOT 1968 0EA6 ; UNASSIGNED # 1969 0EA7 ; PVALID # LAO LET WO 1970 0EA8..0EA9 ; UNASSIGNED # .. 1971 0EAA..0EAB ; PVALID # LAO LET SO SUNG..LAO LET HO SUNG 1972 0EAC ; UNASSIGNED # 1973 0EAD..0EB2 ; PVALID # LAO LET O..LAO VOW SIGN AA 1974 0EB3 ; FREE_PVAL # LAO VOW SIGN AM 1975 0EB4..0EB9 ; PVALID # LAO VOW SIGN I..LAO VOW SIGN UU 1976 0EBA ; UNASSIGNED # 1977 0EBB..0EBD ; PVALID # LAO VOW SIGN MAI KON..LAO SEMIVOW SIG 1978 0EBE..0EBF ; UNASSIGNED # .. 1979 0EC0..0EC4 ; PVALID # LAO VOW SIGN E..LAO VOW SIGN AI 1980 0EC5 ; UNASSIGNED # 1981 0EC6 ; PVALID # LAO KO LA 1982 0EC7 ; UNASSIGNED # 1983 0EC8..0ECD ; PVALID # LAO TONE MAI EK..LAO NIGGAHITA 1984 0ECE..0ECF ; UNASSIGNED # .. 
1985 0ED0..0ED9 ; PVALID # LAO DIG ZERO..LAO DIG NINE 1986 0EDA..0EDB ; UNASSIGNED # .. 1987 0EDC..0EDD ; FREE_PVAL # LAO HO NO..LAO HO MO 1988 0EDE..0EDF ; PVALID # LAO LET KHMU GO..LAO LET KHMU NYO 1989 0EE0..0EFF ; UNASSIGNED # .. 1990 0F00 ; PVALID # TIB SYL OM 1991 0F01..0F0A ; FREE_PVAL # TIB MARK GTER YIG MGO TRUNC A..TIB 1992 0F0B ; PVALID # TIB MARK INTERSYLLABIC TSHEG 1993 0F0C..0F17 ; FREE_PVAL # TIB MARK DELIMITER TSHEG BSTAR..TIB 1994 0F18..0F19 ; PVALID # TIB ASTROLOGICAL SIGN -KHYUD PA..TIB 1995 0F1A..0F1F ; FREE_PVAL # TIB SIGN RDEL DKAR GCIG..TIB SIGN RD 1996 0F20..0F29 ; PVALID # TIB DIG ZERO..TIB DIG NINE 1997 0F2A..0F34 ; FREE_PVAL # TIB DIG HALF ONE..TIB MARK BSDUS R 1998 0F35 ; PVALID # TIB MARK NGAS BZUNG NYI ZLA 1999 0F36 ; FREE_PVAL # TIB MARK CARET DZUD RTAGS BZHI MIG C 2000 0F37 ; PVALID # TIB MARK NGAS BZUNG SGOR RTAGS 2001 0F38 ; FREE_PVAL # TIB MARK CHE MGO 2002 0F39 ; PVALID # TIB MARK TSA PHRU 2003 0F3A..0F3D ; FREE_PVAL # TIB MARK GUG RTAGS GYON..TIB MARK AN 2004 0F3E..0F47 ; PVALID # TIB SIGN YAR TSHES..TIB LET JA 2005 0F48 ; UNASSIGNED # 2006 0F49..0F6C ; PVALID # TIB LET NYA..TIB LET RRA 2007 0F6D..0F70 ; UNASSIGNED # .. 
2008 0F71..0F76 ; PVALID # TIB VOW SIGN AA..TIB VOW SIGN VO 2009 0F77..0F79 ; FREE_PVAL # TIB VOW SIGN UU..TIB VOW SIGN VO 2010 0F7A..0F80 ; PVALID # TIB VOW SIGN E..TIB VOW SIGN REV 2011 0F81 ; FREE_PVAL # TIB VOW SIGN REV II 2012 0F82..0F84 ; PVALID # TIB SIGN NYI ZLA NAA DA..TIB MARK H 2013 0F85 ; FREE_PVAL # TIB MARK PALUTA 2014 0F86..0F8F ; PVALID # TIB SIGN LCI RTAGS..TIB SUBJOIN S 2015 0F90..0F92 ; PVALID # TIB SUBJOIN LET KA..TIB SUBJOIN 2016 0F93 ; FREE_PVAL # TIB SUBJOIN LET GHA 2017 0F94..0F97 ; PVALID # TIB SUBJOIN LET NGA..TIB SUBJOI 2018 0F98 ; UNASSIGNED # 2019 0F99..0FBC ; PVALID # TIB SUBJOIN LET NYA..TIB SUBJOI 2020 0FBD ; UNASSIGNED # 2021 0FBE..0FC5 ; FREE_PVAL # TIB KU RU KHA..TIB SYM RDO RJE 2022 0FC6 ; PVALID # TIB SYM PADMA GDAN 2023 0FC7..0FCC ; FREE_PVAL # TIB SYM RDO RJE RGYA GRAM..TIB SY 2024 0FCD ; UNASSIGNED # 2025 0FCE..0FDA ; FREE_PVAL # TIB SIGN RDEL NAG RDEL DKAR..TIB MA 2026 0FDB..0FFF ; UNASSIGNED # .. 2027 1000..1049 ; PVALID # MYAN LET KA..MYAN DIG NINE 2028 104A..104F ; FREE_PVAL # MYAN SIGN LITTLE SECTION..MYAN SYM 2029 1050..109D ; PVALID # MYAN LET SHA..MYAN VOW SIGN AITON 2030 109E..109F ; FREE_PVAL # MYAN SYM SHAN ONE..MYAN SYM SHAN EX 2031 10A0..10C5 ; PVALID # GEORG CAP LET AN..GEORG CAP LET HOE 2032 10C6 ; UNASSIGNED # 2033 10C7 ; PVALID # GEORG CAP LET YN 2034 10C8..10CC ; UNASSIGNED # .. 2035 10CD ; PVALID # GEORG CAP LET AEN 2036 10CE..10CF ; UNASSIGNED # .. 2037 10D0..10FA ; PVALID # GEORG LET AN..GEORG LET AIN 2038 10FB..10FC ; FREE_PVAL # GEORG PARA SEP..MOD LET GEORG NAR 2039 10FD..10FF ; PVALID # GEORG LET AEN..GEORG LET LABIAL 2040 1100..11FF ; DISALLOWED # HANGUL CHO KIYEOK..HANGUL JONG SSA 2041 1200..1248 ; PVALID # ETHI SYL HA..ETHI SYL QWA 2042 1249 ; UNASSIGNED # 2043 124A..124D ; PVALID # ETHI SYL QWI..ETHI SYL QWE 2044 124E..124F ; UNASSIGNED # .. 
2045 1250..1256 ; PVALID # ETHI SYL QHA..ETHI SYL QHO 2046 1257 ; UNASSIGNED # 2047 1258 ; PVALID # ETHI SYL QHWA 2048 1259 ; UNASSIGNED # 2049 125A..125D ; PVALID # ETHI SYL QHWI..ETHI SYL QH 2050 125E..125F ; UNASSIGNED # .. 2051 1260..1288 ; PVALID # ETHI SYL BA..ETHI SYL XWA 2052 1289 ; UNASSIGNED # 2053 128A..128D ; PVALID # ETHI SYL XWI..ETHI SYL XWE 2054 128E..128F ; UNASSIGNED # .. 2055 1290..12B0 ; PVALID # ETHI SYL NA..ETHI SYL KWA 2056 12B1 ; UNASSIGNED # 2057 12B2..12B5 ; PVALID # ETHI SYL KWI..ETHI SYL KWE 2058 12B6..12B7 ; UNASSIGNED # .. 2059 12B8..12BE ; PVALID # ETHI SYL KXA..ETHI SYL KXO 2060 12BF ; UNASSIGNED # 2061 12C0 ; PVALID # ETHI SYL KXWA 2062 12C1 ; UNASSIGNED # 2063 12C2..12C5 ; PVALID # ETHI SYL KXWI..ETHI SYL KX 2064 12C6..12C7 ; UNASSIGNED # .. 2065 12C8..12D6 ; PVALID # ETHI SYL WA..ETHI SYL PHAR 2066 12D7 ; UNASSIGNED # 2067 12D8..1310 ; PVALID # ETHI SYL ZA..ETHI SYL GWA 2068 1311 ; UNASSIGNED # 2069 1312..1315 ; PVALID # ETHI SYL GWI..ETHI SYL GWE 2070 1316..1317 ; UNASSIGNED # .. 2071 1318..135A ; PVALID # ETHI SYL GGA..ETHI SYL FYA 2072 135B..135C ; UNASSIGNED # .. 2073 135D..135F ; PVALID # ETHI COMB GEM AND VOW..ETHI COMB GE 2074 1360..137C ; FREE_PVAL # ETHI SECT MARK..ETHI NUM TEN THOUS 2075 137D..137F ; UNASSIGNED # .. 2076 1380..138F ; PVALID # ETHI SYL SEBATBEIT MWA..ETHI SYL PW 2077 1390..1399 ; FREE_PVAL # ETHI TON MARK YIZET..ETHI TON MARK 2078 139A..139F ; UNASSIGNED # .. 2079 13A0..13F4 ; PVALID # CHEROKEE LET A..CHEROKEE LET YV 2080 13F5..13FF ; UNASSIGNED # .. 2081 1400 ; FREE_PVAL # CANAD SYL HYPHEN 2082 1401..166C ; PVALID # CANAD SYL E..CANAD SYL CAR 2083 166D..166E ; FREE_PVAL # CANAD SYL CHI SIGN..CANAD SYLLAB 2084 166F..167F ; PVALID # CANAD SYL QAI..CANAD SYL B 2085 1680 ; FREE_PVAL # OGHAM SPACE MARK 2086 1681..169A ; PVALID # OGHAM LET BEITH..OGHAM LET PEITH 2087 169B..169C ; FREE_PVAL # OGHAM FEATHER MARK..OGHAM REV FEAT 2088 169D..169F ; UNASSIGNED # .. 
16A0..16EA  ; PVALID     # RUNIC LET FEHU FEOH FE F..RUNIC LET
16EB..16F0  ; FREE_PVAL  # RUNIC SINGLE PUNCT..RUNIC BELGTHOR
16F1..16FF  ; UNASSIGNED # ..
1700..170C  ; PVALID     # TAGALOG LET A..TAGALOG LET YA
170D        ; UNASSIGNED #
170E..1714  ; PVALID     # TAGALOG LET LA..TAGALOG SIGN VIRAMA
1715..171F  ; UNASSIGNED # ..
1720..1734  ; PVALID     # HANUNOO LET A..HANUNOO SIGN PAMUDPO
1735..1736  ; FREE_PVAL  # PHILIP SINGLE PUNCT..PHILIP DOUBLE
1737..173F  ; UNASSIGNED # ..
1740..1753  ; PVALID     # BUHID LET A..BUHID VOW SIGN U
1754..175F  ; UNASSIGNED # ..
1760..176C  ; PVALID     # TAGBANWA LET A..TAGBANWA LET YA
176D        ; UNASSIGNED #
176E..1770  ; PVALID     # TAGBANWA LET LA..TAGBANWA LET SA
1771        ; UNASSIGNED #
1772..1773  ; PVALID     # TAGBANWA VOW SIGN I..TAGBANWA VOW S
1774..177F  ; UNASSIGNED # ..
1780..17D3  ; PVALID     # KHMER LET KA..KHMER SIGN BATHAMASAT
17D4..17D6  ; FREE_PVAL  # KHMER SIGN KHAN..KHMER SIGN CAMNUC
17D7        ; PVALID     # KHMER SIGN LEK TOO
17D8..17DB  ; FREE_PVAL  # KHMER SIGN BEYYAL..KHMER CURR SYM R
17DC..17DD  ; PVALID     # KHMER SIGN AVAKRAHASANYA..KHMER SIG
17DE..17DF  ; UNASSIGNED # ..
17E0..17E9  ; PVALID     # KHMER DIG ZERO..KHMER DIG NINE
17EA..17EF  ; UNASSIGNED # ..
17F0..17F9  ; FREE_PVAL  # KHMER SYM LEK ATTAK SON..KHMER SYM
17FA..17FF  ; UNASSIGNED # ..
1800..180A  ; FREE_PVAL  # MONG BIRGA..MONG NIRUGU
180B..180D  ; PVALID     # MONG FREE VAR SEL ONE..MONG FREE VA
180E        ; FREE_PVAL  # MONG VOW SEP
180F        ; UNASSIGNED #
1810..1819  ; PVALID     # MONG DIG ZERO..MONG DIG NINE
181A..181F  ; UNASSIGNED # ..
1820..1877  ; PVALID     # MONG LET A..MONG LET MANCHU
1878..187F  ; UNASSIGNED # ..
1880..18AA  ; PVALID     # MONG LET ALI GALI ANUSVARA ONE..MON
18AB..18AF  ; UNASSIGNED # ..
18B0..18F5  ; PVALID     # CAN SYL OY..CAN SYL CA
18F6..18FF  ; UNASSIGNED # ..
1900..191C  ; PVALID     # LIMBU VOW-CARRIER LET..LIMBU LET HA
191D..191F  ; UNASSIGNED # ..
1920..192B  ; PVALID     # LIMBU VOW SIGN A..LIMBU SUBJOIN LET
192C..192F  ; UNASSIGNED # ..
1930..193B  ; PVALID     # LIMBU SM LET KA..LIMBU SIGN SA-I
193C..193F  ; UNASSIGNED # ..
1940        ; FREE_PVAL  # LIMBU SIGN LOO
1941..1943  ; UNASSIGNED # ..
1944..1945  ; FREE_PVAL  # LIMBU EXCLAM MARK..LIMBU QUEST MARK
1946..196D  ; PVALID     # LIMBU DIG ZERO..TAI LE LET AI
196E..196F  ; UNASSIGNED # ..
1970..1974  ; PVALID     # TAI LE LET TONE-2..TAI LE LET TONE-
1975..197F  ; UNASSIGNED # ..
1980..19AB  ; PVALID     # NEW TAI LUE LET HIGH QA..NEW TAI LU
19AC..19AF  ; UNASSIGNED # ..
19B0..19C9  ; PVALID     # NEW TAI LUE VOW SIGN VOW SHORT..NEW
19CA..19CF  ; UNASSIGNED # ..
19D0..19D9  ; PVALID     # NEW TAI LUE DIG ZERO..NEW TAI DIG N
19DA        ; FREE_PVAL  # NEW TAI LUE THAM
19DB..19DD  ; UNASSIGNED # ..
19DE..19FF  ; FREE_PVAL  # NEW TAI LUE SIGN LAE..KHMER SYM DAP
1A00..1A1B  ; PVALID     # BUGIN LET KA..BUGIN VOW SIGN AE
1A1C..1A1D  ; UNASSIGNED # ..
1A1E..1A1F  ; FREE_PVAL  # BUGIN PALLAWA..BUGIN END OF SECTION
1A20..1A5E  ; PVALID     # TAI THAM LET HIGH KA..TAI THAM CONS
1A5F        ; UNASSIGNED #
1A60..1A7C  ; PVALID     # TAI THAM SIGN SAKOT..TAI THAM SIGN
1A7D..1A7E  ; UNASSIGNED # ..
1A7F..1A89  ; PVALID     # TAI THAM COMB CRYPT DOT..TAI THAM D
1A8A..1A8F  ; UNASSIGNED # ..
1A90..1A99  ; PVALID     # TAI THAM THAM DIG ZERO..TAI THAM TH
1A9A..1A9F  ; UNASSIGNED # ..
1AA0..1AA6  ; FREE_PVAL  # TAI THAM SIGN WIANG..TAI THAM SIGN
1AA7        ; PVALID     # TAI THAM SIGN MAI YAMOK
1AA8..1AAD  ; FREE_PVAL  # TAI THAM SIGN KAAN..TAI THAM SIGN C
1AAE..1AFF  ; UNASSIGNED # ..
1B00..1B4B  ; PVALID     # BAL SIGN ULU RICEM..BAL LET ASYURA
1B4C..1B4F  ; UNASSIGNED # ..
1B50..1B59  ; PVALID     # BAL DIG ZERO..BAL DIG NINE
1B5A..1B6A  ; FREE_PVAL  # BAL PANTI..BAL MUS SYM DANG
1B6B..1B73  ; PVALID     # BAL MUS SYM COMB TEGEH..BAL MUS
1B74..1B7C  ; FREE_PVAL  # BAL MUS SYM RIGHT-HAND OPEN DUG
1B7D..1B7F  ; UNASSIGNED # ..
1B80..1BF3  ; PVALID     # SUND SIGN PANYECEK..BATAK PANONGONAN
1BF4..1BFB  ; UNASSIGNED # ..
1BFC..1BFF  ; FREE_PVAL  # BATAK SYM BINDU NA METEK..BATAK SYM
1C00..1C37  ; PVALID     # LEPCHA LET KA..LEPCHA SIGN NUKTA
1C38..1C3A  ; UNASSIGNED # ..
1C3B..1C3F  ; FREE_PVAL  # LEPCHA PUNCT TA-ROL..LEPCHA PUNCT T
1C40..1C49  ; PVALID     # LEPCHA DIG ZERO..LEPCHA DIG NINE
1C4A..1C4C  ; UNASSIGNED # ..
1C4D..1C7D  ; PVALID     # LEPCHA LET TTA..OL CHIKI AHAD
1C7E..1C7F  ; FREE_PVAL  # OL CHIKI PUNCT MUCAAD..OL CHIKI PUN
1C80..1C9F  ; UNASSIGNED # ..
1CC0..1CC7  ; FREE_PVAL  # SUNDA PUNCT BINDU SURYA..SUNDA PUNC
1CC8..1CCF  ; UNASSIGNED # ..
1CD0..1CD2  ; PVALID     # VED TONE KARSHANA..VED TONE PRENKHA
1CD3        ; FREE_PVAL  # VED SIGN NIHSHVASA
1CD4..1CF6  ; PVALID     # VED SIGN YAJURVEDIC MID SVARITA..VE
1CF7..1CFF  ; UNASSIGNED # ..
1D00..1D2B  ; PVALID     # LAT LET SM CAP A..CYR LET SM
1D2C..1D2E  ; FREE_PVAL  # MOD LET CAP A..MOD LET C
1D2F        ; PVALID     # MOD LET CAP BARRED B
1D30..1D3A  ; FREE_PVAL  # MOD LET CAP D..MOD LET C
1D3B        ; PVALID     # MOD LET CAP REV N
1D3C..1D4D  ; FREE_PVAL  # MOD LET CAP O..MOD LET S
1D4E        ; PVALID     # MOD LET SM TURNED I
1D4F..1D6A  ; FREE_PVAL  # MOD LET SM K..GREEK SUB SMA
1D6B..1D77  ; PVALID     # LAT SM LET UE..LAT SM LET TU
1D78        ; FREE_PVAL  # MOD LET CYR EN
1D79..1D9A  ; PVALID     # LAT SM LET INSULAR G..LAT SM LE
1D9B..1DBF  ; FREE_PVAL  # MOD LET SM TURNED ALPHA..MOD
1DC0..1DE6  ; PVALID     # COMB DOTTED GRAVE ACCENT..COMB LAT
1DE7..1DFB  ; UNASSIGNED # ..
1DFC..1DFF  ; PVALID     # COMB DOUBLE INV BREVE BEL..COMB R
1E9A        ; FREE_PVAL  # LAT SM LET A WITH R HALF RING
1E9B..1F15  ; PVALID     # LAT SM LET LONG S W DOT ABOVE..GR
1F16..1F17  ; UNASSIGNED # ..
1F18..1F1D  ; FREE_PVAL  # GREEK CAP LET EPSILON W PSILI..GRE
1F1E..1F1F  ; UNASSIGNED # ..
1F20..1F45  ; PVALID     # GREEK SM LET ETA W PSILI..GREEK SMA
1F46..1F47  ; UNASSIGNED # ..
1F48..1F4D  ; FREE_PVAL  # GREEK CAP LET OMICRON W PSILI..GRE
1F4E..1F4F  ; UNASSIGNED # ..
1F50..1F57  ; PVALID     # GREEK SM LET UPSILON W PSILI..GREEK
1F58        ; UNASSIGNED #
1F59        ; PVALID     # GREEK CAP LET UPSILON W DASIA
1F5A        ; UNASSIGNED #
1F5B        ; PVALID     # GREEK CAP LET UPSILON W DASIA AND
1F5C        ; UNASSIGNED #
1F5D        ; PVALID     # GREEK CAP LET UPSILON W DASIA AND
1F5E        ; UNASSIGNED #
1F5F..1F7D  ; PVALID     # GREEK CAP LET UPSILON W DASIA A..GR
1F7E..1F7F  ; UNASSIGNED # ..
1F80..1F87  ; PVALID     # GREEK SM LET ALPHA W PSILI AND YPOG
1F88..1F8F  ; FREE_PVAL  # GREEK CAP LET ALPHA W PSILI AND..GR
1F90..1F97  ; PVALID     # GREEK SM LET ETA W PSILI AND YP..GR
1F98..1F9F  ; FREE_PVAL  # GREEK CAP LET ETA W PSILI AND P..GR
1FA0..1FA7  ; PVALID     # GREEK SM LET OMEGA W PSILI AND ..GR
1FA8..1FAF  ; FREE_PVAL  # GREEK CAP LET OMEGA W PSILI AN..GR
1FB0..1FB4  ; PVALID     # GREEK SM LET ALPHA W VRACHY..GREEK
1FB5        ; UNASSIGNED #
1FB6..1FBB  ; PVALID     # GREEK SM LET ALPHA W PERISPOMEN..GR
1FBC..1FBD  ; FREE_PVAL  # GREEK CAP LET ALPHA W PROSGEGRA..GR
1FBE        ; PVALID     # GREEK PROSGEGRAMMENI
1FBF..1FC1  ; FREE_PVAL  # GREEK PSILI..GREEK DIALYTIKA AND PE
1FC2..1FC4  ; PVALID     # GREEK SM LET ETA W VARIA AND YP..GR
1FC5        ; UNASSIGNED #
1FC6..1FCB  ; PVALID     # GREEK SM LET ETA W PERISPOMENI..GR
1FCC..1FCF  ; FREE_PVAL  # GREEK CAP LET ETA W PROSGEGRAM..GR
1FD0..1FD3  ; PVALID     # GREEK SM LET IOTA W VRACHY..GREEK S
1FD4..1FD5  ; UNASSIGNED # ..
1FD6..1FDB  ; PVALID     # GREEK SM LET IOTA W PERISPOMENI..GR
1FDC        ; UNASSIGNED #
1FDD..1FDF  ; FREE_PVAL  # GREEK DASIA AND VARIA..GREEK DASIA
1FE0..1FEC  ; PVALID     # GREEK SM LET UPSILON W VRACHY..GREE
1FED..1FEF  ; FREE_PVAL  # GREEK DIALYTIKA AND VARIA..GREEK VA
1FF0..1FF1  ; UNASSIGNED # ..
1FF2..1FF4  ; FREE_PVAL  # GREEK SM LET OMEGA W VARIA AND YPOG
1FF5        ; UNASSIGNED #
1FF6..1FFB  ; PVALID     # GREEK SM LET OMEGA W PERISPOMEN..GR
1FFC..1FFE  ; FREE_PVAL  # GREEK CAP LET OMEGA W PROSGEGRA..GR
1FFF        ; UNASSIGNED #
2000..200A  ; FREE_PVAL  # EN QUAD..HAIR SPACE
200B        ; DISALLOWED # ZERO WIDTH SPACE
200C..200D  ; CONTEXTJ   # ZERO WIDTH NON-JOINER..ZERO WIDTH J
200E..200F  ; DISALLOWED # LEFT-TO-RIGHT MARK..RIGHT-TO-LEFT M
2010..2027  ; FREE_PVAL  # HYPHEN..HYPHENATION POINT
2028..202E  ; DISALLOWED # LINE SEP..RIGHT-TO-LEFT OVERRIDE
202F..205F  ; FREE_PVAL  # NARROW NO-BREAK SPACE..MED MATH SP
2060..2064  ; DISALLOWED # WORD JOINER..INVISIBLE PLUS
2065..2069  ; UNASSIGNED # ..
206A..206F  ; DISALLOWED # INHIBIT SYMM SWAP..NOM DIGIT SHAPES
2070..2071  ; FREE_PVAL  # SUPER ZERO..SUPER LAT SM LET I
2072..2073  ; UNASSIGNED # ..
2074..208E  ; FREE_PVAL  # SUPER FOUR..SUB RIGHT PARENTHESIS
208F        ; UNASSIGNED #
2090..209C  ; FREE_PVAL  # LAT SUB SM LET A..LAT SUB SM LET T
209D..209F  ; UNASSIGNED # ..
20A0..20B9  ; FREE_PVAL  # EURO-CURRENCY SIGN..INDIAN RUPEE SI
20BA..20CF  ; UNASSIGNED # ..
20D0..20DC  ; PVALID     # COMB LEFT HARPOON ABOVE..COMB FOUR
20DD..20E0  ; FREE_PVAL  # COMB ENC CIRC..COMB ENC CIRC BACKS
20E1        ; PVALID     # COMB L R ARROW ABOVE
20E2..20E4  ; FREE_PVAL  # COMB ENC SCREEN..COMB ENC UPWARD PO
20E5..20F0  ; PVALID     # COMB REV SOLIDUS OVERLAY..COMB ASTE
20F1..20FF  ; UNASSIGNED # ..
2100..2129  ; FREE_PVAL  # ACCOUNT OF..TURNED GREEK SM LET IOT
212A..212B  ; PVALID     # KELVIN SIGN..ANGSTROM SIGN
212C..2131  ; FREE_PVAL  # SCRIPT CAP C..SCRIPT CAP F
2132        ; PVALID     # TURNED CAP F
2133..214D  ; FREE_PVAL  # SCRIPT CAP M..AKTIESELSKAB
214E        ; PVALID     # TURNED SM F
214F..2182  ; DISALLOWED # SYM FOR SAMAR SOURCE..ROM NUM TEN T
2183..2184  ; PVALID     # ROM NUM REV ONE HUNDRED..LAT SM LET
2185..2189  ; FREE_PVAL  # ROM NUM SIX LATE FORM..VULGAR FRACT
218A..218F  ; UNASSIGNED # ..
2190..23F3  ; FREE_PVAL  # LEFTWARDS ARROW..HOURGLASS WITH FLO
23F4..23FF  ; UNASSIGNED # ..
2400..2426  ; FREE_PVAL  # SYM FOR NULL..SYM FOR SUB FORM
2427..243F  ; UNASSIGNED # ..
2440..244A  ; FREE_PVAL  # OCR HOOK..OCR DOUBLE BACKSLASH
244B..245F  ; UNASSIGNED # ..
2460..26FF  ; FREE_PVAL  # CIRCLED DIG ONE..WHITE FLAG W HORIZ
2700        ; UNASSIGNED #
2701..2B4C  ; FREE_PVAL  # UP BLADE SCISSORS..RIGHTWARDS ARROW
2B4D..2B4F  ; UNASSIGNED # ..
2B50..2B59  ; FREE_PVAL  # WHITE MEDIUM STAR..HEAVY CIRCLED SA
2B5A..2BFF  ; UNASSIGNED # ..
2C00..2C2E  ; PVALID     # GLAG CAP LET AZU..GLAG CA
2C2F        ; UNASSIGNED #
2C30..2C5E  ; PVALID     # GLAG SM LET AZU..GLAG SMAL
2C5F        ; UNASSIGNED #
2C60..2C7B  ; PVALID     # LAT CAP LET L W DOUBLE BAR..LAT SM
2C7C..2C7D  ; FREE_PVAL  # LAT SUB SM LET J..MOD LET CAP V
2C7E..2CE4  ; PVALID     # LAT CAP LET S W SWASH TAIL..COPT SY
2CE5..2CEA  ; FREE_PVAL  # COPT SYM MI RO..COPT SYM SHIMA SIMA
2CEB..2CF3  ; PVALID     # COPT CAP LET CRYPTOGRAMMIC SHEI..CO
2CF4..2CF8  ; UNASSIGNED # ..
2CF9..2CFF  ; FREE_PVAL  # COPT OLD NUB FULL STOP..COPT MORPHO
2D00..2D25  ; PVALID     # GEORG SM LET AN..GEORG SM LET
2D26        ; UNASSIGNED #
2D27        ; PVALID     # GEORG SM LET YN
2D28..2D2C  ; UNASSIGNED # ..
2D2D        ; PVALID     # GEORG SM LET AEN
2D2E..2D2F  ; UNASSIGNED # ..
2D30..2D67  ; PVALID     # TIFINAGH LET YA..TIFINAGH LETTER YO
2D68..2D6E  ; UNASSIGNED # ..
2D6F..2D70  ; PVALID     # TIFINAGH MOD LET LABIALIZATION MARK
2D71..2D7E  ; UNASSIGNED # ..
2D7F..2D96  ; PVALID     # TIFINAGH CONS JOINER..ETHI SYL GGW
2D97..2D9F  ; UNASSIGNED # ..
2DA0..2DA6  ; PVALID     # ETHI SYL SSA..ETHI SYL SSO
2DA7        ; UNASSIGNED #
2DA8..2DAE  ; PVALID     # ETHI SYL CCA..ETHI SYL CCO
2DAF        ; UNASSIGNED #
2DB0..2DB6  ; PVALID     # ETHI SYL ZZA..ETHI SYL ZZO
2DB7        ; UNASSIGNED #
2DB8..2DBE  ; PVALID     # ETHI SYL CCHA..ETHI SYL CC
2DBF        ; UNASSIGNED #
2DC0..2DC6  ; PVALID     # ETHI SYL QYA..ETHI SYL QYO
2DC7        ; UNASSIGNED #
2DC8..2DCE  ; PVALID     # ETHI SYL KYA..ETHI SYL KYO
2DCF        ; UNASSIGNED #
2DD0..2DD6  ; PVALID     # ETHI SYL XYA..ETHI SYL XYO
2DD7        ; UNASSIGNED #
2DD8..2DDE  ; PVALID     # ETHI SYL GYA..ETHI SYL GYO
2DDF        ; UNASSIGNED #
2DE0..2DFF  ; PVALID     # COMB CYR LET BE..COMB CYRI
2E00..2E2E  ; FREE_PVAL  # RIGHT ANGLE SUB MARK..REV QUEST MAR
2E2F        ; PVALID     # VERT TILDE
2E30..2E3B  ; FREE_PVAL  # RING PNT..THREE-EM DASH
2E3C..2E7F  ; UNASSIGNED # ..
2E80..2E99  ; FREE_PVAL  # CJK RAD REPEAT..CJK RAD RAP
2E9A        ; UNASSIGNED #
2E9B..2EF3  ; FREE_PVAL  # CJK RAD CHOKE..CJK RAD C-SIMPLIFIED
2EF4..2EFF  ; UNASSIGNED # ..
2F00..2FD5  ; FREE_PVAL  # KANGXI RAD ONE..KANGXI RAD FLUTE
2FD6..2FEF  ; UNASSIGNED # ..
2FF0..2FFB  ; FREE_PVAL  # IDEO DESC CHAR LEFT TO RIGHT..IDEO
2FFC..2FFF  ; UNASSIGNED # ..
3000..3004  ; FREE_PVAL  # IDEO SPACE..JAPAN INDUST STAND
3005..3007  ; PVALID     # IDEO ITER MARK..IDEO NUMB ZERO
3008..3029  ; FREE_PVAL  # LEFT ANGLE BRACKET..HANGZH NUM NINE
302A..302D  ; PVALID     # IDEO LEVEL TONE MARK..IDEO ENT
302E..302F  ; FREE_PVAL  # HANGUL SING DOT TONE MARK..WAVY DAS
3031..3035  ; DISALLOWED # VERT KANA REP MARK..VERT KANA REP M
3036..303A  ; FREE_PVAL  # CIRCLED POSTAL MARK..HANGZH NUM THI
303B        ; DISALLOWED # VERT IDEO ITER MARK
303C        ; PVALID     # MASU MARK
303D..303F  ; DISALLOWED # PART ALTER MARK..IDEO HALF FILL
3040        ; UNASSIGNED #
3041..3096  ; PVALID     # HIRAGANA LET SM A..HIRAGANA LET SMA
3097..3098  ; UNASSIGNED # ..
3099..309A  ; PVALID     # COMB KAT-HIR VOICED SOUND
309B..309C  ; FREE_PVAL  # KAT-HIR VOICED SOUND MARK..KAT-HIR
309D..309E  ; PVALID     # HIRAGANA ITER MARK..HIRAGANA VOICED
309F..30A0  ; FREE_PVAL  # HIRAGANA DIGRAPH YORI..KAT-HIR DOU
30A1..30FA  ; PVALID     # KATAKANA LET SM A..KATAKANA LET VO
30FB        ; CONTEXTO   # KATAKANA MIDDLE DOT
30FC..30FE  ; PVALID     # KAT-HIR PROLONGED SOUND MARK..KATA
30FF        ; FREE_PVAL  # KATAKANA DIGRAPH KOTO
3100..3104  ; UNASSIGNED # ..
3105..312D  ; PVALID     # BOPOMOFO LET B..BOPOMOFO LET IH
312E..3130  ; UNASSIGNED # ..
3131..3163  ; FREE_PVAL  # HANGUL LET KIYEOK..HANGUL LET I
3164        ; DISALLOWED # HANGUL FILLER
3165..318E  ; FREE_PVAL  # HANGUL LET SSANGNIEUN..HANGUL LET
318F        ; UNASSIGNED #
3190..319F  ; FREE_PVAL  # IDEO ANNO LINK MARK..IDEO ANNO MAN
31A0..31BA  ; PVALID     # BOPOMOFO LET BU..BOPOMOFO LET ZY
31BB..31BF  ; UNASSIGNED # ..
31C0..31E3  ; FREE_PVAL  # CJK STROKE T..CJK STROKE Q
31E4..31EF  ; UNASSIGNED # ..
31F0..31FF  ; PVALID     # KATAKANA LET SM KU..KATAKANA LET SM
3200..321E  ; FREE_PVAL  # PAREN HANGUL KIYEOK..PAREN KOREAN C
321F        ; UNASSIGNED #
3220..32FE  ; FREE_PVAL  # PAREN IDEO ONE..CIRCLED KATAKANA WO
32FF        ; UNASSIGNED #
3300..33FF  ; FREE_PVAL  # SQUARE APAATO..SQUARE GAL
3400..4DB5  ; PVALID     #
4DB6..4DBF  ; UNASSIGNED # ..
4DC0..4DFF  ; FREE_PVAL  # HEX FOR THE CREATIVE HEAVEN..HEX FO
4E00..9FCC  ; PVALID     #
9FCE..9FFF  ; UNASSIGNED # ..
A000..A48C  ; PVALID     # YI SYL IT..YI SYL YYR
A48D..A48F  ; UNASSIGNED # ..
A490..A4C6  ; FREE_PVAL  # YI RAD QOT..YI RAD KE
A4C7..A4CF  ; UNASSIGNED # ..
A4D0..A4FD  ; PVALID     # LISU LET BA..LISU LET TONE MYA JEU
A4FE..A4FF  ; FREE_PVAL  # LISU PUNCT COMMA..LISU PUNCT FUL
A500..A60C  ; PVALID     # VAI SYL EE..VAI SYL LENENER
A60D..A60F  ; FREE_PVAL  # VAI COMMA..VAI QUEST MARK
A610..A62B  ; PVALID     # VAI SYL NDOLE FA..VAI SYL NDOLE DO
A62C..A63F  ; UNASSIGNED # ..
A640..A66F  ; PVALID     # CYR CAP LET ZEMLYA..COMB CYR VZMET
A670..A673  ; FREE_PVAL  # COMB CYR TEN MILLIONS SIGN..SLAVON
A674..A67D  ; PVALID     # COMB CYR KAVYKA..COMB CYR PAYEROK
A67E        ; FREE_PVAL  # CYR KAVYKA
A67F..A697  ; PVALID     # CYR PAYEROK..CYR SM LET SHWE
A698..A69E  ; UNASSIGNED # ..
A69F..A6E5  ; PVALID     # COMB CYR LET IOTIFIED E..BAMUM LET
A6E6..A6EF  ; FREE_PVAL  # BAMUM LET MO..BAMUM LET KOGHOM
A6F0..A6F1  ; PVALID     # BAMUM COMB MARK KOQNDON..BAMUM COMB
A6F2..A6F7  ; FREE_PVAL  # BAMUM NJAEMLI..BAMUM QUEST MARK
A6F8..A6FF  ; UNASSIGNED # ..
A700..A716  ; FREE_PVAL  # MOD LET CHIN TONE YIN PING..MOD
A717..A71F  ; PVALID     # MOD LET DOT VERT BAR..MOD L
A720..A721  ; FREE_PVAL  # MOD LET STRESS AND HIGH TONE..MOD
A722..A76F  ; PVALID     # LAT CAP LET EGYPT ALEF..LAT SM LET
A770        ; FREE_PVAL  # MODIFIER LETTER US
A771..A788  ; PVALID     # LATIN SMALL LETTER DUM..MOD LET LOW
A789..A78A  ; FREE_PVAL  # MOD LET COLON..MOD LET SH EQUALS SI
A78B..A78E  ; PVALID     # LAT SM LET SALTILLO..LAT SM LET L W
A78F        ; UNASSIGNED #
A790..A793  ; PVALID     # LAT CAP LET N W DESC..LAT SM LET C
A794..A79F  ; UNASSIGNED # ..
A7A0..A7AA  ; PVALID     # LAT CAP LET G W OBLIQUE STROKE..LAT
A7AB..A7F7  ; UNASSIGNED # ..
A7F8..A7F9  ; FREE_PVAL  # MOD LET CAP H W STROKE..MOD LET SM
A7FA..A827  ; PVALID     # LAT LET SM CAP TURNED M..SYLOTI NA
A828..A82B  ; FREE_PVAL  # SYLOTI NAGRI POET MARK-1..SYLOTI NA
A82C..A82F  ; UNASSIGNED # ..
A830..A839  ; FREE_PVAL  # N INDIC FRACT ONE QUART..N INDIC QU
A83A..A83F  ; UNASSIGNED # ..
A840..A873  ; PVALID     # PHAGS-PA LET KA..PHAGS-PA LET CANDR
A874..A877  ; FREE_PVAL  # PHAGS-PA SINGLE HEAD MARK..PHAGS-PA
A878..A87F  ; UNASSIGNED # ..
A880..A8C4  ; PVALID     # SAUR SIGN ANUSVARA..SAUR SIGN VIRAM
A8C5..A8CD  ; UNASSIGNED # ..
A8CE..A8CF  ; FREE_PVAL  # SAUR DANDA..SAUR DOUBLE DANDA
A8D0..A8D9  ; PVALID     # SAUR DIG ZERO..SAUR DIG NINE
A8DA..A8DF  ; UNASSIGNED # ..
A8E0..A8F7  ; PVALID     # COMB DEVAN DIG ZERO..DEVAN SIGN CAN
A8F8..A8FA  ; FREE_PVAL  # DEVAN SIGN PUSHPIKA..DEVAN CARET
A8FB        ; PVALID     # DEVAN HEADSTROKE
A8FC..A8FF  ; UNASSIGNED # ..
A900..A92D  ; PVALID     # KAYAH LI DIG ZERO..KAYAH LI TONE CA
A92E..A92F  ; FREE_PVAL  # KAYAH LI SIGN CWI..KAYAH LI SIGN SH
A930..A953  ; PVALID     # REJANG LET KA..REJANG VIRAMA
A954..A95E  ; UNASSIGNED # ..
A95F        ; FREE_PVAL  # REJANG SECTION MARK
A960..A97C  ; DISALLOWED # HANGUL CHO TIKEUT-MIUEM..HANGUL CHO
A97D..A97F  ; UNASSIGNED # ..
A980..A9C0  ; PVALID     # JAV SIGN PANYANGGA..JAV PANGKON
A9C1..A9CD  ; FREE_PVAL  # JAV LEFT RERENGGAN..JAV TURNED PADA
A9CE        ; UNASSIGNED #
A9CF..A9D9  ; PVALID     # JAV PANGRANGKEP..JAV DIG NINE
A9DA..A9DD  ; UNASSIGNED # ..
A9DE..A9DF  ; FREE_PVAL  # JAV PADA TIRTA TUMETES..JAV PADA I
A9E0..A9FF  ; UNASSIGNED # ..
AA00..AA36  ; PVALID     # CHAM LET A..CHAM CONS SIGN WA
AA37..AA3F  ; UNASSIGNED # ..
AA40..AA4D  ; PVALID     # CHAM LET FIN K..CHAM CONS SIGN FIN
AA4E..AA4F  ; UNASSIGNED # ..
AA50..AA59  ; PVALID     # CHAM DIG ZERO..CHAM DIG NINE
AA5A..AA5B  ; UNASSIGNED # ..
AA5C..AA5F  ; FREE_PVAL  # CHAM PUNCT SPIRAL..CHAM PUNCT TR
AA60..AA76  ; PVALID     # MYAN LET KHAMTI GA..MYAN LOGOGRAM K
AA77..AA79  ; FREE_PVAL  # MYAN SYM AITON EXCLAM..MYAN SYM AIT
AA7A..AA7B  ; PVALID     # MYAN LET AITON RA..MYAN SIGN PAO KA
AA7C..AA7F  ; UNASSIGNED # ..
AA80..AAC2  ; PVALID     # TAI VIET LET LOW KO..TAI VIET TONE
AAC3..AADA  ; UNASSIGNED # ..
AADB..AADD  ; PVALID     # TAI VIET SYM KON..TAI VIET SYM SAM
AADE..AADF  ; FREE_PVAL  # TAI VIET SYM HO HOI..TAI VIET SYM K
AAE0..AAEF  ; PVALID     # MEETEI MAYEK LET E..MEETEI MAYEK VO
AAF0..AAF1  ; FREE_PVAL  # MEETEI MAYEK CHEIKHAN..MEETEI MAYEK
AAF2..AAF6  ; PVALID     # MEETEI MAYEK ANJI..MEETEI MAYEK VIR
AAF7..AB00  ; UNASSIGNED # ..
AB01..AB06  ; PVALID     # ETHI SYL TTHU..ETHI SYL TTHO
AB07..AB08  ; UNASSIGNED # ..
AB09..AB0E  ; PVALID     # ETHI SYL DDHAA..ETHI SYL DDHO
AB0F..AB10  ; UNASSIGNED # ..
AB11..AB16  ; PVALID     # ETHI SYL DZU..ETHI SYL DZO
AB17..AB1F  ; UNASSIGNED # ..
AB20..AB26  ; PVALID     # ETHI SYL CCHHA..ETHI SYL CCHHO
AB27        ; UNASSIGNED # ..
AB28..AB2E  ; PVALID     # ETHI SYL BBAA..ETHI SYL BBO
AB2F..ABBF  ; UNASSIGNED # ..
ABC0..ABEA  ; PVALID     # MEETEI MAYEK LET KOK..MEETEI MAYEK
ABEB        ; FREE_PVAL  # MEETEI MAYEK CHEIKHEI
ABEC..ABED  ; PVALID     # MEETEI MAYEK LUM IYEK..MEETEI MAYEK
ABEE..ABEF  ; UNASSIGNED # ..
ABF0..ABF9  ; PVALID     # MEETEI MAYEK DIG ZERO..MEETEI MAYEK
ABFA..ABFF  ; UNASSIGNED # ..
AC00..D7A3  ; PVALID     #
D7A4..D7AF  ; UNASSIGNED # ..
D7B0..D7C6  ; DISALLOWED # HANGUL JUNG O-YEO..HANGUL JUNG ARAE
D7C7..D7CA  ; UNASSIGNED # ..
D7CB..D7FB  ; DISALLOWED # HANGUL JONG NIEUN-RIEUL..HANGUL JON
D7FC..D7FF  ; UNASSIGNED # ..
D800..F8FF  ; DISALLOWED #
F900..FA6D  ; PVALID     # CJK COMP IDEO-F900..CJK COMP IDEO
FA6E..FA6F  ; UNASSIGNED # ..
FA70..FAD9  ; FREE_PVAL  # CJK COMP IDEO-FA70..CJK COMP IDEO
FADA..FAFF  ; UNASSIGNED # ..
FB00..FB06  ; FREE_PVAL  # LAT SM LIG FF..LAT SM LIG ST
FB07..FB12  ; UNASSIGNED # ..
FB13..FB17  ; FREE_PVAL  # ARMENIAN SM LIG MEN NOW..ARMENIAN SM
FB18..FB1C  ; UNASSIGNED # ..
FB1D..FB1F  ; PVALID     # HEBR LET YOD W HIRIQ..HEBR LIG YID Y
FB20..FB29  ; FREE_PVAL  # HEBR LET ALT AYIN..HEB LET ALT PLUS
FB2A..FB36  ; PVALID     # HEBR LET SHIN W SHIN DOT..HEBR LET Z
FB37        ; UNASSIGNED #
FB38..FB3C  ; FREE_PVAL  # HEBR LET TET W DAGESH..HEBR LET
FB3D        ; UNASSIGNED #
FB3E        ; FREE_PVAL  # HEBR LET MEM W DAGESH
FB3F        ; UNASSIGNED #
FB40..FB41  ; FREE_PVAL  # HEBR LET NUN W DAGESH..HEBR LET
FB42        ; UNASSIGNED #
FB43..FB44  ; FREE_PVAL  # HEBR LET FIN PE W DAGESH..HEBR L
FB45        ; UNASSIGNED #
FB46..FB4E  ; PVALID     # HEBR LET TSADI W DAGESH..HEBR LET P
FB4F..FBC1  ; FREE_PVAL  # HEBR LIG ALEF LAMED..ARAB SYM S
FBC2..FBD2  ; UNASSIGNED # ..
FBD3..FD3F  ; FREE_PVAL  # ARAB LET NG ISO FORM..ORNATE RIGHT
FD40..FD4F  ; UNASSIGNED # ..
FD50..FD8F  ; FREE_PVAL  # ARAB LIG TEH W JEEM W MEEM INIT
FD90..FD91  ; UNASSIGNED # ..
FD92..FDC7  ; FREE_PVAL  # ARAB LIG MEEM W JEEM W KHAH INI
FDC8..FDEF  ; UNASSIGNED # ..
FDF0..FDFD  ; FREE_PVAL  # ARAB LIG SALLA USED..ARAB LIG BISMI
FDFE..FDFF  ; UNASSIGNED # ..
FE00..FE0F  ; PVALID     # VAR SEL-1..VAR SEL-16
FE10..FE19  ; FREE_PVAL  # PRES FORM FOR VERT COMMA..PRES FORM
FE20..FE26  ; PVALID     # COMB LIG LEFT HALF..COMB CONJ MACRO
FE27..FE2F  ; UNASSIGNED # ..
FE30..FE52  ; FREE_PVAL  # PRES FORM FOR VERT TWO DOT LEAD..SM
FE53        ; UNASSIGNED #
FE54..FE66  ; FREE_PVAL  # SM SEMICOLON..SM EQUALS SIGN
FE67        ; UNASSIGNED #
FE68..FE6B  ; FREE_PVAL  # SM REV SOLIDUS..SM COMM AT
FE6C..FE6F  ; UNASSIGNED # ..
FE70..FE72  ; FREE_PVAL  # ARAB FATHATAN ISO FORM..ARAB DAMMAT
FE73        ; PVALID     # ARAB TAIL FRAGMENT
FE74        ; FREE_PVAL  # ARAB KASRATAN ISO FORM
FE75        ; UNASSIGNED #
FE76..FEFC  ; FREE_PVAL  # ARAB FATHA ISO FORM..ARAB LIG LAM W
FEFD..FEFE  ; UNASSIGNED # ..
FEFF        ; DISALLOWED # ZERO WIDTH NO-BREAK SPACE
FF00        ; UNASSIGNED #
FF01..FF9F  ; FREE_PVAL  # FULLW EXCLAM MARK..HALFW KATA SE
FFA0        ; DISALLOWED # HALFW HANGUL FILLER
FFA1..FFBE  ; FREE_PVAL  # HALFW HANGUL LET KIYEOK..HALFW H
FFBF..FFC1  ; UNASSIGNED # ..
FFC2..FFC7  ; FREE_PVAL  # HALFW HANGUL LET A..HALFW HANGUL
FFC8..FFC9  ; UNASSIGNED # ..
FFCA..FFCF  ; FREE_PVAL  # HALFW HANGUL LET YEO..HALFW HANGU
FFD0..FFD1  ; UNASSIGNED # ..
FFD2..FFD7  ; FREE_PVAL  # HALFW HANGUL LET YO..HALFW HANGUL
FFD8..FFD9  ; UNASSIGNED # ..
FFDA..FFDC  ; FREE_PVAL  # HALFW HANGUL LET EU..HALFW HANGUL
FFDD..FFDF  ; UNASSIGNED # ..
FFE0..FFE6  ; FREE_PVAL  # FULLW CENT SIGN..FULLW WON SIGN
FFE7        ; UNASSIGNED #
FFE8..FFEE  ; FREE_PVAL  # HALFW FORMS LIGHT VERT..HALFW WH
FFEF..FFF8  ; UNASSIGNED # ..
FFF9..FFFB  ; DISALLOWED # INTERL ANNO ANCHOR..INTERL ANNO TER
FFFC..FFFD  ; FREE_PVAL  # OBJECT REPL CHAR..REPL CHAR
FFFE..FFFF  ; UNASSIGNED # ..
10000..1000B; PVALID     # LIN B SYL B008 A..LIN B SYL
1000C       ; UNASSIGNED #
1000D..10026; PVALID     # LIN B SYL B036 JO..LIN B SYL
10027       ; UNASSIGNED #
10028..1003A; PVALID     # LIN B SYL B060 RA..LIN B SYL
1003B       ; UNASSIGNED #
1003C..1003D; PVALID     # LIN B SYL B017 ZA..LIN B SYL
1003E       ; UNASSIGNED #
1003F..1004D; PVALID     # LIN B SYL B020 ZO..LIN B SYL
1004E..1004F; UNASSIGNED # ..
10050..1005D; PVALID     # LIN B SYM B018..LIN B SYM B089
1005E..1007F; UNASSIGNED # ..
10080..100FA; PVALID     # LIN B IDEO B100 MAN..LIN B IDEO
100FB..100FF; UNASSIGNED # ..
10100..10102; FREE_PVAL  # AEG WORD SEP LINE..AEG CHECK MAR
10103..10106; UNASSIGNED # ..
10107..10133; FREE_PVAL  # AEG NUM ONE..AEG NUM NINETY THOU
10134..10136; UNASSIGNED # ..
10137..1018A; FREE_PVAL  # AEG WEIGHT BASE UNIT..GREEK ZERO SI
1018B..1018F; UNASSIGNED # ..
10190..1019B; FREE_PVAL  # ROM SEXTANS SIGN..ROM CENTURIAL SIG
1019C..101CF; UNASSIGNED # ..
101D0..101FC; FREE_PVAL  # PHAISTOS DISC SIGN PED..PHAISTOS DI
101FD       ; PVALID     # PHAISTOS DISC SIGN COMB OBLIQUE STR
101FE..1027F; UNASSIGNED # ..
10280..1029C; PVALID     # LYCIAN LET A..LYCIAN LET X
1029D..1029F; UNASSIGNED # ..
102A0..102D0; PVALID     # CARIAN LET A..CARIAN LET UUU3
102D1..102FF; UNASSIGNED # ..
10300..1031E; PVALID     # OLD ITAL LET A..OLD ITAL LET UU
1031F       ; UNASSIGNED #
10320..10323; FREE_PVAL  # OLD ITAL NUM ONE..OLD ITAL NUM F
10324..1032F; UNASSIGNED # ..
10330..10340; PVALID     # GOTH LET AHSA..GOTH LET PAIRTHRA
10341       ; FREE_PVAL  # GOTH LET NINETY
10342..10349; PVALID     # GOTH LET RAIDA..GOTH LET OTHAL
1034A       ; FREE_PVAL  # GOTH LET NINE HUNDRED
1034B..1037F; UNASSIGNED # ..
10380..1039D; PVALID     # UGAR LET ALPA..UGAR LET SSU
1039E       ; UNASSIGNED #
1039F       ; FREE_PVAL  # UGAR WORD DIVIDER
103A0..103C3; PVALID     # OLD PERS SIGN A..OLD PERS SIGN HA
103C4..103C7; UNASSIGNED # ..
103C8..103CF; PVALID     # OLD PERS SIGN AURAMAZDAA..OLD PERS
103D0..103D5; FREE_PVAL  # OLD PERS WORD DIVIDER..OLD PERS NUM
103D6..103FF; UNASSIGNED # ..
10400..1049D; PVALID     # DESERET CAP LET LONG I..OSMANYA LET
1049E..1049F; UNASSIGNED # ..
104A0..104A9; PVALID     # OSMANYA DIG ZERO..OSMANYA DIG NINE
104AA..107FF; UNASSIGNED # ..
10800..10805; PVALID     # CYPRIOT SYL A..CYPRIOT SYL JA
10806..10807; UNASSIGNED # ..
10808       ; PVALID     # CYPRIOT SYL JO
10809       ; UNASSIGNED #
1080A..10835; PVALID     # CYPRIOT SYL KA..CYPRIOT SYL WO
10836       ; UNASSIGNED #
10837..10838; PVALID     # CYPRIOT SYL XA..CYPRIOT SYL XE
10839..1083B; UNASSIGNED # ..
1083C       ; PVALID     # CYPRIOT SYL ZA
1083D..1083E; UNASSIGNED # ..
1083F..10855; PVALID     # CYPRIOT SYL ZO..IMP ARAM LET TAW
10856       ; UNASSIGNED #
10857..1085F; FREE_PVAL  # IMP ARAM SECT SIGN..IMP ARAM
10860..108FF; UNASSIGNED # ..
10900..10915; PVALID     # PHOEN LET ALF..PHOEN LET TAU
10916..1091B; FREE_PVAL  # PHOEN NUM ONE..PHOEN NUM THR
1091C..1091E; UNASSIGNED # ..
1091F       ; FREE_PVAL  # PHOEN WORD SEP
10920..10939; PVALID     # LYDIAN LET A..LYDIAN LET C
1093A..1093E; UNASSIGNED # ..
1093F       ; FREE_PVAL  # LYDIAN TRIANGULAR MARK
10940..1097F; UNASSIGNED # ..
10980..109B7; PVALID     # MERO HIER LET A..MERO CURS LET
109B8..109BD; UNASSIGNED # ..
109BE..109BF; PVALID     # MERO CURS LOG RMT..MERO CURS L
109C0..109FF; UNASSIGNED # ..
10A00..10A03; PVALID     # KHARO LET A..KHARO VOW SIGN V
10A04       ; UNASSIGNED #
10A05..10A06; PVALID     # KHARO VOW SIGN E..KHARO VOW SI
10A07..10A0B; UNASSIGNED # ..
10A0C..10A13; PVALID     # KHARO VOW LEN MARK..KHARO LET
10A14       ; UNASSIGNED #
10A15..10A17; PVALID     # KHARO LET CA..KHARO LET JA
10A18       ; UNASSIGNED #
10A19..10A33; PVALID     # KHARO LET NYA..KHARO LET TTT
10A34..10A37; UNASSIGNED # ..
10A38..10A3A; PVALID     # KHARO SIGN BAR ABOVE..KHARO SIGN D
10A3B..10A3E; UNASSIGNED # ..
10A3F       ; PVALID     # KHARO VIRAMA
10A40..10A47; FREE_PVAL  # KHARO DIG ONE..KHARO NUM ONE
10A48..10A4F; UNASSIGNED # ..
10A50..10A58; FREE_PVAL  # KHARO PUNCT DOT..KHARO PUNCT
10A59..10A5F; UNASSIGNED # ..
10A60..10A7C; PVALID     # OLD S ARAB LET HE..OLD SOUTH ARAB
10A7D..10A7F; FREE_PVAL  # OLD S ARAB NUM ONE..OLD SOUTH ARAB
10A80..10AFF; UNASSIGNED # ..
10B00..10B35; PVALID     # AVESTAN LET A..AVESTAN LET HE
10B36..10B38; UNASSIGNED # ..
10B39..10B3F; FREE_PVAL  # AVESTAN ABBR MARK..LARGE ONE RING O
10B40..10B55; PVALID     # INSCRIPT PARTHIAN LET ALEPH..INSCRI
10B56..10B57; UNASSIGNED # ..
10B58..10B5F; FREE_PVAL  # INSCRIPT PARTHIAN NUM ONE..INSCRIPT
10B60..10B72; PVALID     # INSCRIPT PAHLAVI LET ALEPH..INSCRIP
10B73..10B77; UNASSIGNED # ..
10B78..10B7F; FREE_PVAL  # INSCRIPT PAHLAVI NUM ONE..INSCRIPT
10B80..10BFF; UNASSIGNED # ..
10C00..10C48; PVALID     # OLD TURK LET ORKHON A..OLD TURK LET
10C49..10E5F; UNASSIGNED # ..
10E60..10E7E; FREE_PVAL  # RUMI DIG ONE..RUMI FRACTION TWO THI
10E7F..10FFF; UNASSIGNED # ..
11000..11046; PVALID     # BRAHMI SIGN CANDRABINDU..BRAHMI VIR
11047..1104D; FREE_PVAL  # BRAHMI DANDA..BRAHMI PUNCT LOTUS
1104E..11051; UNASSIGNED # ..
11052..11065; FREE_PVAL  # BRAHMI NUM ONE..BRAHMI NUM ONE THOU
11066..1106F; PVALID     # BRAHMI DIG ZERO..BRAHMI DIG NINE
11070..1107F; UNASSIGNED # ..
11080..110BA; PVALID     # KAITHI SIGN CANDRABINDU..KAITHI SIG
110BB..110BC; FREE_PVAL  # KAITHI ABBR SIGN..KAITHI ENUM SIGN
110BD       ; DISALLOWED # KAITHI NUM SIGN
110BE..110C1; FREE_PVAL  # KAITHI SECT MARK..KAITHI DOUBLE DAN
110C2..110CF; UNASSIGNED # ..
110D0..110E8; PVALID     # SORA SOMPENG LETTER SAH..SORA SOMPE
110E9..110EF; UNASSIGNED # ..
110F0..110F9; PVALID     # SORA SOMPENG DIG ZERO..SORA SOMPENG DI
110FA..110FF; UNASSIGNED # ..
11100..11134; PVALID     # CHAKMA SIGN CANDRABINDU..CHAKMA MAAYY
11135       ; UNASSIGNED #
11136..1113F; PVALID     # CHAKMA DIG ZERO..CHAKMA DIG NINE
11140..11143; FREE_PVAL  # CHAKMA SECT MARK..CHAKMA QUEST MARK
11144..1117F; UNASSIGNED # ..
11180..111C4; PVALID     # SHARADA SIGN CANDRABINDU..SHARADA OM
111C5..111C8; FREE_PVAL  # SHARADA DANDA..SHARADA SEPARATOR
111C9..111CF; UNASSIGNED # ..
111D0..111D9; PVALID     # SHARADA DIG ZERO..SHARADA DIG NINE
111DA..1167F; UNASSIGNED # ..
11680..116B7; PVALID     # TAKRI LET A..TAKRI SIGN NUKTA
116B8..116BF; UNASSIGNED # ..
116C0..116C9; PVALID     # TAKRI DIGIT ZERO..TAKRI DIG NINE
116CA..11FFF; UNASSIGNED # ..
12000..1236E; PVALID     # CUNEI SIGN A..CUNEI SIGN ZUM
1236F..123FF; UNASSIGNED # ..
12400..12462; FREE_PVAL  # CUNEI NUM SIGN TWO ASH..CUNEI NUM
12463..1246F; UNASSIGNED # ..
12470..12473; FREE_PVAL  # CUNEI PUNCT SIGN OLD ASSYRIAN WORD
12474..12FFF; UNASSIGNED # ..
13000..1342E; PVALID     # EGYPT HIERO A001..EGYPT HIERO AA032
1342F..167FF; UNASSIGNED # ..
16800..16A38; PVALID     # BAMUM LET PHASE-A NGKUE MFON..BAMUM LE
16A39..16EFF; UNASSIGNED # ..
16F00..16F44; PVALID     # MIAO LET PA..MIAO LET HHA
16F45..16F4F; UNASSIGNED # ..
16F50..16F7E; PVALID     # MIAO LET NAS..MIAO VOWEL SIGN NG
16F7F..16F8E; UNASSIGNED # ..
16F8F..16F9F; PVALID     # MIAO TONE RIGHT..MIAO LET REF TON
16FA0..1AFFF; UNASSIGNED # ..
1B000..1B001; PVALID     # KATA LET ARCH E..KATA LET ARCH YE
1B002..1CFFF; UNASSIGNED # ..
1D000..1D0F5; FREE_PVAL  # BYZ MUS SYM PSILI..BYZ MUS
1D0F6..1D0FF; UNASSIGNED # ..
1D100..1D126; FREE_PVAL  # MUS SYM SINGLE BARLINE..MUS SYMBOL
1D127..1D128; UNASSIGNED # ..
1D129..1D164; FREE_PVAL  # MUS SYM MULT MEASURE REST..MUS SYM ONE
1D165..1D169; PVALID     # MUS SYM COMB STEM..MUS SYM COMB TREMOL
1D16A..1D16C; FREE_PVAL  # MUS SYM FING TREM-1..MUS SYM FING TREM
1D16D..1D172; PVALID     # MUS SYM COMB AUG DOT..MUS SYM COMB FL
1D173..1D17A; DISALLOWED # MUS SYM BEGIN BEAM..MUS SYM END PHRASE
1D17B..1D182; PVALID     # MUS SYM COMB ACCENT..MUS SYM COMB LOUR
1D183..1D184; FREE_PVAL  # MUS SYM ARP UP..MUS SYM ARP DOWN
1D185..1D18B; PVALID     # MUS SYM COMB DOIT..MUS SYM COMB TRIPLE
1D18C..1D1A9; FREE_PVAL  # MUS SYM RINFORZANDO..MUS SYM DEG SLASH
1D1AA..1D1AD; PVALID     # MUS SYM COMB DOWN BOW..MUS SYM COMB SN
1D1AE..1D1DD; FREE_PVAL  # MUS SYM PEDAL MARK..MUS SYM PES SUBPUN
1D1DE..1D1FF; UNASSIGNED # ..
1D200..1D241; FREE_PVAL  # GREEK VOCAL NOTATION SYM-1..GREEK INS
1D242..1D244; FREE_PVAL  # COMB GREEK MUS TRISEME..COMB GREEK MU
1D245       ; FREE_PVAL  # GREEK MUSICAL LEIMMA
1D246..1D2FF; UNASSIGNED # ..
1D300..1D356; DISALLOWED # MONOG FOR EARTH..TETRAG FOR FOSTERING
1D357..1D35F; UNASSIGNED # ..
1D360..1D371; DISALLOWED # COUNT ROD UNIT DIG ONE..COUNT ROD TE
1D372..1D3FF; UNASSIGNED # ..
1D400..1D454; FREE_PVAL  # MATH BOLD CAP A..MATH IT
1D455       ; UNASSIGNED #
1D456..1D49C; FREE_PVAL  # MATH ITAL SM I..MATH SC
1D49D       ; UNASSIGNED #
1D49E..1D49F; FREE_PVAL  # MATH SCRIPT CAP C..MATH
1D4A0..1D4A1; UNASSIGNED # ..
1D4A2       ; FREE_PVAL  # MATH SCRIPT CAP G
1D4A3..1D4A4; UNASSIGNED # ..
1D4A5..1D4A6; FREE_PVAL  # MATH SCRIPT CAP J..MATH
1D4A7..1D4A8; UNASSIGNED # ..
1D4A9..1D4AC; FREE_PVAL  # MATH SCRIPT CAP N..MATH
1D4AD       ; UNASSIGNED #
1D4AE..1D4B9; FREE_PVAL  # MATH SCRIPT CAP S..MATH
1D4BA       ; UNASSIGNED #
1D4BB       ; FREE_PVAL  # MATH SCRIPT SM F
1D4BC       ; UNASSIGNED #
1D4BD..1D4C3; FREE_PVAL  # MATH SCRIPT SM H..MATH SC
1D4C4       ; UNASSIGNED #
1D4C5..1D505; FREE_PVAL  # MATH SCRIPT SM P..MATH FR
1D506       ; UNASSIGNED #
1D507..1D50A; FREE_PVAL  # MATH FRAKTUR CAP D..MATH
1D50B..1D50C; UNASSIGNED # ..
1D50D..1D514; FREE_PVAL  # MATH FRAKTUR CAP J..MATH
1D515       ; UNASSIGNED #
1D516..1D51C; FREE_PVAL  # MATH FRAKTUR CAP S..MATH
1D51D       ; UNASSIGNED #
1D51E..1D539; FREE_PVAL  # MATH FRAKTUR SM A..MATH D
1D53A       ; UNASSIGNED #
1D53B..1D53E; FREE_PVAL  # MATH DOUBLE-STRUCK CAP D..MATHEM
1D53F       ; UNASSIGNED #
1D540..1D544; FREE_PVAL  # MATH DOUBLE-STRUCK CAP I..MATHEM
1D545       ; UNASSIGNED #
1D546       ; FREE_PVAL  # MATH DOUBLE-STRUCK CAP O
1D547..1D549; UNASSIGNED # ..
1D54A..1D550; FREE_PVAL  # MATH DOUBLE-STRUCK CAP S..MATHEM
1D551       ; UNASSIGNED #
1D552..1D6A5; FREE_PVAL  # MATH DOUBLE-STRUCK SM A..MATHEMAT
1D6A6..1D6A7; UNASSIGNED # ..
1D6A8..1D7CB; FREE_PVAL  # MATH BOLD CAP ALPHA..MATHEMATICA
1D7CC..1D7CD; UNASSIGNED # ..
1D7CE..1D7FF; FREE_PVAL  # MATH BOLD DIG ZERO..MATH M
1D800..1EDFF; UNASSIGNED # ..
1EE00..1EE03; FREE_PVAL  # ARAB MATH ALEF..ARAB MATH DAL
1EE04       ; UNASSIGNED #
1EE05..1EE1F; FREE_PVAL  # ARAB MATH WAW..ARAB MATH DOTLESS QAF
1EE20       ; UNASSIGNED #
1EE21..1EE22; FREE_PVAL  # ARAB MATH INIT BEH..ARAB MATH INIT JEE
1EE23       ; UNASSIGNED #
1EE24       ; FREE_PVAL  # ARAB MATH INIT HEH
1EE25..1EE26; UNASSIGNED # ..
1EE27       ; FREE_PVAL  # ARAB MATH INIT HAH
1EE28       ; UNASSIGNED #
1EE29..1EE32; FREE_PVAL  # ARAB MATH INIT YEH..ARAB MATH INIT QAF
1EE33       ; UNASSIGNED #
1EE34..1EE37; FREE_PVAL  # ARAB MATH INIT SHEEN..ARAB MATH INITIA
1EE38       ; UNASSIGNED #
1EE39       ; FREE_PVAL  # ARAB MATH INIT DAD
1EE3A       ; UNASSIGNED #
1EE3B       ; FREE_PVAL  # ARAB MATH INIT GHAIN
1EE3C..1EE41; UNASSIGNED # ..
1EE42       ; FREE_PVAL  # ARAB MATH TAILED JEEM
1EE43..1EE46; UNASSIGNED # ..
1EE47       ; FREE_PVAL  # ARAB MATH TAILED HAH
1EE48       ; UNASSIGNED #
1EE49       ; FREE_PVAL  # ARAB MATH TAILED YEH
1EE4A       ; UNASSIGNED #
1EE4B       ; FREE_PVAL  # ARAB MATH TAILED LAM
1EE4C       ; UNASSIGNED #
1EE4D..1EE4F; FREE_PVAL  # ARAB MATH TAILED NOON..ARAB MATH TAILE
1EE50       ; UNASSIGNED #
1EE51..1EE52; FREE_PVAL  # ARAB MATH TAILED QAF..ARAB MATH TAILED
1EE53       ; UNASSIGNED #
1EE54       ; FREE_PVAL  # ARAB MATH TAILED SHEEN
1EE55..1EE56; UNASSIGNED # ..
1EE57       ; FREE_PVAL  # ARAB MATH TAILED KHAH
1EE58       ; UNASSIGNED #
1EE59       ; FREE_PVAL  # ARAB MATH TAILED DAD
1EE5A       ; UNASSIGNED #
1EE5B       ; FREE_PVAL  # ARAB MATH TAILED GHAIN
1EE5C       ; UNASSIGNED #
1EE5D       ; FREE_PVAL  # ARAB MATH TAILED DOTLESS NOON
1EE5E       ; UNASSIGNED #
1EE5F       ; FREE_PVAL  # ARAB MATH TAILED DOTLESS QAF
1EE60       ; UNASSIGNED #
1EE61..1EE62; FREE_PVAL  # ARAB MATH STRETCHED BEH..ARAB MATH STR
1EE63       ; UNASSIGNED #
1EE64       ; FREE_PVAL  # ARAB MATH STRETCHED HEH
1EE65..1EE66; UNASSIGNED # ..
1EE67..1EE6A; FREE_PVAL  # ARAB MATH STRETCHED HAH..ARAB MATH STR
1EE6B       ; UNASSIGNED #
1EE6C..1EE72; FREE_PVAL  # ARAB MATH STRETCHED MEEM..ARAB MATH ST
1EE73       ; UNASSIGNED #
1EE74..1EE77; FREE_PVAL  # ARAB MATH STRETCHED SHEEN..ARAB MATH S
1EE78       ; UNASSIGNED #
1EE79..1EE7C; FREE_PVAL  # ARAB MATH STRETCHED DAD..ARAB MATH STR
1EE7D       ; UNASSIGNED #
1EE7E       ; FREE_PVAL  # ARAB MATH STRETCHED DOTLESS FEH
1EE7F       ; UNASSIGNED #
1EE80..1EE89; FREE_PVAL  # ARAB MATH LOOPED ALEF..ARAB MATH LOOPE
1EE8A       ; UNASSIGNED #
1EE8B..1EE9B; FREE_PVAL  # ARAB MATH LOOPED LAM..ARAB MATH LOOPED
1EE9C..1EEA0; UNASSIGNED # ..
1EEA1..1EEA3; FREE_PVAL  # ARAB MATH DOUBLE-STRUCK BEH..ARAB MATH
1EEA4       ; UNASSIGNED #
1EEA5..1EEA9; FREE_PVAL  # ARAB MATH DOUBLE-STRUCK WAW..ARAB MATH
1EEAA       ; UNASSIGNED #
1EEAB..1EEBB; FREE_PVAL  # ARAB MATH DOUBLE-STRUCK LAM..ARAB MATH
1EEBC..1EEEF; UNASSIGNED # ..
1EEF0..1EEF1; FREE_PVAL  # ARAB MATH OP MEEM W HAH W TATWHEEL..AR
1EEF2..1EFFF; UNASSIGNED # ..
1F000..1F02B; FREE_PVAL  # MAHJONG TILE EAST WIND..MAHJONG TILE B
1F02C..1F02F; UNASSIGNED # ..
1F030..1F093; FREE_PVAL  # DOMINO TILE HORIZ BACK..DOMINO TILE VE
1F094..1F09F; UNASSIGNED # ..
1F0A0..1F0AE; FREE_PVAL  # PLAY CARD BACK..PLAY CARD KING OF SPAD
1F0AF..1F0B0; UNASSIGNED # ..
1F0B1..1F0BE; FREE_PVAL  # PLAY CARD ACE OF HEARTS..PLAY CARD KIN
1F0BF..1F0C0; UNASSIGNED # ..
1F0C1..1F0CF; FREE_PVAL  # PLAY CARD ACE OF DIAMONDS..PLAY CARD B
1F0D0       ; UNASSIGNED #
1F0D1..1F0DF; FREE_PVAL  # PLAY CARD ACE OF CLUBS..PLAY CARD WHIT
1F0E0..1F0FF; UNASSIGNED # ..
1F100..1F10A; FREE_PVAL  # DIG ZERO FULL STOP..DIG NINE COMMA
1F10B..1F10F; UNASSIGNED # ..
1F110..1F12E; FREE_PVAL  # PARENTHESIZED LAT CAP LET A..CIRCLE
1F12F       ; UNASSIGNED #
1F130..1F16B; FREE_PVAL  # SQUARED LAT CAP LET A..RAISED MD SIGN
1F16C..1F16F; UNASSIGNED # ..
1F170..1F19A; FREE_PVAL  # NEG SQ LAT CAP LET A..SQUARED VS
1F19B..1F1E5; UNASSIGNED # ..
1F1E6..1F202; FREE_PVAL  # REG IND SYMB LET A..SQ KATAKANA SA
1F203..1F20F; UNASSIGNED # ..
1F210..1F23A; FREE_PVAL  # SQ CJK UNIF IDEO-624B..SQ CJK UNIF IDE
1F23B..1F23F; UNASSIGNED # ..
1F240..1F248; FREE_PVAL  # TORT SH BRACK CJK UNIF IDEO-672C..TORT
1F249..1F24F; UNASSIGNED # ..
1F250..1F251; FREE_PVAL  # CIRC IDEO ADVANTAGE..CIRC IDEO ACCEPT
1F252..1F2FF; UNASSIGNED # ..
1F300..1F320; FREE_PVAL  # CYCLONE..SHOOTING STAR
1F321..1F32F; UNASSIGNED # ..
1F330..1F335; FREE_PVAL  # CHESTNUT..CACTUS
1F336       ; UNASSIGNED #
1F337..1F37C; FREE_PVAL  # TULIP..BABY BOTTLE
1F37D..1F37F; UNASSIGNED # ..
1F380..1F393; FREE_PVAL  # RIBBON..GRADUATION CAP
1F394..1F39F; UNASSIGNED # ..
1F3A0..1F3C4; FREE_PVAL  # CAROUSEL HORSE..SURFER
1F3C5       ; UNASSIGNED #
1F3C6..1F3CA; FREE_PVAL  # TROPHY..SWIMMER
1F3CB..1F3DF; UNASSIGNED # ..
1F3E0..1F3F0; FREE_PVAL  # HOUSE BUILDING..EUROPEAN CASTLE
1F3F1..1F3FF; UNASSIGNED # ..
1F400..1F43E; FREE_PVAL  # RAT..PAW PRINTS
1F43F       ; UNASSIGNED #
1F440       ; FREE_PVAL  # EYES
1F441       ; UNASSIGNED #
1F442..1F4F7; FREE_PVAL  # EAR..CAMERA
1F4F8       ; UNASSIGNED #
1F4F9..1F4FC; FREE_PVAL  # VIDEO CAMERA..VIDEOCASSETTE
1F4FD..1F4FF; UNASSIGNED # ..
1F500..1F53D; FREE_PVAL  # TWISTED RIGHTWARDS ARROWS..DOWN-POINTI
1F53E..1F53F; UNASSIGNED # ..
1F540..1F543; FREE_PVAL  # CIRCLED CROSS POMMEE..NOTCHED LEFT SEM
1F544..1F54F; UNASSIGNED # ..
1F550..1F567; FREE_PVAL  # CLOCK FACE ONE OCLOCK..CLOCK FACE TWEL
1F568..1F5FA; UNASSIGNED # ..
1F5FB..1F640; FREE_PVAL  # MOUNT FUJI..WEARY CAT FACE
1F641..1F644; UNASSIGNED # ..
1F645..1F64F; FREE_PVAL  # FACE WITH NO GOOD GESTURE..PERSON W FO
1F650..1F67F; UNASSIGNED # ..
1F680..1F6C5; FREE_PVAL  # ROCKET..LEFT LUGGAGE
1F6C6..1F6FF; UNASSIGNED # ..
1F700..1F773; FREE_PVAL  # ALCHEMICAL SYMBOL FOR QUINTESSENCE..AL
1F774..1FFFF; UNASSIGNED # ..
20000..2A6D6; PVALID     #
2A6D7..2A6FF; UNASSIGNED # ..
2A700..2B734; PVALID     #
2B735..2B73F; UNASSIGNED # ..
2B740..2B81D; PVALID     #
2F800..2FA1D; FREE_PVAL  # CJK COMP IDEO-2F800..CJK COMPA
2FA1E..2FFFD; UNASSIGNED # ..
2FFFE..2FFFF; DISALLOWED # ..
30000..3FFFD; UNASSIGNED # ..
3FFFE..3FFFF; DISALLOWED # ..
40000..4FFFD; UNASSIGNED # ..
4FFFE..4FFFF; DISALLOWED # ..
50000..5FFFD; UNASSIGNED # ..
5FFFE..5FFFF; DISALLOWED # ..
60000..6FFFD; UNASSIGNED # ..
6FFFE..6FFFF; DISALLOWED # ..
70000..7FFFD; UNASSIGNED # ..
7FFFE..7FFFF; DISALLOWED # ..
80000..8FFFD; UNASSIGNED # ..
8FFFE..8FFFF; DISALLOWED # ..
90000..9FFFD; UNASSIGNED # ..
9FFFE..9FFFF; DISALLOWED # ..
A0000..AFFFD; UNASSIGNED # ..
AFFFE..AFFFF; DISALLOWED # ..
B0000..BFFFD; UNASSIGNED # ..
BFFFE..BFFFF; DISALLOWED # ..
C0000..CFFFD; UNASSIGNED # ..
CFFFE..CFFFF; DISALLOWED # ..
D0000..DFFFD; UNASSIGNED # ..
DFFFE..DFFFF; DISALLOWED # ..
E0000       ; UNASSIGNED #
E0001       ; DISALLOWED # LANGUAGE TAG
E0002..E001F; UNASSIGNED # ..
E0020..E007F; DISALLOWED # TAG SPACE..CANCEL TAG
E0080..E00FF; UNASSIGNED # ..
E0100..E01EF; PVALID     # VAR SEL-17..VAR SEL-256
E01F0..EFFFD; UNASSIGNED # ..
EFFFE..10FFFF; DISALLOWED # ..

Appendix B.  Acknowledgements

   The authors would like to acknowledge the comments and contributions
   of the following individuals: David Black, Mark Davis, Alan DeKok,
   Martin Duerst, Patrik Faltstrom, Ted Hardie, Joe Hildebrand, Paul
   Hoffman, Jeffrey Hutzelman, Simon Josefsson, John Klensin, Alexey
   Melnikov, Takahiro Nemoto, Yoav Nir, Mike Parker, Pete Resnick,
   Andrew Sullivan, Dave Thaler, and Yoshiro Yoneya.

   Some algorithms and textual descriptions have been borrowed from
   [RFC5892].  Some text regarding security has been borrowed from
   [RFC5890] and [I-D.ietf-xmpp-6122bis].

Authors' Addresses

   Peter Saint-Andre
   Cisco Systems, Inc.
   1899 Wynkoop Street, Suite 600
   Denver, CO  80202
   USA

   Phone: +1-303-308-3282
   Email: psaintan@cisco.com

   Marc Blanchet
   Viagenie
   246 Aberdeen
   Quebec, QC  G1R 2E1
   Canada

   Email: Marc.Blanchet@viagenie.ca
   URI:   http://www.viagenie.ca/