idnits 2.17.1

draft-ietf-precis-framework-15.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (March 14, 2014) is 3696 days in the past. Is this
     intentional?

  -- Found something which looks like a code comment -- if you have code
     sections in the document, please surround them with '<CODE BEGINS>' and
     '<CODE ENDS>' lines.

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  -- Looks like a reference, but probably isn't: '1' on line 1532

  -- Looks like a reference, but probably isn't: '2' on line 1534

  -- Possible downref: Non-RFC (?) normative reference: ref. 'UNICODE'

  == Outdated reference: A later version (-12) exists of
     draft-ietf-precis-mappings-07

  == Outdated reference: A later version (-19) exists of
     draft-ietf-precis-nickname-09

  == Outdated reference: A later version (-18) exists of
     draft-ietf-precis-saslprepbis-06

  == Outdated reference: A later version (-24) exists of
     draft-ietf-xmpp-6122bis-11

  -- Obsolete informational reference (is this intentional?): RFC 3454
     (Obsoleted by RFC 7564)

  -- Obsolete informational reference (is this intentional?): RFC 3490
     (Obsoleted by RFC 5890, RFC 5891)

  -- Obsolete informational reference (is this intentional?): RFC 3491
     (Obsoleted by RFC 5891)

  -- Obsolete informational reference (is this intentional?): RFC 5226
     (Obsoleted by RFC 8126)

  -- Obsolete informational reference (is this intentional?): RFC 5246
     (Obsoleted by RFC 8446)

     Summary: 0 errors (**), 0 flaws (~~), 5 warnings (==), 10 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

PRECIS                                                    P. Saint-Andre
Internet-Draft                                                      &yet
Obsoletes: 3454 (if approved)                                M. Blanchet
Intended status: Standards Track                                Viagenie
Expires: September 15, 2014                               March 14, 2014

   PRECIS Framework: Preparation and Comparison of Internationalized
                   Strings in Application Protocols
                     draft-ietf-precis-framework-15

Abstract

   Application protocols using Unicode characters in protocol strings
   need to properly prepare such strings in order to perform valid
   comparison operations (e.g., for purposes of authentication or
   authorization). This document defines a framework enabling
   application protocols to perform the preparation and comparison of
   internationalized strings ("PRECIS") in a way that depends on the
   properties of Unicode characters and thus is agile with respect to
   versions of Unicode.
   As a result, this framework provides a more
   sustainable approach to the handling of internationalized strings
   than the previous framework, known as Stringprep (RFC 3454). This
   document obsoletes RFC 3454.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on September 15, 2014.

Copyright Notice

   Copyright (c) 2014 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  String Classes
     3.1.  Overview
     3.2.  IdentifierClass
     3.3.  FreeformClass
   4.  Profiles
     4.1.  Principles
     4.2.  Building Application-Layer Constructs
     4.3.  A Note about Spaces
   5.  Order of Operations
   6.  Code Point Properties
   7.  Category Definitions Used to Calculate Derived Property
     7.1.  LetterDigits (A)
     7.2.  Unstable (B)
     7.3.  IgnorableProperties (C)
     7.4.  IgnorableBlocks (D)
     7.5.  LDH (E)
     7.6.  Exceptions (F)
     7.7.  BackwardCompatible (G)
     7.8.  JoinControl (H)
     7.9.  OldHangulJamo (I)
     7.10. Unassigned (J)
     7.11. ASCII7 (K)
     7.12. Controls (L)
     7.13. PrecisIgnorableProperties (M)
     7.14. Spaces (N)
     7.15. Symbols (O)
     7.16. Punctuation (P)
     7.17. HasCompat (Q)
     7.18. OtherLetterDigits (R)
   8.  Calculation of the Derived Property
   9.  IANA Considerations
     9.1.  PRECIS Derived Property Value Registry
     9.2.  PRECIS Base Classes Registry
     9.3.  PRECIS Profiles Registry
   10. Security Considerations
     10.1. General Issues
     10.2. Use of the IdentifierClass
     10.3. Use of the FreeformClass
     10.4. Local Character Set Issues
     10.5. Visually Similar Characters
     10.6. Security of Passwords
   11. Interoperability Considerations
   12. References
     12.1. Normative References
     12.2. Informative References
     12.3. URIs
   Appendix A.  Codepoint Table
   Appendix B.  Acknowledgements
   Authors' Addresses

1.  Introduction

   As described in the problem statement for the preparation and
   comparison of internationalized strings ("PRECIS") [RFC6885], many
   IETF protocols have used the Stringprep framework [RFC3454] as the
   basis for preparing and comparing protocol strings that contain
   Unicode characters [UNICODE] outside the ASCII range [RFC20]. The
   Stringprep framework was developed during work on the original
   technology for internationalized domain names (IDNs), here called
   "IDNA2003" [RFC3490], and Nameprep [RFC3491] was the Stringprep
   profile for IDNs.
   At the time, Stringprep was designed as a general
   framework so that other application protocols could define their own
   Stringprep profiles for the preparation and comparison of strings and
   identifiers. Indeed, a number of application protocols defined such
   profiles.

   After the publication of [RFC3454] in 2002, several significant
   issues arose with the use of Stringprep in the IDN case, as
   documented in the IAB's recommendations regarding IDNs [RFC4690]
   (most significantly, Stringprep was tied to Unicode version 3.2).
   Therefore, the newer IDNA specifications, here called "IDNA2008"
   ([RFC5890], [RFC5891], [RFC5892], [RFC5893], [RFC5894]), no longer
   use Stringprep and Nameprep. This migration away from Stringprep for
   IDNs has prompted other "customers" of Stringprep to consider new
   approaches to the preparation and comparison of internationalized
   strings, as described in [RFC6885].

   This document defines a framework for a post-Stringprep approach to
   the preparation and comparison of internationalized strings in
   application protocols, based on several principles:

   1.  Define a small set of string classes that specify the Unicode
       characters (i.e., specific "code points") appropriate for common
       application protocol constructs.

   2.  Define each PRECIS string class in terms of Unicode code points
       and their properties so that an algorithm can be used to
       determine whether each code point or character category is (a)
       valid, (b) allowed in certain contexts, (c) disallowed, or (d)
       unassigned.

   3.  Use an "inclusion model" such that a string class consists only
       of code points that are explicitly allowed, with the result that
       any code point not explicitly allowed is forbidden.

   4.  Enable application protocols to define profiles of the PRECIS
       string classes, addressing matters such as width mapping, case
       folding and other forms of character mapping, Unicode
       normalization, directionality, and further excluded code points
       or character categories.

   Whereas the string classes define the "baseline" code points for a
   range of applications, profiling enables application protocols to
   further restrict the allowable code points beyond those specified for
   the relevant string class (e.g., characters with special or reserved
   meaning, such as "@" and "/" when used as separators within
   identifiers) and to apply the string classes in ways that are
   appropriate for constructs such as usernames and passwords
   [I-D.ietf-precis-saslprepbis], nicknames [I-D.ietf-precis-nickname],
   the localparts of instant messaging addresses
   [I-D.ietf-xmpp-6122bis], and free-form strings
   [I-D.ietf-xmpp-6122bis]. Profiles are responsible for defining the
   handling of right-to-left characters as well as various mapping
   operations of the kind also discussed for IDNs in [RFC5895], such as
   case preservation or lowercasing, Unicode normalization, mapping of
   certain characters to other characters or to nothing, and mapping of
   full-width and half-width characters.

   It is expected that this framework will yield the following benefits:

   o  Application protocols will be agile with regard to Unicode
      versions.

   o  Implementers will be able to share code point tables and software
      code across application protocols, most likely by means of
      software libraries.

   o  End users will be able to acquire more accurate expectations about
      the characters that are acceptable in various contexts. Given
      this more uniform set of string classes, it is also expected that
      copy/paste operations between software implementing different
      application protocols will be more predictable and coherent.

   Although this framework is similar to IDNA2008 and borrows some of
   the character categories defined in [RFC5892], it defines additional
   character categories to meet the needs of common application
   protocols.

   The character categories and calculation rules defined under
   Section 7 and Section 8 are normative and apply to all Unicode code
   points. The code point table provided under Appendix A is non-
   normative and merely shows, for illustrative purposes, the
   consequences of the character categories and calculation rules, as
   well as the resulting property values.

2.  Terminology

   Many important terms used in this document are defined in [RFC5890],
   [RFC6365], [RFC6885], and [UNICODE]. The terms "left-to-right" (LTR)
   and "right-to-left" (RTL) are defined in Unicode Standard Annex #9
   [UAX9].

   As of the date of writing, the version of Unicode published by the
   Unicode Consortium is 6.3; however, PRECIS is not tied to a specific
   version of Unicode.

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in
   [RFC2119].

3.  String Classes

3.1.  Overview

   IDNA2008 essentially defines a string class of internationalized
   domain name (IDN), although it does not use the term "string class".
   (This document does not define a string class for domain names, and
   application protocols are strongly encouraged to use IDNA2008 as the
   appropriate method to prepare domain names and hostnames.) Because
   the IDN string class is designed to meet the particular requirements
   of the Domain Name System (DNS), additional string classes are needed
   for non-DNS applications.

   Starting in 2010, various "customers" of Stringprep began to discuss
   the need to define a post-Stringprep approach to the preparation and
   comparison of internationalized strings other than IDNs. This
   community analyzed the existing Stringprep profiles and also weighed
   the costs and benefits of defining a relatively small set of Unicode
   characters that would minimize the potential for user confusion
   caused by visually similar characters (and thus be relatively "safe")
   vs. defining a much larger set of Unicode characters that would
   maximize the potential for user creativity (and thus be relatively
   "expressive"). As a result, the community concluded that most
   existing uses could be addressed by two string classes:

   IdentifierClass:  a sequence of letters, numbers, and some symbols
      that is used to identify or address a network entity such as a
      user account, a venue (e.g., a chatroom), an information source
      (e.g., a data feed), or a collection of data (e.g., a file); the
      intent is that this class will minimize user confusion in a wide
      variety of application protocols, with the result that safety has
      been prioritized over expressiveness for this class.

   FreeformClass:  a sequence of letters, numbers, symbols, spaces, and
      other characters that is used for free-form strings, including
      passwords as well as display elements such as human-friendly
      nicknames in chatrooms; the intent is that this class will allow
      nearly any Unicode character, with the result that expressiveness
      has been prioritized over safety for this class (e.g., protocol
      designers, application developers, service providers, and end
      users might not understand or be able to enter all of the
      characters that can be included in the FreeformClass).

   Future specifications might define additional PRECIS string classes,
   such as a class that falls somewhere between the IdentifierClass and
   the FreeformClass. At this time, it is not clear how useful such a
   class would be. In any case, because application developers are able
   to define profiles of PRECIS string classes, a protocol needing a
   construct between the IdentifierClass and the FreeformClass could
   define a restricted profile of the FreeformClass if needed.

   The following subsections discuss the IdentifierClass and
   FreeformClass in more detail, with reference to the dimensions
   described in Section 3 of [RFC6885]. Each string class is defined by
   the following behavioral rules:

   Valid:  Defines which code points and character categories are
      treated as valid input to the string.

   Contextual Rule Required:  Defines which code points and character
      categories are treated as allowed only if the requirements of a
      contextual rule are met (i.e., either CONTEXTJ or CONTEXTO).

   Disallowed:  Defines which code points and character categories need
      to be excluded from the string.

   Unassigned:  Defines application behavior in the presence of code
      points that are unknown (i.e., not yet designated) for the version
      of Unicode used by the application.

   This document defines the valid, contextual rule required,
   disallowed, and unassigned rules for the IdentifierClass and
   FreeformClass. As described under Section 4, profiles of these
   string classes are responsible for defining the width mapping,
   additional mappings, case mapping, normalization, directionality, and
   exclusion rules.

3.2.  IdentifierClass

   Most application technologies need strings that can be used to refer
   to, include, or communicate protocol strings like usernames, file
   names, data feed identifiers, and chatroom names.
   We group such
   strings into a class called "IdentifierClass" having the following
   features.

3.2.1.  Valid

   o  Code points traditionally used as letters and numbers in writing
      systems, i.e., the LetterDigits ("A") category first defined in
      [RFC5892] and listed here under Section 7.1.

   o  Code points in the range U+0021 through U+007E, i.e., the
      (printable) ASCII7 ("K") rule defined under Section 7.11. These
      code points are "grandfathered" into PRECIS and thus are valid
      even if they would otherwise be disallowed according to the
      property-based rules specified in the next section.

   Informational Note: Although the PRECIS IdentifierClass re-uses
      the LetterDigits category from IDNA2008, the range of characters
      allowed in the IdentifierClass is wider than the range of
      characters allowed in IDNA2008. The main reason is that IDNA2008
      applies the Unstable category before the LetterDigits category,
      thus disallowing uppercase characters, whereas the IdentifierClass
      does not apply the Unstable category.

3.2.2.  Contextual Rule Required

   o  A number of characters from the Exceptions ("F") category defined
      under Section 7.6 (see Section 7.6 for a full list).

   o  Joining characters, i.e., the JoinControl ("H") category defined
      under Section 7.8.

3.2.3.  Disallowed

   o  Old Hangul Jamo characters, i.e., the OldHangulJamo ("I") category
      defined under Section 7.9.

   o  Control characters, i.e., the Controls ("L") category defined
      under Section 7.12.

   o  Ignorable characters, i.e., the PrecisIgnorableProperties ("M")
      category defined under Section 7.13.

   o  Space characters, i.e., the Spaces ("N") category defined under
      Section 7.14.

   o  Symbol characters, i.e., the Symbols ("O") category defined under
      Section 7.15.

   o  Punctuation characters, i.e., the Punctuation ("P") category
      defined under Section 7.16.

   o  Any character that has a compatibility equivalent, i.e., the
      HasCompat ("Q") category defined under Section 7.17. These code
      points are disallowed even if they would otherwise be valid
      according to the property-based rules specified in the previous
      section.

   o  Letters and digits other than the "traditional" letters and digits
      allowed in IDNs, i.e., the OtherLetterDigits ("R") category
      defined under Section 7.18.

3.2.4.  Unassigned

   Any code points that are not yet designated in the Unicode character
   set SHALL be considered Unassigned for purposes of the
   IdentifierClass, and a string containing such code points SHALL be
   rejected.

3.3.  FreeformClass

   Some application technologies need strings that can be used in a
   free-form way, e.g., as a password in an authentication exchange (see
   [I-D.ietf-precis-saslprepbis]) or a nickname in a chatroom (see
   [I-D.ietf-precis-nickname]). We group such things into a class
   called "FreeformClass" having the following features.

   Security Warning: Consult Section 10.6 for relevant security
      considerations when strings conforming to the FreeformClass, or a
      profile thereof, are used as passwords.

3.3.1.  Valid

   o  Traditional letters and numbers, i.e., the LetterDigits ("A")
      category first defined in [RFC5892] and listed here under
      Section 7.1.

   o  Letters and digits other than the "traditional" letters and digits
      allowed in IDNs, i.e., the OtherLetterDigits ("R") category
      defined under Section 7.18.

   o  Code points in the range U+0021 through U+007E, i.e., the
      (printable) ASCII7 ("K") rule defined under Section 7.11.

   o  Any character that has a compatibility equivalent, i.e., the
      HasCompat ("Q") category defined under Section 7.17.

   o  Space characters, i.e., the Spaces ("N") category defined under
      Section 7.14.

   o  Symbol characters, i.e., the Symbols ("O") category defined under
      Section 7.15.

   o  Punctuation characters, i.e., the Punctuation ("P") category
      defined under Section 7.16.

3.3.2.  Contextual Rule Required

   o  A number of characters from the Exceptions ("F") category defined
      under Section 7.6 (see Section 7.6 for a full list).

   o  Joining characters, i.e., the JoinControl ("H") category defined
      under Section 7.8.

3.3.3.  Disallowed

   o  Old Hangul Jamo characters, i.e., the OldHangulJamo ("I") category
      defined under Section 7.9.

   o  Control characters, i.e., the Controls ("L") category defined
      under Section 7.12.

   o  Ignorable characters, i.e., the PrecisIgnorableProperties ("M")
      category defined under Section 7.13.

3.3.4.  Unassigned

   Any code points that are not yet designated in the Unicode character
   set SHALL be considered Unassigned for purposes of the FreeformClass,
   and a string containing such code points SHALL be rejected.

4.  Profiles

4.1.  Principles

   This framework document defines the valid, contextual-rule-required,
   disallowed, and unassigned rules for the IdentifierClass and the
   FreeformClass. A profile of a PRECIS string class MUST define the
   width mapping, additional mappings (if any), case mapping,
   normalization, directionality, and exclusion rules. A profile MAY
   also restrict the allowable characters above and beyond the
   definition of the relevant PRECIS string class (but MUST NOT add as
   valid any code points or character categories that are disallowed by
   the relevant PRECIS string class). These matters are discussed in
   the following subsections.

   Profiles of the PRECIS string classes MUST register with the IANA as
   described under Section 9.3.
   It is RECOMMENDED for profile names to
   be of the form "ProfilenameBaseClass", where the "Profilename" string
   is a differentiator and "BaseClass" is the name of the PRECIS string
   class being profiled; for example, the profile of the IdentifierClass
   used for localparts of Jabber IDs in the Extensible Messaging and
   Presence Protocol (XMPP) is named "JIDlocalIdentifierClass"
   [I-D.ietf-xmpp-6122bis].

4.1.1.  Width Mapping

   The width mapping rule of a profile specifies whether width mapping
   is performed on fullwidth and halfwidth characters, and how the
   mapping is done. Typically such mapping consists of mapping
   fullwidth and halfwidth characters, i.e., code points with a
   Decomposition Type of Wide or Narrow, to their decomposition
   mappings; as an example, FULLWIDTH DIGIT ZERO (U+FF10) would be
   mapped to DIGIT ZERO (U+0030).

   The normalization form specified by a profile (see below) has an
   impact on the need for width mapping. Because width mapping is
   performed as a part of compatibility decomposition, a profile
   employing either normalization form KD (NFKD) or normalization form
   KC (NFKC) does not need to specify width mapping. However, if
   Unicode normalization form C (NFC) is used then the profile needs to
   specify whether to apply width mapping; in this case, width mapping
   is in general RECOMMENDED because allowing fullwidth and halfwidth
   characters to remain unmapped to their compatibility variants would
   violate the principle of least user surprise. For more information
   about the concept of width in East Asian scripts within Unicode, see
   Unicode Standard Annex #11 [UAX11].

4.1.2.  Additional Mappings

   The additional mappings rule of a profile specifies whether
   additional mappings are to be applied, such as mapping of delimiter
   characters and mapping of special characters (e.g., non-ASCII space
   characters to ASCII space or certain characters to nothing).

4.1.3.  Case Mapping

   The case mapping rule of a profile specifies whether case mapping is
   performed (instead of case preservation) on uppercase and titlecase
   characters, and how the mapping is done (e.g., mapping uppercase and
   titlecase characters to their lowercase equivalents).

   If case mapping is desired (instead of case preservation), it is
   RECOMMENDED to use Unicode Default Case Folding as defined in Chapter
   3 of the Unicode Standard [UNICODE].

   Informational Note: Unicode Default Case Folding is not designed
      to handle various localization issues (such as so-called "dotless
      i" in several Turkic languages). The PRECIS mappings document
      [I-D.ietf-precis-mappings] describes these issues in greater
      detail and defines a "local case mapping" method that handles some
      locale-dependent and context-dependent mappings.

   In order to maximize entropy and minimize the potential for false
   positives, it is NOT RECOMMENDED for application protocols to map
   uppercase and titlecase code points to their lowercase equivalents
   when strings conforming to the FreeformClass, or a profile thereof,
   are used in passwords; instead, it is RECOMMENDED to preserve the
   case of all code points contained in such strings and then perform
   case-sensitive comparison. See also the related discussion in
   [I-D.ietf-precis-saslprepbis].

4.1.4.  Normalization

   The normalization rule of a profile specifies which Unicode
   normalization form (D, KD, C, or KC) is to be applied (see Unicode
   Standard Annex #15 [UAX15] for background information).

   In accordance with [RFC5198], normalization form C (NFC) is
   RECOMMENDED.
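
   As a rough illustration of the width mapping, case mapping, and
   normalization rules in Sections 4.1.1, 4.1.3, and 4.1.4, the
   following Python sketch relies only on the standard unicodedata
   module. The helper names (width_map, case_map, normalize_nfc) are
   invented for this example and are not defined by PRECIS. The sketch
   assumes that width mapping replaces code points whose Decomposition
   Type is Wide or Narrow with their decomposition mappings, and that
   Python's str.casefold() (full case folding for the Unicode version
   bundled with the interpreter) is an acceptable stand-in for Unicode
   Default Case Folding.

```python
import unicodedata


def width_map(text: str) -> str:
    """Map fullwidth/halfwidth code points to their decomposition mappings."""
    out = []
    for ch in text:
        # decomposition() returns e.g. "<wide> 0030" for U+FF10, or "" if none.
        decomp = unicodedata.decomposition(ch)
        if decomp.startswith(("<wide>", "<narrow>")):
            # Skip the "<tag>" field and rebuild the mapped code point(s).
            out.append("".join(chr(int(cp, 16)) for cp in decomp.split()[1:]))
        else:
            out.append(ch)
    return "".join(out)


def case_map(text: str) -> str:
    """Case-fold the string (assumption: str.casefold() as the folding)."""
    return text.casefold()


def normalize_nfc(text: str) -> str:
    """Apply Unicode normalization form C."""
    return unicodedata.normalize("NFC", text)


if __name__ == "__main__":
    print(width_map("\uff10"))                 # FULLWIDTH DIGIT ZERO
    print(case_map("PRECIS"))
    print(normalize_nfc("e\u0301") == "\u00e9")
```

   Because NFKC already performs compatibility decomposition,
   unicodedata.normalize("NFKC", "\uff10") yields DIGIT ZERO without a
   separate width mapping step, whereas NFC leaves U+FF10 unchanged;
   this is why a profile that uses NFC has to state its width mapping
   rule explicitly.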

4.1.5.  Directionality

   The directionality rule of a profile specifies which strings are to
   be considered left-to-right (LTR) and right-to-left (RTL), and the
   allowable sequences of characters in LTR and RTL strings (see Unicode
   Standard Annex #9 [UAX9]); note that mixed-direction strings are not
   supported, since there is currently no widely accepted and
   implemented solution for the processing and display of mixed-
   direction strings. Possible rules include, but are not limited to,
   (a) considering any string that contains a right-to-left code point
   to be a right-to-left string, or (b) applying the "Bidi Rule" from
   [RFC5893].

4.1.6.  Exclusions

   The exclusions rule of a profile specifies whether the profile
   excludes additional code points or character categories above and
   beyond those excluded by the string class being profiled. That is, a
   profile MAY do either of the following:

   1.  Exclude specific code points that are allowed by the relevant
       string class.

   2.  Exclude characters matching certain Unicode properties (e.g.,
       math symbols) that are included in the relevant PRECIS string
       class.

   As a result of such exclusions, code points that are defined as valid
   for the PRECIS string class being profiled will be defined as
   disallowed for the profile.

4.2.  Building Application-Layer Constructs

   Sometimes, an application-layer construct does not map in a
   straightforward manner to one of the PRECIS string classes or a
   profile thereof. Consider, for example, the "simple user name"
   construct in the Simple Authentication and Security Layer (SASL)
   [RFC4422]. Depending on the deployment, a simple user name might
   take the form of a user's full name (e.g., the user's personal name
   followed by a space and then the user's family name).
   Such a simple
   user name cannot be defined as an instance of the IdentifierClass or
   a profile thereof, since space characters are not allowed in the
   IdentifierClass; however, it could be defined using a space-separated
   sequence of IdentifierClass instances, as in the following pseudo-
   ABNF [RFC5234]:

      fullname = namepart [1*(1*SP namepart)]
      namepart = 1*(idpoint)
      ;
      ; an "idpoint" is a UTF-8 encoded Unicode code point
      ; that conforms to the PRECIS IdentifierClass

   Similar techniques could be used to define many application-layer
   constructs, say of the form "user@domain" or "/path/to/file".

4.3.  A Note about Spaces

   With regard to the IdentifierClass, the consensus of the PRECIS
   Working Group was that spaces are problematic for many reasons,
   including:

   o  Many Unicode characters are confusable with ASCII space.

   o  Even if non-ASCII space characters are mapped to ASCII space
      (U+0020), space characters are often not rendered in user
      interfaces, leading to the possibility that a human user might
      consider a string containing spaces to be equivalent to the same
      string without spaces.

   o  In some locales, some devices are known to generate a character
      other than ASCII space (such as ZERO WIDTH JOINER, U+200D) when a
      user performs an action like hitting the space bar on a keyboard.

   One consequence of disallowing space characters in the
   IdentifierClass might be to effectively discourage the use of ASCII
   space (or, even more problematically, non-ASCII space characters)
   within identifiers created in newer application protocols; given the
   challenges involved in properly handling space characters in
   identifiers and other protocol strings, the Working Group considered
   this to be a feature, not a bug.
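
   Because the IdentifierClass excludes spaces, a construct that needs
   them can be assembled from IdentifierClass instances, as the
   "fullname" pseudo-ABNF in Section 4.2 does. The Python sketch below
   shows one way such a check might look. The names is_idpoint and
   is_fullname are hypothetical, and is_idpoint is only a coarse
   stand-in for the IdentifierClass: it accepts printable ASCII and
   NFKC-stable code points in the General_Category values assumed here
   for LetterDigits, and it ignores exceptions, contextual rules, and
   the other categories that the real class takes into account.

```python
import unicodedata

# Assumption: General_Category values treated as LetterDigits in this sketch.
LETTER_DIGIT_CATEGORIES = {"Ll", "Lu", "Lo", "Nd", "Lm", "Mn", "Mc"}


def is_idpoint(ch: str) -> bool:
    """Coarse, illustrative approximation of an IdentifierClass code point."""
    if 0x21 <= ord(ch) <= 0x7E:  # printable ASCII is always accepted
        return True
    if unicodedata.normalize("NFKC", ch) != ch:  # has a compatibility equivalent
        return False
    return unicodedata.category(ch) in LETTER_DIGIT_CATEGORIES


def is_fullname(text: str) -> bool:
    """Check text against: fullname = namepart [1*(1*SP namepart)]."""
    if text.startswith(" ") or text.endswith(" "):
        return False  # the grammar has no leading or trailing SP
    # Runs of SP between nameparts are allowed (1*SP), so drop empty pieces.
    nameparts = [part for part in text.split(" ") if part]
    if not nameparts:
        return False  # at least one namepart is required
    return all(is_idpoint(ch) for part in nameparts for ch in part)


if __name__ == "__main__":
    for candidate in ("Juliet Capulet", " Juliet", "Juliet\u00a0Capulet"):
        print(repr(candidate), is_fullname(candidate))
```

   Only U+0020 acts as the separator here; a non-ASCII space such as
   U+00A0 is treated as part of a namepart and therefore fails the
   idpoint check.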

   However, the FreeformClass does allow spaces, which enables
   application protocols to define profiles of the FreeformClass that
   are more flexible than any profiles of the IdentifierClass. In
   addition, as explained in the previous section, application protocols
   can also define application-layer constructs containing spaces.

5.  Order of Operations

   To ensure proper comparison, the following order of operations is
   REQUIRED:

   1.  Width mapping

   2.  Optionally, additional mappings such as mapping of delimiters
       (e.g., characters such as '@', ':', '/', '+', and '-') and
       special handling of certain characters or classes of characters
       (e.g., mapping of non-ASCII spaces to ASCII space or mapping of
       control characters to nothing); the PRECIS mappings document
       [I-D.ietf-precis-mappings] describes such mappings in more detail

   3.  Case mapping as described under Section 4.1.3 of this document

   4.  Normalization

   5.  Behavioral rules for determining whether a code point is valid,
       allowed under a contextual rule, disallowed, or unassigned

   As already described, the width mapping, additional mappings, case
   mapping, and normalization operations are specified for each profile,
   whereas the behavioral rules are specified for each string class.
   Some of the logic behind this order is provided under Section 4.1.1
   (see also the PRECIS mappings document [I-D.ietf-precis-mappings]).

6.  Code Point Properties

   In order to implement the string classes described above, this
   document does the following:

   1.  Reviews and classifies the collections of code points in the
       Unicode character set by examining various code point properties.

   2.  Defines an algorithm for determining a derived property value,
       which can vary depending on the string class being used by the
       relevant application protocol.
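
   The order of operations that Section 5 requires can be pictured as a
   small pipeline in which a profile supplies the four mapping steps and
   the string class supplies the final behavioral check. The Python
   sketch below is one possible shape for that pipeline, not a
   conforming implementation: Profile, prepare, map_spaces,
   reject_controls, and DEMO_PROFILE are names made up for this example,
   and the demo profile's rules are placeholders rather than any
   registered PRECIS profile.

```python
import unicodedata
from dataclasses import dataclass
from typing import Callable

Step = Callable[[str], str]


def identity(text: str) -> str:
    return text


def allow_all(ch: str) -> bool:
    return True


@dataclass
class Profile:
    """Hypothetical container for the per-profile steps of Section 5."""
    width_mapping: Step = identity
    additional_mappings: Step = identity
    case_mapping: Step = identity
    normalization: Step = identity
    is_allowed: Callable[[str], bool] = allow_all  # behavioral rules stand-in


def prepare(profile: Profile, text: str) -> str:
    """Apply steps 1-4 in the required order, then check every code point."""
    for step in (profile.width_mapping, profile.additional_mappings,
                 profile.case_mapping, profile.normalization):
        text = step(text)
    for ch in text:
        if not profile.is_allowed(ch):
            raise ValueError(f"code point U+{ord(ch):04X} is not allowed")
    return text


def map_spaces(text: str) -> str:
    """Example additional mapping: any Zs code point becomes U+0020."""
    return "".join(" " if unicodedata.category(ch) == "Zs" else ch
                   for ch in text)


def reject_controls(ch: str) -> bool:
    """Placeholder behavioral rule: refuse control characters only."""
    return unicodedata.category(ch) != "Cc"


DEMO_PROFILE = Profile(
    additional_mappings=map_spaces,
    case_mapping=str.casefold,
    normalization=lambda s: unicodedata.normalize("NFC", s),
    is_allowed=reject_controls,
)

if __name__ == "__main__":
    print(repr(prepare(DEMO_PROFILE, "Foo\u3000BAR")))
```

   Keeping the behavioral check last means it always sees the string
   after mapping and normalization, which is the form that would
   actually be compared.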
   This document is not intended to specify precisely how derived
   property values are to be applied in protocol strings. That
   information is the responsibility of the protocol specification that
   uses or profiles a PRECIS string class from this document.

   The value of the property is to be interpreted as follows.

   PROTOCOL VALID  Those code points that are allowed to be used in any
      PRECIS string class (currently, IdentifierClass and
      FreeformClass). Code points with this property value are
      permitted for general use in any string class. The abbreviated
      term "PVALID" is used to refer to this value in the remainder of
      this document.

   SPECIFIC CLASS PROTOCOL VALID  Those code points that are allowed to
      be used in specific string classes. Code points with this
      property value are permitted for use in specific string classes.
      In the remainder of this document, the abbreviated term *_PVAL is
      used, where * = (ID | FREE), i.e., either "FREE_PVAL" or
      "ID_PVAL".

   CONTEXTUAL RULE REQUIRED  Some characteristics of the character, such
      as its being invisible in certain contexts or problematic in
      others, require that it not be used in labels unless specific
      other characters or properties are present. As in IDNA2008, there
      are two subdivisions of CONTEXTUAL RULE REQUIRED, the first for
      Join_controls (called "CONTEXTJ") and the second for other
      characters (called "CONTEXTO"). A character with the derived
      property value CONTEXTJ or CONTEXTO MUST NOT be used unless an
      appropriate rule has been established and the context of the
      character is consistent with that rule. The most notable of the
      CONTEXTUAL RULE REQUIRED characters are the Join Control
      characters U+200D ZERO WIDTH JOINER and U+200C ZERO WIDTH
      NON-JOINER, which have a derived property value of CONTEXTJ. See
      Appendix A of [RFC5892] for more information.
   DISALLOWED  Those code points that are not permitted in any PRECIS
      string class.

   SPECIFIC CLASS DISALLOWED  Those code points that are not to be
      included in a specific string class. Code points with this
      property value are not permitted in one of the string classes but
      might be permitted in others. In the remainder of this document,
      the abbreviated term *_DIS is used, where * = (ID | FREE), i.e.,
      either "FREE_DIS" or "ID_DIS".

   UNASSIGNED  Those code points that are not designated (i.e. are
      unassigned) in the Unicode Standard.

   The mechanisms described here allow determination of the value of the
   property for future versions of Unicode (including characters added
   after Unicode 5.2 or 6.3 depending on the category, since some
   categories in this document are reused from IDNA2008 and therefore
   were defined at the time of Unicode 5.2). Changes in Unicode
   properties that do not affect the outcome of this process therefore
   do not affect this framework. For example, a character can have its
   Unicode General_Category value [UNICODE] change from So to Sm, or
   from Lo to Ll, without affecting the algorithm results. Moreover,
   even if such changes were to result, the BackwardCompatible list
   (Section 7.7) can be adjusted to ensure the stability of the results.

7.  Category Definitions Used to Calculate Derived Property

   The derived property obtains its value based on a two-step procedure:

   1.  Characters are placed in one or more character categories either
       (1) based on core properties defined by the Unicode Standard or
       (2) by treating the code point as an exception and addressing the
       code point based on its code point value. These categories are
       not mutually exclusive.

   2.  Set operations are used with these categories to determine the
       values for a property specific to a given string class. These
       operations are specified under Section 8.
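
   The first step of this procedure can be sketched in Python for the
   categories that this section defines purely in terms of
   General_Category, a code point range, or NFKC (LetterDigits, Spaces,
   Symbols, Punctuation, OtherLetterDigits, ASCII7, and HasCompat). The
   names GC_CATEGORIES and categories_of are hypothetical, the remaining
   categories are omitted, and results depend on the Unicode version
   bundled with the Python runtime. The sketch returns a set because a
   code point can belong to more than one category.

```python
import unicodedata

# Categories defined by General_Category membership (letters per Section 7).
GC_CATEGORIES = {
    "LetterDigits": {"Ll", "Lu", "Lm", "Lo", "Mn", "Mc", "Nd"},  # A
    "Spaces": {"Zs"},                                            # N
    "Symbols": {"Sm", "Sc", "Sk", "So"},                         # O
    "Punctuation": {"Pc", "Pd", "Ps", "Pe", "Pi", "Pf", "Po"},   # P
    "OtherLetterDigits": {"Lt", "Nl", "No", "Me"},               # R
}


def categories_of(cp):
    """Return the set of (sketched) category names that contain code point cp."""
    ch = chr(cp)
    gc = unicodedata.category(ch)
    found = {name for name, values in GC_CATEGORIES.items() if gc in values}
    if 0x21 <= cp <= 0x7E:
        found.add("ASCII7")     # K: cp is in {0021..007E}
    if unicodedata.normalize("NFKC", ch) != ch:
        found.add("HasCompat")  # Q: toNFKC(cp) != cp
    return found
```

   For instance, U+00AA FEMININE ORDINAL INDICATOR falls into both
   LetterDigits and HasCompat, which is why the order of the tests in
   the algorithm of Section 8 matters.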
      Informational Note: Unicode property names and property value
      names might have short abbreviations, such as "gc" for the
      General_Category property and "Ll" for the Lowercase_Letter
      property value of the gc property.

   In the following specification of character categories, the operation
   that returns the value of a particular Unicode character property for
   a code point is designated by using the formal name of that property
   (from the Unicode PropertyAliases.txt [1]) followed by '(cp)' for
   "code point". For example, the value of the General_Category
   property for a code point is indicated by General_Category(cp).

   The first ten categories (A-J) shown below were previously defined
   for IDNA2008 and are copied directly from [RFC5892]. Some of these
   categories are reused in PRECIS and some of them are not; however,
   the lettering of categories is retained to prevent overlap and to
   ease implementation of both IDNA2008 and PRECIS in a single software
   application. The next eight categories (K-R) are specific to PRECIS.

7.1.  LetterDigits (A)

   Note: This category is defined in [RFC5892] and copied here for use
   in PRECIS.

   A: General_Category(cp) is in {Ll, Lu, Lm, Lo, Mn, Mc, Nd}

   These rules identify characters commonly used in mnemonics and often
   informally described as "language characters".

   For more information, see Chapter 4 of the Unicode Standard
   [UNICODE].

   The categories used in this rule are:

   o  Ll - Lowercase_Letter

   o  Lu - Uppercase_Letter

   o  Lm - Modifier_Letter

   o  Lo - Other_Letter

   o  Mn - Nonspacing_Mark

   o  Mc - Spacing_Mark

   o  Nd - Decimal_Number

7.2.  Unstable (B)

   Note: This category is defined in [RFC5892] but not used in PRECIS.

7.3.  IgnorableProperties (C)

   Note: This category is defined in [RFC5892] but not used in PRECIS.
   See the "PrecisIgnorableProperties (M)" category below for a more
   inclusive category used in PRECIS identifiers.

7.4.  IgnorableBlocks (D)

   Note: This category is defined in [RFC5892] but not used in PRECIS.

7.5.  LDH (E)

   Note: This category is defined in [RFC5892] but not used in PRECIS.
   See the "ASCII7 (K)" category below for a more inclusive category
   used in PRECIS identifiers.

7.6.  Exceptions (F)

   Note: This category is defined in [RFC5892] and used in PRECIS to
   ensure consistent treatment of the relevant code points.

   F: cp is in {00B7, 00DF, 0375, 03C2, 05F3, 05F4, 0640, 0660,
                0661, 0662, 0663, 0664, 0665, 0666, 0667, 0668,
                0669, 06F0, 06F1, 06F2, 06F3, 06F4, 06F5, 06F6,
                06F7, 06F8, 06F9, 06FD, 06FE, 07FA, 0F0B, 3007,
                302E, 302F, 3031, 3032, 3033, 3034, 3035, 303B,
                30FB}

   This category explicitly lists code points for which the category
   cannot be assigned using only the core property values that exist in
   the Unicode Standard. The values are according to the table below:

   PVALID -- Would otherwise have been DISALLOWED

   00DF; PVALID     # LATIN SMALL LETTER SHARP S
   03C2; PVALID     # GREEK SMALL LETTER FINAL SIGMA
   06FD; PVALID     # ARABIC SIGN SINDHI AMPERSAND
   06FE; PVALID     # ARABIC SIGN SINDHI POSTPOSITION MEN
   0F0B; PVALID     # TIBETAN MARK INTERSYLLABIC TSHEG
   3007; PVALID     # IDEOGRAPHIC NUMBER ZERO

   CONTEXTO -- Would otherwise have been DISALLOWED

   00B7; CONTEXTO   # MIDDLE DOT
   0375; CONTEXTO   # GREEK LOWER NUMERAL SIGN (KERAIA)
   05F3; CONTEXTO   # HEBREW PUNCTUATION GERESH
   05F4; CONTEXTO   # HEBREW PUNCTUATION GERSHAYIM
   30FB; CONTEXTO   # KATAKANA MIDDLE DOT

   CONTEXTO -- Would otherwise have been PVALID

   0660; CONTEXTO   # ARABIC-INDIC DIGIT ZERO
   0661; CONTEXTO   # ARABIC-INDIC DIGIT ONE
   0662; CONTEXTO   # ARABIC-INDIC DIGIT TWO
   0663; CONTEXTO   # ARABIC-INDIC DIGIT THREE
   0664; CONTEXTO   # ARABIC-INDIC DIGIT FOUR
   0665; CONTEXTO   # ARABIC-INDIC DIGIT FIVE
   0666; CONTEXTO   # ARABIC-INDIC DIGIT SIX
   0667; CONTEXTO   # ARABIC-INDIC DIGIT SEVEN
   0668; CONTEXTO   # ARABIC-INDIC DIGIT EIGHT
   0669; CONTEXTO   # ARABIC-INDIC DIGIT NINE
   06F0; CONTEXTO   # EXTENDED ARABIC-INDIC DIGIT ZERO
   06F1; CONTEXTO   # EXTENDED ARABIC-INDIC DIGIT ONE
   06F2; CONTEXTO   # EXTENDED ARABIC-INDIC DIGIT TWO
   06F3; CONTEXTO   # EXTENDED ARABIC-INDIC DIGIT THREE
   06F4; CONTEXTO   # EXTENDED ARABIC-INDIC DIGIT FOUR
   06F5; CONTEXTO   # EXTENDED ARABIC-INDIC DIGIT FIVE
   06F6; CONTEXTO   # EXTENDED ARABIC-INDIC DIGIT SIX
   06F7; CONTEXTO   # EXTENDED ARABIC-INDIC DIGIT SEVEN
   06F8; CONTEXTO   # EXTENDED ARABIC-INDIC DIGIT EIGHT
   06F9; CONTEXTO   # EXTENDED ARABIC-INDIC DIGIT NINE

   DISALLOWED -- Would otherwise have been PVALID

   0640; DISALLOWED # ARABIC TATWEEL
   07FA; DISALLOWED # NKO LAJANYALAN
   302E; DISALLOWED # HANGUL SINGLE DOT TONE MARK
   302F; DISALLOWED # HANGUL DOUBLE DOT TONE MARK
   3031; DISALLOWED # VERTICAL KANA REPEAT MARK
   3032; DISALLOWED # VERTICAL KANA REPEAT WITH VOICED SOUND MARK
   3033; DISALLOWED # VERTICAL KANA REPEAT MARK UPPER HALF
   3034; DISALLOWED # VERTICAL KANA REPEAT WITH VOICED SOUND MARK
                      UPPER HALF
   3035; DISALLOWED # VERTICAL KANA REPEAT MARK LOWER HALF
   303B; DISALLOWED # VERTICAL IDEOGRAPHIC ITERATION MARK

7.7.  BackwardCompatible (G)

   Note: This category is defined in [RFC5892] and copied here for use
   in PRECIS. Because of how the PRECIS string classes are defined,
   only changes that would result in code points being added to or
   removed from the LetterDigits ("A") category would result in
   backward-incompatible modifications to code point assignments.
   Therefore, management of this category is handled via the processes
   specified in [RFC5892].

   G: cp is in {}

   This category includes the code points for which property values in
   versions of Unicode after 5.2 have changed in such a way that the
   derived property value would no longer be PVALID or DISALLOWED. If
   changes are made to future versions of Unicode so that code points
   might change property value from PVALID or DISALLOWED, then this
   table can be updated to keep special exception values so that the
   property values for code points stay stable.

7.8.  JoinControl (H)

   Note: This category is defined in [RFC5892] and copied here for use
   in PRECIS.

   H: Join_Control(cp) = True

   This category consists of Join Control characters (i.e., they are not
   in LetterDigits (Section 7.1) but are still required in strings under
   some circumstances).

7.9.  OldHangulJamo (I)

   Note: This category is defined in [RFC5892] and copied here for use
   in PRECIS.

   I: Hangul_Syllable_Type(cp) is in {L, V, T}

   This category consists of all conjoining Hangul Jamo (Leading Jamo,
   Vowel Jamo, and Trailing Jamo).
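
   Python's standard unicodedata module does not expose the
   Hangul_Syllable_Type property, so a sketch of the OldHangulJamo test
   has to carry its own data. The ranges below are an assumption that
   should be checked against the HangulSyllableType.txt data file for
   the Unicode version in use; the names JAMO_RANGES, hangul_jamo_type,
   and is_old_hangul_jamo are hypothetical.

```python
# (first, last, Hangul_Syllable_Type) for conjoining jamo.
# Assumed to match HangulSyllableType.txt; verify before relying on it.
JAMO_RANGES = [
    (0x1100, 0x115F, "L"),  # leading consonants
    (0xA960, 0xA97C, "L"),  # leading consonants, Extended-A
    (0x1160, 0x11A7, "V"),  # vowels
    (0xD7B0, 0xD7C6, "V"),  # vowels, Extended-B
    (0x11A8, 0x11FF, "T"),  # trailing consonants
    (0xD7CB, 0xD7FB, "T"),  # trailing consonants, Extended-B
]


def hangul_jamo_type(cp):
    """Return 'L', 'V', or 'T' for a conjoining jamo, else None."""
    for first, last, kind in JAMO_RANGES:
        if first <= cp <= last:
            return kind
    return None


def is_old_hangul_jamo(cp):
    """Category I: Hangul_Syllable_Type(cp) is in {L, V, T}."""
    return hangul_jamo_type(cp) is not None
```

   Precomposed syllables such as U+AC00 have a Hangul_Syllable_Type of
   LV or LVT and are therefore not matched by this test.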
   Elimination of conjoining Hangul Jamos from the set of PVALID
   characters results in restricting the set of Korean PVALID characters
   just to preformed, modern Hangul syllable characters. Old Hangul
   syllables, which are spelled with sequences of conjoining Hangul
   Jamos, are not PVALID for string classes.

7.10.  Unassigned (J)

   Note: This category is defined in [RFC5892] and copied here for use
   in PRECIS.

   J: General_Category(cp) is in {Cn} and
      Noncharacter_Code_Point(cp) = False

   This category consists of code points in the Unicode character set
   that are not (yet) designated. Implementers might want to keep in
   mind that the Unicode Standard distinguishes between 'unassigned code
   points' and 'unassigned characters'. The unassigned code points are
   all but (Cn - Noncharacters), whereas the unassigned characters are
   all but (Cn + Cs).

7.11.  ASCII7 (K)

   This PRECIS-specific category consists of all printable, non-space
   characters from the 7-bit ASCII range. By applying this category,
   the algorithm specified under Section 8 exempts these characters from
   other rules that might be applied during PRECIS processing, on the
   assumption that these code points are in such wide use that
   disallowing them would be counter-productive.

   K: cp is in {0021..007E}

7.12.  Controls (L)

   L: Control(cp) = True

7.13.  PrecisIgnorableProperties (M)

   This PRECIS-specific category is used to group code points that are
   discouraged from use in PRECIS string classes.
   M: Default_Ignorable_Code_Point(cp) = True or
      Noncharacter_Code_Point(cp) = True

   The definition for Default_Ignorable_Code_Point can be found in the
   DerivedCoreProperties.txt [2] file, and at the time of Unicode 6.3 is
   as follows:

      Other_Default_Ignorable_Code_Point
      + Cf (Format characters)
      + Variation_Selector
      - White_Space
      - FFF9..FFFB (Annotation Characters)
      - 0600..0604, 06DD, 070F, 110BD (exceptional Cf characters
        that should be visible)

7.14.  Spaces (N)

   This PRECIS-specific category is used to group code points that are
   space characters.

   N: General_Category(cp) is in {Zs}

7.15.  Symbols (O)

   This PRECIS-specific category is used to group code points that are
   symbols.

   O: General_Category(cp) is in {Sm, Sc, Sk, So}

7.16.  Punctuation (P)

   This PRECIS-specific category is used to group code points that are
   punctuation characters.

   P: General_Category(cp) is in {Pc, Pd, Ps, Pe, Pi, Pf, Po}

7.17.  HasCompat (Q)

   This PRECIS-specific category is used to group code points that have
   compatibility equivalents as explained in Chapter 2 and Chapter 3 of
   the Unicode Standard [UNICODE].

   Q: toNFKC(cp) != cp

   The toNFKC() operation returns the code point in normalization form
   KC. For more information, see Section 5 of Unicode Standard Annex
   #15 [UAX15].

7.18.  OtherLetterDigits (R)

   This PRECIS-specific category is used to group code points that are
   letters and digits other than the "traditional" letters and digits
   grouped under the LetterDigits (A) class (see Section 7.1).

   R: General_Category(cp) is in {Lt, Nl, No, Me}

8.  Calculation of the Derived Property

   Possible values of the derived property are:

   o  PVALID

   o  ID_PVAL

   o  FREE_PVAL

   o  CONTEXTJ

   o  CONTEXTO

   o  DISALLOWED

   o  ID_DIS

   o  FREE_DIS

   o  UNASSIGNED

      Informational Note: The value of the derived property calculated
      can depend on the string class; for example, if an identifier used
      in an application protocol is defined as profiling the PRECIS
      IdentifierClass then a space character such as U+0020 would be
      assigned to ID_DIS, whereas if an identifier is defined as
      profiling the PRECIS FreeformClass then the character would be
      assigned to FREE_PVAL. For the sake of brevity, the designation
      "FREE_PVAL" is used in the code point tables, instead of the
      longer designation "ID_DIS or FREE_PVAL". In practice, the
      derived properties ID_PVAL and FREE_DIS are not used in this
      specification, since every ID_PVAL code point is PVALID and every
      FREE_DIS code point is DISALLOWED.

   The algorithm to calculate the value of the derived property is as
   follows:

   If .cp. .in. Exceptions Then Exceptions(cp);
   Else If .cp. .in. BackwardCompatible Then BackwardCompatible(cp);
   Else If .cp. .in. Unassigned Then UNASSIGNED;
   Else If .cp. .in. ASCII7 Then PVALID;
   Else If .cp. .in. JoinControl Then CONTEXTJ;
   Else If .cp. .in. OldHangulJamo Then DISALLOWED;
   Else If .cp. .in. PrecisIgnorableProperties Then DISALLOWED;
   Else If .cp. .in. Controls Then DISALLOWED;
   Else If .cp. .in. HasCompat Then ID_DIS or FREE_PVAL;
   Else If .cp. .in. LetterDigits Then PVALID;
   Else If .cp. .in. OtherLetterDigits Then ID_DIS or FREE_PVAL;
   Else If .cp. .in. Spaces Then ID_DIS or FREE_PVAL;
   Else If .cp. .in. Symbols Then ID_DIS or FREE_PVAL;
   Else If .cp. .in. Punctuation Then ID_DIS or FREE_PVAL;
   Else DISALLOWED;

   Note: Use of the name of a rule (such as "Exceptions") implies the
   set of code points that the rule defines, whereas the same name as a
   function call (such as "Exceptions(cp)") implies the value that the
   code point has in the Exceptions table.

9.  IANA Considerations

9.1.  PRECIS Derived Property Value Registry

   IANA is requested to create a PRECIS-specific registry with the
   Derived Properties for the versions of Unicode that are released
   after (and including) version 6.3. The derived property value is to
   be calculated in cooperation with a designated expert [RFC5226]
   according to the rules specified under Section 7 and Section 8, not
   by copying the non-normative table found under Appendix A.

   The IESG is to be notified if backward-incompatible changes to the
   table of derived properties are discovered or if other problems arise
   during the process of creating the table of derived property values
   or during expert review. Changes to the rules defined under
   Section 7 and Section 8 require IETF Review.

9.2.  PRECIS Base Classes Registry

   IANA is requested to create a registry of PRECIS string classes. In
   accordance with [RFC5226], the registration policy is "RFC Required".

   The registration template is as follows:

   Base Class: [the name of the PRECIS string class]

   Description: [a brief description of the PRECIS string class and its
      intended use, e.g., "A sequence of letters, numbers, and symbols
      that is used to identify or address a network entity."]

   Specification: [the RFC number]

   The initial registrations are as follows:

   Base Class: FreeformClass.
   Description: A sequence of letters, numbers, symbols, spaces, and
      other code points that is used for free-form strings.
   Specification: RFC XXXX.
      [Note to RFC Editor: please change XXXX to
      the number issued for this specification.]

   Base Class: IdentifierClass.
   Description: A sequence of letters, numbers, and symbols that is
      used to identify or address a network entity.
   Specification: RFC XXXX. [Note to RFC Editor: please change XXXX to
      the number issued for this specification.]

9.3.  PRECIS Profiles Registry

   IANA is requested to create a registry of profiles that use the
   PRECIS string classes. In accordance with [RFC5226], the
   registration policy is "Expert Review". This policy was chosen in
   order to ease the burden of registration while ensuring that
   "customers" of PRECIS receive appropriate guidance regarding the
   sometimes complex and subtle internationalization issues related to
   profiles of PRECIS string classes.

   The registration template is as follows:

   Name: [the name of the profile]

   Applicability: [the specific protocol elements to which this profile
      applies, e.g., "Localparts in XMPP addresses."]

   Base Class: [which PRECIS string class is being profiled]

   Replaces: [the Stringprep profile that this PRECIS profile replaces,
      if any]

   Width Mapping: [the behavioral rule for handling of width, e.g.,
      "Map fullwidth and halfwidth characters to their compatibility
      variants."]

   Additional Mappings: [any additional mappings that are required or
      recommended, e.g., "Map non-ASCII space characters to ASCII
      space."]

   Case Mapping: [the behavioral rule for handling of case, e.g.,
      "Unicode Default Case Folding"]

   Normalization: [which Unicode normalization form is applied, e.g.,
      "NFC"]

   Directionality: [the behavioral rule for handling of right-to-left
      code points, e.g., "The 'Bidi Rule' defined in RFC 5893 applies."]

   Exclusions: [a brief description of the specific code points or
      character categories that are excluded, e.g., "Eight legacy
      characters in the ASCII range" or "Any character that has a
      compatibility equivalent, i.e., the HasCompat category"]

   Enforcement: [which entities enforce the rules, and when that
      enforcement occurs during protocol operations]

   Specification: [a pointer to relevant documentation, such as an RFC
      or Internet-Draft]

   In order to request a review, the registrant shall send a completed
   template to the precis@ietf.org list or its designated successor.

   Factors to focus on while defining profiles and reviewing profile
   registrations include the following:

   o  Is the problem being addressed by this profile well-defined?

   o  Does the specification define what kinds of applications are
      involved and the protocol elements to which this profile applies?

   o  Would an existing PRECIS string class or profile solve the
      problem?

   o  Is the profile clearly defined?

   o  Is the profile based on an appropriate dividing line between user
      interface (culture, context, intent, locale, device limitations,
      etc.) and the use of conformant strings in protocol elements?

   o  Are the width mapping, case mapping, additional mappings,
      normalization, exclusion, and directionality rules appropriate for
      the intended use?

   o  Does the profile explain which entities enforce the rules, and
      when such enforcement occurs during protocol operations?

   o  Does the profile reduce the degree to which human users could be
      surprised or confused by application behavior (the "principle of
      least user surprise")?

   o  Does the profile introduce any new security concerns such as those
      described under Section 10 of this document (e.g., false positives
      for authentication or authorization)?

10.  Security Considerations

10.1.  General Issues

   The security of applications that use this framework can depend in
   part on the proper preparation and comparison of internationalized
   strings.
   For example, such strings can be used to make
   authentication and authorization decisions, and the security of an
   application could be compromised if an entity providing a given
   string is connected to the wrong account or online resource based on
   different interpretations of the string.

   Specifications of application protocols that use this framework are
   encouraged to describe how internationalized strings are used in the
   protocol, including the security implications of any false positives
   and false negatives that might result from various comparison
   operations. For some helpful guidelines, refer to [RFC6943],
   [RFC5890], [UTR36], and [UTS39].

10.2.  Use of the IdentifierClass

   Strings that conform to the IdentifierClass and any profile thereof
   are intended to be relatively safe for use in a broad range of
   applications, primarily because they include only letters, digits,
   and "grandfathered" non-space characters from the ASCII range; thus
   they exclude spaces, characters with compatibility equivalents, and
   almost all symbols and punctuation marks. However, because such
   strings can still include so-called confusable characters (see
   Section 10.5), protocol designers and implementers are encouraged to
   pay close attention to the security considerations described
   elsewhere in this document.

10.3.  Use of the FreeformClass

   Strings that conform to the FreeformClass and many profiles thereof
   can include virtually any Unicode character. This makes the
   FreeformClass quite expressive, but also problematic from the
   perspective of possible user confusion.
   Protocol designers are
   hereby warned that the FreeformClass contains code points they might
   not understand, and are encouraged to profile the IdentifierClass
   wherever feasible; however, if an application protocol requires more
   code points than are allowed by the IdentifierClass, protocol
   designers are encouraged to define a profile of the FreeformClass
   that restricts the allowable code points as tightly as possible.
   (The PRECIS Working Group considered the option of allowing
   superclasses as well as profiles of PRECIS string classes, but
   decided against allowing superclasses to reduce the likelihood of
   security and interoperability problems.)

10.4.  Local Character Set Issues

   When systems use local character sets other than ASCII and Unicode,
   this specification leaves the problem of converting between the local
   character set and Unicode up to the application or local system. If
   different applications (or different versions of one application)
   implement different rules for conversions among coded character sets,
   they could interpret the same name differently and contact different
   application servers or other network entities. This problem is not
   solved by security protocols, such as Transport Layer Security (TLS)
   [RFC5246] and the Simple Authentication and Security Layer (SASL)
   [RFC4422], that do not take local character sets into account.

10.5.  Visually Similar Characters

   Some characters are visually similar and thus can cause confusion
   among humans. Such characters are often called "confusable
   characters" or "confusables".

   The problem of confusable characters is not necessarily caused by the
   use of Unicode code points outside the ASCII range.
   For example, in
   some presentations and to some individuals the string "ju1iet"
   (spelled with DIGIT ONE, U+0031, as the third character) might appear
   to be the same as "juliet" (spelled with LATIN SMALL LETTER L,
   U+006C), especially on casual visual inspection. This phenomenon is
   sometimes called "typejacking".

   However, the problem is made more serious by introducing the full
   range of Unicode code points into protocol strings. For example, the
   characters U+13DA U+13A2 U+13B5 U+13AC U+13A2 U+13AC U+13D2 from the
   Cherokee block look similar to the ASCII characters "STPETER" as they
   might appear when presented using a "creative" font family.

   In some examples of confusable characters, it is unlikely that the
   average human could tell the difference between the real string and
   the fake string. (Indeed, there is no programmatic way to
   distinguish with full certainty which is the fake string and which is
   the real string; in some contexts, the string formed of Cherokee
   characters might be the real string and the string formed of ASCII
   characters might be the fake string.) Because PRECIS-compliant
   strings can contain almost any properly-encoded Unicode code point,
   it can be relatively easy to fake or mimic some strings in systems
   that use the PRECIS framework. The fact that some strings are easily
   confused introduces security vulnerabilities of the kind that have
   also plagued the World Wide Web, specifically the phenomenon known as
   phishing.

   Despite the fact that some specific suggestions about identification
   and handling of confusable characters appear in the Unicode Security
   Considerations [UTR36] and the Unicode Security Mechanisms [UTS39],
   it is also true (as noted in [RFC5890]) that "there are no
   comprehensive technical solutions to the problems of confusable
   characters".
   Because it is impossible to map visually similar
   characters without a great deal of context (such as knowing the font
   families used), the PRECIS framework does nothing to map
   similar-looking characters together, nor does it prohibit some
   characters because they look like others.

   Nevertheless, specifications for application protocols that use this
   framework MUST describe how confusable characters can be abused to
   compromise the security of systems that use the protocol in question,
   along with any protocol-specific suggestions for overcoming those
   threats. In particular, software implementations and service
   deployments that use PRECIS-based technologies are strongly
   encouraged to define and implement consistent policies regarding the
   registration, storage, and presentation of visually similar
   characters. The following recommendations are appropriate:

   1.  An application service SHOULD define a policy that specifies the
       scripts or blocks of characters that the service will allow to be
       registered (e.g., in an account name) or stored (e.g., in a file
       name). Such a policy SHOULD be informed by the languages and
       scripts that are used to write registered account names; in
       particular, to reduce confusion, the service SHOULD forbid
       registration or storage of strings that contain characters from
       more than one script and SHOULD restrict registrations to
       characters drawn from a very small number of scripts (e.g.,
       scripts that are well-understood by the administrators of the
       service, to improve manageability).

   2.  User-oriented application software SHOULD define a policy that
       specifies how internationalized strings will be presented to a
       human user.
       Because every human user of such software has a
       preferred language or a small set of preferred languages, the
       software SHOULD gather that information either explicitly from
       the user or implicitly via the operating system of the user's
       device. Furthermore, because most languages are typically
       represented by a single script or a small set of scripts, and
       because most scripts are typically contained in one or more
       blocks of characters, the software SHOULD warn the user when
       presenting a string that mixes characters from more than one
       script or block, or that uses characters outside the normal range
       of the user's preferred language(s). (Such a recommendation is
       not intended to discourage communication across different
       communities of language users; instead, it recognizes the
       existence of such communities and encourages due caution when
       presenting unfamiliar scripts or characters to human users.)

   The challenges inherent in supporting the full range of Unicode code
   points have in the past led some to hope for a way to
   programmatically negotiate more restrictive ranges based on locale,
   script, or other relevant factors, to tag the locale associated with
   a particular string, etc. As a general-purpose internationalization
   technology, the PRECIS framework does not include such mechanisms.

10.6.  Security of Passwords

   Two goals of passwords are to maximize the amount of entropy and to
   minimize the potential for false positives. These goals can be
   achieved in part by allowing a wide range of code points and by
   ensuring that passwords are handled in such a way that code points
   are not compared aggressively. Therefore, it is NOT RECOMMENDED for
   application protocols to profile the FreeformClass for use in
   passwords in a way that removes entire categories (e.g., by
   disallowing symbols or punctuation).
   Furthermore, it is NOT
   RECOMMENDED for application protocols to map uppercase and titlecase
   code points to their lowercase equivalents in such strings; instead,
   it is RECOMMENDED to preserve the case of all code points contained
   in such strings and to compare them in a case-sensitive manner.

   That said, software implementers need to be aware that there exist
   tradeoffs between entropy and usability. For example, allowing a
   user to establish a password containing "uncommon" code points might
   make it difficult for the user to access a service when using an
   unfamiliar or constrained input device.

   Some application protocols use passwords directly, whereas others
   reuse technologies that themselves process passwords (one example of
   such a technology is the Simple Authentication and Security Layer
   [RFC4422]). Moreover, passwords are often carried by a sequence of
   protocols with backend authentication systems or data storage systems
   such as RADIUS [RFC2865] and LDAP [RFC4510]. Developers of
   application protocols are encouraged to look into reusing these
   profiles instead of defining new ones, so that end-user expectations
   about passwords are consistent no matter which application protocol
   is used.

11.  Interoperability Considerations

   Although strings that are consumed in PRECIS-based application
   protocols are often encoded using UTF-8 [RFC3629], the exact encoding
   is a matter for the application protocol that uses PRECIS, not for
   the PRECIS framework.

   It is known that some existing systems are unable to support the full
   Unicode character set, or even any characters outside the ASCII
   range. If two (or more) applications need to interoperate when
   exchanging data (e.g., for the purpose of authenticating a username
   or password), they will naturally need to have in common at least one
   coded character set (as defined by [RFC6365]).
   Establishing such a baseline is a matter for the application protocol
   that uses PRECIS, not for the PRECIS framework.

   The PRECIS framework, which is defined in terms of the latest version
   of Unicode as of the time of this writing (6.3), treats the character
   U+19DA NEW TAI LUE THAM as DISALLOWED.  Implementers need to be aware
   that this treatment is different from IDNA2008 (originally defined in
   terms of Unicode 5.2), which treats U+19DA as PVALID.

12.  References

12.1.  Normative References

   [RFC20]    Cerf, V., "ASCII format for network interchange", RFC 20,
              October 1969.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC5198]  Klensin, J. and M. Padlipsky, "Unicode Format for Network
              Interchange", RFC 5198, March 2008.

   [UNICODE]  The Unicode Consortium, "The Unicode Standard", 2013.

12.2.  Informative References

   [I-D.ietf-precis-mappings]
              Yoneya, Y. and T. NEMOTO, "Mapping characters for PRECIS
              classes", draft-ietf-precis-mappings-07 (work in
              progress), February 2014.

   [I-D.ietf-precis-nickname]
              Saint-Andre, P., "Preparation and Comparison of
              Nicknames", draft-ietf-precis-nickname-09 (work in
              progress), January 2014.

   [I-D.ietf-precis-saslprepbis]
              Saint-Andre, P. and A. Melnikov, "Username and Password
              Preparation Algorithms", draft-ietf-precis-saslprepbis-06
              (work in progress), December 2013.

   [I-D.ietf-xmpp-6122bis]
              Saint-Andre, P., "Extensible Messaging and Presence
              Protocol (XMPP): Address Format",
              draft-ietf-xmpp-6122bis-11 (work in progress), February
              2014.

   [RFC2865]  Rigney, C., Willens, S., Rubens, A., and W. Simpson,
              "Remote Authentication Dial In User Service (RADIUS)", RFC
              2865, June 2000.

   [RFC3454]  Hoffman, P. and M.
              Blanchet, "Preparation of Internationalized Strings
              ("stringprep")", RFC 3454, December 2002.

   [RFC3490]  Faltstrom, P., Hoffman, P., and A. Costello,
              "Internationalizing Domain Names in Applications (IDNA)",
              RFC 3490, March 2003.

   [RFC3491]  Hoffman, P. and M. Blanchet, "Nameprep: A Stringprep
              Profile for Internationalized Domain Names (IDN)", RFC
              3491, March 2003.

   [RFC3629]  Yergeau, F., "UTF-8, a transformation format of ISO
              10646", STD 63, RFC 3629, November 2003.

   [RFC4422]  Melnikov, A. and K. Zeilenga, "Simple Authentication and
              Security Layer (SASL)", RFC 4422, June 2006.

   [RFC4510]  Zeilenga, K., "Lightweight Directory Access Protocol
              (LDAP): Technical Specification Road Map", RFC 4510, June
              2006.

   [RFC4690]  Klensin, J., Faltstrom, P., Karp, C., and IAB, "Review and
              Recommendations for Internationalized Domain Names
              (IDNs)", RFC 4690, September 2006.

   [RFC5226]  Narten, T. and H. Alvestrand, "Guidelines for Writing an
              IANA Considerations Section in RFCs", BCP 26, RFC 5226,
              May 2008.

   [RFC5234]  Crocker, D. and P. Overell, "Augmented BNF for Syntax
              Specifications: ABNF", STD 68, RFC 5234, January 2008.

   [RFC5246]  Dierks, T. and E. Rescorla, "The Transport Layer Security
              (TLS) Protocol Version 1.2", RFC 5246, August 2008.

   [RFC5890]  Klensin, J., "Internationalized Domain Names for
              Applications (IDNA): Definitions and Document Framework",
              RFC 5890, August 2010.

   [RFC5891]  Klensin, J., "Internationalized Domain Names in
              Applications (IDNA): Protocol", RFC 5891, August 2010.

   [RFC5892]  Faltstrom, P., "The Unicode Code Points and
              Internationalized Domain Names for Applications (IDNA)",
              RFC 5892, August 2010.

   [RFC5893]  Alvestrand, H. and C. Karp, "Right-to-Left Scripts for
              Internationalized Domain Names for Applications (IDNA)",
              RFC 5893, August 2010.

   [RFC5894]  Klensin, J., "Internationalized Domain Names for
              Applications (IDNA): Background, Explanation, and
              Rationale", RFC 5894, August 2010.

   [RFC5895]  Resnick, P. and P. Hoffman, "Mapping Characters for
              Internationalized Domain Names in Applications (IDNA)
              2008", RFC 5895, September 2010.

   [RFC6365]  Hoffman, P. and J. Klensin, "Terminology Used in
              Internationalization in the IETF", BCP 166, RFC 6365,
              September 2011.

   [RFC6885]  Blanchet, M. and A. Sullivan, "Stringprep Revision and
              Problem Statement for the Preparation and Comparison of
              Internationalized Strings (PRECIS)", RFC 6885, March 2013.

   [RFC6943]  Thaler, D., "Issues in Identifier Comparison for Security
              Purposes", RFC 6943, May 2013.

   [UAX9]     The Unicode Consortium, "Unicode Standard Annex #9:
              Unicode Bidirectional Algorithm", September 2012.

   [UAX11]    The Unicode Consortium, "Unicode Standard Annex #11: East
              Asian Width", September 2012.

   [UAX15]    The Unicode Consortium, "Unicode Standard Annex #15:
              Unicode Normalization Forms", August 2012.

   [UTR36]    The Unicode Consortium, "Unicode Technical Report #36:
              Unicode Security Considerations", July 2012.

   [UTS39]    The Unicode Consortium, "Unicode Technical Standard #39:
              Unicode Security Mechanisms", July 2012.

12.3.  URIs

   [1]  http://unicode.org/Public/UNIDATA/PropertyAliases.txt

   [2]  http://unicode.org/Public/UNIDATA/DerivedCoreProperties.txt

Appendix A.  Codepoint Table

   If one applies the property calculation rules from Section 8 to the
   code points 0x0000 to 0x10FFFF in Unicode 6.3, the result is as shown
   in the following table, in Unicode Character Database (UCD) format.
   The columns of the table are as follows:

   1.  The code point or codepoint range.

   2.
       The assignment for the code point or range, where the value is
       one of PVALID, DISALLOWED, UNASSIGNED, CONTEXTO, CONTEXTJ, or
       FREE_PVAL (where the latter includes ID_DIS).

   3.  The name or names for the code point or range.

   This table is non-normative, is included only for illustrative
   purposes, and applies only to Unicode 6.3, not to past or future
   versions of Unicode.  Please note that the strings displayed in the
   third column are not necessarily the formal name of the code point
   (as defined in [UNICODE]) because the fixed width of the RFC format
   necessitated truncation of many names.

0000..001F ; DISALLOWED #
0020       ; FREE_PVAL  # SPACE
0021..007E ; PVALID     # EXCLAM MARK..TILDE
007F..009F ; DISALLOWED #
00A0..00AC ; FREE_PVAL  # NO-BREAK SPACE..NOT SIGN
00AD       ; DISALLOWED # SOFT HYPH
00AE..00B6 ; FREE_PVAL  # REGISTERED SIGN..PILCROW SIGN
00B7       ; CONTEXTO   # MIDDLE DOT
00B8..00BF ; FREE_PVAL  # CEDILLA..INV QUEST IND
00C0..00D6 ; PVALID     # LAT CAP LET A W GRAV..LAT CAP O
00D7       ; FREE_PVAL  # MULTIPLICATION SIGN
00D8..00F6 ; PVALID     # LAT CAP LET O W STROKE..LAT SM
00F7       ; FREE_PVAL  # DIVISION SIGN
00F8..0131 ; PVALID     # LAT SM LET O W STROKE..LAT SM LET
0132..0133 ; FREE_PVAL  # LAT CAP LIG IJ..LAT SM LIG IJ
0134..013E ; PVALID     # LAT CAP LET J W CIRCUM..LAT SM LET
013F..0140 ; FREE_PVAL  # LAT CAP LET L W MID DOT..LAT SM LET
0141..0148 ; PVALID     # LAT CAP LET L W STROKE..LAT SM LET
0149       ; FREE_PVAL  # LAT SM LET N PRECEDED BY APOS
014A..017E ; PVALID     # LAT CAP LET ENG..LAT SM LET Z W CA
017F       ; FREE_PVAL  # LAT SM LET LONG S
0180..01C3 ; PVALID     # LAT SM LET B W STROKE..LAT LET RETR
01C4..01CC ; FREE_PVAL  # LAT CAP LET DZ W CARON..LAT SM
01CD..01F0 ; PVALID     # LAT CAP LET A W CARON..LAT SM LET J
01F1..01F3 ; FREE_PVAL  # LAT CAP LET DZ..LAT SM LET DZ
01F4..02AF ; PVALID     # LAT CAP LET G W ACUTE..LAT SM
02B0..02B8 ;
FREE_PVAL  # MOD LET SM H..MOD LET SM Y
02B9..02C1 ; PVALID     # MOD LET PRIME..MOD LET REV GLOT ST
02C2..02C5 ; FREE_PVAL  # MOD LET L ARROW..MOD LET D ARROW
02C6..02D1 ; PVALID     # MOD LET CIRCUM ACC..MOD LET HALF TR
02D2..02EB ; FREE_PVAL  # MOD LET CENT R HALF RING..MOD LET Y
02EC       ; PVALID     # MOD LET VOICING
02ED       ; FREE_PVAL  # MOD LET UNASPIRATED
02EE       ; PVALID     # MOD LET DOUBLE APOS
02EF..02FF ; FREE_PVAL  # MOD LET LOW D ARR..MOD LET LOW L AR
0300..034E ; PVALID     # COMB GRAVE ACCENT..COMB UP ARROW BE
034F       ; DISALLOWED # COMB GRAPHEME JOINER
0350..0374 ; PVALID     # COMB RIGHT ARROWHEAD..GREEK NUM SIG
0375       ; CONTEXTO   # GREEK LOW NUM SIGN
0376..0377 ; PVALID     # GR CAP LET PAMPHYLIAN DIGAMMA..GR S
0378..0379 ; UNASSIGNED # ..
037A       ; FREE_PVAL  # GR YPOGEGRAMMENI..GR SM REV DOT LUN
037B..037D ; PVALID     # GR SM REV LUN SIG..GR SM REV DOT LU
037E       ; FREE_PVAL  # GREEK QUEST MARK
037F..0383 ; UNASSIGNED # ..
0384..0385 ; FREE_PVAL  # GREEK TONOS..GREEK DIALYTIKA TONOS
0386       ; PVALID     # GR CAP LET ALPHA W TONOS
0387       ; FREE_PVAL  # GREEK ANO TELEIA
0388..038A ; PVALID     # GR CAP LET EPSILON W TONOS..GR CAP
038B       ; UNASSIGNED #
038C       ; PVALID     # GREEK CAP LET OMICRON W TONOS
038D       ; UNASSIGNED #
038E..03A1 ; PVALID     # GR CAP LET EPSILON W TONOS..GR CAP
03A2       ; UNASSIGNED #
03A3..03CF ; PVALID     # GREEK CAP LET SIGMA..GR CAP
03D0..03D2 ; FREE_PVAL  # GR BETA SYM..GR UPSILON W HOOK
03D3..03D4 ; FREE_PVAL  # GR UPSILON W ACUTE AND HOOK..GR UP
03D5..03D6 ; FREE_PVAL  # GR PHI SYM..GR PI SYM
03D7..03EF ; PVALID     # GR KAI SYM..COPT SM LET DEI
03F0..03F2 ; FREE_PVAL  # GR KAPPA SYM..GR LUNATE SIGMA
03F3       ; PVALID     # GREEK LET YOT
03F4..03F6 ; FREE_PVAL  # GR CAP THETA..GR REV LUNATE EPSILON
03F7..03F8 ; PVALID     # GR CAP LET SHO..GR SM LET SHO
03F9       ; FREE_PVAL  # GREEK CAP LUNATE SIGMA SYM
03FA..0481 ; PVALID     # GR CAP LET SAN..CYR SML
LET KOPPA
0482       ; FREE_PVAL  # CYR THOUSANDS SIGN
0483..0487 ; PVALID     # COMB CYR TITLO..COMB CYR POK
0488..0489 ; FREE_PVAL  # COMB CYR HUNDRED THOUSANDS SIGN..C
048A..0527 ; PVALID     # CYR CAP LET SH I W TAIL..CYR S
0528..0530 ; UNASSIGNED # ..
0531..0556 ; PVALID     # ARM CAP LET AYB..ARM CAP LET FEH
0557..0558 ; UNASSIGNED # ..
0559       ; PVALID     # ARM MOD LET LEFT HALF RING
055A..055F ; FREE_PVAL  # ARM APOS..ARM ABBREV
0560       ; UNASSIGNED #
0561..0586 ; PVALID     # ARM SM LET AYB..ARMENIAN SM LE
0587       ; FREE_PVAL  # ARM SM LIG ECH YIWN
0588       ; UNASSIGNED #
0589..058A ; FREE_PVAL  # ARMENIAN FULL STOP..ARMENIAN HYPH
058B..058E ; UNASSIGNED # ..
058F       ; FREE_PVAL  # ARMENIAN DRAM SIGN
0590       ; UNASSIGNED #
0591..05BD ; PVALID     # HEBR ACC ETNAHTA..HEBR PNT ME
05BE       ; FREE_PVAL  # HEBR PUNCT MAQAF
05BF       ; PVALID     # HEBR PNT RAFE
05C0       ; FREE_PVAL  # HEBR PUNCT PASEQ
05C1..05C2 ; PVALID     # HEBR PNT SHIN DOT..HEBR PNT SIN DOT
05C3       ; FREE_PVAL  # HEBR PUNCT SOF PASUQ
05C4..05C5 ; PVALID     # HEBR MARK UP DOT..HEBR MARK LOW DOT
05C6       ; FREE_PVAL  # HEBR PUNCT NUN HAFUKHA
05C7       ; PVALID     # HEBR PNT QAMATS QATAN
05C8..05CF ; UNASSIGNED # ..
05D0..05EA ; PVALID     # HEBR LET ALEF..HEBR LET TAV
05EB..05EF ; UNASSIGNED # ..
05F0..05F2 ; PVALID     # HEBR LIG YIDDISH DOUBLE VAV..HEBR L
05F3..05F4 ; CONTEXTO   # HEBR PUNCT GERESH..HEBR PUNCTUATIO
05F5..05FF ; UNASSIGNED # ..
0600..0604 ; DISALLOWED # ARAB NUM SIGN..ARAB SIGN SAM
0605       ; UNASSIGNED # ..
0606..060F ; FREE_PVAL  # AR-IND CUBE ROOT..ARAB SIGN MISRA
0610..061A ; PVALID     # ARAB SIGN SALLALLAHOU ALAYHE ..AR
061B       ; FREE_PVAL  # ARAB SEMICOLON
061C       ; DISALLOWED # ARAB LET MARK
061D..061D ; UNASSIGNED # ..
061E..061F ; FREE_PVAL  # ARAB TRIPLE DOT PUNCT MARK..ARAB Q
0620..063F ; PVALID     # ARAB LET KASH..ARAB LET FARSI YEH
0640       ; DISALLOWED # ARAB TATWEEL
0641..065F ; PVALID     # ARAB LET FEH..ARAB WAVY HAMZA BEL
0660..0669 ; CONTEXTO   # AR-IND DIG ZERO..AR-IND DIG
066A..066D ; FREE_PVAL  # ARAB PCT SIGN..ARAB FIVE PNTED STA
066E..0674 ; PVALID     # ARAB LET DOTLESS BEH..ARAB LET HIG
0675..0678 ; FREE_PVAL  # ARAB LET HIGH HAMZA ALEF..ARAB LET
0679..06D3 ; PVALID     # ARAB LET TTEH..ARAB LET YEH BARREE
06D4       ; FREE_PVAL  # ARAB FULL STOP
06D5..06DC ; PVALID     # ARAB LET AE..ARAB SM HIGH SEEN
06DD       ; DISALLOWED # ARAB END OF AYAH
06DE       ; FREE_PVAL  # ARAB START OF RUB EL HIZB
06DF..06E8 ; PVALID     # ARAB SM HIGH ROUNDED ZERO..ARAB SM
06E9       ; FREE_PVAL  # ARAB PLACE OF SAJDAH
06EA..06EF ; PVALID     # ARAB EMPTY CENTRE LOW STOP..ARAB LET
06F0..06F9 ; CONTEXTO   # EXT AR-IND DIG ZERO..EXT A
06FA..06FF ; PVALID     # ARAB LET SHEEN W DOT BEL..ARAB
0700..070D ; FREE_PVAL  # SYR END OF PARA..SYR HARKLEAN AST
070E       ; UNASSIGNED #
070F       ; DISALLOWED # SYR ABBR MARK
0710..074A ; PVALID     # SYR LET ALAPH..SYR BARREKH
074B..074C ; UNASSIGNED # ..
074D..07B1 ; PVALID     # SYR LET SOGDIAN ZHAIN..THAANA LET N
07B2..07BF ; UNASSIGNED # ..
07C0..07F5 ; PVALID     # NKO DIG ZERO..NKO LOW TONE APOS
07F6..07F9 ; FREE_PVAL  # NKO SYM OO DENNEN..NKO EXCLAMATI
07FA       ; DISALLOWED # NKO LAJANYALAN
07FB..07FF ; UNASSIGNED # ..
0800..082D ; PVALID     # SAMAR LET ALAF..SAMAR MARK NEQUDA
082E..082F ; UNASSIGNED # ..
0830..083E ; FREE_PVAL  # SAMAR PUNCT NEQUDAA..SAMAR PUN
083F       ; UNASSIGNED #
0840..085B ; PVALID     # MANDAIC LET HALQA..MANDAIC GEM
085C..085D ; UNASSIGNED # ..
085E       ; FREE_PVAL  # MANDAIC PUNCTUATION
085F..089F ; UNASSIGNED # ..
08A0       ; PVALID     # ARAB LET BEH W SM V BEL
08A1       ; UNASSIGNED #
08A2..08AC ; PVALID     # ARAB LET JEEM W 2 DOTS AB..ARAB
08AD..08E3 ; UNASSIGNED # ..
08E4..08FE ; PVALID     # ARAB CURLY FATHA..ARAB DAMMA W
08FF       ; UNASSIGNED #
0900..0963 ; PVALID     # DEVAN SIGN INV CANDRABINDU..DEVAN V
0964..0965 ; FREE_PVAL  # DEVAN DANDA..DEVAN DOUBLE DANDA
0966..096F ; PVALID     # DEVAN DIG ZERO..DEVAN DIG NINE
0970       ; FREE_PVAL  # DEVAN ABBR SIGN
0971..0977 ; PVALID     # DEVAN SIGN HIGH SPACING DOT..DEVAN
0978       ; UNASSIGNED #
0979..097F ; PVALID     # DEVAN SIGN HIGH SPACING DOT..DEVAN
0980       ; UNASSIGNED #
0981..0983 ; PVALID     # BENG SIGN CANDRABINDU..BENG SIGN VIS
0984       ; UNASSIGNED #
0985..098C ; PVALID     # BENG LET A..BENG LET VOC L
098D..098E ; UNASSIGNED # ..
098F..0990 ; PVALID     # BENG LET E..BENG LET AI
0991..0992 ; UNASSIGNED # ..
0993..09A8 ; PVALID     # BENG LET O..BENG LET NA
09A9       ; UNASSIGNED #
09AA..09B0 ; PVALID     # BENG LET PA..BENG LET RA
09B1       ; UNASSIGNED #
09B2       ; PVALID     # BENG LET LA
09B3..09B5 ; UNASSIGNED # ..
09B6..09B9 ; PVALID     # BENG LET SHA..BENG LET HA
09BA..09BB ; UNASSIGNED # ..
09BC..09C4 ; PVALID     # BENG SIGN NUKTA..BENG VOW SIGN VOCAL
09C5..09C6 ; UNASSIGNED # ..
09C7..09C8 ; PVALID     # BENG VOW SIGN E..BENG VOW SIGN AI
09C9..09CA ; UNASSIGNED # ..
09CB..09CE ; PVALID     # BENG VOW SIGN O..BENG LET KHANDA
09CF..09D6 ; UNASSIGNED # ..
09D7       ; PVALID     # BENG AU LEN MARK
09D8..09DB ; UNASSIGNED # ..
09DC..09DD ; PVALID     # BENG LET RRA..BENG LET RHA
09DE       ; UNASSIGNED #
09DF..09E3 ; PVALID     # BENG LET YYA..BENG VOW SIG
09E4..09E5 ; UNASSIGNED # ..
09E6..09F1 ; PVALID     # BENG DIG ZERO..BENG LET RA W L
09F2..09FB ; FREE_PVAL  # BENG RUPEE MARK..BENG GANDA MARK
09FC..0A00 ; UNASSIGNED # ..
0A01..0A03 ; PVALID     # GURMUKHI SIGN ADAK BINDI..GURMUKHI
0A04       ; UNASSIGNED #
0A05..0A0A ; PVALID     # GURMUKHI LET A..GURMUKHI LET UU
0A0B..0A0E ; UNASSIGNED # ..
0A0F..0A10 ; PVALID     # GURMUKHI LET EE..GURMUKHI LET AI
0A11..0A12 ; UNASSIGNED # ..
0A13..0A28 ; PVALID     # GURMUKHI LET OO..GURMUKHI LET NA
0A29       ; UNASSIGNED #
0A2A..0A30 ; PVALID     # GURMUKHI LET PA..GURMUKHI LET RA
0A31       ; UNASSIGNED #
0A32..0A33 ; PVALID     # GURMUKHI LET LA..GURMUKHI LET LLA
0A34       ; UNASSIGNED #
0A35..0A36 ; PVALID     # GURMUKHI LET VA..GURMUKHI LET SHA
0A37       ; UNASSIGNED #
0A38..0A39 ; PVALID     # GURMUKHI LET SA..GURMUKHI LET HA
0A3A..0A3B ; UNASSIGNED # ..
0A3C       ; PVALID     # GURMUKHI SIGN NUKTA
0A3D       ; UNASSIGNED #
0A3E..0A42 ; PVALID     # GURMUKHI VOW SIGN AA..GURMUKHI V
0A43..0A46 ; UNASSIGNED # ..
0A47..0A48 ; PVALID     # GURMUKHI VOW SIGN EE..GURMUKHI V
0A49..0A4A ; UNASSIGNED # ..
0A4B..0A4D ; PVALID     # GURMUKHI VOW SIGN OO..GURMUKHI S
0A4E..0A50 ; UNASSIGNED # ..
0A51       ; PVALID     # GURMUKHI SIGN UDAAT
0A52..0A58 ; UNASSIGNED # ..
0A59..0A5C ; PVALID     # GURMUKHI LET KHHA..GURMUKHI LET RRA
0A5D       ; UNASSIGNED #
0A5E       ; PVALID     # GURMUKHI LET FA
0A5F..0A65 ; UNASSIGNED # ..
0A66..0A75 ; PVALID     # GURMUKHI DIG ZERO..GURMUKHI SIGN YA
0A76..0A80 ; UNASSIGNED # ..
0A81..0A83 ; PVALID     # GUJARATI SIGN CANDRABINDU..GUJARATI
0A84       ; UNASSIGNED #
0A85..0A8D ; PVALID     # GUJARATI LET A..GUJARATI VOW CAND
0A8E       ; UNASSIGNED #
0A8F..0A91 ; PVALID     # GUJARATI LET E..GUJARATI VOW CAND
0A92       ; UNASSIGNED #
0A93..0AA8 ; PVALID     # GUJARATI LET O..GUJARATI LET NA
0AA9       ; UNASSIGNED #
0AAA..0AB0 ; PVALID     # GUJARATI LET PA..GUJARATI LET RA
0AB1       ; UNASSIGNED #
0AB2..0AB3 ; PVALID     # GUJARATI LET LA..GUJARATI LET LLA
0AB4       ; UNASSIGNED #
0AB5..0AB9 ; PVALID     # GUJARATI LET VA..GUJARATI LET HA
0ABA..0ABB ; UNASSIGNED # ..
0ABC..0AC5 ; PVALID     # GUJARATI SIGN NUKTA..GUJARATI VOW
0AC6       ; UNASSIGNED #
0AC7..0AC9 ; PVALID     # GUJARATI VOW SIGN E..GUJARATI VOW
0ACA       ; UNASSIGNED #
0ACB..0ACD ; PVALID     # GUJARATI VOW SIGN O..GUJARATI SIG
0ACE..0ACF ; UNASSIGNED # ..
0AD0       ; PVALID     # GUJARATI OM
0AD1..0ADF ; UNASSIGNED # ..
0AE0..0AE3 ; PVALID     # GUJARATI LET VOC RR..GUJARATI V
0AE4..0AE5 ; UNASSIGNED # ..
0AE6..0AEF ; PVALID     # GUJARATI DIG ZERO..GUJARATI DIG NINE
0AF0..0AF1 ; FREE_PVAL  # GUJARATI ABBR SIGN..GUJARATI RUPEE S
0AF2..0B00 ; UNASSIGNED # ..
0B01..0B03 ; PVALID     # ORIYA SIGN CANDRABINDU..ORIYA SIGN V
0B04       ; UNASSIGNED #
0B05..0B0C ; PVALID     # ORIYA LET A..ORIYA LET VOC L
0B0D..0B0E ; UNASSIGNED # ..
0B0F..0B10 ; PVALID     # ORIYA LET E..ORIYA LET AI
0B11..0B12 ; UNASSIGNED # ..
0B13..0B28 ; PVALID     # ORIYA LET O..ORIYA LET NA
0B29       ; UNASSIGNED #
0B2A..0B30 ; PVALID     # ORIYA LET PA..ORIYA LET RA
0B31       ; UNASSIGNED #
0B32..0B33 ; PVALID     # ORIYA LET LA..ORIYA LET LLA
0B34       ; UNASSIGNED #
0B35..0B39 ; PVALID     # ORIYA LET VA..ORIYA LET HA
0B3A..0B3B ; UNASSIGNED # ..
0B3C..0B44 ; PVALID     # ORIYA SIGN NUKTA..ORIYA VOW SIGN
0B45..0B46 ; UNASSIGNED # ..
0B47..0B48 ; PVALID     # ORIYA VOW SIGN E..ORIYA VOW SIG
0B49..0B4A ; UNASSIGNED # ..
0B4B..0B4D ; PVALID     # ORIYA VOW SIGN O..ORIYA SIGN VIRA
0B4E..0B55 ; UNASSIGNED # ..
0B56..0B57 ; PVALID     # ORIYA AI LEN MARK..ORIYA AU LENG
0B58..0B5B ; UNASSIGNED # ..
0B5C..0B5D ; PVALID     # ORIYA LET RRA..ORIYA LET RHA
0B5E       ; UNASSIGNED #
0B5F..0B63 ; PVALID     # ORIYA LET YYA..ORIYA VOW SIGN VOCA
0B64..0B65 ; UNASSIGNED # ..
0B66..0B6F ; PVALID     # ORIYA DIG ZERO..ORIYA DIG NINE
0B70       ; FREE_PVAL  # ORIYA ISSHAR
0B71       ; PVALID     # ORIYA LET WA
0B72..0B77 ; FREE_PVAL  # ORIYA FRACT ONE QUART..ORIYA FRACT
0B78..0B81 ; UNASSIGNED # ..
0B82..0B83 ; PVALID     # TAMIL SIGN ANUSVARA..TAMIL SIGN VIS
0B84       ; UNASSIGNED #
0B85..0B8A ; PVALID     # TAMIL LET A..TAMIL LET UU
0B8B..0B8D ; UNASSIGNED # ..
0B8E..0B90 ; PVALID     # TAMIL LET E..TAMIL LET AI
0B91       ; UNASSIGNED #
0B92..0B95 ; PVALID     # TAMIL LET O..TAMIL LET KA
0B96..0B98 ; UNASSIGNED # ..
0B99..0B9A ; PVALID     # TAMIL LET NGA..TAMIL LET CA
0B9B       ; UNASSIGNED #
0B9C       ; PVALID     # TAMIL LET JA
0B9D       ; UNASSIGNED #
0B9E..0B9F ; PVALID     # TAMIL LET NYA..TAMIL LET TTA
0BA0..0BA2 ; UNASSIGNED # ..
0BA3..0BA4 ; PVALID     # TAMIL LET NNA..TAMIL LET TA
0BA5..0BA7 ; UNASSIGNED # ..
0BA8..0BAA ; PVALID     # TAMIL LET NA..TAMIL LET PA
0BAB..0BAD ; UNASSIGNED # ..
0BAE..0BB9 ; PVALID     # TAMIL LET MA..TAMIL LET HA
0BBA..0BBD ; UNASSIGNED # ..
0BBE..0BC2 ; PVALID     # TAMIL VOW SIGN AA..TAMIL VOW SI
0BC3..0BC5 ; UNASSIGNED # ..
0BC6..0BC8 ; PVALID     # TAMIL VOW SIGN E..TAMIL VOW SIG
0BC9       ; UNASSIGNED #
0BCA..0BCD ; PVALID     # TAMIL VOW SIGN O..TAMIL SIGN VIRA
0BCE..0BCF ; UNASSIGNED # ..
0BD0       ; PVALID     # TAMIL OM
0BD1..0BD6 ; UNASSIGNED # ..
0BD7       ; PVALID     # TAMIL AU LEN MARK
0BD8..0BE5 ; UNASSIGNED # ..
0BE6..0BEF ; PVALID     # TAMIL DIG ZERO..TAMIL DIG NINE
0BF0..0BFA ; FREE_PVAL  # TAMIL NUM TEN..TAMIL NUM SIGN
0BFB..0C00 ; UNASSIGNED # ..
0C01..0C03 ; PVALID     # TELUGU SIGN CANDRABINDU..TELUGU SIG
0C04       ; UNASSIGNED #
0C05..0C0C ; PVALID     # TELUGU LET A..TELUGU LET VOC L
0C0D       ; UNASSIGNED #
0C0E..0C10 ; PVALID     # TELUGU LET E..TELUGU LET AI
0C11       ; UNASSIGNED #
0C12..0C28 ; PVALID     # TELUGU LET O..TELUGU LET NA
0C29       ; UNASSIGNED #
0C2A..0C33 ; PVALID     # TELUGU LET PA..TELUGU LET LLA
0C34       ; UNASSIGNED #
0C35..0C39 ; PVALID     # TELUGU LET VA..TELUGU LET HA
0C3A..0C3C ; UNASSIGNED # ..
0C3D..0C44 ; PVALID     # TELUGU SIGN AVAGRAHA..TELUGU VOW SI
0C45       ; UNASSIGNED #
0C46..0C48 ; PVALID     # TELUGU VOW SIGN E..TELUGU VOW SIGN
0C49       ; UNASSIGNED #
0C4A..0C4D ; PVALID     # TELUGU VOW SIGN O..TELUGU SIGN VIRA
0C4E..0C54 ; UNASSIGNED # ..
0C55..0C56 ; PVALID     # TELUGU LEN MARK..TELUGU AI LEN MARK
0C57       ; UNASSIGNED #
0C58..0C59 ; PVALID     # TELUGU LET TSA..TELUGU LET DZA
0C5A..0C5F ; UNASSIGNED # ..
0C60..0C63 ; PVALID     # TELUGU LET VOC RR..TELUGU VOW S
0C64..0C65 ; UNASSIGNED # ..
0C66..0C6F ; PVALID     # TELUGU DIG ZERO..TELUGU DIG NINE
0C70..0C77 ; UNASSIGNED # ..
0C78..0C7F ; FREE_PVAL  # TELUGU FRACTION DIG ZERO..TELUGU S
0C80..0C81 ; UNASSIGNED # ..
0C82..0C83 ; PVALID     # KANNADA SIGN ANUSVARA..KANNADA SIGN
0C84       ; UNASSIGNED #
0C85..0C8C ; PVALID     # KANNADA LET A..KANNADA LET VOC L
0C8D       ; UNASSIGNED #
0C8E..0C90 ; PVALID     # KANNADA LET E..KANNADA LET AI
0C91       ; UNASSIGNED #
0C92..0CA8 ; PVALID     # KANNADA LET O..KANNADA LET NA
0CA9       ; UNASSIGNED #
0CAA..0CB3 ; PVALID     # KANNADA LET PA..KANNADA LET LLA
0CB4       ; UNASSIGNED #
0CB5..0CB9 ; PVALID     # KANNADA LET VA..KANNADA LET HA
0CBA..0CBB ; UNASSIGNED # ..
0CBC..0CC4 ; PVALID     # KANNADA SIGN NUKTA..KANNADA VOW SIG
0CC5       ; UNASSIGNED #
0CC6..0CC8 ; PVALID     # KANNADA VOW SIGN E..KANNADA VOW SIG
0CC9       ; UNASSIGNED #
0CCA..0CCD ; PVALID     # KANNADA VOW SIGN O..KANNADA SIGN VI
0CCE..0CD4 ; UNASSIGNED # ..
0CD5..0CD6 ; PVALID     # KANNADA LEN MARK..KANNADA AI LEN MA
0CD7..0CDD ; UNASSIGNED # ..
0CDE       ; PVALID     # KANNADA LET FA
0CDF       ; UNASSIGNED #
0CE0..0CE3 ; PVALID     # KANNADA LET VOC RR..KANNADA VOW SIG
0CE4..0CE5 ; UNASSIGNED # ..
0CE6..0CEF ; PVALID     # KANNADA DIG ZERO..KANNADA DIG NINE
0CF0       ; UNASSIGNED #
0CF1..0CF2 ; PVALID     # KANNADA SIGN JIHVAMULIYA..KANNADA S
0CF3..0D01 ; UNASSIGNED # ..
0D02..0D03 ; PVALID     # MALAY SIGN ANUSVARA..MALAY SIGN VIS
0D04       ; UNASSIGNED #
0D05..0D0C ; PVALID     # MALAY LET A..MALAY LET VOC
0D0D       ; UNASSIGNED #
0D0E..0D10 ; PVALID     # MALAY LET E..MALAY LET AI
0D11       ; UNASSIGNED #
0D12..0D3A ; PVALID     # MALAY LET O..MALAY LET TTTA
0D3B..0D3C ; UNASSIGNED # ..
0D3D..0D44 ; PVALID     # MALAY SIGN AVAGRAHA..MALAY VOW SIG
0D45       ; UNASSIGNED #
0D46..0D48 ; PVALID     # MALAY VOW SIGN E..MALAY VOW SIGN
0D49       ; UNASSIGNED #
0D4A..0D4E ; PVALID     # MALAY VOW SIGN O..MALAY LET DOT REP
0D4F..0D56 ; UNASSIGNED # ..
0D57       ; PVALID     # MALAY AU LEN MARK
0D58..0D5F ; UNASSIGNED # ..
0D60..0D63 ; PVALID     # MALAY LET VOC RR..MALAY VOW
0D64..0D65 ; UNASSIGNED # ..
0D66..0D6F ; PVALID     # MALAY DIG ZERO..MALAY DIG NINE
0D70..0D75 ; FREE_PVAL  # MALAY NUM TEN..MALAY FRACTION THR
0D76..0D78 ; UNASSIGNED # ..
0D79       ; FREE_PVAL  # MALAY DATE MARK
0D7A..0D7F ; PVALID     # MALAY LET CHILLU NN..MALAY LET
0D80..0D81 ; UNASSIGNED # ..
0D82..0D83 ; PVALID     # SINH SIGN ANUSVARAYA..SINH SIGN VIS
0D84       ; UNASSIGNED #
0D85..0D96 ; PVALID     # SINH LET AYANNA..SINH LET AUYANN
0D97..0D99 ; UNASSIGNED # ..
0D9A..0DB1 ; PVALID     # SINH LET ALPAPRAANA KAYANNA..SINH L
0DB2       ; UNASSIGNED #
0DB3..0DBB ; PVALID     # SINH LET SANYAKA DAYANNA..SINH LETT
0DBC       ; UNASSIGNED #
0DBD       ; PVALID     # SINH LET DANTAJA LAYANNA
0DBE..0DBF ; UNASSIGNED # ..
0DC0..0DC6 ; PVALID     # SINH LET VAYANNA..SINH LET FAYAN
0DC7..0DC9 ; UNASSIGNED # ..
0DCA       ; PVALID     # SINH SIGN AL-LAKUNA
0DCB..0DCE ; UNASSIGNED # ..
0DCF..0DD4 ; PVALID     # SINH VOW SIGN AELA-PILLA..SINH VOW
0DD5       ; UNASSIGNED #
0DD6       ; PVALID     # SINH VOW SIGN DIGA PAA-PILLA
0DD7       ; UNASSIGNED #
0DD8..0DDF ; PVALID     # SINH VOW SIGN GAETTA-PILLA..SINH VO
0DE0..0DF1 ; UNASSIGNED # ..
0DF2..0DF3 ; PVALID     # SINH VOW SIGN DIGA GAETTA-PILLA..SI
0DF4       ; FREE_PVAL  # SINH PUNCT KUNDDALIYA
0DF5..0E00 ; UNASSIGNED # ..
0E01..0E32 ; PVALID     # THAI CHAR KO KAI..THAI CHAR SARA A
0E33       ; FREE_PVAL  # THAI CHAR SARA AM
0E34..0E3A ; PVALID     # THAI CHAR SARA I..THAI CHAR PHINTH
0E3B..0E3E ; UNASSIGNED # ..
0E3F       ; FREE_PVAL  # THAI CURRENCY SYM BAHT
0E40..0E4E ; PVALID     # THAI CHAR SARA E..THAI CHAR YAMAKK
0E4F       ; FREE_PVAL  # THAI CHAR FONGMAN
0E50..0E59 ; PVALID     # THAI DIG ZERO..THAI DIG NINE
0E5A..0E5B ; FREE_PVAL  # THAI CHAR ANGKHANKHU..THAI CHAR KH
0E5C..0E80 ; UNASSIGNED # ..
0E81..0E82 ; PVALID     # LAO LET KO..LAO LET KHO SUNG
0E83       ; UNASSIGNED #
0E84       ; PVALID     # LAO LET KHO TAM
0E85..0E86 ; UNASSIGNED # ..
0E87..0E88 ; PVALID     # LAO LET NGO..LAO LET CO
0E89       ; UNASSIGNED #
0E8A       ; PVALID     # LAO LET SO TAM
0E8B..0E8C ; UNASSIGNED # ..
0E8D       ; PVALID     # LAO LET NYO
0E8E..0E93 ; UNASSIGNED # ..
0E94..0E97 ; PVALID     # LAO LET DO..LAO LET THO TAM
0E98       ; UNASSIGNED #
0E99..0E9F ; PVALID     # LAO LET NO..LAO LET FO SUNG
0EA0       ; UNASSIGNED #
0EA1..0EA3 ; PVALID     # LAO LET MO..LAO LET LO LING
0EA4       ; UNASSIGNED #
0EA5       ; PVALID     # LAO LET LO LOOT
0EA6       ; UNASSIGNED #
0EA7       ; PVALID     # LAO LET WO
0EA8..0EA9 ; UNASSIGNED # ..
0EAA..0EAB ; PVALID     # LAO LET SO SUNG..LAO LET HO SUNG
0EAC       ; UNASSIGNED #
0EAD..0EB2 ; PVALID     # LAO LET O..LAO VOW SIGN AA
0EB3       ; FREE_PVAL  # LAO VOW SIGN AM
0EB4..0EB9 ; PVALID     # LAO VOW SIGN I..LAO VOW SIGN UU
0EBA       ; UNASSIGNED #
0EBB..0EBD ; PVALID     # LAO VOW SIGN MAI KON..LAO SEMIVOW SIG
0EBE..0EBF ; UNASSIGNED # ..
0EC0..0EC4 ; PVALID     # LAO VOW SIGN E..LAO VOW SIGN AI
0EC5       ; UNASSIGNED #
0EC6       ; PVALID     # LAO KO LA
0EC7       ; UNASSIGNED #
0EC8..0ECD ; PVALID     # LAO TONE MAI EK..LAO NIGGAHITA
0ECE..0ECF ; UNASSIGNED # ..
0ED0..0ED9 ; PVALID     # LAO DIG ZERO..LAO DIG NINE
0EDA..0EDB ; UNASSIGNED # ..
0EDC..0EDD ; FREE_PVAL  # LAO HO NO..LAO HO MO
0EDE..0EDF ; PVALID     # LAO LET KHMU GO..LAO LET KHMU NYO
0EE0..0EFF ; UNASSIGNED # ..
0F00       ; PVALID     # TIB SYLL OM
0F01..0F0A ; FREE_PVAL  # TIB MARK GTER YIG MGO TRUNC A..TIB
0F0B       ; PVALID     # TIB MARK INTERSYLLABIC TSHEG
0F0C..0F17 ; FREE_PVAL  # TIB MARK DELIMITER TSHEG BSTAR..TIB
0F18..0F19 ; PVALID     # TIB ASTROLOGICAL SIGN -KHYUD PA..TIB
0F1A..0F1F ; FREE_PVAL  # TIB SIGN RDEL DKAR GCIG..TIB SIGN RD
0F20..0F29 ; PVALID     # TIB DIG ZERO..TIB DIG NINE
0F2A..0F34 ; FREE_PVAL  # TIB DIG HALF ONE..TIB MARK BSDUS R
0F35       ; PVALID     # TIB MARK NGAS BZUNG NYI ZLA
0F36       ; FREE_PVAL  # TIB MARK CARET DZUD RTAGS BZHI MIG C
0F37       ; PVALID     # TIB MARK NGAS BZUNG SGOR RTAGS
0F38       ; FREE_PVAL  # TIB MARK CHE MGO
0F39       ; PVALID     # TIB MARK TSA PHRU
0F3A..0F3D ; FREE_PVAL  # TIB MARK GUG RTAGS GYON..TIB MARK AN
0F3E..0F47 ; PVALID     # TIB SIGN YAR TSHES..TIB LET JA
0F48       ; UNASSIGNED #
0F49..0F6C ; PVALID     # TIB LET NYA..TIB LET RRA
0F6D..0F70 ; UNASSIGNED # ..
0F71..0F76 ; PVALID     # TIB VOW SIGN AA..TIB VOW SIGN VO
0F77       ; FREE_PVAL  # TIB VOW SIGN VO RR
0F78       ; PVALID     # TIB VOW SIGN VO L
0F79       ; FREE_PVAL  # TIB VOW SIGN VO LL
0F7A..0F84 ; PVALID     # TIB VOW SIGN E..TIB MARK H
0F85       ; FREE_PVAL  # TIB MARK PALUTA
0F86..0F8F ; PVALID     # TIB SIGN LCI RTAGS..TIB SUBJOIN S
0F90..0F97 ; PVALID     # TIB SUBJOIN LET KA..TIB SUBJOIN
0F98       ; UNASSIGNED #
0F99..0FBC ; PVALID     # TIB SUBJOIN LET NYA..TIB SUBJOI
0FBD       ; UNASSIGNED #
0FBE..0FC5 ; FREE_PVAL  # TIB KU RU KHA..TIB SYM RDO RJE
0FC6       ; PVALID     # TIB SYM PADMA GDAN
0FC7..0FCC ; FREE_PVAL  # TIB SYM RDO RJE RGYA GRAM..TIB SY
0FCD       ; UNASSIGNED #
0FCE..0FDA ; FREE_PVAL  # TIB SIGN RDEL NAG RDEL DKAR..TIB MA
0FDB..0FFF ; UNASSIGNED # ..
1000..1049 ; PVALID     # MYAN LET KA..MYAN DIG NINE
104A..104F ; FREE_PVAL  # MYAN SIGN LITTLE SECTION..MYAN SYM
1050..109D ; PVALID     # MYAN LET SHA..MYAN VOW SIGN AITON
109E..109F ; FREE_PVAL  # MYAN SYM SHAN ONE..MYAN SYM SHAN EX
10A0..10C5 ; PVALID     # GEORG CAP LET AN..GEORG CAP LET HOE
10C6       ; UNASSIGNED #
10C7       ; PVALID     # GEORG CAP LET YN
10C8..10CC ; UNASSIGNED # ..
10CD       ; PVALID     # GEORG CAP LET AEN
10CE..10CF ; UNASSIGNED # ..
10D0..10FA ; PVALID     # GEORG LET AN..GEORG LET AIN
10FB..10FC ; FREE_PVAL  # GEORG PARA SEP..MOD LET GEORG NAR
10FD..10FF ; PVALID     # GEORG LET AEN..GEORG LET LABIAL
1100..11FF ; DISALLOWED # HANGUL CHO KIYEOK..HANGUL JONG SSA
1200..1248 ; PVALID     # ETHI SYL HA..ETHI SYL QWA
1249       ; UNASSIGNED #
124A..124D ; PVALID     # ETHI SYL QWI..ETHI SYL QWE
124E..124F ; UNASSIGNED # ..
1250..1256 ; PVALID     # ETHI SYL QHA..ETHI SYL QHO
1257       ; UNASSIGNED #
1258       ; PVALID     # ETHI SYL QHWA
1259       ; UNASSIGNED #
125A..125D ; PVALID     # ETHI SYL QHWI..ETHI SYL QH
125E..125F ; UNASSIGNED # ..
1260..1288 ; PVALID     # ETHI SYL BA..ETHI SYL XWA
1289       ; UNASSIGNED #
128A..128D ; PVALID     # ETHI SYL XWI..ETHI SYL XWE
128E..128F ; UNASSIGNED # ..
1290..12B0 ; PVALID     # ETHI SYL NA..ETHI SYL KWA
12B1       ; UNASSIGNED #
12B2..12B5 ; PVALID     # ETHI SYL KWI..ETHI SYL KWE
12B6..12B7 ; UNASSIGNED # ..
12B8..12BE ; PVALID     # ETHI SYL KXA..ETHI SYL KXO
12BF       ; UNASSIGNED #
12C0       ; PVALID     # ETHI SYL KXWA
12C1       ; UNASSIGNED #
12C2..12C5 ; PVALID     # ETHI SYL KXWI..ETHI SYL KX
12C6..12C7 ; UNASSIGNED # ..
12C8..12D6 ; PVALID     # ETHI SYL WA..ETHI SYL PHAR
12D7       ; UNASSIGNED #
12D8..1310 ; PVALID     # ETHI SYL ZA..ETHI SYL GWA
1311       ; UNASSIGNED #
1312..1315 ; PVALID     # ETHI SYL GWI..ETHI SYL GWE
1316..1317 ; UNASSIGNED # ..
1318..135A ; PVALID     # ETHI SYL GGA..ETHI SYL FYA
135B..135C ; UNASSIGNED # ..
135D..135F ; PVALID     # ETHI COMB GEM AND VOW..ETHI COMB GE
1360..137C ; FREE_PVAL  # ETHI SECT MARK..ETHI NUM TEN THOUS
137D..137F ; UNASSIGNED # ..
1380..138F ; PVALID     # ETHI SYL SEBATBEIT MWA..ETHI SYL PW
1390..1399 ; FREE_PVAL  # ETHI TON MARK YIZET..ETHI TON MARK
139A..139F ; UNASSIGNED # ..
13A0..13F4 ; PVALID     # CHEROKEE LET A..CHEROKEE LET YV
13F5..13FF ; UNASSIGNED # ..
1400       ; FREE_PVAL  # CANAD SYL HYPHEN
1401..166C ; PVALID     # CANAD SYL E..CANAD SYL CAR
166D..166E ; FREE_PVAL  # CANAD SYL CHI SIGN..CANAD SYLLAB
166F..167F ; PVALID     # CANAD SYL QAI..CANAD SYL B
1680       ; FREE_PVAL  # OGHAM SPACE MARK
1681..169A ; PVALID     # OGHAM LET BEITH..OGHAM LET PEITH
169B..169C ; FREE_PVAL  # OGHAM FEATHER MARK..OGHAM REV FEAT
169D..169F ; UNASSIGNED # ..
16A0..16EA ; PVALID     # RUNIC LET FEHU FEOH FE F..RUNIC LET
16EB..16F0 ; FREE_PVAL  # RUNIC SINGLE PUNCT..RUNIC BELGTHOR
16F1..16FF ; UNASSIGNED # ..
1700..170C ; PVALID     # TAGALOG LET A..TAGALOG LET YA
170D       ; UNASSIGNED #
170E..1714 ; PVALID     # TAGALOG LET LA..TAGALOG SIGN VIRAMA
1715..171F ; UNASSIGNED # ..
1720..1734 ; PVALID     # HANUNOO LET A..HANUNOO SIGN PAMUDPO
1735..1736 ; FREE_PVAL  # PHILIP SINGLE PUNCT..PHILIP DOUBLE
1737..173F ; UNASSIGNED # ..
1740..1753 ; PVALID     # BUHID LET A..BUHID VOW SIGN U
1754..175F ; UNASSIGNED # ..
1760..176C ; PVALID     # TAGBANWA LET A..TAGBANWA LET YA
176D       ; UNASSIGNED #
176E..1770 ; PVALID     # TAGBANWA LET LA..TAGBANWA LET SA
1771       ; UNASSIGNED #
1772..1773 ; PVALID     # TAGBANWA VOW SIGN I..TAGBANWA VOW S
1774..177F ; UNASSIGNED # ..
2132 1780..17B3 ; PVALID # KHMER LET KA..KHMER IND VOW QAU 2133 17B4..17B5 ; DISALLOWED # KHMER VOW INH AQ..KHMER VOW INH AA 2134 17B6..17D3 ; PVALID # KHMER VOW SIGN AA..KHMER SIGN BATHA 2135 17D4..17D6 ; FREE_PVAL # KHMER SIGN KHAN..KHMER SIGN CAMNUC 2136 17D7 ; PVALID # KHMER SIGN LEK TOO 2137 17D8..17DB ; FREE_PVAL # KHMER SIGN BEYYAL..KHMER CURR SYM R 2138 17DC..17DD ; PVALID # KHMER SIGN AVAKRAHASANYA..KHMER SIG 2139 17DE..17DF ; UNASSIGNED # .. 2140 17E0..17E9 ; PVALID # KHMER DIG ZERO..KHMER DIG NINE 2141 17EA..17EF ; UNASSIGNED # .. 2142 17F0..17F9 ; FREE_PVAL # KHMER SYM LEK ATTAK SON..KHMER SYM 2143 17FA..17FF ; UNASSIGNED # .. 2144 1800..180A ; FREE_PVAL # MONG BIRGA..MONG NIRUGU 2145 180B..180E ; DISALLOWED # MONG FREE VAR SEL ONE..MONG VOW SEP 2146 180F ; UNASSIGNED # 2147 1810..1819 ; PVALID # MONG DIG ZERO..MONG DIG NINE 2148 181A..181F ; UNASSIGNED # .. 2149 1820..1877 ; PVALID # MONG LET A..MONG LET MANCHU 2150 1878..187F ; UNASSIGNED # .. 2151 1880..18AA ; PVALID # MONG LET ALI GALI ANUSVARA ONE..MON 2152 18AB..18AF ; UNASSIGNED # .. 2153 18B0..18F5 ; PVALID # CAN SYL OY..CAN SYL CA 2154 18F6..18FF ; UNASSIGNED # .. 2155 1900..191C ; PVALID # LIMBU VOW-CARRIER LET..LIMBU LET HA 2156 191D..191F ; UNASSIGNED # .. 2157 1920..192B ; PVALID # LIMBU VOW SIGN A..LIMBU SUBJOIN LET 2158 192C..192F ; UNASSIGNED # .. 2159 1930..193B ; PVALID # LIMBU SM LET KA..LIMBU SIGN SA-I 2160 193C..193F ; UNASSIGNED # .. 2161 1940 ; FREE_PVAL # LIMBU SIGN LOO 2162 1941..1943 ; UNASSIGNED # .. 2163 1944..1945 ; FREE_PVAL # LIMBU EXCLAM MARK..LIMBU QUEST MARK 2164 1946..196D ; PVALID # LIMBU DIG ZERO..TAI LE LET AI 2165 196E..196F ; UNASSIGNED # .. 2166 1970..1974 ; PVALID # TAI LE LET TONE-2..TAI LE LET TONE- 2167 1975..197F ; UNASSIGNED # .. 2168 1980..19AB ; PVALID # NEW TAI LUE LET HIGH QA..NEW TAI LU 2169 19AC..19AF ; UNASSIGNED # .. 2170 19B0..19C9 ; PVALID # NEW TAI LUE VOW SIGN VOW SHORT..NEW 2171 19CA..19CF ; UNASSIGNED # .. 
2172 19D0..19D9 ; PVALID # NEW TAI LUE DIG ZERO..NEW TAI DIG N 2173 19DA ; DISALLOWED # NEW TAI LUE THAM 2174 19DB..19DD ; UNASSIGNED # .. 2175 19DE..19FF ; FREE_PVAL # NEW TAI LUE SIGN LAE..KHMER SYM DAP 2176 1A00..1A1B ; PVALID # BUGIN LET KA..BUGIN VOW SIGN AE 2177 1A1C..1A1D ; UNASSIGNED # .. 2178 1A1E..1A1F ; FREE_PVAL # BUGIN PALLAWA..BUGIN END OF SECTION 2179 1A20..1A5E ; PVALID # TAI THAM LET HIGH KA..TAI THAM CONS 2180 1A5F ; UNASSIGNED # 2181 1A60..1A7C ; PVALID # TAI THAM SIGN SAKOT..TAI THAM SIGN 2182 1A7D..1A7E ; UNASSIGNED # .. 2183 1A7F..1A89 ; PVALID # TAI THAM COMB CRYPT DOT..TAI THAM D 2184 1A8A..1A8F ; UNASSIGNED # .. 2185 1A90..1A99 ; PVALID # TAI THAM THAM DIG ZERO..TAI THAM TH 2186 1A9A..1A9F ; UNASSIGNED # .. 2187 1AA0..1AA6 ; FREE_PVAL # TAI THAM SIGN WIANG..TAI THAM SIGN 2188 1AA7 ; PVALID # TAI THAM SIGN MAI YAMOK 2189 1AA8..1AAD ; FREE_PVAL # TAI THAM SIGN KAAN..TAI THAM SIGN C 2190 1AAE..1AFF ; UNASSIGNED # .. 2191 1B00..1B4B ; PVALID # BAL SIGN ULU RICEM..BAL LET ASYURA 2192 1B4C..1B4F ; UNASSIGNED # .. 2193 1B50..1B59 ; PVALID # BAL DIG ZERO..BAL DIG NINE 2194 1B5A..1B6A ; FREE_PVAL # BAL PANTI..BAL MUS SYM DANG 2195 1B6B..1B73 ; PVALID # BAL MUS SYM COMB TEGEH..BAL MUS 2196 1B74..1B7C ; FREE_PVAL # BAL MUS SYM RIGHT-HAND OPEN DUG 2197 1B7D..1B7F ; UNASSIGNED # .. 2198 1B80..1BF3 ; PVALID # SUND SIGN PANYECEK..BATAK PANONGONAN 2199 1BF4..1BFB ; UNASSIGNED # .. 2200 1BFC..1BFF ; FREE_PVAL # BATAK SYM BINDU NA METEK..BATAK SYM 2201 1C00..1C37 ; PVALID # LEPCHA LET KA..LEPCHA SIGN NUKTA 2202 1C38..1C3A ; UNASSIGNED # .. 2203 1C3B..1C3F ; FREE_PVAL # LEPCHA PUNCT TA-ROL..LEPCHA PUNCT T 2204 1C40..1C49 ; PVALID # LEPCHA DIG ZERO..LEPCHA DIG NINE 2205 1C4A..1C4C ; UNASSIGNED # .. 2206 1C4D..1C7D ; PVALID # LEPCHA LET TTA..OL CHIKI AHAD 2207 1C7E..1C7F ; FREE_PVAL # OL CHIKI PUNCT MUCAAD..OL CHIKI PUN 2208 1C80..1CBF ; UNASSIGNED # .. 2209 1CC0..1CC7 ; FREE_PVAL # SUNDA PUNCT BINDU SURYA..SUNDA PUNC 2210 1CC8..1CCF ; UNASSIGNED # .. 
2211 1CD0..1CD2 ; PVALID # VED TONE KARSHANA..VED TONE PRENKHA 2212 1CD3 ; FREE_PVAL # VED SIGN NIHSHVASA 2213 1CD4..1CF6 ; PVALID # VED SIGN YAJURVEDIC MID SVARITA..VE 2214 1CF7..1CFF ; UNASSIGNED # .. 2215 1D00..1D2B ; PVALID # LAT LET SM CAP A..CYR LET SM 2216 1D2C..1D2E ; FREE_PVAL # MOD LET CAP A..MOD LET C 2217 1D2F ; PVALID # MOD LET CAP BARRED B 2218 1D30..1D3A ; FREE_PVAL # MOD LET CAP D..MOD LET C 2219 1D3B ; PVALID # MOD LET CAP REV N 2220 1D3C..1D4D ; FREE_PVAL # MOD LET CAP O..MOD LET S 2221 1D4E ; PVALID # MOD LET SM TURNED I 2222 1D4F..1D6A ; FREE_PVAL # MOD LET SM K..GREEK SUB SMA 2223 1D6B..1D77 ; PVALID # LAT SM LET UE..LAT SM LET TU 2224 1D78 ; FREE_PVAL # MOD LET CYR EN 2225 1D79..1D9A ; PVALID # LAT SM LET INSULAR G..LAT SM LE 2226 1D9B..1DBF ; FREE_PVAL # MOD LET SM TURNED ALPHA..MOD 2227 1DC0..1DE6 ; PVALID # COMB DOTTED GRAVE ACCENT..COMB LAT 2228 1DE7..1DFB ; UNASSIGNED # .. 2229 1DFC..1E99 ; PVALID # COMB DOUBLE INV BREVE BEL..LAT SM L 2230 1E9A ; FREE_PVAL # LAT SM LET A W R HALF RING 2231 1E9B..1F15 ; PVALID # LAT SM LET LONG S W BOT ABOVE..GR 2232 1F16..1F17 ; UNASSIGNED # .. 2233 1F18..1F1D ; FREE_PVAL # GREEK CAP LET EPSILON W PSILI..GRE 2234 1F1E..1F1F ; UNASSIGNED # .. 2235 1F20..1F45 ; PVALID # GREEK SM LET ETA W PSILI..GREEK SMA 2236 1F46..1F47 ; UNASSIGNED # .. 2237 1F48..1F4D ; FREE_PVAL # GREEK CAP LET OMICRON W PSILI..GRE 2238 1F4E..1F4F ; UNASSIGNED # .. 2239 1F50..1F57 ; PVALID # GREEK SM LET UPSILON W PSILI..GREEK 2240 1F58 ; UNASSIGNED # 2241 1F59 ; PVALID # GREEK CAP LET UPSILON W DASIA 2242 1F5A ; UNASSIGNED # 2243 1F5B ; PVALID # GREEK CAP LET UPSILON W DASIA AND 2244 1F5C ; UNASSIGNED # 2245 1F5D ; PVALID # GREEK CAP LET UPSILON W DASIA AND 2246 1F5E ; UNASSIGNED # 2247 1F5F..1F7D ; PVALID # GREEK CAP LET UPSILON W DASIA A..GR 2248 1F7E..1F7F ; UNASSIGNED # .. 
2249 1F80..1F87 ; PVALID # GREEK SM LET ALPHA W PSILI AND YPOG
2250 1F88..1F8F ; FREE_PVAL # GREEK CAP LET ALPHA W PSILI AND..GR
2251 1F90..1F97 ; PVALID # GREEK SM LET ETA W PSILI AND YP..GR
2252 1F98..1F9F ; FREE_PVAL # GREEK CAP LET ETA W PSILI AND P..GR
2253 1FA0..1FA7 ; PVALID # GREEK SM LET OMEGA W PSILI AND ..GR
2254 1FA8..1FAF ; FREE_PVAL # GREEK CAP LET OMEGA W PSILI AN..GR
2255 1FB0..1FB4 ; PVALID # GREEK SM LET ALPHA W VRACHY..GREEK
2256 1FB5 ; UNASSIGNED #
2257 1FB6..1FBB ; PVALID # GREEK SM LET ALPHA W PERISPOMEN..GR
2258 1FBC..1FBD ; FREE_PVAL # GREEK CAP LET ALPHA W PROSGEGRA..GR
2259 1FBE ; PVALID # GREEK PROSGEGRAMMENI
2260 1FBF..1FC1 ; FREE_PVAL # GREEK PSILI..GREEK DIALYTIKA AND PE
2261 1FC2..1FC4 ; PVALID # GREEK SM LET ETA W VARIA AND YP..GR
2262 1FC5 ; UNASSIGNED #
2263 1FC6..1FCB ; PVALID # GREEK SM LET ETA W PERISPOMENI..GR
2264 1FCC..1FCF ; FREE_PVAL # GREEK CAP LET ETA W PROSGEGRAM..GR
2265 1FD0..1FD3 ; PVALID # GREEK SM LET IOTA W VRACHY..GREEK S
2266 1FD4..1FD5 ; UNASSIGNED # ..
2267 1FD6..1FDB ; PVALID # GREEK SM LET IOTA W PERISPOMENI..GR
2268 1FDC ; UNASSIGNED #
2269 1FDD..1FDF ; FREE_PVAL # GREEK DASIA AND VARIA..GREEK DASIA
2270 1FE0..1FEC ; PVALID # GREEK SM LET UPSILON W VRACHY..GREE
2271 1FED..1FEF ; FREE_PVAL # GREEK DIALYTIKA AND VARIA..GREEK VA
2272 1FF0..1FF1 ; UNASSIGNED # ..
2273 1FF2..1FF4 ; FREE_PVAL # GREEK SM LET OMEGA W VARIA AND YPOG
2274 1FF5 ; UNASSIGNED #
2275 1FF6..1FFB ; PVALID # GREEK SM LET OMEGA W PERISPOMEN..GR
2276 1FFC..1FFE ; FREE_PVAL # GREEK CAP LET OMEGA W PROSGEGRA..GR
2277 1FFF ; UNASSIGNED #
2278 2000..200A ; FREE_PVAL # EN QUAD..HAIR SPACE
2279 200B ; DISALLOWED # ZERO WIDTH SPACE
2280 200C..200D ; CONTEXTJ # ZERO WIDTH NON-JOINER..ZERO WIDTH J
2281 200E..200F ; DISALLOWED # LEFT-TO-RIGHT MARK..RIGHT-TO-LEFT M
2282 2010..2027 ; FREE_PVAL # HYPHEN..HYPHENATION POINT
2283 2028..202E ; DISALLOWED # LINE SEP..RIGHT-TO-LEFT OVERRIDE
2284 202F..205F ; FREE_PVAL # NARROW NO-BREAK SPACE..MED MATH SP
2285 2060..2064 ; DISALLOWED # WORD JOINER..INVISIBLE PLUS
2286 2065 ; UNASSIGNED #
2287 2066..206F ; DISALLOWED # LEFT-TO-RIGHT IS..NOM DIGIT SHAPES
2288 2070..2071 ; FREE_PVAL # SUPER ZERO..SUPER LAT SM LET I
2289 2072..2073 ; UNASSIGNED # ..
2290 2074..208E ; FREE_PVAL # SUPER FOUR..SUB RIGHT PARENTHESIS
2291 208F ; UNASSIGNED #
2292 2090..209C ; FREE_PVAL # LAT SUB SM LET A..LAT SUB SM LET T
2293 209D..209F ; UNASSIGNED # ..
2294 20A0..20BA ; FREE_PVAL # EURO-CURRENCY SIGN..TURKISH LIRA SI
2295 20BB..20CF ; UNASSIGNED # ..
2296 20D0..20DC ; PVALID # COMB LEFT HARPOON ABOVE..COMB FOUR
2297 20DD..20E0 ; FREE_PVAL # COMB ENC CIRC..COMB ENC CIRC BACKS
2298 20E1 ; PVALID # COMB L R ARROW ABOVE
2299 20E2..20E4 ; FREE_PVAL # COMB ENC SCREEN..COMB ENC UPWARD PO
2300 20E5..20F0 ; PVALID # COMB REV SOLIDUS OVERLAY..COMB ASTE
2301 20F1..20FF ; UNASSIGNED # ..
2302 2100..2129 ; FREE_PVAL # ACCOUNT OF..TURNED GREEK SM LET IOT
2303 212A..212B ; PVALID # KELVIN SIGN..ANGSTROM SIGN
2304 212C..2131 ; FREE_PVAL # SCRIPT CAP C..SCRIPT CAP F
2305 2132 ; PVALID # TURNED CAP F
2306 2133..214D ; FREE_PVAL # SCRIPT CAP M..AKTIESELSKAB
2307 214E ; PVALID # TURNED SM F
2308 214F..2182 ; FREE_PVAL # SYM FOR SAMAR SOURCE..ROM NUM TEN T
2309 2183..2184 ; PVALID # ROM NUM REV ONE HUNDRED..LAT SM LET
2310 2185..2189 ; FREE_PVAL # ROM NUM SIX LATE FORM..VULGAR FRACT
2311 218A..218F ; UNASSIGNED # ..
2312 2190..23F3 ; FREE_PVAL # LEFTWARDS ARROW..HOURGLASS W FLO
2313 23F4..23FF ; UNASSIGNED # ..
2314 2400..2426 ; FREE_PVAL # SYM FOR NULL..SYM FOR SUB FORM
2315 2427..243F ; UNASSIGNED # ..
2316 2440..244A ; FREE_PVAL # OCR HOOK..OCR DOUBLE BACKSLASH
2317 244B..245F ; UNASSIGNED # ..
2318 2460..26FF ; FREE_PVAL # CIRCLED DIG ONE..WHITE FLAG W HORIZ
2319 2700 ; UNASSIGNED #
2320 2701..2B4C ; FREE_PVAL # UP BLADE SCISSORS..RIGHTWARDS ARROW
2321 2B4D..2B4F ; UNASSIGNED # ..
2322 2B50..2B59 ; FREE_PVAL # WHITE MEDIUM STAR..HEAVY CIRCLED SA
2323 2B5A..2BFF ; UNASSIGNED # ..
2324 2C00..2C2E ; PVALID # GLAG CAP LET AZU..GLAG CA
2325 2C2F ; UNASSIGNED #
2326 2C30..2C5E ; PVALID # GLAG SM LET AZU..GLAG SMAL
2327 2C5F ; UNASSIGNED #
2328 2C60..2C7B ; PVALID # LAT CAP LET L W DOUBLE BAR..LAT SM
2329 2C7C..2C7D ; FREE_PVAL # LAT SUB SM LET J..MOD LET CAP V
2330 2C7E..2CE4 ; PVALID # LAT CAP LET S W SWASH TAIL..COPT SY
2331 2CE5..2CEA ; FREE_PVAL # COPT SYM MI RO..COPT SYM SHIMA SIMA
2332 2CEB..2CF3 ; PVALID # COPT CAP LET CRYPTOGRAMMIC SHEI..CO
2333 2CF4..2CF8 ; UNASSIGNED # ..
2334 2CF9..2CFF ; FREE_PVAL # COPT OLD NUB FULL STOP..COPT MORPHO
2335 2D00..2D25 ; PVALID # GEORG SM LET AN..GEORG SM LET
2336 2D26 ; UNASSIGNED #
2337 2D27 ; PVALID # GEORG SM LET YN
2338 2D28..2D2C ; UNASSIGNED # ..
2339 2D2D ; PVALID # GEORG SM LET AEN
2340 2D2E..2D2F ; UNASSIGNED # ..
2341 2D30..2D67 ; PVALID # TIFINAGH LET YA..TIFINAGH LETTER YO
2342 2D68..2D6E ; UNASSIGNED # ..
2343 2D6F..2D70 ; FREE_PVAL # TIFINAGH MOD LET LABIALIZATION MARK
2344 2D71..2D7E ; UNASSIGNED # ..
2345 2D7F..2D96 ; PVALID # TIFINAGH CONS JOINER..ETHI SYL GGW
2346 2D97..2D9F ; UNASSIGNED # ..
2347 2DA0..2DA6 ; PVALID # ETHI SYL SSA..ETHI SYL SSO
2348 2DA7 ; UNASSIGNED #
2349 2DA8..2DAE ; PVALID # ETHI SYL CCA..ETHI SYL CCO
2350 2DAF ; UNASSIGNED #
2351 2DB0..2DB6 ; PVALID # ETHI SYL ZZA..ETHI SYL ZZO
2352 2DB7 ; UNASSIGNED #
2353 2DB8..2DBE ; PVALID # ETHI SYL CCHA..ETHI SYL CC
2354 2DBF ; UNASSIGNED #
2355 2DC0..2DC6 ; PVALID # ETHI SYL QYA..ETHI SYL QYO
2356 2DC7 ; UNASSIGNED #
2357 2DC8..2DCE ; PVALID # ETHI SYL KYA..ETHI SYL KYO
2358 2DCF ; UNASSIGNED #
2359 2DD0..2DD6 ; PVALID # ETHI SYL XYA..ETHI SYL XYO
2360 2DD7 ; UNASSIGNED #
2361 2DD8..2DDE ; PVALID # ETHI SYL GYA..ETHI SYL GYO
2362 2DDF ; UNASSIGNED #
2363 2DE0..2DFF ; PVALID # COMB CYR LET BE..COMB CYRI
2364 2E00..2E2E ; FREE_PVAL # RIGHT ANGLE SUB MARK..REV QUEST MAR
2365 2E2F ; PVALID # VERT TILDE
2366 2E30..2E3B ; FREE_PVAL # RING PNT..THREE-EM DASH
2367 2E3C..2E7F ; UNASSIGNED # ..
2368 2E80..2E99 ; FREE_PVAL # CJK RAD REPEAT..CJK RAD RAP
2369 2E9A ; UNASSIGNED #
2370 2E9B..2EF3 ; FREE_PVAL # CJK RAD CHOKE..CJK RAD C-SIMPLIFIED
2371 2EF4..2EFF ; UNASSIGNED # ..
2372 2F00..2FD5 ; FREE_PVAL # KANGXI RAD ONE..KANGXI RAD FLUTE
2373 2FD6..2FEF ; UNASSIGNED # ..
2374 2FF0..2FFB ; FREE_PVAL # IDEO DESC CHAR LEFT TO RIGHT..IDEO
2375 2FFC..2FFF ; UNASSIGNED # ..
2376 3000..3004 ; FREE_PVAL # IDEO SPACE..JAPAN INDUST STAND
2377 3005..3007 ; PVALID # IDEO ITER MARK..IDEO NUMB ZERO
2378 3008..3029 ; FREE_PVAL # LEFT ANGLE BRACKET..HANGZH NUM NINE
2379 302A..302D ; PVALID # IDEO LEVEL TONE MARK..IDEO ENT
2380 302E..302F ; DISALLOWED # HANGUL SING DOT TONE MARK..WAVY DAS
2381 3030 ; FREE_PVAL # WAVY DASH
2382 3031..3035 ; DISALLOWED # VERT KANA REP MARK..VERT KANA REP M
2383 3036..303A ; FREE_PVAL # CIRCLED POSTAL MARK..HANGZH NUM THI
2384 303B ; DISALLOWED # VERT IDEO ITER MARK
2385 303C ; PVALID # MASU MARK
2386 303D..303F ; FREE_PVAL # PART ALTER MARK..IDEO HALF FILL
2387 3040 ; UNASSIGNED #
2388 3041..3096 ; PVALID # HIRAGANA LET SM A..HIRAGANA LET SMA
2389 3097..3098 ; UNASSIGNED # ..
2390 3099..309A ; PVALID # COMB KAT-HIR VOICED SOUND
2391 309B..309C ; FREE_PVAL # KAT-HIR VOICED SOUND MARK..KAT-HIR
2392 309D..309E ; PVALID # HIRAGANA ITER MARK..HIRAGANA VOICED
2393 309F..30A0 ; FREE_PVAL # HIRAGANA DIGRAPH YORI..KAT-HIR DOU
2394 30A1..30FA ; PVALID # KATAKANA LET SM A..KATAKANA LET VO
2395 30FB ; CONTEXTO # KATAKANA MIDDLE DOT
2396 30FC..30FE ; PVALID # KAT-HIR PROLONGED SOUND MARK..KATA
2397 30FF ; FREE_PVAL # KATAKANA DIGRAPH KOTO
2398 3100..3104 ; UNASSIGNED # ..
2399 3105..312D ; PVALID # BOPOMOFO LET B..BOPOMOFO LET IH
2400 312E..3130 ; UNASSIGNED # ..
2401 3131..3163 ; FREE_PVAL # HANGUL LET KIYEOK..HANGUL LET I
2402 3164 ; DISALLOWED # HANGUL FILLER
2403 3165..318E ; FREE_PVAL # HANGUL LET SSANGNIEUN..HANGUL LET
2404 318F ; UNASSIGNED #
2405 3190..319F ; FREE_PVAL # IDEO ANNO LINK MARK..IDEO ANNO MAN
2406 31A0..31BA ; PVALID # BOPOMOFO LET BU..BOPOMOFO LET ZY
2407 31BB..31BF ; UNASSIGNED # ..
2408 31C0..31E3 ; FREE_PVAL # CJK STROKE T..CJK STROKE Q
2409 31E4..31EF ; UNASSIGNED # ..
2410 31F0..31FF ; PVALID # KATAKANA LET SM KU..KATAKANA LET SM
2411 3200..321E ; FREE_PVAL # PAREN HANGUL KIYEOK..PAREN KOREAN C
2412 321F ; UNASSIGNED #
2413 3220..32FE ; FREE_PVAL # PAREN IDEO ONE..CIRCLED KATAKANA WO
2414 32FF ; UNASSIGNED #
2415 3300..33FF ; FREE_PVAL # SQUARE APAATO..SQUARE GAL
2416 3400..4DB5 ; PVALID #
2417 4DB6..4DBF ; UNASSIGNED # ..
2418 4DC0..4DFF ; FREE_PVAL # HEX FOR THE CREATIVE HEAVEN..HEX FO
2419 4E00..9FCC ; PVALID #
2420 9FCD..9FFF ; UNASSIGNED # ..
2421 A000..A48C ; PVALID # YI SYL IT..YI SYL YYR
2422 A48D..A48F ; UNASSIGNED # ..
2423 A490..A4C6 ; FREE_PVAL # YI RAD QOT..YI RAD KE
2424 A4C7..A4CF ; UNASSIGNED # ..
2425 A4D0..A4FD ; PVALID # LISU LET BA..LISU LET TONE MYA JEU
2426 A4FE..A4FF ; FREE_PVAL # LISU PUNCT COMMA..LISU PUNCT FUL
2427 A500..A60C ; PVALID # VAI SYL EE..VAI SYL LENENER
2428 A60D..A60F ; FREE_PVAL # VAI COMMA..VAI QUEST MARK
2429 A610..A62B ; PVALID # VAI SYL NDOLE FA..VAI SYL NDOLE DO
2430 A62C..A63F ; UNASSIGNED # ..
2431 A640..A66F ; PVALID # CYR CAP LET ZEMLYA..COMB CYR VZMET
2432 A670..A673 ; FREE_PVAL # COMB CYR TEN MILLIONS SIGN..SLAVON
2433 A674..A67D ; PVALID # COMB CYR KAVYKA..COMB CYR PAYEROK
2434 A67E ; FREE_PVAL # CYR KAVYKA
2435 A67F..A697 ; PVALID # CYR PAYEROK..CYR SM LET SHWE
2436 A698..A69E ; UNASSIGNED # ..
2437 A69F..A6E5 ; PVALID # COMB CYR LET IOTIFIED E..BAMUM LET
2438 A6E6..A6EF ; FREE_PVAL # BAMUM LET MO..BAMUM LET KOGHOM
2439 A6F0..A6F1 ; PVALID # BAMUM COMB MARK KOQNDON..BAMUM COMB
2440 A6F2..A6F7 ; FREE_PVAL # BAMUM NJAEMLI..BAMUM QUEST MARK
2441 A6F8..A6FF ; UNASSIGNED # ..
2442 A700..A716 ; FREE_PVAL # MOD LET CHIN TONE YIN PING..MOD
2443 A717..A71F ; PVALID # MOD LET DOT VERT BAR..MOD L
2444 A720..A721 ; FREE_PVAL # MOD LET STRESS AND HIGH TONE..MOD
2445 A722..A76F ; PVALID # LAT CAP LET EGYPT ALEF..LAT SM LET
2446 A770 ; FREE_PVAL # MODIFIER LETTER US
2447 A771..A788 ; PVALID # LATIN SMALL LETTER DUM..MOD LET LOW
2448 A789..A78A ; FREE_PVAL # MOD LET COLON..MOD LET SH EQUALS SI
2449 A78B..A78E ; PVALID # LAT SM LET SALTILLO..LAT SM LET L W
2450 A78F ; UNASSIGNED #
2451 A790..A793 ; PVALID # LAT CAP LET N W DESC..LAT SM LET C
2452 A794..A79F ; UNASSIGNED # ..
2453 A7A0..A7AA ; PVALID # LAT CAP LET G W OBLIQUE STROKE..LAT
2454 A7AB..A7F7 ; UNASSIGNED # ..
2455 A7F8..A7F9 ; FREE_PVAL # MOD LET CAP H W STROKE..MOD LET SM
2456 A7FA..A827 ; PVALID # LAT LET SM CAP TURNED M..SYLOTI NA
2457 A828..A82B ; FREE_PVAL # SYLOTI NAGRI POET MARK-1..SYLOTI NA
2458 A82C..A82F ; UNASSIGNED # ..
2459 A830..A839 ; FREE_PVAL # N INDIC FRACT ONE QUART..N INDIC QU
2460 A83A..A83F ; UNASSIGNED # ..
2461 A840..A873 ; PVALID # PHAGS-PA LET KA..PHAGS-PA LET CANDR
2462 A874..A877 ; FREE_PVAL # PHAGS-PA SINGLE HEAD MARK..PHAGS-PA
2463 A878..A87F ; UNASSIGNED # ..
2464 A880..A8C4 ; PVALID # SAUR SIGN ANUSVARA..SAUR SIGN VIRAM
2465 A8C5..A8CD ; UNASSIGNED # ..
2466 A8CE..A8CF ; FREE_PVAL # SAUR DANDA..SAUR DOUBLE DANDA
2467 A8D0..A8D9 ; PVALID # SAUR DIG ZERO..SAUR DIG NINE
2468 A8DA..A8DF ; UNASSIGNED # ..
2469 A8E0..A8F7 ; PVALID # COMB DEVAN DIG ZERO..DEVAN SIGN CAN
2470 A8F8..A8FA ; FREE_PVAL # DEVAN SIGN PUSHPIKA..DEVAN CARET
2471 A8FB ; PVALID # DEVAN HEADSTROKE
2472 A8FC..A8FF ; UNASSIGNED # ..
2473 A900..A92D ; PVALID # KAYAH LI DIG ZERO..KAYAH LI TONE CA
2474 A92E..A92F ; FREE_PVAL # KAYAH LI SIGN CWI..KAYAH LI SIGN SH
2475 A930..A953 ; PVALID # REJANG LET KA..REJANG VIRAMA
2476 A954..A95E ; UNASSIGNED # ..
2477 A95F ; FREE_PVAL # REJANG SECTION MARK
2478 A960..A97C ; DISALLOWED # HANGUL CHO TIKEUT-MIUEM..HANGUL CHO
2479 A97D..A97F ; UNASSIGNED # ..
2480 A980..A9C0 ; PVALID # JAV SIGN PANYANGGA..JAV PANGKON
2481 A9C1..A9CD ; FREE_PVAL # JAV LEFT RERENGGAN..JAV TURNED PADA
2482 A9CE ; UNASSIGNED #
2483 A9CF..A9D9 ; PVALID # JAV PANGRANGKEP..JAV DIG NINE
2484 A9DA..A9DD ; UNASSIGNED # ..
2485 A9DE..A9DF ; FREE_PVAL # JAV PADA TIRTA TUMETES..JAV PADA I
2486 A9E0..A9FF ; UNASSIGNED # ..
2487 AA00..AA36 ; PVALID # CHAM LET A..CHAM CONS SIGN WA
2488 AA37..AA3F ; UNASSIGNED # ..
2489 AA40..AA4D ; PVALID # CHAM LET FIN K..CHAM CONS SIGN FIN
2490 AA4E..AA4F ; UNASSIGNED # ..
2491 AA50..AA59 ; PVALID # CHAM DIG ZERO..CHAM DIG NINE
2492 AA5A..AA5B ; UNASSIGNED # ..
2493 AA5C..AA5F ; FREE_PVAL # CHAM PUNCT SPIRAL..CHAM PUNCT TR
2494 AA60..AA76 ; PVALID # MYAN LET KHAMTI GA..MYAN LOGOGRAM K
2495 AA77..AA79 ; FREE_PVAL # MYAN SYM AITON EXCLAM..MYAN SYM AIT
2496 AA7A..AA7B ; PVALID # MYAN LET AITON RA..MYAN SIGN PAO KA
2497 AA7C..AA7F ; UNASSIGNED # ..
2498 AA80..AAC2 ; PVALID # TAI VIET LET LOW KO..TAI VIET TONE
2499 AAC3..AADA ; UNASSIGNED # ..
2500 AADB..AADD ; PVALID # TAI VIET SYM KON..TAI VIET SYM SAM
2501 AADE..AADF ; FREE_PVAL # TAI VIET SYM HO HOI..TAI VIET SYM K
2502 AAE0..AAEF ; PVALID # MEETEI MAYEK LET E..MEETEI MAYEK VO
2503 AAF0..AAF1 ; FREE_PVAL # MEETEI MAYEK CHEIKHAN..MEETEI MAYEK
2504 AAF2..AAF6 ; PVALID # MEETEI MAYEK ANJI..MEETEI MAYEK VIR
2505 AAF7..AB00 ; UNASSIGNED # ..
2506 AB01..AB06 ; PVALID # ETHI SYL TTHU..ETHI SYL TTHO
2507 AB07..AB08 ; UNASSIGNED # ..
2508 AB09..AB0E ; PVALID # ETHI SYL DDHAA..ETHI SYL DDHO
2509 AB0F..AB10 ; UNASSIGNED # ..
2510 AB11..AB16 ; PVALID # ETHI SYL DZU..ETHI SYL DZO
2511 AB17..AB1F ; UNASSIGNED # ..
2512 AB20..AB26 ; PVALID # ETHI SYL CCHHA..ETHI SYL CCHHO
2513 AB27 ; UNASSIGNED # ..
2514 AB28..AB2E ; PVALID # ETHI SYL BBAA..ETHI SYL BBO
2515 AB2F..ABBF ; UNASSIGNED # ..
2516 ABC0..ABEA ; PVALID # MEETEI MAYEK LET KOK..MEETEI MAYEK
2517 ABEB ; FREE_PVAL # MEETEI MAYEK CHEIKHEI
2518 ABEC..ABED ; PVALID # MEETEI MAYEK LUM IYEK..MEETEI MAYEK
2519 ABEE..ABEF ; UNASSIGNED # ..
2520 ABF0..ABF9 ; PVALID # MEETEI MAYEK DIG ZERO..MEETEI MAYEK
2521 ABFA..ABFF ; UNASSIGNED # ..
2522 AC00..D7A3 ; PVALID #
2523 D7A4..D7AF ; UNASSIGNED # ..
2524 D7B0..D7C6 ; DISALLOWED # HANGUL JUNG O-YEO..HANGUL JUNG ARAE
2525 D7C7..D7CA ; UNASSIGNED # ..
2526 D7CB..D7FB ; DISALLOWED # HANGUL JONG NIEUN-RIEUL..HANGUL JON
2527 D7FC..D7FF ; UNASSIGNED # ..
2528 D800..F8FF ; DISALLOWED #
2529 F900..FA6D ; PVALID # CJK COMP IDEO-F900..CJK COMP IDEO
2530 FA6E..FA6F ; UNASSIGNED # ..
2531 FA70..FAD9 ; PVALID # CJK COMP IDEO-FA70..CJK COMP IDEO
2532 FADA..FAFF ; UNASSIGNED # ..
2533 FB00..FB06 ; FREE_PVAL # LAT SM LIG FF..LAT SM LIG ST
2534 FB07..FB12 ; UNASSIGNED # ..
2535 FB13..FB17 ; FREE_PVAL # ARMENIAN SM LIG MEN NOW..ARMENIAN SM
2536 FB18..FB1C ; UNASSIGNED # ..
2537 FB1D..FB1F ; PVALID # HEBR LET YOD W HIRIQ..HEBR LIG YID Y
2538 FB20..FB29 ; FREE_PVAL # HEBR LET ALT AYIN..HEB LET ALT PLUS
2539 FB2A..FB36 ; PVALID # HEBR LET SHIN W SHIN DOT..HEBR LET Z
2540 FB37 ; UNASSIGNED #
2541 FB38..FB3C ; PVALID # HEBR LET TET W DAGESH..HEBR LET
2542 FB3D ; UNASSIGNED #
2543 FB3E ; PVALID # HEBR LET MEM W DAGESH
2544 FB3F ; UNASSIGNED #
2545 FB40..FB41 ; PVALID # HEBR LET NUN W DAGESH..HEBR LET
2546 FB42 ; UNASSIGNED #
2547 FB43..FB44 ; PVALID # HEBR LET FIN PE W DAGESH..HEBR L
2548 FB45 ; UNASSIGNED #
2549 FB46..FB4E ; PVALID # HEBR LET TSADI W DAGESH..HEBR LET P
2550 FB4F..FBC1 ; FREE_PVAL # HEBR LIG ALEF LAMED..ARAB SYM S
2551 FBC2..FBD2 ; UNASSIGNED # ..
2552 FBD3..FD3F ; FREE_PVAL # ARAB LET NG ISO FORM..ORNATE RIGHT
2553 FD40..FD4F ; UNASSIGNED # ..
2554 FD50..FD8F ; FREE_PVAL # ARAB LIG TEH W JEEM W MEEM INIT
2555 FD90..FD91 ; UNASSIGNED # ..
2556 FD92..FDC7 ; FREE_PVAL # ARAB LIG MEEM W JEEM W KHAH INI
2557 FDC8..FDCF ; UNASSIGNED # ..
2558 FDD0..FDEF ; DISALLOWED # ..
2559 FDF0..FDFD ; FREE_PVAL # ARAB LIG SALLA USED..ARAB LIG BISMI
2560 FDFE..FDFF ; UNASSIGNED # ..
2561 FE00..FE0F ; DISALLOWED # VAR SEL-1..VAR SEL-16
2562 FE10..FE19 ; FREE_PVAL # PRES FORM FOR VERT COMMA..PRES FORM
2563 FE1A..FE1F ; UNASSIGNED # ..
2564 FE20..FE26 ; PVALID # COMB LIG LEFT HALF..COMB CONJ MACRO
2565 FE27..FE2F ; UNASSIGNED # ..
2566 FE30..FE52 ; FREE_PVAL # PRES FORM FOR VERT TWO DOT LEAD..SM
2567 FE53 ; UNASSIGNED #
2568 FE54..FE66 ; FREE_PVAL # SM SEMICOLON..SM EQUALS SIGN
2569 FE67 ; UNASSIGNED #
2570 FE68..FE6B ; FREE_PVAL # SM REV SOLIDUS..SM COMM AT
2571 FE6C..FE6F ; UNASSIGNED # ..
2572 FE70..FE72 ; FREE_PVAL # ARAB FATHATAN ISO FORM..ARAB DAMMAT
2573 FE73 ; PVALID # ARAB TAIL FRAGMENT
2574 FE74 ; FREE_PVAL # ARAB KASRATAN ISO FORM
2575 FE75 ; UNASSIGNED #
2576 FE76..FEFC ; FREE_PVAL # ARAB FATHA ISO FORM..ARAB LIG LAM W
2577 FEFD..FEFE ; UNASSIGNED # ..
2578 FEFF ; DISALLOWED # ZERO WIDTH NO-BREAK SPACE
2579 FF00 ; UNASSIGNED #
2580 FF01..FF9F ; FREE_PVAL # FULLW EXCLAM MARK..HALFW KATA SE
2581 FFA0 ; DISALLOWED # HALFW HANGUL FILLER
2582 FFA1..FFBE ; FREE_PVAL # HALFW HANGUL LET KIYEOK..HALFW H
2583 FFBF..FFC1 ; UNASSIGNED # ..
2584 FFC2..FFC7 ; FREE_PVAL # HALFW HANGUL LET A..HALFW HANGUL
2585 FFC8..FFC9 ; UNASSIGNED # ..
2586 FFCA..FFCF ; FREE_PVAL # HALFW HANGUL LET YEO..HALFW HANGU
2587 FFD0..FFD1 ; UNASSIGNED # ..
2588 FFD2..FFD7 ; FREE_PVAL # HALFW HANGUL LET YO..HALFW HANGUL
2589 FFD8..FFD9 ; UNASSIGNED # ..
2590 FFDA..FFDC ; FREE_PVAL # HALFW HANGUL LET EU..HALFW HANGUL
2591 FFDD..FFDF ; UNASSIGNED # ..
2592 FFE0..FFE6 ; FREE_PVAL # FULLW CENT SIGN..FULLW WON SIGN
2593 FFE7 ; UNASSIGNED #
2594 FFE8..FFEE ; FREE_PVAL # HALFW FORMS LIGHT VERT..HALFW WH
2595 FFEF..FFF8 ; UNASSIGNED # ..
2596 FFF9..FFFB ; DISALLOWED # INTERL ANNO ANCHOR..INTERL ANNO TER
2597 FFFC..FFFD ; FREE_PVAL # OBJECT REPL CHAR..REPL CHAR
2598 FFFE..FFFF ; DISALLOWED # ..
2599 10000..1000B; PVALID # LIN B SYL B008 A..LIN B SYL
2600 1000C ; UNASSIGNED #
2601 1000D..10026; PVALID # LIN B SYL B036 JO..LIN B SYL
2602 10027 ; UNASSIGNED #
2603 10028..1003A; PVALID # LIN B SYL B060 RA..LIN B SYL
2604 1003B ; UNASSIGNED #
2605 1003C..1003D; PVALID # LIN B SYL B017 ZA..LIN B SYL
2606 1003E ; UNASSIGNED #
2607 1003F..1004D; PVALID # LIN B SYL B020 ZO..LIN B SYL
2608 1004E..1004F; UNASSIGNED # ..
2609 10050..1005D; PVALID # LIN B SYM B018..LIN B SYM B089
2610 1005E..1007F; UNASSIGNED # ..
2611 10080..100FA; PVALID # LIN B IDEO B100 MAN..LIN B IDEO
2612 100FB..100FF; UNASSIGNED # ..
2613 10100..10102; FREE_PVAL # AEG WORD SEP LINE..AEG CHECK MAR
2614 10103..10106; UNASSIGNED # ..
2615 10107..10133; FREE_PVAL # AEG NUM ONE..AEG NUM NINETY THOU
2616 10134..10136; UNASSIGNED # ..
2617 10137..1018A; FREE_PVAL # AEG WEIGHT BASE UNIT..GREEK ZERO SI
2618 1018B..1018F; UNASSIGNED # ..
2619 10190..1019B; FREE_PVAL # ROM SEXTANS SIGN..ROM CENTURIAL SIG
2620 1019C..101CF; UNASSIGNED # ..
2621 101D0..101FC; FREE_PVAL # PHAISTOS DISC SIGN PED..PHAISTOS DI
2622 101FD ; PVALID # PHAISTOS DISC SIGN COMB OBLIQUE STR
2623 101FE..1027F; UNASSIGNED # ..
2624 10280..1029C; PVALID # LYCIAN LET A..LYCIAN LET X
2625 1029D..1029F; UNASSIGNED # ..
2626 102A0..102D0; PVALID # CARIAN LET A..CARIAN LET UUU3
2627 102D1..102FF; UNASSIGNED # ..
2628 10300..1031E; PVALID # OLD ITAL LET A..OLD ITAL LET UU
2629 1031F ; UNASSIGNED #
2630 10320..10323; FREE_PVAL # OLD ITAL NUM ONE..OLD ITAL NUM F
2631 10324..1032F; UNASSIGNED # ..
2632 10330..10340; PVALID # GOTH LET AHSA..GOTH LET PAIRTHRA
2633 10341 ; FREE_PVAL # GOTH LET NINETY
2634 10342..10349; PVALID # GOTH LET RAIDA..GOTH LET OTHAL
2635 1034A ; FREE_PVAL # GOTH LET NINE HUNDRED
2636 1034B..1037F; UNASSIGNED # ..
2637 10380..1039D; PVALID # UGAR LET ALPA..UGAR LET SSU
2638 1039E ; UNASSIGNED #
2639 1039F ; FREE_PVAL # UGAR WORD DIVIDER
2640 103A0..103C3; PVALID # OLD PERS SIGN A..OLD PERS SIGN HA
2641 103C4..103C7; UNASSIGNED # ..
2642 103C8..103CF; PVALID # OLD PERS SIGN AURAMAZDAA..OLD PERS
2643 103D0..103D5; FREE_PVAL # OLD PERS WORD DIVIDER..OLD PERS NUM
2644 103D6..103FF; UNASSIGNED # ..
2645 10400..1049D; PVALID # DESERET CAP LET LONG I..OSMANYA LET
2646 1049E..1049F; UNASSIGNED # ..
2647 104A0..104A9; PVALID # OSMANYA DIG ZERO..OSMANYA DIG NINE
2648 104AA..107FF; UNASSIGNED # ..
2649 10800..10805; PVALID # CYPRIOT SYL A..CYPRIOT SYL JA
2650 10806..10807; UNASSIGNED # ..
2651 10808 ; PVALID # CYPRIOT SYL JO
2652 10809 ; UNASSIGNED #
2653 1080A..10835; PVALID # CYPRIOT SYL KA..CYPRIOT SYL WO
2654 10836 ; UNASSIGNED #
2655 10837..10838; PVALID # CYPRIOT SYL XA..CYPRIOT SYL XE
2656 10839..1083B; UNASSIGNED # ..
2657 1083C ; PVALID # CYPRIOT SYL ZA
2658 1083D..1083E; UNASSIGNED # ..
2659 1083F..10855; PVALID # CYPRIOT SYL ZO..IMP ARAM LET TAW
2660 10856 ; UNASSIGNED #
2661 10857..1085F; FREE_PVAL # IMP ARAM SECT SIGN..IMP ARAM
2662 10860..108FF; UNASSIGNED # ..
2663 10900..10915; PVALID # PHOEN LET ALF..PHOEN LET TAU
2664 10916..1091B; FREE_PVAL # PHOEN NUM ONE..PHOEN NUM THR
2665 1091C..1091E; UNASSIGNED # ..
2666 1091F ; FREE_PVAL # PHOEN WORD SEP
2667 10920..10939; PVALID # LYDIAN LET A..LYDIAN LET C
2668 1093A..1093E; UNASSIGNED # ..
2669 1093F ; FREE_PVAL # LYDIAN TRIANGULAR MARK
2670 10940..1097F; UNASSIGNED # ..
2671 10980..109B7; PVALID # MERO HIER LET A..MERO CURS LET
2672 109B8..109BD; UNASSIGNED # ..
2673 109BE..109BF; PVALID # MERO CURS LOG RMT..MERO CURS L
2674 109C0..109FF; UNASSIGNED # ..
2675 10A00..10A03; PVALID # KHARO LET A..KHARO VOW SIGN V
2676 10A04 ; UNASSIGNED #
2677 10A05..10A06; PVALID # KHARO VOW SIGN E..KHARO VOW SI
2678 10A07..10A0B; UNASSIGNED # ..
2679 10A0C..10A13; PVALID # KHARO VOW LEN MARK..KHARO LET
2680 10A14 ; UNASSIGNED #
2681 10A15..10A17; PVALID # KHARO LET CA..KHARO LET JA
2682 10A18 ; UNASSIGNED #
2683 10A19..10A33; PVALID # KHARO LET NYA..KHARO LET TTT
2684 10A34..10A37; UNASSIGNED # ..
2685 10A38..10A3A; PVALID # KHARO SIGN BAR ABOVE..KHARO SIGN D
2686 10A3B..10A3E; UNASSIGNED # ..
2687 10A3F ; PVALID # KHARO VIRAMA
2688 10A40..10A47; FREE_PVAL # KHARO DIG ONE..KHARO NUM ONE
2689 10A48..10A4F; UNASSIGNED # ..
2690 10A50..10A58; FREE_PVAL # KHARO PUNCT DOT..KHARO PUNCT
2691 10A59..10A5F; UNASSIGNED # ..
2692 10A60..10A7C; PVALID # OLD S ARAB LET HE..OLD SOUTH ARAB
2693 10A7D..10A7F; FREE_PVAL # OLD S ARAB NUM ONE..OLD SOUTH ARAB
2694 10A80..10AFF; UNASSIGNED # ..
2695 10B00..10B35; PVALID # AVESTAN LET A..AVESTAN LET HE
2696 10B36..10B38; UNASSIGNED # ..
2697 10B39..10B3F; FREE_PVAL # AVESTAN ABBR MARK..LARGE ONE RING O
2698 10B40..10B55; PVALID # INSCRIPT PARTHIAN LET ALEPH..INSCRI
2699 10B56..10B57; UNASSIGNED # ..
2700 10B58..10B5F; FREE_PVAL # INSCRIPT PARTHIAN NUM ONE..INSCRIPT
2701 10B60..10B72; PVALID # INSCRIPT PAHLAVI LET ALEPH..INSCRIP
2702 10B73..10B77; UNASSIGNED # ..
2703 10B78..10B7F; FREE_PVAL # INSCRIPT PAHLAVI NUM ONE..INSCRIPT
2704 10B80..10BFF; UNASSIGNED # ..
2705 10C00..10C48; PVALID # OLD TURK LET ORKHON A..OLD TURK LET
2706 10C49..10E5F; UNASSIGNED # ..
2707 10E60..10E7E; FREE_PVAL # RUMI DIG ONE..RUMI FRACTION TWO THI
2708 10E7F..10FFF; UNASSIGNED # ..
2709 11000..11046; PVALID # BRAHMI SIGN CANDRABINDU..BRAHMI VIR
2710 11047..1104D; FREE_PVAL # BRAHMI DANDA..BRAHMI PUNCT LOTUS
2711 1104E..11051; UNASSIGNED # ..
2712 11052..11065; FREE_PVAL # BRAHMI NUM ONE..BRAHMI NUM ONE THOU
2713 11066..1106F; PVALID # BRAHMI DIG ZERO..BRAHMI DIG NINE
2714 11070..1107F; UNASSIGNED # ..
2715 11080..110BA; PVALID # KAITHI SIGN CANDRABINDU..KAITHI SIG
2716 110BB..110BC; FREE_PVAL # KAITHI ABBR SIGN..KAITHI ENUM SIGN
2717 110BD ; DISALLOWED # KAITHI NUM SIGN
2718 110BE..110C1; FREE_PVAL # KAITHI SECT MARK..KAITHI DOUBLE DAN
2719 110C2..110CF; UNASSIGNED # ..
2720 110D0..110E8; PVALID # SORA SOMPENG LETTER SAH..SORA SOMPE
2721 110E9..110EF; UNASSIGNED # ..
2722 110F0..110F9; PVALID # SORA SOMPENG DIG ZERO..SORA SOMPENG DI
2723 110FA..110FF; UNASSIGNED # ..
2724 11100..11134; PVALID # CHAKMA SIGN CANDRABINDU..CHAKMA MAAYY
2725 11135 ; UNASSIGNED #
2726 11136..1113F; PVALID # CHAKMA DIG ZERO..CHAKMA DIG NINE
2727 11140..11143; FREE_PVAL # CHAKMA SECT MARK..CHAKMA QUEST MARK
2728 11144..1117F; UNASSIGNED # ..
2729 11180..111C4; PVALID # SHARADA SIGN CANDRABINDU..SHARADA OM
2730 111C5..111C8; FREE_PVAL # SHARADA DANDA..SHARADA SEPARATOR
2731 111C9..111CF; UNASSIGNED # ..
2732 111D0..111D9; PVALID # SHARADA DIG ZERO..SHARADA DIG NINE
2733 111DA..1167F; UNASSIGNED # ..
2734 11680..116B7; PVALID # TAKRI LET A..TAKRI SIGN NUKTA
2735 116B8..116BF; UNASSIGNED # ..
2736 116C0..116C9; PVALID # TAKRI DIGIT ZERO..TAKRI DIG NINE
2737 116CA..11FFF; UNASSIGNED # ..
2738 12000..1236E; PVALID # CUNEI SIGN A..CUNEI SIGN ZUM
2739 1236F..123FF; UNASSIGNED # ..
2740 12400..12462; FREE_PVAL # CUNEI NUM SIGN TWO ASH..CUNEI NUM
2741 12463..1246F; UNASSIGNED # ..
2742 12470..12473; FREE_PVAL # CUNEI PUNCT SIGN OLD ASSYRIAN WORD
2743 12474..12FFF; UNASSIGNED # ..
2744 13000..1342E; PVALID # EGYPT HIERO A001..EGYPT HIERO AA032
2745 1342F..167FF; UNASSIGNED # ..
2746 16800..16A38; PVALID # BAMUM LET PHASE-A NGKUE MFON..BAMUM LE
2747 16A39..16EFF; UNASSIGNED # ..
2748 16F00..16F44; PVALID # MIAO LET PA..MIAO LET HHA
2749 16F45..16F4F; UNASSIGNED # ..
2750 16F50..16F7E; PVALID # MIAO LET NAS..MIAO VOWEL SIGN NG
2751 16F7F..16F8E; UNASSIGNED # ..
2752 16F8F..16F9F; PVALID # MIAO TONE RIGHT..MIAO LET REF TON
2753 16FA0..1AFFF; UNASSIGNED # ..
2754 1B000..1B001; PVALID # KATA LET ARCH E..KATA LET ARCH YE 2755 1B002..1CFFF; UNASSIGNED # .. 2756 1D000..1D0F5; FREE_PVAL # BYZ MUS SYM PSILI..BYZ MUS 2757 1D0F6..1D0FF; UNASSIGNED # .. 2758 1D100..1D126; FREE_PVAL # MUS SYM SINGLE BARLINE..MUS SYMBOL 2759 1D127..1D128; UNASSIGNED # .. 2760 1D129..1D164; FREE_PVAL # MUS SYM MULT MEASURE REST..MUS SYM ONE 2761 1D165..1D169; PVALID # MUS SYM COMB STEM..MUS SYM COMB TREMOL 2762 1D16A..1D16C; FREE_PVAL # MUS SYM FING TREM-1..MUS SYM FING TREM 2763 1D16D..1D172; PVALID # MUS SYM COMB AUG DOT..MUS SYM COMB FL 2764 1D173..1D17A; DISALLOWED # MUS SYM BEGIN BEAM..MUS SYM END PHRASE 2765 1D17B..1D182; PVALID # MUS SYM COMB ACCENT..MUS SYM COMB LOUR 2766 1D183..1D184; FREE_PVAL # MUS SYM ARP UP..MUS SYM ARP DOWN 2767 1D185..1D18B; PVALID # MUS SYM COMB DOIT..MUS SYM COMB TRIPLE 2768 1D18C..1D1A9; FREE_PVAL # MUS SYM RINFORZANDO..MUS SYM DEG SLASH 2769 1D1AA..1D1AD; PVALID # MUS SYM COMB DOWN BOW..MUS SYM COMB SN 2770 1D1AE..1D1DD; FREE_PVAL # MUS SYM PEDAL MARK..MUS SYM PES SUBPUN 2771 1D1DE..1D1FF; UNASSIGNED # .. 2772 1D200..1D241; FREE_PVAL # GREEK VOCAL NOTATION SYM-1..GREEK INS 2773 1D242..1D244; FREE_PVAL # COMB GREEK MUS TRISEME..COMB GREEK MU 2774 1D245 ; FREE_PVAL # GREEK MUSICAL LEIMMA 2775 1D246..1D2FF; UNASSIGNED # .. 2776 1D300..1D356; DISALLOWED # MONOG FOR EARTH..TETRAG FOR FOSTERING 2777 1D357..1D35F; UNASSIGNED # .. 2778 1D360..1D371; DISALLOWED # COUNT ROD UNIT DIG ONE..COUNT ROD TE 2779 1D372..1D3FF; UNASSIGNED # .. 2780 1D400..1D454; FREE_PVAL # MATH BOLD CAP A..MATH IT 2781 1D455 ; UNASSIGNED # 2782 1D456..1D49C; FREE_PVAL # MATH ITAL SM I..MATH SC 2783 1D49D ; UNASSIGNED # 2784 1D49E..1D49F; FREE_PVAL # MATH SCRIPT CAP C..MATH 2785 1D4A0..1D4A1; UNASSIGNED # .. 2786 1D4A2 ; FREE_PVAL # MATH SCRIPT CAP G 2787 1D4A3..1D4A4; UNASSIGNED # .. 2788 1D4A5..1D4A6; FREE_PVAL # MATH SCRIPT CAP J..MATH 2789 1D4A7..1D4A8; UNASSIGNED # .. 
2790     1D4A9..1D4AC; FREE_PVAL  # MATH SCRIPT CAP N..MATH
2791     1D4AD       ; UNASSIGNED #
2792     1D4AE..1D4B9; FREE_PVAL  # MATH SCRIPT CAP S..MATH
2793     1D4BA       ; UNASSIGNED #
2794     1D4BB       ; FREE_PVAL  # MATH SCRIPT SM F
2795     1D4BC       ; UNASSIGNED #
2796     1D4BD..1D4C3; FREE_PVAL  # MATH SCRIPT SM H..MATH SC
2797     1D4C4       ; UNASSIGNED #
2798     1D4C5..1D505; FREE_PVAL  # MATH SCRIPT SM P..MATH FR
2799     1D506       ; UNASSIGNED #
2800     1D507..1D50A; FREE_PVAL  # MATH FRAKTUR CAP D..MATH
2801     1D50B..1D50C; UNASSIGNED # ..
2802     1D50D..1D514; FREE_PVAL  # MATH FRAKTUR CAP J..MATH
2803     1D515       ; UNASSIGNED #
2804     1D516..1D51C; FREE_PVAL  # MATH FRAKTUR CAP S..MATH
2805     1D51D       ; UNASSIGNED #
2806     1D51E..1D539; FREE_PVAL  # MATH FRAKTUR SM A..MATH D
2807     1D53A       ; UNASSIGNED #
2808     1D53B..1D53E; FREE_PVAL  # MATH DOUBLE-STRUCK CAP D..MATHEM
2809     1D53F       ; UNASSIGNED #
2810     1D540..1D544; FREE_PVAL  # MATH DOUBLE-STRUCK CAP I..MATHEM
2811     1D545       ; UNASSIGNED #
2812     1D546       ; FREE_PVAL  # MATH DOUBLE-STRUCK CAP O
2813     1D547..1D549; UNASSIGNED # ..
2814     1D54A..1D550; FREE_PVAL  # MATH DOUBLE-STRUCK CAP S..MATHEM
2815     1D551       ; UNASSIGNED #
2816     1D552..1D6A5; FREE_PVAL  # MATH DOUBLE-STRUCK SM A..MATHEMAT
2817     1D6A6..1D6A7; UNASSIGNED # ..
2818     1D6A8..1D7CB; FREE_PVAL  # MATH BOLD CAP ALPHA..MATHEMATICA
2819     1D7CC..1D7CD; UNASSIGNED # ..
2820     1D7CE..1D7FF; FREE_PVAL  # MATH BOLD DIG ZERO..MATH M
2821     1D800..1EDFF; UNASSIGNED # ..
2822     1EE00..1EE03; FREE_PVAL  # ARAB MATH ALEF..ARAB MATH DAL
2823     1EE04       ; UNASSIGNED #
2824     1EE05..1EE1F; FREE_PVAL  # ARAB MATH WAW..ARAB MATH DOTLESS QAF
2825     1EE20       ; UNASSIGNED #
2826     1EE21..1EE22; FREE_PVAL  # ARAB MATH INIT BEH..ARAB MATH INIT JEE
2827     1EE23       ; UNASSIGNED #
2828     1EE24       ; FREE_PVAL  # ARAB MATH INIT HEH
2829     1EE25..1EE26; UNASSIGNED # ..
2830     1EE27       ; FREE_PVAL  # ARAB MATH INIT HAH
2831     1EE28       ; UNASSIGNED #
2832     1EE29..1EE32; FREE_PVAL  # ARAB MATH INIT YEH..ARAB MATH INIT QAF
2833     1EE33       ; UNASSIGNED #
2834     1EE34..1EE37; FREE_PVAL  # ARAB MATH INIT SHEEN..ARAB MATH INITIA
2835     1EE38       ; UNASSIGNED #
2836     1EE39       ; FREE_PVAL  # ARAB MATH INIT DAD
2837     1EE3A       ; UNASSIGNED #
2838     1EE3B       ; FREE_PVAL  # ARAB MATH INIT GHAIN
2839     1EE3C..1EE41; UNASSIGNED # ..
2840     1EE42       ; FREE_PVAL  # ARAB MATH TAILED JEEM
2841     1EE43..1EE46; UNASSIGNED # ..
2842     1EE47       ; FREE_PVAL  # ARAB MATH TAILED HAH
2843     1EE48       ; UNASSIGNED #
2844     1EE49       ; FREE_PVAL  # ARAB MATH TAILED YEH
2845     1EE4A       ; UNASSIGNED #
2846     1EE4B       ; FREE_PVAL  # ARAB MATH TAILED LAM
2847     1EE4C       ; UNASSIGNED #
2848     1EE4D..1EE4F; FREE_PVAL  # ARAB MATH TAILED NOON..ARAB MATH TAILE
2849     1EE50       ; UNASSIGNED #
2850     1EE51..1EE52; FREE_PVAL  # ARAB MATH TAILED QAF..ARAB MATH TAILED
2851     1EE53       ; UNASSIGNED #
2852     1EE54       ; FREE_PVAL  # ARAB MATH TAILED SHEEN
2853     1EE55..1EE56; UNASSIGNED # ..
2854     1EE57       ; FREE_PVAL  # ARAB MATH TAILED KHAH
2855     1EE58       ; UNASSIGNED #
2856     1EE59       ; FREE_PVAL  # ARAB MATH TAILED DAD
2857     1EE5A       ; UNASSIGNED #
2858     1EE5B       ; FREE_PVAL  # ARAB MATH TAILED GHAIN
2859     1EE5C       ; UNASSIGNED #
2860     1EE5D       ; FREE_PVAL  # ARAB MATH TAILED DOTLESS NOON
2861     1EE5E       ; UNASSIGNED #
2862     1EE5F       ; FREE_PVAL  # ARAB MATH TAILED DOTLESS QAF
2863     1EE60       ; UNASSIGNED #
2864     1EE61..1EE62; FREE_PVAL  # ARAB MATH STRETCHED BEH..ARAB MATH STR
2865     1EE63       ; UNASSIGNED #
2866     1EE64       ; FREE_PVAL  # ARAB MATH STRETCHED HEH
2867     1EE65..1EE66; UNASSIGNED # ..
2868     1EE67..1EE6A; FREE_PVAL  # ARAB MATH STRETCHED HAH..ARAB MATH STR
2869     1EE6B       ; UNASSIGNED #
2870     1EE6C..1EE72; FREE_PVAL  # ARAB MATH STRETCHED MEEM..ARAB MATH ST
2871     1EE73       ; UNASSIGNED #
2872     1EE74..1EE77; FREE_PVAL  # ARAB MATH STRETCHED SHEEN..ARAB MATH S
2873     1EE78       ; UNASSIGNED #
2874     1EE79..1EE7C; FREE_PVAL  # ARAB MATH STRETCHED DAD..ARAB MATH STR
2875     1EE7D       ; UNASSIGNED #
2876     1EE7E       ; FREE_PVAL  # ARAB MATH STRETCHED DOTLESS FEH
2877     1EE7F       ; UNASSIGNED #
2878     1EE80..1EE89; FREE_PVAL  # ARAB MATH LOOPED ALEF..ARAB MATH LOOPE
2879     1EE8A       ; UNASSIGNED #
2880     1EE8B..1EE9B; FREE_PVAL  # ARAB MATH LOOPED LAM..ARAB MATH LOOPED
2881     1EE9C..1EEA0; UNASSIGNED # ..
2882     1EEA1..1EEA3; FREE_PVAL  # ARAB MATH DOUBLE-STRUCK BEH..ARAB MATH
2883     1EEA4       ; UNASSIGNED #
2884     1EEA5..1EEA9; FREE_PVAL  # ARAB MATH DOUBLE-STRUCK WAW..ARAB MATH
2885     1EEAA       ; UNASSIGNED #
2886     1EEAB..1EEBB; FREE_PVAL  # ARAB MATH DOUBLE-STRUCK LAM..ARAB MATH
2887     1EEBC..1EEEF; UNASSIGNED # ..
2888     1EEF0..1EEF1; FREE_PVAL  # ARAB MATH OP MEEM W HAH W TATWEEL..AR
2889     1EEF2..1EFFF; UNASSIGNED # ..
2890     1F000..1F02B; FREE_PVAL  # MAHJONG TILE EAST WIND..MAHJONG TILE B
2891     1F02C..1F02F; UNASSIGNED # ..
2892     1F030..1F093; FREE_PVAL  # DOMINO TILE HORIZ BACK..DOMINO TILE VE
2893     1F094..1F09F; UNASSIGNED # ..
2894     1F0A0..1F0AE; FREE_PVAL  # PLAY CARD BACK..PLAY CARD KING OF SPAD
2895     1F0AF..1F0B0; UNASSIGNED # ..
2896     1F0B1..1F0BE; FREE_PVAL  # PLAY CARD ACE OF HEARTS..PLAY CARD KIN
2897     1F0BF..1F0C0; UNASSIGNED # ..
2898     1F0C1..1F0CF; FREE_PVAL  # PLAY CARD ACE OF DIAMONDS..PLAY CARD B
2899     1F0D0       ; UNASSIGNED #
2900     1F0D1..1F0DF; FREE_PVAL  # PLAY CARD ACE OF CLUBS..PLAY CARD WHIT
2901     1F0E0..1F0FF; UNASSIGNED # ..
2902     1F100..1F10A; FREE_PVAL  # DIG ZERO FULL STOP..DIG NINE COMMA
2903     1F10B..1F10F; UNASSIGNED # ..
2904     1F110..1F12E; FREE_PVAL  # PARENTHESIZED LAT CAP LET A..CIRCLE
2905     1F12F       ; UNASSIGNED #
2906     1F130..1F16B; FREE_PVAL  # SQUARED LAT CAP LET A..RAISED MD SIGN
2907     1F16C..1F16F; UNASSIGNED # ..
2908     1F170..1F19A; FREE_PVAL  # NEG SQ LAT CAP LET A..SQUARED VS
2909     1F19B..1F1E5; UNASSIGNED # ..
2910     1F1E6..1F202; FREE_PVAL  # REG IND SYMB LET A..SQ KATAKANA SA
2911     1F203..1F20F; UNASSIGNED # ..
2912     1F210..1F23A; FREE_PVAL  # SQ CJK UNIF IDEO-624B..SQ CJK UNIF IDE
2913     1F23B..1F23F; UNASSIGNED # ..
2914     1F240..1F248; FREE_PVAL  # TORT SH BRACK CJK UNIF IDEO-672C..TORT
2915     1F249..1F24F; UNASSIGNED # ..
2916     1F250..1F251; FREE_PVAL  # CIRC IDEO ADVANTAGE..CIRC IDEO ACCEPT
2917     1F252..1F2FF; UNASSIGNED # ..
2918     1F300..1F320; FREE_PVAL  # CYCLONE..SHOOTING STAR
2919     1F321..1F32F; UNASSIGNED # ..
2920     1F330..1F335; FREE_PVAL  # CHESTNUT..CACTUS
2921     1F336       ; UNASSIGNED #
2922     1F337..1F37C; FREE_PVAL  # TULIP..BABY BOTTLE
2923     1F37D..1F37F; UNASSIGNED # ..
2924     1F380..1F393; FREE_PVAL  # RIBBON..GRADUATION CAP
2925     1F394..1F39F; UNASSIGNED # ..
2926     1F3A0..1F3C4; FREE_PVAL  # CAROUSEL HORSE..SURFER
2927     1F3C5       ; UNASSIGNED #
2928     1F3C6..1F3CA; FREE_PVAL  # TROPHY..SWIMMER
2929     1F3CB..1F3DF; UNASSIGNED # ..
2930     1F3E0..1F3F0; FREE_PVAL  # HOUSE BUILDING..EUROPEAN CASTLE
2931     1F3F1..1F3FF; UNASSIGNED # ..
2932     1F400..1F43E; FREE_PVAL  # RAT..PAW PRINTS
2933     1F43F       ; UNASSIGNED #
2934     1F440       ; FREE_PVAL  # EYES
2935     1F441       ; UNASSIGNED #
2936     1F442..1F4F7; FREE_PVAL  # EAR..CAMERA
2937     1F4F8       ; UNASSIGNED #
2938     1F4F9..1F4FC; FREE_PVAL  # VIDEO CAMERA..VIDEOCASSETTE
2939     1F4FD..1F4FF; UNASSIGNED # ..
2940     1F500..1F53D; FREE_PVAL  # TWISTED RIGHTWARDS ARROWS..DOWN-POINTI
2941     1F53E..1F53F; UNASSIGNED # ..
2942     1F540..1F543; FREE_PVAL  # CIRCLED CROSS POMMEE..NOTCHED LEFT SEM
2943     1F544..1F54F; UNASSIGNED # ..
2944     1F550..1F567; FREE_PVAL  # CLOCK FACE ONE OCLOCK..CLOCK FACE TWEL
2945     1F568..1F5FA; UNASSIGNED # ..
2946     1F5FB..1F640; FREE_PVAL  # MOUNT FUJI..WEARY CAT FACE
2947     1F641..1F644; UNASSIGNED # ..
2948     1F645..1F64F; FREE_PVAL  # FACE W NO GOOD GESTURE..PERSON W FO
2949     1F650..1F67F; UNASSIGNED # ..
2950     1F680..1F6C5; FREE_PVAL  # ROCKET..LEFT LUGGAGE
2951     1F6C6..1F6FF; UNASSIGNED # ..
2952     1F700..1F773; FREE_PVAL  # ALCHEMICAL SYMBOL FOR QUINTESSENCE..AL
2953     1F774..1FFFF; UNASSIGNED # ..
2954     20000..2A6D6; PVALID     #
2955     2A6D7..2A6FF; UNASSIGNED # ..
2956     2A700..2B734; PVALID     #
2957     2B735..2B73F; UNASSIGNED # ..
2958     2B740..2B81D; PVALID     #
2959     2B81E..2F7FF; UNASSIGNED # ..
2960     2F800..2FA1D; PVALID     # CJK COMP IDEO-2F800..CJK COMPA
2961     2FA1E..2FFFD; UNASSIGNED # ..
2962     2FFFE..2FFFF; DISALLOWED # ..
2963     30000..3FFFD; UNASSIGNED # ..
2964     3FFFE..3FFFF; DISALLOWED # ..
2965     40000..4FFFD; UNASSIGNED # ..
2966     4FFFE..4FFFF; DISALLOWED # ..
2967     50000..5FFFD; UNASSIGNED # ..
2968     5FFFE..5FFFF; DISALLOWED # ..
2969     60000..6FFFD; UNASSIGNED # ..
2970     6FFFE..6FFFF; DISALLOWED # ..
2971     70000..7FFFD; UNASSIGNED # ..
2972     7FFFE..7FFFF; DISALLOWED # ..
2973     80000..8FFFD; UNASSIGNED # ..
2974     8FFFE..8FFFF; DISALLOWED # ..
2975     90000..9FFFD; UNASSIGNED # ..
2976     9FFFE..9FFFF; DISALLOWED # ..
2977     A0000..AFFFD; UNASSIGNED # ..
2978     AFFFE..AFFFF; DISALLOWED # ..
2979     B0000..BFFFD; UNASSIGNED # ..
2980     BFFFE..BFFFF; DISALLOWED # ..
2981     C0000..CFFFD; UNASSIGNED # ..
2982     CFFFE..CFFFF; DISALLOWED # ..
2983     D0000..DFFFD; UNASSIGNED # ..
2984     DFFFE..DFFFF; DISALLOWED # ..
2985     E0000       ; UNASSIGNED #
2986     E0001       ; DISALLOWED # LANGUAGE TAG
2987     E0002..E001F; UNASSIGNED # ..
2988     E0020..E007F; DISALLOWED # TAG SPACE..CANCEL TAG
2989     E0080..E00FF; UNASSIGNED # ..
2990     E0100..E01EF; DISALLOWED # VAR SEL-17..VAR SEL-256
2991     E01F0..EFFFD; UNASSIGNED # ..
2992     EFFFE..10FFFF; DISALLOWED # ..

2994  Appendix B.  Acknowledgements

2996     The authors would like to acknowledge the comments and contributions
2997     of the following individuals: David Black, Mark Davis, Alan DeKok,
2998     Martin Duerst, Patrik Faltstrom, Ted Hardie, Joe Hildebrand, Bjoern
2999     Hoehrmann, Paul Hoffman, Jeffrey Hutzelman, Simon Josefsson, John
3000     Klensin, Alexey Melnikov, Takahiro Nemoto, Yoav Nir, Mike Parker,
3001     Pete Resnick, Andrew Sullivan, Dave Thaler, Yoshiro Yoneya, and
3002     Florian Zeitz.

3004     Some algorithms and textual descriptions have been borrowed from
3005     [RFC5892].  Some text regarding security has been borrowed from
3006     [RFC5890] and [I-D.ietf-xmpp-6122bis].

3008     Peter Saint-Andre wishes to acknowledge Cisco Systems, Inc., for
3009     employing him during his work on earlier versions of this document.

3011  Authors' Addresses

3013     Peter Saint-Andre
3014     &yet

3016     Email: ietf@stpeter.im

3018     Marc Blanchet
3019     Viagenie
3020     246 Aberdeen
3021     Quebec, QC  G1R 2E1
3022     Canada

3024     Email: Marc.Blanchet@viagenie.ca
3025     URI:   http://www.viagenie.ca/