PRECIS                                                    P. Saint-Andre
Internet-Draft                                       Cisco Systems, Inc.
Obsoletes: 3454 (if approved)                                M. Blanchet
Intended status: Standards Track                                Viagenie
Expires: June 8, 2014                                   December 5, 2013

   PRECIS Framework: Preparation and Comparison of Internationalized
                   Strings in Application Protocols
                    draft-ietf-precis-framework-13

Abstract

   Application protocols using Unicode characters in protocol strings
   need to properly prepare such strings in order to perform valid
   comparison operations (e.g., for purposes of authentication or
   authorization). This document defines a framework enabling
   application protocols to perform the preparation and comparison of
   internationalized strings ("PRECIS") in a way that depends on the
   properties of Unicode characters and thus is agile with respect to
   versions of Unicode. As a result, this framework provides a more
   sustainable approach to the handling of internationalized strings
   than the previous framework, known as Stringprep (RFC 3454).
   This document obsoletes RFC 3454.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on June 8, 2014.

Copyright Notice

   Copyright (c) 2013 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1. Introduction
   2. Terminology
   3. String Classes
     3.1. Overview
     3.2. IdentifierClass
     3.3. FreeformClass
   4. Profiles
     4.1. Principles
     4.2. Building Application-Layer Constructs
     4.3. A Note about Spaces
   5. Order of Operations
   6. Code Point Properties
   7. Category Definitions Used to Calculate Derived Property
     7.1. LetterDigits (A)
     7.2. Unstable (B)
     7.3. IgnorableProperties (C)
     7.4. IgnorableBlocks (D)
     7.5. LDH (E)
     7.6. Exceptions (F)
     7.7. BackwardCompatible (G)
     7.8. JoinControl (H)
     7.9. OldHangulJamo (I)
     7.10. Unassigned (J)
     7.11. ASCII7 (K)
     7.12. Controls (L)
     7.13. PrecisIgnorableProperties (M)
     7.14. Spaces (N)
     7.15. Symbols (O)
     7.16. Punctuation (P)
     7.17. HasCompat (Q)
     7.18. OtherLetterDigits (R)
   8. Calculation of the Derived Property
   9. IANA Considerations
     9.1. PRECIS Derived Property Value Registry
     9.2. PRECIS Base Classes Registry
     9.3. PRECIS Profiles Registry
   10. Security Considerations
     10.1. General Issues
     10.2. Use of the IdentifierClass
     10.3. Use of the FreeformClass
     10.4. Local Character Set Issues
     10.5. Visually Similar Characters
     10.6. Security of Passwords
   11. Interoperability Considerations
   12. References
     12.1. Normative References
     12.2. Informative References
   Appendix A. Codepoint Table
   Appendix B. Acknowledgements
   Authors' Addresses

1. Introduction

   As described in the problem statement for the preparation and
   comparison of internationalized strings ("PRECIS") [RFC6885], many
   IETF protocols have used the Stringprep framework [RFC3454] as the
   basis for preparing and comparing protocol strings that contain
   Unicode characters [UNICODE] outside the ASCII range [RFC20]. The
   Stringprep framework was developed during work on the original
   technology for internationalized domain names (IDNs), here called
   "IDNA2003" [RFC3490], and Nameprep [RFC3491] was the Stringprep
   profile for IDNs. At the time, Stringprep was designed as a general
   framework so that other application protocols could define their own
   Stringprep profiles for the preparation and comparison of strings and
   identifiers. Indeed, a number of application protocols defined such
   profiles.

   After the publication of [RFC3454] in 2002, several significant
   issues arose with the use of Stringprep in the IDN case, as
   documented in the IAB's recommendations regarding IDNs [RFC4690]
   (most significantly, Stringprep was tied to Unicode version 3.2).
   Therefore, the newer IDNA specifications, here called "IDNA2008"
   ([RFC5890], [RFC5891], [RFC5892], [RFC5893], [RFC5894]), no longer
   use Stringprep and Nameprep. This migration away from Stringprep for
   IDNs has prompted other "customers" of Stringprep to consider new
   approaches to the preparation and comparison of internationalized
   strings, as described in [RFC6885].

   This document defines a framework for a post-Stringprep approach to
   the preparation and comparison of internationalized strings in
   application protocols, based on several principles:

   1. Define a small set of string classes that specify the Unicode
      characters (i.e., specific "code points") appropriate for common
      application protocol constructs.
   2. Define each PRECIS string class in terms of Unicode code points
      and their properties so that an algorithm can be used to
      determine whether each code point or character category is (a)
      valid, (b) allowed in certain contexts, (c) disallowed, or (d)
      unassigned.
   3. Use an "inclusion model" such that a string class consists only
      of code points that are explicitly allowed, with the result that
      any code point not explicitly allowed is forbidden.
   4. Enable application protocols to define profiles of the PRECIS
      string classes, addressing matters such as width mapping, case
      folding and other forms of character mapping, Unicode
      normalization, directionality, and further excluded code points
      or character categories.

   Whereas the string classes define the "baseline" code points for a
   range of applications, profiling enables application protocols to
   further restrict the allowable code points beyond those specified for
   the relevant string class (e.g., characters with special or reserved
   meaning, such as "@" and "/" when used as separators within
   identifiers) and to apply the string classes in ways that are
   appropriate for constructs such as usernames and passwords
   [I-D.ietf-precis-saslprepbis], nicknames [I-D.ietf-precis-nickname],
   the localparts of instant messaging addresses
   [I-D.ietf-xmpp-6122bis], and free-form strings
   [I-D.ietf-xmpp-6122bis]. Profiles are responsible for defining the
   handling of right-to-left characters as well as various mapping
   operations of the kind also discussed for IDNs in [RFC5895], such as
   case preservation or lowercasing, Unicode normalization, mapping of
   certain characters to other characters or to nothing, and mapping of
   full-width and half-width characters.

   It is expected that this framework will yield the following benefits:

   o Application protocols will be agile with regard to Unicode
     versions.
   o Implementers will be able to share code point tables and software
     code across application protocols, most likely by means of
     software libraries.
   o End users will be able to acquire more accurate expectations about
     the characters that are acceptable in various contexts. Given
     this more uniform set of string classes, it is also expected that
     copy/paste operations between software implementing different
     application protocols will be more predictable and coherent.

   Although this framework is similar to IDNA2008 and borrows some of
   the character categories defined in [RFC5892], it defines additional
   character categories to meet the needs of common application
   protocols.

   The character categories and calculation rules defined under
   Section 7 and Section 8 are normative and apply to all Unicode code
   points. The code point table provided under Appendix A is
   non-normative and merely shows, for illustrative purposes, the
   consequences of the character categories and calculation rules, as
   well as the resulting property values.

2. Terminology

   Many important terms used in this document are defined in [RFC5890],
   [RFC6365], [RFC6885], and [UNICODE]. The terms "left-to-right" (LTR)
   and "right-to-left" (RTL) are defined in Unicode Standard Annex #9
   [UAX9].

   As of the date of writing, the version of Unicode published by the
   Unicode Consortium is 6.3; however, PRECIS is not tied to a specific
   version of Unicode.

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in
   [RFC2119].

3. String Classes

3.1. Overview

   IDNA2008 essentially defines a string class of internationalized
   domain name (IDN), although it does not use the term "string class".
   (This document does not define a string class for domain names, and
   application protocols are strongly encouraged to use IDNA2008 as the
   appropriate method to prepare domain names and hostnames.) Because
   the IDN string class is designed to meet the particular requirements
   of the Domain Name System (DNS), additional string classes are needed
   for non-DNS applications.

   Starting in 2010, various "customers" of Stringprep began to discuss
   the need to define a post-Stringprep approach to the preparation and
   comparison of internationalized strings other than IDNs.
   This community analyzed the existing Stringprep profiles and also
   weighed the costs and benefits of defining a relatively small set of
   Unicode characters that would minimize the potential for user
   confusion caused by visually similar characters (and thus be
   relatively "safe") vs. defining a much larger set of Unicode
   characters that would maximize the potential for user creativity (and
   thus be relatively "expressive"). As a result, the community
   concluded that most existing uses could be addressed by two string
   classes:

   IdentifierClass: a sequence of letters, numbers, and some symbols
      that is used to identify or address a network entity such as a
      user account, a venue (e.g., a chatroom), an information source
      (e.g., a data feed), or a collection of data (e.g., a file); the
      intent is that this class will minimize user confusion in a wide
      variety of application protocols, with the result that safety has
      been prioritized over expressiveness for this class.

   FreeformClass: a sequence of letters, numbers, symbols, spaces, and
      other characters that is used for free-form strings, including
      passwords as well as display elements such as human-friendly
      nicknames in chatrooms; the intent is that this class will allow
      nearly any Unicode character, with the result that expressiveness
      has been prioritized over safety for this class (e.g., protocol
      designers, application developers, service providers, and end
      users might not understand or be able to enter all of the
      characters that can be included in the FreeformClass).

   Future specifications might define additional PRECIS string classes,
   such as a class that falls somewhere between the IdentifierClass and
   the FreeformClass. At this time, it is not clear how useful such a
   class would be.
   In any case, because application developers are able
   to define profiles of PRECIS string classes, a protocol needing a
   construct between the IdentifierClass and the FreeformClass could
   define a restricted profile of the FreeformClass if needed.

   The following subsections discuss the IdentifierClass and
   FreeformClass in more detail, with reference to the dimensions
   described in Section 3 of [RFC6885]. Each string class is defined by
   the following behavioral rules:

   Valid: Defines which code points and character categories are
      treated as valid input to the string.

   Contextual Rule Required: Defines which code points and character
      categories are treated as allowed only if the requirements of a
      contextual rule are met (i.e., either CONTEXTJ or CONTEXTO).

   Disallowed: Defines which code points and character categories need
      to be excluded from the string.

   Unassigned: Defines application behavior in the presence of code
      points that are unknown (i.e., not yet designated) for the version
      of Unicode used by the application.

   This document defines the valid, contextual rule required,
   disallowed, and unassigned rules for the IdentifierClass and
   FreeformClass. As described under Section 4, profiles of these
   string classes are responsible for defining the width mapping,
   additional mapping, case mapping, normalization, directionality, and
   exclusion rules.

3.2. IdentifierClass

   Most application technologies need strings that can be used to refer
   to, include, or communicate protocol strings like usernames, file
   names, data feed identifiers, and chatroom names. We group such
   strings into a class called "IdentifierClass" having the following
   features.

3.2.1. Valid

   o Code points traditionally used as letters and numbers in writing
     systems, i.e., the LetterDigits ("A") category first defined in
     [RFC5892] and listed here under Section 7.1.
   o Code points in the range U+0021 through U+007E, i.e., the
     (printable) ASCII7 ("K") rule defined under Section 7.11. These
     code points are "grandfathered" into PRECIS and thus are valid
     even if they would otherwise be disallowed according to the
     property-based rules specified in the next section.

   Note: Although the PRECIS IdentifierClass re-uses the LetterDigits
   category from IDNA2008, the range of characters allowed in the
   IdentifierClass is wider than the range of characters allowed in
   IDNA2008. The main reason is that IDNA2008 applies the Unstable
   category before the LetterDigits category, thus disallowing uppercase
   characters, whereas the IdentifierClass does not apply the Unstable
   category.

3.2.2. Contextual Rule Required

   o A number of characters from the Exceptions ("F") category defined
     under Section 7.6 (see Section 7.6 for a full list).
   o Joining characters, i.e., the JoinControl ("H") category defined
     under Section 7.8.

3.2.3. Disallowed

   o Old Hangul Jamo characters, i.e., the OldHangulJamo ("I") category
     defined under Section 7.9.
   o Control characters, i.e., the Controls ("L") category defined
     under Section 7.12.
   o Ignorable characters, i.e., the PrecisIgnorableProperties ("M")
     category defined under Section 7.13.
   o Space characters, i.e., the Spaces ("N") category defined under
     Section 7.14.
   o Symbol characters, i.e., the Symbols ("O") category defined under
     Section 7.15.
   o Punctuation characters, i.e., the Punctuation ("P") category
     defined under Section 7.16.
   o Any character that has a compatibility equivalent, i.e., the
     HasCompat ("Q") category defined under Section 7.17. These code
     points are disallowed even if they would otherwise be valid
     according to the property-based rules specified in the previous
     section.
   o Letters and digits other than the "traditional" letters and digits
     allowed in IDNs, i.e., the OtherLetterDigits ("R") category
     defined under Section 7.18.

3.2.4. Unassigned

   Any code points that are not yet designated in the Unicode character
   set SHALL be considered Unassigned for purposes of the
   IdentifierClass, and a string containing such code points SHALL be
   rejected.

3.3. FreeformClass

   Some application technologies need strings that can be used in a
   free-form way, e.g., as a password in an authentication exchange (see
   [I-D.ietf-precis-saslprepbis]) or a nickname in a chatroom (see
   [I-D.ietf-precis-nickname]). We group such strings into a class
   called "FreeformClass" having the following features.

   Note: Consult Section 10.6 for relevant security considerations when
   strings conforming to the FreeformClass, or a profile thereof, are
   used as passwords.

3.3.1. Valid

   o Traditional letters and numbers, i.e., the LetterDigits ("A")
     category first defined in [RFC5892] and listed here under
     Section 7.1.
   o Letters and digits other than the "traditional" letters and digits
     allowed in IDNs, i.e., the OtherLetterDigits ("R") category
     defined under Section 7.18.
   o Code points in the range U+0021 through U+007E, i.e., the
     (printable) ASCII7 ("K") rule defined under Section 7.11.
   o Any character that has a compatibility equivalent, i.e., the
     HasCompat ("Q") category defined under Section 7.17.
   o Space characters, i.e., the Spaces ("N") category defined under
     Section 7.14.
   o Symbol characters, i.e., the Symbols ("O") category defined under
     Section 7.15.
   o Punctuation characters, i.e., the Punctuation ("P") category
     defined under Section 7.16.

3.3.2. Contextual Rule Required

   o A number of characters from the Exceptions ("F") category defined
     under Section 7.6 (see Section 7.6 for a full list).
   o Joining characters, i.e., the JoinControl ("H") category defined
     under Section 7.8.

3.3.3. Disallowed

   o Old Hangul Jamo characters, i.e., the OldHangulJamo ("I") category
     defined under Section 7.9.
   o Control characters, i.e., the Controls ("L") category defined
     under Section 7.12.
   o Ignorable characters, i.e., the PrecisIgnorableProperties ("M")
     category defined under Section 7.13.

3.3.4. Unassigned

   Any code points that are not yet designated in the Unicode character
   set SHALL be considered Unassigned for purposes of the FreeformClass,
   and a string containing such code points SHALL be rejected.

4. Profiles

4.1. Principles

   This framework document defines the valid, contextual-rule-required,
   disallowed, and unassigned rules for the IdentifierClass and the
   FreeformClass. A profile of a PRECIS string class MUST define the
   width mapping, additional mapping (if any), case mapping,
   normalization, directionality, and exclusion rules. A profile MAY
   also restrict the allowable characters above and beyond the
   definition of the relevant PRECIS string class (but MUST NOT add as
   valid any code points or character categories that are disallowed by
   the relevant PRECIS string class). These matters are discussed in
   the following subsections.

   Profiles of the PRECIS string classes MUST register with the IANA as
   described under Section 9.3. It is RECOMMENDED for profile names to
   be of the form "ProfilenameBaseClass", where the "Profilename" string
   is a differentiator and "BaseClass" is the name of the PRECIS string
   class being profiled; for example, the profile of the IdentifierClass
   used for localparts of Jabber IDs in the Extensible Messaging and
   Presence Protocol (XMPP) is named "JIDlocalIdentifierClass"
   [I-D.ietf-xmpp-6122bis].

4.1.1. Width Mapping

   The width mapping rule of a profile specifies whether width mapping
   is performed on fullwidth and halfwidth characters, and how the
   mapping is done. Typically such mapping consists of mapping
   fullwidth and halfwidth characters, i.e., code points with a
   Decomposition Type of Wide or Narrow, to their decomposition
   mappings; as an example, FULLWIDTH DIGIT ZERO (U+FF10) would be
   mapped to DIGIT ZERO (U+0030).

   The normalization form specified by a profile (see below) has an
   impact on the need for width mapping. Because width mapping is
   performed as a part of compatibility decomposition, a profile
   employing either normalization form KD (NFKD) or normalization form
   KC (NFKC) does not need to specify width mapping. However, if
   Unicode normalization form C (NFC) is used then the profile needs to
   specify whether to apply width mapping; in this case, width mapping
   is in general RECOMMENDED because allowing fullwidth and halfwidth
   characters to remain unmapped to their compatibility variants would
   violate the principle of least user surprise. For more information
   about the concept of width in East Asian scripts within Unicode, see
   Unicode Standard Annex #11 [UAX11].

4.1.2. Additional Mappings

   The additional mappings rule of a profile specifies whether
   additional mappings are to be applied, such as mapping of delimiter
   characters, mapping of special characters (e.g., non-ASCII space
   characters to ASCII space or certain characters to nothing), and case
   mapping based on locale or on locale and context (see
   [I-D.ietf-precis-mappings]).

4.1.3. Case Mapping

   The case mapping rule of a profile specifies whether case mapping is
   performed (instead of case preservation) on uppercase and titlecase
   characters, and how the mapping is done (e.g., mapping uppercase and
   titlecase characters to their lowercase equivalents).
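
   As an illustration only, the width mapping of Section 4.1.1 and a
   simple case mapping can be sketched in Python. The function names
   width_map and case_map are hypothetical and not part of this
   framework. The sketch relies on the standard unicodedata module,
   whose decomposition() result begins with "<wide>" or "<narrow>" for
   code points with a Decomposition Type of Wide or Narrow. Using
   str.casefold() as the case mapping is an assumption; a profile might
   specify a different rule or preserve case.

```python
import unicodedata


def width_map(text: str) -> str:
    """Map fullwidth and halfwidth code points to their decomposition mappings."""
    out = []
    for ch in text:
        decomp = unicodedata.decomposition(ch)  # e.g. "<wide> 0030" for U+FF10
        if decomp.startswith(("<wide>", "<narrow>")):
            # Skip the "<wide>"/"<narrow>" tag; the rest are hex code points.
            out.append("".join(chr(int(cp, 16)) for cp in decomp.split()[1:]))
        else:
            out.append(ch)
    return "".join(out)


def case_map(text: str) -> str:
    """Case-map a string; str.casefold() is an assumed stand-in for a profile's rule."""
    return text.casefold()
```

   With these definitions, width_map("\uff10") yields DIGIT ZERO
   (U+0030), matching the example in Section 4.1.1. Width mapping is
   applied before case mapping, so a caller would write
   case_map(width_map(s)).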

   If case preservation is not desired, it is RECOMMENDED to use Unicode
   Default Case Folding as defined in Chapter 3 of the Unicode Standard
   [UNICODE].

   In order to maximize entropy and minimize the potential for false
   positives, it is NOT RECOMMENDED for application protocols to map
   uppercase and titlecase code points to their lowercase equivalents
   when strings conforming to the FreeformClass, or a profile thereof,
   are used in passwords; instead, it is RECOMMENDED to preserve the
   case of all code points contained in such strings and then perform
   case-sensitive comparison. See also the related discussion in
   [I-D.ietf-precis-saslprepbis].

4.1.4. Normalization

   The normalization rule of a profile specifies which Unicode
   normalization form (D, KD, C, or KC) is to be applied (see Unicode
   Standard Annex #15 [UAX15] for background information).

   In accordance with [RFC5198], normalization form C (NFC) is
   RECOMMENDED.

4.1.5. Directionality

   The directionality rule of a profile specifies which strings are to
   be considered left-to-right (LTR) and right-to-left (RTL), and the
   allowable sequences of characters in LTR and RTL strings (see Unicode
   Standard Annex #9 [UAX9]); note that mixed-direction strings are not
   supported, since there is currently no widely accepted and
   implemented solution for the processing and display of
   mixed-direction strings. Possible rules include, but are not limited
   to, (a) considering any string that contains a right-to-left code
   point to be a right-to-left string, or (b) applying the "Bidi Rule"
   from [RFC5893].

4.1.6. Exclusions

   The exclusions rule of a profile specifies whether the profile
   excludes additional code points or character categories above and
   beyond those excluded by the string class being profiled. That is, a
   profile MAY do either of the following:

   1. Exclude specific code points that are allowed by the relevant
      string class.
   2. Exclude characters matching certain Unicode properties (e.g.,
      math symbols) that are included in the relevant PRECIS string
      class.

   As a result of such exclusions, code points that are defined as valid
   for the PRECIS string class being profiled will be defined as
   disallowed for the profile.

4.2. Building Application-Layer Constructs

   Sometimes, an application-layer construct does not map in a
   straightforward manner to one of the PRECIS string classes or a
   profile thereof. Consider, for example, the "simple user name"
   construct in the Simple Authentication and Security Layer (SASL)
   [RFC4422]. Depending on the deployment, a simple user name might
   take the form of a user's full name (e.g., the user's personal name
   followed by a space and then the user's family name). Such a simple
   user name cannot be defined as an instance of the IdentifierClass or
   a profile thereof, since space characters are not allowed in the
   IdentifierClass; however, it could be defined using a space-separated
   sequence of IdentifierClass instances, as in the following
   pseudo-ABNF [RFC5234]:

      fullname = namepart [1*(1*SP namepart)]
      namepart = 1*(idpoint)
      ;
      ; an "idpoint" is a UTF-8 encoded Unicode code point
      ; that conforms to the PRECIS IdentifierClass

   Similar techniques could be used to define many application-layer
   constructs, say of the form "user@domain" or "/path/to/file".

4.3. A Note about Spaces

   With regard to the IdentifierClass, the consensus of the PRECIS
   Working Group was that spaces are problematic for many reasons,
   including:

   o Many Unicode characters are confusable with ASCII space.

   o Even if non-ASCII space characters are mapped to ASCII space
     (U+0020), space characters are often not rendered in user
     interfaces, leading to the possibility that a human user might
     consider a string containing spaces to be equivalent to the same
     string without spaces.

   o In some locales, some devices are known to generate a character
     other than ASCII space (such as ZERO WIDTH JOINER, U+200D) when a
     user performs an action such as hitting the space bar on a
     keyboard.

   One consequence of disallowing space characters in the
   IdentifierClass might be to effectively discourage the use of ASCII
   space (or, even more problematically, non-ASCII space characters)
   within identifiers created in newer application protocols; given the
   challenges involved in properly handling space characters in
   identifiers and other protocol strings, the Working Group considered
   this to be a feature, not a bug.

   However, the FreeformClass does allow spaces, which enables
   application protocols to define profiles of the FreeformClass that
   are more flexible than any profiles of the IdentifierClass. In
   addition, as explained in the previous section, application protocols
   can also define application-layer constructs containing spaces.

5. Order of Operations

   To ensure proper comparison, the following order of operations is
   REQUIRED:

   1. Width mapping
   2. Optionally, additional mappings such as those specified in
      [I-D.ietf-precis-mappings]:
      1. Delimiter mapping
      2. Special mapping
      3. Local case mapping
   3. Non-local case mapping
   4. Normalization
   5. Behavioral rules for determining whether a code point is valid,
      allowed under a contextual rule, disallowed, or unassigned

   As already described, the width mapping, additional mapping,
   non-local case mapping, and normalization operations are specified
   for each profile, whereas the behavioral rules are specified for each
   string class. Some of the logic behind this order is provided under
   Section 4.1.1 and in [I-D.ietf-precis-mappings].

6. Code Point Properties

   In order to implement the string classes described above, this
   document does the following:

   1. Reviews and classifies the collections of code points in the
      Unicode character set by examining various code point properties.

   2. Defines an algorithm for determining a derived property value,
      which can vary depending on the string class being used by the
      relevant application protocol.

   This document is not intended to specify precisely how derived
   property values are to be applied in protocol strings. That
   information is the responsibility of the protocol specification that
   uses or profiles a PRECIS string class from this document.

   The value of the property is to be interpreted as follows.

   PROTOCOL VALID Those code points that are allowed to be used in any
      PRECIS string class (currently, IdentifierClass and
      FreeformClass). Code points with this property value are
      permitted for general use in any string class. The abbreviated
      term "PVALID" is used to refer to this value in the remainder of
      this document.

   SPECIFIC CLASS PROTOCOL VALID Those code points that are allowed to
      be used in specific string classes. Code points with this
      property value are permitted for use in specific string classes.
      In the remainder of this document, the abbreviated term *_PVAL is
      used, where * = (ID | FREE), i.e., either "FREE_PVAL" or
      "ID_PVAL".
640 CONTEXTUAL RULE REQUIRED Some characteristics of the character, such 641 as its being invisible in certain contexts or problematic in 642 others, require that it not be used in labels unless specific 643 other characters or properties are present. As in IDNA2008, there 644 are two subdivisions of CONTEXTUAL RULE REQUIRED, the first for 645 Join_controls (called "CONTEXTJ") and the second for other 646 characters (called "CONTEXTO"). A character with the derived 647 property value CONTEXTJ or CONTEXTO MUST NOT be used unless an 648 appropriate rule has been established and the context of the 649 character is consistent with that rule. The most notable of the 650 CONTEXTUAL RULE REQUIRED characters are the Join Control 651 characters U+200D ZERO WIDTH JOINER and U+200C ZERO WIDTH NON- 652 JOINER, which have a derived property value of CONTEXTJ. See 653 Appendix A of [RFC5892] for more information. 655 DISALLOWED Those code points that are not permitted in any PRECIS 656 string class. 658 SPECIFIC CLASS DISALLOWED Those code points that are not to be 659 included in a specific string class. Code points with this 660 property value are not permitted in one of the string classes but 661 might be permitted in others. In the remainder of this document, 662 the abbreviated term *_DIS is used, where * = (ID | FREE), i.e., 663 either "FREE_DIS" or "ID_DIS". 665 UNASSIGNED Those code points that are not designated (i.e. are 666 unassigned) in the Unicode Standard. 668 The mechanisms described here allow determination of the value of the 669 property for future versions of Unicode (including characters added 670 after Unicode 5.2 or 6.3 depending on the category, since some 671 categories in this document are reused from IDNA2008 and therefore 672 were defined at the time of Unicode 5.2). Changes in Unicode 673 properties that do not affect the outcome of this process therefore 674 do not affect this framework. 
For example, a character can have its 675 Unicode General_Category value [UNICODE] change from So to Sm, or 676 from Lo to Ll, without affecting the algorithm results. Moreover, 677 even if such changes were to result, the BackwardCompatible list 678 (Section 7.7) can be adjusted to ensure the stability of the results. 680 7. Category Definitions Used to Calculate Derived Property 682 The derived property obtains its value based on a two-step procedure: 684 1. Characters are placed in one or more character categories either 685 (1) based on core properties defined by the Unicode Standard or 686 (2) by treating the code point as an exception and addressing the 687 code point based on its code point value. These categories are 688 not mutually exclusive. 689 2. Set operations are used with these categories to determine the 690 values for a property specific to a given string class. These 691 operations are specified under Section 8. 693 (Note: Unicode property names and property value names might have 694 short abbreviations, such as "gc" for the General_Category property 695 and "Ll" for the Lowercase_Letter property value of the gc property.) 697 In the following specification of character categories, the operation 698 that returns the value of a particular Unicode character property for 699 a code point is designated by using the formal name of that property 700 (from the Unicode PropertyAliases.txt [1]) followed by '(cp)' for 701 "code point". For example, the value of the General_Category 702 property for a code point is indicated by General_Category(cp). 704 The first ten categories (A-J) shown below were previously defined 705 for IDNA2008 and are copied directly from [RFC5892]. Some of these 706 categories are reused in PRECIS and some of them are not; however, 707 the lettering of categories is retained to prevent overlap and to 708 ease implementation of both IDNA2008 and PRECIS in a single software 709 application. 
The next eight categories (K-R) are specific to PRECIS. 711 7.1. LetterDigits (A) 713 Note: This category is defined in [RFC5892] and copied here for use 714 in PRECIS. 716 A: General_Category(cp) is in {Ll, Lu, Lm, Lo, Mn, Mc, Nd} 718 These rules identify characters commonly used in mnemonics and often 719 informally described as "language characters". 721 For more information, see Chapter 4 of the Unicode Standard 722 [UNICODE]. 724 The categories used in this rule are: 725 o Ll - Lowercase_Letter 726 o Lu - Uppercase_Letter 727 o Lm - Modifier_Letter 728 o Lo - Other_Letter 729 o Mn - Nonspacing_Mark 730 o Mc - Spacing_Mark 731 o Nd - Decimal_Number 733 7.2. Unstable (B) 735 Note: This category is defined in [RFC5892] but not used in PRECIS. 737 7.3. IgnorableProperties (C) 739 Note: This category is defined in [RFC5892] but not used in PRECIS. 740 See the "PrecisIgnorableProperties (M)" category below for a more 741 inclusive category used in PRECIS identifiers. 743 7.4. IgnorableBlocks (D) 745 Note: This category is defined in [RFC5892] but not used in PRECIS. 747 7.5. LDH (E) 749 Note: This category is defined in [RFC5892] but not used in PRECIS. 750 See the "ASCII7 (K)" category below for a more inclusive category 751 used in PRECIS identifiers. 753 7.6. Exceptions (F) 755 Note: This category is defined in [RFC5892] and used in PRECIS to 756 ensure consistent treatment of the relevant code points. 758 F: cp is in {00B7, 00DF, 0375, 03C2, 05F3, 05F4, 0640, 0660, 759 0661, 0662, 0663, 0664, 0665, 0666, 0667, 0668, 760 0669, 06F0, 06F1, 06F2, 06F3, 06F4, 06F5, 06F6, 761 06F7, 06F8, 06F9, 06FD, 06FE, 07FA, 0F0B, 3007, 762 302E, 302F, 3031, 3032, 3033, 3034, 3035, 303B, 763 30FB} 765 This category explicitly lists code points for which the category 766 cannot be assigned using only the core property values that exist in 767 the Unicode Standard. 
   The values are according to the table below:

   PVALID -- Would otherwise have been DISALLOWED

   00DF; PVALID     # LATIN SMALL LETTER SHARP S
   03C2; PVALID     # GREEK SMALL LETTER FINAL SIGMA
   06FD; PVALID     # ARABIC SIGN SINDHI AMPERSAND
   06FE; PVALID     # ARABIC SIGN SINDHI POSTPOSITION MEN
   0F0B; PVALID     # TIBETAN MARK INTERSYLLABIC TSHEG
   3007; PVALID     # IDEOGRAPHIC NUMBER ZERO

   CONTEXTO -- Would otherwise have been DISALLOWED

   00B7; CONTEXTO   # MIDDLE DOT
   0375; CONTEXTO   # GREEK LOWER NUMERAL SIGN (KERAIA)
   05F3; CONTEXTO   # HEBREW PUNCTUATION GERESH
   05F4; CONTEXTO   # HEBREW PUNCTUATION GERSHAYIM
   30FB; CONTEXTO   # KATAKANA MIDDLE DOT

   CONTEXTO -- Would otherwise have been PVALID

   0660; CONTEXTO   # ARABIC-INDIC DIGIT ZERO
   0661; CONTEXTO   # ARABIC-INDIC DIGIT ONE
   0662; CONTEXTO   # ARABIC-INDIC DIGIT TWO
   0663; CONTEXTO   # ARABIC-INDIC DIGIT THREE
   0664; CONTEXTO   # ARABIC-INDIC DIGIT FOUR
   0665; CONTEXTO   # ARABIC-INDIC DIGIT FIVE
   0666; CONTEXTO   # ARABIC-INDIC DIGIT SIX
   0667; CONTEXTO   # ARABIC-INDIC DIGIT SEVEN
   0668; CONTEXTO   # ARABIC-INDIC DIGIT EIGHT
   0669; CONTEXTO   # ARABIC-INDIC DIGIT NINE
   06F0; CONTEXTO   # EXTENDED ARABIC-INDIC DIGIT ZERO
   06F1; CONTEXTO   # EXTENDED ARABIC-INDIC DIGIT ONE
   06F2; CONTEXTO   # EXTENDED ARABIC-INDIC DIGIT TWO
   06F3; CONTEXTO   # EXTENDED ARABIC-INDIC DIGIT THREE
   06F4; CONTEXTO   # EXTENDED ARABIC-INDIC DIGIT FOUR
   06F5; CONTEXTO   # EXTENDED ARABIC-INDIC DIGIT FIVE
   06F6; CONTEXTO   # EXTENDED ARABIC-INDIC DIGIT SIX
   06F7; CONTEXTO   # EXTENDED ARABIC-INDIC DIGIT SEVEN
   06F8; CONTEXTO   # EXTENDED ARABIC-INDIC DIGIT EIGHT
   06F9; CONTEXTO   # EXTENDED ARABIC-INDIC DIGIT NINE

   DISALLOWED -- Would otherwise have been PVALID

   0640; DISALLOWED # ARABIC TATWEEL
   07FA; DISALLOWED # NKO LAJANYALAN
   302E; DISALLOWED # HANGUL SINGLE DOT TONE MARK
   302F; DISALLOWED # HANGUL DOUBLE DOT TONE MARK
   3031; DISALLOWED # VERTICAL KANA REPEAT MARK
   3032; DISALLOWED # VERTICAL KANA REPEAT WITH VOICED SOUND MARK
   3033; DISALLOWED # VERTICAL KANA REPEAT MARK UPPER HALF
   3034; DISALLOWED # VERTICAL KANA REPEAT WITH VOICED SOUND MARK
                    # UPPER HALF
   3035; DISALLOWED # VERTICAL KANA REPEAT MARK LOWER HALF
   303B; DISALLOWED # VERTICAL IDEOGRAPHIC ITERATION MARK

7.7.  BackwardCompatible (G)

   Note: This category is defined in [RFC5892] and copied here for use
   in PRECIS.  Because of how the PRECIS string classes are defined,
   only changes that would result in code points being added to or
   removed from the LetterDigits ("A") category would result in
   backward-incompatible modifications to code point assignments.
   Therefore, management of this category is handled via the processes
   specified in [RFC5892].

   G: cp is in {}

   This category includes the code points for which property values in
   versions of Unicode after 5.2 have changed in such a way that the
   derived property value would no longer be PVALID or DISALLOWED.  If
   changes are made to future versions of Unicode so that code points
   might change property value from PVALID or DISALLOWED, then this
   table can be updated to retain special exception values so that the
   derived property values of those code points stay stable.

7.8.  JoinControl (H)

   Note: This category is defined in [RFC5892] and copied here for use
   in PRECIS.

   H: Join_Control(cp) = True

   This category consists of Join Control characters (i.e., they are not
   in LetterDigits (Section 7.1) but are still required in strings under
   some circumstances).

7.9.  OldHangulJamo (I)

   Note: This category is defined in [RFC5892] and copied here for use
   in PRECIS.

   I: Hangul_Syllable_Type(cp) is in {L, V, T}

   This category consists of all conjoining Hangul Jamo (Leading Jamo,
   Vowel Jamo, and Trailing Jamo).
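
   Some standard libraries do not expose the Join_Control or
   Hangul_Syllable_Type properties (Python's "unicodedata" module is
   one example), so membership in categories H and I might be tested
   against hard-coded ranges.  The following is a minimal sketch, not a
   normative definition: the function and constant names are
   hypothetical, and the ranges are an assumption transcribed from the
   Unicode Character Database files PropList.txt and
   HangulSyllableType.txt, which an implementation would regenerate for
   the Unicode version it supports.

```python
# Illustrative sketch only; names are hypothetical and the ranges are
# assumed from PropList.txt / HangulSyllableType.txt.

# Category H: Join_Control(cp) = True
JOIN_CONTROL = frozenset({0x200C, 0x200D})  # ZWNJ, ZWJ

# Category I: Hangul_Syllable_Type(cp) in {L, V, T}, i.e. conjoining
# jamo only. Precomposed syllables (types LV and LVT) are not included.
CONJOINING_JAMO_RANGES = (
    (0x1100, 0x115F),  # L: leading consonants
    (0xA960, 0xA97C),  # L: Hangul Jamo Extended-A
    (0x1160, 0x11A7),  # V: vowels
    (0xD7B0, 0xD7C6),  # V: Hangul Jamo Extended-B
    (0x11A8, 0x11FF),  # T: trailing consonants
    (0xD7CB, 0xD7FB),  # T: Hangul Jamo Extended-B
)


def in_join_control(cp: int) -> bool:
    """Return True if the code point belongs to category H."""
    return cp in JOIN_CONTROL


def in_old_hangul_jamo(cp: int) -> bool:
    """Return True if the code point belongs to category I."""
    return any(lo <= cp <= hi for lo, hi in CONJOINING_JAMO_RANGES)
```

   With these definitions, a conjoining jamo such as U+1100 falls in
   category I, whereas a precomposed syllable such as U+AC00 does not.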
864 Elimination of conjoining Hangul Jamos from the set of PVALID 865 characters results in restricting the set of Korean PVALID characters 866 just to preformed, modern Hangul syllable characters. Old Hangul 867 syllables, which are spelled with sequences of conjoining Hangul 868 Jamos, are not PVALID for string classes. 870 7.10. Unassigned (J) 872 Note: This category is defined in [RFC5892] and copied here for use 873 in PRECIS. 875 J: General_Category(cp) is in {Cn} and 876 Noncharacter_Code_Point(cp) = False 878 This category consists of code points in the Unicode character set 879 that are not (yet) designated. Implementers might want to keep in 880 mind that the Unicode Standard distinguishes between 'unassigned code 881 points' and 'unassigned characters'. The unassigned code points are 882 all but (Cn - Noncharacters), whereas the unassigned characters are 883 all but (Cn + Cs). 885 7.11. ASCII7 (K) 887 This PRECIS-specific category consists of all printable, non-space 888 characters from the 7-bit ASCII range. By applying this category, 889 the algorithm specified under Section 8 exempts these characters from 890 other rules that might be applied during PRECIS processing, on the 891 assumption that these code points are in such wide use that 892 disallowing them would be counter-productive. 894 K: cp is in {0021..007E} 896 7.12. Controls (L) 898 L: Control(cp) = True 900 7.13. PrecisIgnorableProperties (M) 902 This PRECIS-specific category is used to group code points that are 903 discouraged from use in PRECIS string classes. 
905 M: Default_Ignorable_Code_Point(cp) = True or 906 Noncharacter_Code_Point(cp) = True 908 The definition for Default_Ignorable_Code_Point can be found in the 909 DerivedCoreProperties.txt [2] file, and at the time of Unicode 6.3 is 910 as follows: 912 Other_Default_Ignorable_Code_Point 913 + Cf (Format characters) 914 + Variation_Selector 915 - White_Space 916 - FFF9..FFFB (Annotation Characters) 917 - 0600..0604, 06DD, 070F, 110BD (exceptional Cf characters 918 that should be visible) 920 7.14. Spaces (N) 922 This PRECIS-specific category is used to group code points that are 923 space characters. 925 N: General_Category(cp) is in {Zs} 927 7.15. Symbols (O) 929 This PRECIS-specific category is used to group code points that are 930 symbols. 932 O: General_Category(cp) is in {Sm, Sc, Sk, So} 934 7.16. Punctuation (P) 936 This PRECIS-specific category is used to group code points that are 937 punctuation characters. 939 P: General_Category(cp) is in {Pc, Pd, Ps, Pe, Pi, Pf, Po} 941 7.17. HasCompat (Q) 943 This PRECIS-specific category is used to group code points that have 944 compatibility equivalents as explained in Chapter 2 and Chapter 3 of 945 the Unicode Standard [UNICODE]. 947 Q: toNFKC(cp) != cp 949 The toNFKC() operation returns the code point in normalization form 950 KC. For more information, see Section 5 of Unicode Standard Annex 951 #15 [UAX15]. 953 7.18. OtherLetterDigits (R) 955 This PRECIS-specific category is used to group code points that are 956 letters and digits other than the "traditional" letters and digits 957 grouped under the LetterDigits (A) class (see Section 7.1). 959 R: General_Category(cp) is in {Lt, Nl, No, Me} 961 8. 
Calculation of the Derived Property

   Possible values of the derived property are:

   o  PVALID
   o  ID_PVAL
   o  FREE_PVAL
   o  CONTEXTJ
   o  CONTEXTO
   o  DISALLOWED
   o  ID_DIS
   o  FREE_DIS
   o  UNASSIGNED

   Note: The value of the derived property calculated can depend on the
   string class; for example, if an identifier used in an application
   protocol is defined as profiling the PRECIS IdentifierClass then a
   space character such as U+0020 would be assigned to ID_DIS, whereas
   if an identifier is defined as profiling the PRECIS FreeformClass
   then the character would be assigned to FREE_PVAL.  For the sake of
   brevity, the designation "FREE_PVAL" is used in the code point
   tables, instead of the longer designation "ID_DIS or FREE_PVAL".  In
   practice, the derived properties ID_PVAL and FREE_DIS are not used in
   this specification, since every ID_PVAL code point is PVALID and
   every FREE_DIS code point is DISALLOWED.

   The algorithm to calculate the value of the derived property is as
   follows:

   If .cp. .in. Exceptions Then Exceptions(cp);
   Else If .cp. .in. BackwardCompatible Then BackwardCompatible(cp);
   Else If .cp. .in. Unassigned Then UNASSIGNED;
   Else If .cp. .in. ASCII7 Then PVALID;
   Else If .cp. .in. JoinControl Then CONTEXTJ;
   Else If .cp. .in. OldHangulJamo Then DISALLOWED;
   Else If .cp. .in. PrecisIgnorableProperties Then DISALLOWED;
   Else If .cp. .in. Controls Then DISALLOWED;
   Else If .cp. .in. HasCompat Then ID_DIS or FREE_PVAL;
   Else If .cp. .in. LetterDigits Then PVALID;
   Else If .cp. .in. OtherLetterDigits Then ID_DIS or FREE_PVAL;
   Else If .cp. .in. Spaces Then ID_DIS or FREE_PVAL;
   Else If .cp. .in. Symbols Then ID_DIS or FREE_PVAL;
   Else If .cp. .in. Punctuation Then ID_DIS or FREE_PVAL;
   Else DISALLOWED;

   Note: Use of the name of a rule (such as "Exceptions") implies the
   set of code points that the rule defines, whereas the same name as a
   function call (such as "Exceptions(cp)") implies the value that the
   code point has in the Exceptions table.

9.  IANA Considerations

9.1.  PRECIS Derived Property Value Registry

   IANA is requested to create a PRECIS-specific registry with the
   Derived Properties for the versions of Unicode that are released
   after (and including) version 6.3.  The derived property value is to
   be calculated in cooperation with a designated expert [RFC5226]
   according to the rules specified under Section 7 and Section 8, not
   by copying the non-normative table found under Appendix A.

   The IESG is to be notified if backward-incompatible changes to the
   table of derived properties are discovered or if other problems arise
   during the process of creating the table of derived property values
   or during expert review.  Changes to the rules defined under
   Section 7 and Section 8 require IETF Review.

9.2.  PRECIS Base Classes Registry

   IANA is requested to create a registry of PRECIS string classes.  In
   accordance with [RFC5226], the registration policy is "RFC Required".

   The registration template is as follows:

   Base Class:  [the name of the PRECIS string class]
   Description:  [a brief description of the PRECIS string class and its
      intended use, e.g., "A sequence of letters, numbers, and symbols
      that is used to identify or address a network entity."]
   Specification:  [the RFC number]

   The initial registrations are as follows:

   Base Class:  FreeformClass.
   Description:  A sequence of letters, numbers, symbols, spaces, and
      other code points that is used for free-form strings.
   Specification:  RFC XXXX.
      [Note to RFC Editor: please change XXXX to
      the number issued for this specification.]

   Base Class:  IdentifierClass.
   Description:  A sequence of letters, numbers, and symbols that is
      used to identify or address a network entity.
   Specification:  RFC XXXX.  [Note to RFC Editor: please change XXXX to
      the number issued for this specification.]

9.3.  PRECIS Profiles Registry

   IANA is requested to create a registry of profiles that use the
   PRECIS string classes.  In accordance with [RFC5226], the
   registration policy is "Expert Review".  This policy was chosen in
   order to ease the burden of registration while ensuring that
   "customers" of PRECIS receive appropriate guidance regarding the
   sometimes complex and subtle internationalization issues related to
   profiles of PRECIS string classes.

   The registration template is as follows:

   Name:  [the name of the profile]
   Applicability:  [the specific protocol elements to which this profile
      applies, e.g., "Localparts in XMPP addresses."]
   Base Class:  [which PRECIS string class is being profiled]
   Replaces:  [the Stringprep profile that this PRECIS profile replaces,
      if any]
   Width Mapping:  [the behavioral rule for handling of width, e.g.,
      "Map fullwidth and halfwidth characters to their compatibility
      variants."]
   Additional Mappings:  [any additional mappings that are required or
      recommended, e.g., "Map non-ASCII space characters to ASCII
      space."]
   Case Mapping:  [the behavioral rule for handling of case, e.g., "Map
      uppercase and titlecase characters to lowercase."]
   Normalization:  [which Unicode normalization form is applied, e.g.,
      "NFC"]
   Directionality:  [the behavioral rule for handling of right-to-left
      code points, e.g., "The 'Bidi Rule' defined in RFC 5893 applies."]
   Exclusions:  [a brief description of the specific code points or
      character categories that are excluded, e.g., "Eight legacy
      characters in the ASCII range" or "Any character that has a
      compatibility equivalent, i.e., the HasCompat category"]
   Enforcement:  [which entities enforce the rules, and when that
      enforcement occurs during protocol operations]
   Specification:  [a pointer to relevant documentation, such as an RFC
      or Internet-Draft]

   In order to request a review, the registrant shall send a completed
   template to the precis@ietf.org list or its designated successor.

   Factors to focus on while defining profiles and reviewing profile
   registrations include the following:

   o  Is the problem being addressed by this profile well-defined?
   o  Does the specification define what kinds of applications are
      involved and the protocol elements to which this profile applies?
   o  Would an existing PRECIS string class or profile solve the
      problem?
   o  Is the profile clearly defined?
   o  Is the profile based on an appropriate dividing line between user
      interface (culture, context, intent, locale, device limitations,
      etc.) and the use of conformant strings in protocol elements?
   o  Are the width mapping, case mapping, additional mapping,
      normalization, exclusion, and directionality rules appropriate for
      the intended use?
   o  Does the profile explain which entities enforce the rules, and
      when such enforcement occurs during protocol operations?
   o  Does the profile reduce the degree to which human users could be
      surprised or confused by application behavior (the "principle of
      least user surprise")?
   o  Does the profile introduce any new security concerns such as those
      described under Section 10 of this document (e.g., false positives
      for authentication or authorization)?

10.  Security Considerations

10.1.
General Issues 1128 The security of applications that use this framework can depend in 1129 part on the proper preparation and comparison of internationalized 1130 strings. For example, such strings can be used to make 1131 authentication and authorization decisions, and the security of an 1132 application could be compromised if an entity providing a given 1133 string is connected to the wrong account or online resource based on 1134 different interpretations of the string. 1136 Specifications of application protocols that use this framework are 1137 encouraged to describe how internationalized strings are used in the 1138 protocol, including the security implications of any false positives 1139 and false negatives that might result from various comparison 1140 operations. For some helpful guidelines, refer to [RFC6943], 1141 [RFC5890], [UTR36], and [UTS39]. 1143 10.2. Use of the IdentifierClass 1145 Strings that conform to the IdentifierClass and any profile thereof 1146 are intended to be relatively safe for use in a broad range of 1147 applications, primarily because they include only letters, digits, 1148 and "grandfathered" non-space characters from the ASCII range; thus 1149 they exclude spaces, characters with compatibility equivalents, and 1150 almost all symbols and punctuation marks. However, because such 1151 strings can still include so-called confusable characters (see 1152 Section 10.5), protocol designers and implementers are encouraged to 1153 pay close attention to the security considerations described 1154 elsewhere in this document. 1156 10.3. Use of the FreeformClass 1158 Strings that conform to the FreeformClass and many profiles thereof 1159 can include virtually any Unicode character. This makes the 1160 FreeformClass quite expressive, but also problematic from the 1161 perspective of possible user confusion. 
Protocol designers are 1162 hereby warned that the FreeformClass contains codepoints they might 1163 not understand, and are encouraged to profile the IdentifierClass 1164 wherever feasible; however, if an application protocol requires more 1165 code points than are allowed by the IdentifierClass, protocol 1166 designers are encouraged to define a profile of the FreeformClass 1167 that restricts the allowable code points as tightly as possible. 1168 (The PRECIS Working Group considered the option of allowing 1169 superclasses as well as profiles of PRECIS string classes, but 1170 decided against allowing superclasses to reduce the likelihood of 1171 security and interoperability problems.) 1173 10.4. Local Character Set Issues 1175 When systems use local character sets other than ASCII and Unicode, 1176 this specification leaves the problem of converting between the local 1177 character set and Unicode up to the application or local system. If 1178 different applications (or different versions of one application) 1179 implement different rules for conversions among coded character sets, 1180 they could interpret the same name differently and contact different 1181 application servers or other network entities. This problem is not 1182 solved by security protocols, such as Transport Layer Security (TLS) 1183 [RFC5246] and the Simple Authentication and Security Layer (SASL) 1184 [RFC4422], that do not take local character sets into account. 1186 10.5. Visually Similar Characters 1188 Some characters are visually similar and thus can cause confusion 1189 among humans. Such characters are often called "confusable 1190 characters" or "confusables". 1192 The problem of confusable characters is not necessarily caused by the 1193 use of Unicode code points outside the ASCII range. 
For example, in 1194 some presentations and to some individuals the string "ju1iet" 1195 (spelled with DIGIT ONE, U+0031, as the third character) might appear 1196 to be the same as "juliet" (spelled with LATIN SMALL LETTER L, 1197 U+006C), especially on casual visual inspection. This phenomenon is 1198 sometimes called "typejacking". 1200 However, the problem is made more serious by introducing the full 1201 range of Unicode code points into protocol strings. For example, the 1202 characters U+13DA U+13A2 U+13B5 U+13AC U+13A2 U+13AC U+13D2 from the 1203 Cherokee block look similar to the ASCII characters "STPETER" as they 1204 might appear when presented using a "creative" font family. 1206 In some examples of confusable characters, it is unlikely that the 1207 average human could tell the difference between the real string and 1208 the fake string. (Indeed, there is no programmatic way to 1209 distinguish with full certainty which is the fake string and which is 1210 the real string; in some contexts, the string formed of Cherokee 1211 characters might be the real string and the string formed of ASCII 1212 characters might be the fake string.) Because PRECIS-compliant 1213 strings can contain almost any properly-encoded Unicode code point, 1214 it can be relatively easy to fake or mimic some strings in systems 1215 that use the PRECIS framework. The fact that some strings are easily 1216 confused introduces security vulnerabilities of the kind that have 1217 also plagued the World Wide Web, specifically the phenomenon known as 1218 phishing. 1220 Despite the fact that some specific suggestions about identification 1221 and handling of confusable characters appear in the Unicode Security 1222 Considerations [UTR36] and the Unicode Security Mechanisms [UTS39], 1223 it is also true (as noted in [RFC5890]) that "there are no 1224 comprehensive technical solutions to the problems of confusable 1225 characters". 
Because it is impossible to map visually similar 1226 characters without a great deal of context (such as knowing the font 1227 families used), the PRECIS framework does nothing to map similar- 1228 looking characters together, nor does it prohibit some characters 1229 because they look like others. 1231 Nevertheless, specifications for application protocols that use this 1232 framework MUST describe how confusable characters can be abused to 1233 compromise the security of systems that use the protocol in question, 1234 along with any protocol-specific suggestions for overcoming those 1235 threats. In particular, software implementations and service 1236 deployments that use PRECIS-based technologies are strongly 1237 encouraged to define and implement consistent policies regarding the 1238 registration, storage, and presentation of visually similar 1239 characters. The following recommendations are appropriate: 1241 1. An application service SHOULD define a policy that specifies the 1242 scripts or blocks of characters that the service will allow to be 1243 registered (e.g., in an account name) or stored (e.g., in a file 1244 name). Such a policy SHOULD be informed by the languages and 1245 scripts that are used to write registered account names; in 1246 particular, to reduce confusion, the service SHOULD forbid 1247 registration or storage of strings that contain characters from 1248 more than one script and SHOULD restrict registrations to 1249 characters drawn from a very small number of scripts (e.g., 1250 scripts that are well-understood by the administrators of the 1251 service, to improve manageability). 1253 2. User-oriented application software SHOULD define a policy that 1254 specifies how internationalized strings will be presented to a 1255 human user. 
Because every human user of such software has a 1256 preferred language or a small set of preferred languages, the 1257 software SHOULD gather that information either explicitly from 1258 the user or implicitly via the operating system of the user's 1259 device. Furthermore, because most languages are typically 1260 represented by a single script or a small set of scripts, and 1261 because most scripts are typically contained in one or more 1262 blocks of characters, the software SHOULD warn the user when 1263 presenting a string that mixes characters from more than one 1264 script or block, or that uses characters outside the normal range 1265 of the user's preferred language(s). (Such a recommendation is 1266 not intended to discourage communication across different 1267 communities of language users; instead, it recognizes the 1268 existence of such communities and encourages due caution when 1269 presenting unfamiliar scripts or characters to human users.) 1271 The challenges inherent in supporting the full range of Unicode code 1272 points have in the past led some to hope for a way to 1273 programmatically negotiate more restrictive ranges based on locale, 1274 script, or other relevant factors, to tag the locale associated with 1275 a particular string, etc. As a general-purpose internationalization 1276 technology, the PRECIS framework does not include such mechanisms. 1278 10.6. Security of Passwords 1280 Two goals of passwords are to maximize the amount of entropy and to 1281 minimize the potential for false positives. These goals can be 1282 achieved in part by allowing a wide range of code points and by 1283 ensuring that passwords are handled in such a way that code points 1284 are not compared aggressively. Therefore, it is NOT RECOMMENDED for 1285 application protocols to profile the FreeformClass for use in 1286 passwords in a way that removes entire categories (e.g., by 1287 disallowing symbols or punctuation). 
Furthermore, it is NOT 1288 RECOMMENDED for application protocols to map uppercase and titlecase 1289 code points to their lowercase equivalents in such strings; instead, 1290 it is RECOMMENDED to preserve the case of all code points contained 1291 in such strings and to compare them in a case-sensitive manner. 1293 That said, software implementers need to be aware that there exist 1294 tradeoffs between entropy and usability. For example, allowing a 1295 user to establish a password containing "uncommon" code points might 1296 make it difficult for the user to access a service when using an 1297 unfamiliar or constrained input device. 1299 Some application protocols use passwords directly, whereas others 1300 reuse technologies that themselves process passwords (one example of 1301 such a technology is the Simple Authentication and Security Layer 1302 [RFC4422]). Moreover, passwords are often carried by a sequence of 1303 protocols with backend authentication systems or data storage systems 1304 such as RADIUS [RFC2865] and LDAP [RFC4510]. Developers of 1305 application protocols are encouraged to look into reusing these 1306 profiles instead of defining new ones, so that end-user expectations 1307 about passwords are consistent no matter which application protocol 1308 is used. 1310 11. Interoperability Considerations 1312 Although strings that are consumed in PRECIS-based application 1313 protocols are often encoded using UTF-8 [RFC3629], the exact encoding 1314 is a matter for the application protocol that uses PRECIS, not for 1315 the PRECIS framework. 1317 It is known that some existing systems are unable to support the full 1318 Unicode character set, or even any characters outside the ASCII 1319 range. If two (or more) applications need to interoperate when 1320 exchanging data (e.g., for the purpose of authenticating a username 1321 or password), they will naturally need to have in common at least one 1322 coded character set (as defined by [RFC6365]). 
Establishing such a 1323 baseline is a matter for the application protocol that uses PRECIS, 1324 not for the PRECIS framework. 1326 The PRECIS framework, which is defined in terms of the latest version 1327 of Unicode as of the time of this writing (6.3), treats the character 1328 U+19DA NEW TAI LUE THAM as DISALLOWED. Implementers need to be aware 1329 that this treatment is different from IDNA2008 (originally defined in 1330 terms of Unicode 5.2), which treats U+19DA as PVALID. 1332 12. References 1334 12.1. Normative References 1336 [I-D.ietf-precis-mappings] 1337 Yoneya, Y. and T. NEMOTO, "Mapping characters for PRECIS 1338 classes", draft-ietf-precis-mappings-05 (work in 1339 progress), October 2013. 1341 [RFC20] Cerf, V., "ASCII format for network interchange", RFC 20, 1342 October 1969. 1344 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 1345 Requirement Levels", BCP 14, RFC 2119, March 1997. 1347 [RFC5198] Klensin, J. and M. Padlipsky, "Unicode Format for Network 1348 Interchange", RFC 5198, March 2008. 1350 [UNICODE] The Unicode Consortium, "The Unicode Standard", 2013, 1351 . 1353 12.2. Informative References 1355 [I-D.ietf-precis-nickname] 1356 Saint-Andre, P., "Preparation and Comparison of 1357 Nicknames", draft-ietf-precis-nickname-08 (work in 1358 progress), December 2013. 1360 [I-D.ietf-precis-saslprepbis] 1361 Saint-Andre, P. and A. Melnikov, "Username and Password 1362 Preparation Algorithms", draft-ietf-precis-saslprepbis-06 1363 (work in progress), December 2013. 1365 [I-D.ietf-xmpp-6122bis] 1366 Saint-Andre, P., "Extensible Messaging and Presence 1367 Protocol (XMPP): Address Format", 1368 draft-ietf-xmpp-6122bis-09 (work in progress), 1369 November 2013. 1371 [RFC2865] Rigney, C., Willens, S., Rubens, A., and W. Simpson, 1372 "Remote Authentication Dial In User Service (RADIUS)", 1373 RFC 2865, June 2000. 1375 [RFC3454] Hoffman, P. and M. 
Blanchet, "Preparation of 1376 Internationalized Strings ("stringprep")", RFC 3454, 1377 December 2002. 1379 [RFC3490] Faltstrom, P., Hoffman, P., and A. Costello, 1380 "Internationalizing Domain Names in Applications (IDNA)", 1381 RFC 3490, March 2003. 1383 [RFC3491] Hoffman, P. and M. Blanchet, "Nameprep: A Stringprep 1384 Profile for Internationalized Domain Names (IDN)", 1385 RFC 3491, March 2003. 1387 [RFC3629] Yergeau, F., "UTF-8, a transformation format of ISO 1388 10646", STD 63, RFC 3629, November 2003. 1390 [RFC4422] Melnikov, A. and K. Zeilenga, "Simple Authentication and 1391 Security Layer (SASL)", RFC 4422, June 2006. 1393 [RFC4510] Zeilenga, K., "Lightweight Directory Access Protocol 1394 (LDAP): Technical Specification Road Map", RFC 4510, 1395 June 2006. 1397 [RFC4690] Klensin, J., Faltstrom, P., Karp, C., and IAB, "Review and 1398 Recommendations for Internationalized Domain Names 1399 (IDNs)", RFC 4690, September 2006. 1401 [RFC5226] Narten, T. and H. Alvestrand, "Guidelines for Writing an 1402 IANA Considerations Section in RFCs", BCP 26, RFC 5226, 1403 May 2008. 1405 [RFC5234] Crocker, D. and P. Overell, "Augmented BNF for Syntax 1406 Specifications: ABNF", STD 68, RFC 5234, January 2008. 1408 [RFC5246] Dierks, T. and E. Rescorla, "The Transport Layer Security 1409 (TLS) Protocol Version 1.2", RFC 5246, August 2008. 1411 [RFC5890] Klensin, J., "Internationalized Domain Names for 1412 Applications (IDNA): Definitions and Document Framework", 1413 RFC 5890, August 2010. 1415 [RFC5891] Klensin, J., "Internationalized Domain Names in 1416 Applications (IDNA): Protocol", RFC 5891, August 2010. 1418 [RFC5892] Faltstrom, P., "The Unicode Code Points and 1419 Internationalized Domain Names for Applications (IDNA)", 1420 RFC 5892, August 2010. 1422 [RFC5893] Alvestrand, H. and C. Karp, "Right-to-Left Scripts for 1423 Internationalized Domain Names for Applications (IDNA)", 1424 RFC 5893, August 2010. 
1426 [RFC5894] Klensin, J., "Internationalized Domain Names for 1427 Applications (IDNA): Background, Explanation, and 1428 Rationale", RFC 5894, August 2010. 1430 [RFC5895] Resnick, P. and P. Hoffman, "Mapping Characters for 1431 Internationalized Domain Names in Applications (IDNA) 1432 2008", RFC 5895, September 2010. 1434 [RFC6365] Hoffman, P. and J. Klensin, "Terminology Used in 1435 Internationalization in the IETF", BCP 166, RFC 6365, 1436 September 2011. 1438 [RFC6885] Blanchet, M. and A. Sullivan, "Stringprep Revision and 1439 Problem Statement for the Preparation and Comparison of 1440 Internationalized Strings (PRECIS)", RFC 6885, March 2013. 1442 [RFC6943] Thaler, D., "Issues in Identifier Comparison for Security 1443 Purposes", RFC 6943, May 2013. 1445 [UAX9] The Unicode Consortium, "Unicode Standard Annex #9: 1446 Unicode Bidirectional Algorithm", September 2012, 1447 . 1449 [UAX11] The Unicode Consortium, "Unicode Standard Annex #11: East 1450 Asian Width", September 2012, 1451 . 1453 [UAX15] The Unicode Consortium, "Unicode Standard Annex #15: 1454 Unicode Normalization Forms", August 2012, 1455 . 1457 [UTR36] The Unicode Consortium, "Unicode Technical Report #36: 1458 Unicode Security Considerations", July 2012, 1459 . 1461 [UTS39] The Unicode Consortium, "Unicode Technical Standard #39: 1462 Unicode Security Mechanisms", July 2012, 1463 . 1465 URIs 1467 [1] 1469 [2] 1471 Appendix A. Codepoint Table 1473 If one applies the property calculation rules from Section 8 to the 1474 code points 0x0000 to 0x10FFFF in Unicode 6.3, the result is as shown 1475 in the following table, in Unicode Character Database (UCD) format. 1476 The columns of the table are as follows: 1478 1. The code point or codepoint range. 1479 2. The assignment for the code point or range, where the value is 1480 one of PVALID, DISALLOWED, UNASSIGNED, CONTEXTO, CONTEXTJ, or 1481 FREE_PVAL (where the latter includes ID_DIS). 1483 3. The name or names for the code point or range. 
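As a non-normative illustration of the three-column layout just described, the following Python sketch splits one table row into its code point range, its assignment, and its name column. The names Entry, parse_entry, lookup, and SAMPLE_TABLE are hypothetical and exist only for this example; the sample rows are copied from the table in this appendix, and the lookup is a plain linear scan rather than anything this document specifies.

```python
from typing import Iterable, NamedTuple, Optional


class Entry(NamedTuple):
    """One row of the table (hypothetical helper type for this sketch)."""
    first: int       # first code point covered by the row
    last: int        # last code point covered by the row (== first for a single code point)
    assignment: str  # e.g. PVALID, DISALLOWED, UNASSIGNED, CONTEXTO, CONTEXTJ, FREE_PVAL
    names: str       # possibly truncated name or name range


def parse_entry(line: str) -> Entry:
    """Split a 'RANGE ; ASSIGNMENT # NAMES' row into its three columns."""
    fields, _, names = line.partition("#")
    code_points, _, assignment = fields.partition(";")
    first, _, last = code_points.strip().partition("..")
    # A single code point has no '..', so it serves as its own upper bound.
    return Entry(int(first, 16), int(last or first, 16),
                 assignment.strip(), names.strip())


def lookup(table: Iterable[Entry], code_point: int) -> Optional[str]:
    """Return the assignment for code_point, or None if no row covers it."""
    for entry in table:
        if entry.first <= code_point <= entry.last:
            return entry.assignment
    return None


# A few rows copied from the table in this appendix (Unicode 6.3 only).
SAMPLE_TABLE = [parse_entry(row) for row in (
    "0020 ; FREE_PVAL # SPACE",
    "0021..007E ; PVALID # EXCLAM MARK..TILDE",
    "00AD ; DISALLOWED # SOFT HYPH",
    "00B7 ; CONTEXTO # MIDDLE DOT",
)]

if __name__ == "__main__":
    print(lookup(SAMPLE_TABLE, ord("A")))
    print(lookup(SAMPLE_TABLE, 0x00AD))
```

Because the table is illustrative and tied to Unicode 6.3, a parser like this is suitable for inspecting or cross-checking the table, not as a substitute for applying the property calculation rules from Section 8 to the Unicode version actually in use.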
1485 This table is non-normative, is included only for illustrative 1486 purposes, and applies only to Unicode 6.3, not to past or future 1487 versions of Unicode. Please note that the strings displayed in the 1488 third column are not necessarily the formal name of the code point 1489 (as defined in [UNICODE]) because the fixed width of the RFC format 1490 necessitated truncation of many names. 1492 0000..001F ; DISALLOWED # 1493 0020 ; FREE_PVAL # SPACE 1494 0021..007E ; PVALID # EXCLAM MARK..TILDE 1495 007F..009F ; DISALLOWED # 1496 00A0..00AC ; FREE_PVAL # NO-BREAK SPACE..NOT SIGN 1497 00AD ; DISALLOWED # SOFT HYPH 1498 00AE..00B6 ; FREE_PVAL # REGISTERED SIGN..PILCROW SIGN 1499 00B7 ; CONTEXTO # MIDDLE DOT 1500 00B8..00BF ; FREE_PVAL # CEDILLA..INV QUEST IND 1501 00C0..00D6 ; PVALID # LAT CAP LET A W GRAV..LAT CAP O 1502 00D7 ; FREE_PVAL # MULTIPLICATION SIGN 1503 00D8..00F6 ; PVALID # LAT CAP LET O W STROKE..LAT SM 1504 00F7 ; FREE_PVAL # DIVISION SIGN 1505 00F8..0131 ; PVALID # LAT SM LET O W STROKE..LAT SM LET 1506 0132..0133 ; FREE_PVAL # LAT CAP LIG IJ..LAT SM LIG IJ 1507 0134..013E ; PVALID # LAT CAP LET J W CIRCUM..LAT SM LET 1508 013F..0140 ; FREE_PVAL # LAT CAP LET L W MID DOT..LAT SM LET 1509 0141..0148 ; PVALID # LAT CAP LET L W STROKE..LAT SM LET 1510 0149 ; FREE_PVAL # LAT SM LET N PRECEDED BY APOS 1511 014A..017E ; PVALID # LAT CAP LET ENG..LAT SM LET Z W CA 1512 017F ; FREE_PVAL # LAT SM LET LONG S 1513 0180..01C3 ; PVALID # LAT SM LET B W STROKE..LAT LET RETR 1514 01C4..01CC ; FREE_PVAL # LAT CAP LET DZ W CARON..LAT SM 1515 01CD..01F0 ; PVALID # LAT CAP LET A W CARON..LAT SM LET J 1516 01F1..01F3 ; FREE_PVAL # LAT CAP LET DZ..LAT SM LET DZ 1517 01F4..02AF ; PVALID # LAT CAP LET G W ACUTE..LAT SM 1518 02B0..02B8 ; FREE_PVAL # MOD LET SM H..MOD LET SM Y 1519 02B9..02C1 ; PVALID # MOD LET PRIME..MOD LET REV GLOT ST 1520 02C2..02C5 ; FREE_PVAL # MOD LET L ARROW..MOD LET D ARROW 1521 02C6..02D1 ; PVALID # MOD LET CIRCUM ACC..MOD LET HALF TR 1522
02D2..02EB ; FREE_PVAL # MOD LET CENT R HALF RING..MOD LET Y 1523 02EC ; PVALID # MOD LET VOICING 1524 02ED ; FREE_PVAL # MOD LET UNASPIRATED 1525 02EE ; PVALID # MOD LET DOUBLE APOS 1526 02EF..02FF ; FREE_PVAL # MOD LET LOW D ARR..MOD LET LOW L AR 1527 0300..034E ; PVALID # COMB GRAVE ACCENT..COMB UP ARROW BE 1528 034F ; DISALLOWED # COMB GRAPHEME JOINER 1529 0350..0374 ; PVALID # COMB RIGHT ARROWHEAD..GREEK NUM SIG 1530 0375 ; CONTEXTO # GREEK LOW NUM SIGN 1531 0376..0377 ; PVALID # GR CAP LET PAMPHYLIAN DIGAMMA..GR S 1532 0378..0379 ; UNASSIGNED # .. 1533 037A ; FREE_PVAL # GR YPOGEGRAMMENI..GR SM REV DOT LUN 1534 037B..037D ; PVALID # GR SM REV LUN SIG..GR SM REV DOT LU 1535 037E ; FREE_PVAL # GREEK QUEST MARK 1536 037F..0383 ; UNASSIGNED # .. 1537 0384..0385 ; FREE_PVAL # GREEK TONOS..GREEK DIALYTIKA TONOS 1538 0386 ; PVALID # GR CAP LET ALPHA W TONOS 1539 0387 ; FREE_PVAL # GREEK ANO TELEIA 1540 0388..038A ; PVALID # GR CAP LET EPSILON W TONOS..GR CAP 1541 038B ; UNASSIGNED # 1542 038C ; PVALID # GREEK CAP LET OMICRON W TONOS 1543 038D ; UNASSIGNED # 1544 038E..03A1 ; PVALID # GR CAP LET EPSILON W TONOS..GR CAP 1545 03A2 ; UNASSIGNED # 1546 03A3..03CF ; PVALID # GREEK CAP LET SIGMA..GR CAP 1547 03D0..03D2 ; FREE_PVAL # GR BETA SYM..GR UPSILON W HOOK 1548 03D3..03D4 ; FREE_PVAL # GR UPSILON W ACUTE AND HOOK..GR UP 1549 03D5..03D6 ; FREE_PVAL # GR PHI SYM..GR PI SYM 1550 03D7..03EF ; PVALID # GR KAI SYM..COPT SM LET DEI 1551 03F0..03F2 ; FREE_PVAL # GR KAPPA SYM..GR LUNATE SIGMA 1552 03F3 ; PVALID # GREEK LET YOT 1553 03F4..03F6 ; FREE_PVAL # GR CAP THETA..GR REV LUNATE EPSILON 1554 03F7..03F8 ; PVALID # GR CAP LET SHO..GR SM LET SHO 1555 03F9 ; FREE_PVAL # GREEK CAP LUNATE SIGMA SYM 1556 03FA..0481 ; PVALID # GR CAP LET SAN..CYR SML LET KOPPA 1557 0482 ; FREE_PVAL # CYR THOUSANDS SIGN 1558 0483..0487 ; PVALID # COMB CYR TITLO..COMB CYR POK 1559 0488..0489 ; FREE_PVAL # COMB CYR HUNDRED THOUSANDS SIGN..C 1560 048A..0527 ; PVALID # CYR CAP LET SH I W TAIL..CYR S 
1561 0528..0530 ; UNASSIGNED # .. 1562 0531..0556 ; PVALID # ARM CAP LET AYB..ARM CAP LET FEH 1563 0557..0558 ; UNASSIGNED # .. 1564 0559 ; PVALID # ARM MOD LET LEFT HALF RING 1565 055A..055F ; FREE_PVAL # ARM APOS..ARM ABBREV 1566 0560 ; UNASSIGNED # 1567 0561..0586 ; PVALID # ARM SM LET AYB..ARMENIAN SM LE 1568 0587 ; FREE_PVAL # ARM SM LIG ECH YIWN 1569 0588 ; UNASSIGNED # 1570 0589..058A ; FREE_PVAL # ARMENIAN FULL STOP..ARMENIAN HYPH 1571 058B..058E ; UNASSIGNED # .. 1572 058F ; FREE_PVAL # ARMENIAN DRAM SIGN 1573 0590 ; UNASSIGNED # 1574 0591..05BD ; PVALID # HEBR ACC ETNAHTA..HEBR PNT ME 1575 05BE ; FREE_PVAL # HEBR PUNCT MAQAF 1576 05BF ; PVALID # HEBR PNT RAFE 1577 05C0 ; FREE_PVAL # HEBR PUNCT PASEQ 1578 05C1..05C2 ; PVALID # HEBR PNT SHIN DOT..HEBR PNT SIN DOT 1579 05C3 ; FREE_PVAL # HEBR PUNCT SOF PASUQ 1580 05C4..05C5 ; PVALID # HEBR MARK UP DOT..HEBR MARK LOW DOT 1581 05C6 ; FREE_PVAL # HEBR PUNCT NUN HAFUKHA 1582 05C7 ; PVALID # HEBR PNT QAMATS QATAN 1583 05C8..05CF ; UNASSIGNED # .. 1584 05D0..05EA ; PVALID # HEBR LET ALEF..HEBR LET TAV 1585 05EB..05EF ; UNASSIGNED # .. 1586 05F0..05F2 ; PVALID # HEBR LIG YIDDISH DOUBLE VAV..HEBR L 1587 05F3..05F4 ; CONTEXTO # HEBR PUNCT GERESH..HEBR PUNCTUATIO 1588 05F5..05FF ; UNASSIGNED # .. 1589 0600..0604 ; DISALLOWED # ARAB NUM SIGN..ARAB SIGN SAM 1590 0605 ; UNASSIGNED # 1591 0606..060F ; FREE_PVAL # AR-IND CUBE ROOT..ARAB SIGN MISRA 1592 0610..061A ; PVALID # ARAB SIGN SALLALLAHOU ALAYHE ..AR 1593 061B ; FREE_PVAL # ARAB SEMICOLON 1594 061C ; DISALLOWED # ARAB LET MARK 1595 061D ; UNASSIGNED #
1596 061E..061F ; FREE_PVAL # ARAB TRIPLE DOT PUNCT MARK..ARAB Q 1597 0620..063F ; PVALID # ARAB LET KASH..ARAB LET FARSI YEH 1598 0640 ; DISALLOWED # ARAB TATWEEL 1599 0641..065F ; PVALID # ARAB LET FEH..ARAB WAVY HAMZA BEL 1600 0660..0669 ; CONTEXTO # AR-IND DIG ZERO..AR-IND DIG 1601 066A..066D ; FREE_PVAL # ARAB PCT SIGN..ARAB FIVE PNTED STA 1602 066E..0674 ; PVALID # ARAB LET DOTLESS BEH..ARAB LET HIG 1603 0675..0678 ; FREE_PVAL # ARAB LET HIGH HAMZA ALEF..ARAB LET 1604 0679..06D3 ; PVALID # ARAB LET TTEH..ARAB LET YEH BARREE 1605 06D4 ; FREE_PVAL # ARAB FULL STOP 1606 06D5..06DC ; PVALID # ARAB LET AE..ARAB SM HIGH SEEN 1607 06DD ; DISALLOWED # ARAB END OF AYAH 1608 06DE ; FREE_PVAL # ARAB START OF RUB EL HIZB 1609 06DF..06E8 ; PVALID # ARAB SM HIGH ROUNDED ZERO..ARAB SM 1610 06E9 ; FREE_PVAL # ARAB PLACE OF SAJDAH 1611 06EA..06EF ; PVALID # ARAB EMPTY CENTRE LOW STOP..ARAB LET 1612 06F0..06F9 ; CONTEXTO # EXT AR-IND DIG ZERO..EXT A 1613 06FA..06FF ; PVALID # ARAB LET SHEEN W DOT BEL..ARAB 1614 0700..070D ; FREE_PVAL # SYR END OF PARA..SYR HARKLEAN AST 1615 070E ; UNASSIGNED # 1616 070F ; DISALLOWED # SYR ABBR MARK 1617 0710..074A ; PVALID # SYR LET ALAPH..SYR BARREKH 1618 074B..074C ; UNASSIGNED # .. 1619 074D..07B1 ; PVALID # SYR LET SOGDIAN ZHAIN..THAANA LET N 1620 07B2..07BF ; UNASSIGNED # .. 1621 07C0..07F5 ; PVALID # NKO DIG ZERO..NKO LOW TONE APOS 1622 07F6..07F9 ; FREE_PVAL # NKO SYM OO DENNEN..NKO EXCLAMATI 1623 07FA ; DISALLOWED # NKO LAJANYALAN 1624 07FB..07FF ; UNASSIGNED # .. 1625 0800..082D ; PVALID # SAMAR LET ALAF..SAMAR MARK NEQUDA 1626 082E..082F ; UNASSIGNED # .. 1627 0830..083E ; FREE_PVAL # SAMAR PUNCT NEQUDAA..SAMAR PUN 1628 083F ; UNASSIGNED # 1629 0840..085B ; PVALID # MANDAIC LET HALQA..MANDAIC GEM 1630 085C..085D ; UNASSIGNED # .. 1631 085E ; FREE_PVAL # MANDAIC PUNCTUATION 1632 085F..089F ; UNASSIGNED # .. 
1633 08A0 ; PVALID # ARAB LET BEH W SM V BEL 1634 08A1 ; UNASSIGNED # 1635 08A2..08AC ; PVALID # ARAB LET JEEM W 2 DOTS AB..ARAB 1636 08AD..08E3 ; UNASSIGNED # .. 1637 08E4..08FE ; PVALID # ARAB CURLY FATHA..ARAB DAMMA W 1638 08FF ; UNASSIGNED # 1639 0900..0963 ; PVALID # DEVAN SIGN INV CANDRABINDU..DEVAN V 1640 0964..0965 ; FREE_PVAL # DEVAN DANDA..DEVAN DOUBLE DANDA 1641 0966..096F ; PVALID # DEVAN DIG ZERO..DEVAN DIG NINE 1642 0970 ; FREE_PVAL # DEVAN ABBR SIGN 1643 0971..0977 ; PVALID # DEVAN SIGN HIGH SPACING DOT..DEVAN 1644 0978 ; UNASSIGNED # 1645 0979..097F ; PVALID # DEVAN SIGN HIGH SPACING DOT..DEVAN 1646 0980 ; UNASSIGNED # 1647 0981..0983 ; PVALID # BENG SIGN CANDRABINDU..BENG SIGN VIS 1648 0984 ; UNASSIGNED # 1649 0985..098C ; PVALID # BENG LET A..BENG LET VOC L 1650 098D..098E ; UNASSIGNED # .. 1651 098F..0990 ; PVALID # BENG LET E..BENG LET AI 1652 0991..0992 ; UNASSIGNED # .. 1653 0993..09A8 ; PVALID # BENG LET O..BENG LET NA 1654 09A9 ; UNASSIGNED # 1655 09AA..09B0 ; PVALID # BENG LET PA..BENG LET RA 1656 09B1 ; UNASSIGNED # 1657 09B2 ; PVALID # BENG LET LA 1658 09B3..09B5 ; UNASSIGNED # .. 1659 09B6..09B9 ; PVALID # BENG LET SHA..BENG LET HA 1660 09BA..09BB ; UNASSIGNED # .. 1661 09BC..09C4 ; PVALID # BENG SIGN NUKTA..BENG VOW SIGN VOCAL 1662 09C5..09C6 ; UNASSIGNED # .. 1663 09C7..09C8 ; PVALID # BENG VOW SIGN E..BENG VOW SIGN AI 1664 09C9..09CA ; UNASSIGNED # .. 1665 09CB..09CE ; PVALID # BENG VOW SIGN O..BENG LET KHANDA 1666 09CF..09D6 ; UNASSIGNED # .. 1667 09D7 ; PVALID # BENG AU LEN MARK 1668 09D8..09DB ; UNASSIGNED # .. 1669 09DC..09DD ; PVALID # BENG LET RRA..BENG LET RHA 1670 09DE ; UNASSIGNED # 1671 09DF..09E3 ; PVALID # BENG LET YYA..BENG VOW SIG 1672 09E4..09E5 ; UNASSIGNED # .. 1673 09E6..09F1 ; PVALID # BENG DIG ZERO..BENG LET RA W L 1674 09F2..09FB ; FREE_PVAL # BENG RUPEE MARK..BENG GANDA MARK 1675 09FC..0A00 ; UNASSIGNED # .. 
1676 0A01..0A03 ; PVALID # GURMUKHI SIGN ADAK BINDI..GURMUKHI 1677 0A04 ; UNASSIGNED # 1678 0A05..0A0A ; PVALID # GURMUKHI LET A..GURMUKHI LET UU 1679 0A0B..0A0E ; UNASSIGNED # .. 1680 0A0F..0A10 ; PVALID # GURMUKHI LET EE..GURMUKHI LET AI 1681 0A11..0A12 ; UNASSIGNED # .. 1682 0A13..0A28 ; PVALID # GURMUKHI LET OO..GURMUKHI LET NA 1683 0A29 ; UNASSIGNED # 1684 0A2A..0A30 ; PVALID # GURMUKHI LET PA..GURMUKHI LET RA 1685 0A31 ; UNASSIGNED # 1686 0A32..0A33 ; PVALID # GURMUKHI LET LA..GURMUKHI LET LLA 1687 0A34 ; UNASSIGNED # 1688 0A35..0A36 ; PVALID # GURMUKHI LET VA..GURMUKHI LET SHA 1689 0A37 ; UNASSIGNED # 1690 0A38..0A39 ; PVALID # GURMUKHI LET SA..GURMUKHI LET HA 1691 0A3A..0A3B ; UNASSIGNED # .. 1692 0A3C ; PVALID # GURMUKHI SIGN NUKTA 1693 0A3D ; UNASSIGNED # 1694 0A3E..0A42 ; PVALID # GURMUKHI VOW SIGN AA..GURMUKHI V 1695 0A43..0A46 ; UNASSIGNED # .. 1696 0A47..0A48 ; PVALID # GURMUKHI VOW SIGN EE..GURMUKHI V 1697 0A49..0A4A ; UNASSIGNED # .. 1698 0A4B..0A4D ; PVALID # GURMUKHI VOW SIGN OO..GURMUKHI S 1699 0A4E..0A50 ; UNASSIGNED # .. 1700 0A51 ; PVALID # GURMUKHI SIGN UDAAT 1701 0A52..0A58 ; UNASSIGNED # .. 1702 0A59..0A5C ; PVALID # GURMUKHI LET KHHA..GURMUKHI LET RRA 1703 0A5D ; UNASSIGNED # 1704 0A5E ; PVALID # GURMUKHI LET FA 1705 0A5F..0A65 ; UNASSIGNED # .. 1706 0A66..0A75 ; PVALID # GURMUKHI DIG ZERO..GURMUKHI SIGN YA 1707 0A76..0A80 ; UNASSIGNED # .. 1708 0A81..0A83 ; PVALID # GUJARATI SIGN CANDRABINDU..GUJARATI 1709 0A84 ; UNASSIGNED # 1710 0A85..0A8D ; PVALID # GUJARATI LET A..GUJARATI VOW CAND 1711 0A8E ; UNASSIGNED # 1712 0A8F..0A91 ; PVALID # GUJARATI LET E..GUJARATI VOW CAND 1713 0A92 ; UNASSIGNED # 1714 0A93..0AA8 ; PVALID # GUJARATI LET O..GUJARATI LET NA 1715 0AA9 ; UNASSIGNED # 1716 0AAA..0AB0 ; PVALID # GUJARATI LET PA..GUJARATI LET RA 1717 0AB1 ; UNASSIGNED # 1718 0AB2..0AB3 ; PVALID # GUJARATI LET LA..GUJARATI LET LLA 1719 0AB4 ; UNASSIGNED # 1720 0AB5..0AB9 ; PVALID # GUJARATI LET VA..GUJARATI LET HA 1721 0ABA..0ABB ; UNASSIGNED # ..
1722 0ABC..0AC5 ; PVALID # GUJARATI SIGN NUKTA..GUJARATI VOW 1723 0AC6 ; UNASSIGNED # 1724 0AC7..0AC9 ; PVALID # GUJARATI VOW SIGN E..GUJARATI VOW 1725 0ACA ; UNASSIGNED # 1726 0ACB..0ACD ; PVALID # GUJARATI VOW SIGN O..GUJARATI SIG 1727 0ACE..0ACF ; UNASSIGNED # .. 1728 0AD0 ; PVALID # GUJARATI OM 1729 0AD1..0ADF ; UNASSIGNED # .. 1730 0AE0..0AE3 ; PVALID # GUJARATI LET VOC RR..GUJARATI V 1731 0AE4..0AE5 ; UNASSIGNED # .. 1732 0AE6..0AEF ; PVALID # GUJARATI DIG ZERO..GUJARATI DIG NINE 1733 0AF0..0AF1 ; FREE_PVAL # GUJARATI ABBR SIGN..GUJARATI RUPEE S 1734 0AF2..0B00 ; UNASSIGNED # .. 1735 0B01..0B03 ; PVALID # ORIYA SIGN CANDRABINDU..ORIYA SIGN V 1736 0B04 ; UNASSIGNED # 1737 0B05..0B0C ; PVALID # ORIYA LET A..ORIYA LET VOC L 1738 0B0D..0B0E ; UNASSIGNED # .. 1739 0B0F..0B10 ; PVALID # ORIYA LET E..ORIYA LET AI 1740 0B11..0B12 ; UNASSIGNED # .. 1741 0B13..0B28 ; PVALID # ORIYA LET O..ORIYA LET NA 1742 0B29 ; UNASSIGNED # 1743 0B2A..0B30 ; PVALID # ORIYA LET PA..ORIYA LET RA 1744 0B31 ; UNASSIGNED # 1745 0B32..0B33 ; PVALID # ORIYA LET LA..ORIYA LET LLA 1746 0B34 ; UNASSIGNED # 1747 0B35..0B39 ; PVALID # ORIYA LET VA..ORIYA LET HA 1748 0B3A..0B3B ; UNASSIGNED # .. 1749 0B3C..0B44 ; PVALID # ORIYA SIGN NUKTA..ORIYA VOW SIGN 1750 0B45..0B46 ; UNASSIGNED # .. 1751 0B47..0B48 ; PVALID # ORIYA VOW SIGN E..ORIYA VOW SIG 1752 0B49..0B4A ; UNASSIGNED # .. 1753 0B4B..0B4D ; PVALID # ORIYA VOW SIGN O..ORIYA SIGN VIRA 1754 0B4E..0B55 ; UNASSIGNED # .. 1755 0B56..0B57 ; PVALID # ORIYA AI LEN MARK..ORIYA AU LENG 1756 0B58..0B5B ; UNASSIGNED # .. 1757 0B5C..0B5D ; PVALID # ORIYA LET RRA..ORIYA LET RHA 1758 0B5E ; UNASSIGNED # 1759 0B5F..0B63 ; PVALID # ORIYA LET YYA..ORIYA VOW SIGN VOCA 1760 0B64..0B65 ; UNASSIGNED # .. 1761 0B66..0B6F ; PVALID # ORIYA DIG ZERO..ORIYA DIG NINE 1762 0B70 ; FREE_PVAL # ORIYA ISSHAR 1763 0B71 ; PVALID # ORIYA LET WA 1764 0B72..0B77 ; FREE_PVAL # ORIYA FRACT ONE QUART..ORIYA FRACT 1765 0B78..0B81 ; UNASSIGNED # .. 
1766 0B82..0B83 ; PVALID # TAMIL SIGN ANUSVARA..TAMIL SIGN VIS 1767 0B84 ; UNASSIGNED # 1768 0B85..0B8A ; PVALID # TAMIL LET A..TAMIL LET UU 1769 0B8B..0B8D ; UNASSIGNED # .. 1770 0B8E..0B90 ; PVALID # TAMIL LET E..TAMIL LET AI 1771 0B91 ; UNASSIGNED # 1772 0B92..0B95 ; PVALID # TAMIL LET O..TAMIL LET KA 1773 0B96..0B98 ; UNASSIGNED # .. 1774 0B99..0B9A ; PVALID # TAMIL LET NGA..TAMIL LET CA 1775 0B9B ; UNASSIGNED # 1776 0B9C ; PVALID # TAMIL LET JA 1777 0B9D ; UNASSIGNED # 1778 0B9E..0B9F ; PVALID # TAMIL LET NYA..TAMIL LET TTA 1779 0BA0..0BA2 ; UNASSIGNED # .. 1780 0BA3..0BA4 ; PVALID # TAMIL LET NNA..TAMIL LET TA 1781 0BA5..0BA7 ; UNASSIGNED # .. 1782 0BA8..0BAA ; PVALID # TAMIL LET NA..TAMIL LET PA 1783 0BAB..0BAD ; UNASSIGNED # .. 1784 0BAE..0BB9 ; PVALID # TAMIL LET MA..TAMIL LET HA 1785 0BBA..0BBD ; UNASSIGNED # .. 1786 0BBE..0BC2 ; PVALID # TAMIL VOW SIGN AA..TAMIL VOW SI 1787 0BC3..0BC5 ; UNASSIGNED # .. 1788 0BC6..0BC8 ; PVALID # TAMIL VOW SIGN E..TAMIL VOW SIG 1789 0BC9 ; UNASSIGNED # 1790 0BCA..0BCD ; PVALID # TAMIL VOW SIGN O..TAMIL SIGN VIRA 1791 0BCE..0BCF ; UNASSIGNED # .. 1792 0BD0 ; PVALID # TAMIL OM 1793 0BD1..0BD6 ; UNASSIGNED # .. 1794 0BD7 ; PVALID # TAMIL AU LEN MARK 1795 0BD8..0BE5 ; UNASSIGNED # .. 1796 0BE6..0BEF ; PVALID # TAMIL DIG ZERO..TAMIL DIG NINE 1797 0BF0..0BFA ; FREE_PVAL # TAMIL NUM TEN..TAMIL NUM SIGN 1798 0BFB..0C00 ; UNASSIGNED # .. 1799 0C01..0C03 ; PVALID # TELUGU SIGN CANDRABINDU..TELUGU SIG 1800 0C04 ; UNASSIGNED # 1801 0C05..0C0C ; PVALID # TELUGU LET A..TELUGU LET VOC L 1802 0C0D ; UNASSIGNED # 1803 0C0E..0C10 ; PVALID # TELUGU LET E..TELUGU LET AI 1804 0C11 ; UNASSIGNED # 1805 0C12..0C28 ; PVALID # TELUGU LET O..TELUGU LET NA 1806 0C29 ; UNASSIGNED # 1807 0C2A..0C33 ; PVALID # TELUGU LET PA..TELUGU LET LLA 1808 0C34 ; UNASSIGNED # 1809 0C35..0C39 ; PVALID # TELUGU LET VA..TELUGU LET HA 1810 0C3A..0C3C ; UNASSIGNED # .. 
1811 0C3D..0C44 ; PVALID # TELUGU SIGN AVAGRAHA..TELUGU VOW SI 1812 0C45 ; UNASSIGNED # 1813 0C46..0C48 ; PVALID # TELUGU VOW SIGN E..TELUGU VOW SIGN 1814 0C49 ; UNASSIGNED # 1815 0C4A..0C4D ; PVALID # TELUGU VOW SIGN O..TELUGU SIGN VIRA 1816 0C4E..0C54 ; UNASSIGNED # .. 1817 0C55..0C56 ; PVALID # TELUGU LEN MARK..TELUGU AI LEN MARK 1818 0C57 ; UNASSIGNED # 1819 0C58..0C59 ; PVALID # TELUGU LET TSA..TELUGU LET DZA 1820 0C5A..0C5F ; UNASSIGNED # .. 1821 0C60..0C63 ; PVALID # TELUGU LET VOC RR..TELUGU VOW S 1822 0C64..0C65 ; UNASSIGNED # .. 1823 0C66..0C6F ; PVALID # TELUGU DIG ZERO..TELUGU DIG NINE 1824 0C70..0C77 ; UNASSIGNED # .. 1825 0C78..0C7F ; FREE_PVAL # TELUGU FRACTION DIG ZERO..TELUGU S 1826 0C80..0C81 ; UNASSIGNED # .. 1827 0C82..0C83 ; PVALID # KANNADA SIGN ANUSVARA..KANNADA SIGN 1828 0C84 ; UNASSIGNED # 1829 0C85..0C8C ; PVALID # KANNADA LET A..KANNADA LET VOC L 1830 0C8D ; UNASSIGNED # 1831 0C8E..0C90 ; PVALID # KANNADA LET E..KANNADA LET AI 1832 0C91 ; UNASSIGNED # 1833 0C92..0CA8 ; PVALID # KANNADA LET O..KANNADA LET NA 1834 0CA9 ; UNASSIGNED # 1835 0CAA..0CB3 ; PVALID # KANNADA LET PA..KANNADA LET LLA 1836 0CB4 ; UNASSIGNED # 1837 0CB5..0CB9 ; PVALID # KANNADA LET VA..KANNADA LET HA 1838 0CBA..0CBB ; UNASSIGNED # .. 1839 0CBC..0CC4 ; PVALID # KANNADA SIGN NUKTA..KANNADA VOW SIG 1840 0CC5 ; UNASSIGNED # 1841 0CC6..0CC8 ; PVALID # KANNADA VOW SIGN E..KANNADA VOW SIG 1842 0CC9 ; UNASSIGNED # 1843 0CCA..0CCD ; PVALID # KANNADA VOW SIGN O..KANNADA SIGN VI 1844 0CCE..0CD4 ; UNASSIGNED # .. 1845 0CD5..0CD6 ; PVALID # KANNADA LEN MARK..KANNADA AI LEN MA 1846 0CD7..0CDD ; UNASSIGNED # .. 1847 0CDE ; PVALID # KANNADA LET FA 1848 0CDF ; UNASSIGNED # 1849 0CE0..0CE3 ; PVALID # KANNADA LET VOC RR..KANNADA VOW SIG 1850 0CE4..0CE5 ; UNASSIGNED # .. 1851 0CE6..0CEF ; PVALID # KANNADA DIG ZERO..KANNADA DIG NINE 1852 0CF0 ; UNASSIGNED # 1853 0CF1..0CF2 ; PVALID # KANNADA SIGN JIHVAMULIYA..KANNADA S 1854 0CF3..0D01 ; UNASSIGNED # .. 
1855 0D02..0D03 ; PVALID # MALAY SIGN ANUSVARA..MALAY SIGN VIS 1856 0D04 ; UNASSIGNED # 1857 0D05..0D0C ; PVALID # MALAY LET A..MALAY LET VOC 1858 0D0D ; UNASSIGNED # 1859 0D0E..0D10 ; PVALID # MALAY LET E..MALAY LET AI 1860 0D11 ; UNASSIGNED # 1861 0D12..0D3A ; PVALID # MALAY LET O..MALAY LET TTTA 1862 0D3B..0D3C ; UNASSIGNED # .. 1863 0D3D..0D44 ; PVALID # MALAY SIGN AVAGRAHA..MALAY VOW SIG 1864 0D45 ; UNASSIGNED # 1865 0D46..0D48 ; PVALID # MALAY VOW SIGN E..MALAY VOW SIGN 1866 0D49 ; UNASSIGNED # 1867 0D4A..0D4E ; PVALID # MALAY VOW SIGN O..MALAY LET DOT REP 1868 0D4F..0D56 ; UNASSIGNED # .. 1869 0D57 ; PVALID # MALAY AU LEN MARK 1870 0D58..0D5F ; UNASSIGNED # .. 1871 0D60..0D63 ; PVALID # MALAY LET VOC RR..MALAY VOW 1872 0D64..0D65 ; UNASSIGNED # .. 1873 0D66..0D6F ; PVALID # MALAY DIG ZERO..MALAY DIG NINE 1874 0D70..0D75 ; FREE_PVAL # MALAY NUM TEN..MALAY FRACTION THR 1875 0D76..0D78 ; UNASSIGNED # .. 1876 0D79 ; FREE_PVAL # MALAY DATE MARK 1877 0D7A..0D7F ; PVALID # MALAY LET CHILLU NN..MALAY LET 1878 0D80..0D81 ; UNASSIGNED # .. 1879 0D82..0D83 ; PVALID # SINH SIGN ANUSVARAYA..SINH SIGN VIS 1880 0D84 ; UNASSIGNED # 1881 0D85..0D96 ; PVALID # SINH LET AYANNA..SINH LET AUYANN 1882 0D97..0D99 ; UNASSIGNED # .. 1883 0D9A..0DB1 ; PVALID # SINH LET ALPAPRAANA KAYANNA..SINH L 1884 0DB2 ; UNASSIGNED # 1885 0DB3..0DBB ; PVALID # SINH LET SANYAKA DAYANNA..SINH LETT 1886 0DBC ; UNASSIGNED # 1887 0DBD ; PVALID # SINH LET DANTAJA LAYANNA 1888 0DBE..0DBF ; UNASSIGNED # .. 1889 0DC0..0DC6 ; PVALID # SINH LET VAYANNA..SINH LET FAYAN 1890 0DC7..0DC9 ; UNASSIGNED # .. 1891 0DCA ; PVALID # SINH SIGN AL-LAKUNA 1892 0DCB..0DCE ; UNASSIGNED # .. 1893 0DCF..0DD4 ; PVALID # SINH VOW SIGN AELA-PILLA..SINH VOW 1894 0DD5 ; UNASSIGNED # 1895 0DD6 ; PVALID # SINH VOW SIGN DIGA PAA-PILLA 1896 0DD7 ; UNASSIGNED # 1897 0DD8..0DDF ; PVALID # SINH VOW SIGN GAETTA-PILLA..SINH VO 1898 0DE0..0DF1 ; UNASSIGNED # .. 
1899 0DF2..0DF3 ; PVALID # SINH VOW SIGN DIGA GAETTA-PILLA..SI 1900 0DF4 ; FREE_PVAL # SINH PUNCT KUNDDALIYA 1901 0DF5..0E00 ; UNASSIGNED # .. 1902 0E01..0E32 ; PVALID # THAI CHAR KO KAI..THAI CHAR SARA A 1903 0E33 ; FREE_PVAL # THAI CHAR SARA AM 1904 0E34..0E3A ; PVALID # THAI CHAR SARA I..THAI CHAR PHINTH 1905 0E3B..0E3E ; UNASSIGNED # .. 1906 0E3F ; FREE_PVAL # THAI CURRENCY SYM BAHT 1907 0E40..0E4E ; PVALID # THAI CHAR SARA E..THAI CHAR YAMAKK 1908 0E4F ; FREE_PVAL # THAI CHAR FONGMAN 1909 0E50..0E59 ; PVALID # THAI DIG ZERO..THAI DIG NINE 1910 0E5A..0E5B ; FREE_PVAL # THAI CHAR ANGKHANKHU..THAI CHAR KH 1911 0E5C..0E80 ; UNASSIGNED # .. 1912 0E81..0E82 ; PVALID # LAO LET KO..LAO LET KHO SUNG 1913 0E83 ; UNASSIGNED # 1914 0E84 ; PVALID # LAO LET KHO TAM 1915 0E85..0E86 ; UNASSIGNED # .. 1916 0E87..0E88 ; PVALID # LAO LET NGO..LAO LET CO 1917 0E89 ; UNASSIGNED # 1918 0E8A ; PVALID # LAO LET SO TAM 1919 0E8B..0E8C ; UNASSIGNED # .. 1920 0E8D ; PVALID # LAO LET NYO 1921 0E8E..0E93 ; UNASSIGNED # .. 1922 0E94..0E97 ; PVALID # LAO LET DO..LAO LET THO TAM 1923 0E98 ; UNASSIGNED # 1924 0E99..0E9F ; PVALID # LAO LET NO..LAO LET FO SUNG 1925 0EA0 ; UNASSIGNED # 1926 0EA1..0EA3 ; PVALID # LAO LET MO..LAO LET LO LING 1927 0EA4 ; UNASSIGNED # 1928 0EA5 ; PVALID # LAO LET LO LOOT 1929 0EA6 ; UNASSIGNED # 1930 0EA7 ; PVALID # LAO LET WO 1931 0EA8..0EA9 ; UNASSIGNED # .. 1932 0EAA..0EAB ; PVALID # LAO LET SO SUNG..LAO LET HO SUNG 1933 0EAC ; UNASSIGNED # 1934 0EAD..0EB2 ; PVALID # LAO LET O..LAO VOW SIGN AA 1935 0EB3 ; FREE_PVAL # LAO VOW SIGN AM 1936 0EB4..0EB9 ; PVALID # LAO VOW SIGN I..LAO VOW SIGN UU 1937 0EBA ; UNASSIGNED # 1938 0EBB..0EBD ; PVALID # LAO VOW SIGN MAI KON..LAO SEMIVOW SIG 1939 0EBE..0EBF ; UNASSIGNED # .. 1940 0EC0..0EC4 ; PVALID # LAO VOW SIGN E..LAO VOW SIGN AI 1941 0EC5 ; UNASSIGNED # 1942 0EC6 ; PVALID # LAO KO LA 1943 0EC7 ; UNASSIGNED # 1944 0EC8..0ECD ; PVALID # LAO TONE MAI EK..LAO NIGGAHITA 1945 0ECE..0ECF ; UNASSIGNED # .. 
1946 0ED0..0ED9 ; PVALID # LAO DIG ZERO..LAO DIG NINE 1947 0EDA..0EDB ; UNASSIGNED # .. 1948 0EDC..0EDD ; FREE_PVAL # LAO HO NO..LAO HO MO 1949 0EDE..0EDF ; PVALID # LAO LET KHMU GO..LAO LET KHMU NYO 1950 0EE0..0EFF ; UNASSIGNED # .. 1951 0F00 ; PVALID # TIB SYLL OM 1952 0F01..0F0A ; FREE_PVAL # TIB MARK GTER YIG MGO TRUNC A..TIB 1953 0F0B ; PVALID # TIB MARK INTERSYLLABIC TSHEG 1954 0F0C..0F17 ; FREE_PVAL # TIB MARK DELIMITER TSHEG BSTAR..TIB 1955 0F18..0F19 ; PVALID # TIB ASTROLOGICAL SIGN -KHYUD PA..TIB 1956 0F1A..0F1F ; FREE_PVAL # TIB SIGN RDEL DKAR GCIG..TIB SIGN RD 1957 0F20..0F29 ; PVALID # TIB DIG ZERO..TIB DIG NINE 1958 0F2A..0F34 ; FREE_PVAL # TIB DIG HALF ONE..TIB MARK BSDUS R 1959 0F35 ; PVALID # TIB MARK NGAS BZUNG NYI ZLA 1960 0F36 ; FREE_PVAL # TIB MARK CARET DZUD RTAGS BZHI MIG C 1961 0F37 ; PVALID # TIB MARK NGAS BZUNG SGOR RTAGS 1962 0F38 ; FREE_PVAL # TIB MARK CHE MGO 1963 0F39 ; PVALID # TIB MARK TSA PHRU 1964 0F3A..0F3D ; FREE_PVAL # TIB MARK GUG RTAGS GYON..TIB MARK AN 1965 0F3E..0F47 ; PVALID # TIB SIGN YAR TSHES..TIB LET JA 1966 0F48 ; UNASSIGNED # 1967 0F49..0F6C ; PVALID # TIB LET NYA..TIB LET RRA 1968 0F6D..0F70 ; UNASSIGNED # .. 1969 0F71..0F76 ; PVALID # TIB VOW SIGN AA..TIB VOW SIGN VO 1970 0F77 ; FREE_PVAL # TIB VOW SIGN VO RR 1971 0F78 ; PVALID # TIB VOW SIGN VO L 1972 0F79 ; FREE_PVAL # TIB VOW SIGN VO LL 1973 0F7A..0F84 ; PVALID # TIB VOW SIGN E..TIB MARK H 1974 0F85 ; FREE_PVAL # TIB MARK PALUTA 1975 0F86..0F8F ; PVALID # TIB SIGN LCI RTAGS..TIB SUBJOIN S 1976 0F90..0F97 ; PVALID # TIB SUBJOIN LET KA..TIB SUBJOIN 1977 0F98 ; UNASSIGNED # 1978 0F99..0FBC ; PVALID # TIB SUBJOIN LET NYA..TIB SUBJOI 1979 0FBD ; UNASSIGNED # 1980 0FBE..0FC5 ; FREE_PVAL # TIB KU RU KHA..TIB SYM RDO RJE 1981 0FC6 ; PVALID # TIB SYM PADMA GDAN 1982 0FC7..0FCC ; FREE_PVAL # TIB SYM RDO RJE RGYA GRAM..TIB SY 1983 0FCD ; UNASSIGNED # 1984 0FCE..0FDA ; FREE_PVAL # TIB SIGN RDEL NAG RDEL DKAR..TIB MA 1985 0FDB..0FFF ; UNASSIGNED # ..
1986 1000..1049 ; PVALID # MYAN LET KA..MYAN DIG NINE 1987 104A..104F ; FREE_PVAL # MYAN SIGN LITTLE SECTION..MYAN SYM 1988 1050..109D ; PVALID # MYAN LET SHA..MYAN VOW SIGN AITON 1989 109E..109F ; FREE_PVAL # MYAN SYM SHAN ONE..MYAN SYM SHAN EX 1990 10A0..10C5 ; PVALID # GEORG CAP LET AN..GEORG CAP LET HOE 1991 10C6 ; UNASSIGNED # 1992 10C7 ; PVALID # GEORG CAP LET YN 1993 10C8..10CC ; UNASSIGNED # .. 1994 10CD ; PVALID # GEORG CAP LET AEN 1995 10CE..10CF ; UNASSIGNED # .. 1996 10D0..10FA ; PVALID # GEORG LET AN..GEORG LET AIN 1997 10FB..10FC ; FREE_PVAL # GEORG PARA SEP..MOD LET GEORG NAR 1998 10FD..10FF ; PVALID # GEORG LET AEN..GEORG LET LABIAL 1999 1100..11FF ; DISALLOWED # HANGUL CHO KIYEOK..HANGUL JONG SSA 2000 1200..1248 ; PVALID # ETHI SYL HA..ETHI SYL QWA 2001 1249 ; UNASSIGNED # 2002 124A..124D ; PVALID # ETHI SYL QWI..ETHI SYL QWE 2003 124E..124F ; UNASSIGNED # .. 2004 1250..1256 ; PVALID # ETHI SYL QHA..ETHI SYL QHO 2005 1257 ; UNASSIGNED # 2006 1258 ; PVALID # ETHI SYL QHWA 2007 1259 ; UNASSIGNED # 2008 125A..125D ; PVALID # ETHI SYL QHWI..ETHI SYL QH 2009 125E..125F ; UNASSIGNED # .. 2010 1260..1288 ; PVALID # ETHI SYL BA..ETHI SYL XWA 2011 1289 ; UNASSIGNED # 2012 128A..128D ; PVALID # ETHI SYL XWI..ETHI SYL XWE 2013 128E..128F ; UNASSIGNED # .. 2014 1290..12B0 ; PVALID # ETHI SYL NA..ETHI SYL KWA 2015 12B1 ; UNASSIGNED # 2016 12B2..12B5 ; PVALID # ETHI SYL KWI..ETHI SYL KWE 2017 12B6..12B7 ; UNASSIGNED # .. 2018 12B8..12BE ; PVALID # ETHI SYL KXA..ETHI SYL KXO 2019 12BF ; UNASSIGNED # 2020 12C0 ; PVALID # ETHI SYL KXWA 2021 12C1 ; UNASSIGNED # 2022 12C2..12C5 ; PVALID # ETHI SYL KXWI..ETHI SYL KX 2023 12C6..12C7 ; UNASSIGNED # .. 2024 12C8..12D6 ; PVALID # ETHI SYL WA..ETHI SYL PHAR 2025 12D7 ; UNASSIGNED # 2026 12D8..1310 ; PVALID # ETHI SYL ZA..ETHI SYL GWA 2027 1311 ; UNASSIGNED # 2028 1312..1315 ; PVALID # ETHI SYL GWI..ETHI SYL GWE 2029 1316..1317 ; UNASSIGNED # .. 
2030 1318..135A ; PVALID # ETHI SYL GGA..ETHI SYL FYA 2031 135B..135C ; UNASSIGNED # .. 2032 135D..135F ; PVALID # ETHI COMB GEM AND VOW..ETHI COMB GE 2033 1360..137C ; FREE_PVAL # ETHI SECT MARK..ETHI NUM TEN THOUS 2034 137D..137F ; UNASSIGNED # .. 2035 1380..138F ; PVALID # ETHI SYL SEBATBEIT MWA..ETHI SYL PW 2036 1390..1399 ; FREE_PVAL # ETHI TON MARK YIZET..ETHI TON MARK 2037 139A..139F ; UNASSIGNED # .. 2038 13A0..13F4 ; PVALID # CHEROKEE LET A..CHEROKEE LET YV 2039 13F5..13FF ; UNASSIGNED # .. 2040 1400 ; FREE_PVAL # CANAD SYL HYPHEN 2041 1401..166C ; PVALID # CANAD SYL E..CANAD SYL CAR 2042 166D..166E ; FREE_PVAL # CANAD SYL CHI SIGN..CANAD SYLLAB 2043 166F..167F ; PVALID # CANAD SYL QAI..CANAD SYL B 2044 1680 ; FREE_PVAL # OGHAM SPACE MARK 2045 1681..169A ; PVALID # OGHAM LET BEITH..OGHAM LET PEITH 2046 169B..169C ; FREE_PVAL # OGHAM FEATHER MARK..OGHAM REV FEAT 2047 169D..169F ; UNASSIGNED # .. 2048 16A0..16EA ; PVALID # RUNIC LET FEHU FEOH FE F..RUNIC LET 2049 16EB..16F0 ; FREE_PVAL # RUNIC SINGLE PUNCT..RUNIC BELGTHOR 2050 16F1..16FF ; UNASSIGNED # .. 2051 1700..170C ; PVALID # TAGALOG LET A..TAGALOG LET YA 2052 170D ; UNASSIGNED # 2053 170E..1714 ; PVALID # TAGALOG LET LA..TAGALOG SIGN VIRAMA 2054 1715..171F ; UNASSIGNED # .. 2055 1720..1734 ; PVALID # HANUNOO LET A..HANUNOO SIGN PAMUDPO 2056 1735..1736 ; FREE_PVAL # PHILIP SINGLE PUNCT..PHILIP DOUBLE 2057 1737..173F ; UNASSIGNED # .. 2058 1740..1753 ; PVALID # BUHID LET A..BUHID VOW SIGN U 2059 1754..175F ; UNASSIGNED # .. 2060 1760..176C ; PVALID # TAGBANWA LET A..TAGBANWA LET YA 2061 176D ; UNASSIGNED # 2062 176E..1770 ; PVALID # TAGBANWA LET LA..TAGBANWA LET SA 2063 1771 ; UNASSIGNED # 2064 1772..1773 ; PVALID # TAGBANWA VOW SIGN I..TAGBANWA VOW S 2065 1774..177F ; UNASSIGNED # .. 
2066 1780..17B3 ; PVALID # KHMER LET KA..KHMER IND VOW QAU 2067 17B4..17B5 ; DISALLOWED # KHMER VOW INH AQ..KHMER VOW INH AA 2068 17B6..17D3 ; PVALID # KHMER VOW SIGN AA..KHMER SIGN BATHA 2069 17D4..17D6 ; FREE_PVAL # KHMER SIGN KHAN..KHMER SIGN CAMNUC 2070 17D7 ; PVALID # KHMER SIGN LEK TOO 2071 17D8..17DB ; FREE_PVAL # KHMER SIGN BEYYAL..KHMER CURR SYM R 2072 17DC..17DD ; PVALID # KHMER SIGN AVAKRAHASANYA..KHMER SIG 2073 17DE..17DF ; UNASSIGNED # .. 2074 17E0..17E9 ; PVALID # KHMER DIG ZERO..KHMER DIG NINE 2075 17EA..17EF ; UNASSIGNED # .. 2076 17F0..17F9 ; FREE_PVAL # KHMER SYM LEK ATTAK SON..KHMER SYM 2077 17FA..17FF ; UNASSIGNED # .. 2078 1800..180A ; FREE_PVAL # MONG BIRGA..MONG NIRUGU 2079 180B..180E ; DISALLOWED # MONG FREE VAR SEL ONE..MONG VOW SEP 2080 180F ; UNASSIGNED # 2081 1810..1819 ; PVALID # MONG DIG ZERO..MONG DIG NINE 2082 181A..181F ; UNASSIGNED # .. 2083 1820..1877 ; PVALID # MONG LET A..MONG LET MANCHU 2084 1878..187F ; UNASSIGNED # .. 2085 1880..18AA ; PVALID # MONG LET ALI GALI ANUSVARA ONE..MON 2086 18AB..18AF ; UNASSIGNED # .. 2087 18B0..18F5 ; PVALID # CAN SYL OY..CAN SYL CA 2088 18F6..18FF ; UNASSIGNED # .. 2089 1900..191C ; PVALID # LIMBU VOW-CARRIER LET..LIMBU LET HA 2090 191D..191F ; UNASSIGNED # .. 2091 1920..192B ; PVALID # LIMBU VOW SIGN A..LIMBU SUBJOIN LET 2092 192C..192F ; UNASSIGNED # .. 2093 1930..193B ; PVALID # LIMBU SM LET KA..LIMBU SIGN SA-I 2094 193C..193F ; UNASSIGNED # .. 2095 1940 ; FREE_PVAL # LIMBU SIGN LOO 2096 1941..1943 ; UNASSIGNED # .. 2097 1944..1945 ; FREE_PVAL # LIMBU EXCLAM MARK..LIMBU QUEST MARK 2098 1946..196D ; PVALID # LIMBU DIG ZERO..TAI LE LET AI 2099 196E..196F ; UNASSIGNED # .. 2100 1970..1974 ; PVALID # TAI LE LET TONE-2..TAI LE LET TONE- 2101 1975..197F ; UNASSIGNED # .. 2102 1980..19AB ; PVALID # NEW TAI LUE LET HIGH QA..NEW TAI LU 2103 19AC..19AF ; UNASSIGNED # .. 2104 19B0..19C9 ; PVALID # NEW TAI LUE VOW SIGN VOW SHORT..NEW 2105 19CA..19CF ; UNASSIGNED # .. 
2106 19D0..19D9 ; PVALID # NEW TAI LUE DIG ZERO..NEW TAI DIG N
2107 19DA ; DISALLOWED # NEW TAI LUE THAM
2108 19DB..19DD ; UNASSIGNED # ..
2109 19DE..19FF ; FREE_PVAL # NEW TAI LUE SIGN LAE..KHMER SYM DAP
2110 1A00..1A1B ; PVALID # BUGIN LET KA..BUGIN VOW SIGN AE
2111 1A1C..1A1D ; UNASSIGNED # ..
2112 1A1E..1A1F ; FREE_PVAL # BUGIN PALLAWA..BUGIN END OF SECTION
2113 1A20..1A5E ; PVALID # TAI THAM LET HIGH KA..TAI THAM CONS
2114 1A5F ; UNASSIGNED #
2115 1A60..1A7C ; PVALID # TAI THAM SIGN SAKOT..TAI THAM SIGN
2116 1A7D..1A7E ; UNASSIGNED # ..
2117 1A7F..1A89 ; PVALID # TAI THAM COMB CRYPT DOT..TAI THAM D
2118 1A8A..1A8F ; UNASSIGNED # ..
2119 1A90..1A99 ; PVALID # TAI THAM THAM DIG ZERO..TAI THAM TH
2120 1A9A..1A9F ; UNASSIGNED # ..
2121 1AA0..1AA6 ; FREE_PVAL # TAI THAM SIGN WIANG..TAI THAM SIGN
2122 1AA7 ; PVALID # TAI THAM SIGN MAI YAMOK
2123 1AA8..1AAD ; FREE_PVAL # TAI THAM SIGN KAAN..TAI THAM SIGN C
2124 1AAE..1AFF ; UNASSIGNED # ..
2125 1B00..1B4B ; PVALID # BAL SIGN ULU RICEM..BAL LET ASYURA
2126 1B4C..1B4F ; UNASSIGNED # ..
2127 1B50..1B59 ; PVALID # BAL DIG ZERO..BAL DIG NINE
2128 1B5A..1B6A ; FREE_PVAL # BAL PANTI..BAL MUS SYM DANG
2129 1B6B..1B73 ; PVALID # BAL MUS SYM COMB TEGEH..BAL MUS
2130 1B74..1B7C ; FREE_PVAL # BAL MUS SYM RIGHT-HAND OPEN DUG
2131 1B7D..1B7F ; UNASSIGNED # ..
2132 1B80..1BF3 ; PVALID # SUND SIGN PANYECEK..BATAK PANONGONAN
2133 1BF4..1BFB ; UNASSIGNED # ..
2134 1BFC..1BFF ; FREE_PVAL # BATAK SYM BINDU NA METEK..BATAK SYM
2135 1C00..1C37 ; PVALID # LEPCHA LET KA..LEPCHA SIGN NUKTA
2136 1C38..1C3A ; UNASSIGNED # ..
2137 1C3B..1C3F ; FREE_PVAL # LEPCHA PUNCT TA-ROL..LEPCHA PUNCT T
2138 1C40..1C49 ; PVALID # LEPCHA DIG ZERO..LEPCHA DIG NINE
2139 1C4A..1C4C ; UNASSIGNED # ..
2140 1C4D..1C7D ; PVALID # LEPCHA LET TTA..OL CHIKI AHAD
2141 1C7E..1C7F ; FREE_PVAL # OL CHIKI PUNCT MUCAAD..OL CHIKI PUN
2142 1C80..1CBF ; UNASSIGNED # ..
2143 1CC0..1CC7 ; FREE_PVAL # SUNDA PUNCT BINDU SURYA..SUNDA PUNC
2144 1CC8..1CCF ; UNASSIGNED # ..
2145 1CD0..1CD2 ; PVALID # VED TONE KARSHANA..VED TONE PRENKHA
2146 1CD3 ; FREE_PVAL # VED SIGN NIHSHVASA
2147 1CD4..1CF6 ; PVALID # VED SIGN YAJURVEDIC MID SVARITA..VE
2148 1CF7..1CFF ; UNASSIGNED # ..
2149 1D00..1D2B ; PVALID # LAT LET SM CAP A..CYR LET SM
2150 1D2C..1D2E ; FREE_PVAL # MOD LET CAP A..MOD LET C
2151 1D2F ; PVALID # MOD LET CAP BARRED B
2152 1D30..1D3A ; FREE_PVAL # MOD LET CAP D..MOD LET C
2153 1D3B ; PVALID # MOD LET CAP REV N
2154 1D3C..1D4D ; FREE_PVAL # MOD LET CAP O..MOD LET S
2155 1D4E ; PVALID # MOD LET SM TURNED I
2156 1D4F..1D6A ; FREE_PVAL # MOD LET SM K..GREEK SUB SMA
2157 1D6B..1D77 ; PVALID # LAT SM LET UE..LAT SM LET TU
2158 1D78 ; FREE_PVAL # MOD LET CYR EN
2159 1D79..1D9A ; PVALID # LAT SM LET INSULAR G..LAT SM LE
2160 1D9B..1DBF ; FREE_PVAL # MOD LET SM TURNED ALPHA..MOD
2161 1DC0..1DE6 ; PVALID # COMB DOTTED GRAVE ACCENT..COMB LAT
2162 1DE7..1DFB ; UNASSIGNED # ..
2163 1DFC..1E99 ; PVALID # COMB DOUBLE INV BREVE BEL..LAT SM L
2164 1E9A ; FREE_PVAL # LAT SM LET A W R HALF RING
2165 1E9B..1F15 ; PVALID # LAT SM LET LONG S W BOT ABOVE..GR
2166 1F16..1F17 ; UNASSIGNED # ..
2167 1F18..1F1D ; FREE_PVAL # GREEK CAP LET EPSILON W PSILI..GRE
2168 1F1E..1F1F ; UNASSIGNED # ..
2169 1F20..1F45 ; PVALID # GREEK SM LET ETA W PSILI..GREEK SMA
2170 1F46..1F47 ; UNASSIGNED # ..
2171 1F48..1F4D ; FREE_PVAL # GREEK CAP LET OMICRON W PSILI..GRE
2172 1F4E..1F4F ; UNASSIGNED # ..
2173 1F50..1F57 ; PVALID # GREEK SM LET UPSILON W PSILI..GREEK
2174 1F58 ; UNASSIGNED #
2175 1F59 ; PVALID # GREEK CAP LET UPSILON W DASIA
2176 1F5A ; UNASSIGNED #
2177 1F5B ; PVALID # GREEK CAP LET UPSILON W DASIA AND
2178 1F5C ; UNASSIGNED #
2179 1F5D ; PVALID # GREEK CAP LET UPSILON W DASIA AND
2180 1F5E ; UNASSIGNED #
2181 1F5F..1F7D ; PVALID # GREEK CAP LET UPSILON W DASIA A..GR
2182 1F7E..1F7F ; UNASSIGNED # ..
2183 1F80..1F87 ; PVALID # GREEK SM LET ALPHA W PSILI AND YPOG
2184 1F88..1F8F ; FREE_PVAL # GREEK CAP LET ALPHA W PSILI AND..GR
2185 1F90..1F97 ; PVALID # GREEK SM LET ETA W PSILI AND YP..GR
2186 1F98..1F9F ; FREE_PVAL # GREEK CAP LET ETA W PSILI AND P..GR
2187 1FA0..1FA7 ; PVALID # GREEK SM LET OMEGA W PSILI AND ..GR
2188 1FA8..1FAF ; FREE_PVAL # GREEK CAP LET OMEGA W PSILI AN..GR
2189 1FB0..1FB4 ; PVALID # GREEK SM LET ALPHA W VRACHY..GREEK
2190 1FB5 ; UNASSIGNED #
2191 1FB6..1FBB ; PVALID # GREEK SM LET ALPHA W PERISPOMEN..GR
2192 1FBC..1FBD ; FREE_PVAL # GREEK CAP LET ALPHA W PROSGEGRA..GR
2193 1FBE ; PVALID # GREEK PROSGEGRAMMENI
2194 1FBF..1FC1 ; FREE_PVAL # GREEK PSILI..GREEK DIALYTIKA AND PE
2195 1FC2..1FC4 ; PVALID # GREEK SM LET ETA W VARIA AND YP..GR
2196 1FC5 ; UNASSIGNED #
2197 1FC6..1FCB ; PVALID # GREEK SM LET ETA W PERISPOMENI..GR
2198 1FCC..1FCF ; FREE_PVAL # GREEK CAP LET ETA W PROSGEGRAM..GR
2199 1FD0..1FD3 ; PVALID # GREEK SM LET IOTA W VRACHY..GREEK S
2200 1FD4..1FD5 ; UNASSIGNED # ..
2201 1FD6..1FDB ; PVALID # GREEK SM LET IOTA W PERISPOMENI..GR
2202 1FDC ; UNASSIGNED #
2203 1FDD..1FDF ; FREE_PVAL # GREEK DASIA AND VARIA..GREEK DASIA
2204 1FE0..1FEC ; PVALID # GREEK SM LET UPSILON W VRACHY..GREE
2205 1FED..1FEF ; FREE_PVAL # GREEK DIALYTIKA AND VARIA..GREEK VA
2206 1FF0..1FF1 ; UNASSIGNED # ..
2207 1FF2..1FF4 ; FREE_PVAL # GREEK SM LET OMEGA W VARIA AND YPOG
2208 1FF5 ; UNASSIGNED #
2209 1FF6..1FFB ; PVALID # GREEK SM LET OMEGA W PERISPOMEN..GR
2210 1FFC..1FFE ; FREE_PVAL # GREEK CAP LET OMEGA W PROSGEGRA..GR
2211 1FFF ; UNASSIGNED #
2212 2000..200A ; FREE_PVAL # EN QUAD..HAIR SPACE
2213 200B ; DISALLOWED # ZERO WIDTH SPACE
2214 200C..200D ; CONTEXTJ # ZERO WIDTH NON-JOINER..ZERO WIDTH J
2215 200E..200F ; DISALLOWED # LEFT-TO-RIGHT MARK..RIGHT-TO-LEFT M
2216 2010..2027 ; FREE_PVAL # HYPHEN..HYPHENATION POINT
2217 2028..202E ; DISALLOWED # LINE SEP..RIGHT-TO-LEFT OVERRIDE
2218 202F..205F ; FREE_PVAL # NARROW NO-BREAK SPACE..MED MATH SP
2219 2060..2064 ; DISALLOWED # WORD JOINER..INVISIBLE PLUS
2220 2065 ; UNASSIGNED #
2221 2066..206F ; DISALLOWED # LEFT-TO-RIGHT IS..NOM DIGIT SHAPES
2222 2070..2071 ; FREE_PVAL # SUPER ZERO..SUPER LAT SM LET I
2223 2072..2073 ; UNASSIGNED # ..
2224 2074..208E ; FREE_PVAL # SUPER FOUR..SUB RIGHT PARENTHESIS
2225 208F ; UNASSIGNED #
2226 2090..209C ; FREE_PVAL # LAT SUB SM LET A..LAT SUB SM LET T
2227 209D..209F ; UNASSIGNED # ..
2228 20A0..20BA ; FREE_PVAL # EURO-CURRENCY SIGN..TURKISH LIRA SI
2229 20BB..20CF ; UNASSIGNED # ..
2230 20D0..20DC ; PVALID # COMB LEFT HARPOON ABOVE..COMB FOUR
2231 20DD..20E0 ; FREE_PVAL # COMB ENC CIRC..COMB ENC CIRC BACKS
2232 20E1 ; PVALID # COMB L R ARROW ABOVE
2233 20E2..20E4 ; FREE_PVAL # COMB ENC SCREEN..COMB ENC UPWARD PO
2234 20E5..20F0 ; PVALID # COMB REV SOLIDUS OVERLAY..COMB ASTE
2235 20F1..20FF ; UNASSIGNED # ..
2236 2100..2129 ; FREE_PVAL # ACCOUNT OF..TURNED GREEK SM LET IOT
2237 212A..212B ; PVALID # KELVIN SIGN..ANGSTROM SIGN
2238 212C..2131 ; FREE_PVAL # SCRIPT CAP C..SCRIPT CAP F
2239 2132 ; PVALID # TURNED CAP F
2240 2133..214D ; FREE_PVAL # SCRIPT CAP M..AKTIESELSKAB
2241 214E ; PVALID # TURNED SM F
2242 214F..2182 ; FREE_PVAL # SYM FOR SAMAR SOURCE..ROM NUM TEN T
2243 2183..2184 ; PVALID # ROM NUM REV ONE HUNDRED..LAT SM LET
2244 2185..2189 ; FREE_PVAL # ROM NUM SIX LATE FORM..VULGAR FRACT
2245 218A..218F ; UNASSIGNED # ..
2246 2190..23F3 ; FREE_PVAL # LEFTWARDS ARROW..HOURGLASS W FLO
2247 23F4..23FF ; UNASSIGNED # ..
2248 2400..2426 ; FREE_PVAL # SYM FOR NULL..SYM FOR SUB FORM
2249 2427..243F ; UNASSIGNED # ..
2250 2440..244A ; FREE_PVAL # OCR HOOK..OCR DOUBLE BACKSLASH
2251 244B..245F ; UNASSIGNED # ..
2252 2460..26FF ; FREE_PVAL # CIRCLED DIG ONE..WHITE FLAG W HORIZ
2253 2700 ; UNASSIGNED #
2254 2701..2B4C ; FREE_PVAL # UP BLADE SCISSORS..RIGHTWARDS ARROW
2255 2B4D..2B4F ; UNASSIGNED # ..
2256 2B50..2B59 ; FREE_PVAL # WHITE MEDIUM STAR..HEAVY CIRCLED SA
2257 2B5A..2BFF ; UNASSIGNED # ..
2258 2C00..2C2E ; PVALID # GLAG CAP LET AZU..GLAG CA
2259 2C2F ; UNASSIGNED #
2260 2C30..2C5E ; PVALID # GLAG SM LET AZU..GLAG SMAL
2261 2C5F ; UNASSIGNED #
2262 2C60..2C7B ; PVALID # LAT CAP LET L W DOUBLE BAR..LAT SM
2263 2C7C..2C7D ; FREE_PVAL # LAT SUB SM LET J..MOD LET CAP V
2264 2C7E..2CE4 ; PVALID # LAT CAP LET S W SWASH TAIL..COPT SY
2265 2CE5..2CEA ; FREE_PVAL # COPT SYM MI RO..COPT SYM SHIMA SIMA
2266 2CEB..2CF3 ; PVALID # COPT CAP LET CRYPTOGRAMMIC SHEI..CO
2267 2CF4..2CF8 ; UNASSIGNED # ..
2268 2CF9..2CFF ; FREE_PVAL # COPT OLD NUB FULL STOP..COPT MORPHO
2269 2D00..2D25 ; PVALID # GEORG SM LET AN..GEORG SM LET
2270 2D26 ; UNASSIGNED #
2271 2D27 ; PVALID # GEORG SM LET YN
2272 2D28..2D2C ; UNASSIGNED # ..
2273 2D2D ; PVALID # GEORG SM LET AEN
2274 2D2E..2D2F ; UNASSIGNED # ..
2275 2D30..2D67 ; PVALID # TIFINAGH LET YA..TIFINAGH LETTER YO
2276 2D68..2D6E ; UNASSIGNED # ..
2277 2D6F..2D70 ; FREE_PVAL # TIFINAGH MOD LET LABIALIZATION MARK
2278 2D71..2D7E ; UNASSIGNED # ..
2279 2D7F..2D96 ; PVALID # TIFINAGH CONS JOINER..ETHI SYL GGW
2280 2D97..2D9F ; UNASSIGNED # ..
2281 2DA0..2DA6 ; PVALID # ETHI SYL SSA..ETHI SYL SSO
2282 2DA7 ; UNASSIGNED #
2283 2DA8..2DAE ; PVALID # ETHI SYL CCA..ETHI SYL CCO
2284 2DAF ; UNASSIGNED #
2285 2DB0..2DB6 ; PVALID # ETHI SYL ZZA..ETHI SYL ZZO
2286 2DB7 ; UNASSIGNED #
2287 2DB8..2DBE ; PVALID # ETHI SYL CCHA..ETHI SYL CC
2288 2DBF ; UNASSIGNED #
2289 2DC0..2DC6 ; PVALID # ETHI SYL QYA..ETHI SYL QYO
2290 2DC7 ; UNASSIGNED #
2291 2DC8..2DCE ; PVALID # ETHI SYL KYA..ETHI SYL KYO
2292 2DCF ; UNASSIGNED #
2293 2DD0..2DD6 ; PVALID # ETHI SYL XYA..ETHI SYL XYO
2294 2DD7 ; UNASSIGNED #
2295 2DD8..2DDE ; PVALID # ETHI SYL GYA..ETHI SYL GYO
2296 2DDF ; UNASSIGNED #
2297 2DE0..2DFF ; PVALID # COMB CYR LET BE..COMB CYRI
2298 2E00..2E2E ; FREE_PVAL # RIGHT ANGLE SUB MARK..REV QUEST MAR
2299 2E2F ; PVALID # VERT TILDE
2300 2E30..2E3B ; FREE_PVAL # RING PNT..THREE-EM DASH
2301 2E3C..2E7F ; UNASSIGNED # ..
2302 2E80..2E99 ; FREE_PVAL # CJK RAD REPEAT..CJK RAD RAP
2303 2E9A ; UNASSIGNED #
2304 2E9B..2EF3 ; FREE_PVAL # CJK RAD CHOKE..CJK RAD C-SIMPLIFIED
2305 2EF4..2EFF ; UNASSIGNED # ..
2306 2F00..2FD5 ; FREE_PVAL # KANGXI RAD ONE..KANGXI RAD FLUTE
2307 2FD6..2FEF ; UNASSIGNED # ..
2308 2FF0..2FFB ; FREE_PVAL # IDEO DESC CHAR LEFT TO RIGHT..IDEO
2309 2FFC..2FFF ; UNASSIGNED # ..
2310 3000..3004 ; FREE_PVAL # IDEO SPACE..JAPAN INDUST STAND
2311 3005..3007 ; PVALID # IDEO ITER MARK..IDEO NUMB ZERO
2312 3008..3029 ; FREE_PVAL # LEFT ANGLE BRACKET..HANGZH NUM NINE
2313 302A..302D ; PVALID # IDEO LEVEL TONE MARK..IDEO ENT
2314 302E..302F ; DISALLOWED # HANGUL SING DOT TONE MARK..WAVY DAS
2315 3030 ; FREE_PVAL # WAVY DASH
2316 3031..3035 ; DISALLOWED # VERT KANA REP MARK..VERT KANA REP M
2317 3036..303A ; FREE_PVAL # CIRCLED POSTAL MARK..HANGZH NUM THI
2318 303B ; DISALLOWED # VERT IDEO ITER MARK
2319 303C ; PVALID # MASU MARK
2320 303D..303F ; FREE_PVAL # PART ALTER MARK..IDEO HALF FILL
2321 3040 ; UNASSIGNED #
2322 3041..3096 ; PVALID # HIRAGANA LET SM A..HIRAGANA LET SMA
2323 3097..3098 ; UNASSIGNED # ..
2324 3099..309A ; PVALID # COMB KAT-HIR VOICED SOUND
2325 309B..309C ; FREE_PVAL # KAT-HIR VOICED SOUND MARK..KAT-HIR
2326 309D..309E ; PVALID # HIRAGANA ITER MARK..HIRAGANA VOICED
2327 309F..30A0 ; FREE_PVAL # HIRAGANA DIGRAPH YORI..KAT-HIR DOU
2328 30A1..30FA ; PVALID # KATAKANA LET SM A..KATAKANA LET VO
2329 30FB ; CONTEXTO # KATAKANA MIDDLE DOT
2330 30FC..30FE ; PVALID # KAT-HIR PROLONGED SOUND MARK..KATA
2331 30FF ; FREE_PVAL # KATAKANA DIGRAPH KOTO
2332 3100..3104 ; UNASSIGNED # ..
2333 3105..312D ; PVALID # BOPOMOFO LET B..BOPOMOFO LET IH
2334 312E..3130 ; UNASSIGNED # ..
2335 3131..3163 ; FREE_PVAL # HANGUL LET KIYEOK..HANGUL LET I
2336 3164 ; DISALLOWED # HANGUL FILLER
2337 3165..318E ; FREE_PVAL # HANGUL LET SSANGNIEUN..HANGUL LET
2338 318F ; UNASSIGNED #
2339 3190..319F ; FREE_PVAL # IDEO ANNO LINK MARK..IDEO ANNO MAN
2340 31A0..31BA ; PVALID # BOPOMOFO LET BU..BOPOMOFO LET ZY
2341 31BB..31BF ; UNASSIGNED # ..
2342 31C0..31E3 ; FREE_PVAL # CJK STROKE T..CJK STROKE Q
2343 31E4..31EF ; UNASSIGNED # ..
2344 31F0..31FF ; PVALID # KATAKANA LET SM KU..KATAKANA LET SM
2345 3200..321E ; FREE_PVAL # PAREN HANGUL KIYEOK..PAREN KOREAN C
2346 321F ; UNASSIGNED #
2347 3220..32FE ; FREE_PVAL # PAREN IDEO ONE..CIRCLED KATAKANA WO
2348 32FF ; UNASSIGNED #
2349 3300..33FF ; FREE_PVAL # SQUARE APAATO..SQUARE GAL
2350 3400..4DB5 ; PVALID #
2351 4DB6..4DBF ; UNASSIGNED # ..
2352 4DC0..4DFF ; FREE_PVAL # HEX FOR THE CREATIVE HEAVEN..HEX FO
2353 4E00..9FCC ; PVALID #
2354 9FCD..9FFF ; UNASSIGNED # ..
2355 A000..A48C ; PVALID # YI SYL IT..YI SYL YYR
2356 A48D..A48F ; UNASSIGNED # ..
2357 A490..A4C6 ; FREE_PVAL # YI RAD QOT..YI RAD KE
2358 A4C7..A4CF ; UNASSIGNED # ..
2359 A4D0..A4FD ; PVALID # LISU LET BA..LISU LET TONE MYA JEU
2360 A4FE..A4FF ; FREE_PVAL # LISU PUNCT COMMA..LISU PUNCT FUL
2361 A500..A60C ; PVALID # VAI SYL EE..VAI SYL LENENER
2362 A60D..A60F ; FREE_PVAL # VAI COMMA..VAI QUEST MARK
2363 A610..A62B ; PVALID # VAI SYL NDOLE FA..VAI SYL NDOLE DO
2364 A62C..A63F ; UNASSIGNED # ..
2365 A640..A66F ; PVALID # CYR CAP LET ZEMLYA..COMB CYR VZMET
2366 A670..A673 ; FREE_PVAL # COMB CYR TEN MILLIONS SIGN..SLAVON
2367 A674..A67D ; PVALID # COMB CYR KAVYKA..COMB CYR PAYEROK
2368 A67E ; FREE_PVAL # CYR KAVYKA
2369 A67F..A697 ; PVALID # CYR PAYEROK..CYR SM LET SHWE
2370 A698..A69E ; UNASSIGNED # ..
2371 A69F..A6E5 ; PVALID # COMB CYR LET IOTIFIED E..BAMUM LET
2372 A6E6..A6EF ; FREE_PVAL # BAMUM LET MO..BAMUM LET KOGHOM
2373 A6F0..A6F1 ; PVALID # BAMUM COMB MARK KOQNDON..BAMUM COMB
2374 A6F2..A6F7 ; FREE_PVAL # BAMUM NJAEMLI..BAMUM QUEST MARK
2375 A6F8..A6FF ; UNASSIGNED # ..
2376 A700..A716 ; FREE_PVAL # MOD LET CHIN TONE YIN PING..MOD
2377 A717..A71F ; PVALID # MOD LET DOT VERT BAR..MOD L
2378 A720..A721 ; FREE_PVAL # MOD LET STRESS AND HIGH TONE..MOD
2379 A722..A76F ; PVALID # LAT CAP LET EGYPT ALEF..LAT SM LET
2380 A770 ; FREE_PVAL # MODIFIER LETTER US
2381 A771..A788 ; PVALID # LATIN SMALL LETTER DUM..MOD LET LOW
2382 A789..A78A ; FREE_PVAL # MOD LET COLON..MOD LET SH EQUALS SI
2383 A78B..A78E ; PVALID # LAT SM LET SALTILLO..LAT SM LET L W
2384 A78F ; UNASSIGNED #
2385 A790..A793 ; PVALID # LAT CAP LET N W DESC..LAT SM LET C
2386 A794..A79F ; UNASSIGNED # ..
2387 A7A0..A7AA ; PVALID # LAT CAP LET G W OBLIQUE STROKE..LAT
2388 A7AB..A7F7 ; UNASSIGNED # ..
2389 A7F8..A7F9 ; FREE_PVAL # MOD LET CAP H W STROKE..MOD LET SM
2390 A7FA..A827 ; PVALID # LAT LET SM CAP TURNED M..SYLOTI NA
2391 A828..A82B ; FREE_PVAL # SYLOTI NAGRI POET MARK-1..SYLOTI NA
2392 A82C..A82F ; UNASSIGNED # ..
2393 A830..A839 ; FREE_PVAL # N INDIC FRACT ONE QUART..N INDIC QU
2394 A83A..A83F ; UNASSIGNED # ..
2395 A840..A873 ; PVALID # PHAGS-PA LET KA..PHAGS-PA LET CANDR
2396 A874..A877 ; FREE_PVAL # PHAGS-PA SINGLE HEAD MARK..PHAGS-PA
2397 A878..A87F ; UNASSIGNED # ..
2398 A880..A8C4 ; PVALID # SAUR SIGN ANUSVARA..SAUR SIGN VIRAM
2399 A8C5..A8CD ; UNASSIGNED # ..
2400 A8CE..A8CF ; FREE_PVAL # SAUR DANDA..SAUR DOUBLE DANDA
2401 A8D0..A8D9 ; PVALID # SAUR DIG ZERO..SAUR DIG NINE
2402 A8DA..A8DF ; UNASSIGNED # ..
2403 A8E0..A8F7 ; PVALID # COMB DEVAN DIG ZERO..DEVAN SIGN CAN
2404 A8F8..A8FA ; FREE_PVAL # DEVAN SIGN PUSHPIKA..DEVAN CARET
2405 A8FB ; PVALID # DEVAN HEADSTROKE
2406 A8FC..A8FF ; UNASSIGNED # ..
2407 A900..A92D ; PVALID # KAYAH LI DIG ZERO..KAYAH LI TONE CA
2408 A92E..A92F ; FREE_PVAL # KAYAH LI SIGN CWI..KAYAH LI SIGN SH
2409 A930..A953 ; PVALID # REJANG LET KA..REJANG VIRAMA
2410 A954..A95E ; UNASSIGNED # ..
2411 A95F ; FREE_PVAL # REJANG SECTION MARK
2412 A960..A97C ; DISALLOWED # HANGUL CHO TIKEUT-MIUEM..HANGUL CHO
2413 A97D..A97F ; UNASSIGNED # ..
2414 A980..A9C0 ; PVALID # JAV SIGN PANYANGGA..JAV PANGKON
2415 A9C1..A9CD ; FREE_PVAL # JAV LEFT RERENGGAN..JAV TURNED PADA
2416 A9CE ; UNASSIGNED #
2417 A9CF..A9D9 ; PVALID # JAV PANGRANGKEP..JAV DIG NINE
2418 A9DA..A9DD ; UNASSIGNED # ..
2419 A9DE..A9DF ; FREE_PVAL # JAV PADA TIRTA TUMETES..JAV PADA I
2420 A9E0..A9FF ; UNASSIGNED # ..
2421 AA00..AA36 ; PVALID # CHAM LET A..CHAM CONS SIGN WA
2422 AA37..AA3F ; UNASSIGNED # ..
2423 AA40..AA4D ; PVALID # CHAM LET FIN K..CHAM CONS SIGN FIN
2424 AA4E..AA4F ; UNASSIGNED # ..
2425 AA50..AA59 ; PVALID # CHAM DIG ZERO..CHAM DIG NINE
2426 AA5A..AA5B ; UNASSIGNED # ..
2427 AA5C..AA5F ; FREE_PVAL # CHAM PUNCT SPIRAL..CHAM PUNCT TR
2428 AA60..AA76 ; PVALID # MYAN LET KHAMTI GA..MYAN LOGOGRAM K
2429 AA77..AA79 ; FREE_PVAL # MYAN SYM AITON EXCLAM..MYAN SYM AIT
2430 AA7A..AA7B ; PVALID # MYAN LET AITON RA..MYAN SIGN PAO KA
2431 AA7C..AA7F ; UNASSIGNED # ..
2432 AA80..AAC2 ; PVALID # TAI VIET LET LOW KO..TAI VIET TONE
2433 AAC3..AADA ; UNASSIGNED # ..
2434 AADB..AADD ; PVALID # TAI VIET SYM KON..TAI VIET SYM SAM
2435 AADE..AADF ; FREE_PVAL # TAI VIET SYM HO HOI..TAI VIET SYM K
2436 AAE0..AAEF ; PVALID # MEETEI MAYEK LET E..MEETEI MAYEK VO
2437 AAF0..AAF1 ; FREE_PVAL # MEETEI MAYEK CHEIKHAN..MEETEI MAYEK
2438 AAF2..AAF6 ; PVALID # MEETEI MAYEK ANJI..MEETEI MAYEK VIR
2439 AAF7..AB00 ; UNASSIGNED # ..
2440 AB01..AB06 ; PVALID # ETHI SYL TTHU..ETHI SYL TTHO
2441 AB07..AB08 ; UNASSIGNED # ..
2442 AB09..AB0E ; PVALID # ETHI SYL DDHAA..ETHI SYL DDHO
2443 AB0F..AB10 ; UNASSIGNED # ..
2444 AB11..AB16 ; PVALID # ETHI SYL DZU..ETHI SYL DZO
2445 AB17..AB1F ; UNASSIGNED # ..
2446 AB20..AB26 ; PVALID # ETHI SYL CCHHA..ETHI SYL CCHHO
2447 AB27 ; UNASSIGNED # ..
2448 AB28..AB2E ; PVALID # ETHI SYL BBAA..ETHI SYL BBO
2449 AB2F..ABBF ; UNASSIGNED # ..
2450 ABC0..ABEA ; PVALID # MEETEI MAYEK LET KOK..MEETEI MAYEK
2451 ABEB ; FREE_PVAL # MEETEI MAYEK CHEIKHEI
2452 ABEC..ABED ; PVALID # MEETEI MAYEK LUM IYEK..MEETEI MAYEK
2453 ABEE..ABEF ; UNASSIGNED # ..
2454 ABF0..ABF9 ; PVALID # MEETEI MAYEK DIG ZERO..MEETEI MAYEK
2455 ABFA..ABFF ; UNASSIGNED # ..
2456 AC00..D7A3 ; PVALID #
2457 D7A4..D7AF ; UNASSIGNED # ..
2458 D7B0..D7C6 ; DISALLOWED # HANGUL JUNG O-YEO..HANGUL JUNG ARAE
2459 D7C7..D7CA ; UNASSIGNED # ..
2460 D7CB..D7FB ; DISALLOWED # HANGUL JONG NIEUN-RIEUL..HANGUL JON
2461 D7FC..D7FF ; UNASSIGNED # ..
2462 D800..F8FF ; DISALLOWED #
2463 F900..FA6D ; PVALID # CJK COMP IDEO-F900..CJK COMP IDEO
2464 FA6E..FA6F ; UNASSIGNED # ..
2465 FA70..FAD9 ; PVALID # CJK COMP IDEO-FA70..CJK COMP IDEO
2466 FADA..FAFF ; UNASSIGNED # ..
2467 FB00..FB06 ; FREE_PVAL # LAT SM LIG FF..LAT SM LIG ST
2468 FB07..FB12 ; UNASSIGNED # ..
2469 FB13..FB17 ; FREE_PVAL # ARMENIAN SM LIG MEN NOW..ARMENIAN SM
2470 FB18..FB1C ; UNASSIGNED # ..
2471 FB1D..FB1F ; PVALID # HEBR LET YOD W HIRIQ..HEBR LIG YID Y
2472 FB20..FB29 ; FREE_PVAL # HEBR LET ALT AYIN..HEBR LET ALT PLUS
2473 FB2A..FB36 ; PVALID # HEBR LET SHIN W SHIN DOT..HEBR LET Z
2474 FB37 ; UNASSIGNED #
2475 FB38..FB3C ; PVALID # HEBR LET TET W DAGESH..HEBR LET
2476 FB3D ; UNASSIGNED #
2477 FB3E ; PVALID # HEBR LET MEM W DAGESH
2478 FB3F ; UNASSIGNED #
2479 FB40..FB41 ; PVALID # HEBR LET NUN W DAGESH..HEBR LET
2480 FB42 ; UNASSIGNED #
2481 FB43..FB44 ; PVALID # HEBR LET FIN PE W DAGESH..HEBR L
2482 FB45 ; UNASSIGNED #
2483 FB46..FB4E ; PVALID # HEBR LET TSADI W DAGESH..HEBR LET P
2484 FB4F..FBC1 ; FREE_PVAL # HEBR LIG ALEF LAMED..ARAB SYM S
2485 FBC2..FBD2 ; UNASSIGNED # ..
2486 FBD3..FD3F ; FREE_PVAL # ARAB LET NG ISO FORM..ORNATE RIGHT
2487 FD40..FD4F ; UNASSIGNED # ..
2488 FD50..FD8F ; FREE_PVAL # ARAB LIG TEH W JEEM W MEEM INIT
2489 FD90..FD91 ; UNASSIGNED # ..
2490 FD92..FDC7 ; FREE_PVAL # ARAB LIG MEEM W JEEM W KHAH INI
2491 FDC8..FDCF ; UNASSIGNED # ..
2492 FDD0..FDEF ; DISALLOWED # ..
2493 FDF0..FDFD ; FREE_PVAL # ARAB LIG SALLA USED..ARAB LIG BISMI
2494 FDFE..FDFF ; UNASSIGNED # ..
2495 FE00..FE0F ; DISALLOWED # VAR SEL-1..VAR SEL-16
2496 FE10..FE19 ; FREE_PVAL # PRES FORM FOR VERT COMMA..PRES FORM
2497 FE1A..FE1F ; UNASSIGNED # ..
2498 FE20..FE26 ; PVALID # COMB LIG LEFT HALF..COMB CONJ MACRO
2499 FE27..FE2F ; UNASSIGNED # ..
2500 FE30..FE52 ; FREE_PVAL # PRES FORM FOR VERT TWO DOT LEAD..SM
2501 FE53 ; UNASSIGNED #
2502 FE54..FE66 ; FREE_PVAL # SM SEMICOLON..SM EQUALS SIGN
2503 FE67 ; UNASSIGNED #
2504 FE68..FE6B ; FREE_PVAL # SM REV SOLIDUS..SM COMM AT
2505 FE6C..FE6F ; UNASSIGNED # ..
2506 FE70..FE72 ; FREE_PVAL # ARAB FATHATAN ISO FORM..ARAB DAMMAT
2507 FE73 ; PVALID # ARAB TAIL FRAGMENT
2508 FE74 ; FREE_PVAL # ARAB KASRATAN ISO FORM
2509 FE75 ; UNASSIGNED #
2510 FE76..FEFC ; FREE_PVAL # ARAB FATHA ISO FORM..ARAB LIG LAM W
2511 FEFD..FEFE ; UNASSIGNED # ..
2512 FEFF ; DISALLOWED # ZERO WIDTH NO-BREAK SPACE
2513 FF00 ; UNASSIGNED #
2514 FF01..FF9F ; FREE_PVAL # FULLW EXCLAM MARK..HALFW KATA SE
2515 FFA0 ; DISALLOWED # HALFW HANGUL FILLER
2516 FFA1..FFBE ; FREE_PVAL # HALFW HANGUL LET KIYEOK..HALFW H
2517 FFBF..FFC1 ; UNASSIGNED # ..
2518 FFC2..FFC7 ; FREE_PVAL # HALFW HANGUL LET A..HALFW HANGUL
2519 FFC8..FFC9 ; UNASSIGNED # ..
2520 FFCA..FFCF ; FREE_PVAL # HALFW HANGUL LET YEO..HALFW HANGU
2521 FFD0..FFD1 ; UNASSIGNED # ..
2522 FFD2..FFD7 ; FREE_PVAL # HALFW HANGUL LET YO..HALFW HANGUL
2523 FFD8..FFD9 ; UNASSIGNED # ..
2524 FFDA..FFDC ; FREE_PVAL # HALFW HANGUL LET EU..HALFW HANGUL
2525 FFDD..FFDF ; UNASSIGNED # ..
2526 FFE0..FFE6 ; FREE_PVAL # FULLW CENT SIGN..FULLW WON SIGN
2527 FFE7 ; UNASSIGNED #
2528 FFE8..FFEE ; FREE_PVAL # HALFW FORMS LIGHT VERT..HALFW WH
2529 FFEF..FFF8 ; UNASSIGNED # ..
2530 FFF9..FFFB ; DISALLOWED # INTERL ANNO ANCHOR..INTERL ANNO TER
2531 FFFC..FFFD ; FREE_PVAL # OBJECT REPL CHAR..REPL CHAR
2532 FFFE..FFFF ; DISALLOWED # ..
2533 10000..1000B; PVALID # LIN B SYL B008 A..LIN B SYL
2534 1000C ; UNASSIGNED #
2535 1000D..10026; PVALID # LIN B SYL B036 JO..LIN B SYL
2536 10027 ; UNASSIGNED #
2537 10028..1003A; PVALID # LIN B SYL B060 RA..LIN B SYL
2538 1003B ; UNASSIGNED #
2539 1003C..1003D; PVALID # LIN B SYL B017 ZA..LIN B SYL
2540 1003E ; UNASSIGNED #
2541 1003F..1004D; PVALID # LIN B SYL B020 ZO..LIN B SYL
2542 1004E..1004F; UNASSIGNED # ..
2543 10050..1005D; PVALID # LIN B SYM B018..LIN B SYM B089
2544 1005E..1007F; UNASSIGNED # ..
2545 10080..100FA; PVALID # LIN B IDEO B100 MAN..LIN B IDEO
2546 100FB..100FF; UNASSIGNED # ..
2547 10100..10102; FREE_PVAL # AEG WORD SEP LINE..AEG CHECK MAR
2548 10103..10106; UNASSIGNED # ..
2549 10107..10133; FREE_PVAL # AEG NUM ONE..AEG NUM NINETY THOU
2550 10134..10136; UNASSIGNED # ..
2551 10137..1018A; FREE_PVAL # AEG WEIGHT BASE UNIT..GREEK ZERO SI
2552 1018B..1018F; UNASSIGNED # ..
2553 10190..1019B; FREE_PVAL # ROM SEXTANS SIGN..ROM CENTURIAL SIG
2554 1019C..101CF; UNASSIGNED # ..
2555 101D0..101FC; FREE_PVAL # PHAISTOS DISC SIGN PED..PHAISTOS DI
2556 101FD ; PVALID # PHAISTOS DISC SIGN COMB OBLIQUE STR
2557 101FE..1027F; UNASSIGNED # ..
2558 10280..1029C; PVALID # LYCIAN LET A..LYCIAN LET X
2559 1029D..1029F; UNASSIGNED # ..
2560 102A0..102D0; PVALID # CARIAN LET A..CARIAN LET UUU3
2561 102D1..102FF; UNASSIGNED # ..
2562 10300..1031E; PVALID # OLD ITAL LET A..OLD ITAL LET UU
2563 1031F ; UNASSIGNED #
2564 10320..10323; FREE_PVAL # OLD ITAL NUM ONE..OLD ITAL NUM F
2565 10324..1032F; UNASSIGNED # ..
2566 10330..10340; PVALID # GOTH LET AHSA..GOTH LET PAIRTHRA
2567 10341 ; FREE_PVAL # GOTH LET NINETY
2568 10342..10349; PVALID # GOTH LET RAIDA..GOTH LET OTHAL
2569 1034A ; FREE_PVAL # GOTH LET NINE HUNDRED
2570 1034B..1037F; UNASSIGNED # ..
2571 10380..1039D; PVALID # UGAR LET ALPA..UGAR LET SSU
2572 1039E ; UNASSIGNED #
2573 1039F ; FREE_PVAL # UGAR WORD DIVIDER
2574 103A0..103C3; PVALID # OLD PERS SIGN A..OLD PERS SIGN HA
2575 103C4..103C7; UNASSIGNED # ..
2576 103C8..103CF; PVALID # OLD PERS SIGN AURAMAZDAA..OLD PERS
2577 103D0..103D5; FREE_PVAL # OLD PERS WORD DIVIDER..OLD PERS NUM
2578 103D6..103FF; UNASSIGNED # ..
2579 10400..1049D; PVALID # DESERET CAP LET LONG I..OSMANYA LET
2580 1049E..1049F; UNASSIGNED # ..
2581 104A0..104A9; PVALID # OSMANYA DIG ZERO..OSMANYA DIG NINE
2582 104AA..107FF; UNASSIGNED # ..
2583 10800..10805; PVALID # CYPRIOT SYL A..CYPRIOT SYL JA
2584 10806..10807; UNASSIGNED # ..
2585 10808 ; PVALID # CYPRIOT SYL JO
2586 10809 ; UNASSIGNED #
2587 1080A..10835; PVALID # CYPRIOT SYL KA..CYPRIOT SYL WO
2588 10836 ; UNASSIGNED #
2589 10837..10838; PVALID # CYPRIOT SYL XA..CYPRIOT SYL XE
2590 10839..1083B; UNASSIGNED # ..
2591 1083C ; PVALID # CYPRIOT SYL ZA
2592 1083D..1083E; UNASSIGNED # ..
2593 1083F..10855; PVALID # CYPRIOT SYL ZO..IMP ARAM LET TAW
2594 10856 ; UNASSIGNED #
2595 10857..1085F; FREE_PVAL # IMP ARAM SECT SIGN..IMP ARAM
2596 10860..108FF; UNASSIGNED # ..
2597 10900..10915; PVALID # PHOEN LET ALF..PHOEN LET TAU
2598 10916..1091B; FREE_PVAL # PHOEN NUM ONE..PHOEN NUM THR
2599 1091C..1091E; UNASSIGNED # ..
2600 1091F ; FREE_PVAL # PHOEN WORD SEP
2601 10920..10939; PVALID # LYDIAN LET A..LYDIAN LET C
2602 1093A..1093E; UNASSIGNED # ..
2603 1093F ; FREE_PVAL # LYDIAN TRIANGULAR MARK
2604 10940..1097F; UNASSIGNED # ..
2605 10980..109B7; PVALID # MERO HIER LET A..MERO CURS LET
2606 109B8..109BD; UNASSIGNED # ..
2607 109BE..109BF; PVALID # MERO CURS LOG RMT..MERO CURS L
2608 109C0..109FF; UNASSIGNED # ..
2609 10A00..10A03; PVALID # KHARO LET A..KHARO VOW SIGN V
2610 10A04 ; UNASSIGNED #
2611 10A05..10A06; PVALID # KHARO VOW SIGN E..KHARO VOW SI
2612 10A07..10A0B; UNASSIGNED # ..
2613 10A0C..10A13; PVALID # KHARO VOW LEN MARK..KHARO LET
2614 10A14 ; UNASSIGNED #
2615 10A15..10A17; PVALID # KHARO LET CA..KHARO LET JA
2616 10A18 ; UNASSIGNED #
2617 10A19..10A33; PVALID # KHARO LET NYA..KHARO LET TTT
2618 10A34..10A37; UNASSIGNED # ..
2619 10A38..10A3A; PVALID # KHARO SIGN BAR ABOVE..KHARO SIGN D
2620 10A3B..10A3E; UNASSIGNED # ..
2621 10A3F ; PVALID # KHARO VIRAMA
2622 10A40..10A47; FREE_PVAL # KHARO DIG ONE..KHARO NUM ONE
2623 10A48..10A4F; UNASSIGNED # ..
2624 10A50..10A58; FREE_PVAL # KHARO PUNCT DOT..KHARO PUNCT
2625 10A59..10A5F; UNASSIGNED # ..
2626 10A60..10A7C; PVALID # OLD S ARAB LET HE..OLD SOUTH ARAB
2627 10A7D..10A7F; FREE_PVAL # OLD S ARAB NUM ONE..OLD SOUTH ARAB
2628 10A80..10AFF; UNASSIGNED # ..
2629 10B00..10B35; PVALID # AVESTAN LET A..AVESTAN LET HE
2630 10B36..10B38; UNASSIGNED # ..
2631 10B39..10B3F; FREE_PVAL # AVESTAN ABBR MARK..LARGE ONE RING O
2632 10B40..10B55; PVALID # INSCRIPT PARTHIAN LET ALEPH..INSCRI
2633 10B56..10B57; UNASSIGNED # ..
2634 10B58..10B5F; FREE_PVAL # INSCRIPT PARTHIAN NUM ONE..INSCRIPT
2635 10B60..10B72; PVALID # INSCRIPT PAHLAVI LET ALEPH..INSCRIP
2636 10B73..10B77; UNASSIGNED # ..
2637 10B78..10B7F; FREE_PVAL # INSCRIPT PAHLAVI NUM ONE..INSCRIPT
2638 10B80..10BFF; UNASSIGNED # ..
2639 10C00..10C48; PVALID # OLD TURK LET ORKHON A..OLD TURK LET
2640 10C49..10E5F; UNASSIGNED # ..
2641 10E60..10E7E; FREE_PVAL # RUMI DIG ONE..RUMI FRACTION TWO THI
2642 10E7F..10FFF; UNASSIGNED # ..
2643 11000..11046; PVALID # BRAHMI SIGN CANDRABINDU..BRAHMI VIR
2644 11047..1104D; FREE_PVAL # BRAHMI DANDA..BRAHMI PUNCT LOTUS
2645 1104E..11051; UNASSIGNED # ..
2646 11052..11065; FREE_PVAL # BRAHMI NUM ONE..BRAHMI NUM ONE THOU
2647 11066..1106F; PVALID # BRAHMI DIG ZERO..BRAHMI DIG NINE
2648 11070..1107F; UNASSIGNED # ..
2649 11080..110BA; PVALID # KAITHI SIGN CANDRABINDU..KAITHI SIG
2650 110BB..110BC; FREE_PVAL # KAITHI ABBR SIGN..KAITHI ENUM SIGN
2651 110BD ; DISALLOWED # KAITHI NUM SIGN
2652 110BE..110C1; FREE_PVAL # KAITHI SECT MARK..KAITHI DOUBLE DAN
2653 110C2..110CF; UNASSIGNED # ..
2654 110D0..110E8; PVALID # SORA SOMPENG LETTER SAH..SORA SOMPE
2655 110E9..110EF; UNASSIGNED # ..
2656 110F0..110F9; PVALID # SORA SOMPENG DIG ZERO..SORA SOMPENG DI
2657 110FA..110FF; UNASSIGNED # ..
2658 11100..11134; PVALID # CHAKMA SIGN CANDRABINDU..CHAKMA MAAYY
2659 11135 ; UNASSIGNED #
2660 11136..1113F; PVALID # CHAKMA DIG ZERO..CHAKMA DIG NINE
2661 11140..11143; FREE_PVAL # CHAKMA SECT MARK..CHAKMA QUEST MARK
2662 11144..1117F; UNASSIGNED # ..
2663 11180..111C4; PVALID # SHARADA SIGN CANDRABINDU..SHARADA OM
2664 111C5..111C8; FREE_PVAL # SHARADA DANDA..SHARADA SEPARATOR
2665 111C9..111CF; UNASSIGNED # ..
2666 111D0..111D9; PVALID # SHARADA DIG ZERO..SHARADA DIG NINE
2667 111DA..1167F; UNASSIGNED # ..
2668 11680..116B7; PVALID # TAKRI LET A..TAKRI SIGN NUKTA
2669 116B8..116BF; UNASSIGNED # ..
2670 116C0..116C9; PVALID # TAKRI DIGIT ZERO..TAKRI DIG NINE
2671 116CA..11FFF; UNASSIGNED # ..
2672 12000..1236E; PVALID # CUNEI SIGN A..CUNEI SIGN ZUM
2673 1236F..123FF; UNASSIGNED # ..
2674 12400..12462; FREE_PVAL # CUNEI NUM SIGN TWO ASH..CUNEI NUM
2675 12463..1246F; UNASSIGNED # ..
2676 12470..12473; FREE_PVAL # CUNEI PUNCT SIGN OLD ASSYRIAN WORD
2677 12474..12FFF; UNASSIGNED # ..
2678 13000..1342E; PVALID # EGYPT HIERO A001..EGYPT HIERO AA032
2679 1342F..167FF; UNASSIGNED # ..
2680 16800..16A38; PVALID # BAMUM LET PHASE-A NGKUE MFON..BAMUM LE
2681 16A39..16EFF; UNASSIGNED # ..
2682 16F00..16F44; PVALID # MIAO LET PA..MIAO LET HHA
2683 16F45..16F4F; UNASSIGNED # ..
2684 16F50..16F7E; PVALID # MIAO LET NAS..MIAO VOWEL SIGN NG
2685 16F7F..16F8E; UNASSIGNED # ..
2686 16F8F..16F9F; PVALID # MIAO TONE RIGHT..MIAO LET REF TON
2687 16FA0..1AFFF; UNASSIGNED # ..
2688 1B000..1B001; PVALID # KATA LET ARCH E..KATA LET ARCH YE
2689 1B002..1CFFF; UNASSIGNED # ..
2690 1D000..1D0F5; FREE_PVAL # BYZ MUS SYM PSILI..BYZ MUS
2691 1D0F6..1D0FF; UNASSIGNED # ..
2692 1D100..1D126; FREE_PVAL # MUS SYM SINGLE BARLINE..MUS SYMBOL
2693 1D127..1D128; UNASSIGNED # ..
2694 1D129..1D164; FREE_PVAL # MUS SYM MULT MEASURE REST..MUS SYM ONE
2695 1D165..1D169; PVALID # MUS SYM COMB STEM..MUS SYM COMB TREMOL
2696 1D16A..1D16C; FREE_PVAL # MUS SYM FING TREM-1..MUS SYM FING TREM
2697 1D16D..1D172; PVALID # MUS SYM COMB AUG DOT..MUS SYM COMB FL
2698 1D173..1D17A; DISALLOWED # MUS SYM BEGIN BEAM..MUS SYM END PHRASE
2699 1D17B..1D182; PVALID # MUS SYM COMB ACCENT..MUS SYM COMB LOUR
2700 1D183..1D184; FREE_PVAL # MUS SYM ARP UP..MUS SYM ARP DOWN
2701 1D185..1D18B; PVALID # MUS SYM COMB DOIT..MUS SYM COMB TRIPLE
2702 1D18C..1D1A9; FREE_PVAL # MUS SYM RINFORZANDO..MUS SYM DEG SLASH
2703 1D1AA..1D1AD; PVALID # MUS SYM COMB DOWN BOW..MUS SYM COMB SN
2704 1D1AE..1D1DD; FREE_PVAL # MUS SYM PEDAL MARK..MUS SYM PES SUBPUN
2705 1D1DE..1D1FF; UNASSIGNED # ..
2706 1D200..1D241; FREE_PVAL # GREEK VOCAL NOTATION SYM-1..GREEK INS
2707 1D242..1D244; FREE_PVAL # COMB GREEK MUS TRISEME..COMB GREEK MU
2708 1D245 ; FREE_PVAL # GREEK MUSICAL LEIMMA
2709 1D246..1D2FF; UNASSIGNED # ..
2710 1D300..1D356; DISALLOWED # MONOG FOR EARTH..TETRAG FOR FOSTERING
2711 1D357..1D35F; UNASSIGNED # ..
2712 1D360..1D371; DISALLOWED # COUNT ROD UNIT DIG ONE..COUNT ROD TE
2713 1D372..1D3FF; UNASSIGNED # ..
2714 1D400..1D454; FREE_PVAL # MATH BOLD CAP A..MATH IT
2715 1D455 ; UNASSIGNED #
2716 1D456..1D49C; FREE_PVAL # MATH ITAL SM I..MATH SC
2717 1D49D ; UNASSIGNED #
2718 1D49E..1D49F; FREE_PVAL # MATH SCRIPT CAP C..MATH
2719 1D4A0..1D4A1; UNASSIGNED # ..
2720 1D4A2 ; FREE_PVAL # MATH SCRIPT CAP G
2721 1D4A3..1D4A4; UNASSIGNED # ..
2722 1D4A5..1D4A6; FREE_PVAL # MATH SCRIPT CAP J..MATH
2723 1D4A7..1D4A8; UNASSIGNED # ..
2724 1D4A9..1D4AC; FREE_PVAL # MATH SCRIPT CAP N..MATH
2725 1D4AD ; UNASSIGNED #
2726 1D4AE..1D4B9; FREE_PVAL # MATH SCRIPT CAP S..MATH
2727 1D4BA ; UNASSIGNED #
2728 1D4BB ; FREE_PVAL # MATH SCRIPT SM F
2729 1D4BC ; UNASSIGNED #
2730 1D4BD..1D4C3; FREE_PVAL # MATH SCRIPT SM H..MATH SC
2731 1D4C4 ; UNASSIGNED #
2732 1D4C5..1D505; FREE_PVAL # MATH SCRIPT SM P..MATH FR
2733 1D506 ; UNASSIGNED #
2734 1D507..1D50A; FREE_PVAL # MATH FRAKTUR CAP D..MATH
2735 1D50B..1D50C; UNASSIGNED # ..
2736 1D50D..1D514; FREE_PVAL # MATH FRAKTUR CAP J..MATH
2737 1D515 ; UNASSIGNED #
2738 1D516..1D51C; FREE_PVAL # MATH FRAKTUR CAP S..MATH
2739 1D51D ; UNASSIGNED #
2740 1D51E..1D539; FREE_PVAL # MATH FRAKTUR SM A..MATH D
2741 1D53A ; UNASSIGNED #
2742 1D53B..1D53E; FREE_PVAL # MATH DOUBLE-STRUCK CAP D..MATHEM
2743 1D53F ; UNASSIGNED #
2744 1D540..1D544; FREE_PVAL # MATH DOUBLE-STRUCK CAP I..MATHEM
2745 1D545 ; UNASSIGNED #
2746 1D546 ; FREE_PVAL # MATH DOUBLE-STRUCK CAP O
2747 1D547..1D549; UNASSIGNED # ..
2748 1D54A..1D550; FREE_PVAL # MATH DOUBLE-STRUCK CAP S..MATHEM
2749 1D551 ; UNASSIGNED #
2750 1D552..1D6A5; FREE_PVAL # MATH DOUBLE-STRUCK SM A..MATHEMAT
2751 1D6A6..1D6A7; UNASSIGNED # ..
2752 1D6A8..1D7CB; FREE_PVAL # MATH BOLD CAP ALPHA..MATHEMATICA
2753 1D7CC..1D7CD; UNASSIGNED # ..
2754 1D7CE..1D7FF; FREE_PVAL # MATH BOLD DIG ZERO..MATH M
2755 1D800..1EDFF; UNASSIGNED # ..
2756 1EE00..1EE03; FREE_PVAL # ARAB MATH ALEF..ARAB MATH DAL
2757 1EE04 ; UNASSIGNED #
2758 1EE05..1EE1F; FREE_PVAL # ARAB MATH WAW..ARAB MATH DOTLESS QAF
2759 1EE20 ; UNASSIGNED #
2760 1EE21..1EE22; FREE_PVAL # ARAB MATH INIT BEH..ARAB MATH INIT JEE
2761 1EE23 ; UNASSIGNED #
2762 1EE24 ; FREE_PVAL # ARAB MATH INIT HEH
2763 1EE25..1EE26; UNASSIGNED # ..
   1EE27       ; FREE_PVAL   # ARAB MATH INIT HAH
   1EE28       ; UNASSIGNED  #
   1EE29..1EE32; FREE_PVAL   # ARAB MATH INIT YEH..ARAB MATH INIT QAF
   1EE33       ; UNASSIGNED  #
   1EE34..1EE37; FREE_PVAL   # ARAB MATH INIT SHEEN..ARAB MATH INITIA
   1EE38       ; UNASSIGNED  #
   1EE39       ; FREE_PVAL   # ARAB MATH INIT DAD
   1EE3A       ; UNASSIGNED  #
   1EE3B       ; FREE_PVAL   # ARAB MATH INIT GHAIN
   1EE3C..1EE41; UNASSIGNED  # ..
   1EE42       ; FREE_PVAL   # ARAB MATH TAILED JEEM
   1EE43..1EE46; UNASSIGNED  # ..
   1EE47       ; FREE_PVAL   # ARAB MATH TAILED HAH
   1EE48       ; UNASSIGNED  #
   1EE49       ; FREE_PVAL   # ARAB MATH TAILED YEH
   1EE4A       ; UNASSIGNED  #
   1EE4B       ; FREE_PVAL   # ARAB MATH TAILED LAM
   1EE4C       ; UNASSIGNED  #
   1EE4D..1EE4F; FREE_PVAL   # ARAB MATH TAILED NOON..ARAB MATH TAILE
   1EE50       ; UNASSIGNED  #
   1EE51..1EE52; FREE_PVAL   # ARAB MATH TAILED SAD..ARAB MATH TAILED
   1EE53       ; UNASSIGNED  #
   1EE54       ; FREE_PVAL   # ARAB MATH TAILED SHEEN
   1EE55..1EE56; UNASSIGNED  # ..
   1EE57       ; FREE_PVAL   # ARAB MATH TAILED KHAH
   1EE58       ; UNASSIGNED  #
   1EE59       ; FREE_PVAL   # ARAB MATH TAILED DAD
   1EE5A       ; UNASSIGNED  #
   1EE5B       ; FREE_PVAL   # ARAB MATH TAILED GHAIN
   1EE5C       ; UNASSIGNED  #
   1EE5D       ; FREE_PVAL   # ARAB MATH TAILED DOTLESS NOON
   1EE5E       ; UNASSIGNED  #
   1EE5F       ; FREE_PVAL   # ARAB MATH TAILED DOTLESS QAF
   1EE60       ; UNASSIGNED  #
   1EE61..1EE62; FREE_PVAL   # ARAB MATH STRETCHED BEH..ARAB MATH STR
   1EE63       ; UNASSIGNED  #
   1EE64       ; FREE_PVAL   # ARAB MATH STRETCHED HEH
   1EE65..1EE66; UNASSIGNED  # ..
   1EE67..1EE6A; FREE_PVAL   # ARAB MATH STRETCHED HAH..ARAB MATH STR
   1EE6B       ; UNASSIGNED  #
   1EE6C..1EE72; FREE_PVAL   # ARAB MATH STRETCHED MEEM..ARAB MATH ST
   1EE73       ; UNASSIGNED  #
   1EE74..1EE77; FREE_PVAL   # ARAB MATH STRETCHED SHEEN..ARAB MATH S
   1EE78       ; UNASSIGNED  #
   1EE79..1EE7C; FREE_PVAL   # ARAB MATH STRETCHED DAD..ARAB MATH STR
   1EE7D       ; UNASSIGNED  #
   1EE7E       ; FREE_PVAL   # ARAB MATH STRETCHED DOTLESS FEH
   1EE7F       ; UNASSIGNED  #
   1EE80..1EE89; FREE_PVAL   # ARAB MATH LOOPED ALEF..ARAB MATH LOOPE
   1EE8A       ; UNASSIGNED  #
   1EE8B..1EE9B; FREE_PVAL   # ARAB MATH LOOPED LAM..ARAB MATH LOOPED
   1EE9C..1EEA0; UNASSIGNED  # ..
   1EEA1..1EEA3; FREE_PVAL   # ARAB MATH DOUBLE-STRUCK BEH..ARAB MATH
   1EEA4       ; UNASSIGNED  #
   1EEA5..1EEA9; FREE_PVAL   # ARAB MATH DOUBLE-STRUCK WAW..ARAB MATH
   1EEAA       ; UNASSIGNED  #
   1EEAB..1EEBB; FREE_PVAL   # ARAB MATH DOUBLE-STRUCK LAM..ARAB MATH
   1EEBC..1EEEF; UNASSIGNED  # ..
   1EEF0..1EEF1; FREE_PVAL   # ARAB MATH OP MEEM W HAH W TATWEEL..AR
   1EEF2..1EFFF; UNASSIGNED  # ..
   1F000..1F02B; FREE_PVAL   # MAHJONG TILE EAST WIND..MAHJONG TILE B
   1F02C..1F02F; UNASSIGNED  # ..
   1F030..1F093; FREE_PVAL   # DOMINO TILE HORIZ BACK..DOMINO TILE VE
   1F094..1F09F; UNASSIGNED  # ..
   1F0A0..1F0AE; FREE_PVAL   # PLAY CARD BACK..PLAY CARD KING OF SPAD
   1F0AF..1F0B0; UNASSIGNED  # ..
   1F0B1..1F0BE; FREE_PVAL   # PLAY CARD ACE OF HEARTS..PLAY CARD KIN
   1F0BF..1F0C0; UNASSIGNED  # ..
   1F0C1..1F0CF; FREE_PVAL   # PLAY CARD ACE OF DIAMONDS..PLAY CARD B
   1F0D0       ; UNASSIGNED  #
   1F0D1..1F0DF; FREE_PVAL   # PLAY CARD ACE OF CLUBS..PLAY CARD WHIT
   1F0E0..1F0FF; UNASSIGNED  # ..
   1F100..1F10A; FREE_PVAL   # DIG ZERO FULL STOP..DIG NINE COMMA
   1F10B..1F10F; UNASSIGNED  # ..
   1F110..1F12E; FREE_PVAL   # PARENTHESIZED LAT CAP LET A..CIRCLE
   1F12F       ; UNASSIGNED  #
   1F130..1F16B; FREE_PVAL   # SQUARED LAT CAP LET A..RAISED MD SIGN
   1F16C..1F16F; UNASSIGNED  # ..
   1F170..1F19A; FREE_PVAL   # NEG SQ LAT CAP LET A..SQUARED VS
   1F19B..1F1E5; UNASSIGNED  # ..
   1F1E6..1F202; FREE_PVAL   # REG IND SYMB LET A..SQ KATAKANA SA
   1F203..1F20F; UNASSIGNED  # ..
   1F210..1F23A; FREE_PVAL   # SQ CJK UNIF IDEO-624B..SQ CJK UNIF IDE
   1F23B..1F23F; UNASSIGNED  # ..
   1F240..1F248; FREE_PVAL   # TORT SH BRACK CJK UNIF IDEO-672C..TORT
   1F249..1F24F; UNASSIGNED  # ..
   1F250..1F251; FREE_PVAL   # CIRC IDEO ADVANTAGE..CIRC IDEO ACCEPT
   1F252..1F2FF; UNASSIGNED  # ..
   1F300..1F320; FREE_PVAL   # CYCLONE..SHOOTING STAR
   1F321..1F32F; UNASSIGNED  # ..
   1F330..1F335; FREE_PVAL   # CHESTNUT..CACTUS
   1F336       ; UNASSIGNED  #
   1F337..1F37C; FREE_PVAL   # TULIP..BABY BOTTLE
   1F37D..1F37F; UNASSIGNED  # ..
   1F380..1F393; FREE_PVAL   # RIBBON..GRADUATION CAP
   1F394..1F39F; UNASSIGNED  # ..
   1F3A0..1F3C4; FREE_PVAL   # CAROUSEL HORSE..SURFER
   1F3C5       ; UNASSIGNED  #
   1F3C6..1F3CA; FREE_PVAL   # TROPHY..SWIMMER
   1F3CB..1F3DF; UNASSIGNED  # ..
   1F3E0..1F3F0; FREE_PVAL   # HOUSE BUILDING..EUROPEAN CASTLE
   1F3F1..1F3FF; UNASSIGNED  # ..
   1F400..1F43E; FREE_PVAL   # RAT..PAW PRINTS
   1F43F       ; UNASSIGNED  #
   1F440       ; FREE_PVAL   # EYES
   1F441       ; UNASSIGNED  #
   1F442..1F4F7; FREE_PVAL   # EAR..CAMERA
   1F4F8       ; UNASSIGNED  #
   1F4F9..1F4FC; FREE_PVAL   # VIDEO CAMERA..VIDEOCASSETTE
   1F4FD..1F4FF; UNASSIGNED  # ..
   1F500..1F53D; FREE_PVAL   # TWISTED RIGHTWARDS ARROWS..DOWN-POINTI
   1F53E..1F53F; UNASSIGNED  # ..
   1F540..1F543; FREE_PVAL   # CIRCLED CROSS POMMEE..NOTCHED LEFT SEM
   1F544..1F54F; UNASSIGNED  # ..
   1F550..1F567; FREE_PVAL   # CLOCK FACE ONE OCLOCK..CLOCK FACE TWEL
   1F568..1F5FA; UNASSIGNED  # ..
   1F5FB..1F640; FREE_PVAL   # MOUNT FUJI..WEARY CAT FACE
   1F641..1F644; UNASSIGNED  # ..
   1F645..1F64F; FREE_PVAL   # FACE W NO GOOD GESTURE..PERSON W FO
   1F650..1F67F; UNASSIGNED  # ..
   1F680..1F6C5; FREE_PVAL   # ROCKET..LEFT LUGGAGE
   1F6C6..1F6FF; UNASSIGNED  # ..
   1F700..1F773; FREE_PVAL   # ALCHEMICAL SYMBOL FOR QUINTESSENCE..AL
   1F774..1FFFD; UNASSIGNED  # ..
   1FFFE..1FFFF; DISALLOWED  # ..
   20000..2A6D6; PVALID      #
   2A6D7..2A6FF; UNASSIGNED  # ..
   2A700..2B734; PVALID      #
   2B735..2B73F; UNASSIGNED  # ..
   2B740..2B81D; PVALID      #
   2B81E..2F7FF; UNASSIGNED  # ..
   2F800..2FA1D; PVALID      # CJK COMP IDEO-2F800..CJK COMPA
   2FA1E..2FFFD; UNASSIGNED  # ..
   2FFFE..2FFFF; DISALLOWED  # ..
   30000..3FFFD; UNASSIGNED  # ..
   3FFFE..3FFFF; DISALLOWED  # ..
   40000..4FFFD; UNASSIGNED  # ..
   4FFFE..4FFFF; DISALLOWED  # ..
   50000..5FFFD; UNASSIGNED  # ..
   5FFFE..5FFFF; DISALLOWED  # ..
   60000..6FFFD; UNASSIGNED  # ..
   6FFFE..6FFFF; DISALLOWED  # ..
   70000..7FFFD; UNASSIGNED  # ..
   7FFFE..7FFFF; DISALLOWED  # ..
   80000..8FFFD; UNASSIGNED  # ..
   8FFFE..8FFFF; DISALLOWED  # ..
   90000..9FFFD; UNASSIGNED  # ..
   9FFFE..9FFFF; DISALLOWED  # ..
   A0000..AFFFD; UNASSIGNED  # ..
   AFFFE..AFFFF; DISALLOWED  # ..
   B0000..BFFFD; UNASSIGNED  # ..
   BFFFE..BFFFF; DISALLOWED  # ..
   C0000..CFFFD; UNASSIGNED  # ..
   CFFFE..CFFFF; DISALLOWED  # ..
   D0000..DFFFD; UNASSIGNED  # ..
   DFFFE..DFFFF; DISALLOWED  # ..
   E0000       ; UNASSIGNED  #
   E0001       ; DISALLOWED  # LANGUAGE TAG
   E0002..E001F; UNASSIGNED  # ..
   E0020..E007F; DISALLOWED  # TAG SPACE..CANCEL TAG
   E0080..E00FF; UNASSIGNED  # ..
   E0100..E01EF; DISALLOWED  # VAR SEL-17..VAR SEL-256
   E01F0..EFFFD; UNASSIGNED  # ..
   EFFFE..10FFFF; DISALLOWED # ..

Appendix B.  Acknowledgements

   The authors would like to acknowledge the comments and contributions
   of the following individuals: David Black, Mark Davis, Alan DeKok,
   Martin Duerst, Patrik Faltstrom, Ted Hardie, Joe Hildebrand, Bjoern
   Hoehrmann, Paul Hoffman, Jeffrey Hutzelman, Simon Josefsson, John
   Klensin, Alexey Melnikov, Takahiro Nemoto, Yoav Nir, Mike Parker,
   Pete Resnick, Andrew Sullivan, Dave Thaler, Yoshiro Yoneya, and
   Florian Zeitz.

   Some algorithms and textual descriptions have been borrowed from
   [RFC5892].  Some text regarding security has been borrowed from
   [RFC5890] and [I-D.ietf-xmpp-6122bis].

Authors' Addresses

   Peter Saint-Andre
   Cisco Systems, Inc.
   1899 Wynkoop Street, Suite 600
   Denver, CO  80202
   USA

   Phone: +1-303-308-3282
   Email: psaintan@cisco.com

   Marc Blanchet
   Viagenie
   246 Aberdeen
   Quebec, QC  G1R 2E1
   Canada

   Email: Marc.Blanchet@viagenie.ca
   URI:   http://www.viagenie.ca/