PRECIS                                                    P. Saint-Andre
Internet-Draft                                                      &yet
Obsoletes: 3454 (if approved)                                M. Blanchet
Intended status: Standards Track                                Viagenie
Expires: October 23, 2014                                 April 21, 2014


    PRECIS Framework: Preparation and Comparison of Internationalized
                   Strings in Application Protocols
                     draft-ietf-precis-framework-16

Abstract

   Application protocols using Unicode characters in protocol strings
   need to properly prepare such strings in order to perform valid
   comparison operations (e.g., for purposes of authentication or
   authorization).  This document defines a framework enabling
   application protocols to perform the preparation and comparison of
   internationalized strings ("PRECIS") in a way that depends on the
   properties of Unicode characters and thus is agile with respect to
   versions of Unicode.
   As a result, this framework provides a more
   sustainable approach to the handling of internationalized strings
   than the previous framework, known as Stringprep (RFC 3454).  This
   document obsoletes RFC 3454.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on October 23, 2014.

Copyright Notice

   Copyright (c) 2014 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  String Classes
     3.1.  Overview
     3.2.  IdentifierClass
     3.3.  FreeformClass
   4.  Profiles
     4.1.  Principles
     4.2.  Building Application-Layer Constructs
     4.3.  A Note about Spaces
   5.  Order of Operations
   6.  Code Point Properties
   7.  Category Definitions Used to Calculate Derived Property
     7.1.  LetterDigits (A)
     7.2.  Unstable (B)
     7.3.  IgnorableProperties (C)
     7.4.  IgnorableBlocks (D)
     7.5.  LDH (E)
     7.6.  Exceptions (F)
     7.7.  BackwardCompatible (G)
     7.8.  JoinControl (H)
     7.9.  OldHangulJamo (I)
     7.10. Unassigned (J)
     7.11. ASCII7 (K)
     7.12. Controls (L)
     7.13. PrecisIgnorableProperties (M)
     7.14. Spaces (N)
     7.15. Symbols (O)
     7.16. Punctuation (P)
     7.17. HasCompat (Q)
     7.18. OtherLetterDigits (R)
   8.  Calculation of the Derived Property
   9.  IANA Considerations
     9.1.  PRECIS Derived Property Value Registry
     9.2.  PRECIS Base Classes Registry
     9.3.  PRECIS Profiles Registry
   10. Security Considerations
     10.1.  General Issues
     10.2.  Use of the IdentifierClass
     10.3.  Use of the FreeformClass
     10.4.  Local Character Set Issues
     10.5.  Visually Similar Characters
     10.6.  Security of Passwords
   11. Interoperability Considerations
   12. References
     12.1.  Normative References
     12.2.  Informative References
     12.3.  URIs
   Appendix A.  Codepoint Table
   Appendix B.  Acknowledgements
   Authors' Addresses

1.  Introduction

   As described in the problem statement for the preparation and
   comparison of internationalized strings ("PRECIS") [RFC6885], many
   IETF protocols have used the Stringprep framework [RFC3454] as the
   basis for preparing and comparing protocol strings that contain
   Unicode characters [UNICODE] outside the ASCII range [RFC20].  The
   Stringprep framework was developed during work on the original
   technology for internationalized domain names (IDNs), here called
   "IDNA2003" [RFC3490], and Nameprep [RFC3491] was the Stringprep
   profile for IDNs.
   At the time, Stringprep was designed as a general
   framework so that other application protocols could define their own
   Stringprep profiles for the preparation and comparison of strings and
   identifiers.  Indeed, a number of application protocols defined such
   profiles.

   After the publication of [RFC3454] in 2002, several significant
   issues arose with the use of Stringprep in the IDN case, as
   documented in the IAB's recommendations regarding IDNs [RFC4690]
   (most significantly, Stringprep was tied to Unicode version 3.2).
   Therefore, the newer IDNA specifications, here called "IDNA2008"
   ([RFC5890], [RFC5891], [RFC5892], [RFC5893], [RFC5894]), no longer
   use Stringprep and Nameprep.  This migration away from Stringprep for
   IDNs has prompted other "customers" of Stringprep to consider new
   approaches to the preparation and comparison of internationalized
   strings, as described in [RFC6885].

   This document defines a framework for a post-Stringprep approach to
   the preparation and comparison of internationalized strings in
   application protocols, based on several principles:

   1.  Define a small set of string classes that specify the Unicode
       characters (i.e., specific "code points") appropriate for common
       application protocol constructs.

   2.  Define each PRECIS string class in terms of Unicode code points
       and their properties so that an algorithm can be used to
       determine whether each code point or character category is (a)
       valid, (b) allowed in certain contexts, (c) disallowed, or (d)
       unassigned.

   3.  Use an "inclusion model" such that a string class consists only
       of code points that are explicitly allowed, with the result that
       any code point not explicitly allowed is forbidden.

   4.  Enable application protocols to define profiles of the PRECIS
       string classes, addressing matters such as width mapping, case
       folding and other forms of character mapping, Unicode
       normalization, directionality, and further excluded code points
       or character categories.

   Whereas the string classes define the "baseline" code points for a
   range of applications, profiling enables application protocols to
   further restrict the allowable code points beyond those specified for
   the relevant string class (e.g., characters with special or reserved
   meaning, such as "@" and "/" when used as separators within
   identifiers) and to apply the string classes in ways that are
   appropriate for constructs such as usernames and passwords
   [I-D.ietf-precis-saslprepbis], nicknames [I-D.ietf-precis-nickname],
   the localparts of instant messaging addresses
   [I-D.ietf-xmpp-6122bis], and free-form strings
   [I-D.ietf-xmpp-6122bis].  Profiles are responsible for defining the
   handling of right-to-left characters as well as various mapping
   operations of the kind also discussed for IDNs in [RFC5895], such as
   case preservation or lowercasing, Unicode normalization, mapping of
   certain characters to other characters or to nothing, and mapping of
   full-width and half-width characters.

   When an application applies a profile of a PRECIS string class, it
   can achieve the following objectives:

   a.  Determine if a given string conforms to the profile (e.g., to
       determine if it is allowed for use in the relevant "slot"
       specified by an application protocol).

   b.  Determine if any two given strings are equivalent (e.g., to make
       an access decision for purposes of authentication or
       authorization as further described in [RFC6943]).

   It is expected that this framework will yield the following benefits:

   o  Application protocols will be agile with regard to Unicode
      versions.

   o  Implementers will be able to share code point tables and software
      code across application protocols, most likely by means of
      software libraries.

   o  End users will be able to acquire more accurate expectations about
      the characters that are acceptable in various contexts.  Given
      this more uniform set of string classes, it is also expected that
      copy/paste operations between software implementing different
      application protocols will be more predictable and coherent.

   Although this framework is similar to IDNA2008 and borrows some of
   the character categories defined in [RFC5892], it defines additional
   character categories to meet the needs of common application
   protocols.

   The character categories and calculation rules defined under
   Section 7 and Section 8 are normative and apply to all Unicode code
   points.  The code point table provided under Appendix A is
   non-normative and merely shows, for illustrative purposes, the
   consequences of the character categories and calculation rules, as
   well as the resulting property values.

2.  Terminology

   Many important terms used in this document are defined in [RFC5890],
   [RFC6365], [RFC6885], and [UNICODE].  The terms "left-to-right" (LTR)
   and "right-to-left" (RTL) are defined in Unicode Standard Annex #9
   [UAX9].

   As of the date of writing, the version of Unicode published by the
   Unicode Consortium is 6.3; however, PRECIS is not tied to a specific
   version of Unicode.

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in
   [RFC2119].

3.  String Classes

3.1.  Overview

   Starting in 2010, various "customers" of Stringprep began to discuss
   the need to define a post-Stringprep approach to the preparation and
   comparison of internationalized strings other than IDNs.
   This community analyzed the existing Stringprep profiles and also
   weighed the costs and benefits of defining a relatively small set of
   Unicode characters that would minimize the potential for user
   confusion caused by visually similar characters (and thus be
   relatively "safe") vs. defining a much larger set of Unicode
   characters that would maximize the potential for user creativity (and
   thus be relatively "expressive").  As a result, the community
   concluded that most existing uses could be addressed by two string
   classes:

   IdentifierClass:  a sequence of letters, numbers, and some symbols
      that is used to identify or address a network entity such as a
      user account, a venue (e.g., a chatroom), an information source
      (e.g., a data feed), or a collection of data (e.g., a file); the
      intent is that this class will minimize user confusion in a wide
      variety of application protocols, with the result that safety has
      been prioritized over expressiveness for this class.

   FreeformClass:  a sequence of letters, numbers, symbols, spaces, and
      other characters that is used for free-form strings, including
      passwords as well as display elements such as human-friendly
      nicknames in chatrooms; the intent is that this class will allow
      nearly any Unicode character, with the result that expressiveness
      has been prioritized over safety for this class (e.g., protocol
      designers, application developers, service providers, and end
      users might not understand or be able to enter all of the
      characters that can be included in the FreeformClass).

   Future specifications might define additional PRECIS string classes,
   such as a class that falls somewhere between the IdentifierClass and
   the FreeformClass.  At this time, it is not clear how useful such a
   class would be.
   In any case, because application developers are able
   to define profiles of PRECIS string classes, a protocol needing a
   construct between the IdentifierClass and the FreeformClass could
   define a restricted profile of the FreeformClass if needed.

   The following subsections discuss the IdentifierClass and
   FreeformClass in more detail, with reference to the dimensions
   described in Section 3 of [RFC6885].  Each string class is defined by
   the following behavioral rules:

   Valid:  Defines which code points and character categories are
      treated as valid input to the string.

   Contextual Rule Required:  Defines which code points and character
      categories are treated as allowed only if the requirements of a
      contextual rule are met (i.e., either CONTEXTJ or CONTEXTO).

   Disallowed:  Defines which code points and character categories need
      to be excluded from the string.

   Unassigned:  Defines application behavior in the presence of code
      points that are unknown (i.e., not yet designated) for the version
      of Unicode used by the application.

   This document defines the valid, contextual rule required,
   disallowed, and unassigned rules for the IdentifierClass and
   FreeformClass.  As described under Section 4, profiles of these
   string classes are responsible for defining the width mapping,
   additional mappings, case mapping, normalization, directionality, and
   exclusion rules.

3.2.  IdentifierClass

   Most application technologies need strings that can be used to refer
   to, include, or communicate protocol strings like usernames, file
   names, data feed identifiers, and chatroom names.  We group such
   strings into a class called "IdentifierClass" having the following
   features.

3.2.1.  Valid

   o  Code points traditionally used as letters and numbers in writing
      systems, i.e., the LetterDigits ("A") category first defined in
      [RFC5892] and listed here under Section 7.1.

   o  Code points in the range U+0021 through U+007E, i.e., the
      (printable) ASCII7 ("K") rule defined under Section 7.11.  These
      code points are "grandfathered" into PRECIS and thus are valid
      even if they would otherwise be disallowed according to the
      property-based rules specified in the next section.

   Note: Although the PRECIS IdentifierClass re-uses the LetterDigits
   category from IDNA2008, the range of characters allowed in the
   IdentifierClass is wider than the range of characters allowed in
   IDNA2008.  The main reason is that IDNA2008 applies the Unstable
   category before the LetterDigits category, thus disallowing
   uppercase characters, whereas the IdentifierClass does not apply
   the Unstable category.

3.2.2.  Contextual Rule Required

   o  A number of characters from the Exceptions ("F") category defined
      under Section 7.6 (see Section 7.6 for a full list).

   o  Joining characters, i.e., the JoinControl ("H") category defined
      under Section 7.8.

3.2.3.  Disallowed

   o  Old Hangul Jamo characters, i.e., the OldHangulJamo ("I") category
      defined under Section 7.9.

   o  Control characters, i.e., the Controls ("L") category defined
      under Section 7.12.

   o  Ignorable characters, i.e., the PrecisIgnorableProperties ("M")
      category defined under Section 7.13.

   o  Space characters, i.e., the Spaces ("N") category defined under
      Section 7.14.

   o  Symbol characters, i.e., the Symbols ("O") category defined under
      Section 7.15.

   o  Punctuation characters, i.e., the Punctuation ("P") category
      defined under Section 7.16.

   o  Any character that has a compatibility equivalent, i.e., the
      HasCompat ("Q") category defined under Section 7.17.  These code
      points are disallowed even if they would otherwise be valid
      according to the property-based rules specified in the previous
      section.

   o  Letters and digits other than the "traditional" letters and digits
      allowed in IDNs, i.e., the OtherLetterDigits ("R") category
      defined under Section 7.18.

3.2.4.  Unassigned

   Any code points that are not yet designated in the Unicode character
   set are considered Unassigned for purposes of the IdentifierClass,
   and such code points are to be treated as Disallowed.

3.2.5.  Examples

   As described in the Introduction to this document, the string classes
   do not handle all issues related to string preparation and comparison
   (such as case mapping); instead, such issues are handled at the level
   of profiles.  Examples for two profiles of the IdentifierClass can be
   found in [I-D.ietf-precis-saslprepbis] (the UsernameIdentifierClass
   profile) and in [I-D.ietf-xmpp-6122bis] (the JIDlocalIdentifierClass
   profile).

3.3.  FreeformClass

   Some application technologies need strings that can be used in a
   free-form way, e.g., as a password in an authentication exchange (see
   [I-D.ietf-precis-saslprepbis]) or a nickname in a chatroom (see
   [I-D.ietf-precis-nickname]).  We group such things into a class
   called "FreeformClass" having the following features.

      Security Warning: Consult Section 10.6 for relevant security
      considerations when strings conforming to the FreeformClass, or a
      profile thereof, are used as passwords.

3.3.1.  Valid

   o  Traditional letters and numbers, i.e., the LetterDigits ("A")
      category first defined in [RFC5892] and listed here under
      Section 7.1.

   o  Letters and digits other than the "traditional" letters and digits
      allowed in IDNs, i.e., the OtherLetterDigits ("R") category
      defined under Section 7.18.

   o  Code points in the range U+0021 through U+007E, i.e., the
      (printable) ASCII7 ("K") rule defined under Section 7.11.

   o  Any character that has a compatibility equivalent, i.e., the
      HasCompat ("Q") category defined under Section 7.17.

   o  Space characters, i.e., the Spaces ("N") category defined under
      Section 7.14.

   o  Symbol characters, i.e., the Symbols ("O") category defined under
      Section 7.15.

   o  Punctuation characters, i.e., the Punctuation ("P") category
      defined under Section 7.16.

3.3.2.  Contextual Rule Required

   o  A number of characters from the Exceptions ("F") category defined
      under Section 7.6 (see Section 7.6 for a full list).

   o  Joining characters, i.e., the JoinControl ("H") category defined
      under Section 7.8.

3.3.3.  Disallowed

   o  Old Hangul Jamo characters, i.e., the OldHangulJamo ("I") category
      defined under Section 7.9.

   o  Control characters, i.e., the Controls ("L") category defined
      under Section 7.12.

   o  Ignorable characters, i.e., the PrecisIgnorableProperties ("M")
      category defined under Section 7.13.

3.3.4.  Unassigned

   Any code points that are not yet designated in the Unicode character
   set are considered Unassigned for purposes of the FreeformClass, and
   such code points are to be treated as Disallowed.

3.3.5.  Examples

   As described in the Introduction to this document, the string classes
   do not handle all issues related to string preparation and comparison
   (such as case mapping); instead, such issues are handled at the level
   of profiles.  Examples for two profiles of the FreeformClass can be
   found in [I-D.ietf-precis-nickname] (the NicknameFreeformClass
   profile) and in [I-D.ietf-xmpp-6122bis] (the
   JIDresourceIdentifierClass profile).

4.  Profiles

4.1.  Principles

   This framework document defines the valid, contextual-rule-required,
   disallowed, and unassigned rules for the IdentifierClass and the
   FreeformClass.  A profile of a PRECIS string class MUST define the
   width mapping, additional mappings (if any), case mapping,
   normalization, directionality, and exclusion rules.
   A profile MAY also restrict the allowable characters above and beyond
   the definition of the relevant PRECIS string class (but MUST NOT add
   as valid any code points or character categories that are disallowed
   by the relevant PRECIS string class).  These matters are discussed in
   the following subsections.

   Profiles of the PRECIS string classes are registered with the IANA as
   described under Section 9.3.  The naming convention for profile names
   is that they are of the form "ProfilenameBaseClass", where the
   "Profilename" string is a differentiator and "BaseClass" is the name
   of the PRECIS string class being profiled; for example, the profile
   of the IdentifierClass used for localparts of Jabber IDs in the
   Extensible Messaging and Presence Protocol (XMPP) is named
   "JIDlocalIdentifierClass" [I-D.ietf-xmpp-6122bis].

4.1.1.  Width Mapping

   The width mapping rule of a profile specifies whether width mapping
   is performed on fullwidth and halfwidth characters, and how the
   mapping is done.  Typically such mapping consists of mapping
   fullwidth and halfwidth characters, i.e., code points with a
   Decomposition Type of Wide or Narrow, to their decomposition
   mappings; as an example, FULLWIDTH DIGIT ZERO (U+FF10) would be
   mapped to DIGIT ZERO (U+0030).

   The normalization form specified by a profile (see below) has an
   impact on the need for width mapping.  Because width mapping is
   performed as a part of compatibility decomposition, a profile
   employing either normalization form KD (NFKD) or normalization form
   KC (NFKC) does not need to specify width mapping.
   However, if Unicode normalization form C (NFC) is used then the
   profile needs to specify whether to apply width mapping; in this
   case, width mapping is in general RECOMMENDED because allowing
   fullwidth and halfwidth characters to remain unmapped to their
   compatibility variants would violate the principle of least user
   surprise.  For more information about the concept of width in East
   Asian scripts within Unicode, see Unicode Standard Annex #11 [UAX11].

4.1.2.  Additional Mappings

   The additional mappings rule of a profile specifies whether
   additional mappings are to be applied, such as mapping of delimiter
   characters and mapping of special characters (e.g., non-ASCII space
   characters to ASCII space or certain characters to nothing).

4.1.3.  Case Mapping

   The case mapping rule of a profile specifies whether case mapping is
   performed (instead of case preservation) on uppercase and titlecase
   characters, and how the mapping is done (e.g., mapping uppercase and
   titlecase characters to their lowercase equivalents).

   If case mapping is desired (instead of case preservation), it is
   RECOMMENDED to use Unicode Default Case Folding as defined in Chapter
   3 of the Unicode Standard [UNICODE].

      Note: Unicode Default Case Folding is not designed to handle
      various localization issues (such as so-called "dotless i" in
      several Turkic languages).  The PRECIS mappings document
      [I-D.ietf-precis-mappings] describes these issues in greater
      detail and defines a "local case mapping" method that handles some
      locale-dependent and context-dependent mappings.
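
   As a rough illustration of the case mapping rule, the sketch below
   uses Python's built-in str.casefold() as a stand-in for Unicode
   Default Case Folding.  That choice is an assumption of this example
   rather than something the framework mandates, and the helper name
   fold_case is hypothetical.  The last comparison shows why default
   folding does not address the Turkic "dotless i" case mentioned in
   the note above.

```python
def fold_case(s: str) -> str:
    """Apply locale-independent Unicode case folding (assumed stand-in)."""
    return s.casefold()


# Uppercase and titlecase characters fold to a common lowercase form.
print(fold_case("ALICE") == fold_case("alice"))
# U+01C5 is a titlecase digraph; U+01C6 is its lowercase counterpart.
print(fold_case("\u01c5") == fold_case("\u01c6"))

# Default folding is locale-independent: LATIN SMALL LETTER DOTLESS I
# (U+0131) is left unchanged, so it does not compare equal to "I" or "i".
print(fold_case("\u0131") == fold_case("I"))
```

   A profile that needs Turkic-aware behavior would have to specify it
   separately, for example via the "local case mapping" method
   described in [I-D.ietf-precis-mappings].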

   In order to maximize entropy and minimize the potential for false
   positives, it is NOT RECOMMENDED for application protocols to map
   uppercase and titlecase code points to their lowercase equivalents
   when strings conforming to the FreeformClass, or a profile thereof,
   are used in passwords; instead, it is RECOMMENDED to preserve the
   case of all code points contained in such strings and then perform
   case-sensitive comparison.  See also the related discussion in
   [I-D.ietf-precis-saslprepbis].

4.1.4.  Normalization

   The normalization rule of a profile specifies which Unicode
   normalization form (D, KD, C, or KC) is to be applied (see Unicode
   Standard Annex #15 [UAX15] for background information).

   In accordance with [RFC5198], normalization form C (NFC) is
   RECOMMENDED.

4.1.5.  Directionality

   The directionality rule of a profile specifies which strings are to
   be considered left-to-right (LTR) and right-to-left (RTL), and the
   allowable sequences of characters in LTR and RTL strings (see Unicode
   Standard Annex #9 [UAX9]).  Possible rules include, but are not
   limited to, (a) considering any string that contains a right-to-left
   code point to be a right-to-left string, or (b) applying the "Bidi
   Rule" from [RFC5893].

   Mixed-direction strings are not directly supported by the PRECIS
   framework itself, since there is currently no widely accepted and
   implemented solution for the processing and safe display of
   mixed-direction strings.  An application protocol that uses the
   PRECIS framework (or an extension to the framework) could define
   methods for handling mixed-direction strings; however, such methods
   are outside the scope of the framework.

4.1.6.  Exclusions

   The exclusions rule of a profile specifies whether the profile
   excludes additional code points or character categories above and
   beyond those excluded by the string class being profiled.
   That is, a profile MAY do either of the following:

   1.  Exclude specific code points that are allowed by the relevant
       string class.

   2.  Exclude characters matching certain Unicode properties (e.g.,
       math symbols) that are included in the relevant PRECIS string
       class.

   As a result of such exclusions, code points that are defined as valid
   for the PRECIS string class being profiled will be defined as
   disallowed for the profile.

4.2.  Building Application-Layer Constructs

   Sometimes, an application-layer construct does not map in a
   straightforward manner to one of the PRECIS string classes or a
   profile thereof.  Consider, for example, the "simple user name"
   construct in the Simple Authentication and Security Layer (SASL)
   [RFC4422].  Depending on the deployment, a simple user name might
   take the form of a user's full name (e.g., the user's personal name
   followed by a space and then the user's family name).  Such a simple
   user name cannot be defined as an instance of the IdentifierClass or
   a profile thereof, since space characters are not allowed in the
   IdentifierClass; however, it could be defined using a space-separated
   sequence of IdentifierClass instances, as in the following
   pseudo-ABNF [RFC5234]:

      fullname = namepart *(1*SP namepart)
      namepart = 1*idpoint
      ;
      ; an "idpoint" is a UTF-8 encoded Unicode code point
      ; that conforms to the PRECIS IdentifierClass

   Similar techniques could be used to define many application-layer
   constructs, say of the form "user@domain" or "/path/to/file".

4.3.  A Note about Spaces

   With regard to the IdentifierClass, the consensus of the PRECIS
   Working Group was that spaces are problematic for many reasons,
   including:

   o  Many Unicode characters are confusable with ASCII space.

   o  Even if non-ASCII space characters are mapped to ASCII space
      (U+0020), space characters are often not rendered in user
      interfaces, leading to the possibility that a human user might
      consider a string containing spaces to be equivalent to the same
      string without spaces.

   o  In some locales, some devices are known to generate a character
      other than ASCII space (such as ZERO WIDTH JOINER, U+200D) when a
      user performs an action such as hitting the space bar on a
      keyboard.

   One consequence of disallowing space characters in the
   IdentifierClass might be to effectively discourage their use within
   identifiers created in newer application protocols; given the
   challenges involved in properly handling space characters (especially
   non-ASCII space characters) in identifiers and other protocol
   strings, the Working Group considered this to be a feature, not a
   bug.

   However, the FreeformClass does allow spaces, which enables
   application protocols to define profiles of the FreeformClass that
   are more flexible than any profiles of the IdentifierClass.  In
   addition, as explained in the previous section, application protocols
   can also define application-layer constructs containing spaces.

5.  Order of Operations

   To ensure proper comparison, the following order of operations is
   REQUIRED:

   1.  Width mapping

   2.  Optionally, additional mappings such as mapping of delimiters
       (e.g., characters such as '@', ':', '/', '+', and '-') and
       special handling of certain characters or classes of characters
       (e.g., mapping of non-ASCII spaces to ASCII space or mapping of
       control characters to nothing); the PRECIS mappings document
       [I-D.ietf-precis-mappings] describes such mappings in more detail

   3.  Case mapping as described under Section 4.1.3 of this document

   4.  Normalization

   5.  Behavioral rules for determining whether a code point is valid,
       allowed under a contextual rule, disallowed, or unassigned

   As already described, the width mapping, additional mappings, case
   mapping, and normalization operations are specified for each profile,
   whereas the behavioral rules are specified for each string class.
   Some of the logic behind this order is provided under Section 4.1.1
   (see also the PRECIS mappings document [I-D.ietf-precis-mappings]).

6.  Code Point Properties

   In order to implement the string classes described above, this
   document does the following:

   1.  Reviews and classifies the collections of code points in the
       Unicode character set by examining various code point properties.

   2.  Defines an algorithm for determining a derived property value,
       which can vary depending on the string class being used by the
       relevant application protocol.

   This document is not intended to specify precisely how derived
   property values are to be applied in protocol strings.  That
   information is the responsibility of the protocol specification that
   uses or profiles a PRECIS string class from this document.

   The value of the property is to be interpreted as follows.

   PROTOCOL VALID  Those code points that are allowed to be used in any
      PRECIS string class (currently, IdentifierClass and
      FreeformClass).  Code points with this property value are
      permitted for general use in any string class.  The abbreviated
      term "PVALID" is used to refer to this value in the remainder of
      this document.

   SPECIFIC CLASS PROTOCOL VALID  Those code points that are allowed to
      be used in specific string classes.  Code points with this
      property value are permitted for use in specific string classes.
      In the remainder of this document, the abbreviated term *_PVAL is
      used, where * = (ID | FREE), i.e., either "FREE_PVAL" or
      "ID_PVAL".
698 CONTEXTUAL RULE REQUIRED Some characteristics of the character, such 699 as its being invisible in certain contexts or problematic in 700 others, require that it not be used in labels unless specific 701 other characters or properties are present. As in IDNA2008, there 702 are two subdivisions of CONTEXTUAL RULE REQUIRED, the first for 703 Join_controls (called "CONTEXTJ") and the second for other 704 characters (called "CONTEXTO"). A character with the derived 705 property value CONTEXTJ or CONTEXTO MUST NOT be used unless an 706 appropriate rule has been established and the context of the 707 character is consistent with that rule. The most notable of the 708 CONTEXTUAL RULE REQUIRED characters are the Join Control 709 characters U+200D ZERO WIDTH JOINER and U+200C ZERO WIDTH NON- 710 JOINER, which have a derived property value of CONTEXTJ. See 711 Appendix A of [RFC5892] for more information. 713 DISALLOWED Those code points that are not permitted in any PRECIS 714 string class. 716 SPECIFIC CLASS DISALLOWED Those code points that are not to be 717 included in a specific string class. Code points with this 718 property value are not permitted in one of the string classes but 719 might be permitted in others. In the remainder of this document, 720 the abbreviated term *_DIS is used, where * = (ID | FREE), i.e., 721 either "FREE_DIS" or "ID_DIS". 723 UNASSIGNED Those code points that are not designated (i.e. are 724 unassigned) in the Unicode Standard. 726 The mechanisms described here allow determination of the value of the 727 property for future versions of Unicode (including characters added 728 after Unicode 5.2 or 6.3 depending on the category, since some 729 categories in this document are reused from IDNA2008 and therefore 730 were defined at the time of Unicode 5.2). Changes in Unicode 731 properties that do not affect the outcome of this process therefore 732 do not affect this framework. 
For example, a character can have its 733 Unicode General_Category value [UNICODE] change from So to Sm, or 734 from Lo to Ll, without affecting the algorithm results. Moreover, 735 even if such changes were to result, the BackwardCompatible list 736 (Section 7.7) can be adjusted to ensure the stability of the results. 738 7. Category Definitions Used to Calculate Derived Property 740 The derived property obtains its value based on a two-step procedure: 742 1. Characters are placed in one or more character categories either 743 (1) based on core properties defined by the Unicode Standard or 744 (2) by treating the code point as an exception and addressing the 745 code point based on its code point value. These categories are 746 not mutually exclusive. 748 2. Set operations are used with these categories to determine the 749 values for a property specific to a given string class. These 750 operations are specified under Section 8. 752 Note: Unicode property names and property value names might have 753 short abbreviations, such as "gc" for the General_Category 754 property and "Ll" for the Lowercase_Letter property value of the 755 gc property. 757 In the following specification of character categories, the operation 758 that returns the value of a particular Unicode character property for 759 a code point is designated by using the formal name of that property 760 (from the Unicode PropertyAliases.txt [1]) followed by '(cp)' for 761 "code point". For example, the value of the General_Category 762 property for a code point is indicated by General_Category(cp). 764 The first ten categories (A-J) shown below were previously defined 765 for IDNA2008 and are copied directly from [RFC5892]. Some of these 766 categories are reused in PRECIS and some of them are not; however, 767 the lettering of categories is retained to prevent overlap and to 768 ease implementation of both IDNA2008 and PRECIS in a single software 769 application. 
The next eight categories (K-R) are specific to PRECIS. 771 7.1. LetterDigits (A) 773 Note: This category is defined in [RFC5892] and copied here for use 774 in PRECIS. 776 A: General_Category(cp) is in {Ll, Lu, Lm, Lo, Mn, Mc, Nd} 778 These rules identify characters commonly used in mnemonics and often 779 informally described as "language characters". 781 For more information, see Chapter 4 of the Unicode Standard 782 [UNICODE]. 784 The categories used in this rule are: 786 o Ll - Lowercase_Letter 788 o Lu - Uppercase_Letter 790 o Lm - Modifier_Letter 792 o Lo - Other_Letter 794 o Mn - Nonspacing_Mark 796 o Mc - Spacing_Mark 798 o Nd - Decimal_Number 800 7.2. Unstable (B) 802 Note: This category is defined in [RFC5892] but not used in PRECIS. 804 7.3. IgnorableProperties (C) 806 Note: This category is defined in [RFC5892] but not used in PRECIS. 807 See the "PrecisIgnorableProperties (M)" category below for a more 808 inclusive category used in PRECIS identifiers. 810 7.4. IgnorableBlocks (D) 812 Note: This category is defined in [RFC5892] but not used in PRECIS. 814 7.5. LDH (E) 816 Note: This category is defined in [RFC5892] but not used in PRECIS. 817 See the "ASCII7 (K)" category below for a more inclusive category 818 used in PRECIS identifiers. 820 7.6. Exceptions (F) 822 Note: This category is defined in [RFC5892] and used in PRECIS to 823 ensure consistent treatment of the relevant code points. 825 F: cp is in {00B7, 00DF, 0375, 03C2, 05F3, 05F4, 0640, 0660, 826 0661, 0662, 0663, 0664, 0665, 0666, 0667, 0668, 827 0669, 06F0, 06F1, 06F2, 06F3, 06F4, 06F5, 06F6, 828 06F7, 06F8, 06F9, 06FD, 06FE, 07FA, 0F0B, 3007, 829 302E, 302F, 3031, 3032, 3033, 3034, 3035, 303B, 830 30FB} 832 This category explicitly lists code points for which the category 833 cannot be assigned using only the core property values that exist in 834 the Unicode Standard. 
The values are according to the table below: 836 PVALID -- Would otherwise have been DISALLOWED 838 00DF; PVALID # LATIN SMALL LETTER SHARP S 839 03C2; PVALID # GREEK SMALL LETTER FINAL SIGMA 840 06FD; PVALID # ARABIC SIGN SINDHI AMPERSAND 841 06FE; PVALID # ARABIC SIGN SINDHI POSTPOSITION MEN 842 0F0B; PVALID # TIBETAN MARK INTERSYLLABIC TSHEG 843 3007; PVALID # IDEOGRAPHIC NUMBER ZERO 845 CONTEXTO -- Would otherwise have been DISALLOWED 847 00B7; CONTEXTO # MIDDLE DOT 848 0375; CONTEXTO # GREEK LOWER NUMERAL SIGN (KERAIA) 849 05F3; CONTEXTO # HEBREW PUNCTUATION GERESH 850 05F4; CONTEXTO # HEBREW PUNCTUATION GERSHAYIM 851 30FB; CONTEXTO # KATAKANA MIDDLE DOT 853 CONTEXTO -- Would otherwise have been PVALID 855 0660; CONTEXTO # ARABIC-INDIC DIGIT ZERO 856 0661; CONTEXTO # ARABIC-INDIC DIGIT ONE 857 0662; CONTEXTO # ARABIC-INDIC DIGIT TWO 858 0663; CONTEXTO # ARABIC-INDIC DIGIT THREE 859 0664; CONTEXTO # ARABIC-INDIC DIGIT FOUR 860 0665; CONTEXTO # ARABIC-INDIC DIGIT FIVE 861 0666; CONTEXTO # ARABIC-INDIC DIGIT SIX 862 0667; CONTEXTO # ARABIC-INDIC DIGIT SEVEN 863 0668; CONTEXTO # ARABIC-INDIC DIGIT EIGHT 864 0669; CONTEXTO # ARABIC-INDIC DIGIT NINE 865 06F0; CONTEXTO # EXTENDED ARABIC-INDIC DIGIT ZERO 866 06F1; CONTEXTO # EXTENDED ARABIC-INDIC DIGIT ONE 867 06F2; CONTEXTO # EXTENDED ARABIC-INDIC DIGIT TWO 868 06F3; CONTEXTO # EXTENDED ARABIC-INDIC DIGIT THREE 869 06F4; CONTEXTO # EXTENDED ARABIC-INDIC DIGIT FOUR 870 06F5; CONTEXTO # EXTENDED ARABIC-INDIC DIGIT FIVE 871 06F6; CONTEXTO # EXTENDED ARABIC-INDIC DIGIT SIX 872 06F7; CONTEXTO # EXTENDED ARABIC-INDIC DIGIT SEVEN 873 06F8; CONTEXTO # EXTENDED ARABIC-INDIC DIGIT EIGHT 874 06F9; CONTEXTO # EXTENDED ARABIC-INDIC DIGIT NINE 876 DISALLOWED -- Would otherwise have been PVALID 878 0640; DISALLOWED # ARABIC TATWEEL 879 07FA; DISALLOWED # NKO LAJANYALAN 880 302E; DISALLOWED # HANGUL SINGLE DOT TONE MARK 881 302F; DISALLOWED # HANGUL DOUBLE DOT TONE MARK 882 3031; DISALLOWED # VERTICAL KANA REPEAT MARK 883 3032; 
DISALLOWED # VERTICAL KANA REPEAT WITH VOICED SOUND MARK 884 3033; DISALLOWED # VERTICAL KANA REPEAT MARK UPPER HALF 885 3034; DISALLOWED # VERTICAL KANA REPEAT WITH VOICED SOUND MARK 886 UPPER HA 887 3035; DISALLOWED # VERTICAL KANA REPEAT MARK LOWER HALF 888 303B; DISALLOWED # VERTICAL IDEOGRAPHIC ITERATION MARK 890 7.7. BackwardCompatible (G) 892 Note: This category is defined in [RFC5892] and copied here for use 893 in PRECIS. Because of how the PRECIS string classes are defined, 894 only changes that would result in code points being added to or 895 removed from the LetterDigits ("A") category would result in 896 backward-incompatible modifications to code point assignments. 897 Therefore, management of this category is handled via the processes 898 specified in [RFC5892]. 900 G: cp is in {} 902 This category includes the code points for which property values in 903 versions of Unicode after 5.2 have changed in such a way that the 904 derived property value would no longer be PVALID or DISALLOWED. If 905 changes are made to future versions of Unicode so that code points 906 might change property value from PVALID or DISALLOWED, then this 907 table can be updated to keep special exception values so that the 908 property values for code points stay stable. 910 7.8. JoinControl (H) 912 Note: This category is defined in [RFC5892] and copied here for use 913 in PRECIS. 915 H: Join_Control(cp) = True 917 This category consists of Join Control characters (i.e., they are not 918 in LetterDigits (Section 7.1) but are still required in strings under 919 some circumstances). 921 7.9. OldHangulJamo (I) 923 Note: This category is defined in [RFC5892] and copied here for use 924 in PRECIS. 926 I: Hangul_Syllable_Type(cp) is in {L, V, T} 928 This category consists of all conjoining Hangul Jamo (Leading Jamo, 929 Vowel Jamo, and Trailing Jamo).
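Note: As a non-normative illustration, the membership tests for LetterDigits (A), Exceptions (F), and JoinControl (H) defined above might be sketched in Python roughly as follows. The names LETTER_DIGITS_GC, EXCEPTIONS, JOIN_CONTROL, in_letter_digits, in_join_control, and exceptions_value are hypothetical and do not appear in this specification. The sketch assumes that Python's standard unicodedata module is an acceptable source for General_Category; that module reflects the Unicode version bundled with the interpreter (not necessarily 6.3) and does not expose Join_Control or Hangul_Syllable_Type, so the two Join Control code points are listed by hand and OldHangulJamo (I) is omitted.

```python
import unicodedata

# General_Category values that make up LetterDigits (A), Section 7.1.
LETTER_DIGITS_GC = frozenset({"Ll", "Lu", "Lm", "Lo", "Mn", "Mc", "Nd"})

# Exceptions (F), Section 7.6: code point -> value from the table.
EXCEPTIONS = {}
EXCEPTIONS.update(dict.fromkeys(
    (0x00DF, 0x03C2, 0x06FD, 0x06FE, 0x0F0B, 0x3007), "PVALID"))
EXCEPTIONS.update(dict.fromkeys(
    (0x00B7, 0x0375, 0x05F3, 0x05F4, 0x30FB), "CONTEXTO"))
# ARABIC-INDIC and EXTENDED ARABIC-INDIC digits ZERO through NINE.
EXCEPTIONS.update(dict.fromkeys(range(0x0660, 0x066A), "CONTEXTO"))
EXCEPTIONS.update(dict.fromkeys(range(0x06F0, 0x06FA), "CONTEXTO"))
EXCEPTIONS.update(dict.fromkeys(
    (0x0640, 0x07FA, 0x302E, 0x302F, 0x3031, 0x3032,
     0x3033, 0x3034, 0x3035, 0x303B), "DISALLOWED"))

# JoinControl (H), Section 7.8. Assumption: hard-coded because the
# Join_Control property is not available from unicodedata.
JOIN_CONTROL = frozenset({0x200C, 0x200D})


def in_letter_digits(cp: int) -> bool:
    """Return True if General_Category(cp) is one of the (A) values."""
    return unicodedata.category(chr(cp)) in LETTER_DIGITS_GC


def in_join_control(cp: int) -> bool:
    """Return True if cp is ZERO WIDTH NON-JOINER or ZERO WIDTH JOINER."""
    return cp in JOIN_CONTROL


def exceptions_value(cp: int):
    """Return the Exceptions(cp) value, or None if cp is not in (F)."""
    return EXCEPTIONS.get(cp)
```

Keeping Exceptions as a mapping rather than a set mirrors the distinction this document draws between a category used as a set of code points and the same name used as a function that returns a value for a code point.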
931 Elimination of conjoining Hangul Jamos from the set of PVALID 932 characters results in restricting the set of Korean PVALID characters 933 just to preformed, modern Hangul syllable characters. Old Hangul 934 syllables, which are spelled with sequences of conjoining Hangul 935 Jamos, are not PVALID for string classes. 937 7.10. Unassigned (J) 939 Note: This category is defined in [RFC5892] and copied here for use 940 in PRECIS. 942 J: General_Category(cp) is in {Cn} and 943 Noncharacter_Code_Point(cp) = False 945 This category consists of code points in the Unicode character set 946 that are not (yet) designated. Implementers might want to keep in 947 mind that the Unicode Standard distinguishes between 'unassigned code 948 points' and 'unassigned characters'. The unassigned code points are 949 all but (Cn - Noncharacters), whereas the unassigned characters are 950 all but (Cn + Cs). 952 7.11. ASCII7 (K) 954 This PRECIS-specific category consists of all printable, non-space 955 characters from the 7-bit ASCII range. By applying this category, 956 the algorithm specified under Section 8 exempts these characters from 957 other rules that might be applied during PRECIS processing, on the 958 assumption that these code points are in such wide use that 959 disallowing them would be counter-productive. 961 K: cp is in {0021..007E} 963 7.12. Controls (L) 965 L: Control(cp) = True 967 7.13. PrecisIgnorableProperties (M) 969 This PRECIS-specific category is used to group code points that are 970 discouraged from use in PRECIS string classes. 
972 M: Default_Ignorable_Code_Point(cp) = True or 973 Noncharacter_Code_Point(cp) = True 975 The definition for Default_Ignorable_Code_Point can be found in the 976 DerivedCoreProperties.txt [2] file, and at the time of Unicode 6.3 is 977 as follows: 979 Other_Default_Ignorable_Code_Point 980 + Cf (Format characters) 981 + Variation_Selector 982 - White_Space 983 - FFF9..FFFB (Annotation Characters) 984 - 0600..0604, 06DD, 070F, 110BD (exceptional Cf characters 985 that should be visible) 987 7.14. Spaces (N) 989 This PRECIS-specific category is used to group code points that are 990 space characters. 992 N: General_Category(cp) is in {Zs} 994 7.15. Symbols (O) 996 This PRECIS-specific category is used to group code points that are 997 symbols. 999 O: General_Category(cp) is in {Sm, Sc, Sk, So} 1001 7.16. Punctuation (P) 1003 This PRECIS-specific category is used to group code points that are 1004 punctuation characters. 1006 P: General_Category(cp) is in {Pc, Pd, Ps, Pe, Pi, Pf, Po} 1008 7.17. HasCompat (Q) 1010 This PRECIS-specific category is used to group code points that have 1011 compatibility equivalents as explained in Chapter 2 and Chapter 3 of 1012 the Unicode Standard [UNICODE]. 1014 Q: toNFKC(cp) != cp 1016 The toNFKC() operation returns the code point in normalization form 1017 KC. For more information, see Section 5 of Unicode Standard Annex 1018 #15 [UAX15]. 1020 7.18. OtherLetterDigits (R) 1022 This PRECIS-specific category is used to group code points that are 1023 letters and digits other than the "traditional" letters and digits 1024 grouped under the LetterDigits (A) class (see Section 7.1). 1026 R: General_Category(cp) is in {Lt, Nl, No, Me} 1028 8. 
Calculation of the Derived Property 1030 Possible values of the derived property are: 1032 o PVALID 1034 o ID_PVAL 1036 o FREE_PVAL 1038 o CONTEXTJ 1040 o CONTEXTO 1041 o DISALLOWED 1043 o ID_DIS 1045 o FREE_DIS 1047 o UNASSIGNED 1049 Note: The value of the derived property calculated can depend on 1050 the string class; for example, if an identifier used in an 1051 application protocol is defined as profiling the PRECIS 1052 IdentifierClass then a space character such as U+0020 would be 1053 assigned to ID_DIS, whereas if an identifier is defined as 1054 profiling the PRECIS FreeformClass then the character would be 1055 assigned to FREE_PVAL. For the sake of brevity, the designation 1056 "FREE_PVAL" is used in the code point tables, instead of the 1057 longer designation "ID_DIS or FREE_PVAL". In practice, the 1058 derived properties ID_PVAL and FREE_DIS are not used in this 1059 specification, since every ID_PVAL code point is PVALID and every 1060 FREE_DIS code point is DISALLOWED. 1062 The algorithm to calculate the value of the derived property is as 1063 follows: 1065 If .cp. .in. Exceptions Then Exceptions(cp); 1066 Else If .cp. .in. BackwardCompatible Then BackwardCompatible(cp); 1067 Else If .cp. .in. Unassigned Then UNASSIGNED; 1068 Else If .cp. .in. ASCII7 Then PVALID; 1069 Else If .cp. .in. JoinControl Then CONTEXTJ; 1070 Else If .cp. .in. OldHangulJamo Then DISALLOWED; 1071 Else If .cp. .in. PrecisIgnorableProperties Then DISALLOWED; 1072 Else If .cp. .in. Controls Then DISALLOWED; 1073 Else If .cp. .in. HasCompat Then ID_DIS or FREE_PVAL; 1074 Else If .cp. .in. LetterDigits Then PVALID; 1075 Else If .cp. .in. OtherLetterDigits Then ID_DIS or FREE_PVAL; 1076 Else If .cp. .in. Spaces Then ID_DIS or FREE_PVAL; 1077 Else If .cp. .in. Symbols Then ID_DIS or FREE_PVAL; 1078 Else If .cp. .in. 
Punctuation Then ID_DIS or FREE_PVAL; 1079 Else DISALLOWED; 1081 Note: Use of the name of a rule (such as "Exceptions") implies the 1082 set of code points that the rule defines, whereas the same name as a 1083 function call (such as "Exceptions(cp)") implies the value that the 1084 code point has in the Exceptions table. 1086 9. IANA Considerations 1088 9.1. PRECIS Derived Property Value Registry 1090 IANA is requested to create a PRECIS-specific registry with the 1091 Derived Properties for the versions of Unicode that are released 1092 after (and including) version 6.3. The derived property value is to 1093 be calculated in cooperation with a designated expert [RFC5226] 1094 according to the rules specified under Section 7 and Section 8, not 1095 by copying the non-normative table found under Appendix A. 1097 The IESG is to be notified if backward-incompatible changes to the 1098 table of derived properties are discovered or if other problems arise 1099 during the process of creating the table of derived property values 1100 or during expert review. Changes to the rules defined under 1101 Section 7 and Section 8 require IETF Review. 1103 9.2. PRECIS Base Classes Registry 1105 IANA is requested to create a registry of PRECIS string classes. In 1106 accordance with [RFC5226], the registration policy is "RFC Required". 1108 The registration template is as follows: 1110 Base Class: [the name of the PRECIS string class] 1112 Description: [a brief description of the PRECIS string class and its 1113 intended use, e.g., "A sequence of letters, numbers, and symbols 1114 that is used to identify or address a network entity."] 1116 Specification: [the RFC number] 1118 The initial registrations are as follows: 1120 Base Class: FreeformClass. 1121 Description: A sequence of letters, numbers, symbols, spaces, and 1122 other code points that is used for free-form strings. 1123 Specification: Section 3.3 of this document. 
1124 [Note to RFC Editor: please change "this document" 1125 to the RFC number issued for this specification.] 1127 Base Class: IdentifierClass. 1128 Description: A sequence of letters, numbers, and symbols that is 1129 used to identify or address a network entity. 1130 Specification: Section 3.3 of this document. 1131 [Note to RFC Editor: please change "this document" 1132 to the RFC number issued for this specification.] 1134 9.3. PRECIS Profiles Registry 1136 IANA is requested to create a registry of profiles that use the 1137 PRECIS string classes. In accordance with [RFC5226], the 1138 registration policy is "Expert Review". This policy was chosen in 1139 order to ease the burden of registration while ensuring that 1140 "customers" of PRECIS receive appropriate guidance regarding the 1141 sometimes complex and subtle internationalization issues related to 1142 profiles of PRECIS string classes. 1144 The registration template is as follows: 1146 Name: [the name of the profile] 1148 Applicability: [the specific protocol elements to which this profile 1149 applies, e.g., "Localparts in XMPP addresses."] 1151 Base Class: [which PRECIS string class is being profiled] 1153 Replaces: [the Stringprep profile that this PRECIS profile replaces, 1154 if any] 1156 Width Mapping: [the behavioral rule for handling of width, e.g., 1157 "Map fullwidth and halfwidth characters to their compatibility 1158 variants."] 1160 Additional Mappings: [any additional mappings that are required or 1161 recommended, e.g., "Map non-ASCII space characters to ASCII 1162 space."] 1164 Case Mapping: [the behavioral rule for handling of case, e.g., 1165 "Unicode Default Case Folding"] 1167 Normalization: [which Unicode normalization form is applied, e.g., 1168 "NFC"] 1170 Directionality: [the behavioral rule for handling of right-to-left 1171 code points, e.g., "The 'Bidi Rule' defined in RFC 5893 applies."] 1173 Exclusions: [a brief description of the specific code points or 1174 character
categories that are excluded, e.g., "Eight legacy characters 1175 in the ASCII range" or "Any character that has a compatibility 1176 equivalent, i.e., the HasCompat category"] 1178 Enforcement: [which entities enforce the rules, and when that 1179 enforcement occurs during protocol operations] 1181 Specification: [a pointer to relevant documentation, such as an RFC 1182 or Internet-Draft] 1184 In order to request a review, the registrant shall send a completed 1185 template to the precis@ietf.org list or its designated successor. 1187 Factors to focus on while defining profiles and reviewing profile 1188 registrations include the following: 1190 o Is the problem being addressed by this profile well-defined? 1192 o Does the specification define what kinds of applications are 1193 involved and the protocol elements to which this profile applies? 1195 o Would an existing PRECIS string class or profile solve the 1196 problem? 1198 o Is the profile clearly defined? 1200 o Is the profile based on an appropriate dividing line between user 1201 interface (culture, context, intent, locale, device limitations, 1202 etc.) and the use of conformant strings in protocol elements? 1204 o Are the width mapping, case mapping, additional mappings, 1205 normalization, exclusion, and directionality rules appropriate for 1206 the intended use? 1208 o Does the profile explain which entities enforce the rules, and 1209 when such enforcement occurs during protocol operations? 1211 o Does the profile reduce the degree to which human users could be 1212 surprised or confused by application behavior (the "principle of 1213 least user surprise")? 1215 o Does the profile introduce any new security concerns such as those 1216 described under Section 10 of this document (e.g., false positives 1217 for authentication or authorization)? 1219 10. Security Considerations 1221 10.1.
General Issues 1223 The security of applications that use this framework can depend in 1224 part on the proper preparation and comparison of internationalized 1225 strings. For example, such strings can be used to make 1226 authentication and authorization decisions, and the security of an 1227 application could be compromised if an entity providing a given 1228 string is connected to the wrong account or online resource based on 1229 different interpretations of the string. 1231 Specifications of application protocols that use this framework are 1232 encouraged to describe how internationalized strings are used in the 1233 protocol, including the security implications of any false positives 1234 and false negatives that might result from various comparison 1235 operations. For some helpful guidelines, refer to [RFC6943], 1236 [RFC5890], [UTR36], and [UTS39]. 1238 10.2. Use of the IdentifierClass 1240 Strings that conform to the IdentifierClass and any profile thereof 1241 are intended to be relatively safe for use in a broad range of 1242 applications, primarily because they include only letters, digits, 1243 and "grandfathered" non-space characters from the ASCII range; thus 1244 they exclude spaces, characters with compatibility equivalents, and 1245 almost all symbols and punctuation marks. However, because such 1246 strings can still include so-called confusable characters (see 1247 Section 10.5), protocol designers and implementers are encouraged to 1248 pay close attention to the security considerations described 1249 elsewhere in this document. 1251 10.3. Use of the FreeformClass 1253 Strings that conform to the FreeformClass and many profiles thereof 1254 can include virtually any Unicode character. This makes the 1255 FreeformClass quite expressive, but also problematic from the 1256 perspective of possible user confusion. 
Protocol designers are 1257 hereby warned that the FreeformClass contains code points they might 1258 not understand, and are encouraged to profile the IdentifierClass 1259 wherever feasible; however, if an application protocol requires more 1260 code points than are allowed by the IdentifierClass, protocol 1261 designers are encouraged to define a profile of the FreeformClass 1262 that restricts the allowable code points as tightly as possible. 1263 (The PRECIS Working Group considered the option of allowing 1264 superclasses as well as profiles of PRECIS string classes, but 1265 decided against allowing superclasses to reduce the likelihood of 1266 security and interoperability problems.) 1268 10.4. Local Character Set Issues 1270 When systems use local character sets other than ASCII and Unicode, 1271 this specification leaves the problem of converting between the local 1272 character set and Unicode up to the application or local system. If 1273 different applications (or different versions of one application) 1274 implement different rules for conversions among coded character sets, 1275 they could interpret the same name differently and contact different 1276 application servers or other network entities. This problem is not 1277 solved by security protocols, such as Transport Layer Security (TLS) 1278 [RFC5246] and the Simple Authentication and Security Layer (SASL) 1279 [RFC4422], that do not take local character sets into account. 1281 10.5. Visually Similar Characters 1283 Some characters are visually similar and thus can cause confusion 1284 among humans. Such characters are often called "confusable 1285 characters" or "confusables". 1287 The problem of confusable characters is not necessarily caused by the 1288 use of Unicode code points outside the ASCII range.
For example, in 1289 some presentations and to some individuals the string "ju1iet" 1290 (spelled with DIGIT ONE, U+0031, as the third character) might appear 1291 to be the same as "juliet" (spelled with LATIN SMALL LETTER L, 1292 U+006C), especially on casual visual inspection. This phenomenon is 1293 sometimes called "typejacking". 1295 However, the problem is made more serious by introducing the full 1296 range of Unicode code points into protocol strings. For example, the 1297 characters U+13DA U+13A2 U+13B5 U+13AC U+13A2 U+13AC U+13D2 from the 1298 Cherokee block look similar to the ASCII characters "STPETER" as they 1299 might appear when presented using a "creative" font family. 1301 In some examples of confusable characters, it is unlikely that the 1302 average human could tell the difference between the real string and 1303 the fake string. (Indeed, there is no programmatic way to 1304 distinguish with full certainty which is the fake string and which is 1305 the real string; in some contexts, the string formed of Cherokee 1306 characters might be the real string and the string formed of ASCII 1307 characters might be the fake string.) Because PRECIS-compliant 1308 strings can contain almost any properly-encoded Unicode code point, 1309 it can be relatively easy to fake or mimic some strings in systems 1310 that use the PRECIS framework. The fact that some strings are easily 1311 confused introduces security vulnerabilities of the kind that have 1312 also plagued the World Wide Web, specifically the phenomenon known as 1313 phishing. 1315 Despite the fact that some specific suggestions about identification 1316 and handling of confusable characters appear in the Unicode Security 1317 Considerations [UTR36] and the Unicode Security Mechanisms [UTS39], 1318 it is also true (as noted in [RFC5890]) that "there are no 1319 comprehensive technical solutions to the problems of confusable 1320 characters". 
Because it is impossible to map visually similar 1321 characters without a great deal of context (such as knowing the font 1322 families used), the PRECIS framework does nothing to map similar- 1323 looking characters together, nor does it prohibit some characters 1324 because they look like others. 1326 Nevertheless, specifications for application protocols that use this 1327 framework MUST describe how confusable characters can be abused to 1328 compromise the security of systems that use the protocol in question, 1329 along with any protocol-specific suggestions for overcoming those 1330 threats. In particular, software implementations and service 1331 deployments that use PRECIS-based technologies are strongly 1332 encouraged to define and implement consistent policies regarding the 1333 registration, storage, and presentation of visually similar 1334 characters. The following recommendations are appropriate: 1336 1. An application service SHOULD define a policy that specifies the 1337 scripts or blocks of characters that the service will allow to be 1338 registered (e.g., in an account name) or stored (e.g., in a file 1339 name). Such a policy SHOULD be informed by the languages and 1340 scripts that are used to write registered account names; in 1341 particular, to reduce confusion, the service SHOULD forbid 1342 registration or storage of strings that contain characters from 1343 more than one script and SHOULD restrict registrations to 1344 characters drawn from a very small number of scripts (e.g., 1345 scripts that are well-understood by the administrators of the 1346 service, to improve manageability). 1348 2. User-oriented application software SHOULD define a policy that 1349 specifies how internationalized strings will be presented to a 1350 human user. 
Because every human user of such software has a 1351 preferred language or a small set of preferred languages, the 1352 software SHOULD gather that information either explicitly from 1353 the user or implicitly via the operating system of the user's 1354 device. Furthermore, because most languages are typically 1355 represented by a single script or a small set of scripts, and 1356 because most scripts are typically contained in one or more 1357 blocks of characters, the software SHOULD warn the user when 1358 presenting a string that mixes characters from more than one 1359 script or block, or that uses characters outside the normal range 1360 of the user's preferred language(s). (Such a recommendation is 1361 not intended to discourage communication across different 1362 communities of language users; instead, it recognizes the 1363 existence of such communities and encourages due caution when 1364 presenting unfamiliar scripts or characters to human users.) 1366 The challenges inherent in supporting the full range of Unicode code 1367 points have in the past led some to hope for a way to 1368 programmatically negotiate more restrictive ranges based on locale, 1369 script, or other relevant factors, to tag the locale associated with 1370 a particular string, etc. As a general-purpose internationalization 1371 technology, the PRECIS framework does not include such mechanisms. 1373 10.6. Security of Passwords 1375 Two goals of passwords are to maximize the amount of entropy and to 1376 minimize the potential for false positives. These goals can be 1377 achieved in part by allowing a wide range of code points and by 1378 ensuring that passwords are handled in such a way that code points 1379 are not compared aggressively. Therefore, it is NOT RECOMMENDED for 1380 application protocols to profile the FreeformClass for use in 1381 passwords in a way that removes entire categories (e.g., by 1382 disallowing symbols or punctuation). 
Furthermore, it is NOT 1383 RECOMMENDED for application protocols to map uppercase and titlecase 1384 code points to their lowercase equivalents in such strings; instead, 1385 it is RECOMMENDED to preserve the case of all code points contained 1386 in such strings and to compare them in a case-sensitive manner. 1388 That said, software implementers need to be aware that there exist 1389 tradeoffs between entropy and usability. For example, allowing a 1390 user to establish a password containing "uncommon" code points might 1391 make it difficult for the user to access a service when using an 1392 unfamiliar or constrained input device. 1394 Some application protocols use passwords directly, whereas others 1395 reuse technologies that themselves process passwords (one example of 1396 such a technology is the Simple Authentication and Security Layer 1397 [RFC4422]). Moreover, passwords are often carried by a sequence of 1398 protocols with backend authentication systems or data storage systems 1399 such as RADIUS [RFC2865] and LDAP [RFC4510]. Developers of 1400 application protocols are encouraged to look into reusing these 1401 profiles instead of defining new ones, so that end-user expectations 1402 about passwords are consistent no matter which application protocol 1403 is used. 1405 Further discussion of password handling can be found in 1406 [I-D.ietf-precis-saslprepbis]. 1408 11. Interoperability Considerations 1410 Although strings that are consumed in PRECIS-based application 1411 protocols are often encoded using UTF-8 [RFC3629], the exact encoding 1412 is a matter for the application protocol that uses PRECIS, not for 1413 the PRECIS framework. 1415 It is known that some existing systems are unable to support the full 1416 Unicode character set, or even any characters outside the ASCII 1417 range. 
   If two (or more) applications need to interoperate when exchanging
   data (e.g., for the purpose of authenticating a username or
   password), they will naturally need to have in common at least one
   coded character set (as defined by [RFC6365]).  Establishing such a
   baseline is a matter for the application protocol that uses PRECIS,
   not for the PRECIS framework.

   The PRECIS framework, which is defined in terms of the latest version
   of Unicode as of the time of this writing (6.3), treats the character
   U+19DA NEW TAI LUE THAM DIGIT ONE as DISALLOWED.  Implementers need
   to be aware that this treatment is different from IDNA2008
   (originally defined in terms of Unicode 5.2), which treats U+19DA as
   PVALID.

12.  References

12.1.  Normative References

   [RFC20]    Cerf, V., "ASCII format for network interchange", RFC 20,
              October 1969.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC5198]  Klensin, J. and M. Padlipsky, "Unicode Format for Network
              Interchange", RFC 5198, March 2008.

   [UNICODE]  The Unicode Consortium, "The Unicode Standard", 2013.

12.2.  Informative References

   [I-D.ietf-precis-mappings]
              Yoneya, Y. and T. NEMOTO, "Mapping characters for PRECIS
              classes", draft-ietf-precis-mappings-07 (work in
              progress), February 2014.

   [I-D.ietf-precis-nickname]
              Saint-Andre, P., "Preparation and Comparison of
              Nicknames", draft-ietf-precis-nickname-09 (work in
              progress), January 2014.

   [I-D.ietf-precis-saslprepbis]
              Saint-Andre, P. and A. Melnikov, "Username and Password
              Preparation Algorithms", draft-ietf-precis-saslprepbis-07
              (work in progress), March 2014.

   [I-D.ietf-xmpp-6122bis]
              Saint-Andre, P., "Extensible Messaging and Presence
              Protocol (XMPP): Address Format", draft-ietf-xmpp-
              6122bis-12 (work in progress), March 2014.

   [RFC2865]  Rigney, C., Willens, S., Rubens, A., and W. Simpson,
              "Remote Authentication Dial In User Service (RADIUS)",
              RFC 2865, June 2000.

   [RFC3454]  Hoffman, P. and M. Blanchet, "Preparation of
              Internationalized Strings ("stringprep")", RFC 3454,
              December 2002.

   [RFC3490]  Faltstrom, P., Hoffman, P., and A. Costello,
              "Internationalizing Domain Names in Applications (IDNA)",
              RFC 3490, March 2003.

   [RFC3491]  Hoffman, P. and M. Blanchet, "Nameprep: A Stringprep
              Profile for Internationalized Domain Names (IDN)",
              RFC 3491, March 2003.

   [RFC3629]  Yergeau, F., "UTF-8, a transformation format of ISO
              10646", STD 63, RFC 3629, November 2003.

   [RFC4422]  Melnikov, A. and K. Zeilenga, "Simple Authentication and
              Security Layer (SASL)", RFC 4422, June 2006.

   [RFC4510]  Zeilenga, K., "Lightweight Directory Access Protocol
              (LDAP): Technical Specification Road Map", RFC 4510,
              June 2006.

   [RFC4690]  Klensin, J., Faltstrom, P., Karp, C., and IAB, "Review and
              Recommendations for Internationalized Domain Names
              (IDNs)", RFC 4690, September 2006.

   [RFC5226]  Narten, T. and H. Alvestrand, "Guidelines for Writing an
              IANA Considerations Section in RFCs", BCP 26, RFC 5226,
              May 2008.

   [RFC5234]  Crocker, D. and P. Overell, "Augmented BNF for Syntax
              Specifications: ABNF", STD 68, RFC 5234, January 2008.

   [RFC5246]  Dierks, T. and E. Rescorla, "The Transport Layer Security
              (TLS) Protocol Version 1.2", RFC 5246, August 2008.

   [RFC5890]  Klensin, J., "Internationalized Domain Names for
              Applications (IDNA): Definitions and Document Framework",
              RFC 5890, August 2010.

   [RFC5891]  Klensin, J., "Internationalized Domain Names in
              Applications (IDNA): Protocol", RFC 5891, August 2010.

   [RFC5892]  Faltstrom, P., "The Unicode Code Points and
              Internationalized Domain Names for Applications (IDNA)",
              RFC 5892, August 2010.

   [RFC5893]  Alvestrand, H. and C. Karp, "Right-to-Left Scripts for
              Internationalized Domain Names for Applications (IDNA)",
              RFC 5893, August 2010.

   [RFC5894]  Klensin, J., "Internationalized Domain Names for
              Applications (IDNA): Background, Explanation, and
              Rationale", RFC 5894, August 2010.

   [RFC5895]  Resnick, P. and P. Hoffman, "Mapping Characters for
              Internationalized Domain Names in Applications (IDNA)
              2008", RFC 5895, September 2010.

   [RFC6365]  Hoffman, P. and J. Klensin, "Terminology Used in
              Internationalization in the IETF", BCP 166, RFC 6365,
              September 2011.

   [RFC6885]  Blanchet, M. and A. Sullivan, "Stringprep Revision and
              Problem Statement for the Preparation and Comparison of
              Internationalized Strings (PRECIS)", RFC 6885, March 2013.

   [RFC6943]  Thaler, D., "Issues in Identifier Comparison for Security
              Purposes", RFC 6943, May 2013.

   [UAX9]     The Unicode Consortium, "Unicode Standard Annex #9:
              Unicode Bidirectional Algorithm", September 2012.

   [UAX11]    The Unicode Consortium, "Unicode Standard Annex #11: East
              Asian Width", September 2012.

   [UAX15]    The Unicode Consortium, "Unicode Standard Annex #15:
              Unicode Normalization Forms", August 2012.

   [UTR36]    The Unicode Consortium, "Unicode Technical Report #36:
              Unicode Security Considerations", July 2012.

   [UTS39]    The Unicode Consortium, "Unicode Technical Standard #39:
              Unicode Security Mechanisms", July 2012.

12.3.  URIs

   [1] http://unicode.org/Public/UNIDATA/PropertyAliases.txt

   [2] http://unicode.org/Public/UNIDATA/DerivedCoreProperties.txt

Appendix A.  Codepoint Table

   If one applies the property calculation rules from Section 8 to the
   code points 0x0000 to 0x10FFFF in Unicode 6.3, the result is as shown
   in the following table, in Unicode Character Database (UCD) format.
   The columns of the table are as follows:

   1.  The code point or codepoint range.

   2.  The assignment for the code point or range, where the value is
       one of PVALID, DISALLOWED, UNASSIGNED, CONTEXTO, CONTEXTJ, or
       FREE_PVAL (where the latter includes ID_DIS).

   3.  The name or names for the code point or range.

   This table is non-normative, is included only for illustrative
   purposes, and applies only to Unicode 6.3, not to past or future
   versions of Unicode.  Please note that the strings displayed in the
   third column are not necessarily the formal names of the code points
   (as defined in [UNICODE]) because the fixed width of the RFC format
   necessitated truncation of many names.

0000..001F ; DISALLOWED #
0020 ; FREE_PVAL # SPACE
0021..007E ; PVALID # EXCLAM MARK..TILDE
007F..009F ; DISALLOWED #
00A0..00AC ; FREE_PVAL # NO-BREAK SPACE..NOT SIGN
00AD ; DISALLOWED # SOFT HYPH
00AE..00B6 ; FREE_PVAL # REGISTERED SIGN..PILCROW SIGN
00B7 ; CONTEXTO # MIDDLE DOT
00B8..00BF ; FREE_PVAL # CEDILLA..INV QUEST IND
00C0..00D6 ; PVALID # LAT CAP LET A W GRAV..LAT CAP O
00D7 ; FREE_PVAL # MULTIPLICATION SIGN
00D8..00F6 ; PVALID # LAT CAP LET O W STROKE..LAT SM
00F7 ; FREE_PVAL # DIVISION SIGN
00F8..0131 ; PVALID # LAT SM LET O W STROKE..LAT SM LET
0132..0133 ; FREE_PVAL # LAT CAP LIG IJ..LAT SM LIG IJ
0134..013E ; PVALID # LAT CAP LET J W CIRCUM..LAT SM LET
013F..0140 ; FREE_PVAL # LAT CAP LET L W MID DOT..LAT SM LET
0141..0148 ; PVALID # LAT CAP LET L W STROKE..LAT SM LET
0149 ; FREE_PVAL # LAT SM LET N PRECEDED BY APOS
014A..017E ; PVALID # LAT CAP LET ENG..LAT SM LET Z W CA
017F ; FREE_PVAL # LAT SM LET LONG S
0180..01C3 ; PVALID # LAT SM LET B W STROKE..LAT LET RETR
01C4..01CC ; FREE_PVAL # LAT CAP LET DZ W CARON..LAT SM
01CD..01F0 ; PVALID # LAT CAP LET A W CARON..LAT SM LET J
01F1..01F3 ; FREE_PVAL # LAT CAP LET DZ..LAT SM LET DZ
01F4..02AF ; PVALID # LAT CAP LET G W ACUTE..LAT SM
02B0..02B8 ; FREE_PVAL # MOD LET SM H..MOD LET SM Y
02B9..02C1 ; PVALID # MOD LET PRIME..MOD LET REV GLOT ST
02C2..02C5 ; FREE_PVAL # MOD LET L ARROW..MOD LET D ARROW
02C6..02D1 ; PVALID # MOD LET CIRCUM ACC..MOD LET HALF TR
02D2..02EB ; FREE_PVAL # MOD LET CENT R HALF RING..MOD LET Y
02EC ; PVALID # MOD LET VOICING
02ED ; FREE_PVAL # MOD LET UNASPIRATED
02EE ; PVALID # MOD LET DOUBLE APOS
02EF..02FF ; FREE_PVAL # MOD LET LOW D ARR..MOD LET LOW L AR
0300..034E ; PVALID # COMB GRAVE ACCENT..COMB UP ARROW BE
034F ; DISALLOWED # COMB GRAPHEME JOINER
0350..0374 ; PVALID # COMB RIGHT ARROWHEAD..GREEK NUM SIG
0375 ; CONTEXTO # GREEK LOW NUM SIGN
0376..0377 ; PVALID # GR CAP LET PAMPHYLIAN DIGAMMA..GR S
0378..0379 ; UNASSIGNED # ..
037A ; FREE_PVAL # GR YPOGEGRAMMENI
037B..037D ; PVALID # GR SM REV LUN SIG..GR SM REV DOT LU
037E ; FREE_PVAL # GREEK QUEST MARK
037F..0383 ; UNASSIGNED # ..
0384..0385 ; FREE_PVAL # GREEK TONOS..GREEK DIALYTIKA TONOS
0386 ; PVALID # GR CAP LET ALPHA W TONOS
0387 ; FREE_PVAL # GREEK ANO TELEIA
0388..038A ; PVALID # GR CAP LET EPSILON W TONOS..GR CAP
038B ; UNASSIGNED #
038C ; PVALID # GREEK CAP LET OMICRON W TONOS
038D ; UNASSIGNED #
038E..03A1 ; PVALID # GR CAP LET UPSILON W TONOS..GR CAP
03A2 ; UNASSIGNED #
03A3..03CF ; PVALID # GREEK CAP LET SIGMA..GR CAP
03D0..03D2 ; FREE_PVAL # GR BETA SYM..GR UPSILON W HOOK
03D3..03D4 ; FREE_PVAL # GR UPSILON W ACUTE AND HOOK..GR UP
03D5..03D6 ; FREE_PVAL # GR PHI SYM..GR PI SYM
03D7..03EF ; PVALID # GR KAI SYM..COPT SM LET DEI
03F0..03F2 ; FREE_PVAL # GR KAPPA SYM..GR LUNATE SIGMA
03F3 ; PVALID # GREEK LET YOT
03F4..03F6 ; FREE_PVAL # GR CAP THETA..GR REV LUNATE EPSILON
03F7..03F8 ; PVALID # GR CAP LET SHO..GR SM LET SHO
03F9 ; FREE_PVAL # GREEK CAP LUNATE SIGMA SYM
03FA..0481 ; PVALID # GR CAP LET SAN..CYR SML LET KOPPA
0482 ; FREE_PVAL # CYR THOUSANDS SIGN
0483..0487 ; PVALID # COMB CYR TITLO..COMB CYR POK
0488..0489 ; FREE_PVAL # COMB CYR HUNDRED THOUSANDS SIGN..C
048A..0527 ; PVALID # CYR CAP LET SH I W TAIL..CYR S
0528..0530 ; UNASSIGNED # ..
0531..0556 ; PVALID # ARM CAP LET AYB..ARM CAP LET FEH
0557..0558 ; UNASSIGNED # ..
0559 ; PVALID # ARM MOD LET LEFT HALF RING
055A..055F ; FREE_PVAL # ARM APOS..ARM ABBREV
0560 ; UNASSIGNED #
0561..0586 ; PVALID # ARM SM LET AYB..ARMENIAN SM LE
0587 ; FREE_PVAL # ARM SM LIG ECH YIWN
0588 ; UNASSIGNED #
0589..058A ; FREE_PVAL # ARMENIAN FULL STOP..ARMENIAN HYPH
058B..058E ; UNASSIGNED # ..
058F ; FREE_PVAL # ARMENIAN DRAM SIGN
0590 ; UNASSIGNED #
0591..05BD ; PVALID # HEBR ACC ETNAHTA..HEBR PNT ME
05BE ; FREE_PVAL # HEBR PUNCT MAQAF
05BF ; PVALID # HEBR PNT RAFE
05C0 ; FREE_PVAL # HEBR PUNCT PASEQ
05C1..05C2 ; PVALID # HEBR PNT SHIN DOT..HEBR PNT SIN DOT
05C3 ; FREE_PVAL # HEBR PUNCT SOF PASUQ
05C4..05C5 ; PVALID # HEBR MARK UP DOT..HEBR MARK LOW DOT
05C6 ; FREE_PVAL # HEBR PUNCT NUN HAFUKHA
05C7 ; PVALID # HEBR PNT QAMATS QATAN
05C8..05CF ; UNASSIGNED # ..
05D0..05EA ; PVALID # HEBR LET ALEF..HEBR LET TAV
05EB..05EF ; UNASSIGNED # ..
05F0..05F2 ; PVALID # HEBR LIG YIDDISH DOUBLE VAV..HEBR L
05F3..05F4 ; CONTEXTO # HEBR PUNCT GERESH..HEBR PUNCTUATIO
05F5..05FF ; UNASSIGNED # ..
0600..0604 ; DISALLOWED # ARAB NUM SIGN..ARAB SIGN SAM
0605 ; UNASSIGNED # ..
0606..060F ; FREE_PVAL # AR-IND CUBE ROOT..ARAB SIGN MISRA
0610..061A ; PVALID # ARAB SIGN SALLALLAHOU ALAYHE ..AR
061B ; FREE_PVAL # ARAB SEMICOLON
061C ; DISALLOWED # ARAB LET MARK
061D..061D ; UNASSIGNED # ..
061E..061F ; FREE_PVAL # ARAB TRIPLE DOT PUNCT MARK..ARAB Q
0620..063F ; PVALID # ARAB LET KASH..ARAB LET FARSI YEH
0640 ; DISALLOWED # ARAB TATWEEL
0641..065F ; PVALID # ARAB LET FEH..ARAB WAVY HAMZA BEL
0660..0669 ; CONTEXTO # AR-IND DIG ZERO..AR-IND DIG
066A..066D ; FREE_PVAL # ARAB PCT SIGN..ARAB FIVE PNTED STA
066E..0674 ; PVALID # ARAB LET DOTLESS BEH..ARAB LET HIG
0675..0678 ; FREE_PVAL # ARAB LET HIGH HAMZA ALEF..ARAB LET
0679..06D3 ; PVALID # ARAB LET TTEH..ARAB LET YEH BARREE
06D4 ; FREE_PVAL # ARAB FULL STOP
06D5..06DC ; PVALID # ARAB LET AE..ARAB SM HIGH SEEN
06DD ; DISALLOWED # ARAB END OF AYAH
06DE ; FREE_PVAL # ARAB START OF RUB EL HIZB
06DF..06E8 ; PVALID # ARAB SM HIGH ROUNDED ZERO..ARAB SM
06E9 ; FREE_PVAL # ARAB PLACE OF SAJDAH
06EA..06EF ; PVALID # ARAB EMPTY CENTRE LOW STOP..ARAB LET
06F0..06F9 ; CONTEXTO # EXT AR-IND DIG ZERO..EXT A
06FA..06FF ; PVALID # ARAB LET SHEEN W DOT BEL..ARAB
0700..070D ; FREE_PVAL # SYR END OF PARA..SYR HARKLEAN AST
070E ; UNASSIGNED #
070F ; DISALLOWED # SYR ABBR MARK
0710..074A ; PVALID # SYR LET ALAPH..SYR BARREKH
074B..074C ; UNASSIGNED # ..
074D..07B1 ; PVALID # SYR LET SOGDIAN ZHAIN..THAANA LET N
07B2..07BF ; UNASSIGNED # ..
07C0..07F5 ; PVALID # NKO DIG ZERO..NKO LOW TONE APOS
07F6..07F9 ; FREE_PVAL # NKO SYM OO DENNEN..NKO EXCLAMATI
07FA ; DISALLOWED # NKO LAJANYALAN
07FB..07FF ; UNASSIGNED # ..
0800..082D ; PVALID # SAMAR LET ALAF..SAMAR MARK NEQUDA
082E..082F ; UNASSIGNED # ..
0830..083E ; FREE_PVAL # SAMAR PUNCT NEQUDAA..SAMAR PUN
083F ; UNASSIGNED #
0840..085B ; PVALID # MANDAIC LET HALQA..MANDAIC GEM
085C..085D ; UNASSIGNED # ..
085E ; FREE_PVAL # MANDAIC PUNCTUATION
085F..089F ; UNASSIGNED # ..
08A0 ; PVALID # ARAB LET BEH W SM V BEL
08A1 ; UNASSIGNED #
08A2..08AC ; PVALID # ARAB LET JEEM W 2 DOTS AB..ARAB
08AD..08E3 ; UNASSIGNED # ..
08E4..08FE ; PVALID # ARAB CURLY FATHA..ARAB DAMMA W
08FF ; UNASSIGNED #
0900..0963 ; PVALID # DEVAN SIGN INV CANDRABINDU..DEVAN V
0964..0965 ; FREE_PVAL # DEVAN DANDA..DEVAN DOUBLE DANDA
0966..096F ; PVALID # DEVAN DIG ZERO..DEVAN DIG NINE
0970 ; FREE_PVAL # DEVAN ABBR SIGN
0971..0977 ; PVALID # DEVAN SIGN HIGH SPACING DOT..DEVAN
0978 ; UNASSIGNED #
0979..097F ; PVALID # DEVAN LET ZHA..DEVAN
0980 ; UNASSIGNED #
0981..0983 ; PVALID # BENG SIGN CANDRABINDU..BENG SIGN VIS
0984 ; UNASSIGNED #
0985..098C ; PVALID # BENG LET A..BENG LET VOC L
098D..098E ; UNASSIGNED # ..
098F..0990 ; PVALID # BENG LET E..BENG LET AI
0991..0992 ; UNASSIGNED # ..
0993..09A8 ; PVALID # BENG LET O..BENG LET NA
09A9 ; UNASSIGNED #
09AA..09B0 ; PVALID # BENG LET PA..BENG LET RA
09B1 ; UNASSIGNED #
09B2 ; PVALID # BENG LET LA
09B3..09B5 ; UNASSIGNED # ..
09B6..09B9 ; PVALID # BENG LET SHA..BENG LET HA
09BA..09BB ; UNASSIGNED # ..
09BC..09C4 ; PVALID # BENG SIGN NUKTA..BENG VOW SIGN VOCAL
09C5..09C6 ; UNASSIGNED # ..
09C7..09C8 ; PVALID # BENG VOW SIGN E..BENG VOW SIGN AI
09C9..09CA ; UNASSIGNED # ..
09CB..09CE ; PVALID # BENG VOW SIGN O..BENG LET KHANDA
09CF..09D6 ; UNASSIGNED # ..
09D7 ; PVALID # BENG AU LEN MARK
09D8..09DB ; UNASSIGNED # ..
09DC..09DD ; PVALID # BENG LET RRA..BENG LET RHA
09DE ; UNASSIGNED #
09DF..09E3 ; PVALID # BENG LET YYA..BENG VOW SIG
09E4..09E5 ; UNASSIGNED # ..
09E6..09F1 ; PVALID # BENG DIG ZERO..BENG LET RA W L
09F2..09FB ; FREE_PVAL # BENG RUPEE MARK..BENG GANDA MARK
09FC..0A00 ; UNASSIGNED # ..
0A01..0A03 ; PVALID # GURMUKHI SIGN ADAK BINDI..GURMUKHI
0A04 ; UNASSIGNED #
0A05..0A0A ; PVALID # GURMUKHI LET A..GURMUKHI LET UU
0A0B..0A0E ; UNASSIGNED # ..
0A0F..0A10 ; PVALID # GURMUKHI LET EE..GURMUKHI LET AI
0A11..0A12 ; UNASSIGNED # ..
0A13..0A28 ; PVALID # GURMUKHI LET OO..GURMUKHI LET NA
0A29 ; UNASSIGNED #
0A2A..0A30 ; PVALID # GURMUKHI LET PA..GURMUKHI LET RA
0A31 ; UNASSIGNED #
0A32..0A33 ; PVALID # GURMUKHI LET LA..GURMUKHI LET LLA
0A34 ; UNASSIGNED #
0A35..0A36 ; PVALID # GURMUKHI LET VA..GURMUKHI LET SHA
0A37 ; UNASSIGNED #
0A38..0A39 ; PVALID # GURMUKHI LET SA..GURMUKHI LET HA
0A3A..0A3B ; UNASSIGNED # ..
0A3C ; PVALID # GURMUKHI SIGN NUKTA
0A3D ; UNASSIGNED #
0A3E..0A42 ; PVALID # GURMUKHI VOW SIGN AA..GURMUKHI V
0A43..0A46 ; UNASSIGNED # ..
0A47..0A48 ; PVALID # GURMUKHI VOW SIGN EE..GURMUKHI V
0A49..0A4A ; UNASSIGNED # ..
0A4B..0A4D ; PVALID # GURMUKHI VOW SIGN OO..GURMUKHI S
0A4E..0A50 ; UNASSIGNED # ..
0A51 ; PVALID # GURMUKHI SIGN UDAAT
0A52..0A58 ; UNASSIGNED # ..
0A59..0A5C ; PVALID # GURMUKHI LET KHHA..GURMUKHI LET RRA
0A5D ; UNASSIGNED #
0A5E ; PVALID # GURMUKHI LET FA
0A5F..0A65 ; UNASSIGNED # ..
0A66..0A75 ; PVALID # GURMUKHI DIG ZERO..GURMUKHI SIGN YA
0A76..0A80 ; UNASSIGNED # ..
0A81..0A83 ; PVALID # GUJARATI SIGN CANDRABINDU..GUJARATI
0A84 ; UNASSIGNED #
0A85..0A8D ; PVALID # GUJARATI LET A..GUJARATI VOW CAND
0A8E ; UNASSIGNED #
0A8F..0A91 ; PVALID # GUJARATI LET E..GUJARATI VOW CAND
0A92 ; UNASSIGNED #
0A93..0AA8 ; PVALID # GUJARATI LET O..GUJARATI LET NA
0AA9 ; UNASSIGNED #
0AAA..0AB0 ; PVALID # GUJARATI LET PA..GUJARATI LET RA
0AB1 ; UNASSIGNED #
0AB2..0AB3 ; PVALID # GUJARATI LET LA..GUJARATI LET LLA
0AB4 ; UNASSIGNED #
0AB5..0AB9 ; PVALID # GUJARATI LET VA..GUJARATI LET HA
0ABA..0ABB ; UNASSIGNED # ..
0ABC..0AC5 ; PVALID # GUJARATI SIGN NUKTA..GUJARATI VOW
0AC6 ; UNASSIGNED #
0AC7..0AC9 ; PVALID # GUJARATI VOW SIGN E..GUJARATI VOW
0ACA ; UNASSIGNED #
0ACB..0ACD ; PVALID # GUJARATI VOW SIGN O..GUJARATI SIG
0ACE..0ACF ; UNASSIGNED # ..
0AD0 ; PVALID # GUJARATI OM
0AD1..0ADF ; UNASSIGNED # ..
0AE0..0AE3 ; PVALID # GUJARATI LET VOC RR..GUJARATI V
0AE4..0AE5 ; UNASSIGNED # ..
0AE6..0AEF ; PVALID # GUJARATI DIG ZERO..GUJARATI DIG NINE
0AF0..0AF1 ; FREE_PVAL # GUJARATI ABBR SIGN..GUJARATI RUPEE S
0AF2..0B00 ; UNASSIGNED # ..
0B01..0B03 ; PVALID # ORIYA SIGN CANDRABINDU..ORIYA SIGN V
0B04 ; UNASSIGNED #
0B05..0B0C ; PVALID # ORIYA LET A..ORIYA LET VOC L
0B0D..0B0E ; UNASSIGNED # ..
0B0F..0B10 ; PVALID # ORIYA LET E..ORIYA LET AI
0B11..0B12 ; UNASSIGNED # ..
0B13..0B28 ; PVALID # ORIYA LET O..ORIYA LET NA
0B29 ; UNASSIGNED #
0B2A..0B30 ; PVALID # ORIYA LET PA..ORIYA LET RA
0B31 ; UNASSIGNED #
0B32..0B33 ; PVALID # ORIYA LET LA..ORIYA LET LLA
0B34 ; UNASSIGNED #
0B35..0B39 ; PVALID # ORIYA LET VA..ORIYA LET HA
0B3A..0B3B ; UNASSIGNED # ..
0B3C..0B44 ; PVALID # ORIYA SIGN NUKTA..ORIYA VOW SIGN
0B45..0B46 ; UNASSIGNED # ..
0B47..0B48 ; PVALID # ORIYA VOW SIGN E..ORIYA VOW SIG
0B49..0B4A ; UNASSIGNED # ..
0B4B..0B4D ; PVALID # ORIYA VOW SIGN O..ORIYA SIGN VIRA
0B4E..0B55 ; UNASSIGNED # ..
0B56..0B57 ; PVALID # ORIYA AI LEN MARK..ORIYA AU LENG
0B58..0B5B ; UNASSIGNED # ..
0B5C..0B5D ; PVALID # ORIYA LET RRA..ORIYA LET RHA
0B5E ; UNASSIGNED #
0B5F..0B63 ; PVALID # ORIYA LET YYA..ORIYA VOW SIGN VOCA
0B64..0B65 ; UNASSIGNED # ..
0B66..0B6F ; PVALID # ORIYA DIG ZERO..ORIYA DIG NINE
0B70 ; FREE_PVAL # ORIYA ISSHAR
0B71 ; PVALID # ORIYA LET WA
0B72..0B77 ; FREE_PVAL # ORIYA FRACT ONE QUART..ORIYA FRACT
0B78..0B81 ; UNASSIGNED # ..
0B82..0B83 ; PVALID # TAMIL SIGN ANUSVARA..TAMIL SIGN VIS
0B84 ; UNASSIGNED #
0B85..0B8A ; PVALID # TAMIL LET A..TAMIL LET UU
0B8B..0B8D ; UNASSIGNED # ..
0B8E..0B90 ; PVALID # TAMIL LET E..TAMIL LET AI
0B91 ; UNASSIGNED #
0B92..0B95 ; PVALID # TAMIL LET O..TAMIL LET KA
0B96..0B98 ; UNASSIGNED # ..
0B99..0B9A ; PVALID # TAMIL LET NGA..TAMIL LET CA
0B9B ; UNASSIGNED #
0B9C ; PVALID # TAMIL LET JA
0B9D ; UNASSIGNED #
0B9E..0B9F ; PVALID # TAMIL LET NYA..TAMIL LET TTA
0BA0..0BA2 ; UNASSIGNED # ..
0BA3..0BA4 ; PVALID # TAMIL LET NNA..TAMIL LET TA
0BA5..0BA7 ; UNASSIGNED # ..
0BA8..0BAA ; PVALID # TAMIL LET NA..TAMIL LET PA
0BAB..0BAD ; UNASSIGNED # ..
0BAE..0BB9 ; PVALID # TAMIL LET MA..TAMIL LET HA
0BBA..0BBD ; UNASSIGNED # ..
0BBE..0BC2 ; PVALID # TAMIL VOW SIGN AA..TAMIL VOW SI
0BC3..0BC5 ; UNASSIGNED # ..
0BC6..0BC8 ; PVALID # TAMIL VOW SIGN E..TAMIL VOW SIG
0BC9 ; UNASSIGNED #
0BCA..0BCD ; PVALID # TAMIL VOW SIGN O..TAMIL SIGN VIRA
0BCE..0BCF ; UNASSIGNED # ..
0BD0 ; PVALID # TAMIL OM
0BD1..0BD6 ; UNASSIGNED # ..
0BD7 ; PVALID # TAMIL AU LEN MARK
0BD8..0BE5 ; UNASSIGNED # ..
0BE6..0BEF ; PVALID # TAMIL DIG ZERO..TAMIL DIG NINE
0BF0..0BFA ; FREE_PVAL # TAMIL NUM TEN..TAMIL NUM SIGN
0BFB..0C00 ; UNASSIGNED # ..
0C01..0C03 ; PVALID # TELUGU SIGN CANDRABINDU..TELUGU SIG
0C04 ; UNASSIGNED #
0C05..0C0C ; PVALID # TELUGU LET A..TELUGU LET VOC L
0C0D ; UNASSIGNED #
0C0E..0C10 ; PVALID # TELUGU LET E..TELUGU LET AI
0C11 ; UNASSIGNED #
0C12..0C28 ; PVALID # TELUGU LET O..TELUGU LET NA
0C29 ; UNASSIGNED #
0C2A..0C33 ; PVALID # TELUGU LET PA..TELUGU LET LLA
0C34 ; UNASSIGNED #
0C35..0C39 ; PVALID # TELUGU LET VA..TELUGU LET HA
0C3A..0C3C ; UNASSIGNED # ..
0C3D..0C44 ; PVALID # TELUGU SIGN AVAGRAHA..TELUGU VOW SI
0C45 ; UNASSIGNED #
0C46..0C48 ; PVALID # TELUGU VOW SIGN E..TELUGU VOW SIGN
0C49 ; UNASSIGNED #
0C4A..0C4D ; PVALID # TELUGU VOW SIGN O..TELUGU SIGN VIRA
0C4E..0C54 ; UNASSIGNED # ..
0C55..0C56 ; PVALID # TELUGU LEN MARK..TELUGU AI LEN MARK
0C57 ; UNASSIGNED #
0C58..0C59 ; PVALID # TELUGU LET TSA..TELUGU LET DZA
0C5A..0C5F ; UNASSIGNED # ..
0C60..0C63 ; PVALID # TELUGU LET VOC RR..TELUGU VOW S
0C64..0C65 ; UNASSIGNED # ..
0C66..0C6F ; PVALID # TELUGU DIG ZERO..TELUGU DIG NINE
0C70..0C77 ; UNASSIGNED # ..
0C78..0C7F ; FREE_PVAL # TELUGU FRACTION DIG ZERO..TELUGU S
0C80..0C81 ; UNASSIGNED # ..
0C82..0C83 ; PVALID # KANNADA SIGN ANUSVARA..KANNADA SIGN
0C84 ; UNASSIGNED #
0C85..0C8C ; PVALID # KANNADA LET A..KANNADA LET VOC L
0C8D ; UNASSIGNED #
0C8E..0C90 ; PVALID # KANNADA LET E..KANNADA LET AI
0C91 ; UNASSIGNED #
0C92..0CA8 ; PVALID # KANNADA LET O..KANNADA LET NA
0CA9 ; UNASSIGNED #
0CAA..0CB3 ; PVALID # KANNADA LET PA..KANNADA LET LLA
0CB4 ; UNASSIGNED #
0CB5..0CB9 ; PVALID # KANNADA LET VA..KANNADA LET HA
0CBA..0CBB ; UNASSIGNED # ..
0CBC..0CC4 ; PVALID # KANNADA SIGN NUKTA..KANNADA VOW SIG
0CC5 ; UNASSIGNED #
0CC6..0CC8 ; PVALID # KANNADA VOW SIGN E..KANNADA VOW SIG
0CC9 ; UNASSIGNED #
0CCA..0CCD ; PVALID # KANNADA VOW SIGN O..KANNADA SIGN VI
0CCE..0CD4 ; UNASSIGNED # ..
0CD5..0CD6 ; PVALID # KANNADA LEN MARK..KANNADA AI LEN MA
0CD7..0CDD ; UNASSIGNED # ..
0CDE ; PVALID # KANNADA LET FA
0CDF ; UNASSIGNED #
0CE0..0CE3 ; PVALID # KANNADA LET VOC RR..KANNADA VOW SIG
0CE4..0CE5 ; UNASSIGNED # ..
0CE6..0CEF ; PVALID # KANNADA DIG ZERO..KANNADA DIG NINE
0CF0 ; UNASSIGNED #
0CF1..0CF2 ; PVALID # KANNADA SIGN JIHVAMULIYA..KANNADA S
0CF3..0D01 ; UNASSIGNED # ..
0D02..0D03 ; PVALID # MALAY SIGN ANUSVARA..MALAY SIGN VIS
0D04 ; UNASSIGNED #
0D05..0D0C ; PVALID # MALAY LET A..MALAY LET VOC
0D0D ; UNASSIGNED #
0D0E..0D10 ; PVALID # MALAY LET E..MALAY LET AI
0D11 ; UNASSIGNED #
0D12..0D3A ; PVALID # MALAY LET O..MALAY LET TTTA
0D3B..0D3C ; UNASSIGNED # ..
0D3D..0D44 ; PVALID # MALAY SIGN AVAGRAHA..MALAY VOW SIG
0D45 ; UNASSIGNED #
0D46..0D48 ; PVALID # MALAY VOW SIGN E..MALAY VOW SIGN
0D49 ; UNASSIGNED #
0D4A..0D4E ; PVALID # MALAY VOW SIGN O..MALAY LET DOT REP
0D4F..0D56 ; UNASSIGNED # ..
0D57 ; PVALID # MALAY AU LEN MARK
0D58..0D5F ; UNASSIGNED # ..
0D60..0D63 ; PVALID # MALAY LET VOC RR..MALAY VOW
0D64..0D65 ; UNASSIGNED # ..
0D66..0D6F ; PVALID # MALAY DIG ZERO..MALAY DIG NINE
0D70..0D75 ; FREE_PVAL # MALAY NUM TEN..MALAY FRACTION THR
0D76..0D78 ; UNASSIGNED # ..
0D79 ; FREE_PVAL # MALAY DATE MARK
0D7A..0D7F ; PVALID # MALAY LET CHILLU NN..MALAY LET
0D80..0D81 ; UNASSIGNED # ..
0D82..0D83 ; PVALID # SINH SIGN ANUSVARAYA..SINH SIGN VIS
0D84 ; UNASSIGNED #
0D85..0D96 ; PVALID # SINH LET AYANNA..SINH LET AUYANN
0D97..0D99 ; UNASSIGNED # ..
0D9A..0DB1 ; PVALID # SINH LET ALPAPRAANA KAYANNA..SINH L
0DB2 ; UNASSIGNED #
0DB3..0DBB ; PVALID # SINH LET SANYAKA DAYANNA..SINH LETT
0DBC ; UNASSIGNED #
0DBD ; PVALID # SINH LET DANTAJA LAYANNA
0DBE..0DBF ; UNASSIGNED # ..
0DC0..0DC6 ; PVALID # SINH LET VAYANNA..SINH LET FAYAN
0DC7..0DC9 ; UNASSIGNED # ..
0DCA ; PVALID # SINH SIGN AL-LAKUNA
0DCB..0DCE ; UNASSIGNED # ..
0DCF..0DD4 ; PVALID # SINH VOW SIGN AELA-PILLA..SINH VOW
0DD5 ; UNASSIGNED #
0DD6 ; PVALID # SINH VOW SIGN DIGA PAA-PILLA
0DD7 ; UNASSIGNED #
0DD8..0DDF ; PVALID # SINH VOW SIGN GAETTA-PILLA..SINH VO
0DE0..0DF1 ; UNASSIGNED # ..
0DF2..0DF3 ; PVALID # SINH VOW SIGN DIGA GAETTA-PILLA..SI
0DF4 ; FREE_PVAL # SINH PUNCT KUNDDALIYA
0DF5..0E00 ; UNASSIGNED # ..
0E01..0E32 ; PVALID # THAI CHAR KO KAI..THAI CHAR SARA A
0E33 ; FREE_PVAL # THAI CHAR SARA AM
0E34..0E3A ; PVALID # THAI CHAR SARA I..THAI CHAR PHINTH
0E3B..0E3E ; UNASSIGNED # ..
0E3F ; FREE_PVAL # THAI CURRENCY SYM BAHT
0E40..0E4E ; PVALID # THAI CHAR SARA E..THAI CHAR YAMAKK
0E4F ; FREE_PVAL # THAI CHAR FONGMAN
0E50..0E59 ; PVALID # THAI DIG ZERO..THAI DIG NINE
0E5A..0E5B ; FREE_PVAL # THAI CHAR ANGKHANKHU..THAI CHAR KH
0E5C..0E80 ; UNASSIGNED # ..
0E81..0E82 ; PVALID # LAO LET KO..LAO LET KHO SUNG
0E83 ; UNASSIGNED #
0E84 ; PVALID # LAO LET KHO TAM
0E85..0E86 ; UNASSIGNED # ..
0E87..0E88 ; PVALID # LAO LET NGO..LAO LET CO
0E89 ; UNASSIGNED #
0E8A ; PVALID # LAO LET SO TAM
0E8B..0E8C ; UNASSIGNED # ..
0E8D ; PVALID # LAO LET NYO
0E8E..0E93 ; UNASSIGNED # ..
0E94..0E97 ; PVALID # LAO LET DO..LAO LET THO TAM
0E98 ; UNASSIGNED #
0E99..0E9F ; PVALID # LAO LET NO..LAO LET FO SUNG
0EA0 ; UNASSIGNED #
0EA1..0EA3 ; PVALID # LAO LET MO..LAO LET LO LING
0EA4 ; UNASSIGNED #
0EA5 ; PVALID # LAO LET LO LOOT
0EA6 ; UNASSIGNED #
0EA7 ; PVALID # LAO LET WO
0EA8..0EA9 ; UNASSIGNED # ..
0EAA..0EAB ; PVALID # LAO LET SO SUNG..LAO LET HO SUNG
0EAC ; UNASSIGNED #
0EAD..0EB2 ; PVALID # LAO LET O..LAO VOW SIGN AA
0EB3 ; FREE_PVAL # LAO VOW SIGN AM
0EB4..0EB9 ; PVALID # LAO VOW SIGN I..LAO VOW SIGN UU
0EBA ; UNASSIGNED #
0EBB..0EBD ; PVALID # LAO VOW SIGN MAI KON..LAO SEMIVOW SIG
0EBE..0EBF ; UNASSIGNED # ..
0EC0..0EC4 ; PVALID # LAO VOW SIGN E..LAO VOW SIGN AI
0EC5 ; UNASSIGNED #
0EC6 ; PVALID # LAO KO LA
0EC7 ; UNASSIGNED #
0EC8..0ECD ; PVALID # LAO TONE MAI EK..LAO NIGGAHITA
0ECE..0ECF ; UNASSIGNED # ..
0ED0..0ED9 ; PVALID # LAO DIG ZERO..LAO DIG NINE
0EDA..0EDB ; UNASSIGNED # ..
0EDC..0EDD ; FREE_PVAL # LAO HO NO..LAO HO MO
0EDE..0EDF ; PVALID # LAO LET KHMU GO..LAO LET KHMU NYO
0EE0..0EFF ; UNASSIGNED # ..
0F00 ; PVALID # TIB SYLL OM
0F01..0F0A ; FREE_PVAL # TIB MARK GTER YIG MGO TRUNC A..TIB
0F0B ; PVALID # TIB MARK INTERSYLLABIC TSHEG
0F0C..0F17 ; FREE_PVAL # TIB MARK DELIMITER TSHEG BSTAR..TIB
0F18..0F19 ; PVALID # TIB ASTROLOGICAL SIGN -KHYUD PA..TIB
0F1A..0F1F ; FREE_PVAL # TIB SIGN RDEL DKAR GCIG..TIB SIGN RD
0F20..0F29 ; PVALID # TIB DIG ZERO..TIB DIG NINE
0F2A..0F34 ; FREE_PVAL # TIB DIG HALF ONE..TIB MARK BSDUS R
0F35 ; PVALID # TIB MARK NGAS BZUNG NYI ZLA
0F36 ; FREE_PVAL # TIB MARK CARET DZUD RTAGS BZHI MIG C
0F37 ; PVALID # TIB MARK NGAS BZUNG SGOR RTAGS
0F38 ; FREE_PVAL # TIB MARK CHE MGO
0F39 ; PVALID # TIB MARK TSA PHRU
0F3A..0F3D ; FREE_PVAL # TIB MARK GUG RTAGS GYON..TIB MARK AN
0F3E..0F47 ; PVALID # TIB SIGN YAR TSHES..TIB LET JA
0F48 ; UNASSIGNED #
0F49..0F6C ; PVALID # TIB LET NYA..TIB LET RRA
0F6D..0F70 ; UNASSIGNED # ..
0F71..0F76 ; PVALID # TIB VOW SIGN AA..TIB VOW SIGN VO
0F77 ; FREE_PVAL # TIB VOW SIGN VO RR
0F78 ; PVALID # TIB VOW SIGN VO L
0F79 ; FREE_PVAL # TIB VOW SIGN VO LL
0F7A..0F84 ; PVALID # TIB VOW SIGN E..TIB MARK H
0F85 ; FREE_PVAL # TIB MARK PALUTA
0F86..0F8F ; PVALID # TIB SIGN LCI RTAGS..TIB SUBJOIN S
0F90..0F97 ; PVALID # TIB SUBJOIN LET KA..TIB SUBJOIN
0F98 ; UNASSIGNED #
0F99..0FBC ; PVALID # TIB SUBJOIN LET NYA..TIB SUBJOI
0FBD ; UNASSIGNED #
0FBE..0FC5 ; FREE_PVAL # TIB KU RU KHA..TIB SYM RDO RJE
0FC6 ; PVALID # TIB SYM PADMA GDAN
0FC7..0FCC ; FREE_PVAL # TIB SYM RDO RJE RGYA GRAM..TIB SY
0FCD ; UNASSIGNED #
0FCE..0FDA ; FREE_PVAL # TIB SIGN RDEL NAG RDEL DKAR..TIB MA
0FDB..0FFF ; UNASSIGNED # ..
1000..1049 ; PVALID # MYAN LET KA..MYAN DIG NINE
104A..104F ; FREE_PVAL # MYAN SIGN LITTLE SECTION..MYAN SYM
1050..109D ; PVALID # MYAN LET SHA..MYAN VOW SIGN AITON
109E..109F ; FREE_PVAL # MYAN SYM SHAN ONE..MYAN SYM SHAN EX
10A0..10C5 ; PVALID # GEORG CAP LET AN..GEORG CAP LET HOE
10C6 ; UNASSIGNED #
10C7 ; PVALID # GEORG CAP LET YN
10C8..10CC ; UNASSIGNED # ..
10CD ; PVALID # GEORG CAP LET AEN
10CE..10CF ; UNASSIGNED # ..
10D0..10FA ; PVALID # GEORG LET AN..GEORG LET AIN
10FB..10FC ; FREE_PVAL # GEORG PARA SEP..MOD LET GEORG NAR
10FD..10FF ; PVALID # GEORG LET AEN..GEORG LET LABIAL
1100..11FF ; DISALLOWED # HANGUL CHO KIYEOK..HANGUL JONG SSA
1200..1248 ; PVALID # ETHI SYL HA..ETHI SYL QWA
1249 ; UNASSIGNED #
124A..124D ; PVALID # ETHI SYL QWI..ETHI SYL QWE
124E..124F ; UNASSIGNED # ..
1250..1256 ; PVALID # ETHI SYL QHA..ETHI SYL QHO
1257 ; UNASSIGNED #
1258 ; PVALID # ETHI SYL QHWA
1259 ; UNASSIGNED #
125A..125D ; PVALID # ETHI SYL QHWI..ETHI SYL QH
125E..125F ; UNASSIGNED # ..
1260..1288 ; PVALID # ETHI SYL BA..ETHI SYL XWA
1289 ; UNASSIGNED #
128A..128D ; PVALID # ETHI SYL XWI..ETHI SYL XWE
128E..128F ; UNASSIGNED # ..
1290..12B0 ; PVALID # ETHI SYL NA..ETHI SYL KWA
12B1 ; UNASSIGNED #
12B2..12B5 ; PVALID # ETHI SYL KWI..ETHI SYL KWE
12B6..12B7 ; UNASSIGNED # ..
12B8..12BE ; PVALID # ETHI SYL KXA..ETHI SYL KXO
12BF ; UNASSIGNED #
12C0 ; PVALID # ETHI SYL KXWA
12C1 ; UNASSIGNED #
12C2..12C5 ; PVALID # ETHI SYL KXWI..ETHI SYL KX
12C6..12C7 ; UNASSIGNED # ..
12C8..12D6 ; PVALID # ETHI SYL WA..ETHI SYL PHAR
12D7 ; UNASSIGNED #
12D8..1310 ; PVALID # ETHI SYL ZA..ETHI SYL GWA
1311 ; UNASSIGNED #
1312..1315 ; PVALID # ETHI SYL GWI..ETHI SYL GWE
1316..1317 ; UNASSIGNED # ..
1318..135A ; PVALID # ETHI SYL GGA..ETHI SYL FYA
135B..135C ; UNASSIGNED # ..
135D..135F ; PVALID # ETHI COMB GEM AND VOW..ETHI COMB GE
1360..137C ; FREE_PVAL # ETHI SECT MARK..ETHI NUM TEN THOUS
137D..137F ; UNASSIGNED # ..
1380..138F ; PVALID # ETHI SYL SEBATBEIT MWA..ETHI SYL PW
1390..1399 ; FREE_PVAL # ETHI TON MARK YIZET..ETHI TON MARK
139A..139F ; UNASSIGNED # ..
13A0..13F4 ; PVALID # CHEROKEE LET A..CHEROKEE LET YV
13F5..13FF ; UNASSIGNED # ..
1400 ; FREE_PVAL # CANAD SYL HYPHEN
1401..166C ; PVALID # CANAD SYL E..CANAD SYL CAR
166D..166E ; FREE_PVAL # CANAD SYL CHI SIGN..CANAD SYLLAB
166F..167F ; PVALID # CANAD SYL QAI..CANAD SYL B
1680 ; FREE_PVAL # OGHAM SPACE MARK
1681..169A ; PVALID # OGHAM LET BEITH..OGHAM LET PEITH
169B..169C ; FREE_PVAL # OGHAM FEATHER MARK..OGHAM REV FEAT
169D..169F ; UNASSIGNED # ..
16A0..16EA ; PVALID # RUNIC LET FEHU FEOH FE F..RUNIC LET
16EB..16F0 ; FREE_PVAL # RUNIC SINGLE PUNCT..RUNIC BELGTHOR
16F1..16FF ; UNASSIGNED # ..
1700..170C ; PVALID # TAGALOG LET A..TAGALOG LET YA
170D ; UNASSIGNED #
170E..1714 ; PVALID # TAGALOG LET LA..TAGALOG SIGN VIRAMA
1715..171F ; UNASSIGNED # ..
1720..1734 ; PVALID # HANUNOO LET A..HANUNOO SIGN PAMUDPO
1735..1736 ; FREE_PVAL # PHILIP SINGLE PUNCT..PHILIP DOUBLE
1737..173F ; UNASSIGNED # ..
1740..1753 ; PVALID # BUHID LET A..BUHID VOW SIGN U
1754..175F ; UNASSIGNED # ..
1760..176C ; PVALID # TAGBANWA LET A..TAGBANWA LET YA
176D ; UNASSIGNED #
176E..1770 ; PVALID # TAGBANWA LET LA..TAGBANWA LET SA
1771 ; UNASSIGNED #
1772..1773 ; PVALID # TAGBANWA VOW SIGN I..TAGBANWA VOW S
1774..177F ; UNASSIGNED # ..
2164 1780..17B3 ; PVALID # KHMER LET KA..KHMER IND VOW QAU 2165 17B4..17B5 ; DISALLOWED # KHMER VOW INH AQ..KHMER VOW INH AA 2166 17B6..17D3 ; PVALID # KHMER VOW SIGN AA..KHMER SIGN BATHA 2167 17D4..17D6 ; FREE_PVAL # KHMER SIGN KHAN..KHMER SIGN CAMNUC 2168 17D7 ; PVALID # KHMER SIGN LEK TOO 2169 17D8..17DB ; FREE_PVAL # KHMER SIGN BEYYAL..KHMER CURR SYM R 2170 17DC..17DD ; PVALID # KHMER SIGN AVAKRAHASANYA..KHMER SIG 2171 17DE..17DF ; UNASSIGNED # .. 2172 17E0..17E9 ; PVALID # KHMER DIG ZERO..KHMER DIG NINE 2173 17EA..17EF ; UNASSIGNED # .. 2174 17F0..17F9 ; FREE_PVAL # KHMER SYM LEK ATTAK SON..KHMER SYM 2175 17FA..17FF ; UNASSIGNED # .. 2176 1800..180A ; FREE_PVAL # MONG BIRGA..MONG NIRUGU 2177 180B..180E ; DISALLOWED # MONG FREE VAR SEL ONE..MONG VOW SEP 2178 180F ; UNASSIGNED # 2179 1810..1819 ; PVALID # MONG DIG ZERO..MONG DIG NINE 2180 181A..181F ; UNASSIGNED # .. 2181 1820..1877 ; PVALID # MONG LET A..MONG LET MANCHU 2182 1878..187F ; UNASSIGNED # .. 2183 1880..18AA ; PVALID # MONG LET ALI GALI ANUSVARA ONE..MON 2184 18AB..18AF ; UNASSIGNED # .. 2185 18B0..18F5 ; PVALID # CAN SYL OY..CAN SYL CA 2186 18F6..18FF ; UNASSIGNED # .. 2187 1900..191C ; PVALID # LIMBU VOW-CARRIER LET..LIMBU LET HA 2188 191D..191F ; UNASSIGNED # .. 2189 1920..192B ; PVALID # LIMBU VOW SIGN A..LIMBU SUBJOIN LET 2190 192C..192F ; UNASSIGNED # .. 2191 1930..193B ; PVALID # LIMBU SM LET KA..LIMBU SIGN SA-I 2192 193C..193F ; UNASSIGNED # .. 2193 1940 ; FREE_PVAL # LIMBU SIGN LOO 2194 1941..1943 ; UNASSIGNED # .. 2195 1944..1945 ; FREE_PVAL # LIMBU EXCLAM MARK..LIMBU QUEST MARK 2196 1946..196D ; PVALID # LIMBU DIG ZERO..TAI LE LET AI 2197 196E..196F ; UNASSIGNED # .. 2198 1970..1974 ; PVALID # TAI LE LET TONE-2..TAI LE LET TONE- 2199 1975..197F ; UNASSIGNED # .. 2200 1980..19AB ; PVALID # NEW TAI LUE LET HIGH QA..NEW TAI LU 2201 19AC..19AF ; UNASSIGNED # .. 2202 19B0..19C9 ; PVALID # NEW TAI LUE VOW SIGN VOW SHORT..NEW 2203 19CA..19CF ; UNASSIGNED # .. 
2204 19D0..19D9 ; PVALID # NEW TAI LUE DIG ZERO..NEW TAI DIG N 2205 19DA ; DISALLOWED # NEW TAI LUE THAM 2206 19DB..19DD ; UNASSIGNED # .. 2207 19DE..19FF ; FREE_PVAL # NEW TAI LUE SIGN LAE..KHMER SYM DAP 2208 1A00..1A1B ; PVALID # BUGIN LET KA..BUGIN VOW SIGN AE 2209 1A1C..1A1D ; UNASSIGNED # .. 2210 1A1E..1A1F ; FREE_PVAL # BUGIN PALLAWA..BUGIN END OF SECTION 2211 1A20..1A5E ; PVALID # TAI THAM LET HIGH KA..TAI THAM CONS 2212 1A5F ; UNASSIGNED # 2213 1A60..1A7C ; PVALID # TAI THAM SIGN SAKOT..TAI THAM SIGN 2214 1A7D..1A7E ; UNASSIGNED # .. 2215 1A7F..1A89 ; PVALID # TAI THAM COMB CRYPT DOT..TAI THAM D 2216 1A8A..1A8F ; UNASSIGNED # .. 2217 1A90..1A99 ; PVALID # TAI THAM THAM DIG ZERO..TAI THAM TH 2218 1A9A..1A9F ; UNASSIGNED # .. 2219 1AA0..1AA6 ; FREE_PVAL # TAI THAM SIGN WIANG..TAI THAM SIGN 2220 1AA7 ; PVALID # TAI THAM SIGN MAI YAMOK 2221 1AA8..1AAD ; FREE_PVAL # TAI THAM SIGN KAAN..TAI THAM SIGN C 2222 1AAE..1AFF ; UNASSIGNED # .. 2223 1B00..1B4B ; PVALID # BAL SIGN ULU RICEM..BAL LET ASYURA 2224 1B4C..1B4F ; UNASSIGNED # .. 2225 1B50..1B59 ; PVALID # BAL DIG ZERO..BAL DIG NINE 2226 1B5A..1B6A ; FREE_PVAL # BAL PANTI..BAL MUS SYM DANG 2227 1B6B..1B73 ; PVALID # BAL MUS SYM COMB TEGEH..BAL MUS 2228 1B74..1B7C ; FREE_PVAL # BAL MUS SYM RIGHT-HAND OPEN DUG 2229 1B7D..1B7F ; UNASSIGNED # .. 2230 1B80..1BF3 ; PVALID # SUND SIGN PANYECEK..BATAK PANONGONAN 2231 1BF4..1BFB ; UNASSIGNED # .. 2232 1BFC..1BFF ; FREE_PVAL # BATAK SYM BINDU NA METEK..BATAK SYM 2233 1C00..1C37 ; PVALID # LEPCHA LET KA..LEPCHA SIGN NUKTA 2234 1C38..1C3A ; UNASSIGNED # .. 2235 1C3B..1C3F ; FREE_PVAL # LEPCHA PUNCT TA-ROL..LEPCHA PUNCT T 2236 1C40..1C49 ; PVALID # LEPCHA DIG ZERO..LEPCHA DIG NINE 2237 1C4A..1C4C ; UNASSIGNED # .. 2238 1C4D..1C7D ; PVALID # LEPCHA LET TTA..OL CHIKI AHAD 2239 1C7E..1C7F ; FREE_PVAL # OL CHIKI PUNCT MUCAAD..OL CHIKI PUN 2240 1C80..1CBF ; UNASSIGNED # .. 2241 1CC0..1CC7 ; FREE_PVAL # SUNDA PUNCT BINDU SURYA..SUNDA PUNC 2242 1CC8..1CCF ; UNASSIGNED # .. 
2243 1CD0..1CD2 ; PVALID # VED TONE KARSHANA..VED TONE PRENKHA 2244 1CD3 ; FREE_PVAL # VED SIGN NIHSHVASA 2245 1CD4..1CF6 ; PVALID # VED SIGN YAJURVEDIC MID SVARITA..VE 2246 1CF7..1CFF ; UNASSIGNED # .. 2247 1D00..1D2B ; PVALID # LAT LET SM CAP A..CYR LET SM 2248 1D2C..1D2E ; FREE_PVAL # MOD LET CAP A..MOD LET C 2249 1D2F ; PVALID # MOD LET CAP BARRED B 2250 1D30..1D3A ; FREE_PVAL # MOD LET CAP D..MOD LET C 2251 1D3B ; PVALID # MOD LET CAP REV N 2252 1D3C..1D4D ; FREE_PVAL # MOD LET CAP O..MOD LET S 2253 1D4E ; PVALID # MOD LET SM TURNED I 2254 1D4F..1D6A ; FREE_PVAL # MOD LET SM K..GREEK SUB SMA 2255 1D6B..1D77 ; PVALID # LAT SM LET UE..LAT SM LET TU 2256 1D78 ; FREE_PVAL # MOD LET CYR EN 2257 1D79..1D9A ; PVALID # LAT SM LET INSULAR G..LAT SM LE 2258 1D9B..1DBF ; FREE_PVAL # MOD LET SM TURNED ALPHA..MOD 2259 1DC0..1DE6 ; PVALID # COMB DOTTED GRAVE ACCENT..COMB LAT 2260 1DE7..1DFB ; UNASSIGNED # .. 2261 1DFC..1E99 ; PVALID # COMB DOUBLE INV BREVE BEL..LAT SM L 2262 1E9A ; FREE_PVAL # LAT SM LET A W R HALF RING 2263 1E9B..1F15 ; PVALID # LAT SM LET LONG S W BOT ABOVE..GR 2264 1F16..1F17 ; UNASSIGNED # .. 2265 1F18..1F1D ; FREE_PVAL # GREEK CAP LET EPSILON W PSILI..GRE 2266 1F1E..1F1F ; UNASSIGNED # .. 2267 1F20..1F45 ; PVALID # GREEK SM LET ETA W PSILI..GREEK SMA 2268 1F46..1F47 ; UNASSIGNED # .. 2269 1F48..1F4D ; FREE_PVAL # GREEK CAP LET OMICRON W PSILI..GRE 2270 1F4E..1F4F ; UNASSIGNED # .. 2271 1F50..1F57 ; PVALID # GREEK SM LET UPSILON W PSILI..GREEK 2272 1F58 ; UNASSIGNED # 2273 1F59 ; PVALID # GREEK CAP LET UPSILON W DASIA 2274 1F5A ; UNASSIGNED # 2275 1F5B ; PVALID # GREEK CAP LET UPSILON W DASIA AND 2276 1F5C ; UNASSIGNED # 2277 1F5D ; PVALID # GREEK CAP LET UPSILON W DASIA AND 2278 1F5E ; UNASSIGNED # 2279 1F5F..1F7D ; PVALID # GREEK CAP LET UPSILON W DASIA A..GR 2280 1F7E..1F7F ; UNASSIGNED # .. 
2281 1F80..1F87 ; PVALID # GREEK SM LET ALPHA W PSILI AND YPOG 2282 1F88..1F8F ; FREE_PVAL # GREEK CAP LET ALPHA W PSILI AND..GR 2283 1F90..1F97 ; PVALID # GREEK SM LET ETA W PSILI AND YP..GR 2284 1F98..1F9F ; FREE_PVAL # GREEK CAP LET ETA W PSILI AND P..GR 2285 1FA0..1FA7 ; PVALID # GREEK SM LET OMEGA W PSILI AND ..GR 2286 1FA8..1FAF ; FREE_PVAL # GREEK CAP LET OMEGA W PSILI AN..GR 2287 1FB0..1FB4 ; PVALID # GREEK SM LET ALPHA W VRACHY..GREEK 2288 1FB5 ; UNASSIGNED # 2289 1FB6..1FBB ; PVALID # GREEK SM LET ALPHA W PERISPOMEN..GR 2290 1FBC..1FBD ; FREE_PVAL # GREEK CAP LET ALPHA W PROSGEGRA..GR 2291 1FBE ; PVALID # GREEK PROSGEGRAMMENI 2292 1FBF..1FC1 ; FREE_PVAL # GREEK PSILI..GREEK DIALYTIKA AND PE 2293 1FC2..1FC4 ; PVALID # GREEK SM LET ETA W VARIA AND YP..GR 2294 1FC5 ; UNASSIGNED # 2295 1FC6..1FCB ; PVALID # GREEK SM LET ETA W PERISPOMENI..GR 2296 1FCC..1FCF ; FREE_PVAL # GREEK CAP LET ETA W PROSGEGRAM..GR 2297 1FD0..1FD3 ; PVALID # GREEK SM LET IOTA W VRACHY..GREEK S 2298 1FD4..1FD5 ; UNASSIGNED # .. 2299 1FD6..1FDB ; PVALID # GREEK SM LET IOTA W PERISPOMENI..GR 2300 1FDC ; UNASSIGNED # 2301 1FDD..1FDF ; FREE_PVAL # GREEK DASIA AND VARIA..GREEK DASIA 2302 1FE0..1FEC ; PVALID # GREEK SM LET UPSILON W VRACHY..GREE 2303 1FED..1FEF ; FREE_PVAL # GREEK DIALYTIKA AND VARIA..GREEK VA 2304 1FF0..1FF1 ; UNASSIGNED # .. 
2305 1FF2..1FF4 ; FREE_PVAL # GREEK SM LET OMEGA W VARIA AND YPOG 2306 1FF5 ; UNASSIGNED # 2307 1FF6..1FFB ; PVALID # GREEK SM LET OMEGA W PERISPOMEN..GR 2308 1FFC..1FFE ; FREE_PVAL # GREEK CAP LET OMEGA W PROSGEGRA..GR 2309 1FFF ; UNASSIGNED # 2310 2000..200A ; FREE_PVAL # EN QUAD..HAIR SPACE 2311 200B ; DISALLOWED # ZERO WIDTH SPACE 2312 200C..200D ; CONTEXTJ # ZERO WIDTH NON-JOINER..ZERO WIDTH J 2313 200E..200F ; DISALLOWED # LEFT-TO-RIGHT MARK..RIGHT-TO-LEFT M 2314 2010..2027 ; FREE_PVAL # HYPHEN..HYPHENATION POINT 2315 2028..202E ; DISALLOWED # LINE SEP..RIGHT-TO-LEFT OVERRIDE 2316 202F..205F ; FREE_PVAL # NARROW NO-BREAK SPACE..MED MATH SP 2317 2060..2064 ; DISALLOWED # WORD JOINER..INVISIBLE PLUS 2318 2065 ; UNASSIGNED # 2319 2066..206F ; DISALLOWED # LEFT-TO-RIGHT IS..NOM DIGIT SHAPES 2320 2070..2071 ; FREE_PVAL # SUPER ZERO..SUPER LAT SM LET I 2321 2072..2073 ; UNASSIGNED # .. 2322 2074..208E ; FREE_PVAL # SUPER FOUR..SUB RIGHT PARENTHESIS 2323 208F ; UNASSIGNED # 2324 2090..209C ; FREE_PVAL # LAT SUB SM LET A..LAT SUB SM LET T 2325 209D..209F ; UNASSIGNED # .. 2326 20A0..20BA ; FREE_PVAL # EURO-CURRENCY SIGN..TURKISH LIRA SI 2327 20BB..20CF ; UNASSIGNED # .. 2328 20D0..20DC ; PVALID # COMB LEFT HARPOON ABOVE..COMB FOUR 2329 20DD..20E0 ; FREE_PVAL # COMB ENC CIRC..COMB ENC CIRC BACKS 2330 20E1 ; PVALID # COMB L R ARROW ABOVE 2331 20E2..20E4 ; FREE_PVAL # COMB ENC SCREEN..COMB ENC UPWARD PO 2332 20E5..20F0 ; PVALID # COMB REV SOLIDUS OVERLAY..COMB ASTE 2333 20F1..20FF ; UNASSIGNED # .. 
2334 2100..2129 ; FREE_PVAL # ACCOUNT OF..TURNED GREEK SM LET IOT 2335 212A..212B ; PVALID # KELVIN SIGN..ANGSTROM SIGN 2336 212C..2131 ; FREE_PVAL # SCRIPT CAP C..SCRIPT CAP F 2337 2132 ; PVALID # TURNED CAP F 2338 2133..214D ; FREE_PVAL # SCRIPT CAP M..AKTIESELSKAB 2339 214E ; PVALID # TURNED SM F 2340 214F..2182 ; FREE_PVAL # SYM FOR SAMAR SOURCE..ROM NUM TEN T 2341 2183..2184 ; PVALID # ROM NUM REV ONE HUNDRED..LAT SM LET 2342 2185..2189 ; FREE_PVAL # ROM NUM SIX LATE FORM..VULGAR FRACT 2343 218A..218F ; UNASSIGNED # .. 2344 2190..23F3 ; FREE_PVAL # LEFTWARDS ARROW..HOURGLASS W FLO 2345 23F4..23FF ; UNASSIGNED # .. 2346 2400..2426 ; FREE_PVAL # SYM FOR NULL..SYM FOR SUB FORM 2347 2427..243F ; UNASSIGNED # .. 2348 2440..244A ; FREE_PVAL # OCR HOOK..OCR DOUBLE BACKSLASH 2349 244B..245F ; UNASSIGNED # .. 2350 2460..26FF ; FREE_PVAL # CIRCLED DIG ONE..WHITE FLAG W HORIZ 2351 2700 ; UNASSIGNED # 2352 2701..2B4C ; FREE_PVAL # UP BLADE SCISSORS..RIGHTWARDS ARROW 2353 2B4D..2B4F ; UNASSIGNED # .. 2354 2B50..2B59 ; FREE_PVAL # WHITE MEDIUM STAR..HEAVY CIRCLED SA 2355 2B5A..2BFF ; UNASSIGNED # .. 2356 2C00..2C2E ; PVALID # GLAG CAP LET AZU..GLAG CA 2357 2C2F ; UNASSIGNED # 2358 2C30..2C5E ; PVALID # GLAG SM LET AZU..GLAG SMAL 2359 2C5F ; UNASSIGNED # 2360 2C60..2C7B ; PVALID # LAT CAP LET L W DOUBLE BAR..LAT SM 2361 2C7C..2C7D ; FREE_PVAL # LAT SUB SM LET J..MOD LET CAP V 2362 2C7E..2CE4 ; PVALID # LAT CAP LET S W SWASH TAIL..COPT SY 2363 2CE5..2CEA ; FREE_PVAL # COPT SYM MI RO..COPT SYM SHIMA SIMA 2364 2CEB..2CF3 ; PVALID # COPT CAP LET CRYPTOGRAMMIC SHEI..CO 2365 2CF4..2CF8 ; UNASSIGNED # .. 2366 2CF9..2CFF ; FREE_PVAL # COPT OLD NUB FULL STOP..COPT MORPHO 2367 2D00..2D25 ; PVALID # GEORG SM LET AN..GEORG SM LET 2368 2D26 ; UNASSIGNED # 2369 2D27 ; PVALID # GEORG SM LET YN 2370 2D28..2D2C ; UNASSIGNED # .. 2371 2D2D ; PVALID # GEORG SM LET AEN 2372 2D2E..2D2F ; UNASSIGNED # .. 
2373 2D30..2D67 ; PVALID # TIFINAGH LET YA..TIFINAGH LETTER YO 2374 2D68..2D6E ; UNASSIGNED # .. 2375 2D6F..2D70 ; FREE_PVAL # TIFINAGH MOD LET LABIALIZATION MARK 2376 2D71..2D7E ; UNASSIGNED # .. 2377 2D7F..2D96 ; PVALID # TIFINAGH CONS JOINER..ETHI SYL GGW 2378 2D97..2D9F ; UNASSIGNED # .. 2379 2DA0..2DA6 ; PVALID # ETHI SYL SSA..ETHI SYL SSO 2380 2DA7 ; UNASSIGNED # 2381 2DA8..2DAE ; PVALID # ETHI SYL CCA..ETHI SYL CCO 2382 2DAF ; UNASSIGNED # 2383 2DB0..2DB6 ; PVALID # ETHI SYL ZZA..ETHI SYL ZZO 2384 2DB7 ; UNASSIGNED # 2385 2DB8..2DBE ; PVALID # ETHI SYL CCHA..ETHI SYL CC 2386 2DBF ; UNASSIGNED # 2387 2DC0..2DC6 ; PVALID # ETHI SYL QYA..ETHI SYL QYO 2388 2DC7 ; UNASSIGNED # 2389 2DC8..2DCE ; PVALID # ETHI SYL KYA..ETHI SYL KYO 2390 2DCF ; UNASSIGNED # 2391 2DD0..2DD6 ; PVALID # ETHI SYL XYA..ETHI SYL XYO 2392 2DD7 ; UNASSIGNED # 2393 2DD8..2DDE ; PVALID # ETHI SYL GYA..ETHI SYL GYO 2394 2DDF ; UNASSIGNED # 2395 2DE0..2DFF ; PVALID # COMB CYR LET BE..COMB CYRI 2396 2E00..2E2E ; FREE_PVAL # RIGHT ANGLE SUB MARK..REV QUEST MAR 2397 2E2F ; PVALID # VERT TILDE 2398 2E30..2E3B ; FREE_PVAL # RING PNT..THREE-EM DASH 2399 2E3C..2E7F ; UNASSIGNED # .. 2400 2E80..2E99 ; FREE_PVAL # CJK RAD REPEAT..CJK RAD RAP 2401 2E9A ; UNASSIGNED # 2402 2E9B..2EF3 ; FREE_PVAL # CJK RAD CHOKE..CJK RAD C-SIMPLIFIED 2403 2EF4..2EFF ; UNASSIGNED # .. 2404 2F00..2FD5 ; FREE_PVAL # KANGXI RAD ONE..KANGXI RAD FLUTE 2405 2FD6..2FEF ; UNASSIGNED # .. 2406 2FF0..2FFB ; FREE_PVAL # IDEO DESC CHAR LEFT TO RIGHT..IDEO 2407 2FFC..2FFF ; UNASSIGNED # .. 
2408 3000..3004 ; FREE_PVAL # IDEO SPACE..JAPAN INDUST STAND 2409 3005..3007 ; PVALID # IDEO ITER MARK..IDEO NUMB ZERO 2410 3008..3029 ; FREE_PVAL # LEFT ANGLE BRACKET..HANGZH NUM NINE 2411 302A..302D ; PVALID # IDEO LEVEL TONE MARK..IDEO ENT 2412 302E..302F ; DISALLOWED # HANGUL SING DOT TONE MARK..WAVY DAS 2413 3030 ; FREE_PVAL # WAVY DASH 2414 3031..3035 ; DISALLOWED # VERT KANA REP MARK..VERT KANA REP M 2415 3036..303A ; FREE_PVAL # CIRCLED POSTAL MARK..HANGZH NUM THI 2416 303B ; DISALLOWED # VERT IDEO ITER MARK 2417 303C ; PVALID # MASU MARK 2418 303D..303F ; FREE_PVAL # PART ALTER MARK..IDEO HALF FILL 2419 3040 ; UNASSIGNED # 2420 3041..3096 ; PVALID # HIRAGANA LET SM A..HIRAGANA LET SMA 2421 3097..3098 ; UNASSIGNED # .. 2422 3099..309A ; PVALID # COMB KAT-HIR VOICED SOUND 2423 309B..309C ; FREE_PVAL # KAT-HIR VOICED SOUND MARK..KAT-HIR 2424 309D..309E ; PVALID # HIRAGANA ITER MARK..HIRAGANA VOICED 2425 309F..30A0 ; FREE_PVAL # HIRAGANA DIGRAPH YORI..KAT-HIR DOU 2426 30A1..30FA ; PVALID # KATAKANA LET SM A..KATAKANA LET VO 2427 30FB ; CONTEXTO # KATAKANA MIDDLE DOT 2428 30FC..30FE ; PVALID # KAT-HIR PROLONGED SOUND MARK..KATA 2429 30FF ; FREE_PVAL # KATAKANA DIGRAPH KOTO 2430 3100..3104 ; UNASSIGNED # .. 2431 3105..312D ; PVALID # BOPOMOFO LET B..BOPOMOFO LET IH 2432 312E..3130 ; UNASSIGNED # .. 2433 3131..3163 ; FREE_PVAL # HANGUL LET KIYEOK..HANGUL LET I 2434 3164 ; DISALLOWED # HANGUL FILLER 2435 3165..318E ; FREE_PVAL # HANGUL LET SSANGNIEUN..HANGUL LET 2436 318F ; UNASSIGNED # 2437 3190..319F ; FREE_PVAL # IDEO ANNO LINK MARK..IDEO ANNO MAN 2438 31A0..31BA ; PVALID # BOPOMOFO LET BU..BOPOMOFO LET ZY 2439 31BB..31BF ; UNASSIGNED # .. 2440 31C0..31E3 ; FREE_PVAL # CJK STROKE T..CJK STROKE Q 2441 31E4..31EF ; UNASSIGNED # .. 
2442 31F0..31FF ; PVALID # KATAKANA LET SM KU..KATAKANA LET SM 2443 3200..321E ; FREE_PVAL # PAREN HANGUL KIYEOK..PAREN KOREAN C 2444 321F ; UNASSIGNED # 2445 3220..32FE ; FREE_PVAL # PAREN IDEO ONE..CIRCLED KATAKANA WO 2446 32FF ; UNASSIGNED # 2447 3300..33FF ; FREE_PVAL # SQUARE APAATO..SQUARE GAL 2448 3400..4DB5 ; PVALID # 2449 4DB6..4DBF ; UNASSIGNED # .. 2450 4DC0..4DFF ; FREE_PVAL # HEX FOR THE CREATIVE HEAVEN..HEX FO 2451 4E00..9FCC ; PVALID # 2452 9FCD..9FFF ; UNASSIGNED # .. 2453 A000..A48C ; PVALID # YI SYL IT..YI SYL YYR 2454 A48D..A48F ; UNASSIGNED # .. 2455 A490..A4C6 ; FREE_PVAL # YI RAD QOT..YI RAD KE 2456 A4C7..A4CF ; UNASSIGNED # .. 2457 A4D0..A4FD ; PVALID # LISU LET BA..LISU LET TONE MYA JEU 2458 A4FE..A4FF ; FREE_PVAL # LISU PUNCT COMMA..LISU PUNCT FUL 2459 A500..A60C ; PVALID # VAI SYL EE..VAI SYL LENENER 2460 A60D..A60F ; FREE_PVAL # VAI COMMA..VAI QUEST MARK 2461 A610..A62B ; PVALID # VAI SYL NDOLE FA..VAI SYL NDOLE DO 2462 A62C..A63F ; UNASSIGNED # .. 2463 A640..A66F ; PVALID # CYR CAP LET ZEMLYA..COMB CYR VZMET 2464 A670..A673 ; FREE_PVAL # COMB CYR TEN MILLIONS SIGN..SLAVON 2465 A674..A67D ; PVALID # COMB CYR KAVYKA..COMB CYR PAYEROK 2466 A67E ; FREE_PVAL # CYR KAVYKA 2467 A67F..A697 ; PVALID # CYR PAYEROK..CYR SM LET SHWE 2468 A698..A69E ; UNASSIGNED # .. 2469 A69F..A6E5 ; PVALID # COMB CYR LET IOTIFIED E..BAMUM LET 2470 A6E6..A6EF ; FREE_PVAL # BAMUM LET MO..BAMUM LET KOGHOM 2471 A6F0..A6F1 ; PVALID # BAMUM COMB MARK KOQNDON..BAMUM COMB 2472 A6F2..A6F7 ; FREE_PVAL # BAMUM NJAEMLI..BAMUM QUEST MARK 2473 A6F8..A6FF ; UNASSIGNED # .. 
2474 A700..A716 ; FREE_PVAL # MOD LET CHIN TONE YIN PING..MOD 2475 A717..A71F ; PVALID # MOD LET DOT VERT BAR..MOD L 2476 A720..A721 ; FREE_PVAL # MOD LET STRESS AND HIGH TONE..MOD 2477 A722..A76F ; PVALID # LAT CAP LET EGYPT ALEF..LAT SM LET 2478 A770 ; FREE_PVAL # MODIFIER LETTER US 2479 A771..A788 ; PVALID # LATIN SMALL LETTER DUM..MOD LET LOW 2480 A789..A78A ; FREE_PVAL # MOD LET COLON..MOD LET SH EQUALS SI 2481 A78B..A78E ; PVALID # LAT SM LET SALTILLO..LAT SM LET L W 2482 A78F ; UNASSIGNED # 2483 A790..A793 ; PVALID # LAT CAP LET N W DESC..LAT SM LET C 2484 A794..A79F ; UNASSIGNED # .. 2485 A7A0..A7AA ; PVALID # LAT CAP LET G W OBLIQUE STROKE..LAT 2486 A7AB..A7F7 ; UNASSIGNED # .. 2487 A7F8..A7F9 ; FREE_PVAL # MOD LET CAP H W STROKE..MOD LET SM 2488 A7FA..A827 ; PVALID # LAT LET SM CAP TURNED M..SYLOTI NA 2489 A828..A82B ; FREE_PVAL # SYLOTI NAGRI POET MARK-1..SYLOTI NA 2490 A82C..A82F ; UNASSIGNED # .. 2491 A830..A839 ; FREE_PVAL # N INDIC FRACT ONE QUART..N INDIC QU 2492 A83A..A83F ; UNASSIGNED # .. 2493 A840..A873 ; PVALID # PHAGS-PA LET KA..PHAGS-PA LET CANDR 2494 A874..A877 ; FREE_PVAL # PHAGS-PA SINGLE HEAD MARK..PHAGS-PA 2495 A878..A87F ; UNASSIGNED # .. 2496 A880..A8C4 ; PVALID # SAUR SIGN ANUSVARA..SAUR SIGN VIRAM 2497 A8C5..A8CD ; UNASSIGNED # .. 2498 A8CE..A8CF ; FREE_PVAL # SAUR DANDA..SAUR DOUBLE DANDA 2499 A8D0..A8D9 ; PVALID # SAUR DIG ZERO..SAUR DIG NINE 2500 A8DA..A8DF ; UNASSIGNED # .. 2501 A8E0..A8F7 ; PVALID # COMB DEVAN DIG ZERO..DEVAN SIGN CAN 2502 A8F8..A8FA ; FREE_PVAL # DEVAN SIGN PUSHPIKA..DEVAN CARET 2503 A8FB ; PVALID # DEVAN HEADSTROKE 2504 A8FC..A8FF ; UNASSIGNED # .. 2505 A900..A92D ; PVALID # KAYAH LI DIG ZERO..KAYAH LI TONE CA 2506 A92E..A92F ; FREE_PVAL # KAYAH LI SIGN CWI..KAYAH LI SIGN SH 2507 A930..A953 ; PVALID # REJANG LET KA..REJANG VIRAMA 2508 A954..A95E ; UNASSIGNED # .. 2509 A95F ; FREE_PVAL # REJANG SECTION MARK 2510 A960..A97C ; DISALLOWED # HANGUL CHO TIKEUT-MIUEM..HANGUL CHO 2511 A97D..A97F ; UNASSIGNED # .. 
2512 A980..A9C0 ; PVALID # JAV SIGN PANYANGGA..JAV PANGKON 2513 A9C1..A9CD ; FREE_PVAL # JAV LEFT RERENGGAN..JAV TURNED PADA 2514 A9CE ; UNASSIGNED # 2515 A9CF..A9D9 ; PVALID # JAV PANGRANGKEP..JAV DIG NINE 2516 A9DA..A9DD ; UNASSIGNED # .. 2517 A9DE..A9DF ; FREE_PVAL # JAV PADA TIRTA TUMETES..JAV PADA I 2518 A9E0..A9FF ; UNASSIGNED # .. 2519 AA00..AA36 ; PVALID # CHAM LET A..CHAM CONS SIGN WA 2520 AA37..AA3F ; UNASSIGNED # .. 2521 AA40..AA4D ; PVALID # CHAM LET FIN K..CHAM CONS SIGN FIN 2522 AA4E..AA4F ; UNASSIGNED # .. 2523 AA50..AA59 ; PVALID # CHAM DIG ZERO..CHAM DIG NINE 2524 AA5A..AA5B ; UNASSIGNED # .. 2525 AA5C..AA5F ; FREE_PVAL # CHAM PUNCT SPIRAL..CHAM PUNCT TR 2526 AA60..AA76 ; PVALID # MYAN LET KHAMTI GA..MYAN LOGOGRAM K 2527 AA77..AA79 ; FREE_PVAL # MYAN SYM AITON EXCLAM..MYAN SYM AIT 2528 AA7A..AA7B ; PVALID # MYAN LET AITON RA..MYAN SIGN PAO KA 2529 AA7C..AA7F ; UNASSIGNED # .. 2530 AA80..AAC2 ; PVALID # TAI VIET LET LOW KO..TAI VIET TONE 2531 AAC3..AADA ; UNASSIGNED # .. 2532 AADB..AADD ; PVALID # TAI VIET SYM KON..TAI VIET SYM SAM 2533 AADE..AADF ; FREE_PVAL # TAI VIET SYM HO HOI..TAI VIET SYM K 2534 AAE0..AAEF ; PVALID # MEETEI MAYEK LET E..MEETEI MAYEK VO 2535 AAF0..AAF1 ; FREE_PVAL # MEETEI MAYEK CHEIKHAN..MEETEI MAYEK 2536 AAF2..AAF6 ; PVALID # MEETEI MAYEK ANJI..MEETEI MAYEK VIR 2537 AAF7..AB00 ; UNASSIGNED # .. 2538 AB01..AB06 ; PVALID # ETHI SYL TTHU..ETHI SYL TTHO 2539 AB07..AB08 ; UNASSIGNED # .. 2540 AB09..AB0E ; PVALID # ETHI SYL DDHAA..ETHI SYL DDHO 2541 AB0F..AB10 ; UNASSIGNED # .. 2542 AB11..AB16 ; PVALID # ETHI SYL DZU..ETHI SYL DZO 2543 AB17..AB1F ; UNASSIGNED # .. 2544 AB20..AB26 ; PVALID # ETHI SYL CCHHA..ETHI SYL CCHHO 2545 AB27 ; UNASSIGNED # .. 2546 AB28..AB2E ; PVALID # ETHI SYL BBAA..ETHI SYL BBO 2547 AB2F..ABBF ; UNASSIGNED # .. 
2548 ABC0..ABEA ; PVALID # MEETEI MAYEK LET KOK..MEETEI MAYEK 2549 ABEB ; FREE_PVAL # MEETEI MAYEK CHEIKHEI 2550 ABEC..ABED ; PVALID # MEETEI MAYEK LUM IYEK..MEETEI MAYEK 2551 ABEE..ABEF ; UNASSIGNED # .. 2552 ABF0..ABF9 ; PVALID # MEETEI MAYEK DIG ZERO..MEETEI MAYEK 2553 ABFA..ABFF ; UNASSIGNED # .. 2554 AC00..D7A3 ; PVALID # 2555 D7A4..D7AF ; UNASSIGNED # .. 2556 D7B0..D7C6 ; DISALLOWED # HANGUL JUNG O-YEO..HANGUL JUNG ARAE 2557 D7C7..D7CA ; UNASSIGNED # .. 2558 D7CB..D7FB ; DISALLOWED # HANGUL JONG NIEUN-RIEUL..HANGUL JON 2559 D7FC..D7FF ; UNASSIGNED # .. 2560 D800..F8FF ; DISALLOWED # 2561 F900..FA6D ; PVALID # CJK COMP IDEO-F900..CJK COMP IDEO 2562 FA6E..FA6F ; UNASSIGNED # .. 2563 FA70..FAD9 ; PVALID # CJK COMP IDEO-FA70..CJK COMP IDEO 2564 FADA..FAFF ; UNASSIGNED # .. 2565 FB00..FB06 ; FREE_PVAL # LAT SM LIG FF..LAT SM LIG ST 2566 FB07..FB12 ; UNASSIGNED # .. 2567 FB13..FB17 ; FREE_PVAL # ARMENIAN SM LIG MEN NOW..ARMENIAN SM 2568 FB18..FB1C ; UNASSIGNED # .. 2569 FB1D..FB1F ; PVALID # HEBR LET YOD W HIRIQ..HEBR LIG YID Y 2570 FB20..FB29 ; FREE_PVAL # HEBR LET ALT AYIN..HEB LET ALT PLUS 2571 FB2A..FB36 ; PVALID # HEBR LET SHIN W SHIN DOT..HEBR LET Z 2572 FB37 ; UNASSIGNED # 2573 FB38..FB3C ; PVALID # HEBR LET TET W DAGESH..HEBR LET 2574 FB3D ; UNASSIGNED # 2575 FB3E ; PVALID # HEBR LET MEM W DAGESH 2576 FB3F ; UNASSIGNED # 2577 FB40..FB41 ; PVALID # HEBR LET NUN W DAGESH..HEBR LET 2578 FB42 ; UNASSIGNED # 2579 FB43..FB44 ; PVALID # HEBR LET FIN PE W DAGESH..HEBR L 2580 FB45 ; UNASSIGNED # 2581 FB46..FB4E ; PVALID # HEBR LET TSADI W DAGESH..HEBR LET P 2582 FB4F..FBC1 ; FREE_PVAL # HEBR LIG ALEF LAMED..ARAB SYM S 2583 FBC2..FBD2 ; UNASSIGNED # .. 2584 FBD3..FD3F ; FREE_PVAL # ARAB LET NG ISO FORM..ORNATE RIGHT 2585 FD40..FD4F ; UNASSIGNED # .. 2586 FD50..FD8F ; FREE_PVAL # ARAB LIG TEH W JEEM W MEEM INIT 2587 FD90..FD91 ; UNASSIGNED # .. 2588 FD92..FDC7 ; FREE_PVAL # ARAB LIG MEEM W JEEM W KHAH INI 2589 FDC8..FDCF ; UNASSIGNED # .. 
2590 FDD0..FDEF ; DISALLOWED # .. 2591 FDF0..FDFD ; FREE_PVAL # ARAB LIG SALLA USED..ARAB LIG BISMI 2592 FDFE..FDFF ; UNASSIGNED # .. 2593 FE00..FE0F ; DISALLOWED # VAR SEL-1..VAR SEL-16 2594 FE10..FE19 ; FREE_PVAL # PRES FORM FOR VERT COMMA..PRES FORM 2595 FE1A..FE1F ; UNASSIGNED # .. 2596 FE20..FE26 ; PVALID # COMB LIG LEFT HALF..COMB CONJ MACRO 2597 FE27..FE2F ; UNASSIGNED # .. 2598 FE30..FE52 ; FREE_PVAL # PRES FORM FOR VERT TWO DOT LEAD..SM 2599 FE53 ; UNASSIGNED # 2600 FE54..FE66 ; FREE_PVAL # SM SEMICOLON..SM EQUALS SIGN 2601 FE67 ; UNASSIGNED # 2602 FE68..FE6B ; FREE_PVAL # SM REV SOLIDUS..SM COMM AT 2603 FE6C..FE6F ; UNASSIGNED # .. 2604 FE70..FE72 ; FREE_PVAL # ARAB FATHATAN ISO FORM..ARAB DAMMAT 2605 FE73 ; PVALID # ARAB TAIL FRAGMENT 2606 FE74 ; FREE_PVAL # ARAB KASRATAN ISO FORM 2607 FE75 ; UNASSIGNED # 2608 FE76..FEFC ; FREE_PVAL # ARAB FATHA ISO FORM..ARAB LIG LAM W 2609 FEFD..FEFE ; UNASSIGNED # .. 2610 FEFF ; DISALLOWED # ZERO WIDTH NO-BREAK SPACE 2611 FF00 ; UNASSIGNED # 2612 FF01..FF9F ; FREE_PVAL # FULLW EXCLAM MARK..HALFW KATA SE 2613 FFA0 ; DISALLOWED # HALFW HANGUL FILLER 2614 FFA1..FFBE ; FREE_PVAL # HALFW HANGUL LET KIYEOK..HALFW H 2615 FFBF..FFC1 ; UNASSIGNED # .. 2616 FFC2..FFC7 ; FREE_PVAL # HALFW HANGUL LET A..HALFW HANGUL 2617 FFC8..FFC9 ; UNASSIGNED # .. 2618 FFCA..FFCF ; FREE_PVAL # HALFW HANGUL LET YEO..HALFW HANGU 2619 FFD0..FFD1 ; UNASSIGNED # .. 2620 FFD2..FFD7 ; FREE_PVAL # HALFW HANGUL LET YO..HALFW HANGUL 2621 FFD8..FFD9 ; UNASSIGNED # .. 2622 FFDA..FFDC ; FREE_PVAL # HALFW HANGUL LET EU..HALFW HANGUL 2623 FFDD..FFDF ; UNASSIGNED # .. 2624 FFE0..FFE6 ; FREE_PVAL # FULLW CENT SIGN..FULLW WON SIGN 2625 FFE7 ; UNASSIGNED # 2626 FFE8..FFEE ; FREE_PVAL # HALFW FORMS LIGHT VERT..HALFW WH 2627 FFEF..FFF8 ; UNASSIGNED # .. 2628 FFF9..FFFB ; DISALLOWED # INTERL ANNO ANCHOR..INTERL ANNO TER 2629 FFFC..FFFD ; FREE_PVAL # OBJECT REPL CHAR..REPL CHAR 2630 FFFE..FFFF ; DISALLOWED # .. 
2631 10000..1000B; PVALID # LIN B SYL B008 A..LIN B SYL 2632 1000C ; UNASSIGNED # 2633 1000D..10026; PVALID # LIN B SYL B036 JO..LIN B SYL 2634 10027 ; UNASSIGNED # 2635 10028..1003A; PVALID # LIN B SYL B060 RA..LIN B SYL 2636 1003B ; UNASSIGNED # 2637 1003C..1003D; PVALID # LIN B SYL B017 ZA..LIN B SYL 2638 1003E ; UNASSIGNED # 2639 1003F..1004D; PVALID # LIN B SYL B020 ZO..LIN B SYL 2640 1004E..1004F; UNASSIGNED # .. 2641 10050..1005D; PVALID # LIN B SYM B018..LIN B SYM B089 2642 1005E..1007F; UNASSIGNED # .. 2643 10080..100FA; PVALID # LIN B IDEO B100 MAN..LIN B IDEO 2644 100FB..100FF; UNASSIGNED # .. 2645 10100..10102; FREE_PVAL # AEG WORD SEP LINE..AEG CHECK MAR 2646 10103..10106; UNASSIGNED # .. 2647 10107..10133; FREE_PVAL # AEG NUM ONE..AEG NUM NINETY THOU 2648 10134..10136; UNASSIGNED # .. 2649 10137..1018A; FREE_PVAL # AEG WEIGHT BASE UNIT..GREEK ZERO SI 2650 1018B..1018F; UNASSIGNED # .. 2651 10190..1019B; FREE_PVAL # ROM SEXTANS SIGN..ROM CENTURIAL SIG 2652 1019C..101CF; UNASSIGNED # .. 2653 101D0..101FC; FREE_PVAL # PHAISTOS DISC SIGN PED..PHAISTOS DI 2654 101FD ; PVALID # PHAISTOS DISC SIGN COMB OBLIQUE STR 2655 101FE..1027F; UNASSIGNED # .. 2656 10280..1029C; PVALID # LYCIAN LET A..LYCIAN LET X 2657 1029D..1029F; UNASSIGNED # .. 2658 102A0..102D0; PVALID # CARIAN LET A..CARIAN LET UUU3 2659 102D1..102FF; UNASSIGNED # .. 2660 10300..1031E; PVALID # OLD ITAL LET A..OLD ITAL LET UU 2661 1031F ; UNASSIGNED # 2662 10320..10323; FREE_PVAL # OLD ITAL NUM ONE..OLD ITAL NUM F 2663 10324..1032F; UNASSIGNED # .. 2664 10330..10340; PVALID # GOTH LET AHSA..GOTH LET PAIRTHRA 2665 10341 ; FREE_PVAL # GOTH LET NINETY 2666 10342..10349; PVALID # GOTH LET RAIDA..GOTH LET OTHAL 2667 1034A ; FREE_PVAL # GOTH LET NINE HUNDRED 2668 1034B..1037F; UNASSIGNED # .. 
2669 10380..1039D; PVALID # UGAR LET ALPA..UGAR LET SSU 2670 1039E ; UNASSIGNED # 2671 1039F ; FREE_PVAL # UGAR WORD DIVIDER 2672 103A0..103C3; PVALID # OLD PERS SIGN A..OLD PERS SIGN HA 2673 103C4..103C7; UNASSIGNED # .. 2674 103C8..103CF; PVALID # OLD PERS SIGN AURAMAZDAA..OLD PERS 2675 103D0..103D5; FREE_PVAL # OLD PERS WORD DIVIDER..OLD PERS NUM 2676 103D6..103FF; UNASSIGNED # .. 2677 10400..1049D; PVALID # DESERET CAP LET LONG I..OSMANYA LET 2678 1049E..1049F; UNASSIGNED # .. 2679 104A0..104A9; PVALID # OSMANYA DIG ZERO..OSMANYA DIG NINE 2680 104AA..107FF; UNASSIGNED # .. 2681 10800..10805; PVALID # CYPRIOT SYL A..CYPRIOT SYL JA 2682 10806..10807; UNASSIGNED # .. 2683 10808 ; PVALID # CYPRIOT SYL JO 2684 10809 ; UNASSIGNED # 2685 1080A..10835; PVALID # CYPRIOT SYL KA..CYPRIOT SYL WO 2686 10836 ; UNASSIGNED # 2687 10837..10838; PVALID # CYPRIOT SYL XA..CYPRIOT SYL XE 2688 10839..1083B; UNASSIGNED # .. 2689 1083C ; PVALID # CYPRIOT SYL ZA 2690 1083D..1083E; UNASSIGNED # .. 2691 1083F..10855; PVALID # CYPRIOT SYL ZO..IMP ARAM LET TAW 2692 10856 ; UNASSIGNED # 2693 10857..1085F; FREE_PVAL # IMP ARAM SECT SIGN..IMP ARAM 2694 10860..108FF; UNASSIGNED # .. 2695 10900..10915; PVALID # PHOEN LET ALF..PHOEN LET TAU 2696 10916..1091B; FREE_PVAL # PHOEN NUM ONE..PHOEN NUM THR 2697 1091C..1091E; UNASSIGNED # .. 2698 1091F ; FREE_PVAL # PHOEN WORD SEP 2699 10920..10939; PVALID # LYDIAN LET A..LYDIAN LET C 2700 1093A..1093E; UNASSIGNED # .. 2701 1093F ; FREE_PVAL # LYDIAN TRIANGULAR MARK 2702 10940..1097F; UNASSIGNED # .. 2703 10980..109B7; PVALID # MERO HIER LET A..MERO CURS LET 2704 109B8..109BD; UNASSIGNED # .. 2705 109BE..109BF; PVALID # MERO CURS LOG RMT..MERO CURS L 2706 109C0..109FF; UNASSIGNED # .. 2707 10A00..10A03; PVALID # KHARO LET A..KHARO VOW SIGN V 2708 10A04 ; UNASSIGNED # 2709 10A05..10A06; PVALID # KHARO VOW SIGN E..KHARO VOW SI 2710 10A07..10A0B; UNASSIGNED # .. 
2711 10A0C..10A13; PVALID # KHARO VOW LEN MARK..KHARO LET 2712 10A14 ; UNASSIGNED # 2713 10A15..10A17; PVALID # KHARO LET CA..KHARO LET JA 2714 10A18 ; UNASSIGNED # 2715 10A19..10A33; PVALID # KHARO LET NYA..KHARO LET TTT 2716 10A34..10A37; UNASSIGNED # .. 2717 10A38..10A3A; PVALID # KHARO SIGN BAR ABOVE..KHARO SIGN D 2718 10A3B..10A3E; UNASSIGNED # .. 2719 10A3F ; PVALID # KHARO VIRAMA 2720 10A40..10A47; FREE_PVAL # KHARO DIG ONE..KHARO NUM ONE 2721 10A48..10A4F; UNASSIGNED # .. 2722 10A50..10A58; FREE_PVAL # KHARO PUNCT DOT..KHARO PUNCT 2723 10A59..10A5F; UNASSIGNED # .. 2724 10A60..10A7C; PVALID # OLD S ARAB LET HE..OLD SOUTH ARAB 2725 10A7D..10A7F; FREE_PVAL # OLD S ARAB NUM ONE..OLD SOUTH ARAB 2726 10A80..10AFF; UNASSIGNED # .. 2727 10B00..10B35; PVALID # AVESTAN LET A..AVESTAN LET HE 2728 10B36..10B38; UNASSIGNED # .. 2729 10B39..10B3F; FREE_PVAL # AVESTAN ABBR MARK..LARGE ONE RING O 2730 10B40..10B55; PVALID # INSCRIPT PARTHIAN LET ALEPH..INSCRI 2731 10B56..10B57; UNASSIGNED # .. 2732 10B58..10B5F; FREE_PVAL # INSCRIPT PARTHIAN NUM ONE..INSCRIPT 2733 10B60..10B72; PVALID # INSCRIPT PAHLAVI LET ALEPH..INSCRIP 2734 10B73..10B77; UNASSIGNED # .. 2735 10B78..10B7F; FREE_PVAL # INSCRIPT PAHLAVI NUM ONE..INSCRIPT 2736 10B80..10BFF; UNASSIGNED # .. 2737 10C00..10C48; PVALID # OLD TURK LET ORKHON A..OLD TURK LET 2738 10C49..10E5F; UNASSIGNED # .. 2739 10E60..10E7E; FREE_PVAL # RUMI DIG ONE..RUMI FRACTION TWO THI 2740 10E7F..10FFF; UNASSIGNED # .. 2741 11000..11046; PVALID # BRAHMI SIGN CANDRABINDU..BRAHMI VIR 2742 11047..1104D; FREE_PVAL # BRAHMI DANDA..BRAHMI PUNCT LOTUS 2743 1104E..11051; UNASSIGNED # .. 2744 11052..11065; FREE_PVAL # BRAHMI NUM ONE..BRAHMI NUM ONE THOU 2745 11066..1106F; PVALID # BRAHMI DIG ZERO..BRAHMI DIG NINE 2746 11070..1107F; UNASSIGNED # .. 
2747 11080..110BA; PVALID # KAITHI SIGN CANDRABINDU..KAITHI SIG 2748 110BB..110BC; FREE_PVAL # KAITHI ABBR SIGN..KAITHI ENUM SIGN 2749 110BD ; DISALLOWED # KAITHI NUM SIGN 2750 110BE..110C1; FREE_PVAL # KAITHI SECT MARK..KAITHI DOUBLE DAN 2751 110C2..110CF; UNASSIGNED # .. 2752 110D0..110E8; PVALID # SORA SOMPENG LETTER SAH..SORA SOMPE 2753 110E9..110EF; UNASSIGNED # .. 2754 110F0..110F9; PVALID # SORA SOMPENG DIG ZERO..SORA SOMPENG DI 2755 110FA..110FF; UNASSIGNED # .. 2756 11100..11134; PVALID # CHAKMA SIGN CANDRABINDU..CHAKMA MAAYY 2757 11135 ; UNASSIGNED # 2758 11136..1113F; PVALID # CHAKMA DIG ZERO..CHAKMA DIG NINE 2759 11140..11143; FREE_PVAL # CHAKMA SECT MARK..CHAKMA QUEST MARK 2760 11144..1117F; UNASSIGNED # .. 2761 11180..111C4; PVALID # SHARADA SIGN CANDRABINDU..SHARADA OM 2762 111C5..111C8; FREE_PVAL # SHARADA DANDA..SHARADA SEPARATOR 2763 111C9..111CF; UNASSIGNED # .. 2764 111D0..111D9; PVALID # SHARADA DIG ZERO..SHARADA DIG NINE 2765 111DA..1167F; UNASSIGNED # .. 2766 11680..116B7; PVALID # TAKRI LET A..TAKRI SIGN NUKTA 2767 116B8..116BF; UNASSIGNED # .. 2768 116C0..116C9; PVALID # TAKRI DIGIT ZERO..TAKRI DIG NINE 2769 116CA..11FFF; UNASSIGNED # .. 2770 12000..1236E; PVALID # CUNEI SIGN A..CUNEI SIGN ZUM 2771 1236F..123FF; UNASSIGNED # .. 2772 12400..12462; FREE_PVAL # CUNEI NUM SIGN TWO ASH..CUNEI NUM 2773 12463..1246F; UNASSIGNED # .. 2774 12470..12473; FREE_PVAL # CUNEI PUNCT SIGN OLD ASSYRIAN WORD 2775 12474..12FFF; UNASSIGNED # .. 2776 13000..1342E; PVALID # EGYPT HIERO A001..EGYPT HIERO AA032 2777 1342F..167FF; UNASSIGNED # .. 2778 16800..16A38; PVALID # BAMUM LET PHASE-A NGKUE MFON..BAMUM LE 2779 16A39..16EFF; UNASSIGNED # .. 2780 16F00..16F44; PVALID # MIAO LET PA..MIAO LET HHA 2781 16F45..16F4F; UNASSIGNED # .. 2782 16F50..16F7E; PVALID # MIAO LET NAS..MIAO VOWEL SIGN NG 2783 16F7F..16F8E; UNASSIGNED # .. 2784 16F8F..16F9F; PVALID # MIAO TONE RIGHT..MIAO LET REF TON 2785 16FA0..1AFFF; UNASSIGNED # .. 
2786 1B000..1B001; PVALID # KATA LET ARCH E..KATA LET ARCH YE 2787 1B002..1CFFF; UNASSIGNED # .. 2788 1D000..1D0F5; FREE_PVAL # BYZ MUS SYM PSILI..BYZ MUS 2789 1D0F6..1D0FF; UNASSIGNED # .. 2790 1D100..1D126; FREE_PVAL # MUS SYM SINGLE BARLINE..MUS SYMBOL 2791 1D127..1D128; UNASSIGNED # .. 2792 1D129..1D164; FREE_PVAL # MUS SYM MULT MEASURE REST..MUS SYM ONE 2793 1D165..1D169; PVALID # MUS SYM COMB STEM..MUS SYM COMB TREMOL 2794 1D16A..1D16C; FREE_PVAL # MUS SYM FING TREM-1..MUS SYM FING TREM 2795 1D16D..1D172; PVALID # MUS SYM COMB AUG DOT..MUS SYM COMB FL 2796 1D173..1D17A; DISALLOWED # MUS SYM BEGIN BEAM..MUS SYM END PHRASE 2797 1D17B..1D182; PVALID # MUS SYM COMB ACCENT..MUS SYM COMB LOUR 2798 1D183..1D184; FREE_PVAL # MUS SYM ARP UP..MUS SYM ARP DOWN 2799 1D185..1D18B; PVALID # MUS SYM COMB DOIT..MUS SYM COMB TRIPLE 2800 1D18C..1D1A9; FREE_PVAL # MUS SYM RINFORZANDO..MUS SYM DEG SLASH 2801 1D1AA..1D1AD; PVALID # MUS SYM COMB DOWN BOW..MUS SYM COMB SN 2802 1D1AE..1D1DD; FREE_PVAL # MUS SYM PEDAL MARK..MUS SYM PES SUBPUN 2803 1D1DE..1D1FF; UNASSIGNED # .. 2804 1D200..1D241; FREE_PVAL # GREEK VOCAL NOTATION SYM-1..GREEK INS 2805 1D242..1D244; FREE_PVAL # COMB GREEK MUS TRISEME..COMB GREEK MU 2806 1D245 ; FREE_PVAL # GREEK MUSICAL LEIMMA 2807 1D246..1D2FF; UNASSIGNED # .. 2808 1D300..1D356; DISALLOWED # MONOG FOR EARTH..TETRAG FOR FOSTERING 2809 1D357..1D35F; UNASSIGNED # .. 2810 1D360..1D371; DISALLOWED # COUNT ROD UNIT DIG ONE..COUNT ROD TE 2811 1D372..1D3FF; UNASSIGNED # .. 2812 1D400..1D454; FREE_PVAL # MATH BOLD CAP A..MATH IT 2813 1D455 ; UNASSIGNED # 2814 1D456..1D49C; FREE_PVAL # MATH ITAL SM I..MATH SC 2815 1D49D ; UNASSIGNED # 2816 1D49E..1D49F; FREE_PVAL # MATH SCRIPT CAP C..MATH 2817 1D4A0..1D4A1; UNASSIGNED # .. 2818 1D4A2 ; FREE_PVAL # MATH SCRIPT CAP G 2819 1D4A3..1D4A4; UNASSIGNED # .. 2820 1D4A5..1D4A6; FREE_PVAL # MATH SCRIPT CAP J..MATH 2821 1D4A7..1D4A8; UNASSIGNED # .. 
1D4A9..1D4AC; FREE_PVAL   # MATH SCRIPT CAP N..MATH
1D4AD       ; UNASSIGNED  #
1D4AE..1D4B9; FREE_PVAL   # MATH SCRIPT CAP S..MATH
1D4BA       ; UNASSIGNED  #
1D4BB       ; FREE_PVAL   # MATH SCRIPT SM F
1D4BC       ; UNASSIGNED  #
1D4BD..1D4C3; FREE_PVAL   # MATH SCRIPT SM H..MATH SC
1D4C4       ; UNASSIGNED  #
1D4C5..1D505; FREE_PVAL   # MATH SCRIPT SM P..MATH FR
1D506       ; UNASSIGNED  #
1D507..1D50A; FREE_PVAL   # MATH FRAKTUR CAP D..MATH
1D50B..1D50C; UNASSIGNED  # ..
1D50D..1D514; FREE_PVAL   # MATH FRAKTUR CAP J..MATH
1D515       ; UNASSIGNED  #
1D516..1D51C; FREE_PVAL   # MATH FRAKTUR CAP S..MATH
1D51D       ; UNASSIGNED  #
1D51E..1D539; FREE_PVAL   # MATH FRAKTUR SM A..MATH D
1D53A       ; UNASSIGNED  #
1D53B..1D53E; FREE_PVAL   # MATH DOUBLE-STRUCK CAP D..MATHEM
1D53F       ; UNASSIGNED  #
1D540..1D544; FREE_PVAL   # MATH DOUBLE-STRUCK CAP I..MATHEM
1D545       ; UNASSIGNED  #
1D546       ; FREE_PVAL   # MATH DOUBLE-STRUCK CAP O
1D547..1D549; UNASSIGNED  # ..
1D54A..1D550; FREE_PVAL   # MATH DOUBLE-STRUCK CAP S..MATHEM
1D551       ; UNASSIGNED  #
1D552..1D6A5; FREE_PVAL   # MATH DOUBLE-STRUCK SM A..MATHEMAT
1D6A6..1D6A7; UNASSIGNED  # ..
1D6A8..1D7CB; FREE_PVAL   # MATH BOLD CAP ALPHA..MATHEMATICA
1D7CC..1D7CD; UNASSIGNED  # ..
1D7CE..1D7FF; FREE_PVAL   # MATH BOLD DIG ZERO..MATH M
1D800..1EDFF; UNASSIGNED  # ..
1EE00..1EE03; FREE_PVAL   # ARAB MATH ALEF..ARAB MATH DAL
1EE04       ; UNASSIGNED  #
1EE05..1EE1F; FREE_PVAL   # ARAB MATH WAW..ARAB MATH DOTLESS QAF
1EE20       ; UNASSIGNED  #
1EE21..1EE22; FREE_PVAL   # ARAB MATH INIT BEH..ARAB MATH INIT JEE
1EE23       ; UNASSIGNED  #
1EE24       ; FREE_PVAL   # ARAB MATH INIT HEH
1EE25..1EE26; UNASSIGNED  # ..
1EE27       ; FREE_PVAL   # ARAB MATH INIT HAH
1EE28       ; UNASSIGNED  #
1EE29..1EE32; FREE_PVAL   # ARAB MATH INIT YEH..ARAB MATH INIT QAF
1EE33       ; UNASSIGNED  #
1EE34..1EE37; FREE_PVAL   # ARAB MATH INIT SHEEN..ARAB MATH INITIA
1EE38       ; UNASSIGNED  #
1EE39       ; FREE_PVAL   # ARAB MATH INIT DAD
1EE3A       ; UNASSIGNED  #
1EE3B       ; FREE_PVAL   # ARAB MATH INIT GHAIN
1EE3C..1EE41; UNASSIGNED  # ..
1EE42       ; FREE_PVAL   # ARAB MATH TAILED JEEM
1EE43..1EE46; UNASSIGNED  # ..
1EE47       ; FREE_PVAL   # ARAB MATH TAILED HAH
1EE48       ; UNASSIGNED  #
1EE49       ; FREE_PVAL   # ARAB MATH TAILED YEH
1EE4A       ; UNASSIGNED  #
1EE4B       ; FREE_PVAL   # ARAB MATH TAILED LAM
1EE4C       ; UNASSIGNED  #
1EE4D..1EE4F; FREE_PVAL   # ARAB MATH TAILED NOON..ARAB MATH TAILE
1EE50       ; UNASSIGNED  #
1EE51..1EE52; FREE_PVAL   # ARAB MATH TAILED QAF..ARAB MATH TAILED
1EE53       ; UNASSIGNED  #
1EE54       ; FREE_PVAL   # ARAB MATH TAILED SHEEN
1EE55..1EE56; UNASSIGNED  # ..
1EE57       ; FREE_PVAL   # ARAB MATH TAILED KHAH
1EE58       ; UNASSIGNED  #
1EE59       ; FREE_PVAL   # ARAB MATH TAILED DAD
1EE5A       ; UNASSIGNED  #
1EE5B       ; FREE_PVAL   # ARAB MATH TAILED GHAIN
1EE5C       ; UNASSIGNED  #
1EE5D       ; FREE_PVAL   # ARAB MATH TAILED DOTLESS NOON
1EE5E       ; UNASSIGNED  #
1EE5F       ; FREE_PVAL   # ARAB MATH TAILED DOTLESS QAF
1EE60       ; UNASSIGNED  #
1EE61..1EE62; FREE_PVAL   # ARAB MATH STRETCHED BEH..ARAB MATH STR
1EE63       ; UNASSIGNED  #
1EE64       ; FREE_PVAL   # ARAB MATH STRETCHED HEH
1EE65..1EE66; UNASSIGNED  # ..
1EE67..1EE6A; FREE_PVAL   # ARAB MATH STRETCHED HAH..ARAB MATH STR
1EE6B       ; UNASSIGNED  #
1EE6C..1EE72; FREE_PVAL   # ARAB MATH STRETCHED MEEM..ARAB MATH ST
1EE73       ; UNASSIGNED  #
1EE74..1EE77; FREE_PVAL   # ARAB MATH STRETCHED SHEEN..ARAB MATH S
1EE78       ; UNASSIGNED  #
1EE79..1EE7C; FREE_PVAL   # ARAB MATH STRETCHED DAD..ARAB MATH STR
1EE7D       ; UNASSIGNED  #
1EE7E       ; FREE_PVAL   # ARAB MATH STRETCHED DOTLESS FEH
1EE7F       ; UNASSIGNED  #
1EE80..1EE89; FREE_PVAL   # ARAB MATH LOOPED ALEF..ARAB MATH LOOPE
1EE8A       ; UNASSIGNED  #
1EE8B..1EE9B; FREE_PVAL   # ARAB MATH LOOPED LAM..ARAB MATH LOOPED
1EE9C..1EEA0; UNASSIGNED  # ..
1EEA1..1EEA3; FREE_PVAL   # ARAB MATH DOUBLE-STRUCK BEH..ARAB MATH
1EEA4       ; UNASSIGNED  #
1EEA5..1EEA9; FREE_PVAL   # ARAB MATH DOUBLE-STRUCK WAW..ARAB MATH
1EEAA       ; UNASSIGNED  #
1EEAB..1EEBB; FREE_PVAL   # ARAB MATH DOUBLE-STRUCK LAM..ARAB MATH
1EEBC..1EEEF; UNASSIGNED  # ..
1EEF0..1EEF1; FREE_PVAL   # ARAB MATH OP MEEM W HAH W TATWHEEL..AR
1EEF2..1EFFF; UNASSIGNED  # ..
1F000..1F02B; FREE_PVAL   # MAHJONG TILE EAST WIND..MAHJONG TILE B
1F02C..1F02F; UNASSIGNED  # ..
1F030..1F093; FREE_PVAL   # DOMINO TILE HORIZ BACK..DOMINO TILE VE
1F094..1F09F; UNASSIGNED  # ..
1F0A0..1F0AE; FREE_PVAL   # PLAY CARD BACK..PLAY CARD KING OF SPAD
1F0AF..1F0B0; UNASSIGNED  # ..
1F0B1..1F0BE; FREE_PVAL   # PLAY CARD ACE OF HEARTS..PLAY CARD KIN
1F0BF..1F0C0; UNASSIGNED  # ..
1F0C1..1F0CF; FREE_PVAL   # PLAY CARD ACE OF DIAMONDS..PLAY CARD B
1F0D0       ; UNASSIGNED  #
1F0D1..1F0DF; FREE_PVAL   # PLAY CARD ACE OF CLUBS..PLAY CARD WHIT
1F0E0..1F0FF; UNASSIGNED  # ..
1F100..1F10A; FREE_PVAL   # DIG ZERO FULL STOP..DIG NINE COMMA
1F10B..1F10F; UNASSIGNED  # ..
1F110..1F12E; FREE_PVAL   # PARENTHESIZED LAT CAP LET A..CIRCLE
1F12F       ; UNASSIGNED  #
1F130..1F16B; FREE_PVAL   # SQUARED LAT CAP LET A..RAISED MD SIGN
1F16C..1F16F; UNASSIGNED  # ..
1F170..1F19A; FREE_PVAL   # NEG SQ LAT CAP LET A..SQUARED VS
1F19B..1F1E5; UNASSIGNED  # ..
1F1E6..1F202; FREE_PVAL   # REG IND SYMB LET A..SQ KATAKANA SA
1F203..1F20F; UNASSIGNED  # ..
1F210..1F23A; FREE_PVAL   # SQ CJK UNIF IDEO-624B..SQ CJK UNIF IDE
1F23B..1F23F; UNASSIGNED  # ..
1F240..1F248; FREE_PVAL   # TORT SH BRACK CJK UNIF IDEO-672C..TORT
1F249..1F24F; UNASSIGNED  # ..
1F250..1F251; FREE_PVAL   # CIRC IDEO ADVANTAGE..CIRC IDEO ACCEPT
1F252..1F2FF; UNASSIGNED  # ..
1F300..1F320; FREE_PVAL   # CYCLONE..SHOOTING STAR
1F321..1F32F; UNASSIGNED  # ..
1F330..1F335; FREE_PVAL   # CHESTNUT..CACTUS
1F336       ; UNASSIGNED  #
1F337..1F37C; FREE_PVAL   # TULIP..BABY BOTTLE
1F37D..1F37F; UNASSIGNED  # ..
1F380..1F393; FREE_PVAL   # RIBBON..GRADUATION CAP
1F394..1F39F; UNASSIGNED  # ..
1F3A0..1F3C4; FREE_PVAL   # CAROUSEL HORSE..SURFER
1F3C5       ; UNASSIGNED  #
1F3C6..1F3CA; FREE_PVAL   # TROPHY..SWIMMER
1F3CB..1F3DF; UNASSIGNED  # ..
1F3E0..1F3F0; FREE_PVAL   # HOUSE BUILDING..EUROPEAN CASTLE
1F3F1..1F3FF; UNASSIGNED  # ..
1F400..1F43E; FREE_PVAL   # RAT..PAW PRINTS
1F43F       ; UNASSIGNED  #
1F440       ; FREE_PVAL   # EYES
1F441       ; UNASSIGNED  #
1F442..1F4F7; FREE_PVAL   # EAR..CAMERA
1F4F8       ; UNASSIGNED  #
1F4F9..1F4FC; FREE_PVAL   # VIDEO CAMERA..VIDEOCASSETTE
1F4FD..1F4FF; UNASSIGNED  # ..
1F500..1F53D; FREE_PVAL   # TWISTED RIGHTWARDS ARROWS..DOWN-POINTI
1F53E..1F53F; UNASSIGNED  # ..
1F540..1F543; FREE_PVAL   # CIRCLED CROSS POMMEE..NOTCHED LEFT SEM
1F544..1F54F; UNASSIGNED  # ..
1F550..1F567; FREE_PVAL   # CLOCK FACE ONE OCLOCK..CLOCK FACE TWEL
1F568..1F5FA; UNASSIGNED  # ..
1F5FB..1F640; FREE_PVAL   # MOUNT FUJI..WEARY CAT FACE
1F641..1F644; UNASSIGNED  # ..
1F645..1F64F; FREE_PVAL   # FACE W NO GOOD GESTURE..PERSON W FO
1F650..1F67F; UNASSIGNED  # ..
1F680..1F6C5; FREE_PVAL   # ROCKET..LEFT LUGGAGE
1F6C6..1F6FF; UNASSIGNED  # ..
1F700..1F773; FREE_PVAL   # ALCHEMICAL SYMBOL FOR QUINTESSENCE..AL
1F774..1FFFF; UNASSIGNED  # ..
20000..2A6D6; PVALID      #
2A6D7..2A6FF; UNASSIGNED  # ..
2A700..2B734; PVALID      #
2B735..2B73F; UNASSIGNED  # ..
2B740..2B81D; PVALID      #
2B81E..2F7FF; UNASSIGNED  # ..
2F800..2FA1D; PVALID      # CJK COMP IDEO-2F800..CJK COMPA
2FA1E..2FFFD; UNASSIGNED  # ..
2FFFE..2FFFF; DISALLOWED  # ..
30000..3FFFD; UNASSIGNED  # ..
3FFFE..3FFFF; DISALLOWED  # ..
40000..4FFFD; UNASSIGNED  # ..
4FFFE..4FFFF; DISALLOWED  # ..
50000..5FFFD; UNASSIGNED  # ..
5FFFE..5FFFF; DISALLOWED  # ..
60000..6FFFD; UNASSIGNED  # ..
6FFFE..6FFFF; DISALLOWED  # ..
70000..7FFFD; UNASSIGNED  # ..
7FFFE..7FFFF; DISALLOWED  # ..
80000..8FFFD; UNASSIGNED  # ..
8FFFE..8FFFF; DISALLOWED  # ..
90000..9FFFD; UNASSIGNED  # ..
9FFFE..9FFFF; DISALLOWED  # ..
A0000..AFFFD; UNASSIGNED  # ..
AFFFE..AFFFF; DISALLOWED  # ..
B0000..BFFFD; UNASSIGNED  # ..
BFFFE..BFFFF; DISALLOWED  # ..
C0000..CFFFD; UNASSIGNED  # ..
CFFFE..CFFFF; DISALLOWED  # ..
D0000..DFFFD; UNASSIGNED  # ..
DFFFE..DFFFF; DISALLOWED  # ..
E0000       ; UNASSIGNED  #
E0001       ; DISALLOWED  # LANGUAGE TAG
E0002..E001F; UNASSIGNED  # ..
E0020..E007F; DISALLOWED  # TAG SPACE..CANCEL TAG
E0080..E00FF; UNASSIGNED  # ..
E0100..E01EF; DISALLOWED  # VAR SEL-17..VAR SEL-256
E01F0..EFFFD; UNASSIGNED  # ..
EFFFE..10FFFF; DISALLOWED # ..

Appendix B.  Acknowledgements

The authors would like to acknowledge the comments and contributions
of the following individuals during working group discussion: David
Black, Mark Davis, Alan DeKok, Martin Duerst, Patrik Faltstrom, Ted
Hardie, Joe Hildebrand, Bjoern Hoehrmann, Paul Hoffman, Jeffrey
Hutzelman, Simon Josefsson, John Klensin, Alexey Melnikov, Takahiro
Nemoto, Yoav Nir, Mike Parker, Pete Resnick, Andrew Sullivan, Dave
Thaler, Yoshiro Yoneya, and Florian Zeitz.

Charlie Kaufman performed a helpful review on behalf of the Security
Directorate, and Tom Taylor reviewed the document on behalf of the
General Area Review Team.

During IESG review, Alissa Cooper provided comments that led to
further improvements.

Some algorithms and textual descriptions have been borrowed from
[RFC5892].  Some text regarding security has been borrowed from
[RFC5890] and [I-D.ietf-xmpp-6122bis].

Peter Saint-Andre wishes to acknowledge Cisco Systems, Inc., for
employing him during his work on earlier versions of this document.

Authors' Addresses

Peter Saint-Andre
&yet

Email: ietf@stpeter.im


Marc Blanchet
Viagenie
246 Aberdeen
Quebec, QC  G1R 2E1
Canada

Email: Marc.Blanchet@viagenie.ca
URI:   http://www.viagenie.ca/