INTERNET-DRAFT Soobok Lee draft-ietf-idn-lsb-ace-00.txt Expires 2001-Dec-28 2001-Jun-28 Improving ACE using code point reordering v0.9 Status of this Memo This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html Distribution of this document is unlimited. Please send comments to the authors or to the idn working group at idn@ops.ietf.org. Abstract This document describes code point reordering to improve ACE compression algorithms. Based on frequency statistics of characater sets, this reordering is easily implemented with simple character mapping tables. DUDE implementation of this idea shows great improvements for Hangul, Chineses, Vietnamese and European domains. Contents Overview Hangul Basic Latin Extended Latin and Combining Diacritical Marks Unified Han Other character sets Modified Encoding procedure of DUDE implementation of this idea Modified Decoding procedure of DUDE implementation of this idea Example strings Security considerations References Author LDUDE: Example implementation into DUDE Overview Pursuing shorter ACE labels is justified to save memory resources and to reduce internet traffic even for domains of average length in various application/core internet protocols. Both 11172 Hangul syllables and 24000 or more CJK Han syllables occupy roughly half of the entire unicode space. Their lexicographical ordering( not in frequency ordering) makes various ACE compression technique work poorly for them. Frequently used syllables are spread evenly through out those wide ranges. Even, Latin characters code range, including 'a' - 'z' has lexicographical order that does not reflect the fact that 's','t' and 'r' are more frequently used than 'j','k' and 'h'. Each Hangul code point integer have useful bit structure in it. That reflects its hangul jamo(consonant/vowel) combination which provides us the hint for applying reordering by jamo usage frequency. Most ACE algorithms show good compression ratio when frequently used characters are re-located into narrower code areas. Especially, to reduce DUDE XOR distance, we can make the narrow area fit in 16,256 and 4096 code-long page boundaries. Hangul Each code point integer of modern hangul syllables(u+ac00 ~ u+bd7f) can be decomposed to produce the following 3 jamo (hangul consonant/vowel) indices, 1) L: a leading consonant index in the array of modern 19 leading consonants, 2) V: a vowel index in the array of modern 21 vowels, 3) T: an optional trailing consonant index in the array of modern 28 trailing consonants. In fact, Unicode hangul range (u+ac00 ~ u+bd7f) is a 3-dimensional array Hangul[L][V][T] with base 0xac00 and each hangul syllable has code point integer determined by L*21*28 + V*28 + T + 0xac00. Statistically, roughly six of 28 hangul trailing consonants are frequent in korean nouns. Leading consontants and vowels show relatively even frequency distributions compared with that of trailing consonants. A frequency mapping table f(T) for T is defined so that f(T)= FrequencyOrderOf(T) in the array of modern trailing consonants. Hangul[L][V][T] is reordered into another 3-dimensional array Hangul[f(T)][f(V)][f(L)] with index T augmented to the highest order. Each hangul syllable in this reorderd range has a new code point value determined by f(T)*21*19 + f(V)*19 + f(L) + 0xac00. Average f(T) value should be sufficiently lower than f(L) and F(V), so that the lowered code point value and its induced lowered successive XOR-distances contribute to produce shorter ACE labels. Basic Latin Basic Latin row u+0070 ~ u+007f has 'p','r','s','t' and 'u' which are more frequently used in European nouns than '`','j','k','f' and 'g' in u+0060 ~ u+006f row which includes most frequently used 'a'~'o' . If these two sets of 5 characters are swapped character-wise, 'p','r','s','t','u' go into the u+0060 ~ u+006f row. Single row fits in 16 codes boundary and any character sequences from only this single row make XOR-distance or code window length shorter than 0x10 and make DUDE and other ACEs do good compression. Extended Latin and Combining Diacritical Marks First 6 rows from Latin Extension A(u+0100 ~ u+015f) and 6 rows from Basic Latin & Latin-1 Supplement (u+0000 ~ u+002f and u+0080 ~ u+00a0) are swapped. First 3 rows from Combining Diacritical Marks(u+0300 ~ u+032f) and 3 rows from Latin-1 Supplement (u+00B0 ~ u+00df) are also swapped. This makes frequently used parts of Latin Extended-A and Combining Diacritical Marks go into first 256-codes page(u+0000 ~ u+00ff). Any character sequences from this single 256-codes-long page make XOR-distance or code window length much shorter than 0x100. This improvement benefits especially East-European and Vietnamese that use Latin Extented A and Combining Diacritical Marks. Unified Han CJK Unified Han syllables (u+4e00 ~ u+a48f) are ordered by their radicals. Most frequently used 2048 Traditional/Simpilfied Chinese Han syllables are reordered into single 4096-code aligned page. Other character sets I have no statistical data to optimize for this code ranges, yet. Japaneses katakana and hiragana (u+3040 ~ u+30ff) code points have lexicographical order and if we could put most frequent ones in a single row, it may greatly improve katakana-only or hiragana-only domains. For Arabic,Hebrew,Cyrillic and Hindi, we could devise similiar optimizations, if relevant frequency statistical data are available. Modified Encoding procedure of DUDE implementation of this idea All ordering of nybbles and quintets is big-endian (most significant first). A nybble is 4 bits. XOR is bitwise exclusive or. This modification is hyphen-safe. Hyphen encoding and decoding are not affected by this modification. let prev = 96 for each input integer n (in order) do begin if n == 45 then output hyphen minus else begin n = reorder(n) // ******** ADDED ********** let diff = prev XOR n extract the least significant nybbles of diff, as few as are sufficient to hold all the nonzero bits (but at least one) prepend 0 to the last nybble and 1 to the rest output base-32 characters corresponding to the quintets let prev = n end end The encoder must either correctly handle all integer values that can be represented in the type of its input, or it must check whether the input contains values that it cannot handle and return an error if so. Under no circumstances may it produce incorrect output. Modified Decoding procedure of DUDE implementation of this idea let prev = 96 while the input string is not exhausted do begin if the next character is hyphen-minus then output 45 else begin input characters and convert them to quintets until encountering a quintet beginning with 0 fail upon encountering a non-base-32 character or end-of-input strip the first bit of each quintet concatenate the resulting nybbles to form diff let prev = prev XOR diff output restore_order(prev) // ******** MODIFIED ********** end end encode the output sequence and compare it to the input string fail if they are not equal Example strings about 20% improvement in DUDE compression ratio is achieved in these Hangul examples. (A) Korean Strings 1: ( 24 hangul syllables ) U+C138 U+ACC4 U+C758 U+BAA8 U+B4E0 U+C0AC U+B78C U+B4E4 U+C774 U+D55C U+AD6D U+C5B4 U+B97C U+C774 U+D574 U+D55C U+B2E4 U+BA74 U+C5BC U+B9C8 U+B098 U+C88B U+C744 U+AE4C DUDE-02: 6txIy79Ny53Nz79A8wIzwwNzzuAvyIzv3AtuuIz2vBy27Jz66Iz8sI\ tusAuIyz5I23Az96Iz6zE3xAz2tD96Ry3sI ( 89 chars ) LDUDE : 5szf8pt3bttat3jt6iywhu3bw9qt5m37r2vmxxjxsg2mtvat3auykz\ wpxubc8xd42fw6p ( 70 chars ) (B) Korean Strings 2: ( 6 hangul syllables ) U+C138 U+ACC4 U+C758 U+BAA8 U+B4E0 U+C0AC DUDE-02: 6txiy79ny53nz79a8wizwwn ( 24 chars ) LDUDE : 5szf8pt3bttat3jt6i ( 19 chars ) about 15%~20% improvement in DUDE compression ratio is achieved in these UniHan examples. (C) Traditional Chinese Strings 1: ( 16 syllables ) u+5354 u+91c7 u+5065 u+5eb7 u+4e8b u+696d u+670d u+52d9 u+7db2 u+002d u+5354 u+91c7 u+6709 u+9650 u+516c u+53f8 DUDE-02: xvve6u3d6t4c87ctsvnuz8g8yavx7eu9ym-u88g6u3d9y6q9txj6z\ vnu3e (59 chars) LDUDE : w84bt2sp62tc66vgu2btvi5mmtvq-t8tqt2spy6juznu5iuwg (50 chars) (D) Traditional Chinese Strings 2: ( 21 syllables ) u+5317 u+4eac u+5e02 u+91ab u+85e5 u+7d93 u+6fdf u+6280 u+8853 u+7d93 u+71df u+516c u+53f8 u+5fa1 u+91ab u+7db2 u+7d61 u+83ef u+91ab u+7db2 u+8def DUDE-02: xvzht75mts4q694jtwwq92zgtuwn7xr847d9x6a6wnus5du3e6xj6\ 8sk86tj7d982qtuwe86tj9sxp ( 79 chars) LDUDE : xturwzfwxiv4kywgw9jv5jt2awiuzbu3buuhuwg7ug7yjuthv3pv6\ buwmuthwd ( 63 chars ) (E) Traditional Chinese Strings 3: ( 18 syllables ) u+795e u+8fb2 u+7db2 u+990a u+8eab u+4fdd u+5065 u+7db2 u+5065 u+5eb7 u+4e16 u+754c u+5065 u+5eb7 u+8a2d u+8a08 u+5bb6 u+60e0 DUDE-02: z3vq9y8n9usa8w5itz4b6tzgt95iu77hu77h87cts4bv5xkuxuj87\ c7w3kuf7t5qv5xg ( 69 chars ) LDUDE : xtxkwybx2euuev9ezn694p68sb68sb66vgv4k9q69yc66vgv6gvit\ xhw5b ( 59 chars ) (F) Simplified Chinese String 1 : ( 16 syllables ) u+4e2d u+534e u+4eba u+6c11 u+5171 u+548c u+56fd u+5bf9 u+5916 u+8d38 u+6613 u+7ecf u+6d4e u+5408 u+4f5c u+90e8 DUDE-02: w8wpt7ydt79euu4mv7yax9puzb7seu8r7wuq85umt27ntv2bv3wgt\ 5xe795e ( 61 chars ) LDUDE : xsyk8zc8za4ht8ntud6saz3j5xn773p79umu67ntv2bv75e2j4m ( 52 chars ) (G) Simplified Chinese String 2 : ( 18 syllables ) u+4e2d u+56fd u+4eba u+6c11 u+5927 u+5b66 u+4e2d u+56fd u+8d22 u+653f u+91d1 u+878d u+653f u+7b56 u+7814 u+7a76 u+4e2d u+5fc3 DUDE-02: w8wpt27at2whuu4mvxvguwbtxwmt27a757r82tp9w8qtyxn8u5ct\ 8yjvwcuycvwxmtt8q ( 73 chars ) LDUDE : xsyk6yk6yi4h4j5ya5yn6yk7twc76vjwcywhysfu7iu8rxktznwj (53 chars ) LDUDE show better compression ratios for other scripts,too. (H) Vietnamese: ( 38 syllables using diacritical marks ) Taisaohokhngthchi\ noitingVit U+0054 u+0061 u+0323 u+0069 u+0073 u+0061 u+006F u+0068 u+006F u+0323 u+006B u+0068 u+00F4 u+006E u+0067 u+0074 u+0068 u+00EA u+0309 u+0063 u+0068 u+0069 u+0309 u+006E u+006F u+0301 u+0069 u+0074 u+0069 u+00EA u+0301 u+006E u+0067 U+0056 u+0069 u+00EA u+0323 u+0074 DUDE-02: vEvfvwcvwktktcqhhvwnvwid3n3kjtdtn2cv8dvykmbvyavyhbvyqv\ yitptp2dv8mvyrjvBvr2dv6jvxh ( 82 chars ) LDUDE : uGuh5c5kckqhh5n4atm3n3ktmtdq2cxd7kmb7a7hb7q7irr2dxm7rt\ muDvr2dvj5f (66 chars , 16 chars(19%) shorter) (I) Spanish: ( using basic Latin & Latin Supplement ) PorqunopuedensimplementehablarenEspaol U+0050 u+006F u+0072 u+0071 u+0075 u+00E9 u+006E u+006F u+0070 u+0075 u+0065 u+0064 u+0065 u+006E u+0073 u+0069 u+006D u+0070 u+006C u+0065 u+006D u+0065 u+006E u+0074 u+0065 u+0068 u+0061 u+0062 u+006C u+0061 u+0072 u+0065 u+006E U+0045 u+0073 u+0070 u+0061 u+00F1 u+006F u+006C DUDE-02: vAvrtpde3n2hbtrftabbmtptketptnjiimtktbpjdqptdthmuMvgdt\ b3a3qd (61 chars) LDUDE : uAurftmtg2q2hbrhcbbmfcepnjiimidpjdqpmrmuMuqmb3a3qd (51 chars, 10 chars (16%) shorter) (J) Czech: (using Latin Extended A) Proprostnemluvesky U+0050 u+0072 u+006F u+010D u+0070 u+0072 u+006F u+0073 u+0074 u+011B u+006E u+0065 u+006D u+006C u+0075 u+0076 u+00ED u+010D u+0065 u+0073 u+006B u+0079 DUDE-02: vAuctptyctzpctptnhtyrtzfmibtjd3mt8atyitgtitc (45 chars) LDUDE : uAukfycypkfepzpzfmibmtb3m8ayiqtik(34 chars,24% shorter) Security considerations ACE-encoded reordered code points are restored in reverse ACE translation with no problem, and this improvement do not introduce any new security problems into ACE. References [DUDE02] Mark Welter, Brian Spolarich, Adam Costello, "DUDE: Differential Unicode Domain Encoding", 2001-May-31, draft-ietf-idn-dude-02. [AMCACEW] Adam Costello, "AMC-ACE-W version 0.1.0", 2001-May-31, draft-ietf-idn-amc-ace-w-00, latest version at http://www.cs.berkeley.edu/~amc/charset/amc-ace-w. [UNICODE] The Unicode Consortium, "The Unicode Standard", http://www.unicode.org/unicode/standard/standard.html. [IDNA] Patrik Falstrom, Paul Hoffman, "Internationalizing Host Names In Applications (IDNA)", draft-ietf-idn-idna-01 [NAMEPREP] Paul Hoffman, Marc Blanchet, "Preparation of Internationalized Host Names", Feb 2001, draft-ietf-idn-nameprep-03 Author Soobok Lee Postel Services, Inc. http://www.postel.co.kr Tel: +82-11-9774-2737 Example implementation This idea is applicable to any ACEs. LDUDE (in this example implementation) is merely a name for DUDE implementation of this idea. Embedded hangul jamo and Latin frequency tables are subject to change with further studies in the next revision of this draft. In Unix, save this example source code into ldude.c % cc -o ldude ldude.c % ./ldude -e < input_file > output_file % ./ldude -d < output_file A input file should contains u+????-form code points delimited with spaces or newlines. /******************************************/ /* ldude.c 1.0 (2001-Jun-28) */ /* Soobok Lee */ /* Adam M. Costello */ /******************************************/ /* This is ANSI C code (C89) implementing */ /* DUDE (draft-ietf-idn-ldude-01). */ /************************************************************/ /* Public interface (would normally go in its own .h file): */ #include #include enum dude_status { dude_success, dude_bad_input, dude_big_output /* Output would exceed the space provided. */ }; enum case_sensitivity { case_sensitive, case_insensitive }; #if UINT_MAX >= 0x1FFFFF typedef unsigned int u_code_point; #else typedef unsigned long u_code_point; #endif enum dude_status dude_encode( unsigned int input_length, const u_code_point input[], const unsigned char uppercase_flags[], unsigned int *output_size, char output[] ); /* dude_encode() converts Unicode to DUDE (without any */ /* signature). The input must be represented as an array */ /* of Unicode code points (not code units; surrogate pairs */ /* are not allowed), and the output will be represented as */ /* null-terminated ASCII. The input_length is the number of code */ /* points in the input. The output_size is an in/out argument: */ /* the caller must pass in the maximum number of characters */ /* that may be output (including the terminating null), and on */ /* successful return it will contain the number of characters */ /* actually output (including the terminating null, so it will be */ /* one more than strlen() would return, which is why it is called */ /* output_size rather than output_length). The uppercase_flags */ /* array must hold input_length boolean values, where nonzero */ /* means the corresponding Unicode character should be forced */ /* to uppercase after being decoded, and zero means it is */ /* caseless or should be forced to lowercase. Alternatively, */ /* uppercase_flags may be a null pointer, which is equivalent */ /* to all zeros. The encoder always outputs lowercase base-32 */ /* characters except when nonzero values of uppercase_flags */ /* require otherwise. The return value may be any of the */ /* dude_status values defined above; if not dude_success, then */ /* output_size and output may contain garbage. On success, the */ /* encoder will never need to write an output_size greater than */ /* input_length*k+1 if all the input code points are less than 1 */ /* << (4*k), because of how the encoding is defined. */ enum dude_status dude_decode( enum case_sensitivity case_sensitivity, char scratch_space[], const char input[], unsigned int *output_length, u_code_point output[], unsigned char uppercase_flags[] ); /* dude_decode() converts DUDE (without any signature) to */ /* Unicode. The input must be represented as null-terminated */ /* ASCII, and the output will be represented as an array of */ /* Unicode code points. The case_sensitivity argument influences */ /* the check on the well-formedness of the input string; it */ /* must be case_sensitive if case-sensitive comparisons are */ /* allowed on encoded strings, case_insensitive otherwise. */ /* The scratch_space must point to space at least as large */ /* as the input, which will get overwritten (this allows the */ /* decoder to avoid calling malloc()). The output_length is */ /* an in/out argument: the caller must pass in the maximum */ /* number of code points that may be output, and on successful */ /* return it will contain the actual number of code points */ /* output. The uppercase_flags array must have room for at */ /* least output_length values, or it may be a null pointer if */ /* the case information is not needed. A nonzero flag indicates */ /* that the corresponding Unicode character should be forced to */ /* uppercase by the caller, while zero means it is caseless or */ /* should be forced to lowercase. The return value may be any */ /* of the dude_status values defined above; if not dude_success, */ /* then output_length, output, and uppercase_flags may contain */ /* garbage. On success, the decoder will never need to write */ /* an output_length greater than the length of the input (not */ /* counting the null terminator), because of how the encoding is */ /* defined. */ /**********************************************************/ /* Implementation (would normally go in its own .c file): */ #include /* Character utilities: */ /* base32[q] is the lowercase base-32 character representing */ /* the number q from the range 0 to 31. Note that we cannot */ /* use string literals for ASCII characters because an ANSI C */ /* compiler does not necessarily use ASCII. */ static const char base32[] = { 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, /* a-k */ 109, 110, /* m-n */ 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, /* p-z */ 50, 51, 52, 53, 54, 55, 56, 57 /* 2-9 */ }; /* base32_decode(c) returns the value of a base-32 character, in the */ /* range 0 to 31, or the constant base32_invalid if c is not a valid */ /* base-32 character. */ enum { base32_invalid = 32 }; static unsigned int base32_decode(char c) { if (c < 50) return base32_invalid; if (c <= 57) return c - 26; if (c < 97) c += 32; if (c < 97 || c == 108 || c == 111 || c > 122) return base32_invalid; return c - 97 - (c > 108) - (c > 111); } /* unequal(case_sensitivity,s1,s2) returns 0 if the strings s1 and s2 */ /* are equal, 1 otherwise. If case_sensitivity is case_insensitive, */ /* then ASCII A-Z are considered equal to a-z respectively. */ static int unequal( enum case_sensitivity case_sensitivity, const char s1[], const char s2[] ) { char c1, c2; if (case_sensitivity != case_insensitive) return strcmp(s1,s2) != 0; for (;;) { c1 = *s1; c2 = *s2; if (c1 >= 65 && c1 <= 90) c1 += 32; if (c2 >= 65 && c2 <= 90) c2 += 32; if (c1 != c2) return 1; if (c1 == 0) return 0; ++s1, ++s2; } } /* LANGUAGE-SPECIFIC IMPROVEMENTS TO DUDE BASED ON CODE REORDERING */ /* Common Constants for Hangul */ static u_code_point SBase = 0xAC00, LBase = 0x1100, VBase = 0x1161, TBase = 0x11A7, LCount = 19, VCount = 21, TCount = 28, NCount, SCount; static u_code_point NCount = 588; // VCount * TCount static u_code_point SCount = 11172; // LCount * NCount static u_code_point MCount = 399; // LCount * VCount int JAMO_L_FREQ[19][2] = { {1,11}, // "G" 0 {14,0}, // "GG" {9,9}, // "N" {5,12}, // "D" {15,7}, // "DD" {13,3}, // "R" 5 {7,18}, // "M" {4,6 }, // "B" {16,14}, // "BB" {2, 2}, // "S" {17,17}, // "SS" 10 {0,16}, // "O" {3,15}, // "J" {18,5 }, // "JJ" {8,1}, // "C" {12,4}, // "K" 15 {11,8}, // "T" {10,10}, // "P" {6 ,13} // "H" }; int JAMO_V_FREQ[21][2] = { {2,20}, // "A" 0 {7,5}, // "AE" {15,0}, // "YA" {20,13}, // "YAE" {5,18}, // "EO" {1, 4}, // "E" 5 {9,8}, // "YEO" {13,1},// "YE" {6,13}, // "O" {10, 6}, // "WA" {18,9}, // "WAE" 10 {16,17}, // "OE" {17,14}, // "YO" {8, 7}, // "U" {12,16}, // "WEO" {19, 2}, // "WE" 15 {14,11}, // "WI" {11,12}, // "YU" {4,10}, // "EU" {19,19}, // "YI" {0,3} // "I" 20 }; int JAMO_T_FREQ[28][2] = { {0,0}, // "" 0 {5,4}, // "G" {15,21}, // "GG" {16,8}, // "GS" {1,16}, // "N" {27,1 }, // "NJ" 5 {17,17}, // "NH" {10,19}, // "D" {3,24}, // "L" {18,27}, // "LG" {19, 7}, // "LM" 10 {20,22}, // "LB" {21,25}, // "LS" {22,26}, // "LT" {23,27}, // "LP" {24,2}, // "LH" 15 {4,3}, // "M" {6,6}, // "B" {25,9}, // "BS" {7,10}, // "S" {26,11}, // "SS" 20 {2,12}, // "NG" {11,13}, // "J" {10,14}, // "C" {8 ,15}, // "K" {12,18}, // "T" 25 {13,20}, // "P" {9 ,5} // "H" 27 }; /* Hangul Decomposition */ int isHANGUL(u_code_point s) { int SIndex = s - SBase; if (SIndex < 0 || SIndex >= SCount) { return 0; } return 1; }; int isUNIHAN(u_code_point s) { if (s >= 0x4E00 && s <= 0x9FAF) { return 1; } return 0; }; int isLatins(u_code_point s) { if (s < 0x370) { return 1; } return 0; }; u_code_point reorder_hangul(u_code_point s) { u_code_point SIndex = s - SBase; int zL = JAMO_L_FREQ[SIndex / NCount][0]; int zV = JAMO_V_FREQ[(SIndex % NCount) / TCount][0]; int zT = JAMO_T_FREQ[SIndex % TCount][0]; int idx= (zT*MCount + zV*LCount + zL); if(idx < SCount-0x400) { idx += 0x400; } else { idx = idx - (SCount -0x400); }; return idx + SBase; } u_code_point restore_order_hangul(u_code_point z) { int T,V,L; u_code_point zIndex = z-SBase; if(zIndex < 0x400) { zIndex += SCount - 0x400; } else { zIndex = zIndex - 0x400; }; T = JAMO_T_FREQ[(zIndex / MCount)][1]; V = JAMO_V_FREQ[(zIndex % MCount) / LCount][1]; L = JAMO_L_FREQ[(zIndex % LCount)][1]; return (L*NCount + V*TCount + T) + SBase; } #define U8(A,B,C,D,E,F,G,H) U(A);U(B);U(C);U(D);U(E);U(F);U(G);U(H); #define U(A) if(j==A) return(i); if(j==i) return(A); i++; u_code_point reorder_unihan(u_code_point s) { register u_code_point i=0x5000; register u_code_point j=s; // TC : most frequent 2048 U8(0x7684,0x662f,0x4e0d,0x6211,0x4e00,0x6709,0x5927,0x5728); U8(0x4eba,0x4e86,0x4e2d,0x5230,0x8cc7,0x8981,0x4ee5,0x53ef); U8(0x9019,0x500b,0x4f60,0x6703,0x597d,0x70ba,0x4e0a,0x4f86); U8(0x5b78,0x5c31,0x4ea4,0x4e5f,0x7528,0x80fd,0x5982,0x6642); U8(0x6587,0x8aaa,0x6c92,0x4ed6,0x770b,0x90a3,0x554f,0x751f); U8(0x63d0,0x4e0b,0x904e,0x8acb,0x5011,0x5929,0x6240,0x591a); U8(0x9ebc,0x5c0f,0x4e4b,0x60f3,0x5f97,0x5de5,0x51fa,0x9084); U8(0x96fb,0x5c0d,0x90fd,0x6a5f,0x81ea,0x800c,0x5b50,0x5f8c); U8(0x8a0a,0x5bb6,0x7ad9,0x5fc3,0x53ea,0x53bb,0x77e5,0x570b); U8(0x5f88,0x53f0,0x6210,0x4fe1,0x540c,0x4f55,0x7ae0,0x9053); U8(0x767c,0x5730,0x6cd5,0x7121,0x7136,0x4f46,0x7576,0x65bc); U8(0x55ce,0x672c,0x5e74,0x73fe,0x524d,0x6700,0x771f,0x65b0); U8(0x548c,0x56e0,0x679c,0x610f,0x5b9a,0x9ede,0x60c5,0x5176); U8(0x984c,0x4e8b,0x79d1,0x65b9,0x4e9b,0x6e05,0x4e09,0x6a23); U8(0x6b64,0x5427,0x4f4d,0x4f5c,0x7406,0x884c,0x8005,0x7d93); U8(0x540d,0x4ec0,0x8b1d,0x65e5,0x6b63,0x958b,0x8a71,0x8207); U8(0x5be6,0x611b,0x518d,0x83ef,0x4e8c,0x57ce,0x52d5,0x6bd4); U8(0x9762,0x9ad8,0x53c8,0x6216,0x529b,0x61c9,0x5973,0x7a2e); U8(0x6559,0x8eca,0x5206,0x50cf,0x7cfb,0x9577,0x624b,0x6b21); U8(0x5df2,0x660e,0x6253,0x592a,0x8def,0x8d77,0x5df1,0x76f8); U8(0x4e3b,0x95dc,0x5341,0x9593,0x9cf3,0x5916,0x5462,0x89ba); U8(0x4f7f,0x8a72,0x53cb,0x624d,0x9032,0x51f0,0x5979,0x6c11); U8(0x8457,0x5404,0x5168,0x5c07,0x5c11,0x5169,0x52a0,0x56de); U8(0x611f,0x5f0f,0x7b2c,0x7403,0x6027,0x8001,0x7a0b,0x628a); U8(0x88ab,0x516c,0x8ad6,0x53ca,0x9f8d,0x6821,0x5225,0x9ad4); U8(0x91cd,0x7d66,0x807d,0x6c34,0x505a,0x5e38,0x60a8,0x898b); U8(0x88e1,0x6771,0x98a8,0x89e3,0x7063,0x6708,0x7b49,0x5566); U8(0x90e8,0x539f,0x7f8e,0x5148,0x97f3,0x901a,0x7ba1,0x7db2); U8(0x5340,0x671f,0x932f,0x5426,0x6a02,0x5165,0x627e,0x66f8); U8(0x8b93,0x56db,0x554a,0x7531,0x9078,0x8f03,0x6578,0x8868); U8(0x5167,0x5834,0x5b83,0x5f9e,0x5feb,0x6b61,0x81f3,0x7acb); U8(0x76ee,0x793e,0x5408,0x671b,0x600e,0x8a8d,0x544a,0x66f4); U8(0x5e7e,0x8003,0x5ea6,0x96e3,0x7248,0x982d,0x559c,0x8a31); U8(0x5149,0x4eca,0x8cb7,0x7b97,0x5f1f,0x82e5,0x7d71,0x8eab); U8(0x8a18,0x4ee3,0x865f,0x8655,0x5b8c,0x63a5,0x8a08,0x8a00); U8(0x5b57,0x5e2b,0x4e26,0x653f,0x73a9,0x5f35,0x7537,0x8ab0); U8(0x5c71,0x6bcf,0x7d50,0x4e14,0x661f,0x975e,0x5efa,0x6539); U8(0x9023,0x653e,0x54c8,0x6d3b,0x7814,0x76f4,0x8a2d,0x9673); U8(0x5831,0x8f49,0x9ee8,0x6307,0x4e94,0x8b8a,0x6c23,0x897f); U8(0x8a66,0x5e0c,0x795e,0x53d6,0x5316,0x7269,0x738b,0x4efb); U8(0x6797,0x55ae,0x4e16,0x53d7,0x8fd1,0x7fa9,0x6b7b,0x4fbf); U8(0x53cd,0x58eb,0x6230,0x7a7a,0x968a,0x8ddf,0x537b,0x5317); U8(0x5fc5,0x696d,0x529f,0x5beb,0x5f71,0x8072,0x5e73,0x81fa); U8(0x54e1,0x91d1,0x8a0e,0x8272,0x5247,0x5bb9,0x6a94,0x7247); U8(0x5411,0x59b3,0x5e02,0x5229,0x8208,0x767d,0x5f37,0x5b89); U8(0x592e,0x7279,0x8b70,0x8fa6,0x50f9,0x7e3d,0x50b3,0x601d); U8(0x82b1,0x5143,0x53eb,0x4fdd,0x4efd,0x6c42,0x7a76,0x5475); U8(0x4ef6,0x672a,0x6c7a,0x7d44,0x842c,0x7af9,0x7d1a,0x6301); U8(0x7b11,0x6295,0x54ea,0x5ba4,0x66fe,0x8d70,0x5594,0x6a19); U8(0x6d41,0x652f,0x7368,0x8c93,0x5361,0x9700,0x5144,0x9580); U8(0x5171,0x8a9e,0x6d77,0x53e3,0x963f,0x7dda,0x99ac,0x9ec3); U8(0x53c3,0x822c,0x547d,0x8996,0x89c0,0x806f,0x8166,0x670b); U8(0x683c,0x5152,0x516b,0x4fee,0x6599,0x9322,0x5931,0x5403); U8(0x4f4f,0x5373,0x53e6,0x9304,0x5c08,0x8c61,0x63db,0x57fa); U8(0x677f,0x62ff,0x9060,0x901f,0x5f62,0x5b69,0x5099,0x6b4c); U8(0x5e6b,0x78ba,0x5019,0x9664,0x754c,0x88dd,0x985e,0x8b1b); U8(0x5668,0x5357,0x6848,0x756b,0x82f1,0x8a34,0x5e36,0x5dee); U8(0x4e4e,0x91cf,0x4e45,0x6389,0x4f3c,0x6574,0x5f15,0x73ed); U8(0x8ff7,0x5716,0x5236,0x8cbb,0x8cfd,0x5947,0x8b58,0x578b); U8(0x8d85,0x908a,0x8036,0x54c1,0x820d,0x96d6,0x59cb,0x904b); U8(0x674e,0x52d9,0x6b0a,0x9a57,0x6545,0x516d,0x8b80,0x602a); U8(0x98db,0x6eff,0x670d,0x5922,0x6536,0x773c,0x9020,0x5ff5); U8(0x7559,0x8ab2,0x8ecd,0x7834,0x7cbe,0x534a,0x7d04,0x9858); U8(0x4ee4,0x5e95,0x7b54,0x6f14,0x9054,0x96c4,0x6df1,0x7968); U8(0x65e9,0x9662,0x5920,0x66f2,0x5047,0x8ac7,0x8853,0x68d2); U8(0x8ce3,0x9ed1,0x767e,0x52dd,0x63a8,0x5b58,0x706b,0x6e96); U8(0x793a,0x5f80,0x789f,0x6613,0x6cc1,0x665a,0x96e2,0x6cbb); U8(0x5c0e,0x4e03,0x6bb5,0x5718,0x8abf,0x8b49,0x5217,0x50b7); U8(0x6c38,0x525b,0x6392,0x54e5,0x5fb7,0x4e5d,0x751a,0x6bba); U8(0x7167,0x8edf,0x5305,0x6015,0x689d,0x591c,0x5546,0x6982); U8(0x6839,0x4f9b,0x7d55,0x5343,0x5ba2,0x5207,0x96c6,0x7a31); U8(0x64da,0x843d,0x8d8a,0x7adf,0x76e1,0x5f85,0x805e,0x5712); U8(0x5fd8,0x503c,0x7522,0x6d88,0x96d9,0x7d05,0x5ea7,0x5c55); U8(0x80b2,0x8dd1,0x9644,0x561b,0x57f7,0x5531,0x6280,0x67d0); U8(0x786c,0x65af,0x96f2,0x904a,0x606f,0x52a9,0x9808,0x82e6); U8(0x4ecb,0x6548,0x9996,0x8cea,0x4f8b,0x5509,0x8077,0x5fa9); U8(0x8f38,0x7bc0,0x898f,0x6ce8,0x7562,0x67e5,0x71b1,0x6cb9); U8(0x9928,0x614b,0x505c,0x798f,0x6551,0x5012,0x89aa,0x5bb3); U8(0x4e82,0x53e4,0x6b65,0x5bf6,0x64ca,0x8209,0x7d42,0x55ef); U8(0x5370,0x9650,0x4f9d,0x65b7,0x8f15,0x74b0,0x7c21,0x8da3); U8(0x5fd7,0x97ff,0x96a8,0x7df4,0x7e8c,0x9b5a,0x7bc7,0x53f8); U8(0x5c40,0x9001,0x6975,0x89d2,0x7701,0x6e90,0x967d,0x5e79); U8(0x7fd2,0x7f85,0x6b66,0x514d,0x7591,0x62c9,0x514b,0x4ecd); U8(0x6a13,0x4f5b,0x8db3,0x4f4e,0x5ee3,0x7169,0x9ce5,0x986f); U8(0x78bc,0x571f,0x7387,0x8056,0x58de,0x521d,0x5177,0x9810); U8(0x5440,0x773e,0x8cac,0x722d,0x5175,0x667a,0x8aa4,0x5883); U8(0x9752,0x9806,0x91ce,0x695a,0x8cb4,0x8ca0,0x58d3,0x53f2); U8(0x9069,0x4fc2,0x6e2c,0x61f7,0x8fce,0x914d,0x9b54,0x6162); U8(0x54c7,0x61c2,0x55da,0x4ea6,0x5473,0x8a55,0x821e,0x7d30); U8(0x91ab,0x5e1d,0x5c6c,0x53e5,0x6200,0x6557,0x5b9c,0x694a); U8(0x7532,0x8ffd,0x704c,0x6625,0x5de6,0x6562,0x9748,0x72c2); U8(0x969b,0x7fa4,0x65cf,0x6728,0x9a0e,0x91cc,0x9805,0x6232); U8(0x9047,0x72d7,0x4f73,0x535a,0x53f3,0x75db,0x71df,0x59b9); U8(0x5eb7,0x5584,0x5fb5,0x6b77,0x5b98,0x723e,0x6309,0x7de8); U8(0x75c5,0x8b77,0x88dc,0x64c7,0x6293,0x77f3,0x6b72,0x96bb); U8(0x9818,0x5c0b,0x6eab,0x990a,0x6b62,0x5b88,0x541b,0x8840); U8(0x7530,0x96e8,0x5c45,0x8b02,0x7570,0x512a,0x8df3,0x62dc); U8(0x721b,0x5c01,0x60e1,0x826f,0x6a21,0x72c0,0x6d6a,0x804a); U8(0x589e,0x6838,0x6fc0,0x7dad,0x9678,0x5433,0x725b,0x5fd9); U8(0x8a5e,0x5287,0x5bbf,0x6025,0x5565,0x62b1,0x975c,0x653b); U8(0x4e9e,0x6c5f,0x81f4,0x9663,0x56b4,0x5b97,0x8b66,0x58d8); U8(0x592b,0x5bc6,0x7761,0x5348,0x5e97,0x52e2,0x60b2,0x862d); U8(0x5e55,0x7de3,0x9031,0x5ee0,0x7c3d,0x5750,0x9999,0x723d); U8(0x63a7,0x5fae,0x767b,0x7ffb,0x666e,0x883b,0x51b7,0x5a01); U8(0x6bd2,0x4fca,0x7d61,0x8f2f,0x6bcd,0x5275,0x5802,0x8d99); U8(0x5957,0x820a,0x96dc,0x5468,0x8ff0,0x6050,0x5e78,0x4eae); U8(0x9e97,0x5df4,0x79ae,0x9152,0x4ec1,0x9910,0x724c,0x7a81); U8(0x8173,0x528d,0x62db,0x5409,0x7236,0x4ed4,0x5178,0x641e); U8(0x623f,0x7d20,0x9632,0x6388,0x5145,0x8349,0x66b4,0x616e); U8(0x7d39,0x80cc,0x5289,0x59d4,0x5e9c,0x666f,0x61b6,0x5c24); U8(0x8af8,0x7f3a,0x63f4,0x6f2b,0x7434,0x7f75,0x7d14,0x5c1a); U8(0x85dd,0x60dc,0x7f6e,0x76ca,0x59d0,0x8aa0,0x7e7c,0x6e56); U8(0x6b32,0x9ebb,0x9760,0x8089,0x677e,0x523b,0x7d00,0x9000); U8(0x65e2,0x542b,0x5224,0x91cb,0x76ae,0x6ce2,0x627f,0x5c04); U8(0x5806,0x83ab,0x88fd,0x9375,0x8d95,0x65c1,0x7b46,0x6241); U8(0x8a3b,0x594f,0x6a39,0x5f8b,0x9435,0x69ae,0x6628,0x6bdb); U8(0x5f69,0x6b78,0x864e,0x7f6a,0x7686,0x8449,0x552e,0x5f48); U8(0x885b,0x65bd,0x9298,0x5200,0x584a,0x6f22,0x6b23,0x5e03); U8(0x8cde,0x8f09,0x96aa,0x64ad,0x5347,0x9418,0x5bc4,0x5f04); U8(0x4ed8,0x69cb,0x56c9,0x78c1,0x87a2,0x5049,0x85a6,0x6d0b); U8(0x563f,0x555f,0x6885,0x7b56,0x563b,0x71c8,0x9b3c,0x6aa2); U8(0x5ba3,0x54e6,0x5abd,0x5747,0x6d3e,0x8c6c,0x6fdf,0x67b6); U8(0x4eab,0x5446,0x8a13,0x85cd,0x5283,0x64d4,0x52aa,0x90ed); U8(0x6b49,0x7d19,0x8cbc,0x6697,0x547c,0x7f77,0x5de7,0x6167); U8(0x7a7f,0x8a73,0x96f7,0x5354,0x7763,0x9867,0x81c9,0x9022); U8(0x5cf6,0x734e,0x6e38,0x6279,0x7565,0x77ed,0x5e7b,0x6c99); U8(0x6563,0x6575,0x9109,0x518a,0x8f2a,0x671d,0x7a97,0x5fcd); U8(0x6cb3,0x85cf,0x885d,0x6df7,0x552f,0x4e7e,0x51a0,0x719f); U8(0x9df9,0x86cb,0x5c0a,0x68c4,0x656c,0x5b63,0x5a5a,0x7e23); U8(0x7dca,0x4f2f,0x7533,0x8863,0x8cfc,0x50c5,0x5e33,0x5c64); U8(0x79cb,0x731c,0x504f,0x93e1,0x98df,0x559d,0x5077,0x8d0a); U8(0x72af,0x52c7,0x9846,0x59d3,0x675f,0x6de1,0x8a69,0x5609); U8(0x66c9,0x501f,0x5f92,0x6d32,0x64c1,0x5e8f,0x6176,0x7e3e); U8(0x795d,0x7345,0x570d,0x9918,0x79c1,0x9b25,0x67d4,0x6f02); U8(0x5bcc,0x79c0,0x7bc4,0x907f,0x8f1d,0x8b6f,0x5b64,0x7b28); U8(0x62ec,0x5438,0x7aef,0x79fb,0x5ef3,0x84cb,0x672b,0x5bdf); U8(0x6297,0x63ee,0x4e56,0x7a4d,0x63d2,0x9a5a,0x8521,0x5fe0); U8(0x6108,0x96ea,0x5de8,0x78a9,0x745e,0x51e1,0x6731,0x7c43); U8(0x4e1f,0x76e4,0x5076,0x6e2f,0x5bae,0x5e25,0x5b8f,0x96c5); U8(0x8ca8,0x9192,0x865b,0x907a,0x639b,0x900f,0x7206,0x70c8); U8(0x6ec5,0x6750,0x62cd,0x4f11,0x8a8c,0x5713,0x9280,0x4e92); U8(0x98ef,0x638c,0x8c6a,0x66ff,0x6311,0x9802,0x718a,0x7d2f); U8(0x5740,0x5065,0x4fe0,0x96de,0x56f0,0x8afe,0x8f14,0x4f34); U8(0x7389,0x6d17,0x654f,0x81e8,0x8857,0x964d,0x5538,0x6190); U8(0x8607,0x7c73,0x6e1b,0x64cd,0x760b,0x8f9b,0x5b87,0x805a); U8(0x516e,0x5f7c,0x63a1,0x5c46,0x5c3c,0x8f29,0x85e5,0x9a19); U8(0x7da0,0x7ae5,0x7e2e,0x7b26,0x7372,0x9ed8,0x90ce,0x78b0); U8(0x7981,0x5a46,0x54a7,0x80a1,0x62bd,0x780d,0x6068,0x80af); U8(0x520a,0x6ce1,0x6f38,0x92fc,0x5ee2,0x6dda,0x6a1e,0x8d0f); U8(0x6b50,0x8ca1,0x8a02,0x9014,0x89f8,0x8ce2,0x5091,0x812b); U8(0x7bb1,0x4f48,0x4ed9,0x51b0,0x6790,0x84bc,0x53ad,0x7c4d); U8(0x5805,0x54ed,0x590f,0x61f6,0x6a4b,0x7a69,0x9732,0x8a62); U8(0x68ee,0x7d72,0x5875,0x6163,0x6fe4,0x8abc,0x4f54,0x8336); U8(0x8cf4,0x968e,0x8c50,0x8c9d,0x6691,0x90f5,0x5371,0x5bd2); U8(0x68b0,0x96b1,0x7238,0x6c7d,0x6158,0x65c5,0x6751,0x4e88); U8(0x632f,0x4ea1,0x74f6,0x906d,0x54f2,0x96f6,0x7d0d,0x73cd); U8(0x4e43,0x5e2d,0x54e9,0x912d,0x5439,0x9aa8,0x7159,0x80e1); U8(0x502b,0x6d2a,0x6416,0x5a18,0x82b3,0x7d22,0x5cf0,0x7e54); U8(0x60e0,0x676f,0x6d6e,0x72fc,0x9694,0x6930,0x79df,0x5c3e); U8(0x5ffd,0x8907,0x838a,0x6b3e,0x83dc,0x526f,0x4f01,0x6298); U8(0x64fe,0x63da,0x9f20,0x5192,0x7661,0x9f13,0x523a,0x983b); U8(0x55b5,0x91dd,0x9738,0x66ab,0x4f0a,0x6c89,0x5634,0x5eab); U8(0x6094,0x9670,0x5bc2,0x501a,0x63a2,0x7956,0x6469,0x8ddd); U8(0x5269,0x904d,0x87f2,0x71d2,0x9801,0x9686,0x5f31,0x8c46); U8(0x5ef6,0x856d,0x63e1,0x5be7,0x662d,0x76df,0x8986,0x61b2); U8(0x79d8,0x8010,0x6101,0x4e01,0x64a5,0x6c88,0x6607,0x9aee); U8(0x54c0,0x9f9c,0x7709,0x56fa,0x5377,0x984f,0x9592,0x7a0d); U8(0x6b98,0x642d,0x4eac,0x50bb,0x8033,0x6620,0x8a17,0x6069); U8(0x9d3b,0x5999,0x8faf,0x5a92,0x5435,0x64ec,0x9055,0x6182); U8(0x684c,0x62fc,0x92b7,0x85c9,0x80d6,0x5c4b,0x57df,0x8239); U8(0x52de,0x6d1e,0x5be2,0x4e95,0x5f90,0x5ddd,0x5948,0x6236); U8(0x4e58,0x984d,0x500d,0x8fb2,0x64fa,0x585e,0x6a6b,0x8cc0); U8(0x7687,0x62d6,0x6028,0x7259,0x9f4a,0x9003,0x969c,0x8cfa); U8(0x6efe,0x5e72,0x8b5c,0x70cf,0x77ad,0x6234,0x642c,0x9b06); U8(0x74dc,0x8feb,0x5687,0x5976,0x68cb,0x9b27,0x76db,0x6478); U8(0x5c3a,0x4f19,0x67d3,0x4f69,0x5e8a,0x9177,0x5b6b,0x7260); U8(0x5c41,0x9189,0x9707,0x9396,0x51f1,0x640d,0x6f6e,0x5f79); U8(0x6cf0,0x4f0d,0x51ac,0x6843,0x8070,0x4e59,0x8ff4,0x7af6); U8(0x5d07,0x649e,0x53b2,0x6de8,0x727d,0x7f8a,0x60d1,0x6eaa); U8(0x5e3d,0x7720,0x5df7,0x6e9d,0x4fd7,0x5b8b,0x98c4,0x6dfa); U8(0x737b,0x9eb5,0x8266,0x6ed1,0x699c,0x5708,0x5ead,0x7e31); U8(0x9583,0x84ee,0x9ea5,0x90a6,0x8000,0x65cb,0x95b1,0x8ad2); U8(0x7d2b,0x651d,0x5100,0x59a8,0x5be9,0x63cf,0x6d3d,0x6bc0); U8(0x9010,0x966a,0x586b,0x8e0f,0x8dcc,0x85e4,0x6155,0x52d2); U8(0x95c6,0x6b8a,0x59ca,0x6191,0x5bde,0x5cb8,0x82ac,0x93ae); U8(0x6c60,0x62d2,0x845b,0x5dde,0x92d2,0x5bec,0x62f3,0x4e8e); U8(0x51fd,0x9a45,0x540e,0x88c1,0x87f9,0x963b,0x90aa,0x9769); U8(0x9817,0x81e5,0x660c,0x5b5f,0x822a,0x9059,0x7fd4,0x880d); U8(0x606d,0x5967,0x9f4b,0x775b,0x69cd,0x5854,0x9589,0x9813); U8(0x7cca,0x4ef0,0x65d7,0x52f5,0x539a,0x9b42,0x51c6,0x4e38); U8(0x5b99,0x9038,0x88d5,0x8212,0x8a2a,0x517c,0x85a9,0x98f2); U8(0x8d64,0x95a3,0x5eb8,0x556a,0x6436,0x52ff,0x7802,0x7dd2); U8(0x4e0c,0x7a79,0x5118,0x6383,0x6148,0x6b04,0x5617,0x52f8); U8(0x9bae,0x596e,0x8cdc,0x83e9,0x558a,0x95e1,0x5e7d,0x64f4); U8(0x6c61,0x5f70,0x6beb,0x7965,0x76dc,0x8a3c,0x7f70,0x756a); U8(0x5c16,0x7070,0x72d0,0x8861,0x64ce,0x6c6a,0x76e3,0x7c4c); U8(0x9072,0x8a87,0x90b1,0x7a05,0x6d69,0x62c6,0x7d1b,0x543e); U8(0x6258,0x59d1,0x54b1,0x6f0f,0x8a95,0x6dd1,0x9b31,0x7832); U8(0x67cf,0x7246,0x7aa9,0x50be,0x8fad,0x79d2,0x7fbd,0x6e21); U8(0x6c96,0x5ed6,0x8f1b,0x9f61,0x57f9,0x9080,0x7518,0x888b); U8(0x675c,0x9178,0x80f8,0x908f,0x8c48,0x6dbc,0x80a5,0x8c37); U8(0x7378,0x5a03,0x5b54,0x9677,0x4f30,0x59bb,0x5f91,0x73ab); U8(0x6f5b,0x6676,0x80de,0x715e,0x8106,0x62b9,0x7336,0x7bc9); U8(0x5442,0x788e,0x5674,0x622a,0x65e6,0x5a66,0x5783,0x8352); U8(0x5e63,0x573e,0x4e32,0x58ef,0x4f38,0x50b2,0x5f26,0x6ce5); U8(0x7e41,0x9b6f,0x5fcc,0x5104,0x53ed,0x5766,0x6065,0x871c); U8(0x6b3a,0x6368,0x6021,0x73b2,0x81ed,0x5510,0x6e6f,0x7aae); U8(0x6089,0x52c1,0x62d4,0x751c,0x8499,0x901b,0x6735,0x6458); U8(0x81bd,0x7cdf,0x817f,0x5587,0x919c,0x8e8d,0x7e73,0x6cc9); U8(0x6fa4,0x78a7,0x978b,0x97d3,0x5f65,0x6668,0x8389,0x99d5); U8(0x731b,0x7de9,0x5761,0x70b8,0x6012,0x53ec,0x6355,0x5e7c); U8(0x827e,0x6046,0x5272,0x626f,0x76fe,0x4e48,0x5410,0x660f); U8(0x8fa8,0x8ed2,0x5deb,0x4e39,0x9451,0x5f4e,0x8ce6,0x576a); U8(0x4fb5,0x80ce,0x60a0,0x85aa,0x522a,0x5f6c,0x609f,0x73e0); U8(0x52e4,0x7470,0x690d,0x75f4,0x7919,0x655d,0x76c3,0x76fc); U8(0x4ea8,0x901d,0x526a,0x78e8,0x903c,0x6fc3,0x6d1b,0x4e4f); U8(0x64cb,0x59c6,0x75c7,0x5acc,0x8eb2,0x58c1,0x91e3,0x6696); U8(0x61be,0x905c,0x6109,0x66f0,0x91c7,0x6377,0x5606,0x7384); U8(0x6ef4,0x6263,0x62b5,0x5954,0x7fc1,0x8e5f,0x83f2,0x722c); U8(0x68da,0x5915,0x60f1,0x6212,0x7272,0x6cf3,0x6070,0x6ec4); U8(0x6851,0x8523,0x6f54,0x7126,0x52c9,0x596a,0x840a,0x5496); U8(0x8caa,0x8e22,0x6284,0x8de1,0x937e,0x6daf,0x7bad,0x8a93); U8(0x7a4c,0x6d89,0x7a3f,0x7f72,0x7642,0x790e,0x50d1,0x6db5); U8(0x9727,0x7c89,0x4ff1,0x5fb9,0x8b00,0x5c60,0x72a7,0x5537); U8(0x7e5e,0x58fd,0x723a,0x9a37,0x6c0f,0x7ffc,0x4e91,0x809a); U8(0x541d,0x6881,0x54aa,0x7e6a,0x5ec9,0x98fd,0x9cf4,0x8c8c); U8(0x9739,0x70ae,0x6350,0x5561,0x708e,0x55e8,0x8ca2,0x66fc); U8(0x58e2,0x9f3b,0x5949,0x9234,0x8017,0x6062,0x878d,0x7b4b); U8(0x64bf,0x5112,0x98fe,0x798d,0x4faf,0x552c,0x6316,0x5e45); U8(0x77a7,0x5ef7,0x54a6,0x6490,0x67f3,0x8584,0x9ece,0x6372); U8(0x5d50,0x540a,0x9742,0x64e0,0x89bd,0x8fea,0x51cc,0x832b); U8(0x71c3,0x746a,0x5f18,0x77db,0x6566,0x745c,0x9326,0x6953); U8(0x8cd3,0x68af,0x742a,0x7344,0x5074,0x54ac,0x58a8,0x60b6); U8(0x6674,0x6016,0x55ac,0x9f52,0x5dba,0x4ef2,0x97fb,0x65ed); U8(0x7d9c,0x5154,0x8e64,0x59ff,0x9081,0x50a2,0x8b7d,0x8932); U8(0x5b5d,0x4fc3,0x99db,0x4ec7,0x7c97,0x59ae,0x7267,0x821f); U8(0x7cd5,0x6bbc,0x8776,0x5147,0x934b,0x61fc,0x92b3,0x541f); U8(0x8667,0x679d,0x800d,0x95ca,0x75bc,0x62fe,0x6bc5,0x5237); U8(0x5857,0x593e,0x88c2,0x9640,0x54c9,0x9130,0x6085,0x5951); U8(0x8f5f,0x7235,0x5c4f,0x99d0,0x8ced,0x8702,0x6247,0x7f9e); U8(0x74e6,0x758f,0x6b3d,0x6597,0x5764,0x582a,0x6d29,0x9006); U8(0x62ac,0x54ce,0x9905,0x5f4c,0x8cab,0x7db1,0x60f9,0x7ff9); U8(0x58ae,0x72e0,0x9811,0x838e,0x64e6,0x8d08,0x51dd,0x6ecb); U8(0x9756,0x5582,0x5141,0x5f6d,0x633a,0x62cb,0x611a,0x5f81); U8(0x8fb1,0x6717,0x4f59,0x6170,0x79aa,0x86d9,0x79e6,0x707d); U8(0x5075,0x8b9a,0x6905,0x5925,0x7897,0x4e19,0x5352,0x8377); U8(0x62ab,0x8da8,0x52ab,0x8fc5,0x7b1b,0x70e4,0x80a9,0x77e3); U8(0x9077,0x6c5d,0x9e7f,0x7f38,0x617e,0x7210,0x9b45,0x4e08); U8(0x5de1,0x63aa,0x79e4,0x6cbf,0x75d5,0x6feb,0x6f32,0x8170); U8(0x8ce4,0x59a5,0x6db2,0x9d5d,0x8ecc,0x53d4,0x9675,0x8c9e); U8(0x626d,0x6f20,0x7955,0x8766,0x6349,0x8b20,0x6c1b,0x81a0); U8(0x905e,0x9716,0x96ef,0x9ad2,0x5006,0x65fa,0x6749,0x71d5); U8(0x5bf8,0x8302,0x5faa,0x5c4d,0x947d,0x6643,0x6084,0x7fc5); U8(0x8292,0x659c,0x5ec1,0x5466,0x6e9c,0x9db4,0x55aa,0x76d2); U8(0x6454,0x6127,0x7051,0x6d25,0x76c8,0x8108,0x82d7,0x5821); U8(0x5211,0x53c9,0x6492,0x866b,0x7e8f,0x9075,0x81e3,0x6afb); // SC : most frequent 2048 SC minus TC ( 768 code points ) U8(0x8cc7,0x9019,0x500b,0x6703,0x70ba,0x4f86,0x5b78,0x6642); U8(0x8aaa,0x6c92,0x554f,0x904e,0x8acb,0x5011,0x9ebc,0x9084); U8(0x96fb,0x5c0d,0x6a5f,0x8a0a,0x570b,0x767c,0x7121,0x7576); U8(0x55ce,0x73fe,0x9ede,0x984c,0x6a23,0x7d93,0x8b1d,0x958b); U8(0x8a71,0x8207,0x5be6,0x611b,0x83ef,0x52d5,0x61c9,0x7a2e); U8(0x8eca,0x9577,0x95dc,0x9593,0x9cf3,0x89ba,0x8a72,0x9032); U8(0x5c07,0x5169,0x8ad6,0x9f8d,0x5225,0x9ad4,0x7d66,0x807d); U8(0x898b,0x88e1,0x6771,0x98a8,0x7063,0x539f,0x7db2,0x5340); U8(0x932f,0x6a02,0x66f8,0x8b93,0x9078,0x8f03,0x6578,0x5167); U8(0x5834,0x5f9e,0x6b61,0x8a8d,0x5e7e,0x96e3,0x982d,0x8a31); U8(0x8cb7,0x7d71,0x8a18,0x865f,0x8655,0x8a08,0x5e2b,0x4e26); U8(0x5f35,0x8ab0,0x7d50,0x9023,0x8a2d,0x9673,0x5831,0x8f49); U8(0x9ee8,0x8b8a,0x6c23,0x8a66,0x55ae,0x7fa9,0x6230,0x968a); U8(0x537b,0x696d,0x5beb,0x8072,0x81fa,0x54e1,0x8a0e,0x5247); U8(0x6a94,0x59b3,0x8208,0x5f37,0x8b70,0x8fa6,0x50f9,0x7e3d); U8(0x50b3,0x6c7a,0x7d44,0x842c,0x7d1a,0x6a19,0x7368,0x8c93); U8(0x9580,0x8a9e,0x7dda,0x99ac,0x9ec3,0x53c3,0x8996,0x89c0); U8(0x806f,0x8166,0x5152,0x9322,0x9304,0x5c08,0x63db,0x9060); U8(0x5099,0x5e6b,0x78ba,0x88dd,0x985e,0x8b1b,0x756b,0x8a34); U8(0x5e36,0x5716,0x8cbb,0x8cfd,0x8b58,0x908a,0x96d6,0x904b); U8(0x52d9,0x6b0a,0x9a57,0x8b80,0x98db,0x6eff,0x5922,0x8ab2); U8(0x8ecd,0x7d04,0x9858,0x9054,0x5920,0x8ac7,0x8853,0x8ce3); U8(0x52dd,0x6e96,0x6cc1,0x96e2,0x5c0e,0x5718,0x8abf,0x8b49); U8(0x50b7,0x525b,0x6bba,0x8edf,0x689d,0x7d55,0x7a31,0x64da); U8(0x76e1,0x805e,0x5712,0x7522,0x96d9,0x7d05,0x57f7,0x96f2); U8(0x904a,0x9808,0x8cea,0x8077,0x5fa9,0x8f38,0x7bc0,0x898f); U8(0x7562,0x71b1,0x9928,0x614b,0x89aa,0x4e82,0x5bf6,0x64ca); U8(0x8209,0x7d42,0x65b7,0x8f15,0x74b0,0x7c21,0x97ff,0x96a8); U8(0x7df4,0x7e8c,0x9b5a,0x6975,0x967d,0x5e79,0x7fd2,0x7f85); U8(0x62c9,0x6a13,0x5ee3,0x7169,0x9ce5,0x986f,0x78bc,0x8056); U8(0x58de,0x9810,0x773e,0x8cac,0x722d,0x8aa4,0x9806,0x8cb4); U8(0x8ca0,0x58d3,0x9069,0x4fc2,0x6e2c,0x61f7,0x8fce,0x55da); U8(0x8a55,0x7d30,0x91ab,0x5c6c,0x6200,0x6557,0x694a,0x9748); U8(0x969b,0x9a0e,0x9805,0x6232,0x71df,0x6b77,0x723e,0x7de8); U8(0x8b77,0x88dc,0x64c7,0x6b72,0x96bb,0x9818,0x5c0b,0x6eab); U8(0x990a,0x8b02,0x7570,0x512a,0x721b,0x60e1,0x72c0,0x7dad); U8(0x9678,0x5433,0x8a5e,0x5287,0x975c,0x4e9e,0x6c5f,0x9663); U8(0x56b4,0x58d8,0x52e2,0x862d,0x7de3,0x9031,0x5ee0,0x7c3d); U8(0x7ffb,0x883b,0x7d61,0x8f2f,0x5275,0x8d99,0x820a,0x96dc); U8(0x9e97,0x79ae,0x8173,0x528d,0x616e,0x7d39,0x5289,0x61b6); U8(0x8af8,0x7f75,0x7d14,0x85dd,0x8aa0,0x7e7c,0x7d00,0x91cb); U8(0x88fd,0x9375,0x8d95,0x7b46,0x8a3b,0x6a39,0x9435,0x69ae); U8(0x6b78,0x8449,0x5f48,0x885b,0x9298,0x584a,0x6f22,0x8cde); U8(0x8f09,0x96aa,0x9418,0x69cb,0x56c9,0x87a2,0x5049,0x85a6); U8(0x555f,0x71c8,0x6aa2,0x5abd,0x8c6c,0x6fdf,0x8a13,0x85cd); U8(0x5283,0x64d4,0x7d19,0x8cbc,0x7f77,0x8a73,0x5354,0x9867); U8(0x81c9,0x5cf6,0x734e,0x6575,0x9109,0x518a,0x8f2a,0x885d); U8(0x9df9,0x68c4,0x7e23,0x7dca,0x8cfc,0x50c5,0x5e33,0x5c64); U8(0x93e1,0x8d0a,0x9846,0x6de1,0x8a69,0x66c9,0x64c1,0x6176); U8(0x7e3e,0x7345,0x570d,0x9918,0x9b25,0x7bc4,0x8f1d,0x8b6f); U8(0x5ef3,0x84cb,0x63ee,0x7a4d,0x9a5a,0x78a9,0x7c43,0x4e1f); U8(0x76e4,0x5bae,0x5e25,0x8ca8,0x865b,0x907a,0x639b,0x6ec5); U8(0x8a8c,0x5713,0x9280,0x98ef,0x9802,0x4fe0,0x96de,0x8afe); U8(0x8f14,0x81e8,0x5538,0x6190,0x8607,0x6e1b,0x760b,0x63a1); U8(0x5c46,0x8f29,0x85e5,0x9a19,0x7da0,0x7e2e,0x7372,0x6f38); U8(0x92fc,0x5ee2,0x6dda,0x6a1e,0x8d0f,0x6b50,0x8ca1,0x8a02); U8(0x89f8,0x8ce2,0x5091,0x812b,0x4f48,0x84bc,0x53ad,0x5805); U8(0x61f6,0x6a4b,0x7a69,0x8a62,0x68ee,0x7d72,0x5875,0x6163); U8(0x6fe4,0x8abc,0x4f54,0x8cf4,0x968e,0x8c50,0x8c9d,0x90f5); U8(0x96b1,0x6158,0x7d0d,0x912d,0x7159,0x502b,0x6416,0x7e54); U8(0x676f,0x8907,0x838a,0x64fe,0x63da,0x7661,0x983b,0x55b5); U8(0x91dd,0x66ab,0x5eab,0x9670,0x87f2,0x71d2,0x9801,0x856d); U8(0x5be7,0x61b2,0x64a5,0x6607,0x9aee,0x9f9c,0x984f,0x9592); U8(0x6b98,0x8a17,0x9d3b,0x8faf,0x64ec,0x9055,0x6182,0x92b7); U8(0x52de,0x5be2,0x6236,0x984d,0x8fb2,0x64fa,0x6a6b,0x8cc0); U8(0x9f4a,0x8cfa,0x6efe,0x8b5c,0x70cf,0x77ad,0x9b06,0x5687); U8(0x9b27,0x5b6b,0x7260,0x9396,0x51f1,0x640d,0x8070,0x8ff4); U8(0x7af6,0x53b2,0x6de8,0x727d,0x6e9d,0x98c4,0x6dfa,0x737b); U8(0x9eb5,0x8266,0x7e31,0x9583,0x84ee,0x9ea5,0x95b1,0x8ad2); U8(0x651d,0x5100,0x5be9,0x6bc0,0x95c6,0x6191,0x93ae,0x92d2); U8(0x5bec,0x62f3,0x9a45,0x9817,0x81e5,0x9059,0x880d,0x5967); U8(0x9f4b,0x69cd,0x9589,0x9813,0x52f5,0x8a2a,0x85a9,0x98f2); U8(0x95a3,0x6436,0x7dd2,0x4e0c,0x5118,0x6383,0x6b04,0x5617); U8(0x52f8,0x9bae,0x596e,0x8cdc,0x95e1,0x64f4,0x76dc,0x8a3c); U8(0x7f70,0x76e3,0x7c4c,0x9072,0x8a87,0x7a05,0x7d1b,0x8a95); U8(0x9b31,0x7832,0x7246,0x7aa9,0x50be,0x8fad,0x6c96,0x8f1b); U8(0x9f61,0x908f,0x8c48,0x6dbc,0x7378,0x5f91,0x6f5b,0x7336); U8(0x7bc9,0x5442,0x5674,0x5a66,0x5e63,0x58ef,0x9b6f,0x5104); U8(0x6065,0x6368,0x6e6f,0x7aae,0x52c1,0x81bd,0x919c,0x8e8d); U8(0x7e73,0x6fa4,0x97d3,0x5f65,0x99d5,0x7de9,0x6012,0x6046); U8(0x8ed2,0x9451,0x5f4e,0x8ce6,0x522a,0x7919,0x76c3,0x6fc3); U8(0x64cb,0x91e3,0x905c,0x5606,0x8e5f,0x60f1,0x6ec4,0x8523); U8(0x6f54,0x596a,0x840a,0x8caa,0x6284,0x8de1,0x937e,0x7a4c); U8(0x7642,0x790e,0x50d1,0x6db5,0x9727,0x5fb9,0x8b00,0x72a7); U8(0x7e5e,0x58fd,0x723a,0x9a37,0x7e6a,0x98fd,0x9cf4,0x8ca2); U8(0x58e2,0x9234,0x64bf,0x98fe,0x798d,0x6490,0x6372,0x5d50); U8(0x9742,0x64e0,0x89bd,0x746a,0x9326,0x6953,0x8cd3,0x7344); U8(0x5074,0x60b6,0x55ac,0x9f52,0x5dba,0x97fb,0x7d9c,0x8e64); U8(0x9081,0x50a2,0x8b7d,0x8932,0x99db,0x6bbc,0x5147,0x934b); U8(0x61fc,0x92b3,0x8667,0x95ca,0x5857,0x593e,0x9130,0x6085); U8(0x8f5f,0x99d0,0x8ced,0x6b3d,0x6d29,0x9905,0x5f4c,0x8cab); U8(0x7db1,0x7ff9,0x58ae,0x9811,0x8d08,0x62cb,0x79aa,0x707d); U8(0x5075,0x8b9a,0x8da8,0x9077,0x617e,0x7210,0x6feb,0x6f32); U8(0x8ce4,0x9d5d,0x8ecc,0x8c9e,0x7955,0x8766,0x8b20,0x81a0); U8(0x905e,0x9ad2,0x5006,0x5faa,0x5c4d,0x947d,0x5ec1,0x9db4); U8(0x55aa,0x7051,0x8108,0x7e8f,0x6afb,0x6e3e,0x5641,0x5132); return s; } u_code_point restore_order_unihan(u_code_point z) { return reorder_unihan(z); } #define MAPCHAR(x,A,B,bytes) if(A<=x && x< (A+bytes)) \ return(x+(B-A)); if(B<=x && x< (B+bytes)) return(x+(A-B)) #define MAP16BL(x,A,B,block) if(A<=x && x< (A+(block<<4))) \ return(x+(B-A)); if(B<=x && x< (B+(block<<4))) \ return(x+(A-B)) u_code_point reorder_latins(u_code_point s) { MAP16BL(s,0x0100,0x0000,3); // Latin Extension A MAP16BL(s,0x0130,0x0080,2); MAP16BL(s,0x0150,0x00A0,1); MAP16BL(s,0x0300,0x00B0,3); // Combining Diacritical Marks MAPCHAR(s,0x0070,0x0060,1); // p,` MAPCHAR(s,0x0072,0x006A,1); // r,j MAPCHAR(s,0x0073,0x006B,1); // s,k MAPCHAR(s,0x0074,0x0066,1); // t,f MAPCHAR(s,0x0075,0x0067,1); // u,g MAPCHAR(s,0x0050,0x0040,1); // P,@ UPPER MAPCHAR(s,0x0052,0x004A,1); // R,J UPPER MAPCHAR(s,0x0053,0x004B,1); // S,K UPPER MAPCHAR(s,0x0054,0x0046,1); // T,F UPPER MAPCHAR(s,0x0055,0x0047,1); // U,G UPPER MAPCHAR(s,0x0160,0x003A,6); // Latin Extension A MAPCHAR(s,0x0166,0x005B,5); MAPCHAR(s,0x016B,0x007B,5); return s; } u_code_point restore_order_latins(u_code_point z) { return reorder_latins(z); } u_code_point reorder(u_code_point s) { if(isHANGUL(s)) return reorder_hangul(s); if(isUNIHAN(s)) return reorder_unihan(s); if(isLatins(s)) return reorder_latins(s); return s; } u_code_point restore_order(u_code_point s) { if(isHANGUL(s)) return restore_order_hangul(s); if(isUNIHAN(s)) return restore_order_unihan(s); if(isLatins(s)) return restore_order_latins(s); return s; } /* Encoder: */ enum dude_status dude_encode( unsigned int input_length, const u_code_point input[], const unsigned char uppercase_flags[], unsigned int *output_size, char output[] ) { unsigned int max_out, in, out, k, j; u_code_point prev, codept, diff, tmp; char shift; prev = 0x60; max_out = *output_size; for (in = out = 0; in < input_length; ++in) { /* At the start of each iteration, in and out are the number of */ /* items already input/output, or equivalently, the indices of */ /* the next items to be input/output. */ codept = input[in]; if (codept == 0x2D) { /* Hyphen-minus stands for itself. */ if (max_out - out < 1) return dude_big_output; output[out++] = 0x2D; continue; } codept = reorder(codept); // by LSB diff = prev^codept; /* Compute the number of base-32 characters (k): */ for (tmp = diff >> 4, k = 1; tmp != 0; ++k, tmp >>= 4); //fprintf(stderr,"diff %x,%x = prev %x ^ codept %x \n", // k,diff,prev,codept); if (max_out - out < k) return dude_big_output; shift = uppercase_flags && uppercase_flags[in] ? 32 : 0; /* shift controls the case of the last base-32 digit. */ /* Each quintet has the form 1xxxx except the last is 0xxxx. */ /* Computing the base-32 digits in reverse order is easiest. */ out += k; output[out - 1] = base32[diff & 0xF] - shift; for (j = 2; j <= k; ++j) { diff >>= 4; output[out - j] = base32[0x10 | (diff & 0xF)]; } prev = codept; } /* Append the null terminator: */ if (max_out - out < 1) return dude_big_output; output[out++] = 0; *output_size = out; return dude_success; } /* Decoder: */ enum dude_status dude_decode( enum case_sensitivity case_sensitivity, char scratch_space[], const char input[], unsigned int *output_length, u_code_point output[], unsigned char uppercase_flags[] ) { u_code_point prev, q, diff; char c; unsigned int max_out, in, out, scratch_size; enum dude_status status; prev = 0x60; max_out = *output_length; for (c = input[in = 0], out = 0; c != 0; c = input[++in], ++out) { /* At the start of each iteration, in and out are the number of */ /* items already input/output, or equivalently, the indices of */ /* the next items to be input/output. */ if (max_out - out < 1) return dude_big_output; if (c == 0x2D) output[out] = c; /* hyphen-minus is literal */ else { /* Base-32 sequence. Decode quintets until 0xxxx is found: */ for (diff = 0; ; c = input[++in]) { q = base32_decode(c); if (q == base32_invalid){ return dude_bad_input; }; diff = (diff << 4) | (q & 0xF); if (q >> 4 == 0) break; } // prev = output[out] = prev ^ diff; prev = prev ^ diff; output[out] = restore_order(prev); // LSB } /* Case of last character determines uppercase flag: */ if (uppercase_flags) uppercase_flags[out] = c >= 65 && c <= 90; } /* Enforce the uniqueness of the encoding by re-encoding */ /* the output and comparing the result to the input: */ scratch_size = ++in; status = dude_encode(out, output, uppercase_flags, &scratch_size, scratch_space); if (status != dude_success || scratch_size != in || unequal(case_sensitivity, scratch_space, input) ) return dude_bad_input; *output_length = out; return dude_success; } /******************************************************************/ /* Wrapper for testing (would normally go in a separate .c file): */ #include #include #include #include /* For testing, we'll just set some compile-time limits rather than */ /* use malloc(), and set a compile-time option rather than using a */ /* command-line option. */ enum { unicode_max_length = 256, ace_max_size = 256, test_case_sensitivity = case_insensitive /* suitable for host names */ }; static void usage(char **argv) { fprintf(stderr, "%s -e reads code points and writes a DUDE string.\n" "%s -d reads a DUDE string and writes code points.\n" "Input and output are plain text in the native character set.\n" "Code points are in the form u+hex separated by whitespace.\n" "A DUDE string is a newline-terminated sequence of LDH characters\n" "(without any signature).\n" "The case of the u in u+hex is the force-to-uppercase flag.\n" , argv[0], argv[0]); exit(EXIT_FAILURE); } static void fail(const char *msg) { fputs(msg,stderr); exit(EXIT_FAILURE); } static const char too_big[] = "input or output is too large, recompile with larger limits\n"; static const char invalid_input[] = "invalid input\n"; static const char io_error[] = "I/O error\n"; /* The following string is used to convert LDH */ /* characters between ASCII and the native charset: */ static const char ldh_ascii[] = "................" "................" ".............-.." "0123456789......" ".ABCDEFGHIJKLMNO" "PQRSTUVWXYZ....." ".abcdefghijklmno" "pqrstuvwxyz"; int main(int argc, char **argv) { enum dude_status status; int r; char *p; if (argc != 2) usage(argv); if (argv[1][0] != '-') usage(argv); if (argv[1][2] != 0) usage(argv); if (argv[1][1] == 'e') { u_code_point input[unicode_max_length]; unsigned long codept; unsigned char uppercase_flags[unicode_max_length]; char output[ace_max_size], uplus[3]; unsigned int input_length, output_size, i; /* Read the input code points: */ input_length = 0; for (;;) { r = scanf("%2s%lx", uplus, &codept); if (ferror(stdin)) fail(io_error); if (r == EOF || r == 0) break; if (r != 2 || uplus[1] != '+' || codept > (u_code_point)-1) { fail(invalid_input); } if (input_length == unicode_max_length) fail(too_big); if (uplus[0] == 'u') uppercase_flags[input_length] = 0; else if (uplus[0] == 'U') uppercase_flags[input_length] = 1; else fail(invalid_input); input[input_length++] = codept; } /* Encode: */ output_size = ace_max_size; status = dude_encode(input_length, input, uppercase_flags, &output_size, output); if (status == dude_bad_input) fail(invalid_input); if (status == dude_big_output) fail(too_big); assert(status == dude_success); /* Convert to native charset and output: */ for (p = output; *p != 0; ++p) { i = *p; assert(i <= 122 && ldh_ascii[i] != '.'); *p = ldh_ascii[i]; } r = puts(output); if (r == EOF) fail(io_error); return EXIT_SUCCESS; } if (argv[1][1] == 'd') { char input[ace_max_size], scratch[ace_max_size], *pp; u_code_point output[unicode_max_length]; unsigned char uppercase_flags[unicode_max_length]; unsigned int input_length, output_length, i; /* Read the DUDE input string and convert to ASCII: */ fgets(input, ace_max_size, stdin); if (ferror(stdin)) fail(io_error); if (feof(stdin)) fail(invalid_input); input_length = strlen(input); if (input[input_length - 1] != '\n') fail(too_big); input[--input_length] = 0; for (p = input; *p != 0; ++p) { pp = strchr(ldh_ascii, *p); if (pp == 0) fail(invalid_input); *p = pp - ldh_ascii; } /* Decode: */ output_length = unicode_max_length; status = dude_decode(test_case_sensitivity, scratch, input, &output_length, output, uppercase_flags); if (status == dude_bad_input) fail(invalid_input); if (status == dude_big_output) fail(too_big); assert(status == dude_success); /* Output the result: */ for (i = 0; i < output_length; ++i) { r = printf("%s+%04lX\n", uppercase_flags[i] ? "U" : "u", (unsigned long) output[i] ); if (r < 0) fail(io_error); } return EXIT_SUCCESS; } usage(argv); return EXIT_SUCCESS; /* not reached, but quiets compiler warning */ }