Internet Draft                                             James SENG
draft-ietf-idn-cjk-01.txt                              Yoshiro YONEYA
11th Apr 2001                                             Kenny HUANG
Expires 11 Oct 2001                                      KIM Kyongsok

        Han Ideograph (CJK) for Internationalized Domain Names

Status of this Memo

This document is an Internet-Draft and is in full conformance
with all provisions of Section 10 of RFC2026.

Internet-Drafts are working documents of the Internet
Engineering Task Force (IETF), its areas, and its working
groups. Note that other groups may also distribute working
documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of
six months and may be updated, replaced, or obsoleted by other
documents at any time. It is inappropriate to use Internet-
Drafts as reference material or to cite them other than as
"work in progress."

The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.

Abstract

During the development of Internationalized Domain Names (IDN), it
was discovered that there is a substantial lack of information on,
and misunderstanding of, Han ideographs and their folding mechanism.

This document attempts to address some of the issues of Han folding
with respect to IDN. It aims to dispel some common misunderstandings
of the problem and to discuss some of the issues with Han ideographs
and their folding mechanism.

This document addresses a problem very specific to IDN and is thus
not meant as a reference for generic Han folding. Generic Han folding
is much more complicated and is beyond the scope of this document.
However, this document may be applicable to other areas that are
related to names, e.g. the Common Name Resolution Protocol [CNRP].

1. Definition and convention

Characters mentioned in this document are identified by their position
or code point in the Unicode character set [UCS]. The notation U+12AB,
for example, indicates the character at position 12AB (hexadecimal)
in [UCS]. It is strongly recommended that a [UCS] table be available
for reference for the ideographs described.

Han ideographs are defined as the Chinese ideographs from U+3400 to
U+9FFF, commonly known as CJK Unified Ideographs. This covers Chinese
'hanzi' {U+6F22 U+5B57/U+6C49 U+5B57}, Japanese 'kanji'
{U+6F22 U+5B57} and Korean 'hanja' {U+6F22 U+5B57/U+D55C U+C790}.
Additional Han ideographs will appear in other locations (not
necessarily in plane 0) in the future.
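
The U+ notation and the Han ideograph range defined in this section
can be checked mechanically. The following Python sketch is
illustrative only: the helper names parse_code_points and
is_han_ideograph are hypothetical, and the range test uses nothing
more than the U+3400 to U+9FFF definition given in this section.

```python
import re

# Range used by this document's definition of Han ideographs.
HAN_FIRST = 0x3400
HAN_LAST = 0x9FFF


def parse_code_points(notation: str) -> str:
    """Turn a string such as 'U+6F22 U+5B57' into the characters it names."""
    hex_values = re.findall(r"U\+([0-9A-Fa-f]{4,6})", notation)
    return "".join(chr(int(value, 16)) for value in hex_values)


def is_han_ideograph(ch: str) -> bool:
    """True if the single character lies in U+3400..U+9FFF."""
    return HAN_FIRST <= ord(ch) <= HAN_LAST


hanzi_tc = parse_code_points("U+6F22 U+5B57")       # 'hanzi' in ideographs
hanja_hangeul = parse_code_points("U+D55C U+C790")  # 'hanja' in hangeul

print(all(is_han_ideograph(c) for c in hanzi_tc))       # True
print(any(is_han_ideograph(c) for c in hanja_hangeul))  # False
```

The hangeul spelling falls outside the range, so a check of this kind
separates Han ideographs from the other scripts discussed later.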

Conversion between ideographs can be done using four different
approaches: code-based substitution, character-based substitution,
lexicon-based substitution and context-based substitution. Han folding
refers only to code-based substitution, similar to case mapping of
alphabetic characters.

2. Introduction

Traditionally, domain names have been case insensitive (as defined in
[RFC1035] Section 2.3.3). While this is not a problem when domain names
are restricted to English letters and digits, it becomes a serious
problem for IDN. An important criterion for a robust IDN is to have
good normalization and canonicalization forms. This ensures that
domain name duplications are kept to a minimum.

Fortunately, the Unicode Consortium is developing technical reports on
canonicalization [UTR21] and normalization [UTR15]. Hence, it becomes
simple for IDN to build upon the work of Unicode and use these
references.

Unfortunately, both [UTR15] and [UTR21] are limited in scope and do not
address many other scripts. In particular, Han ideographs are not
discussed in detail in these documents, and most experts are quick to
point out that this problem is technically impossible to solve.

2.1 Han ideographs

While there are many forms or writing styles for Chinese characters,
the most commonly used, 'zhengti' {U+6B63 U+4F53/U+6B63 U+9AD4},
represents Chinese ideographs by radicals (U+2E80-U+2FDF) that are
composed of simple strokes.

When the Unicode Consortium started work on the Universal Character
Set, it was suggested that Hanzi, Kanji and Hanja ideographs should be
unified into a single code space. This resulted in the CJK Unification,
whereby 27,786 Han ideographs are allocated in the U+3400-U+9FFF and
U+F900-U+FAFF ranges. Another 41,000 Han ideographs will be added to
Plane 2.
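
To illustrate what normalization does and does not do for Han
ideographs, the following Python sketch uses the standard unicodedata
module. It assumes the normalization forms of [UTR15] as implemented
by Python, and the variable names are illustrative. A compatibility
ideograph from the U+F900-U+FAFF range is replaced by its unified
ideograph, but the two spellings of 'hanzi' given in Section 1 remain
distinct under every form.

```python
import unicodedata

# U+F900 is a compatibility ideograph; normalization replaces it with
# the unified ideograph U+8C48.
compat = "\uF900"
print(unicodedata.normalize("NFC", compat) == "\u8C48")  # True

# 'hanzi' in its two spellings from Section 1. Normalization leaves
# both unchanged, so they still compare as different strings.
hanzi_tc = "\u6F22\u5B57"
hanzi_sc = "\u6C49\u5B57"
for form in ("NFC", "NFD", "NFKC", "NFKD"):
    tc = unicodedata.normalize(form, hanzi_tc)
    sc = unicodedata.normalize(form, hanzi_sc)
    # Prints "<form> True True False" for each form.
    print(form, tc == hanzi_tc, sc == hanzi_sc, tc == sc)
```

In other words, normalization alone does not make such variant
spellings compare equal, which is the gap that Han folding would have
to fill.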

Ideographs are common in China, Korea and Japan, but as ideographs
spread and evolve, their form sometimes differs slightly from country
to country. For example, the word 'villa' is 'zhuang' {U+838A} in
Chinese but 'sou' {U+8358} in Japanese. These are given different code
points in Unicode.

3. Chinese (Hanzi)

Chinese ideographs or hanzi {U+6F22 U+5B57/U+6C49 U+5B57} originated
from pictographs. They are 'pictures' which evolved into ideographs
over several thousand years. For instance, the ideograph for "hill"
{U+5C71} still bears some resemblance to the 3 peaks of a hill.

Not all ideographs are pictographs. There are other classifications
such as compound ideographs, phonetic ideographs etc. For example,
'endurance' {U+5FCD} is a pierced 'knife' {U+5200} above the 'heart'
{U+5FC3}, or as a Chinese saying goes, 'endurance is like having a
pierced knife in your heart'.

Hence, almost every Han ideograph carries some meaning by itself,
which is very different from most other scripts. This causes some
confusion that Han folding is a form of lexicon-based substitution.

Chinese ideographs underwent a major change in the 1950s after the
establishment of the People's Republic of China. A committee on
Language Reform was established in China whose activities included
the simplification of Chinese ideographs. Simplified Chinese (SC) is
used in China and Singapore, and Traditional Chinese (TC) in Taiwan,
Hong Kong PRC, Macau PRC, and most other overseas Chinese communities.

The process is to take complex ideographs and simplify them. The main
purpose is to make them easier to remember and write and thus to raise
the literacy of the population.

For example, 'lightning' TC {U+96FB} becomes SC {U+7535} (the 'rain'
{U+96E8} part is dropped from the TC). In many cases, the SC bears no
resemblance to any of the original traditional forms, e.g. 'dragon' TC
{U+9F8D} SC {U+9F99}. Two different TC may also have the same SC, since
that means fewer ideographs to learn, e.g. SC {U+53D1} can be TC
{U+767C} or TC {U+9AEE} depending on semantics. The official
'Comprehensive List of Simplified Characters', most recently published
in 1986, lists 2244 SC [ZONGBIAO].

Therefore, the process of SC-to-TC conversion is very complicated. It
is not possible to do it accurately without considering the semantics
of the phrase.

On the other hand, TC-to-SC conversion is much simpler, although
different TCs may map to one single SC. While Unicode does not handle
TC and SC, the informal [UNIHAN] document lists 2145 TC and their
equivalent SC mappings. However, because that document is informal and
not part of the Unicode standard, it is incomplete and has mistakes in
the code points. Hence, precise tables for TC-to-SC conversion have not
been fully laid out.

In domain names, we are particularly interested in equivalence
comparison of names, not in converting SC-to-TC. For this purpose,
equivalence matching can be done by applying TC-to-SC folding prior to
comparison, similar to lower-casing English strings before comparing
them, e.g. 'taiwan' SC {U+53F0 U+6E7E} will match TC {U+81FA U+7063}
or TC {U+53F0 U+7063}.

The side effect of this method is that comparing SC {U+53D1} to TC
{U+767C} or TC {U+9AEE} will both be positive. This implies that
'hair' SC {U+5934 U+53D1} will match TC {U+982D U+9AEE}. It will also
match TC {U+982D U+767C}, which does not have any meaning in Chinese.

It should also be noted that SC are not used together with TC. Hence,
'hair' is either written as SC {U+5934 U+53D1} or TC {U+982D U+9AEE}
but (almost) never {U+5934 U+9AEE} or {U+982D U+53D1}. So the problem
of SC and TC may not be too serious for IDN.
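
The fold-then-compare approach described in this section, and its side
effect, can be sketched in Python as follows. This is a minimal sketch
under stated assumptions: the table TC_TO_SC and the function names
fold_tc_to_sc and names_match are hypothetical, and the table holds
only a handful of pairs for the words discussed here rather than a
complete or authoritative mapping.

```python
# Tiny illustrative TC-to-SC table. A real table would be far larger;
# this one is an assumption made for the example and is not taken
# from [UNIHAN].
TC_TO_SC = {
    "\u96FB": "\u7535",  # 'lightning'
    "\u9F8D": "\u9F99",  # 'dragon'
    "\u767C": "\u53D1",  # one TC source of SC U+53D1
    "\u9AEE": "\u53D1",  # the other TC source of SC U+53D1 ('hair')
    "\u982D": "\u5934",  # first ideograph of 'hair' as written here
    "\u81FA": "\u53F0",  # first ideograph of 'taiwan'
    "\u7063": "\u6E7E",  # second ideograph of 'taiwan'
}


def fold_tc_to_sc(name: str) -> str:
    """Replace each TC code point by its SC equivalent, if one is listed."""
    return "".join(TC_TO_SC.get(ch, ch) for ch in name)


def names_match(a: str, b: str) -> bool:
    """Compare two names after folding, like lower-casing before comparing."""
    return fold_tc_to_sc(a) == fold_tc_to_sc(b)


hair_sc = "\u5934\u53D1"
hair_tc = "\u982D\u9AEE"
no_meaning_tc = "\u982D\u767C"

print(names_match(hair_sc, hair_tc))                # True
print(names_match(hair_sc, no_meaning_tc))          # True (the side effect)
print(names_match("\u53F0\u6E7E", "\u81FA\u7063"))  # True
print(names_match(hair_sc, "\u9F8D"))               # False
```

Because folding is applied to both operands, mixed SC/TC spellings of
the same name also compare equal, at the cost of the false match shown
on the second line of output.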

Unfortunately, when it comes to names in Chinese, even in places where
SC is used (i.e. Singapore and China), traditional and simplified
ideographs are sometimes mixed within a single name for artistic
reasons. Some people even 'create' ideographs for their names.

[Need to add a section on Bopomofo U+3118 to U+312A in future draft]

4. Korean (Hanja and Hangeul)

Korea was one of the first cultures to import Chinese ideographs into
its language as a written form. These Korean ideographs are known as
'hanja' {U+6F22 U+5B57/U+D55C U+C790}, and they were widely used until
recently, when 'hangeul' {U+D55C U+AE00} became more popular.

Hangeul {U+D55C U+AE00} is a systematic script designed by a 15th
century ruler and linguistic expert, King Sejong {U+4E16 U+5B97}. It is
based on the pronunciation of the Korean language, hanmal. A Korean
syllable is composed of 'jamo' {U+5B57 U+6BCD/U+C790 U+BAA8} elements
that represent different sounds. Hence, unlike Han ideographs, a
hangeul syllable does not have any meaning by itself.

Each hanja ideograph can be represented by a hangeul syllable. For
example, 'samsung' is hanja {U+4E09 U+661F} and hangeul {U+C0BC U+C131}.
Note that {U+4E09} is pronounced as 'sa-ah-am', or in jamo {U+3145}
{U+314F} {U+3141}, which gives hangeul {U+C0BC}. While jamo
decompositions are described in [UTR15] as Form D decomposition, this
document also suggests another hangeul canonical decomposition in
Appendix A to accommodate both modern and old hangeul.
[Need to fill up Appendix A when information is more complete]

Most hanja characters have only one pronunciation. However, the
pronunciation of some hanja differs according to orthography (the same
is true for Chinese and Japanese) or the position in a word, which
makes this more complex. And of course, conversion of hangeul back to
hanja is impossible by code substitution without consideration of
semantics.
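
The Form D decomposition of the syllable {U+C0BC} mentioned in this
section can be reproduced with the following Python sketch. It shows
only the arithmetic decomposition defined by the Unicode standard and
checks it against the standard unicodedata module; it is not the
alternative decomposition intended for Appendix A, which is not yet
specified. The function name decompose_syllable is hypothetical. Note
that Form D yields conjoining jamo (U+1100-U+11FF), whereas the jamo
code points quoted in this section come from the Hangul Compatibility
Jamo block.

```python
import unicodedata

S_BASE, L_BASE, V_BASE, T_BASE = 0xAC00, 0x1100, 0x1161, 0x11A7
L_COUNT, V_COUNT, T_COUNT = 19, 21, 28


def decompose_syllable(syllable: str) -> list:
    """Arithmetic decomposition of one precomposed hangeul syllable."""
    index = ord(syllable) - S_BASE
    if not 0 <= index < L_COUNT * V_COUNT * T_COUNT:
        raise ValueError("not a precomposed hangeul syllable")
    lead = L_BASE + index // (V_COUNT * T_COUNT)
    vowel = V_BASE + (index % (V_COUNT * T_COUNT)) // T_COUNT
    tail = T_BASE + index % T_COUNT
    parts = [lead, vowel]
    if tail != T_BASE:  # T_BASE itself means "no final consonant"
        parts.append(tail)
    return [chr(cp) for cp in parts]


sam = "\uC0BC"
by_arithmetic = "".join(decompose_syllable(sam))
by_library = unicodedata.normalize("NFD", sam)

print([f"U+{ord(c):04X}" for c in by_library])  # ['U+1109', 'U+1161', 'U+11B7']
print(by_arithmetic == by_library)                      # True
print(unicodedata.normalize("NFC", by_library) == sam)  # True
```

Unlike Han folding, this decomposition is reversible: recomposing the
three jamo gives back the original syllable.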

Koreans also invented their own ideographs, which are called 'gugja'
{U+56FD U+5B57/U+AD6D U+C790}.

5. Japanese (Kanji, Hiragana, Katakana)

Japanese has adopted Chinese ideographs from the Koreans and the
Chinese since the 5th century. Chinese ideographs in Japanese are
known as 'kanji' {U+6F22 U+5B57}. The Japanese also developed their own
syllabaries, hiragana {U+5E73 U+4EEE U+540D} (U+3040-U+309F) and
katakana {U+7247 U+4EEE U+540D} (U+30A0-U+30FF), both of which are
derived from kanji that have the same pronunciation. Hiragana is a
simplified cursive form; for example, 'a' {U+3042} was derived from
'an' {U+5B89}. Katakana is a simplified partial form; for example, 'a'
{U+30A2} was derived from 'a' {U+963F}. However, kanji remain very
integrated within the Japanese language.

Japanese also invented ideographs known as 'kokuji' {U+56FD U+5B57}.
For example, 'iwashi' {U+9C2F} is a Japanese kokuji ideograph. Kokuji
are invented according to Han ligature rules. For example, 'touge'
"mountain pass" {U+5CE0} is a conjunction of the meanings of 'yama'
"mountain" {U+5C71} + 'ue' "up" {U+4E0A} + 'shita' "down" {U+4E0B}.

Japanese is also a vocal language, i.e. the script itself is based on
pronunciation. Each hiragana corresponds to one pronunciation, and 48
hiragana form the basis of the Japanese language, including the less
commonly used 'we' {U+3091}. Furthermore, hiragana has 35 more forms
to represent voiced sounds, P-sounds and double consonants. For
example, 'ga' {U+304C} is the voiced sound of 'ka' {U+304B}. Katakana
mirrors hiragana with a few more forms and is used to integrate foreign
words or phrases into Japanese, or to emphasize words or phrases even
in Japanese, or to represent onomatopoeia.
For example, 'hamburger', pronounced as 'han-baa-gaa' in Japanese, is
written as {U+30CF U+30F3 U+30D0 U+30FC U+30AC U+30FC} instead of
{U+306F U+3093 U+3070 U+3041 U+304C U+3041} because it is a foreign
word.

If Japanese used hiragana and katakana only, written Japanese would
obviously be very long. Hence, kanji are used when referring to nouns
or verbs. Each kanji corresponds to one or more hiragana characters.
For example, 'japan', pronounced as 'nippon' {U+306B U+3063 U+307D
U+3093}, is written as {U+65E5 U+672C} instead.

Hiragana, like Korean jamo, has no meaning by itself. Also, a kanji can
take on different pronunciations (which means different hiragana)
depending on where and how it is used in the sentence. For example,
'sky' {U+7A7A} can be pronounced as {U+305D U+3089} or {U+30BD U+30E9}.

Hence, a code substitution between hiragana and kanji is impractical.

On the other hand, there are kanji that have the same meaning and the
same pronunciation and are equivalent. For example, 'river' "kawa" can
be either {U+5DDD} or {U+6CB3}. The only difference between the two
ideographs is that they signify the 'size of the river' (the latter is
a bigger river).

Japanese also reduced complex Chinese ideographs to simplified forms.
For example, 'both' {U+5169} was simplified to {U+4E21}. Note that
Chinese simplified it to {U+4E24} instead. However, traditional
Japanese kanji are seldom used nowadays beyond documenting old
historical texts, where they are treated differently from the more
commonly used simplified forms, or expressing proper nouns such as
personal names or trademarks. Hence, Han folding here is not
recommended.

6. Vietnamese

While Vietnamese also adopted Chinese ideographs ('chu han') and
created their own ideographs ('chu nom'), these have now been replaced
by romanized 'quoc ngu'.
Hence, this document does not attempt to address any issues with
'chu han' or 'chu nom'.

7. zVariant

Unicode has a three-dimensional conceptual model for Ideograph
Unification. The three dimensions are semantic (X-axis - meaning,
function), abstract shape (Y-axis - general form) and actual shape
(Z-axis - instantiated, type-faced).

When two ideographs have similar etymology but are given two different
code points in Unicode, they are known as zVariant ideographs, i.e.
they differ only along the 'Z' axis. For example, 'villa' {U+838A} and
{U+8358}.

8. Ideographic Description

In Unicode v3.0, ideographic description characters (U+2FF0-U+2FFB)
were introduced, allowing a Han ideograph to be constructed from
radicals (U+2E80-U+2FD5) and Han ideographs (U+3400-U+9FFF).

The intention of this description method is to allow ideographs that
are not defined by Unicode to be described. Hence, such ideographs will
not necessarily be displayed properly. In addition, this method is
not deterministic: the same ideograph can be represented by different
sequences.

For example, 'zong' {U+9B03} (for the sake of discussion, we use an
ideograph which is already in Unicode) can be decomposed to U+2FF1
U+9ADF U+5B97 using descriptive code points and Unified Ideographs.
U+9ADF can also be decomposed as U+2FF0 U+2ED2 U+2F3A and U+5B97 as
U+2FF5 U+2F27 U+2F70. In addition, U+9ADF is equivalent to U+2FBD.
Hence, if we were to use descriptive code points and radicals only, we
can get U+2FF1 U+2FBD U+2FF5 U+2F27 U+2F70 or U+2FF1 U+2FF0 U+2ED2
U+2F3A U+2FF5 U+2F27 U+2F70.

In addition, certain radicals have been simplified and are thus, in
some contexts, equivalent. For example, the radical for 'bird' can be
either U+2EE6 or U+2FC3.
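
The non-determinism described in this section can be made concrete
with a short Python sketch. The three description sequences below are
derived from the decompositions of 'zong' {U+9B03} given in this
section (the variable names are illustrative, and the component
U+5B97 is deliberately left undecomposed to keep the example small).
The sketch also applies NFKC from the standard unicodedata module as
one possible, and only partial, way of reducing the variation; that
step is an assumption of this example, not a recommendation of this
document.

```python
import unicodedata

precomposed = "\u9B03"                    # 'zong' as a single code point
ids_a = "\u2FF1\u9ADF\u5B97"              # U+9ADF above U+5B97
ids_b = "\u2FF1\u2FBD\u5B97"              # same, using the radical U+2FBD
ids_c = "\u2FF1\u2FF0\u2ED2\u2F3A\u5B97"  # U+9ADF itself described

spellings = [precomposed, ids_a, ids_b, ids_c]

# An exact code point comparison sees four different names.
print(len(set(spellings)))  # 4

# Compatibility normalization maps the radical U+2FBD to U+9ADF, which
# merges ids_b with ids_a, but it does not merge the other spellings.
normalized = {unicodedata.normalize("NFKC", s) for s in spellings}
print(len(normalized))      # 3
```

Even after normalization, three distinct strings remain for what a
reader would regard as one ideograph, so exact comparison of such
sequences cannot be relied upon.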

Hence, until there is a deterministic, well-defined rule for
ideographic description, ideographs formed by this method are not
recommended for use in domain names.

It should be noted that the Unicode Consortium never intended the
ideographic description to be used in protocols like IDN where exact
comparison must be done. But this feature is certainly desirable, as
it is common for Chinese to invent ideographs for names by adding or
removing radicals from standard ideographs.

9. Mechanism

The implicit proposal in this document is that CJKV ideographs may or
may not be "folded" for the purposes of comparison of domain names.

But if folding is required, there are three different ways that this
folding could be done.

a) Folding by DNS clients, or by user agents
b) Folding by DNS servers
c) Folding by Domain Name registration services for the purposes of
   preventing confusing allocations of CJKV Domain Names which would,
   if transcoded, be the same

Before we can give much more reaction, we need to know which use is
planned.

The third use is important. It should be put in place. This problem can
alternatively be reduced by representing non-ASCII characters in
domain names or other URL characters using hex-escaped character
references in HTML pages.

To characterize Han characters as ideographs or pictograms is
inadequate, because most Han ideographs have both a phonetic and a
semantic element. Indeed, this is enough to characterize Chinese
writing as phonetic, though it is other things as well. Thus, it is
difficult to comment on whether folding is useful for Chinese or not.

The first use has the problem that lightweight devices do not have
enough room to fit a Unicode X-axis mapping table.

The second use has the problem that introducing mapping will limit the
performance of DNS servers.
Alphabetic case mapping can be performed using a single logical AND
instruction; CJKV character folding requires a lookup table.

In alphabetic scripts, there is also a requirement to fold Latin,
Greek, Cyrillic, Hebrew and Arabic together. There may be a stronger
requirement for CJKV characters.

Note also that because modern operating systems are Unicode based and
have network-downloadable IMEs, "interoperability" is becoming less
equivalent to "use BIG5 characters only" or "use GB2312 characters
only" or "use Shift-JIS characters only".

If conservative safety is really required, then
1) find the x-axis characters which are available in all major CJK
   character sets used on the Internet;
2) only allow variants of those in domain names;
3) when one variant is used, no other can be allocated. So comparisons
   are made on x-axis characters, but the licensee of that domain name
   can pick which y or z variants they wish to use.

Acknowledgement

The editor gratefully acknowledges the contributions of:

Paul Hoffman
Jiang Mingliang
Dongman Lee
Karlsson Kent

Author(s)

James SENG
i-DNS.net International Pte Ltd.
8 Temasek Boulevard
Suntec Tower 3 #24-02
Singapore 038988
Email: James@Seng.cc
Tel: +65 2468208

Yoshiro YONEYA
NTT Software Corporation
Shinagawa Intercity Bldg., B-13F
2-15-2 Kohnan, Minato-ku Tokyo 108-6113 Japan
Email: yone@po.ntts.co.jp
Tel: +81-3-5782-7291

Kenny HUANG
Geotempo International Ltd; TWNIC
3F, No 16 Kang Hwa Street, Nei Hu
Taipei 114, Taiwan
Email: huangk@alum.sinica.edu
Tel: +886-2-2658-6510

KIM Kyongsok/GIM Gyeongseog

References

[UNISTD3]   The Unicode Standard v3.0. Unicode Consortium.
[UCS]       ISBN 0-201-61633-5

[IDN]       "IETF Internationalized Domain Names Working Group",
            idn@ops.ietf.org, James Seng, Marc Blanchet

[CNRP]      "Common Name Resolution Protocol",
            cnrp-ietf@lists.netsol.com, Leslie Daigle

[CJKV]      CJKV Information Processing ISBN 1-56592-224-7

[C2C]       The pitfalls and Complexities of Chinese to Chinese
            Conversion. http://www.basistech.com/articles/C2C.html,
            Jack Halpern, Jouni Kerman

[KANJIDIC]  Sanseido's Unicode Kanji Information Dictionary
            ISBN 4-385-13690-4

[UNICHART]  Unicode chart http://charts.unicode.org/

[ZONGBIAO]  Simplified Characters Standard Chart 2nd Edition, 1986

[UNIHAN]    Unicode Han Database, Unicode Consortium
            ftp://ftp.unicode.org/Public/UNIDATA/Unihan.txt

[ISO11941]  ISO TS 11941: Information and documentation -
            Transliteration of Korean script into Latin characters.
            Technical Specification 11941. First edition. 1996-12-31.
            ISO (International Organization for Standardization).

[KimK 1990] "A New Proposal for a Standard Hangeul (or Korean Script)
            Code", KIM Kyongsok. Computer Standards & Interfaces,
            Vol. 9, No. 3, pp. 187-202, 1990.

[KimK 1992] "A common Approach to Designing the Hangeul Code and
            Keyboard", KIM Kyongsok. Computer Standards & Interfaces,
            Vol. 14, No. 4, pp. 297-325, Aug. 1992.

[KimK 1999] A Hangeul story inside computers. KIM, Kyongsok. Busan
            National University Press. 1999. [in Hangeul]