idnits 2.17.1 draft-whistler-plane14-00.txt: ** The Abstract section seems to be numbered Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** Cannot find the required boilerplate sections (Copyright, IPR, etc.) in this document. Expected boilerplate is as follows today (2024-04-24) according to https://trustee.ietf.org/license-info : IETF Trust Legal Provisions of 28-dec-2009, Section 6.a: This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79. IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 2: Copyright (c) 2024 IETF Trust and the persons identified as the document authors. All rights reserved. IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 3: This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- ** Missing expiration date. The document expiration date should appear on the first and last page. ** The document seems to lack a 1id_guidelines paragraph about Internet-Drafts being working documents. ** The document seems to lack a 1id_guidelines paragraph about 6 months document validity -- however, there's a paragraph with a matching beginning. Boilerplate error? 
** The document seems to lack a 1id_guidelines paragraph about the list of current Internet-Drafts. ** The document seems to lack a 1id_guidelines paragraph about the list of Shadow Directories. == Mismatching filename: the document gives the document name as 'draft-whistler-plane14-01', but the file name used is 'draft-whistler-plane14-00' == No 'Intended status' indicated for this document; assuming Proposed Standard == The page length should not exceed 58 lines per page, but there was 1 longer page, the longest (page 1) being 604 lines Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack an IANA Considerations section. (See Section 2.2 of https://www.ietf.org/id-info/checklist for how to handle the case when there are no actions for IANA.) ** The document seems to lack separate sections for Informative/Normative References. All references will be assumed normative when checking for downward references. ** There are 5 instances of too long lines in the document, the longest one being 3 characters in excess of 72. ** The abstract seems to contain references ([UNICODE], [ISO10646], [RFC1766]), which it shouldn't. Please replace those with straight textual mentions of the documents in question. Miscellaneous warnings: ---------------------------------------------------------------------------- -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (February 15, 1998) is 9565 days in the past. Is this intentional? 
Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Missing Reference: 'A-Z' is mentioned on line 364, but not defined == Unused Reference: 'RFC2070' is defined on line 553, but no explicit reference was found in the text -- Possible downref: Non-RFC (?) normative reference: ref. 'ISO10646' ** Obsolete normative reference: RFC 1766 (Obsoleted by RFC 3066, RFC 3282) ** Obsolete normative reference: RFC 2070 (Obsoleted by RFC 2854) ** Downref: Normative reference to an Informational RFC: RFC 2130 -- Possible downref: Non-RFC (?) normative reference: ref. 'UNICODE' Summary: 14 errors (**), 0 flaws (~~), 5 warnings (==), 4 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 1 Network Working Group Ken Whistler, Sybase 2 Internet Draft Glenn Adams, Spyglass 3 5 Language Tagging in Unicode Plain Text 7 February 15, 1998 9 Status of this Memo 11 This document is an Internet-Draft. Internet-Drafts are working 12 documents of the Internet Engineering Task Force (IETF), its areas, and 13 its working groups. Note that other groups may also distribute working 14 documents as Internet-Drafts. 16 Internet-Drafts are draft documents valid for a maximum of six months. 17 Internet-Drafts may be updated, replaced, or obsoleted by other 18 documents at any time. It is not appropriate to use Internet-Drafts as 19 reference material or to cite them other than as a "working draft" or 20 "work in progress". 
22 To learn the current status of any Internet-Draft, please check the 23 1id-abstracts.txt listing contained in the Internet-Drafts Shadow 24 Directories on ds.internic.net (US East Coast), nic.nordu.net (Europe), 25 ftp.isi.edu (US West Coast), or munnari.oz.au (Pacific Rim). 27 1. Abstract 29 This document proposes a mechanism for language tagging in [UNICODE] 30 plain text. A set of special-use tag characters on Plane 14 of 31 [ISO10646] (accessible through UTF-8, UTF-16, and UCS-4 encoding forms) 32 is proposed for encoding to enable the spelling out of ASCII-based 33 string tags using characters which can be strictly separated from 34 ordinary text content characters in ISO10646 (or UNICODE). 36 One tag identification character and one cancel tag character are also 37 proposed. In particular, a language tag identification character is 38 proposed to identify a language tag string specifically; the language 39 tag itself makes use of [RFC1766] language tag strings spelled out 40 using the Plane 14 tag characters. Provision of a specific, 41 low-overhead mechanism for embedding language tags in plain text is 42 aimed at meeting the need of Internet Protocols such as ACAP, which 43 require a standard mechanism for marking language in UTF-8 strings. 45 The tagging mechanism as well as the characters proposed in this document 46 have been approved by the Unicode Consortium for inclusion in The 47 Unicode Standard. However, implementation of this decision awaits 48 formal acceptance by ISO JTC1/SC2/WG2, the working group responsible 49 for ISO10646. Potential implementers should be aware that until this 50 formal acceptance occurs, any usage of the characters proposed herein 51 is strictly experimental and not sanctioned for standardized character 52 data interchange. 54 2. Definitions and Notation 56 No attempt is made to define all terms used in this document. 
In 57 particular, the terminology pertaining to the subject of coded 58 character systems is not explicitly specified. See [UNICODE], 59 [ISO10646], and [RFC2130] for additional definitions in this area. 61 2.1 Requirements Notation 63 This document occasionally uses terms that appear in capital letters. 64 When the terms "MUST", "SHOULD", "MUST NOT", "SHOULD NOT", and "MAY" 65 appear capitalized, they are being used to indicate particular 66 requirements of this specification. A discussion of the meanings of 67 these terms appears in [RFC2119]. 69 2.2 Definitions 71 The terms defined below are used in special senses and thus warrant 72 some clarification. 74 2.2.1 Tagging 76 The association of attributes of text with a point or range of the 77 primary text. (The value of a particular tag is not generally 78 considered to be a part of the "content" of the text. Typical examples 79 of tagging are marking the language or font of a portion of text.) 81 2.2.2 Annotation 83 The association of secondary textual content with a point or range of 84 the primary text. (The value of a particular annotation *is* considered 85 to be a part of the "content" of the text. Typical examples include 86 glossing, citations, exemplification, Japanese yomi, etc.) 88 2.2.3 Out-of-band 90 An out-of-band channel conveys a tag in such a way that the textual 91 content, as encoded, is completely untouched and unmodified. This is 92 typically done by metadata or hyperstructure of some sort. 94 2.2.4 In-band 96 An in-band channel conveys a tag along with the textual content, using 97 the same basic encoding mechanism as the text itself. This is done by 98 various means, but an obvious example is SGML markup, where the tags 99 are encoded in the same character set as the text and are interspersed 100 with and carried along with the text data. 102 3.0 Background 104 There has been much discussion over the last 8 years of 105 language tagging and of other kinds of tagging of Unicode plain 106 text. 
It is fair to say that there is more-or-less universal 107 agreement that language tagging of Unicode plain text is 108 required for certain textual processes. For example, language 109 "hinting" of multilingual text is necessary for multilingual 110 spell-checking based on multiple dictionaries to work well. 111 Language tagging provides a minimum level of required 112 information for text-to-speech processes to work correctly. 113 Language tagging is regularly done on web pages, to enable 114 selection of alternate content, for example. 116 However, there has been a great deal of controversy regarding 117 the appropriate placement of language tags. Some have 118 held that the only appropriate placement of language tags 119 (or other kinds of tags) is out-of-band, making use of 120 attributed text structures or metadata. Others have argued 121 that there are requirements for lower-complexity in-band 122 mechanisms for language tags (or other tags) in plain text. 124 The controversy has been muddied by the existence and widespread 125 use of a number of in-band text markup mechanisms (HTML, 126 text/enriched, etc.) which enable language tagging, but 127 which imply the use of general parsing mechanisms which 128 are deemed too "heavyweight" for protocol developers and 129 a number of other applications. The difficulty of using 130 general in-band text markup for simple protocols derives 131 from the fact that some characters are used both for textual 132 content and for the text markup; this makes it more difficult 133 to write simple, fast algorithms to find only the textual 134 content and ignore the tags, or vice versa. (Think of this 135 as the algorithmic equivalent of the difficulty the human 136 reader has attempting to read just the content of raw 137 HTML source text without a browser interpreting all the 138 markup tags.) 
140 The Plane 14 proposal addresses the recurrent and persistent 141 call for a lighter-weight mechanism for text tagging than 142 typical text markup mechanisms in Unicode. It proposes a special set 143 of characters used *only* for tagging. These tag characters 144 can be embedded into plain text and can be identified and/or 145 ignored with trivial algorithms, since there is no overloading 146 of usage for these tag characters--they can only express 147 tag values and never textual content itself. 149 The Plane 14 proposal is not intended for general annotation 150 of text, such as textual citations, phonetic readings (e.g. 151 Japanese Yomi), etc. In its present form, its use is intended 152 to be restricted solely to specifying in-line language tags. 153 Future extensions may widen this scope of intended usage. 155 4.0 Proposal 157 This proposal suggests the use of 97 dedicated tag characters 158 encoded at the start of Plane 14 of ISO/IEC 10646 consisting of 159 a clone of the 94 printable 7-bit ASCII graphic characters and 160 ASCII SPACE, as well as a tag identification character and a tag 161 cancel character. 163 These tag characters are to be used to spell out any 164 ASCII-based tagging scheme which needs to be embedded in Unicode 165 plain text. In particular, they can be used to spell out 166 language tags in order to meet the expressed requirements 167 of the ACAP protocol and the likely requirements of other 168 new protocols following the guidelines of the IAB character 169 workshop (RFC 2130). 171 The suggested range in Plane 14 for the block reserved for 172 tag characters is as follows, expressed in each of the 173 three most generally used encoding schemes for ISO/IEC 174 10646: 176 UCS-4 178 U-000E0000 .. U-000E007F 180 UTF-16 182 U+DB40 U+DC00 .. U+DB40 U+DC7F 184 UTF-8 186 0xF3 0xA0 0x80 0x80 .. 0xF3 0xA0 0x81 0xBF 188 Of this range, U-000E0020 .. U-000E007E is the 189 suggested range for the ASCII clone tag characters themselves. 
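The three forms of the range given above follow mechanically from the UCS-4 values. The following C fragment is a minimal sketch of that derivation, not part of the proposal itself. The function names plane14_to_utf16 and plane14_to_utf8 are hypothetical, and both functions assume that the caller passes a code point in the range 0x10000 .. 0x10FFFF.

```c
#include <stdint.h>

/* Hypothetical helper: split a code point in 0x10000..0x10FFFF
   into a UTF-16 surrogate pair. */
static void plane14_to_utf16(uint32_t cp, uint16_t out[2])
{
    uint32_t v = cp - 0x10000u;                   /* 20-bit offset   */
    out[0] = (uint16_t)(0xD800u + (v >> 10));     /* high surrogate  */
    out[1] = (uint16_t)(0xDC00u + (v & 0x3FFu));  /* low surrogate   */
}

/* Hypothetical helper: encode a code point in 0x10000..0x10FFFF
   as a four-byte UTF-8 sequence. */
static void plane14_to_utf8(uint32_t cp, uint8_t out[4])
{
    out[0] = (uint8_t)(0xF0u | (cp >> 18));           /* lead byte   */
    out[1] = (uint8_t)(0x80u | ((cp >> 12) & 0x3Fu)); /* bits 17..12 */
    out[2] = (uint8_t)(0x80u | ((cp >> 6) & 0x3Fu));  /* bits 11..6  */
    out[3] = (uint8_t)(0x80u | (cp & 0x3Fu));         /* bits 5..0   */
}
```

Applied to U-000E0000 and U-000E007F, these helpers yield the UTF-16 and UTF-8 endpoints listed above.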
191 4.1 Names for the Tag Characters 193 The names for the ASCII clone tag characters should be exactly 194 the ISO 10646 names for 7-bit ASCII, prefixed with the word 195 "TAG". 197 In addition, there is one tag identification character 198 and a CANCEL TAG character. The use and syntax of these characters 199 are described in detail below. 201 The entire encoding for the proposed Plane 14 tag characters and 202 names of those characters can be derived from the following list. 203 (The encoded values here and throughout this proposal are listed 204 in UCS-4 form, which is easiest to interpret. It is assumed that 205 most Unicode applications will, however, be making use either 206 of UTF-16 or UTF-8 encoding forms for actual implementation.) 208 U-000E0000 209 U-000E0001 LANGUAGE TAG 210 U-000E0002 211 .... 212 U-000E001F 213 U-000E0020 TAG SPACE 214 U-000E0021 TAG EXCLAMATION MARK 215 .... 216 U-000E0041 TAG LATIN CAPITAL LETTER A 217 .... 218 U-000E007A TAG LATIN SMALL LETTER Z 219 .... 220 U-000E007E TAG TILDE 221 U-000E007F CANCEL TAG 223 4.2 Range Checking for Tag Characters 225 The range checks required for code testing for tag characters 226 would be as follows. The same range check is expressed here 227 in C for each of the three significant encoding forms for 10646. 229 Range check expressed in UCS-4: 231 if ( ( *s >= 0xE0000 ) && ( *s <= 0xE007F ) ) 233 Range check expressed in UTF-16 (Unicode): 235 if ( ( *s == 0xDB40 ) && ( *(s+1) >= 0xDC00 ) && ( *(s+1) <= 0xDC7F ) ) 237 Range check expressed in UTF-8: 239 if ( ( *s == 0xF3 ) && ( *(s+1) == 0xA0 ) && ( ( *(s+2) & 0xFE ) == 0x80 ) ) 241 Because of the choice of the range for the tag characters, it would also 242 be possible to express the range check for UCS-4 or UTF-16 in terms of 243 bitmask operations. 245 4.3 Syntax for Embedding Tags 247 The use of the Plane 14 tag characters is very simple. 
In order 248 to embed any ASCII-derived tag in Unicode plain text, the tag 249 is simply spelled out with the tag characters instead, prefixed 250 with the relevant tag identification character. The 251 resultant string is embedded directly in the text. 253 The tag identification character is used as a mechanism for 254 identifying tags of different types. This enables multiple 255 types of tags to coexist amicably embedded in plain text and 256 solves the problem of delimitation if a tag is concatenated 257 directly onto another tag. Although only one type of tag is 258 currently specified, namely the language tag, the encoding 259 of other tag identification characters in the future would 260 allow for distinct tag types to be used. 262 No termination character is required for a tag. A tag terminates 263 either when the first non Plane 14 Tag Character (i.e. any 264 other normal Unicode value) is encountered, or when the next 265 tag identification character is encountered. 267 All tag arguments must be encoded only with the tag characters 268 U-000E0020 .. U-000E007E. No other characters are valid for 269 expressing the tag argument. 271 A detailed BNF syntax for tags is listed below. 273 4.4 Tag Scope and Nesting 275 The value of an established tag continues from the point the 276 tag is embedded in text until either: 278 A. The text itself goes out of scope, as defined by the 279 application. (E.g. for line-oriented protocols, when 280 reaching the end-of-line or end-of-string; for text 281 streams, when reaching the end-of-stream; etc.) 283 or 285 B. The tag is explicitly cancelled by the CANCEL TAG 286 character. 288 Tags of the same type cannot be nested in any way. The appearance 289 of a new embedded language tag, for example, after text which 290 was already language tagged, simply changes the tagged value for 291 subsequent text to that specified in the new tag. 293 Tags of different type can have interdigitating scope, but 294 not hierarchical scope. 
In effect, 295 tags of different type completely ignore each other, so that 296 the use of language tags can be completely asynchronous with the 297 use of character set source tags (or any other tag type) in the 298 same text in the future. 300 4.5 Cancelling Tag Values 302 U-000E007F CANCEL TAG is provided to allow the specific cancelling 303 of a tag value. The use of CANCEL TAG has the following syntax. 304 To cancel a tag value of a particular type, prefix the CANCEL 305 TAG character with the tag identification character of the 306 appropriate type. For example, the complete string to cancel 307 a language tag is: 309 U-000E0001 U-000E007F 311 The value of the relevant tag type returns to the default state 312 for that tag type, namely: no tag value specified, the same as 313 untagged text. 315 The use of CANCEL TAG without a prefixed tag identification 316 character cancels *any* Plane 14 tag values which may be 317 defined. Since only language tags are currently provided with 318 an explicit tag identification character, only language tags 319 are currently affected. 321 The main function of CANCEL TAG is to make possible such 322 operations as blind concatenation of strings in a tagged context 323 without the propagation of inappropriate tag values across the 324 string boundaries. For example, a string tagged with a Japanese 325 language tag can have its tag value "sealed off" with a terminating 326 CANCEL TAG before another string of unknown language value is 327 concatenated to it. This would prevent the string of unknown 328 language from being erroneously marked as being Japanese simply 329 because of a concatenation to a Japanese string. 331 4.6 Tag Syntax Description 333 An extended BNF (Backus-Naur Form) description of the tags specified 334 in this proposal is found below. Note the following BNF extensions 335 used in this formalism: 337 1. 
Semantic constraints are specified by rules in the form of an 338 assertion specified between double braces; the variable $$ denotes 339 the string consisting of all terminal symbols matched by 340 this non-terminal. 342 Example: {{ Assert ( $$[0] == '?' ); }} 344 Meaning: The first character of the string matched by this 345 non-terminal must be '?' 347 2. A number of predicate functions are employed in semantic constraint 348 rules which are not otherwise defined; their names are sufficient for 349 determining their meaning. 351 Example: IsRFC1766LanguageIdentifier ( tag-argument ) 353 Meaning: tag-argument is a valid RFC1766 language identifier 355 3. A lexical expander function, TAG, is employed to denote the tag 356 form of an ASCII character; the argument to this function is either 357 a character or a character set specified by a range or enumeration 358 expression. 360 Example: TAG('-') 362 Meaning: TAG HYPHEN-MINUS 364 Example: TAG([A-Z]) 366 Meaning: TAG LATIN CAPITAL LETTER A ... 367 TAG LATIN CAPITAL LETTER Z 369 4. A macro is employed to denote terminal symbols that are character 370 literals which can't be directly represented in ASCII. The argument 371 to the macro is the UNICODE (ISO/IEC 10646) character name. 373 Example: '${TAG CANCEL}' 375 Meaning: character literal whose code value is U-000E007F 377 5. Occurrence indicators used are '+' (one or more) and '*' (zero 378 or more); optional occurrence is indicated by enclosure in '[' 379 and ']'. 
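Because the tag characters are a clone of ASCII at a fixed offset, the TAG expander of item 3 reduces to adding the ASCII value to U-000E0000. The following C fragment is a minimal sketch of that idea. The names tag_char and encode_language_tag are hypothetical, the output is UCS-4, and the lowercasing step follows the recommendation made for language tags in Section 5.1. The sketch does not check that the argument is a valid RFC 1766 language identifier.

```c
#include <stddef.h>
#include <stdint.h>

#define TAG_BASE     0xE0000u  /* start of the Plane 14 tag block */
#define LANGUAGE_TAG 0xE0001u  /* U-000E0001 LANGUAGE TAG         */

/* TAG(c): tag clone of an ASCII character in 0x20..0x7E.
   Returns 0 for any character that has no tag clone. */
static uint32_t tag_char(unsigned char c)
{
    return (c >= 0x20 && c <= 0x7E) ? TAG_BASE + c : 0;
}

/* Spell out an ASCII language tag as UCS-4 tag characters,
   prefixed by LANGUAGE TAG, lowercasing A-Z on the way.
   Returns the number of code points written, or 0 on error
   (empty input, invalid character, or buffer too small). */
static size_t encode_language_tag(const char *ascii,
                                  uint32_t *out, size_t cap)
{
    size_t n = 0;

    if (cap == 0 || *ascii == '\0')
        return 0;
    out[n++] = LANGUAGE_TAG;
    for (; *ascii != '\0'; ascii++) {
        unsigned char c = (unsigned char)*ascii;
        uint32_t t;

        if (c >= 'A' && c <= 'Z')
            c = (unsigned char)(c - 'A' + 'a');
        t = tag_char(c);
        if (t == 0 || n == cap)
            return 0;
        out[n++] = t;
    }
    return n;
}
```

Under these assumptions, encoding "ja-JP" produces U-000E0001 U-000E006A U-000E0061 U-000E002D U-000E006A U-000E0070.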
381 4.6.1 Formal Tag Syntax 383 tag : language-tag 384 | cancel-all-tag 385 ; 387 language-tag : language-tag-introducer language-tag-argument 388 ; 390 language-tag-argument : tag-argument 391 {{ Assert ( IsRFC1766LanguageIdentifier ( $$ ) ); }} 392 | tag-cancel 393 ; 395 cancel-all-tag : tag-cancel 396 ; 398 tag-argument : tag-character+ 399 ; 401 tag-character : { c : c in 402 TAG( { a : a in printable ASCII characters or SPACE } ) } 403 ; 405 language-tag-introducer : '${TAG LANGUAGE}' 406 ; 408 tag-cancel : '${TAG CANCEL}' 409 ; 411 5.0 Tag Types 413 5.1 Language Tags 415 Language tags are of general interest and should have a high 416 degree of interoperability for protocol usage. To this end, a 417 specific LANGUAGE TAG tag identification character is provided. 418 A Plane 14 tag string prefixed by U-000E0001 LANGUAGE TAG is 419 specified to constitute a language tag. Furthermore, the tag values 420 for the language tag are to be spelled out as specified in RFC 421 1766, making use only of registered tag values or of user-defined 422 language tags starting with the characters "x-". 424 For example, to embed a language tag for Japanese, the Plane 14 425 characters would be used as follows. The Japanese tag from RFC 1766 426 is "ja" (composed of ISO 639 language id) or, alternatively, 427 "ja-JP" (composed of ISO 639 language id plus ISO 3166 country id). 428 Since RFC 1766 specifies that language tags are not case significant, 429 it is recommended that for language tags, the entire tag be 430 lowercased before conversion to Plane 14 tag characters. (This 431 would not be required for Unicode conformance, but should be followed 432 as general practice by protocols making use of RFC 1766 language tags, 433 to simplify and speed up the processing for operations which need to 434 identify or ignore language tags embedded in text.) 
Lowercasing, 435 rather than uppercasing, is recommended because it follows the majority 436 practice of expressing language tag values in lowercase letters. 438 Thus the entire language tag (in its longer form) would be converted 439 to Plane 14 tag characters as follows: 441 U-000E0001 U-000E006A U-000E0061 U-000E002D U-000E006A U-000E0070 443 The language tag (in its shorter, "ja" form) could be expressed 444 as follows: 446 U-000E0001 U-000E006A U-000E0061 448 The value of this string is then expressed in whichever encoding 449 form (UCS-4, UTF-16, UTF-8) is required and embedded in text at 450 the relevant point. 452 5.2 Additional Tags 454 Additional tag identification characters might be defined in the 455 future. An example would be a CHARACTER SET SOURCE TAG, or a 456 GENERIC TAG for private definition of tags. 458 In each case, when a specific tag identification character is encoded, 459 a corresponding reference standard for the values of the tags associated 460 with the identifier should be designated, so that interoperating 461 parties which make use of the tags will know how to interpret the 462 values the tags may take. 464 6.0 Display Issues 466 All characters in the tag character block are considered to have 467 no visible rendering in normal text. A process which interprets 468 tags may choose to modify the rendering of text based on the tag 469 values (as for example, changing font to preferred style for 470 rendering Chinese versus Japanese). The tag characters 471 themselves have no display; they may be considered similar to 472 a U+200B ZERO WIDTH SPACE in that regard. The tag characters also 473 do not affect breaking, joining, or any other format or layout 474 properties, except insofar as the process interpreting the 475 tag chooses to impose such behavior based on the tag value. 
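A process that adjusts rendering according to tag values needs to know which language value is in effect at a given character. The following C fragment is a minimal sketch of such a scan over UCS-4 text, applying the termination, replacement, and cancellation rules of Sections 4.3 through 4.5. The function name language_at is hypothetical. Treating a tag argument that is not preceded by LANGUAGE TAG as ignorable is an assumption of this sketch, not a rule of the proposal.

```c
#include <stddef.h>
#include <stdint.h>

/* Copy into lang (capacity cap, NUL-terminated) the language tag
   value in effect for text[pos]. An empty string means that the
   text is untagged at that point. The caller must ensure that pos
   does not exceed the length of text. */
static void language_at(const uint32_t *text, size_t pos,
                        char *lang, size_t cap)
{
    size_t len = 0;
    int collecting = 0;  /* nonzero while inside a tag argument */

    if (cap == 0)
        return;
    lang[0] = '\0';
    for (size_t i = 0; i < pos; i++) {
        uint32_t cp = text[i];

        if (cp == 0xE0001u) {
            /* LANGUAGE TAG: a new tag replaces the old value */
            collecting = 1;
            len = 0;
            lang[0] = '\0';
        } else if (cp == 0xE007Fu) {
            /* CANCEL TAG, prefixed or bare: back to untagged */
            collecting = 0;
            len = 0;
            lang[0] = '\0';
        } else if (cp >= 0xE0020u && cp <= 0xE007Eu) {
            /* tag argument character: map back to ASCII */
            if (collecting && len + 1 < cap) {
                lang[len++] = (char)(cp - 0xE0000u);
                lang[len] = '\0';
            }
        } else {
            /* any other value terminates the tag */
            collecting = 0;
        }
    }
}
```

A renderer could call language_at for the first character of each run of text and select, for example, a Japanese or a Chinese font according to the result.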
477 For debugging or other operations which must render the tags 478 themselves visible, it is advisable that the tag characters be 479 rendered using the corresponding ASCII character glyphs (perhaps 480 modified systematically to differentiate them from normal ASCII 481 characters). But, as noted below, the tag character values are 482 chosen so that even without display support, the tag characters 483 will be interpretable in most debuggers. 485 8.0 Unicode Conformance Issues 487 The basic rules for Unicode conformance for the tag characters are 488 exactly the same as for any other Unicode characters. A conformant 489 process is not required to interpret the tag characters. If it does 490 not interpret tag characters, it should leave their values undisturbed 491 and do whatever it does with any other uninterpreted characters. If 492 it does interpret them, it should interpret them according to the 493 standard, i.e. as spelled-out tags. 495 So for a non-TagAware Unicode application, any language tag characters 496 (or any other kind of tag expressed with Plane 14 tag characters) 497 encountered would be handled exactly as for uninterpreted Tibetan 498 from the BMP, uninterpreted Linear B from Plane 1, or uninterpreted 499 Egyptian hieroglyphics from private use space in Plane 15. 501 A TagAware but TagPhobic Unicode application can recognize the tag 502 character range in Plane 14 and choose to deliberately strip them 503 out completely to produce plain text with no tags. 505 The presence of a correctly formed tag cannot be taken as a 506 guarantee that the data so tagged is correctly tagged. For example, 507 nothing prevents an application from erroneously labelling French 508 data as Spanish, or from labelling JIS-derived data as Japanese, even 509 if it contains Greek or Cyrillic characters. 
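The TagAware but TagPhobic behavior described in this section amounts to a single range check per character. The following C fragment is a minimal sketch for UCS-4 text held in an array. The function name strip_plane14_tags is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Remove every code point in U-000E0000 .. U-000E007F from a
   UCS-4 buffer in place. Returns the new length. */
static size_t strip_plane14_tags(uint32_t *text, size_t len)
{
    size_t out = 0;

    for (size_t i = 0; i < len; i++) {
        /* keep only code points outside the tag block */
        if (text[i] < 0xE0000u || text[i] > 0xE007Fu)
            text[out++] = text[i];
    }
    return out;
}
```

Because the tag characters never carry textual content, the stripped result is the same plain text without its tags.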
511 8.1 Note on Encoding Language Tags 513 The fact that this proposal for encoding tag characters in 514 Unicode includes a mechanism for specifying language tag values 515 does not mean that Unicode is departing from one of its 516 basic encoding principles: 518 Unicode encodes scripts, not languages. 520 This is still true of the Unicode encoding (and ISO/IEC 10646), even 521 in the presence of a mechanism for specifying language tags 522 in plain text. There is nothing obligatory about the use of Plane 14 523 tags, whether for language tags or any other kind of tags. 525 Language tagging in no way impacts current encoded characters 526 or the encoding of future scripts. 528 It is fully anticipated that implementations of Unicode which 529 already make use of out-of-band mechanisms for language tagging 530 or "heavy-weight" in-band mechanisms such as HTML will continue 531 to do exactly what they are doing and will ignore Plane 14 532 tag characters completely. 534 9.0 Security Considerations 536 Security issues are not discussed in this memo. 538 ************************************************************************ 540 References 542 [ISO10646] 544 ISO/IEC 10646-1:1993 International Organization for Standardization. 545 "Information Technology -- Universal Multiple-Octet Coded Character 546 Set (UCS) -- Part 1: Architecture and Basic Multilingual Plane", 547 Geneva, 1993. 549 [RFC1766] 551 Alvestrand, H., "Tags for the Identification of Languages", RFC 1766. 553 [RFC2070] 555 F. Yergeau, G. Nicol, G. Adams, and M. Duerst, "Internationalization 556 of the Hypertext Markup Language", RFC 2070, January 1997. 558 [RFC2119] 560 S. Bradner, "Key words for use in RFCs to Indicate Requirement Levels", 561 RFC 2119, March 1997. 563 [RFC2130] 565 C. Weider, C. Preston, K. Simonsen, H. Alvestrand, R. Atkinson, 566 M. Crispin, and P. Svanberg, "The Report of the IAB Character Set 567 Workshop held 29 February - 1 March, 1996", RFC 2130, April 1997. 
569 [UNICODE] 571 The Unicode Standard, Version 2.0, The Unicode Consortium, 572 Addison-Wesley, July 1996. 574 Acknowledgements 576 The following people also contributed to this document, directly or 577 indirectly: Chris Newman, Mark Crispin, Rick McGowan, Joe Becker, 578 John Jenkins, and Asmus Freytag. This document also was reviewed by 579 the Unicode Technical Committee, and the authors wish to thank all 580 of the UTC representatives for their input. The authors are, of course, 581 responsible for any errors or omissions which may remain in the text. 583 Authors' Addresses 585 Ken Whistler 586 Sybase, Inc. 587 6475 Christie Ave. 588 Emeryville, CA 94608-1050 589 Phone: +1 510 922 3611 590 Email: kenw@sybase.com 592 Glenn Adams 593 Spyglass, Inc. 594 One Cambridge Center 595 Cambridge, MA 02142 596 Phone: +1 617 679 4652 597 Email: glenn@spyglass.com