[Ltru] Re: UTF-8



Peter Constable <petercon at microsoft dot com> wrote:

There's a prior question: what will the content of registry records contain? (We don't need an encoding that supports the entire UCS if we don't intend to have records using characters from that entire repertoire.) I don't recall if that's been discussed and resolved.

The existing RFC 4646 Registry already contains instances of 11 different non-ASCII characters, including 3 that are not in Latin-1. The current set of 7,600 reference names in ISO/FDIS 639-3 contains almost 500 non-ASCII Latin-1 characters, all of which will end up in the Registry. (They have to; some language names in 639-3 are differentiated only by an accent.)
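
For anyone who wants to reproduce this kind of count, here is a rough Python sketch of how one might tally non-ASCII characters by range. The function names and the two sample strings are illustrative assumptions only; they are not the actual Registry or 639-3 data, and this is not the script used to produce the figures above.

```python
from collections import Counter


def classify(ch: str) -> str:
    """Bucket a character by the smallest repertoire that contains it."""
    cp = ord(ch)
    if cp < 0x80:
        return "ascii"
    if cp <= 0xFF:
        return "latin-1"
    return "beyond-latin-1"


def tally_non_ascii(lines):
    """Count occurrences of non-ASCII characters, grouped by bucket."""
    counts = Counter()
    for line in lines:
        for ch in line:
            kind = classify(ch)
            if kind != "ascii":
                counts[kind] += 1
    return counts


# Illustrative sample only (hypothetical input, already decoded to Unicode).
sample = ["Norwegian Bokm\u00e5l", "Ethiopic (Ge\u02bbez)"]
result = tally_non_ascii(sample)
```

Run against the real name lists, the "latin-1" and "beyond-latin-1" buckets would correspond to the two categories mentioned above.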

Clearly the Registry needs an encoding that supports characters beyond ASCII or even Latin-1. We already have one: the much-grumbled-about hex NCRs (numeric character references). UTF-8 would be another. While we could limit the Registry to a subset of the UCS, such as MES-2 -- or MES-1, if we were willing to go through the intense pain of removing "Ethiopic (Ge&#x2BB;ez)" -- I don't see what advantage it would bring.
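
To make the two options concrete, here is a minimal Python sketch that converts between the hex NCR form and UTF-8 for the description quoted above. The helper names `decode_hex_ncrs` and `encode_hex_ncrs` are hypothetical, and the sketch assumes that only hexadecimal NCRs of the form `&#x...;` appear in the text; it is not a description of how the Registry is actually processed.

```python
import re

# Matches hexadecimal numeric character references such as "&#x2BB;".
_HEX_NCR = re.compile(r"&#x([0-9A-Fa-f]+);")


def decode_hex_ncrs(text: str) -> str:
    """Replace each hex NCR with the character it names."""
    return _HEX_NCR.sub(lambda m: chr(int(m.group(1), 16)), text)


def encode_hex_ncrs(text: str) -> str:
    """Replace each non-ASCII character with an uppercase hex NCR."""
    return "".join(c if ord(c) < 0x80 else f"&#x{ord(c):X};" for c in text)


record = "Ethiopic (Ge&#x2BB;ez)"        # ASCII-only form using a hex NCR
decoded = decode_hex_ncrs(record)         # contains U+02BB directly
utf8_bytes = decoded.encode("utf-8")      # U+02BB becomes the bytes CA BB
round_trip = encode_hex_ncrs(decoded)     # back to the ASCII-only form
```

Under these assumptions the conversion is lossless in both directions, so the choice between the two is about readability and tooling rather than about which characters can be represented.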

--
Doug Ewell
Fullerton, California, USA
http://users.adelphia.net/~dewell/
RFC 4645  *  UTN #14


_______________________________________________
Ltru mailing list
Ltru at ietf.org
https://www1.ietf.org/mailman/listinfo/ltru




Note Well: Messages sent to this mailing list are the opinions of the senders and do not imply endorsement by the IETF.