
Re: [Ltru] Re: UTF-8



This thread is very amusing, but, I think, not very useful. I think that using a widely-recognized, plain-text character encoding (UTF-8: "the new ASCII") would be a Good Thing. But changing which escape syntax we use, especially to use something, um, "unusual" like UTF-1 or BOCU-1, gives us nothing over using US-ASCII with some escape syntax (such as our existing NCR format).

So:

Is there any support for changing to UTF-8?

Addison
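
For readers comparing the two options above, here is a minimal sketch of the same string stored as US-ASCII with escapes and as raw UTF-8. It assumes the NCR format in question is the hexadecimal `&#xHHHH;` form; the helper name `to_ncr` and the sample string are illustrative only and do not come from the thread.

```python
def to_ncr(text: str) -> str:
    """Escape every non-ASCII character as a hexadecimal NCR (&#xHHHH;).

    Assumption: the hexadecimal NCR form is the escape syntax meant here.
    """
    return "".join(ch if ord(ch) < 0x80 else f"&#x{ord(ch):X};" for ch in text)


# Illustrative sample; \u00e5 is LATIN SMALL LETTER A WITH RING ABOVE.
sample = "Norsk bokm\u00e5l"

as_ncr = to_ncr(sample)           # pure US-ASCII text with an escape
as_utf8 = sample.encode("utf-8")  # the same text as UTF-8 octets

print(as_ncr)
print(as_utf8)
```

Either form round-trips the same characters; the difference is only whether a reader of the raw file sees an escape or the character itself.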

Frank Ellermann wrote:
> Doug Ewell wrote:
>
>> I'm not sure why you chose 0x86 as your sequence introducer
>
> Six is the number of trailing octets: 91909F9F9F9F (for John's
> example u+10FFFF).
>
>> you could make each sequence 1 byte shorter by marking the
>> lead or trail byte specially
>
> Yes, but then only 2 octets (80+81) would never occur (instead
> of 11), and lost 9x bytes won't cause an error.  UTF-8 has now
> 13 "impossible" octets, and similar features.
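
The count of 13 octets that never occur in UTF-8 can be checked by brute force. The sketch below assumes UTF-8 as restricted to U+0000..U+10FFFF with surrogates excluded, which is what Python's built-in codec implements; the function name `utf8_unused_octets` is made up for illustration.

```python
def utf8_unused_octets() -> list[int]:
    """Return the octet values that appear in no well-formed UTF-8 sequence."""
    used = set()
    for cp in range(0x110000):
        if 0xD800 <= cp <= 0xDFFF:
            continue  # surrogate code points cannot be encoded as UTF-8
        used.update(chr(cp).encode("utf-8"))
    return sorted(set(range(256)) - used)


unused = utf8_unused_octets()
print(len(unused), [f"{b:02X}" for b in unused])
```

The result is C0 and C1 (which could only start overlong two-octet forms) plus F5 through FF (which would encode values beyond U+10FFFF).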


--
Addison Phillips
Globalization Architect -- Yahoo! Inc.

Internationalization is an architecture.
It is not a feature.




