Re: [Ltru] Re: UTF-8



Descriptions and comments both allow the full range of Unicode today. Doug has noted some number of items (perhaps a dozen) that use non-ASCII (mostly Latin-script) letters or symbols. There is no check for the range of characters permitted, nor (in our discussion in draft-registry days) should there be.

I think that limiting the repertoire to anything less than the full range of the UCS would prove problematic anyway. And while I don't see a pressing need to identify languages in their native form, some people might find it offensive if it were banned outright. Similarly, some might find comments clearer if they contained some native description. I don't know.

What I do know is that the only non-arbitrary limitation we could reasonably impose would be a limitation to ASCII. And I, personally, will not support any encoding for the registry other than these two: US-ASCII or UTF-8.

Addison

Peter Constable wrote:
There's a prior question: what will the content of registry records
contain? (We don't need an encoding that supports the entire UCS if we
don't intend to have records using characters from that entire
repertoire.) I don't recall if that's been discussed and resolved.

I do support using an encoding that directly supports whatever
characters we wish to allow in the registry without use of NCRs or other
such escape mechanisms; and if we wish to allow any UCS character in the
registry then I would support using UTF-8 as the encoding.


Peter



-----Original Message-----
From: Martin Duerst [mailto:duerst at it.aoyama.ac.jp]
Sent: Tuesday, September 19, 2006 6:18 AM
To: Peter Constable; LTRU Working Group
Subject: RE: [Ltru] Re: UTF-8

[chair hat on]

Peter, with the observation below, do you want to say
you are in favor of moving to UTF-8, or against, or
did you write that strictly as an observation only?

Regards,    Martin.

At 16:04 06/09/18, Peter Constable wrote:
From: Martin Duerst [mailto:duerst at it.aoyama.ac.jp]

Agreed. From my point of view, the pain of Bokm&#$!@*l outweighs
a lot of things. The main benefit is that native people can
easily check that things are correct, which they won't do
if we show them just a number.

There's lots of software that will interpret an NCR in an HTML or XML
file and display it to the user as the actual character in the UCS for
which it is a reference. I don't think there is any software that will
do this for the file located at
http://www.iana.org/assignments/language-subtag-registry.
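
To make the NCR point concrete, here is a minimal sketch in Python of what HTML/XML-aware software does with a numeric character reference, and of the same text stored directly as UTF-8. The sample string is an assumption chosen for illustration, not a line quoted from the registry file.

```python
import html

# Hypothetical sample description, escaped with a hexadecimal NCR.
escaped = "Norwegian Bokm&#xE5;l"

# html.unescape resolves numeric (and named) character references,
# which is the step a plain-text viewer of the registry would not perform.
decoded = html.unescape(escaped)

# Stored directly as UTF-8, the same text needs no unescaping step;
# the letter U+00E5 simply occupies two bytes.
utf8_bytes = decoded.encode("utf-8")

print(decoded)
print(utf8_bytes)
```

A reader opening the escaped form in a plain-text tool sees the reference itself rather than the letter; the UTF-8 form displays correctly in any tool that decodes UTF-8.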




Peter Constable


#-#-#  Martin J. Du"rst, Assoc. Professor, Aoyama Gakuin University
#-#-#  http://www.sw.it.aoyama.ac.jp
mailto:duerst at it.aoyama.ac.jp


_______________________________________________
Ltru mailing list
Ltru at ietf.org
https://www1.ietf.org/mailman/listinfo/ltru

--
Addison Phillips
Globalization Architect -- Yahoo! Inc.

Internationalization is an architecture.
It is not a feature.

Note Well: Messages sent to this mailing list are the opinions of the senders and do not imply endorsement by the IETF.