Re: [iesg-secretary at ietf.org: Last Call: UTF-16, an encoding of ISO 10646 to Proposed]



On Tue, Sep 14, 1999 at 10:34:02AM -0400, Francois Yergeau wrote:
> At 01:38 1999-09-13 +0200, Keld Simonsen wrote:
> >Here is my response to last call on UTF-16 draft
> >
> > [snip]
> >
> >1. The RFC should not be standards track, but rather an informative RFC.
> >IETF should not define character sets normatively, and I believe
> >this was actually set as an IAB policy some years ago.
> 
> The rationale for this is the same as for RFC 2279: Standards Track IETF
> specs need to refer to Standards Track specs, not Informational.

I can understand the standards track status of RFC 2279, as we want all
IP protocols to support UTF-8. UTF-16 is a different matter: we want
the protocols to support UTF-8 as the main charset, and all other charsets
should only be supplementary. From an IAB point of view we would actually
like to discourage all other forms of UCS in our protocols. UTF-8 is
designed for and very well suited to communication, and adding more UCS
charsets will undermine the policy of having UTF-8 be the primary and default
encoding for our protocol suites. UTF-16 should not enjoy more status
than, e.g., the 8859 series or vendor charsets.

I am not sure where UTF-16 is to be used in standards track RFCs.
If this rule is applied, a number of charset-related RFCs should be standards
track, I believe.

> >2. Instead RFCs on charset should refer to the normative
> >specifications defining the charsets, and to refer to
> >ISO specifications whenever possible.
> 
> It now does. Previous comments (private mail) made us realize that the
> following sentence had not been carried over from RFC 2279: "The definitive
> reference is Annex Q of ISO/IEC 10646-1 [ISO-10646]." (Now in the first
> paragraph of section 2).

Good. Maybe I should discuss the issues from the new draft. 
Is it available somewhere?

> > The draft gives preference
> >to an industry standard (Unicode) where the ISO standard (10646)
> 
> It never did that.

The draft I read did (draft 4), IMHO.

> >3. Referencing the ISO standard would also mean that ISO terminology
> >should be used in the draft. In the current draft a mixture of
> >ISO and Unicode terminology, together with newly invented terminology
> >is used, furthering the confusion about charset concepts.
> 
> Given the price and accessibility of ISO standard, and especially of 10646,
> it is a fact that the Unicode terminology is much more widely known and
> cannot be simply ignored.  Giving preference to ISO standards does not
> extend to ignoring other important standards.  As for new terminology,
> there is exactly one case: we chose to do that because ISO 10646 did not
> have appropriate terminology whereas the Unicode terminology was felt to be
> awkward.

I am not sure what you mean by that. ISO/IEC 10646 can be obtained on
the Internet, see http://www.dkuug.dk/jtc1/sc2/wg2/docs/standards.html
All you need is Internet access and some 1 MB on your hard disk, which is
quite inexpensive these days and quite common among the interested parties
of IS 10646.

I am not sure that the Unicode terminology is more widely used, and I believe
we need to harmonize on ISO terminology and promote that as the internationally
agreed terminology. Another contributor mentioned that ISO terminology
is not consistent within itself, and I agree that there may be some
cases where that is true, but if we use IS 10646 terminology
then that is consistent, and this is also the policy that ISO groups
are using.

I do not know why the ISO terminology is not also adequate for the term
you call "character value" - the ISO term would be "character code" AFAIK.

You only mention Unicode as a supplement to 10646, but there are actually
more ISO standards that are relevant in the areas you mention, such as
15897, 14651, 14652 and the POSIX data collection at
http://www.dkuug.dk/i18n/WG15-collection

> >4. The draft mentions UTF-8 being variable width, and does not mention
> >that UTF-16 is also variable-width.
> 
> Already fixed, "variable-width" is not mentioned any more.

Sounds fine, I would like to see it.
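
To illustrate point 4, here is a minimal sketch (not taken from the draft)
that counts how many code units a few sample characters occupy in each
encoding form. The function names and the choice of sample code points are
my own assumptions, picked only for illustration; the codecs used are the
ones shipped with Python's standard library.

```python
# Illustrative only: compare the width of a character in UTF-8 and UTF-16.

def utf8_octets(ch: str) -> int:
    """Number of octets the character occupies in UTF-8."""
    return len(ch.encode("utf-8"))

def utf16_units(ch: str) -> int:
    """Number of 16-bit code units the character occupies in UTF-16."""
    # "utf-16-be" emits no byte order mark, so every 2 octets is one code unit.
    return len(ch.encode("utf-16-be")) // 2

# Hypothetical sample: three BMP code points and the first one beyond the BMP.
SAMPLES = [0x0041, 0x00E9, 0x20AC, 0x10000]

def widths():
    """Return (code point, UTF-8 octets, UTF-16 code units) for each sample."""
    return [(cp, utf8_octets(chr(cp)), utf16_units(chr(cp))) for cp in SAMPLES]

if __name__ == "__main__":
    for cp, u8, u16 in widths():
        print(f"U+{cp:04X}: UTF-8 {u8} octet(s), UTF-16 {u16} code unit(s)")
```

The code point beyond the BMP needs two 16-bit code units (a surrogate pair),
so UTF-16 is variable-width in the same sense as UTF-8, only with a coarser
unit.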

Keld Simonsen




Note Well: Messages sent to this mailing list are the opinions of the senders and do not imply endorsement by the IETF.
