Re: [iesg-secretary at ietf.org: Last Call: UTF-16, an encoding of ISO 10646 to Proposed]



> > The IESG has received a request to consider UTF-16, an encoding of ISO
> > 10646 <draft-hoffman-utf16-04.txt> as Proposed Standard. This has
> > been reviewed in the IETF but is not the product of an IETF Working
> > Group.

> > The IESG plans to make a decision in the next few weeks, and solicits
> > final comments on this action.  Please send any comments to the
> > iesg at ietf.org or ietf at ietf.org mailing lists by September 13, 1999.

> > Files can be obtained via
> > http://www.ietf.org/internet-drafts/draft-hoffman-utf16-04.txt

> Here is my response to last call on UTF-16 draft

> I appreciate the draft and I agree to the purpose of
> registering the MIME-charsets UTF-16 UTF-16LE and UTF-16BE.

> But I have some objections to publishing it as an RFC in its
> current form.

> 1. The RFC should not be standards track, but rather an informative RFC.
> IETF should not define character sets normatively, and I believe
> this was actually set as an IAB policy some years ago.

There are two independent issues here: Whether the IETF should define charsets
normatively and whether or not charset registration RFCs should be on the
standards track.

As to the first issue, a clear consensus emerged from the discussions
that led to RFC 2279 that the IETF has no choice but to make such
normative definitions. I see no material difference in the situation
surrounding UTF-16 that suggests that the alternative Keld suggests is viable.

Specifically, the approach taken in RFC 2279 is to define UTF-8 as aligned
with ISO 10646 plus any published amendments unless an amendment appears
that isn't backwards compatible. I believe this is the approach to take
with UTF-16 as well.

Now, insofar as standards track status is concerned, again we see that a
precedent has been established for this by RFC 2279, which is standards track.
I see no reason to deviate from this approach.

> 2. Instead RFCs on charset should refer to the normative
> specifications defining the charsets, and to refer to
> ISO specifications whenever possible. The draft gives preference
> to an industry standard (Unicode) where the ISO standard (10646)
> gives the same specification, and should be preferred over the industry
> standard. This would be in line with how other charsets are defined,
> for example the ISO 8859 series, where the preceding and technically
> equivalent ECMA standards are not referenced. I think it is appropriate,
> though, to mention the Unicode standard the first time the ISO
> standard is introduced. The charsets should be defined with reference
> to the ISO standard, and not do its own definition. An explanation
> (but not definition) of the standard could be appropriate.

I agree with the principle that ISO specifications should be referenced when
possible (in keeping with RFC 2279). However, I'm not sure I see a case in the
current document where Unicode is referenced preferentially over ISO 10646. I'd
like to see specific examples of something that needs changing here.

> 3. Referencing the ISO standard would also mean that ISO terminology
> should be used in the draft. In the current draft a mixture of
> ISO and Unicode terminology, together with newly invented terminology
> is used, furthering the confusion about charset concepts.

The implicit assumption here is that ISO terminology is consistent. It isn't.
Nor do I think that ISO terminology is always better, or less confusing. It is
neither.

Again I suggest that specific examples be given of problematic text and how it
should be changed. I am much more comfortable with specific examples than I am
with arguments of the form "ISO good, Unicode bad".

> 4. The draft mentions UTF-8 being variable width, and does not mention
> that UTF-16 is also variable-width. I think it would be prudent to
> also mention that UTF-16 is variable width; otherwise the reader could
> get the impression that UTF-16 is not variable width.

I agree that this should be made explicit in the document -- the implication
currently is that UTF-16 is fixed width, and it isn't.
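
To make the point concrete, here is a minimal Python sketch. It is not taken
from the draft; the helper names utf16_code_units and surrogate_pair are
hypothetical and chosen only for this illustration. It counts the 16-bit code
units UTF-16 needs for a character and shows how a code point above U+FFFF is
split into a surrogate pair.

```python
def utf16_code_units(ch: str) -> int:
    """Count the 16-bit code units UTF-16 uses for a single character."""
    # "utf-16-be" emits no byte order mark, so every 2 bytes is one code unit.
    return len(ch.encode("utf-16-be")) // 2


def surrogate_pair(code_point: int) -> tuple[int, int]:
    """Split a code point above U+FFFF into its high and low surrogates."""
    offset = code_point - 0x10000
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)


# One character from ASCII, one from elsewhere in the BMP, one beyond it.
for ch in ("A", "\u20ac", "\U00010000"):
    print(f"U+{ord(ch):04X} -> {utf16_code_units(ch)} code unit(s)")
```

Characters up to U+FFFF take one code unit, while U+10000 takes two (the
surrogates D800 and DC00), so the encoded width depends on the character.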

> 5. ISO/IEC 10646 has recently adopted more than 12 amendments,
> so the references section may be updated in this respect.
> But anyway that note will be inaccurate when new amendments are
> approved.

The RFC 2279 approach should be used here, where the specification
automatically aligns to amendments as long as they are backwards compatible.
However, I do agree that the references should list all amendments that have
been made thus far.

				Ned




Note Well: Messages sent to this mailing list are the opinions of the senders and do not imply endorsement by the IETF.
