> > The IESG has received a request to consider UTF-16, an encoding of ISO
> > 10646 <draft-hoffman-utf16-04.txt> as Proposed Standard. This has
> > been reviewed in the IETF but is not the product of an IETF Working
> > Group.

> > The IESG plans to make a decision in the next few weeks, and solicits
> > final comments on this action. Please send any comments to the
> > iesg at ietf.org or ietf at ietf.org mailing lists by September 13, 1999.

> > Files can be obtained via
> > http://www.ietf.org/internet-drafts/draft-hoffman-utf16-04.txt

> Here is my response to last call on UTF-16 draft

> I appreciate the draft and I agree to the purpose of
> registering the MIME-charsets UTF-16 UTF-16LE and UTF-16BE.
> But I have some objections to publishing it as an RFC in its
> current form.

> 1. The RFC should not be standards track, but rather an informative RFC.
> IETF should not define character sets normatively, and I believe
> this was actually set as an IAB policy some years ago.

There are two independent issues here: whether the IETF should define
charsets normatively, and whether charset registration RFCs should be on
the standards track.

As to the first issue, a clear consensus emerged from the discussions that
led to RFC 2279 that the IETF has no choice but to make such normative
definitions. I see no material difference in the situation surrounding
UTF-16 that suggests that the alternative Keld suggests is viable.

Specifically, the approach taken in RFC 2279 is to define UTF-8 as aligned
with ISO 10646 plus any published amendments unless an amendment appears
that isn't backwards compatible. I believe this is the approach to take
with UTF-16 as well.

Now, insofar as standards track status is concerned, again we see that a
precedent has been established for this by RFC 2279, which is standards
track. I see no reason to deviate from this approach.

> 2. Instead RFCs on charset should refer to the normative
> specifications defining the charsets, and to refer to
> ISO specifications whenever possible. The draft gives preference
> to an industry standard (Unicode) where the ISO standard (10646)
> gives the same specification, and should be preferred over the industry
> standard. This would be in line with how other charsets are defined,
> for example the ISO 8859 series, where the preceding and technically
> equivalent ECMA standards are not referenced. I think it is appropriate,
> though, to mention the Unicode standard the first time the ISO
> standard is introduced. The charsets should be defined with reference
> to the ISO standard, and not do its own definition. An explanation
> (but not definition) of the standard could be appropriate.

I agree with the principle that ISO specifications should be referenced
when possible (in keeping with RFC 2279). However, I'm not sure I see a
case in the current document where Unicode is referenced preferentially
over ISO 10646. I'd like to see specific examples of something that needs
changing here.

> 3. Referencing the ISO standard would also mean that ISO terminology
> should be used in the draft. In the current draft a mixture of
> ISO and Unicode terminology, together with newly invented terminology
> is used, furthering the confusion about charset concepts.

The implicit assumption here is that ISO terminology is consistent. It
isn't. Nor do I think that ISO terminology is always better, or less
confusing. It is neither.

Again I suggest that specific examples be given of problematic text and how
it should be changed. I am much more comfortable with specific examples
than I am with arguments of the form "ISO good, Unicode bad".

> 4. The draft mentions UTF-8 being variable width, and does not mention
> that UTF-16 is also variable-width. I think it would be prudent to
> also mention that UTF-16 is variable width, the reader could get
> the impression that UTF-16 is not variable width.

I agree that this should be made explicit in the document -- the
implication currently is that UTF-16 is fixed width, and it isn't.

> 5. ISO/IEC 10646 has recently adopted more than 12 amendments,
> so the references section may be updated in this respect.
> But anyway that note will be inaccurate when new amendments are
> approved.

The RFC 2279 approach should be used here, where the specification
automatically aligns to amendments as long as they are backwards
compatible. However, I do agree that the references should list all
amendments that have been made thus far.

Ned
Note Well: Messages sent to this mailing list are the opinions of the senders and do not imply endorsement by the IETF.