Re: [EAI] Gen-ART review of draft-ietf-eai-imap-utf8-07
David,
A comment on one issue -- speaking for myself, not the WG...
--On Monday, August 31, 2009 03:03 -0400 Black_David at emc.com
wrote:
> I have been selected as the General Area Review Team (Gen-ART)
> reviewer for this draft (for background on Gen-ART, please see
> http://www.alvestrand.no/ietf/gen/art/gen-art-FAQ.html).
>...
> Major issues:
>...
> Section 8 lists a number of 8859 character sets for which
> upconversion of MIME headers MUST be supported, and then says
> "Other widely deployed MIME charsets SHOULD be supported."
> How does an implementer figure out which character sets those
> would be? As an alternative, I suggest saying something along
> the lines of: any server-supported character set that is a
> superset of ASCII should be supported for upconversion. That
> probably leads to fewer client surprises caused by UTF-8 not
> working as expected.

FWIW, "superset of ASCII" turns out to be a nightmare
definition. There are languages out there that use Latin-based
scripts but don't use all ASCII characters. While I'm not aware
of any of them that use coded character sets that don't include
all of the undecorated "Latin" characters that appear in ASCII,
that combination is certainly plausible and probably does occur
somewhere. More important, the issues with local character
coding systems are far more severe with non-Latin-based writing
systems than they are with Latin-based ones, partially because
there are different models for coding (precomposed characters
versus combining ones, format and style effectors that may
actually change the characters versus greater numbers of
precomposed forms, etc.), rendering conversion to or from
Unicode a context-dependent translation activity rather than a
character-to-character mapping one.
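
To make the precomposed-versus-combining point concrete, here is a minimal Python sketch using the standard `unicodedata` module. The helper name `to_latin1` and the choice of "é" as the sample character are illustrative assumptions, not anything taken from the draft.

```python
import unicodedata

# Two Unicode spellings of the same visible character "é".
precomposed = "\u00e9"   # LATIN SMALL LETTER E WITH ACUTE (one code point)
combining = "e\u0301"    # "e" followed by COMBINING ACUTE ACCENT (two code points)


def to_latin1(text: str) -> bytes:
    """Hypothetical helper: compose to NFC first, then encode as ISO-8859-1.

    Without the normalization step the combining form cannot be encoded,
    because U+0301 has no ISO-8859-1 code of its own.
    """
    return unicodedata.normalize("NFC", text).encode("iso-8859-1")


def encodes_directly(text: str) -> bool:
    """Report whether a code-point-by-code-point mapping to ISO-8859-1 works."""
    try:
        text.encode("iso-8859-1")
        return True
    except UnicodeEncodeError:
        return False
```

Both spellings yield the same ISO-8859-1 byte through `to_latin1`, but only the precomposed one survives a plain character-to-character mapping. Latin scripts are the easy case; the scripts discussed above need considerably more context than a single normalization pass.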

On the other hand, if the specification says "this needs to be
Unicode, presented in UTF-8, on the wire" then servers pretty
much know which character sets they use locally (or have to
deliver to MUAs) and therefore have to support and convert to
and from, without our doing anything more than reminding them
that, if they don't do those conversions, things won't work at
all well.
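
As a rough illustration of the kind of header upconversion being discussed, the sketch below decodes an RFC 2047 encoded-word and re-emits it as UTF-8 using Python's standard `email.header` module. The function name `upconvert_header` and the sample header value are assumptions made for the example; the draft does not prescribe this procedure.

```python
from email.header import decode_header, make_header


def upconvert_header(raw_value: str) -> bytes:
    """Hypothetical helper: decode any encoded-words, return UTF-8 bytes.

    decode_header() splits the value into (bytes, charset) chunks and
    make_header() reassembles them into a Unicode string. A charset the
    local Python build does not know would raise an error here, which is
    the "conversion not supported" case.
    """
    unicode_value = str(make_header(decode_header(raw_value)))
    return unicode_value.encode("utf-8")


# An ISO-8859-1 encoded-word carrying "café".
sample = "=?iso-8859-1?q?caf=E9?="
utf8_bytes = upconvert_header(sample)
```

Which charsets succeed depends entirely on what the converting system has installed, which is the practical sense in which a server already knows what it has to convert to and from.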

So, while I agree with you that the text doesn't sound
sufficiently specific, I fear that the reality is that trying to
actually make it more specific will set us up for either
requiring support for conversions that are completely
unnecessary in context or for leaving out ones that are
important. Indeed, I suspect that the listing of specific 8859
standards will turn out to be a bad idea in the long term, but
that is something the WG decided on and, for an Experimental
document, I think it is reasonable to leave it there and examine
operational experience when that is relevant.

best,
john
Note: Messages sent to this list are the opinions of the senders and do not imply endorsement by the IETF.