[EAI] Gen-ART review of draft-ietf-eai-imap-utf8-07 - upconversion SHOULD
John,
I freely admit to not being a character set expert. The core
question from the review was:
> > Section 8 lists a number of 8859 character sets for which
> > upconversion of MIME headers MUST be supported, and then says
> > "Other widely deployed MIME charsets SHOULD be supported."
> > How does an implementer figure out which character sets those
> > would be?
My attempt to improve on this seems to have missed the mark:
> FWIW, "superset of ASCII" turns out to be a nightmare definition.
I won't disagree with that, despite the number of 8859 character
sets that are listed in the draft.
Rather, what would you suggest to address the original concern?
Specifically, you wrote:
> On the other hand, if the specification says "this needs to be
> Unicode, presented in UTF-8, on the wire" then servers pretty
> much know what they use (or have to deliver to mail-reading
> MUAs) locally and have to support and convert to and from,
> without our doing anything more than reminding them that, if
> they don't do those conversions, things won't work at all well.
Are you suggesting that the "SHOULD" for upconversion to UTF-8
ought to apply to every character set the server supports, so
that a client can just deal with UTF-8?
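
To make that reading concrete, here is a rough Python sketch of what
"upconvert whatever the server already supports" might look like for a
single MIME header. It is only an illustration under stated assumptions,
not what the draft specifies: SERVER_CHARSETS and upconvert_header are
hypothetical names, and the set contents are placeholders rather than
the list from Section 8.

```python
from email.header import decode_header

# Hypothetical: the charsets this particular server can convert locally.
# Placeholder values only; not the list from Section 8 of the draft.
SERVER_CHARSETS = {"us-ascii", "iso-8859-1", "iso-8859-15", "utf-8"}


def upconvert_header(raw, supported=SERVER_CHARSETS):
    """Return the header value as Unicode text, ready to send as UTF-8.

    Raises LookupError if an encoded-word names a charset that the
    server does not support, so the caller can leave the header as-is.
    """
    pieces = []
    for chunk, charset in decode_header(raw):
        if isinstance(chunk, str):
            # No encoded-words were present; the value is already text.
            pieces.append(chunk)
            continue
        name = (charset or "us-ascii").lower()
        if name not in supported:
            raise LookupError("no upconversion available for " + name)
        pieces.append(chunk.decode(name))
    return "".join(pieces)


if __name__ == "__main__":
    text = upconvert_header("=?iso-8859-1?q?caf=E9?=")
    print(text.encode("utf-8"))
```

Under this reading, the client never needs a list of "widely deployed"
charsets: the server either hands back UTF-8 or declines to convert.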
Thanks,
--David
> -----Original Message-----
> From: John C Klensin [mailto:klensin at jck.com]
> Sent: Tuesday, September 01, 2009 2:27 PM
> To: Black, David; presnick at qualcomm.com;
> chris.newman at sun.com; gen-art at ietf.org; harald at alvestrand.no
> Cc: ima at ietf.org
> Subject: Re: [EAI] Gen-ART review of draft-ietf-eai-imap-utf8-07
>
> David,
>
> A comment on one issue -- speaking for myself, not the WG...
>
> --On Monday, August 31, 2009 03:03 -0400 Black_David at emc.com
> wrote:
>
> > I have been selected as the General Area Review Team (Gen-ART)
> > reviewer for this draft (for background on Gen-ART, please see
> > http://www.alvestrand.no/ietf/gen/art/gen-art-FAQ.html).
> >...
>
> > Major issues:
> >...
> > Section 8 lists a number of 8859 character sets for which
> > upconversion of MIME headers MUST be supported, and then says
> > "Other widely deployed MIME charsets SHOULD be supported."
> > How does an implementer figure out which character sets those
> > would be? As an alternative, I suggest saying something along
> > the lines of: any server-supported character set that is a
> > superset of ASCII should be supported for upconversion. That
> > probably leads to fewer client surprises caused by UTF-8 not
> > working as expected.
>
> FWIW, "superset of ASCII" turns out to be a nightmare
> definition. There are languages out there that use Latin-based
> scripts but don't use all ASCII characters. While I'm not aware
> of any of them that use coded character sets that don't include
> all of the undecorated "Latin" characters that appear in ASCII,
> that combination is certainly plausible and probably does occur
> somewhere. More important, the issues with local character
> coding systems are far more severe with non-Latin-based writing
> systems than they are with Latin-based ones, partially because
> there are different models for coding (precomposed characters
> versus combining ones, format and style effectors that may
> actually change the characters versus greater numbers of
> precomposed forms, etc.), rendering conversion to or from
> Unicode a context-dependent translation activity rather than a
> character-to-character mapping one.
>
> On the other hand, if the specification says "this needs to be
> Unicode, presented in UTF-8, on the wire" then servers pretty
> much know what they use (or have to deliver to mail-reading
> MUAs) locally and have to support and convert to and from,
> without our doing anything more than reminding them that, if
> they don't do those conversions, things won't work at all well.
>
> So, while I agree with you that the text doesn't sound
> sufficiently specific, I fear that the reality is that trying to
> actually make it more specific will set us up for either
> requiring support for conversions that are completely
> unnecessary in context or for leaving out ones that are
> important. Indeed, I suspect that the listing of specific 8859
> standards will turn out to be a bad idea in the long term, but
> that is something the WG decided on and, for an Experimental
> document, I think it is reasonable to leave it there and examine
> operational experience when that is relevant.
>
> best,
> john