[EAI] Discussion of draft-ietf-eai-utf8headers-03/04
This document is now in a quite reasonable state, but a few niggles/bugs
...
2. Background and History
This specification describes a change to the email message format
that is related to the SMTP message transport change described in the
associated specifications [EAI-overview] and [EAI-SMTP-extension],
and that allows non-ASCII characters throughout email header fields.
"throughout" is not quite right. We allow UTF-8 in "most" header fields,
but not all of them.
Use of this SMTP extension helps prevent against the introduction of
^^^^^^^^^^^^^^^
prevents
such messages into message stores that might misrepresent or mangle
such messages. It should be noted that using an ESMTP extension does
not prevent against transferring email messages with UTF-8 header
^^^^^^^^^^^^^^^
prevents
Use of the word "against" is wrong in those contexts. Just omit it and it will
be fine.
fields to other systems that use the email format for messages and
that may not be upgraded, such as the POP and IMAP protocols. ...
^^^^^^^^^
s/protocols/servers/ (or systems)
3. Terminology
In this document, header fields are "UTF-8 headers" if the bodies of
those headers contain UTF-8 characters.
No, that is wrong; even ordinary ASCII characters are UTF-8 characters.
ITYM "if ... contain <utf8-xtra-char>s".
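The distinction can be made concrete with a small check. What follows is only a sketch in Python; the function name is_utf8_header is invented here for illustration, and it assumes that any code point above U+007F stands in for <utf8-xtra-char>.

```python
def is_utf8_header(body: str) -> bool:
    """Return True if the header body holds at least one non-ASCII character.

    Assumption: a character above U+007F corresponds to <utf8-xtra-char>.
    A body made only of ASCII is valid UTF-8 too, but does not qualify.
    """
    return any(ord(ch) > 0x7F for ch in body)


if __name__ == "__main__":
    print(is_utf8_header("Charles Lindsey <chl@example.org>"))
    print(is_utf8_header("J\u00fcrgen <j@example.org>"))
```

Under this reading, a pure-ASCII body is not a "UTF-8 header", which is the point of the objection above.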
5. Changes on Message Header Fields
SMTP client can send header fields in UTF-8 format, if the UTF8SMTP
^
s
extension advertised by SMTP server or as permitted by other
^ ^
is the
transport mechanisms.
to support new format. That following ABNF is defined to substitute
^^^^
The
those definition in RFC 2822.
For those syntax rules not referred in this section remains as the
^^^^^^^^^ ^ ^^^^^^^
Those to remain
original definition in RFC 2822.
5.2. Syntax extensions to RFC 2822
... However, it will also lead <msg-id> to
^^^^
would
allow UTF8 characters, which is not allowed due to the limitation
described in Section 5.4. ....
5.3. Change on addr-spec syntax
.... Thus, all header fields involving <mailbox>es
may be different from traditional ones. There might be UTF8SMTP
^
the
unaware MTAs in the mail routing path. In that case, MTA may bounce
^^^
MTAs
the message with reply code 550, or downgrade the non-ASCII contents
of all header bodies before continuing to send the message. The
downgrade process involve with a new ALT-ADDRESS parameter. ...
^^^^^^^^^^^^
involves
"DISPLAY_NAME" <ASCII at ASCII>
; traditional mailbox format
"DISPLAY_NAME" <non-ASCII at non-ASCII>
; UTF8SMTP but no ALT-ADDRESS parameter provided,
; message will bounce if UTF8SMTP extension is not supported
You also need an example such as
non-ASCII at non-ASCII
since we agreed that a pure <utf8-addr-spec> without "<...>" would still
be allowed, and your syntax shows it.
5.4. Trace field syntax
Internationalized domain names in Received fields must be transmitted
in punycode form when downgrading.
What did we decide about use of the word "punycode" (as opposed to "IDNA"
or somesuch)? I seem to remember that we discussed it.
........ "For" fields containing
internationalized addresses are allowed, since subsequent downgrading
...
^^^^^^^^^
<local-part>s
(since the domain part is to be IDNAed/punycoded)
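To illustrate what remains after the domain part has been converted, here is a rough sketch in Python. It assumes the built-in "idna" codec (which implements the original IDNA of RFC 3490) is an acceptable stand-in for whatever the draft finally requires, and the function name domain_to_ace is hypothetical. Only the domain is converted; the <local-part> is passed through untouched, which is why it is the <local-part> that still matters for downgrading.

```python
def domain_to_ace(address: str) -> str:
    """Convert only the domain of local@domain to its ASCII (xn--) form.

    Assumption: Python's "idna" codec (IDNA 2003) is good enough here.
    The local part is returned unchanged, even if it is non-ASCII.
    """
    local, _sep, domain = address.rpartition("@")
    ace = domain.encode("idna").decode("ascii")
    return f"{local}@{ace}"


if __name__ == "__main__":
    print(domain_to_ace("user@b\u00fccher.example"))
```

A non-ASCII <local-part> survives this step unchanged, so an address such as that in a "For" clause still needs separate treatment.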
6.2. MIME headers
The syntax of <value>, as defined in RFC 2045, is
value = token / quoted-string
To be able to use UTF-8 characters in MIME headers, <quoted-string>
syntax is extended as
qcontent = utf8-qtext / quoted-pair
No, that is now wrong (because you have changed the syntax of
<quoted-pair>). I think it should be:
qcontent = utf8-qtext / utf8-quoted-pair
But you can actually do better than that by saying:
value = token / utf8-quoted-string
and there is no need to mention <qcontent> at all.
In all those headers, such as Content-Type and Content-Disposition
^
fields
[plus lots of others being defined in various other documents], which
make use of <value> within <parameter> as defined in [RFC2045] as
modified by [RFC2231], it will now be allowed to use <quoted-string>s
containing UTF-8 characters (see the revised syntax of <utf8-qtext>
in Section 5.2 of this document).
Yes, that is all fine. All we need now is for the 'downgrade' document to
catch up with it. But I would also add, just to make it clear:
"Observe that such Content-Type and other header fields may be found
both amongst the top-level fields of a message and also within multiparts;
and also that a complete message conforming to this document may now
appear as a message/rfc822 (in both cases, subject to downgrade when that
is necessary)."
Also, now that we no longer have the Header-Type header, you need to say
in a NOTE somewhere (not necessarily here) that, in order to detect
whether a given message contains any UTF-8 headers, you have to do a
recursive descent of the MIME structure (since a UTF8 header field in some
inner multipart or message/rfc822 might be the only non-ASCII header field
present). Note that MTAs regularly perform such a descent when looking for
8BITMIME stuff that may need to be downgraded, so it is really no extra
work.
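The descent described above might look something like the following. This is a sketch only, using Python's standard email package; the function names are invented for illustration, and "non-ASCII" is again taken to mean any character above U+007F. It relies on Message.walk(), which visits the message itself and then every nested part, including the message inside a message/rfc822.

```python
import email
from email.message import Message


def has_non_ascii(text: str) -> bool:
    """True if any character lies outside the ASCII range."""
    return any(ord(ch) > 0x7F for ch in text)


def contains_utf8_headers(msg: Message) -> bool:
    """Look at the header fields of msg and of every nested MIME part.

    Only header fields are examined; non-ASCII body content is ignored.
    """
    for part in msg.walk():
        for _name, value in part.items():
            if has_non_ascii(str(value)):
                return True
    return False


if __name__ == "__main__":
    raw = (
        "From: a@example.org\n"
        "Subject: plain\n"
        "MIME-Version: 1.0\n"
        'Content-Type: multipart/mixed; boundary="b"\n'
        "\n"
        "--b\n"
        "Content-Type: text/plain\n"
        'Content-Disposition: attachment; filename="r\u00e9sum\u00e9.txt"\n'
        "\n"
        "body\n"
        "--b--\n"
    )
    print(contains_utf8_headers(email.message_from_string(raw)))
```

In the usage example the top-level fields are all ASCII, and the only non-ASCII header field sits in the inner part, which is exactly the case a check of the top-level fields alone would miss.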
8. IANA considerations
There is no IANA considerations in this document.
^^
are
--
Charles H. Lindsey ---------At Home, doing my own thing------------------------
Tel: +44 161 436 6131 ;
Web: http://www.cs.man.ac.uk/~chl
Email: chl at clerew.man.ac.uk Snail: 5 Clerewood Ave, CHEADLE, SK8 3JU, U.K.
PGP: 2C15F1A9 Fingerprint: 73 6D C2 51 93 A0 01 E7 65 E8 64 7E 14 A4 AB A5
_______________________________________________
IMA mailing list
IMA at ietf.org
https://www1.ietf.org/mailman/listinfo/ima