[EAI] Comments on draft-ietf-eai-downgrade-04
Sorry, I should have got around to commenting on this much earlier, but it
only just reached the top of my 'todo' list :-( .
3. New header fields definition
fields =/ downgraded
downgraded = "Downgraded:" [FWS] field-name ":" unstructured CRLF
Encapsulating a header in a Downgraded: header is defined as:
[Insert "field" after each "header" in that sentence.]
1. Generate new Downgraded: header whose former value is the
original header field name and latter value is the original
header field value.
2. Encode the generated header by [RFC2047] section 5(1) method with
charset='UTF-8'.
3. Replace the original header field as the generated header field.
That Step 3 does not always apply; for example, when the original was a
From:/To:/etc. field, it then remains (in altered form) in addition to the
new Downgraded: field.
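A minimal Python sketch of steps 1 and 2 above, assuming that only the original field value needs RFC 2047 encoding (the field name is already ASCII). The function name `encapsulate` and the sample address are illustrative and appear nowhere in the draft.

```python
from email.header import Header, decode_header, make_header


def encapsulate(field_name: str, field_value: str) -> str:
    """Build a Downgraded: header field that carries the original field."""
    # Only the value is RFC 2047-encoded; the original field name stays as-is.
    encoded_value = Header(field_value, charset="utf-8").encode()
    return f"Downgraded: {field_name}: {encoded_value}"


original_value = "J\u00fcrgen <j\u00fcrgen@example.com>"
downgraded = encapsulate("From", original_value)

# Decoding the encoded part again recovers the original value.
prefix = "Downgraded: From: "
restored = str(make_header(decode_header(downgraded[len(prefix):])))
```

Whether the original field is then removed (step 3) or kept in altered form depends on the field, as noted above.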
fields =/ edowngraded
edowngraded = "Envelope-Downgraded:" [FWS] edowngraded-field ":"
[FWS] "<" uPath ">" [FWS]
"<" Mailbox ">" [FWS] CRLF
edowngraded-field = "From" / "To"
Why not
edowngraded-field = "Mail From" / "Rcpt To"
to avoid any possible confusion with the From: and To: headers?
Original non-ASCII address <uPath> is defined in
[I-D.ietf-eai-smtpext]. <Mailbox> is defined in [RFC2821], section
4.1.2. The "Envelope-Downgraded:" header field is encoded by
[RFC2047] in the downgraded message.
I think it is wrong to say (as you do frequently) that
'The "Some-Header:" is encoded by [RFC2047]'
RFC2047 encodes carefully regulated _parts_ of header fields (such as
<unstructured>s, <comments>s and <phrase>s), never the whole header field.
In this case it is the <uPath> (treated as if it were <unstructured>) that
gets encoded (or maybe even only a part of it - 2047 allows you to split it
up in many ways). Your later examples indeed indicate that is what you
intend to happen.
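To make the distinction concrete, here is a hedged Python sketch that encodes only the <uPath> and leaves the rest of the Envelope-Downgraded: field untouched. The helper name `envelope_downgraded` and the addresses are assumptions made for illustration; the layout follows the ABNF quoted above.

```python
from email.charset import Charset

_UTF8 = Charset("utf-8")


def envelope_downgraded(kind: str, upath: str, ascii_mailbox: str) -> str:
    """Build an Envelope-Downgraded: field, encoding only the uPath part."""
    if kind not in ("From", "To"):
        raise ValueError("edowngraded-field must be 'From' or 'To'")
    # The uPath is treated as if it were <unstructured> and becomes a single
    # RFC 2047 encoded-word; the field name, the angle brackets and the
    # ASCII <Mailbox> are left alone.
    encoded_upath = _UTF8.header_encode(upath)
    return f"Envelope-Downgraded: {kind}: <{encoded_upath}> <{ascii_mailbox}>"


field = envelope_downgraded("From", "j\u00fcrgen@example.com", "jurgen@example.com")
```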
4. SMTP Downgrading
MTA replaces non-ASCII mail address with specified alternative US-
ASCII address when downgrading. Before replacing, decode the ALT-
ADDRESS parameter value because it is encoded as xtext [RFC3461].
Eh? That last sentence only applies to an ORCPT parameter, I think.
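For reference, xtext as defined in [RFC3461] writes an escaped octet as "+" followed by two upper-case hexadecimal digits. A small decoding sketch in Python follows; the function name `xtext_decode` is made up here, and treating the decoded octets as UTF-8 is an assumption.

```python
import re

# An xtext hexchar is "+" followed by exactly two upper-case hex digits.
_HEXCHAR = re.compile(rb"\+([0-9A-F]{2})")


def xtext_decode(value: str) -> str:
    """Decode an xtext-encoded parameter value (RFC 3461)."""
    raw = value.encode("ascii")
    octets = _HEXCHAR.sub(lambda m: bytes([int(m.group(1), 16)]), raw)
    # Assumption: the decoded octets are UTF-8 text.
    return octets.decode("utf-8")


decoded = xtext_decode("user+2Bname+3Dx@example.com")
```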
Also MTA preserves original information using "Envelope-Downgraded"
header defined in Section 3 with From or To field name. The non-
ASCII mail addresses are encoded by [RFC2047] and put into "Envelope-
Downgraded" header.
Yes, that wording is correct, as opposed to what you said about encoding
the whole header earlier.
5. Email header fields downgrading
I found the whole of this section very confusing. You have done an
exhaustive analysis of lots of particular cases, resulting in much
unnecessary repetition, instead of setting out the _principles_ on which
the whole thing was based.
For example, it is always correct to encode a <comment> whatever header it
occurs in and whether or not that header contains further stuff (e.g.
addresses) that have to be downgraded specially. So you might as well state
that once up-front (even in a conceptual first pass over all the headers),
rather than mention it as an extra thing to do when performing other more
particular downgrade operations.
o Downgrading Address header fields
From:
Sender:
Reply-To:
To:
Cc:
Bcc:
Resent-From:
Resent-Sender:
Resent-To:
Resent-Cc:
But that is only the present list. Surely the process you describe MAY also
be used for any header defined in the future which contains an <angle-addr>
etc. The upgrade/display mechanisms you describe in A1 and A2 would work
just fine on such headers.
The header field value is composed of single or multiple <angle-
addr>/<utf8-addr-spec> fields defined in
[I-D.ietf-eai-utf8headers].
If the header has no <angle-addr> or <utf8-addr-spec> which
contains non-ASCII characters, only "display-name" part or
comments contain non-ASCII characters, the "display-name" or
comments are encoded by [RFC2047] with charset='UTF-8'.
Otherwise, preserve the header field in "Downgraded:" header,
generate US-ASCII only address header, and replace the original
header field with the generated US-ASCII only header field. New
header generation method are shown in below.
Extract every field and downgrade each <mailbox>/<angle-addr>/
<utf8-addr-spec>.
Remove <mailbox> there. The other two are just particular cases of
<mailbox>.
If the non-ASCII address is in <utf8-addr-spec> form, then rewrite
[After "form" insert: (i.e. it contains no <alt-address>)]
it as "Internationalized Address utf8-addr-spec-encoded
Removed:;". "utf8-addr-spec" is encoded to "utf8-addr-spec-
encoded" by [RFC2047].
That is exceedingly confusing until you have worked out what it means. What
you really mean to say is:
rewrite it as a <group> [RFC2822] of the form
Internationalized Address "<encoded-word>" Removed: ;
where the <encoded-word> is the original <utf8-addr-spec> encoded
according to [RFC2047] (and needs to be within a <quoted-string>).
That <quoted-string> is essential because, for sure, the <utf8-addr-spec>
will have at least an '@' somewhere inside it.
And yes, I liked that idea once I had worked out what you were doing.
The field may contain multiple <comment> fields. The <comment>
fields are encoded by [RFC2047] with charset='UTF-8', if
necessary.
Which is an example of the unnecessary repetition I mentioned earlier.
<mailbox> is defined as "display-name <angle-addr>" in
[I-D.ietf-eai-utf8headers].
Well no it isn't. What you really meant to say was
If the non-ASCII address is in <[display-name] angle-addr> form,
then ...
The "display-name" field, if present, is encoded
UTF8SMTP <angle-addr> defined in [I-D.ietf-eai-utf8headers]
consists of 3 forms. Downgrading method is defined for each form.
* <non-ASCII>
Non-ASCII mail address without sender-specified US-ASCII
address is replaced as
"Internationalized Address non-ASCII-encoded Removed:;".
non-ASCII address is encoded to "non-ASCII-encoded" by
[RFC2047].
No, that is not right, because if there is both a <display-name> and a
<non-ASCII> you would then get:
<encoded-display-name> Internationalized Address "<encoded-word>"
Removed: ;
which might not be a syntactically valid <group> according to RFC2822. Well,
I am not quite sure about that, but in any case it would look better if you
could arrange it as:
Internationalized Address <encoded-display-name> "<encoded-word>"
Removed: ;
* <non-ASCII <US-ASCII>>
Non-ASCII mail address with sender-specified US-ASCII address
MUST be replaced as "display-name <US-ASCII>".
And there you mean 'MUST be replaced by "<encoded-word> <US-ASCII>" where
the <encoded-word> is a <display-name> obtained by encoding the non-ASCII
according to [RFC2047]'. And you probably need a <quoted-string> in there
as well, as before.
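The two rewritings discussed above can be sketched together in Python. This is only an illustration of the forms suggested in these comments, not text from the draft: the function name `downgrade_angle_addr` is invented, an ASCII display name is assumed to need no quoting, and what happens to an original display name when an alternative ASCII address exists is left open because neither the draft nor these comments settle it.

```python
from email.charset import Charset
from typing import Optional

_UTF8 = Charset("utf-8")


def _encode_if_needed(text: str) -> str:
    """Return text unchanged if ASCII, else as one RFC 2047 encoded-word."""
    return text if text.isascii() else _UTF8.header_encode(text)


def downgrade_angle_addr(non_ascii: str,
                         ascii_alt: Optional[str] = None,
                         display_name: Optional[str] = None) -> str:
    """Rewrite one non-ASCII address in the forms suggested above."""
    encoded_address = _UTF8.header_encode(non_ascii)
    if ascii_alt is not None:
        # <non-ASCII <US-ASCII>> form: the encoded original address becomes
        # a quoted display name in front of the ASCII address.
        # (display_name is ignored here; that case is not specified.)
        return f'"{encoded_address}" <{ascii_alt}>'
    # <non-ASCII> form: an empty <group> whose name carries the original.
    parts = ["Internationalized Address"]
    if display_name:
        parts.append(_encode_if_needed(display_name))
    parts.append(f'"{encoded_address}"')
    parts.append("Removed: ;")
    return " ".join(parts)


group = downgrade_angle_addr("j\u00fcrgen@example.com")
named_group = downgrade_angle_addr("j\u00fcrgen@example.com", display_name="Jurgen")
with_alt = downgrade_angle_addr("j\u00fcrgen@example.com", ascii_alt="jurgen@example.com")
```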
o Downgrading Non-ASCII in comments
Date:
Message-ID:
In-Reply-To:
References:
Resent-Date:
Resent-Message-ID:
MIME-Version:
Content-ID:
Actually, it is any structured header where CFWS is allowed (or its
equivalent in [RFC822]) and that includes structured headers that might be
defined in the future. Which is why I would much prefer you to cover all
such cases by generic wording right at the start.
o Trace header
(Make that "header fields".)
Received:
If the FOR clause contains non-ASCII addresses, remove the FOR
clause in the header. The other part does not contain non-ASCII
values.
[Suggest: "If any FOR clause contains non-ASCII addresses, remove that FOR
clause".]
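A rough Python sketch of that rule, assuming each FOR clause has the simple shape `for <address>` with the address in angle brackets (the real Received: syntax allows more variations). The function name `strip_non_ascii_for` and the sample values are illustrative.

```python
import re

# Assumption: a FOR clause looks like "for <address>".
_FOR_CLAUSE = re.compile(r"\s+for\s+<[^>]*>", re.IGNORECASE)


def strip_non_ascii_for(received_value: str) -> str:
    """Remove any FOR clause that contains non-ASCII characters."""
    def drop_if_non_ascii(match: "re.Match[str]") -> str:
        clause = match.group(0)
        # ASCII-only clauses are left exactly as they were.
        return clause if clause.isascii() else ""
    return _FOR_CLAUSE.sub(drop_if_non_ascii, received_value)


stamp = "; Mon, 1 Jan 2007 00:00:00 +0000"
utf8_trace = "from a.example by b.example for <j\u00fcrgen@example.com>" + stamp
ascii_trace = "from a.example by b.example for <jurgen@example.com>" + stamp
cleaned = strip_non_ascii_for(utf8_trace)
```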
o MIME Content header
(Make that "header fields".)
Content-Type:
Content-Disposition:
But again, this applies to ANY header that contains <parameter>s as defined
in RFC2045. For example the Auto-Submitted header in [RFC3834] and the
Injection-Info header in draft-ietf-usefor-usefor-11 (already approved as a
proposed standard and now in the rfc-editor's queue). Both of those
documents mention [RFC2231] explicitly, so downgrading those headers would
indeed yield something already valid on the current network.
Encode the header by [RFC2231] with charset='UTF-8'.
Again, RFC2231 does not encode headers; it encodes <parameter>s, which may
be found in headers.
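As an illustration of encoding a single <parameter> rather than the whole header, here is a short Python sketch using the standard library's `email.utils.encode_rfc2231`. The wrapper name `downgrade_parameter` is invented, and the ASCII branch assumes the value needs no further quoting or escaping.

```python
from email.utils import encode_rfc2231


def downgrade_parameter(name: str, value: str) -> str:
    """Return one header parameter, RFC 2231-encoded only when needed."""
    if value.isascii():
        # Simplification: assumes the value contains no '"' or '\'.
        return f'{name}="{value}"'
    # The "*" after the name marks an extended (charset'language'value) form.
    return f"{name}*={encode_rfc2231(value, charset='utf-8')}"


encoded_param = downgrade_parameter("filename", "na\u00efve.txt")
plain_param = downgrade_parameter("filename", "plain.txt")
```

The rest of the header field (for example `Content-Disposition: attachment;`) is carried over unchanged.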
o Unstructured text headers and structured text headers
Subject:
Comments:
Keywords:
Content-Description:
Encode the header by [RFC2047] with charset='UTF-8'.
And there it is the <unstructured> in the header that gets encoded (or a
<phrase> in the case of Keywords).
o URI headers
(Make that "header fields".)
o Other target headers
All other headers which contains non-ASCII characters are
preserved in Downgraded: header and removed.
Again, what you really mean is:
All other header fields which contain non-ASCII characters outside
of <unstructured>s, <comment>s and <phrase>s, and for which no rule
is given above, are preserved in a Downgraded: header field and then
removed.
o ASCII only headers
(Make that "header fields".)
6. MIME body part headers downgrading
(Make that "header fields".)
Content-ID: .......
Content-Type: ......
Content-Disposition: .........
Content-Description: .........
But again, the downgrading of these is exactly the same as downgrading the
same headers at the top level, which you have already explained. Moreover,
there might be further such headers allowed in future which would come to
no harm if downgraded using 2047 or 2231. I agree that creating a
Downgraded: header in a MIME body part header might not be such a good idea
(though I doubt it would break anything in practice) so you might forbid
that. There is no current situation where that might be needed, though.
Here, however, you need to specify how to downgrade a message/utf8smtp,
since there is a promise in section 4.6 of draft-ietf-eai-utf8headers-06.txt
that such a downgrading (to message/rfc822) would be defined here.
Presumably this would involve the usual recursive descent through the
message/utf8smtp, downgrading individual headers as described above, and
applying Content-Transfer-Encodings to bodies, as needed.
And, having done that, you could also say that downgraders MAY do the same
thing to message/rfc822. Yes, I know we have forbidden these to contain any
utf8smtp headers, but such "leaks" are surely going to happen, so being
"liberal" here might undo some such blunders.
7. Security considerations
o It is likely that the techniques suggested here will invalidate
methods that depend on signatures over headers or the envelope.
"Issues" does talk about that, but, because this document strongly
implies that one can downgrade and then upgrade again with no risk
of loss of information, the topic should be explored further.
But this document "implies" no such thing. Moreover, RFC2047 encoding will
surely change the folding, and simply undoing every RFC2047 will not solve
the problem because such encoding might have already been present in the
original version, as signed before downgrading. The only way out of this
dilemma is some pretty aggressive canonicalization in the signature
algorithms. The recent DKIM standard goes some way towards such
canonicalization, but it is nowhere near aggressive enough.
Agreed it needs further exploration, so all we can do here is to identify
the pitfalls.
Appendix A. Displaying downgraded message
I presume this appendix is intended to replace the former discussion of
"upgrading". I have no problem with that.
A.1. Displaying technique 1
MUA can remove 'Downgraded:' from decoded 'Downgraded:' header
fields. With this technique, The address header fields may be
displayed twice, one is ASCII-only downgraded header field and the
other is from decoded Downgraded: header.
Ugh! I think I prefer the next method for normal use.
A.2. Displaying technique 2
+ Remove the header field which is the same with the generated
ASCII only header from the header fields. If the headers
contain [RFC2047] encoded part, decode it before comparison.
But that last sentence will not always work because, sometimes, the
[RFC2047] encoded part might have been present before downgrading. And, in
any case, you have to unfold before doing the comparison.
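A sketch in Python of the comparison with unfolding done first; the names `unfold`, `normalize` and `same_field` are illustrative. Note that it decodes every encoded-word it finds, so it cannot tell encoding added by the downgrade from encoding that was already present in the original, which is exactly the weakness described above.

```python
import re
from email.header import decode_header, make_header


def unfold(value: str) -> str:
    """Remove header folding: a line break followed by a space or tab."""
    return re.sub(r"\r?\n(?=[ \t])", "", value)


def normalize(value: str) -> str:
    """Unfold, then decode any RFC 2047 encoded-words."""
    return str(make_header(decode_header(unfold(value)))).strip()


def same_field(left: str, right: str) -> bool:
    """Compare two header field values after unfolding and decoding."""
    return normalize(left) == normalize(right)


folded = "=?utf-8?q?J=C3=BCrgen?=\r\n <jurgen@example.com>"
unfolded = "=?utf-8?q?J=C3=BCrgen?= <jurgen@example.com>"
match_after_unfold = same_field(folded, unfolded)
match_across_encodings = same_field("=?utf-8?q?caf=C3=A9?=", "=?utf-8?b?Y2Fmw6k=?=")
```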
Appendix B. Examples
B.1. Downgrading example 1
Result of the header downgrading.
Return-Path: <ASCII-FROM>
It is customary to insert the Return-Path: at the _end_ of the headers.
Envelope-Downgraded: From: <RFC2047(NON-ASCII-FROM)> <ASCII-FROM>
Envelope-Downgraded: To: <RFC2047(NON-ASCII-TO)> <ASCII-TO>
Message-Id: MESSAGE_ID
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 8bit
Subject: RFC2047(UTF-8_SUBJECT)
Downgraded: From: RFC2047(<NON-ASCII-FROM <ASCII-FROM>>)
From: <ASCII-FROM>
Downgraded: To: RFC2047(<NON-ASCII-TO <ASCII-TO>>)
To: <ASCII-TO>
Downgraded: CC: RFC2047(<NON-ASCII-CC>)
CC: Internationalized address RFC2047(NON-ASCII-CC) removed:;
It would be nice to show an example with a <display-name> in it there.
Date: DATE
MAIL_BODY
Figure 5: Header downgraded message
Figure 13: Header downgraded message 2
The previous draft had a useful paragraph here about a MIME encapsulated
subject header. Why has it been removed?