[EAI] Comments on draft-ietf-eai-downgrade-04




Sorry, I should have got around to commenting on this much earlier, but it only just reached the top of my 'todo' list :-( .

3.  New header fields definition
   fields     =/ downgraded
   downgraded =  "Downgraded:" [FWS] field-name ":" unstructured CRLF

   Encapsulating a header in a Downgraded: header is defined as:
                         ^                       ^
                       field                   field
   1.  Generate new Downgraded: header whose former value is the
       original header field name and latter value is the original
       header fleid value.
   2.  Encode the generated header by [RFC2047] section 5(1) method with
       charset='UTF-8'.
   3.  Replace the original header field as the generated header field.

That Step 3 does not always apply; for example when the original was a
From:/To:/etc field, which then remains (in altered form) in addition to the
new Downgraded: field.
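
To illustrate the distinction, here is a rough sketch using Python's standard email.header module. The function name and the ascii_replacement argument are my own inventions for illustration, not anything the draft defines, and the sample values are made up.

```python
from email.header import Header

def downgrade_field(name, value, ascii_replacement=None):
    """Return the header fields that take the place of `name: value`.

    `ascii_replacement` is a hypothetical argument: the rewritten
    US-ASCII value that remains for address fields such as From: or To:.
    """
    # Steps 1 and 2: wrap name and value, RFC 2047-encoding the value part.
    encoded = Header(value, charset="utf-8").encode()
    fields = ["Downgraded: %s: %s" % (name, encoded)]
    # Step 3 as qualified above: an address field stays, in altered form.
    if ascii_replacement is not None:
        fields.append("%s: %s" % (name, ascii_replacement))
    return fields

# Made-up values for illustration.
print(downgrade_field("X-Note", "gr\u00fc\u00dfe"))
print(downgrade_field("From",
                      "<jos\u00e9@example.com <jose@example.com>>",
                      "<jose@example.com>"))
```

The first call yields a single Downgraded: field; the second yields two fields, which is the case Step 3 does not describe.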



   fields      =/ edowngraded
   edowngraded = "Envelope-Downgraded:" [FWS] edowngraded-field ":"
                                        [FWS] "<" uPath ">" [FWS]
                                        "<" Mailbox ">" [FWS] CRLF
   edowngraded-field =  "From" / "To"

Why not

edowngraded-field = "Mail From" / "Rcpt To"

to avoid any possible confusion with the From: and To: headers?

  Original non-ASCII address <uPath> is defined in
   [I-D.ietf-eai-smtpext]. <Mailbox> is defined in [RFC2821], section
   4.1.2.  The "Envelope-Downgraded:" header field is encoded by
   [RFC2047] in the downgraded message.

I think it is wrong to say (as you do frequently) that

    'The "Some-Header:" is encoded by [RFC2047]'

RFC2047 encodes carefully regulated _parts_ of header fields (such as
<unstructured>s, <comment>s and <phrase>s), never the whole header field.
In this case it is the <uPath> (treated as if it were <unstructured>) that
gets encoded (or maybe even only a part of it - 2047 allows you to split it
up in many ways). Your later examples indeed indicate that is what you
intend to happen.
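
A minimal sketch of the difference, assuming Python's standard email.header module as the RFC 2047 encoder. The helper name and both addresses are hypothetical. Only the <uPath> goes through the encoder; the field name, the "From:" keyword and the ASCII <Mailbox> are left alone.

```python
from email.header import Header

def encode_part(text):
    """Return `text` as RFC 2047 encoded-word(s) with charset UTF-8."""
    return Header(text, charset="utf-8").encode()

# Hypothetical envelope addresses, for illustration only.
upath = "jos\u00e9@example.com"
ascii_mailbox = "jose@example.com"

# Only the <uPath> part is encoded, never the whole header field.
field = "Envelope-Downgraded: From: <%s> <%s>" % (
    encode_part(upath), ascii_mailbox)
print(field)
```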



4.  SMTP Downgrading
  MTA replaces non-ASCII mail address with specified alternative US-
   ASCII address when downgrading.  Before replacing, decode the ALT-
   ADDRESS parameter value because it is encoded as xtest [RFC3461].

Eh? That last sentence only applies to an ORCPT parameter, I think.
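
For reference, xtext decoding as [RFC3461] defines it is simple, whichever parameter it turns out to apply to. A minimal sketch (the function name is mine):

```python
import re

# An RFC 3461 hexchar is "+" followed by two uppercase hex digits.
_HEXCHAR = re.compile(r"\+([0-9A-F]{2})")

def xtext_decode(value):
    """Decode xtext: each "+XX" hexchar becomes the character 0xXX."""
    return _HEXCHAR.sub(lambda m: chr(int(m.group(1), 16)), value)

# "+2B" is an encoded "+", "+3D" an encoded "=".
print(xtext_decode("jose+2Bmail@example.com"))
```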

   Also MTA preserves original information using "Envelope-Downgraded"
   header defined in Section 3 with From or To field name.  The non-
   ASCII mail addresses are encoded by [RFC2047] and put into "Envelope-
   Downgraded" header.

Yes, that wording is correct, as opposed to what you said about encoding the
whole header earlier.


5. Email header fields downgrading

I found the whole of this section very confusing. You have done an
exhaustive analysis of lots of particular cases, resulting in much
unnecessary repetition, instead of setting out the _principles_ on which the
whole thing was based.


For example, it is always correct to encode a <comment> whatever header it
occurs in and whether or not that header contains further stuff (e.g.
addresses) that has to be downgraded specially. So you might as well state
that once up-front (even in a conceptual first pass over all the headers),
rather than mention it as an extra thing to do when performing other more
particular downgrade operations.


   o  Downgrading Address header fields
     From:
      Sender:
      Reply-To:
      To:
      Cc:
      Bcc:
      Resent-From:
      Resent-Sender:
      Resent-To:
      Resent-Cc:

But that is only the present list. Surely the process you describe MAY also be used for any header defined in the future which contains an <angle-addr> etc. The upgrade/display mechanisms you describe in A1 and A2 would work just fine on such headers.

      The header field value is composed of single or multiple <angle-
      addr>/<utf8-addr-spec> fields defined in
      [I-D.ietf-eai-utf8headers].
      If the header has no <angle-addr> or <utf8-addr-spec> which
      contains non-ASCII characters, only "display-name" part or
      comments contain non-ASCII characters, the "display-name" or
      comments are encoded by [RFC2047] with charset='UTF-8'.
      Otherwise, preserve the header field in "Downgraded:" header,
      generate US-ASCII only address header, and replace the original
      header field with the generated US-ASCII only header field.  New
      header generation method are shown in below.
     Extract every field and downgrade each <mailbox>/<angle-addr>/
      <utf8-addr-spec>.

Remove <mailbox> there. The other two are just particular cases of <mailbox>.

      If the non-ASCII address is in <utf8-addr-spec> form, then rewrite
                                                          ^
                                     (i.e. it contains no <alt-address>)
      it as "Internationalized Address utf8-addr-spec-encoded
      Removed:;". "utf8-addr-spec" is encoded to "utf8-addr-spec-
      encoded" by [RFC2047].

That is exceedingly confusing until you have worked out what it means. What you really mean to say is:

        rewrite it as a <group> [RFC2822] of the form

              Internationalized Address "<encoded-word>" Removed: ;

         where the <encoded-word> is the original <utf8-addr-spec> encoded
         according to [RFC2047] (and needs to be within a <quoted-string>).

That <quoted-string> is essential because, for sure, the <utf8-addr-spec>
will have at least an '@' somewhere inside it.

And yes, I liked that idea once I had worked out what you were doing.

      The field may contain multiple <comment> fields.  The <comment>
      fields are encoded by [RFC2047] with charset='UTF-8', if
      necessary.

Which is an example of the unnecessary repetition I mentioned earlier.

      <mailbox> is defined as "display-name <angle-addr>" in
      [I-D.ietf-eai-utf8headers].

Well no it isn't. What you really meant to say was

        If the non-ASCII address is in <[display-name] angle-addr> form,
        then ...

      The "display-name" field, if present, is encoded


      UTF8SMTP <angle-addr> defined in [I-D.ietf-eai-utf8headers]
      consists of 3 forms.  Downgrading method is defined for each form.
     *  <non-ASCII>
         Non-ASCII mail address without sender-specified US-ASCII
         address is replaced as
         "Internationalized Address non-ASCII-encoded Removed:;".
         non-ASCII address is encoded to "non-ASCII-encoded" by
         [RFC2047].

No, that is not right, because if there is both a <display-name> and a <non-ASCII> you would then get:

<encoded-display-name> Internationalized Address "<encoded-word>" Removed: ;

which might not be a syntactically valid <group> according to RFC2822. Well
I am not quite sure about that, but in any case it would look better if you
could arrange it as:

Internationalized Address <encoded-display-name> "<encoded-word>" Removed: ;

      *  <non-ASCII <US-ASCII>>
         Non-ASCII mail address with sender-specified US-ASCII address
         MUST be replaced as "display-name <US-ASCII>".

And there you mean 'MUST be replaced by "<encoded-word> <US-ASCII>" where
the <encoded-word> is a <display-name> obtained by encoding the non-ASCII
according to [RFC2047]'. And you probably need a <quoted-string> in there as
well, as before.


   o  Downgrading Non-ASCII in comments
     Date:
      Message-ID:
      In-Reply-To:
      References:
      Resent-Date:
      Resent-Message-ID:
      MIME-Version:
      Content-ID:

Actually, it is any structured header where CFWS is allowed (or its equivalent in [RFC822]) and that includes structured headers that might be defined in the future. Which is why I would much prefer you to cover all such cases by generic wording right at the start.



   o  Trace header
                  ^
                fields
      Received:
      If the FOR clause contains non-ASCII addresses, remove the FOR
         ^^^                                                 ^^^
         any                                                 that
      clause in the header.  The other part does not contain non-ASCII
      values.

   o  MIME Content header
                         ^
                       fields

     Content-Type:
      Content-Disposition:

But again, this applies to ANY header that contains <parameter>s as defined in RFC2045. For example, the Auto-Submitted header in [RFC3834] and the Injection-Info header in draft-ietf-usefor-usefor-11 (already approved as a proposed standard and now in the rfc-editor's queue). Both of those documents mention [RFC2231] explicitly, so downgrading those headers would indeed yield something already valid on the current network.

      Encode the header by [RFC2231] with charset='UTF-8'.

Again, RFC2231 does not encode headers; it encodes <parameter>s, which may be
found in headers.
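
A small sketch of what that means in practice, assuming Python's email.utils.encode_rfc2231 as the encoder. The helper name and the sample filename are hypothetical, and the ASCII branch deliberately ignores values that would need quoting or escaping.

```python
from email.utils import encode_rfc2231

def downgrade_parameter(name, value):
    """Return one <parameter>, in RFC 2231 extended form if non-ASCII."""
    if value.isascii():
        # Simplification: assumes no characters that need escaping.
        return '%s="%s"' % (name, value)
    return "%s*=%s" % (name, encode_rfc2231(value, charset="utf-8"))

# Only the parameter changes; the rest of the header field is untouched.
field = "Content-Disposition: attachment; " + downgrade_parameter(
    "filename", "r\u00e9sum\u00e9.txt")
print(field)
```

The same helper would apply unchanged to a <parameter> in any other header field, which is the point of stating the rule generically.
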
  o  Unstructured text headers and structured text headers
     Subject:
      Comments:
      Keywords:
      Content-Description:
     Encode the header by [RFC2047] with charset='UTF-8'.

And there it is the <unstructured> in the header that gets encoded (or a <phrase> in the case of Keywords).


   o  URI headers
                ^
                fields


   o  Other target headers
      All other headers which contains non-ASCII characters are
      preserved in Downgraded: header and removed.

Again, what you really mean is:

        All other header fields which contain non-ASCII characters outside
        of <unstructured>s, <comment>s and <phrase>s, and for which no rule
        is given above, are preserved in a Downgraded: header field and then
        removed.

   o  ASCII only headers
                       ^
                       fields


6.  MIME body part headers downgrading
                         ^
                         fields


   Content-ID: .......
  Content-Type: ......
   Content-Disposition: .........
  Content-Description: .........

But again, the downgrading of these is exactly the same as downgrading the same headers at the top level, which you have already explained. Moreover, there might be further such headers allowed in future which would come to no harm if downgraded using 2047 or 2231. I agree that creating a Downgraded: header in a MIME body part header might not be such a good idea (though I doubt it would break anything in practice) so you might forbid that. There is no current situation where that might be needed, though.

Here, however, you need to specify how to downgrade a message/utf8smtp,
since there is a promise in section 4.6 of draft-ietf-eai-utf8headers-06.txt
that such a downgrading (to message/rfc822) would be defined here.


Presumably this would involve the usual recursive descent through the
message/utf8smtp, downgrading individual headers as described above, and
applying Content-Transfer-Encodings to bodies, as needed.

And, having done that, you could also say that downgraders MAY do the same
thing to message/rfc822. Yes, I know we have forbidden these to contain any
utf8smtp headers, but such "leaks" are surely going to happen, so being
"liberal" here might undo some such blunders.


7.  Security considerations
  o  It is likely that the techniques suggested here will invalidate
      methods that depend on signatures over headers or the envelope.
      "Issues" does talk about that, but, because this document strongly
      implies that one can downgrade and then upgrade again with no risk
      of loss of information, the topic should be explored further.

But this document "implies" no such thing. Moreover, RFC2047 encoding will surely change the folding, and simply undoing every RFC2047 encoding will not solve the problem because such encoding might have already been present in the original version, as signed before downgrading. The only way out of this dilemma is some pretty aggressive canonicalization in the signature algorithms. The recent DKIM standard goes some way towards such canonicalization, but it is nowhere near aggressive enough.
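
To make the pitfall concrete, here is a small sketch in which Python's standard email.header module stands in for both a downgrader and a naive upgrader. Both function names and both sample values are made up for illustration.

```python
from email.header import Header, decode_header, make_header

def downgrade(value):
    """RFC 2047-encode a value only if it contains non-ASCII."""
    if value.isascii():
        return value
    return Header(value, charset="utf-8").encode()

def naive_upgrade(value):
    """Undo every RFC 2047 encoded-word, as a naive upgrader might."""
    return str(make_header(decode_header(value)))

# Two hypothetical originals, each signed as it stood before downgrading.
signed_raw = "caf\u00e9"                  # raw UTF-8 in a UTF8SMTP message
signed_encoded = "=?UTF-8?Q?caf=C3=A9?="  # already encoded by the author

# The first survives the round trip; the second comes back as raw text
# and no longer matches what was signed.
print(naive_upgrade(downgrade(signed_raw)) == signed_raw)
print(naive_upgrade(downgrade(signed_encoded)) == signed_encoded)
```

Both originals come out of the round trip as the same text, so an upgrader cannot tell which of them was the signed form.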

Agreed it needs further exploration, so all we can do here is to identify
the pitfalls.


Appendix A.  Displaying downgraded message

I presume this appendix is intended to replace the former discussion of
"upgrading". I have no problem with that.

A.1.  Displaying technique 1
  MUA can remove 'Downgraded:' from decoded 'Downgraded:' header
   fields.  With this technique, The address header fields may be
   displayed twice, one is ASCII-only downgraded header field and the
   other is from decoded Downgraded: header.

Ugh! I think I prefer the next method for normal use.

A.2.  Displaying technique 2
        +  Remove the header field which is the same with the generated
            ASCII only header from the header fields.  If the headers
            contain [RFC2047] encoded part, decode it before comparison.

But that last sentence will not always work because, sometimes, the [RFC2047] encoded part might have been present before downgrading. And, in any case, you have to unfold before doing the comparison.
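
A rough sketch of the normalisation such a comparison needs, again assuming Python's email.header module. The function names are mine, and this deals only with unfolding and decoding; it does nothing about encoded-words that were already present before downgrading.

```python
import re
from email.header import decode_header, make_header

def unfold(value):
    """Remove each CRLF (or bare LF) that is immediately followed by WSP."""
    return re.sub(r"\r?\n(?=[ \t])", "", value)

def comparable(value):
    """Unfold, decode encoded-words and collapse whitespace runs.

    The result is meant for comparison only, not for display.
    """
    decoded = str(make_header(decode_header(unfold(value))))
    return " ".join(decoded.split())

# Made-up header values: the same address list, folded and unfolded.
folded = "<jose@example.com>,\r\n <ann@example.com>"
flat = "<jose@example.com>, <ann@example.com>"
print(comparable(folded) == comparable(flat))
```
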

Appendix B.  Examples

B.1.  Downgrading example 1

   Result of the header downgrading.

  Return-Path: <ASCII-FROM>

It is customary to insert the Return-Path: at the _end_ of the headers.

   Envelope-Downgraded: From: <RFC2047(NON-ASCII-FROM)> <ASCII-FROM>
   Envelope-Downgraded: To: <RFC2047(NON-ASCII-TO)> <ASCII-TO>
   Message-Id: MESSAGE_ID
   Mime-Version: 1.0
   Content-Type: text/plain; charset="UTF-8"
   Content-Transfer-Encoding: 8bit
   Subject: RFC2047(UTF-8_SUBJECT)
   Downgraded: From: RFC2047(<NON-ASCII-FROM <ASCII-FROM>>)
   From: <ASCII-FROM>
   Downgraded: To: RFC2047(<NON-ASCII-TO <ASCII-TO>>)
   To: <ASCII-TO>
   Downgraded: CC: RFC2047(<NON-ASCII-CC>)
   CC: Internationalized address RFC2047(NON-ASCII-CC) removed:;

It would be nice to show an example with a <display-name> in it there.

   Date: DATE
  MAIL_BODY
                   Figure 5: Header downgraded message



                  Figure 13: Header downgraded message 2

The previous draft had a useful paragraph here about a MIME encapsulated
subject header. Why has it been removed?

