[apps-discuss] apps-review team review for draft-ietf-eai-rfc5335bis-07

"Murray S. Kucherawy" <msk@cloudmark.com> Mon, 03 January 2011 07:22 UTC

From: "Murray S. Kucherawy" <msk@cloudmark.com>
To: "apps-discuss@ietf.org" <apps-discuss@ietf.org>
Date: Sun, 02 Jan 2011 23:24:27 -0800
Message-ID: <F5833273385BB34F99288B3648C4F06F1341E73C8A@EXCH-C2.corp.cloudmark.com>
Cc: John C Klensin <klensin@jck.com>, "Shawn.Steele@microsoft.com" <Shawn.Steele@microsoft.com>, "abelyang@twnic.net.tw" <abelyang@twnic.net.tw>
Subject: [apps-discuss] apps-review team review for draft-ietf-eai-rfc5335bis-07

With apologies for the delayed submission of this due to the holidays...



I have been selected as the Applications Area Review Team reviewer for this draft (for background on apps-review, please see http://www.apps.ietf.org/content/applications-area-review-team).

Please resolve these comments along with any other Last Call comments you may receive. Please wait for direction from your document shepherd or AD before posting a new version of the draft.



Document: draft-ietf-eai-rfc5335bis-07

Title: Internationalized Email Headers

Reviewer: Murray S. Kucherawy

Review Date: 2010-12-23

IETF Last Call Date: unknown

IESG Telechat Date: unknown

Summary: This draft is not ready for publication as a Proposed Standard and should be revised before publication.



Major issues:


Sections 1.2 and 4.2 state that this specification "removes the blanket ban on applying a content-transfer-encoding to all subtypes of message/ and instead..."  This refers to RFC2045 Section 6.4.  I re-read that section and I don't interpret it the same way.  It says:



   "In particular, it is EXPRESSLY FORBIDDEN to use any
   encodings other than "7bit", "8bit", or "binary" with any composite
   media type, i.e. one that recursively includes other Content-Type
   fields.  Currently the only composite media types are "multipart" and
   "message".  All encodings that are desired for bodies of type
   multipart or message must be done at the innermost level, by encoding
   the actual body that needs to be encoded."

It seems to me that there is not an outright ban, but a restriction on which particular encodings can be used.  It also seems to me that the "8bit" and, especially, "binary" encodings are sufficiently unrestrictive that they could carry any conceivable content, including UTF-8.  Thus, a binary-encoded message/global MIME object is legal and doesn't require any alteration to RFC2045, and therefore I don't understand the need for this section of this draft.
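To make my reading of the quoted text concrete, here is a minimal sketch of the rule as I understand it.  The function and constant names are my own, hypothetical ones, and the handling of non-composite types is a deliberate simplification; this is an illustration of the interpretation above, not text from RFC2045 or from the draft.

```python
# Sketch of one reading of RFC2045 Section 6.4: composite media types
# may carry only the identity encodings "7bit", "8bit" or "binary".
# All names here are hypothetical and chosen for illustration.

COMPOSITE_TOP_LEVEL_TYPES = {"multipart", "message"}
IDENTITY_ENCODINGS = {"7bit", "8bit", "binary"}


def cte_allowed(content_type: str, cte: str) -> bool:
    """Return True if `cte` is permitted on `content_type` under this reading."""
    top_level = content_type.split("/", 1)[0].strip().lower()
    encoding = cte.strip().lower()
    if top_level in COMPOSITE_TOP_LEVEL_TYPES:
        # Restricted, but not banned: three encodings remain legal.
        return encoding in IDENTITY_ENCODINGS
    # Simplification: no further checks for non-composite types.
    return True


if __name__ == "__main__":
    print(cte_allowed("message/global", "binary"))   # allowed under this reading
    print(cte_allowed("message/rfc822", "base64"))   # forbidden by the quoted text
```

Under this reading, "message/global" with "binary" passes the check without any change to the rule, which is the point of the paragraph above.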

Section 5.2.4 of RFC2046 specifies that future "message" subtypes are expected to be "7bit" only, and anything else should register some other top-level media type.  Why was that not followed here?

Having now exchanged email with one of the WG chairs, I realize there may be more background or operational experience with EAI than I have, and that it may explain the above concerns.  However, future readers of this specification will likely also lack that context.  Therefore, I recommend that either Section 1.2 of this draft be extended with such discussion and background, or that an appendix be added to contain it.  The last paragraph of Section 4.2 does do a little of this, but I'm at a loss to see the point being made there.  Perhaps an example would help.  If one of the other documents EAI is advancing contains such material, please add a reference to it here.



The Security Considerations section should discuss the problem of having UTF-8 aware transport (i.e. MTAs) coupled with UTF-8 unaware user agents (e.g. readers) as well as filters and the like.  The author talks about needing bigger buffers, but I think that's far less interesting than the possible semantic implications.  I consider this a major issue, and so I would expect this discussion to be non-trivial in size, and include some admonishment about not upgrading a delivery MTA to support UTF-8 message headers until the entire infrastructure it serves has already been verified to handle it.  This might be discussed in one of the other EAI documents already; if it is, this one should contain a reference to that.



On a related note, Security Considerations should also talk about abuse mechanisms.  If, for example, there are lots of ways of using UTF-8 to represent something equivalent or similar to a particular displayed character or group of characters (all the variants of "e" in French, using accents, for example), then filtering systems can be bypassed by using one of the variants to avoid detection while still reaching the end user with largely the same original effect.  This too might be discussed elsewhere in general, in which case a reference to that discussion can be left here.
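As a small illustration of the variant problem, the sketch below builds the same displayed word two ways: once with a precomposed "e with acute" and once with a plain "e" followed by a combining acute accent.  The helper name is hypothetical.  Note that Unicode normalization addresses only canonically equivalent sequences like this one; merely similar-looking characters are a separate problem that this sketch does not cover.

```python
import unicodedata

# Two spellings that render the same way but differ at the octet level.
precomposed = "caf\u00e9"    # U+00E9 LATIN SMALL LETTER E WITH ACUTE
decomposed = "cafe\u0301"    # "e" followed by U+0301 COMBINING ACUTE ACCENT


def same_after_nfc(a: str, b: str) -> bool:
    """Compare two strings after NFC normalization (hypothetical helper)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


if __name__ == "__main__":
    # A filter doing a naive string or byte comparison sees two different values.
    print(precomposed == decomposed)
    print(precomposed.encode("utf-8") == decomposed.encode("utf-8"))
    # After normalization the two spellings compare equal.
    print(same_after_nfc(precomposed, decomposed))
```

A filter that matches on raw octets would need a rule for each spelling, or would need to normalize before matching.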

Minor issues:



Section 2 explains that some people want to be able to use non-ASCII characters as part of header fields, though it also points out that RFC2047 already presents a mechanism to do so in a 7-bit-clean manner.  Section 4.3 then lays out the ABNF required to update RFC5322 such that just about any field can contain UTF-8 data.  Since this is a pretty big deal, it would be helpful to have some data about why the RFC2047 encoding mechanism that has been in widespread use since 1996 is not sufficient or appropriate, other than what's presented here which basically states that people don't want to use that encoding scheme for some reason, perhaps personal preference.
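For readers unfamiliar with the existing mechanism, here is a minimal sketch of RFC2047 encoding using Python's standard email.header module.  The helper names and the sample subject are my own assumptions for illustration; the point is only that the non-ASCII text travels as a pure-ASCII "encoded-word" and can be recovered on the receiving side.

```python
from email.header import Header, decode_header, make_header


def to_encoded_word(text: str) -> str:
    """Encode header text as RFC2047 encoded-word(s) (hypothetical helper)."""
    return Header(text, charset="utf-8").encode()


def from_encoded_word(value: str) -> str:
    """Decode RFC2047 encoded-word(s) back to text (hypothetical helper)."""
    return str(make_header(decode_header(value)))


subject = "Caf\u00e9 menu"          # sample value, chosen for illustration
encoded = to_encoded_word(subject)

if __name__ == "__main__":
    print(encoded)                    # ASCII-only form, safe for 7-bit transport
    print(from_encoded_word(encoded))
```

The encoded form is what a 7-bit-clean path carries today; the draft would instead allow the UTF-8 text itself to appear in the header field.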


Section 4.6, which contains the registration for "message/global", says "(Note that a system compliant with MIME that doesn't recognize message/global SHOULD treat it as "application/octet-stream" as described in Section 5.2.4 of [RFC2046].)"  As another reviewer points out, and I concur, it's dangerous to copy normative language from one document to another, rather than simply referencing it, as it encourages divergent requirements.  Another instance of this appears in Section 5.



Nits:



In the Abstract and in Section 1.1, the phrase "specifies a variant of Internet mail" seems to portray a much wider scope than what's being presented.  Suggest "specifies a modified version of the Internet message format".



Section 3: "A plain ASCII string is full compatible..."  I believe the authors wanted "fully" here.



Section 4: The title "Changes on..." should be "Changes to..."



Section 4.1: The final paragraph is a little too conversational in its use of language, such as starting a sentence with "Actually, ..." and another with "And ..."  This is inconsistent with the rest of the document and should be avoided.

Sections 4.2 and 4.4: The titles "Changes on..." should be "Changes to..."



Section 4.5: "It described in" should be "It is described in"



Section 4.6: "or within a non-SMTP environment which supports these messages."  Change "which" to "that".



Section 4.6: The Security Considerations of "See Section 5" should also say "of [this document]" or such.



Section 4.6: "Email clients which forward messages..." and "This is a structured media type which embeds..."  Change "which" to "that" in both cases.



Section 5: "...characters MUST be no more 998 octets, excluding..." Change "no more 998" to "no more than 998".
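On the 998-octet limit in Section 5, it may be worth showing why the unit matters once header fields can contain UTF-8: a line that is short when counted in characters can still exceed the limit when counted in octets.  The sketch below uses hypothetical names and assumes the limit applies to the UTF-8 encoded line with the trailing CRLF excluded, as the quoted sentence suggests.

```python
MAX_LINE_OCTETS = 998  # limit quoted from Section 5, excluding the CRLF


def line_octets(line: str) -> int:
    """Count the UTF-8 octets in a line, excluding one trailing CRLF."""
    if line.endswith("\r\n"):
        line = line[:-2]
    return len(line.encode("utf-8"))


def within_limit(line: str) -> bool:
    """Hypothetical check of a single line against the octet limit."""
    return line_octets(line) <= MAX_LINE_OCTETS


ascii_line = "a" * 998 + "\r\n"          # 998 characters, 998 octets
accented_line = "\u00e9" * 600 + "\r\n"  # 600 characters, 1200 octets

if __name__ == "__main__":
    print(within_limit(ascii_line))
    print(within_limit(accented_line))
```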