Re: [smime] [pkix] Support for email address internationalization in RFC5280 certificates

Sean Leonard <dev+ietf@seantek.com> Mon, 04 April 2016 22:20 UTC

From: Sean Leonard <dev+ietf@seantek.com>
In-Reply-To: <CAAFsWK0yYrEJkazOcyc+hOUTaihcBi6Aa31g9g3TyxvVzxyF5A@mail.gmail.com>
Date: Mon, 04 Apr 2016 19:20:10 -0300
Message-Id: <C726CA9F-369B-4EC9-BB0E-8AE38553858D@seantek.com>
References: <CAAFsWK0F6K_9VrDL7aX0QN56mWdhHsq0KV_1moR9pJ=A4E1BaA@mail.gmail.com> <CAK6vND-nAztjm9DzKNdCf1Hm2rbN5zAN4GWKuu5PiF49LeRSsw@mail.gmail.com> <CAAFsWK0yYrEJkazOcyc+hOUTaihcBi6Aa31g9g3TyxvVzxyF5A@mail.gmail.com>
To: "<pkix@ietf.org>" <pkix@ietf.org>
X-Mailer: Apple Mail (2.3124)
Archived-At: <http://mailarchive.ietf.org/arch/msg/smime/zr1z03QKHLs9Jo1QdefS76q2xzU>
Cc: smime@ietf.org
Subject: Re: [smime] [pkix] Support for email address internationalization in RFC5280 certificates

> On Feb 7, 2016, at 12:15 PM, Wei Chuang <weihaw@google.com> wrote:
> 
> 
> 
> On Fri, Feb 5, 2016 at 4:46 PM, Peter Bowen <pzbowen@gmail.com> wrote:
> On Thu, Feb 4, 2016 at 11:05 AM, Wei Chuang <weihaw@google.com> wrote:
> > PKIX community,
> >
> > We've observed a limitation for specifying internationalized email addresses
> > as the local part which is restricted to essentially ASCII.  That is subject
> > or issuer email addresses which should be stored as subject-alt-name or
> > issuer-alt-name rfc822Name and are encoded as IA5String.  This is despite
> > the internationalization in email usage as specified by internationalization
> > of email headers in RFC6532 allowing Unicode in To, From, etc fields and
> > becoming fairly commonplace.  RFC5280 already specifies internationalization
> > of the domain but lacks any specification for the local-part.

Up until now, I have tried to lie low on this topic. However, having reviewed the relevant standards and implementations in the field, I have my 22¢:

Two methods have been proposed so far: (#1) create an otherName form and assign a new object identifier for it (A. Melnikov, ed., draft-ietf-pkix-eai-addresses-00), and (#2) encode the local part in base64 with “:” as an escape signal (L. Baudoin, et al., draft-lbaudoin-iemax-02). There is also a counterproposal on the agenda, which I will label as #3, to make rfc822Name a CHOICE {IA5String, UTF8String}. Two other methods deserve serious consideration, #4 and #5 below. My 0.2¢ is on #4 and my 21.8¢ is on #5:

#4 Extend GeneralName with a new name type:

GeneralName ::= CHOICE {
  otherName [0] INSTANCE OF OTHER-NAME,
  rfc822Name [1] IA5String,
  dNSName [2] IA5String,
  x400Address [3] ORAddress,
  directoryName [4] Name,
  ediPartyName [5] EDIPartyName,
  uniformResourceIdentifier [6] IA5String,
  iPAddress [7] OCTET STRING,
  registeredID [8] OBJECT IDENTIFIER,
  eaiName [9] UTF8String,
  ... }

The advantage of this approach is that it conforms to X.509:2012, which uses the “...” extension marker to show that the CHOICE is extensible. However, the IETF invented GeneralName (RFC 2459), and the latest ASN.1 module (RFC 5912) does not use the “...” marker for extensibility. (Basically, I think most implementations would barf on the new alternative and fail the overall ASN.1 decoding op, so every structure that directly encodes a GeneralName would become undecodable for them.)
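
To make the concern about #4 concrete, here is a minimal sketch of a hypothetical strict decoder that only knows alternatives [0] through [8]. The function name, the sample addresses, and the short-form-length restriction are all assumptions for illustration, not taken from any real implementation. It shows how a single unknown [9] entry makes the whole GeneralNames decode fail, even though the rfc822Name next to it is perfectly readable.

```python
# Hypothetical strict decoder: a closed CHOICE with alternatives [0]..[8] only.
KNOWN_TAG_NUMBERS = set(range(9))


def strict_decode_general_names(der: bytes) -> list:
    """Decode a SEQUENCE OF GeneralName, failing on any unknown alternative.

    Sketch only: handles short-form lengths (< 128 octets) and returns the
    raw (tag number, content) pairs without interpreting the content.
    """
    if len(der) < 2 or der[0] != 0x30 or der[1] >= 0x80 or der[1] != len(der) - 2:
        raise ValueError("expected a short-form SEQUENCE")
    names, pos = [], 2
    while pos < len(der):
        if pos + 2 > len(der):
            raise ValueError("truncated TLV header")
        tag, length = der[pos], der[pos + 1]
        number = tag & 0x1F
        # Class bits 10 mean context-specific; anything else is unexpected here.
        if (tag & 0xC0) != 0x80 or number not in KNOWN_TAG_NUMBERS:
            raise ValueError(f"unknown GeneralName alternative (tag 0x{tag:02x})")
        if length >= 0x80 or pos + 2 + length > len(der):
            raise ValueError("unsupported or truncated length")
        names.append((number, der[pos + 2:pos + 2 + length]))
        pos += 2 + length
    return names


ascii_name = b"user@example.com"
eai_name = "用户@example.com".encode("utf-8")

rfc822 = bytes([0x81, len(ascii_name)]) + ascii_name  # rfc822Name [1]
eai = bytes([0x89, len(eai_name)]) + eai_name         # proposed eaiName [9]

legacy_only = bytes([0x30, len(rfc822)]) + rfc822
mixed = bytes([0x30, len(rfc822) + len(eai)]) + rfc822 + eai

print(strict_decode_general_names(legacy_only))
try:
    strict_decode_general_names(mixed)
except ValueError as exc:
    print("whole decode failed:", exc)
```

Whether deployed decoders behave like this strict sketch or skip unknown alternatives is exactly what would need to be measured.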

#5 Change GeneralName so that rfc822Name is actually just UTF8String:

   GeneralName ::= CHOICE {
        otherName                   [0]  INSTANCE OF OTHER-NAME,
        rfc822Name                  [1]  UTF8String,
        dNSName                     [2]  IA5String,
        x400Address                 [3]  ORAddress,
        directoryName               [4]  Name,
        ediPartyName                [5]  EDIPartyName,
        uniformResourceIdentifier   [6]  IA5String,
        iPAddress                   [7]  OCTET STRING,
        registeredID                [8]  OBJECT IDENTIFIER
   }

GeneralName is in the IMPLICIT TAGS part of PKIX. That means that on the wire, a GeneralName will (almost always) just be serialized as the context-specific tag of the chosen alternative, followed by the length and the data. The counterproposal (#3) of a CHOICE {IA5String, UTF8String} is flawed in that it will force ALL rfc822Names to carry an additional inner tag, UNIVERSAL 22 in the case of IA5String, because the choice is ambiguous without it (so a proper ASN.1 compiler will force the serialization and de-serialization of the tag). Note: the UTF8String alternative would likewise force serialization of the tag UNIVERSAL 12.
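
The difference on the wire can be sketched with a few lines of hand-rolled DER. This is an illustration only: the helper `der_tlv` and the sample address are assumptions, and the helper handles short-form lengths only. It compares an implicitly tagged rfc822Name (today's encoding, and the encoding under #5) with the explicitly wrapped form that a CHOICE inside [1] would require.

```python
def der_tlv(tag: int, content: bytes) -> bytes:
    """Encode one DER tag-length-value; short-form lengths only (< 128 octets)."""
    if len(content) >= 128:
        raise ValueError("this sketch handles short-form lengths only")
    return bytes([tag, len(content)]) + content


CONTEXT_1_PRIMITIVE = 0x81    # [1] IMPLICIT: replaces the string's own tag
CONTEXT_1_CONSTRUCTED = 0xA1  # [1] wrapping an inner TLV (explicit tagging)
UNIVERSAL_IA5STRING = 0x16    # UNIVERSAL 22
UNIVERSAL_UTF8STRING = 0x0C   # UNIVERSAL 12

addr = "user@example.com".encode("ascii")

# rfc822Name [1] IA5String (today) or [1] UTF8String (#5): one tag, one length.
implicit = der_tlv(CONTEXT_1_PRIMITIVE, addr)

# rfc822Name [1] CHOICE {IA5String, UTF8String} (#3): the inner universal tag
# must be present so a decoder can tell which alternative was chosen.
choice_ia5 = der_tlv(CONTEXT_1_CONSTRUCTED, der_tlv(UNIVERSAL_IA5STRING, addr))
choice_utf8 = der_tlv(CONTEXT_1_CONSTRUCTED, der_tlv(UNIVERSAL_UTF8STRING, addr))

print(implicit.hex())
print(choice_ia5.hex())
print(choice_utf8.hex())
```

The CHOICE forms are two octets longer, and more importantly they start with a different outer tag octet, so an existing decoder expecting a primitive [1] would not recognize them as the rfc822Name it knows.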

With this proposal #5, UTF8String is just a superset of IA5String. Therefore, new implementations will “just work” with virtually no further coding. The high-octet data in UTF8String will violate expectations for older implementations that are looking for IA5String. But enforcement of octets 00-7F is almost never done in the decoding step, or if it is done, it does not cause the entire ASN.1 decoding op to fail. (Note: this would be an “ASN.1 value constraint violation.”) If most implementations continue to decode the ASN.1 and simply skip over what they perceive to be “invalid ASCII” (or simply reject that particular alternative when doing name comparisons), we are good to go. This basically mirrors the way that EAI itself works in RFCs 6530-6532.
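
The superset argument can be illustrated as follows. This is a sketch under assumptions: `lenient_rfc822_name` models one hypothetical legacy behavior (skip the name rather than fail the decode), `eai_rfc822_name` models a new implementation, and the sample addresses are made up. Real implementations may behave differently, which is the open question.

```python
def is_ia5_clean(octets: bytes) -> bool:
    """True if every octet is in the 7-bit range 00-7F."""
    return all(b <= 0x7F for b in octets)


def lenient_rfc822_name(octets: bytes):
    """Hypothetical legacy handling: return the name, or None to skip it.

    High octets are treated as a value constraint violation on this one
    alternative, not as a reason to abort the whole decode.
    """
    if not is_ia5_clean(octets):
        return None
    return octets.decode("ascii")


def eai_rfc822_name(octets: bytes) -> str:
    """Hypothetical new handling under #5: the content is UTF-8."""
    return octets.decode("utf-8")


ascii_addr = "user@example.com"
eai_addr = "用户@example.com"

# An all-ASCII address has identical octets either way, so existing
# certificates read the same under IA5String and UTF8String.
print(ascii_addr.encode("utf-8") == ascii_addr.encode("ascii"))

for addr in (ascii_addr, eai_addr):
    octets = addr.encode("utf-8")
    print(addr, "| legacy:", lenient_rfc822_name(octets), "| new:", eai_rfc822_name(octets))
```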

To test this, one would want to construct a signed certificate with “invalid” IA5String data that actually contains valid Unicode octets, and see what happens with various implementations.
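
As a starting point for such a test, the sketch below builds only the GeneralNames value (the content of a subjectAltName extension) with UTF-8 octets under the existing [1] tag, then walks it back. It does not produce a signed certificate; embedding the bytes in a certificate and signing it is left to whatever tooling is at hand. The helper names and addresses are assumptions, and only short-form lengths are handled.

```python
def der_tlv(tag: int, content: bytes) -> bytes:
    """Encode one DER tag-length-value; short-form lengths only (< 128 octets)."""
    if len(content) >= 128:
        raise ValueError("this sketch handles short-form lengths only")
    return bytes([tag, len(content)]) + content


def build_general_names(rfc822_names: list) -> bytes:
    """SEQUENCE OF GeneralName, each entry an rfc822Name [1] carrying UTF-8."""
    body = b"".join(der_tlv(0x81, name.encode("utf-8")) for name in rfc822_names)
    return der_tlv(0x30, body)


def walk_general_names(der: bytes):
    """Yield (tag octet, content) for each entry; short-form lengths only."""
    if len(der) < 2 or der[0] != 0x30 or der[1] >= 0x80 or der[1] != len(der) - 2:
        raise ValueError("expected a short-form SEQUENCE")
    pos = 2
    while pos < len(der):
        if pos + 2 > len(der):
            raise ValueError("truncated TLV header")
        tag, length = der[pos], der[pos + 1]
        if length >= 0x80 or pos + 2 + length > len(der):
            raise ValueError("unsupported or truncated length")
        yield tag, der[pos + 2:pos + 2 + length]
        pos += 2 + length


san_value = build_general_names(["user@example.com", "用户@example.com"])
print(san_value.hex())

for tag, content in walk_general_names(san_value):
    high = any(b > 0x7F for b in content)
    print(f"tag 0x{tag:02x}: {content.decode('utf-8')} (high octets: {high})")
```

The second entry is the interesting one: it is “invalid” as IA5String but valid UTF-8, which is the case to throw at various implementations.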

I am not saying that this is the “right” approach, but I do think that it deserves serious consideration when evaluating alternatives. One advantage, for example, is that it should preserve name constraints with no additional coding.

Regards,

Sean