Re: [precis] U+212B (ANGSTROM SIGN) in Usernames

Peter Saint-Andre <stpeter@stpeter.im> Wed, 04 May 2016 21:31 UTC

To: Christian Schudt <christian.schudt@gmx.de>, precis@ietf.org
References: <3A5E5283-7BCC-41DA-A183-6872695D91DC@gmx.de>
From: Peter Saint-Andre <stpeter@stpeter.im>
Message-ID: <572A6A4C.8080102@stpeter.im>
Date: Wed, 04 May 2016 15:31:56 -0600
Archived-At: <http://mailarchive.ietf.org/arch/msg/precis/hYbbwkxSUDzbNtGPNsVCP4jyJ_s>
List-Id: Preparation and Comparison of Internationalized Strings <precis.ietf.org>

Hi Christian, thanks for your input. Please accept my apologies for the 
seriously delayed reply.

On 11/21/15 6:12 AM, Christian Schudt wrote:
> Hi,
>
> can you please help answering the following question:
>
> When doing enforcement of a string of the UsernameCaseMappedProfile
> as per RFC 7613 § 3.2.2 and the string contains U+212B (ANGSTROM
> SIGN), I would first do the preparation and it would be disallowed by
> the IdentifierClass because it has a compatibility equivalent (which
> is U+00C5 LATIN CAPITAL LETTER A WITH RING ABOVE).
>
> However, if I stick to the order rules specified in the Precis Core
> Framework (RFC 7564 § 7), the string would first be normalized with
> NFC (RFC 7613 § 4.2.2.4), becoming U+00C5, and then later would pass
> the IdentifierClass check.
>
> If I stick to the rules in RFC 7613 § 3.2.2, such a string would be
> disallowed. If I stick to the rules in RFC 7564 § 7, it would be
> allowed.
>
> What’s correct behavior?
>
> Preparation (RFC 7613 § 3.2.1) has introduced a workaround for the
> HasCompat issue, but only for full- and halfwidth characters (which
> U+212B is not).
>
> Generally I don’t understand why RFC 7613 violates the order of rules
> (compared to RFC 7564 § 7): Doing the preparation (checking the
> IdentifierClass) first as opposed to do it after the normalization.

As hinted in the message I just sent to the PRECIS list, I think that 
using the order of operations specified in RFC 7564 is the right thing 
to do during enforcement and comparison because (a) the IdentifierClass 
is supposed to be a "safe" choice for things like usernames and (b) we 
don't want to willfully violate the "Principle of Least Astonishment" 
(U+212B looks and behaves like U+00C5 and in fact is canonically 
equivalent to U+00C5).
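
To make the canonical-equivalence point concrete, here is a minimal 
sketch that assumes nothing beyond Python's standard unicodedata 
module. The helper name canonically_equivalent is mine, not something 
from the RFCs.

```python
import unicodedata

ANGSTROM_SIGN = "\u212B"   # U+212B ANGSTROM SIGN
A_WITH_RING = "\u00C5"     # U+00C5 LATIN CAPITAL LETTER A WITH RING ABOVE


def canonically_equivalent(a: str, b: str) -> bool:
    """Two strings are canonically equivalent if their NFD forms match.

    Hypothetical helper for illustration only.
    """
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


# The decomposition mapping of U+212B carries no <compat>-style tag,
# which marks it as a canonical (singleton) decomposition.
decomposition = unicodedata.decomposition(ANGSTROM_SIGN)

# NFC therefore replaces U+212B with U+00C5.
nfc_result = unicodedata.normalize("NFC", ANGSTROM_SIGN)
```

With this, canonically_equivalent(ANGSTROM_SIGN, A_WITH_RING) is true 
and nfc_result is U+00C5.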

Thus my inclination is to treat this as a spec bug in RFC 7613, by which 
I mean that for enforcement and comparison in 7613bis we would follow 
the order of operations specified in RFC 7564; this would imply that 
U+212B would be mapped to U+00C5 (which was also the result of applying 
SASLprep in accordance with RFC 4013).
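
A rough sketch of why the order matters, under loud assumptions: the 
check below models only the HasCompat rule (toNFKC(cp) != cp, as 
defined in RFC 7564) and ignores every other IdentifierClass rule, and 
it omits the width-mapping and case-mapping steps of the profile. The 
function names are hypothetical and this is not a PRECIS 
implementation.

```python
import unicodedata
from typing import Optional


def has_compat(ch: str) -> bool:
    # PRECIS HasCompat: the code point changes under NFKC.
    return unicodedata.normalize("NFKC", ch) != ch


def passes_identifier_check(s: str) -> bool:
    # Simplified stand-in for the IdentifierClass check (assumption:
    # only the HasCompat rule is modelled here).
    return not any(has_compat(ch) for ch in s)


def check_then_normalize(s: str) -> Optional[str]:
    # Order as read from RFC 7613: class check first, NFC afterwards.
    if not passes_identifier_check(s):
        return None  # string is disallowed
    return unicodedata.normalize("NFC", s)


def normalize_then_check(s: str) -> Optional[str]:
    # Order as in RFC 7564 section 7: NFC first, class check afterwards.
    normalized = unicodedata.normalize("NFC", s)
    return normalized if passes_identifier_check(normalized) else None


username = "\u212Bngstrom"
rejected = check_then_normalize(username)   # U+212B trips HasCompat
accepted = normalize_then_check(username)   # U+212B became U+00C5 first
```

Under these assumptions the first ordering rejects the username, while 
the second accepts it with U+212B replaced by U+00C5, which matches the 
NFKC-based outcome of SASLprep for this character.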

Or so it seems to me.

Peter