[Ltru] RFC 4646 production "grandfathered" considered harmful

John Cowan <cowan@ccil.org> Sun, 17 September 2006 06:19 UTC

Section 2.2.9 of RFC 4646 says:

   An implementation that claims to check for well-formed language tags
   MUST:

   o  Check that the tag and all of its subtags, including extension and
      private use subtags, conform to the ABNF OR that the tag is on the
      list of grandfathered tags.

   o  Check that singleton subtags that identify extensions do not
      repeat.  For example, the tag "en-a-xx-b-yy-a-zz" is not well-
      formed.

(I have emphasized the word OR in the first bullet point.)

Unfortunately, this wording is too permissive.  The "grandfathered" rule
in the ABNF is a generic pattern rather than a list, so the invalid tag
"ra-bb-it" matches it.  That tag therefore satisfies the first half of
the OR and winds up being well-formed, even though it cannot be analyzed
as a sequence of subtags and is not on the grandfathered list either.
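
The over-matching described above can be reproduced with a short Python
sketch.  The regular expression is my own transcription of the RFC 4646
"grandfathered" production, which I am assuming reads
1*3ALPHA 1*2("-" (2*8alphanum)); the names GRANDFATHERED_RULE and
matches_grandfathered_rule are hypothetical and exist only for this
illustration.

```python
import re

# Assumed transcription of the RFC 4646 ABNF rule:
#   grandfathered = 1*3ALPHA 1*2("-" (2*8alphanum))
# One to three letters, then one or two hyphenated groups of two to
# eight letters or digits.
GRANDFATHERED_RULE = re.compile(r"[A-Za-z]{1,3}(?:-[A-Za-z0-9]{2,8}){1,2}")


def matches_grandfathered_rule(tag: str) -> bool:
    """Return True if the whole tag matches the generic pattern."""
    return GRANDFATHERED_RULE.fullmatch(tag) is not None


# A genuinely grandfathered tag matches, as intended.
print(matches_grandfathered_rule("i-klingon"))
# The invalid tag matches too, because the rule is a pattern, not a list.
print(matches_grandfathered_rule("ra-bb-it"))
```

Both calls report a match, so a checker that accepts anything conforming
to the ABNF accepts "ra-bb-it" without ever consulting the registry.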

To avoid this, we can take one of two actions:

1) Remove the "grandfathered" production from the ABNF altogether, and
rely on the "OR" in the conformance clause to make the irregular
grandfathered tags (that is, those that do not match the "langtag" or
"privateuse" productions) well-formed.  The danger here is that people
will implement the ABNF only, and the grandfathered tags will become
outright unusable rather than merely discouraged.

2) Replace the "grandfathered" production with an "irregular" production
that explicitly enumerates the 17 irregular grandfathered tags, thus:

irregular = "en-GB-oed" / "i-ami" / "i-bnn" / "i-default"
            / "i-enochian" / "i-hak" / "i-klingon" / "i-lux" / "i-mingo"
            / "i-navajo" / "i-pwn" / "i-tao" / "i-tay" / "i-tsu"
            / "sgn-BE-fr" / "sgn-BE-nl" / "sgn-CH-de"

It is safe to enumerate this list explicitly, as it can neither grow
nor shrink.  It's true that all the tags except "i-default" can become
deprecated, but that makes no difference to processors that check only
for well-formedness.  The other grandfathered tags in the registry
already match the rest of the ABNF, so they are well-formed without
being in this list.

In this case the conformance clause can be simplified by omitting the
second part of the "OR".
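
As a rough sketch of what a well-formedness check under choice 2 might
look like, the Python below accepts a tag if it is one of the 17
irregular tags, a private use tag, or a "langtag" with no repeated
extension singleton.  The regular expressions are my own transcription
of the RFC 4646 ABNF and should be treated as an assumption, not as
normative text; the function names are hypothetical.

```python
import re

# The 17 irregular grandfathered tags from the proposed production,
# lowercased because tags are compared case-insensitively.
IRREGULAR = {
    "en-gb-oed", "i-ami", "i-bnn", "i-default", "i-enochian", "i-hak",
    "i-klingon", "i-lux", "i-mingo", "i-navajo", "i-pwn", "i-tao",
    "i-tay", "i-tsu", "sgn-be-fr", "sgn-be-nl", "sgn-ch-de",
}

# Assumed transcription of the RFC 4646 "langtag" and "privateuse" rules.
LANGUAGE = r"(?:[A-Za-z]{2,3}(?:-[A-Za-z]{3}){0,3}|[A-Za-z]{4}|[A-Za-z]{5,8})"
SCRIPT = r"(?:-[A-Za-z]{4})?"
REGION = r"(?:-(?:[A-Za-z]{2}|[0-9]{3}))?"
VARIANT = r"(?:-(?:[A-Za-z0-9]{5,8}|[0-9][A-Za-z0-9]{3}))*"
EXTENSION = r"(?:-[0-9A-WY-Za-wy-z](?:-[A-Za-z0-9]{2,8})+)*"
PRIVATEUSE = r"[xX](?:-[A-Za-z0-9]{1,8})+"

LANGTAG_RE = re.compile(
    LANGUAGE + SCRIPT + REGION + VARIANT + EXTENSION
    + r"(?:-" + PRIVATEUSE + r")?"
)
PRIVATEUSE_RE = re.compile(PRIVATEUSE)


def has_repeated_singleton(tag: str) -> bool:
    """Return True if an extension singleton occurs more than once."""
    seen = set()
    for subtag in tag.lower().split("-"):
        if subtag == "x":
            break  # everything after "x" is private use, not an extension
        if len(subtag) == 1:
            if subtag in seen:
                return True
            seen.add(subtag)
    return False


def is_well_formed(tag: str) -> bool:
    """Well-formedness with the explicit irregular list and no 'OR'."""
    if tag.lower() in IRREGULAR:
        return True
    if PRIVATEUSE_RE.fullmatch(tag):
        return True
    if LANGTAG_RE.fullmatch(tag):
        return not has_repeated_singleton(tag)
    return False


for candidate in ("en-US", "en-GB-oed", "ra-bb-it", "en-a-xx-b-yy-a-zz"):
    print(candidate, is_well_formed(candidate))
```

With this shape, "ra-bb-it" is rejected because it matches neither the
regular productions nor the fixed list, while "en-GB-oed" is accepted
only because it is enumerated.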

I favor choice 2.

-- 
John Cowan                                   cowan@ccil.org
        "You need a change: try Canada"  "You need a change: try China"
                --fortune cookies opened by a couple that I know
