
Re: [Ltru] RFC 4646 production "grandfathered" considered harmful



Good point. I like the idea of the irregular list.

Addison

John Cowan wrote:
Section 2.2.9 of RFC 4646 says:

   An implementation that claims to check for well-formed language tags
   MUST:

   o  Check that the tag and all of its subtags, including extension and
      private use subtags, conform to the ABNF OR that the tag is on the
      list of grandfathered tags.

   o  Check that singleton subtags that identify extensions do not
      repeat.  For example, the tag "en-a-xx-b-yy-a-zz" is not well-
      formed.

(I have emphasized the word OR in the first bullet point.)
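
The second bullet point, the repeated-singleton check, can be sketched as
follows.  This is a minimal Python illustration, not text from the RFC; the
function name has_repeated_singleton is hypothetical, and the sketch assumes
that an "x" subtag begins the private use sequence, after which
single-character subtags are no longer extension singletons.

```python
def has_repeated_singleton(tag: str) -> bool:
    """Return True if an extension singleton occurs more than once in tag."""
    seen = set()
    # Language tags are case-insensitive, so compare in lower case.
    for subtag in tag.lower().split("-"):
        if subtag == "x":
            # Everything after "x" is private use; stop looking.
            break
        if len(subtag) != 1:
            continue
        if subtag in seen:
            return True
        seen.add(subtag)
    return False


if __name__ == "__main__":
    print(has_repeated_singleton("en-a-xx-b-yy-a-zz"))
    print(has_repeated_singleton("en-a-xx-b-yy"))
```

The example tag from the RFC text is rejected because the singleton "a"
appears twice.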

Unfortunately, this wording is too permissive.  For example, the invalid tag
"ra-bb-it" matches the "grandfathered" rule in the ABNF, which is a generic
pattern rather than a list of the registered tags.  It therefore winds up
being well-formed even though it cannot be analyzed as a sequence of subtags
and is not on the grandfathered list either.
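
To make the problem concrete, here is a minimal Python sketch of the
"grandfathered" production as a regular expression.  It assumes the RFC 4646
rule reads grandfathered = 1*3ALPHA 1*2("-" (2*8alphanum)); that transcription
should be checked against the RFC text, and the names GRANDFATHERED_RE and
matches_grandfathered are hypothetical.

```python
import re

# Assumed transcription of the RFC 4646 rule:
#   grandfathered = 1*3ALPHA 1*2("-" (2*8alphanum))
GRANDFATHERED_RE = re.compile(r"[A-Za-z]{1,3}(?:-[A-Za-z0-9]{2,8}){1,2}")


def matches_grandfathered(tag: str) -> bool:
    """Return True if the whole tag matches the generic pattern."""
    return GRANDFATHERED_RE.fullmatch(tag) is not None


if __name__ == "__main__":
    for tag in ("i-klingon", "en-GB-oed", "ra-bb-it"):
        print(tag, matches_grandfathered(tag))
```

Under that assumption the pattern accepts the registered tags "i-klingon" and
"en-GB-oed", but it accepts "ra-bb-it" just as readily, since it tests only
the shape of the tag and never consults the registry.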

To avoid this, we can take one of two actions:

1) Remove the "grandfathered" production in the ABNF altogether, and use
the "OR" in the conformance clause to allow the irregular grandfathered
tags (that is, those which don't match the "langtag" or "privateuse"
productions) to be well-formed.  The danger here is that people will
implement the ABNF only, and the grandfathered tags will become outright
unusable rather than merely discouraged.

2) Add an explicit "irregular" production in place of the "grandfathered"
production which explicitly enumerates the 17 irregular grandfathered
tags, thus:

irregular = "en-GB-oed" / "i-ami" / "i-bnn" / "i-default"
            / "i-enochian" / "i-hak" / "i-klingon" / "i-lux" / "i-mingo"
            / "i-navajo" / "i-pwn" / "i-tao" / "i-tay" / "i-tsu"
            / "sgn-BE-fr" / "sgn-BE-nl" / "sgn-CH-de"

It is safe to enumerate this list explicitly, as it can neither grow
nor shrink.  It's true that all the tags except "i-default" can become
deprecated, but that makes no difference to processors that check only
well-formedness.  The other grandfathered tags in the registry already
match the "langtag" production and do not need to be in this list.
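
In an implementation, choice 2 amounts to a fixed lookup table.  A minimal
Python sketch, with the hypothetical names IRREGULAR and is_irregular, and
assuming the comparison is case-insensitive as it is for language tags in
general:

```python
# The 17 irregular grandfathered tags enumerated above, stored in lower case.
IRREGULAR = frozenset(
    tag.lower()
    for tag in (
        "en-GB-oed", "i-ami", "i-bnn", "i-default",
        "i-enochian", "i-hak", "i-klingon", "i-lux", "i-mingo",
        "i-navajo", "i-pwn", "i-tao", "i-tay", "i-tsu",
        "sgn-BE-fr", "sgn-BE-nl", "sgn-CH-de",
    )
)


def is_irregular(tag: str) -> bool:
    """Return True if tag is one of the enumerated irregular tags."""
    return tag.lower() in IRREGULAR


if __name__ == "__main__":
    print(len(IRREGULAR))
    print(is_irregular("I-Klingon"))
    print(is_irregular("ra-bb-it"))
```

A well-formedness checker would accept a tag if it matches "langtag" or
"privateuse", or if is_irregular returns True; "ra-bb-it" passes none of
these tests.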

In this case the conformance clause can be simplified by omitting the
second part of the "OR".

I favor choice 2.


--
Addison Phillips
Globalization Architect -- Yahoo! Inc.

Internationalization is an architecture.
It is not a feature.

_______________________________________________
Ltru mailing list
Ltru at ietf.org
https://www1.ietf.org/mailman/listinfo/ltru



