
[Ltru] Re: [psg.com #1061] eliminate (or proscribe) Private Use Tags



Dylan N. Pierce <dylanpierce at megared dot net dot mx> wrote:

> The tag "zh-Hans-XQ" provides enough information that parsing agents
> know what they don't know: the region. The legitimate-but-opaque tag
> "zh-Hans-x-whoKnows" might contain a privately-defined region... but
> if so, why not use the first option? It also might contain something
> that's completely outside the intended scope of the language tag. If
> we're serious about what these tags are meant to represent, why make
> it possible to work against it? This returns to the "scribble space"
> issue.

Private-use regions should be encoded as "XQ" (or whatever), private-use
scripts as "Qaaa" (or whatever), and so forth.  I do not question that.
Private-use subtags of the form "x-whoKnows" are best used for concepts
that, if registered, would take the form of a variant or an extension -- 
something for which built-in private-use areas are not available.

I agree completely that "zh-Hans-XQ" is better than
"zh-Hans-x-whoKnows", *IF* the private-use thing in question is a
region.  But if the tagger is attempting to indicate a spoken accent, or
some other attribute of the language variety other than region, and
doesn't have a registered variant or extension RFC to turn to, then the
subtag in "x-" is really her only alternative.
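
To make the distinction concrete, here is a rough Python sketch (an
illustration only, not text from the draft).  It assumes the private-use
region codes are the ISO 3166 user-assigned ones (AA, QM-QZ, XA-XZ, ZZ)
and the private-use scripts are the ISO 15924 range Qaaa-Qabx.  The
function names are invented for this example.

```python
import re

def split_private_use(tag):
    """Split a tag at the 'x' singleton.

    Returns (public_subtags, private_subtags). Everything after 'x' is
    opaque without a private agreement; everything before it is not.
    """
    subtags = tag.split("-")
    lowered = [s.lower() for s in subtags]
    if "x" in lowered:
        i = lowered.index("x")
        return subtags[:i], subtags[i + 1:]
    return subtags, []

def is_private_region(subtag):
    # Assumption: ISO 3166 user-assigned codes AA, QM-QZ, XA-XZ, ZZ.
    return re.fullmatch(r"AA|ZZ|Q[M-Z]|X[A-Z]", subtag.upper()) is not None

def is_private_script(subtag):
    # Assumption: ISO 15924 private-use range Qaaa-Qabx.
    s = subtag.lower()
    return re.fullmatch(r"[a-z]{4}", s) is not None and "qaaa" <= s <= "qabx"
```

With "zh-Hans-XQ" a parser can at least see that the unknown piece is a
region; with "zh-Hans-x-whoKnows" it knows only that "zh-Hans" is public
and the rest is private.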

> I'm not sure a stronger version of the warning itself would clarify
> this. It would have to say something like:
>
> "Private-use subtags require private agreement between the parties
> that intend to use or exchange language tags that use them and they
> MUST NOT be used in content or protocols intended for general use
> unless it can be guaranteed that they will not inadvertently show up
> in parsing agents used by other parties with private-use agreements
> that coincidentally employ the same alphanumeric sequence."

I think this is far too strong.  It is impossible to "guarantee" that
such leakage, even if inadvertent, will not occur.  It is possible to
encourage taggers strongly not to toss these things out into the
marketplace willy-nilly, as some sort of corporate de-facto standard
(Dylan talks about this later), and it might be a good idea to add such
wording.  But I think MUST NOT goes too far.

> What does allowing an unknown "something else" gain in clarity or
> interoperability?

The ability to concoct a private agreement and achieve interoperability
within the confines of that agreement.

> So I guess at this point, what I am looking at isn't proposed new
> text, or proposed changes in text, but a proposed elimination of text,
> to wit, the text defining the private-use subtag "x". My question:
> is the gain in "swallowing comfort" won by unregulated, unparsable
> private-use tags worth the corresponding loss in clarity and
> interoperability?

It doesn't tend to work out that way.  Currently, RFC 3066 allows
private-use whole-tags (not subtags) of the form "x-whoKnows" (which of
course are much less interoperable than "zh-Hans-x-whoKnows" since they
don't contain ANY publicly understood information), and it turns out
that there is no great problem with private-use x-tags being generated
and spread all over the internet and elsewhere.  They just aren't that
common in the real world.  (In fact, during the "matching" portion of
our show, we heard repeatedly that many existing tag parsers assume all
RFC 3066 tags are of the form "xx" or "xx-xx" and would choke on script
subtags; these parsers obviously would choke on private tags as well.)
In short, I think the problem is overstated.
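
For illustration, the kind of parser described above might look like the
first pattern in this Python sketch.  The second pattern follows the
generic RFC 3066 syntax (a primary subtag of 1 to 8 letters, then any
number of subtags of 1 to 8 letters or digits).  Both function names are
hypothetical, and this is not modeled on any particular implementation.

```python
import re

# Assumed shape of a naive parser: only "xx" or "xx-xx".
NAIVE = re.compile(r"[A-Za-z]{2}(-[A-Za-z]{2})?")

# Generic RFC 3066 syntax: 1*8ALPHA *( "-" 1*8(ALPHA / DIGIT) ).
RFC3066 = re.compile(r"[A-Za-z]{1,8}(-[A-Za-z0-9]{1,8})*")

def naive_accepts(tag):
    return NAIVE.fullmatch(tag) is not None

def rfc3066_accepts(tag):
    return RFC3066.fullmatch(tag) is not None

for tag in ["en", "en-US", "zh-Hans", "x-whoKnows", "zh-Hans-x-whoKnows"]:
    print(tag, naive_accepts(tag), rfc3066_accepts(tag))
```

The naive pattern rejects "zh-Hans" just as it rejects "x-whoKnows", so
such a parser is tripped by script subtags as much as by private-use
ones.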

--
Doug Ewell
Fullerton, California
http://users.adelphia.net/~dewell/



_______________________________________________
Ltru mailing list
Ltru at lists.ietf.org
https://www1.ietf.org/mailman/listinfo/ltru