Dylan N. Pierce <dylanpierce at megared dot net dot mx> wrote:

> The tag "zh-Hans-XQ" provides enough information that parsing agents
> know what they don't know: the region. The legitimate-but-opaque tag
> "zh-Hans-x-whoKnows" might contain a privately-defined region... but
> if so, why not use the first option? It also might contain something
> that's completely outside the intended scope of the language tag. If
> we're serious about what these tags are meant to represent, why make
> it possible to work against it? This returns to the "scribble space"
> issue.

Private-use regions should be encoded as "XQ" (or whatever),
private-use scripts as "Qaaa" (or whatever), and so forth. I do not
question that.

Private-use subtags of the form "x-whoKnows" are best used for concepts
that, if registered, would take the form of a variant or an extension --
something for which built-in private-use areas are not available.

I agree completely that "zh-Hans-XQ" is better than
"zh-Hans-x-whoKnows", *IF* the private-use thing in question is a
region. But if the tagger is attempting to indicate a spoken accent, or
some other attribute of the language variety other than region, and
doesn't have a registered variant or extension RFC to turn to, then the
subtag in "x-" is really her only alternative.

> I'm not sure a stronger version of the warning itself would clarify
> this. It would have to say something like:
>
> "Private-use subtags require private agreement between the parties
> that intend to use or exchange language tags that use them and they
> MUST NOT be used in content or protocols intended for general use
> unless it can be guaranteed that they will not inadvertently show up
> in parsing agents used by other parties with private-use agreements
> that coincidentally employ the same alphanumeric sequence."

I think this is far too strong. It is impossible to "guarantee" that
such leakage, even if inadvertent, will not occur.
It is possible to encourage taggers strongly not to toss these things
out into the marketplace willy-nilly, as some sort of corporate de facto
standard (Dylan talks about this later), and it might be a good idea to
add such wording. But I think MUST NOT goes too far.

> What does allowing an unknown "something else" gain in clarity or
> interoperability?

The ability to concoct a private agreement and achieve interoperability
within the confines of that agreement.

> So I guess at this point, what I am looking at isn't proposed new
> text, or proposed changes in text, but a proposed elimination of text,
> to wit, the text defining the private-use subtag "x". My question:
> is the gain in "swallowing comfort" won by unregulated, unparsable
> private-use tags worth the corresponding loss in clarity and
> interoperability?

It doesn't tend to work out that way. Currently, RFC 3066 allows
private-use whole-tags (not subtags) of the form "x-whoKnows" (which of
course are much less interoperable than "zh-Hans-x-whoKnows" since they
don't contain ANY publicly understood information), and it turns out
that there is no great problem with private-use x-tags being generated
and spread all over the internet and elsewhere. They just aren't that
common in the real world. (In fact, during the "matching" portion of
our show, we heard repeatedly that many existing tag parsers assume all
RFC 3066 tags are of the form "xx" or "xx-xx" and would choke on script
subtags; these parsers obviously would choke on private tags as well.)

In short, I think the problem is overstated.

--
Doug Ewell
Fullerton, California
http://users.adelphia.net/~dewell/

_______________________________________________
Ltru mailing list
Ltru at lists.ietf.org
https://www1.ietf.org/mailman/listinfo/ltru
Note Well: Messages sent to this mailing list are the opinions of the senders and do not imply endorsement by the IETF.
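
Editorial illustration (not part of the original message): the contrast drawn above between "zh-Hans-XQ" and "zh-Hans-x-whoKnows" can be sketched in a few lines of Python. This is only a rough sketch under stated assumptions: the function name `describe` and its result keys are hypothetical, the parser handles only language, script, region, and a trailing "x-" sequence (no variants, extensions, or extended language subtags), and the private-use ranges are those reserved by ISO 3166 (AA, QM-QZ, XA-XZ, ZZ) and ISO 15924 (Qaaa-Qabx).

```python
import re

# Private-use ranges reserved by the underlying standards.
PRIVATE_REGION = re.compile(r"^(AA|Q[M-Z]|X[A-Z]|ZZ)$")   # ISO 3166
PRIVATE_SCRIPT = re.compile(r"^Qa(a[a-z]|b[a-x])$")       # ISO 15924


def describe(tag):
    """Hypothetical helper: report what a parser can and cannot know.

    Simplified on purpose: only language, script, region, and a
    trailing private-use ("x-") sequence are recognized.
    """
    parts = tag.split("-")
    lowered = [p.lower() for p in parts]

    # Everything after the singleton "x" is opaque to outsiders.
    opaque = []
    if "x" in lowered:
        cut = lowered.index("x")
        opaque = parts[cut + 1:]
        parts = parts[:cut]

    info = {
        "language": parts[0].lower() if parts else None,
        "script": None,
        "region": None,
        "private_script": False,
        "private_region": False,
        "opaque": opaque,
    }
    for sub in parts[1:]:
        if len(sub) == 4 and sub.isalpha() and info["script"] is None:
            info["script"] = sub.title()
            info["private_script"] = bool(PRIVATE_SCRIPT.match(info["script"]))
        elif len(sub) == 2 and sub.isalpha() and info["region"] is None:
            info["region"] = sub.upper()
            info["private_region"] = bool(PRIVATE_REGION.match(info["region"]))
    return info


if __name__ == "__main__":
    for example in ("zh-Hans-XQ", "zh-Hans-x-whoKnows", "x-whoKnows"):
        print(example, describe(example))
```

With "zh-Hans-XQ" the sketch can report that a region is present and that it is privately defined; with "zh-Hans-x-whoKnows" it can only report an opaque trailing subtag, which may or may not be a region; with the whole-tag "x-whoKnows" nothing publicly understood remains.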