
Re: [Ltru] Canonical variants



> However, users might sometimes supply one.

It is this statement that gives me hiccoughs. If John Smith gives meaning to V1-V2 vs V2-V1, how is anyone to know what meaning he gives them? Suppose Joe has V1-V2 mean "as used before the Rapture" and V2-V1 mean "as used after the Rapture". Who's to know? We have to have something concrete, or it is not interoperable.

However, what you and others seem to be implicitly assuming is the ordering "most significant subtag first". So just to get out of this issue, we could use that as a guiding principle. That is, we give the advice to users (in the "Tag Content Wisely" section) for variant subtag ordering as follows:

- put prefix variant subtags before their suffixes (that is, if V1 occurs in a Prefix of V2, V1 should go first)
- otherwise, put "more significant" variant subtags before others
- otherwise -- or in case there is any doubt as to relative significance -- put variant subtags in alphabetical order.
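
The three rules above can be sketched in Python as follows. This is only an illustration, not a proposed algorithm for the draft. The PREFIXES table is a small hand-written subset modelled on the registry's Resian entries, and the `significance` argument is a hypothetical stand-in for whatever ranking the tagger has in mind, since nothing here defines how significance would be determined.

```python
# Hypothetical table: variant subtag -> its registered Prefix values.
# An illustrative subset only; a real implementation would load the registry.
PREFIXES = {
    "rozaj": ["sl"],
    "biske": ["sl-rozaj"],
    "1994": ["sl-rozaj", "sl-rozaj-biske"],
}


def must_precede(variant, present):
    """Return the variants in `present` that occur in some Prefix of `variant`."""
    before = set()
    for prefix in PREFIXES.get(variant, []):
        for subtag in prefix.split("-"):
            if subtag in present and subtag != variant:
                before.add(subtag)
    return before


def order_variants(variants, significance=None):
    """Order variant subtags by the three rules of thumb.

    `significance` is an assumed, caller-supplied mapping from variant to
    rank, where a lower number means "more significant". Unranked variants
    fall through to alphabetical order.
    """
    significance = significance or {}
    remaining = {v.lower() for v in variants}
    ordered = []
    while remaining:
        # Rule 1: a variant is ready once nothing that must precede it is left.
        ready = [v for v in remaining if not must_precede(v, remaining)]
        if not ready:
            # Circular prefix data: ignore rule 1 rather than loop forever.
            ready = list(remaining)
        # Rule 2 (significance rank), then rule 3 (alphabetical) as tie-break.
        chosen = min(ready, key=lambda v: (significance.get(v, float("inf")), v))
        ordered.append(chosen)
        remaining.remove(chosen)
    return ordered
```

With this table, order_variants(["1994", "biske", "rozaj"]) gives rozaj, biske, 1994, and two variants with no prefix relationship and no supplied rank simply come out alphabetically.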

Mark

On Tue, Jul 15, 2008 at 11:07 AM, Phillips, Addison <addison at amazon.com> wrote:
There is no conventional semantic. However, users might sometimes supply one. Generally this isn't a huge issue because true variants are rare in the wild and of limited practicality. While that could change, in practice most general variants are going to be nonsense together. At present, none of the general purpose variants make sense with one another. While that will change over time, the utility of more than a couple together will always remain limited.

When implementing language tags, I have always made the assumption that the subtags' order had some meaning to the user, even though my implementation could not know what it was. All of the other subtags are canonically ordered in a tag, after all---extlang, private use, and extension sequences have to remain in the same order. Why would variants be any different? And somehow I don't think it is a good thing that canonicalizing processors produce a vastly different result from non-canonicalizing ones when that result is not measurably better (it is measurably better to match a request for en-BU also to the tag en-MM, etc.)

I've had the opportunity to recommend various of the matching schemes to different working groups. RFC 4647's compatibility with RFC 3066 has been a tremendous boon here. However, if some implementations start reordering subtags, they'll produce different results than those that are strictly of the "Tag Content Wisely" flavor (i.e. "you give me subtags and I'll remove them, one-by-one, until I find something"). The less I need the registry, the happier I am overall because that hews closer to the original language tag implementation requirements. So it makes me feel more comfortable saying that matching implementations don't need to worry about subtag (re-)ordering.
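
The "remove subtags one-by-one until I find something" behavior can be sketched as below. It is a minimal illustration in the spirit of RFC 4647 lookup, not a complete implementation: the function name `lookup` and its parameters are chosen for this example, and wildcard ranges and priority lists are not handled.

```python
def lookup(language_range, available, default=None):
    """Truncate `language_range` from the right until a tag in `available` matches.

    No registry is consulted and subtags are never reordered; comparison is
    case-insensitive, and the matching tag is returned in its original casing.
    """
    by_lower = {tag.lower(): tag for tag in available}
    subtags = language_range.lower().split("-")
    while subtags:
        candidate = "-".join(subtags)
        if candidate in by_lower:
            return by_lower[candidate]
        subtags.pop()
        # A single-character subtag (such as "x") left at the end is
        # dropped together with the subtag that followed it.
        if subtags and len(subtags[-1]) == 1:
            subtags.pop()
    return default


available = ["sl", "sl-rozaj"]
print(lookup("sl-rozaj-biske", available))
print(lookup("sl-biske-rozaj", available))
```

The two calls differ only in variant order, yet the first falls back to sl-rozaj and the second all the way to sl. That is the kind of divergence a reordering implementation would introduce relative to one that truncates the tag as given.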

Addison

Addison Phillips
Globalization Architect -- Lab126

Internationalization is not a feature.
It is an architecture.

> -----Original Message-----
> From: ltru-bounces at ietf.org [mailto:ltru-bounces at ietf.org] On
> Behalf Of Peter Constable
> Sent: Tuesday, July 15, 2008 10:20 AM
> To: ltru at ietf.org
> Subject: Re: [Ltru] Canonical variants
>
> > From: John Cowan [mailto:cowan at ccil.org]
>
>
> > Alas, some combining marks that were thought to be in irrelevant
> > positions
> > turned out not to be so.
>
> This response to the point I made is ignoring the point I was
> making: that those particular aspects of Unicode are not analogous:
> encoding order in Unicode is a transparent metaphor for positional
> ordering, whereas there isn't a metaphor for ordering of our
> subtags.
>
>
> > The analogy isn't about the specifics: it's about assuming that
> > something
> > is meaningless that later on turns out to be meaningful.
>
> But the analogy breaks down for the reasons I've stated: encoding
> order has no obvious metaphor for our subtags, and so unless *we
> decide* to define some conventional semantic, there is no
> conventional semantic.
>
>
>
> Peter
>
> _______________________________________________
> Ltru mailing list
> Ltru at ietf.org
> https://www.ietf.org/mailman/listinfo/ltru
_______________________________________________
Ltru mailing list
Ltru at ietf.org
https://www.ietf.org/mailman/listinfo/ltru


Note Well: Messages sent to this mailing list are the opinions of the senders and do not imply endorsement by the IETF.