There is no conventional semantic. However, users might sometimes supply one. Generally this isn't a huge issue, because true variants are rare in the wild and of limited practicality. At present, none of the general-purpose variants make sense with one another, and in practice most combinations are going to be nonsense. While that will change over time, the utility of combining more than a couple of them is always going to remain limited.
When implementing language tags, I have always made the assumption that the subtags' order had some meaning to the user, even though my implementation could not know what it was. All of the other subtags are canonically ordered in a tag, after all: extlang, private use, and extension sequences have to remain in the same order. Why would variants be any different? And somehow I don't think it is a good thing that canonicalizing processors produce a vastly different result from non-canonicalizing ones when that result is not measurably better. (By contrast, it is measurably better to match a request for en-BU also to the tag en-MM, etc.)
I've had the opportunity to recommend several of the matching schemes to different working groups. RFC 4647's compatibility with RFC 3066 has been a tremendous boon here. However, if some implementations start reordering subtags, they'll produce different results from those that are strictly of the "Tag Content Wisely" flavor (i.e. "you give me subtags and I'll remove them, one-by-one, until I find something"). The less I need the registry, the happier I am overall, because that hews closer to the original language tag implementation requirements. So it makes me feel more comfortable saying that matching implementations don't need to worry about subtag (re-)ordering.
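
To make the concern concrete, here is a minimal sketch of the truncating lookup described above, loosely following the RFC 4647 lookup scheme. The function name `lookup`, its parameters, and the sample tags are assumptions chosen for illustration; this is not taken from any particular implementation, and it consults no registry.

```python
def lookup(language_range, available, default=None):
    """Simplified truncating lookup: drop subtags from the right until a tag matches.

    Hypothetical helper for illustration only; it never reorders subtags.
    """
    # Compare case-insensitively, but return the tag as the caller spelled it.
    by_lower = {tag.lower(): tag for tag in available}
    subtags = language_range.lower().split("-")
    while subtags:
        candidate = "-".join(subtags)
        if candidate in by_lower:
            return by_lower[candidate]
        subtags.pop()
        # A single-character subtag (extension or private-use singleton)
        # cannot end a tag, so remove it along with what followed it.
        while subtags and len(subtags[-1]) == 1:
            subtags.pop()
    return default


# Illustrative content tags; the variants are used only as examples.
available = ["en", "en-fonipa"]

# The same two variants in a different order fall back to different tags.
print(lookup("en-fonipa-scouse", available))  # en-fonipa
print(lookup("en-scouse-fonipa", available))  # en
```

Because truncation works strictly from the right, a processor that reorders the variants before matching can land on a different fallback than one that leaves the user's order alone, which is the divergence I am worried about.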
Addison
Addison Phillips
Globalization Architect -- Lab126
Internationalization is not a feature.
It is an architecture.
> Behalf Of Peter Constable
> Sent: Tuesday, July 15, 2008 10:20 AM
> To: ltru at ietf.org
> Subject: Re: [Ltru] Canonical variants
>
> > From: John Cowan [mailto:cowan at ccil.org]
>
>
> > Alas, some combining marks that were thought to be in irrelevant
> > positions
> > turned out not to be so.
>
> This response to the point I made is ignoring the point I was
> making: that those particular aspects of Unicode are not analogous:
> encoding order in Unicode is a transparent metaphor for positional
> ordering, whereas there isn't a metaphor for ordering of our
> subtags.
>
>
> > The analogy isn't about the specifics: it's about assuming that
> > something
> > is meaningless that later on turns out to be meaningful.
>
> But the analogy breaks down for the reasons I've stated: encoding
> order has no obvious metaphor for our subtags, and so unless *we
> decide* to define some conventional semantic, there is no
> conventional semantic.
>
>
>
> Peter
>
> _______________________________________________
> Ltru mailing list
> Ltru at ietf.org
> https://www.ietf.org/mailman/listinfo/ltru
_______________________________________________
Ltru mailing list
Ltru at ietf.org
https://www.ietf.org/mailman/listinfo/ltru
Note Well: Messages sent to this mailing list are the opinions of the senders and do not imply endorsement by the IETF.