From: "Phillips, Addison" <addison at amazon.com>
To: Peter Constable <petercon at microsoft.com>, "ltru at ietf.org" <ltru at ietf.org>
Date: Tue, 15 Jul 2008 11:07:58 -0700
Subject: Re: [Ltru] Canonical variants

There is no conventional semantic. However, users might sometimes
supply one. Generally this isn't a huge issue because true variants are
rare in the wild and of limited practicality. While that could change,
in practice most general variants are going to be nonsense together. At
present, none of the general-purpose variants make sense with one
another. While that will change over time, the utility of more than a
couple together will always remain limited.

When implementing language tags, I have always made the assumption that
the subtags' order had some meaning to the user, even though my
implementation could not know what it was. All of the other subtags are
canonically ordered in a tag, after all: extlang, private use, and
extension sequences have to remain in the same order. Why would
variants be any different?

And somehow I don't think it is a good thing that canonicalizing
processors produce a vastly different result from non-canonicalizing
ones when that result is not measurably better. (It is measurably
better to match a request for en-BU also to the tag en-MM, etc.)

I've had the opportunity to recommend several of the matching schemes
to different working groups. RFC 4647's compatibility with RFC 3066 has
been a tremendous boon here. However, if some implementations start
reordering subtags, they'll produce different results than those that
are strictly of the "Tag Content Wisely" flavor (i.e. "you give me
subtags and I'll remove them, one-by-one, until I find something").

The less I need the registry, the happier I am overall because that
hews closer to the original language tag implementation requirements.
So it makes me feel more comfortable saying that matching
implementations don't need to worry about subtag (re-)ordering.
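To make the concern concrete, here is a minimal sketch of the "remove subtags one-by-one" matching described above, loosely modeled on the lookup scheme in RFC 4647. The function name `lookup`, the `AVAILABLE` list, and the alphabetically reordered tag are all assumptions made up for illustration; no processor is known to reorder variants this way.

```python
def lookup(language_range, available_tags, default=None):
    """Truncation-based lookup: drop trailing subtags until a tag matches."""
    # Language tags compare case-insensitively, so index by lowercase form.
    available = {tag.lower(): tag for tag in available_tags}
    subtags = language_range.lower().split("-")
    while subtags:
        candidate = "-".join(subtags)
        if candidate in available:
            return available[candidate]
        subtags.pop()
        # A single-character subtag (such as "x") only introduces what
        # follows it, so it is removed together with its last subtag.
        if subtags and len(subtags[-1]) == 1:
            subtags.pop()
    return default


# Hypothetical set of tags that a server has content for.
AVAILABLE = ["sl", "sl-rozaj", "sl-rozaj-biske"]

# The order the user supplied: truncation finds the closest dialect.
as_given = lookup("sl-rozaj-biske-1994", AVAILABLE)    # "sl-rozaj-biske"

# The same variants after a hypothetical alphabetical reordering:
# truncation now falls all the way back to the bare language.
reordered = lookup("sl-1994-biske-rozaj", AVAILABLE)   # "sl"
```

The two requests carry the same subtags, yet the reordered one loses every variant before anything matches, which is the kind of divergence between reordering and non-reordering implementations at issue here.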
Addison

Addison Phillips
Globalization Architect -- Lab126

Internationalization is not a feature.
It is an architecture.

> -----Original Message-----
> From: ltru-bounces at ietf.org [mailto:ltru-bounces at ietf.org] On
> Behalf Of Peter Constable
> Sent: Tuesday, July 15, 2008 10:20 AM
> To: ltru at ietf.org
> Subject: Re: [Ltru] Canonical variants
>
> > From: John Cowan [mailto:cowan at ccil.org]
> >
> > Alas, some combining marks that were thought to be in irrelevant
> > positions turned out not to be so.
>
> This response to the point I made is ignoring the point I was
> making: that those particular aspects of Unicode are not analogous:
> encoding order in Unicode is a transparent metaphor for positional
> ordering, whereas there isn't a metaphor for ordering of our
> subtags.
>
> > The analogy isn't about the specifics: it's about assuming that
> > something is meaningless that later on turns out to be meaningful.
>
> But the analogy breaks down for the reasons I've stated: encoding
> order has no obvious metaphor for our subtags, and so unless *we
> decide* to define some conventional semantic, there is no
> conventional semantic.
>
> Peter

_______________________________________________
Ltru mailing list
Ltru at ietf.org
https://www.ietf.org/mailman/listinfo/ltru
Note Well: Messages sent to this mailing list are the opinions of the senders and do not imply endorsement by the IETF.