From: "Phillips, Addison" <addison at amazon.com>
To: Leif Halvard Silli <lhs at malform.no>, Mark Davis <mark at macchiato.com>
Cc: LTRU Working Group <ltru at ietf.org>, Kent Karlsson <kent.karlsson14 at comhem.se>
Date: Thu, 2 Oct 2008 15:58:37 -0700
Subject: Re: [Ltru] Uniqueness of variant subtags

> You here use the word "romanization" - aka latinisation - instead
> of transcription. Can we rule out that UNGEGN will not specify
> e.g. Cyrillic transcriptions? (But if they should add Cyrillic
> transcriptions, then one could just add e.g. "en-Cyrl" to the list
> of prefixes.)

Mark is talking about a generic variant: one with no prefixes at all.

> And what about "be-ungegn", which you in fact used as example in
> one of your previous messages? Also see my note about
> "suppress-script" for variant tags below.

Suppress-Script for variants is not permitted and should not be
permitted, IMO. It is an unnecessary change. Furthermore, the
ietf-languages list has gone the other way and suggested that script
subtags SHOULD be used with a variant representing a transliteration.

> Good. ISSUE: You take for granted that it is meant
> "fr-CA-latn-1994". I would propose that variant subtag

NB> The script always precedes the region. "fr-Latn-CA-1994"

> registrations should always be very specific about which Region
> and Script they are valid for. For instance, the current
> registration of "1996" does not specify that it is a "de-DE"
> related norm. As if "de-CH-1996" would also be correct.

Others have pointed out the fallacy in this example, which does not
invalidate the point. Registrations SHOULD be specific about script or
region or even variant (in addition to language) when they are actually
confined to specific usages. See, for example, the registry record for
'1994'.

> The taggers would then have to know the Suppress-Script rules in
> order to understand that in real life tagging, they ought to write
> "de-DE-1996" and not "de-DE-Latn-1996". And they would also have

Uh... taggers already need to know when to use subtags.
They are specifically advised not to use scripts unless they add
something to the tag, a specific example of the more general advisory
not to use subtags that add nothing to the overall tag. Most German
applications are actually fine with "de". In some cases "de-DE" or
"de-DE-1996" is appropriate. Rarely "de-Latn-DE-1996" is
appropriate---probably when the German in question is also rendered
with other scripts in the same document/collection.

> to know the rules in order to know that they can/should/could also
> often drop "DE" and just write "de-1996".

RTFRFC.

> Just because a language subtag has (or hasn't) a Suppress-Script
> field, does not necessarily imply that the variant subtag has the
> same - or the same lack of - script association. Therefore I think

Yes it does. Note that transliteration variants, such as 'wadegile',
have a Prefix (zh-Latn, in that case) to convey the fact that the
script is needed. Although 'zh' does not suppress a script, the same
thing can be said of (for example) romanizations of other languages.
"be" requires no script and suppresses "Cyrl", but "be-Latn-ungegn"
would probably be a good choice if 'ungegn' were a valid subtag
representing a Latin transcription scheme.

> If one cannot use the Prefix field to do this, then one should
> invent a new field for this particular purpose. Perhaps a
> Relates-to field.

I think this is overkill. At some point we have to let the LSR register
subtags and at some point we have to let people tag stuff. It is
difficult enough--and maybe too difficult--for people to understand the
information there today.

Addison
_______________________________________________
Ltru mailing list
Ltru at ietf.org
https://www.ietf.org/mailman/listinfo/ltru
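For readers who want the ordering, Suppress-Script and Prefix points
from the message in concrete form, here is a minimal Python sketch. It
is not from the thread. The two small tables are hand-copied
illustrations rather than the real registry, the function names are
made up for this example, and the parser only handles simple
language-script-region-variant tags (no extlang, extensions or private
use). The prefix check is deliberately loose and is not the matching
algorithm from the RFC.

```python
# Illustrative excerpts only; a real tool would read the IANA registry.
SUPPRESS_SCRIPT = {"de": "Latn", "be": "Cyrl"}
VARIANT_PREFIXES = {"1996": ["de"], "wadegile": ["zh-Latn"]}


def split_tag(tag):
    """Classify subtags by shape: 4 letters = script, 2 letters or 3 digits = region."""
    subtags = tag.split("-")
    parts = {"language": subtags[0].lower(), "script": None,
             "region": None, "variants": []}
    for sub in subtags[1:]:
        if len(sub) == 4 and sub.isalpha():
            parts["script"] = sub.title()
        elif (len(sub) == 2 and sub.isalpha()) or (len(sub) == 3 and sub.isdigit()):
            parts["region"] = sub.upper()
        else:
            # Everything else is treated as a variant (simplification).
            parts["variants"].append(sub.lower())
    return parts


def format_tag(parts):
    """Emit subtags in the required order: language, script, region, variants."""
    pieces = [parts["language"], parts["script"], parts["region"], *parts["variants"]]
    return "-".join(p for p in pieces if p)


def drop_suppressed_script(tag):
    """Remove the script when it matches the language's Suppress-Script value."""
    parts = split_tag(tag)
    if parts["script"] == SUPPRESS_SCRIPT.get(parts["language"]):
        parts["script"] = None
    return format_tag(parts)


def matches_prefix(tag, variant):
    """Loose check: does the tag carry every subtag named in one of the variant's prefixes?"""
    parts = split_tag(tag)
    for prefix in VARIANT_PREFIXES.get(variant, []):
        want = split_tag(prefix)
        if (want["language"] == parts["language"]
                and want["script"] in (None, parts["script"])
                and want["region"] in (None, parts["region"])):
            return True
    return False


print(format_tag(split_tag("fr-CA-latn-1994")))   # fr-Latn-CA-1994
print(drop_suppressed_script("de-Latn-DE-1996"))  # de-DE-1996
print(matches_prefix("zh-wadegile", "wadegile"))  # False: prefix is zh-Latn
```

Note that drop_suppressed_script encodes only the default advice; as
the message says, there are rare cases where keeping "Latn" in a German
tag is appropriate, and a sketch like this cannot decide that.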
Note Well: Messages sent to this mailing list are the opinions of the senders and do not imply endorsement by the IETF.