
Re: [Ltru] Uniqueness of variant subtags



Date: Thu, 02 Oct 2008 19:58:21 +0200
From: Leif Halvard Silli <lhs at malform.no>
To: Mark Davis <mark at macchiato.com>
Cc: LTRU Working Group <ltru at ietf.org>,
	Kent Karlsson <kent.karlsson14 at comhem.se>

Mark Davis 2008-10-02 17.13:

I tried to go through and account for Addison's comments,

  [ snip ]

<t>Variant subtags that are used with multiple prefixes MUST have a
single core meaning across those prefixes. Such a core meaning may be
narrow, or may be broad. For example, it could refer to an
organization (including governments), and mean a variant as specified
by that organization for relevant prefixes. Thus 'ungegn' could be
defined as referring to a romanization for any given prefix as
specified by the United Nations Group of Experts on Geographical Names
(UNGEGN), and be thus productively used with 'ru-Latn', 'hi-Latn',
'zh-Latn', and others.</t>

You use the word "romanization" (aka latinisation) here instead of "transcription". Can we rule out that UNGEGN will ever specify, e.g., Cyrillic transcriptions? (But if they should add Cyrillic transcriptions, then one could just add e.g. "en-Cyrl" to the list of prefixes.)

And what about "be-ungegn", which you in fact used as an example in one of your previous messages? Also see my note below about Suppress-Script for variant subtags.

Option A.

<t>Four digit subtags are specially reserved for indicating a year.
Their meaning is in reference to some significant specification or
other work associated with that year. It may have multiple prefixes:
the particular specification for a given prefix MUST be clearly
indicated in one of the Descriptions. For example, the '1994' subtag
variant record could have the prefix 'fr-CA' added together with a
description of usage for a French Canadian spelling reform associated
with the year 1994.</t>


Good. ISSUE: You take for granted that what is meant is "fr-Latn-CA-1994". I would propose that variant subtag registrations should always be very specific about which Region and Script they are valid for. For instance, the current registration of "1996" does not specify that it is a "de-DE" related norm, as if "de-CH-1996" would also be correct.

So, for example, for the subtag "1996", I propose that it should say that the prefix is not "de" but "de-Latn-DE":

Type: variant
Subtag: 1996
Description: Official or recognised language reform associated with year 1996.
Prefix: de-Latn-DE
Prefix: uz-Latn-UZ
Prefix: be-Latn-ungegn

The taggers would then have to know the Suppress-Script rules in order to understand that in real-life tagging, they ought to write "de-DE-1996" and not "de-Latn-DE-1996". And they would also have to know the rules in order to know that they can often drop "DE" as well and just write "de-1996".
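To illustrate what the tagger has to do, here is a minimal Python sketch of the Suppress-Script step. It assumes tags are written in BCP 47 order (language, script, region, variant), so the fully explicit form is "de-Latn-DE-1996". The function name and the one-entry table are made up for the example; real values would come from the IANA Language Subtag Registry.

```python
# Illustrative subset only (assumption): the authoritative Suppress-Script
# values live in the IANA Language Subtag Registry.
SUPPRESS_SCRIPT = {"de": "Latn"}


def drop_suppressed_script(tag, suppress_script=SUPPRESS_SCRIPT):
    """Remove the script subtag when it equals the language's Suppress-Script.

    Simplification: only handles tags of the form language[-script][-...];
    extlang subtags and grandfathered tags are not considered.
    """
    subtags = tag.split("-")
    language = subtags[0].lower()
    # A script subtag, if present, is the four-letter subtag directly
    # after the language subtag.
    if len(subtags) > 1 and len(subtags[1]) == 4 and subtags[1].isalpha():
        if subtags[1].title() == suppress_script.get(language):
            del subtags[1]
    return "-".join(subtags)


print(drop_suppressed_script("de-Latn-DE-1996"))
print(drop_suppressed_script("uz-Latn-UZ"))
```

With this table, the script is removed from the German tag and left alone in the Uzbek one, because the table has no entry for "uz". Dropping the region as well is a judgment call by the tagger and is not attempted here.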

The fact that a language subtag has (or lacks) a Suppress-Script field does not necessarily imply that the variant subtag has the same script association, or the same lack of one. Therefore I think that variant registrations should be "hypercorrect" when they specify which Prefix the variant can be used with, but leave it to the taggers to drop unnecessary subtags.

If one cannot use the Prefix field to do this, then one should invent a new field for this particular purpose. Perhaps a Relates-to field.
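As a rough sketch of how such "hypercorrect" prefixes could be checked against the shorter tags people actually write, the Python below treats a tag as acceptable when none of its language, script, or region subtags contradicts one of the registered prefixes. This is only my reading of the proposal, not a rule from the registry procedure; the helper names and the prefix list (written in BCP 47 order) are assumptions for illustration.

```python
def parse(tag):
    """Split a simple tag into language, script, region, and the rest.

    Simplification (assumption): no extlang, extension, or private-use
    subtags, and no grandfathered tags.
    """
    subtags = tag.split("-")
    parts = {"language": subtags[0].lower(), "script": None, "region": None}
    rest = subtags[1:]
    # Script: four letters, directly after the language.
    if rest and len(rest[0]) == 4 and rest[0].isalpha():
        parts["script"] = rest.pop(0).title()
    # Region: two letters or three digits.
    if rest and ((len(rest[0]) == 2 and rest[0].isalpha())
                 or (len(rest[0]) == 3 and rest[0].isdigit())):
        parts["region"] = rest.pop(0).upper()
    parts["variants"] = [s.lower() for s in rest]
    return parts


def compatible(tag, prefix):
    """True if the tag does not contradict the (fully explicit) prefix."""
    t, p = parse(tag), parse(prefix)
    if t["language"] != p["language"]:
        return False
    for key in ("script", "region"):
        # A subtag omitted on either side is not a contradiction.
        if t[key] is not None and p[key] is not None and t[key] != p[key]:
            return False
    return True


def allowed(tag, prefixes):
    """True if the tag is compatible with at least one registered prefix."""
    return any(compatible(tag, prefix) for prefix in prefixes)


# Hypothetical prefix list for the "1996" variant, following the proposal.
PREFIXES_1996 = ["de-Latn-DE", "uz-Latn-UZ"]

for candidate in ("de-1996", "de-DE-1996", "de-CH-1996", "fr-1996"):
    print(candidate, allowed(candidate, PREFIXES_1996))
```

Under these assumptions "de-1996" and "de-DE-1996" pass, while "de-CH-1996" and "fr-1996" are rejected, which is the distinction the explicit prefixes are meant to make.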

Option B.

<t>Additional four digit variant subtags MUST NOT be registered.</t>

It would be hard to live with Option B.
--
leif halvard silli
_______________________________________________
Ltru mailing list
Ltru at ietf.org
https://www.ietf.org/mailman/listinfo/ltru






Note Well: Messages sent to this mailing list are the opinions of the senders and do not imply endorsement by the IETF.