
RE: [Ltru] Re: Proposed Text for Moving Forward



> Now, probing a bit deeper, (and these may be purely theoretical
> problems, and could be dismissed as such) do we have any
> cases like this:
>     language L in general is written in script S1 overwhelmingly,
>     and there is a substantial corpus of data tagged simply "L".
>     language L has several important variants, V1..Vn
>     V2, a rather rare variant, uses script S2 overwhelmingly
> 
> If there are cases like this (for example, a hypothetical community
> of German speakers using the Hebrew alphabet in a manner similar to
> Yiddish, or a community of Yiddish speakers using the Latin alphabet)
> would there be any need to represent this in the registry?
> The tag L-S2-V2 would be preferred to the tag L-V2, right?  How would
> a developer looking at the registry determine this?


That is the beauty of "Suppress_Script": it suppresses only S1 and has no effect on S2. A user might still choose (thanks to escape clauses in the rules) to use L-S1 for some other content in the same context. I think you're thinking of rule 3:

3. If the script of the content matches neither field (or the fields are both unpopulated), the script subtag SHOULD NOT be used to form language tags for that language unless it adds specific information to the language tag which is required by the application.

We should probably modify that to say:

3. If the script doesn't match any of the "Require_Script" subtags, then the script subtag SHOULD be used to form the language tag as long as it adds useful information to the tag, unless it matches a "Suppress_Script" subtag for that language (in which case it SHOULD NOT be used to form the language tag).

(IOW, these fields only affect scripts that match their entries and nothing else. The previous rule about not using subtags that add no information still applies.)

Then consider:

Subtag: de
Suppress_Script: Latn

"de-DE" // no Latn
"de-Yiii-AQ" // Yiii not suppressed

Subtag: zh
Require_Script: Hant, Hans
Suppress_Script: Yyyy // example only

"zh-Hant-TW"
"zh-Hans-CN"
"zh-AQ" // Yyyy suppressed
"zh-Latn-AQ" // Latn not suppressed for zh

The only thing that matters in this design is the majority script. Suppress only suppresses the named script. It has no effect on a language variation that uses another script unless that script is specifically suppressed.
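
To make the examples above concrete, here is a rough sketch of how an implementation might apply these fields when forming a tag. It is only an illustration under stated assumptions: the REGISTRY dictionary, the form_tag function, and the script_adds_info flag are hypothetical names invented for this sketch, the records simply mirror the "de" and "zh" examples (with "Yyyy" as a placeholder script), and letting "Require_Script" win if a script were ever listed in both fields is my own guess, since the proposal does not address that case.

```python
from typing import Optional

# Hypothetical registry records mirroring the examples in this message.
# "Yyyy" is a placeholder script used for illustration only.
REGISTRY = {
    "de": {"require": set(), "suppress": {"Latn"}},
    "zh": {"require": {"Hant", "Hans"}, "suppress": {"Yyyy"}},
}


def form_tag(language: str,
             script: Optional[str] = None,
             region: Optional[str] = None,
             script_adds_info: bool = True) -> str:
    """Form a language tag, applying the proposed script fields.

    script_adds_info stands in for the application's own judgment about
    whether a script that matches neither field adds useful information.
    """
    record = REGISTRY.get(language, {"require": set(), "suppress": set()})
    parts = [language]
    if script is not None:
        if script in record["require"]:
            include = True       # a required script SHOULD appear
        elif script in record["suppress"]:
            include = False      # only the named script is suppressed
        else:
            include = script_adds_info  # any other script is unaffected
        if include:
            parts.append(script)
    if region is not None:
        parts.append(region)
    return "-".join(parts)


if __name__ == "__main__":
    for args in [("de", "Latn", "DE"), ("de", "Yiii", "AQ"),
                 ("zh", "Hant", "TW"), ("zh", "Hans", "CN"),
                 ("zh", "Yyyy", "AQ"), ("zh", "Latn", "AQ")]:
        print(args, "->", form_tag(*args))
```

The point of the third branch is that a script matching neither field passes through untouched, which is why "de" with a non-Latin script keeps its script subtag while "de" with Latn does not.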

Addison

Addison P. Phillips
Globalization Architect, Quest Software
Chair, W3C Internationalization Core Working Group

Internationalization is not a feature.
It is an architecture. 

> -----Original Message-----
> From: ltru-bounces at lists.ietf.org [mailto:ltru-bounces at lists.ietf.org] On
> Behalf Of Randy Presuhn
> Sent: Friday, April 15, 2005 14:51
> To: ltru at ietf.org
> Subject: Re: [Ltru] Re: Proposed Text for Moving Forward
> 
> Hi -
> 
> > From: "Addison Phillips" <addison.phillips at quest.com>
> > To: "Randy Presuhn" <randy_presuhn at mindspring.com>; <ltru at ietf.org>
> > Sent: Friday, April 15, 2005 2:07 PM
> > Subject: RE: [Ltru] Re: Proposed Text for Moving Forward
> ...
> > So the mechanism works for you?
> ...
> 
> It seems to be able to produce the desired results for the cases
> where I have first-hand knowledge of the language, and is easier
> to understand than the approaches that required counting.
> 
> If the WG converges on this approach, what will be critical is the
> clarity of the SHOULDs and SHOULD NOTs.  It might be helpful
> to include an example explaining why tagging data with en-Latn
> would normally be a bad idea, in order to give implementers a
> serious clue about just how strong the SHOULD NOTs are
> meant to be.
> 
> Now, probing a bit deeper, (and these may be purely theoretical
> problems, and could be dismissed as such) do we have any
> cases like this:
>     language L in general is written in script S1 overwhelmingly,
>     and there is a substantial corpus of data tagged simply "L".
>     language L has several important variants, V1..Vn
>     V2, a rather rare variant, uses script S2 overwhelmingly
> 
> If there are cases like this (for example, a hypothetical community
> of German speakers using the Hebrew alphabet in a manner similar to
> Yiddish, or a community of Yiddish speakers using the Latin alphabet)
> would there be any need to represent this in the registry?
> The tag L-S2-V2 would be preferred to the tag L-V2, right?  How would
> a developer looking at the registry determine this?
> 
> Randy
> 
> 
> 
> 
> _______________________________________________
> Ltru mailing list
> Ltru at lists.ietf.org
> https://www1.ietf.org/mailman/listinfo/ltru





Note Well: Messages sent to this mailing list are the opinions of the senders and do not imply endorsement by the IETF.