
Re: [Ltru] Changes to draft-4645bis resulting from new draft-4646bis




Mark

On Mon, Jul 21, 2008 at 2:01 PM, Phillips, Addison <addison at amazon.com> wrote:

Comments follow.

 

Addison Phillips

Globalization Architect -- Lab126

 

Internationalization is not a feature.

It is an architecture.





Issues that we still haven't resolved include:

3.  whether to change the list of languages that have extlangs (currently ar, kok, ms, sgn, sw, uz, zh), and if we do change it, what explanation to provide for the choice in draft-4645bis


Unless someone makes a good argument for changing them, I suggest we just go forward with these. I think we have enough text in 4.1.2 already.

 

+1


There are a few other items that seem to not be in http://inter-locale.com/ID/draft-ietf-ltru-4646bis-17.html yet, I think because we may not have had specific language. I think we were in agreement that we need to drop the Deprecated from the Extlang cases, and tie the Preferred Value closer to canonicalization.

Here are my suggestions:

> Extended language subtag records MUST include a 'Preferred-Value' and 'Deprecated' field. The 'Preferred-Value' and 'Subtag' fields MUST be identical.
=>
Extended language subtag records MUST include a 'Preferred-Value' field. The 'Preferred-Value' and 'Subtag' fields MUST be identical.

 

(editor hat) DONE.



> Although valid in language tags, subtags and tags with a 'Deprecated' field are deprecated and validating processors SHOULD NOT generate these subtags.
ADD: However, for backwards compatibility the deprecated code may be preferred in many contexts: see 3.1.7.  Preferred-Value Field.

 

(editor hat OFF). I don't think this quite jibes with removing 'Deprecated'. If we remove deprecated from extlang records, we don't have to say anything else here. What we do have to say is something about preferred values later.

ok



> The value in this field is strongly RECOMMENDED as the best choice to represent the value of this record when selecting a language tag.
=>
The value in this field is used for canonicalization, and shows that the tags SHOULD be treated as equivalent on input. If the subtag or tag is also Deprecated, then the Preferred Value is RECOMMENDED. However, for backwards compatibility the deprecated code may be preferred in many contexts. For example, both "iw" and "he" can be used in the Java programming language, but "he" is converted on input to "iw", which is thus the canonical form in Java.

 

(editor hat OFF) The iw/he example is orthogonal to the extlang problem we're addressing here. I'm not sure it makes sense to thrust this example in at this juncture.

 

(editor hat) This text also seemed a little awkward to me and a rewrite was in order. I didn't ignore your suggestions, but had to reorganize to achieve the right result (such as adding a case to the former "three cases" list). Here is the resulting text:

 

--

<section anchor="preferredfield" title="Preferred-Value Field">

                                                                          

<t>The field 'Preferred-Value' contains a mapping between the record in which it appears and another tag or subtag (depending on the record's 'Type'). The value in this field is used for canonicalization (see <xref target="canonical"></xref>). In cases where the subtag or tag also has a 'Deprecated' field, then the 'Preferred-Value' is RECOMMENDED as the best choice to represent the value of this record when selecting a language tag. </t>

 

<t>Records containing a Preferred-Value fall into one of these four groups:

 

<list style="numbers">

                                                                                                          

   <t>ISO 639 language codes that were later withdrawn in favor of other codes. These values are mostly a historical curiosity. The 'he'/'iw' pairing above is an example of this.</t>

                                                                                                          

   <t>Subtags (with types other than language or extlang) taken from codes or values that have been withdrawn in favor of a new code. In particular, this applies to region subtags taken from ISO 3166-1, because sometimes a country will change its name or administration in such a way that warrants a new region code. In some cases, countries have reverted to an older name, which might already be encoded. For example, the subtag 'ZR' (Zaire) was replaced by the subtag 'CD' (Democratic Republic of the Congo) when that country's name was changed.</t>

                                                                                                          

   <t>Tags or subtags that have become obsolete because the values they represent were later encoded. Many of the grandfathered or redundant tags were later encoded by ISO 639, for example, and fall into this grouping. For example, "i-klingon" was deprecated when the subtag 'tlh' was added. The record for "i-klingon" has a 'Preferred-Value' of 'tlh'.</t>

 

   <t>Extended language subtags always have a mapping to their identical primary language subtag. For example, the extended language subtag 'yue' (Cantonese) can be used to form the tag "zh-yue". It has a Preferred-Value mapping to the primary language subtag 'yue', meaning that a tag such as "zh-yue-Hant-HK" can be canonicalized to "yue-Hant-HK".</t>

                                                                                          

</list>

</t>

                                                                          

<t>Records other than those of type 'extlang' that contain a 'Preferred-Value' field MUST also have a 'Deprecated' field. This field contains the date on which the tag or subtag was deprecated in favor of the preferred value.</t><t>For records of type 'extlang', the 'Preferred-Value' field appears without a corresponding 'Deprecated' field. An implementation MAY ignore these preferred value mappings, although if it ignores the mapping, it SHOULD do so consistently. It SHOULD also treat the Preferred-Value as equivalent to the mapped item. For example, the tags "zh-yue-Hant-HK" and "yue-Hant-HK" are semantically equivalent and ought to be treated as if they were the same tag.</t>


all good (we could make the first MUST be a SHOULD, but I'm ok with your text).

 

<t>Occasionally the deprecated code is preferred in certain contexts. For example, both "iw" and "he" can be used in the Java programming language, but "he" is converted on input to "iw", which is thus the canonical form in Java. </t>

                                                                          

<t>'Preferred-Value' mappings in records of type 'region' sometimes do not represent exactly the same meaning as the original value. There are many reasons for a country code to be changed, and the effect this has on the formation of language tags will depend on the nature of the change in question. For example, the region subtag 'YD' (Democratic Yemen) was deprecated in favor of the subtag 'YE' (Yemen) when those two countries unified in 1990.</t>

                                                                          

<t>A 'Preferred-Value' MAY be added to, changed, or removed from records according to the rules in <xref target="maintreg"/>. Addition, modification, or removal of a 'Preferred-Value' field in a record does not imply that content using the affected subtag needs to be retagged.</t>

                                                                          

                                                                          

<t>The 'Preferred-Value' fields in records of type "grandfathered" and "redundant" each contain an "extended language range" (<xref target="RFC4647"></xref>) that is strongly RECOMMENDED for use in place of the record's value. In many cases, these mappings were created via deprecation of the tags during the period before <xref target="RFC4646"/> was adopted. For example, the tag "no-nyn" was deprecated in favor of the ISO 639-1-defined language code 'nn'. </t><t>The 'Preferred-Value' field in subtag records of type "extlang" also contains an "extended language range". This allows the subtag to be deprecated in favor of either a single primary language subtag or a new language-extlang sequence.</t>

 

<t>Usually the addition, removal, or change of a Preferred-Value field for a

subtag is done to reflect changes in one of the source standards. For example, if an ISO 3166-1 region code is deprecated in favor of

another code, that SHOULD result in the addition of a Preferred-Value field.

</t>

 

<t>Changes to one subtag MAY affect other subtags as well: when proposing changes to the registry, the Language Subtag Reviewer will review the registry for such effects and propose the necessary changes using the process in <xref target="registrationProc"></xref>, although anyone MAY request such changes. For example:

 

<list>

                                                                                                          

   <t>Suppose that subtag 'XX' has a Preferred-Value of 'YY'. If 'YY' later changes to

have a Preferred-Value of 'ZZ', then the Preferred-Value for 'XX' MUST also change

to be 'ZZ'.</t>

                                                                                                          

   <t>Suppose that a registered language subtag 'dialect' represents a language not

yet available in any part of ISO 639. The later addition of a corresponding language

code in ISO 639 SHOULD result in the addition of a Preferred-Value for

'dialect'.</t>

                                                                                          

</list>

</t>

</section>

--

good
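
To make the extlang case concrete, here is a rough Python sketch of how the Preferred-Value mappings described in the section above might drive canonicalization. The tables are a hand-copied excerpt of the examples in the draft text, and every name in it (canonicalize, EXTLANG, and so on) is made up for illustration. It is one reading of the text, not the algorithm from the draft; it assumes a well-formed tag and leaves everything after a singleton untouched.

```python
# Hypothetical excerpt of registry data, taken from the examples in the
# draft text. A real implementation would parse the registry itself.
TAG_PREFERRED = {"i-klingon": "tlh", "no-nyn": "nn"}   # grandfathered tags
LANGUAGE_PREFERRED = {"iw": "he"}                      # withdrawn ISO 639 code
REGION_PREFERRED = {"ZR": "CD", "YD": "YE"}            # withdrawn ISO 3166-1 codes
# Extlang records: Preferred-Value is identical to the subtag itself.
EXTLANG = {"yue": {"prefix": "zh", "preferred": "yue"}}


def canonicalize(tag: str) -> str:
    """Apply Preferred-Value mappings to a well-formed language tag."""
    # Whole-tag mappings (grandfathered/redundant records) come first.
    whole = TAG_PREFERRED.get(tag.lower())
    if whole is not None:
        return whole

    subtags = tag.split("-")
    language = subtags[0].lower()
    rest = subtags[1:]

    # An extlang directly follows its prefix; its Preferred-Value replaces
    # the whole language-extlang pair.
    if rest:
        record = EXTLANG.get(rest[0].lower())
        if record is not None and record["prefix"] == language:
            language = record["preferred"]
            rest = rest[1:]

    language = LANGUAGE_PREFERRED.get(language, language)

    out = [language]
    after_singleton = False
    for sub in rest:
        if len(sub) == 1:
            # Extension or private-use section: leave the remainder alone.
            after_singleton = True
        if not after_singleton and len(sub) == 2 and sub.isalpha():
            # Two-letter subtags in this position are regions.
            sub = REGION_PREFERRED.get(sub.upper(), sub.upper())
        out.append(sub)
    return "-".join(out)
```

With these tables, canonicalize("zh-yue-Hant-HK") and canonicalize("yue-Hant-HK") give the same string, which is the equivalence the extlang paragraph asks implementations to honor.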

 

 



"Records that contain a 'Preferred-Value' field MUST also have a 'Deprecated' field. This field contains the date on which the tag or subtag was deprecated in favor of the preferred value."
REMOVE

"The 'Preferred-Value' field in subtag records of type "extlang" also contains an "extended language range". This allows the subtag to be deprecated in favor of either a single primary language subtag or a new language-extlang sequence."
=>
The 'Preferred-Value' field in subtag records of type "extlang" also contains an "extended language range". This allows either a single primary language subtag or a new language-extlang sequence to be preferred.


> Extended language subtag records MUST include the fields 'Prefix', 'Deprecated', and 'Preferred-Value' with field-values assigned as described in Section 2.2.2 (Extended Language Subtags).
=>
Extended language subtag records MUST include the fields 'Prefix' and 'Preferred-Value' with field-values assigned as described in Section 2.2.2 (Extended Language Subtags).

 

(editor hat) DONE.
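
For anyone who wants to check registry data against the field rules as they now stand, a small Python sketch follows. It assumes records are already parsed into dicts keyed by field name; check_record and the sample records are hypothetical, and the records are abbreviated (a real record carries more fields, such as Description and Added).

```python
def check_record(record: dict) -> list:
    """Return a list of rule violations for one registry record."""
    problems = []
    has_preferred = "Preferred-Value" in record

    if record.get("Type") == "extlang":
        # Extlang records MUST include 'Prefix' and 'Preferred-Value'.
        for field in ("Prefix", "Preferred-Value"):
            if field not in record:
                problems.append("extlang record lacks " + field)
        # 'Preferred-Value' and 'Subtag' MUST be identical.
        if has_preferred and record["Preferred-Value"] != record.get("Subtag"):
            problems.append("Preferred-Value differs from Subtag")
        # No 'Deprecated' requirement here: that is the change under discussion.
    elif has_preferred and "Deprecated" not in record:
        # Other record types still need 'Deprecated' next to 'Preferred-Value'.
        problems.append("Preferred-Value without Deprecated")

    return problems


# Abbreviated, illustrative records.
good_extlang = {"Type": "extlang", "Subtag": "yue",
                "Prefix": "zh", "Preferred-Value": "yue"}
bad_region = {"Type": "region", "Subtag": "ZR", "Preferred-Value": "CD"}
```

Here check_record(good_extlang) returns an empty list, while check_record(bad_region) reports the missing 'Deprecated' field.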



> For example, use 'he' for Hebrew in preference to 'iw'.
ADD: However, for backwards compatibility the deprecated code may be preferred in many contexts: see 3.1.7.  Preferred-Value Field.

 

(editor hat OFF) Do we really need to litter the document for this one case?

No, I guess as long as we have it in one place. (We do tend to repeat the same MUSTs and SHOULDs with these things in different places....)

 

 

(editor hat) Changed the 'he'/'iw' example here to 'jbo'/"art-lojban".

ok



> However, tags that include these values SHOULD NOT be selected by users or generated by implementations.
=> Tags that include these values SHOULD NOT be selected by users or generated by implementations.
However, for backwards compatibility the deprecated code may be preferred in many contexts: see 3.1.7.  Preferred-Value Field.

 

(editor hat OFF) This doesn't belong here? The context of this sentence is subtags that are deprecated with no P-V. The item says "SHOULD NOT" on purpose and I don't think we need to belabor it in this instance.


ok, good



4.  whether extlangs can have Suppress-Script (question asked by Addison with special attention to the Arabic extlangs)


We are treating the two forms as equivalent. So I think we can just add the following:

[in 3.1.9.  Suppress-Script Field]
ADD AT END
Tags formed using extlangs are equivalent to those with the extlang subtag as the primary language subtag; thus the Suppress-Script field for the primary language subtag applies to tags using the corresponding extlang. For example, if 'abh' has Suppress-Script: Arab, then Arab should be suppressed from both ar-abh-AF and abh-AF.

 

(editor hat OFF) Wouldn't we just include the S-S field in the records of type extlang?


Because there is always the chance that we screw up the synchronization. If we include the S-S by reference, that can't happen. If we include the S-S field in extlang records, then we also have to add a rule for the Language Subtag Reviewer that every extlang MUST have an S-S if and only if the corresponding primary language subtag does.

 

(editor hat) I did the necessary edits to do it that way.
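
A rough Python sketch of the by-reference approach, in case it helps reviewers: the Suppress-Script value is stored only on the primary language subtag, and a tag that uses the corresponding extlang looks it up there. The table entry for 'abh' simply mirrors the "if" in the example above and is not a claim about the registry; strip_suppressed_script is a hypothetical helper that assumes a well-formed tag.

```python
# Hypothetical: Suppress-Script lives only on primary language records.
LANGUAGE_SUPPRESS_SCRIPT = {"abh": "Arab"}


def strip_suppressed_script(tag: str) -> str:
    """Drop a script subtag that the language's Suppress-Script covers."""
    subtags = tag.split("-")
    pos = 1
    effective = subtags[0].lower()

    # A three-letter subtag right after the primary language is an extlang;
    # it is looked up as if it were the primary language subtag.
    if len(subtags) > pos and len(subtags[pos]) == 3 and subtags[pos].isalpha():
        effective = subtags[pos].lower()
        pos += 1

    suppressed = LANGUAGE_SUPPRESS_SCRIPT.get(effective)
    if (suppressed is not None and len(subtags) > pos
            and subtags[pos].lower() == suppressed.lower()):
        del subtags[pos]
    return "-".join(subtags)
```

Because there is one table, the extlang form and the primary-language form cannot drift apart, which is the synchronization point made above.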



I would hope that we can resolve these remaining questions with at least as much analytical effort and energy as we devoted to the largely theoretical question of arranging multiple prefixless variants within a tag.


agreed, glad you refocused this.
 



--
Doug Ewell  *  Thornton, Colorado, USA  *  RFC 4645  *  UTN #14
http://www.ewellic.org
http://www1.ietf.org/html.charters/ltru-charter.html
http://www.alvestrand.no/mailman/listinfo/ietf-languages

_______________________________________________
Ltru mailing list
Ltru at ietf.org
https://www.ietf.org/mailman/listinfo/ltru

 



Note Well: Messages sent to this mailing list are the opinions of the senders and do not imply endorsement by the IETF.