I incorporated the below with only very minor tidying.
Addison
John Cowan wrote:
Because of the thicket of rewordings in this part, I'm just presenting
my suggested revised text here. It's very important to make sure
that we don't talk about "dialects" or "sub-languages" here. Also,
I've used "Macrolanguage" for the header only, but "macrolanguage"
for the languages.
The affected text begins "Languages with a Macrolanguage field" and ends
"did not specify zh-Hans-CN in their request.)".
Some of the languages in the registry are labeled
"macrolanguages" by ISO 639-3, which defines the term as
"clusters of closely-related language varieties that [...] can
be considered distinct individual languages, yet in certain
usage contexts a single language identity for all is needed".
These correspond to codes registered in ISO 639-2 as single
languages that were found to correspond to more than one language
in ISO 639-3. The languages encompassed by a macrolanguage
contain a Macrolanguage header in the registry; the macrolanguages
themselves are not specially marked.
It is always permitted, and sometimes useful, to tag an
encompassed language using the subtag for its macrolanguage.
However, the Macrolanguage field doesn't define what the
relationship is between the encompassed language and its
macrolanguage, nor does it define how languages encompassed by the
same macrolanguage are related to each other. In some cases,
one of the encompassed languages serves as a standard
form for the entire macrolanguage and is frequently identified
with it; in other cases there is no dominant language, and the
macrolanguage simply serves as a cover term for the entire group.
Applications MAY use macrolanguage information to improve matching
or language negotiation. For example, the information that 'sr'
(Serbian) and 'hr' (Croatian) share a macrolanguage expresses
a closer relation between those languages than between, say,
'sr' (Serbian) and 'mk' (Macedonian). It is valid to use the
subtag of the encompassed language or of the macrolanguage to
form language tags. However, many matching applications will
not be aware of the relationship between the languages. Care in
selecting which subtags are used is crucial to interoperability.
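As a rough illustration of the matching hint above, an application
could consult a table of Macrolanguage values to test whether two
primary language subtags share a macrolanguage. This is only a
sketch: the table below is a small hand-copied excerpt rather than
the full registry, and the names MACROLANGUAGE and
share_macrolanguage are hypothetical.

```python
# Hypothetical, partial excerpt of Macrolanguage values from the registry.
# A real application would load the complete registry instead.
MACROLANGUAGE = {
    "sr": "sh",   # Serbian    -> Serbo-Croatian
    "hr": "sh",   # Croatian   -> Serbo-Croatian
    "bs": "sh",   # Bosnian    -> Serbo-Croatian
    "yue": "zh",  # Cantonese  -> Chinese
    "cmn": "zh",  # Mandarin   -> Chinese
}


def share_macrolanguage(a: str, b: str) -> bool:
    """Return True if both primary language subtags have the same macrolanguage."""
    macro_a = MACROLANGUAGE.get(a.lower())
    macro_b = MACROLANGUAGE.get(b.lower())
    return macro_a is not None and macro_a == macro_b


print(share_macrolanguage("sr", "hr"))  # Serbian and Croatian
print(share_macrolanguage("sr", "mk"))  # Serbian and Macedonian
```

A matcher might use such a test only as a tie-breaker, since the
field says nothing about how closely the two languages are related.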
In general, use the most specific tag. However, where the
macrolanguage tag has been historically used to denote a dominant
encompassed language, it SHOULD be used in place of the subtag
specific to that encompassed language unless it is necessary
to clearly distinguish the macrolanguage as a whole from the
dominant language variety.
In particular, the Chinese family of languages calls for special
consideration. Because the written form is very similar for most
languages having 'zh' as a macrolanguage (and because historically
subtags for the various encompassed languages were not available),
languages such as 'yue' (Cantonese) have historically used
either 'zh' or a tag (now grandfathered) beginning with 'zh'.
This means that macrolanguage information can be usefully
applied when searching for content or when providing fallbacks
in language negotiation. For example, the information that 'yue'
has a macrolanguage of 'zh' could be used in the Lookup algorithm
to fall back from a request for "yue-Hans-CN" to "zh-Hans-CN"
without losing the script and region information (even though
the user did not specify "zh-Hans-CN" in their request).
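The fallback described above could be sketched roughly as follows.
This is not the Lookup algorithm as specified, only one possible
extension of it under stated assumptions: the function names and the
small MACROLANGUAGE table are hypothetical, and the choice to try
every truncation of the requested tag before trying the
macrolanguage form is a design decision, not something the text
requires.

```python
# Hypothetical, partial excerpt of Macrolanguage values from the registry.
MACROLANGUAGE = {"yue": "zh", "cmn": "zh"}


def truncations(tag):
    """Yield the tag, then progressively shorter prefixes, Lookup-style."""
    subtags = tag.split("-")
    while subtags:
        yield "-".join(subtags)
        subtags.pop()
        # Lookup also drops a trailing single-character subtag (e.g. "x").
        if subtags and len(subtags[-1]) == 1:
            subtags.pop()


def lookup(requested, available, default=None):
    """Return the best available tag for the request, or the default."""
    have = {t.lower(): t for t in available}
    candidates = list(truncations(requested))

    # Assumption: after exhausting the requested tag, retry with the
    # primary subtag swapped for its macrolanguage, keeping the
    # script and region subtags.
    primary, *rest = requested.split("-")
    macro = MACROLANGUAGE.get(primary.lower())
    if macro is not None:
        candidates += truncations("-".join([macro] + rest))

    for candidate in candidates:
        match = have.get(candidate.lower())
        if match is not None:
            return match
    return default


print(lookup("yue-Hans-CN", ["en", "zh-Hans-CN"], default="en"))
```

With this ordering the candidates for "yue-Hans-CN" are tried as
"yue-Hans-CN", "yue-Hans", "yue", "zh-Hans-CN", "zh-Hans", "zh", so
content tagged with plain "yue" is preferred over "zh-Hans-CN" when
both are available.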
--
John Cowan cowan at ccil.org http://www.ccil.org/~cowan
Any day you get all five woodpeckers is a good day. --Elliotte Rusty Harold
_______________________________________________
Ltru mailing list
Ltru at ietf.org
https://www1.ietf.org/mailman/listinfo/ltru
--
Addison Phillips
Globalization Architect -- Yahoo! Inc.
Chair -- W3C Internationalization Core WG
Internationalization is an architecture.
It is not a feature.
Note Well: Messages sent to this mailing list are the opinions of the senders and do not imply endorsement by the IETF.