
Re: [Ltru] Review of 4646bis-10, macrolanguages in section 4.1



Firstly, let me express my Deep Annoyance at getting emails with proposed text with line breaks and big wads of whitespace in front of each line. Yes, a regular expression will tame the beast, but I'd prefer to cut-and-paste a bit more :-)

I incorporated the below with only very minor tidying.

Addison

John Cowan wrote:
Because of the thicket of rewordings in this part, I'm just presenting
my suggested revised text here.  It's very important to make sure
that we don't talk about "dialects" or "sub-languages" here.  Also,
I've used "Macrolanguage" for the header only, but "macrolanguage"
for the languages.

The affected text begins "Languages with a Macrolanguage field" and ends
"did not specify zh-Hans-CN in their request.)".

        Some of the languages in the registry are labeled
        "macrolanguages" by ISO 639-3, which defines the term as
        "clusters of closely-related language varieties that [...] can
        be considered distinct individual languages, yet in certain
        usage contexts a single language identity for all is needed".
        These correspond to codes registered in ISO 639-2 as single
        languages that were found to correspond to more than one language
        in ISO 639-3.  The languages encompassed by a macrolanguage
        contain a Macrolanguage header in the registry; the macrolanguages
        themselves are not specially marked.

        It is always permitted, and sometimes useful, to tag an
        encompassed language using the subtag for its macrolanguage.
        However, the Macrolanguage field doesn't define what the
        relationship is between the encompassed language and its
        macrolanguage, nor does it define how languages encompassed by the
        same macrolanguage are related to each other.  In some cases,
        one of the encompassed languages serves as a standard
        form for the entire macrolanguage and is frequently identified
        with it; in other cases there is no dominant language, and the
        macrolanguage simply serves as a cover term for the entire group.

        Applications MAY use macrolanguage information to improve matching
        or language negotiation.  For example, the information that 'sr'
        (Serbian) and 'hr' (Croatian) share a macrolanguage expresses
        a closer relation between those languages than between, say,
        'sr' (Serbian) and 'mk' (Macedonian).  It is valid to use the
        subtag of the encompassed language or of the macrolanguage to
        form language tags.  However, many matching applications will
        not be aware of the relationship between the languages.  Care in
        selecting which subtags are used is crucial to interoperability.
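
A minimal sketch of how an application might use this information when
matching, assuming a small hand-copied excerpt of the registry's
Macrolanguage data. The table and the function name share_macrolanguage
are hypothetical, for illustration only; a real implementation would read
the registry itself.

```python
# Hypothetical excerpt of registry data: encompassed language -> macrolanguage.
# 'mk' (Macedonian) is deliberately absent: it has no macrolanguage.
MACROLANGUAGE = {
    "sr": "sh",   # Serbian
    "hr": "sh",   # Croatian
    "bs": "sh",   # Bosnian
    "yue": "zh",  # Cantonese
    "cmn": "zh",  # Mandarin
}


def share_macrolanguage(a, b):
    """Return True if both primary language subtags name the same macrolanguage."""
    macro_a = MACROLANGUAGE.get(a.lower())
    return macro_a is not None and macro_a == MACROLANGUAGE.get(b.lower())


print(share_macrolanguage("sr", "hr"))  # closer relation
print(share_macrolanguage("sr", "mk"))  # no shared macrolanguage
```

This only says that two languages are grouped together; as the text above
notes, it says nothing about what the relationship between them is.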

        In general, use the most specific tag.  However, where the
        macrolanguage tag has been historically used to denote a dominant
        encompassed language, it SHOULD be used in place of the subtag
        specific to that encompassed language unless it is necessary
        to clearly distinguish the macrolanguage as a whole from the
        dominant language variety.

        In particular, the Chinese family of languages calls for special
        consideration.  Because the written form is very similar for most
        languages having 'zh' as a macrolanguage (and because historically
        subtags for the various encompassed languages were not available),
        languages such as 'yue' (Cantonese) have historically used
        either 'zh' or a tag (now grandfathered) beginning with 'zh'.
        This means that macrolanguage information can be usefully
        applied when searching for content or when providing fallbacks
        in language negotiation.  For example, the information that 'yue'
        has a macrolanguage of 'zh' could be used in the Lookup algorithm
        to fall back from a request for "yue-Hans-CN" to "zh-Hans-CN"
        without losing the script and region information (even though
        the user did not specify "zh-Hans-CN" in their request).
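
The fallback described above could be sketched roughly as follows. This is
an illustration under stated assumptions, not the Lookup algorithm as
specified: it tries the usual progressive truncation of the requested
range first, and only then retries with the primary subtag swapped for
its macrolanguage. That ordering, the MACROLANGUAGE table, and the
function names are all hypothetical choices made for the example.

```python
# Hypothetical excerpt of registry data: encompassed language -> macrolanguage.
MACROLANGUAGE = {"yue": "zh", "cmn": "zh"}


def truncations(tag):
    """Yield the tag, then progressively shorter prefixes (Lookup-style)."""
    subtags = tag.split("-")
    while subtags:
        yield "-".join(subtags)
        subtags.pop()
        # Lookup also drops a trailing single-character subtag (e.g. "x").
        if subtags and len(subtags[-1]) == 1:
            subtags.pop()


def lookup(language_range, available, default=None):
    """Return the best available tag for the range, or the default."""
    by_lower = {tag.lower(): tag for tag in available}
    candidates = list(truncations(language_range))

    # Assumption: macrolanguage fallback is tried only after the plain
    # truncations fail, keeping the script and region subtags intact.
    primary, *rest = language_range.split("-")
    macro = MACROLANGUAGE.get(primary.lower())
    if macro is not None:
        candidates += list(truncations("-".join([macro] + rest)))

    for candidate in candidates:
        match = by_lower.get(candidate.lower())
        if match is not None:
            return match
    return default


print(lookup("yue-Hans-CN", ["en", "zh-Hans-CN"]))
```

With only "en" and "zh-Hans-CN" available, the candidates are tried in the
order yue-Hans-CN, yue-Hans, yue, zh-Hans-CN, zh-Hans, zh, so the request
lands on "zh-Hans-CN" with script and region preserved.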

--
John Cowan              cowan at ccil.org          http://www.ccil.org/~cowan
Any day you get all five woodpeckers is a good day.  --Elliotte Rusty Harold


_______________________________________________
Ltru mailing list
Ltru at ietf.org
https://www1.ietf.org/mailman/listinfo/ltru


--
Addison Phillips
Globalization Architect -- Yahoo! Inc.
Chair -- W3C Internationalization Core WG

Internationalization is an architecture.
It is not a feature.






Note Well: Messages sent to this mailing list are the opinions of the senders and do not imply endorsement by the IETF.