>
> This is incorrect on two counts. First, the set of collection codes
> isn't the subset of -2 codes that are also in -5, because that implies
> that the -5 codes that aren't in -2 aren't collection codes. Of course,
> everything in -5 is a collection code.
I removed the reference to -2, since, as you point out, it is a subset of the overall list of collections (and a strict subset at that).
>
> Second, the collection codes in -2 aren't a subset of -5 anyway, because
> of 'bih' and 'day' and 'him' which are in -2 but not -5. Yes, I know --
> this is supposed to be just a one-time synchronization error, an
> isolated anomaly, won't happen again, 639/JAC will be fixing it any day
> now.
I'm temporarily pretending that this glitch is a one-time issue that will disappear pre-publication. But we can take that up in draft-17.
>
> > are included as primary language subtags in the registry. The
> > 'Description' field of a collection typically includes the word
> > "languages" to indicate that it represents more than one language.
>
> Why don't we simply direct the user to look for the magic field "Scope:
> collection" in the Registry, instead of telling her to parse Description
> fields for the word "languages" (which will miss Bihari and Himachali
> anyway) or worrying about whether the Registry gives the original source
> of subtags?
Argh. Good point.
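
A minimal sketch of why the 'Scope' check is the more robust test. The records below are hand-written for illustration in the registry's record-jar style, not copied from the actual registry, and the helper names (`parse_records`, `is_collection`, `description_says_languages`) are hypothetical. The parser is deliberately simplified: it ignores folded lines and keeps only the last value of a repeated field.

```python
# Illustrative records only; 'bih' is shown with a Description that lacks
# the word "languages", as discussed in this thread.
SAMPLE_REGISTRY = """\
Type: language
Subtag: cmc
Description: Chamic languages
Scope: collection
%%
Type: language
Subtag: bih
Description: Bihari
Scope: collection
%%
Type: language
Subtag: de
Description: German
"""


def parse_records(text):
    """Split record-jar text on '%%' and return one dict per record."""
    records = []
    for chunk in text.split("%%"):
        record = {}
        for line in chunk.strip().splitlines():
            name, _, value = line.partition(":")
            record[name.strip()] = value.strip()
        if record:
            records.append(record)
    return records


def is_collection(record):
    """Test the 'Scope' field directly."""
    return record.get("Scope") == "collection"


def description_says_languages(record):
    """The heuristic being replaced: search Description for 'languages'."""
    return "languages" in record.get("Description", "")


records = parse_records(SAMPLE_REGISTRY)
by_scope = [r["Subtag"] for r in records if is_collection(r)]
by_description = [r["Subtag"] for r in records if description_says_languages(r)]

print(by_scope)        # subtags found via the Scope field
print(by_description)  # subtags found via the Description heuristic
```

With these sample records, the Scope test finds both 'cmc' and 'bih', while the Description heuristic finds only 'cmc'.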
>
> > For example, these subtags include "Chamic languages" ('cmc'),
> > "Algonquin languages" ('alg'), and "Germanic languages" ('gem'). Each
> > collection is also represented by subtags for the individual
> > languages. In the case of 'cmc', the registry also contains the values
> > for each of the approximately ten individual languages represented by
> > this collective code.
>
> Should be "In the case of 'cmc', for example, the registry also
> contains...." With the current wording, it looks as though only 'cmc'
> is special in this regard. Maybe I just read the passage too fast, but
> when the draft is 80+ pages long, it's safe to say I won't be the last
> to do so.
+1
>
> > The subtag 'gem' helps illustrate this further: since it is
> > interpreted inclusively, content tagged with "en" (English), "de"
> > (German), or "gsw" (Swiss German, Alemannic) could also (but SHOULD
> > NOT) be tagged with "gem" (Germanic languages). Obviously, the
> > languages in a collection are frequently not mutually intelligible, as
> > this example demonstrates.
>
> Should be "may not be" or "are not necessarily" mutually intelligible.
> This one selected example doesn't demonstrate anything about whether
> languages in a collection are "frequently" not mutually intelligible,
> although we here on the list know that to be the case.
>
Well, maybe not "obviously".
An extensive edit produces:
<t>Use specific language subtags or sequences of subtags in preference to subtags for language collections. A "language collection" is a subtag that represents multiple related languages. These codes, which comprise the set of <xref target="ISO639-5"></xref> codes, are included as primary language subtags in the registry. The 'Scope' field for a collection has the value 'collection'. Two illustrative examples of collections as language subtags are "Chamic languages" ('cmc') and "Germanic languages" ('gem'). Each of these collections is also represented by subtags for the individual languages. In the case of 'cmc', for example, the registry also contains the values for each of the approximately ten individual languages represented by this collective code. Collections are interpreted inclusively, so content tagged with "en" (English), "de" (German), or "gsw" (Swiss German, Alemannic) could (but SHOULD NOT) be tagged with "gem" (Germanic languages), which includes all of these (and other) languages. Languages in a collection are often not mutually intelligible and, while subtags derived from collection codes MAY be used when more specific language information is not available, most tag processes and users do not understand the relationship between the collection and its encompassed languages. Thus, users ought not assume a subtag based on a language collection is a useful means for selecting or identifying content in its encompassed languages.</t>
Addison
_______________________________________________
Ltru mailing list
Ltru at ietf.org
https://www.ietf.org/mailman/listinfo/ltru
Note Well: Messages sent to this mailing list are the opinions of the senders and do not imply endorsement by the IETF.