
[Ltru] ar and other Macrolanguages (was: Re: Macrolanguage, Extlang. The Sami language situation as example)



[technical hat on]

At 05:18 08/05/30, Peter Constable wrote:
>> From: ltru-bounces at ietf.org [mailto:ltru-bounces at ietf.org] On Behalf Of
>> Leif Halvard Silli
>
>> It does anyhow seem as if those who support extlang consider the
>> Macrolanguage information generally relevant, while the other
>> group is more sceptical about how relevant that information is.
>
>Well, what I find striking is that the *only* cases involving 
>macrolanguages that are getting discussed are:
>
>- Chinese
>
>- Norwegian: this has also been mentioned by you, but since "no" has been 
>deprecated in ISO 639 for many years now, and introducing "no-nb"/"no-nn" 
>or "no-nob"/"no-nno" would certainly not be compatible with existing usage.

skipped.

>- Arabic: this has been cited in some examples, but I don't hear anyone 
>sounding eager to use extlang subtags with ar.

One reason we don't see this discussed that often may be that, because
Arabic is customarily written in only one script, and because that
script is phonetic, there are fewer arguments to be made against using
extlang. To be specific:
- There is no problem, as there may be with Chinese, that the script
  is removed first when falling back from the right, and only then
  is the encompassed language removed, which in some cases may
  create the wrong fallback sequences (in others, as for a reader
  who is reasonably fluent in both script varieties but simply prefers
  one, the fallbacks may be just right).
- Because the script is phonetic, anybody who reads standard Arabic
  pretty much should be able to also listen to standard Arabic
  (missing vowel signs complicate the situation a bit). This is
  quite different from the Chinese situation, where being able
  to read Mandarin in no way implies that you are able to
  also understand spoken Mandarin. Thus I think that the need
  for fallbacks similar to "Cantonese, then French, then many
  others, and maybe then or maybe never Mandarin" is way, way
  rarer and less important for Arabic.
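
The right-to-left truncation described in the first point can be sketched
as follows. This is only a minimal illustration, assuming the progressive
truncation used by the Lookup scheme of RFC 4647; the function name
fallback_chain and the sample tags (including the extlang-style
"zh-yue-Hant" and "ar-arz-EG") are chosen here for illustration and are
not taken from the thread.

```python
def fallback_chain(tag):
    """Return the tags tried when a language tag is truncated from the right.

    Assumption: the last subtag is dropped at each step, and a trailing
    single-character subtag (such as "x" or an extension singleton) is
    dropped together with it, since it cannot end a tag.
    """
    subtags = tag.split("-")
    chain = []
    while subtags:
        chain.append("-".join(subtags))
        subtags.pop()
        # Remove any single-character subtag left dangling at the end.
        while subtags and len(subtags[-1]) == 1:
            subtags.pop()
    return chain


# With an extlang-style tag, the script goes first, then the
# encompassed language, and the macrolanguage is reached last.
print(fallback_chain("zh-yue-Hant"))  # ['zh-yue-Hant', 'zh-yue', 'zh']
print(fallback_chain("ar-arz-EG"))    # ['ar-arz-EG', 'ar-arz', 'ar']

# Without the macrolanguage prefix, fallback never reaches "zh".
print(fallback_chain("yue-Hant"))     # ['yue-Hant', 'yue']
```

The first chain shows the ordering at issue for Chinese: the script is
lost before the encompassed language is. The last chain shows the
alternative, where Mandarin is reached only if the user lists it
separately.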

So based on the above analysis, when it comes to ar- for Arabic,
count me in.


>- Don Osborne has several times over the past couple of years mentioned 
>interest in the macrolanguage concept wrt African languages because of 
>(IIUC) trends toward evolution of varieties used for wider communication. 
>But it's not clear to me that macrolanguage is the appropriate concept 
>there, as opposed to a distinct, individual language, and even if these are 
>considered macrolanguages I don't see requests to have extlang for these 
>cases: if anything, the emerging variety used in wider communication makes 
>the local varieties *less* relevant for tagging purposes, not more.

Well, yes, the emerging variety may make the local varieties less
important, but are we not already in exactly this situation for
Chinese and Arabic? For me, the concern here is much more that
these are currently very fluid situations, and, with or without
extlangs, we are not really good at dealing with fluid situations.

>So, there is really *only* one case that people seem to care about (at 
>least, in the near term -- who knows what may happen with Bikol or Zapotec 
>in the future) and that would really be impacted by our current decision, 
>and that's Chinese.
>
>(As much as people don't want to cherry pick arbitrarily, it seems to me 
>that there is an elephant in this room that we're trying to say is no 
>different from the rest of the furniture.)

For me, it's zh- and ar-, at least. But I think the other macrolanguages
with two-letter codes also need careful consideration. Most importantly,
in situations similar to Arabic and Chinese (widely used common written
variety, wide variation when spoken), the considerations I gave above
for Arabic will apply (including the fact that in most cases, the
caveat about missing vowel notation does not apply).
An example may be 'ms' (Malay). However, the documentation in the Ethnologue
for all the three-letter codes is highly confusing, with very recent changes.
But that may be an additional argument for keeping 'ms' at the center.


Regards,    Martin.


#-#-#  Martin J. Du"rst, Assoc. Professor, Aoyama Gakuin University
#-#-#  http://www.sw.it.aoyama.ac.jp       mailto:duerst at it.aoyama.ac.jp     
