
Montenegrin what-if (was RE: [Ltru] Re: Review of 4646bis-10, macrolanguages in section 4.1)



> From: Frank Ellermann [mailto:nobody at xyzzy.claranet.de]

> >> what if ISO 639-3 adds 'ma' to the 'sh' set ?
>
> > Not gonna happen.  Macedonian is bogus Bulgarian,
> > not split-off Serbo-Croat.
>
> Ugh, sorry, I confused "ma" with Montenegrin.

Coding Montenegrin as a distinct individual language would be a bit messy. Right now, sh/hbs encompasses three individual-language entries, Bosnian (bs/bos), Croatian (hr/hrv/scr) and Serbian (sr/srp/scc). (Note: 639-2 coded separate T and B IDs for Croatian and Serbian.) If Montenegrin were coded, we'd be looking at a split of the existing Serbian. Therefore, we'd have to code *two* new languages:

'xxx' Serbian (Serbia, 2007)
'yyy' Montenegrin

(Some reference name for the new Serbian would have to be coined to differentiate it from the old Serbian, and/or maybe the name for the old Serbian could be changed. I haven't a clue what names to use.)

Now there is the issue of how these relate to the existing entries. There are two options; I'll describe these in terms of 639-3 IDs, with "MM" = macrolanguage mapping:

A)
        ID      Scope   MM      Status
        hbs     M       -
        bos     I       hbs
        hrv     I       hbs
        srp     I       hbs     deprecated (use xxx or yyy)
        xxx     I       hbs
        yyy     I       hbs

B)
        ID      Scope   MM      Status
        hbs     M       -
        bos     I       hbs
        hrv     I       hbs
        srp     M       hbs
        xxx     I       srp
        yyy     I       srp

With option B, we would also face the problem of what best practice should be for tagging new "Serbian" content: use srp or xxx? (Software platforms would also face the same problem in relation to locales.) Option A assumes that there would be no new use of srp for tagging content. (Software platforms would also need new xxx locales.)
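To make the difference between the two options concrete, here is a rough Python sketch of the two tables. It is only an illustration: 'xxx' and 'yyy' are the same placeholder IDs used above (not real 639-3 codes), and the names OPTION_A, OPTION_B, macro_chain and new_content_candidates are hypothetical helpers made up for this example, not part of any standard or library.

```python
# Hypothetical sketch of options A and B. 'xxx' and 'yyy' are placeholder
# IDs from the discussion above, not real ISO 639-3 codes.

# Each entry maps ID -> (scope, macrolanguage mapping or None, status note).
OPTION_A = {
    "hbs": ("M", None, ""),
    "bos": ("I", "hbs", ""),
    "hrv": ("I", "hbs", ""),
    "srp": ("I", "hbs", "deprecated (use xxx or yyy)"),
    "xxx": ("I", "hbs", ""),
    "yyy": ("I", "hbs", ""),
}

OPTION_B = {
    "hbs": ("M", None, ""),
    "bos": ("I", "hbs", ""),
    "hrv": ("I", "hbs", ""),
    "srp": ("M", "hbs", ""),
    "xxx": ("I", "srp", ""),
    "yyy": ("I", "srp", ""),
}


def macro_chain(table, code):
    """Return the enclosing macrolanguages of `code`, innermost first."""
    chain = []
    seen = {code}
    parent = table[code][1]
    while parent is not None:
        if parent in seen:
            # Guard against a malformed table with a mapping cycle.
            raise ValueError(f"cycle in macrolanguage mapping at {parent!r}")
        chain.append(parent)
        seen.add(parent)
        parent = table[parent][1]
    return chain


def new_content_candidates(table):
    """Return the IDs whose status note does not mark them deprecated."""
    return sorted(
        code
        for code, (_scope, _mm, status) in table.items()
        if not status.startswith("deprecated")
    )


if __name__ == "__main__":
    for name, table in (("A", OPTION_A), ("B", OPTION_B)):
        print(name, "xxx ->", macro_chain(table, "xxx"))
        print(name, "usable for new content:", new_content_candidates(table))
```

Under option A, macro_chain for xxx is a single step up to hbs and srp drops out of the candidates for new content; under option B, xxx sits under srp, which in turn sits under hbs, and both srp and xxx remain candidates, which is exactly the "use srp or xxx?" question.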

I think it's fairly obvious that either of these options would present several problems in implementation.

Btw, the linguistic reality is that a distinct language did not suddenly pop into existence. Hence, if the split were coded, it's quite likely that lots of content / language resources would be created once for both. With either option A or B, we're likely to end up with lots of inconsistency in tagging.

Also btw, there has been no formal request submitted to the JAC to code Montenegrin. The question of adding Montenegrin was raised with the JAC, but given the many issues that would create, the JAC has chosen to take a wait-and-see stance.


Peter


_______________________________________________
Ltru mailing list
Ltru at ietf.org
https://www1.ietf.org/mailman/listinfo/ltru




Note Well: Messages sent to this mailing list are the opinions of the senders and do not imply endorsement by the IETF.