
[Ltru] Re: Is 639-3 bogus ?



Doug Ewell wrote:
 
> There is a demonstrated need to tag things like Middle English and
> Ancient Egyptian, not to mention Esperanto (which would cause another
> problem, as existing constructed languages would be grandfathered in,
> but new ones in 639-3 would be excluded).

Esperanto and Orcish don't play in the same league.  Esperanto is a good
"C", and it couldn't cause a problem _because_ it already made it into
639-2.

Other good "C" languages not in 639-2 would be a problem only insofar as
registering them individually as needed is a problem (in theory it
should be easy).

The attempted "grabar" registration was a model: somebody (IIRC Peter)
said it should get its own code, and later somebody else (IIRC John)
found that it already has a code, "axm".  So far the plan was that this
and all other new A, C, E, and H codes are automatically included in the
bulk 4645bis update.

This registration business (like RFC publication) boils down to "filter
nonsense and try to reflect some useful reality".  Some humans do this,
and if it's an open process with experts and appeals etc. we hope that
the outcome is good enough.

If it's a rather cryptic process with very few experts who simply can't
know each and every "orq"-case, this outcome might not be good enough,
and adding a second filter with an open process behind it could help.

Increased bureaucracy is often not helpful, but we already have a way
to register languages in addition to 639-3 with tons of SHOULD NOT to
discourage it.  We could as well add a way to register a proper subset
of 639-3 (A, C, E, H) only on demand.  Maybe it's a bad idea, but it's
not completely impossible.
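
A minimal sketch of that "subset only on demand" idea, under loud
assumptions: the rows, the type letter given to each code (including
"orq"), and the function names are all illustrative, not taken from any
639-3 table or from 4646bis.

```python
# Illustrative rows only: (alpha3, type, name).  The type letters follow
# the usage in this thread: A(ncient), C(onstructed), E(xtinct),
# H(istorical), L(iving).  The assignments below are assumptions.
SAMPLE_ROWS = [
    ("eng", "L", "English"),
    ("enm", "H", "Middle English"),
    ("egy", "A", "Ancient Egyptian"),
    ("epo", "C", "Esperanto"),
    ("orq", "C", "Orcish"),
]

# Types that would be held back from the bulk update in this sketch.
ON_DEMAND_TYPES = {"A", "C", "E", "H"}


def split_by_type(rows):
    """Split rows into (bulk, on_demand) by their type letter."""
    bulk = [r for r in rows if r[1] not in ON_DEMAND_TYPES]
    on_demand = [r for r in rows if r[1] in ON_DEMAND_TYPES]
    return bulk, on_demand


def register_on_demand(on_demand, requested):
    """Return only the held-back rows whose alpha3 was actually requested."""
    wanted = set(requested)
    return [r for r in on_demand if r[0] in wanted]


bulk, on_demand = split_by_type(SAMPLE_ROWS)
registered = register_on_demand(on_demand, ["epo", "enm"])
```

Here only "eng" lands in the bulk set, while "epo" and "enm" are added
because somebody asked for them; "orq" stays out until requested.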

> At least one Description needs to match the source standard, no matter
> how bad we think the spellings might be.

Yes, you can't simply "fix" the clicks, let 639-3 do it when they feel
like it.  It's rare that I say "please use Unicode", it was addressed 
to Peter, not to you... :-)

> Excluding about 500 to 600 non-"L" languages would reduce the size of
> the Registry by less than 8 percent

Yes, it's not a size issue.  Using all available alpha3 is also okay, the
complete registry would still be below 2 MB.  It's a question of the
vetting, and whether users would need to know the source of an alpha3
in the registry, or only "interesting" types, or if that's irrelevant.

Apparently some, including John, Peter, and me, feel that it's relevant,
but we still have to figure out how to do it.

> The Language Subtag Reviewer and ietf-languages list can decide that
> a language requested for independent registration isn't "good enough,"
> but if it's in an ISO standard, who are they (or we) to say?

ISO standard, IANA registry, CLDR, ICU, tz database, or RFC: there are
some folks empowered to decide what's in and what's out, and depending
on how well that's organized the results can be good, bad, or ugly.

There will always be errors, and _we_ invented the rules that don't allow
fixing errors in the registry.  As soon as "orq" is in, it stays in.
And if 639-3 isn't designed for this radical approach it's our job to
get it right.  We did that already for the 3166 country codes.  For all
other 4646 sources we trust that it's unnecessary.  ISO 639-3 will be a
new player, and Peter always convinced me that it will be as good as
the other sources.  

But we have to verify this, and doing it with the April draft is a good
test.  Peter said that they might remove "orq", and that's the purpose
of this test, find nits and issues before they are "stable forever".

>> If later one of the Ms gets type E adding (M,E) to a description (or
>> saying Macrolanguage, Extinct in a comment) is no problem.
 
> If a currently non-extinct language that is widespread enough to serve
> as a macrolanguage goes extinct during our lifetime, it would amaze me.

Me too, it was an explanation why mixing the "scope" and "type" info is
possible even under unlikely conditions.

>> Changing the 4646 syntax adding a completely new field however is
>> IMNSHO a very bad idea.  We more or less promised that that won't
>> happen.
 
> Where did we promise that no new fields would ever be added?

By proposing <extlang> with detailed rules before it existed.  The 4646
syntax is supposed to also work for 4646bis registries.

> This would also affect Addison's proposal to replace Suppress-Script
> with something new.

Yes, that's why I don't like the zero-one-more-script proposal.  And
why I don't like an additional Langtype field.  And why I don't like
to use UTF-8, or to replace record-jar by XML.  Always the same reason:
stick to the 4646 registry format, it's okay.
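
For readers who haven't looked at the registry file itself, a rough
sketch of reading the record-jar layout: records separated by "%%" lines,
"Name: value" fields, and folded continuation lines that start with
whitespace.  The sample records and the parser name are illustrative
assumptions, not copied from the registry and not a complete
implementation of the format.

```python
# Illustrative sample only; the field values are not copied from the
# real registry.
SAMPLE = """\
%%
Type: language
Subtag: eo
Description: Esperanto
Added: 2005-10-16
%%
Type: language
Subtag: enm
Description: English, Middle
  (1100-1500)
Added: 2005-10-16
"""


def parse_record_jar(text):
    """Parse record-jar text into a list of records.

    Each record is a list of (name, value) pairs, so a field name such
    as Description may appear more than once in a record.
    """
    records = []
    current = []
    for line in text.splitlines():
        if line.strip() == "%%":
            # Record separator: close the record collected so far.
            if current:
                records.append(current)
            current = []
        elif line[:1] in (" ", "\t") and current:
            # Folded line: append it to the previous field's value.
            name, value = current[-1]
            current[-1] = (name, value + " " + line.strip())
        elif ":" in line:
            name, _, value = line.partition(":")
            current.append((name.strip(), value.strip()))
    if current:
        records.append(current)
    return records


records = parse_record_jar(SAMPLE)
```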

Frank


