
Re: [Ltru] going back to the roots to find a solution to "zh"



Hi -

As a technical contributor...

> From: "Peter Constable" <petercon at microsoft.com>
> To: "LTRU Working Group" <ltru at ietf.org>
> Sent: Tuesday, May 06, 2008 10:13 AM
> Subject: [Ltru] going back to the roots to find a solution to "zh"
...
> Of course, that's the general problem we're facing: we must find a solution
> to the dual usage of "zh" or abandon any possibility of allowing "zh" to have
> its generic meaning, yet existing usage seems to imply that the latter isn't
> an option, so a solution to the dual usage is essential - but we're at a loss
> as to how to solve it.
...

This points to the "soft underbelly" of "tag wisely" - the assumption that
the tagger can reasonably anticipate how the "consumers" of the tag
will want to use that information.

In retrospect, I think it would have been better to have taken the route of
   zh -> some kind of Chinese, likely (but not guaranteed) to be Mandarin
   zh-cmn -> Chinese, specifically Mandarin
   ar -> some kind of Arabic, likely (but not guaranteed) to be Standard Arabic
   ar-arb -> Arabic, specifically Standard Arabic
   de -> some kind of German, overwhelmingly likely (but not guaranteed) to be Hochdeutsch
   de-*hde -> German, specifically Hochdeutsch

Even recognizing that reasoning about "language X is some kind of Y" can be
horribly fuzzy, this would still be better aligned with a "principle of least
astonishment" for folks trying to understand the specification, trying to tag data,
or trying to formulate a query.  Yes, huge amounts of data might end up being tagged
less precisely than we might like, but at least they'd still be tagged accurately.
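
To make the scheme above concrete, here is a minimal sketch in Python. It assumes consumers select data by prefix matching on subtags, in the style of RFC 4647 basic filtering. The function names and the sample documents are hypothetical, purely for illustration, and "zh-cmn" and "ar-arb" are simply the tag shapes proposed above.

```python
def matches(language_range: str, tag: str) -> bool:
    """Return True if `tag` equals the range or extends it by whole subtags.

    This follows the prefix rule of RFC 4647 basic filtering: the range
    matches when it is "*", equals the tag, or is a prefix of the tag
    that ends exactly at a "-" boundary. Comparison is case-insensitive.
    """
    r = language_range.lower()
    t = tag.lower()
    return r == "*" or t == r or t.startswith(r + "-")


# Hypothetical tagged data: some items tagged generically, some precisely.
documents = {
    "doc1": "zh",      # some kind of Chinese, not further specified
    "doc2": "zh-cmn",  # Chinese, specifically Mandarin
    "doc3": "ar",      # some kind of Arabic
    "doc4": "ar-arb",  # Arabic, specifically Standard Arabic
}


def select(language_range: str) -> list[str]:
    """Return the names of documents whose tag matches the range, sorted."""
    return sorted(
        name for name, tag in documents.items() if matches(language_range, tag)
    )


if __name__ == "__main__":
    print(select("zh"))      # generic query
    print(select("zh-cmn"))  # precise query
```

Under these assumptions a query for "zh" picks up both the generically and the precisely tagged Chinese items, while a query for "zh-cmn" picks up only the items explicitly tagged as Mandarin. The item tagged plain "zh" is tagged less precisely, but nothing about it is wrong.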

Randy
