
Re: [Ltru] Re: Great Script Debate "the Next Generation"...





On 10/15/06, Doug Ewell <dewell at adelphia.net> wrote:
Warning: micro-editing suggestions follow.

Mark Davis <mark dot davis at icu dash project dot org> wrote:

> 5.  There MUST be at most one script subtag in a language tag.

I think this could be too easily read as "There MUST be one script
subtag..." and would prefer turning the sentence around to something
like "There MUST NOT be more than one script subtag..."

agreed
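
As a rough illustration of rule 5, the check could be sketched as below. This is only a sketch under simplifying assumptions: the helper name count_script_subtags is hypothetical, a script subtag is taken to be any four-letter ASCII alphabetic subtag after the first position, scanning stops at the first singleton (extension or private use), and grandfathered tags are ignored.

```python
def count_script_subtags(tag: str) -> int:
    """Count subtags that look like script subtags (simplified heuristic)."""
    count = 0
    for index, subtag in enumerate(tag.split("-")):
        # A one-character subtag starts an extension or private-use
        # sequence; nothing after it is a script subtag.
        if len(subtag) == 1:
            break
        # Assumption: four ASCII letters, not in first position, is a script.
        if index > 0 and len(subtag) == 4 and subtag.isascii() and subtag.isalpha():
            count += 1
    return count


def violates_rule_5(tag: str) -> bool:
    """True if the tag has more than one script-like subtag."""
    return count_script_subtags(tag) > 1
```

Under these assumptions zh-Latn and en-US pass, while a tag such as sr-Cyrl-Latn would be flagged.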

> 6.  A script subtag SHOULD be used when tagging content if the content
> is in a script other than that customarily used for the language (i.e.
> zh-Latn, en-Brai).

Would this sentence mean the same, but be more concise, if the words "if
the content is" were replaced by "written"?

I was trying to avoid the word "written", since the script subtag Zxxx could be used for unwritten material (see below).

Also, the examples should be marked "e.g." instead of "i.e."

agreed

> 7.  A script subtag SHOULD be used when tagging content if the
> language of the content is customarily written in more than one script
> (e.g. sr-Cyrl, sr-Latn).

Again for conciseness, if we don't need "of the content" in the previous
item, we may not need it here.

Actually, for precision, I think we should put it in both places.

> 8.  A script subtag SHOULD NOT be used if the content being tagged is
> written in the script customarily used for the vast majority of
> content in that language (e.g. en-Latn SHOULD NOT be used).

How about "... if the content being tagged is not written (e.g. spoken
or sung) or if it is written in the script customarily used..."

Because of edge cases, I think we need to settle on the usage first. Suppose we have the following cases of content and we follow the SHOULDs; here is what each could be tagged with (with varying degrees of specificity):

1. American English, written in Latin.
en
en-US

2. American English, written in Cyrillic (don't laugh; a colleague of mine has a printed ad in this.)
en
en-US
en-Cyrl
en-Cyrl-US

3. American English, only spoken -- no written content
en
en-US
en-Zxxx
en-Zxxx-US

4. American English in a video, some parts spoken, and with Cyrillic subtitles. In protocols limited to a single language, this is now limited to:

en
en-US

If a protocol allows tagging with multiple languages, then it could have any of the combinations of #2 & #3.
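
The four cases above can be sketched in code. This is a minimal sketch, not a definitive procedure: the function name candidate_tags and its parameters are hypothetical, and the customary script for the language is simply passed in by the caller rather than looked up anywhere.

```python
from typing import List, Optional


def candidate_tags(language: str, region: str,
                   script: Optional[str] = None,
                   customary_script: Optional[str] = None) -> List[str]:
    """List tags of varying specificity for one piece of content.

    script is None when no single script applies (case 4 in a
    single-language protocol). The script subtag is offered only when
    it differs from the customary script passed in by the caller.
    """
    tags = [language, f"{language}-{region}"]
    if script is not None and script != customary_script:
        tags.append(f"{language}-{script}")
        tags.append(f"{language}-{script}-{region}")
    return tags


# Case 1: written in Latin, the customary script for English.
case_1 = candidate_tags("en", "US", script="Latn", customary_script="Latn")
# Case 2: written in Cyrillic.
case_2 = candidate_tags("en", "US", script="Cyrl", customary_script="Latn")
# Case 3: spoken only, no written content.
case_3 = candidate_tags("en", "US", script="Zxxx", customary_script="Latn")
# Case 4: mixed content under a single-language protocol.
case_4 = candidate_tags("en", "US", script=None, customary_script="Latn")
```

For a protocol that allows multiple tags, case 4 could draw on the lists computed for cases 2 and 3.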

--
Doug Ewell  *  Fullerton, California, USA  *  RFC 4645  *  UTN #14
http://users.adelphia.net/~dewell/
http://www1.ietf.org/html.charters/ltru-charter.html
http://www.alvestrand.no/mailman/listinfo/ietf-languages


_______________________________________________
Ltru mailing list
Ltru at ietf.org
https://www1.ietf.org/mailman/listinfo/ltru


Note Well: Messages sent to this mailing list are the opinions of the senders and do not imply endorsement by the IETF.