
Re: [Ltru] solving the chinese thing... lists





On Sat, May 10, 2008 at 8:48 AM, Peter Constable <petercon at microsoft.com> wrote:

I don't see what d adds: a-c give examples in which multiple tags apply, and if the language is unknown I don't know why a list declaring multiple languages would be used.

 

I don't understand what you have in mind for e. Do you mean that a list may be used to make a declaration using a set of current tags for some set of varieties encompassed by a now-deprecated tag that had previously been used for that record?


Yes. The issue is that sr-CS is deprecated. As such, one should replace it. However, because CS was replaced by RS/ME, one cannot replace it with a single entity without being more specific than the original data was. And if you don't know how the original data was derived, then you don't know which of the choices it should be. (This is just one example of the general case.)

This doesn't strike me as an instance of "Content items that contain multiple, distinct varieties"; rather the examples you give are cases of a single variety that over time, for geo-political reasons, have begun to be given distinct identities.


It is a case where one tag would nowadays ambiguously be one of two (or more) tags. (It is even worse for cases like SU.) If you can use a list, you can tag it with non-deprecated codes, and neither lose nor add information.
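
A minimal sketch of this kind of replacement, in Python. The table and the function name `undeprecate` are hypothetical; the table holds only the two examples mentioned in this thread and is not derived from the registry. Tags are compared by exact string match here, which is a simplification.

```python
# Hypothetical replacement table, holding only the two examples from this
# thread; it is not taken from the registry.
REPLACEMENTS = {
    "sr-CS": ["sr-RS", "sr-ME"],
    "sh": ["sr", "bs", "hr"],
}


def undeprecate(tags):
    """Expand each deprecated tag into its list of current tags.

    Tags without an entry in REPLACEMENTS pass through unchanged, and
    duplicates are dropped while keeping first-seen order.
    """
    result = []
    for tag in tags:
        for current in REPLACEMENTS.get(tag, [tag]):
            if current not in result:
                result.append(current)
    return result
```

With this table, a record tagged only "sr-CS" comes out as the list "sr-RS, sr-ME", so nothing more specific than the original data is claimed.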

 

Btw, it's not entirely clear to me why sh should be deprecated in our usage if there is content out there that can appropriately be described as "Serbo-Croatian".


I share that view, but it is in fact deprecated.

 

From: mark.edward.davis at gmail.com [mailto:mark.edward.davis at gmail.com] On Behalf Of Mark Davis
Sent: Friday, May 09, 2008 9:03 AM
To: Phillips, Addison
Cc: Shawn Steele; Peter Constable; LTRU Working Group
Subject: Re: [Ltru] solving the chinese thing... lists

 

Actually, one more:

e.<t>A single tag needs to be replaced by multiple tags in order to remove deprecated tags, or get more precise results than macrolanguages. Example: "sr-CS" => "sr-RS, sr-ME"; "sh" => "sr, bs, hr"</t>

On Fri, May 9, 2008 at 7:41 AM, Mark Davis <mark.davis at icu-project.org> wrote:

This looks like a good direction (I have to be brief; won't have time to look at these in detail until Monday).

The one thing I would add, which in certain ways is the most important, is

     d.<t>A document or segment of text that could equally well be tagged by each of two or more tags. This can happen where the exact language is not known, or multiple tags could validly apply.</t>



On Thu, May 8, 2008 at 2:26 PM, Phillips, Addison <addison at amazon.com> wrote:

I took Shawn's suggestion below and Peter's email about use cases for lists and molded them into a single, new section for draft-14. At first I wanted to include them in the section on choice or on meaning, but the text just doesn't fit into either. Here's my proposed text:

--
<section anchor="lists" title="Lists of Languages">

<t>In some applications, a single content item might best be associated with more than one language tag. Examples of such a usage include:<list style="symbols">

  1.<t>A language priority list <xref target="RFC4647"></xref> describing a user's language preferences. This is a (possibly weighted) list of potentially-unrelated varieties, expressing a preference rather than making a declaration about actual content.</t>

  2.<t>Content items that contain multiple, distinct varieties. Often this is used to indicate an appropriate audience for a given content item when multiple choices might be appropriate. Examples of this could include: <list style="symbols">

     a.<t>Metadata about the appropriate audience for a movie title. For example, a DVD might label its individual audio tracks 'de' (German), 'fr' (French), and 'es' (Spanish), but the overall title would list "de, fr, es" as its overall audience.</t>

     b.<t>A French/English, English/French dictionary tagged as both "en" and "fr" to specify that it applies equally to French and English.</t>

     c.<t>A side-by-side or interlinear translation of a document, as is commonly done with classical works in Latin or Greek.</t>

  </list> </t>

  3.<t>Content items that contain a single language but which require multiple levels of specificity. For example, a library might wish to classify a particular work as both Chinese ('zh') and as Min Nan ('nan') for audiences capable of appreciating the distinction or needing to select content more narrowly.</t></list>

</t>

</section>
--
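
As a rough illustration of example 2a in the proposed text, the overall list for a title could be derived from the single tags on its tracks. This is only a sketch: the function name `collection_tags` and the track names are hypothetical, and nothing in the draft text prescribes this derivation.

```python
def collection_tags(piece_tags):
    """Return the tags of all pieces, de-duplicated, in first-seen order."""
    seen = []
    for tag in piece_tags:
        if tag not in seen:
            seen.append(tag)
    return seen


# Audio tracks from example 2a, each carrying a single tag.
# The track names are made up for illustration.
tracks = {"audio-1": "de", "audio-2": "fr", "audio-3": "es"}

# The overall audience list for the title, as a comma-separated string.
title_tags = ", ".join(collection_tags(tracks.values()))
```

Here `title_tags` is the string "de, fr, es", matching the overall audience given in the example.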

I can't seem to think of any normative language to wrap around this that isn't already handled by the choice or meaning sections or by 4647.

Comments?

Addison

Addison Phillips
Globalization Architect -- Lab126

Internationalization is not a feature.
It is an architecture.


> -----Original Message-----
> From: Shawn Steele [mailto:Shawn.Steele at microsoft.com]
> Sent: Wednesday, May 07, 2008 6:00 PM
> To: Shawn Steele; Phillips, Addison; Peter Constable
> Cc: LTRU Working Group
> Subject: RE: [Ltru] going back to the roots to find a solution to "zh"
>
> I'm thinking something like this:
>
>
>
> 4.6? Lists of Language Tags
>
> In some applications a list of language tags is necessary, such as a
> user's language priority list, when multiple languages apply equally to
> the same content, or when matching of older tags is expected.  Some
> example scenarios are:
>
> * A user could specify "en, de, ja" for a language priority list as
> described in RFC4647bis
> * A French/English, English/French dictionary could be tagged as "en,
> fr" to specify that it applies equally to French and English
> * A DVD could be tagged as having audio content in "en, fr" and
> subtitles in "en, fr, es, ja".
> * An English translation of a classical work could have the code "la,
> en".
> * Mandarin content expected to interchange with existing systems could
> be tagged "cmn, zh".
>
> Standards and protocols specifying this RFC MAY choose to allow lists
> of language tags.  Such standards and protocols MUST NOT be required to
> allow lists of language tags if not appropriate.
>
> If a list of language tags is used, it MUST contain well-formed language
> tags, and it is strongly RECOMMENDED that those tags be chosen using the
> Choice of Language Tag guidelines.
>
> When a list of language tags is used, the protocols MUST specify
> whether an order is implied or required for the list.
>
> When a collection of content, such as a DVD with multiple tracks, can
> be tagged with multiple languages, the collection SHOULD be tagged with
> the appropriate language tags, and the individual pieces of content
> SHOULD be tagged with a single language tag from the list of the
> collection.
>
> Recommendations for searching and matching of lists of language tags
> should be addressed in the successor to RFC 4647
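
Two of the rules in the quoted proposal can be sketched in Python: that each piece of a collection carries a single tag taken from the collection's list, and that a protocol states whether list order is significant. Both function names and the `ordered` parameter are hypothetical, and well-formedness checking is left out. Tags are lowercased before comparison because language tags are case-insensitive.

```python
def pieces_match_collection(collection, pieces):
    """True if every piece carries one tag that appears in the collection list.

    `collection` is a list of tags; `pieces` maps a piece name to its
    single tag.
    """
    allowed = {tag.lower() for tag in collection}
    return all(tag.lower() in allowed for tag in pieces.values())


def same_list(a, b, ordered):
    """Compare two tag lists.

    `ordered` stands for what the protocol declares: True if the order of
    the list is significant, False if the list is treated as a set.
    """
    a = [tag.lower() for tag in a]
    b = [tag.lower() for tag in b]
    return a == b if ordered else set(a) == set(b)
```

For instance, "cmn, zh" and "zh, cmn" compare equal only under a protocol that declares the list unordered.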

_______________________________________________
Ltru mailing list
Ltru at ietf.org
https://www.ietf.org/mailman/listinfo/ltru



--
Mark





Note Well: Messages sent to this mailing list are the opinions of the senders and do not imply endorsement by the IETF.