FWIW, I would prefer a design closer to what I originally proposed (with only one field type), but I documented what currently appears to be closer to consensus.
Addison
Addison P. Phillips
Globalization Architect, Quest Software
Chair, W3C Internationalization Core Working Group
Internationalization is not a feature.
It is an architecture.
> -----Original Message-----
> From: ltru-bounces at lists.ietf.org [mailto:ltru-bounces at lists.ietf.org] On
> Behalf Of Addison Phillips
> Sent: Thursday, April 14, 2005 12:08
> To: John Cowan; Mark Davis
> Cc: ltru at ietf.org
> Subject: RE: [Ltru] Re: Proposed Text for Moving Forward
>
> Below is a proposed version of the section "Choice of Language Tag". In
> writing this text, I began a heavy reorganization of the document's
> sections. Too many forward references were being created, so I have
> followed our previous discussion and:
>
> 1. Made the IANA Considerations section a very small separate section.
> 2. Made what was titled "IANA Considerations" into "Registry Format and
> Maintenance"
> 3. Made a new section "Formation and Processing of Language Tags"
> following the format and maintenance section. This contains "Choice",
> "Meaning", "Canonicalization", and "Considerations of Private Use"
> sections.
>
> I have not yet revised the registry handling text to reflect the
> additional fields and so forth, so please don't look there. For this
> discussion I have focused on the "Choice of Language Tag" section.
>
> The new versions are posted on:
>
> http://www.inter-locale.com/ID/draft-ietf-ltru-registry-01.html
> http://www.inter-locale.com/ID/draft-ietf-ltru-registry-01.txt
>
> Here is the salient text from the "Choice" section:
>
> ---
> Many applications can benefit from the use of script subtags in language
> tags, as long as the use is consistent for a given context. Script subtags
> were not formally defined in RFC 3066 and their use may impact matching
> and subtag identification by implementations of RFC 3066 and its
> predecessor RFC 1766, as these subtags appear between the primary language
> and region subtags. For example, if a user requests content in an
> implementation of Section 2.5 of RFC 3066 [22] using the language range
> "en-US", content labeled "en-Latn-US" will not match the request.
> Therefore it is important to know when script subtags will customarily be
> used and when they should not be used.
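
As a rough illustration of the matching problem described above, here is a minimal Python sketch of prefix matching in the style of RFC 3066, Section 2.5. The function name `matches_rfc3066` is my own invention, not anything from the draft or the RFC. The sketch assumes case-insensitive comparison and treats "*" as matching every tag.

```python
def matches_rfc3066(language_range: str, tag: str) -> bool:
    """Return True if language_range matches tag by simple prefix matching.

    A range matches a tag when it equals the tag, or when it equals a
    prefix of the tag and the next character in the tag is '-'.
    """
    rng = language_range.lower()
    candidate = tag.lower()
    if rng == "*":
        # The wildcard range matches any tag.
        return True
    return candidate == rng or candidate.startswith(rng + "-")


if __name__ == "__main__":
    for rng, tag in [("en-US", "en-US"), ("en", "en-Latn-US"), ("en-US", "en-Latn-US")]:
        print(rng, tag, matches_rfc3066(rng, tag))
```

The last pair is the case from the text: the script subtag sits between "en" and "US", so "en-US" is no longer a prefix of the tag and the match fails.
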
>
> The language subtag registry defines two informational fields to guide
> users in the choice of whether to use script subtags for a specific
> application. These fields SHOULD be used as follows:
>
> If the primary language subtag has an Expected_Script_Subtag field,
> then the use of a script subtag is strongly RECOMMENDED for all
> applications. This field is used in records for primary language subtags
> that are customarily or frequently written in more than one script and
> which may be ambiguous without script information.
>
> If the primary language subtag has a Default_Script field, then that
> script subtag SHOULD NOT be used to form the language tag unless it
> conveys additional information necessary to the specific application. For
> example, the subtag 'Latn' should not be used with the primary language
> 'en' because most English documents are written in the Latin script and it
> adds no distinguishing information. However, if a document were written in
> English mixing Latin script with another script such as Braille ('Brai'),
> then the content author may choose to indicate both scripts to aid in
> content selection, such as the application of a stylesheet.
>
> If the primary language subtag has neither an Expected_Script_Subtag
> field nor a Default_Script field, then the script subtag SHOULD NOT be
> used to form the language tag unless it conveys additional information
> necessary to the specific application. Most languages that have neither
> field either are not customarily written or are written in a script that is
> not recorded in the registry. Speakers of these languages are encouraged to
> register the appropriate information in the registry.
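
The three rules above could be sketched roughly as follows. This is only an illustration under stated assumptions: the dictionary layout, the function name `script_advice`, and the returned strings are hypothetical, and the "sr" record is a made-up sample entry. Only the field names Expected_Script_Subtag and Default_Script, and the 'en'/'Latn' example, come from the proposed text.

```python
from typing import Dict, Optional

# Hypothetical registry excerpt. The record layout is an assumption;
# the "sr" entry is sample data for illustration only.
REGISTRY: Dict[str, Dict[str, object]] = {
    "en": {"Default_Script": "Latn"},
    "sr": {"Expected_Script_Subtag": True},
}


def script_advice(primary: str, script: Optional[str],
                  registry: Dict[str, Dict[str, object]] = REGISTRY) -> str:
    """Classify the use (or omission) of a script subtag for a language."""
    record = registry.get(primary.lower(), {})
    if record.get("Expected_Script_Subtag"):
        # Rule 1: a script subtag is strongly RECOMMENDED.
        return "ok" if script else "script-recommended"
    if script is None:
        # Omitting the script is fine under rules 2 and 3.
        return "ok"
    default = record.get("Default_Script")
    if default is not None:
        # Rule 2: only the default script itself SHOULD NOT be used.
        return "discouraged" if script.lower() == str(default).lower() else "ok"
    # Rule 3: neither field present; SHOULD NOT unless it adds information.
    return "discouraged-unless-needed"


if __name__ == "__main__":
    for lang, scr in [("en", "Latn"), ("en", "Brai"), ("sr", None), ("xx", "Latn")]:
        print(lang, scr, script_advice(lang, scr))
```

Since these are SHOULD-level recommendations, the sketch only classifies a tag and never rejects one; "xx" stands in for any language whose record has neither field.
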
>
> Extended language subtags (type 'extlang' in the registry, see Section 3.1)
> also appear between the primary language and region subtags and are
> reserved for future standardization. Applications may benefit from their
> judicious use in forming language tags in the future, and recommendations
> similar to those for script subtags are expected to apply to their use.
>
> ---
>
>
> > -----Original Message-----
> > From: ltru-bounces at lists.ietf.org [mailto:ltru-bounces at lists.ietf.org] On
> > Behalf Of John Cowan
> > Sent: Thursday, April 14, 2005 11:34
> > To: Mark Davis
> > Cc: ltru at ietf.org
> > Subject: Re: [Ltru] Re: Moving Forward
> >
> > Mark Davis scripsit:
> >
> > > A. We know the language is customarily used with more than one script
> > > B. We know the language is customarily used with one script
> > > C. The information for A or B is not available in the registry: either
> > > just not entered yet, or hard to find out.
> >
> > You omit:
> >
> > D. The language is not customarily used with any script. (Any language
> > may be transcribed in writing, as any language may be spoken over the
> > radio; but just the same, Chinese is written and Burushaski is not, as
> > Navajo is regularly broadcast and Akkadian is not.) This does not apply
> > to current 639-2 languages but will apply to 639-3 languages.
> >
> > > As I said before, I have not yet seen an implementation scenario where
> > > we need the information in B, and both the strategies that use it require
> > > entering in much more data than the ZZ scenario. So I'd like to see a
> > > scenario that drives B clearly stated.
> >
> > From what I can understand, which is by no means sure, the advocates
> > for B information want it so that they can vet tags like en-Latn(-GB)
> > and warn against or reject them. Without B information we cannot
> > distinguish between en-Brai (anomalous, so correct) and en-Latn
> > (undesirable).
> >
> > > 2. As Peter says, we mustn't say MUST NOT, however: there are scenarios
> > > in which it may be appropriate to use en-Latn or ko-Latn.
> >
> > I believe the desire of this part of the WG is to reject en-Latn
> > absolutely (because it is the default) while still retaining the
> > possibility of ko-Latn.
> >
> > --
> > All Gaul is divided into three parts: the part     John Cowan
> > that cooks with lard and goose fat, the part       www.ccil.org/~cowan
> > that cooks with olive oil, and the part that       www.reutershealth.com
> > cooks with butter. -- David Chessler               jcowan at reutershealth.com
> >
> > _______________________________________________
> > Ltru mailing list
> > Ltru at lists.ietf.org
> > https://www1.ietf.org/mailman/listinfo/ltru
Note Well: Messages sent to this mailing list are the opinions of the senders and do not imply endorsement by the IETF.