Phillips, Addison 2008-10-03 00.58:
You here use the word "romanization" - aka latinisation - instead of transcription. Can we rule out that UNGEGN will specify e.g. Cyrillic transcriptions? (But if they should add Cyrillic transcriptions, then one could just add e.g. "en-Cyrl" to the list of prefixes.)
Mark is talking about a generic variant: one with no prefixes at all.
Mark said: 'ungegn' [...] productively used with 'ru-Latn' [...] But in an earlier message he said: Type: variant Subtag: 2003 [ snip ] Prefix: be-ungegn I think Prefix: be-Latn-ungegn would have been more consistent.
Others have pointed out the fallacy in this example, which does not invalidate the point. Registrations SHOULD be specific about script or region or even variant (in addition to language) when they are actually confined to specific usages. See, for example, the registry record for '1994'.
It is interesting to compare the prefixes allowed for "1996",
Prefix: de
to those allowed for "1994":
Prefix: sl-rozaj
Prefix: sl-rozaj-biske
Prefix: sl-rozaj-njiva
Prefix: sl-rozaj-osojs
Prefix: sl-rozaj-solba
Although "sl-rozaj-lipaw" is not listed, "sl-rozaj" is, so how can we know whether "sl-rozaj-lipaw-1994" is permitted? The only answer is: by applying the logic that the Prefix fields list *all* the possible "minimum prefixes".
But then, following the same logic, it should also be possible to write "de-NO-1996". In other words, I think the 1996 registration should have listed these "minimum prefixes":
Prefix: de
Prefix: de-DE
Prefix: de-AT
Prefix: de-CH
Prefix: de-LI
Only then would "1996" and "1994" appear to follow the same rules.
"1996" and "1994" have much in common: I don't think that 1996 lays down one 100% identical norm for all 4 country norms. Just as for the 1994 norm, 1996 has room for some region/location specific variants. And, just as for the Lipovaz/lipaw dialect of Resian, there are some locations (Luxembourg/"LU") which were not formally covered by the reform.
Hence one should think that "de-LU-1996" would be wrong, just as "de-NO-1996" would be.
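To make the "minimum prefix" reading concrete, here is a rough sketch of what it would accept. It is only an illustration: the PREFIXES table is a hand-copied excerpt of the two records quoted above, the function name matches_prefix is made up for this message, and the leading-subtag comparison is a deliberate simplification that may be stricter or looser than what the RFC intends.

```python
# Hypothetical excerpt of Prefix fields, hand-copied from the two records
# quoted in this thread; not a complete or authoritative registry copy.
PREFIXES = {
    "1996": ["de"],
    "1994": [
        "sl-rozaj",
        "sl-rozaj-biske",
        "sl-rozaj-njiva",
        "sl-rozaj-osojs",
        "sl-rozaj-solba",
    ],
}


def matches_prefix(tag, variant, prefixes=PREFIXES):
    """Return True if the subtags before `variant` begin with a listed Prefix.

    Assumption: a plain, case-insensitive comparison of leading subtags.
    """
    subtags = tag.lower().split("-")
    if variant not in subtags:
        return False
    before = subtags[:subtags.index(variant)]
    for prefix in prefixes.get(variant, []):
        wanted = prefix.lower().split("-")
        # The prefix must be matched subtag by subtag from the start of the tag.
        if before[:len(wanted)] == wanted:
            return True
    return False


for tag, variant in [
    ("sl-rozaj-lipaw-1994", "1994"),
    ("sl-1994", "1994"),
    ("de-NO-1996", "1996"),
    ("de-LU-1996", "1996"),
]:
    print(tag, matches_prefix(tag, variant))
```

Under this reading "sl-rozaj-lipaw-1994" is accepted and "sl-1994" is not, but "de-NO-1996" and "de-LU-1996" are accepted as well, because the single registered "Prefix: de" matches any tag that begins with "de".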
The taggers would then have to know the Suppress-Script rules in order to understand that in real life tagging, they ought to write "de-DE-1996" and not "de-Latn-DE-1996". And they would also have
Uh... taggers already need to know when to use subtags. They are specifically advised not to use scripts unless they add something to the tag,
Right. My proposal was not to change that advisory.
a specific example of the more general advisory not to use subtags that add nothing to the overall tag. Most German applications are actually fine with "de". In some cases "de-DE" or "de-DE-1996" are appropriate. Rarely "de-Latn-DE-1996" is appropriate---probably when the German in question is also rendered with other scripts in the same document/collection.
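As a rough illustration of that advisory, the sketch below drops a script subtag when it merely repeats the language's Suppress-Script value. The SUPPRESS_SCRIPT table holds only the two values named in this thread, the function name drop_redundant_script is invented for this message, and the code assumes the script subtag (if any) comes directly after the language subtag, so tags with extended language subtags are not handled.

```python
# Hypothetical excerpt of Suppress-Script values; only the two languages
# whose values are mentioned in this thread, not the full registry.
SUPPRESS_SCRIPT = {"de": "Latn", "be": "Cyrl"}


def drop_redundant_script(tag, suppress=SUPPRESS_SCRIPT):
    """Remove the script subtag if it equals the language's Suppress-Script.

    Assumption: the script subtag, if present, is the four-letter subtag
    immediately after the language subtag.
    """
    subtags = tag.split("-")
    language = subtags[0].lower()
    # Four-character variants such as "1996" start with a digit, so the
    # isalpha() check keeps them from being mistaken for a script.
    if len(subtags) > 1 and len(subtags[1]) == 4 and subtags[1].isalpha():
        if subtags[1].title() == suppress.get(language):
            del subtags[1]
    return "-".join(subtags)


for tag in ["de-Latn-DE-1996", "de-1996", "be-Latn-ungegn", "tk-Latn"]:
    print(tag, "->", drop_redundant_script(tag))
```

This is a default, not a rule: where a document mixes scripts, as in the rare case mentioned above, the tagger would keep "de-Latn-DE-1996" on purpose. "be-Latn-ungegn" and "tk-Latn" are left alone because "Latn" is not the suppressed script for "be", and "tk" has no entry in the table.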
to know the rules in order to know that they can/should/could also often drop "DE" and just write "de-1996".
RTFRFC.
Of course. However, in order to know that the 1996 norm applies to the German as used in the regions "DE", "CH", "AT" and "LI", one must read Wikipedia - not the RFC.
Just because a language subtag has (or hasn't) a Suppress-Script field does not necessarily imply that the variant subtag has the same - or the same lack of - script association. Therefore I think
Yes it does.
OK. But there are some language subtags for which there isn't a Suppress-Script field. E.g. the Turkmen language. So, for the hypothetical 2008 reform of the Turkmen language, one would have to include "Latn" as part of the prefix in order to make clear what the reform relates to:
Prefix: tk-latn
The question then is: How do you discern this from the transliteration variants you discuss below, where the "latn" is a *required* prefix (which it probably should not be for "tk")?
Possible answer: in order to allow tk-1998, yet at the same time make clear in the registry that it relates to Turkmen of Latin script of Turkmenistan, the following prefixes would have to be listed - in order to list all "minimum prefix" variants:
Prefix: tk
Prefix: tk-latn
Prefix: tk-TM
This would rule out adding "1998" to any other region subtags than -TM or to any other script subtag than -latn.
Note that transliteration variants, such as 'wadegile', have a Prefix (zh-Latn, in that case) to convey the fact that the script is needed. Although 'zh' does not suppress a script, the same thing can be said of (for example) romanizations of other languages. "be" requires no script and suppresses "Cyrl", but "be-Latn-ungegn" would probably be a good choice if 'ungegn' were a valid subtag representing a Latin transcription scheme.
OK. So the current tradition is that the Prefix field lists the *minimum prefix* that must be present before the variant subtag in question can be used.
If one cannot use the Prefix field to do this, then one should invent a new field for this particular purpose. Perhaps a Relates-to field.
I think this is overkill. At some point we have to let the LSR register subtags and at some point we have to let people tag stuff. It is difficult enough--and maybe too difficult--for people to understand the information there today.
I agree. But then we must commit overkill by adding more prefix fields instead, I think, such as I proposed for Turkmen above.
-- leif halvard silli