
RE: [Ltru] Re: [psg.com #954] all aliases must be equally supported



> 
> > I think that it would be better to spell out the principles
> > and enshrine those in the text.
> 
> ACK.
[Addison Phillips] 

I take that as the converse of NACK, that is, U+0006?
> 
> > Basically this is the problem we've been using NH->VU to
> > illustrate.
> 
> Also fine, nothing so exotic as CS.
[Addison Phillips] 

CS wouldn't make a good example for obvious reasons :-).
> 
>  [scenario 1]
> > then the registry would look like:
> 
> > Subtag: NH
> > Description: Vanuatu
> 
> Why didn't you add VU as in (4) ?  In 3.3 "rule 9" you say
> "add" if there's no conflict.  So scenario (1) reflecting
> draft -01 should be:
[Addison Phillips] 

I was trying to create a future tense example (if NH were to change to VU after Date B, as opposed to before Date B...). I read rule (4) to mean that VU would not be added, which is one thing you have objected to. Here is the rule I mean:

<q> For ISO 3166 codes, if the newly assigned code's meaning is associated with the same UN M.49 code as another 'region' subtag, then the existing region subtag remains as the canonical entry for that region and no new entry is created. A comment MAY be added to the existing region subtag indicating the relationship to the new ISO 3166 code.</q>

The phrase "no new entry is created" seems to forbid the creation of 'VU'. That could be changed to produce your following example, of course. It does also seem to conflict with the "rule 9" you cite:

<q> Codes assigned by ISO 639, ISO 15924, or ISO 3166 that do not conflict with existing subtags of the associated type but which represent the same meaning as an existing subtag of that type are entered into the IANA registry as new records. The field 'canonical value' for that record MUST contain the existing subtag of the same meaning</q>
> 
> | Subtag: NH
> | Description: Vanuatu
> | %%
> | Subtag: VU
> | Description: Vanuatu
> | Canonical: NH
> 
> "Rule 9" is the item with example IM.  Read NH for 833, and
> VU for IM.
> 
>  [scenario 2]
> > Subtag: NH
> > Description: New Hebrides
> > Deprecated: yyyy-mm-dd
> > Canonical: VU
> > %%
> > Subtag: VU
> > Description: Vanuatu
> 
> Yes, that's what I'd prefer.  Updating any old Canonical: NH
> elsewhere to Canonical: VU, because the registry is no lesson
> in graph theory.
[Addison Phillips] 

Yes, but it conflicts with guarantees of stability for tags. You are correct that a tag such as "en-NH" would still be "valid", but it would not be canonical and thus certain processors might transform tags from "en-NH" to "en-VU" per this text in Section 2.2.9:

<q> If the processor generates tags, it MUST do so in canonical form, including any supported extensions, as defined in Section 4.3 (Canonicalization of Language Tags).</q>
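
To make the effect concrete, here is a rough sketch of what a canonicalizing processor might do with the scenario 2 records. It is only an illustration: the parser, the function names, and the shortcut of treating any subtag found in the table as a region are assumptions for the example, not anything the draft specifies.

```python
# Scenario 2 records, copied from the example above.
REGISTRY_SCENARIO_2 = """\
Subtag: NH
Description: New Hebrides
Deprecated: yyyy-mm-dd
Canonical: VU
%%
Subtag: VU
Description: Vanuatu
"""


def parse_records(text):
    """Split registry text on '%%' and return {subtag: {field: value}}."""
    records = {}
    for chunk in text.split("%%"):
        fields = {}
        for line in chunk.strip().splitlines():
            name, _, value = line.partition(":")
            fields[name.strip()] = value.strip()
        if "Subtag" in fields:
            records[fields["Subtag"]] = fields
    return records


def canonicalize(tag, regions):
    """Replace any subtag that has a 'Canonical' field with that value.

    Simplification (assumption): every subtag after the first is looked up
    in the region table; a real processor would classify subtags first.
    """
    parts = tag.split("-")
    out = [parts[0].lower()]
    for part in parts[1:]:
        record = regions.get(part.upper())
        if record is not None:
            out.append(record.get("Canonical", record["Subtag"]))
        else:
            out.append(part)
    return "-".join(out)


regions = parse_records(REGISTRY_SCENARIO_2)
print(canonicalize("en-NH", regions))
print(canonicalize("en-VU", regions))
```

Under scenario 2 both calls yield the same tag, which is exactly the rewrite of existing "en-NH" content that I am worried about.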

> 
> > This has the effect of "invalidating" tags in existing
> > content in canonical processors.
> 
> No, not "invalidating", NH is still valid, in all scenarios.
[Addison Phillips] 

You are correct.
> 
> Maybe "decanonalizing", but a validating processor using an
> obsoleted registry is always in trouble, not only for this
> scenario, also for completely new entries.
[Addison Phillips] 

The trouble is more for validating processors using the newer registry (changing the tag) and for content authors having to choose a different tag than previously. Old implementations generate what they generate.
> 
>  [scenario 3]
> > Subtag: NH
> > Description: New Hebrides
> > Deprecated: yyyy-mm-dd
> > %%
> > Subtag: VU
> > Description: Vanuatu
> 
> That's almost the same as (2), but no pointer from NH to VU
> or v.v., so that would be worse than (1) or (2).  All aliases
> should have exactly one canonical value.
[Addison Phillips] 

I agree. I included it for completeness.
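
The gap in scenario 3 is easy to detect mechanically. The sketch below is a hypothetical check, not part of any draft: it lists deprecated records that name no replacement, using plain dictionaries that mirror the scenario 2 and scenario 3 records quoted in this thread.

```python
# Records from scenarios 2 and 3, written as dictionaries for brevity.
SCENARIO_2 = {
    "NH": {"Description": "New Hebrides", "Deprecated": "yyyy-mm-dd",
           "Canonical": "VU"},
    "VU": {"Description": "Vanuatu"},
}
SCENARIO_3 = {
    "NH": {"Description": "New Hebrides", "Deprecated": "yyyy-mm-dd"},
    "VU": {"Description": "Vanuatu"},
}


def records_without_pointer(registry):
    """Return deprecated subtags that carry neither 'Canonical' nor 'Use'."""
    return sorted(
        subtag
        for subtag, fields in registry.items()
        if "Deprecated" in fields
        and "Canonical" not in fields
        and "Use" not in fields
    )


print(records_without_pointer(SCENARIO_2))
print(records_without_pointer(SCENARIO_3))
```

Only scenario 3 leaves NH dangling; a tool reading that registry has nothing machine-readable to tell it that VU is the replacement.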
> 
>  [scenario 4]
> > Subtag: NH
> > Description: New Hebrides
> > Deprecated: yyyy-mm-dd
> > Use: VU
> > %%
> > Subtag: VU
> > Description: Vanuatu
> 
> Same problem as (3), unless "Use: VU" is a new kind of pointer
> in addition to "Canonical:".  Then it's almost the same (2).
[Addison Phillips] 

It is different from 'Canonical': it could be informative, for example, instead of normative (as 'Canonical' is). In the end, users choose tags, wisely or otherwise. This documents changes in country names, regimes, borders, and the like without requiring (as in #1) that ietf-languages prefer the original and without violating stability of canonical tags.
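
A small sketch of the distinction, assuming 'Use' were adopted as an informative field (it is only a proposal in this thread, and the function names here are made up for the example): the canonical form of NH stays NH, while a tool can still surface the hint to an author.

```python
# Scenario 4 records, written as dictionaries for brevity.
SCENARIO_4 = {
    "NH": {"Description": "New Hebrides", "Deprecated": "yyyy-mm-dd",
           "Use": "VU"},
    "VU": {"Description": "Vanuatu"},
}


def canonical_region(subtag, registry):
    """Normative step: only a 'Canonical' field may change the subtag."""
    return registry[subtag].get("Canonical", subtag)


def suggested_region(subtag, registry):
    """Informative step: return the 'Use' hint, or None if there is none."""
    return registry[subtag].get("Use")


print(canonical_region("NH", SCENARIO_4))
print(suggested_region("NH", SCENARIO_4))
```

A processor generating canonical tags would leave "en-NH" alone here, and an authoring tool could separately suggest VU for new content.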
> 
> I don't see any advantage in comparison with (2), but (4) is
> better than (1).  The "validating processors need a 'canonical
> forever' value" argument is IMHO wrong, a validating processor
> always works on its own old, potentially obsolete, view of the
> registry.  And if it's updated with a new registry it can as
> well integrate the complete diff, not only new records.
> 
> What you lose in (2) is "once tagged canonically as long as
> possible canonical".  I don't see where that's a problem.  If
> it really is a problem, (4) with an ersatz-canonical "Use: VU"
> could solve it.
[Addison Phillips] 

Exactly the point of that example.
> 
> > I don't care for #2 because I *do* want codes to remain
> > canonical for their given meaning.
> 
> But why ?  Who needs this, it's only confusing if "canonical"
> differs from the source standard, and otherwise irrelevant,
> because NH -> VU or VU -> NH establishes the "alias" relation.
> 
> > Either #3 or #4 seem like good choices to me. (Frank has
> > previously argued that #3 is not a good choice because there
> > is no machine readable mechanism for indicating what to use)
> 
> (3) is the only scenario without a documented alias relation,
> therefore it's the worst scenario,
[Addison Phillips] 

Well... (1) doesn't create an alias but does get ietf-languages into the tricky business of having to track the UN M.49 numbers and having to apply brainpower to new ISO 3166 assignments. Ideally we want to apply brainpower to assignments that exhibit obvious problems (e.g. the 'CS' mess or events such as the break-up of 'SU'). Really there are different kinds of events and the ones that we're discussing here only deal with a country changing names but not borders. When borders change (joins, splits, etc.) then a different set of problems may ensue.
> 
> (2) is more or less what we do for all existing records before
> the final date-B.  If it is what we want before date-B I don't
> see why we should stop wanting it after date-B.
[Addison Phillips] 

The assertion is true. The question is whether we want to continue it post-Date-B.
> 
>  [missing MUST NOT Suppress-Script and rationale]
> > the rationale appears slightly lower in the document:
> 
> The reason for this ugly kludge is backwards compatibility.
[Addison Phillips] 

Yes.
> 
> That reason is not stated in what you have.  It's not only
> Doug who sees that it's ugly like hell, I see it too.  But
> unfortunately it's necessary.  Otherwise the effects with
> legacy tools and/or for legacy content are hard to predict.
[Addison Phillips] 

You will recall that I argued against Suppress-Script (even though I *also* thought of it and named it). Personally, I think that Suppress-Script will be harder to maintain than Require-Script... but I lost that argument.

Okay, the reason for the field is somewhat oblique. How about adding this text to Section 3.1 and to Section 4.1 (Choice):


"This field helps ensure greater compatibility between the language tags generated according to the rules in this document and language tags and tag processors or consumers based on RFC 3066."




_______________________________________________
Ltru mailing list
Ltru at lists.ietf.org
https://www1.ietf.org/mailman/listinfo/ltru




Note Well: Messages sent to this mailing list are the opinions of the senders and do not imply endorsement by the IETF.