> From: Phillips, Addison [mailto:addison at amazon.com]

> > I also see another potential concern wrt #2: ...
> > with one or the other deprecated, there's some likelihood that some
> > implementations will assume that the deprecated form can be largely
> > disregarded in designing matching behaviour.
>
> Disregarded may not be the right word...

No, that's what I meant: you may use canonicalization to ensure that the
deprecated form is always considered, but nothing requires that.

> > Btw, it should be noted that both #1 and #2 lead to consideration
> > of cherry picking.
>
> I don't see why.

Well, for #2, we have started to consider cherry picking (at least, I've
had that impression), and the reasons for considering it applied just the
same to #1 -- we just didn't get to a point while #1 was on the table of
actually considering it. But this isn't a point worth debating.

> It doesn't matter which one we choose and our choice doesn't affect
> what you might choose. That's because the compiler reduces both to the
> same thing. However, here our parallel breaks down: X-Y and Y produce
> different results in non-validating implementations, so choosing one
> over the other is a Good Thing. And it isn't cherry picking if we
> ALWAYS do it the same way.

Again, I thought that we were cherry picking in which cases the extlang
form was even considered valid (IOW, cherry picking the macrolanguages):
zh, ar, sgn and a few others.

> > Now, let me propose an elaboration of #3 for adoption. This
> > elaboration is captured by three points:
> >
> > (a) that both X-Y and Y are freely allowed,
>
> Both #2 and #3 provide this.

I mean in the sense that neither is identified as preferred.

> > (b) that at the level of the language production X-Y and Y must
> > always be considered a match (regardless of which is part of a tag
> > or of a language range), but
>
> This is where #3 is a non-starter for me: it requires us to change all
> of the matching schemes

True.
> in ways that are incompatible with our previous
> tenets.

Haven't grokked that yet. (I just thought of all this and wrote it up
immediately for feedback.)

> In lookup, the main problem with #3 is that we are not given a specific
> canonical form to use for our fallbacks. If we don't choose a specific
> canonical form, people will have to implement additional code and keep
> track of other fields (macrolanguage, prefix) from the registry to
> equate X-Y and Y for fallbacks. Currently "Preferred-Value" is the only
> field you have to look at for canonicalization purposes and you look at
> the same field for *all* subtags and tags. And this code can be
> executed once separately from the lookup process, since it is a
> canonicalization step.

A possible concern is that "Preferred-Value" is not only used in
canonicalization *for matching purposes*, but it is also what people
SHOULD be using to tag all their content and interchange: as the opening
sentence of 4.5 says,

-------------
Since a particular language tag is sometimes used by many processes,
language tags SHOULD always be created or generated in a canonical form.
-------------

So, indeed, the means we have for supporting canonicalization in the
registry is also how we handle deprecation, and so we currently have to
deprecate either X-Y or Y to handle these in canonicalization. Just as
you say.

The concern is that we are also saying that only one or the other SHOULD
be used in interchange and in tagging content. I'm concerned that might
lead to the very same barrier to consensus that we had for #1: some think
content/interchange should use (say) "cmn" while others would like to see
"zh-cmn".

I guess what I'm really suggesting for #3 is that we still have the
effects of canonicalization used in matching without saying that the
canonical form (just wrt extlang -- X-Y vs. Y) is what SHOULD always be
generated.
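
As a rough sketch of that idea (not anything from the draft): the
comparison could reduce both X-Y and Y to one internal key, while the tags
people actually generate are left alone. The table EXTLANG_PREFIX and the
functions match_key and matches are made up for illustration, and the
table entries are assumptions rather than registry data. Form Y is used as
the key purely for convenience; it is never emitted, so it does not imply
a preference.

```python
# Hypothetical extlang table: extlang subtag -> macrolanguage prefix.
# Entries are illustrative assumptions, not taken from the registry.
EXTLANG_PREFIX = {"cmn": "zh", "yue": "zh"}


def match_key(tag: str) -> str:
    """Reduce a tag to the form used for comparison only (here, form Y)."""
    subtags = tag.lower().split("-")
    # X-Y: drop the prefix X when Y is a known extlang of X.
    if len(subtags) >= 2 and EXTLANG_PREFIX.get(subtags[1]) == subtags[0]:
        subtags = subtags[1:]
    return "-".join(subtags)


def matches(language_range: str, tag: str) -> bool:
    """Prefix-match on comparison keys; the original tags are not rewritten."""
    r, t = match_key(language_range), match_key(tag)
    return t == r or t.startswith(r + "-")
```

With this, a range of "zh-cmn" matches content tagged "cmn-Hans", and a
range of "cmn" matches content tagged "zh-cmn-TW", whichever form the
tagger chose.
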
> > This proposal frees us entirely from having to decide whether
> > "zh-cmn"/"zh-yue" or "cmn"/"yue" is better.
>
> "Better" may not be the right word. "Preferred" would better describe
> the situation.

OK.

> I like form Y over form X-Y for two main reasons:
>
> 1. It simplifies the tags.
> 2. It can be done without looking at the registry.

I agree.

> > It would also mean there's
> > no particular reason to cherry pick: the IETF-Language can discuss
> > when it may or may not be beneficial to use the extlang formulation,
> > but users can ultimately decide for themselves and (because of
> > requirement b) they are assured of some degree of interoperability
> > no matter which they choose.
>
> Uh....... ietf-languages can discuss the benefits of extlang or no
> extlang for a given new registration with a macrolanguage, and then
> register subtags to match. But users only have a choice when
> ietf-languages gifts us with subtags of both types.

I'm not making myself clear: whereas currently, we have (IIUC) cherry
picked only certain macrolanguages and sgn as potential prefixes for
extlang, I was suggesting for #3 that LTRU doesn't need to make those
choices up front, that the RFC can specify *any* macrolanguage (or sgn)
can be a prefix for extlang, and then it can be left to IETF-Languages to
decide (perhaps differently for different cases) whether X-Y or Y is
better in general (i.e. for most applications) / preferred for purposes
of tagging and interchange. (We'd still want something that specifies a
"canonical" form that would be picked up automatically in matching.)

Again, the main concern I'm trying to address is the anticipated
difficulty in agreeing on whether content should get tagged "zh-cmn" or
"cmn".

Peter

_______________________________________________
Ltru mailing list
Ltru at ietf.org
https://www.ietf.org/mailman/listinfo/ltru
Note Well: Messages sent to this mailing list are the opinions of the senders and do not imply endorsement by the IETF.