Mark,

One thing I think you aren't acknowledging is that "treat as synonyms" means something very different to the vast numbers of content creators who use this standard than it does to the handful of search engines that use the fuzzy logic associated with companion standards. As you note in your document, "It is clear that companies like Google or Yahoo can work around the problems with extlang." How many other users need and can afford to implement the extended fallback and filtering logic? Enough that this logic should be the primary driver behind the chosen solution?

Before I spend too much time picking apart your lengthy screed involving a scenario where the BBC presents its web site in Sudanese Creole Arabic with rotating language code logic for each day of the week ... (ahem) ... here's my real-world Chinese language list:

Chinese (Variant Unknown)
Chinese (Cantonese, Spoken)
Chinese (Cantonese, Written)
Chinese (Mandarin, Spoken)
Chinese (Mandarin, Spoken Taiwanese)
Chinese (Mandarin, Simplified)
Chinese (Mandarin, Traditional)
Chinese (Taiwanese, Spoken)
Chinese (Taiwanese, Written)

(Apologies, this is hard to represent in ASCII. I have a mini-spreadsheet if someone wants it.)

1              2           3          4
zh             zh          zh         zh
zh-yue         yue         yue        yue
zh-yue         yue         yue        yue
zh-cmn         cmn         zh         cmn
zh-cmn-TW      cmn-TW      zh-TW      cmn-TW
zh-cmn-Hans    cmn-Hans    zh-Hans    zh-Hans
zh-cmn-Hant    cmn-Hant    zh-Hant    zh-Hant
zh-min-nan     nan         nan        nan
zh-min-nan     nan         nan        nan

* Option #1 (RFC 4646) contains the codes as I have them today.
* Option #2 (RFC 4646bis) contains the codes if I choose to go against the grain and use "cmn".
* Option #3 (RFC 4646bis) treats "zh" and "cmn" as synonyms; avoids using "cmn" for compatibility.
* Option #4 (RFC 4646bis) contains the codes "cmn" for spoken context (where distinction is essential) and "zh" for written context.

Comments:

* Option #1 is unambiguous and shows that there is a relationship between these languages.
It also preserves the legacy "zh" tag so developers who aren't hip to later versions of BCP 47 or 639-3 will have some idea what these tags mean. The tags are maybe longer than they need to be, but if I need a fixed-length tag, I can wait for 639-6. The languages may not be mutually intelligible in some contexts, but they are related.

* Option #2 is unambiguous, but Microsoft, Google, and Amazon won't be using the same tags for Chinese that I do. Even if I don't follow their lead, others likely will. This worries me. Also, the rules for #2 must include fuzzy guidelines such as, "use the 'zh' tag except when you think it's a bad idea" and "use the shortest tag except when you don't want to." This presents complications in trying to explain some sort of consistent method to the LTRU madness to others. Given this, I start to wish ISO 639-6 a safe and speedy passage.

* Option #3 is what I believe you might suggest, but for me, that's the worst list of all. There are five ambiguous "zh" categories on that list. It follows the "always use the shortest tag" rule and respects history, but it's useless to me from an identification perspective.

* Option #4 has three ambiguous tags and means I have to explain to people who aren't in this industry why I use different tags for the same language. This strategy is less ambiguous than #3, but I'm not sure I can explain it to other content creators, for the same reasons as #2, and it presents the spoken/written complication others may not want. In the long run, this seems messy and unclear enough that it will result in bad tagging.
* Options #2,3,4: In general, it worries me that RFC 4646bis offers so many "preferred" options for the same thing. I really can't see how this simplifies things for anyone.

I don't have a need for fuzzy fallback scenarios. I need precise tags and mostly simple lookup. If you take the fallback scenarios and absurdities out of the document you reference, I don't think there's much left.

Regards,

Karen Broome

>-----Original Message-----
>From: ltru-bounces at ietf.org [mailto:ltru-bounces at ietf.org] On Behalf
>Of Mark Davis
>Sent: Thursday, May 29, 2008 4:00 PM
>To: debbie at ictmarketing.co.uk
>Cc: LTRU Working Group
>Subject: Re: [Ltru] Consensus call: extlang
>
>What would be useful is to hear from the extlangistas what their
>concerns are specifically; many have not given reasons for favoring
>encompassed languages into extlang instead of into the primary
>language subtag. It would be useful for them to give the scenarios
>where they think extlang is an improvement. It would be useful to
>find out why they think the scenarios such as in
>http://docs.google.com/Doc?docid=dfqr8rd5_676kxxxjhd&hl=en are not a
>problem.
>
>Clearly people think that using the extlang model solves more
>problems than it causes, so it would be useful to example specific
>cases and see if that is, in fact, true.
>
>
>Mark

_______________________________________________
Ltru mailing list
Ltru at ietf.org
https://www.ietf.org/mailman/listinfo/ltru
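
For anyone who wants to check the ambiguity counts quoted in the comments (five for Option #3, three for Option #4), here is a rough Python sketch. The tag lists are copied from the four option columns in the message. The function name `is_generic_zh` and its rule are illustrative assumptions, not anything defined by RFC 4646 or RFC 4646bis: a tag is counted as ambiguous when its primary subtag is "zh" and it is not followed by a three-letter subtag such as "cmn", "yue", or "min".

```python
# Tag columns copied from the four options in the message, in category order.
OPTIONS = {
    1: ["zh", "zh-yue", "zh-yue", "zh-cmn", "zh-cmn-TW",
        "zh-cmn-Hans", "zh-cmn-Hant", "zh-min-nan", "zh-min-nan"],
    2: ["zh", "yue", "yue", "cmn", "cmn-TW",
        "cmn-Hans", "cmn-Hant", "nan", "nan"],
    3: ["zh", "yue", "yue", "zh", "zh-TW",
        "zh-Hans", "zh-Hant", "nan", "nan"],
    4: ["zh", "yue", "yue", "cmn", "cmn-TW",
        "zh-Hans", "zh-Hant", "nan", "nan"],
}


def is_generic_zh(tag: str) -> bool:
    """Hypothetical rule: True if the tag says only "some kind of Chinese".

    Assumption for this sketch: a tag is generic when its primary subtag
    is "zh" and the next subtag (if any) is not three letters long, i.e.
    it is a region ("TW") or script ("Hans") rather than a language code.
    """
    subtags = tag.lower().split("-")
    if subtags[0] != "zh":
        return False
    return len(subtags) == 1 or len(subtags[1]) != 3


def count_generic(tags: list[str]) -> int:
    """Count how many categories in one option carry a generic "zh" tag."""
    return sum(1 for tag in tags if is_generic_zh(tag))


# Number of generic "zh" categories per option.
counts = {option: count_generic(tags) for option, tags in OPTIONS.items()}

if __name__ == "__main__":
    for option, n in counts.items():
        print(f"Option #{option}: {n} generic 'zh' categories")
```

Under this rule, the single generic tag in Options #1 and #2 is the "Chinese (Variant Unknown)" category, where a bare "zh" is the intended meaning.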
Note Well: Messages sent to this mailing list are the opinions of the senders and do not imply endorsement by the IETF.