
Re: [hybi] SPDY protocol from google frame



On 2009/11/13 11:45, Thomson, Martin wrote:
> It was interesting to hear the IAB opinions on this in relation to internationalization.
> The thesis was that the bulk of the web is video or some other efficiently encoded binary
> data and that text makes up a very small proportion of the data that is moved around.
> The purported costs of more verbose encodings (in this case of Unicode) were actually
> considered negligible when compared to the other benefits.

UTF-8 (the Unicode encoding favored by the IETF) isn't particularly verbose in general. In particular, for ASCII text, it's exactly as verbose as ASCII. It's slightly more verbose than legacy encodings for Chinese and Japanese, a bit more verbose than legacy encodings for languages such as Russian, Greek, Arabic, and Hebrew, and definitely more verbose for Indian languages and quite a few other less well-known languages.
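
To make the size comparison concrete, here is a minimal sketch that counts the bytes a short string takes in UTF-8 and in one legacy encoding. The sample words, the choice of legacy codec for each language, and the helper name byte_counts are assumptions picked for illustration; real text mixed with ASCII markup or punctuation will show smaller differences.

```python
# Compare encoded sizes of short samples in UTF-8 and a legacy encoding.
# Sample words and legacy codec choices are illustrative assumptions.
SAMPLES = {
    "English": ("hello", "ascii"),
    "Russian": ("привет", "koi8_r"),
    "Greek": ("αβγ", "iso8859_7"),
    "Japanese": ("こんにちは", "shift_jis"),
    "Chinese": ("你好", "gb2312"),
}


def byte_counts(text, legacy_codec):
    """Return (UTF-8 byte length, legacy byte length) for text."""
    return len(text.encode("utf-8")), len(text.encode(legacy_codec))


if __name__ == "__main__":
    for language, (text, codec) in SAMPLES.items():
        utf8_len, legacy_len = byte_counts(text, codec)
        print(f"{language}: {len(text)} chars, "
              f"UTF-8 {utf8_len} bytes, {codec} {legacy_len} bytes")
```

Python's standard library does not ship a codec for ISCII, a legacy encoding for Indian scripts, so that case is left out of the sketch.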

Also, for the bulk of text (mail or HTTP bodies), conventions are well established for labeling encodings. For identifiers, which were the main topic of the IAB talk, the overall data volume is indeed very low compared to video. So for the IAB message, in a very simplified form, of "let's just use only UTF-8 for identifiers", data volume is indeed not a major concern.
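
As a rough illustration of the labeling convention for bodies, the charset parameter of a Content-Type header tells the receiver how to decode the bytes. The sketch below reads that label with Python's standard email package; the helper name declared_charset and the us-ascii fallback are assumptions for this example, not something prescribed in this thread.

```python
from email.message import Message


def declared_charset(content_type, default="us-ascii"):
    """Return the charset label from a Content-Type value, lowercased.

    Falls back to `default` (an assumption for this sketch) when the
    header carries no charset parameter.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset(default)


if __name__ == "__main__":
    body = "Dürst".encode("utf-8")
    charset = declared_charset("text/plain; charset=UTF-8")
    # Decode the body bytes using the label taken from the header.
    print(charset, body.decode(charset))
```

Identifiers usually travel without any such label, which is why fixing a single encoding for them is attractive.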

Regards,   Martin.


--
#-# Martin J. Dürst, Professor, Aoyama Gakuin University
#-# http://www.sw.it.aoyama.ac.jp   mailto:duerst at it.aoyama.ac.jp

Note Well: Messages sent to this mailing list are the opinions of the senders and do not imply endorsement by the IETF.