On 2009/10/27 19:08, Greg Wilkins wrote:
> Existing UTF-8 to character conversions will error if they get a partial multipart character,
No, good transcoding libraries have options that say what to do in such a case. Iconv is definitely an example, String#encode in Ruby is another.
> so you can start conversion to characters until you have the entire message.
I assume you wanted to say "cannot start conversion". False as explained above.
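
As a minimal sketch of what such an option looks like in practice (Python's standard codecs module is used here purely as an illustration; the helper name decode_chunks and the sample data are made up for this example), an incremental decoder keeps a trailing partial sequence buffered instead of raising an error, so conversion can start before the whole message has arrived:

```python
import codecs

def decode_chunks(chunks):
    """Decode UTF-8 arriving in arbitrary byte chunks (hypothetical helper)."""
    # The incremental decoder buffers an incomplete trailing sequence
    # and completes it when the next chunk arrives.
    decoder = codecs.getincrementaldecoder("utf-8")(errors="strict")
    for chunk in chunks:
        text = decoder.decode(chunk)
        if text:
            yield text
    # final=True raises UnicodeDecodeError if the input ended mid-character.
    tail = decoder.decode(b"", final=True)
    if tail:
        yield tail

data = "Dürst".encode("utf-8")      # 'ü' is the two bytes C3 BC
chunks = [data[:2], data[2:]]       # split in the middle of 'ü'
decoded = list(decode_chunks(chunks))
```

The first chunk ends with a lone lead byte, yet no error is raised; the character is emitted once its continuation byte arrives in the second chunk.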
> Efficient implementations will be forced to reimplement utf-8 converters to scan for 0x00 while converting.
No. These two concerns can be separated completely. And there's the case where no conversion from UTF-8 to characters is needed at all (when your implementation uses UTF-8 for its internal representation, as e.g. in Perl or (mostly) in Ruby). (Still, you had better check for consistent UTF-8 byte sequences.)
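
The separation can be sketched roughly as follows, assuming (as in the quoted text) that the byte 0x00 acts as the delimiter; the function names split_frames and is_valid_utf8 are hypothetical. The scan works on raw bytes, because in UTF-8 the byte 0x00 only ever encodes U+0000 and never occurs inside a multibyte sequence. Validation is then a separate step, done here with Python's stock strict decoder rather than a reimplemented converter:

```python
def split_frames(buffer: bytes, delimiter: bytes = b"\x00"):
    """Split raw bytes on the delimiter; no UTF-8 knowledge is needed.

    Returns the complete frames and the unterminated remainder.
    """
    *frames, rest = buffer.split(delimiter)
    return frames, rest

def is_valid_utf8(payload: bytes) -> bool:
    """Check for a consistent UTF-8 byte sequence using the standard decoder."""
    try:
        payload.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True

# One complete frame followed by a partial one cut off mid-character.
buffer = "Dürst".encode("utf-8") + b"\x00" + b"caf\xc3"
frames, rest = split_frames(buffer)
```

An implementation that keeps UTF-8 internally would stop after the validity check and store the bytes of each frame as they are.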
Regards,   Martin.

-- 
#-# Martin J. Dürst, Professor, Aoyama Gakuin University
#-# http://www.sw.it.aoyama.ac.jp   mailto:duerst at it.aoyama.ac.jp
Note Well: Messages sent to this mailing list are the opinions of the senders and do not imply endorsement by the IETF.