
[Ltru] Re: UTF-8



Doug Ewell wrote:

> The ABNF in 4646 says the ampersand is not a valid character and must
> be escaped, because it introduces the &#x sequences.  I'm feeling lazy.
> Can someone else verify this,

Of course, it's in the <ASCCHAR> production, and in the prose.

> the impact of allowing a character that we previously did not allow?

I refuse to abuse my crystal ball today.  Registry converters
to XML have to replace "<" by "&lt;", and replacing ">" by
"&gt;" while they're at it isn't wrong.  UTF-8-jar converters
to XML would use UTF-8 as the encoding, and replace "&" by
"&amp;" before replacing "<" by "&lt;" and ">" by "&gt;".

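A minimal Python sketch of that replacement order, assuming the
converter only handles text content (not attribute values).  The
function name escape_xml_text is hypothetical, not part of any
existing registry tool:

```python
def escape_xml_text(text: str) -> str:
    """Escape text for use as XML character data (illustrative only)."""
    # "&" has to go first: done later, it would also hit the "&"
    # that the "<" and ">" replacements have just introduced.
    text = text.replace("&", "&amp;")
    text = text.replace("<", "&lt;")
    text = text.replace(">", "&gt;")
    return text


print(escape_xml_text("a < b & c > d"))
```

Doing the "&" step last would turn "<" into "&amp;lt;", which is
why the order matters.
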
They could throw an error for "&#x", or they can try to get
it right:  replace "&#x" by ASCII SUB, replace every other "&"
by "&amp;", and turn SUB back into "&#x" after checking that
it starts a plausible NCR (numeric character reference).  Or
they could replace the NCR by the UTF-8 character itself.
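
Both options can be sketched in Python roughly as follows.  This
is only an illustration under assumptions: the names keep_ncrs,
decode_ncrs and _plausible are hypothetical, "plausible" is taken
to mean one to six hex digits naming a non-zero Unicode scalar
value, and input that already contains SUB is rejected rather
than handled:

```python
import re

SUB = "\x1a"  # ASCII SUB, used as a temporary placeholder
NCR = re.compile(r"&#x([0-9A-Fa-f]{1,6});")


def _plausible(match: re.Match) -> bool:
    # Assumption: plausible = non-zero scalar value, no surrogates.
    cp = int(match.group(1), 16)
    return 0 < cp <= 0x10FFFF and not 0xD800 <= cp <= 0xDFFF


def _escape(text: str) -> str:
    # "&" first, then "<" and ">".
    return (text.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;"))


def keep_ncrs(text: str) -> str:
    """Escape for XML, but pass plausible &#x...; sequences through."""
    if SUB in text:
        raise ValueError("input already contains ASCII SUB")
    # Step 1: hide the "&#x" of each plausible NCR behind SUB.
    hidden = NCR.sub(
        lambda m: SUB + m.group(1) + ";" if _plausible(m) else m.group(0),
        text,
    )
    # Step 2: escape every remaining "&", plus "<" and ">".
    # Step 3: turn SUB back into "&#x".
    return _escape(hidden).replace(SUB, "&#x")


def decode_ncrs(text: str) -> str:
    """Replace plausible NCRs by the character itself, then escape."""
    decoded = NCR.sub(
        lambda m: chr(int(m.group(1), 16)) if _plausible(m) else m.group(0),
        text,
    )
    return _escape(decoded)


print(keep_ncrs("Bokm&#xE5;l & <x>"))
```

The second function returns a Python string; writing it out with
encoding="utf-8" is what actually yields UTF-8 in the XML file.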

The one constant across all programming languages:  The code
has to be updated.  That's not difficult, a minor technical change.

The "keep NCR as is" trick works only for XML as output format.

Frank



_______________________________________________
Ltru mailing list
Ltru at ietf.org
https://www1.ietf.org/mailman/listinfo/ltru




Note Well: Messages sent to this mailing list are the opinions of the senders and do not imply endorsement by the IETF.