
Re: [Uri-review] about: scheme; Simplified Encoding Considerations



Joseph A Holsten wrote:
URI people:

I intend to replace the current about: scheme Encoding Considerations[1]:

   Because many characters are not permitted with this syntax, the
   "segment" and "query" elements may contain characters from the
   Unicode Character Set [UCS] as suggested by URI [RFC3986], by first
   encoding those characters as octets to the UTF-8 character encoding
   [RFC3629]; then only those octets that do not correspond to
   characters in the unreserved set should be percent-encoded.

   By using UTF-8 encoding, there are no known compatibility issues with
   mapping Internationalized Resource Identifiers to about URIs according
   to [RFC3987].  Since about URIs do not use domain names, "ireg-name"
   conversion is unnecessary.

with the following (adapted from hixie's ws: scheme[2]):

   Characters in the "segment" or "query" parts that are excluded by the
   syntax defined above must be converted from Unicode to ASCII by first
   encoding the characters as UTF-8 and then replacing the corresponding
   bytes using their percent-encoded form as defined in the URI and IRI
   specifications. [RFC3986] [RFC3987]
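
For illustration only, the conversion described in both versions of the text could be sketched in Python roughly as follows. The about: syntax itself is not quoted in this message, so the sketch assumes that every character outside the RFC 3986 unreserved set is "excluded"; the name encode_part is hypothetical and does not come from either draft. It also percent-encodes a literal "%", so it is meant for raw Unicode input, not for input that is already percent-encoded.

```python
# Unreserved characters as listed in RFC 3986, section 2.3.
# Assumption: anything outside this set is treated as "excluded by the
# syntax"; the real about: grammar may permit more characters than this.
UNRESERVED = frozenset(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-._~"
)


def encode_part(text: str) -> str:
    """Percent-encode a "segment" or "query" value (hypothetical helper)."""
    out = []
    for ch in text:
        if ch in UNRESERVED:
            # Unreserved characters are copied through unchanged.
            out.append(ch)
        else:
            # Encode the character as UTF-8, then write each resulting
            # octet in its percent-encoded form with uppercase hex digits.
            out.extend("%{:02X}".format(octet) for octet in ch.encode("utf-8"))
    return "".join(out)
```

Under these assumptions, encode_part("é") gives "%C3%A9" and encode_part("a b") gives "a%20b", while plain ASCII such as "blank" passes through unchanged.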

Any objections or issues?
...

I think the current text is clearer with respect to whether IDNA conversion is needed or not.

BR, Julian