[Simple] Re: [Geopriv] Domain identifier in common policy
On Nov 14, 2005, at 4:34 PM, Henning Schulzrinne wrote:
I'm probably well beyond my I18N/IDN depth here, so I agree that
external advice is called for.
Me too. So no argument here.
My understanding of RFC 3987, Section 5, is that these steps are
performed in sequence, starting with the low-cost step in 5.3.1 and
progressing to steps requiring more work if there's no match.
I'm not saying you are wrong, but that's not the impression I got
when reading that section. Otherwise, why is step 1 (simple string
comparison) a MUST in certain situations if it is always a MUST?
First, I think we can restrict ourselves to discussing URIs
appearing in specific protocols, such as part of a SIP From URI or
an XMPP URI.
I do not agree with this. If HELD comes to fruition, we'll be
talking about HTTP schemes as well.
There are probably two cases: Protocols that are clearly specified
as using IRIs already (XMPP, say) and those that require discussion
(SIP, say). [Discussion is required for SIP since SIP itself allows
UTF-8 and RFC 3987 Section 6.3 alludes to the fact that most
schemes do not have to be upgraded to support IRIs.]
I'll stick to the IRI case. If the IRI shows up in the protocol
request, the steps in 5.3.1 would be executed until either a match
occurred or the process falls off the ladder mentioned in the spec.
Clearly, some of the comparison steps do not apply since they
concern the port number or path components of the comparison.
This sounds right. I don't think we need to worry about the URI case
either.
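
To make the ladder concrete, here is a rough Python sketch of the first two rungs as I read RFC 3987, Section 5.3: simple string comparison, then a much-simplified syntax-based normalization. The helper names (syntax_normalize, ladder_match) and the sip: identities are made up for illustration. The parsing is deliberately naive: it lowercases everything after the last "@", so it assumes there are no URI parameters, port, or path.

```python
import unicodedata


def syntax_normalize(iri: str) -> str:
    """Hypothetical helper: NFC-normalize, then lowercase scheme and host."""
    iri = unicodedata.normalize("NFC", iri)
    scheme, sep, rest = iri.partition(":")
    if not sep:
        # No scheme present; nothing more to normalize in this sketch.
        return iri
    # Everything after the last "@" is treated as the host (an assumption).
    user, at, host = rest.rpartition("@")
    return scheme.lower() + ":" + user + at + host.lower()


def ladder_match(a: str, b: str) -> bool:
    """Try the cheap comparison first, then the normalized one."""
    # Rung 1: simple string comparison.
    if a == b:
        return True
    # Rung 2: simplified syntax-based normalization.
    return syntax_normalize(a) == syntax_normalize(b)
```

With this sketch, "sip:alice@example.com" and "SIP:alice@EXAMPLE.COM" match on the second rung, while a difference in the case of the user part still counts as a mismatch.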
From my reading of 3987, the punycode version would be compared as
well during the ladder, presumably by converting the IDN to
punycode. (I suspect that the conversion from UTF-8 to punycode is
unique, while I suspect that this is not true in the other
direction. In other words, multiple UTF-8 strings could generate
the same punycode.)
The word in the RFC is MAY:
Implementations with scheme-specific knowledge MAY convert
punycode-encoded domain name labels to the corresponding characters
by using the ToUnicode procedure.
Also, what do you do with the domain attribute? With
domain="xn--99zt52a" there is no scheme-specific part. Because this
is being done for comparison purposes, perhaps common-policy ought to
insist on the conversion in domain= with a MUST ToUnicode. The only
problem with this is a future scenario where UTF-8 is considered
legal in DNS (DNS is 8-bit) labels. Then you have two domain names
that are compared as equivalent even though they may not be. Perhaps
that's worrying too much.
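
For what it's worth, here is a small Python sketch of what comparing domain= values could look like if both sides are brought to one form first. It uses Python's built-in "idna" codec, which implements the IDNA 2003 ToASCII/ToUnicode operations with nameprep. The function names and the bücher.example domain are only illustrative, and this is not a claim about what common-policy should mandate.

```python
def to_ace(domain: str) -> str:
    """Convert a domain to its ASCII (punycode) form, lowercased."""
    # The codec leaves pure-ASCII labels as typed, so lowercase afterwards.
    return domain.encode("idna").decode("ascii").lower()


def to_unicode(domain: str) -> str:
    """Convert a domain in either form to its Unicode form."""
    return to_ace(domain).encode("ascii").decode("idna")


def domains_match(a: str, b: str) -> bool:
    """Compare two domains after converting both to the ASCII form."""
    return to_ace(a) == to_ace(b)
```

Because nameprep folds case before the punycode step, "bücher.example" and "BÜCHER.example" both become "xn--bcher-kva.example", which illustrates the point above that more than one UTF-8 string can produce the same punycode. A rule file could then carry the readable form while the comparison runs on the converted one.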
This is all a bit messier than ASCII comparison, but I don't think
we want users to edit punycode into their XML rule files.
You'd be amazed at what end users find to do with cut-and-paste.
-andy
_______________________________________________
Simple mailing list
Simple at ietf.org
https://www1.ietf.org/mailman/listinfo/simple