
[Simple] Re: [Geopriv] Domain identifier in common policy




On Nov 14, 2005, at 4:34 PM, Henning Schulzrinne wrote:

I'm probably well beyond my I18N/IDN depth here, so I agree that external advice is called for.

Me too. So no argument here.

My understanding of RFC 3987, Section 5, is that these steps are performed in sequence, starting with the low-cost step in 5.3.1 and progressing to steps requiring more work if there's no match.

I'm not saying you are wrong, but that's not the impression I got when reading that section. Otherwise, why is step 1 (simple string comparison) called out as a MUST in certain situations if it is always a MUST?


First, I think we can restrict ourselves to discussing URIs appearing in specific protocols, such as part of a SIP From URI or an XMPP URI.

I do not agree with this. If HELD comes to fruition, we'll be talking about HTTP schemes as well.


There are probably two cases: Protocols that are clearly specified as using IRIs already (XMPP, say) and those that require discussion (SIP, say). [Discussion is required for SIP since SIP itself allows UTF-8 and RFC 3987 Section 6.3 alludes to the fact that most schemes do not have to be upgraded to support IRIs.]

I'll stick to the IRI case. If the IRI shows up in the protocol request, the steps in 5.3.1 would be executed until either a match occurs or the process falls off the ladder mentioned in the spec. Clearly, some of the comparison steps do not apply, since they concern the port number or path components of the comparison.

This sounds right. I don't think we need to worry about the URI case either.
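
To make the "ladder" reading concrete, here is a rough sketch of how I picture it: try the cheap comparison first and only climb to costlier rungs when there is no match. The function names (`host_to_ascii`, `ladder_compare`) and the rung labels are made up for illustration, it only handles `scheme://host/...` style identifiers, and it uses Python's built-in `idna` codec (IDNA 2003 ToASCII) for the last rung. It is not a full RFC 3987 comparison (no percent-encoding or path normalization).

```python
from urllib.parse import urlsplit


def host_to_ascii(host: str) -> str:
    """Apply IDNA ToASCII (Python's built-in 'idna' codec) to a host name."""
    return host.encode("idna").decode("ascii")


def ladder_compare(a: str, b: str):
    """Return the name of the first rung on which a and b match, else None."""
    # Rung 1: simple string comparison (cheapest, no false positives).
    if a == b:
        return "simple-string"

    pa, pb = urlsplit(a), urlsplit(b)
    # Everything except scheme and host is compared verbatim in this sketch.
    rest_a = (pa.username, pa.password, pa.port, pa.path, pa.query, pa.fragment)
    rest_b = (pb.username, pb.password, pb.port, pb.path, pb.query, pb.fragment)
    if pa.scheme.lower() != pb.scheme.lower() or rest_a != rest_b:
        return None

    # Rung 2: case normalization of the host (scheme already compared above).
    host_a = (pa.hostname or "").lower()
    host_b = (pb.hostname or "").lower()
    if host_a == host_b:
        return "case-normalized"

    # Rung 3: scheme-specific knowledge, compare hosts in their punycode form.
    if host_to_ascii(host_a) == host_to_ascii(host_b):
        return "idna-host"

    return None


print(ladder_compare("http://example.org/a", "http://example.org/a"))
print(ladder_compare("HTTP://Example.ORG/a", "http://example.org/a"))
print(ladder_compare("http://bücher.example/a", "http://xn--bcher-kva.example/a"))
```

Each rung can only add matches, never remove them, which is why the order amounts to a cost optimization under this reading.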


From my reading of 3987, the punycode version would be compared as well during the ladder, presumably by converting the IDN to punycode. (I suspect that the conversion from UTF-8 to punycode is unique, while I suspect that this is not true in the other direction. In other words, multiple UTF-8 strings could generate the same punycode.)

The word in the RFC is MAY:

   Implementations with scheme-specific knowledge MAY convert
   punycode-encoded domain name labels to the corresponding characters
   by using the ToUnicode procedure.
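
On the uniqueness question, a small experiment with Python's built-in `idna` codec (which implements the IDNA 2003 ToASCII and ToUnicode operations) suggests the mapping is many-to-one in the Unicode-to-punycode direction, because nameprep folds case and normalizes before encoding. The helper names `to_ascii` and `to_unicode` and the sample label are my own choices for illustration.

```python
def to_ascii(label: str) -> str:
    """IDNA 2003 ToASCII via Python's built-in 'idna' codec."""
    return label.encode("idna").decode("ascii")


def to_unicode(label: str) -> str:
    """IDNA 2003 ToUnicode via Python's built-in 'idna' codec."""
    return label.encode("ascii").decode("idna")


# Three different Unicode spellings of the same label:
variants = [
    "bücher",         # precomposed u-umlaut, lower case
    "BÜCHER",         # upper case
    "bu\u0308cher",   # 'u' followed by a combining diaeresis
]

# All of them collapse to a single punycode label ...
ascii_forms = {to_ascii(v) for v in variants}
print(ascii_forms)

# ... and that label converts back to exactly one Unicode string.
print(to_unicode("xn--bcher-kva"))
```

So comparing in either form works only if both sides have been run through the same conversion first; a raw UTF-8 string compare would treat the three variants as different.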

Also, what do you do with the domain attribute? With domain="xn--99zt52a" there is no scheme-specific part. Because this is being done for comparison purposes, perhaps common-policy ought to insist on the conversion in domain= with a MUST ToUnicode. The only problem with this is a future scenario where UTF-8 is considered legal in DNS labels (DNS is 8-bit). Then you have two domain names that are compared as equivalent even though they may not be. Perhaps that's worrying too much.
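
For the domain= case, the kind of comparison I have in mind would look roughly like the following. This is only a sketch under my own assumptions: `canonical_domain` and `domain_matches` are hypothetical names, not anything defined by common-policy, and the conversion relies on Python's built-in `idna` codec (IDNA 2003).

```python
def canonical_domain(domain: str) -> str:
    """Bring a domain to one Unicode form: lower-case, ToASCII, then ToUnicode."""
    # ToASCII applies nameprep, so differently typed inputs converge here.
    ascii_form = domain.strip().lower().encode("idna")
    # ToUnicode turns any xn-- labels back into their Unicode characters.
    return ascii_form.decode("idna")


def domain_matches(policy_domain: str, request_domain: str) -> bool:
    """Compare the domain= value of a rule with the domain seen in a request."""
    return canonical_domain(policy_domain) == canonical_domain(request_domain)


# A rule written with punycode matches a request that carries the Unicode form.
print(domain_matches("xn--bcher-kva.example", "bücher.example"))
# Plain ASCII domains still compare case-insensitively.
print(domain_matches("Example.COM", "example.com"))
```

With this approach it does not matter whether the rule author typed (or pasted) the punycode or the Unicode form into the XML file.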

This is all a bit messier than ASCII comparison, but I don't think we want users to edit punycode into their XML rule files.

You'd be amazed at what end users find to do with cut-and-paste.

-andy

_______________________________________________
Simple mailing list
Simple at ietf.org
https://www1.ietf.org/mailman/listinfo/simple