IETF 88 -- Vancouver, CA
Minutes of the WEIRDS WG meeting
15:20-17:20, Plaza C
Murray Kucherawy and Olaf Kolkman, co-chairs
Andy Newton scribing
John Levine taking minutes

- Called To Order -

1) Administrivia

Blue sheets, note well, etc.

2) Document status update

Murray gave a quick document update: Two documents went to IESG Evaluation, but were sent back with DISCUSSes. In summary, we need to send our entire document suite to the IESG all at once; there were too many cross-references to other incomplete documents for evaluation to be possible.

3) URI Syntax

Mark Nottingham: Old advice was that of BCP 56, which said we shouldn't use HTTP as a substrate for other protocols. That's changed; using HTTP as a transport for other protocols and services has become "cool" again. We have a number of efforts in the IETF and other bodies that are doing things with HTTP APIs. We've noticed that there are some protocols that claim specific URIs or URI path components to mean specific things, but this violates some core principles of the web. A document is making its way through APPSAWG to codify this in IETF space, though it's been set down elsewhere already. Services opting in to the web should adhere to web structure rather than try to dictate it. APPSAWG's document will be going to last call pretty soon; it has a lot of support. A document that attempts to violate these principles will create friction with the W3C. For WEIRDS, we can look at the document (there may be some wiggle room), but it's simplest to go either the "well-known" route, or the templates route.

Andy Newton: My understanding is that we are not allowed to say a particular path component is WEIRDS forever. Our drafts don't say that; what they say is that we have an unspecified base URI, and "/domain" after that. Does that work?

Mark Nottingham: We can talk about that. There's danger there in that you're getting near Web philosophy, and you don't want to do that.
It has to do with different resources at different places in the path segment; it has to do with what a higher level resource can do to a lower level resource.

Barry Leiba: There's also a question about the "unspecified" part here, which ".well-known" answers.

Andy Newton: "well-known" won't work here; if you have a single service running two domains (e.g., com and net), you'd have a conflict.

John Levine: Each server is allowed to redirect off to other servers that know about other things. "uk" is run by Nominet while "ac.uk" is run by JANET, and I don't see how you could do that redirect with well-known; what does it redirect to?

Barry Leiba for Mark Nottingham: You can redirect within that space, or to a different host.

Barry Leiba: What if you do .well-known/weirds/nominet?

John Levine: I don't think well-known will break anything, but it doesn't help us much either.

Barry Leiba: How do you get to Nominet in the first place?

John Levine: That's a whole different topic (bootstrapping).

Barry Leiba: Somehow through magic, someone gets the beginning of a URI. Then they're going to tack on "weirds" to it. Mark, are you saying that is still bad?

Mark Nottingham: If that's inside of well-known, it's fine.

Barry Leiba: [restates]

Mark Nottingham: Why can't you just do that discovery by getting the links?

John Levine: Some discovery only gives you a hostname, not a path.

Barry Leiba: Ah, that's important.

Mark Nottingham: That's what well-known is for; you start with a hostname, and then you have a known place at that host to give you the data you need.

Barry Leiba: If all you have is a hostname, and you want to reserve "/domain" under that, that's where the problem begins.

John Levine: I understand the problem now, I just don't know what the answer is.

CHAIRS: We should take this back to the WG; we have a clearer understanding of the problem and there are solutions for us to consider here. Any comments on the issue of caching?

Alex Mayerhofer: We've seen from past protocols that clients have no issues caching.

Pete Resnick: I heard that .well-known is extra bits, that this WG seems to think it's silly/wasteful, but I don't hear anything about it being problematic.

Andy Newton: I would much rather do the well-known thing than retrieving templates.

Andy Newton, for Byron (name): Most of our clients (50%+) do one query and are never seen again.

CHAIRS: We'll take this to the list. Murray will describe the two solutions on the list and we'll figure out where to go from there.

4) Object Inventory

[slides presentation]

George Michaelson (APNIC): We do have remarks, contrary to your list of behaviors. Also, Maint-By, Maint-Ref, Maint-Lower have specific properties that imply delegation of responsibility; it's a model of hierarchical delegation.

CHAIRS, clarifying: The semantics are not important.

Kaveh Ranjbar: I saw some small differences in the list; I can take it to the RIR engineering group and make sure this is up to date.

Olaf Kolkman: If there is ever a consensus call about the poetry objects, I will have to recuse myself.

CHAIRS: How soon can we be done with this work? We are at all of our milestones, we need to set some deadlines. Is January reasonable?

Linlin: I think that's possible.

5) Search, Unicode and Normalization

[slides]

Scott Hollenbeck: Internationalized searching is the issue. We have IDNA text after Andy talked to Peter St. Andre after Berlin. At least one implementer has come back asking why we need to do all this complicated stuff when the requirements only have to do with validation. How do you deal with combining characters, how do you deal with case sensitivity? Our goal in the document is to make the search specification as simple as possible, a basic search functionality. Is what we have in the document correct and adequate? Should it be simplified?

Pete Resnick: Is the inclination of implementers to have more false positives or more false negatives in their search results? That is: Do you want fa(umlaut) to match fa (false positives), or do you want it not to match f followed by a followed by combining umlaut (false negatives)?

Alex Mayerhofer: Client should not know too much about what the server is expected to do; it should do the same thing it does for variants handling. I don't want the query to explode. If you don't have a variant that maps to a single base registration, it gets complicated; never mind.

Kaveh Ranjbar: Based on RIR experience, users prefer more false positives, but some LEAs contact false positives, which is an unpleasant experience.

John Klensin: I agree that Pete's is an important question, but if you start comparing characters by removing decoration, then they are other characters. From a conceptual standpoint, idiosyncrasies of particular languages notwithstanding, a decorated Latin character is a separate character from its undecorated form. That is probably a bit too many false positives for anybody's taste.

Marcos (name): I don't like the framing of this discussion; false positive/false negative makes it sound like there are some heuristics in here and there are not. We have UTS-10, with levels 1-5 with varying subdivisions of precision. I think the question is: Do you want to expose this sort of API to the client? If so, you can give it control over search and matching; otherwise you leave it to server policy.

Pete Resnick: I agree with that in principle. One of the choices is to say this is implementation dependent. Also following up with John, if you decide it's a protocol problem, false positive/false negative is a sliding scale exactly because of what Marcos said. I think it's likely this is how clients will want to go.
Choices are: Leave it entirely up to the servers, or we put some discussion in the protocol about what you can do before you send the request to constrain the false positives/false negatives.

Peter Koch: What Marcos and Pete said, but: Since we're not developing a protocol but a service description, and because we have a multi-stakeholder elephant in the room, we need to make it very clear that the specification is deliberately ambiguous. Anyone referring to it has to be aware that you cannot refer to this to implement it.

Andy Newton: From a practical standpoint, the number of results that are returned will be limited in order to restrict data mining.

John Levine: The servers are going to do whatever they want. We could demand that the query is a U-label followed by an asterisk, and I think that's all we can say.

John Klensin: If we're going to be in the standards business, there has to be some minimal expectation of performance and behavior. I wonder when this specification will be reviewed by the appropriate committees.

Scott Hollenbeck: The work being done here has gone a long way toward guiding those committees.

John Klensin: And the three committees that come after it?

Scott Hollenbeck: That's a very different question.

Pete Resnick: I'm always loath as an AD to come up to the mic and say technical things in a WG. I understand that this particular issue of I18N is awful. The PRECIS WG is trying to come up with advice, and it still wouldn't necessarily make people in this WG happy. What I've heard some people say is that this document should be minimal. I don't know if we should go as far as what John Levine said, but at least giving what is expected as far as an encoding is, I think, the minimum; you may want to go further and specify the responsibilities of the client and the server regarding encodings and normalizations.

John Klensin: Maybe the specification you want is to say there has to be a syntax for exact matches, and they have to exact-match.
If someone then invokes something other than exact match, let it be implementation/server/registry dependent. I don't know that that's a wonderful idea; it may be that all other options are worse or unrealistic.

Marc Blanchet: Exact match is not search, so I interpret that as "don't do search".

John Klensin: What I mean is: Specify match, be explicit that fuzzy match is fuzzy, and stop.

Alex Mayerhofer: I think the stronger restrictions on servers we specify, the fewer server implementations we should expect to see.

John Levine: If I give you a U-label and you are a registry that does variants, what does "exact match" mean? Does it match a registered label, one in the same bundle as something registered, …

Olaf Kolkman: Doesn't that fall under "unspecified"?

John Levine: All matching I've seen to date are ASCII prefix matches. I don't know if they do anything beyond that.

Pete Resnick: I think John said you allow an exact match of whatever the server has as its plausible result set. The server could decide to do variants, but that's not a protocol issue, it's a policy issue.

John Levine: Each registry already has a variants policy.

Marc Blanchet: I think that's simple: If a variant is defined in the registry, whatever that means in terms of the zone file, then it's a match. It's just another name that's in the registry. Maybe it would be a good idea to add text, but are we starting to open a can of worms that we don't need to? Maybe some adjustment so that our current text includes this possibility without naming it; whenever that name is defined in the registry, (off-mic chatter).

Scott Hollenbeck: We did some interop testing on Sunday. The Verisign implementation was returning results against Unicode strings; simple binary comparisons. Were there any other results?

Guillaume (name): We need time to consult our notes.

Scott Hollenbeck: We did get some results.

Murray Kucherawy: If the servers are free to decide, does the reply need to indicate to the client anything about the response it's getting?

Scott Hollenbeck: That seems overly complicated.

(name): We did not test variants. We did not test whether multiple answers were returned. We have at least two implementations that implement U-label search.

Marc Blanchet: I don't know if we settled on what to do here, but if we settle into exact match, we need to say a few things about that. Where is the normalization being done? Who does what in terms of preparing and comparing the string? We need to say something about that.

Olaf Kolkman: Exact match is prefix matching plus anything after it, correct? (answer appears to be "yes")

Pete Resnick: I think what we were hearing before is that the client should get it into a U-label form, and that's the only responsibility. The server should match U-label forms exactly, and fuzzy matching is completely server dependent.

John Klensin: As long as that's what everyone heard, I'm happy.

CHAIRS: Do we need text here? If so, please send suggestions. Peter Koch to send text if needed.

John Klensin: I think ultimately the things you're looking up are U-labels, or you're off into very funny territory. U-labels match at most one DNS entry; it's a characteristic of U-labels. Whether we call them variants or not, if we decide we want to put links into a database between two names, that's a different kind of issue than whether these are variants. This is a very different thing about policy questions with respect to registrars. A variant is just one kind of link that joins two names, and that's it. I'm terribly afraid of ICANN defining what variants are.
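
Illustration: the division of labor summarized above (the client puts the query into one canonical form, the server compares exactly, and a trailing asterisk asks for a prefix match) might be sketched in Python as follows. This is only a sketch under stated assumptions: NFC normalization plus lowercasing stands in for full U-label preparation and validation, and the function names and sample names are hypothetical and do not come from any WEIRDS draft.

```python
import unicodedata
from typing import List


def to_query_form(label: str) -> str:
    """Client side: bring the query into a single canonical form.

    Assumption: NFC plus lowercasing is a stand-in for real U-label
    preparation; it does not perform IDNA validation.
    """
    return unicodedata.normalize("NFC", label).lower()


def search(query: str, registered: List[str]) -> List[str]:
    """Server side: exact comparison, or prefix comparison when the
    query ends in an asterisk. No fuzzy or variant matching is done."""
    if query.endswith("*"):
        prefix = query[:-1]
        return [name for name in registered if name.startswith(prefix)]
    return [name for name in registered if name == query]


# Hypothetical registry contents, stored in precomposed (NFC) form.
REGISTERED = ["f\u00e4.example", "fa.example", "f\u00e4rber.example"]

# "f" + "a" + combining diaeresis is normalized to the precomposed form,
# so it matches the stored name; undecorated "fa" remains a different name.
decomposed_query = to_query_form("fa\u0308.example")
exact_hits = search(decomposed_query, REGISTERED)
prefix_hits = search(to_query_form("fa\u0308*"), REGISTERED)
```

Under these assumptions the decomposed and precomposed spellings find the same record, while the undecorated name is never returned for the decorated query, which keeps the exact-match case free of the false positives discussed earlier. Anything fuzzier would be server policy.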

Pete Resnick: I was hoping the earlier answer to this is that the protocol will be agnostic as far as what it returns in the search beyond what it considers a partial match; the server is expected to return things that match the query, and if it wants to return records that match in some interesting way (avoiding terminology like "variants") it's welcome to do that, but that's server dependent. All a server is expected to do is return those things that match the partial query.

John Klensin: I like that solution. I think the problem in the real world is that it (a) takes you down the path of multiple kinds of information depending on who you ask and what day it is, and that's not fun; and (b) if some authority over the data comes in with a requirement to do this in a certain way, then hand-waving about additional information a server might return will get us into trouble. I'm trying to push this more toward database theory than "return what was asked for plus hand-waving".

Pete Resnick: We at least want to say something more definite about returning exact/partial matches plus any other data referenced within the database. You want something a little more definite about the current state of affairs about the data available, not some magical algorithm we need to apply.

John Klensin: Right, the magical algorithm has been one of WHOIS' historical problems. The other caution about this is that if one party owns/leases a name, and there is an owner/lessee of some other name associated with the first one by someone's policy rules, and you return both when asked for one, that's a potential privacy violation. We'll need to consider the privacy of the relationship between the two relative to who's doing the query.

Scott, Alex, Olaf: We need to make sure privacy is discussed in the document.

(name): If an RDAP query does not match a variant, it should not be returned.

CHAIRS: John Klensin, could you write your concern about matching/linking down in a couple of sentences or a paragraph that could go into the document? Scott and Andy, new milestone?

Scott, Andy: January is not out of the question. We need to have some more list discussion.

6) Bootstrapping

[slides]

Marc Blanchet: During the Berlin meeting we talked about several solutions. We have drafts available for all of these ideas; which one do we want? When we reach consensus, we'll advance one of them. The three were:
(1) A .arpa solution for names or numbers;
(2) IANA registry-based solution;
(3) query the TLD (for names) or the daily-updated NRO file containing a detailed list of all allocations for the five registries, which could be augmented with an RDAP server for each allocation.
Some pros and cons are shown [slides]. Which solution are we converging on?

Alex Mayerhofer: I don't want to open a can of worms, but it seems obvious that at least for numbers, another option would be to have a set of servers that just do redirects to the right RIRs. Has this idea been discussed?

Marc Blanchet: There is an entity that tells you where to go, in the (3) case.

(name): The DNS solution: I'm not against this in general; I'm against what's described in the draft. The IANA solution is okay; I already have to synchronize the name of my WHOIS client with their information. The other solution reminds me of a draft I wrote ten years ago, but for numbers it's a very bad idea. The syntax is completely arbitrary, and the URI is a WordPress URI; it seems very brittle, don't do it that way.

Peter Koch: There's a split between names and numbers which should be appreciated more. Numbers work well today because there are just five RIRs, even if it's hard-coded into clients or some other weird mechanism. This doesn't have to be a one-size-fits-all solution. Does this need to be a unified solution?

Marc Blanchet: Some people say it would be good to have a single solution.
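
Illustration: one way to picture the registry-based option (2) for names is a published table mapping name suffixes to RDAP base URIs, which a client consults with a longest-suffix match before appending "/domain" and the queried name. The Python sketch below rests on assumptions: the table contents, the placeholder hostnames, and the function names are hypothetical, and none of the drafts under discussion is claimed to define this format.

```python
from typing import Dict, Optional

# Hypothetical bootstrap table: suffix -> RDAP base URI.
# Hostnames are placeholders, not real services.
BOOTSTRAP: Dict[str, str] = {
    "uk": "https://rdap-uk.example/",
    "ac.uk": "https://rdap-ac-uk.example/",
    "com": "https://rdap-com.example/",
}


def find_base_uri(domain: str,
                  table: Dict[str, str] = BOOTSTRAP) -> Optional[str]:
    """Return the base URI for the longest suffix of `domain` in `table`."""
    labels = domain.lower().rstrip(".").split(".")
    # Try the longest suffix first, so "ac.uk" wins over "uk".
    for i in range(len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in table:
            return table[suffix]
    return None


def domain_query_uri(domain: str) -> Optional[str]:
    """Build base URI + "domain/" + name, or None if no entry applies."""
    base = find_base_uri(domain)
    if base is None:
        return None
    return base + "domain/" + domain.lower().rstrip(".")
```

With this table, a name under ac.uk is sent to the ac.uk entry while any other name under uk falls back to the uk entry. The sketch only works for suffixes that someone has actually listed, so it does not by itself say who is entitled to list a suffix below a TLD.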

Peter Koch: There's no firm requirement that it be unified, correct?

CHAIRS: No.

Murray Kucherawy: Is redirection expensive? Is that something we need to worry about?

Peter Koch: No. And redirection is not the only way it can work. At least for the V4 space, there are clients that can do the mapping to the right RIR anyway.

Peter Koch: Back to names: What's not in the draft is how to solve the .co.uk problem, and public suffix is not an acceptable answer. So how would that work?

Marc Blanchet: The draft is incomplete. It doesn't answer all the questions right now.

Peter Koch: This could be a problem, a misleading of the client by .uk about .co.uk data (for example).

Marc Blanchet: That's what I was trying to show here [slides].

Peter Koch: So how would we solve this? Or do we want to?

Olaf Kolkman: The solution might be something you don't like; it might involve some walking.

Peter Koch: "Can be secured": Are you after authentication or confidentiality?

Marc Blanchet: Good point. It's more like: What can we do here in terms of security? What's possible? For DNS, you can secure it with DNSSEC, so you get what that gives you.

Peter Koch: The criteria you're evaluating have not reached consensus yet, right?

Marc Blanchet: Not at all.

CHAIRS: We're hoping this discussion will lead to some indication of what is a good way forward. Understanding what's in the slides is important, but really we'd like to know if it's obvious there's a good candidate going forward.

John Levine: For the names stuff, in Berlin, I heard strong pushback from the ccTLDs over anything that requires IANA to publish stuff. In terms of sub level TLDs, there's the ac.uk question (Nominet and JANET are friends; they could be persuaded to do referrals) and there's the uk.com question (just another customer of Verisign; unlikely to get referrals).

Marc Blanchet: This needs to be documented.

Wes Hardaker: I don't care which solution you pick.
One thing to note about reverse DNS queries on the address is that they're blocked on the 8-bit boundary, so you're limited to how you can search. There is a draft in dnsop to solve this, but it's expired.

Marc Blanchet: We're talking about allocations, so the boundaries are more likely to be okay.

Doug Otis: We do something like this all the time, we run it every day and generate our own list. We see 130M top level domains and about 5M of them are computer-generated names, part of a botnet, which we used to figure out where botnet CCCs might be. Try to set up a database to handle this stuff and it blows up. It's hard to keep a lid on it. We try to slice it at what we think to be the registered domain name. If you can come up with a strategy for doing this in conjunction with the registrar, you can possibly make progress. A way to signal this from the registrar, say a DNS record published by the registrar, would be good.

John Klensin: I'm hearing a lot of assertions that are not useful bases for protocol design. We can't design for things based on "there are only five of them". Similar comments apply on the numbers side because of the boundaries of allocations today. We can't design things based on parameters and nuances that are not under IETF control.

Marc Blanchet: We also can't conclude that all five RIRs will continue to work together.

Andy Newton: On names vs. numbers: A lot of WHOIS clients don't make you decide whether the query is a name or number; they try to figure it out and send the query to the right place. Users don't want to spend time figuring out what kind of query they're doing, and we can expect people to avoid adopting RDAP if they have to do that. Has to be simple, especially for people that are writing simple scripts.

Ray Bellis: +1 what John Levine said.

Andrew Sullivan: I quite strongly disagree with that. Every time you delegate, you do establish a registry under you.
I don't see any problem with a design that depends on people being able to assert "I have this service under this name" and I think that would be the right way to do this. You probably need to do it on the basis of the DNS.

Olaf Kolkman: You're saying we need to follow the delegation model of the DNS, but not necessarily store it in the DNS.

Andrew Sullivan: Yes, exactly. We've got hierarchical spaces here, so we should follow the hierarchy.

John Klensin: I'd like to point out that we're discussing requirements set by people who aren't in the room, which is the definition of "bogged down". I'd like my email to the ADs and the chairs sent earlier in the week to be incorporated here by reference.

CHAIRS: Noted.

CHAIRS: Did we hear any expression of a favorite?

Pete Resnick: We haven't heard anything about what the specific problems are with IANA. I'd like to hear those. If it's a political problem, I'd like to know if it will interfere with deployment. For the DNS solution, I'd like to know what the technical problems are. These need to be laid out on the list, so we can finally pick one of these. I'm hearing some issues with the autonomous solution, for example.

7) RDAP-WHOIS backward compatibility

Ning Kong: There are requirements on the RDAP open source implementation that it implement port 43 query proxying. This raises a number of issues that could cause end user confusion. Do we need to update RFC3912? [see slides for details]

Alexander Mayerhofer: I think your assumption that there's a data consistency problem is not necessarily correct because they're two mechanisms to access the same database.

Ning Kong: One user might be authorized to get some fields from RDAP but get all-or-nothing from WHOIS.

Alexander Mayerhofer: You're trying to piggyback an old service on top of a new service.

Ning Kong: Inconsistent presentation is also a concern.

Alexander Mayerhofer: The client can choose which format it wants by selecting which service it hits.

Kaveh Ranjbar: I agree, this is an operational issue and not something for the WG to consider.

Peter Koch: Not everything that appears in ICANN's wish list becomes a requirement for this working group. We also don't want to end up with a bunch of future working groups working on transition mechanisms.

(name): It's not clear what "proxy" means here. As long as the problem is ill-defined, we shouldn't even ask if it's a WG issue.

Francisco Arias: What we meant here is something else. I concur that this is not a WG concern. We can talk offline.

8) Interop report

[slides]

CHAIRS: We definitely want to do this in London. Email the list if you would like to be part of the next one.

- Adjourned -