2.1.11 Uniform Resource Identifier Revisions (urirev04) BOF

Current Meeting Report

Minutes URIREV04 BOF, 8/6/2004 9-11:30 AM
Minutes were taken (using Jabber) by Ted Hardie, Paul Hoffman and Lisa Dusseault (with some contributions from the net). Larry Masinter used the log to create the minutes.

Ted = Ted Hardie, Larry = Larry Masinter, Roy = Roy Fielding,
Martin = Martin Duerst, John = John Klensin, James = James Seng,
Leslie = Leslie Daigle, Pete = Pete Resnick

Original agenda
  5 min  Agenda review
 15 min  RFC 2396 revision
         *** draft-fielding-uri-rfc2396bis-06.txt
         IETF last call has been requested.
         (Roy Fielding)
 15 min  Internationalized Resource Identifiers (IRI)
         *** draft-duerst-iri-09.txt
         Let's hear current state & try to get this one out.
         (Martin Duerst)
 15 min  RFC 1738 scheme updates
         *** draft-hoffman-rfc1738bis-02.txt
         Finally updating/obsoleting old spec.
         (Paul Hoffman)
100 min  Registering URI schemes
         The policy and process for registering new URI
         10 min  Introduction to topic
                 "Guidelines for URI schemes",
                 *** RFC 2718, RFC 2717 registration procedure
                 "vnd and prs trees"
                 *** draft-king-vnd-urlscheme-03.txt
                 Some recent 'problematic schemes'
                 Request for IANA registry of 'proposed' schemes?
         75 min  Discussion
         15 min  Wrap-up: conclusions, next steps, action items

RFC2396bis (draft-fielding-uri-rfc2396bis-06.txt) is ready for IETF last call. It has had a very vigorous review on uri@w3.org. Larry believes that it will be relatively clean sailing through the rest of the process. Ted notes that it is certainly ready for IETF last call.

There were many remarks about how RFC2396bis is much better than RFC2396, and a real service to the community. There was appreciation and applause for Roy's work on the document.

We discussed Section 6.3 (Canonical Form):
The discussion started with questions: are the following normalized?
'http://www.w3.org/2000/01/rdf-schema#' (empty fragment identifier)
'http://example.com:80/' (port number = default port)

Roy noted there had been a normalization step which removed empty fragment identifiers, but it was removed on request. The removal of a default port was considered a 'protocol-specific' normalization. We discussed other possible normalizations by invoking other equivalences listed in section 6, e.g., '6.1' and '6.2', etc. It was noted that different software uses different canonical forms, e.g., XML namespaces vs. HTTP proxy.
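As a rough illustration of the two questions above, the following Python sketch treats removal of an empty fragment and removal of a default port as separate, optional steps on top of case normalization. The function name, the flags and the default-port table are assumptions made for this example; they are not taken from RFC2396bis, and the sketch handles only simple host[:port] authorities (no userinfo, no IPv6 literals).

```python
from urllib.parse import urlsplit

# Illustrative default-port table (an assumption for this sketch,
# not something defined by the URI draft itself).
DEFAULT_PORTS = {"http": 80, "https": 443}


def normalize(uri, drop_default_port=False, drop_empty_fragment=False):
    """Lowercase scheme and host; optionally apply the two debated steps."""
    # Split the fragment by hand so that an empty fragment ("...#")
    # stays distinguishable from an absent one.
    base, hash_sign, fragment = uri.partition("#")
    parts = urlsplit(base)
    scheme = parts.scheme.lower()
    host = parts.hostname or ""  # urlsplit already lowercases the host
    port = parts.port

    # Scheme-specific step: drop the port only if it is the scheme default.
    if drop_default_port and port == DEFAULT_PORTS.get(scheme):
        port = None

    authority = host if port is None else f"{host}:{port}"
    result = f"{scheme}://{authority}{parts.path}"
    if parts.query:
        result += "?" + parts.query

    # Keep the "#" unless the fragment is empty and removal was requested.
    if hash_sign and (fragment or not drop_empty_fragment):
        result += "#" + fragment
    return result


rdf = "http://www.w3.org/2000/01/rdf-schema#"
print(normalize(rdf))                                  # keeps the trailing "#"
print(normalize(rdf, drop_empty_fragment=True))        # drops it
print(normalize("http://example.com:80/"))             # keeps ":80"
print(normalize("http://example.com:80/", drop_default_port=True))
```

With both flags off, the two URIs from the discussion come back unchanged, so whether they count as "normalized" depends on which optional steps a given piece of software applies.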

Roy concluded that Section 6.3 says exactly what it was meant to say, and nothing was left out on purpose. Larry suggested removing section 6.3 from the 'Standard' document and, if desired, having a separate Proposed Standard for a URI canonical form.

The ATOMPub group needs a URI canonical form and will review this section. This discussion is to be continued on the list.

This section will need some work, but it shouldn't hold up IETF last call; instead, the issues can be handled as last call comments. Roy says he won't change this section until there is actual data.

A request to the IESG for publication of the IRI draft (draft-duerst-iri-09.txt) has been sent. The mailing list for discussion is public-iri@w3.org; last call comments go to the IESG and public-iri@w3.org. Larry was an original author, then the document was handed off to Martin; then Michel joined as co-editor.

There is an issues list which notes issues and closures at w3.org as well. Nit: two of the names of the productions in RFC2396bis changed; cosmetic changes in the IRI document are needed to track them, but IETF last call need not wait for this. The W3C/IETF coordination call discussed how to move forward on this. The current proposal is to move the two documents (RFC2396bis and IRI) to Last Call at the same time.

If issues come up with the URI spec, they will either be orthogonal to the IRI spec or the two specs will need a joint update. The big difference of course is that IRI goes to Proposed Standard.

Ted is concerned that we might need an informational document in the future, on usage of URIs and IRIs -- when do you use one, when to use the other, and other useful advice. For example, Larry talked to someone who wondered if they should register a URI scheme or an IRI scheme or both; perhaps it isn't clear there is just *one* namespace for schemes.

Martin would be willing to help with a document that is aimed at protocol developers, but might need help from someone with a more "outside" view.

Martin notes that discussion of URIs vs. IRIs in ATOM has not progressed in part due to limitations in the wiki. (???)

1738 scheme revision
RFC 1738 contained not only the early URI syntax but also several scheme definitions. Paul Hoffman did a draft which extracted the definitions of those schemes: ftp, gopher, news, nntp, telnet, wais, file, prospero.

Most of the discussion on the document has been on 'file:'. 'file:' doesn't interoperate across platforms, and sometimes not even on the same platform (e.g., one Windows application will treat file: URIs differently than another).

'file:' is deeply broken. Is it worth trying to fix, given how widely deployed the broken implementations are?

Are we proposing new protocol work? It isn't worth doing new protocol work if the people responsible for the deployed software aren't going to participate in the process. Would having a spec be a forcing function? Should we just describe the way in which current implementations differ?

Larry thinks there's some common practice around UNC pathnames & drive letters. It would be useful to document what's there. And people can use file: URIs interoperably as long as they're really relative URIs based on files that are selected in some other manner.
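To make the drive-letter, UNC and relative-reference points concrete, here is a small Python sketch using the standard urllib.parse module. The three file: spellings are assumptions chosen for illustration (forms of the kind seen in practice), not a list taken from the draft or from the meeting.

```python
from urllib.parse import urljoin, urlsplit

# Hypothetical spellings of "the same kind of thing" as a file: URI.
EXAMPLES = [
    "file:///C:/docs/a.txt",      # drive letter, empty authority
    "file://server/share/a.txt",  # UNC-style: server name as authority
    "file:C:/docs/a.txt",         # no authority component at all
]


def describe(uri):
    """Return the (authority, path) pair a generic URI parser sees."""
    parts = urlsplit(uri)
    return parts.netloc, parts.path


for uri in EXAMPLES:
    print(uri, "->", describe(uri))

# A relative reference resolves the same way whatever form the base takes,
# which is why relative use sidesteps most of the interoperability problem.
base = "file:///C:/docs/index.html"
print(urljoin(base, "img/logo.png"))  # file:///C:/docs/img/logo.png
```

A generic parser reports a different authority/path split for each spelling, so two applications that each accept only one of them will not interoperate on absolute file: URIs, while the relative reference on the last line carries no platform-specific syntax at all.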

Martin thinks it would be useful to give implementors something they can converge on... perhaps within 5-10 years.

Larry suggests describing common practice and stopping there; if convergence is desired, all of the implementors can pick, say, "Section 2" and recommend it. We can even make that an editorial suggestion in the specification.

Is this 'a description of what is out there', but 'not a description of what people have to implement'?

There was a discussion about whether 'gopher' was going to be marked 'historic', and whether any of the old schemes might be either 'left behind' or 'obsoleted' or 'marked historic' and what that might mean. Marking something 'historic' might imply a different status than intended; gopher:, prospero: and wais: aren't dangerous, just not widely used.

There was a discussion back and forth about splitting this into several documents, leaving some in one document, etc. Not splitting now would just make someone else split it later. The discussion led to the conclusion that Paul would split this into seven separate documents; some people would be asked to take on some of the documents (news & nntp to usefor, for example).

There was some question about whether this would block RFC 2396bis. The thought was that it would not: the metadata about whether RFC 1738 is 'updated' or 'obsoleted' is maintained independently.

Registration process, guidelines
(RFC 2717, 2718, draft-king-vnd-urlscheme)

We discussed the process for registering URI schemes.

* URI registration is broken:
The public perception of URI scheme registration is off from reality. There are many schemes whose attempted registration has languished for years without any deterministic process for either registering them or saying 'no' definitively.

We originally made an exception to the guidelines for URI schemes which allowed schemes to be registered even if they didn't quite meet the guidelines, provided they were widely deployed. The result has been that people just use their scheme and hope that, if it gets widely deployed, they will get a registration.

The original intent of the high bar was to keep the number of registered schemes down. But people just mint them, and plan to register later. Now we are seeing conflicts; e.g., 'mmms:' has different interpretations used by 3GPP and Microsoft. We need to fix this.

Ted suggests that we abandon the idea that registration will reduce total number. Our only purpose should be to eliminate namespace conflicts.

* What's in the registry? Is there a 'line'?
We discussed various forms of registries that might set some line, with schemes below the line regarded as not as good as schemes above the line. One example was a provisional registration followed by a permanent one after six months.

Ted suggests a provisional registration that provides a specification or an implementation pointer, for six months. If someone already has a provisional registration and a spec, they win, they get in.

James suggests that perhaps the rule is that the scheme has to have two different implementations.

Leslie asks how one might make a URI processor that can handle a zillion different schemes.

Leslie suggests that there are two classes: ones with published specs, ones without; we should discourage non-protocol schemes.

Requiring a 'definition' or a 'protocol' for a scheme might not be enough; Paul gave an example of a scheme with a 'definition' which is 'just like http', i.e., it's well-defined, but useless as a URI scheme.

John points out that if we set up barriers, people will do whatever they do anyway.

Larry suggests registering implementations of URI schemes.

Larry says rather than setting a threshold ("must have at least 1 implementation") just document the values in the registry, and let people come to their own conclusions.

Leslie says the criterion might be that running code and/or a specification is enough to get above the line. Larry wants us never to draw a line and just to list pointers, because people will game the line.

Leslie thinks that "community vote" is OK, as long as we clearly define what is IETF (as in: on standards track). Larry agrees.

* Abuse:
Roy says that we might also need to worry about preventing abuse, e.g., registering URI schemes with other people's trade names, etc.

John points out that there is an easy denial-of-service on other people's names. With IANA and port numbers, the rule was 'you get one for free' but for the second registration, you need to provide something, e.g., a protocol definition.

Paul says there are big WIPO problems. John says that WIPO will probably just let people sue each other. John says that ICANN just defers to WIPO.

Paul talked about formal association with WIPO. John points out (again) about insanity and WIPO.

James says we don't have a problem now because the bar is so high; if we lower the bar, it's going to become a problem.

Geoff talks about WGs "preempting" registration. [[ed: ??]]

* Duplicates (and comparison to header registry)
Larry suggested that perhaps multiple registrations for the same scheme might be allowed. This caused wild disagreement in the room ("that's nuts", "terrible, terrible", "if we allow collision, let's just not do this").

Pete says we had the same fight about the header registry; points out that we wanted a single place for people who didn't want to have a conflict to look.

Pete thinks that we can have duplicates in the registry. Pete says "document usage, allow people to see what isn't in use". John says let the bad guys duke it out.

Leslie asks "what happens when a bad guy wants to add a second registration for urn:?".

Larry thought that allowing duplicates might reduce the value of some denial-of-service attacks, because someone else registering 'roy:' wouldn't stop Roy from using it.

Ted thinks the bar should be set above allowing multiple registration.

Martin says the Web just doesn't work with multiple schemes. Martin wants each one clear, and wants to resolve the problem with the current duplicates.

Leslie would have preferred a universe with just one, and a smaller number of schemes. But she wants to acknowledge reality. We should give the clearest picture possible of the universe.

Ted acknowledges that keeping the number of URI schemes low was not of benefit to the user. Argues that trying to shape it to avoid collision is paramount. Paul agrees with Leslie.

John points out that the header registry's purpose is to say "here's the legitimate use of foo, but there is another use". The header registry is used for security and user-defense warning (e.g., which systems might send headers which have different meanings).

Martin asks if the header registry works and what implementers think of it; how do negative comments get into the registry?

John says the negative comments can get there through the standards process.

Tony Hansen clarifies how the header registry works.

* scaling:
Roy asks about scaling for the IANA registration, and suggests that maybe we use an issue-tracking system to help with the scaling.

* vnd and prs:
We have gotten enormous pushback on the vnd- and prs- trees. Larry suggests abandoning them, and it is agreed.

* Guidelines
Someone unknown asks about getting on IANA pages. Larry clarifies that we are talking about changing the way to get onto the IANA pages. Martin points out that some people understand the current rules, but others are clueless and then get grumpy.

Cyrus Daboo points out that there is only a single entry right now.

* Other orgs:
Martin points out that other standards organizations can pass in a MIME type template in their own specs that must go through the IESG, and asks that we think about that for URIs as well.

* summary:
Ted says that we have general agreement that we need to lower the bar and acknowledge what is happening now.

We have exposed many good issues, but there are still many details to be worked out.

Tony Hansen agreed to co-author with Ted on a draft.

This will be a replacement for RFC 2717 and RFC 2718 (registration procedures AND guidelines). Martin Duerst has offered to help.

Roy notes that the guidelines in RFC 2718 are really about locators, and non-locator schemes may need some additional, but different, guidelines.

The non-IETF trees (vnd- and prs-) will be dropped.

Wrap-up
We agreed to do everything on uri@w3.org. Possibly even scheme review in the future.

Ted says that he wants to close down the uri-review mailing list.

Martin says that URIs are important, and that the W3C may make the URI Interest Group mailing list a public list.

* Postscript:

On the jabber log, but not in the meeting, Michael Mealling asked: Has anyone actually gone to the authors of the non-registered schemes and asked them what it would take to make them happy?


None received.