2.1.19 Whois Enhancement (whoisfix) BOF

Current Meeting Report

August 9, 2001


Chairs: Eric Brunner-Williams (EBW) and Dave Crocker (DC)
Minutes: Ted Hardie

Meeting plan:
o requirements overview
o charter discussion
o RIR state-of-whois overview
o TLD state-of-whois overview
o design discussion: queries and formats
o transitions
o interactions with other work groups
o it isn't whois:43, trademark, law enforcement, LDAP

Summary of Actions taken:
-------------------------
Mailing list for further work set as ietf-whois@imc.org
Agreed that charter needed revision.
A set of initial candidate problems established and ranked.
Agreement reached that if those problems could not be solved on port 43 while maintaining backward compatibility, the group would cease as a whois group, potentially reusing the work to create a new protocol.

Meeting Review:
---------------

DC started the general discussion by noting that the scope of the effort is to "fix" whois, not to enhance or extend it into a replacement for LDAP or other mechanisms. The key requirements that the chairs see lie in satisfying the needs of address registries, domain name registries (ccTLD and gTLD), and network operators.

The proposed charter limits the scope to enabling technical changes to queries, responses, and redirection mechanisms.

Out of scope are non-operational issues, e.g., civil and criminal legal generalized theories of content, access and capabilities.

The basic goals identified by the chairs are:

1) standardize structured query format,
2) produce mechanisms to allow for query redirection and server-server queries, and
3) maintain interoperability among old clients, old servers, new clients and new servers.

Initial timetable:
Sep 01: Initial spec
Jan 02: Revisions
Mar 02: Done

---------------

Mirjam presented the RIPE NCC whois uses:
track IP and ASN allocation,
maintain contact info, and
maintain both reverse and forward domain name data.

Data are populated both by the RIPE NCC and the domain holders.

In addition, RIPE uses whois as part of its routing registry (RIPEDB), which registers some forms of routing info; this is based on RPSL, a syntactically rich form which allows users to produce router configurations and which contains route object maintainer contact information.

She noted that APNIC uses a version of the RIPE NCC code for similar purposes.

The query language is structured:
o RPSL for the routing registry,
o attribute/value pairs for the IP data.

This structure is transparent to the user. Data is kept in databases and whois is an access method rather than a data description language.
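
As a rough illustration of why attribute/value output is straightforward to consume, the following Python sketch splits a response into attribute/value pairs. The sample record, its attribute names, the '%' comment convention, and the parse_attr_values helper are all assumptions made up for this example; none of them is taken from the presentation or from any registry's actual output.

```python
def parse_attr_values(text):
    """Split 'attribute: value' lines into a list of (attribute, value) pairs."""
    pairs = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and comment lines (assumed here to start with '%').
        if not line or line.startswith("%"):
            continue
        # Split on the first colon only, so values may themselves contain ':'.
        attribute, sep, value = line.partition(":")
        if not sep:
            continue  # not an attribute/value line; ignore it
        pairs.append((attribute.strip(), value.strip()))
    return pairs


# Hypothetical sample record, for illustration only.
SAMPLE = """\
% illustrative record, not real registry data
inetnum:  192.0.2.0 - 192.0.2.255
netname:  EXAMPLE-NET
admin-c:  XX1-EXAMPLE
"""

if __name__ == "__main__":
    for attribute, value in parse_attr_values(SAMPLE):
        print(attribute, "=", value)
```

A list of pairs is used rather than a dictionary because an attribute may plausibly appear more than once in a record.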

The RIPE NCC server can act like a client, but the interaction is pure referral and provides no state for the interaction.

The domain data maintained by RIPE was initially on reverse domains for in-addr.arpa, but it can be, and has been, used for forward data on behalf of ccTLDs. More and more ccTLDs are moving out of the RIPE database; now only the top level domains are in RIPE, but referrals can be made to whois servers maintained by the individual ccTLDs.

-----------

Cathy presented the ARIN whois use, similar to that of RIPE (above). She noted that the principal issue facing them is scaling -- the service runs at the saturation point of the servers involved. It ran at 20 million queries a month at the beginning of 2000, moved up to 35 million by the end of the year, and then jumped when new servers were added. Before the addition the query rate was 14.54 queries/sec; post-addition it ran up to 20 queries/sec by July.
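
The monthly totals and per-second rates quoted above are easier to compare in common units. The Python sketch below does the conversion, assuming a 30-day month and treating every figure as an average; both assumptions are the editor's, not ARIN's, and the helper names are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def per_second(queries_per_month, days_in_month=30):
    """Average queries/sec implied by a monthly total (30-day month assumed)."""
    return queries_per_month / (days_in_month * SECONDS_PER_DAY)


def per_month(queries_per_second, days_in_month=30):
    """Monthly total implied by a sustained queries/sec rate (30-day month assumed)."""
    return queries_per_second * days_in_month * SECONDS_PER_DAY


if __name__ == "__main__":
    for monthly in (20_000_000, 35_000_000):
        print(f"{monthly:>11,} queries/month ~ {per_second(monthly):.2f} queries/sec")
    for rate in (14.54, 20.0):
        print(f"{rate:5.2f} queries/sec ~ {per_month(rate):,.0f} queries/month")
```

Under these assumptions, 35 million queries a month averages about 13.5 queries/sec, and a sustained 20 queries/sec corresponds to about 51.8 million queries a month.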

Cathy also noted that her users have requested referral whois.

-----------

Randy discussed use, noting that many ccTLDs run RIPE code, and that ISPs commonly use the RADB and IRR.

There is a large community using a traditional "dumb" port 43 (i.e. no RPSL structure) for contact data.

Randy noted the four software families:
ARIN (Forked from NSI), and

Andy noted some ccTLDs (e.g. New Zealand) have rolled their own.

Jaap reported on the whois service for the Dutch ccTLD, which limits hosts to 500 queries a day. The .nl whois is often used simply to test whether a name exists, so they have developed a light-weight response which notes only the state: exists, doesn't exist, or is blocked.
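
A per-host daily quota combined with a state-only answer can be sketched in a few lines of Python. This is not the .nl implementation: the StateLookup class, the reply strings, the "quota exceeded" answer, and the sample names are hypothetical; only the 500-queries-a-day figure and the three states come from the report above.

```python
from collections import defaultdict
from datetime import date

DAILY_LIMIT = 500  # per-host daily query limit reported for .nl


class StateLookup:
    """Answer only the state of a name, with a per-host daily quota."""

    def __init__(self, registered, blocked, limit=DAILY_LIMIT):
        self.registered = set(registered)
        self.blocked = set(blocked)
        self.limit = limit
        # (host, day) -> queries seen; old days are never purged in this sketch.
        self._counts = defaultdict(int)

    def query(self, host, name, today=None):
        day = today or date.today()
        key = (host, day)
        if self._counts[key] >= self.limit:
            return "quota exceeded"  # hypothetical reply text
        self._counts[key] += 1
        name = name.lower()
        if name in self.blocked:
            return "is blocked"
        if name in self.registered:
            return "exists"
        return "doesn't exist"
```

Keying the counter on (host, day) means the quota resets by itself at the start of each day, with no separate reset job.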

George commented that APNIC has a confederation structure, which is reflected in the whois data. Different members use different formats, which may be in country-specific character sets.

Nakayama-san commented that the Japanese data is encoded in JIS, and that a trailing "/e" yields ASCII encoded data:

whois -h whois.nic.ad.jp u-tokyo.ac.jp
whois -h whois.nic.ad.jp u-tokyo.ac.jp/e
whois -h whois.nic.ad.jp help/e
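
For readers without a whois client, the commands above amount to a very small exchange on port 43: open a TCP connection, send the query terminated by CRLF, and read until the server closes the connection. The Python sketch below shows that exchange. The function names and the ascii_output parameter are the editor's; the "/e" suffix is specific to the Japanese server as described above and is not a general whois feature.

```python
import socket

WHOIS_PORT = 43


def build_query(name, ascii_output=False):
    """Return the bytes to send: the query, an optional '/e', then CRLF."""
    suffix = "/e" if ascii_output else ""
    return (name + suffix + "\r\n").encode("ascii")


def whois(server, name, ascii_output=False, timeout=10.0):
    """Send one query to `server` on port 43 and return the raw reply bytes."""
    chunks = []
    with socket.create_connection((server, WHOIS_PORT), timeout=timeout) as conn:
        conn.sendall(build_query(name, ascii_output))
        while True:
            data = conn.recv(4096)
            if not data:  # the server closes the connection when it is done
                break
            chunks.append(data)
    # Bytes are returned undecoded because the encoding varies by server.
    return b"".join(chunks)
```

Calling whois("whois.nic.ad.jp", "u-tokyo.ac.jp", ascii_output=True) corresponds to the second command above; whether that server still answers in the same way is not something this sketch can promise.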

The following list is a brief summary of the answers he received when he surveyed ccTLD administrative contacts in 1997 for the RIDE BoF.

ccTLD : whois server         port  Response format and other info
---------------------------------------------------------------------------
.fr   : whois.nic.fr         43    RIPE Format, 7bits
.is   : whois.isnet.is       43    Original Format,
.jp   : whois.nic.ad.jp      43    Original Format, 7bits
.kr   : whois.nic.or.kr      43    RIPE Format, 8bits
.mx   : whois.nic.mx         43    Original Format, 7bits
.ru   : whois.ripn.net       43    RIPE Format, 7bits
.se   : whois.nic-se.se      43    RIPE Format, (Changed?)
.th   : whois.thnic.net      43    RIPE Format, 7bits
.za   : whois.za             43    Original Format, 7bits
.ch   : whois.nic.ch         43    Original Format, 8bits
.li   : whois.nic.li         43    Original Format, 8bits

.gf   : whois.nplus.gf       43    (This server is not available now)
.hk   : whois.hknic.net.hk   43    (This server is not listed in DNS now)
.us   : nii.isi.edu          43    (This server is not available now)

.pe   : rwhois.rcp.net.pe    4321  InterNIC Format
.ve   : rwhois.reacciun.ve   4321  InterNIC format

.am   : whois.amnic.net      43    Original Format, Not Available yet.
[all others unclear or no response]

.ca   : whois.cira.ca        43    8bits [per Dan]
.ng   : whois.rg.net         --    IRRD [per Randy]
.ua   : whois.com.ua         43    RIPE format, 8 bits

[ EBW: Someone should html-ize this and put it on the web, to be delta'd until (most) all 267 TLDs and all 3 (or 5) RIRs are documented. Volunteers? ]

As with other whois services, a clear concern in the APNIC region is the use of whois to obtain data on members, "grazing". APNIC members nonetheless draw a clear distinction: some forms (e.g. Geoff Huston's data association grazing) are desirable.

------------

Randy presented on NIC handle info in the whois data at ARIN and RIPE.

Different registries have different identities attached to the same HANDLE identifier, despite a hack that tries to use a REGISTRY-handle method for associating data with the original handle. ***This is partly because of RIPE's method of doing referential integrity (dumping the other database into their own).***

[Engin Gunduz (RIPE NCC) comments on the scribe's original that the other DBs are not dumped into their own. They just allow users to create person objects with non-RIPE NIC handles.]

-------------

EBW then noted that discussions within PROVREG have produced a document, draft-rader-dnwhois-defn-00.txt, which may be useful but likely will need to be split into a provreg area and a whois area.

DC presented issues/desiderata for queries:
o a standardized query format,
o possibly expressed as a regularized query "template", so that independent of the content a common query format can be used,
o constrained XML mentioned as a format (later rejected; see below).

for responses:
o a structured output with standard labels,
o registered response codes,
o and the ability to redirect or refer.
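
To make the redirect/refer item concrete, the Python sketch below follows referrals from one server to the next without keeping any state on the servers. The BOF did not define a referral syntax: the "refer" label, the hop cap, and the lookup callback (any function that takes a server and a query and returns the response text) are assumptions made purely for illustration.

```python
MAX_HOPS = 3  # hypothetical cap on the number of referrals followed


def find_referral(response, label="refer"):
    """Return the server named on a '<label>: <server>' line, or None."""
    for line in response.splitlines():
        attribute, sep, value = line.partition(":")
        if sep and attribute.strip().lower() == label:
            return value.strip() or None
    return None


def resolve(query, server, lookup, max_hops=MAX_HOPS):
    """Follow referrals from `server`; return (final_server, response)."""
    visited = []
    for _ in range(max_hops + 1):
        if server in visited:
            raise RuntimeError("referral loop: " + " -> ".join(visited + [server]))
        visited.append(server)
        response = lookup(server, query)
        next_server = find_referral(response)
        if next_server is None:
            return server, response  # no referral: this is the answer
        server = next_server
    raise RuntimeError("too many referrals")
```

Because the client carries the list of visited servers, loops and overly long chains are detected on the client side, which fits a "pure referral" model in which servers hold no state for the interaction.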

Randy questioned the need for redirect/referral as a method of raising the question of how to develop the requirements; Patrik followed up by pointing out the transition issues and noting that the WG needs strong indications why individual fixes are needed so the WG can do the cost/benefit analysis.

The first justification put forward was for structured output, as an aid to parsing. John replied that the current limits of the protocol are such that expecting to be able to process output from a whois query raises risks, and that it might be better to use a new port. After discussion there was some agreement that if the queries were not posed to arbitrary whois servers, but to an enumerated set which adhered to a published standard (such as the RPSL [RFC 2622] syntax), this concern would be mitigated. This approach applies to the transition issue as well.

Mark commented that this is a well known slope, and that the next thing will be a request for client capabilities; further steps and ultimate destination are also well known. John and Randy agreed that a strict management of further feature creep is necessary, as is the avoidance of generics in the enumerated list (as in, all ccTLD whois servers).

Greg commented that with authentication and a well known format, the set of requirements sounds a lot less like whois. Mark added that the original function of whois was as a public service with general availability, and someone observed that the service this restricted set describes is not parallel, as it creates closed islands.

Randy disagreed, asserting that well-specified formats and enumerated servers gave a pretty broad scope, though any-to-any functionality is lost.

DC proposed a process approach to the rest of the BOF:

Part one: what problem do you want fixed?
Part two: what minimal approach would solve part one?
Part three: rank these problems
Part four: partition the ranking statements and list the ones that we think we can do on port 43.

If there are requirements that we cannot do on port 43, we go to another port for all of the work this group may do.

Patrik noted that he does not like "or" or "if" in charters. DC agreed that this is appropriate and that if the work cannot happen on port 43, the group will report that it cannot complete and a new group *might* be chartered for a different port, but this group would shut down.

The group discussed this approach; there was discussion among Jordan, Bruce, Neil and others on whether we are "updating whois" or meeting the need for a protocol that solves a specific set of problems. While it was agreed that backwards compatibility does constrain possible solutions, consensus seemed to be that if the work is not restricted to port 43, there will be no progress.

DC then stated his belief that we had ripped the current charter apart, and that it will need revision, but he hopes that we try to stay in this kind of time frame to encourage progress.

Randy challenged the practicality of this approach, but this would be cleaned up before resubmission to the area directors.

EBW and DC reviewed other working groups salient to this work: provreg, idn, *ng, security? Bill noted that there are also people still revising and enhancing rwhois protocols and that it would be useful to coordinate with them.

Patrik made the point that whois might not be the right tool for retrieving data and performing data calculation; he believes that clients which attempt machine parsing of whois output run risks. He suggested that those reports would be better produced by the people who have the data, rather than via whois. EBW agreed that good stewardship is an alternative to good grazing, but it is outside the scope of a working group to do more than encourage (provide a mechanism for) good stewardship.

Candidates for the problem list were then listed and prioritized; the results are summarized below, with each item shown by contributor:

Andy: Privacy and access control.
Strong interest. Discussion on several aspects, e.g., applications and throttling.

Randy: nic handle identity
Moderate interest by the group, with the synchronization element seen as probably out of scope.

Jordon: need for referrals.
Strong interest by the group.

Bruce: standardized format for specified query
Agreement by the group.

George: backwards and forwards compatibility
Agreed to survey existing implementations; attempt not to create incompatibilities.

Werner: whois query in URL format.
Some exist, and this is a current APPs area requirement.

Max: don't make it comment or use XML, as it won't be human readable
Agreed with some objections.

Andy: should be a transcribable (short) query format.

Hans: Meet ICANN contract requirements.

Bill: motivate accuracy, etc.
Disagreed. (?)

Meeting adjourned.

The last agenda item, "it isn't whois:43", was not covered in this meeting.

End of minutes and chairs' corrections/amendations/fixes.


None received.