ASID Working Group                                      Patrik Faltstrom
Internet-Draft                                                    Tele 2
Expires: September 1997                                      Sima Newell
draft-ietf-asid-whoispp-01.txt           Bunyip Information Systems Inc.
Replaces: RFC-1835                                      Leslie L. Daigle
                                         Bunyip Information Systems Inc.
                                                              March 1997


                  Architecture of the Whois++ service

Status of this Memo

   This document is an Internet-Draft. Internet-Drafts are working
   documents of the Internet Engineering Task Force (IETF), its areas,
   and its working groups. Note that other groups may also distribute
   working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.
   It is inappropriate to use Internet-Drafts as reference material or
   to cite them other than as ``work in progress.''

   To learn the current status of any Internet-Draft, please check the
   ``1id-abstracts.txt'' listing contained in the Internet-Drafts
   Shadow Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe),
   munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or
   ftp.isi.edu (US West Coast).

   Distribution of this document is unlimited.

Abstract

   This document describes Whois++, an extension to the trivial WHOIS
   service described in RFC 954 to permit WHOIS-like servers to make
   available more structured information to the Internet. We describe
   an extension to the simple WHOIS data model and query protocol and a
   companion extensible, distributed indexing service. A number of
   options have also been added such as the use of multiple languages
   and character sets, more advanced search expressions, structured
   data and a number of other useful features. An optional
   authentication mechanism for protecting all or part of the
   associated Whois++ information database from unauthorized access is
   also described.

Table of Contents

   Part I - Whois++ Overview
   1.1. Purpose and Motivation
   1.2. Basic Information Model
   1.2.1. Changes to the current WHOIS Model
   1.2.2. Registering Whois++ servers
   1.2.3. The Whois++ Search Selection Mechanism
   1.2.4. The Whois++ Architecture
   1.3. Indexing in Whois++
   1.4. Getting Help
   1.4.1. Minimum HELP Required
   1.5. Options and Constraints
   1.6. Formatting Responses
   1.7. Reporting Warnings and Errors
   1.8. Privacy and Security Issues
   Part II - Whois++ Implementation
   2.1. The Whois++ interaction model
   2.2. The Whois++ Command set
   2.2.1. System Commands
   2.2.1.1. The COMMANDS command
   2.2.1.2. The CONSTRAINTS command
   2.2.1.3. The DESCRIBE command
   2.2.1.4. The HELP command
   2.2.1.5. The LIST command
   2.2.1.6. The POLLED-BY command
   2.2.1.7. The POLLED-FOR command
   2.2.1.8. The SHOW command
   2.2.1.9. The VERSION command
   2.2.2. The Search Command
   2.2.2.1. Format of a Search Term
   2.2.2.2. Format of a Search String
   2.3. Whois++ Constraints
   2.3.1. Required Constraints
   2.3.2. Optional CONSTRAINTS
   2.3.2.1. The SEARCH Constraint
   2.3.2.2. The FORMAT Constraint
   2.3.2.3. The MAXFULL Constraint
   2.3.2.4. The MAXHITS Constraint
   2.3.2.5. The CASE Constraint
   2.3.2.6. The AUTHENTICATE Constraint
   2.3.2.7. The NAME Constraint
   2.3.2.8. The PASSWORD Constraint
   2.3.2.9. The LANGUAGE Constraint
   2.3.2.10. The INCHARSET Constraint
   2.3.2.11. The INCHARSET Constraint
   2.3.2.12. The IGNORE Constraint
   2.3.2.13. The INCLUDE Constraint
   2.4. Server Response Modes
   2.4.1. Default Responses
   2.4.2. Format of Responses
   2.4.3. Syntax of a Formatted Response
   2.4.3.1. A FULL format response
   2.4.3.2. ABRIDGED Format Response
   2.4.3.3. HANDLE Format Response
   2.4.3.4. SUMMARY Format Response
   2.4.3.5. SERVERS-TO-ASK Response
   2.4.4. System Generated Messages
   2.5. Compatibility with Older WHOIS Servers
   3. Miscellaneous
   3.1. Acknowledgements
   3.2. References
   3.3. Authors' Addresses
   Appendix A - Some Sample Queries
   Appendix B - Some sample responses
   Appendix C - Sample responses to system commands
   Appendix D - Sample Whois++ session
   Appendix E - System messages
   Appendix F - The Whois++ Input Syntax
   Appendix G - The Whois++ Response Syntax
   Appendix H - Description of Regular expressions

1. Part I - Whois++ Overview

1.1. Purpose and Motivation

   The current NIC WHOIS service [HARR85] is used to provide a very
   limited directory service, serving information about a small number
   of Internet users registered with the DDN NIC. Over time the basic
   service has been expanded to serve additional information and
   similar services have also been set up on other hosts.
   Unfortunately, these additions and extensions have been done in an
   ad hoc and uncoordinated manner.

   The basic WHOIS information model represents each individual record
   as a Rolodex-like collection of text. Each record has a unique
   identifier (or handle), but otherwise is assumed to have little
   structure. The current service allows users to issue searches for
   individual strings within individual records, as well as searches
   for individual record handles using a very simple query-response
   protocol.

   Despite its utility, the current NIC WHOIS service cannot function
   as a general White Pages service for the entire Internet. Given the
   inability of a single server to offer guaranteed response or
   reliability, the huge volume of traffic that a full scale directory
   service will generate and the potentially huge number of users of
   such a service, such a trivial architecture is obviously unsuitable
   for the current Internet's needs for information services.

   This document describes the architecture and protocol for Whois++, a
   simple, distributed and extensible information lookup service based
   upon a small set of extensions to the original WHOIS information
   model. These extensions allow the new service to address the
   community's needs for a simple directory service, yet the extensible
   architecture is expected to also allow it to find applications in a
   number of other information service areas.

   Added features include an extension to the trivial WHOIS data model
   and query protocol and a companion extensible, distributed indexing
   service. A number of other options have also been added, such as
   boolean operators, more powerful search constraints and search
   methods. In addition, the data has been structured to make both the
   client and server elements of the dialogue more stringent and easily
   parsed. An optional authentication mechanism for protecting all or
   parts of the associated Whois++ information database from
   unauthorized access is also briefly described.

   The architecture of Whois++ allows distributed maintenance of the
   directory contents and the use of the Whois++ indexing service for
   locating additional Whois++ servers. Although a general overview of
   this service is included for completeness, the indexing extensions
   are described separately in [WINDX].

   It should be noted that Whois++ is not backward compatible with
   WHOIS.

1.2. The Whois++ Information Model

   The Whois++ service is based on the use of information templates,
   which consist of ordered sets of data elements (or attribute-value
   pairs). Its underlying recommendation is to use standardized
   templates where available.

   It is intended that adding structured template types to a server and
   subsequently searching through information stored in templates of a
   specified type should be simple tasks. The creation and use of
   customized templates should also be possible with little effort,
   although their use is discouraged where appropriate standardized
   templates exist.

   Registration and schema definitions are done on an attribute by
   attribute basis, so a client that receives a record parses the
   record structure one attribute at a time. Because of this system,
   the client does not need to know the structure of the whole record,
   only individual attributes. If the client sees an unknown attribute,
   it will skip that one and continue parsing the subsequent
   attributes. A server that defines schemas can therefore add its own
   unregistered attributes to a well-defined template type.

   We also offer methods to allow the user to constrain searches to
   desired attributes or template types, in addition to the existing
   commands for specifying handles or simple strings.
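
   The attribute-by-attribute parsing described above can be
   illustrated with a minimal sketch. It is not part of the protocol
   definition. It assumes records arrive as "Attribute: value" lines
   (the form used in the sample records of section 1.3, not the formal
   response syntax of Appendix G), that attribute names are compared
   case-insensitively, and that the client knows a hypothetical set of
   attribute names called KNOWN_ATTRIBUTES.

```python
# Hypothetical set of attributes this client understands. The names are
# borrowed from the sample records in section 1.3 for illustration only.
KNOWN_ATTRIBUTES = {"first-name", "last-name"}


def parse_record(lines, known=KNOWN_ATTRIBUTES):
    """Return (attribute, value) pairs for known attributes, in order.

    Each line is handled on its own, so an unknown attribute is skipped
    without affecting the attributes that follow it.
    """
    pairs = []
    for line in lines:
        name, sep, value = line.partition(":")
        if not sep:
            continue  # not an "Attribute: value" line; ignore it
        if name.strip().lower() not in known:
            continue  # unknown attribute: skip it and keep parsing
        pairs.append((name.strip(), value.strip()))
    return pairs


record = [
    "First-Name: John",
    "Favourite-Drink: Labatt Beer",  # unknown to this client
    "Last-Name: Smith",
]
print(parse_record(record))
```

   Because the client never needs the whole record layout, a server can
   add an unregistered attribute such as Favourite-Drink without
   breaking clients that only know First-Name and Last-Name.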

   It is expected that the minimalist approach we have taken will find
   applications where the high cost of configuring and operating
   traditional White Pages services cannot currently be justified.

   Note also that the architecture makes no assumptions about the
   search and retrieval mechanisms used within individual servers.
   Operators are free to use dedicated database formats, fast indexing
   software or even provide gateways to other directory services to
   store and retrieve information. The Whois++ server simply functions
   as a known front end, offering a simple data model and communicating
   through a well known port and query protocol. The format of both
   queries and replies has been structured to allow the use of client
   software for generating searches and displaying the results. At the
   same time, some effort has been made to keep responses legible (to
   some degree) by human users, both to ensure low entry cost and to
   ease debugging.

   The actual implementation details of an individual Whois++ search
   engine are left to the imagination of the implementor. It is hoped
   that the simple, extensible approach taken will encourage
   experimentation and the development of improved search engines.

1.2.1. Changes to the current WHOIS Model

   The current WHOIS service is based upon an extremely simple data
   model. The NIC WHOIS database consists of a series of individual
   records, each of which is identified by a single unique identifier
   (the "handle"). Each record contains one or more lines of
   information. Currently, there is no structure or implicit ordering
   of this information, although each record is implicitly concerned
   with information about a single user or service.

   We have implemented two basic changes to this model. First, we have
   structured the information within the database as collections of
   data elements that are simple attribute/value pairs. Each individual
   record contains a specified ordered set of these data elements.

   Second, we have introduced the classing of database records into
   template types. In effect, each record is based upon one template of
   a specified set; each template contains a finite and specified
   number of data elements. This classing allows users to limit
   searches to specific collections of information, such as information
   about users, services, abstracts of papers, or descriptions of
   software.

   Since the data typing is done at the attribute level, not the
   template level, it is also possible to add non-standard attributes
   to a well-known template type.

   As an addition to the model, we require that each individual Whois++
   database on the Internet be assigned a unique handle, analogous to
   the handle associated with each database record.

   The Whois++ database structure is shown in Fig. 1.

 ______________________________________________________________________
|                                                                      |
|   +  Single unique Whois++ server handle                             |
|                                                                      |
|            _______              _______              _______         |
|   handle3 |..  .. |    handle6 |..  .. |    handle9 |..  .. |        |
|           _______ |            _______ |            _______ |        |
|  handle2 |..  .. |    handle5 |..  .. |    handle8 |..  .. |         |
|          _______ |            _______ |            _______ |         |
| handle1 |..  .. |    handle4 |..  .. |    handle7 |..  .. |          |
|         |..  .. |            |..  .. |            |..  .. |          |
|          -------              -------              -------           |
|         Template             Template             Template           |
|          Type 1               Type 2               Type 3            |
|                                                                      |
|                                                                      |
|               Fig.1 - Structure of a Whois++ database.               |
|                                                                      |
|  Notes: - Entire database is identified by a single unique Whois++   |
|           server handle.                                             |
|         - Each record has a single unique handle.                    |
|         - Each record has a specific set of attributes, which is     |
|           determined by the Template Type used.                      |
|         - Each value associated with an attribute is a text string   |
|           of an arbitrary length.                                    |
|______________________________________________________________________|

1.2.2. Registering Whois++ servers

   We propose that individual database handles be registered through
   the Internet Assigned Numbers Authority (the IANA), ensuring their
   uniqueness. This will allow us to specify each Whois++ entry on the
   Internet as a unique pair consisting of a server handle and a record
   handle.

   A unique registered handle is preferable to using the host's IP
   address, since it is conceivable that the Whois++ server for a
   particular domain may move over time. If we preserve the unique
   Whois++ handle in such cases we have the option of using it for
   resource discovery and networked information retrieval (see [IIIR]
   for a discussion of resource discovery and support issues).

   Uniqueness of server handles can be guaranteed by registering them
   with IANA.

   We believe that organizing information around a series of such
   templates will make it easier for administrators to gather and
   maintain this information and thus encourage them to make such
   information available. At the same time, as users become more
   familiar with the data elements available within specific templates
   they will be able to specify their searches better, and the service
   will become more useful.

1.2.3. The Whois++ Search Selection Mechanism

   The Whois++ search mechanism is intended to be extremely simple. A
   search command comprises one required element and one optional
   element. The first (required) element is a set of one or more search
   terms. The second (optional) element is a colon followed by a set of
   one or more global constraints, which modify or control the search.

   Within each search term, the user may specify the template type,
   attribute, value or handle that any record returned must satisfy.
   Each search term can have an optional set of local constraints that
   apply only to that term.
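
   As a rough illustration of the two-element structure of a search
   command described above, the following sketch separates the search
   terms from the global constraints. It is an assumption-laden
   simplification rather than the formal input syntax of Appendix F:
   it treats the first colon as the separator, assumes global
   constraints are separated by ';' (as the command forms in Table I
   suggest), and handles no quoting or escaping. The function name
   split_search_command and the sample command string are hypothetical.

```python
def split_search_command(command):
    """Split a search command into (search_terms, global_constraints).

    Assumptions: the first ':' separates the required search terms from
    the optional global constraints, constraints are separated by ';',
    and quoting or escaping of these characters is not handled.
    """
    terms, sep, constraint_part = command.partition(":")
    terms = terms.strip()
    if not terms:
        # The search terms are the required element of the command.
        raise ValueError("a search command needs at least one search term")
    constraints = []
    if sep:
        constraints = [c.strip() for c in constraint_part.split(";") if c.strip()]
    return terms, constraints


# Illustrative command string; the exact constraint spelling is assumed.
print(split_search_command("smith and john:maxhits=10;hold"))
print(split_search_command("smith"))
```

   A real client would also need to split the search terms themselves
   and honour the local constraints attached to individual terms, which
   this sketch leaves untouched.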

   A Whois++ database may be seen as a single collection of typed
   records. Each search term specifies a further constraint that the
   selected set of output records must satisfy. Each term may thus be
   thought of as performing a subtractive selection, in the sense that
   any record that does not fulfill the term is discarded from the
   result set. Result sets can be further specified by supplying
   multiple search terms, related by logical connectives (AND, OR,
   NOT).

1.2.4. The Whois++ Architecture

   The Whois++ directory service has an architecture which is separated
   into two components: the base level server, which is described in
   this paper, and an indexing server (described in [WINDX]). A single
   physical server can act as both a base level server and an indexing
   server.

   A base level server is one which contains only filled templates. An
   indexing server is one which contains forward knowledge (q.v.) and
   pointers to other indexing servers or base level servers.

1.3. Indexing in Whois++

   Indexing in Whois++ is used to tie together many base level servers
   and index servers into a unified directory service. For more
   detailed information on this subject, see [WINDX].

   Each base level server and index server that is to participate in
   the unified directory service must generate forward knowledge for
   the entries it contains. One type of forward knowledge is the
   "centroid".

   An example of a centroid is as follows. Consider a Whois++ server
   that contains exactly three records:

      Record 1                        Record 2
      Template: Person                Template: Person
      First-Name: John                First-Name: Joe
      Last-Name: Smith                Last-Name: Smith
      Favourite-Drink: Labatt Beer    Favourite-Drink: Molson Beer

      Record 3
      Template: Domain
      Domain-Name: foo.edu
      Contact-Name: Mike Foobar

   The centroid for this server would be:

      Template:         Person
      First-Name:       Joe
                        John
      Last-Name:        Smith
      Favourite-Drink:  Beer
                        Labatt
                        Molson

      Template:         Domain
      Domain-Name:      foo.edu
      Contact-Name:     Mike
                        Foobar

   An index server would then collect this centroid for this server as
   forward knowledge.

   An index server can collect forward knowledge for any servers it
   polls. In effect, all of the servers that the index server knows
   about can be searched with a single query to the index server; the
   index server holds the forward knowledge along with pointers to the
   servers it indexes, and can refer the query to servers which might
   hold information which satisfies the query.

   Implementors of this protocol are strongly encouraged to incorporate
   centroid generation abilities into their servers.

   Whois++ uses the Common Indexing Protocol, which was originally
   described in [WINDX] as a centroid-like object to provide index
   information (forward knowledge) about server contents. This work is
   being extended in the IETF's FIND Working Group.

   -------------------------------------------------------------------

                        ____             ____
   top level           |    |           |    |
   whois index         |    |           |    |
   servers              ----             ----
                       /    \________   /
                      /              \ /
                    ____             ____
   first level     |    |           |    |
   whois index     |    |           |    |
   servers          ----             ----
                   /                /    \
                  /                /      \
                ____             ____     ____
   individual  |    |           |    |   |    |
   whois       |    |           |    |   |    |
   servers      ----             ----     ----

                 Fig. 2 - Indexing system architecture.

   -------------------------------------------------------------------

1.4. Getting Help

   Another extension to the basic WHOIS service is the requirement that
   all servers support at least a minimal set of help commands,
   allowing users to find out information about both the individual
   server and the entire Whois++ service itself. This is done in the
   context of the new extended information model by defining two
   specific template formats and requiring each server to offer at
   least one example of each record using these formats. The operator
   of each Whois++ service is therefore expected to have, as a minimum,
   a single example of SERVICES and HELP records, which can be accessed
   through appropriate commands.

1.4.1. Minimum HELP Required

   Executing the command:

      DESCRIBE

   gives brief information about the Whois++ server.

   Executing the command:

      HELP

   gives a brief description of the Whois++ service itself.

   The text of both required help records should contain pointers to
   additional help subjects that are available.

   Executing the command:

      HELP

   gives information on .

1.5. Options and Constraints

   The Whois++ service is based upon a minimal core set of commands and
   controlling constraints. A small set of additional optional commands
   and constraints can be supported by a server. These allow users to
   perform such tasks as providing security options, modifying the
   information contents of a server or adding multilingual support. The
   required set of Whois++ commands is listed in section 2.2. Whois++
   constraints are described in section 2.3. Optional constraints are
   described in section 2.3.2.

1.6. Formatting Responses

   The output returned by a Whois++ server is structured to allow
   machine parsing and automated handling. Of particular interest is
   the ability to return summary information about a search instead of
   having to return the entire results.

   All output of searches will be returned in one of five output
   formats: FULL, ABRIDGED, HANDLE, SUMMARY or SERVERS-TO-ASK. Note
   that a conforming server is only required to support the FULL
   format.

   When available, SERVERS-TO-ASK format is used to indicate that a
   search cannot be completed but that one or more alternative Whois++
   servers may be able to perform the search.

   Details of each output format are specified in section 2.4.

1.7. Reporting Warnings and Errors

   The formatted response of Whois++ commands allows the encoding of
   warning or error messages to simplify parsing and machine handling.
   The syntax of output formats is described in detail in section 2.4,
   and details of Whois++ warnings and error conditions are given in
   Appendix E.

   All system messages are numerical, but can be tagged with text. It
   is the client's decision whether the text is presented to the user.

1.8. Privacy and Security Issues

   The basic Whois++ service was conceived as a simple, unauthenticated
   information lookup service, but there are occasions when
   authentication mechanisms are required. To handle such cases, one
   optional mechanism is provided for authenticating each Whois++
   transaction. This is the ability to name a (mutually-recognized)
   authentication scheme in the optional AUTHENTICATE global
   constraint.

   The one currently defined authentication scheme is PASSWORD, which
   uses simple password authentication. Any other scheme name used must
   begin with the characters "X-" and should thus be regarded as
   experimental and non-standard.

   Note that the Whois++ authentication mechanism does not dictate the
   actual authentication scheme used; it merely provides a framework
   for indicating that a particular transaction is to be authenticated,
   and the appropriate scheme to use. This mechanism is extensible and
   individual implementors are free to add additional schemes.

   This document describes a very simple authentication scheme in which
   a combination of username and password is sent together with the
   search string so the server can verify that the user has access to
   the information. Note that this is NOT by any means a method
   recommended to secure the data itself, because both password and
   information are transferred unencrypted over the network.

   Other, more sophisticated security and authentication schemes may be
   proposed to address specific needs. For example, the Simple
   Authentication and Security Layer (SASL) work proposed by John Myers
   (particularly for POP and IMAP) may be applicable here.

2. Part II - Whois++ Implementation

2.1. The Whois++ interaction model

   The Whois++ service has an assigned port number -- number 63.
   However, there is nothing inherent in the Whois++ protocol or
   interaction model that prevents it from being used on any TCP
   connection on any port -- the specification of the connection is
   outside the scope of this protocol spec. Once a connection is
   established, the server issues a banner message, and listens for
   input. The command specified in this input is processed and the
   results returned, including an ending system message. If the client
   does not specify the optional HOLD constraint, the connection is
   then terminated.

   If the server supports the optional HOLD constraint, and this
   constraint is specified as part of any command, the server continues
   to listen on the connection for another (single) line of input. This
   cycle continues as long as the sender continues to append the
   required HOLD constraint to each subsequent command.

2.2. The Whois++ Command set

   The Whois++ command set consists of a core set of required system
   commands, a single required search command and a set of optional
   system commands which support features that are not required by all
   servers. The set of required Whois++ system commands is listed in
   Table I. Valid search terms for the search command are described in
   Table II.

   Each Whois++ command also allows the use of one or more controlling
   constraints, which, when selected, are used to override defaults or
   otherwise modify the server's behavior. There is a core set of
   constraints that must be supported by all conforming servers: SEARCH
   (which controls the type of search performed), FORMAT (which
   determines the output format used) and MAXHITS (which determines the
   maximum number of matches that a search can return). These required
   constraints are summarized in Table III.

   An additional set of optional constraints is used to provide support
   for different character sets, provide data for the authentication
   scheme, and request multiple transactions during a single
   communications session. These optional constraints are listed in
   Table IV.

   It is possible, using the required COMMANDS and CONSTRAINTS system
   commands, to query any Whois++ server for its list of supported
   commands and constraints.

   Please note that the line terminator is defined as a carriage return
   and line feed (CR/LF) pair. Also, none of the commands or
   constraints supported by Whois++ are case sensitive. For example,
   the following are equivalent: HELP, Help, help, hElp. Capitalization
   of all letters (e.g. HELP) is used only to improve the legibility of
   this document. Finally, "attribute value" is defined as "the value
   associated with an attribute".

2.2.1. System Commands

   System commands are commands to the server for information or to
   control its operation.
These include commands to list the template 587 types available from individual servers, to obtain a single blank 588 template of any available type, and commands to obtain the list of 589 valid commands and constraints supported on a server. 591 There are also commands to obtain the current version of the Whois++ 592 protocol supported, to access a simple help subsystem, to obtain a 593 brief description of the service provided by the Whois++ 594 server. The DESCRIBE command is intended, among other 595 things, to support the automated registration of the service in 596 yellow pages directory services. The required commands are listed 597 in Table I. 599 ------------------------------------------------------------------------ 601 Short Long Form Functionality 602 ----- --------- ------------- 603 COMMANDS [ ':' HOLD ] List Whois++ commands 604 supported by this server 606 CONSTRAINTS [ ':' HOLD ] List valid constraints 607 supported by this server 609 DESCRIBE [ ':' HOLD ] Describe this server, 610 formating the response 611 using a standard 612 SERVICES template 614 '?' HELP [ [':' ( / HOLD) 615 0*(';' ( / HOLD))]] 616 Provide help specific to this 617 Whois++ server, using a 618 "Help" template 620 LIST [':' ( / HOLD) 621 0*(';' ( / HOLD))] 622 List templates supported 623 by this server 625 POLLED-BY [ ':' HOLD ] List indexing servers 626 that are known to poll 627 this server 629 POLLED-FOR [ ':' HOLD ] List information about 630 servers this server polls 632 SHOW [':' ] Show contents of template 633 specified in 635 VERSION [ ':' HOLD ] Show the version of 636 the protocol supported by 637 this server 639 Table I - Required Whois++ SYSTEM commands. 641 ------------------------------------------------------------------------ 643 Below follows a descriptions for each command. Examples of responses 644 to each command are provided in Appendix C. 646 2.2.1.1. The COMMANDS command 648 The COMMANDS command returns a list of commands that the server 649 supports. 
The response is formatted as a FULL response. 651 2.2.1.2. The CONSTRAINTS command 653 The CONSTRAINTS command returns a list of both the constraints and 654 their values that the server supports. The response is formatted as a 655 FULL response, where every constraint is represented as a separate 656 record. The template name for these records is CONSTRAINT. No 657 attention is paid to handles. Each record has, as a minimum, the 658 following two attributes: 660 - "Constraint", whose value is the constraint name 661 - "Default", which shows the default value for this constraint. 663 If the client is permitted to change the value of the constraint, 664 there is also: 666 - "Range", which contains a list of values that this 667 server supports, as a comma separated list, or, if the range 668 is numerical, as a pair of numbers separated with a hyphen. 670 Note that, irrespective of whether a session is continued (with the HOLD 671 constraint) or not, constraints are set to the default value unless 672 explicitly changed with a constraint in each query. 674 2.2.1.3. The DESCRIBE command 676 The DESCRIBE command gives a brief description about the server in a 677 "Services" template. The result is formatted as a FULL response with 678 as a minimum one attribute: 680 - "Text", which describes the service in a form legible by human users. 682 2.2.1.4. The HELP command 684 The HELP command takes an optional argument which is the subject on 685 which to get help. The answer is formatted as a FULL format response. 687 2.2.1.5. The LIST command 689 The LIST command returns the name of the templates available on the 690 server. The answer is formatted as a FULL format response. 692 2.2.1.6. The POLLED-BY command 694 The POLLED-BY command returns a list of servers and the templates and 695 attribute names that those servers polled as centroids from this 696 server. 
The response is in FULL format with two attributes, "Template"
and "Field", whose values are lists of the names of the polled
templates and fields, respectively. An empty result means either
that the server is not polled by anyone, or that it doesn't support
indexing.

2.2.1.7. The POLLED-FOR command

The POLLED-FOR command returns a list of servers that this server has
polled, and the template and attribute names for each of those. The
answer is in FULL format with two attributes, "Template" and "Field".
An empty result means either that the server is not polling anyone, or
that it doesn't support indexing.

2.2.1.8. The SHOW command

The SHOW command takes a template name as argument and returns
information about that template, formatted as a FULL response.
The answer is formatted as a blank template with the requested name.

2.2.1.9. The VERSION command

The output format is a FULL response containing a record with template
name VERSION. The record must have an attribute named "Version", whose
value is "2.0" for this version of the protocol. The record may also
have the additional fields "Program-Name" and "Program-Version", which
give information about the server implementation if the server so
desires.

If the server also supports the earlier version of the protocol,
"1.0", two records are given back as a response to the VERSION
command, one for each version supported.

2.2.2. The SEARCH Command

A SEARCH command comprises one required element and one optional
element. The first (required) element is a set of one or more search
terms. The second (optional) element is a set of global constraints,
which modify or control the search. Each search term can have an
optional set of local constraints that apply only to that term.
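
As an illustration of the two elements of a SEARCH command, the
following Python sketch splits one command line into its search terms
and its global constraints. It relies on the separators that the
specification defines (a colon before the global constraints,
semicolons between them) and on backslash quoting of special
characters. The function names split_unescaped and
split_search_command are hypothetical and not part of the protocol.
Local constraints are left attached to their search terms, and no
validation of constraint names is attempted.

```python
def split_unescaped(text, sep):
    """Split text on sep, ignoring any sep that is quoted with a backslash."""
    parts, current, i = [], [], 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text):
            # Keep the backslash and the quoted character together.
            current.append(text[i:i + 2])
            i += 2
            continue
        if ch == sep:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
        i += 1
    parts.append("".join(current))
    return parts


def split_search_command(command):
    """Return (search_terms, global_constraints) for one command line."""
    pieces = split_unescaped(command, ":")
    terms = pieces[0].strip()
    constraints = {}
    # Everything after the first unquoted colon is the global constraint list.
    for item in split_unescaped(":".join(pieces[1:]), ";"):
        item = item.strip()
        if not item:
            continue
        name, eq, value = item.partition("=")
        # HOLD is the one constraint without a value; None is stored for it.
        constraints[name.strip().lower()] = value.strip() if eq else None
    return terms, constraints
```

Constraint names are lowercased here because the specification treats
commands and constraints as case insensitive. For example,
split_search_command("author=leslie and template=user:language=fr;hold")
yields the search terms "author=leslie and template=user" and the
constraints language=fr and hold.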
737 Each attribute value in the Whois++ database is divided into one or 738 more words separated by whitespace (see Appendix F for a definition 739 of whitespace) . Each search term operates on every word in the attribute 740 value. 742 Two or more search terms have to be combined with boolean operators AND, 743 OR or NOT. The operator AND has higher precedence than the operator OR, 744 but this can be changed by the use of parentheses. 746 Boolean operators function as follows for two search terms, A and 747 B. Let A1 be the result set from the first search term and B1 be the 748 result set from the second search. The operation A AND B returns the 749 hits in the intersection of sets A1 and B1. The operation A OR B 750 returns the hits in the union of the sets A1 and B1. The operation 751 NOT A returns all possible results that are not in set A1. The 752 behaviour of the boolean operators can be generalized to N search 753 terms where N > 2. Note that NOT has a higher precedence than AND 754 or OR, so NOT A AND B returns the hits in B that are not in A. 756 Search constraints that apply to all search terms are specified as 757 global constraints. Local constraints override global constraints for 758 the search term they are bound to. The search terms and the global 759 constraints are separated with a colon (':'). Each additional global 760 constraint is appended to the end of the search command, and a 761 semicolon ';' is used as the delimiter between global constraints. 763 If any of the search constraints can not be fulfilled, or if 764 several of the specified constraints are mutually exclusive, the 765 server ignores the constraints that can not be fulfilled and those 766 that are mutually exclusive. The server performs the search using 767 only the remaining constraints and returns the corresponding set of 768 records. 770 The set of required constraints are listed in Table III. The set 771 of optional constraints are listed in Table IV. 
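
The set semantics and precedence of the boolean operators described
above can be sketched in Python as follows. This is only an
illustration under stated assumptions: the function name evaluate and
its parameters are hypothetical, "hits" is assumed to already map each
search term to the set of record handles it matched, and "universe" is
assumed to be the set of all record handles on the server. Backslash
quoting and the interpretation of local constraints are not handled.

```python
import re


def evaluate(query, hits, universe):
    """Evaluate a boolean query; hits maps each search term to a set of handles."""
    tokens = re.findall(r"\(|\)|[^\s()]+", query)
    pos = 0

    def peek():
        return tokens[pos].lower() if pos < len(tokens) else None

    def take():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def parse_or():
        # OR binds loosest: union of the AND-level results.
        result = parse_and()
        while peek() == "or":
            take()
            result = result | parse_and()
        return result

    def parse_and():
        # AND binds tighter than OR: intersection of the NOT-level results.
        result = parse_not()
        while peek() == "and":
            take()
            result = result & parse_not()
        return result

    def parse_not():
        # NOT binds tightest: complement with respect to all records.
        if peek() == "not":
            take()
            return universe - parse_not()
        if peek() == "(":
            take()
            result = parse_or()
            if peek() != ")":
                raise ValueError("missing closing parenthesis")
            take()
            return result
        if peek() is None or peek() in ("and", "or", ")"):
            raise ValueError("search term expected")
        return set(hits.get(take(), set()))

    result = parse_or()
    if pos != len(tokens):
        raise ValueError("unexpected token: " + tokens[pos])
    return result
```

With this sketch, "NOT A AND B" is evaluated as (NOT A) AND B, and
"A OR B AND C" as A OR (B AND C), unless parentheses say otherwise.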
773 As an option, the server may accept specifications for attributes 774 to be included or excluded from a reply. Thus, users could specify 775 -only- those attributes to return, or specific attributes to filter 776 out, thus creating custom views. 778 2.2.2.1. Format of a Search Term 780 Each search term consists of one of the following: 782 1) A search string, followed by an optional set of semicolon- 783 separated local constraints. If local constraints are 784 specified, they are separated from the search string by a 785 semicolon. This is noted as: 787 [';' ]* 789 2) A search term specifier (as listed in Table II), followed by a 790 '=', followed by a search string, an optional set of 791 semicolon-separated local constraints. If local constraints are 792 specified, they are separated from the search string by a 793 semicolon. This is noted as: 795 = [';' ]* 797 3) An attribute name, followed by '=', followed by 798 a search string, followed by an optional set of 799 semicolon-separate local constraints. If local constraints are 800 specified, they are separated from the search string by a 801 semicolon. 803 = [';' ]* 805 (Note: A is a valid local constraint specification.) 807 If no search term specifier is provided, then the search will be 808 applied to attribute values only. This corresponds to an identifier 809 of VALUE. 811 When the user specifies the search term using the form: 813 " = " 815 this is considered to be an ATTRIBUTE-VALUE search. 817 For discussion of the system reply format, and selecting the 818 appropriate reply format, see section 2.4. 820 ------------------------------------------------------------------- 822 Valid specifiers: 823 ----------------- 825 Name Functionality 826 ---- ------------- 828 HANDLE Confine search to handles. 829 VALUE Confine search to attribute 830 values. 832 (Note: The specifier HANDLE= can be replaced with the shorthand '!') 834 Table II - Valid search command term specifiers. 
836 ------------------------------------------------------------------- 838 2.2.2.2. Format of a Search String 840 Special characters that need to be quoted are preceeded by a 841 backslash, '\'. 843 Special characters are space ' ', tab, equal sign '=', comma ',', 844 colon ':', backslash '\', semicolon ';', asterisk '*', period '.', 845 parenthesis '()', square brackets '[]', dollar sign '$' and 846 circumflex '^'. 848 If the search term is given in some other character set than ISO- 849 8859-1, it must be specified by the constraint INCHARSET. 851 2.3. Whois++ Constraints 853 Constraints are intended to be hints or recommendations to the server 854 about how to process a command. They may also be used to override 855 default behaviour, such as requesting that a server not drop the 856 connection after performing a command. 858 Thus, a user might specify a search constraint as "SEARCH=exact", 859 which means that the search engine is to perform an exact match 860 search. The user might also specify "LANGUAGE=Fr", which means that the 861 server should (if possible) display the French versions of the attribute 862 values, and if possible use French in fuzzy matches. The server should also 863 issue system messages in French. 865 In general, constraints take the form "=", where 866 is one of a specified set of valid values. The notable 867 exception is "HOLD", which takes no argument. 869 All constraints can be used as a global constraint (i.e., on the 870 whole query transaction). Only a few can be used as a constraint 871 local to a search term. See tables III and IV for information about which 872 constraints can be local. 874 The CONSTRAINTS system command is used to list the search constraints 875 supported by an individual server. 877 If a server cannot satisfy the specified constraint, the server should 878 indicate this to the user through the use of system messages. 
879 In such cases, the search is still performed, with the the server 880 ignoring unsupported constraints. 882 2.3.1. Required Constraints 884 The following CONSTRAINTS must be supported in all conforming Whois++ 885 servers. 887 ------------------------------------------------------------------ 889 Format LOCAL/GLOBAL 890 ------ ------------- 892 SEARCH= exact / lstring LOCAL/GLOBAL 894 FORMAT= full / abridged / handle / summary GLOBAL 896 MAXHITS= 1- GLOBAL 898 Table III - Required Whois++ constraints. 900 ------------------------------------------------------------------ 902 2.3.2. Optional CONSTRAINTS 904 The following CONSTRAINTS and constraint values are not required of a 905 conforming Whois++ server, but may be supported. If supported, their 906 names and supported values must be returned in the response to the 907 CONSTRAINTS command. 909 --------------------------------------------------------------------- 911 Format LOCAL/GLOBAL 912 ------ ------------- 914 SEARCH= regex / fuzzy / substring LOCAL/GLOBAL 916 CASE= ignore | consider LOCAL/GLOBAL 918 FORMAT= server-to-ask GLOBAL 920 MAXFULL= 1- GLOBAL 922 AUTHENTICATE= password GLOBAL 924 NAME= GLOBAL 926 PASSWORD= GLOBAL 928 INCHARSET= us-ascii / iso-8859-* / 929 UNICODE-1-1-UTF-8 / UNICODE-2-0-UTF-8 GLOBAL 931 OUTCHARSET= us-ascii / iso-8859-* / 932 UNICODE-1-1-UTF-8 / UNICODE-2-0-UTF-8 GLOBAL 934 LANGUAGE= GLOBAL 936 HOLD GLOBAL 938 IGNORE= GLOBAL 940 INCLUDE= GLOBAL 942 Table IV - Optional Whois++ constraints. 944 ---------------------------------------------------------------------- 946 2.3.2.1. The SEARCH Constraint 948 The SEARCH constraint is used for specifying the method that is to be 949 used for the search. The default method is "exact". Following is a 950 definition of each search method. 952 exact The search will succeed for a word that exactly 953 matches the search string. 955 substring The search will succeed for a word that matches 956 a part of a word. 
regex          The search will succeed for a word when a regular
               expression matches the searched data. A regular
               expression is built up from the constructions
               '*', '.', '^', '$', and '[]'. For use of
               regular expressions see Appendix H.

fuzzy          The search will succeed for words that match the
               search string by using an algorithm designed to catch
               closely related names with different spelling, e.g.
               names with the same pronunciation. The server
               chooses which algorithm to use, and the choice may
               vary depending on template name, attribute name and
               language used (see the LANGUAGE constraint,
               section 2.3.2.9).

lstring        The search will succeed for words that begin
               with the search string.

2.3.2.2. The FORMAT Constraint

The FORMAT constraint describes what format the result will be in.
The default format is FULL. For a description of each format, see
Server Response Modes (section 2.4).

2.3.2.3. The MAXFULL Constraint

The MAXFULL constraint sets the limit on the number of matching
records the server allows before it enforces SUMMARY responses. The
client may attempt to override this value by specifying another value
for the constraint. Example: If, for privacy reasons, the server is to
return the response in SUMMARY format if the number of hits exceeds
2, the MAXFULL constraint is set to 2 by the server.

Regardless of what format the client asked for, the server will change
the response format to SUMMARY when the number of matching records
equals or exceeds this value.

2.3.2.4. The MAXHITS Constraint

The MAXHITS constraint sets the maximum number of records returned to
the client in response to a query.

2.3.2.5. The CASE Constraint

The CASE constraint defines whether the search should be case
sensitive or not. The default is to ignore case.

2.3.2.6.
The AUTHENTICATE Constraint 1006 The AUTHENTICATE constraint describes which authentication scheme to 1007 use when executing the search. Depending on the authentication scheme 1008 used, some other constraints may have to be specified. The authentication 1009 scheme definition identifies which constraints it requires. 1011 The only authentication scheme described in this document is 1012 "password". If used, also the two other constraints "name" and 1013 "password" need to be set. 1015 2.3.2.7. The NAME Constraint 1017 The NAME constraint is only used together with some authentication 1018 scheme named by the constraint "authenticate". 1020 With the password authentication scheme, this is expected to be a string 1021 of characters representing a username, for which the specified password 1022 should be verified (i.e., similar to the UNIX login program). 1024 2.3.2.8. The PASSWORD Constraint 1026 The PASSWORD constraint is only used together with some 1027 authentication scheme named by the constraint "authenticate". 1029 The password authentication scheme requires that the password associated 1030 with the username be supplied by this constraint. The server 1031 can use that pair of strings to do a simple authentication check, 1032 similar to the UNIX login program. 1034 2.3.2.9. The LANGUAGE Constraint 1036 The LANGUAGE constraint specifies the language in which the client 1037 wishes to receive responses. It therefore specifies which attribute 1038 values should be presented to the user (i.e., only those in the specified 1039 language, or for which no language information is available). 1040 It can also be used as an extra information to the fuzzy matching search 1041 method, and it might also be used to tell the server to give the system 1042 responses in another language. This should preferably be handled by 1043 the client. The language codes defined in RFC 1766 [ALVE95] should be 1044 used as a value for the language constraint. 
In these, the case of the letters is insignificant.

If a record has attribute values in different languages, and no
LANGUAGE search constraint was given in the query, the switch between
the different languages should be given in the response by the use
of system message 601, which has one argument only: the name of the
language or one of the predefined strings "ANY" or "DEF". A block
of alternative attribute values starts with a language definition
like "% 601 SE". After the first language specification, zero or
more language specifications can be given, each switching into the
desired language. When all specific languages have been tagged, the
specification "% 601 DEF" can be used for specifying default attribute
values. A block of alternative attributes must end with "% 601 ANY".

The following is an example of a response using the language messages:

# FULL USER LOCAL USER-DOE
% 601 FR
 Name: Monsieur John Doe
% 601 SV
 Name: Herr John Doe
% 601 DEF
 Name: Mister John Doe
% 601 ANY
 Email: jdoe@doe.pp.se
# END

The language specifications (the % 601 messages) may be suppressed by
the server if the client has explicitly asked for a specific language
by using the global constraint LANGUAGE.

2.3.2.10. The INCHARSET Constraint

The INCHARSET constraint tells the server in which character set the
search string itself is given. The default character set is
ISO-8859-1.

2.3.2.11. The OUTCHARSET Constraint

The OUTCHARSET constraint tells the server in which character set the
search result is to be given. The default character set is
ISO-8859-1, but the server may choose something else.

2.3.2.12. The IGNORE Constraint

The IGNORE constraint specifies which attributes NOT to include in
the result.
All other attributes will be included (as if named 1092 explicitly by the "include" constraint). 1094 If an attribute is named both with the "include" and "ignore" 1095 constraint, the attribute is to be included in the result, but the 1096 system message "% 112 Requested constraint not fulfilled" must be 1097 sent. 1099 2.3.2.13. The INCLUDE Constraint 1101 The INCLUDE constraint specifies which attributes to include in the 1102 result. All other attributes will be excluded (as if named explicitly 1103 by the "ignore" constraint). 1105 If an attribute is named both with the "include" and "ignore" 1106 constraint, the attribute is to be included in the result, but the 1107 system message must be "% 112 Requested constraint not fulfilled". 1109 2.3.2.14. The HOLD Constraint 1111 The HOLD constraint requests that the server hold open the connection 1112 after sending the response to the query. The server waits for another 1113 user input string. 1115 2.4. Server Response Modes 1117 The grammar for Whois++ responses is given in Appendix G, and described 1118 below. 1120 There are currently a total of five different response modes possible 1121 for Whois++ servers. These are FULL, ABRIDGED, HANDLE, SUMMARY and 1122 SERVER-TO-ASK. The syntax of each output format is specified in more 1123 detail in Appendix G. 1125 1) A FULL format response provides the complete contents of a 1126 template matching the specified query, including the template 1127 type, the server handle and an optional record handle. 1129 2) An ABRIDGED format response provides a brief summary, including 1130 (as a minimum) the server handle, the corresponding record handle 1131 and relevant information for that template. 1133 3) A HANDLE format response returns a line with information about 1134 the server handle and record handle for a record that matched 1135 the specified query. 
4) A SUMMARY response provides only a brief summary of information:
   the number of matches and the list of template types in which the
   matches occurred.

5) A SERVER-TO-ASK response only returns pointers to other index
   servers which might possibly be able to answer the specified
   query.

The server may optionally respond with an empty result set and may
also respond with an empty response together with a system message to
indicate that the query was too complex for it to fulfill.

2.4.1. Default Responses

By default, a Whois++ server will provide FULL responses. This may be
changed by the client with the use of the global constraint "format".

The server will not respond with more matches than the value
specified with the global constraint "maxhits" in any response
format. If the number of matches exceeds this value, the server will
issue the system message 110 (maxhits value exceeded), but will
still show the responses, up to the number of the "maxhits"
constraint value. This mechanism allows the server to hide the
number of possible matches to a search command.

2.4.2. Format of Responses

Each response consists of a numerical system generated message, which
can be tagged with text, followed by an optional formatted response
message, followed by a second system generated message. The formatted
response itself can include system messages, for example for switches
in language.

That is:

'%'

[ ]

'%'

If there are no matches to a query, the system is not required to
generate any output as a formatted response, although it must still
generate system messages.

For information about the standard text for system messages, see
Appendix E.

2.4.3.
Syntax of a Formatted Response 1187 All formatted responses except for the HANDLE response, consist of a 1188 response-specific START line, followed by an optional response- 1189 specific data section, followed by a TERMINATION line. The HANDLE 1190 response is different in that it only consists of a START line. It 1191 is permissible to insert any number of lines consisting solely of 1192 CR/LF pairs within a formatted response to improve readability. 1194 Each line shall be limited to no more than 81 characters, including 1195 the terminating CR/LF pair. If a line (including the required leading 1196 single space) would exceed 81 characters, it must be broken into 1197 lines of no more than 81 characters, with each continuation line 1198 beginning with a "+" character in the first column instead of the 1199 leading character. 1201 If an attribute value in a data section includes a line break, the 1202 line break must be replaced by a CR/LF pair and the following line 1203 begin with a "-" character in the first column, instead of the 1204 leading character. The attribute name is not repeated on consecutive 1205 lines. 1207 A TERMINATION line consists of a line with a '#' in the first column, 1208 followed by one space (ASCII 32) character, followed by the keyword END, 1209 followed by zero or more characters, followed by a CR/LF pair. 1211 A response-specific section will be one of the following: 1213 1) FULL Format Response 1214 2) ABRIDGED Format Response 1215 3) HANDLE Format Response 1216 4) SUMMARY Format Response 1217 5) SERVER-TO-ASK Format Response 1219 2.4.3.1. A FULL format response 1221 A FULL format response consists of a series of responses, each 1222 consisting of a START line, followed by the complete template 1223 information for the matching record and a TERMINATION line. 
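
The continuation rules of section 2.4.3 can be sketched in Python as
follows. This is a minimal illustration, not a complete response
parser: the function name unfold is hypothetical, it expects only the
data lines of a single record (no START or TERMINATION line), and it
rejects any other line, including system messages that a server may
place inside a record.

```python
def unfold(lines):
    """Merge '+' and '-' continuation lines into (name, value) pairs."""
    attributes = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line:
            continue  # blank lines are permitted for readability
        marker, rest = line[0], line[1:]
        if marker == " ":
            # An ordinary data line: leading space, then "name: value".
            name, _, value = rest.partition(":")
            # Drop the single space that follows the colon, if present.
            attributes.append([name, value[1:] if value.startswith(" ") else value])
        elif marker in "+-" and attributes:
            # '+' continues a folded long line; '-' marks a line break in the value.
            attributes[-1][1] += rest if marker == "+" else "\n" + rest
        else:
            raise ValueError("unexpected line: " + line)
    return [(name, value) for name, value in attributes]
```

A '+' line is joined directly to the previous line because it only
marks a fold forced by the 81 character limit, whereas a '-' line is
joined with a newline because it represents a line break that belongs
to the attribute value itself.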
1225 Each START line consists of a '#' in the first column, followed by 1226 one space character, the word "FULL", a space character, 1227 the name of the corresponding template type, one space 1228 character, the server handle, a space character, (optionally) the 1229 handle for the record, and a terminating CR/LF pair. 1231 The template information for the record will be returned as a series 1232 of lines consisting of a single space, followed by the corresponding 1233 line of the record. 1235 The line of the record shall consist of a single space and the 1236 attribute name followed by a ':', a single space, the value of that 1237 attribute, and a CR/LF pair. 1239 2.4.3.2. ABRIDGED Format Response 1241 Each ABRIDGED format response consists of a START line, a single line 1242 excerpt of the template information from each matching record and a 1243 TERMINATION line. The excerpt information shall include information 1244 that is relevant to the template type. 1246 The START line consists of a '#' in the first column, followed by one 1247 space character, the word "ABRIDGED", a space character, 1248 the name of the corresponding template type, a space character, 1249 the server handle, a space character, the handle for the 1250 record, and a terminating CR/LF pair. 1252 The abridged template information will be returned as a line, 1253 consisting of a single space, followed by the abridged line of the 1254 record and a CR/LF pair. 1256 2.4.3.3. HANDLE Format Response 1258 A HANDLE response consists of a single START line, which shall start 1259 with a '#' in the first column, followed by one space 1260 character, the word "HANDLE", a space character, the name of 1261 the corresponding template, a space character, the handle for 1262 the server, a space character, the handle for that record, and 1263 a terminating CR/LF pair. 1265 2.4.3.4. 
SUMMARY Format Response

A SUMMARY format response consists of a single response,
consisting of a line listing the number of matches to the specified
query, optionally a count of referrals, followed by a list of all
template types which satisfied the query at least once.

The START line shall begin with a '#' in the first column, be
followed by one white space character, the word "SUMMARY", a white
space character, the handle for the server, and a terminating
CR/LF pair.

The format of the attributes in the SUMMARY format follows the
rules for the FULL template, with the attributes "matches",
"referrals" and "templates". "matches" and "templates" are
mandatory; "referrals" is optional.

The first line must begin with the string "matches:", be
followed by a space and the number of responses to the query, and be
terminated by a CR/LF pair.

The following line shall begin with either the string "templates: "
or the string "referrals: ". The string "templates: " is followed
by a CR/LF separated list of the names of the template types
which matched the query. Each line of the list after the first (the
line that includes the text "templates:") must begin with a '-'
instead of a space. The string "referrals: " is followed by the
number of referrals included in the number of hits.

2.4.3.5. SERVER-TO-ASK Response

A SERVER-TO-ASK response consists of information to the client about
a server to contact next to resolve a query. If the server has
pointers to more than one server, it will present additional
SERVER-TO-ASK responses.

The SERVER-TO-ASK response will consist of a START line and a number
of lines with attribute-value pairs, separated by CR/LF. Each line is
indented with one space. The end of a SERVER-TO-ASK response is
indicated with a TERMINATION line.
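
The structure of a single SERVER-TO-ASK response can be sketched in
Python as follows. This is an illustration under stated assumptions:
the function name parse_server_to_ask is hypothetical, the START line
is assumed to carry the keyword followed by a server handle (as the
next paragraph specifies), continuation lines are not handled, and the
host name used in the usage example is invented.

```python
def parse_server_to_ask(lines):
    """Parse one SERVER-TO-ASK block into (server_handle, attributes)."""
    rows = [line.rstrip("\r\n") for line in lines if line.strip()]
    if len(rows) < 2:
        raise ValueError("incomplete response")
    start = rows[0].split()
    if len(start) < 3 or start[0] != "#" or start[1].upper() != "SERVER-TO-ASK":
        raise ValueError("not a SERVER-TO-ASK START line")
    if not rows[-1].upper().startswith("# END"):
        raise ValueError("missing TERMINATION line")
    attributes = {}
    for row in rows[1:-1]:
        # Every attribute-value line is indented with exactly one space.
        if not row.startswith(" "):
            raise ValueError("attribute lines must be indented by one space")
        name, _, value = row[1:].partition(":")
        attributes[name.strip().lower()] = value.strip()
    return start[2], attributes
```

Attribute names are lowercased so that a client can look them up
without depending on the capitalization chosen by the server. A client
would typically read the host name and port from the returned
attributes and repeat its query against that server.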
1306 Each START line consists of a '#' in the first column, followed by 1307 one space character, the word "SERVER-TO-ASK", a space 1308 character, the handle of the server and a terminating CR/LF pair. 1310 1. "Server-Handle" - The server handle of the server pointed at. 1311 (req.) 1312 2. "Host-Name" - Hostname for the server pointed at. 1313 3. "Host-Port" - Portnumber for the server pointed at. 1314 4. "Protocol" - The protocol to use when contacting this server. (opt.) 1316 Other attributes may be present, depending on the index server. 1317 The default protocol to use is Whois++. 1319 2.4.4. System Generated Messages 1321 All system generated messages must have a '%' as the first 1322 character, a space as the second one, followed by a three digit 1323 number, a space and an optional text message. The total length of the 1324 line must be no more than 81 characters long, including the 1325 terminating CR/LF pair. There is no limit to the number of system 1326 messages that may be generated. 1328 The format for multiline replies requires that every line, except the 1329 last, begin with "%", followed by space, the reply code, a hyphen, 1330 and an optional text. The last line will begin with "%", followed by 1331 space, the reply code, a space and some optional text. 1333 System generated messages displayed before or after the formatted 1334 response section are expected to refer to operation of the system or 1335 refer to the entire query. System generated messages within the 1336 output of an individual record during a FULL response are expected to 1337 refer to that record only, and could (for example) be used to 1338 indicate problems with that record of the response. See Appendix E 1339 for a description of system messages. 1341 2.5. Compatibility with Older WHOIS Servers 1343 Note that this format, although potentially more verbose, is still in 1344 a human readable form. 
Responses from older systems that do not
follow this format are still conformant, since their responses would
be interpreted as being equivalent to optional text messages, without
a formatted response. Clients written to this specification would
display the responses as an advisory text message, which would
still be readable by the user.

3. Miscellaneous

3.1. Acknowledgements

This document has been through many iterations of refinement, with
contributions of different natures along the way. These
acknowledgements accrue.

The Whois++ effort began as an intensive brainstorming session at the
24th IETF, in Boston, Massachusetts. Present at the birth, and
contributing ideas through this early phase, were (alphabetically)
Peter Deutsch, Alan Emtage, Jim Fullton, Joan Gargano, Brad
Passwaters, Simon Spero, and Chris Weider. Others who have since
helped shape this document with feedback and suggestions include
Roxana Bradescu, Patrik Faltstrom, Kevin Gamiel, Dan Kegel, Michael
Mealling, Mark Prior and Rickard Schoultz.

Version 2 of the protocol is based on input during the lifetime of
version 1. Special mention goes to Jeff Allen, Leslie Daigle,
and Philippe Boucher. During the polishing of the RFC for version 2,
important input was given by Len Charest, Clarke Anderson and others
in the ASID working group of the IETF.

Work in the European ROADS project provided the opportunity to test
this protocol specification from the point of view of developing a
test suite. The challenge was not only to provide AN implementation
that satisfied the document, but to build tools that would be able to
respond to all POSSIBLE responses that could be implemented from the
spec. This then led to the contribution of some textual
clarifications. Specific thanks go to Bill Heelan and Philippe Boucher.
3.2. References

[ALVE95]  Alvestrand, H., "Tags for the Identification of
          Languages", RFC 1766, UNINETT, March 1995.

[RFC822]  Crocker, D., "Standard for the Format of ARPA Internet
          Text Messages", RFC 822, August 1982.

[HARR85]  Harrenstein, K., Stahl, M., and E. Feinler,
          "NICNAME/WHOIS", RFC 954, SRI, October 1985.

[POST82]  Postel, J., "Simple Mail Transfer Protocol", STD 10,
          RFC 821, USC/Information Sciences Institute,
          August 1982.

[IIIR]    Weider, C., and P. Deutsch, "A Vision of an
          Integrated Internet Information Service", RFC 1727,
          Bunyip Information Systems, Inc., December 1994.

[WINDX]   Weider, C., J. Fullton, and S. Spero, "Architecture of
          the Whois++ Index Service", RFC 1913, February 1996.

3.3. Authors' Addresses

Patrik Faltstrom
Tele2
Borgarfjordsgatan 16
BOX 62
194 64 Kista
SWEDEN

Email: paf@swip.net

Sima Newell
Bunyip Information Systems Inc.
310 Ste. Catherine St. W
Suite 300
Montreal, Quebec, CANADA
H2X 2A1

Email: sima@bunyip.com

Leslie L. Daigle
Bunyip Information Systems Inc.
310 Ste. Catherine St. W
Suite 300
Montreal, Quebec, CANADA
H2X 2A1

Email: leslie@bunyip.com

Appendix A - Some Sample Queries

   author=leslie and template=user

The result will consist of all records where attribute "author"
matches "leslie" with case ignored. Only USER templates will be
searched. An example of a matching attribute is
"Author=Leslie L. Daigle". This is the typical case of searching.

   author=leslie and template=user:language=fr

The result will consist of the same records as above, but if
attributes are available in alternative languages, only the
ones in French will be displayed. These are either the ones which
have explicitly been tagged as having French values, or ones that
are tagged as being in the "DEF" (default) language.
   schoultz and rick;search=lstring

The result will consist of all records which have one attribute value
matching "schoultz" exactly (because the default search type is exact)
and one attribute with "rick" as leading substring, both with case
ignored. One example is "Name=Rickard Schoultz".

   value=phone;search=substring

The result will consist of all records which have attribute values
matching *phone*, for example the record "Name=Acme telephone inc.",
but will not match the attribute name "phone". (Since term specifier
is "value" by default, the search term could just as well have been
simply "phone").

   ucdavis;search=substring and (gargano or joan):include=name,email

This search command will find records that contain the word "gargano"
or "joan" somewhere in the record, and that have "ucdavis" somewhere
in a word. The result will only show the "name" and "email" fields.

Appendix B - Some sample responses

1) FULL format responses:

   # FULL USER SERVERHANDLE1 PD45
    Name: Peter Deutsch
    email: peterd@bunyip.com
   # END
   # FULL USER SERVERHANDLE1 AE1
    Name: Alan Emtage
    email: bajan@bunyip.com
   # END
   # FULL USER SERVERHANDLE1 NW1
    Name: Nick West
    Favourite-Bicycle-Forward-Wheel-Brand: New Bicy
   +cles Acme Inc.
    email: nick@bicycle.acme.com
    My-favourite-song: Happy birthday to you!
   -Happy birthday to you!
   -Happy birthday dear Nick!
   -Happy birthday to you.
   # END
   # FULL SERVICES SERVERHANDLE1 WWW1
    Type: World Wide Web
    Location: the world
   # END

--------------------

2) An ABRIDGED format response:

   # ABRIDGED USER SERVERHANDLE1 PD45
    Peter Deutsch   peterd@bunyip.com
   # END
   # ABRIDGED USER SERVERHANDLE1 AE1
    Alan Emtage     bajan@bunyip.com
   # END
   # ABRIDGED USER SERVERHANDLE1 WWW1
    World Wide Web  the world
   # END

--------------------

3) HANDLE format responses:

   # HANDLE USER SERVERHANDLE1 PD45
   # HANDLE USER SERVERHANDLE1 AE1
   # HANDLE SERVICES SERVERHANDLE1 WWW1

--------------------

4) A SUMMARY format response:

   # SUMMARY SERVERHANDLE1
    Matches: 35
    Referrals: 2
    Templates: User
   -Services
   -Abstracts
   # END

Appendix C - Sample responses to system commands

C.1 Response to the LIST command

   # FULL LIST SERVERHANDLE1
    Templates: USER
   -SERVICES
   -HELP
   # END

C.2 Response to the SHOW command

This example shows the result after issuing "show user":

   # FULL USER SERVERHANDLE1
    Name:
    Email:
    Work-Phone:
    Organization-Name:
    City:
    Country:
   # END

C.3 Response to the POLLED-BY command

   # FULL POLLED-BY SERVERHANDLE1
    Server-handle: serverhandle2
    Cached-Host-Name: sunic.sunet.se
    Cached-Host-Port: 7070
    Template: USER
    Field: ALL
   # END
   # FULL POLLED-BY SERVERHANDLE1
    Server-handle: serverhandle3
    Cached-Host-Name: kth.se
    Cached-Host-Port: 7070
    Template: ALL
    Field: Name,Email
   # END

C.4 Response to the POLLED-FOR command

   # FULL POLLED-FOR SERVERHANDLE1
    Server-Handle: serverhandle5
    Template: ALL
    Field: Name,Address,Job-Title,Organization-Name,
   +Organization-Address,Organization-Name
   # END
   # FULL POLLED-FOR SERVERHANDLE1
    Server-Handle: serverhandle4
    Template: USER
    Field: ALL
   # END

C.5 Response to the VERSION command

   # FULL VERSION BUNYIP.COM
    Version: 2.0
    Program-Name: Digger
    Program-Version: 3.0b1
    Program-Author: Bunyip Information Systems Inc.
    Program-Author-Email: digger-info@bunyip.com
    Bug-Report-Email: digger-bugs@bunyip.com
   # END

C.6 Response to the CONSTRAINTS command

   # FULL CONSTRAINTS SERVERHANDLE1
    CONSTRAINT: maxhits
    DEFAULT: 100
    RANGE: 0-100
   # END
   # FULL CONSTRAINTS SERVERHANDLE1
    CONSTRAINT: case
    DEFAULT: ignore
    RANGE: ignore, consider
   # END
   # FULL CONSTRAINTS SERVERHANDLE1
    CONSTRAINT: search
    DEFAULT: exact
    RANGE: exact, lstring, substring, fuzzy
   # END
   # FULL CONSTRAINTS SERVERHANDLE1
    CONSTRAINT: language
    DEFAULT: DEF
    RANGE: FR, EN, SV, ANY, DEF
   # END
   # FULL CONSTRAINTS SERVERHANDLE1
    CONSTRAINT: incharset
    DEFAULT: ISO-8859-1
    RANGE: ISO-8859-1, UNICODE-1-1-UTF8
   # END
   # FULL CONSTRAINTS SERVERHANDLE1
    CONSTRAINT: outcharset
    DEFAULT: ISO-8859-1
    RANGE: ISO-8859-1, UNICODE-1-1-UTF8, HTML
   # END

C.7 Response to the COMMANDS command

   # FULL COMMANDS SERVERHANDLE1
    Commands: commands
   -constraints
   -describe
   -help
   -list
   -polled-by
   -polled-for
   -show
   -version
   # END

Appendix D - Sample Whois++ session

Below is an example of a session between a client and a server. The
angle brackets to the left are not part of the communication, but are
put there to denote the direction of the communication between
the server and the client. Lines prefixed with '>' are messages from
the server and lines prefixed with '<' are messages from the client.

   Client connects to the server

   >% 220-Welcome to
   >% 220-the Whois++ server
   >% 220 at ACME inc.
   <name=Nick:hold
   >% 200 Command okay
   >
   ># FULL USER ACME.COM NW1
   > name: Nick West
   > email: nick@acme.com
   ># END
   ># SERVER-TO-ASK ACME.COM
   > Server-Handle: SUNETSE01
   > Host-Name: whois.sunet.se
   > Host-Port: 7070
   ># END
   ># SERVER-TO-ASK ACME.COM
   > Server-Handle: KTHSE01
   ># END
   >% 226 Transfer complete
   <version
   >% 200 Command okay
   ># FULL VERSION ACME.COM
   > Version: 2.0
   ># END
   >% 226 Transfer complete
   >% 203 Bye

   Server closes the connection

In the example above, the client connected to a Whois++ server and
queried for all records where the attribute "name" equals "Nick", and
asked the server not to close the connection after the response by
using the global constraint "HOLD".

The server responds with one record and pointers to two other
servers that either hold records or pointers to other servers.

The client continues by asking for the server's version number
without using the HOLD constraint. After responding with the protocol
version, the server closes the connection.

Note that each response from the server begins with system message 200
(Command OK), and ends with system message 226 (Transfer Complete).

Appendix E - System messages

A system message begins with a '%', followed by a space and a three
digit number, a space, and an optional text message. The line
must be no more than 81 characters long, including the terminating CR
LF pair. There is no limit to the number of system messages that may
be generated.

A multiline system message has a hyphen instead of a space in column
6, immediately after the numeric response code in all lines, except
the last one, where the space is used.

Example 1

   % 200 Command okay

Example 2

   % 220-Welcome to
   % 220-the Whois++ server
   % 220 at ACME inc.
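
As an illustration only, and not part of the protocol definition, the
line format above could be read by a client along the following lines.
This is a minimal Python sketch that assumes the CR LF terminators are
already split off or still attached to each line; the names
SYSTEM_LINE and parse_system_messages are hypothetical. The text part
is collected but not interpreted.

```python
import re

# "%", space, three digits, then "-" (more lines follow) or " " (last line),
# then the optional text. The names used here are illustrative assumptions.
SYSTEM_LINE = re.compile(r"^% (\d{3})([ -])(.*)$")


def parse_system_messages(lines):
    """Group system message lines into (code, [text, ...]) tuples."""
    messages = []
    pending = None  # (code, texts) of a multiline message still open
    for raw in lines:
        line = raw.rstrip("\r\n")
        match = SYSTEM_LINE.match(line)
        if match is None:
            raise ValueError("not a system message line: %r" % line)
        code = int(match.group(1))
        separator = match.group(2)
        text = match.group(3)
        if pending is None:
            pending = (code, [text])
        elif pending[0] != code:
            raise ValueError("reply code changed inside a multiline message")
        else:
            pending[1].append(text)
        if separator == " ":
            # A space after the code marks the last line of the message.
            messages.append(pending)
            pending = None
    if pending is not None:
        raise ValueError("multiline message was not terminated")
    return messages


if __name__ == "__main__":
    greeting = [
        "% 220-Welcome to\r\n",
        "% 220-the Whois++ server\r\n",
        "% 220 at ACME inc.\r\n",
    ]
    print(parse_system_messages(greeting))
```

Grouping the lines by reply code keeps a multiline greeting such as
Example 2 together as one message, so a caller can act on the code
once and treat the text as advisory.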

The client is not expected to parse the text part of the response
message except when receiving reply 600 or 601. For reply 600, the
text part is the name of the character set that will be used by the
server in the rest of the response. For reply 601, the text part
specifies what language the attribute values are in. The valid values
for character sets are specified in the "characterset" list in the
grammar in Appendix F.

The theory of reply codes is described in Appendix E in STD 10, RFC
821 [POST82].

------------------------------------------------------------------------

List of system response codes
------------------------------

110 Too many hits                       The number of matches exceeded
                                        the value specified by the
                                        maxhits constraint. Server
                                        will still reply with as many
                                        records as "maxhits" allows.

111 Requested constraint not supported  One or more constraints in
                                        query is not implemented, but
                                        the search is still done.

112 Requested constraint not fulfilled  One or more constraints in
                                        query has unacceptable value
                                        and was therefore not used,
                                        but the search is still done.

200 Command Ok                          Command accepted (i.e., syntax
                                        okay, will be executed).
                                        The client must wait for a
                                        transaction end system message.

201 Command Completed successfully      Command accepted and executed.

203 Bye                                 Server is closing connection

220 Service Ready                       Greeting message. Server is
                                        accepting commands.

226 Transaction complete                End of data. All responses to
                                        query are sent.

430 Authentication needed               Client requested information
                                        that needs authentication.

500 Syntax error

502 Search expression too complicated   This message is sent when the
                                        server is not able to resolve
                                        a query (i.e. when a client
                                        sent a regular expression that
                                        is too deeply nested).

530 Authentication failed               The authentication phase
                                        failed.

600                                     Subsequent attribute values
                                        are encoded in the character
                                        set specified by the text
                                        part of the message.

601                                     Subsequent attribute values
                                        are in the language specified
                                        by the text part of the
                                        message.

601 DEF                                 Subsequent attribute values
                                        are default values, i.e. they
                                        should be used for all
                                        languages not named in a "601"
                                        message since the last
                                        "601 ANY" message.

601 ANY                                 Subsequent attribute values
                                        are for all languages.

                  Table V - System response codes

------------------------------------------------------------------------

Appendix F - The Whois++ Input Grammar

The following grammar, which uses BNF-like notation as defined in
[RFC822], defines the set of acceptable input to a Whois++ server.

N.B.: All Whois++ command, constraint, and value literals are shown
here in lower case for simplicity. These literals are to be accepted
in upper, lower, or mixed case.

whois-command   = ( system-command [":" "hold"]
                  / terms [":" globalcnstrnts] ) NL

system-command  = "constraints"
                / "describe"
                / "commands"
                / "polled-by"
                / "polled-for"
                / "version"
                / "list"
                / "show" [1*SP string]
                / "help" [1*SP string]
                / "?" [string]

terms           = and-expr *("or" and-expr)

and-expr        = not-expr *("and" not-expr)

not-expr        = ["not"] (term / ( "(" terms ")" ))

term            = generalterm / specificterm
                / shorthandle / combinedterm

generalterm     = string *(";" localcnstrnt)

specificterm    = specificname "=" string
                  *(";" localcnstrnt)

specificname    = "handle" / "value"

shorthandle     = "!"
                  string *(";" localcnstrnt)

combinedterm    = attributename "=" string *(";" localcnstrnt)

globalcnstrnts  = globalcnstrnt *(";" globalcnstrnt)

globalcnstrnt   = localcnstrnt
                / "format" "=" format
                / "maxfull" "=" 1*digit
                / "maxhits" "=" 1*digit
                / opt-globalcnst

opt-globalcnst  = "hold"
                / "authenticate" "=" auth-method
                / "name" "=" string
                / "password" "=" string
                / "language" "=" language
                / "incharset" "=" characterset
                / "ignore" "=" string
                / "include" "=" string

format          = "full" / "abridged" / "handle" / "summary"
                / "server-to-ask"

language        =

characterset    = "us-ascii" / "iso-8859-1" / "iso-8859-2" /
                  "iso-8859-3" / "iso-8859-4" / "iso-8859-5" /
                  "iso-8859-6" / "iso-8859-7" / "iso-8859-8" /
                  "iso-8859-9" / "iso-8859-10" / "UNICODE-1-1-UTF-8" /
                  "UNICODE-2-0-UTF-8" / charset-value

charset-value   = 1*char

localcnstrnt    = "search" "=" searchvalue /
                  "case" "=" casevalue

searchvalue     = "exact" / "substring" / "regex" / "fuzzy"
                / "lstring"

casevalue       = "ignore" / "consider"

auth-method     = "password"

string          = 0*char

attributename   = 1*normalchar

char            = "\" specialchar / normalchar

normalchar      =

specialchar     = " " / / "=" / "," / ":" / ";" / "\" /
                  "*" / "." / "(" / ")" / "[" / "]" / "^" /
                  "$" / "!" / "?"
whitespace      = 1*(" " / / / / "@")

digit           = "0" / "1" / "2" / "3" / "4" /
                  "5" / "6" / "7" / "8" / "9"

NL              =

NOTE: Blanks that are significant to a query must be escaped. The
following characters, when significant to the query, may be preceded
and/or followed by a single blank:

   : ; , ( ) = !

Appendix G - The Whois++ Response Grammar

The following grammar, which uses BNF-like notation as defined in
[RFC822], defines the set of responses expected from a Whois++ server
upon receipt of a valid Whois++ query.

N.B.: All the literals supplied by the Whois++ server may be in upper,
lower, or mixed case. For clarity, they are shown here in upper case
only.

server            = goodmessage mnl output mnl endmessage onlynl
                  / badmessage onlynl endmessage onlynl

output            = full / abridged / summary / handle

full              = 0*(full-record / server-to-ask)

abridged          = 0*(abridged-record / server-to-ask)

summary           = summary-record

handle            = 0*(handle-record / server-to-ask)

full-record       = "# FULL " template serverhandle localhandle nl
                    1*(fulldata nl)
                    "# END" nl

abridged-record   = "# ABRIDGED " template serverhandle localhandle nl
                    abridgeddata nl
                    "# END" nl

summary-record    = "# SUMMARY " serverhandle nl
                    summarydata nl
                    "# END" nl

handle-record     = "# HANDLE " template serverhandle localhandle nl

server-to-ask     = "# SERVER-TO-ASK " serverhandle nl
                    server-to-askdata nl
                    "# END" nl

fulldata          = " " attributename ": " attributevalue

abridgeddata      = " " 0*( attributevalue / tab )

summarydata       = " Matches: " number nl
                    [" Referrals: " number nl]
                    " Templates: " template 0*( nl "-" template)

server-to-askdata = " Server-Handle: " serverhandle nl
                    " Host-Name: " hostname nl
                    " Host-Port: " number nl
                    [" Protocol: " prot nl]
                    0*(" " sstring ": " sstring nl)

attributename     = sstring

attributevalue    = longstring

template          = sstring

serverhandle      = sstring

localhandle       = sstring

hostname          = sstring

prot              = sstring

longstring        = string 0*( nl ( "+" / "-" ) string )

string            = 0*char

sstring           = 0*schar

schar             =

char              =

special-char      = ":" / " " / tab / nl

tab               =

mnl               = 1*nl

nl                = onlynl [ 1*(message onlynl) ]

onlynl            =

message           = [1*( messagestart "-" string onlynl)]
                    messagestart " " string onlynl

messagestart      = "% " digit digit digit

goodmessage       = [1*( goodmessagestart "-" string onlynl)]
                    goodmessagestart " " string onlynl

goodmessagestart  = "% 200"

badmessage        = [1*( badmessagestart "-" string onlynl)]
                    badmessagestart " " string onlynl

badmessagestart   = "% 5" digit digit

endmessage        = endmessageclose / endmessagecont

endmessageclose   = [endmessagestart " " string onlynl]
                    byemessage

endmessagecont    = endmessagestart " " string onlynl

endmessagestart   = "% 226"

byemessage        = byemessagestart " " string onlynl

byemessagestart   = "% 203"

number            = 1*( digit )

digit             = "0" / "1" / "2" / "3" / "4" /
                    "5" / "6" / "7" / "8" / "9"

Appendix H - Description of Regular expressions

The regular expressions described in this section are the same as
those used in many other applications and operating systems. However,
they are very simple and do not include the logical operators AND and
OR.

Searches using regular expressions always use substring
matching except when the regular expression contains the characters
'^' or '$'.

Character       Function
---------       --------

(any other)     Matches itself

.               Matches any character

a*              Matches zero or more 'a'

[ab]            Matches 'a' or 'b'

[a-c]           Matches 'a', 'b' or 'c'

^               Matches beginning of
                a token

$               Matches end of a token

Examples
---------

String          Matches         Doesn't match
-------         -------         -------------
hello           xhelloy         heello
h.llo           hello           helio
h.*o            hello           helloa
h[a-f]llo       hello           hgllo
^he.*           hello           ehello
.*lo$           hello           helloo
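
As an illustration only, most of the examples above can be checked
with Python's "re" module, whose syntax includes the operators listed
in this appendix. The sketch below makes two assumptions: each value
is a single token, so that '^' and '$' anchor at the ends of the
string, and substring matching is modelled with re.search. Case
handling is not modelled. The names whoispp_regex_match and EXAMPLES
are hypothetical. The "h.*o" row is left out, because under plain
substring matching that expression is also found inside "helloa".

```python
import re


def whoispp_regex_match(expression: str, value: str) -> bool:
    """Return True if `expression` is found anywhere in `value`.

    re.search gives substring matching unless the expression itself
    anchors the match with '^' or '$'.
    """
    return re.search(expression, value) is not None


# (expression, value that should match, value that should not match)
EXAMPLES = [
    ("hello", "xhelloy", "heello"),
    ("h.llo", "hello", "helio"),
    ("h[a-f]llo", "hello", "hgllo"),
    ("^he.*", "hello", "ehello"),
    (".*lo$", "hello", "helloo"),
]

if __name__ == "__main__":
    for expression, hit, miss in EXAMPLES:
        print(
            expression,
            whoispp_regex_match(expression, hit),
            whoispp_regex_match(expression, miss),
        )
```

A server that tokenizes attribute values before matching would apply
the same check to each token in turn rather than to the whole value.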