ASID Working Group                                      Patrik Faltstrom
Internet-Draft                                                     Tele2
Expires: September 11, 1998                             Leslie L. Daigle
draft-ietf-asid-whoispp-02.txt           Bunyip Information Systems Inc.
Replaces: RFC-1835                                           Sima Newell
                                         Bunyip Information Systems Inc.
                                                              March 1998

                  Architecture of the Whois++ service

Status of this Memo

   This document is an Internet-Draft. Internet-Drafts are working
   documents of the Internet Engineering Task Force (IETF), its
   areas, and its working groups. Note that other groups may also
   distribute working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other
   documents at any time.
   It is inappropriate to use Internet-Drafts as reference material
   or to cite them other than as ``work in progress.''

   To learn the current status of any Internet-Draft, please check
   the ``1id-abstracts.txt'' listing contained in the Internet-Drafts
   Shadow Directories on ftp.is.co.za (Africa), nic.nordu.net
   (Europe), munnari.oz.au (Pacific Rim), ds.internic.net (US East
   Coast), or ftp.isi.edu (US West Coast).

   Distribution of this document is unlimited.

Abstract

   This document describes Whois++, an extension to the trivial WHOIS
   service described in RFC 954 to permit WHOIS-like servers to make
   available more structured information to the Internet. We describe
   an extension to the simple WHOIS data model and query protocol and a
   companion extensible, distributed indexing service. A number of
   options have also been added such as the use of multiple languages
   and character sets, more advanced search expressions, structured data
   and a number of other useful features. An optional authentication
   mechanism for protecting all or part of the associated Whois++
   information database from unauthorized access is also described.

Table of Contents

   Part I - Whois++ Overview
   1.1. Purpose and Motivation
   1.2. Basic Information Model
   1.2.1. Changes to the current WHOIS Model
   1.2.2. Registering Whois++ servers
   1.2.3. The Whois++ Search Selection Mechanism
   1.2.4. The Whois++ Architecture
   1.3. Indexing in Whois++
   1.4. Getting Help
   1.4.1. Minimum HELP Required
   1.5. Options and Constraints
   1.6. Formatting Responses
   1.7. Reporting Warnings and Errors
   1.8. Privacy and Security Issues
   Part II - Whois++ Implementation
   2.1. The Whois++ interaction model
   2.2. The Whois++ Command set
   2.2.1. System Commands
   2.2.1.1. The COMMANDS command
   2.2.1.2. The CONSTRAINTS command
   2.2.1.3. The DESCRIBE command
   2.2.1.4. The HELP command
   2.2.1.5. The LIST command
   2.2.1.6. The POLLED-BY command
   2.2.1.7. The POLLED-FOR command
   2.2.1.8. The SHOW command
   2.2.1.9. The VERSION command
   2.2.2. The Search Command
   2.2.2.1. Format of a Search Term
   2.2.2.2. Format of a Search String
   2.3. Whois++ Constraints
   2.3.1. Required Constraints
   2.3.2. Optional CONSTRAINTS
   2.3.2.1. The SEARCH Constraint
   2.3.2.2. The FORMAT Constraint
   2.3.2.3. The MAXFULL Constraint
   2.3.2.4. The MAXHITS Constraint
   2.3.2.5. The CASE Constraint
   2.3.2.6. The AUTHENTICATE Constraint
   2.3.2.7. The LANGUAGE Constraint
   2.3.2.8. The INCHARSET Constraint
   2.3.2.9. The INCHARSET Constraint
   2.3.2.10. The IGNORE Constraint
   2.3.2.11. The INCLUDE Constraint
   2.3.2.12. The HOLD Constraint
   2.4. Server Response Modes
   2.4.1. Default Responses
   2.4.2. Format of Responses
   2.4.3. Syntax of a Formatted Response
   2.4.3.1. A FULL format response
   2.4.3.2. ABRIDGED Format Response
   2.4.3.3. HANDLE Format Response
   2.4.3.4. SUMMARY Format Response
   2.4.3.5. SERVER-TO-ASK Response
   2.4.4. System Generated Messages
   2.5. Compatibility with Older WHOIS Servers
   3. Miscellaneous
   3.1. Acknowledgements
   3.2. References
   3.3. Authors' Addresses
   Appendix A - Some Sample Queries
   Appendix B - Some sample responses
   Appendix C - Sample responses to system commands
   Appendix D - Sample Whois++ session
   Appendix E - System messages
   Appendix F - The Whois++ Input Syntax
   Appendix G - The Whois++ Response Syntax
   Appendix H - Description of Regular expressions

1. Part I - Whois++ Overview

1.1. Purpose and Motivation

   The current NIC WHOIS service [HARR85] is used to provide a very
   limited directory service, serving information about a small number
   of Internet users registered with the DDN NIC.
   Over time the basic service has been expanded to serve additional
   information and similar services have also been set up on other
   hosts. Unfortunately, these additions and extensions have been done
   in an ad hoc and uncoordinated manner.

   The basic WHOIS information model represents each individual record
   as a Rolodex-like collection of text. Each record has a unique
   identifier (or handle), but otherwise is assumed to have little
   structure. The current service allows users to issue searches for
   individual strings within individual records, as well as searches for
   individual record handles using a very simple query-response
   protocol.

   Despite its utility, the current NIC WHOIS service cannot function as
   a general White Pages service for the entire Internet. Given the
   inability of a single server to offer guaranteed response or
   reliability, the huge volume of traffic that a full scale directory
   service will generate and the potentially huge number of users of
   such a service, such a trivial architecture is obviously unsuitable
   for the current Internet's needs for information services.

   This document describes the architecture and protocol for Whois++, a
   simple, distributed and extensible information lookup service based
   upon a small set of extensions to the original WHOIS information
   model. These extensions allow the new service to address the
   community's needs for a simple directory service, yet the extensible
   architecture is expected to also allow it to find applications in a
   number of other information service areas.

   Added features include an extension to the trivial WHOIS data model
   and query protocol and a companion extensible, distributed indexing
   service. A number of other options have also been added, like boolean
   operators, more powerful search constraints and search methods.
   In addition, the data has been structured to make both the client
   and server elements of the dialogue more stringent and more easily
   parsed. An optional authentication mechanism for protecting all or
   parts of the associated Whois++ information database from
   unauthorized access is also briefly described.

   The architecture of Whois++ allows distributed maintenance of
   the directory contents and the use of the Whois++ indexing service
   for locating additional Whois++ servers. Although a general overview
   of this service is included for completeness, the indexing extensions
   are described separately in [WINDX].

   It should be noted that Whois++ is not backward compatible with
   WHOIS.

1.2. The Whois++ Information Model

   The Whois++ service is based on the use of information templates,
   which consist of ordered sets of data elements (or attribute-value
   pairs). Its underlying recommendation is to use standardized
   templates where available.

   It is intended that adding structured template types to a server
   and subsequently searching through information stored in templates
   of a specified type should be simple tasks. The creation and use of
   customized templates should also be possible with little effort,
   although their use is discouraged where appropriate standardized
   templates exist.

   Registration and schema definitions are done on an attribute by
   attribute basis, so a client that receives a record parses the
   record structure one attribute at a time. As a result, the client
   does not need to know the structure of the whole record, only that
   of individual attributes. If the client sees an unknown attribute,
   it will skip that one and continue parsing the subsequent
   attributes. A server that defines schemas can therefore add its own
   unregistered attributes to a well-defined template type.
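
   As an illustration of this attribute-at-a-time handling, the
   following Python sketch parses a record and skips the attributes it
   does not recognize. It is not part of the protocol: the
   "Attribute: value" line layout is a simplification of the response
   syntax given in section 2.4, and the KNOWN_ATTRIBUTES set and the
   parse_record function are hypothetical names chosen for this
   example.

```python
# Hypothetical set of attribute names this client understands
# (lower case, since the comparison below ignores case).
KNOWN_ATTRIBUTES = {"first-name", "last-name", "email"}


def parse_record(lines, known=KNOWN_ATTRIBUTES):
    """Parse 'Attribute: value' lines one attribute at a time.

    Assumption: one attribute per line, name and value separated by
    the first colon. Unknown attributes are skipped, not rejected.
    Returns (parsed, skipped): a list of (name, value) pairs in record
    order, and a list of the attribute names that were skipped.
    """
    parsed = []
    skipped = []
    for line in lines:
        name, sep, value = line.partition(":")
        if not sep:
            continue  # not an attribute line in this simplified layout
        name = name.strip()
        if name.lower() in known:
            parsed.append((name, value.strip()))
        else:
            skipped.append(name)  # unknown attribute: skip and go on
    return parsed, skipped


record = [
    "First-Name: John",
    "Last-Name: Smith",
    "Favourite-Drink: Labatt Beer",
]
parsed, skipped = parse_record(record)
print(parsed)
print(skipped)
```

   Here the unregistered Favourite-Drink attribute is skipped, while
   the two attributes the client knows are still returned in record
   order.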

   We also offer methods to allow the user to constrain searches to
   desired attributes or template types, in addition to the existing
   commands for specifying handles or simple strings.

   It is expected that the minimalist approach we have taken will find
   applications where the high cost of configuring and operating
   traditional White Pages services cannot currently be justified.

   Note also that the architecture makes no assumptions about the search
   and retrieval mechanisms used within individual servers. Operators
   are free to use dedicated database formats, fast indexing software or
   even provide gateways to other directory services to store and
   retrieve information. The Whois++ server simply functions as a
   known front end, offering a simple data model and communicating
   through a well known port and query protocol. The format of both
   queries and replies has been structured to allow the use of client
   software for generating searches and displaying the results. At the
   same time, some effort has been made to keep responses legible (to
   some degree) by human users, both to ensure low entry cost and to
   ease debugging.

   The actual implementation details of an individual Whois++ search
   engine are left to the imagination of the implementor. It is hoped
   that the simple, extensible approach taken will encourage
   experimentation and the development of improved search engines.

1.2.1. Changes to the current WHOIS Model

   The current WHOIS service is based upon an extremely simple data
   model. The NIC WHOIS database consists of a series of individual
   records, each of which is identified by a single unique identifier
   (the "handle"). Each record contains one or more lines of
   information. Currently, there is no structure or implicit ordering of
   this information, although each record is implicitly concerned
   with information about a single user or service.

   We have implemented two basic changes to this model. First, we have
   structured the information within the database as collections of data
   elements that are simple attribute/value pairs. Each individual
   record contains a specified ordered set of these data elements.

   Second, we have introduced the classing of database records into
   template types. In effect, each record is based upon one template
   of a specified set; each template contains a finite and specified
   number of data elements. This classing allows users to limit
   searches to specific collections of information, such as information
   about users, services, abstracts of papers, or descriptions of
   software.

   Since the data typing is done at the attribute level, not the
   template level, it is also possible to add non-standard attributes to
   a well-known template type.

   As an addition to the model, we require that each individual Whois++
   database on the Internet be assigned a unique handle, analogous to
   the handle associated with each database record.

   The Whois++ database structure is shown in Fig. 1.

 ______________________________________________________________________
|                                                                      |
|   + Single unique Whois++ server handle                              |
|                                                                      |
|               _______               _______               _______    |
|      handle3 |.. .. |      handle6 |.. .. |      handle9 |.. .. |    |
|             _______ |             _______ |             _______ |    |
|    handle2 |.. .. |      handle5 |.. .. |      handle8 |.. .. |      |
|           _______ |             _______ |             _______ |      |
|  handle1 |.. .. |      handle4 |.. .. |      handle7 |.. .. |        |
|          |.. .. |              |.. .. |              |.. .. |        |
|           -------               -------               -------        |
|          Template              Template              Template        |
|           Type 1                Type 2                Type 3         |
|                                                                      |
|                                                                      |
|               Fig.1 - Structure of a Whois++ database.               |
|                                                                      |
| Notes: - Entire database is identified by a single unique Whois++    |
|          server handle.                                              |
|        - Each record has a single unique handle.                     |
|        - Each record has a specific set of attributes, which is      |
|          determined by the Template Type used.                       |
|        - Each value associated with an attribute is a text string    |
|          of an arbitrary length.                                     |
|______________________________________________________________________|

1.2.2. Registering Whois++ servers

   We propose that individual database handles be registered through the
   Internet Assigned Numbers Authority (the IANA), ensuring their
   uniqueness. This will allow us to specify each Whois++ entry on the
   Internet as a unique pair consisting of a server handle and a record
   handle.

   A unique registered handle is preferable to using the host's IP
   address, since it is conceivable that the Whois++ server for a
   particular domain may move over time. If we preserve the unique
   Whois++ handle in such cases we have the option of using it for
   resource discovery and networked information retrieval (see [IIIR]
   for a discussion of resource discovery and support issues).

   Uniqueness of server handles can be guaranteed by registering them
   with IANA.

   We believe that organizing information around a series of such
   templates will make it easier for administrators to gather and
   maintain this information and thus encourage them to make such
   information available. At the same time, as users become more
   familiar with the data elements available within specific templates
   they will be able to specify their searches better, and the service
   will become more useful.

1.2.3. The Whois++ Search Selection Mechanism

   The Whois++ search mechanism is intended to be extremely simple. A
   search command comprises one required element and one optional
   element. The first (required) element is a set of one or more search
   terms. The second (optional) element is a colon followed by a set of
   one or more global constraints, which modify or control the search.
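
   The two-element structure of a search command can be sketched in
   Python as follows. This is a simplified illustration, not the full
   input grammar: it assumes that the first colon separates the search
   terms from the global constraints, that constraints are separated
   by semicolons as in the command syntax of Table I, and that no
   colon is quoted inside a search term. The split_search_command
   function and the sample attribute names are hypothetical.

```python
def split_search_command(command):
    """Split a search command into its search terms and constraints.

    Assumptions (not taken from the full grammar): the first ':' ends
    the search terms, and global constraints are ';'-separated items
    of the form 'name=value' or a bare 'name' such as HOLD.
    Returns (terms, constraints), where constraints maps lower-cased
    constraint names to their value, or to None for a bare name.
    """
    terms, sep, tail = command.partition(":")
    terms = terms.strip()
    if not terms:
        raise ValueError("a search command needs at least one search term")
    constraints = {}
    if sep:
        for item in tail.split(";"):
            item = item.strip()
            if not item:
                continue
            name, eq, value = item.partition("=")
            # Constraint names are not case sensitive, so normalize them.
            constraints[name.strip().lower()] = value.strip() if eq else None
    return terms, constraints


terms, constraints = split_search_command(
    "template=person and last-name=smith:MaxHits=10;HOLD"
)
print(terms)
print(constraints)
```

   The search terms are returned unparsed; splitting them into
   individual terms and local constraints is left out of this sketch.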

   Within each search term, the user may specify the template type,
   attribute, value or handle that any record returned must satisfy.
   Each search term can have an optional set of local constraints that
   apply only to that term.

   A Whois++ database may be seen as a single collection of
   typed records. Each search term specifies a further constraint that
   the selected set of output records must satisfy. Each term may thus
   be thought of as performing a subtractive selection, in the sense
   that any record that does not fulfill the term is discarded from the
   result set. Result sets can be further specified by supplying
   multiple search terms, related by logical connectives (AND, OR, NOT).

1.2.4. The Whois++ Architecture

   The Whois++ directory service has an architecture which is separated
   into two components: the base level server, which is described in
   this paper, and an indexing server (described in [WINDX]). A single
   physical server can act as both a base level server and an indexing
   server.

   A base level server is one which contains only filled templates. An
   indexing server is one which contains forward knowledge (q.v.) and
   pointers to other indexing servers or base level servers.

1.3. Indexing in Whois++

   Indexing in Whois++ is used to tie together many base level servers
   and index servers into a unified directory service. For more
   detailed information on this subject, see [WINDX].

   Each base level server and index server that is to participate
   in the unified directory service must generate forward knowledge
   for the entries it contains. One type of forward knowledge is the
   "centroid".

   An example of a centroid is as follows.
   Consider a Whois++ server that contains exactly three records:

      Record 1                        Record 2
      Template: Person                Template: Person
      First-Name: John                First-Name: Joe
      Last-Name: Smith                Last-Name: Smith
      Favourite-Drink: Labatt Beer    Favourite-Drink: Molson Beer

      Record 3
      Template: Domain
      Domain-Name: foo.edu
      Contact-Name: Mike Foobar

   The centroid for this server would be:

      Template:         Person
      First-Name:       Joe
                        John
      Last-Name:        Smith
      Favourite-Drink:  Beer
                        Labatt
                        Molson

      Template:         Domain
      Domain-Name:      foo.edu
      Contact-Name:     Mike
                        Foobar

   An index server would then collect this centroid for this server as
   forward knowledge.

   An index server can collect forward knowledge for any server it
   polls. In effect, all of the servers that the index server knows
   about can be searched with a single query to the index server; the
   index server holds the forward knowledge along with pointers to the
   servers it indexes, and can refer the query to servers which might
   hold information which satisfies the query.

   Implementors of this protocol are strongly encouraged to incorporate
   centroid generation abilities into their servers.

   Whois++ uses the Common Indexing Protocol, which was originally
   described in [WINDX] as a centroid-like object to provide index
   information (forward knowledge) about server contents. This work
   is being extended in the IETF's FIND Working-Group.

   -------------------------------------------------------------------

                               ----              ----
   top level                  |    |            |    |
   whois index                |    |            |    |
   servers                     ----              ----
                              /    \________     /
                             /              \   /
                           ____              ____
   first level            |    |            |    |
   whois index            |    |            |    |
   servers                 ----              ----
                          /                 /    \
                         /                 /      \
                      ____              ____      ____
   individual        |    |            |    |    |    |
   whois servers     |    |            |    |    |    |
                      ----              ----      ----

               Fig. 2 - Indexing system architecture.

   -------------------------------------------------------------------

1.4. Getting Help

   Another extension to the basic WHOIS service is the requirement that
   all servers support at least a minimal set of help commands, allowing
   users to find out information about both the individual server and
   the entire Whois++ service itself. This is done in the context of the
   new extended information model by defining two specific template
   formats and requiring each server to offer at least one example of
   each record using these formats. The operator of each Whois++ service
   is therefore expected to have, as a minimum, a single example of
   SERVICES and HELP records, which can be accessed through appropriate
   commands.

1.4.1. Minimum HELP Required

   Executing the command:

      DESCRIBE

   gives brief information about the Whois++ server.

   Executing the command:

      HELP

   gives a brief description of the Whois++ service itself.

   The text of both required help records should contain pointers to
   additional help subjects that are available.

   Executing the HELP command with a subject as its argument gives
   information on that subject.

1.5. Options and Constraints

   The Whois++ service is based upon a minimal core set of commands and
   controlling constraints. A small set of additional optional commands
   and constraints can be supported by a server. These allow users to
   perform such tasks as providing security options, modifying the
   information contents of a server or adding multilingual support. The
   required set of Whois++ commands is listed in section 2.2.
   Whois++ constraints are described in section 2.3. Optional
   constraints are described in section 2.3.2.

1.6. Formatting Responses

   The output returned by a Whois++ server is structured to allow
   machine parsing and automated handling. Of particular interest is the
   ability to return summary information about a search instead of
   having to return the entire results.

   All search output is returned in one of five output formats: FULL,
   ABRIDGED, HANDLE, SUMMARY or SERVER-TO-ASK. Note that a conforming
   server is only required to support the FULL format.

   When available, SERVER-TO-ASK format is used to indicate that a
   search cannot be completed but that one or more alternative Whois++
   servers may be able to perform the search.

   Details of each output format are specified in section 2.4.

1.7. Reporting Warnings and Errors

   The formatted response of Whois++ commands allows the encoding of
   warning or error messages to simplify parsing and machine handling.
   The syntax of the output formats is described in detail in section
   2.4, and details of Whois++ warnings and error conditions are given
   in Appendix E.

   All system messages are numerical, but can be tagged with text. It is
   the client's decision whether the text is presented to the user.

1.8. Privacy and Security Issues

   The basic Whois++ service was conceived as a simple, unauthenticated
   information lookup service, but there are occasions when
   authentication mechanisms are required. To handle such cases, one
   optional mechanism is provided for authenticating each Whois++
   transaction. This is the ability to name a (mutually-recognized)
   authentication scheme in the optional AUTHENTICATE global
   constraint.

   Note that the Whois++ authentication mechanism does not dictate the
   actual authentication scheme used; it merely provides a framework for
   indicating that a particular transaction is to be authenticated, and
   the appropriate scheme to use. This mechanism is extensible and
   individual implementors are free to add additional schemes.

   Sophisticated security and authentication schemes may be proposed to
   address specific needs.
   For example, the Simple Authentication and Security Layer (SASL)
   work proposed by John Myers (particularly for POP and IMAP) may be
   applicable here.

2. Part II - Whois++ Implementation

2.1. The Whois++ interaction model

   The Whois++ service has an assigned port number -- number 63.
   However, there is nothing inherent in the Whois++ protocol or
   interaction model that prevents it from being used on any TCP
   connection on any port -- the specification of the connection is
   outside the scope of this protocol spec. Once a connection is
   established, the server issues a banner message, and listens for
   input. The command specified in this input is processed and the
   results returned including an ending system message. If the client
   does not specify the optional HOLD constraint, the connection is
   then terminated.

   If the server supports the optional HOLD constraint, and this
   constraint is specified as part of any command, the server continues
   to listen on the connection for another (single) line of input.
   This cycle continues as long as the sender continues to append the
   HOLD constraint to each subsequent command.

2.2. The Whois++ Command set

   The Whois++ command set consists of a core set of required system
   commands, a single required search command and a set of optional
   system commands which support features that are not required by all
   servers. The set of required Whois++ system commands is listed in
   Table I. Valid search terms for the search command are described in
   Table II.

   Each Whois++ command also allows the use of one or more controlling
   constraints, which, when selected, are used to override defaults or
   otherwise modify the server's behavior.
   There is a core set of constraints that must be supported by all
   conforming servers: SEARCH (which controls the type of search
   performed), FORMAT (which determines the output format used) and
   MAXHITS (which determines the maximum number of matches that a
   search can return). These required constraints are summarized in
   Table III.

   An additional set of optional constraints is used to provide support
   for different character sets, to provide data for the authentication
   scheme, and to request multiple transactions during a single
   communications session. These optional constraints are listed in
   Table IV.

   It is possible, using the required COMMANDS and CONSTRAINTS system
   commands, to query any Whois++ server for its list of supported
   commands and constraints.

   Please note that the line terminator is defined as a carriage
   return and line feed (CR/LF) pair. Also, none of the commands or
   constraints supported by Whois++ are case sensitive. For example,
   the following are equivalent: HELP, Help, help, hElp.
   Capitalization of all letters (e.g. HELP) is used only to improve
   the legibility of this document. Finally, "attribute value" is
   defined as "the value associated with an attribute".

2.2.1. System Commands

   System commands are commands to the server for information or to
   control its operation. These include commands to list the template
   types available from individual servers, to obtain a single blank
   template of any available type, and commands to obtain the list of
   valid commands and constraints supported on a server.

   There are also commands to obtain the current version of the Whois++
   protocol supported, to access a simple help subsystem, and to obtain
   a brief description of the service provided by the Whois++
   server.
The DESCRIBE command is intended, among other
things, to support the automated registration of the service in
yellow pages directory services. The required commands are listed
in Table I.

------------------------------------------------------------------------

Short Long Form                    Functionality
----- ---------                    -------------
      COMMANDS [ ':' HOLD ]        List Whois++ commands
                                   supported by this server

      CONSTRAINTS [ ':' HOLD ]     List valid constraints
                                   supported by this server

      DESCRIBE [ ':' HOLD ]        Describe this server,
                                   formatting the response
                                   using a standard
                                   SERVICES template

 '?'  HELP [ [':' ( / HOLD)
            0*(';' ( / HOLD))]]
                                   Provide help specific to
                                   this Whois++ server, using
                                   a "Help" template

      LIST [':' ( / HOLD)
            0*(';' ( / HOLD))]
                                   List templates supported
                                   by this server

      POLLED-BY [ ':' HOLD ]       List indexing servers
                                   that are known to poll
                                   this server

      POLLED-FOR [ ':' HOLD ]      List information about
                                   servers this server polls

      SHOW [':' ]                  Show contents of template
                                   specified in

      VERSION [ ':' HOLD ]         Show the version of
                                   the protocol supported by
                                   this server

            Table I - Required Whois++ SYSTEM commands.

------------------------------------------------------------------------

Descriptions of each command follow. Examples of responses
to each command are provided in Appendix C.

2.2.1.1. The COMMANDS command

The COMMANDS command returns a list of commands that the server
supports. The response is formatted as a FULL response.

2.2.1.2. The CONSTRAINTS command

The CONSTRAINTS command returns a list of both the constraints and
their values that the server supports. The response is formatted as a
FULL response, where every constraint is represented as a separate
record. The template name for these records is CONSTRAINT. No
attention is paid to handles.
Each record has, as a minimum, the
following two attributes:

- "Constraint", whose value is the constraint name
- "Default", which shows the default value for this constraint.

If the client is permitted to change the value of the constraint,
there is also:

- "Range", which contains a list of values that this
  server supports, as a comma separated list, or, if the range
  is numerical, as a pair of numbers separated with a hyphen.

Note that, irrespective of whether a session is continued (with the
HOLD constraint) or not, constraints are set to the default value
unless explicitly changed with a constraint in each query.

2.2.1.3. The DESCRIBE command

The DESCRIBE command gives a brief description of the server in a
"Services" template. The result is formatted as a FULL response with,
as a minimum, one attribute:

- "Text", which describes the service in a form legible by human
  users.

2.2.1.4. The HELP command

The HELP command takes an optional argument which is the subject on
which to get help. The answer is formatted as a FULL format response.

2.2.1.5. The LIST command

The LIST command returns the names of the templates available on the
server. The answer is formatted as a FULL format response.

2.2.1.6. The POLLED-BY command

The POLLED-BY command returns a list of servers and the templates and
attribute names that those servers polled as centroids from this
server. The response is in FULL format with two attributes, "Template"
and "Field", whose values are lists of the names of the polled
templates and fields, respectively. An empty result means either
that the server is not polled by anyone, or that it doesn't support
indexing.

2.2.1.7. The POLLED-FOR command

The POLLED-FOR command returns a list of servers that this server has
polled, and the template and attribute names for each of those.
The
answer is in FULL format with two attributes, Template and Field. An
empty result means either that the server is not polling anyone, or
that it doesn't support indexing.

2.2.1.8. The SHOW command

The SHOW command takes a template name as argument and returns
information about that template, formatted as a FULL response.
The answer is formatted as a blank template with the requested name.

2.2.1.9. The VERSION command

The output format is a FULL response containing a record with template
name VERSION. The record must have attribute name "Version", whose
value is "2.0" for this version of the protocol. The record may also
have the additional fields "Program-Name" and "Program-Version", which
give information about the server implementation if the server so
desires.

If the server also supports the earlier version of the protocol,
"1.0", two records are given back as a response to the VERSION
command, one for each version supported.

2.2.2. The SEARCH Command

A SEARCH command comprises one required element and one optional
element. The first (required) element is a set of one or more search
terms. The second (optional) element is a set of global constraints,
which modify or control the search.

Each attribute value in the Whois++ database is divided into one or
more words separated by whitespace:

   whitespace = 1*( %d32 / %d09 / %d10 / %d13 / %d64 )
              ;     space  tab    LF     CR     @

Each search term operates on every word in the attribute value.
Two or more search terms have to be combined with boolean operators
AND, OR or NOT. The operator AND has higher precedence than the
operator OR, but this can be changed by the use of parentheses.

Boolean operators function as follows for two search terms, A and
B. Let A1 be the result set from the first search term and B1 be the
result set from the second search term.
The operation A AND B returns the
hits in the intersection of sets A1 and B1. The operation A OR B
returns the hits in the union of the sets A1 and B1. The operation
NOT A returns all possible results that are not in set A1. The
behaviour of the boolean operators can be generalized to N search
terms where N > 2. Note that NOT has a higher precedence than AND
or OR, so NOT A AND B returns the hits in B1 that are not in A1.

Search constraints that apply to all search terms are specified as
global constraints. The search terms and the global constraints are
separated with a colon (':'). Each additional global constraint is
appended to the end of the search command, and a semicolon ';' is
used as the delimiter between global constraints.

If any of the search constraints cannot be fulfilled, or if
several of the specified constraints are mutually exclusive, the
server ignores the constraints that cannot be fulfilled and those
that are mutually exclusive. The server performs the search using
only the remaining constraints and returns the corresponding set of
records.

The set of required constraints is listed in Table III. The set
of optional constraints is listed in Table IV.

As an option, the server may accept specifications for attributes
to be included or excluded from a reply. Thus, users could specify
-only- those attributes to return, or specific attributes to filter
out, thus creating custom views.

2.2.2.1. Format of a Search Term

Each search term consists of one of the following:

1) A search string

2) A search term specifier (as listed in Table II), followed by a
   '=', followed by a search string. This is noted as:

      =

3) An attribute name, followed by '=', followed by
   a search string:

      =

If no search term specifier is provided, then the search will be
applied to attribute values only.
This corresponds to an identifier
of VALUE.

When the user specifies the search term using the form:

   " = "

this is considered to be an ATTRIBUTE-VALUE search.

For discussion of the system reply format, and selecting the
appropriate reply format, see section 2.4.

-------------------------------------------------------------------

Valid specifiers:
-----------------

Name             Functionality
----             -------------

HANDLE           Confine search to handles.
VALUE            Confine search to attribute
                 values.

       Table II - Valid search command term specifiers.

-------------------------------------------------------------------

2.2.2.2. Format of a Search String

Special characters that need to be quoted are preceded by a
backslash, '\'.

Special characters are space ' ', tab, equal sign '=', comma ',',
colon ':', backslash '\', semicolon ';', asterisk '*', period '.',
parentheses '()', square brackets '[]', dollar sign '$' and
circumflex '^'.

If the search term is given in some other character set than
ISO-8859-1, it must be specified by the constraint INCHARSET.

2.3. Whois++ Constraints

Constraints are intended to be hints or recommendations to the server
about how to process a command. They may also be used to override
default behaviour, such as requesting that a server not drop the
connection after performing a command.

Thus, a user might specify a search constraint as "SEARCH=exact",
which means that the search engine is to perform an exact match
search. The user might also specify "LANGUAGE=Fr", which means that
the server should (if possible) display the French versions of the
attribute values, and if possible use French in fuzzy matches. The
server should also issue system messages in French.

In general, a constraint takes the form of a constraint name,
followed by "=", followed by a value, where the value is one of a
specified set of valid values. The notable
exception is "HOLD", which takes no argument.
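
The quoting rule of section 2.2.2.2 and the name/value shape of
constraints can be combined into a small client-side helper. The
following Python sketch is illustrative only and is not part of the
protocol: the function names are hypothetical, and it assumes that
the caller has already assembled the search terms and that constraint
values need no quoting of their own.

```python
from typing import Iterable, Optional, Tuple

# Special characters listed in section 2.2.2.2: space, tab, '=', ',', ':',
# backslash, ';', '*', '.', '(', ')', '[', ']', '$' and '^'.
SPECIAL_CHARS = frozenset(" \t=,:\\;*.()[]$^")


def quote_search_string(text: str) -> str:
    """Precede each special character with a backslash."""
    return "".join("\\" + ch if ch in SPECIAL_CHARS else ch for ch in text)


def format_constraint(name: str, value: Optional[str] = None) -> str:
    """Render 'name=value', or a bare name for a constraint such as HOLD."""
    return name if value is None else name + "=" + value


def build_search_command(
    terms: str, constraints: Iterable[Tuple[str, Optional[str]]] = ()
) -> str:
    """Join search terms and global constraints into one command line.

    A colon separates the terms from the global constraints, semicolons
    separate the constraints, and the line ends with a CR/LF pair.
    """
    parts = [format_constraint(name, value) for name, value in constraints]
    line = terms + (":" + ";".join(parts) if parts else "")
    return line + "\r\n"
```

For instance, a client could pass "name=" followed by
quote_search_string("John Doe") as the terms, together with the
constraints ("SEARCH", "exact") and ("HOLD", None), to obtain a single
command line that asks for an exact match and keeps the connection
open.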
The CONSTRAINTS system command is used to list the search constraints
supported by an individual server.

If a server cannot satisfy the specified constraint, the server
should indicate this to the user through the use of system messages.
In such cases, the search is still performed, with the server
ignoring unsupported constraints.

2.3.1. Required Constraints

The following CONSTRAINTS must be supported in all conforming Whois++
servers.

------------------------------------------------------------------

Format
------

SEARCH=       exact / lstring

FORMAT=       full / abridged / handle / summary

MAXHITS=      1-

          Table III - Required Whois++ constraints.

------------------------------------------------------------------

2.3.2. Optional CONSTRAINTS

The following CONSTRAINTS and constraint values are not required of a
conforming Whois++ server, but may be supported. If supported, their
names and supported values must be returned in the response to the
CONSTRAINTS command.

---------------------------------------------------------------------

Format
------

SEARCH=       regex / fuzzy / substring

CASE=         ignore / consider

FORMAT=       server-to-ask

MAXFULL=      1-

AUTHENTICATE= data

INCHARSET=    us-ascii / iso-8859-* /
              UNICODE-1-1-UTF-8 / UNICODE-2-0-UTF-8 / UTF-8

OUTCHARSET=   us-ascii / iso-8859-* /
              UNICODE-1-1-UTF-8 / UNICODE-2-0-UTF-8 / UTF-8

LANGUAGE=

HOLD

IGNORE=
INCLUDE=

N.B.: "UTF-8" is as defined in [RFC2279]. This is the character set
      label that should be used for UTF encoded information; the
      labels "UNICODE-2-0-UTF-8" and "UNICODE-1-1-UTF-8" are retained
      primarily for compatibility with older Whois++ servers, and
      as outlined in [RFC2279].

          Table IV - Optional Whois++ constraints.

----------------------------------------------------------------------

2.3.2.1. The SEARCH Constraint

The SEARCH constraint is used for specifying the method that is to be
used for the search. The default method is "exact". Following is a
definition of each search method.

exact       The search will succeed for a word that exactly
            matches the search string.

substring   The search will succeed for a word that contains
            the search string as a part of the word.

regex       The search will succeed for a word when a regular
            expression matches the searched data. A regular
            expression is built up from the constructs
            '*', '.', '^', '$', and '[]'. For use of
            regular expressions see Appendix H.

fuzzy       The search will succeed for words that match the
            search string by using an algorithm designed to catch
            closely related names with different spelling, e.g.
            names with the same pronunciation. The server
            chooses which algorithm to use, but it may vary
            depending on template name, attribute name and
            language used (see the LANGUAGE constraint).

lstring     The search will succeed for words that begin
            with the search string.

2.3.2.2. The FORMAT Constraint

The FORMAT constraint describes what format the result will be in.
The default format is FULL. For a description of each format, see
Server Response Modes below.

2.3.2.3. The MAXFULL Constraint

The MAXFULL constraint sets the limit of the number of matching
records the server allows before it enforces SUMMARY responses. The
client may attempt to override this value by specifying another value
for that constraint. Example: If, for privacy reasons, the server is
to return the response in SUMMARY format if the number of hits
exceeds 2, the MAXFULL constraint is set to 2 by the server.

Regardless of what format the client asked for, the server will
change the response format to SUMMARY when the number of matching
records equals or exceeds this value.

2.3.2.4. The MAXHITS Constraint

The MAXHITS constraint sets the maximum number of records returned
to the client in response to a query.

2.3.2.5. The CASE Constraint

The CASE constraint defines whether the search should be case
sensitive or not. The default is to have case ignored.

2.3.2.6. The AUTHENTICATE Constraint

The AUTHENTICATE constraint describes which authentication scheme to
use when executing the search. Depending on the authentication
scheme used, some other constraints may have to be specified. The
authentication scheme definition identifies which constraints it
requires.

2.3.2.7. The LANGUAGE Constraint

The LANGUAGE constraint specifies the language in which the client
wishes to receive responses. It therefore specifies which attribute
values should be presented to the user (i.e., only those in the
specified language, or for which no language information is
available). It can also be used as extra information for the
fuzzy matching search method, and it might also be used to tell the
server to give the system responses in another language. This
should preferably be handled by the client. The language codes
defined in RFC 1766 [ALVE95] should be used as a value for the
language constraint. In these, the case of the letters is
insignificant.

If a record has attribute values in different languages, and no
LANGUAGE search constraint was given in the query, the switch
between the different languages should be given in the response by
the use of system message 601, which has one argument only: the
name of the language or one of the predefined strings "ANY" or "DEF".
A block of alternative attribute values starts with a language
definition like "% 601 SE". After the first language specification,
zero or more language specifications can be given, each switching
into the desired language.
When all specific languages have been
tagged, the specification "% 601 DEF" can be used for specifying
default attribute values. A block of alternative attributes must
end with "% 601 ANY".

The following is an example of a response using the language
messages:

   # FULL USER LOCAL USER-DOE
   % 601 FR
    Name: Monsieur John Doe
   % 601 SV
    Name: Herr John Doe
   % 601 DEF
    Name: Mister John Doe
   % 601 ANY
    Email: jdoe@doe.pp.se
   # END

The language specifications (the % 601 messages) may be suppressed by
the server if the client has explicitly, by using the global
constraint LANGUAGE, asked for a specific language.

2.3.2.8. The INCHARSET Constraint

The INCHARSET constraint tells the server in which character set the
search string itself is given. The default character set is
ISO-8859-1.

2.3.2.9. The OUTCHARSET Constraint

The OUTCHARSET constraint tells the server in which character set the
search result data (not attribute names or system information) is
to be given. The default character set is ISO-8859-1,
but the server may choose something else.

2.3.2.10. The IGNORE Constraint

The IGNORE constraint specifies which attributes NOT to include in
the result. All other attributes will be included (as if named
explicitly by the "include" constraint).

If an attribute is named both with the "include" and "ignore"
constraint, the attribute is to be included in the result, but the
system message "% 112 Requested constraint not fulfilled" must be
sent.

2.3.2.11. The INCLUDE Constraint

The INCLUDE constraint specifies which attributes to include in the
result. All other attributes will be excluded (as if named explicitly
by the "ignore" constraint).
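
The INCLUDE and IGNORE rules, including the rule that an attribute
named by both constraints is included and message 112 is sent, can be
sketched as a filter over one record. This Python fragment is only an
illustration under stated assumptions: the function name is
hypothetical, a record is modelled as a dictionary of attribute names
to values, and attribute names are compared without regard to case
(which the text above does not state explicitly).

```python
from typing import Dict, Iterable, List, Optional, Tuple

CONFLICT_MESSAGE = "% 112 Requested constraint not fulfilled"


def select_attributes(
    record: Dict[str, str],
    include: Optional[Iterable[str]] = None,
    ignore: Optional[Iterable[str]] = None,
) -> Tuple[Dict[str, str], List[str]]:
    """Apply INCLUDE/IGNORE to one record.

    Returns the filtered record and any system messages to send.
    """
    # Assumption: attribute names are matched case-insensitively.
    include_set = {name.lower() for name in include or ()}
    ignore_set = {name.lower() for name in ignore or ()}

    messages = []
    if include_set & ignore_set:
        # Named by both constraints: still included, but flagged.
        messages.append(CONFLICT_MESSAGE)

    result = {}
    for name, value in record.items():
        key = name.lower()
        if include_set:
            # INCLUDE given: only the named attributes are returned.
            keep = key in include_set
        else:
            # Only IGNORE (or nothing) given: drop the named attributes.
            keep = key not in ignore_set
        if keep:
            result[name] = value
    return result, messages
```

Letting a non-empty INCLUDE list take priority is one way to satisfy
the conflict rule; a server could equally well compute both views and
merge them.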
If an attribute is named both with the "include" and "ignore"
constraint, the attribute is to be included in the result, but the
system message "% 112 Requested constraint not fulfilled" must be
sent.

2.3.2.12. The HOLD Constraint

The HOLD constraint requests that the server hold open the connection
after sending the response to the query. The server waits for
another user input string.

2.4. Server Response Modes

The grammar for Whois++ responses is given in Appendix G, and
described below.

There are currently a total of five different response modes possible
for Whois++ servers. These are FULL, ABRIDGED, HANDLE, SUMMARY and
SERVER-TO-ASK. The syntax of each output format is specified in more
detail in Appendix G.

1) A FULL format response provides the complete contents of a
   template matching the specified query, including the template
   type, the server handle and an optional record handle.

2) An ABRIDGED format response provides a brief summary, including
   (as a minimum) the server handle, the corresponding record
   handle and relevant information for that template.

3) A HANDLE format response returns a line with information about
   the server handle and record handle for a record that matched
   the specified query.

4) A SUMMARY response provides only a brief summary of information:
   the number of matches and the list of template types in which
   the matches occurred.

5) A SERVER-TO-ASK response only returns pointers to other index
   servers which might possibly be able to answer the specified
   query.

The server may optionally respond with an empty result set and may
also respond with an empty response together with a system message
to indicate that the query was too complex for it to fulfill.

2.4.1. Default Responses

By default, a Whois++ server will provide FULL responses.
This may be
changed by the client with the use of the global constraint "format".

The server will not respond with more matches than the value
specified with the global constraint "maxhits" in any response
format. If the number of matches exceeds this value, the server will
issue the system message 110 (maxhits value exceeded), but will
still show the responses, up to the number of the "maxhits"
constraint value. This mechanism will allow the server to hide the
number of possible matches to a search command.

2.4.2. Format of Responses

Each response consists of a numerical system generated message, which
can be tagged with text, followed by an optional formatted response
message, followed by a second system generated message. The formatted
response itself can include system messages, for example for switches
in language.

That is:

   '%'

   [ ]

   '%'

If there are no matches to a query, the system is not required to
generate any output as a formatted response, although it must still
generate system messages.

For information about the standard text for system messages, see
Appendix E.

2.4.3. Syntax of a Formatted Response

All formatted responses, except for the HANDLE response, consist of a
response-specific START line, followed by an optional
response-specific data section, followed by a TERMINATION line. The
HANDLE response is different in that it only consists of a START
line. It is permissible to insert any number of lines consisting
solely of CR/LF pairs within a formatted response to improve
readability.

Each line shall be limited to no more than 81 characters, including
the terminating CR/LF pair.
If a line (including the required
leading single space) would exceed 81 characters, it must be broken
into lines of no more than 81 characters, with each continuation line
beginning with a "+" character in the first column instead of the
leading space.

If an attribute value in a data section includes a line break, the
line break must be replaced by a CR/LF pair and the following line
must begin with a "-" character in the first column, instead of the
leading space. The attribute name is not repeated on consecutive
lines.

A TERMINATION line consists of a line with a '#' in the first column,
followed by one space (ASCII 32) character, followed by the keyword
END, followed by zero or more characters, followed by a CR/LF pair.

A response-specific section will be one of the following:

1) FULL Format Response
2) ABRIDGED Format Response
3) HANDLE Format Response
4) SUMMARY Format Response
5) SERVER-TO-ASK Format Response

2.4.3.1. A FULL format response

A FULL format response consists of a series of responses, each
consisting of a START line, followed by the complete template
information for the matching record and a TERMINATION line.

Each START line consists of a '#' in the first column, followed by
one space character, the word "FULL", a space character,
the name of the corresponding template type, one space
character, the server handle, a space character, (optionally) the
handle for the record, and a terminating CR/LF pair.

The template information for the record will be returned as a series
of lines consisting of a single space, followed by the corresponding
line of the record.

The line of the record shall consist of a single space and the
attribute name followed by a ':', a single space, the value of that
attribute, and a CR/LF pair.

2.4.3.2. ABRIDGED Format Response

Each ABRIDGED format response consists of a START line, a single line
excerpt of the template information from each matching record and a
TERMINATION line. The excerpt information shall include information
that is relevant to the template type.

The START line consists of a '#' in the first column, followed by one
space character, the word "ABRIDGED", a space character,
the name of the corresponding template type, a space character,
the server handle, a space character, the handle for the
record, and a terminating CR/LF pair.

The abridged template information will be returned as a line,
consisting of a single space, followed by the abridged line of the
record and a CR/LF pair.

2.4.3.3. HANDLE Format Response

A HANDLE response consists of a single START line, which shall start
with a '#' in the first column, followed by one space
character, the word "HANDLE", a space character, the name of
the corresponding template, a space character, the handle for
the server, a space character, the handle for that record, and
a terminating CR/LF pair.

2.4.3.4. SUMMARY Format Response

A SUMMARY format response consists of a single response,
consisting of a line listing the number of matches to the specified
query, optionally a count of referrals, followed by a list of all
template types which satisfied the query at least once.

The START line shall begin with a '#' in the first column, be
followed by one space character (decimal 32), the word "SUMMARY", a
single space character, the handle for the server, and a terminating
CR/LF pair.

The format of the attributes in the SUMMARY format follows the
rules for the FULL template, with the attributes "matches",
"referrals" and "templates". "matches" and "templates" are
mandatory, "referrals" optional.
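
The START lines described in sections 2.4.3.1 through 2.4.3.4 share a
common shape, which a client might recognise as sketched below. This
Python fragment is illustrative only: the names StartLine and
parse_start_line are hypothetical, the SERVER-TO-ASK START line is not
handled, and the sketch assumes that handles and template names
contain no spaces.

```python
from typing import NamedTuple, Optional


class StartLine(NamedTuple):
    mode: str                      # FULL, ABRIDGED, HANDLE or SUMMARY
    template: Optional[str]        # absent for SUMMARY
    server_handle: str
    record_handle: Optional[str]   # optional for FULL, absent for SUMMARY


def parse_start_line(line: str) -> StartLine:
    """Parse one START line; raise ValueError for anything else."""
    fields = line.rstrip("\r\n").split(" ")
    if len(fields) < 2 or fields[0] != "#":
        raise ValueError("not a START line: %r" % line)

    mode = fields[1].upper()
    args = [field for field in fields[2:] if field]

    if mode == "SUMMARY" and len(args) == 1:
        return StartLine(mode, None, args[0], None)
    if mode == "FULL" and len(args) in (2, 3):
        handle = args[2] if len(args) == 3 else None
        return StartLine(mode, args[0], args[1], handle)
    if mode in ("ABRIDGED", "HANDLE") and len(args) == 3:
        return StartLine(mode, args[0], args[1], args[2])
    raise ValueError("not a recognised START line: %r" % line)
```

A TERMINATION line such as "# END" is rejected by this function, so a
caller can use the ValueError to tell START lines apart from the other
lines of a formatted response.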
The first line must begin with the string "matches:", be
followed by a space and the number of responses to the query and
terminated by a CR/LF pair.

The following line shall either begin with the string "templates: "
or the string "referrals: ". The string "templates: " is followed
by a CR/LF separated list of the names of the template types
which matched the query. Each line following the first, which
includes the text "templates:", must begin with a '-' instead of
a space. The string "referrals: " is followed by the number of
referrals included in the number of hits.

2.4.3.5. SERVER-TO-ASK Response

A SERVER-TO-ASK response consists of information to the client about
a server to contact next to resolve a query. If the server has
pointers to more than one server, it will present additional
SERVER-TO-ASK responses.

The SERVER-TO-ASK response will consist of a START line and a number
of lines with attribute-value pairs, separated by CRLF. Each line is
indented with one space. The end of a SERVER-TO-ASK response is
indicated with a TERMINATION line.

Each START line consists of a '#' in the first column, followed by
one space character, the word "SERVER-TO-ASK", a space
character, the handle of the server and a terminating CR/LF pair.

1. "Server-Handle" - The server handle of the server pointed at.
                     (req.)
2. "Host-Name"     - Hostname for the server pointed at.
3. "Host-Port"     - Port number for the server pointed at.
4. "Protocol"      - The protocol to use when contacting this server.
                     (opt.)

Other attributes may be present, depending on the index server.
The default protocol to use is Whois++.

2.4.4. System Generated Messages

All system generated messages must have a '%' as the first
character, a space as the second one, followed by a three digit
number, a space and an optional text message.
The total length of the
line must be no more than 81 characters, including the
terminating CR/LF pair. There is no limit to the number of system
messages that may be generated.

The format for multiline replies requires that every line, except the
last, begin with "%", followed by space, the reply code, a hyphen,
and an optional text. The last line will begin with "%", followed by
space, the reply code, a space and some optional text.

System generated messages displayed before or after the formatted
response section are expected to refer to operation of the system or
refer to the entire query. System generated messages within the
output of an individual record during a FULL response are expected to
refer to that record only, and could (for example) be used to
indicate problems with that record of the response. See Appendix E
for a description of system messages.

2.5. Compatibility with Older WHOIS Servers

Note that this format, although potentially more verbose, is still in
a human readable form. Responses from older systems that do not
follow this format are still conformant, since their responses would
be interpreted as being equivalent to optional text messages, without
a formatted response. Clients written to this specification would
display the responses as an advisory text message, where it would
still be readable by the user.

3. Miscellaneous

3.1. Acknowledgements

This document has been through many iterations of refinement, with
contributions of different natures along the way. These
acknowledgements accrue.

The Whois++ effort began as an intensive brainstorming session at the
24th IETF, in Boston, Massachusetts.
Present at the birth, and
contributing ideas through this early phase, were (alphabetically)
Peter Deutsch, Alan Emtage, Jim Fullton, Joan Gargano, Brad
Passwaters, Simon Spero, and Chris Weider. Others who have since
helped shape this document with feedback and suggestions include
Roxana Bradescu, Patrik Faltstrom, Kevin Gamiel, Dan Kegel, Michael
Mealling, Mark Prior and Rickard Schoultz.

Version 2 of the protocol spec is based on input during the lifetime
of version 1. Special mention goes to Jeff Allen, Leslie Daigle,
and Philippe Boucher. During the polishing of the RFC for version 2,
important input was given by Len Charest, Clarke Anderson and others
in the ASID working group of the IETF.

This work was supported in part by grant 12/39/01 from the UK
Electronic Libraries Programme (eLib), an initiative of the
Joint Information Systems Committee (JISC). This grant has
provided the opportunity to test the protocol specification
by developing a test suite. The challenge was not only to provide AN
implementation that satisfied the document, but to build tools that
would be able to respond to all POSSIBLE responses that could be
implemented from the spec. This led to the contribution of some
textual clarifications. Specific thanks go to Bill Heelan and
Philippe Boucher.

3.2. References

[ALVE95]   Alvestrand H., "Tags for the Identification of
           Languages", RFC 1766, UNINETT, March 1995.

[RFC2234]  Crocker, D. and P. Overell, "Augmented BNF for
           Syntax Specifications: ABNF", RFC 2234, November
           1997.

[HARR85]   Harrenstein K., Stahl M., and E. Feinler,
           "NICNAME/WHOIS", RFC 954, SRI, October 1985.

[POST82]   Postel J., "Simple Mail Transfer Protocol", STD 10,
           RFC 821, USC/Information Sciences Institute,
           August 1982.

[IIIR]     Weider C., and P. Deutsch, "A Vision of an
           Integrated Internet Information Service", RFC 1727,
           Bunyip Information Systems, Inc., December 1994.

[WINDX]    Weider, C., J. Fullton, and S. Spero, "Architecture
           of the Whois++ Index Service", RFC 1913, February
           1996.

[RFC2279]  Yergeau F., "UTF-8, a transformation format of ISO
           10646", RFC 2279, January 1998.

3.3. Authors' Addresses

Patrik Faltstrom
Tele2
Borgarfjordsgatan 16
BOX 62
194 64 Kista
SWEDEN

Email: paf@swip.net

Leslie L. Daigle
Bunyip Information Systems Inc.
310 Ste. Catherine St. W
Suite 300
Montreal, Quebec, CANADA
H2X 2A1

Email: leslie@bunyip.com

Sima Newell
Bunyip Information Systems Inc.
310 Ste. Catherine St. W
Suite 300
Montreal, Quebec, CANADA
H2X 2A1

Email: sima@bunyip.com

Appendix A - Some Sample Queries

   author=leslie and template=user

The result will consist of all records where attribute "author"
matches "leslie" with case ignored. Only USER templates will be
searched. An example of a matching attribute is
"Author=Leslie L. Daigle".

This is the typical case of searching.

   author=leslie and template=user:language=fr

The result will consist of the same records as above, but if
attributes are available in alternative languages, only the
ones in French will be displayed. These are either the ones which
have explicitly been tagged as having French values, or ones that
are tagged as being in the "DEF" (default) language.

   schoultz and rick;search=lstring

The result will consist of all records which have one attribute value
matching "schoultz" exactly (because the default search type is
exact) and one attribute with "rick" as leading substring, both with
case ignored. One example is "Name=Rickard Schoultz".

      value=phone;search=substring

   The result will consist of all records which have attribute values
   matching *phone*, for example the record "Name=Acme telephone inc.",
   but will not match the attribute name "phone". (Since the term
   specifier is "value" by default, the search term could just as well
   have been simply "phone".)

      ucdavis;search=substring and (gargano or joan):include=name,email

   This search command will find records that contain the word
   "gargano" or "joan" somewhere in the record, and that have
   "ucdavis" somewhere within a word. The result will only show the
   "name" and "email" fields.

Appendix B - Some Sample Responses

   1) FULL format responses:

   # FULL USER SERVERHANDLE1 PD45
    Name: Peter Deutsch
    email: peterd@bunyip.com
   # END
   # FULL USER SERVERHANDLE1 AE1
    Name: Alan Emtage
    email: bajan@bunyip.com
   # END
   # FULL USER SERVERHANDLE1 NW1
    Name: Nick West
    Favourite-Bicycle-Forward-Wheel-Brand: New Bicy
   +cles Acme Inc.
    email: nick@bicycle.acme.com
    My-favourite-song: Happy birthday to you!
   -Happy birthday to you!
   -Happy birthday dear Nick!
   -Happy birthday to you.
   # END
   # FULL SERVICES SERVERHANDLE1 WWW1
    Type: World Wide Web
    Location: the world
   # END

   --------------------

   2) An ABRIDGED format response:

   # ABRIDGED USER SERVERHANDLE1 PD45
    Peter Deutsch    peterd@bunyip.com
   # END
   # ABRIDGED USER SERVERHANDLE1 AE1
    Alan Emtage      bajan@bunyip.com
   # END
   # ABRIDGED USER SERVERHANDLE1 WWW1
    World Wide Web   the world
   # END

   --------------------

   3) HANDLE format responses:

   # HANDLE USER SERVERHANDLE1 PD45
   # HANDLE USER SERVERHANDLE1 AE1
   # HANDLE SERVICES SERVERHANDLE1 WWW1

   --------------------

   4) A SUMMARY format response:

   # SUMMARY SERVERHANDLE1
    Matches: 35
    Referrals: 2
    Templates: User
   -Services
   -Abstracts
   # END

Appendix C - Sample responses to system commands

C.1 Response to the LIST command

   # FULL LIST SERVERHANDLE1
    Templates: USER
   -SERVICES
   -HELP
   # END

C.2 Response to the SHOW command

   This example shows the result after issuing "show user":

   # FULL USER SERVERHANDLE1
    Name:
    Email:
    Work-Phone:
    Organization-Name:
    City:
    Country:
   # END

C.3 Response to the POLLED-BY command

   # FULL POLLED-BY SERVERHANDLE1
    Server-handle: serverhandle2
    Cached-Host-Name: sunic.sunet.se
    Cached-Host-Port: 7070
    Template: USER
    Field: ALL
   # END
   # FULL POLLED-BY SERVERHANDLE1
    Server-handle: serverhandle3
    Cached-Host-Name: kth.se
    Cached-Host-Port: 7070
    Template: ALL
    Field: Name,Email
   # END

C.4 Response to the POLLED-FOR command

   # FULL POLLED-FOR SERVERHANDLE1
    Server-Handle: serverhandle5
    Template: ALL
    Field: Name,Address,Job-Title,Organization-Name,
   +Organization-Address,Organization-Name
   # END
   # FULL POLLED-FOR SERVERHANDLE1
    Server-Handle: serverhandle4
    Template: USER
    Field: ALL
   # END

C.5 Response to the
VERSION command

   # FULL VERSION BUNYIP.COM
    Version: 2.0
    Program-Name: Digger
    Program-Version: 3.0b1
    Program-Author: Bunyip Information Systems Inc.
    Program-Author-Email: digger-info@bunyip.com
    Bug-Report-Email: digger-bugs@bunyip.com
   # END

C.6 Response to the CONSTRAINTS command

   # FULL CONSTRAINTS SERVERHANDLE1
    CONSTRAINT: maxhits
    DEFAULT: 100
    RANGE: 0-100
   # END
   # FULL CONSTRAINTS SERVERHANDLE1
    CONSTRAINT: case
    DEFAULT: ignore
    RANGE: ignore, consider
   # END
   # FULL CONSTRAINTS SERVERHANDLE1
    CONSTRAINT: search
    DEFAULT: exact
    RANGE: exact, lstring, substring, fuzzy
   # END
   # FULL CONSTRAINTS SERVERHANDLE1
    CONSTRAINT: language
    DEFAULT: DEF
    RANGE: FR, EN, SV, ANY, DEF
   # END
   # FULL CONSTRAINTS SERVERHANDLE1
    CONSTRAINT: incharset
    DEFAULT: ISO-8859-1
    RANGE: ISO-8859-1, UTF-8
   # END
   # FULL CONSTRAINTS SERVERHANDLE1
    CONSTRAINT: outcharset
    DEFAULT: ISO-8859-1
    RANGE: ISO-8859-1, UTF-8, HTML
   # END

C.7 Response to the COMMANDS command

   # FULL COMMANDS SERVERHANDLE1
    Commands: commands
   -constraints
   -describe
   -help
   -list
   -polled-by
   -polled-for
   -show
   -version
   # END

Appendix D - Sample Whois++ session

   Below is an example of a session between a client and a server. The
   angle brackets to the left are not part of the communication; they
   are only there to denote the direction of the communication between
   the server and the client. Text following '>' is a message from
   the server, and text following '<' is a message from the client.

   Client connects to the server

   >% 220-Welcome to
   >% 220-the Whois++ server
   >% 220 at ACME inc.
   <name=Nick:hold
   >% 200 Command okay
   >
   ># FULL USER ACME.COM NW1
   > name: Nick West
   > email: nick@acme.com
   ># END
   ># SERVER-TO-ASK ACME.COM
   > Server-Handle: SUNETSE01
   > Host-Name: whois.sunet.se
   > Host-Port: 7070
   ># END
   ># SERVER-TO-ASK ACME.COM
   > Server-Handle: KTHSE01
   ># END
   >% 226 Transfer complete
   <version
   >% 200 Command okay
   ># FULL VERSION ACME.COM
   > Version: 2.0
   ># END
   >% 226 Transfer complete
   >% 203 Bye
   Server closes the connection

   In the example above, the client connected to a Whois++ server and
   queried for all records where the attribute "name" equals "Nick", and
   asked the server not to close the connection after the response by
   using the global constraint "HOLD".

   The server responds with one record and pointers to two other
   servers that hold either records or pointers to further servers.

   The client continues by asking for the server's version number
   without using the HOLD constraint. After responding with the protocol
   version, the server closes the connection.

   Note that each response from the server begins with system message
   200 (Command OK), and ends with system message 226 (Transfer
   Complete).

Appendix E - System messages

   A system message begins with a '%', followed by a space and a three
   digit number, a space, and an optional text message. The message line
   must be no more than 81 characters long, including the terminating CR
   LF pair. There is no limit to the number of system messages that may
   be generated.

   A multiline system message has a hyphen instead of a space in column
   6, immediately after the numeric response code, in all lines except
   the last one, where the space is used.

   Example 1

   % 200 Command okay

   Example 2

   % 220-Welcome to
   % 220-the Whois++ server
   % 220 at ACME inc.
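
   The framing rule above (hyphen in column 6 on every line but the
   last) can be sketched in a few lines of Python. This is only an
   illustration, not part of the protocol: the function name
   parse_system_message is hypothetical, and joining the text parts
   with a single space is an assumption made here for readability.

```python
import re
from typing import List, Tuple

# '%', a space, three digits, then '-' (more lines follow) or ' ' (last line).
_LINE = re.compile(r"^% (\d{3})([- ])(.*)$")


def parse_system_message(lines: List[str]) -> Tuple[int, str]:
    """Collapse one, possibly multiline, system message into (code, text).

    Hypothetical helper: the text parts are joined with a single space,
    which is a choice of this sketch and not something the text mandates.
    """
    code = None
    parts = []
    for raw in lines:
        m = _LINE.match(raw.rstrip("\r\n"))
        if m is None:
            raise ValueError("not a system message line: %r" % raw)
        if code is not None and int(m.group(1)) != code:
            raise ValueError("response code changed inside one message")
        code = int(m.group(1))
        parts.append(m.group(3))
        if m.group(2) == " ":  # a space in column 6 ends the message
            return code, " ".join(parts)
    raise ValueError("message ended without a final line")


greeting = [
    "% 220-Welcome to\r\n",
    "% 220-the Whois++ server\r\n",
    "% 220 at ACME inc.\r\n",
]
print(parse_system_message(greeting))
```

   Applied to Example 2, the sketch yields the code 220 together with
   the three text parts joined into one string.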

   The client is not expected to parse the text part of the response
   message except when receiving reply 600 or 601. In reply 600, the
   text part is the name of the character set that the server will use
   in the rest of the response. In reply 601, the text part specifies
   the language of the attribute values that follow. The valid values
   for character sets are specified in the "characterset" list in the
   grammar in Appendix F.

   The theory of reply codes is described in Appendix E in STD 10, RFC
   821 [POST82].

------------------------------------------------------------------------

   List of system response codes
   ------------------------------

   110 Too many hits                      The number of matches exceeded
                                          the value specified by the
                                          maxhits constraint. Server
                                          will still reply with as many
                                          records as "maxhits" allows.

   111 Requested constraint not supported One or more constraints in
                                          query is not implemented, but
                                          the search is still done.

   112 Requested constraint not fulfilled One or more constraints in
                                          query has unacceptable value
                                          and was therefore not used,
                                          but the search is still done.

   200 Command Ok                         Command accepted (i.e., syntax
                                          okay, will be executed).
                                          The client must wait for a
                                          transaction end system
                                          message.

   201 Command Completed successfully     Command accepted and executed.

   203 Bye                                Server is closing connection

   220 Service Ready                      Greeting message. Server is
                                          accepting commands.

   226 Transaction complete               End of data. All responses to
                                          query are sent.

   430 Authentication needed              Client requested information
                                          that needs authentication.

   500 Syntax error

   502 Search expression too complicated  This message is sent when the
                                          server is not able to resolve
                                          a query (i.e. when a client
                                          sent a regular expression that
                                          is too deeply nested).

   530 Authentication failed              The authentication phase
                                          failed.

   600                                    Subsequent attribute values
                                          are encoded in the character
                                          set named in the text part of
                                          this message.

   601                                    Subsequent attribute values
                                          are in the language named in
                                          the text part of this message.

   601 DEF                                Subsequent attribute values
                                          are default values, i.e. they
                                          should be used for all
                                          languages not named by a 601
                                          language message since the
                                          last "601 ANY" message.

   601 ANY                                Subsequent attribute values
                                          are for all languages.

                  Table V - System response codes

------------------------------------------------------------------------

Appendix F - The Whois++ Input Grammar

   The following grammar, which uses ABNF-like notation as defined in
   [RFC2234], defines the set of acceptable input to a Whois++ server.

   N.B.: As outlined in the ABNF definition, rule names and string
   literals are in the US-ASCII character set, and are case-insensitive.

   whois-command   = ( system-command [":" "hold"]
                     / terms [":" globalcnstrnts] ) nl

   system-command  = "constraints"
                   / "describe"
                   / "commands"
                   / "polled-by"
                   / "polled-for"
                   / "version"
                   / "list"
                   / "show" [1*sp bytestring]
                   / "help" [1*sp bytestring]
                   / "?"
                       [bytestring]

   terms           = and-expr *("or" and-expr)

   and-expr        = not-expr *("and" not-expr)

   not-expr        = ["not"] (term / ( "(" terms ")" ))

   term            = generalterm / specificterm
                   / combinedterm

   generalterm     = bytestring

   specificterm    = specificname "=" bytestring

   specificname    = "handle" / "value"

   combinedterm    = attributename "=" bytestring

   globalcnstrnts  = globalcnstrnt *(";" globalcnstrnt)

   globalcnstrnt   = "format" "=" format
                   / "maxfull" "=" 1*digit
                   / "maxhits" "=" 1*digit
                   / opt-globalcnst

   opt-globalcnst  = "hold"
                   / "authenticate" "=" auth-method
                   / "language" "=" language
                   / "incharset" "=" characterset
                   / "ignore" "=" bytestring
                   / "include" "=" bytestring

   format          = "full" / "abridged" / "handle" / "summary"
                   / "server-to-ask"

   auth-method     = bytestring

   language        =

   characterset    = "us-ascii" / "iso-8859-1" / "iso-8859-2" /
                     "iso-8859-3" / "iso-8859-4" / "iso-8859-5" /
                     "iso-8859-6" / "iso-8859-7" / "iso-8859-8" /
                     "iso-8859-9" / "iso-8859-10" /
                     "UNICODE-1-1-UTF-8" / "UNICODE-2-0-UTF-8" /
                     "UTF-8"

                     ;"UTF-8" is as defined in [RFC2279]. This is
                     ;the character set label that should be used
                     ;for UTF encoded information; the labels
                     ;"UNICODE-2-0-UTF-8" and "UNICODE-1-1-UTF-8"
                     ;are retained primarily for compatibility with
                     ;older Whois++ servers (and as outlined in
                     ;[RFC2279]).

   searchvalue     = "exact" / "substring" / "regex" / "fuzzy"
                   / "lstring"

   casevalue       = "ignore" / "consider"

   bytestring      = 0*charbyte

   attributename   = 1*attrbyte

   charbyte        = "

   normalbyte      = <%d33-255, except specialbyte>

   attrbyte        = <%d33-127 except specialbyte> /
                     "

   specialbyte     = " " / tab / "=" / "," / ":" / ";" / "
                     "*" / "." / "(" / ")" / "[" / "]" / "^" /
                     "$" / "!" / "?"

   tab             = %d09
   sp              = %d32              ; space

   digit           = "0" / "1" / "2" / "3" / "4" /
                     "5" / "6" / "7" / "8" / "9"

   nl              = %d13 %d10         ; CR LF

   NOTE: Blanks that are significant to a query must be escaped. The
   following characters, when significant to the query, may be preceded
   and/or followed by a single blank:

      : ; , ( ) = !

Appendix G - The Whois++ Response Grammar

   The following grammar, which uses ABNF-like notation as defined in
   [RFC2234], defines the set of responses expected from a Whois++ server
   upon receipt of a valid Whois++ query.

   N.B.: As outlined in the ABNF definition, rule names and string
   literals are in the US-ASCII character set, and are case-insensitive.

   server            = goodmessage mnl output mnl endmessage nl
                     / badmessage nl endmessage nl

   output            = full / abridged / summary / handle

   full              = 0*(full-record / server-to-ask)

   abridged          = 0*(abridged-record / server-to-ask)

   summary           = summary-record

   handle            = 0*(handle-record / server-to-ask)

   full-record       = "# FULL " template serverhandle localhandle
                       system-nl
                       1*(fulldata system-nl)
                       "# END" system-nl

   abridged-record   = "# ABRIDGED " template serverhandle localhandle
                       system-nl
                       abridgeddata
                       "# END" system-nl

   summary-record    = "# SUMMARY " serverhandle system-nl
                       summarydata
                       "# END" system-nl

   handle-record     = "# HANDLE " template serverhandle localhandle
                       system-nl

   server-to-ask     = "# SERVER-TO-ASK " serverhandle system-nl
                       server-to-askdata
                       "# END" system-nl

   fulldata          = " " attributename ": " attributevalue

   abridgeddata      = " " 0*( attributevalue / tab )

   summarydata       = " Matches: " number system-nl
                       [" Referrals: " number system-nl]
                       " Templates: " template 0*( system-nl "-"
                       template)

   server-to-askdata = " Server-Handle: " serverhandle system-nl
                       " Host-Name: " hostname system-nl
                       " Host-Port: " number system-nl
                       ["
                       Protocol: " prot system-nl]
                       0*(" " labelstring ": " labelstring system-nl)

   attributename     = 1*attrbyte

   attrbyte          = <%d33-127 except specialbyte>

   attributevalue    = longstring

   template          = labelstring

   serverhandle      = labelstring

   localhandle       = labelstring

   hostname          = labelstring

   prot              = labelstring

   longstring        = bytestring 0*( nl ( "+" / "-" ) bytestring )

   bytestring        = 0*charbyte

   labelstring       = 0*restrictedbyte

   restrictedbyte    = <%d32-%d255 except specialbyte>

   charbyte          = <%d32-%d255 except nl>

   specialbyte       = ":" / " " / tab / nl

   tab               = %d09

   mnl               = 1*system-nl

   system-nl         = nl [ 1*(message nl) ]

   nl                = %d13 %d10

   message           = [1*( messagestart "-" bytestring nl)]
                       messagestart " " bytestring nl

   messagestart      = "% " digit digit digit

   goodmessage       = [1*( goodmessagestart "-" bytestring nl)]
                       goodmessagestart " " bytestring nl

   goodmessagestart  = "% 200"

   badmessage        = [1*( badmessagestart "-" bytestring nl)]
                       badmessagestart " " bytestring nl

   badmessagestart   = "% 5" digit digit

   endmessage        = endmessageclose / endmessagecont

   endmessageclose   = [endmessagestart " " bytestring nl]
                       byemessage

   endmessagecont    = endmessagestart " " bytestring nl

   endmessagestart   = "% 226"

   byemessage        = byemessagestart " " bytestring nl

   byemessagestart   = "% 203"

   number            = 1*( digit )

   digit             = "0" / "1" / "2" / "3" / "4" / "5" /
                       "6" / "7" / "8" / "9"

Appendix H - Description of Regular expressions

   The regular expressions described in this section are the same as
   used in many other applications and operating systems. However, the
   syntax is very simple and does not include the logical operators
   AND and OR.

   Searches using regular expressions always use substring
   matching except when the regular expression contains the characters
   '^' or '$'.

   Character       Function
   ---------       --------

                   Matches itself

   .               Matches any character

   a*              Matches zero or more 'a'

   [ab]            Matches 'a' or 'b'

   [a-c]           Matches 'a', 'b' or 'c'

   ^               Matches beginning of
                   a token

   $               Matches end of a token

   Examples
   ---------

   String          Matches         Doesn't match
   -------         -------         -------------
   hello           xhelloy         heello
   h.llo           hello           helio
   h.*o            hello           helloa
   h[a-f]llo       hello           hgllo
   ^he.*           hello           ehello
   .*lo$           hello           helloo
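
   The matching rules above can be tried out with a short Python
   sketch. It rests on two assumptions that are not stated in this
   document: that each listed construct behaves like the construct of
   the same spelling in Python's "re" module, and that "substring
   matching" corresponds to an unanchored search within a single
   token. The name token_matches is hypothetical. Constructs that are
   not listed in the table are rejected rather than passed through.

```python
import re

# Characters that introduce constructs not listed in the table above;
# the sketch rejects them so richer Python syntax is not silently accepted.
_UNSUPPORTED = set("|+?{}\\()")


def token_matches(pattern: str, token: str) -> bool:
    """Return True if `pattern` matches somewhere inside `token`.

    Assumption: an unanchored search models "substring matching";
    '^' and '$' are then the only way to pin a match to a token edge.
    """
    if any(ch in _UNSUPPORTED for ch in pattern):
        raise ValueError("construct not listed in Appendix H: %r" % pattern)
    return re.search(pattern, token) is not None


# A few rows taken from the examples table: (pattern, matching, non-matching).
for pattern, hit, miss in [
    ("hello", "xhelloy", "heello"),
    ("h.llo", "hello", "helio"),
    ("h[a-f]llo", "hello", "hgllo"),
    ("^he.*", "hello", "ehello"),
    (".*lo$", "hello", "helloo"),
]:
    print(pattern, token_matches(pattern, hit), token_matches(pattern, miss))
```

   A server could apply such a check to each token of an attribute
   value in turn; how values are split into tokens is outside the
   scope of this sketch.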