idnits 2.17.1

draft-matavka-gopher-ii-01.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  == There are 1 instance of lines with non-RFC2606-compliant FQDNs in the
     document.

  -- The draft header indicates that this document updates RFC4266, but the
     abstract doesn't seem to mention this, which it should.

  -- The draft header indicates that this document updates RFC1436, but the
     abstract doesn't seem to directly say this.  It does mention RFC1436
     though, so this could be OK.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

     (Using the creation date from RFC1436, updated by this document, for
     RFC5378 checks: 1993-03-01)

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer.
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (June 3, 2015) is 3249 days in the past.  Is this
     intentional?

  -- Found something which looks like a code comment -- if you have code
     sections in the document, please surround them with '<CODE BEGINS>' and
     '<CODE ENDS>' lines.

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Missing Reference: 'CR' is mentioned on line 1010, but not defined

  == Missing Reference: 'LF' is mentioned on line 1010, but not defined

  == Unused Reference: 'Anklesaria1993' is defined on line 1312, but no
     explicit reference was found in the text

  == Unused Reference: 'CapsRef' is defined on line 1331, but no explicit
     reference was found in the text

  == Unused Reference: 'Floodgap' is defined on line 1337, but no explicit
     reference was found in the text

  == Unused Reference: 'Goerzen2012' is defined on line 1343, but no explicit
     reference was found in the text

  == Unused Reference: 'GopherHistory' is defined on line 1350, but no
     explicit reference was found in the text

  == Unused Reference: 'RelatedDocs' is defined on line 1357, but no explicit
     reference was found in the text

  == Unused Reference: 'UpdatedGopher' is defined on line 1366, but no
     explicit reference was found in the text

  -- Possible downref: Non-RFC (?) normative reference: ref. 'Anklesaria1993'

  ** Downref: Normative reference to an Informational RFC: RFC 1436


     Summary: 1 error (**), 0 flaws (~~), 11 warnings (==), 6 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

2    Network Working Group                                         T. Matavka
3    Internet-Draft
4    Updates: 1436, 4266 (if approved)                                W. Faust
5    Intended status: Standards Track                            June 3, 2015
6    Expires: December 5, 2015

8                   Gopher-II: The Next Generation Gopher WWIS
9                           draft-matavka-gopher-ii-01

11   Abstract

13      The Gopher protocol is over twenty years old.
        Changing practices and
14      unofficial extensions have caused Gopher as currently used to differ
15      from, but remain largely compatible with, the standard established in
16      its official governing document, *The Internet Gopher Protocol (a
17      distributed document search and retrieval protocol)*, known as *RFC
18      1436*.  Therefore, this document attempts to establish a contemporary
19      specification of the Gopher communications protocol, departing as
20      little as possible from current practice.

22   Status of This Memo

24      This Internet-Draft is submitted in full conformance with the
25      provisions of BCP 78 and BCP 79.

27      Internet-Drafts are working documents of the Internet Engineering
28      Task Force (IETF).  Note that other groups may also distribute
29      working documents as Internet-Drafts.  The list of current
30      Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

32      Internet-Drafts are draft documents valid for a maximum of six months
33      and may be updated, replaced, or obsoleted by other documents at any
34      time.  It is inappropriate to use Internet-Drafts as reference
35      material or to cite them other than as "work in progress."

37      This Internet-Draft will expire on December 5, 2015.

39   Copyright Notice

41      Copyright (c) 2015 IETF Trust and the persons identified as the
42      document authors.  All rights reserved.

44      This document is subject to BCP 78 and the IETF Trust's Legal
45      Provisions Relating to IETF Documents
46      (http://trustee.ietf.org/license-info) in effect on the date of
47      publication of this document.  Please review these documents
48      carefully, as they describe your rights and restrictions with respect
49      to this document.  Code Components extracted from this document must
50      include Simplified BSD License text as described in Section 4.e of
51      the Trust Legal Provisions and are provided without warranty as
52      described in the Simplified BSD License.

54   Table of Contents

56      1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   3
57        1.1.
              Terminology . . . . . . . . . . . . . . . . . . . . . . .   3
58        1.2.  Changes from RFC 1436 and 4266  . . . . . . . . . . . . .   3
59      2.  Basic Gopher Transactions . . . . . . . . . . . . . . . . . .   3
60        2.1.  Menu Transaction  . . . . . . . . . . . . . . . . . . . .   4
61        2.2.  Index Transaction . . . . . . . . . . . . . . . . . . . .   4
62        2.3.  Simple Text Transaction . . . . . . . . . . . . . . . . .   4
63        2.4.  Binary Transaction  . . . . . . . . . . . . . . . . . . .   5
64        2.5.  Sequencing  . . . . . . . . . . . . . . . . . . . . . . .   5
65      3.  Line Terminators  . . . . . . . . . . . . . . . . . . . . . .   5
66      4.  Selector Formats  . . . . . . . . . . . . . . . . . . . . . .   6
67        4.1.  Type Codes  . . . . . . . . . . . . . . . . . . . . . . .   6
68        4.2.  GopherIIbis: Metadata in Gopherspace  . . . . . . . . . .   8
69      5.  Gopher Menus  . . . . . . . . . . . . . . . . . . . . . . . .   9
70        5.1.  Note on the terminating full stop . . . . . . . . . . . .  10
71      6.  Requesting Data . . . . . . . . . . . . . . . . . . . . . . .  10
72      7.  Data Transfer . . . . . . . . . . . . . . . . . . . . . . . .  11
73      8.  Requesting and Receiving Metadata . . . . . . . . . . . . . .  12
74        8.1.  The `INFO` Record . . . . . . . . . . . . . . . . . . . .  13
75        8.2.  The `ADMIN` Record  . . . . . . . . . . . . . . . . . . .  13
76        8.3.  The `VIEWS` Record  . . . . . . . . . . . . . . . . . . .  14
77        8.4.  The `ABSTRACT` Record . . . . . . . . . . . . . . . . . .  15
78      9.  Errors  . . . . . . . . . . . . . . . . . . . . . . . . . . .  15
79        9.1.  Error Codes . . . . . . . . . . . . . . . . . . . . . . .  16
80      10. Titles in Gopher  . . . . . . . . . . . . . . . . . . . . . .  17
81      11. Linking to Web Addresses  . . . . . . . . . . . . . . . . . .  18
82      12. Algorithm to use with selectors . . . . . . . . . . . . . . .  20
83      13. Representation of Gopher Addresses  . . . . . . . . . . . . .  21
84      14. Gopher Policy Files . . . . . . . . . . . . . . . . . . . . .  22
85        14.1.  Capability Policy  . . . . . . . . . . . . . . . . . . .
                                                                           22
86        14.2.  Robot Access Restrictions Policy . . . . . . . . . . . .  25
87        14.3.  Administrator Contact File . . . . . . . . . . . . . . .  28
88      15. IANA Considerations . . . . . . . . . . . . . . . . . . . . .  29
89      16. Security Considerations . . . . . . . . . . . . . . . . . . .  29
90      17. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . .  30
91      18. References  . . . . . . . . . . . . . . . . . . . . . . . . .  30
92        18.1.  Normative References . . . . . . . . . . . . . . . . . .  30
93        18.2.  Informative References . . . . . . . . . . . . . . . . .  30
94      Appendix A.  Summary of Changes from RFC 1436 . . . . . . . . . .  31
95      Appendix B.  Change Log . . . . . . . . . . . . . . . . . . . . .  31
96        B.1.  Changes from -00 to -01 of this specification . . . . . .  31

98      Authors' Addresses  . . . . . . . . . . . . . . . . . . . . . . .  31

100  1.  Introduction

102     The over-riding aim of this document is to author a contemporary
103     specification of the Gopher world-wide information system, without
104     falling short of reflecting actual practice and without breaking
105     compliance with RFC 1436 [RFC1436].  This document shall attempt to
106     describe, and, where necessary, update current practice as regards
107     the means of handling errors, line and file terminators, policy
108     files, TITLE selectors, the URL: re-direction scheme, and new
109     selector types not compliant with RFC 1436.  This document is not to
110     be construed as a replacement for RFC 1436; it merely complements it.

112     Gopher is a lightweight, client/server-oriented query/answer
113     protocol, functioning as a world-wide information system (WWIS) and
114     facilitating access to remote servers of any description.  The
115     protocol and software permit users of a wide variety of desktop
116     systems to browse, search, and retrieve documents residing on
117     multiple distributed server machines.
        Gopher is unique among
118     world-wide information systems in that it encourages data to be sent
119     in textual form and that it imposes a strict hierarchy on content,
120     making it a protocol that is fast to transmit, receive, and search.
121     This, in turn, makes it useful in high-latency, low-bandwidth
122     communications, such as mobile links.  In fact, Gopher provides the
123     ideal method for transmitting information from and to mobile devices.

125  1.1.  Terminology

127     NOTE: The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
128     NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in
129     this document are to be interpreted as described in RFC 2119
130     [RFC2119].  Furthermore, backticks (`) around a string mean that it
131     is to be interpreted literally.

133  1.2.  Changes from RFC 1436 and 4266

135     [[CREF1: this is required if the document is to go on standards
136     track.  It can be a really short summary, stressing upward
137     compatibility, here and then a pointer to the first proposed appendix
138     if you like or you can consolidate everything here.]]

140  2.  Basic Gopher Transactions

142     There are four broad forms of basic transactions in Gopher:

144     o  Menu Transaction;
145     o  Index Transaction;

147     o  Simple Text Transaction; and

149     o  Binary Transaction.

151     The precise composition of these transactions is elucidated below.

153  2.1.  Menu Transaction

155     o  Client : [Open Connexion]

157     o  Client : Send [selector]

159     o  Server : Send

161     o  Server : Send .

163     o  Server : [Close Connexion]

165  2.2.  Index Transaction

167     o  Client : [Open Connexion]

169     o  Client : Send [selectorquery parameters]

171     o  Server : Send []

173     o  Server : Send .

175     o  Server : [Close Connexion]

177  2.3.  Simple Text Transaction

179     o  Client : [Open Connexion]

181     o  Client : Send [selector]

183     o  Server : Send []

185     o  Server : Send .

187     o  Server : [Close Connexion]

189  2.4.
           Binary Transaction

191     o  Client : [Open Connexion]

193     o  Client : Send [selector]

195     o  Server : Send []

197     o  Server : DO NOT send .

199     o  Server : [Close Connexion]

201  2.5.  Sequencing

203     [[CREF2: the RFC format no longer allows text associated with a
204     primary section after one or more subsections.  This subsection is
205     therefore a placeholder until and unless you do some rewriting.  If
206     you want to just leave it, a better title ("Explanation", "Details",
207     "Comments", ?? ... ) may be appropriate. ]]

209     The fourth step of each transaction, with the exception of the binary
210     type, is OPTIONAL.  Servers MAY send a full-stop character after
211     sending a menu, index, or text; if they do, clients MUST accept it.
212     Further information may be found in the appropriate sub-section.

214     Gopher servers are normally found on TCP port 70.  Clients MUST
215     assume this port if no other port is specified.  When a client opens
216     a connection to a server, the server MUST accept the connection but
217     say nothing, waiting for a CR/LF-terminated selector string from the
218     client.  The client MAY then send the selector string followed by
219     CR/LF (or nothing to retrieve the root menu from the server, which
220     MUST always be type 1).  The server MUST then send the requested
221     content and close the connection.

223  3.  Line Terminators

225     ASCII, the international standard that governs the interchange of
226     plain-text information between computer systems, is nothing more or
227     less than a table mapping each character (letter, number, space, or
228     symbol) to a numerical code, which is then converted to binary and
229     written to disc.  Its necessity was seen long before the advent of
230     the electronic monitor, so some of its quirks must be
231     understood in view of the time period of which it was a product.
232     Historically, input and output were through a specially-adapted
233     typewriter, and the ASCII convention reflects this in the codes it
234     uses to terminate lines of text.

236     In ASCII, there are two codes, both having physical equivalents in
237     the real world, that signal the end of the line: the Carriage Return
238     (abbreviated C/R, CR, or c/r) and the Line Feed (abbreviated L/F, LF,
239     or l/f).  Originally, the term *carriage return* was used for a
240     command that caused the assembly holding the paper (the carriage) to
241     return to the right so the machine was ready to type again on the
242     left side of the paper (assuming a left-to-right language).  On the
243     other hand, the *line feed* moved the paper upwards, allowing the
244     carriage to type on the following line.

246     Different operating systems traditionally signal the end of a line in
247     different ways.  UNIX and its descendants (including Mac OS X), the
248     operating systems most likely to run on a server, use the line feed
249     alone.  CP/M, DOS, and Microsoft Windows use the sequence of carriage
250     return and line feed (CR/LF).  Obsolete versions of Mac OS (up to,
251     and including, Mac OS 9) use the carriage return alone.

253     All programmes using Gopher MUST always use the Microsoft standard of
254     CR/LF, irrespective of the operating system they run on.  Both
255     internal Gopher commands and policy files MUST comply with this
256     standard.  Other text files SHOULD use standard Gopher format, but
257     this is not strictly required as a matter of technical form; the
258     client MUST be capable of converting to and from all variants of line
259     terminators.  The recommendation stands for the benefit of
260     non-compliant clients only.

262  4.  Selector Formats

264  4.1.
           Type Codes
265     The following selectors are defined by RFC 1436:

267     Type   Treat As   Meaning
268     0      TEXT       Plain text file
269     1      MENU       Menu
270     2      EXTERNAL   CCSO flat database (formerly used as telephone
271                       directories); other databases
272     3      ERROR      Error message
273     4      TEXT       Macintosh BinHex file
274     5      BINARY     Binary archive (zip; rar; 7-Zip; gzip; tar)
275     6      TEXT       UUEncoded archive
276     7      INDEX      Query a search engine or CGI script
277     8      EXTERNAL   Telnet to: VT100 series server
278     9      BINARY     Binary file (see also 5)
279     +      -          Redundant server
280     T      EXTERNAL   Telnet to: tn3270 series server
281     g      BINARY     GIF format graphics file (TODO: Why not use I?)
282     I      BINARY     Any image file.

284     The `+` selector indicates a mirror of the previous item in the menu,
285     and MUST behave as though it had the same type as that entry.  For
286     example:

288     5Download software    /software.zip              gopher.example.com   70
289     +example.net mirror   /example.com/software.zip  mirror.example.net   70
290     +Another mirror       /software.zip              mirror2.example.com  70

292     Additionally, the following selectors have been in common use and are
293     made official here.  If a client does not have the capability to
294     display a particular item type, it SHOULD treat it as a more generic
295     item type, passing it off to the operating system (itemtype p
296     "implies" itemtype 0, etc.).

298     Type   Treat As   Meaning
299     c      BINARY     Calendar file (Kim Holviala)
300     d      BINARY     Word-processing document (MS
301                       Word; OpenOffice.org;
302                       WordPerfect); PDF document
303     h      TEXT       HTML document
304     i      -          Informational text (not
305                       selectable)

307     p      TEXT       Page layout or markup document
308                       (TeX; LaTeX; PostScript; Rich
309                       Text Format)---these documents are
310                       all plain text, but contain ASCII
311                       "tags" that make the document
312                       prettier when sent through a
313                       special program.
314     m      BINARY     Electronic mail repository (also
315                       known as MBOX) (Kim Holviala)
316     s      BINARY     Audio recordings (files that
317                       consist of audible, but no
318                       visible, data) (Wesley Teal)
319     x      TEXT       eXtensible Markup Language
320                       document (Wesley Teal)
321     ;      BINARY     Video files (files that consist
322                       of both audible and visible
323                       data) (Wesley Teal)

325     Filetypes `4`, `6`, `h`, `p`, and `x` SHOULD be sent as text
326     (itemtype 0).  This way, the text appears directly on the user's
327     terminal without being downloaded (unless the appropriate command is
328     given to the client, i.e. `CTRL/S`).  It is vital to note that text
329     information can be sent via binary (with the minor inconvenience
330     noted above), as binary files contain a greater range of information
331     than ASCII.  However, binary files, if sent via text, will be
332     irreparably ruined, as this effectively passes raw eight-bit data
333     through an ASCII filter.  In the case of confusion, the
334     owner/operator of the server should simply mark the file as binary to
335     ensure that it transfers safely.

337  4.2.  GopherIIbis: Metadata in Gopherspace

339     It is sometimes useful to transmit data about GopherII selectors.
340     This is known as "metadata": the *meta* construction is derived from
341     the Greek for "beyond", and refers to concepts which are abstractions
342     from other concepts intended to complete or add to the latter.  For
343     instance, in psychology, metamemory refers to an individual's ability
344     to remember that he has remembered something.  In plain English,
345     metadata refers to data about data.

347     GopherIIbis is an OPTIONAL, but recommended, addition to the basic
348     GopherII specification.  That said, it is optional only in the sense
349     that a GopherII client MAY EITHER display the relevant information in
350     accordance with the specification, or ELSE ignore it entirely.  To be
351     conformant with GopherII, Gopher clients MUST be capable of handling
352     GopherIIbis metadata.
        A GopherII client that displays GopherIIbis
353     metadata may be referred to as being compliant with GopherIIbis.

355     The name of the GopherIIbis extension is pronounced "gopher-two-biss"
356     or "gopher-two-beess".  In typeset text, the French word "bis" should
357     be *italicised* so as to set it off visually.  The name of the
358     GopherIIbis extension reflects that it is merely an addition, or an
359     iteration, of the GopherII protocol.

361  5.  Gopher Menus

363     Menu (type 1) content has the following format:

365     T^I^I^I

367     Where:

369     o  `^I` is the ASCII character corresponding to the `Tab` key

371     o  `T` is the type code, which MUST be run together with the item
372        text

374     o  is the selector string to send to the specified server

376     o  is the server to send the selector to

378     o  is the port on the server to connect to

380     If the server understands how to send and receive GopherIIbis
381     metadata, it MUST indicate this fact by adding a fourth tab character
382     (^I) and a plus sign after the port number.  For example:

384     T^I^I^I^I+

386     If the client does not understand GopherIIbis metadata, it MUST
387     ignore the trailing ^I+.

389     Note on `i` item type: For the `i` item type, Selector, Server, and
390     Port are mostly ignored, but MUST be there anyway.  In that case, the
391     host SHOULD be set to placeholder value `example.com`, and the port
392     SHOULD be set to placeholder value `0` (zero).  One exception to
393     their being ignored is TITLE entries.  These have TITLE as the
394     selector value; host and port SHOULD again be set to aforementioned
395     placeholder values.

397  5.1.  Note on the terminating full stop

399     Per RFC 1436, a terminating full stop (.) character followed by CR/LF
400     should be sent on a line by itself after the end of the content, with
401     exceptions for binary data.  This terminating full stop has caused no
402     end of trouble ever since.  Many, if not most, modern Gopher servers
403     omit this terminating full stop.
        Therefore, the practice suggested
404     in RFC 1436 is DEPRECATED and the following practice is RECOMMENDED.

406     o  Servers MAY send the full stop; clients MUST accept it

408     o  Servers SHOULD send the full stop after menus and may OPTIONALLY
409        send it after other files

411     o  Clients SHOULD display the full stop at the end of menus, if sent,
412        to notify the user that this is the end of the menu

414     o  Clients SHOULD NOT include the full stop in other output, in case
415        that output has some significance which the full stop may disrupt.

417     o  Clients SHOULD NOT consider a full stop significant, unless it
418        occurs immediately before the connection is terminated.

420  6.  Requesting Data

422     A standard GopherII client requests data from the server by
423     transmitting the selector string, a carriage return, and a line feed.
424     For instance, to retrieve the file `services.txt`, the client sends

426     services.txt[CR][LF]

428     GopherIIbis handles things in a slightly more complicated way.  In
429     addition to a selector string, a GopherIIbis-compliant request
430     contains a *format* string, a data flag indicating the presence or
431     absence of a data block, and an OPTIONAL data block.

433     The format string is included because
434     GopherIIbis allows one selector to point to multiple versions of the
435     same file, in multiple languages.  For instance, the same file in
436     Portable Document Format, PostScript format, Rich Text Format, and
437     plain text may be available, and each of these may be available in
438     British English, American English, Canadian French, and Continental
439     French.  The format string, therefore, is the desired MIME type of
440     whichever format is being requested, followed by the ISO language and
441     country codes in the following format:

443     selector^Imime/type la_CO^I1[CR][LF]datadatadatadata

445     The number 1 above is the file flag.  It can be either 1 or 0.
        If it
446     is 1, it means that the client is not only requesting data, but also
447     *sending* it.  This is useful for example when querying a relational
448     database on a Gopher server (this usage is now rare).  An example
449     of a request that sends no data would be:

451     services.txt^Itext/plain fr_CA^I0[CR][LF]

453  7.  Data Transfer

455     When a file is requested by a Gopher client, a Gopher server
456     incompatible with GopherIIbis simply sends the requested data as soon
457     as it gets the request from the client.  GopherIIbis servers, on the
458     other hand, have three options when given a GopherIIbis-compliant
459     request (i.e. one that ends in ^I+).

461     If the size of the file in bytes is known, the server SHOULD transmit
462     a plus sign, the size, and the combination of carriage return and
463     line feed, then the file.  For example, if the size of file
464     `report.tex` is known to be 64096 bytes, the server SHOULD transmit:

466     +64096[CR][LF]\documentclass{article}[CR][LF]\begin{document}...

468     If the size of the file is not known, there are two ways to proceed.
469     One of them is to send the character string `+-1` prior to beginning
470     transmission of the data proper, and end the transmission with a full
471     stop (.) on a line by itself, followed by carriage return and line
472     feed.  For example:

474     +-1[CR][LF]data data[CR][LF]data data data data[CR][LF].[CR][LF]

476     It is RECOMMENDED that most textual data of unknown length be
477     transmitted this way.  The exception is when there is a possibility
478     of the full stop appearing on a line by itself; this, of course,
479     would terminate the connexion.  There is no choice when sending
480     non-textual (binary) data: it MUST NOT be terminated with a full stop.

482     In either of these two cases, the string to send is `+-2`.  This
483     instructs the client that the data will be terminated when the
484     connexion is closed, and furthermore, that the length of the data is
485     unknown.  For example:

487     +-2[CR][LF]binarydata

489  8.
         Requesting and Receiving Metadata

491     A GopherIIbis client may request the metadata for a specific selector
492     by sending a string in the following form:

494     ^I![CR][LF]

496     The trailing tab and exclamation mark are what distinguish a request
497     for metadata from a request for data.  The metadata returned is of
498     the following form:

500     +INFO: 0lpryce.txt^IRest in peace, Lane Pryce^Igopher.scdp.com^I70+

502     +ADMIN:
503      Admin: Roger Sterling
504      Mod-Date: Fri Feb 13 08:22:11 2015 <20150213082211>

506     +VIEWS:
507      text/plain: <10k>
508      application/postscript: <100k>
509      application/latex: <50k>
510      application/pdf: <120k>

512     +ABSTRACT:

514      Yesterday, our beloved partner Lane Gordon Pryce died of suicide in
515      his Manhattan home.  He was 55.

517     In general, data intended to be read by the computer will be enclosed
518     in angle brackets (`<` and `>`).  A graphical client may, for
519     example, provide a GUI menu of all possible document views with
520     graphical icons of the file type and tool-tips of the file size.

522     These are far from the only available metadata records; only the
523     `INFO` record is mandatory, and it MUST be transmitted first of all.
524     The `ADMIN` record is RECOMMENDED, and if it is included, it must be
525     transmitted *directly* after the `INFO` record.

527     It is also possible to retrieve only a *specific* record or range of
528     records.  For example, to retrieve only the `INFO` and `ADMIN`
529     records, a client may send:

531     ^I!+INFO+ADMIN[CR][LF]

533     Finally, it is possible to retrieve metadata for an *entire
534     directory*.  Of course, this is relatively bandwidth-intensive (for a
535     56k link) but a modern Ethernet connexion should have no problem with
536     it.  The reason for the requirement of an `INFO` record for every
537     selector should now be abundantly clear: the `INFO` record serves to
538     separate metadata for one file from metadata for another.  For
For 539 example: 541 ^I&[CR][LF] 543 The only difference between a request for a *single* file's metadata 544 and a request for that of a whole directory is that a single-file 545 request uses an exclamation mark, whereas a whole-directory request 546 uses an ampersand ("and sign", &). 548 It is even possible to request a specific record from every selector 549 in the directory, by appending the requested fields to the command 550 string as above. 552 8.1. The `INFO` Record 554 The `INFO` record is MANDATORY in every metadata listing. It 555 contains the same data as the Gopher selector, with a plus sign at 556 the end, per GopherIIbis style. It MUST always be present, and it 557 MUST always be the first metadata record present. The `INFO` record 558 serves to separate metadata listings when more are sent at the same 559 time. 561 8.2. The `ADMIN` Record 563 To promote accountability, the `ADMIN` record is also MANDATORY in 564 every metadata listing. It MUST contain fields for `Admin` (the name 565 and contact information for the administrator of the file) and `Mod- 566 Date` (the date of last modification) as seen in the example below: 568 +ADMIN: 569 Admin: Roger Sterling 570 Mod-Date: 01 January 2015 572 The time of last modification MUST be in 24-hour format. 574 If the metadata listing is for the results of a database search, such 575 as Veronica, it SHOULD also include fields for `Score` (a whole- 576 number ranking of the relevance of the result to the search query) 577 and for `Score-Range` (the lowest and highest possible relevance 578 scores), as per the following example: 580 +ADMIN: 581 Admin: Margaret Olson 582 Mod-Date: 13 February 2015 583 Score: 100 584 Score-Range: 0 150 585 The first number in the `Score-Range` field is the *lower bound*, and 586 the second number is the *upper bound*. 588 Several other fields are optional. 
        `Site` is the name of the
589     Gopherhole, `Org` is the name of the business or individual that owns
590     the Gopherhole, `Loc` is the owner's location (city, district, and
591     country), `Geog` is the owner's geographic co-ordinates, and `TZ` is
592     the time zone as an offset from GMT, such as GMT+01 or GMT-05.  For
        example:

594     +ADMIN:
595      ...
596      Site: S|C|D|P Main Site
597      Org: Sterling|Cooper|Draper|Pryce Inc.
598      Loc: New York, NY, USA
599      Geog: 40N 74W
600      TZ: GMT-05

602     The `Author` may also be given, as may be the `Creation-Date` and
603     `Expiration-Date`, in the same format as the `Mod-Date`.

605  8.3.  The `VIEWS` Record

607     Although the main selector might be for only one format of a file
608     (such as Rich Text Format), the same file may be available in many
609     other formats, such as Plain Text for older systems, LaTeX for
610     typesetters, PDF for displaying on screen, PostScript for printing on
611     a graphical printer, and many more.

613     The `VIEWS` record in GopherIIbis allows for serving multiple
614     variants of the same file, using what are known as MIME file
615     descriptors, Content-Types, or Internet media types.  The `VIEWS`
616     field also allows for viewing the same file in multiple languages and
617     even in multiple dialects of the same language---in this case, the
618     relevant abbreviations are known as ISO-639 language codes and
619     ISO-3166 country codes.  These are generally at least somewhat
620     intuitive (`CA` for Canada, `GB` for Great Britain, `en` for
621     English), but a full list may be found on the ISO Web site.
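
As an illustration of the view descriptor form used by GopherIIbis (a MIME type, optionally followed by a space and a `la_CO` language and country pair), the following Python sketch splits a descriptor into its parts. It is not part of the specification: the names `View` and `parse_view` are hypothetical, and treating the language and country pair as optional is an assumption made because some `VIEWS` entries give only a MIME type.

```python
from typing import NamedTuple, Optional


class View(NamedTuple):
    """One view descriptor (hypothetical helper type, not from the draft)."""
    mime_type: str
    language: Optional[str]   # ISO-639 language code, e.g. "en"
    country: Optional[str]    # ISO-3166 country code, e.g. "US"


def parse_view(descriptor: str) -> View:
    """Split 'mime/type la_CO' into MIME type, language, and country.

    Assumption: the language/country part may be absent, in which case
    both fields are returned as None.
    """
    parts = descriptor.split()
    if not parts or "/" not in parts[0]:
        raise ValueError(f"not a view descriptor: {descriptor!r}")
    mime_type = parts[0]
    language: Optional[str] = None
    country: Optional[str] = None
    if len(parts) > 1:
        # "en_US" -> ("en", "_", "US"); "en" alone -> ("en", "", "")
        lang, _, ctry = parts[1].partition("_")
        language = lang or None
        country = ctry or None
    return View(mime_type, language, country)
```

A client might call `parse_view("text/plain fr_CA")` on each entry of a `VIEWS` record and then pick the first entry whose language matches the user's preference.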

623     This is an example of a `VIEWS` record allowing for the selection of
624     a plain text, Rich Text, and PDF of the same file in American
625     English, Peninsular Portuguese, and Brazilian Portuguese:

627     +VIEWS:
628      text/plain en_US: <32K>
629      text/plain pt_PT: <34K>
630      text/plain pt_BR: <34K>
631      text/rtf en_US: <55K>
632      text/rtf pt_PT: <60K>
633      text/rtf pt_BR: <66K>
634      application/pdf en_US: <120K>
635      application/pdf pt_PT: <132K>
636      application/pdf pt_BR: <133K>

638     The `VIEWS` record SHOULD be ranked according to the administrator's
639     idea of which view is preferred.  On an American site catering to
640     English speakers, the `en_US` files should be listed first of all.
641     Likewise, on a site of any language catering to scientists, LaTeX
642     source should always come first of all.

644  8.4.  The `ABSTRACT` Record

646     It is RECOMMENDED that every selector on a GopherIIbis-compliant
647     server have an `ABSTRACT` record.  The `ABSTRACT` record contains a
648     *brief* description of the item (no more than a paragraph long) to
649     assist the reader in determining its purpose.  Similarly, it is also
650     RECOMMENDED that the root directory of every Gopher server (that is,
651     what one gets when one requests metadata for the server itself with
652     no selector) contain an `ABSTRACT` record with the name, postal
653     address, eMail address, and telephone number of the person
654     responsible for the site.  For example:

656     +ABSTRACT:
657      The life and times of Professor Albert Einstein, Swiss patent clerk
658      and discoverer of four great scientific theories in one miraculous
659      year.

661  9.  Errors

663     Although undesirable in communication, errors do occur in Gopher, and
664     their handling is crucial for a user-friendly, and
665     standards-compliant, Gopher experience.

667     When an error is encountered, the server MUST return a menu whose
668     first item bears itemtype `3`.
   All other ways of signalling an error, such as redirecting to a
   Gopher error menu, an image, or (worst of all) an HTML page, are
   PROHIBITED.

   The selector string for itemtype `3` is the text of the error.  It
   is the responsibility of the server application to have
   understandable and accurate strings for error handling.  As they
   are well-understood and common, HTTP-style error codes are
   acceptable and RECOMMENDED; however, they SHOULD also be followed
   by a clear, legible description of the error in both English and
   the local language.

   Errors are handled in GopherIIbis in a slightly different fashion.
   When an error occurs in response to a GopherIIbis-compliant query,
   the server sends two minus signs, followed by an error code, a
   description of the error, and a full stop.  The error code SHOULD
   be in the HTTP style as elucidated in the next subsection, but the
   numbers 1, 2, and 3 MUST also be understood and handled correctly,
   also as defined in "Error Codes".  An example of a GopherIIbis
   error follows:

      --404[CR][LF]The file requested could not be found.[CR][LF].[CR][LF]

   The decision of whether to send a GopherII error string or a
   GopherIIbis error string is governed by the type of query received.
   If the query was compliant with GopherIIbis, a GopherIIbis error
   MUST be sent.  In all other cases, a GopherII error MUST be sent.

9.1.  Error Codes

   This is a listing of HTTP-style error codes used in Gopher; due to
   Gopher's simplicity, it lacks most of the errors possible in HTTP.
   Codes beginning with 4 can generally be traced to the client; codes
   beginning with 5 are usually due to the server.

   [[CREF3: Gopher predates HTTP so these can't possibly be
   "HTTP-style" codes.
   What they are is "SMTP-style" or "FTP-style" and I would suggest
   you pick one of those, especially if you don't want this referred
   to the HTTPbis WG and killed.]]

   400 Bad Request  The request could not be understood by the server
      due to malformed syntax.

   401 Unauthorised  The request requires authentication.  For
      example, the received query value (as password) does not match
      the expected value.

   403 Forbidden  The request was received, but not fulfilled.

   404 Not Found  The server could not find anything matching the
      requested URL.  If the condition is known to be permanent, use
      error code 410 (Gone).

   408 Request Time-out  The client did not produce a request within
      the time that the server was prepared to wait.

   410 Gone  The requested resource is no longer available at the
      server and no forwarding address is known.  This condition is
      expected to be considered permanent.  If this is unknown, use
      error code 404 (Not Found).

   500 Internal Server Error  The server encountered an unexpected
      condition which prevented it from fulfilling the request.

   501 Not Implemented  The server does not support the functionality
      required to fulfil the request.

   503 Service Unavailable  The server is currently unable to handle
      the request due to temporary overload or maintenance.

   An earlier version of the GopherIIbis extension, known as Gopher+,
   used error codes `1`, `2`, and `3`.  Error code `1` signifies an
   unavailable item (similar to the 400-series errors), error code `2`
   signifies an unavailable server (similar to the 500-series errors),
   and error code `3` signifies an item that has moved.  Provision was
   made to create new error codes.  This is now DEPRECATED; the *ad
   hoc* creation of new errors does not accord with the ethos of a
   standardised Internet protocol.

10.  Titles in Gopher

   RFC 1436 makes no mention of menus with titles.
   When one simply browses about Gopherspace, this does not matter;
   for bookmarking and Gopher crawlers, such as Veronica-2, however,
   this presents a large problem.

   A Gopher TITLE resource has the following format:

      i^ITITLE^Iexample.com^I0

   It is identical to a standard informational resource (itemtype
   `i`); the selector string, however, is set to the specific value,
   `TITLE`.

   The composition of the above format is as follows:

   o  `^I` is the ASCII character corresponding to a press of the
      `Tab` key

   o  The type code MUST be `i` (information)

   o  The selector string MUST be `TITLE`

   o  There is no server to connect to; the dummy text used in place
      of the server SHOULD be `example.com`

   o  There is no port to connect to; the placeholder number SHOULD
      therefore be `0` (zero).

   A Gopher client that conforms to the above `TITLE` specification
   SHALL render it in one of two ways, depending on the placement of
   the resource.  If the `TITLE` is the *first* resource in the
   document, it SHALL be considered its principal `TITLE` and used
   *wherever a principal title is needed* (window headings, bookmarks,
   etc.); furthermore, it SHOULD be rendered in a different size,
   font, and/or colour to the remainder of the document.  In *all
   other* cases, it SHALL be considered a subordinate `TITLE` and
   SHOULD be rendered in a different size, font, and/or colour to the
   remainder of the document, but smaller and/or with less emphasis
   than the main title.

   If a non-compliant Gopher client receives a `TITLE` resource as per
   above, it will render it as plain informational text.  As the main
   `TITLE` must be on the first line of a menu, it will appear
   visually similar to a title in any case, although not rendered as
   such.

11.  Linking to Web Addresses

   It is now possible, and standard, to link from Gopher to documents
   (preferably in HTML) on the World Wide Web, Gopher's younger, more
   widespread cousin.  This is done using a two-part system: a `URL:`
   selector on the Gopher (local) end, and a *redirect page*
   (following the rules set out below) on the HTTP (remote) end.
   There are no compliance requirements for Gopher servers, with one
   exception: servers MUST follow the bulleted list located
   immediately after the example redirect page.

   A Gopher client SHALL, when it sees a selector with a path starting
   with `URL:`, interpret the path as a URL.  It SHALL ignore the host
   and port components of the Gopher selector, using those components
   from the URL instead, if applicable.

   `URL:` selectors SHOULD NOT be used if it is possible to link to
   the required content and protocol by any other means.  In
   particular, the following protocols SHALL NOT be used with the
   `URL:` selector:

   o  gopher

   o  telnet (VT100-compatible)

   o  tn3270

   Authors SHOULD NOT link to any document not of HTML type unless
   absolutely necessary; linking to non-HTML documents will break
   compatibility with non-compliant Gopher browsers.

   A Gopher `URL:` selector MUST take the following format:

      h<text>^IURL:<url>^I<host>^I<port>

   `URL:` selectors are, for the most part, identical to standard HTML
   selectors, but composed of particular data:

   o  The item type corresponds to the type of document on the remote
      end.  Most typically, this is a Web page authored in HTML;
      therefore, the item type is most commonly `h`.

   o  <text> is the text of the link; this can be almost anything.

   o  <url> is the full URL, preceded by the string `URL:`.  For
      example, this could be `URL:http://www.example.com`

   o  <host> is the server that the link *originated* from; this MUST
      be ignored by a compliant client, but MUST also be sent by a
      compliant server

   o  <port> is the port that the link *originated* from; this MUST be
      ignored by a compliant client, but MUST also be sent by a
      compliant server

   It is possible for a non-compliant Gopher client to follow a link
   to an HTML page, as long as the server is compliant, by the
   following means: when the client receives a command to follow a
   `URL:` selector, it will contact the server that provided the menu,
   as the originating host and port are *mandatory* per this
   specification.

   When a Gopher server receives a request from a client beginning
   with the string `URL:`, it SHALL write out an HTML document that
   redirects the browser to the appropriate place.  A conforming
   example of such a document is as follows:

      You are following an external link to a Web site.  You will be
      automatically taken to the site shortly.  If you do not get sent
      there, please click here to go
      to the web site.

      The URL linked is:

      http://www.example.com/

      Thanks for using Gopher!

   The server authors may use any document they wish, but it MUST
   adhere to the following requirements:

   o  It SHALL provide a refresh of a duration of 10 seconds or less

   o  It SHALL NOT use `IMG` tags, frames, or have any reference
      whatsoever to content outside that particular file, with the
      sole exception of the link to the real destination.

   o  It SHALL NOT use JavaScript.

   o  It SHALL adhere to the W3C HTML 4.0 standard.

   When a non-compliant Gopher client finds a reference to an HTML
   file (type `h`), it will open up the file via Gopher, receiving the
   redirect document using a Web browser.  The Web browser will then
   be redirected to the actual link destination.

   Compliant Gopher clients will simply render the target directly.

12.  Algorithm to use with selectors

   The following describes a hypothetical algorithm for handling item
   types, grouped by level of interaction.

   PROTOCOL
   --------
   Type             Description    What to do
   0                Brief text     Render directly line by line.
   1                Menu           Request and analyse menu.  If it
                                   contains '3' error node, print
                                   error.
                                   Else, render menu in new window.
   7                Index/Search
                    Server

   DATA NODES
   ----------
   Type             Description    What to do
   4, 9, g, I, c,   Binary file    Request and analyse file.  If it
   d, m, s, ;                      contains '3' error node, print
                                   error.  Else, does plug-in exist?
                                   If yes, display.  If no, save to
                                   disc.
   6, p, x          Text file      Request and analyse file.  If it
                                   contains '3' error node, print
                                   error.  Else, print on screen.
   h, 2, 8, T       Link           Treat as URL.
   5                Archive File   Request and analyse file.  If it
                                   contains '3' error node, print
                                   error.  Else, does plug-in exist?
                                   If yes, display.  If no, save to
                                   disc.

   For instance, if the client is incapable of handling images as it
   is text-only, the algorithm above would have it save the file to
   disc.

13.  Representation of Gopher Addresses

   This section is greatly indebted to RFC 4266 [RFC4266].

   A Gopher address, or uniform resource locator, takes the form:

      gopher://<host>:<port>/<gopher-path>

   where <gopher-path> is one of:

   o  <gophertype><selector>

   o  <gophertype><selector>%09<search>

   o  <gophertype><selector>%09<search>%09<gopher+_string>

   If :<port> is omitted, the port defaults to 70.  <gophertype> is a
   single-character field to denote the Gopher type of the resource to
   which the URL refers.  The entire <gopher-path> may also be empty,
   in which case the delimiting `/` is also optional and the
   <gophertype> defaults to `1`.

   <selector> is the Gopher selector string.  Selector strings are
   arbitrary sequences of characters; they MUST NOT, however, contain
   the characters corresponding to horizontal tab, line feed, or
   carriage return.  Gopher clients specify which item to retrieve by
   sending the Gopher selector string to a Gopher server.  It is
   important to know that within the <selector> itself, there are no
   reserved characters, so one may be arbitrarily creative when
   creating selector names.

   Note that some Gopher <selector> strings begin with a copy of the
   <gophertype> character, in which case that character will occur
   twice consecutively.  The Gopher selector string may be an empty
   string; this is how Gopher clients refer to the top-level directory
   on a Gopher server.

   If the URL refers to a search to be submitted to a Gopher search
   engine, the selector is followed by an encoded tab `%09` and the
   search string.  To submit a search to a Gopher search engine, the
   Gopher client sends the <selector> string (after decoding), a tab,
   and the search string to the Gopher server.

14.  Gopher Policy Files

   It is often useful to provide information to Gopher clients that
   MAY, but need not, be read by a human being.  It is for this reason
   that policy files exist.  This document enumerates two types of
   policy files, formally known as the Capability Policy and the Robot
   Access Restriction Policy, but also informally known under their
   filenames: `caps.txt` and `robots.txt`, respectively.
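
   Retrieving either policy file starts from a Gopher address of the
   form given in Section 13.  The following Python sketch shows one
   way a client might split such an address and build the request it
   sends; it is illustrative only and not part of the specification.
   The field names follow RFC 4266, `GopherAddress`,
   `parse_gopher_url`, and `request_line` are hypothetical names, the
   host `gopher.example.com` is a placeholder, and IPv6 literals and
   the third (Gopher+) field are deliberately not handled.

```python
from typing import NamedTuple, Optional
from urllib.parse import unquote


class GopherAddress(NamedTuple):
    host: str
    port: int
    gophertype: str        # single-character item type
    selector: str          # decoded selector string
    search: Optional[str]  # decoded search string, if any


def parse_gopher_url(url: str) -> GopherAddress:
    """Split a gopher:// address into the parts described in Section 13."""
    scheme = "gopher://"
    if not url.lower().startswith(scheme):
        raise ValueError("not a gopher URL: %r" % url)
    authority, _, path = url[len(scheme):].partition("/")
    host, _, port = authority.partition(":")
    port_number = int(port) if port else 70  # port defaults to 70
    if not path:
        # Empty path: the type defaults to "1" and the selector is empty.
        return GopherAddress(host, port_number, "1", "", None)
    gophertype, rest = path[0], path[1:]
    fields = rest.split("%09")  # encoded tab separates selector and search
    selector = unquote(fields[0])
    search = unquote(fields[1]) if len(fields) > 1 else None
    if any(ch in selector for ch in "\t\r\n"):
        raise ValueError("selector contains tab, CR, or LF")
    return GopherAddress(host, port_number, gophertype, selector, search)


def request_line(address: GopherAddress) -> bytes:
    """Build what the client sends: selector, optional tab + search, CR LF."""
    text = address.selector
    if address.search is not None:
        text += "\t" + address.search
    return (text + "\r\n").encode("utf-8")
```

   For example, `parse_gopher_url("gopher://gopher.example.com/0/caps.txt")`
   yields port 70, type `0`, and the selector `/caps.txt`, which is
   then sent to the server followed by CR LF.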
14.1.  Capability Policy

   It is RECOMMENDED, when hosting a public-access Gopher server, to
   include a capability policy.  Although it is, ultimately, the
   choice of the owner or operator of the server, a capability policy
   (or caps file) can be useful for clients querying the server for
   certain information without using extensions such as Gopher+.

   The purpose of a capability policy is so that a server can instruct
   a client on how properly to parse selectors in its filesystem; it
   ensures that the client can understand how files on the server are
   organised.  The scheme used in the current implementation of caps
   can handle POSIX (UNIX and related operating systems), FAT/NTFS
   (used by Microsoft Windows), and HFS (used by all versions of Apple
   Mac OS, including OS X, which is otherwise POSIX-compatible).  For
   technical reasons, capability policies cannot handle VMS or
   Files-11 paths; however, owing to their open interface, the
   specification can be arbitrarily extended.

   A capability policy is quite simple in its composition: it is a
   plain text file with no more than seventy characters per line in
   the root directory of a Gopher server with the name

      caps.txt

   and beginning with the six characters

      CAPS[CR][LF]

   Because of the constrained name and location of the policy, it is a
   trivial matter to verify if one exists or not; the address is
   always of the form , with the real name of the server substituting
   for `example`.  The server should accept both `caps.txt` and
   `/caps.txt` as selectors, and return the same content for both.

   A caps file contains *keys*, *values*, and *comments*.

   Keys can be compared to labelled containers for data; for instance,
   the key `ServerSoftware` is a container for the name of the Gopher
   software running on the server.
   Keys in capability policies are always alphanumeric (i.e., composed
   of letters and numbers only) and generally are in CamelCase (each
   individual word within the key capitalised).  The data in these
   containers is called a value; values can use letters, numbers, and
   symbols.  Keys and values are connected by the equals (=) sign.
   Any amount of whitespace (spaces and tabs) around the equals sign
   is acceptable.

   Anything not conforming to the syntax

      SomeKey = Value

   is ignored (treated as a comment).  To be compliant with GopherII,
   comments must begin with a hash (#) sign.  More importantly, they
   must be on a line of their own.

   Below is an example caps file.

      CAPS
      CapsVersion=1
      ExpireCapsAfter=3600

      PathDelimeter=/
      PathIdentity=.
      PathParent=..
      PathParentDouble=FALSE
      PathEscapeCharacter=\\
      PathKeepPreDelimeter=FALSE

      ServerSoftware=Bucktooth
      ServerSoftwareVersion=0.2.9
      ServerArchitecture=AIX
      ServerDescription=IBM Power 520 Express, 2x4.2GHz POWER6 CPU, 8G RAM
      ServerGeolocationString=Southern California, USA

      ServerSupportsStdinScripts=TRUE

      ServerAdmin=gopher@floodgap.com

      DefaultEncoding=utf-8

   The `CapsVersion` field is self-explanatory, with one note: it
   should always be the *first* field in the file, so that an
   incompatible later format might be detected by the client.  The
   `ExpireCapsAfter` field tells the client the recommended cache
   expiry time (that is, the time between fetching and re-fetching the
   caps file) in *seconds*.  `3600` as above means one hour, and so
   on.

   The `Path` variables `PathDelimeter` [sic!], `PathIdentity`,
   `PathParent`, `PathParentDouble`, `PathEscapeCharacter`, and
   `PathKeepPreDelimeter` [sic!] refer to attributes of the file
   system.  The above example is correct for a UNIX system, including
   Mac OS X.
   `PathDelimeter` refers to how the server separates folders from
   each other; Unix machines use `/`, Microsoft machines use `\`, and
   obsolete Macs use `:`.  `PathIdentity` refers to the shorthand used
   by an operating system to mean "this directory"; UNIX machines use
   `.`.  `PathParent` refers to the shorthand for "the directory
   immediately above", and is `..` on UNIX and Microsoft systems.
   `PathParentDouble` refers to an oddball feature of obsolete Macs:
   two consecutive path delimiters are used to refer to the parent
   directory.  For all systems other than pre-OS X Macintoshes,
   `PathParentDouble` should be FALSE.  `PathEscapeCharacter` tells
   the client the escape character for quoting delimiters when they
   appear in selectors; most of the time, this is `\\`.
   `PathKeepPreDelimeter` tells the client not to cut everything up to
   the first path delimiter; most of the time, this should be `FALSE`.

   The `Server` variables `ServerSoftware`, `ServerSoftwareVersion`,
   `ServerArchitecture`, `ServerDescription`, and
   `ServerGeolocationString` are freetext descriptions of the server
   software and version, operating system ("architecture"), server
   hardware (`ServerDescription`), and location on the Earth.

   Finally, `ServerAdmin` is an eMail contact address for the server
   administrator, and `DefaultEncoding` is the default text encoding
   for content types 0 and 1.

14.2.  Robot Access Restrictions Policy

   WWIS robots, also known as spiders, crawlers, or wanderers, are
   computer programmes that, without human intervention, recursively
   travel throughout linked pages or directories on an information
   system (that is, by repeatedly travelling up and down a tree) and
   store the copies of these files at an independent location.  The
   process of programmatically gathering information in this manner is
   called crawling or spidering.
   Many sites, in particular search engines (such as Google on the
   World Wide Web, or Veronica on Gopher), use spidering as a means of
   providing up-to-date data.  Robots are mainly used to create a copy
   of all the visited pages for later processing by a search engine
   that will index the downloaded pages to provide fast searches.
   Robots can also be used for redundancy; data can be preserved by a
   third party in case the original server becomes inaccessible.

   In 1993 and 1994, however, there were occasions where robots had
   visited locations on the Web at which they were not welcome.
   Inexperienced or heavy-handed use of robots caused situations where
   servers were swamped with requests at a high rate of speed; or, the
   same files were retrieved repeatedly.  Both could cause denial of
   service.  In other situations, robots traversed parts of servers
   that were unsuitable, such as temporary information or server-side
   scripts, especially those with side-effects (such as polls).  Abuse
   of robots was also an issue, and continues to be one now; for
   instance, electronic mail addresses have been harvested with
   knowing intent to distribute unsolicited mail ('spam').

   These incidents indicated the need for established mechanisms for
   Gopher servers to indicate to robots which parts of their server
   should not be accessed.  This specification addresses this
   requirement with an operational solution, adapted from the
   identical method used on sites using the Hypertext Transfer and
   File Transfer Protocols.

   The method used to exclude robots from a Gopher server is formally
   known as the Robot Access Restrictions Policy (RARP) and consists
   of placing a plain-text file specifying, in simple and
   user-friendly syntax, which robots may access which directory.
   The policy file, if it exists, MUST be accessible via Gopher on the
   local address

      /robots.txt

   A possible drawback of this single-file approach is that only a
   server administrator can maintain such a list, not the individual
   document maintainers on the server.  This can be resolved by a
   local process to construct the single file from a number of others,
   but if, or how, this is done is outside of the scope of this
   document.

   Furthermore, Gopher administrators should bear in mind that the
   Robot Access Restrictions Policy works largely on the honour
   system.  Many crawlers can be set to ignore the policy, and it is
   trivial to write this capability into a new crawler.

   The policy file consists of one or more records, separated by one
   or more blank lines, terminated by the Gopher-standard CR/LF.  Each
   record contains two or more lines of the form

      <field>: <value>

   The field name is not case-sensitive.  Comments (lines to be
   ignored by robots themselves, but useful to robot operators and
   others) start with the hash (#) character and end with the line
   terminator (CR/LF).  A value can share a line with a comment.  A
   record starts with at least one `User-agent` field, followed by at
   least one `Disallow` field.  There are two further, optional
   fields: `Crawl-delay` and `Allow`.

   The value of the `User-agent` field is the name of the robot whose
   access policy is being described.  If more than one `User-agent`
   field is present, the record is describing an identical access
   policy for each robot.  This field is to be interpreted broadly.
   The recommended implementation of access policies in the robot's
   code is for a case-insensitive sub-string match, without version
   information.  Since one is describing an access policy for at least
   one robot, at least one `User-agent` field is required.
   The value `*` (quotes excluded) describes access policy for any
   robot not matching any previous records; therefore, if listed, it
   SHOULD be listed last of all.  If it is not listed last of all,
   anything below it will be ignored.

   The value of the `Disallow` field specifies a partial URL that is
   not to be visited.  This can be a full path, or a partial path.
   Any address that begins with this value will not be retrieved; for
   instance, the line `Disallow: /help` would disallow
   `/help/index.html`; `/help/faq.html`; as well as `/help.html`.
   Conversely, the line `Disallow: /help/` would allow `/help.html`,
   but nothing in the directory `/help/`.  An empty `Disallow` field
   indicates that all addresses can be retrieved.  As one is defining
   policy and not simply listing the names of robots, at least one
   `Disallow` field is required per record.

   One can also add specific exceptions to the locations disallowed by
   using the `Allow` field.

   The `Crawl-delay` field is also supported; this field indicates the
   number of seconds to wait between successive requests to the same
   server; the value must be an integer with no units.

   The following is an example of a well-built policy file:

      # Robot Exclusion File for gopher://gopher.scdp.com
      # If you wish to crawl gopher.scdp.com, please contact
      # lane.pryce@scdp.com to apply for an exemption.  Our terms of
      # service are available at gopher://gopher.scdp.com/0/tos.txt.

      User-agent: baiduspider
      User-agent: googlebot
      User-agent: msnbot
      User-agent: bingbot
      User-agent: naverbot
      User-agent: seznambot
      User-agent: slurp
      User-agent: teoma
      User-agent: yandex
      Disallow: /cgi-bin/  # Dynamically generated scripts
      Disallow: /images/   # This consumes bandwidth!
      Disallow: /tmp/      # Temporary files---blink, gone!
      Disallow: /private/  # No peeking!
      Allow: /images/logo.jpg  # Main logo.  Mirror this if possible.
      Crawl-delay: 10

      User-agent: *
      Disallow: /
      # If you have received authorisation to crawl this site, and are
      # getting denied, please contact support@scdp.com, or dial
      # (212) 555 0169.  This site is copyright Sterling, Cooper, Draper,
      # and Pryce, 2012.

   In plain terms, this server allows the major search engines Baidu,
   Google, Bing, Naver, Seznam, Teoma, Yahoo, and Yandex to mirror the
   site freely, with the exception of everything in the directories
   /cgi-bin/, /tmp/, and /private/, and everything in the directory
   /images/ other than the single file logo.jpg.  So as not to slow
   the server down unduly, the policy file requests that search
   engines wait ten seconds between requests.  All other robots are
   prohibited from accessing the site.

   Examples such as the following SHOULD NOT be used except in very
   rare situations.  Robots generally cause more good than harm, and
   excluding them entirely, as this anti-social user would, does not
   make Gopher a healthy place.

      # Piss off!
      User-agent: *
      Disallow: /

14.3.  Administrator Contact File

   It is worth remembering that computers, like anything else, are
   fallible and prone to error.  When failure occurs in Gopherspace,
   the person in the best position to rectify it is the system
   administrator.  Furthermore, users may have questions or comments,
   also best directed to the system administrator.  For this reason,
   each Gopher server MUST have a file in its top-level directory with
   the name *about.txt* and a RECOMMENDED selector string of *About*
   or *About this server* (equivalents in the local language are
   permissible, but an English translation is similarly RECOMMENDED).
   It is the Gopher equivalent of a Unix user's finger output.

   Since this file is intended to be readable by humans and not
   computers, it does not have a defined file format.
   However, it should have a short description of the server's
   contents, as well as the contact details of the server
   administrator and any other key employees, such as the legal
   department.  A well-structured contact file looks as follows:

      Sterling|Cooper|Draper|Pryce
      ============================

      Welcome to SCDP!  We are a full-service advertising and marketing
      agency staffed by a team of diverse, senior professionals with a
      flair for solid strategy and compelling creative output.  Our team
      produces unique television, radio, print, and Web advertisements
      for a range of industries.

      Our ability to identify and communicate your greatest benefit to
      your customers is our greatest benefit to you.  We find out what
      makes you truly unique.  We have built an excellent team: each
      member is an advertising specialist in their own right.
      Photography, programming, writing, design, strategy---you name it,
      we have a creative for that.

      System Administrator: Margaret Olson
      Telephone:            (212) 555 0169 x808
      Address:              13, Madison Avenue,
                            New York, N.Y.,
                            U.S.A.
      eMail:                peggy.olson@scdp.com
      Skype:                peggyXolson

      All prospective clients:
      Please contact Creative Director Donald Draper at extension 069.

      Legal issues:
      For all legal and financial issues, please contact Lane Pryce
      at extension 777.

15.  IANA Considerations

   Nothing within this document should be taken to imply that any
   actions are to be undertaken by the Internet Assigned Numbers
   Authority.

   [[CREF4: Unless you have some reason for doing otherwise, just
   saying, "This document does not require any action by the IANA" is
   adequate.]]

16.  Security Considerations

   None.

   [[CREF5: You are not going to get away with this, especially for
   standards track -- some serious discussion is needed.]]

17.  Acknowledgments

   [[CREF6: you are going to need these before you get finished.]]

18.  References

18.1.  Normative References

   [Anklesaria1993]
              Anklesaria, F., "Gopher+: upward compatible enhancements
              to the Internet Gopher protocol", 1993.

              Anklesaria, Farhad; Lindner, Paul; McCahill, Mark P.;
              Torrey, Daniel; Johnson, David; Alberti, Bob (1993).  **
              Retrieved 23 May, 2012, from

   [RFC1436]  Anklesaria, F., McCahill, M., Lindner, P., Johnson, D.,
              Torrey, D., and B. Alberti, "The Internet Gopher Protocol
              (a distributed document search and retrieval protocol)",
              RFC 1436, March 1993.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

18.2.  Informative References

   [CapsRef]  Kaiser, C., "Welcome to caps!", 2010.

              Kaiser, Cameron (2010).  *Welcome to caps!*  Retrieved 23
              May, 2012, from

   [Floodgap]
              Floodgap, "Floodgap's caps file".

              - Floodgap's caps file

   [Goerzen2012]
              Goerzen, J., "Links to URL", 2002.

              Goerzen, John (2002).  *Links to URL.*  Retrieved 23 May,
              2012, from

   [GopherHistory]
              ??, ?., "A paper on the history of Gopher".

              -- A paper on the history of Gopher

   [RelatedDocs]
              ??, ?., "gophernicus.org".

              - A number of documents relating to gopher, including the
              RFCs

   [RFC4266]  Hoffman, P., "The gopher URI Scheme", RFC 4266, November
              2005.

   [UpdatedGopher]
              ??, ?., "The "Updated Gopher RFC" thread", 2012.

              The "Updated Gopher RFC" thread (started May 8 2012) on
              the gopher-project mailing list

Appendix A.  Summary of Changes from RFC 1436

   [[CREF7: Not required, but strongly recommended and you may be
   forced into it even if you don't do it spontaneously. ]]

Appendix B.  Change Log

   [[RFC Editor: please remove this appendix.]]

B.1.  Changes from -00 to -01 of this specification

   ... to be supplied ..

Authors' Addresses

   Ted Matavka

   Email: n.theodore.matavka.files@gmail.com


   Wolfgang Faust

   Email: wolfgangmcq@gmail.com