idnits 2.17.1 draft-matavka-gopher-ii-02.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- == There are 1 instance of lines with non-RFC2606-compliant FQDNs in the document. -- The draft header indicates that this document updates RFC4266, but the abstract doesn't seem to mention this, which it should. -- The draft header indicates that this document updates RFC1436, but the abstract doesn't seem to directly say this. It does mention RFC1436 though, so this could be OK. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year (Using the creation date from RFC1436, updated by this document, for RFC5378 checks: 1993-03-01) -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (June 8, 2015) is 3244 days in the past. Is this intentional? -- Found something which looks like a code comment -- if you have code sections in the document, please surround them with '<CODE BEGINS>' and '<CODE ENDS>' lines.
Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Missing Reference: 'CR' is mentioned on line 1029, but not defined == Missing Reference: 'LF' is mentioned on line 1029, but not defined == Unused Reference: 'Anklesaria1993' is defined on line 1350, but no explicit reference was found in the text == Unused Reference: 'CapsRef' is defined on line 1369, but no explicit reference was found in the text == Unused Reference: 'Floodgap' is defined on line 1375, but no explicit reference was found in the text == Unused Reference: 'Goerzen2012' is defined on line 1381, but no explicit reference was found in the text == Unused Reference: 'GopherHistory' is defined on line 1388, but no explicit reference was found in the text == Unused Reference: 'RelatedDocs' is defined on line 1395, but no explicit reference was found in the text == Unused Reference: 'UpdatedGopher' is defined on line 1404, but no explicit reference was found in the text -- Possible downref: Non-RFC (?) normative reference: ref. 'Anklesaria1993' ** Downref: Normative reference to an Informational RFC: RFC 1436 Summary: 1 error (**), 0 flaws (~~), 11 warnings (==), 6 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Network Working Group T. Matavka 3 Internet-Draft 4 Updates: 1436, 4266 (if approved) W. Faust 5 Intended status: Standards Track June 8, 2015 6 Expires: December 10, 2015 8 Gopher-II: The Next Generation Gopher WWIS 9 draft-matavka-gopher-ii-02 11 Abstract 13 The Gopher protocol is over twenty years old. 
Changing practices and 14 unofficial extensions have caused Gopher as currently used to differ 15 from, but remain largely compatible with, the standard established in its 16 official governing document, *The Internet Gopher Protocol (a 17 distributed document search and retrieval protocol)*, known as *RFC 18 1436*. Therefore, this document attempts to establish a contemporary 19 specification of the Gopher communications protocol, departing as 20 little as possible from current practice. 22 Status of This Memo 24 This Internet-Draft is submitted in full conformance with the 25 provisions of BCP 78 and BCP 79. 27 Internet-Drafts are working documents of the Internet Engineering 28 Task Force (IETF). Note that other groups may also distribute 29 working documents as Internet-Drafts. The list of current Internet- 30 Drafts is at http://datatracker.ietf.org/drafts/current/. 32 Internet-Drafts are draft documents valid for a maximum of six months 33 and may be updated, replaced, or obsoleted by other documents at any 34 time. It is inappropriate to use Internet-Drafts as reference 35 material or to cite them other than as "work in progress." 37 This Internet-Draft will expire on December 10, 2015. 39 Copyright Notice 41 Copyright (c) 2015 IETF Trust and the persons identified as the 42 document authors. All rights reserved. 44 This document is subject to BCP 78 and the IETF Trust's Legal 45 Provisions Relating to IETF Documents 46 (http://trustee.ietf.org/license-info) in effect on the date of 47 publication of this document. Please review these documents 48 carefully, as they describe your rights and restrictions with respect 49 to this document. Code Components extracted from this document must 50 include Simplified BSD License text as described in Section 4.e of 51 the Trust Legal Provisions and are provided without warranty as 52 described in the Simplified BSD License. 54 Table of Contents 56 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 57 1.1.
Terminology . . . . . . . . . . . . . . . . . . . . . . . 3 58 1.2. Changes from RFC 1436 and 4266 . . . . . . . . . . . . . 3 59 2. Basic Gopher Transactions . . . . . . . . . . . . . . . . . . 4 60 2.1. Menu Transaction . . . . . . . . . . . . . . . . . . . . 4 61 2.2. Index Transaction . . . . . . . . . . . . . . . . . . . . 5 62 2.3. Simple Text Transaction . . . . . . . . . . . . . . . . . 5 63 2.4. Binary Transaction . . . . . . . . . . . . . . . . . . . 5 64 2.5. Details . . . . . . . . . . . . . . . . . . . . . . . . . 5 65 3. Line Terminators . . . . . . . . . . . . . . . . . . . . . . 6 66 4. Selector Formats . . . . . . . . . . . . . . . . . . . . . . 7 67 4.1. Type Codes . . . . . . . . . . . . . . . . . . . . . . . 7 68 4.2. GopherIIbis: Metadata in Gopherspace . . . . . . . . . . 8 69 5. Gopher Menus . . . . . . . . . . . . . . . . . . . . . . . . 9 70 5.1. Note on the terminating full stop . . . . . . . . . . . . 10 71 6. Requesting Data . . . . . . . . . . . . . . . . . . . . . . . 10 72 7. Data Transfer . . . . . . . . . . . . . . . . . . . . . . . . 11 73 8. Requesting and Receiving Metadata . . . . . . . . . . . . . . 12 74 8.1. The `INFO` Record . . . . . . . . . . . . . . . . . . . . 13 75 8.2. The `ADMIN` Record . . . . . . . . . . . . . . . . . . . 13 76 8.3. The `VIEWS` Record . . . . . . . . . . . . . . . . . . . 14 77 8.4. The `ABSTRACT` Record . . . . . . . . . . . . . . . . . . 15 78 9. Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 79 9.1. Error Codes . . . . . . . . . . . . . . . . . . . . . . . 16 80 10. Titles in Gopher . . . . . . . . . . . . . . . . . . . . . . 17 81 11. Linking to Web Addresses . . . . . . . . . . . . . . . . . . 18 82 12. Algorithm to use with selectors . . . . . . . . . . . . . . . 20 83 13. Representation of Gopher Addresses . . . . . . . . . . . . . 21 84 14. Gopher Policy Files . . . . . . . . . . . . . . . . . . . . . 22 85 14.1. Capability Policy . . . . . . . . . . . . . . . . . . . 
22 86 14.2. Robot Access Restrictions Policy . . . . . . . . . . . . 25 87 14.3. Administrator Contact File . . . . . . . . . . . . . . . 28 88 15. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 29 89 16. Security Considerations . . . . . . . . . . . . . . . . . . . 29 90 17. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 30 91 18. References . . . . . . . . . . . . . . . . . . . . . . . . . 30 92 18.1. Normative References . . . . . . . . . . . . . . . . . . 30 93 18.2. Informative References . . . . . . . . . . . . . . . . . 31 94 Appendix A. Summary of Changes from RFC 1436 . . . . . . . . . . 31 95 Appendix B. Change Log . . . . . . . . . . . . . . . . . . . . . 32 96 B.1. Changes from -00 to -01 of this specification . . . . . . 32 97 B.2. Changes from -01 to -02 of this specification . . . . . . 32 98 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 32 100 1. Introduction 102 The over-riding aim of this document is to author a contemporary 103 specification of the Gopher world-wide information system, without 104 falling short of reflecting actual practice and without breaking 105 compliance with RFC 1436 [RFC1436]. This document shall attempt to 106 describe, and, where necessary, update current practice as regards 107 the means of handling errors, line and file terminators, policy 108 files, TITLE selectors, the URL: re-direction scheme, and new 109 selector types not compliant with RFC 1436. This document is not to 110 be construed as a replacement for RFC 1436; it merely complements it. 112 Gopher is a lightweight, client/server-oriented query/answer 113 protocol, functioning as a world-wide information system (WWIS) and 114 facilitating access to remote servers of any description. The 115 protocol and software permit users of a wide variety of desktop 116 systems to browse, search, and retrieve documents residing on 117 multiple distributed server machines. 
Gopher is unique among world- 118 wide information systems in that it encourages data to be sent in 119 textual form and that it imposes a strict hierarchy on content, 120 making it a protocol that is fast to transmit, receive, and search. 121 This, in turn, makes it useful in high-latency, low-bandwidth 122 communications, such as mobile links. In fact, Gopher provides the 123 ideal method for transmitting information from and to mobile devices. 125 1.1. Terminology 127 NOTE: The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL 128 NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in 129 this document are to be interpreted as described in RFC 2119 130 [RFC2119]. Furthermore, backticks (`) around a string mean that it 131 is to be interpreted literally. 133 1.2. Changes from RFC 1436 and 4266 135 GopherII remains broadly compatible with the original Gopher; a 136 client compatible with the original implementation of Gopher will be 137 able to browse GopherII servers with a minimum of problems. The only 138 difference in strict-compatibility terms arises in the new selector 139 types (including, but not limited to, one distinguishing "plain text" 140 from "ASCII markup document"). That said, although these new 141 selectors are not officially standardised in RFC 1436, most existing 142 Gopher clients will ask for user input when attempting to process an 143 unfamiliar selector, and these selectors have been in de facto use 144 for some time, such that current Gopher clients will be compatible 145 with them already. 147 GopherII modifies the original standard in eight ways. Aside from 148 the aforementioned new selector types, GopherII introduces the 149 concept of the so-called policy file. Policy files are configuration 150 files sent from the server to the client in ASCII form with a defined 151 syntax. 
Three policy files are defined in this document: the 152 capability policy, which defines architectural details of the server 153 (including information about the file system); the administrator 154 contact file, which defines the geographical location of the server 155 as well as the entity responsible for its maintenance, and the robot 156 access restrictions policy, which defines etiquette to be followed by 157 Gopher search engines. The word 'etiquette' is here used because, 158 like human codes of behaviour at mealtime, it is non-binding. 160 Policy files account for three of the substantive changes from 161 original Gopher. GopherII also adds HTTP-style error codes, a 162 mechanism for titling the Gopher client window or tab, and 163 standardises a backward-compatible method for linking to HTTP 164 addresses. The final change is that GopherII adds support for the 165 metadata system known as Gopher+ (GopherIIbis in this document), 166 although this part of the GopherII specification is entirely 167 optional. 169 2. Basic Gopher Transactions 171 There are four broad forms of basic transactions in Gopher: 173 o Menu Transaction; 175 o Index Transaction; 177 o Simple Text Transaction; and 179 o Binary Transaction. 181 The precise composition of these transactions is elucidated below. 183 2.1. Menu Transaction 185 o Client : [Open Connexion] 187 o Client : Send [selector] 189 o Server : Send 191 o Server : Send . 193 o Server : [Close Connexion] 195 2.2. Index Transaction 197 o Client : [Open Connexion] 199 o Client : Send [selectorquery parameters] 201 o Server : Send [] 203 o Server : Send . 205 o Server : [Close Connexion] 207 2.3. Simple Text Transaction 209 o Client : [Open Connexion] 211 o Client : Send [selector] 213 o Server : Send [] 215 o Server : Send . 217 o Server : [Close Connexion] 219 2.4. Binary Transaction 221 o Client : [Open Connexion] 223 o Client : Send [selector] 225 o Server : Send [] 227 o Server : DO NOT send . 
229 o Server : [Close Connexion] 231 2.5. Details 233 The fourth step of each transaction, with the exception of the binary 234 type, is OPTIONAL. Servers MAY send a full-stop character after 235 sending a menu, index, or text; if they do, clients MUST accept it. 236 Further information may be found in the appropriate sub-section. 238 Gopher servers are normally found on TCP port 70. Clients MUST 239 assume this port if no other port is specified. When a client opens 240 a connection to a server, the server MUST accept the connection but 241 say nothing, waiting for a CR/LF-terminated selector string from the 242 client. The client MAY then send the selector string followed by 243 CR/LF (or an empty selector string, that is, CR/LF alone, to retrieve the root menu from the server, which MUST 244 always be type 1). The server MUST then send the requested content 245 and close the connection. 247 3. Line Terminators 249 ASCII, the international standard that governs the interchange of 250 plain-text information between computer systems, is nothing more or 251 less than a table mapping each character (letter, number, space, or 252 symbol) to a numerical code, which is then converted to binary and 253 written to disc. Its necessity was seen long before the advent of 254 the electronic monitor, so some of its more unusual quirks must be 255 understood in view of the time period of which it was a product. 256 Historically, input and output were through a specially-adapted 257 typewriter, and the ASCII convention reflects this in the codes it 258 uses to terminate lines of text. 260 In ASCII, there are two codes, both having physical equivalents in 261 the real world, that signal the end of the line: the Carriage Return 262 (abbreviated C/R, CR, or c/r) and the Line Feed (abbreviated L/F, LF, 263 or l/f).
Originally, the term *carriage return* was used for a 264 command that caused the assembly holding the paper (the carriage) to 265 return to the right so the machine was ready to type again on the 266 left side of the paper (assuming a left-to-right language). On the 267 other hand, the *line feed* moved the paper upwards, allowing the 268 carriage to type on the following line. 270 Different operating systems traditionally signal the end of a line in 271 different ways. UNIX and its descendants (including Mac OS X), the 272 operating systems most likely to run on a server, use the line feed 273 alone. CP/M, DOS, and Microsoft Windows use the sequence of carriage 274 return and line feed (CR/LF). Obsolete versions of Mac OS (up to, 275 and including, System 9) use the carriage return alone. 277 All programmes using Gopher MUST always use the Microsoft standard of 278 CR/LF, irrespective of the operating system they run on. Both 279 internal Gopher commands and policy files MUST comply with this 280 standard. Other text files SHOULD use standard Gopher format, but 281 this is not strictly required as a matter of technical form; the 282 client MUST be capable of converting to and from all variants of line 283 terminators. The recommendation stands for the benefit of non- 284 compliant clients only. 286 4. Selector Formats 288 4.1. Type Codes 290 The following selectors are defined by RFC 1436: 292 Type Treat As Meaning 293 0 TEXT Plain text file 294 1 MENU Menu 295 2 EXTERNAL CCSO flat database (formerly used as telephone 296 directories); other databases 297 3 ERROR Error message 298 4 TEXT Macintosh BinHex file 299 5 BINARY Binary archive (zip; rar; 7-Zip; gzip; tar) 300 6 TEXT UUEncoded archive 301 7 INDEX Query a search engine or CGI script 302 8 EXTERNAL Telnet to: VT100 series server 303 9 BINARY Binary file (see also 5) 304 + - Redundant server 305 T EXTERNAL Telnet to: tn3270 series server 306 g BINARY GIF format graphics file (TODO: Why not use I?) 
307 I BINARY Any image file. 309 The `+` selector indicates a mirror of the previous item in the menu, 310 and MUST behave as though it had the same type as that entry. For 311 example: 313 5Download software /software.zip gopher.example.com 70 314 +example.net mirror /example.com/software.zip mirror.example.net 70 315 +Another mirror /software.zip mirror2.example.com 70 317 Additionally, the following selectors have been in common use and are 318 made official here. If a client does not have the capability to 319 display a particular item type, it SHOULD treat it as a more generic 320 item type, passing it off to the operating system (itemtype p 321 "implies" itemtype 0, etc.). 323 Type Treat As Meaning 324 c BINARY Calendar file (Kim Holviala) 325 d BINARY Word-processing document (MS 326 Word; OpenOffice.org; 327 WordPerfect); PDF document 328 h TEXT HTML document 329 i - Informational text (not 330 selectable) 332 p TEXT Page layout or markup document 333 (TeX; LaTeX; PostScript; Rich 334 Text Format); these documents are 335 all plain text, but contain ASCII 336 "tags" that make the document 337 prettier when sent through a 338 special program. 339 m BINARY Electronic mail repository (also 340 known as MBOX) (Kim Holviala) 341 s BINARY Audio recordings (files that 342 consist of audible, but no 343 visible, data) (Wesley Teal) 344 x TEXT eXtensible Markup Language 345 document (Wesley Teal) 346 ; BINARY Video files (files that consist 347 of both audible and visible 348 data) (Wesley Teal) 350 Filetypes `4`, `6`, `h`, `p`, and `x` SHOULD be sent as text (itemtype 351 0). This way, the text appears directly on the user's terminal 352 without being downloaded (unless the appropriate command is given to 353 the client, i.e. `CTRL/S`). It is vital to note that text 354 information can be sent via binary (with the minor inconvenience 355 noted above), as binary files contain a greater range of information 356 than ASCII.
However, binary files, if sent via text, will be 357 irreparably ruined, as this effectively passes raw eight-bit data 358 through an ASCII filter. In the case of confusion, the owner/ 359 operator of the server should simply mark the file as binary to 360 ensure that it transfers safely. 362 4.2. GopherIIbis: Metadata in Gopherspace 364 It is sometimes useful to transmit data about GopherII selectors. 365 This is known as "metadata": the *meta* construction is derived from 366 the Greek for "beyond", and refers to concepts which are abstractions 367 from other concepts intended to complete or add to the latter. For 368 instance, in psychology, metamemory refers to an individual's ability 369 to remember that he has remembered something. In plain English, 370 metadata refers to data about data. 372 GopherIIbis is an OPTIONAL, but recommended, addition to the basic 373 GopherII specification. That said, it is optional only in the sense 374 that a GopherII client MAY EITHER display the relevant information in 375 accordance with the specification, or ELSE ignore it entirely. To be 376 conformant with GopherII, Gopher clients MUST be capable of handling 377 GopherIIbis metadata. A GopherII client that displays GopherIIbis 378 metadata may be referred to as being compliant with GopherIIbis. 380 The name of the GopherIIbis extension is pronounced "gopher-two-biss" 381 or "gopher-two-beess". In typeset text, the French word "bis" should 382 be *italicised* so as to set it off visually. The name of the 383 GopherIIbis extension reflects that it is merely an addition, or an 384 iteration, of the GopherII protocol. 386 5. 
Gopher Menus 388 Menu (type 1) content has the following format: 390 T<item text>^I<selector>^I<server>^I<port> 392 Where: 394 o `^I` is the ASCII character corresponding to the `Tab` key 396 o `T` is the type code, which MUST be run together with the item 397 text 399 o <selector> is the selector string to send to the specified server 401 o <server> is the server to send the selector to 403 o <port> is the port on the server to connect to 405 If the server understands how to send and receive GopherIIbis 406 metadata, it MUST indicate this fact by adding a fourth tab character 407 (^I) and a plus sign after the port number. For example: 409 T<item text>^I<selector>^I<server>^I<port>^I+ 411 If the client does not understand GopherIIbis metadata, it MUST 412 ignore the trailing ^I+. 414 Note on `i` item type: For the `i` item type, Selector, Server, and 415 Port are mostly ignored, but MUST be there anyway. In that case, the 416 host SHOULD be set to placeholder value `example.com`, and the port 417 SHOULD be set to placeholder value `0` (zero). One exception to 418 their being ignored is TITLE entries. These have TITLE as the 419 selector value; host and port SHOULD again be set to aforementioned 420 placeholder values. 422 5.1. Note on the terminating full stop 424 Per RFC 1436, a terminating full stop (.) character followed by CR/LF 425 should be sent on a line by itself after the end of the content, with 426 exceptions for binary data. This terminating full stop has caused no 427 end of trouble ever since. Many, if not most, modern Gopher servers 428 omit this terminating full stop. Therefore, the practice suggested 429 in RFC 1436 is DEPRECATED and the following practice is RECOMMENDED.
431 o Servers MAY send the full stop; clients MUST accept it 433 o Servers SHOULD send the full stop after menus and may OPTIONALLY 434 send it after other files 436 o Clients SHOULD display the full stop at the end of menus, if sent, 437 to notify the user that this is the end of the menu 439 o Clients SHOULD NOT include the full stop in other output, in case 440 that output has some significance which the full stop may disrupt. 442 o Clients SHOULD NOT consider a full stop significant, unless it 443 occurs immediately before the connection is terminated. 445 6. Requesting Data 447 A standard GopherII client requests data from the server by 448 transmitting the selector string, a carriage return, and a line feed. 449 For instance, to retrieve the file `services.txt`, the client sends 451 services.txt[CR][LF] 453 GopherIIbis handles things in a slightly more complicated way. In 454 addition to a selector string, a GopherIIbis-compliant request 455 contains a *format* string, a data flag indicating the presence or 456 absence of a data block, and an OPTIONAL data block. 458 The format string is included because 459 GopherIIbis allows one selector to point to multiple versions of the 460 same file, in multiple languages. For instance, the same file in 461 Portable Document Format, PostScript format, Rich Text Format, and 462 plain text may be available, and each of these may be available in 463 British English, American English, Canadian French, and Continental 464 French. The format string, therefore, is the desired MIME type of 465 whichever format is being requested, followed by the ISO language and 466 country codes in the following format: 468 selector^Imime/type la_CO^I1[CR][LF]datadatadatadata 470 The number 1 above is the data flag. It can be either 1 or 0. If it 471 is 1, it means that the client is not only requesting data, but also 472 *sending* it.
This is useful, for example, when querying a relational 473 database on a Gopher server (this usage is now rare). An example 474 of a request that sends no data block (data flag 0) would be: 476 services.txt^Itext/plain fr_CA^I0[CR][LF] 478 7. Data Transfer 480 When a file is requested by a Gopher client, a Gopher server 481 incompatible with GopherIIbis simply sends the requested data as soon 482 as it gets the request from the client. GopherIIbis servers, on the 483 other hand, have three options when given a GopherIIbis-compliant 484 request (i.e. one that ends in ^I+). 486 If the size of the file in bytes is known, the server SHOULD transmit 487 a plus sign, the size, and the combination of carriage return and 488 line feed, then the file. For example, if the size of file 489 `report.tex` is known to be 64096 bytes, the server SHOULD transmit: 491 +64096[CR][LF]\documentclass{article}[CR][LF]\begin{document}... 493 If the size of the file is not known, there are two ways to proceed. 494 One of them is to send the character string `+-1` prior to beginning 495 transmission of the data proper, and end the transmission with a full 496 stop (.) on a line by itself, followed by carriage return and line 497 feed. For example: 499 +-1[CR][LF]data data[CR][LF]data data data data[CR][LF].[CR][LF] 501 It is RECOMMENDED that most textual data of unknown length be 502 transmitted this way. The exception is when there is a possibility 503 of the full stop appearing on a line by itself within the data; the client 504 would then treat the transfer as finished prematurely. There is no choice when sending 505 non-textual (binary) data: it MUST NOT be terminated with a full stop. 507 In either of these two cases (text that may contain a lone full stop, or binary data), the string to send is `+-2`. This 508 instructs the client that the data will be terminated when the 509 connexion is closed, and furthermore, that the length of the data is 510 unknown. For example: 512 +-2[CR][LF]binarydata 514 8.
Requesting and Receiving Metadata 516 A GopherIIbis client may request the metadata for a specific selector 517 by sending a string in the following form: 519 <selector>^I![CR][LF] 521 The trailing tab and exclamation mark are what distinguish a request 522 for metadata from a request for data. The metadata returned is of 523 the following form: 525 +INFO: 0Rest in peace, Lane Pryce^Ilpryce.txt^Igopher.scdp.com^I70^I+ 527 +ADMIN: 528 Admin: Roger Sterling 529 Mod-Date: Fri Feb 13 08:22:11 2015 <20150213082211> 531 +VIEWS: 532 text/plain: <10k> 533 application/postscript: <100k> 534 application/latex: <50k> 535 application/pdf: <120k> 537 +ABSTRACT: 539 Yesterday, our beloved partner Lane Gordon Pryce died of suicide in 540 his Manhattan home. He was 55. 542 In general, data intended to be read by the computer will be enclosed 543 in angle brackets (`<` and `>`). A graphical client may, for 544 example, provide a GUI menu of all possible document views with 545 graphical icons of the file type and tool-tips of the file size. 547 These are far from the only available metadata records; only the 548 `INFO` record is mandatory, and it MUST be transmitted first of all. 549 The `ADMIN` record is RECOMMENDED, and if it is included, it must be 550 transmitted *directly* after the `INFO` record. 552 It is also possible to retrieve only a *specific* record or range of 553 records. For example, to retrieve only the views and the abstract, a 554 client may send: 556 <selector>^I!+VIEWS+ABSTRACT[CR][LF] 558 Finally, it is possible to retrieve metadata for an *entire 559 directory*. Of course, this is relatively bandwidth-intensive (for a 560 56k link) but a modern Ethernet connexion should have no problem with 561 it. The reason for the requirement of an `INFO` record for every 562 selector should now be abundantly clear: the `INFO` record serves to 563 separate metadata for one file from metadata for another.
For 564 example: 566 <selector>^I&[CR][LF] 568 The only difference between a request for a *single* file's metadata 569 and a request for that of a whole directory is that a single-file 570 request uses an exclamation mark, whereas a whole-directory request 571 uses an ampersand ("and sign", &). 573 It is even possible to request a specific record from every selector 574 in the directory, by appending the requested fields to the command 575 string as above. 577 8.1. The `INFO` Record 579 The `INFO` record is MANDATORY in every metadata listing. It 580 contains the same data as the Gopher selector, with a plus sign at 581 the end, per GopherIIbis style. It MUST always be present, and it 582 MUST always be the first metadata record present. The `INFO` record 583 serves to separate metadata listings when more than one is sent at the same 584 time. 586 8.2. The `ADMIN` Record 588 To promote accountability, the `ADMIN` record is also MANDATORY in 589 every metadata listing. It MUST contain fields for `Admin` (the name 590 and contact information for the administrator of the file) and 591 `Mod-Date` (the date of last modification) as seen in the example below: 593 +ADMIN: 594 Admin: Roger Sterling 595 Mod-Date: 01 January 2015 597 Where a time of last modification is given, it MUST be in 24-hour format. 599 If the metadata listing is for the results of a database search, such 600 as Veronica, it SHOULD also include fields for `Score` (a 601 whole-number ranking of the relevance of the result to the search query) 602 and for `Score-Range` (the lowest and highest possible relevance 603 scores), as per the following example: 605 +ADMIN: 606 Admin: Margaret Olson 607 Mod-Date: 13 February 2015 608 Score: 100 609 Score-Range: 0 150 610 The first number in the `Score-Range` field is the *lower bound*, and 611 the second number is the *upper bound*. 613 Several other fields are optional.
`Site` is the name of the 614 Gopherhole, `Org` is the name of the business or individual who owns 615 the Gopherhole, `Loc` is the owner's location (city, district, and 616 country), `Geog` is the owner's geographic co-ordinates, and `TZ` is 617 the time zone as an offset from GMT, in the format GMT+hh or GMT-hh. For example: 619 +ADMIN: 620 ... 621 Site: S|C|D|P Main Site 622 Org: Sterling|Cooper|Draper|Pryce Inc. 623 Loc: New York, NY, USA 624 Geog: 40N 74W 625 TZ: GMT-05 627 The `Author` may also be given, as may be the `Creation-Date` and 628 `Expiration-Date`, in the same format as the `Mod-Date`. 630 8.3. The `VIEWS` Record 632 Although the main selector might be for only one format of a file 633 (such as Rich Text Format), the same file may be available in many 634 other formats, such as Plain Text for older systems, LaTeX for 635 typesetters, PDF for displaying on screen, PostScript for printing on 636 a graphical printer, and many more. 638 The `VIEWS` record in GopherIIbis allows for serving multiple 639 variants of the same file, using what are known as MIME file 640 descriptors, Content-Types, or Internet media types. The `VIEWS` 641 field also allows for viewing the same file in multiple languages and 642 even in multiple dialects of the same language; in this case, the 643 relevant abbreviations are known as ISO-639 language codes and 644 ISO-3166 country codes. These are generally at least somewhat 645 intuitive (`CA` for Canada, `GB` for the United Kingdom, `en` for 646 English), but a full list may be found on the ISO Web site.
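As a rough illustration of how a client might read the individual lines of a `VIEWS` record, the following Python sketch splits a line such as `text/plain: <10k>` (the shape shown in the Section 8 example), optionally carrying a language and country tag such as `fr_CA` after the MIME type. The names `View`, `VIEW_LINE`, and `parse_view_line` are hypothetical and not part of this specification, and the size is kept as the raw text between the angle brackets because the meaning of its unit suffix is not defined here.

```python
import re
from typing import NamedTuple, Optional

# Hypothetical pattern for one view line: MIME type, optional la_CO tag,
# a colon, then machine-readable data enclosed in angle brackets.
VIEW_LINE = re.compile(
    r"^\s*(?P<mime>[\w.+-]+/[\w.+-]+)"
    r"(?:\s+(?P<lang>[a-z]{2})_(?P<country>[A-Z]{2}))?"
    r"\s*:\s*<(?P<size>[^>]*)>\s*$"
)


class View(NamedTuple):
    mime: str                 # e.g. "text/plain"
    language: Optional[str]   # ISO-639 code, e.g. "fr", or None if absent
    country: Optional[str]    # ISO-3166 code, e.g. "CA", or None if absent
    size: str                 # raw text between "<" and ">", e.g. "10k"


def parse_view_line(line: str) -> Optional[View]:
    """Return the parsed view, or None if the line is not a view line."""
    match = VIEW_LINE.match(line)
    if match is None:
        return None
    return View(match["mime"], match["lang"], match["country"], match["size"])
```

With this sketch, `parse_view_line("text/plain fr_CA: <10k>")` yields the MIME type `text/plain`, language `fr`, country `CA`, and size text `10k`, while a record header line such as `+VIEWS:` yields `None`.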
648 This is an example of a `VIEWS` record allowing for the selection of 649 plain text, Rich Text, and PDF versions of the same file in American 650 English, European Portuguese, and Brazilian Portuguese: 652 +VIEWS: 653 text/plain en_US: <32K> 654 text/plain pt_PT: <34K> 655 text/plain pt_BR: <34K> 656 text/rtf en_US: <55K> 657 text/rtf pt_PT: <60K> 658 text/rtf pt_BR: <66K> 659 application/pdf en_US: <120K> 660 application/pdf pt_PT: <132K> 661 application/pdf pt_BR: <133K> 663 The entries in the `VIEWS` record SHOULD be ordered according to the administrator's 664 idea of which view is preferred. On an American site catering to 665 English speakers, the `en_US` files should be listed first of all. 666 Likewise, on a site of any language catering to scientists, LaTeX 667 source should always come first of all. 669 8.4. The `ABSTRACT` Record 671 It is RECOMMENDED that every selector on a GopherIIbis-compliant 672 server have an `ABSTRACT` record. The `ABSTRACT` record contains a 673 *brief* description of the item (no more than a paragraph long) to 674 assist the reader in determining its purpose. Similarly, it is also 675 RECOMMENDED that the root directory of every Gopher server (that is, 676 what one gets when one requests metadata for the server itself with 677 no selector) contain an `ABSTRACT` record with the name, postal 678 address, eMail address, and telephone number of the person 679 responsible for the site. For example: 681 +ABSTRACT: 682 The life and times of Professor Albert Einstein, Swiss patent clerk 683 and discoverer of four great scientific theories in one miraculous 684 year. 686 9. Errors 688 Although undesirable in communication, errors do occur in Gopher, and 689 their handling is crucial for a user-friendly, and 690 standards-compliant, Gopher experience. 692 When an error is encountered, the server MUST return a menu whose 693 first item bears itemtype `3`.
All other ways of signalling an 694 error, such as redirecting to a Gopher error menu, an image, or 695 (worst of all) an HTML page, are PROHIBITED. 697 The selector string for itemtype `3` is the text of the error. It is 698 the responsibility of the server application to have understandable 699 and accurate strings for error handling. As they are well-understood 700 and common, HTTP-style error codes are acceptable and RECOMMENDED; 701 however, they SHOULD also be followed by a clear, legible description 702 of the error in both English and the local language. 704 Errors are handled in GopherIIbis in a slightly different fashion. 705 When an error occurs in response to a GopherIIbis-compliant query, 706 the server sends two minus signs, followed by an error code, a 707 description of the error, and a full stop. The error code SHOULD be 708 in the three-digit style elucidated in the next sub-subsection, but 709 the numbers 1, 2, and 3 MUST also be understood and handled 710 correctly, also as defined in "Error Codes". An example of a 711 GopherIIbis error follows: 713 --404[CR][LF]The file requested could not be found.[CR][LF].[CR][LF] 715 The decision of whether to send a GopherII error string or a 716 GopherIIbis error string is governed by the type of query received. 717 If the query was compliant with GopherIIbis, a GopherIIbis error MUST 718 be sent. In all other cases, a GopherII error MUST be sent. 720 9.1. Error Codes 722 This is a listing of numeric error codes used in Gopher; due to 723 Gopher's simplicity, it lacks most of the errors possible in HTTP. 724 Codes beginning with 4 can generally be traced to the client; codes 725 beginning with 5 are usually due to the server. 727 400 Bad Request The request could not be understood by the server 728 due to malformed syntax. 730 401 Unauthorised The request requires authentication. For example, 731 the received query value (as password) does not match the expected 732 value. 
734 403 Forbidden The request was received, but not filled. 736 404 Not Found The server could not find anything matching the 737 requested URL. If the condition is known to be permanent, use 738 error code 410 (Gone). 740 408 Request Time-out The client did not produce a request within the 741 time that the server was prepared to wait. 743 410 Gone The requested resource is no longer available at the server 744 and no forwarding address is known. This condition is expected to 745 be considered permanent. If this is unknown, use error code 404 746 (Not Found). 748 500 Internal Server Error The server encountered an unexpected 749 condition which prevented it from fulfilling the request. 751 501 Not Implemented The server does not support the functionality 752 required to fulfil the request. 754 503 Service Unavailable The server is currently unable to handle the 755 request due to temporary overload/maintenance. 757 An earlier version of the GopherIIbis extension, known as Gopher+, 758 used error codes `1`, `2`, and `3`. Error code `1` signifies an 759 unavailable item (similar to the 400-series errors), error code `2` 760 signifies an unavailable server (similar to the 500-series errors), 761 and error code `3` signifies an item that has moved. Provision was 762 made to create new error codes. This is now DEPRECATED; the *ad hoc* 763 creation of new errors does not accord with the ethos of a 764 standardised Internet protocol. 766 10. Titles in Gopher 768 No mention of menus with titles exists per RFC 1436. When one simply 769 browses about Gopherspace, this does not matter; for bookmarking and 770 Gopher crawlers, such as Veronica-2, however, this presents a large 771 problem. 773 A Gopher TITLE resource has the following format: 775 i^ITITLE^Iexample.com^I0 777 It is identical to a standard informational resource (itemtype `i`); 778 the selector string, however, is set to the specific value, `TITLE`. 
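Recognising a `TITLE` resource in a menu can be sketched as follows.
This is a minimal illustration rather than a reference implementation:
it assumes that the title text travels in the display string, as it
does for any itemtype `i` line, and the names `MenuItem`,
`parse_menu_line`, `is_title`, and `principal_title` are hypothetical.

```python
from typing import NamedTuple, Optional


class MenuItem(NamedTuple):
    itemtype: str
    display: str
    selector: str
    host: str
    port: str


def parse_menu_line(line: str) -> Optional[MenuItem]:
    """Split one tab-separated menu line; None for '.' or malformed lines."""
    line = line.rstrip("\r\n")
    if not line or line == ".":
        return None
    fields = line[1:].split("\t")
    if len(fields) < 4:
        return None
    return MenuItem(line[0], fields[0], fields[1], fields[2], fields[3])


def is_title(item: MenuItem) -> bool:
    """A TITLE is an informational item whose selector is exactly 'TITLE'."""
    return item.itemtype == "i" and item.selector == "TITLE"


def principal_title(menu_text: str) -> Optional[str]:
    """Return the title text only if the first menu item is a TITLE."""
    for line in menu_text.splitlines():
        item = parse_menu_line(line)
        if item is None:
            continue
        # Only the first resource can be the principal title.
        return item.display if is_title(item) else None
    return None
```

A client that does not know about `TITLE` would run the same
`parse_menu_line` step and simply display the line as informational
text, which is why the format stays backward compatible.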
The composition of the above format is as follows:

o  `^I` is the ASCII character corresponding to a press of the `Tab`
   key

o  The type code MUST be `i` (information)

o  The selector string MUST be `TITLE`

o  There is no server to connect to; the dummy text used in place of
   the server SHOULD be `example.com`

o  There is no port to connect to; the placeholder number SHOULD
   therefore be `0` (zero).

A Gopher client that conforms to the above `TITLE` specification
SHALL render it in one of two ways, depending on the placement of the
resource.  If the `TITLE` is the *first* resource in the document, it
SHALL be considered its principal `TITLE` and used *wherever a
principal title is needed* (window headings, bookmarks, etc.);
furthermore, it SHOULD be rendered in a different size, font, and/or
colour to the remainder of the document.  In *all other* cases, it
SHALL be considered a subordinate `TITLE` and SHOULD be rendered in a
different size, font, and/or colour to the remainder of the document,
but smaller and/or with less emphasis than the main title.

If a non-compliant Gopher client receives a `TITLE` resource as per
above, it will render it as plain informational text.  As the main
`TITLE` must be on the first line of a menu, it will appear visually
similar to a title in any case, although not rendered as such.

11.  Linking to Web Addresses

It is now possible, and standard, to link to documents, preferably in
HTML, on the World Wide Web, Gopher's younger, more widespread
cousin, from Gopher itself, using a two-part system: a `URL:`
selector on the Gopher (local) end, and a *redirect page* (following
rules as set out below) on the HTTP (remote) end.  There are no
compliance requirements for Gopher servers, with one exception:
servers MUST follow the bulleted list located immediately after the
example redirect page.

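On the client side, the `URL:` half of this system amounts to a prefix
check on the selector.  The sketch below is illustrative only; the
function names `external_url` and `resolve` and the tuple it returns
are assumptions made for the example, not part of this specification.

```python
from typing import Optional, Tuple, Union

URL_PREFIX = "URL:"

GopherTarget = Tuple[str, int, str]  # (host, port, selector)


def external_url(selector: str) -> Optional[str]:
    """Return the URL carried by a `URL:` selector, or None otherwise."""
    if selector.startswith(URL_PREFIX):
        return selector[len(URL_PREFIX):]
    return None


def resolve(selector: str, host: str,
            port: int) -> Tuple[str, Union[str, GopherTarget]]:
    """Decide where a menu item points.

    For a `URL:` selector the host and port of the menu line are
    ignored and the URL itself is the target; any other selector is an
    ordinary Gopher request to the listed host and port.
    """
    url = external_url(selector)
    if url is not None:
        return ("url", url)
    return ("gopher", (host, port, selector))
```

For example, `resolve("URL:http://www.example.com",
"gopher.example.org", 70)` yields the URL alone, whereas an ordinary
selector keeps the host and port it was listed with.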
A Gopher client SHALL, when it sees a selector with a path starting
with `URL:`, interpret the path as a URL.  It SHALL ignore the host
and port components of the Gopher selector, using those components
from the URL instead, if applicable.

`URL:` selectors SHOULD NOT be used if it is possible to link to the
required content and protocol by any other means.  In particular, the
following protocols SHALL NOT be used with the `URL:` selector:

o  gopher

o  telnet (VT100-compatible)

o  tn3270

Authors SHOULD NOT link to any document not of HTML type unless
absolutely necessary; linking to non-HTML documents will break
compatibility with non-compliant Gopher browsers.

A Gopher `URL:` selector MUST take the following format:

    h<text>^IURL:<URL>^I<server>^I<port>

`URL:` selectors are, for the most part, identical to standard HTML
selectors, but composed of particular data:

o  The item type corresponds to the type of document on the remote
   end.  Most typically, this is a Web page authored in HTML;
   therefore, the item type is most commonly `h`.

o  <text> is the text of the link; this can be almost anything.

o  <URL> is the full URL, preceded by the string `URL:`.  For
   example, this could be `URL:http://www.example.com`

o  <server> is the server that the link *originated* from; this
   MUST be ignored by a compliant client, but MUST also be sent by a
   compliant server

o  <port> is the port that the link *originated* from; this MUST
   be ignored by a compliant client, but MUST also be sent by a
   compliant server

It is possible for a non-compliant Gopher client to follow a link to
an HTML page, as long as the server is compliant, by the following
means: when the client receives a command to follow a `URL:`
selector, it will contact the server that provided the menu, as the
originating host and port are *mandatory* per this specification.

When a Gopher server receives a request from a client beginning with
the string `URL:`, it SHALL write out an HTML document that redirects
the browser to the appropriate place.  A conforming example of such a
document is as follows:

    You are following an external link to a Web site.  You will be
    automatically taken to the site shortly.  If you do not get sent
    there, please click here to go
    to the web site.

    The URL linked is:

    http://www.example.com/

    Thanks for using Gopher!

This document may take any form desired by the server authors, but
MUST adhere to the following requirements:

o  It SHALL provide a refresh of a duration of 10 seconds or less

o  It SHALL NOT use `IMG` tags, frames, or have any reference
   whatsoever to content outside that particular file, with the sole
   exception of the link to the real destination.

o  It SHALL NOT use JavaScript.

o  It SHALL adhere to the W3C HTML 4.0 standard.

When a non-compliant Gopher client finds a reference to an HTML file
(type `h`), it will open up the file via Gopher, receiving the
redirect document using a Web browser.  The Web browser will then be
redirected to the actual link destination.

Compliant Gopher clients will simply render the target directly.

12.  Algorithm to use with selectors

Here is a description of a hypothetical algorithm for parsing item
types, splitting them into levels of interaction.

    PROTOCOL
    --------
    Type             Description    What to do
    0                Brief text     Render directly line by line.
    1                Menu           Request and analyse menu.  If it
                                    contains '3' error node, print
                                    error.
                                    Else, render menu in new window.
    7                Index/Search
                     Server

    DATA NODES
    ----------
    Type             Description    What to do
    4, 9, g, I, c,   Binary file    Request and analyse file.  If it
    d, m, s, ;                      contains '3' error node, print
                                    error.  Else, does plug-in exist?
                                    If yes, display.  If no, save to
                                    disc.
    6, p, x          Text file      Request and analyse file.  If it
                                    contains '3' error node, print
                                    error.  Else, print on screen.
    h, 2, 8, T       Link           Treat as URL.
    5                Archive File   Request and analyse file.  If it
                                    contains '3' error node, print
                                    error.  Else, does plug-in exist?
                                    If yes, display.  If no, save to
                                    disc.

For instance, if the client is incapable of handling images as it is
text-only, the algorithm above would have it save to disc.

13.  Representation of Gopher Addresses

This section is greatly indebted to RFC 4266 [RFC4266].

A Gopher address, or uniform resource locator, takes the form:

    gopher://<host>:<port>/<gopher-path>

where <gopher-path> is one of:

o  <gophertype><selector>

o  <gophertype><selector>%09<search>

o  <gophertype><selector>%09<search>%09<gopher+_string>

If :<port> is omitted, the port defaults to 70.  <gophertype> is a
single-character field to denote the Gopher type of the resource to
which the URL refers.  The entire <gopher-path> may also be empty, in
which case the delimiting `/` is also optional and the <gophertype>
defaults to `1`.

<selector> is the Gopher selector string.  Selector strings are
arbitrary sequences of characters; they MUST NOT, however, contain
the characters corresponding to horizontal tab, line feed, or
carriage return.  Gopher clients specify which item to retrieve by
sending the Gopher selector string to a Gopher server.  It is
important to know that within the <selector> itself, there are no
reserved characters, so one may be arbitrarily creative when creating
selector names.

Note that some Gopher <selector> strings begin with a copy of the
<gophertype> character, in which case that character will occur twice
consecutively.  The Gopher selector string may be an empty string;
this is how Gopher clients refer to the top-level directory on a
Gopher server.

If the URL refers to a search to be submitted to a Gopher search
engine, the selector is followed by an encoded tab `%09` and the
search string.  To submit a search to a Gopher search engine, the
Gopher client sends the <selector> string (after decoding), a tab,
and the search string to the Gopher server.

14.  Gopher Policy Files

It is often useful to provide information to Gopher clients that MAY,
but need not, be read by a human being.  It is for this reason that
policy files exist.  This document enumerates two types of policy
files, formally known as the Capability Policy and the Robot Access
Restriction Policy, but also informally known under their filenames:
`caps.txt` and `robots.txt`, respectively.

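A client that fetches these policy files by address first has to break
a Gopher URL into its parts.  The following is a minimal sketch of the
address syntax of Section 13, under stated assumptions: the type
character is taken to appear unencoded, IPv6 literals are not handled,
and the names `GopherAddress`, `parse_gopher_url`, and `request_line`
are illustrative only.

```python
from typing import NamedTuple, Optional
from urllib.parse import unquote

DEFAULT_PORT = 70


class GopherAddress(NamedTuple):
    host: str
    port: int
    gophertype: str
    selector: str
    search: Optional[str]
    plus: Optional[str]


def parse_gopher_url(url: str) -> GopherAddress:
    """Split a gopher:// URL into host, port, type, selector and search."""
    prefix = "gopher://"
    if not url.lower().startswith(prefix):
        raise ValueError("not a gopher URL: %r" % url)
    authority, _, path = url[len(prefix):].partition("/")
    host, _, port_text = authority.partition(":")
    if not host:
        raise ValueError("missing host: %r" % url)
    port = int(port_text) if port_text else DEFAULT_PORT
    if not path:
        # Empty path: the type defaults to '1' and the selector is empty.
        return GopherAddress(host, port, "1", "", None, None)
    # Split on the encoded tab before decoding, since a decoded selector
    # may not contain a literal tab.
    fields = path[1:].split("%09", 2)
    selector = unquote(fields[0])
    search = unquote(fields[1]) if len(fields) > 1 else None
    plus = unquote(fields[2]) if len(fields) > 2 else None
    return GopherAddress(host, port, path[0], selector, search, plus)


def request_line(address: GopherAddress) -> str:
    """Build what the client sends: selector, optional tab and search."""
    line = address.selector
    if address.search is not None:
        line += "\t" + address.search
    return line + "\r\n"
```

Under these assumptions, `gopher://example.com/0/caps.txt` parses to
type `0` with selector `/caps.txt` on port 70, and `request_line`
produces the selector followed by CR LF.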
14.1.  Capability Policy

It is RECOMMENDED, when hosting a public-access Gopher server, to
include a capability policy.  Although it is, ultimately, the choice
of the owner or operator of the server, a capability policy (or caps
file) can be useful for clients querying the server for certain
information without using extensions such as Gopher+.

The purpose of a capability policy is to let a server instruct a
client on how properly to parse selectors in its filesystem; it
ensures that the client can understand how files on the server are
organised.  The scheme used in the current implementation of caps can
handle POSIX (UNIX and related operating systems), FAT/NTFS (used by
Microsoft Windows), and HFS (used by all versions of Apple Mac OS,
including OS X, which is otherwise POSIX-compatible).  For technical
reasons, capability policies cannot handle VMS or Files-11 paths;
however, owing to their open interface, the specification can be
arbitrarily extended.

A capability policy is quite simple in its composition: it is a plain
text file with no more than seventy characters per line in the root
directory of a Gopher server with the name

    caps.txt

and beginning with the six characters

    CAPS[CR][LF]

Because of the constrained name and location of the policy, it is a
trivial matter to verify if one exists or not; the address is always
of the form , with the real
name of the server substituting for `example`.  The server should
accept both `caps.txt` and `/caps.txt` as selectors, and return the
same content for both.

A caps file contains *keys*, *values*, and *comments*.

Keys can be compared to labelled containers for data; for instance,
the key `ServerSoftware` is a container for the name of the Gopher
software running on the server.  Keys in capability policies are
always alphanumeric (i.e., composed of letters and numbers only) and
generally are in CamelCase (each individual word within the key
capitalised).  The data in these containers is called a value; values
can use letters, numbers, and symbols.  Keys and values are connected
by the equals (=) sign.  Any amount of whitespace (spaces and tabs)
around the equals sign is acceptable.

Anything not conforming to the syntax

    SomeKey = Value

is ignored (treated as a comment).  To be compliant with GopherII,
comments must begin with a hash (#) sign.  More importantly, they
must be on a line of their own.

Below is an example caps file.

    CAPS
    CapsVersion=1
    ExpireCapsAfter=3600

    PathDelimeter=/
    PathIdentity=.
    PathParent=..
    PathParentDouble=FALSE
    PathEscapeCharacter=\\
    PathKeepPreDelimeter=FALSE

    ServerSoftware=Bucktooth
    ServerSoftwareVersion=0.2.9
    ServerArchitecture=AIX
    ServerDescription=IBM Power 520 Express, 2x4.2GHz POWER6 CPU, 8G RAM
    ServerGeolocationString=Southern California, USA

    ServerSupportsStdinScripts=TRUE

    ServerAdmin=gopher@floodgap.com

    DefaultEncoding=utf-8

The `CapsVersion` field is self-explanatory, with one note: it should
always be the *first* field in the file, so that an incompatible
later format might be detected by the client.  The `ExpireCapsAfter`
field tells the client the recommended cache expiry time (that is,
the time between fetching and re-fetching the caps file) in
*seconds*.  `3600` as above means one hour, and so on.

The `Path` variables `PathDelimeter` [sic!], `PathIdentity`,
`PathParent`, `PathParentDouble`, `PathEscapeCharacter`, and
`PathKeepPreDelimeter` [sic!] refer to attributes of the file system.
The above example is correct for a UNIX system, including Mac OS X.
`PathDelimeter` refers to how the server separates folders from each
other; Unix machines use `/`, Microsoft machines use `\`, and
obsolete Macs use `:`.  `PathIdentity` refers to the shorthand used
by an operating system to mean "this directory"; UNIX machines use
`.`.  `PathParent` refers to the shorthand for "the directory
immediately above", and is `..` on UNIX and Microsoft systems.
`PathParentDouble` refers to an oddball feature of obsolete Macs: two
consecutive path delimiters are used to refer to the parent
directory.  For all systems other than pre-OS X Macintoshes,
`PathParentDouble` should be FALSE.  `PathEscapeCharacter` tells the
client the escape character for quoting delimiters when they appear
in selectors; most of the time, this is `\\`.  `PathKeepPreDelimeter`
tells the client not to cut everything up to the first path
delimiter; most of the time, this should be `FALSE`.

The `Server` variables `ServerSoftware`, `ServerSoftwareVersion`,
`ServerArchitecture`, `ServerDescription`, and
`ServerGeolocationString` are freetext descriptions of the server
software and version, operating system ("architecture"), server
hardware (`ServerDescription`), and location on the Earth.

Finally, `ServerAdmin` is an eMail contact address for the server
administrator, and `DefaultEncoding` is the default text encoding for
content types 0 and 1.

14.2.  Robot Access Restrictions Policy

WWIS robots, also known as spiders, crawlers, or wanderers, are
computer programmes that, without human intervention, recursively
travel throughout linked pages or directories on an information
system (that is, by repeatedly travelling up and down a tree) and
store copies of these files at an independent location.  The
process of programmatically gathering information in this manner is
called crawling or spidering.

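A crawler may find it useful to read the capability policy of Section
14.1 before it traverses a server, since the `Path` keys tell it how
to interpret selectors.  The sketch below reads a caps file under the
rules given there (a `CAPS` first line, `Key = Value` pairs, everything
else treated as a comment).  The function names `parse_caps` and
`caps_flag` are hypothetical, and treating only `TRUE` as true is an
assumption drawn from the example file.

```python
import re
from typing import Dict

# Keys are alphanumeric; spaces and tabs around '=' are allowed.
_KEY_VALUE = re.compile(r"^([A-Za-z0-9]+)[ \t]*=[ \t]*(.*)$")


def parse_caps(text: str) -> Dict[str, str]:
    """Return the key/value pairs of a caps file as a dictionary."""
    lines = text.splitlines()
    if not lines or lines[0] != "CAPS":
        raise ValueError("a caps file must begin with CAPS and CR LF")
    caps = {}
    for line in lines[1:]:
        match = _KEY_VALUE.match(line)
        if match:
            caps[match.group(1)] = match.group(2).rstrip()
        # Blank lines, '#' comments, and anything else are ignored.
    return caps


def caps_flag(caps: Dict[str, str], key: str, default: bool = False) -> bool:
    """Read a TRUE/FALSE value; missing keys fall back to `default`."""
    value = caps.get(key)
    if value is None:
        return default
    return value.strip().upper() == "TRUE"
```

A client could then use, for instance, `int(caps["ExpireCapsAfter"])`
as the number of seconds to cache the policy, and
`caps.get("PathDelimeter", "/")` when splitting selectors.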
Many sites, in particular search engines (such as Google on the World
Wide Web, or Veronica on Gopher), use spidering as a means of
providing up-to-date data.  Robots are mainly used to create a copy
of all the visited pages for later processing by a search engine that
will index the downloaded pages to provide fast searches.  Robots can
also be used for redundancy; data can be preserved by a third party
in case the original server becomes inaccessible.

In 1993 and 1994, however, there were occasions where robots had
visited locations on the Web at which they were not welcome.
Inexperienced or heavy-handed use of robots caused situations where
servers were swamped with requests arriving at a high rate, or where
the same files were retrieved repeatedly.  Both could cause denial of
service.  In other situations, robots traversed parts of servers that
were unsuitable, such as temporary information or server-side
scripts, especially those with side-effects (such as polls).  Abuse
of robots was also an issue, and continues to be one now; for
instance, electronic mail addresses have been harvested with knowing
intent to distribute unsolicited mail ('spam').

These incidents indicated the need for established mechanisms for
Gopher servers to indicate to robots which parts of their server
should not be accessed.  This specification addresses this
requirement with an operational solution, adapted from the identical
method used on sites using the Hypertext Transfer and File Transfer
Protocols.

The method used to exclude robots from a Gopher server is formally
known as the Robot Access Restrictions Policy (RARP) and consists of
placing a plain-text file specifying, in simple and user-friendly
syntax, which robots may access which directory.  The policy file, if
it exists, MUST be accessible via Gopher on the local address

    /robots.txt

A possible drawback of this single-file approach is that only a
server administrator can maintain such a list, not the individual
document maintainers on the server.  This can be resolved by a local
process to construct the single file from a number of others, but if,
or how, this is done is outside of the scope of this document.

Furthermore, Gopher administrators should bear in mind that the Robot
Access Restrictions Policy works largely on the honour system.  Many
crawlers can be set to ignore the policy, and it is trivial to write
this capability into a new crawler.

The policy file consists of one or more records, separated by one or
more blank lines, terminated by the Gopher-standard CR/LF.  Each
record contains two or more lines of the form

    <field>: <value>

The field name is not case-sensitive.  Comments (lines to be ignored
by robots themselves, but useful to robot operators and others) start
with the hash (#) character and end with the line terminator (CR/LF).
A value can share a line with a comment.  A record starts with at
least one `User-agent` field, followed by at least one `Disallow`
field.  There are two further, optional fields: `Crawl-delay`, as
well as `Allow`.

The value of the `User-agent` field is the name of the robot whose
access policy is being described.  If more than one `User-agent`
field is present, the record is describing an identical access policy
for each robot.  This field is to be interpreted broadly.  The
recommended implementation of access policies in the robot's code is
for a case-insensitive sub-string match, without version information.
Since one is describing an access policy for at least one robot, at
least one `User-agent` field is required.

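The recommended `User-agent` comparison can be sketched as follows.
This is one possible reading, not a normative algorithm: it assumes
that version information follows a `/` in the robot's name (as in
`ExampleBot/2.1`) and that the field value is looked for inside the
robot's name; the function names `robot_matches` and `applies_to` are
hypothetical.

```python
from typing import Iterable


def robot_matches(field_value: str, robot_name: str) -> bool:
    """Case-insensitive sub-string match that ignores version information.

    Assumption: anything after the first '/' in the robot's name is
    version information and is dropped before matching.
    """
    base = robot_name.split("/", 1)[0].strip().lower()
    value = field_value.strip().lower()
    return bool(value) and value in base


def applies_to(user_agents: Iterable[str], robot_name: str) -> bool:
    """True if any `User-agent` value of a record matches the robot."""
    return any(robot_matches(value, robot_name) for value in user_agents)
```

With this reading, a record listing `googlebot` applies to a robot
that identifies itself as `Googlebot/2.1`, regardless of letter case
or version.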
The value `*` (quotes excluded) describes access policy for any robot
not matching any previous records; therefore, if listed, it SHOULD be
listed last of all.  If it is not listed last of all, anything below
it will be ignored.

The value of the `Disallow` field specifies a partial URL that is not
to be visited.  This can be a full path, or a partial path.  Any
address that begins with this value will not be retrieved; for
instance, the line `Disallow: /help` would disallow
`/help/index.html`, `/help/faq.html`, and `/help.html`.  Conversely,
the line `Disallow: /help/` would allow `/help.html`, but nothing in
the directory `/help/`.  An empty `Disallow` field indicates that all
addresses can be retrieved.  As one is defining policy and not simply
listing the names of robots, at least one `Disallow` field is
required per record.

One can also add specific exceptions to the locations disallowed by
using the `Allow` field.

The `Crawl-delay` field is also supported; this field indicates the
number of seconds to wait between successive requests to the same
server; the value must be an integer with no units.

The following is an example of a well-built policy file:

    # Robot Exclusion File for gopher://gopher.scdp.com
    # If you wish to crawl gopher.scdp.com, please contact
    # lane.pryce@scdp.com to apply for an exemption.  Our terms of
    # service are available at gopher://gopher.scdp.com/0/tos.txt.

    User-agent: baiduspider
    User-agent: googlebot
    User-agent: msnbot
    User-agent: bingbot
    User-agent: naverbot
    User-agent: seznambot
    User-agent: slurp
    User-agent: teoma
    User-agent: yandex
    Disallow: /cgi-bin/ # Dynamically generated scripts
    Disallow: /images/ # This consumes bandwidth!
    Disallow: /tmp/ # Temporary files---blink, gone!
    Disallow: /private/ # No peeking!
    Allow: /images/logo.jpg # Main logo.  Mirror this if possible.
    Crawl-delay: 10

    User-agent: *
    Disallow: /

    # If you have received authorisation to crawl this site, and are
    # getting denied, please contact support@scdp.com, or dial
    # (212) 555 0169.  This site is copyright Sterling, Cooper, Draper,
    # and Pryce, 2012.

In plain terms, this server allows major search engines Baidu,
Google, Bing, Naver, Seznam, Teoma, Yahoo, and Yandex to mirror the
site freely, with the exception of everything in the directories
/cgi-bin/, /tmp/, and /private/, as well as everything with the
exception of the single file logo.jpg in the directory /images/.  So
as to not unduly slow the server down, the policy file requests that
search engines wait ten seconds between requests.  All other robots
are prohibited from accessing the site.

Examples such as the following SHOULD NOT be used except in very rare
situations.  Robots generally cause more good than harm, and
excluding them entirely, as this anti-social user would, does not
make Gopher a healthy place.

    # Piss off!
    User-agent: *
    Disallow: /

14.3.  Administrator Contact File

It is worth remembering that computers, like anything else, are
fallible and prone to error.  When failure occurs in Gopherspace, the
person in the best position to rectify it is the system
administrator.  Furthermore, users may have questions or comments,
also best directed to the system administrator.  For this reason,
each Gopher server MUST have a file in its top-level directory with
the name *about.txt* and a RECOMMENDED selector string of *About* or
*About this server* (equivalents in the local language are
permissible, but an English translation is similarly RECOMMENDED).
It is the Gopher equivalent of a Unix user's finger output.

Since this file is intended to be readable by humans and not
computers, it does not have a defined file format.  However, it
should have a short description of the server's contents, as well as
the contact details of the server administrator and any other key
employees, such as the legal department.  A well-structured contact
file looks as follows:

    Sterling|Cooper|Draper|Pryce
    ============================

    Welcome to SCDP!  We are a full-service advertising and marketing
    agency staffed by a team of diverse, senior professionals with a
    flair for solid strategy and compelling creative output.  Our team
    produces unique television, radio, print, and Web advertisements
    for a range of industries.

    Our ability to identify and communicate your greatest benefit to
    your customers is our greatest benefit to you.  We find out what
    makes you truly unique.  We have built an excellent team: each
    member is an advertising specialist in their own right.
    Photography, programming, writing, design, strategy---you name it,
    we have a creative for that.

    System Administrator: Margaret Olson
    Telephone:            (212) 555 0169 x808
    Address:              13, Madison Avenue,
                          New York, N.Y.,
                          U.S.A.
    eMail:                peggy.olson@scdp.com
    Skype:                peggyXolson

    All prospective clients:
    Please contact Creative Director Donald Draper at extension 069.

    Legal issues:
    For all legal and financial issues, please contact Lane Pryce
    at extension 777.

15.  IANA Considerations

Nothing within this document should be taken to imply that any
actions are to be undertaken by the Internet Assigned Numbers
Authority.

16.  Security Considerations

Security in GopherII is dependent on the connexion on which it runs.
Standard GopherII (that is, running on "straight" TCP) is insecure
simply by virtue of the protocol.  Sensitive information, such as
credit card numbers, must not be sent over a standard Gopher link.
It is permissible to run GopherII over SSL, in which case all
security considerations that apply to HTTPS apply also to
GopherIIs.

17.  Acknowledgements

Thanks go to John Klensin for his invaluable assistance with regard to
the IETF process, his constructive criticism, and his calm demeanour
even when others just could not keep their tempers in check.

Thanks also go to the members of the Gopher mailing list for keeping
the Gopher protocol alive.  Thanks go specifically to the Gopher
developers: to Matjaz Mesnjak for his Windows-compatible, graphical
Gopher client and his simple Motsognir Gopher server; to Dr Cameron
Kaiser for Veronica-II, the next generation of Gopher search engine,
for the Bucktooth Gopher server, for the Overbite extension for
Mozilla Firefox, and for his tireless work on GopherVR, the only full
virtual-reality Gopher client; to Kevin Veroneau for his Gopher
Application Framework; and to Kim Holviala for the Gophernicus Gopher
server.

Finally, my thanks go to Thomas E. Dickey and the others who have
put in valuable work on the Lynx browser.  I thank them because,
rather than remove Gopher support in a misguided attempt to plug
security holes, they have in fact continued to improve this side of
their software, and they have succeeded in making the finest
text-mode Gopher client bar none.

18.  References

18.1.  Normative References

[Anklesaria1993]
           Anklesaria, F., Lindner, P., McCahill, M., Torrey, D.,
           Johnson, D., and B. Alberti, "Gopher+: upward compatible
           enhancements to the Internet Gopher protocol", 1993.
           Retrieved 23 May, 2012.

[RFC1436]  Anklesaria, F., McCahill, M., Lindner, P., Johnson, D.,
           Torrey, D., and B. Alberti, "The Internet Gopher Protocol
           (a distributed document search and retrieval protocol)",
           RFC 1436, March 1993.

[RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
           Requirement Levels", BCP 14, RFC 2119, March 1997.

18.2.  Informative References

[CapsRef]  Kaiser, C., "Welcome to caps!", 2010.  Retrieved 23 May,
           2012.

[Floodgap]
           Floodgap, "Floodgap's caps file".

[Goerzen2012]
           Goerzen, J., "Links to URL", 2002.  Retrieved 23 May,
           2012.

[GopherHistory]
           "A paper on the history of Gopher".

[RelatedDocs]
           "gophernicus.org".  A number of documents relating to
           gopher, including the RFCs.

[RFC4266]  Hoffman, P., "The gopher URI Scheme", RFC 4266, November
           2005.

[UpdatedGopher]
           "The "Updated Gopher RFC" thread", 2012.  Thread (started
           May 8 2012) on the gopher-project mailing list.

Appendix A.  Summary of Changes from RFC 1436

In broad strokes, RFC 1436 is compatible with this document; an "old"
Gopher client should be fully capable of browsing a GopherII server.
GopherII can be considered simply a refinement of the RFC 1436
concept; while RFC 1436 lays out a viable protocol, it leaves a lot
of small-scale implementation detail up to the makers of client
software.  While a sort of gentleman's agreement did manifest itself,
and while this gentleman's agreement was in some places almost
universal (the `i` itemtype, for example, with only Microsoft
Internet Explorer as the nonconforming Gopher client), it did lack
standardisation, which is what this document remedies.  More
specifically:

o  c, d, h, i, p, m, s, x, ; itemtypes.

o  extension formerly known as Gopher+

o  terminating full-stop behaviour

o  what to put in the title bar (`TITLE` resource)

o  links to HTTP URLs

o  policy files

Appendix B.  Change Log

B.1.  Changes from -00 to -01 of this specification

Converted to RFC standard format for legibility; added security
considerations section.

B.2.  Changes from -01 to -02 of this specification

Added acknowledgements and changes from original Gopher RFCs.
Removed placeholder text.

Authors' Addresses

Ted Matavka

Email: n.theodore.matavka.files@gmail.com

Wolfgang Faust

Email: wolfgangmcq@gmail.com