HTTP Working Group                              T. Berners-Lee, MIT/LCS
INTERNET-DRAFT                                   R. Fielding, UC Irvine
<draft-ietf-http-v10-spec-04.txt>                   H. Frystyk, MIT/LCS
Expires April 14, 1996                                 October 14, 1995


                Hypertext Transfer Protocol -- HTTP/1.0


Status of this Memo

   This document is an Internet-Draft. Internet-Drafts are working
   documents of the Internet Engineering Task Force (IETF), its areas,
   and its working groups. Note that other groups may also distribute
   working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other
   documents at any time. It is inappropriate to use Internet-Drafts
   as reference material or to cite them other than as "work in
   progress".

   To learn the current status of any Internet-Draft, please check the
   "1id-abstracts.txt" listing contained in the Internet-Drafts Shadow
   Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe),
   munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or
   ftp.isi.edu (US West Coast).

   Distribution of this document is unlimited. Please send comments to
   the HTTP working group at . Discussions
   of the working group are archived at
   . General discussions
   about HTTP and the applications which use HTTP should take place on
   the mailing list.

Abstract

   The Hypertext Transfer Protocol (HTTP) is an application-level
   protocol with the lightness and speed necessary for distributed,
   collaborative, hypermedia information systems. It is a generic,
   stateless, object-oriented protocol which can be used for many
   tasks, such as name servers and distributed object management
   systems, through extension of its request methods (commands). A
   feature of HTTP is the typing of data representation, allowing
   systems to be built independently of the data being transferred.

   HTTP has been in use by the World-Wide Web global information
   initiative since 1990. This specification reflects common usage of
   the protocol referred to as "HTTP/1.0".

Table of Contents

   1.  Introduction
       1.1  Purpose
       1.2  Terminology
       1.3  Overall Operation

   2.  Notational Conventions and Generic Grammar
       2.1  Augmented BNF
       2.2  Basic Rules

   3.  Protocol Parameters
       3.1  HTTP Version
       3.2  Uniform Resource Identifiers
            3.2.1  General Syntax
            3.2.2  http URL
       3.3  Date/Time Formats
       3.4  Character Sets
       3.5  Content Codings
       3.6  Media Types
            3.6.1  Canonicalization and Text Defaults
            3.6.2  Multipart Types
       3.7  Product Tokens

   4.  HTTP Message
       4.1  Message Types
       4.2  Message Headers
       4.3  General Header Fields

   5.  Request
       5.1  Request-Line
            5.1.1  Method
            5.1.2  Request-URI
       5.2  Request Header Fields

   6.  Response
       6.1  Status-Line
            6.1.1  Status Code and Reason Phrase
       6.2  Response Header Fields

   7.  Entity
       7.1  Entity Header Fields
       7.2  Entity Body
            7.2.1  Type
            7.2.2  Length

   8.  Method Definitions
       8.1  GET
       8.2  HEAD
       8.3  POST

   9.  Status Code Definitions
       9.1  Informational 1xx
       9.2  Successful 2xx
       9.3  Redirection 3xx
       9.4  Client Error 4xx
       9.5  Server Error 5xx

   10. Header Field Definitions
       10.1  Allow
       10.2  Authorization
       10.3  Content-Encoding
       10.4  Content-Length
       10.5  Content-Type
       10.6  Date
       10.7  Expires
       10.8  From
       10.9  If-Modified-Since
       10.10 Last-Modified
       10.11 Location
       10.12 MIME-Version
       10.13 Pragma
       10.14 Referer
       10.15 Server
       10.16 User-Agent
       10.17 WWW-Authenticate

   11. Access Authentication
       11.1  Basic Authentication Scheme

   12. Security Considerations
       12.1  Authentication of Clients
       12.2  Safe Methods
       12.3  Abuse of Server Log Information
       12.4  Transfer of Sensitive Information

   13. Acknowledgments

   14. References

   15. Authors' Addresses

   Appendix A.  Internet Media Type message/http

   Appendix B.  Tolerant Applications

   Appendix C.  Relationship to MIME
       C.1  Conversion to Canonical Form
            C.1.1  Representation of Line Breaks
            C.1.2  Default Character Set
       C.2  Conversion of Date Formats
       C.3  Introduction of Content-Encoding
       C.4  No Content-Transfer-Encoding

1.  Introduction

1.1  Purpose

   The Hypertext Transfer Protocol (HTTP) is an application-level
   protocol with the lightness and speed necessary for distributed,
   collaborative, hypermedia information systems. HTTP has been in use
   by the World-Wide Web global information initiative since 1990.
   This specification reflects common usage of the protocol referred
   to as "HTTP/1.0". This specification is not intended to become an
   Internet standard; rather, it defines those features of the HTTP
   protocol that can reasonably be expected of any implementation
   which claims to be using HTTP/1.0.

   Practical information systems require more functionality than
   simple retrieval, including search, front-end update, and
   annotation. HTTP allows an open-ended set of methods to be used to
   indicate the purpose of a request. It builds on the discipline of
   reference provided by the Uniform Resource Identifier (URI) [2], as
   a location (URL) [4] or name (URN) [16], for indicating the
   resource on which a method is to be applied. Messages are passed in
   a format similar to that used by Internet Mail [7] and the
   Multipurpose Internet Mail Extensions (MIME) [5].

   HTTP is also used as a generic protocol for communication between
   user agents and proxies/gateways to other Internet protocols, such
   as SMTP [12], NNTP [11], FTP [14], Gopher [1], and WAIS [8],
   allowing basic hypermedia access to resources available from
   diverse applications and simplifying the implementation of user
   agents.

1.2  Terminology

   This specification uses a number of terms to refer to the roles
   played by participants in, and objects of, the HTTP communication.

   connection

      A transport layer virtual circuit established between two
      application programs for the purpose of communication.

   message

      The basic unit of HTTP communication, consisting of a structured
      sequence of octets matching the syntax defined in Section 4 and
      transmitted via the connection.

   request

      An HTTP request message (as defined in Section 5).

   response

      An HTTP response message (as defined in Section 6).

   resource

      A network data object or service which can be identified by a
      URI (Section 3.2).

   entity

      A particular representation or rendition of a data resource, or
      reply from a service resource, that may be enclosed within a
      request or response message. An entity consists of
      metainformation in the form of entity headers and content in the
      form of an entity body.

   client

      An application program that establishes connections for the
      purpose of sending requests.

   user agent

      The client which initiates a request. These are often browsers,
      editors, spiders (web-traversing robots), or other end user
      tools.

   server

      An application program that accepts connections in order to
      service requests by sending back responses.

   origin server

      The server on which a given resource resides or is to be created.

   proxy

      An intermediary program which acts as both a server and a client
      for the purpose of making requests on behalf of other clients.
      Requests are serviced internally or by passing them, with
      possible translation, on to other servers. A proxy must
      interpret and, if necessary, rewrite a request message before
      forwarding it. Proxies are often used as client-side portals
      through network firewalls and as helper applications for
      handling requests via protocols not implemented by the user
      agent.

   gateway

      A server which acts as an intermediary for some other server.
      Unlike a proxy, a gateway receives requests as if it were the
      origin server for the requested resource; the requesting client
      may not be aware that it is communicating with a gateway.
      Gateways are often used as server-side portals through network
      firewalls and as protocol translators for access to resources
      stored on non-HTTP systems.

   tunnel

      A tunnel is an intermediary program which is acting as a blind
      relay between two connections. Once active, a tunnel is not
      considered a party to the HTTP communication, though the tunnel
      may have been initiated by an HTTP request. A tunnel is closed
      when both ends of the relayed connections are closed. Tunnels
      are used when a portal is necessary and the intermediary cannot,
      or should not, interpret the relayed communication.

   cache

      A program's local store of response messages and the subsystem
      that controls its message storage, retrieval, and deletion. A
      cache stores cachable responses in order to reduce the response
      time and network bandwidth consumption on future, equivalent
      requests. Any client or server may include a cache, though a
      cache cannot be used by a server while it is acting as a tunnel.

   Any given program may be capable of being both a client and a
   server; our use of these terms refers only to the role being
   performed by the program for a particular connection, rather than
   to the program's capabilities in general. Likewise, any server may
   act as an origin server, proxy, gateway, or tunnel, switching
   behavior based on the nature of each request.

1.3  Overall Operation

   The HTTP protocol is based on a request/response paradigm. A client
   establishes a connection with a server and sends a request to the
   server in the form of a request method, URI, and protocol version,
   followed by a MIME-like message containing request modifiers,
   client information, and possible body content.
   The server responds with a status line, including the message's
   protocol version and a success or error code, followed by a
   MIME-like message containing server information, entity
   metainformation, and possible body content.

   Most HTTP communication is initiated by a user agent and consists
   of a request to be applied to a resource on some origin server. In
   the simplest case, this may be accomplished via a single connection
   (v) between the user agent (UA) and the origin server (O).

          request chain ------------------------>
       UA -------------------v------------------- O
          <----------------------- response chain

   A more complicated situation occurs when one or more intermediaries
   are present in the request/response chain. There are three common
   forms of intermediary: proxy, gateway, and tunnel. A proxy is a
   forwarding agent, receiving requests for a URI in its absolute
   form, rewriting all or parts of the message, and forwarding the
   reformatted request toward the server identified by the URI. A
   gateway is a receiving agent, acting as a layer above some other
   server(s) and, if necessary, translating the requests to the
   underlying server's protocol. A tunnel acts as a relay point
   between two connections without changing the messages; tunnels are
   used when the communication needs to pass through an intermediary
   (such as a firewall) even when the intermediary cannot understand
   the contents of the messages.

          request chain -------------------------------------->
       UA -----v----- A -----v----- B -----v----- C -----v----- O
          <------------------------------------- response chain

   The figure above shows three intermediaries (A, B, and C) between
   the user agent and origin server. A request or response message
   that travels the whole chain must pass through four separate
   connections.
   This distinction is important because some HTTP
   communication options may apply only to the connection with the
   nearest, non-tunnel neighbor, only to the end-points of the chain,
   or to all connections along the chain. Although the diagram is
   linear, each participant may be engaged in multiple, simultaneous
   communications. For example, B may be receiving requests from many
   clients other than A, and/or forwarding requests to servers other
   than C, at the same time that it is handling A's request.

   Any party to the communication which is not acting as a tunnel may
   employ an internal cache for handling requests. The effect of a
   cache is that the request/response chain is shortened if one of the
   participants along the chain has a cached response applicable to
   that request. The following illustrates the resulting chain if B
   has a cached copy of an earlier response from O (via C) for a
   request which has not been cached by UA or A.

          request chain ---------->
       UA -----v----- A -----v----- B - - - - - - C - - - - - - O
          <--------- response chain

   Not all responses are cachable, and some requests may contain
   modifiers which place special requirements on cache behavior.
   Historically, HTTP/1.0 applications have not adequately defined
   what is or is not a "cachable" response.

   On the Internet, HTTP communication generally takes place over
   TCP/IP connections. The default port is TCP 80 [15], but other
   ports can be used. This does not preclude HTTP from being
   implemented on top of any other protocol on the Internet, or on
   other networks. HTTP only presumes a reliable transport; any
   protocol that provides such guarantees can be used, and the mapping
   of the HTTP/1.0 request and response structures onto the transport
   data units of the protocol in question is outside the scope of this
   specification.
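
   As an informal illustration of the request/response chains drawn
   earlier in this section, the following Python sketch lists the
   connections a request travels over, with and without a cached
   response at an intermediary. The function name and the "cached_at"
   parameter are hypothetical and are not defined by this
   specification; tunnels, which cannot use a cache, are not modelled.

```python
def connections_used(chain, cached_at=None):
    """List the (client, server) connections a request travels over.

    chain     -- participants in order, user agent first, origin server last
    cached_at -- hypothetical: the participant able to answer from its
                 cache, or None when the request must reach the origin
    """
    # The request stops at the first participant holding a cached response.
    last = len(chain) - 1 if cached_at is None else chain.index(cached_at)
    return [(chain[i], chain[i + 1]) for i in range(last)]


chain = ["UA", "A", "B", "C", "O"]
print(connections_used(chain))        # the whole chain: four connections
print(connections_used(chain, "B"))   # chain shortened at B's cache
```

   With three intermediaries the full chain uses four connections;
   when B answers from its cache, only the UA-A and A-B connections
   are used.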

   Current practice requires that the connection be established by the
   client prior to each request and closed by the server after sending
   the response. Both clients and servers must be capable of handling
   cases where either party closes the connection prematurely, due to
   user action, automated time-out, or program failure. In any case,
   the closing of the connection by either or both parties always
   terminates the current request, regardless of its status.

2.  Notational Conventions and Generic Grammar

2.1  Augmented BNF

   All of the mechanisms specified in this document are described in
   both prose and an augmented Backus-Naur Form (BNF) similar to that
   used by RFC 822 [7]. Implementors will need to be familiar with the
   notation in order to understand this specification. The augmented
   BNF includes the following constructs:

   name = definition

      The name of a rule is simply the name itself (without any
      enclosing "<" and ">") and is separated from its definition by
      the equal character "=". Whitespace is only significant in that
      indentation of continuation lines is used to indicate a rule
      definition that spans more than one line. Certain basic rules
      are in uppercase, such as SP, LWS, HT, CRLF, DIGIT, ALPHA, etc.
      Angle brackets are used within definitions whenever their
      presence will facilitate discerning the use of rule names.

   "literal"

      Quotation marks surround literal text. Unless stated otherwise,
      the text is case-insensitive.

   rule1 | rule2

      Elements separated by a bar ("|") are alternatives,
      e.g., "yes | no" will accept yes or no.

   (rule1 rule2)

      Elements enclosed in parentheses are treated as a single
      element. Thus, "(elem (foo | bar) elem)" allows the token
      sequences "elem foo elem" and "elem bar elem".

   *rule

      The character "*" preceding an element indicates repetition.
      The full form is "<n>*<m>element" indicating at least <n> and at
      most <m> occurrences of element. Default values are 0 and
      infinity so that "*(element)" allows any number, including zero;
      "1*element" requires at least one; and "1*2element" allows one
      or two.

   [rule]

      Square brackets enclose optional elements; "[foo bar]" is
      equivalent to "*1(foo bar)".

   N rule

      Specific repetition: "<n>(element)" is equivalent to
      "<n>*<n>(element)"; that is, exactly <n> occurrences of
      (element). Thus 2DIGIT is a 2-digit number, and 3ALPHA is a
      string of three alphabetic characters.

   #rule

      A construct "#" is defined, similar to "*", for defining lists
      of elements. The full form is "<n>#<m>element" indicating at
      least <n> and at most <m> elements, each separated by one or
      more commas (",") and optional linear whitespace (LWS). This
      makes the usual form of lists very easy; a rule such as
      "( *LWS element *( *LWS "," *LWS element ))" can be shown as
      "1#element". Wherever this construct is used, null elements are
      allowed, but do not contribute to the count of elements present.
      That is, "(element), , (element)" is permitted, but counts as
      only two elements. Therefore, where at least one element is
      required, at least one non-null element must be present. Default
      values are 0 and infinity so that "#(element)" allows any
      number, including zero; "1#element" requires at least one; and
      "1#2element" allows one or two.

   ; comment

      A semi-colon, set off some distance to the right of rule text,
      starts a comment that continues to the end of line. This is a
      simple way of including useful notes in parallel with the
      specifications.

   implied *LWS

      The grammar described by this specification is word-based.
      Except where noted otherwise, linear whitespace (LWS) can be
      included between any two adjacent words (token or
      quoted-string), and between adjacent tokens and delimiters
      (tspecials), without changing the interpretation of a field.
      However, applications should attempt to follow "common form"
      when generating HTTP constructs, since there exist some
      implementations that fail to accept anything beyond the common
      forms.

2.2  Basic Rules

   The following rules are used throughout this specification to
   describe basic parsing constructs. The US-ASCII coded character set
   is defined by [17].

       OCTET          = <any 8-bit sequence of data>
       CHAR           = <any US-ASCII character (octets 0 - 127)>
       UPALPHA        = <any US-ASCII uppercase letter "A".."Z">
       LOALPHA        = <any US-ASCII lowercase letter "a".."z">
       ALPHA          = UPALPHA | LOALPHA
       DIGIT          = <any US-ASCII digit "0".."9">
       CTL            = <any US-ASCII control character
                        (octets 0 - 31) and DEL (127)>
       CR             = <US-ASCII CR, carriage return (13)>
       LF             = <US-ASCII LF, linefeed (10)>
       SP             = <US-ASCII SP, space (32)>
       HT             = <US-ASCII HT, horizontal-tab (9)>
       <">            = <US-ASCII double-quote mark (34)>

   HTTP/1.0 defines the octet sequence CR LF as the end-of-line marker
   for all protocol elements except the Entity-Body (see Appendix B
   for tolerant applications). The end-of-line marker within an
   Entity-Body is defined by its associated media type, as described
   in Section 3.6.

       CRLF           = CR LF

   HTTP/1.0 headers may be folded onto multiple lines if each
   continuation line begins with a space or horizontal tab. All linear
   whitespace, including folding, has the same semantics as SP.

       LWS            = [CRLF] 1*( SP | HT )

   However, folding of header lines is not expected by some
   applications, and should not be generated by HTTP/1.0 applications.

   The TEXT rule is only used for descriptive field contents and
   values that are not intended to be interpreted by the message
   parser. Words of *TEXT may contain octets from character sets other
   than US-ASCII.

       TEXT           = <any OCTET except CTLs,
                        but including LWS>

   Recipients of header field TEXT containing octets outside the
   US-ASCII character set may assume that they represent ISO-8859-1
   characters.

   Many HTTP/1.0 header field values consist of words separated by LWS
   or special characters.
   These special characters must be in a quoted
   string to be used within a parameter value.

       word           = token | quoted-string

       token          = 1*<any CHAR except CTLs or tspecials>

       tspecials      = "(" | ")" | "<" | ">" | "@"
                      | "," | ";" | ":" | "\" | <">
                      | "/" | "[" | "]" | "?" | "="
                      | "{" | "}" | SP | HT

   Comments may be included in some HTTP header fields by surrounding
   the comment text with parentheses. Comments are only allowed in
   fields containing "comment" as part of their field value definition.

       comment        = "(" *( ctext | comment ) ")"
       ctext          = <any TEXT excluding "(" and ")">

   A string of text is parsed as a single word if it is quoted using
   double-quote marks.

       quoted-string  = ( <"> *(qdtext) <"> )

       qdtext         = <any CHAR except <"> and CTLs,
                        but including LWS>

   Single-character quoting using the backslash ("\") character is not
   permitted in HTTP/1.0.

3.  Protocol Parameters

3.1  HTTP Version

   HTTP uses a "<major>.<minor>" numbering scheme to indicate versions
   of the protocol. The protocol versioning policy is intended to
   allow the sender to indicate the format of a message and its
   capacity for understanding further HTTP communication, rather than
   the features obtained via that communication. No change is made to
   the version number for the addition of message components which do
   not affect communication behavior or which only add to extensible
   field values. The <minor> number is incremented when the changes
   made to the protocol add features which do not change the general
   message parsing algorithm, but which may add to the message
   semantics and imply additional capabilities of the sender. The
   <major> number is incremented when the format of a message within
   the protocol is changed.

   The version of an HTTP message is indicated by an HTTP-Version
   field in the first line of the message. If the protocol version is
   not specified, the recipient must assume that the message is in the
   simple HTTP/0.9 format.

       HTTP-Version   = "HTTP" "/" 1*DIGIT "." 1*DIGIT

   Note that the major and minor numbers should be treated as separate
   integers and that each may be incremented higher than a single
   digit. Thus, HTTP/2.4 is a lower version than HTTP/2.13, which in
   turn is lower than HTTP/12.3. Leading zeros should be ignored by
   recipients and never generated by senders.

   This document defines both the 0.9 and 1.0 versions of the HTTP
   protocol. Applications sending Full-Request or Full-Response
   messages, as defined by this specification, must include an
   HTTP-Version of "HTTP/1.0".

   HTTP/1.0 servers must:

      o recognize the format of the Request-Line for HTTP/0.9 and
        HTTP/1.0 requests;

      o understand any valid request in the format of HTTP/0.9 or
        HTTP/1.0;

      o respond appropriately with a message in the same protocol
        version used by the client.

   HTTP/1.0 clients must:

      o recognize the format of the Status-Line for HTTP/1.0 responses;

      o understand any valid response in the format of HTTP/0.9 or
        HTTP/1.0.

   Proxy and gateway applications must be careful in forwarding
   requests that are received in a format different than that of the
   application's native HTTP version. Since the protocol version
   indicates the protocol capability of the sender, a proxy/gateway
   must never send a message with a version indicator which is greater
   than its native version; if a higher version request is received,
   the proxy/gateway must either downgrade the request version or
   respond with an error. Requests with a version lower than that of
   the application's native format may be upgraded before being
   forwarded; the proxy/gateway's response to that request must follow
   the normal server requirements.
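
   The version comparison rule in this section can be sketched in
   Python as follows. This is an illustration only: the function names
   are hypothetical, and the default native version of (1, 0) is an
   assumption about the application running the check. The major and
   minor numbers are parsed as separate integers, so leading zeros
   disappear and tuple comparison gives the ordering described above.

```python
import re

# Mirrors: HTTP-Version = "HTTP" "/" 1*DIGIT "." 1*DIGIT
# Literals in the augmented BNF are case-insensitive unless stated otherwise.
_HTTP_VERSION = re.compile(r"HTTP/([0-9]+)\.([0-9]+)", re.IGNORECASE)


def parse_http_version(text):
    """Return (major, minor) as integers, or None if text does not match."""
    match = _HTTP_VERSION.fullmatch(text)
    if match is None:
        return None
    # int() discards leading zeros, so "HTTP/01.00" yields (1, 0).
    return int(match.group(1)), int(match.group(2))


def exceeds_native(received, native=(1, 0)):
    """True when a received version is higher than the native version,
    the case in which a proxy/gateway must downgrade or report an error."""
    return received > native


print(parse_http_version("HTTP/2.4") < parse_http_version("HTTP/2.13"))
print(exceeds_native(parse_http_version("HTTP/1.1")))
```

   Comparing the versions as strings or as decimal fractions would
   order HTTP/2.13 below HTTP/2.4, which is why the two numbers are
   kept as an integer pair.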

3.2  Uniform Resource Identifiers

   URIs have been known by many names: WWW addresses, Universal
   Document Identifiers, Universal Resource Identifiers [2], and
   finally the combination of Uniform Resource Locators (URL) [4] and
   Names (URN) [16]. As far as HTTP is concerned, Uniform Resource
   Identifiers are simply formatted strings which identify--via name,
   location, or any other characteristic--a network resource.

3.2.1  General Syntax

   URIs in HTTP/1.0 can be represented in absolute form or relative to
   some known base URI [9], depending upon the context of their use.
   The two forms are differentiated by the fact that absolute URIs
   always begin with a scheme name followed by a colon.

       URI            = ( absoluteURI | relativeURI ) [ "#" fragment ]

       absoluteURI    = scheme ":" *( uchar | reserved )

       relativeURI    = net_path | abs_path | rel_path

       net_path       = "//" net_loc [ abs_path ]
       abs_path       = "/" rel_path
       rel_path       = [ path ] [ ";" params ] [ "?" query ]

       path           = fsegment *( "/" segment )
       fsegment       = 1*pchar
       segment        = *pchar

       params         = param *( ";" param )
       param          = *( pchar | "/" )

       scheme         = 1*( ALPHA | DIGIT | "+" | "-" | "." )
       net_loc        = *( pchar | ";" | "?" )
       query          = *( uchar | reserved )
       fragment       = *( uchar | reserved )

       pchar          = uchar | ":" | "@" | "&" | "="
       uchar          = unreserved | escape
       unreserved     = ALPHA | DIGIT | safe | extra | national

       escape         = "%" hex hex
       hex            = "A" | "B" | "C" | "D" | "E" | "F"
                      | "a" | "b" | "c" | "d" | "e" | "f" | DIGIT

       reserved       = ";" | "/" | "?" | ":" | "@" | "&" | "="
       safe           = "$" | "-" | "_" | "." | "+"
       extra          = "!" | "*" | "'" | "(" | ")" | ","
       national       =

   For definitive information on URL syntax and semantics, see RFC
   1738 [4] and RFC 1808 [9].
   The BNF above includes national
   characters not allowed in valid URLs as specified by RFC 1738,
   since HTTP servers are not restricted in the set of unreserved
   characters allowed to represent the rel_path part of addresses, and
   HTTP proxies may receive requests for URIs not defined by RFC 1738.

3.2.2  http URL

   The "http" scheme is used to locate network resources via the HTTP
   protocol. This section defines the scheme-specific syntax and
   semantics for http URLs.

       http_URL       = "http:" "//" host [ ":" port ] abs_path

       host           =

       port           = *DIGIT

   If the port is empty or not given, port 80 is assumed. The
   semantics are that the identified resource is located at the server
   listening for TCP connections on that port of that host, and the
   Request-URI for the resource is abs_path. If the abs_path is not
   present in the URL, it must be given as "/" when used as a
   Request-URI.

      Note: Although the HTTP protocol is independent of the
      transport layer protocol, the http URL only identifies
      resources by their TCP location, and thus non-TCP resources
      must be identified by some other URI scheme.

   The canonical form for "http" URLs is obtained by converting any
   UPALPHA characters in host to their LOALPHA equivalent (hostnames
   are case-insensitive), eliding the [ ":" port ] if the port is 80,
   and replacing an empty abs_path with "/".

3.3  Date/Time Formats

   HTTP/1.0 applications have historically allowed three different
   formats for the representation of date/time stamps:

       Sun, 06 Nov 1994 08:49:37 GMT    ; RFC 822, updated by RFC 1123
       Sunday, 06-Nov-94 08:49:37 GMT   ; RFC 850, obsoleted by RFC 1036
       Sun Nov  6 08:49:37 1994         ; ANSI C's asctime() format

   The first format is preferred as an Internet standard and
   represents a fixed-length subset of that defined by RFC 1123 [6]
   (an update to RFC 822 [7]).
The second format is in common use, but 717 is based on the obsolete RFC 850 [10] date format and lacks a 718 four-digit year. HTTP/1.0 clients and servers that parse the date 719 value should accept all three formats, though they must never 720 generate the third (asctime) format. 722 Note: Recipients of date values are encouraged to be robust 723 in accepting date values that may have been generated by 724 non-HTTP applications, as is sometimes the case when 725 retrieving or posting messages via proxies/gateways to SMTP 726 or NNTP. 728 All HTTP/1.0 date/time stamps must be represented in Universal Time 729 (UT), also known as Greenwich Mean Time (GMT), without exception. 730 This is indicated in the first two formats by the inclusion of 731 "GMT" as the three-letter abbreviation for time zone, and should be 732 assumed when reading the asctime format. 734 HTTP-date = rfc1123-date | rfc850-date | asctime-date 736 rfc1123-date = wkday "," SP date1 SP time SP "GMT" 737 rfc850-date = weekday "," SP date2 SP time SP "GMT" 738 asctime-date = wkday SP date3 SP time SP 4DIGIT 740 date1 = 2DIGIT SP month SP 4DIGIT 741 ; day month year (e.g., 02 Jun 1982) 742 date2 = 2DIGIT "-" month "-" 2DIGIT 743 ; day-month-year (e.g., 02-Jun-82) 744 date3 = month SP ( 2DIGIT | ( SP 1DIGIT )) 745 ; month day (e.g., Jun 2) 747 time = 2DIGIT ":" 2DIGIT ":" 2DIGIT 748 ; 00:00:00 - 23:59:59 750 wkday = "Mon" | "Tue" | "Wed" 751 | "Thu" | "Fri" | "Sat" | "Sun" 753 weekday = "Monday" | "Tuesday" | "Wednesday" 754 | "Thursday" | "Friday" | "Saturday" | "Sunday" 756 month = "Jan" | "Feb" | "Mar" | "Apr" 757 | "May" | "Jun" | "Jul" | "Aug" 758 | "Sep" | "Oct" | "Nov" | "Dec" 760 Note: HTTP/1.0 requirements for the date/time stamp format 761 apply only to their usage within the protocol stream. 762 Clients and servers are not required to use these formats 763 for user presentation, request logging, etc. 
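The acceptance rule above (parse all three formats, generate only the preferred one) can be sketched in Python as follows. This is a minimal illustration, not part of the specification: the names parse_http_date and format_http_date are hypothetical, the sketch assumes the process runs under the default "C" locale so that %a, %A and %b match English names, and the century chosen for a two-digit RFC 850 year is whatever strptime's %y directive decides.

```python
from datetime import datetime, timezone

# strptime patterns for the three formats named in Section 3.3.
# %a/%A/%b match English names only under the default "C" locale.
_HTTP_DATE_FORMATS = (
    "%a, %d %b %Y %H:%M:%S GMT",   # rfc1123-date (preferred)
    "%A, %d-%b-%y %H:%M:%S GMT",   # rfc850-date (two-digit year)
    "%a %b %d %H:%M:%S %Y",        # asctime-date (no zone; GMT assumed)
)
_RFC1123_FORMAT = _HTTP_DATE_FORMATS[0]


def parse_http_date(value):
    """Return an aware UTC datetime, or raise ValueError."""
    text = value.strip()
    for fmt in _HTTP_DATE_FORMATS:
        try:
            parsed = datetime.strptime(text, fmt)
        except ValueError:
            continue  # try the next accepted format
        return parsed.replace(tzinfo=timezone.utc)
    raise ValueError("not an HTTP-date: %r" % value)


def format_http_date(moment):
    """Format an aware datetime in the preferred RFC 1123 form."""
    return moment.astimezone(timezone.utc).strftime(_RFC1123_FORMAT)
```

A recipient would call parse_http_date on any received date value, while a sender would only ever emit the output of format_http_date, which keeps the asctime form out of generated messages.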
765 3.4 Character Sets 767 HTTP uses the same definition of the term "character set" as that 768 described for MIME: 770 The term "character set" is used in this document to 771 refer to a method used with one or more tables to convert 772 a sequence of octets into a sequence of characters. Note 773 that unconditional conversion in the other direction is 774 not required, in that not all characters may be available 775 in a given character set and a character set may provide 776 more than one sequence of octets to represent a 777 particular character. This definition is intended to 778 allow various kinds of character encodings, from simple 779 single-table mappings such as US-ASCII to complex table 780 switching methods such as those that use ISO 2022's 781 techniques. However, the definition associated with a 782 MIME character set name must fully specify the mapping to 783 be performed from octets to characters. In particular, 784 use of external profiling information to determine the 785 exact mapping is not permitted. 787 HTTP character sets are identified by case-insensitive tokens. The 788 complete set of tokens is defined by the IANA Character Set 789 registry [15]. However, because that registry does not define a 790 single, consistent token for each character set, we define here the 791 preferred names for those character sets most likely to be used 792 with HTTP entities. These character sets include those registered 793 by RFC 1521 [5] -- the US-ASCII [17] and ISO-8859 [18] character 794 sets -- and other names specifically recommended for use within MIME 795 charset parameters.
797 charset = "US-ASCII" 798 | "ISO-8859-1" | "ISO-8859-2" | "ISO-8859-3" 799 | "ISO-8859-4" | "ISO-8859-5" | "ISO-8859-6" 800 | "ISO-8859-7" | "ISO-8859-8" | "ISO-8859-9" 801 | "ISO-2022-JP" | "ISO-2022-JP-2" | "ISO-2022-KR" 802 | "UNICODE-1-1" | "UNICODE-1-1-UTF-7" | "UNICODE-1-1-UTF-8" 803 | token 805 Although HTTP allows an arbitrary token to be used as a charset 806 value, any token that has a predefined value within the IANA 807 Character Set registry [15] must represent the character set 808 defined by that registry. Applications should limit their use of 809 character sets to those defined by the IANA registry. 811 Note: This use of the term "character set" is more commonly 812 referred to as a "character encoding." However, since HTTP 813 and MIME share the same registry, it is important that the 814 terminology also be shared. 816 3.5 Content Codings 818 Content coding values are used to indicate an encoding 819 transformation that has been applied to a resource. Content codings 820 are primarily used to allow a document to be compressed or 821 encrypted without losing the identity of its underlying media type. 822 Typically, the resource is stored in this encoding and only decoded 823 before rendering or analogous usage. 825 content-coding = "x-gzip" | "x-compress" | token 827 Note: For future compatibility, HTTP/1.0 applications should 828 consider "gzip" and "compress" to be equivalent to "x-gzip" 829 and "x-compress", respectively. 831 All content-coding values are case-insensitive. HTTP/1.0 uses 832 content-coding values in the Content-Encoding (Section 10.3) header 833 field. Although the value describes the content-coding, what is 834 more important is that it indicates what decoding mechanism will be 835 required to remove the encoding. Note that a single program may be 836 capable of decoding multiple content-coding formats. 
Two values are 837 defined by this specification: 839 x-gzip 840 An encoding format produced by the file compression program 841 "gzip" (GNU zip) developed by Jean-loup Gailly. This format is 842 typically a Lempel-Ziv coding (LZ77) with a 32 bit CRC. Gzip is 843 available from the GNU project at 844 . 846 x-compress 847 The encoding format produced by the file compression program 848 "compress". This format is an adaptive Lempel-Ziv-Welch coding 849 (LZW). 851 Note: Use of program names for the identification of 852 encoding formats is not desirable and should be discouraged 853 for future encodings. Their use here is representative of 854 historical practice, not good design. 856 3.6 Media Types 858 HTTP uses Internet Media Types [13] in the Content-Type header 859 field (Section 10.5) in order to provide open and extensible data 860 typing. For mail applications, where there is no type negotiation 861 between sender and recipient, it is reasonable to put strict limits 862 on the set of allowed media types. With HTTP, where the sender and 863 recipient can communicate directly, applications are allowed more 864 freedom in the use of non-registered types. The following grammar 865 for media types is a superset of that for MIME because it does not 866 restrict itself to the official IANA and x-token types. 868 media-type = type "/" subtype *( ";" parameter ) 869 type = token 870 subtype = token 872 Parameters may follow the type/subtype in the form of 873 attribute/value pairs. 875 parameter = attribute "=" value 876 attribute = token 877 value = token | quoted-string 879 The type, subtype, and parameter attribute names are 880 case-insensitive. Parameter values may or may not be 881 case-sensitive, depending on the semantics of the parameter name. 882 LWS must not be generated between the type and subtype, nor between 883 an attribute and its value. 885 Many current applications do not recognize media type parameters. 
886 Since parameters are a fundamental aspect of media types, this must 887 be considered an error in those applications. Nevertheless, 888 HTTP/1.0 applications should only use media type parameters when 889 they are necessary to define the content of a message. 891 If a given media-type value has been registered by the IANA, any 892 use of that value must be indicative of the registered data format. 893 Although HTTP allows the use of non-registered media types, such 894 usage must not conflict with the IANA registry. Data providers are 895 strongly encouraged to register their media types with IANA via the 896 procedures outlined in RFC 1590 [13]. 898 All media types registered by IANA must be preferred over 899 extension tokens. However, HTTP does not limit applications to the 900 use of officially registered media types, nor does it encourage the 901 use of an "x-" prefix for unofficial types outside of explicitly 902 short experimental use between consenting applications. 904 3.6.1 Canonicalization and Text Defaults 906 Media types are registered in a canonical form. In general, entity 907 bodies transferred via HTTP must be represented in the appropriate 908 canonical form prior to transmission. If the body has been encoded 909 via a Content-Encoding, the data must be in canonical form prior to 910 that encoding. However, HTTP modifies the canonical form 911 requirements for media of primary type "text" and for "application" 912 types consisting of text-like records. 914 HTTP redefines the canonical form of text media to allow multiple 915 octet sequences to indicate a text line break. In addition to the 916 preferred form of CRLF, HTTP applications must accept a bare CR or 917 LF alone as representing a single line break in text media.
918 Furthermore, if the text media is represented in a character set 919 which does not use octets 13 and 10 for CR and LF respectively, as 920 is the case for some multi-byte character sets, HTTP allows the use 921 of whatever octet sequence(s) is defined by that character set to 922 represent the equivalent of CRLF, bare CR, and bare LF. It is 923 assumed that any recipient capable of using such a character set 924 will know the appropriate octet sequence for representing line 925 breaks within that character set. 927 Note: This interpretation of line breaks applies only to the 928 contents of an Entity-Body and only after any 929 Content-Encoding has been removed. All other HTTP constructs 930 use CRLF exclusively to indicate a line break. Content 931 codings define their own line break requirements. 933 A recipient of an HTTP text entity should translate the received 934 entity line breaks to the local line break conventions before 935 saving the entity external to the application and its cache; 936 whether this translation takes place immediately upon receipt of 937 the entity, or only when prompted by the user, is entirely up to 938 the individual application. 940 HTTP also redefines the default character set for text media in an 941 entity body. If a textual media type defines a charset parameter 942 with a registered default value of "US-ASCII", HTTP changes the 943 default to be "ISO-8859-1". Since the ISO-8859-1 [18] character set 944 is a superset of US-ASCII [17], this has no effect upon the 945 interpretation of entity bodies which only contain octets within 946 the US-ASCII set (0 - 127). The presence of a charset parameter 947 value in a Content-Type header field overrides the default. 949 It is recommended that the character set of an entity body be 950 labelled as the lowest common denominator of the character codes 951 used within a document, with the exception that no label is 952 preferred over the labels US-ASCII or ISO-8859-1. 
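The two text defaults described in this section (ISO-8859-1 when no charset parameter is given, and CRLF, bare CR, or bare LF each counting as one line break) can be combined in a short Python sketch. The function name decode_text_entity and its parameters are hypothetical, and the sketch assumes that any Content-Encoding has already been removed and that the charset value is a name Python's codec registry recognizes. Decoding before looking for line breaks is a design choice made here so that character sets which do not use octets 13 and 10 for CR and LF are still handled.

```python
import re

DEFAULT_TEXT_CHARSET = "iso-8859-1"      # HTTP default for text media
# CRLF is listed first so that it counts as a single line break.
_LINE_BREAK = re.compile(r"\r\n|\r|\n")


def decode_text_entity(body, charset=None, local_newline="\n"):
    """Decode a text entity body and convert its line breaks.

    `body` is the raw entity body as bytes; `charset` is the value of
    the charset parameter from Content-Type, or None if absent.
    """
    text = body.decode(charset or DEFAULT_TEXT_CHARSET)
    return _LINE_BREAK.sub(local_newline, text)
```

For example, decode_text_entity(b"a\r\nb\rc\nd") yields four lines separated by the local newline, and an explicit charset argument overrides the ISO-8859-1 default.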
954 3.6.2 Multipart Types 956 MIME provides for a number of "multipart" types -- encapsulations of 957 several entities within a single message's Entity-Body. The 958 multipart types registered by IANA [15] do not have any special 959 meaning for HTTP/1.0, though user agents may need to understand 960 each type in order to correctly interpret the purpose of each 961 body-part. Ideally, an HTTP user agent should follow the same or 962 similar behavior as a MIME user agent does upon receipt of a 963 multipart type. 965 As in MIME [5], all multipart types share a common syntax and must 966 include a boundary parameter as part of the media type value. The 967 message body is itself a protocol element and must therefore use 968 only CRLF to represent line breaks between body-parts. Unlike in 969 MIME, multipart body-parts may contain HTTP header fields which are 970 significant to the meaning of that part. 972 3.7 Product Tokens 974 Product tokens are used to allow communicating applications to 975 identify themselves via a simple product token, with an optional 976 slash and version designator. Most fields using product tokens also 977 allow subproducts which form a significant part of the application 978 to be listed, separated by whitespace. By convention, the products 979 are listed in order of their significance for identifying the 980 application. 982 product = token ["/" product-version] 983 product-version = token 985 Examples: 987 User-Agent: CERN-LineMode/2.15 libwww/2.17b3 989 Server: Apache/0.8.4 991 Product tokens should be short and to the point -- use of them for 992 advertising or other non-essential information is explicitly 993 forbidden. Although any token character may appear in a 994 product-version, this token should only be used for a version 995 identifier (i.e., successive versions of the same product should 996 only differ in the product-version portion of the product value). 998 4.
HTTP Message 1000 4.1 Message Types 1002 HTTP messages consist of requests from client to server and 1003 responses from server to client. 1005 HTTP-message = Simple-Request ; HTTP/0.9 messages 1006 | Simple-Response 1007 | Full-Request ; HTTP/1.0 messages 1008 | Full-Response 1010 Full-Request and Full-Response use the generic message format of 1011 RFC 822 [7] for transferring entities. Both messages may include 1012 optional header fields (also known as "headers") and an entity 1013 body. The entity body is separated from the headers by a null line 1014 (i.e., a line with nothing preceding the CRLF). 1016 Full-Request = Request-Line ; Section 5.1 1017 *( General-Header ; Section 4.3 1018 | Request-Header ; Section 5.2 1019 | Entity-Header ) ; Section 7.1 1020 CRLF 1021 [ Entity-Body ] ; Section 7.2 1023 Full-Response = Status-Line ; Section 6.1 1024 *( General-Header ; Section 4.3 1025 | Response-Header ; Section 6.2 1026 | Entity-Header ) ; Section 7.1 1027 CRLF 1028 [ Entity-Body ] ; Section 7.2 1030 Simple-Request and Simple-Response do not allow the use of any 1031 header information and are limited to a single request method (GET). 1033 Simple-Request = "GET" SP Request-URI CRLF 1035 Simple-Response = [ Entity-Body ] 1037 Use of the Simple-Request format is discouraged because it prevents 1038 the server from identifying the media type of the returned entity. 1040 4.2 Message Headers 1042 HTTP header fields, which include General-Header (Section 4.3), 1043 Request-Header (Section 5.2), Response-Header (Section 6.2), and 1044 Entity-Header (Section 7.1) fields, follow the same generic format 1045 as that given in Section 3.1 of RFC 822 [7]. Each header field 1046 consists of a name followed immediately by a colon (":"), a single 1047 space (SP) character, and the field value. Field names are 1048 case-insensitive. 
Header fields can be extended over multiple lines 1049 by preceding each extra line with at least one SP or HT, though 1050 this is not recommended. 1052 HTTP-header = field-name ":" [ field-value ] CRLF 1054 field-name = token 1055 field-value = *( field-content | LWS ) 1057 field-content = 1061 The order in which header fields are received is not significant. 1062 However, it is "good practice" to send General-Header fields first, 1063 followed by Request-Header or Response-Header fields prior to the 1064 Entity-Header fields. 1066 Multiple HTTP-header fields with the same field-name may be present 1067 in a message if and only if the entire field-value for that header 1068 field is defined as a comma-separated list [i.e., #(values)]. It 1069 must be possible to combine the multiple header fields into one 1070 "field-name: field-value" pair, without changing the semantics of 1071 the message, by appending each subsequent field-value to the first, 1072 each separated by a comma. 1074 4.3 General Header Fields 1076 There are a few header fields which have general applicability for 1077 both request and response messages, but which do not apply to the 1078 entity being transferred. These headers apply only to the message 1079 being transmitted. 1081 General-Header = Date ; Section 10.6 1082 | MIME-Version ; Section 10.12 1083 | Pragma ; Section 10.13 1085 General header field names can be extended reliably only in 1086 combination with a change in the protocol version. However, new or 1087 experimental header fields may be given the semantics of general 1088 header fields if all parties in the communication recognize them to 1089 be general header fields. Unknown header fields are treated as 1090 Entity-Header fields. 1092 5. Request 1094 A request message from a client to a server includes, within the 1095 first line of that message, the method to be applied to the 1096 resource, the identifier of the resource, and the protocol version 1097 in use. 
For backwards compatibility with the more limited HTTP/0.9 1098 protocol, there are two valid formats for an HTTP request: 1100 Request = Simple-Request | Full-Request 1102 Simple-Request = "GET" SP Request-URI CRLF 1104 Full-Request = Request-Line ; Section 5.1 1105 *( General-Header ; Section 4.3 1106 | Request-Header ; Section 5.2 1107 | Entity-Header ) ; Section 7.1 1108 CRLF 1109 [ Entity-Body ] ; Section 7.2 1111 If an HTTP/1.0 server receives a Simple-Request, it must respond 1112 with an HTTP/0.9 Simple-Response. An HTTP/1.0 client capable of 1113 receiving a Full-Response should never generate a Simple-Request. 1115 5.1 Request-Line 1117 The Request-Line begins with a method token, followed by the 1118 Request-URI and the protocol version, and ending with CRLF. The 1119 elements are separated by SP characters. No CR or LF are allowed 1120 except in the final CRLF sequence. 1122 Request-Line = Method SP Request-URI SP HTTP-Version CRLF 1124 Note that the difference between a Simple-Request and the 1125 Request-Line of a Full-Request is the presence of the HTTP-Version 1126 field and the availability of methods other than GET. 1128 5.1.1 Method 1130 The Method token indicates the method to be performed on the 1131 resource identified by the Request-URI. The method is 1132 case-sensitive. 1134 Method = "GET" ; Section 8.1 1135 | "HEAD" ; Section 8.2 1136 | "POST" ; Section 8.3 1137 | extension-method 1139 extension-method = token 1141 The list of methods acceptable by a specific resource can change 1142 dynamically; the client is notified through the return code of the 1143 response if a method is not allowed on a resource. Servers should 1144 return the status code 501 (not implemented) if the method is 1145 unknown or not implemented. 1147 The methods commonly used by HTTP/1.0 applications are fully 1148 defined in Section 8. 
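As an illustration of the grammar in Sections 5 and 5.1, the following Python sketch separates an HTTP/0.9 Simple-Request from the Request-Line of a Full-Request and applies the 501 rule for unknown methods. The names parse_request_line, status_for_method, and KNOWN_METHODS are hypothetical, and the sketch checks only the overall shape of the line; it does not validate the Request-URI or the digits of the HTTP-Version.

```python
KNOWN_METHODS = {"GET", "HEAD", "POST"}   # methods defined in Section 8


def parse_request_line(line):
    """Split one request line into (method, request_uri, http_version).

    http_version is None for an HTTP/0.9 Simple-Request.
    Raises ValueError for lines that fit neither grammar.
    """
    if not line.endswith("\r\n"):
        raise ValueError("request line must end with CRLF")
    core = line[:-2]
    if "\r" in core or "\n" in core:
        raise ValueError("CR or LF allowed only in the final CRLF")
    parts = core.split(" ")               # elements are separated by SP
    if len(parts) == 2 and parts[0] == "GET" and parts[1]:
        return "GET", parts[1], None      # Simple-Request
    if len(parts) == 3 and all(parts) and parts[2].startswith("HTTP/"):
        return parts[0], parts[1], parts[2]   # Request-Line
    raise ValueError("malformed request line: %r" % line)


def status_for_method(method):
    """Return 501 for methods this sketch does not implement, else None."""
    # The comparison is case-sensitive, so "get" is treated as an
    # unknown extension-method rather than as GET.
    return None if method in KNOWN_METHODS else 501
```

A server following this sketch would answer with a Simple-Response whenever the returned version is None, and with a Full-Response otherwise.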
1150 5.1.2 Request-URI 1152 The Request-URI is a Uniform Resource Identifier (Section 3.2) and 1153 identifies the resource upon which to apply the request. 1155 Request-URI = absoluteURI | abs_path 1157 The two options for Request-URI are dependent on the nature of the 1158 request. 1160 The absoluteURI form is only allowed when the request is being made 1161 to a proxy. The proxy is requested to forward the request and 1162 return the response. If the request is GET or HEAD and a prior 1163 response is cached, the proxy may use the cached message if it 1164 passes any restrictions in the Expires header field. Note that the 1165 proxy may forward the request on to another proxy or directly to 1166 the server specified by the absoluteURI. In order to avoid request 1167 loops, a proxy must be able to recognize all of its server names, 1168 including any aliases, local variations, and the numeric IP 1169 address. An example Request-Line would be: 1171 GET http://www.w3.org/hypertext/WWW/TheProject.html HTTP/1.0 1173 The most common form of Request-URI is that used to identify a 1174 resource on an origin server or gateway. In this case, only the 1175 absolute path of the URI is transmitted (see Section 3.2.1, 1176 abs_path). For example, a client wishing to retrieve the resource 1177 above directly from the origin server would create a TCP connection 1178 to port 80 of the host "www.w3.org" and send the line: 1180 GET /hypertext/WWW/TheProject.html HTTP/1.0 1182 followed by the remainder of the Full-Request. Note that the 1183 absolute path cannot be empty; if none is present in the original 1184 URI, it must be given as "/" (the server root). 1186 The Request-URI is transmitted as an encoded string, where some 1187 characters may be escaped using the "% hex hex" encoding defined by 1188 RFC 1738 [4]. The origin server must decode the Request-URI in 1189 order to properly interpret the request. 
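The choice between the two Request-URI forms can be sketched in Python using the standard urllib.parse module. The function names origin_endpoint, request_uri, and decode_path are hypothetical. Decoding each "%" hex hex escape to a single ISO-8859-1 character is an assumption of this sketch, since the text above does not say which character set the decoded octets represent.

```python
from urllib.parse import unquote, urlsplit


def _split_http_url(url):
    parts = urlsplit(url)
    if parts.scheme.lower() != "http" or not parts.hostname:
        raise ValueError("not an http URL: %r" % url)
    return parts


def origin_endpoint(url):
    """Return (host, port) of the origin server; port 80 if none given."""
    parts = _split_http_url(url)
    return parts.hostname, parts.port or 80   # hostname is lower-cased


def request_uri(url, via_proxy=False):
    """Return the Request-URI to send for `url`."""
    parts = _split_http_url(url)
    if via_proxy:
        return url.split("#", 1)[0]           # absoluteURI form
    abs_path = parts.path or "/"              # the path is never empty
    if parts.query:
        abs_path += "?" + parts.query
    return abs_path


def decode_path(abs_path):
    """Undo "%" hex hex escapes in the path part of a Request-URI."""
    path = abs_path.split("?", 1)[0]
    # Assumption: one escape decodes to one ISO-8859-1 character.
    return unquote(path, encoding="iso-8859-1")
```

With the example URL above, request_uri returns "/hypertext/WWW/TheProject.html" for a direct connection to the endpoint ("www.w3.org", 80), and the unchanged absolute URL when via_proxy is true.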
1191 5.2 Request Header Fields 1193 The request header fields allow the client to pass additional 1194 information about the request, and about the client itself, to the 1195 server. All header fields are optional and conform to the generic 1196 HTTP-header syntax. 1198 Request-Header = Authorization ; Section 10.2 1199 | From ; Section 10.8 1200 | If-Modified-Since ; Section 10.9 1201 | Referer ; Section 10.14 1202 | User-Agent ; Section 10.16 1204 Request-Header field names can be extended reliably only in 1205 combination with a change in the protocol version. However, new or 1206 experimental header fields may be given the semantics of request 1207 header fields if all parties in the communication recognize them to 1208 be request header fields. Unknown header fields are treated as 1209 Entity-Header fields. 1211 6. Response 1213 After receiving and interpreting a request message, a server 1214 responds in the form of an HTTP response message. 1216 Response = Simple-Response | Full-Response 1218 Simple-Response = [ Entity-Body ] 1220 Full-Response = Status-Line ; Section 6.1 1221 *( General-Header ; Section 4.3 1222 | Response-Header ; Section 6.2 1223 | Entity-Header ) ; Section 7.1 1224 CRLF 1225 [ Entity-Body ] ; Section 7.2 1227 A Simple-Response should only be sent in response to an HTTP/0.9 1228 Simple-Request or if the server only supports the more limited 1229 HTTP/0.9 protocol. If a client sends an HTTP/1.0 Full-Request and 1230 receives a response that does not begin with a Status-Line, it 1231 should assume that the response is a Simple-Response and parse it 1232 accordingly. Note that the Simple-Response consists only of the 1233 entity body and is terminated by the server closing the connection. 1235 6.1 Status-Line 1237 The first line of a Full-Response message is the Status-Line, 1238 consisting of the protocol version followed by a numeric status 1239 code and its associated textual phrase, with each element separated 1240 by SP characters. 
No CR or LF is allowed except in the final CRLF 1241 sequence. 1243 Status-Line = HTTP-Version SP Status-Code SP Reason-Phrase CRLF 1245 Since a status line always begins with the protocol version and 1246 status code 1248 "HTTP/" 1*DIGIT "." 1*DIGIT SP 3DIGIT SP 1250 (e.g., "HTTP/1.0 200 "), the presence of that expression is 1251 sufficient to differentiate a Full-Response from a Simple-Response. 1252 Although the Simple-Response format may allow such an expression to 1253 occur at the beginning of an entity body, and thus cause a 1254 misinterpretation of the message if it was given in response to a 1255 Full-Request, most HTTP/0.9 servers are limited to responses of 1256 type "text/html" and therefore would never generate such a response. 1258 6.1.1 Status Code and Reason Phrase 1260 The Status-Code element is a 3-digit integer result code of the 1261 attempt to understand and satisfy the request. The Reason-Phrase is 1262 intended to give a short textual description of the Status-Code. 1263 The Status-Code is intended for use by automata and the 1264 Reason-Phrase is intended for the human user. The client is not 1265 required to examine or display the Reason-Phrase. 1267 The first digit of the Status-Code defines the class of response. 1268 The last two digits do not have any categorization role. There are 1269 5 values for the first digit: 1271 o 1xx: Informational - Not used, but reserved for future use 1273 o 2xx: Success - The action was successfully received, 1274 understood, and accepted. 1276 o 3xx: Redirection - Further action must be taken in order to 1277 complete the request 1279 o 4xx: Client Error - The request contains bad syntax or cannot 1280 be fulfilled 1282 o 5xx: Server Error - The server failed to fulfill an apparently 1283 valid request 1285 The individual values of the numeric status codes defined for 1286 HTTP/1.0, and an example set of corresponding Reason-Phrases, are 1287 presented below.
The reason phrases listed here are only 1288 recommended -- they may be replaced by local equivalents without 1289 affecting the protocol. These codes are fully defined in Section 9. 1291 Status-Code = "200" ; OK 1292 | "201" ; Created 1293 | "202" ; Accepted 1294 | "204" ; No Content 1295 | "301" ; Moved Permanently 1296 | "302" ; Moved Temporarily 1297 | "304" ; Not Modified 1298 | "400" ; Bad Request 1299 | "401" ; Unauthorized 1300 | "403" ; Forbidden 1301 | "404" ; Not Found 1302 | "500" ; Internal Server Error 1303 | "501" ; Not Implemented 1304 | "502" ; Bad Gateway 1305 | "503" ; Service Unavailable 1306 | extension-code 1308 extension-code = 3DIGIT 1310 Reason-Phrase = * 1312 HTTP status codes are extensible, but the above codes are the only 1313 ones generally recognized in current practice. HTTP applications 1314 are not required to understand the meaning of all registered status 1315 codes, though such understanding is obviously desirable. However, 1316 applications must understand the class of any status code, as 1317 indicated by the first digit, and treat any unknown response as 1318 being equivalent to the x00 status code of that class. For example, 1319 if an unknown status code of 421 is received by the client, it can 1320 safely assume that there was something wrong with its request and 1321 treat the response as if it had received a 400 status code. In such 1322 cases, user agents should present to the user the entity returned 1323 with the response, since that entity is likely to include 1324 human-readable information which will explain the unusual status. 1326 6.2 Response Header Fields 1328 The response header fields allow the server to pass additional 1329 information about the response which cannot be placed in the 1330 Status-Line. These header fields are not intended to give 1331 information about an Entity-Body returned in the response, but 1332 about the server itself. 
1334 Response-Header = Location ; Section 10.11 1335 | Server ; Section 10.15 1336 | WWW-Authenticate ; Section 10.17 1338 Response-Header field names can be extended reliably only in 1339 combination with a change in the protocol version. However, new or 1340 experimental header fields may be given the semantics of response 1341 header fields if all parties in the communication recognize them to 1342 be response header fields. Unknown header fields are treated as 1343 Entity-Header fields. 1345 7. Entity 1347 Full-Request and Full-Response messages may transfer an entity 1348 within some requests and responses. An entity consists of 1349 Entity-Header fields and (usually) an Entity-Body. In this section, 1350 both sender and recipient refer to either the client or the server, 1351 depending on who sends and who receives the entity. 1353 7.1 Entity Header Fields 1355 Entity-Header fields define optional metainformation about the 1356 Entity-Body or, if no body is present, about the resource 1357 identified by the request. 1359 Entity-Header = Allow ; Section 10.1 1360 | Content-Encoding ; Section 10.3 1361 | Content-Length ; Section 10.4 1362 | Content-Type ; Section 10.5 1363 | Expires ; Section 10.7 1364 | Last-Modified ; Section 10.10 1365 | extension-header 1367 extension-header = HTTP-header 1369 The extension-header mechanism allows additional Entity-Header 1370 fields to be defined without changing the protocol, but these 1371 fields cannot be assumed to be recognizable by the recipient. 1372 Unknown header fields should be ignored by the recipient and 1373 forwarded by proxies. 1375 7.2 Entity Body 1377 The entity body (if any) sent with an HTTP/1.0 request or response 1378 is in a format and encoding defined by the Entity-Header fields. 1380 Entity-Body = *OCTET 1382 An entity body is included with a request message only when the 1383 request method calls for one. 
The presence of an entity body in a 1384 request is signaled by the inclusion of a Content-Length header 1385 field in the request message headers. HTTP/1.0 requests containing 1386 an entity body must include a valid Content-Length header field. 1388 For response messages, whether or not an entity body is included 1389 with a message is dependent on both the request method and the 1390 response code. All responses to the HEAD request method must not 1391 include a body, even though the presence of entity header fields 1392 may lead one to believe they do. All 1xx (informational), 204 (no 1393 content), and 304 (not modified) responses must not include a body. 1394 All other responses must include an entity body or a Content-Length 1395 header field defined with a value of zero (0). 1397 7.2.1 Type 1399 When an Entity-Body is included with a message, the data type of 1400 that body is determined via the header fields Content-Type and 1401 Content-Encoding. These define a two-layer, ordered encoding model: 1403 entity-body := Content-Encoding( Content-Type( data ) ) 1405 A Content-Type specifies the media type of the underlying data. A 1406 Content-Encoding may be used to indicate any additional content 1407 coding applied to the type, usually for the purpose of data 1408 compression, that is a property of the resource requested. The 1409 default for the content encoding is none (i.e., the identity 1410 function). 1412 Any HTTP/1.0 message containing an entity body should include a 1413 Content-Type header field defining the media type of that body. If 1414 and only if the media type is not given by a Content-Type header, 1415 as is the case for Simple-Response messages, the recipient may 1416 attempt to guess the media type via inspection of its content 1417 and/or the name extension(s) of the URL used to identify the 1418 resource. If the media type remains unknown, the recipient should 1419 treat it as type "application/octet-stream". 
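The fallback order described above (declared Content-Type first, then a guess, then "application/octet-stream") might be sketched in Python as follows. The function name effective_media_type is hypothetical, the headers argument is assumed to be a dict keyed by lower-cased field names, and the guess uses only the name extension of the URL through the standard mimetypes module; inspection of the content itself is left out of this sketch.

```python
import mimetypes

FALLBACK_MEDIA_TYPE = "application/octet-stream"


def effective_media_type(headers, url):
    """Pick the media type of an entity body.

    `headers` is assumed to be a dict keyed by lower-cased field names;
    `url` is the URL (or path) used to identify the resource.
    """
    declared = headers.get("content-type")
    if declared:
        # Drop parameters such as "; charset=..." and normalise case,
        # since type and subtype are case-insensitive.
        return declared.split(";", 1)[0].strip().lower()
    # No Content-Type (e.g., a Simple-Response): guess from the extension.
    guessed, _encoding = mimetypes.guess_type(url)
    return guessed or FALLBACK_MEDIA_TYPE
```

Guessing is attempted only when no Content-Type is present, which mirrors the "if and only if" condition in the text.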
1421 7.2.2 Length 1423 When an Entity-Body is included with a message, the length of that 1424 body may be determined in one of two ways. If a Content-Length 1425 header field is present, its value in bytes represents the length 1426 of the Entity-Body. Otherwise, the body length is determined by the 1427 closing of the connection by the server. 1429 Closing the connection cannot be used to indicate the end of a 1430 request body, since it leaves no possibility for the server to send 1431 back a response. Therefore, HTTP/1.0 requests containing an entity 1432 body must include a valid Content-Length header field. If a request 1433 contains an entity body and Content-Length is not specified, and 1434 the server does not recognize or cannot calculate the length from 1435 other fields, then the server should send a 400 (bad request) 1436 response. 1438 Note: Some older servers supply an invalid Content-Length 1439 when sending a document that contains server-side includes 1440 dynamically inserted into the data stream. It must be 1441 emphasized that this will not be tolerated by future 1442 versions of HTTP. Unless the client knows that it is 1443 receiving a response from a compliant server, it should not 1444 depend on the Content-Length value being correct. 1446 8. Method Definitions 1448 The set of common methods for HTTP/1.0 is defined below. Although 1449 this set can be expanded, additional methods cannot be assumed to 1450 share the same semantics for separately extended clients and 1451 servers. 1453 8.1 GET 1455 The GET method means retrieve whatever information (in the form of 1456 an entity) is identified by the Request-URI. If the Request-URI 1457 refers to a data-producing process, it is the produced data which 1458 shall be returned as the entity in the response and not the source 1459 text of the process, unless that text happens to be the output of 1460 the process. 

   The semantics of the GET method change to a "conditional GET" if
   the request message includes an If-Modified-Since header field. A
   conditional GET method requests that the identified resource be
   transferred only if it has been modified since the date given by
   the If-Modified-Since header, as described in Section 10.9. The
   conditional GET method is intended to reduce network usage by
   allowing cached entities to be refreshed without requiring multiple
   requests or transferring unnecessary data.

8.2 HEAD

   The HEAD method is identical to GET except that the server must not
   return any Entity-Body in the response. The metainformation
   contained in the HTTP headers in response to a HEAD request should
   be identical to the information sent in response to a GET request.
   This method can be used for obtaining metainformation about the
   resource identified by the Request-URI without transferring the
   Entity-Body itself. This method is often used for testing hypertext
   links for validity, accessibility, and recent modification.

   There is no "conditional HEAD" request analogous to the conditional
   GET. If an If-Modified-Since header field is included with a HEAD
   request, it should be ignored.

8.3 POST

   The POST method is used to request that the destination server
   accept the entity enclosed in the request as a new subordinate of
   the resource identified by the Request-URI in the Request-Line.
   POST is designed to allow a uniform method to cover the following
   functions:

      o Annotation of existing resources;

      o Posting a message to a bulletin board, newsgroup, mailing list,
        or similar group of articles;

      o Providing a block of data, such as the result of submitting a
        form [3], to a data-handling process;

      o Extending a database through an append operation.

   The actual function performed by the POST method is determined by
   the server and is usually dependent on the Request-URI. The posted
   entity is subordinate to that URI in the same way that a file is
   subordinate to a directory containing it, a news article is
   subordinate to a newsgroup to which it is posted, or a record is
   subordinate to a database.

   A successful POST does not require that the entity be created as a
   resource on the origin server or made accessible for future
   reference. That is, the action performed by the POST method might
   not result in a resource that can be identified by a URI. In this
   case, either 200 (ok) or 204 (no content) is the appropriate
   response status, depending on whether or not the response includes
   an entity that describes the result.

   If a resource has been created on the origin server, the response
   should be 201 (created) and contain an entity (preferably of type
   "text/html") which describes the status of the request and refers
   to the new resource.

   A valid Content-Length is required on all HTTP/1.0 POST requests.
   An HTTP/1.0 server should respond with a 400 (bad request) message
   if it cannot determine the length of the request message's content.

   Applications must not cache responses to a POST request.

9. Status Code Definitions

   Each Status-Code is described below, including a description of
   which method(s) it can follow and any metainformation required in
   the response.

9.1 Informational 1xx

   This class of status code indicates a provisional response,
   consisting only of the Status-Line and optional headers, and is
   terminated by an empty line. HTTP/1.0 does not define any 1xx
   status codes and they are not a valid response to an HTTP/1.0
   request. However, they may be useful for experimental applications
   which are outside the scope of this specification.

9.2 Successful 2xx

   This class of status code indicates that the client's request was
   successfully received, understood, and accepted.

   200 OK

   The request has succeeded. The information returned with the
   response is dependent on the method used in the request, as follows:

      GET    an entity corresponding to the requested resource is sent
             in the response;

      HEAD   the response must only contain the header information and
             no Entity-Body;

      POST   an entity describing or containing the result of the action.

   201 Created

   The request has been fulfilled and resulted in a new resource being
   created. The newly created resource can be referenced by the URI(s)
   returned in the entity of the response. The origin server should
   create the resource before using this Status-Code. If the action
   cannot be carried out immediately, the server must include in the
   response body a description of when the resource will be available;
   otherwise, the server should respond with 202 (accepted).

   Of the methods defined by this specification, only POST can create
   a resource.

   202 Accepted

   The request has been accepted for processing, but the processing
   has not been completed. The request may or may not eventually be
   acted upon, as it may be disallowed when processing actually takes
   place. There is no facility for re-sending a status code from an
   asynchronous operation such as this.

   The 202 response is intentionally non-committal. Its purpose is to
   allow a server to accept a request for some other process (perhaps
   a batch-oriented process that is only run once per day) without
   requiring that the user agent's connection to the server persist
   until the process is completed.
   The entity returned with this
   response should include an indication of the request's current
   status and either a pointer to a status monitor or some estimate of
   when the user can expect the request to be fulfilled.

   204 No Content

   The server has fulfilled the request but there is no new
   information to send back. If the client is a user agent, it should
   not change its document view from that which caused the request to
   be generated. This response is primarily intended to allow input
   for scripts or other actions to take place without causing a change
   to the user agent's active document view. The response may include
   new metainformation in the form of entity headers, which should
   apply to the document currently in the user agent's active view.

9.3 Redirection 3xx

   This class of status code indicates that further action needs to be
   taken by the user agent in order to fulfill the request. The action
   required can sometimes be carried out by the user agent without
   interaction with the user, but it is strongly recommended that this
   only take place if the method used in the request is GET or HEAD. A
   user agent should never automatically redirect a request more than
   5 times, since such redirections usually indicate an infinite loop.

   300 Multiple Choices

   This response code is not directly used by HTTP/1.0 applications,
   but serves as the default for interpreting the 3xx class of
   responses.

   The requested resource is available at one or more locations.
   Unless it was a HEAD request, the response should include an entity
   containing a list of resource characteristics and locations from
   which the user or user agent can choose the one most appropriate.
   If the server has a preferred choice, it should include the URL in
   a Location field; user agents may use this field value for
   automatic redirection.
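
   As a non-normative illustration, the automatic redirection guidance
   given at the start of Section 9.3 might be expressed as a small
   policy check. The function name and its parameters are hypothetical;
   the sketch assumes the user agent counts the redirections it has
   already followed for the current request and treats a missing
   Location value as a reason not to redirect.

```python
MAX_AUTOMATIC_REDIRECTS = 5          # limit stated in Section 9.3
AUTOMATIC_METHODS = {"GET", "HEAD"}  # methods recommended for automatic redirection


def may_redirect_automatically(method, status, location, redirects_followed):
    """Return True if a user agent may follow the redirect without asking the user."""
    if not 300 <= status <= 399:
        return False  # not a redirection response
    if status == 304:
        return False  # 304 (not modified) carries no new location to follow
    if not location:
        return False  # nothing to redirect to
    if redirects_followed >= MAX_AUTOMATIC_REDIRECTS:
        return False  # further redirections usually indicate a loop
    return method.upper() in AUTOMATIC_METHODS
```

   When the check fails for a POST request, the user agent would need
   confirmation from the user before repeating the request at the new
   location.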

   301 Moved Permanently

   The requested resource has been assigned a new permanent URL and
   any future references to this resource should be done using that
   URL. Clients with link editing capabilities should automatically
   relink references to the Request-URI to the new reference returned
   by the server, where possible.

   The new URL must be given by the Location field in the response.
   Unless it was a HEAD request, the Entity-Body of the response
   should contain a short note with a hyperlink to the new URL.

   If the 301 status code is received in response to a request using
   the POST method, the user agent must not automatically redirect the
   request unless it can be confirmed by the user, since this might
   change the conditions under which the request was issued.

   302 Moved Temporarily

   The requested resource resides temporarily under a different URL.
   Since the redirection may be altered on occasion, the client should
   continue to use the Request-URI for future requests.

   The URL must be given by the Location field in the response. Unless
   it was a HEAD request, the Entity-Body of the response should
   contain a short note with a hyperlink to the new URI(s).

   If the 302 status code is received in response to a request using
   the POST method, the user agent must not automatically redirect the
   request unless it can be confirmed by the user, since this might
   change the conditions under which the request was issued.

   304 Not Modified

   If the client has performed a conditional GET request and access is
   allowed, but the document has not been modified since the date and
   time specified in the If-Modified-Since field, the server must
   respond with this status code and not send an Entity-Body to the
   client.
   Header fields contained in the response should only include
   information which is relevant to cache managers or which may have
   changed independently of the entity's Last-Modified date. Examples
   of relevant header fields include: Date, Server, and Expires. A
   cache should update its cached entity to reflect any new field
   values given in the 304 response.

9.4 Client Error 4xx

   The 4xx class of status code is intended for cases in which the
   client seems to have erred. If the client has not completed the
   request when a 4xx code is received, it should immediately cease
   sending data to the server. Except when responding to a HEAD
   request, the server should include an entity containing an
   explanation of the error situation, and whether it is a temporary
   or permanent condition. These status codes are applicable to any
   request method.

      Note: If the client is sending data, server implementations
      on TCP should be careful to ensure that the client
      acknowledges receipt of the packet(s) containing the
      response prior to closing the input connection. If the
      client continues sending data to the server after the close,
      the server's controller will send a reset packet to the
      client, which may erase the client's unacknowledged input
      buffers before they can be read and interpreted by the HTTP
      application.

   400 Bad Request

   The request could not be understood by the server due to malformed
   syntax. The client should not repeat the request without
   modifications.

   401 Unauthorized

   The request requires user authentication. The response must include
   a WWW-Authenticate header field (Section 10.17) containing a
   challenge applicable to the requested resource. The client may
   repeat the request with a suitable Authorization header field
   (Section 10.2).
   If the request already included Authorization
   credentials, then the 401 response indicates that authorization has
   been refused for those credentials. If the 401 response contains
   the same challenge as the prior response, and the user agent has
   already attempted authentication at least once, then the user
   should be presented the entity that was given in the response,
   since that entity may include relevant diagnostic information. HTTP
   access authentication is explained in Section 11.

   403 Forbidden

   The server understood the request, but is refusing to fulfill it.
   Authorization will not help and the request should not be repeated.
   If the request method was not HEAD and the server wishes to make
   public why the request has not been fulfilled, it should describe
   the reason for the refusal in the entity body. This status code is
   commonly used when the server does not wish to reveal exactly why
   the request has been refused, or when no other response is
   applicable.

   404 Not Found

   The server has not found anything matching the Request-URI. No
   indication is given of whether the condition is temporary or
   permanent. If the server does not wish to make this information
   available to the client, the status code 403 (forbidden) can be
   used instead.

9.5 Server Error 5xx

   Response status codes beginning with the digit "5" indicate cases
   in which the server is aware that it has erred or is incapable of
   performing the request. If the client has not completed the request
   when a 5xx code is received, it should immediately cease sending
   data to the server. Except when responding to a HEAD request, the
   server should include an entity containing an explanation of the
   error situation, and whether it is a temporary or permanent
   condition. These response codes are applicable to any request
   method and there are no required header fields.

   500 Internal Server Error

   The server encountered an unexpected condition which prevented it
   from fulfilling the request.

   501 Not Implemented

   The server does not support the functionality required to fulfill
   the request. This is the appropriate response when the server does
   not recognize the request method and is not capable of supporting
   it for any resource.

   502 Bad Gateway

   The server, while acting as a gateway or proxy, received an invalid
   response from the upstream server it accessed in attempting to
   fulfill the request.

   503 Service Unavailable

   The server is currently unable to handle the request due to a
   temporary overloading or maintenance of the server. The implication
   is that this is a temporary condition which will be alleviated
   after some delay.

      Note: The existence of the 503 status code does not imply
      that a server must use it when becoming overloaded. Some
      servers may wish to simply refuse the connection.

10. Header Field Definitions

   This section defines the syntax and semantics of all commonly used
   HTTP/1.0 header fields. For general and entity header fields, both
   sender and recipient refer to either the client or the server,
   depending on who sends and who receives the message.

10.1 Allow

   The Allow entity-header field lists the set of methods supported by
   the resource identified by the Request-URI. The purpose of this
   field is strictly to inform the recipient of valid methods
   associated with the resource. The Allow header field is not
   permitted in a request using the POST method, and thus should be
   ignored if it is received as part of a POST entity.

       Allow          = "Allow" ":" 1#method

   Example of use:

       Allow: GET, HEAD

   This field cannot prevent a client from trying other methods.
   However, the indications given by the Allow header field value
   should be followed.
   The actual set of allowed methods is defined by
   the origin server at the time of each request.

   A proxy must not modify the Allow header field even if it does not
   understand all the methods specified, since the user agent may have
   other means of communicating with the origin server.

   The Allow header field does not indicate what methods are
   implemented by the server.

10.2 Authorization

   A user agent that wishes to authenticate itself with a
   server--usually, but not necessarily, after receiving a 401
   response--may do so by including an Authorization request-header
   field with the request. The Authorization field value consists of
   credentials containing the authentication information of the user
   agent for the realm of the resource being requested.

       Authorization  = "Authorization" ":" credentials

   HTTP access authentication is described in Section 11. If a request
   is authenticated and a realm specified, the same credentials should
   be valid for all other requests within this realm.

   Responses to requests containing an Authorization field are not
   cacheable.

10.3 Content-Encoding

   The Content-Encoding entity-header field is used as a modifier to
   the media-type. When present, its value indicates what additional
   content coding has been applied to the resource, and thus what
   decoding mechanism must be applied in order to obtain the
   media-type referenced by the Content-Type header field. The
   Content-Encoding is primarily used to allow a document to be
   compressed without losing the identity of its underlying media type.

       Content-Encoding = "Content-Encoding" ":" content-coding

   Content codings are defined in Section 3.5. An example of its use is

       Content-Encoding: x-gzip

   The Content-Encoding is a characteristic of the resource identified
   by the Request-URI.
   Typically, the resource is stored with this
   encoding and is only decoded before rendering or analogous usage.

10.4 Content-Length

   The Content-Length entity-header field indicates the size of the
   Entity-Body, in decimal number of octets, sent to the recipient or,
   in the case of the HEAD method, the size of the Entity-Body that
   would have been sent had the request been a GET.

       Content-Length = "Content-Length" ":" 1*DIGIT

   An example is

       Content-Length: 3495

   Applications should use this field to indicate the size of the
   Entity-Body to be transferred, regardless of the media type of the
   entity. A valid Content-Length field value is required on all
   HTTP/1.0 request messages containing an entity body.

   Any Content-Length greater than or equal to zero is a valid value.
   Section 7.2.2 describes how to determine the length of a response
   entity body if a Content-Length is not given.

      Note: The meaning of this field is significantly different
      from the corresponding definition in MIME, where it is an
      optional field used within the "message/external-body"
      content-type. In HTTP, it should be used whenever the
      entity's length can be determined prior to being transferred.

10.5 Content-Type

   The Content-Type entity-header field indicates the media type of
   the Entity-Body sent to the recipient or, in the case of the HEAD
   method, the media type that would have been sent had the request
   been a GET.

       Content-Type   = "Content-Type" ":" media-type

   Media types are defined in Section 3.6. An example of the field is

       Content-Type: text/html

   Further discussion of methods for identifying the media type of an
   entity is provided in Section 7.2.1.

10.6 Date

   The Date general-header field represents the date and time at which
   the message was originated, having the same semantics as orig-date
   in RFC 822.
   The field value is an HTTP-date, as described in
   Section 3.3.

       Date           = "Date" ":" HTTP-date

   An example is

       Date: Tue, 15 Nov 1994 08:12:31 GMT

   If a message is received via direct connection with the user agent
   (in the case of requests) or the origin server (in the case of
   responses), then the date can be assumed to be the current date at
   the receiving end. However, since the date--as it is believed by the
   origin--is important for evaluating cached responses, origin servers
   should always include a Date header. Clients should only send a
   Date header field in messages that include an entity body, as in
   the case of the POST request, and even then it is optional. A
   received message which does not have a Date header field should be
   assigned one by the recipient if the message will be cached by that
   recipient or gatewayed via a protocol which requires a Date.

   In theory, the date should represent the moment just before the
   entity is generated. In practice, the date can be generated at any
   time during the message origination without affecting its semantic
   value.

      Note: An earlier version of this document incorrectly
      specified that this field should contain the creation date
      of the enclosed Entity-Body. This has been changed to
      reflect actual (and proper) usage.

10.7 Expires

   The Expires entity-header field gives the date/time after which the
   entity should be considered stale. This allows information
   providers to suggest the volatility of the resource, or a date
   after which the information may no longer be valid. Applications
   must not cache this entity beyond the date given. The presence of
   an Expires field does not imply that the original resource will
   change or cease to exist at, before, or after that time.
   However,
   information providers that know or even suspect that a resource
   will change by a certain date should include an Expires header with
   that date. The format is an absolute date and time as defined by
   HTTP-date in Section 3.3.

       Expires        = "Expires" ":" HTTP-date

   An example of its use is

       Expires: Thu, 01 Dec 1994 16:00:00 GMT

   If the date given is equal to or earlier than the value of the Date
   header, the recipient must not cache the enclosed entity. If a
   resource is dynamic by nature, as is the case with many
   data-producing processes, entities from that resource should be
   given an appropriate Expires value which reflects that dynamism.

   The Expires field cannot be used to force a user agent to refresh
   its display or reload a resource; its semantics apply only to
   caching mechanisms, and such mechanisms need only check a
   resource's expiration status when a new request for that resource
   is initiated.

   User agents often have history mechanisms, such as "Back" buttons
   and history lists, which can be used to redisplay an entity
   retrieved earlier in a session. By default, the Expires field does
   not apply to history mechanisms. If the entity is still in storage,
   a history mechanism should display it even if the entity has
   expired, unless the user has specifically configured the agent to
   refresh expired history documents.

      Note: Applications are encouraged to be tolerant of bad or
      misinformed implementations of the Expires header. A value
      of zero (0) or an invalid date format should be considered
      equivalent to an "expires immediately." Although these
      values are not legitimate for HTTP/1.0, a robust
      implementation is always desirable.

10.8 From

   The From request-header field, if given, should contain an Internet
   e-mail address for the human user who controls the requesting user
   agent.
   The address should be machine-usable, as defined by mailbox
   in RFC 822 [7] (as updated by RFC 1123 [6]):

       From           = "From" ":" mailbox

   An example is:

       From: webmaster@w3.org

   This header field may be used for logging purposes and as a means
   for identifying the source of invalid or unwanted requests. It
   should not be used as an insecure form of access protection. The
   interpretation of this field is that the request is being performed
   on behalf of the person given, who accepts responsibility for the
   method performed. In particular, robot agents should include this
   header so that the person responsible for running the robot can be
   contacted if problems occur on the receiving end.

   The Internet e-mail address in this field may be separate from the
   Internet host which issued the request. For example, when a request
   is passed through a proxy, the original issuer's address should be
   used.

      Note: The client should not send the From header field
      without the user's approval, as it may conflict with the
      user's privacy interests or their site's security policy. It
      is strongly recommended that the user be able to disable,
      enable, and modify the value of this field at any time prior
      to a request.

10.9 If-Modified-Since

   The If-Modified-Since request-header field is used with the GET
   method to make it conditional: if the requested resource has not
   been modified since the time specified in this field, a copy of the
   resource will not be returned from the server; instead, a 304 (not
   modified) response will be returned without any Entity-Body.
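
   As a non-normative illustration, the server-side decision for a
   conditional GET might be sketched as follows, covering the handling
   of invalid and future dates specified by cases (a) through (c) of
   this section. The function name and parameters are hypothetical.
   The sketch assumes the field value uses the RFC 1123 date format
   and relies on Python's email.utils parser, which may not accept
   every HTTP-date form.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def conditional_get_status(normal_status, if_modified_since, last_modified, now):
    """Return the status code for a GET carrying If-Modified-Since.

    normal_status     -- status an unconditional GET would have produced
    if_modified_since -- raw field value, or None if the field is absent
    last_modified     -- timezone-aware datetime of the last modification
    now               -- timezone-aware datetime of the server's current time
    """
    if normal_status != 200 or if_modified_since is None:
        return normal_status  # respond exactly as for a normal GET
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return normal_status  # unparseable date: treat as a normal GET
    if since.tzinfo is None:
        # Assumption of this sketch: a date without a zone is taken as GMT.
        since = since.replace(tzinfo=timezone.utc)
    if since > now:
        return normal_status  # a date later than the current time is invalid
    if last_modified > since:
        return 200  # modified since the given date: send the entity
    return 304  # not modified: respond without an Entity-Body
```

   A server that returns 304 from this check would then omit the
   Entity-Body from the response.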

       If-Modified-Since = "If-Modified-Since" ":" HTTP-date

   An example of the field is:

       If-Modified-Since: Sat, 29 Oct 1994 19:43:31 GMT

   A conditional GET method requests that the identified resource be
   transferred only if it has been modified since the date given by
   the If-Modified-Since header. The algorithm for determining this
   includes the following cases:

      a) If the request would normally result in anything other than
         a 200 (ok) status, or if the passed If-Modified-Since date
         is invalid, the response is exactly the same as for a
         normal GET. A date which is later than the server's current
         time is invalid.

      b) If the resource has been modified since the
         If-Modified-Since date, the response is exactly the same as
         for a normal GET.

      c) If the resource has not been modified since a valid
         If-Modified-Since date, the server shall return a 304 (not
         modified) response.

   The purpose of this feature is to allow efficient updates of cached
   information with a minimum amount of transaction overhead.

10.10 Last-Modified

   The Last-Modified entity-header field indicates the date and time
   at which the sender believes the resource was last modified. The
   exact semantics of this field are defined in terms of how the
   recipient should interpret it: if the recipient has a copy of this
   resource which is older than the date given by the Last-Modified
   field, that copy should be considered stale.

       Last-Modified  = "Last-Modified" ":" HTTP-date

   An example of its use is

       Last-Modified: Tue, 15 Nov 1994 12:45:26 GMT

   The exact meaning of this header field depends on the
   implementation of the sender and the nature of the original
   resource. For files, it may be just the file system last-modified
   time. For entities with dynamically included parts, it may be the
   most recent of the set of last-modify times for its component
   parts.
   For database gateways, it may be the last-update timestamp
   of the record. For virtual objects, it may be the last time the
   internal state changed.

   An origin server must not send a Last-Modified date which is later
   than the server's time of message origination. In such cases, where
   the resource's last modification would indicate some time in the
   future, the server must replace that date with the message
   origination date.

10.11 Location

   The Location response-header field defines the exact location of
   the resource that was identified by the Request-URI. For 3xx
   responses, the location must indicate the server's preferred URL
   for automatic redirection to the resource. Only one absolute URL is
   allowed.

       Location       = "Location" ":" absoluteURI

   An example is

       Location: http://www.w3.org/hypertext/WWW/NewLocation.html

10.12 MIME-Version

   HTTP is not a MIME-compliant protocol (see Appendix C). However,
   HTTP/1.0 messages may include a single MIME-Version general-header
   field to indicate what version of the MIME protocol was used to
   construct the message. Use of the MIME-Version header field should
   indicate that the message is in full compliance with the MIME
   protocol (as defined in [5]). Unfortunately, some older versions of
   HTTP/1.0 clients and servers use this field indiscriminately, and
   thus recipients must not take it for granted that the message is
   indeed in full compliance with MIME. Proxies and gateways are
   responsible for ensuring this compliance (where possible) when
   exporting HTTP messages to strict MIME environments. Future
   HTTP/1.0 applications must only use MIME-Version when the message
   is fully MIME-compliant.

       MIME-Version   = "MIME-Version" ":" 1*DIGIT "." 1*DIGIT

   MIME version "1.0" is the default for use in HTTP/1.0.
   However,
   HTTP/1.0 message parsing and semantics are defined by this document
   and not the MIME specification.

10.13 Pragma

   The Pragma general-header field is used to include
   implementation-specific directives that may apply to any recipient
   along the request/response chain. All pragma directives specify
   optional behavior from the viewpoint of the protocol; however, some
   systems may require that behavior be consistent with the directives.

       Pragma           = "Pragma" ":" 1#pragma-directive

       pragma-directive = "no-cache" | extension-pragma
       extension-pragma = token [ "=" word ]

   When the "no-cache" directive is present in a request message, an
   application should forward the request toward the origin server
   even if it has a cached copy of what is being requested. This
   allows a client to insist upon receiving an authoritative response
   to its request. It also allows a client to refresh a cached copy
   which is known to be corrupted or stale.

   Pragma directives must be passed through by a proxy or gateway
   application, regardless of their significance to that application,
   since the directives may be applicable to all recipients along the
   request/response chain. It is not possible to specify a pragma for
   a specific recipient; however, any pragma directive not relevant to
   a recipient should be ignored by that recipient.

10.14 Referer

   The Referer request-header field allows the client to specify, for
   the server's benefit, the address (URI) of the resource from which
   the Request-URI was obtained. This allows a server to generate
   lists of back-links to resources for interest, logging, optimized
   caching, etc. It also allows obsolete or mistyped links to be
   traced for maintenance. The Referer field must not be sent if the
   Request-URI was obtained from a source that does not have its own
   URI, such as input from the user keyboard.

       Referer        = "Referer" ":" ( absoluteURI | relativeURI )

   Example:

       Referer: http://www.w3.org/hypertext/DataSources/Overview.html

   If a partial URI is given, it should be interpreted relative to the
   Request-URI. The URI must not include a fragment.

      Note: Because the source of a link may be private
      information or may reveal an otherwise private information
      source, it is strongly recommended that the user be able to
      select whether or not the Referer field is sent. For
      example, a browser client could have a toggle switch for
      browsing openly/anonymously, which would respectively
      enable/disable the sending of Referer and From information.

10.15 Server

   The Server response-header field contains information about the
   software used by the origin server to handle the request. The field
   can contain multiple product tokens (Section 3.7) and comments
   identifying the server and any significant subproducts. By
   convention, the product tokens are listed in order of their
   significance for identifying the application.

       Server         = "Server" ":" 1*( product | comment )

   Example:

       Server: CERN/3.0 libwww/2.17

   If the response is being forwarded through a proxy, the proxy
   application must not add its data to the product list.

      Note: Revealing the specific software version of the server
      may allow the server machine to become more vulnerable to
      attacks against software that is known to contain security
      holes. Server implementors are encouraged to make this field
      a configurable option.

10.16 User-Agent

   The User-Agent request-header field contains information about the
   user agent originating the request. This is for statistical
   purposes, the tracing of protocol violations, and automated
   recognition of user agents for the sake of tailoring responses to
   avoid particular user agent limitations.
Although it is not 2195 required, user agents should include this field with requests. The 2196 field can contain multiple product tokens (Section 3.7) and 2197 comments identifying the agent and any subproducts which form a 2198 significant part of the user agent. By convention, the product 2199 tokens are listed in order of their significance for identifying 2200 the application. 2202 User-Agent = "User-Agent" ":" 1*( product | comment ) 2204 Example: 2206 User-Agent: CERN-LineMode/2.15 libwww/2.17b3 2208 Note: Some current proxy applications append their product 2209 information to the list in the User-Agent field. This is not 2210 recommended, since it makes machine interpretation of these 2211 fields ambiguous. 2213 10.17 WWW-Authenticate 2215 The WWW-Authenticate response-header field must be included in 401 2216 (unauthorized) response messages. The field value consists of at 2217 least one challenge that indicates the authentication scheme(s) and 2218 parameters applicable to the Request-URI. 2220 WWW-Authenticate = "WWW-Authenticate" ":" 1#challenge 2222 The HTTP access authentication process is described in Section 11. 2223 User agents must take special care in parsing the WWW-Authenticate 2224 field value if it contains more than one challenge, or if more than 2225 one WWW-Authenticate header field is provided, since the contents 2226 of a challenge may itself contain a comma-separated list of 2227 authentication parameters. 2229 11. Access Authentication 2231 HTTP provides a simple challenge-response authentication mechanism 2232 which may be used by a server to challenge a client request and by 2233 a client to provide authentication information. It uses an 2234 extensible, case-insensitive token to identify the authentication 2235 scheme, followed by a comma-separated list of attribute-value pairs 2236 which carry the parameters necessary for achieving authentication 2237 via that scheme. 
2239 auth-scheme = token 2241 auth-param = token "=" quoted-string 2243 The 401 (unauthorized) response message is used by an origin server 2244 to challenge the authorization of a user agent. This response must 2245 include a WWW-Authenticate header field containing at least one 2246 challenge applicable to the requested resource. 2248 challenge = auth-scheme 1*SP realm *( "," auth-param ) 2250 realm = "realm" "=" realm-value 2251 realm-value = quoted-string 2253 The realm attribute (case-insensitive) is required for all 2254 authentication schemes which issue a challenge. The realm value 2255 (case-sensitive), in combination with the canonical root URL of the 2256 server being accessed, defines the protection space. These realms 2257 allow the protected resources on a server to be partitioned into a 2258 set of protection spaces, each with its own authentication scheme 2259 and/or authorization database. The realm value is a string, 2260 generally assigned by the origin server, which may have additional 2261 semantics specific to the authentication scheme. 2263 A user agent that wishes to authenticate itself with a 2264 server--usually, but not necessarily, after receiving a 401 2265 response--may do so by including an Authorization header field with 2266 the request. The Authorization field value consists of credentials 2267 containing the authentication information of the user agent for the 2268 realm of the resource being requested. 2270 credentials = basic-credentials 2271 | ( auth-scheme #auth-param ) 2273 The domain over which credentials can be automatically applied by a 2274 user agent is determined by the protection space. If a prior 2275 request has been authorized, the same credentials may be reused for 2276 all other requests within that protection space for a period of 2277 time determined by the authentication scheme, parameters, and/or 2278 user preference. 
Unless otherwise defined by the authentication 2279 scheme, a single protection space cannot extend outside the scope 2280 of its server. 2282 If the server does not wish to accept the credentials sent with a 2283 request, it should return a 403 (forbidden) response. 2285 The HTTP protocol does not restrict applications to this simple 2286 challenge-response mechanism for access authentication. Additional 2287 mechanisms may be used, such as encryption at the transport level 2288 or via message encapsulation, and with additional header fields 2289 specifying authentication information. However, these additional 2290 mechanisms are not defined by this specification. 2292 Proxies must be completely transparent regarding user agent 2293 authentication. That is, they must forward the WWW-Authenticate and 2294 Authorization headers untouched, and must not cache the response to 2295 a request containing Authorization. HTTP/1.0 does not provide a 2296 means for a client to be authenticated with a proxy. 2298 11.1 Basic Authentication Scheme 2300 The "basic" authentication scheme is based on the model that the 2301 user agent must authenticate itself with a user-ID and a password 2302 for each realm. The realm value should be considered an opaque 2303 string which can only be compared for equality with other realms on 2304 that server. The server will authorize the request only if it can 2305 validate the user-ID and password for the protection space of the 2306 Request-URI. There are no optional authentication parameters. 2308 Upon receipt of an unauthorized request for a URI within the 2309 protection space, the server should respond with a challenge like 2310 the following: 2312 WWW-Authenticate: Basic realm="WallyWorld" 2314 where "WallyWorld" is the string assigned by the server to identify 2315 the protection space of the Request-URI. 
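As an illustration only, a challenge of the form shown above could be parsed along the following lines. This is a minimal sketch, not part of the specification: the function name parse_challenge is hypothetical, the quoted-string handling is simplified (no escaped quotes), and the token pattern is looser than the token rule. It follows the challenge grammar of Section 11 and shows why a plain split on commas is unsafe, since a quoted-string may itself contain a comma.

```python
import re

# auth-param = token "=" quoted-string
# Simplifying assumptions: no escaped quotes inside the quoted-string,
# and a token is approximated as any run without space, "=", "," or '"'.
_PARAM = re.compile(r'\s*([^\s=,"]+)\s*=\s*"([^"]*)"\s*')


def parse_challenge(value):
    """Parse one challenge into (scheme, params). Hypothetical helper."""
    scheme, _, rest = value.strip().partition(" ")
    rest = rest.strip()
    params = {}
    pos = 0
    while pos < len(rest):
        match = _PARAM.match(rest, pos)
        if match is None:
            raise ValueError("malformed auth-param in challenge")
        # Attribute names are case-insensitive; values are kept verbatim
        # because the realm value is case-sensitive.
        params[match.group(1).lower()] = match.group(2)
        pos = match.end()
        if pos < len(rest):
            if rest[pos] != ",":
                raise ValueError("expected ',' between auth-params")
            pos += 1
    if "realm" not in params:
        raise ValueError("challenge lacks the required realm attribute")
    # The auth-scheme token is case-insensitive.
    return scheme.lower(), params


print(parse_challenge('Basic realm="WallyWorld"'))
```

The sketch handles a single challenge only; a field value carrying several challenges would need additional logic to find the boundary between one challenge and the next.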
2317 To receive authorization, the client sends the user-ID and 2318 password, separated by a single colon (":") character, within a 2319 base64 [5] encoded string in the credentials. 2321 basic-credentials = "Basic" SP basic-cookie 2323 basic-cookie = <base64 [5] encoding of userid-password, except not limited to 76 char/line> 2326 userid-password = [ token ] ":" *TEXT 2328 If the user agent wishes to send the user-ID "Aladdin" and password 2329 "open sesame", it would use the following header field: 2331 Authorization: Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ== 2333 The basic authentication scheme is a non-secure method of filtering 2334 unauthorized access to resources on an HTTP server. It is based on 2335 the assumption that the connection between the client and the 2336 server can be regarded as a trusted carrier. As this is not 2337 generally true on an open network, the basic authentication scheme 2338 should be used accordingly. In spite of this, clients should 2339 implement the scheme in order to communicate with servers that use 2340 it. 2342 12. Security Considerations 2344 This section is meant to inform application developers, information 2345 providers, and users of the security limitations in HTTP/1.0 as 2346 described by this document. The discussion does not include 2347 definitive solutions to the problems revealed, though it does make 2348 some suggestions for reducing security risks. 2350 12.1 Authentication of Clients 2352 As mentioned in Section 11.1, the Basic authentication scheme is 2353 not a secure method of user authentication, nor does it prevent the 2354 Entity-Body from being transmitted in clear text across the 2355 physical network used as the carrier. HTTP/1.0 does not prevent 2356 additional authentication schemes and encryption mechanisms from 2357 being employed to increase security.
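The weakness described in Section 12.1 can be made concrete. The sketch below, offered as an illustration rather than a reference implementation, builds the credentials of Section 11.1 and then recovers the user-ID and password from them with nothing more than a base64 decoder. The function names are hypothetical, and encoding the text as ISO-8859-1 is an assumption based on the default character set for TEXT.

```python
import base64


def basic_credentials(user_id, password):
    """Build the Authorization field value. Hypothetical helper."""
    # userid-password = [ token ] ":" *TEXT, so the user-ID cannot
    # contain a colon; the password may.
    if ":" in user_id:
        raise ValueError("user-ID must not contain a colon")
    # Assumption: octets are taken from the ISO-8859-1 encoding.
    raw = f"{user_id}:{password}".encode("iso-8859-1")
    # b64encode emits no line breaks, matching the "not limited to
    # 76 char/line" relaxation of the basic-cookie.
    return "Basic " + base64.b64encode(raw).decode("ascii")


def decode_basic(field_value):
    """Recover (user_id, password); any eavesdropper can do the same."""
    scheme, _, cookie = field_value.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not Basic credentials")
    text = base64.b64decode(cookie.strip()).decode("iso-8859-1")
    # Split on the first colon only; later colons belong to the password.
    user_id, _, password = text.partition(":")
    return user_id, password


value = basic_credentials("Aladdin", "open sesame")
print(value)
print(decode_basic(value))
```

Because the encoding is reversible without any secret, the scheme offers no confidentiality on its own; protection has to come from the carrier or from an additional mechanism.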
2359 12.2 Safe Methods 2361 The writers of client software should be aware that the software 2362 represents the user in their interactions over the Internet, and 2363 should be careful to allow the user to be aware of any actions they 2364 may take which may have an unexpected significance to themselves or 2365 others. 2367 In particular, the convention has been established that the GET and 2368 HEAD methods should never have the significance of taking an action 2369 other than retrieval. These methods should be considered "safe." 2370 This allows user agents to represent other methods, such as POST, 2371 in a special way, so that the user is made aware of the fact that a 2372 possibly unsafe action is being requested. 2374 Naturally, it is not possible to ensure that the server does not 2375 generate side-effects as a result of performing a GET request; in 2376 fact, some dynamic resources consider that a feature. The important 2377 distinction here is that the user did not request the side-effects, 2378 so therefore cannot be held accountable for them. 2380 12.3 Abuse of Server Log Information 2382 A server is in the position to save personal data about a user's 2383 requests which may identify their reading patterns or subjects of 2384 interest. This information is clearly confidential in nature and 2385 its handling may be constrained by law in certain countries. People 2386 using the HTTP protocol to provide data are responsible for 2387 ensuring that such material is not distributed without the 2388 permission of any individuals that are identifiable by the 2389 published results. 2391 12.4 Transfer of Sensitive Information 2393 Like any generic data transfer protocol, HTTP cannot regulate the 2394 content of the data that is transferred, nor is there any a priori 2395 method of determining the sensitivity of any particular piece of 2396 information within the context of any given request. 
Therefore, 2397 applications should supply as much control over this information as 2398 possible to the provider of that information. Three header fields 2399 are worth special mention in this context: Server, Referer and From. 2401 Revealing the specific software version of the server may allow the 2402 server machine to become more vulnerable to attacks against 2403 software that is known to contain security holes. Implementors 2404 should make the Server header field a configurable option. 2406 The Referer field allows reading patterns to be studied and reverse 2407 links drawn. Although it can be very useful, its power can be 2408 abused if user details are not separated from the information 2409 contained in the Referer. Even when the personal information has 2410 been removed, the Referer field may indicate a private document's 2411 URI whose publication would be inappropriate. 2413 The information sent in the From field might conflict with the 2414 user's privacy interests or their site's security policy, and hence 2415 it should not be transmitted without the user being able to 2416 disable, enable, and modify the contents of the field. The user 2417 must be able to set the contents of this field within a user 2418 preference or application defaults configuration. 2420 We suggest, though do not require, that a convenient toggle 2421 interface be provided for the user to enable or disable the sending 2422 of From and Referer information. 2424 13. Acknowledgments 2426 This specification makes heavy use of the augmented BNF and generic 2427 constructs defined by David H. Crocker for RFC 822 [7]. Similarly, 2428 it reuses many of the definitions provided by Nathaniel Borenstein 2429 and Ned Freed for MIME [5]. We hope that their inclusion in this 2430 specification will help reduce past confusion over the relationship 2431 between HTTP/1.0 and Internet mail message formats. 2433 The HTTP protocol has evolved considerably over the past four 2434 years. 
It has benefited from a large and active developer 2435 community--the many people who have participated on the www-talk 2436 mailing list--and it is that community which has been most 2437 responsible for the success of HTTP and of the World-Wide Web in 2438 general. Marc Andreessen, Robert Cailliau, Daniel W. Connolly, 2439 Bob Denny, Jean-Francois Groff, Phillip M. Hallam-Baker, 2440 Hakon W. Lie, Ari Luotonen, Rob McCool, Lou Montulli, Dave Raggett, 2441 Tony Sanders, and Marc VanHeyningen deserve special recognition for 2442 their efforts in defining aspects of the protocol for early versions 2443 of this specification. 2445 This document has benefited greatly from the comments of all those 2446 participating in the HTTP-WG. In addition to those already 2447 mentioned, the following individuals have contributed to this 2448 specification: 2450 Gary Adams Harald Tveit Alvestrand 2451 Keith Ball Brian Behlendorf 2452 Paul Burchard Maurizio Codogno 2453 Mike Cowlishaw Roman Czyborra 2454 Michael A. Dolan John Franks 2455 Jim Gettys Marc Hedlund 2456 Koen Holtman Alex Hopmann 2457 Bob Jernigan Shel Kaphan 2458 Martijn Koster Dave Kristol 2459 Daniel LaLiberte Paul Leach 2460 Albert Lunde John C. Mallery 2461 Larry Masinter Mitra 2462 Gavin Nicol Bill Perry 2463 Jeffrey Perry Owen Rees 2464 David Robinson Marc Salomon 2465 Rich Salz Jim Seidman 2466 Chuck Shotton Eric W. Sink 2467 Simon E. Spero Robert S. Thau 2468 Francois Yergeau Mary Ellen Zurko 2469 Jean-Philippe Martin-Flatin 2471 14. References 2473 [1] F. Anklesaria, M. McCahill, P. Lindner, D. Johnson, D. Torrey, 2474 and B. Alberti. "The Internet Gopher Protocol: A distributed 2475 document search and retrieval protocol." RFC 1436, University 2476 of Minnesota, March 1993. 2478 [2] T. Berners-Lee. "Universal Resource Identifiers in WWW: A 2479 Unifying Syntax for the Expression of Names and Addresses of 2480 Objects on the Network as used in the World-Wide Web." RFC 2481 1630, CERN, June 1994. 2483 [3] T. 
Berners-Lee and D. Connolly. "HyperText Markup Language 2484 Specification - 2.0." Work in Progress 2485 (draft-ietf-html-spec-05.txt), MIT/W3C, August 1995. 2487 [4] T. Berners-Lee, L. Masinter, and M. McCahill. "Uniform Resource 2488 Locators (URL)." RFC 1738, CERN, Xerox PARC, University of 2489 Minnesota, December 1994. 2491 [5] N. Borenstein and N. Freed. "MIME (Multipurpose Internet Mail 2492 Extensions) Part One: Mechanisms for Specifying and Describing 2493 the Format of Internet Message Bodies." RFC 1521, Bellcore, 2494 Innosoft, September 1993. 2496 [6] R. Braden. "Requirements for Internet hosts - application and 2497 support." STD 3, RFC 1123, IETF, October 1989. 2499 [7] D. H. Crocker. "Standard for the Format of ARPA Internet Text 2500 Messages." STD 11, RFC 822, UDEL, August 1982. 2502 [8] F. Davis, B. Kahle, H. Morris, J. Salem, T. Shen, R. Wang, 2503 J. Sui, and M. Grinbaum. "WAIS Interface Protocol Prototype 2504 Functional Specification." (v1.5), Thinking Machines 2505 Corporation, April 1990. 2507 [9] R. Fielding. "Relative Uniform Resource Locators." RFC 1808, 2508 UC Irvine, June 1995. 2510 [10] M. Horton and R. Adams. "Standard for interchange of USENET 2511 messages." RFC 1036 (Obsoletes RFC 850), AT&T Bell 2512 Laboratories, Center for Seismic Studies, December 1987. 2514 [11] B. Kantor and P. Lapsley. "Network News Transfer Protocol: 2515 A Proposed Standard for the Stream-Based Transmission of News." 2516 RFC 977, UC San Diego, UC Berkeley, February 1986. 2518 [12] J. Postel. "Simple Mail Transfer Protocol." STD 10, RFC 821, 2519 USC/ISI, August 1982. 2521 [13] J. Postel. "Media Type Registration Procedure." RFC 1590, 2522 USC/ISI, March 1994. 2524 [14] J. Postel and J. K. Reynolds. "File Transfer Protocol (FTP)." 2525 STD 9, RFC 959, USC/ISI, October 1985. 2527 [15] J. Reynolds and J. Postel. "Assigned Numbers." STD 2, RFC 1700, 2528 USC/ISI, October 1994. 2530 [16] K. Sollins and L. Masinter. 
"Functional Requirements for 2531 Uniform Resource Names." RFC 1737, MIT/LCS, Xerox Corporation, 2532 December 1994. 2534 [17] US-ASCII. Coded Character Set - 7-Bit American Standard Code 2535 for Information Interchange. Standard ANSI X3.4-1986, ANSI, 2536 1986. 2538 [18] ISO-8859. International Standard -- Information Processing -- 2539 8-bit Single-Byte Coded Graphic Character Sets -- 2540 Part 1: Latin alphabet No. 1, ISO 8859-1:1987. 2541 Part 2: Latin alphabet No. 2, ISO 8859-2, 1987. 2542 Part 3: Latin alphabet No. 3, ISO 8859-3, 1988. 2543 Part 4: Latin alphabet No. 4, ISO 8859-4, 1988. 2544 Part 5: Latin/Cyrillic alphabet, ISO 8859-5, 1988. 2545 Part 6: Latin/Arabic alphabet, ISO 8859-6, 1987. 2546 Part 7: Latin/Greek alphabet, ISO 8859-7, 1987. 2547 Part 8: Latin/Hebrew alphabet, ISO 8859-8, 1988. 2548 Part 9: Latin alphabet No. 5, ISO 8859-9, 1990. 2550 15. Authors' Addresses 2552 Tim Berners-Lee 2553 Director, W3 Consortium 2554 MIT Laboratory for Computer Science 2555 545 Technology Square 2556 Cambridge, MA 02139, U.S.A. 2557 Tel: +1 (617) 253 5702 2558 Fax: +1 (617) 258 8682 2559 Email: timbl@w3.org 2561 Roy T. Fielding 2562 Department of Information and Computer Science 2563 University of California 2564 Irvine, CA 92717-3425, U.S.A. 2565 Tel: +1 (714) 824-4049 2566 Fax: +1 (714) 824-4056 2567 Email: fielding@ics.uci.edu 2569 Henrik Frystyk Nielsen 2570 W3 Consortium 2571 MIT Laboratory for Computer Science 2572 545 Technology Square 2573 Cambridge, MA 02139, U.S.A. 2574 Tel: +1 (617) 258 8143 2575 Fax: +1 (617) 258 8682 2576 Email: frystyk@w3.org 2578 Appendices 2580 These appendices are provided for informational reasons only -- they 2581 do not form a part of the HTTP/1.0 specification. 2583 A. Internet Media Type message/http 2585 In addition to defining the HTTP/1.0 protocol, this document serves 2586 as the specification for the Internet media type "message/http". 2587 The following is to be registered with IANA [13]. 
2589 Media Type name: message 2591 Media subtype name: http 2593 Required parameters: none 2595 Optional parameters: version, msgtype 2597 version: The HTTP-Version number of the enclosed message 2598 (e.g., "1.0"). If not present, the version can be 2599 determined from the first line of the body. 2601 msgtype: The message type -- "request" or "response". If 2602 not present, the type can be determined from the 2603 first line of the body. 2605 Encoding considerations: only "7bit", "8bit", or "binary" are 2606 permitted 2608 Security considerations: none 2610 B. Tolerant Applications 2612 Although this document specifies the requirements for the 2613 generation of HTTP/1.0 messages, not all applications will be 2614 correct in their implementation. We therefore recommend that 2615 operational applications be tolerant of deviations whenever those 2616 deviations can be interpreted unambiguously. 2618 Clients should be tolerant in parsing the Status-Line and servers 2619 tolerant when parsing the Request-Line. In particular, they should 2620 accept any amount of SP or HT characters between fields, even 2621 though only a single SP is required. 2623 The line terminator for HTTP-header fields is the sequence CRLF. 2624 However, we recommend that applications, when parsing such headers, 2625 recognize a single LF as a line terminator and ignore the leading 2626 CR. 2628 C. Relationship to MIME 2630 HTTP/1.0 reuses many of the constructs defined for Internet Mail 2631 (RFC 822 [7]) and the Multipurpose Internet Mail Extensions 2632 (MIME [5]) to allow entities to be transmitted in an open variety 2633 of representations and with extensible mechanisms. However, HTTP is 2634 not a MIME-compliant application. HTTP's performance requirements 2635 differ substantially from those of Internet mail. 
Since it is not 2636 limited by the restrictions of existing mail protocols and SMTP 2637 gateways, HTTP does not obey some of the constraints imposed by 2638 RFC 822 and MIME for mail transport. 2640 This appendix describes specific areas where HTTP differs from 2641 MIME. Proxies/gateways to MIME-compliant protocols must be aware of 2642 these differences and provide the appropriate conversions where 2643 necessary. 2645 C.1 Conversion to Canonical Form 2647 MIME requires that an entity be converted to canonical form prior 2648 to being transferred, as described in Appendix G of RFC 1521 [5]. 2649 Although HTTP does require media types to be transferred in 2650 canonical form, it changes the definition of "canonical form" for 2651 text-based media types as described in Section 3.6.1. 2653 C.1.1 Representation of Line Breaks 2655 MIME requires that the canonical form of any text type represent 2656 line breaks as CRLF and forbids the use of CR or LF outside of line 2657 break sequences. Since HTTP allows CRLF, bare CR, and bare LF (or 2658 the octet sequence(s) to which they would be translated for the 2659 given character set) to indicate a line break within text content, 2660 recipients of an HTTP message cannot rely upon receiving 2661 MIME-canonical line breaks in text. 2663 Where it is possible, a proxy or gateway from HTTP to a 2664 MIME-compliant protocol should translate all line breaks within 2665 text/* media types to the MIME canonical form of CRLF. However, 2666 this may be complicated by the presence of a Content-Encoding and 2667 by the fact that HTTP allows the use of some character sets which 2668 do not use octets 13 and 10 to represent CR and LF, as is the case 2669 for some multi-byte character sets. If canonicalization is 2670 performed, the Content-Length header field value must be updated to 2671 reflect the new body length. 
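The translation described in C.1.1 might be sketched as follows. This is an illustrative sketch under stated assumptions: the function name is hypothetical, headers are modeled as a simple dictionary with exactly the key spellings shown, the body carries no Content-Encoding, and the character set represents CR and LF as octets 13 and 10. Bodies that do not meet these assumptions are passed through unchanged.

```python
import re

# CRLF is listed first so that an existing CRLF is matched as a unit
# and is not expanded twice.
_LINE_BREAK = re.compile(rb"\r\n|\r|\n")


def canonicalize_text_body(headers, body):
    """Return (headers, body) with MIME-canonical line breaks.

    Hypothetical helper: `headers` is a dict of field names to values
    and `body` is the Entity-Body as bytes.
    """
    media_type = headers.get("Content-Type", "").strip().lower()
    if not media_type.startswith("text/") or "Content-Encoding" in headers:
        # Outside the scope of this sketch; leave the message alone.
        return dict(headers), body
    new_body = _LINE_BREAK.sub(b"\r\n", body)
    new_headers = dict(headers)
    # The Content-Length must reflect the new body length.
    new_headers["Content-Length"] = str(len(new_body))
    return new_headers, new_body


hdrs, out = canonicalize_text_body(
    {"Content-Type": "text/plain", "Content-Length": "8"},
    b"a\nb\rc\r\nd",
)
print(hdrs["Content-Length"], out)
```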
2673 C.1.2 Default Character Set 2675 MIME requires that all subtypes of the top-level Content-Type 2676 "text" have a default character set of US-ASCII [17]. In contrast, 2677 HTTP defines the default character set for "text" to be 2678 ISO-8859-1 [18] (a superset of US-ASCII). Therefore, if a text/* 2679 media type given in the Content-Type header field does not already 2680 include an explicit charset parameter, the parameter 2682 ;charset="iso-8859-1" 2684 should be added by the proxy/gateway if the entity contains any 2685 octets greater than 127. 2687 C.2 Conversion of Date Formats 2689 HTTP/1.0 uses a restricted subset of date formats to simplify the 2690 process of date comparison. Proxies/gateways from other protocols 2691 should ensure that any Date header field present in a message 2692 conforms to one of the HTTP/1.0 formats and rewrite the date if 2693 necessary. 2695 C.3 Introduction of Content-Encoding 2697 MIME does not include any concept equivalent to HTTP's 2698 Content-Encoding header field. Since this acts as a modifier on the 2699 media type, proxies/gateways to MIME-compliant protocols must 2700 either change the value of the Content-Type header field or decode 2701 the Entity-Body before forwarding the message. 2703 Note: Some experimental applications of Content-Type for 2704 Internet mail have used a media-type parameter of 2705 ";conversions=<content-coding>" to perform an equivalent 2706 function as Content-Encoding. However, this parameter is not 2707 part of the MIME specification at the time of this writing. 2709 C.4 No Content-Transfer-Encoding 2711 HTTP does not use the Content-Transfer-Encoding (CTE) field of 2712 MIME. Proxies/gateways from MIME-compliant protocols must remove 2713 any non-identity CTE ("quoted-printable" or "base64") encoding 2714 prior to delivering the response message to an HTTP client.
2715 Proxies/gateways to MIME-compliant protocols are responsible for 2716 ensuring that the message is in the correct format and encoding for 2717 safe transport on that protocol, where "safe transport" is defined 2718 by the limitations of the protocol being used. At a minimum, the 2719 CTE field of 2721 Content-Transfer-Encoding: binary 2723 should be added by the proxy/gateway if it is unwilling to apply a 2724 content transfer encoding. 2726 An HTTP client may include a Content-Transfer-Encoding as an 2727 extension Entity-Header in a POST request when it knows the 2728 destination of that request is a proxy/gateway to a MIME-compliant 2729 protocol.
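For the inbound direction described in C.4, a gateway from a MIME-compliant protocol might remove the content transfer encoding roughly as shown below. This is a sketch only: the function name is hypothetical, headers are modeled as a dictionary, and recomputing Content-Length after decoding is an assumption of the sketch rather than a requirement stated in this appendix.

```python
import base64
import quopri


def remove_cte(headers, body):
    """Undo a MIME Content-Transfer-Encoding. Hypothetical helper.

    `headers` is a dict of field names to values; `body` is bytes.
    """
    # Field names are matched case-insensitively; "7bit" is the MIME
    # default when the field is absent.
    cte = next(
        (v for k, v in headers.items()
         if k.lower() == "content-transfer-encoding"),
        "7bit",
    ).strip().lower()
    if cte == "base64":
        body = base64.b64decode(body)
    elif cte == "quoted-printable":
        body = quopri.decodestring(body)
    elif cte not in ("7bit", "8bit", "binary"):
        raise ValueError("unrecognized Content-Transfer-Encoding: " + cte)
    # HTTP does not use the CTE field, so it is dropped on the way out.
    new_headers = {
        k: v for k, v in headers.items()
        if k.lower() != "content-transfer-encoding"
    }
    # Assumption: the length is recomputed for the decoded body.
    new_headers["Content-Length"] = str(len(body))
    return new_headers, body


print(remove_cte({"Content-Transfer-Encoding": "base64"}, b"aGVsbG8="))
```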