idnits 2.17.1 draft-ietf-httpbis-p1-messaging-21.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- -- The draft header indicates that this document obsoletes RFC2616, but the abstract doesn't seem to mention this, which it should. -- The draft header indicates that this document obsoletes RFC2145, but the abstract doesn't seem to mention this, which it should. -- The draft header indicates that this document updates RFC2817, but the abstract doesn't seem to mention this, which it should. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year (Using the creation date from RFC2817, updated by this document, for RFC5378 checks: 1998-11-18) -- The document seems to contain a disclaimer for pre-RFC5378 work, and may have content which was first submitted before 10 November 2008. The disclaimer is necessary when there are original authors that you have been unable to contact, or if some do not wish to grant the BCP78 rights to the IETF Trust. If you are able to get all authors (current and original) to grant those rights, you can and should remove the disclaimer; otherwise, the disclaimer is needed and you can ignore this comment. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (October 4, 2012) is 4215 days in the past. Is this intentional? 
-- Found something which looks like a code comment -- if you have code sections in the document, please surround them with '' and '' lines. Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Unused Reference: 'Part7' is defined on line 3041, but no explicit reference was found in the text == Unused Reference: 'RFC2145' is defined on line 3102, but no explicit reference was found in the text == Outdated reference: A later version (-26) exists of draft-ietf-httpbis-p2-semantics-21 == Outdated reference: A later version (-26) exists of draft-ietf-httpbis-p4-conditional-21 == Outdated reference: A later version (-26) exists of draft-ietf-httpbis-p5-range-21 == Outdated reference: A later version (-26) exists of draft-ietf-httpbis-p6-cache-21 == Outdated reference: A later version (-26) exists of draft-ietf-httpbis-p7-auth-21 ** Downref: Normative reference to an Informational RFC: RFC 1950 ** Downref: Normative reference to an Informational RFC: RFC 1951 ** Downref: Normative reference to an Informational RFC: RFC 1952 -- Possible downref: Non-RFC (?) normative reference: ref. 
'USASCII' -- Obsolete informational reference (is this intentional?): RFC 2068 (Obsoleted by RFC 2616) -- Obsolete informational reference (is this intentional?): RFC 2145 (Obsoleted by RFC 7230) -- Obsolete informational reference (is this intentional?): RFC 2616 (Obsoleted by RFC 7230, RFC 7231, RFC 7232, RFC 7233, RFC 7234, RFC 7235) -- Obsolete informational reference (is this intentional?): RFC 2818 (Obsoleted by RFC 9110) -- Obsolete informational reference (is this intentional?): RFC 2965 (Obsoleted by RFC 6265) -- Obsolete informational reference (is this intentional?): RFC 4288 (Obsoleted by RFC 6838) -- Obsolete informational reference (is this intentional?): RFC 4395 (Obsoleted by RFC 7595) -- Obsolete informational reference (is this intentional?): RFC 5226 (Obsoleted by RFC 8126) -- Obsolete informational reference (is this intentional?): RFC 5246 (Obsoleted by RFC 8446) Summary: 3 errors (**), 0 flaws (~~), 8 warnings (==), 16 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 HTTPbis Working Group R. Fielding, Ed. 3 Internet-Draft Adobe 4 Obsoletes: 2145,2616 (if approved) J. Reschke, Ed. 5 Updates: 2817 (if approved) greenbytes 6 Intended status: Standards Track October 4, 2012 7 Expires: April 7, 2013 9 Hypertext Transfer Protocol (HTTP/1.1): Message Syntax and Routing 10 draft-ietf-httpbis-p1-messaging-21 12 Abstract 14 The Hypertext Transfer Protocol (HTTP) is an application-level 15 protocol for distributed, collaborative, hypertext information 16 systems. HTTP has been in use by the World Wide Web global 17 information initiative since 1990. 
This document provides an 18 overview of HTTP architecture and its associated terminology, defines 19 the "http" and "https" Uniform Resource Identifier (URI) schemes, 20 defines the HTTP/1.1 message syntax and parsing requirements, and 21 describes general security concerns for implementations. 23 Editorial Note (To be removed by RFC Editor) 25 Discussion of this draft takes place on the HTTPBIS working group 26 mailing list (ietf-http-wg@w3.org), which is archived at 27 . 29 The current issues list is at 30 and related 31 documents (including fancy diffs) can be found at 32 . 34 The changes in this draft are summarized in Appendix D.22. 36 Status of This Memo 38 This Internet-Draft is submitted in full conformance with the 39 provisions of BCP 78 and BCP 79. 41 Internet-Drafts are working documents of the Internet Engineering 42 Task Force (IETF). Note that other groups may also distribute 43 working documents as Internet-Drafts. The list of current Internet- 44 Drafts is at http://datatracker.ietf.org/drafts/current/. 46 Internet-Drafts are draft documents valid for a maximum of six months 47 and may be updated, replaced, or obsoleted by other documents at any 48 time. It is inappropriate to use Internet-Drafts as reference 49 material or to cite them other than as "work in progress." 51 This Internet-Draft will expire on April 7, 2013. 53 Copyright Notice 55 Copyright (c) 2012 IETF Trust and the persons identified as the 56 document authors. All rights reserved. 58 This document is subject to BCP 78 and the IETF Trust's Legal 59 Provisions Relating to IETF Documents 60 (http://trustee.ietf.org/license-info) in effect on the date of 61 publication of this document. Please review these documents 62 carefully, as they describe your rights and restrictions with respect 63 to this document. 
Code Components extracted from this document must 64 include Simplified BSD License text as described in Section 4.e of 65 the Trust Legal Provisions and are provided without warranty as 66 described in the Simplified BSD License. 68 This document may contain material from IETF Documents or IETF 69 Contributions published or made publicly available before November 70 10, 2008. The person(s) controlling the copyright in some of this 71 material may not have granted the IETF Trust the right to allow 72 modifications of such material outside the IETF Standards Process. 73 Without obtaining an adequate license from the person(s) controlling 74 the copyright in such materials, this document may not be modified 75 outside the IETF Standards Process, and derivative works of it may 76 not be created outside the IETF Standards Process, except to format 77 it for publication as an RFC or to translate it into languages other 78 than English. 80 Table of Contents 82 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 5 83 1.1. Requirement Notation . . . . . . . . . . . . . . . . . . . 6 84 1.2. Syntax Notation . . . . . . . . . . . . . . . . . . . . . 6 85 2. Architecture . . . . . . . . . . . . . . . . . . . . . . . . . 6 86 2.1. Client/Server Messaging . . . . . . . . . . . . . . . . . 7 87 2.2. Implementation Diversity . . . . . . . . . . . . . . . . . 8 88 2.3. Intermediaries . . . . . . . . . . . . . . . . . . . . . . 9 89 2.4. Caches . . . . . . . . . . . . . . . . . . . . . . . . . . 11 90 2.5. Conformance and Error Handling . . . . . . . . . . . . . . 12 91 2.6. Protocol Versioning . . . . . . . . . . . . . . . . . . . 13 92 2.7. Uniform Resource Identifiers . . . . . . . . . . . . . . . 15 93 2.7.1. http URI scheme . . . . . . . . . . . . . . . . . . . 16 94 2.7.2. https URI scheme . . . . . . . . . . . . . . . . . . . 17 95 2.7.3. http and https URI Normalization and Comparison . . . 18 96 3. Message Format . . . . . . . . . . . . . . . . . . . . . . . 
. 18 97 3.1. Start Line . . . . . . . . . . . . . . . . . . . . . . . . 19 98 3.1.1. Request Line . . . . . . . . . . . . . . . . . . . . . 20 99 3.1.2. Status Line . . . . . . . . . . . . . . . . . . . . . 21 100 3.2. Header Fields . . . . . . . . . . . . . . . . . . . . . . 21 101 3.2.1. Whitespace . . . . . . . . . . . . . . . . . . . . . . 23 102 3.2.2. Field Parsing . . . . . . . . . . . . . . . . . . . . 23 103 3.2.3. Field Length . . . . . . . . . . . . . . . . . . . . . 24 104 3.2.4. Field value components . . . . . . . . . . . . . . . . 24 105 3.3. Message Body . . . . . . . . . . . . . . . . . . . . . . . 26 106 3.3.1. Transfer-Encoding . . . . . . . . . . . . . . . . . . 26 107 3.3.2. Content-Length . . . . . . . . . . . . . . . . . . . . 28 108 3.3.3. Message Body Length . . . . . . . . . . . . . . . . . 29 109 3.4. Handling Incomplete Messages . . . . . . . . . . . . . . . 31 110 3.5. Message Parsing Robustness . . . . . . . . . . . . . . . . 32 111 4. Transfer Codings . . . . . . . . . . . . . . . . . . . . . . . 32 112 4.1. Chunked Transfer Coding . . . . . . . . . . . . . . . . . 33 113 4.1.1. Trailer . . . . . . . . . . . . . . . . . . . . . . . 34 114 4.1.2. Decoding chunked . . . . . . . . . . . . . . . . . . . 35 115 4.2. Compression Codings . . . . . . . . . . . . . . . . . . . 35 116 4.2.1. Compress Coding . . . . . . . . . . . . . . . . . . . 35 117 4.2.2. Deflate Coding . . . . . . . . . . . . . . . . . . . . 35 118 4.2.3. Gzip Coding . . . . . . . . . . . . . . . . . . . . . 36 119 4.3. TE . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36 120 5. Message Routing . . . . . . . . . . . . . . . . . . . . . . . 37 121 5.1. Identifying a Target Resource . . . . . . . . . . . . . . 37 122 5.2. Connecting Inbound . . . . . . . . . . . . . . . . . . . . 37 123 5.3. Request Target . . . . . . . . . . . . . . . . . . . . . . 38 124 5.4. Host . . . . . . . . . . . . . . . . . . . . . . . . . . . 40 125 5.5. Effective Request URI . . . . . . . 
. . . . . . . . . . . 41 126 5.6. Message Forwarding . . . . . . . . . . . . . . . . . . . . 42 127 5.7. Via . . . . . . . . . . . . . . . . . . . . . . . . . . . 43 128 5.8. Message Transforming . . . . . . . . . . . . . . . . . . . 44 129 5.9. Associating a Response to a Request . . . . . . . . . . . 46 130 6. Connection Management . . . . . . . . . . . . . . . . . . . . 46 131 6.1. Connection . . . . . . . . . . . . . . . . . . . . . . . . 46 132 6.2. Persistent Connections . . . . . . . . . . . . . . . . . . 48 133 6.2.1. Establishment . . . . . . . . . . . . . . . . . . . . 49 134 6.2.2. Reuse . . . . . . . . . . . . . . . . . . . . . . . . 50 135 6.2.3. Concurrency . . . . . . . . . . . . . . . . . . . . . 51 136 6.2.4. Failures and Time-outs . . . . . . . . . . . . . . . . 51 137 6.2.5. Tear-down . . . . . . . . . . . . . . . . . . . . . . 52 138 6.3. Upgrade . . . . . . . . . . . . . . . . . . . . . . . . . 53 139 7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 54 140 7.1. Header Field Registration . . . . . . . . . . . . . . . . 54 141 7.2. URI Scheme Registration . . . . . . . . . . . . . . . . . 55 142 7.3. Internet Media Type Registrations . . . . . . . . . . . . 56 143 7.3.1. Internet Media Type message/http . . . . . . . . . . . 56 144 7.3.2. Internet Media Type application/http . . . . . . . . . 57 146 7.4. Transfer Coding Registry . . . . . . . . . . . . . . . . . 58 147 7.5. Transfer Coding Registrations . . . . . . . . . . . . . . 58 148 7.6. Upgrade Token Registry . . . . . . . . . . . . . . . . . . 59 149 7.7. Upgrade Token Registration . . . . . . . . . . . . . . . . 60 150 8. Security Considerations . . . . . . . . . . . . . . . . . . . 60 151 8.1. Personal Information . . . . . . . . . . . . . . . . . . . 60 152 8.2. Abuse of Server Log Information . . . . . . . . . . . . . 60 153 8.3. Attacks Based On File and Path Names . . . . . . . . . . . 61 154 8.4. DNS-related Attacks . . . . . . . . . . . . . . . . . . . 61 155 8.5. 
Intermediaries and Caching . . . . . . . . . . . . . . . . 61 156 8.6. Protocol Element Size Overflows . . . . . . . . . . . . . 62 157 9. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 62 158 10. References . . . . . . . . . . . . . . . . . . . . . . . . . . 64 159 10.1. Normative References . . . . . . . . . . . . . . . . . . . 64 160 10.2. Informative References . . . . . . . . . . . . . . . . . . 65 161 Appendix A. HTTP Version History . . . . . . . . . . . . . . . . 67 162 A.1. Changes from HTTP/1.0 . . . . . . . . . . . . . . . . . . 67 163 A.1.1. Multi-homed Web Servers . . . . . . . . . . . . . . . 68 164 A.1.2. Keep-Alive Connections . . . . . . . . . . . . . . . . 68 165 A.1.3. Introduction of Transfer-Encoding . . . . . . . . . . 69 166 A.2. Changes from RFC 2616 . . . . . . . . . . . . . . . . . . 69 167 Appendix B. ABNF list extension: #rule . . . . . . . . . . . . . 70 168 Appendix C. Collected ABNF . . . . . . . . . . . . . . . . . . . 71 169 Appendix D. Change Log (to be removed by RFC Editor before 170 publication) . . . . . . . . . . . . . . . . . . . . 74 171 D.1. Since RFC 2616 . . . . . . . . . . . . . . . . . . . . . . 74 172 D.2. Since draft-ietf-httpbis-p1-messaging-00 . . . . . . . . . 74 173 D.3. Since draft-ietf-httpbis-p1-messaging-01 . . . . . . . . . 75 174 D.4. Since draft-ietf-httpbis-p1-messaging-02 . . . . . . . . . 76 175 D.5. Since draft-ietf-httpbis-p1-messaging-03 . . . . . . . . . 77 176 D.6. Since draft-ietf-httpbis-p1-messaging-04 . . . . . . . . . 77 177 D.7. Since draft-ietf-httpbis-p1-messaging-05 . . . . . . . . . 78 178 D.8. Since draft-ietf-httpbis-p1-messaging-06 . . . . . . . . . 79 179 D.9. Since draft-ietf-httpbis-p1-messaging-07 . . . . . . . . . 79 180 D.10. Since draft-ietf-httpbis-p1-messaging-08 . . . . . . . . . 80 181 D.11. Since draft-ietf-httpbis-p1-messaging-09 . . . . . . . . . 80 182 D.12. Since draft-ietf-httpbis-p1-messaging-10 . . . . . . . . . 81 183 D.13. 
Since draft-ietf-httpbis-p1-messaging-11 . . . . . . . . . 81 184 D.14. Since draft-ietf-httpbis-p1-messaging-12 . . . . . . . . . 82 185 D.15. Since draft-ietf-httpbis-p1-messaging-13 . . . . . . . . . 82 186 D.16. Since draft-ietf-httpbis-p1-messaging-14 . . . . . . . . . 83 187 D.17. Since draft-ietf-httpbis-p1-messaging-15 . . . . . . . . . 83 188 D.18. Since draft-ietf-httpbis-p1-messaging-16 . . . . . . . . . 83 189 D.19. Since draft-ietf-httpbis-p1-messaging-17 . . . . . . . . . 84 190 D.20. Since draft-ietf-httpbis-p1-messaging-18 . . . . . . . . . 84 191 D.21. Since draft-ietf-httpbis-p1-messaging-19 . . . . . . . . . 84 192 D.22. Since draft-ietf-httpbis-p1-messaging-20 . . . . . . . . . 85 193 Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85 195 1. Introduction 197 The Hypertext Transfer Protocol (HTTP) is an application-level 198 request/response protocol that uses extensible semantics and MIME- 199 like message payloads for flexible interaction with network-based 200 hypertext information systems. This document is the first in a 201 series of documents that collectively form the HTTP/1.1 202 specification: 204 RFC xxx1: Message Syntax and Routing 206 RFC xxx2: Semantics and Content 208 RFC xxx3: Conditional Requests 210 RFC xxx4: Range Requests 212 RFC xxx5: Caching 214 RFC xxx6: Authentication 216 This HTTP/1.1 specification obsoletes and moves to historic status 217 RFC 2616, its predecessor RFC 2068, RFC 2145 (on HTTP versioning), 218 and RFC 2817 (on using CONNECT for TLS upgrades). 220 HTTP is a generic interface protocol for information systems. It is 221 designed to hide the details of how a service is implemented by 222 presenting a uniform interface to clients that is independent of the 223 types of resources provided. 
Likewise, servers do not need to be 224 aware of each client's purpose: an HTTP request can be considered in 225 isolation rather than being associated with a specific type of client 226 or a predetermined sequence of application steps. The result is a 227 protocol that can be used effectively in many different contexts and 228 for which implementations can evolve independently over time. 230 HTTP is also designed for use as an intermediation protocol for 231 translating communication to and from non-HTTP information systems. 232 HTTP proxies and gateways can provide access to alternative 233 information services by translating their diverse protocols into a 234 hypertext format that can be viewed and manipulated by clients in the 235 same way as HTTP services. 237 One consequence of HTTP flexibility is that the protocol cannot be 238 defined in terms of what occurs behind the interface. Instead, we 239 are limited to defining the syntax of communication, the intent of 240 received communication, and the expected behavior of recipients. If 241 the communication is considered in isolation, then successful actions 242 ought to be reflected in corresponding changes to the observable 243 interface provided by servers. However, since multiple clients might 244 act in parallel and perhaps at cross-purposes, we cannot require that 245 such changes be observable beyond the scope of a single response. 247 This document describes the architectural elements that are used or 248 referred to in HTTP, defines the "http" and "https" URI schemes, 249 describes overall network operation and connection management, and 250 defines HTTP message framing and forwarding requirements. Our goal 251 is to define all of the mechanisms necessary for HTTP message 252 handling that are independent of message semantics, thereby defining 253 the complete set of requirements for message parsers and message- 254 forwarding intermediaries. 256 1.1. 
Requirement Notation 258 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 259 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 260 document are to be interpreted as described in [RFC2119]. 262 Conformance criteria and considerations regarding error handling are 263 defined in Section 2.5. 265 1.2. Syntax Notation 267 This specification uses the Augmented Backus-Naur Form (ABNF) 268 notation of [RFC5234] with the list rule extension defined in 269 Appendix B. Appendix C shows the collected ABNF with the list rule 270 expanded. 272 The following core rules are included by reference, as defined in 273 [RFC5234], Appendix B.1: ALPHA (letters), CR (carriage return), CRLF 274 (CR LF), CTL (controls), DIGIT (decimal 0-9), DQUOTE (double quote), 275 HEXDIG (hexadecimal 0-9/A-F/a-f), HTAB (horizontal tab), LF (line 276 feed), OCTET (any 8-bit sequence of data), SP (space), and VCHAR (any 277 visible [USASCII] character). 279 As a convention, ABNF rule names prefixed with "obs-" denote 280 "obsolete" grammar rules that appear for historical reasons. 282 2. Architecture 284 HTTP was created for the World Wide Web architecture and has evolved 285 over time to support the scalability needs of a worldwide hypertext 286 system. Much of that architecture is reflected in the terminology 287 and syntax productions used to define HTTP. 289 2.1. Client/Server Messaging 291 HTTP is a stateless request/response protocol that operates by 292 exchanging messages (Section 3) across a reliable transport or 293 session-layer "connection" (Section 6). An HTTP "client" is a 294 program that establishes a connection to a server for the purpose of 295 sending one or more HTTP requests. An HTTP "server" is a program 296 that accepts connections in order to service HTTP requests by sending 297 HTTP responses. 299 The terms client and server refer only to the roles that these 300 programs perform for a particular connection. 
The same program might 301 act as a client on some connections and a server on others. We use 302 the term "user agent" to refer to the program that initiates a 303 request, such as a WWW browser, editor, or spider (web-traversing 304 robot), and the term "origin server" to refer to the program that can 305 originate authoritative responses to a request. For general 306 requirements, we use the term "sender" to refer to whichever 307 component sent a given message and the term "recipient" to refer to 308 any component that receives the message. 310 HTTP relies upon the Uniform Resource Identifier (URI) standard 311 [RFC3986] to indicate the target resource (Section 5.1) and 312 relationships between resources. Messages are passed in a format 313 similar to that used by Internet mail [RFC5322] and the Multipurpose 314 Internet Mail Extensions (MIME) [RFC2045] (see Appendix A of [Part2] 315 for the differences between HTTP and MIME messages). 317 Most HTTP communication consists of a retrieval request (GET) for a 318 representation of some resource identified by a URI. In the simplest 319 case, this might be accomplished via a single bidirectional 320 connection (===) between the user agent (UA) and the origin server 321 (O). 323 request > 324 UA ======================================= O 325 < response 327 A client sends an HTTP request to a server in the form of a request 328 message, beginning with a request-line that includes a method, URI, 329 and protocol version (Section 3.1.1), followed by header fields 330 containing request modifiers, client information, and representation 331 metadata (Section 3.2), an empty line to indicate the end of the 332 header section, and finally a message body containing the payload 333 body (if any, Section 3.3). 
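As a non-normative illustration of the request layout just described, the following Python sketch assembles a request message from its parts. The function name build_request and its parameters are hypothetical and are not defined by this specification. The sketch assumes that the header field names and values are already valid and performs no validation of its own.

```python
def build_request(method, target, headers, body=b""):
    """Assemble an HTTP/1.1 request message as bytes (illustrative only)."""
    # request-line: method, request target, and protocol version
    lines = [f"{method} {target} HTTP/1.1"]
    # header fields, one "name: value" pair per line
    lines.extend(f"{name}: {value}" for name, value in headers)
    # CRLF ends each line; an empty line ends the header section
    head = "\r\n".join(lines) + "\r\n\r\n"
    # the message body (possibly empty) follows the header section
    return head.encode("ascii") + body


# Hypothetical usage: a GET request with a single Host header field
message = build_request("GET", "/hello.txt", [("Host", "www.example.com")])
```

The sketch keeps the header section as text and the body as bytes because the body is opaque payload data, whereas the request-line and header fields are line-oriented.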
335 A server responds to a client's request by sending one or more HTTP 336 response messages, each beginning with a status line that includes 337 the protocol version, a success or error code, and textual reason 338 phrase (Section 3.1.2), possibly followed by header fields containing 339 server information, resource metadata, and representation metadata 340 (Section 3.2), an empty line to indicate the end of the header 341 section, and finally a message body containing the payload body (if 342 any, Section 3.3). 344 A connection might be used for multiple request/response exchanges, 345 as defined in Section 6.2. 347 The following example illustrates a typical message exchange for a 348 GET request on the URI "http://www.example.com/hello.txt": 350 client request: 352 GET /hello.txt HTTP/1.1 353 User-Agent: curl/7.16.3 libcurl/7.16.3 OpenSSL/0.9.7l zlib/1.2.3 354 Host: www.example.com 355 Accept-Language: en, mi 357 server response: 359 HTTP/1.1 200 OK 360 Date: Mon, 27 Jul 2009 12:28:53 GMT 361 Server: Apache 362 Last-Modified: Wed, 22 Jul 2009 19:15:56 GMT 363 ETag: "34aa387-d-1568eb00" 364 Accept-Ranges: bytes 365 Content-Length: 14 366 Vary: Accept-Encoding 367 Content-Type: text/plain 369 Hello World! 371 2.2. Implementation Diversity 373 When considering the design of HTTP, it is easy to fall into a trap 374 of thinking that all user agents are general-purpose browsers and all 375 origin servers are large public websites. That is not the case in 376 practice. Common HTTP user agents include household appliances, 377 stereos, scales, firmware update scripts, command-line programs, 378 mobile apps, and communication devices in a multitude of shapes and 379 sizes. Likewise, common HTTP origin servers include home automation 380 units, configurable networking components, office machines, 381 autonomous robots, news feeds, traffic cameras, ad selectors, and 382 video delivery platforms. 
384 The term "user agent" does not imply that there is a human user 385 directly interacting with the software agent at the time of a 386 request. In many cases, a user agent is installed or configured to 387 run in the background and save its results for later inspection (or 388 save only a subset of those results that might be interesting or 389 erroneous). Spiders, for example, are typically given a start URI 390 and configured to follow certain behavior while crawling the Web as a 391 hypertext graph. 393 The implementation diversity of HTTP means that we cannot assume the 394 user agent can make interactive suggestions to a user or provide 395 adequate warning for security or privacy options. In the few cases 396 where this specification requires reporting of errors to the user, it 397 is acceptable for such reporting to only be observable in an error 398 console or log file. Likewise, requirements that an automated action 399 be confirmed by the user before proceeding can be met via advance 400 configuration choices, run-time options, or simply not proceeding 401 with the unsafe action. 403 2.3. Intermediaries 405 HTTP enables the use of intermediaries to satisfy requests through a 406 chain of connections. There are three common forms of HTTP 407 intermediary: proxy, gateway, and tunnel. In some cases, a single 408 intermediary might act as an origin server, proxy, gateway, or 409 tunnel, switching behavior based on the nature of each request. 411 > > > > 412 UA =========== A =========== B =========== C =========== O 413 < < < < 415 The figure above shows three intermediaries (A, B, and C) between the 416 user agent and origin server. A request or response message that 417 travels the whole chain will pass through four separate connections. 418 Some HTTP communication options might apply only to the connection 419 with the nearest, non-tunnel neighbor, only to the end-points of the 420 chain, or to all connections along the chain.
Although the diagram 421 is linear, each participant might be engaged in multiple, 422 simultaneous communications. For example, B might be receiving 423 requests from many clients other than A, and/or forwarding requests 424 to servers other than C, at the same time that it is handling A's 425 request. 427 We use the terms "upstream" and "downstream" to describe various 428 requirements in relation to the directional flow of a message: all 429 messages flow from upstream to downstream. Likewise, we use the 430 terms inbound and outbound to refer to directions in relation to the 431 request path: "inbound" means toward the origin server and "outbound" 432 means toward the user agent. 434 A "proxy" is a message forwarding agent that is selected by the 435 client, usually via local configuration rules, to receive requests 436 for some type(s) of absolute URI and attempt to satisfy those 437 requests via translation through the HTTP interface. Some 438 translations are minimal, such as for proxy requests for "http" URIs, 439 whereas other requests might require translation to and from entirely 440 different application-level protocols. Proxies are often used to 441 group an organization's HTTP requests through a common intermediary 442 for the sake of security, annotation services, or shared caching. 444 An HTTP-to-HTTP proxy is called a "transforming proxy" if it is 445 designed or configured to modify request or response messages in a 446 semantically meaningful way (i.e., modifications, beyond those 447 required by normal HTTP processing, that change the message in a way 448 that would be significant to the original sender or potentially 449 significant to downstream recipients). For example, a transforming 450 proxy might be acting as a shared annotation server (modifying 451 responses to include references to a local annotation database), a 452 malware filter, a format transcoder, or an intranet-to-Internet 453 privacy filter. 
Such transformations are presumed to be desired by 454 the client (or client organization) that selected the proxy and are 455 beyond the scope of this specification. However, when a proxy is not 456 intended to transform a given message, we use the term "non- 457 transforming proxy" to target requirements that preserve HTTP message 458 semantics. See Section 7.3.4 of [Part2] and Section 7.5 of [Part6] 459 for status and warning codes related to transformations. 461 A "gateway" (a.k.a., "reverse proxy") is a receiving agent that acts 462 as a layer above some other server(s) and translates the received 463 requests to the underlying server's protocol. Gateways are often 464 used to encapsulate legacy or untrusted information services, to 465 improve server performance through "accelerator" caching, and to 466 enable partitioning or load-balancing of HTTP services across 467 multiple machines. 469 A gateway behaves as an origin server on its outbound connection and 470 as a user agent on its inbound connection. All HTTP requirements 471 applicable to an origin server also apply to the outbound 472 communication of a gateway. A gateway communicates with inbound 473 servers using any protocol that it desires, including private 474 extensions to HTTP that are outside the scope of this specification. 475 However, an HTTP-to-HTTP gateway that wishes to interoperate with 476 third-party HTTP servers MUST conform to HTTP user agent requirements 477 on the gateway's inbound connection and MUST implement the Connection 478 (Section 6.1) and Via (Section 5.7) header fields for both 479 connections. 481 A "tunnel" acts as a blind relay between two connections without 482 changing the messages. Once active, a tunnel is not considered a 483 party to the HTTP communication, though the tunnel might have been 484 initiated by an HTTP request. A tunnel ceases to exist when both 485 ends of the relayed connection are closed. 
Tunnels are used to 486 extend a virtual connection through an intermediary, such as when 487 Transport Layer Security (TLS, [RFC5246]) is used to establish 488 confidential communication through a shared firewall proxy. 490 The above categories for intermediary only consider those acting as 491 participants in the HTTP communication. There are also 492 intermediaries that can act on lower layers of the network protocol 493 stack, filtering or redirecting HTTP traffic without the knowledge or 494 permission of message senders. Network intermediaries often 495 introduce security flaws or interoperability problems by violating 496 HTTP semantics. For example, an "interception proxy" [RFC3040] (also 497 commonly known as a "transparent proxy" [RFC1919] or "captive 498 portal") differs from an HTTP proxy because it is not selected by the 499 client. Instead, an interception proxy filters or redirects outgoing 500 TCP port 80 packets (and occasionally other common port traffic). 501 Interception proxies are commonly found on public network access 502 points, as a means of enforcing account subscription prior to 503 allowing use of non-local Internet services, and within corporate 504 firewalls to enforce network usage policies. They are 505 indistinguishable from a man-in-the-middle attack. 507 HTTP is defined as a stateless protocol, meaning that each request 508 message can be understood in isolation. Many implementations depend 509 on HTTP's stateless design in order to reuse proxied connections or 510 dynamically load balance requests across multiple servers. Hence, 511 servers MUST NOT assume that two requests on the same connection are 512 from the same user agent unless the connection is secured and 513 specific to that agent. Some non-standard HTTP extensions (e.g., 514 [RFC4559]) have been known to violate this requirement, resulting in 515 security and interoperability problems. 517 2.4. 
Caches 519 A "cache" is a local store of previous response messages and the 520 subsystem that controls its message storage, retrieval, and deletion. 521 A cache stores cacheable responses in order to reduce the response 522 time and network bandwidth consumption on future, equivalent 523 requests. Any client or server MAY employ a cache, though a cache 524 cannot be used by a server while it is acting as a tunnel. 526 The effect of a cache is that the request/response chain is shortened 527 if one of the participants along the chain has a cached response 528 applicable to that request. The following illustrates the resulting 529 chain if B has a cached copy of an earlier response from O (via C) 530 for a request which has not been cached by UA or A. 532 > > 533 UA =========== A =========== B - - - - - - C - - - - - - O 534 < < 536 A response is "cacheable" if a cache is allowed to store a copy of 537 the response message for use in answering subsequent requests. Even 538 when a response is cacheable, there might be additional constraints 539 placed by the client or by the origin server on when that cached 540 response can be used for a particular request. HTTP requirements for 541 cache behavior and cacheable responses are defined in Section 2 of 542 [Part6]. 544 There are a wide variety of architectures and configurations of 545 caches and proxies deployed across the World Wide Web and inside 546 large organizations. These systems include national hierarchies of 547 proxy caches to save transoceanic bandwidth, systems that broadcast 548 or multicast cache entries, organizations that distribute subsets of 549 cached data via optical media, and so on. 551 2.5. Conformance and Error Handling 553 This specification targets conformance criteria according to the role 554 of a participant in HTTP communication. 
Hence, HTTP requirements are 555 placed on senders, recipients, clients, servers, user agents, 556 intermediaries, origin servers, proxies, gateways, or caches, 557 depending on what behavior is being constrained by the requirement. 558 Additional (social) requirements are placed on implementations, 559 resource owners, and protocol element registrations when they apply 560 beyond the scope of a single communication. 562 The verb "generate" is used instead of "send" where a requirement 563 differentiates between creating a protocol element and merely 564 forwarding a received element downstream. 566 An implementation is considered conformant if it complies with all of 567 the requirements associated with the roles it partakes in HTTP. Note 568 that SHOULD-level requirements are relevant here, unless one of the 569 documented exceptions is applicable. 571 Conformance applies to both the syntax and semantics of HTTP protocol 572 elements. A sender MUST NOT generate protocol elements that convey a 573 meaning that is known by that sender to be false. A sender MUST NOT 574 generate protocol elements that do not match the grammar defined by 575 the ABNF rules for those protocol elements that are applicable to the 576 sender's role. If a received protocol element is processed, the 577 recipient MUST be able to parse any value that would match the ABNF 578 rules for that protocol element, excluding only those rules not 579 applicable to the recipient's role. 581 Unless noted otherwise, a recipient MAY attempt to recover a usable 582 protocol element from an invalid construct. HTTP does not define 583 specific error handling mechanisms except when they have a direct 584 impact on security, since different applications of the protocol 585 require different error handling strategies. 
For example, a Web 586 browser might wish to transparently recover from a response where the 587 Location header field doesn't parse according to the ABNF, whereas a 588 systems control client might consider any form of error recovery to 589 be dangerous. 591 2.6. Protocol Versioning 593 HTTP uses a "<major>.<minor>" numbering scheme to indicate versions 594 of the protocol. This specification defines version "1.1". The 595 protocol version as a whole indicates the sender's conformance with 596 the set of requirements laid out in that version's corresponding 597 specification of HTTP. 599 The version of an HTTP message is indicated by an HTTP-version field 600 in the first line of the message. HTTP-version is case-sensitive. 602 HTTP-version = HTTP-name "/" DIGIT "." DIGIT 603 HTTP-name = %x48.54.54.50 ; "HTTP", case-sensitive 605 The HTTP version number consists of two decimal digits separated by a 606 "." (period or decimal point). The first digit ("major version") 607 indicates the HTTP messaging syntax, whereas the second digit ("minor 608 version") indicates the highest minor version to which the sender is 609 conformant and able to understand for future communication. The 610 minor version advertises the sender's communication capabilities even 611 when the sender is only using a backwards-compatible subset of the 612 protocol, thereby letting the recipient know that more advanced 613 features can be used in response (by servers) or in future requests 614 (by clients). 616 When an HTTP/1.1 message is sent to an HTTP/1.0 recipient [RFC1945] 617 or a recipient whose version is unknown, the HTTP/1.1 message is 618 constructed such that it can be interpreted as a valid HTTP/1.0 619 message if all of the newer features are ignored.
This specification 620 places recipient-version requirements on some new features so that a 621 conformant sender will only use compatible features until it has 622 determined, through configuration or the receipt of a message, that 623 the recipient supports HTTP/1.1. 625 The interpretation of a header field does not change between minor 626 versions of the same major HTTP version, though the default behavior 627 of a recipient in the absence of such a field can change. Unless 628 specified otherwise, header fields defined in HTTP/1.1 are defined 629 for all versions of HTTP/1.x. In particular, the Host and Connection 630 header fields ought to be implemented by all HTTP/1.x implementations 631 whether or not they advertise conformance with HTTP/1.1. 633 New header fields can be defined such that, when they are understood 634 by a recipient, they might override or enhance the interpretation of 635 previously defined header fields. When an implementation receives an 636 unrecognized header field, the recipient MUST ignore that header 637 field for local processing regardless of the message's HTTP version. 638 An unrecognized header field received by a proxy MUST be forwarded 639 downstream unless the header field's field-name is listed in the 640 message's Connection header field (see Section 6.1). These 641 requirements allow HTTP's functionality to be enhanced without 642 requiring prior update of deployed intermediaries. 644 Intermediaries that process HTTP messages (i.e., all intermediaries 645 other than those acting as tunnels) MUST send their own HTTP-version 646 in forwarded messages. In other words, they MUST NOT blindly forward 647 the first line of an HTTP message without ensuring that the protocol 648 version in that message matches a version to which that intermediary 649 is conformant for both the receiving and sending of messages. 
650 Forwarding an HTTP message without rewriting the HTTP-version might 651 result in communication errors when downstream recipients use the 652 message sender's version to determine what features are safe to use 653 for later communication with that sender. 655 An HTTP client SHOULD send a request version equal to the highest 656 version to which the client is conformant and whose major version is 657 no higher than the highest version supported by the server, if this 658 is known. An HTTP client MUST NOT send a version to which it is not 659 conformant. 661 An HTTP client MAY send a lower request version if it is known that 662 the server incorrectly implements the HTTP specification, but only 663 after the client has attempted at least one normal request and 664 determined from the response status or header fields (e.g., Server) 665 that the server improperly handles higher request versions. 667 An HTTP server SHOULD send a response version equal to the highest 668 version to which the server is conformant and whose major version is 669 less than or equal to the one received in the request. An HTTP 670 server MUST NOT send a version to which it is not conformant. A 671 server MAY send a 505 (HTTP Version Not Supported) response if it 672 cannot send a response using the major version used in the client's 673 request. 675 An HTTP server MAY send an HTTP/1.0 response to an HTTP/1.0 request 676 if it is known or suspected that the client incorrectly implements 677 the HTTP specification and is incapable of correctly processing later 678 version responses, such as when a client fails to parse the version 679 number correctly or when an intermediary is known to blindly forward 680 the HTTP-version even when it doesn't conform to the given minor 681 version of the protocol. 
Such protocol downgrades SHOULD NOT be 682 performed unless triggered by specific client attributes, such as 683 when one or more of the request header fields (e.g., User-Agent) 684 uniquely match the values sent by a client known to be in error. 686 The intention of HTTP's versioning design is that the major number 687 will only be incremented if an incompatible message syntax is 688 introduced, and that the minor number will only be incremented when 689 changes made to the protocol have the effect of adding to the message 690 semantics or implying additional capabilities of the sender. 691 However, the minor version was not incremented for the changes 692 introduced between [RFC2068] and [RFC2616], and this revision is 693 specifically avoiding any such changes to the protocol. 695 2.7. Uniform Resource Identifiers 697 Uniform Resource Identifiers (URIs) [RFC3986] are used throughout 698 HTTP as the means for identifying resources (Section 2 of [Part2]). 699 URI references are used to target requests, indicate redirects, and 700 define relationships. 702 This specification adopts the definitions of "URI-reference", 703 "absolute-URI", "relative-part", "port", "host", "path-abempty", 704 "path-absolute", "query", and "authority" from the URI generic 705 syntax. In addition, we define a partial-URI rule for protocol 706 elements that allow a relative URI but not a fragment. 708 URI-reference = <URI-reference, defined in [RFC3986], Section 4.1> 709 absolute-URI = <absolute-URI, defined in [RFC3986], Section 4.3> 710 relative-part = <relative-part, defined in [RFC3986], Section 4.2> 711 authority = <authority, defined in [RFC3986], Section 3.2> 712 path-abempty = <path-abempty, defined in [RFC3986], Section 3.3> 713 path-absolute = <path-absolute, defined in [RFC3986], Section 3.3> 714 port = <port, defined in [RFC3986], Section 3.2.3> 715 query = <query, defined in [RFC3986], Section 3.4> 716 uri-host = <host, defined in [RFC3986], Section 3.2.2> 718 partial-URI = relative-part [ "?" query ] 720 Each protocol element in HTTP that allows a URI reference will 721 indicate in its ABNF production whether the element allows any form 722 of reference (URI-reference), only a URI in absolute form (absolute- 723 URI), only the path and optional query components, or some 724 combination of the above.
Unless otherwise indicated, URI references 725 are parsed relative to the effective request URI (Section 5.5). 727 2.7.1. http URI scheme 729 The "http" URI scheme is hereby defined for the purpose of minting 730 identifiers according to their association with the hierarchical 731 namespace governed by a potential HTTP origin server listening for 732 TCP connections on a given port. 734 http-URI = "http:" "//" authority path-abempty [ "?" query ] 736 The HTTP origin server is identified by the generic syntax's 737 authority component, which includes a host identifier and optional 738 TCP port ([RFC3986], Section 3.2.2). The remainder of the URI, 739 consisting of both the hierarchical path component and optional query 740 component, serves as an identifier for a potential resource within 741 that origin server's name space. 743 If the host identifier is provided as an IP literal or IPv4 address, 744 then the origin server is any listener on the indicated TCP port at 745 that IP address. If host is a registered name, then that name is 746 considered an indirect identifier and the recipient might use a name 747 resolution service, such as DNS, to find the address of a listener 748 for that host. The host MUST NOT be empty; if an "http" URI is 749 received with an empty host, then it MUST be rejected as invalid. If 750 the port subcomponent is empty or not given, then TCP port 80 is 751 assumed (the default reserved port for WWW services). 753 Regardless of the form of host identifier, access to that host is not 754 implied by the mere presence of its name or address. The host might 755 or might not exist and, even when it does exist, might or might not 756 be running an HTTP server or listening to the indicated port. 
The 757 "http" URI scheme makes use of the delegated nature of Internet names 758 and addresses to establish a naming authority (whatever entity has 759 the ability to place an HTTP server at that Internet name or address) 760 and allows that authority to determine which names are valid and how 761 they might be used. 763 When an "http" URI is used within a context that calls for access to 764 the indicated resource, a client MAY attempt access by resolving the 765 host to an IP address, establishing a TCP connection to that address 766 on the indicated port, and sending an HTTP request message 767 (Section 3) containing the URI's identifying data (Section 5) to the 768 server. If the server responds to that request with a non-interim 769 HTTP response message, as described in Section 7 of [Part2], then 770 that response is considered an authoritative answer to the client's 771 request. 773 Although HTTP is independent of the transport protocol, the "http" 774 scheme is specific to TCP-based services because the name delegation 775 process depends on TCP for establishing authority. An HTTP service 776 based on some other underlying connection protocol would presumably 777 be identified using a different URI scheme, just as the "https" 778 scheme (below) is used for resources that require an end-to-end 779 secured connection. Other protocols might also be used to provide 780 access to "http" identified resources -- it is only the authoritative 781 interface used for mapping the namespace that is specific to TCP. 783 The URI generic syntax for authority also includes a deprecated 784 userinfo subcomponent ([RFC3986], Section 3.2.1) for including user 785 authentication information in the URI. Some implementations make use 786 of the userinfo component for internal configuration of 787 authentication information, such as within command invocation 788 options, configuration files, or bookmark lists, even though such 789 usage might expose a user identifier or password. 
Senders MUST NOT 790 include a userinfo subcomponent (and its "@" delimiter) when 791 transmitting an "http" URI in a message. Recipients of HTTP messages 792 that contain a URI reference SHOULD parse for the existence of 793 userinfo and treat its presence as an error, likely indicating that 794 the deprecated subcomponent is being used to obscure the authority 795 for the sake of phishing attacks. 797 2.7.2. https URI scheme 799 The "https" URI scheme is hereby defined for the purpose of minting 800 identifiers according to their association with the hierarchical 801 namespace governed by a potential HTTP origin server listening to a 802 given TCP port for TLS-secured connections [RFC5246]. 804 All of the requirements listed above for the "http" scheme are also 805 requirements for the "https" scheme, except that a default TCP port 806 of 443 is assumed if the port subcomponent is empty or not given, and 807 the TCP connection MUST be secured, end-to-end, through the use of 808 strong encryption prior to sending the first HTTP request. 810 https-URI = "https:" "//" authority path-abempty [ "?" query ] 812 Unlike the "http" scheme, responses to "https" identified requests 813 are never "public" and thus MUST NOT be reused for shared caching. 814 They can, however, be reused in a private cache if the message is 815 cacheable by default in HTTP or specifically indicated as such by the 816 Cache-Control header field (Section 7.2 of [Part6]). 818 Resources made available via the "https" scheme have no shared 819 identity with the "http" scheme even if their resource identifiers 820 indicate the same authority (the same host listening to the same TCP 821 port). They are distinct name spaces and are considered to be 822 distinct origin servers. 
However, an extension to HTTP that is 823 defined to apply to entire host domains, such as the Cookie protocol 824 [RFC6265], can allow information set by one service to impact 825 communication with other services within a matching group of host 826 domains. 828 The process for authoritative access to an "https" identified 829 resource is defined in [RFC2818]. 831 2.7.3. http and https URI Normalization and Comparison 833 Since the "http" and "https" schemes conform to the URI generic 834 syntax, such URIs are normalized and compared according to the 835 algorithm defined in [RFC3986], Section 6, using the defaults 836 described above for each scheme. 838 If the port is equal to the default port for a scheme, the normal 839 form is to elide the port subcomponent. Likewise, an empty path 840 component is equivalent to an absolute path of "/", so the normal 841 form is to provide a path of "/" instead. The scheme and host are 842 case-insensitive and normally provided in lowercase; all other 843 components are compared in a case-sensitive manner. Characters other 844 than those in the "reserved" set are equivalent to their percent- 845 encoded octets (see [RFC3986], Section 2.1): the normal form is to 846 not encode them. 848 For example, the following three URIs are equivalent: 850 http://example.com:80/~smith/home.html 851 http://EXAMPLE.com/%7Esmith/home.html 852 http://EXAMPLE.com:/%7esmith/home.html 854 3. Message Format 856 All HTTP/1.1 messages consist of a start-line followed by a sequence 857 of octets in a format similar to the Internet Message Format 858 [RFC5322]: zero or more header fields (collectively referred to as 859 the "headers" or the "header section"), an empty line indicating the 860 end of the header section, and an optional message body. 
862 HTTP-message = start-line 863 *( header-field CRLF ) 864 CRLF 865 [ message-body ] 867 The normal procedure for parsing an HTTP message is to read the 868 start-line into a structure, read each header field into a hash table 869 by field name until the empty line, and then use the parsed data to 870 determine if a message body is expected. If a message body has been 871 indicated, then it is read as a stream until an amount of octets 872 equal to the message body length is read or the connection is closed. 874 Recipients MUST parse an HTTP message as a sequence of octets in an 875 encoding that is a superset of US-ASCII [USASCII]. Parsing an HTTP 876 message as a stream of Unicode characters, without regard for the 877 specific encoding, creates security vulnerabilities due to the 878 varying ways that string processing libraries handle invalid 879 multibyte character sequences that contain the octet LF (%x0A). 880 String-based parsers can only be safely used within protocol elements 881 after the element has been extracted from the message, such as within 882 a header field-value after message parsing has delineated the 883 individual fields. 885 An HTTP message can be parsed as a stream for incremental processing 886 or forwarding downstream. However, recipients cannot rely on 887 incremental delivery of partial messages, since some implementations 888 will buffer or delay message forwarding for the sake of network 889 efficiency, security checks, or payload transformations. 891 3.1. Start Line 893 An HTTP message can either be a request from client to server or a 894 response from server to client. Syntactically, the two types of 895 message differ only in the start-line, which is either a request-line 896 (for requests) or a status-line (for responses), and in the algorithm 897 for determining the length of the message body (Section 3.3). 
In 898 theory, a client could receive requests and a server could receive 899 responses, distinguishing them by their different start-line formats, 900 but in practice servers are implemented to only expect a request (a 901 response is interpreted as an unknown or invalid request method) and 902 clients are implemented to only expect a response. 904 start-line = request-line / status-line 906 A sender MUST NOT send whitespace between the start-line and the 907 first header field. The presence of such whitespace in a request 908 might be an attempt to trick a server into ignoring that field or 909 processing the line after it as a new request, either of which might 910 result in a security vulnerability if other implementations within 911 the request chain interpret the same message differently. Likewise, 912 the presence of such whitespace in a response might be ignored by 913 some clients or cause others to cease parsing. 915 3.1.1. Request Line 917 A request-line begins with a method token, followed by a single space 918 (SP), the request-target, another single space (SP), the protocol 919 version, and ending with CRLF. 921 request-line = method SP request-target SP HTTP-version CRLF 923 A server MUST be able to parse any received message that begins with 924 a request-line and matches the ABNF rule for HTTP-message. 926 The method token indicates the request method to be performed on the 927 target resource. The request method is case-sensitive. 929 method = token 931 The methods defined by this specification can be found in Section 5 932 of [Part2], along with information regarding the HTTP method registry 933 and considerations for defining new methods. 935 The request-target identifies the target resource upon which to apply 936 the request, as defined in Section 5.3. 938 No whitespace is allowed inside the method, request-target, and 939 protocol version. 
Hence, recipients typically parse the request-line 940 into its component parts by splitting on the SP characters. 942 Unfortunately, some user agents fail to properly encode hypertext 943 references that have embedded whitespace, sending the characters 944 directly instead of properly percent-encoding the disallowed 945 characters. Recipients of an invalid request-line SHOULD respond 946 with either a 400 (Bad Request) error or a 301 (Moved Permanently) 947 redirect with the request-target properly encoded. Recipients SHOULD 948 NOT attempt to autocorrect and then process the request without a 949 redirect, since the invalid request-line might be deliberately 950 crafted to bypass security filters along the request chain. 952 HTTP does not place a pre-defined limit on the length of a request- 953 line. A server that receives a method longer than any that it 954 implements SHOULD respond with either a 405 (Method Not Allowed), if 955 it is an origin server, or a 501 (Not Implemented) status code. A 956 server MUST be prepared to receive URIs of unbounded length and 957 respond with the 414 (URI Too Long) status code if the received 958 request-target would be longer than the server wishes to handle (see 959 Section 7.5.12 of [Part2]). 961 Various ad-hoc limitations on request-line length are found in 962 practice. It is RECOMMENDED that all HTTP senders and recipients 963 support, at a minimum, request-line lengths of up to 8000 octets. 965 3.1.2. Status Line 967 The first line of a response message is the status-line, consisting 968 of the protocol version, a space (SP), the status code, another 969 space, a possibly-empty textual phrase describing the status code, 970 and ending with CRLF. 972 status-line = HTTP-version SP status-code SP reason-phrase CRLF 974 A client MUST be able to parse any received message that begins with 975 a status-line and matches the ABNF rule for HTTP-message. 
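The start-line rules in Sections 3.1.1 and 3.1.2 can be illustrated with a minimal Python sketch. It is not part of this specification: the names BadStartLine, parse_request_line, and parse_status_line are hypothetical, and the sketch assumes the caller has already removed the terminating CRLF. It takes the strict route of rejecting any line that does not split into exactly three parts on single SP octets, which is one way to avoid autocorrecting embedded whitespace.

```python
import re

# tchar set from the token rule in Section 3.2.4
_TOKEN = re.compile(rb"[!#$%&'*+\-.^_`|~0-9A-Za-z]+")
# HTTP-version is case-sensitive: "HTTP" "/" DIGIT "." DIGIT
_VERSION = re.compile(rb"HTTP/[0-9]\.[0-9]")
# status-line = HTTP-version SP status-code SP reason-phrase
_STATUS_LINE = re.compile(
    rb"(HTTP/[0-9]\.[0-9]) ([0-9]{3}) ([\t \x21-\x7e\x80-\xff]*)"
)


class BadStartLine(ValueError):
    """Hypothetical error type; a server would typically answer 400."""


def parse_request_line(line: bytes):
    """Split a request-line (CRLF already removed) on single SP octets."""
    parts = line.split(b" ")
    if len(parts) != 3:
        # Extra or doubled spaces produce more than three parts.
        raise BadStartLine("expected exactly two SP separators")
    method, target, version = parts
    if not _TOKEN.fullmatch(method):
        raise BadStartLine("method is not a token")
    if not target or any(c <= 0x20 or c == 0x7F for c in target):
        raise BadStartLine("request-target is empty or has whitespace/controls")
    if not _VERSION.fullmatch(version):
        raise BadStartLine("malformed HTTP-version")
    return method, target, version


def parse_status_line(line: bytes):
    """Parse a status-line (CRLF already removed); reason-phrase may be empty."""
    match = _STATUS_LINE.fullmatch(line)
    if match is None:
        raise BadStartLine("malformed status-line")
    return match.group(1), int(match.group(2)), match.group(3)
```

A recipient that prefers the 301 redirect permitted above would need additional logic to re-encode the request-target; that is deliberately left out of this sketch.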
977 The status-code element is a 3-digit integer code describing the 978 result of the server's attempt to understand and satisfy the client's 979 corresponding request. The rest of the response message is to be 980 interpreted in light of the semantics defined for that status code. 981 See Section 7 of [Part2] for information about the semantics of 982 status codes, including the classes of status code (indicated by the 983 first digit), the status codes defined by this specification, 984 considerations for the definition of new status codes, and the IANA 985 registry. 987 status-code = 3DIGIT 989 The reason-phrase element exists for the sole purpose of providing a 990 textual description associated with the numeric status code, mostly 991 out of deference to earlier Internet application protocols that were 992 more frequently used with interactive text clients. A client SHOULD 993 ignore the reason-phrase content. 995 reason-phrase = *( HTAB / SP / VCHAR / obs-text ) 997 3.2. Header Fields 999 Each HTTP header field consists of a case-insensitive field name 1000 followed by a colon (":"), optional whitespace, and the field value. 1002 header-field = field-name ":" OWS field-value BWS 1003 field-name = token 1004 field-value = *( field-content / obs-fold ) 1005 field-content = *( HTAB / SP / VCHAR / obs-text ) 1006 obs-fold = CRLF ( SP / HTAB ) 1007 ; obsolete line folding 1008 ; see Section 3.2.2 1010 The field-name token labels the corresponding field-value as having 1011 the semantics defined by that header field. For example, the Date 1012 header field is defined in Section 8.1.1.2 of [Part2] as containing 1013 the origination timestamp for the message in which it appears. 1015 HTTP header fields are fully extensible: there is no limit on the 1016 introduction of new field names, each presumably defining new 1017 semantics, or on the number of header fields used in a given message. 
1018 Existing fields are defined in each part of this specification and in 1019 many other specifications outside the standards process. New header 1020 fields can be introduced without changing the protocol version if 1021 their defined semantics allow them to be safely ignored by recipients 1022 that do not recognize them. 1024 New HTTP header fields SHOULD be registered with IANA in the Message 1025 Header Field Registry, as described in Section 9.3 of [Part2]. 1026 Unrecognized header fields MUST be forwarded by a proxy unless the 1027 field-name is listed in the Connection header field (Section 6.1) or 1028 the proxy is specifically configured to block or otherwise transform 1029 such fields. Unrecognized header fields SHOULD be ignored by other 1030 recipients. 1032 The order in which header fields with differing field names are 1033 received is not significant. However, it is "good practice" to send 1034 header fields that contain control data first, such as Host on 1035 requests and Date on responses, so that implementations can decide 1036 when not to handle a message as early as possible. A server MUST 1037 wait until the entire header section is received before interpreting 1038 a request message, since later header fields might include 1039 conditionals, authentication credentials, or deliberately misleading 1040 duplicate header fields that would impact request processing. 1042 Multiple header fields with the same field name MUST NOT be sent in a 1043 message unless the entire field value for that header field is 1044 defined as a comma-separated list [i.e., #(values)]. Multiple header 1045 fields with the same field name can be combined into one "field-name: 1046 field-value" pair, without changing the semantics of the message, by 1047 appending each subsequent field value to the combined field value in 1048 order, separated by a comma. 
The order in which header fields with 1049 the same field name are received is therefore significant to the 1050 interpretation of the combined field value; a proxy MUST NOT change 1051 the order of these field values when forwarding a message. 1053 Note: The "Set-Cookie" header field as implemented in practice can 1054 occur multiple times, but does not use the list syntax, and thus 1055 cannot be combined into a single line ([RFC6265]). (See Appendix 1056 A.2.3 of [Kri2001] for details.) Also note that the Set-Cookie2 1057 header field specified in [RFC2965] does not share this problem. 1059 3.2.1. Whitespace 1061 This specification uses three rules to denote the use of linear 1062 whitespace: OWS (optional whitespace), RWS (required whitespace), and 1063 BWS ("bad" whitespace). 1065 The OWS rule is used where zero or more linear whitespace octets 1066 might appear. OWS SHOULD either not be produced or be produced as a 1067 single SP. Multiple OWS octets that occur within field-content 1068 SHOULD either be replaced with a single SP or transformed to all SP 1069 octets (each octet other than SP replaced with SP) before 1070 interpreting the field value or forwarding the message downstream. 1072 RWS is used when at least one linear whitespace octet is required to 1073 separate field tokens. RWS SHOULD be produced as a single SP. 1074 Multiple RWS octets that occur within field-content SHOULD either be 1075 replaced with a single SP or transformed to all SP octets before 1076 interpreting the field value or forwarding the message downstream. 1078 BWS is used where the grammar allows optional whitespace, for 1079 historical reasons, but senders SHOULD NOT produce it in messages; 1080 recipients MUST accept such bad optional whitespace and remove it 1081 before interpreting the field value or forwarding the message 1082 downstream. 
1084 OWS = *( SP / HTAB ) 1085 ; "optional" whitespace 1086 RWS = 1*( SP / HTAB ) 1087 ; "required" whitespace 1088 BWS = OWS 1089 ; "bad" whitespace 1091 3.2.2. Field Parsing 1093 No whitespace is allowed between the header field-name and colon. In 1094 the past, differences in the handling of such whitespace have led to 1095 security vulnerabilities in request routing and response handling. 1096 Any received request message that contains whitespace between a 1097 header field-name and colon MUST be rejected with a response code of 1098 400 (Bad Request). A proxy MUST remove any such whitespace from a 1099 response message before forwarding the message downstream. 1101 A field value MAY be preceded by optional whitespace (OWS); a single 1102 SP is preferred. The field value does not include any leading or 1103 trailing white space: OWS occurring before the first non-whitespace 1104 octet of the field value or after the last non-whitespace octet of 1105 the field value is ignored and SHOULD be removed before further 1106 processing (as this does not change the meaning of the header field). 1108 Historically, HTTP header field values could be extended over 1109 multiple lines by preceding each extra line with at least one space 1110 or horizontal tab (obs-fold). This specification deprecates such 1111 line folding except within the message/http media type 1112 (Section 7.3.1). HTTP senders MUST NOT produce messages that include 1113 line folding (i.e., that contain any field-value that matches the 1114 obs-fold rule) unless the message is intended for packaging within 1115 the message/http media type. HTTP recipients SHOULD accept line 1116 folding and replace any embedded obs-fold whitespace with either a 1117 single SP or a matching number of SP octets (to avoid buffer copying) 1118 prior to interpreting the field value or forwarding the message 1119 downstream. 
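The field parsing rules above can be sketched in Python as follows. This is an illustration under stated assumptions, not a normative algorithm: the function names unfold and parse_header_field are hypothetical, the input is a single header field with its terminating CRLF removed, and lowercasing the field-name is merely one convenient way to honor its case-insensitivity. The sketch models a request recipient, so whitespace before the colon is rejected rather than repaired.

```python
import re

# tchar set from the token rule in Section 3.2.4
_TOKEN = re.compile(rb"[!#$%&'*+\-.^_`|~0-9A-Za-z]+")
# obs-fold = CRLF ( SP / HTAB )
_OBS_FOLD = re.compile(rb"\r\n[ \t]")


def unfold(raw_field: bytes) -> bytes:
    """Replace each obs-fold with a single SP before interpreting the value."""
    return _OBS_FOLD.sub(b" ", raw_field)


def parse_header_field(raw_field: bytes):
    """Return (field-name, field-value) for one header field."""
    name, colon, value = unfold(raw_field).partition(b":")
    if not colon:
        raise ValueError("header field has no colon")
    if not _TOKEN.fullmatch(name):
        # Also catches whitespace between field-name and colon, for which
        # a server is required to answer 400 (Bad Request).
        raise ValueError("invalid field-name")
    # Leading and trailing OWS is not part of the field value.
    return name.lower(), value.strip(b" \t")
```

A proxy handling a response would instead strip the offending whitespace before forwarding, as required above; that branch is omitted here.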
1121 Historically, HTTP has allowed field content with text in the ISO- 1122 8859-1 [ISO-8859-1] character encoding and supported other character 1123 sets only through use of [RFC2047] encoding. In practice, most HTTP 1124 header field values use only a subset of the US-ASCII character 1125 encoding [USASCII]. Newly defined header fields SHOULD limit their 1126 field values to US-ASCII octets. Recipients SHOULD treat other (obs- 1127 text) octets in field content as opaque data. 1129 3.2.3. Field Length 1131 HTTP does not place a pre-defined limit on the length of header 1132 fields, either in isolation or as a set. A server MUST be prepared 1133 to receive request header fields of unbounded length and respond with 1134 a 4xx (Client Error) status code if the received header field(s) 1135 would be longer than the server wishes to handle. 1137 A client that receives response header fields that are longer than it 1138 wishes to handle can only treat it as a server error. 1140 Various ad-hoc limitations on header field length are found in 1141 practice. It is RECOMMENDED that all HTTP senders and recipients 1142 support messages whose combined header fields have 4000 or more 1143 octets. 1145 3.2.4. Field value components 1147 Many HTTP header field values consist of words (token or quoted- 1148 string) separated by whitespace or special characters. These special 1149 characters MUST be in a quoted string to be used within a parameter 1150 value (as defined in Section 4). 1152 word = token / quoted-string 1154 token = 1*tchar 1156 tchar = "!" / "#" / "$" / "%" / "&" / "'" / "*" 1157 / "+" / "-" / "." / "^" / "_" / "`" / "|" / "~" 1158 / DIGIT / ALPHA 1159 ; any VCHAR, except special 1161 special = "(" / ")" / "<" / ">" / "@" / "," 1162 / ";" / ":" / "\" / DQUOTE / "/" / "[" 1163 / "]" / "?" / "=" / "{" / "}" 1165 A string of text is parsed as a single word if it is quoted using 1166 double-quote marks. 
1168 quoted-string = DQUOTE *( qdtext / quoted-pair ) DQUOTE 1169 qdtext = OWS / %x21 / %x23-5B / %x5D-7E / obs-text 1170 obs-text = %x80-FF 1172 The backslash octet ("\") can be used as a single-octet quoting 1173 mechanism within quoted-string constructs: 1175 quoted-pair = "\" ( HTAB / SP / VCHAR / obs-text ) 1177 Recipients that process the value of the quoted-string MUST handle a 1178 quoted-pair as if it were replaced by the octet following the 1179 backslash. 1181 Senders SHOULD NOT escape octets in quoted-strings that do not 1182 require escaping (i.e., other than DQUOTE and the backslash octet). 1184 Comments can be included in some HTTP header fields by surrounding 1185 the comment text with parentheses. Comments are only allowed in 1186 fields containing "comment" as part of their field value definition. 1188 comment = "(" *( ctext / quoted-cpair / comment ) ")" 1189 ctext = OWS / %x21-27 / %x2A-5B / %x5D-7E / obs-text 1191 The backslash octet ("\") can be used as a single-octet quoting 1192 mechanism within comment constructs: 1194 quoted-cpair = "\" ( HTAB / SP / VCHAR / obs-text ) 1196 Senders SHOULD NOT escape octets in comments that do not require 1197 escaping (i.e., other than the backslash octet "\" and the 1198 parentheses "(" and ")"). 1200 3.3. Message Body 1202 The message body (if any) of an HTTP message is used to carry the 1203 payload body of that request or response. The message body is 1204 identical to the payload body unless a transfer coding has been 1205 applied, as described in Section 3.3.1. 1207 message-body = *OCTET 1209 The rules for when a message body is allowed in a message differ for 1210 requests and responses. 1212 The presence of a message body in a request is signaled by a 1213 Content-Length or Transfer-Encoding header field. Request message 1214 framing is independent of method semantics, even if the method does 1215 not define any use for a message body.
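The request rule above is small enough to state as code. The following Python sketch uses a hypothetical function name, request_has_body, and assumes the caller supplies the field names already parsed from the header section. The method parameter is accepted only to make explicit that it plays no part in the decision.

```python
def request_has_body(method: str, field_names) -> bool:
    """Decide whether a request carries a message body.

    The method is deliberately ignored: request framing depends only on
    the presence of Content-Length or Transfer-Encoding.
    """
    names = {name.lower() for name in field_names}
    return bool(names & {"content-length", "transfer-encoding"})
```

For instance, a GET request that carries Content-Length is framed with a body even though GET defines no use for one, while a POST request with neither field has no body to read.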
1217 The presence of a message body in a response depends on both the 1218 request method to which it is responding and the response status code 1219 (Section 3.1.2). Responses to the HEAD request method never include 1220 a message body because the associated response header fields (e.g., 1221 Transfer-Encoding, Content-Length, etc.), if present, indicate only 1222 what their values would have been if the request method had been GET 1223 (Section 5.3.2 of [Part2]). 2xx (Successful) responses to CONNECT 1224 switch to tunnel mode instead of having a message body (Section 5.3.6 1225 of [Part2]). All 1xx (Informational), 204 (No Content), and 304 (Not 1226 Modified) responses MUST NOT include a message body. All other 1227 responses do include a message body, although the body MAY be of zero 1228 length. 1230 3.3.1. Transfer-Encoding 1232 When one or more transfer codings are applied to a payload body in 1233 order to form the message body, a Transfer-Encoding header field MUST 1234 be sent in the message and MUST contain the list of corresponding 1235 transfer-coding names in the same order that they were applied. 1236 Transfer codings are defined in Section 4. 1238 Transfer-Encoding = 1#transfer-coding 1240 Transfer-Encoding is analogous to the Content-Transfer-Encoding field 1241 of MIME, which was designed to enable safe transport of binary data 1242 over a 7-bit transport service ([RFC2045], Section 6). However, safe 1243 transport has a different focus for an 8bit-clean transfer protocol. 1244 In HTTP's case, Transfer-Encoding is primarily intended to accurately 1245 delimit a dynamically generated payload and to distinguish payload 1246 encodings that are only applied for transport efficiency or security 1247 from those that are characteristics of the target resource. 
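The response rules given at the end of Section 3.3 can be summarized in a few lines of Python. This is only a sketch: the function name response_can_have_body and its parameters are hypothetical, and the method is assumed to be the (case-sensitive) method of the request being answered.

```python
def response_can_have_body(request_method: str, status: int) -> bool:
    """Return False when the response is defined never to carry a message body."""
    if request_method == "HEAD":
        # Header fields only describe what a GET would have returned.
        return False
    if request_method == "CONNECT" and 200 <= status <= 299:
        # A successful CONNECT switches to tunnel mode instead.
        return False
    if 100 <= status <= 199 or status in (204, 304):
        return False
    # All other responses include a body, possibly of zero length.
    return True
```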
1249 The "chunked" transfer-coding (Section 4.1) MUST be implemented by 1250 all HTTP/1.1 recipients because it plays a crucial role in delimiting 1251 messages when the payload body size is not known in advance. When 1252 the "chunked" transfer-coding is used, it MUST be the last transfer- 1253 coding applied to form the message body and MUST NOT be applied more 1254 than once in a message body. If any transfer-coding is applied to a 1255 request payload body, the final transfer-coding applied MUST be 1256 "chunked". If any transfer-coding is applied to a response payload 1257 body, then either the final transfer-coding applied MUST be "chunked" 1258 or the message MUST be terminated by closing the connection. 1260 For example, 1262 Transfer-Encoding: gzip, chunked 1264 indicates that the payload body has been compressed using the gzip 1265 coding and then chunked using the chunked coding while forming the 1266 message body. 1268 If more than one Transfer-Encoding header field is present in a 1269 message, the multiple field-values MUST be combined into one field- 1270 value, according to the algorithm defined in Section 3.2, before 1271 determining the message body length. 1273 Unlike Content-Encoding (Section 3.1.2.1 of [Part2]), Transfer- 1274 Encoding is a property of the message, not of the payload, and thus 1275 MAY be added or removed by any implementation along the request/ 1276 response chain. Additional information about the encoding parameters 1277 MAY be provided by other header fields not defined by this 1278 specification. 1280 Transfer-Encoding MAY be sent in a response to a HEAD request or in a 1281 304 (Not Modified) response (Section 4.1 of [Part4]) to a GET 1282 request, neither of which includes a message body, to indicate that 1283 the origin server would have applied a transfer coding to the message 1284 body if the request had been an unconditional GET. 
This indication 1285 is not required, however, because any recipient on the response chain 1286 (including the origin server) can remove transfer codings when they 1287 are not needed. 1289 Transfer-Encoding was added in HTTP/1.1. It is generally assumed 1290 that implementations advertising only HTTP/1.0 support will not 1291 understand how to process a transfer-encoded payload. A client MUST 1292 NOT send a request containing Transfer-Encoding unless it knows the 1293 server will handle HTTP/1.1 (or later) requests; such knowledge might 1294 be in the form of specific user configuration or by remembering the 1295 version of a prior received response. A server MUST NOT send a 1296 response containing Transfer-Encoding unless the corresponding 1297 request indicates HTTP/1.1 (or later). 1299 A server that receives a request message with a transfer-coding it 1300 does not understand SHOULD respond with 501 (Not Implemented) and 1301 then close the connection. 1303 3.3.2. Content-Length 1305 When a message is allowed to contain a message body, does not have a 1306 Transfer-Encoding header field, and has a payload body length that is 1307 known to the sender before the message header section has been sent, 1308 the sender SHOULD send a Content-Length header field to indicate the 1309 length of the payload body as a decimal number of octets. 1311 Content-Length = 1*DIGIT 1313 An example is 1315 Content-Length: 3495 1317 A sender MUST NOT send a Content-Length header field in any message 1318 that contains a Transfer-Encoding header field. 1320 A server MAY send a Content-Length header field in a response to a 1321 HEAD request (Section 5.3.2 of [Part2]); a server MUST NOT send 1322 Content-Length in such a response unless its field-value equals the 1323 decimal number of octets that would have been sent in the payload 1324 body of a response if the same request had used the GET method. 
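As a sketch of the Content-Length grammar (1*DIGIT) given above, a recipient might validate and convert a single field-value as follows. The function name parse_content_length is hypothetical, and the sketch assumes that surrounding whitespace has already been removed by the header field parser. Python integers have arbitrary precision, so the conversion itself cannot overflow; implementations in other languages need an explicit range check.

```python
import re

def parse_content_length(field_value: str) -> int:
    """Convert one Content-Length field-value to an integer number of octets."""
    # [0-9] rather than str.isdigit(), which also accepts non-ASCII digits.
    if not re.fullmatch(r"[0-9]+", field_value):
        raise ValueError("Content-Length is not 1*DIGIT: %r" % field_value)
    return int(field_value)
```

With this sketch, the example field-value "3495" converts to 3495, while values such as "-1", "+5", or an empty string are rejected.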
1326 A server MAY send a Content-Length header field in a 304 (Not 1327 Modified) response to a conditional GET request (Section 4.1 of 1328 [Part4]); a server MUST NOT send Content-Length in such a response 1329 unless its field-value equals the decimal number of octets that would 1330 have been sent in the payload body of a 200 (OK) response to the same 1331 request. 1333 A server MUST NOT send a Content-Length header field in any response 1334 with a status code of 1xx (Informational) or 204 (No Content). A 1335 server SHOULD NOT send a Content-Length header field in any 2xx 1336 (Successful) response to a CONNECT request (Section 5.3.6 of 1337 [Part2]). 1339 Any Content-Length field value greater than or equal to zero is 1340 valid. Since there is no predefined limit to the length of an HTTP 1341 payload, recipients SHOULD anticipate potentially large decimal 1342 numerals and prevent parsing errors due to integer conversion 1343 overflows (Section 8.6). 1345 If a message is received that has multiple Content-Length header 1346 fields with field-values consisting of the same decimal value, or a 1347 single Content-Length header field with a field value containing a 1348 list of identical decimal values (e.g., "Content-Length: 42, 42"), 1349 indicating that duplicate Content-Length header fields have been 1350 generated or combined by an upstream message processor, then the 1351 recipient MUST either reject the message as invalid or replace the 1352 duplicated field-values with a single valid Content-Length field 1353 containing that decimal value prior to determining the message body 1354 length. 1356 Note: HTTP's use of Content-Length for message framing differs 1357 significantly from the same field's use in MIME, where it is an 1358 optional field used only within the "message/external-body" media- 1359 type. 1361 3.3.3. Message Body Length 1363 The length of a message body is determined by one of the following 1364 (in order of precedence): 1366 1. 
Any response to a HEAD request and any response with a 1xx 1367 (Informational), 204 (No Content), or 304 (Not Modified) status 1368 code is always terminated by the first empty line after the 1369 header fields, regardless of the header fields present in the 1370 message, and thus cannot contain a message body. 1372 2. Any 2xx (Successful) response to a CONNECT request implies that 1373 the connection will become a tunnel immediately after the empty 1374 line that concludes the header fields. A client MUST ignore any 1375 Content-Length or Transfer-Encoding header fields received in 1376 such a message. 1378 3. If a Transfer-Encoding header field is present and the "chunked" 1379 transfer-coding (Section 4.1) is the final encoding, the message 1380 body length is determined by reading and decoding the chunked 1381 data until the transfer-coding indicates the data is complete. 1383 If a Transfer-Encoding header field is present in a response and 1384 the "chunked" transfer-coding is not the final encoding, the 1385 message body length is determined by reading the connection until 1386 it is closed by the server. If a Transfer-Encoding header field 1387 is present in a request and the "chunked" transfer-coding is not 1388 the final encoding, the message body length cannot be determined 1389 reliably; the server MUST respond with the 400 (Bad Request) 1390 status code and then close the connection. 1392 If a message is received with both a Transfer-Encoding and a 1393 Content-Length header field, the Transfer-Encoding overrides the 1394 Content-Length. Such a message might indicate an attempt to 1395 perform request or response smuggling (bypass of security-related 1396 checks on message routing or content) and thus ought to be 1397 handled as an error. The provided Content-Length MUST be 1398 removed, prior to forwarding the message downstream, or replaced 1399 with the real message body length after the transfer-coding is 1400 decoded. 1402 4. 
If a message is received without Transfer-Encoding and with 1403 either multiple Content-Length header fields having differing 1404 field-values or a single Content-Length header field having an 1405 invalid value, then the message framing is invalid and MUST be 1406 treated as an error to prevent request or response smuggling. If 1407 this is a request message, the server MUST respond with a 400 1408 (Bad Request) status code and then close the connection. If this 1409 is a response message received by a proxy, the proxy MUST discard 1410 the received response, send a 502 (Bad Gateway) status code as 1411 its downstream response, and then close the connection. If this 1412 is a response message received by a user-agent, it MUST be 1413 treated as an error by discarding the message and closing the 1414 connection. 1416 5. If a valid Content-Length header field is present without 1417 Transfer-Encoding, its decimal value defines the message body 1418 length in octets. If the actual number of octets sent in the 1419 message is less than the indicated Content-Length, the recipient 1420 MUST consider the message to be incomplete and treat the 1421 connection as no longer usable. If the actual number of octets 1422 sent in the message is more than the indicated Content-Length, 1423 the recipient MUST only process the message body up to the field 1424 value's number of octets; the remainder of the message MUST 1425 either be discarded or treated as the next message in a pipeline. 1426 For the sake of robustness, a user-agent MAY attempt to detect 1427 and correct such an error in message framing if it is parsing the 1428 response to the last request on a connection and the connection 1429 has been closed by the server. 1431 6. If this is a request message and none of the above are true, then 1432 the message body length is zero (no message body is present). 1434 7. 
Otherwise, this is a response message without a declared message 1435 body length, so the message body length is determined by the 1436 number of octets received prior to the server closing the 1437 connection. 1439 Since there is no way to distinguish a successfully completed, close- 1440 delimited message from a partially-received message interrupted by 1441 network failure, a server SHOULD use encoding or length-delimited 1442 messages whenever possible. The close-delimiting feature exists 1443 primarily for backwards compatibility with HTTP/1.0. 1445 A server MAY reject a request that contains a message body but not a 1446 Content-Length by responding with 411 (Length Required). 1448 Unless a transfer-coding other than "chunked" has been applied, a 1449 client that sends a request containing a message body SHOULD use a 1450 valid Content-Length header field if the message body length is known 1451 in advance, rather than the "chunked" encoding, since some existing 1452 services respond to "chunked" with a 411 (Length Required) status 1453 code even though they understand the chunked encoding. This is 1454 typically because such services are implemented via a gateway that 1455 requires a content-length in advance of being called and the server 1456 is unable or unwilling to buffer the entire request before 1457 processing. 1459 A client that sends a request containing a message body MUST include 1460 a valid Content-Length header field if it does not know the server 1461 will handle HTTP/1.1 (or later) requests; such knowledge can be in 1462 the form of specific user configuration or by remembering the version 1463 of a prior received response. 1465 3.4. Handling Incomplete Messages 1467 Request messages that are prematurely terminated, possibly due to a 1468 canceled connection or a server-imposed time-out exception, MUST 1469 result in closure of the connection; sending an error response prior 1470 to closing the connection is OPTIONAL. 
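The order of precedence in Section 3.3.3 can be sketched in Python as shown below. All names here (body_framing, FramingError, and the parameters) are hypothetical. The sketch assumes the caller has already combined multiple Transfer-Encoding fields into one list of coding names and passes every received Content-Length field-value as a string. Where the text allows identical duplicate Content-Length values to be either rejected or collapsed, this sketch collapses them; rejecting is equally permitted.

```python
import re
from typing import List, Optional, Tuple

class FramingError(Exception):
    """Invalid framing: a server answers 400 (a proxy 502) and closes."""

def body_framing(is_request: bool, transfer_codings: List[str],
                 content_lengths: List[str], request_method: str = "GET",
                 status: int = 200) -> Tuple[str, Optional[int]]:
    """Return (kind, length) with kind in: none, tunnel, chunked, close, length."""
    if not is_request:
        # Rule 1: responses that never carry a body.
        if request_method == "HEAD" or 100 <= status <= 199 or status in (204, 304):
            return ("none", 0)
        # Rule 2: successful CONNECT becomes a tunnel; framing fields are ignored.
        if request_method == "CONNECT" and 200 <= status <= 299:
            return ("tunnel", None)
    # Rule 3: Transfer-Encoding overrides Content-Length.
    if transfer_codings:
        if transfer_codings[-1].lower() == "chunked":
            return ("chunked", None)
        if is_request:
            raise FramingError("request Transfer-Encoding does not end in chunked")
        return ("close", None)
    # Rules 4 and 5: Content-Length without Transfer-Encoding.
    if content_lengths:
        values = {v.strip() for field in content_lengths for v in field.split(",")}
        if len(values) != 1:
            raise FramingError("differing Content-Length values")
        (value,) = values
        if not re.fullmatch(r"[0-9]+", value):
            raise FramingError("invalid Content-Length value")
        return ("length", int(value))
    # Rule 6: a request with no framing fields has no body.
    if is_request:
        return ("length", 0)
    # Rule 7: a response delimited by connection close.
    return ("close", None)
```

A caller would then read the body according to the returned kind, for example reading exactly the given number of octets for "length" or reading until the server closes the connection for "close".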
1472 Response messages that are prematurely terminated, usually by closure 1473 of the connection prior to receiving the expected number of octets or 1474 by failure to decode a transfer-encoded message body, MUST be 1475 recorded as incomplete. A response that terminates in the middle of 1476 the header block (before the empty line is received) cannot be 1477 assumed to convey the full semantics of the response and MUST be 1478 treated as an error. 1480 A message body that uses the chunked transfer encoding is incomplete 1481 if the zero-sized chunk that terminates the encoding has not been 1482 received. A message that uses a valid Content-Length is incomplete 1483 if the size of the message body received (in octets) is less than the 1484 value given by Content-Length. A response that has neither chunked 1485 transfer encoding nor Content-Length is terminated by closure of the 1486 connection, and thus is considered complete regardless of the number 1487 of message body octets received, provided that the header block was 1488 received intact. 1490 A user agent MUST NOT render an incomplete response message body as 1491 if it were complete (i.e., some indication needs to be given to the 1492 user that an error occurred). Cache requirements for incomplete 1493 responses are defined in Section 3 of [Part6]. 1495 A server MUST read the entire request message body or close the 1496 connection after sending its response, since otherwise the remaining 1497 data on a persistent connection would be misinterpreted as the next 1498 request. Likewise, a client MUST read the entire response message 1499 body if it intends to reuse the same connection for a subsequent 1500 request. Pipelining multiple requests on a connection is described 1501 in Section 6.2.2.1. 1503 3.5. 
Message Parsing Robustness 1505 Older HTTP/1.0 client implementations might send an extra CRLF after 1506 a POST request as a lame workaround for some early server 1507 applications that failed to read message body content that was not 1508 terminated by a line-ending. An HTTP/1.1 client MUST NOT preface or 1509 follow a request with an extra CRLF. If terminating the request 1510 message body with a line-ending is desired, then the client MUST 1511 include the terminating CRLF octets as part of the message body 1512 length. 1514 In the interest of robustness, servers SHOULD ignore at least one 1515 empty line received where a request-line is expected. In other 1516 words, if the server is reading the protocol stream at the beginning 1517 of a message and receives a CRLF first, it SHOULD ignore the CRLF. 1518 Likewise, although the line terminator for the start-line and header 1519 fields is the sequence CRLF, we recommend that recipients recognize a 1520 single LF as a line terminator and ignore any CR. 1522 When a server listening only for HTTP request messages, or processing 1523 what appears from the start-line to be an HTTP request message, 1524 receives a sequence of octets that does not match the HTTP-message 1525 grammar aside from the robustness exceptions listed above, the server 1526 MUST respond with an HTTP/1.1 400 (Bad Request) response. 1528 4. Transfer Codings 1530 Transfer-coding values are used to indicate an encoding 1531 transformation that has been, can be, or might need to be applied to 1532 a payload body in order to ensure "safe transport" through the 1533 network. This differs from a content coding in that the transfer- 1534 coding is a property of the message rather than a property of the 1535 representation that is being transferred. 
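The tolerance recommended in Section 3.5 can be illustrated with a short Python sketch. The function name head_lines is hypothetical, and the sketch assumes the caller has buffered at least the complete start-line and header section. It accepts a bare LF as a line terminator, ignores any CR before it, and skips empty lines received where a request-line is expected.

```python
from typing import List

def head_lines(data: bytes) -> List[bytes]:
    """Return the start-line and header field lines, without line terminators."""
    # Split on LF and drop any preceding CR, so CRLF and bare LF both work.
    lines = [line.rstrip(b"\r") for line in data.split(b"\n")]
    # Skip empty lines that precede the start-line.
    start = 0
    while start < len(lines) and lines[start] == b"":
        start += 1
    head = []
    for line in lines[start:]:
        if line == b"":
            break  # the empty line ends the header section
        head.append(line)
    return head
```

Anything that still fails to match the HTTP-message grammar after this step would be answered with 400 (Bad Request), as stated above.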
1537 transfer-coding = "chunked" ; Section 4.1 1538 / "compress" ; Section 4.2.1 1539 / "deflate" ; Section 4.2.2 1540 / "gzip" ; Section 4.2.3 1541 / transfer-extension 1542 transfer-extension = token *( OWS ";" OWS transfer-parameter ) 1544 Parameters are in the form of attribute/value pairs. 1546 transfer-parameter = attribute BWS "=" BWS value 1547 attribute = token 1548 value = word 1550 All transfer-coding values are case-insensitive and SHOULD be 1551 registered within the HTTP Transfer Coding registry, as defined in 1552 Section 7.4. They are used in the TE (Section 4.3) and Transfer- 1553 Encoding (Section 3.3.1) header fields. 1555 4.1. Chunked Transfer Coding 1557 The chunked encoding modifies the body of a message in order to 1558 transfer it as a series of chunks, each with its own size indicator, 1559 followed by an OPTIONAL trailer containing header fields. This 1560 allows dynamically produced content to be transferred along with the 1561 information necessary for the recipient to verify that it has 1562 received the full message. 1564 chunked-body = *chunk 1565 last-chunk 1566 trailer-part 1567 CRLF 1569 chunk = chunk-size [ chunk-ext ] CRLF 1570 chunk-data CRLF 1571 chunk-size = 1*HEXDIG 1572 last-chunk = 1*("0") [ chunk-ext ] CRLF 1574 chunk-ext = *( ";" chunk-ext-name [ "=" chunk-ext-val ] ) 1575 chunk-ext-name = token 1576 chunk-ext-val = token / quoted-str-nf 1577 chunk-data = 1*OCTET ; a sequence of chunk-size octets 1578 trailer-part = *( header-field CRLF ) 1580 quoted-str-nf = DQUOTE *( qdtext-nf / quoted-pair ) DQUOTE 1581 ; like quoted-string, but disallowing line folding 1582 qdtext-nf = HTAB / SP / %x21 / %x23-5B / %x5D-7E / obs-text 1584 Chunk extensions within the chunked encoding are deprecated. Senders 1585 SHOULD NOT send chunk-ext. Definition of new chunk extensions is 1586 discouraged. 1588 The chunk-size field is a string of hex digits indicating the size of 1589 the chunk-data in octets.
The chunked encoding is ended by any chunk 1590 whose size is zero, followed by the trailer, which is terminated by 1591 an empty line. 1593 4.1.1. Trailer 1595 A trailer allows the sender to include additional fields at the end 1596 of a chunked message in order to supply metadata that might be 1597 dynamically generated while the message body is sent, such as a 1598 message integrity check, digital signature, or post-processing 1599 status. The trailer MUST NOT contain fields that need to be known 1600 before a recipient processes the body, such as Transfer-Encoding, 1601 Content-Length, and Trailer. 1603 When a message includes a message body encoded with the chunked 1604 transfer-coding and the sender desires to send metadata in the form 1605 of trailer fields at the end of the message, the sender SHOULD send a 1606 Trailer header field before the message body to indicate which fields 1607 will be present in the trailers. This allows the recipient to 1608 prepare for receipt of that metadata before it starts processing the 1609 body, which is useful if the message is being streamed and the 1610 recipient wishes to confirm an integrity check on the fly. 1612 Trailer = 1#field-name 1614 If no Trailer header field is present, the sender of a chunked 1615 message body SHOULD send an empty trailer. 1617 A server MUST send an empty trailer with the chunked transfer-coding 1618 unless at least one of the following is true: 1620 1. the request included a TE header field that indicates "trailers" 1621 is acceptable in the transfer-coding of the response, as 1622 described in Section 4.3; or, 1624 2. the trailer fields consist entirely of optional metadata and the 1625 recipient could use the message (in a manner acceptable to the 1626 server where the field originated) without receiving that 1627 metadata. 
In other words, the server that generated the header 1628 field is willing to accept the possibility that the trailer 1629 fields might be silently discarded along the path to the client. 1631 The above requirement prevents the need for an infinite buffer when a 1632 message is being received by an HTTP/1.1 (or later) proxy and 1633 forwarded to an HTTP/1.0 recipient. 1635 4.1.2. Decoding chunked 1637 A process for decoding the "chunked" transfer-coding can be 1638 represented in pseudo-code as: 1640 length := 0 1641 read chunk-size, chunk-ext (if any) and CRLF 1642 while (chunk-size > 0) { 1643 read chunk-data and CRLF 1644 append chunk-data to decoded-body 1645 length := length + chunk-size 1646 read chunk-size and CRLF 1647 } 1648 read header-field 1649 while (header-field not empty) { 1650 append header-field to existing header fields 1651 read header-field 1652 } 1653 Content-Length := length 1654 Remove "chunked" from Transfer-Encoding 1655 Remove Trailer from existing header fields 1657 All recipients MUST be able to receive and decode the "chunked" 1658 transfer-coding and MUST ignore chunk-ext extensions they do not 1659 understand. 1661 4.2. Compression Codings 1663 The codings defined below can be used to compress the payload of a 1664 message. 1666 4.2.1. Compress Coding 1668 The "compress" format is produced by the common UNIX file compression 1669 program "compress". This format is an adaptive Lempel-Ziv-Welch 1670 coding (LZW). Recipients SHOULD consider "x-compress" to be 1671 equivalent to "compress". 1673 4.2.2. Deflate Coding 1675 The "deflate" format is defined as the "deflate" compression 1676 mechanism (described in [RFC1951]) used inside the "zlib" data format 1677 ([RFC1950]). 1679 Note: Some incorrect implementations send the "deflate" compressed 1680 data without the zlib wrapper. 1682 4.2.3. Gzip Coding 1684 The "gzip" format is produced by the file compression program "gzip" 1685 (GNU zip), as described in [RFC1952]. 
This format is a Lempel-Ziv 1686 coding (LZ77) with a 32 bit CRC. Recipients SHOULD consider "x-gzip" 1687 to be equivalent to "gzip". 1689 4.3. TE 1691 The "TE" header field in a request indicates what transfer-codings, 1692 besides "chunked", the client is willing to accept in response, and 1693 whether or not the client is willing to accept trailer fields in a 1694 chunked transfer-coding. 1696 The TE field-value consists of a comma-separated list of transfer- 1697 coding names, each allowing for optional parameters (as described in 1698 Section 4), and/or the keyword "trailers". Clients MUST NOT send the 1699 chunked transfer-coding name in TE; chunked is always acceptable for 1700 HTTP/1.1 recipients. 1702 TE = #t-codings 1703 t-codings = "trailers" / ( transfer-coding [ t-ranking ] ) 1704 t-ranking = OWS ";" OWS "q=" rank 1705 rank = ( "0" [ "." 0*3DIGIT ] ) 1706 / ( "1" [ "." 0*3("0") ] ) 1708 Three examples of TE use are below. 1710 TE: deflate 1711 TE: 1712 TE: trailers, deflate;q=0.5 1714 The presence of the keyword "trailers" indicates that the client is 1715 willing to accept trailer fields in a chunked transfer-coding, as 1716 defined in Section 4.1, on behalf of itself and any downstream 1717 clients. For chained requests, this implies that either: (a) all 1718 downstream clients are willing to accept trailer fields in the 1719 forwarded response; or, (b) the client will attempt to buffer the 1720 response on behalf of downstream recipients. Note that HTTP/1.1 does 1721 not define any means to limit the size of a chunked response such 1722 that a client can be assured of buffering the entire response. 1724 When multiple transfer-codings are acceptable, the client MAY rank 1725 the codings by preference using a case-insensitive "q" parameter 1726 (similar to the qvalues used in content negotiation fields, Section 1727 6.3.1 of [Part2]). 
The rank value is a real number in the range 0 1728 through 1, where 0.001 is the least preferred and 1 is the most 1729 preferred; a value of 0 means "not acceptable". 1731 If the TE field-value is empty or if no TE field is present, the only 1732 acceptable transfer-coding is "chunked". A message with no transfer- 1733 coding is always acceptable. 1735 Since the TE header field only applies to the immediate connection, a 1736 sender of TE MUST also send a "TE" connection option within the 1737 Connection header field (Section 6.1) in order to prevent the TE 1738 field from being forwarded by intermediaries that do not support its 1739 semantics. 1741 5. Message Routing 1743 HTTP request message routing is determined by each client based on 1744 the target resource, the client's proxy configuration, and 1745 establishment or reuse of an inbound connection. The corresponding 1746 response routing follows the same connection chain back to the 1747 client. 1749 5.1. Identifying a Target Resource 1751 HTTP is used in a wide variety of applications, ranging from general- 1752 purpose computers to home appliances. In some cases, communication 1753 options are hard-coded in a client's configuration. However, most 1754 HTTP clients rely on the same resource identification mechanism and 1755 configuration techniques as general-purpose Web browsers. 1757 HTTP communication is initiated by a user agent for some purpose. 1758 The purpose is a combination of request semantics, which are defined 1759 in [Part2], and a target resource upon which to apply those 1760 semantics. A URI reference (Section 2.7) is typically used as an 1761 identifier for the "target resource", which a user agent would 1762 resolve to its absolute form in order to obtain the "target URI". 1763 The target URI excludes the reference's fragment identifier 1764 component, if any, since fragment identifiers are reserved for 1765 client-side processing ([RFC3986], Section 3.5). 1767 5.2. 
Connecting Inbound 1769 Once the target URI is determined, a client needs to decide whether a 1770 network request is necessary to accomplish the desired semantics and, 1771 if so, where that request is to be directed. 1773 If the client has a response cache and the request semantics can be 1774 satisfied by a cache ([Part6]), then the request is usually directed 1775 to the cache first. 1777 If the request is not satisfied by a cache, then a typical client 1778 will check its configuration to determine whether a proxy is to be 1779 used to satisfy the request. Proxy configuration is implementation- 1780 dependent, but is often based on URI prefix matching, selective 1781 authority matching, or both, and the proxy itself is usually 1782 identified by an "http" or "https" URI. If a proxy is applicable, 1783 the client connects inbound by establishing (or reusing) a connection 1784 to that proxy. 1786 If no proxy is applicable, a typical client will invoke a handler 1787 routine, usually specific to the target URI's scheme, to connect 1788 directly to an authority for the target resource. How that is 1789 accomplished is dependent on the target URI scheme and defined by its 1790 associated specification, similar to how this specification defines 1791 origin server access for resolution of the "http" (Section 2.7.1) and 1792 "https" (Section 2.7.2) schemes. 1794 HTTP requirements regarding connection management are defined in 1795 Section 6. 1797 5.3. Request Target 1799 Once an inbound connection is obtained, the client sends an HTTP 1800 request message (Section 3) with a request-target derived from the 1801 target URI. There are four distinct formats for the request-target, 1802 depending on both the method being requested and whether the request 1803 is to a proxy. 1805 request-target = origin-form 1806 / absolute-form 1807 / authority-form 1808 / asterisk-form 1810 origin-form = path-absolute [ "?" 
query ] 1811 absolute-form = absolute-URI 1812 authority-form = authority 1813 asterisk-form = "*" 1815 The most common form of request-target is the origin-form. When 1816 making a request directly to an origin server, other than a CONNECT 1817 or server-wide OPTIONS request (as detailed below), a client MUST 1818 send only the absolute path and query components of the target URI as 1819 the request-target. If the target URI's path component is empty, 1820 then the client MUST send "/" as the path within the origin-form of 1821 request-target. A Host header field is also sent, as defined in 1822 Section 5.4, containing the target URI's authority component 1823 (excluding any userinfo). 1825 For example, a client wishing to retrieve a representation of the 1826 resource identified as 1828 http://www.example.org/where?q=now 1830 directly from the origin server would open (or reuse) a TCP 1831 connection to port 80 of the host "www.example.org" and send the 1832 lines: 1834 GET /where?q=now HTTP/1.1 1835 Host: www.example.org 1837 followed by the remainder of the request message. 1839 When making a request to a proxy, other than a CONNECT or server-wide 1840 OPTIONS request (as detailed below), a client MUST send the target 1841 URI in absolute-form as the request-target. The proxy is requested 1842 to either service that request from a valid cache, if possible, or 1843 make the same request on the client's behalf to either the next 1844 inbound proxy server or directly to the origin server indicated by 1845 the request-target. Requirements on such "forwarding" of messages 1846 are defined in Section 5.6. 1848 An example absolute-form of request-line would be: 1850 GET http://www.example.org/pub/WWW/TheProject.html HTTP/1.1 1852 To allow for transition to the absolute-form for all requests in some 1853 future version of HTTP, HTTP/1.1 servers MUST accept the absolute- 1854 form in requests, even though HTTP/1.1 clients will only send them in 1855 requests to proxies. 
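The origin-form and absolute-form rules above can be sketched in Python using the standard urllib.parse module. The function name request_line_parts is hypothetical. The sketch covers only these two forms (not CONNECT or server-wide OPTIONS requests) and assumes the target URI is an "http" or "https" URI with an authority component.

```python
from typing import Tuple
from urllib.parse import urlsplit

def request_line_parts(target_uri: str, via_proxy: bool) -> Tuple[str, str]:
    """Return (request-target, Host field-value) for a target URI."""
    parts = urlsplit(target_uri)
    # Host carries the authority component, excluding any userinfo.
    host = parts.netloc.rpartition("@")[2]
    if via_proxy:
        # absolute-form; the fragment is dropped defensively, since a
        # target URI never includes one.
        return (parts._replace(fragment="").geturl(), host)
    # origin-form: absolute path and query only; an empty path becomes "/".
    path = parts.path or "/"
    target = path + ("?" + parts.query if parts.query else "")
    return (target, host)
```

For the example URI above, a direct request yields the request-target "/where?q=now" with Host "www.example.org", matching the request lines shown in the text.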
1857 The authority-form of request-target is only used for CONNECT 1858 requests (Section 5.3.6 of [Part2]). When making a CONNECT request 1859 to establish a tunnel through one or more proxies, a client MUST send 1860 only the target URI's authority component (excluding any userinfo) as 1861 the request-target. For example, 1863 CONNECT www.example.com:80 HTTP/1.1 1865 The asterisk-form of request-target is only used for a server-wide 1866 OPTIONS request (Section 5.3.7 of [Part2]). When a client wishes to 1867 request OPTIONS for the server as a whole, as opposed to a specific 1868 named resource of that server, the client MUST send only "*" (%x2A) 1869 as the request-target. For example, 1870 OPTIONS * HTTP/1.1 1872 If a proxy receives an OPTIONS request with an absolute-form of 1873 request-target in which the URI has an empty path and no query 1874 component, then the last proxy on the request chain MUST send a 1875 request-target of "*" when it forwards the request to the indicated 1876 origin server. 1878 For example, the request 1880 OPTIONS http://www.example.org:8001 HTTP/1.1 1882 would be forwarded by the final proxy as 1884 OPTIONS * HTTP/1.1 1885 Host: www.example.org:8001 1887 after connecting to port 8001 of host "www.example.org". 1889 5.4. Host 1891 The "Host" header field in a request provides the host and port 1892 information from the target URI, enabling the origin server to 1893 distinguish among resources while servicing requests for multiple 1894 host names on a single IP address. Since the Host field-value is 1895 critical information for handling a request, it SHOULD be sent as the 1896 first header field following the request-line. 1898 Host = uri-host [ ":" port ] ; Section 2.7.1 1900 A client MUST send a Host header field in all HTTP/1.1 request 1901 messages. If the target URI includes an authority component, then 1902 the Host field-value MUST be identical to that authority component 1903 after excluding any userinfo (Section 2.7.1). 
If the authority 1904 component is missing or undefined for the target URI, then the Host 1905 header field MUST be sent with an empty field-value. 1907 For example, a GET request to the origin server for 1908 would begin with: 1910 GET /pub/WWW/ HTTP/1.1 1911 Host: www.example.org 1913 The Host header field MUST be sent in an HTTP/1.1 request even if the 1914 request-target is in the absolute-form, since this allows the Host 1915 information to be forwarded through ancient HTTP/1.0 proxies that 1916 might not have implemented Host. 1918 When a proxy receives a request with an absolute-form of request- 1919 target, the proxy MUST ignore the received Host header field (if any) 1920 and instead replace it with the host information of the request- 1921 target. If the proxy forwards the request, it MUST generate a new 1922 Host field-value based on the received request-target rather than 1923 forward the received Host field-value. 1925 Since the Host header field acts as an application-level routing 1926 mechanism, it is a frequent target for malware seeking to poison a 1927 shared cache or redirect a request to an unintended server. An 1928 interception proxy is particularly vulnerable if it relies on the 1929 Host field-value for redirecting requests to internal servers, or for 1930 use as a cache key in a shared cache, without first verifying that 1931 the intercepted connection is targeting a valid IP address for that 1932 host. 1934 A server MUST respond with a 400 (Bad Request) status code to any 1935 HTTP/1.1 request message that lacks a Host header field and to any 1936 request message that contains more than one Host header field or a 1937 Host header field with an invalid field-value. 1939 5.5. 
Effective Request URI 1941 A server that receives an HTTP request message MUST reconstruct the 1942 user agent's original target URI, based on the pieces of information 1943 learned from the request-target, Host header field, and connection 1944 context, in order to identify the intended target resource and 1945 properly service the request. The URI derived from this 1946 reconstruction process is referred to as the "effective request URI". 1948 For a user agent, the effective request URI is the target URI. 1950 If the request-target is in absolute-form, then the effective request 1951 URI is the same as the request-target. Otherwise, the effective 1952 request URI is constructed as follows. 1954 If the request is received over a TLS-secured TCP connection, then 1955 the effective request URI's scheme is "https"; otherwise, the scheme 1956 is "http". 1958 If the request-target is in authority-form, then the effective 1959 request URI's authority component is the same as the request-target. 1960 Otherwise, if a Host header field is supplied with a non-empty field- 1961 value, then the authority component is the same as the Host field- 1962 value. Otherwise, the authority component is the concatenation of 1963 the default host name configured for the server, a colon (":"), and 1964 the connection's incoming TCP port number in decimal form. 1966 If the request-target is in authority-form or asterisk-form, then the 1967 effective request URI's combined path and query component is empty. 1968 Otherwise, the combined path and query component is the same as the 1969 request-target. 1971 The components of the effective request URI, once determined as 1972 above, can be combined into absolute-URI form by concatenating the 1973 scheme, "://", authority, and combined path and query component. 
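The construction rules above can be sketched in a few lines of Python. This is only an illustration, not part of the specification: the function name, its parameters, and the way the request-target form is classified (by the leading "/", a literal "*", or the CONNECT method) are assumptions made for the sketch.

```python
def effective_request_uri(method, request_target, host_field,
                          is_tls, default_host, local_port):
    """Reconstruct the effective request URI (illustrative sketch).

    host_field is the Host field-value, or None/"" when absent or empty.
    default_host and local_port stand in for the server's configured
    default host name and the connection's incoming TCP port.
    """
    # Classify the request-target form (a simplifying assumption).
    if request_target.startswith("/"):
        form = "origin"
    elif request_target == "*":
        form = "asterisk"
    elif method == "CONNECT":
        form = "authority"
    else:
        form = "absolute"

    # absolute-form: the effective request URI is the request-target.
    if form == "absolute":
        return request_target

    # Scheme depends only on whether the connection is TLS-secured.
    scheme = "https" if is_tls else "http"

    # Authority: request-target, then Host, then configured default.
    if form == "authority":
        authority = request_target
    elif host_field:
        authority = host_field
    else:
        authority = f"{default_host}:{local_port}"

    # Combined path and query is empty for authority- and asterisk-form.
    if form in ("authority", "asterisk"):
        path_and_query = ""
    else:
        path_and_query = request_target

    return f"{scheme}://{authority}{path_and_query}"
```

Called with the request-target "/pub/WWW/TheProject.html", the Host field-value "www.example.org:8080", and is_tls set to False, the sketch returns "http://www.example.org:8080/pub/WWW/TheProject.html". Called with "*", "www.example.org", and is_tls set to True, it returns "https://www.example.org".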
1975 Example 1: the following message received over an insecure TCP 1976 connection 1978 GET /pub/WWW/TheProject.html HTTP/1.1 1979 Host: www.example.org:8080 1981 has an effective request URI of 1983 http://www.example.org:8080/pub/WWW/TheProject.html 1985 Example 2: the following message received over a TLS-secured TCP 1986 connection 1988 OPTIONS * HTTP/1.1 1989 Host: www.example.org 1991 has an effective request URI of 1993 https://www.example.org 1995 An origin server that does not allow resources to differ by requested 1996 host MAY ignore the Host field-value and instead replace it with a 1997 configured server name when constructing the effective request URI. 1999 Recipients of an HTTP/1.0 request that lacks a Host header field MAY 2000 attempt to use heuristics (e.g., examination of the URI path for 2001 something unique to a particular host) in order to guess the 2002 effective request URI's authority component. 2004 5.6. Message Forwarding 2006 As described in Section 2.3, intermediaries can serve a variety of 2007 roles in the processing of HTTP requests and responses. Some 2008 intermediaries are used to improve performance or availability. 2009 Others are used for access control or to filter content. Since an 2010 HTTP stream has characteristics similar to a pipe-and-filter 2011 architecture, there are no inherent limits to the extent to which an 2012 intermediary can enhance (or interfere with) either direction of the 2013 stream. 2015 Intermediaries that forward a message MUST implement the Connection 2016 header field, as specified in Section 6.1, to exclude fields that are 2017 only intended for the incoming connection. 2019 In order to avoid request loops, a proxy that forwards requests to 2020 other proxies MUST be able to recognize and exclude all of its own 2021 server names, including any aliases, local variations, or literal IP 2022 addresses. 2024 5.7.
Via 2026 The "Via" header field MUST be sent by a proxy or gateway in 2027 forwarded messages to indicate the intermediate protocols and 2028 recipients between the user agent and the server on requests, and 2029 between the origin server and the client on responses. It is 2030 analogous to the "Received" field used by email systems (Section 2031 3.6.7 of [RFC5322]). Via is used in HTTP for tracking message 2032 forwards, avoiding request loops, and identifying the protocol 2033 capabilities of all senders along the request/response chain. 2035 Via = 1#( received-protocol RWS received-by 2036 [ RWS comment ] ) 2037 received-protocol = [ protocol-name "/" ] protocol-version 2038 received-by = ( uri-host [ ":" port ] ) / pseudonym 2039 pseudonym = token 2041 The received-protocol indicates the protocol version of the message 2042 received by the server or client along each segment of the request/ 2043 response chain. The received-protocol version is appended to the Via 2044 field value when the message is forwarded so that information about 2045 the protocol capabilities of upstream applications remains visible to 2046 all recipients. 2048 The protocol-name is excluded if and only if it would be "HTTP". The 2049 received-by field is normally the host and optional port number of a 2050 recipient server or client that subsequently forwarded the message. 2051 However, if the real host is considered to be sensitive information, 2052 it MAY be replaced by a pseudonym. If the port is not given, it MAY 2053 be assumed to be the default port of the received-protocol. 2055 Multiple Via field values represent each proxy or gateway that has 2056 forwarded the message. Each recipient MUST append its information 2057 such that the end result is ordered according to the sequence of 2058 forwarding applications. 2060 Comments MAY be used in the Via header field to identify the software 2061 of each recipient, analogous to the User-Agent and Server header 2062 fields. 
However, all comments in the Via field are optional and MAY 2063 be removed by any recipient prior to forwarding the message. 2065 For example, a request message could be sent from an HTTP/1.0 user 2066 agent to an internal proxy code-named "fred", which uses HTTP/1.1 to 2067 forward the request to a public proxy at p.example.net, which 2068 completes the request by forwarding it to the origin server at 2069 www.example.com. The request received by www.example.com would then 2070 have the following Via header field: 2072 Via: 1.0 fred, 1.1 p.example.net (Apache/1.1) 2074 A proxy or gateway used as a portal through a network firewall SHOULD 2075 NOT forward the names and ports of hosts within the firewall region 2076 unless it is explicitly enabled to do so. If not enabled, the 2077 received-by host of any host behind the firewall SHOULD be replaced 2078 by an appropriate pseudonym for that host. 2080 A proxy or gateway MAY combine an ordered subsequence of Via header 2081 field entries into a single such entry if the entries have identical 2082 received-protocol values. For example, 2084 Via: 1.0 ricky, 1.1 ethel, 1.1 fred, 1.0 lucy 2086 could be collapsed to 2088 Via: 1.0 ricky, 1.1 mertz, 1.0 lucy 2090 Senders SHOULD NOT combine multiple entries unless they are all under 2091 the same organizational control and the hosts have already been 2092 replaced by pseudonyms. Senders MUST NOT combine entries which have 2093 different received-protocol values. 2095 5.8. Message Transforming 2097 If a proxy receives a request-target with a host name that is not a 2098 fully qualified domain name, it MAY add its own domain to the host 2099 name it received when forwarding the request. A proxy MUST NOT 2100 change the host name if it is a fully qualified domain name. 
2102 A non-transforming proxy MUST NOT modify the "path-absolute" and 2103 "query" parts of the received request-target when forwarding it to 2104 the next inbound server, except as noted above to replace an empty 2105 path with "/" or "*". 2107 A non-transforming proxy MUST preserve the message payload (Section 2108 3.3 of [Part2]), though it MAY change the message body through 2109 application or removal of a transfer-coding (Section 4). 2111 A non-transforming proxy SHOULD NOT modify header fields that provide 2112 information about the end points of the communication chain, the 2113 resource state, or the selected representation. 2115 A non-transforming proxy MUST NOT modify any of the following fields 2116 in a request or response, and it MUST NOT add any of these fields if 2117 not already present: 2119 o Allow (Section 8.4.1 of [Part2]) 2121 o Content-Location (Section 3.1.4.2 of [Part2]) 2123 o Content-MD5 (Section 14.15 of [RFC2616]) 2125 o ETag (Section 2.3 of [Part4]) 2127 o Last-Modified (Section 2.2 of [Part4]) 2129 o Server (Section 8.4.2 of [Part2]) 2131 A non-transforming proxy MUST NOT modify an Expires header field 2132 (Section 7.3 of [Part6]) if already present in a response, but it MAY 2133 add an Expires header field with a field-value identical to that of 2134 the Date header field. 2136 A proxy MUST NOT modify or add any of the following fields in a 2137 message that contains the no-transform cache-control directive: 2139 o Content-Encoding (Section 3.1.2.2 of [Part2]) 2141 o Content-Range (Section 5.2 of [Part5]) 2143 o Content-Type (Section 3.1.1.5 of [Part2]) 2145 A transforming proxy MAY modify or add these fields to a message that 2146 does not include no-transform, but if it does so, it MUST add a 2147 Warning 214 (Transformation applied) if one does not already appear 2148 in the message (see Section 7.5 of [Part6]). 
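As a rough illustration of the MUST NOT rules above, the following Python sketch compares the header fields a proxy received with those it is about to forward. It is not normative. The function name, the use of dictionaries keyed by lowercase field-name, and the choice to treat removal of a protected field as a modification are all assumptions of the sketch. The SHOULD NOT guidance is not modelled.

```python
# Fields a non-transforming proxy must neither modify nor add.
PROTECTED_ALWAYS = {"allow", "content-location", "content-md5",
                    "etag", "last-modified", "server"}

# Fields no proxy may modify or add when no-transform is present.
PROTECTED_NO_TRANSFORM = {"content-encoding", "content-range",
                          "content-type"}


def forbidden_changes(received, forwarded, transforming, no_transform):
    """Return the sorted field names whose change breaks a MUST NOT rule.

    received and forwarded map lowercase field-names to field-values
    (an assumed representation). no_transform is True when the message
    carries the no-transform cache-control directive.
    """
    protected = set()
    if not transforming:
        protected |= PROTECTED_ALWAYS
    if no_transform:
        protected |= PROTECTED_NO_TRANSFORM

    # Any difference (changed, added, or removed) is flagged.
    violations = {name for name in protected
                  if received.get(name) != forwarded.get(name)}

    # Expires: a non-transforming proxy must not modify an existing
    # value, but may add one that is identical to Date.
    if not transforming:
        if "expires" in received:
            if forwarded.get("expires") != received["expires"]:
                violations.add("expires")
        elif "expires" in forwarded:
            if forwarded["expires"] != forwarded.get("date"):
                violations.add("expires")

    return sorted(violations)
```

A proxy implementation might run such a check in tests or in debug builds to catch accidental rewrites before a message is sent onward.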
2150 Warning: Unnecessary modification of header fields might cause 2151 authentication failures if stronger authentication mechanisms are 2152 introduced in later versions of HTTP. Such authentication 2153 mechanisms MAY rely on the values of header fields not listed 2154 here. 2156 5.9. Associating a Response to a Request 2158 HTTP does not include a request identifier for associating a given 2159 request message with its corresponding one or more response messages. 2160 Hence, it relies on the order of response arrival to correspond 2161 exactly to the order in which requests are made on the same 2162 connection. More than one response message per request only occurs 2163 when one or more informational responses (1xx, see Section 7.2 of 2164 [Part2]) precede a final response to the same request. 2166 A client that uses persistent connections and sends more than one 2167 request per connection MUST maintain a list of outstanding requests 2168 in the order sent on that connection and MUST associate each received 2169 response message to the highest ordered request that has not yet 2170 received a final (non-1xx) response. 2172 6. Connection Management 2174 HTTP messaging is independent of the underlying transport or session- 2175 layer connection protocol(s). HTTP only presumes a reliable 2176 transport with in-order delivery of requests and the corresponding 2177 in-order delivery of responses. The mapping of HTTP request and 2178 response structures onto the data units of an underlying transport 2179 protocol is outside the scope of this specification. 2181 As described in Section 5.2, the specific connection protocols to be 2182 used for an HTTP interaction are determined by client configuration 2183 and the target URI. For example, the "http" URI scheme 2184 (Section 2.7.1) indicates a default connection of TCP over IP, with a 2185 default TCP port of 80, but the client might be configured to use a 2186 proxy via some other connection, port, or protocol. 
2188 HTTP implementations are expected to engage in connection management, 2189 which includes maintaining the state of current connections, 2190 establishing a new connection or reusing an existing connection, 2191 processing messages received on a connection, detecting connection 2192 failures, and closing each connection. Most clients maintain 2193 multiple connections in parallel, including more than one connection 2194 per server endpoint. Most servers are designed to maintain thousands 2195 of concurrent connections, while controlling request queues to enable 2196 fair use and detect denial of service attacks. 2198 6.1. Connection 2200 The "Connection" header field allows the sender to indicate desired 2201 control options for the current connection. In order to avoid 2202 confusing downstream recipients, a proxy or gateway MUST remove or 2203 replace any received connection options before forwarding the 2204 message. 2206 When a header field is used to supply control information for or 2207 about the current connection, the sender SHOULD list the 2208 corresponding field-name within the "Connection" header field. A 2209 proxy or gateway MUST parse a received Connection header field before 2210 a message is forwarded and, for each connection-option in this field, 2211 remove any header field(s) from the message with the same name as the 2212 connection-option, and then remove the Connection header field itself 2213 (or replace it with the intermediary's own connection options for the 2214 forwarded message). 
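The forwarding procedure just described can be sketched as follows. This is a minimal Python illustration under stated assumptions: header fields are represented as a list of (name, value) pairs, the function name and the own_options parameter are invented for the sketch, and "X-Hop" in the usage note is a hypothetical field name.

```python
def strip_hop_by_hop(headers, own_options=()):
    """Return the header fields to forward (illustrative sketch).

    headers is a list of (field-name, field-value) pairs as received.
    own_options lists the intermediary's own connection options, if any.
    """
    # Collect every connection-option named in any Connection field.
    # Field names and connection options are compared without regard
    # to case.
    options = set()
    for name, value in headers:
        if name.lower() == "connection":
            for token in value.split(","):
                token = token.strip().lower()
                if token:
                    options.add(token)

    # Drop the Connection field itself and every field it names.
    forwarded = [(name, value) for name, value in headers
                 if name.lower() != "connection"
                 and name.lower() not in options]

    # Optionally replace it with the intermediary's own options.
    if own_options:
        forwarded.append(("Connection", ", ".join(own_options)))
    return forwarded
```

For instance, given the fields Host, "Connection: keep-alive, X-Hop", Keep-Alive, X-Hop, and Accept, the sketch forwards only Host and Accept, plus a new Connection field if own_options is supplied.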
2216 Hence, the Connection header field provides a declarative way of 2217 distinguishing header fields that are only intended for the immediate 2218 recipient ("hop-by-hop") from those fields that are intended for all 2219 recipients on the chain ("end-to-end"), enabling the message to be 2220 self-descriptive and allowing future connection-specific extensions 2221 to be deployed without fear that they will be blindly forwarded by 2222 older intermediaries. 2224 The Connection header field's value has the following grammar: 2226 Connection = 1#connection-option 2227 connection-option = token 2229 Connection options are case-insensitive. 2231 A sender MUST NOT include field-names in the Connection header field- 2232 value for fields that are defined as expressing constraints for all 2233 recipients in the request or response chain, such as the Cache- 2234 Control header field (Section 7.2 of [Part6]). 2236 The connection options do not have to correspond to a header field 2237 present in the message, since a connection-specific header field 2238 might not be needed if there are no parameters associated with that 2239 connection option. Recipients that trigger certain connection 2240 behavior based on the presence of connection options MUST do so based 2241 on the presence of the connection-option rather than only the 2242 presence of the optional header field. In other words, if the 2243 connection option is received as a header field but not indicated 2244 within the Connection field-value, then the recipient MUST ignore the 2245 connection-specific header field because it has likely been forwarded 2246 by an intermediary that is only partially conformant. 2248 When defining new connection options, specifications ought to 2249 carefully consider existing deployed header fields and ensure that 2250 the new connection option does not share the same name as an 2251 unrelated header field that might already be deployed. 
Defining a 2252 new connection option essentially reserves that potential field-name 2253 for carrying additional information related to the connection option, 2254 since it would be unwise for senders to use that field-name for 2255 anything else. 2257 The "close" connection option is defined for a sender to signal that 2258 this connection will be closed after completion of the response. For 2259 example, 2261 Connection: close 2263 in either the request or the response header fields indicates that 2264 the connection SHOULD be closed after the current request/response is 2265 complete (Section 6.2.5). 2267 A client that does not support persistent connections MUST send the 2268 "close" connection option in every request message. 2270 A server that does not support persistent connections MUST send the 2271 "close" connection option in every response message that does not 2272 have a 1xx (Informational) status code. 2274 6.2. Persistent Connections 2276 HTTP was originally designed to use a separate connection for each 2277 request/response pair. As the Web evolved and embedded requests 2278 became common for inline images, the connection establishment 2279 overhead was a significant drain on performance and a concern for 2280 Internet congestion. Message framing (via Content-Length) and 2281 optional long-lived connections (via Keep-Alive) were added to 2282 HTTP/1.0 in order to improve performance for some requests. However, 2283 these extensions were insufficient for dynamically generated 2284 responses and difficult to use with intermediaries. 2286 HTTP/1.1 defaults to the use of "persistent connections", which allow 2287 multiple requests and responses to be carried over a single 2288 connection. The "close" connection-option is used to signal that a 2289 connection will close after the current request/response. 
Persistent 2290 connections have a number of advantages: 2292 o By opening and closing fewer connections, CPU time is saved in 2293 routers and hosts (clients, servers, proxies, gateways, tunnels, 2294 or caches), and memory used for protocol control blocks can be 2295 saved in hosts. 2297 o Most requests and responses can be pipelined on a connection. 2298 Pipelining allows a client to make multiple requests without 2299 waiting for each response, allowing a single connection to be used 2300 much more efficiently and with less overall latency. 2302 o For TCP connections, network congestion is reduced by eliminating 2303 the packets associated with the three way handshake and graceful 2304 close procedures, and by allowing sufficient time to determine the 2305 congestion state of the network. 2307 o Latency on subsequent requests is reduced since there is no time 2308 spent in the connection opening handshake. 2310 o HTTP can evolve more gracefully, since most errors can be reported 2311 without the penalty of closing the connection. Clients using 2312 future versions of HTTP might optimistically try a new feature, 2313 but if communicating with an older server, retry with old 2314 semantics after an error is reported. 2316 HTTP implementations SHOULD implement persistent connections. 2318 6.2.1. Establishment 2320 It is beyond the scope of this specification to describe how 2321 connections are established via various transport or session-layer 2322 protocols. Each connection applies to only one transport link. 
2324 A recipient determines whether a connection is persistent or not 2325 based on the most recently received message's protocol version and 2326 Connection header field (if any): 2328 o If the close connection option is present, the connection will not 2329 persist after the current response; else, 2331 o If the received protocol is HTTP/1.1 (or later), the connection 2332 will persist after the current response; else, 2334 o If the received protocol is HTTP/1.0, the "keep-alive" connection 2335 option is present, the recipient is not a proxy, and the recipient 2336 wishes to honor the HTTP/1.0 "keep-alive" mechanism, the 2337 connection will persist after the current response; otherwise, 2339 o The connection will close after the current response. 2341 A proxy server MUST NOT maintain a persistent connection with an 2342 HTTP/1.0 client (see Section 19.7.1 of [RFC2068] for information and 2343 discussion of the problems with the Keep-Alive header field 2344 implemented by many HTTP/1.0 clients). 2346 6.2.2. Reuse 2348 In order to remain persistent, all messages on a connection MUST have 2349 a self-defined message length (i.e., one not defined by closure of 2350 the connection), as described in Section 3.3. 2352 A server MAY assume that an HTTP/1.1 client intends to maintain a 2353 persistent connection until a close connection option is received in 2354 a request. 2356 A client MAY reuse a persistent connection until it sends or receives 2357 a close connection option or receives an HTTP/1.0 response without a 2358 "keep-alive" connection option. 2360 Clients and servers SHOULD NOT assume that a persistent connection is 2361 maintained for HTTP versions less than 1.1 unless it is explicitly 2362 signaled. See Appendix A.1.2 for more information on backward 2363 compatibility with HTTP/1.0 clients. 2365 6.2.2.1. 
Pipelining 2367 A client that supports persistent connections MAY "pipeline" its 2368 requests (i.e., send multiple requests without waiting for each 2369 response). A server MUST send its responses to those requests in the 2370 same order that the requests were received. 2372 Clients which assume persistent connections and pipeline immediately 2373 after connection establishment SHOULD be prepared to retry their 2374 connection if the first pipelined attempt fails. If a client does 2375 such a retry, it MUST NOT pipeline before it knows the connection is 2376 persistent. Clients MUST also be prepared to resend their requests 2377 if the server closes the connection before sending all of the 2378 corresponding responses. 2380 Clients SHOULD NOT pipeline requests using non-idempotent request 2381 methods or non-idempotent sequences of request methods (see Section 2382 5.2.2 of [Part2]). Otherwise, a premature termination of the 2383 transport connection could lead to indeterminate results. A client 2384 wishing to send a non-idempotent request SHOULD wait to send that 2385 request until it has received the response status line for the 2386 previous request. 2388 6.2.2.2. Retrying Requests 2390 Senders can close the transport connection at any time. Therefore, 2391 clients, servers, and proxies MUST be able to recover from 2392 asynchronous close events. Client software MAY reopen the transport 2393 connection and retransmit the aborted sequence of requests without 2394 user interaction so long as the request sequence is idempotent (see 2395 Section 5.2.2 of [Part2]). Non-idempotent request methods or 2396 sequences MUST NOT be automatically retried, although user agents MAY 2397 offer a human operator the choice of retrying the request(s). 2398 Confirmation by user-agent software with semantic understanding of 2399 the application MAY substitute for user confirmation. The automatic 2400 retry SHOULD NOT be repeated if the second sequence of requests 2401 fails. 
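The retry rules above might be approximated as in the following Python sketch. Several points are assumptions rather than statements of this specification: the function and parameter names are invented, the set of idempotent methods is taken from our reading of Section 5.2.2 of [Part2], and a sequence is treated as idempotent whenever every method in it is idempotent, which is a simplification because some sequences of individually idempotent requests are not idempotent as a whole.

```python
# Assumed list of idempotent methods; see Section 5.2.2 of [Part2]
# for the authoritative definition.
IDEMPOTENT_METHODS = frozenset(
    {"GET", "HEAD", "PUT", "DELETE", "OPTIONS", "TRACE"})


def may_retry_automatically(methods, times_sent):
    """Decide whether an aborted request sequence may be resent
    without user interaction (illustrative sketch).

    methods is the ordered list of request methods in the aborted
    sequence; times_sent counts how often it has already been sent.
    """
    # The automatic retry is not repeated once the second attempt
    # has failed.
    if times_sent >= 2:
        return False
    # Method names are case-sensitive, so no case folding is applied.
    return all(method in IDEMPOTENT_METHODS for method in methods)
```

A user agent that gets False from such a check could still offer a human operator the choice of retrying, as permitted above.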
2403 6.2.3. Concurrency 2405 Clients SHOULD limit the number of simultaneous connections that they 2406 maintain to a given server. 2408 Previous revisions of HTTP gave a specific number of connections as a 2409 ceiling, but this was found to be impractical for many applications. 2410 As a result, this specification does not mandate a particular maximum 2411 number of connections, but instead encourages clients to be 2412 conservative when opening multiple connections. 2414 Multiple connections are typically used to avoid the "head-of-line 2415 blocking" problem, wherein a request that takes significant server- 2416 side processing and/or has a large payload blocks subsequent requests 2417 on the same connection. However, each connection consumes server 2418 resources. Furthermore, using multiple connections can cause 2419 undesirable side effects in congested networks. 2421 Note that servers might reject traffic that they deem abusive, 2422 including an excessive number of connections from a client. 2424 6.2.4. Failures and Time-outs 2426 Servers will usually have some time-out value beyond which they will 2427 no longer maintain an inactive connection. Proxy servers might make 2428 this a higher value since it is likely that the client will be making 2429 more connections through the same server. The use of persistent 2430 connections places no requirements on the length (or existence) of 2431 this time-out for either the client or the server. 2433 When a client or server wishes to time-out it SHOULD issue a graceful 2434 close on the transport connection. Clients and servers SHOULD both 2435 constantly watch for the other side of the transport close, and 2436 respond to it as appropriate. If a client or server does not detect 2437 the other side's close promptly it could cause unnecessary resource 2438 drain on the network. 2440 A client, server, or proxy MAY close the transport connection at any 2441 time. 
For example, a client might have started to send a new request 2442 at the same time that the server has decided to close the "idle" 2443 connection. From the server's point of view, the connection is being 2444 closed while it was idle, but from the client's point of view, a 2445 request is in progress. 2447 Servers SHOULD maintain persistent connections and allow the 2448 underlying transport's flow control mechanisms to resolve temporary 2449 overloads, rather than terminate connections with the expectation 2450 that clients will retry. The latter technique can exacerbate network 2451 congestion. 2453 A client sending a message body SHOULD monitor the network connection 2454 for an error status code while it is transmitting the request. If 2455 the client sees an error status code, it SHOULD immediately cease 2456 transmitting the body and close the connection. 2458 6.2.5. Tear-down 2460 The Connection header field (Section 6.1) provides a "close" 2461 connection option that a sender SHOULD send when it wishes to close 2462 the connection after the current request/response pair. 2464 A client that sends a close connection option MUST NOT send further 2465 requests on that connection (after the one containing close) and MUST 2466 close the connection after reading the final response message 2467 corresponding to this request. 2469 A server that receives a close connection option MUST initiate a 2470 lingering close of the connection after it sends the final response 2471 to the request that contained close. The server SHOULD include a 2472 close connection option in its final response on that connection. 2473 The server MUST NOT process any further requests received on that 2474 connection. 2476 A server that sends a close connection option MUST initiate a 2477 lingering close of the connection after it sends the response 2478 containing close. The server MUST NOT process any further requests 2479 received on that connection. 
2481 A client that receives a close connection option MUST cease sending 2482 requests on that connection and close the connection after reading 2483 the response message containing the close; if additional pipelined 2484 requests had been sent on the connection, the client SHOULD assume 2485 that they will not be processed by the server. 2487 If a server performs an immediate close of a TCP connection, there is 2488 a significant risk that the client will not be able to read the last 2489 HTTP response. If the server receives additional data from the 2490 client on a fully-closed connection, such as another request that was 2491 sent by the client before receiving the server's response, the 2492 server's TCP stack will send a reset packet to the client; 2493 unfortunately, the reset packet might erase the client's 2494 unacknowledged input buffers before they can be read and interpreted 2495 by the client's HTTP parser. 2497 To avoid the TCP reset problem, a server can perform a lingering 2498 close on a connection by closing only the write side of the read/ 2499 write connection (a half-close) and continuing to read from the 2500 connection until the connection is closed by the client or the server 2501 is reasonably certain that its own TCP stack has received the 2502 client's acknowledgement of the packet(s) containing the server's 2503 last response. It is then safe for the server to fully close the 2504 connection. 2506 It is unknown whether the reset problem is exclusive to TCP or might 2507 also be found in other transport connection protocols. 2509 6.3. Upgrade 2511 The "Upgrade" header field is intended to provide a simple mechanism 2512 for transitioning from HTTP/1.1 to some other protocol on the same 2513 connection. A client MAY send a list of protocols in the Upgrade 2514 header field of a request to invite the server to switch to one or 2515 more of those protocols before sending the final response. 
A server 2516 MUST send an Upgrade header field in 101 (Switching Protocols) 2517 responses to indicate which protocol(s) are being switched to, and 2518 MUST send it in 426 (Upgrade Required) responses to indicate 2519 acceptable protocols. A server MAY send an Upgrade header field in 2520 any other response to indicate that it might be willing to upgrade 2521 to one of the specified protocols for a future request. 2523 Upgrade = 1#protocol 2525 protocol = protocol-name ["/" protocol-version] 2526 protocol-name = token 2527 protocol-version = token 2529 For example, 2531 Upgrade: HTTP/2.0, SHTTP/1.3, IRC/6.9, RTA/x11 2533 Upgrade eases the difficult transition between incompatible protocols 2534 by allowing the client to initiate a request in the more commonly 2535 supported protocol while indicating to the server that it would like 2536 to use a "better" protocol if available (where "better" is determined 2537 by the server, possibly according to the nature of the request method 2538 or target resource). 2540 Upgrade cannot be used to insist on a protocol change; its acceptance 2541 and use by the server are optional. The capabilities and nature of 2542 the application-level communication after the protocol change are 2543 entirely dependent upon the new protocol chosen, although the first 2544 action after changing the protocol MUST be a response to the initial 2545 HTTP request that contained the Upgrade header field. 2547 For example, if the Upgrade header field is received in a GET request 2548 and the server decides to switch protocols, then it MUST first 2549 respond with a 101 (Switching Protocols) message in HTTP/1.1 and then 2550 immediately follow that with the new protocol's equivalent of a 2551 response to a GET on the target resource. This allows a connection 2552 to be upgraded to protocols with the same semantics as HTTP without 2553 the latency cost of an additional round-trip.
A server MUST NOT 2554 switch protocols unless the received message semantics can be honored 2555 by the new protocol; an OPTIONS request can be honored by any 2556 protocol. 2558 When Upgrade is sent, a sender MUST also send a Connection header 2559 field (Section 6.1) that contains the "upgrade" connection option, in 2560 order to prevent Upgrade from being accidentally forwarded by 2561 intermediaries that might not implement the listed protocols. A 2562 server MUST ignore an Upgrade header field that is received in an 2563 HTTP/1.0 request. 2565 The Upgrade header field only applies to switching application-level 2566 protocols on the existing connection; it cannot be used to switch to 2567 a protocol on a different connection. For that purpose, it is more 2568 appropriate to use a 3xx (Redirection) response (Section 7.4 of 2569 [Part2]). 2571 This specification only defines the protocol name "HTTP" for use by 2572 the family of Hypertext Transfer Protocols, as defined by the HTTP 2573 version rules of Section 2.6 and future updates to this 2574 specification. Additional tokens can be registered with IANA using 2575 the registration procedure defined in Section 7.6. 2577 7. IANA Considerations 2579 7.1. Header Field Registration 2581 HTTP header fields are registered within the Message Header Field 2582 Registry [RFC3864] maintained by IANA at . 
2585     This document defines the following HTTP header fields, so their
2586     associated registry entries shall be updated according to the
2587     permanent registrations below:

2589     +-------------------+----------+----------+---------------+
2590     | Header Field Name | Protocol | Status   | Reference     |
2591     +-------------------+----------+----------+---------------+
2592     | Connection        | http     | standard | Section 6.1   |
2593     | Content-Length    | http     | standard | Section 3.3.2 |
2594     | Host              | http     | standard | Section 5.4   |
2595     | TE                | http     | standard | Section 4.3   |
2596     | Trailer           | http     | standard | Section 4.1.1 |
2597     | Transfer-Encoding | http     | standard | Section 3.3.1 |
2598     | Upgrade           | http     | standard | Section 6.3   |
2599     | Via               | http     | standard | Section 5.7   |
2600     +-------------------+----------+----------+---------------+

2602     Furthermore, the header field-name "Close" shall be registered as
2603     "reserved", since using that name as an HTTP header field might
2604     conflict with the "close" connection option of the "Connection"
2605     header field (Section 6.1).

2607     +-------------------+----------+----------+-------------+
2608     | Header Field Name | Protocol | Status   | Reference   |
2609     +-------------------+----------+----------+-------------+
2610     | Close             | http     | reserved | Section 7.1 |
2611     +-------------------+----------+----------+-------------+

2613     The change controller is: "IETF (iesg@ietf.org) - Internet
2614     Engineering Task Force".

2616  7.2.  URI Scheme Registration

2618     IANA maintains the registry of URI Schemes [RFC4395] at
2619     .
2621 This document defines the following URI schemes, so their associated 2622 registry entries shall be updated according to the permanent 2623 registrations below: 2625 +------------+------------------------------------+---------------+ 2626 | URI Scheme | Description | Reference | 2627 +------------+------------------------------------+---------------+ 2628 | http | Hypertext Transfer Protocol | Section 2.7.1 | 2629 | https | Hypertext Transfer Protocol Secure | Section 2.7.2 | 2630 +------------+------------------------------------+---------------+ 2632 7.3. Internet Media Type Registrations 2634 This document serves as the specification for the Internet media 2635 types "message/http" and "application/http". The following is to be 2636 registered with IANA (see [RFC4288]). 2638 7.3.1. Internet Media Type message/http 2640 The message/http type can be used to enclose a single HTTP request or 2641 response message, provided that it obeys the MIME restrictions for 2642 all "message" types regarding line length and encodings. 2644 Type name: message 2646 Subtype name: http 2648 Required parameters: none 2650 Optional parameters: version, msgtype 2652 version: The HTTP-version number of the enclosed message (e.g., 2653 "1.1"). If not present, the version can be determined from the 2654 first line of the body. 2656 msgtype: The message type -- "request" or "response". If not 2657 present, the type can be determined from the first line of the 2658 body. 2660 Encoding considerations: only "7bit", "8bit", or "binary" are 2661 permitted 2663 Security considerations: none 2665 Interoperability considerations: none 2667 Published specification: This specification (see Section 7.3.1). 2669 Applications that use this media type: 2671 Additional information: 2673 Magic number(s): none 2675 File extension(s): none 2676 Macintosh file type code(s): none 2678 Person and email address to contact for further information: See 2679 Authors Section. 
2681 Intended usage: COMMON 2683 Restrictions on usage: none 2685 Author/Change controller: IESG 2687 7.3.2. Internet Media Type application/http 2689 The application/http type can be used to enclose a pipeline of one or 2690 more HTTP request or response messages (not intermixed). 2692 Type name: application 2694 Subtype name: http 2696 Required parameters: none 2698 Optional parameters: version, msgtype 2700 version: The HTTP-version number of the enclosed messages (e.g., 2701 "1.1"). If not present, the version can be determined from the 2702 first line of the body. 2704 msgtype: The message type -- "request" or "response". If not 2705 present, the type can be determined from the first line of the 2706 body. 2708 Encoding considerations: HTTP messages enclosed by this type are in 2709 "binary" format; use of an appropriate Content-Transfer-Encoding 2710 is required when transmitted via E-mail. 2712 Security considerations: none 2714 Interoperability considerations: none 2716 Published specification: This specification (see Section 7.3.2). 2718 Applications that use this media type: 2720 Additional information: 2722 Magic number(s): none 2724 File extension(s): none 2726 Macintosh file type code(s): none 2728 Person and email address to contact for further information: See 2729 Authors Section. 2731 Intended usage: COMMON 2733 Restrictions on usage: none 2735 Author/Change controller: IESG 2737 7.4. Transfer Coding Registry 2739 The HTTP Transfer Coding Registry defines the name space for transfer 2740 coding names. 2742 Registrations MUST include the following fields: 2744 o Name 2746 o Description 2748 o Pointer to specification text 2750 Names of transfer codings MUST NOT overlap with names of content 2751 codings (Section 3.1.2.1 of [Part2]) unless the encoding 2752 transformation is identical, as is the case for the compression 2753 codings defined in Section 4.2. 
2755 Values to be added to this name space require IETF Review (see 2756 Section 4.1 of [RFC5226]), and MUST conform to the purpose of 2757 transfer coding defined in this section. Use of program names for 2758 the identification of encoding formats is not desirable and is 2759 discouraged for future encodings. 2761 The registry itself is maintained at 2762 . 2764 7.5. Transfer Coding Registrations 2766 The HTTP Transfer Coding Registry shall be updated with the 2767 registrations below: 2769 +----------+----------------------------------------+---------------+ 2770 | Name | Description | Reference | 2771 +----------+----------------------------------------+---------------+ 2772 | chunked | Transfer in a series of chunks | Section 4.1 | 2773 | compress | UNIX "compress" program method | Section 4.2.1 | 2774 | deflate | "deflate" compression mechanism | Section 4.2.2 | 2775 | | ([RFC1951]) used inside the "zlib" | | 2776 | | data format ([RFC1950]) | | 2777 | gzip | Same as GNU zip [RFC1952] | Section 4.2.3 | 2778 +----------+----------------------------------------+---------------+ 2780 7.6. Upgrade Token Registry 2782 The HTTP Upgrade Token Registry defines the name space for protocol- 2783 name tokens used to identify protocols in the Upgrade header field. 2784 Each registered protocol name is associated with contact information 2785 and an optional set of specifications that details how the connection 2786 will be processed after it has been upgraded. 2788 Registrations happen on a "First Come First Served" basis (see 2789 Section 4.1 of [RFC5226]) and are subject to the following rules: 2791 1. A protocol-name token, once registered, stays registered forever. 2793 2. The registration MUST name a responsible party for the 2794 registration. 2796 3. The registration MUST name a point of contact. 2798 4. The registration MAY name a set of specifications associated with 2799 that token. Such specifications need not be publicly available. 2801 5. 
The registration SHOULD name a set of expected "protocol-version"
2802         tokens associated with that token at the time of registration.

2804     6.  The responsible party MAY change the registration at any time.
2805         The IANA will keep a record of all such changes, and make them
2806         available upon request.

2808     7.  The IESG MAY reassign responsibility for a protocol token.  This
2809         will normally only be used in the case when a responsible party
2810         cannot be contacted.

2812     This registration procedure for HTTP Upgrade Tokens replaces that
2813     previously defined in Section 7.2 of [RFC2817].

2815  7.7.  Upgrade Token Registration

2817     The HTTP Upgrade Token Registry shall be updated with the
2818     registration below:

2820     +-------+----------------------+----------------------+-------------+
2821     | Value | Description          | Expected Version     | Reference   |
2822     |       |                      | Tokens               |             |
2823     +-------+----------------------+----------------------+-------------+
2824     | HTTP  | Hypertext Transfer   | any DIGIT.DIGIT      | Section 2.6 |
2825     |       | Protocol             | (e.g., "2.0")        |             |
2826     +-------+----------------------+----------------------+-------------+

2828     The responsible party is: "IETF (iesg@ietf.org) - Internet
2829     Engineering Task Force".

2831  8.  Security Considerations

2833     This section is meant to inform application developers, information
2834     providers, and users of the security limitations in HTTP/1.1 as
2835     described by this document.  The discussion does not include
2836     definitive solutions to the problems revealed, though it does make
2837     some suggestions for reducing security risks.

2839  8.1.  Personal Information

2841     HTTP clients are often privy to large amounts of personal
2842     information, including both information provided by the user to
2843     interact with resources (e.g., the user's name, location, mail
2844     address, passwords, encryption keys, etc.) and information about the
2845     user's browsing activity over time (e.g., history, bookmarks, etc.).
2846 HTTP implementations need to prevent unintentional leakage of this 2847 information. 2849 8.2. Abuse of Server Log Information 2851 A server is in the position to save personal data about a user's 2852 requests which might identify their reading patterns or subjects of 2853 interest. In particular, log information gathered at an intermediary 2854 often contains a history of user agent interaction, across a 2855 multitude of sites, that can be traced to individual users. 2857 HTTP log information is confidential in nature; its handling is often 2858 constrained by laws and regulations. Log information needs to be 2859 securely stored and appropriate guidelines followed for its analysis. 2860 Anonymization of personal information within individual entries 2861 helps, but is generally not sufficient to prevent real log traces 2862 from being re-identified based on correlation with other access 2863 characteristics. As such, access traces that are keyed to a specific 2864 client should not be published even if the key is pseudonymous. 2866 To minimize the risk of theft or accidental publication, log 2867 information should be purged of personally identifiable information, 2868 including user identifiers, IP addresses, and user-provided query 2869 parameters, as soon as that information is no longer necessary to 2870 support operational needs for security, auditing, or fraud control. 2872 8.3. Attacks Based On File and Path Names 2874 Origin servers SHOULD be careful to restrict the documents returned 2875 by HTTP requests to be only those that were intended by the server 2876 administrators. If an HTTP server translates HTTP URIs directly into 2877 file system calls, the server MUST take special care not to serve 2878 files that were not intended to be delivered to HTTP clients. For 2879 example, UNIX, Microsoft Windows, and other operating systems use 2880 ".." as a path component to indicate a directory level above the 2881 current one. 
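One possible way to apply this caution is sketched below in Python, under stated assumptions. The helper name resolve_under_root and the document root /srv/www are hypothetical, and the request path is assumed to be already separated from any query component. Rejecting every ".." segment outright is a deliberately conservative choice that is stricter than this specification requires.

```python
import os
from urllib.parse import unquote


def resolve_under_root(document_root, request_path):
    """Map a request path to a file below document_root.

    Returns the resolved file system path, or None when the request would
    reach outside the root. Hypothetical helper for illustration only.
    """
    # Decode percent-escapes first, so that "%2e%2e" is seen as "..".
    decoded = unquote(request_path)
    if "\x00" in decoded:
        return None  # embedded NUL is never a legitimate file name
    # Treat backslashes as separators too, since some systems accept both.
    decoded = decoded.replace("\\", "/")
    segments = [s for s in decoded.split("/") if s not in ("", ".")]
    if ".." in segments:
        return None  # conservative: refuse any parent-directory segment
    root = os.path.realpath(document_root)
    candidate = os.path.realpath(os.path.join(root, *segments))
    # Resolving symbolic links and comparing again catches links that
    # point outside the document root.
    if os.path.commonpath([root, candidate]) != root:
        return None
    return candidate
```

With a root of /srv/www, a request path such as /a/../../etc/passwd or /%2e%2e/secret is refused, while /docs/index.html resolves to a path below the root. Files that are meant only for internal use by the server would still need their own access checks, which this sketch does not attempt.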
On such a system, an HTTP server MUST disallow any such 2882 construct in the request-target if it would otherwise allow access to 2883 a resource outside those intended to be accessible via the HTTP 2884 server. Similarly, files intended for reference only internally to 2885 the server (such as access control files, configuration files, and 2886 script code) MUST be protected from inappropriate retrieval, since 2887 they might contain sensitive information. 2889 8.4. DNS-related Attacks 2891 HTTP clients rely heavily on the Domain Name Service (DNS), and are 2892 thus generally prone to security attacks based on the deliberate 2893 misassociation of IP addresses and DNS names not protected by DNSSec. 2894 Clients need to be cautious in assuming the validity of an IP number/ 2895 DNS name association unless the response is protected by DNSSec 2896 ([RFC4033]). 2898 8.5. Intermediaries and Caching 2900 By their very nature, HTTP intermediaries are men-in-the-middle, and 2901 represent an opportunity for man-in-the-middle attacks. Compromise 2902 of the systems on which the intermediaries run can result in serious 2903 security and privacy problems. Intermediaries have access to 2904 security-related information, personal information about individual 2905 users and organizations, and proprietary information belonging to 2906 users and content providers. A compromised intermediary, or an 2907 intermediary implemented or configured without regard to security and 2908 privacy considerations, might be used in the commission of a wide 2909 range of potential attacks. 2911 Intermediaries that contain a shared cache are especially vulnerable 2912 to cache poisoning attacks. 2914 Implementers need to consider the privacy and security implications 2915 of their design and coding decisions, and of the configuration 2916 options they provide to operators (especially the default 2917 configuration). 
2919 Users need to be aware that intermediaries are no more trustworthy 2920 than the people who run them; HTTP itself cannot solve this problem. 2922 8.6. Protocol Element Size Overflows 2924 Because HTTP uses mostly textual, character-delimited fields, 2925 attackers can overflow buffers in implementations, and/or perform a 2926 Denial of Service against implementations that accept fields with 2927 unlimited lengths. 2929 To promote interoperability, this specification makes specific 2930 recommendations for minimum size limits on request-line 2931 (Section 3.1.1) and blocks of header fields (Section 3.2). These are 2932 minimum recommendations, chosen to be supportable even by 2933 implementations with limited resources; it is expected that most 2934 implementations will choose substantially higher limits. 2936 This specification also provides a way for servers to reject messages 2937 that have request-targets that are too long (Section 7.5.12 of 2938 [Part2]) or request entities that are too large (Section 7.5 of 2939 [Part2]). 2941 Recipients SHOULD carefully limit the extent to which they read other 2942 fields, including (but not limited to) request methods, response 2943 status phrases, header field-names, and body chunks, so as to avoid 2944 denial of service attacks without impeding interoperability. 2946 9. Acknowledgments 2948 This edition of HTTP builds on the many contributions that went into 2949 RFC 1945, RFC 2068, RFC 2145, and RFC 2616, including substantial 2950 contributions made by the previous authors, editors, and working 2951 group chairs: Tim Berners-Lee, Ari Luotonen, Roy T. Fielding, Henrik 2952 Frystyk Nielsen, Jim Gettys, Jeffrey C. Mogul, Larry Masinter, Paul 2953 J. Leach, and Mark Nottingham. See Section 16 of [RFC2616] for 2954 additional acknowledgements from prior revisions. 
2956 Since 1999, the following contributors have helped improve the HTTP 2957 specification by reporting bugs, asking smart questions, drafting or 2958 reviewing text, and evaluating open issues: 2960 Adam Barth, Adam Roach, Addison Phillips, Adrian Chadd, Adrien W. de 2961 Croy, Alan Ford, Alan Ruttenberg, Albert Lunde, Alek Storm, Alex 2962 Rousskov, Alexandre Morgaut, Alexey Melnikov, Alisha Smith, Amichai 2963 Rothman, Amit Klein, Amos Jeffries, Andreas Maier, Andreas Petersson, 2964 Anil Sharma, Anne van Kesteren, Anthony Bryan, Asbjorn Ulsberg, 2965 Balachander Krishnamurthy, Barry Leiba, Ben Laurie, Benjamin Niven- 2966 Jenkins, Bil Corry, Bill Burke, Bjoern Hoehrmann, Bob Scheifler, 2967 Boris Zbarsky, Brett Slatkin, Brian Kell, Brian McBarron, Brian Pane, 2968 Brian Smith, Bryce Nesbitt, Cameron Heavon-Jones, Carl Kugler, 2969 Carsten Bormann, Charles Fry, Chris Newman, Cyrus Daboo, Dale Robert 2970 Anderson, Dan Wing, Dan Winship, Daniel Stenberg, Dave Cridland, Dave 2971 Crocker, Dave Kristol, David Booth, David Singer, David W. Morris, 2972 Diwakar Shetty, Dmitry Kurochkin, Drummond Reed, Duane Wessels, 2973 Edward Lee, Eliot Lear, Eran Hammer-Lahav, Eric D. Williams, Eric J. 2974 Bowman, Eric Lawrence, Eric Rescorla, Erik Aronesty, Evan Prodromou, 2975 Florian Weimer, Frank Ellermann, Fred Bohle, Gabriel Montenegro, 2976 Geoffrey Sneddon, Gervase Markham, Grahame Grieve, Greg Wilkins, 2977 Harald Tveit Alvestrand, Harry Halpin, Helge Hess, Henrik Nordstrom, 2978 Henry S. Thompson, Henry Story, Herbert van de Sompel, Howard Melman, 2979 Hugo Haas, Ian Fette, Ian Hickson, Ido Safruti, Ingo Struck, J. Ross 2980 Nicoll, James H. Manger, James Lacey, James M. Snell, Jamie Lokier, 2981 Jan Algermissen, Jeff Hodges (who came up with the term 'effective 2982 Request-URI'), Jeff Walden, Jim Luther, Joe D. Williams, Joe 2983 Gregorio, Joe Orton, John C. Klensin, John C. 
Mallery, John Cowan, 2984 John Kemp, John Panzer, John Schneider, John Stracke, John Sullivan, 2985 Jonas Sicking, Jonathan Billington, Jonathan Moore, Jonathan Rees, 2986 Jonathan Silvera, Jordi Ros, Joris Dobbelsteen, Josh Cohen, Julien 2987 Pierre, Jungshik Shin, Justin Chapweske, Justin Erenkrantz, Justin 2988 James, Kalvinder Singh, Karl Dubost, Keith Hoffman, Keith Moore, Koen 2989 Holtman, Konstantin Voronkov, Kris Zyp, Lisa Dusseault, Maciej 2990 Stachowiak, Marc Schneider, Marc Slemko, Mark Baker, Mark Pauley, 2991 Mark Watson, Markus Isomaki, Markus Lanthaler, Martin J. Duerst, 2992 Martin Musatov, Martin Nilsson, Martin Thomson, Matt Lynch, Matthew 2993 Cox, Max Clark, Michael Burrows, Michael Hausenblas, Mike Amundsen, 2994 Mike Belshe, Mike Kelly, Mike Schinkel, Miles Sabin, Murray S. 2995 Kucherawy, Mykyta Yevstifeyev, Nathan Rixham, Nicholas Shanks, Nico 2996 Williams, Nicolas Alvarez, Nicolas Mailhot, Noah Slater, Pablo 2997 Castro, Pat Hayes, Patrick R. McManus, Paul E. Jones, Paul Hoffman, 2998 Paul Marquess, Peter Lepeska, Peter Saint-Andre, Peter Watkins, Phil 2999 Archer, Philippe Mougin, Phillip Hallam-Baker, Poul-Henning Kamp, 3000 Preethi Natarajan, Rajeev Bector, Ray Polk, Reto Bachmann-Gmuer, 3001 Richard Cyganiak, Robert Brewer, Robert Collins, Robert O'Callahan, 3002 Robert Olofsson, Robert Sayre, Robert Siemer, Robert de Wilde, 3003 Roberto Javier Godoy, Roberto Peon, Ronny Widjaja, S. Mike Dierken, 3004 Salvatore Loreto, Sam Johnston, Sam Ruby, Scott Lawrence (who 3005 maintained the original issues list), Sean B. 
Palmer, Shane McCarron, 3006 Stefan Eissing, Stefan Tilkov, Stefanos Harhalakis, Stephane 3007 Bortzmeyer, Stephen Farrell, Stephen Ludin, Stuart Williams, Subbu 3008 Allamaraju, Sylvain Hellegouarch, Tapan Divekar, Tatsuya Hayashi, Ted 3009 Hardie, Thomas Broyer, Thomas Nordin, Thomas Roessler, Tim Bray, Tim 3010 Morgan, Tim Olsen, Tom Zhou, Travis Snoozy, Tyler Close, Vincent 3011 Murphy, Wenbo Zhu, Werner Baumann, Wilbur Streett, Wilfredo Sanchez 3012 Vega, William A. Rowe Jr., William Chan, Willy Tarreau, Xiaoshu Wang, 3013 Yaron Goland, Yngve Nysaeter Pettersen, Yoav Nir, Yogesh Bang, Yutaka 3014 Oiwa, Yves Lafon (long-time member of the editor team), Zed A. Shaw, 3015 and Zhong Yu. 3017 10. References 3019 10.1. Normative References 3021 [Part2] Fielding, R., Ed. and J. Reschke, Ed., "Hypertext 3022 Transfer Protocol (HTTP/1.1): Semantics and Content", 3023 draft-ietf-httpbis-p2-semantics-21 (work in progress), 3024 October 2012. 3026 [Part4] Fielding, R., Ed. and J. Reschke, Ed., "Hypertext 3027 Transfer Protocol (HTTP/1.1): Conditional Requests", 3028 draft-ietf-httpbis-p4-conditional-21 (work in 3029 progress), October 2012. 3031 [Part5] Fielding, R., Ed., Lafon, Y., Ed., and J. Reschke, Ed., 3032 "Hypertext Transfer Protocol (HTTP/1.1): Range 3033 Requests", draft-ietf-httpbis-p5-range-21 (work in 3034 progress), October 2012. 3036 [Part6] Fielding, R., Ed., Nottingham, M., Ed., and J. Reschke, 3037 Ed., "Hypertext Transfer Protocol (HTTP/1.1): Caching", 3038 draft-ietf-httpbis-p6-cache-21 (work in progress), 3039 October 2012. 3041 [Part7] Fielding, R., Ed. and J. Reschke, Ed., "Hypertext 3042 Transfer Protocol (HTTP/1.1): Authentication", 3043 draft-ietf-httpbis-p7-auth-21 (work in progress), 3044 October 2012. 3046 [RFC1950] Deutsch, L. and J-L. Gailly, "ZLIB Compressed Data 3047 Format Specification version 3.3", RFC 1950, May 1996. 3049 [RFC1951] Deutsch, P., "DEFLATE Compressed Data Format 3050 Specification version 1.3", RFC 1951, May 1996. 
3052 [RFC1952] Deutsch, P., Gailly, J-L., Adler, M., Deutsch, L., and 3053 G. Randers-Pehrson, "GZIP file format specification 3054 version 4.3", RFC 1952, May 1996. 3056 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 3057 Requirement Levels", BCP 14, RFC 2119, March 1997. 3059 [RFC3986] Berners-Lee, T., Fielding, R., and L. Masinter, 3060 "Uniform Resource Identifier (URI): Generic Syntax", 3061 STD 66, RFC 3986, January 2005. 3063 [RFC5234] Crocker, D., Ed. and P. Overell, "Augmented BNF for 3064 Syntax Specifications: ABNF", STD 68, RFC 5234, 3065 January 2008. 3067 [USASCII] American National Standards Institute, "Coded Character 3068 Set -- 7-bit American Standard Code for Information 3069 Interchange", ANSI X3.4, 1986. 3071 10.2. Informative References 3073 [ISO-8859-1] International Organization for Standardization, 3074 "Information technology -- 8-bit single-byte coded 3075 graphic character sets -- Part 1: Latin alphabet No. 3076 1", ISO/IEC 8859-1:1998, 1998. 3078 [Kri2001] Kristol, D., "HTTP Cookies: Standards, Privacy, and 3079 Politics", ACM Transactions on Internet Technology Vol. 3080 1, #2, November 2001, 3081 . 3083 [RFC1919] Chatel, M., "Classical versus Transparent IP Proxies", 3084 RFC 1919, March 1996. 3086 [RFC1945] Berners-Lee, T., Fielding, R., and H. Nielsen, 3087 "Hypertext Transfer Protocol -- HTTP/1.0", RFC 1945, 3088 May 1996. 3090 [RFC2045] Freed, N. and N. Borenstein, "Multipurpose Internet 3091 Mail Extensions (MIME) Part One: Format of Internet 3092 Message Bodies", RFC 2045, November 1996. 3094 [RFC2047] Moore, K., "MIME (Multipurpose Internet Mail 3095 Extensions) Part Three: Message Header Extensions for 3096 Non-ASCII Text", RFC 2047, November 1996. 3098 [RFC2068] Fielding, R., Gettys, J., Mogul, J., Nielsen, H., and 3099 T. Berners-Lee, "Hypertext Transfer Protocol -- 3100 HTTP/1.1", RFC 2068, January 1997. 3102 [RFC2145] Mogul, J., Fielding, R., Gettys, J., and H. 
Nielsen, 3103 "Use and Interpretation of HTTP Version Numbers", 3104 RFC 2145, May 1997. 3106 [RFC2616] Fielding, R., Gettys, J., Mogul, J., Frystyk, H., 3107 Masinter, L., Leach, P., and T. Berners-Lee, "Hypertext 3108 Transfer Protocol -- HTTP/1.1", RFC 2616, June 1999. 3110 [RFC2817] Khare, R. and S. Lawrence, "Upgrading to TLS Within 3111 HTTP/1.1", RFC 2817, May 2000. 3113 [RFC2818] Rescorla, E., "HTTP Over TLS", RFC 2818, May 2000. 3115 [RFC2965] Kristol, D. and L. Montulli, "HTTP State Management 3116 Mechanism", RFC 2965, October 2000. 3118 [RFC3040] Cooper, I., Melve, I., and G. Tomlinson, "Internet Web 3119 Replication and Caching Taxonomy", RFC 3040, 3120 January 2001. 3122 [RFC3864] Klyne, G., Nottingham, M., and J. Mogul, "Registration 3123 Procedures for Message Header Fields", BCP 90, 3124 RFC 3864, September 2004. 3126 [RFC4033] Arends, R., Austein, R., Larson, M., Massey, D., and S. 3127 Rose, "DNS Security Introduction and Requirements", 3128 RFC 4033, March 2005. 3130 [RFC4288] Freed, N. and J. Klensin, "Media Type Specifications 3131 and Registration Procedures", BCP 13, RFC 4288, 3132 December 2005. 3134 [RFC4395] Hansen, T., Hardie, T., and L. Masinter, "Guidelines 3135 and Registration Procedures for New URI Schemes", 3136 BCP 115, RFC 4395, February 2006. 3138 [RFC4559] Jaganathan, K., Zhu, L., and J. Brezak, "SPNEGO-based 3139 Kerberos and NTLM HTTP Authentication in Microsoft 3140 Windows", RFC 4559, June 2006. 3142 [RFC5226] Narten, T. and H. Alvestrand, "Guidelines for Writing 3143 an IANA Considerations Section in RFCs", BCP 26, 3144 RFC 5226, May 2008. 3146 [RFC5246] Dierks, T. and E. Rescorla, "The Transport Layer 3147 Security (TLS) Protocol Version 1.2", RFC 5246, 3148 August 2008. 3150 [RFC5322] Resnick, P., "Internet Message Format", RFC 5322, 3151 October 2008. 3153 [RFC6265] Barth, A., "HTTP State Management Mechanism", RFC 6265, 3154 April 2011. 3156 Appendix A. 
HTTP Version History 3158 HTTP has been in use by the World-Wide Web global information 3159 initiative since 1990. The first version of HTTP, later referred to 3160 as HTTP/0.9, was a simple protocol for hypertext data transfer across 3161 the Internet with only a single request method (GET) and no metadata. 3162 HTTP/1.0, as defined by [RFC1945], added a range of request methods 3163 and MIME-like messaging that could include metadata about the data 3164 transferred and modifiers on the request/response semantics. 3165 However, HTTP/1.0 did not sufficiently take into consideration the 3166 effects of hierarchical proxies, caching, the need for persistent 3167 connections, or name-based virtual hosts. The proliferation of 3168 incompletely-implemented applications calling themselves "HTTP/1.0" 3169 further necessitated a protocol version change in order for two 3170 communicating applications to determine each other's true 3171 capabilities. 3173 HTTP/1.1 remains compatible with HTTP/1.0 by including more stringent 3174 requirements that enable reliable implementations, adding only those 3175 new features that will either be safely ignored by an HTTP/1.0 3176 recipient or only sent when communicating with a party advertising 3177 conformance with HTTP/1.1. 3179 It is beyond the scope of a protocol specification to mandate 3180 conformance with previous versions. HTTP/1.1 was deliberately 3181 designed, however, to make supporting previous versions easy. We 3182 would expect a general-purpose HTTP/1.1 server to understand any 3183 valid request in the format of HTTP/1.0 and respond appropriately 3184 with an HTTP/1.1 message that only uses features understood (or 3185 safely ignored) by HTTP/1.0 clients. Likewise, we would expect an 3186 HTTP/1.1 client to understand any valid HTTP/1.0 response. 
3188 Since HTTP/0.9 did not support header fields in a request, there is 3189 no mechanism for it to support name-based virtual hosts (selection of 3190 resource by inspection of the Host header field). Any server that 3191 implements name-based virtual hosts ought to disable support for 3192 HTTP/0.9. Most requests that appear to be HTTP/0.9 are, in fact, 3193 badly constructed HTTP/1.x requests wherein a buggy client failed to 3194 properly encode linear whitespace found in a URI reference and placed 3195 in the request-target. 3197 A.1. Changes from HTTP/1.0 3199 This section summarizes major differences between versions HTTP/1.0 3200 and HTTP/1.1. 3202 A.1.1. Multi-homed Web Servers 3204 The requirements that clients and servers support the Host header 3205 field (Section 5.4), report an error if it is missing from an 3206 HTTP/1.1 request, and accept absolute URIs (Section 5.3) are among 3207 the most important changes defined by HTTP/1.1. 3209 Older HTTP/1.0 clients assumed a one-to-one relationship of IP 3210 addresses and servers; there was no other established mechanism for 3211 distinguishing the intended server of a request than the IP address 3212 to which that request was directed. The Host header field was 3213 introduced during the development of HTTP/1.1 and, though it was 3214 quickly implemented by most HTTP/1.0 browsers, additional 3215 requirements were placed on all HTTP/1.1 requests in order to ensure 3216 complete adoption. At the time of this writing, most HTTP-based 3217 services are dependent upon the Host header field for targeting 3218 requests. 3220 A.1.2. Keep-Alive Connections 3222 In HTTP/1.0, each connection is established by the client prior to 3223 the request and closed by the server after sending the response. 3224 However, some implementations implement the explicitly negotiated 3225 ("Keep-Alive") version of persistent connections described in Section 3226 19.7.1 of [RFC2068]. 
3228     Some clients and servers might wish to be compatible with these
3229     previous approaches to persistent connections, by explicitly
3230     negotiating for them with a "Connection: keep-alive" request header
3231     field.  However, some experimental implementations of HTTP/1.0
3232     persistent connections are faulty; for example, if an HTTP/1.0 proxy
3233     server doesn't understand Connection, it will erroneously forward
3234     that header field to the next inbound server, which would result in a
3235     hung connection.

3237     One attempted solution was the introduction of a Proxy-Connection
3238     header field, targeted specifically at proxies.  In practice, this
3239     was also unworkable, because proxies are often deployed in multiple
3240     layers, bringing about the same problem discussed above.

3242     As a result, clients are encouraged not to send the Proxy-Connection
3243     header field in any requests.

3245     Clients are also encouraged to consider the use of Connection:
3246     keep-alive in requests carefully; while they can enable persistent
3247     connections with HTTP/1.0 servers, clients using them will need
3248     to monitor the connection for "hung" requests (which indicate that
3249     the client ought to stop sending the header field), and this mechanism
3250     ought not to be used by clients at all when a proxy is being used.

3252  A.1.3.  Introduction of Transfer-Encoding

3254     HTTP/1.1 introduces the Transfer-Encoding header field
3255     (Section 3.3.1).  Proxies/gateways MUST remove any transfer-coding
3256     prior to forwarding a message via a MIME-compliant protocol.

3258  A.2.  Changes from RFC 2616

3260     Clarify that the string "HTTP" in the HTTP-version ABNF production is
3261     case sensitive.  Restrict the version numbers to be single digits due
3262     to the fact that implementations are known to handle multi-digit
3263     version numbers incorrectly.  (Section 2.6)

3265     Require that invalid whitespace around field-names be rejected.
3266 Change ABNF productions for header fields to only define the field
3267 value. (Section 3.2)

3269 Rules about implicit linear whitespace between certain grammar
3270 productions have been removed; now whitespace is only allowed where
3271 specifically defined in the ABNF. (Section 3.2.1)

3273 The NUL octet is no longer allowed in comment and quoted-string text.
3274 The quoted-pair rule no longer allows escaping control characters
3275 other than HTAB. Non-ASCII content in header fields and reason
3276 phrase has been obsoleted and made opaque (the TEXT rule was
3277 removed). (Section 3.2.4)

3279 Require recipients to handle bogus "Content-Length" header fields as
3280 errors. (Section 3.3)

3282 Remove reference to non-existent identity transfer-coding value
3283 tokens. (Sections 3.3 and 4)

3285 Clarify that the chunk length does not include the count of the
3286 octets in the chunk header and trailer. Furthermore, disallow line
3287 folding in chunk extensions, and deprecate their use. (Section 4.1)

3289 Update use of abs_path production from RFC 1808 to the path-absolute
3290 + query components of RFC 3986. State that the asterisk form is
3291 allowed for the OPTIONS request method only. (Section 5.3)

3293 Clarify exactly when "close" connection options have to be sent; drop
3294 notion of header fields being "hop-by-hop" without being listed in
3295 the Connection header field. (Section 6.1)

3297 Remove hard limit of two connections per server. Remove requirement
3298 to retry a sequence of requests as long as it was idempotent. Remove
3299 requirements about when servers are allowed to close connections
3300 prematurely. (Section 6.2)

3302 Remove requirement to retry requests under certain circumstances when
3303 the server prematurely closes the connection. (Section 6.2.2)

3305 Define the semantics of the Upgrade header field in responses other
3306 than 101 (this was incorporated from [RFC2817]). (Section 6.3)

3308 Registration of Transfer Codings now requires IETF Review.
3309 (Section 7.4)

3311 Take over the Upgrade Token Registry, previously defined in Section
3312 7.2 of [RFC2817]. (Section 7.6)

3314 Empty list elements in list productions have been deprecated.
3315 (Appendix B)

3317 Appendix B. ABNF list extension: #rule

3319 A #rule extension to the ABNF rules of [RFC5234] is used to improve
3320 readability in the definitions of some header field values.

3322 A construct "#" is defined, similar to "*", for defining
3323 comma-delimited lists of elements. The full form is "<n>#<m>element"
3324 indicating at least <n> and at most <m> elements, each separated by a
3325 single comma (",") and optional whitespace (OWS).

3327 Thus,

3329    1#element => element *( OWS "," OWS element )

3331 and:

3333    #element => [ 1#element ]

3335 and for n >= 1 and m > 1:

3337    <n>#<m>element => element <n-1>*<m-1>( OWS "," OWS element )

3339 For compatibility with legacy list rules, recipients SHOULD accept
3340 empty list elements. In other words, consumers would follow the list
3341 productions:

3343    #element => [ ( "," / element ) *( OWS "," [ OWS element ] ) ]

3345    1#element => *( "," OWS ) element *( OWS "," [ OWS element ] )

3347 Note that empty elements do not contribute to the count of elements
3348 present, though.

3350 For example, given these ABNF productions:

3352    example-list = 1#example-list-elmt
3353    example-list-elmt = token ; see Section 3.2.4

3355 Then these are valid values for example-list (not including the
3356 double quotes, which are present for delimitation only):

3358    "foo,bar"
3359    "foo ,bar,"
3360    "foo , ,bar,charlie "

3362 But these values would be invalid, as at least one non-empty element
3363 is required:

3365    ""
3366    ","
3367    ", ,"

3369 Appendix C shows the collected ABNF, with the list rules expanded as
3370 explained above.

3372 Appendix C.
Collected ABNF

3374 BWS = OWS

3376 Connection = *( "," OWS ) connection-option *( OWS "," [ OWS
3377  connection-option ] )
3378 Content-Length = 1*DIGIT

3380 HTTP-message = start-line *( header-field CRLF ) CRLF [ message-body
3381  ]
3382 HTTP-name = %x48.54.54.50 ; HTTP
3383 HTTP-version = HTTP-name "/" DIGIT "." DIGIT
3384 Host = uri-host [ ":" port ]

3386 OWS = *( SP / HTAB )

3388 RWS = 1*( SP / HTAB )

3390 TE = [ ( "," / t-codings ) *( OWS "," [ OWS t-codings ] ) ]
3391 Trailer = *( "," OWS ) field-name *( OWS "," [ OWS field-name ] )
3392 Transfer-Encoding = *( "," OWS ) transfer-coding *( OWS "," [ OWS
3393  transfer-coding ] )

3395 URI-reference =
3396 Upgrade = *( "," OWS ) protocol *( OWS "," [ OWS protocol ] )

3398 Via = *( "," OWS ) ( received-protocol RWS received-by [ RWS comment
3399  ] ) *( OWS "," [ OWS ( received-protocol RWS received-by [ RWS
3400  comment ] ) ] )

3402 absolute-URI =
3403 absolute-form = absolute-URI
3404 asterisk-form = "*"
3405 attribute = token
3406 authority =
3407 authority-form = authority

3409 chunk = chunk-size [ chunk-ext ] CRLF chunk-data CRLF
3410 chunk-data = 1*OCTET
3411 chunk-ext = *( ";" chunk-ext-name [ "=" chunk-ext-val ] )
3412 chunk-ext-name = token
3413 chunk-ext-val = token / quoted-str-nf
3414 chunk-size = 1*HEXDIG
3415 chunked-body = *chunk last-chunk trailer-part CRLF
3416 comment = "(" *( ctext / quoted-cpair / comment ) ")"
3417 connection-option = token
3418 ctext = OWS / %x21-27 ; '!'-'''
3419  / %x2A-5B ; '*'-'['
3420  / %x5D-7E ; ']'-'~'
3421  / obs-text

3423 field-content = *( HTAB / SP / VCHAR / obs-text )
3424 field-name = token
3425 field-value = *( field-content / obs-fold )

3427 header-field = field-name ":" OWS field-value BWS
3428 http-URI = "http://" authority path-abempty [ "?" query ]
3429 https-URI = "https://" authority path-abempty [ "?" query ]

3431 last-chunk = 1*"0" [ chunk-ext ] CRLF

3433 message-body = *OCTET
3434 method = token

3436 obs-fold = CRLF ( SP / HTAB )
3437 obs-text = %x80-FF
3438 origin-form = path-absolute [ "?" query ]

3440 partial-URI = relative-part [ "?" query ]
3441 path-abempty =
3442 path-absolute =
3443 port =
3444 protocol = protocol-name [ "/" protocol-version ]
3445 protocol-name = token
3446 protocol-version = token
3447 pseudonym = token

3449 qdtext = OWS / "!" / %x23-5B ; '#'-'['
3450  / %x5D-7E ; ']'-'~'
3451  / obs-text
3452 qdtext-nf = HTAB / SP / "!" / %x23-5B ; '#'-'['
3453  / %x5D-7E ; ']'-'~'
3454  / obs-text
3455 query =
3456 quoted-cpair = "\" ( HTAB / SP / VCHAR / obs-text )
3457 quoted-pair = "\" ( HTAB / SP / VCHAR / obs-text )
3458 quoted-str-nf = DQUOTE *( qdtext-nf / quoted-pair ) DQUOTE
3459 quoted-string = DQUOTE *( qdtext / quoted-pair ) DQUOTE

3461 rank = ( "0" [ "." *3DIGIT ] ) / ( "1" [ "." *3"0" ] )
3462 reason-phrase = *( HTAB / SP / VCHAR / obs-text )
3463 received-by = ( uri-host [ ":" port ] ) / pseudonym
3464 received-protocol = [ protocol-name "/" ] protocol-version
3465 relative-part =
3466 request-line = method SP request-target SP HTTP-version CRLF
3467 request-target = origin-form / absolute-form / authority-form /
3468  asterisk-form

3470 special = "(" / ")" / "<" / ">" / "@" / "," / ";" / ":" / "\" /
3471  DQUOTE / "/" / "[" / "]" / "?" / "=" / "{" / "}"
3472 start-line = request-line / status-line
3473 status-code = 3DIGIT
3474 status-line = HTTP-version SP status-code SP reason-phrase CRLF

3476 t-codings = "trailers" / ( transfer-coding [ t-ranking ] )
3477 t-ranking = OWS ";" OWS "q=" rank
3478 tchar = "!" / "#" / "$" / "%" / "&" / "'" / "*" / "+" / "-" / "."
/ 3479 "^" / "_" / "`" / "|" / "~" / DIGIT / ALPHA 3480 token = 1*tchar 3481 trailer-part = *( header-field CRLF ) 3482 transfer-coding = "chunked" / "compress" / "deflate" / "gzip" / 3483 transfer-extension 3484 transfer-extension = token *( OWS ";" OWS transfer-parameter ) 3485 transfer-parameter = attribute BWS "=" BWS value 3487 uri-host = 3489 value = word 3490 word = token / quoted-string 3492 Appendix D. Change Log (to be removed by RFC Editor before publication) 3494 D.1. Since RFC 2616 3496 Extracted relevant partitions from [RFC2616]. 3498 D.2. Since draft-ietf-httpbis-p1-messaging-00 3500 Closed issues: 3502 o : "HTTP Version 3503 should be case sensitive" 3504 () 3506 o : "'unsafe' 3507 characters" () 3509 o : "Chunk Size 3510 Definition" () 3512 o : "Message Length" 3513 () 3515 o : "Media Type 3516 Registrations" () 3518 o : "URI includes 3519 query" () 3521 o : "No close on 3522 1xx responses" () 3524 o : "Remove 3525 'identity' token references" 3526 () 3528 o : "Import query 3529 BNF" 3531 o : "qdtext BNF" 3533 o : "Normative and 3534 Informative references" 3536 o : "RFC2606 3537 Compliance" 3539 o : "RFC977 3540 reference" 3542 o : "RFC1700 3543 references" 3545 o : "inconsistency 3546 in date format explanation" 3548 o : "Date reference 3549 typo" 3551 o : "Informative 3552 references" 3554 o : "ISO-8859-1 3555 Reference" 3557 o : "Normative up- 3558 to-date references" 3560 Other changes: 3562 o Update media type registrations to use RFC4288 template. 3564 o Use names of RFC4234 core rules DQUOTE and HTAB, fix broken ABNF 3565 for chunk-data (work in progress on 3566 ) 3568 D.3. 
Since draft-ietf-httpbis-p1-messaging-01

3570 Closed issues:

3572 o : "Bodies on GET
3573 (and other) requests"

3575 o : "Updating to
3576 RFC4288"

3578 o : "Status Code
3579 and Reason Phrase"

3581 o : "rel_path not
3582 used"

3584 Ongoing work on ABNF conversion
3585 ():

3587 o Get rid of duplicate BNF rule names ("host" -> "uri-host",
3588 "trailer" -> "trailer-part").

3590 o Avoid underscore character in rule names ("http_URL" ->
3591 "http-URL", "abs_path" -> "path-absolute").

3593 o Add rules for terms imported from URI spec ("absoluteURI",
3594 "authority", "path-absolute", "port", "query", "relativeURI",
3595 "host") -- these will have to be updated when switching over to
3596 RFC3986.

3598 o Synchronize core rules with RFC5234.

3600 o Get rid of prose rules that span multiple lines.

3602 o Get rid of unused rules LOALPHA and UPALPHA.

3604 o Move "Product Tokens" section (back) into Part 1, as "token" is
3605 used in the definition of the Upgrade header field.

3607 o Add explicit references to BNF syntax and rules imported from
3608 other parts of the specification.

3610 o Rewrite prose rule "token" in terms of "tchar", rewrite prose rule
3611 "TEXT".

3613 D.4. Since draft-ietf-httpbis-p1-messaging-02

3615 Closed issues:

3617 o : "HTTP-date vs.
3618 rfc1123-date"

3620 o : "WS in
3621 quoted-pair"

3623 Ongoing work on IANA Message Header Field Registration
3624 ():

3626 o Reference RFC 3864, and update header field registrations for
3627 header fields defined in this document.

3629 Ongoing work on ABNF conversion
3630 ():

3632 o Replace string literals when the string really is case-sensitive
3633 (HTTP-version).

3635 D.5.
Since draft-ietf-httpbis-p1-messaging-03 3637 Closed issues: 3639 o : "Connection 3640 closing" 3642 o : "Move 3643 registrations and registry information to IANA Considerations" 3645 o : "need new URL 3646 for PAD1995 reference" 3648 o : "IANA 3649 Considerations: update HTTP URI scheme registration" 3651 o : "Cite HTTPS 3652 URI scheme definition" 3654 o : "List-type 3655 header fields vs Set-Cookie" 3657 Ongoing work on ABNF conversion 3658 (): 3660 o Replace string literals when the string really is case-sensitive 3661 (HTTP-Date). 3663 o Replace HEX by HEXDIG for future consistence with RFC 5234's core 3664 rules. 3666 D.6. Since draft-ietf-httpbis-p1-messaging-04 3668 Closed issues: 3670 o : "Out-of-date 3671 reference for URIs" 3673 o : "RFC 2822 is 3674 updated by RFC 5322" 3676 Ongoing work on ABNF conversion 3677 (): 3679 o Use "/" instead of "|" for alternatives. 3681 o Get rid of RFC822 dependency; use RFC5234 plus extensions instead. 3683 o Only reference RFC 5234's core rules. 3685 o Introduce new ABNF rules for "bad" whitespace ("BWS"), optional 3686 whitespace ("OWS") and required whitespace ("RWS"). 3688 o Rewrite ABNFs to spell out whitespace rules, factor out header 3689 field value format definitions. 3691 D.7. Since draft-ietf-httpbis-p1-messaging-05 3693 Closed issues: 3695 o : "Header LWS" 3697 o : "Sort 1.3 3698 Terminology" 3700 o : "RFC2047 3701 encoded words" 3703 o : "Character 3704 Encodings in TEXT" 3706 o : "Line Folding" 3708 o : "OPTIONS * and 3709 proxies" 3711 o : "reason-phrase 3712 BNF" 3714 o : "Use of TEXT" 3716 o : "Join 3717 "Differences Between HTTP Entities and RFC 2045 Entities"?" 3719 o : "RFC822 3720 reference left in discussion of date formats" 3722 Final work on ABNF conversion 3723 (): 3725 o Rewrite definition of list rules, deprecate empty list elements. 3727 o Add appendix containing collected and expanded ABNF. 3729 Other changes: 3731 o Rewrite introduction; add mostly new Architecture Section. 
3733 o Move definition of quality values from Part 3 into Part 1; make TE 3734 request header field grammar independent of accept-params (defined 3735 in Part 3). 3737 D.8. Since draft-ietf-httpbis-p1-messaging-06 3739 Closed issues: 3741 o : "base for 3742 numeric protocol elements" 3744 o : "comment ABNF" 3746 Partly resolved issues: 3748 o : "205 Bodies" 3749 (took out language that implied that there might be methods for 3750 which a payload body MUST NOT be included) 3752 o : "editorial 3753 improvements around HTTP-date" 3755 D.9. Since draft-ietf-httpbis-p1-messaging-07 3757 Closed issues: 3759 o : "Repeating 3760 single-value header fields" 3762 o : "increase 3763 connection limit" 3765 o : "IP addresses 3766 in URLs" 3768 o : "take over 3769 HTTP Upgrade Token Registry" 3771 o : "CR and LF in 3772 chunk extension values" 3774 o : "HTTP/0.9 3775 support" 3777 o : "pick IANA 3778 policy (RFC5226) for Transfer Coding / Content Coding" 3780 o : "move 3781 definitions of gzip/deflate/compress to part 1" 3783 o : "disallow 3784 control characters in quoted-pair" 3786 Partly resolved issues: 3788 o : "update IANA 3789 requirements wrt Transfer-Coding values" (add the IANA 3790 Considerations subsection) 3792 D.10. Since draft-ietf-httpbis-p1-messaging-08 3794 Closed issues: 3796 o : "header 3797 parsing, treatment of leading and trailing OWS" 3799 Partly resolved issues: 3801 o : "Placement of 3802 13.5.1 and 13.5.2" 3804 o : "use of term 3805 "word" when talking about header field structure" 3807 D.11. 
Since draft-ietf-httpbis-p1-messaging-09 3809 Closed issues: 3811 o : "Clarification 3812 of the term 'deflate'" 3814 o : "OPTIONS * and 3815 proxies" 3817 o : "MIME-Version 3818 not listed in P1, general header fields" 3820 o : "IANA registry 3821 for content/transfer encodings" 3823 o : "Case- 3824 sensitivity of HTTP-date" 3826 o : "use of term 3827 "word" when talking about header field structure" 3829 Partly resolved issues: 3831 o : "Term for the 3832 requested resource's URI" 3834 D.12. Since draft-ietf-httpbis-p1-messaging-10 3836 Closed issues: 3838 o : "Connection 3839 Closing" 3841 o : "Delimiting 3842 messages with multipart/byteranges" 3844 o : "Handling 3845 multiple Content-Length header fields" 3847 o : "Clarify 3848 entity / representation / variant terminology" 3850 o : "consider 3851 removing the 'changes from 2068' sections" 3853 Partly resolved issues: 3855 o : "HTTP(s) URI 3856 scheme definitions" 3858 D.13. Since draft-ietf-httpbis-p1-messaging-11 3860 Closed issues: 3862 o : "Trailer 3863 requirements" 3865 o : "Text about 3866 clock requirement for caches belongs in p6" 3868 o : "effective 3869 request URI: handling of missing host in HTTP/1.0" 3871 o : "confusing 3872 Date requirements for clients" 3874 Partly resolved issues: 3876 o : "Handling 3877 multiple Content-Length header fields" 3879 D.14. Since draft-ietf-httpbis-p1-messaging-12 3881 Closed issues: 3883 o : "RFC2145 3884 Normative" 3886 o : "HTTP(s) URI 3887 scheme definitions" (tune the requirements on userinfo) 3889 o : "define 3890 'transparent' proxy" 3892 o : "Header Field 3893 Classification" 3895 o : "Is * usable 3896 as a request-uri for new methods?" 3898 o : "Migrate 3899 Upgrade details from RFC2817" 3901 o : "untangle 3902 ABNFs for header fields" 3904 o : "update RFC 3905 2109 reference" 3907 D.15. 
Since draft-ietf-httpbis-p1-messaging-13 3909 Closed issues: 3911 o : "Allow is not 3912 in 13.5.2" 3914 o : "Handling 3915 multiple Content-Length header fields" 3917 o : "untangle 3918 ABNFs for header fields" 3920 o : "Content- 3921 Length ABNF broken" 3923 D.16. Since draft-ietf-httpbis-p1-messaging-14 3925 Closed issues: 3927 o : "HTTP-version 3928 should be redefined as fixed length pair of DIGIT . DIGIT" 3930 o : "Recommend 3931 minimum sizes for protocol elements" 3933 o : "Set 3934 expectations around buffering" 3936 o : "Considering 3937 messages in isolation" 3939 D.17. Since draft-ietf-httpbis-p1-messaging-15 3941 Closed issues: 3943 o : "DNS Spoofing 3944 / DNS Binding advice" 3946 o : "move RFCs 3947 2145, 2616, 2817 to Historic status" 3949 o : "\-escaping in 3950 quoted strings" 3952 o : "'Close' 3953 should be reserved in the HTTP header field registry" 3955 D.18. Since draft-ietf-httpbis-p1-messaging-16 3957 Closed issues: 3959 o : "Document 3960 HTTP's error-handling philosophy" 3962 o : "Explain 3963 header field registration" 3965 o : "Revise 3966 Acknowledgements Sections" 3968 o : "Retrying 3969 Requests" 3971 o : "Closing the 3972 connection on server error" 3974 D.19. Since draft-ietf-httpbis-p1-messaging-17 3976 Closed issues: 3978 o : "Proxy- 3979 Connection and Keep-Alive" 3981 o : "Clarify 'User 3982 Agent'" 3984 o : "Define non- 3985 final responses" 3987 o : "intended 3988 maturity level vs normative references" 3990 o : "Intermediary 3991 rewriting of queries" 3993 D.20. Since draft-ietf-httpbis-p1-messaging-18 3995 Closed issues: 3997 o : "message-body 3998 in CONNECT response" 4000 o : "Misplaced 4001 text on connection handling in p2" 4003 o : "wording of 4004 line folding rule" 4006 o : "chunk- 4007 extensions" 4009 o : "make IANA 4010 policy definitions consistent" 4012 D.21. 
Since draft-ietf-httpbis-p1-messaging-19 4014 Closed issues: 4016 o : "make IANA 4017 policy definitions consistent" 4019 o : "clarify 4020 connection header field values are case-insensitive" 4022 o : "ABNF 4023 requirements for recipients" 4025 o : "note 4026 introduction of new IANA registries as normative changes" 4028 o : "Reference to 4029 ISO-8859-1 is informative" 4031 D.22. Since draft-ietf-httpbis-p1-messaging-20 4033 Closed issues: 4035 o : "is 'q=' case- 4036 sensitive?" 4038 o : "Semantics of 4039 HTTPS" 4041 Other changes: 4043 o Drop notion of header fields being "hop-by-hop" without being 4044 listed in the Connection header field. 4046 o Section about connection management rewritten; dropping some 4047 historic information. 4049 o Move description of "100-continue" into Part 2. 4051 o Rewrite the persistent connection and Upgrade requirements to be 4052 actionable by role and consistent with the rest of HTTP. 4054 Index 4056 A 4057 absolute-form (of request-target) 39 4058 accelerator 10 4059 application/http Media Type 57 4060 asterisk-form (of request-target) 39 4061 authority-form (of request-target) 39 4063 B 4064 browser 7 4066 C 4067 cache 11 4068 cacheable 12 4069 captive portal 11 4070 chunked (Coding Format) 33 4071 client 7 4072 close 46, 52 4073 compress (Coding Format) 35 4074 connection 7 4075 Connection header field 46, 52 4076 Content-Length header field 28 4078 D 4079 deflate (Coding Format) 35 4080 downstream 9 4082 E 4083 effective request URI 41 4085 G 4086 gateway 10 4087 Grammar 4088 absolute-form 38 4089 absolute-URI 15 4090 ALPHA 6 4091 asterisk-form 38 4092 attribute 33 4093 authority 15 4094 authority-form 38 4095 BWS 23 4096 chunk 33 4097 chunk-data 33 4098 chunk-ext 33 4099 chunk-ext-name 33 4100 chunk-ext-val 33 4101 chunk-size 33 4102 chunked-body 33 4103 comment 25 4104 Connection 47 4105 connection-option 47 4106 Content-Length 28 4107 CR 6 4108 CRLF 6 4109 ctext 25 4110 CTL 6 4111 date2 33 4112 date3 33 4113 DIGIT 6 
4114 DQUOTE 6 4115 field-content 21 4116 field-name 21 4117 field-value 21 4118 header-field 21 4119 HEXDIG 6 4120 Host 40 4121 HTAB 6 4122 HTTP-message 18 4123 HTTP-name 13 4124 http-URI 16 4125 HTTP-version 13 4126 https-URI 17 4127 last-chunk 33 4128 LF 6 4129 message-body 26 4130 method 20 4131 obs-fold 21 4132 obs-text 25 4133 OCTET 6 4134 origin-form 38 4135 OWS 23 4136 partial-URI 15 4137 path-absolute 15 4138 port 15 4139 protocol-name 43 4140 protocol-version 43 4141 pseudonym 43 4142 qdtext 25 4143 qdtext-nf 33 4144 query 15 4145 quoted-cpair 25 4146 quoted-pair 25 4147 quoted-str-nf 33 4148 quoted-string 25 4149 rank 36 4150 reason-phrase 21 4151 received-by 43 4152 received-protocol 43 4153 request-line 20 4154 request-target 38 4155 RWS 23 4156 SP 6 4157 special 25 4158 start-line 19 4159 status-code 21 4160 status-line 21 4161 t-codings 36 4162 t-ranking 36 4163 tchar 25 4164 TE 36 4165 token 25 4166 Trailer 34 4167 trailer-part 33 4168 transfer-coding 33 4169 Transfer-Encoding 26 4170 transfer-extension 33 4171 transfer-parameter 33 4172 Upgrade 53 4173 uri-host 15 4174 URI-reference 15 4175 value 33 4176 VCHAR 6 4177 Via 43 4178 word 25 4179 gzip (Coding Format) 36 4181 H 4182 header field 18 4183 header section 18 4184 headers 18 4185 Host header field 40 4186 http URI scheme 16 4187 https URI scheme 17 4189 I 4190 inbound 9 4191 interception proxy 11 4192 intermediary 9 4194 M 4195 Media Type 4196 application/http 57 4197 message/http 56 4198 message 7 4199 message/http Media Type 56 4200 method 20 4202 N 4203 non-transforming proxy 10 4205 O 4206 origin server 7 4207 origin-form (of request-target) 38 4208 outbound 9 4210 P 4211 proxy 10 4213 R 4214 recipient 7 4215 request 7 4216 request-target 20 4217 resource 15 4218 response 7 4219 reverse proxy 10 4221 S 4222 sender 7 4223 server 7 4224 spider 7 4226 T 4227 target resource 37 4228 target URI 37 4229 TE header field 36 4230 Trailer header field 34 4231 Transfer-Encoding header field 26 4232 
transforming proxy 10 4233 transparent proxy 11 4234 tunnel 11 4236 U 4237 Upgrade header field 53 4238 upstream 9 4239 URI scheme 4240 http 16 4241 https 17 4242 user agent 7 4244 V 4245 Via header field 43 4247 Authors' Addresses 4249 Roy T. Fielding (editor) 4250 Adobe Systems Incorporated 4251 345 Park Ave 4252 San Jose, CA 95110 4253 USA 4255 EMail: fielding@gbiv.com 4256 URI: http://roy.gbiv.com/ 4257 Julian F. Reschke (editor) 4258 greenbytes GmbH 4259 Hafenweg 16 4260 Muenster, NW 48155 4261 Germany 4263 EMail: julian.reschke@greenbytes.de 4264 URI: http://greenbytes.de/tech/webdav/