idnits 2.17.1
draft-reschke-rfc2231-in-http-02.txt:
Checking boilerplate required by RFC 5378 and the IETF Trust (see
https://trustee.ietf.org/license-info):
----------------------------------------------------------------------------
** The document seems to lack a License Notice according IETF Trust
Provisions of 28 Dec 2009, Section 6.b.ii or Provisions of 12 Sep 2009
Section 6.b -- however, there's a paragraph with a matching beginning.
Boilerplate error?
(You're using the IETF Trust Provisions' Section 6.b License Notice from
12 Feb 2009 rather than one of the newer Notices. See
https://trustee.ietf.org/license-info/.)
Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
----------------------------------------------------------------------------
No issues found here.
Checking nits according to https://www.ietf.org/id-info/checklist :
----------------------------------------------------------------------------
** The abstract seems to contain references ([2], [1]), which it shouldn't.
Please replace those with straight textual mentions of the documents in
question.
Miscellaneous warnings:
----------------------------------------------------------------------------
== The copyright year in the IETF Trust and authors Copyright Line does not
match the current year
-- The document seems to lack a disclaimer for pre-RFC5378 work, but may
have content which was first submitted before 10 November 2008. If you
have contacted all the original authors and they are all willing to grant
the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
this comment. If not, you may need to add the pre-RFC5378 disclaimer.
(See the Legal Provisions document at
https://trustee.ietf.org/license-info for more information.)
-- The document date (May 19, 2009) is 5427 days in the past. Is this
intentional?
Checking references for intended status: Proposed Standard
----------------------------------------------------------------------------
(See RFCs 3967 and 4897 for information about using normative references
to lower-maturity documents in RFCs)
== Unused Reference: 'RFC4646' is defined on line 358, but no explicit
reference was found in the text
-- Possible downref: Non-RFC (?) normative reference: ref. 'ISO-8859-1'
** Obsolete normative reference: RFC 2616 (Obsoleted by RFC 7230, RFC 7231,
RFC 7232, RFC 7233, RFC 7234, RFC 7235)
** Obsolete normative reference: RFC 4646 (Obsoleted by RFC 5646)
Summary: 4 errors (**), 0 flaws (~~), 2 warnings (==), 3 comments (--).
Run idnits with the --verbose option for more detailed information about
the items above.
--------------------------------------------------------------------------------
2 Network Working Group J. Reschke
3 Internet-Draft greenbytes
4 Intended status: Standards Track May 19, 2009
5 Expires: November 20, 2009
7 Application of RFC 2231 Encoding to
8 Hypertext Transfer Protocol (HTTP) Headers
9 draft-reschke-rfc2231-in-http-02
11 Status of this Memo
13 This Internet-Draft is submitted to IETF in full conformance with the
14 provisions of BCP 78 and BCP 79.
16 Internet-Drafts are working documents of the Internet Engineering
17 Task Force (IETF), its areas, and its working groups. Note that
18 other groups may also distribute working documents as Internet-
19 Drafts.
21 Internet-Drafts are draft documents valid for a maximum of six months
22 and may be updated, replaced, or obsoleted by other documents at any
23 time. It is inappropriate to use Internet-Drafts as reference
24 material or to cite them other than as "work in progress."
26 The list of current Internet-Drafts can be accessed at
27 http://www.ietf.org/ietf/1id-abstracts.txt.
29 The list of Internet-Draft Shadow Directories can be accessed at
30 http://www.ietf.org/shadow.html.
32 This Internet-Draft will expire on November 20, 2009.
34 Copyright Notice
36 Copyright (c) 2009 IETF Trust and the persons identified as the
37 document authors. All rights reserved.
39 This document is subject to BCP 78 and the IETF Trust's Legal
40 Provisions Relating to IETF Documents in effect on the date of
41 publication of this document (http://trustee.ietf.org/license-info).
42 Please review these documents carefully, as they describe your rights
43 and restrictions with respect to this document.
45 Abstract
47 By default, message header parameters in Hypertext Transfer Protocol
48 (HTTP) messages cannot carry characters outside the ISO-8859-1
49 character set. RFC 2231 defines an escaping mechanism for use in
50 Multipurpose Internet Mail Extensions (MIME) headers. This document
51 specifies a profile of that encoding suitable for use in HTTP.
53 Editorial Note (To be removed by RFC Editor before publication)
55 There are multiple HTTP headers that already use RFC 2231 encoding in
56 practice (Content-Disposition) or might use it in the future (Link).
57 The purpose of this document is to provide a single place where the
58 generic aspects of RFC 2231 encoding in HTTP headers are defined.
60 Distribution of this document is unlimited. Although this is not a
61 work item of the HTTPbis Working Group, comments should be sent to
62 the Hypertext Transfer Protocol (HTTP) mailing list at
63 ietf-http-wg@w3.org [1], which may be joined by sending a message
64 with subject "subscribe" to ietf-http-wg-request@w3.org [2].
66 Discussions of the HTTPbis Working Group are archived at
67 .
69 XML versions, latest edits and the issues list for this document are
70 available from
71 . A
72 collection of test cases is available at
73 .
75 Table of Contents
77 1. Introduction
78 2. Notational Conventions
79 3. A Profile of RFC 2231 for Use in HTTP
80 3.1. Parameter Continuations
81 3.2. Parameter Value Character Set and Language Information
82 3.2.1. Examples
83 3.3. Language specification in Encoded Words
84 4. Guidelines for Usage in HTTP Header Definitions
85 4.1. When to Use the Extension
86 4.2. Error Handling
87 5. Security Considerations
88 6. IANA Considerations
89 7. Acknowledgements
90 8. References
91 8.1. Normative References
92 8.2. Informative References
93 Appendix A. Change Log (to be removed by RFC Editor before
94 publication)
95 A.1. Since draft-reschke-rfc2231-in-http-00
96 A.2. Since draft-reschke-rfc2231-in-http-01
97 Appendix B. Open issues (to be removed by RFC Editor prior to
98 publication)
99 B.1. edit
100 Author's Address
102 1. Introduction
104 By default, message header parameters in HTTP ([RFC2616]) messages
105 cannot carry characters outside the ISO-8859-1 character set
106 ([ISO-8859-1]). RFC 2231 ([RFC2231]) defines an escaping mechanism
107 for use in MIME headers. This document specifies a profile of that
108 encoding for use in HTTP.
110 2. Notational Conventions
112 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
113 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
114 document are to be interpreted as described in [RFC2119].
116 This specification uses the ABNF (Augmented Backus-Naur Form)
117 notation defined in [RFC5234]. The following core rules are included
118 by reference, as defined in [RFC5234], Appendix B.1: ALPHA (letters),
119 DIGIT (decimal 0-9), HEXDIG (hexadecimal 0-9/A-F/a-f) and LWSP
120 (linear white space).
122 Note that this specification uses the term "character set" for
123 consistency with other IETF specifications such as RFC 2277 (see
124 [RFC2277], Section 3). A more accurate term would be "character
125 encoding" (a mapping of code points to octet sequences).
127 3. A Profile of RFC 2231 for Use in HTTP
129 RFC 2231 defines several extensions to MIME. The sections below
130 discuss if and how they apply to HTTP.
132 In short:
134 o Parameter Continuations aren't needed (Section 3.1),
136 o Character Set and Language Information are useful, therefore a
137 simple subset is specified (Section 3.2), and
139 o Language Specifications in Encoded Words aren't needed
140 (Section 3.3).
142 3.1. Parameter Continuations
144 Section 3 of [RFC2231] defines a mechanism that deals with the length
145 limitations that apply to MIME headers. These limitations do not
146 apply to HTTP ([RFC2616], Section 19.4.7).
148 Thus in HTTP, senders MUST NOT use parameter continuations, and
149 therefore recipients do not need to support them.
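As an illustration only, a recipient or validator could flag parameter names that use the continuation syntax of RFC 2231, Section 3 (names such as "title*0" or "title*1*"). The helper name is_continuation is hypothetical and not defined by this specification, and the pattern is deliberately looser than the RFC 2231 grammar (it accepts leading zeros in the section number).

```python
import re

# RFC 2231 continuation names look like "name*0", "name*1", or "name*0*".
# Hypothetical helper for illustration; looser than the RFC 2231 ABNF.
_CONTINUATION = re.compile(r"[^*]+\*[0-9]+\*?")


def is_continuation(attribute: str) -> bool:
    """Return True if the parameter name uses continuation syntax."""
    return _CONTINUATION.fullmatch(attribute) is not None
```

A plain extended parameter name such as "title*" is not a continuation and is not matched by this check.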
151 3.2. Parameter Value Character Set and Language Information
153 Section 4 of [RFC2231] specifies how to embed language information
154 into parameter values, and also how to encode non-ASCII characters,
155 dealing with restrictions both in MIME and HTTP header parameters.
157 However, RFC 2231 does not specify a mandatory-to-implement character
158 encoding, making it hard for senders to decide which character set to
159 use. Thus, recipients implementing this specification MUST support
160 the character sets "ISO-8859-1" [ISO-8859-1] and "UTF-8" [RFC3629].
162 Furthermore, RFC 2231 allows leaving out the character encoding
163 information. The profile defined by this specification does not
164 allow that.
166 The syntax for parameters is defined in Section 3.6 of [RFC2616]
167 (with RFC 2616 implied LWS translated to RFC 5234 LWSP):
169 parameter = attribute LWSP "=" LWSP value
171 attribute = token
172 value = token / quoted-string
174 quoted-string = <quoted-string, defined in [RFC2616], Section 2.2>
175 token = <token, defined in [RFC2616], Section 2.2>
177 This specification extends the grammar to:
179 parameter = reg-parameter / ext-parameter
181 reg-parameter = attribute LWSP "=" LWSP value
183 ext-parameter = attribute "*" LWSP "=" LWSP ext-value
185 ext-value = charset "'" [ language ] "'" value-chars
186 ; extended-initial-value,
187 ; defined in [RFC2231], Section 7
189 charset = %x55.54.46.2D.38 ; "UTF-8"
190 / %x49.53.4F.2D.38.38.35.39.2D.31 ; "ISO-8859-1"
191 / ext-charset
193 ext-charset = token ; see IANA charset registry
194 ; ()
196 language = <Language-Tag, defined in [RFC4646], Section 2.1>
198 value-chars = *( pct-encoded / attr-char )
200 pct-encoded = "%" HEXDIG HEXDIG
201 ; see [RFC3986], Section 2.1
203 attr-char = ALPHA / DIGIT
204 / "-" / "." / "_" / "~" / ":"
205 / "!" / "$" / "&" / "+"
207 Thus, a parameter is either a regular parameter (reg-parameter), as
208 previously defined in Section 3.6 of [RFC2616], or an extended
209 parameter (ext-parameter).
211 Extended parameters are those where the left hand side of the
212 assignment ends with an asterisk character.
214 The value part of an extended parameter (ext-value) is a token that
215 consists of three parts: the REQUIRED character set name (charset),
216 the OPTIONAL language information (language), and a character
217 sequence representing the actual value (value-chars), separated by
218 single quote characters.
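The three-part structure described above can be sketched as a strict syntax check. This is a minimal sketch, not a normative parser: the function name split_ext_value is hypothetical, the charset and language parts are only checked for the absence of single quotes (not against the full token or language tag grammars), and only value-chars is validated against pct-encoded and attr-char.

```python
import re

# charset ' [language] ' value-chars
# value-chars = *( pct-encoded / attr-char )
_EXT_VALUE = re.compile(
    r"(?P<charset>[^']+)'(?P<language>[^']*)'"
    r"(?P<value_chars>(?:%[0-9A-Fa-f]{2}|[A-Za-z0-9\-._~:!$&+])*)"
)


def split_ext_value(ext_value: str):
    """Split an ext-value into (charset, language or None, value-chars)."""
    m = _EXT_VALUE.fullmatch(ext_value)
    if m is None:
        raise ValueError("not a valid ext-value: %r" % ext_value)
    # The language part is OPTIONAL; an empty string means "not given".
    return m.group("charset"), m.group("language") or None, m.group("value_chars")
```

An ext-value with an empty charset part, or with a character outside attr-char that has not been percent-encoded, is rejected with ValueError.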
220 Inside the value part, characters not contained in attr-char are
221 encoded into an octet sequence using the specified character set.
222 That octet sequence then is percent-encoded as specified in Section
223 2.1 of [RFC3986].
225 Producers MUST NOT use character sets other than "UTF-8" ([RFC3629])
226 or "ISO-8859-1" ([ISO-8859-1]). Extension character sets (ext-charset)
227 are reserved for future use.
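The encoding steps above (encode to octets in the chosen character set, then percent-encode every octet that is not an attr-char) might be sketched as follows. The function name encode_ext_value and its parameters are assumptions made for illustration; the sketch emits upper-case hexadecimal digits, which is one of the two forms allowed by HEXDIG.

```python
# attr-char = ALPHA / DIGIT / "-" / "." / "_" / "~" / ":" / "!" / "$" / "&" / "+"
ATTR_CHARS = frozenset(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-._~:!$&+"
)


def encode_ext_value(text: str, charset: str = "UTF-8", language: str = "") -> str:
    """Build an ext-value; hypothetical helper, not defined by this document."""
    # Producers are limited to the two character sets named above.
    if charset.upper() not in ("UTF-8", "ISO-8859-1"):
        raise ValueError("unsupported character set: %r" % charset)
    parts = []
    for octet in text.encode(charset):
        ch = chr(octet)
        if ch in ATTR_CHARS:
            parts.append(ch)               # attr-char is sent as-is
        else:
            parts.append("%%%02X" % octet)  # everything else is percent-encoded
    return "%s'%s'%s" % (charset, language, "".join(parts))
```

Text that cannot be represented in the chosen character set makes the encode step raise UnicodeEncodeError, which a caller might treat as a signal to fall back to UTF-8.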
229 3.2.1. Examples
231 Non-extended notation, using "token":
233 foo: bar; title=Economy
235 Non-extended notation, using "quoted-string":
237 foo: bar; title="US-$ rates"
239 Extended notation, using the Unicode character U+00A3 (POUND SIGN):
241 foo: bar; title*=iso-8859-1'en'%A3%20rates
243 Note: the Unicode pound sign character U+00A3 was encoded using ISO-
244 8859-1 into the single octet A3, then percent-encoded. Also note
245 that the space character was encoded as %20, as it is not contained
246 in attr-char.
248 Extended notation, using the Unicode characters U+00A3 (POUND SIGN)
249 and U+20AC (EURO SIGN):
251 foo: bar; title*=UTF-8''%c2%a3%20and%20%e2%82%ac%20rates
253 Note: the Unicode pound sign character U+00A3 was encoded using UTF-8
254 into the octet sequence C2 A3, then percent-encoded. Likewise, the
255 Unicode euro sign character U+20AC was encoded into the octet
256 sequence E2 82 AC, then percent-encoded. Also note that HEXDIG
257 allows both lower-case and upper-case characters, so recipients must
258 understand both, and that the language information is optional, while
259 the character set is not.
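The extended-notation examples above can be decoded with a short sketch like the one below. The name decode_ext_value is hypothetical, and the sketch is lenient: urllib.parse.unquote_to_bytes passes malformed percent sequences through unchanged rather than rejecting them, so a strict recipient would validate value-chars first.

```python
from urllib.parse import unquote_to_bytes

# Character sets that recipients of this profile are required to support.
SUPPORTED = ("utf-8", "iso-8859-1")


def decode_ext_value(ext_value: str):
    """Return (text, language or None) for an ext-value (illustrative only)."""
    # Raises ValueError if fewer than two single quotes are present.
    charset, language, value_chars = ext_value.split("'", 2)
    if not charset:
        raise ValueError("the character set is REQUIRED")
    if charset.lower() not in SUPPORTED:
        raise ValueError("unsupported character set: %r" % charset)
    # Percent-decoding handles both lower-case and upper-case HEXDIG.
    octets = unquote_to_bytes(value_chars)
    return octets.decode(charset), language or None
```

Applied to the two extended examples, this yields the pound sign followed by " rates" with language "en", and the pound and euro sign title with no language information.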
261 3.3. Language specification in Encoded Words
263 Section 5 of [RFC2231] extends the encoding defined in [RFC2047] to
264 also support language specification in encoded words. Although the
265 HTTP/1.1 specification does refer to RFC 2047 ([RFC2616], Section
266 2.2), it's not clear to which header field exactly it applies, and
267 whether it is implemented in practice (see
268 for details).
270 Thus, the RFC 2231 profile defined by this specification does not
271 include this feature.
273 4. Guidelines for Usage in HTTP Header Definitions
275 Specifications of HTTP headers that use the extensions defined in
276 Section 3.2 should clearly state that. A simple way to achieve this
277 is to normatively reference this specification, and to include the
278 ext-value production into the ABNF for that header.
280 For instance:
282 foo-header = "foo" LWSP ":" LWSP token ";" LWSP title-param
283 title-param = "title" LWSP "=" LWSP value
284 / "title*" LWSP "=" LWSP ext-value
285 ext-value = <ext-value, defined in [RFCxxxx], Section 3.2>
287 [[rfcno: Note to RFC Editor: in the figure above, please replace
288 "xxxx" by the RFC number assigned to this specification.]]
290 4.1. When to Use the Extension
292 Section 4.2 of [RFC2277] requires that protocol elements containing
293 text can carry language information. Thus, the ext-value production
294 should always be used when the parameter value is of textual nature.
296 Furthermore, the extension should also be used whenever the parameter
297 value needs to carry characters not present in the US-ASCII
298 ([USASCII]) character set (note that it would be unacceptable to
299 define a new parameter that would be restricted to a subset of the
300 Unicode character set).
302 4.2. Error Handling
304 Header specifications that include parameters should also specify
305 whether same-named parameters can occur multiple times. If
306 repetitions are not allowed (and this is believed to be the common
307 case), the specification should state whether the regular or the
308 extended syntax takes precedence. In the latter case, producers
309 could send both formats without breaking recipients that do not
310 understand the extended syntax. [[anchor6: Does not work as expected, see
311 and
312 .]]
314 Example:
316 foo: bar; title="EURO exchange rates";
317 title*=utf-8''%e2%82%ac%20exchange%20rates
319 In this case, the sender provides an ASCII version of the title for
320 legacy recipients, but also includes an internationalized version for
321 recipients understanding this specification -- the latter obviously
322 should prefer the new syntax over the old one.
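A recipient-side sketch of this precedence rule, assuming the header has already been split into a mapping from parameter names to raw values (the mapping and the function name preferred_value are hypothetical): the extended form is preferred, and the regular form is used when the extended one is absent or cannot be decoded.

```python
from urllib.parse import unquote_to_bytes


def preferred_value(params: dict, name: str):
    """Prefer "name*" over "name"; fall back if the extended form is unusable."""
    extended = params.get(name + "*")
    if extended is not None:
        try:
            charset, _language, value_chars = extended.split("'", 2)
            if not charset:
                raise ValueError("the character set is REQUIRED")
            return unquote_to_bytes(value_chars).decode(charset)
        except (ValueError, LookupError):
            # Malformed ext-value, undecodable octets, or unknown charset:
            # ignore the extended parameter and use the regular one.
            pass
    return params.get(name)
```

With the example above, the mapping would hold "title" and "title*" entries, and the function would return the title beginning with the euro sign. Whether falling back on a malformed extended value is appropriate is a design choice left to the header definition.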
324 5. Security Considerations
326 This document does not discuss security issues and is not believed to
327 raise any security issues not already endemic in HTTP.
329 6. IANA Considerations
331 There are no IANA Considerations related to this specification.
333 7. Acknowledgements
335 Thanks to Frank Ellermann for help figuring out ABNF details, and to
336 Roar Lauritzsen for implementer's feedback.
338 8. References
340 8.1. Normative References
342 [ISO-8859-1]
343 International Organization for Standardization,
344 "Information technology -- 8-bit single-byte coded graphic
345 character sets -- Part 1: Latin alphabet No. 1", ISO/
346 IEC 8859-1:1998, 1998.
348 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
349 Requirement Levels", BCP 14, RFC 2119, March 1997.
351 [RFC2616] Fielding, R., Gettys, J., Mogul, J., Frystyk, H.,
352 Masinter, L., Leach, P., and T. Berners-Lee, "Hypertext
353 Transfer Protocol -- HTTP/1.1", RFC 2616, June 1999.
355 [RFC3629] Yergeau, F., "UTF-8, a transformation format of ISO
356 10646", RFC 3629, STD 63, November 2003.
358 [RFC4646] Phillips, A. and M. Davis, "Tags for Identifying
359 Languages", BCP 47, RFC 4646, September 2006.
361 [RFC5234] Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax
362 Specifications: ABNF", STD 68, RFC 5234, January 2008.
364 8.2. Informative References
366 [RFC2047] Moore, K., "MIME (Multipurpose Internet Mail Extensions)
367 Part Three: Message Header Extensions for Non-ASCII Text",
368 RFC 2047, November 1996.
370 [RFC2231] Freed, N. and K. Moore, "MIME Parameter Value and Encoded
371 Word Extensions:
372 Character Sets, Languages, and Continuations", RFC 2231,
373 November 1997.
375 [RFC2277] Alvestrand, H., "IETF Policy on Character Sets and
376 Languages", BCP 18, RFC 2277, January 1998.
378 [RFC3986] Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
379 Resource Identifier (URI): Generic Syntax", RFC 3986,
380 STD 66, January 2005.
382 [USASCII] American National Standards Institute, "Coded Character
383 Set -- 7-bit American Standard Code for Information
384 Interchange", ANSI X3.4, 1986.
386 URIs
388 [1]
390 [2]
392 Appendix A. Change Log (to be removed by RFC Editor before publication)
394 A.1. Since draft-reschke-rfc2231-in-http-00
396 Use RFC5234-style ABNF, closer to the one used in RFC 2231.
398 Make RFC 2231 dependency informative, so this specification can
399 evolve independently.
401 Explain the ABNF in prose.
403 A.2. Since draft-reschke-rfc2231-in-http-01
405 Remove unneeded RFC5137 notation (code point vs character).
407 Appendix B. Open issues (to be removed by RFC Editor prior to
408 publication)
410 B.1. edit
412 Type: edit
414 julian.reschke@greenbytes.de (2009-04-17): Umbrella issue for
415 editorial fixes/enhancements.
417 Author's Address
419 Julian F. Reschke
420 greenbytes GmbH
421 Hafenweg 16
422 Muenster, NW 48155
423 Germany
425 Email: julian.reschke@greenbytes.de
426 URI: http://greenbytes.de/tech/webdav/