idnits 2.17.1
draft-ietf-json-rfc4627bis-06.txt:
Checking boilerplate required by RFC 5378 and the IETF Trust (see
https://trustee.ietf.org/license-info):
----------------------------------------------------------------------------
No issues found here.
Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
----------------------------------------------------------------------------
No issues found here.
Checking nits according to https://www.ietf.org/id-info/checklist :
----------------------------------------------------------------------------
-- The draft header indicates that this document obsoletes RFC4627, but the
abstract doesn't seem to mention this, which it should.
Miscellaneous warnings:
----------------------------------------------------------------------------
== The copyright year in the IETF Trust and authors Copyright Line does not
match the current year
-- The document date (October 11, 2013) is 3821 days in the past. Is this
intentional?
Checking references for intended status: Proposed Standard
----------------------------------------------------------------------------
(See RFCs 3967 and 4897 for information about using normative references
to lower-maturity documents in RFCs)
-- Looks like a reference, but probably isn't: '116' on line 463
-- Looks like a reference, but probably isn't: '943' on line 463
-- Looks like a reference, but probably isn't: '234' on line 463
-- Looks like a reference, but probably isn't: '38793' on line 463
-- Possible downref: Non-RFC (?) normative reference: ref. 'IEEE754'
-- Possible downref: Non-RFC (?) normative reference: ref. 'UNICODE'
-- Obsolete informational reference (is this intentional?): RFC 4627
(Obsoleted by RFC 7158, RFC 7159)
Summary: 0 errors (**), 0 flaws (~~), 1 warning (==), 9 comments (--).
Run idnits with the --verbose option for more detailed information about
the items above.
--------------------------------------------------------------------------------
2 JSON Working Group T. Bray, Ed.
3 Internet-Draft Google, Inc.
4 Obsoletes: 4627 (if approved) October 11, 2013
5 Intended status: Standards Track
6 Expires: April 14, 2014
8 The JSON Data Interchange Format
9 draft-ietf-json-rfc4627bis-06
11 Abstract
13 JavaScript Object Notation (JSON) is a lightweight, text-based,
14 language-independent data interchange format. It was derived from
15 the ECMAScript Programming Language Standard. JSON defines a small
16 set of formatting rules for the portable representation of structured
17 data.
19 Status of This Memo
21 This Internet-Draft is submitted in full conformance with the
22 provisions of BCP 78 and BCP 79.
24 Internet-Drafts are working documents of the Internet Engineering
25 Task Force (IETF). Note that other groups may also distribute
26 working documents as Internet-Drafts. The list of current Internet-
27 Drafts is at http://datatracker.ietf.org/drafts/current/.
29 Internet-Drafts are draft documents valid for a maximum of six months
30 and may be updated, replaced, or obsoleted by other documents at any
31 time. It is inappropriate to use Internet-Drafts as reference
32 material or to cite them other than as "work in progress."
34 This Internet-Draft will expire on April 14, 2014.
36 Copyright Notice
38 Copyright (c) 2013 IETF Trust and the persons identified as the
39 document authors. All rights reserved.
41 This document is subject to BCP 78 and the IETF Trust's Legal
42 Provisions Relating to IETF Documents
43 (http://trustee.ietf.org/license-info) in effect on the date of
44 publication of this document. Please review these documents
45 carefully, as they describe your rights and restrictions with respect
46 to this document. Code Components extracted from this document must
47 include Simplified BSD License text as described in Section 4.e of
48 the Trust Legal Provisions and are provided without warranty as
49 described in the Simplified BSD License.
51 Table of Contents
53 1. Introduction
54 1.1. Conventions Used in This Document
55 1.2. Specifications of JSON
56 1.3. Introduction to This Revision
57 2. JSON Grammar
58 3. Values
59 4. Objects
60 5. Arrays
61 6. Numbers
62 7. Strings
63 8. String and Character Issues
64 8.1. Encoding and Detection
65 8.2. Unicode Characters
66 8.3. String Comparison
67 9. Parsers
68 10. Generators
69 11. IANA Considerations
70 12. Security Considerations
71 13. Examples
72 14. Contributors
73 15. References
74 15.1. Normative References
75 15.2. Informative References
76 Appendix A. Changes from RFC 4627
77 Author's Address
79 1. Introduction
81 JavaScript Object Notation (JSON) is a text format for the
82 serialization of structured data. It is derived from the object
83 literals of JavaScript, as defined in the ECMAScript Programming
84 Language Standard, Third Edition [ECMA-262].
86 JSON can represent four primitive types (strings, numbers, booleans,
87 and null) and two structured types (objects and arrays).
89 A string is a sequence of zero or more Unicode characters [UNICODE].
91 An object is an unordered collection of zero or more name/value
92 pairs, where a name is a string and a value is a string, number,
93 boolean, null, object, or array.
95 An array is an ordered sequence of zero or more values.
97 The terms "object" and "array" come from the conventions of
98 JavaScript.
100 JSON's design goals were for it to be minimal, portable, textual, and
101 a subset of JavaScript.
103 1.1. Conventions Used in This Document
105 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
106 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
107 document are to be interpreted as described in [RFC2119].
109 The grammatical rules in this document are to be interpreted as
110 described in [RFC5234].
112 1.2. Specifications of JSON
114 This document is an update of [RFC4627], which described JSON and
115 registered the Media Type "application/json".
117 A description of JSON in ECMAScript terms appears in version 5.1 of
118 the ECMAScript specification [ECMA-262], section 15.12. JSON is also
119 described in [ECMA-404]. ECMAScript 5.1 enumerates the differences
120 between JSON as described in that specification and in RFC4627. The
121 most significant is that ECMAScript 5.1 does not require a JSON Text
122 to be an Array or an Object; thus, for example, these constructs
123 would all be valid JSON texts in the ECMAScript context:
125 o "Hello world!"
127 o 42
129 o true
131 All of the specifications of JSON syntax agree on the syntactic
132 elements of the language.
134 1.3. Introduction to This Revision
135 In the years since the publication of RFC 4627, JSON has found very
136 wide use. This experience has revealed certain patterns which, while
137 allowed by its specifications, have caused interoperability problems.
139 Also, a small number of errata have been reported.
141 This revision does not change any of the rules of the specification;
142 all texts which were legal JSON remain so, and none which were not
143 JSON become JSON. The revision's goal is to fix the errata and
144 highlight practices which can lead to interoperability problems.
146 2. JSON Grammar
148 A JSON text is a sequence of tokens. The set of tokens includes six
149 structural characters, strings, numbers, and three literal names.
151 A JSON text is a serialized object or array.
153 JSON-text = object / array
155 These are the six structural characters:
157 begin-array = ws %x5B ws ; [ left square bracket
159 begin-object = ws %x7B ws ; { left curly bracket
161 end-array = ws %x5D ws ; ] right square bracket
163 end-object = ws %x7D ws ; } right curly bracket
165 name-separator = ws %x3A ws ; : colon
167 value-separator = ws %x2C ws ; , comma
169 Insignificant whitespace is allowed before or after any of the six
170 structural characters.
172 ws = *(
173 %x20 / ; Space
174 %x09 / ; Horizontal tab
175 %x0A / ; Line feed or New line
176 %x0D ) ; Carriage return
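As a non-normative illustration of the JSON-text rule above, the following Python sketch wraps the standard json module, which by itself also accepts top-level strings, numbers, and literals. The function name loads_object_or_array is hypothetical and is not defined by this document.

```python
import json

def loads_object_or_array(text):
    """Parse text, then enforce JSON-text = object / array."""
    value = json.loads(text)
    # json.loads maps JSON objects to dict and JSON arrays to list.
    if not isinstance(value, (dict, list)):
        raise ValueError("top-level value is not an object or array")
    return value
```

Whitespace around the structural characters does not change the result, so ' [ 1 , 2 ] ' and '[1,2]' parse to the same array.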
178 3. Values
179 A JSON value MUST be an object, array, number, or string, or one of
180 the following three literal names:
182 false null true
184 The literal names MUST be lowercase. No other literal names are
185 allowed.
187 value = false / null / true / object / array / number / string
189 false = %x66.61.6c.73.65 ; false
191 null = %x6e.75.6c.6c ; null
193 true = %x74.72.75.65 ; true
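The lowercase requirement can be observed with an existing parser. This is a small sketch that assumes Python's standard json module as the parser; the helper name is_accepted is hypothetical.

```python
import json

# The three literal names map to Python's False, None, and True.
values = json.loads('[false, null, true]')

def is_accepted(text):
    """Return True if Python's json module parses the text without error."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False
```

Under this sketch, is_accepted('[True]') and is_accepted('[NULL]') are both false, because only the lowercase spellings are literal names.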
195 4. Objects
197 An object structure is represented as a pair of curly brackets
198 surrounding zero or more name/value pairs (or members). A name is a
199 string. A single colon comes after each name, separating the name
200 from the value. A single comma separates a value from a following
201 name. The names within an object SHOULD be unique.
203 object = begin-object [ member *( value-separator member ) ]
204 end-object
206 member = string name-separator value
208 An object whose names are all unique is interoperable in the sense
209 that all software implementations which receive that object will
210 agree on the name-value mappings. When the names within an object
211 are not unique, the behavior of software that receives such an object
212 is unpredictable. Many implementations report the last name/value
213 pair only; other implementations report an error or fail to parse the
214 object; other implementations report all of the name/value pairs,
215 including duplicates.
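The three behaviors described above can be reproduced with a single parser. The sketch below assumes Python's standard json module and its object_pairs_hook parameter; the function name reject_duplicates is hypothetical.

```python
import json

def reject_duplicates(pairs):
    """object_pairs_hook that raises if a name repeats within one object."""
    seen = {}
    for name, value in pairs:
        if name in seen:
            raise ValueError("duplicate name: %r" % name)
        seen[name] = value
    return seen

text = '{"a": 1, "a": 2}'

# Default behavior of this parser: only the last name/value pair is reported.
last_wins = json.loads(text)

# Report every name/value pair, including duplicates, as a list of tuples.
all_pairs = json.loads(text, object_pairs_hook=list)
```

Passing object_pairs_hook=reject_duplicates gives the third behavior: the parse fails with an error when a name repeats.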
217 5. Arrays
219 An array structure is represented as square brackets surrounding zero
220 or more values (or elements). Elements are separated by commas.
222 array = begin-array [ value *( value-separator value ) ] end-array
224 6. Numbers
226 The representation of numbers is similar to that used in most
227 programming languages. A number contains an integer component that
228 may be prefixed with an optional minus sign, which may be followed by
229 a fraction part and/or an exponent part.
231 Octal and hex forms are not allowed. Leading zeros are not allowed.
233 A fraction part is a decimal point followed by one or more digits.
235 An exponent part begins with the letter E in upper or lowercase,
236 which may be followed by a plus or minus sign. The E and optional
237 sign are followed by one or more digits.
239 Numeric values that cannot be represented in the grammar below (such
240 as Infinity and NaN) are not permitted.
242 number = [ minus ] int [ frac ] [ exp ]
244 decimal-point = %x2E ; .
246 digit1-9 = %x31-39 ; 1-9
248 e = %x65 / %x45 ; e E
250 exp = e [ minus / plus ] 1*DIGIT
252 frac = decimal-point 1*DIGIT
254 int = zero / ( digit1-9 *DIGIT )
256 minus = %x2D ; -
258 plus = %x2B ; +
260 zero = %x30 ; 0
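For illustration only, the number grammar above can be transcribed into a regular expression. This Python sketch is one possible transcription, not a normative definition; the names JSON_NUMBER and is_json_number are hypothetical.

```python
import re

# [ minus ] int [ frac ] [ exp ], with int = zero / ( digit1-9 *DIGIT ).
# [0-9] is used instead of \d so that only ASCII digits match.
JSON_NUMBER = re.compile(
    r'-?(?:0|[1-9][0-9]*)'    # optional minus, then int (no leading zeros)
    r'(?:\.[0-9]+)?'          # optional frac: decimal point and 1*DIGIT
    r'(?:[eE][-+]?[0-9]+)?'   # optional exp: e or E, optional sign, 1*DIGIT
)

def is_json_number(text):
    """Return True if the whole text matches the number production."""
    return JSON_NUMBER.fullmatch(text) is not None
```

With this transcription, "-0", "1.5e+10", and "1E400" match, while "01", "+1", ".5", "1.", "0x10", "NaN", and "Infinity" do not.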
262 This specification allows implementations to set limits on the range
263 and precision of numbers accepted. Since software which implements
264 IEEE 754-2008 binary64 (double precision) numbers [IEEE754] is
265 generally available and widely used, good interoperability can be
266 achieved by implementations which expect no more precision or range
267 than these provide, in the sense that implementations will
268 approximate JSON numbers within the expected precision. A JSON
269 number such as 1E400 or 3.141592653589793238462643383279 may indicate
270 potential interoperability problems since it suggests that the
271 software which created it expected greater magnitude or precision
272 than is widely available.
274 Note that when such software is used, numbers which are integers and
275 are in the range [-(2**53)+1, (2**53)-1] are interoperable in the
276 sense that implementations will agree exactly on their numeric
277 values.
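The effect of binary64 limits can be shown with a short Python sketch. It assumes that Python's float is an IEEE 754 binary64 value, which holds on common platforms; the names MAX_EXACT, is_exact_integer, and parse_as_double are hypothetical.

```python
import json

MAX_EXACT = 2**53 - 1   # upper end of the range [-(2**53)+1, (2**53)-1]

def is_exact_integer(n):
    """True if integer n lies in the range where binary64 software agrees exactly."""
    return -MAX_EXACT <= n <= MAX_EXACT

def parse_as_double(text):
    """Parse JSON text, storing every number as a binary64 float."""
    return json.loads(text, parse_int=float)

# Outside the range, distinct integers collapse to the same double.
collapsed = parse_as_double("9007199254740993") == parse_as_double("9007199254740992")

# Magnitude beyond binary64 range becomes infinity in this parser;
# extra digits of precision are rounded away.
too_large = json.loads("1E400")
rounded = json.loads("3.141592653589793238462643383279")
```

Here collapsed is true, too_large is positive infinity, and rounded equals 3.141592653589793, which is how the two example numbers lose information on such software.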
279 7. Strings
281 The representation of strings is similar to conventions used in the C
282 family of programming languages. A string begins and ends with
283 quotation marks. All Unicode characters may be placed within the
284 quotation marks except for the characters that must be escaped:
285 quotation mark, reverse solidus, and the control characters (U+0000
286 through U+001F).
288 Any character may be escaped. If the character is in the Basic
289 Multilingual Plane (U+0000 through U+FFFF), then it may be
290 represented as a six-character sequence: a reverse solidus, followed
291 by the lowercase letter u, followed by four hexadecimal digits that
292 encode the character's code point. The hexadecimal letters A through
293 F can be upper or lowercase. So, for example, a string containing
294 only a single reverse solidus character may be represented as
295 "\u005C".
297 Alternatively, there are two-character sequence escape
298 representations of some popular characters. So, for example, a
299 string containing only a single reverse solidus character may be
300 represented more compactly as "\\".
302 To escape an extended character that is not in the Basic Multilingual
303 Plane, the character is represented as a twelve-character sequence,
304 encoding the UTF-16 surrogate pair. So, for example, a string
305 containing only the G clef character (U+1D11E) may be represented as
306 "\uD834\uDD1E".
308 string = quotation-mark *char quotation-mark
310 char = unescaped /
311 escape (
312 %x22 / ; " quotation mark U+0022
313 %x5C / ; \ reverse solidus U+005C
314 %x2F / ; / solidus U+002F
315 %x62 / ; b backspace U+0008
316 %x66 / ; f form feed U+000C
317 %x6E / ; n line feed U+000A
318 %x72 / ; r carriage return U+000D
319 %x74 / ; t tab U+0009
320 %x75 4HEXDIG ) ; uXXXX U+XXXX
322 escape = %x5C ; \
324 quotation-mark = %x22 ; "
326 unescaped = %x20-21 / %x23-5B / %x5D-10FFFF
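The twelve-character escape for a character outside the Basic Multilingual Plane follows from the UTF-16 surrogate pair arithmetic. The following Python sketch shows that computation; the function name escape_non_bmp is hypothetical.

```python
import json

def escape_non_bmp(code_point):
    """Return the twelve-character escape for a code point above U+FFFF."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point is not outside the BMP")
    offset = code_point - 0x10000          # 20-bit value
    high = 0xD800 + (offset >> 10)         # top 10 bits: high surrogate
    low = 0xDC00 + (offset & 0x3FF)        # bottom 10 bits: low surrogate
    return "\\u%04X\\u%04X" % (high, low)

# G clef, U+1D11E
g_clef_escape = escape_non_bmp(0x1D11E)

# Python's json module joins the surrogate pair back into one character.
g_clef = json.loads('"' + g_clef_escape + '"')
```

For U+1D11E the offset is 0xD11E, which yields the high surrogate D834 and the low surrogate DD1E.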
328 8. String and Character Issues
330 8.1. Encoding and Detection
332 JSON text SHALL be encoded in Unicode. The default encoding is
333 UTF-8.
335 Since the first two characters of a JSON text will always be ASCII
336 characters [RFC0020], it is possible to determine whether an octet
337 stream is UTF-8, UTF-16 (BE or LE), or UTF-32 (BE or LE) by looking
338 at the pattern of nulls in the first four octets.
340 00 00 00 xx UTF-32BE
341 00 xx 00 xx UTF-16BE
342 xx 00 00 00 UTF-32LE
343 xx 00 xx 00 UTF-16LE
344 xx xx xx xx UTF-8
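The table above can be applied mechanically. The Python sketch below is one way to do so, under two assumptions that this document does not state: "xx" is read as any non-null octet, and an input shorter than four octets is treated as UTF-8. The function name detect_encoding is hypothetical, and the returned strings are Python codec names.

```python
def detect_encoding(octets):
    """Guess the encoding from the pattern of nulls in the first four octets."""
    head = octets[:4]
    if len(head) < 4:
        return "utf-8"   # assumption: too short for the table to apply
    nulls = tuple(b == 0 for b in head)
    table = {
        (True, True, True, False): "utf-32-be",    # 00 00 00 xx
        (True, False, True, False): "utf-16-be",   # 00 xx 00 xx
        (False, True, True, True): "utf-32-le",    # xx 00 00 00
        (False, True, False, True): "utf-16-le",   # xx 00 xx 00
        (False, False, False, False): "utf-8",     # xx xx xx xx
    }
    return table.get(nulls)   # None when the pattern matches no row
```

A caller could then decode with octets.decode(detect_encoding(octets)) before parsing, after handling the None case.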
346 8.2. Unicode Characters
348 When all the strings represented in a JSON text are composed entirely
349 of Unicode characters [UNICODE] (however escaped), then that JSON
350 text is interoperable in the sense that all software implementations
351 which parse it will agree on the contents of names and of string
352 values in objects and arrays.
354 However, the ABNF in this specification allows member names and
355 string values to contain bit sequences which cannot encode Unicode
356 characters, for example "\uDEAD" (a single unpaired UTF-16
357 surrogate). Instances of this have been observed, for example when a
358 library truncates a UTF-16 string without checking whether the
359 truncation split a surrogate pair. The behavior of software which
360 receives JSON texts containing such values is unpredictable; for
361 example, implementations might return different values for the length
362 of a string value, or even suffer fatal runtime exceptions.
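The unpaired-surrogate case can be demonstrated with one widely used parser. This sketch assumes Python's standard json module, which accepts such a text; other parsers may behave differently, which is the interoperability concern described above. The helper name can_encode_utf8 is hypothetical.

```python
import json

# The text matches the ABNF, so this parser accepts it.
value = json.loads('"\\uDEAD"')

def can_encode_utf8(s):
    """Return True if the string can be written out as UTF-8."""
    try:
        s.encode("utf-8")
        return True
    except UnicodeEncodeError:
        return False
```

In this sketch value has length 1 and holds the lone surrogate U+DEAD, yet can_encode_utf8(value) is false, so the failure surfaces later, when the string is written out, rather than at parse time.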
364 8.3. String Comparison
366 Software implementations are typically required to test names of
367 object members for equality. Implementations which transform the
368 textual representation into sequences of Unicode code units, and then
369 perform the comparison numerically, code unit by code unit, are
370 interoperable in the sense that implementations will agree in all
371 cases on equality or inequality of two strings. For example,
372 implementations which compare strings with escaped characters
373 unconverted may incorrectly find that "a\\b" and "a\u005Cb" are not
374 equal.
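The difference between comparing textual representations and comparing decoded strings can be sketched in Python. The sketch assumes the standard json module for unescaping. Python compares strings by code point rather than by UTF-16 code unit, which gives the same answer for equality. The variable names are hypothetical.

```python
import json

# Two JSON texts for the same string: a, reverse solidus, b.
raw_1 = r'"a\\b"'        # uses the two-character escape
raw_2 = r'"a\u005Cb"'    # uses the six-character escape

# Comparing the unconverted text finds a difference.
same_as_text = (raw_1 == raw_2)

# Comparing after unescaping finds the strings equal.
decoded_1 = json.loads(raw_1)
decoded_2 = json.loads(raw_2)
same_when_decoded = (decoded_1 == decoded_2)
```

Here same_as_text is false and same_when_decoded is true, so only the second comparison gives the interoperable answer.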
376 9. Parsers
378 A JSON parser transforms a JSON text into another representation. A
379 JSON parser MUST accept all texts that conform to the JSON grammar.
380 A JSON parser MAY accept non-JSON forms or extensions.
382 An implementation may set limits on the size of texts that it
383 accepts. An implementation may set limits on the maximum depth of
384 nesting. An implementation may set limits on the range and precision
385 of numbers. An implementation may set limits on the length and
386 character contents of strings.
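As an illustration of such limits, the following Python sketch applies a size limit before parsing and a nesting-depth limit afterwards. The function names and default limits are hypothetical and are not recommended values. Because the depth check runs after parsing, it does not protect the parser itself; an implementation that needs to do so would enforce the limit during parsing.

```python
import json

def max_depth(value):
    """Nesting depth of a parsed JSON value (scalars have depth 0)."""
    if isinstance(value, dict):
        return 1 + max((max_depth(v) for v in value.values()), default=0)
    if isinstance(value, list):
        return 1 + max((max_depth(v) for v in value), default=0)
    return 0

def parse_with_limits(text, max_chars=1000000, depth_limit=64):
    """Parse text, rejecting input that exceeds the configured limits."""
    if len(text) > max_chars:
        raise ValueError("text exceeds size limit")
    value = json.loads(text)
    if max_depth(value) > depth_limit:
        raise ValueError("nesting exceeds depth limit")
    return value
```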
388 10. Generators
390 A JSON generator produces JSON text. The resulting text MUST
391 strictly conform to the JSON grammar.
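Some existing generators emit text outside the grammar unless configured otherwise. As an example, Python's standard json.dumps writes NaN and Infinity by default. The sketch below uses its allow_nan parameter to refuse such values; the function name generate is hypothetical.

```python
import json

def generate(value):
    """Serialize value, raising ValueError for NaN or infinite numbers."""
    return json.dumps(value, allow_nan=False)

# Default behavior of this library: output that is not a JSON number.
non_conforming = json.dumps([float("nan")])
```

Here non_conforming is the text [NaN], which the number grammar does not permit, while generate([float("nan")]) raises an error instead of producing it.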
393 11. IANA Considerations
395 The MIME media type for JSON text is application/json.
397 Type name: application
399 Subtype name: json
401 Required parameters: n/a
402 Optional parameters: n/a
404 Encoding considerations: 8bit if UTF-8; binary if UTF-16 or UTF-32.
405 JSON may be represented using UTF-8, UTF-16, or UTF-32. When JSON
406 is written in UTF-8, JSON is 8bit compatible. When JSON is
407 written in UTF-16 or UTF-32, the binary content-transfer-encoding
408 must be used.
410 Interoperability considerations: Described in this document
412 Published specification: This document
414 Applications that use this media type: JSON has been used to exchange
415 data between applications written in all of these programming
416 languages: ActionScript, C, C#, Clojure, ColdFusion, Common Lisp,
417 E, Erlang, Go, Java, JavaScript, Lua, Objective CAML, Perl, PHP,
418 Python, Rebol, Ruby, Scala, and Scheme.
420 Additional information: Magic number(s): n/a
421 File extension(s): .json
422 Macintosh file type code(s): TEXT
424 Person & email address to contact for further information: IESG
425 .
507 [RFC0020] Cerf, V., "ASCII format for network interchange", RFC 20,
508 October 1969.
510 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
511 Requirement Levels", BCP 14, RFC 2119, March 1997.
513 [RFC5234] Crocker, D. and P. Overell, "Augmented BNF for Syntax
514 Specifications: ABNF", STD 68, RFC 5234, January 2008.
516 [UNICODE] The Unicode Consortium, "The Unicode Standard, Version 4.0",
517 2003.
519 15.2. Informative References
521 [ECMA-262]
522 European Computer Manufacturers Association, "ECMAScript
523 Language Specification 5.1 Edition", June 2011.
526 [ECMA-404]
527 Ecma International, "The JSON Data Interchange Format",
528 October 2013.
531 [RFC4627] Crockford, D., "The application/json Media Type for
532 JavaScript Object Notation (JSON)", RFC 4627, July 2006.
534 Appendix A. Changes from RFC 4627
536 This section lists changes between this document and the text in RFC
537 4627.
539 o Changed Working Group attribution to JSON Working Group.
541 o Changed title of document.
543 o Changed the reference to [UNICODE] to be non-version-specific.
545 o Added a "Specifications of JSON" section.
547 o Added an "Introduction to this Revision" section.
549 o Added language about duplicate object member names and
550 interoperability.
552 o Applied erratum #607 from RFC 4627 to correctly align the artwork
553 for the definition of "object".
555 o Changed "as sequences of digits" to "in the grammar below" in
556 "Numbers" section.
558 o Added language about number interoperability as a function of
559 IEEE754, and an IEEE754 reference.
561 o Added language about interoperability and Unicode characters, and
562 about string comparisons. To do this, turned the old "Encoding"
563 section into a "String and Character Issues" section, with three
564 subsections: The old "Encoding" material, and two new sections for
565 "Unicode Characters" and "String Comparison".
567 o Changed guidance in "Parsers" section to point out that
568 implementations may set limits on the range "and precision" of
569 numbers.
571 o Updated and tidied the "IANA Considerations" section.
573 o Made a real "Security Considerations" section, and lifted the text
574 out of the existing "IANA Considerations" section.
576 o Applied erratum #3607 from RFC 4627 by removing the security
577 consideration that begins "A JSON text can be safely passed" and
578 the JavaScript code that went with that consideration.
580 o Added a note to the "Security Considerations" section pointing out
581 the risks of using the "eval()" function in JavaScript or any
582 other language in which JSON texts conform to that language's
583 syntax.
585 o Changed "100" to 100 and added a boolean field, both in the first
586 example.
588 o Added "Contributors" section crediting Douglas Crockford.
590 o Added a reference to RFC4627.
592 o Moved the ECMAScript reference from Normative to Informative,
593 updated it to reference ECMAScript 5.1, and added reference to
594 ECMA 404.
596 Author's Address
598 Tim Bray (editor)
599 Google, Inc.
601 Email: tbray@textuality.com