idnits 2.17.1

draft-ietf-json-rfc4627bis-06.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  -- The draft header indicates that this document obsoletes RFC4627, but the
     abstract doesn't seem to mention this, which it should.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (October 11, 2013) is 3821 days in the past. Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  -- Looks like a reference, but probably isn't: '116' on line 463

  -- Looks like a reference, but probably isn't: '943' on line 463

  -- Looks like a reference, but probably isn't: '234' on line 463

  -- Looks like a reference, but probably isn't: '38793' on line 463

  -- Possible downref: Non-RFC (?) normative reference: ref. 'IEEE754'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'UNICODE'

  -- Obsolete informational reference (is this intentional?): RFC 4627
     (Obsoleted by RFC 7158, RFC 7159)

     Summary: 0 errors (**), 0 flaws (~~), 1 warning (==), 9 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.
--------------------------------------------------------------------------------

2    JSON Working Group                                          T. Bray, Ed.
3    Internet-Draft                                              Google, Inc.
4    Obsoletes: 4627 (if approved)                           October 11, 2013
5    Intended status: Standards Track
6    Expires: April 14, 2014

8                       The JSON Data Interchange Format
9                        draft-ietf-json-rfc4627bis-06

11   Abstract

13      JavaScript Object Notation (JSON) is a lightweight, text-based,
14      language-independent data interchange format. It was derived from
15      the ECMAScript Programming Language Standard. JSON defines a small
16      set of formatting rules for the portable representation of structured
17      data.

19   Status of This Memo

21      This Internet-Draft is submitted in full conformance with the
22      provisions of BCP 78 and BCP 79.

24      Internet-Drafts are working documents of the Internet Engineering
25      Task Force (IETF). Note that other groups may also distribute
26      working documents as Internet-Drafts. The list of current Internet-
27      Drafts is at http://datatracker.ietf.org/drafts/current/.

29      Internet-Drafts are draft documents valid for a maximum of six months
30      and may be updated, replaced, or obsoleted by other documents at any
31      time. It is inappropriate to use Internet-Drafts as reference
32      material or to cite them other than as "work in progress."

34      This Internet-Draft will expire on April 14, 2014.

36   Copyright Notice

38      Copyright (c) 2013 IETF Trust and the persons identified as the
39      document authors. All rights reserved.

41      This document is subject to BCP 78 and the IETF Trust's Legal
42      Provisions Relating to IETF Documents
43      (http://trustee.ietf.org/license-info) in effect on the date of
44      publication of this document. Please review these documents
45      carefully, as they describe your rights and restrictions with respect
46      to this document. Code Components extracted from this document must
47      include Simplified BSD License text as described in Section 4.e of
48      the Trust Legal Provisions and are provided without warranty as
49      described in the Simplified BSD License.

51   Table of Contents

53      1.  Introduction
54        1.1.  Conventions Used in This Document
55        1.2.  Specifications of JSON
56        1.3.  Introduction to This Revision
57      2.  JSON Grammar
58      3.  Values
59      4.  Objects
60      5.  Arrays
61      6.  Numbers
62      7.  Strings
63      8.  String and Character Issues
64        8.1.  Encoding and Detection
65        8.2.  Unicode Characters
66        8.3.  String Comparison
67      9.  Parsers
68      10. Generators
69      11. IANA Considerations
70      12. Security Considerations
71      13. Examples
72      14. Contributors
73      15. References
74        15.1.  Normative References
75        15.2.  Informative References
76      Appendix A.  Changes from RFC 4627
77      Author's Address

79   1.  Introduction

81      JavaScript Object Notation (JSON) is a text format for the
82      serialization of structured data. It is derived from the object
83      literals of JavaScript, as defined in the ECMAScript Programming
84      Language Standard, Third Edition [ECMA-262].

86      JSON can represent four primitive types (strings, numbers, booleans,
87      and null) and two structured types (objects and arrays).

89      A string is a sequence of zero or more Unicode characters [UNICODE].

91      An object is an unordered collection of zero or more name/value
92      pairs, where a name is a string and a value is a string, number,
93      boolean, null, object, or array.

95      An array is an ordered sequence of zero or more values.

97      The terms "object" and "array" come from the conventions of
98      JavaScript.

100     JSON's design goals were for it to be minimal, portable, textual, and
101     a subset of JavaScript.

103  1.1.  Conventions Used in This Document

105     The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
106     "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
107     document are to be interpreted as described in [RFC2119].

109     The grammatical rules in this document are to be interpreted as
110     described in [RFC5234].

112  1.2.  Specifications of JSON

114     This document is an update of [RFC4627], which described JSON and
115     registered the Media Type "application/json".

117     A description of JSON in ECMAScript terms appears in version 5.1 of
118     the ECMAScript specification [ECMA-262], section 15.12. JSON is also
119     described in [ECMA-404]. ECMAScript 5.1 enumerates the differences
120     between JSON as described in that specification and in RFC4627. The
121     most significant is that ECMAScript 5.1 does not require a JSON Text
122     to be an Array or an Object; thus, for example, these constructs
123     would all be valid JSON texts in the ECMAScript context:

125     o  "Hello world!"

127     o  42

129     o  true

131     All of the specifications of JSON syntax agree on the syntactic
132     elements of the language.
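
The difference described in Section 1.2 can be illustrated with a short
sketch. It assumes Python's standard "json" module as a stand-in for a
parser that, like ECMAScript 5.1, accepts any JSON value at the top
level. The helper name is_object_or_array_text is hypothetical and not
part of this specification; it only layers the "JSON-text = object /
array" rule on top of that parser.

```python
import json


def _reject_constant(name):
    # Python's parser accepts NaN, Infinity and -Infinity as an extension;
    # the JSON grammar does not permit them, so treat them as errors.
    raise ValueError(f"{name} is not a JSON value")


def is_object_or_array_text(text: str) -> bool:
    """Return True only if `text` parses and its top-level value is an
    object or an array (the stricter rule: JSON-text = object / array)."""
    try:
        value = json.loads(text, parse_constant=_reject_constant)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False
    return isinstance(value, (dict, list))


if __name__ == "__main__":
    # The three scalar texts are the ones listed in Section 1.2.
    for text in ['"Hello world!"', "42", "true", '{"a": 1}', "[1, 2]"]:
        print(text, is_object_or_array_text(text))
```

The three scalar texts parse successfully with json.loads but are
reported as not being an object or array, which is exactly the gap
between the two specifications that the section describes.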
134  1.3.  Introduction to This Revision
135     In the years since the publication of RFC 4627, JSON has found very
136     wide use. This experience has revealed certain patterns which, while
137     allowed by its specifications, have caused interoperability problems.

139     Also, a small number of errata have been reported.

141     This revision does not change any of the rules of the specification;
142     all texts which were legal JSON remain so, and none which were not
143     JSON become JSON. The revision's goal is to fix the errata and
144     highlight practices which can lead to interoperability problems.

146  2.  JSON Grammar

148     A JSON text is a sequence of tokens. The set of tokens includes six
149     structural characters, strings, numbers, and three literal names.

151     A JSON text is a serialized object or array.

153     JSON-text = object / array

155     These are the six structural characters:

157     begin-array     = ws %x5B ws  ; [ left square bracket

159     begin-object    = ws %x7B ws  ; { left curly bracket

161     end-array       = ws %x5D ws  ; ] right square bracket

163     end-object      = ws %x7D ws  ; } right curly bracket

165     name-separator  = ws %x3A ws  ; : colon

167     value-separator = ws %x2C ws  ; , comma

169     Insignificant whitespace is allowed before or after any of the six
170     structural characters.

172     ws = *(
173             %x20 /              ; Space
174             %x09 /              ; Horizontal tab
175             %x0A /              ; Line feed or New line
176             %x0D )              ; Carriage return

178  3.  Values
179     A JSON value MUST be an object, array, number, or string, or one of
180     the following three literal names:

182     false null true

184     The literal names MUST be lowercase. No other literal names are
185     allowed.

187     value = false / null / true / object / array / number / string

189     false = %x66.61.6c.73.65   ; false

191     null  = %x6e.75.6c.6c      ; null

193     true  = %x74.72.75.65      ; true

195  4.  Objects

197     An object structure is represented as a pair of curly brackets
198     surrounding zero or more name/value pairs (or members). A name is a
199     string. A single colon comes after each name, separating the name
200     from the value. A single comma separates a value from a following
201     name. The names within an object SHOULD be unique.

203     object = begin-object [ member *( value-separator member ) ]
204              end-object

206     member = string name-separator value

208     An object whose names are all unique is interoperable in the sense
209     that all software implementations which receive that object will
210     agree on the name-value mappings. When the names within an object
211     are not unique, the behavior of software that receives such an object
212     is unpredictable. Many implementations report the last name/value
213     pair only; other implementations report an error or fail to parse the
214     object; other implementations report all of the name/value pairs,
215     including duplicates.

217  5.  Arrays

219     An array structure is represented as square brackets surrounding zero
220     or more values (or elements). Elements are separated by commas.

222     array = begin-array [ value *( value-separator value ) ] end-array

224  6.  Numbers

226     The representation of numbers is similar to that used in most
227     programming languages. A number contains an integer component that
228     may be prefixed with an optional minus sign, which may be followed by
229     a fraction part and/or an exponent part.

231     Octal and hex forms are not allowed. Leading zeros are not allowed.

233     A fraction part is a decimal point followed by one or more digits.

235     An exponent part begins with the letter E in upper or lowercase,
236     which may be followed by a plus or minus sign. The E and optional
237     sign are followed by one or more digits.

239     Numeric values that cannot be represented in the grammar below (such
240     as Infinity and NaN) are not permitted.

242     number = [ minus ] int [ frac ] [ exp ]

244     decimal-point = %x2E       ; .
246     digit1-9 = %x31-39         ; 1-9

248     e = %x65 / %x45            ; e E

250     exp = e [ minus / plus ] 1*DIGIT

252     frac = decimal-point 1*DIGIT

254     int = zero / ( digit1-9 *DIGIT )

256     minus = %x2D               ; -

258     plus = %x2B                ; +

260     zero = %x30                ; 0

262     This specification allows implementations to set limits on the range
263     and precision of numbers accepted. Since software which implements
264     IEEE 754-2008 binary64 (double precision) numbers [IEEE754] is
265     generally available and widely used, good interoperability can be
266     achieved by implementations which expect no more precision or range
267     than these provide, in the sense that implementations will
268     approximate JSON numbers within the expected precision. A JSON
269     number such as 1E400 or 3.141592653589793238462643383279 may indicate
270     potential interoperability problems since it suggests that the
271     software which created it expected greater magnitude or precision
272     than is widely available.

274     Note that when such software is used, numbers which are integers and
275     are in the range [-(2**53)+1, (2**53)-1] are interoperable in the
276     sense that implementations will agree exactly on their numeric
277     values.

279  7.  Strings

281     The representation of strings is similar to conventions used in the C
282     family of programming languages. A string begins and ends with
283     quotation marks. All Unicode characters may be placed within the
284     quotation marks except for the characters that must be escaped:
285     quotation mark, reverse solidus, and the control characters (U+0000
286     through U+001F).

288     Any character may be escaped. If the character is in the Basic
289     Multilingual Plane (U+0000 through U+FFFF), then it may be
290     represented as a six-character sequence: a reverse solidus, followed
291     by the lowercase letter u, followed by four hexadecimal digits that
292     encode the character's code point. The hexadecimal letters A through
293     F can be upper or lowercase. So, for example, a string containing
294     only a single reverse solidus character may be represented as
295     "\u005C".

297     Alternatively, there are two-character sequence escape
298     representations of some popular characters. So, for example, a
299     string containing only a single reverse solidus character may be
300     represented more compactly as "\\".

302     To escape an extended character that is not in the Basic Multilingual
303     Plane, the character is represented as a twelve-character sequence,
304     encoding the UTF-16 surrogate pair. So, for example, a string
305     containing only the G clef character (U+1D11E) may be represented as
306     "\uD834\uDD1E".

308     string = quotation-mark *char quotation-mark

310     char = unescaped /
311         escape (
312             %x22 /          ; "    quotation mark  U+0022
313             %x5C /          ; \    reverse solidus U+005C
314             %x2F /          ; /    solidus         U+002F
315             %x62 /          ; b    backspace       U+0008
316             %x66 /          ; f    form feed       U+000C
317             %x6E /          ; n    line feed       U+000A
318             %x72 /          ; r    carriage return U+000D
319             %x74 /          ; t    tab             U+0009
320             %x75 4HEXDIG )  ; uXXXX                U+XXXX

322     escape = %x5C              ; \

324     quotation-mark = %x22      ; "

326     unescaped = %x20-21 / %x23-5B / %x5D-10FFFF

328  8.  String and Character Issues

330  8.1.  Encoding and Detection

332     JSON text SHALL be encoded in Unicode. The default encoding is
333     UTF-8.

335     Since the first two characters of a JSON text will always be ASCII
336     characters [RFC0020], it is possible to determine whether an octet
337     stream is UTF-8, UTF-16 (BE or LE), or UTF-32 (BE or LE) by looking
338     at the pattern of nulls in the first four octets.

340             00 00 00 xx  UTF-32BE
341             00 xx 00 xx  UTF-16BE
342             xx 00 00 00  UTF-32LE
343             xx 00 xx 00  UTF-16LE
344             xx xx xx xx  UTF-8

346  8.2.  Unicode Characters

348     When all the strings represented in a JSON text are composed entirely
349     of Unicode characters [UNICODE] (however escaped), then that JSON
350     text is interoperable in the sense that all software implementations
351     which parse it will agree on the contents of names and of string
352     values in objects and arrays.

354     However, the ABNF in this specification allows member names and
355     string values to contain bit sequences which cannot encode Unicode
356     characters, for example "\uDEAD" (a single unpaired UTF-16
357     surrogate). Instances of this have been observed, for example when a
358     library truncates a UTF-16 string without checking whether the
359     truncation split a surrogate pair. The behavior of software which
360     receives JSON texts containing such values is unpredictable; for
361     example, implementations might return different values for the length
362     of a string value, or even suffer fatal runtime exceptions.

364  8.3.  String Comparison

366     Software implementations are typically required to test names of
367     object members for equality. Implementations which transform the
368     textual representation into sequences of Unicode code units, and then
369     perform the comparison numerically, code unit by code unit, are
370     interoperable in the sense that implementations will agree in all
371     cases on equality or inequality of two strings. For example,
372     implementations which compare strings with escaped characters
373     unconverted may incorrectly find that "a\\b" and "a\u005Cb" are not
374     equal.

376  9.  Parsers

378     A JSON parser transforms a JSON text into another representation. A
379     JSON parser MUST accept all texts that conform to the JSON grammar.
380     A JSON parser MAY accept non-JSON forms or extensions.

382     An implementation may set limits on the size of texts that it
383     accepts. An implementation may set limits on the maximum depth of
384     nesting. An implementation may set limits on the range and precision
385     of numbers. An implementation may set limits on the length and
386     character contents of strings.

388  10.  Generators

390     A JSON generator produces JSON text. The resulting text MUST
391     strictly conform to the JSON grammar.

393  11.  IANA Considerations

395     The MIME media type for JSON text is application/json.

397     Type name: application

399     Subtype name: json

401     Required parameters: n/a
402     Optional parameters: n/a

404     Encoding considerations: 8bit if UTF-8; binary if UTF-16 or UTF-32.
405        JSON may be represented using UTF-8, UTF-16, or UTF-32. When JSON
406        is written in UTF-8, JSON is 8bit compatible. When JSON is
407        written in UTF-16 or UTF-32, the binary content-transfer-encoding
408        must be used.

410     Interoperability considerations: Described in this document

412     Published specification: This document

414     Applications that use this media type: JSON has been used to exchange
415        data between applications written in all of these programming
416        languages: ActionScript, C, C#, Clojure, ColdFusion, Common Lisp,
417        E, Erlang, Go, Java, JavaScript, Lua, Objective CAML, Perl, PHP,
418        Python, Rebol, Ruby, Scala, and Scheme.

420     Additional information: Magic number(s): n/a
421        File extension(s): .json
422        Macintosh file type code(s): TEXT

424     Person & email address to contact for further information: IESG
425        .

507     [RFC0020]  Cerf, V., "ASCII format for network interchange", RFC 20,
508                October 1969.

510     [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
511                Requirement Levels", BCP 14, RFC 2119, March 1997.

513     [RFC5234]  Crocker, D. and P. Overell, "Augmented BNF for Syntax
514                Specifications: ABNF", STD 68, RFC 5234, January 2008.

516     [UNICODE]  The Unicode Consortium, "The Unicode Standard, Version 4.0
517                ", 2003, .

519  15.2.  Informative References

521     [ECMA-262]
522                European Computer Manufacturers Association, "ECMAScript
523                Language Specification 5.1 Edition ", June 2011, .
526     [ECMA-404]
527                Ecma International, "The JSON Data Interchange Format ",
528                October 2013, .

531     [RFC4627]  Crockford, D., "The application/json Media Type for
532                JavaScript Object Notation (JSON)", RFC 4627, July 2006.

534  Appendix A.  Changes from RFC 4627

536     This section lists changes between this document and the text in RFC
537     4627.

539     o  Changed Working Group attribution to JSON Working Group.

541     o  Changed title of document.

543     o  Changed the reference to [UNICODE] to be non-version-specific.

545     o  Added a "Specifications of JSON" section.

547     o  Added an "Introduction to this Revision" section.

549     o  Added language about duplicate object member names and
550        interoperability.

552     o  Applied erratum #607 from RFC 4627 to correctly align the artwork
553        for the definition of "object".

555     o  Changed "as sequences of digits" to "in the grammar below" in
556        "Numbers" section.

558     o  Added language about number interoperability as a function of
559        IEEE754, and an IEEE754 reference.

561     o  Added language about interoperability and Unicode characters, and
562        about string comparisons. To do this, turned the old "Encoding"
563        section into a "String and Character Issues" section, with three
564        subsections: The old "Encoding" material, and two new sections for
565        "Unicode Characters" and "String Comparison".

567     o  Changed guidance in "Parsers" section to point out that
568        implementations may set limits on the range "and precision" of
569        numbers.

571     o  Updated and tidied the "IANA Considerations" section.

573     o  Made a real "Security Considerations" section, and lifted the text
574        out of the existing "IANA Considerations" section.

576     o  Applied erratum #3607 from RFC 4627 by removing the security
577        consideration that begins "A JSON text can be safely passed" and
578        the JavaScript code that went with that consideration.
580     o  Added a note to the "Security Considerations" section pointing out
581        the risks of using the "eval()" function in JavaScript or any
582        other language in which JSON texts conform to that language's
583        syntax.

585     o  Changed "100" to 100 and added a boolean field, both in the first
586        example.

588     o  Added "Contributors" section crediting Douglas Crockford.

590     o  Added a reference to RFC4627.

592     o  Moved the ECMAScript reference from Normative to Informative,
593        updated it to reference ECMAScript 5.1, and added reference to
594        ECMA 404.

596  Author's Address

598     Tim Bray (editor)
599     Google, Inc.

601     Email: tbray@textuality.com