idnits 2.17.1

draft-ietf-json-rfc4627bis-08.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  -- The draft header indicates that this document obsoletes RFC4627, but the
     abstract doesn't seem to mention this, which it should.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (December 04, 2013) is 3789 days in the past. Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  -- Looks like a reference, but probably isn't: '116' on line 460

  -- Looks like a reference, but probably isn't: '943' on line 460

  -- Looks like a reference, but probably isn't: '234' on line 460

  -- Looks like a reference, but probably isn't: '38793' on line 460

  == Unused Reference: 'RFC0020' is defined on line 512, but no explicit
     reference was found in the text

  -- Possible downref: Non-RFC (?) normative reference: ref. 'IEEE754'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'UNICODE'

  -- Obsolete informational reference (is this intentional?): RFC 4627
     (Obsoleted by RFC 7158, RFC 7159)

     Summary: 0 errors (**), 0 flaws (~~), 2 warnings (==), 9 comments (--).
     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

2  JSON Working Group                                          T. Bray, Ed.
3  Internet-Draft                                              Google, Inc.
4  Obsoletes: 4627 (if approved)                          December 04, 2013
5  Intended status: Standards Track
6  Expires: June 07, 2014

8  The JSON Data Interchange Format
9  draft-ietf-json-rfc4627bis-08

11  Abstract

13  JavaScript Object Notation (JSON) is a lightweight, text-based,
14  language-independent data interchange format. It was derived from
15  the ECMAScript Programming Language Standard. JSON defines a small
16  set of formatting rules for the portable representation of structured
17  data.

19  This document makes no changes to the definition of JSON; it repairs
20  specification errors and offers experience-based interoperability
21  guidance.

23  Status of This Memo

25  This Internet-Draft is submitted in full conformance with the
26  provisions of BCP 78 and BCP 79.

28  Internet-Drafts are working documents of the Internet Engineering
29  Task Force (IETF). Note that other groups may also distribute
30  working documents as Internet-Drafts. The list of current Internet-
31  Drafts is at http://datatracker.ietf.org/drafts/current/.

33  Internet-Drafts are draft documents valid for a maximum of six months
34  and may be updated, replaced, or obsoleted by other documents at any
35  time. It is inappropriate to use Internet-Drafts as reference
36  material or to cite them other than as "work in progress."

38  This Internet-Draft will expire on June 07, 2014.

40  Copyright Notice

42  Copyright (c) 2013 IETF Trust and the persons identified as the
43  document authors. All rights reserved.

45  This document is subject to BCP 78 and the IETF Trust's Legal
46  Provisions Relating to IETF Documents
47  (http://trustee.ietf.org/license-info) in effect on the date of
48  publication of this document. Please review these documents
49  carefully, as they describe your rights and restrictions with respect
50  to this document. Code Components extracted from this document must
51  include Simplified BSD License text as described in Section 4.e of
52  the Trust Legal Provisions and are provided without warranty as
53  described in the Simplified BSD License.

55  Table of Contents

57  1. Introduction
58    1.1. Conventions Used in This Document
59    1.2. Specifications of JSON
60    1.3. Introduction to This Revision
61  2. JSON Grammar
62  3. Values
63  4. Objects
64  5. Arrays
65  6. Numbers
66  7. Strings
67  8. String and Character Issues
68    8.1. Character Encoding
69    8.2. Unicode Characters
70    8.3. String Comparison
71  9. Parsers
72  10. Generators
73  11. IANA Considerations
74  12. Security Considerations
75  13. Examples
76  14. Contributors
77  15. References
78    15.1. Normative References
79    15.2. Informative References
80  Appendix A. Changes from RFC 4627
81  Author's Address

83  1. Introduction

85  JavaScript Object Notation (JSON) is a text format for the
86  serialization of structured data. It is derived from the object
87  literals of JavaScript, as defined in the ECMAScript Programming
88  Language Standard, Third Edition [ECMA-262].

90  JSON can represent four primitive types (strings, numbers, booleans,
91  and null) and two structured types (objects and arrays).

93  A string is a sequence of zero or more Unicode characters [UNICODE].

95  An object is an unordered collection of zero or more name/value
96  pairs, where a name is a string and a value is a string, number,
97  boolean, null, object, or array.

99  An array is an ordered sequence of zero or more values.

101  The terms "object" and "array" come from the conventions of
102  JavaScript.

104  JSON's design goals were for it to be minimal, portable, textual, and
105  a subset of JavaScript.

107  1.1. Conventions Used in This Document

109  The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
110  "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
111  document are to be interpreted as described in [RFC2119].

113  The grammatical rules in this document are to be interpreted as
114  described in [RFC5234].

116  1.2. Specifications of JSON

118  This document is an update of [RFC4627], which described JSON and
119  registered the Media Type "application/json".

121  A description of JSON in ECMAScript terms appears in version 5.1 of
122  the ECMAScript specification [ECMA-262], section 15.12. JSON is also
123  described in [ECMA-404].

125  All of the specifications of JSON syntax agree on the syntactic
126  elements of the language.

128  1.3. Introduction to This Revision

130  In the years since the publication of RFC 4627, JSON has found very
131  wide use. This experience has revealed certain patterns which, while
132  allowed by its specifications, have caused interoperability problems.

134  Also, a small number of errata have been reported.

136  This revision does not change any of the rules of the specification;
137  all texts which were legal JSON remain so, and none which were not
138  JSON become JSON. The revision's goal is to fix the errata and
139  highlight practices which can lead to interoperability problems.

141  2. JSON Grammar

142  A JSON text is a sequence of tokens. The set of tokens includes six
143  structural characters, strings, numbers, and three literal names.

145  A JSON text is a serialized value. Note that certain previous
146  specifications of JSON constrained a JSON text to be an object or an
147  array. Implementations which generate only objects or arrays where a
148  JSON text is called for will be interoperable in the sense that all
149  implementations will accept these as conforming JSON texts.

151  JSON-text = ws value ws

153  These are the six structural characters:

155  begin-array     = ws %x5B ws  ; [ left square bracket

157  begin-object    = ws %x7B ws  ; { left curly bracket

159  end-array       = ws %x5D ws  ; ] right square bracket

161  end-object      = ws %x7D ws  ; } right curly bracket

163  name-separator  = ws %x3A ws  ; : colon

165  value-separator = ws %x2C ws  ; , comma

167  Insignificant whitespace is allowed before or after any of the six
168  structural characters.

170  ws = *(
171          %x20 /              ; Space
172          %x09 /              ; Horizontal tab
173          %x0A /              ; Line feed or New line
174          %x0D )              ; Carriage return

176  3. Values

178  A JSON value MUST be an object, array, number, or string, or one of
179  the following three literal names:

181  false null true

183  The literal names MUST be lowercase. No other literal names are
184  allowed.

186  value = false / null / true / object / array / number / string

188  false = %x66.61.6c.73.65   ; false

190  null  = %x6e.75.6c.6c      ; null

192  true  = %x74.72.75.65      ; true

194  4. Objects

196  An object structure is represented as a pair of curly brackets
197  surrounding zero or more name/value pairs (or members). A name is a
198  string. A single colon comes after each name, separating the name
199  from the value. A single comma separates a value from a following
200  name. The names within an object SHOULD be unique.

202  object = begin-object [ member *( value-separator member ) ]
203           end-object

205  member = string name-separator value

207  An object whose names are all unique is interoperable in the sense
208  that all software implementations which receive that object will
209  agree on the name-value mappings. When the names within an object
210  are not unique, the behavior of software that receives such an object
211  is unpredictable. Many implementations report the last name/value
212  pair only; other implementations report an error or fail to parse the
213  object; other implementations report all of the name/value pairs,
214  including duplicates.

216  5. Arrays

218  An array structure is represented as square brackets surrounding zero
219  or more values (or elements). Elements are separated by commas.

221  array = begin-array [ value *( value-separator value ) ] end-array

223  6. Numbers

225  The representation of numbers is similar to that used in most
226  programming languages. A number contains an integer component that
227  may be prefixed with an optional minus sign, which may be followed by
228  a fraction part and/or an exponent part.

230  Octal and hex forms are not allowed. Leading zeros are not allowed.

232  A fraction part is a decimal point followed by one or more digits.

234  An exponent part begins with the letter E in upper or lowercase,
235  which may be followed by a plus or minus sign. The E and optional
236  sign are followed by one or more digits.

238  Numeric values that cannot be represented in the grammar below (such
239  as Infinity and NaN) are not permitted.
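
As an informal illustration that is not part of the specification, the prose rules for numbers given above can be sketched as a Python regular expression. The names JSON_NUMBER and is_json_number are hypothetical and exist only for this sketch; the sketch checks syntax only and says nothing about range or precision.

```python
import re

# Hypothetical helper, not defined by the specification: a regular
# expression that mirrors the prose rules for JSON numbers.
JSON_NUMBER = re.compile(
    r"-?"                    # optional minus sign
    r"(?:0|[1-9][0-9]*)"     # integer component, no leading zeros
    r"(?:\.[0-9]+)?"         # optional fraction: point plus 1+ digits
    r"(?:[eE][+-]?[0-9]+)?"  # optional exponent: E or e, sign, 1+ digits
)


def is_json_number(text: str) -> bool:
    """Return True if text is exactly one syntactically valid JSON number."""
    return JSON_NUMBER.fullmatch(text) is not None
```

Under this sketch, forms such as 012, 0x1F, .5, and NaN are rejected, while -12.5e+3 is accepted.
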
241  number = [ minus ] int [ frac ] [ exp ]

243  decimal-point = %x2E       ; .

245  digit1-9 = %x31-39         ; 1-9

247  e = %x65 / %x45            ; e E

249  exp = e [ minus / plus ] 1*DIGIT

251  frac = decimal-point 1*DIGIT

253  int = zero / ( digit1-9 *DIGIT )

255  minus = %x2D               ; -

257  plus = %x2B                ; +

259  zero = %x30                ; 0

261  This specification allows implementations to set limits on the range
262  and precision of numbers accepted. Since software which implements
263  IEEE 754-2008 binary64 (double precision) numbers [IEEE754] is
264  generally available and widely used, good interoperability can be
265  achieved by implementations which expect no more precision or range
266  than these provide, in the sense that implementations will
267  approximate JSON numbers within the expected precision. A JSON
268  number such as 1E400 or 3.141592653589793238462643383279 may indicate
269  potential interoperability problems since it suggests that the
270  software which created it expects greater magnitude or precision
271  than is widely available.

273  Note that when such software is used, numbers which are integers and
274  are in the range [-(2**53)+1, (2**53)-1] are interoperable in the
275  sense that implementations will agree exactly on their numeric
276  values.

278  7. Strings

280  The representation of strings is similar to conventions used in the C
281  family of programming languages. A string begins and ends with
282  quotation marks. All Unicode characters may be placed within the
283  quotation marks except for the characters that must be escaped:
284  quotation mark, reverse solidus, and the control characters (U+0000
285  through U+001F).

287  Any character may be escaped. If the character is in the Basic
288  Multilingual Plane (U+0000 through U+FFFF), then it may be
289  represented as a six-character sequence: a reverse solidus, followed
290  by the lowercase letter u, followed by four hexadecimal digits that
291  encode the character's code point. The hexadecimal letters A through
292  F can be upper or lowercase. So, for example, a string containing
293  only a single reverse solidus character may be represented as
294  "\u005C".

296  Alternatively, there are two-character sequence escape
297  representations of some popular characters. So, for example, a
298  string containing only a single reverse solidus character may be
299  represented more compactly as "\\".

301  To escape an extended character that is not in the Basic Multilingual
302  Plane, the character is represented as a twelve-character sequence,
303  encoding the UTF-16 surrogate pair. So, for example, a string
304  containing only the G clef character (U+1D11E) may be represented as
305  "\uD834\uDD1E".

307  string = quotation-mark *char quotation-mark

309  char = unescaped /
310      escape (
311          %x22 /          ; "    quotation mark  U+0022
312          %x5C /          ; \    reverse solidus U+005C
313          %x2F /          ; /    solidus         U+002F
314          %x62 /          ; b    backspace       U+0008
315          %x66 /          ; f    form feed       U+000C
316          %x6E /          ; n    line feed       U+000A
317          %x72 /          ; r    carriage return U+000D
318          %x74 /          ; t    tab             U+0009
319          %x75 4HEXDIG )  ; uXXXX                U+XXXX

321  escape = %x5C              ; \

323  quotation-mark = %x22      ; "

325  unescaped = %x20-21 / %x23-5B / %x5D-10FFFF

327  8. String and Character Issues

329  8.1. Character Encoding

331  JSON text SHALL be encoded in Unicode. The default encoding is
332  UTF-8, and JSON texts which are encoded in UTF-8 are interoperable in
333  the sense that they will be read successfully by the maximum number
334  of implementations; there are many implementations which cannot
335  successfully read texts in other encodings (such as UTF-16 and
336  UTF-32).

338  Implementations have been observed to generate JSON texts prefixed
339  with a Byte Order Mark character. While this is not allowed by the
340  grammar in this specification, in the interests of interoperability
341  it is RECOMMENDED that implementations which parse JSON texts ignore
342  the presence of a byte order mark rather than treating it as an
343  error.
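
The recommendation above on byte order marks can be sketched as follows. This is a minimal illustration, not part of the specification: it assumes UTF-8 input and Python's standard json module, and parse_json_bytes is a hypothetical helper name. A leading byte order mark, if present, is dropped before parsing rather than being treated as an error.

```python
import codecs
import json


def parse_json_bytes(data: bytes):
    """Parse UTF-8 encoded JSON, ignoring a leading byte order mark."""
    # Assumption for this sketch: the input is UTF-8. Drop the
    # three-byte UTF-8 BOM (EF BB BF) when it prefixes the text.
    if data.startswith(codecs.BOM_UTF8):
        data = data[len(codecs.BOM_UTF8):]
    return json.loads(data.decode("utf-8"))
```

Only a mark at the very start of the text is removed; texts in UTF-16 or UTF-32 are outside the scope of this sketch.
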
345  8.2. Unicode Characters

347  When all the strings represented in a JSON text are composed entirely
348  of Unicode characters [UNICODE] (however escaped), then that JSON
349  text is interoperable in the sense that all software implementations
350  which parse it will agree on the contents of names and of string
351  values in objects and arrays.

353  However, the ABNF in this specification allows member names and
354  string values to contain bit sequences which cannot encode Unicode
355  characters, for example "\uDEAD" (a single unpaired UTF-16
356  surrogate). Instances of this have been observed, for example when a
357  library truncates a UTF-16 string without checking whether the
358  truncation split a surrogate pair. The behavior of software which
359  receives JSON texts containing such values is unpredictable; for
360  example, implementations might return different values for the length
361  of a string value, or even suffer fatal runtime exceptions.

363  8.3. String Comparison

365  Software implementations are typically required to test names of
366  object members for equality. Implementations which transform the
367  textual representation into sequences of Unicode code units, and then
368  perform the comparison numerically, code unit by code unit, are
369  interoperable in the sense that implementations will agree in all
370  cases on equality or inequality of two strings. For example,
371  implementations which compare strings with escaped characters
372  unconverted may incorrectly find that "a\\b" and "a\u005Cb" are not
373  equal.

375  9. Parsers

377  A JSON parser transforms a JSON text into another representation. A
378  JSON parser MUST accept all texts that conform to the JSON grammar.
379  A JSON parser MAY accept non-JSON forms or extensions.

381  An implementation may set limits on the size of texts that it
382  accepts. An implementation may set limits on the maximum depth of
383  nesting. An implementation may set limits on the range and precision
384  of numbers. An implementation may set limits on the length and
385  character contents of strings.

387  10. Generators

389  A JSON generator produces JSON text. The resulting text MUST
390  strictly conform to the JSON grammar.

392  11. IANA Considerations

394  The MIME media type for JSON text is application/json.

396  Type name: application

398  Subtype name: json

400  Required parameters: n/a

401  Optional parameters: n/a

403  Encoding considerations: binary

405  Interoperability considerations: Described in this document

407  Published specification: This document

409  Applications that use this media type: JSON has been used to
410  exchange data between applications written in all of these
411  programming languages: ActionScript, C, C#, Clojure, ColdFusion,
412  Common Lisp, E, Erlang, Go, Java, JavaScript, Lua, Objective CAML,
413  Perl, PHP, Python, Rebol, Ruby, Scala, and Scheme.

415  Additional information: Magic number(s): n/a
416  File extension(s): .json
417  Macintosh file type code(s): TEXT

419  Person & email address to contact for further information: IESG
420

422  Intended usage: COMMON

424  Restrictions on usage: none

426  Author: Douglas Crockford
427

429  Change controller: IESG
430

432  12. Security Considerations

434  Generally there are security issues with scripting languages. JSON
435  is a subset of JavaScript, but excludes assignment and invocation.

437  Since JSON's syntax is borrowed from JavaScript, it is possible to
438  use that language's "eval()" function to parse JSON texts. This
439  generally constitutes an unacceptable security risk, since the text
440  could contain executable code along with data declarations. The same
441  consideration applies to the use of eval()-like functions in any
442  other programming language in which JSON texts conform to that
443  language's syntax.

445  13. Examples

447  This is a JSON object:

449  {
450    "Image": {
451      "Width": 800,
452      "Height": 600,
453      "Title": "View from 15th Floor",
454      "Thumbnail": {
455        "Url": "http://www.example.com/image/481989943",
456        "Height": 125,
457        "Width": 100
458      },
459      "Animated" : false,
460      "IDs": [116, 943, 234, 38793]
461    }
462  }

464  Its Image member is an object whose Thumbnail member is an object and
465  whose IDs member is an array of numbers.

467  This is a JSON array containing two objects:

469  [
470    {
471      "precision": "zip",
472      "Latitude": 37.7668,
473      "Longitude": -122.3959,
474      "Address": "",
475      "City": "SAN FRANCISCO",
476      "State": "CA",
477      "Zip": "94107",
478      "Country": "US"
479    },
480    {
481      "precision": "zip",
482      "Latitude": 37.371991,
483      "Longitude": -122.026020,
484      "Address": "",
485      "City": "SUNNYVALE",
486      "State": "CA",
487      "Zip": "94085",
488      "Country": "US"
489    }
490  ]

492  Here are three small JSON texts containing only values:

493  "Hello world!"

495  42

497  true

499  14. Contributors

501  RFC 4627 was written by Douglas Crockford. This document was
502  constructed by making a relatively small number of changes to that
503  document; thus the vast majority of the text here is his.

505  15. References

507  15.1. Normative References

509  [IEEE754] IEEE, "IEEE Standard for Floating-Point Arithmetic", 2008,
510            .

512  [RFC0020] Cerf, V., "ASCII format for network interchange", RFC 20,
513            October 1969.

515  [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
516            Requirement Levels", BCP 14, RFC 2119, March 1997.

518  [RFC5234] Crocker, D. and P. Overell, "Augmented BNF for Syntax
519            Specifications: ABNF", STD 68, RFC 5234, January 2008.

521  [UNICODE] The Unicode Consortium, "The Unicode Standard", 2003-,
522            .

524            Note that this reference is to the latest version of
525            Unicode, rather than to a specific release. It is not
526            expected that future changes in the UNICODE specification
527            will impact the syntax of JSON.

529  15.2. Informative References

531  [ECMA-262]
532            European Computer Manufacturers Association, "ECMAScript
533            Language Specification 5.1 Edition", June 2011,
534            .

536  [ECMA-404]
537            Ecma International, "The JSON Data Interchange Format",
538            October 2013, .

541  [RFC4627] Crockford, D., "The application/json Media Type for
542            JavaScript Object Notation (JSON)", RFC 4627, July 2006.

544  Appendix A. Changes from RFC 4627

546  This section lists changes between this document and the text in RFC
547  4627.

549  o Changed Working Group attribution to JSON Working Group.

551  o Changed title and abstract of document.

553  o Changed the reference to [UNICODE] to be non-version-specific.

555  o Added a "Specifications of JSON" section.

557  o Added an "Introduction to this Revision" section.

559  o Added language about duplicate object member names and
560    interoperability.

562  o Applied erratum #607 from RFC 4627 to correctly align the artwork
563    for the definition of "object".

565  o Changed "as sequences of digits" to "in the grammar below" in
566    "Numbers" section.

568  o Added language about number interoperability as a function of
569    IEEE754, and an IEEE754 reference.

571  o Added language about interoperability and Unicode characters, and
572    about string comparisons. To do this, turned the old "Encoding"
573    section into a "String and Character Issues" section, with three
574    subsections: The old "Encoding" material, and two new sections for
575    "Unicode Characters" and "String Comparison".

577  o Changed guidance in "Parsers" section to point out that
578    implementations may set limits on the range "and precision" of
579    numbers.

581  o Updated and tidied the "IANA Considerations" section.

583  o Made a real "Security Considerations" section, and lifted the text
584    out of the existing "IANA Considerations" section.

586  o Applied erratum #3607 from RFC 4627 by removing the security
587    consideration that begins "A JSON text can be safely passed" and
588    the JavaScript code that went with that consideration.

590  o Added a note to the "Security Considerations" section pointing out
591    the risks of using the "eval()" function in JavaScript or any
592    other language in which JSON texts conform to that language's
593    syntax.

595  o Changed "100" to 100 and added a boolean field, both in the first
596    example.

598  o Added "Contributors" section crediting Douglas Crockford.

600  o Added a reference to RFC4627.

602  o Moved the ECMAScript reference from Normative to Informative,
603    updated it to reference ECMAScript 5.1, and added reference to
604    ECMA 404.

606  Author's Address

608  Tim Bray (editor)
609  Google, Inc.

611  Email: tbray@textuality.com