idnits 2.17.1 draft-ietf-json-rfc4627bis-10.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- -- The draft header indicates that this document obsoletes RFC4627, but the abstract doesn't seem to mention this, which it should. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year == The document seems to contain a disclaimer for pre-RFC5378 work, but was first submitted on or after 10 November 2008. The disclaimer is usually necessary only for documents that revise or obsolete older RFCs, and that take significant amounts of text from those RFCs. If you can contact all authors of the source material and they are willing to grant the BCP78 rights to the IETF Trust, you can and should remove the disclaimer. Otherwise, the disclaimer is needed and you can ignore this comment. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (December 19, 2013) is 3752 days in the past. Is this intentional? 
Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) -- Looks like a reference, but probably isn't: '116' on line 482 -- Looks like a reference, but probably isn't: '943' on line 482 -- Looks like a reference, but probably isn't: '234' on line 482 -- Looks like a reference, but probably isn't: '38793' on line 482 == Unused Reference: 'RFC0020' is defined on line 534, but no explicit reference was found in the text -- Possible downref: Non-RFC (?) normative reference: ref. 'IEEE754' -- Possible downref: Non-RFC (?) normative reference: ref. 'UNICODE' -- Obsolete informational reference (is this intentional?): RFC 4627 (Obsoleted by RFC 7158, RFC 7159) Summary: 0 errors (**), 0 flaws (~~), 3 warnings (==), 9 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 JSON Working Group T. Bray, Ed. 3 Internet-Draft Google, Inc. 4 Obsoletes: 4627 (if approved) December 19, 2013 5 Intended status: Standards Track 6 Expires: June 22, 2014 8 The JSON Data Interchange Format 9 draft-ietf-json-rfc4627bis-10 11 Abstract 13 JavaScript Object Notation (JSON) is a lightweight, text-based, 14 language-independent data interchange format. It was derived from 15 the ECMAScript Programming Language Standard. JSON defines a small 16 set of formatting rules for the portable representation of structured 17 data. 19 This document removes inconsistencies with other specifications of 20 JSON, repairs specification errors, and offers experience-based 21 interoperability guidance. 23 Status of This Memo 25 This Internet-Draft is submitted in full conformance with the 26 provisions of BCP 78 and BCP 79. 
28 Internet-Drafts are working documents of the Internet Engineering 29 Task Force (IETF). Note that other groups may also distribute 30 working documents as Internet-Drafts. The list of current Internet- 31 Drafts is at http://datatracker.ietf.org/drafts/current/. 33 Internet-Drafts are draft documents valid for a maximum of six months 34 and may be updated, replaced, or obsoleted by other documents at any 35 time. It is inappropriate to use Internet-Drafts as reference 36 material or to cite them other than as "work in progress." 38 This Internet-Draft will expire on June 22, 2014. 40 Copyright Notice 42 Copyright (c) 2013 IETF Trust and the persons identified as the 43 document authors. All rights reserved. 45 This document is subject to BCP 78 and the IETF Trust's Legal 46 Provisions Relating to IETF Documents 47 (http://trustee.ietf.org/license-info) in effect on the date of 48 publication of this document. Please review these documents 49 carefully, as they describe your rights and restrictions with respect 50 to this document. Code Components extracted from this document must 51 include Simplified BSD License text as described in Section 4.e of 52 the Trust Legal Provisions and are provided without warranty as 53 described in the Simplified BSD License. 55 This document may contain material from IETF Documents or IETF 56 Contributions published or made publicly available before November 57 10, 2008. The person(s) controlling the copyright in some of this 58 material may not have granted the IETF Trust the right to allow 59 modifications of such material outside the IETF Standards Process. 60 Without obtaining an adequate license from the person(s) controlling 61 the copyright in such materials, this document may not be modified 62 outside the IETF Standards Process, and derivative works of it may 63 not be created outside the IETF Standards Process, except to format 64 it for publication as an RFC or to translate it into languages other 65 than English. 
67 Table of Contents 69 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 70 1.1. Conventions Used in This Document . . . . . . . . . . . . 3 71 1.2. Specifications of JSON . . . . . . . . . . . . . . . . . 3 72 1.3. Introduction to This Revision . . . . . . . . . . . . . . 4 73 2. JSON Grammar . . . . . . . . . . . . . . . . . . . . . . . . 4 74 3. Values . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 75 4. Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 76 5. Arrays . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 77 6. Numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 78 7. Strings . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 79 8. String and Character Issues . . . . . . . . . . . . . . . . . 8 80 8.1. Character Encoding . . . . . . . . . . . . . . . . . . . 8 81 8.2. Unicode Characters . . . . . . . . . . . . . . . . . . . 8 82 8.3. String Comparison . . . . . . . . . . . . . . . . . . . . 9 83 9. Parsers . . . . . . . . . . . . . . . . . . . . . . . . . . . 9 84 10. Generators . . . . . . . . . . . . . . . . . . . . . . . . . 9 85 11. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 9 86 12. Security Considerations . . . . . . . . . . . . . . . . . . . 10 87 13. Examples . . . . . . . . . . . . . . . . . . . . . . . . . . 11 88 14. Contributors . . . . . . . . . . . . . . . . . . . . . . . . 12 89 15. References . . . . . . . . . . . . . . . . . . . . . . . . . 12 90 15.1. Normative References . . . . . . . . . . . . . . . . . . 12 91 15.2. Informative References . . . . . . . . . . . . . . . . . 12 92 Appendix A. Changes from RFC 4627 . . . . . . . . . . . . . . . 13 93 Author's Address . . . . . . . . . . . . . . . . . . . . . . . . 14 95 1. Introduction 96 JavaScript Object Notation (JSON) is a text format for the 97 serialization of structured data. 
It is derived from the object 98 literals of JavaScript, as defined in the ECMAScript Programming 99 Language Standard, Third Edition [ECMA-262]. 101 JSON can represent four primitive types (strings, numbers, booleans, 102 and null) and two structured types (objects and arrays). 104 A string is a sequence of zero or more Unicode characters [UNICODE]. 106 An object is an unordered collection of zero or more name/value 107 pairs, where a name is a string and a value is a string, number, 108 boolean, null, object, or array. 110 An array is an ordered sequence of zero or more values. 112 The terms "object" and "array" come from the conventions of 113 JavaScript. 115 JSON's design goals were for it to be minimal, portable, textual, and 116 a subset of JavaScript. 118 1.1. Conventions Used in This Document 120 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 121 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 122 document are to be interpreted as described in [RFC2119]. 124 The grammatical rules in this document are to be interpreted as 125 described in [RFC5234]. 127 1.2. Specifications of JSON 129 This document is an update of [RFC4627], which described JSON and 130 registered the Media Type "application/json". 132 A description of JSON in ECMAScript terms appears in version 5.1 of 133 the ECMAScript specification [ECMA-262], section 15.12. JSON is also 134 described in [ECMA-404]. 136 All of the specifications of JSON syntax agree on the syntactic 137 elements of the language. 139 1.3. Introduction to This Revision 141 In the years since the publication of RFC 4627, JSON has found very 142 wide use. This experience has revealed certain patterns which, while 143 allowed by its specifications, have caused interoperability problems. 145 Also, a small number of errata have been reported. 
147 The revision's goal is to apply the errata, remove inconsistencies 148 with other specifications of JSON, and highlight practices which can 149 lead to interoperability problems. 151 2. JSON Grammar 153 A JSON text is a sequence of tokens. The set of tokens includes six 154 structural characters, strings, numbers, and three literal names. 156 A JSON text is a serialized value. Note that certain previous 157 specifications of JSON constrained a JSON text to be an object or an 158 array. Implementations which generate only objects or arrays where a 159 JSON text is called for will be interoperable in the sense that all 160 implementations will accept these as conforming JSON texts. 162 JSON-text = ws value ws 164 These are the six structural characters: 166 begin-array = ws %x5B ws ; [ left square bracket 168 begin-object = ws %x7B ws ; { left curly bracket 170 end-array = ws %x5D ws ; ] right square bracket 172 end-object = ws %x7D ws ; } right curly bracket 174 name-separator = ws %x3A ws ; : colon 176 value-separator = ws %x2C ws ; , comma 178 Insignificant whitespace is allowed before or after any of the six 179 structural characters. 181 ws = *( 182 %x20 / ; Space 183 %x09 / ; Horizontal tab 184 %x0A / ; Line feed or New line 185 %x0D ) ; Carriage return 187 3. Values 189 A JSON value MUST be an object, array, number, or string, or one of 190 the following three literal names: 192 false null true 194 The literal names MUST be lowercase. No other literal names are 195 allowed. 197 value = false / null / true / object / array / number / string 199 false = %x66.61.6c.73.65 ; false 201 null = %x6e.75.6c.6c ; null 203 true = %x74.72.75.65 ; true 205 4. Objects 207 An object structure is represented as a pair of curly brackets 208 surrounding zero or more name/value pairs (or members). A name is a 209 string. A single colon comes after each name, separating the name 210 from the value. A single comma separates a value from a following 211 name. 
The names within an object SHOULD be unique. 213 object = begin-object [ member *( value-separator member ) ] 214 end-object 216 member = string name-separator value 218 An object whose names are all unique is interoperable in the sense 219 that all software implementations which receive that object will 220 agree on the name-value mappings. When the names within an object 221 are not unique, the behavior of software that receives such an object 222 is unpredictable. Many implementations report the last name/value 223 pair only; other implementations report an error or fail to parse the 224 object; other implementations report all of the name/value pairs, 225 including duplicates. 227 JSON parsing libraries have been observed to differ as to whether or 228 not they make the ordering of object members visible to calling 229 software. Implementations whose behavior does not depend on member 230 ordering will be interoperable in the sense that they will not be 231 affected by these differences. 233 5. Arrays 235 An array structure is represented as square brackets surrounding zero 236 or more values (or elements). Elements are separated by commas. 238 array = begin-array [ value *( value-separator value ) ] end-array 240 There is no requirement that the values in an array be of the same 241 type. 243 6. Numbers 245 The representation of numbers is similar to that used in most 246 programming languages. A number is represented in base 10 using 247 decimal digits. It contains an integer component that may be 248 prefixed with an optional minus sign, which may be followed by a 249 fraction part and/or an exponent part. Leading zeros are not 250 allowed. 252 A fraction part is a decimal point followed by one or more digits. 254 An exponent part begins with the letter E in upper or lowercase, 255 which may be followed by a plus or minus sign. The E and optional 256 sign are followed by one or more digits. 
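As an illustration only, the prose description of numbers above can be sketched as a regular expression in Python. The names JSON_NUMBER and is_json_number are hypothetical and are not defined by this document; the grammar in this section remains the authoritative definition.

```python
import re

# Hypothetical helper, not part of the specification: a regular expression
# mirroring the prose above (optional minus sign, integer part without
# leading zeros, optional fraction part, optional exponent part).
JSON_NUMBER = re.compile(r"-?(?:0|[1-9][0-9]*)(?:\.[0-9]+)?(?:[eE][+-]?[0-9]+)?")


def is_json_number(text: str) -> bool:
    """Return True if text, taken as a whole, has the shape of a JSON number."""
    return JSON_NUMBER.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in ["0", "-0", "12.5", "1E400", "01", "+1", ".5", "1.", "NaN"]:
        print(sample, is_json_number(sample))
```

The pattern uses [0-9] rather than \d because, in Python, \d also matches non-ASCII decimal digits, which the description above does not allow.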
258 Numeric values that cannot be represented in the grammar below (such 259 as Infinity and NaN) are not permitted. 261 number = [ minus ] int [ frac ] [ exp ] 263 decimal-point = %x2E ; . 265 digit1-9 = %x31-39 ; 1-9 267 e = %x65 / %x45 ; e E 269 exp = e [ minus / plus ] 1*DIGIT 271 frac = decimal-point 1*DIGIT 273 int = zero / ( digit1-9 *DIGIT ) 275 minus = %x2D ; - 276 plus = %x2B ; + 278 zero = %x30 ; 0 280 This specification allows implementations to set limits on the range 281 and precision of numbers accepted. Since software which implements 282 IEEE 754-2008 binary64 (double precision) numbers [IEEE754] is 283 generally available and widely used, good interoperability can be 284 achieved by implementations which expect no more precision or range 285 than these provide, in the sense that implementations will 286 approximate JSON numbers within the expected precision. A JSON 287 number such as 1E400 or 3.141592653589793238462643383279 may indicate 288 potential interoperability problems since it suggests that the 289 software which created it expected greater magnitude or precision 290 than is widely available. 292 Note that when such software is used, numbers which are integers and 293 are in the range [-(2**53)+1, (2**53)-1] are interoperable in the 294 sense that implementations will agree exactly on their numeric 295 values. 297 7. Strings 299 The representation of strings is similar to conventions used in the C 300 family of programming languages. A string begins and ends with 301 quotation marks. All Unicode characters may be placed within the 302 quotation marks except for the characters that must be escaped: 303 quotation mark, reverse solidus, and the control characters (U+0000 304 through U+001F). 306 Any character may be escaped. 
If the character is in the Basic 307 Multilingual Plane (U+0000 through U+FFFF), then it may be 308 represented as a six-character sequence: a reverse solidus, followed 309 by the lowercase letter u, followed by four hexadecimal digits that 310 encode the character's code point. The hexadecimal letters A through 311 F can be upper or lowercase. So, for example, a string containing 312 only a single reverse solidus character may be represented as 313 "\u005C". 315 Alternatively, there are two-character sequence escape 316 representations of some popular characters. So, for example, a 317 string containing only a single reverse solidus character may be 318 represented more compactly as "\\". 320 To escape an extended character that is not in the Basic Multilingual 321 Plane, the character is represented as a twelve-character sequence, 322 encoding the UTF-16 surrogate pair. So, for example, a string 323 containing only the G clef character (U+1D11E) may be represented as 324 "\uD834\uDD1E". 326 string = quotation-mark *char quotation-mark 328 char = unescaped / 329 escape ( 330 %x22 / ; " quotation mark U+0022 331 %x5C / ; \ reverse solidus U+005C 332 %x2F / ; / solidus U+002F 333 %x62 / ; b backspace U+0008 334 %x66 / ; f form feed U+000C 335 %x6E / ; n line feed U+000A 336 %x72 / ; r carriage return U+000D 337 %x74 / ; t tab U+0009 338 %x75 4HEXDIG ) ; uXXXX U+XXXX 340 escape = %x5C ; \ 342 quotation-mark = %x22 ; " 344 unescaped = %x20-21 / %x23-5B / %x5D-10FFFF 346 8. String and Character Issues 348 8.1. Character Encoding 350 JSON text SHALL be encoded in UTF-8, UTF-16, or UTF-32. The default 351 encoding is UTF-8, and JSON texts which are encoded in UTF-8 are 352 interoperable in the sense that they will be read successfully by the 353 maximum number of implementations; there are many implementations 354 which cannot successfully read texts in other encodings (such as 355 UTF-16 and UTF-32). 
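The escape forms of Section 7 and the UTF-8 default described above can be checked with a short Python sketch. It assumes the Python standard library json module as the parser; the variable names are illustrative only and do not come from this document.

```python
import json

# Both escape forms denote the same one-character string: a reverse solidus.
six_char_form = json.loads(r'"\u005C"')
two_char_form = json.loads(r'"\\"')

# A character outside the Basic Multilingual Plane is written as an escaped
# UTF-16 surrogate pair; this parser yields the single character U+1D11E.
g_clef = json.loads(r'"\uD834\uDD1E"')

# Serialize the string unescaped and encode the JSON text in UTF-8,
# the default encoding.
utf8_bytes = json.dumps(g_clef, ensure_ascii=False).encode("utf-8")

if __name__ == "__main__":
    print(six_char_form == two_char_form)
    print(hex(ord(g_clef)))
    print(utf8_bytes)
```

Other parsers may expose strings differently (for example, as UTF-16 code units), so the length of the G clef string is not necessarily 1 in every language.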
357 Implementations MUST NOT add a byte order mark to the beginning of a 358 JSON text. In the interests of interoperability, implementations 359 which parse JSON texts MAY ignore the presence of a byte order mark 360 rather than treating it as an error. 362 8.2. Unicode Characters 364 When all the strings represented in a JSON text are composed entirely 365 of Unicode characters [UNICODE] (however escaped), then that JSON 366 text is interoperable in the sense that all software implementations 367 which parse it will agree on the contents of names and of string 368 values in objects and arrays. 370 However, the ABNF in this specification allows member names and 371 string values to contain bit sequences which cannot encode Unicode 372 characters, for example "\uDEAD" (a single unpaired UTF-16 373 surrogate). Instances of this have been observed, for example when a 374 library truncates a UTF-16 string without checking whether the 375 truncation split a surrogate pair. The behavior of software which 376 receives JSON texts containing such values is unpredictable; for 377 example, implementations might return different values for the length 378 of a string value, or even suffer fatal runtime exceptions. 380 8.3. String Comparison 382 Software implementations are typically required to test names of 383 object members for equality. Implementations which transform the 384 textual representation into sequences of Unicode code units, and then 385 perform the comparison numerically, code unit by code unit, are 386 interoperable in the sense that implementations will agree in all 387 cases on equality or inequality of two strings. For example, 388 implementations which compare strings with escaped characters 389 unconverted may incorrectly find that "a\\b" and "a\u005Cb" are not 390 equal. 392 9. Parsers 394 A JSON parser transforms a JSON text into another representation. A 395 JSON parser MUST accept all texts that conform to the JSON grammar. 
396 A JSON parser MAY accept non-JSON forms or extensions. 398 An implementation may set limits on the size of texts that it 399 accepts. An implementation may set limits on the maximum depth of 400 nesting. An implementation may set limits on the range and precision 401 of numbers. An implementation may set limits on the length and 402 character contents of strings. 404 10. Generators 406 A JSON generator produces JSON text. The resulting text MUST 407 strictly conform to the JSON grammar. 409 11. IANA Considerations 411 The MIME media type for JSON text is application/json. 413 Type name: application 415 Subtype name: json 417 Required parameters: n/a 418 Optional parameters: n/a 420 Encoding considerations: binary 422 Security considerations: See [this RFC], section 12. 424 Interoperability considerations: Described in this document 426 Published specification: This document 428 Applications that use this media type: JSON has been used to 429 exchange data between applications written in all of these 430 programming languages: ActionScript, C, C#, Clojure, ColdFusion, 431 Common Lisp, E, Erlang, Go, Java, JavaScript, Lua, Objective CAML, 432 Perl, PHP, Python, Rebol, Ruby, Scala, and Scheme. 434 Additional information: Magic number(s): n/a 435 File extension(s): .json 436 Macintosh file type code(s): TEXT 438 Person & email address to contact for further information: IESG 439 441 Intended usage: COMMON 443 Restrictions on usage: none 445 Author: Douglas Crockford 446 448 Change controller: IESG 449 451 Note: No "charset" parameter is defined for this registration. 452 Adding one really has no effect on compliant recipients. 454 12. Security Considerations 456 Generally there are security issues with scripting languages. JSON 457 is a subset of JavaScript, but excludes assignment and invocation. 459 Since JSON's syntax is borrowed from JavaScript, it is possible to 460 use that language's "eval()" function to parse JSON texts. 
This 461 generally constitutes an unacceptable security risk, since the text 462 could contain executable code along with data declarations. The same 463 consideration applies to the use of eval()-like functions in any 464 other programming language in which JSON texts conform to that 465 language's syntax. 467 13. Examples 469 This is a JSON object: 471 { 472 "Image": { 473 "Width": 800, 474 "Height": 600, 475 "Title": "View from 15th Floor", 476 "Thumbnail": { 477 "Url": "http://www.example.com/image/481989943", 478 "Height": 125, 479 "Width": 100 480 }, 481 "Animated" : false, 482 "IDs": [116, 943, 234, 38793] 483 } 484 } 486 Its Image member is an object whose Thumbnail member is an object and 487 whose IDs member is an array of numbers. 489 This is a JSON array containing two objects: 491 [ 492 { 493 "precision": "zip", 494 "Latitude": 37.7668, 495 "Longitude": -122.3959, 496 "Address": "", 497 "City": "SAN FRANCISCO", 498 "State": "CA", 499 "Zip": "94107", 500 "Country": "US" 501 }, 502 { 503 "precision": "zip", 504 "Latitude": 37.371991, 505 "Longitude": -122.026020, 506 "Address": "", 507 "City": "SUNNYVALE", 508 "State": "CA", 509 "Zip": "94085", 510 "Country": "US" 511 } 512 ] 513 Here are three small JSON texts containing only values: 515 "Hello world!" 517 42 519 true 521 14. Contributors 523 RFC 4627 was written by Douglas Crockford. This document was 524 constructed by making a relatively small number of changes to that 525 document; thus the vast majority of the text here is his. 527 15. References 529 15.1. Normative References 531 [IEEE754] IEEE, "IEEE Standard for Floating-Point Arithmetic", 2008, 532 . 534 [RFC0020] Cerf, V., "ASCII format for network interchange", RFC 20, 535 October 1969. 537 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 538 Requirement Levels", BCP 14, RFC 2119, March 1997. 540 [RFC5234] Crocker, D. and P. Overell, "Augmented BNF for Syntax 541 Specifications: ABNF", STD 68, RFC 5234, January 2008. 
543 [UNICODE] The Unicode Consortium, "The Unicode Standard", 2003-, 544 . 546 Note that this reference is to the latest version of 547 Unicode, rather than to a specific release. It is not 548 expected that future changes in the UNICODE specification 549 will impact the syntax of JSON. 551 15.2. Informative References 553 [ECMA-262] 554 European Computer Manufacturers Association, "ECMAScript 555 Language Specification 5.1 Edition ", June 2011, 556 . 558 [ECMA-404] 559 Ecma International, "The JSON Data Interchange Format ", 560 October 2013, . 563 [RFC4627] Crockford, D., "The application/json Media Type for 564 JavaScript Object Notation (JSON)", RFC 4627, July 2006. 566 Appendix A. Changes from RFC 4627 568 This section lists changes between this document and the text in RFC 569 4627. 571 o Changed title and abstract of document. 573 o Changed the reference to [UNICODE] to be non-version-specific. 575 o Added a "Specifications of JSON" section. 577 o Added an "Introduction to This Revision" section. 579 o Changed the definition of "JSON text" so that it can be any JSON 580 value, removing the constraint that it be an object or array. 582 o Added language about duplicate object member names, member 583 ordering, and interoperability. 585 o Clarified the absence of a requirement that values in an array be 586 of the same JSON type. 588 o Applied erratum #607 from RFC 4627 to correctly align the artwork 589 for the definition of "object". 591 o Changed "as sequences of digits" to "in the grammar below" in 592 "Numbers" section, and made base-10-ness explicit. 594 o Added language about number interoperability as a function of 595 IEEE754, and an IEEE754 reference. 597 o Added language about interoperability and Unicode characters, and 598 about string comparisons. To do this, turned the old "Encoding" 599 section into a "String and Character Issues" section, with three 600 subsections: "Character Encoding", "Unicode Characters" and 601 "String Comparison". 
603 o Changed guidance in "Parsers" section to point out that 604 implementations may set limits on the range "and precision" of 605 numbers. 607 o Updated and tidied the "IANA Considerations" section. 609 o Made a real "Security Considerations" section, and lifted the text 610 out of the existing "IANA Considerations" section. 612 o Applied erratum #3607 from RFC 4627 by removing the security 613 consideration that begins "A JSON text can be safely passed" and 614 the JavaScript code that went with that consideration. 616 o Added a note to the "Security Considerations" section pointing out 617 the risks of using the "eval()" function in JavaScript or any 618 other language in which JSON texts conform to that language's 619 syntax. 621 o Added a note to IANA considerations clarifying the absence of a 622 "charset" parameter for the application/json media type. 624 o Changed "100" to 100 and added a boolean field, both in the first 625 example. 627 o Added examples of JSON texts which are simple values, neither objects 628 nor arrays. 630 o Added "Contributors" section crediting Douglas Crockford. 632 o Added a reference to RFC4627. 634 o Moved the ECMAScript reference from Normative to Informative, 635 updated it to reference ECMAScript 5.1, and added reference to 636 ECMA 404. 638 Author's Address 640 Tim Bray (editor) 641 Google, Inc. 643 Email: tbray@textuality.com