| < draft-ietf-json-rfc4627bis-01.txt | draft-ietf-json-rfc4627bis-02.txt > | |||
|---|---|---|---|---|
| Operations Area Working Group D. Crockford | Operations Area Working Group D. Crockford | |||
| Internet-Draft JSON.org | Internet-Draft JSON.org | |||
| Intended status: Standards Track June 06, 2013 | Intended status: Standards Track June 05, 2013 | |||
| Expires: December 08, 2013 | Expires: December 07, 2013 | |||
| The JSON Data Interchange Format | The application/json Media Type for JavaScript Object Notation (JSON) | |||
| draft-ietf-json-rfc4627bis-01 | draft-ietf-json-rfc4627bis-02 | |||
| Abstract | Abstract | |||
| JSON is a lightweight, text-based, language-independent data | JavaScript Object Notation (JSON) is a lightweight, text-based, | |||
| interchange format. It was derived from the ECMAScript Programming | language-independent data interchange format. It was derived from | |||
| Language Standard. JSON defines a small set of formatting rules for | the ECMAScript Programming Language Standard. JSON defines a small | |||
| the portable representation of structured data. | set of formatting rules for the portable representation of structured | |||
| data. | ||||
| Status of This Memo | Status of This Memo | |||
| This Internet-Draft is submitted in full conformance with the | This Internet-Draft is submitted in full conformance with the | |||
| provisions of BCP 78 and BCP 79. | provisions of BCP 78 and BCP 79. | |||
| Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
| Task Force (IETF). Note that other groups may also distribute | Task Force (IETF). Note that other groups may also distribute | |||
| working documents as Internet-Drafts. The list of current Internet- | working documents as Internet-Drafts. The list of current Internet- | |||
| Drafts is at http://datatracker.ietf.org/drafts/current/. | Drafts is at http://datatracker.ietf.org/drafts/current/. | |||
| Internet-Drafts are draft documents valid for a maximum of six months | Internet-Drafts are draft documents valid for a maximum of six months | |||
| and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
| time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
| material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
| This Internet-Draft will expire on December 08, 2013. | This Internet-Draft will expire on December 07, 2013. | |||
| Copyright Notice | Copyright Notice | |||
| Copyright (c) 2013 IETF Trust and the persons identified as the | Copyright (c) 2013 IETF Trust and the persons identified as the | |||
| document authors. All rights reserved. | document authors. All rights reserved. | |||
| This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
| Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
| (http://trustee.ietf.org/license-info) in effect on the date of | (http://trustee.ietf.org/license-info) in effect on the date of | |||
| publication of this document. Please review these documents | publication of this document. Please review these documents | |||
| skipping to change at page 2, line 16 ¶ | skipping to change at page 2, line 16 ¶ | |||
| 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 | 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 | |||
| 1.1. Conventions Used in This Document . . . . . . . . . . . . 2 | 1.1. Conventions Used in This Document . . . . . . . . . . . . 2 | |||
| 1.2. Changes from RFC 4627 . . . . . . . . . . . . . . . . . . 3 | 1.2. Changes from RFC 4627 . . . . . . . . . . . . . . . . . . 3 | |||
| 2. JSON Grammar . . . . . . . . . . . . . . . . . . . . . . . . 3 | 2. JSON Grammar . . . . . . . . . . . . . . . . . . . . . . . . 3 | |||
| 2.1. Values . . . . . . . . . . . . . . . . . . . . . . . . . 4 | 2.1. Values . . . . . . . . . . . . . . . . . . . . . . . . . 4 | |||
| 2.2. Objects . . . . . . . . . . . . . . . . . . . . . . . . . 4 | 2.2. Objects . . . . . . . . . . . . . . . . . . . . . . . . . 4 | |||
| 2.3. Arrays . . . . . . . . . . . . . . . . . . . . . . . . . 4 | 2.3. Arrays . . . . . . . . . . . . . . . . . . . . . . . . . 4 | |||
| 2.4. Numbers . . . . . . . . . . . . . . . . . . . . . . . . . 4 | 2.4. Numbers . . . . . . . . . . . . . . . . . . . . . . . . . 4 | |||
| 2.5. Strings . . . . . . . . . . . . . . . . . . . . . . . . . 5 | 2.5. Strings . . . . . . . . . . . . . . . . . . . . . . . . . 5 | |||
| 3. Parsers . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 | 3. Encoding . . . . . . . . . . . . . . . . . . . . . . . . . . 6 | |||
| 4. Generators . . . . . . . . . . . . . . . . . . . . . . . . . 7 | 4. Parsers . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 | |||
| 5. Security Considerations . . . . . . . . . . . . . . . . . . . 7 | 5. Generators . . . . . . . . . . . . . . . . . . . . . . . . . 7 | |||
| 6. Examples . . . . . . . . . . . . . . . . . . . . . . . . . . 8 | 6. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 7 | |||
| 7. Normative References . . . . . . . . . . . . . . . . . . . . 9 | 7. Security Considerations . . . . . . . . . . . . . . . . . . . 8 | |||
| Author's Address . . . . . . . . . . . . . . . . . . . . . . . . 9 | 8. Examples . . . . . . . . . . . . . . . . . . . . . . . . . . 9 | |||
| 9. Normative References . . . . . . . . . . . . . . . . . . . . 10 | ||||
| Author's Address . . . . . . . . . . . . . . . . . . . . . . . . 11 | ||||
| 1. Introduction | 1. Introduction | |||
| JSON is a text format for the serialization of structured data. It | JavaScript Object Notation (JSON) is a text format for the | |||
| was inspired by the object literals of JavaScript, as defined in the | serialization of structured data. It is derived from the object | |||
| ECMAScript Programming Language Standard, Fifth Edition[ECMA]. | literals of JavaScript, as defined in the ECMAScript Programming | |||
| Language Standard, Third Edition [ECMA]. | ||||
| JSON can represent four primitive types (strings, numbers, booleans, | JSON can represent four primitive types (strings, numbers, booleans, | |||
| and null) and two structured types (objects and arrays). | and null) and two structured types (objects and arrays). | |||
| A string is a sequence of zero or more characters. | A string is a sequence of zero or more Unicode characters [UNICODE]. | |||
| An object is an unordered collection of zero or more name/value | An object is an unordered collection of zero or more name/value | |||
| pairs, where a name is a string and a value is a string, number, | pairs, where a name is a string and a value is a string, number, | |||
| boolean, null, object, or array. | boolean, null, object, or array. | |||
| An array is an ordered sequence of zero or more values. | An array is an ordered sequence of zero or more values. | |||
| The terms "object" and "array" come from the conventions of | The terms "object" and "array" come from the conventions of | |||
| JavaScript. | JavaScript. | |||
| JSON's design goals were for it to be minimal, portable, textual, and | JSON's design goals were for it to be minimal, portable, textual, and | |||
| a subset of JavaScript. JSON stands for JavaScript Object Notation. | a subset of JavaScript. | |||
| 1.1. Conventions Used in This Document | 1.1. Conventions Used in This Document | |||
| The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | |||
| "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | |||
| document are to be interpreted as described in [RFC2119]. | document are to be interpreted as described in [RFC2119]. | |||
| The grammatical rules in this document are to be interpreted as | The grammatical rules in this document are to be interpreted as | |||
| described in [RFC5234]. | described in [RFC4234]. | |||
| 1.2. Changes from RFC 4627 | 1.2. Changes from RFC 4627 | |||
| This section lists all changes between this document and the text in | This section lists all changes between this document and the text in | |||
| RFC 4627. | RFC 4627. | |||
| o Applied errata #607 from RFC 4627 to correctly align the artwork | o Applied errata #607 from RFC 4627 to correctly align the artwork | |||
| for the definition of "object". | for the definition of "object". | |||
| 2. JSON Grammar | 2. JSON Grammar | |||
| skipping to change at page 4, line 29 ¶ | skipping to change at page 4, line 31 ¶ | |||
| null = %x6e.75.6c.6c ; null | null = %x6e.75.6c.6c ; null | |||
| true = %x74.72.75.65 ; true | true = %x74.72.75.65 ; true | |||
| 2.2. Objects | 2.2. Objects | |||
| An object structure is represented as a pair of curly brackets | An object structure is represented as a pair of curly brackets | |||
| surrounding zero or more name/value pairs (or members). A name is a | surrounding zero or more name/value pairs (or members). A name is a | |||
| string. A single colon comes after each name, separating the name | string. A single colon comes after each name, separating the name | |||
| from the value. A single comma separates a value from a following | from the value. A single comma separates a value from a following | |||
| name. The names within an object SHOULD be unique. If a key is | name. The names within an object SHOULD be unique. | |||
| duplicated, a parser MAY reject. If it does not reject, then it MUST | ||||
| take only the last of the duplicated key pairs. | ||||
| object = begin-object [ member *( value-separator member ) ] | object = begin-object [ member *( value-separator member ) ] | |||
| end-object | end-object | |||
| member = string name-separator value | member = string name-separator value | |||
| 2.3. Arrays | 2.3. Arrays | |||
| An array structure is represented as square brackets surrounding zero | An array structure is represented as square brackets surrounding zero | |||
| or more values (or elements). Elements are separated by commas. | or more values (or elements). Elements are separated by commas. | |||
| array = begin-array [ value *( value-separator value ) ] end-array | array = begin-array [ value *( value-separator value ) ] end-array | |||
| 2.4. Numbers | 2.4. Numbers | |||
| The representation of numbers is similar to that used in most | ||||
| programming languages. A number contains an integer component that | ||||
| may be prefixed with an optional minus sign, which may be followed by | ||||
| a fraction part and/or an exponent part. | ||||
| A number is represented in base 10 with no superfluous leading zeroes | Octal and hex forms are not allowed. Leading zeros are not allowed. | |||
| or punctuation such as commas or spaces. It may have a preceding | ||||
| minus sign. It may have a "."-prefixed fractional part. It may have | A fraction part is a decimal point followed by one or more digits. | |||
| an exponent, prefixed by "e" or "E" and optionally "+" or "-". | ||||
| An exponent part begins with the letter E in upper or lowercase, | ||||
| which may be followed by a plus or minus sign. The E and optional | ||||
| sign are followed by one or more digits. | ||||
| Numeric values that cannot be represented as sequences of digits | Numeric values that cannot be represented as sequences of digits | |||
| (such as Infinity and NaN) are not permitted. | (such as Infinity and NaN) are not permitted. | |||
| number = [ minus ] int [ frac ] [ exp ] | number = [ minus ] int [ frac ] [ exp ] | |||
| decimal-point = %x2E ; . | decimal-point = %x2E ; . | |||
| digit1-9 = %x31-39 ; 1-9 | digit1-9 = %x31-39 ; 1-9 | |||
| skipping to change at page 5, line 33 ¶ | skipping to change at page 5, line 43 ¶ | |||
| minus = %x2D ; - | minus = %x2D ; - | |||
| plus = %x2B ; + | plus = %x2B ; + | |||
| zero = %x30 ; 0 | zero = %x30 ; 0 | |||
| 2.5. Strings | 2.5. Strings | |||
| The representation of strings is similar to conventions used in the C | The representation of strings is similar to conventions used in the C | |||
| family of programming languages. A string is a sequence of code | family of programming languages. A string begins and ends with | |||
| units wrapped with quotation marks. All characters may be placed | quotation marks. All Unicode characters may be placed within the | |||
| within the quotation marks except for the characters that must be | quotation marks except for the characters that must be escaped: | |||
| escaped: quotation mark, reverse solidus, and the control characters | quotation mark, reverse solidus, and the control characters (U+0000 | |||
| (U+0000 through U+001F). | through U+001F). | |||
| Any character may be escaped. If the character is in the Basic | Any character may be escaped. If the character is in the Basic | |||
| Multilingual Plane (U+0000 through U+FFFF), then it may be | Multilingual Plane (U+0000 through U+FFFF), then it may be | |||
| represented as a six-character sequence: a reverse solidus, followed | represented as a six-character sequence: a reverse solidus, followed | |||
| by the lowercase letter u, followed by four hexadecimal digits that | by the lowercase letter u, followed by four hexadecimal digits that | |||
| encode the character's Unicode code point. The hexadecimal letters A | encode the character's code point. The hexadecimal letters A through | |||
| through F can be upper or lowercase. So, for example, a string | F can be upper or lowercase. So, for example, a string containing | |||
| containing only a single reverse solidus character may be represented | only a single reverse solidus character may be represented as | |||
| as "\u005C". | "\u005C". | |||
| Alternatively, there are two-character sequence escape | Alternatively, there are two-character sequence escape | |||
| representations of some popular characters. So, for example, a | representations of some popular characters. So, for example, a | |||
| string containing only a single reverse solidus character may be | string containing only a single reverse solidus character may be | |||
| represented more compactly as "\\". | represented more compactly as "\\". | |||
| To escape an extended character that is not in the Basic Multilingual | ||||
| Plane, the character is represented as a twelve-character sequence, | ||||
| encoding the UTF-16 surrogate pair. So, for example, a string | ||||
| containing only the G clef character (U+1D11E) may be represented as | ||||
| "\uD834\uDD1E". | ||||
| string = quotation-mark *char quotation-mark | string = quotation-mark *char quotation-mark | |||
| char = unescaped / | char = unescaped / | |||
| escape ( | escape ( | |||
| %x22 / ; " quotation mark U+0022 | %x22 / ; " quotation mark U+0022 | |||
| %x5C / ; \ reverse solidus U+005C | %x5C / ; \ reverse solidus U+005C | |||
| %x2F / ; / solidus U+002F | %x2F / ; / solidus U+002F | |||
| %x62 / ; b backspace U+0008 | %x62 / ; b backspace U+0008 | |||
| %x66 / ; f form feed U+000C | %x66 / ; f form feed U+000C | |||
| %x6E / ; n line feed U+000A | %x6E / ; n line feed U+000A | |||
| %x72 / ; r carriage return U+000D | %x72 / ; r carriage return U+000D | |||
| %x74 / ; t tab U+0009 | %x74 / ; t tab U+0009 | |||
| %x75 4HEXDIG ) ; uXXXX U+XXXX | %x75 4HEXDIG ) ; uXXXX U+XXXX | |||
| escape = %x5C ; \ | escape = %x5C ; \ | |||
| quotation-mark = %x22 ; " | quotation-mark = %x22 ; " | |||
| unescaped = %x20-21 / %x23-5B / %x5D-10FFFF | unescaped = %x20-21 / %x23-5B / %x5D-10FFFF | |||
| The following four cases MUST all produce the same result: | 3. Encoding | |||
| "\u002F" | JSON text SHALL be encoded in Unicode. The default encoding is | |||
| "\u002F" | UTF-8. | |||
| "\/" | ||||
| "/" | ||||
| To escape an extended character that is not in the Basic Multilingual | Since the first two characters of a JSON text will always be ASCII | |||
| Plane, the character is represented as a twelve-character sequence, | characters [RFC0020], it is possible to determine whether an octet | |||
| encoding the UTF-16 surrogate pair. So for example, a string | stream is UTF-8, UTF-16 (BE or LE), or UTF-32 (BE or LE) by looking | |||
| containing only the G clef character (U+1D11E) may be represented as | at the pattern of nulls in the first four octets. | |||
| "\uD834\uDD1E". A generator SHOULD NOT emit unpaired surrogates. A | ||||
| parser MAY reject JSON text containing unpaired surrogates. | ||||
| 3. Parsers | 00 00 00 xx UTF-32BE | |||
| 00 xx 00 xx UTF-16BE | ||||
| xx 00 00 00 UTF-32LE | ||||
| xx 00 xx 00 UTF-16LE | ||||
| xx xx xx xx UTF-8 | ||||
| 4. Parsers | ||||
| A JSON parser transforms a JSON text into another representation. A | A JSON parser transforms a JSON text into another representation. A | |||
| JSON parser MUST accept all texts that conform to the JSON grammar. | JSON parser MUST accept all texts that conform to the JSON grammar. | |||
| A JSON parser MAY accept non-JSON forms or extensions. | A JSON parser MAY accept non-JSON forms or extensions. | |||
| An implementation may set limits on the size of texts that it | An implementation may set limits on the size of texts that it | |||
| accepts. An implementation may set limits on the maximum depth of | accepts. An implementation may set limits on the maximum depth of | |||
| nesting. An implementation may set limits on the range of numbers. | nesting. An implementation may set limits on the range of numbers. | |||
| An implementation may set limits on the length and character contents | An implementation may set limits on the length and character contents | |||
| of strings. | of strings. | |||
| 4. Generators | 5. Generators | |||
| A JSON generator produces JSON text. The resulting text MUST | A JSON generator produces JSON text. The resulting text MUST | |||
| strictly conform to the JSON grammar. | strictly conform to the JSON grammar. | |||
| 5. Security Considerations | 6. IANA Considerations | |||
| With any data format, it is important to encode correctly. Care must | The MIME media type for JSON text is application/json. | |||
| be taken when constructing JSON texts by concatenation. For example: | ||||
| account = 4627; | Type name: application | |||
| comment = "\",\"account\":262"; // provided by attacker | ||||
| json_text = "{\"account\":" + account + ",\"comment\":\"" + comment + "\"}"; | ||||
| The result will be | Subtype name: json | |||
| {"account":4627,"comment":"","account":262} | Required parameters: n/a | |||
| which some parsers MAY see as being the same as | Optional parameters: n/a | |||
| {"comment":"","account":262} | Encoding considerations: 8bit if UTF-8; binary if UTF-16 or UTF-32 | |||
| This confusion allows an attacker to modify the account property or | JSON may be represented using UTF-8, UTF-16, or UTF-32. When JSON | |||
| any other property. | is written in UTF-8, JSON is 8bit compatible. When JSON is | |||
| written in UTF-16 or UTF-32, the binary content-transfer-encoding | ||||
| must be used. | ||||
| It is much wiser to use JSON generators, which are available in many | Security considerations: | |||
| forms for most programming languages, to do the encoding, avoiding | ||||
| the confusion hazard. | ||||
| JSON is so similar to some programming languages that the native | Generally there are security issues with scripting languages. JSON | |||
| parsing ability of the language processors can be used to parse JSON | is a subset of JavaScript, but it is a safe subset that excludes | |||
| texts. This should be avoided because the native parser will accept | assignment and invocation. | |||
| code which is not JSON. | ||||
| For example, JavaScript's eval() function is able to parse JSON text, | A JSON text can be safely passed into JavaScript's eval() function | |||
| but it can also parse programs. If an attacker can inject code into | (which compiles and executes a string) if all the characters not | |||
| the JSON text (as we saw above), then it can compromise the system. | enclosed in strings are in the set of characters that form JSON | |||
| JSON parsers should always be used instead. | tokens. This can be quickly determined in JavaScript with two | |||
| regular expressions and calls to the test and replace methods. | ||||
| The web browser's script tag is an alias for the eval() function. It | var my_JSON_object = !(/[^,:{}\[\]0-9.\-+Eaeflnr-u \n\r\t]/.test( | |||
| should not be used to deliver JSON text to web browsers. | text.replace(/"(\\.|[^"\\])*"/g, ''))) && | |||
| eval('(' + text + ')'); | ||||
| 6. Examples | Interoperability considerations: n/a | |||
| Published specification: RFC 4627 | ||||
| Applications that use this media type: | ||||
| JSON has been used to exchange data between applications written | ||||
| in all of these programming languages: ActionScript, C, C#, | ||||
| ColdFusion, Common Lisp, E, Erlang, Java, JavaScript, Lua, | ||||
| Objective CAML, Perl, PHP, Python, Rebol, Ruby, and Scheme. | ||||
| Additional information: | ||||
| Magic number(s): n/a | ||||
| File extension(s): .json | ||||
| Macintosh file type code(s): TEXT | ||||
| Person & email address to contact for further information: | ||||
| Douglas Crockford | ||||
| douglas@crockford.com | ||||
| Intended usage: COMMON | ||||
| Restrictions on usage: none | ||||
| Author: | ||||
| Douglas Crockford | ||||
| douglas@crockford.com | ||||
| Change controller: | ||||
| Douglas Crockford | ||||
| douglas@crockford.com | ||||
| 7. Security Considerations | ||||
| See Security Considerations in Section 6. | ||||
| 8. Examples | ||||
| This is a JSON object: | This is a JSON object: | |||
| { | { | |||
| "Image": { | "Image": { | |||
| "Width": 800, | "Width": 800, | |||
| "Height": 600, | "Height": 600, | |||
| "Title": "View from 15th Floor", | "Title": "View from 15th Floor", | |||
| "Thumbnail": { | "Thumbnail": { | |||
| "Url": "http://www.example.com/image/481989943", | "Url": "http://www.example.com/image/481989943", | |||
| skipping to change at page 9, line 4 ¶ | skipping to change at page 10, line 26 ¶ | |||
| { | { | |||
| "precision": "zip", | "precision": "zip", | |||
| "Latitude": 37.371991, | "Latitude": 37.371991, | |||
| "Longitude": -122.026020, | "Longitude": -122.026020, | |||
| "Address": "", | "Address": "", | |||
| "City": "SUNNYVALE", | "City": "SUNNYVALE", | |||
| "State": "CA", | "State": "CA", | |||
| "Zip": "94085", | "Zip": "94085", | |||
| "Country": "US" | "Country": "US" | |||
| } | } | |||
| ] | ] | |||
| 7. Normative References | 9. Normative References | |||
| [ECMA] European Computer Manufacturers Association, "ECMAScript | [ECMA] European Computer Manufacturers Association, "ECMAScript | |||
| Language Specification Fifth Edition ", December 2009, | Language Specification 3rd Edition ", December 1999, | |||
| <http://www.ecma-international.org/publications/files/ | <http://www.ecma-international.org/publications/files/ | |||
| ecma-st/ECMA-262.pdf>. | ecma-st/ECMA-262.pdf>. | |||
| [RFC0020] Cerf, V., "ASCII format for network interchange", RFC 20, | [RFC0020] Cerf, V., "ASCII format for network interchange", RFC 20, | |||
| October 1969. | October 1969. | |||
| [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | |||
| Requirement Levels", BCP 14, RFC 2119, March 1997. | Requirement Levels", BCP 14, RFC 2119, March 1997. | |||
| [RFC5234] Crocker, D. and P. Overell, "Augmented BNF for Syntax | [RFC4234] Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax | |||
| Specifications: ABNF", STD 68, RFC 5234, January 2008. | Specifications: ABNF", RFC 4234, October 2005. | |||
| [UNICODE] The Unicode Consortium, "The Unicode Standard, Version 6.2 | [UNICODE] The Unicode Consortium, "The Unicode Standard, Version 4.0 | |||
| ", 2012, <http://www.unicode.org/versions/Unicode6.2.0/>. | ", 2003, <http://www.unicode.org/versions/Unicode4.1.0/>. | |||
| Author's Address | Author's Address | |||
| Douglas Crockford | Douglas Crockford | |||
| JSON.org | JSON.org | |||
| Email: douglas@crockford.com | Email: douglas@crockford.com | |||
| End of changes. 41 change blocks. | ||||
| 86 lines changed or deleted | 135 lines changed or added | |||
This html diff was produced by rfcdiff 1.48. The latest version is available from http://tools.ietf.org/tools/rfcdiff/ | ||||
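
The "3. Encoding" section added in -02 describes detecting the encoding from the pattern of nulls in the first four octets. The following is a minimal sketch of that table, not code from either draft. The function name `detectJsonEncoding` is hypothetical, and the fallback to UTF-8 for inputs shorter than four octets is an assumption based on the draft naming UTF-8 as the default encoding.

```javascript
// Hypothetical helper (not part of either draft): guess the Unicode
// encoding of a JSON text from the null pattern in its first four octets.
function detectJsonEncoding(octets) {
  if (octets.length < 4) {
    // Assumption: too short for the four-octet pattern, so fall back to
    // the default encoding named in the draft.
    return "UTF-8";
  }
  const [a, b, c, d] = octets;
  if (a === 0 && b === 0 && c === 0 && d !== 0) return "UTF-32BE"; // 00 00 00 xx
  if (a === 0 && b !== 0 && c === 0 && d !== 0) return "UTF-16BE"; // 00 xx 00 xx
  if (a !== 0 && b === 0 && c === 0 && d === 0) return "UTF-32LE"; // xx 00 00 00
  if (a !== 0 && b === 0 && c !== 0 && d === 0) return "UTF-16LE"; // xx 00 xx 00
  return "UTF-8";                                                  // xx xx xx xx
}

// The text {"a" in three encodings (first four octets only).
console.log(detectJsonEncoding([0x7b, 0x22, 0x61, 0x22])); // UTF-8
console.log(detectJsonEncoding([0x7b, 0x00, 0x22, 0x00])); // UTF-16LE
console.log(detectJsonEncoding([0x00, 0x00, 0x00, 0x7b])); // UTF-32BE
```

The check relies on the draft's premise that the first two characters of a JSON text are always ASCII. It does not account for a byte order mark, which neither draft discusses here.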
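
Section 2.5 in both revisions describes the six-character \uXXXX escape for Basic Multilingual Plane characters and the twelve-character surrogate-pair escape for characters outside it. A minimal sketch of that arithmetic follows. The function name `escapeCodePoint` is hypothetical, and emitting uppercase hexadecimal digits is an arbitrary choice, since the draft allows either case.

```javascript
// Hypothetical helper: produce the JSON \u escape for one Unicode code point.
function escapeCodePoint(codePoint) {
  // Format a 16-bit value as a reverse solidus, "u", and four hex digits.
  const unit = (n) => "\\u" + n.toString(16).toUpperCase().padStart(4, "0");
  if (codePoint <= 0xffff) {
    return unit(codePoint); // six-character sequence
  }
  // Outside the BMP: split into a UTF-16 surrogate pair.
  const offset = codePoint - 0x10000;
  const high = 0xd800 + (offset >> 10);
  const low = 0xdc00 + (offset & 0x3ff);
  return unit(high) + unit(low); // twelve-character sequence
}

console.log(escapeCodePoint(0x5c));    // reverse solidus
console.log(escapeCodePoint(0x1d11e)); // G clef

// Round trip through a parser to recover the original code point.
const parsed = JSON.parse('"' + escapeCodePoint(0x1d11e) + '"');
console.log(parsed.codePointAt(0).toString(16));
```

For the G clef, the offset is 0xD11E, which splits into the high surrogate 0xD834 and the low surrogate 0xDD1E, matching the example in the draft.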