Internet Engineering Task Force                          P. Hallam-Baker
Internet-Draft                                         Comodo Group Inc.
Intended status: Standards Track                        January 21, 2014
Expires: July 25, 2014

 Binary Encodings for JavaScript Object Notation: JSON-B, JSON-C, JSON-D
                      draft-hallambaker-jsonbcd-01

Abstract

   Three binary encodings for JavaScript Object Notation (JSON) are
   presented.
   JSON-B (Binary) is a strict superset of the JSON encoding
   that permits efficient binary encoding of intrinsic JavaScript data
   types.  JSON-C (Compact) is a strict superset of JSON-B that supports
   compact representation of repeated data strings with short numeric
   codes.  JSON-D (Data) supports additional binary data types for
   integer and floating point representations for use in scientific
   applications where conversion between binary and decimal
   representations would cause a loss of precision.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on July 25, 2014.

Copyright Notice

   Copyright (c) 2014 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Definitions
     1.1.  Requirements Language
   2.  Introduction
     2.1.  Objectives
   3.  Extended JSON Grammar
   4.  JSON-B
     4.1.  JSON-B Examples
   5.  JSON-C
     5.1.  JSON-C Examples
   6.  JSON-D (Data)
   7.  Acknowledgements
   8.  Security Considerations
   9.  IANA Considerations
   10. Normative References
   Author's Address

1.  Definitions

1.1.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

2.  Introduction

   JavaScript Object Notation (JSON) is a simple text encoding for the
   JavaScript data model that has found wide application beyond its
   original field of use.  In particular, JSON has rapidly become a
   preferred encoding for Web Services.

   JSON encoding supports just four fundamental data types (integer,
   floating point, string and boolean), together with arrays and
   objects, which consist of a list of tag-value pairs.

   Although the JSON encoding is sufficient for many purposes, it is
   not always efficient.  In particular, there is no efficient
   representation for blocks of binary data.  Use of base64 encoding
   increases data volume by 33%.
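   As a rough illustration of the base64 figure, the following Python
   sketch measures the size of a binary payload after one or more
   layers of base64 encoding.  The function name nested_base64_sizes
   and the 3000 octet payload are illustrative assumptions and are not
   part of this specification.

```python
import base64


def nested_base64_sizes(n_octets: int, depth: int) -> list[int]:
    """Return the encoded size after each of `depth` layers of base64.

    Hypothetical helper: the payload is `n_octets` zero bytes, which is
    enough to measure size because base64 output length depends only on
    input length.
    """
    data = bytes(n_octets)
    sizes = []
    for _ in range(depth):
        data = base64.b64encode(data)
        sizes.append(len(data))
    return sizes


original = 3000
for layer, size in enumerate(nested_base64_sizes(original, 3), start=1):
    # Each layer multiplies the volume by roughly 4/3.
    print(layer, size, round(size / original, 2))
```

   With a 3000 octet payload the sizes are 4000, 5336 and 7116 octets,
   i.e. about 1.33, 1.78 and 2.37 times the original volume after one,
   two and three layers.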
   The base64 overhead compounds with each level of nesting, so it
   grows exponentially in applications where nested binary encodings
   are required.  This makes JSON encoding unsatisfactory in
   cryptographic applications, where nested binary structures are
   frequently required.

   Another source of inefficiency in JSON encoding is the repeated
   occurrence of object tags.  A JSON encoding containing an array of a
   hundred objects such as {"first":1,"second":2} will contain a hundred
   occurrences of the string "first" (seven bytes) and a hundred
   occurrences of the string "second" (eight bytes).  Using two byte
   code sequences in place of strings allows a saving of 11 bytes per
   object without loss of information, a saving of 50%.

   A third objection to the use of JSON encoding is that floating point
   numbers can only be represented in decimal form, and conversion
   between binary and decimal representations can involve a loss of
   precision.  While such issues are rarely important in network
   applications, they can be critical in scientific applications.  It
   is not acceptable for saving and restoring a data set to change the
   result of a calculation.

2.1.  Objectives

   The following were identified as core objectives for a binary JSON
   encoding:

      Low overhead encoding and decoding

      Easy to convert existing encoders and decoders to add binary
      support

      Efficient encoding of binary data

      Ability to convert from JSON to binary encoding in a streaming
      mode (i.e. without reading the entire binary data block before
      beginning encoding)

      Lossless encoding of JavaScript data types

   The ability to support JSON tag compression and extended data types
   is considered desirable but not essential for typical network
   applications.

   Three binary encodings are defined:

   JSON-B (Binary)  Simply encodes JSON data in binary.  Only the
      JavaScript data model is supported (i.e. atomic types are
      integers, double or string).
      Integers may be 8, 16, 32 or 64 bits, either signed or unsigned.
      Floating point values use the IEEE 754 binary64 format
      [IEEE-754].  Chunked encoding is supported for binary and UTF-8
      string types.

   JSON-C (Compact)  As JSON-B but with support for representing JSON
      tags in numeric code form (16 bit code space).  This is done both
      for compact encoding and to allow simplification of encoders and
      decoders in constrained environments.  Codes may be defined
      inline or by reference to a known dictionary of codes referenced
      via a digest value.

   JSON-D (Data)  As JSON-C but with support for representing additional
      data types without loss of precision.  In particular, other IEEE
      754 floating point formats, both binary and decimal, and Intel's
      80 bit floating point, plus 128 bit integers and bignum integers.

3.  Extended JSON Grammar

   The JSON-B, JSON-C and JSON-D encodings are all based on the JSON
   grammar [RFC4627], using the same syntactic structure but different
   lexical encodings.

   JSON-B0 and JSON-C0 replace the JSON lexical encodings for strings
   and numbers with binary encodings.  JSON-B1 and JSON-C1 allow either
   lexical encoding to be used.  Thus any valid JSON encoding is a valid
   JSON-B1 or JSON-C1 encoding.

   The grammar of JSON-B, JSON-C and JSON-D is a superset of the JSON
   grammar.  The following productions are added to the grammar:

   x-value  Binary encodings for data values.  As the binary value
      encodings are all self-delimiting, an x-value is not followed by
      a value-separator.

   x-member  An object member where the value is specified as an
      x-value and thus does not require a value-separator.

   b-value  Binary data encodings defined in JSON-B.

   b-string  Defined length string encoding defined in JSON-B.

   c-def  Tag code definition defined in JSON-C.  These may only appear
      before the beginning of an Object or Array and before any
      preceding white space.

   c-tag  Tag code value defined in JSON-C.

   d-value  Additional binary data encodings defined in JSON-D for use
      in scientific data applications.

   The JSON grammar is modified to permit the use of x-value productions
   in place of ( value value-separator ):

   JSON-text = (object / array)

   object = *c-def begin-object [
            *( member value-separator | x-member )
            (member | x-member) ] end-object

   member = tag value
   x-member = tag x-value

   tag = string name-separator | b-string | c-tag

   array = *c-def begin-array [ *( value value-separator | x-value )
           (value | x-value) ] end-array

   x-value = b-value / d-value

   value = false / null / true / object / array / number / string

   name-separator  = ws %x3A ws  ; : colon
   value-separator = ws %x2C ws  ; , comma

   The following lexical values are unchanged:

   begin-array     = ws %x5B ws  ; [ left square bracket
   begin-object    = ws %x7B ws  ; { left curly bracket
   end-array       = ws %x5D ws  ; ] right square bracket
   end-object      = ws %x7D ws  ; } right curly bracket

   ws = *( %x20 / %x09 / %x0A / %x0D )

   false = %x66.61.6c.73.65   ; false
   null  = %x6e.75.6c.6c      ; null
   true  = %x74.72.75.65      ; true

   The productions number and string are defined as before:

   number = [ minus ] int [ frac ] [ exp ]
   decimal-point = %x2E       ; .
   digit1-9 = %x31-39         ; 1-9
   e = %x65 / %x45            ; e E
   exp = e [ minus / plus ] 1*DIGIT
   frac = decimal-point 1*DIGIT
   int = zero / ( digit1-9 *DIGIT )
   minus = %x2D               ; -
   plus = %x2B                ; +
   zero = %x30                ; 0

   string = quotation-mark *char quotation-mark
   char = unescaped /
          escape ( %x22 / %x5C / %x2F / %x62 / %x66 /
                   %x6E / %x72 / %x74 / %x75 4HEXDIG )

   escape = %x5C              ; \
   quotation-mark = %x22      ; "
   unescaped = %x20-21 / %x23-5B / %x5D-10FFFF

4.  JSON-B

   The JSON-B encoding defines the b-value and b-string productions:

   b-value = b-atom | b-string | b-data | b-integer |
             b-float

   b-string = *( string-chunk ) string-term
   b-data = *( data-chunk ) data-term

   b-integer = p-int8 | p-int16 | p-int32 | p-int64 | p-bignum16 |
               n-int8 | n-int16 | n-int32 | n-int64 | n-bignum16

   b-float = binary64

   The lexical encodings of the productions are defined in the following
   table, where the column 'Tag' specifies the byte code that begins the
   production, 'Fixed' specifies the number of data bytes that follow
   and 'Length' specifies the number of bytes used to define the length
   of a variable length field following the data bytes:

   +--------------+-----+-------+--------+-----------------------------+
   | Production   | Tag | Fixed | Length | Data Description            |
   +--------------+-----+-------+--------+-----------------------------+
   | string-term  | x80 | -     | 1      | Terminal String 8 bit       |
   |              |     |       |        | length                      |
   | string-term  | x81 | -     | 2      | Terminal String 16 bit      |
   |              |     |       |        | length                      |
   | string-term  | x82 | -     | 4      | Terminal String 32 bit      |
   |              |     |       |        | length                      |
   | string-term  | x83 | -     | 8      | Terminal String 64 bit      |
   |              |     |       |        | length                      |
   | string-chunk | x84 | -     | 1      | Non-Terminal String 8 bit   |
   |              |     |       |        | length                      |
   | string-chunk | x85 | -     | 2      | Non-Terminal String 16 bit  |
   |              |     |       |        | length                      |
   | string-chunk | x86 | -     | 4      | Non-Terminal String 32 bit  |
   |              |     |       |        | length                      |
   | string-chunk | x87 | -     | 8      | Non-Terminal String 64 bit  |
   |              |     |       |        | length                      |
   | data-term    | x88 | -     | 1      | Terminal Data 8 bit length  |
   | data-term    | x89 | -     | 2      | Terminal Data 16 bit length |
   | data-term    | x8A | -     | 4      | Terminal Data 32 bit length |
   | data-term    | x8B | -     | 8      | Terminal Data 64 bit length |
   | data-chunk   | x8C | -     | 1      | Non-Terminal Data 8 bit     |
   |              |     |       |        | length                      |
   | data-chunk   | x8D | -     | 2      | Non-Terminal Data 16 bit    |
   |              |     |       |        | length                      |
   | data-chunk   | x8E | -     | 4      | Non-Terminal Data 32 bit    |
   |              |     |       |        | length                      |
   | data-chunk   | x8F | -     | 8      | Non-Terminal Data 64 bit    |
   |              |     |       |        | length                      |
   | p-int8       | xA0 | 1     | -      | Positive 8 bit Integer      |
   | p-int16      | xA1 | 2     | -      | Positive 16 bit Integer     |
   | p-int32      | xA2 | 4     | -      | Positive 32 bit Integer     |
   | p-int64      | xA3 | 8     | -      | Positive 64 bit Integer     |
   | p-bignum16   | xA5 | -     | 2      | Positive Bignum 16 bit      |
   |              |     |       |        | length                      |
   | n-int8       | xA8 | 1     | -      | Negative 8 bit Integer      |
   | n-int16      | xA9 | 2     | -      | Negative 16 bit Integer     |
   | n-int32      | xAA | 4     | -      | Negative 32 bit Integer     |
   | n-int64      | xAB | 8     | -      | Negative 64 bit Integer     |
   | n-bignum16   | xAD | -     | 2      | Negative Bignum 16 bit      |
   |              |     |       |        | length                      |
   | binary64     | x92 | 8     | -      | IEEE 754 Floating Point     |
   |              |     |       |        | binary64                    |
   | b-value      | xB0 | -     | -      | True                        |
   | b-value      | xB1 | -     | -      | False                       |
   | b-value      | xB2 | -     | -      | Null                        |
   +--------------+-----+-------+--------+-----------------------------+

                     Table 1: JSON-B Lexical Encodings

   A data type commonly used in networking that is not defined in this
   scheme is a datetime representation.

4.1.  JSON-B Examples

   The following examples illustrate the JSON-B encoding:

   Binary Encoding                 JSON Equivalent

   A0 2A                           42 (as 8 bit integer)
   A1 00 2A                        42 (as 16 bit integer)
   A2 00 00 00 2A                  42 (as 32 bit integer)
   A3 00 00 00 00 00 00 00 2A      42 (as 64 bit integer)
   A5 00 01 2A                     42 (as Bignum)

   80 05 48 65 6c 6c 6f            "Hello" (single chunk)
   81 00 05 48 65 6c 6c 6f         "Hello" (single chunk)
   84 05 48 65 6c 6c 6f 80 00      "Hello" (as two chunks)

   92 3f f0 00 00 00 00 00 00      1.0
   92 40 24 00 00 00 00 00 00      10.0
   92 40 09 21 fb 54 44 2e ea      3.14159265359
   92 bf f0 00 00 00 00 00 00      -1.0

   B0                              true
   B1                              false
   B2                              null

5.  JSON-C

   JSON-C (Compact) permits numeric code values to be substituted for
   strings and binary data.
   Tag codes MAY be 8, 16 or 32 bits long and are encoded in network
   byte order.

   Tag codes MUST be defined before they are referenced.  A Tag code MAY
   be defined before the corresponding data or string value is used or
   at the same time that it is used.

   A dictionary is a list of tag code definitions.  An encoding MAY
   incorporate definitions from a dictionary using the dict-hash
   production.  The dict-hash production specifies a (positive) offset
   value to be added to the entries in the dictionary and a hash code
   identifier consisting of the ASN.1 OID value sequence for the
   cryptographic digest used to compute the hash value, followed by the
   hash value in network byte order.

   +------------+-----+-------+--------+-------------------------------+
   | Production | Tag | Fixed | Length | Data Description              |
   +------------+-----+-------+--------+-------------------------------+
   | c-tag      | xC0 | 1     | -      | 8 bit tag code                |
   | c-tag      | xC1 | 2     | -      | 16 bit tag code               |
   | c-tag      | xC2 | 4     | -      | 32 bit tag code               |
   | c-def      | xC4 | 1     | -      | 8 bit tag definition          |
   | c-def      | xC5 | 2     | -      | 16 bit tag definition         |
   | c-def      | xC6 | 4     | -      | 32 bit tag definition         |
   | c-tag      | xC8 | 1     | -      | 8 bit tag code & definition   |
   | c-tag      | xC9 | 2     | -      | 16 bit tag code & definition  |
   | c-tag      | xCA | 4     | -      | 32 bit tag code & definition  |
   | c-def      | xCC | 1     | -      | 8 bit tag dictionary          |
   |            |     |       |        | definition                    |
   | c-def      | xCD | 2     | -      | 16 bit tag dictionary         |
   |            |     |       |        | definition                    |
   | c-def      | xCE | 4     | -      | 32 bit tag dictionary         |
   |            |     |       |        | definition                    |
   | dict-hash  | xD0 | 4     | 1      | Hash of dictionary            |
   +------------+-----+-------+--------+-------------------------------+

                     Table 2: JSON-C Lexical Encodings

   All integer values are encoded in Network Byte Order (most
   significant byte first).

5.1.  JSON-C Examples

   The following examples illustrate the JSON-C encoding:

   JSON-C                          Value    Define

   C8 20 80 05 48 65 6c 6c 6f      "Hello"  20 = "Hello"
   C4 21 80 05 48 65 6c 6c 6f               21 = "Hello"
   C0 20                           "Hello"
   C1 00 20                        "Hello"

   D0 00 00 01 00 1B                        277 = "Hello"
   06 09 60 86 48 01 65 03
   04 02 01                        OID for SHA-2-256
                                   (2.16.840.1.101.3.4.2.1)
   e3 b0 c4 42 98 fc 1c 14
   9a fb f4 c8 99 6f b9 24
   27 ae 41 e4 64 9b 93 4c
   a4 95 99 1b 78 52 b8 55         SHA-256(C4 21 80 05 48 65 6c 6c 6f)

6.  JSON-D (Data)

   JSON-B and JSON-C only support the two numeric types defined in the
   JavaScript data model: integers and 64 bit floating point values.
   JSON-D (Data) defines binary encodings for additional data types that
   are commonly used in scientific applications.  These comprise
   positive and negative 128 bit integers, six additional floating point
   representations defined by IEEE 754 [IEEE-754] and the Intel extended
   precision 80 bit floating point representation.

   Should the need arise, even bigger bignums could be defined with the
   length specified as a 32 bit value, permitting bignums of up to 2^35
   bits to be represented.
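
   The 2^35 figure can be checked with a short Python sketch.  It
   assumes that the bignum length field counts octets and that the
   magnitude is an unsigned value in network byte order; neither
   detail is spelled out for bignums in this document.  The helper
   names bignum_capacity_bits and encode_positive_bignum16 are
   hypothetical.  Only the tag value xA5 and the 16 bit length width
   are taken from Table 1.

```python
def bignum_capacity_bits(length_field_octets: int) -> int:
    """Largest magnitude, in bits, addressable by a length field.

    Assumption: the length field counts octets of magnitude.
    """
    max_octets = (1 << (8 * length_field_octets)) - 1
    return 8 * max_octets


def encode_positive_bignum16(value: int) -> bytes:
    """Sketch of p-bignum16: tag xA5, 16 bit length, then magnitude.

    Assumption: the magnitude is unsigned and big-endian, using the
    minimum number of octets (at least one).
    """
    if value < 0:
        raise ValueError("negative values would use n-bignum16 (xAD)")
    n_octets = max(1, (value.bit_length() + 7) // 8)
    if n_octets > 0xFFFF:
        raise ValueError("magnitude does not fit a 16 bit length")
    magnitude = value.to_bytes(n_octets, "big")
    return bytes([0xA5]) + n_octets.to_bytes(2, "big") + magnitude


print(bignum_capacity_bits(2))  # 16 bit length field
print(bignum_capacity_bits(4))  # hypothetical 32 bit length field
print(encode_positive_bignum16(42).hex())
```

   Under these assumptions a 16 bit length field allows magnitudes of
   up to 524280 bits, and a 32 bit length field allows 2^35 - 8 bits,
   which matches the approximate bound given above.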

   d-value = d-integer | d-float

   d-integer = p-int128 | n-int128

   d-float = binary16 | binary32 | binary128 | intel80 |
             decimal32 | decimal64 | decimal128

   +------------+-----+-------+--------+-------------------------------+
   | Production | Tag | Fixed | Length | Data Description              |
   +------------+-----+-------+--------+-------------------------------+
   | p-int128   | xA4 | 16    | -      | Positive 128 bit Integer      |
   | n-int128   | xAC | 16    | -      | Negative 128 bit Integer      |
   | binary16   | x90 | 2     | -      | IEEE 754 Floating Point       |
   |            |     |       |        | binary16                      |
   | binary32   | x91 | 4     | -      | IEEE 754 Floating Point       |
   |            |     |       |        | binary32                      |
   | binary128  | x94 | 16    | -      | IEEE 754 Floating Point       |
   |            |     |       |        | binary128                     |
   | intel80    | x95 | 10    | -      | Intel 80 bit extended binary  |
   |            |     |       |        | Floating Point                |
   | decimal32  | x96 | 4     | -      | IEEE 754 Floating Point       |
   |            |     |       |        | decimal32                     |
   | decimal64  | x97 | 8     | -      | IEEE 754 Floating Point       |
   |            |     |       |        | decimal64                     |
   | decimal128 | x98 | 16    | -      | IEEE 754 Floating Point       |
   |            |     |       |        | decimal128                    |
   +------------+-----+-------+--------+-------------------------------+

                     Table 3: JSON-D Lexical Encodings

7.  Acknowledgements

   Nico Williams, etc

8.  Security Considerations

9.  IANA Considerations

   [TBS list out all the code points that require an IANA registration]

10.  Normative References

   [IEEE-754]
              "Information technology -- Microprocessor Systems --
              Floating-Point arithmetic", ISO/IEC/IEEE 60559:2011, July
              2011.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC4627]  Crockford, D., "The application/json Media Type for
              JavaScript Object Notation (JSON)", RFC 4627, July 2006.

Author's Address

   Phillip Hallam-Baker
   Comodo Group Inc.

   Email: philliph@comodo.com