Internet Engineering Task Force (IETF)              Phillip Hallam-Baker
INTERNET-DRAFT                                         Comodo Group Inc.
Intended Status:                                            July 6, 2015
Expires: January 7, 2016

 Binary Encodings for JavaScript Object Notation: JSON-B, JSON-C, JSON-D
                      draft-hallambaker-jsonbcd-03

Abstract

   Three binary encodings for JavaScript Object Notation (JSON) are
   presented. JSON-B (Binary) is a strict superset of the JSON encoding
   that permits efficient binary encoding of intrinsic JavaScript data
   types. JSON-C (Compact) is a strict superset of JSON-B that supports
   compact representation of repeated data strings with short numeric
   codes. JSON-D (Data) supports additional binary data types for
   integer and floating point representations for use in scientific
   applications where conversion between binary and decimal
   representations would cause a loss of precision.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

Copyright Notice

   Copyright (c) 2015 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Definitions
     1.1.  Requirements Language
   2.  Introduction
     2.1.  Objectives
   3.  Extended JSON Grammar
   4.  JSON-B
     4.1.  JSON-B Examples
   5.  JSON-C
     5.1.  JSON-C Examples
   6.  JSON-D (Data)
   7.  Acknowledgements
   8.  Security Considerations
   9.  IANA Considerations
   10. References
     10.1.  Normative References
   Author's Address

1.  Definitions

1.1.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

2.  Introduction

   JavaScript Object Notation (JSON) is a simple text encoding for the
   JavaScript data model that has found wide application beyond its
   original field of use. In particular, JSON has rapidly become a
   preferred encoding for Web Services.

   JSON encoding supports just four fundamental data types (integer,
   floating point, string and boolean), together with arrays and
   objects, the latter consisting of a list of tag-value pairs.

   Although the JSON encoding is sufficient for many purposes, it is not
   always efficient. In particular, there is no efficient representation
   for blocks of binary data. Use of base64 encoding increases data
   volume by 33%. This overhead compounds with each level of nesting and
   so grows exponentially with nesting depth, which makes JSON encoding
   unsatisfactory in cryptographic applications where nested binary
   structures are frequently required.

   Another source of inefficiency in JSON encoding is the repeated
   occurrence of object tags. A JSON encoding containing an array of a
   hundred objects such as {"first":1,"second":2} will contain a hundred
   occurrences of the string "first" (seven bytes including the
   quotation marks) and a hundred occurrences of the string "second"
   (eight bytes). Using two byte code sequences in place of the strings
   saves 11 bytes per object without loss of information, a saving of
   50%.

   A third objection to the use of JSON encoding is that floating point
   numbers can only be represented in decimal form, and this necessarily
   involves a loss of precision when converting between binary and
   decimal representations. While such issues are rarely important in
   network applications, they can be critical in scientific
   applications. It is not acceptable for saving and restoring a data
   set to change the result of a calculation.

2.1.  Objectives

   The following were identified as core objectives for a binary JSON
   encoding:

   *  Low overhead encoding and decoding

   *  Easy to convert existing encoders and decoders to add binary
      support

   *  Efficient encoding of binary data

   *  Ability to convert from JSON to binary encoding in a streaming
      mode (i.e. without reading the entire binary data block before
      beginning encoding).

   *  Lossless encoding of JavaScript data types

   *  The ability to support JSON tag compression and extended data
      types is considered desirable but not essential for typical
      network applications.

   Three binary encodings are defined:

   JSON-B (Binary)
      Simply encodes JSON data in binary. Only the JavaScript data
      model is supported (i.e. atomic types are integers, double or
      string). Integers may be 8, 16, 32 or 64 bits, either signed or
      unsigned. Floating point values use the IEEE 754 binary64 format
      [IEEE-754]. Supports chunked encoding for binary and UTF-8 string
      types.

   JSON-C (Compact)
      As JSON-B but with support for representing JSON tags in
      numeric code form (16 bit code space). This is done both for
      compact encoding and to allow simplification of
      encoders/decoders in constrained environments. Codes may be
      defined inline or by reference to a known dictionary of codes
      referenced via a digest value.

   JSON-D (Data)
      As JSON-C but with support for representing additional data
      types without loss of precision. In particular, it adds other
      IEEE 754 floating point formats, both binary and decimal, and
      Intel's 80 bit floating point, plus 128 bit integers and bignum
      integers.

3.  Extended JSON Grammar

   The JSON-B, JSON-C and JSON-D encodings are all based on the JSON
   grammar [RFC4627], using the same syntactic structure but different
   lexical encodings.

   JSON-B0 and JSON-C0 replace the JSON lexical encodings for strings
   and numbers with binary encodings.
   JSON-B1 and JSON-C1 allow either lexical encoding to be used. Thus
   any valid JSON encoding is a valid JSON-B1 or JSON-C1 encoding.

   The grammar of JSON-B, JSON-C and JSON-D is a superset of the JSON
   grammar. The following productions are added to the grammar:

   x-value
      Binary encodings for data values. As the binary value encodings
      are all self delimiting, no value-separator is required after an
      x-value.

   x-member
      An object member where the value is specified as an x-value and
      thus does not require a value-separator.

   b-value
      Binary data encodings defined in JSON-B.

   b-string
      Defined length string encoding defined in JSON-B.

   c-def
      Tag code definition defined in JSON-C. These may only appear
      before the beginning of an Object or Array and before any
      preceding white space.

   c-tag
      Tag code value defined in JSON-C.

   d-value
      Additional binary data encodings defined in JSON-D for use in
      scientific data applications.

   The JSON grammar is modified to permit the use of x-value productions
   in place of ( value value-separator ) :

   JSON-text = (object / array)

   object = *c-def begin-object [
            *( member value-separator / x-member )
            (member / x-member) ] end-object

   member = tag value
   x-member = tag x-value

   tag = string name-separator / b-string / c-tag

   array = *c-def begin-array [ *( value value-separator / x-value )
           (value / x-value) ] end-array

   x-value = b-value / d-value

   value = false / null / true / object / array / number / string

   name-separator  = ws %x3A ws  ; : colon
   value-separator = ws %x2C ws  ; , comma

   The following lexical values are unchanged:

   begin-array     = ws %x5B ws  ; [ left square bracket
   begin-object    = ws %x7B ws  ; { left curly bracket
   end-array       = ws %x5D ws  ; ] right square bracket
   end-object      = ws %x7D ws  ; } right curly bracket

   ws = *( %x20 / %x09 / %x0A / %x0D )

   false = %x66.61.6c.73.65   ; false
   null  = %x6e.75.6c.6c      ; null
   true  = %x74.72.75.65      ; true

   The productions number and string are defined as before:

   number = [ minus ] int [ frac ] [ exp ]
   decimal-point = %x2E       ; .
   digit1-9 = %x31-39         ; 1-9
   e = %x65 / %x45            ; e E
   exp = e [ minus / plus ] 1*DIGIT
   frac = decimal-point 1*DIGIT
   int = zero / ( digit1-9 *DIGIT )
   minus = %x2D               ; -
   plus = %x2B                ; +
   zero = %x30                ; 0

   string = quotation-mark *char quotation-mark
   char = unescaped /
          escape ( %x22 / %x5C / %x2F / %x62 / %x66 /
          %x6E / %x72 / %x74 / %x75 4HEXDIG )

   escape = %x5C              ; \
   quotation-mark = %x22      ; "
   unescaped = %x20-21 / %x23-5B / %x5D-10FFFF

4.  JSON-B

   The JSON-B encoding defines the b-value and b-string productions:

   b-value = b-atom / b-string / b-data / b-integer /
             b-float

   b-string = *( string-chunk ) string-term
   b-data = *( data-chunk ) data-term

   b-integer = p-int8 / p-int16 / p-int32 / p-int64 / p-bignum16 /
               n-int8 / n-int16 / n-int32 / n-int64 / n-bignum16

   b-float = binary64

   The lexical encodings of the productions are defined in the following
   table, where the column 'Tag' specifies the byte code that begins the
   production, 'Fixed' specifies the number of data bytes that follow
   and 'Length' specifies the number of bytes used to define the length
   of a variable length field following the data bytes:

   +--------------+-----+-------+--------+-----------------------------------+
   | Production   | Tag | Fixed | Length | Data Description                  |
   +--------------+-----+-------+--------+-----------------------------------+
   | string-term  | x80 | -     | 1      | Terminal String 8 bit length      |
   | string-term  | x81 | -     | 2      | Terminal String 16 bit length     |
   | string-term  | x82 | -     | 4      | Terminal String 32 bit length     |
   | string-term  | x83 | -     | 8      | Terminal String 64 bit length     |
   | string-chunk | x84 | -     | 1      | Non-Terminal String 8 bit length  |
   | string-chunk | x85 | -     | 2      | Non-Terminal String 16 bit length |
   | string-chunk | x86 | -     | 4      | Non-Terminal String 32 bit length |
   | string-chunk | x87 | -     | 8      | Non-Terminal String 64 bit length |
   | data-term    | x88 | -     | 1      | Terminal Data 8 bit length        |
   | data-term    | x89 | -     | 2      | Terminal Data 16 bit length       |
   | data-term    | x8A | -     | 4      | Terminal Data 32 bit length       |
   | data-term    | x8B | -     | 8      | Terminal Data 64 bit length       |
   | data-chunk   | x8C | -     | 1      | Non-Terminal Data 8 bit length    |
   | data-chunk   | x8D | -     | 2      | Non-Terminal Data 16 bit length   |
   | data-chunk   | x8E | -     | 4      | Non-Terminal Data 32 bit length   |
   | data-chunk   | x8F | -     | 8      | Non-Terminal Data 64 bit length   |
   | p-int8       | xA0 | 1     | -      | Positive 8 bit Integer            |
   | p-int16      | xA1 | 2     | -      | Positive 16 bit Integer           |
   | p-int32      | xA2 | 4     | -      | Positive 32 bit Integer           |
   | p-int64      | xA3 | 8     | -      | Positive 64 bit Integer           |
   | p-bignum16   | xA5 | -     | 2      | Positive Bignum 16 bit length     |
   | n-int8       | xA8 | 1     | -      | Negative 8 bit Integer            |
   | n-int16      | xA9 | 2     | -      | Negative 16 bit Integer           |
   | n-int32      | xAA | 4     | -      | Negative 32 bit Integer           |
   | n-int64      | xAB | 8     | -      | Negative 64 bit Integer           |
   | n-bignum16   | xAD | -     | 2      | Negative Bignum 16 bit length     |
   | binary64     | x92 | 8     | -      | IEEE 754 Floating Point binary64  |
   | b-value      | xB0 | -     | -      | True                              |
   | b-value      | xB1 | -     | -      | False                             |
   | b-value      | xB2 | -     | -      | Null                              |
   +--------------+-----+-------+--------+-----------------------------------+

   A data type commonly used in networking that is not defined in this
   scheme is a datetime representation.

4.1.  JSON-B Examples

   The following examples illustrate the JSON-B encoding:

   Binary Encoding                     JSON Equivalent

   A0 2A                               42 (as 8 bit integer)
   A1 00 2A                            42 (as 16 bit integer)
   A2 00 00 00 2A                      42 (as 32 bit integer)
   A3 00 00 00 00 00 00 00 2A          42 (as 64 bit integer)
   A5 00 01 2A                         42 (as Bignum)

   80 05 48 65 6c 6c 6f                "Hello" (single chunk)
   81 00 05 48 65 6c 6c 6f             "Hello" (single chunk)
   84 05 48 65 6c 6c 6f 80 00          "Hello" (as two chunks)

   92 3f f0 00 00 00 00 00 00          1.0
   92 40 24 00 00 00 00 00 00          10.0
   92 40 09 21 fb 54 44 2e ea          3.14159265359
   92 bf f0 00 00 00 00 00 00          -1.0

   B0                                  true
   B1                                  false
   B2                                  null

5.  JSON-C

   JSON-C (Compact) permits numeric code values to be substituted for
   strings and binary data. Tag codes MAY be 8, 16 or 32 bits long,
   encoded in network byte order.

   Tag codes MUST be defined before they are referenced. A Tag code MAY
   be defined before the corresponding data or string value is used or
   at the same time that it is used.

   A dictionary is a list of tag code definitions. An encoding MAY
   incorporate definitions from a dictionary using the dict-hash
   production.
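
   The rule that tag codes must be defined before they are referenced
   can be sketched in Python as follows. This is a minimal illustration
   under stated assumptions, not a reference decoder: it handles only
   8 bit tag codes (tag bytes xC0, xC4 and xC8 from the table in this
   section) whose definitions are single-chunk JSON-B strings with an
   8 bit length (x80). The names read_string8 and read_tags are
   hypothetical.

```python
# Sketch only: 8 bit JSON-C tag codes with x80 string definitions.
C_TAG_8 = 0xC0        # 8 bit tag code (reference to a defined code)
C_DEF_8 = 0xC4        # 8 bit tag definition (defines, yields no value)
C_TAG_DEF_8 = 0xC8    # 8 bit tag code & definition (defines and yields)
STRING_TERM_8 = 0x80  # JSON-B terminal string with 8 bit length


def read_string8(buf, pos):
    """Read one x80 string at pos; return (text, position after it)."""
    if buf[pos] != STRING_TERM_8:
        raise ValueError("only x80 strings are handled in this sketch")
    length = buf[pos + 1]
    start = pos + 2
    return buf[start:start + length].decode("utf-8"), start + length


def read_tags(buf):
    """Return (values, table): strings yielded and the code table built."""
    table, values, pos = {}, [], 0
    while pos < len(buf):
        kind, code = buf[pos], buf[pos + 1]
        pos += 2
        if kind == C_TAG_8:
            # A reference is only legal once the code has been defined.
            if code not in table:
                raise ValueError(f"tag code {code:#04x} used before definition")
            values.append(table[code])
        elif kind in (C_DEF_8, C_TAG_DEF_8):
            table[code], pos = read_string8(buf, pos)
            if kind == C_TAG_DEF_8:
                values.append(table[code])
        else:
            raise ValueError(f"unsupported tag byte {kind:#04x}")
    return values, table
```

   For instance, feeding the bytes C8 20 80 05 48 65 6c 6c 6f followed
   by C0 20 yields "Hello" twice, while C0 22 on its own is rejected
   because code x22 was never defined.
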
   The dict-hash production specifies a (positive) offset value to be
   added to the entries in the dictionary and a hash code identifier
   consisting of the ASN.1 OID value sequence for the cryptographic
   digest used to compute the hash value, followed by the hash value in
   network byte order.

   +--------------+-----+-------+--------+-----------------------------------+
   | Production   | Tag | Fixed | Length | Data Description                  |
   +--------------+-----+-------+--------+-----------------------------------+
   | c-tag        | xC0 | 1     | -      | 8 bit tag code                    |
   | c-tag        | xC1 | 2     | -      | 16 bit tag code                   |
   | c-tag        | xC2 | 4     | -      | 32 bit tag code                   |
   | c-def        | xC4 | 1     | -      | 8 bit tag definition              |
   | c-def        | xC5 | 2     | -      | 16 bit tag definition             |
   | c-def        | xC6 | 4     | -      | 32 bit tag definition             |
   | c-tag        | xC8 | 1     | -      | 8 bit tag code & definition       |
   | c-tag        | xC9 | 2     | -      | 16 bit tag code & definition      |
   | c-tag        | xCA | 4     | -      | 32 bit tag code & definition      |
   | c-def        | xCC | 1     | -      | 8 bit tag dictionary definition   |
   | c-def        | xCD | 2     | -      | 16 bit tag dictionary definition  |
   | c-def        | xCE | 4     | -      | 32 bit tag dictionary definition  |
   | dict-hash    | xD0 | 4     | 1      | Hash of dictionary                |
   +--------------+-----+-------+--------+-----------------------------------+

   All integer values are encoded in Network Byte Order (most
   significant byte first).

5.1.  JSON-C Examples

   The following examples illustrate the JSON-C encoding:

   JSON-C                              Value      Define

   C8 20 80 05 48 65 6c 6c 6f          "Hello"    20 = "Hello"
   C4 21 80 05 48 65 6c 6c 6f                     21 = "Hello"
   C0 20                               "Hello"
   C1 00 20                            "Hello"

   D0 00 00 01 00 1B                              277 = "Hello"
   06 09 60 86 48 01 65 03
   04 02 01                            OID for SHA-2-256
                                       (2.16.840.1.101.3.4.2.1)
   e3 b0 c4 42 98 fc 1c 14
   9a fb f4 c8 99 6f b9 24
   27 ae 41 e4 64 9b 93 4c
   a4 95 99 1b 78 52 b8 55             SHA-256(C4 21 80 05 48 65 6c 6c 6f)

6.  JSON-D (Data)

   JSON-B and JSON-C only support the two numeric types defined in the
   JavaScript data model: Integers and 64 bit floating point values.
   JSON-D (Data) defines binary encodings for additional data types that
   are commonly used in scientific applications. These comprise positive
   and negative 128 bit integers, six additional floating point
   representations defined by IEEE 754 [IEEE-754] and the Intel extended
   precision 80 bit floating point representation.

   Should the need arise, even bigger bignums could be defined with the
   length specified as a 32 bit value, permitting bignums of up to 2^35
   bits to be represented.
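
   As an illustration of the fixed-length floating point encodings, the
   following Python sketch writes a value as a tag byte followed by the
   IEEE 754 bytes in network byte order. It is a sketch under stated
   assumptions rather than a definitive encoder: the tag bytes x90
   (binary16) and x91 (binary32) are taken from the JSON-D table in
   this section and x92 (binary64) from the JSON-B table, only these
   three formats are covered, and the names encode_float, decode_float
   and BIGNUM32_BIT_LIMIT are hypothetical.

```python
import struct

# Tag byte and struct format code (big-endian) for each handled format.
FLOAT_FORMATS = {
    "binary16": (0x90, ">e"),
    "binary32": (0x91, ">f"),
    "binary64": (0x92, ">d"),
}

# A 32 bit length field counts bytes, so the bound on bignum size is
# 2**32 bytes of 8 bits each, which is the 2^35 bits quoted in the text.
BIGNUM32_BIT_LIMIT = (2 ** 32) * 8


def encode_float(value, fmt):
    """Return the tag byte followed by the IEEE 754 bytes of value."""
    tag, code = FLOAT_FORMATS[fmt]
    return bytes([tag]) + struct.pack(code, value)


def decode_float(data):
    """Decode one tagged float produced by encode_float."""
    for tag, code in FLOAT_FORMATS.values():
        if data[0] == tag:
            return struct.unpack(code, data[1:])[0]
    raise ValueError("not a float tag handled by this sketch")
```

   Because the binary64 bytes are stored unchanged, a value such as 0.1
   decodes to exactly the value that was encoded, with no intermediate
   decimal conversion.
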

   d-value = d-integer / d-float

   d-float = binary16 / binary32 / binary128 / intel80 /
             decimal32 / decimal64 / decimal128

   +------------+-----+-------+--------+-------------------------------+
   | Production | Tag | Fixed | Length | Data Description              |
   +------------+-----+-------+--------+-------------------------------+
   | p-int128   | xA4 | 16    | -      | Positive 128 bit Integer      |
   |            |     |       |        |                               |
   | n-int128   | xAC | 16    | -      | Negative 128 bit Integer      |
   |            |     |       |        |                               |
   | binary16   | x90 | 2     | -      | IEEE 754 Floating Point       |
   |            |     |       |        | binary16                      |
   |            |     |       |        |                               |
   | binary32   | x91 | 4     | -      | IEEE 754 Floating Point       |
   |            |     |       |        | binary32                      |
   |            |     |       |        |                               |
   | binary128  | x94 | 16    | -      | IEEE 754 Floating Point       |
   |            |     |       |        | binary128                     |
   |            |     |       |        |                               |
   | intel80    | x95 | 10    | -      | Intel 80 bit extended binary  |
   |            |     |       |        | Floating Point                |
   |            |     |       |        |                               |
   | decimal32  | x96 | 4     | -      | IEEE 754 Floating Point       |
   |            |     |       |        | decimal32                     |
   |            |     |       |        |                               |
   | decimal64  | x97 | 8     | -      | IEEE 754 Floating Point       |
   |            |     |       |        | decimal64                     |
   |            |     |       |        |                               |
   | decimal128 | x98 | 16    | -      | IEEE 754 Floating Point       |
   |            |     |       |        | decimal128                    |
   +------------+-----+-------+--------+-------------------------------+

7.  Acknowledgements

   Nico Williams, etc

8.  Security Considerations

   TBS

9.  IANA Considerations

   [TBS list out all the code points that require an IANA registration]

10.  References

10.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC4627]  Crockford, D., "The application/json Media Type for
              JavaScript Object Notation (JSON)", RFC 4627, July 2006.

   [IEEE-754] [Reference Not Found!]

Author's Address

   Phillip Hallam-Baker
   Comodo Group Inc.

   philliph@comodo.com