Network Working Group                                        A. Rundgren
Internet-Draft                                               Independent
Intended status: Informational                                 B. Jordan
Expires: 23 July 2020                                           Broadcom
                                                              S. Erdtman
                                                              Spotify AB
                                                         20 January 2020


                   JSON Canonicalization Scheme (JCS)
             draft-rundgren-json-canonicalization-scheme-17

Abstract

   Cryptographic operations like hashing and signing need the data to be
   expressed in an invariant format so that the operations are reliably
   repeatable. One way to address this is to create a canonical
   representation of the data. Canonicalization also permits data to be
   exchanged in its original form on the "wire" while cryptographic
   operations performed on the canonicalized counterpart of the data in
   the producer and consumer end points, generate consistent results.

   This document describes the JSON Canonicalization Scheme (JCS). The
   JCS specification defines how to create a canonical representation of
   JSON data by building on the strict serialization methods for JSON
   primitives defined by ECMAScript, constraining JSON data to the
   I-JSON subset, and by using deterministic property sorting.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 23 July 2020.

Copyright Notice

   Copyright (c) 2020 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents (https://trustee.ietf.org/
   license-info) in effect on the date of publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document. Code Components
   extracted from this document must include Simplified BSD License text
   as described in Section 4.e of the Trust Legal Provisions and are
   provided without warranty as described in the Simplified BSD License.

Table of Contents

   1. Introduction
   2. Terminology
   3. Detailed Operation
      3.1. Creation of Input Data
      3.2. Generation of Canonical JSON Data
         3.2.1. Whitespace
         3.2.2. Serialization of Primitive Data Types
            3.2.2.1. Serialization of Literals
            3.2.2.2. Serialization of Strings
            3.2.2.3. Serialization of Numbers
         3.2.3. Sorting of Object Properties
         3.2.4. UTF-8 Generation
   4. IANA Considerations
   5. Security Considerations
   6. Acknowledgements
   7. References
      7.1. Normative References
      7.2. Informative References
   Appendix A. ES6 Sample Canonicalizer
   Appendix B. Number Serialization Samples
   Appendix C. Canonicalized JSON as "Wire Format"
   Appendix D. Dealing with Big Numbers
   Appendix E. String Subtype Handling
      E.1. Subtypes in Arrays
   Appendix F. Implementation Guidelines
   Appendix G. Open Source Implementations
   Appendix H. Other JSON Canonicalization Efforts
   Appendix I. Development Portal
   Appendix J. Document History
   Authors' Addresses

1. Introduction

   This document describes the JSON Canonicalization Scheme (JCS). The
   JCS specification defines how to create a canonical representation of
   JSON [RFC8259] data by building on the strict serialization methods
   for JSON primitives defined by ECMAScript [ES6], constraining JSON
   data to the I-JSON [RFC7493] subset, and by using deterministic
   property sorting. The output from JCS is a "Hashable" representation
   of JSON data that can be used by cryptographic methods. The
   subsequent paragraphs outline the primary design considerations.

   Cryptographic operations like hashing and signing need the data to be
   expressed in an invariant format so that the operations are reliably
   repeatable. One way to accomplish this is to convert the data into a
   format that has a simple and fixed representation, like Base64Url
   [RFC4648]. This is how JWS [RFC7515] addressed this issue. Another
   solution is to create a canonical version of the data, similar to
   what was done for the XML Signature [XMLDSIG] standard.

   The primary advantage of a canonicalizing scheme is that data can
   be kept in its original form. This is the core rationale behind JCS.
   Put another way, using canonicalization enables a JSON Object to
   remain a JSON Object even after being signed. This can simplify
   system design, documentation, and logging.

   To avoid "reinventing the wheel", JCS relies on the serialization of
   JSON primitives (strings, numbers and literals), as defined by
   ECMAScript (aka JavaScript) beginning with version 6 [ES6], hereafter
   referred to as "ES6".

   Seasoned XML developers may recall difficulties getting XML
   signatures to validate. This was usually due to different
   interpretations of the quite intricate XML canonicalization rules as
   well as of the equally complex Web Services security standards. The
   reasons why JCS should not suffer from similar issues are:

   o  The absence of a namespace concept and default values.

   o  Constraining data to the I-JSON [RFC7493] subset. This eliminates
      the need for specific parsers for dealing with canonicalization.

   o  JCS compatible serialization of JSON primitives is currently
      supported by most Web browsers as well as by Node.js [NODEJS].

   o  The full JCS specification is currently supported by multiple Open
      Source implementations (see Appendix G). See also Appendix F for
      implementation guidelines.

   JCS is compatible with some existing systems relying on JSON
   canonicalization such as JWK Thumbprint [RFC7638] and Keybase
   [KEYBASE].

   For potential uses outside of cryptography see [JSONCOMP].

   The intended audiences of this document are JSON tool vendors, as
   well as designers of JSON based cryptographic solutions. The reader
   is assumed to be knowledgeable in ECMAScript including the "JSON"
   object.

2. Terminology

   Note that this document is not on the IETF standards track. However,
   a conformant implementation is supposed to adhere to the specified
   behavior for security and interoperability reasons. This text uses
   BCP 14 to describe that necessary behavior.

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

3. Detailed Operation

   This section describes the details related to creating a canonical
   JSON representation, and how they are addressed by JCS.

   Appendix F describes the RECOMMENDED way of adding JCS support to
   existing JSON tools.

3.1. Creation of Input Data

   Data to be canonically serialized is usually created by:

   o  Parsing previously generated JSON data.

   o  Programmatically creating data.

   Irrespective of the method used, the data to be serialized MUST be
   adapted for I-JSON [RFC7493] formatting, which implies the following:

   o  JSON Objects MUST NOT exhibit duplicate property names.

   o  JSON String data MUST be expressible as Unicode [UNICODE].

   o  JSON Number data MUST be expressible as IEEE-754 [IEEE754] double
      precision values. For applications needing higher precision or
      longer integers than offered by IEEE-754 double precision, it is
      RECOMMENDED to represent such numbers as JSON Strings, see
      Appendix D for details on how this can be performed in an
      interoperable and extensible way.

   An additional constraint is that parsed JSON String data MUST NOT be
   altered during subsequent serializations. For more information see
   Appendix E.

   Note: although the Unicode standard offers the possibility of
   rearranging certain character sequences, referred to as "Unicode
   Normalization" (https://www.unicode.org/reports/tr15/), JCS compliant
   string processing does not take this into consideration. That is, all
   components involved in a scheme depending on JCS MUST preserve
   Unicode string data "as is".

3.2. Generation of Canonical JSON Data

   The following subsections describe the steps required to create a
   canonical JSON representation of the data elaborated on in the
   previous section.

   Appendix A shows sample code for an ES6 based canonicalizer, matching
   the JCS specification.

3.2.1. Whitespace

   Whitespace between JSON tokens MUST NOT be emitted.

3.2.2. Serialization of Primitive Data Types

   Assume a JSON object as follows is parsed:

     {
       "numbers": [333333333.33333329, 1E30, 4.50,
                   2e-3, 0.000000000000000000000000001],
       "string": "\u20ac$\u000F\u000aA'\u0042\u0022\u005c\\\"\/",
       "literals": [null, true, false]
     }

   If the parsed data is subsequently serialized using a serializer
   compliant with ES6's "JSON.stringify()", the result would (with a
   line wrap added for display purposes only) be rather divergent with
   respect to the original data:

     {"numbers":[333333333.3333333,1e+30,4.5,0.002,1e-27],"string":
     "€$\u000f\nA'B\"\\\\\"/","literals":[null,true,false]}

   The difference between the parsed data and its serialized
   counterpart is due to a wide tolerance on input data (as defined by
   JSON [RFC8259]), while output data (as defined by ES6) has a fixed
   representation. As can be seen in the example, numbers are subject
   to rounding as well.

   The following subsections describe the serialization of primitive
   JSON data types according to JCS. This part is identical to that of
   ES6. In the (unlikely) event that a future version of ECMAScript
   would invalidate any of the following serialization methods, it will
   be up to the developer community to either stick to this
   specification or create a new specification.

3.2.2.1. Serialization of Literals

   In accordance with JSON [RFC8259], the literals "null", "true", and
   "false" MUST be serialized as null, true, and false respectively.

3.2.2.2. Serialization of Strings

   For JSON String data (which includes JSON Object property names as
   well), each Unicode code point MUST be serialized as described below
   (see section 24.3.2.2 of [ES6]):

   o  If the Unicode value falls within the traditional ASCII control
      character range (U+0000 through U+001F), it MUST be serialized
      using lowercase hexadecimal Unicode notation (\uhhhh) unless it is
      in the set of predefined JSON control characters U+0008, U+0009,
      U+000A, U+000C or U+000D which MUST be serialized as \b, \t, \n,
      \f and \r respectively.

   o  If the Unicode value is outside of the ASCII control character
      range, it MUST be serialized "as is" unless it is equivalent to
      U+005C (\) or U+0022 (") which MUST be serialized as \\ and \"
      respectively.

   Finally, the resulting sequence of Unicode code points MUST be
   enclosed in double quotes (").

   Note: since invalid Unicode data like "lone surrogates" (e.g.
   U+DEAD) may lead to interoperability issues including broken
   signatures, occurrences of such data MUST cause a compliant JCS
   implementation to terminate with an appropriate error.

3.2.2.3. Serialization of Numbers

   ES6 builds on the IEEE-754 [IEEE754] double precision standard for
   representing JSON Number data. Such data MUST be serialized
   according to section 7.1.12.1 of [ES6] including the "Note 2"
   enhancement.

   Due to the relative complexity of this part, the algorithm itself is
   not included in this document. For implementers of JCS compliant
   number serialization, Google's implementation in V8 [V8] may serve as
   a reference. Another compatible number serialization reference
   implementation is Ryu [RYU], which is used by the JCS open source
   Java implementation mentioned in Appendix G. Appendix B holds a set
   of IEEE-754 sample values and their corresponding JSON serialization.
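
   On an ES6 compatible platform, the number serialization rule can be
   spot-checked against a few of the Appendix B sample values. The
   following is a minimal sketch only, assuming a runtime that offers
   "BigInt" and "DataView.setBigUint64()". The helper names
   "hexToDouble" and "serializeNumber" are hypothetical and not part of
   this specification:

```javascript
// Hypothetical helper: reinterpret 16 hexadecimal digits as an
// IEEE-754 double precision value (big-endian, as in Appendix B).
function hexToDouble(hex) {
  const view = new DataView(new ArrayBuffer(8));
  view.setBigUint64(0, BigInt('0x' + hex));
  return view.getFloat64(0);
}

// Hypothetical helper: delegate number serialization to ES6, but
// reject values that JSON cannot express. Without this check,
// JSON.stringify() would silently emit "null" for NaN and Infinity.
function serializeNumber(value) {
  if (!Number.isFinite(value)) {
    throw new RangeError('NaN and Infinity are not permitted in JSON');
  }
  return JSON.stringify(value);
}

console.log(serializeNumber(hexToDouble('4340000000000000')));
console.log(serializeNumber(hexToDouble('44b52d02c7e14af6')));
console.log(serializeNumber(hexToDouble('8000000000000000')));
```

   The three calls are expected to print the "JSON Representation"
   column entries for the corresponding rows in Appendix B. Platforms
   without an ES6 compatible serializer would need a dedicated
   algorithm instead of the "JSON.stringify()" call.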

   Note: since "NaN" (Not a Number) and "Infinity" are not permitted in
   JSON, occurrences of "NaN" or "Infinity" MUST cause a compliant JCS
   implementation to terminate with an appropriate error.

3.2.3. Sorting of Object Properties

   Although the previous step normalized the representation of primitive
   JSON data types, the result would not yet qualify as "canonical"
   since JSON Object properties are not in lexicographic (alphabetical)
   order.

   Applied to the sample in Section 3.2.2, a properly canonicalized
   version should (with a line wrap added for display purposes only)
   read as:

     {"literals":[null,true,false],"numbers":[333333333.3333333,
     1e+30,4.5,0.002,1e-27],"string":"€$\u000f\nA'B\"\\\\\"/"}

   The rules for lexicographic sorting of JSON Object properties
   according to JCS are as follows:

   o  JSON Object properties MUST be sorted recursively, which means
      that JSON child Objects MUST have their properties sorted as well.

   o  JSON Array data MUST also be scanned for the presence of JSON
      Objects (if an object is found then its properties MUST be
      sorted), but array element order MUST NOT be changed.

   When a JSON Object is about to have its properties sorted, the
   following measures MUST be adhered to:

   o  The sorting process is applied to property name strings in their
      "raw" (unescaped) form. That is, a newline character is treated
      as U+000A.

   o  Property name strings to be sorted are formatted as arrays of
      UTF-16 [UNICODE] code units. The sorting is based on pure value
      comparisons, where code units are treated as unsigned integers,
      independent of locale settings.

   o  Property name strings either have different values at some index
      that is a valid index for both strings, or their lengths are
      different, or both. If they have different values at one or more
      index positions, let k be the smallest such index; then the string
      whose value at position k has the smaller value, as determined by
      using the < operator, lexicographically precedes the other string.
      If there is no index position at which they differ, then the
      shorter string lexicographically precedes the longer string.

      In plain English this means that property names are sorted in
      ascending order like the following:

        ""
        "a"
        "aa"
        "ab"

   The rationale for basing the sorting algorithm on UTF-16 code units
   is that it maps directly to the string type in ECMAScript (featured
   in Web browsers and Node.js), Java and .NET. In addition, JSON only
   supports escape sequences expressed as UTF-16 code units making
   knowledge and handling of such data a necessity anyway. Systems
   using another internal representation of string data will need to
   convert JSON property name strings into arrays of UTF-16 code units
   before sorting. The conversion from UTF-8 or UTF-32 to UTF-16 is
   defined by the Unicode [UNICODE] standard.

   The following test data can be used for verifying the correctness of
   the sorting scheme in a JCS implementation. JSON test data:

     {
       "\u20ac": "Euro Sign",
       "\r": "Carriage Return",
       "\ufb33": "Hebrew Letter Dalet With Dagesh",
       "1": "One",
       "\ud83d\ude00": "Emoji: Grinning Face",
       "\u0080": "Control",
       "\u00f6": "Latin Small Letter O With Diaeresis"
     }

   Expected argument order after sorting property strings:

     "Carriage Return"
     "One"
     "Control"
     "Latin Small Letter O With Diaeresis"
     "Euro Sign"
     "Emoji: Grinning Face"
     "Hebrew Letter Dalet With Dagesh"

   Note: for the purpose of obtaining a deterministic property order,
   sorting on UTF-8 or UTF-32 encoded data would also work, but the
   outcome for JSON data like above would differ and thus be
   incompatible with this specification. However, in practice, property
   names are rarely defined outside of 7-bit ASCII, making it possible
   to sort on string data in UTF-8 or UTF-32 format without conversions
   to UTF-16 and still be compatible with JCS. Whether this is a viable
   option depends on the environment JCS is used in.

3.2.4. UTF-8 Generation

   Finally, in order to create a platform independent representation,
   the result of the preceding step MUST be encoded in UTF-8.

   Applied to the sample in Section 3.2.3 this should yield the
   following bytes, here shown in hexadecimal notation:

     7b 22 6c 69 74 65 72 61 6c 73 22 3a 5b 6e 75 6c 6c 2c 74 72
     75 65 2c 66 61 6c 73 65 5d 2c 22 6e 75 6d 62 65 72 73 22 3a
     5b 33 33 33 33 33 33 33 33 33 2e 33 33 33 33 33 33 33 2c 31
     65 2b 33 30 2c 34 2e 35 2c 30 2e 30 30 32 2c 31 65 2d 32 37
     5d 2c 22 73 74 72 69 6e 67 22 3a 22 e2 82 ac 24 5c 75 30 30
     30 66 5c 6e 41 27 42 5c 22 5c 5c 5c 5c 5c 22 2f 22 7d

   This data is intended to be usable as input to cryptographic methods.

4. IANA Considerations

   This document has no IANA actions.

5. Security Considerations

   It is crucial to perform sanity checks on input data to avoid
   overflowing buffers and similar things that could affect the
   integrity of the system.

   When JCS is applied to signature schemes like the one described in
   Appendix F, applications MUST perform the following operations before
   acting upon received data:

   1. Parse the JSON data and verify that it adheres to I-JSON.

   2. Verify the data for correctness according to the conventions
      defined by the ecosystem where it is to be used. This also
      includes locating the property holding the signature data.

   3. Verify the signature.

   If any of these steps fail, the operation in progress MUST be
   aborted.

6. Acknowledgements

   Building on ES6 Number serialization was originally proposed by
   James Manger.
   This ultimately led to the adoption of the entire ES6
   serialization scheme for JSON primitives.

   Other people who have contributed with valuable input to this
   specification include Scott Ananian, Tim Bray, Ben Campbell, Adrian
   Farrel, Richard Gibson, Bron Gondwana, John-Mark Gurney, John Levine,
   Mark Miller, Matthew Miller, Mike Jones, Mark Nottingham,
   Mike Samuel, Jim Schaad, Robert Tupelo-Schneck and Michal Wadas.

   For carrying out real world concept verification, the software and
   support for number serialization provided by Ulf Adams,
   Tanner Gooding and Remy Oudompheng was very helpful.

7. References

7.1. Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC8259]  Bray, T., Ed., "The JavaScript Object Notation (JSON) Data
              Interchange Format", STD 90, RFC 8259,
              DOI 10.17487/RFC8259, December 2017.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

   [RFC7493]  Bray, T., Ed., "The I-JSON Message Format", RFC 7493,
              DOI 10.17487/RFC7493, March 2015.

   [ES6]      Ecma International, "ECMAScript 2015 Language
              Specification", June 2015.

   [IEEE754]  IEEE, "IEEE Standard for Floating-Point Arithmetic",
              August 2008.

   [UNICODE]  The Unicode Consortium, "The Unicode Standard, Version
              12.1.0", May 2019.

7.2. Informative References

   [RFC7638]  Jones, M. and N. Sakimura, "JSON Web Key (JWK)
              Thumbprint", RFC 7638, DOI 10.17487/RFC7638, September
              2015.

   [RFC4648]  Josefsson, S., "The Base16, Base32, and Base64 Data
              Encodings", RFC 4648, DOI 10.17487/RFC4648, October 2006.

   [RFC7515]  Jones, M., Bradley, J., and N. Sakimura, "JSON Web
              Signature (JWS)", RFC 7515, DOI 10.17487/RFC7515, May
              2015.

   [JSONCOMP] A. Rundgren, ""Comparable" JSON - Work in progress".

   [V8]       Google LLC, "Chrome V8 Open Source JavaScript Engine".

   [RYU]      Ulf Adams, "Ryu floating point number serializing
              algorithm".

   [NODEJS]   "Node.js".

   [KEYBASE]  "Keybase".

   [OPENAPI]  "The OpenAPI Initiative".

   [XMLDSIG]  W3C, "XML Signature Syntax and Processing Version 1.1".

Appendix A. ES6 Sample Canonicalizer

   Below is an example of a JCS canonicalizer for usage with ES6 based
   systems:

     // Since the primary purpose of this code is highlighting
     // the core of the JCS algorithm, error handling and
     // UTF-8 generation were not implemented
     var canonicalize = function(object) {

         var buffer = '';
         serialize(object);
         return buffer;

         function serialize(object) {
             if (object === null || typeof object !== 'object' ||
                 object.toJSON != null) {
                 // Primitive type or toJSON - Use ES6/JSON
                 buffer += JSON.stringify(object);

             } else if (Array.isArray(object)) {
                 // Array - Maintain element order
                 buffer += '[';
                 let next = false;
                 object.forEach((element) => {
                     if (next) {
                         buffer += ',';
                     }
                     next = true;
                     // Array element - Recursive expansion
                     serialize(element);
                 });
                 buffer += ']';

             } else {
                 // Object - Sort properties before serializing
                 buffer += '{';
                 let next = false;
                 Object.keys(object).sort().forEach((property) => {
                     if (next) {
                         buffer += ',';
                     }
                     next = true;
                     // Property names are strings - Use ES6/JSON
                     buffer += JSON.stringify(property);
                     buffer += ':';
                     // Property value - Recursive expansion
                     serialize(object[property]);
                 });
                 buffer += '}';
             }
         }
     };

Appendix B. Number Serialization Samples

   The following table holds a set of ES6 compatible Number
   serialization samples, including some edge cases. The column
   "IEEE-754" refers to the internal ES6 representation of the Number
   data type which is based on the IEEE-754 [IEEE754] standard using
   64-bit (double precision) values, here expressed in hexadecimal.

   |===================================================================|
   | IEEE-754         | JSON Representation        | Comment           |
   |===================================================================|
   | 0000000000000000 | 0                          | Zero              |
   | 8000000000000000 | 0                          | Minus zero        |
   | 0000000000000001 | 5e-324                     | Min pos number    |
   | 8000000000000001 | -5e-324                    | Min neg number    |
   | 7fefffffffffffff | 1.7976931348623157e+308    | Max pos number    |
   | ffefffffffffffff | -1.7976931348623157e+308   | Max neg number    |
   | 4340000000000000 | 9007199254740992           | Max pos int (1)   |
   | c340000000000000 | -9007199254740992          | Max neg int (1)   |
   | 4430000000000000 | 295147905179352830000      | ~2**68 (2)        |
   | 7fffffffffffffff |                            | NaN (3)           |
   | 7ff0000000000000 |                            | Infinity (3)      |
   | 44b52d02c7e14af5 | 9.999999999999997e+22      |                   |
   | 44b52d02c7e14af6 | 1e+23                      |                   |
   | 44b52d02c7e14af7 | 1.0000000000000001e+23     |                   |
   | 444b1ae4d6e2ef4e | 999999999999999700000      |                   |
   | 444b1ae4d6e2ef4f | 999999999999999900000      |                   |
   | 444b1ae4d6e2ef50 | 1e+21                      |                   |
   | 3eb0c6f7a0b5ed8c | 9.999999999999997e-7       |                   |
   | 3eb0c6f7a0b5ed8d | 0.000001                   |                   |
   | 41b3de4355555553 | 333333333.3333332          |                   |
   | 41b3de4355555554 | 333333333.33333325         |                   |
   | 41b3de4355555555 | 333333333.3333333          |                   |
   | 41b3de4355555556 | 333333333.3333334          |                   |
   | 41b3de4355555557 | 333333333.33333343         |                   |
   | becbf647612f3696 | -0.0000033333333333333333  |                   |
   | 43143ff3c1cb0959 | 1424953923781206.2         | Round to even (4) |
   |===================================================================|

   Notes:

   (1) For maximum compliance with the ES6 "JSON" object, values that
       are to be interpreted as true integers SHOULD be in the range
       -9007199254740991 to 9007199254740991. However, how numbers are
       used in applications does not affect the JCS algorithm.

   (2) Although a set of specific integers like 2**68 could be regarded
       as having extended precision, the JCS/ES6 number serialization
       algorithm does not take this into consideration.

   (3) Value out of range, not permitted in JSON. See Section 3.2.2.3.

   (4) This number is exactly 1424953923781206.25 but will after the
       "Note 2" rule mentioned in Section 3.2.2.3 be truncated and
       rounded to the closest even value.

   For a more exhaustive validation of a JCS number serializer, you may
   test against a file (currently) available in the development portal
   (see Appendix I), containing a large set of sample values. Another
   option is running V8 [V8] as a live reference together with a program
   generating a substantial amount of random IEEE-754 values.

Appendix C. Canonicalized JSON as "Wire Format"

   Since the result from the canonicalization process (see
   Section 3.2.4) is fully valid JSON, it can also be used as
   "Wire Format". However, this is just an option since cryptographic
   schemes based on JCS in most cases would not depend on externally
   supplied JSON data already being canonicalized.
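
   To illustrate the point above, the following sketch computes the
   same digest regardless of whether the data arrived pretty-printed
   or already canonicalized. It assumes Node.js and its "crypto"
   module. "canonicalDigest" is a hypothetical helper name, and the
   compact canonicalizer is a reduced variant of the Appendix A sample
   (no "toJSON" handling, no error checks):

```javascript
const { createHash } = require('crypto');

// Reduced variant of the Appendix A sample canonicalizer.
function canonicalize(value) {
  if (value === null || typeof value !== 'object') {
    return JSON.stringify(value);
  }
  if (Array.isArray(value)) {
    // Arrays keep their element order.
    return '[' + value.map((element) => canonicalize(element)).join(',') + ']';
  }
  // Objects get their property names sorted on UTF-16 code units.
  return '{' + Object.keys(value).sort().map((name) =>
    JSON.stringify(name) + ':' + canonicalize(value[name])).join(',') + '}';
}

// Hypothetical helper: SHA-256 over the UTF-8 encoded canonical form.
function canonicalDigest(jsonText) {
  const canonical = canonicalize(JSON.parse(jsonText));
  return createHash('sha256')
    .update(Buffer.from(canonical, 'utf8'))
    .digest('hex');
}

const pretty = '{\n  "b": [1.0, 2e0],\n  "a": "x"\n}';
const canonicalWire = '{"a":"x","b":[1,2]}';
console.log(canonicalDigest(pretty) === canonicalDigest(canonicalWire));
```

   Because both inputs are parsed and canonicalized before hashing, the
   comparison is expected to print "true".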
684 In fact, the ES6 standard way of serializing objects using 685 "JSON.stringify()" produces a more "logical" format, where properties 686 are kept in the order they were created or received. The example 687 below shows an address record which could benefit from ES6 standard 688 serialization: 690 { 691 "name": "John Doe", 692 "address": "2000 Sunset Boulevard", 693 "city": "Los Angeles", 694 "zip": "90001", 695 "state": "CA" 696 } 698 Using canonicalization the properties above would be output in the 699 order "address", "city", "name", "state" and "zip", which adds 700 fuzziness to the data from a human (developer or technical support), 701 perspective. Canonicalization also converts JSON data into a single 702 line of text, which may be less than ideal for debugging and logging. 704 Appendix D. Dealing with Big Numbers 706 There are several issues associated with the JSON Number type, here 707 illustrated by the following sample object: 709 { 710 "giantNumber": 1.4e+9999, 711 "payMeThis": 26000.33, 712 "int64Max": 9223372036854775807 713 } 715 Although the sample above conforms to JSON [RFC8259], applications 716 would normally use different native data types for storing 717 "giantNumber" and "int64Max". In addition, monetary data like 718 "payMeThis" would presumably not rely on floating point data types 719 due to rounding issues with respect to decimal arithmetic. 721 The established way handling this kind of "overloading" of the JSON 722 Number type (at least in an extensible manner), is through mapping 723 mechanisms, instructing parsers what to do with different properties 724 based on their name. However, this greatly limits the value of using 725 the JSON Number type outside of its original somewhat constrained, 726 JavaScript context. The ES6 "JSON" object does not support mappings 727 to JSON Number either. 729 Due to the above, numbers that do not have a natural place in the 730 current JSON ecosystem MUST be wrapped using the JSON String type. 
731     This is close to a de-facto standard for open systems. This also
732     applies to other data types that do not have direct support in
733     JSON, like "DateTime" objects as described in Appendix E.

735     Aided by a system using the JSON String type, be it programmatic like

737       var obj = JSON.parse('{"giantNumber": "1.4e+9999"}');
738       var biggie = new BigNumber(obj.giantNumber);

740     or declarative schemes like OpenAPI [OPENAPI], JCS imposes no limits
741     on applications, including when using ES6.

743  Appendix E. String Subtype Handling

745     Due to the limited set of data types featured in JSON, the JSON
746     String type is commonly used for holding subtypes. This can,
747     depending on the JSON parsing method, lead to interoperability
748     problems which MUST be dealt with by JCS compliant applications
749     targeting a wider audience.

751     Assume you want to parse a JSON object where the schema designer
752     assigned the property "big" for holding a "BigInt" subtype and "time"
753     for holding a "DateTime" subtype, while "val" is supposed to be a
754     JSON Number compliant with JCS. The following example shows such an
755     object:

757       {
758         "time": "2019-01-28T07:45:10Z",
759         "big": "055",
760         "val": 3.5
761       }

763     Parsing of this object can be accomplished by the following ES6
764     statement:

766       var object = JSON.parse(JSON_object_featured_as_a_string);

768     After parsing, the actual data can be extracted, which for subtypes
769     also involves a conversion step using the result of the parsing
770     process (an ECMAScript object) as input:

772       ... = new Date(object.time); // Date object
773       ... = BigInt(object.big);    // Big integer
774       ... = object.val;            // JSON/JS number

776     Note that the "BigInt" data type is currently only natively supported
777     by V8 [V8].
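The two steps above can be put together as in the following sketch, assuming a runtime with native "BigInt" support. The variable names are illustrative only and do not come from the specification. Note that the parsed object itself is left untouched by the conversions, so the original string values remain available for canonicalization.

```javascript
// The sample object from the text above, held as a string
const input = '{"time":"2019-01-28T07:45:10Z","big":"055","val":3.5}';

// Step 1: plain parsing; subtypes remain ordinary JSON strings
const parsed = JSON.parse(input);

// Step 2: conversion into native types, kept apart from "parsed"
const time = new Date(parsed.time);   // Date object
const big = BigInt(parsed.big);       // Big integer
const val = parsed.val;               // JSON/JS number

console.log(big.toString());          // 55
console.log(time.toISOString());      // 2019-01-28T07:45:10.000Z

// The parsed object still carries the original string arguments
console.log(parsed.big);              // 055
console.log(JSON.stringify(parsed) === input);  // true
```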
779     Canonicalization of "object" using the sample code in Appendix A
780     would return the following string:

782       {"big":"055","time":"2019-01-28T07:45:10Z","val":3.5}

784     Although this is (with respect to JCS) technically correct, there is
785     another way of parsing JSON data, which can also be used with
786     ECMAScript, as shown below:

788       // "BigInt" requires the following code to become JSON serializable
789       BigInt.prototype.toJSON = function() {
790           return this.toString();
791       };

793       // JSON parsing using a "stream" based method
794       var object = JSON.parse(JSON_object_featured_as_a_string,
795           (k,v) => k == 'time' ? new Date(v) : k == 'big' ? BigInt(v) : v
796       );

798     If you now apply the canonicalizer in Appendix A to "object", the
799     following string would be generated:

801       {"big":"55","time":"2019-01-28T07:45:10.000Z","val":3.5}

803     In this case the string arguments for "big" and "time" have changed
804     with respect to the original, presumably making an application
805     depending on JCS fail.

807     The reason for the deviation is that in stream and schema based JSON
808     parsers, the original "string" argument is typically replaced
809     on-the-fly by the native subtype which, when serialized, may exhibit
810     a different and platform-dependent pattern.

812     That is, stream and schema based parsing MUST treat subtypes as
813     "pure" (immutable) JSON String types, and perform the actual
814     conversion to the designated native type in a subsequent step. In
815     modern programming platforms like Go, Java and C# this can be
816     achieved with moderate effort by combining annotations, getters and
817     setters.
        Below is an example in C#/Json.NET showing a part of a
818     class that is serializable as a JSON Object:

820       // The "pure" string solution uses a local
821       // string variable for JSON serialization while
822       // exposing another type to the application
823       [JsonProperty("amount")]
824       private string _amount;

826       [JsonIgnore]
827       public decimal Amount {
828           get { return decimal.Parse(_amount); }
829           set { _amount = value.ToString(); }
830       }

832     In an application "Amount" can be accessed as any other property
833     while it is actually represented by a quoted string in JSON contexts.

835     Note: the example above also addresses the constraints on numeric
836     data implied by I-JSON (the C# "decimal" data type has quite
837     different characteristics compared to IEEE-754 double precision).

839  E.1. Subtypes in Arrays

841     Since the JSON Array construct permits mixing arbitrary JSON data
842     types, custom parsing and serialization code may be required to cope
843     with subtypes anyway.

845  Appendix F. Implementation Guidelines

847     The optimal solution is integrating support for JCS directly in JSON
848     serializers (parsers need no changes). That is, canonicalization
849     would just be an additional "mode" for a JSON serializer. However,
850     this is currently not the case. Fortunately, JCS support can be
851     introduced through externally supplied canonicalizer software acting
852     as a post processor to existing JSON serializers. This arrangement
853     also relieves the JCS implementer from having to deal with how
854     underlying data is to be represented in JSON.

856     The post processor concept enables signature creation schemes like
857     the following:

859     1.  Create the data to be signed.

861     2.  Serialize the data using existing JSON tools.

863     3.  Let the external canonicalizer process the serialized data and
864         return canonicalized result data.

866     4.  Sign the canonicalized data.

868     5.
            Add the resulting signature value to the original JSON data
869         through a designated signature property.

871     6.  Serialize the completed (now signed) JSON object using existing
872         JSON tools.

874     A compatible signature verification scheme would then be as follows:

876     1.  Parse the signed JSON data using existing JSON tools.

878     2.  Read and save the signature value from the designated signature
879         property.

881     3.  Remove the signature property from the parsed JSON object.

883     4.  Serialize the remaining JSON data using existing JSON tools.

885     5.  Let the external canonicalizer process the serialized data and
886         return canonicalized result data.

888     6.  Verify that the canonicalized data matches the saved signature
889         value using the algorithm and key used for creating the
890         signature.

892     A canonicalizer like the one above is effectively only a "filter",
893     potentially usable with a multitude of quite different cryptographic
894     schemes.

896     Using a JSON serializer with integrated JCS support, the
897     serialization performed before the canonicalization step could be
898     eliminated for both processes.

900  Appendix G. Open Source Implementations

902     The following Open Source implementations have been verified to be
903     compatible with JCS:

905     *  JavaScript: https://www.npmjs.com/package/canonicalize

907     *  Java: https://github.com/erdtman/java-json-canonicalization
908     *  Go:
909        https://github.com/cyberphone/json-canonicalization/tree/master/go

911     *  .NET/C#:
912        https://github.com/cyberphone/json-canonicalization/tree/master/dotnet

914     *  Python:
915        https://github.com/cyberphone/json-canonicalization/tree/master/python3

917  Appendix H. Other JSON Canonicalization Efforts

919     There are (and have been) other efforts creating "Canonical JSON".
920     Below is a list of URLs to some of them:

922     *  https://tools.ietf.org/html/draft-staykov-hu-json-canonical-form-00

925     *  https://gibson042.github.io/canonicaljson-spec/

927     *  http://wiki.laptop.org/go/Canonical_JSON

929     The listed efforts all build on text level JSON to JSON
930     transformations. The primary feature of text level canonicalization
931     is that it can be made neutral to the flavor of JSON used. However,
932     such schemes also imply major changes to the JSON parsing process
933     which is a likely hurdle for adoption. Albeit at the expense of
934     certain JSON and application constraints, JCS was designed to be
935     compatible with existing JSON tools.

937  Appendix I. Development Portal

939     The JCS specification is currently developed at:
940     https://github.com/cyberphone/ietf-json-canon.

942     JCS source code and extensive test data are available at:
943     https://github.com/cyberphone/json-canonicalization

945  Appendix J. Document History

947     [[ This section to be removed by the RFC Editor before publication as
948     an RFC ]]

950     Version 00-06:

952     *  See IETF diff listings.

954     Version 07:

956     *  Initial conversion to XML RFC version 3.

958     *  Changed intended status to "Informational".

960     *  Added UTF-16 test data and explanations.

962     Version 08:

964     *  Updated Abstract.

966     *  Added a "Note 2" number serialization sample.

968     *  Updated Security Considerations.

970     *  Tried to clear up the JSON input data section.

972     *  Added a line about Unicode normalization.

974     *  Added a line about serialization of structured data.

976     *  Added a missing fact about "BigInt" (V8 not ES6).

978     Version 09:

980     *  Updated initial line of Abstract and Introduction.

982     *  Added note about breaking ECMAScript changes.

984     *  Minor language nit fixes.

986     Version 10-12:

988     *  Language tweaks.

990     Version 13:

992     *  Reorganized Section 3.2.2.3.

994     Version 14:

996     *  Improved introduction + some minor changes in security
997        considerations, acknowledgements, and Unicode normalization.
999     *  Generalized data representation issues by updating Appendix F.

1001    Version 15:

1003    *  Minor nits, reverted the IEEE-754 table to ASCII.

1005    *  Added a bit more meat to the IEEE-754 table.

1007    *  Changed all to: type="ascii-art" and removed name="".

1009    Version 16:

1011    *  Updated section 2 according to AD's wish.

1013    Version 17:

1015    *  Updated section 2 after IESG input.

1017    *  Author affiliation update.

1019 Authors' Addresses

1021    Anders Rundgren
1022    Independent
1023    Montpellier
1024    France

1026    Email: anders.rundgren.net@gmail.com
1027    URI:   https://www.linkedin.com/in/andersrundgren/

1029    Bret Jordan
1030    Broadcom
1031    1320 Ridder Park Drive
1032    San Jose, CA 95131
1033    United States of America

1035    Email: bret.jordan@broadcom.com

1037    Samuel Erdtman
1038    Spotify AB
1039    Birger Jarlsgatan 61, 4tr
1040    SE-113 56 Stockholm
1041    Sweden

1043    Email: erdtman@spotify.com