Network Working Group                                        A. Rundgren
Internet-Draft                                               Independent
Intended status: Informational                                 B. Jordan
Expires: 20 May 2020                                Symantec Corporation
                                                              S. Erdtman
                                                              Spotify AB
                                                        17 November 2019


                   JSON Canonicalization Scheme (JCS)
             draft-rundgren-json-canonicalization-scheme-15

Abstract

   Cryptographic operations like hashing and signing need the data to be
   expressed in an invariant format so that the operations are reliably
   repeatable.
   One way to address this is to create a canonical
   representation of the data. Canonicalization also permits data to be
   exchanged in its original form on the "wire" while cryptographic
   operations performed on the canonicalized counterpart of the data in
   the producer and consumer end points generate consistent results.

   This document describes the JSON Canonicalization Scheme (JCS). The
   JCS specification defines how to create a canonical representation of
   JSON data by building on the strict serialization methods for JSON
   primitives defined by ECMAScript, constraining JSON data to the
   I-JSON subset, and by using deterministic property sorting.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 20 May 2020.

Copyright Notice

   Copyright (c) 2019 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.
   Code Components extracted from this document must include
   Simplified BSD License text as described in Section 4.e of the Trust
   Legal Provisions and are provided without warranty as described in
   the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  Detailed Operation
     3.1.  Creation of Input Data
     3.2.  Generation of Canonical JSON Data
       3.2.1.  Whitespace
       3.2.2.  Serialization of Primitive Data Types
         3.2.2.1.  Serialization of Literals
         3.2.2.2.  Serialization of Strings
         3.2.2.3.  Serialization of Numbers
       3.2.3.  Sorting of Object Properties
       3.2.4.  UTF-8 Generation
   4.  IANA Considerations
   5.  Security Considerations
   6.  Acknowledgements
   7.  References
     7.1.  Normative References
     7.2.  Informative References
   Appendix A.  ES6 Sample Canonicalizer
   Appendix B.  Number Serialization Samples
   Appendix C.  Canonicalized JSON as "Wire Format"
   Appendix D.  Dealing with Big Numbers
   Appendix E.  String Subtype Handling
     E.1.  Subtypes in Arrays
   Appendix F.  Implementation Guidelines
   Appendix G.  Open Source Implementations
   Appendix H.  Other JSON Canonicalization Efforts
   Appendix I.  Development Portal
   Appendix J.  Document History
   Authors' Addresses

1.  Introduction

   This document describes the JSON Canonicalization Scheme (JCS). The
   JCS specification defines how to create a canonical representation of
   JSON [RFC8259] data by building on the strict serialization methods
   for JSON primitives defined by ECMAScript [ES6], constraining JSON
   data to the I-JSON [RFC7493] subset, and by using deterministic
   property sorting. The output from JCS is a "Hashable" representation
   of JSON data that can be used by cryptographic methods. The
   subsequent paragraphs outline the primary design considerations.

   Cryptographic operations like hashing and signing need the data to be
   expressed in an invariant format so that the operations are reliably
   repeatable. One way to accomplish this is to convert the data into a
   format that has a simple and fixed representation, like Base64Url
   [RFC4648]. This is how JWS [RFC7515] addressed this issue. Another
   solution is to create a canonical version of the data, similar to
   what was done for the XML Signature [XMLDSIG] standard.

   The primary advantage with a canonicalizing scheme is that data can
   be kept in its original form. This is the core rationale behind JCS.
   Put another way, using canonicalization enables a JSON Object to
   remain a JSON Object even after being signed. This can simplify
   system design, documentation, and logging.

   To avoid "reinventing the wheel", JCS relies on the serialization of
   JSON primitives (strings, numbers and literals), as defined by
   ECMAScript (aka JavaScript) beginning with version 6 [ES6], hereafter
   referred to as "ES6".

   Seasoned XML developers may recall difficulties getting XML
   signatures to validate. This was usually due to different
   interpretations of the quite intricate XML canonicalization rules as
   well as of the equally complex Web Services security standards. The
   reasons why JCS should not suffer from similar issues are:

   o  The absence of a namespace concept and default values.

   o  Constraining data to the I-JSON [RFC7493] subset. This eliminates
      the need for specific parsers for dealing with canonicalization.

   o  JCS compatible serialization of JSON primitives is currently
      supported by most Web browsers as well as by Node.js [NODEJS].

   o  The full JCS specification is currently supported by multiple Open
      Source implementations (see Appendix G). See also Appendix F for
      implementation guidelines.

   JCS is compatible with some existing systems relying on JSON
   canonicalization such as JWK Thumbprint [RFC7638] and Keybase
   [KEYBASE].

   For potential uses outside of cryptography see [JSONCOMP].

   The intended audiences of this document are JSON tool vendors, as
   well as designers of JSON based cryptographic solutions. The reader
   is assumed to be knowledgeable in ECMAScript including the "JSON"
   object.

2.  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

3.  Detailed Operation

   This section describes the details related to creating a canonical
   JSON representation, and how they are addressed by JCS.

   Appendix F describes the RECOMMENDED way of adding JCS support to
   existing JSON tools.

3.1.  Creation of Input Data

   Data to be canonically serialized is usually created by:

   o  Parsing previously generated JSON data.

   o  Programmatically creating data.

   Irrespective of the method used, the data to be serialized MUST be
   adapted for I-JSON [RFC7493] formatting, which implies the following:

   o  JSON Objects MUST NOT exhibit duplicate property names.

   o  JSON String data MUST be expressible as Unicode [UNICODE].

   o  JSON Number data MUST be expressible as IEEE-754 [IEEE754] double
      precision values. For applications needing higher precision or
      longer integers than offered by IEEE-754 double precision, it is
      RECOMMENDED to represent such numbers as JSON Strings; see
      Appendix D for details on how this can be performed in an
      interoperable and extensible way.

   An additional constraint is that parsed JSON String data MUST NOT be
   altered during subsequent serializations. For more information see
   Appendix E.

   Note: although the Unicode standard offers the possibility of
   rearranging certain character sequences, referred to as "Unicode
   Normalization" (https://www.unicode.org/reports/tr15/), JCS compliant
   string processing does not take this into consideration. That is,
   all components involved in a scheme depending on JCS MUST preserve
   Unicode string data "as is".

3.2.  Generation of Canonical JSON Data

   The following subsections describe the steps required to create a
   canonical JSON representation of the data elaborated on in the
   previous section.

   Appendix A shows sample code for an ES6 based canonicalizer, matching
   the JCS specification.

3.2.1.  Whitespace

   Whitespace between JSON tokens MUST NOT be emitted.

3.2.2.  Serialization of Primitive Data Types

   Assume a JSON object as follows is parsed:

   {
     "numbers": [333333333.33333329, 1E30, 4.50,
                 2e-3, 0.000000000000000000000000001],
     "string": "\u20ac$\u000F\u000aA'\u0042\u0022\u005c\\\"\/",
     "literals": [null, true, false]
   }

   If the parsed data is subsequently serialized using a serializer
   compliant with ES6's "JSON.stringify()", the result would (with a
   line wrap added for display purposes only) be rather divergent with
   respect to the original data:

   {"numbers":[333333333.3333333,1e+30,4.5,0.002,1e-27],"string":
   "€$\u000f\nA'B\"\\\\\"/","literals":[null,true,false]}

   The difference between the parsed data and its serialized
   counterpart is due to a wide tolerance on input data (as defined by
   JSON [RFC8259]), while output data (as defined by ES6) has a fixed
   representation. As can be seen in the example, numbers are subject
   to rounding as well.

   The following subsections describe the serialization of primitive
   JSON data types according to JCS. This part is identical to that of
   ES6. In the (unlikely) event that a future version of ECMAScript
   would invalidate any of the following serialization methods, it will
   be up to the developer community to either stick to this
   specification or create a new specification.

3.2.2.1.  Serialization of Literals

   In accordance with JSON [RFC8259], the literals "null", "true", and
   "false" MUST be serialized as null, true, and false respectively.

3.2.2.2.  Serialization of Strings

   For JSON String data (which includes JSON Object property names as
   well), each Unicode code point MUST be serialized as described below
   (see section 24.3.2.2 of [ES6]):

   o  If the Unicode value falls within the traditional ASCII control
      character range (U+0000 through U+001F), it MUST be serialized
      using lowercase hexadecimal Unicode notation (\uhhhh) unless it is
      in the set of predefined JSON control characters U+0008, U+0009,
      U+000A, U+000C or U+000D which MUST be serialized as \b, \t, \n,
      \f and \r respectively.

   o  If the Unicode value is outside of the ASCII control character
      range, it MUST be serialized "as is" unless it is equivalent to
      U+005C (\) or U+0022 (") which MUST be serialized as \\ and \"
      respectively.

   Finally, the resulting sequence of Unicode code points MUST be
   enclosed in double quotes (").

   Note: since invalid Unicode data like "lone surrogates" (e.g.
   U+DEAD) may lead to interoperability issues including broken
   signatures, occurrences of such data MUST cause a compliant JCS
   implementation to terminate with an appropriate error.

3.2.2.3.  Serialization of Numbers

   ES6 builds on the IEEE-754 [IEEE754] double precision standard for
   representing JSON Number data. Such data MUST be serialized
   according to section 7.1.12.1 of [ES6] including the "Note 2"
   enhancement.

   Due to the relative complexity of this part, the algorithm itself is
   not included in this document. For implementers of JCS compliant
   number serialization, Google's implementation in V8 [V8] may serve as
   a reference. Another compatible number serialization reference
   implementation is Ryu [RYU], which is used by the JCS open source
   Java implementation mentioned in Appendix G. Appendix B holds a set
   of IEEE-754 sample values and their corresponding JSON serialization.
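
   As an illustration only, the following sketch shows how an ES6
   environment could be used to check number serialization against the
   IEEE-754 samples in Appendix B. The helper names "hexToDouble" and
   "serializeNumber" are assumptions made for this example and are not
   part of JCS or ES6. The sketch relies on "JSON.stringify()" for the
   actual formatting and adds an explicit check for non-finite values,
   since "JSON.stringify()" would otherwise silently emit null for
   them.

```javascript
// Hypothetical helper: turn a 16-digit hexadecimal string holding an
// IEEE-754 double precision value into an ES6 Number.
function hexToDouble(hex) {
  const view = new DataView(new ArrayBuffer(8));
  // DataView accessors are big-endian by default, matching the
  // left-to-right notation used in Appendix B.
  view.setUint32(0, parseInt(hex.slice(0, 8), 16));
  view.setUint32(4, parseInt(hex.slice(8), 16));
  return view.getFloat64(0);
}

// Hypothetical helper: serialize a Number the way ES6 does, while
// rejecting NaN and Infinity instead of letting them become null.
function serializeNumber(value) {
  if (typeof value !== 'number' || !Number.isFinite(value)) {
    throw new RangeError('NaN and Infinity are not permitted in JSON');
  }
  return JSON.stringify(value);
}

// A few IEEE-754 patterns taken from Appendix B, with the expected
// serialization listed there.
const samples = [
  ['0000000000000001', '5e-324'],
  ['8000000000000000', '0'],
  ['44b52d02c7e14af6', '1e+23'],
  ['3eb0c6f7a0b5ed8d', '0.000001']
];
for (const [hex, expected] of samples) {
  const actual = serializeNumber(hexToDouble(hex));
  console.log(hex, actual, actual === expected ? 'ok' : 'MISMATCH');
}
```

   On platforms without an ES6 compatible number formatter, the same
   comparison could be made against a native serializer by feeding it
   the bit patterns and checking the output against Appendix B.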

   Note: since "NaN" (Not a Number) and "Infinity" are not permitted in
   JSON, occurrences of "NaN" or "Infinity" MUST cause a compliant JCS
   implementation to terminate with an appropriate error.

3.2.3.  Sorting of Object Properties

   Although the previous step normalized the representation of primitive
   JSON data types, the result would not yet qualify as "canonical"
   since JSON Object properties are not in lexicographic (alphabetical)
   order.

   Applied to the sample in Section 3.2.2, a properly canonicalized
   version should (with a line wrap added for display purposes only)
   read as:

   {"literals":[null,true,false],"numbers":[333333333.3333333,
   1e+30,4.5,0.002,1e-27],"string":"€$\u000f\nA'B\"\\\\\"/"}

   The rules for lexicographic sorting of JSON Object properties
   according to JCS are as follows:

   o  JSON Object properties MUST be sorted recursively, which means
      that JSON child Objects MUST have their properties sorted as well.

   o  JSON Array data MUST also be scanned for the presence of JSON
      Objects (if an object is found then its properties MUST be
      sorted), but array element order MUST NOT be changed.

   When a JSON Object is about to have its properties sorted, the
   following measures MUST be adhered to:

   o  The sorting process is applied to property name strings in their
      "raw" (unescaped) form. That is, a newline character is treated
      as U+000A.

   o  Property name strings to be sorted are formatted as arrays of
      UTF-16 [UNICODE] code units. The sorting is based on pure value
      comparisons, where code units are treated as unsigned integers,
      independent of locale settings.

   o  Property name strings either have different values at some index
      that is a valid index for both strings, or their lengths are
      different, or both.
      If they have different values at one or more
      index positions, let k be the smallest such index; then the string
      whose value at position k has the smaller value, as determined by
      using the < operator, lexicographically precedes the other string.
      If there is no index position at which they differ, then the
      shorter string lexicographically precedes the longer string.

   In plain English this means that property names are sorted in
   ascending order like the following:

   ""
   "a"
   "aa"
   "ab"

   The rationale for basing the sorting algorithm on UTF-16 code units
   is that it maps directly to the string type in ECMAScript (featured
   in Web browsers and Node.js), Java and .NET. In addition, JSON only
   supports escape sequences expressed as UTF-16 code units making
   knowledge and handling of such data a necessity anyway. Systems
   using another internal representation of string data will need to
   convert JSON property name strings into arrays of UTF-16 code units
   before sorting. The conversion from UTF-8 or UTF-32 to UTF-16 is
   defined by the Unicode [UNICODE] standard.

   The following test data can be used for verifying the correctness of
   the sorting scheme in a JCS implementation. JSON test data:

   {
     "\u20ac": "Euro Sign",
     "\r": "Carriage Return",
     "\ufb33": "Hebrew Letter Dalet With Dagesh",
     "1": "One",
     "\ud83d\ude00": "Emoji: Grinning Face",
     "\u0080": "Control",
     "\u00f6": "Latin Small Letter O With Diaeresis"
   }

   Expected argument order after sorting property strings:

   "Carriage Return"
   "One"
   "Control"
   "Latin Small Letter O With Diaeresis"
   "Euro Sign"
   "Emoji: Grinning Face"
   "Hebrew Letter Dalet With Dagesh"

   Note: for the purpose of obtaining a deterministic property order,
   sorting on UTF-8 or UTF-32 encoded data would also work, but the
   outcome for JSON data like above would differ and thus be
   incompatible with this specification.
   However, in practice, property
   names are rarely defined outside of 7-bit ASCII making it possible to
   sort on string data in UTF-8 or UTF-32 format without conversions to
   UTF-16 and still be compatible with JCS. Whether this is a viable
   option depends on the environment JCS is used in.

3.2.4.  UTF-8 Generation

   Finally, in order to create a platform independent representation,
   the result of the preceding step MUST be encoded in UTF-8.

   Applied to the sample in Section 3.2.3 this should yield the
   following bytes, here shown in hexadecimal notation:

   7b 22 6c 69 74 65 72 61 6c 73 22 3a 5b 6e 75 6c 6c 2c 74 72
   75 65 2c 66 61 6c 73 65 5d 2c 22 6e 75 6d 62 65 72 73 22 3a
   5b 33 33 33 33 33 33 33 33 33 2e 33 33 33 33 33 33 33 2c 31
   65 2b 33 30 2c 34 2e 35 2c 30 2e 30 30 32 2c 31 65 2d 32 37
   5d 2c 22 73 74 72 69 6e 67 22 3a 22 e2 82 ac 24 5c 75 30 30
   30 66 5c 6e 41 27 42 5c 22 5c 5c 5c 5c 5c 22 2f 22 7d

   This data is intended to be usable as input to cryptographic methods.

4.  IANA Considerations

   This document has no IANA actions.

5.  Security Considerations

   It is crucial to perform sanity checks on input data to avoid
   overflowing buffers and similar things that could affect the
   integrity of the system.

   When JCS is applied to signature schemes like the one described in
   Appendix F, applications MUST perform the following operations before
   acting upon received data:

   1.  Parse the JSON data and verify that it adheres to I-JSON.

   2.  Verify the data for correctness according to the conventions
       defined by the ecosystem where it is to be used. This also
       includes locating the property holding the signature data.

   3.  Verify the signature.

   If any of these steps fail, the operation in progress MUST be
   aborted.

6.  Acknowledgements

   Building on ES6 Number serialization was originally proposed by
   James Manger.
   This ultimately led to the adoption of the entire ES6
   serialization scheme for JSON primitives.

   Other people who have contributed with valuable input to this
   specification include Scott Ananian, Tim Bray, Ben Campbell, Adrian
   Farell, Richard Gibson, Bron Gondwana, John-Mark Gurney, John Levine,
   Mark Miller, Matthew Miller, Mike Jones, Mark Nottingham,
   Mike Samuel, Jim Schaad, Robert Tupelo-Schneck and Michal Wadas.

   For carrying out real world concept verification, the software and
   support for number serialization provided by Ulf Adams,
   Tanner Gooding and Remy Oudompheng was very helpful.

7.  References

7.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC8259]  Bray, T., Ed., "The JavaScript Object Notation (JSON) Data
              Interchange Format", STD 90, RFC 8259,
              DOI 10.17487/RFC8259, December 2017.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

   [RFC7493]  Bray, T., Ed., "The I-JSON Message Format", RFC 7493,
              DOI 10.17487/RFC7493, March 2015.

   [ES6]      Ecma International, "ECMAScript 2015 Language
              Specification", June 2015.

   [IEEE754]  IEEE, "IEEE Standard for Floating-Point Arithmetic",
              August 2008.

   [UNICODE]  The Unicode Consortium, "The Unicode Standard, Version
              12.1.0", May 2019.

7.2.  Informative References

   [RFC7638]  Jones, M. and N. Sakimura, "JSON Web Key (JWK)
              Thumbprint", RFC 7638, DOI 10.17487/RFC7638, September
              2015.

   [RFC4648]  Josefsson, S., "The Base16, Base32, and Base64 Data
              Encodings", RFC 4648, DOI 10.17487/RFC4648, October 2006.

   [RFC7515]  Jones, M., Bradley, J., and N. Sakimura, "JSON Web
              Signature (JWS)", RFC 7515, DOI 10.17487/RFC7515, May
              2015.

   [JSONCOMP] A. Rundgren, "'Comparable' JSON - Work in progress".

   [V8]       Google LLC, "Chrome V8 Open Source JavaScript Engine".

   [RYU]      Ulf Adams, "Ryu floating point number serializing
              algorithm".

   [NODEJS]   "Node.js".

   [KEYBASE]  "Keybase".

   [OPENAPI]  "The OpenAPI Initiative".

   [XMLDSIG]  W3C, "XML Signature Syntax and Processing Version 1.1".

Appendix A.  ES6 Sample Canonicalizer

   Below is an example of a JCS canonicalizer for usage with ES6 based
   systems:

   ////////////////////////////////////////////////////////////
   // Since the primary purpose of this code is highlighting //
   // the core of the JCS algorithm, error handling and      //
   // UTF-8 generation were not implemented                  //
   ////////////////////////////////////////////////////////////
   var canonicalize = function(object) {

       var buffer = '';
       serialize(object);
       return buffer;

       function serialize(object) {
           if (object === null || typeof object !== 'object' ||
               object.toJSON != null) {
               /////////////////////////////////////////////////
               // Primitive type or toJSON - Use ES6/JSON     //
               /////////////////////////////////////////////////
               buffer += JSON.stringify(object);

           } else if (Array.isArray(object)) {
               /////////////////////////////////////////////////
               // Array - Maintain element order              //
               /////////////////////////////////////////////////
               buffer += '[';
               let next = false;
               object.forEach((element) => {
                   if (next) {
                       buffer += ',';
                   }
                   next = true;
                   /////////////////////////////////////////
                   // Array element - Recursive expansion //
                   /////////////////////////////////////////
                   serialize(element);
               });
               buffer += ']';

           } else {
               /////////////////////////////////////////////////
               // Object - Sort properties before serializing //
               /////////////////////////////////////////////////
               buffer += '{';
               let next = false;
               Object.keys(object).sort().forEach((property) => {
                   if (next) {
                       buffer += ',';
                   }
                   next = true;
                   ///////////////////////////////////////////////
                   // Property names are strings - Use ES6/JSON //
                   ///////////////////////////////////////////////
                   buffer += JSON.stringify(property);
                   buffer += ':';
                   //////////////////////////////////////////
                   // Property value - Recursive expansion //
                   //////////////////////////////////////////
                   serialize(object[property]);
               });
               buffer += '}';
           }
       }
   };

Appendix B.  Number Serialization Samples

   The following table holds a set of ES6 compatible Number
   serialization samples, including some edge cases. The column
   "IEEE-754" refers to the internal ES6 representation of the Number
   data type which is based on the IEEE-754 [IEEE754] standard using
   64-bit (double precision) values, here expressed in hexadecimal.

   |====================================================================|
   | IEEE-754         | JSON Representation       | Comment             |
   |====================================================================|
   | 0000000000000000 | 0                         | Zero                |
   |--------------------------------------------------------------------|
   | 8000000000000000 | 0                         | Minus zero          |
   |--------------------------------------------------------------------|
   | 0000000000000001 | 5e-324                    | Min pos number      |
   |--------------------------------------------------------------------|
   | 8000000000000001 | -5e-324                   | Min neg number      |
   |--------------------------------------------------------------------|
   | 7fefffffffffffff | 1.7976931348623157e+308   | Max pos number      |
   |--------------------------------------------------------------------|
   | ffefffffffffffff | -1.7976931348623157e+308  | Max neg number      |
   |--------------------------------------------------------------------|
   | 4340000000000000 | 9007199254740992          | Max pos integer (1) |
   |--------------------------------------------------------------------|
   | c340000000000000 | -9007199254740992         | Max neg integer (1) |
   |--------------------------------------------------------------------|
   | 4430000000000000 | 295147905179352830000     | ~2**68 (2)          |
   |--------------------------------------------------------------------|
   | 7fffffffffffffff |                           | NaN (3)             |
   |--------------------------------------------------------------------|
   | 7ff0000000000000 |                           | Infinity (3)        |
   |--------------------------------------------------------------------|
   | 44b52d02c7e14af5 | 9.999999999999997e+22     |                     |
   |--------------------------------------------------------------------|
   | 44b52d02c7e14af6 | 1e+23                     |                     |
   |--------------------------------------------------------------------|
   | 44b52d02c7e14af7 | 1.0000000000000001e+23    |                     |
   |--------------------------------------------------------------------|
   | 444b1ae4d6e2ef4e | 999999999999999700000     |                     |
   |--------------------------------------------------------------------|
   | 444b1ae4d6e2ef4f | 999999999999999900000     |                     |
   |--------------------------------------------------------------------|
   | 444b1ae4d6e2ef50 | 1e+21                     |                     |
   |--------------------------------------------------------------------|
   | 3eb0c6f7a0b5ed8c | 9.999999999999997e-7      |                     |
   |--------------------------------------------------------------------|
   | 3eb0c6f7a0b5ed8d | 0.000001                  |                     |
   |--------------------------------------------------------------------|
   | 41b3de4355555553 | 333333333.3333332         |                     |
   |--------------------------------------------------------------------|
   | 41b3de4355555554 | 333333333.33333325        |                     |
   |--------------------------------------------------------------------|
   | 41b3de4355555555 | 333333333.3333333         |                     |
   |--------------------------------------------------------------------|
   | 41b3de4355555556 | 333333333.3333334         |                     |
   |--------------------------------------------------------------------|
   | 41b3de4355555557 | 333333333.33333343        |                     |
   |--------------------------------------------------------------------|
   | becbf647612f3696 | -0.0000033333333333333333 |                     |
   |--------------------------------------------------------------------|
   | 43143ff3c1cb0959 | 1424953923781206.2        | Round to even (4)   |
   |--------------------------------------------------------------------|

   Notes:

   (1)  For maximum compliance with the ES6 "JSON" object, values that
        are to be interpreted as true integers SHOULD be in the range
        -9007199254740991 to 9007199254740991. However, how numbers are
        used in applications does not affect the JCS algorithm.

   (2)  Although a set of specific integers like 2**68 could be regarded
        as having extended precision, the JCS/ES6 number serialization
        algorithm does not take this into consideration.

   (3)  Value out of range, not permitted in JSON. See Section 3.2.2.3.

   (4)  This number is exactly 1424953923781206.25 but will, after the
        "Note 2" rule mentioned in Section 3.2.2.3, be truncated and
        rounded to the closest even value.

   For a more exhaustive validation of a JCS number serializer, you may
   test against a file (currently) available in the development portal
   (see Appendix I), containing a large set of sample values. Another
   option is running V8 [V8] as a live reference together with a program
   generating a substantial amount of random IEEE-754 values.

Appendix C.  Canonicalized JSON as "Wire Format"

   Since the result from the canonicalization process (see
   Section 3.2.4) is fully valid JSON, it can also be used as
   "Wire Format". However, this is just an option, since cryptographic
   schemes based on JCS would in most cases not depend on externally
   supplied JSON data already being canonicalized.

   In fact, the ES6 standard way of serializing objects using
   "JSON.stringify()" produces a more "logical" format, where properties
   are kept in the order they were created or received. The example
   below shows an address record which could benefit from ES6 standard
   serialization:

   {
     "name": "John Doe",
     "address": "2000 Sunset Boulevard",
     "city": "Los Angeles",
     "zip": "90001",
     "state": "CA"
   }

   Using canonicalization the properties above would be output in the
   order "address", "city", "name", "state" and "zip", which adds
   fuzziness to the data from a human (developer or technical support)
   perspective. Canonicalization also converts JSON data into a single
   line of text, which may be less than ideal for debugging and logging.

Appendix D.  Dealing with Big Numbers

   There are several issues associated with the JSON Number type, here
   illustrated by the following sample object:

   {
     "giantNumber": 1.4e+9999,
     "payMeThis": 26000.33,
     "int64Max": 9223372036854775807
   }

   Although the sample above conforms to JSON [RFC8259], applications
   would normally use different native data types for storing
   "giantNumber" and "int64Max". In addition, monetary data like
   "payMeThis" would presumably not rely on floating point data types
   due to rounding issues with respect to decimal arithmetic.

   The established way of handling this kind of "overloading" of the
   JSON Number type (at least in an extensible manner) is through
   mapping mechanisms, instructing parsers what to do with different
   properties based on their name. However, this greatly limits the
   value of using the JSON Number type outside of its original,
   somewhat constrained JavaScript context. The ES6 "JSON" object does
   not support mappings to JSON Number either.

   Due to the above, numbers that do not have a natural place in the
   current JSON ecosystem MUST be wrapped using the JSON String type.
   This is close to a de-facto standard for open systems. This is also
   applicable for other data types that do not have direct support in
   JSON, like "DateTime" objects as described in Appendix E.

   Aided by a system using the JSON String type, be it programmatic like

       var obj = JSON.parse('{"giantNumber": "1.4e+9999"}');
       var biggie = new BigNumber(obj.giantNumber);

   or declarative schemes like OpenAPI [OPENAPI], JCS imposes no limits
   on applications, including when using ES6.

Appendix E. String Subtype Handling

   Due to the limited set of data types featured in JSON, the JSON
   String type is commonly used for holding subtypes. This can,
   depending on the JSON parsing method, lead to interoperability
   problems which MUST be dealt with by JCS compliant applications
   targeting a wider audience.

   Assume you want to parse a JSON object where the schema designer
   assigned the property "big" for holding a "BigInt" subtype and "time"
   for holding a "DateTime" subtype, while "val" is supposed to be a
   JSON Number compliant with JCS. The following example shows such an
   object:

   {
     "time": "2019-01-28T07:45:10Z",
     "big": "055",
     "val": 3.5
   }

   Parsing of this object can be accomplished by the following ES6
   statement:

       var object = JSON.parse(JSON_object_featured_as_a_string);

   After parsing, the actual data can be extracted, which for subtypes
   also involves a conversion step using the result of the parsing
   process (an ECMAScript object) as input:

       ... = new Date(object.time); // Date object
       ... = BigInt(object.big);    // Big integer
       ... = object.val;            // JSON/JS number

   Note that the "BigInt" data type is currently only natively supported
   by V8 [V8].
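
   The parsing and extraction steps above can be sketched as runnable
   ES6 code as follows. The variable names are chosen for this
   illustration only, and a JavaScript engine with native "BigInt"
   support is assumed. The conversions are stored in separate variables,
   so the parsed object keeps its original string values.

```javascript
const jsonText = '{"time":"2019-01-28T07:45:10Z","big":"055","val":3.5}';

// Step 1: plain parsing keeps the subtypes as ordinary strings
const object = JSON.parse(jsonText);

// Step 2: convert to native types in separate variables, leaving
// the parsed object itself unmodified
const time = new Date(object.time);   // Date object
const big = BigInt(object.big);       // Big integer
const val = object.val;               // JSON/JS number

console.log(time.toISOString(), big.toString(), val);

// The parsed object still holds the original "big" and "time" strings
console.log(JSON.stringify(object) === jsonText);
```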

   Canonicalization of "object" using the sample code in Appendix A
   would return the following string:

   {"big":"055","time":"2019-01-28T07:45:10Z","val":3.5}

   Although this is (with respect to JCS) technically correct, there is
   another way of parsing JSON data that can also be used with
   ECMAScript, as shown below:

       // "BigInt" requires the following code to become JSON serializable
       BigInt.prototype.toJSON = function() {
           return this.toString();
       };

       // JSON parsing using a "stream" based method
       var object = JSON.parse(JSON_object_featured_as_a_string,
           (k,v) => k == 'time' ? new Date(v) : k == 'big' ? BigInt(v) : v
       );

   If you now apply the canonicalizer in Appendix A to "object", the
   following string would be generated:

   {"big":"55","time":"2019-01-28T07:45:10.000Z","val":3.5}

   In this case the string arguments for "big" and "time" have changed
   with respect to the original, presumably making an application
   depending on JCS fail.

   The reason for the deviation is that in stream and schema based JSON
   parsers, the original "string" argument is typically replaced
   on-the-fly by the native subtype which, when serialized, may exhibit
   a different and platform dependent pattern.

   That is, stream and schema based parsing MUST treat subtypes as
   "pure" (immutable) JSON String types, and perform the actual
   conversion to the designated native type in a subsequent step. In
   modern programming platforms like Go, Java and C# this can be
   achieved with moderate effort by combining annotations, getters and
   setters.
   Below is an example in C#/Json.NET showing a part of a class that is
   serializable as a JSON Object:

       // The "pure" string solution uses a local
       // string variable for JSON serialization while
       // exposing another type to the application
       [JsonProperty("amount")]
       private string _amount;

       [JsonIgnore]
       public decimal Amount {
           get { return decimal.Parse(_amount); }
           set { _amount = value.ToString(); }
       }

   In an application, "Amount" can be accessed as any other property
   while it is actually represented by a quoted string in JSON contexts.

   Note: the example above also addresses the constraints on numeric
   data implied by I-JSON (the C# "decimal" data type has quite
   different characteristics compared to IEEE-754 double precision).

E.1. Subtypes in Arrays

   Since the JSON Array construct permits mixing arbitrary JSON data
   types, custom parsing and serialization code may be required to cope
   with subtypes anyway.

Appendix F. Implementation Guidelines

   The optimal solution is integrating support for JCS directly in JSON
   serializers (parsers need no changes). That is, canonicalization
   would just be an additional "mode" for a JSON serializer. However,
   this is currently not the case. Fortunately, JCS support can be
   introduced through externally supplied canonicalizer software acting
   as a post processor to existing JSON serializers. This arrangement
   also relieves the JCS implementer from having to deal with how
   underlying data is to be represented in JSON.

   The post processor concept enables signature creation schemes like
   the following:

   1.  Create the data to be signed.

   2.  Serialize the data using existing JSON tools.

   3.  Let the external canonicalizer process the serialized data and
       return canonicalized result data.

   4.  Sign the canonicalized data.

   5.  Add the resulting signature value to the original JSON data
       through a designated signature property.

   6.  Serialize the completed (now signed) JSON object using existing
       JSON tools.

   A compatible signature verification scheme would then be as follows:

   1.  Parse the signed JSON data using existing JSON tools.

   2.  Read and save the signature value from the designated signature
       property.

   3.  Remove the signature property from the parsed JSON object.

   4.  Serialize the remaining JSON data using existing JSON tools.

   5.  Let the external canonicalizer process the serialized data and
       return canonicalized result data.

   6.  Verify that the canonicalized data matches the saved signature
       value using the algorithm and key used for creating the
       signature.

   A canonicalizer like the one above is effectively only a "filter",
   potentially usable with a multitude of quite different cryptographic
   schemes.

   Using a JSON serializer with integrated JCS support, the
   serialization performed before the canonicalization step could be
   eliminated for both processes.

Appendix G. Open Source Implementations

   The following Open Source implementations have been verified to be
   compatible with JCS:

   *  JavaScript: https://www.npmjs.com/package/canonicalize

   *  Java: https://github.com/erdtman/java-json-canonicalization

   *  Go:
      https://github.com/cyberphone/json-canonicalization/tree/master/go

   *  .NET/C#:
      https://github.com/cyberphone/json-canonicalization/tree/master/dotnet

   *  Python:
      https://github.com/cyberphone/json-canonicalization/tree/master/python3

Appendix H. Other JSON Canonicalization Efforts

   There are (and have been) other efforts creating "Canonical JSON".
   Below is a list of URLs to some of them:

   *  https://tools.ietf.org/html/draft-staykov-hu-json-canonical-form-00

   *  https://gibson042.github.io/canonicaljson-spec/

   *  http://wiki.laptop.org/go/Canonical_JSON

   The listed efforts all build on text-level JSON-to-JSON
   transformations. The primary feature of text-level canonicalization
   is that it can be made neutral to the flavor of JSON used. However,
   such schemes also imply major changes to the JSON parsing process,
   which is a likely hurdle for adoption. Albeit at the expense of
   certain JSON and application constraints, JCS was designed to be
   compatible with existing JSON tools.

Appendix I. Development Portal

   The JCS specification is currently developed at:
   https://github.com/cyberphone/ietf-json-canon.

   JCS source code and extensive test data are available at:
   https://github.com/cyberphone/json-canonicalization

Appendix J. Document History

   [[ This section to be removed by the RFC Editor before publication as
   an RFC ]]

   Version 00-06:

   *  See IETF diff listings.

   Version 07:

   *  Initial conversion to XML RFC version 3.

   *  Changed intended status to "Informational".

   *  Added UTF-16 test data and explanations.

   Version 08:

   *  Updated Abstract.

   *  Added a "Note 2" number serialization sample.

   *  Updated Security Considerations.

   *  Tried to clear up the JSON input data section.

   *  Added a line about Unicode normalization.

   *  Added a line about serialization of structured data.

   *  Added a missing fact about "BigInt" (V8 not ES6).

   Version 09:

   *  Updated initial line of Abstract and Introduction.

   *  Added note about breaking ECMAScript changes.

   *  Minor language nit fixes.

   Version 10-12:

   *  Language tweaks.

   Version 13:

   *  Reorganized Section 3.2.2.3.

   Version 14:

   *  Improved introduction + some minor changes in security
      considerations, acknowledgements, and Unicode normalization.

   *  Generalized data representation issues by updating Appendix F.

   Version 15:

   *  Minor nits, reverted the IEEE-754 table to ASCII.

   *  Added a bit more meat to the IEEE-754 table.

   *  Changed all to: type="ascii-art" and removed name="".

Authors' Addresses

   Anders Rundgren
   Independent
   Montpellier
   France

   Email: anders.rundgren.net@gmail.com
   URI:   https://www.linkedin.com/in/andersrundgren/

   Bret Jordan
   Symantec Corporation
   350 Ellis Street
   Mountain View, CA 94043
   United States of America

   Email: bret_jordan@symantec.com

   Samuel Erdtman
   Spotify AB
   Birger Jarlsgatan 61, 4tr
   SE-113 56 Stockholm
   Sweden

   Email: erdtman@spotify.com