Network Working Group                                        A. Rundgren
Internet-Draft                                               Independent
Intended status: Standards Track                               B. Jordan
Expires: November 10, 2019                          Symantec Corporation
                                                              S. Erdtman
                                                              Spotify AB
                                                             May 9, 2019


                   JSON Canonicalization Scheme (JCS)
             draft-rundgren-json-canonicalization-scheme-06

Abstract

   Cryptographic operations like hashing and signing require that the
   original data does not change during serialization or parsing.  One
   way of addressing this issue is to create a canonical form of the
   data.  Canonicalization also permits data to be exchanged in its
   original form on the "wire" while still being subject to secure
   cryptographic operations.
   The JSON Canonicalization Scheme (JCS) provides canonicalization
   support for data in the JSON format by building on the strict
   serialization methods for JSON primitives defined by ECMAScript,
   constraining JSON data to the I-JSON subset, and applying a
   deterministic property sorting scheme.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on November 10, 2019.

Copyright Notice

   Copyright (c) 2019 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  Detailed Operation
     3.1.  Creation of Input Data
     3.2.  Generation of Canonical JSON Data
       3.2.1.  Whitespace
       3.2.2.  Serialization of Primitive Data Types
         3.2.2.1.  Serialization of Literals
         3.2.2.2.  Serialization of Strings
         3.2.2.3.  Serialization of Numbers
       3.2.3.  Sorting of Object Properties
       3.2.4.  UTF-8 Generation
   4.  IANA Considerations
   5.  Security Considerations
   6.  Acknowledgements
   7.  References
     7.1.  Normative References
     7.2.  Informal References
     7.3.  URIs
   Appendix A.  ES6 Sample Canonicalizer
   Appendix B.  Number Serialization Samples
   Appendix C.  Canonicalized JSON as "Wire Format"
   Appendix D.  Dealing with Big Numbers
   Appendix E.  String Subtype Handling
     E.1.  Subtypes in Arrays
   Appendix F.  Implementation Guidelines
   Appendix G.  Open Source Implementations
   Appendix H.  Other JSON Canonicalization Efforts
   Appendix I.  Development Portal
   Authors' Addresses

1.  Introduction

   Cryptographic operations like hashing and signing require that the
   original data does not change during serialization or parsing.
   One way of accomplishing this is to convert the data into a format
   that has a simple and fixed representation like Base64Url [RFC4648],
   which is how JWS [RFC7515] addressed this issue.

   Another solution is to create a canonical version of the data,
   similar to what was done for the XML Signature [XMLDSIG] standard.
   The primary advantage of a canonicalization scheme is that data can
   be kept in its original form.  This is the core rationale behind JCS.
   Put another way: by using canonicalization a JSON Object may remain a
   JSON Object even after being signed, which simplifies system design,
   documentation and logging.

   To avoid "reinventing the wheel", JCS relies on serialization of JSON
   primitives compatible with ECMAScript (aka JavaScript) beginning with
   version 6 [ES6], hereafter referred to as "ES6".

   Seasoned XML developers recalling difficulties getting signatures to
   validate (usually due to different interpretations of the quite
   intricate XML canonicalization rules as well as of the equally
   extensive Web Services security standards) may rightfully wonder why
   JCS would not suffer from similar issues.  The reasons are twofold:

   o  The absence of a namespace concept and default values, as well as
      constraining data to the I-JSON subset, eliminate the need for
      specific parsers for dealing with canonicalization.

   o  JCS compatible serialization of JSON primitives is supported by
      most current Web browsers as well as by Node.js [NODEJS], while
      the full JCS specification is supported by multiple Open Source
      implementations (see Appendix G).  See also Appendix F.

   In summary, the JCS specification describes how serialization of JSON
   primitives compliant with ES6, combined with a deterministic property
   sorting scheme, can be used for creating "Hashable" representations
   of JSON data intended for consumption by cryptographic methods.
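   The effect summarized above can be illustrated with a minimal
   sketch.  It assumes an ES6 environment; the helper "sortedFlat" is
   hypothetical, handles flat objects only, and is not a JCS
   implementation (a complete canonicalizer also has to recurse into
   child objects and arrays).

```javascript
// Two JSON texts carrying the same data, but differing in property
// order, whitespace, number syntax and string escapes.
const textA = '{"b": 1.0, "a": "\\u0041"}';
const textB = '{"a":"A","b":1}';

const objA = JSON.parse(textA);
const objB = JSON.parse(textB);

// ES6 serialization gives primitives a fixed representation, but
// properties are still emitted in their creation order.
const plainA = JSON.stringify(objA);
const plainB = JSON.stringify(objB);

// Hypothetical helper (flat objects only): emit the properties in
// sorted order, using ES6 serialization for names and values.
function sortedFlat(obj) {
  const members = Object.keys(obj).sort().map(
    (name) => JSON.stringify(name) + ':' + JSON.stringify(obj[name]));
  return '{' + members.join(',') + '}';
}

console.log(plainA === plainB);                      // false
console.log(sortedFlat(objA) === sortedFlat(objB));  // true
```

   Only the sorted variant yields identical text for both inputs, which
   is the property a hash or signature method depends on.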
   JCS is compatible with some existing systems relying on JSON
   canonicalization such as JWK Thumbprint [RFC7638] and Keybase
   [KEYBASE].

   For potential uses outside of cryptography see [JSONCOMP].

   The intended audiences of this document are JSON tool vendors, as
   well as designers of JSON based cryptographic solutions.

2.  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in
   BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

3.  Detailed Operation

   This section describes the different issues related to creating a
   canonical JSON representation, and how they are addressed by JCS.

3.1.  Creation of Input Data

   In order to serialize JSON data, one needs data that is adapted for
   JSON serialization.  This is usually achieved by:

   o  Parsing previously generated JSON data.

   o  Programmatically creating data.

   Irrespective of the method used, the data to be serialized MUST be
   compatible with I-JSON [RFC7493], which implies the following:

   o  JSON Objects MUST NOT exhibit duplicate property names.

   o  JSON String data MUST be expressible as Unicode [UNICODE].

   o  JSON Number data MUST be expressible as IEEE-754 [IEEE754] double
      precision values.  For applications needing higher precision or
      longer integers than offered by IEEE-754 double precision,
      Appendix D outlines how such requirements can be supported in an
      interoperable and extensible way.

   An additional constraint is that parsed JSON String data MUST NOT be
   altered during subsequent serializations.  For more information see
   Appendix E.
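   The constraints above can be checked on already parsed data with a
   sketch like the one below.  The function names "checkInput" and
   "checkString" are hypothetical and not defined by JCS.  The sketch
   assumes that "expressible as Unicode" excludes lone surrogates, and
   it cannot detect duplicate property names: "JSON.parse()" silently
   keeps the last occurrence, so that check has to be made by the
   parser itself.

```javascript
// Hypothetical helper: reject strings holding lone surrogates.
function checkString(s) {
  for (let i = 0; i < s.length; i++) {
    const unit = s.charCodeAt(i);
    if (unit >= 0xD800 && unit <= 0xDBFF) {
      // A high surrogate must be followed by a low surrogate.
      // charCodeAt() returns NaN past the end, which fails the test.
      const next = s.charCodeAt(i + 1);
      if (!(next >= 0xDC00 && next <= 0xDFFF)) {
        throw new Error('Lone high surrogate at index ' + i);
      }
      i++;  // Skip the low half of a valid pair
    } else if (unit >= 0xDC00 && unit <= 0xDFFF) {
      throw new Error('Lone low surrogate at index ' + i);
    }
  }
}

// Hypothetical helper: walk parsed JSON data recursively.
function checkInput(value) {
  if (typeof value === 'number') {
    // NaN and Infinity have no JSON representation
    if (!Number.isFinite(value)) {
      throw new Error('Number is not a finite IEEE-754 double');
    }
  } else if (typeof value === 'string') {
    checkString(value);
  } else if (Array.isArray(value)) {
    value.forEach(checkInput);
  } else if (value !== null && typeof value === 'object') {
    Object.keys(value).forEach((name) => {
      checkString(name);         // Property names are strings too
      checkInput(value[name]);
    });
  }
  // null, true and false need no checks
}

checkInput(JSON.parse('{"a":[1,"x",null,true],"b":{"c":2.5}}'));
```

   A call such as "checkInput({n: NaN})" throws, which corresponds to
   terminating with an error indication.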
   Note: although the Unicode standard offers the possibility of
   combining certain characters into one, referred to as "Unicode
   Normalization" (https://www.unicode.org/reports/tr15/ [1]), such
   functionality MUST be delegated to the application layer.

3.2.  Generation of Canonical JSON Data

   The following subsections describe the steps required for creating a
   canonical JSON representation of the data elaborated on in the
   previous section.

   Appendix A shows sample code for an ES6 based canonicalizer, matching
   the JCS specification.

3.2.1.  Whitespace

   Whitespace between JSON elements MUST NOT be emitted.

3.2.2.  Serialization of Primitive Data Types

   Assume that you parse a JSON object like the following:

   {
     "numbers": [333333333.33333329, 1E30, 4.50,
                 2e-3, 0.000000000000000000000000001],
     "string": "\u20ac$\u000F\u000aA'\u0042\u0022\u005c\\\"\/",
     "literals": [null, true, false]
   }

   If you subsequently serialize the parsed data using a serializer
   compliant with ES6's "JSON.stringify()", the result would (with a
   line wrap added for display purposes only) be rather divergent with
   respect to the representation of data:

   {"numbers":[333333333.3333333,1e+30,4.5,0.002,1e-27],"string":
   "EURO$\u000f\nA'B\"\\\\\"/","literals":[null,true,false]}

   Note: EURO denotes a single Euro character (Unicode: U+20AC), which,
   not being ASCII, is currently not displayable in RFCs.

   The difference between the parsed data and its serialized counterpart
   is due to the wide tolerance on input data (as defined by JSON
   [RFC8259]), while output data (as defined by ES6) has a fixed
   representation.  As can be seen in the example, numbers are subject
   to rounding as well.

   The following subsections describe serialization of primitive JSON
   data types according to JCS.  This part is identical to that of ES6.

3.2.2.1.  Serialization of Literals

   The JSON literals "null", "true", and "false" present no challenge
   since they already have a fixed definition in JSON [RFC8259].

3.2.2.2.  Serialization of Strings

   For JSON String data (which includes JSON Object property names as
   well), each Unicode code point MUST be serialized as described below
   (also matching Section 24.3.2.2 of [ES6]):

   o  If the Unicode value falls within the traditional ASCII control
      character range (U+0000 through U+001F), it MUST be serialized
      using lowercase hexadecimal Unicode notation (\uhhhh) unless it is
      in the set of predefined JSON control characters U+0008, U+0009,
      U+000A, U+000C or U+000D, which MUST be serialized as \b, \t, \n,
      \f and \r respectively.

   o  If the Unicode value is outside of the ASCII control character
      range, it MUST be serialized "as is" unless it is equivalent to
      U+005C (\) or U+0022 ("), which MUST be serialized as \\ and \"
      respectively.

   Finally, the resulting sequence of Unicode code points MUST be
   enclosed in double quotes (").

   Note: some JSON systems permit the use of invalid Unicode data
   including "lone surrogates" (e.g. U+DEAD).  Since this leads to
   interoperability issues including broken signatures, occurrences of
   such data MUST cause the JCS algorithm to terminate with an error
   indication.

3.2.2.3.  Serialization of Numbers

   JSON Number data MUST be serialized according to Section 7.1.12.1 of
   [ES6] including the "Note 2" enhancement.

   Due to the relative complexity of this part, the algorithm itself is
   not included in this document.  However, the specification is fully
   implemented by, for example, Google's V8 [V8].  The open source Java
   implementation mentioned in Appendix G uses a recently developed
   number serialization algorithm called Ryu [RYU].

   ES6 builds on the IEEE-754 [IEEE754] double precision standard for
   representing JSON Number data.
   Appendix B holds a set of IEEE-754 sample values and their
   corresponding JSON serialization.

   Note: since NaN (Not a Number) and Infinity are not permitted in
   JSON, occurrences of such values MUST cause the JCS algorithm to
   terminate with an error indication.

3.2.3.  Sorting of Object Properties

   Although the previous step indeed normalized the representation of
   primitive JSON data types, the result would not qualify as
   "canonical" since JSON Object properties are not in lexicographic
   (alphabetical) order.

   Applied to the sample in Section 3.2.2, a properly canonicalized
   version should (with a line wrap added for display purposes only)
   read as:

   {"literals":[null,true,false],"numbers":[333333333.3333333,
   1e+30,4.5,0.002,1e-27],"string":"EURO$\u000f\nA'B\"\\\\\"/"}

   Note: EURO denotes a single Euro character (Unicode: U+20AC), which,
   not being ASCII, is currently not displayable in RFCs.

   The rules for lexicographic sorting of JSON Object properties
   according to JCS are as follows:

   o  JSON Object properties MUST be sorted in a recursive manner which
      means that possible JSON child Objects MUST have their properties
      sorted as well.

   o  JSON Array data MUST also be scanned for presence of JSON Objects
      (and applying associated property sorting), but array element
      order MUST NOT be changed.

   When a JSON Object is about to have its properties sorted, the
   following measures MUST be adhered to:

   o  The sorting process is applied to property name strings in their
      "raw" (unescaped) form.  That is, a newline character is treated
      as U+000A.

   o  Property name strings to be sorted are formatted as arrays of
      UTF-16 [UNICODE] code units.  The sorting is based on pure value
      comparisons, where code units are treated as unsigned integers,
      independent of locale settings.
   o  Property name strings either have different values at some index
      that is a valid index for both strings, or their lengths are
      different, or both.  If they have different values at one or more
      index positions, let k be the smallest such index; then the string
      whose value at position k has the smaller value, as determined by
      using the < operator, lexicographically precedes the other string.
      If there is no index position at which they differ, then the
      shorter string lexicographically precedes the longer string.

   In plain English this means that property names are sorted in
   ascending order like the following:

   ""
   "a"
   "aa"
   "ab"

   The rationale for basing the sorting algorithm on UTF-16 code units
   is that it maps directly to the string type in ECMAScript (featured
   in Web browsers and Node.js), Java and .NET.  Systems using another
   internal representation of string data will need to convert JSON
   property name strings into arrays of UTF-16 code units before
   sorting.  The conversion from UTF-8 or UTF-32 to UTF-16 is defined by
   the Unicode [UNICODE] standard.

   Note: for the purpose of obtaining a deterministic property order,
   sorting on UTF-8 or UTF-32 encoded data would also work, but the
   result would differ and thus be incompatible with this specification.
   However, in practice property names rarely go outside of 7-bit ASCII,
   making it possible to sort on the UTF-8 byte level and still be
   compatible with JCS.  Whether this is a viable option depends on the
   environment JCS is supposed to be used in.

3.2.4.  UTF-8 Generation

   Finally, in order to create a platform independent representation,
   the result of the preceding step MUST be encoded in UTF-8.
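   The sorting rules and the encoding step can be tried out with the
   following sketch.  It assumes an ES6 environment that provides the
   "TextEncoder" API (current Web browsers and Node.js do); the sample
   property names and the helper "compareCodePoints" are illustrative
   assumptions, the latter included only to show an ordering that is
   NOT compatible with JCS.

```javascript
// Sample property names from different Unicode ranges.  The fifth
// entry is U+1F600, held as a surrogate pair in UTF-16.
const names = ['\u20ac', '\r', '\ufb33', '1', '\ud83d\ude00',
               '\u0080', '\u00f6'];

// Without a comparator, sort() compares strings by UTF-16 code
// units, which is the order this section calls for.
const byCodeUnit = names.slice().sort();

// For contrast only: ordering by code points, which is what sorting
// on UTF-8 or UTF-32 encoded data would produce.
function compareCodePoints(a, b) {
  const x = Array.from(a, (c) => c.codePointAt(0));
  const y = Array.from(b, (c) => c.codePointAt(0));
  for (let i = 0; i < Math.min(x.length, y.length); i++) {
    if (x[i] !== y[i]) return x[i] - y[i];
  }
  return x.length - y.length;
}
const byCodePoint = names.slice().sort(compareCodePoints);

// U+1F600 starts with code unit 0xD83D and therefore sorts before
// U+FB33 in UTF-16 order, but after it in code point order.
console.log(byCodeUnit.indexOf('\ud83d\ude00') <
            byCodeUnit.indexOf('\ufb33'));    // true
console.log(byCodePoint.indexOf('\ud83d\ude00') <
            byCodePoint.indexOf('\ufb33'));   // false

// UTF-8 generation of an already canonicalized string.
const utf8 = new TextEncoder().encode('{"\u20ac":1}');
const hexBytes = Array.from(
  utf8, (b) => b.toString(16).padStart(2, '0')).join(' ');
console.log(hexBytes);  // 7b 22 e2 82 ac 22 3a 31 7d
```

   For names limited to 7-bit ASCII both orderings coincide, which is
   the case described in the note above.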
   Applied to the sample in Section 3.2.3 this should yield the
   following bytes, here shown in hexadecimal notation:

   7b 22 6c 69 74 65 72 61 6c 73 22 3a 5b 6e 75 6c 6c 2c 74 72
   75 65 2c 66 61 6c 73 65 5d 2c 22 6e 75 6d 62 65 72 73 22 3a
   5b 33 33 33 33 33 33 33 33 33 2e 33 33 33 33 33 33 33 2c 31
   65 2b 33 30 2c 34 2e 35 2c 30 2e 30 30 32 2c 31 65 2d 32 37
   5d 2c 22 73 74 72 69 6e 67 22 3a 22 e2 82 ac 24 5c 75 30 30
   30 66 5c 6e 41 27 42 5c 22 5c 5c 5c 5c 5c 22 2f 22 7d

   This data is intended to be usable as input to cryptographic methods.

4.  IANA Considerations

   This document has no IANA actions.

5.  Security Considerations

   It is vital to perform "sanity" checks on input data to avoid
   overflowing buffers and similar things that could affect the
   integrity of the system.

   When JCS is applied to signature schemes like the one in Appendix F,
   applications MUST perform the following operations before acting upon
   received data:

   1.  Parse the JSON data

   2.  Verify the data for correctness

   3.  Verify the signature

6.  Acknowledgements

   Building on ES6 Number serialization was originally proposed by
   James Manger.  This ultimately led to the adoption of the entire ES6
   serialization scheme for JSON primitives.

   Other people who have contributed with valuable input to this
   specification include Scott Ananian, Ben Campbell, Richard Gibson,
   Bron Gondwana, John-Mark Gurney, Mike Jones, Mike Miller,
   Mark Nottingham, Mike Samuel, Jim Schaad, Robert Tupelo-Schneck and
   Michal Wadas.

   For carrying out real world concept verification, the software and
   support for number serialization provided by Ulf Adams,
   Tanner Gooding and Remy Oudompheng was very helpful.

7.  References

7.1.  Normative References

   [ES6]      Ecma International, "ECMAScript 2015 Language
              Specification".

   [IEEE754]  IEEE, "IEEE Standard for Floating-Point Arithmetic",
              August 2008.
   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC7493]  Bray, T., Ed., "The I-JSON Message Format", RFC 7493,
              DOI 10.17487/RFC7493, March 2015.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

   [RFC8259]  Bray, T., Ed., "The JavaScript Object Notation (JSON) Data
              Interchange Format", STD 90, RFC 8259,
              DOI 10.17487/RFC8259, December 2017.

   [UNICODE]  The Unicode Consortium, "The Unicode Standard, Version
              10.0.0".

7.2.  Informal References

   [JSONCOMP] A. Rundgren, ""Comparable" JSON - Work in progress".

   [KEYBASE]  "Keybase".

   [NODEJS]   "Node.js".

   [OPENAPI]  "The OpenAPI Initiative".

   [RFC4648]  Josefsson, S., "The Base16, Base32, and Base64 Data
              Encodings", RFC 4648, DOI 10.17487/RFC4648, October 2006.

   [RFC7515]  Jones, M., Bradley, J., and N. Sakimura, "JSON Web
              Signature (JWS)", RFC 7515, DOI 10.17487/RFC7515, May
              2015.

   [RFC7638]  Jones, M. and N. Sakimura, "JSON Web Key (JWK)
              Thumbprint", RFC 7638, DOI 10.17487/RFC7638, September
              2015.

   [RYU]      Ulf Adams, "Ryu floating point number serializing
              algorithm".

   [V8]       Google LLC, "Chrome V8 Open Source JavaScript Engine".

   [XMLDSIG]  W3C, "XML Signature Syntax and Processing Version 1.1".

7.3.  URIs

   [1] https://www.unicode.org/reports/tr15/

   [2] https://www.npmjs.com/package/canonicalize

   [3] https://github.com/erdtman/java-json-canonicalization

   [4] https://github.com/cyberphone/json-canonicalization/tree/master/
       go

   [5] https://github.com/cyberphone/json-canonicalization/tree/master/
       dotnet

   [6] https://github.com/cyberphone/json-canonicalization/tree/master/
       python3

   [7] https://tools.ietf.org/html/draft-staykov-hu-json-canonical-
       form-00

   [8] https://gibson042.github.io/canonicaljson-spec/

   [9] http://wiki.laptop.org/go/Canonical_JSON

   [10] https://github.com/cyberphone/ietf-json-canon

   [11] https://cyberphone.github.io/ietf-json-canon

   [12] https://github.com/cyberphone/json-canonicalization

Appendix A.  ES6 Sample Canonicalizer

   Below is an example of a JCS canonicalizer for usage with ES6 based
   systems:

   ////////////////////////////////////////////////////////////
   // Since the primary purpose of this code is highlighting //
   // the core of the JCS algorithm, error handling and      //
   // UTF-8 generation were not implemented                  //
   ////////////////////////////////////////////////////////////
   var canonicalize = function(object) {

       var buffer = '';
       serialize(object);
       return buffer;

       function serialize(object) {
           if (object === null || typeof object !== 'object' ||
               object.toJSON != null) {
               /////////////////////////////////////////////////
               // Primitive type or toJSON - Use ES6/JSON     //
               /////////////////////////////////////////////////
               buffer += JSON.stringify(object);

           } else if (Array.isArray(object)) {
               /////////////////////////////////////////////////
               // Array - Maintain element order              //
               /////////////////////////////////////////////////
               buffer += '[';
               let next = false;
               object.forEach((element) => {
                   if (next) {
                       buffer += ',';
                   }
                   next = true;
                   /////////////////////////////////////////
                   // Array element - Recursive expansion //
                   /////////////////////////////////////////
                   serialize(element);
               });
               buffer += ']';

           } else {
               /////////////////////////////////////////////////
               // Object - Sort properties before serializing //
               /////////////////////////////////////////////////
               buffer += '{';
               let next = false;
               Object.keys(object).sort().forEach((property) => {
                   if (next) {
                       buffer += ',';
                   }
                   next = true;
                   ///////////////////////////////////////////////
                   // Property names are strings - Use ES6/JSON //
                   ///////////////////////////////////////////////
                   buffer += JSON.stringify(property);
                   buffer += ':';
                   //////////////////////////////////////////
                   // Property value - Recursive expansion //
                   //////////////////////////////////////////
                   serialize(object[property]);
               });
               buffer += '}';
           }
       }
   };

Appendix B.  Number Serialization Samples

   The following table holds a set of ES6 compatible Number
   serialization samples, including some edge cases.  The column
   "IEEE-754" refers to the internal ES6 representation of the Number
   data type which is based on the IEEE-754 [IEEE754] standard using
   64-bit (double precision) values, here expressed in hexadecimal.
   |=====================================================================|
   | IEEE-754         | JSON Representation        | Comment             |
   |=====================================================================|
   | 0000000000000000 | 0                          | Zero                |
   |---------------------------------------------------------------------|
   | 8000000000000000 | 0                          | Minus zero          |
   |---------------------------------------------------------------------|
   | 0000000000000001 | 5e-324                     | Min pos number      |
   |---------------------------------------------------------------------|
   | 8000000000000001 | -5e-324                    | Min neg number      |
   |---------------------------------------------------------------------|
   | 7fefffffffffffff | 1.7976931348623157e+308    | Max pos number      |
   |---------------------------------------------------------------------|
   | ffefffffffffffff | -1.7976931348623157e+308   | Max neg number      |
   |---------------------------------------------------------------------|
   | 4340000000000000 | 9007199254740992           | Max pos integer (1) |
   |---------------------------------------------------------------------|
   | c340000000000000 | -9007199254740992          | Max neg integer (1) |
   |---------------------------------------------------------------------|
   | 4430000000000000 | 295147905179352830000      | ~2**68 (2)          |
   |---------------------------------------------------------------------|
   | 7fffffffffffffff |                            | NaN (3)             |
   |---------------------------------------------------------------------|
   | 7ff0000000000000 |                            | Infinity (3)        |
   |---------------------------------------------------------------------|
   | 44b52d02c7e14af5 | 9.999999999999997e+22      |                     |
   |---------------------------------------------------------------------|
   | 44b52d02c7e14af6 | 1e+23                      |                     |
   |---------------------------------------------------------------------|
   | 44b52d02c7e14af7 | 1.0000000000000001e+23     |                     |
   |---------------------------------------------------------------------|
   | 444b1ae4d6e2ef4e | 999999999999999700000      |                     |
   |---------------------------------------------------------------------|
   | 444b1ae4d6e2ef4f | 999999999999999900000      |                     |
   |---------------------------------------------------------------------|
   | 444b1ae4d6e2ef50 | 1e+21                      |                     |
   |---------------------------------------------------------------------|
   | 3eb0c6f7a0b5ed8c | 9.999999999999997e-7       |                     |
   |---------------------------------------------------------------------|
   | 3eb0c6f7a0b5ed8d | 0.000001                   |                     |
   |---------------------------------------------------------------------|
   | 41b3de4355555553 | 333333333.3333332          |                     |
   |---------------------------------------------------------------------|
   | 41b3de4355555554 | 333333333.33333325         |                     |
   |---------------------------------------------------------------------|
   | 41b3de4355555555 | 333333333.3333333          |                     |
   |---------------------------------------------------------------------|
   | 41b3de4355555556 | 333333333.3333334          |                     |
   |---------------------------------------------------------------------|
   | 41b3de4355555557 | 333333333.33333343         |                     |
   |---------------------------------------------------------------------|
   | becbf647612f3696 | -0.0000033333333333333333  |                     |
   |---------------------------------------------------------------------|

   Notes:

   (1)  For maximum compliance with the ES6 "JSON" object, values that
        are to be interpreted as true integers SHOULD be in the range
        -9007199254740991 to 9007199254740991.  However, how numbers are
        used in applications does not affect the JCS algorithm.

   (2)  Although a set of specific integers like 2**68 could be regarded
        as having extended precision, the JCS/ES6 number serialization
        algorithm does not take this into consideration.

   (3)  Invalid.  See Section 3.2.2.3.

Appendix C.  Canonicalized JSON as "Wire Format"

   Since the result from the canonicalization process (see
   Section 3.2.4) is fully valid JSON, it can also be used as
   "Wire Format".  However, this is just an option since cryptographic
   schemes based on JCS would, in most cases, not depend on externally
   supplied JSON data already being canonicalized.

   In fact, the ES6 standard way of serializing objects using
   "JSON.stringify()" produces a more "logical" format, where properties
   are kept in the order they were created or received.  The example
   below shows an address record which could benefit from ES6 standard
   serialization:

   {
     "name": "John Doe",
     "address": "2000 Sunset Boulevard",
     "city": "Los Angeles",
     "zip": "90001",
     "state": "CA"
   }

   Using canonicalization the properties above would be output in the
   order "address", "city", "name", "state" and "zip", which adds
   fuzziness to the data from a human (developer or technical support)
   perspective.  Canonicalization also converts JSON data into a single
   line of text, which may be less than ideal for debugging and logging.

Appendix D.  Dealing with Big Numbers

   There are several issues associated with the JSON Number type, here
   illustrated by the following sample object:

   {
     "giantNumber": 1.4e+9999,
     "payMeThis": 26000.33,
     "int64Max": 9223372036854775807
   }

   Although the sample above conforms to JSON [RFC8259], applications
   would normally use different native data types for storing
   "giantNumber" and "int64Max".  In addition, monetary data like
   "payMeThis" would presumably not rely on floating point data types
   due to rounding issues with respect to decimal arithmetic.

   The established way of handling this kind of "overloading" of the
   JSON Number type (at least in an extensible manner) is through
   mapping mechanisms, instructing parsers what to do with different
   properties based on their name.
   However, this greatly limits the value of using the JSON Number
   type outside of its original, somewhat constrained JavaScript
   context. The ES6 "JSON" object does not support mappings to JSON
   Number either.

   Due to the above, numbers that do not have a natural place in the
   current JSON ecosystem MUST be wrapped using the JSON String type.
   This is close to a de facto standard for open systems. This is also
   applicable for other data types that do not have direct support in
   JSON, like "DateTime" objects as described in Appendix E.

   Aided by a system using the JSON String type, be it programmatic
   like

     var obj = JSON.parse('{"giantNumber": "1.4e+9999"}');
     var biggie = new BigNumber(obj.giantNumber);

   or declarative schemes like OpenAPI [OPENAPI], JCS imposes no limits
   on applications, including when using ES6.

Appendix E. String Subtype Handling

   Due to the limited set of data types featured in JSON, the JSON
   String type is commonly used for holding subtypes. Depending on the
   JSON parsing method, this can lead to interoperability problems
   which MUST be dealt with by JCS compliant applications targeting a
   wider audience.

   Assume you want to parse a JSON object where the schema designer
   assigned the property "big" for holding a "BigInteger" subtype and
   "time" for holding a "DateTime" subtype, while "val" is supposed to
   be a JSON Number compliant with JCS. The following example shows
   such an object:

     {
       "time": "2019-01-28T07:45:10Z",
       "big": "055",
       "val": 3.5
     }

   Parsing of this object can be accomplished by the following ES6
   statement:

     var object = JSON.parse(JSON-data-featured-as-a-string);

   After parsing, the actual data can be extracted, which for subtypes
   also involves a conversion step using the result of the parsing
   process (an ECMAScript object) as input:

     ... = new Date(object.time); // Date object
     ... = BigInt(object.big);    // Big integer
     ... = object.val;            // JSON/JS number

   Canonicalization of "object" using the sample code in Appendix A
   would return the following string:

     {"big":"055","time":"2019-01-28T07:45:10Z","val":3.5}

   Although this is (with respect to JCS) technically correct, there is
   another way of parsing JSON data which can also be used with ES6, as
   shown below:

     // Currently required to make BigInt JSON serializable
     BigInt.prototype.toJSON = function() {
         return this.toString();
     };

     // JSON parsing using a "stream" based method
     var object = JSON.parse(JSON-data-featured-as-a-string,
         (k,v) => k == 'time' ? new Date(v) : k == 'big' ? BigInt(v) : v
     );

   If you now apply the canonicalizer in Appendix A to "object", the
   following string would be generated:

     {"big":"55","time":"2019-01-28T07:45:10.000Z","val":3.5}

   In this case the string arguments for "big" and "time" have changed
   with respect to the original, presumably making an application
   depending on JCS fail.

   The reason for the deviation is that in stream- and schema-based
   JSON parsers, the original "string" argument is typically replaced
   on-the-fly by the native subtype which, when serialized, may exhibit
   a different and platform dependent pattern.

   That is, stream- and schema-based parsing MUST treat subtypes as
   "pure" (immutable) JSON String types, and perform the actual
   conversion to the designated native type in a subsequent step. In
   modern programming platforms like Go, Java and C# this can be
   achieved with moderate effort by combining annotations, getters and
   setters.
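
   As an informal ES6 illustration of the two-step approach (not part
   of the JCS sample code), the sketch below parses without a reviver
   so that "big" and "time" remain strings, canonicalizes the parsed
   object, and only then converts the subtypes. The function
   "canonicalizeSketch" and the variable names are assumptions made for
   this example. The function merely sorts property names and relies
   on "JSON.stringify()" for primitives, so it should not be read as a
   complete canonicalizer like the one in Appendix A.

```javascript
// Hypothetical helper for illustration only: recursively sorts property
// names and delegates primitive values to JSON.stringify().
function canonicalizeSketch(value) {
    if (value === null || typeof value !== 'object') {
        return JSON.stringify(value);
    }
    if (Array.isArray(value)) {
        return '[' + value.map(canonicalizeSketch).join(',') + ']';
    }
    return '{' + Object.keys(value).sort().map(function(key) {
        return JSON.stringify(key) + ':' + canonicalizeSketch(value[key]);
    }).join(',') + '}';
}

var jsonText =
    '{"time": "2019-01-28T07:45:10Z", "big": "055", "val": 3.5}';

// Step 1: parse without a reviver; subtypes remain JSON strings
var parsed = JSON.parse(jsonText);

// Canonicalize while the original string values are still intact
var canonical = canonicalizeSketch(parsed);

// Step 2: convert to native types without modifying "parsed"
var time = new Date(parsed.time);
var big = BigInt(parsed.big);

console.log(canonical);
```

   Since no value in "parsed" has been replaced by a native type, the
   printed string is expected to keep "055" and the original time
   string unchanged, while "time" and "big" are available to the
   application as native objects.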
   Below is an example in C#/Json.NET showing a part of a class that is
   serializable as a JSON Object:

     // The "pure" string solution uses a local
     // string variable for JSON serialization while
     // exposing another type to the application
     [JsonProperty("amount")]
     private string _amount;

     // CultureInfo (System.Globalization) avoids locale-dependent output
     [JsonIgnore]
     public decimal Amount {
         get {
             return decimal.Parse(_amount, CultureInfo.InvariantCulture);
         }
         set {
             _amount = value.ToString(CultureInfo.InvariantCulture);
         }
     }

   In an application "Amount" can be accessed as any other property
   while it is actually represented by a quoted string in JSON contexts.

   Note: the example above also addresses the constraints on numeric
   data implied by I-JSON (the C# "decimal" data type has quite
   different characteristics compared to IEEE-754 double precision).

E.1. Subtypes in Arrays

   Since the JSON Array construct permits mixing arbitrary JSON
   elements, custom parsing and serialization code must normally be
   used to cope with subtypes anyway.

Appendix F. Implementation Guidelines

   The optimal solution is integrating support for JCS directly in JSON
   serializers (parsers need no changes). That is, canonicalization
   would just be an additional "mode" for a JSON serializer. However,
   this is currently not the case. Fortunately, JCS support can be
   provided through externally supplied canonicalizer software,
   enabling signature creation schemes like the following:

   1. Create the data to be signed.

   2. Serialize the data using existing JSON tools.

   3. Let the external canonicalizer process the serialized data and
      return canonicalized result data.

   4. Sign the canonicalized data.

   5. Add the resulting signature value to the original JSON data
      through a designated signature property.

   6. Serialize the completed (now signed) JSON object using existing
      JSON tools.

   A compatible signature verification scheme would then be as follows:

   1. Parse the signed JSON data using existing JSON tools.

   2. Read and save the signature value from the designated signature
      property.

   3. Remove the signature property from the parsed JSON object.

   4. Serialize the remaining JSON data using existing JSON tools.

   5. Let the external canonicalizer process the serialized data and
      return canonicalized result data.

   6. Verify that the canonicalized data matches the saved signature
      value using the algorithm and key used for creating the
      signature.

   A canonicalizer like the one above is effectively only a "filter",
   potentially usable with a multitude of quite different cryptographic
   schemes.

   Using a JSON serializer with integrated JCS support, the
   serialization performed before the canonicalization step could be
   eliminated for both processes.

Appendix G. Open Source Implementations

   The following Open Source implementations have been verified to be
   compatible with JCS:

   o JavaScript: https://www.npmjs.com/package/canonicalize [2]

   o Java: https://github.com/erdtman/java-json-canonicalization [3]

   o Go: https://github.com/cyberphone/json-canonicalization/tree/master/go [4]

   o .NET/C#: https://github.com/cyberphone/json-canonicalization/tree/master/dotnet [5]

   o Python: https://github.com/cyberphone/json-canonicalization/tree/master/python3 [6]

Appendix H. Other JSON Canonicalization Efforts

   There are (and have been) other efforts to create "Canonical JSON".
   Below is a list of URLs to some of them:

   o https://tools.ietf.org/html/draft-staykov-hu-json-canonical-form-00 [7]

   o https://gibson042.github.io/canonicaljson-spec/ [8]

   o http://wiki.laptop.org/go/Canonical_JSON [9]

   In contrast to JCS, which is a serialization scheme, the listed
   efforts build on text level JSON to JSON transformations.

Appendix I. Development Portal

   The JCS specification is currently developed at:
   https://github.com/cyberphone/ietf-json-canon [10].

   The most recent "editors' copy" can be found at:
   https://cyberphone.github.io/ietf-json-canon [11].

   JCS source code and test data are available at:
   https://github.com/cyberphone/json-canonicalization [12]

Authors' Addresses

   Anders Rundgren
   Independent
   Montpellier
   France

   Email: anders.rundgren.net@gmail.com
   URI:   https://www.linkedin.com/in/andersrundgren/

   Bret Jordan
   Symantec Corporation
   350 Ellis Street
   Mountain View CA 94043
   USA

   Email: bret_jordan@symantec.com

   Samuel Erdtman
   Spotify AB
   Birger Jarlsgatan 61, 4tr
   Stockholm 113 56
   Sweden

   Email: erdtman@spotify.com