Network Working Group                                          J. Roatch
Internet-Draft
Intended status: Informational                                C. Bormann
Expires: September 6, 2019                       Universitaet Bremen TZI
                                                          March 05, 2019


   Concise Binary Object Representation (CBOR) Tags for Typed Arrays
                     draft-ietf-cbor-array-tags-03

Abstract

   The Concise Binary Object Representation (CBOR, RFC 7049) is a data
   format whose design goals include the possibility of extremely small
   code size, fairly small message size, and extensibility without the
   need for version negotiation.

   The present document makes use of this extensibility to define a
   number of CBOR tags for typed arrays of numeric data, as well as
   additional tags for multi-dimensional and homogeneous arrays.  It is
   intended as the reference document for the IANA registration of the
   CBOR tags defined.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on September 6, 2019.

Copyright Notice

   Copyright (c) 2019 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Terminology
   2.  Typed Arrays
     2.1.  Types of numbers
   3.  Additional Array Tags
     3.1.  Multi-dimensional Array
     3.2.  Homogeneous Array
   4.  Discussion
   5.  CDDL typenames
   6.  IANA Considerations
   7.  Security Considerations
   8.  References
     8.1.  Normative References
     8.2.  Informative References
   Contributors
   Acknowledgements
   Authors' Addresses

1.  Introduction

   The Concise Binary Object Representation (CBOR, [RFC7049]) provides
   for the interchange of structured data without a requirement for a
   pre-agreed schema.  RFC 7049 defines a basic set of data types, as
   well as a tagging mechanism that enables extending the set of data
   types supported via an IANA registry.

   Recently, a simple form of typed arrays of numeric data has received
   interest both in the Web graphics community [TypedArray] and in the
   JavaScript specification [TypedArrayES6], as well as in corresponding
   implementations [ArrayBuffer].

   Since these typed arrays may carry significant amounts of data, there
   is interest in interchanging them in CBOR without the need for
   lengthy conversion of each number in the array.

   This document defines a number of interrelated CBOR tags that cover
   these typed arrays, as well as additional tags for multi-dimensional
   and homogeneous arrays.  It is intended as the reference document for
   the IANA registration of the tags defined.

1.1.  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in
   BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

   The term "byte" is used in its now customary sense as a synonym for
   "octet".  Where bit arithmetic is explained, this document uses the
   notation familiar from the programming language C (including C++14's
   0bnnn binary literals), except that the operator "**" stands for
   exponentiation.

2.  Typed Arrays

   Typed arrays are homogeneous arrays of numbers, all of which are
   encoded in a single form of binary representation.  The concatenation
   of these representations is encoded as a single CBOR byte string
   (major type 2), enclosed by a single tag indicating the type and
   encoding of all the numbers represented in the byte string.

2.1.  Types of numbers

   Three classes of numbers are of interest: unsigned integers (uint),
   signed integers (two's complement, sint), and IEEE 754 binary
   floating point numbers (which are always signed).
   For each of these classes, there are multiple representation lengths
   in active use:

                +-----------+--------+--------+-----------+
                | Length ll | uint   | sint   | float     |
                +-----------+--------+--------+-----------+
                | 0         | uint8  | sint8  | binary16  |
                | 1         | uint16 | sint16 | binary32  |
                | 2         | uint32 | sint32 | binary64  |
                | 3         | uint64 | sint64 | binary128 |
                +-----------+--------+--------+-----------+

                          Table 1: Length values

   Here, sintN stands for a signed integer of exactly N bits (for
   instance, sint16), and uintN stands for an unsigned integer of
   exactly N bits (for instance, uint32).  The name binaryN stands for
   the number form of the same name defined in IEEE 754.

   Since one objective of these tags is to be able to directly ship the
   ArrayBuffers underlying the Typed Arrays without re-encoding them,
   and these may be either in big endian (network byte order) or in
   little endian form, we need to define tags for both variants.

   In total, this leads to 24 variants.  In the tag, we need to express
   the choice between integer and floating point, the signedness (for
   integers), the endianness, and one of the four length values.

   In order to simplify implementation, a range of tags is being
   allocated that allows retrieving all this information from the bits
   of the tag: tag values from 64 to 87.

   The value is split up into 5 bit fields: 0b010_f_s_e_ll, as detailed
   in Table 2.

   +-------+-------------------------------------------------------+
   | Field | Use                                                   |
   +-------+-------------------------------------------------------+
   | 0b010 | the constant bits 0, 1, 0                             |
   | f     | 0 for integer, 1 for float                            |
   | s     | 0 for unsigned integer or float, 1 for signed integer |
   | e     | 0 for big endian, 1 for little endian                 |
   | ll    | A number for the length (Table 1).                    |
   +-------+-------------------------------------------------------+

             Table 2: Bit fields in the low 8 bits of the tag

   The number of bytes in each array element can then be calculated by
   "2**(f + ll)" (or "1 << (f + ll)" in a typical programming language).
   (Notice that f and ll are the least significant bits of each nibble
   (4 bits) in the byte.)

   In the CBOR representation, the total number of elements in the array
   is not expressed explicitly, but implied from the length of the byte
   string and the length of each representation.  It can be computed
   inversely to the previous formula from the length of the byte string
   in bytes: "bytelength >> (f + ll)".

   For the uint8/sint8 values, the endianness is redundant.  Only the
   big endian variant is used.  The little endian variant of sint8 MUST
   NOT be used; its tag is marked as reserved.  As a special case, the
   tag number that would have been the little endian variant of uint8 is
   used to signify that the numbers in the array are using clamped
   conversion from integers, as described in more detail in Section 7.1
   of [TypedArrayUpdate].

3.  Additional Array Tags

   This specification defines three additional array tags.  The
   Multi-dimensional Array tags can be combined with classical CBOR
   arrays as well as with Typed Arrays in order to build
   multi-dimensional arrays with constant numbers of elements in the
   sub-arrays.  The Homogeneous Array tag can be used to facilitate the
   ingestion of homogeneous classical CBOR arrays, providing performance
   advantages even when a Typed Array does not apply.

3.1.  Multi-dimensional Array

   Tag:  40

   Data Item:  array (major type 4) of two arrays, one array (major type
      4) of dimensions, and one array (major type 4, a Typed Array, or a
      Homogeneous Array) of elements

   A multi-dimensional array is represented as a tagged array that
   contains two (one-dimensional) arrays.
   The first array defines the dimensions of the multi-dimensional
   array (in the sequence of outer dimensions towards inner dimensions)
   while the second array represents the contents of the
   multi-dimensional array.  If the second array is itself tagged as a
   Typed Array then the element type of the multi-dimensional array is
   known to be the same type as that of the Typed Array.  Data in the
   Typed Array byte string consists of consecutive values where the last
   dimension is considered contiguous (row-major order).

   Figure 1 shows a declaration of a two-dimensional array in the C
   language and a representation of that array in CBOR using both a
   multi-dimensional array tag and a typed array tag.

   uint16_t a[2][3] = {
     {2, 4, 8},     /* row 0 */
     {4, 16, 256},  /* row 1 */
   };

   <Tag 40> # multi-dimensional array tag
      82       # array(2)
        82      # array(2)
          02     # unsigned(2) 1st Dimension
          03     # unsigned(3) 2nd Dimension
        <Tag 65> # uint16 array
          4c     # byte string(12)
            0002 # unsigned(2)
            0004 # unsigned(4)
            0008 # unsigned(8)
            0004 # unsigned(4)
            0010 # unsigned(16)
            0100 # unsigned(256)

             Figure 1: Multi-dimensional array in C and CBOR

   Figure 2 shows the same two-dimensional array using the
   multi-dimensional array tag in conjunction with a basic CBOR array
   (which, with the small numbers chosen for the example, happens to be
   shorter).

   <Tag 40> # multi-dimensional array tag
      82       # array(2)
        82      # array(2)
          02     # unsigned(2) 1st Dimension
          03     # unsigned(3) 2nd Dimension
        86      # array(6)
          02      # unsigned(2)
          04      # unsigned(4)
          08      # unsigned(8)
          04      # unsigned(4)
          10      # unsigned(16)
          19 0100 # unsigned(256)

         Figure 2: Multi-dimensional array using basic CBOR array

   Tag:  1040

   Data Item:  as with tag 40

   Note that the above arrays are in "row major" order, which is the
   preferred order for the purposes of this specification.
   An analogous representation that uses "column major" order arrays is
   provided under the tag 1040, as illustrated in Figure 3.

   <Tag 1040> # multi-dimensional array tag, column major order
      82       # array(2)
        82      # array(2)
          02     # unsigned(2) 1st Dimension
          03     # unsigned(3) 2nd Dimension
        86      # array(6)
          02      # unsigned(2)
          04      # unsigned(4)
          04      # unsigned(4)
          10      # unsigned(16)
          08      # unsigned(8)
          19 0100 # unsigned(256)

     Figure 3: Multi-dimensional array using basic CBOR array, column
                               major order

3.2.  Homogeneous Array

   Tag:  41

   Data Item:  array (major type 4)

   This tag provides a hint to decoders that the array tagged by it has
   elements that are all of the same application type.  The element type
   of the array is thus determined by the application type of the first
   array element.  This can be used by implementations in strongly typed
   languages while decoding to create native homogeneous arrays of
   specific types instead of ordered lists.

   Which CBOR data items constitute elements of the same application
   type is specific to the application.  However, type systems of
   programming languages have enough commonality that an application
   should be able to create portable homogeneous arrays.

   Figure 4 shows an example of a homogeneous array of booleans in C++
   and CBOR.

   bool boolArray[2] = { true, false };

   <Tag 41> # Homogeneous Array Tag
      82       # array(2)
        F5      # true
        F4      # false

               Figure 4: Homogeneous array in C++ and CBOR

   Figure 5 extends the example with a more complex structure.

   typedef struct {
     bool active;
     int value;
   } foo;
   foo myArray[2] = { {true, 3}, {true, -4} };

   <Tag 41>
      82       # array(2)
        82      # array(2)
          F5     # true
          03     # 3
        82      # array(2)
          F5     # true
          23     # -4

               Figure 5: Homogeneous array in C++ and CBOR

4.  Discussion

   Support for both little- and big-endian representation may seem out
   of character with CBOR, which is otherwise fully big endian.  This
   support is in line with the intended use of the typed arrays and the
   objective not to require conversion of each array element.

   This specification allocates a sizable chunk out of the single-byte
   tag space.  This use of code point space is justified by the wide use
   of typed arrays in data interchange.

   Providing a column-major order variant of the multi-dimensional array
   may seem superfluous to some, and useful to others.  It is cheap to
   define the additional tag so it is available when actually needed.
   Allocating it out of a different number space makes the preference
   for row-major evident.

   Applying a Homogeneous Array tag to a Typed Array would be redundant
   and is therefore not provided by the present specification.

5.  CDDL typenames

   For use with CDDL [I-D.ietf-cbor-cddl], the typenames defined in
   Figure 6 are recommended:

   ta-uint8 = #6.64(bstr)
   ta-uint16be = #6.65(bstr)
   ta-uint32be = #6.66(bstr)
   ta-uint64be = #6.67(bstr)
   ta-uint8-clamped = #6.68(bstr)
   ta-uint16le = #6.69(bstr)
   ta-uint32le = #6.70(bstr)
   ta-uint64le = #6.71(bstr)
   ta-sint8 = #6.72(bstr)
   ta-sint16be = #6.73(bstr)
   ta-sint32be = #6.74(bstr)
   ta-sint64be = #6.75(bstr)
   ; reserved: #6.76(bstr)
   ta-sint16le = #6.77(bstr)
   ta-sint32le = #6.78(bstr)
   ta-sint64le = #6.79(bstr)
   ta-float16be = #6.80(bstr)
   ta-float32be = #6.81(bstr)
   ta-float64be = #6.82(bstr)
   ta-float128be = #6.83(bstr)
   ta-float16le = #6.84(bstr)
   ta-float32le = #6.85(bstr)
   ta-float64le = #6.86(bstr)
   ta-float128le = #6.87(bstr)
   homogeneous = #6.41(array)
   multi-dim = #6.40([dim, array])
   multi-dim-column-major = #6.1040([dim, array])

                 Figure 6: Recommended typenames for CDDL

6.  IANA Considerations

   IANA has allocated the tags in Table 3, with the present document as
   the specification reference.  (The reserved value is reserved for a
   future revision of typed array tags.)

   The allocations came out of the "specification required" space
   (24..255), with the exception of 1040, which came out of the "first
   come first served" space (256..).

   +------+-------------------+----------------------------------------+
   | Tag  | Data Item         | Semantics                              |
   +------+-------------------+----------------------------------------+
   | 64   | byte string       | uint8 Typed Array                      |
   | 65   | byte string       | uint16, big endian, Typed Array        |
   | 66   | byte string       | uint32, big endian, Typed Array        |
   | 67   | byte string       | uint64, big endian, Typed Array        |
   | 68   | byte string       | uint8 Typed Array, clamped arithmetic  |
   | 69   | byte string       | uint16, little endian, Typed Array     |
   | 70   | byte string       | uint32, little endian, Typed Array     |
   | 71   | byte string       | uint64, little endian, Typed Array     |
   | 72   | byte string       | sint8 Typed Array                      |
   | 73   | byte string       | sint16, big endian, Typed Array        |
   | 74   | byte string       | sint32, big endian, Typed Array        |
   | 75   | byte string       | sint64, big endian, Typed Array        |
   | 76   | byte string       | (reserved)                             |
   | 77   | byte string       | sint16, little endian, Typed Array     |
   | 78   | byte string       | sint32, little endian, Typed Array     |
   | 79   | byte string       | sint64, little endian, Typed Array     |
   | 80   | byte string       | IEEE 754 binary16, big endian, Typed   |
   |      |                   | Array                                  |
   | 81   | byte string       | IEEE 754 binary32, big endian, Typed   |
   |      |                   | Array                                  |
   | 82   | byte string       | IEEE 754 binary64, big endian, Typed   |
   |      |                   | Array                                  |
   | 83   | byte string       | IEEE 754 binary128, big endian, Typed  |
   |      |                   | Array                                  |
   | 84   | byte string       | IEEE 754 binary16, little endian,      |
   |      |                   | Typed Array                            |
   | 85   | byte string       | IEEE 754 binary32, little endian,      |
   |      |                   | Typed Array                            |
   | 86   | byte string       | IEEE 754 binary64, little endian,      |
   |      |                   | Typed Array                            |
   | 87   | byte string       | IEEE 754 binary128, little endian,     |
   |      |                   | Typed Array                            |
   | 40   | array of two      | Multi-dimensional Array, row-major     |
   |      | arrays*           | order                                  |
   | 1040 | array of two      | Multi-dimensional Array, column-major  |
   |      | arrays*           | order                                  |
   | 41   | array             | Homogeneous Array                      |
   +------+-------------------+----------------------------------------+

                         Table 3: Values for Tags

   *) 40 or 1040 data item: second element of outer array in data item
   is native CBOR array (major type 4) or Typed Array (one of Tag
   64..87)

7.  Security Considerations

   The security considerations of RFC 7049 apply; special attention is
   drawn to the second paragraph of Section 8 of RFC 7049.  The tags
   introduced here are not expected to raise security considerations
   beyond those.

8.  References

8.1.  Normative References

   [I-D.ietf-cbor-cddl]
              Birkholz, H., Vigano, C., and C. Bormann, "Concise data
              definition language (CDDL): a notational convention to
              express CBOR and JSON data structures",
              draft-ietf-cbor-cddl-07 (work in progress), February 2019.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC7049]  Bormann, C. and P. Hoffman, "Concise Binary Object
              Representation (CBOR)", RFC 7049, DOI 10.17487/RFC7049,
              October 2013.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

8.2.  Informative References

   [ArrayBuffer]
              Mozilla Developer Network, "JavaScript typed arrays",
              2013.

   [TypedArray]
              Vukicevic, V. and K. Russell, "Typed Array Specification",
              February 2011.

   [TypedArrayES6]
              "22.2 TypedArray Objects", in: ECMA-262 6th Edition, The
              ECMAScript 2015 Language Specification, June 2015.

   [TypedArrayUpdate]
              Herman, D. and K. Russell, "Typed Array Specification",
              July 2013.

Contributors

   Glenn Engel suggested the tags for multi-dimensional arrays and
   homogeneous arrays.

Acknowledgements

   Jim Schaad reminded us that column-major order is still in use.  IANA
   helped correct an error in a previous version.

Authors' Addresses

   Johnathan Roatch

   Email: jroatch@gmail.com


   Carsten Bormann
   Universitaet Bremen TZI
   Postfach 330440
   Bremen  D-28359
   Germany

   Phone: +49-421-218-63921
   Email: cabo@tzi.org
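
   As a non-normative illustration of the bit-field layout in Table 2
   and the size formulas of Section 2.1, the following C sketch splits
   a typed array tag number into its fields and derives the element
   size and element count.  The names ta_info, ta_decode_tag, and
   ta_element_count are hypothetical and are not defined by this
   specification.  The sketch assumes that a decoder treats the
   reserved tag 76 as an error and reports tag 68 as clamped uint8
   rather than as a little endian variant.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type; not defined by this specification. */
typedef struct {
    unsigned is_float;      /* field f: 0 for integer, 1 for float */
    unsigned is_signed;     /* field s: 1 for signed integer */
    unsigned little_endian; /* field e, except for tag 68 (see below) */
    unsigned clamped;       /* 1 only for tag 68 (clamped uint8) */
    unsigned ll;            /* length value from Table 1 */
    size_t elem_bytes;      /* bytes per element: 1 << (f + ll) */
} ta_info;

/*
 * Splits a tag number into the fields 0b010_f_s_e_ll.
 * Returns 0 on success, or -1 if the tag lies outside 64..87
 * or is the reserved value 76.
 */
static int ta_decode_tag(uint64_t tag, ta_info *out)
{
    assert(out != NULL);
    if (tag < 64 || tag > 87 || tag == 76)
        return -1;

    out->is_float = (unsigned)((tag >> 4) & 1);
    out->is_signed = (unsigned)((tag >> 3) & 1);
    out->little_endian = (unsigned)((tag >> 2) & 1);
    out->ll = (unsigned)(tag & 3);
    out->clamped = 0;

    /* Tag 68 reuses the "little endian uint8" code point to mean
     * clamped conversion; endianness is meaningless for one byte. */
    if (tag == 68) {
        out->clamped = 1;
        out->little_endian = 0;
    }

    out->elem_bytes = (size_t)1 << (out->is_float + out->ll);
    return 0;
}

/* Number of elements implied by a byte string of bytelength bytes. */
static size_t ta_element_count(const ta_info *info, size_t bytelength)
{
    assert(info != NULL);
    return bytelength >> (info->is_float + info->ll);
}
```

   For the byte string of Figure 1 (tag 65, 12 bytes), this sketch
   yields an element size of 2 bytes and an element count of 6.  A
   decoder would probably also want to check that the byte string
   length is a multiple of the element size; that check is left out
   here because the text above does not state how to handle a
   remainder.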