Network Working Group                                          J. Roatch
Internet-Draft
Intended status: Informational                                C. Bormann
Expires: August 30, 2018                         Universitaet Bremen TZI
                                                       February 26, 2018


    Concise Binary Object Representation (CBOR) Tags for Typed Arrays
                       draft-jroatch-cbor-tags-07

Abstract

   The Concise Binary Object Representation (CBOR, RFC 7049) is a data
   format whose design goals include the possibility of extremely small
   code size, fairly small message size, and extensibility without the
   need for version negotiation.

   The present document makes use of this extensibility to define a
   number of CBOR tags for typed arrays of numeric data, as well as two
   additional tags for multi-dimensional and homogeneous arrays. It is
   intended as the reference document for the IANA registration of the
   CBOR tags defined.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on August 30, 2018.

Copyright Notice

   Copyright (c) 2018 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.
   Code Components extracted from this document must include
   Simplified BSD License text as described in Section 4.e of the Trust
   Legal Provisions and are provided without warranty as described in
   the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Terminology
   2.  Typed Arrays
     2.1.  Types of numbers
   3.  Additional Array Tags
     3.1.  Multi-dimensional Array
     3.2.  Homogeneous Array
   4.  Discussion
   5.  CDDL typenames
   6.  IANA Considerations
   7.  Security Considerations
   8.  References
     8.1.  Normative References
     8.2.  Informative References
   Contributors
   Acknowledgements
   Authors' Addresses

1.  Introduction

   The Concise Binary Object Representation (CBOR, [RFC7049]) provides
   for the interchange of structured data without a requirement for a
   pre-agreed schema. RFC 7049 defines a basic set of data types, as
   well as a tagging mechanism that enables extending the set of data
   types supported via an IANA registry.

   Recently, a simple form of typed arrays of numeric data has received
   interest both in the Web graphics community [TypedArray] and in the
   JavaScript specification [TypedArrayES6], as well as in corresponding
   implementations [ArrayBuffer].

   Since these typed arrays may carry significant amounts of data, there
   is interest in interchanging them in CBOR without the need for a
   lengthy conversion of each number in the array.

   This document defines a number of interrelated CBOR tags that cover
   these typed arrays, as well as two additional tags for multi-
   dimensional and homogeneous arrays. It is intended as the reference
   document for the IANA registration of the tags defined.

1.1.  Terminology

   The term "byte" is used in its now customary sense as a synonym for
   "octet". Where bit arithmetic is explained, this document uses the
   notation familiar from the programming language C (including C++14's
   0bnnn binary literals), except that the operator "**" stands for
   exponentiation.

2.  Typed Arrays

   Typed arrays are homogeneous arrays of numbers, all of which are
   encoded in a single form of binary representation. The concatenation
   of these representations is encoded as a single CBOR byte string
   (major type 2), enclosed by a single tag indicating the type and
   encoding of all the numbers represented in the byte string.

2.1.  Types of numbers

   Three classes of numbers are of interest: unsigned integers (uint),
   signed integers (two's complement, sint), and IEEE 754 binary
   floating point numbers (which are always signed).
   For each of these classes, there are multiple representation lengths
   in active use:

   +-----------+--------+--------+-----------+
   | Length ll | uint   | sint   | float     |
   +-----------+--------+--------+-----------+
   | 0         | uint8  | sint8  | binary16  |
   | 1         | uint16 | sint16 | binary32  |
   | 2         | uint32 | sint32 | binary64  |
   | 3         | uint64 | sint64 | binary128 |
   +-----------+--------+--------+-----------+

                          Table 1: Length values

   Here, sintN stands for a signed integer of exactly N bits (for
   instance, sint16), and uintN stands for an unsigned integer of
   exactly N bits (for instance, uint32). The name binaryN stands for
   the number form of the same name defined in IEEE 754.

   Since one objective of these tags is to be able to directly ship the
   ArrayBuffers underlying the Typed Arrays without re-encoding them,
   and these may be either in big endian (network byte order) or in
   little endian form, we need to define tags for both variants.

   In total, this leads to 24 variants. In the tag, we need to express
   the choice between integer and floating point, the signedness (for
   integers), the endianness, and one of the four length values.

   In order to simplify implementation, a range of tags is being
   allocated that allows retrieving all this information from the bits
   of the tag: Tag values from TBD64 to TBD87.

   The value is split up into 5 bit fields: TBD0b010_f_s_e_ll, as
   detailed in Table 2.

   +----------+-------------------------------------------------------+
   | Field    | Use                                                   |
   +----------+-------------------------------------------------------+
   | TBD0b010 | a constant such as '010', to be defined               |
   | f        | 0 for integer, 1 for float                            |
   | s        | 0 for unsigned integer or float, 1 for signed integer |
   | e        | 0 for big endian, 1 for little endian                 |
   | ll       | A number for the length (Table 1).                    |
   +----------+-------------------------------------------------------+

             Table 2: Bit fields in the low 8 bits of the tag

   The number of bytes in each array element can then be calculated by
   "2**(f + ll)" (or "1 << (f + ll)" in a typical programming language).
   (Notice that f is the least significant bit of the upper 4-bit nibble
   of this byte, and ll occupies the two least significant bits of the
   lower nibble.)

   In the CBOR representation, the total number of elements in the array
   is not expressed explicitly, but implied from the length of the byte
   string and the length of each representation. It can be computed
   inversely to the previous formula: "bytelength >> (f + ll)".

   For the uint8/sint8 values, the endianness is redundant. Only the
   big endian variant is used. As a special case, what would be the
   little endian variant of uint8 is used to signify that the numbers in
   the array are using clamped conversion from integers, as described in
   more detail in Section 7.1 of [TypedArrayUpdate].

3.  Additional Array Tags

   This specification defines two additional array tags. The Multi-
   dimensional Array tag can be combined with classical CBOR arrays as
   well as with Typed Arrays in order to build multi-dimensional arrays
   with constant numbers of elements in the sub-arrays. The Homogeneous
   Array tag can be used to facilitate the ingestion of homogeneous
   classical CBOR arrays, providing performance advantages even when a
   Typed Array does not apply.

3.1.  Multi-dimensional Array

   Tag: TBD40

   Data Item: array (major type 4) of two arrays, one array (major type
   4) of dimensions, and one array (major type 4, a Typed Array, or a
   Homogeneous Array) of elements

   A multi-dimensional array is represented as a tagged array that
   contains two (one-dimensional) arrays.
   The first array defines the dimensions of the multi-dimensional array
   (in the sequence of outer dimensions towards inner dimensions) while
   the second array represents the contents of the multi-dimensional
   array. If the second array is itself tagged as a Typed Array then
   the element type of the multi-dimensional array is known to be the
   same type as that of the Typed Array. Data in the Typed Array byte
   string consists of consecutive values where the last dimension is
   considered contiguous (row-major order).

   uint16_t a[2][3] = {
     {0, 1, 2},  /* row 0 */
     {3, 4, 5},
   };

   <Tag TBD40> # multi-dimensional array tag
      82         #   array(2)
        82       #     array(2)
          02     #       unsigned(2) 1st Dimension
          03     #       unsigned(3) 2nd Dimension
        d8 41    #     uint16 array
          4c     #       byte string(12)
            00 00  #       unsigned(0)
            00 01  #       unsigned(1)
            00 02  #       unsigned(2)
            00 03  #       unsigned(3)
            00 04  #       unsigned(4)
            00 05  #       unsigned(5)

             Figure 1: Multi-dimensional array in C and CBOR

3.2.  Homogeneous Array

   Tag: TBD41

   Data Item: array (major type 4)

   This tag provides a hint to decoders that the array tagged by it has
   elements that are all of the same application type. The element type
   of the array is thus determined by the application type of the first
   array element. This can be used by implementations in strongly typed
   languages while decoding to create native homogeneous arrays of
   specific types instead of ordered lists.

   Which CBOR data items constitute elements of the same application
   type is specific to the application. However, type systems of
   programming languages have enough commonality that an application
   should be able to create portable homogeneous arrays.

   bool boolArray[2] = { true, false };

   <Tag TBD41> # Homogeneous Array Tag
      82         #   array(2)
         F5      #     true
         F4      #     false

                Figure 2: Homogeneous array in C and CBOR

4.  Discussion

   Support for both little- and big-endian representation may seem out
   of character with CBOR, which is otherwise fully big endian. This
   support is in line with the intended use of the typed arrays and the
   objective not to require conversion of each array element.

   This specification allocates a sizable chunk out of the single-byte
   tag space. This use of code point space is justified by the wide use
   of typed arrays in data interchange.

   Applying a Homogeneous Array tag to a Typed Array would be redundant
   and is therefore not provided by the present specification.

5.  CDDL typenames

   For use with CDDL [I-D.ietf-cbor-cddl], the typenames defined in
   Figure 3 are recommended:

   ta-uint8 = #6.TBD64(bstr)
   ta-uint16be = #6.TBD65(bstr)
   ta-uint32be = #6.TBD66(bstr)
   ta-uint64be = #6.TBD67(bstr)
   ta-uint8-clamped = #6.TBD68(bstr)
   ta-uint16le = #6.TBD69(bstr)
   ta-uint32le = #6.TBD70(bstr)
   ta-uint64le = #6.TBD71(bstr)
   ta-sint8 = #6.TBD72(bstr)
   ta-sint16be = #6.TBD73(bstr)
   ta-sint32be = #6.TBD74(bstr)
   ta-sint64be = #6.TBD75(bstr)
   ; reserved: #6.TBD76(bstr)
   ta-sint16le = #6.TBD77(bstr)
   ta-sint32le = #6.TBD78(bstr)
   ta-sint64le = #6.TBD79(bstr)
   ta-float16be = #6.TBD80(bstr)
   ta-float32be = #6.TBD81(bstr)
   ta-float64be = #6.TBD82(bstr)
   ta-float128be = #6.TBD83(bstr)
   ta-float16le = #6.TBD84(bstr)
   ta-float32le = #6.TBD85(bstr)
   ta-float64le = #6.TBD86(bstr)
   ta-float128le = #6.TBD87(bstr)
   homogeneous<array> = #6.TBD41(array)
   multi-dim<dim, array> = #6.TBD40([dim, array])

                 Figure 3: Recommended typenames for CDDL

6.  IANA Considerations

   IANA is requested to allocate the tags in Table 3, with the present
   document as the specification reference.

   +-------+-------------------+---------------------------------------+
   | Tag   | Data Item         | Semantics                             |
   +-------+-------------------+---------------------------------------+
   | TBD64 | byte string       | uint8 Typed Array                     |
   | TBD65 | byte string       | uint16, big endian, Typed Array       |
   | TBD66 | byte string       | uint32, big endian, Typed Array       |
   | TBD67 | byte string       | uint64, big endian, Typed Array       |
   | TBD68 | byte string       | uint8 Typed Array, clamped arithmetic |
   | TBD69 | byte string       | uint16, little endian, Typed Array    |
   | TBD70 | byte string       | uint32, little endian, Typed Array    |
   | TBD71 | byte string       | uint64, little endian, Typed Array    |
   | TBD72 | byte string       | sint8 Typed Array                     |
   | TBD73 | byte string       | sint16, big endian, Typed Array       |
   | TBD74 | byte string       | sint32, big endian, Typed Array       |
   | TBD75 | byte string       | sint64, big endian, Typed Array       |
   | TBD76 | byte string       | (reserved)                            |
   | TBD77 | byte string       | sint16, little endian, Typed Array    |
   | TBD78 | byte string       | sint32, little endian, Typed Array    |
   | TBD79 | byte string       | sint64, little endian, Typed Array    |
   | TBD80 | byte string       | IEEE 754 binary16, big endian, Typed  |
   |       |                   | Array                                 |
   | TBD81 | byte string       | IEEE 754 binary32, big endian, Typed  |
   |       |                   | Array                                 |
   | TBD82 | byte string       | IEEE 754 binary64, big endian, Typed  |
   |       |                   | Array                                 |
   | TBD83 | byte string       | IEEE 754 binary128, big endian, Typed |
   |       |                   | Array                                 |
   | TBD84 | byte string       | IEEE 754 binary16, little endian,     |
   |       |                   | Typed Array                           |
   | TBD85 | byte string       | IEEE 754 binary32, little endian,     |
   |       |                   | Typed Array                           |
   | TBD86 | byte string       | IEEE 754 binary64, little endian,     |
   |       |                   | Typed Array                           |
   | TBD87 | byte string       | IEEE 754 binary128, little endian,    |
   |       |                   | Typed Array                           |
   | TBD40 | array of two      | Multi-dimensional Array               |
   |       | arrays*           |                                       |
   | TBD41 | array             | Homogeneous Array                     |
   +-------+-------------------+---------------------------------------+

                         Table 3: Values for Tags

   *) TBD40 data item: second element of outer array in data item is
   native CBOR array (major type 4) or Typed Array (one of Tag
   TBD64..TBD87)

   RFC editor note: Please replace TBDnn by the tag numbers allocated by
   IANA throughout the document and delete this note. IANA note: To
   make the calculations work, TBD64 to TBD87 need to come from a
   contiguous range the start of which is divisible by 32.

   TO DO: The WG needs to figure out whether it is OK to spend 24 "good"
   (1+1 byte) tags for this, whether this all goes to 1+2 byte tags, or
   whether maybe the layout of the bits in the tag should change to move
   the larger datatypes into the 1+2 range and just the 8-bit ones into
   the 1+1 range.

7.  Security Considerations

   The security considerations of RFC 7049 apply; the tags introduced
   here are not expected to raise security considerations beyond those.

8.  References

8.1.  Normative References

   [I-D.ietf-cbor-cddl]
              Birkholz, H., Vigano, C., and C. Bormann, "Concise data
              definition language (CDDL): a notational convention to
              express CBOR data structures", draft-ietf-cbor-cddl-02
              (work in progress), February 2018.

   [RFC7049]  Bormann, C. and P. Hoffman, "Concise Binary Object
              Representation (CBOR)", RFC 7049, DOI 10.17487/RFC7049,
              October 2013.

8.2.  Informative References

   [ArrayBuffer]
              Mozilla Developer Network, "JavaScript typed arrays",
              2013.

   [TypedArray]
              Vukicevic, V. and K. Russell, "Typed Array Specification",
              February 2011.

   [TypedArrayES6]
              "22.2 TypedArray Objects", in: ECMA-262 6th Edition, The
              ECMAScript 2015 Language Specification, June 2015.

   [TypedArrayUpdate]
              Herman, D. and K. Russell, "Typed Array Specification",
              July 2013.

Contributors

   Glenn Engel suggested the tags for multi-dimensional arrays and
   homogeneous arrays.

Acknowledgements

   TBD

Authors' Addresses

   Johnathan Roatch

   Email: jroatch@gmail.com


   Carsten Bormann
   Universitaet Bremen TZI
   Postfach 330440
   Bremen  D-28359
   Germany

   Phone: +49-421-218-63921
   Email: cabo@tzi.org
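
   As a non-normative illustration of the tag layout in Section 2.1, the
   following C sketch splits a typed array tag into the bit fields of
   Table 2 and derives the element size and element count with the two
   formulas given there. It assumes, for illustration only, that TBD64
   to TBD87 are assigned the values 64 to 87, so that the low 8 bits of
   the tag read 0b010_f_s_e_ll. The names ta_info, ta_decode_tag, and
   ta_element_count are hypothetical and are not defined by this
   document.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* ASSUMPTION: TBD64..TBD87 are taken to be 64..87 for illustration only;
 * the actual values are for IANA to assign. */
#define TA_TAG_FIRST 64u
#define TA_TAG_LAST  87u

/* Hypothetical helper type holding the bit fields of Table 2. */
typedef struct {
    unsigned f;        /* 0 for integer, 1 for float */
    unsigned s;        /* 1 for signed integer, 0 otherwise */
    unsigned e;        /* 0 for big endian, 1 for little endian */
    unsigned ll;       /* length value from Table 1 */
    size_t elem_bytes; /* bytes per element: 1 << (f + ll) */
} ta_info;

/* Split a tag value into its bit fields.
 * Returns 0 on success, or -1 if the tag is outside the assumed range. */
int ta_decode_tag(uint64_t tag, ta_info *out)
{
    assert(out != NULL);
    if (tag < TA_TAG_FIRST || tag > TA_TAG_LAST)
        return -1;
    out->f = (unsigned)((tag >> 4) & 1u);
    out->s = (unsigned)((tag >> 3) & 1u);
    out->e = (unsigned)((tag >> 2) & 1u);
    out->ll = (unsigned)(tag & 3u);
    out->elem_bytes = (size_t)1 << (out->f + out->ll);
    return 0;
}

/* Number of elements implied by the byte string length:
 * bytelength >> (f + ll). */
size_t ta_element_count(const ta_info *info, size_t bytelength)
{
    assert(info != NULL);
    return bytelength >> (info->f + info->ll);
}
```

   Under these assumptions, the tag value 65 used in Figure 1 decodes to
   a 2-byte big-endian unsigned element, and a 12-byte byte string
   yields 6 elements. A complete decoder would additionally treat the
   value 68 as the clamped uint8 variant, reject the reserved value 76,
   and check that the byte string length is a multiple of the element
   size; the sketch attempts none of this.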