Network Working Group                                          C. Vigano
Internet-Draft                                       Universitaet Bremen
Intended status: Informational                               H. Birkholz
Expires: September 22, 2016                               Fraunhofer SIT
                                                          March 21, 2016


CBOR data definition language (CDDL): a notational convention to express
                          CBOR data structures
                draft-greevenbosch-appsawg-cbor-cddl-08

Abstract

   This document proposes a notational convention to express CBOR data
   structures (RFC 7049). Its main goal is to provide an easy and
   unambiguous way to express structures for protocol messages and data
   formats that use CBOR.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on September 22, 2016.

Copyright Notice

   Copyright (c) 2016 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.
   Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Requirements notation
     1.2.  Terminology
   2.  The Style of Data Structure Specification
     2.1.  Groups and Composition in CDDL
       2.1.1.  Usage
       2.1.2.  Syntax
     2.2.  Types
       2.2.1.  Values
       2.2.2.  Choices
       2.2.3.  Representation Types
       2.2.4.  Root type
   3.  Syntax
     3.1.  General conventions
     3.2.  Occurrence
     3.3.  Predefined names for types
     3.4.  Arrays
     3.5.  Maps
       3.5.1.  Structs
       3.5.2.  Tables
     3.6.  Tags
     3.7.  Operator Precedence
   4.  Examples
     4.1.  Moves in a computer game
     4.2.  Fruit
     4.3.  RFC 7071
     4.4.  Examples from JSON Content Rules
   5.  Making Use of CDDL
     5.1.  As a guide to a human user
     5.2.  For automated checking of CBOR data structure
     5.3.  For data analysis tools
   6.  Discussion
     6.1.  Work to do
   7.  Resolved Issues
   8.  Security considerations
   9.  IANA considerations
   10. Acknowledgements
   11. References
     11.1.  Normative References
     11.2.  Informative References
   Appendix A.  Cemetery
   Appendix B.  Nursery
     B.1.  Annotations
       B.1.1.  Annotation .size
       B.1.2.  Annotation .bits
       B.1.3.  Annotation .regexp
       B.1.4.  Annotations .cbor and .cborseq
       B.1.5.  Annotations .within and .and
       B.1.6.  Annotations .lt, .le, .gt, .ge, .eq, .ne, and .default
     B.2.  Socket/Plug
     B.3.  Generics
   Appendix C.  Change Log
   Appendix D.  ABNF grammar
   Appendix E.  Standard Prelude
   Appendix F.  The CDDL tool
   Appendix G.  Extended Diagnostic Notation
     G.1.  White space in binary strings
     G.2.  Text in binary strings
     G.3.  Concatenated Strings
     G.4.  Hexadecimal, octal, and binary numbers
     G.5.  Comments
   Authors' Addresses

1. Introduction

   In this document, a notational convention to express CBOR [RFC7049]
   data structures is defined.

   The main goal for the convention is to provide a unified notation
   that can be used when defining protocols that use CBOR. We term the
   convention "CBOR data definition language", or CDDL.

   The CBOR notational convention has the following goals:

   (G1)  Provide an unambiguous description of the overall structure of
         a CBOR data structure.

   (G2)  Flexibility to express the freedoms of choice in the CBOR data
         format.

   (G3)  Possibility to restrict format choices where appropriate
         [_format].

   (G4)  Able to express common CBOR datatypes and structures.

   (G5)  Human and machine readable and processable.

   (G6)  Automatic checking of data format compliance.

   (G7)  Extraction of specific elements from CBOR data for further
         processing.

   This document has the following structure:

   The syntax of CDDL is defined in Section 3. Examples of CDDL and
   related CBOR data instances are defined in Section 4. Section 5
   discusses usage of CDDL. Examples are provided early in the text to
   better illustrate concept definitions. A formal definition of CDDL
   using ABNF grammar is provided in Appendix D.
   Finally, a prelude of
   standard CDDL definitions available in every CDDL specification is
   listed in Appendix E.

1.1. Requirements notation

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in RFC
   2119, BCP 14 [RFC2119].

1.2. Terminology

   New terms are introduced in _cursive_. CDDL text in the running text
   is in "typewriter".

2. The Style of Data Structure Specification

   CDDL focuses on styles of specification that are in use in the
   community employing the data model as pioneered by JSON and now
   refined in CBOR.

   There are a number of more or less atomic elements of a CBOR data
   model, such as numbers, simple values (false, true, nil), strings;
   CDDL does not focus on specifying their structure. CDDL of course
   also allows adding a CBOR tag to a data item.

   The more important components of a data structure definition language
   are the data types used for composition: arrays and maps in CBOR
   (called arrays and objects in JSON). While these are only two
   representation formats, they are used to specify four loosely
   distinguishable styles of composition:

   o  A _vector_, an array of elements that are mostly of the same
      semantics. The set of signatures associated with a signed data
      item is a typical application of a vector.

   o  A _record_, an array the elements of which have different,
      positionally defined semantics, as detailed in the data structure
      definition. A 2D point, specified as an array of an x coordinate
      (which comes first) and a y coordinate (coming second) is an
      example of a record, as is the pair of exponent (first) and
      mantissa (second) in a CBOR decimal fraction.

   o  A _table_, a map from a domain of map keys to a domain of map
      values, that are mostly of the same semantics. A set of language
      tags, each mapped to a string translated to that specific
      language, is an example of a table. The key domain is usually not
      limited to a specific set by the specification, but open for the
      application, e.g., in a table mapping IP addresses to MAC
      addresses, the specification does not attempt to foresee all
      possible IP addresses.

   o  A _struct_, a map from a domain of map keys as defined by the
      specification to a domain of map values the semantics of each of
      which is bound to a specific map key. This is what many people
      have in mind when they think about JSON objects; CBOR adds the
      ability to use map keys that are not just strings. Structs can be
      used to solve similar problems as records; the use of explicit map
      keys facilitates optionality and extensibility.

   Two important concepts provide the foundation for CDDL:

   1.  Instead of defining all four types of composition in CDDL
       separately, or even defining one kind for arrays (vectors and
       records) and one kind for maps (tables and structs), there is
       only one kind of composition in CDDL: the _group_ (Section 2.1).

   2.  The other important concept is that of a _type_. The entire CDDL
       specification defines a type (the one defined by its first
       _rule_), which formally is the set of CBOR instances that are
       acceptable for this specification. CDDL predefines a number of
       basic types such as "uint" (unsigned integer) or "tstr" (text
       string), often making use of a simple formal notation for CBOR
       data items. Each value that can be expressed as a CBOR data item
       also is a type in its own right, e.g., "1". A type can be built
       as a _choice_ of other types, e.g., an "int" is either a "uint"
       or a "nint" (negative integer). Finally, a type can be built as
       an array or a map from a group.

2.1. Groups and Composition in CDDL

   CDDL Groups are lists of name/value pairs (group _entries_).

   In an array context, only the value of the entry is represented; the
   name is annotation only (and can be left off if not needed). In a
   map context, the names become the map keys ("member keys").

   In an array context, the sequence of elements in the group is
   important, as it is the information that allows associating actual
   array elements with entries in the group. In a map context, the
   sequence of entries in a group is not relevant (but there is still a
   need to write down group entries in a sequence).

   A group can be placed in (round) parentheses, and given a name by
   using it in a rule:

   pii = (
     age: int,
     name: tstr,
     employer: tstr,
   )

                        Figure 1: A basic group

   Or a group can just be used in the definition of something else:

   person = {(
     age: int,
     name: tstr,
     employer: tstr,
   )}

                    Figure 2: Using a group in a map

   which, given the above rule for pii, is identical to:

   person = {
     pii
   }

                    Figure 3: Using a group by name

   Note that the (curly) braces signify the creation of a map; the
   groups themselves are neutral as to whether they will be used in a
   map or an array.
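
   The neutrality of a group toward its container can be illustrated
   with a small Python sketch. This is not part of CDDL or of any CDDL
   tool: the entry list "PII" and the helper names are hypothetical, and
   decoded CBOR maps and arrays are assumed to arrive as Python dicts
   and lists. The same entry list is checked once in a map context
   (names become keys, order is irrelevant) and once in an array context
   (only values appear, position selects the entry).

```python
# Hypothetical model: a group is an ordered list of (name, type-check) entries.

def is_int(v):
    # bool is a subclass of int in Python; exclude it explicitly.
    return isinstance(v, int) and not isinstance(v, bool)


def is_tstr(v):
    return isinstance(v, str)


# Mirrors the "pii" group: age: int, name: tstr, employer: tstr
PII = [("age", is_int), ("name", is_tstr), ("employer", is_tstr)]


def matches_as_map(group, value):
    """Map context: entry names become member keys; order does not matter."""
    if not isinstance(value, dict):
        return False
    if set(value) != {name for name, _ in group}:
        return False
    return all(check(value[name]) for name, check in group)


def matches_as_array(group, value):
    """Array context: only values are present; position selects the entry."""
    if not isinstance(value, list) or len(value) != len(group):
        return False
    return all(check(v) for (_, check), v in zip(group, value))
```

   In this sketch, {"name": "Ada", "employer": "ACME", "age": 36} is
   accepted in the map context regardless of key order, while the array
   context accepts [36, "Ada", "ACME"] but rejects the same values in a
   different order.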

   The parentheses for groups are optional when there is some other set
   of brackets present, so it would be slightly more natural to express
   Figure 2 as:

   person = {
     age: int,
     name: tstr,
     employer: tstr,
   }

   Groups can be used to factor out common parts of structs, e.g.,
   instead of writing:

   person = {
     age: int,
     name: tstr,
     employer: tstr,
   }

   dog = {
     age: int,
     name: tstr,
     leash-length: float,
   }

   one can choose a name for the common subgroup and write:

   person = {
     identity,
     employer: tstr,
   }

   dog = {
     identity,
     leash-length: float,
   }

   identity = (
     age: int,
     name: tstr,
   )

               Figure 4: Using a group for factorization

   Note that the contents of the braces in the above definitions
   constitute (anonymous) groups, while "identity" is a named group.

2.1.1. Usage

   Groups are the instrument used in composing data structures with
   CDDL. It is a matter of style in defining those structures whether
   to define groups (anonymously) right in their contexts or whether to
   define them in a separate rule and to reference them with their
   respective name (possibly more than once).

   With this, one is allowed to define all small parts of their data
   structures and compose bigger protocol units with those or to have
   only one big protocol data unit that has all definitions ad hoc where
   needed.

2.1.2. Syntax

   The composition syntax intends to be concise and easy to read:

   o  The start of a group can be marked by '('

   o  The end of a group can be marked by ')'

   o  Definitions of entries inside of a group are noted as follows:
      _keytype => valuetype,_ (read "keytype maps to valuetype"). The
      comma is actually optional (not just in the final entry), but it
      is considered good style to set it.
      The double arrow can be
      replaced by a colon in the common case of directly using a string
      as a key (see Section 3.5.1).

   An entry consists of a _keytype_ and a _valuetype_:

   o  _keytype_ is either an atom used as the actual key or a valuetype.
      This may be needed when using groups in a table context, where the
      actual keys are of lesser importance than the key types, e.g., in
      contexts verifying incoming data.

   o  _valuetype_ is either a valuetype derived from the major types
      defined in [RFC7049], a convenience valuetype defined in this
      document (Appendix E) or the name of a group defined in the
      protocol file.

   A group definition can also contain choices between groups, see
   Section 2.2.2.

2.2. Types

2.2.1. Values

   Values such as numbers and strings can be used in place of a type.
   (For instance, this is a very common thing to do for a keytype,
   common enough that CDDL provides additional convenience syntax for
   this.)

2.2.2. Choices

   Many places that allow a type also allow a choice between types,
   delimited by a "/" (slash). The entire choice construct can be put
   into parentheses if this is required to make the construction
   unambiguous (please see Appendix D for the details).

   Choices of values can be used to express enumerations:

   attire = "bow tie" / "necktie" / "Internet attire"
   protocol = 6 / 17

   As with types, CDDL also allows choices between groups,
   delimited by a "//" (double slash).

   address = { delivery }

   delivery = (
   street: tstr, ? number: uint, city //
   po-box: uint, city //
   per-pickup: true )

   city = (
   name: tstr, zip-code: uint
   )

   Both for type choices and for group choices, additional alternatives
   can be added to a rule later in separate rules by using "/=" and
   "//=", respectively, instead of "=":

   attire /= "swimwear"

   delivery //= (
   lat: float, long: float, drone-type: tstr
   )

   It is not a mistake if a name is first used with a "/=" or "//="
   (there is no need to "create it" with "=").

2.2.2.1. Ranges

   Instead of naming all the values that make up a choice, CDDL allows
   building a _range_ out of two values that are in an ordering
   relationship. A range can be inclusive of both ends given (denoted
   by joining two values by ".."), or include the first and exclude the
   second (denoted by instead using "...").

   device-address = byte
   max-byte = 255
   byte = 0..max-byte ; inclusive range
   first-non-byte = 256
   byte1 = 0...first-non-byte ; byte1 is equivalent to byte

   CDDL currently only allows ranges between numbers [_range].

2.2.2.2. Turning a group into a choice

   Some choices are built out of large numbers of values, often
   integers, each of which is best given a semantic name in the
   specification. Instead of naming each of these integers and then
   accumulating these into a choice, CDDL allows building a choice from
   a group by prefixing it with a "&" character:

   terminal-color = &basecolors
   basecolors = (
     black: 0, red: 1, green: 2, yellow: 3,
     blue: 4, magenta: 5, cyan: 6, white: 7,
   )
   extended-color = &(
     basecolors,
     orange: 8, pink: 9, purple: 10, brown: 11,
   )

   As with the use of groups in arrays (Section 3.4), the membernames
   have only documentary value (in particular, they might be used by a
   tool when displaying integers that are taken from that choice).

2.2.3. Representation Types

   CDDL allows the specification of a data item type by referring to the
   CBOR representation (major and minor numbers). How this is used
   should be evident from the prelude (Appendix E).

   It may be necessary to make use of representation types outside the
   prelude, e.g., a specification could start by making use of an
   existing tag in a more specific way, or define a new tag not defined
   in the prelude:

   my_breakfast = #6.55799(breakfast) ; cbor-any is too general!
   breakfast = cereal / porridge
   cereal = #6.998(tstr)
   porridge = #6.999([liquid, solid])
   liquid = milk / water
   milk = 0
   water = 1
   solid = tstr

2.2.4. Root type

   There is no special syntax to identify the root of a CDDL data
   structure definition: that role is simply taken by the first rule
   defined in the file.

   This is motivated by the usual top-down approach for defining data
   structures, decomposing a big data structure unit into smaller parts;
   however, except for the root type, there is no need to strictly
   follow this sequence.

3. Syntax

   In this section, the overall syntax of CDDL is shown, alongside some
   examples just illustrating syntax. (The definition will not attempt
   to be overly formal; refer to Appendix D for the details.)

3.1. General conventions

   The basic syntax is inspired by ABNF [RFC5234], with

   o  Rules, whether they define groups or types, are defined with a
      name, followed by an equals sign "=" and the actual definition
      according to the respective syntactic rules of that definition.

   o  A name can consist of any of the characters from the set {'A',
      ..., 'Z', 'a', ..., 'z', '0', ..., '9', '_', '-', '@', '.', '$'},
      starting with an alphabetic character (including '@', '_', '$')
      and ending in such a character or a digit.

      *  Names are case sensitive.

      *  It is preferred style to start a name with a lower case letter.

      *  The hyphen is preferred over the underscore (except in a
         "bareword" (Section 3.5.1), where the semantics may actually
         require an underscore).

      *  The period may be useful for larger specifications, to express
         some module structure (as in "tcp.throughput" vs.
         "udp.throughput").

      *  A number of names are predefined in the CDDL prelude, as listed
         in Appendix E.

      *  Rule names (types or groups) do not appear in the actual CBOR
         encoding, but names used as "barewords" in member keys do.

   o  Comments are started by a ';' (semicolon) character and finish at
      the end of a line (LF or CRLF).

   o  Outside strings, whitespace (spaces, newlines, and comments) is
      used to separate syntactic elements for readability (and to
      separate identifiers or numbers that follow each other); it is
      otherwise completely optional.

   o  Hexadecimal numbers are preceded by '0x' (without quotes, lower
      case x), and are case insensitive. Similarly, binary numbers are
      preceded by '0b'.

   o  Strings are enclosed by double quotation '"' characters. They
      follow the conventions for strings as defined in [RFC7159],
      section 7. [_strings]

   o  CDDL uses UTF-8 [RFC3629] for its encoding.

   Example:

   ; This is a comment
   person = { g }

   g = (
     "name": tstr,
     age: int,
   )

3.2. Occurrence

   An optional _occurrence_ indicator can be given in front of a group
   entry. It is either one of the characters '?' (optional), '*' (zero
   or more), or '+' (one or more), or is of the form n*m, where n and m
   are optional unsigned integers and n is the lower limit (default 0)
   and m is the upper limit (default no limit) of occurrences.

   If no occurrence indicator is specified, the group entry is to occur
   exactly once (as if 1*1 were specified).
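
   The occurrence rules above can be restated as a small Python sketch
   that maps an indicator to a pair of bounds. The function names
   "occurrence_bounds" and "allows" are hypothetical and not part of any
   CDDL tool; an empty string stands for "no indicator given", and an
   upper bound of None stands for "no limit".

```python
import re


def occurrence_bounds(indicator):
    """Return (lower, upper) for an occurrence indicator; upper None = no limit."""
    if indicator == "":
        return (1, 1)        # no indicator: exactly once, as if 1*1
    if indicator == "?":
        return (0, 1)        # optional
    if indicator == "+":
        return (1, None)     # one or more
    # The n*m form; a bare "*" is the case where both n and m are omitted.
    m = re.fullmatch(r"(\d*)\*(\d*)", indicator)
    if m is None:
        raise ValueError("not an occurrence indicator: %r" % indicator)
    lower = int(m.group(1)) if m.group(1) else 0      # n defaults to 0
    upper = int(m.group(2)) if m.group(2) else None   # m defaults to no limit
    return (lower, upper)


def allows(indicator, count):
    """True if 'count' repetitions satisfy the given occurrence indicator."""
    lower, upper = occurrence_bounds(indicator)
    return count >= lower and (upper is None or count <= upper)
```

   Under these assumptions, "1*2" yields the bounds (1, 2), "2*" yields
   (2, None), and a bare "*" yields (0, None).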

   Note that CDDL, outside any directives/annotations that could
   possibly be defined, does not make any prescription as to whether
   arrays or maps use the definite length or indefinite length encoding.
   I.e., there is no correlation between leaving the size of an array
   "open" in the spec and the fact that it is then interchanged with
   definite or indefinite length.

3.3. Predefined names for types

   CDDL predefines a number of names. This subsection summarizes these
   names, but please see Appendix E for the exact definitions.

   The following keywords for primitive datatypes are defined:

   "bool"  Boolean value (major type 7, additional information 20 or
      21).

   "uint"  An unsigned integer (major type 0).

   "nint"  A negative integer (major type 1).

   "int"  An unsigned integer or a negative integer.

   "float16"  IEEE 754 half-precision float (major type 7, additional
      information 25).

   "float32"  IEEE 754 single-precision float (major type 7, additional
      information 26).

   "float64"  IEEE 754 double-precision float (major type 7, additional
      information 27).

   "float"  One of float16, float32, or float64.

   "bstr" or "bytes"  A byte string (major type 2).

   "tstr" or "text"  A text string (major type 3).

   (Note that there are no predefined names for arrays or maps; these
   are defined with the syntax given below.)

   In addition, a number of types are defined in the prelude that are
   associated with CBOR tags, such as "tdate", "bigint", "regexp" etc.

3.4. Arrays

   Array definitions surround a group with square brackets.

   For each entry, an occurrence indicator as specified in Section 3.2
   is permitted.

   For example:

   unlimited-people = [* person]
   one-or-two-people = [1*2 person]
   at-least-two-people = [2* person]
   person = (
     name: tstr,
     age: uint,
   )

   The group "person" is defined in such a way that repeating it in the
   array each time generates alternating names and ages, so these are
   four valid values for a data item of type "unlimited-people":

   ["roundlet", 1047, "psychurgy", 2204, "extrarhythmical", 2231]
   []
   ["aluminize", 212, "climograph", 4124]
   ["penintime", 1513, "endocarditis", 4084, "impermeator", 1669,
   "coextension", 865]

3.5. Maps

   The syntax for specifying maps merits special attention, as well as a
   number of optimizations and conveniences, as it is likely to be the
   focal point of many specifications employing CDDL. While the syntax
   does not strictly distinguish struct and table usage of maps, it
   caters specifically to each of them.

3.5.1. Structs

   The "struct" usage of maps is similar to the way JSON objects are
   used in many JSON applications.

   A map is defined in the same way as defining an array (see
   Section 3.4), except for using curly braces "{}" instead of square
   brackets "[]".

   An occurrence indicator as specified in Section 3.2 is permitted for
   each group entry.

   The following is an example of a structure:

   Geography = [
     city           : tstr,
     gpsCoordinates : GpsCoordinates,
   ]

   GpsCoordinates = {
     longitude      : uint,            ; multiplied by 10^7
     latitude       : uint,            ; multiplied by 10^7
   }

   When encoding, the Geography structure is encoded using a CBOR array
   with two entries, whereas the GpsCoordinates are encoded as a CBOR
   map with two key-value pairs.

   Types used in a structure can be defined in separate rules or just in
   place (potentially placed inside parentheses, such as for choices).
   E.g.:

   located-samples = {
     sample-point: int,
     samples: [+ float],
   }

   where "located-samples" is the datatype to be used when referring to
   the struct, and "sample-point" and "samples" are the keys to be used.
   This is actually a complete example: an identifier that is followed
   by a colon can be directly used as the text string for a member key
   (we speak of a "bareword" member key), as can a double-quoted string
   or a number. (When other types, in particular multi-valued ones, are
   used as keytypes, they are followed by a double arrow, see below.)

   If a text string key does not match the syntax for an identifier (or
   if the specifier just happens to prefer using double quotes), the
   text string syntax can also be used in the member key position,
   followed by a colon. The above example could therefore have been
   written with quoted strings in the member key positions.

   All the types defined can be used in a keytype position by following
   them with a double arrow. A string also is a (single-valued) type,
   so another form for this example is:

   located-samples = {
     "sample-point" => int,
     "samples" => [+ float],
   }

   A better way to demonstrate the double-arrow use may be:

   located-samples = {
     sample-point: int,
     samples: [+ float],
     * equipment-type => equipment-tolerances,
   }
   equipment-type = [name: tstr, manufacturer: tstr]
   equipment-tolerances = [+ [float, float]]

   The example below defines a struct with optional entries: display
   name (as a text string), the name components first name and family
   name (each as a text string), and age information (as an unsigned
   integer).

   PersonalData = {
     ? displayName: tstr,
     NameComponents,
     ? age: uint,
   }

   NameComponents = (
     ? firstName: tstr,
     ? familyName: tstr,
   )

   Note that the group definition for NameComponents does not generate
   another map; instead, all four keys are directly in the struct built
   by PersonalData.

   In this example, all key/value pairs are optional from the
   perspective of CDDL. With no occurrence indicator, an entry is
   mandatory.

   If the addition of more entries not specified by the current
   specification is desired, one can add this possibility explicitly:

   PersonalData = {
     ? displayName: tstr,
     NameComponents,
     ? age: uint,
     * tstr => any
   }

   NameComponents = (
     ? firstName: tstr,
     ? familyName: tstr,
   )

           Figure 5: Personal Data: Example for extensibility

   The cddl tool (Appendix F) generated as one acceptable instance for
   this specification:

   {"familyName": "agust", "antiforeignism": "pretzel",
   "springbuck": "illuminatingly", "exuviae": "ephemeris",
   "kilometrage": "frogfish"}

   (See Appendix B.2 for one way to explicitly identify an extension
   point.)

3.5.2. Tables

   A table can be specified by defining a map with entries where the
   keytype is not single-valued, e.g.:

   square-roots = {* x => y}
   x = int
   y = float

   Here, the key in each key/value pair has datatype x (defined as int),
   and the value has datatype y (defined as float).

   If the specification does not need to restrict one of x or y (i.e.,
   the application is free to choose per entry), it can be replaced by
   the predefined name "any".

   As another example, the following could be used as a conversion table
   converting from an integer or float to a string:

   tostring = {* x => tstr}
   x = int / float

3.6. Tags

   A type can make use of a CBOR tag (major type 6) by using the
   representation type notation, giving #6.nnn(type) where nnn is an
   unsigned integer giving the tag number and "type" is the type of the
   data item being tagged.
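
   To relate the #6.nnn notation to bytes on the wire, the following
   Python sketch computes the initial bytes that RFC 7049 prescribes for
   a tag with number nnn: major type 6 in the top three bits, followed
   by the tag number in the shortest argument form. The function name
   "tag_head" is hypothetical, and the tagged data item itself (the
   "type" in parentheses) would follow these bytes and is not encoded
   here.

```python
def tag_head(tag_number):
    """Initial bytes of a CBOR tag (major type 6) for the given tag number."""
    if tag_number < 0:
        raise ValueError("tag numbers are unsigned")
    major = 6 << 5  # major type 6 occupies the top three bits: 0xC0
    if tag_number < 24:
        # Small tag numbers fit directly into the additional information.
        return bytes([major | tag_number])
    # Otherwise additional information 24..27 announces a 1-, 2-, 4-,
    # or 8-byte big-endian argument carrying the tag number.
    for info, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if tag_number < 1 << (8 * size):
            return bytes([major | info]) + tag_number.to_bytes(size, "big")
    raise ValueError("tag number does not fit in 64 bits")
```

   For instance, tag_head(2) gives the single byte 0xc2, while
   tag_head(55799) gives the three bytes 0xd9 0xd9 0xf7.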

   For example, the following line from the CDDL prelude (Appendix E)
   defines "biguint" as a type name for a positive bignum N:

   biguint = #6.2(bstr)

   The tags defined by [RFC7049] are included in the prelude.
   Additional tags since registered need to be added to a CDDL
   specification as needed; e.g., a binary UUID tag could be referenced
   as "buuid" in a specification after defining

   buuid = #6.37(bstr)

   In the following example, usage of the tag 32 for URIs is optional:

   my_uri = #6.32(tstr) / tstr

3.7. Operator Precedence

   As with any language that has multiple syntactic features such as
   prefix and infix operators, CDDL has operators that bind more tightly
   than others. This is more complicated than, say, in ABNF,
   as CDDL has both types and groups, with operators that are specific
   to these concepts. Type operators (such as "/" for type choice)
   operate on types, while group operators (such as "//" for group
   choice) operate on groups. Types can simply be used in groups, but
   groups need to be bracketed (as arrays or maps) to become types. So,
   type operators naturally bind closer than group operators.

   For instance, in

   t = [group1]
   group1 = (a / b // c / d)
   a = 1 b = 2 c = 3 d = 4

   group1 is a group choice between the type choice of a and b and the
   type choice of c and d. This becomes more relevant once member keys
   and/or occurrences are added in:

   t = {group2}
   group2 = (? ab: a / b // cd: c / d)
   a = 1 b = 2 c = 3 d = 4

   is a group choice between the optional member "ab" of type a or b and
   the member "cd" of type c or d. Note that the optionality is
   attached to the first choice ("ab"), not to the second choice.
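
   The consequence of this binding for group2 can be spelled out as a
   Python sketch that accepts exactly the maps matching t = {group2}.
   The function names are hypothetical, decoded CBOR maps are assumed to
   arrive as Python dicts, and the check is written out by hand for this
   one rule rather than derived by a general CDDL matcher.

```python
def one_of(value, allowed):
    # Only genuine integers count; Python's True would otherwise equal 1.
    return type(value) is int and value in allowed


def matches_t(value):
    """Check a decoded map against t = {group2} with
    group2 = (? ab: a / b // cd: c / d) and a = 1, b = 2, c = 3, d = 4."""
    if not isinstance(value, dict):
        return False
    # First alternative: the member "ab" is optional, so the empty map
    # matches, as does a map holding only "ab" with value 1 or 2.
    first = value == {} or (set(value) == {"ab"} and one_of(value["ab"], (1, 2)))
    # Second alternative: the member "cd" is mandatory, value 3 or 4.
    second = set(value) == {"cd"} and one_of(value["cd"], (3, 4))
    return first or second
```

   In this reading, {} and {"ab": 2} match through the first
   alternative and {"cd": 3} matches through the second, while a map
   carrying both "ab" and "cd" matches neither.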
837 Similarly, in 839 t = [group3] 840 group3 = (+ a / b / c) 841 a = 1 b = 2 c = 3 843 group3 is a repetition of a type choice between a, b, and c [unflex]; 844 if just a is to be repeatable, a group choice is needed to focus the 845 occurrence: 847 t = [group4] 848 group4 = (+ a // b / c) 849 a = 1 b = 2 c = 3 851 group4 is a group choice between a repeatable a and a single b or c. 853 In general, as with many other languages with operator precedence 854 rules, it is best not to rely on them, but to insert parentheses for 855 readability: 857 t = [group4a] 858 group4a = ((+ a) // (b / c)) 859 a = 1 b = 2 c = 3 861 The operator precedences, in sequence of loose to tight binding, are 862 defined in Appendix D and summarized in Table 1. (Arities given are 863 1 for unary prefix operators and 2 for binary infix operators.) 865 +----------+----+---------------------------+------+ 866 | Operator | Ar | Operates on | Prec | 867 +----------+----+---------------------------+------+ 868 | = | 2 | name = type, name = group | 1 | 869 | /= | 2 | name /= type | 1 | 870 | //= | 2 | name //= group | 1 | 871 | // | 2 | group // group | 2 | 872 | , | 2 | group, group | 3 | 873 | * | 1 | * group | 4 | 874 | N*M | 1 | N*M group | 4 | 875 | + | 1 | + group | 4 | 876 | ? | 1 | ? group | 4 | 877 | => | 2 | type => type | 5 | 878 | : | 2 | name: type | 5 | 879 | / | 2 | type / type | 6 | 880 | & | 1 | &group | 6 | 881 | .. | 2 | type..type | 7 | 882 | ... | 2 | type...type | 7 | 883 | .anno | 2 | type .anno type | 7 | 884 +----------+----+---------------------------+------+ 886 Table 1: Summary of operator precedences 888 4. Examples 890 This section contains various examples of structures defined using 891 CDDL. 893 4.1. Moves in a computer game 895 A multiplayer computer game uses CBOR to exchange moves between the 896 players. To ensure a good gaming experience, the move information 897 needs to be exchanged quickly and frequently. 
Therefore, the game 898 uses CBOR to send its information in a compact format. Figure 6 899 shows definition of the CBOR information exchange format. 901 UpdateMsg = [* { 902 move_no : uint, ; increases for each move 903 player_info : PlayerInfo, ; general information 904 moves : Moves, ; moves in this message 905 }] 907 PlayerInfo = { 908 alias : tstr, 909 player_id : uint, 910 experience : uint, ; beginner: 0; expert: 3 911 gold : uint, 912 supplies : Supplies, 913 avg_strength : float16, 914 } 916 Supplies = { 917 wood => uint 918 iron => uint 919 grain => uint 920 } 922 wood = 0 923 iron = 1 924 grain = 2 926 Moves = [* Move] 928 Move = ( 929 unit_id : uint, 930 unit_strength : uint, ; between 0 and 100 931 2*2 source_pos : uint, ; (x,y) 932 2*2 target_pos : uint, ; (x,y) 933 ) 935 Figure 6: CDDL definition of an information exchange format for a 936 computer game 938 The CDDL tool generates this as a possible instance: 940 [{"move_no": 3985, "player_info": 941 {"alias": "timbrologist", "player_id": 699, "experience": 2699, 942 "gold": 328, "supplies": {0: 1768, 1: 3087, 2: 1401}, 943 "avg_strength": 0.9712613869888417}, 944 "moves": [[1702, 458, 38, 399, 327, 304], 945 [3145, 4454, 1175, 3441, 74, 1542], 946 [4099, 4062, 2808, 8, 3174, 3048], 947 [367, 3649, 756, 3644, 3725, 2769]]}, 948 {"move_no": 199, "player_info": 949 {"alias": "cipo", "player_id": 4309, "experience": 4094, 950 "gold": 4114, "supplies": {0: 873, 1: 4706, 2: 1733}, 951 "avg_strength": 0.37808379403466696}, 952 "moves": [[1977, 3129, 3890, 4000, 1555, 377], 953 [2646, 286, 3363, 4381, 3815, 1039]]}, 954 {"move_no": 2226, "player_info": 955 {"alias": "Stacey", "player_id": 1055, "experience": 207, 956 "gold": 285, "supplies": {0: 3325, 1: 1515, 2: 3304}, 957 "avg_strength": 0.8590028130444863}, 958 "moves": [[869, 4126, 2382, 3155, 1523, 2621]]}] 960 Notice that the supplies have been encoded as a map with integer 961 keys. 
In this example, using string keys would also have been 962 suitable; the example just illustrates the possibility of using other 963 datatypes for keys, leading to a more efficient encoding. 965 The tool-generated binary CBOR for the instance above cannot yet express 966 that the floating point values are 16-bit: 968 83 # array(3) 969 a3 # map(3) 970 67 # text(7) 971 6d6f76655f6e6f # "move_no" 972 19 0f91 # unsigned(3985) 973 6b # text(11) 974 706c617965725f696e666f # "player_info" 975 a6 # map(6) 976 65 # text(5) 977 616c696173 # "alias" 978 6c # text(12) 979 74696d62726f6c6f67697374 # "timbrologist" 980 69 # text(9) 981 706c617965725f6964 # "player_id" 982 19 02bb # unsigned(699) 983 6a # text(10) 984 657870657269656e6365 # "experience" 985 19 0a8b # unsigned(2699) 986 64 # text(4) 987 676f6c64 # "gold" 989 19 0148 # unsigned(328) 990 68 # text(8) 991 737570706c696573 # "supplies" 992 a3 # map(3) 993 00 # unsigned(0) 994 19 06e8 # unsigned(1768) 995 01 # unsigned(1) 996 19 0c0f # unsigned(3087) 997 02 # unsigned(2) 998 19 0579 # unsigned(1401) 999 6c # text(12) 1000 6176675f737472656e677468 # "avg_strength" 1001 fb 3fef1492c29f8275 # primitive(4606923564386321013) 1002 65 # text(5) 1003 6d6f766573 # "moves" 1004 84 # array(4) 1005 86 # array(6) 1006 19 06a6 # unsigned(1702) 1007 19 01ca # unsigned(458) 1008 18 26 # unsigned(38) 1009 19 018f # unsigned(399) 1010 19 0147 # unsigned(327) 1011 19 0130 # unsigned(304) 1012 86 # array(6) 1013 19 0c49 # unsigned(3145) 1014 19 1166 # unsigned(4454) 1015 19 0497 # unsigned(1175) 1016 19 0d71 # unsigned(3441) 1017 18 4a # unsigned(74) 1018 19 0606 # unsigned(1542) 1019 86 # array(6) 1020 19 1003 # unsigned(4099) 1021 19 0fde # unsigned(4062) 1022 19 0af8 # unsigned(2808) 1023 08 # unsigned(8) 1024 19 0c66 # unsigned(3174) 1025 19 0be8 # unsigned(3048) 1026 86 # array(6) 1027 19 016f # unsigned(367) 1028 19 0e41 # unsigned(3649) 1029 19 02f4 # unsigned(756) 1030 19 0e3c # unsigned(3644) 1031 19 0e8d # unsigned(3725) 1032 19
0ad1 # unsigned(2769) 1033 a3 # map(3) 1034 67 # text(7) 1035 6d6f76655f6e6f # "move_no" 1036 18 c7 # unsigned(199) 1037 6b # text(11) 1038 706c617965725f696e666f # "player_info" 1039 a6 # map(6) 1040 65 # text(5) 1041 616c696173 # "alias" 1042 64 # text(4) 1043 6369706f # "cipo" 1044 69 # text(9) 1045 706c617965725f6964 # "player_id" 1046 19 10d5 # unsigned(4309) 1047 6a # text(10) 1048 657870657269656e6365 # "experience" 1049 19 0ffe # unsigned(4094) 1050 64 # text(4) 1051 676f6c64 # "gold" 1052 19 1012 # unsigned(4114) 1053 68 # text(8) 1054 737570706c696573 # "supplies" 1055 a3 # map(3) 1056 00 # unsigned(0) 1057 19 0369 # unsigned(873) 1058 01 # unsigned(1) 1059 19 1262 # unsigned(4706) 1060 02 # unsigned(2) 1061 19 06c5 # unsigned(1733) 1062 6c # text(12) 1063 6176675f737472656e677468 # "avg_strength" 1064 fb 3fd832865ea1b216 # primitive(4600482572053623318) 1065 65 # text(5) 1066 6d6f766573 # "moves" 1067 82 # array(2) 1068 86 # array(6) 1069 19 07b9 # unsigned(1977) 1070 19 0c39 # unsigned(3129) 1071 19 0f32 # unsigned(3890) 1072 19 0fa0 # unsigned(4000) 1073 19 0613 # unsigned(1555) 1074 19 0179 # unsigned(377) 1075 86 # array(6) 1076 19 0a56 # unsigned(2646) 1077 19 011e # unsigned(286) 1078 19 0d23 # unsigned(3363) 1079 19 111d # unsigned(4381) 1080 19 0ee7 # unsigned(3815) 1081 19 040f # unsigned(1039) 1082 a3 # map(3) 1083 67 # text(7) 1084 6d6f76655f6e6f # "move_no" 1086 19 08b2 # unsigned(2226) 1087 6b # text(11) 1088 706c617965725f696e666f # "player_info" 1089 a6 # map(6) 1090 65 # text(5) 1091 616c696173 # "alias" 1092 66 # text(6) 1093 537461636579 # "Stacey" 1094 69 # text(9) 1095 706c617965725f6964 # "player_id" 1096 19 041f # unsigned(1055) 1097 6a # text(10) 1098 657870657269656e6365 # "experience" 1099 18 cf # unsigned(207) 1100 64 # text(4) 1101 676f6c64 # "gold" 1102 19 011d # unsigned(285) 1103 68 # text(8) 1104 737570706c696573 # "supplies" 1105 a3 # map(3) 1106 00 # unsigned(0) 1107 19 0cfd # unsigned(3325) 1108 01 # unsigned(1) 1109 19 
05eb # unsigned(1515) 1110 02 # unsigned(2) 1111 19 0ce8 # unsigned(3304) 1112 6c # text(12) 1113 6176675f737472656e677468 # "avg_strength" 1114 fb 3feb7cf377a65699 # primitive(4605912429042751129) 1115 65 # text(5) 1116 6d6f766573 # "moves" 1117 81 # array(1) 1118 86 # array(6) 1119 19 0365 # unsigned(869) 1120 19 101e # unsigned(4126) 1121 19 094e # unsigned(2382) 1122 19 0c53 # unsigned(3155) 1123 19 05f3 # unsigned(1523) 1124 19 0a3d # unsigned(2621) 1126 Figure 7: CBOR instance for game example 1128 4.2. Fruit 1130 Figure 8 contains an example for a CBOR structure that contains 1131 information about fruit. 1133 fruitlist = [* Fruit] 1135 Fruit = { 1136 name : tstr, 1137 colour : [* color], 1138 avg_weight : float16, 1139 price : uint, 1140 international_names : International, 1141 rfu : bstr, ; reserved for future use 1142 } 1144 International = { 1145 "DE" : tstr, ; German 1146 "EN" : tstr, ; English 1147 "FR" : tstr, ; French 1148 "NL" : tstr, ; Dutch 1149 "ZH-HANS" : tstr, ; Chinese 1150 } 1152 color = &( 1153 black: 0, red: 1, green: 2, yellow: 3, 1154 blue: 4, magenta: 5, cyan: 6, white: 7, 1155 ) 1157 Figure 8: Example CBOR structure 1159 4.3. RFC 7071 1161 [RFC7071] defines the Reputon structure for JSON using somewhat 1162 formalized English text. Here is a (somewhat verbose) equivalent 1163 definition using the same terms, but notated in CDDL: 1165 reputation-object = { 1166 reputation-context, 1167 reputon-list 1168 } 1170 reputation-context = ( 1171 application: text 1172 ) 1174 reputon-list = ( 1175 reputons: reputon-array 1176 ) 1178 reputon-array = [* reputon] 1180 reputon = { 1181 rater-value, 1182 assertion-value, 1183 rated-value, 1184 rating-value, 1185 ? conf-value, 1186 ? normal-value, 1187 ? sample-value, 1188 ? gen-value, 1189 ? 
expire-value, 1190 * ext-value, 1191 } 1193 rater-value = ( rater: text ) 1194 assertion-value = ( assertion: text ) 1195 rated-value = ( rated: text ) 1196 rating-value = ( rating: float16 ) 1197 conf-value = ( confidence: float16 ) 1198 normal-value = ( normal-rating: float16 ) 1199 sample-value = ( sample-size: uint ) 1200 gen-value = ( generated: uint ) 1201 expire-value = ( expires: uint ) 1202 ext-value = ( text => any ) 1204 An equivalent, more compact form of this example would be: 1206 reputation-object = { 1207 application: text 1208 reputons: [* reputon] 1209 } 1211 reputon = { 1212 rater: text 1213 assertion: text 1214 rated: text 1215 rating: float16 1216 ? confidence: float16 1217 ? normal-rating: float16 1218 ? sample-size: uint 1219 ? generated: uint 1220 ? expires: uint 1221 * text => any 1222 } 1224 Note how this rather clearly delineates the structure somewhat 1225 shrouded by so many words in section 6.2.2. of [RFC7071]. Also, this 1226 definition makes it clear that several ext-values are allowed (by 1227 definition with different member names); RFC 7071 could be read to 1228 forbid the repetition of ext-value ("A specific reputon-element MUST 1229 NOT appear more than once" is ambiguous.) 
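The occurrence indicators in the compact reputon definition can be read as a checking procedure: members without an indicator must be present, members marked "?" may be absent, and "* text => any" admits further text-keyed members. The Python sketch below follows that reading for data that has already been decoded into dicts and lists. It is a hand-written illustration, not output of the CDDL tool; the function and table names are assumptions, floats are checked as Python floats without regard to the float16 precision, and named members are assumed to take precedence over the "* text => any" wildcard.

```python
def is_text(x):
    return isinstance(x, str)

def is_float(x):
    return isinstance(x, float)

def is_uint(x):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(x, int) and not isinstance(x, bool) and x >= 0

# member name -> (type check, mandatory?), transcribed from the compact form
REPUTON_MEMBERS = {
    "rater": (is_text, True),
    "assertion": (is_text, True),
    "rated": (is_text, True),
    "rating": (is_float, True),
    "confidence": (is_float, False),     # "?" marks an optional member
    "normal-rating": (is_float, False),
    "sample-size": (is_uint, False),
    "generated": (is_uint, False),
    "expires": (is_uint, False),
}

def matches_reputon(item):
    if not isinstance(item, dict):
        return False
    for name, (check, mandatory) in REPUTON_MEMBERS.items():
        if name in item:
            if not check(item[name]):
                return False
        elif mandatory:
            return False
    # "* text => any": extra members are fine as long as their keys are text.
    return all(is_text(key) for key in item)

def matches_reputation_object(obj):
    # The outer map has exactly two members and no extension point.
    return (isinstance(obj, dict)
            and set(obj) == {"application", "reputons"}
            and is_text(obj["application"])
            and isinstance(obj["reputons"], list)
            and all(matches_reputon(r) for r in obj["reputons"]))

sample = {"application": "tridentiferous",
          "reputons": [{"rater": "precreed", "assertion": "xanthosis",
                        "rated": "balsamy", "rating": 0.36091333590593955,
                        "sample-size": 3904, "bearer": "nitty"}]}
print(matches_reputation_object(sample))
```

Keeping the member table separate from the checking loop mirrors how the CDDL text separates the structure from the tool that interprets it.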
1231 The CDDL tool (which hasn't quite been trained for polite 1232 conversation) says: 1234 { 1235 "application": "tridentiferous", 1236 "reputons": [ 1237 { 1238 "rater": "loamily", 1239 "assertion": "Dasyprocta", 1240 "rated": "uncommensurableness", 1241 "rating": 0.05055809746548934, 1242 "confidence": 0.7484706448605812, 1243 "normal-rating": 0.8677887734049299, 1244 "sample-size": 4059, 1245 "expires": 3969, 1246 "bearer": "nitty", 1247 "faucal": "postulnar", 1248 "naturalism": "sarcotic" 1249 }, 1250 { 1251 "rater": "precreed", 1252 "assertion": "xanthosis", 1253 "rated": "balsamy", 1254 "rating": 0.36091333590593955, 1255 "confidence": 0.3700759808403371, 1256 "sample-size": 3904 1257 }, 1258 { 1259 "rater": "urinosexual", 1260 "assertion": "malacostracous", 1261 "rated": "arenariae", 1262 "rating": 0.9210673488013762, 1263 "normal-rating": 0.4778762617112776, 1264 "sample-size": 4428, 1265 "generated": 3294, 1266 "backfurrow": "enterable", 1267 "fruitgrower": "flannelflower" 1268 }, 1269 { 1270 "rater": "pedologistically", 1271 "assertion": "unmetaphysical", 1272 "rated": "elocutionist", 1273 "rating": 0.42073613384304287, 1274 "misimagine": "retinaculum", 1275 "snobbish": "contradict", 1276 "Bosporanic": "periostotomy", 1277 "dayworker": "intragyral" 1278 } 1279 ] 1280 } 1282 4.4. Examples from JSON Content Rules 1284 Although JSON Content Rules [I-D.newton-json-content-rules] seems to 1285 address a more general problem than CDDL, it is still a worthwhile 1286 resource to explore for examples (beyond all the inspiration the 1287 format itself has had for CDDL). 
1289 Figure 2 of the JCR I-D looks very similar, if slightly less noisy, 1290 in CDDL: 1292 root = [2*2 { 1293 precision: text, 1294 Latitude: float, 1295 Longitude: float, 1296 Address: text, 1297 City: text, 1298 State: text, 1299 Zip: text, 1300 Country: text 1301 }] 1303 Figure 9: JCR, Figure 2, in CDDL 1305 Apart from the lack of a need to quote the member names, text strings 1306 are called "text" or "tstr" in CDDL ("string" would be ambiguous as 1307 CBOR also provides byte strings). 1309 The CDDL tool creates the below example instance for this: 1311 [{"precision": "pyrosphere", "Latitude": 0.5399712314350172, 1312 "Longitude": 0.5157523963028087, "Address": "resow", 1313 "City": "problemwise", "State": "martyrlike", "Zip": "preprove", 1314 "Country": "Pace"}, 1315 {"precision": "unrigging", "Latitude": 0.10422704368372193, 1316 "Longitude": 0.6279808663725834, "Address": "picturedom", 1317 "City": "decipherability", "State": "autometry", "Zip": "pout", 1318 "Country": "wimple"}] 1320 Figure 4 of the JCR I-D in CDDL: 1322 root = { image } 1324 image = ( 1325 Image: { 1326 size, 1327 Title: text, 1328 thumbnail, 1329 IDs: [* int] 1330 } 1331 ) 1333 size = ( 1334 Width: 0..1280 1335 Height: 0..1024 1336 ) 1338 thumbnail = ( 1339 Thumbnail: { 1340 size, 1341 Url: uri 1342 } 1343 ) 1345 This shows how the group concept can be used to keep related elements 1346 (here: width, height) together, and to emulate the JCR style of 1347 specification. (It also shows using a tag from the prelude, "uri" - 1348 this could be done differently.) 
The more compact form of Figure 5 1349 of the JCR I-D could be emulated like this: 1351 root = { 1352 Image: { 1353 size, Title: text, 1354 Thumbnail: { size, Url: uri }, 1355 IDs: [* int] 1356 } 1357 } 1359 size = ( 1360 Width: 0..1280, 1361 Height: 0..1024, 1362 ) 1364 The CDDL tool creates the following example instance for this: 1366 {"Image": {"Width": 566, "Height": 516, "Title": "leisterer", 1367 "Thumbnail": {"Width": 1111, "Height": 176, "Url": 32("scrog")}, 1368 "IDs": []}} 1370 5. Making Use of CDDL 1372 In this section, we discuss several potential ways to employ CDDL. 1374 5.1. As a guide to a human user 1376 CDDL can be used to efficiently define the layout of CBOR data, such 1377 that a human implementer can easily see how data is supposed to be 1378 encoded. 1380 Since CDDL maps parts of the CBOR data to human-readable names, tools 1381 could be built that use CDDL to provide a human-friendly 1382 representation of the CBOR data, and allow users to edit such data 1383 while keeping it compliant with its CDDL definition. 1385 5.2. For automated checking of CBOR data structure 1387 CDDL has been specified such that a machine can handle the CDDL 1388 definition and related CBOR data. For example, a machine could use 1389 CDDL to check whether or not CBOR data is compliant with its 1390 definition. 1392 How thorough such compliance checking needs to be depends on the 1393 application. For example, an application may decide not to check the 1394 data structure at all, and use the CDDL definition solely as a means 1395 to indicate the structure of the data to the programmer. 1397 At the other extreme, the application may also implement a checking 1398 mechanism that goes as far as checking that all mandatory map pairs 1399 are present. 1401 The extent to which the data description must be enforced by an 1402 application is left to the designers and implementers of that 1403 application, keeping in mind related security considerations.
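As one illustration of the stricter end of this spectrum, the Python sketch below checks decoded data against the Image definition given earlier: it reports missing mandatory members and values outside the inclusive ranges 0..1280 and 0..1024. This is a minimal hand-written sketch, not generated code and not the behaviour of any existing tool. The function names and the message wording are assumptions, the "uri" tag is assumed to have been decoded to a plain string, and unexpected extra members are deliberately not reported.

```python
def is_int(x):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(x, int) and not isinstance(x, bool)

def check_size(entry, path, problems):
    """Check the `size` group: Width: 0..1280, Height: 0..1024, both mandatory."""
    for key, upper in (("Width", 1280), ("Height", 1024)):
        if key not in entry:
            problems.append(f"{path}: missing mandatory member {key}")
        elif not (is_int(entry[key]) and 0 <= entry[key] <= upper):
            problems.append(f"{path}.{key}: {entry[key]!r} is not in 0..{upper}")

def check_root(doc):
    """Return a list of problems; an empty list means no problem was found."""
    problems = []
    image = doc.get("Image") if isinstance(doc, dict) else None
    if not isinstance(image, dict):
        return ["root: missing mandatory member Image"]
    check_size(image, "Image", problems)
    if not isinstance(image.get("Title"), str):
        problems.append("Image.Title: missing or not text")
    thumbnail = image.get("Thumbnail")
    if not isinstance(thumbnail, dict):
        problems.append("Image.Thumbnail: missing or not a map")
    else:
        check_size(thumbnail, "Image.Thumbnail", problems)
        if "Url" not in thumbnail:
            problems.append("Image.Thumbnail: missing mandatory member Url")
    ids = image.get("IDs")
    if not (isinstance(ids, list) and all(is_int(i) for i in ids)):
        problems.append("Image.IDs: missing or not an array of int")
    return problems

good = {"Image": {"Width": 566, "Height": 516, "Title": "leisterer",
                  "Thumbnail": {"Width": 1111, "Height": 176, "Url": "scrog"},
                  "IDs": []}}
bad = {"Image": {"Width": 1281, "Title": "leisterer",
                 "Thumbnail": {"Width": 1111, "Height": 176, "Url": "scrog"},
                 "IDs": []}}
print(check_root(good))  # []
print(check_root(bad))
```

Collecting all problems instead of stopping at the first one is a design choice that suits diagnostic tools; an application on a constrained device might prefer to reject at the first mismatch.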
1405 In no case is the intention that a CDDL tool would be "writing code" 1406 for an implementation. 1408 5.3. For data analysis tools 1410 In the long run, it can be expected that more and more data will be 1411 stored using the CBOR data format. 1413 Where there is data, there is data analysis and the need to process 1414 such data automatically. CDDL can be used for such automated data 1415 processing, allowing tools to verify data, clean it, and extract 1416 particular parts of interest from it. 1418 Since CBOR is designed with constrained devices in mind, a likely use 1419 of it would be in small sensors. An interesting use would thus be 1420 automated analysis of sensor data. 1422 6. Discussion 1424 CDDL is already usable in its present form, as Section 4.3 should 1425 have demonstrated. However, additional examples should be developed, 1426 and some experience should be gained with the usefulness of tools built 1427 around CDDL. 1429 6.1. Work to do 1431 o The precise semantics of occurrence indicators as defined in 1432 Section 3.2 could be explained in more detail. One example is the exact 1433 semantics of an occurrence indicator on a group name in a map 1434 (which means the entire group can occur in this way). 1436 o Build good use cases, one each to demonstrate vector, record, 1437 table, and struct usage. 1439 o There probably are some security considerations. 1441 See also the editorial comments sprinkled throughout the document. 1443 7. Resolved Issues 1445 o The key/value pairs in maps have no fixed ordering. One could 1446 imagine situations where fixing the ordering may be of use. For 1447 example, a decoder could look for values associated with integer keys 1448 1, 3 and 7. If the order were fixed and the decoder encountered 1449 the key 4 without having encountered key 3, it could conclude that 1450 key 3 is not present without doing more complicated bookkeeping.
1451 Unfortunately, neither JSON nor CBOR supports this, so no attempt 1452 was made to support this in CDDL either. 1454 o CDDL distinguishes the various CBOR number types, but there is 1455 only one number type in JSON. Specifying a 1456 precision (float16/float32/float64) has no effect when CDDL is used to specify 1457 JSON data structures. (The current validator implementation 1458 (Appendix F) does not handle this very well, either.) 1460 8. Security considerations 1462 This document presents a content rules language for expressing CBOR 1463 data structures. As such, it does not bring any security issues by 1464 itself, although specifications of protocols that use CBOR naturally 1465 need security analysis when defined. 1467 Topics that could be considered in the security considerations section 1468 of a specification that uses CDDL to define CBOR structures include the following: 1470 o Where could the language cause confusion in a way that 1471 enables security issues? 1473 9. IANA considerations 1475 This document does not require any IANA registrations. 1477 10. Acknowledgements 1479 CDDL was originally conceived by Bert Greevenbosch, who also wrote 1480 the original five versions of this document. 1482 Inspiration was taken from the C and Pascal languages, MPEG's 1483 conventions for describing structures in the ISO base media file 1484 format, Relax-NG and its compact syntax [RELAXNG], and in particular 1485 from Andrew Lee Newton's "JSON Content Rules" 1486 [I-D.newton-json-content-rules]. 1488 Useful feedback came from Carsten Bormann, Joe Hildebrand, Sean 1489 Leonard and Jim Schaad. 1491 The CDDL tool was written by Carsten Bormann, building on previous 1492 work by Troy Heninger and Tom Lord. 1494 11. References 1496 11.1. Normative References 1498 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 1499 Requirement Levels", BCP 14, RFC 2119, 1500 DOI 10.17487/RFC2119, March 1997, 1501 .
1503 [RFC3629] Yergeau, F., "UTF-8, a transformation format of ISO 1504 10646", STD 63, RFC 3629, DOI 10.17487/RFC3629, November 1505 2003, . 1507 [RFC5234] Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax 1508 Specifications: ABNF", STD 68, RFC 5234, 1509 DOI 10.17487/RFC5234, January 2008, 1510 . 1512 [RFC7049] Bormann, C. and P. Hoffman, "Concise Binary Object 1513 Representation (CBOR)", RFC 7049, DOI 10.17487/RFC7049, 1514 October 2013, . 1516 [RFC7159] Bray, T., Ed., "The JavaScript Object Notation (JSON) Data 1517 Interchange Format", RFC 7159, DOI 10.17487/RFC7159, March 1518 2014, . 1520 11.2. Informative References 1522 [RELAXNG] OASIS, "RELAX-NG Compact Syntax", November 2002, 1523 . 1525 [RFC7071] Borenstein, N. and M. Kucherawy, "A Media Type for 1526 Reputation Interchange", RFC 7071, DOI 10.17487/RFC7071, 1527 November 2013, . 1529 [I-D.newton-json-content-rules] 1530 Newton, A. and P. Cordell, "A Language for Rules 1531 Describing JSON Content", draft-newton-json-content- 1532 rules-05 (work in progress), October 2015. 1534 [RFC4648] Josefsson, S., "The Base16, Base32, and Base64 Data 1535 Encodings", RFC 4648, DOI 10.17487/RFC4648, October 2006, 1536 . 1538 Appendix A. Cemetery 1540 The following ideas are buried for now: 1542 o <...> as syntax for enumerations. We view a value as just 1543 another type (a very specific type with just one member), so that 1544 an enumeration can be denoted as a choice using "/" as the 1545 delimiter of choices. Because of this, there is no evidence 1546 that a separate syntax for enumerations is needed. 1548 Appendix B. Nursery 1550 This appendix describes advanced features that are still under heavy 1551 review. 1553 B.1. Annotations 1555 An _annotation_ allows a _target_ type to be annotated with a _control_ 1556 type via an _annotator_. 1558 The syntax for an annotated type is "target .annotator control", 1559 where annotators are special identifiers prefixed by a dot.
(Note 1560 that _target_ or _control_ might need to be parenthesized.) 1561 A number of annotators are defined at this point. Note that the CDDL tool 1562 does not currently support combining multiple annotations on a single 1563 target. 1565 B.1.1. Annotation .size 1567 A ".size" annotation constrains the size of the target, in bytes, to the 1568 control type. Examples: 1570 full-address = [[+ label], ip4, ip6] 1571 ip4 = bstr .size 4 1572 ip6 = bstr .size 16 1573 label = bstr .size (1..63) 1575 Figure 10: Annotation for size in bytes 1577 (FIXME: In the CDDL tool, the target must be a byte string for now.) 1579 When applied to an unsigned integer, the ".size" annotation restricts 1580 the range of that integer by giving a maximum number of bytes that 1581 should be needed in a computer representation of that unsigned 1582 integer. In other words, "uint .size N" is equivalent to 1583 "0...BYTES_N", where BYTES_N == 256**N. 1585 audio_sample = uint .size 3 ; 24-bit, equivalent to 0..16777215 1587 Figure 11: Annotation for integer size in bytes 1589 Note that, as with value restrictions in CDDL, this annotation is not 1590 a representation constraint; a number that fits into fewer bytes can 1591 still be represented in that form, and an inefficient implementation 1592 could use a longer form (unless that is restricted by some format 1593 constraints outside of CDDL, such as the rules in Section 3.9 of 1594 [RFC7049]). 1596 B.1.2. Annotation .bits 1598 A ".bits" annotation on a byte string indicates that, in the target, 1599 only the bits numbered by a number in the control type are allowed to 1600 be set. (Bits are counted the usual way, bit number "n" being set in 1601 "str" meaning that "(str[n >> 3] & (1 << (n & 7))) != 0".) 1602 [_bitsendian] 1604 Similarly, a ".bits" annotation on an unsigned integer "i" indicates 1605 that for all unsigned integers "n" where "(i & (1 << n)) != 0", "n" 1606 is in the control type.
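The bit-numbering expressions quoted above translate almost literally into code. The following Python sketch spells out the ".size" rule for unsigned integers and the two ".bits" rules; it is meant as a reading aid under stated assumptions, not as the CDDL tool's implementation. The function names are invented here, and the control type is assumed to have been reduced to a plain set of allowed bit numbers.

```python
def uint_size_ok(value: int, n: int) -> bool:
    """uint .size n is equivalent to 0...256**n (upper bound excluded)."""
    return 0 <= value < 256 ** n

def bstr_bits_ok(data: bytes, allowed: set) -> bool:
    """bstr .bits: every bit number n that is set must be in `allowed`."""
    for n in range(8 * len(data)):
        # Same test as in the text: (str[n >> 3] & (1 << (n & 7))) != 0
        if data[n >> 3] & (1 << (n & 7)) and n not in allowed:
            return False
    return True

def uint_bits_ok(i: int, allowed: set) -> bool:
    """uint .bits: every n with (i & (1 << n)) != 0 must be in `allowed`."""
    if i < 0:
        raise ValueError("expected an unsigned integer")
    n = 0
    while i >> n:
        if i & (1 << n) and n not in allowed:
            return False
        n += 1
    return True

# Hypothetical control types, written out as sets of bit numbers.
rwx = {0, 1, 2}
flag_bits = {0} | set(range(4, 16))

print(uint_size_ok(16777215, 3), uint_size_ok(16777216, 3))  # True False
print(uint_bits_ok(0b101, rwx), uint_bits_ok(0b1000, rwx))   # True False
print(bstr_bits_ok(bytes.fromhex("906d"), flag_bits))        # True
print(bstr_bits_ok(b"\x02", flag_bits))                      # False
```

Note that an empty byte string passes bstr_bits_ok for any control type, since no bit is set; the annotation constrains which bits may be set, not how long the string is.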
1608 tcpflagbytes = bstr .bits flags 1609 flags = &( 1610 fin: 8, 1611 syn: 9, 1612 rst: 10, 1613 psh: 11, 1614 ack: 12, 1615 urg: 13, 1616 ece: 14, 1617 cwr: 15, 1618 ns: 0, 1619 ) / (4..7) ; data offset bits 1621 rwxbits = uint .bits rwx 1622 rwx = &(r: 2, w: 1, x: 0) 1624 Figure 12: Annotation for what bits can be set 1626 The CDDL tool generates the following ten example instances for 1627 "tcpflagbytes": 1629 h'906d' h'01fc' h'8145' h'01b7' h'013d' h'409f' h'018e' h'c05f' 1630 h'01fa' h'01fe' 1632 These examples do not illustrate that the above CDDL specification 1633 does not explicitly specify a size of two bytes: A valid all clear 1634 instance of flag bytes could be "h''" or "h'00'" or even "h'000000'" 1635 as well. 1637 B.1.3. Annotation .regexp 1639 A ".regexp" annotation indicates that the text string given as a 1640 target needs to match the PCRE regular expression given as a value in 1641 the control type, where that regular expression is anchored on both 1642 sides. (If anchoring is not desired for a side, ".*" needs to be 1643 inserted there.) 1645 nai = tstr .regexp "\\w+@\\w+(\\.\\w+)+" 1647 Figure 13: Annotation with a PCRE regexp 1649 The CDDL tool proposes: 1651 "N1@CH57HF.4Znqe0.dYJRN.igjf" 1653 B.1.4. Annotations .cbor and .cborseq 1655 A ".cbor" annotation on a byte string indicates that the byte string 1656 carries a CBOR encoded data item. Decoded, the data item matches the 1657 type given as the right-hand side argument (type1 in the following 1658 example). 1660 "bytes .cbor type1" 1662 Similarly, a ".cborseq" annotation on a byte string indicates that 1663 the byte string carries a sequence of CBOR encoded data items. When 1664 the data items are taken as an array, the array matches the type 1665 given as the right-hand side argument (type2 in the following 1666 example). 
1668 "bytes .cborseq type2" 1670 (The conversion of the encoded sequence to an array can be effected 1671 for instance by wrapping the byte string between the two bytes 0x9f 1672 and 0xff and decoding the wrapped byte string as a CBOR encoded data 1673 item.) 1675 B.1.5. Annotations .within and .and 1677 A ".and" annotation on a type indicates that the data item matches 1678 both that left hand side type and the type given as the right hand 1679 side. (Formally, the resulting type is the intersection of the two 1680 types given.) 1682 "type1 .and type2" 1684 A variant of the ".and" annotation is the ".within" annotation, which 1685 expresses an additional intent: the left hand side type is meant to 1686 be a subset of the right-hand-side type. 1688 "type1 .within type2" 1690 While both forms have identical formal semantics (intersection), 1691 the intention of the ".within" form is that the right hand side gives 1692 guidance to the types allowed on the left hand side, which typically 1693 is a socket (Appendix B.2): 1695 message = $message .within message-structure 1696 message-structure = [message_type, *message_option] 1697 message_type = 0..255 1698 message_option = any 1700 $message /= [3, dough: text, topping: [* text]] 1701 $message /= [4, noodles: text, sauce: text, parmesan: bool] 1703 For ".within", a tool might flag an error if type1 allows data items 1704 that are not allowed by type2. In contrast, for ".and", there is no 1705 expectation that type1 already is a subset of type2. 1707 B.1.6. Annotations .lt, .le, .gt, .ge, .eq, .ne, and .default 1709 The annotations .lt, .le, .gt, .ge, .eq, .ne specify a constraint on 1710 the left hand side type to be a value less than, less than or equal to, 1711 greater than, greater than or equal to, equal to, or not equal to a 1712 value given as a (single-valued) right hand side type.
In the 1713 present specification, the first four annotations (.lt, .le, .gt, 1714 .ge) are defined only for numeric types, as these have a natural 1715 ordering relationship. 1717 speed = number .ge 0 ; unit: m/s 1719 A variant of the ".ne" annotation is the ".default" annotation, which 1720 expresses an additional intent: the value specified by the right- 1721 hand-side type is intended as a default value for the left hand side 1722 type given, and the implied .ne annotation is there to prevent this 1723 value from being sent over the wire. This annotation is only 1724 meaningful when the annotated type is used in an optional context; 1725 otherwise there would be no way to express the default value. 1727 timer = { 1728 time: uint, 1729 ? displayed-step: (number .gt 0) .default 1 1730 } 1732 B.2. Socket/Plug 1734 Both for type choices and group choices, a mechanism is defined that 1735 facilitates starting out with empty choices and assembling them 1736 later, potentially in separate files that are concatenated to build 1737 the full specification. 1739 Per convention, CDDL extension points are marked with a leading 1740 dollar sign (types) or two leading dollar signs (groups). Tools 1741 honor that convention by not raising an error if such a type or group 1742 is not defined at all; the symbol is then taken to be an empty type 1743 choice (group choice), i.e., no choice is available. 1745 tcp-header = {seq: uint, ack: uint, * $$tcp-option} 1747 ; later, in a different file 1749 $$tcp-option //= ( 1750 sack: [+(left: uint, right: uint)] 1751 ) 1753 ; and, maybe in another file 1755 $$tcp-option //= ( 1756 sack-permitted: true 1757 ) 1759 Names that start with a single "$" are "type sockets", names with a 1760 double "$$" are "group sockets". It is not an error if there is no 1761 definition for a socket at all; this then means there is no way to 1762 satisfy the rule (i.e., the choice is empty). 
1764 All definitions (plugs) for socket names must be augmentations, i.e., they 1765 must use "/=" for type sockets and "//=" for group sockets. 1767 To pick up the example illustrated in Figure 5, the socket/plug 1768 mechanism could be used as shown in Figure 14: 1770 PersonalData = { 1771 ? displayName: tstr, 1772 NameComponents, 1773 ? age: uint, 1774 * $$personaldata-extensions 1775 } 1777 NameComponents = ( 1778 ? firstName: tstr, 1779 ? familyName: tstr, 1780 ) 1782 ; The above already works as is. 1783 ; But then, we can add later: 1785 $$personaldata-extensions //= ( 1786 favorite-salsa: tstr, 1787 ) 1789 ; and again, somewhere else: 1791 $$personaldata-extensions //= ( 1792 shoesize: uint, 1793 ) 1795 Figure 14: Personal Data example: Using socket/plug extensibility 1797 B.3. Generics 1799 Using angle brackets, the left hand side of a rule can add formal 1800 parameters after the name being defined, as in: 1802 messages = message<"reboot", "now"> / message<"sleep", 1..100> 1803 message<t, v> = {type: t, value: v} 1805 When using a generic rule, the formal parameters are bound to the 1806 actual arguments supplied (also using angle brackets), within the 1807 scope of the generic rule (as if there were a rule of the form 1808 parameter = argument). 1810 (There are some limitations to nesting of generics in the CDDL tool (Appendix F) at 1811 this time.) 1813 Appendix C.
Change Log 1815 Changes from version 00 to version 01: 1817 o Removed constants 1818 o Updated the tag mechanism 1820 o Extended the map structure 1822 o Added examples 1824 Changes from version 01 to version 02: 1826 o Fixed example 1828 Changes from version 02 to version 03: 1830 o Added information about characters used in names 1832 o Added text about an overall data structure and order of definition 1833 of fields 1835 o Added text about encoding of keys 1837 o Added table with keywords 1839 o Strings and integer writing conventions 1841 o Added ABNF 1843 Changes from version 03 to version 04: 1845 o Removed optional fields for non-maps 1847 o Defined that all key/value pairs in maps are considered optional from 1848 the CDDL perspective 1850 o Allow omission of type of keys for maps with only text string and 1851 integer keys 1853 o Changed order of definitions 1855 o Updated fruit and moves examples 1857 o Renamed the "Philosophy" section to "Using CDDL", and added more 1858 text about CDDL usage 1860 o Several editorials 1862 Changes from version 04 to version 05: 1864 o Added text about alternative datatypes and any datatype 1865 o Fixed typos 1867 o Restructured syntax and semantics 1869 Changes from version 05 to version 06: 1871 o Fixed the ABNF for choices (no longer need to write a: (b/c)) 1873 o Added group choices (//) 1875 o Added /= and //= 1877 o Added experimental socket/plug 1879 o Added aliases text, bytes, null to prelude 1881 o Documented generics 1883 o Fixed more typos 1885 Changes from 06 to 07: 1887 o .cbor, .cborseq, .within, .and 1889 o Define .size on uint 1891 o Extended Diagnostic Notation 1893 o Precedence discussion and table 1895 o Remove some of the "issues" that can only be understood with 1896 historical context 1898 o Prefer "text" over "tstr" in some of the examples 1900 o Add "unsigned" to the prelude 1902 Changes from 07 to 08: 1904 o .lt, .le, .eq, .ne, .gt, .ge 1906 o .default 1908 Appendix D.
ABNF grammar

   The following is a formal definition of the CDDL syntax in Augmented
   Backus-Naur Form (ABNF, [RFC5234]).  [_abnftodo]

   cddl = S 1*rule
   rule = typename [genericparm] S assign S type S
        / groupname [genericparm] S assign S grpent S

   typename = id
   groupname = id

   assign = "=" / "/=" / "//="

   genericparm = "<" S id S *("," S id S ) ">"
   genericarg = "<" S type1 S *("," S type1 S ) ">"

   type = type1 S *("/" S type1 S)

   type1 = type2 [S (rangeop / annotator) S type2]
         / "#" "6" ["." uint] "(" S type S ")" ; note no space!
         / "#" DIGIT ["." uint]                ; major/ai
         / "#"                                 ; any
         / "{" S group S "}"
         / "[" S group S "]"
         / "&" S "(" S group S ")"
         / "&" S groupname [genericarg]

   type2 = value
         / typename [genericarg]
         / "(" type ")"

   rangeop = "..." / ".."

   annotator = "." id

   group = grpchoice S *("//" S grpchoice S)

   grpchoice = *grpent

   grpent = [occur S] [memberkey S] type optcom
          / [occur S] groupname [genericarg] optcom ; preempted by above
          / [occur S] "(" S group S ")" optcom

   memberkey = type1 S "=>"
             / bareword S ":"
             / value S ":"

   bareword = id

   optcom = S ["," S]

   occur = [uint] "*" [uint]
         / "+"
         / "?"

   uint = ["0x" / "0b"] "0"
        / ["0x" / "0b"] DIGIT1 *DIGIT

   value = number
         / string

   int = ["-"] uint

   ; This is a float if it has fraction or exponent; int otherwise
   number = int ["."
            fraction] ["e" exponent ]
   fraction = 1*DIGIT
   exponent = int

   string = %x22 *SCHAR %x22
   SCHAR = %x20-21 / %x23-7E / SESC
   SESC = "\" %x20-7E

   id = EALPHA *(*("-" / ".") (EALPHA / DIGIT))
   ALPHA = %x41-5A / %x61-7A
   EALPHA = %x41-5A / %x61-7A / "@" / "_" / "$"
   DIGIT = %x30-39
   DIGIT1 = %x31-39
   S = *WS
   WS = SP / NL
   SP = %x20
   NL = COMMENT / CRLF
   COMMENT = ";" *(SP / VCHAR) CRLF
   VCHAR = %x21-7E
   CRLF = %x0A / %x0D.0A

                          Figure 15: CDDL ABNF

Appendix E.  Standard Prelude

   The following prelude is automatically added to each CDDL file
   [tdate].  (Note that technically, it is a postlude, as it does not
   disturb the selection of the first rule as the root of the
   definition.)

   any = #

   uint = #0
   nint = #1
   int = uint / nint

   bstr = #2
   bytes = bstr
   tstr = #3
   text = tstr

   tdate = #6.0(tstr)
   time = #6.1(number)
   number = int / float
   biguint = #6.2(bstr)
   bignint = #6.3(bstr)
   bigint = biguint / bignint
   integer = int / bigint
   unsigned = uint / biguint
   decfrac = #6.4([e10: int, m: integer])
   bigfloat = #6.5([e2: int, m: integer])
   eb64url = #6.21(any)
   eb64legacy = #6.22(any)
   eb16 = #6.23(any)
   encoded-cbor = #6.24(bstr)
   uri = #6.32(tstr)
   b64url = #6.33(tstr)
   b64legacy = #6.34(tstr)
   regexp = #6.35(tstr)
   mime-message = #6.36(tstr)
   cbor-any = #6.55799(any)

   float16 = #7.25
   float32 = #7.26
   float64 = #7.27
   float16-32 = float16 / float32
   float32-64 = float32 / float64
   float = float16-32 / float64

   false = #7.20
   true = #7.21
   bool = false / true
   nil = #7.22
   null = nil
   undefined = #7.23

                         Figure 16: CDDL Prelude

   Note that the prelude is deemed to be fixed.
   This means, for instance, that additional tags beyond those defined
   in [RFC7049], as registered with IANA, need to be defined in each
   CDDL file that uses them.

   A common stumbling point is that the prelude does not define a type
   "string".  CBOR has byte strings ("bytes" in the prelude) and text
   strings ("text"), so a type that is simply called "string" would be
   ambiguous.

Appendix F.  The CDDL tool

   A rough CDDL tool is available.  For CDDL specifications that do not
   use recursion, it can check the syntax, generate one or more
   instances (expressed in CBOR diagnostic notation or in pretty-printed
   JSON), and validate an existing instance against the specification:

   Usage:
   cddl spec.cddl generate [n]
   cddl spec.cddl json-generate [n]
   cddl spec.cddl validate instance.cbor
   cddl spec.cddl validate instance.json

                       Figure 17: CDDL tool usage

   Install on a system with a modern Ruby via:

   gem install cddl

                                Figure 18

   The accompanying CBOR diagnostic tools (which are automatically
   installed by the above) are described in
   https://github.com/cabo/cbor-diag ; they can be used to convert
   between binary CBOR, a pretty-printed form of that, CBOR diagnostic
   notation, JSON, and YAML.

Appendix G.  Extended Diagnostic Notation

   Section 6 of [RFC7049] defines a "diagnostic notation" in order to be
   able to converse about CBOR data items without having to resort to
   binary data.  Diagnostic notation is based on JSON, with extensions
   for representing CBOR constructs such as binary data and tags.

   (Standardizing this together with the actual interchange format does
   not serve to create another interchange format, but enables the use
   of a shared diagnostic notation in tools for and documents about
   CBOR.)

   This section discusses a few extensions to the diagnostic notation
   that have turned out to be useful since RFC 7049 was written.
   We refer to the result as extended diagnostic notation (EDN).

G.1.  White space in binary strings

   Examples often benefit from some white space (spaces, line breaks) in
   binary strings.  In extended diagnostic notation, white space is
   ignored in prefixed binary strings; for instance, the following are
   equivalent:

   h'48656c6c6f20776f726c64'
   h'48 65 6c 6c 6f 20 77 6f 72 6c 64'
   h'4 86 56c 6c6f
     20776 f726c64'

G.2.  Text in binary strings

   Diagnostic notation notates byte strings in one of the [RFC4648] base
   encodings, enclosed in single quotes, prefixed by >h< for base16,
   >b32< for base32, >h32< for base32hex, >b64< for base64 or base64url.
   Quite often, binary strings carry bytes that are meaningfully
   interpreted as UTF-8 text.  Extended diagnostic notation allows the
   use of single quotes without a prefix to express byte strings with
   UTF-8 text; for instance, the following are equivalent:

   'hello world'
   h'68656c6c6f20776f726c64'

   The escaping rules of JSON strings are applied equivalently for
   text-based binary strings, e.g., \\ stands for a single backslash and
   \' stands for a single quote.  White space is included literally,
   i.e., the previous section does not apply to text-based binary
   strings.

G.3.  Concatenated Strings

   While the ability to include white space enables line-breaking of
   encoded binary strings, a mechanism is needed to be able to include
   text strings as well as binary strings in direct UTF-8 representation
   into line-based documents (such as RFCs and source code).

   We extend the diagnostic notation by allowing multiple text strings
   or multiple byte strings to be notated separated by white space;
   these are then concatenated into a single text or byte string,
   respectively.
   Text strings and binary strings do not mix within such a
   concatenation, except that binary string notation can be used inside
   a sequence of concatenated text string notation to encode characters
   that may be better represented in an encoded way.  The following four
   values are equivalent:

   "Hello world"
   "Hello " "world"
   "Hello" h'20' "world"
   "" h'48656c6c6f20776f726c64' ""

   Similarly, the following byte string values are equivalent:

   'Hello world'
   'Hello ' 'world'
   'Hello ' h'776f726c64'
   'Hello' h'20' 'world'
   '' h'48656c6c6f20776f726c64' '' b64''
   h'4 86 56c 6c6f' h' 20776 f726c64'

   (Note that the approach of separating by white space, while familiar
   from the C language, requires some attention; a single comma makes a
   big difference here.)

G.4.  Hexadecimal, octal, and binary numbers

   In addition to JSON's decimal numbers, EDN provides hexadecimal,
   octal, and binary numbers in the usual C-language notation (octal
   is recognized only when the 0o prefix is present).

   The following are equivalent:

   4711
   0x1267
   0o11147
   0b1001001100111

   As are:

   1.5
   0x1.8p0
   0x18p-4

G.5.  Comments

   Longer pieces of diagnostic notation may benefit from comments.  JSON
   famously does not provide for comments, and basic RFC 7049 diagnostic
   notation inherits this property.

   In extended diagnostic notation, comments can be included, delimited
   by slashes ("/").  Any text within and including a pair of slashes is
   considered a comment.

   Comments are considered white space.
   Hence, they are allowed in prefixed binary strings; for instance, the
   following are equivalent:

   h'68656c6c6f20776f726c64'
   h'68 65 6c /doubled l!/ 6c 6f /hello/
     20 /space/
     77 6f 72 6c 64' /world/

   This can be used to annotate a CBOR structure as in:

   /grasp-message/ [/M_DISCOVERY/ 1, /session-id/ 10584416,
                    /objective/ [/objective-name/ "opsonize",
                                 /D, N, S/ 7, /loop-count/ 105]]

   (There are currently no end-of-line comments.  If we want to add
   them, "//" sounds like a reasonable delimiter given that we already
   use slashes for comments, but we could also go for, e.g., "#".)

Editorial Comments

[_format] So far, the ability to restrict format choices has not been
          needed beyond the floating point formats.  Those can be
          applied to ranges using the new .and annotation now.  It is
          not clear we want to add more format control before we have a
          use case.

[_range] TO DO: define this precisely.  This clearly includes integers
         and floats.  Strings - as in "a".."z" - could be added if
         desired, but this would require adopting a definition of string
         ordering and possibly a successor function so "a".."z" does not
         include "bb".

[_strings] TO DO: This still needs to be fully realized in the ABNF and
           in the CDDL tool.

[unflex] A comment has been that this is counter-intuitive.  One
         solution would be to simply disallow unparenthesized usage of
         occurrence indicators in front of type choices unless a member
         key is also present like in group2 above.

[_bitsendian] How useful would it be to have another variant that counts
              bits like in RFC box notation?  (Or at least per-byte?
              32-bit words don't always perfectly mesh with byte
              strings.)

[_abnftodo] TO DO: This doesn't allow non-ASCII characters in the text
            strings yet; there is no value notation for byte strings;
            representation indicators are missing as well.
[tdate] The prelude as included here does not yet have a .regexp
        annotation on tdate, but we probably do want to have one.

Authors' Addresses

   Christoph Vigano
   Universitaet Bremen

   Email: christoph.vigano@uni-bremen.de


   Henk Birkholz
   Fraunhofer SIT
   Rheinstrasse 75
   Darmstadt  64295
   Germany

   Email: henk.birkholz@sit.fraunhofer.de