Network Working Group                                          C. Vigano
Internet-Draft                                       Universitaet Bremen
Intended status: Informational                               H. Birkholz
Expires: April 20, 2016                                   Fraunhofer SIT
                                                        October 18, 2015

CBOR data definition language (CDDL): a notational convention to express
                          CBOR data structures
                draft-greevenbosch-appsawg-cbor-cddl-07

Abstract

   This document proposes a notational convention to express CBOR data
   structures (RFC 7049). Its main goal is to provide an easy and
   unambiguous way to express structures for protocol messages and data
   formats that use CBOR.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on April 20, 2016.

Copyright Notice

   Copyright (c) 2015 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.
   Please review these documents carefully, as they describe your
   rights and restrictions with respect to this document. Code
   Components extracted from this document must include Simplified BSD
   License text as described in Section 4.e of the Trust Legal
   Provisions and are provided without warranty as described in the
   Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Requirements notation
     1.2.  Terminology
   2.  The Style of Data Structure Specification
     2.1.  Groups and Composition in CDDL
       2.1.1.  Usage
       2.1.2.  Syntax
     2.2.  Types
       2.2.1.  Values
       2.2.2.  Choices
       2.2.3.  Representation Types
       2.2.4.  Root type
   3.  Syntax
     3.1.  General conventions
     3.2.  Occurrence
     3.3.  Predefined names for types
     3.4.  Arrays
     3.5.  Maps
       3.5.1.  Structs
       3.5.2.  Tables
     3.6.  Tags
     3.7.  Operator Precedence
   4.  Examples
     4.1.  Moves in a computer game
     4.2.  Fruit
     4.3.  RFC 7071
     4.4.  Examples from JSON Content Rules
   5.  Making Use of CDDL
     5.1.  As a guide to a human user
     5.2.  For automated checking of CBOR data structure
     5.3.  For data analysis tools
   6.  Discussion
     6.1.  Work to do
   7.  Resolved Issues
   8.  Security considerations
   9.  IANA considerations
   10. Acknowledgements
   11. References
     11.1.  Normative References
     11.2.  Informative References
   Appendix A.  Cemetery
   Appendix B.  Nursery
     B.1.  Annotations
       B.1.1.  Annotation .size
       B.1.2.  Annotation .bits
       B.1.3.  Annotation .regexp
       B.1.4.  Annotations .cbor and .cborseq
       B.1.5.  Annotations .within and .and
     B.2.  Socket/Plug
     B.3.  Generics
   Appendix C.  Change Log
   Appendix D.  ABNF grammar
   Appendix E.  Standard Prelude
   Appendix F.  The CDDL tool
   Appendix G.  Extended Diagnostic Notation
     G.1.  White space in binary strings
     G.2.  Text in binary strings
     G.3.  Concatenated Strings
     G.4.  Hexadecimal, octal, and binary numbers
     G.5.  Comments
   Authors' Addresses

1.  Introduction

   In this document, a notational convention to express CBOR [RFC7049]
   data structures is defined.

   The main goal for the convention is to provide a unified notation
   that can be used when defining protocols that use CBOR. We term the
   convention "CBOR data definition language", or CDDL.

   The CBOR notational convention has the following goals:

   (G1)  Provide an unambiguous description of the overall structure of
         a CBOR data structure.

   (G2)  Flexibility to express the freedoms of choice in the CBOR data
         format.

   (G3)  Possibility to restrict format choices where appropriate
         [_format].

   (G4)  Able to express common CBOR datatypes and structures.

   (G5)  Human and machine readable and processable.

   (G6)  Automatic checking of data format compliance.

   (G7)  Extraction of specific elements from CBOR data for further
         processing.

   This document has the following structure:

   The syntax of CDDL is defined in Section 3. Examples of CDDL and
   related CBOR data instances are defined in Section 4. Section 5
   discusses usage of CDDL. Examples are provided early in the text to
   better illustrate concept definitions. A formal definition of CDDL
   using ABNF grammar is provided in Appendix D. Finally, a prelude of
   standard CDDL definitions available in every CBOR specification is
   listed in Appendix E.

1.1.  Requirements notation

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in RFC
   2119, BCP 14 [RFC2119].

1.2.  Terminology

   New terms are introduced in _cursive_. CDDL text in the running text
   is in "typewriter".

2.  The Style of Data Structure Specification

   CDDL focuses on styles of specification that are in use in the
   community employing the data model as pioneered by JSON and now
   refined in CBOR.

   There are a number of more or less atomic elements of a CBOR data
   model, such as numbers, simple values (false, true, nil), strings;
   CDDL does not focus on specifying their structure. CDDL of course
   also allows adding a CBOR tag to a data item.

   The more important components of a data structure definition language
   are the data types used for composition: arrays and maps in CBOR
   (called arrays and objects in JSON). While these are only two
   representation formats, they are used to specify four loosely
   distinguishable styles of composition:

   o  A _vector_, an array of elements that are mostly of the same
      semantics. The set of signatures associated with a signed data
      item is a typical application of a vector.

   o  A _record_, an array the elements of which have different,
      positionally defined semantics, as detailed in the data structure
      definition. A 2D point, specified as an array of an x coordinate
      (which comes first) and a y coordinate (coming second) is an
      example of a record, as is the pair of exponent (first) and
      mantissa (second) in a CBOR decimal fraction.

   o  A _table_, a map from a domain of map keys to a domain of map
      values, that are mostly of the same semantics. A set of language
      tags, each mapped to a string translated to that specific
      language, is an example of a table.
      The key domain is usually not limited to a specific set by the
      specification, but open for the application, e.g., in a table
      mapping IP addresses to MAC addresses, the specification does not
      attempt to foresee all possible IP addresses.

   o  A _struct_, a map from a domain of map keys as defined by the
      specification to a domain of map values the semantics of each of
      which is bound to a specific map key. This is what many people
      have in mind when they think about JSON objects; CBOR adds the
      ability to use map keys that are not just strings. Structs can be
      used to solve similar problems as records; the use of explicit map
      keys facilitates optionality and extensibility.

   Two important concepts provide the foundation for CDDL:

   1.  Instead of defining all four types of composition in CDDL
       separately, or even defining one kind for arrays (vectors and
       records) and one kind for maps (tables and structs), there is
       only one kind of composition in CDDL: the _group_ (Section 2.1).

   2.  The other important concept is that of a _type_. The entire CDDL
       specification defines a type (the one defined by its first
       _rule_), which formally is the set of CBOR instances that are
       acceptable for this specification. CDDL predefines a number of
       basic types such as "uint" (unsigned integer) or "tstr" (text
       string), often making use of a simple formal notation for CBOR
       data items. Each value that can be expressed as a CBOR data item
       also is a type in its own right, e.g. "1". A type can be built
       as a _choice_ of other types, e.g., an "int" is either a "uint"
       or a "nint" (negative integer). Finally, a type can be built as
       an array or a map from a group.

2.1.  Groups and Composition in CDDL

   CDDL Groups are lists of name/value pairs (group _entries_).

   In an array context, only the value of the entry is represented; the
   name is annotation only (and can be left off if not needed).
   In a map context, the names become the map keys ("member keys").

   In an array context, the sequence of elements in the group is
   important, as it is the information that allows associating actual
   array elements with entries in the group. In a map context, the
   sequence of entries in a group is not relevant (but there is still a
   need to write down group entries in a sequence).

   A group can be placed in (round) parentheses, and given a name by
   using it in a rule:

   pii = (
     age: int,
     name: tstr,
     employer: tstr,
   )

                      Figure 1: A basic group

   Or a group can just be used in the definition of something else:

   person = {(
     age: int,
     name: tstr,
     employer: tstr,
   )}

                   Figure 2: Using a group in a map

   which, given the above rule for pii, is identical to:

   person = {
     pii
   }

                   Figure 3: Using a group by name

   Note that the (curly) braces signify the creation of a map; the
   groups themselves are neutral as to whether they will be used in a
   map or an array.

   The parentheses for groups are optional when there is some other set
   of brackets present, so it would be slightly more natural to express
   Figure 2 as:

   person = {
     age: int,
     name: tstr,
     employer: tstr,
   }

   Groups can be used to factor out common parts of structs, e.g.,
   instead of writing:

   person = {
     age: int,
     name: tstr,
     employer: tstr,
   }

   dog = {
     age: int,
     name: tstr,
     leash-length: float,
   }

   one can choose a name for the common subgroup and write:

   person = {
     identity,
     employer: tstr,
   }

   dog = {
     identity,
     leash-length: float,
   }

   identity = (
     age: int,
     name: tstr,
   )

               Figure 4: Using a group for factorization

   Note that the contents of the braces in the above definitions
   constitute (anonymous) groups, while "identity" is a named group.

2.1.1.  Usage

   Groups are the instrument used in composing data structures with
   CDDL. It is a matter of style in defining those structures whether
   to define groups (anonymously) right in their contexts or whether to
   define them in a separate rule and to reference them with their
   respective name (possibly more than once).

   With this, one is allowed to define all small parts of their data
   structures and compose bigger protocol units with those or to have
   only one big protocol data unit that has all definitions ad hoc where
   needed.

2.1.2.  Syntax

   The composition syntax intends to be concise and easy to read:

   o  The start of a group can be marked by '('

   o  The end of a group can be marked by ')'

   o  Definitions of entries inside of a group are noted as follows:
      _keytype => valuetype,_ (read "keytype maps to valuetype"). The
      comma is actually optional (not just in the final entry), but it
      is considered good style to set it. The double arrow can be
      replaced by a colon in the common case of directly using a string
      as a key (see Section 3.5.1).

   An entry consists of a _keytype_ and a _valuetype_:

   o  _keytype_ is either an atom used as the actual key or a valuetype.
      This may be needed when using groups in a table context, where the
      actual keys are of lesser importance than the key types, e.g., in
      contexts verifying incoming data.

   o  _valuetype_ is either a valuetype derived from the major types
      defined in [RFC7049], a convenience valuetype defined in this
      document (Appendix E) or the name of a group defined in the
      protocol file.

   A group definition can also contain choices between groups, see
   Section 2.2.2.

2.2.  Types

2.2.1.  Values

   Values such as numbers and strings can be used in place of a type.
   (For instance, this is a very common thing to do for a keytype,
   common enough that CDDL provides additional convenience syntax for
   this.)

2.2.2.  Choices

   Many places that allow a type also allow a choice between types,
   delimited by a "/" (slash). The entire choice construct can be put
   into parentheses if this is required to make the construction
   unambiguous (please see Appendix D for the details).

   Choices of values can be used to express enumerations:

   attire = "bow tie" / "necktie" / "Internet attire"
   protocol = 6 / 17

   As with types, CDDL also allows choices between groups, delimited by
   a "//" (double slash).

   address = { delivery }

   delivery = (
   street: tstr, ? number: uint, city //
   po-box: uint, city //
   per-pickup: true )

   city = (
   name: tstr, zip-code: uint
   )

   Both for type choices and for group choices, additional alternatives
   can be added to a rule later in separate rules by using "/=" and
   "//=", respectively, instead of "=":

   attire /= "swimwear"

   delivery //= (
   lat: float, long: float, drone-type: tstr
   )

   It is not a mistake if a name is first used with a "/=" or "//="
   (there is no need to "create it" with "=").

2.2.2.1.  Ranges

   Instead of naming all the values that make up a choice, CDDL allows
   building a _range_ out of two values that are in an ordering
   relationship. A range can be inclusive of both ends given (denoted
   by joining two values by ".."), or include the first and exclude the
   second (denoted by instead using "...").

   device-address = byte
   max-byte = 255
   byte = 0..max-byte ; inclusive range
   first-non-byte = 256
   byte1 = 0...first-non-byte ; byte1 is equivalent to byte

   CDDL currently only allows ranges between numbers [_range].

2.2.2.2.  Turning a group into a choice

   Some choices are built out of large numbers of values, often
   integers, each of which is best given a semantic name in the
   specification.
   Instead of naming each of these integers and then accumulating these
   into a choice, CDDL allows building a choice from a group by
   prefixing it with a "&" character:

   terminal-color = &basecolors
   basecolors = (
     black: 0, red: 1, green: 2, yellow: 3,
     blue: 4, magenta: 5, cyan: 6, white: 7,
   )
   extended-color = &(
     basecolors,
     orange: 8, pink: 9, purple: 10, brown: 11,
   )

   As with the use of groups in arrays (Section 3.4), the membernames
   have only documentary value (in particular, they might be used by a
   tool when displaying integers that are taken from that choice).

2.2.3.  Representation Types

   CDDL allows the specification of a data item type by referring to the
   CBOR representation (major type and additional information). How
   this is used should be evident from the prelude (Appendix E).

   It may be necessary to make use of representation types outside the
   prelude, e.g., a specification could start by making use of an
   existing tag in a more specific way, or define a new tag not defined
   in the prelude:

   my_breakfast = #6.55799(breakfast) ; cbor-any is too general!
   breakfast = cereal / porridge
   cereal = #6.998(tstr)
   porridge = #6.999([liquid, solid])
   liquid = milk / water
   milk = 0
   water = 1
   solid = tstr

2.2.4.  Root type

   There is no special syntax to identify the root of a CDDL data
   structure definition: that role is simply taken by the first rule
   defined in the file.

   This is motivated by the usual top-down approach for defining data
   structures, decomposing a big data structure unit into smaller parts;
   however, except for the root type, there is no need to strictly
   follow this sequence.

3.  Syntax

   In this section, the overall syntax of CDDL is shown, alongside some
   examples just illustrating syntax. (The definition will not attempt
   to be overly formal; refer to Appendix D for the details.)

3.1.  General conventions

   The basic syntax is inspired by ABNF [RFC5234], with the following
   conventions:

   o  Rules, whether they define groups or types, are defined with a
      name, followed by an equals sign "=" and the actual definition
      according to the respective syntactic rules of that definition.

   o  A name can consist of any of the characters from the set {'A',
      ..., 'Z', 'a', ..., 'z', '0', ..., '9', '_', '-', '@', '.', '$'},
      starting with an alphabetic character (including '@', '_', '$')
      and ending in such a character or a digit.

      *  Names are case sensitive.

      *  It is preferred style to start a name with a lower case letter.

      *  The hyphen is preferred over the underscore (except in a
         "bareword" (Section 3.5.1), where the semantics may actually
         require an underscore).

      *  The period may be useful for larger specifications, to express
         some module structure (as in "tcp.throughput" vs.
         "udp.throughput").

      *  A number of names are predefined in the CDDL prelude, as listed
         in Appendix E.

      *  Rule names (types or groups) do not appear in the actual CBOR
         encoding, but names used as "barewords" in member keys do.

   o  Comments are started by a ';' (semicolon) character and finish at
      the end of a line (LF or CRLF).

   o  Outside strings, whitespace (spaces, newlines, and comments) is
      used to separate syntactic elements for readability (and to
      separate identifiers or numbers that follow each other); it is
      otherwise completely optional.

   o  Hexadecimal numbers are preceded by '0x' (without quotes, lower
      case x), and are case insensitive. Similarly, binary numbers are
      preceded by '0b'.

   o  Strings are enclosed by double quotation '"' characters. They
      follow the conventions for strings as defined in [RFC7159],
      section 7. [_strings]

   o  CDDL uses UTF-8 [RFC3629] for its encoding.

   Example:

   ; This is a comment
   person = { g }

   g = (
     "name": tstr,
     age: int,
   )

3.2.  Occurrence

   An optional _occurrence_ indicator can be given in front of a group
   entry. It is either one of the characters '?' (optional), '*' (zero
   or more), or '+' (one or more), or is of the form n*m, where n and m
   are optional unsigned integers and n is the lower limit (default 0)
   and m is the upper limit (default no limit) of occurrences.

   If no occurrence indicator is specified, the group entry is to occur
   exactly once (as if 1*1 were specified).

   Note that CDDL, outside any directives/annotations that could
   possibly be defined, does not make any prescription as to whether
   arrays or maps use the definite length or indefinite length encoding.
   I.e., there is no correlation between leaving the size of an array
   "open" in the spec and the fact that it is then interchanged with
   definite or indefinite length.

3.3.  Predefined names for types

   CDDL predefines a number of names. This subsection summarizes these
   names, but please see Appendix E for the exact definitions.

   The following keywords for primitive datatypes are defined:

   "bool"  Boolean value (major type 7, additional information 20 or
      21).

   "uint"  An unsigned integer (major type 0).

   "nint"  A negative integer (major type 1).

   "int"  An unsigned integer or a negative integer.

   "float16"  IEEE 754 half-precision float (major type 7, additional
      information 25).

   "float32"  IEEE 754 single-precision float (major type 7, additional
      information 26).

   "float64"  IEEE 754 double-precision float (major type 7, additional
      information 27).

   "float"  One of float16, float32, or float64.

   "bstr" or "bytes"  A byte string (major type 2).

   "tstr" or "text"  Text string (major type 3).

   (Note that there are no predefined names for arrays or maps; these
   are defined with the syntax given below.)
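
   As an informal illustration, and not as part of CDDL, the list of
   predefined names above can be read as a lookup on the initial byte
   of an encoded CBOR data item, whose upper three bits carry the major
   type and whose lower five bits carry the additional information
   [RFC7049]. The Python sketch below assumes only that layout; the
   function name "predefined_names" and the two table names are
   hypothetical and chosen for this example.

```python
from typing import Dict, List

# Predefined names from the list above, keyed by CBOR major type.
# (Hypothetical helper tables; they are not part of CDDL.)
_BY_MAJOR_TYPE: Dict[int, List[str]] = {
    0: ["uint", "int"],
    1: ["nint", "int"],
    2: ["bstr", "bytes"],
    3: ["tstr", "text"],
}

# Within major type 7, the additional information selects the type.
_BY_ADDITIONAL_INFO_MT7: Dict[int, List[str]] = {
    20: ["bool"],              # false
    21: ["bool"],              # true
    25: ["float16", "float"],
    26: ["float32", "float"],
    27: ["float64", "float"],
}


def predefined_names(initial_byte: int) -> List[str]:
    """Return the predefined names listed above that match an initial byte.

    Arrays, maps and tags have no predefined name in this list, so an
    empty list is returned for them.
    """
    if not 0 <= initial_byte <= 0xFF:
        raise ValueError("initial byte must be in the range 0..255")
    major_type = initial_byte >> 5          # upper three bits
    additional_info = initial_byte & 0x1F   # lower five bits
    if major_type == 7:
        return list(_BY_ADDITIONAL_INFO_MT7.get(additional_info, []))
    return list(_BY_MAJOR_TYPE.get(major_type, []))
```

   For instance, an initial byte of 0x60 has major type 3, so the
   sketch reports "tstr" and "text", while 0x80 (an array) yields an
   empty list, in line with the note above that arrays and maps have no
   predefined names.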

   In addition, a number of types are defined in the prelude that are
   associated with CBOR tags, such as "tdate", "bigint", "regexp" etc.

3.4.  Arrays

   Array definitions surround a group with square brackets.

   For each entry, an occurrence indicator as specified in Section 3.2
   is permitted.

   For example:

   unlimited-people = [* person]
   one-or-two-people = [1*2 person]
   at-least-two-people = [2* person]
   person = (
     name: tstr,
     age: uint,
   )

   The group "person" is defined in such a way that repeating it in the
   array each time generates alternating names and ages, so these are
   four valid values for a data item of type "unlimited-people":

   ["roundlet", 1047, "psychurgy", 2204, "extrarhythmical", 2231]
   []
   ["aluminize", 212, "climograph", 4124]
   ["penintime", 1513, "endocarditis", 4084, "impermeator", 1669,
   "coextension", 865]

3.5.  Maps

   The syntax for specifying maps merits special attention, as well as a
   number of optimizations and conveniences, as it is likely to be the
   focal point of many specifications employing CDDL. While the syntax
   does not strictly distinguish struct and table usage of maps, it
   caters specifically to each of them.

3.5.1.  Structs

   The "struct" usage of maps is similar to the way JSON objects are
   used in many JSON applications.

   A map is defined in the same way as defining an array (see
   Section 3.4), except for using curly braces "{}" instead of square
   brackets "[]".

   An occurrence indicator as specified in Section 3.2 is permitted for
   each group entry.
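
   The occurrence indicators of Section 3.2, which apply to array and
   map entries alike, can be read as a pair of bounds on how often an
   entry may occur. The following Python sketch is one informal way to
   express that reading; the function name "occurrence_bounds" and the
   use of None for "no upper limit" are assumptions of this example,
   not part of CDDL or of the cddl tool.

```python
import re
from typing import Optional, Tuple

# The "n*m" form: both integers are optional, so "*" alone also matches.
_N_STAR_M = re.compile(r"^(\d*)\*(\d*)$")


def occurrence_bounds(indicator: Optional[str]) -> Tuple[int, Optional[int]]:
    """Map an occurrence indicator to (lower limit, upper limit).

    An upper limit of None stands for "no limit". A missing indicator
    means the entry occurs exactly once, as if 1*1 were specified.
    """
    if indicator is None or indicator == "":
        return (1, 1)
    if indicator == "?":
        return (0, 1)
    if indicator == "+":
        return (1, None)
    match = _N_STAR_M.match(indicator)
    if match is None:
        raise ValueError("not an occurrence indicator: %r" % indicator)
    lower = int(match.group(1)) if match.group(1) else 0      # default 0
    upper = int(match.group(2)) if match.group(2) else None   # default: no limit
    return (lower, upper)
```

   With this reading, the indicators used in Section 3.4 come out as
   (0, None) for "*", (1, 2) for "1*2", and (2, None) for "2*".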

   The following is an example of a structure:

   Geography = [
     city : tstr,
     gpsCoordinates : GpsCoordinates,
   ]

   GpsCoordinates = {
     longitude : uint, ; multiplied by 10^7
     latitude : uint, ; multiplied by 10^7
   }

   When encoding, the Geography structure is encoded using a CBOR array
   with two entries, whereas the GpsCoordinates are encoded as a CBOR
   map with two key-value pairs.

   Types used in a structure can be defined in separate rules or just in
   place (potentially placed inside parentheses, such as for choices).
   E.g.:

   located-samples = {
     sample-point: int,
     samples: [+ float],
   }

   where "located-samples" is the datatype to be used when referring to
   the struct, and "sample-point" and "samples" are the keys to be used.
   This is actually a complete example: an identifier that is followed
   by a colon can be directly used as the text string for a member key
   (we speak of a "bareword" member key), as can a double-quoted string
   or a number. (When other types, in particular multi-valued ones, are
   used as keytypes, they are followed by a double arrow, see below.)

   If a text string key does not match the syntax for an identifier (or
   if the specifier just happens to prefer using double quotes), the
   text string syntax can also be used in the member key position,
   followed by a colon. The above example could therefore have been
   written with quoted strings in the member key positions.

   All the types defined can be used in a keytype position by following
   them with a double arrow.
   A string also is a (single-valued) type, so another form for this
   example is:

   located-samples = {
     "sample-point" => int,
     "samples" => [+ float],
   }

   A better way to demonstrate the double-arrow use may be:

   located-samples = {
     sample-point: int,
     samples: [+ float],
     * equipment-type => equipment-tolerances,
   }
   equipment-type = [name: tstr, manufacturer: tstr]
   equipment-tolerances = [+ [float, float]]

   The example below defines a struct with optional entries: display
   name (as a text string), the name components first name and family
   name (as text strings), and age information (as an unsigned
   integer).

   PersonalData = {
     ? displayName: tstr,
     NameComponents,
     ? age: uint,
   }

   NameComponents = (
     ? firstName: tstr,
     ? familyName: tstr,
   )

   Note that the group definition for NameComponents does not generate
   another map; instead, all four keys are directly in the struct built
   by PersonalData.

   In this example, all key/value pairs are optional from the
   perspective of CDDL. With no occurrence indicator, an entry is
   mandatory.

   If the addition of more entries not specified by the current
   specification is desired, one can add this possibility explicitly:

   PersonalData = {
     ? displayName: tstr,
     NameComponents,
     ? age: uint,
     * tstr => any
   }

   NameComponents = (
     ? firstName: tstr,
     ? familyName: tstr,
   )

          Figure 5: Personal Data: Example for extensibility

   The cddl tool (Appendix F) generated as one acceptable instance for
   this specification:

   {"familyName": "agust", "antiforeignism": "pretzel",
   "springbuck": "illuminatingly", "exuviae": "ephemeris",
   "kilometrage": "frogfish"}

   (See Appendix B.2 for one way to explicitly identify an extension
   point.)

3.5.2.  Tables

   A table can be specified by defining a map with entries where the
   keytype is not single-valued, e.g.:

   square-roots = {* x => y}
   x = int
   y = float

   Here, the key in each key/value pair has datatype x (defined as int),
   and the value has datatype y (defined as float).

   If the specification does not need to restrict one of x or y (i.e.,
   the application is free to choose per entry), it can be replaced by
   the predefined name "any".

   As another example, the following could be used as a conversion table
   converting from an integer or float to a string:

   tostring = {* x => tstr}
   x = int / float

3.6.  Tags

   A type can make use of a CBOR tag (major type 6) by using the
   representation type notation, giving #6.nnn(type) where nnn is an
   unsigned integer giving the tag number and "type" is the type of the
   data item being tagged.

   For example, the following line from the CDDL prelude (Appendix E)
   defines "biguint" as a type name for a positive bignum N:

   biguint = #6.2(bstr)

   The tags defined by [RFC7049] are included in the prelude.
   Additional tags since registered need to be added to a CDDL
   specification as needed; e.g., a binary UUID tag could be referenced
   as "buuid" in a specification after defining

   buuid = #6.37(bstr)

   In the following example, usage of the tag 32 for URIs is optional:

   my_uri = #6.32(tstr) / tstr

3.7.  Operator Precedence

   As with any language that has multiple syntactic features such as
   prefix and infix operators, CDDL has operators that bind more tightly
   than others. This is more complicated than, say, in ABNF, as CDDL
   has both types and groups, with operators that are specific to these
   concepts. Type operators (such as "/" for type choice) operate on
   types, while group operators (such as "//" for group choice) operate
   on groups.
Types can simply be used in groups, but 814 groups need to be bracketed (as arrays or maps) to become types. So, 815 type operators naturally bind closer than group operators. 817 For instance, in 819 t = [group1] 820 group1 = (a / b // c / d) 821 a = 1 b = 2 c = 3 d = 4 823 group1 is a group choice between the type choice of a and b and the 824 type choice of c and d. This becomes more relevant once member keys 825 and/or occurrences are added in: 827 t = {group2} 828 group2 = (? ab: a / b // cd: c / d) 829 a = 1 b = 2 c = 3 d = 4 831 is a group choice between the optional member "ab" of type a or b and 832 the member "cd" of type c or d. Note that the optionality is 833 attached to the first choice ("ab"), not to the second choice. 835 Similarly, in 837 t = [group3] 838 group3 = (+ a / b / c) 839 a = 1 b = 2 c = 3 841 group3 is a repetition of a type choice between a, b, and c [unflex]; 842 if just a is to be repeatable, a group choice is needed to focus the 843 occurrence: 845 t = [group4] 846 group4 = (+ a // b / c) 847 a = 1 b = 2 c = 3 849 group4 is a group choice between a repeatable a and a single b or c. 851 In general, as with many other languages with operator precedence 852 rules, it is best not to rely on them, but to insert parentheses for 853 readability: 855 t = [group4a] 856 group4a = ((+ a) // (b / c)) 857 a = 1 b = 2 c = 3 859 The operator precedences, in sequence of loose to tight binding, are 860 defined in Appendix D and summarized in Table 1. (Arities given are 861 1 for unary prefix operators and 2 for binary infix operators.) 

          +----------+----+---------------------------+------+
          | Operator | Ar | Operates on               | Prec |
          +----------+----+---------------------------+------+
          | =        | 2  | name = type, name = group | 1    |
          | /=       | 2  | name /= type              | 1    |
          | //=      | 2  | name //= group            | 1    |
          | //       | 2  | group // group            | 2    |
          | ,        | 2  | group, group              | 3    |
          | *        | 1  | * group                   | 4    |
          | N*M      | 1  | N*M group                 | 4    |
          | +        | 1  | + group                   | 4    |
          | ?        | 1  | ? group                   | 4    |
          | =>       | 2  | type => type              | 5    |
          | :        | 2  | name: type                | 5    |
          | /        | 2  | type / type               | 6    |
          | &        | 1  | &group                    | 6    |
          | ..       | 2  | type..type                | 7    |
          | ...      | 2  | type...type               | 7    |
          | .anno    | 2  | type .anno type           | 7    |
          +----------+----+---------------------------+------+

                 Table 1: Summary of operator precedences

4.  Examples

   This section contains various examples of structures defined using
   CDDL.

4.1.  Moves in a computer game

   A multiplayer computer game uses CBOR to exchange moves between the
   players.  To ensure a good gaming experience, the move information
   needs to be exchanged quickly and frequently.  Therefore, the game
   uses CBOR to send its information in a compact format.  Figure 6
   shows the definition of the CBOR information exchange format.

       UpdateMsg = [* {
         move_no         : uint,        ; increases for each move
         player_info     : PlayerInfo,  ; general information
         moves           : Moves,       ; moves in this message
       }]

       PlayerInfo = {
         alias           : tstr,
         player_id       : uint,
         experience      : uint,        ; beginner: 0; expert: 3
         gold            : uint,
         supplies        : Supplies,
         avg_strength    : float16,
       }

       Supplies = {
         wood => uint
         iron => uint
         grain => uint
       }

       wood = 0
       iron = 1
       grain = 2

       Moves = [* Move]

       Move = (
         unit_id         : uint,
         unit_strength   : uint,        ; between 0 and 100
         2*2 source_pos  : uint,        ; (x,y)
         2*2 target_pos  : uint,        ; (x,y)
       )

     Figure 6: CDDL definition of an information exchange format for a
                               computer game

   The CDDL tool generates this as a possible instance:

       [{"move_no": 3985, "player_info":
          {"alias": "timbrologist", "player_id": 699, "experience": 2699,
           "gold": 328, "supplies": {0: 1768, 1: 3087, 2: 1401},
           "avg_strength": 0.9712613869888417},
         "moves": [[1702, 458, 38, 399, 327, 304],
                   [3145, 4454, 1175, 3441, 74, 1542],
                   [4099, 4062, 2808, 8, 3174, 3048],
                   [367, 3649, 756, 3644, 3725, 2769]]},
        {"move_no": 199, "player_info":
          {"alias": "cipo", "player_id": 4309, "experience": 4094,
           "gold": 4114, "supplies": {0: 873, 1: 4706, 2: 1733},
           "avg_strength": 0.37808379403466696},
         "moves": [[1977, 3129, 3890, 4000, 1555, 377],
                   [2646, 286, 3363, 4381, 3815, 1039]]},
        {"move_no": 2226, "player_info":
          {"alias": "Stacey", "player_id": 1055, "experience": 207,
           "gold": 285, "supplies": {0: 3325, 1: 1515, 2: 3304},
           "avg_strength": 0.8590028130444863},
         "moves": [[869, 4126, 2382, 3155, 1523, 2621]]}]

   Notice that the supplies have been encoded as a map with integer
   keys.  In this example, using string keys would also have been
   suitable; the example just illustrates the possibility to use other
   datatypes for keys, leading to more efficient encoding.
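
   The size difference between integer and string keys can be made
   concrete with a small sketch.  The Python code below is not part of
   the CDDL tool; it is a hypothetical minimal encoder (the names
   encode_head and encode are invented for this illustration) that
   handles only unsigned integers, text strings, and maps, which is
   just enough to encode the "supplies" map of the first player both
   ways and compare the lengths.

```python
import struct


def encode_head(major: int, argument: int) -> bytes:
    """Encode a CBOR head: major type plus the shortest-form argument."""
    if argument < 24:
        return bytes([(major << 5) | argument])
    if argument < 1 << 8:
        return bytes([(major << 5) | 24, argument])
    if argument < 1 << 16:
        return bytes([(major << 5) | 25]) + struct.pack(">H", argument)
    if argument < 1 << 32:
        return bytes([(major << 5) | 26]) + struct.pack(">I", argument)
    return bytes([(major << 5) | 27]) + struct.pack(">Q", argument)


def encode(item) -> bytes:
    """Encode unsigned integers, text strings, and maps of these only."""
    if isinstance(item, bool):
        raise TypeError("booleans are outside the scope of this sketch")
    if isinstance(item, int) and item >= 0:
        return encode_head(0, item)          # major type 0: unsigned integer
    if isinstance(item, str):
        data = item.encode("utf-8")
        return encode_head(3, len(data)) + data   # major type 3: text string
    if isinstance(item, dict):
        out = encode_head(5, len(item))      # major type 5: map
        for key, value in item.items():
            out += encode(key) + encode(value)
        return out
    raise TypeError(f"unsupported item: {item!r}")


# The supplies of the first player, keyed as in Figure 6 (wood = 0, ...).
int_keyed = encode({0: 1768, 1: 3087, 2: 1401})
# The same values with (hypothetical) text string keys instead.
text_keyed = encode({"wood": 1768, "iron": 3087, "grain": 1401})

print(int_keyed.hex(), len(int_keyed), len(text_keyed))
```

   Under these assumptions, the integer-keyed map occupies 13 bytes
   and the text-keyed variant 26 bytes; the values are identical, so
   the difference is entirely due to the keys.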

   The tool-generated binary CBOR for the instance above cannot yet
   express that the floating point values are 16-bit:

   83  # array(3)
      a3  # map(3)
         67  # text(7)
            6d6f76655f6e6f  # "move_no"
         19 0f91  # unsigned(3985)
         6b  # text(11)
            706c617965725f696e666f  # "player_info"
         a6  # map(6)
            65  # text(5)
               616c696173  # "alias"
            6c  # text(12)
               74696d62726f6c6f67697374  # "timbrologist"
            69  # text(9)
               706c617965725f6964  # "player_id"
            19 02bb  # unsigned(699)
            6a  # text(10)
               657870657269656e6365  # "experience"
            19 0a8b  # unsigned(2699)
            64  # text(4)
               676f6c64  # "gold"
            19 0148  # unsigned(328)
            68  # text(8)
               737570706c696573  # "supplies"
            a3  # map(3)
               00  # unsigned(0)
               19 06e8  # unsigned(1768)
               01  # unsigned(1)
               19 0c0f  # unsigned(3087)
               02  # unsigned(2)
               19 0579  # unsigned(1401)
            6c  # text(12)
               6176675f737472656e677468  # "avg_strength"
            fb 3fef1492c29f8275  # primitive(4606923564386321013)
         65  # text(5)
            6d6f766573  # "moves"
         84  # array(4)
            86  # array(6)
               19 06a6  # unsigned(1702)
               19 01ca  # unsigned(458)
               18 26  # unsigned(38)
               19 018f  # unsigned(399)
               19 0147  # unsigned(327)
               19 0130  # unsigned(304)
            86  # array(6)
               19 0c49  # unsigned(3145)
               19 1166  # unsigned(4454)
               19 0497  # unsigned(1175)
               19 0d71  # unsigned(3441)
               18 4a  # unsigned(74)
               19 0606  # unsigned(1542)
            86  # array(6)
               19 1003  # unsigned(4099)
               19 0fde  # unsigned(4062)
               19 0af8  # unsigned(2808)
               08  # unsigned(8)
               19 0c66  # unsigned(3174)
               19 0be8  # unsigned(3048)
            86  # array(6)
               19 016f  # unsigned(367)
               19 0e41  # unsigned(3649)
               19 02f4  # unsigned(756)
               19 0e3c  # unsigned(3644)
               19 0e8d  # unsigned(3725)
               19 0ad1  # unsigned(2769)
      a3  # map(3)
         67  # text(7)
            6d6f76655f6e6f  # "move_no"
         18 c7  # unsigned(199)
         6b  # text(11)
            706c617965725f696e666f  # "player_info"
         a6  # map(6)
            65  # text(5)
               616c696173  # "alias"
            64  # text(4)
               6369706f  # "cipo"
            69  # text(9)
               706c617965725f6964  # "player_id"
            19 10d5  # unsigned(4309)
            6a  # text(10)
               657870657269656e6365  # "experience"
            19 0ffe  # unsigned(4094)
            64  # text(4)
               676f6c64  # "gold"
            19 1012  # unsigned(4114)
            68  # text(8)
               737570706c696573  # "supplies"
            a3  # map(3)
               00  # unsigned(0)
               19 0369  # unsigned(873)
               01  # unsigned(1)
               19 1262  # unsigned(4706)
               02  # unsigned(2)
               19 06c5  # unsigned(1733)
            6c  # text(12)
               6176675f737472656e677468  # "avg_strength"
            fb 3fd832865ea1b216  # primitive(4600482572053623318)
         65  # text(5)
            6d6f766573  # "moves"
         82  # array(2)
            86  # array(6)
               19 07b9  # unsigned(1977)
               19 0c39  # unsigned(3129)
               19 0f32  # unsigned(3890)
               19 0fa0  # unsigned(4000)
               19 0613  # unsigned(1555)
               19 0179  # unsigned(377)
            86  # array(6)
               19 0a56  # unsigned(2646)
               19 011e  # unsigned(286)
               19 0d23  # unsigned(3363)
               19 111d  # unsigned(4381)
               19 0ee7  # unsigned(3815)
               19 040f  # unsigned(1039)
      a3  # map(3)
         67  # text(7)
            6d6f76655f6e6f  # "move_no"
         19 08b2  # unsigned(2226)
         6b  # text(11)
            706c617965725f696e666f  # "player_info"
         a6  # map(6)
            65  # text(5)
               616c696173  # "alias"
            66  # text(6)
               537461636579  # "Stacey"
            69  # text(9)
               706c617965725f6964  # "player_id"
            19 041f  # unsigned(1055)
            6a  # text(10)
               657870657269656e6365  # "experience"
            18 cf  # unsigned(207)
            64  # text(4)
               676f6c64  # "gold"
            19 011d  # unsigned(285)
            68  # text(8)
               737570706c696573  # "supplies"
            a3  # map(3)
               00  # unsigned(0)
               19 0cfd  # unsigned(3325)
               01  # unsigned(1)
               19 05eb  # unsigned(1515)
               02  # unsigned(2)
               19 0ce8  # unsigned(3304)
            6c  # text(12)
               6176675f737472656e677468  # "avg_strength"
            fb 3feb7cf377a65699  # primitive(4605912429042751129)
         65  # text(5)
            6d6f766573  # "moves"
         81  # array(1)
            86  # array(6)
               19 0365  # unsigned(869)
               19 101e  # unsigned(4126)
               19 094e  # unsigned(2382)
               19 0c53  # unsigned(3155)
               19 05f3  # unsigned(1523)
               19 0a3d  # unsigned(2621)

                 Figure 7: CBOR instance for game example

4.2.  Fruit

   Figure 8 contains an example for a CBOR structure that contains
   information about fruit.

       fruitlist = [* Fruit]

       Fruit = {
         name                : tstr,
         colour              : [* color],
         avg_weight          : float16,
         price               : uint,
         international_names : International,
         rfu                 : bstr,   ; reserved for future use
       }

       International = {
         "DE"      : tstr,   ; German
         "EN"      : tstr,   ; English
         "FR"      : tstr,   ; French
         "NL"      : tstr,   ; Dutch
         "ZH-HANS" : tstr,   ; Chinese
       }

       color = &(
         black: 0, red: 1, green: 2, yellow: 3,
         blue: 4, magenta: 5, cyan: 6, white: 7,
       )

                     Figure 8: Example CBOR structure

4.3.  RFC 7071

   [RFC7071] defines the Reputon structure for JSON using somewhat
   formalized English text.  Here is a (somewhat verbose) equivalent
   definition using the same terms, but notated in CDDL:

       reputation-object = {
         reputation-context,
         reputon-list
       }

       reputation-context = (
         application: text
       )

       reputon-list = (
         reputons: reputon-array
       )

       reputon-array = [* reputon]

       reputon = {
         rater-value,
         assertion-value,
         rated-value,
         rating-value,
         ? conf-value,
         ? normal-value,
         ? sample-value,
         ? gen-value,
         ? expire-value,
         * ext-value,
       }

       rater-value = ( rater: text )
       assertion-value = ( assertion: text )
       rated-value = ( rated: text )
       rating-value = ( rating: float16 )
       conf-value = ( confidence: float16 )
       normal-value = ( normal-rating: float16 )
       sample-value = ( sample-size: uint )
       gen-value = ( generated: uint )
       expire-value = ( expires: uint )
       ext-value = ( text => any )

   An equivalent, more compact form of this example would be:

       reputation-object = {
         application: text
         reputons: [* reputon]
       }

       reputon = {
         rater: text
         assertion: text
         rated: text
         rating: float16
         ? confidence: float16
         ? normal-rating: float16
         ? sample-size: uint
         ? generated: uint
         ? expires: uint
         * text => any
       }

   Note how this rather clearly delineates the structure somewhat
   shrouded by so many words in Section 6.2.2 of [RFC7071].  Also, this
   definition makes it clear that several ext-values are allowed (by
   definition with different member names); RFC 7071 could be read to
   forbid the repetition of ext-value ("A specific reputon-element MUST
   NOT appear more than once" is ambiguous.)

   The CDDL tool (which hasn't quite been trained for polite
   conversation) says:

       {
         "application": "tridentiferous",
         "reputons": [
           {
             "rater": "loamily",
             "assertion": "Dasyprocta",
             "rated": "uncommensurableness",
             "rating": 0.05055809746548934,
             "confidence": 0.7484706448605812,
             "normal-rating": 0.8677887734049299,
             "sample-size": 4059,
             "expires": 3969,
             "bearer": "nitty",
             "faucal": "postulnar",
             "naturalism": "sarcotic"
           },
           {
             "rater": "precreed",
             "assertion": "xanthosis",
             "rated": "balsamy",
             "rating": 0.36091333590593955,
             "confidence": 0.3700759808403371,
             "sample-size": 3904
           },
           {
             "rater": "urinosexual",
             "assertion": "malacostracous",
             "rated": "arenariae",
             "rating": 0.9210673488013762,
             "normal-rating": 0.4778762617112776,
             "sample-size": 4428,
             "generated": 3294,
             "backfurrow": "enterable",
             "fruitgrower": "flannelflower"
           },
           {
             "rater": "pedologistically",
             "assertion": "unmetaphysical",
             "rated": "elocutionist",
             "rating": 0.42073613384304287,
             "misimagine": "retinaculum",
             "snobbish": "contradict",
             "Bosporanic": "periostotomy",
             "dayworker": "intragyral"
           }
         ]
       }

4.4.  Examples from JSON Content Rules

   Although JSON Content Rules [I-D.newton-json-content-rules] seems to
   address a more general problem than CDDL, it is still a worthwhile
   resource to explore for examples (beyond all the inspiration the
   format itself has provided for CDDL).

   Figure 2 of the JCR I-D looks very similar, if slightly less noisy,
   in CDDL:

       root = [2*2 {
         precision: text,
         Latitude: float,
         Longitude: float,
         Address: text,
         City: text,
         State: text,
         Zip: text,
         Country: text
       }]

                     Figure 9: JCR, Figure 2, in CDDL

   Apart from the lack of a need to quote the member names, text strings
   are called "text" or "tstr" in CDDL ("string" would be ambiguous as
   CBOR also provides byte strings).

   The CDDL tool creates the below example instance for this:

       [{"precision": "pyrosphere", "Latitude": 0.5399712314350172,
         "Longitude": 0.5157523963028087, "Address": "resow",
         "City": "problemwise", "State": "martyrlike", "Zip": "preprove",
         "Country": "Pace"},
        {"precision": "unrigging", "Latitude": 0.10422704368372193,
         "Longitude": 0.6279808663725834, "Address": "picturedom",
         "City": "decipherability", "State": "autometry", "Zip": "pout",
         "Country": "wimple"}]

   Figure 4 of the JCR I-D in CDDL:

       root = { image }

       image = (
         Image: {
           size,
           Title: text,
           thumbnail,
           IDs: [* int]
         }
       )

       size = (
         Width: 0..1280
         Height: 0..1024
       )

       thumbnail = (
         Thumbnail: {
           size,
           Url: uri
         }
       )

   This shows how the group concept can be used to keep related elements
   (here: width, height) together, and to emulate the JCR style of
   specification.  (It also shows using a tag from the prelude, "uri" -
   this could be done differently.)  The more compact form of Figure 5
   of the JCR I-D could be emulated like this:

       root = {
         Image: {
           size, Title: text,
           Thumbnail: { size, Url: uri },
           IDs: [* int]
         }
       }

       size = (
         Width: 0..1280,
         Height: 0..1024,
       )

   The CDDL tool creates the below example instance for this:

       {"Image": {"Width": 566, "Height": 516, "Title": "leisterer",
         "Thumbnail": {"Width": 1111, "Height": 176, "Url": 32("scrog")},
         "IDs": []}}

5.  Making Use of CDDL

   In this section, we discuss several potential ways to employ CDDL.

5.1.  As a guide to a human user

   CDDL can be used to efficiently define the layout of CBOR data, such
   that a human implementer can easily see how data is supposed to be
   encoded.

   Since CDDL maps parts of the CBOR data to human-readable names, tools
   could be built that use CDDL to provide a human-friendly
   representation of the CBOR data, and allow users to edit such data
   while it remains compliant with its CDDL definition.

5.2.  For automated checking of CBOR data structure

   CDDL has been specified such that a machine can handle the CDDL
   definition and related CBOR data.  For example, a machine could use
   CDDL to check whether or not CBOR data is compliant with its
   definition.

   The need for thoroughness of such compliance checking depends on the
   application.  For example, an application may decide not to check the
   data structure at all, and use the CDDL definition solely as a means
   to indicate the structure of the data to the programmer.

   At the other extreme, the application may also implement a checking
   mechanism that goes as far as checking that all mandatory map pairs
   are available.

   The extent to which the data description must be enforced by an
   application is left to the designers and implementers of that
   application, keeping in mind related security considerations.
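
   As an illustration of the stricter end of that range, the following
   Python sketch checks an already decoded data item against the
   compact "reputon" definition of Section 4.3.  It is hand-written
   for this one definition and is not output of, or an interface to,
   the CDDL tool; the names check_reputon, MANDATORY, and OPTIONAL are
   assumptions made for this example, and floating point precision is
   not checked.

```python
def is_text(value) -> bool:
    return isinstance(value, str)


def is_float(value) -> bool:
    return isinstance(value, float)


def is_uint(value) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return (isinstance(value, int) and not isinstance(value, bool)
            and value >= 0)


# Entries without an occurrence indicator in the reputon definition.
MANDATORY = {"rater": is_text, "assertion": is_text,
             "rated": is_text, "rating": is_float}
# Entries marked with "?" in the reputon definition.
OPTIONAL = {"confidence": is_float, "normal-rating": is_float,
            "sample-size": is_uint, "generated": is_uint,
            "expires": is_uint}


def check_reputon(item) -> list:
    """Return a list of problems; an empty list means the item matches."""
    if not isinstance(item, dict):
        return ["not a map"]
    problems = []
    for name, matches in MANDATORY.items():
        if name not in item:
            problems.append(f"missing mandatory member: {name}")
        elif not matches(item[name]):
            problems.append(f"wrong type for member: {name}")
    for name, matches in OPTIONAL.items():
        if name in item and not matches(item[name]):
            problems.append(f"wrong type for member: {name}")
    # "* text => any": any further member is fine if its key is text.
    for name in item:
        if not is_text(name):
            problems.append(f"member key is not text: {name!r}")
    return problems


sample = {"rater": "precreed", "assertion": "xanthosis",
          "rated": "balsamy", "rating": 0.36091333590593955,
          "confidence": 0.3700759808403371, "sample-size": 3904}
print(check_reputon(sample))
```

   An application that only wants the definition as documentation
   would simply omit such a check.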

   In no case is the intention that a CDDL tool would be "writing code"
   for an implementation.

5.3.  For data analysis tools

   In the long run, it can be expected that more and more data will be
   stored using the CBOR data format.

   Where there is data, there is data analysis and the need to process
   such data automatically.  CDDL can be used for such automated data
   processing, allowing tools to verify data, clean it, and extract
   particular parts of interest from it.

   Since CBOR is designed with constrained devices in mind, a likely use
   of it would be small sensors.  An interesting use would thus be
   automated analysis of sensor data.

6.  Discussion

   CDDL already is usable in its present form, as Section 4.3 should
   have demonstrated.  However, additional examples should be developed,
   and some experience be gained with the usefulness of tools built
   around CDDL.

6.1.  Work to do

   o  The precise semantics of occurrence indicators as defined in
      Section 3.2 could be explained in more detail.  E.g., the exact
      semantics of an occurrence indicator on a group name in a map
      (which means the entire group can occur in this way).

   o  Build good use cases that, one each, demonstrate vector, record,
      table and struct usage.

   o  There probably are some security considerations.

   See also the editorial comments sprinkled throughout the document.

7.  Resolved Issues

   o  The key/value pairs in maps have no fixed ordering.  One could
      imagine situations where fixing the ordering may be of use.  For
      example, a decoder could look for values related with integer keys
      1, 3 and 7.  If the order were fixed and the decoder encounters
      the key 4 without having encountered key 3, it could conclude that
      key 3 is not available without doing more complicated bookkeeping.
      Unfortunately, neither JSON nor CBOR supports this, so no attempt
      was made to support this in CDDL either.

   o  CDDL distinguishes the various CBOR number types, but there is
      only one number type in JSON.  There is no effect in specifying a
      precision (float16/float32/float64) when using CDDL for specifying
      JSON data structures.  (The current validator implementation
      (Appendix F) does not handle this very well, either.)

8.  Security considerations

   This document presents a content rules language for expressing CBOR
   data structures.  As such, it does not bring any security issues by
   itself, although specifications of protocols that use CBOR naturally
   need security analysis when defined.

   Topics that could be considered in the security considerations
   section of a specification that uses CDDL to define CBOR structures
   include the following:

   o  Where could the language cause confusion in a way that enables
      security issues?

9.  IANA considerations

   This document does not require any IANA registrations.

10.  Acknowledgements

   CDDL was originally conceived by Bert Greevenbosch, who also wrote
   the original five versions of this document.

   Inspiration was taken from the C and Pascal languages, MPEG's
   conventions for describing structures in the ISO base media file
   format, Relax-NG and its compact syntax [RELAXNG], and in particular
   from Andrew Lee Newton's "JSON Content Rules"
   [I-D.newton-json-content-rules].

   Useful feedback came from Carsten Bormann, Joe Hildebrand, Sean
   Leonard and Jim Schaad.

   The CDDL tool was written by Carsten Bormann, building on previous
   work by Troy Heninger and Tom Lord.

11.  References

11.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC3629]  Yergeau, F., "UTF-8, a transformation format of ISO
              10646", STD 63, RFC 3629, DOI 10.17487/RFC3629,
              November 2003.

   [RFC5234]  Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax
              Specifications: ABNF", STD 68, RFC 5234,
              DOI 10.17487/RFC5234, January 2008.

   [RFC7049]  Bormann, C. and P. Hoffman, "Concise Binary Object
              Representation (CBOR)", RFC 7049, DOI 10.17487/RFC7049,
              October 2013.

   [RFC7159]  Bray, T., Ed., "The JavaScript Object Notation (JSON) Data
              Interchange Format", RFC 7159, DOI 10.17487/RFC7159,
              March 2014.

11.2.  Informative References

   [RELAXNG]  OASIS, "RELAX-NG Compact Syntax", November 2002.

   [RFC7071]  Borenstein, N. and M. Kucherawy, "A Media Type for
              Reputation Interchange", RFC 7071, DOI 10.17487/RFC7071,
              November 2013.

   [I-D.newton-json-content-rules]
              Newton, A., "A Language for Rules Describing JSON
              Content", draft-newton-json-content-rules-04 (work in
              progress), December 2014.

   [RFC4648]  Josefsson, S., "The Base16, Base32, and Base64 Data
              Encodings", RFC 4648, DOI 10.17487/RFC4648, October 2006.

Appendix A.  Cemetery

   The following ideas are buried for now:

   o  <...> as syntax for enumerations.  We view values to be just
      another type (a very specific type with just one member), so that
      an enumeration can be denoted as a choice using "/" as the
      delimiter of choices.  Because of this, no evidence is present
      that a separate syntax for enumerations is needed.

Appendix B.  Nursery

   This appendix describes advanced features that are still under heavy
   review.

B.1.  Annotations

   An _annotation_ allows a _target_ type to be annotated with a
   _control_ type via an _annotator_.

   The syntax for an annotated type is "target .annotator control",
   where annotators are special identifiers prefixed by a dot.  (Note
   that _target_ or _control_ might need to be parenthesized.)
   Several annotators are defined at this point.  Note that the CDDL
   tool does not currently support combining multiple annotations on a
   single target.

B.1.1.  Annotation .size

   A ".size" annotation controls the size of the target in bytes by the
   control type.  Examples:

       full-address = [[+ label], ip4, ip6]
       ip4 = bstr .size 4
       ip6 = bstr .size 16
       label = bstr .size (1..63)

                 Figure 10: Annotation for size in bytes

   (FIXME: In the CDDL tool, the target must be a byte string for now.)

   When applied to an unsigned integer, the ".size" annotation restricts
   the range of that integer by giving a maximum number of bytes that
   should be needed in a computer representation of that unsigned
   integer.  In other words, "uint .size N" is equivalent to
   "0...BYTES_N", where BYTES_N == 256**N.

       audio_sample = uint .size 3 ; 24-bit, equivalent to 0..16777215

             Figure 11: Annotation for integer size in bytes

   Note that, as with value restrictions in CDDL, this annotation is not
   a representation constraint; a number that fits into fewer bytes can
   still be represented in that form, and an inefficient implementation
   could use a longer form (unless that is restricted by some format
   constraints outside of CDDL, such as the rules in Section 3.9 of
   [RFC7049]).

B.1.2.  Annotation .bits

   A ".bits" annotation on a byte string indicates that, in the target,
   only the bits numbered by a number in the control type are allowed to
   be set.  (Bits are counted the usual way, bit number "n" being set in
   "str" meaning that "(str[n >> 3] & (1 << (n & 7))) != 0".)
   [_bitsendian]

   Similarly, a ".bits" annotation on an unsigned integer "i" indicates
   that for all unsigned integers "n" where "(i & (1 << n)) != 0", "n"
   is in the control type.
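
   The two bit-numbering rules above can be written out as executable
   checks.  The Python sketch below is only an illustration of those
   rules, not the CDDL tool's implementation; the function names
   bstr_bits_ok and uint_bits_ok are hypothetical, and the control
   type is modelled simply as a set of allowed bit numbers (here the
   bit numbers used by "flags" and "rwx" in Figure 12).

```python
def bstr_bits_ok(data: bytes, allowed) -> bool:
    """True if every bit set in the byte string is numbered in `allowed`.

    Bit n is set when (data[n >> 3] & (1 << (n & 7))) != 0.
    """
    for n in range(8 * len(data)):
        if data[n >> 3] & (1 << (n & 7)) and n not in allowed:
            return False
    return True


def uint_bits_ok(value: int, allowed) -> bool:
    """True if every n with (value & (1 << n)) != 0 is in `allowed`."""
    n = 0
    while value >> n:
        if (value >> n) & 1 and n not in allowed:
            return False
        n += 1
    return True


# Bit numbers allowed by "flags" in Figure 12: fin..cwr are 8..15,
# ns is 0, and 4..7 are the data offset bits.
TCP_FLAG_BITS = set(range(8, 16)) | {0} | set(range(4, 8))
# Bit numbers allowed by "rwx" in Figure 12.
RWX_BITS = {0, 1, 2}

print(bstr_bits_ok(bytes.fromhex("906d"), TCP_FLAG_BITS))  # bits 4, 7, 8..
print(bstr_bits_ok(bytes.fromhex("02"), TCP_FLAG_BITS))    # bit 1 is set
print(uint_bits_ok(7, RWX_BITS), uint_bits_ok(8, RWX_BITS))
```

   Note that neither check constrains the length of the byte string or
   the magnitude of the integer beyond what the set bits imply.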

       tcpflagbytes = bstr .bits flags
       flags = &(
         fin: 8,
         syn: 9,
         rst: 10,
         psh: 11,
         ack: 12,
         urg: 13,
         ece: 14,
         cwr: 15,
         ns: 0,
       ) / (4..7) ; data offset bits

       rwxbits = uint .bits rwx
       rwx = &(r: 2, w: 1, x: 0)

              Figure 12: Annotation for what bits can be set

   The CDDL tool generates the following ten example instances for
   "tcpflagbytes":

       h'906d' h'01fc' h'8145' h'01b7' h'013d' h'409f' h'018e' h'c05f'
       h'01fa' h'01fe'

   These examples do not illustrate that the above CDDL specification
   does not explicitly specify a size of two bytes: A valid all clear
   instance of flag bytes could be "h''" or "h'00'" or even "h'000000'"
   as well.

B.1.3.  Annotation .regexp

   A ".regexp" annotation indicates that the text string given as a
   target needs to match the PCRE regular expression given as a value in
   the control type, where that regular expression is anchored on both
   sides.  (If anchoring is not desired for a side, ".*" needs to be
   inserted there.)

       nai = tstr .regexp "\\w+@\\w+(\\.\\w+)+"

                 Figure 13: Annotation with a PCRE regexp

   The CDDL tool proposes:

       "N1@CH57HF.4Znqe0.dYJRN.igjf"

B.1.4.  Annotations .cbor and .cborseq

   A ".cbor" annotation on a byte string indicates that the byte string
   carries a CBOR encoded data item.  Decoded, the data item matches the
   type given as the right-hand side argument (type1 in the following
   example).

       "bytes .cbor type1"

   Similarly, a ".cborseq" annotation on a byte string indicates that
   the byte string carries a sequence of CBOR encoded data items.  When
   the data items are taken as an array, the array matches the type
   given as the right-hand side argument (type2 in the following
   example).

       "bytes .cborseq type2"

   (The conversion of the encoded sequence to an array can be effected
   for instance by wrapping the byte string between the two bytes 0x9f
   and 0xff and decoding the wrapped byte string as a CBOR encoded data
   item.)

B.1.5.  Annotations .within and .and

   A ".and" annotation on a type indicates that the data item matches
   both that left-hand side type and the type given as the right-hand
   side.  (Formally, the resulting type is the intersection of the two
   types given.)

       "type1 .and type2"

   A variant of the ".and" annotation is the ".within" annotation, which
   expresses an additional intent: the left-hand side type is meant to
   be a subset of the right-hand side type.

       "type1 .within type2"

   While both forms have identical formal semantics (intersection),
   the intention of the ".within" form is that the right-hand side gives
   guidance to the types allowed on the left-hand side, which typically
   is a socket (Appendix B.2):

       message = $message .within message-structure
       message-structure = [message_type, *message_option]
       message_type = 0..255
       message_option = any

       $message /= [3, dough: text, topping: [* text]]
       $message /= [4, noodles: text, sauce: text, parmesan: bool]

   For ".within", a tool might flag an error if type1 allows data items
   that are not allowed by type2.  In contrast, for ".and", there is no
   expectation that type1 already is a subset of type2.

B.2.  Socket/Plug

   Both for type choices and group choices, a mechanism is defined that
   facilitates starting out with empty choices and assembling them
   later, potentially in separate files that are concatenated to build
   the full specification.

   By convention, CDDL extension points are marked with a leading
   dollar sign (types) or two leading dollar signs (groups).  Tools
   honor that convention by not raising an error if such a type or group
   is not defined at all; the symbol is then taken to be an empty type
   choice (group choice), i.e., no choice is available.

       tcp-header = {seq: uint, ack: uint, * $$tcp-option}

       ; later, in a different file

       $$tcp-option //= (
         sack: [+(left: uint, right: uint)]
       )

       ; and, maybe in another file

       $$tcp-option //= (
         sack-permitted: true
       )

   Names that start with a single "$" are "type sockets", names with a
   double "$$" are "group sockets".  It is not an error if there is no
   definition for a socket at all; this then means there is no way to
   satisfy the rule (i.e., the choice is empty).

   All definitions (plugs) for socket names must be augments, i.e., they
   must be using "/=" and "//=", respectively.

   To pick up the example illustrated in Figure 5, the socket/plug
   mechanism could be used as shown in Figure 14:

       PersonalData = {
         ? displayName: tstr,
         NameComponents,
         ? age: uint,
         * $$personaldata-extensions
       }

       NameComponents = (
         ? firstName: tstr,
         ? familyName: tstr,
       )

       ; The above already works as is.
       ; But then, we can add later:

       $$personaldata-extensions //= (
         favorite-salsa: tstr,
       )

       ; and again, somewhere else:

       $$personaldata-extensions //= (
         shoesize: uint,
       )

     Figure 14: Personal Data example: Using socket/plug extensibility

B.3.  Generics

   Using angle brackets, the left-hand side of a rule can add formal
   parameters after the name being defined, as in:

       messages = message<"reboot", "now"> / message<"sleep", 1..100>
       message<t, v> = {type: t, value: v}

   When using a generic rule, the formal parameters are bound to the
   actual arguments supplied (also using angle brackets), within the
   scope of the generic rule (as if there were a rule of the form
   parameter = argument).
   (At this time, the tool described in Appendix F has some limitations
   on the nesting of generics.)

Appendix C.  Change Log

   Changes from version 00 to version 01:

   o  Removed constants

   o  Updated the tag mechanism

   o  Extended the map structure

   o  Added examples

   Changes from version 01 to version 02:

   o  Fixed example

   Changes from version 02 to version 03:

   o  Added information about characters used in names

   o  Added text about an overall data structure and order of definition
      of fields

   o  Added text about encoding of keys

   o  Added table with keywords

   o  Strings and integer writing conventions

   o  Added ABNF

   Changes from version 03 to version 04:

   o  Removed optional fields for non-maps

   o  Defined that all key/value pairs in maps are considered optional
      from the CDDL perspective

   o  Allow omission of type of keys for maps with only text string and
      integer keys

   o  Changed order of definitions

   o  Updated fruit and moves examples

   o  Renamed the "Philosophy" section to "Using CDDL", and added more
      text about CDDL usage

   o  Several editorials

   Changes from version 04 to version 05:

   o  Added text about alternative datatypes and any datatype

   o  Fixed typos

   o  Restructured syntax and semantics

   Changes from version 05 to version 06:

   o  Fixed the ABNF for choices (no longer need to write a: (b/c))

   o  Added group choices (//)

   o  Added /= and //=

   o  Added experimental socket/plug

   o  Added aliases text, bytes, null to prelude

   o  Documented generics

   o  Fixed more typos

   Changes from 06 to 07:

   o  .cbor, .cborseq, .within, .and

   o  Define .size on uint

   o  Extended Diagnostic Notation

   o  Precedence discussion and table

   o  Remove some of the "issues" that can only be understood with
      historical context

   o  Prefer "text" over "tstr" in some of the examples

   o  Add "unsigned" to the prelude

Appendix D.  ABNF grammar

   The following is a formal definition of the CDDL syntax in Augmented
   Backus-Naur Form (ABNF, [RFC5234]).  [_abnftodo]

      cddl = S 1*rule
      rule = typename [genericparm] S assign S type S
           / groupname [genericparm] S assign S grpent S

      typename = id
      groupname = id

      assign = "=" / "/=" / "//="

      genericparm = "<" S id S *("," S id S ) ">"
      genericarg = "<" S type1 S *("," S type1 S ) ">"

      type = type1 S *("/" S type1 S)

      type1 = type2 [S (rangeop / annotator) S type2]
            / "#" "6" ["." uint] "(" S type S ")" ; note no space!
            / "#" DIGIT ["." uint]                ; major/ai
            / "#"                                 ; any
            / "{" S group S "}"
            / "[" S group S "]"
            / "&" S "(" S group S ")"
            / "&" S groupname [genericarg]

      type2 = value
            / typename [genericarg]
            / "(" type ")"

      rangeop = "..." / ".."

      annotator = "." id

      group = grpchoice S *("//" S grpchoice S)

      grpchoice = *grpent

      grpent = [occur S] [memberkey S] type optcom
             / [occur S] groupname [genericarg] optcom ; preempted by above
             / [occur S] "(" S group S ")" optcom

      memberkey = type1 S "=>"
                / bareword S ":"
                / value S ":"

      bareword = id

      optcom = S ["," S]

      occur = [uint] "*" [uint]
            / "+"
            / "?"

      uint = ["0x" / "0b"] "0"
           / ["0x" / "0b"] DIGIT1 *DIGIT

      value = number
            / string

      int = ["-"] uint

      ; This is a float if it has fraction or exponent; int otherwise
      number = int ["." fraction] ["e" exponent ]
      fraction = 1*DIGIT
      exponent = int

      string = %x22 *SCHAR %x22
      SCHAR = %x20-21 / %x23-7E / SESC
      SESC = "\" %x20-7E

      id = EALPHA *(*("-" / ".") (EALPHA / DIGIT))
      ALPHA = %x41-5A / %x61-7A
      EALPHA = %x41-5A / %x61-7A / "@" / "_" / "$"
      DIGIT = %x30-39
      DIGIT1 = %x31-39
      S = *WS
      WS = SP / NL
      SP = %x20
      NL = COMMENT / CRLF
      COMMENT = ";" *(SP / VCHAR) CRLF
      VCHAR = %x21-7E
      CRLF = %x0A / %x0D.0A

                           Figure 15: CDDL ABNF

Appendix E.  Standard Prelude

   The following prelude is automatically added to each CDDL file
   [tdate].  (Note that technically, it is a postlude, as it does not
   disturb the selection of the first rule as the root of the
   definition.)

      any = #

      uint = #0
      nint = #1
      int = uint / nint

      bstr = #2
      bytes = bstr
      tstr = #3
      text = tstr

      tdate = #6.0(tstr)
      time = #6.1(number)
      number = int / float
      biguint = #6.2(bstr)
      bignint = #6.3(bstr)
      bigint = biguint / bignint
      integer = int / bigint
      unsigned = uint / biguint
      decfrac = #6.4([e10: int, m: integer])
      bigfloat = #6.5([e2: int, m: integer])
      eb64url = #6.21(any)
      eb64legacy = #6.22(any)
      eb16 = #6.23(any)
      encoded-cbor = #6.24(bstr)
      uri = #6.32(tstr)
      b64url = #6.33(tstr)
      b64legacy = #6.34(tstr)
      regexp = #6.35(tstr)
      mime-message = #6.36(tstr)
      cbor-any = #6.55799(any)

      float16 = #7.25
      float32 = #7.26
      float64 = #7.27
      float16-32 = float16 / float32
      float32-64 = float32 / float64
      float = float16-32 / float64

      false = #7.20
      true = #7.21
      bool = false / true
      nil = #7.22
      null = nil
      undefined = #7.23

                          Figure 16: CDDL Prelude

   Note that the prelude is deemed to be fixed.
   This means, for instance, that additional tags beyond [RFC7049], as
   registered, need to be defined in each CDDL file that is using them.

   A common stumbling point is that the prelude does not define a type
   "string".  CBOR has byte strings ("bytes" in the prelude) and text
   strings ("text"), so a type that is simply called "string" would be
   ambiguous.

Appendix F.  The CDDL tool

   A rough CDDL tool is available.  For CDDL specifications that do not
   use recursion, it can check the syntax, generate one or more
   instances (expressed in CBOR diagnostic notation or in pretty-printed
   JSON), and validate an existing instance against the specification:

      Usage:
      cddl spec.cddl generate [n]
      cddl spec.cddl json-generate [n]
      cddl spec.cddl validate instance.cbor
      cddl spec.cddl validate instance.json

                        Figure 17: CDDL tool usage

   Install on a system with a modern Ruby via:

      gem install cddl

                                 Figure 18

   The accompanying CBOR diagnostic tools (which are automatically
   installed by the above) are described in
   https://github.com/cabo/cbor-diag ; they can be used to convert
   between binary CBOR, a pretty-printed form of that, CBOR diagnostic
   notation, JSON, and YAML.

Appendix G.  Extended Diagnostic Notation

   Section 6 of [RFC7049] defines a "diagnostic notation" in order to be
   able to converse about CBOR data items without having to resort to
   binary data.  Diagnostic notation is based on JSON, with extensions
   for representing CBOR constructs such as binary data and tags.

   (Standardizing this together with the actual interchange format does
   not serve to create another interchange format, but enables the use
   of a shared diagnostic notation in tools for and documents about
   CBOR.)

   This section discusses a few extensions to the diagnostic notation
   that have turned out to be useful since RFC 7049 was written.
   We refer to the result as extended diagnostic notation (EDN).

G.1.  White space in binary strings

   Examples often benefit from some white space (spaces, line breaks) in
   binary strings.  In extended diagnostic notation, white space is
   ignored in prefixed binary strings; for instance, the following are
   equivalent:

      h'48656c6c6f20776f726c64'
      h'48 65 6c 6c 6f 20 77 6f 72 6c 64'
      h'4 86 56c 6c6f
        20776 f726c64'

G.2.  Text in binary strings

   Diagnostic notation notates byte strings in one of the [RFC4648] base
   encodings, enclosed in single quotes, prefixed by >h< for base16,
   >b32< for base32, >h32< for base32hex, >b64< for base64 or base64url.
   Quite often, binary strings carry bytes that are meaningfully
   interpreted as UTF-8 text.  Extended diagnostic notation allows the
   use of single quotes without a prefix to express byte strings with
   UTF-8 text; for instance, the following are equivalent:

      'hello world'
      h'68656c6c6f20776f726c64'

   The escaping rules of JSON strings are applied equivalently for
   text-based binary strings, e.g., \\ stands for a single backslash and
   \' stands for a single quote.  White space is included literally,
   i.e., the previous section does not apply to text-based binary
   strings.

G.3.  Concatenated Strings

   While the ability to include white space enables line-breaking of
   encoded binary strings, a mechanism is needed to be able to include
   text strings as well as binary strings in direct UTF-8 representation
   into line-based documents (such as RFCs and source code).

   We extend the diagnostic notation by allowing multiple text strings
   or multiple byte strings to be notated separated by white space;
   these are then concatenated into a single text or byte string,
   respectively.
   Text strings and binary strings do not mix within such a
   concatenation, except that binary string notation can be used inside
   a sequence of concatenated text string notation to encode characters
   that may be better represented in an encoded way.  The following four
   values are equivalent:

      "Hello world"
      "Hello " "world"
      "Hello" h'20' "world"
      "" h'48656c6c6f20776f726c64' ""

   Similarly, the following byte string values are equivalent:

      'Hello world'
      'Hello ' 'world'
      'Hello ' h'776f726c64'
      'Hello' h'20' 'world'
      '' h'48656c6c6f20776f726c64' '' b64''
      h'4 86 56c 6c6f' h' 20776 f726c64'

   (Note that the approach of separating by white space, while familiar
   from the C language, requires some attention: a single comma makes a
   big difference here.)

G.4.  Hexadecimal, octal, and binary numbers

   In addition to JSON's decimal numbers, EDN provides hexadecimal,
   octal, and binary numbers in the usual C-language notation (octal is
   recognized only with the 0o prefix).

   The following are equivalent:

      4711
      0x1267
      0o11147
      0b1001001100111

   As are:

      1.5
      0x1.8p0
      0x18p-4

G.5.  Comments

   Longer pieces of diagnostic notation may benefit from comments.  JSON
   famously does not provide for comments, and basic RFC 7049 diagnostic
   notation inherits this property.

   In extended diagnostic notation, comments can be included, delimited
   by slashes ("/").  Any text within and including a pair of slashes is
   considered a comment.

   Comments are considered white space.
   Hence, they are allowed in prefixed binary strings; for instance, the
   following are equivalent:

      h'68656c6c6f20776f726c64'
      h'68 65 6c /doubled l!/ 6c 6f /hello/
        20 /space/
        77 6f 72 6c 64' /world/

   This can be used to annotate a CBOR structure as in:

      /grasp-message/ [/M_DISCOVERY/ 1, /session-id/ 10584416,
                       /objective/ [/objective-name/ "opsonize",
                                    /D, N, S/ 7, /loop-count/ 105]]

   (There are currently no end-of-line comments.  If we want to add
   them, "//" sounds like a reasonable delimiter given that we already
   use slashes for comments, but we could also use, e.g., "#".)

Editorial Comments

   [_format]  So far, the ability to restrict format choices has not
      been needed beyond the floating point formats.  Those can be
      applied to ranges using the new .and annotation now.  It is not
      clear we want to add more format control before we have a use
      case.

   [_range]  TO DO: define this precisely.  This clearly includes
      integers and floats.  Strings - as in "a".."z" - could be added if
      desired, but this would require adopting a definition of string
      ordering and possibly a successor function so "a".."z" does not
      include "bb".

   [_strings]  TO DO: This still needs to be fully realized in the ABNF
      and in the CDDL tool.

   [unflex]  A comment has been that this is counter-intuitive.  One
      solution would be to simply disallow unparenthesized usage of
      occurrence indicators in front of type choices unless a member key
      is also present, as in group2 above.

   [_bitsendian]  How useful would it be to have another variant that
      counts bits like in RFC box notation?  (Or at least per-byte?
      32-bit words don't always perfectly mesh with byte strings.)

   [_abnftodo]  TO DO: This doesn't allow non-ASCII characters in the
      text strings yet; there is no value notation for byte strings;
      representation indicators are missing as well.
   [tdate]  The prelude as included here does not yet have a .regexp
      annotation on tdate, but we probably do want to have one.

Authors' Addresses

   Christoph Vigano
   Universitaet Bremen

   Email: christoph.vigano@uni-bremen.de


   Henk Birkholz
   Fraunhofer SIT
   Rheinstrasse 75
   Darmstadt  64295
   Germany

   Email: henk.birkholz@sit.fraunhofer.de