CBOR                                                         H. Birkholz
Internet-Draft                                            Fraunhofer SIT
Intended status: Standards Track                               C. Vigano
Expires: August 17, 2019                             Universitaet Bremen
                                                              C. Bormann
                                                 Universitaet Bremen TZI
                                                       February 13, 2019


  Concise data definition language (CDDL): a notational convention to
                 express CBOR and JSON data structures
                        draft-ietf-cbor-cddl-07

Abstract

   This document proposes a notational convention to express CBOR data
   structures (RFC 7049, Concise Binary Object Representation). Its
   main goal is to provide an easy and unambiguous way to express
   structures for protocol messages and data formats that use CBOR or
   JSON.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on August 17, 2019.

Copyright Notice

   Copyright (c) 2019 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.
   Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1. Introduction
     1.1. Requirements notation
     1.2. Terminology
   2. The Style of Data Structure Specification
     2.1. Groups and Composition in CDDL
       2.1.1. Usage
       2.1.2. Syntax
     2.2. Types
       2.2.1. Values
       2.2.2. Choices
       2.2.3. Representation Types
       2.2.4. Root type
   3. Syntax
     3.1. General conventions
     3.2. Occurrence
     3.3. Predefined names for types
     3.4. Arrays
     3.5. Maps
       3.5.1. Structs
       3.5.2. Tables
       3.5.3. Non-deterministic order
       3.5.4. Cuts in Maps
     3.6. Tags
     3.7. Unwrapping
     3.8. Controls
       3.8.1. Control operator .size
       3.8.2. Control operator .bits
       3.8.3. Control operator .regexp
       3.8.4. Control operators .cbor and .cborseq
       3.8.5. Control operators .within and .and
       3.8.6. Control operators .lt, .le, .gt, .ge, .eq, .ne, and
              .default
     3.9. Socket/Plug
     3.10. Generics
     3.11. Operator Precedence
   4. Making Use of CDDL
     4.1. As a guide to a human user
     4.2. For automated checking of CBOR data structure
     4.3. For data analysis tools
   5. Security considerations
   6. IANA Considerations
     6.1. CDDL control operator registry
   7. References
     7.1. Normative References
     7.2. Informative References
   Appendix A. Parsing Expression Grammars (PEG)
   Appendix B. ABNF grammar
   Appendix C. Matching rules
   Appendix D. Standard Prelude
   Appendix E. Use with JSON
   Appendix F. A CDDL tool
   Appendix G. Extended Diagnostic Notation
     G.1. White space in byte string notation
     G.2. Text in byte string notation
     G.3. Embedded CBOR and CBOR sequences in byte strings
     G.4. Concatenated Strings
     G.5. Hexadecimal, octal, and binary numbers
     G.6. Comments
   Appendix H. Examples
     H.1. RFC 7071
     H.2. Examples from JSON Content Rules
   Contributors
   Acknowledgements
   Authors' Addresses

1. Introduction

   In this document, a notational convention to express CBOR [RFC7049]
   data structures is defined.

   The main goal for the convention is to provide a unified notation
   that can be used when defining protocols that use CBOR. We term the
   convention "Concise data definition language", or CDDL.

   The CBOR notational convention has the following goals:

   (G1) Provide an unambiguous description of the overall structure of
        a CBOR data item.

   (G2) Be flexible in expressing the multiple ways in which data can
        be represented in the CBOR data format.

   (G3) Be able to express common CBOR datatypes and structures.

   (G4) Provide a single format that is both readable and editable for
        humans and processable by machine.

   (G5) Enable automatic checking of CBOR data items for data format
        compliance.

   (G6) Enable extraction of specific elements from CBOR data for
        further processing.

   Not an original goal per se, but a convenient side effect of the JSON
   generic data model being a subset of the CBOR generic data model, is
   the fact that CDDL can also be used for describing JSON data
   structures (see Appendix E).

   This document has the following structure:

   The syntax of CDDL is defined in Section 3.
   Examples of CDDL and
   related CBOR data items ("instances", which all happen to be in JSON
   form) are given in Appendix H. Section 4 discusses usage of CDDL.
   Examples are provided early in the text to better illustrate concept
   definitions. A formal definition of CDDL using ABNF grammar is
   provided in Appendix B. Finally, a _prelude_ of standard CDDL
   definitions that is automatically prepended to and thus available in
   every CBOR specification is listed in Appendix D.

1.1. Requirements notation

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in
   BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

1.2. Terminology

   New terms are introduced in _cursive_, which is rendered in plain
   text as the new term surrounded by underscores. CDDL text in the
   running text is in "typewriter", which is rendered in plain text as
   the CDDL text in double quotes (double quotes are also used in the
   usual English sense; the reader is expected to disambiguate this by
   context).

   In this specification, the term "byte" is used in its now customary
   sense as a synonym for "octet".

2. The Style of Data Structure Specification

   CDDL focuses on styles of specification that are in use in the
   community employing the data model as pioneered by JSON and now
   refined in CBOR.

   There are a number of more or less atomic elements of a CBOR data
   model, such as numbers, simple values (false, true, nil), text and
   byte strings; CDDL does not focus on specifying their structure.
   CDDL of course also allows adding a CBOR tag to a data item.

   Beyond those atomic elements, further components of a data structure
   definition language are the data types used for composition: arrays
   and maps in CBOR (called arrays and objects in JSON). While these
   are only two representation formats, they are used to specify four
   loosely distinguishable styles of composition:

   o  A _vector_, an array of elements that are mostly of the same
      semantics. The set of signatures associated with a signed data
      item is a typical application of a vector.

   o  A _record_, an array the elements of which have different,
      positionally defined semantics, as detailed in the data structure
      definition. A 2D point, specified as an array of an x coordinate
      (which comes first) and a y coordinate (coming second) is an
      example of a record, as is the pair of exponent (first) and
      mantissa (second) in a CBOR decimal fraction.

   o  A _table_, a map from a domain of map keys to a domain of map
      values, that are mostly of the same semantics. A set of language
      tags, each mapped to a text string translated to that specific
      language, is an example of a table. The key domain is usually not
      limited to a specific set by the specification, but open for the
      application, e.g., in a table mapping IP addresses to MAC
      addresses, the specification does not attempt to foresee all
      possible IP addresses. In a language such as JavaScript, a "Map"
      (as opposed to a plain "Object") would often be employed to
      achieve the generality of the key domain.

   o  A _struct_, a map from a domain of map keys as defined by the
      specification to a domain of map values the semantics of each of
      which is bound to a specific map key. This is what many people
      have in mind when they think about JSON objects; CBOR adds the
      ability to use map keys that are not just text strings.
      Structs
      can be used to solve similar problems as records; the use of
      explicit map keys facilitates optionality and extensibility.

   Two important concepts provide the foundation for CDDL:

   1. Instead of defining all four types of composition in CDDL
      separately, or even defining one kind for arrays (vectors and
      records) and one kind for maps (tables and structs), there is
      only one kind of composition in CDDL: the _group_ (Section 2.1).

   2. The other important concept is that of a _type_. The entire CDDL
      specification defines a type (the one defined by its first
      _rule_), which formally is the set of CBOR data items that are
      acceptable as "instances" for this specification. CDDL
      predefines a number of basic types such as "uint" (unsigned
      integer) or "tstr" (text string), often making use of a simple
      formal notation for CBOR data items. Each value that can be
      expressed as a CBOR data item also is a type in its own right,
      e.g. "1". A type can be built as a _choice_ of other types,
      e.g., an "int" is either a "uint" or a "nint" (negative integer).
      Finally, a type can be built as an array or a map from a group.

   The rest of this section introduces a number of basic concepts of
   CDDL, and Section 3 defines additional syntax. Appendix C gives a
   concise summary of the semantics of CDDL.

2.1. Groups and Composition in CDDL

   CDDL Groups are lists of group _entries_, each of which can be a
   name/value pair or a more complex group expression (which then in
   turn stands for a sequence of name/value pairs). A CDDL group is a
   production in a grammar that matches certain sequences of name/value
   pairs but not others. The grammar is based on the concepts of
   Parsing Expression Grammars (see Appendix A).

   In an array context, only the value of the name/value pair is
   represented; the name is annotation only (and can be left off from
   the group specification if not needed).
   In a map context, the names
   become the map keys ("member keys").

   In an array context, the actual sequence of elements in the group is
   important, as that sequence is the information that allows
   associating actual array elements with entries in the group. In a
   map context, the sequence of entries in a group is not relevant (but
   there is still a need to write down group entries in a sequence).

   An array matches a specification given as a group when the group
   matches a sequence of name/value pairs the value parts of which
   exactly match the elements of the array in order.

   A map matches a specification given as a group when the group matches
   a sequence of name/value pairs such that all of these name/value
   pairs are present in the map and the map has no name/value pair that
   is not covered by the group.

   A simple example of using a group directly in a map definition is:

      person = {
        age: int,
        name: tstr,
        employer: tstr,
      }

               Figure 1: Using a group directly in a map

   The three entries of the group are written between the curly braces
   that create the map: Here, "age", "name", and "employer" are the
   names that turn into the map key text strings, and "int" and "tstr"
   (text string) are the types of the map values under these keys.

   A group by itself (without creating a map around it) can be placed in
   (round) parentheses, and given a name by using it in a rule:

      pii = (
        age: int,
        name: tstr,
        employer: tstr,
      )

                        Figure 2: A basic group

   This separate, named group definition allows us to rephrase Figure 1
   as:

      person = {
        pii
      }

                    Figure 3: Using a group by name

   Note that the (curly) braces signify the creation of a map; the
   groups themselves are neutral as to whether they will be used in a
   map or an array.

   As shown in Figure 1, the parentheses for groups are optional when
   there is some other set of brackets present.
   Note that they can
   still be used, leading to the not so realistic, but perfectly valid
   example:

      person = {(
        age: int,
        name: tstr,
        employer: tstr,
      )}

             Figure 4: Using a parenthesized group in a map

   Groups can be used to factor out common parts of structs, e.g.,
   instead of writing copy/paste style specifications such as in
   Figure 5, one can factor out the common subgroup, choose a name for
   it, and write only the specific parts into the individual maps
   (Figure 6).

      person = {
        age: int,
        name: tstr,
        employer: tstr,
      }

      dog = {
        age: int,
        name: tstr,
        leash-length: float,
      }

                     Figure 5: Maps with copy/paste

      person = {
        identity,
        employer: tstr,
      }

      dog = {
        identity,
        leash-length: float,
      }

      identity = (
        age: int,
        name: tstr,
      )

               Figure 6: Using a group for factorization

   Note that the lists inside the braces in the above definitions
   constitute (anonymous) groups, while "identity" is a named group.

2.1.1. Usage

   Groups are the instrument used in composing data structures with
   CDDL. It is a matter of style in defining those structures whether
   to define groups (anonymously) right in their contexts or whether to
   define them in a separate rule and to reference them with their
   respective name (possibly more than once).

   With this, one is allowed to define all small parts of their data
   structures and compose bigger protocol units with those or to have
   only one big protocol data unit that has all definitions ad hoc where
   needed.

2.1.2. Syntax

   The composition syntax is intended to be concise and easy to read:

   o  The start and end of a group can be marked by '(' and ')'

   o  Definitions of entries inside of a group are noted as follows:
      _keytype => valuetype,_ (read "keytype maps to valuetype").
      The
      comma is actually optional (not just in the final entry), but it
      is considered good style to set it. The double arrow can be
      replaced by a colon in the common case of directly using a text
      string or integer literal as a key (see Section 3.5.1; this is
      also the common way of naming elements of an array just for
      documentation, see Section 3.4).

   A basic entry consists of a _keytype_ and a _valuetype_, both of
   which are types (Section 2.2); this entry matches any name-value pair
   the name of which is in the keytype and the value of which is in the
   valuetype.

   A group defined as a sequence of group entries matches any sequence
   of name-value pairs that is composed by concatenation in order of
   what the entries match.

   A group definition can also contain choices between groups, see
   Section 2.2.2.

2.2. Types

2.2.1. Values

   Values such as numbers and strings can be used in place of a type.
   (For instance, this is a very common thing to do for a keytype,
   common enough that CDDL provides additional convenience syntax for
   this.)

   The value notation is based on the C language, but does not offer all
   the syntactic variations (see Appendix B for details). The value
   notation for numbers inherits from C the distinction between integer
   values (no fractional part or exponent given -- NR1 [ISO6093]) and
   floating point values (where a fractional part and/or an exponent is
   present -- NR2 or NR3), so the type "1" does not include any floating
   point numbers while the types "1e3" and "1.5" are both floating point
   numbers and do not include any integer numbers.

2.2.2. Choices

   Many places that allow a type also allow a choice between types,
   delimited by a "/" (slash). The entire choice construct can be put
   into parentheses if this is required to make the construction
   unambiguous (please see Appendix B for the details).
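
   As a brief sketch of a choice between types (rather than between
   values), the two rules below each accept either of two predefined
   types. The rule names "identifier" and "maybe-temp" are
   hypothetical and serve only this illustration; "uint", "tstr",
   "float", and "nil" are names from the prelude (Appendix D).

```cddl
identifier = uint / tstr      ; matches 17 and "probe-a", but not -1
maybe-temp = (float / nil)    ; the parentheses are redundant here
```

   In the second rule the parentheses change nothing; they become
   useful only where the surrounding construct would otherwise be
   ambiguous.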

   Choices of values can be used to express enumerations:

      attire = "bow tie" / "necktie" / "Internet attire"
      protocol = 6 / 17

   As with types, CDDL also allows choices between groups,
   delimited by a "//" (double slash). Note that the "//" operator
   binds much more weakly than the other CDDL operators, so each line
   within "delivery" in the following example is its own alternative in
   the group choice:

      address = { delivery }

      delivery = (
        street: tstr, ? number: uint, city //
        po-box: uint, city //
        per-pickup: true )

      city = (
        name: tstr, zip-code: uint
      )

   A group choice matches the union of the sets of name-value pair
   sequences that the alternatives in the choice can match.

   Both for type choices and for group choices, additional alternatives
   can be added to a rule later in separate rules by using "/=" and
   "//=", respectively, instead of "=":

      attire /= "swimwear"

      delivery //= (
        lat: float, long: float, drone-type: tstr
      )

   It is not an error if a name is first used with a "/=" or "//="
   (there is no need to "create it" with "=").

2.2.2.1. Ranges

   Instead of naming all the values that make up a choice, CDDL allows
   building a _range_ out of two values that are in an ordering
   relationship: A lower bound (first value) and an upper bound (second
   value). A range can be inclusive of both bounds given (denoted by
   joining two values by ".."), or include the lower bound and exclude
   the upper bound (denoted by instead using "..."). If the lower bound
   exceeds the upper bound, the resulting type is the empty set (this
   behavior can be desirable when generics, Section 3.10, are being
   used).

      device-address = byte
      max-byte = 255
      byte = 0..max-byte ; inclusive range
      first-non-byte = 256
      byte1 = 0...first-non-byte ; byte1 is equivalent to byte

   CDDL currently only allows ranges between integers (matching integer
   values) or between floating point values (matching floating point
   values). If both are needed in a type, a type choice between the two
   kinds of ranges can be (clumsily) used:

      int-range = 0..10 ; only integers match
      float-range = 0.0..10.0 ; only floats match
      BAD-range1 = 0..10.0 ; NOT DEFINED
      BAD-range2 = 0.0..10 ; NOT DEFINED
      numeric-range = int-range / float-range

   (See also the control operators .lt/.ge and .le/.gt in
   Section 3.8.6.)

   Note that the dot is a valid name continuation character in CDDL, so

      min..max

   is not a range expression but a single name. When using a name as
   the left hand side of a range operator, use spacing as in

      min .. max

   to separate off the range operator.

2.2.2.2. Turning a group into a choice

   Some choices are built out of large numbers of values, often
   integers, each of which is best given a semantic name in the
   specification. Instead of naming each of these integers and then
   accumulating these into a choice, CDDL allows building a choice from
   a group by prefixing it with a "&" character:

      terminal-color = &basecolors
      basecolors = (
        black: 0, red: 1, green: 2, yellow: 3,
        blue: 4, magenta: 5, cyan: 6, white: 7,
      )
      extended-color = &(
        basecolors,
        orange: 8, pink: 9, purple: 10, brown: 11,
      )

   As with the use of groups in arrays (Section 3.4), the member names
   have only documentary value (in particular, they might be used by a
   tool when displaying integers that are taken from that choice).

2.2.3. Representation Types

   CDDL allows the specification of a data item type by referring to the
   CBOR representation (major types and additional information,
   Section 2 of [RFC7049]). How this is used should be evident from the
   prelude (Appendix D): a hash mark ("#") optionally followed by a
   number from 0 to 7 identifying the major type, which then can be
   followed by a dot and a number specifying the additional information.
   This construction specifies the set of values that can be serialized
   in CBOR (i.e., "any"), by the given major type if one is given, or by
   the given major type with the additional information if both are
   given. Where a major type of 6 (Tag) is used, the type of the tagged
   item can be specified by appending it in parentheses.

   Note that although this notation is based on the CBOR serialization,
   it is about a set of values at the data model level, e.g. "#7.25"
   specifies the set of values that can be represented as half-precision
   floats; it does not mandate that these values also be
   serialized as half-precision floats: CDDL does not provide any
   language means to restrict the choice of serialization variants.
   This also enables the use of CDDL with JSON, which uses a
   fundamentally different way of serializing (some of) the same values.

   It may be necessary to make use of representation types outside the
   prelude, e.g., a specification could start by making use of an
   existing tag in a more specific way, or define a new tag not defined
   in the prelude:

      my_breakfast = #6.55799(breakfast) ; cbor-any is too general!
      breakfast = cereal / porridge
      cereal = #6.998(tstr)
      porridge = #6.999([liquid, solid])
      liquid = milk / water
      milk = 0
      water = 1
      solid = tstr

2.2.4. Root type

   There is no special syntax to identify the root of a CDDL data
   structure definition: that role is simply taken by the first rule
   defined in the file.
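
   A minimal sketch of this convention follows. The rule names
   "message" and "header" and their members are hypothetical, chosen
   only for illustration. Because "message" is the first rule in the
   file, it is the root type; "header" is a group that "message"
   refers to.

```cddl
; first rule in the file: this is the root type
message = {
  header,
  payload: bstr,
}

; a named group, referenced from the root type above
header = (
  version: uint,
  sender: tstr,
)
```

   Under these assumptions, an instance such as
   {"version": 1, "sender": "a", "payload": h'00'} would be checked
   against "message".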

   This is motivated by the usual top-down approach for defining data
   structures, decomposing a big data structure unit into smaller parts;
   however, except for the root type, there is no need to strictly
   follow this sequence.

   (Note that there is no way to use a group as a root - it must be a
   type.)

3. Syntax

   In this section, the overall syntax of CDDL is shown, alongside some
   examples just illustrating syntax. (The definition will not attempt
   to be overly formal; refer to Appendix B for the details.)

3.1. General conventions

   The basic syntax is inspired by ABNF [RFC5234], with the following
   conventions:

   o  Rules, whether they define groups or types, are defined with a
      name, followed by an equals sign "=" and the actual definition
      according to the respective syntactic rules of that definition.

   o  A name can consist of any of the characters from the set {'A' to
      'Z', 'a' to 'z', '0' to '9', '_', '-', '@', '.', '$'}, starting
      with an alphabetic character (including '@', '_', '$') and ending
      in such a character or a digit.

      *  Names are case sensitive.

      *  It is preferred style to start a name with a lower case letter.

      *  The hyphen is preferred over the underscore (except in a
         "bareword" (Section 3.5.1), where the semantics may actually
         require an underscore).

      *  The period may be useful for larger specifications, to express
         some module structure (as in "tcp.throughput" vs.
         "udp.throughput").

      *  A number of names are predefined in the CDDL prelude, as listed
         in Appendix D.

      *  Rule names (types or groups) do not appear in the actual CBOR
         encoding, but names used as "barewords" in member keys do.

   o  Comments are started by a ';' (semicolon) character and finish at
      the end of a line (LF or CRLF).

   o  Outside strings, whitespace (spaces, newlines, and comments) is
      used to separate syntactic elements for readability (and to
      separate identifiers, range operators, or numbers that follow each
      other); it is otherwise completely optional.

   o  Hexadecimal numbers are preceded by '0x' (without quotes, lower
      case x), and are case insensitive. Similarly, binary numbers are
      preceded by '0b'.

   o  Text strings are enclosed by double quotation '"' characters.
      They follow the conventions for strings as defined in Section 7 of
      [RFC8259]. (ABNF users may want to note that there is no support
      in CDDL for the concept of case insensitivity in text strings; if
      necessary, regular expressions can be used (Section 3.8.3).)

   o  Byte strings are enclosed by single quotation "'" characters and
      may be prefixed by "h" or "b64". If unprefixed, the string is
      interpreted as with a text string, except that single quotes must
      be escaped and that the UTF-8 bytes resulting are marked as a byte
      string (major type 2). If prefixed as "h" or "b64", the string is
      interpreted as a sequence of pairs of hex digits (base16,
      Section 8 of [RFC4648]) or a base64(url) string (Sections 4 or 5
      of [RFC4648]), respectively (as with the diagnostic notation in
      Section 6 of [RFC7049]; cf. Appendix G.2); any white space present
      within the string (including comments) is ignored in the prefixed
      case.

   o  CDDL uses UTF-8 [RFC3629] for its encoding.

   Example:

      ; This is a comment
      person = { g }

      g = (
        "name": tstr,
        age: int, ; "age" is a bareword
      )

3.2. Occurrence

   An optional _occurrence_ indicator can be given in front of a group
   entry. It is either one of the characters '?'
   (optional), '*' (zero
   or more), or '+' (one or more), or is of the form n*m, where n and m
   are optional unsigned integers and n is the lower limit (default 0)
   and m is the upper limit (default no limit) of occurrences.

   If no occurrence indicator is specified, the group entry is to occur
   exactly once (as if 1*1 were specified). A group entry with an
   occurrence indicator matches sequences of name-value pairs that are
   composed by concatenating a number of sequences that the basic group
   entry matches, where the number needs to be allowed by the occurrence
   indicator.

   Note that CDDL, outside any directives/annotations that could
   possibly be defined, does not make any prescription as to whether
   arrays or maps use the definite length or indefinite length encoding.
   That is, there is no correlation between leaving the size of an array
   "open" in the spec and whether it is then interchanged with
   definite or indefinite length.

   Please also note that CDDL can describe flexibility that the data
   model of the target representation does not have. This is rather
   obvious for JSON, but also is relevant for CBOR:

      apartment = {
        kitchen: size,
        * bedroom: size,
      }
      size = float ; in m2

   The previous specification does not mean that CBOR is changed to
   allow the key "bedroom" to be used more than once. In other words,
   due to the restrictions imposed by the data model, the third line
   pretty much turns into:

      ? bedroom: size,

   (Occurrence indicators beyond one still are useful in maps for groups
   that allow a variety of keys.)

3.3. Predefined names for types

   CDDL predefines a number of names. This subsection summarizes these
   names, but please see Appendix D for the exact definitions.

   The following keywords for primitive datatypes are defined:

   "bool" Boolean value (major type 7, additional information 20 or
      21).

   "uint" An unsigned integer (major type 0).
726 "nint" A negative integer (major type 1). 728 "int" An unsigned integer or a negative integer. 730 "float16" A number representable as an IEEE 754 half-precision float 731 (major type 7, additional information 25). 733 "float32" A number representable as an IEEE 754 single-precision 734 float (major type 7, additional information 26). 736 "float64" A number representable as an IEEE 754 double-precision 737 float (major type 7, additional information 27). 739 "float" One of float16, float32, or float64. 741 "bstr" or "bytes" A byte string (major type 2). 743 "tstr" or "text" Text string (major type 3) 745 (Note that there are no predefined names for arrays or maps; these 746 are defined with the syntax given below.) 748 In addition, a number of types are defined in the prelude that are 749 associated with CBOR tags, such as "tdate", "bigint", "regexp" etc. 751 3.4. Arrays 753 Array definitions surround a group with square brackets. 755 For each entry, an occurrence indicator as specified in Section 3.2 756 is permitted. 758 For example: 760 unlimited-people = [* person] 761 one-or-two-people = [1*2 person] 762 at-least-two-people = [2* person] 763 person = ( 764 name: tstr, 765 age: uint, 766 ) 768 The group "person" is defined in such a way that repeating it in the 769 array each time generates alternating names and ages, so these are 770 four valid values for a data item of type "unlimited-people": 772 ["roundlet", 1047, "psychurgy", 2204, "extrarhythmical", 2231] 773 [] 774 ["aluminize", 212, "climograph", 4124] 775 ["penintime", 1513, "endocarditis", 4084, "impermeator", 1669, 776 "coextension", 865] 778 3.5. Maps 780 The syntax for specifying maps merits special attention, as well as a 781 number of optimizations and conveniences, as it is likely to be the 782 focal point of many specifications employing CDDL. While the syntax 783 does not strictly distinguish struct and table usage of maps, it 784 caters specifically to each of them. 
786 But first, let's reiterate a feature of CBOR that it has inherited 787 from JSON: The key/value pairs in CBOR maps have no fixed ordering. 788 (One could imagine situations where fixing the ordering may be of 789 use. For example, a decoder could look for values related with 790 integer keys 1, 3 and 7. If the order were fixed and the decoder 791 encounters the key 4 without having encountered key 3, it could 792 conclude that key 3 is not available without doing more complicated 793 bookkeeping. Unfortunately, neither JSON nor CBOR support this, so 794 no attempt was made to support this in CDDL either.) 796 3.5.1. Structs 798 The "struct" usage of maps is similar to the way JSON objects are 799 used in many JSON applications. 801 A map is defined in the same way as defining an array (see 802 Section 3.4), except for using curly braces "{}" instead of square 803 brackets "[]". 805 An occurrence indicator as specified in Section 3.2 is permitted for 806 each group entry. 808 The following is an example of a structure: 810 Geography = [ 811 city : tstr, 812 gpsCoordinates : GpsCoordinates, 813 ] 815 GpsCoordinates = { 816 longitude : uint, ; multiplied by 10^7 817 latitude : uint, ; multiplied by 10^7 818 } 820 When encoding, the Geography structure is encoded using a CBOR array 821 with two entries (the keys for the group entries are ignored), 822 whereas the GpsCoordinates are encoded as a CBOR map with two key/ 823 value pairs. 825 Types used in a structure can be defined in separate rules or just in 826 place (potentially placed inside parentheses, such as for choices). 827 E.g.: 829 located-samples = { 830 sample-point: int, 831 samples: [+ float], 832 } 834 where "located-samples" is the datatype to be used when referring to 835 the struct, and "sample-point" and "samples" are the keys to be used. 
836 This is actually a complete example: an identifier that is followed 837 by a colon can be directly used as the text string for a member key 838 (we speak of a "bareword" member key), as can a double-quoted string 839 or a number. (When other types, in particular ones that contain more 840 than one value, are used as the types of keys, they are followed by a 841 double arrow, see below.) 843 If a text string key does not match the syntax for an identifier (or 844 if the specifier just happens to prefer using double quotes), the 845 text string syntax can also be used in the member key position, 846 followed by a colon. The above example could therefore have been 847 written with quoted strings in the member key positions. 849 More generally, types specified in other ways than the cases 850 described above can be used in a keytype position by following them 851 with a double arrow -- in particular, the double arrow is necessary 852 if a type is named by an identifier (which, when followed by a colon, 853 would be interpreted as a "bareword" and turned into a text string). 854 A literal text string also gives rise to a type (which contains a 855 single value only -- the given string), so another form for this 856 example is: 858 located-samples = { 859 "sample-point" => int, 860 "samples" => [+ float], 861 } 863 See Section 3.5.4 below for how the colon shortcut described here 864 also adds some implied semantics. 866 A better way to demonstrate the double-arrow use may be: 868 located-samples = { 869 sample-point: int, 870 samples: [+ float], 871 * equipment-type => equipment-tolerances, 872 } 873 equipment-type = [name: tstr, manufacturer: tstr] 874 equipment-tolerances = [+ [float, float]] 876 The example below defines a struct with optional entries: display 877 name (as a text string), the name components first name and family 878 name (as text strings), and age information (as an unsigned integer). 880 PersonalData = { 881 ? 
displayName: tstr, 882 NameComponents, 883 ? age: uint, 884 } 886 NameComponents = ( 887 ? firstName: tstr, 888 ? familyName: tstr, 889 ) 891 Note that the group definition for NameComponents does not generate 892 another map; instead, all four keys are directly in the struct built 893 by PersonalData. 895 In this example, all key/value pairs are optional from the 896 perspective of CDDL. With no occurrence indicator, an entry is 897 mandatory. 899 If the addition of more entries not specified by the current 900 specification is desired, one can add this possibility explicitly: 902 PersonalData = { 903 ? displayName: tstr, 904 NameComponents, 905 ? age: uint, 906 * tstr => any 907 } 909 NameComponents = ( 910 ? firstName: tstr, 911 ? familyName: tstr, 912 ) 914 Figure 7: Personal Data: Example for extensibility 916 The CDDL tool reported on in Appendix F generated as one acceptable 917 instance for this specification: 919 {"familyName": "agust", "antiforeignism": "pretzel", 920 "springbuck": "illuminatingly", "exuviae": "ephemeris", 921 "kilometrage": "frogfish"} 923 (See Section 3.9 for one way to explicitly identify an extension 924 point.) 926 3.5.2. Tables 928 A table can be specified by defining a map with entries where the 929 keytype allows more than just a single value, e.g.: 931 square-roots = {* x => y} 932 x = int 933 y = float 935 Here, the key in each key/value pair has datatype x (defined as int), 936 and the value has datatype y (defined as float). 938 If the specification does not need to restrict one of x or y (i.e., 939 the application is free to choose per entry), it can be replaced by 940 the predefined name "any". 942 As another example, the following could be used as a conversion table 943 converting from an integer or float to a string: 945 tostring = {* mynumber => tstr} 946 mynumber = int / float 948 3.5.3. 
Non-deterministic order 950 While the way arrays are matched is fully determined by the Parsing 951 Expression Grammar (PEG) formalism (see Appendix A), matching is more 952 complicated for maps, as maps do not have an inherent order. For 953 each candidate name/value pair that the PEG algorithm would try, a 954 matching member is picked out of the entire map. For certain group 955 expressions, more than one member in the map may match. Most often, 956 this is inconsequential, as the group expression tends to consume all 957 matches: 959 labeled-values = { 960 ? fritz: number, 961 * label => value 962 } 963 label = text 964 value = number 966 Here, if any member with the key "fritz" is present, this will be 967 picked by the first entry of the group; all remaining text/number 968 members will be picked by the second entry (and if anything remains 969 unpicked, the map does not match). 971 However, it is possible to construct group expressions where what is 972 actually picked is indeterminate, and does matter: 974 do-not-do-this = { 975 int => int, 976 int => 6, 977 } 979 When this expression is matched against "{3: 5, 4: 6}", the first 980 group entry might pick off the "3: 5", leaving "4: 6" for matching 981 the second one. Or it might pick off "4: 6", leaving nothing that matches the 982 second entry. This pathological non-determinism is caused by 983 specifying the more general entry before the more specific one, and by having a general 984 rule that only consumes a subset of the map key/value pairs that it 985 is able to match -- both tend not to occur in real-world 986 specifications of maps. At the time of writing, CDDL tools cannot 987 detect such cases automatically, and for the present version of the 988 CDDL specification, the specification writer is simply urged not to 989 write pathologically non-deterministic specifications. 991 (The astute reader will be reminded of what was called "ambiguous 992 content models" in SGML and "non-deterministic content models" in 993 XML.
That problem is related to the one described here, but the 994 problem here is specifically caused by the lack of order in maps, 995 something that the XML schema languages do not have to contend with. 997 Note that Relax-NG's "interleave" pattern handles lack of order 998 explicitly on the specification side, while the instances in XML 999 always have determinate order.) 1001 3.5.4. Cuts in Maps 1003 The extensibility idiom discussed above for structs has one problem: 1005 extensible-map-example = { 1006 ? "optional-key" => int, 1007 * tstr => any 1008 } 1010 In this example, there is one optional key "optional-key", which, 1011 when present, maps to an integer. There is also a wild card for any 1012 future additions. 1014 Unfortunately, the data item 1016 { "optional-key": "nonsense" } 1018 does match this specification: While the first entry of the group 1019 does not match, the second one (the wildcard) does. This may be very 1020 well desirable (e.g., if a future extension is to be allowed to 1021 extend the type of "optional-key"), but in many cases isn't. 1023 In anticipation of a more general potential feature called "cuts", 1024 CDDL allows inserting a cut "^" into the definition of the map entry: 1026 extensible-map-example = { 1027 ? "optional-key" ^ => int, 1028 * tstr => any 1029 } 1031 A cut in this position means that once the member key matches the 1032 name part of an entry that carries a cut, other potential matches for 1033 the key of the member that occur in later entries in the group of the 1034 map are no longer allowed. In other words, when a group entry would 1035 pick a key/value pair based on just a matching key, it "locks in" the 1036 pick -- this rule applies independent of whether the value matches as 1037 well, so when it does not, the entire map fails to match. In 1038 summary, the example above no longer matches the specification as 1039 modified with the cut. 
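The locking behavior of a cut can be illustrated with a minimal Python sketch. This is not a CDDL implementation: the function name matches_extensible_map and its "cut" flag are hypothetical, the matcher is hard-wired to the two-entry map of the example above, and Python dicts stand in for CBOR maps.

```python
def matches_extensible_map(item: dict, cut: bool) -> bool:
    """Sketch of matching { ? "optional-key" [^] => int, * tstr => any }.

    Hypothetical helper for illustration only; `cut` selects whether the
    first entry carries a cut ("^").
    """
    for key, value in item.items():
        if key == "optional-key":
            # bool is a subclass of int in Python, so exclude it explicitly.
            if isinstance(value, int) and not isinstance(value, bool):
                continue  # picked by the first group entry
            if cut:
                # The key matched an entry with a cut: the pick is locked
                # in, so a non-matching value fails the entire map.
                return False
        # Without a cut (or for any other key), try the wildcard entry
        # "* tstr => any", which accepts any text string key.
        if not isinstance(key, str):
            return False
    return True
```

With cut=False, the data item { "optional-key": "nonsense" } falls through to the wildcard entry and is accepted; with cut=True, the same item is rejected, which is the behavior described for the specification as modified with the cut.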
1041 Since the desire for this kind of exclusive matching is so frequent, 1042 the ":" shortcut is actually defined to include the cut semantics. 1043 So the preceding example (including the cut) can be written more 1044 simply as: 1046 extensible-map-example = { 1047 ? "optional-key": int, 1048 * tstr => any 1049 } 1051 or even shorter, using a bareword for the key: 1053 extensible-map-example = { 1054 ? optional-key: int, 1055 * tstr => any 1056 } 1058 3.6. Tags 1060 A type can make use of a CBOR tag (major type 6) by using the 1061 representation type notation, giving #6.nnn(type) where nnn is an 1062 unsigned integer giving the tag number and "type" is the type of the 1063 data item being tagged. 1065 For example, the following line from the CDDL prelude (Appendix D) 1066 defines "biguint" as a type name for a positive bignum N: 1068 biguint = #6.2(bstr) 1070 The tags defined by [RFC7049] are included in the prelude. 1071 Additional tags since registered need to be added to a CDDL 1072 specification as needed; e.g., a binary UUID tag could be referenced 1073 as "buuid" in a specification after defining 1075 buuid = #6.37(bstr) 1077 In the following example, usage of the tag 32 for URIs is optional: 1079 my_uri = #6.32(tstr) / tstr 1081 3.7. Unwrapping 1083 The group that is used to define a map or an array can often be 1084 reused in the definition of another map or array. Similarly, a type 1085 defined as a tag carries an internal data item that one would like to 1086 refer to. In these cases, it is expedient to simply use the name of 1087 the map, array, or tag type as a handle for the group or type defined 1088 inside it. 1090 The "unwrap" operator (written by preceding a name by a tilde 1091 character "~") can be used to strip the type defined for a name by 1092 one layer, exposing the underlying group (for maps and arrays) or 1093 type (for tags). 1095 For example, an application might want to define a basic and an 1096 advanced header. 
Without unwrapping, this might be done as follows: 1098 basic-header-group = ( 1099 field1: int, 1100 field2: text, 1101 ) 1103 basic-header = [ basic-header-group ] 1105 advanced-header = [ 1106 basic-header-group, 1107 field3: bytes, 1108 field4: number, ; as in the tagged type "time" 1109 ] 1111 Unwrapping simplifies this to: 1113 basic-header = [ 1114 field1: int, 1115 field2: text, 1116 ] 1118 advanced-header = [ 1119 ~basic-header, 1120 field3: bytes, 1121 field4: ~time, 1122 ] 1124 (Note that leaving out the first unwrap operator in the latter 1125 example would lead to nesting the basic-header in its own array 1126 inside the advanced-header, while, with the unwrapped basic-header, 1127 the definition of the group inside basic-header is essentially 1128 repeated inside advanced-header, leading to a single array. This can 1129 be used for various applications often solved by inheritance in 1130 programming languages. The effect of unwrapping can also be 1131 described as "threading in" the group or type inside the referenced 1132 type, which suggested the thread-like "~" character.) 1134 3.8. Controls 1136 A _control_ allows relating a _target_ type with a _controller_ type 1137 via a _control operator_. 1139 The syntax for a control type is "target .control-operator 1140 controller", where control operators are special identifiers prefixed 1141 by a dot. (Note that _target_ or _controller_ might need to be 1142 parenthesized.) 1143 A number of control operators are defined at this point. Further 1144 control operators may be defined by new versions of this 1145 specification or by registering them according to the procedures in 1146 Section 6.1. 1148 3.8.1. Control operator .size 1150 A ".size" control restricts the size of the target, in bytes, to the values allowed by the 1151 control type. The control is defined for text and byte strings, 1152 where it directly controls the number of bytes in the string. It is 1153 also defined for unsigned integers (see below).
Figure 8 shows 1154 example usage for byte strings. 1156 full-address = [[+ label], ip4, ip6] 1157 ip4 = bstr .size 4 1158 ip6 = bstr .size 16 1159 label = bstr .size (1..63) 1161 Figure 8: Control for size in bytes 1163 When applied to an unsigned integer, the ".size" control restricts 1164 the range of that integer by giving a maximum number of bytes that 1165 should be needed in a computer representation of that unsigned 1166 integer. In other words, "uint .size N" is equivalent to 1167 "0...BYTES_N", where BYTES_N == 256**N. 1169 audio_sample = uint .size 3 ; 24-bit, equivalent to 0...16777216 1171 Figure 9: Control for integer size in bytes 1173 Note that, as with value restrictions in CDDL, this control is not a 1174 representation constraint; a number that fits into fewer bytes can 1175 still be represented in that form, and an inefficient implementation 1176 could use a longer form (unless that is restricted by some format 1177 constraints outside of CDDL, such as the rules in Section 3.9 of 1178 [RFC7049]). 1180 3.8.2. Control operator .bits 1182 A ".bits" control on a byte string indicates that, in the target, 1183 only the bits numbered by a number in the control type are allowed to 1184 be set. (Bits are counted the usual way, bit number "n" being set in 1185 "str" meaning that "(str[n >> 3] & (1 << (n & 7))) != 0".) 1186 Similarly, a ".bits" control on an unsigned integer "i" indicates 1187 that for all unsigned integers "n" where "(i & (1 << n)) != 0", "n" 1188 must be in the control type. 
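The bit-numbering rule in the preceding paragraph can be sketched in Python as follows. The helper names bstr_bits_ok and uint_bits_ok are assumptions made for this illustration, and the control type is modeled as a plain set of permitted bit numbers rather than as a CDDL type.

```python
def bstr_bits_ok(data: bytes, allowed: set) -> bool:
    """Byte string case: bit n is set when data[n >> 3] & (1 << (n & 7)) != 0.

    Hypothetical helper; `allowed` holds the permitted bit numbers.
    """
    for index, byte in enumerate(data):
        for bit in range(8):
            n = index * 8 + bit  # inverse of the (n >> 3, n & 7) split
            if byte & (1 << bit) and n not in allowed:
                return False
    return True


def uint_bits_ok(value: int, allowed: set) -> bool:
    """Unsigned integer case: every n with value & (1 << n) != 0 must be allowed."""
    if value < 0:
        raise ValueError("only unsigned integers are covered by this sketch")
    n = 0
    while value >> n:  # stop once no set bits remain at or above position n
        if value & (1 << n) and n not in allowed:
            return False
        n += 1
    return True
```

For instance, with the permitted bit numbers {0, 1, 2}, the unsigned integer 5 (bits 0 and 2 set) is acceptable, whereas 8 (bit 3 set) is not. An empty byte string is always acceptable, since no bit is set in it.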
1190 tcpflagbytes = bstr .bits flags 1191 flags = &( 1192 fin: 8, 1193 syn: 9, 1194 rst: 10, 1195 psh: 11, 1196 ack: 12, 1197 urg: 13, 1198 ece: 14, 1199 cwr: 15, 1200 ns: 0, 1201 ) / (4..7) ; data offset bits 1203 rwxbits = uint .bits rwx 1204 rwx = &(r: 2, w: 1, x: 0) 1206 Figure 10: Control for what bits can be set 1208 The CDDL tool reported on in Appendix F generates the following ten 1209 example instances for "tcpflagbytes": 1211 h'906d' h'01fc' h'8145' h'01b7' h'013d' h'409f' h'018e' h'c05f' 1212 h'01fa' h'01fe' 1214 These examples do not illustrate that the above CDDL specification 1215 does not explicitly specify a size of two bytes: A valid all clear 1216 instance of flag bytes could be "h''" or "h'00'" or even "h'000000'" 1217 as well. 1219 3.8.3. Control operator .regexp 1221 A ".regexp" control indicates that the text string given as a target 1222 needs to match the XSD regular expression given as a value in the 1223 control type. XSD regular expressions are defined in Appendix F of 1224 [W3C.REC-xmlschema-2-20041028]. 1226 nai = tstr .regexp "[A-Za-z0-9]+@[A-Za-z0-9]+(\\.[A-Za-z0-9]+)+" 1228 Figure 11: Control with an XSD regexp 1230 An example matching this regular expression: 1232 "N1@CH57HF.4Znqe0.dYJRN.igjf" 1234 3.8.3.1. Usage considerations 1236 Note that XSD regular expressions do not support the usual \x or \u 1237 escapes for hexadecimal expression of bytes or unicode code points. 1238 However, in CDDL the XSD regular expressions are contained in text 1239 strings, the literal notation for which provides \u escapes; this 1240 should suffice for most applications that use regular expressions for 1241 text strings. (Note that this also means that there is one level of 1242 string escaping before the XSD escaping rules are applied.) 1244 XSD regular expressions support character class subtraction, a 1245 feature often not found in regular expression libraries; 1246 specification writers may want to use this feature sparingly. 
1247 Similar considerations apply to Unicode character classes; where 1248 these are used, the specification that employs CDDL SHOULD identify 1249 which Unicode versions are addressed. 1251 Other surprises for infrequent users of XSD regular expressions may 1252 include: 1254 o No direct support for case insensitivity. While case 1255 insensitivity has gone mostly out of fashion in protocol design, 1256 it is sometimes needed and then needs to be expressed manually as 1257 in "[Cc][Aa][Ss][Ee]". 1259 o The support for popular character classes such as \w and \d is 1260 based on Unicode character properties, which is often not what is 1261 desired in an ASCII-based protocol and thus might lead to 1262 surprises. (\s and \S do have their more conventional meanings, 1263 and "." matches any character but the line ending characters \r or 1264 \n.) 1266 3.8.3.2. Discussion 1268 There are many flavors of regular expression in use in the 1269 programming community. For instance, perl-compatible regular 1270 expressions (PCRE) are widely used and probably are more useful than 1271 XSD regular expressions. However, there is no normative reference 1272 for PCRE that could be used in the present document. Instead, we opt 1273 for XSD regular expressions for now. There is precedent for that 1274 choice in the IETF, e.g., in YANG [RFC7950]. 1276 Note that CDDL uses controls as its main extension point. This 1277 creates the opportunity to add further regular expression formats in 1278 addition to the one referenced here if desired. As an example, a 1279 control ".pcre" is defined in [I-D.bormann-cbor-cddl-freezer]. 1281 3.8.4. Control operators .cbor and .cborseq 1283 A ".cbor" control on a byte string indicates that the byte string 1284 carries a CBOR encoded data item. Decoded, the data item matches the 1285 type given as the right-hand side argument (type1 in the following 1286 example). 
1288 "bytes .cbor type1" 1290 Similarly, a ".cborseq" control on a byte string indicates that the 1291 byte string carries a sequence of CBOR encoded data items. When the 1292 data items are taken as an array, the array matches the type given as 1293 the right-hand side argument (type2 in the following example). 1295 "bytes .cborseq type2" 1297 (The conversion of the encoded sequence to an array can be effected 1298 for instance by wrapping the byte string between the two bytes 0x9f 1299 and 0xff and decoding the wrapped byte string as a CBOR encoded data 1300 item.) 1302 3.8.5. Control operators .within and .and 1304 A ".and" control on a type indicates that the data item matches both 1305 that left hand side type and the type given as the right hand side. 1306 (Formally, the resulting type is the intersection of the two types 1307 given.) 1309 "type1 .and type2" 1311 A variant of the ".and" control is the ".within" control, which 1312 expresses an additional intent: the left hand side type is meant to 1313 be a subset of the right-hand-side type. 1315 "type1 .within type2" 1317 While both forms have the identical formal semantics (intersection), 1318 the intention of the ".within" form is that the right hand side gives 1319 guidance to the types allowed on the left hand side, which typically 1320 is a socket (Section 3.9): 1322 message = $message .within message-structure 1323 message-structure = [message_type, *message_option] 1324 message_type = 0..255 1325 message_option = any 1327 $message /= [3, dough: text, topping: [* text]] 1328 $message /= [4, noodles: text, sauce: text, parmesan: bool] 1329 For ".within", a tool might flag an error if type1 allows data items 1330 that are not allowed by type2. In contrast, for ".and", there is no 1331 expectation that type1 already is a subset of type2. 1333 3.8.6. 
Control operators .lt, .le, .gt, .ge, .eq, .ne, and .default 1335 The controls .lt, .le, .gt, .ge, .eq, .ne specify a constraint on the 1336 left hand side type to be a value less than, less than or equal, 1337 greater than, greater than or equal, equal, or not equal, to a value 1338 given as a right hand side type (containing just that single value). 1339 In the present specification, the first four controls (.lt, .le, .gt, 1340 .ge) are defined only for numeric types, as these have a natural 1341 ordering relationship. 1343 speed = number .ge 0 ; unit: m/s 1345 .ne and .eq are defined both for numeric values and values of other 1346 types. If one of the values is not of a numeric type, equality is 1347 determined as follows: Text strings are equal (satisfy .eq/do not 1348 satisfy .ne) if they are byte-wise identical; the same applies for 1349 byte strings. Arrays are equal if they have the same number of 1350 elements, all of which are equal pairwise in order between the 1351 arrays. Maps are equal if they have the same number of key/value 1352 pairs, and there is pairwise equality between the key/value pairs 1353 between the two maps. Tagged values are equal if they both have the 1354 same tag and the values are equal. Values of simple types match if 1355 they are the same values. Numeric types that occur within arrays, 1356 maps, or tagged values are equal if their numeric value is equal and 1357 they are both integers or both floating point values. All other 1358 cases are not equal (e.g., comparing a text string with a byte 1359 string). 1361 A variant of the ".ne" control is the ".default" control, which 1362 expresses an additional intent: the value specified by the right- 1363 hand-side type is intended as a default value for the left hand side 1364 type given, and the implied .ne control is there to prevent this 1365 value from being sent over the wire. 
This control is only meaningful 1366 when the control type is used in an optional context; otherwise there 1367 would be no way to express the default value. 1369 timer = { 1370 time: uint, 1371 ? displayed-step: (number .gt 0) .default 1 1372 } 1374 3.9. Socket/Plug 1376 Both for type choices and group choices, a mechanism is defined that 1377 facilitates starting out with empty choices and assembling them 1378 later, potentially in separate files that are concatenated to build 1379 the full specification. 1381 Per convention, CDDL extension points are marked with a leading 1382 dollar sign (types) or two leading dollar signs (groups). Tools 1383 honor that convention by not raising an error if such a type or group 1384 is not defined at all; the symbol is then taken to be an empty type 1385 choice (group choice), i.e., no choice is available. 1387 tcp-header = {seq: uint, ack: uint, * $$tcp-option} 1389 ; later, in a different file 1391 $$tcp-option //= ( 1392 sack: [+(left: uint, right: uint)] 1393 ) 1395 ; and, maybe in another file 1397 $$tcp-option //= ( 1398 sack-permitted: true 1399 ) 1401 Names that start with a single "$" are "type sockets", starting out 1402 as an empty type, and intended to be extended via "/=". Names that 1403 start with a double "$$" are "group sockets", starting out as an 1404 empty group choice, and intended to be extended via "//=". In either 1405 case, it is not an error if there is no definition for a socket at 1406 all; this then means there is no way to satisfy the rule (i.e., the 1407 choice is empty). 1409 As a convention, all definitions (plugs) for socket names must be 1410 augmentations, i.e., they must be using "/=" and "//=", respectively. 1412 To pick up the example illustrated in Figure 7, the socket/plug 1413 mechanism could be used as shown in Figure 12: 1415 PersonalData = { 1416 ? displayName: tstr, 1417 NameComponents, 1418 ? age: uint, 1419 * $$personaldata-extensions 1420 } 1422 NameComponents = ( 1423 ? 
firstName: tstr, 1424 ? familyName: tstr, 1425 ) 1427 ; The above already works as is. 1428 ; But then, we can add later: 1430 $$personaldata-extensions //= ( 1431 favorite-salsa: tstr, 1432 ) 1434 ; and again, somewhere else: 1436 $$personaldata-extensions //= ( 1437 shoesize: uint, 1438 ) 1440 Figure 12: Personal Data example: Using socket/plug extensibility 1442 3.10. Generics 1444 Using angle brackets, the left hand side of a rule can add formal 1445 parameters after the name being defined, as in: 1447 messages = message<"reboot", "now"> / message<"sleep", 1..100> 1448 message<t, v> = {type: t, value: v} 1450 When using a generic rule, the formal parameters are bound to the 1451 actual arguments supplied (also using angle brackets), within the 1452 scope of the generic rule (as if there were a rule of the form 1453 parameter = argument). 1455 Generic rules can be used for establishing names for both types and 1456 groups. 1458 (There are some limitations to nesting of generics in the tool 1459 described in Appendix F at this time.) 1461 3.11. Operator Precedence 1463 As with any language that has multiple syntactic features such as 1464 prefix and infix operators, CDDL has operators that bind more tightly 1465 than others. This is more complicated than, say, in ABNF, 1466 as CDDL has both types and groups, with operators that are specific 1467 to these concepts. Type operators (such as "/" for type choice) 1468 operate on types, while group operators (such as "//" for group 1469 choice) operate on groups. Types can simply be used in groups, but 1470 groups need to be bracketed (as arrays or maps) to become types. So, 1471 type operators naturally bind more tightly than group operators. 1473 For instance, in 1475 t = [group1] 1476 group1 = (a / b // c / d) 1477 a = 1 b = 2 c = 3 d = 4 1479 group1 is a group choice between the type choice of a and b and the 1480 type choice of c and d.
This becomes more relevant once member keys 1481 and/or occurrences are added in: 1483 t = {group2} 1484 group2 = (? ab: a / b // cd: c / d) 1485 a = 1 b = 2 c = 3 d = 4 1487 is a group choice between the optional member "ab" of type a or b and 1488 the member "cd" of type c or d. Note that the optionality is 1489 attached to the first choice ("ab"), not to the second choice. 1491 Similarly, in 1493 t = [group3] 1494 group3 = (+ a / b / c) 1495 a = 1 b = 2 c = 3 1497 group3 is a repetition of a type choice between a, b, and c; if just 1498 a is to be repeatable, a group choice is needed to focus the 1499 occurrence: 1501 (A comment has been that this could be counter-intuitive. The 1502 specification writer is encouraged to use parentheses liberally to 1503 guide readers that are not familiar with CDDL precedence rules.) 1505 t = [group4] 1506 group4 = (+ a // b / c) 1507 a = 1 b = 2 c = 3 1508 group4 is a group choice between a repeatable a and a single b or c. 1510 In general, as with many other languages with operator precedence 1511 rules, it is best not to rely on them, but to insert parentheses for 1512 readability: 1514 t = [group4a] 1515 group4a = ((+ a) // (b / c)) 1516 a = 1 b = 2 c = 3 1518 The operator precedences, in sequence of loose to tight binding, are 1519 defined in Appendix B and summarized in Table 1. (Arities given are 1520 1 for unary prefix operators and 2 for binary infix operators.) 1522 +----------+----+---------------------------+------+ 1523 | Operator | Ar | Operates on | Prec | 1524 +----------+----+---------------------------+------+ 1525 | = | 2 | name = type, name = group | 1 | 1526 | /= | 2 | name /= type | 1 | 1527 | //= | 2 | name //= group | 1 | 1528 | // | 2 | group // group | 2 | 1529 | , | 2 | group, group | 3 | 1530 | * | 1 | * group | 4 | 1531 | N*M | 1 | N*M group | 4 | 1532 | + | 1 | + group | 4 | 1533 | ? | 1 | ? 
group | 4 | 1534 | => | 2 | type => type | 5 | 1535 | : | 2 | name: type | 5 | 1536 | / | 2 | type / type | 6 | 1537 | .. | 2 | type..type | 7 | 1538 | ... | 2 | type...type | 7 | 1539 | .ctrl | 2 | type .ctrl type | 7 | 1540 | & | 1 | &group | 8 | 1541 | ~ | 1 | ~type | 8 | 1542 +----------+----+---------------------------+------+ 1544 Table 1: Summary of operator precedences 1546 4. Making Use of CDDL 1548 In this section, we discuss several potential ways to employ CDDL. 1550 4.1. As a guide to a human user 1552 CDDL can be used to efficiently define the layout of CBOR data, such 1553 that a human implementer can easily see how data is supposed to be 1554 encoded. 1556 Since CDDL maps parts of the CBOR data to human readable names, tools 1557 could be built that use CDDL to provide a human friendly 1558 representation of the CBOR data, and allow users to edit such data 1559 while remaining compliant with its CDDL definition. 1561 4.2. For automated checking of CBOR data structure 1563 CDDL has been specified such that a machine can handle the CDDL 1564 definition and related CBOR data (and, thus, also JSON data). For 1565 example, a machine could use CDDL to check whether or not CBOR data 1566 is compliant with its definition. 1568 The need for thoroughness of such compliance checking depends on the 1569 application. For example, an application may decide not to check the 1570 data structure at all, and use the CDDL definition solely as a means 1571 to indicate the structure of the data to the programmer. 1573 At the other extreme, the application may also implement a checking 1574 mechanism that goes as far as checking that all mandatory map members 1575 are available. 1577 The extent to which the data description must be enforced by an 1578 application is left to the designers and implementers of that 1579 application, keeping in mind related security considerations. 1581 In no case is the intention that a CDDL tool would be "writing code" 1582 for an implementation.
1584  4.3. For data analysis tools

1586     In the long run, it can be expected that more and more data will be
1587     stored using the CBOR data format.

1589     Where there is data, there is data analysis and the need to process
1590     such data automatically. CDDL can be used for such automated data
1591     processing, allowing tools to verify data, clean it, and extract
1592     particular parts of interest from it.

1594     Since CBOR is designed with constrained devices in mind, a likely use
1595     of it would be in small sensors. An interesting use would thus be
1596     automated analysis of sensor data.

1598  5. Security considerations

1600     This document presents a content rules language for expressing CBOR
1601     data structures. As such, it does not bring any security issues by
1602     itself, although specifications of protocols that use CBOR naturally
1603     need security analyses when defined. General guidelines for writing
1604     security considerations are defined in the
1606     Security Considerations Guidelines [RFC3552] (BCP 72).
1607     Specifications using CDDL to define CBOR structures in protocols need
1608     to follow those guidelines. Additional topics that could be
1609     considered in a security considerations section for a specification
1610     that uses CDDL to define CBOR structures include the following:

1612     o  Where could the language cause confusion in a way that will
1613        enable security issues?

1615     o  Where a CDDL matcher is part of the implementation of a system,
1616        the security of the system ought not depend on the correctness of
1617        the CDDL specification or CDDL implementation without any further
1618        defenses in place.

1620     o  Where the CDDL includes extension points, the impact of extensions
1621        on the security of the system needs to be carefully considered.

1623     Writers of CDDL specifications are strongly encouraged to value
1624     simplicity and transparency of the specification over its elegance.
1625     Keep it as simple as possible while still expressing the needed data
1626     model.

1628     A related observation about formal description techniques in general
1629     that is strongly recommended to be kept in mind by writers of CDDL
1630     specifications: Just because CDDL makes it easier to handle
1631     complexity in a specification, that does not make that complexity
1632     somehow less bad (except maybe on the level of the humans having to
1633     grasp the complex structure while reading the spec).

1635  6. IANA Considerations

1637  6.1. CDDL control operator registry

1639     IANA is requested to create a registry for control operators
1640     (Section 3.8). The name of this registry is "CDDL Control Operators".

1642     Each entry in the registry must include the name of the control
1643     operator (by convention given with the leading dot) and a reference
1644     to its documentation. Names must be composed of the leading dot
1645     followed by a text string conforming to the production "id" in
1646     Appendix B.

1648     Initial entries in this registry are as follows:

1650                    +----------+---------------+
1651                    | name     | documentation |
1652                    +----------+---------------+
1653                    | .size    | [RFCthis]     |
1654                    | .bits    | [RFCthis]     |
1655                    | .regexp  | [RFCthis]     |
1656                    | .cbor    | [RFCthis]     |
1657                    | .cborseq | [RFCthis]     |
1658                    | .within  | [RFCthis]     |
1659                    | .and     | [RFCthis]     |
1660                    | .lt      | [RFCthis]     |
1661                    | .le      | [RFCthis]     |
1662                    | .gt      | [RFCthis]     |
1663                    | .ge      | [RFCthis]     |
1664                    | .eq      | [RFCthis]     |
1665                    | .ne      | [RFCthis]     |
1666                    | .default | [RFCthis]     |
1667                    +----------+---------------+

1669     All other control operator names are Unassigned.

1671     The IANA policy for additions to this registry is "Specification
1672     Required" as defined in [RFC8126] (which involves an Expert Review)
1673     for names that do not include an internal dot, and "IETF Review" for
1674     names that do include an internal dot. The Expert is specifically
1675     instructed that other Standards Development Organizations (SDOs) may
1676     want to define control operators that are specific to their fields
1677     (e.g., based on a binary syntax already in use at the SDO); the
1678     review process should strive to facilitate such an undertaking.

1680  7. References

1682  7.1. Normative References

1684     [ISO6093]  ISO, "Information processing -- Representation of
1685                numerical values in character strings for information
1686                interchange", ISO 6093, 1985.

1688     [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
1689                Requirement Levels", BCP 14, RFC 2119,
1690                DOI 10.17487/RFC2119, March 1997.

1693     [RFC3552]  Rescorla, E. and B. Korver, "Guidelines for Writing RFC
1694                Text on Security Considerations", BCP 72, RFC 3552,
1695                DOI 10.17487/RFC3552, July 2003.

1698     [RFC3629]  Yergeau, F., "UTF-8, a transformation format of ISO
1699                10646", STD 63, RFC 3629, DOI 10.17487/RFC3629, November
1700                2003.

1702     [RFC4648]  Josefsson, S., "The Base16, Base32, and Base64 Data
1703                Encodings", RFC 4648, DOI 10.17487/RFC4648, October 2006.

1706     [RFC5234]  Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax
1707                Specifications: ABNF", STD 68, RFC 5234,
1708                DOI 10.17487/RFC5234, January 2008.

1711     [RFC7049]  Bormann, C. and P. Hoffman, "Concise Binary Object
1712                Representation (CBOR)", RFC 7049, DOI 10.17487/RFC7049,
1713                October 2013.

1715     [RFC7493]  Bray, T., Ed., "The I-JSON Message Format", RFC 7493,
1716                DOI 10.17487/RFC7493, March 2015.

1719     [RFC8126]  Cotton, M., Leiba, B., and T. Narten, "Guidelines for
1720                Writing an IANA Considerations Section in RFCs", BCP 26,
1721                RFC 8126, DOI 10.17487/RFC8126, June 2017.

1724     [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
1725                2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
1726                May 2017.
1728     [RFC8259]  Bray, T., Ed., "The JavaScript Object Notation (JSON) Data
1729                Interchange Format", STD 90, RFC 8259,
1730                DOI 10.17487/RFC8259, December 2017.

1733     [W3C.REC-xmlschema-2-20041028]
1734                Biron, P. and A. Malhotra, "XML Schema Part 2: Datatypes
1735                Second Edition", World Wide Web Consortium Recommendation
1736                REC-xmlschema-2-20041028, October 2004.

1739  7.2. Informative References

1741     [I-D.bormann-cbor-cddl-freezer]
1742                Bormann, C., "A feature freezer for the Concise Data
1743                Definition Language (CDDL)",
1744                draft-bormann-cbor-cddl-freezer-01 (work in progress),
                   August 2018.

1746     [I-D.ietf-anima-grasp]
1747                Bormann, C., Carpenter, B., and B. Liu, "A Generic
1748                Autonomic Signaling Protocol (GRASP)",
1749                draft-ietf-anima-grasp-15 (work in progress), July 2017.

1751     [I-D.newton-json-content-rules]
1752                Newton, A. and P. Cordell, "A Language for Rules
1753                Describing JSON Content",
1754                draft-newton-json-content-rules-09 (work in progress),
                   September 2017.

1756     [PEG]      Ford, B., "Parsing expression grammars", Proceedings of
1757                the 31st ACM SIGPLAN-SIGACT symposium on Principles of
1758                programming languages - POPL '04,
1759                DOI 10.1145/964001.964011, 2004.

1761     [RELAXNG]  ISO/IEC, "Information technology -- Document Schema
1762                Definition Language (DSDL) -- Part 2:
1763                Regular-grammar-based validation -- RELAX NG",
1764                ISO/IEC 19757-2, December 2008.

1766     [RFC7071]  Borenstein, N. and M. Kucherawy, "A Media Type for
1767                Reputation Interchange", RFC 7071, DOI 10.17487/RFC7071,
1768                November 2013.

1770     [RFC7950]  Bjorklund, M., Ed., "The YANG 1.1 Data Modeling Language",
1771                RFC 7950, DOI 10.17487/RFC7950, August 2016.

1774     [RFC8007]  Murray, R. and B. Niven-Jenkins, "Content Delivery Network
1775                Interconnection (CDNI) Control Interface / Triggers",
1776                RFC 8007, DOI 10.17487/RFC8007, December 2016.
1779     [RFC8152]  Schaad, J., "CBOR Object Signing and Encryption (COSE)",
1780                RFC 8152, DOI 10.17487/RFC8152, July 2017.

1783     [RFC8428]  Jennings, C., Shelby, Z., Arkko, J., Keranen, A., and C.
1784                Bormann, "Sensor Measurement Lists (SenML)", RFC 8428,
1785                DOI 10.17487/RFC8428, August 2018.

1788  7.3. URIs

1790     [1] https://github.com/cabo/cbor-diag

1792  Appendix A. Parsing Expression Grammars (PEG)

1794     This appendix is normative.

1796     Since the 1950s, many grammar notations have been based on Backus-Naur
1797     Form (BNF), a notation for context-free grammars (CFGs) within
1798     Chomsky's generative system of grammars. ABNF [RFC5234], the Augmented
1799     Backus-Naur Form widely used in IETF specifications and also inspiring
1800     the syntax of CDDL, is an example of this.

1802     Generative grammars can express ambiguity well, but this very
1803     property may make them hard to use in recognition systems, spawning a
1804     number of subdialects that pose constraints on generative grammars to
1805     be used with parser generators, which may be hard to manage for the
1806     specification writer.

1808     Parsing Expression Grammars [PEG] provide an alternative formal
1809     foundation for describing grammars that emphasizes recognition over
1810     generation, and resolves what would have been ambiguity in generative
1811     systems by introducing the concept of "prioritized choice".

1813     The notation for Parsing Expression Grammars is quite close to BNF,
1814     with the usual "Extended BNF" features such as repetition added.
1815     However, where BNF uses the unordered (symmetrical) choice operator
1816     "|" (incidentally notated as "/" in ABNF), PEG provides a prioritized
1817     choice operator "/". The two alternatives listed are to be tested in
1818     left-to-right order, locking in the first successful match and
1819     disregarding any further potential matches within the choice (but not
1820     disabling alternatives in choices containing this choice, as a "cut"
1821     would; see Section 3.5.4).
1823 For example, the ABNF expressions 1825 A = "a" "b" / "a" (1) 1827 and 1829 A = "a" / "a" "b" (2) 1831 are equivalent in ABNF's original generative framework, but very 1832 different in PEG: In (2), the second alternative will never match, as 1833 any input string starting with an "a" will already succeed in the 1834 first alternative, locking in the match. 1836 Similarly, the occurrence indicators ("?", "*", "+") are "greedy" in 1837 PEG, i.e., they consume as much input as they match (and, as a 1838 consequence, "a* a" in PEG notation or "*a a" in CDDL syntax never 1839 can match anything as all input matching "a" is already consumed by 1840 the initial "a*", leaving nothing to match the second "a"). 1842 Incidentally, the grammar of the CDDL language itself, as written in 1843 ABNF in Appendix B, can be interpreted both in the generative 1844 framework on which RFC 5234 is based, and as a PEG. This was made 1845 possible by ordering the choices in the grammar such that a 1846 successful match made on the left hand side of a "/" operator is 1847 always the intended match, instead of relying on the power of 1848 symmetrical choices (for example, note the sequence of alternatives 1849 in the rule for "uint", where the lone zero is behind the longer 1850 match alternatives that start with a zero). 1852 The syntax used for expressing the PEG component of CDDL is based on 1853 ABNF, interpreted in the obvious way with PEG semantics. The ABNF 1854 convention of notating occurrence indicators before the controlled 1855 primary, and of allowing numeric values for minimum and maximum 1856 occurrence around a "*" sign, is copied. While PEG is only about 1857 characters, CDDL has a richer set of elements, such as types and 1858 groups. 
Specifically, the following constructs map:

1860     +-------+-------+-------------------------------------------+
1861     | CDDL  | PEG   | Remark                                    |
1862     +-------+-------+-------------------------------------------+
1863     | "="   | "<-"  | /= and //= are abbreviations              |
1864     | "//"  | "/"   | prioritized choice                        |
1865     | "/"   | "/"   | prioritized choice, limited to types only |
1866     | "?" P | P "?" | zero or one                               |
1867     | "*" P | P "*" | zero or more                              |
1868     | "+" P | P "+" | one or more                               |
1869     | A B   | A B   | sequence                                  |
1870     | A, B  | A B   | sequence, comma is decoration only        |
1871     +-------+-------+-------------------------------------------+

1873     The literal notation and the use of square brackets, curly braces,
1874     tildes, ampersands, and hash marks are specific to CDDL and unrelated
1875     to the conventional PEG notation. The DOT (".") is replaced by the
1876     unadorned "#" or its alias "any". Also, CDDL does not provide the
1877     syntactic predicate operators NOT ("!") or AND ("&") from PEG,
1878     reducing expressiveness as well as complexity.

1880     For more details about PEG's theoretical foundation and interesting
1881     properties of the operators such as associativity and distributivity,
1882     the reader is referred to [PEG].

1884  Appendix B. ABNF grammar

1886     This appendix is normative.

1888     The following is a formal definition of the CDDL syntax in Augmented
1889     Backus-Naur Form (ABNF, [RFC5234]). Note that, as is defined in
1890     ABNF, the quote-delimited strings below are case-insensitive (while
1891     string values and names are case-sensitive in CDDL).
1893       cddl = S 1*(rule S)
1894       rule = typename [genericparm] S assignt S type
1895            / groupname [genericparm] S assigng S grpent

1897       typename = id
1898       groupname = id

1900       assignt = "=" / "/="
1901       assigng = "=" / "//="

1903       genericparm = "<" S id S *("," S id S ) ">"
1904       genericarg = "<" S type1 S *("," S type1 S ) ">"

1906       type = type1 *(S "/" S type1)

1908       type1 = type2 [S (rangeop / ctlop) S type2]
1909       ; space may be needed before the operator if type2 ends in a name

1911       type2 = value
1912             / typename [genericarg]
1913             / "(" S type S ")"
1914             / "{" S group S "}"
1915             / "[" S group S "]"
1916             / "~" S typename [genericarg]
1917             / "&" S "(" S group S ")"
1918             / "&" S groupname [genericarg]
1919             / "#" "6" ["." uint] "(" S type S ")"
1920             / "#" DIGIT ["." uint]                ; major/ai
1921             / "#"                                 ; any

1923       rangeop = "..." / ".."

1925       ctlop = "." id

1927       group = grpchoice *(S "//" S grpchoice)

1929       grpchoice = *(grpent optcom)

1931       grpent = [occur S] [memberkey S] type
1932              / [occur S] groupname [genericarg]  ; preempted by above
1933              / [occur S] "(" S group S ")"

1935       memberkey = type1 S ["^" S] "=>"
1936                 / bareword S ":"
1937                 / value S ":"

1939       bareword = id

1941       optcom = S ["," S]

1943       occur = [uint] "*" [uint]
1944             / "+"
1945             / "?"

1947       uint = DIGIT1 *DIGIT
1948            / "0x" 1*HEXDIG
1949            / "0b" 1*BINDIG
1950            / "0"

1952       value = number
1953             / text
1954             / bytes

1956       int = ["-"] uint

1958       ; This is a float if it has fraction or exponent; int otherwise
1959       number = hexfloat / (int ["." fraction] ["e" exponent ])
1960       hexfloat = "0x" 1*HEXDIG ["." 1*HEXDIG] "p" exponent
1961       fraction = 1*DIGIT
1962       exponent = ["+"/"-"] 1*DIGIT

1964       text = %x22 *SCHAR %x22
1965       SCHAR = %x20-21 / %x23-5B / %x5D-7E / %x80-10FFFD / SESC
1966       SESC = "\" (%x20-7E / %x80-10FFFD)

1968       bytes = [bsqual] %x27 *BCHAR %x27
1969       BCHAR = %x20-26 / %x28-5B / %x5D-10FFFD / SESC / CRLF
1970       bsqual = "h" / "b64"

1972       id = EALPHA *(*("-" / ".") (EALPHA / DIGIT))
1973       ALPHA = %x41-5A / %x61-7A
1974       EALPHA = ALPHA / "@" / "_" / "$"
1975       DIGIT = %x30-39
1976       DIGIT1 = %x31-39
1977       HEXDIG = DIGIT / "A" / "B" / "C" / "D" / "E" / "F"
1978       BINDIG = %x30-31
1979       S = *WS
1980       WS = SP / NL
1981       SP = %x20
1982       NL = COMMENT / CRLF
1983       COMMENT = ";" *PCHAR CRLF
1984       PCHAR = %x20-7E / %x80-10FFFD
1985       CRLF = %x0A / %x0D.0A

1987                          Figure 13: CDDL ABNF

1989     Note that this ABNF does not attempt to reflect the detailed rules of
1990     what can be in a prefixed byte string.

1992  Appendix C. Matching rules

1994     This appendix is normative.

1996     In this appendix, we go through the ABNF syntax rules defined in
1997     Appendix B and briefly describe the matching semantics of each
1998     syntactic feature. In this context, an instance (data item)
1999     "matches" a CDDL specification if it is allowed by the CDDL
2000     specification; this is then broken down to parts of specifications
2001     (type and group expressions) and parts of instances (data items).

2003       cddl = S 1*(rule S)

2005     A CDDL specification is a sequence of one or more rules. Each rule
2006     gives a name to a right hand side expression, either a CDDL type or a
2007     CDDL group. Rule names can be used in the rule itself and/or other
2008     rules (and tools can output warnings if that is not the case). The
2009     order of the rules is significant only in two cases:

2011     1. The first rule defines the semantics of the entire specification;
2012        hence, there is no need to give that root rule a special name or
2013        special syntax in the language (as, e.g., with "start" in
2014        RELAX NG); its name can therefore be chosen to be descriptive. (As
2015        with all other rule names, the name of the initial rule may be
2016        used in itself or in other rules.)

2018     2. Where a rule contributes to a type or group choice (using "/=" or
2019        "//="), that choice is populated in the order the rules are
2020        given; see below.

2022       rule = typename [genericparm] S assignt S type
2023            / groupname [genericparm] S assigng S grpent

2025       typename = id
2026       groupname = id

2027     A rule defines a name for a type expression (production "type") or
2028     for a group expression (production "grpent"), with the intention that
2029     the semantics does not change when the name is replaced by its
2030     (parenthesized if needed) definition. Note that whether the name
2031     defined by a rule stands for a type or a group isn't always
2032     determined by syntax alone: e.g., "a = b" can make "a" a type if "b"
2033     is a type, or a group if "b" is a group. More subtly, in "a = (b)",
2034     "a" may be used as a type if "b" is a type, or as a group both when
2035     "b" is a group and when "b" is a type (a good convention to make the
2036     latter case stand out to the human reader is to write "a = (b,)").
2037     (Note that the same dual meaning of parentheses applies within an
2038     expression, but often can be resolved by the context of the
2039     parenthesized expression. On the more general point, it may not be
2040     clear immediately either whether "b" stands for a group or a type --
2041     this semantic processing may need to span several levels of rule
2042     definitions before a determination can be made.)
2044 assignt = "=" / "/=" 2045 assigng = "=" / "//=" 2047 A plain equals sign defines the rule name as the equivalent of the 2048 expression to the right; it is an error if the name already was 2049 defined with a different expression. A "/=" or "//=" extends a named 2050 type or a group by additional choices; a number of these could be 2051 replaced by collecting all the right hand sides and creating a single 2052 rule with a type choice or a group choice built from the right hand 2053 sides in the order of the rules given. (It is not an error to extend 2054 a rule name that has not yet been defined; this makes the right hand 2055 side the first entry in the choice being created.) 2057 genericparm = "<" S id S *("," S id S ) ">" 2058 genericarg = "<" S type1 S *("," S type1 S ) ">" 2060 Rule names can have generic parameters, which cause temporary 2061 assignments within the right hand sides to the parameter names from 2062 the arguments given when citing the rule name. 2064 type = type1 *(S "/" S type1) 2066 A type can be given as a choice between one or more types. The 2067 choice matches a data item if the data item matches any one of the 2068 types given in the choice. The choice uses Parsing Expression 2069 Grammar semantics as discussed in Appendix A: The first choice that 2070 matches wins. (As a result, the order of rules that contribute to a 2071 single rule name can very well matter.) 2073 type1 = type2 [S (rangeop / ctlop) S type2] 2074 Two types can be combined with a range operator (which see below) or 2075 a control operator (see Section 3.8). 
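The effect of "/=" and "//=" and of first-match choice semantics can be sketched as follows. This is an illustrative Python model only, not part of CDDL or of any CDDL tool: the rule triples and the names "collect_choices" and "first_match" are hypothetical, and types are modeled as plain predicates.

```python
def collect_choices(rules):
    """Fold "=", "/=" and "//=" rules into one ordered choice per rule name."""
    choices = {}
    for name, op, rhs in rules:
        if op == "=":
            # Defining a name again with a different expression is an error.
            if name in choices and choices[name] != [rhs]:
                raise ValueError(f"{name} is already defined differently")
            choices[name] = [rhs]
        elif op in ("/=", "//="):
            # Extending a not-yet-defined name starts a new choice.
            choices.setdefault(name, []).append(rhs)
        else:
            raise ValueError(f"unknown assignment operator {op!r}")
    return choices


def first_match(alternatives, item):
    """Return the label of the first alternative whose predicate accepts item."""
    for label, predicate in alternatives:
        if predicate(item):
            return label
    return None
```

Because the first matching alternative wins, an alternative listed after a more general one is never selected, which is why the order in which rules contribute to a name can matter.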
2077       type2 = value

2079     A type can be just a single value (such as 1 or "icecream" or
2080     h'0815'), which matches only a data item with that specific value (no
2081     conversions defined),

2083             / typename [genericarg]

2085     or be defined by a rule giving a meaning to a name (possibly after
2086     supplying generic arguments as required by the generic parameters),

2088             / "(" S type S ")"

2090     or be defined in a parenthesized type expression (parentheses may be
2091     necessary to override some operator precedence), or

2093             / "{" S group S "}"

2095     a map expression, which matches a valid CBOR map the key/value pairs
2096     of which can be ordered in such a way that the resulting sequence
2097     matches the group expression, or

2099             / "[" S group S "]"

2101     an array expression, which matches a CBOR array the elements of
2102     which, when taken as values and complemented by a wildcard (matches
2103     anything) key each, match the group, or

2105             / "~" S typename [genericarg]

2107     an "unwrapped" group (see Section 3.7), which matches the group
2108     inside a type defined as a map or an array by wrapping the group, or

2110             / "&" S "(" S group S ")"
2111             / "&" S groupname [genericarg]

2113     an enumeration expression, which matches any value that is within
2114     the set of values that the values of the group given can take, or

2116             / "#" "6" ["." uint] "(" S type S ")"

2118     a tagged data item, tagged with the "uint" given and containing the
2119     type given as the tagged value, or

2121             / "#" DIGIT ["." uint]                ; major/ai

2123     a data item of a major type (given by the DIGIT), optionally
2124     constrained to the additional information given by the uint, or

2126             / "#"                                 ; any

2128     any data item.

2130       rangeop = "..." / ".."
2132 A range operator can be used to join two type expressions that stand 2133 for either two integer values or two floating point values; it 2134 matches any value that is between the two values, where the first 2135 value is always included in the matching set and the second value is 2136 included for ".." and excluded for "...". 2138 ctlop = "." id 2140 A control operator ties a _target_ type to a _controller_ type as 2141 defined in Section 3.8. Note that control operators are an extension 2142 point for CDDL; additional documents may want to define additional 2143 control operators. 2145 group = grpchoice *(S "//" S grpchoice) 2147 A group matches any sequence of key/value pairs that matches any of 2148 the choices given (again using Parsing Expression Grammar semantics). 2150 grpchoice = *(grpent optcom) 2152 Each of the component groups is given as a sequence of group entries. 2153 For a match, the sequence of key/value pairs given needs to match the 2154 sequence of group entries in the sequence given. 2156 grpent = [occur S] [memberkey S] type 2158 A group entry can be given by a value type, which needs to be matched 2159 by the value part of a single element, and optionally a memberkey 2160 type, which needs to be matched by the key part of the element, if 2161 the memberkey is given. If the memberkey is not given, the entry can 2162 only be used for matching arrays, not for maps. (See below how that 2163 is modified by the occurrence indicator.) 2165 / [occur S] groupname [genericarg] ; preempted by above 2167 A group entry can be built from a named group, or 2169 / [occur S] "(" S group S ")" 2171 from a parenthesized group, again with a possible occurrence 2172 indicator. 
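As a small aside on the "rangeop" production described earlier in this appendix, the inclusive and exclusive forms can be modeled as below. This is a minimal Python sketch under the assumption that both bounds are already plain numbers; "in_range" is a hypothetical helper name, not part of CDDL.

```python
def in_range(value, low, high, op=".."):
    """Model of a CDDL range: the lower bound is always included."""
    if op == "..":
        # ".." includes the upper bound.
        return low <= value <= high
    if op == "...":
        # "..." excludes the upper bound.
        return low <= value < high
    raise ValueError(f"unknown range operator {op!r}")
```

For example, under this model the value 10 lies within 0..10 but not within 0...10.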
2174 memberkey = type1 S ["^" S] "=>" 2175 / bareword S ":" 2176 / value S ":" 2178 Key types can be given by a type expression, a bareword (which stands 2179 for a type that just contains a string value created from this 2180 bareword), or a value (which stands for a type that just contains 2181 this value). A key value matches its key type if the key value is a 2182 member of the key type, unless a cut preceding it in the group 2183 applies (see Section 3.5.4 how map matching is influenced by the 2184 presence of the cuts denoted by "^" or ":" in previous entries). 2186 bareword = id 2188 A bareword is an alternative way to write a type with a single text 2189 string value; it can only be used in the syntactic context given 2190 above. 2192 optcom = S ["," S] 2194 (Optional commas do not influence the matching.) 2196 occur = [uint] "*" [uint] 2197 / "+" 2198 / "?" 2200 An occurrence indicator modifies the group given to its right by 2201 requiring the group to match the sequence to be matched exactly for a 2202 certain number of times (see Section 3.2) in sequence, i.e. it acts 2203 as a (possibly infinite) group choice that contains choices with the 2204 group repeated each of the occurrences times. 2206 The rest of the ABNF describes syntax for value notation that should 2207 be familiar from programming languages, with the possible exception 2208 of h'..' and b64'..' for byte strings, as well as syntactic elements 2209 such as comments and line ends. 2211 Appendix D. Standard Prelude 2213 This appendix is normative. 2215 The following prelude is automatically added to each CDDL file. 2216 (Note that technically, it is a postlude, as it does not disturb the 2217 selection of the first rule as the root of the definition.) 
2218       any = #

2220       uint = #0
2221       nint = #1
2222       int = uint / nint

2224       bstr = #2
2225       bytes = bstr
2226       tstr = #3
2227       text = tstr

2229       tdate = #6.0(tstr)
2230       time = #6.1(number)
2231       number = int / float
2232       biguint = #6.2(bstr)
2233       bignint = #6.3(bstr)
2234       bigint = biguint / bignint
2235       integer = int / bigint
2236       unsigned = uint / biguint
2237       decfrac = #6.4([e10: int, m: integer])
2238       bigfloat = #6.5([e2: int, m: integer])
2239       eb64url = #6.21(any)
2240       eb64legacy = #6.22(any)
2241       eb16 = #6.23(any)
2242       encoded-cbor = #6.24(bstr)
2243       uri = #6.32(tstr)
2244       b64url = #6.33(tstr)
2245       b64legacy = #6.34(tstr)
2246       regexp = #6.35(tstr)
2247       mime-message = #6.36(tstr)
2248       cbor-any = #6.55799(any)

2250       float16 = #7.25
2251       float32 = #7.26
2252       float64 = #7.27
2253       float16-32 = float16 / float32
2254       float32-64 = float32 / float64
2255       float = float16-32 / float64

2257       false = #7.20
2258       true = #7.21
2259       bool = false / true
2260       nil = #7.22
2261       null = nil
2262       undefined = #7.23

2264                         Figure 14: CDDL Prelude

2266     Note that the prelude is deemed to be fixed. This means, for
2267     instance, that additional tags beyond [RFC7049], as registered, need
2268     to be defined in each CDDL file that is using them.

2270     A common stumbling point is that the prelude does not define a type
2271     "string". CBOR has byte strings ("bytes" in the prelude) and text
2272     strings ("text"), so a type that is simply called "string" would be
2273     ambiguous.

2275  Appendix E. Use with JSON

2277     This appendix is normative.

2279     The JSON generic data model (implicit in [RFC8259]) is a subset of
2280     the generic data model of CBOR. So one can use CDDL with JSON by
2281     limiting oneself to what can be represented in JSON.
Roughly 2282 speaking, this means leaving out byte strings, tags, and simple 2283 values other than "false", "true", and "null", leading to the 2284 following limited prelude: 2286 any = # 2288 uint = #0 2289 nint = #1 2290 int = uint / nint 2292 tstr = #3 2293 text = tstr 2295 number = int / float 2297 float16 = #7.25 2298 float32 = #7.26 2299 float64 = #7.27 2300 float16-32 = float16 / float32 2301 float32-64 = float32 / float64 2302 float = float16-32 / float64 2304 false = #7.20 2305 true = #7.21 2306 bool = false / true 2307 nil = #7.22 2308 null = nil 2310 Figure 15: JSON compatible subset of CDDL Prelude 2312 (The major types given here do not have a direct meaning in JSON, but 2313 they can be interpreted as CBOR major types translated through 2314 Section 4 of [RFC7049].) 2316 There are a few fine points in using CDDL with JSON. First, JSON 2317 does not distinguish between integers and floating point numbers; 2318 there is only one kind of number (which may happen to be integral). 2319 In this context, specifying a type as "uint", "nint" or "int" then 2320 becomes a predicate that the number be integral. As an example, this 2321 means that the following JSON numbers are all matching "uint": 2323 10 10.0 1e1 1.0e1 100e-1 2325 (The fact that these are all integers may be surprising to users 2326 accustomed to the long tradition in programming languages of using 2327 decimal points or exponents in a number to indicate a floating point 2328 literal.) 2330 CDDL distinguishes the various CBOR number types, but there is only 2331 one number type in JSON. The effect of specifying a floating point 2332 precision (float16/float32/float64) is only to restrict the set of 2333 permissible values to those expressible with binary16/binary32/ 2334 binary64; this is unlikely to be very useful when using CDDL for 2335 specifying JSON data structures. 
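The integrality predicate described above for JSON numbers can be sketched as follows. This is an illustrative Python model, not a CDDL implementation: "matches_uint" is a hypothetical name, the JSON number is taken as its literal text, and the upper bound of 2**64 - 1 is an assumption reflecting the 64-bit unsigned range of CBOR major type 0.

```python
from decimal import Decimal

UINT_MAX = 2**64 - 1  # assumed upper bound of "uint" (64-bit unsigned)


def matches_uint(json_number_text):
    """True if the JSON number literal denotes an integral value in uint range."""
    value = Decimal(json_number_text)  # exact decimal parse, no float rounding
    return value == value.to_integral_value() and 0 <= value <= UINT_MAX
```

With this model, the literals 10, 10.0, 1e1, 1.0e1 and 100e-1 are all accepted, while 10.5 and -1 are rejected.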
2337 Fundamentally, the number system of JSON itself is based on decimal 2338 numbers and decimal fractions and does not have limits to its 2339 precision or range. In practice, JSON numbers are often parsed into 2340 a number type that is called float64 here, creating a number of 2341 limitations to the generic data model [RFC7493]. In particular, this 2342 means that integers can only be expressed with interoperable 2343 exactness when they lie in the range [-(2**53)+1, (2**53)-1] -- a 2344 smaller range than that covered by CDDL "int". 2346 JSON applications that want to stay compatible with I-JSON 2347 ([RFC7493], "Internet JSON") therefore may want to define integer 2348 types with more limited ranges, such as in Figure 16. Note that the 2349 types given here are not part of the prelude; they need to be copied 2350 into the CDDL specification if needed. 2352 ij-uint = 0..9007199254740991 2353 ij-nint = -9007199254740991..-1 2354 ij-int = -9007199254740991..9007199254740991 2356 Figure 16: I-JSON types for CDDL (not part of prelude) 2358 JSON applications that do not need to stay compatible with I-JSON and 2359 that actually may need to go beyond the 64-bit unsigned and negative 2360 integers supported by "int" (= "uint"/"nint") may want to use the 2361 following additional types from the standard prelude, which are 2362 expressed in terms of tags but can straightforwardly be mapped into 2363 JSON (but not I-JSON) numbers: 2365 biguint = #6.2(bstr) 2366 bignint = #6.3(bstr) 2367 bigint = biguint / bignint 2368 integer = int / bigint 2369 unsigned = uint / biguint 2371 CDDL at this point does not have a way to express the unlimited 2372 floating point precision that is theoretically possible with JSON; at 2373 the time of writing, this is rarely used in protocols in practice. 
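The bounds used in Figure 16 can be checked numerically. The following Python sketch is for illustration only; "IJ_MAX" and "is_ij_int" are hypothetical names, and the float comparison merely demonstrates why integers beyond 2**53 - 1 lose exactness when parsed into a binary64 number.

```python
IJ_MAX = 2**53 - 1  # the bound used by ij-uint, ij-nint and ij-int


def is_ij_int(value):
    """True if value is an integer within the interoperable I-JSON range."""
    # bool is a subclass of int in Python, so it is excluded explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        return False
    return -IJ_MAX <= value <= IJ_MAX


def survives_float64(value):
    """True if the integer is unchanged by a round trip through binary64."""
    return int(float(value)) == value
```

Here 2**53 - 1 equals 9007199254740991 and survives the round trip, whereas 2**53 + 1 does not.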
2375 Note that a data model described in CDDL is always restricted by what 2376 can be expressed in the serialization; e.g., floating point values 2377 such as NaN (not a number) and the infinities cannot be represented 2378 in JSON even if they are allowed in the CDDL generic data model. 2380 Appendix F. A CDDL tool 2382 This appendix is for information only. 2384 A rough CDDL tool is available. For CDDL specifications, it can 2385 check the syntax, generate one or more instances (expressed in CBOR 2386 diagnostic notation or in pretty-printed JSON), and validate an 2387 existing instance against the specification: 2389 Usage: 2390 cddl spec.cddl generate [n] 2391 cddl spec.cddl json-generate [n] 2392 cddl spec.cddl validate instance.cbor 2393 cddl spec.cddl validate instance.json 2395 Figure 17: CDDL tool usage 2397 Install on a system with a modern Ruby via: 2399 gem install cddl 2401 Figure 18: CDDL tool installation 2403 The accompanying CBOR diagnostic tools (which are automatically 2404 installed by the above) are described in https://github.com/cabo/ 2405 cbor-diag [1]; they can be used to convert between binary CBOR, a 2406 pretty-printed form of that, CBOR diagnostic notation, JSON, and 2407 YAML. 2409 Appendix G. Extended Diagnostic Notation 2411 This appendix is normative. 2413 Section 6 of [RFC7049] defines a "diagnostic notation" in order to be 2414 able to converse about CBOR data items without having to resort to 2415 binary data. Diagnostic notation is based on JSON, with extensions 2416 for representing CBOR constructs such as binary data and tags. 2418 (Standardizing this together with the actual interchange format does 2419 not serve to create another interchange format, but enables the use 2420 of a shared diagnostic notation in tools for and documents about 2421 CBOR.) 2423 This section discusses a few extensions to the diagnostic notation 2424 that have turned out to be useful since RFC 7049 was written. 
We 2425 refer to the result as extended diagnostic notation (EDN). 2427 G.1. White space in byte string notation 2429 Examples often benefit from some white space (spaces, line breaks) in 2430 byte strings. In extended diagnostic notation, white space is 2431 ignored in prefixed byte strings; for instance, the following are 2432 equivalent: 2434 h'48656c6c6f20776f726c64' 2435 h'48 65 6c 6c 6f 20 77 6f 72 6c 64' 2436 h'4 86 56c 6c6f 2437 20776 f726c64' 2439 G.2. Text in byte string notation 2441 Diagnostic notation notates byte strings in one of the [RFC4648] base 2442 encodings, enclosed in single quotes, prefixed by >h< for base16, 2443 >b32< for base32, >h32< for base32hex, >b64< for base64 or base64url. 2444 Quite often, byte strings carry bytes that are meaningfully 2445 interpreted as UTF-8 text. Extended diagnostic notation allows the 2446 use of single quotes without a prefix to express byte strings with 2447 UTF-8 text; for instance, the following are equivalent: 2449 'hello world' 2450 h'68656c6c6f20776f726c64' 2452 The escaping rules of JSON strings are applied equivalently for text- 2453 based byte strings, e.g., \\ stands for a single backslash and \' 2454 stands for a single quote. White space is included literally, i.e., 2455 the previous section does not apply to text-based byte strings. 2457 G.3. Embedded CBOR and CBOR sequences in byte strings 2459 Where a byte string is to carry an embedded CBOR-encoded item, or 2460 more generally a sequence of zero or more such items, the diagnostic 2461 notation for these zero or more CBOR data items, separated by 2462 commata, can be enclosed in << and >> to notate the byte string 2463 resulting from encoding the data items and concatenating the result. 2464 For instance, the two columns in each of the following rows are equivalent: 2466 <<1>> h'01' 2467 <<1, 2>> h'0102' 2468 <<"foo", null>> h'63666F6FF6' 2469 <<>> h'' 2471 G.4.
Concatenated Strings 2473 While the ability to include white space enables line-breaking of 2474 encoded byte strings, a mechanism is needed to include 2475 text strings as well as byte strings in direct UTF-8 representation 2476 into line-based documents (such as RFCs and source code). 2478 We extend the diagnostic notation by allowing multiple text strings 2479 or multiple byte strings to be notated separated by white space; 2480 these are then concatenated into a single text or byte string, 2481 respectively. Text strings and byte strings do not mix within such a 2482 concatenation, except that byte string notation can be used inside a 2483 sequence of concatenated text string notation to encode characters 2484 that may be better represented in an encoded way. The following four 2485 values are equivalent: 2487 "Hello world" 2488 "Hello " "world" 2489 "Hello" h'20' "world" 2490 "" h'48656c6c6f20776f726c64' "" 2492 Similarly, the following byte string values are equivalent: 2494 'Hello world' 2495 'Hello ' 'world' 2496 'Hello ' h'776f726c64' 2497 'Hello' h'20' 'world' 2498 '' h'48656c6c6f20776f726c64' '' b64'' 2499 h'4 86 56c 6c6f' h' 20776 f726c64' 2501 (Note that the approach of separating by white space, while familiar 2502 from the C language, requires some attention - a single comma makes a 2503 big difference here.) 2505 G.5. Hexadecimal, octal, and binary numbers 2507 In addition to JSON's decimal numbers, EDN provides hexadecimal, 2508 octal, and binary numbers in the usual C-language notation (octal only 2509 in the form with the 0o prefix). 2511 The following are equivalent: 2513 4711 2514 0x1267 2515 0o11147 2516 0b1001001100111 2518 As are: 2520 1.5 2521 0x1.8p0 2522 0x18p-4 2524 G.6. Comments 2526 Longer pieces of diagnostic notation may benefit from comments. JSON 2527 famously does not provide for comments, and basic RFC 7049 diagnostic 2528 notation inherits this property.
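The lack of comments in JSON is easy to observe with a stock parser. The following sketch shows Python's json module rejecting text that carries the slash-delimited annotations described in the remainder of this section, and accepting it once the annotations are replaced by white space. The helper strip_edn_comments is hypothetical and deliberately naive: it assumes that no "/" occurs inside a string literal, and it handles only input that is otherwise plain JSON, not EDN in general.

```python
import json
import re


def strip_edn_comments(text):
    """Replace every /.../ pair, including the slashes, by a space.

    Naive by design: a "/" inside a string literal would be misread.
    """
    return re.sub(r"/[^/]*/", " ", text)


def plain_json_rejects(text):
    """True if the standard JSON parser refuses the text."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return True
    return False


annotated = '/grasp-message/ [/M_DISCOVERY/ 1, /session-id/ 10584416]'

rejected = plain_json_rejects(annotated)              # JSON has no comments
parsed = json.loads(strip_edn_comments(annotated))    # comments act as white space
```

A real EDN parser would instead recognize comments in its tokenizer, so that slashes inside text strings are left alone.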
2530 In extended diagnostic notation, comments can be included, delimited 2531 by slashes ("/"). Any text within and including a pair of slashes is 2532 considered a comment. 2534 Comments are considered white space. Hence, they are allowed in 2535 prefixed byte strings; for instance, the following are equivalent: 2537 h'68656c6c6f20776f726c64' 2538 h'68 65 6c /doubled l!/ 6c 6f /hello/ 2539 20 /space/ 2540 77 6f 72 6c 64' /world/ 2542 This can be used to annotate a CBOR structure as in: 2544 /grasp-message/ [/M_DISCOVERY/ 1, /session-id/ 10584416, 2545 /objective/ [/objective-name/ "opsonize", 2546 /D, N, S/ 7, /loop-count/ 105]] 2548 (There are currently no end-of-line comments. If we want to add 2549 them, "//" sounds like a reasonable delimiter given that we already 2550 use slashes for comments, but we also could go e.g. for "#".) 2552 Appendix H. Examples 2554 This appendix is for information only. 2556 This section contains a few examples of structures defined using 2557 CDDL. 2559 The theme for the first example is taken from [RFC7071], which 2560 defines certain JSON structures in English. For a similar example, 2561 it may also be of interest to examine Appendix A of [RFC8007], which 2562 contains a CDDL definition for a JSON structure defined in the main 2563 body of the RFC. 2565 The second subsection in this appendix translates examples from 2566 [I-D.newton-json-content-rules] into CDDL. 2568 These examples all happen to describe data that is interchanged in 2569 JSON. Examples for CDDL definitions of data that is interchanged in 2570 CBOR can be found in [RFC8152], [I-D.ietf-anima-grasp], or [RFC8428]. 2572 H.1. RFC 7071 2574 [RFC7071] defines the Reputon structure for JSON using somewhat 2575 formalized English text. 
Here is a (somewhat verbose) equivalent 2576 definition using the same terms, but notated in CDDL: 2578 reputation-object = { 2579 reputation-context, 2580 reputon-list 2581 } 2583 reputation-context = ( 2584 application: text 2585 ) 2587 reputon-list = ( 2588 reputons: reputon-array 2589 ) 2591 reputon-array = [* reputon] 2593 reputon = { 2594 rater-value, 2595 assertion-value, 2596 rated-value, 2597 rating-value, 2598 ? conf-value, 2599 ? normal-value, 2600 ? sample-value, 2601 ? gen-value, 2602 ? expire-value, 2603 * ext-value, 2604 } 2606 rater-value = ( rater: text ) 2607 assertion-value = ( assertion: text ) 2608 rated-value = ( rated: text ) 2609 rating-value = ( rating: float16 ) 2610 conf-value = ( confidence: float16 ) 2611 normal-value = ( normal-rating: float16 ) 2612 sample-value = ( sample-size: uint ) 2613 gen-value = ( generated: uint ) 2614 expire-value = ( expires: uint ) 2615 ext-value = ( text => any ) 2617 An equivalent, more compact form of this example would be: 2619 reputation-object = { 2620 application: text 2621 reputons: [* reputon] 2622 } 2624 reputon = { 2625 rater: text 2626 assertion: text 2627 rated: text 2628 rating: float16 2629 ? confidence: float16 2630 ? normal-rating: float16 2631 ? sample-size: uint 2632 ? generated: uint 2633 ? expires: uint 2634 * text => any 2635 } 2637 Note how this rather clearly delineates the structure somewhat 2638 shrouded by so many words in Section 6.2.2 of [RFC7071]. Also, this 2639 definition makes it clear that several ext-values are allowed (by 2640 definition with different member names); RFC 7071 could be read to 2641 forbid the repetition of ext-value ("A specific reputon-element MUST 2642 NOT appear more than once" is ambiguous).
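To make the reading of the compact definition above concrete, the following hand-written Python sketch checks a parsed JSON value against it. It is not how the CDDL tool works and is no substitute for a CDDL validator; all function names are hypothetical. It assumes, for simplicity, that any non-boolean JSON number satisfies the float16 members, which a real validator may judge more strictly.

```python
def is_text(v):
    return isinstance(v, str)


def is_number(v):
    # Assumption: any non-boolean JSON number is accepted for float16 members.
    return isinstance(v, (int, float)) and not isinstance(v, bool)


def is_uint(v):
    return isinstance(v, int) and not isinstance(v, bool) and v >= 0


# Members without an occurrence indicator must be present.
REQUIRED = {"rater": is_text, "assertion": is_text,
            "rated": is_text, "rating": is_number}
# Members marked "?" may be absent, but are type-checked if present.
OPTIONAL = {"confidence": is_number, "normal-rating": is_number,
            "sample-size": is_uint, "generated": is_uint, "expires": is_uint}


def is_reputon(obj):
    if not isinstance(obj, dict):
        return False
    if any(name not in obj or not check(obj[name])
           for name, check in REQUIRED.items()):
        return False
    if any(name in obj and not check(obj[name])
           for name, check in OPTIONAL.items()):
        return False
    # "* text => any": any number of further members with text names.
    return all(is_text(name) for name in obj)


def is_reputation_object(obj):
    return (isinstance(obj, dict)
            and set(obj) == {"application", "reputons"}
            and is_text(obj["application"])
            and isinstance(obj["reputons"], list)
            and all(is_reputon(r) for r in obj["reputons"]))
```

Because the named members are checked before the extension rule is applied, a member such as "rating" with a text value is rejected rather than being absorbed by "* text => any".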
2644 The CDDL tool reported on in Appendix F generates as one example: 2646 { 2647 "application": "conchometry", 2648 "reputons": [ 2649 { 2650 "rater": "Ephthianura", 2651 "assertion": "codding", 2652 "rated": "sphaerolitic", 2653 "rating": 0.34133473256800795, 2654 "confidence": 0.9481983064298332, 2655 "expires": 1568, 2656 "unplaster": "grassy" 2657 }, 2658 { 2659 "rater": "nonchargeable", 2660 "assertion": "raglan", 2661 "rated": "alienage", 2662 "rating": 0.5724646875815566, 2663 "sample-size": 3514, 2664 "Aldebaran": "unchurched", 2665 "puruloid": "impersonable", 2666 "uninfracted": "pericarpoidal", 2667 "schorl": "Caro" 2668 }, 2669 { 2670 "rater": "precollectable", 2671 "assertion": "Merat", 2672 "rated": "thermonatrite", 2673 "rating": 0.19164006323936977, 2674 "confidence": 0.6065252103391268, 2675 "normal-rating": 0.5187773690879303, 2676 "generated": 899, 2677 "speedy": "solidungular", 2678 "noviceship": "medicine", 2679 "checkrow": "epidictic" 2680 } 2681 ] 2682 } 2684 H.2. Examples from JSON Content Rules 2686 Although JSON Content Rules [I-D.newton-json-content-rules] seems to 2687 address a more general problem than CDDL, it is still a worthwhile 2688 resource to explore for examples (beyond all the inspiration the 2689 format itself has had for CDDL). 2691 Figure 2 of the JCR I-D looks very similar, if slightly less noisy, 2692 in CDDL: 2694 root = [2*2 { 2695 precision: text, 2696 Latitude: float, 2697 Longitude: float, 2698 Address: text, 2699 City: text, 2700 State: text, 2701 Zip: text, 2702 Country: text 2703 }] 2705 Figure 19: JCR, Figure 2, in CDDL 2707 Apart from the lack of a need to quote the member names, text strings 2708 are called "text" or "tstr" in CDDL ("string" would be ambiguous as 2709 CBOR also provides byte strings). 
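Two properties of Figure 19 are easy to overlook: the occurrence indicator "2*2" demands exactly two array elements, and the map has no extension rule, so members beyond the eight listed are not allowed. The short Python sketch below spells this out for a value parsed by Python's json module. The names LOCATION_MEMBERS, is_location, and is_root are hypothetical, and the sketch assumes that Latitude and Longitude were written with a fraction or exponent, so that the parser yields Python float values.

```python
# Member name -> expected Python type after json.loads().
LOCATION_MEMBERS = {
    "precision": str, "Latitude": float, "Longitude": float,
    "Address": str, "City": str, "State": str, "Zip": str, "Country": str,
}


def is_location(obj):
    """Exactly the eight members of Figure 19, each of the listed type."""
    return (isinstance(obj, dict)
            and set(obj) == set(LOCATION_MEMBERS)
            and all(isinstance(obj[name], kind)
                    for name, kind in LOCATION_MEMBERS.items()))


def is_root(value):
    """root = [2*2 {...}]: an array of exactly two matching maps."""
    return (isinstance(value, list) and len(value) == 2
            and all(is_location(item) for item in value))
```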
2711 The CDDL tool reported on in Appendix F creates the below example 2712 instance for this: 2714 [{"precision": "pyrosphere", "Latitude": 0.5399712314350172, 2715 "Longitude": 0.5157523963028087, "Address": "resow", 2716 "City": "problemwise", "State": "martyrlike", "Zip": "preprove", 2717 "Country": "Pace"}, 2718 {"precision": "unrigging", "Latitude": 0.10422704368372193, 2719 "Longitude": 0.6279808663725834, "Address": "picturedom", 2720 "City": "decipherability", "State": "autometry", "Zip": "pout", 2721 "Country": "wimple"}] 2723 Figure 4 of the JCR I-D in CDDL: 2725 root = { image } 2727 image = ( 2728 Image: { 2729 size, 2730 Title: text, 2731 thumbnail, 2732 IDs: [* int] 2733 } 2734 ) 2736 size = ( 2737 Width: 0..1280 2738 Height: 0..1024 2739 ) 2741 thumbnail = ( 2742 Thumbnail: { 2743 size, 2744 Url: ~uri 2745 } 2746 ) 2748 This shows how the group concept can be used to keep related elements 2749 (here: width, height) together, and to emulate the JCR style of 2750 specification. (It also shows referencing a type by unwrapping a tag 2751 from the prelude, "uri" - this could be done differently.) The more 2752 compact form of Figure 5 of the JCR I-D could be emulated like this: 2754 root = { 2755 Image: { 2756 size, Title: text, 2757 Thumbnail: { size, Url: ~uri }, 2758 IDs: [* int] 2759 } 2760 } 2762 size = ( 2763 Width: 0..1280, 2764 Height: 0..1024, 2765 ) 2767 The CDDL tool reported on in Appendix F creates the below example 2768 instance for this: 2770 {"Image": {"Width": 566, "Height": 516, "Title": "leisterer", 2771 "Thumbnail": {"Width": 1111, "Height": 176, "Url": 32("scrog")}, 2772 "IDs": []}} 2774 Contributors 2776 CDDL was originally conceived by Bert Greevenbosch, who also wrote 2777 the original five versions of this document. 
2779 Acknowledgements 2781 Inspiration was taken from the C and Pascal languages, MPEG's 2782 conventions for describing structures in the ISO base media file 2783 format, Relax-NG and its compact syntax [RELAXNG], and in particular 2784 from Andrew Lee Newton's "JSON Content Rules" 2785 [I-D.newton-json-content-rules]. 2787 Lots of highly useful feedback came from members of the IETF CBOR WG, 2788 in particular Ari Keraenen, Brian Carpenter, Burt Harris, Jeffrey 2789 Yasskin, Jim Hague, Jim Schaad, Joe Hildebrand, Max Pritikin, Michael 2790 Richardson, Pete Cordell, Sean Leonard, and Yaron Sheffer. Also, 2791 Francesca Palombini and Joe volunteered to chair the WG when it was 2792 created, providing the framework for generating and processing this 2793 feedback; with Barry Leiba having taken over from Joe since. Chris 2794 Lonvick and Ines Robles provided additional reviews during IESG 2795 processing, and Alexey Melnikov steered the process as the 2796 responsible area director. 2798 The CDDL tool reported on in Appendix F was written by Carsten 2799 Bormann, building on previous work by Troy Heninger and Tom Lord. 2801 Authors' Addresses 2803 Henk Birkholz 2804 Fraunhofer SIT 2805 Rheinstrasse 75 2806 Darmstadt 64295 2807 Germany 2809 Email: henk.birkholz@sit.fraunhofer.de 2811 Christoph Vigano 2812 Universitaet Bremen 2814 Email: christoph.vigano@uni-bremen.de 2815 Carsten Bormann 2816 Universitaet Bremen TZI 2817 Bibliothekstr. 1 2818 Bremen D-28359 2819 Germany 2821 Phone: +49-421-218-63921 2822 Email: cabo@tzi.org