CBOR                                                         H. Birkholz
Internet-Draft                                            Fraunhofer SIT
Intended status: Standards Track                               C. Vigano
Expires: September 25, 2019                          Universitaet Bremen
                                                              C. Bormann
                                                 Universitaet Bremen TZI
                                                          March 24, 2019

  Concise data definition language (CDDL): a notational convention to
                 express CBOR and JSON data structures
                        draft-ietf-cbor-cddl-08

Abstract

   This document proposes a notational convention to express CBOR data
   structures (RFC 7049, Concise Binary Object Representation). Its
   main goal is to provide an easy and unambiguous way to express
   structures for protocol messages and data formats that use CBOR or
   JSON.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on September 25, 2019.

Copyright Notice

   Copyright (c) 2019 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.
   Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1. Introduction
     1.1. Requirements notation
     1.2. Terminology
   2. The Style of Data Structure Specification
     2.1. Groups and Composition in CDDL
       2.1.1. Usage
       2.1.2. Syntax
     2.2. Types
       2.2.1. Values
       2.2.2. Choices
       2.2.3. Representation Types
       2.2.4. Root type
   3. Syntax
     3.1. General conventions
     3.2. Occurrence
     3.3. Predefined names for types
     3.4. Arrays
     3.5. Maps
       3.5.1. Structs
       3.5.2. Tables
       3.5.3. Non-deterministic order
       3.5.4. Cuts in Maps
     3.6. Tags
     3.7. Unwrapping
     3.8. Controls
       3.8.1. Control operator .size
       3.8.2. Control operator .bits
       3.8.3. Control operator .regexp
       3.8.4. Control operators .cbor and .cborseq
       3.8.5. Control operators .within and .and
       3.8.6. Control operators .lt, .le, .gt, .ge, .eq, .ne, and
              .default
     3.9. Socket/Plug
     3.10. Generics
     3.11. Operator Precedence
   4. Making Use of CDDL
     4.1. As a guide to a human user
     4.2. For automated checking of CBOR data structure
     4.3. For data analysis tools
   5. Security considerations
   6. IANA Considerations
     6.1. CDDL control operator registry
   7. References
     7.1. Normative References
     7.2. Informative References
   Appendix A. Parsing Expression Grammars (PEG)
   Appendix B. ABNF grammar
   Appendix C. Matching rules
   Appendix D. Standard Prelude
   Appendix E. Use with JSON
   Appendix F. A CDDL tool
   Appendix G. Extended Diagnostic Notation
     G.1. White space in byte string notation
     G.2. Text in byte string notation
     G.3. Embedded CBOR and CBOR sequences in byte strings
     G.4. Concatenated Strings
     G.5. Hexadecimal, octal, and binary numbers
     G.6. Comments
   Appendix H. Examples
     H.1. RFC 7071
     H.2. Examples from JSON Content Rules
   Contributors
   Acknowledgements
   Authors' Addresses

1. Introduction

   In this document, a notational convention to express CBOR [RFC7049]
   data structures is defined.

   The main goal for the convention is to provide a unified notation
   that can be used when defining protocols that use CBOR. We term the
   convention "Concise data definition language", or CDDL.

   The CBOR notational convention has the following goals:

   (G1) Provide an unambiguous description of the overall structure of
        a CBOR data item.

   (G2) Be flexible in expressing the multiple ways in which data can
        be represented in the CBOR data format.

   (G3) Be able to express common CBOR datatypes and structures.

   (G4) Provide a single format that is both readable and editable for
        humans and processable by machine.

   (G5) Enable automatic checking of CBOR data items for data format
        compliance.

   (G6) Enable extraction of specific elements from CBOR data for
        further processing.

   Not an original goal per se, but a convenient side effect of the JSON
   generic data model being a subset of the CBOR generic data model, is
   the fact that CDDL can also be used for describing JSON data
   structures (see Appendix E).

   This document has the following structure:

   The syntax of CDDL is defined in Section 3.
   Examples of CDDL and related CBOR data items ("instances", which
   all happen to be in JSON form) are given in Appendix H. Section 4
   discusses usage of CDDL. Examples are provided early in the text to
   better illustrate concept definitions. A formal definition of CDDL
   using ABNF grammar is provided in Appendix B. Finally, a _prelude_
   of standard CDDL definitions that is automatically prepended to and
   thus available in every CBOR specification is listed in Appendix D.

1.1. Requirements notation

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in
   BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

1.2. Terminology

   New terms are introduced in _cursive_, which is rendered in plain
   text as the new term surrounded by underscores. CDDL text in the
   running text is in "typewriter", which is rendered in plain text as
   the CDDL text in double quotes (double quotes are also used in the
   usual English sense; the reader is expected to disambiguate this by
   context).

   In this specification, the term "byte" is used in its now customary
   sense as a synonym for "octet".

2. The Style of Data Structure Specification

   CDDL focuses on styles of specification that are in use in the
   community employing the data model as pioneered by JSON and now
   refined in CBOR.

   There are a number of more or less atomic elements of a CBOR data
   model, such as numbers, simple values (false, true, nil), text and
   byte strings; CDDL does not focus on specifying their structure.
   CDDL of course also allows adding a CBOR tag to a data item.

   Beyond those atomic elements, further components of a data structure
   definition language are the data types used for composition: arrays
   and maps in CBOR (called arrays and objects in JSON). While these
   are only two representation formats, they are used to specify four
   loosely distinguishable styles of composition:

   o  A _vector_, an array of elements that are mostly of the same
      semantics. The set of signatures associated with a signed data
      item is a typical application of a vector.

   o  A _record_, an array the elements of which have different,
      positionally defined semantics, as detailed in the data structure
      definition. A 2D point, specified as an array of an x coordinate
      (which comes first) and a y coordinate (coming second) is an
      example of a record, as is the pair of exponent (first) and
      mantissa (second) in a CBOR decimal fraction.

   o  A _table_, a map from a domain of map keys to a domain of map
      values, that are mostly of the same semantics. A set of language
      tags, each mapped to a text string translated to that specific
      language, is an example of a table. The key domain is usually not
      limited to a specific set by the specification, but open for the
      application, e.g., in a table mapping IP addresses to MAC
      addresses, the specification does not attempt to foresee all
      possible IP addresses. In a language such as JavaScript, a "Map"
      (as opposed to a plain "Object") would often be employed to
      achieve the generality of the key domain.

   o  A _struct_, a map from a domain of map keys as defined by the
      specification to a domain of map values the semantics of each of
      which is bound to a specific map key. This is what many people
      have in mind when they think about JSON objects; CBOR adds the
      ability to use map keys that are not just text strings.
      Structs can be used to solve similar problems as records; the use
      of explicit map keys facilitates optionality and extensibility.

   Two important concepts provide the foundation for CDDL:

   1.  Instead of defining all four types of composition in CDDL
       separately, or even defining one kind for arrays (vectors and
       records) and one kind for maps (tables and structs), there is
       only one kind of composition in CDDL: the _group_ (Section 2.1).

   2.  The other important concept is that of a _type_. The entire CDDL
       specification defines a type (the one defined by its first
       _rule_), which formally is the set of CBOR data items that are
       acceptable as "instances" for this specification. CDDL
       predefines a number of basic types such as "uint" (unsigned
       integer) or "tstr" (text string), often making use of a simple
       formal notation for CBOR data items. Each value that can be
       expressed as a CBOR data item also is a type in its own right,
       e.g. "1". A type can be built as a _choice_ of other types,
       e.g., an "int" is either a "uint" or a "nint" (negative integer).
       Finally, a type can be built as an array or a map from a group.

   The rest of this section introduces a number of basic concepts of
   CDDL, and Section 3 defines additional syntax. Appendix C gives a
   concise summary of the semantics of CDDL.

2.1. Groups and Composition in CDDL

   CDDL Groups are lists of group _entries_, each of which can be a
   name/value pair or a more complex group expression (which then in
   turn stands for a sequence of name/value pairs). A CDDL group is a
   production in a grammar that matches certain sequences of name/value
   pairs but not others. The grammar is based on the concepts of
   Parsing Expression Grammars (see Appendix A).

   In an array context, only the value of the name/value pair is
   represented; the name is annotation only (and can be left off from
   the group specification if not needed).
   In a map context, the names become the map keys ("member keys").

   In an array context, the actual sequence of elements in the group is
   important, as that sequence is the information that allows
   associating actual array elements with entries in the group. In a
   map context, the sequence of entries in a group is not relevant (but
   there is still a need to write down group entries in a sequence).

   An array matches a specification given as a group when the group
   matches a sequence of name/value pairs the value parts of which
   exactly match the elements of the array in order.

   A map matches a specification given as a group when the group matches
   a sequence of name/value pairs such that all of these name/value
   pairs are present in the map and the map has no name/value pair that
   is not covered by the group.

   A simple example of using a group directly in a map definition is:

       person = {
         age: int,
         name: tstr,
         employer: tstr,
       }

               Figure 1: Using a group directly in a map

   The three entries of the group are written between the curly braces
   that create the map: Here, "age", "name", and "employer" are the
   names that turn into the map key text strings, and "int" and "tstr"
   (text string) are the types of the map values under these keys.

   A group by itself (without creating a map around it) can be placed in
   (round) parentheses, and given a name by using it in a rule:

       pii = (
         age: int,
         name: tstr,
         employer: tstr,
       )

                        Figure 2: A basic group

   This separate, named group definition allows us to rephrase Figure 1
   as:

       person = {
         pii
       }

                    Figure 3: Using a group by name

   Note that the (curly) braces signify the creation of a map; the
   groups themselves are neutral as to whether they will be used in a
   map or an array.

   As shown in Figure 1, the parentheses for groups are optional when
   there is some other set of brackets present.
   Note that they can still be used, leading to the not so realistic,
   but perfectly valid example:

       person = {(
         age: int,
         name: tstr,
         employer: tstr,
       )}

             Figure 4: Using a parenthesized group in a map

   Groups can be used to factor out common parts of structs, e.g.,
   instead of writing copy/paste style specifications such as in
   Figure 5, one can factor out the common subgroup, choose a name for
   it, and write only the specific parts into the individual maps
   (Figure 6).

       person = {
         age: int,
         name: tstr,
         employer: tstr,
       }

       dog = {
         age: int,
         name: tstr,
         leash-length: float,
       }

                     Figure 5: Maps with copy/paste

       person = {
         identity,
         employer: tstr,
       }

       dog = {
         identity,
         leash-length: float,
       }

       identity = (
         age: int,
         name: tstr,
       )

               Figure 6: Using a group for factorization

   Note that the lists inside the braces in the above definitions
   constitute (anonymous) groups, while "identity" is a named group,
   which can then be included as part of other groups (anonymous as in
   the example, or themselves named).

2.1.1. Usage

   Groups are the instrument used in composing data structures with
   CDDL. It is a matter of style in defining those structures whether
   to define groups (anonymously) right in their contexts or whether to
   define them in a separate rule and to reference them with their
   respective name (possibly more than once).

   With this, one is allowed to define all small parts of their data
   structures and compose bigger protocol units with those or to have
   only one big protocol data unit that has all definitions ad hoc where
   needed.

2.1.2. Syntax

   The composition syntax is intended to be concise and easy to read:

   o  The start and end of a group can be marked by '(' and ')'.

   o  Definitions of entries inside of a group are noted as follows:
      _keytype => valuetype,_ (read "keytype maps to valuetype"). The
      comma is actually optional (not just in the final entry), but it
      is considered good style to set it. The double arrow can be
      replaced by a colon in the common case of directly using a text
      string or integer literal as a key (see Section 3.5.1; this is
      also the common way of naming elements of an array just for
      documentation, see Section 3.4).

   A basic entry consists of a _keytype_ and a _valuetype_, both of
   which are types (Section 2.2); this entry matches any name-value pair
   the name of which is in the keytype and the value of which is in the
   valuetype.

   A group defined as a sequence of group entries matches any sequence
   of name-value pairs that is composed by concatenation in order of
   what the entries match.

   A group definition can also contain choices between groups, see
   Section 2.2.2.

2.2. Types

2.2.1. Values

   Values such as numbers and strings can be used in place of a type.
   (For instance, this is a very common thing to do for a keytype,
   common enough that CDDL provides additional convenience syntax for
   this.)

   The value notation is based on the C language, but does not offer all
   the syntactic variations (see Appendix B for details). The value
   notation for numbers inherits from C the distinction between integer
   values (no fractional part or exponent given -- NR1 [ISO6093]) and
   floating point values (where a fractional part and/or an exponent is
   present -- NR2 or NR3), so the type "1" does not include any floating
   point numbers while the types "1e3" and "1.5" are both floating point
   numbers and do not include any integer numbers.

2.2.2. Choices

   Many places that allow a type also allow a choice between types,
   delimited by a "/" (slash). The entire choice construct can be put
   into parentheses if this is required to make the construction
   unambiguous (please see Appendix B for the details).

   Choices of values can be used to express enumerations:

       attire = "bow tie" / "necktie" / "Internet attire"
       protocol = 6 / 17

   As with types, CDDL also allows choices between groups,
   delimited by a "//" (double slash). Note that the "//" operator
   binds much more weakly than the other CDDL operators, so each line
   within "delivery" in the following example is its own alternative in
   the group choice:

       address = { delivery }

       delivery = (
         street: tstr, ? number: uint, city //
         po-box: uint, city //
         per-pickup: true )

       city = (
         name: tstr, zip-code: uint
       )

   A group choice matches the union of the sets of name-value pair
   sequences that the alternatives in the choice can match.

   Both for type choices and for group choices, additional alternatives
   can be added to a rule later in separate rules by using "/=" and
   "//=", respectively, instead of "=":

       attire /= "swimwear"

       delivery //= (
         lat: float, long: float, drone-type: tstr
       )

   It is not an error if a name is first used with a "/=" or "//="
   (there is no need to "create it" with "=").

2.2.2.1. Ranges

   Instead of naming all the values that make up a choice, CDDL allows
   building a _range_ out of two values that are in an ordering
   relationship: A lower bound (first value) and an upper bound (second
   value). A range can be inclusive of both bounds given (denoted by
   joining two values by ".."), or include the lower bound and exclude
   the upper bound (denoted instead by "...").
   If the lower bound exceeds the upper bound, the resulting type is
   the empty set (this behavior can be desirable when generics,
   Section 3.10, are being used).

       device-address = byte
       max-byte = 255
       byte = 0..max-byte ; inclusive range
       first-non-byte = 256
       byte1 = 0...first-non-byte ; byte1 is equivalent to byte

   CDDL currently only allows ranges between integers (matching integer
   values) or between floating point values (matching floating point
   values). If both are needed in a type, a type choice between the two
   kinds of ranges can be (clumsily) used:

       int-range = 0..10 ; only integers match
       float-range = 0.0..10.0 ; only floats match
       BAD-range1 = 0..10.0 ; NOT DEFINED
       BAD-range2 = 0.0..10 ; NOT DEFINED
       numeric-range = int-range / float-range

   (See also the control operators .lt/.ge and .le/.gt in
   Section 3.8.6.)

   Note that the dot is a valid name continuation character in CDDL, so

       min..max

   is not a range expression but a single name. When using a name as
   the left hand side of a range operator, use spacing as in

       min .. max

   to separate off the range operator.

2.2.2.2. Turning a group into a choice

   Some choices are built out of large numbers of values, often
   integers, each of which is best given a semantic name in the
   specification. Instead of naming each of these integers and then
   accumulating these into a choice, CDDL allows building a choice from
   a group by prefixing it with a "&" character:

       terminal-color = &basecolors
       basecolors = (
         black: 0, red: 1, green: 2, yellow: 3,
         blue: 4, magenta: 5, cyan: 6, white: 7,
       )
       extended-color = &(
         basecolors,
         orange: 8, pink: 9, purple: 10, brown: 11,
       )

   As with the use of groups in arrays (Section 3.4), the member names
   have only documentary value (in particular, they might be used by a
   tool when displaying integers that are taken from that choice).

2.2.3. Representation Types

   CDDL allows the specification of a data item type by referring to the
   CBOR representation (major types and additional information,
   Section 2 of [RFC7049]). How this is used should be evident from the
   prelude (Appendix D): a hash mark ("#") optionally followed by a
   number from 0 to 7 identifying the major type, which then can be
   followed by a dot and a number specifying the additional information.
   This construction specifies the set of values that can be serialized
   in CBOR (i.e., "any"), by the given major type if one is given, or by
   the given major type with the additional information if both are
   given. Where a major type of 6 (Tag) is used, the type of the tagged
   item can be specified by appending it in parentheses.

   Note that although this notation is based on the CBOR serialization,
   it is about a set of values at the data model level, e.g. "#7.25"
   specifies the set of values that can be represented as half-precision
   floats; it does not mandate that these values also do have to be
   serialized as half-precision floats: CDDL does not provide any
   language means to restrict the choice of serialization variants.
   This also enables the use of CDDL with JSON, which uses a
   fundamentally different way of serializing (some of) the same values.

   It may be necessary to make use of representation types outside the
   prelude, e.g., a specification could start by making use of an
   existing tag in a more specific way, or define a new tag not defined
   in the prelude:

       my_breakfast = #6.55799(breakfast) ; cbor-any is too general!
       breakfast = cereal / porridge
       cereal = #6.998(tstr)
       porridge = #6.999([liquid, solid])
       liquid = milk / water
       milk = 0
       water = 1
       solid = tstr

2.2.4. Root type

   There is no special syntax to identify the root of a CDDL data
   structure definition: that role is simply taken by the first rule
   defined in the file.

   This is motivated by the usual top-down approach for defining data
   structures, decomposing a big data structure unit into smaller parts;
   however, except for the root type, there is no need to strictly
   follow this sequence.

   (Note that there is no way to use a group as a root - it must be a
   type.)

3. Syntax

   In this section, the overall syntax of CDDL is shown, alongside some
   examples just illustrating syntax. (The definition will not attempt
   to be overly formal; refer to Appendix B for the details.)

3.1. General conventions

   The basic syntax is inspired by ABNF [RFC5234], with

   o  rules, whether they define groups or types, are defined with a
      name, followed by an equals sign "=" and the actual definition
      according to the respective syntactic rules of that definition.

   o  A name can consist of any of the characters from the set {'A' to
      'Z', 'a' to 'z', '0' to '9', '_', '-', '@', '.', '$'}, starting
      with an alphabetic character (including '@', '_', '$') and ending
      in such a character or a digit.

      *  Names are case sensitive.

      *  It is preferred style to start a name with a lower case letter.

      *  The hyphen is preferred over the underscore (except in a
         "bareword" (Section 3.5.1), where the semantics may actually
         require an underscore).

      *  The period may be useful for larger specifications, to express
         some module structure (as in "tcp.throughput" vs.
         "udp.throughput").

      *  A number of names are predefined in the CDDL prelude, as listed
         in Appendix D.

      *  Rule names (types or groups) do not appear in the actual CBOR
         encoding, but names used as "barewords" in member keys do.

   o  Comments are started by a ';' (semicolon) character and finish at
      the end of a line (LF or CRLF).

   o  Outside strings, whitespace (spaces, newlines, and comments) is
      used to separate syntactic elements for readability (and to
      separate identifiers, range operators, or numbers that follow each
      other); it is otherwise completely optional.

   o  Hexadecimal numbers are preceded by '0x' (without quotes, lower
      case x), and are case insensitive. Similarly, binary numbers are
      preceded by '0b'.

   o  Text strings are enclosed by double quotation '"' characters.
      They follow the conventions for strings as defined in Section 7 of
      [RFC8259]. (ABNF users may want to note that there is no support
      in CDDL for the concept of case insensitivity in text strings; if
      necessary, regular expressions can be used (Section 3.8.3).)

   o  Byte strings are enclosed by single quotation "'" characters and
      may be prefixed by "h" or "b64". If unprefixed, the string is
      interpreted as with a text string, except that single quotes must
      be escaped and that the UTF-8 bytes resulting are marked as a byte
      string (major type 2). If prefixed as "h" or "b64", the string is
      interpreted as a sequence of pairs of hex digits (base16,
      Section 8 of [RFC4648]) or a base64(url) string (Sections 4 or 5
      of [RFC4648]), respectively (as with the diagnostic notation in
      Section 6 of [RFC7049]; cf. Appendix G.2); any white space present
      within the string (including comments) is ignored in the prefixed
      case.

   o  CDDL uses UTF-8 [RFC3629] for its encoding. Processing of CDDL
      does not involve Unicode normalization processes.

   Example:

       ; This is a comment
       person = { g }

       g = (
         "name": tstr,
         age: int, ; "age" is a bareword
       )

3.2. Occurrence

   An optional _occurrence_ indicator can be given in front of a group
   entry. It is either one of the characters '?'
   (optional), '*' (zero or more), or '+' (one or more), or is of the
   form n*m, where n and m are optional unsigned integers and n is the
   lower limit (default 0) and m is the upper limit (default no limit)
   of occurrences.

   If no occurrence indicator is specified, the group entry is to occur
   exactly once (as if 1*1 were specified). A group entry with an
   occurrence indicator matches sequences of name-value pairs that are
   composed by concatenating a number of sequences that the basic group
   entry matches, where the number needs to be allowed by the occurrence
   indicator.

   Note that CDDL, outside any directives/annotations that could
   possibly be defined, does not make any prescription as to whether
   arrays or maps use the definite length or indefinite length encoding.
   I.e., there is no correlation between leaving the size of an array
   "open" in the spec and the fact that it is then interchanged with
   definite or indefinite length.

   Please also note that CDDL can describe flexibility that the data
   model of the target representation does not have. This is rather
   obvious for JSON, but also is relevant for CBOR:

       apartment = {
         kitchen: size,
         * bedroom: size,
       }
       size = float ; in m2

   The previous specification does not mean that CBOR is changed to
   allow the key "bedroom" to be used more than once. In other words,
   due to the restrictions imposed by the data model, the third line
   pretty much turns into:

       ? bedroom: size,

   (Occurrence indicators beyond one still are useful in maps for groups
   that allow a variety of keys.)

3.3. Predefined names for types

   CDDL predefines a number of names. This subsection summarizes these
   names, but please see Appendix D for the exact definitions.

   The following keywords for primitive datatypes are defined:

   "bool"  Boolean value (major type 7, additional information 20 or
      21).

   "uint"  An unsigned integer (major type 0).
729 "nint" A negative integer (major type 1). 731 "int" An unsigned integer or a negative integer. 733 "float16" A number representable as an IEEE 754 half-precision float 734 (major type 7, additional information 25). 736 "float32" A number representable as an IEEE 754 single-precision 737 float (major type 7, additional information 26). 739 "float64" A number representable as an IEEE 754 double-precision 740 float (major type 7, additional information 27). 742 "float" One of float16, float32, or float64. 744 "bstr" or "bytes" A byte string (major type 2). 746 "tstr" or "text" Text string (major type 3) 748 (Note that there are no predefined names for arrays or maps; these 749 are defined with the syntax given below.) 751 In addition, a number of types are defined in the prelude that are 752 associated with CBOR tags, such as "tdate", "bigint", "regexp" etc. 754 3.4. Arrays 756 Array definitions surround a group with square brackets. 758 For each entry, an occurrence indicator as specified in Section 3.2 759 is permitted. 761 For example: 763 unlimited-people = [* person] 764 one-or-two-people = [1*2 person] 765 at-least-two-people = [2* person] 766 person = ( 767 name: tstr, 768 age: uint, 769 ) 771 The group "person" is defined in such a way that repeating it in the 772 array each time generates alternating names and ages, so these are 773 four valid values for a data item of type "unlimited-people": 775 ["roundlet", 1047, "psychurgy", 2204, "extrarhythmical", 2231] 776 [] 777 ["aluminize", 212, "climograph", 4124] 778 ["penintime", 1513, "endocarditis", 4084, "impermeator", 1669, 779 "coextension", 865] 781 3.5. Maps 783 The syntax for specifying maps merits special attention, as well as a 784 number of optimizations and conveniences, as it is likely to be the 785 focal point of many specifications employing CDDL. While the syntax 786 does not strictly distinguish struct and table usage of maps, it 787 caters specifically to each of them. 
789 But first, let's reiterate a feature of CBOR that it has inherited 790 from JSON: The key/value pairs in CBOR maps have no fixed ordering. 791 (One could imagine situations where fixing the ordering may be of 792 use. For example, a decoder could look for values associated with 793 integer keys 1, 3 and 7. If the order were fixed and the decoder 794 encounters the key 4 without having encountered key 3, it could 795 conclude that key 3 is not available without doing more complicated 796 bookkeeping. Unfortunately, neither JSON nor CBOR supports this, so 797 no attempt was made to support this in CDDL either.) 799 3.5.1. Structs 801 The "struct" usage of maps is similar to the way JSON objects are 802 used in many JSON applications. 804 A map is defined in the same way as defining an array (see 805 Section 3.4), except for using curly braces "{}" instead of square 806 brackets "[]". 808 An occurrence indicator as specified in Section 3.2 is permitted for 809 each group entry. 811 The following is an example of a record with an embedded structure: 813 Geography = [ 814 city : tstr, 815 gpsCoordinates : GpsCoordinates, 816 ] 818 GpsCoordinates = { 819 longitude : uint, ; degrees, scaled by 10^7 820 latitude : uint, ; degrees, scaled by 10^7 821 } 823 When encoding, the Geography record is encoded using a CBOR array 824 with two members (the keys for the group entries are ignored), 825 whereas the GpsCoordinates structure is encoded as a CBOR map with 826 two key/value pairs. 828 Types used in a structure can be defined in separate rules or just in 829 place (potentially placed inside parentheses, such as for choices). 830 E.g.: 832 located-samples = { 833 sample-point: int, 834 samples: [+ float], 835 } 837 where "located-samples" is the datatype to be used when referring to 838 the struct, and "sample-point" and "samples" are the keys to be used.
839 This is actually a complete example: an identifier that is followed 840 by a colon can be directly used as the text string for a member key 841 (we speak of a "bareword" member key), as can a double-quoted string 842 or a number. (When other types, in particular ones that contain more 843 than one value, are used as the types of keys, they are followed by a 844 double arrow, see below.) 845 If a text string key does not match the syntax for an identifier (or 846 if the specifier just happens to prefer using double quotes), the 847 text string syntax can also be used in the member key position, 848 followed by a colon. The above example could therefore have been 849 written with quoted strings in the member key positions. 851 More generally, types specified in other ways than the cases 852 described above can be used in a keytype position by following them 853 with a double arrow -- in particular, the double arrow is necessary 854 if a type is named by an identifier (which, when followed by a colon, 855 would be interpreted as a "bareword" and turned into a text string). 856 A literal text string also gives rise to a type (which contains a 857 single value only -- the given string), so another form for this 858 example is: 860 located-samples = { 861 "sample-point" => int, 862 "samples" => [+ float], 863 } 865 See Section 3.5.4 below for how the colon shortcut described here 866 also adds some implied semantics. 868 A better way to demonstrate the double-arrow use may be: 870 located-samples = { 871 sample-point: int, 872 samples: [+ float], 873 * equipment-type => equipment-tolerances, 874 } 875 equipment-type = [name: tstr, manufacturer: tstr] 876 equipment-tolerances = [+ [float, float]] 878 The example below defines a struct with optional entries: display 879 name (as a text string), the name components first name and family 880 name (as text strings), and age information (as an unsigned integer). 882 PersonalData = { 883 ? 
displayName: tstr, 884 NameComponents, 885 ? age: uint, 886 } 888 NameComponents = ( 889 ? firstName: tstr, 890 ? familyName: tstr, 891 ) 893 Note that the group definition for NameComponents does not generate 894 another map; instead, all four keys are directly in the struct built 895 by PersonalData. 897 In this example, all key/value pairs are optional from the 898 perspective of CDDL. With no occurrence indicator, an entry is 899 mandatory. 901 If the addition of more entries not specified by the current 902 specification is desired, one can add this possibility explicitly: 904 PersonalData = { 905 ? displayName: tstr, 906 NameComponents, 907 ? age: uint, 908 * tstr => any 909 } 911 NameComponents = ( 912 ? firstName: tstr, 913 ? familyName: tstr, 914 ) 916 Figure 7: Personal Data: Example for extensibility 918 The CDDL tool reported on in Appendix F generated as one acceptable 919 instance for this specification: 921 {"familyName": "agust", "antiforeignism": "pretzel", 922 "springbuck": "illuminatingly", "exuviae": "ephemeris", 923 "kilometrage": "frogfish"} 925 (See Section 3.9 for one way to explicitly identify an extension 926 point.) 928 3.5.2. Tables 930 A table can be specified by defining a map with entries where the 931 keytype allows more than just a single value, e.g.: 933 square-roots = {* x => y} 934 x = int 935 y = float 937 Here, the key in each key/value pair has datatype x (defined as int), 938 and the value has datatype y (defined as float). 940 If the specification does not need to restrict one of x or y (i.e., 941 the application is free to choose per entry), it can be replaced by 942 the predefined name "any". 944 As another example, the following could be used as a conversion table 945 converting from an integer or float to a string: 947 tostring = {* mynumber => tstr} 948 mynumber = int / float 950 3.5.3. 
Non-deterministic order 952 While the way arrays are matched is fully determined by the Parsing 953 Expression Grammar (PEG) formalism (see Appendix A), matching is more 954 complicated for maps, as maps do not have an inherent order. For 955 each candidate name/value pair that the PEG algorithm would try, a 956 matching member is picked out of the entire map. For certain group 957 expressions, more than one member in the map may match. Most often, 958 this is inconsequential, as the group expression tends to consume all 959 matches: 961 labeled-values = { 962 ? fritz: number, 963 * label => value 964 } 965 label = text 966 value = number 968 Here, if any member with the key "fritz" is present, this will be 969 picked by the first entry of the group; all remaining text/number 970 members will be picked by the second entry (and if anything remains 971 unpicked, the map does not match). 973 However, it is possible to construct group expressions where what is 974 actually picked is indeterminate, and does matter: 976 do-not-do-this = { 977 int => int, 978 int => 6, 979 } 981 When this expression is matched against "{3: 5, 4: 6}", the first 982 group entry might pick off the "3: 5", leaving "4: 6" for matching 983 the second one. Or it might pick off "4: 6", leaving nothing that 984 matches the second entry. This pathological non-determinism is caused by 985 specifying a more general entry before a more specific one, and by having a general 986 rule that only consumes a subset of the map key/value pairs that it 987 is able to match -- both tend not to occur in real-world 988 specifications of maps. At the time of writing, CDDL tools cannot 989 detect such cases automatically, and for the present version of the 990 CDDL specification, the specification writer is simply urged to not 991 write pathologically non-deterministic specifications. 993 (The astute reader will be reminded of what was called "ambiguous 994 content models" in SGML and "non-deterministic content models" in 995 XML.
That problem is related to the one described here, but the 996 problem here is specifically caused by the lack of order in maps, 997 something that the XML schema languages do not have to contend with. 998 Note that Relax-NG's "interleave" pattern handles lack of order 999 explicitly on the specification side, while the instances in XML 1000 always have determinate order.) 1002 3.5.4. Cuts in Maps 1004 The extensibility idiom discussed above for structs has one problem: 1006 extensible-map-example = { 1007 ? "optional-key" => int, 1008 * tstr => any 1009 } 1011 In this example, there is one optional key "optional-key", which, 1012 when present, maps to an integer. There is also a wild card for any 1013 future additions. 1015 Unfortunately, the data item 1017 { "optional-key": "nonsense" } 1019 does match this specification: While the first entry of the group 1020 does not match, the second one (the wildcard) does. This may be very 1021 well desirable (e.g., if a future extension is to be allowed to 1022 extend the type of "optional-key"), but in many cases isn't. 1024 In anticipation of a more general potential feature called "cuts", 1025 CDDL allows inserting a cut "^" into the definition of the map entry: 1027 extensible-map-example = { 1028 ? "optional-key" ^ => int, 1029 * tstr => any 1030 } 1032 A cut in this position means that once the member key matches the 1033 name part of an entry that carries a cut, other potential matches for 1034 the key of the member that occur in later entries in the group of the 1035 map are no longer allowed. In other words, when a group entry would 1036 pick a key/value pair based on just a matching key, it "locks in" the 1037 pick -- this rule applies independent of whether the value matches as 1038 well, so when it does not, the entire map fails to match. In 1039 summary, the example above no longer matches the specification as 1040 modified with the cut. 
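The effect of the cut can be illustrated with a small hand-written matcher. The following Python sketch is not a general CDDL implementation: the function name "matches" and its "cut" parameter are hypothetical, maps are modelled as Python dicts, and the logic is hard-coded for the single map specification shown above.

```python
def matches(item, cut):
    """Match a dict against { ? "optional-key" [^] => int, * tstr => any }.

    Hypothetical helper for illustration only; `cut` selects whether the
    first group entry carries a cut.
    """
    for key, value in item.items():
        if not isinstance(key, str):
            return False  # neither group entry accepts a non-text key
        if key == "optional-key":
            # bool is a subclass of int in Python, so exclude it explicitly.
            is_int = isinstance(value, int) and not isinstance(value, bool)
            if is_int:
                continue  # member is picked by the first entry
            if cut:
                # The key matched an entry with a cut: the pick is locked
                # in, so a mismatching value fails the entire map.
                return False
        # Without a cut (or for any other text key), the wildcard entry
        # "* tstr => any" picks the member.
    return True
```

Under these assumptions, the data item { "optional-key": "nonsense" } is accepted when "cut" is false and rejected when it is true, while { "optional-key": 3 } is accepted either way.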
1042 Since the desire for this kind of exclusive matching is so frequent, 1043 the ":" shortcut is actually defined to include the cut semantics. 1044 So the preceding example (including the cut) can be written more 1045 simply as: 1047 extensible-map-example = { 1048 ? "optional-key": int, 1049 * tstr => any 1050 } 1052 or even shorter, using a bareword for the key: 1054 extensible-map-example = { 1055 ? optional-key: int, 1056 * tstr => any 1057 } 1059 3.6. Tags 1061 A type can make use of a CBOR tag (major type 6) by using the 1062 representation type notation, giving #6.nnn(type) where nnn is an 1063 unsigned integer giving the tag number and "type" is the type of the 1064 data item being tagged. 1066 For example, the following line from the CDDL prelude (Appendix D) 1067 defines "biguint" as a type name for a positive bignum N: 1069 biguint = #6.2(bstr) 1071 The tags defined by [RFC7049] are included in the prelude. 1072 Additional tags since registered need to be added to a CDDL 1073 specification as needed; e.g., a binary UUID tag could be referenced 1074 as "buuid" in a specification after defining 1076 buuid = #6.37(bstr) 1078 In the following example, usage of the tag 32 for URIs is optional: 1080 my_uri = #6.32(tstr) / tstr 1082 3.7. Unwrapping 1084 The group that is used to define a map or an array can often be 1085 reused in the definition of another map or array. Similarly, a type 1086 defined as a tag carries an internal data item that one would like to 1087 refer to. In these cases, it is expedient to simply use the name of 1088 the map, array, or tag type as a handle for the group or type defined 1089 inside it. 1091 The "unwrap" operator (written by preceding a name by a tilde 1092 character "~") can be used to strip the type defined for a name by 1093 one layer, exposing the underlying group (for maps and arrays) or 1094 type (for tags). 1096 For example, an application might want to define a basic and an 1097 advanced header. 
Without unwrapping, this might be done as follows: 1099 basic-header-group = ( 1100 field1: int, 1101 field2: text, 1102 ) 1104 basic-header = [ basic-header-group ] 1106 advanced-header = [ 1107 basic-header-group, 1108 field3: bytes, 1109 field4: number, ; as in the tagged type "time" 1110 ] 1112 Unwrapping simplifies this to: 1114 basic-header = [ 1115 field1: int, 1116 field2: text, 1117 ] 1119 advanced-header = [ 1120 ~basic-header, 1121 field3: bytes, 1122 field4: ~time, 1123 ] 1125 (Note that leaving out the first unwrap operator in the latter 1126 example would lead to nesting the basic-header in its own array 1127 inside the advanced-header, while, with the unwrapped basic-header, 1128 the definition of the group inside basic-header is essentially 1129 repeated inside advanced-header, leading to a single array. This can 1130 be used for various applications often solved by inheritance in 1131 programming languages. The effect of unwrapping can also be 1132 described as "threading in" the group or type inside the referenced 1133 type, which suggested the thread-like "~" character.) 1135 3.8. Controls 1137 A _control_ allows to relate a _target_ type with a _controller_ type 1138 via a _control operator_. 1140 The syntax for a control type is "target .control-operator 1141 controller", where control operators are special identifiers prefixed 1142 by a dot. (Note that _target_ or _controller_ might need to be 1143 parenthesized.) 1145 A number of control operators are defined at this point. Further 1146 control operators may be defined by new versions of this 1147 specification or by registering them according to the procedures in 1148 Section 6.1. 1150 3.8.1. Control operator .size 1152 A ".size" control controls the size of the target in bytes by the 1153 control type. The control is defined for text and byte strings, 1154 where it directly controls the number of bytes in the string. It is 1155 also defined for unsigned integers (see below). 
Figure 8 shows 1156 example usage for byte strings. 1158 full-address = [[+ label], ip4, ip6] 1159 ip4 = bstr .size 4 1160 ip6 = bstr .size 16 1161 label = bstr .size (1..63) 1163 Figure 8: Control for size in bytes 1165 When applied to an unsigned integer, the ".size" control restricts 1166 the range of that integer by giving a maximum number of bytes that 1167 should be needed in a computer representation of that unsigned 1168 integer. In other words, "uint .size N" is equivalent to 1169 "0...BYTES_N", where BYTES_N == 256**N. 1171 audio_sample = uint .size 3 ; 24-bit, equivalent to 0...16777216 1173 Figure 9: Control for integer size in bytes 1175 Note that, as with value restrictions in CDDL, this control is not a 1176 representation constraint; a number that fits into fewer bytes can 1177 still be represented in that form, and an inefficient implementation 1178 could use a longer form (unless that is restricted by some format 1179 constraints outside of CDDL, such as the rules in Section 3.9 of 1180 [RFC7049]). 1182 3.8.2. Control operator .bits 1184 A ".bits" control on a byte string indicates that, in the target, 1185 only the bits numbered by a number in the control type are allowed to 1186 be set. (Bits are counted the usual way, bit number "n" being set in 1187 "str" meaning that "(str[n >> 3] & (1 << (n & 7))) != 0".) 1188 Similarly, a ".bits" control on an unsigned integer "i" indicates 1189 that for all unsigned integers "n" where "(i & (1 << n)) != 0", "n" 1190 must be in the control type. 
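The bit numbering described above can be sketched in Python as follows. This is an illustrative sketch only: the helper names ("bit_is_set", "bstr_bits_ok", "uint_bits_ok") are hypothetical, and the two sets of allowed bit numbers are assumptions transcribed by hand from the "flags" and "rwx" types of Figure 10.

```python
def bit_is_set(data: bytes, n: int) -> bool:
    """Return True if bit number n is set in a byte string."""
    index = n >> 3
    return index < len(data) and (data[index] & (1 << (n & 7))) != 0


def bstr_bits_ok(data: bytes, allowed: set) -> bool:
    """Check a byte string target: every set bit must be an allowed number."""
    return all(n in allowed
               for n in range(8 * len(data))
               if bit_is_set(data, n))


def uint_bits_ok(i: int, allowed: set) -> bool:
    """Check an unsigned integer target in the same way."""
    return all(n in allowed
               for n in range(i.bit_length())
               if i & (1 << n))


# Assumed allowed bit numbers, mirroring Figure 10:
# flags 8..15, ns = 0, and the data offset bits 4..7.
TCP_FLAG_BITS = set(range(8, 16)) | {0} | set(range(4, 8))
RWX_BITS = {0, 1, 2}
```

Note that the check does not constrain the length of the byte string: an empty string or a string of zero bytes of any length passes, since no bit is set in it.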
1192 tcpflagbytes = bstr .bits flags 1193 flags = &( 1194 fin: 8, 1195 syn: 9, 1196 rst: 10, 1197 psh: 11, 1198 ack: 12, 1199 urg: 13, 1200 ece: 14, 1201 cwr: 15, 1202 ns: 0, 1203 ) / (4..7) ; data offset bits 1205 rwxbits = uint .bits rwx 1206 rwx = &(r: 2, w: 1, x: 0) 1208 Figure 10: Control for what bits can be set 1210 The CDDL tool reported on in Appendix F generates the following ten 1211 example instances for "tcpflagbytes": 1213 h'906d' h'01fc' h'8145' h'01b7' h'013d' h'409f' h'018e' h'c05f' 1214 h'01fa' h'01fe' 1216 These examples do not illustrate that the above CDDL specification 1217 does not explicitly specify a size of two bytes: A valid all clear 1218 instance of flag bytes could be "h''" or "h'00'" or even "h'000000'" 1219 as well. 1221 3.8.3. Control operator .regexp 1223 A ".regexp" control indicates that the text string given as a target 1224 needs to match the XSD regular expression given as a value in the 1225 control type. XSD regular expressions are defined in Appendix F of 1226 [W3C.REC-xmlschema-2-20041028]. 1228 nai = tstr .regexp "[A-Za-z0-9]+@[A-Za-z0-9]+(\\.[A-Za-z0-9]+)+" 1230 Figure 11: Control with an XSD regexp 1232 An example matching this regular expression: 1234 "N1@CH57HF.4Znqe0.dYJRN.igjf" 1236 3.8.3.1. Usage considerations 1238 Note that XSD regular expressions do not support the usual \x or \u 1239 escapes for hexadecimal expression of bytes or unicode code points. 1240 However, in CDDL the XSD regular expressions are contained in text 1241 strings, the literal notation for which provides \u escapes; this 1242 should suffice for most applications that use regular expressions for 1243 text strings. (Note that this also means that there is one level of 1244 string escaping before the XSD escaping rules are applied.) 1246 XSD regular expressions support character class subtraction, a 1247 feature often not found in regular expression libraries; 1248 specification writers may want to use this feature sparingly. 
1249 Similar considerations apply to Unicode character classes; where 1250 these are used, the specification that employs CDDL SHOULD identify 1251 which Unicode versions are addressed. 1253 Other surprises for infrequent users of XSD regular expressions may 1254 include: 1256 o No direct support for case insensitivity. While case 1257 insensitivity has gone mostly out of fashion in protocol design, 1258 it is sometimes needed and then needs to be expressed manually as 1259 in "[Cc][Aa][Ss][Ee]". 1261 o The support for popular character classes such as \w and \d is 1262 based on Unicode character properties, which is often not what is 1263 desired in an ASCII-based protocol and thus might lead to 1264 surprises. (\s and \S do have their more conventional meanings, 1265 and "." matches any character but the line ending characters \r or 1266 \n.) 1268 3.8.3.2. Discussion 1270 There are many flavors of regular expression in use in the 1271 programming community. For instance, perl-compatible regular 1272 expressions (PCRE) are widely used and probably are more useful than 1273 XSD regular expressions. However, there is no normative reference 1274 for PCRE that could be used in the present document. Instead, we opt 1275 for XSD regular expressions for now. There is precedent for that 1276 choice in the IETF, e.g., in YANG [RFC7950]. 1278 Note that CDDL uses controls as its main extension point. This 1279 creates the opportunity to add further regular expression formats in 1280 addition to the one referenced here if desired. As an example, a 1281 control ".pcre" is defined in [I-D.bormann-cbor-cddl-freezer]. 1283 3.8.4. Control operators .cbor and .cborseq 1285 A ".cbor" control on a byte string indicates that the byte string 1286 carries a CBOR encoded data item. Decoded, the data item matches the 1287 type given as the right-hand side argument (type1 in the following 1288 example). 
1290 "bytes .cbor type1" 1292 Similarly, a ".cborseq" control on a byte string indicates that the 1293 byte string carries a sequence of CBOR encoded data items. When the 1294 data items are taken as an array, the array matches the type given as 1295 the right-hand side argument (type2 in the following example). 1297 "bytes .cborseq type2" 1299 (The conversion of the encoded sequence to an array can be effected 1300 for instance by wrapping the byte string between the two bytes 0x9f 1301 and 0xff and decoding the wrapped byte string as a CBOR encoded data 1302 item.) 1304 3.8.5. Control operators .within and .and 1306 A ".and" control on a type indicates that the data item matches both 1307 that left hand side type and the type given as the right hand side. 1308 (Formally, the resulting type is the intersection of the two types 1309 given.) 1311 "type1 .and type2" 1313 A variant of the ".and" control is the ".within" control, which 1314 expresses an additional intent: the left hand side type is meant to 1315 be a subset of the right-hand-side type. 1317 "type1 .within type2" 1319 While both forms have the identical formal semantics (intersection), 1320 the intention of the ".within" form is that the right hand side gives 1321 guidance to the types allowed on the left hand side, which typically 1322 is a socket (Section 3.9): 1324 message = $message .within message-structure 1325 message-structure = [message_type, *message_option] 1326 message_type = 0..255 1327 message_option = any 1329 $message /= [3, dough: text, topping: [* text]] 1330 $message /= [4, noodles: text, sauce: text, parmesan: bool] 1332 For ".within", a tool might flag an error if type1 allows data items 1333 that are not allowed by type2. In contrast, for ".and", there is no 1334 expectation that type1 already is a subset of type2. 1336 3.8.6. 
Control operators .lt, .le, .gt, .ge, .eq, .ne, and .default 1338 The controls .lt, .le, .gt, .ge, .eq, .ne specify a constraint on the 1339 left hand side type to be a value less than, less than or equal, 1340 greater than, greater than or equal, equal, or not equal, to a value 1341 given as a right hand side type (containing just that single value). 1342 In the present specification, the first four controls (.lt, .le, .gt, 1343 .ge) are defined only for numeric types, as these have a natural 1344 ordering relationship. 1346 speed = number .ge 0 ; unit: m/s 1348 .ne and .eq are defined both for numeric values and values of other 1349 types. If one of the values is not of a numeric type, equality is 1350 determined as follows: Text strings are equal (satisfy .eq/do not 1351 satisfy .ne) if they are byte-wise identical; the same applies for 1352 byte strings. Arrays are equal if they have the same number of 1353 elements, all of which are equal pairwise in order between the 1354 arrays. Maps are equal if they have the same number of key/value 1355 pairs, and there is pairwise equality between the key/value pairs 1356 between the two maps. Tagged values are equal if they both have the 1357 same tag and the values are equal. Values of simple types match if 1358 they are the same values. Numeric types that occur within arrays, 1359 maps, or tagged values are equal if their numeric value is equal and 1360 they are both integers or both floating point values. All other 1361 cases are not equal (e.g., comparing a text string with a byte 1362 string). 1364 A variant of the ".ne" control is the ".default" control, which 1365 expresses an additional intent: the value specified by the right- 1366 hand-side type is intended as a default value for the left hand side 1367 type given, and the implied .ne control is there to prevent this 1368 value from being sent over the wire. 
This control is only meaningful 1369 when the control type is used in an optional context; otherwise there 1370 would be no way to make use of the default value. 1372 timer = { 1373 time: uint, 1374 ? displayed-step: (number .gt 0) .default 1 1375 } 1377 3.9. Socket/Plug 1379 Both for type choices and group choices, a mechanism is defined that 1380 facilitates starting out with empty choices and assembling them 1381 later, potentially in separate files that are concatenated to build 1382 the full specification. 1384 Per convention, CDDL extension points are marked with a leading 1385 dollar sign (types) or two leading dollar signs (groups). Tools 1386 honor that convention by not raising an error if such a type or group 1387 is not defined at all; the symbol is then taken to be an empty type 1388 choice (group choice), i.e., no choice is available. 1390 tcp-header = {seq: uint, ack: uint, * $$tcp-option} 1392 ; later, in a different file 1394 $$tcp-option //= ( 1395 sack: [+(left: uint, right: uint)] 1396 ) 1398 ; and, maybe in another file 1400 $$tcp-option //= ( 1401 sack-permitted: true 1402 ) 1404 Names that start with a single "$" are "type sockets", starting out 1405 as an empty type, and intended to be extended via "/=". Names that 1406 start with a double "$$" are "group sockets", starting out as an 1407 empty group choice, and intended to be extended via "//=". In either 1408 case, it is not an error if there is no definition for a socket at 1409 all; this then means there is no way to satisfy the rule (i.e., the 1410 choice is empty). 1412 As a convention, all definitions (plugs) for socket names must be 1413 augmentations, i.e., they must be using "/=" and "//=", respectively. 1415 To pick up the example illustrated in Figure 7, the socket/plug 1416 mechanism could be used as shown in Figure 12: 1418 PersonalData = { 1419 ? displayName: tstr, 1420 NameComponents, 1421 ? age: uint, 1422 * $$personaldata-extensions 1423 } 1425 NameComponents = ( 1426 ? 
firstName: tstr, 1427 ? familyName: tstr, 1428 ) 1430 ; The above already works as is. 1431 ; But then, we can add later: 1433 $$personaldata-extensions //= ( 1434 favorite-salsa: tstr, 1435 ) 1437 ; and again, somewhere else: 1439 $$personaldata-extensions //= ( 1440 shoesize: uint, 1441 ) 1443 Figure 12: Personal Data example: Using socket/plug extensibility 1445 3.10. Generics 1447 Using angle brackets, the left hand side of a rule can add formal 1448 parameters after the name being defined, as in: 1450 messages = message<"reboot", "now"> / message<"sleep", 1..100> 1451 message<t, v> = {type: t, value: v} 1453 When using a generic rule, the formal parameters are bound to the 1454 actual arguments supplied (also using angle brackets), within the 1455 scope of the generic rule (as if there were a rule of the form 1456 parameter = argument). 1458 Generic rules can be used for establishing names for both types and 1459 groups. 1461 (There are some limitations to nesting of generics in the tool 1462 described in Appendix F at this time.) 1464 3.11. Operator Precedence 1466 As with any language that has multiple syntactic features such as 1467 prefix and infix operators, CDDL has operators that bind more tightly 1468 than others. This is more complicated than, say, in ABNF, 1469 as CDDL has both types and groups, with operators that are specific 1470 to these concepts. Type operators (such as "/" for type choice) 1471 operate on types, while group operators (such as "//" for group 1472 choice) operate on groups. Types can simply be used in groups, but 1473 groups need to be bracketed (as arrays or maps) to become types. So, 1474 type operators naturally bind more tightly than group operators. 1476 For instance, in 1478 t = [group1] 1479 group1 = (a / b // c / d) 1480 a = 1 b = 2 c = 3 d = 4 1482 group1 is a group choice between the type choice of a and b and the 1483 type choice of c and d.
This becomes more relevant once member keys 1484 and/or occurrences are added in: 1486 t = {group2} 1487 group2 = (? ab: a / b // cd: c / d) 1488 a = 1 b = 2 c = 3 d = 4 1490 is a group choice between the optional member "ab" of type a or b and 1491 the member "cd" of type c or d. Note that the optionality is 1492 attached to the first choice ("ab"), not to the second choice. 1494 Similarly, in 1496 t = [group3] 1497 group3 = (+ a / b / c) 1498 a = 1 b = 2 c = 3 1500 group3 is a repetition of a type choice between a, b, and c; if just 1501 a is to be repeatable, a group choice is needed to focus the 1502 occurrence: 1504 (A comment has been that this could be counter-intuitive. The 1505 specification writer is encouraged to use parentheses liberally to 1506 guide readers that are not familiar with CDDL precedence rules.) 1508 t = [group4] 1509 group4 = (+ a // b / c) 1510 a = 1 b = 2 c = 3 1511 group4 is a group choice between a repeatable a and a single b or c. 1513 In general, as with many other languages with operator precedence 1514 rules, it is best not to rely on them, but to insert parentheses for 1515 readability: 1517 t = [group4a] 1518 group4a = ((+ a) // (b / c)) 1519 a = 1 b = 2 c = 3 1521 The operator precedences, in sequence of loose to tight binding, are 1522 defined in Appendix B and summarized in Table 1. (Arities given are 1523 1 for unary prefix operators and 2 for binary infix operators.) 1525 +----------+----+---------------------------+------+ 1526 | Operator | Ar | Operates on | Prec | 1527 +----------+----+---------------------------+------+ 1528 | = | 2 | name = type, name = group | 1 | 1529 | /= | 2 | name /= type | 1 | 1530 | //= | 2 | name //= group | 1 | 1531 | // | 2 | group // group | 2 | 1532 | , | 2 | group, group | 3 | 1533 | * | 1 | * group | 4 | 1534 | N*M | 1 | N*M group | 4 | 1535 | + | 1 | + group | 4 | 1536 | ? | 1 | ? 
group | 4 | 1537 | => | 2 | type => type | 5 | 1538 | : | 2 | name: type | 5 | 1539 | / | 2 | type / type | 6 | 1540 | .. | 2 | type..type | 7 | 1541 | ... | 2 | type...type | 7 | 1542 | .ctrl | 2 | type .ctrl type | 7 | 1543 | & | 1 | &group | 8 | 1544 | ~ | 1 | ~type | 8 | 1545 +----------+----+---------------------------+------+ 1547 Table 1: Summary of operator precedences 1549 4. Making Use of CDDL 1551 In this section, we discuss several potential ways to employ CDDL. 1553 4.1. As a guide to a human user 1555 CDDL can be used to efficiently define the layout of CBOR data, such 1556 that a human implementer can easily see how data is supposed to be 1557 encoded. 1559 Since CDDL maps parts of the CBOR data to human readable names, tools 1560 could be built that use CDDL to provide a human friendly 1561 representation of the CBOR data, and allow users to edit such data 1562 while remaining compliant with its CDDL definition. 1564 4.2. For automated checking of CBOR data structure 1566 CDDL has been specified such that a machine can handle the CDDL 1567 definition and related CBOR data (and, thus, also JSON data). For 1568 example, a machine could use CDDL to check whether or not CBOR data 1569 is compliant with its definition. 1571 The need for thoroughness of such compliance checking depends on the 1572 application. For example, an application may decide not to check the 1573 data structure at all, and use the CDDL definition solely as a means 1574 to indicate the structure of the data to the programmer. 1576 At the other extreme, the application may also implement a checking 1577 mechanism that goes as far as checking that all mandatory map members 1578 are available. 1580 The extent to which the data description must be enforced by an 1581 application is left to the designers and implementers of that 1582 application, keeping in mind related security considerations. 1584 In no case is the intention that a CDDL tool would be "writing code" 1585 for an implementation.
1587 4.3.  For data analysis tools

1589    In the long run, it can be expected that more and more data will be
1590    stored using the CBOR data format.

1592    Where there is data, there is data analysis and the need to process
1593    such data automatically.  CDDL can be used for such automated data
1594    processing, allowing tools to verify data, clean it, and extract
1595    particular parts of interest from it.

1597    Since CBOR is designed with constrained devices in mind, a likely use
1598    would be in small sensors.  An interesting use would thus be
1599    automated analysis of sensor data.

1601 5.  Security considerations

1603    This document presents a content rules language for expressing CBOR
1604    data structures.  As such, it does not bring any security issues by
1605    itself, although specifications of protocols that use CBOR naturally
1606    need security analyses when defined.  General guidelines for writing
1607    security considerations are defined in
1609    Security Considerations Guidelines [RFC3552] (BCP 72).
1610    Specifications using CDDL to define CBOR structures in protocols need
1611    to follow those guidelines.  Additional topics that could be
1612    considered in a security considerations section for a specification
1613    that uses CDDL to define CBOR structures include the following:

1615    o  Where could the language cause confusion in a way that would
1616       enable security issues?

1618    o  Where a CDDL matcher is part of the implementation of a system,
1619       the security of the system ought not to depend on the correctness
1620       of the CDDL specification or CDDL implementation without any
1621       further defenses in place.

1623    o  Where the CDDL includes extension points, the impact of extensions
1624       on the security of the system needs to be carefully considered.

1626    Writers of CDDL specifications are strongly encouraged to value
1627    clarity and transparency of the specification over its elegance.
1628    Keep it as simple as possible while still expressing the needed data
1629    model.

1631    Writers of CDDL specifications are also strongly advised to keep in
1632    mind a related observation about formal description techniques in
1633    general: Just because CDDL makes it easier to handle
1634    complexity in a specification, that does not make that complexity
1635    somehow less bad (except maybe on the level of the humans having to
1636    grasp the complex structure while reading the spec).

1638 6.  IANA Considerations

1640 6.1.  CDDL control operator registry

1642    IANA is requested to create a registry for control operators
1643    (Section 3.8).  The name of this registry is "CDDL Control Operators".

1645    Each entry in the registry must include the name of the control
1646    operator (by convention given with the leading dot) and a reference
1647    to its documentation.  Names must be composed of the leading dot
1648    followed by a text string conforming to the production "id" in
1649    Appendix B.

1651    Initial entries in this registry are as follows:

1653                       +----------+---------------+
1654                       | name     | documentation |
1655                       +----------+---------------+
1656                       | .size    | [RFCthis]     |
1657                       | .bits    | [RFCthis]     |
1658                       | .regexp  | [RFCthis]     |
1659                       | .cbor    | [RFCthis]     |
1660                       | .cborseq | [RFCthis]     |
1661                       | .within  | [RFCthis]     |
1662                       | .and     | [RFCthis]     |
1663                       | .lt      | [RFCthis]     |
1664                       | .le      | [RFCthis]     |
1665                       | .gt      | [RFCthis]     |
1666                       | .ge      | [RFCthis]     |
1667                       | .eq      | [RFCthis]     |
1668                       | .ne      | [RFCthis]     |
1669                       | .default | [RFCthis]     |
1670                       +----------+---------------+

1672    All other control operator names are Unassigned.

1674    The IANA policy for additions to this registry is "Specification
1675    Required" as defined in [RFC8126] (which involves an Expert Review)
1676    for names that do not include an internal dot, and "IETF Review" for
1677    names that do include an internal dot.
        The Expert is specifically
1678    instructed that other Standards Development Organizations (SDOs) may
1679    want to define control operators that are specific to their fields
1680    (e.g., based on a binary syntax already in use at the SDO); the
1681    review process should strive to facilitate such an undertaking.

1683 7.  References

1685 7.1.  Normative References

1687    [ISO6093]  ISO, "Information processing -- Representation of
1688               numerical values in character strings for information
1689               interchange", ISO 6093, 1985.

1691    [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
1692               Requirement Levels", BCP 14, RFC 2119,
1693               DOI 10.17487/RFC2119, March 1997.

1696    [RFC3552]  Rescorla, E. and B. Korver, "Guidelines for Writing RFC
1697               Text on Security Considerations", BCP 72, RFC 3552,
1698               DOI 10.17487/RFC3552, July 2003.

1701    [RFC3629]  Yergeau, F., "UTF-8, a transformation format of ISO
1702               10646", STD 63, RFC 3629, DOI 10.17487/RFC3629, November
1703               2003.

1705    [RFC4648]  Josefsson, S., "The Base16, Base32, and Base64 Data
1706               Encodings", RFC 4648, DOI 10.17487/RFC4648, October 2006.

1709    [RFC5234]  Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax
1710               Specifications: ABNF", STD 68, RFC 5234,
1711               DOI 10.17487/RFC5234, January 2008.

1714    [RFC7049]  Bormann, C. and P. Hoffman, "Concise Binary Object
1715               Representation (CBOR)", RFC 7049, DOI 10.17487/RFC7049,
1716               October 2013.

1718    [RFC7493]  Bray, T., Ed., "The I-JSON Message Format", RFC 7493,
1719               DOI 10.17487/RFC7493, March 2015.

1722    [RFC8126]  Cotton, M., Leiba, B., and T. Narten, "Guidelines for
1723               Writing an IANA Considerations Section in RFCs", BCP 26,
1724               RFC 8126, DOI 10.17487/RFC8126, June 2017.

1727    [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
1728               2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
1729               May 2017.
1731    [RFC8259]  Bray, T., Ed., "The JavaScript Object Notation (JSON) Data
1732               Interchange Format", STD 90, RFC 8259,
1733               DOI 10.17487/RFC8259, December 2017.

1736    [W3C.REC-xmlschema-2-20041028]
1737               Biron, P. and A. Malhotra, "XML Schema Part 2: Datatypes
1738               Second Edition", World Wide Web Consortium Recommendation
1739               REC-xmlschema-2-20041028, October 2004.

1742 7.2.  Informative References

1744    [I-D.bormann-cbor-cddl-freezer]
1745               Bormann, C., "A feature freezer for the Concise Data
1746               Definition Language (CDDL)",
1747               draft-bormann-cbor-cddl-freezer-01 (work in progress), August 2018.

1749    [I-D.ietf-anima-grasp]
1750               Bormann, C., Carpenter, B., and B. Liu, "A Generic
1751               Autonomic Signaling Protocol (GRASP)",
1752               draft-ietf-anima-grasp-15 (work in progress), July 2017.

1754    [I-D.newton-json-content-rules]
1755               Newton, A. and P. Cordell, "A Language for Rules
1756               Describing JSON Content",
1757               draft-newton-json-content-rules-09 (work in progress), September 2017.

1759    [PEG]      Ford, B., "Parsing expression grammars", Proceedings of
1760               the 31st ACM SIGPLAN-SIGACT symposium on Principles of
1761               programming languages - POPL '04,
1762               DOI 10.1145/964001.964011, 2004.

1764    [RELAXNG]  ISO/IEC, "Information technology -- Document Schema
1765               Definition Language (DSDL) -- Part 2: Regular-grammar-based
1766               validation -- RELAX NG", ISO/IEC 19757-2, December
1767               2008.

1769    [RFC7071]  Borenstein, N. and M. Kucherawy, "A Media Type for
1770               Reputation Interchange", RFC 7071, DOI 10.17487/RFC7071,
1771               November 2013.

1773    [RFC7950]  Bjorklund, M., Ed., "The YANG 1.1 Data Modeling Language",
1774               RFC 7950, DOI 10.17487/RFC7950, August 2016.

1777    [RFC8007]  Murray, R. and B. Niven-Jenkins, "Content Delivery Network
1778               Interconnection (CDNI) Control Interface / Triggers",
1779               RFC 8007, DOI 10.17487/RFC8007, December 2016.
1782    [RFC8152]  Schaad, J., "CBOR Object Signing and Encryption (COSE)",
1783               RFC 8152, DOI 10.17487/RFC8152, July 2017.

1786    [RFC8428]  Jennings, C., Shelby, Z., Arkko, J., Keranen, A., and C.
1787               Bormann, "Sensor Measurement Lists (SenML)", RFC 8428,
1788               DOI 10.17487/RFC8428, August 2018.

1791 7.3.  URIs

1793    [1] https://github.com/cabo/cbor-diag

1795 Appendix A.  Parsing Expression Grammars (PEG)

1797    This appendix is normative.

1799    Since the 1950s, many grammar notations have been based on Backus-Naur
1800    Form (BNF), a notation for context-free grammars (CFGs) within
1801    Chomsky's generative system of grammars.  ABNF [RFC5234], the
1802    Augmented Backus-Naur Form widely used in IETF specifications and also
1803    inspiring the syntax of CDDL, is an example of this.

1805    Generative grammars can express ambiguity well, but this very
1806    property may make them hard to use in recognition systems, spawning a
1807    number of subdialects that impose constraints on generative grammars
1808    to be used with parser generators, which may be hard to manage for
1809    the specification writer.

1811    Parsing Expression Grammars [PEG] provide an alternative formal
1812    foundation for describing grammars that emphasizes recognition over
1813    generation, and resolves what would have been ambiguity in generative
1814    systems by introducing the concept of "prioritized choice".

1816    The notation for Parsing Expression Grammars is quite close to BNF,
1817    with the usual "Extended BNF" features such as repetition added.
1818    However, where BNF uses the unordered (symmetrical) choice operator
1819    "|" (incidentally notated as "/" in ABNF), PEG provides a prioritized
1820    choice operator "/".  The two alternatives listed are to be tested in
1821    left-to-right order, locking in the first successful match and
1822    disregarding any further potential matches within the choice (but not
1823    disabling alternatives in choices containing this choice, as a "cut"
1824    would; see Section 3.5.4).
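The prioritized choice described above can be sketched for the simplest possible case, a choice between literal strings. This is a minimal illustration, not a PEG engine; the function name `match_choice` is hypothetical and only literal alternatives are handled.

```python
def match_choice(alternatives, text):
    """PEG prioritized choice over literal alternatives.

    Tries the alternatives left to right and locks in the first one that
    is a prefix of `text`.  Returns the number of characters consumed,
    or None if no alternative matches.
    """
    for literal in alternatives:
        if text.startswith(literal):
            return len(literal)  # first success wins; later ones are ignored
    return None


print(match_choice(["ab", "a"], "ab"))  # longer alternative listed first
print(match_choice(["a", "ab"], "ab"))  # "a" is locked in; "ab" is never tried
```

With the longer alternative first, both characters are consumed; with the shorter one first, only one character is consumed and the second alternative can never contribute.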
1826    For example, the ABNF expressions

1828      A = "a" "b" / "a"    (1)

1830    and

1832      A = "a" / "a" "b"    (2)

1834    are equivalent in ABNF's original generative framework, but very
1835    different in PEG: In (2), the second alternative will never match, as
1836    any input string starting with an "a" will already succeed in the
1837    first alternative, locking in the match.

1839    Similarly, the occurrence indicators ("?", "*", "+") are "greedy" in
1840    PEG, i.e., they consume as much input as they match (and, as a
1841    consequence, "a* a" in PEG notation or "*a a" in CDDL syntax can
1842    never match anything, as all input matching "a" is already consumed
1843    by the initial "a*", leaving nothing to match the second "a").

1845    Incidentally, the grammar of the CDDL language itself, as written in
1846    ABNF in Appendix B, can be interpreted both in the generative
1847    framework on which RFC 5234 is based, and as a PEG.  This was made
1848    possible by ordering the choices in the grammar such that a
1849    successful match made on the left hand side of a "/" operator is
1850    always the intended match, instead of relying on the power of
1851    symmetrical choices (for example, note the sequence of alternatives
1852    in the rule for "uint", where the lone zero is behind the longer
1853    match alternatives that start with a zero).

1855    The syntax used for expressing the PEG component of CDDL is based on
1856    ABNF, interpreted in the obvious way with PEG semantics.  The ABNF
1857    convention of notating occurrence indicators before the controlled
1858    primary, and of allowing numeric values for minimum and maximum
1859    occurrence around a "*" sign, is copied.  While PEG is only about
1860    characters, CDDL has a richer set of elements, such as types and
1861    groups.
        Specifically, the following constructs map:

1863       +-------+-------+-------------------------------------------+
1864       | CDDL  | PEG   | Remark                                    |
1865       +-------+-------+-------------------------------------------+
1866       | "="   | "<-"  | /= and //= are abbreviations              |
1867       | "//"  | "/"   | prioritized choice                        |
1868       | "/"   | "/"   | prioritized choice, limited to types only |
1869       | "?" P | P "?" | zero or one                               |
1870       | "*" P | P "*" | zero or more                              |
1871       | "+" P | P "+" | one or more                               |
1872       | A B   | A B   | sequence                                  |
1873       | A, B  | A B   | sequence, comma is decoration only        |
1874       +-------+-------+-------------------------------------------+

1876    The literal notation and the use of square brackets, curly braces,
1877    tildes, ampersands, and hash marks are specific to CDDL and unrelated
1878    to the conventional PEG notation.  The DOT (".") is replaced by the
1879    unadorned "#" or its alias "any".  Also, CDDL does not provide the
1880    syntactic predicate operators NOT ("!") or AND ("&") from PEG,
1881    reducing expressiveness as well as complexity.

1883    For more details about PEG's theoretical foundation and interesting
1884    properties of the operators such as associativity and distributivity,
1885    the reader is referred to [PEG].

1887 Appendix B.  ABNF grammar

1889    This appendix is normative.

1891    The following is a formal definition of the CDDL syntax in Augmented
1892    Backus-Naur Form (ABNF, [RFC5234]).  Note that, as is defined in
1893    ABNF, the quote-delimited strings below are case-insensitive (while
1894    string values and names are case-sensitive in CDDL).
1896      cddl = S 1*(rule S)
1897      rule = typename [genericparm] S assignt S type
1898           / groupname [genericparm] S assigng S grpent

1900      typename = id
1901      groupname = id

1903      assignt = "=" / "/="
1904      assigng = "=" / "//="

1906      genericparm = "<" S id S *("," S id S ) ">"
1907      genericarg = "<" S type1 S *("," S type1 S ) ">"

1909      type = type1 *(S "/" S type1)

1911      type1 = type2 [S (rangeop / ctlop) S type2]
1912      ; space may be needed before the operator if type2 ends in a name

1914      type2 = value
1915            / typename [genericarg]
1916            / "(" S type S ")"
1917            / "{" S group S "}"
1918            / "[" S group S "]"
1919            / "~" S typename [genericarg]
1920            / "&" S "(" S group S ")"
1921            / "&" S groupname [genericarg]
1922            / "#" "6" ["." uint] "(" S type S ")"
1923            / "#" DIGIT ["." uint]                ; major/ai
1924            / "#"                                 ; any

1926      rangeop = "..." / ".."

1928      ctlop = "." id

1930      group = grpchoice *(S "//" S grpchoice)

1932      grpchoice = *(grpent optcom)

1934      grpent = [occur S] [memberkey S] type
1935             / [occur S] groupname [genericarg]  ; preempted by above
1936             / [occur S] "(" S group S ")"

1938      memberkey = type1 S ["^" S] "=>"
1939                / bareword S ":"
1940                / value S ":"

1942      bareword = id

1944      optcom = S ["," S]

1946      occur = [uint] "*" [uint]
1947            / "+"
1948            / "?"

1950      uint = DIGIT1 *DIGIT
1951           / "0x" 1*HEXDIG
1952           / "0b" 1*BINDIG
1953           / "0"

1955      value = number
1956            / text
1957            / bytes

1959      int = ["-"] uint

1961      ; This is a float if it has fraction or exponent; int otherwise
1962      number = hexfloat / (int ["." fraction] ["e" exponent ])
1963      hexfloat = "0x" 1*HEXDIG ["."
                 1*HEXDIG] "p" exponent
1964      fraction = 1*DIGIT
1965      exponent = ["+"/"-"] 1*DIGIT

1967      text = %x22 *SCHAR %x22
1968      SCHAR = %x20-21 / %x23-5B / %x5D-7E / %x80-10FFFD / SESC
1969      SESC = "\" (%x20-7E / %x80-10FFFD)

1971      bytes = [bsqual] %x27 *BCHAR %x27
1972      BCHAR = %x20-26 / %x28-5B / %x5D-10FFFD / SESC / CRLF
1973      bsqual = "h" / "b64"

1975      id = EALPHA *(*("-" / ".") (EALPHA / DIGIT))
1976      ALPHA = %x41-5A / %x61-7A
1977      EALPHA = ALPHA / "@" / "_" / "$"
1978      DIGIT = %x30-39
1979      DIGIT1 = %x31-39
1980      HEXDIG = DIGIT / "A" / "B" / "C" / "D" / "E" / "F"
1981      BINDIG = %x30-31
1982      S = *WS
1983      WS = SP / NL
1984      SP = %x20
1985      NL = COMMENT / CRLF
1986      COMMENT = ";" *PCHAR CRLF
1987      PCHAR = %x20-7E / %x80-10FFFD
1988      CRLF = %x0A / %x0D.0A

1990                           Figure 13: CDDL ABNF

1992    Note that this ABNF does not attempt to reflect the detailed rules of
1993    what can be in a prefixed byte string.

1995 Appendix C.  Matching rules

1997    This appendix is normative.

1999    In this appendix, we go through the ABNF syntax rules defined in
2000    Appendix B and briefly describe the matching semantics of each
2001    syntactic feature.  In this context, an instance (data item)
2002    "matches" a CDDL specification if it is allowed by the CDDL
2003    specification; this is then broken down to parts of specifications
2004    (type and group expressions) and parts of instances (data items).

2006      cddl = S 1*(rule S)

2008    A CDDL specification is a sequence of one or more rules.  Each rule
2009    gives a name to a right hand side expression, either a CDDL type or a
2010    CDDL group.  Rule names can be used in the rule itself and/or other
2011    rules (and tools can output warnings if that is not the case).  The
2012    order of the rules is significant only in two cases:

2014    1.
        The first rule defines the semantics of the entire specification;
2015        hence, there is no need to give that root rule a special name or
2016        special syntax in the language (as, e.g., with "start" in
2017        Relax-NG); its name can therefore be chosen to be descriptive.  (As
2018        with all other rule names, the name of the initial rule may be
2019        used in itself or in other rules.)

2021    2.  Where a rule contributes to a type or group choice (using "/=" or
2022        "//="), that choice is populated in the order the rules are
2023        given; see below.

2025      rule = typename [genericparm] S assignt S type
2026           / groupname [genericparm] S assigng S grpent

2028      typename = id
2029      groupname = id

2030    A rule defines a name for a type expression (production "type") or
2031    for a group expression (production "grpent"), with the intention that
2032    the semantics does not change when the name is replaced by its
2033    (parenthesized if needed) definition.  Note that whether the name
2034    defined by a rule stands for a type or a group isn't always
2035    determined by syntax alone: e.g., "a = b" can make "a" a type if "b"
2036    is a type, or a group if "b" is a group.  More subtly, in "a = (b)",
2037    "a" may be used as a type if "b" is a type, or as a group both when
2038    "b" is a group and when "b" is a type (a good convention to make the
2039    latter case stand out to the human reader is to write "a = (b,)").
2040    (Note that the same dual meaning of parentheses applies within an
2041    expression, but often can be resolved by the context of the
2042    parenthesized expression.  On the more general point, it may not be
2043    clear immediately either whether "b" stands for a group or a type --
2044    this semantic processing may need to span several levels of rule
2045    definitions before a determination can be made.)
2047      assignt = "=" / "/="
2048      assigng = "=" / "//="

2050    A plain equals sign defines the rule name as the equivalent of the
2051    expression to the right; it is an error if the name already was
2052    defined with a different expression.  A "/=" or "//=" extends a named
2053    type or a group by additional choices; a number of these could be
2054    replaced by collecting all the right hand sides and creating a single
2055    rule with a type choice or a group choice built from the right hand
2056    sides in the order of the rules given.  (It is not an error to extend
2057    a rule name that has not yet been defined; this makes the right hand
2058    side the first entry in the choice being created.)

2060      genericparm = "<" S id S *("," S id S ) ">"
2061      genericarg = "<" S type1 S *("," S type1 S ) ">"

2063    Rule names can have generic parameters, which cause temporary
2064    assignments within the right hand sides to the parameter names from
2065    the arguments given when citing the rule name.

2067      type = type1 *(S "/" S type1)

2069    A type can be given as a choice between one or more types.  The
2070    choice matches a data item if the data item matches any one of the
2071    types given in the choice.  The choice uses Parsing Expression
2072    Grammar semantics as discussed in Appendix A: The first choice that
2073    matches wins.  (As a result, the order of rules that contribute to a
2074    single rule name can very well matter.)

2076      type1 = type2 [S (rangeop / ctlop) S type2]

2077    Two types can be combined with a range operator (see below) or
2078    a control operator (see Section 3.8).
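The way "/=" and "//=" rules accumulate into a single ordered choice, as described for "assignt" and "assigng" above, can be sketched as follows. This is an illustration under simplifying assumptions: rules are given as hypothetical (name, operator, right-hand side) tuples, right-hand sides are treated as opaque strings, and the function name `collect_choices` is not taken from any CDDL tool.

```python
def collect_choices(rules):
    """Merge "=", "/=" and "//=" rules into one ordered choice per name."""
    choices = {}
    for name, op, rhs in rules:
        if op == "=":
            # Redefinition with a different expression is an error.
            if name in choices and choices[name] != [rhs]:
                raise ValueError(f"{name} already defined differently")
            choices[name] = [rhs]
        elif op in ("/=", "//="):
            # Extending an as yet undefined name creates the first entry.
            choices.setdefault(name, []).append(rhs)
        else:
            raise ValueError(f"unknown assignment operator {op!r}")
    return choices


rules = [
    ("color", "=", '"red"'),
    ("color", "/=", '"green"'),
    ("shade", "/=", '"dark"'),
    ("color", "/=", '"blue"'),
]
print(collect_choices(rules))
```

Because the resulting lists preserve rule order and the choice is prioritized, reordering the contributing rules can change which alternative matches first.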
2080      type2 = value

2082    A type can be just a single value (such as 1 or "icecream" or
2083    h'0815'), which matches only a data item with that specific value (no
2084    conversions defined),

2086            / typename [genericarg]

2088    or be defined by a rule giving a meaning to a name (possibly after
2089    supplying generic arguments as required by the generic parameters),

2091            / "(" S type S ")"

2093    or be defined in a parenthesized type expression (parentheses may be
2094    necessary to override some operator precedence), or

2096            / "{" S group S "}"

2098    a map expression, which matches a valid CBOR map the key/value pairs
2099    of which can be ordered in such a way that the resulting sequence
2100    matches the group expression, or

2102            / "[" S group S "]"

2104    an array expression, which matches a CBOR array the elements of
2105    which, when taken as values and complemented by a wildcard (matches
2106    anything) key each, match the group, or

2108            / "~" S typename [genericarg]

2110    an "unwrapped" group (see Section 3.7), which matches the group
2111    inside a type defined as a map or an array by wrapping the group, or

2113            / "&" S "(" S group S ")"
2114            / "&" S groupname [genericarg]

2116    an enumeration expression, which matches any value that is within
2117    the set of values that the values of the group given can take, or

2119            / "#" "6" ["." uint] "(" S type S ")"

2121    a tagged data item, tagged with the "uint" given and containing the
2122    type given as the tagged value, or

2124            / "#" DIGIT ["." uint]                ; major/ai

2126    a data item of a major type (given by the DIGIT), optionally
2127    constrained to the additional information given by the uint, or

2129            / "#"                                 ; any

2131    any data item.

2133      rangeop = "..." / ".."
2135    A range operator can be used to join two type expressions that stand
2136    for either two integer values or two floating point values; it
2137    matches any value that is between the two values, where the first
2138    value is always included in the matching set and the second value is
2139    included for ".." and excluded for "...".

2141      ctlop = "." id

2143    A control operator ties a _target_ type to a _controller_ type as
2144    defined in Section 3.8.  Note that control operators are an extension
2145    point for CDDL; additional documents may want to define additional
2146    control operators.

2148      group = grpchoice *(S "//" S grpchoice)

2150    A group matches any sequence of key/value pairs that matches any of
2151    the choices given (again using Parsing Expression Grammar semantics).

2153      grpchoice = *(grpent optcom)

2155    Each of the component groups is given as a sequence of group entries.
2156    For a match, the sequence of key/value pairs given needs to match the
2157    sequence of group entries in the order given.

2159      grpent = [occur S] [memberkey S] type

2161    A group entry can be given by a value type, which needs to be matched
2162    by the value part of a single element, and optionally a memberkey
2163    type, which needs to be matched by the key part of the element, if
2164    the memberkey is given.  If the memberkey is not given, the entry can
2165    only be used for matching arrays, not for maps.  (See below for how
2166    that is modified by the occurrence indicator.)

2168             / [occur S] groupname [genericarg]  ; preempted by above

2170    A group entry can be built from a named group, or

2172             / [occur S] "(" S group S ")"

2174    from a parenthesized group, again with a possible occurrence
2175    indicator.
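The inclusive and exclusive forms of the range operator described for "rangeop" earlier in this appendix can be sketched as a simple predicate. The function name `in_range` and its signature are illustrative assumptions; the sketch takes the two bounds as already evaluated numbers.

```python
def in_range(value, low, high, op):
    """Match `value` against `low op high`, where op is ".." or "..."."""
    if op == "..":
        return low <= value <= high   # second value included
    if op == "...":
        return low <= value < high    # second value excluded
    raise ValueError(f"unknown range operator {op!r}")


print(in_range(10, 0, 10, ".."))   # upper bound with the inclusive form
print(in_range(10, 0, 10, "..."))  # upper bound with the exclusive form
```

The lower bound is accepted by both forms; only the treatment of the upper bound differs.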
2177      memberkey = type1 S ["^" S] "=>"
2178                / bareword S ":"
2179                / value S ":"

2181    Key types can be given by a type expression, a bareword (which stands
2182    for a type that just contains a string value created from this
2183    bareword), or a value (which stands for a type that just contains
2184    this value).  A key value matches its key type if the key value is a
2185    member of the key type, unless a cut preceding it in the group
2186    applies (see Section 3.5.4 for how map matching is influenced by the
2187    presence of the cuts denoted by "^" or ":" in previous entries).

2189      bareword = id

2191    A bareword is an alternative way to write a type with a single text
2192    string value; it can only be used in the syntactic context given
2193    above.

2195      optcom = S ["," S]

2197    (Optional commas do not influence the matching.)

2199      occur = [uint] "*" [uint]
2200            / "+"
2201            / "?"

2203    An occurrence indicator modifies the group given to its right by
2204    requiring the group to match the sequence to be matched exactly for a
2205    certain number of times (see Section 3.2) in sequence, i.e., it acts
2206    as a (possibly infinite) group choice that contains one choice for
2207    each permitted number of occurrences, with the group repeated that
        many times.

2209    The rest of the ABNF describes syntax for value notation that should
2210    be familiar from programming languages, with the possible exception
2211    of h'..' and b64'..' for byte strings, as well as syntactic elements
2212    such as comments and line ends.

2214 Appendix D.  Standard Prelude

2216    This appendix is normative.

2218    The following prelude is automatically added to each CDDL file.
2219    (Note that technically, it is a postlude, as it does not disturb the
2220    selection of the first rule as the root of the definition.)
2221      any = #

2223      uint = #0
2224      nint = #1
2225      int = uint / nint

2227      bstr = #2
2228      bytes = bstr
2229      tstr = #3
2230      text = tstr

2232      tdate = #6.0(tstr)
2233      time = #6.1(number)
2234      number = int / float
2235      biguint = #6.2(bstr)
2236      bignint = #6.3(bstr)
2237      bigint = biguint / bignint
2238      integer = int / bigint
2239      unsigned = uint / biguint
2240      decfrac = #6.4([e10: int, m: integer])
2241      bigfloat = #6.5([e2: int, m: integer])
2242      eb64url = #6.21(any)
2243      eb64legacy = #6.22(any)
2244      eb16 = #6.23(any)
2245      encoded-cbor = #6.24(bstr)
2246      uri = #6.32(tstr)
2247      b64url = #6.33(tstr)
2248      b64legacy = #6.34(tstr)
2249      regexp = #6.35(tstr)
2250      mime-message = #6.36(tstr)
2251      cbor-any = #6.55799(any)

2253      float16 = #7.25
2254      float32 = #7.26
2255      float64 = #7.27
2256      float16-32 = float16 / float32
2257      float32-64 = float32 / float64
2258      float = float16-32 / float64

2260      false = #7.20
2261      true = #7.21
2262      bool = false / true
2263      nil = #7.22
2264      null = nil
2265      undefined = #7.23

2267                          Figure 14: CDDL Prelude

2269    Note that the prelude is deemed to be fixed.  This means, for
2270    instance, that additional tags beyond [RFC7049], as registered, need
2271    to be defined in each CDDL file that is using them.

2273    A common stumbling point is that the prelude does not define a type
2274    "string".  CBOR has byte strings ("bytes" in the prelude) and text
2275    strings ("text"), so a type that is simply called "string" would be
2276    ambiguous.

2278 Appendix E.  Use with JSON

2280    This appendix is normative.

2282    The JSON generic data model (implicit in [RFC8259]) is a subset of
2283    the generic data model of CBOR.  So one can use CDDL with JSON by
2284    limiting oneself to what can be represented in JSON.
        Roughly
2285    speaking, this means leaving out byte strings, tags, and simple
2286    values other than "false", "true", and "null", leading to the
2287    following limited prelude:

2289      any = #

2291      uint = #0
2292      nint = #1
2293      int = uint / nint

2295      tstr = #3
2296      text = tstr

2298      number = int / float

2300      float16 = #7.25
2301      float32 = #7.26
2302      float64 = #7.27
2303      float16-32 = float16 / float32
2304      float32-64 = float32 / float64
2305      float = float16-32 / float64

2307      false = #7.20
2308      true = #7.21
2309      bool = false / true
2310      nil = #7.22
2311      null = nil

2313             Figure 15: JSON compatible subset of CDDL Prelude

2315    (The major types given here do not have a direct meaning in JSON, but
2316    they can be interpreted as CBOR major types translated through
2317    Section 4 of [RFC7049].)

2319    There are a few fine points in using CDDL with JSON.  First, JSON
2320    does not distinguish between integers and floating point numbers;
2321    there is only one kind of number (which may happen to be integral).
2322    In this context, specifying a type as "uint", "nint" or "int" then
2323    becomes a predicate that the number be integral.  As an example, this
2324    means that the following JSON numbers all match "uint":

2326      10 10.0 1e1 1.0e1 100e-1

2328    (The fact that these are all integers may be surprising to users
2329    accustomed to the long tradition in programming languages of using
2330    decimal points or exponents in a number to indicate a floating point
2331    literal.)

2333    CDDL distinguishes the various CBOR number types, but there is only
2334    one number type in JSON.  The effect of specifying a floating point
2335    precision (float16/float32/float64) is only to restrict the set of
2336    permissible values to those expressible with binary16/binary32/
2337    binary64; this is unlikely to be very useful when using CDDL for
2338    specifying JSON data structures.
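The integrality predicate that "uint" becomes in a JSON context, as described above for numbers such as 10.0 and 100e-1, can be sketched with exact decimal arithmetic. This is only an illustration: the function name is hypothetical, the input is assumed to be a syntactically valid JSON number literal, and the upper bound reflects the 64-bit range of CBOR unsigned integers.

```python
from decimal import Decimal

UINT_MAX = 2**64 - 1  # largest value of CBOR major type 0


def json_number_matches_uint(literal):
    """True if a JSON number literal denotes an integer in 0..2**64-1."""
    value = Decimal(literal)  # exact conversion, no binary rounding
    return 0 <= value <= UINT_MAX and value == value.to_integral_value()


for literal in ["10", "10.0", "1e1", "1.0e1", "100e-1", "10.5", "-1"]:
    print(literal, json_number_matches_uint(literal))
```

Using `Decimal` rather than `float` keeps the check independent of how a particular JSON parser rounds numbers.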
2340    Fundamentally, the number system of JSON itself is based on decimal
2341    numbers and decimal fractions and does not have limits to its
2342    precision or range.  In practice, JSON numbers are often parsed into
2343    a number type that is called float64 here, which imposes a number of
2344    limitations on the generic data model [RFC7493].  In particular, this
2345    means that integers can only be expressed with interoperable
2346    exactness when they lie in the range [-(2**53)+1, (2**53)-1] -- a
2347    smaller range than that covered by CDDL "int".

2349    JSON applications that want to stay compatible with I-JSON
2350    ([RFC7493], "Internet JSON") therefore may want to define integer
2351    types with more limited ranges, such as in Figure 16.  Note that the
2352    types given here are not part of the prelude; they need to be copied
2353    into the CDDL specification if needed.

2355      ij-uint = 0..9007199254740991
2356      ij-nint = -9007199254740991..-1
2357      ij-int = -9007199254740991..9007199254740991

2359          Figure 16: I-JSON types for CDDL (not part of prelude)

2361    JSON applications that do not need to stay compatible with I-JSON and
2362    that actually may need to go beyond the 64-bit unsigned and negative
2363    integers supported by "int" (= "uint"/"nint") may want to use the
2364    following additional types from the standard prelude, which are
2365    expressed in terms of tags but can straightforwardly be mapped into
2366    JSON (but not I-JSON) numbers:

2368      biguint = #6.2(bstr)
2369      bignint = #6.3(bstr)
2370      bigint = biguint / bignint
2371      integer = int / bigint
2372      unsigned = uint / biguint

2374    CDDL at this point does not have a way to express the unlimited
2375    floating point precision that is theoretically possible with JSON; at
2376    the time of writing, this is rarely used in protocols in practice.
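The reason for the I-JSON integer range discussed above can be checked directly: integers beyond 2**53 - 1 are not all preserved when parsed into a float64. The sketch below assumes that Python's `float` is an IEEE 754 binary64 value, as it is on common platforms; the function names are illustrative only.

```python
I_JSON_MAX = 2**53 - 1  # 9007199254740991, the bound used for ij-int


def survives_float64(n):
    """True if integer n is unchanged by a round trip through float64."""
    return int(float(n)) == n


def in_ij_int(n):
    """True if n lies in the ij-int range of Figure 16."""
    return -I_JSON_MAX <= n <= I_JSON_MAX


print(survives_float64(I_JSON_MAX), in_ij_int(I_JSON_MAX))
print(survives_float64(2**53 + 1), in_ij_int(2**53 + 1))
```

Note that 2**53 itself survives the round trip, but 2**53 + 1 is rounded to the same float64 value, so a receiver could no longer tell the two apart; this is why the interoperable range stops at 2**53 - 1.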
2378    Note that a data model described in CDDL is always restricted by what
2379    can be expressed in the serialization; e.g., floating point values
2380    such as NaN (not a number) and the infinities cannot be represented
2381    in JSON even if they are allowed in the CDDL generic data model.

2383 Appendix F.  A CDDL tool

2385    This appendix is for information only.

2387    A rough CDDL tool is available.  For CDDL specifications, it can
2388    check the syntax, generate one or more instances (expressed in CBOR
2389    diagnostic notation or in pretty-printed JSON), and validate an
2390    existing instance against the specification:

2392      Usage:
2393      cddl spec.cddl generate [n]
2394      cddl spec.cddl json-generate [n]
2395      cddl spec.cddl validate instance.cbor
2396      cddl spec.cddl validate instance.json

2398                        Figure 17: CDDL tool usage

2400    Install on a system with a modern Ruby via:

2402      gem install cddl

2404                    Figure 18: CDDL tool installation

2406    The accompanying CBOR diagnostic tools (which are automatically
2407    installed by the above) are described in
2408    https://github.com/cabo/cbor-diag [1]; they can be used to convert
2409    between binary CBOR, a pretty-printed form of that, CBOR diagnostic
2410    notation, JSON, and YAML.

2412 Appendix G.  Extended Diagnostic Notation

2414    This appendix is normative.

2416    Section 6 of [RFC7049] defines a "diagnostic notation" in order to be
2417    able to converse about CBOR data items without having to resort to
2418    binary data.  Diagnostic notation is based on JSON, with extensions
2419    for representing CBOR constructs such as binary data and tags.

2421    (Standardizing this together with the actual interchange format does
2422    not serve to create another interchange format, but enables the use
2423    of a shared diagnostic notation in tools for and documents about
2424    CBOR.)

2426    This section discusses a few extensions to the diagnostic notation
2427    that have turned out to be useful since RFC 7049 was written.
   We refer to the result as extended diagnostic notation (EDN).

G.1.  White space in byte string notation

   Examples often benefit from some white space (spaces, line breaks) in
   byte strings.  In extended diagnostic notation, white space is
   ignored in prefixed byte strings; for instance, the following are
   equivalent:

   h'48656c6c6f20776f726c64'
   h'48 65 6c 6c 6f 20 77 6f 72 6c 64'
   h'4 86 56c 6c6f
     20776 f726c64'

G.2.  Text in byte string notation

   Diagnostic notation notates byte strings in one of the [RFC4648] base
   encodings, enclosed in single quotes, prefixed by >h< for base16,
   >b32< for base32, >h32< for base32hex, >b64< for base64 or base64url.
   Quite often, byte strings carry bytes that are meaningfully
   interpreted as UTF-8 text.  Extended diagnostic notation allows the
   use of single quotes without a prefix to express byte strings with
   UTF-8 text; for instance, the following are equivalent:

   'hello world'
   h'68656c6c6f20776f726c64'

   The escaping rules of JSON strings are applied equivalently for text-
   based byte strings, e.g., "\\" stands for a single backslash and "\'"
   stands for a single quote.  White space is included literally, i.e.,
   the previous section does not apply to text-based byte strings.

G.3.  Embedded CBOR and CBOR sequences in byte strings

   Where a byte string is to carry an embedded CBOR-encoded item, or
   more generally a sequence of zero or more such items, the diagnostic
   notation for these zero or more CBOR data items, separated by
   commas, can be enclosed in << and >> to notate the byte string
   resulting from encoding the data items and concatenating the result.
   For instance, in each of the following lines, the left and right
   columns are equivalent:

   <<1>>              h'01'
   <<1, 2>>           h'0102'
   <<"foo", null>>    h'63666F6FF6'
   <<>>               h''

G.4.  Concatenated Strings

   While the ability to include white space enables line-breaking of
   encoded byte strings, a mechanism is needed to be able to include
   text strings as well as byte strings in direct UTF-8 representation
   into line-based documents (such as RFCs and source code).

   We extend the diagnostic notation by allowing multiple text strings
   or multiple byte strings to be notated separated by white space;
   these are then concatenated into a single text or byte string,
   respectively.  Text strings and byte strings do not mix within such a
   concatenation, except that byte string notation can be used inside a
   sequence of concatenated text string notation to encode characters
   that may be better represented in an encoded way.  The following four
   values are equivalent:

   "Hello world"
   "Hello " "world"
   "Hello" h'20' "world"
   "" h'48656c6c6f20776f726c64' ""

   Similarly, the following byte string values are equivalent:

   'Hello world'
   'Hello ' 'world'
   'Hello ' h'776f726c64'
   'Hello' h'20' 'world'
   '' h'48656c6c6f20776f726c64' '' b64''
   h'4 86 56c 6c6f' h' 20776 f726c64'

   (Note that the approach of separating by white space, while familiar
   from the C language, requires some attention - a single comma makes a
   big difference here.)

G.5.  Hexadecimal, octal, and binary numbers

   In addition to JSON's decimal numbers, EDN provides hexadecimal,
   octal, and binary numbers in the usual C-language notation (octal
   only with the 0o prefix present).

   The following are equivalent:

   4711
   0x1267
   0o11147
   0b1001001100111

   As are:

   1.5
   0x1.8p0
   0x18p-4

G.6.  Comments

   Longer pieces of diagnostic notation may benefit from comments.  JSON
   famously does not provide for comments, and basic RFC 7049 diagnostic
   notation inherits this property.

   In extended diagnostic notation, comments can be included, delimited
   by slashes ("/").  Any text within and including a pair of slashes is
   considered a comment.

   Comments are considered white space.  Hence, they are allowed in
   prefixed byte strings; for instance, the following are equivalent:

   h'68656c6c6f20776f726c64'
   h'68 65 6c /doubled l!/ 6c 6f /hello/
     20 /space/
     77 6f 72 6c 64' /world/

   This can be used to annotate a CBOR structure as in:

   /grasp-message/ [/M_DISCOVERY/ 1, /session-id/ 10584416,
                    /objective/ [/objective-name/ "opsonize",
                                 /D, N, S/ 7, /loop-count/ 105]]

   (There are currently no end-of-line comments.  If we want to add
   them, "//" sounds like a reasonable delimiter given that we already
   use slashes for comments, but we also could go e.g. for "#".)

Appendix H.  Examples

   This appendix is for information only.

   This section contains a few examples of structures defined using
   CDDL.

   The theme for the first example is taken from [RFC7071], which
   defines certain JSON structures in English.  For a similar example,
   it may also be of interest to examine Appendix A of [RFC8007], which
   contains a CDDL definition for a JSON structure defined in the main
   body of the RFC.

   The second subsection in this appendix translates examples from
   [I-D.newton-json-content-rules] into CDDL.

   These examples all happen to describe data that is interchanged in
   JSON.  Examples for CDDL definitions of data that is interchanged in
   CBOR can be found in [RFC8152], [I-D.ietf-anima-grasp], or [RFC8428].

H.1.  RFC 7071

   [RFC7071] defines the Reputon structure for JSON using somewhat
   formalized English text.
   Here is a (somewhat verbose) equivalent definition using the same
   terms, but notated in CDDL:

   reputation-object = {
     reputation-context,
     reputon-list
   }

   reputation-context = (
     application: text
   )

   reputon-list = (
     reputons: reputon-array
   )

   reputon-array = [* reputon]

   reputon = {
     rater-value,
     assertion-value,
     rated-value,
     rating-value,
     ? conf-value,
     ? normal-value,
     ? sample-value,
     ? gen-value,
     ? expire-value,
     * ext-value,
   }

   rater-value = ( rater: text )
   assertion-value = ( assertion: text )
   rated-value = ( rated: text )
   rating-value = ( rating: float16 )
   conf-value = ( confidence: float16 )
   normal-value = ( normal-rating: float16 )
   sample-value = ( sample-size: uint )
   gen-value = ( generated: uint )
   expire-value = ( expires: uint )
   ext-value = ( text => any )

   An equivalent, more compact form of this example would be:

   reputation-object = {
     application: text
     reputons: [* reputon]
   }

   reputon = {
     rater: text
     assertion: text
     rated: text
     rating: float16
     ? confidence: float16
     ? normal-rating: float16
     ? sample-size: uint
     ? generated: uint
     ? expires: uint
     * text => any
   }

   Note how this rather clearly delineates the structure somewhat
   shrouded by so many words in Section 6.2.2 of [RFC7071].  Also, this
   definition makes it clear that several ext-values are allowed (by
   definition with different member names); RFC 7071 could be read to
   forbid the repetition of ext-value ("A specific reputon-element MUST
   NOT appear more than once" is ambiguous.)

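   To illustrate what the compact definition above accepts, the
   following Python sketch checks a decoded JSON value by hand.  It is
   not the CDDL tool and does not interpret CDDL; all function and
   variable names are hypothetical.  As simplifying assumptions, a
   "float16" member is accepted whenever the JSON parser yields a
   floating point value (float16 precision is not checked), and "uint"
   is taken to mean a non-negative integer.

```python
import json

# Hypothetical hand-written checker mirroring the compact "reputon" rule.
# Assumption: float16 members are accepted as any parsed float value.
REQUIRED = {"rater": str, "assertion": str, "rated": str, "rating": float}
OPTIONAL = {"confidence": float, "normal-rating": float,
            "sample-size": int, "generated": int, "expires": int}


def matches(value, expected):
    """Check one member value against its expected Python type."""
    if isinstance(value, bool):      # bool is a subclass of int in Python
        return False
    if expected is int:              # CDDL uint: non-negative integer
        return isinstance(value, int) and value >= 0
    return isinstance(value, expected)


def is_reputon(obj):
    """Check required and optional members; other members are 'any'."""
    if not isinstance(obj, dict):
        return False
    for name, expected in REQUIRED.items():
        if name not in obj or not matches(obj[name], expected):
            return False
    for name, expected in OPTIONAL.items():
        if name in obj and not matches(obj[name], expected):
            return False
    return True  # remaining members fall under "* text => any"


def is_reputation_object(obj):
    """Check the top-level map: exactly 'application' and 'reputons'."""
    return (isinstance(obj, dict)
            and set(obj) == {"application", "reputons"}
            and isinstance(obj["application"], str)
            and isinstance(obj["reputons"], list)
            and all(is_reputon(r) for r in obj["reputons"]))


def check_json_text(text):
    """Parse JSON text and check it against the reputation-object rule."""
    return is_reputation_object(json.loads(text))
```

   Because JSON member names are always text strings, the "* text =>
   any" entry needs no explicit check here: any member that is not
   named in the rule is accepted with any value.
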
   The CDDL tool reported on in Appendix F generates as one example:

   {
     "application": "conchometry",
     "reputons": [
       {
         "rater": "Ephthianura",
         "assertion": "codding",
         "rated": "sphaerolitic",
         "rating": 0.34133473256800795,
         "confidence": 0.9481983064298332,
         "expires": 1568,
         "unplaster": "grassy"
       },
       {
         "rater": "nonchargeable",
         "assertion": "raglan",
         "rated": "alienage",
         "rating": 0.5724646875815566,
         "sample-size": 3514,
         "Aldebaran": "unchurched",
         "puruloid": "impersonable",
         "uninfracted": "pericarpoidal",
         "schorl": "Caro"
       },
       {
         "rater": "precollectable",
         "assertion": "Merat",
         "rated": "thermonatrite",
         "rating": 0.19164006323936977,
         "confidence": 0.6065252103391268,
         "normal-rating": 0.5187773690879303,
         "generated": 899,
         "speedy": "solidungular",
         "noviceship": "medicine",
         "checkrow": "epidictic"
       }
     ]
   }

H.2.  Examples from JSON Content Rules

   Although JSON Content Rules [I-D.newton-json-content-rules] seems to
   address a more general problem than CDDL, it is still a worthwhile
   resource to explore for examples (beyond all the inspiration the
   format itself has had for CDDL).

   Figure 2 of the JCR I-D looks very similar, if slightly less noisy,
   in CDDL:

   root = [2*2 {
     precision: text,
     Latitude: float,
     Longitude: float,
     Address: text,
     City: text,
     State: text,
     Zip: text,
     Country: text
   }]

                     Figure 19: JCR, Figure 2, in CDDL

   Apart from the lack of a need to quote the member names, text strings
   are called "text" or "tstr" in CDDL ("string" would be ambiguous as
   CBOR also provides byte strings).

   The CDDL tool reported on in Appendix F creates the below example
   instance for this:

   [{"precision": "pyrosphere", "Latitude": 0.5399712314350172,
     "Longitude": 0.5157523963028087, "Address": "resow",
     "City": "problemwise", "State": "martyrlike", "Zip": "preprove",
     "Country": "Pace"},
    {"precision": "unrigging", "Latitude": 0.10422704368372193,
     "Longitude": 0.6279808663725834, "Address": "picturedom",
     "City": "decipherability", "State": "autometry", "Zip": "pout",
     "Country": "wimple"}]

   Figure 4 of the JCR I-D in CDDL:

   root = { image }

   image = (
     Image: {
       size,
       Title: text,
       thumbnail,
       IDs: [* int]
     }
   )

   size = (
     Width: 0..1280
     Height: 0..1024
   )

   thumbnail = (
     Thumbnail: {
       size,
       Url: ~uri
     }
   )

   This shows how the group concept can be used to keep related elements
   (here: width, height) together, and to emulate the JCR style of
   specification.  (It also shows referencing a type by unwrapping a tag
   from the prelude, "uri" - this could be done differently.)  The more
   compact form of Figure 5 of the JCR I-D could be emulated like this:

   root = {
     Image: {
       size, Title: text,
       Thumbnail: { size, Url: ~uri },
       IDs: [* int]
     }
   }

   size = (
     Width: 0..1280,
     Height: 0..1024,
   )

   The CDDL tool reported on in Appendix F creates the below example
   instance for this:

   {"Image": {"Width": 566, "Height": 516, "Title": "leisterer",
     "Thumbnail": {"Width": 1111, "Height": 176, "Url": 32("scrog")},
     "IDs": []}}

Contributors

   CDDL was originally conceived by Bert Greevenbosch, who also wrote
   the original five versions of this document.

Acknowledgements

   Inspiration was taken from the C and Pascal languages, MPEG's
   conventions for describing structures in the ISO base media file
   format, Relax-NG and its compact syntax [RELAXNG], and in particular
   from Andrew Lee Newton's "JSON Content Rules"
   [I-D.newton-json-content-rules].

   Lots of highly useful feedback came from members of the IETF CBOR WG,
   in particular Ari Keraenen, Brian Carpenter, Burt Harris, Jeffrey
   Yasskin, Jim Hague, Jim Schaad, Joe Hildebrand, Max Pritikin, Michael
   Richardson, Pete Cordell, Sean Leonard, and Yaron Sheffer.  Also,
   Francesca Palombini and Joe volunteered to chair the WG when it was
   created, providing the framework for generating and processing this
   feedback; with Barry Leiba having taken over from Joe since.  Chris
   Lonvick and Ines Robles provided additional reviews during IESG
   processing, and Alexey Melnikov steered the process as the
   responsible area director.

   The CDDL tool reported on in Appendix F was written by Carsten
   Bormann, building on previous work by Troy Heninger and Tom Lord.

Authors' Addresses

   Henk Birkholz
   Fraunhofer SIT
   Rheinstrasse 75
   Darmstadt  64295
   Germany

   Email: henk.birkholz@sit.fraunhofer.de

   Christoph Vigano
   Universitaet Bremen

   Email: christoph.vigano@uni-bremen.de

   Carsten Bormann
   Universitaet Bremen TZI
   Bibliothekstr. 1
   Bremen  D-28359
   Germany

   Phone: +49-421-218-63921
   Email: cabo@tzi.org