Network Working Group                                         C. Bormann
Internet-Draft                                    Universität Bremen TZI
Intended status: Informational                              31 July 2021
Expires: 1 February 2022


                 Additional Control Operators for CDDL
                    draft-ietf-cbor-cddl-control-05

Abstract

   The Concise Data Definition Language (CDDL), standardized in RFC
   8610, provides "control operators" as its main language extension
   point.

   The present document defines a number of control operators that did
   not make it into RFC 8610: ".plus", ".cat", and ".det" for the
   construction of constants, ".abnf"/".abnfb" for including ABNF (RFC
   5234/RFC 7405) in CDDL specifications, and ".feature" for indicating
   the use of a non-basic feature in an instance.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 1 February 2022.

Copyright Notice

   Copyright (c) 2021 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Terminology
   2.  Computed Literals
     2.1.  Numeric Addition
     2.2.  String Concatenation
     2.3.  String Concatenation with Dedenting
   3.  Embedded ABNF
   4.  Features
   5.  IANA Considerations
   6.  Implementation Status
   7.  Security considerations
   8.  References
     8.1.  Normative References
     8.2.  Informative References
   Acknowledgements
   Author's Address

1.  Introduction

   The Concise Data Definition Language (CDDL), standardized in
   [RFC8610], provides "control operators" as its main language
   extension point.

   The present document defines a number of control operators that did
   not make it into RFC 8610:

         +==========+===========================================+
         | Name     | Purpose                                   |
         +==========+===========================================+
         | .plus    | Numeric addition                          |
         +----------+-------------------------------------------+
         | .cat     | String concatenation                      |
         +----------+-------------------------------------------+
         | .det     | String concatenation, pre-dedenting       |
         +----------+-------------------------------------------+
         | .abnf    | ABNF in CDDL (text strings)               |
         +----------+-------------------------------------------+
         | .abnfb   | ABNF in CDDL (byte strings)               |
         +----------+-------------------------------------------+
         | .feature | Detecting feature use in extension points |
         +----------+-------------------------------------------+

             Table 1: New control operators in this document

1.1.  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in
   BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

   This specification uses terminology from [RFC8610].  In particular,
   with respect to control operators, "target" refers to the left-hand
   side operand, and "controller" to the right-hand side operand.

2.  Computed Literals

   CDDL as defined in [RFC8610] does not have any mechanisms to compute
   literals.  As an 80 % solution, this specification adds three control
   operators: ".plus" for numeric addition, ".cat" for string
   concatenation, and ".det" for string concatenation with dedenting of
   both sides (target and controller).

   For these operators, as with all control operators, targets and
   controllers are types.  The resulting type is therefore formally a
   function of the elements of the cross-product of the two types.  Some
   tools may not be able to work with non-unique targets or controllers.

2.1.  Numeric Addition

   Specifications often need numbers that are defined relative to a base
   number.  The ".plus" control identifies a number that is constructed
   by adding the numeric values of the target and of the controller.

   Target and controller MUST be numeric.  If the target is a floating
   point number and the controller an integer number, or vice versa, the
   sum is converted into the type of the target; converting from a
   floating point number to an integer selects its floor (the largest
   integer less than or equal to the floating point number).

   interval<BASE> = (
     BASE => int             ; lower bound
     (BASE .plus 1) => int   ; upper bound
     ? (BASE .plus 2) => int ; tolerance
   )

   X = 0
   Y = 3
   rect = {
     interval<X>
     interval<Y>
   }

              Figure 1: Example: addition to a base value

   The example in Figure 1 contains the generic definition of a group
   "interval" that gives a lower and an upper bound and optionally a
   tolerance.  "rect" combines two of these groups into a map, one group
   for the X dimension and one for the Y dimension.

2.2.  String Concatenation

   It is often useful to be able to compose string literals out of
   component literals defined in different places in the specification.

   The ".cat" control identifies a string that is built from a
   concatenation of the target and the controller.  Target and
   controller MUST be strings.  The result of the operation has the type
   of the target.  The concatenation is performed on the bytes in both
   strings.  If the target is a text string, the result of that
   concatenation MUST be valid UTF-8.

   a = "foo" .cat '
     bar
     baz
   '
   ; on a system where the newline is \n, is the same string as:
   b = "foo\n  bar\n  baz\n"

        Figure 2: Example: concatenation of text and byte string

   The example in Figure 2 builds a text string named "a" by
   concatenating the target text string "foo" and the controller byte
   string, which is entered as a text-form byte string literal.  (This
   particular idiom is useful when the text string contains newlines,
   which, as shown in the example for "b", may be harder to read when
   entered in the format that the pure CDDL text string notation
   inherits from JSON.)

2.3.  String Concatenation with Dedenting

   Multi-line string literals for various applications, including
   embedded ABNF (Section 3), need to be set flush left, at least
   partially.  Often, having some indentation in the source code for the
   literal can promote readability, as in Figure 3.

   oid = bytes .abnfb ("oid" .det cbor-tags-oid)
   roid = bytes .abnfb ("roid" .det cbor-tags-oid)

   cbor-tags-oid = '
     oid = 1*arc
     roid = *arc
     arc = [nlsb] %x00-7f
     nlsb = %x81-ff *%x80-ff
   '

               Figure 3: Example: dedenting concatenation

   The control operator ".det" works like ".cat", except that both
   arguments (target and controller) are independently _dedented_ before
   the concatenation takes place.  For the purposes of this
   specification, we define dedenting as:

   1.  determining the smallest amount of left-most blank space (number
       of leading space characters) in all the non-blank lines, and

   2.  removing exactly that number of leading space characters from
       each line.  Blank lines (empty or consisting of blank space only)
       may have fewer leading space characters than this amount, or
       none, in which case all leading space is removed.

   (The name ".det" is short for "dedenting cat".  The perhaps more
   obvious name ".dedcat" was not chosen as it is longer and may evoke
   unpleasant images.)

   Occasionally, dedenting of only a single item is needed.  This can be
   achieved by using this operator with an empty string, e.g., """ .det
   rhs" or "lhs .det """, which can in turn be combined with a ".cat":
   in the construct "lhs .cat ("" .det rhs)", only "rhs" is dedented.

3.  Embedded ABNF

   Many IETF protocols define allowable values for their text strings in
   ABNF [RFC5234] [RFC7405].  It is often desirable to define a text
   string type in CDDL by employing existing ABNF embedded into the CDDL
   specification.  Without specific ABNF support in CDDL, that ABNF
   would usually need to be translated into a regular expression (if
   that is even possible).

   ABNF is added to CDDL in the same way that regular expressions were
   added: by defining a ".abnf" control operator.  The target is usually
   "text" or some restriction on it; the controller is the text of an
   ABNF specification.
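
   Because ABNF controllers are typically multi-line literals, they are
   often assembled with ".det" (Section 2.3).  As a non-normative
   illustration, the two dedenting steps of Section 2.3 might be
   sketched in Python as shown below.  The function names "dedent" and
   "det" are hypothetical and not part of any CDDL tool.  The sketch
   assumes that lines are separated by "\n", that only the space
   character counts as indentation, and that a string without any
   non-blank line has a margin of zero (a case this specification does
   not define).  It operates on Python text strings rather than on
   bytes.

```python
def dedent(s: str) -> str:
    """Dedent a string following the two steps of Section 2.3 (sketch)."""
    lines = s.split("\n")
    # Step 1: smallest number of leading spaces over all non-blank lines.
    # Assumption: a line is blank if it is empty or holds only spaces.
    margin = min(
        (len(ln) - len(ln.lstrip(" ")) for ln in lines if ln.strip(" ")),
        default=0,  # assumption: no non-blank lines means nothing to remove
    )
    # Step 2: remove that many leading characters from each line.
    # A blank line shorter than the margin simply becomes empty, because
    # slicing past the end of a Python string yields "".
    return "\n".join(ln[margin:] for ln in lines)


def det(target: str, controller: str) -> str:
    """Dedent both operands independently, then concatenate them."""
    return dedent(target) + dedent(controller)


# The literal of Figure 3, written here with a two-space indentation.
cbor_tags_oid = (
    "\n"
    "  oid = 1*arc\n"
    "  roid = *arc\n"
    "  arc = [nlsb] %x00-7f\n"
    "  nlsb = %x81-ff *%x80-ff\n"
)

# Element "oid", a newline, then the flush-left rule list.
oid_abnf = det("oid", cbor_tags_oid)
```

   In this sketch, "oid_abnf" starts with the element "oid" followed by
   a newline, and every rule line that follows is flush left, which is
   the controller layout that the ".abnf" and ".abnfb" operators expect.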

   There are several small issues, with solutions given here:

   *  ABNF can be used to define byte sequences as well as UTF-8 text
      strings interpreted as Unicode scalar sequences.  This
      specification therefore defines two control operators: ".abnfb"
      for ABNF denoting byte sequences and ".abnf" for ABNF denoting
      sequences of Unicode scalar values (code points) represented as
      UTF-8 text strings.  Both control operators can be applied to
      targets of either string type; the ABNF is applied to the sequence
      of bytes in the string, interpreting it either as a sequence of
      bytes (".abnfb") or as a sequence of code points represented as a
      UTF-8 text string (".abnf").  The controller string MUST be a text
      string.

   *  ABNF defines a list of rules, not a single expression (called
      "elements" in [RFC5234]).  This is resolved by requiring the
      controller string to be one valid "element", followed by zero or
      more valid "rule" instances separated from the element by a
      newline; so the controller string can be built by preceding a
      piece of valid ABNF by an "element" that selects from that ABNF
      and a newline.

   *  For the same reason, ABNF requires newlines; specifying newlines
      in CDDL text strings is tedious (and leads to essentially
      unreadable ABNF).  The workaround employs the ".cat" operator
      introduced in Section 2.2 and the syntax for text in byte strings.
      As is customary for ABNF, the syntax of ABNF itself (NOT the
      syntax expressed in ABNF!) is relaxed to allow a single linefeed
      as a newline:

      CRLF = %x0A / %x0D.0A

   *  One set of rules provided in an ABNF specification is often used
      in multiple positions, in particular staples such as DIGIT and
      ALPHA.  (Note that all rules referenced need to be defined in each
      ABNF operator controller string -- there is no implicit import of
      [RFC5234] Core ABNF or other rules.)  The composition this calls
      for can be provided by the ".cat" operator, and/or by ".det" if
      there is indentation to be disposed of.

   These points are combined into an example in Figure 4, which uses
   ABNF from [RFC3339] to specify one each of the CBOR tags defined in
   [RFC8943] and [RFC8949].

   ; for RFC 8943
   Tag1004 = #6.1004(text .abnf full-date)
   ; for RFC 8949
   Tag0 = #6.0(text .abnf date-time)

   full-date = "full-date" .cat rfc3339
   date-time = "date-time" .cat rfc3339

   ; Note the trick of idiomatically starting with a newline, separating
   ;   off the element in the concatenations above from the rule-list
   rfc3339 = '
      date-fullyear   = 4DIGIT
      date-month      = 2DIGIT  ; 01-12
      date-mday       = 2DIGIT  ; 01-28, 01-29, 01-30, 01-31 based on
                                ; month/year
      time-hour       = 2DIGIT  ; 00-23
      time-minute     = 2DIGIT  ; 00-59
      time-second     = 2DIGIT  ; 00-58, 00-59, 00-60 based on leap sec
                                ; rules
      time-secfrac    = "." 1*DIGIT
      time-numoffset  = ("+" / "-") time-hour ":" time-minute
      time-offset     = "Z" / time-numoffset

      partial-time    = time-hour ":" time-minute ":" time-second
                        [time-secfrac]
      full-date       = date-fullyear "-" date-month "-" date-mday
      full-time       = partial-time time-offset

      date-time       = full-date "T" full-time
   ' .det rfc5234-core

   rfc5234-core = '
      DIGIT          =  %x30-39 ; 0-9
      ; abbreviated here
   '

    Figure 4: Example: employing RFC 3339 ABNF for defining CBOR Tags

4.  Features

   Commonly, the kind of validation enabled by languages such as CDDL
   provides a Boolean result: valid, or invalid.

   In rapidly evolving environments, this is too simplistic.  The data
   models described by a CDDL specification may continually be enhanced
   by additional features, and it would be useful even for a
   specification that does not yet describe a specific future feature to
   identify the extension point the feature can use, accepting such
   extensions while marking them as such.

   The ".feature" control annotates the target as making use of the
   feature named by the controller.  The latter will usually be a
   string.  A tool that validates an instance against that specification
   may mark the instance as using a feature that is annotated by the
   specification.

   More specifically, the tool's diagnostic output might contain the
   controller (right-hand side) as a feature name, and the target
   (left-hand side) as a feature detail.  However, in some cases, the
   target has too much detail, and the specification might want to hint
   to the tool that more limited detail is appropriate.  In this case,
   the controller should be an array, with the first element being the
   feature name (that would otherwise be the entire controller), and the
   second element being the detail (usually another string), as
   illustrated in Figure 5.

   foo = {
     kind: bar / baz .feature (["foo-extensions", "bazify"])
   }
   bar = ...
   baz = ... ; complex stuff that doesn't all need to be in the detail

           Figure 5: Providing explicit detail with .feature

   Figure 6 shows what could be the definition of a person, with
   potential extensions beyond "name" and "organization" being marked
   "further-person-extension".  Extensions that are known at the time
   this definition is written can be collected into
   "$$person-extensions".  However, future extensions would be deemed
   invalid unless the wildcard at the end of the map is added.  These
   extensions could then be specifically examined by a user or a tool
   that makes use of the validation result; the label (map key) actually
   used makes a fine feature detail for the tool's diagnostic output.

   Leaving out the entire extension point would mean that instances that
   make use of an extension would be marked as wholesale invalid, making
   the entire validation approach much less useful.  Leaving the
   extension point in, but not marking its use as special, would render
   mistakes such as using the label "organisation" instead of
   "organization" invisible.

   person = {
     ? name: text
     ? organization: text
     $$person-extensions
     * (text .feature "further-person-extension") => any
   }

   $$person-extensions //= (? bloodgroup: text)

               Figure 6: Map extensibility with .feature

   Figure 7 shows another example where ".feature" provides for type
   extensibility.

   allowed-types = number / text / bool / null
                 / [* number] / [* text] / [* bool]
                 / (any .feature "allowed-type-extension")

               Figure 7: Type extensibility with .feature

   A CDDL tool may simply report the set of features being used; the
   control then only provides information to the process requesting the
   validation.  One could also imagine a tool that takes arguments
   allowing the tool to accept certain features and reject others
   (enable/disable).  The latter approach could for instance be used for
   a JSON/CBOR switch, as illustrated in Figure 8.

   SenML-Record = {
   ; ...
     ? v => number
   ; ...
   }
   v = JC<"v", 2>
   JC<J,C> = J .feature "json" / C .feature "cbor"

              Figure 8: Describing variants with .feature

   It remains to be seen if the enable/disable approach can lead to new
   idioms of using CDDL.  The language currently has no way to enforce
   mutually exclusive use of features, as would be needed in this
   example.

5.  IANA Considerations

   This document requests IANA to register the contents of Table 2 into
   the registry "CDDL Control Operators" of [IANA.cddl]:

                         +==========+===========+
                         | Name     | Reference |
                         +==========+===========+
                         | .plus    | [RFCthis] |
                         +----------+-----------+
                         | .cat     | [RFCthis] |
                         +----------+-----------+
                         | .det     | [RFCthis] |
                         +----------+-----------+
                         | .abnf    | [RFCthis] |
                         +----------+-----------+
                         | .abnfb   | [RFCthis] |
                         +----------+-----------+
                         | .feature | [RFCthis] |
                         +----------+-----------+

            Table 2: New control operators to be registered

6.  Implementation Status

   This section is to be removed before publishing as an RFC.

   An early implementation of the control operator ".feature" has been
   available in the CDDL tool described in Appendix F of [RFC8610] since
   version 0.8.11.  The validator warns about each feature being used
   and provides the set of target values used with the feature.  The
   other control operators defined in this specification are also
   implemented as of version 0.8.21 and 0.8.26 (double-handed ".det").

   Andrew Weiss' [CDDL-RS] has an ongoing implementation of this draft
   which is feature-complete except for the ABNF and dedenting support
   (https://github.com/anweiss/cddl/pull/79).

7.  Security considerations

   The security considerations of [RFC8610] apply.

8.  References

8.1.  Normative References

   [IANA.cddl]
              IANA, "Concise Data Definition Language (CDDL)".

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC5234]  Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax
              Specifications: ABNF", STD 68, RFC 5234,
              DOI 10.17487/RFC5234, January 2008.

   [RFC7405]  Kyzivat, P., "Case-Sensitive String Support in ABNF",
              RFC 7405, DOI 10.17487/RFC7405, December 2014.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

   [RFC8610]  Birkholz, H., Vigano, C., and C. Bormann, "Concise Data
              Definition Language (CDDL): A Notational Convention to
              Express Concise Binary Object Representation (CBOR) and
              JSON Data Structures", RFC 8610, DOI 10.17487/RFC8610,
              June 2019.

8.2.  Informative References

   [CDDL-RS]  Weiss, A., "cddl-rs", n.d.

   [RFC3339]  Klyne, G. and C. Newman, "Date and Time on the Internet:
              Timestamps", RFC 3339, DOI 10.17487/RFC3339, July 2002.

   [RFC8943]  Jones, M., Nadalin, A., and J. Richter, "Concise Binary
              Object Representation (CBOR) Tags for Date", RFC 8943,
              DOI 10.17487/RFC8943, November 2020.

   [RFC8949]  Bormann, C. and P. Hoffman, "Concise Binary Object
              Representation (CBOR)", STD 94, RFC 8949,
              DOI 10.17487/RFC8949, December 2020.

Acknowledgements

   Jim Schaad suggested several improvements.  The ".feature" feature
   was developed out of a discussion with Henk Birkholz.  Paul Kyzivat
   helped isolate the need for ".det".

   .det is an abbreviation for "dedenting cat", but Det is also the name
   of a German TV cartoon character created in the 1960s.

Author's Address

   Carsten Bormann
   Universität Bremen TZI
   Postfach 330440
   D-28359 Bremen
   Germany

   Phone: +49-421-218-63921
   Email: cabo@tzi.org