idnits 2.17.1

draft-ietf-cbor-cddl-control-06.txt:
-(3): Line appears to be too long, but this could be caused by non-ascii characters in UTF-8 encoding

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  == There are 2 instances of lines with non-ascii characters in the document.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (27 September 2021) is 935 days in the past. Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Missing Reference: 'RFCthis' is mentioned on line 446, but not defined

     Summary: 0 errors (**), 0 flaws (~~), 3 warnings (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

2    Network Working Group                                         C. Bormann
3    Internet-Draft                                    Universität Bremen TZI
4    Intended status: Standards Track                       27 September 2021
5    Expires: 31 March 2022

7                     Additional Control Operators for CDDL
8                        draft-ietf-cbor-cddl-control-06

10   Abstract

12      The Concise Data Definition Language (CDDL), standardized in RFC
13      8610, provides "control operators" as its main language extension
14      point.

16      The present document defines a number of control operators that were
17      not yet ready at the time RFC 8610 was completed: .plus, .cat and
18      .det for the construction of constants, .abnf/.abnfb for including
19      ABNF (RFC 5234/RFC 7405) in CDDL specifications, and .feature for
20      indicating the use of a non-basic feature in an instance.

22   Status of This Memo

24      This Internet-Draft is submitted in full conformance with the
25      provisions of BCP 78 and BCP 79.

27      Internet-Drafts are working documents of the Internet Engineering
28      Task Force (IETF). Note that other groups may also distribute
29      working documents as Internet-Drafts. The list of current Internet-
30      Drafts is at https://datatracker.ietf.org/drafts/current/.

32      Internet-Drafts are draft documents valid for a maximum of six months
33      and may be updated, replaced, or obsoleted by other documents at any
34      time. It is inappropriate to use Internet-Drafts as reference
35      material or to cite them other than as "work in progress."

37      This Internet-Draft will expire on 31 March 2022.

39   Copyright Notice

41      Copyright (c) 2021 IETF Trust and the persons identified as the
42      document authors. All rights reserved.

44      This document is subject to BCP 78 and the IETF Trust's Legal
45      Provisions Relating to IETF Documents (https://trustee.ietf.org/
46      license-info) in effect on the date of publication of this document.
47      Please review these documents carefully, as they describe your rights
48      and restrictions with respect to this document. Code Components
49      extracted from this document must include Simplified BSD License text
50      as described in Section 4.e of the Trust Legal Provisions and are
51      provided without warranty as described in the Simplified BSD License.

53   Table of Contents

55      1.  Introduction
56        1.1.  Terminology
57      2.  Computed Literals
58        2.1.  Numeric Addition
59        2.2.  String Concatenation
60        2.3.  String Concatenation with Dedenting
61      3.  Embedded ABNF
62      4.  Features
63      5.  IANA Considerations
64      6.  Implementation Status
65      7.  Security considerations
66      8.  References
67        8.1.  Normative References
68        8.2.  Informative References
69      Acknowledgements
70      Author's Address

72   1.  Introduction

74      The Concise Data Definition Language (CDDL), standardized in
75      [RFC8610], provides "control operators" as its main language
76      extension point (Section 3.8 of [RFC8610]).

78      The present document defines a number of control operators that were
79      not yet ready at the time RFC 8610 was completed:

81         +==========+=================================================+
82         | Name     | Purpose                                         |
83         +==========+=================================================+
84         | .plus    | Numeric addition                                |
85         +----------+-------------------------------------------------+
86         | .cat     | String Concatenation                            |
87         +----------+-------------------------------------------------+
88         | .det     | String Concatenation, pre-dedenting             |
89         +----------+-------------------------------------------------+
90         | .abnf    | ABNF in CDDL (text strings)                     |
91         +----------+-------------------------------------------------+
92         | .abnfb   | ABNF in CDDL (byte strings)                     |
93         +----------+-------------------------------------------------+
94         | .feature | Indicate name of feature used (extension point) |
95         +----------+-------------------------------------------------+

97                Table 1: New control operators in this document

99   1.1.  Terminology

101     The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
102     "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
103     "OPTIONAL" in this document are to be interpreted as described in
104     BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
105     capitals, as shown here.

107     This specification uses terminology from [RFC8610]. In particular,
108     with respect to control operators, "target" refers to the left hand
109     side operand, and "controller" to the right hand side operand.
110     "Tool" refers to tools along the lines of that described in
111     Appendix F of [RFC8610].

113  2.  Computed Literals

115     CDDL as defined in [RFC8610] does not have any mechanisms to compute
116     literals. To cover a large part of the use cases, this specification
117     adds three control operators: .plus for numeric addition, .cat for
118     string concatenation, and .det for string concatenation with
119     dedenting of both sides (target and controller).

121     For these operators, as with all control operators, targets and
122     controllers are types. The resulting type is therefore formally a
123     function of the elements of the cross-product of the two types. Not
124     all tools may be able to work with non-unique targets or controllers.

126  2.1.  Numeric Addition

128     In many cases in a specification, numbers are needed relative to a
129     base number. The .plus control identifies a number that is
130     constructed by adding the numeric values of the target and of the
131     controller.

133     Target and controller MUST be numeric. If the target is a floating
134     point number and the controller an integer number, or vice versa, the
135     sum is converted into the type of the target; converting from a
136     floating point number to an integer selects its floor (the largest
137     integer less than or equal to the floating point number).

139     interval<BASE> = (
140       BASE => int             ; lower bound
141       (BASE .plus 1) => int   ; upper bound
142       ? (BASE .plus 2) => int ; tolerance
143     )

145     X = 0
146     Y = 3
147     rect = {
148       interval<X>
149       interval<Y>
150     }

152                Figure 1: Example: addition to a base value

154     The example in Figure 1 contains the generic definition of a group
155     interval that gives a lower and an upper bound and optionally a
156     tolerance. rect combines two of these groups into a map, one group
157     for the X dimension and one for the Y dimension.

159  2.2.  String Concatenation

161     It is often useful to be able to compose string literals out of
162     component literals defined in different places in the specification.

164     The .cat control identifies a string that is built from a
165     concatenation of the target and the controller. Target and
166     controller MUST be strings. The result of the operation has the type
167     of the target. The concatenation is performed on the bytes in both
168     strings. If the target is a text string, the result of that
169     concatenation MUST be valid UTF-8.

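The type rules of Sections 2.1 and 2.2 can be sketched in a few lines of Python. This is only an illustration, not part of the specification or of any CDDL tool: the function names cddl_plus and cddl_cat are hypothetical, and the sketch assumes that Python int/float stand in for CDDL integers and floating point numbers, and str/bytes for text and byte strings.

```python
import math


def cddl_plus(target, controller):
    """Hypothetical sketch of .plus: the sum takes the type of the target."""
    for operand in (target, controller):
        # bool is a subclass of int in Python, so it is excluded explicitly.
        if isinstance(operand, bool) or not isinstance(operand, (int, float)):
            raise TypeError("target and controller must be numeric")
    total = target + controller
    if isinstance(target, int):
        # Converting a floating point sum to an integer selects its floor.
        return math.floor(total)
    return float(total)


def cddl_cat(target, controller):
    """Hypothetical sketch of .cat: concatenate bytes, keep the target's type."""
    def as_bytes(value):
        if isinstance(value, str):
            return value.encode("utf-8")
        if isinstance(value, bytes):
            return value
        raise TypeError("target and controller must be strings")

    joined = as_bytes(target) + as_bytes(controller)
    if isinstance(target, str):
        # Raises UnicodeDecodeError if the result is not valid UTF-8.
        return joined.decode("utf-8")
    return joined
```

With these definitions, an integer target with a floating point controller yields an integer (the floor of the sum), and a text string target concatenated with a byte string controller yields a text string, provided the combined bytes are valid UTF-8.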
171     a = "foo" .cat '
172       bar
173       baz
174     '
175     ; on a system where the newline is \n, is the same string as:
176     b = "foo\n  bar\n  baz\n"

178          Figure 2: Example: concatenation of text and byte string

180     The example in Figure 2 builds a text string named a out of
181     concatenating the target text string "foo" and the controller byte
182     string entered in a text form byte string literal. (This particular
183     idiom is useful when the text string contains newlines, which, as
184     shown in the example for b, may be harder to read when entered in the
185     format that the pure CDDL text string notation inherits from JSON.)

187  2.3.  String Concatenation with Dedenting

189     Multi-line string literals for various applications, including
190     embedded ABNF (Section 3), need to be set flush left, at least
191     partially. Often, having some indentation in the source code for the
192     literal can promote readability, as in Figure 3.

194     oid = bytes .abnfb ("oid" .det cbor-tags-oid)
195     roid = bytes .abnfb ("roid" .det cbor-tags-oid)

197     cbor-tags-oid = '
198       oid = 1*arc
199       roid = *arc
200       arc = [nlsb] %x00-7f
201       nlsb = %x81-ff *%x80-ff
202     '

204                 Figure 3: Example: dedenting concatenation

206     The control operator .det works like .cat, except that both arguments
207     (target and controller) are independently _dedented_ before the
208     concatenation takes place.

210     For the first rule in Figure 3, the result is equivalent to Figure 4.

212     oid = bytes .abnfb 'oid
213     oid = 1*arc
214     roid = *arc
215     arc = [nlsb] %x00-7f
216     nlsb = %x81-ff *%x80-ff
217     '
218             Figure 4: Dedenting example: result of first .det

220     For the purposes of this specification, we define dedenting as:

222     1.  determining the smallest amount of left-most blank space (number
223         of leading space characters) in all the non-blank lines, and

225     2.  removing exactly that number of leading space characters from
226         each line. For blank (blank space only or empty) lines, there
227         may be fewer (or no) leading space characters than this amount, in
228         which case all leading space is removed.

230     (The name .det is a shortcut for "dedenting cat". The maybe more
231     obvious name .dedcat has not been chosen as it is longer and may
232     invoke unpleasant images.)

234     Occasionally, dedenting of only a single item is needed. This can be
235     achieved by using this operator with an empty string, e.g., "" .det
236     rhs or lhs .det "", which can in turn be combined with a .cat: in the
237     construct lhs .cat ("" .det rhs), only rhs is dedented.

239  3.  Embedded ABNF

241     Many IETF protocols define allowable values for their text strings in
242     ABNF [RFC5234] [RFC7405]. It is often desirable to define a text
243     string type in CDDL by employing existing ABNF embedded into the CDDL
244     specification. Without specific ABNF support in CDDL, that ABNF
245     would usually need to be translated into a regular expression (if
246     that is even possible).

248     ABNF is added to CDDL in the same way that regular expressions were
249     added: by defining a .abnf control operator. The target is usually
250     text or some restriction on it, the controller is the text of an ABNF
251     specification.

253     There are several small issues, with solutions given here:

255     *  ABNF can be used to define byte sequences as well as UTF-8 text
256        strings interpreted as Unicode scalar sequences. This means this
257        specification defines two control operators: .abnfb for ABNF
258        denoting byte sequences and .abnf for denoting sequences of
259        Unicode scalar values (code points) represented as UTF-8 text
260        strings. Both control operators can be applied to targets of
261        either string type; the ABNF is applied to the sequence of bytes in
262        the string interpreting that as a sequence of bytes (.abnfb) or as
263        a sequence of code points represented as a UTF-8 text string
264        (.abnf). The controller string MUST be a text string.

266     *  ABNF defines a list of rules, not a single expression (called
267        "elements" in [RFC5234]). This is resolved by requiring the
268        controller string to be one valid "element", followed by zero or
269        more valid "rule" separated from the element by a newline; so the
270        controller string can be built by preceding a piece of valid ABNF
271        by an "element" that selects from that ABNF and a newline.

273     *  For the same reason, ABNF requires newlines; specifying newlines
274        in CDDL text strings is tedious (and leads to essentially
275        unreadable ABNF). The workaround employs the .cat operator
276        introduced in Section 2.2 and the syntax for text in byte strings.
277        As is customary for ABNF, the syntax of ABNF itself (NOT the
278        syntax expressed in ABNF!) is relaxed to allow a single linefeed
279        as a newline:

281           CRLF = %x0A / %x0D.0A

283     *  One set of rules provided in an ABNF specification is often used
284        in multiple positions, in particular staples such as DIGIT and
285        ALPHA. (Note that all rules referenced need to be defined in each
286        ABNF operator controller string -- there is no implicit import of
287        [RFC5234] Core ABNF or other rules.) The composition this calls
288        for can be provided by the .cat operator, and/or by .det if there
289        is indentation to be disposed of.

291     These points are combined into an example in Figure 5, which uses
292     ABNF from [RFC3339] to specify one each of the CBOR tags defined in
293     [RFC8943] and [RFC8949].

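How a controller string is assembled and then taken apart again can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the behavior of any particular CDDL tool: the names dedent, det and split_controller are hypothetical, strings are modeled as Python str regardless of whether the CDDL literal is a text or byte string, the newline is assumed to be a single linefeed, and only space characters count as indentation (as in the dedenting definition of Section 2.3).

```python
def dedent(text):
    """Hypothetical sketch of the dedenting defined in Section 2.3."""
    lines = text.split("\n")
    # Step 1: smallest number of leading spaces over all non-blank lines.
    indents = [len(line) - len(line.lstrip(" "))
               for line in lines if line.strip(" ")]
    margin = min(indents, default=0)
    # Step 2: remove exactly that many leading characters from each line.
    # A blank line holds only spaces, so slicing removes either `margin`
    # of them or, if it has fewer, all of them.
    return "\n".join(line[margin:] for line in lines)


def det(target, controller):
    """Hypothetical sketch of .det: dedent both sides, then concatenate."""
    return dedent(target) + dedent(controller)


def split_controller(controller):
    """Split an ABNF controller into the leading element and the rule list."""
    element, _, rules = controller.partition("\n")
    return element, rules


# The rule list of Figure 3, written with its two-space indentation.
cbor_tags_oid = (
    "\n"
    "  oid = 1*arc\n"
    "  roid = *arc\n"
    "  arc = [nlsb] %x00-7f\n"
    "  nlsb = %x81-ff *%x80-ff\n"
)
controller = det("oid", cbor_tags_oid)
element, rules = split_controller(controller)
```

Here controller is the flush-left string of Figure 4, element is "oid", and rules holds the four ABNF rules. The leading newline of the rule list is what separates the element from the rules, which is why the literals in the figures start with a line break.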
295     ; for RFC 8943
296     Tag1004 = #6.1004(text .abnf full-date)
297     ; for RFC 8949
298     Tag0 = #6.0(text .abnf date-time)

300     full-date = "full-date" .cat rfc3339
301     date-time = "date-time" .cat rfc3339

303     ; Note the trick of idiomatically starting with a newline, separating
304     ;   off the element in the concatenations above from the rule-list
305     rfc3339 = '
306        date-fullyear   = 4DIGIT
307        date-month      = 2DIGIT  ; 01-12
308        date-mday       = 2DIGIT  ; 01-28, 01-29, 01-30, 01-31 based on
309                                  ; month/year
310        time-hour       = 2DIGIT  ; 00-23
311        time-minute     = 2DIGIT  ; 00-59
312        time-second     = 2DIGIT  ; 00-58, 00-59, 00-60 based on leap sec
313                                  ; rules
314        time-secfrac    = "." 1*DIGIT
315        time-numoffset  = ("+" / "-") time-hour ":" time-minute
316        time-offset     = "Z" / time-numoffset

318        partial-time    = time-hour ":" time-minute ":" time-second
319                          [time-secfrac]
320        full-date       = date-fullyear "-" date-month "-" date-mday
321        full-time       = partial-time time-offset

323        date-time       = full-date "T" full-time
324     ' .det rfc5234-core

326     rfc5234-core = '
327       DIGIT          =  %x30-39 ; 0-9
328       ; abbreviated here
329     '

331      Figure 5: Example: employing RFC 3339 ABNF for defining CBOR Tags

333  4.  Features

335     Commonly, the kind of validation enabled by languages such as CDDL
336     provides a Boolean result: valid, or invalid.

338     In rapidly evolving environments, this is too simplistic. The data
339     models described by a CDDL specification may continually be enhanced
340     by additional features, and it would be useful even for a
341     specification that does not yet describe a specific future feature to
342     identify the extension point the feature can use, accepting such
343     extensions while marking them as such.

345     The .feature control annotates the target as making use of the
346     feature named by the controller. The latter will usually be a
347     string. A tool that validates an instance against that specification
348     may mark the instance as using a feature that is annotated by the
349     specification.

351     More specifically, the tool's diagnostic output might contain the
352     controller (right hand side) as a feature name, and the target (left
353     hand side) as a feature detail. However, in some cases, the target
354     has too much detail, and the specification might want to hint the
355     tool that more limited detail is appropriate. In this case, the
356     controller should be an array, with the first element being the
357     feature name (that would otherwise be the entire controller), and the
358     second element being the detail (usually another string), as
359     illustrated in Figure 6.

361     foo = {
362       kind: bar / baz .feature (["foo-extensions", "bazify"])
363     }
364     bar = ...
365     baz = ... ; complex stuff that doesn't all need to be in the detail

367             Figure 6: Providing explicit detail with .feature

369     Figure 7 shows what could be the definition of a person, with
370     potential extensions beyond name and organization being marked
371     further-person-extension. Extensions that are known at the time this
372     definition is written can be collected into $$person-extensions.
373     However, future extensions would be deemed invalid unless the
374     wildcard at the end of the map is added. These extensions could then
375     be specifically examined by a user or a tool that makes use of the
376     validation result; the label (map key) actually used makes a fine
377     feature detail for the tool's diagnostic output.

379     Leaving out the entire extension point would mean that instances that
380     make use of an extension would be marked as wholesale invalid,
381     making the entire validation approach much less useful. Leaving the
382     extension point in, but not marking its use as special, would render
383     mistakes such as using the label organisation instead of organization
384     invisible.

386     person = {
387       ? name: text
388       ? organization: text
389       $$person-extensions
390       * (text .feature "further-person-extension") => any
391     }

393     $$person-extensions //= (? bloodgroup: text)

395                 Figure 7: Map extensibility with .feature

397     Figure 8 shows another example where .feature provides for type
398     extensibility.

400     allowed-types = number / text / bool / null
401                   / [* number] / [* text] / [* bool]
402                   / (any .feature "allowed-type-extension")

404                 Figure 8: Type extensibility with .feature

406     A CDDL tool may simply report the set of features being used; the
407     control then only provides information to the process requesting the
408     validation. One could also imagine a tool that takes arguments
409     allowing the tool to accept certain features and reject others
410     (enable/disable). The latter approach could for instance be used for
411     a JSON/CBOR switch, as illustrated in Figure 9.

413     SenML-Record = {
414     ; ...
415       ? v => number
416     ; ...
417     }
418     v = JC<"v", 2>
419     JC<J,C> = J .feature "json" / C .feature "cbor"

421                Figure 9: Describing variants with .feature

423     It remains to be seen if the enable/disable approach can lead to new
424     idioms of using CDDL. The language currently has no way to enforce
425     mutually exclusive use of features, as would be needed in this
426     example.

428  5.  IANA Considerations

430     This document requests IANA to register the contents of Table 2 into
431     the registry "CDDL Control Operators" of [IANA.cddl]:

433                          +==========+===========+
434                          | Name     | Reference |
435                          +==========+===========+
436                          | .plus    | [RFCthis] |
437                          +----------+-----------+
438                          | .cat     | [RFCthis] |
439                          +----------+-----------+
440                          | .det     | [RFCthis] |
441                          +----------+-----------+
442                          | .abnf    | [RFCthis] |
443                          +----------+-----------+
444                          | .abnfb   | [RFCthis] |
445                          +----------+-----------+
446                          | .feature | [RFCthis] |
447                          +----------+-----------+

449                            Table 2: New control
450                              operators to be
451                                 registered

453  6.  Implementation Status

455     This section is to be removed before publishing as an RFC.

457     An early implementation of the control operator .feature has been
458     available in the CDDL tool described in Appendix F of [RFC8610] since
459     version 0.8.11. The validator warns about each feature being used
460     and provides the set of target values used with the feature. The
461     other control operators defined in this specification are also
462     implemented as of version 0.8.21 and 0.8.26 (double-handed .det).

464     Andrew Weiss' [CDDL-RS] has an ongoing implementation of this draft
465     which is feature-complete except for the ABNF and dedenting support
466     (https://github.com/anweiss/cddl/pull/79).

469  7.  Security considerations

471     The security considerations of [RFC8610] apply.

473  8.  References

475  8.1.  Normative References

477     [IANA.cddl]
478                IANA, "Concise Data Definition Language (CDDL)".

481     [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
482                Requirement Levels", BCP 14, RFC 2119,
483                DOI 10.17487/RFC2119, March 1997.

486     [RFC5234]  Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax
487                Specifications: ABNF", STD 68, RFC 5234,
488                DOI 10.17487/RFC5234, January 2008.

491     [RFC7405]  Kyzivat, P., "Case-Sensitive String Support in ABNF",
492                RFC 7405, DOI 10.17487/RFC7405, December 2014.

495     [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
496                2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
497                May 2017.

499     [RFC8610]  Birkholz, H., Vigano, C., and C. Bormann, "Concise Data
500                Definition Language (CDDL): A Notational Convention to
501                Express Concise Binary Object Representation (CBOR) and
502                JSON Data Structures", RFC 8610, DOI 10.17487/RFC8610,
503                June 2019.

505  8.2.  Informative References

507     [CDDL-RS]  Weiss, A., "cddl-rs", n.d.

510     [RFC3339]  Klyne, G. and C. Newman, "Date and Time on the Internet:
511                Timestamps", RFC 3339, DOI 10.17487/RFC3339, July 2002.

514     [RFC8943]  Jones, M., Nadalin, A., and J. Richter, "Concise Binary
515                Object Representation (CBOR) Tags for Date", RFC 8943,
516                DOI 10.17487/RFC8943, November 2020.

519     [RFC8949]  Bormann, C. and P. Hoffman, "Concise Binary Object
520                Representation (CBOR)", STD 94, RFC 8949,
521                DOI 10.17487/RFC8949, December 2020.

524  Acknowledgements

526     Jim Schaad suggested several improvements. The .feature feature was
527     developed out of a discussion with Henk Birkholz. Paul Kyzivat
528     helped isolate the need for .det.

530     .det is an abbreviation for "dedenting cat", but Det is also the name
531     of a German TV Cartoon character created in the 1960s.

533  Author's Address

535     Carsten Bormann
536     Universität Bremen TZI
537     Postfach 330440
538     D-28359 Bremen
539     Germany

541     Phone: +49-421-218-63921
542     Email: cabo@tzi.org