Network Working Group                                         C. Bormann
Internet-Draft                                    Universität Bremen TZI
Intended status: Informational                             14 April 2021
Expires: 16 October 2021

   A feature freezer for the Concise Data Definition Language (CDDL)
                   draft-bormann-cbor-cddl-freezer-06

Abstract

   In defining the Concise Data Definition Language (CDDL), some
   features have turned up that would be nice to have.  In the interest
   of completing this specification in a timely manner, the present
   document was started to collect nice-to-have features that did not
   make it into the first RFC for CDDL, RFC 8610.

   It is now time to discuss thawing some of the concepts discussed
   here.  A number of additional proposals have been added.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 16 October 2021.

Copyright Notice

   Copyright (c) 2021 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents (https://trustee.ietf.org/
   license-info) in effect on the date of publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.
   Code Components extracted from this document must include Simplified
   BSD License text as described in Section 4.e of the Trust Legal
   Provisions and are provided without warranty as described in the
   Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Base language features
     2.1.  Cuts
   3.  Literal syntax
     3.1.  Tag-oriented Literals
     3.2.  Regular Expression Literals
     3.3.  Clarifications
       3.3.1.  Err6527
       3.3.2.  Err6543
   4.  Controls
     4.1.  Control operator .pcre
     4.2.  Endianness in .bits
     4.3.  .bitfield control
   5.  Co-occurrence Constraints
   6.  Module superstructure
     6.1.  Namespacing
   7.  Alternative Representations
   8.  IANA Considerations
   9.  Security considerations
   10. References
     10.1.  Normative References
     10.2.  Informative References
   Acknowledgements
   Author's Address

1.  Introduction

   In defining the Concise Data Definition Language (CDDL), some
   features have turned up that would be nice to have.  In the interest
   of completing this specification in a timely manner, the present
   document was started to collect nice-to-have features that did not
   make it into the first RFC for CDDL [RFC8610].

   It is now time to discuss thawing some of the concepts discussed
   here.  A number of additional proposals have been added.

   There is always a danger that a document like this becomes a
   shopping list; the intention is to develop this document further
   based on real-world experience with the first CDDL standard.

2.  Base language features

2.1.  Cuts

   Section 3.5.4 of [RFC8610] alludes to a new language feature, _cuts_,
   and defines it in a fashion that is rather focused on a single
   application in the context of maps and generating better diagnostic
   information about them.

   The present document is expected to grow a more complete definition
   of cuts, with the expectation that it will be upward compatible with
   the existing one in [RFC8610], before this possibly becomes a
   mainline language feature in a future version of CDDL.

3.  Literal syntax

3.1.  Tag-oriented Literals

   Some CBOR tags would often be most natural to use in a CDDL spec with
   a literal syntax that is tailored to their semantics instead of their
   serialization in CBOR.  There is currently no way to add such
   syntaxes, and no defined extension point either.

   The text form of CoRAL [I-D.ietf-core-coral] defines literals of the
   form

      dt'2019-07-21T19:53Z'

   for datetime items.  (Similar advances should then probably be made
   in diagnostic notation.)

3.2.  Regular Expression Literals

   Regular expressions currently are notated as strings in CDDL, with
   all the string escaping rules applied once.
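
   The effect of that single round of string escaping can be sketched
   as follows.  This is an illustration only, not part of any CDDL tool:
   the helper names unescape_once and matches are hypothetical, JSON
   unescaping is assumed as a stand-in for CDDL's string escaping
   rules, and Python's re module is assumed to be close enough to the
   XSD regexp flavor for this simple pattern.

```python
import json
import re


def unescape_once(source_literal: str) -> str:
    """Apply one round of string unescaping to a quoted literal.

    Assumption: JSON unescaping stands in for CDDL's escaping rules.
    """
    return json.loads(source_literal)


def matches(source_literal: str, candidate: str) -> bool:
    """Check a candidate against the regexp carried in a quoted literal.

    Assumption: Python's re approximates the regexp flavor; fullmatch
    anchors the match on both sides.
    """
    return re.fullmatch(unescape_once(source_literal), candidate) is not None


# As it might appear in CDDL source:  text .regexp "[0-9]+\\.[0-9]+"
SOURCE_LITERAL = r'"[0-9]+\\.[0-9]+"'

print(unescape_once(SOURCE_LITERAL))    # the regexp after one unescaping round
print(matches(SOURCE_LITERAL, "3.14"))  # candidate with a literal dot
print(matches(SOURCE_LITERAL, "3x14"))  # candidate without a literal dot
```

   The doubled backslash in the source text is what the regexp engine
   eventually sees as a single backslash, which is the kind of
   bookkeeping a dedicated regexp literal could avoid.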
   It might be convenient to have a more conventional literal format for
   regular expressions, possibly also providing a place to add modifiers
   such as "/i".  This might also imply "text .regexp ...", which with
   the proposal in Section 4.1 then raises the question of how to
   indicate the regular expression flavor.

3.3.  Clarifications

   A number of errata reports have been made around some details of text
   string and byte string literal syntax: [Err6527] and [Err6543].
   These need to be addressed by re-examining the details of these
   literal syntaxes.  Also, [Err6526] needs to be applied.

3.3.1.  Err6527

   The ABNF used in [RFC8610] for the content of text string literals is
   rather permissive:

      text = %x22 *SCHAR %x22
      SCHAR = %x20-21 / %x23-5B / %x5D-7E / %x80-10FFFD / SESC
      SESC = "\" (%x20-7E / %x80-10FFFD)

   This allows almost any non-C0 character to be escaped by a backslash,
   but critically misses out on the "\uXXXX" and "\uHHHH\uLLLL" forms
   that JSON allows to specify characters in hex.  Both issues can be
   addressed by updating the SESC production to:

      SESC = "\" ( %x22 / %x2F / %x5C / %x62 / %x66 / %x6E / %x72 / %x74 /
                   (%x75 hexchar) )
      hexchar = non-surrogate / (high-surrogate "\" %x75 low-surrogate)
      non-surrogate = ((DIGIT / "A"/"B"/"C" / "E"/"F") 3HEXDIG) /
                      ("D" %x30-37 2HEXDIG )
      high-surrogate = "D" ("8"/"9"/"A"/"B") 2HEXDIG
      low-surrogate = "D" ("C"/"D"/"E"/"F") 2HEXDIG

   Now that SESC is more restrictively formulated, this also requires an
   update to the BCHAR production used in the ABNF syntax for byte
   string literals:

      bytes = [bsqual] %x27 *BCHAR %x27
      BCHAR = %x20-26 / %x28-5B / %x5D-10FFFD / SESC / CRLF
      bsqual = "h" / "b64"

   The updated version explicitly allows "\'", which is no longer
   allowed in the updated SESC:

      BCHAR = %x20-26 / %x28-5B / %x5D-10FFFD / SESC / "\'" / CRLF

3.3.2.  Err6543

   The ABNF used in [RFC8610] for the content of byte string literals
   lumps together byte strings notated as text with byte strings notated
   in base16 (hex) or base64 (but see also the updated BCHAR production
   above):

      bytes = [bsqual] %x27 *BCHAR %x27
      BCHAR = %x20-26 / %x28-5B / %x5D-10FFFD / SESC / CRLF

   Errata report 6543 proposes to handle the two cases in separate
   productions (where, with an updated SESC, BCHAR obviously needs to be
   updated as above):

      bytes = %x27 *BCHAR %x27
            / bsqual %x27 *QCHAR %x27
      BCHAR = %x20-26 / %x28-5B / %x5D-10FFFD / SESC / CRLF
      QCHAR = DIGIT / ALPHA / "+" / "/" / "-" / "_" / "=" / WS

   This potentially causes a subtle change, which is hidden in the WS
   production:

      WS = SP / NL
      SP = %x20
      NL = COMMENT / CRLF
      COMMENT = ";" *PCHAR CRLF
      PCHAR = %x20-7E / %x80-10FFFD
      CRLF = %x0A / %x0D.0A

   This allows any non-C0 character in a comment, so this fragment
   becomes possible:

      foo = h'
         43424F52 ; 'CBOR'
         0A ; LF, but don't use CR!
      '

   The current text does not say unambiguously whether the three
   apostrophes need to be escaped with a "\" or not, as in:

      foo = h'
         43424F52 ; \'CBOR\'
         0A ; LF, but don\'t use CR!
      '

   ... which would be supported by the existing ABNF in [RFC8610].

4.  Controls

   Controls are the main extension point of the CDDL language.  It is
   relatively painless to add controls to CDDL.  Several candidates have
   been identified that aren't quite ready for adoption, some of which
   are listed here.

4.1.  Control operator .pcre

   There are many variants of regular expression languages.
   Section 3.8.3 of [RFC8610] defines the .regexp control, which is
   based on XSD [XSD2] regular expressions.
   As discussed in that section, the most desirable form of regular
   expressions in many cases is the family called "Perl-Compatible
   Regular Expressions" ([PCRE]); however, no formally stable definition
   of PCRE is available at this time for normatively referencing it from
   an RFC.

   The present document defines the control operator .pcre, which is
   similar to .regexp, but uses PCRE2 regular expressions.  More
   specifically, a ".pcre" control indicates that the text string given
   as a target needs to match the PCRE regular expression given as a
   value in the control type, where that regular expression is anchored
   on both sides.  (If anchoring is not desired for a side, ".*" needs
   to be inserted there.)

   Similarly, ".es2018re" could be defined for ECMAScript 2018 regular
   expressions with anchors added.

4.2.  Endianness in .bits

   How useful would it be to have another variant of .bits that counts
   bits like in RFC box notation?  (Or at least per-byte?  32-bit words
   don't always perfectly mesh with byte strings.)

4.3.  .bitfield control

   Provide a way to specify bitfields in byte strings and uints to a
   higher level of detail than is possible with .bits.  Strawman:

      Field = uint .bitfield Fieldbits

      Fieldbits = [
        flag1: [1, bool],
        val: [4, Vals],
        flag2: [1, bool],
      ]

      Vals = &(A: 0, B: 1, C: 2, D: 3)

   Note that the group within the controlling array can have choices,
   enabling the whole power of a context-free grammar (but not much
   more).

5.  Co-occurrence Constraints

   While there are no co-occurrence constraints in CDDL, many actual use
   cases can be addressed by using the fact that a group is a grammar:

      postal = {
        ( street: text,
          housenumber: text) //
        ( pobox: text .regexp "[0-9]+" )
      }

   However, constraints that are not just structural/tree-based but are
   predicates combining parts of the structure cannot be expressed:

      session = {
        timeout: uint,
      }

      other-session = {
        timeout: uint .lt [somehow refer to session.timeout],
      }

   As a minimum, this requires the ability to reach over to other parts
   of the tree in a control.  Compare JSON Pointer [RFC6901] and JSON
   Relative Pointer [I-D.handrews-relative-json-pointer].  Stefan
   Goessner's jsonpath is a JSON variant of XPath that has not been
   formally standardized [jsonpath].

   More generally, something akin to what Schematron is to RELAX NG may
   be needed.

6.  Module superstructure

   CDDL rules could be packaged as modules and referenced from other
   modules.  There could be some control of namespace pollution, as well
   as unambiguous referencing ("versioning").

   This is probably best achieved by a pragma-like syntax which could be
   carried in CDDL comments, leaving each module to be valid CDDL (if
   missing some rule definitions to be imported).

6.1.  Namespacing

   A convention for mapping CDDL-internal names to external ones could
   be developed, possibly steered by some pragma-like constructs.
   External names would likely be URI-based, with some conventions as
   they are used in RDF or CURIEs.  Internal names might look similar to
   XML QNames.  Note that the identifier character set for CDDL
   deliberately includes $ and @, which could be used in such a
   convention.

7.  Alternative Representations

   For CDDL, alternative representations, e.g., in JSON (and thus in
   YAML), could be defined, similar to the way YANG defines an XML-based
   serialization called YIN in Section 11 of [RFC6020].  One proposal
   for such a syntax is provided by the "cddlc" tool [cddlc]; this could
   be written up and agreed upon.

      cddlj = ["cddl", +rule]
      rule = ["=" / "/=" / "//=", namep, type]
      namep = ["name", id] / ["gen", id, +id]
      id = text .regexp "[A-Za-z@_$](([-.])*[A-Za-z0-9@_$])*"
      op = ".." / "..." /
           text .regexp "\\.[A-Za-z@_$](([-.])*[A-Za-z0-9@_$])*"
      namea = ["name", id] / ["gen", id, +type]
      type = value / namea / ["op", op, type, type] /
             ["map", group] / ["ary", group] / ["tcho", 2*type] /
             ["unwrap", namea] / ["enum", group / namea] /
             ["prim", ?(0..7, ?uint)]
      group = ["mem", null/type, type] /
              ["rep", uint, uint/false, group] /
              ["seq", 2*group] / ["gcho", 2*group]
      value = ["number"/"text"/"bytes", text]

8.  IANA Considerations

   This document makes no requests of IANA.

9.  Security considerations

   The security considerations of [RFC8610] apply.

10.  References

10.1.  Normative References

   [RFC8610]  Birkholz, H., Vigano, C., and C. Bormann, "Concise Data
              Definition Language (CDDL): A Notational Convention to
              Express Concise Binary Object Representation (CBOR) and
              JSON Data Structures", RFC 8610, DOI 10.17487/RFC8610,
              June 2019.

10.2.  Informative References

   [cddlc]    "CDDL conversion utilities", n.d.

   [Err6526]  "Errata Report 6526", RFC 8610.

   [Err6527]  "Errata Report 6527", RFC 8610.

   [Err6543]  "Errata Report 6543", RFC 8610.

   [I-D.handrews-relative-json-pointer]
              Luff, G. and H. Andrews, "Relative JSON Pointers", Work in
              Progress, Internet-Draft, draft-handrews-relative-json-
              pointer-02, 18 September 2019.

   [I-D.ietf-core-coral]
              Hartke, K., "The Constrained RESTful Application Language
              (CoRAL)", Work in Progress, Internet-Draft, draft-ietf-
              core-coral-03, 9 March 2020.

   [jsonpath] "jsonpath online evaluator", n.d.

   [PCRE]     "Perl-compatible Regular Expressions (revised API:
              PCRE2)", n.d.

   [RFC6020]  Bjorklund, M., Ed., "YANG - A Data Modeling Language for
              the Network Configuration Protocol (NETCONF)", RFC 6020,
              DOI 10.17487/RFC6020, October 2010.

   [RFC6901]  Bryan, P., Ed., Zyp, K., and M. Nottingham, Ed.,
              "JavaScript Object Notation (JSON) Pointer", RFC 6901,
              DOI 10.17487/RFC6901, April 2013.

   [XSD2]     Biron, P. and A. Malhotra, "XML Schema Part 2: Datatypes
              Second Edition", World Wide Web Consortium Recommendation
              REC-xmlschema-2-20041028, 28 October 2004.

Acknowledgements

   Many people have asked for CDDL to be completed, soon.  These are
   usually also the people who have brought up observations that led to
   the proposals discussed here.  Sean Leonard has campaigned for a
   regexp literal syntax.

Author's Address

   Carsten Bormann
   Universität Bremen TZI
   Postfach 330440
   D-28359 Bremen
   Germany

   Phone: +49-421-218-63921
   Email: cabo@tzi.org