Network Working Group                                         C. Bormann
Internet-Draft                                    Universität Bremen TZI
Intended status: Informational                             21 April 2021
Expires: 23 October 2021


    A feature freezer for the Concise Data Definition Language (CDDL)
                   draft-bormann-cbor-cddl-freezer-07

Abstract

   In defining the Concise Data Definition Language (CDDL), some
   features have turned up that would be nice to have.  In the interest
   of completing this specification in a timely manner, the present
   document was started to collect nice-to-have features that did not
   make it into the first RFC for CDDL, RFC 8610.

   It is now time to discuss thawing some of the concepts discussed
   here.  A number of additional proposals have been added.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 23 October 2021.

Copyright Notice

   Copyright (c) 2021 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.
   Code Components extracted from this document must include Simplified
   BSD License text as described in Section 4.e of the Trust Legal
   Provisions and are provided without warranty as described in the
   Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Base language features
     2.1.  Cuts
   3.  Literal syntax
     3.1.  Tag-oriented Literals
     3.2.  Regular Expression Literals
     3.3.  Clarifications
       3.3.1.  Err6527
       3.3.2.  Err6543
   4.  Controls
     4.1.  Control operator .pcre
     4.2.  Endianness in .bits
     4.3.  .bitfield control
   5.  Co-occurrence Constraints
   6.  Module superstructure
     6.1.  Namespacing
   7.  Alternative Representations
   8.  IANA Considerations
   9.  Security considerations
   10. References
     10.1.  Normative References
     10.2.  Informative References
   Acknowledgements
   Author's Address

1.  Introduction

   In defining the Concise Data Definition Language (CDDL), some
   features have turned up that would be nice to have.  In the interest
   of completing this specification in a timely manner, the present
   document was started to collect nice-to-have features that did not
   make it into the first RFC for CDDL [RFC8610].

   It is now time to discuss thawing some of the concepts discussed
   here.  A number of additional proposals have been added.

   There is always a danger that a document like this becomes a
   shopping list; the intention is to develop this document further
   based on real-world experience with the first CDDL standard.

2.  Base language features

2.1.  Cuts

   Section 3.5.4 of [RFC8610] alludes to a new language feature, _cuts_,
   and defines it in a fashion that is rather focused on a single
   application in the context of maps and generating better diagnostic
   information about them.

   The present document is expected to grow a more complete definition
   of cuts, with the expectation that it will be upwards-compatible to
   the existing one in [RFC8610], before this possibly becomes a
   mainline language feature in a future version of CDDL.

3.  Literal syntax

3.1.  Tag-oriented Literals

   Some CBOR tags would often be most natural to use in a CDDL
   specification with a literal syntax that is tailored to their
   semantics rather than to their serialization in CBOR.  There is
   currently no way to add such syntaxes, nor is there a defined
   extension point.

   The text form of CoRAL [I-D.ietf-core-coral] defines literals of the
   form

   dt'2019-07-21T19:53Z'

   for datetime items.  (Similar advances should then probably be made
   in diagnostic notation.)

3.2.  Regular Expression Literals

   Regular expressions currently are notated as strings in CDDL, with
   all the string escaping rules applied once.
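
   As an illustration of that single level of escaping, the following
   Python sketch takes the characters between the quotes of a CDDL
   string literal, undoes the string escaping once, and uses the result
   as a regular expression.  The rule name "version" and the helper
   names are hypothetical.  The sketch assumes that the literal uses
   JSON-style escapes, so that json.loads can stand in for a CDDL
   string parser, and it uses Python's re module as a rough substitute
   for the XSD flavor used by .regexp, which behaves the same for this
   simple pattern.

```python
import json
import re


def regex_from_cddl_literal(literal_body: str) -> str:
    """Undo one level of string escaping, as a CDDL parser would.

    Assumption: the literal uses JSON-style escapes, so json.loads
    can stand in for the CDDL string-literal parser.
    """
    return json.loads('"' + literal_body + '"')


# Characters between the quotes in the hypothetical rule:
#   version = text .regexp "[0-9]+\\.[0-9]+"
literal_body = r"[0-9]+\\.[0-9]+"

# After unescaping once, the doubled backslash becomes a single one,
# so the regular expression engine sees an escaped literal dot.
pattern = regex_from_cddl_literal(literal_body)


def matches(candidate: str) -> bool:
    # re.fullmatch approximates the anchored matching of .regexp;
    # Python's re is not the XSD flavor (assumption noted above).
    return re.fullmatch(pattern, candidate) is not None
```

   The author of the CDDL rule has to write two backslashes to obtain
   the one backslash that the regular expression needs; this is the
   kind of friction a dedicated literal format could remove.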
   It might be convenient to have a more conventional literal format
   for regular expressions, possibly also providing a place to add
   modifiers such as "/i".  This might also imply "text .regexp ...",
   which with the proposal in Section 4.1 then raises the question of
   how to indicate the regular expression flavor.

3.3.  Clarifications

   A number of errata reports have been made around some details of text
   string and byte string literal syntax: [Err6527] and [Err6543].
   These need to be addressed by re-examining the details of these
   literal syntaxes.  Also, [Err6526] needs to be applied.

3.3.1.  Err6527

   The ABNF used in [RFC8610] for the content of text string literals is
   rather permissive:

   text = %x22 *SCHAR %x22
   SCHAR = %x20-21 / %x23-5B / %x5D-7E / %x80-10FFFD / SESC
   SESC = "\" (%x20-7E / %x80-10FFFD)

   This allows almost any non-C0 character to be escaped by a backslash,
   but critically misses out on the "\uXXXX" and "\uHHHH\uLLLL" forms
   that JSON allows to specify characters in hex.  Both can be solved by
   updating the SESC production to:

   SESC = "\" ( %x22 / "/" / "\" /                 ; \" \/ \\
                %x62 / %x66 / %x6E / %x72 / %x74 / ; \b \f \n \r \t
                (%x75 hexchar) )                   ; \u
   hexchar = non-surrogate / (high-surrogate "\" %x75 low-surrogate)
   non-surrogate = ((DIGIT / "A"/"B"/"C" / "E"/"F") 3HEXDIG) /
                   ("D" %x30-37 2HEXDIG )
   high-surrogate = "D" ("8"/"9"/"A"/"B") 2HEXDIG
   low-surrogate = "D" ("C"/"D"/"E"/"F") 2HEXDIG

   Now that SESC is more restrictively formulated, this also requires an
   update to the BCHAR production used in the ABNF syntax for byte
   string literals:

   bytes = [bsqual] %x27 *BCHAR %x27
   BCHAR = %x20-26 / %x28-5B / %x5D-10FFFD / SESC / CRLF
   bsqual = "h" / "b64"

   The updated version explicitly allows "\'", which is no longer
   allowed in the updated SESC:

   BCHAR = %x20-26 / %x28-5B / %x5D-10FFFD / SESC / "\'" / CRLF

3.3.2.  Err6543

   The ABNF used in [RFC8610] for the content of byte string literals
   lumps together byte strings notated as text with byte strings notated
   in base16 (hex) or base64 (but see also updated BCHAR production
   above):

   bytes = [bsqual] %x27 *BCHAR %x27
   BCHAR = %x20-26 / %x28-5B / %x5D-10FFFD / SESC / CRLF

   Errata report 6543 proposes to handle the two cases in separate
   productions (where, with an updated SESC, BCHAR obviously needs to be
   updated as above):

   bytes = %x27 *BCHAR %x27
         / bsqual %x27 *QCHAR %x27
   BCHAR = %x20-26 / %x28-5B / %x5D-10FFFD / SESC / CRLF
   QCHAR = DIGIT / ALPHA / "+" / "/" / "-" / "_" / "=" / WS

   This potentially causes a subtle change, which is hidden in the WS
   production:

   WS = SP / NL
   SP = %x20
   NL = COMMENT / CRLF
   COMMENT = ";" *PCHAR CRLF
   PCHAR = %x20-7E / %x80-10FFFD
   CRLF = %x0A / %x0D.0A

   This allows any non-C0 character in a comment, so this fragment
   becomes possible:

   foo = h'
      43424F52  ; 'CBOR'
      0A        ; LF, but don't use CR!
   '

   The current text does not say unambiguously whether the three
   apostrophes need to be escaped with a "\" or not, as in:

   foo = h'
      43424F52  ; \'CBOR\'
      0A        ; LF, but don\'t use CR!
   '

   ... which would be supported by the existing ABNF in [RFC8610].

4.  Controls

   Controls are the main extension point of the CDDL language.  It is
   relatively painless to add controls to CDDL.  Several candidates have
   been identified that aren't quite ready for adoption, some of which
   are listed here.

4.1.  Control operator .pcre

   There are many variants of regular expression languages.
   Section 3.8.3 of [RFC8610] defines the .regexp control, which is
   based on XSD [XSD2] regular expressions.
   As discussed in that section, the most desirable form of regular
   expressions in many cases is the family called "Perl-Compatible
   Regular Expressions" ([PCRE]); however, no formally stable
   definition of PCRE is available at this time for normatively
   referencing it from an RFC.

   The present document defines the control operator .pcre, which is
   similar to .regexp, but uses PCRE2 regular expressions.  More
   specifically, a ".pcre" control indicates that the text string given
   as a target needs to match the PCRE regular expression given as a
   value in the control type, where that regular expression is anchored
   on both sides.  (If anchoring is not desired for a side, ".*" needs
   to be inserted there.)

   Similarly, ".es2018re" could be defined for ECMAScript 2018 regular
   expressions with anchors added.

4.2.  Endianness in .bits

   How useful would it be to have another variant of .bits that counts
   bits like in RFC box notation?  (Or at least per-byte?  32-bit words
   don't always perfectly mesh with byte strings.)

4.3.  .bitfield control

   Provide a way to specify bitfields in byte strings and uints to a
   higher level of detail than is possible with .bits.  Strawman:

   Field = uint .bitfield Fieldbits

   Fieldbits = [
     flag1: [1, bool],
     val: [4, Vals],
     flag2: [1, bool],
   ]

   Vals = &(A: 0, B: 1, C: 2, D: 3)

   Note that the group within the controlling array can have choices,
   enabling the whole power of a context-free grammar (but not much
   more).

5.  Co-occurrence Constraints

   While there are no co-occurrence constraints in CDDL, many actual use
   cases can be addressed by using the fact that a group is a grammar:

   postal = {
     ( street: text,
       housenumber: text) //
     ( pobox: text .regexp "[0-9]+" )
   }

   However, constraints that are not just structural/tree-based but are
   predicates combining parts of the structure cannot be expressed:

   session = {
     timeout: uint,
   }

   other-session = {
     timeout: uint .lt [somehow refer to session.timeout],
   }

   As a minimum, this requires the ability to reach over to other parts
   of the tree in a control.  Compare JSON Pointer [RFC6901] and JSON
   Relative Pointer [I-D.handrews-relative-json-pointer].  Stefan
   Goessner's jsonpath is a JSON variant of XPath that has not been
   formally standardized [jsonpath].

   More generally, something akin to what Schematron is to Relax-NG may
   be needed.

6.  Module superstructure

   CDDL rules could be packaged as modules and referenced from other
   modules.  There could be some control of namespace pollution, as well
   as unambiguous referencing ("versioning").

   This is probably best achieved by a pragma-like syntax which could be
   carried in CDDL comments, leaving each module to be valid CDDL (if
   missing some rule definitions to be imported).

6.1.  Namespacing

   A convention for mapping CDDL-internal names to external ones could
   be developed, possibly steered by some pragma-like constructs.
   External names would likely be URI-based, with some conventions as
   they are used in RDF or CURIEs.  Internal names might look similar to
   XML QNames.  Note that the identifier character set for CDDL
   deliberately includes $ and @, which could be used in such a
   convention.

7.  Alternative Representations

   For CDDL, alternative representations, e.g., in JSON (and thus in
   YAML), could be defined, similar to the way YANG defines an XML-based
   serialization called YIN in Section 11 of [RFC6020].  One proposal
   for such a syntax is provided by the "cddlc" tool [cddlc]; this could
   be written up and agreed upon.

   cddlj = ["cddl", +rule]
   rule = ["=" / "/=" / "//=", namep, type]
   namep = ["name", id] / ["gen", id, +id]
   id = text .regexp "[A-Za-z@_$](([-.])*[A-Za-z0-9@_$])*"
   op = ".." / "..." /
        text .regexp "\\.[A-Za-z@_$](([-.])*[A-Za-z0-9@_$])*"
   namea = ["name", id] / ["gen", id, +type]
   type = value / namea / ["op", op, type, type] /
          ["map", group] / ["ary", group] / ["tcho", 2*type] /
          ["unwrap", namea] / ["enum", group / namea] /
          ["prim", ?(0..7, ?uint)]
   group = ["mem", null/type, type] /
           ["rep", uint, uint/false, group] /
           ["seq", 2*group] / ["gcho", 2*group]
   value = ["number"/"text"/"bytes", text]

8.  IANA Considerations

   This document makes no requests of IANA.

9.  Security considerations

   The security considerations of [RFC8610] apply.

10.  References

10.1.  Normative References

   [RFC8610]  Birkholz, H., Vigano, C., and C. Bormann, "Concise Data
              Definition Language (CDDL): A Notational Convention to
              Express Concise Binary Object Representation (CBOR) and
              JSON Data Structures", RFC 8610, DOI 10.17487/RFC8610,
              June 2019.

10.2.  Informative References

   [cddlc]    "CDDL conversion utilities", n.d.

   [Err6526]  "Errata Report 6526", RFC 8610.

   [Err6527]  "Errata Report 6527", RFC 8610.

   [Err6543]  "Errata Report 6543", RFC 8610.

   [I-D.handrews-relative-json-pointer]
              Luff, G. and H. Andrews, "Relative JSON Pointers", Work in
              Progress, Internet-Draft,
              draft-handrews-relative-json-pointer-02,
              18 September 2019.

   [I-D.ietf-core-coral]
              Hartke, K., "The Constrained RESTful Application Language
              (CoRAL)", Work in Progress, Internet-Draft,
              draft-ietf-core-coral-03, 9 March 2020.

   [jsonpath] "jsonpath online evaluator", n.d.

   [PCRE]     "Perl-compatible Regular Expressions (revised API:
              PCRE2)", n.d.

   [RFC6020]  Bjorklund, M., Ed., "YANG - A Data Modeling Language for
              the Network Configuration Protocol (NETCONF)", RFC 6020,
              DOI 10.17487/RFC6020, October 2010.

   [RFC6901]  Bryan, P., Ed., Zyp, K., and M. Nottingham, Ed.,
              "JavaScript Object Notation (JSON) Pointer", RFC 6901,
              DOI 10.17487/RFC6901, April 2013.

   [XSD2]     Biron, P. and A. Malhotra, "XML Schema Part 2: Datatypes
              Second Edition", World Wide Web Consortium Recommendation
              REC-xmlschema-2-20041028, 28 October 2004.

Acknowledgements

   Many people have asked for CDDL to be completed, soon.  These are
   usually also the people who have brought up observations that led to
   the proposals discussed here.  Sean Leonard has campaigned for a
   regexp literal syntax.

Author's Address

   Carsten Bormann
   Universität Bremen TZI
   Postfach 330440
   D-28359 Bremen
   Germany

   Phone: +49-421-218-63921
   Email: cabo@tzi.org