Network Working Group                                    B. Greevenbosch
Internet-Draft                                       Huawei Technologies
Intended status: Informational                             June 27, 2014
Expires: August 18, 2014


 CBOR data definition language: a notational convention to express CBOR
                            data structures.
                draft-greevenbosch-appsawg-cbor-cddl-02

Abstract

   This document proposes a notational convention to express CBOR data
   structures. Its main goal is to make it easy to express message
   structures for protocols that use CBOR.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on December 27, 2014.

Copyright Notice

   Copyright (c) 2014 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Requirements notation
   2.  Introduction
   3.  Definitions
   4.  Notational conventions
     4.1.  General conventions
     4.2.  Keywords for primitive datatypes
     4.3.  Arrays
     4.4.  Structures
     4.5.  Maps
     4.6.  Tags
     4.7.  Optional variables
   5.  Examples
     5.1.  Moves in a computer game
     5.2.  Fruit
   6.  Philosophy
   7.  Open Issues
   8.  Change Log
   9.  Security considerations
   10. IANA considerations
   11. Acknowledgements
   12. Normative References
   Author's Address

1.  Requirements notation

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

2.  Introduction

   This document defines a notational convention to express CBOR
   [RFC7049] data structures.

   The main goal of the convention is to provide a unified notation
   that can be used when defining protocols that use CBOR.

   The CBOR notational convention has the following goals:

   (G1)  Able to provide an unambiguous description of CBOR data
         structures.

   (G2)  Easy for humans to read and write.

   (G3)  Flexibility to express the freedoms of choice in the CBOR data
         format.

   (G4)  Possibility to restrict format choices where appropriate.

   (G5)  Able to express common CBOR datatypes and structures.

   (G6)  Human and machine readable and processable.

   (G7)  Usable for automatic verification of whether CBOR data is
         compliant with a predefined format.

3.  Definitions

   The following terms are used in this document:

   "datatype"  The format of a variable.

   "variable"  A data component encoded in CBOR.

4.  Notational conventions

4.1.  General conventions

   The basic syntax is as follows:

   o  Each field has a name and a datatype.

   o  The name is written first, followed by a colon and then the
      datatype. The declaration is terminated by a semicolon.
      Whitespace may appear around the colon and semicolon, as well as
      in front of the name.

   o  The name does not appear in the actual CBOR encoding.

   o  Comments are preceded by a '#' character.

   o  Variable names and datatypes are case sensitive.

4.2.  Keywords for primitive datatypes

   The following keywords for primitive datatypes are defined:

   "bool"  Boolean value (major type 7, additional information 20 or
      21).

   "bstr"  A byte string (major type 2).

   "float(16)"  IEEE 754 half-precision float (major type 7, additional
      information 25).

   "float(32)"  IEEE 754 single-precision float (major type 7,
      additional information 26).

   "float(64)"  IEEE 754 double-precision float (major type 7,
      additional information 27).

   "int"  An unsigned integer (major type 0) or a negative integer
      (major type 1).

   "nint"  A negative integer (major type 1).

   "simple"  Simple value (major type 7, additional information 24).

   "tstr"  A text string (major type 3).

   "uint"  An unsigned integer (major type 0).

   In addition, Section 4.6 defines datatypes associated with CBOR tags.

4.3.  Arrays

   Arrays can be of fixed length or of variable length. Both fixed
   length and variable length arrays can be encoded as either definite
   length or indefinite length CBOR arrays.

   A fixed length array is indicated by '[' and ']' characters behind
   its type, where the number in between specifies the number of
   elements.

   A variable length array is indicated by a "*" behind its type.
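
   To make the definite versus indefinite distinction concrete at the
   byte level, the following is a minimal Python sketch. It is not part
   of the notation; the helper names (encode_uint,
   encode_array_definite, encode_array_indefinite) are hypothetical,
   and the sketch assumes elements are unsigned integers in the range
   0..255 and arrays shorter than 24 elements.

```python
def encode_uint(value: int) -> bytes:
    """Encode an unsigned integer (major type 0); this sketch only covers 0..255."""
    if not 0 <= value <= 255:
        raise ValueError("sketch only handles 0..255")
    if value < 24:
        return bytes([value])        # value fits in the initial byte
    return bytes([0x18, value])      # additional information 24: one byte follows


def encode_array_definite(items: list) -> bytes:
    """Definite length array (major type 4): element count in the initial byte."""
    if len(items) >= 24:
        raise ValueError("sketch only handles fewer than 24 elements")
    return bytes([0x80 | len(items)]) + b"".join(encode_uint(i) for i in items)


def encode_array_indefinite(items: list) -> bytes:
    """Indefinite length array: 0x9F start marker, elements, 0xFF "break"."""
    return b"\x9f" + b"".join(encode_uint(i) for i in items) + b"\xff"


numbers = [1, 2, 3, 5]
print(encode_array_definite(numbers).hex(" "))
print(encode_array_indefinite(numbers).hex(" "))
```

   Either encoding carries the same four elements; the notation
   describes the element count and types, not which of the two CBOR
   array encodings is used.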

   The following is an example of an array of 4 integers:

      fourNumbers: int[4];

   The following is an example of a variable length array:

      fibonacci : uint*;

4.4.  Structures

   Structures are a logical grouping of CBOR fields.

   A structure has a name, which can be used as a datatype for other
   fields. The name is followed by a '{' character and the definitions
   of the variables inside the structure. The structure is closed by
   a '}' character.

   A structure MAY be encoded as an array, in which case its name is
   preceded by a '*' character. Otherwise there is no CBOR encoding for
   the grouping.

   The following is an example of a structure:

      *Geography {
        city           : tstr;
        gpsCoordinates : GpsCoordinates;
      }

      GpsCoordinates {
        longitude      : uint;  # multiplied by 10^7
        latitude       : uint;  # multiplied by 10^7
      }

   When encoding, the Geography structure is encoded using a CBOR array,
   whereas the GpsCoordinates do not have their own encompassing array.

4.5.  Maps

   If an entity is a map (major type 5), its datatype has the form

      map( x, y )

   where the keys have datatype x, and the values have datatype y.

   If either x or y is unspecified (i.e. free to choose per entry), it
   is replaced by a '.'.

   It is also possible to define a map with predefined keys as a type.
   In this case, the type declaration is as follows:

      x: map( y ) {
        key1: type1;
        key2: type2;
        ...
      }

   y is the datatype of the keys, and type1, type2, etc. are the
   datatypes of the values associated with keys key1, key2, etc.

   The name of an optional map element is preceded by a '?' character.

   The example below defines a map with a display name (as a string),
   optionally the name components first name and family name (see
   Section 4.7 for more on optional variables), and age information (as
   an unsigned int).

      PersonalData: map( tstr ) {
        "displayName": tstr;
        ?"nameComponents": NameComponents;
        "age": uint;
      }

      NameComponents: map( tstr, tstr ) {
        "firstName": tstr;
        "familyName" : tstr;
      }

   It is up to the application how to handle unknown tags; however, it
   is RECOMMENDED to ignore them.

4.6.  Tags

   A variable can have an associated CBOR tag (major type 6). This is
   indicated by the tag number enclosed in square brackets '[' and
   ']', placed just before the variable's datatype.

   For example, the following defines a positive bignum N:

      N: [2]bstr;

   [RFC7049] defines several tags. These tags can also be written using
   the datatypes from Table 1. For table rows with an empty "possible
   tag notation" entry, we refer to Table 3 in [RFC7049] and associated
   references for the possible encodings.

   For example, the following is another way to define the bignum:

      N: bignum;

   +------------+-----------------+------------------------------------+
   | datatype   | possible tag    | description                        |
   |            | notation        |                                    |
   +------------+-----------------+------------------------------------+
   | b64        | [34]tstr        | Base 64 (tag 34)                   |
   |            |                 |                                    |
   | b64url     | [33]tstr        | Base 64 URL (tag 33)               |
   |            |                 |                                    |
   | bigfloat   |                 | bigfloat (tag 5)                   |
   |            |                 |                                    |
   | bignum     | [2]bstr or      | positive (tag 2) or negative       |
   |            | [3]bstr         | (tag 3) bignum                     |
   |            |                 |                                    |
   | cbor       | [24]bstr        | Encoded CBOR data item (tag 24)    |
   |            |                 |                                    |
   | decfrac    |                 | decimal fraction (tag 4)           |
   |            |                 |                                    |
   | eb16       |                 | Expected conversion to base16      |
   |            |                 | encoding (tag 23)                  |
   |            |                 |                                    |
   | eb64       |                 | Expected conversion to base64      |
   |            |                 | encoding (tag 22)                  |
   |            |                 |                                    |
   | eb64url    |                 | Expected conversion to base64 url  |
   |            |                 | encoding (tag 21)                  |
   |            |                 |                                    |
   | epochdt    |                 | epoch date/time (tag 1)            |
   |            |                 |                                    |
   | mime       | [36]tstr        | Mime message (tag 36)              |
   |            |                 |                                    |
   | nbignum    | [3]bstr         | negative bignum (tag 3)            |
   |            |                 |                                    |
   | regex      | [35]tstr        | regular expression (tag 35)        |
   |            |                 |                                    |
   | standarddt | [0]tstr         | standard date/time string (tag 0)  |
   |            |                 |                                    |
   | ubignum    | [2]bstr         | positive bignum (tag 2)            |
   |            |                 |                                    |
   | uri        | [32]tstr        | URI (tag 32)                       |
   +------------+-----------------+------------------------------------+

                                  Table 1

4.7.  Optional variables

   There may be variables or structures whose inclusion is optional. In
   this case, the name of the variable is preceded by a '?' character.
   For example, the following defines a CBOR structure whose contents
   depend on a boolean value.

      *MainStruct {
        whichForm      : bool;
        ?data1         : Form1;  # when whichForm == true
        ?data2         : Form2;  # when whichForm == false
      }

      Form1 {
        anInteger      : int;
        aTextString    : tstr;
      }

      Form2 {
        aFloat         : float(16);
        aBinaryString  : bstr;
      }

   Notice that it is not possible to define the relationship between
   "whichForm" and the inclusion of either "data1" or "data2" with CBOR
   content rules. Such a relationship should be communicated to the
   implementer by other means, for example in the text of the
   specification that uses the CBOR structure, or with comments as was
   done in this example.

   Protocol designers should exercise the utmost care when defining CBOR
   structures with optional variables, especially when some of these
   variables have the same datatype.

   For example, the following CBOR data structure is ambiguous:

      *DataStruct {
        ?OptionalVariable        : uint;
        MandatoryVariable        : uint;
        ?AnotherOptionalVariable : uint;
      }

   A decoder that receives an array of two unsigned integers cannot
   tell whether the first or the last optional variable was omitted.

   Since optional variables are often detected from their datatype, it
   is RECOMMENDED not to place multiple variables of the same datatype
   in sequence when some of these variables are optional.

5.  Examples

   This section contains various examples of structures defined using
   the CBOR notational convention.

5.1.  Moves in a computer game

   A multiplayer computer game uses CBOR to exchange moves between the
   players. To ensure a good gaming experience, the move information
   needs to be exchanged quickly and frequently. Therefore, the game
   uses CBOR to send its information in a compact format. Figure 1
   shows the definition of the CBOR information exchange format.

      *UpdateMsg {
        move_no       : uint;        # increases for each move
        player_info   : PlayerInfo;  # general information
        moves         : Moves*;      # moves in this message
      }

      PlayerInfo {
        alias         : tstr;
        player_id     : uint;
        experience    : uint;        # beginner: 0; expert: 3
        gold          : uint;
        supplies      : Supplies;
        avg_strength  : float(16);
      }

      Supplies : map( uint ) {
        0 : uint;                    # wood
        1 : uint;                    # iron
        2 : uint;                    # grain
      }

      *Moves {
        unit_id       : uint;
        unit_strength : uint;        # between 0 and 100
        source_pos    : uint[2];     # (x,y)
        target_pos    : uint[2];     # (x,y)
      }

     Figure 1: CBOR definition of an information exchange format for a
                               computer game

   Notice that the supplies have been encoded as a map with integer
   keys. In this example, using string keys would also have been
   suitable. However, the example illustrates the possibility of using
   other datatypes for keys, leading to a more efficient encoding.

   Player "Johnny" makes two moves. The game server has assigned Johnny
   the ID 0x7a3b871f. Johnny is an amateur player, so has experience 1.
   He currently has 1200 gold, 13 units of wood, 70 units of iron and 29
   units of grain. He has several units, with a total average strength
   of 30.25.

   The units Johnny plays in move 250 are the unit with ID 19, strength
   20 from (5,7) to (6,9), and the unit with ID 87, strength 40 from
   (7,10) to (6,10).

   This information is coded in CBOR as depicted in Figure 2.

   9F
      18 FA                  # move 250
      66 4A 6F 68 6E 6E 79   # "Johnny"
      1A 7A 3B 87 1F         # player_id
      01                     # experience
      19 04 B0               # 1200 gold as uint
      A3                     # begin map "supplies" with 3 elements
         00                  # wood:
         0D                  #   13 as uint
         01                  # iron:
         18 46               #   70 as uint
         02                  # grain:
         18 1D               #   29 as uint
      F9 4F 90               # average strength 30.25 half-precision float
      9F                     # indefinite length "moves" array
         84                  # 4-element array Moves
            13               # unit id 19 as uint
            14               # strength 20 as uint
            82               # 2-element array source_pos
               05            # source_pos.x=5
               07            # source_pos.y=7
            82               # 2-element array target_pos
               06            # target_pos.x=6
               09            # target_pos.y=9
         84                  # 4-element array Moves
            18 57            # unit id 87
            18 28            # strength 40
            82               # 2-element array source_pos
               07            # source_pos.x=7
               0A            # source_pos.y=10
            82               # 2-element array target_pos
               06            # target_pos.x=6
               0A            # target_pos.y=10
         FF                  # end of "moves" array
   FF

                 Figure 2: CBOR instance for game example

5.2.  Fruit

   Figure 3 contains an example of a CBOR structure that contains
   information about fruit.

      fruitlist : Fruit*;

      *Fruit {
        name                : tstr;
        colour              : uint*;
        avg_weight          : float( 16 );
        price               : uint;
        international_names : International;
        rfu                 : bstr;  # reserved for future use
      }

      International : map( tstr ) {
        "CN" : tstr;  # Chinese
        "NL" : tstr;  # Dutch
        "EN" : tstr;  # English
        "FR" : tstr;  # French
        "DE" : tstr;  # German
      }

                     Figure 3: Example CBOR structure

   Each colour integer can have one of the values from Table 2.

                           +---------+-------+
                           | Colour  | Value |
                           +---------+-------+
                           | black   | 0     |
                           |         |       |
                           | red     | 1     |
                           |         |       |
                           | green   | 2     |
                           |         |       |
                           | yellow  | 3     |
                           |         |       |
                           | blue    | 4     |
                           |         |       |
                           | magenta | 5     |
                           |         |       |
                           | cyan    | 6     |
                           |         |       |
                           | white   | 7     |
                           |         |       |
                           | orange  | 8     |
                           |         |       |
                           | pink    | 9     |
                           |         |       |
                           | purple  | 10    |
                           |         |       |
                           | brown   | 11    |
                           |         |       |
                           | grey    | 12    |
                           +---------+-------+

               Table 2: Possible values for the colour field

   For example, apples can be red, yellow or green. They have an
   average weight of 0.195kg and a price of 30 cents. Chinese for
   "apple" in UTF-8 is [ E8 8B B9 E6 9E 9C ], the Dutch word is "appel"
   and the French word "pomme".

   For simplicity, let's assume that the colour of oranges can only be
   orange. They have an average weight of 0.230kg and a price of 50
   cents. Chinese for "orange" in UTF-8 is [ E6 A9 99 E5 AD 90 ], the
   Dutch word is "sinaasappel" and the German word "Orange".

   This information would be encoded as depicted in Figure 4.

   9F                           # indefinite length "fruitlist" array
      86                        # First "Fruit" instance, 6 elements
         65                     # text string "name" length 5
            61 70 70 6C 65      # "apple"
         83                     # array for "colour", 3 elements
            01                  # "red" as uint
            02                  # "green" as uint
            03                  # "yellow" as uint
         F9                     # Floating point half precision
            32 3D               # "avg_weight" 0.195
         18 1E                  # "price" 30 as uint
         A3                     # map "international_names", 3 pairs
            62 43 4E            # text string length 2, "CN"
            66 E8 8B B9 E6 9E 9C  # Chinese word for apple
            62 4E 4C            # "NL"
            65 61 70 70 65 6C   # "appel"
            62 46 52            # "FR"
            65 70 6F 6D 6D 65   # "pomme"
         40                     # byte string "rfu", 0 bytes length
      86                        # Second "Fruit" instance
         66                     # text string "name" length 6
            6F 72 61 6E 67 65   # "orange"
         81                     # array for "colour", 1 element
            08                  # "orange" as uint
         F9                     # Floating point half precision
            33 5C               # "avg_weight" 0.230
         18 32                  # "price" 50 as uint
         A3                     # map "international_names", 3 pairs
            62 43 4E            # text string length 2, "CN"
            66 E6 A9 99 E5 AD 90  # Chinese word for orange
            62 4E 4C            # "NL"
            6B 73 69 6E 61 61 73 61 70 70 65 6C  # "sinaasappel"
            62 44 45            # "DE"
            66 4F 72 61 6E 67 65  # "Orange"
         40                     # byte string "rfu", 0 bytes length
   FF                           # end of "fruitlist" array

                      Figure 4: Example CBOR instance

   Notice that if the "Fruit" structure did not have the preceding "*",
   the two "Fruit" instance arrays would have been omitted. In
   addition, the "fruitlist" array would have had 12 elements instead of
   2. (Although for "fruitlist" the indefinite length approach was
   chosen, such that the number of elements is not explicitly
   signalled.)

6.  Philosophy

   The CBOR notational convention can be used to efficiently define the
   layout of CBOR data.

   In addition, it has been specified such that a machine can verify
   whether or not CBOR data is compliant with its definition. The
   thoroughness of this compliance verification depends on the
   application.
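
   As a rough illustration of such machine verification, the following
   Python sketch checks already-decoded data against the PersonalData
   map from Section 4.5. It is only a sketch under stated assumptions:
   the CBOR data is assumed to have been decoded into a Python dict by
   some CBOR decoder, and all names in it (is_tstr, is_uint, is_map,
   PERSONAL_DATA, verify_map) are hypothetical and not defined by this
   document.

```python
def is_tstr(value) -> bool:
    """Text string (major type 3), decoded as a Python str."""
    return isinstance(value, str)


def is_uint(value) -> bool:
    """Unsigned integer (major type 0); bool is excluded since it subclasses int."""
    return isinstance(value, int) and not isinstance(value, bool) and value >= 0


def is_map(value) -> bool:
    """Map (major type 5), decoded as a Python dict."""
    return isinstance(value, dict)


# Hypothetical in-memory form of the PersonalData definition:
# key -> (datatype check, optional flag).
PERSONAL_DATA = {
    "displayName": (is_tstr, False),
    "nameComponents": (is_map, True),   # declared with a leading '?'
    "age": (is_uint, False),
}


def verify_map(data: dict, schema: dict) -> list:
    """Return a list of problems; an empty list means the map is compliant."""
    problems = []
    for key, (check, optional) in schema.items():
        if key not in data:
            if not optional:
                problems.append(f"missing mandatory key {key!r}")
            continue
        if not check(data[key]):
            problems.append(f"wrong datatype for key {key!r}")
    return problems


print(verify_map({"displayName": "Johnny", "age": 30}, PERSONAL_DATA))
print(verify_map({"displayName": "Johnny"}, PERSONAL_DATA))
```

   This sketch ignores keys that are not listed in the definition and
   does not descend into nested maps. How much of this checking is
   worthwhile varies from one application to another.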

   For example, an application may decide not to verify the data
   structure at all, and use the CBOR content rules solely as a means to
   indicate the structure of the data to the programmer.

   At the other extreme, the application may implement a verification
   method that goes as far as verifying that all mandatory map keys are
   present.

   The extent to which the data description must be enforced by an
   application is left to the implementers and specifiers of that
   application.

7.  Open Issues

   At least the following issues need further consideration:

   o  Whether to remove optional variables (other than in maps).

   o  More extensive security considerations.

   o  Whether structures without an encapsulating CBOR array encoding
      should be omitted.

8.  Change Log

   Changes from version 00 to version 01

   o  Removed constants

   o  Updated the tag mechanism

   o  Extended the map structure

   o  Added examples

9.  Security considerations

   This document presents a content rules language for expressing CBOR
   data structures. As such, it does not introduce any security issues
   by itself, although specifications of protocols that use CBOR
   naturally need security analysis when defined.

10.  IANA considerations

   This document does not require any IANA registrations.

11.  Acknowledgements

   This draft was inspired by the C and Pascal languages, MPEG's
   conventions for describing structures in the ISO base media file
   format, and Andrew Lee Newton's "JSON Content Rules" draft.

   Useful feedback came from Carsten Bormann and Joe Hildebrand.

12.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC7049]  Bormann, C. and P. Hoffman, "Concise Binary Object
              Representation (CBOR)", RFC 7049, October 2013.

Author's Address

   Bert Greevenbosch
   Huawei Technologies Co., Ltd.
   Huawei Industrial Base
   Bantian, Longgang District
   Shenzhen  518129
   P.R. China

   Email: bert.greevenbosch@huawei.com