idnits 2.17.1

draft-hallambaker-protogen-01.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack a Security Considerations section.

  ** The document seems to lack an IANA Considerations section.  (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** There are 26 instances of lines with control characters in the document.

  ** The document seems to lack a both a reference to RFC 2119 and the
     recommended RFC 2119 boilerplate, even if it appears to use RFC 2119
     keywords.

     RFC 2119 keyword, line 280: '...h case the first SHOULD be a standalon...'

     RFC 2119 keyword, line 291: '...sed in a service MUST inherit from a s...'

     RFC 2119 keyword, line 450: '...ired' option set MUST always be specif...'

     RFC 2119 keyword, line 468: '...Multiple' option MUST be encoded as a ...'

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == Unrecognized Status in 'Intended Status:', assuming Proposed Standard
     (Expected one of 'Standards Track', 'Full Standard', 'Draft Standard',
     'Proposed Standard', 'Best Current Practice', 'Informational',
     'Experimental', 'Informational', 'Historic'.)

  -- The document date (July 6, 2015) is 3217 days in the past.  Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  ** Obsolete normative reference: RFC 4627 (Obsoleted by RFC 7158, RFC 7159)

     Summary: 5 errors (**), 0 flaws (~~), 2 warnings (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

1    Internet Engineering Task Force (IETF)              Phillip Hallam-Baker
2    INTERNET-DRAFT                                         Comodo Group Inc.
3    Intended Status:                                            July 6, 2015
4    Expires: January 7, 2016

6                          Protocol Specification Tool
7                         draft-hallambaker-protogen-01

9    Abstract

11      This document describes the syntax of the PROTOGEN protocol
12      specification tool and its use to generate protocol specifications
13      and prototype and production implementations. While the primary
14      focus of PROTOGEN is to develop protocols using JSON message syntax,
15      the PROTOGEN framework has been successfully applied to generate
16      prototypes using ASN.1, TLS, XML and RFC822 style syntax.

18   Status of This Memo

20      This Internet-Draft is submitted in full conformance with the
21      provisions of BCP 78 and BCP 79.

23      Internet-Drafts are working documents of the Internet Engineering
24      Task Force (IETF). Note that other groups may also distribute
25      working documents as Internet-Drafts. The list of current Internet-
26      Drafts is at http://datatracker.ietf.org/drafts/current/.

28      Internet-Drafts are draft documents valid for a maximum of six months
29      and may be updated, replaced, or obsoleted by other documents at any
30      time. It is inappropriate to use Internet-Drafts as reference
31      material or to cite them other than as "work in progress."

33   Copyright Notice

35      Copyright (c) 2015 IETF Trust and the persons identified as the
36      document authors. All rights reserved.

38      This document is subject to BCP 78 and the IETF Trust's Legal
39      Provisions Relating to IETF Documents
40      (http://trustee.ietf.org/license-info) in effect on the date of
41      publication of this document. Please review these documents
42      carefully, as they describe your rights and restrictions with respect
43      to this document. Code Components extracted from this document must
44      include Simplified BSD License text as described in Section 4.e of
45      the Trust Legal Provisions and are provided without warranty as
46      described in the Simplified BSD License.

48   Table of Contents

50      1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . .  3
51        1.1. Previous work  . . . . . . . . . . . . . . . . . . . . . .  3
52        1.2. Schema driven documentation  . . . . . . . . . . . . . . .  3
53        1.3. Schema driven code generation  . . . . . . . . . . . . . .  4
54        1.4. Application Examples . . . . . . . . . . . . . . . . . . .  4
55      2. Protocol Specification . . . . . . . . . . . . . . . . . . . .  6
56        2.1. Protocol . . . . . . . . . . . . . . . . . . . . . . . . .  6
57        2.2. Description  . . . . . . . . . . . . . . . . . . . . . . .  7
58        2.3. Service  . . . . . . . . . . . . . . . . . . . . . . . . .  7
59        2.4. Transaction  . . . . . . . . . . . . . . . . . . . . . . .  7
60        2.5. Message  . . . . . . . . . . . . . . . . . . . . . . . . .  8
61        2.6. Structure  . . . . . . . . . . . . . . . . . . . . . . . .  8
62        2.7. Status . . . . . . . . . . . . . . . . . . . . . . . . . .  8
63        2.8. Using  . . . . . . . . . . . . . . . . . . . . . . . . . .  8
64      3. Data Types . . . . . . . . . . . . . . . . . . . . . . . . . .  9
65        3.1. Abstract . . . . . . . . . . . . . . . . . . . . . . . . . 10
66        3.2. Inherits . . . . . . . . . . . . . . . . . . . . . . . . . 10
67        3.3. Null Values  . . . . . . . . . . . . . . . . . . . . . . . 10
68        3.4. Lists  . . . . . . . . . . . . . . . . . . . . . . . . . . 10
69        3.5. Decimal  . . . . . . . . . . . . . . . . . . . . . . . . . 11
70        3.6. DateTime . . . . . . . . . . . . . . . . . . . . . . . . . 11
71        3.7. Binary . . . . . . . . . . . . . . . . . . . . . . . . . . 11
72      4. Further Work . . . . . . . . . . . . . . . . . . . . . . . . . 11
73      5. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 11
74      6. References . . . . . . . . . . . . . . . . . . . . . . . . . . 11
75        6.1. Normative References . . . . . . . . . . . . . . . . . . . 11
76      Author's Address  . . . . . . . . . . . . . . . . . . . . . . . . 11

78   1. Introduction

80      The use of schemas to describe communication protocols is well
81      established and plays a central role in the development of ASN.1 and
82      XML based protocols. No such tools are currently widely used for
83      writing JSON based protocols.

85      It is the view of the author that the first, last and only purpose of
86      a protocol schema language is to enable the use of tools to support
87      the development effort. A schema language that delays rather than
88      advances the development of correct code and consistent documentation
89      has become a liability, not an enabler.

91   1.1. Previous work

93      One of the main reasons for the lack of such a tool has been the
94      widespread concern as to the complexity of traditional schema tools
95      and in particular the tendency of such tools to impose a complex data
96      model on simple problems.

98      One major difference in the design of the Protogen schema language
99      from its predecessors is that it does not attempt to support every
100     feature of the JSON data model. Protogen is designed to allow
101     programmers to design and implement network service protocols
102     quickly using widely used programming languages such as C, C# and
103     Java. JSON features that do not map conveniently to the majority of
104     widely used languages are best ignored.

106     The XML Schema language is particularly obtuse, presenting a two
107     level type system in which element definitions provide typing for
108     data and element types provide a type system for elements. At least
109     three different inheritance mechanisms are supported.

111     The ASN.1 schema language introduces a distinction between lists and
112     sets that is entirely frivolous in a serialization format and
113     gratuitous distinctions between implicit and explicit tagging.

115     The lesson to be drawn from these abominations is clear: The primary
116     purpose of a schema language should be to allow the programmer to
117     forget and ignore the wireline representation of protocol messages.
118     Features that allow fine tuning of the wireline representation should
119     be avoided.

121     While the notion of validating input data against a schema prior to
122     passing data to an application is superficially attractive, schema
123     constraints are rarely sufficient for this purpose. Thus applied to
124     protocol design, schema validation rarely provides a meaningful
125     benefit over checking that an encoding is well formed.

127  1.2. Schema driven documentation
128  1.3. Schema driven code generation

130  1.4. Application Examples

132     The following is based on an example from [RFC4627].

134     [
135        {
136           "precision": "zip",
137           "Latitude": 37.7668,
138           "Longitude": -122.3959,
139           "Address": "",
140           "City": "SAN FRANCISCO",
141           "State": "CA",
142           "Zip": "94107",
143           "Country": "US"
144        },
145        {
146           "precision": "zip",
147           "Latitude": 37.371991,
148           "Longitude": -122.026020,
149           "Address": "",
150           "City": "SUNNYVALE",
151           "State": "CA",
152           "Zip": "94085",
153           "Country": "US"
154        }
155     ]

157     The corresponding Protogen schema is:

159     Structure SiteList
160         Description
161             |A list of sites
162         Struct Site Sites
163             Multiple

165     Structure Site
166         Description
167             |A site location
168         String Country
169             Description
170                 |ISO ALPHA-2 Country Code.
171         String precision
172         Decimal Latitude
173         Decimal Longitude
174         String Address
175         String City
176         String State
177         String Zip
178     For the sake of example, the descriptions of the Site structure
179     entries are elided. While Protogen does not require description
180     elements to be provided to produce code, descriptions are of course
181     essential if useful documentation is to be generated.

183     Protogen is built using the Goedel code metasynthesizer which
184     attempts to eliminate all unnecessary clutter from the code
185     specification to minimize error. By default, indentation and the
186     off-side rule are used to denote block structure following the
187     approach used in occam and Python. Punctuation characters are only
188     used to delimit strings ("), text blocks (|) and comments (!).

190     Note that the Latitude and Longitude are specified using the type
191     Decimal rather than Float. This allows an implementation to avoid the
192     loss of precision that inevitably occurs converting between a binary
193     floating point representation such as IEEE 754 binary64 and the
194     decimal encoding used in JSON.

196     The example fragment is sufficient to describe a data structure and
197     generate methods for JSON serialization and deserialization. It is
198     not, however, sufficient to generate a useful implementation of a Web
199     service or client access library. To do this we must define a
200     protocol with services, transactions and messages defined as follows:

202     Protocol
203        A collection of related services.

205     Service
206        A set of transactions with a distinct DNS SRV prefix and HTTP
207        well known service label.

209     Transaction
210        A defined sequence of protocol messages supported by a service.
211        Currently only request-response design pattern is supported.

213     Message
214        A JSON document that corresponds to a request or response.

216     To build a service using the Site structure, we prepend the following
217     declarations:

219     Protocol Sitefinder STFND

221         Service Finder "_siteFinder._wks" "SiteFinder" Request Response
222             Description
223                 |Find sites for new donut stores.

225         Message Request
226             Struct Site WhereIAm
227         Message Response
228             Struct Site WhereAreDonuts
229                 Multiple

231     We can now run Protogen to generate any of the following:

233     *  Documentation in HTML

235     *  Documentation in RFC2XML schema

237     *  A C# client access library.

239     *  A C# stub service library.

241     *  A C header file describing corresponding C structures and data
242        tables to enable serialization/deserialization.

244     Support for partial classes makes C# a particularly attractive target
245     language for code generation as it allows classes produced by
246     generated code to be conveniently extended. Support for other modern
247     languages aligned with the Java/.NET data model requires only
248     straightforward modification of the code generator.

250     While the C# generator is optimized for development of protocols and
251     production code, the generator for C is intended for developing
252     production code after the protocol architecture is largely static.
253     The generator is intentionally biased towards flexibility rather than
254     functionality since a modern programmer using C is most likely to be
255     doing so to build on a legacy code base. The ability to easily adapt
256     the output of the generator to the existing coding style(s) is likely
257     to be more highly valued than minimizing implementation effort.

259  2. Protocol Specification

261  2.1. Protocol

263     Top level specification of a protocol. The Protocol element contains
264     two attributes and a list of entries as follows:

266     Namespace
267        Namespace identifier for use in .NET and Java style programming
268        environments

270     Prefix
271        Prefix for use in C style programming environments.

273     Entries
274        A list of [Service Transaction Message Structure Description
275        Using] elements

277  2.2. Description

279     Describes the parent element. Multiple description elements may be
280     specified in which case the first SHOULD be a standalone short
281     description. The description element has one attribute:

283     Text
284        Text field data identified by use of the | prefix.

286  2.3. Service

288     A service is a named set of transactions within a protocol namespace.

290     At present, due to an implementation limitation, all request and
291     response messages used in a service MUST inherit from a single
292     message type. This is bogus and should be fixed.

294     The service element has the following attributes:

296     ID
297        The code identifier of the service

299     Discovery
300        The DNS service prefix of the service for use in SRV, NAPTR
301        style discovery

303     WellKnown
304        The HTTP well known service prefix.

306     Request
307        The parent class for all request messages supported by the
308        service.

310     Response
311        The parent class for all response messages supported by the
312        service.

314     Entries
315        A list of [Description Status] entries

317  2.4. Transaction

319     Specifies a Request-Response transaction supported by a specified
320     service.

322     At present transactions are specific to a service, which is a
323     needless limitation when multiple services are defined.

325     The Transaction element has the following attributes:

327     Service
328        The identifier of the service

330     ID
331        The identifier of the transaction

333     Request
334        The request message which must not be an abstract type.

336     Response
337        The response message returned for normal completion. An
338        abstract type may be specified.

340     Entries
341        A list of [Description Status] entries

343  2.5. Message

345     Specifies a protocol message. This is almost the same as a structure
346     except that the name of a request message is a command to a server
347     and the name of a response message identifies a response.

349     Id
350        The message identifier

352     Entries
353        A list of [Description Abstract Inherits Boolean Integer Binary
354        Float Label Name String URI DateTime Struct Enum Status
355        Authentication Format Decimal] entries

357  2.6. Structure

359  2.7. Status

361     This feature is not yet implemented. The idea is that status codes
362     should be represented at both the HTTP layer and the JSON layer so
363     that appropriate handling can be specified at either.

365  2.8. Using

367     Specifies a message or structure defined in another schema.

369  3. Data Types

371     Protogen recognizes ten intrinsic data types. While this is
372     considerably more than the three intrinsic types supported in JSON,
373     the additional expressive power allows the tools to do more work for
374     the programmer. For example, distinguishing strings that represent
375     date-time values from other strings allows the tool to perform the
376     work of encoding/decoding these values.

378     The following table summarizes the Protogen schema types and their
379     (default) corresponding C#/C equivalents.

381     +----------+-----------------------------+-------------+------------+
382     | Schema   | JSON                        | C#          | C          |
383     +----------+-----------------------------+-------------+------------+
384     | Boolean  | true | false                | bool        | bool       |
385     |          |                             |             |            |
386     | Float    | number                      | double      | double     |
387     |          |                             |             |            |
388     | Decimal  | number                      | Int64       | long long  |
389     |          |                             |             |            |
390     | Integer  | number                      | Int64       | int        |
391     |          |                             |             |            |
392     | Binary   | string (base64 encoded)     | byte[] Data | BinaryType |
393     |          |                             |             |            |
394     | Label    | string                      | string      | StringType |
395     |          |                             |             |            |
396     | Name     | string                      | string      | StringType |
397     |          |                             |             |            |
398     | String   | string                      | string      | StringType |
399     |          |                             |             |            |
400     | URI      | string                      | string      | StringType |
401     |          |                             |             |            |
402     | DateTime | string                      | DateTime    | struct tm  |
403     +----------+-----------------------------+-------------+------------+

405     Every data type supports the following options:

407     Required
408        The minimum number of occurrences is 1.

410     Multiple
411        Multiple values may be specified.

413     Description
414        Description of the element for use in code generation.

416     Default
417        Default value for the element if unspecified.

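   The options above can be read as rules applied when a field is taken
   from a decoded JSON object. The following Python sketch is
   illustrative only and is not part of Protogen or its generated code.
   The function name read_field and its parameters are hypothetical. It
   assumes that Required checks only that the member is present, and
   that absent, null and empty lists are treated alike as described in
   Sections 3.3 and 3.4.

```python
def read_field(message, name, required=False, multiple=False, default=None):
    """Read one field from a decoded JSON object (a dict).

    Hypothetical helper: the option names mirror the schema options,
    but the behaviour shown here is an assumption, not Protogen's code.
    """
    # Required: the member must appear, even if its value is null.
    if required and name not in message:
        raise ValueError(f"required field {name!r} is missing")

    value = message.get(name)

    if multiple:
        # Absent, null and [] all describe the same (empty) list.
        if value is None:
            return []
        # A Multiple entry must be encoded as a JSON list.
        if not isinstance(value, list):
            raise ValueError(f"field {name!r} must be encoded as a list")
        return value

    # Absent and null are not distinguished; both yield the default.
    return default if value is None else value


site = {"City": "SUNNYVALE", "Tags": None}
city = read_field(site, "City", required=True)
country = read_field(site, "Country", default="US")
tags = read_field(site, "Tags", multiple=True)
print(city, country, tags)
```

   Keeping the three checks in one place means that generated
   accessors for each field need only pass the options declared in the
   schema.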
419     While the Protogen schema definition does include additional options
420     for some data types (e.g. LengthBits, LengthFixed) these are only
421     used in the TLS encoding generator and are ignored when JSON encoding
422     is being used.

424  3.1. Abstract

426     Messages and structures may be marked Abstract which means that they
427     may be used as base classes for inheritance from other messages or
428     structures but cannot appear on the wire.

430  3.2. Inherits

432     Specifies that a message or structure inherits from another message
433     or structure.

435     Note that inheritance relationships are represented in the generated
436     code for languages that support inheritance (e.g. C#) and flattened
437     out in languages that do not (e.g. C).

439  3.3. Null Values

441     No distinction is made between a value that is not present and a
442     value that is present with the value null. Thus the following JSON
443     documents are considered to specify the same object.

445     { "Value": 1 }

447     { "Value": 1,
448       "Optional": null }

450     An entry that has the 'Required' option set MUST always be specified
451     even if the value is null.

453  3.4. Lists

455     No distinction is made between a list that is not present, a list
456     with the null value and an empty list. Thus the following encodings
457     describe the same object:

459     { "Value": 1 }

461     { "Value": 1,
462       "List": null }

464     { "Value": 1,
465       "List": [] }

467     To simplify scripting language implementation, an entry that has the
468     'Multiple' option MUST be encoded as a list.

470  3.5. Decimal

472     The decimal encoding provides an alternative to use of floating point
473     to represent decimal fractions.

475     Since 10 is not a power of 2, conversion between decimal and binary
476     fractions is inexact and using Real32 or Real64 values for this
477     purpose introduces an unnecessary loss of precision.

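   The loss of precision can be seen directly. The following Python
   sketch is illustrative only and is not part of Protogen. It uses the
   standard json and decimal modules and the Latitude value from the
   example in Section 1.4; the variable names are chosen for this
   example.

```python
import json
from decimal import Decimal

# 0.1 and 0.2 have no exact binary64 representation, so the sum drifts.
binary_sum = 0.1 + 0.2
print(binary_sum == 0.3)             # False

# The same arithmetic on decimal values is exact.
exact_sum = Decimal("0.1") + Decimal("0.2")
print(exact_sum == Decimal("0.3"))   # True

# Asking the JSON parser for Decimal keeps the digits as written.
site = json.loads('{"Latitude": 37.371991}', parse_float=Decimal)
print(site["Latitude"])              # 37.371991

# Converting the binary64 value instead exposes the rounding error.
from_binary = Decimal(37.371991)
print(from_binary == site["Latitude"])   # False
```

   The comparison on the last line fails because 37.371991 cannot be
   written as a finite binary fraction, so the binary64 value is only
   the nearest representable neighbour.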
479     Since many programming languages lack support for a Decimal intrinsic
480     type, this is implemented by mapping the datum to a 64 bit integer
481     with a scale factor of 1,000,000,000. This approach allows numbers up
482     to 9,223,372,036 to be represented with nine decimal places.

484  3.6. DateTime

486     DateTime values are encoded as strings in IETF format.

488  3.7. Binary

490     Binary values are encoded using BASE64URL encoding.

492  4. Further Work

494  5. Acknowledgements

496  6. References

498  6.1. Normative References

500     [RFC4627]  Crockford, D., "The application/json Media Type for
501                JavaScript Object Notation (JSON)", RFC 4627, July 2006.

503  Author's Address

505     Phillip Hallam-Baker
506     Comodo Group Inc.

508     philliph@comodo.com