idnits 2.17.1

draft-ietf-drums-abnf-v2-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Cannot find the required boilerplate sections (Copyright, IPR, etc.) in
     this document.
     Expected boilerplate is as follows today (2024-04-26) according to
     https://trustee.ietf.org/license-info :

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.a:

        This Internet-Draft is submitted in full conformance with the
        provisions of BCP 78 and BCP 79.

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 2:

        Copyright (c) 2024 IETF Trust and the persons identified as the
        document authors.  All rights reserved.

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 3:

        This document is subject to BCP 78 and the IETF Trust's Legal
        Provisions Relating to IETF Documents
        (https://trustee.ietf.org/license-info) in effect on the date of
        publication of this document.  Please review these documents
        carefully, as they describe your rights and restrictions with respect
        to this document.  Code Components extracted from this document must
        include Simplified BSD License text as described in Section 4.e of
        the Trust Legal Provisions and are provided without warranty as
        described in the Simplified BSD License.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  ** Missing expiration date.  The document expiration date should appear on
     the first and last page.

  ** The document seems to lack a 1id_guidelines paragraph about
     Internet-Drafts being working documents.

  ** The document seems to lack a 1id_guidelines paragraph about 6 months
     document validity -- however, there's a paragraph with a matching
     beginning. Boilerplate error?

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     current Internet-Drafts.

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     Shadow Directories.

  ** Expected the document's filename to be given on the first page, but
     didn't find any

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard

  == The page length should not exceed 58 lines per page, but there was 1
     longer page, the longest (page 1) being 656 lines


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an Abstract section.

  ** The document seems to lack an IANA Considerations section.  (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** The document seems to lack an Authors' Addresses Section.

  ** The document seems to lack separate sections for Informative/Normative
     References.  All references will be assumed normative when checking for
     downward references.

  ** The document seems to lack both a reference to RFC 2119 and the
     recommended RFC 2119 boilerplate, even if it appears to use RFC 2119
     keywords.

     RFC 2119 keyword, line 45: '...3.8  OPTIONAL SEQUENCE [RULE]...'


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == Line 94 has weird spacing: '...ng with an al...'

  == Line 103 has weird spacing: '...are not requi...'

  == Line 105 has weird spacing: '... use of a rul...'

  == Line 343 has weird spacing: '...ires at least...'

  == Line 356 has weird spacing: '...rrences of

Demon Internet Ltd

6    Augmented BNF for Syntax Specifications: ABNF

8    STATUS OF THIS MEMO

10      This document is an Internet-Draft.  Internet-Drafts are working
11      documents of the Internet Engineering Task Force (IETF), its
12      areas, and its working groups.  Note that other groups may also
13      distribute working documents as Internet-Drafts.

15      Internet-Drafts are draft documents valid for a maximum of six
16      months and may be updated, replaced, or obsoleted by other
17      documents at any time.  It is inappropriate to use Internet-
18      Drafts as reference material or to cite them other than as ``work
19      in progress.''

21      To learn the current status of any Internet-Draft, please check
22      the ``1id-abstracts.txt'' listing contained in the Internet-
23      Drafts Shadow Directories on ftp.is.co.za (Africa), nic.nordu.net
24      (Europe), munnari.oz.au (Pacific Rim), ftp.ietf.org (US East
25      Coast), or ftp.isi.edu (US West Coast).

27   TABLE OF CONTENTS

29   1.  INTRODUCTION

31   2.  RULE DEFINITION
32     2.1  RULE NAMING
33     2.2  RULE FORM
34     2.3  TERMINAL VALUES
35     2.4  EXTERNAL ENCODINGS

37   3.  OPERATORS
38     3.1  CONCATENATION                  RULE1 RULE2
39     3.2  ALTERNATIVES                   RULE1 / RULE2
40     3.3  INCREMENTAL ALTERNATIVES       RULE1 =/ RULE2
41     3.4  VALUE RANGE ALTERNATIVES       %C##-##
42     3.5  SEQUENCE GROUP                 (RULE1 RULE2)
43     3.6  VARIABLE REPETITION            *RULE
44     3.7  SPECIFIC REPETITION            NRULE
45     3.8  OPTIONAL SEQUENCE              [RULE]
46     3.9  ; COMMENT
47     3.10 OPERATOR PRECEDENCE

49   4.  ABNF DEFINITION OF ABNF

51   5.  SECURITY CONSIDERATIONS

53   6.  APPENDIX A - CORE
54     6.1  CORE RULES
55     6.2  COMMON ENCODING

57   7.  APPENDIX B - ENHANCEMENTS
58     7.1  CONCATENATED TERMINAL VALUES
59     7.2  BINARY LITERAL VALUES

61   8.  ACKNOWLEDGMENTS

63   9.  REFERENCES
64   10. CONTACT

66   1.  INTRODUCTION

68      Internet technical specifications often need to define a format
69      syntax and are free to employ whatever notation their authors
70      deem useful.  Over the years, a modified version of Backus-Naur
71      Form (BNF), called Augmented BNF (ABNF), has been popular among
72      many Internet specifications.  It balances compactness and
73      simplicity, with reasonable representational power.  In the early
74      days of the Arpanet, each specification contained its own
75      definition of ABNF.  This included the email specifications,
76      RFC733 and then RFC822 which have come to be the common citations
77      for defining ABNF.
The current document separates out that
78      definition, to permit selective reference.  Predictably, it also
79      provides some modifications and enhancements.

81      The differences between standard BNF and ABNF involve naming
82      rules, repetition, alternatives, order-independence, and value
83      ranges.  Appendix A (Core) supplies rule definitions and encoding
84      for a core lexical analyzer of the type common to several
85      Internet specifications.  It is provided as a convenience and is
86      otherwise separate from the meta language defined in the body of
87      this document, and separate from its formal status.

89   2.  RULE DEFINITION

91   2.1  Rule Naming

93      The name of a rule is simply the name itself; that is, a sequence
94      of characters, beginning with an alphabetic character, and
95      followed by a combination of alphabetics, digits and hyphens
96      (dashes).

98           NOTE:  Rule names are case-insensitive

100     The names <rulename>, <Rulename>, <RULENAME> and <rULeNamE> all
101     refer to the same rule.

103     Unlike original BNF, angle brackets ("<", ">") are not required.
104     However, angle brackets may be used around a rule name whenever
105     their presence will facilitate discerning the use of a rule
106     name.  This is typically restricted to rule name references in
107     free-form prose, or to distinguish partial rules that combine
108     into a string not separated by white space, such as shown in the
109     discussion about repetition, below.

111  2.2  Rule Form

113     A rule is defined by the following sequence:

115          name =  elements crlf

117     where <name> is the name of the rule, <elements> is one or more
118     rule names or terminal specifications and <crlf> is the end-of-
119     line indicator, carriage return followed by line feed.  The equal
120     sign separates the name from the definition of the rule.  The
121     elements form a sequence of one or more rule names and/or value
122     definitions, combined according to the various operators, defined
123     in this document, such as alternative and repetition.

125     For visual ease, rule definitions are left aligned.
When a rule
126     requires multiple lines, the continuation lines are indented.
127     The left alignment and indentation are relative to the first
128     lines of the ABNF rules and need not match the left margin of the
129     document.

131  2.3  Terminal Values

133     Rules resolve into a string of terminal values, sometimes called
134     characters.  In ABNF a character is merely a non-negative
135     integer.  In certain contexts a specific mapping (encoding) of
136     values into a character set (such as ASCII) will be specified.

138     Terminals are specified by one or more numeric characters with
139     the base interpretation of those characters indicated explicitly.
140     The following bases are currently defined:

142          d = decimal

144          x = hexadecimal

146     Hence:

148          CR =  %d13

150          CR =  %x0D

152     respectively specify the decimal and hexadecimal representation
153     of [US-ASCII] for carriage return.

155     ABNF permits specifying literal text strings directly, enclosed in
156     quotation-marks.  Hence:

158          command =  "command string"

160     Literal text strings are interpreted as a concatenated set of
161     printable characters.

163          NOTE:  ABNF strings are case-insensitive and
164                 the character set for these strings is
165                 US-ASCII.

167     Hence:

169          rulename = "abc"

171     and:

173          rulename = "aBc"

175     will match "abc", "Abc", "aBc", "abC", "ABc", "aBC", "AbC" and
176     "ABC".

178          To specify a rule which IS case SENSITIVE,
179          specify the characters individually.

181     For example:

183          rulename =  %d97 %d98 %d99

185     will match only the string that consists solely of lowercase
186     characters, abc.

188  2.4  External Encodings

190     External representations of terminal value characters will vary
191     according to constraints in the storage or transmission
192     environment.  Hence, the same ABNF-based grammar may have
193     multiple external encodings, such as one for a 7-bit US-ASCII
194     environment, another for a binary octet environment and still a
195     different one when 16-bit Unicode is used.
Encoding details are
196     beyond the scope of ABNF, although Appendix A (Core) provides
197     definitions for a 7-bit US-ASCII environment as has been common
198     to much of the Internet.

200     By separating external encoding from the syntax, it is intended
201     that alternate encoding environments can be used for the same
202     syntax.

204  3.  OPERATORS

206  3.1  Concatenation                  Rule1 Rule2

208     A rule can define a simple, ordered string of values -- that is,
209     a concatenation of contiguous characters -- by listing a sequence
210     of rule names.  For example:

212          foo         =  %x61           ; a

214          bar         =  %x62           ; b

216          mumble      =  foo bar foo

218     So that the rule <mumble> matches the lowercase string "aba".

220     LINEAR WHITE SPACE:  Concatenation is at the core of the ABNF
221     parsing model.  A string of contiguous characters (values) is
222     parsed according to the rules defined in ABNF.  For Internet
223     specifications, there is some history of permitting linear white
224     space (space and horizontal tab) to be freely--and implicitly--
225     interspersed around major constructs, such as delimiting special
226     characters or atomic strings.

228          NOTE:  This specification for ABNF does not
229                 provide for implicit specification of
230                 linear white space.

232     Any grammar which wishes to permit linear white space around
233     delimiters or string segments must specify it explicitly.  It is
234     often useful to provide for such white space in "core" rules that
235     are then used variously among higher-level rules.  The "core"
236     rules might be formed into a lexical analyzer or simply be part
237     of the main ruleset.

239  3.2  Alternatives                   Rule1 / Rule2

241     Elements separated by forward slash ("/") are alternatives.
242     Therefore,

244          foo / bar

246     will accept <foo> or <bar>.
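
The terminal values, concatenation and alternatives described so far can be tried out with a minimal Python sketch. It is not part of the draft: it simply maps each construct onto a regular-expression fragment, and the helper names (literal, num_val, concat, alt, matches) are assumptions made for this illustration only.

```python
import re

# Hypothetical helpers for illustration; ABNF itself defines none of these.

def literal(text: str) -> str:
    """Fragment for an ABNF quoted string, which is case-insensitive."""
    return "(?i:" + re.escape(text) + ")"

def num_val(*codes: int) -> str:
    """Fragment for exact numeric terminals such as %d97 %d98 %d99."""
    return "".join(re.escape(chr(code)) for code in codes)

def concat(*parts: str) -> str:
    """Concatenation: the parts must appear in the given order."""
    return "".join("(?:" + part + ")" for part in parts)

def alt(*parts: str) -> str:
    """Alternatives: any one of the parts is accepted."""
    return "(?:" + "|".join("(?:" + part + ")" for part in parts) + ")"

def matches(fragment: str, text: str) -> bool:
    """True when the whole of text is matched by the fragment."""
    return re.fullmatch(fragment, text) is not None

foo = num_val(0x61)               # foo    = %x61 ; a
bar = num_val(0x62)               # bar    = %x62 ; b
mumble = concat(foo, bar, foo)    # mumble = foo bar foo
foo_or_bar = alt(foo, bar)        # foo / bar

print(matches(mumble, "aba"))                    # numeric terminals: exact case
print(matches(mumble, "ABA"))
print(matches(literal("abc"), "aBc"))            # quoted string: any case
print(matches(num_val(97, 98, 99), "ABC"))
print(matches(foo_or_bar, "b"))
```

Numeric terminals are compared exactly, while the quoted string is wrapped in a case-insensitive group, which mirrors the distinction drawn in Section 2.3.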

248          NOTE:  A quoted string containing alphabetic
249                 characters is a special form for
250                 specifying alternative characters and is
251                 interpreted as a non-terminal
252                 representing the set of combinatorial
253                 strings with the contained characters,
254                 in the specified order but with any
255                 mixture of upper and lower case.

257  3.3  Incremental Alternatives       Rule1 =/ Rule2

259     It is sometimes convenient to specify a list of alternatives in
260     fragments.  That is, an initial rule may match one or more
261     alternatives, with later rule definitions adding to the set of
262     alternatives.  This is particularly useful for otherwise-
263     independent specifications which derive from the same parent rule
264     set, such as often occurs with parameter lists.  ABNF permits
265     this incremental definition through the construct:

267          oldrule     =/ additional-alternatives

269     So that the rule set

271          ruleset     =  alt1 / alt2

273          ruleset     =/ alt3

275          ruleset     =/ alt4 / alt5

277     is the same as specifying

279          ruleset     =  alt1 / alt2 / alt3 / alt4 / alt5

281  3.4  Value Range Alternatives       %c##-##

283     A range of alternative numeric values can be specified compactly,
284     using dash ("-") to indicate the range of alternative values.
285     Hence:

287          DIGIT       =  %x30-39

289     is equivalent to:

291          DIGIT       =  "0" / "1" / "2" / "3" / "4" / "5" / "6" /

293                         "7" / "8" / "9"

295     Concatenated numeric values and numeric value ranges cannot be
296     specified in the same string.  A numeric value may use the dotted
297     notation for concatenation or it may use the dash notation to
298     specify one value range.  Hence, to specify one printable
299     character, between end of line sequences, the specification could
300     be:

302          onechar-line = %x0D.0A %x20-7E %x0D.0A

304  3.5  Sequence Group                 (Rule1 Rule2)

306     Elements enclosed in parentheses are treated as a single element,
307     whose contents are STRICTLY ORDERED.  Thus,

309          elem (foo / bar) blat

311     which matches (elem foo blat) or (elem bar blat).

313          elem foo / bar blat

315     matches (elem foo) or (bar blat).

317          NOTE:  It is strongly advised to use grouping
318                 notation, rather than to rely on proper
319                 reading of "bare" alternations, when
320                 alternatives consist of multiple rule
321                 names or literals.

323     Hence it is recommended that instead of the above form, the form:

325          (elem foo) / (bar blat)

327     be used.  It will avoid misinterpretation by casual readers.

329     The sequence group notation is also used within free text to set
330     off an element sequence from the prose.

332  3.6  Variable Repetition            *Rule

334     The operator "*" preceding an element indicates repetition.  The
335     full form is:

337          <a>*<b>element

339     where <a> and <b> are optional decimal values, indicating at
340     least <a> and at most <b> occurrences of element.

342     Default values are 0 and infinity so that *<element> allows any
343     number, including zero; 1*<element> requires at least one;
344     3*3<element> allows exactly 3 and 1*2<element> allows one or two.

346  3.7  Specific Repetition            nRule

348     A rule of the form:

350          <n>element

352     is equivalent to

354          <n>*<n>element

356     That is, exactly <n> occurrences of <element>.  Thus 2DIGIT is
357     a 2-digit number, and 3ALPHA is a string of three alphabetic
358     characters.

360  3.8  Optional Sequence              [RULE]

362     Square brackets enclose an optional element sequence:

364          [foo bar]

366     is equivalent to

368          *1(foo bar).

370  3.9  ; Comment

372     A semi-colon starts a comment that continues to the end of line.
373     This is a simple way of including useful notes in parallel with
374     the specifications.

376  3.10  Operator Precedence

378     The various mechanisms described above have the following
379     precedence, from highest (binding tightest) at the top, to lowest
380     and loosest at the bottom:

382          Strings, Names formation
383          Comment
384          Value range
385          Repetition
386          Grouping, Optional
387          Concatenation
388          Alternative

390     Use of the alternative operator, freely mixed with concatenations
391     can be confusing.
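
The effect of the precedence order above, and of the repetition forms in Sections 3.6 to 3.8, can be checked with a small Python experiment. This is a sketch under stated assumptions rather than an ABNF implementation: each operator is mapped onto a regular-expression fragment, the helper names (seq, alt, repeat, opt, matches) are hypothetical, and the single letters E, F, B and T stand in for the rules elem, foo, bar and blat.

```python
import re

# Hypothetical helpers for illustration; none of them is defined by ABNF.

def seq(*parts):
    """Concatenation of elements, in order."""
    return "".join(f"(?:{part})" for part in parts)

def alt(*parts):
    """Alternatives: exactly one of the elements."""
    return "(?:" + "|".join(f"(?:{part})" for part in parts) + ")"

def repeat(element, at_least=0, at_most=None):
    """<a>*<b>element, with the defaults 0 and infinity."""
    upper = "" if at_most is None else str(at_most)
    return f"(?:{element}){{{at_least},{upper}}}"

def opt(element):
    """[element], which is equivalent to *1(element)."""
    return repeat(element, 0, 1)

def matches(fragment, text):
    """True when the whole of text is matched by the fragment."""
    return re.fullmatch(fragment, text) is not None

# Stand-in single-character terminals (an assumption of this sketch).
elem, foo, bar, blat = "E", "F", "B", "T"

# Concatenation binds tighter than alternative, so the bare form
# "elem foo / bar blat" reads as "(elem foo) / (bar blat)".
bare = alt(seq(elem, foo), seq(bar, blat))
# Explicit grouping changes the meaning: "elem (foo / bar) blat".
grouped = seq(elem, alt(foo, bar), blat)

digit = "[0-9]"
print(matches(bare, "EF"), matches(bare, "EFT"))
print(matches(grouped, "EFT"), matches(grouped, "EF"))
print(matches(repeat(digit, 1, 2), "42"), matches(repeat(digit, 1, 2), "421"))
print(matches(opt(seq(foo, bar)), ""))
```

Writing the bare form out as two explicit sequences makes the reading visible, which is the same reason the draft recommends explicit grouping.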

393          Again, it is recommended that the grouping
394          operator be used to make explicit concatenation
395          groups.

397  4.  ABNF DEFINITION OF ABNF

399     This syntax uses the rules provided in Appendix A (Core).

401          rulelist       =  1*( rule / (*c-wsp c-nl) )

403          rule           =  rulename defined-as elements c-nl
404                                 ; continues if next line starts
405                                 ;  with white space

407          rulename       =  ALPHA *(ALPHA / DIGIT / "-")

409          defined-as     =  *c-wsp ("=" / "=/") *c-wsp
410                                 ; basic rules definition and
411                                 ;  incremental alternatives

413          elements       =  alternation *c-wsp

415          c-wsp          =  WSP / (c-nl WSP)

417          c-nl           =  comment / CRLF
418                                 ; comment or newline

420          comment        =  ";" *(WSP / VCHAR) CRLF

422          alternation    =  concatenation
423                            *(*c-wsp "/" *c-wsp concatenation)

425          concatenation  =  repetition *(1*c-wsp repetition)

427          repetition     =  [repeat] element

429          repeat         =  1*DIGIT / (*DIGIT "*" *DIGIT)

431          element        =  rulename / group / option /
432                            char-val / num-val / prose-val

434          group          =  "(" *c-wsp alternation *c-wsp ")"

436          option         =  "[" *c-wsp alternation *c-wsp "]"

438          char-val       =  DQUOTE *(%x20-21 / %x23-7E) DQUOTE
439                                 ; quoted string of SP and VCHAR
440                                 ;  without DQUOTE

442          num-val        =  "%" (bin-val / dec-val / hex-val)

444          bin-val        =  "b" 1*BIT [ "-" 1*BIT ]
445                                 ; series of concatenated bit values
446                                 ; or single ONEOF range

448          dec-val        =  "d" 1*DIGIT [ "-" 1*DIGIT ]

450          hex-val        =  "x" 1*HEXDIG [ "-" 1*HEXDIG ]

452          prose-val      =  "<" *(%x20-3D / %x3F-7E) ">"
453                                 ; bracketed string of SP and VCHAR
454                                 ;  without angles
455                                 ; prose description, to be used as
456                                 ;  last resort

458  5.  SECURITY CONSIDERATIONS

460     Security is truly believed to be irrelevant to this document.

462  6.  APPENDIX A - CORE

464     This Appendix is provided as a convenient core for specific
465     grammars.  The definitions may be used as a core set of rules.

467  6.1  Core Rules

469     Certain basic rules are in uppercase, such as SP, HTAB, CRLF,
470     DIGIT, ALPHA, etc.

472          ALPHA          =  %x41-5A / %x61-7A   ; A-Z / a-z

474          BIT            =  "0" / "1"

476          CHAR           =  %x01-7F
477                                 ; any 7-bit US-ASCII character,
478                                 ;  excluding NUL

480          CR             =  %x0D
481                                 ; carriage return

483          CRLF           =  CR LF
484                                 ; Internet standard newline

486          CTL            =  %x00-1F / %x7F
487                                 ; controls

489          DIGIT          =  %x30-39
490                                 ; 0-9

492          DQUOTE         =  %x22
493                                 ; " (Double Quote)

495          HEXDIG         =  DIGIT / "A" / "B" / "C" / "D" / "E" / "F"

497          HTAB           =  %x09
498                                 ; horizontal tab

500          LF             =  %x0A
501                                 ; linefeed

503          LWSP           =  *(WSP / CRLF WSP)
504                                 ; linear white space (past newline)

506          OCTET          =  %x00-FF
507                                 ; 8 bits of data

509          SP             =  %x20
510                                 ; space

512          VCHAR          =  %x21-7E
513                                 ; visible (printing) characters

515          WSP            =  SP / HTAB
516                                 ; white space

518  6.2  Common Encoding

520     Externally, data are represented as "network virtual ASCII",
521     namely 7-bit US-ASCII in an 8-bit field, with the high (8th) bit
522     set to zero.  A string of values is in "network byte order" with
523     the higher-valued bytes represented on the left-hand side and
524     being sent over the network first.

526  7.  APPENDIX B - ENHANCEMENTS

528     This section provides some additional features for ABNF that are
529     not part of the official specification.  The features contained
530     here are expected to have benefit, eventually, but did not
531     gain immediate use.  Writers of specifications wishing to use
532     these features may cite the relevant sub-sections of this
533     appendix.

535  7.1  Concatenated Terminal Values

537     A concatenated string of such values is specified compactly,
538     using a period (".") to indicate separation of characters within
539     that value.  Hence:

541          CRLF =  %d13.10

543     Therefore,

545          rulename =  %d97.98.99

547     will match only the string that consists solely of lowercase
548     characters, abc.

550     If a grammar uses concatenated terminal values, then the "ABNF
551     Definition of ABNF" will modify two rules to be:

553          dec-val        =  "d" 1*DIGIT
554                            [ 1*("." 1*DIGIT) / ("-" 1*DIGIT) ]

556          hex-val        =  "x" 1*HEXDIG
557                            [ 1*("."
                                 1*HEXDIG) / ("-" 1*HEXDIG) ]

559  7.2  Binary Literal Values

561     Terminals are specified by one or more numeric characters with
562     the base interpretation of those characters indicated explicitly.
563     The following, additional base is defined:

565          b = binary

567     If binary literal values are used with the concatenation
568     mechanism defined above, then the "ABNF Definition of ABNF"
569     modifies the relevant rule to be:

571          bin-val        =  "b" 1*BIT
572                            [ 1*("." 1*BIT) / ("-" 1*BIT) ]
573                                 ; series of concatenated bit values
574                                 ; or single ONEOF range

576  8.  ACKNOWLEDGMENTS

578     The syntax for ABNF was originally specified in RFC #733.  Ken L.
579     Harrenstien, of SRI International, was responsible for re-coding
580     the BNF into an augmented BNF that makes the representation
581     smaller and easier to understand.

583     This recent project began as a simple effort to cull out the
584     portion of RFC 822, which has been repeatedly cited by non-email
585     specification writers, namely the description of augmented BNF.
586     Rather than simply and blindly converting the existing text into
587     a separate document, the working group chose to give careful
588     consideration to the deficiencies, as well as benefits, of the
589     existing specification and related specifications available over
590     the last 15 years and therefore to pursue enhancement.  This
591     turned the project into something rather more ambitious than
592     first intended.  Interestingly the result is not massively
593     different from that original, although decisions such as removing
594     the list notation came as a surprise.

596     The current round of specification was part of the DRUMS working
597     group, with significant contributions from Jerome Abela, Harald
598     Alvestrand, Robert Elz, Roger Fajman, Aviva Garrett, Tom Harsch,
599     Dan Kohn, Bill McQuillan, Keith Moore, Chris Newman, Pete
600     Resnick and Henning Schulzrinne.

602  9.
REFERENCES

604     [US-ASCII]  Coded Character Set--7-Bit American Standard Code
605     for Information Interchange, ANSI X3.4-1986.

607     [RFC733]  Crocker, D.H., Vittal, J.J., Pogran, K.T.,
608     Henderson, D.A., "Standard for the Format of ARPA Network
609     Text Messages", RFC 733, November 1977.

611     [RFC822]  Crocker, D., "Standard for the Format of ARPA Internet
612     Text Messages", RFC 822, August 1982.

614  10.  CONTACT

616     David H. Crocker                 Paul Overell

618     Brandenburg Consulting           Demon Internet Ltd
619     675 Spruce Dr.                   Dorking Business Park
620     Sunnyvale, CA 94086 USA          Dorking
621                                      Surrey, RH4 1HN
622                                      UK

624     Phone: +1 408 246 8253
625     Fax:   +1 408 249 6205