Network Working Group                                D. Crocker (editor)
Internet-Draft: DRAFT-DRUMS-ABNF-06.{txt,ps}    Internet Mail Consortium
Expiration <1/98>                                           Paul Overell
                                                      Demon Internet Ltd

             Augmented BNF for Syntax Specifications: ABNF

STATUS OF THIS MEMO

   This document is an Internet-Draft. Internet-Drafts are working
   documents of the Internet Engineering Task Force (IETF), its
   areas, and its working groups. Note that other groups may also
   distribute working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other
   documents at any time. It is inappropriate to use Internet-Drafts
   as reference material or to cite them other than as ``work in
   progress.''

   To learn the current status of any Internet-Draft, please check
   the ``1id-abstracts.txt'' listing contained in the Internet-Drafts
   Shadow Directories on ftp.is.co.za (Africa), nic.nordu.net
   (Europe), munnari.oz.au (Pacific Rim), ds.internic.net (US East
   Coast), or ftp.isi.edu (US West Coast).

TABLE OF CONTENTS

   1. INTRODUCTION

   2. RULE DEFINITION
   2.1 Rule Naming
   2.2 Rule Form
   2.3 End-of-Rule
   2.4 Terminal Values
   2.5 External Encodings

   3. OPERATORS
   3.1 Concatenation                    Rule1 Rule2
   3.2 Alternatives                     Rule1 / Rule2
   3.3 Incremental Alternatives         Rule1 =/ Rule2
   3.4 Value Range Alternatives         %c##-##
   3.5 Sequence Group                   (Rule1 Rule2)
   3.6 Variable Repetition              *Rule
   3.7 Specific Repetition              nRule
   3.8 Optional Sequence                [RULE]
   3.9 ; Comment
   3.10 Operator Precedence

   4. ABNF DEFINITION OF ABNF

   5. APPENDIX A - CORE

   6. ACKNOWLEDGEMENTS

   7. REFERENCES

   8. CONTACT

1. INTRODUCTION

   Internet technical specifications often need to define a format
   syntax and are free to employ whatever notation their authors
   deem useful. Over the years, a modified version of Backus-Naur
   Form (BNF), called Augmented BNF (ABNF), has been popular among
   many Internet specifications. It balances compactness and
   simplicity with reasonable representational power. In the early
   days of the Arpanet, each specification contained its own
   definition of ABNF. This included the email specifications,
   RFC733 and then RFC822, which have come to be the common citations
   for defining ABNF. The current document separates out that
   definition, to permit selective reference. Predictably, it also
   provides some enhancements.

   The differences between standard BNF and ABNF involve naming
   rules, repetition, alternatives, order-independence, lists, and
   value ranges. Appendix A (Core) supplies rule definitions for a
   core lexical analyzer of the type common to several Internet
   specifications. It is provided as a convenience and is otherwise
   separate from the meta language defined in the body of this
   document, and separate from its formal status.

2. RULE DEFINITION

2.1 Rule Naming

   The name of a rule is simply the name itself; that is, a
   sequence of characters, beginning with an alphabetic character,
   and followed by a combination of alphabetics, digits and hyphens
   (dashes).

        RULE NAMES ARE CASE-INSENSITIVE.

   The names <rulename>, <Rulename>, <RULENAME> and <rULeNamE> all
   refer to the same rule.

   Unlike original BNF, angle brackets ("<", ">") are not required.
   However, angle brackets may be used around a rule reference
   whenever their presence will facilitate discerning the use of a
   rule name. This is typically restricted to rule name references
   in free-form prose, or to distinguish partial rules that combine
   into a string not separated by white space, such as shown in the
   discussion about repetition, below.

2.2 Rule Form

   A rule is defined by the following sequence:

        name = elements

   where <name> is the name of the rule and <elements> is one or
   more rule names or terminal specifications. The equal sign
   separates the name from the definition of the rule. The elements
   form a sequence of one or more rule names and/or value
   definitions, combined according to the various operators defined
   in this document, such as alternative and repetition.

2.3 End-of-Rule

   Formally the grammar requires a one-token look-ahead to find the
   "=" token, which indicates that the previous token is the name of
   a new rule. For visual ease, rule definitions are left aligned.
   When a rule requires multiple lines, the continuation lines are
   indented.

2.4 Terminal Values

   Rules resolve into a string of terminal values, sometimes called
   characters. Values within ABNF are represented as decimal
   numbers. Hence, an ABNF parser processes a sequence of
   characters, each represented as a decimal number. A string of
   values is in "network byte order", with the higher-valued bytes
   represented on the left-hand side and being sent over the network
   first.
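
   As a minimal sketch of the model just described (it is not part
   of ABNF itself), the following Python fragment treats a parsed
   string as a list of non-negative integers and emits them left to
   right, so that the left-most value becomes the first octet sent.
   The function name values_to_octets and the restriction to values
   below 256 are assumptions made purely for illustration.

```python
def values_to_octets(values):
    """Encode a list of terminal values as octets, left-most value first.

    Assumption for this sketch: every value fits in one octet (0-255).
    """
    for v in values:
        if not 0 <= v <= 255:
            raise ValueError(f"value {v} does not fit in one octet")
    return bytes(values)


# Carriage return (13) followed by line feed (10).
crlf = values_to_octets([13, 10])
```

   An environment with wider characters would need a different
   external encoding; that choice lies outside the grammar.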

   Terminals are specified by one or more numeric characters with
   the base interpretation of those characters indicated explicitly.
   The following bases are currently defined:

        b = binary

        d = decimal

        x = hexadecimal

   Hence:

        CR = %d13

        CR = %x0D

   respectively specify the decimal and hexadecimal representation
   of [US-ASCII] for carriage return.

   A concatenated string of such values is specified compactly,
   using a period (".") to indicate separation of characters within
   that value. Hence:

        CRLF = %d13.10

   ABNF permits specifying a literal text string directly, enclosed
   in quotation marks. Hence:

        command = "command string"

   Literal text strings are interpreted as a concatenated set of
   printable characters.

        ABNF STRINGS ARE CASE-INSENSITIVE AND THE
        CHARACTER SET FOR THESE STRINGS IS US-ASCII.

   Hence:

        rulename = "abc"

   will match "abc", "Abc", "aBc", "abC", "ABc", "aBC", "AbC" and
   "ABC".

        TO SPECIFY A RULE WHICH IS CASE SENSITIVE,
        SPECIFY THE CHARACTERS INDIVIDUALLY.

   For example:

        rulename = %d97 %d98 %d99

   or

        rulename = %d97.98.99

   will match only the string which comprises only lowercased
   characters, abc.

2.5 External Encodings

   External representations of these characters will vary according
   to constraints in the storage or transmission environment.
   Hence, the same ABNF-based grammar may have multiple external
   encodings, such as one for a 7-bit US-ASCII environment, another
   for a binary octet environment and still a different one when
   16-bit Unicode is used. Encoding details are beyond the scope of
   ABNF, although Appendix A (Core) provides definitions for a 7-bit
   US-ASCII environment as has been common to much of the Internet.

   By separating external encoding from the syntax, it is intended
   that alternate encoding environments can be used for the same
   syntax.

3. OPERATORS

3.1 Concatenation                                        Rule1 Rule2

   A rule can define a simple, ordered string of values -- i.e., a
   concatenation of contiguous characters -- by listing a sequence
   of rule names. For example:

        foo = %x61 ; a

        bar = %x62 ; b

        mumble = foo bar foo

   So that the rule <mumble> matches the lowercase string "aba".

   LINEAR WHITE SPACE: Concatenation is at the core of the ABNF
   parsing model. A string of contiguous characters (values) is
   parsed according to the rules defined in ABNF. For Internet
   specifications, there is some history of permitting linear white
   space (space and horizontal tab) to be freely--and
   implicitly--interspersed around major constructs, such as
   delimiting special characters or atomic strings.

        THIS SPECIFICATION FOR ABNF DOES NOT PROVIDE
        FOR IMPLICIT SPECIFICATION OF LINEAR WHITE
        SPACE.

   Any grammar which wishes to permit linear white space around
   delimiters or string segments must specify it explicitly. It is
   often useful to provide for such white space in "core" rules that
   are then used variously among higher-level rules. The "core"
   rules might be formed into a lexical analyzer or simply be part
   of the main ruleset.

3.2 Alternatives                                       Rule1 / Rule2

   Elements separated by forward slash ("/") are alternatives.
   Therefore,

        foo / bar

   will accept <foo> or <bar>.

        REMINDER: A quoted string containing alphabetic
        characters is a non-terminal representing the set
        of combinatorial strings with upper and lower case
        characters.

3.3 Incremental Alternatives                          Rule1 =/ Rule2

   It is sometimes convenient to specify a list of alternatives in
   fragments. That is, an initial rule may match one or more
   alternatives, with later rule definitions adding to the set of
   alternatives. This is particularly useful for
   otherwise-independent specifications which derive from the same
   parent rule set, such as often occurs with parameter lists.
   ABNF permits this incremental definition through the construct:

        oldrule =/ additional-alternatives

   So that the rule set

        ruleset = alt1 / alt2

        ruleset =/ alt3

        ruleset =/ alt4 / alt5

   is the same as specifying

        ruleset = alt1 / alt2 / alt3 / alt4 / alt5

3.4 Value Range Alternatives                                 %c##-##

   A range of alternative numeric values can be specified compactly,
   using dash ("-") to indicate the range of alternative values.
   Hence:

        DIGIT = %x30-39

   is equivalent to:

        DIGIT = "0" / "1" / "2" / "3" / "4" / "5" / "6" / "7" / "8"
                / "9"

3.5 Sequence Group                                     (Rule1 Rule2)

   Elements enclosed in parentheses are treated as a single element,
   whose contents are STRICTLY ORDERED. Thus,

        elem (foo / bar) blat

   matches (elem foo blat) or (elem bar blat), whereas

        elem foo / bar blat

   matches (elem foo) or (bar blat).

        IT IS STRONGLY ADVISED TO USE GROUPING
        NOTATION, RATHER THAN TO RELY ON PROPER
        READING OF "BARE" ALTERNATIONS, WHEN
        ALTERNATIVES CONSIST OF MULTIPLE RULE NAMES
        OR LITERALS.

   Hence it is strongly recommended that instead of the above form,
   the form:

        (elem foo) / (bar blat)

   be used. It will avoid misinterpretation by casual readers.

   The local grouping notation is also used within free text to set
   off an element sequence from the prose.

3.6 Variable Repetition                                        *Rule

   The operator "*" preceding an element indicates repetition. The
   full form is:

        <a>*<b>element

   where <a> and <b> are optional decimal values, indicating at
   least <a> and at most <b> occurrences of element.

   Default values are 0 and infinity so that <*element> allows any
   number, including zero; <1*element> requires at least one;
   <3*3element> allows exactly 3 and <1*2element> allows one or two.

3.7 Specific Repetition                                        nRule

   A rule of the form:

        <n>element

   is equivalent to

        <n>*<n>element

   That is, exactly <n> occurrences of <element>.
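
   The repetition prefixes of Sections 3.6 and 3.7 can be read as a
   pair of bounds. The Python sketch below is one hedged way to do
   that reading; the helper name repeat_bounds is hypothetical, and
   using None to stand for "infinity" is an assumption of this
   example rather than anything ABNF defines.

```python
import re

# Either a bare count ("2") or an optional minimum, "*", optional maximum.
_REPEAT = re.compile(r"([0-9]+)|([0-9]*)\*([0-9]*)")


def repeat_bounds(prefix):
    """Return (minimum, maximum) for a repetition prefix.

    A maximum of None stands for "no upper bound" (an assumption of
    this sketch).
    """
    m = _REPEAT.fullmatch(prefix)
    if m is None:
        raise ValueError(f"not a repetition prefix: {prefix!r}")
    exact, low, high = m.groups()
    if exact is not None:
        # Specific repetition: a bare count n means exactly n occurrences.
        return int(exact), int(exact)
    # Variable repetition: missing bounds default to 0 and infinity.
    return (int(low) if low else 0, int(high) if high else None)
```

   With this reading, repeat_bounds("*1") gives the same bounds as
   an optional element, and repeat_bounds("3") gives the same bounds
   as repeat_bounds("3*3").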

   Thus 2DIGIT is a 2-digit number, and 3ALPHA is a string of three
   alphabetic characters.

3.8 Optional Sequence                                         [RULE]

   Square brackets enclose an optional element sequence:

        [foo bar]

   is equivalent to

        *1(foo bar).

3.9 ; Comment

   A semi-colon starts a comment that continues to the end of line.
   This is a simple way of including useful notes in parallel with
   the specifications.

3.10 Operator Precedence

   The various mechanisms described above have the following
   precedence, from highest (binding tightest) at the top-left, to
   lowest and loosest at the bottom-right:

        Strings, Names formation      Comment
        Value range                   Repetition, List
        Grouping, Optional            Concatenation
        Alternative

   Use of the alternative operator, freely mixed with
   concatenations, can be confusing.

        IT IS STRONGLY RECOMMENDED THAT THE GROUPING
        OPERATOR BE USED TO MAKE EXPLICIT
        CONCATENATION GROUPS.

4. ABNF DEFINITION OF ABNF

   This syntax uses the rules provided in Appendix A (Core).

        rulelist      = 1*( rule / (*c-wsp c-nl) )

        rule          = rulename defined-as elements c-nl
                             ; continues if next line starts
                             ; with white space

        rulename      = ALPHA *(ALPHA / DIGIT / "-")

        defined-as    = *c-wsp ("=" / "=/") *c-wsp
                             ; basic rules definition and
                             ; incremental alternatives

        elements      = alternation *c-wsp

        c-wsp         = WSP / (c-nl WSP)

        c-nl          = comment / CRLF
                             ; comment or newline

        comment       = ";" *(WSP / PCHAR) CRLF

        alternation   = concatenation
                        *(*c-wsp "/" *c-wsp concatenation)

        concatenation = repetition *(1*c-wsp repetition)

        repetition    = [repeat] element

        repeat        = 1*DIGIT / (*DIGIT "*" *DIGIT)

        element       = rulename / group / option /
                        char-val / num-val / prose-val

        group         = "(" *c-wsp alternation *c-wsp ")"

        option        = "[" *c-wsp alternation *c-wsp "]"

        char-val      = DQUOTE *PCHAR-NDQ DQUOTE

        num-val       = "%" (bin-val / dec-val / hex-val)

        bin-val       = "b" 1*BIT
                        *( ("." 1*BIT) / ("-" 1*BIT) )
                             ; series of concatenated bit values
                             ; and/or series of ONEOF ranges

        dec-val       = "d" 1*DIGIT
                        *( ("." 1*DIGIT) / ("-" 1*DIGIT) )

        hex-val       = "x" 1*HEXDIG
                        *( ("." 1*HEXDIG) / ("-" 1*HEXDIG) )

        prose-val     = "<" *PCHAR-NRB ">"

5. APPENDIX A - CORE

   This Appendix is provided as a convenient core for specific
   grammars. The definitions may be used as a core set of rules.

   Certain basic rules are in uppercase, such as SP, HT, CRLF,
   DIGIT, ALPHA, etc.

        ALPHA         = %x41-5A / %x61-7A   ; A-Z / a-z

        BIT           = "0" / "1"

        CHAR          = %x00-7F
                             ; any US-ASCII character

        CR            = %x0D
                             ; carriage return

        CRLF          = CR LF
                             ; Internet standard newline

        CTL           = %x00-1F / %x7F
                             ; controls

        DIGIT         = %x30-39
                             ; 0-9

        DQUOTE        = %x22
                             ; " (Double Quote)

        HEXDIG        = DIGIT / "A" / "B" / "C" / "D" / "E" / "F"

        HT            = %x09
                             ; horizontal tab

        LF            = %x0A
                             ; linefeed

        LWSP          = *(WSP / CRLF WSP)
                             ; linear white space (past newline)

        PCHAR         = %x20-7E
                             ; printable characters

        PCHAR-NRB     = %x20-3D / %x3F-7E
                             ; PCHAR less > (No Right Bracket)

        PCHAR-NDQ     = %x20-21 / %x23-7E
                             ; PCHAR less " (No Double Quote)

        SP            = %x20
                             ; space

        WSP           = SP / HT
                             ; white space

   Externally, data are represented as "network virtual ASCII",
   namely 7-bit US-ASCII in an 8-bit field, with the high (8th)
   bit set to zero.

6. ACKNOWLEDGEMENTS

   The syntax for ABNF was originally specified in RFC #733. Ken L.
   Harrenstien, of SRI International, was responsible for re-coding
   the BNF into an augmented BNF that makes the representation
   smaller and easier to understand.

   The current round of specification was part of the DRUMS working
   group, with significant contributions from Roger Fajman, Bill
   McQuillan, Keith Moore, Pete Resnick, Jerome Abela and Chris
   Newman.

7. REFERENCES

   [US-ASCII]  Coded Character Set--7-Bit American Standard Code
               for Information Interchange, ANSI X3.4-1986.

8. CONTACT

   David H. Crocker                  Paul Overell

   Internet Mail Consortium          Demon Internet Ltd
   675 Spruce Dr.                    Dorking Business Park
   Sunnyvale, CA 94086 USA           Dorking
                                     Surrey, RH4 1HN
                                     UK

   Phone: +1 408 246 8253
   Fax:   +1 408 249 6205
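
   As a closing illustration, the numeric terminal notation of
   Section 2.4 and the dash ranges of Section 3.4 can be expanded
   mechanically. The Python sketch below is only one possible
   reading, not a reference implementation: the function name
   expand_num_val and the ("seq", ...) / ("range", ...) result
   shapes are hypothetical, and Python's int() accepts some input
   (such as a leading sign) that the grammar in Section 4 would
   reject.

```python
BASES = {"b": 2, "d": 10, "x": 16}


def expand_num_val(text):
    """Expand a numeric terminal such as "%d13.10" or "%x30-39".

    Returns ("seq", [values]) for a single value or a "."-separated
    concatenation, and ("range", (low, high)) for a "-" range.
    """
    if len(text) < 3 or text[0] != "%" or text[1].lower() not in BASES:
        raise ValueError(f"not a numeric terminal: {text!r}")
    base = BASES[text[1].lower()]
    body = text[2:]
    if "-" in body:
        # A range has exactly two bounds; anything else fails to unpack.
        low, high = (int(part, base) for part in body.split("-"))
        return "range", (low, high)
    return "seq", [int(part, base) for part in body.split(".")]
```

   For example, expand_num_val("%d13.10") yields the two values of
   CRLF in order, and expand_num_val("%x30-39") yields the bounds of
   the DIGIT range.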