idnits 2.17.1 draft-ietf-drums-abnf-02.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** Cannot find the required boilerplate sections (Copyright, IPR, etc.) in this document. Expected boilerplate is as follows today (2024-04-19) according to https://trustee.ietf.org/license-info : IETF Trust Legal Provisions of 28-dec-2009, Section 6.a: This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79. IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 2: Copyright (c) 2024 IETF Trust and the persons identified as the document authors. All rights reserved. IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 3: This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- ** Missing expiration date. The document expiration date should appear on the first and last page. ** The document seems to lack a 1id_guidelines paragraph about Internet-Drafts being working documents. ** The document seems to lack a 1id_guidelines paragraph about 6 months document validity -- however, there's a paragraph with a matching beginning. Boilerplate error? ** The document seems to lack a 1id_guidelines paragraph about the list of current Internet-Drafts. 
** The document seems to lack a 1id_guidelines paragraph about the list of Shadow Directories. == No 'Intended status' indicated for this document; assuming Proposed Standard == The page length should not exceed 58 lines per page, but there was 1 longer page, the longest (page 1) being 611 lines Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack an Abstract section. ** The document seems to lack a Security Considerations section. ** The document seems to lack an IANA Considerations section. (See Section 2.2 of https://www.ietf.org/id-info/checklist for how to handle the case when there are no actions for IANA.) ** The document seems to lack an Authors' Addresses Section. ** The document seems to lack separate sections for Informative/Normative References. All references will be assumed normative when checking for downward references. ** There are 18 instances of lines with control characters in the document. Miscellaneous warnings: ---------------------------------------------------------------------------- == Line 94 has weird spacing: '...name of a rul...' == Line 95 has weird spacing: '...ng with a dig...' == Line 105 has weird spacing: '... use of a rul...' == Line 290 has weird spacing: '...ses are treat...' == Line 329 has weird spacing: '...rrences of e...' == (10 more instances...) -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- Couldn't find a document date in the document -- date freshness check skipped. 
Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Missing Reference: 'RULE' is mentioned on line 350, but not defined -- Possible downref: Non-RFC (?) normative reference: ref. 'US-ASCII' Summary: 12 errors (**), 0 flaws (~~), 9 warnings (==), 3 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 1 Network Working Group D. Crocker (editor) 2 Internet-Draft Internet Mail Consortium 3 Paul Overell 4 Expiration <7/97> Demon Internet Ltd 6 Augmented BNF for Syntax Specifications: ABNF 7 9 STATUS OF THIS MEMO 11 This document is an Internet-Draft. Internet-Drafts are 12 working documents of the Internet Engineering Task Force 13 (IETF), its areas, and its working groups. Note that other 14 groups may also distribute working documents as Internet- 15 Drafts. 17 Internet-Drafts are draft documents valid for a maximum of 18 six months and may be updated, replaced, or obsoleted by 19 other documents at any time. It is inappropriate to use 20 Internet- Drafts as reference material or to cite them other 21 than as ``work in progress.'' 23 To learn the current status of any Internet-Draft, please 24 check the ``1id-abstracts.txt'' listing contained in the 25 Internet- Drafts Shadow Directories on ftp.is.co.za (Africa), 26 nic.nordu.net (Europe), munnari.oz.au (Pacific Rim), 27 ds.internic.net (US East Coast), or ftp.isi.edu (US West 28 Coast). 30 TABLE OF CONTENTS 32 1. INTRODUCTION 34 2. RULE DEFINITION 35 2.1 Rule Naming 36 2.2 Rule Form 37 2.3 End-of-Rule 38 2.4 Terminal Values 39 2.5 External Encodings 41 3. 
OPERATORS 42 3.1 Concatenation: Rule1 Rule2 43 3.2 Alternatives: Rule1 / Rule2 44 3.3 Incremental Alternatives: Rule1 =/ Rule2 45 3.4 Sequence Group: (Rule1 Rule2) 46 3.5 Set Group: {Rule 1 Rule2} 47 3.6 Variable Repetition: *Rule 48 3.7 Specific Repetition: nRule 49 3.8 Optional Sequence: [RULE] 50 3.9 Lists: #Rule 51 3.10 Value Ranges: a..b 52 3.11 ; Comment 53 3.12 Operator Precedence 55 4. ABNF DEFINITION OF ABNF... 57 5. APPENDIX A - CORE 59 6. ACKNOWLEDGEMENTS 61 7. REFERENCES 63 8. CONTACT 65 1. INTRODUCTION 67 Internet technical specifications often need to define a 68 format syntax and are free to employ whatever notation their 69 authors deem useful. Over the years, a modified version of 70 Backus-Naur Form (BNF), called Augmented BNF (ABNF), has been 71 popular among many Internet specifications. It balances 72 compactness and simplicity, with reasonable representational 73 power. In the early days of the Arpanet, each specification 74 contained its own definition of ABNF. This included the 75 email specifications, RFC733 and then RFC822 which have come 76 to be the common citations for defining ABNF. The current 77 document separates out that definition, to permit selective 78 reference. Predictably, it also provides some enhancements. 80 The differences between standard BNF and the ABNF defined 81 here involve naming rules, repetition, alternatives, and 82 order-independence, and rules that add alternatives to 83 existing rules, lists, and value ranges. Appendix A (Core) 84 supplies rule definitions for a core lexical analyzer, of the 85 type common to several Internet specifications. It is 86 provided as a convenience and is otherwise separate from the 87 meta language defined in the body of this document, and its 88 formal status. 90 2. RULE DEFINITION 92 2.1 Rule Naming 94 The name of a rule is simply the name itself; that is, a 95 sequence of characters, not beginning with a digit, with an 96 asterisk ("*"), or with a number (pound) sign ("#"). 
(This 97 avoids ambiguity with the various repetition mechanisms, 98 defined below.) Rule names are case-INsensitive. The names 99 , , and all refer 100 to the same rule. 102 Unlike original BNF, angle brackets ("<", ">") are not 103 required. However, angle brackets may be used around a rule 104 reference whenever their presence will facilitate discerning 105 the use of a rule name. This is typically restricted to 106 rule name references in free-form prose, or to distinguish 107 partial rules that combine into a string not separated by 108 linear white space, such as shown in the discussion about 109 repetition, below. 111 2.2 Rule Form 113 A rule is defined by the following sequence: 115 name = elements 117 where <name> is the name of the rule and <elements> is one or 118 more rules or terminal specifications. The equal sign 119 separates the name from the definition of the rule. The 120 elements are a sequence of one or more rule names and/or 121 value definitions, combined according to the various 122 operators, defined in this document, such as alternative and 123 repetition. 125 2.3 End-of-Rule 127 Formally the grammar requires a one-token look-ahead to find 128 the "=" token, which indicates that the previous token is the 129 name of a new rule. For visual ease, rules should start in 130 column 1, with rule continuation indicated by blank (linear 131 white space) in column 1. In some documentation, "column 1" 132 might be virtual, with a consistent indentation from the left 133 margin, for all rules. 135 2.4 Terminal Values 137 Rules resolve into a string of terminal values, sometimes 138 called characters. Values within ABNF are represented as 139 decimal numbers. Hence, an ABNF parser processes a sequence 140 of characters. Each character is represented as a decimal 141 number. A string of values is in "network byte order" with 142 the higher-valued bytes represented on the left-hand side and 143 being sent over the network first.
145 Terminals are specified by one or more numerical characters 146 with the base interpretation of those characters indicated 147 explicitly. The following bases are currently defined: 149 b = binary 151 d = decimal 153 x = hexadecimal 155 Hence: 157 CR = %d13 159 CR = %x0D 161 respectively specify the decimal and hexadecimal 162 representation of [US-ASCII] for carriage return. 164 A string of such values is specified compactly, using a 165 period (".") to indicate separation of characters within that 166 value. Hence: 168 CRLF = %d13.10 170 A sequence of values which can be represented as simple, 171 graphical characters (letters) may be specified as a 172 string of literals, enclosed in quotation-marks. Hence: 174 rule = "rule-name = rule-value" 176 specifies the rule "rule" which contains the characters of an 177 ABNF rule specification. 179 ABNF STRINGS ARE CASE-INSENSITIVE AND THE 180 CHARACTER SET FOR THESE STRINGS IS US- 181 ASCII. 183 Hence: 185 rulename = "abc" 187 will match "abc", "Abc", "aBc", "abC", "ABc", "aBC", "AbC", and 188 "ABC". 190 To specify a rule which IS case SENSITIVE, specify the 191 characters individually. For example: 193 rulename = %d97 %d98 %d99 195 or 197 rulename = %d97.98.99 199 will match only the string which comprises only lowercase 200 characters, abc. 202 2.5 External Encodings 204 External representations of these characters will vary 205 according to constraints in the storage or transmission 206 environment. Hence, the same ABNF-based grammar may have 207 multiple external encodings, such as one for a 7-bit US-ASCII 208 environment, another for a binary octet environment and still 209 a different one when 16-bit Unicode is used. Encoding 210 details are beyond the scope of ABNF, although Appendix A 211 (Core) provides definitions for a 7-bit US-ASCII environment 212 as has been common to much of the Internet.
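As a rough illustration of the numeric terminal notation and the case rules of Section 2.4, the following Python sketch decodes a terminal such as %d13.10 into its values and compares a quoted literal without regard to case. The names BASES, decode_terminal and matches_literal are hypothetical and are not part of ABNF. The sketch assumes the base letter may appear in either case and it deliberately rejects value ranges.

```python
# Hypothetical helper names; a sketch, not a normative ABNF implementation.
BASES = {"b": 2, "d": 10, "x": 16}


def decode_terminal(spec):
    """Decode a numeric terminal such as '%d13.10' into a list of integers."""
    if len(spec) < 3 or spec[0] != "%" or spec[1].lower() not in BASES:
        raise ValueError(f"not a numeric terminal: {spec!r}")
    base = BASES[spec[1].lower()]
    # A period separates the values of a compact string of values.
    # An empty part (as in a '..' range) makes int() raise ValueError.
    return [int(part, base) for part in spec[2:].split(".")]


def matches_literal(literal, text):
    """Quoted ABNF strings are matched without regard to case."""
    return literal.lower() == text.lower()
```

Under these assumptions, decode_terminal("%d13") and decode_terminal("%x0D") both yield the single value 13, and the case-sensitive form %d97.98.99 decodes to the values of the lowercase string abc.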
214 By separating external encoding from the syntax, it is 215 intended that alternate encoding environments can be used for 216 the same syntax. 218 3. OPERATORS 220 3.1 Concatenation: Rule1 Rule2 222 A rule can define a simple, ordered string of values -- i.e., 223 a concatenation of contiguous characters -- by listing a 224 sequence of rule names. For example: 226 foo = %x61 ; a 228 bar = %x62 ; b 230 mumble = foo bar foo 232 So that the rule <mumble> defines the lower-case string 233 "aba". 235 LINEAR WHITE SPACE: Concatenation is at the core of the ABNF 236 parsing model. A string of contiguous characters (values) is 237 parsed according to the rules defined in ABNF. For Internet 238 specifications, there is some history of permitting linear 239 white space (space and horizontal tab) to be freely and 240 implicitly interspersed around major constructs, such as 241 delimiting special characters or atomic strings. 243 THIS SPECIFICATION FOR ABNF DOES NOT PROVIDE 244 FOR IMPLICIT SPECIFICATION OF LINEAR WHITE 245 SPACE. 247 Any grammar which wishes to permit linear white space around 248 delimiters or string segments must specify it explicitly. It 249 is often useful to provide for such white space in "core" 250 rules that are then used variously among higher-level rules. 251 The "core" rules might be formed into a lexical analyzer or 252 simply be part of the main ruleset. 254 3.2 Alternatives: Rule1 / Rule2 256 Elements separated by forward slash ("/") are alternatives. 257 Therefore, 259 foo / bar 261 will accept <foo> or <bar>. 263 3.3 Incremental Alternatives: Rule1 =/ Rule2 265 It is sometimes convenient to specify a list of alternatives 266 in fragments. That is, an initial rule may define one or 267 more alternatives, with later rule definitions adding to the 268 set of alternatives. This is particularly useful for 269 otherwise-independent specifications which derive from the 270 same parent rule set, such as often occurs with parameter 271 lists.
ABNF permits this incremental definition through the 272 construct: 274 oldrule =/ additional-alternatives 276 So that the rule set 278 ruleset = alt1 / alt2 280 ruleset =/ alt3 282 ruleset =/ alt4 / alt5 284 is the same as specifying 286 ruleset = alt1 / alt2 / alt3 / alt4 / alt5 288 3.4 Sequence Group: (Rule1 Rule2) 290 Elements enclosed in parentheses are treated as a single 291 element, whose contents are STRICTLY ORDERED. Thus, 293 (elem foo) / (bar blat) elem 295 allows the token sequences (elem foo elem) and (bar blat 296 elem). Without the grouping, the rule: 298 elem foo / bar elem 300 would match (elem foo elem) or (elem bar elem). The local 301 grouping notation is also used within free text to set off an 302 element sequence from the prose. 304 3.5 Set Group: {Rule 1 Rule2} 306 Elements enclosed in braces (squiggly brackets) are treated 307 as a single, UNORDERED element. Its contents may occur in 308 any order. Hence: 310 {elem foo} bar 312 would match (elem foo bar) and (foo elem bar). 314 NOTE: Specifying alternatives is quite different from 315 specifying set grouping. Alternatives indicate the matching 316 of exactly one (sub-)rule out of the total grouping. The set 317 mechanism indicates the matching of a string which contains 318 all of the elements within the group; however the elements 319 may occur in any order. 321 3.6 Variable Repetition: *Rule 323 The operator "*" preceding an element indicates repetition. 324 The full form is: 326 *element 328 where and are optional decimal values, indicating at 329 least and at most occurrences of element. 331 Default values are 0 and infinity so that <*element> allows 332 any number, including zero; <1*element> requires at least 333 one; <3*3element> allows exactly 3 and <1*2element> allows 334 one or two. 336 3.7 Specific Repetition: nRule 338 A rule of the form: 340 element 342 is equivalent to 344 *element 346 That is, exactly occurrences of . 
Thus 2DIGIT 347 is a 2-digit number, and 3ALPHA is a string of three 348 alphabetic characters. 350 3.8 Optional Sequence: [RULE] 352 Square brackets enclose an optional element sequence: 354 [foo bar] 356 is equivalent to 358 *1(foo bar). 360 3.9 Lists: #Rule 362 A construct "#" is defined as being similar to "*", for a 363 list sequence: 365 #element 367 indicates at least and at most elements, each 368 separated by one or more commas (","). This makes the usual 369 form of lists very easy; a rule such as: 371 element *("," element) 373 can therefore be shown as 375 1#element 377 Wherever this construct is used, null elements are allowed, 378 but do not contribute to the count of elements present. 379 That is, 381 element,,element 383 is permitted, but counts as only two elements. Therefore, 384 where at least one element is required, at least one non- 385 null element must be present. 387 Default values are 0 and infinity so that <#element> allows 388 any number, including zero; <1#element> requires at least 389 one; and <1#2element> allows one or two. 391 3.10 Value Ranges: a..b 393 Values separated by double periods ("..") specify a range of 394 values from the lowest, on the left-hand side, to the highest 395 on the right-hand side. Values may be specified numerically 396 or with rule references. The form: 398 %d12..%d15 400 or 402 %d12..15 404 represents a value in the range 12 to 15, inclusively. When 405 the values are specified using rules rather than explicit 406 decimal numbers, the rules must reduce to single, decimal 407 values. Hence: 409 CR = %d12 411 LF = %d15 413 smallrange = CR..LF 415 is valid and indicates the decimal value range 12 to 15. 417 3.11 ; Comment 419 A semi-colon starts a comment that continues to the end of 420 line. This is a simple way of including useful notes in 421 parallel with the specifications.
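The counting rules for repetition (Sections 3.6 and 3.7) and for lists (Section 3.9) can be sketched in Python as follows. This is only an illustration under stated assumptions: the names parse_repeat, list_elements and satisfies_list are hypothetical, None stands in for "infinity", and the individual list elements are not checked against any rule.

```python
# Hypothetical helper names; a sketch of the counting rules only.

def parse_repeat(prefix):
    """Return (at_least, at_most) for a prefix such as '1*2', '*' or '3'.

    None as the upper bound means no limit (the default of infinity).
    """
    if "*" not in prefix:
        # Specific repetition: a bare count means exactly that many.
        count = int(prefix)
        return count, count
    low, high = prefix.split("*", 1)
    return (int(low) if low else 0, int(high) if high else None)


def list_elements(text):
    """Split a comma-separated list, dropping null (empty) elements."""
    return [item for item in text.split(",") if item]


def satisfies_list(text, at_least=0, at_most=None):
    """Check the non-null element count against the list bounds."""
    count = len(list_elements(text))
    return count >= at_least and (at_most is None or count <= at_most)
```

Note that list_elements does not trim white space around the commas, in keeping with the statement in Section 3.1 that linear white space is never implicit.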
423 3.12 Operator Precedence 425 The various mechanisms described above have the following 426 precedence, from highest (binding tightest) at the top, to 427 lowest and loosest at the bottom: 429 Comment 430 Value range 431 Repetition, List 432 Grouping, Optional 433 Concatenation 434 Alternative 436 Use of the alternative operator, freely mixed with 437 concatenations can be confusing. It is STRONGLY recommended 438 that the grouping operator be used to make explicit 439 concatenation groups. 441 4. ABNF DEFINITION OF ABNF... 443 This syntax uses the rules provided in Appendix A (Core). 445 rule = rulename *WSP ("=" / "=/") value 446 [comment] *WSP CRLF 447 ; continues if next line starts 448 with white space 449 ; basic rules definition and 450 incremental alternatives 452 bin-val = "b" 1*("0".."1") 453 [ 1*("." 1*("0".."1")) 454 / (".." 1*("0".."1")) ] 456 comment = *WSP ";" *(WSP / %x21..%x7E) 458 dec-val = "d" 1*DIGIT 459 [ 1*("." 1*DIGIT) 460 / (".." 1*DIGIT) ] 462 el-component = rulename / set / group / option / 463 num-val / lit-val / prose-val 465 element = [repeat] el-component 467 group = *WSP "(" value *WSP ")" 469 hex-val = "x" 1*(DIGIT / "A".."F") 470 [ 1*("." 1*(DIGIT / "A".."F")) 471 / (".." 1*(DIGIT / "A".."F")) ] 473 lit-val = *WSP <"> *PCHAR <"> 475 num-val = *WSP "%" (bin-val / dec-val / hex-val) 477 option = *WSP "[" value *WSP "]" 479 prose-val = *WSP "<" 480 ">" 482 set = *WSP "{" 2*element *WSP "}" 483 ; elements in any order 485 range = value *WSP ".." value 486 ; values must reduce to single 487 decimal values 489 repeat = repeat-num / 490 [repeat-num] *WSP 491 ("*" / "#") [repeat-num] 493 repeat-num = *WSP 1*DIGIT 495 rulename = *WSP ALPHA *( ALPHA / DIGIT / "-" ) 497 value = 1*element *(*WSP "/" 1*element) 499 5. APPENDIX A - CORE 501 This Appendix is provided as a convenient core for specific 502 grammars. The definitions may be used as a core set of 503 rules.
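As a rough illustration of how an implementation might use such a core, the following Python sketch expresses several of the character-class rules of this Appendix as predicates over integer character values. The function names are hypothetical. The value ranges are taken from the ALPHA, DIGIT, CTL, CHAR and PCHAR definitions in this Appendix, with ALPHA accepting both cases because case is not significant.

```python
# Hypothetical predicate names; ranges follow the core rules of this Appendix.

def is_alpha(value):
    """ALPHA = "A".."Z", case not significant."""
    return 0x41 <= value <= 0x5A or 0x61 <= value <= 0x7A


def is_digit(value):
    """DIGIT = "0".."9"."""
    return 0x30 <= value <= 0x39


def is_ctl(value):
    """CTL = %d0..31 / %d127."""
    return 0 <= value <= 31 or value == 127


def is_char(value):
    """CHAR = %x00..7F (us-ascii)."""
    return 0x00 <= value <= 0x7F


def is_pchar(value):
    """PCHAR = %x20..7E."""
    return 0x20 <= value <= 0x7E
```

A lexical analyzer built on these predicates would operate on the decoded character values, independent of the external encoding discussed in Section 2.5.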
505 Certain basic rules are in uppercase, such as SPACE, TAB, 506 CRLF, DIGIT, ALPHA, etc. 508 ALPHA = "A".."Z" 509 ; case not significant 511 CHAR = %x00..7F 512 ; us-ascii 514 CR = %d13 516 CRLF = CR LF 518 CTL = %d0..31 / %d127 520 DIGIT = "0".."9" 522 HTAB = %d9 524 LF = %d10 526 LWSP = SPACE / HTAB 528 SPACE = " " 530 PCHAR = %x20..7E 532 WSP = LWSP / CRLF LWSP 534 Externally, data are represented as "network virtual ASCII", 535 namely 7-bit US-ASCII in an 8-bit field, with the high 536 (8th) bit set to zero. 538 6. ACKNOWLEDGEMENTS 540 The syntax for ABNF was originally specified in RFC #733. 541 Ken L. Harrenstien, of SRI International, was responsible for 542 re-coding the BNF into an augmented BNF that makes the 543 representation smaller and easier to understand. The current 544 round of specification was part of the DRUMS working group, 545 with significant contributions from Bill McQuillan, Keith 546 Moore, Pete Resnick, Jerome Abela and Chris Newman. 548 7. REFERENCES 550 [US-ASCII] Coded Character Set--7-Bit American Standard 551 Code for Information Interchange, ANSI X3.4-1986. 553 8. CONTACT 555 David H. Crocker Paul Overell 556 Demon Internet Ltd 557 Internet Mail Consortium Dorking Business Park 558 675 Spruce Dr. Dorking 559 Sunnyvale, CA 94086 USA Surrey, RH4 1HN 560 UK 561 562 563 Phone: +1 408 246 8253 564 Fax: +1 408 249 6205