idnits 2.17.1 draft-showalter-sieve-11.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** Looks like you're using RFC 2026 boilerplate. This must be updated to follow RFC 3978/3979, as updated by RFC 4748. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- ** Missing expiration date. The document expiration date should appear on the first and last page. == No 'Intended status' indicated for this document; assuming Proposed Standard Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** There are 4 instances of too long lines in the document, the longest one being 5 characters in excess of 72. ** The document seems to lack a both a reference to RFC 2119 and the recommended RFC 2119 boilerplate, even if it appears to use RFC 2119 keywords -- however, there's a paragraph with a matching beginning. Boilerplate error? RFC 2119 keyword, line 221: '...n implementation SHOULD only allow one...' RFC 2119 keyword, line 222: '...e processed, and MAY limit the number ...' RFC 2119 keyword, line 225: '...n implementation MUST refuse to redire...' RFC 2119 keyword, line 460: '... YOU MAY HAVE ALREADY WON TEN MILLIO...' RFC 2119 keyword, line 524: '... sizes, MAY have a "K", "M", or "G" ...' (50 more instances...) Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not match the current year == Line 323 has weird spacing: '...espaces for e...' == Line 785 has weird spacing: '...defined here ...' == Line 1112 has weird spacing: '... to acm@fro...' == Line 1185 has weird spacing: '...ecified by...' == Line 1186 has weird spacing: '... 
The human...' == (5 more instances...) -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (May 8, 2000) is 8747 days in the past. Is this intentional? -- Found something which looks like a code comment -- if you have code sections in the document, please surround them with '<CODE BEGINS>' and '<CODE ENDS>' lines. Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) -- Missing reference section? 'FLAMES' on line 1852 looks like a reference -- Missing reference section? 'KEYWORDS' on line 1860 looks like a reference -- Missing reference section? 'ABNF' on line 1837 looks like a reference -- Missing reference section? 'UTF-8' on line 1879 looks like a reference -- Missing reference section? 'BINARY-SI' on line 1845 looks like a reference -- Missing reference section? 'IMAIL' on line 1866 looks like a reference -- Missing reference section? 'MIME' on line 1869 looks like a reference -- Missing reference section? 'ACAP' on line 1841 looks like a reference -- Missing reference section? 'MDN' on line 1873 looks like a reference -- Missing reference section? 'DSN' on line 1849 looks like a reference -- Missing reference section? 'ADDRESS-PART' on line 1377 looks like a reference -- Missing reference section? 'COMPARATOR' on line 1445 looks like a reference -- Missing reference section? 'MATCH-TYPE' on line 1445 looks like a reference -- Missing reference section? 
'QUANTIFIER' on line 1674 looks like a reference -- Missing reference section? 'IMAP' on line 1863 looks like a reference -- Missing reference section? 'SMTP' on line 1876 looks like a reference Summary: 4 errors (**), 0 flaws (~~), 8 warnings (==), 19 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 1 Network Working Group T. Showalter 2 Internet Draft: Sieve Mirapoint, Inc 3 Document: draft-showalter-sieve-11.txt May 8, 2000 4 Expire in six months 6 Sieve: A Mail Filtering Language 8 Status of this memo 10 This document is an Internet-Draft and is in full conformance with 11 all provisions of Section 10 of RFC 2026. 13 Internet-Drafts are working documents of the Internet Engineering 14 Task Force (IETF), its areas, and its working groups. Note that 15 other groups may also distribute working documents as Internet- 16 Drafts. 18 Internet-Drafts are draft documents valid for a maximum of six months 19 and may be updated, replaced, or obsoleted by other documents at any 20 time. It is inappropriate to use Internet-Drafts as reference 21 material or to cite them other than as "work in progress." 23 The list of current Internet-Drafts can be accessed at 24 http://www.ietf.org/ietf/1id-abstracts.txt 26 The list of Internet-Draft Shadow Directories can be accessed at 27 http://www.ietf.org/shadow.html 29 Copyright Notice 31 Copyright (C) The Internet Society 2000. All Rights Reserved. 33 Abstract 35 This document describes a language for filtering e-mail messages at 36 time of final delivery. It is designed to be implementable on either 37 a mail client or mail server. It is meant to be extensible, simple, 38 and independent of access protocol, mail architecture, and operating 39 system. 
It is suitable for running on a mail server where users may 40 not be allowed to execute arbitrary programs, such as on black box 41 IMAP servers, as it has no variables, loops, or ability to shell out 42 to external programs. 44 Internet DRAFT Sieve May 8, 2000 46 Table of Contents 48 Status of this memo ............................................... 1 49 Copyright Notice .................................................. 1 50 Abstract .......................................................... 1 51 0. Meta-information on this draft ............................ 4 52 0.1. Discussion ............................................... 4 53 0.2. Known Issues ............................................. 4 54 0.2.1. Probable Extensions ...................................... 4 55 0.3. Noted Changes ............................................ 5 56 0.3.1. since -10 ................................................ 5 57 0.3.1. since -09 ................................................ 5 58 0.3.2. since -08 ................................................ 5 59 0.3.3. since -07 ................................................ 5 60 0.3.4. since -06 ................................................ 6 61 0.3.5. since -05 ................................................ 6 62 0.3.6. since -04 ................................................ 7 63 1. Introduction .............................................. 9 64 1.1. Conventions Used in This Document ........................ 9 65 1.2. Example mail messages .................................... 10 66 2. Design .................................................... 11 67 2.1. Form of the Language ..................................... 11 68 2.2. Whitespace ............................................... 11 69 2.3. Comments ................................................. 11 70 2.4. Literal Data ............................................. 12 71 2.4.1. Numbers .................................................. 12 72 2.4.2. 
Strings .................................................. 12 73 2.4.2.1. String Lists ............................................. 13 74 2.4.2.2. Headers .................................................. 14 75 2.4.2.3. Addresses ................................................ 14 76 2.4.2.4. MIME Parts ............................................... 14 77 2.5. Tests .................................................... 14 78 2.5.1. Test Lists ............................................... 14 79 2.6. Arguments ................................................ 15 80 2.6.1. Positional Arguments ..................................... 15 81 2.6.2. Tagged Arguments ......................................... 15 82 2.6.3. Optional Arguments ....................................... 16 83 2.6.4. Types of Arguments ....................................... 16 84 2.7. String Comparison ........................................ 16 85 2.7.1. Match Type ............................................... 16 86 2.7.2. Comparisons Across Character Sets ........................ 17 87 2.7.3. Comparators .............................................. 18 88 2.7.4. Comparisons Against Addresses ............................ 19 91 2.8. Blocks ................................................... 19 92 2.9. Commands ................................................. 20 93 2.10. Evaluation ............................................... 20 94 2.10.1. Action Interaction ....................................... 20 95 2.10.2. Implicit Keep ............................................ 21 96 2.10.3. Message Uniqueness in a Mailbox .......................... 21 97 2.10.4. Limits on Numbers of Actions ............................. 21 98 2.10.5. Extensions and Optional Features ......................... 22 99 2.10.6. Errors ................................................... 22 100 2.10.7. Limits on Execution ...................................... 23 101 3. 
Control Commands .......................................... 23 102 3.1. If ........................................................ 23 103 3.2. Control Structure Require ................................ 24 104 3.3. Control Structure Stop ................................... 25 105 4. Action Commands ........................................... 25 106 4.1. Action reject ............................................ 25 107 4.2. Action fileinto .......................................... 26 108 4.3. Action redirect .......................................... 26 109 4.4. Action keep .............................................. 27 110 4.5. Action discard ........................................... 27 111 5. Test Commands ............................................. 28 112 5.1. Test address ............................................. 28 113 5.2. Test allof ............................................... 29 114 5.3. Test anyof ............................................... 29 115 5.4. Test envelope ............................................ 29 116 5.5. Test exists .............................................. 30 117 5.6. Test false ............................................... 31 118 5.7. Test header .............................................. 31 119 5.8. Test not ................................................. 31 120 5.9. Test size ................................................ 31 121 5.10. Test true ................................................ 32 122 6. Extensibility ............................................. 32 123 6.1. Capability String ........................................ 32 124 6.2. IANA Considerations ...................................... 33 125 6.3. Capability Transport ..................................... 33 126 7. Transmission .............................................. 34 127 8. Parsing ................................................... 34 128 8.1. Lexical Tokens ........................................... 34 129 8.2. 
Grammar .................................................. 36 130 9. Extended Example .......................................... 37 131 10. Security Considerations ................................... 38 132 11. Acknowledgments ........................................... 38 133 12. Author's Address .......................................... 39 134 Appendix A. References ........................................... 40 135 Appendix B. Full Copyright Statement ............................. 41 138 0. Meta-information on this draft 140 This information is intended to facilitate discussion. It will be 141 removed when this document leaves the Internet-Draft stage. 143 0.1. Discussion 145 This draft is being discussed on the MTA Filters mailing list at 146 <ietf-mta-filters@imc.org>. Subscription requests can be sent to 147 (send an email message with the 148 word "subscribe" in the body). More information on the mailing list 149 along with a WWW archive of back messages is available at 150 . 152 0.2. Known Issues 154 0.2.1. Probable Extensions 156 Extensions for vacation, "sub-addressing", IMAP flags on delivery, 157 regular expressions, and "include" functionality have been suggested. 158 This functionality has been left out of this draft because of 159 limitations on implementations (e.g., you can't set IMAP flags on a 160 POP server, many MTAs don't support subaddressing, you can't include 161 other scripts on a sealed server, etc.). 163 The following documents discuss extensions: 165 draft-melnikov-sieve-imapflags-03.txt 166 This document describes an extension for setting IMAP flags on 167 delivery. 169 draft-murchison-sieve-regex-00.txt 170 This document discusses use of regular expressions for string 171 comparisons. 173 draft-murchison-sieve-subaddress-00.txt 174 This document describes a method for using sub-addresses within 175 Sieve scripts. 177 draft-showalter-sieve-vacation-02.txt 178 This document describes a Unix vacation(1)-style autoresponder. 
180 Also of note is draft-gellens-acap-sieve-00.txt, which discusses a 181 transport for Sieve scripts via ACAP. 185 0.3. Noted Changes 187 0.3.1. since -10 189 The "Probable Extensions" section has been revised. A correction was 190 made to the big example in section 9. The outline form of section 191 6.3 was removed since there was just a single bullet item. 193 0.3.1. since -09 195 Many nits from Randall Gellens. 197 The use of backslashes and metacharacters with :matches has been 198 changed to reflect what happens in C. This means the parser doesn't 199 need to know about how the string is being used. 201 The syntax for dot-stuffed strings was changed to be both correct and 202 more readable. 204 0.3.2. since -08 206 Editorial changes only, as I recall. 208 0.3.3. since -07 210 A whole lot of nits from Ned Freed. 212 Clarification that multiple address-part or match-type tags is an 213 error. 215 Section 2.10.7 added. 217 Reject generates MDNs, not DSNs. Left it as optional. 219 Require and Stop moved to Control Structures section. 221 Text removed: "Therefore, an implementation SHOULD only allow one 222 "reject" per message processed, and MAY limit the number of redirect 223 actions taken." Covered by 2.10.4. 225 Text removed: "An implementation MUST refuse to redirect a message to 226 itself." Hopefully covered by loop control stuff in reject. 228 Various corrections from Randy Gellens. 230 The syntax definitions of allof and anyof have changed to use a 231 test-list instead of describing the grammar of the test-list. 235 0.3.4. since -06 237 Larry Greenfield supplied a rewrite of the grammar that separates 238 things out into a tokenizer and a parser. This grammar also allows 239 UTF-8 characters in strings (previous versions limited characters to 240 the 0x01-0x7F range). 242 Steve Hole made a number of editorial suggestions that were taken. 
243 This includes discussing a tokenizer in 2.1 and renaming sections 3, 244 4, and 5 ("Control Structures" became "Control Commands", "Actions" 245 became "Action Commands", and "Tests" became "Test Commands"). Other 246 uses of these terms in this document should have been changed to 247 match, but I probably missed some. 249 Lots of new rules were added to section 2.10, and should be reviewed 250 carefully. I think that they reflect consensus, but am not sure. 252 Tokens are defined as being case insensitive. 254 Envelope takes a COMPARATOR argument. 256 ADDRESS-PART defaults to :all. 258 Test "true" has been put back. Truth was accidentally deleted. 260 Gregory Sereda provided several examples, including a long one which 261 has been inserted as section 9. 263 The copyright date has been fixed and the copyright and I-D 264 boilerplate updated with the latest and greatest from the IETF web 265 site. 267 Unnecessary brackets were removed around various syntax elements in 268 section 2.7. 270 Acknowledgments were moved further towards the end. 272 Several other more minor fixes were made. 274 0.3.5. since -05 276 Draft -05 was never published in the Internet-Drafts repository, but 277 was circulated on the ietf-mta-filters@imc.org mailing list. 279 All nits submitted by Greg Sereda are hopefully addressed. Most of 280 these were example bugs, but he also pointed out that types for 281 arguments were under-specified and in several cases orders of 282 arguments disagreed with the syntax. 286 "Match keyword" was changed to "match type" as an editorial change. 288 "Forward" was renamed to "redirect" because of the conflict between 289 multiple meanings of "forward" in order to make it clear exactly what 290 we meant. 292 Limitation of one redirect per message should be removed. 294 The types of arguments have been added to their syntax line. 296 Added "require" back in a slightly different form. 
"Require" is now 297 an action (arbitrarily) and has been added to sec. 2.10 as well. 299 Implementations are responsible for not allowing mail loops. 301 All discussion of short-circuit evaluation has been removed. On a 302 related note, tests must not have side effects. 304 Envelope is required to drop source routes. 306 An address-matching primitive has been added. 308 0.3.6. since -04 310 Here is a list of changes from draft 04. (It may not be complete.) 312 * Consensus: i;ascii-casemap is required. 314 * Consensus: i;ascii-casemap is the default. 316 * Header name compares are always case-insensitive; the draft now 317 says so. 319 * Several examples were fixed, but it is likely that errors remain. 321 * Bug: Section 7, remove reference to "support". 323 * There are two namespaces for extension names, one "vnd.", one 324 everything else, like MIME. 326 * All XXXs have been removed, except for in IANA section. 328 * Fileinto is optional, and discussion of local mail folders and POP3 329 has been removed. 331 * A non-present comparator is considered to be basically a syntax 332 error. 336 * Resent headers are not to be added by the "redirect" command. 338 * Tagged arguments must follow the keyword, and may not be 339 interspersed with positional arguments. 341 * Envelope-matching commands are to be added with the syntax that 342 Barry suggested. 344 * Put back :matches match type. 346 * What happens when an error occurs has been dropped. 348 * Reject is now optional. 350 * Implementations are encouraged to decode header charsets, and if 351 they don't, are required to not do compares on 8-bit data. 355 1. Introduction 357 This memo documents a language that can be used to create filters for 358 electronic mail. It is not tied to any particular operating system or 359 mail architecture. It requires the use of [IMAIL]-compliant 360 messages, but should otherwise generalize to many systems. 
362 The language is powerful enough to be useful but limited in order to 363 allow for a safe server-side filtering system. The intention is to 364 make it impossible for users to do anything more complex (and 365 dangerous) than write simple mail filters, along with facilitating 366 the use of GUIs for filter creation and manipulation. The language is 367 not Turing-complete: it provides no way to write a loop or a function 368 and variables are not provided. 370 Scripts written in Sieve are executed during final delivery, when the 371 message is moved to the user-accessible mailbox. In systems where 372 the MTA does final delivery, such as traditional Unix mail, it is 373 reasonable to sort when the MTA deposits mail into the user's 374 mailbox. 376 There are a number of reasons to use a filtering system. Mail 377 traffic for most users has been increasing due to increased usage of 378 e-mail, the emergence of unsolicited email as a form of advertising, 379 and increased usage of mailing lists. 381 Experience at Carnegie Mellon has shown that if a filtering system is 382 made available to users, many will make use of it in order to file 383 messages from specific users or mailing lists. However, many others 384 did not make use of the Andrew system's FLAMES filtering language 385 [FLAMES] due to difficulty in setting it up. 387 Because of the expectation that users will make use of filtering if 388 it is offered and easy to use, this language has been made simple 389 enough to allow many users to make use of it, but rich enough that it 390 can be used productively. However, it is expected that GUI-based 391 editors will be the preferred way of editing filters for a large 392 number of users. 394 1.1. Conventions Used in This Document 396 In the sections of this document that discuss the requirements of 397 various keywords and operators, the following conventions have been 398 adopted. 
400 The key words "MUST", "MUST NOT", "SHOULD", "SHOULD NOT", "CAN", and 401 "MAY" in this document are to be interpreted as defined in 402 [KEYWORDS]. 406 Each section on a command (test, action, or control structure) has a 407 line labeled "Syntax:". This line describes the syntax of the 408 command, including its name and its arguments. Required arguments 409 are listed inside angle brackets ("<" and ">"). Optional arguments 410 are listed inside square brackets ("[" and "]"). Each argument is 411 followed by its type, so "<key: string>" represents an argument 412 called "key" that is a string. Literal strings are represented with 413 double-quoted strings. Alternatives are separated with slashes, and 414 parentheses are used for grouping, similar to [ABNF]. 416 In the "Syntax" line, there are three special pieces of syntax that 417 are frequently repeated, MATCH-TYPE, COMPARATOR, and ADDRESS-PART. 418 These are discussed in sections 2.7.1, 2.7.3, and 2.7.4, 419 respectively. 421 The formal grammar for these commands is given in section 8.2 and is the 422 authoritative reference on how to construct commands, but the formal 423 grammar does not specify the order, semantics, number or types of 424 arguments to commands, nor the legal command names. The intent is to 425 allow for extension without changing the grammar. 427 1.2. Example mail messages 429 The following mail messages will be used throughout this document in 430 examples. 432 Message A 433 ----------------------------------------------------------- 434 Date: Tue, 1 Apr 1997 09:06:31 -0800 (PST) 435 From: coyote@desert.org 436 To: roadrunner@birdseed.org 437 Subject: I have a present for you 439 Look, I'm sorry about the whole anvil thing, and I really 440 didn't mean to try and drop it on you from the top of the 441 cliff. I want to try to make it up to you. 
I've got some 442 great birdseed over here at my place--top of the line 443 stuff--and if you come by, I'll have it all wrapped up 444 for you. I'm really sorry for all the problems I've caused 445 for you over the years, but I know we can work this out. 446 -- 447 Wile E. Coyote "Super Genius" coyote@znic.net 448 ----------------------------------------------------------- 452 Message B 453 ----------------------------------------------------------- 454 From: youcouldberich!@reply-by-postal-mail.invalid 455 Sender: b1ff@de.res.frobnitzm.edu 456 To: rube@landru.melon.net 457 Date: Mon, 31 Mar 1997 18:26:10 -0800 458 Subject: $$$ YOU, TOO, CAN BE A MILLIONAIRE! $$$ 460 YOU MAY HAVE ALREADY WON TEN MILLION DOLLARS, BUT I DOUBT 461 IT! SO JUST POST THIS TO SIX HUNDRED NEWSGROUPS! IT WILL 462 GUARANTEE THAT YOU GET AT LEAST FIVE RESPONSES WITH MONEY! 463 MONEY! MONEY! COLD HARD CASH! YOU WILL RECEIVE OVER 464 $20,000 IN LESS THAN TWO MONTHS! AND IT'S LEGAL!!!!!!!!! 465 !!!!!!!!!!!!!!!!!!111111111!!!!!!!11111111111!!1 JUST 466 SEND $5 IN SMALL, UNMARKED BILLS TO THE ADDRESSES BELOW! 467 ----------------------------------------------------------- 469 2. Design 471 2.1. Form of the Language 473 The language consists of a set of commands. Each command consists of 474 a set of tokens delimited by whitespace. The first token is the 475 command string followed by zero or more arguments. Arguments may be 476 literal data, tags, blocks of commands, or test commands. 478 The language is represented in UTF-8, as specified in [UTF-8]. 480 Tokens in the ASCII range are considered case-insensitive. 482 2.2. Whitespace 484 Whitespace is used to separate tokens. Whitespace is made up of 485 tabs, newlines (CRLF, never just CR or LF), and the space character. 486 The amount of whitespace used is not significant. 488 2.3. Comments 490 Two types of comments are offered. 
Comments are semantically 491 equivalent to whitespace and can be used anyplace that whitespace is 492 (with one exception in multi-line strings, as described in the 493 grammar). 495 Hash comments begin with a "#" character that is not contained within 496 a string and continue until the next CRLF. 498 Example: if size :over 100K { # this is a comment 499 discard; 503 } 505 Bracketed comments begin with the token "/*" and end with "*/". 506 Bracketed comments may span multiple lines. Bracketed comments do 507 not nest. 509 Example: if size :over 100K { /* this is a comment 510 this is still a comment */ discard /* this is a comment 511 */ ; 512 } 514 2.4. Literal Data 516 Literal data means data that is not executed, merely evaluated "as 517 is", to be used as arguments to commands. Literal data is limited to 518 numbers and strings. 520 2.4.1. Numbers 522 Numbers are given as ordinary decimal numbers. However, those 523 numbers that have a tendency to be fairly large, such as message 524 sizes, MAY have a "K", "M", or "G" appended to indicate a multiple of 525 a base-two number. To be comparable with the power-of-two-based 526 versions of SI units that computers frequently use, K specifies 527 kibi-, or 1,024 (2^10) times the value of the number; M specifies 528 mebi-, or 1,048,576 (2^20) times the value of the number; and G 529 specifies gibi-, or 1,073,741,824 (2^30) times the value of the 530 number [BINARY-SI]. 532 Implementations MUST provide 31 bits of magnitude in numbers, but MAY 533 provide more. 535 Only positive integers are permitted by this specification. 537 2.4.2. Strings 539 Scripts involve large numbers of strings as they are used for pattern 540 matching, addresses, textual bodies, etc. Typically, short quoted 541 strings suffice for most uses, but a more convenient form is provided 542 for longer strings such as bodies of messages. 
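As an illustration of the number syntax in section 2.4.1, the following Python sketch shows one way an implementation might turn a number token into an integer. It is not part of this specification: the names parse_sieve_number, SUFFIX_MULTIPLIER, MAX_31_BIT, and fits_in_31_bits are hypothetical, and accepting lower-case suffixes is an assumption drawn from the statement in section 2.1 that tokens in the ASCII range are case-insensitive.

```python
import re

# Multipliers for the optional size suffixes of section 2.4.1.
SUFFIX_MULTIPLIER = {"": 1, "K": 2**10, "M": 2**20, "G": 2**30}

# ASCII decimal digits followed by at most one suffix letter.
# Lower-case letters are accepted as an assumption (see lead-in).
_NUMBER_RE = re.compile(r"([0-9]+)([KkMmGg]?)")

# Largest value guaranteed to be representable: 31 bits of magnitude.
MAX_31_BIT = 2**31 - 1


def parse_sieve_number(token: str) -> int:
    """Return the integer value of a number token such as "100K"."""
    match = _NUMBER_RE.fullmatch(token)
    if match is None:
        raise ValueError(f"not a Sieve number: {token!r}")
    digits, suffix = match.groups()
    return int(digits) * SUFFIX_MULTIPLIER[suffix.upper()]


def fits_in_31_bits(value: int) -> bool:
    """True if every conforming implementation can represent the value."""
    return 0 <= value <= MAX_31_BIT
```

Under these assumptions, "100K" evaluates to 100 * 1,024 = 102,400 and "1G" to 1,073,741,824, which still fits in 31 bits; "2G" (2,147,483,648) does not, so a script using it relies on an implementation that provides more than the required minimum.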
544 A quoted string starts and ends with a single double quote (the <"> 545 character, ASCII 34). A backslash ("\", ASCII 92) inside of a quoted 546 string is followed by either another backslash or a double quote. 547 This two-character sequence represents a single backslash or double- 548 quote within the string, respectively. 552 No other characters should be escaped with a single backslash. 554 An undefined escape sequence (such as "\a" in a context where "a" has 555 no special meaning) is interpreted as if there were no backslash (in 556 this case, "\a" is just "a"). 558 Non-printing characters such as tabs, CR and LF, and control 559 characters are permitted in strings. NUL (ASCII 0) is not allowed in 560 strings. 562 For entering larger amounts of text, such as an email message, a 563 multi-line form is allowed. It starts with the keyword "text:", 564 followed by a CRLF, and ends with the sequence of a CRLF, a single 565 period, and another CRLF. In order to allow the message to contain 566 lines with a single-dot, lines are dot-stuffed. That is, when 567 composing a message body, an extra `.' is added before each line 568 which begins with a `.'. When the server interprets the script, 569 these extra dots are removed. Note that a line that begins with a 570 dot followed by a non-dot character is not interpreted as dot-stuffed; 571 that is, ".foo" is interpreted as ".foo". However, because this is 572 potentially ambiguous, scripts SHOULD be properly dot-stuffed so such 573 lines do not appear. 575 Note that a hashed comment or whitespace may occur in between the 576 "text:" and the CRLF, but not within the string itself. Bracketed 577 comments are not allowed here. 579 2.4.2.1. String Lists 581 When matching patterns, it is frequently convenient to match against 582 groups of strings instead of single strings. 
For this reason, a list 583 of strings is allowed in many tests, implying that if the test is 584 true using any one of the strings, then the test is true. 585 Implementations are encouraged to use short-circuit evaluation in 586 these cases. 588 For instance, the test `header :contains ["To", "Cc"] 589 ["me@frobnitzm.edu", "me00@landru.melon.edu"]' is true if either the 590 To header or Cc header of the input message contains either of the 591 e-mail addresses "me@frobnitzm.edu" or "me00@landru.melon.edu". 593 Conversely, in any case where a list of strings is appropriate, a 594 single string is allowed without being a member of a list: it is 595 equivalent to a list with a single member. This means that the test 596 `exists "To"' is equivalent to the test `exists ["To"]'. 600 2.4.2.2. Headers 602 Headers are a subset of strings. In the Internet Message 603 Specification [IMAIL], each header line is allowed to have whitespace 604 nearly anywhere in the line, including after the field name and 605 before the subsequent colon. Extra spaces between the header name 606 and the ":" in a header field are ignored. 608 A header name never contains a colon. The "From" header refers to a 609 line beginning "From:" (or "From :", etc.). No header will match 610 the string "From:" due to the trailing colon. 612 2.4.2.3. Addresses 614 A number of commands call for email addresses, which are also a 615 subset of strings. These addresses must be compliant with [IMAIL]. 616 Implementations MUST ensure that the addresses are syntactically 617 valid, but need not ensure that they actually identify an email 618 recipient. 620 2.4.2.4. MIME Parts 622 In a few places, [MIME] bodyparts are represented as strings. These 623 parts include MIME headers and the body. This provides a way of 624 embedding typed data within a Sieve script so that, among other 625 things, character sets other than UTF-8 can be used for output 626 messages. 628 2.5. 
Tests 630 Tests are given as arguments to commands in order to control their 631 actions. In this document, tests are given to if/elsif/else to 632 decide which block of code is run. 634 Tests MUST NOT have side effects. That is, a test cannot affect the 635 state of the filter or message. No tests in this specification have 636 side effects, and side effects are forbidden in extension tests as 637 well. 639 The rationale for this is that tests with side effects impair 640 readability and maintainability and are difficult to represent in a 641 graphic interface for generating scripts. Side effects are confined 642 to actions where they are clearer. 644 2.5.1. Test Lists 646 Some tests ("allof" and "anyof", which implement logical "and" and 647 logical "or", respectively) may require more than a single test as an 651 argument. The test-list syntax element provides a way of grouping 652 tests. 654 Example: if anyof (not exists ["From", "Date"], 655 header :contains "from" "fool@znic.edu") { 656 discard; 657 } 659 2.6. Arguments 661 In order to specify what to do, most commands take arguments. There 662 are three types of arguments: positional, tagged, and optional. 664 2.6.1. Positional Arguments 666 Positional arguments are given to a command which discerns their 667 meaning based on their order. When a command takes positional 668 arguments, all positional arguments must be supplied and must be in 669 the order prescribed. 671 2.6.2. Tagged Arguments 673 This document provides for tagged arguments in the style of 674 CommonLISP. These are also similar to flags given to commands in 675 most command-line systems. 677 A tagged argument is an argument for a command that begins with ":" 678 followed by a tag naming the argument, such as ":contains". This 679 argument means that zero or more of the next tokens have some 680 particular meaning depending on the argument. 
These next tokens may 681 be numbers or strings but they are never blocks. 683 Tagged arguments are similar to positional arguments, except that 684 instead of the meaning being derived from the command, it is derived 685 from the tag. 687 Tagged arguments must appear before positional arguments, but they 688 may appear in any order with other tagged arguments. For simplicity 689 of the specification, this is not expressed in the syntax definitions 690 with commands, but they still may be reordered arbitrarily provided 691 they appear before positional arguments. Tagged arguments may be 692 mixed with optional arguments. 694 To simplify this specification, tagged arguments SHOULD NOT take 695 tagged arguments as arguments. 697 Internet DRAFT Sieve May 8, 2000 699 2.6.3. Optional Arguments 701 Optional arguments are exactly like tagged arguments except that they 702 may be left out, in which case a default value is implied. Because 703 optional arguments tend to result in shorter scripts, they have been 704 used far more than tagged arguments. 706 One particularly noteworthy case is the ":comparator" argument, which 707 allows the user to specify which [ACAP] comparator will be used to 708 compare two strings, since different languages may impose different 709 orderings on UTF-8 [UTF-8] characters. 711 2.6.4. Types of Arguments 713 Abstractly, arguments may be literal data, tests, or blocks of 714 commands. In this way, an "if" control structure is merely a command 715 that happens to take a test and a block as arguments and may execute 716 the block of code. 718 However, this abstraction is ambiguous from a parsing standpoint. 719 The grammar in section 9.2 presents a parsable version of this: 720 Arguments are string-lists, numbers, and tags, which may be followed 721 by a test or a test-list, which may be followed by a block of 722 commands. 
No more than one test or test list, nor more than one 723 block of commands, may be used, and commands that end with blocks of 724 commands do not end with semicolons. 726 2.7. String Comparison 728 When matching one string against another, there are a number of ways 729 of performing the match operation. These are accomplished with three 730 types of matches: an exact match, a substring match, and a wildcard 731 glob-style match. These are described below. 733 In order to provide for matches between character sets and case 734 insensitivity, Sieve borrows ACAP's comparator registry. 736 However, when a string represents the name of a header, the 737 comparator is never user-specified. Header comparisons are always 738 done with the "i;ascii-casemap" operator, i.e., case-insensitive 739 comparisons, because this is the way things are defined in the 740 message specification [IMAIL]. 742 2.7.1. Match Type 744 There are three match types describing the matching used in this 745 specification: ":is", ":contains", and ":matches". Match type 746 arguments are supplied to those commands which allow them to specify 748 Internet DRAFT Sieve May 8, 2000 750 what kind of match is to be performed. 752 These are used as tagged arguments to tests that perform string 753 comparison. 755 The ":contains" match type describes a substring match. If the value 756 argument contains the key argument as a substring, the match is true. 757 For instance, the string "frobnitzm" contains "frob" and "nit", but 758 not "fbm". The null key ("") is contained in all values. 760 The ":is" match type describes an absolute match; if the contents of 761 the first string are absolutely the same as the contents of the 762 second string, they match. Only the string "frobnitzm" is the string 763 "frobnitzm". The null key ":is" and only ":is" the null value. 765 The ":matches" version specifies a wildcard match using the 766 characters "*" and "?". "*" matches zero or more characters, and "?" 
767 matches a single character. "?" and "*" may be escaped as "\\?" and 768 "\\*" in strings to match against themselves. The first backslash 769 escapes the second backslash; together, they escape the "*". This is 770 awkward, but it is commonplace in several programming languages that 771 use globs and regular expressions. 773 In order to specify what type of match is supposed to happen, 774 commands that support matching take optional tagged arguments 775 ":matches", ":is", and ":contains". Commands default to using ":is" 776 matching if no match type argument is supplied. Note that these 777 modifiers may interact with comparators; in particular, some 778 comparators are not suitable for matching with ":contains" or 779 ":matches". It is an error to use a comparator with ":contains" or 780 ":matches" that is not compatible with it. 782 It is an error to give more than one of these arguments to a given 783 command. 785 For convenience, the "MATCH-TYPE" syntax element is defined here as 786 follows: 788 Syntax: ":is" / ":contains" / ":matches" 790 2.7.2. Comparisons Across Character Sets 792 All Sieve scripts are represented in UTF-8, but messages may involve 793 a number of character sets. In order for comparisons to work across 794 character sets, implementations SHOULD implement the following 795 behavior: 797 Internet DRAFT Sieve May 8, 2000 799 Implementations decode header charsets to UTF-8. Two strings are 800 considered equal if their UTF-8 representations are identical. 801 Implementations should decode charsets represented in the forms 802 specified by [MIME] for both message headers and bodies. 803 Implementations must be capable of decoding US-ASCII, ISO-8859-1, 804 the ASCII subset of ISO-8859-* character sets, and UTF-8. 806 If implementations fail to support the above behavior, they MUST 807 conform to the following: 809 No two strings can be considered equal if one contains octets 810 greater than 127. 812 2.7.3. 
Comparators

814  In order to allow for language-independent, case-independent matches,
815  the match type may be coupled with a comparator name. Comparators
816  are described for [ACAP]; a registry is defined for ACAP, and this
817  specification uses that registry.

819  ACAP defines multiple comparator types. Only equality types are used
820  in this specification.

822  All implementations MUST support the "i;octet" comparator (simply
823  compares octets) and the "i;ascii-casemap" comparator (which treats
824  uppercase and lowercase characters in the ASCII subset of UTF-8 as
825  the same). If left unspecified, the default is "i;ascii-casemap".

827  Some comparators may not be usable with substring matches; that is,
828  they may only work with ":is". It is an error to try to use a
829  comparator with ":matches" or ":contains" that is not compatible with
830  it.

832  A comparator is specified by the ":comparator" option with commands
833  that support matching. This option is followed by a string providing
834  the name of the comparator to be used. For convenience, the syntax
835  of a comparator is abbreviated to "COMPARATOR", and (repeated in
836  several tests) is as follows:

838  Syntax: ":comparator" <comparator-name: string>

840  So in this example,

842  Example: if header :contains :comparator "i;octet" "Subject"
843              "MAKE MONEY FAST" {
844               discard;
845           }

849  would discard any message with subjects like "You can MAKE MONEY
850  FAST", but not "You can Make Money Fast", since the comparator used
851  is case-sensitive.

853  Comparators other than i;octet and i;ascii-casemap must be declared
854  with require, as they are extensions. If a comparator declared with
855  require is not known, it is an error, and execution fails. If the
856  comparator is not declared with require, it is also an error, even if
857  the comparator is supported. (See 2.10.5.)
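
The interaction between match types (2.7.1) and the two mandatory comparators can be sketched in Python. This is an illustration only, not part of the specification: the names `normalize`, `glob_to_regex`, and `string_test` are hypothetical, translating ":matches" keys to a regular expression is just one possible approach, and keys are assumed to be given after Sieve string unescaping (so an escaped star arrives as a backslash followed by "*").

```python
import re

def normalize(value, comparator="i;ascii-casemap"):
    """Return the form of `value` that the named comparator compares."""
    if comparator == "i;octet":
        return value  # compared exactly as given
    if comparator == "i;ascii-casemap":
        # Fold only ASCII a-z; characters outside ASCII are left alone.
        return "".join(chr(ord(c) - 32) if "a" <= c <= "z" else c for c in value)
    raise ValueError("comparator not supported: " + comparator)

def glob_to_regex(key):
    """Translate a ':matches' key ('*', '?', backslash escapes) to a regex."""
    parts = []
    i = 0
    while i < len(key):
        c = key[i]
        if c == "\\" and i + 1 < len(key):
            parts.append(re.escape(key[i + 1]))  # escaped character is literal
            i += 2
            continue
        if c == "*":
            parts.append(".*")   # zero or more characters
        elif c == "?":
            parts.append(".")    # exactly one character
        else:
            parts.append(re.escape(c))
        i += 1
    return "".join(parts)

def string_test(value, key, match_type=":is", comparator="i;ascii-casemap"):
    """Apply one match type to one value/key pair under a comparator."""
    v = normalize(value, comparator)
    k = normalize(key, comparator)
    if match_type == ":is":
        return v == k
    if match_type == ":contains":
        return k in v
    if match_type == ":matches":
        return re.fullmatch(glob_to_regex(k), v, flags=re.DOTALL) is not None
    raise ValueError("unknown match type: " + match_type)
```

With this sketch, `string_test("You can Make Money Fast", "MAKE MONEY FAST", ":contains", "i;octet")` is false, while the same call with the default comparator is true, mirroring the example above.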
859 Both ":matches" and ":contains" match types are compatible with the 860 "i;octet" and "i;ascii-casemap" comparators and may be used with 861 them. 863 It is an error to give more than one of these arguments to a given 864 command. 866 2.7.4. Comparisons Against Addresses 868 Addresses are one of the most frequent things represented as strings. 869 These are structured, and being able to compare against the local- 870 part or the domain of an address is useful, so some tests that act 871 exclusively on addresses take an additional optional argument that 872 specifies what the test acts on. 874 These optional arguments are ":localpart", ":domain", and ":all", 875 which act on the local-part (left-side), the domain part (right- 876 side), and the whole address. 878 The kind of comparison done, such as whether or not the test done is 879 case-insensitive, is specified as a comparator argument to the test. 881 If an optional address-part is omitted, the default is ":all". 883 It is an error to give more than one of these arguments to a given 884 command. 886 For convenience, the "ADDRESS-PART" syntax element is defined here as 887 follows: 889 Syntax: ":localpart" / ":domain" / ":all" 891 2.8. Blocks 893 Blocks are sets of commands enclosed within curly braces. Blocks are 894 supplied to commands so that the commands can implement control 895 commands. 897 Internet DRAFT Sieve May 8, 2000 899 A control structure is a command that happens to take a test and a 900 block as one of its arguments; depending on the result of the test 901 supplied as another argument, it runs the code in the block some 902 number of times. 904 With the commands supplied in this memo, there are no loops. The 905 control structures supplied--if, elsif, and else--run a block either 906 once or not at all. So there are two arguments, the test and the 907 block. 909 2.9. Commands 911 Sieve scripts are sequences of commands. 
Commands can take any of 912 the tokens above as arguments, and arguments may be either tagged or 913 positional arguments. Not all commands take all arguments. 915 There are three kinds of commands: test commands, action commands, 916 and control commands. 918 The simplest is an action command. An action command is an 919 identifier followed by zero or more arguments, terminated by a 920 semicolon. Action commands do not take tests or blocks as arguments. 922 A control command is similar, but it takes a test as an argument, and 923 ends with a block instead of a semicolon. 925 A test command is used as part of a control command. It is used to 926 specify whether or not the block of code given to the control command 927 is executed. 929 2.10. Evaluation 931 2.10.1. Action Interaction 933 Some actions cannot be used with other actions because the result 934 would be absurd. These restrictions are noted throughout this memo. 936 Internet DRAFT Sieve May 8, 2000 938 Extension actions MUST state how they interact with actions defined 939 in this specification. 941 2.10.2. Implicit Keep 943 Previous experience with filtering systems suggests that cases tend 944 to be missed in scripts. To prevent errors, Sieve has an "implicit 945 keep". 947 An implicit keep is a keep action (see 4.4) performed in absence of 948 any other actions. 950 An implicit keep is performed if a message is not written to a 951 mailbox, redirected to a new address, or explicitly thrown out. That 952 is, if a fileinto, a keep, a redirect, or a discard is performed, an 953 implicit keep is not. 955 For instance, with any of the short messages offered above, the 956 following script produces no actions. 958 Example: if size :over 500K { discard; } 960 As a result, the implicit keep is taken. 962 2.10.3. 
Message Uniqueness in a Mailbox 964 Implementations SHOULD NOT deliver a message to the same folder more 965 than once, even if a script explicitly asks for a message to be 966 written to a mailbox twice. 968 The test for equality of two messages is implementation-defined. 970 If a script asks for a message to be written to a mailbox twice, it 971 MUST NOT be treated as an error. 973 2.10.4. Limits on Numbers of Actions 975 Site policy MAY limit numbers of actions taken and MAY impose 976 restrictions on which actions can be used together. In the event 977 that a script hits a policy limit on the number of actions taken for 978 a particular message, an error occurs. 980 Implementations MUST prohibit more than one reject. 982 Implementations MUST allow at least one keep or one fileinto. If 983 fileinto is not implemented, implementations MUST allow at least one 984 keep. 986 Internet DRAFT Sieve May 8, 2000 988 Implementations SHOULD prohibit reject when used with other actions. 990 2.10.5. Extensions and Optional Features 992 Because of the differing capabilities of many mail systems, several 993 features of this specification are optional. Before any of these 994 extensions can be executed, they must be declared with the "require" 995 action. 997 If an extension is not enabled with "require", implementations MUST 998 treat it as if they did not support it at all. 1000 If a script does not understand an extension declared with require, 1001 the script must not be used at all. Implementations MUST NOT execute 1002 scripts which require unknown capability names. 1004 Note: The reason for this restriction is that prior experiences with 1005 languages such as LISP and Tcl suggest that this is a workable 1006 way of noting that a given script uses an extension. 1008 Experience with PostScript suggests that mechanisms that allow 1009 a script to work around missing extensions are not used in 1010 practice. 
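
The rules for "require" in this section can be sketched in Python. This is a hedged illustration, not a definitive implementation: `SieveError`, `check_requires`, `feature_enabled`, and `comparator_enabled` are hypothetical names, and the set of supported capabilities is assumed to be supplied by the implementation. The "comparator-" prefix follows the capability strings listed in section 6.1.

```python
BUILTIN_COMPARATORS = {"i;octet", "i;ascii-casemap"}

class SieveError(Exception):
    """Raised when a script cannot be used (hypothetical error type)."""

def check_requires(required, supported):
    """Refuse the whole script if it requires an unknown capability."""
    unknown = sorted(set(required) - set(supported))
    if unknown:
        raise SieveError("unsupported capabilities: " + ", ".join(unknown))
    return set(required)

def feature_enabled(capability, enabled):
    """An extension not declared with require is treated as unsupported."""
    return capability in enabled

def comparator_enabled(name, enabled):
    """Built-in comparators need no require; others need 'comparator-<name>'."""
    return name in BUILTIN_COMPARATORS or ("comparator-" + name) in enabled
```

For example, a server that supports "fileinto" and "reject" but receives a script declaring only "fileinto" would treat a later "reject" command as unsupported, since `feature_enabled("reject", enabled)` is false.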
1012 Extensions which define actions MUST state how they interact with 1013 actions discussed in the base specification. 1015 2.10.6. Errors 1017 In any programming language, there are compile-time and run-time 1018 errors. 1020 Compile-time errors are ones in syntax that are detectable if a 1021 syntax check is done. 1023 Run-time errors are not detectable until the script is run. This 1024 includes transient failures like disk full conditions, but also 1025 includes issues like invalid combinations of actions. 1027 When an error occurs in a Sieve script, all processing stops. 1029 Implementations MAY choose to do a full parse, then evaluate the 1030 script, then do all actions. Implementations might even go so far as 1031 to ensure that execution is atomic (either all actions are executed 1032 or none are executed). 1034 Other implementations may choose to parse and run at the same time. 1035 Such implementations are simpler, but have issues with partial 1037 Internet DRAFT Sieve May 8, 2000 1039 failure (some actions happen, others don't). 1041 Implementations might even go so far as to ensure that scripts can 1042 never execute an invalid set of actions (e.g., reject + fileinto) 1043 before execution, although this could involve solving the Halting 1044 Problem. 1046 This specification allows any of these approaches. Solving the 1047 Halting Problem is considered extra credit. 1049 When an error happens, implementations MUST notify the user that an 1050 error occurred, which actions (if any) were taken, and do an implicit 1051 keep. 1053 2.10.7. Limits on Execution 1055 Implementations may limit certain constructs. However, this 1056 specification places a lower bound on some of these limits. 1058 Implementations MUST support fifteen levels of nested blocks. 1060 Implementations MUST support fifteen levels of nested test lists. 1062 3. Control Commands 1064 Control structures are needed to allow for multiple and conditional 1065 actions. 1067 3.1. 
If 1069 There are three pieces to if: "if", "elsif", and "else". Each is 1070 actually a separate command in terms of the grammar. However, an 1071 elsif MUST only follow an if, and an else MUST follow only either an 1072 if or an elsif. An error occurs if these conditions are not met. 1074 Syntax: if 1076 Syntax: elsif 1078 Syntax: else 1080 The semantics are similar to those of any of the many other 1081 programming languages these control commands appear in. When the 1082 interpreter sees an "if", it evaluates the test associated with it. 1083 If the test is true, it executes the block associated with it. 1085 If the test of the "if" is false, it evaluates the test of the first 1086 "elsif" (if any). If the test of "elsif" is true, it runs the 1088 Internet DRAFT Sieve May 8, 2000 1090 elsif's block. An elsif may be followed by an elsif, in which case, 1091 the interpreter repeats this process until it runs out of elsifs. 1093 When the interpreter runs out of elsifs, there may be an "else" case. 1094 If there is, and none of the if or elsif tests were true, the 1095 interpreter runs the else case. 1097 This provides a way of performing exactly one of the blocks in the 1098 chain. 1100 In the following example, both Message A and B are dropped. 1102 Example: require "fileinto"; 1103 if header :contains "from" "coyote" { 1104 discard; 1105 } elsif header :contains ["subject"] ["$$$"] { 1106 discard; 1107 } else { 1108 fileinto "INBOX"; 1109 } 1111 When the script below is run over message A, it redirects the message 1112 to acm@frobnitzm.edu; message B, to postmaster@frobnitzm.edu; any 1113 other message is redirected to field@frobnitzm.edu. 1115 Example: if header :contains ["From"] ["coyote"] { 1116 redirect "acm@frobnitzm.edu"; 1117 } elsif header :contains "Subject" "$$$" { 1118 redirect "postmaster@frobnitzm.edu"; 1119 } else { 1120 redirect "field@frobnitzm.edu"; 1121 } 1123 Note that this definition prohibits the "... else if ..." sequence 1124 used by C. 
This is intentional, because this construct produces a 1125 shift-reduce conflict. 1127 3.2. Control Structure Require 1129 Syntax: require 1131 The require action notes that a script makes use of a certain 1132 extension. Such a declaration is required to use the extension, as 1133 discussed in section 2.10.5. Multiple capabilities can be declared 1134 with a single require. 1136 The require command, if present, MUST be used before anything other 1138 Internet DRAFT Sieve May 8, 2000 1140 than a require can be used. An error occurs if a require appears 1141 after a command other than require. 1143 Example: require ["fileinto", "reject"]; 1145 Example: require "fileinto"; 1146 require "vacation"; 1148 3.3. Control Structure Stop 1150 Syntax: stop 1152 The "stop" action ends all processing. If no actions have been 1153 executed, then the keep action is taken. 1155 4. Action Commands 1157 This document supplies five actions that may be taken on a message: 1158 keep, fileinto, redirect, reject, and discard. 1160 Implementations MUST support the "keep", "discard", and "redirect" 1161 actions. 1163 Implementations SHOULD support "reject" and "fileinto". 1165 Implementations MAY limit the number of certain actions taken (see 1166 section 2.10.4). 1168 4.1. Action reject 1170 Syntax: reject 1172 The optional "reject" action refuses delivery of a message by sending 1173 back an [MDN] to the sender. It resends the message to the sender, 1174 wrapping it in a "reject" form, noting that it was rejected by the 1175 recipient. In the following script, message A is rejected and 1176 returned to the sender. 1178 Example: if header :contains "from" "coyote@znic.net" { 1179 reject "I am not taking mail from you, and I don't want 1180 your birdseed, either!"; 1181 } 1183 Internet DRAFT Sieve May 8, 2000 1185 A reject message MUST take the form of a failure MDN as specified by 1186 [MDN]. 
The human-readable portion of the message, the first
1187  component of the MDN, contains the human-readable message describing
1188  the error, and it SHOULD contain additional text alerting the
1189  original sender that mail was refused by a filter. This part of the
1190  MDN might appear as follows:

1192  ------------------------------------------------------------
1193  Message was refused by recipient's mail filtering program.
1194  Reason given was as follows:

1196  I am not taking mail from you, and I don't want your
1197  birdseed, either!
1198  ------------------------------------------------------------

1200  The MDN action-value field as defined in the MDN specification MUST
1201  be "deleted" and MUST have the MDN-sent-automatically and automatic-
1202  action modes set.

1204  Because some implementations cannot or will not implement the reject
1205  command, it is optional. The capability string to be used with the
1206  require command is "reject".

1208  4.2. Action fileinto

1210  Syntax: fileinto <folder: string>

1212  The "fileinto" action delivers the message into the specified folder.
1213  Implementations SHOULD support fileinto, but in some environments
1214  this may be impossible.

1216  The capability string for use with the require command is "fileinto".

1218  In the following script, message A is filed into folder
1219  "INBOX.harassment".

1221  Example: require "fileinto";
1222           if header :contains ["from"] "coyote" {
1223               fileinto "INBOX.harassment";
1224           }

1226  4.3. Action redirect

1228  Syntax: redirect <address: string>

1230  The "redirect" action is used to send the message to another user at
1231  a supplied address, as a mail forwarding feature does. The
1232  "redirect" action makes no changes to the message body or existing
1236  headers, but it may add new headers. The "redirect" action modifies
1237  the envelope recipient.

1239  The redirect command performs an MTA-style "forward"--that is, what
1240  you get from a .forward file using sendmail under UNIX.
The address 1241 on the SMTP envelope is replaced with the one on the redirect command 1242 and the message is sent back out. (This is not an MUA-style forward, 1243 which creates a new message with a different sender and message ID, 1244 wrapping the old message in a new one.) 1246 A simple script can be used for redirecting all mail: 1248 Example: redirect "bart@example.edu"; 1250 Implementations SHOULD take measures to implement loop control, 1251 possibly including adding headers to the message or counting received 1252 headers. If an implementation detects a loop, it causes an error. 1254 4.4. Action keep 1256 Syntax: keep 1258 The "keep" action is whatever action is taken in lieu of all other 1259 actions, if no filtering happens at all; generally, this simply means 1260 to file the message into the user's main mailbox. This command 1261 provides a way to execute this action without needing to know the 1262 name of the user's main mailbox, providing a way to call it without 1263 needing to understand the user's setup, or the underlying mail 1264 system. 1266 For instance, in an implementation where the IMAP server is running 1267 scripts on behalf of the user at time of delivery, a keep command is 1268 equivalent to a fileinto "INBOX". 1270 Example: if size :under 1M { keep; } else { discard; } 1272 Note that the above script is identical to the one below. 1274 Example: if not size :under 1M { discard; } 1276 4.5. Action discard 1278 Syntax: discard 1280 Discard is used to silently throw away the message. It does so by 1281 simply canceling the implicit keep. If discard is used with other 1282 actions, the other actions still happen. Discard is compatible with 1284 Internet DRAFT Sieve May 8, 2000 1286 all other actions. (For instance fileinto+discard is equivalent to 1287 fileinto.) 1289 Discard MUST be silent; that is, it MUST NOT return a non-delivery 1290 notification of any kind ([DSN], [MDN], or otherwise). 
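
The relationship between discard and the implicit keep (2.10.2) can be sketched in Python. This is a minimal illustration under stated assumptions: `effective_actions` is a hypothetical name, actions are modeled as (name, argument) tuples, and action limits and error handling are left out.

```python
def effective_actions(executed):
    """Return the actions actually carried out for one message.

    `executed` is a list of (name, argument) tuples in script order.
    """
    if not executed:
        return [("keep", None)]  # implicit keep: nothing else was done
    # discard only cancels the implicit keep; it removes nothing else.
    return [action for action in executed if action[0] != "discard"]
```

A script that executes only discard therefore yields no deliveries at all, while fileinto followed by discard yields the same result as fileinto alone.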
1292 In the following script, any mail from "idiot@frobnitzm.edu" is 1293 thrown out. 1295 Example: if header :contains ["from"] ["idiot@frobnitzm.edu"] { 1296 discard; 1297 } 1299 While an important part of this language, "discard" has the potential 1300 to create serious problems for users: Students who leave themselves 1301 logged in to an unattended machine in a public computer lab may find 1302 their script changed to just "discard". In order to protect users in 1303 this situation (along with similar situations), implementations MAY 1304 keep messages destroyed by a script for an indefinite period, and MAY 1305 disallow scripts that throw out all mail. 1307 5. Test Commands 1309 Tests are used in conditionals to decide which part(s) of the 1310 conditional to execute. 1312 Implementations MUST support these tests: "address", "allof", 1313 "anyof", "exists", "false", "header", "not", "size", and "true". 1315 Implementations SHOULD support the "envelope" test. 1317 5.1. Test address 1319 Syntax: address [ADDRESS-PART] [COMPARATOR] [MATCH-TYPE] 1320 1322 The address test matches Internet addresses in structured headers 1323 that contain addresses. It returns true if any header contains any 1324 key in the specified part of the address, as modified by the 1325 comparator and the match keyword. 1327 Like envelope and header, this test returns true if any combination 1328 of the header-list and key-list arguments match. 1330 Internet email addresses [IMAIL] have the somewhat awkward 1331 characteristic that the local-part to the left of the at-sign is 1333 Internet DRAFT Sieve May 8, 2000 1335 considered case sensitive, and the domain-part to the right of the 1336 at-sign is case insensitive. The "address" command does not deal 1337 with this itself, but provides the ADDRESS-PART argument for allowing 1338 users to deal with it. 1340 The address primitive never acts on the phrase part of an email 1341 address, nor on comments within that address. 
It also never acts on
1342  group names, although it does act on the addresses within the group
1343  construct.

1345  Implementations MUST restrict the address test to headers that
1346  contain addresses, but MUST include at least From, To, Cc, Bcc,
1347  Sender, Resent-From, Resent-To, and SHOULD include any other header
1348  that utilizes an "address-list" structured header body.

1350  Example: if address :is :all "from" "tim@example.com" {
1351               discard;
1352           }

1353  5.2. Test allof

1355  Syntax: allof <tests: test-list>

1357  The allof test performs a logical AND on the tests supplied to it.

1359  Example: allof (false, false) => false
1360           allof (false, true)  => false
1361           allof (true, true)   => true

1363  The allof test takes as its argument a test-list.

1365  5.3. Test anyof

1367  Syntax: anyof <tests: test-list>

1369  The anyof test performs a logical OR on the tests supplied to it.

1371  Example: anyof (false, false) => false
1372           anyof (false, true)  => true
1373           anyof (true, true)   => true

1375  5.4. Test envelope

1377  Syntax: envelope [COMPARATOR] [ADDRESS-PART] [MATCH-TYPE]
1378          <envelope-part: string-list> <key-list: string-list>

1380  The "envelope" test is true if the specified part of the SMTP (or
1384  equivalent) envelope matches the specified key.

1386  If one of the envelope-part strings is (case insensitive) "from",
1387  then matching occurs against the FROM address used in the SMTP MAIL
1388  command.

1390  If one of the envelope-part strings is (case insensitive) "to", then
1391  matching occurs against the TO address used in the SMTP RCPT command
1392  that resulted in this message getting delivered to this user. Note
1393  that only the most recent TO is available, and only the one relevant
1394  to this user.

1396  The envelope-part is a string list and may contain more than one
1397  parameter, in which case all of the strings specified in the key-list
1398  are matched against all parts given in the envelope-part list.
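
The "every key against every envelope part" rule can be sketched in Python. This is an illustration under several assumptions, not the specification's algorithm: the names `fold_ascii`, `address_part`, and `envelope_test` are hypothetical, the envelope is modeled as a dict with lower-case "from" and "to" entries whose source routes have already been dropped, and only the ":is" match type with the default "i;ascii-casemap" comparator is shown.

```python
def fold_ascii(text):
    """Lower-case ASCII letters only, approximating "i;ascii-casemap"."""
    return "".join(c.lower() if c.isascii() else c for c in text)

def address_part(address, part=":all"):
    """Select the portion of an address named by an ADDRESS-PART tag."""
    if part == ":all":
        return address
    local, sep, domain = address.rpartition("@")
    if not sep:  # no "@": treat everything as the local-part
        local, domain = address, ""
    if part == ":localpart":
        return local
    if part == ":domain":
        return domain
    raise ValueError("unknown address part: " + part)

def envelope_test(envelope, envelope_parts, keys, part=":all"):
    """True if any listed envelope part equals any key (":is" semantics)."""
    if isinstance(envelope_parts, str):
        envelope_parts = [envelope_parts]  # a single string acts as a list
    if isinstance(keys, str):
        keys = [keys]
    folded_keys = {fold_ascii(key) for key in keys}
    for name in envelope_parts:
        address = envelope.get(name.lower())  # part names are case-insensitive
        if address is None:
            continue
        if fold_ascii(address_part(address, part)) in folded_keys:
            return True
    return False
```

For instance, with an envelope whose "from" is "tim@example.com", `envelope_test(env, "from", "example.com", ":domain")` is true, while the same key tested against ":all" is false.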
1400  Like address and header, this test returns true if any combination of
1401  the envelope-part and key-list arguments matches.

1403  All tests against envelopes MUST drop source routes.

1405  If the SMTP transaction involved several RCPT commands, only the data
1406  from the RCPT command that caused delivery to this user is available
1407  in the "to" part of the envelope.

1409  If a protocol other than SMTP is used for message transport,
1410  implementations are expected to adapt this command appropriately.

1412  The envelope command is optional. Implementations SHOULD support it,
1413  but the necessary information may not be available in all cases.

1415  Example: require "envelope";
1416           if envelope :all :is "from" "tim@example.com" {
1417               discard;
1418           }

1420  5.5. Test exists

1422  Syntax: exists <header-names: string-list>

1424  The "exists" test is true if the headers listed in the header-names
1425  argument exist within the message. All of the headers must exist or
1426  the test is false.

1428  The following example throws out mail that doesn't have a From header
1429  and a Date header.

1433  Example: if not exists ["From","Date"] {
1434               discard;
1435           }

1437  5.6. Test false

1439  Syntax: false

1441  The "false" test always evaluates to false.

1443  5.7. Test header

1445  Syntax: header [COMPARATOR] [MATCH-TYPE]
1446          <header-names: string-list> <key-list: string-list>

1448  The "header" test evaluates to true if any header name matches any
1449  key. The type of match is specified by the optional match argument,
1450  which defaults to ":is" if not specified, as specified in section
1451  2.7.1.

1453  Like address and envelope, this test returns true if any combination
1454  of the string-list and key-list arguments match.

1456  If a header listed in the header-names argument exists, it contains
1457  the null key (""). However, if the named header is not present, it
1458  does not contain the null key.
So if a message contained the header

1460          X-Caffeine: C8H10N4O2

1462  these tests on that header evaluate as follows:

1464          header :is ["X-Caffeine"] [""]         => false
1465          header :contains ["X-Caffeine"] [""]   => true

1467  5.8. Test not

1469  Syntax: not <test>

1471  The "not" test takes some other test as an argument, and yields the
1472  opposite result. "not false" evaluates to "true" and "not true"
1473  evaluates to "false".

1475  5.9. Test size

1477  Syntax: size <":over" / ":under"> <limit: number>

1479  The "size" test deals with the size of a message. It takes either a
1480  tagged argument of ":over" or ":under", followed by a number
1484  representing the size of the message.

1486  If the argument is ":over", and the size of the message is greater
1487  than the number provided, the test is true; otherwise, it is false.

1489  If the argument is ":under", and the size of the message is less than
1490  the number provided, the test is true; otherwise, it is false.

1492  Exactly one of ":over" or ":under" must be specified, and anything
1493  else is an error.

1495  The size of a message is defined to be the number of octets from the
1496  initial header until the last character in the message body.

1498  Note that for a message that is exactly 4000 octets, the message is
1499  neither ":over" 4000 octets nor ":under" 4000 octets.

1501  5.10. Test true

1503  Syntax: true

1505  The "true" test always evaluates to true.

1507  6. Extensibility

1509  New control structures, actions, and tests can be added to the
1510  language. Sites must make these features known to their users; this
1511  document does not define a way to discover the list of extensions
1512  supported by the server.

1514  Any extensions to this language MUST define a capability string that
1515  uniquely identifies that extension. If a new version of an extension
1516  changes the functionality of a previously defined extension, it MUST
1517  use a different name.
1519 In a situation where there is a submission protocol and an extension 1520 advertisement mechanism aware of the details of this language, 1521 scripts submitted can be checked against the mail server to prevent 1522 use of an extension that the server does not support. 1524 Extensions MUST state how they interact with constraints defined in 1525 section 2.10, e.g., whether they cancel the implicit keep, and which 1526 actions they are compatible and incompatible with. 1528 6.1. Capability String 1530 Capability strings are typically short strings describing what 1531 capabilities are supported by the server. 1533 Internet DRAFT Sieve May 8, 2000 1535 Capability strings beginning with "vnd." represent vendor-defined 1536 extensions. Such extensions are not defined by Internet standards or 1537 RFCs, but are still registered with IANA in order to prevent 1538 conflicts. Extensions starting with "vnd." SHOULD be followed by the 1539 name of the vendor and product, such as "vnd.acme.rocket-sled". 1541 The following capability strings are defined by this document: 1543 envelope The string "envelope" indicates that the implementation 1544 supports the "envelope" command. 1546 fileinto The string "fileinto" indicates that the implementation 1547 supports the "fileinto" command. 1549 reject The string "reject" indicates that the implementation 1550 supports the "reject" command. 1552 comparator- The string "comparator-elbonia" is provided if the 1553 implementation supports the "elbonia" comparator. 1554 Therefore, all implementations have at least the 1555 "comparator-i;octet" and "comparator-i;ascii-casemap" 1556 capabilities. However, these comparators may be used 1557 without being declared with require. 1559 6.2. IANA Considerations 1561 In order to provide a standard set of extensions, a registry is 1562 provided by IANA. Capability names may be registered on a first- 1563 come, first-served basis. 
Extensions designed for interoperable use 1564 SHOULD be defined as standards track or IESG-approved experimental 1565 RFCs. 1567 To: XXX@XXX.XXX 1568 Subject: Registration of new Sieve extension 1570 Capability name: 1571 Capability keyword: 1572 Capability arguments: 1573 Standards Track/IESG-approved experimental RFC number: 1574 Person and email address to contact for further information: 1576 6.3. Capability Transport 1578 Because the range of mail systems to which this draft applies is 1579 quite varied, it is difficult to specify a method of advertising 1580 which capabilities an implementation supports. 1581 Such a mechanism, however, should have the 1582 property that the implementation can advertise the complete set of 1586 extensions that it supports. 1588 7. Transmission 1590 The MIME type for a Sieve script is "application/sieve". 1592 The registration of this type for RFC 2048 requirements is as 1593 follows: 1595 Subject: Registration of MIME media type application/sieve 1597 MIME media type name: application 1598 MIME subtype name: sieve 1599 Required parameters: none 1600 Optional parameters: none 1601 Encoding considerations: Most Sieve scripts will be textual, 1602 written in UTF-8. When non-7bit characters are used, 1603 quoted-printable is appropriate for transport systems 1604 that require 7bit encoding. 1605 Security considerations: Discussed in section 10 of RFC XXXX. 1606 Interoperability considerations: Discussed in section 2.10.5 1607 of RFC XXXX. 1608 Published specification: RFC XXXX. 1609 Applications which use this media type: Sieve-enabled mail servers 1610 Additional information: 1611 Magic number(s): 1612 File extension(s): .siv 1613 Macintosh File Type Code(s): 1614 Person & email address to contact for further information: 1615 See the discussion list at ietf-mta-filters@imc.org.
1616 Intended usage: 1617 COMMON 1618 Author/Change controller: 1619 See Author information in RFC XXXX. 1621 8. Parsing 1623 As in most programming languages, the Sieve grammar is separated into 1624 lexical tokens and a separate grammar over those tokens. 1626 8.1. Lexical Tokens 1628 Sieve scripts are encoded in UTF-8. The following assumes a valid 1629 UTF-8 encoding; special characters in Sieve scripts are all ASCII. 1633 The following are tokens in Sieve: 1634 - identifiers 1635 - tags 1636 - numbers 1637 - quoted strings 1638 - multi-line strings 1639 - other separators 1641 Blanks, horizontal tabs, CRLFs, and comments ("white space") are 1642 ignored except as they separate tokens. Some white space is required 1643 to separate otherwise adjacent tokens and in specific places in the 1644 multi-line strings. 1646 The other separators are single individual characters, and are 1647 mentioned explicitly in the grammar. 1649 The lexical structure of Sieve is defined in the following ABNF (as 1650 described in [ABNF]): 1652 bracket-comment = "/*" *(CHAR-NOT-STAR / ("*" CHAR-NOT-SLASH)) "*/" 1653 ;; No */ allowed inside a comment. 1654 ;; (No * is allowed unless it is the last character, 1655 ;; or unless it is followed by a character that isn't a slash.)
1657 CHAR-NOT-DOT = (%x01-09 / %x0b-0c / %x0e-2d / %x2f-ff) 1658 ;; no dots, no CRLFs 1660 CHAR-NOT-CRLF = (%x01-09 / %x0b-0c / %x0e-ff) 1662 CHAR-NOT-SLASH = (%x00-2e / %x30-ff) 1664 CHAR-NOT-STAR = (%x00-29 / %x2b-ff) 1666 comment = bracket-comment / hash-comment 1668 hash-comment = ( "#" *CHAR-NOT-CRLF CRLF ) 1670 identifier = (ALPHA / "_") *(ALPHA / DIGIT / "_") 1672 tag = ":" identifier 1674 number = 1*DIGIT [QUANTIFIER] 1676 QUANTIFIER = "K" / "M" / "G" 1680 quoted-string = DQUOTE *CHAR DQUOTE 1681 ;; in general, \ CHAR inside a string maps to CHAR 1682 ;; so \" maps to " and \\ maps to \ 1683 ;; note that newlines and other characters are all allowed in strings 1685 multi-line = "text:" *(SP / HTAB) (hash-comment / CRLF) 1686 *(multi-line-literal / multi-line-dotstuff) 1687 "." CRLF 1688 multi-line-literal = [CHAR-NOT-DOT *CHAR-NOT-CRLF] CRLF 1689 multi-line-dotstuff = "." 1*CHAR-NOT-CRLF CRLF 1690 ;; A line containing only "." ends the multi-line. 1691 ;; Remove a leading `.' if followed by another '.'. 1693 white-space = 1*(SP / CRLF / HTAB) / comment 1695 8.2. Grammar 1697 The following is the grammar of Sieve after it has been lexically 1698 interpreted. No white space or comments appear below. The start 1699 symbol is "start". 1701 argument = string-list / number / tag 1703 arguments = *argument [test / test-list] 1705 block = "{" commands "}" 1707 command = identifier arguments ( ";" / block ) 1709 commands = *command 1711 start = commands 1713 string = quoted-string / multi-line 1715 string-list = "[" string *("," string) "]" / string ;; if 1716 there is only a single string, the brackets are optional 1718 test = identifier arguments 1720 test-list = "(" test *("," test) ")" 1724 9. Extended Example 1726 The following is an extended example of a Sieve script. Note that it 1727 does not make use of the implicit keep.
1729 # 1730 # Example Sieve Filter 1731 # Declare any optional features or extensions used by the script 1732 # 1733 require ["fileinto", "reject"]; 1735 # 1736 # Reject any large messages (note that the four leading dots get "stuffed" 1737 # to three) 1738 # 1739 if size :over 1M 1740 { 1741 reject text: 1742 Please do not send me large attachments. 1743 Put your file on a server and send me the URL. 1744 Thank you. 1745 .... Fred 1746 . 1747 ; 1748 stop; 1749 } 1751 # 1752 # Handle messages from known mailing lists 1753 # Move messages from IETF filter discussion list to filter folder 1754 # 1755 if header :is "Sender" "owner-ietf-mta-filters@imc.org" 1756 { 1757 fileinto "filter"; # move to "filter" folder 1758 } 1759 # 1760 # Keep all messages to or from people in my company 1761 # 1762 elsif address :domain :is ["From", "To"] "company.com" 1763 { 1764 keep; # keep in "In" folder 1765 } 1769 # 1770 # Try and catch unsolicited email. If a message is not to me, 1771 # or it contains a subject known to be spam, file it away. 1772 # 1773 elsif anyof (not address :all :contains 1774 ["To", "Cc", "Bcc"] "me@company.com", 1775 header :matches "subject" 1776 ["*make*money*fast*", "*university*dipl*mas*"]) 1777 { 1778 # If message header does not contain my address, 1779 # it's from a list. 1780 fileinto "spam"; # move to "spam" folder 1781 } 1782 else 1783 { 1784 # Move all other (non-company) mail to "personal" 1785 # folder. 1786 fileinto "personal"; 1787 } 1789 10. Security Considerations 1791 Users must get their mail. It is imperative that whatever method 1792 implementations use to store the user-defined filtering scripts be 1793 secure. 1795 It is equally important that implementations sanity-check the user's 1796 scripts, and not allow users to create on-demand mailbombs.
For 1797 instance, an implementation that allows a user to reject or redirect 1798 a single message multiple times might also allow a user to create 1799 a mailbomb triggered by mail from a specific user. Site- or 1800 implementation-defined limits on actions are useful for this. 1802 Several commands, such as "discard", "redirect", and "fileinto", allow 1803 for actions to be taken that are potentially very dangerous. 1805 Implementations SHOULD take measures to prevent messages from 1806 looping. 1808 11. Acknowledgments 1810 I am very thankful to Chris Newman for his support and his ABNF 1811 syntax checker, to John Myers and Steve Hole for outlining the 1812 requirements for the original drafts, to Larry Greenfield for nagging 1813 me about the grammar and finally fixing it, to Greg Sereda for 1814 repeatedly fixing and providing examples, to Ned Freed for fixing 1815 everything else, to Rob Earhart for an early implementation and a 1816 great deal of help, and to Randall Gellens for endless amounts of 1820 proofreading. I am grateful to Carnegie Mellon University where most 1821 of the work on this document was done. I am also indebted to all of 1822 the readers of the ietf-mta-filters@imc.org mailing list. 1824 12. Author's Address 1826 Tim Showalter 1827 Mirapoint, Inc. 1828 Two Results Way, Suite 100 1829 Cupertino, CA 95014 1831 E-Mail: tjs@mirapoint.com 1835 Appendix A. References 1837 [ABNF] Crocker, D., and P. Overell, "Augmented BNF for Syntax 1838 Specifications: ABNF", Internet Mail Consortium, RFC 2234, November 1839 1997. 1841 [ACAP] Newman, C., and J. G. Myers, "ACAP -- Application 1842 Configuration Access Protocol", RFC 2244, Innosoft and Netscape, 1843 November 1997. 1845 [BINARY-SI] "Standard IEC 60027-2: Letter symbols to be used in 1846 electrical technology - Part 2: Telecommunications and electronics", 1847 January 1999. 1849 [DSN] Moore, K., and G.
Vaudreuil, "An Extensible Message Format for 1850 Delivery Status Notifications", RFC 1894, January 1996. 1852 [FLAMES] Borenstein, N., and C. Thyberg, "Power, Ease of Use, and 1853 Cooperative Work in a Practical Multimedia Message System", Int. J. 1854 of Man-Machine Studies, April 1991. Reprinted in Computer-Supported 1855 Cooperative Work and Groupware, Saul Greenberg, editor, Harcourt 1856 Brace Jovanovich, 1991. Reprinted in Readings in Groupware and 1857 Computer-Supported Cooperative Work, Ronald Baecker, editor, Morgan 1858 Kaufmann, 1993. 1860 [KEYWORDS] Bradner, S., "Key words for use in RFCs to Indicate 1861 Requirement Levels", RFC 2119, Harvard University, March 1997. 1863 [IMAP] Crispin, M., "Internet Message Access Protocol - version 1864 4rev1", RFC 2060, University of Washington, December 1996. 1866 [IMAIL] Crocker, D., "Standard for the Format of ARPA Internet Text 1867 Messages", STD 11, RFC 822, University of Delaware, August 1982. 1869 [MIME] Freed, N., and N. Borenstein, "Multipurpose Internet Mail 1870 Extensions (MIME) Part One: Format of Internet Message Bodies", RFC 1871 2045, Innosoft and First Virtual, November 1996. 1873 [MDN] Fajman, R., "An Extensible Message Format for Message 1874 Disposition Notifications", RFC 2298, March 1998. 1876 [SMTP] Postel, J., "Simple Mail Transfer Protocol", STD 10, RFC 821, 1877 USC/Information Sciences Institute, August 1982. 1879 [UTF-8] Yergeau, F., "UTF-8, a transformation format of Unicode and 1880 ISO 10646", RFC 2044, Alis Technologies, October 1996. 1884 Appendix B. Full Copyright Statement 1886 Copyright (C) The Internet Society 2000. All Rights Reserved.
1888 This document and translations of it may be copied and furnished to 1889 others, and derivative works that comment on or otherwise explain it 1890 or assist in its implementation may be prepared, copied, published 1891 and distributed, in whole or in part, without restriction of any 1892 kind, provided that the above copyright notice and this paragraph are 1893 included on all such copies and derivative works. However, this 1894 document itself may not be modified in any way, such as by removing 1895 the copyright notice or references to the Internet Society or other 1896 Internet organizations, except as needed for the purpose of 1897 developing Internet standards in which case the procedures for 1898 copyrights defined in the Internet Standards process must be 1899 followed, or as required to translate it into languages other than 1900 English. 1902 The limited permissions granted above are perpetual and will not be 1903 revoked by the Internet Society or its successors or assigns. 1905 This document and the information contained herein is provided on an 1906 "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING 1907 TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING 1908 BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION 1909 HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF 1910 MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.