Network Working Group                                        P. Guenther
Internet-Draft                                            Sendmail, Inc.
Expires: January 2007                                       T. Showalter
Obsoletes: 3028 (if approved)                                    Editors
                                                               July 2006

                   Sieve: An Email Filtering Language
                    draft-ietf-sieve-3028bis-08.txt

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   A revised version of this draft document will be submitted to the RFC
   editor as a Standards Track RFC for the Internet Community.
   Discussion and suggestions for improvement are requested, and should
   be sent to ietf-mta-filters@imc.org. Distribution of this memo is
   unlimited.

Copyright Notice

   Copyright (C) The Internet Society (2006).

Abstract

   This document describes a language for filtering email messages at
   time of final delivery.
   It is designed to be implementable on either a mail client or mail
   server. It is meant to be extensible, simple, and independent of
   access protocol, mail architecture, and operating system. It is
   suitable for running on a mail server where users may not be allowed
   to execute arbitrary programs, such as on black box Internet Message
   Access Protocol (IMAP) servers, as it has no variables, loops, or
   ability to shell out to external programs.

Table of Contents

   1.        Introduction
   1.1.      Conventions Used in This Document
   1.2.      Example mail messages
   2.        Design
   2.1.      Form of the Language
   2.2.      Whitespace
   2.3.      Comments
   2.4.      Literal Data
   2.4.1.    Numbers
   2.4.2.    Strings
   2.4.2.1.  String Lists
   2.4.2.2.  Headers
   2.4.2.3.  Addresses
   2.5.      Tests
   2.5.1.    Test Lists
   2.6.      Arguments
   2.6.1.    Positional Arguments
   2.6.2.    Tagged Arguments
   2.6.3.    Optional Arguments
   2.6.4.    Types of Arguments
   2.7.      String Comparison
   2.7.1.    Match Type
   2.7.2.    Comparisons Across Character Sets
   2.7.3.    Comparators
   2.7.4.    Comparisons Against Addresses
   2.8.      Blocks
   2.9.      Commands
   2.10.     Evaluation
   2.10.1.   Action Interaction
   2.10.2.   Implicit Keep
   2.10.3.   Message Uniqueness in a Mailbox
   2.10.4.   Limits on Numbers of Actions
   2.10.5.   Extensions and Optional Features
   2.10.6.   Errors
   2.10.7.   Limits on Execution
   3.        Control Commands
   3.1.      Control If
   3.2.      Control Require
   3.3.      Control Stop
   4.        Action Commands
   4.1.      Action fileinto
   4.2.      Action redirect
   4.3.      Action keep
   4.4.      Action discard
   5.        Test Commands
   5.1.      Test address
   5.2.      Test allof
   5.3.      Test anyof
   5.4.      Test envelope
   5.5.      Test exists
   5.6.      Test false
   5.7.      Test header
   5.8.      Test not
   5.9.      Test size
   5.10.     Test true
   6.        Extensibility
   6.1.      Capability String
   6.2.      IANA Considerations
   6.2.1.    Template for Capability Registrations
   6.2.2.    Handling of Existing Capability Registrations
   6.2.3.    Initial Capability Registrations
   6.3.      Capability Transport
   7.        Transmission
   8.        Parsing
   8.1.      Lexical Tokens
   8.2.      Grammar
   9.        Extended Example
   10.       Security Considerations
   11.       Acknowledgments
   12.       Editor's Address
   13.       Normative References
   14.       Informative References
   15.       Changes from RFC 3028
   16.       Full Copyright Statement

1. Introduction

   This memo documents a language that can be used to create filters for
   electronic mail. It is not tied to any particular operating system
   or mail architecture. It requires the use of [IMAIL]-compliant
   messages, but should otherwise generalize to many systems.

   The language is powerful enough to be useful but limited in order to
   allow for a safe server-side filtering system.
   The intention is to make it impossible for users to do anything more
   complex (and dangerous) than write simple mail filters, along with
   facilitating the use of GUIs for filter creation and manipulation.
   The language is not Turing-complete: it provides no way to write a
   loop or a function and variables are not provided.

   Scripts written in Sieve are executed during final delivery, when the
   message is moved to the user-accessible mailbox. In systems where
   the MTA does final delivery, such as traditional Unix mail, it is
   reasonable to sort when the MTA deposits mail into the user's
   mailbox.

   There are a number of reasons to use a filtering system. Mail
   traffic for most users has been increasing due to increased usage of
   email, the emergence of unsolicited email as a form of advertising,
   and increased usage of mailing lists.

   Experience at Carnegie Mellon has shown that if a filtering system is
   made available to users, many will make use of it in order to file
   messages from specific users or mailing lists. However, many others
   did not make use of the Andrew system's FLAMES filtering language
   [FLAMES] due to difficulty in setting it up.

   Because of the expectation that users will make use of filtering if
   it is offered and easy to use, this language has been made simple
   enough to allow many users to make use of it, but rich enough that it
   can be used productively. However, it is expected that GUI-based
   editors will be the preferred way of editing filters for a large
   number of users.

1.1. Conventions Used in This Document

   In the sections of this document that discuss the requirements of
   various keywords and operators, the following conventions have been
   adopted.

   The key words "MUST", "MUST NOT", "SHOULD", "SHOULD NOT", and "MAY"
   in this document are to be interpreted as described in [KEYWORDS].

   Each section on a command (test, action, or control) has a line
   labeled "Usage:". This line describes the usage of the command,
   including its name and its arguments. Required arguments are listed
   inside angle brackets ("<" and ">"). Optional arguments are listed
   inside square brackets ("[" and "]"). Each argument is followed by
   its type, so "<key: string>" represents an argument called "key" that
   is a string. Literal strings are represented with double-quoted
   strings. Alternatives are separated with slashes, and parentheses
   are used for grouping, similar to [ABNF].

   In the "Usage:" line, there are three special pieces of syntax that
   are frequently repeated, MATCH-TYPE, COMPARATOR, and ADDRESS-PART.
   These are discussed in sections 2.7.1, 2.7.3, and 2.7.4,
   respectively.

   The formal grammar for these commands is defined in section 8 and is
   the authoritative reference on how to construct commands, but the
   formal grammar does not specify the order, semantics, number or types
   of arguments to commands, nor the legal command names. The intent is
   to allow for extension without changing the grammar.

1.2. Example mail messages

   The following mail messages will be used throughout this document in
   examples.

   Message A
   -----------------------------------------------------------
   Date: Tue, 1 Apr 1997 09:06:31 -0800 (PST)
   From: coyote@desert.example.org
   To: roadrunner@acme.example.com
   Subject: I have a present for you

   Look, I'm sorry about the whole anvil thing, and I really
   didn't mean to try and drop it on you from the top of the
   cliff. I want to try to make it up to you. I've got some
   great birdseed over here at my place--top of the line
   stuff--and if you come by, I'll have it all wrapped up
   for you. I'm really sorry for all the problems I've caused
   for you over the years, but I know we can work this out.
   --
   Wile E. Coyote   "Super Genius"   coyote@desert.example.org
   -----------------------------------------------------------

   Message B
   -----------------------------------------------------------
   From: youcouldberich!@reply-by-postal-mail.invalid
   Sender: b1ff@de.res.example.com
   To: rube@landru.example.edu
   Date: Mon, 31 Mar 1997 18:26:10 -0800
   Subject: $$$ YOU, TOO, CAN BE A MILLIONAIRE! $$$

   YOU MAY HAVE ALREADY WON TEN MILLION DOLLARS, BUT I DOUBT
   IT! SO JUST POST THIS TO SIX HUNDRED NEWSGROUPS! IT WILL
   GUARANTEE THAT YOU GET AT LEAST FIVE RESPONSES WITH MONEY!
   MONEY! MONEY! COLD HARD CASH! YOU WILL RECEIVE OVER
   $20,000 IN LESS THAN TWO MONTHS! AND IT'S LEGAL!!!!!!!!!
   !!!!!!!!!!!!!!!!!!111111111!!!!!!!11111111111!!1 JUST
   SEND $5 IN SMALL, UNMARKED BILLS TO THE ADDRESSES BELOW!
   -----------------------------------------------------------

2. Design

2.1. Form of the Language

   The language consists of a set of commands. Each command consists of
   a set of tokens delimited by whitespace. The command identifier is
   the first token and it is followed by zero or more argument tokens.
   Arguments may be literal data, tags, blocks of commands, or test
   commands.

   With the exceptions of strings and comments, the language is limited
   to US-ASCII characters. Strings and comments may contain octets
   outside the US-ASCII range. Specifically, they will normally be in
   UTF-8, as specified in [UTF-8]. NUL (US-ASCII 0) is never permitted
   in scripts, while CR and LF can only appear as the CRLF line ending.

   Tokens other than strings are considered case-insensitive.

2.2. Whitespace

   Whitespace is used to separate tokens. Whitespace is made up of
   tabs, newlines (CRLF, never just CR or LF), and the space character.
   The amount of whitespace used is not significant.

2.3. Comments

   Two types of comments are offered.
   Comments are semantically equivalent to whitespace and can be used
   anyplace that whitespace is (with one exception in multi-line
   strings, as described in the grammar).

   Hash comments begin with a "#" character that is not contained within
   a string and continue until the next CRLF.

   Example:  if size :over 100K { # this is a comment
                discard;
             }

   Bracketed comments begin with the token "/*" and end with "*/"
   outside of a string. Bracketed comments may span multiple lines.
   Bracketed comments do not nest.

   Example:  if size :over 100K { /* this is a comment
                this is still a comment */ discard /* this is a comment
                */ ;
             }

2.4. Literal Data

   Literal data means data that is not executed, merely evaluated "as
   is", to be used as arguments to commands. Literal data is limited to
   numbers and strings.

2.4.1. Numbers

   Numbers are given as ordinary decimal numbers. However, those
   numbers that have a tendency to be fairly large, such as message
   sizes, MAY have a "K", "M", or "G" appended to indicate a multiple of
   a power of two. To be comparable with the power-of-two-based
   versions of SI units that computers frequently use, K specifies
   kibi-, or 1,024 (2^10) times the value of the number; M specifies
   mebi-, or 1,048,576 (2^20) times the value of the number; and G
   specifies gibi-, or 1,073,741,824 (2^30) times the value of the
   number [BINARY-SI].

   Implementations MUST provide 31 bits of magnitude in numbers, but MAY
   provide more.

   Only positive integers are permitted by this specification.

2.4.2. Strings

   Scripts involve large numbers of strings as they are used for pattern
   matching, addresses, textual bodies, etc. Typically, short quoted
   strings suffice for most uses, but a more convenient form is provided
   for longer strings such as bodies of messages.

   A quoted string starts and ends with a single double quote (the <">
   character, US-ASCII 34). A backslash ("\", US-ASCII 92) inside of a
   quoted string is followed by either another backslash or a double
   quote. This two-character sequence represents a single backslash or
   double quote within the string, respectively.

   Scripts SHOULD NOT escape other characters with a backslash.

   An undefined escape sequence (such as "\a" in a context where "a" has
   no special meaning) is interpreted as if there were no backslash (in
   this case, "\a" is just "a"), though that may be changed by
   extensions.

   Non-printing characters such as tabs, CRLF, and control characters
   are permitted in quoted strings. Quoted strings MAY span multiple
   lines. NUL (US-ASCII 0) is not allowed in strings.

   As message header data is converted to [UTF-8] for comparison (see
   section 2.7.2), most strings will use the UTF-8 encoding. However,
   implementations MUST accept all strings that match the grammar in
   section 8. The ability to use non-UTF-8 encoded strings matches
   existing practice and has proven to be useful both in tests for
   invalid data and in arguments containing raw MIME parts for extension
   actions that generate outgoing messages.

   For entering larger amounts of text, such as an email message, a
   multi-line form is allowed. It starts with the keyword "text:",
   followed by a CRLF, and ends with the sequence of a CRLF, a single
   period, and another CRLF. In order to allow the message to contain
   lines with a single dot, lines are dot-stuffed. That is, when
   composing a message body, an extra `.' is added before each line
   which begins with a `.'. When the server interprets the script,
   these extra dots are removed. Note that a line that begins with a
   dot followed by a non-dot character is not interpreted as
   dot-stuffed; that is, ".foo" is interpreted as ".foo".
   However, because this is potentially ambiguous, scripts SHOULD be
   properly dot-stuffed so such lines do not appear.

   Note that a hashed comment or whitespace may occur in between the
   "text:" and the CRLF, but not within the string itself. Bracketed
   comments are not allowed here.

2.4.2.1. String Lists

   When matching patterns, it is frequently convenient to match against
   groups of strings instead of single strings. For this reason, a list
   of strings is allowed in many tests, implying that if the test is
   true using any one of the strings, then the test is true.
   Implementations are encouraged to use short-circuit evaluation in
   these cases.

   For instance, the test `header :contains ["To", "Cc"]
   ["me@example.com", "me00@landru.example.edu"]' is true if either a To
   header or Cc header of the input message contains either of the email
   addresses "me@example.com" or "me00@landru.example.edu".

   Conversely, in any case where a list of strings is appropriate, a
   single string is allowed without being a member of a list: it is
   equivalent to a list with a single member. This means that the test
   `exists "To"' is equivalent to the test `exists ["To"]'.

2.4.2.2. Headers

   Headers are a subset of strings. In the Internet Message
   Specification [IMAIL], each header line is allowed to have whitespace
   nearly anywhere in the line, including after the field name and
   before the subsequent colon. Extra spaces between the header name
   and the ":" in a header field are ignored.

   A header name never contains a colon. The "From" header refers to a
   line beginning "From:" (or "From :", etc.). No header will match
   the string "From:" due to the trailing colon.

   Similarly, syntactically invalid header names cause the same result
   as syntactically valid header names that are not present in the
   message.
   In particular, an implementation MUST NOT cause an error for
   syntactically invalid header names in tests.

   Header lines are unfolded as described in [IMAIL] section 2.2.3.
   Interpretation of header data SHOULD be done according to [MIME3]
   section 6.2 (see 2.7.2 below for details).

2.4.2.3. Addresses

   A number of commands call for email addresses, which are also a
   subset of strings. When these addresses are used in outbound
   contexts, addresses must be compliant with [IMAIL], but are further
   constrained. Using the symbols defined in [IMAIL], section 3, the
   syntax of an address is:

   sieve-address = addr-spec                  ; simple address
                 / phrase "<" addr-spec ">"   ; name & addr-spec

   That is, routes and group syntax are not permitted. If multiple
   addresses are required, use a string list. Named groups are not used
   here.

   Implementations MUST ensure that the addresses are syntactically
   valid, but need not ensure that they actually identify an email
   recipient.

2.5. Tests

   Tests are given as arguments to commands in order to control their
   actions. In this document, tests are given to if/elsif/else to
   decide which block of code is run.

2.5.1. Test Lists

   Some tests ("allof" and "anyof", which implement logical "and" and
   logical "or", respectively) may require more than a single test as an
   argument. The test-list syntax element provides a way of grouping
   tests.

   Example:  if anyof (not exists ["From", "Date"],
                   header :contains "from" "fool@example.edu") {
                discard;
             }

2.6. Arguments

   In order to specify what to do, most commands take arguments. There
   are three types of arguments: positional, tagged, and optional.

   It is an error for a script, on a single command, to use conflicting
   arguments or to use a tagged or optional argument more than once.

2.6.1. Positional Arguments

   Positional arguments are given to a command which discerns their
   meaning based on their order. When a command takes positional
   arguments, all positional arguments must be supplied and must be in
   the order prescribed.

2.6.2. Tagged Arguments

   This document provides for tagged arguments in the style of
   CommonLISP. These are also similar to flags given to commands in
   most command-line systems.

   A tagged argument is an argument for a command that begins with ":"
   followed by a tag naming the argument, such as ":contains". This
   argument means that zero or more of the next tokens have some
   particular meaning depending on the argument. These next tokens may
   be numbers or strings but they are never blocks.

   Tagged arguments are similar to positional arguments, except that
   instead of the meaning being derived from the command, it is derived
   from the tag.

   Tagged arguments must appear before positional arguments, but they
   may appear in any order with other tagged arguments. For simplicity
   of the specification, this is not expressed in the syntax definitions
   with commands, but they still may be reordered arbitrarily provided
   they appear before positional arguments. Tagged arguments may be
   mixed with optional arguments.

   To simplify this specification, tagged arguments SHOULD NOT take
   tagged arguments as arguments.

2.6.3. Optional Arguments

   Optional arguments are exactly like tagged arguments except that they
   may be left out, in which case a default value is implied. Because
   optional arguments tend to result in shorter scripts, they have been
   used far more than tagged arguments.

   One particularly noteworthy case is the ":comparator" argument, which
   allows the user to specify which comparator [COLLATION] will be used
   to compare two strings, since different languages may impose
   different orderings on UTF-8 [UTF-8] characters.
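
   The ordering and uniqueness rules of sections 2.6 through 2.6.3 can
   be sketched in Python as follows. This is an illustration only, not
   part of the specification: the (kind, value) token representation,
   the TAG_ARITY table, and the function name split_arguments are all
   assumptions made for this example. The sketch also treats a tag used
   as the argument of another tag as an error, which is stricter than
   the SHOULD NOT above, and it does not detect conflicting tags such
   as ":is" together with ":contains".

```python
# Hypothetical table: how many following tokens each tag consumes.
TAG_ARITY = {":is": 0, ":contains": 0, ":matches": 0, ":comparator": 1}


def split_arguments(tokens):
    """Split a command's argument tokens into tagged and positional parts.

    Each token is a (kind, value) pair, where kind is "tag", "string",
    or "number".  Raises ValueError when a tag is unknown, is repeated,
    lacks its argument, or appears after a positional argument.
    """
    tagged = {}
    i = 0
    # Tagged arguments come first, in any order among themselves.
    while i < len(tokens) and tokens[i][0] == "tag":
        name = tokens[i][1]
        if name not in TAG_ARITY:
            raise ValueError("unknown tag: " + name)
        if name in tagged:
            raise ValueError("tag used more than once: " + name)
        arity = TAG_ARITY[name]
        params = tokens[i + 1:i + 1 + arity]
        if len(params) < arity or any(kind == "tag" for kind, _ in params):
            raise ValueError("missing argument for " + name)
        tagged[name] = [value for _, value in params]
        i += 1 + arity
    # Everything that remains must be positional.
    positional = tokens[i:]
    if any(kind == "tag" for kind, _ in positional):
        raise ValueError("tagged argument after positional argument")
    return tagged, [value for _, value in positional]
```

   For the arguments of `header :contains :comparator "i;octet"
   "Subject" "MAKE MONEY FAST"', this sketch yields the two tags (with
   "i;octet" attached to ":comparator") and the two positional strings
   in order.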

2.6.4. Types of Arguments

   Abstractly, arguments may be literal data, tests, or blocks of
   commands. In this way, an "if" control structure is merely a command
   that happens to take a test and a block as arguments and may execute
   the block of code.

   However, this abstraction is ambiguous from a parsing standpoint.
   The grammar in section 8.2 presents a parsable version of this:
   Arguments are string-lists, numbers, and tags, which may be followed
   by a test or a test-list, which may be followed by a block of
   commands. No more than one test or test list, nor more than one
   block of commands, may be used, and commands that end with a block of
   commands do not end with semicolons.

2.7. String Comparison

   When matching one string against another, there are a number of ways
   of performing the match operation. These are accomplished with three
   types of matches: an exact match, a substring match, and a wildcard
   glob-style match. These are described below.

   In order to provide for matches between character sets and case
   insensitivity, Sieve uses the comparators defined in the Internet
   Application Protocol Collation Registry [COLLATION].

   However, when a string represents the name of a header, the
   comparator is never user-specified. Header comparisons are always
   done with the "i;ascii-casemap" operator, i.e., case-insensitive
   comparisons, because this is the way things are defined in the
   message specification [IMAIL].

2.7.1. Match Type

   There are three match types describing the matching used in this
   specification: ":is", ":contains", and ":matches". Match type
   arguments are supplied to those commands which allow them to specify
   what kind of match is to be performed.

   These are used as optional arguments to tests that perform string
   comparison.

   The ":contains" match type describes a substring match.
   If the value argument contains the key argument as a substring, the
   match is true. For instance, the string "frobnitzm" contains "frob"
   and "nit", but not "fbm". The empty key ("") is contained in all
   values.

   The ":is" match type describes an absolute match; if the contents of
   the first string are absolutely the same as the contents of the
   second string, they match. Only the string "frobnitzm" is the string
   "frobnitzm". The empty key ":is" and only ":is" the empty value.

   The ":matches" match type specifies a wildcard match using the
   characters "*" and "?"; the entire value must be matched. "*"
   matches zero or more characters in the value and "?" matches a single
   character in the value, where the comparator that is used (see 2.7.3)
   defines what a character is. For example, the comparators "i;octet"
   and "i;ascii-casemap" define a character to be a single octet so "?"
   will always match exactly one octet when one of those comparators is
   in use. In contrast, the comparator "i;basic;uca=3.1.1;uv=3.2"
   defines a character to be any UTF-8 octet sequence encoding one
   Unicode character and thus "?" may match more than one octet. "?"
   and "*" may be escaped as "\\?" and "\\*" in strings to match against
   themselves. The first backslash escapes the second backslash;
   together, they escape the "*". This is awkward, but it is
   commonplace in several programming languages that use globs and
   regular expressions.

   In order to specify what type of match is supposed to happen,
   commands that support matching take optional arguments ":matches",
   ":is", and ":contains". Commands default to using ":is" matching if
   no match type argument is supplied. Note that these modifiers
   interact with comparators; in particular, only comparators that
   support the "substring match" operation are suitable for matching
   with ":contains" or ":matches".
   It is an error to use a comparator with ":contains" or ":matches"
   that is not compatible with it.

   It is an error to give more than one of these arguments to a given
   command.

   For convenience, the "MATCH-TYPE" syntax element is defined here as
   follows:

   Syntax:   ":is" / ":contains" / ":matches"

2.7.2. Comparisons Across Character Sets

   Messages may involve a number of character sets. In order for
   comparisons to work across character sets, implementations SHOULD
   implement the following behavior:

      Comparisons are performed on octets. Implementations convert text
      from header fields in all charsets [MIME3] to Unicode, encoded as
      UTF-8, as input to the comparator (see 2.7.3). Implementations
      MUST be capable of converting US-ASCII, ISO-8859-1, the US-ASCII
      subset of ISO-8859-* character sets, and UTF-8. Text that the
      implementation cannot convert to Unicode for any reason MAY be
      treated as plain US-ASCII (including any [MIME3] syntax) or
      processed according to local conventions. An encoded NUL octet
      (character zero) SHOULD NOT cause early termination of the header
      content being compared against.

   If implementations fail to support the above behavior, they MUST
   conform to the following:

      No two strings can be considered equal if one contains octets
      greater than 127.

2.7.3. Comparators

   In order to allow for language-independent, case-independent matches,
   the match type may be coupled with a comparator name. The Internet
   Application Protocol Collation Registry [COLLATION] provides the
   framework for describing and naming comparators as used by this
   specification.

   All implementations MUST support the "i;octet" comparator (simply
   compares octets) and the "i;ascii-casemap" comparator (which treats
   uppercase and lowercase characters in the US-ASCII subset of UTF-8 as
   the same). If left unspecified, the default is "i;ascii-casemap".
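
   As an illustration only, the following Python sketch shows one way
   the two mandatory comparators could combine with the three match
   types of section 2.7.1. The function names normalize, glob_match,
   and string_match are hypothetical and not defined by this
   specification. The sketch assumes strings are handled as octet
   sequences that have already had quoted-string escapes removed, and
   that "?" matches exactly one octet, as is the case for "i;octet" and
   "i;ascii-casemap".

```python
def normalize(value: bytes, comparator: str = "i;ascii-casemap") -> bytes:
    """Map a string to the form that the comparator compares by octet."""
    if comparator == "i;octet":
        return value
    if comparator == "i;ascii-casemap":
        # Only US-ASCII letters are folded; other octets are untouched.
        return bytes(b - 32 if 0x61 <= b <= 0x7A else b for b in value)
    raise ValueError("unknown comparator: " + comparator)


def glob_match(pattern: bytes, value: bytes) -> bool:
    """Wildcard match: '*' is zero or more octets, '?' exactly one.

    A backslash in the pattern makes the following octet literal.
    The entire value must be matched.
    """
    tokens = []
    i = 0
    while i < len(pattern):
        c = pattern[i:i + 1]
        if c == b"\\" and i + 1 < len(pattern):
            tokens.append(("lit", pattern[i + 1:i + 2]))
            i += 2
            continue
        if c == b"*":
            tokens.append(("star", b""))
        elif c == b"?":
            tokens.append(("any", b""))
        else:
            tokens.append(("lit", c))
        i += 1

    # reachable[j] is True when the tokens seen so far match value[:j].
    reachable = [True] + [False] * len(value)
    for kind, octet in tokens:
        if kind == "star":
            # From any reachable position, every later one is reachable.
            seen = False
            for j in range(len(value) + 1):
                seen = seen or reachable[j]
                reachable[j] = seen
        else:
            step = [False] * (len(value) + 1)
            for j in range(len(value)):
                if reachable[j] and (kind == "any" or value[j:j + 1] == octet):
                    step[j + 1] = True
            reachable = step
    return reachable[len(value)]


def string_match(match_type: str, key: bytes, value: bytes,
                 comparator: str = "i;ascii-casemap") -> bool:
    """Apply a match type to key and value under the given comparator."""
    k = normalize(key, comparator)
    v = normalize(value, comparator)
    if match_type == ":is":
        return k == v
    if match_type == ":contains":
        return k in v
    if match_type == ":matches":
        return glob_match(k, v)
    raise ValueError("unknown match type: " + match_type)
```

   Under these assumptions, a ":contains" test for "MAKE MONEY FAST"
   succeeds against "You can Make Money Fast" with the default
   comparator but fails with "i;octet", since only the former ignores
   US-ASCII case.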
611 Some comparators may not be usable with substring matches; that is, 612 they may only work with ":is". It is an error to try and use a 613 comparator with ":matches" or ":contains" that is not compatible with 614 it. 616 Sieve treats a comparator result of "undefined" the same as a result 617 of "no-match". That is, this base specification does not provide any 618 means to directly detect invalid comparator input. 620 A comparator is specified by the ":comparator" option with commands 621 that support matching. This option is followed by a string providing 622 the name of the comparator to be used. For convenience, the syntax 623 of a comparator is abbreviated to "COMPARATOR", and (repeated in 624 several tests) is as follows: 626 Syntax: ":comparator" 628 So in this example, 629 Example: if header :contains :comparator "i;octet" "Subject" 630 "MAKE MONEY FAST" { 631 discard; 632 } 634 would discard any message with subjects like "You can MAKE MONEY 635 FAST", but not "You can Make Money Fast", since the comparator used 636 is case-sensitive. 638 Comparators other than "i;octet" and "i;ascii-casemap" must be 639 declared with require, as they are extensions. If a comparator 640 declared with require is not known, it is an error, and execution 641 fails. If the comparator is not declared with require, it is also an 642 error, even if the comparator is supported. (See 2.10.5.) 644 Both ":matches" and ":contains" match types are compatible with the 645 "i;octet" and "i;ascii-casemap" comparators and may be used with 646 them. 648 It is an error to give more than one of these arguments to a given 649 command. 651 2.7.4. Comparisons Against Addresses 653 Addresses are one of the most frequent things represented as strings. 
654 These are structured, and being able to compare against the local- 655 part or the domain of an address is useful, so some tests that act 656 exclusively on addresses take an additional optional argument that 657 specifies what the test acts on. 659 These optional arguments are ":localpart", ":domain", and ":all", 660 which act on the local-part (left-side), the domain part (right- 661 side), and the whole address, respectively. 663 If an address is not syntactically valid then it will not be matched 664 by tests specifying ":localpart" or ":domain". 666 The kind of comparison done, such as whether or not the test done is 667 case-insensitive, is specified as a comparator argument to the test. 669 If an optional address-part is omitted, the default is ":all". 671 It is an error to give more than one of these arguments to a given 672 command. 674 For convenience, the "ADDRESS-PART" syntax element is defined here as 675 follows: 677 Syntax: ":localpart" / ":domain" / ":all" 679 2.8. Blocks 681 Blocks are sets of commands enclosed within curly braces and supplied 682 as the final argument to a command. Such a command is a control 683 structure: when executed it has control over the number of times the 684 commands in the block are executed. 686 With the commands supplied in this memo, there are no loops. The 687 control structures supplied--if, elsif, and else--run a block either 688 once or not at all. 690 2.9. Commands 692 Sieve scripts are sequences of commands. Commands can take any of 693 the tokens above as arguments, and arguments may be either tagged or 694 positional arguments. Not all commands take all arguments. 696 There are three kinds of commands: test commands, action commands, 697 and control commands. 699 The simplest is an action command. An action command is an 700 identifier followed by zero or more arguments, terminated by a 701 semicolon. Action commands do not take tests or blocks as arguments. 
703 A control command is a command that affects the parsing or the flow 704 of execution of the Sieve script in some way. A control structure is 705 a control command which ends with a block instead of a semicolon. 707 A test command is used as part of a control command. It is used to 708 specify whether or not the block of code given to the control command 709 is executed. 711 2.10. Evaluation 713 2.10.1. Action Interaction 715 Some actions cannot be used with other actions because the result 716 would be absurd. These restrictions are noted throughout this memo. 718 Extension actions MUST state how they interact with actions defined 719 in this specification. 721 2.10.2. Implicit Keep 723 Previous experience with filtering systems suggests that cases tend 724 to be missed in scripts. To prevent errors, Sieve has an "implicit 725 keep". 727 An implicit keep is a keep action (see 4.3) performed in absence of 728 any action that cancels the implicit keep. 730 An implicit keep is performed if a message is not written to a 731 mailbox, redirected to a new address, or explicitly thrown out. That 732 is, if a fileinto, a keep, a redirect, or a discard is performed, an 733 implicit keep is not. 735 Some actions may be defined to not cancel the implicit keep. These 736 actions may not directly affect the delivery of a message, and are 737 used for their side effects. None of the actions specified in this 738 document meets that criterion, but extension actions will. 740 For instance, with any of the short messages offered above, the 741 following script produces no actions. 743 Example: if size :over 500K { discard; } 745 As a result, the implicit keep is taken. 747 2.10.3. Message Uniqueness in a Mailbox 749 Implementations SHOULD NOT deliver a message to the same folder more 750 than once, even if a script explicitly asks for a message to be 751 written to a mailbox twice. 753 The test for equality of two messages is implementation-defined. 
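The implicit keep of 2.10.2 and the duplicate suppression of 2.10.3 can be combined in one post-processing step over the actions a script produced. The following is a minimal sketch under stated assumptions: the function name finalize_actions and the (name, argument) tuple representation are hypothetical, and two deliveries are treated as equal when they name the same folder, which is only one possible implementation-defined equality test.

```python
# Actions from this specification that cancel the implicit keep.
CANCELS_IMPLICIT_KEEP = {"keep", "fileinto", "redirect", "discard"}


def finalize_actions(actions):
    """Post-process the (name, argument) actions produced by one script run.

    Repeated deliveries to the same folder are collapsed silently (a
    repeat is not an error), and a keep is appended when nothing
    cancelled the implicit keep.
    """
    result = []
    delivered = set()
    for name, argument in actions:
        if name in ("keep", "fileinto"):
            target = (name, argument)
            if target in delivered:
                continue  # same folder requested twice: deliver once
            delivered.add(target)
        result.append((name, argument))

    if not any(name in CANCELS_IMPLICIT_KEEP for name, _ in result):
        result.append(("keep", None))  # the implicit keep
    return result
```

A script that runs no action at all, such as the size example above applied to a short message, therefore ends with a single keep, while a script that only discards ends with no delivery.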
755 If a script asks for a message to be written to a mailbox twice, it 756 MUST NOT be treated as an error. 758 2.10.4. Limits on Numbers of Actions 760 Site policy MAY limit numbers of actions taken and MAY impose 761 restrictions on which actions can be used together. In the event 762 that a script hits a policy limit on the number of actions taken for 763 a particular message, an error occurs. 765 Implementations MUST allow at least one keep or one fileinto. If 766 fileinto is not implemented, implementations MUST allow at least one 767 keep. 769 2.10.5. Extensions and Optional Features 771 Because of the differing capabilities of many mail systems, several 772 features of this specification are optional. Before any of these 773 extensions can be executed, they must be declared with the "require" 774 action. 776 If an extension is not enabled with "require", implementations MUST 777 treat it as if they did not support it at all. 779 If a script does not understand an extension declared with require, 780 the script must not be used at all. Implementations MUST NOT execute 781 scripts which require unknown capability names. 783 Note: The reason for this restriction is that prior experiences with 784 languages such as LISP and Tcl suggest that this is a workable 785 way of noting that a given script uses an extension. 787 Experience with PostScript suggests that mechanisms that allow 788 a script to work around missing extensions are not used in 789 practice. 791 Extensions which define actions MUST state how they interact with 792 actions discussed in the base specification. 794 2.10.6. Errors 796 In any programming language, there are compile-time and run-time 797 errors. 799 Compile-time errors are ones in syntax that are detectable if a 800 syntax check is done. 802 Run-time errors are not detectable until the script is run. This 803 includes transient failures like disk full conditions, but also 804 includes issues like invalid combinations of actions. 
806 When an error occurs in a Sieve script, all processing stops. 808 Implementations MAY choose to do a full parse, then evaluate the 809 script, then do all actions. Implementations might even go so far as 810 to ensure that execution is atomic (either all actions are executed 811 or none are executed). 813 Other implementations may choose to parse and run at the same time. 814 Such implementations are simpler, but have issues with partial 815 failure (some actions happen, others don't). 817 Implementations might even go so far as to ensure that scripts can 818 never execute an invalid set of actions before execution, although 819 this could involve solving the Halting Problem. 821 This specification allows any of these approaches. Solving the 822 Halting Problem is considered extra credit. 824 Implementations MUST perform syntactic, semantic, and run-time checks 825 on code that is actually executed. Implementations MAY perform those 826 checks or any part of them on code that is not reached during 827 execution. 829 When an error happens, implementations MUST notify the user that an 830 error occurred, which actions (if any) were taken, and do an implicit 831 keep. 833 2.10.7. Limits on Execution 835 Implementations may limit certain constructs. However, this 836 specification places a lower bound on some of these limits. 838 Implementations MUST support fifteen levels of nested blocks. 840 Implementations MUST support fifteen levels of nested test lists. 842 3. Control Commands 844 Control structures are needed to allow for multiple and conditional 845 actions. 847 3.1. Control If 849 There are three pieces to if: "if", "elsif", and "else". Each is 850 actually a separate command in terms of the grammar. However, an 851 elsif or else MUST only follow an if or elsif. An error occurs if 852 these conditions are not met. 
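The ordering rule for "elsif" and "else" can be checked on the sequence of command names at one block level. The following is a minimal sketch, not a complete parser; the function name check_if_chain and the use of a flat list of command names are assumptions made for illustration.

```python
def check_if_chain(command_names):
    """Raise SyntaxError if an elsif or else does not follow an if or elsif.

    command_names lists the identifiers of the commands that appear, in
    order, at a single nesting level of a script.
    """
    previous = None
    for index, name in enumerate(command_names):
        if name in ("elsif", "else") and previous not in ("if", "elsif"):
            raise SyntaxError(
                "%r at position %d must follow 'if' or 'elsif'" % (name, index)
            )
        previous = name
```

Under this check, the sequence if, elsif, else is accepted, while if, else, elsif is rejected because the final elsif follows an else.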
854 Usage: if <test1: test> <block1: block> 856 Usage: elsif <test2: test> <block2: block> 858 Usage: else <block3: block> 860 The semantics are similar to those of any of the many other 861 programming languages these control structures appear in. When the 862 interpreter sees an "if", it evaluates the test associated with it. 863 If the test is true, it executes the block associated with it. 865 If the test of the "if" is false, it evaluates the test of the first 866 "elsif" (if any). If the test of "elsif" is true, it runs the 867 elsif's block. An elsif may be followed by an elsif, in which case, 868 the interpreter repeats this process until it runs out of elsifs. 870 When the interpreter runs out of elsifs, there may be an "else" case. 871 If there is, and none of the if or elsif tests were true, the 872 interpreter runs the else case. 874 This provides a way of performing exactly one of the blocks in the 875 chain. 877 In the following example, both Message A and B are dropped. 879 Example: require "fileinto"; 880 if header :contains "from" "coyote" { 881 discard; 882 } elsif header :contains ["subject"] ["$$$"] { 883 discard; 884 } else { 885 fileinto "INBOX"; 886 } 888 When the script below is run over message A, it redirects the message 889 to acm@example.edu; message B, to postmaster@example.edu; any other 890 message is redirected to field@example.edu. 892 Example: if header :contains ["From"] ["coyote"] { 893 redirect "acm@example.edu"; 894 } elsif header :contains "Subject" "$$$" { 895 redirect "postmaster@example.edu"; 896 } else { 897 redirect "field@example.edu"; 898 } 900 Note that this definition prohibits the "... else if ..." sequence 901 used by C. This is intentional, because this construct produces a 902 shift-reduce conflict. 904 3.2. Control Require 906 Usage: require <capabilities: string-list> 908 The require action notes that a script makes use of a certain 909 extension. 
Such a declaration is required to use the extension, as 910 discussed in section 2.10.5. Multiple capabilities can be declared 911 with a single require. 
913 The require command, if present, MUST be used before anything other 914 than a require can be used. An error occurs if a require appears 915 after a command other than require. 917 Example: require ["fileinto", "reject"]; 919 Example: require "fileinto"; 920 require "vacation"; 922 3.3. Control Stop 924 Usage: stop 926 The "stop" action ends all processing. If no actions have been 927 executed, then the keep action is taken. 929 4. Action Commands 931 This document supplies four actions that may be taken on a message: 932 keep, fileinto, redirect, and discard. 934 Implementations MUST support the "keep", "discard", and "redirect" 935 actions. 937 Implementations SHOULD support "fileinto". 939 Implementations MAY limit the number of certain actions taken (see 940 section 2.10.4). 942 4.1. Action fileinto 944 Usage: fileinto <folder: string> 946 The "fileinto" action delivers the message into the specified folder. 947 Implementations SHOULD support fileinto, but in some environments 948 this may be impossible. Implementations MAY place restrictions on 949 folder names; use of an invalid folder name MAY be treated as an 950 error or result in delivery to an implementation-defined folder. If 951 the implementation uses a different encoding scheme than UTF-8 for 952 folder names, it SHOULD reencode the folder name from UTF-8 to its 953 encoding scheme. For example, the Internet Message Access Protocol 954 [IMAP] uses modified UTF-7, such that a folder argument of "odds & 955 ends" would appear in IMAP as "odds &- ends". 957 The capability string for use with the require command is "fileinto". 959 In the following script, message A is filed into folder 960 "INBOX.harassment". 962 Example: require "fileinto"; 963 if header :contains ["from"] "coyote" { 964 fileinto "INBOX.harassment"; 966 } 968 4.2. 
Action redirect 970 Usage: redirect <address: string> 972 The "redirect" action is used to send the message to another user at 973 a supplied address, as a mail forwarding feature does. 
The 974 "redirect" action makes no changes to the message body or existing 975 headers, but it may add new headers. The "redirect" modifies the 976 envelope recipient. 978 The redirect command performs an MTA-style "forward"--that is, what 979 you get from a .forward file using sendmail under UNIX. The address 980 on the [SMTP] envelope is replaced with the one on the redirect 981 command and the message is sent back out. (This is not an MUA-style 982 forward, which creates a new message with a different sender and 983 message ID, wrapping the old message in a new one.) 985 The envelope sender address on the outgoing message is chosen by the 986 sieve implementation. It MAY be copied from the original message. 988 A simple script can be used for redirecting all mail: 990 Example: redirect "bart@example.edu"; 992 Implementations SHOULD take measures to implement loop control, 993 possibly including adding headers to the message or counting received 994 headers. If an implementation detects a loop, it causes an error. 996 4.3. Action keep 998 Usage: keep 1000 The "keep" action is whatever action is taken in lieu of all other 1001 actions, if no filtering happens at all; generally, this simply means 1002 to file the message into the user's main mailbox. This command 1003 provides a way to execute this action without needing to know the 1004 name of the user's main mailbox, providing a way to call it without 1005 needing to understand the user's setup, or the underlying mail 1006 system. 1008 For instance, in an implementation where the IMAP server is running 1009 scripts on behalf of the user at time of delivery, a keep command is 1010 equivalent to a fileinto "INBOX". 1012 Example: if size :under 1M { keep; } else { discard; } 1013 Note that the above script is identical to the one below. 1015 Example: if not size :under 1M { discard; } 1017 4.4. Action discard 1019 Usage: discard 1021 Discard is used to silently throw away the message. 
It does so by 1022 simply canceling the implicit keep. If discard is used with other 1023 actions, the other actions still happen. Discard is compatible with 1024 all other actions. (For instance fileinto+discard is equivalent to 1025 fileinto.) 1027 Discard MUST be silent; that is, it MUST NOT return a non-delivery 1028 notification of any kind ([DSN], [MDN], or otherwise). 1030 In the following script, any mail from "idiot@example.edu" is thrown 1031 out. 1033 Example: if header :contains ["from"] ["idiot@example.edu"] { 1034 discard; 1035 } 1037 While an important part of this language, "discard" has the potential 1038 to create serious problems for users: Students who leave themselves 1039 logged in to an unattended machine in a public computer lab may find 1040 their script changed to just "discard". In order to protect users in 1041 this situation (along with similar situations), implementations MAY 1042 keep messages destroyed by a script for an indefinite period, and MAY 1043 disallow scripts that throw out all mail. 1045 5. Test Commands 1047 Tests are used in conditionals to decide which part(s) of the 1048 conditional to execute. 1050 Implementations MUST support these tests: "address", "allof", 1051 "anyof", "exists", "false", "header", "not", "size", and "true". 1053 Implementations SHOULD support the "envelope" test. 1055 5.1. Test address 1057 Usage: address [COMPARATOR] [ADDRESS-PART] [MATCH-TYPE] 1058 <header-list: string-list> <key-list: string-list> 1060 The "address" test matches Internet addresses in structured headers 1061 that contain addresses. It returns true if any header contains any 1062 key in the specified part of the address, as modified by the 1063 comparator and the match keyword. Whether there are other addresses 1064 present in the header doesn't affect this test; this test does not 1065 provide any way to determine whether an address is the only address 1066 in a header. 
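The ADDRESS-PART argument selects which portion of each address the keys are compared against. The following is a minimal sketch, assuming a bare addr-spec with no phrase, comments, or group syntax; the function name address_part is hypothetical, and the validity check is deliberately simplified.

```python
def address_part(address, part=":all"):
    """Return the portion of address selected by an ADDRESS-PART tag.

    Returns None when ":localpart" or ":domain" is requested for an
    address that cannot be split, so that it matches no key at all.
    """
    if part not in (":all", ":localpart", ":domain"):
        raise ValueError("unknown address part: " + part)
    if part == ":all":
        return address
    # Split on the last at-sign: local-part on the left, domain on the right.
    local, at_sign, domain = address.rpartition("@")
    if not at_sign or not local or not domain:
        return None  # not usable for :localpart or :domain
    return local if part == ":localpart" else domain
```

For "tim@example.com", the sketch yields "tim" for ":localpart" and "example.com" for ":domain". A caller would then hand the selected part to whatever comparator and match type the test specifies.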
1068 Like envelope and header, this test returns true if any combination 1069 of the header-list and key-list arguments match and false otherwise. 1071 Internet email addresses [IMAIL] have the somewhat awkward 1072 characteristic that the local-part to the left of the at-sign is 1073 considered case sensitive, and the domain-part to the right of the 1074 at-sign is case insensitive. The "address" command does not deal 1075 with this itself, but provides the ADDRESS-PART argument for allowing 1076 users to deal with it. 1078 The address primitive never acts on the phrase part of an email 1079 address, nor on comments within that address. It also never acts on 1080 group names, although it does act on the addresses within the group 1081 construct. 1083 Implementations MUST restrict the address test to headers that 1084 contain addresses, but MUST include at least From, To, Cc, Bcc, 1085 Sender, Resent-From, Resent-To, and SHOULD include any other header 1086 that utilizes an "address-list" structured header body. 1088 Example: if address :is :all "from" "tim@example.com" { 1089 discard; 1090 } 1092 5.2. Test allof 1094 Usage: allof <tests: test-list> 1096 The "allof" test performs a logical AND on the tests supplied to it. 1098 Example: allof (false, false) => false 1099 allof (false, true) => false 1100 allof (true, true) => true 1102 The allof test takes as its argument a test-list. 1104 5.3. Test anyof 1106 Usage: anyof <tests: test-list> 1108 The "anyof" test performs a logical OR on the tests supplied to it. 1110 Example: anyof (false, false) => false 1111 anyof (false, true) => true 1112 anyof (true, true) => true 1114 5.4. Test envelope 1116 Usage: envelope [COMPARATOR] [ADDRESS-PART] [MATCH-TYPE] 1117 <envelope-part: string-list> <key-list: string-list> 1119 The "envelope" test is true if the specified part of the SMTP (or 1120 equivalent) envelope matches the specified key. 
This specification 1121 defines the interpretation of the (case insensitive) "from" and "to" 1122 envelope-parts. 
Additional envelope-parts may be defined by other 1123 extensions; implementations SHOULD consider unknown envelope parts an 1124 error. 1126 If one of the envelope-part strings is (case insensitive) "from", 1127 then matching occurs against the FROM address used in the SMTP MAIL 1128 command. The null reverse-path is matched against as the empty 1129 string, regardless of the ADDRESS-PART argument specified. 1131 If one of the envelope-part strings is (case insensitive) "to", then 1132 matching occurs against the TO address used in the SMTP RCPT command 1133 that resulted in this message getting delivered to this user. Note 1134 that only the most recent TO is available, and only the one relevant 1135 to this user. 1137 The envelope-part is a string list and may contain more than one 1138 parameter, in which case all of the strings specified in the key-list 1139 are matched against all parts given in the envelope-part list. 1141 Like address and header, this test returns true if any combination of 1142 the envelope-part list and key-list arguments match and false 1143 otherwise. 1145 All tests against envelopes MUST drop source routes. 1147 If the SMTP transaction involved several RCPT commands, only the data 1148 from the RCPT command that caused delivery to this user is available 1149 in the "to" part of the envelope. 1151 If a protocol other than SMTP is used for message transport, 1152 implementations are expected to adapt this command appropriately. 1154 The envelope command is optional. Implementations SHOULD support it, 1155 but the necessary information may not be available in all cases. 1157 Example: require "envelope"; 1158 if envelope :all :is "from" "tim@example.com" { 1159 discard; 1160 } 1162 5.5. Test exists 1164 Usage: exists <header-names: string-list> 1166 The "exists" test is true if the headers listed in the header-names 1167 argument exist within the message. All of the headers must exist or 1168 the test is false. 
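The all-or-nothing behavior of "exists" is easy to state in code. The following is a minimal sketch, assuming the message headers are available as (name, value) pairs and that header names compare case-insensitively; the function name exists is chosen to mirror the test and is otherwise hypothetical.

```python
def exists(message_headers, header_names):
    """Return True only if every name in header_names occurs in the message.

    message_headers is a list of (name, value) pairs; header names are
    compared without regard to case.
    """
    present = {name.lower() for name, _ in message_headers}
    return all(name.lower() in present for name in header_names)
```

A message carrying only From and Subject headers satisfies exists ["From"] but not exists ["From", "Date"], so a script that discards on "not exists" of both names would throw it away.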
1170 The following example throws out mail that doesn't have a From header 1171 and a Date header. 1173 Example: if not exists ["From","Date"] { 1174 discard; 1175 } 1177 5.6. Test false 1179 Usage: false 1181 The "false" test always evaluates to false. 1183 5.7. Test header 1185 Usage: header [COMPARATOR] [MATCH-TYPE] 1186 <header-names: string-list> <key-list: string-list> 1188 The "header" test evaluates to true if the value of any of the named 1189 headers, ignoring leading and trailing whitespace, matches any key. 1190 The type of match is specified by the optional match argument, which 1191 defaults to ":is" if not specified, as specified in section 2.6. 1193 Like address and envelope, this test returns true if any combination 1194 of the header-names list and key-list arguments match and false 1195 otherwise. 1197 If a header listed in the header-names argument exists, it contains 1198 the empty key (""). However, if the named header is not present, it 1199 does not match any key, including the empty key. So if a message 1200 contained the header 1202 X-Caffeine: C8H10N4O2 1204 these tests on that header evaluate as follows: 1206 header :is ["X-Caffeine"] [""] => false 1207 header :contains ["X-Caffeine"] [""] => true 1209 Testing whether a given header is either absent or doesn't contain 1210 any non-whitespace characters can be done using a negated "header" 1211 test: 1213 not header :matches "Cc" "?*" 1215 5.8. Test not 1217 Usage: not <test1: test> 1219 The "not" test takes some other test as an argument, and yields the 1220 opposite result. "not false" evaluates to "true" and "not true" 1221 evaluates to "false". 1223 5.9. Test size 1225 Usage: size <":over" / ":under"> <limit: number> 1227 The "size" test deals with the size of a message. It takes either a 1228 tagged argument of ":over" or ":under", followed by a number 1229 representing the size of the message. 
1231 If the argument is ":over", and the size of the message is greater 1232 than the number provided, the test is true; otherwise, it is false. 
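Both ":over" and ":under" are strict comparisons, which a short model makes explicit. The following is a minimal sketch, not part of this specification; the names parse_number and size_test are hypothetical, and the quantifier values assume the usual Sieve meaning of K, M, and G as 2^10, 2^20, and 2^30.

```python
# Assumed quantifier multipliers: K = 2^10, M = 2^20, G = 2^30.
QUANTIFIERS = {"K": 2 ** 10, "M": 2 ** 20, "G": 2 ** 30}


def parse_number(text):
    """Convert a Sieve number such as "500K" or "4000" into an integer."""
    suffix = text[-1:].upper()
    if suffix in QUANTIFIERS:
        return int(text[:-1]) * QUANTIFIERS[suffix]
    return int(text)


def size_test(message_size, tag, limit):
    """Evaluate the size test for a message of message_size octets."""
    if tag == ":over":
        return message_size > limit   # strictly greater
    if tag == ":under":
        return message_size < limit   # strictly less
    raise ValueError("exactly one of :over or :under is required")
```

Because both comparisons are strict, a message whose size equals the limit satisfies neither tag.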
1234 If the argument is ":under", and the size of the message is less than 1235 the number provided, the test is true; otherwise, it is false. 1237 Exactly one of ":over" or ":under" must be specified, and anything 1238 else is an error. 1240 The size of a message is defined to be the number of octets from the 1241 initial header until the last character in the message body. 1243 Note that for a message that is exactly 4,000 octets, the message is 1244 neither ":over" 4000 octets nor ":under" 4000 octets. 1246 5.10. Test true 1248 Usage: true 1250 The "true" test always evaluates to true. 1252 6. Extensibility 1253 New control commands, actions, and tests can be added to the 1254 language. Sites must make these features known to their users; this 1255 document does not define a way to discover the list of extensions 1256 supported by the server. 1258 Any extensions to this language MUST define a capability string that 1259 uniquely identifies that extension. Capability strings are case- 1260 sensitive; for example, "foo" and "FOO" are different capabilities. 1261 If a new version of an extension changes the functionality of a 1262 previously defined extension, it MUST use a different name. 1264 In a situation where there is a submission protocol and an extension 1265 advertisement mechanism aware of the details of this language, 1266 scripts submitted can be checked against the mail server to prevent 1267 use of an extension that the server does not support. 1269 Extensions MUST state how they interact with constraints defined in 1270 section 2.10, e.g., whether they cancel the implicit keep, and which 1271 actions they are compatible and incompatible with. 1273 6.1. Capability String 1275 Capability strings are typically short strings describing what 1276 capabilities are supported by the server. 1278 Capability strings beginning with "vnd." represent vendor-defined 1279 extensions. 
Such extensions are not defined by Internet standards or 1280 RFCs, but are still registered with IANA in order to prevent 1281 conflicts. Extensions starting with "vnd." SHOULD be followed by the 1282 name of the vendor and product, such as "vnd.acme.rocket-sled". 1284 The following capability strings are defined by this document: 1286 envelope The string "envelope" indicates that the implementation 1287 supports the "envelope" command. 1289 fileinto The string "fileinto" indicates that the implementation 1290 supports the "fileinto" command. 1292 comparator- The string "comparator-elbonia" is provided if the 1293 implementation supports the "elbonia" comparator. 1294 Therefore, all implementations have at least the 1295 "comparator-i;octet" 1296 and "comparator-i;ascii-casemap" capabilities. However, 1297 these comparators may be used without being declared 1298 with require. 1300 6.2. IANA Considerations 1301 In order to provide a standard set of extensions, a registry is 1302 provided by IANA. Capability names may be registered on a first- 1303 come, first-served basis. Extensions designed for interoperable use 1304 SHOULD be defined as standards track or IESG approved experimental 1305 RFCs. 1307 6.2.1. Template for Capability Registrations 1309 The following template is to be used for registering new Sieve 1310 extensions with IANA. 1312 To: iana@iana.org 1313 Subject: Registration of new Sieve extension 1315 Capability name: [the string for use in the 'require' statement] 1316 Description: [a brief description of what the extension adds 1317 or changes] 1318 RFC number: [for extensions published as RFCs] 1319 Contact address: [email and/or physical address to contact for 1320 additional information] 1322 6.2.2. Handling of Existing Capability Registrations 1324 In order to bring the existing capability registrations in line with 1325 the new template, IANA is asked to modify each as follows: 1327 1. 
The "capability name" and "capability arguments" fields 1328 should be eliminated 1329 2. The "capability keyword" field should be renamed to "Capability 1330 name" 1331 3. An empty "Description" field should be added 1332 4. The "Standards Track/IESG-approved experimental RFC number" field 1333 should be renamed to "RFC number" 1334 5. The "Person and email address to contact for further information" 1335 field should be renamed to "Contact address" 1337 6.2.3. Initial Capability Registrations 1339 This RFC updates the following entries in the IANA registry for 1340 Sieve extensions. 1342 Capability name: fileinto 1343 Description: adds the 'fileinto' action for delivering to a 1344 folder other than the default 1345 RFC number: this RFC (Sieve base spec) 1346 Contact address: The Sieve discussion list 1348 Capability name: envelope 1349 Description: adds the 'envelope' test for testing the message 1350 transport sender and recipient address 1351 RFC number: this RFC (Sieve base spec) 1352 Contact address: The Sieve discussion list 1354 Capability name: comparator-* (anything starting with "comparator-") 1355 Description: adds the indicated comparator for use with the 1356 :comparator argument 1357 RFC number: this RFC (Sieve base spec) 1358 Contact address: The Sieve discussion list 1360 6.3. Capability Transport 1362 As the range of mail systems that this document is intended to apply 1363 to is quite varied, a method of advertising which capabilities an 1364 implementation supports is difficult due to the wide range of 1365 possible implementations. Such a mechanism, however, should have the 1366 property that the implementation can advertise the complete set of 1367 extensions that it supports. 1369 7. Transmission 1371 The [MIME] type for a Sieve script is "application/sieve". 
1373 The registration of this type for RFC 2048 requirements is updated as 1374 follows: 1376 Subject: Registration of MIME media type application/sieve 1378 MIME media type name: application 1379 MIME subtype name: sieve 1380 Required parameters: none 1381 Optional parameters: none 1382 Encoding considerations: Most sieve scripts will be textual, 1383 written in UTF-8. When non-7bit characters are used, 1384 quoted-printable is appropriate for transport systems 1385 that require 7bit encoding. 1387 Security considerations: Discussed in section 10 of this RFC. 1388 Interoperability considerations: Discussed in section 2.10.5 1389 of this RFC. 1390 Published specification: this RFC. 1391 Applications which use this media type: sieve-enabled mail servers 1392 Additional information: 1393 Magic number(s): 1394 File extension(s): .siv 1395 Macintosh File Type Code(s): 1396 Person & email address to contact for further information: 1398 See the discussion list at ietf-mta-filters@imc.org. 1399 Intended usage: 1400 COMMON 1401 Author/Change controller: 1402 See Editor information in this RFC. 1404 8. Parsing 1406 The Sieve grammar is separated into tokens and a separate grammar as 1407 most programming languages are. 1409 8.1. Lexical Tokens 1411 Sieve scripts are encoded in UTF-8. The following assumes a valid 1412 UTF-8 encoding; special characters in Sieve scripts are all US-ASCII. 1414 The following are tokens in Sieve: 1416 - identifiers 1417 - tags 1418 - numbers 1419 - quoted strings 1420 - multi-line strings 1421 - other separators 1423 Blanks, horizontal tabs, CRLFs, and comments ("white space") are 1424 ignored except as they separate tokens. Some white space is required 1425 to separate otherwise adjacent tokens and in specific places in the 1426 multi-line strings. CR and LF can only appear in CRLF pairs. 1428 The other separators are single individual characters, and are 1429 mentioned explicitly in the grammar. 
1431 The lexical structure of sieve is defined in the following grammar 1432 (as described in [ABNF]): 1434 bracket-comment = "/*" *not-star 1*STAR 1435 *(not-star-slash *not-star 1*STAR) "/" 1436 ; No */ allowed inside a comment. 1437 ; (No * is allowed unless it is the last 1438 ; character, or unless it is followed by a 1439 ; character that isn't a slash.) 1441 comment = bracket-comment / hash-comment 1443 hash-comment = "#" *octet-not-crlf CRLF 1445 identifier = (ALPHA / "_") *(ALPHA / DIGIT / "_") 1446 multi-line = "text:" *(SP / HTAB) (hash-comment / CRLF) 1447 *(multiline-literal / multiline-dotstuff) 1448 "." CRLF 1450 multiline-literal = [octet-not-period *octet-not-crlf] CRLF 1452 multiline-dotstuff = "." 1*octet-not-crlf CRLF 1453 ; A line containing only "." ends the 1454 ; multi-line. Remove a leading '.' if 1455 ; followed by another '.'. 1457 not-star = CRLF / %x01-09 / %x0B-0C / %x0E-29 / %x2B-FF 1458 ; either a CRLF pair, OR a single octet 1459 ; other than NUL, CR, LF, or star 1461 not-star-slash = CRLF / %x01-09 / %x0B-0C / %x0E-29 / %x2B-2E / 1462 %x30-FF 1463 ; either a CRLF pair, OR a single octet 1464 ; other than NUL, CR, LF, star, or slash 1466 number = 1*DIGIT [ QUANTIFIER ] 1468 octet-not-crlf = %x01-09 / %x0B-0C / %x0E-FF 1469 ; a single octet other than NUL, CR, or LF 1471 octet-not-period = %x01-09 / %x0B-0C / %x0E-2D / %x2F-FF 1472 ; a single octet other than NUL, 1473 ; CR, LF, or period 1475 octet-not-qspecial = %x01-09 / %x0B-0C / %x0E-21 / %x23-5B / %x5D-FF 1476 ; a single octet other than NUL, 1477 ; CR, LF, double-quote, or backslash 1479 QUANTIFIER = "K" / "M" / "G" 1481 quoted-other = "\" octet-not-qspecial 1482 ; represents just the octet-not-qspecial 1483 ; character. 
                               ; SHOULD NOT be used

1485     quoted-safe        = CRLF / octet-not-qspecial
1486                            ; either a CRLF pair, OR a single octet other
1487                            ; than NUL, CR, LF, double-quote, or backslash

1489     quoted-special     = "\" ( DQUOTE / "\" )
1490                            ; represents just a double-quote or backslash

1492     quoted-string      = DQUOTE quoted-text DQUOTE
1493     quoted-text        = *(quoted-safe / quoted-special / quoted-other)

1495     STAR               = "*"

1497     tag                = ":" identifier

1499     white-space        = 1*(SP / CRLF / HTAB) / comment

1501  8.2. Grammar

1503     The following is the grammar of Sieve after it has been lexically
1504     interpreted. No white space or comments appear below. The start
1505     symbol is "start". Non-terminals for MATCH-TYPE, COMPARATOR, and
1506     ADDRESS-PART are provided for use by extensions.

1508     ADDRESS-PART = ":localpart" / ":domain" / ":all"

1510     argument     = string-list / number / tag

1512     arguments    = *argument [test / test-list]

1514     block        = "{" commands "}"

1516     command      = identifier arguments ( ";" / block )

1518     commands     = *command

1520     COMPARATOR   = ":comparator" string

1522     MATCH-TYPE   = ":is" / ":contains" / ":matches"

1524     start        = commands

1526     string       = quoted-string / multi-line

1528     string-list  = "[" string *("," string) "]" / string
1529                      ; if there is only a single string, the brackets
1530                      ; are optional

1532     test         = identifier arguments

1534     test-list    = "(" test *("," test) ")"

1536  9. Extended Example

1538     The following is an extended example of a Sieve script. Note that it
1539     does not make use of the implicit keep.

1541     #
1542     # Example Sieve Filter
1543     # Declare any optional features or extension used by the script
1544     #
1545     require ["fileinto"];

1547     #
1548     # Handle messages from known mailing lists
1549     # Move messages from IETF filter discussion list to filter folder
1550     #
1551     if header :is "Sender" "owner-ietf-mta-filters@imc.org"
1552             {
1553             fileinto "filter";  # move to "filter" folder
1554             }
1555     #
1556     # Keep all messages to or from people in my company
1557     #
1558     elsif address :domain :is ["From", "To"] "example.com"
1559             {
1560             keep;               # keep in "In" folder
1561             }

1563     #
1564     # Try and catch unsolicited email. If a message is not to me,
1565     # or it contains a subject known to be spam, file it away.
1566     #
1567     elsif anyof (not address :all :contains
1568                    ["To", "Cc", "Bcc"] "me@example.com",
1569                  header :matches "subject"
1570                    ["*make*money*fast*", "*university*dipl*mas*"])
1571             {
1572             # If message header does not contain my address,
1573             # it's from a list.
1574             fileinto "spam";   # move to "spam" folder
1575             }
1576     else
1577             {
1578             # Move all other (non-company) mail to "personal"
1579             # folder.
1580             fileinto "personal";
1581             }

1583  10. Security Considerations

1585     Users must get their mail. It is imperative that whatever method
1586     implementations use to store the user-defined filtering scripts be
1587     secure.

1589     It is equally important that implementations sanity-check the user's
1590     scripts, and not allow users to create on-demand mailbombs. For
1591     instance, an implementation that allows a user to redirect a message
1592     multiple times might also allow a user to create a mailbomb triggered
1593     by mail from a specific user. Site- or implementation-defined limits
1594     on actions are useful for this.

1596     Several commands, such as "discard", "redirect", and "fileinto" allow
1597     for actions to be taken that are potentially very dangerous.

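A site-defined limit on actions, as suggested above for "redirect", might be enforced along the lines of the following Python sketch. Everything here is an illustrative assumption rather than specified behaviour: the class names RedirectBudget and ActionLimitExceeded are hypothetical, and the default cap of 4 is an arbitrary value chosen for the example.

```python
class ActionLimitExceeded(Exception):
    """Raised when a script asks for more actions than the site permits."""


class RedirectBudget:
    """Counts "redirect" actions for one message against a site-defined cap."""

    def __init__(self, max_redirects: int = 4) -> None:
        # 4 is an arbitrary illustrative value, not taken from the specification.
        self.max_redirects = max_redirects
        self.targets = []

    def record(self, address: str) -> None:
        """Register one redirect, refusing it once the cap has been reached."""
        if len(self.targets) >= self.max_redirects:
            raise ActionLimitExceeded(
                "redirect limit of %d reached" % self.max_redirects
            )
        self.targets.append(address)
```

An interpreter could create one such budget per message and call record() each time the script executes "redirect". How the failure is then handled (for example, treating it as a script error) is left to the implementation.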
1599     Use of the "redirect" command to generate notifications may easily
1600     overwhelm the target address, especially if it was not designed to
1601     handle large messages.

1603     Implementations SHOULD take measures to prevent messages from
1604     looping.

1606     As with any filter on a message stream, if the sieve implementation
1607     and the mail agents 'behind' sieve in the message stream differ in
1608     their interpretation of the messages, it may be possible for an
1609     attacker to subvert the filter. Of particular note are differences
1610     in the interpretation of malformed messages (e.g., missing or extra
1611     syntax characters) or those that exhibit corner cases (e.g., NUL
1612     octets encoded via [MIME3]).

1614  11. Acknowledgments

1616     This document has been revised in part based on comments and
1617     discussions that took place on and off the SIEVE mailing list.
1618     Thanks to Cyrus Daboo, Ned Freed, Michael Haardt, Kjetil Torgrim
1619     Homme, Barry Leiba, Mark E. Mallett, Alexey Melnikov, Rob Siemborski,
1620     and Nigel Swinson for reviews and suggestions.

1622  12. Editors' Addresses

1624     Philip Guenther
1625     Sendmail, Inc.
1626     6425 Christie St. Ste 400
1627     Emeryville, CA 94608
1628     Email: guenther@sendmail.com

1630     Tim Showalter
1631     Email: tjs@psaux.com

1633  13. Normative References

1635     [ABNF]      Crocker, D., Ed., and P. Overell, "Augmented BNF for Syntax
1636                 Specifications: ABNF", RFC 4234, October 2005.

1638     [COLLATION] Newman, C., Duerst, M., and A. Gulbrandsen, "Internet
1639                 Application Protocol Collation Registry",
1640                 draft-newman-i18n-comparator-07.txt (work in progress),
1641                 March 2006.

1643     [IMAIL]     Resnick, P., Ed., "Internet Message Format", RFC 2822,
1644                 April 2001.

1646     [KEYWORDS]  Bradner, S., "Key words for use in RFCs to Indicate
1647                 Requirement Levels", BCP 14, RFC 2119, March 1997.

1649     [MIME]      Freed, N. and N. Borenstein, "Multipurpose Internet Mail
1650                 Extensions (MIME) Part One: Format of Internet
1651                 Message Bodies", RFC 2045, November 1996.

1653     [MIME3]     Moore, K., "MIME (Multipurpose Internet Mail Extensions)
1654                 Part Three: Message Header Extensions for Non-ASCII
1655                 Text", RFC 2047, November 1996.

1657     [MDN]       Hansen, T., Ed., and G. Vaudreuil, Ed., "Message Disposition
1658                 Notification", RFC 3798, May 2004.

1660     [RFC3028]   Showalter, T., "Sieve: A Mail Filtering Language", RFC
1661                 3028, January 2001.

1663     [SMTP]      Klensin, J., Ed., "Simple Mail Transfer Protocol", RFC
1664                 2821, April 2001.

1666     [UTF-8]     Yergeau, F., "UTF-8, a transformation format of ISO
1667                 10646", RFC 3629, November 2003.

1669  14. Informative References

1671     [BINARY-SI] "Standard IEC 60027-2: Letter symbols to be used in
1672                 electrical technology - Part 2: Telecommunications and
1673                 electronics", January 1999.

1675     [DSN]       Moore, K. and G. Vaudreuil, "An Extensible Message Format
1676                 for Delivery Status Notifications", RFC 1894, January
1677                 1996.

1679     [FLAMES]    Borenstein, N., and C. Thyberg, "Power, Ease of Use, and
1680                 Cooperative Work in a Practical Multimedia Message
1681                 System", Int. J. of Man-Machine Studies, April, 1991.
1682                 Reprinted in Computer-Supported Cooperative Work and
1683                 Groupware, Saul Greenberg, editor, Harcourt Brace
1684                 Jovanovich, 1991. Reprinted in Readings in Groupware and
1685                 Computer-Supported Cooperative Work, Ronald Baecker,
1686                 editor, Morgan Kaufmann, 1993.

1688     [IMAP]      Crispin, M., "Internet Message Access Protocol - version
1689                 4rev1", RFC 3501, March 2003.

1691  15. Changes from RFC 3028

1693     The following list is a summary of the changes that have been made
1694     in the Sieve language base specification from [RFC3028].

1696       1. Removed ban on tests having side-effects
1697       2. Removed reject extension (will be specified in a separate RFC)
1698       3. Clarified description of comparators to match [COLLATION], the
1699          new base specification for them
1700       4. Require stripping of leading and trailing whitespace in
1701          "header" test
1702       5.
             Clarified or tightened handling of many minor items, including:
1703           - invalid [MIME3] encoding
1704           - invalid addresses in headers
1705           - invalid header field names in tests
1706           - 'undefined' comparator result

1708  16. Full Copyright Statement

1710     Copyright (C) The Internet Society (2006).

1712     This document is subject to the rights, licenses and restrictions
1713     contained in BCP 78, and except as set forth therein, the authors
1714     retain all their rights.

1716     This document and the information contained herein are provided on an
1717     "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
1718     OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
1719     ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
1720     INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
1721     INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
1722     WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

1724  Intellectual Property

1726     The IETF takes no position regarding the validity or scope of any
1727     Intellectual Property Rights or other rights that might be claimed to
1728     pertain to the implementation or use of the technology described in
1729     this document or the extent to which any license under such rights
1730     might or might not be available; nor does it represent that it has
1731     made any independent effort to identify any such rights. Information
1732     on the procedures with respect to rights in RFC documents can be
1733     found in BCP 78 and BCP 79.

1735     Copies of IPR disclosures made to the IETF Secretariat and any
1736     assurances of licenses to be made available, or the result of an
1737     attempt made to obtain a general license or permission for the use of
1738     such proprietary rights by implementers or users of this
1739     specification can be obtained from the IETF on-line IPR repository at
1740     http://www.ietf.org/ipr.

1742     The IETF invites any interested party to bring to its attention any
1743     copyrights, patents or patent applications, or other proprietary
1744     rights that may cover technology that may be required to implement
1745     this standard. Please address the information to the IETF at
1746     ietf-ipr@ietf.org.

1748  Acknowledgement

1750     Funding for the RFC Editor function is currently provided by the
1751     Internet Society.

1753  Appendix A. Change History

1755     This section will be removed when this document leaves the
1756     Internet-Draft stage.

1758     Changes from draft-ietf-sieve-3028bis-07.txt
1759       1. Improve description in the extension registrations
1760       2. Give IANA directions on how to massage existing registrations
1761          into the new form
1762       3. Added "Changes from RFC 3028" section
1763       4. Updated page numbers in table of contents
1764       5. Permit non-UTF-8 octet sequences in comments
1765       6. It's an error to use conflicting or repeated tagged and optional
1766          arguments
1767       7. Update description of script encoding

1769     Changes from draft-ietf-sieve-3028bis-06.txt
1770       1. Tweak wording of how :matches uses character definition
1771          of comparator
1772       2. Add security consideration regarding "redirect" as a notification
1773          method
1774       3. fileinto SHOULD reencode; mention IMAP's mUTF-7
1775       4. en;ascii-casemap is gone; switch back to i;ascii-casemap
1776       5. Permit non-UTF-8 octet sequences in strings
1777       6. Sort grammar non-terminals
1778       7. Syntactically invalid addresses don't match :localpart or :domain
1779       8. The null return-path has empty address parts
1780       9. Treat comparator result of "undefined" the same as "no-match"
1781       10. Envelope sender on redirects is implementation defined
1782       11. Change IANA registration template

1784     Changes from draft-ietf-sieve-3028bis-05.txt
1785       1. The specifics of what names are acceptable for fileinto and
1786          the handling of invalid names are both implementation-defined
1787       2.
             Update to draft-newman-i18n-comparator-07.txt
1788       3. Adjust the example in 5.7 again

1790     Changes from draft-ietf-sieve-3028bis-04.txt
1791       1. Change "Syntax:" to "Usage:"
1792       2. Update ABNF reference to RFC 4234
1793       3. Add non-terminals for MATCH-TYPE, COMPARATOR, and ADDRESS-PART
1794       4. Strip leading and trailing whitespace in the value being matched
1795          by header
1796       5. Collations operate on octets, not characters, and for character
1797          data that is the UTF-8 encoding of the Unicode characters
1798       6. :matches uses character definition of comparator

1800     Changes from draft-ietf-sieve-3028bis-03.txt
1801       1. Remove section 2.4.2.4., MIME Parts, as unreferenced
1802       2. Update to draft-newman-i18n-comparator-04.txt
1803       3. Various tweaks to examples and syntax lines
1804       4. Define "control structure" as a control command with a block
1805          argument, then use it consistently. Reword description of
1806          blocks to match
1807       5. Clarify that "header" can never match an absent header and give
1808          the preferred way to test for absent or empty
1809       6. Invalid header name syntax is not an error _in tests_ (but could
1810          be elsewhere)
1811       7. Implementation SHOULD consider unknown envelope parts an error
1812       8. Remove explicit "omitted" option from 2.7.2p2

1814     Changes from draft-ietf-sieve-3028bis-02.txt
1815       1. Change "ASCII" to "US-ASCII" throughout
1816       2. Tweak section 2.7.2 to not require use of UTF-8 internally and
1817          to explicitly leave implementation-defined the handling of text
1818          that can't be converted to Unicode
1819       3. Add reference to RFC 2047
1820       4. Clarify that capability strings are case-sensitive
1821       5. Clarify that address, envelope, and header return false if no
1822          combination of arguments match
1823       6. Directly state that code that isn't reached may still be checked
1824          for errors
1825       7. Invalid header name syntax is not an error
1826       8. Remove description of header unfolding that conflicts with
1827          [IMAIL]
1829       9.
             Warn that filters may be subvertible if agents interpret messages
1830          differently
1831       10. Encoded NUL octets SHOULD NOT cause truncation

1833     Changes from draft-ietf-sieve-3028bis-01.txt
1834       1. Remove ban on side effects
1835       2. Remove definition of the 'reject' action, as it is being moved
1836          to the doc that also defines the 'refuse' action
1837       3. Update capability registrations to reference the mailing list
1838       4. Add Tim back as an editor
1839       5. Refer to the zero-length string ("") as "empty" instead of
1840          "null"

1842     Changes from draft-ietf-sieve-3028bis-00.txt
1843       1. More grammar corrections:
1844          - permit /***/,
1845          - remove ambiguity in finding end of bracket comment,
1846          - require valid UTF-8,
1847          - express quoting in the grammar
1848          - ban bare CR and LF in all locations
1849       2. Correct a bunch of whitespace and linewrapping nits
1850       3. Update IMAIL and SMTP references to RFC 2822 and RFC 2821
1851       4. Require support for en;ascii-casemap comparator as well as the
1852          old i;ascii-casemap. As with the old one, you do not need to
1853          use 'require' to use the new comparator
1854       5. Update IANA considerations to update the existing registrations
1855          to point at this doc instead of 3028
1856       6. Scripts SHOULD NOT contain superfluous backslashes
1857       7. Update Acknowledgments

1859     Changes from RFC 3028
1860       1. Split references into normative and informative
1861       2. Update references to current versions of DSN, IMAP, MDN, and
1862          UTF-8 RFCs
1863       3. Replace "e-mail" with "email"
1864       4. Incorporate RFC 3028 errata
1865       5. The "reject" action cancels the implicit keep
1866       6. Replace references to ACAP with references to the i18n-comparator
1867          draft. Further work is needed to completely sync with that
1868          draft
1869       7. Start to update grammar to only permit legal UTF-8 (incomplete)
1870          and correct various other errors and typos
1871       8. Update IPR boilerplate to RFC 3978/3979