idnits 2.17.1 draft-ietf-sieve-3028bis-01.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** It looks like you're using RFC 3978 boilerplate. You should update this to the boilerplate described in the IETF Trust License Policy document (see https://trustee.ietf.org/license-info), which is required now. -- Found old boilerplate from RFC 3978, Section 5.1 on line 11. -- Found old boilerplate from RFC 3978, Section 5.5 on line 1709. -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 1720. -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 1727. -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 1733. ** Found boilerplate matching RFC 3978, Section 5.4, paragraph 1 (on line 1697), which is fine, but *also* found old RFC 2026, Section 10.4C, paragraph 1 text on line 37. ** This document has an original RFC 3978 Section 5.4 Copyright Line, instead of the newer IETF Trust Copyright according to RFC 4748. ** This document has an original RFC 3978 Section 5.5 Disclaimer, instead of the newer disclaimer which includes the IETF Trust according to RFC 4748. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- == No 'Intended status' indicated for this document; assuming Proposed Standard Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** There are 4 instances of too long lines in the document, the longest one being 6 characters in excess of 72. 
Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not match the current year == The document seems to lack the recommended RFC 2119 boilerplate, even if it appears to use RFC 2119 keywords -- however, there's a paragraph with a matching beginning. Boilerplate error? (The document does seem to have the reference to RFC 2119 which the ID-Checklist requires). -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- Couldn't find a document date in the document -- date freshness check skipped. -- Found something which looks like a code comment -- if you have code sections in the document, please surround them with '<CODE BEGINS>' and '<CODE ENDS>' lines. Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Missing Reference: 'COMPARATOR' is mentioned on line 1348, but not defined == Missing Reference: 'ADDRESS-PART' is mentioned on line 1348, but not defined == Missing Reference: 'MATCH-TYPE' is mentioned on line 1348, but not defined == Missing Reference: 'QUANTIFIER' is mentioned on line 1484, but not defined == Unused Reference: 'IMAP' is defined on line 1692, but no explicit reference was found in the text ** Obsolete normative reference: RFC 2234 (ref. 
'ABNF') (Obsoleted by RFC 4234) == Outdated reference: A later version (-14) exists of draft-newman-i18n-comparator-03 ** Obsolete normative reference: RFC 2822 (ref. 'IMAIL') (Obsoleted by RFC 5322) ** Obsolete normative reference: RFC 3798 (ref. 'MDN') (Obsoleted by RFC 8098) ** Obsolete normative reference: RFC 2821 (ref. 'SMTP') (Obsoleted by RFC 5321) -- Obsolete informational reference (is this intentional?): RFC 1894 (ref. 'DSN') (Obsoleted by RFC 3464) -- Obsolete informational reference (is this intentional?): RFC 3501 (ref. 'IMAP') (Obsoleted by RFC 9051) Summary: 9 errors (**), 0 flaws (~~), 9 warnings (==), 10 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 1 Internet-Draft Sendmail, Inc. 3 Sieve: An Email Filtering Language 4 draft-ietf-sieve-3028bis-01.txt 6 Status of this Memo 8 By submitting this Internet-Draft, each author represents that any 9 applicable patent or other IPR claims of which he or she is aware 10 have been or will be disclosed, and any of which he or she becomes 11 aware will be disclosed, in accordance with Section 6 of BCP 79. 13 Internet-Drafts are working documents of the Internet Engineering 14 Task Force (IETF), its areas, and its working groups. Note that 15 other groups may also distribute working documents as Internet- 16 Drafts. 18 Internet-Drafts are draft documents valid for a maximum of six months 19 and may be updated, replaced, or obsoleted by other documents at any 20 time. It is inappropriate to use Internet-Drafts as reference 21 material or to cite them other than as "work in progress." 23 The list of current Internet-Drafts can be accessed at 24 http://www.ietf.org/ietf/1id-abstracts.txt 26 The list of Internet-Draft Shadow Directories can be accessed at 27 http://www.ietf.org/shadow.html. 
29 A revised version of this draft document will be submitted to the RFC 30 editor as a Standard Track RFC for the Internet Community. 31 Discussion and suggestions for improvement are requested, and should 32 be sent to ietf-mta-filters@imc.org. Distribution of this memo is 33 unlimited. 35 Copyright Notice 37 Copyright (C) The Internet Society (2005). All Rights Reserved. 39 Abstract 41 This document describes a language for filtering email messages at 42 time of final delivery. It is designed to be implementable on either 43 a mail client or mail server. It is meant to be extensible, simple, 44 and independent of access protocol, mail architecture, and operating 45 system. It is suitable for running on a mail server where users may 46 not be allowed to execute arbitrary programs, such as on black box 47 Internet Message Access Protocol (IMAP) servers, as it has no 48 variables, loops, or ability to shell out to external programs. 50 Table of Contents 52 1. Introduction ........................................... 3 53 1.1. Conventions Used in This Document ..................... 4 54 1.2. Example mail messages ................................. 4 55 2. Design ................................................. 5 56 2.1. Form of the Language .................................. 5 57 2.2. Whitespace ............................................ 5 58 2.3. Comments .............................................. 6 59 2.4. Literal Data .......................................... 6 60 2.4.1. Numbers ............................................... 6 61 2.4.2. Strings ............................................... 7 62 2.4.2.1. String Lists .......................................... 7 63 2.4.2.2. Headers ............................................... 8 64 2.4.2.3. Addresses ............................................. 8 65 2.4.2.4. MIME Parts ............................................ 9 66 2.5. Tests ................................................. 9 67 2.5.1. 
Test Lists ............................................ 9 68 2.6. Arguments ............................................. 9 69 2.6.1. Positional Arguments .................................. 9 70 2.6.2. Tagged Arguments ...................................... 10 71 2.6.3. Optional Arguments .................................... 10 72 2.6.4. Types of Arguments .................................... 10 73 2.7. String Comparison ..................................... 11 74 2.7.1. Match Type ............................................ 11 75 2.7.2. Comparisons Across Character Sets ..................... 12 76 2.7.3. Comparators ........................................... 12 77 2.7.4. Comparisons Against Addresses ......................... 13 78 2.8. Blocks ................................................ 14 79 2.9. Commands .............................................. 14 80 2.10. Evaluation ............................................ 15 81 2.10.1. Action Interaction .................................... 15 82 2.10.2. Implicit Keep ......................................... 15 83 2.10.3. Message Uniqueness in a Mailbox ....................... 15 84 2.10.4. Limits on Numbers of Actions .......................... 16 85 2.10.5. Extensions and Optional Features ...................... 16 86 2.10.6. Errors ................................................ 17 87 2.10.7. Limits on Execution ................................... 17 88 3. Control Commands ....................................... 17 89 3.1. Control Structure If .................................. 18 90 3.2. Control Structure Require ............................. 19 91 3.3. Control Structure Stop ................................ 19 92 4. Action Commands ........................................ 19 93 4.1. Action reject ......................................... 20 94 4.2. Action fileinto ....................................... 20 95 4.3. Action redirect ....................................... 21 96 4.4. 
Action keep ........................................... 21 97 4.5. Action discard ........................................ 22 98 5. Test Commands .......................................... 22 99 5.1. Test address .......................................... 23 100 5.2. Test allof ............................................ 23 101 5.3. Test anyof ............................................ 24 102 5.4. Test envelope ......................................... 24 103 5.5. Test exists ........................................... 25 104 5.6. Test false ............................................ 25 105 5.7. Test header ........................................... 25 106 5.8. Test not .............................................. 26 107 5.9. Test size ............................................. 26 108 5.10. Test true ............................................. 26 109 6. Extensibility .......................................... 26 110 6.1. Capability String ..................................... 27 111 6.2. IANA Considerations ................................... 28 112 6.2.1. Template for Capability Registrations ................. 28 113 6.2.2. Initial Capability Registrations ...................... 28 114 6.3. Capability Transport .................................. 29 115 7. Transmission ........................................... 29 116 8. Parsing ................................................ 30 117 8.1. Lexical Tokens ........................................ 30 118 8.2. Grammar ............................................... 31 119 9. Extended Example ....................................... 32 120 10. Security Considerations ................................ 34 121 11. Acknowledgments ........................................ 34 122 12. Author's Address ....................................... 34 123 13. References ............................................. 34 124 14. Full Copyright Statement ............................... 36 126 1. 
Introduction 128 This memo documents a language that can be used to create filters for 129 electronic mail. It is not tied to any particular operating system 130 or mail architecture. It requires the use of [IMAIL]-compliant 131 messages, but should otherwise generalize to many systems. 133 The language is powerful enough to be useful but limited in order to 134 allow for a safe server-side filtering system. The intention is to 135 make it impossible for users to do anything more complex (and 136 dangerous) than write simple mail filters, along with facilitating 137 the use of GUIs for filter creation and manipulation. The language 138 is not Turing-complete: it provides no way to write a loop or a 139 function and variables are not provided. 141 Scripts written in Sieve are executed during final delivery, when the 142 message is moved to the user-accessible mailbox. In systems where 143 the MTA does final delivery, such as traditional Unix mail, it is 144 reasonable to sort when the MTA deposits mail into the user's 145 mailbox. 147 There are a number of reasons to use a filtering system. Mail 148 traffic for most users has been increasing due to increased usage of 149 email, the emergence of unsolicited email as a form of advertising, 150 and increased usage of mailing lists. 152 Experience at Carnegie Mellon has shown that if a filtering system is 153 made available to users, many will make use of it in order to file 154 messages from specific users or mailing lists. However, many others 155 did not make use of the Andrew system's FLAMES filtering language 156 [FLAMES] due to difficulty in setting it up. 158 Because of the expectation that users will make use of filtering if 159 it is offered and easy to use, this language has been made simple 160 enough to allow many users to make use of it, but rich enough that it 161 can be used productively. 
However, it is expected that GUI-based 162 editors will be the preferred way of editing filters for a large 163 number of users. 165 1.1. Conventions Used in This Document 167 In the sections of this document that discuss the requirements of 168 various keywords and operators, the following conventions have been 169 adopted. 171 The key words "MUST", "MUST NOT", "SHOULD", "SHOULD NOT", and "MAY" 172 in this document are to be interpreted as defined in [KEYWORDS]. 174 Each section on a command (test, action, or control structure) has a 175 line labeled "Syntax:". This line describes the syntax of the 176 command, including its name and its arguments. Required arguments 177 are listed inside angle brackets ("<" and ">"). Optional arguments 178 are listed inside square brackets ("[" and "]"). Each argument is 179 followed by its type, so "<key: string>" represents an argument 180 called "key" that is a string. Literal strings are represented with 181 double-quoted strings. Alternatives are separated with slashes, and 182 parentheses are used for grouping, similar to [ABNF]. 184 In the "Syntax" line, there are three special pieces of syntax that 185 are frequently repeated, MATCH-TYPE, COMPARATOR, and ADDRESS-PART. 186 These are discussed in sections 2.7.1, 2.7.3, and 2.7.4, 187 respectively. 189 The formal grammar for these commands is defined in section 8 and is the 190 authoritative reference on how to construct commands, but the formal 191 grammar does not specify the order, semantics, number or types of 192 arguments to commands, nor the legal command names. The intent is to 193 allow for extension without changing the grammar. 195 1.2. Example mail messages 197 The following mail messages will be used throughout this document in 198 examples. 
200 Message A 201 ----------------------------------------------------------- 202 Date: Tue, 1 Apr 1997 09:06:31 -0800 (PST) 203 From: coyote@desert.example.org 204 To: roadrunner@acme.example.com 205 Subject: I have a present for you 207 Look, I'm sorry about the whole anvil thing, and I really 208 didn't mean to try and drop it on you from the top of the 209 cliff. I want to try to make it up to you. I've got some 210 great birdseed over here at my place--top of the line 211 stuff--and if you come by, I'll have it all wrapped up 212 for you. I'm really sorry for all the problems I've caused 213 for you over the years, but I know we can work this out. 214 -- 215 Wile E. Coyote "Super Genius" coyote@desert.example.org 216 ----------------------------------------------------------- 218 Message B 219 ----------------------------------------------------------- 220 From: youcouldberich!@reply-by-postal-mail.invalid 221 Sender: b1ff@de.res.example.com 222 To: rube@landru.example.edu 223 Date: Mon, 31 Mar 1997 18:26:10 -0800 224 Subject: $$$ YOU, TOO, CAN BE A MILLIONAIRE! $$$ 226 YOU MAY HAVE ALREADY WON TEN MILLION DOLLARS, BUT I DOUBT 227 IT! SO JUST POST THIS TO SIX HUNDRED NEWSGROUPS! IT WILL 228 GUARANTEE THAT YOU GET AT LEAST FIVE RESPONSES WITH MONEY! 229 MONEY! MONEY! COLD HARD CASH! YOU WILL RECEIVE OVER 230 $20,000 IN LESS THAN TWO MONTHS! AND IT'S LEGAL!!!!!!!!! 231 !!!!!!!!!!!!!!!!!!111111111!!!!!!!11111111111!!1 JUST 232 SEND $5 IN SMALL, UNMARKED BILLS TO THE ADDRESSES BELOW! 233 ----------------------------------------------------------- 235 2. Design 236 2.1. Form of the Language 238 The language consists of a set of commands. Each command consists of 239 a set of tokens delimited by whitespace. The command identifier is 240 the first token and it is followed by zero or more argument tokens. 241 Arguments may be literal data, tags, blocks of commands, or test 242 commands. 244 The language is represented in UTF-8, as specified in [UTF-8]. 
246 Tokens in the ASCII range are considered case-insensitive. 248 2.2. Whitespace 250 Whitespace is used to separate tokens. Whitespace is made up of 251 tabs, newlines (CRLF, never just CR or LF), and the space character. 252 The amount of whitespace used is not significant. 254 2.3. Comments 256 Two types of comments are offered. Comments are semantically 257 equivalent to whitespace and can be used anyplace that whitespace is 258 (with one exception in multi-line strings, as described in the 259 grammar). 261 Hash comments begin with a "#" character that is not contained within 262 a string and continue until the next CRLF. 264 Example: if size :over 100K { # this is a comment 265 discard; 266 } 268 Bracketed comments begin with the token "/*" and end with "*/" 269 outside of a string. Bracketed comments may span multiple lines. 270 Bracketed comments do not nest. 272 Example: if size :over 100K { /* this is a comment 273 this is still a comment */ discard /* this is a comment 274 */ ; 275 } 277 2.4. Literal Data 279 Literal data means data that is not executed, merely evaluated "as 280 is", to be used as arguments to commands. Literal data is limited to 281 numbers and strings. 283 2.4.1. Numbers 284 Numbers are given as ordinary decimal numbers. However, those 285 numbers that have a tendency to be fairly large, such as message 286 sizes, MAY have a "K", "M", or "G" appended to indicate a multiple of 287 a power of two. To be comparable with the power-of-two-based 288 versions of SI units that computers frequently use, K specifies 289 kibi-, or 1,024 (2^10) times the value of the number; M specifies 290 mebi-, or 1,048,576 (2^20) times the value of the number; and G 291 specifies gibi-, or 1,073,741,824 (2^30) times the value of the 292 number [BINARY-SI]. 294 Implementations MUST provide 31 bits of magnitude in numbers, but MAY 295 provide more. 297 Only positive integers are permitted by this specification. 299 2.4.2. 
Strings 301 Scripts involve large numbers of strings as they are used for pattern 302 matching, addresses, textual bodies, etc. Typically, short quoted 303 strings suffice for most uses, but a more convenient form is provided 304 for longer strings such as bodies of messages. 306 A quoted string starts and ends with a single double quote (the <"> 307 character, ASCII 34). A backslash ("\", ASCII 92) inside of a quoted 308 string is followed by either another backslash or a double quote. 309 This two-character sequence represents a single backslash or double- 310 quote within the string, respectively. 312 Scripts SHOULD NOT escape other characters with a backslash. 314 An undefined escape sequence (such as "\a" in a context where "a" has 315 no special meaning) is interpreted as if there were no backslash (in 316 this case, "\a" is just "a"). 318 Non-printing characters such as tabs, CR and LF, and control 319 characters are permitted in quoted strings. Quoted strings MAY span 320 multiple lines. NUL (ASCII 0) is not allowed in strings. 322 For entering larger amounts of text, such as an email message, a 323 multi-line form is allowed. It starts with the keyword "text:", 324 followed by a CRLF, and ends with the sequence of a CRLF, a single 325 period, and another CRLF. In order to allow the message to contain 326 lines consisting of a single dot, lines are dot-stuffed. That is, when 327 composing a message body, an extra `.' is added before each line 328 which begins with a `.'. When the server interprets the script, 329 these extra dots are removed. Note that a line that begins with a 330 dot followed by a non-dot character is not interpreted as dot-stuffed; 331 that is, ".foo" is interpreted as ".foo". However, because this is 332 potentially ambiguous, scripts SHOULD be properly dot-stuffed so such 333 lines do not appear. 335 Note that a hash comment or whitespace may occur in between the 336 "text:" and the CRLF, but not within the string itself. 
Bracketed 337 comments are not allowed here. 339 2.4.2.1. String Lists 341 When matching patterns, it is frequently convenient to match against 342 groups of strings instead of single strings. For this reason, a list 343 of strings is allowed in many tests, implying that if the test is 344 true using any one of the strings, then the test is true. 345 Implementations are encouraged to use short-circuit evaluation in 346 these cases. 348 For instance, the test `header :contains ["To", "Cc"] 349 ["me@example.com", "me00@landru.example.edu"]' is true if either the 350 To header or Cc header of the input message contains either of the 351 email addresses "me@example.com" or "me00@landru.example.edu". 353 Conversely, in any case where a list of strings is appropriate, a 354 single string is allowed without being a member of a list: it is 355 equivalent to a list with a single member. This means that the test 356 `exists "To"' is equivalent to the test `exists ["To"]'. 358 2.4.2.2. Headers 360 Headers are a subset of strings. In the Internet Message 361 Specification [IMAIL], each header line is allowed to have whitespace 362 nearly anywhere in the line, including after the field name and 363 before the subsequent colon. Extra spaces between the header name 364 and the ":" in a header field are ignored. 366 A header name never contains a colon. The "From" header refers to a 367 line beginning "From:" (or "From :", etc.). No header will match 368 the string "From:" due to the trailing colon. 370 Folding of long header lines (as described in [IMAIL] 2.2.3) is 371 removed prior to interpretation of the data. The folding syntax (the 372 CRLF that ends a line plus any leading whitespace at the beginning of 373 the next line that indicates folding) is interpreted as if it were 374 a single space. 376 2.4.2.3. Addresses 378 A number of commands call for email addresses, which are also a 379 subset of strings. 
When these addresses are used in outbound 380 contexts, addresses must be compliant with [IMAIL], but are further 381 constrained. Using the symbols defined in [IMAIL], section 3, the 382 syntax of an address is: 384 sieve-address = addr-spec ; simple address 385 / phrase "<" addr-spec ">" ; name & addr-spec 387 That is, routes and group syntax are not permitted. If multiple 388 addresses are required, use a string list. Named groups are not used 389 here. 391 Implementations MUST ensure that the addresses are syntactically 392 valid, but need not ensure that they actually identify an email 393 recipient. 395 2.4.2.4. MIME Parts 397 In a few places, [MIME] body parts are represented as strings. These 398 parts include MIME headers and the body. This provides a way of 399 embedding typed data within a Sieve script so that, among other 400 things, character sets other than UTF-8 can be used for output 401 messages. 403 2.5. Tests 405 Tests are given as arguments to commands in order to control their 406 actions. In this document, tests are given to if/elsif/else to 407 decide which block of code is run. 409 Tests MUST NOT have side effects. That is, a test cannot affect the 410 state of the filter or message. No tests in this specification have 411 side effects, and side effects are forbidden in extension tests as 412 well. 414 The rationale for this is that tests with side effects impair 415 readability and maintainability and are difficult to represent in a 416 graphic interface for generating scripts. Side effects are confined 417 to actions where they are clearer. 419 2.5.1. Test Lists 421 Some tests ("allof" and "anyof", which implement logical "and" and 422 logical "or", respectively) may require more than a single test as an 423 argument. The test-list syntax element provides a way of grouping 424 tests. 426 Example: if anyof (not exists ["From", "Date"], 427 header :contains "from" "fool@example.edu") { 429 discard; 430 } 432 2.6. 
Arguments 434 In order to specify what to do, most commands take arguments. There 435 are three types of arguments: positional, tagged, and optional. 437 2.6.1. Positional Arguments 439 Positional arguments are given to a command which discerns their 440 meaning based on their order. When a command takes positional 441 arguments, all positional arguments must be supplied and must be in 442 the order prescribed. 444 2.6.2. Tagged Arguments 446 This document provides for tagged arguments in the style of 447 CommonLISP. These are also similar to flags given to commands in 448 most command-line systems. 450 A tagged argument is an argument for a command that begins with ":" 451 followed by a tag naming the argument, such as ":contains". This 452 argument means that zero or more of the next tokens have some 453 particular meaning depending on the argument. These next tokens may 454 be numbers or strings but they are never blocks. 456 Tagged arguments are similar to positional arguments, except that 457 instead of the meaning being derived from the command, it is derived 458 from the tag. 460 Tagged arguments must appear before positional arguments, but they 461 may appear in any order with other tagged arguments. For simplicity 462 of the specification, this is not expressed in the syntax definitions 463 with commands, but they still may be reordered arbitrarily provided 464 they appear before positional arguments. Tagged arguments may be 465 mixed with optional arguments. 467 To simplify this specification, tagged arguments SHOULD NOT take 468 tagged arguments as arguments. 470 2.6.3. Optional Arguments 472 Optional arguments are exactly like tagged arguments except that they 473 may be left out, in which case a default value is implied. Because 474 optional arguments tend to result in shorter scripts, they have been 475 used far more than tagged arguments. 
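The ordering rule for tagged arguments (they come before positional arguments, in any order among themselves) can be illustrated with a small sketch. This is not part of the specification: the function name `split_arguments`, the `TAG_ARITY` table, and the representation of tokens as plain Python strings (with a leading ":" marking a tag) are all assumptions made for illustration. A real lexer would distinguish tag tokens from string tokens by type rather than by their first character.

```python
# Hypothetical table: how many value tokens each tag consumes.
# Only a few tags from this document are listed, for illustration.
TAG_ARITY = {":is": 0, ":contains": 0, ":matches": 0, ":comparator": 1}


def split_arguments(tokens, arity=TAG_ARITY):
    """Split a command's argument tokens into (tagged, positional).

    Raises ValueError if a tag is unknown, lacks its value tokens,
    or appears after a positional argument.
    """
    tagged = {}
    positional = []
    i = 0
    while i < len(tokens):
        token = tokens[i]
        if token.startswith(":"):
            if positional:
                raise ValueError(f"tag {token} follows a positional argument")
            if token not in arity:
                raise ValueError(f"unknown tag {token}")
            count = arity[token]
            values = tokens[i + 1 : i + 1 + count]
            if len(values) != count:
                raise ValueError(f"tag {token} is missing its value")
            tagged[token] = values
            i += 1 + count
        else:
            positional.append(token)
            i += 1
    return tagged, positional
```

With this sketch, `[":contains", ":comparator", "i;octet", "Subject", "MAKE MONEY FAST"]` and the same tokens with the two tagged arguments swapped produce identical results, while a tag placed after "Subject" is rejected.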
477 One particularly noteworthy case is the ":comparator" argument, which 478 allows the user to specify which comparator [COLLATION] will be used 479 to compare two strings, since different languages may impose 480 different orderings on UTF-8 [UTF-8] characters. 482 2.6.4. Types of Arguments 484 Abstractly, arguments may be literal data, tests, or blocks of 485 commands. In this way, an "if" control structure is merely a command 486 that happens to take a test and a block as arguments and may execute 487 the block of code. 489 However, this abstraction is ambiguous from a parsing standpoint. 490 The grammar in section 8.2 presents a parsable version of this: 491 Arguments are string-lists, numbers, and tags, which may be followed 492 by a test or a test-list, which may be followed by a block of 493 commands. No more than one test or test list, nor more than one 494 block of commands, may be used, and commands that end with blocks of 495 commands do not end with semicolons. 497 2.7. String Comparison 499 When matching one string against another, there are a number of ways 500 of performing the match operation. These are accomplished with three 501 types of matches: an exact match, a substring match, and a wildcard 502 glob-style match. These are described below. 504 In order to provide for matches between character sets and case 505 insensitivity, Sieve uses the comparators defined in the Internet 506 Application Protocol Collation Registry [COLLATION]. 508 However, when a string represents the name of a header, the 509 comparator is never user-specified. Header comparisons are always 510 done with the "en;ascii-casemap" operator, i.e., case-insensitive 511 comparisons, because this is the way things are defined in the 512 message specification [IMAIL]. 514 2.7.1. Match Type 516 There are three match types describing the matching used in this 517 specification: ":is", ":contains", and ":matches". 
Match type 518 arguments are supplied to those commands which allow them to specify 519 what kind of match is to be performed. 521 These are used as tagged arguments to tests that perform string 522 comparison. 524 The ":contains" match type describes a substring match. If the value 525 argument contains the key argument as a substring, the match is true. 526 For instance, the string "frobnitzm" contains "frob" and "nit", but 527 not "fbm". The null key ("") is contained in all values. 529 The ":is" match type describes an absolute match; if the contents of 530 the first string are absolutely the same as the contents of the 531 second string, they match. Only the string "frobnitzm" is the string 532 "frobnitzm". The null key ":is" and only ":is" the null value. 534 The ":matches" match type specifies a wildcard match using the 535 characters "*" and "?"; the entire value must be matched. "*" 536 matches zero or more characters, and "?" matches a single character. 537 "?" and "*" may be escaped as "\\?" and "\\*" in strings to match 538 against themselves. The first backslash escapes the second 539 backslash; together, they escape the "*". This is awkward, but it is 540 commonplace in several programming languages that use globs and 541 regular expressions. 543 In order to specify what type of match is supposed to happen, 544 commands that support matching take optional tagged arguments 545 ":matches", ":is", and ":contains". Commands default to using ":is" 546 matching if no match type argument is supplied. Note that these 547 modifiers may interact with comparators; in particular, some 548 comparators are not suitable for matching with ":contains" or 549 ":matches". It is an error to use a comparator with ":contains" or 550 ":matches" that is not compatible with it. 552 It is an error to give more than one of these arguments to a given 553 command. 
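The three match types can be sketched in a few lines of Python. This is an illustrative sketch only, not a reference implementation: the function names are hypothetical, comparison is done directly on Python strings (roughly what an octet-wise comparator would give for ASCII input), and the key is assumed to have already had Sieve string quoting removed, so the script text "\\*" arrives here as a backslash followed by "*".

```python
import re


def match_is(value: str, key: str) -> bool:
    """':is': the value and the key are exactly the same string."""
    return value == key


def match_contains(value: str, key: str) -> bool:
    """':contains': the key occurs in the value as a substring."""
    return key in value


def glob_to_regex(key: str) -> "re.Pattern[str]":
    """Translate a ':matches' key into a compiled regular expression."""
    parts = []
    i = 0
    while i < len(key):
        ch = key[i]
        if ch == "\\" and i + 1 < len(key):
            # A backslash makes the next character literal.
            parts.append(re.escape(key[i + 1]))
            i += 2
            continue
        if ch == "*":
            parts.append(".*")      # zero or more characters
        elif ch == "?":
            parts.append(".")       # exactly one character
        else:
            parts.append(re.escape(ch))
        i += 1
    # DOTALL so that wildcards also match line breaks.
    return re.compile("".join(parts), re.DOTALL)


def match_matches(value: str, key: str) -> bool:
    """':matches': the wildcard key must match the entire value."""
    return glob_to_regex(key).fullmatch(value) is not None
```

The use of `fullmatch` reflects the requirement that the entire value be matched, so the key "frob" does not match "frobnitzm" under ":matches" even though it does under ":contains".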
555 For convenience, the "MATCH-TYPE" syntax element is defined here as 556 follows: 558 Syntax: ":is" / ":contains" / ":matches" 560 2.7.2. Comparisons Across Character Sets 562 All Sieve scripts are represented in UTF-8, but messages may involve 563 a number of character sets. In order for comparisons to work across 564 character sets, implementations SHOULD implement the following 565 behavior: 567 Implementations decode header charsets to UTF-8. Two strings are 568 considered equal if their UTF-8 representations are identical. 569 Implementations should decode charsets represented in the forms 570 specified by [MIME] for both message headers and bodies. 571 Implementations must be capable of decoding US-ASCII, ISO-8859-1, 572 the ASCII subset of ISO-8859-* character sets, and UTF-8. 574 If implementations fail to support the above behavior, they MUST 575 conform to the following: 577 No two strings can be considered equal if one contains octets 578 greater than 127. 580 2.7.3. Comparators 582 In order to allow for language-independent, case-independent matches, 583 the match type may be coupled with a comparator name. The Internet 584 Application Protocol Collation Registry [COLLATION] provides the 585 framework for describing and naming comparators as used by this 586 specification. 588 While multiple comparator types are defined, only equality types are 589 used in this specification. 591 All implementations MUST support the "i;octet" comparator (simply 592 compares octets), the "en;ascii-casemap" comparator (which treats 593 uppercase and lowercase characters in the ASCII subset of UTF-8 as 594 the same), as well as the "i;ascii-casemap" comparator, which is a 595 deprecated synonym for "en;ascii-casemap". If left unspecified, the 596 default is "en;ascii-casemap". 598 Some comparators may not be usable with substring matches; that is, 599 they may only work with ":is". 
   It is an error to try to use a comparator with ":matches" or
   ":contains" that is not compatible with it.

   A comparator is specified by the ":comparator" option with commands
   that support matching. This option is followed by a string providing
   the name of the comparator to be used. For convenience, the syntax
   of a comparator is abbreviated to "COMPARATOR", and (repeated in
   several tests) is as follows:

   Syntax: ":comparator"

   So in this example,

   Example:  if header :contains :comparator "i;octet" "Subject"
                "MAKE MONEY FAST" {
                   discard;
             }

   would discard any message with subjects like "You can MAKE MONEY
   FAST", but not "You can Make Money Fast", since the comparator used
   is case-sensitive.

   Comparators other than "i;octet", "en;ascii-casemap", and
   "i;ascii-casemap" must be declared with require, as they are
   extensions. If a comparator declared with require is not known, it
   is an error, and execution fails. If the comparator is not declared
   with require, it is also an error, even if the comparator is
   supported. (See 2.10.5.)

   Both ":matches" and ":contains" match types are compatible with the
   "i;octet" and "en;ascii-casemap" comparators and may be used with
   them.

   It is an error to give more than one of these arguments to a given
   command.

2.7.4. Comparisons Against Addresses

   Addresses are one of the most frequent things represented as strings.
   These are structured, and being able to compare against the
   local-part or the domain of an address is useful, so some tests that
   act exclusively on addresses take an additional optional argument
   that specifies what the test acts on.

   These optional arguments are ":localpart", ":domain", and ":all",
   which act on the local-part (left-side), the domain part
   (right-side), and the whole address.
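
   The effect of the three address-part arguments can be sketched in
   Python as follows. This is a minimal, non-normative illustration; it
   assumes the input is already a bare address with any phrase and
   comments removed, and it splits at the last at-sign. The function
   name address_part is hypothetical.

```python
def address_part(address: str, part: str = ":all") -> str:
    """Return the portion of an address selected by an address-part tag."""
    if part == ":all":
        return address
    # Split at the last "@" so a quoted local-part containing "@" survives.
    local, at, domain = address.rpartition("@")
    if not at:
        # Assumption: an address without "@" is treated as local-part only.
        local, domain = address, ""
    if part == ":localpart":
        return local
    if part == ":domain":
        return domain
    raise ValueError("unknown address-part: " + part)
```

   With this sketch, address_part("tim@example.com", ":domain") yields
   "example.com", which is the string that a test using ":domain" would
   then compare against its keys.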

   The kind of comparison done, such as whether or not the test done is
   case-insensitive, is specified as a comparator argument to the test.

   If an optional address-part is omitted, the default is ":all".

   It is an error to give more than one of these arguments to a given
   command.

   For convenience, the "ADDRESS-PART" syntax element is defined here as
   follows:

   Syntax: ":localpart" / ":domain" / ":all"

2.8. Blocks

   Blocks are sets of commands enclosed within curly braces. Blocks are
   supplied to commands so that the commands can implement control
   commands.

   A control structure is a command that happens to take a test and a
   block as one of its arguments; depending on the result of the test
   supplied as another argument, it runs the code in the block some
   number of times.

   With the commands supplied in this memo, there are no loops. The
   control structures supplied--if, elsif, and else--run a block either
   once or not at all. So there are two arguments, the test and the
   block.

2.9. Commands

   Sieve scripts are sequences of commands. Commands can take any of
   the tokens above as arguments, and arguments may be either tagged or
   positional arguments. Not all commands take all arguments.

   There are three kinds of commands: test commands, action commands,
   and control commands.

   The simplest is an action command. An action command is an
   identifier followed by zero or more arguments, terminated by a
   semicolon. Action commands do not take tests or blocks as arguments.

   A control command is similar, but it takes a test as an argument, and
   ends with a block instead of a semicolon.

   A test command is used as part of a control command. It is used to
   specify whether or not the block of code given to the control command
   is executed.

2.10. Evaluation

2.10.1. Action Interaction

   Some actions cannot be used with other actions because the result
   would be absurd. These restrictions are noted throughout this memo.

   Extension actions MUST state how they interact with actions defined
   in this specification.

2.10.2. Implicit Keep

   Previous experience with filtering systems suggests that cases tend
   to be missed in scripts. To prevent errors, Sieve has an "implicit
   keep".

   An implicit keep is a keep action (see 4.4) performed in absence of
   any action that cancels the implicit keep.

   An implicit keep is performed if a message is not written to a
   mailbox, redirected to a new address, rejected, or explicitly thrown
   out. That is, if a fileinto, a keep, a redirect, a reject, or a
   discard is performed, an implicit keep is not.

   Some actions may be defined to not cancel the implicit keep. These
   actions may not directly affect the delivery of a message, and are
   used for their side effects. None of the actions specified in this
   document meets that criterion, but extension actions will.

   For instance, with any of the short messages offered above, the
   following script produces no actions.

   Example:  if size :over 500K { discard; }

   As a result, the implicit keep is taken.

2.10.3. Message Uniqueness in a Mailbox

   Implementations SHOULD NOT deliver a message to the same folder more
   than once, even if a script explicitly asks for a message to be
   written to a mailbox twice.

   The test for equality of two messages is implementation-defined.

   If a script asks for a message to be written to a mailbox twice, it
   MUST NOT be treated as an error.

2.10.4. Limits on Numbers of Actions

   Site policy MAY limit numbers of actions taken and MAY impose
   restrictions on which actions can be used together. In the event
   that a script hits a policy limit on the number of actions taken for
   a particular message, an error occurs.
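
   The implicit keep rule of section 2.10.2 and the policy limit
   described above can be sketched together in Python. This is a
   minimal, non-normative illustration: actions are modelled as plain
   name strings, and finalize_actions, SievePolicyError, and the
   max_actions parameter are hypothetical names, not part of this
   specification.

```python
# Actions defined in this document; each one cancels the implicit keep.
CANCELS_IMPLICIT_KEEP = {"fileinto", "keep", "redirect", "reject", "discard"}


class SievePolicyError(Exception):
    """Raised when a script exceeds a site-defined action limit."""


def finalize_actions(actions, max_actions=None):
    """Return the actions to perform, adding the implicit keep if needed."""
    if max_actions is not None and len(actions) > max_actions:
        raise SievePolicyError("too many actions for one message")
    result = list(actions)
    if not any(name in CANCELS_IMPLICIT_KEEP for name in result):
        # Nothing cancelled the implicit keep, so the message is kept.
        result.append("keep")
    return result
```

   A script that produces no actions therefore ends up with a single
   keep, while a script that only discards ends up with just the
   discard.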

   Implementations MUST prohibit more than one reject.

   Implementations MUST allow at least one keep or one fileinto. If
   fileinto is not implemented, implementations MUST allow at least one
   keep.

   Implementations SHOULD prohibit reject when used with other actions.

2.10.5. Extensions and Optional Features

   Because of the differing capabilities of many mail systems, several
   features of this specification are optional. Before any of these
   extensions can be executed, they must be declared with the "require"
   action.

   If an extension is not enabled with "require", implementations MUST
   treat it as if they did not support it at all.

   If a script does not understand an extension declared with require,
   the script must not be used at all. Implementations MUST NOT execute
   scripts which require unknown capability names.

   Note: The reason for this restriction is that prior experiences with
         languages such as LISP and Tcl suggest that this is a workable
         way of noting that a given script uses an extension.

         Experience with PostScript suggests that mechanisms that allow
         a script to work around missing extensions are not used in
         practice.

   Extensions which define actions MUST state how they interact with
   actions discussed in the base specification.

2.10.6. Errors

   In any programming language, there are compile-time and run-time
   errors.

   Compile-time errors are ones in syntax that are detectable if a
   syntax check is done.

   Run-time errors are not detectable until the script is run. This
   includes transient failures like disk full conditions, but also
   includes issues like invalid combinations of actions.

   When an error occurs in a Sieve script, all processing stops.

   Implementations MAY choose to do a full parse, then evaluate the
   script, then do all actions.
   Implementations might even go so far as
   to ensure that execution is atomic (either all actions are executed
   or none are executed).

   Other implementations may choose to parse and run at the same time.
   Such implementations are simpler, but have issues with partial
   failure (some actions happen, others don't).

   Implementations might even go so far as to ensure that scripts can
   never execute an invalid set of actions (e.g., reject + fileinto)
   before execution, although this could involve solving the Halting
   Problem.

   This specification allows any of these approaches. Solving the
   Halting Problem is considered extra credit.

   When an error happens, implementations MUST notify the user that an
   error occurred, which actions (if any) were taken, and do an implicit
   keep.

2.10.7. Limits on Execution

   Implementations may limit certain constructs. However, this
   specification places a lower bound on some of these limits.

   Implementations MUST support fifteen levels of nested blocks.

   Implementations MUST support fifteen levels of nested test lists.

3. Control Commands

   Control structures are needed to allow for multiple and conditional
   actions.

3.1. Control Structure If

   There are three pieces to if: "if", "elsif", and "else". Each is
   actually a separate command in terms of the grammar. However, an
   elsif or else MUST only follow an if or elsif. An error occurs if
   these conditions are not met.

   Syntax: if

   Syntax: elsif

   Syntax: else

   The semantics are similar to those of any of the many other
   programming languages these control commands appear in. When the
   interpreter sees an "if", it evaluates the test associated with it.
   If the test is true, it executes the block associated with it.

   If the test of the "if" is false, it evaluates the test of the first
   "elsif" (if any). If the test of "elsif" is true, it runs the
   elsif's block.
   An elsif may be followed by an elsif, in which case,
   the interpreter repeats this process until it runs out of elsifs.

   When the interpreter runs out of elsifs, there may be an "else" case.
   If there is, and none of the if or elsif tests were true, the
   interpreter runs the else case.

   This provides a way of performing exactly one of the blocks in the
   chain.

   In the following example, both Message A and B are dropped.

   Example:  require "fileinto";
             if header :contains "from" "coyote" {
                discard;
             } elsif header :contains ["subject"] ["$$$"] {
                discard;
             } else {
                fileinto "INBOX";
             }

   When the script below is run over message A, it redirects the message
   to acm@example.edu; message B, to postmaster@example.edu; any other
   message is redirected to field@example.edu.

   Example:  if header :contains ["From"] ["coyote"] {
                redirect "acm@example.edu";
             } elsif header :contains "Subject" "$$$" {
                redirect "postmaster@example.edu";
             } else {
                redirect "field@example.edu";
             }

   Note that this definition prohibits the "... else if ..." sequence
   used by C. This is intentional, because this construct produces a
   shift-reduce conflict.

3.2. Control Structure Require

   Syntax: require

   The require action notes that a script makes use of a certain
   extension. Such a declaration is required to use the extension, as
   discussed in section 2.10.5. Multiple capabilities can be declared
   with a single require.

   The require command, if present, MUST be used before anything other
   than a require can be used. An error occurs if a require appears
   after a command other than require.

   Example:  require ["fileinto", "reject"];

   Example:  require "fileinto";
             require "vacation";

3.3. Control Structure Stop

   Syntax: stop

   The "stop" action ends all processing. If no actions have been
   executed, then the keep action is taken.

4. Action Commands

   This document supplies five actions that may be taken on a message:
   keep, fileinto, redirect, reject, and discard.

   Implementations MUST support the "keep", "discard", and "redirect"
   actions.

   Implementations SHOULD support "reject" and "fileinto".

   Implementations MAY limit the number of certain actions taken (see
   section 2.10.4).

4.1. Action reject

   Syntax: reject

   The optional "reject" action refuses delivery of a message by sending
   back an [MDN] to the sender and cancels the implicit keep. It resends
   the message to the sender, wrapping it in a "reject" form, noting
   that it was rejected by the recipient. In the following script,
   message A is rejected and returned to the sender.

   Example:  if header :contains "from" "coyote@desert.example.org" {
                reject "I am not taking mail from you, and I don't want
                your birdseed, either!";
             }

   A reject message MUST take the form of a failure MDN as specified by
   [MDN]. The human-readable portion of the message, the first
   component of the MDN, contains the human readable message describing
   the error, and it SHOULD contain additional text alerting the
   original sender that mail was refused by a filter. This part of the
   MDN might appear as follows:

   ------------------------------------------------------------
   Message was refused by recipient's mail filtering program. Reason
   given was as follows:

   I am not taking mail from you, and I don't want your birdseed,
   either!
   ------------------------------------------------------------

   The MDN action-value field as defined in the MDN specification MUST
   be "deleted" and MUST have the MDN-sent-automatically and
   automatic-action modes set.

   Because some implementations cannot or will not implement the reject
   command, it is optional. The capability string to be used with the
   require command is "reject".

4.2. Action fileinto

   Syntax: fileinto

   The "fileinto" action delivers the message into the specified folder.
   Implementations SHOULD support fileinto, but in some environments
   this may be impossible.

   The capability string for use with the require command is "fileinto".

   In the following script, message A is filed into folder
   "INBOX.harassment".

   Example:  require "fileinto";
             if header :contains ["from"] "coyote" {
                fileinto "INBOX.harassment";
             }

4.3. Action redirect

   Syntax: redirect

   The "redirect" action is used to send the message to another user at
   a supplied address, as a mail forwarding feature does. The
   "redirect" action makes no changes to the message body or existing
   headers, but it may add new headers. The "redirect" modifies the
   envelope recipient.

   The redirect command performs an MTA-style "forward"--that is, what
   you get from a .forward file using sendmail under UNIX. The address
   on the [SMTP] envelope is replaced with the one on the redirect
   command and the message is sent back out. (This is not an MUA-style
   forward, which creates a new message with a different sender and
   message ID, wrapping the old message in a new one.)

   A simple script can be used for redirecting all mail:

   Example:  redirect "bart@example.edu";

   Implementations SHOULD take measures to implement loop control,
   possibly including adding headers to the message or counting received
   headers. If an implementation detects a loop, it causes an error.

4.4. Action keep

   Syntax: keep

   The "keep" action is whatever action is taken in lieu of all other
   actions, if no filtering happens at all; generally, this simply means
   to file the message into the user's main mailbox.
   This command
   provides a way to execute this action without needing to know the
   name of the user's main mailbox, providing a way to call it without
   needing to understand the user's setup, or the underlying mail
   system.

   For instance, in an implementation where the Internet Message Access
   Protocol (IMAP) server is running scripts on behalf of the user at
   time of delivery, a keep command is equivalent to a fileinto "INBOX".

   Example:  if size :under 1M { keep; } else { discard; }

   Note that the above script is identical to the one below.

   Example:  if not size :under 1M { discard; }

4.5. Action discard

   Syntax: discard

   Discard is used to silently throw away the message. It does so by
   simply canceling the implicit keep. If discard is used with other
   actions, the other actions still happen. Discard is compatible with
   all other actions. (For instance fileinto+discard is equivalent to
   fileinto.)

   Discard MUST be silent; that is, it MUST NOT return a non-delivery
   notification of any kind ([DSN], [MDN], or otherwise).

   In the following script, any mail from "idiot@example.edu" is thrown
   out.

   Example:  if header :contains ["from"] ["idiot@example.edu"] {
                discard;
             }

   While an important part of this language, "discard" has the potential
   to create serious problems for users: Students who leave themselves
   logged in to an unattended machine in a public computer lab may find
   their script changed to just "discard". In order to protect users in
   this situation (along with similar situations), implementations MAY
   keep messages destroyed by a script for an indefinite period, and MAY
   disallow scripts that throw out all mail.

5. Test Commands

   Tests are used in conditionals to decide which part(s) of the
   conditional to execute.

   Implementations MUST support these tests: "address", "allof",
   "anyof", "exists", "false", "header", "not", "size", and "true".

   Implementations SHOULD support the "envelope" test.

5.1. Test address

   Syntax: address [COMPARATOR] [ADDRESS-PART] [MATCH-TYPE]

   The address test matches Internet addresses in structured headers
   that contain addresses. It returns true if any header contains any
   key in the specified part of the address, as modified by the
   comparator and the match keyword. Whether there are other addresses
   present in the header doesn't affect this test; this test does not
   provide any way to determine whether an address is the only address
   in a header.

   Like envelope and header, this test returns true if any combination
   of the header-list and key-list arguments match.

   Internet email addresses [IMAIL] have the somewhat awkward
   characteristic that the local-part to the left of the at-sign is
   considered case sensitive, and the domain-part to the right of the
   at-sign is case insensitive. The "address" command does not deal
   with this itself, but provides the ADDRESS-PART argument for allowing
   users to deal with it.

   The address primitive never acts on the phrase part of an email
   address, nor on comments within that address. It also never acts on
   group names, although it does act on the addresses within the group
   construct.

   Implementations MUST restrict the address test to headers that
   contain addresses, but MUST include at least From, To, Cc, Bcc,
   Sender, Resent-From, Resent-To, and SHOULD include any other header
   that utilizes an "address-list" structured header body.

   Example:  if address :is :all "from" "tim@example.com" {
                discard;
             }

5.2. Test allof

   Syntax: allof

   The allof test performs a logical AND on the tests supplied to it.

   Example:  allof (false, false)  =>   false
             allof (false, true)   =>   false
             allof (true,  true)   =>   true

   The allof test takes as its argument a test-list.

5.3. Test anyof

   Syntax: anyof

   The anyof test performs a logical OR on the tests supplied to it.

   Example:  anyof (false, false)  =>   false
             anyof (false, true)   =>   true
             anyof (true,  true)   =>   true

5.4. Test envelope

   Syntax: envelope [COMPARATOR] [ADDRESS-PART] [MATCH-TYPE]

   The "envelope" test is true if the specified part of the SMTP (or
   equivalent) envelope matches the specified key.

   If one of the envelope-part strings is (case insensitive) "from",
   then matching occurs against the FROM address used in the SMTP MAIL
   command.

   If one of the envelope-part strings is (case insensitive) "to", then
   matching occurs against the TO address used in the SMTP RCPT command
   that resulted in this message getting delivered to this user. Note
   that only the most recent TO is available, and only the one relevant
   to this user.

   The envelope-part is a string list and may contain more than one
   parameter, in which case all of the strings specified in the key-list
   are matched against all parts given in the envelope-part list.

   Like address and header, this test returns true if any combination of
   the envelope-part and key-list arguments is true.

   All tests against envelopes MUST drop source routes.

   If the SMTP transaction involved several RCPT commands, only the data
   from the RCPT command that caused delivery to this user is available
   in the "to" part of the envelope.

   If a protocol other than SMTP is used for message transport,
   implementations are expected to adapt this command appropriately.

   The envelope command is optional. Implementations SHOULD support it,
   but the necessary information may not be available in all cases.

   Example:  require "envelope";
             if envelope :all :is "from" "tim@example.com" {
                discard;
             }

5.5. Test exists

   Syntax: exists

   The "exists" test is true if the headers listed in the header-names
   argument exist within the message. All of the headers must exist or
   the test is false.

   The following example throws out mail that doesn't have a From header
   and a Date header.

   Example:  if not exists ["From","Date"] {
                discard;
             }

5.6. Test false

   Syntax: false

   The "false" test always evaluates to false.

5.7. Test header

   Syntax: header [COMPARATOR] [MATCH-TYPE]

   The "header" test evaluates to true if any header name matches any
   key. The type of match is specified by the optional match argument,
   which defaults to ":is" if not specified, as specified in section
   2.6.

   Like address and envelope, this test returns true if any combination
   of the string-list and key-list arguments match.

   If a header listed in the header-names argument exists, it contains
   the null key (""). However, if the named header is not present, it
   does not contain the null key. So if a message contained the header

           X-Caffeine: C8H10N4O2

   these tests on that header evaluate as follows:

           header :is ["X-Caffeine"] [""]         => false
           header :contains ["X-Caffeine"] [""]   => true

5.8. Test not

   Syntax: not

   The "not" test takes some other test as an argument, and yields the
   opposite result. "not false" evaluates to "true" and "not true"
   evaluates to "false".

5.9. Test size

   Syntax: size <":over" / ":under">

   The "size" test deals with the size of a message. It takes either a
   tagged argument of ":over" or ":under", followed by a number
   representing the size of the message.

   If the argument is ":over", and the size of the message is greater
   than the number provided, the test is true; otherwise, it is false.

   If the argument is ":under", and the size of the message is less than
   the number provided, the test is true; otherwise, it is false.

   Exactly one of ":over" or ":under" must be specified, and anything
   else is an error.

   The size of a message is defined to be the number of octets from the
   initial header until the last character in the message body.

   Note that for a message that is exactly 4,000 octets, the message is
   neither ":over" 4000 octets nor ":under" 4000 octets.

5.10. Test true

   Syntax: true

   The "true" test always evaluates to true.

6. Extensibility

   New control structures, actions, and tests can be added to the
   language. Sites must make these features known to their users; this
   document does not define a way to discover the list of extensions
   supported by the server.

   Any extensions to this language MUST define a capability string that
   uniquely identifies that extension. If a new version of an extension
   changes the functionality of a previously defined extension, it MUST
   use a different name.

   In a situation where there is a submission protocol and an extension
   advertisement mechanism aware of the details of this language,
   scripts submitted can be checked against the mail server to prevent
   use of an extension that the server does not support.

   Extensions MUST state how they interact with constraints defined in
   section 2.10, e.g., whether they cancel the implicit keep, and which
   actions they are compatible and incompatible with.

6.1. Capability String

   Capability strings are typically short strings describing what
   capabilities are supported by the server.

   Capability strings beginning with "vnd." represent vendor-defined
   extensions. Such extensions are not defined by Internet standards or
   RFCs, but are still registered with IANA in order to prevent
   conflicts. Extensions starting with "vnd."
   SHOULD be followed by the
   name of the vendor and product, such as "vnd.acme.rocket-sled".

   The following capability strings are defined by this document:

   envelope    The string "envelope" indicates that the implementation
               supports the "envelope" command.

   fileinto    The string "fileinto" indicates that the implementation
               supports the "fileinto" command.

   reject      The string "reject" indicates that the implementation
               supports the "reject" command.

   comparator- The string "comparator-elbonia" is provided if the
               implementation supports the "elbonia" comparator.
               Therefore, all implementations have at least the
               "comparator-i;octet", "comparator-en;ascii-casemap",
               and "comparator-i;ascii-casemap" capabilities. However,
               these comparators may be used without being declared
               with require.

6.2. IANA Considerations

   In order to provide a standard set of extensions, a registry is
   provided by IANA. Capability names may be registered on a
   first-come, first-served basis. Extensions designed for
   interoperable use SHOULD be defined as standards track or IESG
   approved experimental RFCs.

6.2.1. Template for Capability Registrations

   The following template is to be used for registering new Sieve
   extensions with IANA.

   To: iana@iana.org
   Subject: Registration of new Sieve extension

   Capability name:
   Capability keyword:
   Capability arguments:
   Standards Track/IESG-approved experimental RFC number:
   Person and email address to contact for further information:

6.2.2. Initial Capability Registrations

   This RFC updates the following entries in the IANA registry for
   Sieve extensions.

   Capability name: fileinto
   Capability keyword: fileinto
   Capability arguments: fileinto
   Standards Track/IESG-approved experimental RFC number:
           This RFC (Sieve base spec)
   Person and email address to contact for further information:
           Tim Showalter
           tjs@mirapoint.com

   Capability name: reject
   Capability keyword: reject
   Capability arguments: reject
   Standards Track/IESG-approved experimental RFC number:
           This RFC (Sieve base spec)
   Person and email address to contact for further information:
           Tim Showalter
           tjs@mirapoint.com

   Capability name: envelope
   Capability keyword: envelope
   Capability arguments:
           envelope [COMPARATOR] [ADDRESS-PART] [MATCH-TYPE]
   Standards Track/IESG-approved experimental RFC number:
           This RFC (Sieve base spec)
   Person and email address to contact for further information:
           Tim Showalter
           tjs@mirapoint.com

   Capability name: comparator-*
   Capability keyword:
           comparator-* (anything starting with "comparator-")
   Capability arguments: (none)
   Standards Track/IESG-approved experimental RFC number:
           This RFC, Sieve, by reference to [COLLATION]
   Person and email address to contact for further information:
           Tim Showalter
           tjs@mirapoint.com

6.3. Capability Transport

   As the range of mail systems that this document is intended to apply
   to is quite varied, a method of advertising which capabilities an
   implementation supports is difficult due to the wide range of
   possible implementations. Such a mechanism, however, should have the
   property that the implementation can advertise the complete set of
   extensions that it supports.

7. Transmission

   The MIME type for a Sieve script is "application/sieve".

   The registration of this type for RFC 2048 requirements is updated as
   follows:

   Subject: Registration of MIME media type application/sieve

   MIME media type name: application
   MIME subtype name: sieve
   Required parameters: none
   Optional parameters: none
   Encoding considerations: Most sieve scripts will be textual,
      written in UTF-8. When non-7bit characters are used,
      quoted-printable is appropriate for transport systems
      that require 7bit encoding.
   Security considerations: Discussed in section 10 of this RFC.
   Interoperability considerations: Discussed in section 2.10.5
      of this RFC.
   Published specification: this RFC.
   Applications which use this media type: sieve-enabled mail servers
   Additional information:
     Magic number(s):
     File extension(s): .siv
     Macintosh File Type Code(s):
   Person & email address to contact for further information:
      See the discussion list at ietf-mta-filters@imc.org.
   Intended usage:
      COMMON
   Author/Change controller:
      See Author information in this RFC.

8. Parsing

   The Sieve grammar is separated into tokens and a separate grammar as
   most programming languages are.

8.1. Lexical Tokens

   Sieve scripts are encoded in UTF-8. The following assumes a valid
   UTF-8 encoding; special characters in Sieve scripts are all ASCII.

   The following are tokens in Sieve:

           - identifiers
           - tags
           - numbers
           - quoted strings
           - multi-line strings
           - other separators

   Blanks, horizontal tabs, CRLFs, and comments ("white space") are
   ignored except as they separate tokens. Some white space is required
   to separate otherwise adjacent tokens and in specific places in the
   multi-line strings. CR and LF can only appear in CRLF pairs.

   The other separators are single individual characters, and are
   mentioned explicitly in the grammar.
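
   The token classes listed above can be illustrated with a small
   scanner. The following Python sketch is non-normative and
   deliberately incomplete: it omits multi-line strings, is looser than
   the formal grammar inside quoted strings, and assumes the script is
   already decoded text with CRLF line endings. The names TOKEN_RE and
   tokenize are hypothetical.

```python
import re

# One alternative per token class; white space and comments are matched
# so that they can be skipped.
TOKEN_RE = re.compile(
    r'''
    (?P<ws>[ \t]+|\r\n)                        # blanks, tabs, CRLF pairs
  | (?P<hash_comment>\#[^\r\n]*\r\n)           # "#" up to the next CRLF
  | (?P<bracket_comment>/\*.*?\*/)             # "/*" up to the first "*/"
  | (?P<tag>:[A-Za-z_][A-Za-z0-9_]*)           # ":" followed by identifier
  | (?P<number>[0-9]+[KMGkmg]?)                # digits, optional quantifier
  | (?P<identifier>[A-Za-z_][A-Za-z0-9_]*)
  | (?P<quoted_string>"(?:[^"\\]|\\.)*")       # backslash escapes one char
  | (?P<separator>[\[\]{}(),;])                # single-character separators
    ''',
    re.VERBOSE | re.DOTALL,
)

SKIPPED = ("ws", "hash_comment", "bracket_comment")


def tokenize(script: str):
    """Return a list of (kind, text) pairs, dropping white space."""
    tokens = []
    pos = 0
    while pos < len(script):
        match = TOKEN_RE.match(script, pos)
        if match is None:
            # Includes a bare CR or LF, which may only appear as CRLF.
            raise SyntaxError("unexpected character at offset %d" % pos)
        if match.lastgroup not in SKIPPED:
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens
```

   The scanner keeps only the tokens that matter to the grammar, so a
   comment between two tokens separates them but does not itself appear
   in the result.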
1436 The lexical structure of sieve is defined in the following grammar 1437 (as described in [ABNF]): 1439 bracket-comment = "/*" *not-star 1*STAR 1440 *(not-star-slash *not-star 1*STAR) "/" 1441 ; No */ allowed inside a comment. 1442 ; (No * is allowed unless it is the last 1443 ; character, or unless it is followed by a 1444 ; character that isn't a slash.) 1446 STAR = "*" 1448 not-star = CRLF / %x01-09 / %x0b-0c / %x0e-29 / %x2b-7f / 1449 UTF8-2 / UTF8-3 / UTF8-4 1450 ; either a CRLF pair, OR a single UTF-8 character 1451 ; other than NUL, CR, LF, or star 1453 not-star-or-slash = CRLF / %x01-09 / %x0b-0c / %x0e-29 / %x2b-2e / 1454 %x30-7f / UTF8-2 / UTF8-3 / UTF8-4 1455 ; either a CRLF pair, OR a single UTF-8 character 1456 ; other than NUL, CR, LF, star, or slash 1458 UTF8-NOT-CRLF = %x01-09 / %x0b-0c / %x0e-7f / 1459 UTF8-2 / UTF8-3 / UTF8-4 1460 ; a single UTF-8 character other than NUL, 1461 ; CR, or LF 1463 UTF8-NOT-PERIOD = %x01-09 / %x0b-0c / %x0e-2d / %x2f-7f / 1464 UTF8-2 / UTF8-3 / UTF8-4 1465 ; a single UTF-8 character other than NUL, 1466 ; CR, LF, or period 1468 UTF8-NOT-NUL = %x01-7f / UTF8-2 / UTF8-3 / UTF8-4 1469 ; a single UTF-8 character other than NUL 1471 UTF8-NOT-QSPECIAL = %x01-09 / %x0b-0c / %x0e-21 / %x23-5b / 1472 %x5d-7f / UTF8-2 / UTF8-3 / UTF8-4 1473 ; a single UTF-8 character other than NUL, 1474 ; CR, LF, double-quote, or backslash 1476 comment = bracket-comment / hash-comment 1478 hash-comment = "#" *UTF8-NOT-CRLF CRLF 1480 identifier = (ALPHA / "_") *(ALPHA / DIGIT / "_") 1482 tag = ":" identifier 1484 number = 1*DIGIT [QUANTIFIER] 1486 QUANTIFIER = "K" / "M" / "G" 1488 quoted-safe = CRLF / UTF8-NOT-QSPECIAL 1489 ; either a CRLF pair, OR a single UTF-8 1490 ; character other than NUL, CR, LF, 1491 ; double-quote, or backslash 1493 quoted-special = "\" ( DQUOTE / "\" ) 1494 ; represents just a double-quote or backslash 1496 quoted-other = "\" UTF8-NOT-QSPECIAL 1497 ; represents just the UTF8-NOT-QSPECIAL 1498 ; character. 
                          ; SHOULD NOT be used

1500    quoted-text        = *(quoted-safe / quoted-special / quoted-other)

1502    quoted-string      = DQUOTE quoted-text DQUOTE

1504    multi-line         = "text:" *(SP / HTAB) (hash-comment / CRLF)
1505                         *(multiline-literal / multiline-dotstuff)
1506                         "." CRLF

1508    multiline-literal  = [UTF8-NOT-PERIOD *UTF8-NOT-CRLF] CRLF

1510    multiline-dotstuff = "." 1*UTF8-NOT-CRLF CRLF
1511                           ; A line containing only "." ends the
1512                           ; multi-line. Remove a leading '.' if
1513                           ; followed by another '.'.

1515    white-space        = 1*(SP / CRLF / HTAB) / comment

1517 8.2. Grammar

1519    The following is the grammar of Sieve after it has been lexically
1520    interpreted. No white space or comments appear below. The start
1521    symbol is "start".

1523    argument     = string-list / number / tag

1525    arguments    = *argument [test / test-list]

1527    block        = "{" commands "}"

1529    command      = identifier arguments ( ";" / block )

1531    commands     = *command

1533    start        = commands

1535    string       = quoted-string / multi-line
1536    string-list  = "[" string *("," string) "]" / string
1537                     ; if there is only a single string, the brackets
1538                     ; are optional

1540    test         = identifier arguments

1542    test-list    = "(" test *("," test) ")"

1544 9. Extended Example

1546    The following is an extended example of a Sieve script. Note that it
1547    does not make use of the implicit keep.

1549    #
1550    # Example Sieve Filter
1551    # Declare any optional features or extensions used by the script
1552    #
1553    require ["fileinto", "reject"];

1555    #
1556    # Reject any large messages (note that the four leading dots get
1557    # "stuffed" to three)
1558    #
1559    if size :over 1M
1560            {
1561            reject text:
1562    Please do not send me large attachments.
1563    Put your file on a server and send me the URL.
1564    Thank you.
1565    .... Fred
1566    .
1567    ;
1568            stop;
1569            }
1570    #
1571    # Handle messages from known mailing lists
1572    # Move messages from IETF filter discussion list to filter folder
1573    #
1574    if header :is "Sender" "owner-ietf-mta-filters@imc.org"
1575            {
1576            fileinto "filter";  # move to "filter" folder
1577            }
1578    #
1579    # Keep all messages to or from people in my company
1580    #
1581    elsif address :domain :is ["From", "To"] "example.com"
1582            {
1583            keep;               # keep in "In" folder
1584            }

1586    #
1587    # Try and catch unsolicited email. If a message is not to me,
1588    # or it contains a subject known to be spam, file it away.
1589    #
1590    elsif anyof (not address :all :contains
1591                   ["To", "Cc", "Bcc"] "me@example.com",
1592                 header :matches "subject"
1593                   ["*make*money*fast*", "*university*dipl*mas*"])
1594            {
1595            # If message header does not contain my address,
1596            # it's from a list.
1597            fileinto "spam";   # move to "spam" folder
1598            }
1599    else
1600            {
1601            # Move all other (non-company) mail to "personal"
1602            # folder.
1603            fileinto "personal";
1604            }

1606 10. Security Considerations

1608    Users must get their mail. It is imperative that whatever method
1609    implementations use to store the user-defined filtering scripts be
1610    secure.

1612    It is equally important that implementations sanity-check the user's
1613    scripts, and not allow users to create on-demand mailbombs. For
1614    instance, an implementation that allows a user to redirect a message
1615    multiple times might also allow a user to create a mailbomb triggered
1616    by mail from a specific user. Site- or implementation-defined limits
1617    on actions are useful for this.

1619    Several commands, such as "discard", "redirect", and "fileinto" allow
1620    for actions to be taken that are potentially very dangerous.

1622    Implementations SHOULD take measures to prevent languages from
1623    looping.

1625 11.
Acknowledgments

1627    This document has been revised in part based on comments and
1628    discussions that took place on and off the SIEVE mailing list.
1629    Thanks to Cyrus Daboo, Ned Freed, Kjetil Torgrim Homme, Barry Leiba,
1630    Mark E. Mallett, Alexey Melnikov, Rob Siemborski, and Nigel Swinson
1631    for reviews and suggestions.

1633    The editor gratefully acknowledges the extensive work of Tim
1634    Showalter as the author of RFC 3028.

1636 12. Editor's Addresses

1638    Philip Guenther
1639    Sendmail, Inc.
1640    6425 Christie St. Ste 400
1641    Emeryville, CA 94608

1643    Email: guenther@sendmail.com

1645 13. Normative References

1647    [ABNF]      Crocker, D. and P. Overell, "Augmented BNF for Syntax
1648                Specifications: ABNF", RFC 2234, November 1997.

1650    [COLLATION] Newman, C. and M. Duerst, "Internet Application Protocol
1651                Collation Registry", draft-newman-i18n-comparator-03.txt
1652                (work in progress), October 2004.

1654    [IMAIL]     P. Resnick, Ed., "Internet Message Format", RFC 2822,
1655                April 2001.

1657    [KEYWORDS]  Bradner, S., "Key words for use in RFCs to Indicate
1658                Requirement Levels", BCP 14, RFC 2119, March 1997.

1660    [MIME]      Freed, N. and N. Borenstein, "Multipurpose Internet Mail
1661                Extensions (MIME) Part One: Format of Internet Message
1662                Bodies", RFC 2045, November 1996.

1664    [MDN]       T. Hansen, Ed., G. Vaudreuil, Ed., "Message Disposition
1665                Notification", RFC 3798, May 2004.

1667    [SMTP]      J. Klensin, Ed., "Simple Mail Transfer Protocol", RFC
1668                2821, April 2001.

1670    [UTF-8]     Yergeau, F., "UTF-8, a transformation format of ISO
1671                10646", RFC 3629, November 2003.

1673 14. Informative References

1675    [BINARY-SI] "Standard IEC 60027-2: Letter symbols to be used in
1676                electrical technology - Part 2: Telecommunications and
1677                electronics", January 1999.

1679    [DSN]       Moore, K. and G. Vaudreuil, "An Extensible Message Format
1680                for Delivery Status Notifications", RFC 1894, January
1681                1996.

1683    [FLAMES]    Borenstein, N. and C.
Thyberg, "Power, Ease of Use, and
1684                Cooperative Work in a Practical Multimedia Message
1685                System", Int. J. of Man-Machine Studies, April, 1991.
1686                Reprinted in Computer-Supported Cooperative Work and
1687                Groupware, Saul Greenberg, editor, Harcourt Brace
1688                Jovanovich, 1991. Reprinted in Readings in Groupware and
1689                Computer-Supported Cooperative Work, Ronald Baecker,
1690                editor, Morgan Kaufmann, 1993.

1692    [IMAP]      Crispin, M., "Internet Message Access Protocol - version
1693                4rev1", RFC 3501, March 2003.

1695 15. Full Copyright Statement

1697    Copyright (C) The Internet Society (2005).

1699    This document is subject to the rights, licenses and restrictions
1700    contained in BCP 78, and except as set forth therein, the authors
1701    retain all their rights.

1703    This document and the information contained herein are provided on an
1704    "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
1705    OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
1706    ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
1707    INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
1708    INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
1709    WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

1711 Intellectual Property

1713    The IETF takes no position regarding the validity or scope of any
1714    Intellectual Property Rights or other rights that might be claimed to
1715    pertain to the implementation or use of the technology described in
1716    this document or the extent to which any license under such rights
1717    might or might not be available; nor does it represent that it has
1718    made any independent effort to identify any such rights. Information
1719    on the procedures with respect to rights in RFC documents can be
1720    found in BCP 78 and BCP 79.
1722    Copies of IPR disclosures made to the IETF Secretariat and any
1723    assurances of licenses to be made available, or the result of an
1724    attempt made to obtain a general license or permission for the use of
1725    such proprietary rights by implementers or users of this
1726    specification can be obtained from the IETF on-line IPR repository at
1727    http://www.ietf.org/ipr.

1729    The IETF invites any interested party to bring to its attention any
1730    copyrights, patents or patent applications, or other proprietary
1731    rights that may cover technology that may be required to implement
1732    this standard. Please address the information to the IETF at
1733    ietf-ipr@ietf.org.

1735 Acknowledgement

1737    Funding for the RFC Editor function is currently provided by the
1738    Internet Society.

1740 Appendix A. Change History

1742 Open Issues:
1743  - pull 'reject' out of this doc? Whether or not that happens, the
1744    security considerations for 'reject' need to be updated to
1745    document the "joe job" problem
1746  - should 'redirect' provide an argument for specifying the envelope sender

1748 Changes from draft-ietf-sieve-3028bis-00.txt
1749 1.  More grammar corrections:
1750     - permit /***/,
1751     - remove ambiguity in finding end of bracket comment,
1752     - require valid UTF-8,
1753     - express quoting in the grammar
1754     - ban bare CR and LF in all locations
1755 2.  Correct a bunch of whitespace and linewrapping nits
1756 3.  Update IMAIL and SMTP references to RFC 2822 and RFC 2821
1757 4.  Require support for en;ascii-casemap comparator as well as the
1758     old i;ascii-casemap. As with the old one, you do not need to
1759     use 'require' to use the new comparator.
1760 5.  Update IANA considerations to update the existing registrations
1761     to point at this doc instead of 3028.
1762 6.  Scripts SHOULD NOT contain superfluous backslashes
1763 7.  Update Acknowledgments

1765 Changes from RFC 3028
1766 1.  Split references into normative and informative
1767 2.
Update references to current versions of DSN, IMAP, MDN, and
1768     UTF-8 RFCs
1769 3.  Replace "e-mail" with "email"
1770 4.  Incorporate RFC 3028 errata
1771 5.  The "reject" action cancels the implicit keep
1772 6.  Replace references to ACAP with references to the i18n-comparator
1773     draft. Further work is needed to completely sync with that draft.
1774 7.  Start to update grammar to only permit legal UTF-8 (incomplete)
1775     and correct various other errors and typos
1776 8.  Update IPR boilerplate to RFC 3978/3979