idnits 2.17.1 draft-ietf-sieve-3028bis-04.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** It looks like you're using RFC 3978 boilerplate. You should update this to the boilerplate described in the IETF Trust License Policy document (see https://trustee.ietf.org/license-info), which is required now. -- Found old boilerplate from RFC 3978, Section 5.1 on line 16. -- Found old boilerplate from RFC 3978, Section 5.5 on line 1654. -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 1665. -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 1672. -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 1678. ** Found boilerplate matching RFC 3978, Section 5.4, paragraph 1 (on line 1642), which is fine, but *also* found old RFC 2026, Section 10.4C, paragraph 1 text on line 42. ** This document has an original RFC 3978 Section 5.4 Copyright Line, instead of the newer IETF Trust Copyright according to RFC 4748. ** This document has an original RFC 3978 Section 5.5 Disclaimer, instead of the newer disclaimer which includes the IETF Trust according to RFC 4748. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- == No 'Intended status' indicated for this document; assuming Proposed Standard Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- -- The draft header indicates that this document obsoletes RFC3028, but the abstract doesn't seem to mention this, which it should. 
Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not match the current year == The document seems to lack the recommended RFC 2119 boilerplate, even if it appears to use RFC 2119 keywords -- however, there's a paragraph with a matching beginning. Boilerplate error? (The document does seem to have the reference to RFC 2119 which the ID-Checklist requires). -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (July 2005) is 6853 days in the past. Is this intentional? -- Found something which looks like a code comment -- if you have code sections in the document, please surround them with '<CODE BEGINS>' and '<CODE ENDS>' lines. Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Missing Reference: 'COMPARATOR' is mentioned on line 1295, but not defined == Missing Reference: 'ADDRESS-PART' is mentioned on line 1295, but not defined == Missing Reference: 'MATCH-TYPE' is mentioned on line 1295, but not defined == Missing Reference: 'QUANTIFIER' is mentioned on line 1432, but not defined == Unused Reference: 'MIME' is defined on line 1601, but no explicit reference was found in the text == Unused Reference: 'IMAP' is defined on line 1637, but no explicit reference was found in the text ** Obsolete normative reference: RFC 2234 (ref. 
'ABNF') (Obsoleted by RFC 4234) == Outdated reference: A later version (-14) exists of draft-newman-i18n-comparator-04 ** Obsolete normative reference: RFC 2822 (ref. 'IMAIL') (Obsoleted by RFC 5322) ** Obsolete normative reference: RFC 3798 (ref. 'MDN') (Obsoleted by RFC 8098) ** Obsolete normative reference: RFC 2821 (ref. 'SMTP') (Obsoleted by RFC 5321) -- Obsolete informational reference (is this intentional?): RFC 1894 (ref. 'DSN') (Obsoleted by RFC 3464) -- Obsolete informational reference (is this intentional?): RFC 3501 (ref. 'IMAP') (Obsoleted by RFC 9051) Summary: 8 errors (**), 0 flaws (~~), 10 warnings (==), 11 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Network Working Group P. Guenther 3 Internet-Draft Sendmail, Inc. 4 Expires: January 2006 T. Showalter 5 Obsoletes: 3028 (if approved) Editors 6 July 2005 8 Sieve: An Email Filtering Language 9 draft-ietf-sieve-3028bis-04.txt 11 Status of this Memo 13 By submitting this Internet-Draft, each author represents that any 14 applicable patent or other IPR claims of which he or she is aware 15 have been or will be disclosed, and any of which he or she becomes 16 aware will be disclosed, in accordance with Section 6 of BCP 79. 18 Internet-Drafts are working documents of the Internet Engineering 19 Task Force (IETF), its areas, and its working groups. Note that 20 other groups may also distribute working documents as Internet- 21 Drafts. 23 Internet-Drafts are draft documents valid for a maximum of six months 24 and may be updated, replaced, or obsoleted by other documents at any 25 time. It is inappropriate to use Internet-Drafts as reference 26 material or to cite them other than as "work in progress." 
28 The list of current Internet-Drafts can be accessed at 29 http://www.ietf.org/ietf/1id-abstracts.txt 31 The list of Internet-Draft Shadow Directories can be accessed at 32 http://www.ietf.org/shadow.html. 34 A revised version of this draft document will be submitted to the RFC 35 editor as a Standards Track RFC for the Internet Community. 36 Discussion and suggestions for improvement are requested, and should 37 be sent to ietf-mta-filters@imc.org. Distribution of this memo is 38 unlimited. 40 Copyright Notice 42 Copyright (C) The Internet Society (2005). All Rights Reserved. 44 Abstract 46 This document describes a language for filtering email messages at 47 time of final delivery. It is designed to be implementable on either 48 a mail client or mail server. It is meant to be extensible, simple, 49 and independent of access protocol, mail architecture, and operating 50 system. It is suitable for running on a mail server where users may 51 not be allowed to execute arbitrary programs, such as on black box 52 Internet Message Access Protocol (IMAP) servers, as it has no 53 variables, loops, or ability to shell out to external programs.
55 Table of Contents
57 1. Introduction
58 1.1. Conventions Used in This Document
59 1.2. Example mail messages
60 2. Design
61 2.1. Form of the Language
62 2.2. Whitespace
63 2.3. Comments
64 2.4. Literal Data
65 2.4.1. Numbers
66 2.4.2. Strings
67 2.4.2.1. String Lists
68 2.4.2.2. Headers
69 2.4.2.3. Addresses
70 2.5. Tests
71 2.5.1. Test Lists
72 2.6. Arguments
73 2.6.1. Positional Arguments
74 2.6.2. Tagged Arguments
75 2.6.3. Optional Arguments
76 2.6.4. Types of Arguments
77 2.7. String Comparison
78 2.7.1. Match Type
79 2.7.2. Comparisons Across Character Sets
80 2.7.3. Comparators
81 2.7.4. Comparisons Against Addresses
82 2.8. Blocks
83 2.9. Commands
84 2.10. Evaluation
85 2.10.1. Action Interaction
86 2.10.2. Implicit Keep
87 2.10.3. Message Uniqueness in a Mailbox
88 2.10.4. Limits on Numbers of Actions
89 2.10.5. Extensions and Optional Features
90 2.10.6. Errors
91 2.10.7. Limits on Execution
92 3. Control Commands
93 3.1. Control If
94 3.2. Control Require
95 3.3. Control Stop
96 4. Action Commands
97 4.1. Action fileinto
98 4.2. Action redirect
99 4.3. Action keep
100 4.4. Action discard
101 5. Test Commands
102 5.1. Test address
103 5.2. Test allof
104 5.3. Test anyof
105 5.4. Test envelope
106 5.5. Test exists
107 5.6. Test false
108 5.7. Test header
109 5.8. Test not
110 5.9. Test size
111 5.10. Test true
112 6. Extensibility
113 6.1. Capability String
114 6.2. IANA Considerations
115 6.2.1. Template for Capability Registrations
116 6.2.2. Initial Capability Registrations
117 6.3. Capability Transport
118 7. Transmission
119 8. Parsing
120 8.1. Lexical Tokens
121 8.2. Grammar
122 9. Extended Example
123 10. Security Considerations
124 11. Acknowledgments
125 12. Editor's Address
126 13. Normative References
127 14. Informative References
128 14. Full Copyright Statement
130 1. Introduction 132 This memo documents a language that can be used to create filters for 133 electronic mail. It is not tied to any particular operating system 134 or mail architecture. It requires the use of [IMAIL]-compliant 135 messages, but should otherwise generalize to many systems. 137 The language is powerful enough to be useful but limited in order to 138 allow for a safe server-side filtering system. The intention is to 139 make it impossible for users to do anything more complex (and 140 dangerous) than write simple mail filters, along with facilitating 141 the use of GUIs for filter creation and manipulation. The language 142 is not Turing-complete: it provides no way to write a loop or a 143 function and variables are not provided. 145 Scripts written in Sieve are executed during final delivery, when the 146 message is moved to the user-accessible mailbox. In systems where 147 the MTA does final delivery, such as traditional Unix mail, it is 148 reasonable to sort when the MTA deposits mail into the user's 149 mailbox. 151 There are a number of reasons to use a filtering system. Mail 152 traffic for most users has been increasing due to increased usage of 153 email, the emergence of unsolicited email as a form of advertising, 154 and increased usage of mailing lists. 156 Experience at Carnegie Mellon has shown that if a filtering system is 157 made available to users, many will make use of it in order to file 158 messages from specific users or mailing lists. However, many others 159 did not make use of the Andrew system's FLAMES filtering language 160 [FLAMES] due to difficulty in setting it up. 
162 Because of the expectation that users will make use of filtering if 163 it is offered and easy to use, this language has been made simple 164 enough to allow many users to make use of it, but rich enough that it 165 can be used productively. However, it is expected that GUI-based 166 editors will be the preferred way of editing filters for a large 167 number of users. 169 1.1. Conventions Used in This Document 171 In the sections of this document that discuss the requirements of 172 various keywords and operators, the following conventions have been 173 adopted. 175 The key words "MUST", "MUST NOT", "SHOULD", "SHOULD NOT", and "MAY" 176 in this document are to be interpreted as defined in [KEYWORDS]. 178 Each section on a command (test, action, or control) has a line 179 labeled "Syntax:". This line describes the syntax of the command, 180 including its name and its arguments. Required arguments are listed 181 inside angle brackets ("<" and ">"). Optional arguments are listed 182 inside square brackets ("[" and "]"). Each argument is followed by 183 its type, so "<key: string>" represents an argument called "key" that 184 is a string. Literal strings are represented with double-quoted 185 strings. Alternatives are separated with slashes, and parentheses 186 are used for grouping, similar to [ABNF]. 188 In the "Syntax" line, there are three special pieces of syntax that 189 are frequently repeated, MATCH-TYPE, COMPARATOR, and ADDRESS-PART. 190 These are discussed in sections 2.7.1, 2.7.3, and 2.7.4, 191 respectively. 193 The formal grammar for these commands is given in section 8.2 and is the 194 authoritative reference on how to construct commands, but the formal 195 grammar does not specify the order, semantics, number or types of 196 arguments to commands, nor the legal command names. The intent is to 197 allow for extension without changing the grammar. 199 1.2. Example mail messages 201 The following mail messages will be used throughout this document in 202 examples. 
204 Message A 205 ----------------------------------------------------------- 206 Date: Tue, 1 Apr 1997 09:06:31 -0800 (PST) 207 From: coyote@desert.example.org 208 To: roadrunner@acme.example.com 209 Subject: I have a present for you 211 Look, I'm sorry about the whole anvil thing, and I really 212 didn't mean to try and drop it on you from the top of the 213 cliff. I want to try to make it up to you. I've got some 214 great birdseed over here at my place--top of the line 215 stuff--and if you come by, I'll have it all wrapped up 216 for you. I'm really sorry for all the problems I've caused 217 for you over the years, but I know we can work this out. 218 -- 219 Wile E. Coyote "Super Genius" coyote@desert.example.org 220 ----------------------------------------------------------- 222 Message B 223 ----------------------------------------------------------- 224 From: youcouldberich!@reply-by-postal-mail.invalid 225 Sender: b1ff@de.res.example.com 226 To: rube@landru.example.edu 227 Date: Mon, 31 Mar 1997 18:26:10 -0800 228 Subject: $$$ YOU, TOO, CAN BE A MILLIONAIRE! $$$ 230 YOU MAY HAVE ALREADY WON TEN MILLION DOLLARS, BUT I DOUBT 231 IT! SO JUST POST THIS TO SIX HUNDRED NEWSGROUPS! IT WILL 232 GUARANTEE THAT YOU GET AT LEAST FIVE RESPONSES WITH MONEY! 233 MONEY! MONEY! COLD HARD CASH! YOU WILL RECEIVE OVER 234 $20,000 IN LESS THAN TWO MONTHS! AND IT'S LEGAL!!!!!!!!! 235 !!!!!!!!!!!!!!!!!!111111111!!!!!!!11111111111!!1 JUST 236 SEND $5 IN SMALL, UNMARKED BILLS TO THE ADDRESSES BELOW! 237 ----------------------------------------------------------- 239 2. Design 240 2.1. Form of the Language 242 The language consists of a set of commands. Each command consists of 243 a set of tokens delimited by whitespace. The command identifier is 244 the first token and it is followed by zero or more argument tokens. 245 Arguments may be literal data, tags, blocks of commands, or test 246 commands. 248 The language is represented in UTF-8, as specified in [UTF-8]. 
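As an informal illustration of this token structure (a sketch, not part of the grammar), the script below applies one control command to Message B from section 1.2. The command identifier is "if"; its arguments are a test command ("header", which takes a tag and two strings as its own arguments) and a block holding a single action command. The choice of header and key string is only an example.

```sieve
# Command identifier "if", then a test command, then a block of commands.
if header :contains "Subject" "MILLIONAIRE" {
    discard;
}
```

Against Message B the test is true, so the block is executed; against Message A the test is false and the block is skipped.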
250 Tokens in the US-ASCII range are considered case-insensitive. 252 2.2. Whitespace 254 Whitespace is used to separate tokens. Whitespace is made up of 255 tabs, newlines (CRLF, never just CR or LF), and the space character. 256 The amount of whitespace used is not significant. 258 2.3. Comments 260 Two types of comments are offered. Comments are semantically 261 equivalent to whitespace and can be used anyplace that whitespace is 262 (with one exception in multi-line strings, as described in the 263 grammar). 265 Hash comments begin with a "#" character that is not contained within 266 a string and continue until the next CRLF. 268 Example: if size :over 100K { # this is a comment 269 discard; 270 } 272 Bracketed comments begin with the token "/*" and end with "*/" 273 outside of a string. Bracketed comments may span multiple lines. 274 Bracketed comments do not nest. 276 Example: if size :over 100K { /* this is a comment 277 this is still a comment */ discard /* this is a comment 278 */ ; 279 } 281 2.4. Literal Data 283 Literal data means data that is not executed, merely evaluated "as 284 is", to be used as arguments to commands. Literal data is limited to 285 numbers and strings. 287 2.4.1. Numbers 288 Numbers are given as ordinary decimal numbers. However, those 289 numbers that have a tendency to be fairly large, such as message 290 sizes, MAY have a "K", "M", or "G" appended to indicate a multiple of 291 a power of two. To be comparable with the power-of-two-based 292 versions of SI units that computers frequently use, K specifies 293 kibi-, or 1,024 (2^10) times the value of the number; M specifies 294 mebi-, or 1,048,576 (2^20) times the value of the number; and G 295 specifies gibi-, or 1,073,741,824 (2^30) times the value of the 296 number [BINARY-SI]. 298 Implementations MUST provide 31 bits of magnitude in numbers, but MAY 299 provide more. 301 Only positive integers are permitted by this specification. 303 2.4.2. 
Strings 305 Scripts involve large numbers of strings as they are used for pattern 306 matching, addresses, textual bodies, etc. Typically, short quoted 307 strings suffice for most uses, but a more convenient form is provided 308 for longer strings such as bodies of messages. 310 A quoted string starts and ends with a single double quote (the <"> 311 character, US-ASCII 34). A backslash ("\", US-ASCII 92) inside of a 312 quoted string is followed by either another backslash or a double 313 quote. This two-character sequence represents a single backslash or 314 double-quote within the string, respectively. 316 Scripts SHOULD NOT escape other characters with a backslash. 318 An undefined escape sequence (such as "\a" in a context where "a" has 319 no special meaning) is interpreted as if there were no backslash (in 320 this case, "\a" is just "a"). 322 Non-printing characters such as tabs, CR and LF, and control 323 characters are permitted in quoted strings. Quoted strings MAY span 324 multiple lines. NUL (US-ASCII 0) is not allowed in strings. 326 For entering larger amounts of text, such as an email message, a 327 multi-line form is allowed. It starts with the keyword "text:", 328 followed by a CRLF, and ends with the sequence of a CRLF, a single 329 period, and another CRLF. In order to allow the message to contain 330 lines with a single dot, lines are dot-stuffed. That is, when 331 composing a message body, an extra `.' is added before each line 332 which begins with a `.'. When the server interprets the script, 333 these extra dots are removed. Note that a line that begins with a 334 dot followed by a non-dot character is not interpreted as dot-stuffed; 335 that is, ".foo" is interpreted as ".foo". However, because this is 336 potentially ambiguous, scripts SHOULD be properly dot-stuffed so such 337 lines do not appear. 339 Note that a hash comment or whitespace may occur in between the 340 "text:" and the CRLF, but not within the string itself. 
Bracketed 341 comments are not allowed here. 343 2.4.2.1. String Lists 345 When matching patterns, it is frequently convenient to match against 346 groups of strings instead of single strings. For this reason, a list 347 of strings is allowed in many tests, implying that if the test is 348 true using any one of the strings, then the test is true. 349 Implementations are encouraged to use short-circuit evaluation in 350 these cases. 352 For instance, the test `header :contains ["To", "Cc"] 353 ["me@example.com", "me00@landru.example.edu"]' is true if either the 354 To header or Cc header of the input message contains either of the 355 email addresses "me@example.com" or "me00@landru.example.edu". 357 Conversely, in any case where a list of strings is appropriate, a 358 single string is allowed without being a member of a list: it is 359 equivalent to a list with a single member. This means that the test 360 `exists "To"' is equivalent to the test `exists ["To"]'. 362 2.4.2.2. Headers 364 Headers are a subset of strings. In the Internet Message 365 Specification [IMAIL], each header line is allowed to have whitespace 366 nearly anywhere in the line, including after the field name and 367 before the subsequent colon. Extra spaces between the header name 368 and the ":" in a header field are ignored. 370 A header name never contains a colon. The "From" header refers to a 371 line beginning "From:" (or "From :", etc.). No header will match 372 the string "From:" due to the trailing colon. 374 Similarly, syntactically invalid header names cause the same result as 375 syntactically valid header names that are not present in the message. 376 In particular, an implementation MUST NOT cause an error for 377 syntactically invalid header names in tests. 379 Header lines are unfolded as described in [IMAIL] section 2.2.3. 380 Interpretation of header data SHOULD be done according to [MIME3] 381 section 6.2 (see 2.7.2 below for details). 383 2.4.2.3. 
Addresses 384 A number of commands call for email addresses, which are also a 385 subset of strings. When these addresses are used in outbound 386 contexts, addresses must be compliant with [IMAIL], but are further 387 constrained. Using the symbols defined in [IMAIL], section 3, the 388 syntax of an address is: 390 sieve-address = addr-spec ; simple address 391 / phrase "<" addr-spec ">" ; name & addr-spec 393 That is, routes and group syntax are not permitted. If multiple 394 addresses are required, use a string list. Named groups are not used 395 here. 397 Implementations MUST ensure that the addresses are syntactically 398 valid, but need not ensure that they actually identify an email 399 recipient. 401 2.5. Tests 403 Tests are given as arguments to commands in order to control their 404 actions. In this document, tests are given to if/elsif/else to 405 decide which block of code is run. 407 2.5.1. Test Lists 409 Some tests ("allof" and "anyof", which implement logical "and" and 410 logical "or", respectively) may require more than a single test as an 411 argument. The test-list syntax element provides a way of grouping 412 tests. 414 Example: if anyof (not exists ["From", "Date"], 415 header :contains "from" "fool@example.edu") { 416 discard; 417 } 419 2.6. Arguments 421 In order to specify what to do, most commands take arguments. There 422 are three types of arguments: positional, tagged, and optional. 424 2.6.1. Positional Arguments 426 Positional arguments are given to a command which discerns their 427 meaning based on their order. When a command takes positional 428 arguments, all positional arguments must be supplied and must be in 429 the order prescribed. 431 2.6.2. Tagged Arguments 432 This document provides for tagged arguments in the style of 433 CommonLISP. These are also similar to flags given to commands in 434 most command-line systems. 
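The comparison with command-line flags can be made concrete with a small sketch. In the test below, ":is" and ":comparator" are tagged arguments (the latter consuming the string that follows it), while "From" and the address are positional arguments whose meaning comes from their order. The particular test and values are chosen only for illustration, against Message A in section 1.2.

```sieve
# Tagged arguments come first (in either order), then positional arguments.
if address :is :comparator "i;octet" "From" "coyote@desert.example.org" {
    keep;
}
```

Swapping the two tagged arguments gives an equivalent test; swapping the two positional strings does not, because the first is taken as the header name and the second as the key.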
436 A tagged argument is an argument for a command that begins with ":" 437 followed by a tag naming the argument, such as ":contains". This 438 argument means that zero or more of the next tokens have some 439 particular meaning depending on the argument. These next tokens may 440 be numbers or strings but they are never blocks. 442 Tagged arguments are similar to positional arguments, except that 443 instead of the meaning being derived from the command, it is derived 444 from the tag. 446 Tagged arguments must appear before positional arguments, but they 447 may appear in any order with other tagged arguments. For simplicity 448 of the specification, this is not expressed in the syntax definitions 449 with commands, but they still may be reordered arbitrarily provided 450 they appear before positional arguments. Tagged arguments may be 451 mixed with optional arguments. 453 To simplify this specification, tagged arguments SHOULD NOT take 454 tagged arguments as arguments. 456 2.6.3. Optional Arguments 458 Optional arguments are exactly like tagged arguments except that they 459 may be left out, in which case a default value is implied. Because 460 optional arguments tend to result in shorter scripts, they have been 461 used far more than tagged arguments. 463 One particularly noteworthy case is the ":comparator" argument, which 464 allows the user to specify which comparator [COLLATION] will be used 465 to compare two strings, since different languages may impose 466 different orderings on UTF-8 [UTF-8] characters. 468 2.6.4. Types of Arguments 470 Abstractly, arguments may be literal data, tests, or blocks of 471 commands. In this way, an "if" control structure is merely a command 472 that happens to take a test and a block as arguments and may execute 473 the block of code. 475 However, this abstraction is ambiguous from a parsing standpoint. 
476 The grammar in section 8.2 presents a parsable version of this: 477 Arguments are string-lists, numbers, and tags, which may be followed 478 by a test or a test-list, which may be followed by a block of 479 commands. No more than one test or test-list, nor more than one 480 block of commands, may be used, and commands that end with a block of 481 commands do not end with semicolons. 483 2.7. String Comparison 485 When matching one string against another, there are a number of ways 486 of performing the match operation. These are accomplished with three 487 types of matches: an exact match, a substring match, and a wildcard 488 glob-style match. These are described below. 490 In order to provide for matches between character sets and case 491 insensitivity, Sieve uses the comparators defined in the Internet 492 Application Protocol Collation Registry [COLLATION]. 494 However, when a string represents the name of a header, the 495 comparator is never user-specified. Header comparisons are always 496 done with the "en;ascii-casemap" operator, i.e., case-insensitive 497 comparisons, because this is the way things are defined in the 498 message specification [IMAIL]. 500 2.7.1. Match Type 502 There are three match types describing the matching used in this 503 specification: ":is", ":contains", and ":matches". Match type 504 arguments are supplied to those commands which allow them to specify 505 what kind of match is to be performed. 507 These are used as tagged arguments to tests that perform string 508 comparison. 510 The ":contains" match type describes a substring match. If the value 511 argument contains the key argument as a substring, the match is true. 512 For instance, the string "frobnitzm" contains "frob" and "nit", but 513 not "fbm". The empty key ("") is contained in all values. 515 The ":is" match type describes an absolute match; if the contents of 516 the first string are absolutely the same as the contents of the 517 second string, they match. 
Only the string "frobnitzm" is the string 518 "frobnitzm". The empty key ":is" and only ":is" the empty value. 520 The ":matches" match type specifies a wildcard match using the 521 characters "*" and "?"; the entire value must be matched. "*" 522 matches zero or more characters, and "?" matches a single character. 523 "?" and "*" may be escaped as "\\?" and "\\*" in strings to match 524 against themselves. The first backslash escapes the second 525 backslash; together, they escape the "*". This is awkward, but it is 526 commonplace in several programming languages that use globs and 527 regular expressions. 529 In order to specify what type of match is supposed to happen, 530 commands that support matching take optional tagged arguments 531 ":matches", ":is", and ":contains". Commands default to using ":is" 532 matching if no match type argument is supplied. Note that these 533 modifiers may interact with comparators; in particular, some 534 comparators are not suitable for matching with ":contains" or 535 ":matches". It is an error to use a comparator with ":contains" or 536 ":matches" that is not compatible with it. 538 It is an error to give more than one of these arguments to a given 539 command. 541 For convenience, the "MATCH-TYPE" syntax element is defined here as 542 follows: 544 Syntax: ":is" / ":contains" / ":matches" 546 2.7.2. Comparisons Across Character Sets 548 All Sieve scripts are represented in UTF-8, but messages may involve 549 a number of character sets. In order for comparisons to work across 550 character sets, implementations SHOULD implement the following 551 behavior: 553 Comparisons are performed in Unicode. Implementations convert 554 text from header fields in all charsets [MIME3] to Unicode as 555 input to the comparator (see 2.7.3). Implementations MUST be 556 capable of converting US-ASCII, ISO-8859-1, the US-ASCII subset of 557 ISO-8859-* character sets, and UTF-8. 
Text that the 558 implementation cannot convert to Unicode for any reason MAY be 559 treated as plain US-ASCII (including any [MIME3] syntax) or 560 processed according to local conventions. An encoded NUL octet 561 (character zero) SHOULD NOT cause early termination of the header 562 content being compared against. 564 If implementations fail to support the above behavior, they MUST 565 conform to the following: 567 No two strings can be considered equal if one contains octets 568 greater than 127. 570 2.7.3. Comparators 572 In order to allow for language-independent, case-independent matches, 573 the match type may be coupled with a comparator name. The Internet 574 Application Protocol Collation Registry [COLLATION] provides the 575 framework for describing and naming comparators as used by this 576 specification. 578 While multiple comparator types are defined, only equality types are 579 used in this specification. 581 All implementations MUST support the "i;octet" comparator (simply 582 compares octets), the "en;ascii-casemap" comparator (which treats 583 uppercase and lowercase characters in the US-ASCII subset of UTF-8 as 584 the same), as well as the "i;ascii-casemap" comparator, which is a 585 deprecated synonym for "en;ascii-casemap". If left unspecified, the 586 default is "en;ascii-casemap". 588 Some comparators may not be usable with substring matches; that is, 589 they may only work with ":is". It is an error to try and use a 590 comparator with ":matches" or ":contains" that is not compatible with 591 it. 593 A comparator is specified by the ":comparator" option with commands 594 that support matching. This option is followed by a string providing 595 the name of the comparator to be used. 
For convenience, the syntax 596 of a comparator is abbreviated to "COMPARATOR", and (repeated in 597 several tests) is as follows: 599 Syntax: ":comparator" 601 So in this example, 603 Example: if header :contains :comparator "i;octet" "Subject" 604 "MAKE MONEY FAST" { 605 discard; 606 } 608 would discard any message with subjects like "You can MAKE MONEY 609 FAST", but not "You can Make Money Fast", since the comparator used 610 is case-sensitive. 612 Comparators other than "i;octet", "en;ascii-casemap", and "i;ascii- 613 casemap" must be declared with require, as they are extensions. If a 614 comparator declared with require is not known, it is an error, and 615 execution fails. If the comparator is not declared with require, it 616 is also an error, even if the comparator is supported. (See 2.10.5.) 618 Both ":matches" and ":contains" match types are compatible with the 619 "i;octet" and "en;ascii-casemap" comparators and may be used with 620 them. 622 It is an error to give more than one of these arguments to a given 623 command. 625 2.7.4. Comparisons Against Addresses 626 Addresses are one of the most frequent things represented as strings. 627 These are structured, and being able to compare against the local- 628 part or the domain of an address is useful, so some tests that act 629 exclusively on addresses take an additional optional argument that 630 specifies what the test acts on. 632 These optional arguments are ":localpart", ":domain", and ":all", 633 which act on the local-part (left-side), the domain part (right- 634 side), and the whole address. 636 The kind of comparison done, such as whether or not the test done is 637 case-insensitive, is specified as a comparator argument to the test. 639 If an optional address-part is omitted, the default is ":all". 641 It is an error to give more than one of these arguments to a given 642 command. 
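As an informal illustration of the address-part arguments described above, the sketch below assumes a hypothetical message whose From header holds the single address "tim@example.com". For that address, ":localpart" selects "tim", ":domain" selects "example.com", and ":all" (the default) selects the whole address. The addresses and the actions chosen are assumptions made for illustration only.

   Example:  if address :domain :is "From" "example.com" {
                 keep;       # the domain part is "example.com"
             } elsif address :localpart :is "From" "coyote" {
                 discard;    # the local-part is "coyote"
             }

Supplying both ":domain" and ":localpart" to the same address test would be an error, since at most one address-part argument may be given.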
644 For convenience, the "ADDRESS-PART" syntax element is defined here as 645 follows: 647 Syntax: ":localpart" / ":domain" / ":all" 649 2.8. Blocks 651 Blocks are sets of commands enclosed within curly braces and supplied 652 as the final argument to a command. Such a command is a control 653 structure: when executed it has control over the number of times the 654 commands in the block are executed. 656 With the commands supplied in this memo, there are no loops. The 657 control structures supplied--if, elsif, and else--run a block either 658 once or not at all. 660 2.9. Commands 662 Sieve scripts are sequences of commands. Commands can take any of 663 the tokens above as arguments, and arguments may be either tagged or 664 positional arguments. Not all commands take all arguments. 666 There are three kinds of commands: test commands, action commands, 667 and control commands. 669 The simplest is an action command. An action command is an 670 identifier followed by zero or more arguments, terminated by a 671 semicolon. Action commands do not take tests or blocks as arguments. 673 A control command is a command that affects the parsing or the flow 674 of execution of the Sieve script in some way. A control structure is 675 a control command which ends with a block instead of a semicolon. 677 A test command is used as part of a control command. It is used to 678 specify whether or not the block of code given to the control command 679 is executed. 681 2.10. Evaluation 683 2.10.1. Action Interaction 685 Some actions cannot be used with other actions because the result 686 would be absurd. These restrictions are noted throughout this memo. 688 Extension actions MUST state how they interact with actions defined 689 in this specification. 691 2.10.2. Implicit Keep 693 Previous experience with filtering systems suggests that cases tend 694 to be missed in scripts. To prevent errors, Sieve has an "implicit 695 keep".
697 An implicit keep is a keep action (see 4.3) performed in the absence of 698 any action that cancels the implicit keep. 700 An implicit keep is performed if a message is not written to a 701 mailbox, redirected to a new address, or explicitly thrown out. That 702 is, if a fileinto, a keep, a redirect, or a discard is performed, an 703 implicit keep is not. 705 Some actions may be defined to not cancel the implicit keep. These 706 actions may not directly affect the delivery of a message, and are 707 used for their side effects. None of the actions specified in this 708 document meets that criterion, but extension actions will. 710 For instance, with any of the short messages offered above, the 711 following script produces no actions. 713 Example: if size :over 500K { discard; } 715 As a result, the implicit keep is taken. 717 2.10.3. Message Uniqueness in a Mailbox 719 Implementations SHOULD NOT deliver a message to the same folder more 720 than once, even if a script explicitly asks for a message to be 721 written to a mailbox twice. 723 The test for equality of two messages is implementation-defined. 725 If a script asks for a message to be written to a mailbox twice, it 726 MUST NOT be treated as an error. 728 2.10.4. Limits on Numbers of Actions 730 Site policy MAY limit numbers of actions taken and MAY impose 731 restrictions on which actions can be used together. In the event 732 that a script hits a policy limit on the number of actions taken for 733 a particular message, an error occurs. 735 Implementations MUST allow at least one keep or one fileinto. If 736 fileinto is not implemented, implementations MUST allow at least one 737 keep. 739 2.10.5. Extensions and Optional Features 741 Because of the differing capabilities of many mail systems, several 742 features of this specification are optional. Before any of these 743 extensions can be executed, they must be declared with the "require" 744 action.
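A minimal sketch of such a declaration, assuming an implementation that supports both the "fileinto" and "envelope" capabilities defined by this document; the sender address and the folder name "INBOX.tim" are hypothetical:

   Example:  require ["fileinto", "envelope"];

             if envelope :is "from" "tim@example.com" {
                 fileinto "INBOX.tim";
             }

Without the leading require, the same script would be in error even on an implementation that supports both features, because neither capability has been declared.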
746 If an extension is not enabled with "require", implementations MUST 747 treat it as if they did not support it at all. 749 If an implementation does not support an extension declared with require, 750 the script must not be used at all. Implementations MUST NOT execute 751 scripts which require unknown capability names. 753 Note: The reason for this restriction is that prior experience with 754 languages such as LISP and Tcl suggests that this is a workable 755 way of noting that a given script uses an extension. 757 Experience with PostScript suggests that mechanisms that allow 758 a script to work around missing extensions are not used in 759 practice. 761 Extensions which define actions MUST state how they interact with 762 actions discussed in the base specification. 764 2.10.6. Errors 766 In any programming language, there are compile-time and run-time 767 errors. 769 Compile-time errors are ones in syntax that are detectable if a 770 syntax check is done. 772 Run-time errors are not detectable until the script is run. These 773 include transient failures like disk full conditions, but also 774 include issues like invalid combinations of actions. 776 When an error occurs in a Sieve script, all processing stops. 778 Implementations MAY choose to do a full parse, then evaluate the 779 script, then do all actions. Implementations might even go so far as 780 to ensure that execution is atomic (either all actions are executed 781 or none are executed). 783 Other implementations may choose to parse and run at the same time. 784 Such implementations are simpler, but have issues with partial 785 failure (some actions happen, others don't). 787 Implementations might even go so far as to ensure that scripts can 788 never execute an invalid set of actions before execution, although 789 this could involve solving the Halting Problem. 791 This specification allows any of these approaches. Solving the 792 Halting Problem is considered extra credit.
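The two hypothetical fragments below are offered as an informal illustration of the distinction between compile-time and run-time errors; they are sketches, not normative examples. The first can be rejected by a syntax check alone, because a require appears after a command other than require (see section 3.2):

   Example:  keep;
             require "fileinto";    # error: require must come first

The second is well-formed, so any failure can only show up when a message is actually processed, for instance if the implementation detects a mail loop while redirecting (see section 4.2):

   Example:  redirect "bart@example.edu";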
794 Implementations MUST perform syntactic, semantic, and run-time checks 795 on code that is actually executed. Implementations MAY perform those 796 checks or any part of them on code that is not reached during 797 execution. 799 When an error happens, implementations MUST notify the user that an 800 error occurred, which actions (if any) were taken, and do an implicit 801 keep. 803 2.10.7. Limits on Execution 805 Implementations may limit certain constructs. However, this 806 specification places a lower bound on some of these limits. 808 Implementations MUST support fifteen levels of nested blocks. 810 Implementations MUST support fifteen levels of nested test lists. 812 3. Control Commands 814 Control structures are needed to allow for multiple and conditional 815 actions. 817 3.1. Control If 819 There are three pieces to if: "if", "elsif", and "else". Each is 820 actually a separate command in terms of the grammar. However, an 821 elsif or else MUST only follow an if or elsif. An error occurs if 822 these conditions are not met. 824 Syntax: if 826 Syntax: elsif 828 Syntax: else 830 The semantics are similar to those of any of the many other 831 programming languages these control structures appear in. When the 832 interpreter sees an "if", it evaluates the test associated with it. 833 If the test is true, it executes the block associated with it. 835 If the test of the "if" is false, it evaluates the test of the first 836 "elsif" (if any). If the test of "elsif" is true, it runs the 837 elsif's block. An elsif may be followed by an elsif, in which case, 838 the interpreter repeats this process until it runs out of elsifs. 840 When the interpreter runs out of elsifs, there may be an "else" case. 841 If there is, and none of the if or elsif tests were true, the 842 interpreter runs the else case. 844 This provides a way of performing exactly one of the blocks in the 845 chain. 847 In the following example, both Message A and B are dropped. 
849 Example: require "fileinto"; 850 if header :contains "from" "coyote" { 851 discard; 852 } elsif header :contains ["subject"] ["$$$"] { 853 discard; 854 } else { 855 fileinto "INBOX"; 856 } 858 When the script below is run over message A, it redirects the message 859 to acm@example.edu; message B, to postmaster@example.edu; any other 860 message is redirected to field@example.edu. 862 Example: if header :contains ["From"] ["coyote"] { 863 redirect "acm@example.edu"; 864 } elsif header :contains "Subject" "$$$" { 865 redirect "postmaster@example.edu"; 866 } else { 867 redirect "field@example.edu"; 868 } 870 Note that this definition prohibits the "... else if ..." sequence 871 used by C. This is intentional, because this construct produces a 872 shift-reduce conflict. 874 3.2. Control Require 876 Syntax: require 878 The require action notes that a script makes use of a certain 879 extension. Such a declaration is required to use the extension, as 880 discussed in section 2.10.5. Multiple capabilities can be declared 881 with a single require. 883 The require command, if present, MUST be used before anything other 884 than a require can be used. An error occurs if a require appears 885 after a command other than require. 887 Example: require ["fileinto", "reject"]; 889 Example: require "fileinto"; 890 require "vacation"; 892 3.3. Control Stop 894 Syntax: stop 896 The "stop" action ends all processing. If no actions have been 897 executed, then the keep action is taken. 899 4. Action Commands 901 This document supplies four actions that may be taken on a message: 902 keep, fileinto, redirect, and discard. 904 Implementations MUST support the "keep", "discard", and "redirect" 905 actions. 907 Implementations SHOULD support "fileinto". 909 Implementations MAY limit the number of certain actions taken (see 910 section 2.10.4). 912 4.1. Action fileinto 914 Syntax: fileinto 915 The "fileinto" action delivers the message into the specified folder. 
916 Implementations SHOULD support fileinto, but in some environments 917 this may be impossible. 919 The capability string for use with the require command is "fileinto". 921 In the following script, message A is filed into folder 922 "INBOX.harassment". 924 Example: require "fileinto"; 925 if header :contains ["from"] "coyote" { 926 fileinto "INBOX.harassment"; 927 } 929 4.2. Action redirect 931 Syntax: redirect 933 The "redirect" action is used to send the message to another user at 934 a supplied address, as a mail forwarding feature does. The 935 "redirect" action makes no changes to the message body or existing 936 headers, but it may add new headers. The "redirect" modifies the 937 envelope recipient. 939 The redirect command performs an MTA-style "forward"--that is, what 940 you get from a .forward file using sendmail under UNIX. The address 941 on the [SMTP] envelope is replaced with the one on the redirect 942 command and the message is sent back out. (This is not an MUA-style 943 forward, which creates a new message with a different sender and 944 message ID, wrapping the old message in a new one.) 946 A simple script can be used for redirecting all mail: 948 Example: redirect "bart@example.edu"; 950 Implementations SHOULD take measures to implement loop control, 951 possibly including adding headers to the message or counting received 952 headers. If an implementation detects a loop, it causes an error. 954 4.3. Action keep 956 Syntax: keep 958 The "keep" action is whatever action is taken in lieu of all other 959 actions, if no filtering happens at all; generally, this simply means 960 to file the message into the user's main mailbox. This command 961 provides a way to execute this action without needing to know the 962 name of the user's main mailbox, providing a way to call it without 963 needing to understand the user's setup, or the underlying mail 964 system. 
966 For instance, in an implementation where the Internet Message Access 967 Protocol (IMAP) server is running scripts on behalf of the user at 968 time of delivery, a keep command is equivalent to a fileinto "INBOX". 970 Example: if size :under 1M { keep; } else { discard; } 972 Note that the above script is identical to the one below. 974 Example: if not size :under 1M { discard; } 976 4.4. Action discard 978 Syntax: discard 980 Discard is used to silently throw away the message. It does so by 981 simply canceling the implicit keep. If discard is used with other 982 actions, the other actions still happen. Discard is compatible with 983 all other actions. (For instance fileinto+discard is equivalent to 984 fileinto.) 986 Discard MUST be silent; that is, it MUST NOT return a non-delivery 987 notification of any kind ([DSN], [MDN], or otherwise). 989 In the following script, any mail from "idiot@example.edu" is thrown 990 out. 992 Example: if header :contains ["from"] ["idiot@example.edu"] { 993 discard; 994 } 996 While an important part of this language, "discard" has the potential 997 to create serious problems for users: Students who leave themselves 998 logged in to an unattended machine in a public computer lab may find 999 their script changed to just "discard". In order to protect users in 1000 this situation (along with similar situations), implementations MAY 1001 keep messages destroyed by a script for an indefinite period, and MAY 1002 disallow scripts that throw out all mail. 1004 5. Test Commands 1006 Tests are used in conditionals to decide which part(s) of the 1007 conditional to execute. 1009 Implementations MUST support these tests: "address", "allof", 1010 "anyof", "exists", "false", "header", "not", "size", and "true". 1012 Implementations SHOULD support the "envelope" test. 1014 5.1. 
Test address 1016 Syntax: address [COMPARATOR] [ADDRESS-PART] [MATCH-TYPE] 1017 1019 The address test matches Internet addresses in structured headers 1020 that contain addresses. It returns true if any header contains any 1021 key in the specified part of the address, as modified by the 1022 comparator and the match keyword. Whether there are other addresses 1023 present in the header doesn't affect this test; this test does not 1024 provide any way to determine whether an address is the only address 1025 in a header. 1027 Like envelope and header, this test returns true if any combination 1028 of the header-list and key-list arguments match and false otherwise. 1030 Internet email addresses [IMAIL] have the somewhat awkward 1031 characteristic that the local-part to the left of the at-sign is 1032 considered case sensitive, and the domain-part to the right of the 1033 at-sign is case insensitive. The "address" command does not deal 1034 with this itself, but provides the ADDRESS-PART argument for allowing 1035 users to deal with it. 1037 The address primitive never acts on the phrase part of an email 1038 address, nor on comments within that address. It also never acts on 1039 group names, although it does act on the addresses within the group 1040 construct. 1042 Implementations MUST restrict the address test to headers that 1043 contain addresses, but MUST include at least From, To, Cc, Bcc, 1044 Sender, Resent-From, Resent-To, and SHOULD include any other header 1045 that utilizes an "address-list" structured header body. 1047 Example: if address :is :all "from" "tim@example.com" { 1048 discard; 1049 } 1051 5.2. Test allof 1053 Syntax: allof 1055 The allof test performs a logical AND on the tests supplied to it. 1057 Example: allof (false, false) => false 1058 allof (false, true) => false 1059 allof (true, true) => true 1061 The allof test takes as its argument a test-list. 1063 5.3. 
Test anyof 1065 Syntax: anyof 1067 The anyof test performs a logical OR on the tests supplied to it. 1069 Example: anyof (false, false) => false 1070 anyof (false, true) => true 1071 anyof (true, true) => true 1073 5.4. Test envelope 1075 Syntax: envelope [COMPARATOR] [ADDRESS-PART] [MATCH-TYPE] 1076 1078 The "envelope" test is true if the specified part of the SMTP (or 1079 equivalent) envelope matches the specified key. This specification 1080 defines the interpretation of the (case insensitive) "from" and "to" 1081 envelope-parts. Additional envelope-parts may be defined by other 1082 extensions; implementations SHOULD consider unknown envelope parts an 1083 error. 1085 If one of the envelope-part strings is (case insensitive) "from", 1086 then matching occurs against the FROM address used in the SMTP MAIL 1087 command. 1089 If one of the envelope-part strings is (case insensitive) "to", then 1090 matching occurs against the TO address used in the SMTP RCPT command 1091 that resulted in this message getting delivered to this user. Note 1092 that only the most recent TO is available, and only the one relevant 1093 to this user. 1095 The envelope-part is a string list and may contain more than one 1096 parameter, in which case all of the strings specified in the key-list 1097 are matched against all parts given in the envelope-part list. 1099 Like address and header, this test returns true if any combination of 1100 the envelope-part list and key-list arguments match and false 1101 otherwise. 1103 All tests against envelopes MUST drop source routes. 1105 If the SMTP transaction involved several RCPT commands, only the data 1106 from the RCPT command that caused delivery to this user is available 1107 in the "to" part of the envelope. 1109 If a protocol other than SMTP is used for message transport, 1110 implementations are expected to adapt this command appropriately. 1112 The envelope command is optional. 
Implementations SHOULD support it, 1113 but the necessary information may not be available in all cases. 1115 Example: require "envelope"; 1116 if envelope :all :is "from" "tim@example.com" { 1117 discard; 1118 } 1120 5.5. Test exists 1122 Syntax: exists 1124 The "exists" test is true if the headers listed in the header-names 1125 argument exist within the message. All of the headers must exist or 1126 the test is false. 1128 The following example throws out mail that doesn't have a From header 1129 and a Date header. 1131 Example: if not exists ["From","Date"] { 1132 discard; 1133 } 1135 5.6. Test false 1137 Syntax: false 1139 The "false" test always evaluates to false. 1141 5.7. Test header 1143 Syntax: header [COMPARATOR] [MATCH-TYPE] 1144 1146 The "header" test evaluates to true if the value of any named header matches any 1147 key. The type of match is specified by the optional match argument, 1148 which defaults to ":is" if not specified, as specified in section 1149 2.7.1. 1151 Like address and envelope, this test returns true if any combination 1152 of the header-names list and key-list arguments match and false 1153 otherwise. 1155 If a header listed in the header-names argument exists, it contains 1156 the empty key (""). However, if the named header is not present, it 1157 does not match any key, including the empty key. So if a message 1158 contained the header 1160 X-Caffeine: C8H10N4O2 1162 these tests on that header evaluate as follows: 1164 header :is ["X-Caffeine"] [""] => false 1165 header :contains ["X-Caffeine"] [""] => true 1167 The preferred way to test whether a given header is either empty or 1168 absent is to combine an "exists" test and a "header" test: 1170 anyof (header :is "Cc" "", not exists "Cc") 1172 5.8. Test not 1174 Syntax: not 1176 The "not" test takes some other test as an argument, and yields the 1177 opposite result. "not false" evaluates to "true" and "not true" 1178 evaluates to "false". 1180 5.9.
Test size 1182 Syntax: size <":over" / ":under"> 1184 The "size" test deals with the size of a message. It takes either a 1185 tagged argument of ":over" or ":under", followed by a number 1186 representing the size of the message. 1188 If the argument is ":over", and the size of the message is greater 1189 than the number provided, the test is true; otherwise, it is false. 1191 If the argument is ":under", and the size of the message is less than 1192 the number provided, the test is true; otherwise, it is false. 1194 Exactly one of ":over" or ":under" must be specified, and anything 1195 else is an error. 1197 The size of a message is defined to be the number of octets from the 1198 initial header until the last character in the message body. 1200 Note that for a message that is exactly 4,000 octets, the message is 1201 neither ":over" 4000 octets nor ":under" 4000 octets. 1203 5.10. Test true 1204 Syntax: true 1206 The "true" test always evaluates to true. 1208 6. Extensibility 1210 New control commands, actions, and tests can be added to the 1211 language. Sites must make these features known to their users; this 1212 document does not define a way to discover the list of extensions 1213 supported by the server. 1215 Any extensions to this language MUST define a capability string that 1216 uniquely identifies that extension. Capability strings are case- 1217 sensitive; for example, "foo" and "FOO" are different capabilities. 1218 If a new version of an extension changes the functionality of a 1219 previously defined extension, it MUST use a different name. 1221 In a situation where there is a submission protocol and an extension 1222 advertisement mechanism aware of the details of this language, 1223 scripts submitted can be checked against the mail server to prevent 1224 use of an extension that the server does not support.
1226 Extensions MUST state how they interact with constraints defined in 1227 section 2.10, e.g., whether they cancel the implicit keep, and which 1228 actions they are compatible and incompatible with. 1230 6.1. Capability String 1232 Capability strings are typically short strings describing what 1233 capabilities are supported by the server. 1235 Capability strings beginning with "vnd." represent vendor-defined 1236 extensions. Such extensions are not defined by Internet standards or 1237 RFCs, but are still registered with IANA in order to prevent 1238 conflicts. Extensions starting with "vnd." SHOULD be followed by the 1239 name of the vendor and product, such as "vnd.acme.rocket-sled". 1241 The following capability strings are defined by this document: 1243 envelope The string "envelope" indicates that the implementation 1244 supports the "envelope" command. 1246 fileinto The string "fileinto" indicates that the implementation 1247 supports the "fileinto" command. 1249 comparator- The string "comparator-elbonia" is provided if the 1250 implementation supports the "elbonia" comparator. 1251 Therefore, all implementations have at least the 1252 "comparator-i;octet", "comparator-en;ascii-casemap", 1253 and "comparator-i;ascii-casemap" capabilities. However, 1254 these comparators may be used without being declared 1255 with require. 1257 6.2. IANA Considerations 1259 In order to provide a standard set of extensions, a registry is 1260 provided by IANA. Capability names may be registered on a first- 1261 come, first-served basis. Extensions designed for interoperable use 1262 SHOULD be defined as standards track or IESG approved experimental 1263 RFCs. 1265 6.2.1. Template for Capability Registrations 1267 The following template is to be used for registering new Sieve 1268 extensions with IANA. 
1270 To: iana@iana.org 1271 Subject: Registration of new Sieve extension 1273 Capability name: 1274 Capability keyword: 1275 Capability arguments: 1276 Standards Track/IESG-approved experimental RFC number: 1277 Person and email address to contact for further information: 1279 6.2.2. Initial Capability Registrations 1281 This RFC updates the following entries in the IANA registry for 1282 Sieve extensions. 1284 Capability name: fileinto 1285 Capability keyword: fileinto 1286 Capability arguments: fileinto 1287 Standards Track/IESG-approved experimental RFC number: 1288 This RFC (Sieve base spec) 1289 Person and email address to contact for further information: 1290 The Sieve discussion list 1292 Capability name: envelope 1293 Capability keyword: envelope 1294 Capability arguments: 1295 envelope [COMPARATOR] [ADDRESS-PART] [MATCH-TYPE] 1296 1297 Standards Track/IESG-approved experimental RFC number: 1298 This RFC (Sieve base spec) 1299 Person and email address to contact for further information: 1301 The Sieve discussion list 1303 Capability name: comparator-* 1304 Capability keyword: 1305 comparator-* (anything starting with "comparator-") 1306 Capability arguments: (none) 1307 Standards Track/IESG-approved experimental RFC number: 1308 This RFC, Sieve, by reference to [COLLATION] 1309 Person and email address to contact for further information: 1310 The Sieve discussion list 1312 6.3. Capability Transport 1314 As the range of mail systems that this document is intended to apply 1315 to is quite varied, a method of advertising which capabilities an 1316 implementation supports is difficult due to the wide range of 1317 possible implementations. Such a mechanism, however, should have the 1318 property that the implementation can advertise the complete set of 1319 extensions that it supports. 1321 7. Transmission 1323 The MIME type for a Sieve script is "application/sieve".
1325 The registration of this type for RFC 2048 requirements is updated as 1326 follows: 1328 Subject: Registration of MIME media type application/sieve 1330 MIME media type name: application 1331 MIME subtype name: sieve 1332 Required parameters: none 1333 Optional parameters: none 1334 Encoding considerations: Most sieve scripts will be textual, 1335 written in UTF-8. When non-7bit characters are used, 1336 quoted-printable is appropriate for transport systems 1337 that require 7bit encoding. 1339 Security considerations: Discussed in section 10 of this RFC. 1340 Interoperability considerations: Discussed in section 2.10.5 1341 of this RFC. 1342 Published specification: this RFC. 1343 Applications which use this media type: sieve-enabled mail servers 1344 Additional information: 1345 Magic number(s): 1346 File extension(s): .siv 1347 Macintosh File Type Code(s): 1348 Person & email address to contact for further information: 1350 See the discussion list at ietf-mta-filters@imc.org. 1351 Intended usage: 1352 COMMON 1353 Author/Change controller: 1354 See Editor information in this RFC. 1356 8. Parsing 1358 The Sieve grammar is separated into tokens and a separate grammar as 1359 most programming languages are. 1361 8.1. Lexical Tokens 1363 Sieve scripts are encoded in UTF-8. The following assumes a valid 1364 UTF-8 encoding; special characters in Sieve scripts are all US-ASCII. 1366 The following are tokens in Sieve: 1368 - identifiers 1369 - tags 1370 - numbers 1371 - quoted strings 1372 - multi-line strings 1373 - other separators 1375 Blanks, horizontal tabs, CRLFs, and comments ("white space") are 1376 ignored except as they separate tokens. Some white space is required 1377 to separate otherwise adjacent tokens and in specific places in the 1378 multi-line strings. CR and LF can only appear in CRLF pairs. 1380 The other separators are single individual characters, and are 1381 mentioned explicitly in the grammar. 
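As an informal illustration of the token classes listed above, the hypothetical script below uses most of them: both comment forms, identifiers, tags, a number with a quantifier, quoted strings (one containing escaped double-quotes), and the single-character separators. Multi-line strings are not shown. The folder name and subject text are assumptions made for illustration only.

   Example:  # hash comment: runs to the end of the line
             /* bracket comment: may span
                several lines */
             require "fileinto";

             if allof (size :over 100K,
                       header :contains "Subject" "say \"hi\"") {
                 fileinto "INBOX.large";
             }

Here "require", "if", "allof", "size", "header", and "fileinto" are identifiers; ":over" and ":contains" are tags; "100K" is a number; and the parentheses, comma, braces, and semicolons are the other separators.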
1383 The lexical structure of Sieve is defined in the following grammar 1384 (as described in [ABNF]): 1386 bracket-comment = "/*" *not-star 1*STAR 1387 *(not-star-or-slash *not-star 1*STAR) "/" 1388 ; No */ allowed inside a comment. 1389 ; (No * is allowed unless it is the last 1390 ; character, or unless it is followed by a 1391 ; character that isn't a slash.) 1393 STAR = "*" 1395 not-star = CRLF / %x01-09 / %x0b-0c / %x0e-29 / %x2b-7f / 1396 UTF8-2 / UTF8-3 / UTF8-4 1397 ; either a CRLF pair, OR a single UTF-8 1398 ; character other than NUL, CR, LF, or star 1400 not-star-or-slash = CRLF / %x01-09 / %x0b-0c / %x0e-29 / %x2b-2e / 1401 %x30-7f / UTF8-2 / UTF8-3 / UTF8-4 1402 ; either a CRLF pair, OR a single UTF-8 1403 ; character other than NUL, CR, LF, star, 1404 ; or slash 1406 UTF8-NOT-CRLF = %x01-09 / %x0b-0c / %x0e-7f / 1407 UTF8-2 / UTF8-3 / UTF8-4 1408 ; a single UTF-8 character other than NUL, 1409 ; CR, or LF 1411 UTF8-NOT-PERIOD = %x01-09 / %x0b-0c / %x0e-2d / %x2f-7f / 1412 UTF8-2 / UTF8-3 / UTF8-4 1413 ; a single UTF-8 character other than NUL, 1414 ; CR, LF, or period 1416 UTF8-NOT-NUL = %x01-7f / UTF8-2 / UTF8-3 / UTF8-4 1417 ; a single UTF-8 character other than NUL 1419 UTF8-NOT-QSPECIAL = %x01-09 / %x0b-0c / %x0e-21 / %x23-5b / 1420 %x5d-7f / UTF8-2 / UTF8-3 / UTF8-4 1421 ; a single UTF-8 character other than NUL, 1422 ; CR, LF, double-quote, or backslash 1424 comment = bracket-comment / hash-comment 1426 hash-comment = "#" *UTF8-NOT-CRLF CRLF 1428 identifier = (ALPHA / "_") *(ALPHA / DIGIT / "_") 1430 tag = ":" identifier 1432 number = 1*DIGIT [QUANTIFIER] 1434 QUANTIFIER = "K" / "M" / "G" 1436 quoted-safe = CRLF / UTF8-NOT-QSPECIAL 1437 ; either a CRLF pair, OR a single UTF-8 1438 ; character other than NUL, CR, LF, 1439 ; double-quote, or backslash 1441 quoted-special = "\" ( DQUOTE / "\" ) 1442 ; represents just a double-quote or backslash 1444 quoted-other = "\" UTF8-NOT-QSPECIAL 1445 ; represents just the UTF8-NOT-QSPECIAL 1446 ; character.
SHOULD NOT be used 1448 quoted-text = *(quoted-safe / quoted-special / quoted-other) 1450 quoted-string = DQUOTE quoted-text DQUOTE 1452 multi-line = "text:" *(SP / HTAB) (hash-comment / CRLF) 1453 *(multiline-literal / multiline-dotstuff) 1454 "." CRLF 1456 multiline-literal = [UTF8-NOT-PERIOD *UTF8-NOT-CRLF] CRLF 1458 multiline-dotstuff = "." 1*UTF8-NOT-CRLF CRLF 1459 ; A line containing only "." ends the 1460 ; multi-line. Remove a leading '.' if 1461 ; followed by another '.'. 1463 white-space = 1*(SP / CRLF / HTAB) / comment 1465 8.2. Grammar 1467 The following is the grammar of Sieve after it has been lexically 1468 interpreted. No white space or comments appear below. The start 1469 symbol is "start". 1471 argument = string-list / number / tag 1473 arguments = *argument [test / test-list] 1475 block = "{" commands "}" 1477 command = identifier arguments ( ";" / block ) 1479 commands = *command 1481 start = commands 1483 string = quoted-string / multi-line 1485 string-list = "[" string *("," string) "]" / string 1486 ; if there is only a single string, the brackets 1487 ; are optional 1489 test = identifier arguments 1491 test-list = "(" test *("," test) ")" 1493 9. Extended Example 1494 The following is an extended example of a Sieve script. Note that it 1495 does not make use of the implicit keep. 1497 # 1498 # Example Sieve Filter 1499 # Declare any optional features or extension used by the script 1500 # 1501 require ["fileinto"]; 1503 # 1504 # Handle messages from known mailing lists 1505 # Move messages from IETF filter discussion list to filter folder 1506 # 1507 if header :is "Sender" "owner-ietf-mta-filters@imc.org" 1508 { 1509 fileinto "filter"; # move to "filter" folder 1510 } 1511 # 1512 # Keep all messages to or from people in my company 1513 # 1514 elsif address :domain :is ["From", "To"] "example.com" 1515 { 1516 keep; # keep in "In" folder 1517 } 1519 # 1520 # Try and catch unsolicited email. 
         # If a message is not to me,
1521     # or it contains a subject known to be spam, file it away.
1522     #
1523     elsif anyof (not address :all :contains
1524                    ["To", "Cc", "Bcc"] "me@example.com",
1525                  header :matches "subject"
1526                    ["*make*money*fast*", "*university*dipl*mas*"])
1527             {
1528             # If message header does not contain my address,
1529             # it's from a list.
1530             fileinto "spam";   # move to "spam" folder
1531             }
1532     else
1533             {
1534             # Move all other (non-company) mail to "personal"
1535             # folder.
1536             fileinto "personal";
1537             }

1539  10.  Security Considerations

1541     Users must get their mail.  It is imperative that whatever method
1542     implementations use to store the user-defined filtering scripts be
1543     secure.

1545     It is equally important that implementations sanity-check the user's
1546     scripts, and not allow users to create on-demand mailbombs.  For
1547     instance, an implementation that allows a user to redirect a message
1548     multiple times might also allow a user to create a mailbomb triggered
1549     by mail from a specific user.  Site- or implementation-defined limits
1550     on actions are useful for this.

1552     Several commands, such as "discard", "redirect", and "fileinto", allow
1553     for actions to be taken that are potentially very dangerous.

1555     Implementations SHOULD take measures to prevent messages from
1556     looping.

1558     As with any filter on a message stream, if the sieve implementation
1559     and the mail agents 'behind' sieve in the message stream differ in
1560     their interpretation of the messages, it may be possible for an
1561     attacker to subvert the filter.  Of particular note are differences
1562     in the interpretation of malformed messages (e.g., missing or extra
1563     syntax characters) or those that exhibit corner cases (e.g., NUL
1564     octets encoded via [MIME3]).

1566  11.  Acknowledgments

1568     This document has been revised in part based on comments and
1569     discussions that took place on and off the SIEVE mailing list.
1570     Thanks to Cyrus Daboo, Ned Freed, Michael Haardt, Kjetil Torgrim
1571     Homme, Barry Leiba, Mark E. Mallett, Alexey Melnikov, Rob Siemborski,
1572     and Nigel Swinson for reviews and suggestions.

1574  12.  Editors' Addresses

1576     Philip Guenther
1577     Sendmail, Inc.
1578     6425 Christie St. Ste 400
1579     Emeryville, CA 94608
1580     Email: guenther@sendmail.com

1582     Tim Showalter
1583     Email: tjs@psaux.com

1585  13.  Normative References

1587     [ABNF]      Crocker, D. and P. Overell, "Augmented BNF for Syntax
1588                 Specifications: ABNF", RFC 2234, November 1997.

1590     [COLLATION] Newman, C., Duerst, M., and A. Gulbrandsen, "Internet
1591                 Application Protocol Collation Registry",
1592                 draft-newman-i18n-comparator-04.txt (work in progress),
1593                 July 2005.

1595     [IMAIL]     P. Resnick, Ed., "Internet Message Format", RFC 2822,
1596                 April 2001.

1598     [KEYWORDS]  Bradner, S., "Key words for use in RFCs to Indicate
1599                 Requirement Levels", BCP 14, RFC 2119, March 1997.

1601     [MIME]      Freed, N. and N. Borenstein, "Multipurpose Internet Mail
1602                 Extensions (MIME) Part One: Format of Internet
1603                 Message Bodies", RFC 2045, November 1996.

1605     [MIME3]     Moore, K., "MIME (Multipurpose Internet Mail Extensions)
1606                 Part Three: Message Header Extensions for Non-ASCII
1607                 Text", RFC 2047, November 1996.

1609     [MDN]       T. Hansen, Ed., G. Vaudreuil, Ed., "Message Disposition
1610                 Notification", RFC 3798, May 2004.

1612     [SMTP]      J. Klensin, Ed., "Simple Mail Transfer Protocol", RFC
1613                 2821, April 2001.

1615     [UTF-8]     Yergeau, F., "UTF-8, a transformation format of ISO
1616                 10646", RFC 3629, November 2003.

1618  14.  Informative References

1620     [BINARY-SI] "Standard IEC 60027-2: Letter symbols to be used in
1621                 electrical technology - Part 2: Telecommunications and
1622                 electronics", January 1999.

1624     [DSN]       Moore, K. and G. Vaudreuil, "An Extensible Message Format
1625                 for Delivery Status Notifications", RFC 1894, January
1626                 1996.

1628     [FLAMES]    Borenstein, N., and C.
                     Thyberg, "Power, Ease of Use, and
1629                 Cooperative Work in a Practical Multimedia Message
1630                 System", Int. J. of Man-Machine Studies, April, 1991.
1631                 Reprinted in Computer-Supported Cooperative Work and
1632                 Groupware, Saul Greenberg, editor, Harcourt Brace
1633                 Jovanovich, 1991.  Reprinted in Readings in Groupware and
1634                 Computer-Supported Cooperative Work, Ronald Baecker,
1635                 editor, Morgan Kaufmann, 1993.

1637     [IMAP]      Crispin, M., "Internet Message Access Protocol - version
1638                 4rev1", RFC 3501, March 2003.

1640  15.  Full Copyright Statement

1642     Copyright (C) The Internet Society (2005).

1644     This document is subject to the rights, licenses and restrictions
1645     contained in BCP 78, and except as set forth therein, the authors
1646     retain all their rights.

1648     This document and the information contained herein are provided on an
1649     "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
1650     OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
1651     ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
1652     INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
1653     INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
1654     WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

1656  Intellectual Property

1658     The IETF takes no position regarding the validity or scope of any
1659     Intellectual Property Rights or other rights that might be claimed to
1660     pertain to the implementation or use of the technology described in
1661     this document or the extent to which any license under such rights
1662     might or might not be available; nor does it represent that it has
1663     made any independent effort to identify any such rights.  Information
1664     on the procedures with respect to rights in RFC documents can be
1665     found in BCP 78 and BCP 79.
1667     Copies of IPR disclosures made to the IETF Secretariat and any
1668     assurances of licenses to be made available, or the result of an
1669     attempt made to obtain a general license or permission for the use of
1670     such proprietary rights by implementers or users of this
1671     specification can be obtained from the IETF on-line IPR repository at
1672     http://www.ietf.org/ipr.

1674     The IETF invites any interested party to bring to its attention any
1675     copyrights, patents or patent applications, or other proprietary
1676     rights that may cover technology that may be required to implement
1677     this standard.  Please address the information to the IETF at
1678     ietf-ipr@ietf.org.

1680  Acknowledgement

1682     Funding for the RFC Editor function is currently provided by the
1683     Internet Society.

1685  Appendix A.  Change History
1686     This section will be replaced with a summary of the changes since RFC
1687     3028 when this document leaves the Internet-Draft stage.

1689     Changes from draft-ietf-sieve-3028bis-03.txt
1690     1. Remove section 2.4.2.4., MIME Parts, as unreferenced.
1691     2. Update to draft-newman-i18n-comparator-04.txt
1692     3. Various tweaks to examples and syntax lines
1693     4. Define "control structure" as a control command with a block
1694        argument, then use it consistently.  Reword description of
1695        blocks to match
1696     5. Clarify that "header" can never match an absent header and give
1697        the preferred way to test for absent or empty
1698     6. Invalid header name syntax is not an error _in tests_ (but could
1699        be elsewhere)
1700     7. Implementations SHOULD consider unknown envelope parts an error
1701     8. Remove explicit "omitted" option from 2.7.2p2

1703     Changes from draft-ietf-sieve-3028bis-02.txt
1704     1. Change "ASCII" to "US-ASCII" throughout
1705     2. Tweak section 2.7.2 to not require use of UTF-8 internally and
1706        to explicitly leave implementation-defined the handling of text
1707        that can't be converted to Unicode
1708     3. Add reference to RFC 2047
1709     4.
            Clarify that capability strings are case-sensitive
1710     5. Clarify that address, envelope, and header return false if no
1711        combination of arguments matches
1712     6. Directly state that code that isn't reached may still be checked
1713        for errors.
1714     7. Invalid header name syntax is not an error
1715     8. Remove description of header unfolding that conflicts with
1716        [IMAIL]
1717     9. Warn that filters may be subvertible if agents interpret messages
1718        differently
1719    10. Encoded NUL octets SHOULD NOT cause truncation

1721     Changes from draft-ietf-sieve-3028bis-01.txt
1722     1. Remove ban on side effects
1723     2. Remove definition of the 'reject' action, as it is being moved
1724        to the doc that also defines the 'refuse' action
1725     3. Update capability registrations to reference the mailing list
1726     4. Add Tim back as an editor
1727     5. Refer to the zero-length string ("") as "empty" instead of
1728        "null"

1730     Changes from draft-ietf-sieve-3028bis-00.txt
1731     1. More grammar corrections:
1732          - permit /***/,
1733          - remove ambiguity in finding end of bracket comment,
1734          - require valid UTF-8,
1735          - express quoting in the grammar
1736          - ban bare CR and LF in all locations
1737     2. Correct a bunch of whitespace and linewrapping nits
1738     3. Update IMAIL and SMTP references to RFC 2822 and RFC 2821
1739     4. Require support for en;ascii-casemap comparator as well as the
1740        old i;ascii-casemap.  As with the old one, you do not need to
1741        use 'require' to use the new comparator.
1742     5. Update IANA considerations to update the existing registrations
1743        to point at this doc instead of 3028.
1744     6. Scripts SHOULD NOT contain superfluous backslashes
1745     7. Update Acknowledgments

1747     Changes from RFC 3028
1748     1. Split references into normative and informative
1749     2. Update references to current versions of DSN, IMAP, MDN, and
1750        UTF-8 RFCs
1751     3. Replace "e-mail" with "email"
1752     4. Incorporate RFC 3028 errata
1753     5.
            The "reject" action cancels the implicit keep
1754     6. Replace references to ACAP with references to the i18n-comparator
1755        draft.  Further work is needed to completely sync with that
1756        draft.
1757     7. Start to update grammar to only permit legal UTF-8 (incomplete)
1758        and correct various other errors and typos
1759     8. Update IPR boilerplate to RFC 3978/3979