idnits 2.17.1

draft-ietf-sieve-3028bis-06.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** It looks like you're using RFC 3978 boilerplate.  You should update this
     to the boilerplate described in the IETF Trust License Policy document
     (see https://trustee.ietf.org/license-info), which is required now.

  -- Found old boilerplate from RFC 3978, Section 5.1 on line 16.

  -- Found old boilerplate from RFC 3978, Section 5.5 on line 1664.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 1675.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 1682.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 1688.

  ** This document has an original RFC 3978 Section 5.4 Copyright Line,
     instead of the newer IETF Trust Copyright according to RFC 4748.

  ** This document has an original RFC 3978 Section 5.5 Disclaimer, instead of
     the newer disclaimer which includes the IETF Trust according to RFC 4748.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  -- The draft header indicates that this document obsoletes RFC3028, but the
     abstract doesn't seem to mention this, which it should.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not
     match the current year

  == The document seems to lack the recommended RFC 2119 boilerplate, even if
     it appears to use RFC 2119 keywords -- however, there's a paragraph with
     a matching beginning. Boilerplate error?

     (The document does seem to have the reference to RFC 2119 which the
     ID-Checklist requires).

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer.
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (March 2006) is 6617 days in the past.  Is this
     intentional?

  -- Found something which looks like a code comment -- if you have code
     sections in the document, please surround them with '<CODE BEGINS>' and
     '<CODE ENDS>' lines.

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Missing Reference: 'COMPARATOR' is mentioned on line 1299, but not
     defined

  == Missing Reference: 'ADDRESS-PART' is mentioned on line 1299, but not
     defined

  == Missing Reference: 'MATCH-TYPE' is mentioned on line 1299, but not
     defined

  == Missing Reference: 'QUANTIFIER' is mentioned on line 1434, but not
     defined

  == Unused Reference: 'MIME' is defined on line 1611, but no explicit
     reference was found in the text

  == Unused Reference: 'IMAP' is defined on line 1647, but no explicit
     reference was found in the text

  ** Obsolete normative reference: RFC 4234 (ref. 'ABNF') (Obsoleted by RFC
     5234)

  == Outdated reference: A later version (-14) exists of
     draft-newman-i18n-comparator-07

  ** Obsolete normative reference: RFC 2822 (ref. 'IMAIL') (Obsoleted by RFC
     5322)

  ** Obsolete normative reference: RFC 3798 (ref. 'MDN') (Obsoleted by RFC
     8098)

  ** Obsolete normative reference: RFC 2821 (ref. 'SMTP') (Obsoleted by RFC
     5321)

  -- Obsolete informational reference (is this intentional?): RFC 1894 (ref.
     'DSN') (Obsoleted by RFC 3464)

  -- Obsolete informational reference (is this intentional?): RFC 3501 (ref.
     'IMAP') (Obsoleted by RFC 9051)

     Summary: 7 errors (**), 0 flaws (~~), 10 warnings (==), 11 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

2 Network Working Group                                       P. Guenther
3 Internet-Draft                                           Sendmail, Inc.
4 Expires: September 2006                                    T. Showalter
5 Obsoletes: 3028 (if approved)                                   Editors
6                                                              March 2006

8                   Sieve: An Email Filtering Language
9                    draft-ietf-sieve-3028bis-06.txt

11 Status of this Memo

13 By submitting this Internet-Draft, each author represents that any
14 applicable patent or other IPR claims of which he or she is aware
15 have been or will be disclosed, and any of which he or she becomes
16 aware will be disclosed, in accordance with Section 6 of BCP 79.

18 Internet-Drafts are working documents of the Internet Engineering
19 Task Force (IETF), its areas, and its working groups. Note that
20 other groups may also distribute working documents as Internet-
21 Drafts.

23 Internet-Drafts are draft documents valid for a maximum of six months
24 and may be updated, replaced, or obsoleted by other documents at any
25 time. It is inappropriate to use Internet-Drafts as reference
26 material or to cite them other than as "work in progress."

28 The list of current Internet-Drafts can be accessed at
29 http://www.ietf.org/ietf/1id-abstracts.txt

31 The list of Internet-Draft Shadow Directories can be accessed at
32 http://www.ietf.org/shadow.html.

34 A revised version of this draft document will be submitted to the RFC
35 editor as a Standard Track RFC for the Internet Community.
36 Discussion and suggestions for improvement are requested, and should
37 be sent to ietf-mta-filters@imc.org.
Distribution of this memo is
38 unlimited.

40 Copyright Notice

42 Copyright (C) The Internet Society (2006).

44 Abstract

46 This document describes a language for filtering email messages at
47 time of final delivery. It is designed to be implementable on either
48 a mail client or mail server. It is meant to be extensible, simple,
49 and independent of access protocol, mail architecture, and operating
50 system. It is suitable for running on a mail server where users may
51 not be allowed to execute arbitrary programs, such as on black box
52 Internet Message Access Protocol (IMAP) servers, as it has no
53 variables, loops, or ability to shell out to external programs.

55 Table of Contents

57 1. Introduction
58 1.1. Conventions Used in This Document
59 1.2. Example mail messages
60 2. Design
61 2.1. Form of the Language
62 2.2. Whitespace
63 2.3. Comments
64 2.4. Literal Data
65 2.4.1. Numbers
66 2.4.2. Strings
67 2.4.2.1. String Lists
68 2.4.2.2. Headers
69 2.4.2.3. Addresses
70 2.5. Tests
71 2.5.1. Test Lists
72 2.6. Arguments
73 2.6.1. Positional Arguments
74 2.6.2. Tagged Arguments
75 2.6.3. Optional Arguments
76 2.6.4. Types of Arguments
77 2.7. String Comparison
78 2.7.1. Match Type
79 2.7.2. Comparisons Across Character Sets
80 2.7.3. Comparators
81 2.7.4. Comparisons Against Addresses
82 2.8. Blocks
83 2.9. Commands
84 2.10. Evaluation
85 2.10.1. Action Interaction
86 2.10.2. Implicit Keep
87 2.10.3. Message Uniqueness in a Mailbox
88 2.10.4. Limits on Numbers of Actions
89 2.10.5. Extensions and Optional Features
90 2.10.6. Errors
91 2.10.7. Limits on Execution
92 3. Control Commands
93 3.1. Control If
94 3.2. Control Require
95 3.3. Control Stop
96 4. Action Commands
97 4.1. Action fileinto
98 4.2. Action redirect
99 4.3. Action keep
100 4.4. Action discard
101 5. Test Commands
102 5.1. Test address
103 5.2. Test allof
104 5.3. Test anyof
105 5.4. Test envelope
106 5.5. Test exists
107 5.6. Test false
108 5.7. Test header
109 5.8. Test not
110 5.9. Test size
111 5.10. Test true
112 6. Extensibility
113 6.1. Capability String
114 6.2. IANA Considerations
115 6.2.1. Template for Capability Registrations
116 6.2.2. Initial Capability Registrations
117 6.3. Capability Transport
118 7. Transmission
119 8. Parsing
120 8.1. Lexical Tokens
121 8.2. Grammar
122 9. Extended Example
123 10. Security Considerations
124 11. Acknowledgments
125 12. Editor's Address
126 13. Normative References
127 14. Informative References
128 14. Full Copyright Statement

130 1. Introduction

132 This memo documents a language that can be used to create filters for
133 electronic mail. It is not tied to any particular operating system
134 or mail architecture. It requires the use of [IMAIL]-compliant
135 messages, but should otherwise generalize to many systems.
137 The language is powerful enough to be useful but limited in order to
138 allow for a safe server-side filtering system. The intention is to
139 make it impossible for users to do anything more complex (and
140 dangerous) than write simple mail filters, along with facilitating
141 the use of GUIs for filter creation and manipulation. The language
142 is not Turing-complete: it provides no way to write a loop or a
143 function and variables are not provided.

145 Scripts written in Sieve are executed during final delivery, when the
146 message is moved to the user-accessible mailbox. In systems where
147 the MTA does final delivery, such as traditional Unix mail, it is
148 reasonable to sort when the MTA deposits mail into the user's
149 mailbox.

151 There are a number of reasons to use a filtering system. Mail
152 traffic for most users has been increasing due to increased usage of
153 email, the emergence of unsolicited email as a form of advertising,
154 and increased usage of mailing lists.

156 Experience at Carnegie Mellon has shown that if a filtering system is
157 made available to users, many will make use of it in order to file
158 messages from specific users or mailing lists. However, many others
159 did not make use of the Andrew system's FLAMES filtering language
160 [FLAMES] due to difficulty in setting it up.

162 Because of the expectation that users will make use of filtering if
163 it is offered and easy to use, this language has been made simple
164 enough to allow many users to make use of it, but rich enough that it
165 can be used productively. However, it is expected that GUI-based
166 editors will be the preferred way of editing filters for a large
167 number of users.

169 1.1. Conventions Used in This Document

171 In the sections of this document that discuss the requirements of
172 various keywords and operators, the following conventions have been
173 adopted.
175 The key words "MUST", "MUST NOT", "SHOULD", "SHOULD NOT", and "MAY"
176 in this document are to be interpreted as defined in [KEYWORDS].

178 Each section on a command (test, action, or control) has a line
179 labeled "Usage:". This line describes the usage of the command,
180 including its name and its arguments. Required arguments are listed
181 inside angle brackets ("<" and ">"). Optional arguments are listed
182 inside square brackets ("[" and "]"). Each argument is followed by
183 its type, so "<key: string>" represents an argument called "key" that
184 is a string. Literal strings are represented with double-quoted
185 strings. Alternatives are separated with slashes, and parentheses
186 are used for grouping, similar to [ABNF].

188 In the "Usage:" line, there are three special pieces of syntax that
189 are frequently repeated, MATCH-TYPE, COMPARATOR, and ADDRESS-PART.
190 These are discussed in sections 2.7.1, 2.7.3, and 2.7.4,
191 respectively.

193 The formal grammar for these commands is defined in section 8 and is the
194 authoritative reference on how to construct commands, but the formal
195 grammar does not specify the order, semantics, number or types of
196 arguments to commands, nor the legal command names. The intent is to
197 allow for extension without changing the grammar.

199 1.2. Example mail messages

201 The following mail messages will be used throughout this document in
202 examples.

204 Message A
205 -----------------------------------------------------------
206 Date: Tue, 1 Apr 1997 09:06:31 -0800 (PST)
207 From: coyote@desert.example.org
208 To: roadrunner@acme.example.com
209 Subject: I have a present for you

211 Look, I'm sorry about the whole anvil thing, and I really
212 didn't mean to try and drop it on you from the top of the
213 cliff. I want to try to make it up to you. I've got some
214 great birdseed over here at my place--top of the line
215 stuff--and if you come by, I'll have it all wrapped up
216 for you. I'm really sorry for all the problems I've caused
217 for you over the years, but I know we can work this out.
218 --
219 Wile E. Coyote "Super Genius" coyote@desert.example.org
220 -----------------------------------------------------------

222 Message B
223 -----------------------------------------------------------
224 From: youcouldberich!@reply-by-postal-mail.invalid
225 Sender: b1ff@de.res.example.com
226 To: rube@landru.example.edu
227 Date: Mon, 31 Mar 1997 18:26:10 -0800
228 Subject: $$$ YOU, TOO, CAN BE A MILLIONAIRE! $$$

230 YOU MAY HAVE ALREADY WON TEN MILLION DOLLARS, BUT I DOUBT
231 IT! SO JUST POST THIS TO SIX HUNDRED NEWSGROUPS! IT WILL
232 GUARANTEE THAT YOU GET AT LEAST FIVE RESPONSES WITH MONEY!
233 MONEY! MONEY! COLD HARD CASH! YOU WILL RECEIVE OVER
234 $20,000 IN LESS THAN TWO MONTHS! AND IT'S LEGAL!!!!!!!!!
235 !!!!!!!!!!!!!!!!!!111111111!!!!!!!11111111111!!1 JUST
236 SEND $5 IN SMALL, UNMARKED BILLS TO THE ADDRESSES BELOW!
237 -----------------------------------------------------------

239 2. Design
240 2.1. Form of the Language

242 The language consists of a set of commands. Each command consists of
243 a set of tokens delimited by whitespace. The command identifier is
244 the first token and it is followed by zero or more argument tokens.
245 Arguments may be literal data, tags, blocks of commands, or test
246 commands.

248 The language is represented in UTF-8, as specified in [UTF-8].

250 Tokens in the US-ASCII range are considered case-insensitive.

252 2.2. Whitespace

254 Whitespace is used to separate tokens. Whitespace is made up of
255 tabs, newlines (CRLF, never just CR or LF), and the space character.
256 The amount of whitespace used is not significant.

258 2.3. Comments

260 Two types of comments are offered. Comments are semantically
261 equivalent to whitespace and can be used anyplace that whitespace is
262 (with one exception in multi-line strings, as described in the
263 grammar).
265 Hash comments begin with a "#" character that is not contained within
266 a string and continue until the next CRLF.

268 Example: if size :over 100K { # this is a comment
269             discard;
270          }

272 Bracketed comments begin with the token "/*" and end with "*/"
273 outside of a string. Bracketed comments may span multiple lines.
274 Bracketed comments do not nest.

276 Example: if size :over 100K { /* this is a comment
277             this is still a comment */ discard /* this is a comment
278             */ ;
279          }

281 2.4. Literal Data

283 Literal data means data that is not executed, merely evaluated "as
284 is", to be used as arguments to commands. Literal data is limited to
285 numbers and strings.

287 2.4.1. Numbers
288 Numbers are given as ordinary decimal numbers. However, those
289 numbers that have a tendency to be fairly large, such as message
290 sizes, MAY have a "K", "M", or "G" appended to indicate a multiple of
291 a power of two. To be comparable with the power-of-two-based
292 versions of SI units that computers frequently use, K specifies
293 kibi-, or 1,024 (2^10) times the value of the number; M specifies
294 mebi-, or 1,048,576 (2^20) times the value of the number; and G
295 specifies gibi-, or 1,073,741,824 (2^30) times the value of the
296 number [BINARY-SI].

298 Implementations MUST provide 31 bits of magnitude in numbers, but MAY
299 provide more.

301 Only positive integers are permitted by this specification.

303 2.4.2. Strings

305 Scripts involve large numbers of strings as they are used for pattern
306 matching, addresses, textual bodies, etc. Typically, short quoted
307 strings suffice for most uses, but a more convenient form is provided
308 for longer strings such as bodies of messages.

310 A quoted string starts and ends with a single double quote (the <">
311 character, US-ASCII 34). A backslash ("\", ASCII 92) inside of a
312 quoted string is followed by either another backslash or a double
313 quote. This two-character sequence represents a single backslash or
314 double-quote within the string, respectively.

316 Scripts SHOULD NOT escape other characters with a backslash.

318 An undefined escape sequence (such as "\a" in a context where "a" has
319 no special meaning) is interpreted as if there were no backslash (in
320 this case, "\a" is just "a").

322 Non-printing characters such as tabs, CR and LF, and control
323 characters are permitted in quoted strings. Quoted strings MAY span
324 multiple lines. NUL (US-ASCII 0) is not allowed in strings.

326 For entering larger amounts of text, such as an email message, a
327 multi-line form is allowed. It starts with the keyword "text:",
328 followed by a CRLF, and ends with the sequence of a CRLF, a single
329 period, and another CRLF. In order to allow the message to contain
330 lines with a single-dot, lines are dot-stuffed. That is, when
331 composing a message body, an extra `.' is added before each line
332 which begins with a `.'. When the server interprets the script,
333 these extra dots are removed. Note that a line that begins with a
334 dot followed by a non-dot character is not interpreted dot-stuffed;
335 that is, ".foo" is interpreted as ".foo". However, because this is
336 potentially ambiguous, scripts SHOULD be properly dot-stuffed so such
337 lines do not appear.

339 Note that a hashed comment or whitespace may occur in between the
340 "text:" and the CRLF, but not within the string itself. Bracketed
341 comments are not allowed here.

343 2.4.2.1. String Lists

345 When matching patterns, it is frequently convenient to match against
346 groups of strings instead of single strings. For this reason, a list
347 of strings is allowed in many tests, implying that if the test is
348 true using any one of the strings, then the test is true.
349 Implementations are encouraged to use short-circuit evaluation in
350 these cases.
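
The literal-data rules of section 2.4 up to this point (number suffixes, quoted-string escapes, dot-stuffing, and the any-one-of reading of string lists) can be sketched in Python as below. This is only an illustration under stated assumptions, not part of the specification: the function names are hypothetical, lowercase suffixes are accepted on the assumption that the case-insensitivity of tokens (section 2.1) extends to them, and the 31-bit magnitude limit is not checked.

```python
import re

# Multipliers for the optional suffix described in section 2.4.1.
_MULTIPLIER = {"": 1, "K": 2**10, "M": 2**20, "G": 2**30}


def parse_number(token: str) -> int:
    """Convert a number token such as '100K' to an integer (hypothetical helper)."""
    m = re.fullmatch(r"([0-9]+)([KMG]?)", token, flags=re.IGNORECASE)
    if m is None:
        raise ValueError(f"not a Sieve number: {token!r}")
    return int(m.group(1)) * _MULTIPLIER[m.group(2).upper()]


def unquote(token: str) -> str:
    """Decode a quoted-string token, given with its surrounding double quotes."""
    if len(token) < 2 or token[0] != '"' or token[-1] != '"':
        raise ValueError("a quoted string starts and ends with a double quote")
    # A backslash followed by any character yields that character, so
    # \\ becomes \, \" becomes ", and an undefined escape such as \a becomes a.
    return re.sub(r"\\(.)", r"\1", token[1:-1], flags=re.DOTALL)


def unstuff(lines):
    """Remove the extra dot from dot-stuffed lines of a multi-line string.

    Only lines starting with two dots lose one; a line such as '.foo'
    is left as it is, as the text above describes.
    """
    return [line[1:] if line.startswith("..") else line for line in lines]


def any_key(value, keys, predicate):
    """True if predicate(value, key) holds for any key; any() short-circuits.

    A single string is treated as a one-element string list.
    """
    key_list = [keys] if isinstance(keys, str) else list(keys)
    return any(predicate(value, key) for key in key_list)
```

For instance, `parse_number("100K")` gives 102400 (100 times 1,024), and `any_key("abc", ["x", "b"], lambda v, k: k in v)` stops at the first key that matches.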
352 For instance, the test `header :contains ["To", "Cc"]
353 ["me@example.com", "me00@landru.example.edu"]' is true if either a To
354 header or Cc header of the input message contains either of the email
355 addresses "me@example.com" or "me00@landru.example.edu".

357 Conversely, in any case where a list of strings is appropriate, a
358 single string is allowed without being a member of a list: it is
359 equivalent to a list with a single member. This means that the test
360 `exists "To"' is equivalent to the test `exists ["To"]'.

362 2.4.2.2. Headers

364 Headers are a subset of strings. In the Internet Message
365 Specification [IMAIL], each header line is allowed to have whitespace
366 nearly anywhere in the line, including after the field name and
367 before the subsequent colon. Extra spaces between the header name
368 and the ":" in a header field are ignored.

370 A header name never contains a colon. The "From" header refers to a
371 line beginning "From:" (or "From :", etc.). No header will match
372 the string "From:" due to the trailing colon.

374 Similarly, syntactically invalid header names cause the same result as
375 syntactically valid header names that are not present in the message.
376 In particular, an implementation MUST NOT cause an error for
377 syntactically invalid header names in tests.

379 Header lines are unfolded as described in [IMAIL] section 2.2.3.
380 Interpretation of header data SHOULD be done according to [MIME3]
381 section 6.2 (see 2.7.2 below for details).

383 2.4.2.3. Addresses
384 A number of commands call for email addresses, which are also a
385 subset of strings. When these addresses are used in outbound
386 contexts, addresses must be compliant with [IMAIL], but are further
387 constrained. Using the symbols defined in [IMAIL], section 3, the
388 syntax of an address is:

390 sieve-address = addr-spec                ; simple address
391               / phrase "<" addr-spec ">" ; name & addr-spec

393 That is, routes and group syntax are not permitted. If multiple
394 addresses are required, use a string list. Named groups are not used
395 here.

397 Implementations MUST ensure that the addresses are syntactically
398 valid, but need not ensure that they actually identify an email
399 recipient.

401 2.5. Tests

403 Tests are given as arguments to commands in order to control their
404 actions. In this document, tests are given to if/elsif/else to
405 decide which block of code is run.

407 2.5.1. Test Lists

409 Some tests ("allof" and "anyof", which implement logical "and" and
410 logical "or", respectively) may require more than a single test as an
411 argument. The test-list syntax element provides a way of grouping
412 tests.

414 Example: if anyof (not exists ["From", "Date"],
415                    header :contains "from" "fool@example.edu") {
416             discard;
417          }

419 2.6. Arguments

421 In order to specify what to do, most commands take arguments. There
422 are three types of arguments: positional, tagged, and optional.

424 2.6.1. Positional Arguments

426 Positional arguments are given to a command which discerns their
427 meaning based on their order. When a command takes positional
428 arguments, all positional arguments must be supplied and must be in
429 the order prescribed.

431 2.6.2. Tagged Arguments
432 This document provides for tagged arguments in the style of
433 CommonLISP. These are also similar to flags given to commands in
434 most command-line systems.

436 A tagged argument is an argument for a command that begins with ":"
437 followed by a tag naming the argument, such as ":contains". This
438 argument means that zero or more of the next tokens have some
439 particular meaning depending on the argument. These next tokens may
440 be numbers or strings but they are never blocks.

442 Tagged arguments are similar to positional arguments, except that
443 instead of the meaning being derived from the command, it is derived
444 from the tag.
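
The way a tag claims "zero or more of the next tokens" can be pictured with a small Python sketch. Everything here is hypothetical and for illustration only: the `Tag` class, the `TAG_ARITY` table, and `split_arguments` are not defined by this document, and the sketch assumes the script has already been split into tokens, with tags kept distinct from string literals.

```python
class Tag(str):
    """A tag token such as ':contains', kept distinct from string literals."""


# Hypothetical table: how many of the following tokens each tag consumes.
TAG_ARITY = {
    ":is": 0,
    ":contains": 0,
    ":matches": 0,
    ":comparator": 1,  # followed by the comparator name
}


def split_arguments(tokens):
    """Separate a command's argument tokens into tagged and positional ones."""
    tagged, positional = {}, []
    i = 0
    while i < len(tokens):
        token = tokens[i]
        if isinstance(token, Tag):
            name = token.lower()  # tokens are case-insensitive
            arity = TAG_ARITY[name]
            # The tag gives meaning to the next `arity` tokens.
            tagged[name] = list(tokens[i + 1:i + 1 + arity])
            i += 1 + arity
        else:
            # The command gives meaning to this token by its position.
            positional.append(token)
            i += 1
    return tagged, positional
```

Applied to the arguments of `header :contains :comparator "i;octet" "Subject" "MAKE MONEY FAST"`, the sketch puts `:contains` (no values) and `:comparator` (one value, "i;octet") in the tagged group and leaves "Subject" and "MAKE MONEY FAST" as positional arguments.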
446 Tagged arguments must appear before positional arguments, but they
447 may appear in any order with other tagged arguments. For simplicity
448 of the specification, this is not expressed in the syntax definitions
449 with commands, but they still may be reordered arbitrarily provided
450 they appear before positional arguments. Tagged arguments may be
451 mixed with optional arguments.

453 To simplify this specification, tagged arguments SHOULD NOT take
454 tagged arguments as arguments.

456 2.6.3. Optional Arguments

458 Optional arguments are exactly like tagged arguments except that they
459 may be left out, in which case a default value is implied. Because
460 optional arguments tend to result in shorter scripts, they have been
461 used far more than tagged arguments.

463 One particularly noteworthy case is the ":comparator" argument, which
464 allows the user to specify which comparator [COLLATION] will be used
465 to compare two strings, since different languages may impose
466 different orderings on UTF-8 [UTF-8] characters.

468 2.6.4. Types of Arguments

470 Abstractly, arguments may be literal data, tests, or blocks of
471 commands. In this way, an "if" control structure is merely a command
472 that happens to take a test and a block as arguments and may execute
473 the block of code.

475 However, this abstraction is ambiguous from a parsing standpoint.
476 The grammar in section 8.2 presents a parsable version of this:
477 Arguments are string-lists, numbers, and tags, which may be followed
478 by a test or a test-list, which may be followed by a block of
479 commands. No more than one test or test list, nor more than one
480 block of commands, may be used, and commands that end with a block of
481 commands do not end with semicolons.

483 2.7. String Comparison

485 When matching one string against another, there are a number of ways
486 of performing the match operation. These are accomplished with three
487 types of matches: an exact match, a substring match, and a wildcard
488 glob-style match. These are described below.

490 In order to provide for matches between character sets and case
491 insensitivity, Sieve uses the comparators defined in the Internet
492 Application Protocol Collation Registry [COLLATION].

494 However, when a string represents the name of a header, the
495 comparator is never user-specified. Header comparisons are always
496 done with the "en;ascii-casemap" operator, i.e., case-insensitive
497 comparisons, because this is the way things are defined in the
498 message specification [IMAIL].

500 2.7.1. Match Type

502 There are three match types describing the matching used in this
503 specification: ":is", ":contains", and ":matches". Match type
504 arguments are supplied to those commands which allow them to specify
505 what kind of match is to be performed.

507 These are used as tagged arguments to tests that perform string
508 comparison.

510 The ":contains" match type describes a substring match. If the value
511 argument contains the key argument as a substring, the match is true.
512 For instance, the string "frobnitzm" contains "frob" and "nit", but
513 not "fbm". The empty key ("") is contained in all values.

515 The ":is" match type describes an absolute match; if the contents of
516 the first string are absolutely the same as the contents of the
517 second string, they match. Only the string "frobnitzm" is the string
518 "frobnitzm". The empty key ":is" and only ":is" the empty value.

520 The ":matches" match type specifies a wildcard match using the
521 characters "*" and "?"; the entire value must be matched. "*"
522 matches zero or more characters, and "?" matches a single character,
523 using the definition of character appropriate for the comparator in
524 use. That is, "?" will match exactly one octet when the "i;octet" or
525 "en;ascii-casemap" comparators are used, but will match the one or
526 more octets that compose a character in UTF-8 when the
527 "i;basic;uca=3.1.1;uv=3.2" comparator is used. "?" and "*" may be
528 escaped as "\\?" and "\\*" in strings to match against themselves.
529 The first backslash escapes the second backslash; together, they
530 escape the "*". This is awkward, but it is commonplace in several
531 programming languages that use globs and regular expressions.

533 In order to specify what type of match is supposed to happen,
534 commands that support matching take optional tagged arguments
535 ":matches", ":is", and ":contains". Commands default to using ":is"
536 matching if no match type argument is supplied. Note that these
537 modifiers interact with comparators; in particular, only comparators
538 that support the "substring match" operation are suitable for
539 matching with ":contains" or ":matches". It is an error to use a
540 comparator with ":contains" or ":matches" that is not compatible with
541 it.

543 It is an error to give more than one of these arguments to a given
544 command.

546 For convenience, the "MATCH-TYPE" syntax element is defined here as
547 follows:

549 Syntax: ":is" / ":contains" / ":matches"

551 2.7.2. Comparisons Across Character Sets

553 All Sieve scripts are represented in UTF-8, but messages may involve
554 a number of character sets. In order for comparisons to work across
555 character sets, implementations SHOULD implement the following
556 behavior:

558 Comparisons are performed on octets. Implementations convert text
559 from header fields in all charsets [MIME3] to Unicode, encoded as
560 UTF-8, as input to the comparator (see 2.7.3). Implementations
561 MUST be capable of converting US-ASCII, ISO-8859-1, the US-ASCII
562 subset of ISO-8859-* character sets, and UTF-8.
Text that the 563 implementation cannot convert to Unicode for any reason MAY be 564 treated as plain US-ASCII (including any [MIME3] syntax) or 565 processed according to local conventions. An encoded NUL octet 566 (character zero) SHOULD NOT cause early termination of the header 567 content being compared against. 569 If implementations fail to support the above behavior, they MUST 570 conform to the following: 572 No two strings can be considered equal if one contains octets 573 greater than 127. 575 2.7.3. Comparators 576 In order to allow for language-independent, case-independent matches, 577 the match type may be coupled with a comparator name. The Internet 578 Application Protocol Collation Registry [COLLATION] provides the 579 framework for describing and naming comparators as used by this 580 specification. 582 All implementations MUST support the "i;octet" comparator (simply 583 compares octets), the "en;ascii-casemap" comparator (which treats 584 uppercase and lowercase characters in the US-ASCII subset of UTF-8 as 585 the same), as well as the "i;ascii-casemap" comparator, which is a 586 deprecated synonym for "en;ascii-casemap". If left unspecified, the 587 default is "en;ascii-casemap". 589 Some comparators may not be usable with substring matches; that is, 590 they may only work with ":is". It is an error to try and use a 591 comparator with ":matches" or ":contains" that is not compatible with 592 it. 594 A comparator is specified by the ":comparator" option with commands 595 that support matching. This option is followed by a string providing 596 the name of the comparator to be used. 
For convenience, the syntax 597 of a comparator is abbreviated to "COMPARATOR", and (repeated in 598 several tests) is as follows: 600 Syntax: ":comparator" 602 So in this example, 604 Example: if header :contains :comparator "i;octet" "Subject" 605 "MAKE MONEY FAST" { 606 discard; 607 } 609 would discard any message with subjects like "You can MAKE MONEY 610 FAST", but not "You can Make Money Fast", since the comparator used 611 is case-sensitive. 613 Comparators other than "i;octet", "en;ascii-casemap", and "i;ascii- 614 casemap" must be declared with require, as they are extensions. If a 615 comparator declared with require is not known, it is an error, and 616 execution fails. If the comparator is not declared with require, it 617 is also an error, even if the comparator is supported. (See 2.10.5.) 619 Both ":matches" and ":contains" match types are compatible with the 620 "i;octet" and "en;ascii-casemap" comparators and may be used with 621 them. 623 It is an error to give more than one of these arguments to a given 624 command. 626 2.7.4. Comparisons Against Addresses 628 Addresses are one of the most frequent things represented as strings. 629 These are structured, and being able to compare against the local- 630 part or the domain of an address is useful, so some tests that act 631 exclusively on addresses take an additional optional argument that 632 specifies what the test acts on. 634 These optional arguments are ":localpart", ":domain", and ":all", 635 which act on the local-part (left-side), the domain part (right- 636 side), and the whole address. 638 The kind of comparison done, such as whether or not the test done is 639 case-insensitive, is specified as a comparator argument to the test. 641 If an optional address-part is omitted, the default is ":all". 643 It is an error to give more than one of these arguments to a given 644 command. 
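The address-part selection described above can be sketched as follows. This is only an illustration, not part of the specification: the function name address_part is hypothetical, the input is assumed to be a bare address with any phrase, comments, and group syntax already removed, and treating a string without "@" as a local-part is an assumption.

```python
def address_part(address: str, part: str = ":all") -> str:
    """Return the portion of a bare address selected by an ADDRESS-PART tag.

    The default of ":all" mirrors the rule that an omitted address-part
    means the whole address.
    """
    if part not in (":localpart", ":domain", ":all"):
        raise ValueError(f"unknown address-part: {part}")
    # Split at the last "@" so a quoted local-part containing "@" stays intact.
    local, at, domain = address.rpartition("@")
    if not at:
        # Assumption: with no "@" present, the whole string is the local-part.
        local, domain = address, ""
    if part == ":localpart":
        return local
    if part == ":domain":
        return domain
    return address
```

The selected part would then be handed to the comparator and match type, which decide (for example) whether the comparison is case-insensitive.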
646 For convenience, the "ADDRESS-PART" syntax element is defined here as 647 follows: 649 Syntax: ":localpart" / ":domain" / ":all" 651 2.8. Blocks 653 Blocks are sets of commands enclosed within curly braces and supplied 654 as the final argument to a command. Such a command is a control 655 structure: when executed it has control over the number of times the 656 commands in the block are executed. 658 With the commands supplied in this memo, there are no loops. The 659 control structures supplied--if, elsif, and else--run a block either 660 once or not at all. 662 2.9. Commands 664 Sieve scripts are sequences of commands. Commands can take any of 665 the tokens above as arguments, and arguments may be either tagged or 666 positional arguments. Not all commands take all arguments. 668 There are three kinds of commands: test commands, action commands, 669 and control commands. 671 The simplest is an action command. An action command is an 672 identifier followed by zero or more arguments, terminated by a 673 semicolon. Action commands do not take tests or blocks as arguments. 675 A control command is a command that affects the parsing or the flow 676 of execution of the Sieve script in some way. A control structure is 677 a control command which ends with a block instead of a semicolon. 679 A test command is used as part of a control command. It is used to 680 specify whether or not the block of code given to the control command 681 is executed. 683 2.10. Evaluation 685 2.10.1. Action Interaction 687 Some actions cannot be used with other actions because the result 688 would be absurd. These restrictions are noted throughout this memo. 690 Extension actions MUST state how they interact with actions defined 691 in this specification. 693 2.10.2. Implicit Keep 695 Previous experience with filtering systems suggests that cases tend 696 to be missed in scripts. To prevent errors, Sieve has an "implicit 697 keep".
699 An implicit keep is a keep action (see 4.3) performed in absence of 700 any action that cancels the implicit keep. 702 An implicit keep is performed if a message is not written to a 703 mailbox, redirected to a new address, or explicitly thrown out. That 704 is, if a fileinto, a keep, a redirect, or a discard is performed, an 705 implicit keep is not. 707 Some actions may be defined to not cancel the implicit keep. These 708 actions may not directly affect the delivery of a message, and are 709 used for their side effects. None of the actions specified in this 710 document meet that criterion, but extension actions will. 712 For instance, with any of the short messages offered above, the 713 following script produces no actions. 715 Example: if size :over 500K { discard; } 717 As a result, the implicit keep is taken. 719 2.10.3. Message Uniqueness in a Mailbox 720 Implementations SHOULD NOT deliver a message to the same folder more 721 than once, even if a script explicitly asks for a message to be 722 written to a mailbox twice. 724 The test for equality of two messages is implementation-defined. 726 If a script asks for a message to be written to a mailbox twice, it 727 MUST NOT be treated as an error. 729 2.10.4. Limits on Numbers of Actions 731 Site policy MAY limit numbers of actions taken and MAY impose 732 restrictions on which actions can be used together. In the event 733 that a script hits a policy limit on the number of actions taken for 734 a particular message, an error occurs. 736 Implementations MUST allow at least one keep or one fileinto. If 737 fileinto is not implemented, implementations MUST allow at least one 738 keep. 740 2.10.5. Extensions and Optional Features 742 Because of the differing capabilities of many mail systems, several 743 features of this specification are optional. Before any of these 744 extensions can be executed, they must be declared with the "require" 745 action.
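As a rough illustration of the declaration rules in this section, an implementation might validate a script's capabilities before running it along the following lines. This is a sketch under assumptions, not part of the specification: the function name check_requires and its three arguments are hypothetical, and how an implementation collects the declared and used capability names is left open.

```python
def check_requires(declared, used, supported):
    """Return a list of error strings; an empty list means the script may run.

    declared:  capability names listed in the script's require commands
    used:      capability names the script actually relies on
    supported: capability names known to this implementation
    Membership tests are case-sensitive, so "foo" and "FOO" are distinct.
    """
    errors = []
    for cap in declared:
        # A script that requires an unknown capability must not be executed.
        if cap not in supported:
            errors.append(f"unknown capability: {cap}")
    for cap in used:
        # An extension that was not declared is treated as unsupported,
        # even when the implementation does support it.
        if cap not in declared:
            errors.append(f"capability used without require: {cap}")
    return errors
```

Returning all errors at once, rather than stopping at the first, is a design choice of this sketch; the specification only requires that such a script is not executed.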
747 If an extension is not enabled with "require", implementations MUST 748 treat it as if they did not support it at all. 750 If a script does not understand an extension declared with require, 751 the script must not be used at all. Implementations MUST NOT execute 752 scripts which require unknown capability names. 754 Note: The reason for this restriction is that prior experiences with 755 languages such as LISP and Tcl suggest that this is a workable 756 way of noting that a given script uses an extension. 758 Experience with PostScript suggests that mechanisms that allow 759 a script to work around missing extensions are not used in 760 practice. 762 Extensions which define actions MUST state how they interact with 763 actions discussed in the base specification. 765 2.10.6. Errors 767 In any programming language, there are compile-time and run-time 768 errors. 770 Compile-time errors are ones in syntax that are detectable if a 771 syntax check is done. 773 Run-time errors are not detectable until the script is run. This 774 includes transient failures like disk full conditions, but also 775 includes issues like invalid combinations of actions. 777 When an error occurs in a Sieve script, all processing stops. 779 Implementations MAY choose to do a full parse, then evaluate the 780 script, then do all actions. Implementations might even go so far as 781 to ensure that execution is atomic (either all actions are executed 782 or none are executed). 784 Other implementations may choose to parse and run at the same time. 785 Such implementations are simpler, but have issues with partial 786 failure (some actions happen, others don't). 788 Implementations might even go so far as to ensure that scripts can 789 never execute an invalid set of actions before execution, although 790 this could involve solving the Halting Problem. 792 This specification allows any of these approaches. Solving the 793 Halting Problem is considered extra credit. 
795 Implementations MUST perform syntactic, semantic, and run-time checks 796 on code that is actually executed. Implementations MAY perform those 797 checks or any part of them on code that is not reached during 798 execution. 800 When an error happens, implementations MUST notify the user that an 801 error occurred, which actions (if any) were taken, and do an implicit 802 keep. 804 2.10.7. Limits on Execution 806 Implementations may limit certain constructs. However, this 807 specification places a lower bound on some of these limits. 809 Implementations MUST support fifteen levels of nested blocks. 811 Implementations MUST support fifteen levels of nested test lists. 813 3. Control Commands 815 Control structures are needed to allow for multiple and conditional 816 actions. 818 3.1. Control If 820 There are three pieces to if: "if", "elsif", and "else". Each is 821 actually a separate command in terms of the grammar. However, an 822 elsif or else MUST only follow an if or elsif. An error occurs if 823 these conditions are not met. 825 Usage: if 827 Usage: elsif 829 Usage: else 831 The semantics are similar to those of any of the many other 832 programming languages these control structures appear in. When the 833 interpreter sees an "if", it evaluates the test associated with it. 834 If the test is true, it executes the block associated with it. 836 If the test of the "if" is false, it evaluates the test of the first 837 "elsif" (if any). If the test of "elsif" is true, it runs the 838 elsif's block. An elsif may be followed by an elsif, in which case, 839 the interpreter repeats this process until it runs out of elsifs. 841 When the interpreter runs out of elsifs, there may be an "else" case. 842 If there is, and none of the if or elsif tests were true, the 843 interpreter runs the else case. 845 This provides a way of performing exactly one of the blocks in the 846 chain. 848 In the following example, both Message A and B are dropped. 
850 Example: require "fileinto"; 851 if header :contains "from" "coyote" { 852 discard; 853 } elsif header :contains ["subject"] ["$$$"] { 854 discard; 855 } else { 856 fileinto "INBOX"; 857 } 859 When the script below is run over message A, it redirects the message 860 to acm@example.edu; message B, to postmaster@example.edu; any other 861 message is redirected to field@example.edu. 863 Example: if header :contains ["From"] ["coyote"] { 864 redirect "acm@example.edu"; 865 } elsif header :contains "Subject" "$$$" { 866 redirect "postmaster@example.edu"; 867 } else { 868 redirect "field@example.edu"; 869 } 871 Note that this definition prohibits the "... else if ..." sequence 872 used by C. This is intentional, because this construct produces a 873 shift-reduce conflict. 875 3.2. Control Require 877 Usage: require 879 The require action notes that a script makes use of a certain 880 extension. Such a declaration is required to use the extension, as 881 discussed in section 2.10.5. Multiple capabilities can be declared 882 with a single require. 884 The require command, if present, MUST be used before anything other 885 than a require can be used. An error occurs if a require appears 886 after a command other than require. 888 Example: require ["fileinto", "reject"]; 890 Example: require "fileinto"; 891 require "vacation"; 893 3.3. Control Stop 895 Usage: stop 897 The "stop" action ends all processing. If no actions have been 898 executed, then the keep action is taken. 900 4. Action Commands 902 This document supplies four actions that may be taken on a message: 903 keep, fileinto, redirect, and discard. 905 Implementations MUST support the "keep", "discard", and "redirect" 906 actions. 908 Implementations SHOULD support "fileinto". 910 Implementations MAY limit the number of certain actions taken (see 911 section 2.10.4). 913 4.1. Action fileinto 915 Usage: fileinto 917 The "fileinto" action delivers the message into the specified folder. 
918 Implementations SHOULD support fileinto, but in some environments 919 this may be impossible. Implementations MAY place restrictions on 920 folder names; use of an invalid folder name MAY be treated as an 921 error or result in delivery to an implementation-defined folder. 923 The capability string for use with the require command is "fileinto". 925 In the following script, message A is filed into folder 926 "INBOX.harassment". 928 Example: require "fileinto"; 929 if header :contains ["from"] "coyote" { 930 fileinto "INBOX.harassment"; 931 } 933 4.2. Action redirect 935 Usage: redirect 937 The "redirect" action is used to send the message to another user at 938 a supplied address, as a mail forwarding feature does. The 939 "redirect" action makes no changes to the message body or existing 940 headers, but it may add new headers. The "redirect" modifies the 941 envelope recipient. 943 The redirect command performs an MTA-style "forward"--that is, what 944 you get from a .forward file using sendmail under UNIX. The address 945 on the [SMTP] envelope is replaced with the one on the redirect 946 command and the message is sent back out. (This is not an MUA-style 947 forward, which creates a new message with a different sender and 948 message ID, wrapping the old message in a new one.) 950 A simple script can be used for redirecting all mail: 952 Example: redirect "bart@example.edu"; 954 Implementations SHOULD take measures to implement loop control, 955 possibly including adding headers to the message or counting received 956 headers. If an implementation detects a loop, it causes an error. 958 4.3. Action keep 959 Usage: keep 961 The "keep" action is whatever action is taken in lieu of all other 962 actions, if no filtering happens at all; generally, this simply means 963 to file the message into the user's main mailbox. 
This command 964 provides a way to execute this action without needing to know the 965 name of the user's main mailbox, providing a way to call it without 966 needing to understand the user's setup, or the underlying mail 967 system. 969 For instance, in an implementation where the Internet Message Access 970 Protocol (IMAP) server is running scripts on behalf of the user at 971 time of delivery, a keep command is equivalent to a fileinto "INBOX". 973 Example: if size :under 1M { keep; } else { discard; } 975 Note that the above script is identical to the one below. 977 Example: if not size :under 1M { discard; } 979 4.4. Action discard 981 Usage: discard 983 Discard is used to silently throw away the message. It does so by 984 simply canceling the implicit keep. If discard is used with other 985 actions, the other actions still happen. Discard is compatible with 986 all other actions. (For instance fileinto+discard is equivalent to 987 fileinto.) 989 Discard MUST be silent; that is, it MUST NOT return a non-delivery 990 notification of any kind ([DSN], [MDN], or otherwise). 992 In the following script, any mail from "idiot@example.edu" is thrown 993 out. 995 Example: if header :contains ["from"] ["idiot@example.edu"] { 996 discard; 997 } 999 While an important part of this language, "discard" has the potential 1000 to create serious problems for users: Students who leave themselves 1001 logged in to an unattended machine in a public computer lab may find 1002 their script changed to just "discard". In order to protect users in 1003 this situation (along with similar situations), implementations MAY 1004 keep messages destroyed by a script for an indefinite period, and MAY 1005 disallow scripts that throw out all mail. 1007 5. Test Commands 1009 Tests are used in conditionals to decide which part(s) of the 1010 conditional to execute. 
1012 Implementations MUST support these tests: "address", "allof", 1013 "anyof", "exists", "false", "header", "not", "size", and "true". 1015 Implementations SHOULD support the "envelope" test. 1017 5.1. Test address 1019 Usage: address [COMPARATOR] [ADDRESS-PART] [MATCH-TYPE] 1020 1022 The address test matches Internet addresses in structured headers 1023 that contain addresses. It returns true if any header contains any 1024 key in the specified part of the address, as modified by the 1025 comparator and the match keyword. Whether there are other addresses 1026 present in the header doesn't affect this test; this test does not 1027 provide any way to determine whether an address is the only address 1028 in a header. 1030 Like envelope and header, this test returns true if any combination 1031 of the header-list and key-list arguments match and false otherwise. 1033 Internet email addresses [IMAIL] have the somewhat awkward 1034 characteristic that the local-part to the left of the at-sign is 1035 considered case sensitive, and the domain-part to the right of the 1036 at-sign is case insensitive. The "address" command does not deal 1037 with this itself, but provides the ADDRESS-PART argument for allowing 1038 users to deal with it. 1040 The address primitive never acts on the phrase part of an email 1041 address, nor on comments within that address. It also never acts on 1042 group names, although it does act on the addresses within the group 1043 construct. 1045 Implementations MUST restrict the address test to headers that 1046 contain addresses, but MUST include at least From, To, Cc, Bcc, 1047 Sender, Resent-From, Resent-To, and SHOULD include any other header 1048 that utilizes an "address-list" structured header body. 1050 Example: if address :is :all "from" "tim@example.com" { 1051 discard; 1052 } 1054 5.2. Test allof 1055 Usage: allof 1057 The allof test performs a logical AND on the tests supplied to it. 
1059 Example: allof (false, false) => false 1060 allof (false, true) => false 1061 allof (true, true) => true 1063 The allof test takes as its argument a test-list. 1065 5.3. Test anyof 1067 Usage: anyof 1069 The anyof test performs a logical OR on the tests supplied to it. 1071 Example: anyof (false, false) => false 1072 anyof (false, true) => true 1073 anyof (true, true) => true 1075 5.4. Test envelope 1077 Usage: envelope [COMPARATOR] [ADDRESS-PART] [MATCH-TYPE] 1078 1080 The "envelope" test is true if the specified part of the SMTP (or 1081 equivalent) envelope matches the specified key. This specification 1082 defines the interpretation of the (case insensitive) "from" and "to" 1083 envelope-parts. Additional envelope-parts may be defined by other 1084 extensions; implementations SHOULD consider unknown envelope parts an 1085 error. 1087 If one of the envelope-part strings is (case insensitive) "from", 1088 then matching occurs against the FROM address used in the SMTP MAIL 1089 command. 1091 If one of the envelope-part strings is (case insensitive) "to", then 1092 matching occurs against the TO address used in the SMTP RCPT command 1093 that resulted in this message getting delivered to this user. Note 1094 that only the most recent TO is available, and only the one relevant 1095 to this user. 1097 The envelope-part is a string list and may contain more than one 1098 parameter, in which case all of the strings specified in the key-list 1099 are matched against all parts given in the envelope-part list. 1101 Like address and header, this test returns true if any combination of 1102 the envelope-part list and key-list arguments match and false 1103 otherwise. 1105 All tests against envelopes MUST drop source routes. 1107 If the SMTP transaction involved several RCPT commands, only the data 1108 from the RCPT command that caused delivery to this user is available 1109 in the "to" part of the envelope. 
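The "any combination" rule for the envelope test can be sketched as below. This is an illustrative sketch only: the names ascii_casemap and envelope_is are hypothetical, the envelope is assumed to be available as a dictionary holding one "from" and one "to" address with source routes already dropped, and only the ":is" match type with ":all" and ASCII case-insensitive comparison is modeled.

```python
def ascii_casemap(s: str) -> str:
    """Fold only the US-ASCII lowercase letters a-z to uppercase."""
    return "".join(chr(ord(c) - 32) if "a" <= c <= "z" else c for c in s)


def envelope_is(envelope: dict, parts: list, keys: list) -> bool:
    """True if any listed envelope part equals any key, ignoring ASCII case."""
    for part in parts:
        name = part.lower()  # envelope-part names are case-insensitive
        if name not in ("from", "to"):
            # Unknown envelope parts are treated as an error here.
            raise ValueError(f"unknown envelope part: {part}")
        value = ascii_casemap(envelope[name])
        if any(value == ascii_casemap(key) for key in keys):
            return True
    return False
```

For example, with an envelope of {"from": "Tim@Example.com", "to": "me@example.com"}, the call envelope_is(envelope, ["From"], ["tim@example.com"]) is true.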
1111 If a protocol other than SMTP is used for message transport, 1112 implementations are expected to adapt this command appropriately. 1114 The envelope command is optional. Implementations SHOULD support it, 1115 but the necessary information may not be available in all cases. 1117 Example: require "envelope"; 1118 if envelope :all :is "from" "tim@example.com" { 1119 discard; 1120 } 1122 5.5. Test exists 1124 Usage: exists 1126 The "exists" test is true if the headers listed in the header-names 1127 argument exist within the message. All of the headers must exist or 1128 the test is false. 1130 The following example throws out mail that doesn't have a From header 1131 and a Date header. 1133 Example: if not exists ["From","Date"] { 1134 discard; 1135 } 1137 5.6. Test false 1139 Usage: false 1141 The "false" test always evaluates to false. 1143 5.7. Test header 1145 Usage: header [COMPARATOR] [MATCH-TYPE] 1146 1148 The "header" test evaluates to true if the value of any of the named 1149 headers, ignoring leading and trailing whitespace, matches any key. 1150 The type of match is specified by the optional match argument, which 1151 defaults to ":is" if not specified, as specified in section 2.6. 1153 Like address and envelope, this test returns true if any combination 1154 of the header-names list and key-list arguments match and false 1155 otherwise. 1157 If a header listed in the header-names argument exists, it contains 1158 the empty key (""). However, if the named header is not present, it 1159 does not match any key, including the empty key. So if a message 1160 contained the header 1162 X-Caffeine: C8H10N4O2 1164 these tests on that header evaluate as follows: 1166 header :is ["X-Caffeine"] [""] => false 1167 header :contains ["X-Caffeine"] [""] => true 1169 Testing whether a given header is either absent or doesn't contain 1170 any non-whitespace characters can be done using a negated "header" 1171 test: 1173 not header :matches "Cc" "?*" 1175 5.8. 
Test not 1177 Usage: not 1179 The "not" test takes some other test as an argument, and yields the 1180 opposite result. "not false" evaluates to "true" and "not true" 1181 evaluates to "false". 1183 5.9. Test size 1185 Usage: size <":over" / ":under"> 1187 The "size" test deals with the size of a message. It takes either a 1188 tagged argument of ":over" or ":under", followed by a number 1189 representing the size of the message. 1191 If the argument is ":over", and the size of the message is greater 1192 than the number provided, the test is true; otherwise, it is false. 1194 If the argument is ":under", and the size of the message is less than 1195 the number provided, the test is true; otherwise, it is false. 1197 Exactly one of ":over" or ":under" must be specified, and anything 1198 else is an error. 1200 The size of a message is defined to be the number of octets from the 1201 initial header until the last character in the message body. 1203 Note that for a message that is exactly 4,000 octets, the message is 1204 neither ":over" 4000 octets nor ":under" 4000 octets. 1206 5.10. Test true 1208 Usage: true 1210 The "true" test always evaluates to true. 1212 6. Extensibility 1214 New control commands, actions, and tests can be added to the 1215 language. Sites must make these features known to their users; this 1216 document does not define a way to discover the list of extensions 1217 supported by the server. 1219 Any extensions to this language MUST define a capability string that 1220 uniquely identifies that extension. Capability strings are case- 1221 sensitive; for example, "foo" and "FOO" are different capabilities. 1222 If a new version of an extension changes the functionality of a 1223 previously defined extension, it MUST use a different name.
1225 In a situation where there is a submission protocol and an extension 1226 advertisement mechanism aware of the details of this language, 1227 scripts submitted can be checked against the mail server to prevent 1228 use of an extension that the server does not support. 1230 Extensions MUST state how they interact with constraints defined in 1231 section 2.10, e.g., whether they cancel the implicit keep, and which 1232 actions they are compatible and incompatible with. 1234 6.1. Capability String 1236 Capability strings are typically short strings describing what 1237 capabilities are supported by the server. 1239 Capability strings beginning with "vnd." represent vendor-defined 1240 extensions. Such extensions are not defined by Internet standards or 1241 RFCs, but are still registered with IANA in order to prevent 1242 conflicts. Extensions starting with "vnd." SHOULD be followed by the 1243 name of the vendor and product, such as "vnd.acme.rocket-sled". 1245 The following capability strings are defined by this document: 1247 envelope The string "envelope" indicates that the implementation 1248 supports the "envelope" command. 1250 fileinto The string "fileinto" indicates that the implementation 1251 supports the "fileinto" command. 1253 comparator- The string "comparator-elbonia" is provided if the 1254 implementation supports the "elbonia" comparator. 1255 Therefore, all implementations have at least the 1256 "comparator-i;octet", "comparator-en;ascii-casemap", 1257 and "comparator-i;ascii-casemap" capabilities. However, 1258 these comparators may be used without being declared 1259 with require. 1261 6.2. IANA Considerations 1263 In order to provide a standard set of extensions, a registry is 1264 provided by IANA. Capability names may be registered on a first- 1265 come, first-served basis. Extensions designed for interoperable use 1266 SHOULD be defined as standards track or IESG approved experimental 1267 RFCs. 1269 6.2.1. 
Template for Capability Registrations 1271 The following template is to be used for registering new Sieve 1272 extensions with IANA. 1274 To: iana@iana.org 1275 Subject: Registration of new Sieve extension 1277 Capability name: 1278 Capability keyword: 1279 Capability arguments: 1280 Standards Track/IESG-approved experimental RFC number: 1281 Person and email address to contact for further information: 1283 6.2.2. Initial Capability Registrations 1285 This RFC updates the following entries in the IANA registry for 1286 Sieve extensions. 1288 Capability name: fileinto 1289 Capability keyword: fileinto 1290 Capability arguments: fileinto 1291 Standards Track/IESG-approved experimental RFC number: 1292 This RFC (Sieve base spec) 1293 Person and email address to contact for further information: 1294 The Sieve discussion list 1296 Capability name: envelope 1297 Capability keyword: envelope 1298 Capability arguments: 1299 envelope [COMPARATOR] [ADDRESS-PART] [MATCH-TYPE] 1300 1301 Standards Track/IESG-approved experimental RFC number: 1302 This RFC (Sieve base spec) 1303 Person and email address to contact for further information: 1304 The Sieve discussion list 1306 Capability name: comparator-* 1307 Capability keyword: 1308 comparator-* (anything starting with "comparator-") 1309 Capability arguments: (none) 1310 Standards Track/IESG-approved experimental RFC number: 1311 This RFC, Sieve, by reference to [COLLATION] 1312 Person and email address to contact for further information: 1313 The Sieve discussion list 1315 6.3. Capability Transport 1317 As the range of mail systems that this document is intended to apply 1318 to is quite varied, a method of advertising which capabilities an 1319 implementation supports is difficult due to the wide range of 1320 possible implementations. Such a mechanism, however, should have the 1321 property that the implementation can advertise the complete set of 1322 extensions that it supports. 1324 7.
Transmission 1326 The MIME type for a Sieve script is "application/sieve". 1328 The registration of this type for RFC 2048 requirements is updated as 1329 follows: 1331 Subject: Registration of MIME media type application/sieve 1333 MIME media type name: application 1334 MIME subtype name: sieve 1335 Required parameters: none 1336 Optional parameters: none 1337 Encoding considerations: Most sieve scripts will be textual, 1338 written in UTF-8. When non-7bit characters are used, 1339 quoted-printable is appropriate for transport systems 1340 that require 7bit encoding. 1342 Security considerations: Discussed in section 10 of this RFC. 1343 Interoperability considerations: Discussed in section 2.10.5 1344 of this RFC. 1345 Published specification: this RFC. 1346 Applications which use this media type: sieve-enabled mail servers 1347 Additional information: 1348 Magic number(s): 1349 File extension(s): .siv 1350 Macintosh File Type Code(s): 1351 Person & email address to contact for further information: 1352 See the discussion list at ietf-mta-filters@imc.org. 1353 Intended usage: 1354 COMMON 1355 Author/Change controller: 1356 See Editor information in this RFC. 1358 8. Parsing 1360 The Sieve grammar is separated into tokens and a separate grammar as 1361 most programming languages are. 1363 8.1. Lexical Tokens 1365 Sieve scripts are encoded in UTF-8. The following assumes a valid 1366 UTF-8 encoding; special characters in Sieve scripts are all US-ASCII. 1368 The following are tokens in Sieve: 1370 - identifiers 1371 - tags 1372 - numbers 1373 - quoted strings 1374 - multi-line strings 1375 - other separators 1377 Blanks, horizontal tabs, CRLFs, and comments ("white space") are 1378 ignored except as they separate tokens. Some white space is required 1379 to separate otherwise adjacent tokens and in specific places in the 1380 multi-line strings. CR and LF can only appear in CRLF pairs. 
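The rule that CR and LF can only appear in CRLF pairs lends itself to a simple pre-check before tokenizing. The following is a minimal sketch, assuming the script is available as raw octets; the name crlf_only is hypothetical and nothing here is mandated by the specification.

```python
import re

# Zero or more units, each either a CRLF pair or a single octet
# that is neither CR nor LF.
_LINE_ENDINGS_OK = re.compile(rb"(?:\r\n|[^\r\n])*")


def crlf_only(script: bytes) -> bool:
    """True if every CR and LF in the script belongs to a CRLF pair."""
    return _LINE_ENDINGS_OK.fullmatch(script) is not None
```

A script that fails this check contains a bare CR or a bare LF, so it cannot match the lexical grammar and could be rejected as a compile-time error.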
1382 The other separators are single individual characters, and are 1383 mentioned explicitly in the grammar. 1385 The lexical structure of Sieve is defined in the following grammar 1386 (as described in [ABNF]): 1388 bracket-comment = "/*" *not-star 1*STAR 1389 *(not-star-or-slash *not-star 1*STAR) "/" 1390 ; No */ allowed inside a comment. 1391 ; (No * is allowed unless it is the last 1392 ; character, or unless it is followed by a 1393 ; character that isn't a slash.) 1395 STAR = "*" 1397 not-star = CRLF / %x01-09 / %x0b-0c / %x0e-29 / %x2b-7f / 1398 UTF8-2 / UTF8-3 / UTF8-4 1399 ; either a CRLF pair, OR a single UTF-8 1400 ; character other than NUL, CR, LF, or star 1402 not-star-or-slash = CRLF / %x01-09 / %x0b-0c / %x0e-29 / %x2b-2e / 1403 %x30-7f / UTF8-2 / UTF8-3 / UTF8-4 1404 ; either a CRLF pair, OR a single UTF-8 1405 ; character other than NUL, CR, LF, star, 1406 ; or slash 1408 UTF8-NOT-CRLF = %x01-09 / %x0b-0c / %x0e-7f / 1409 UTF8-2 / UTF8-3 / UTF8-4 1410 ; a single UTF-8 character other than NUL, 1411 ; CR, or LF 1413 UTF8-NOT-PERIOD = %x01-09 / %x0b-0c / %x0e-2d / %x2f-7f / 1414 UTF8-2 / UTF8-3 / UTF8-4 1415 ; a single UTF-8 character other than NUL, 1416 ; CR, LF, or period 1418 UTF8-NOT-NUL = %x01-7f / UTF8-2 / UTF8-3 / UTF8-4 1419 ; a single UTF-8 character other than NUL 1421 UTF8-NOT-QSPECIAL = %x01-09 / %x0b-0c / %x0e-21 / %x23-5b / 1422 %x5d-7f / UTF8-2 / UTF8-3 / UTF8-4 1423 ; a single UTF-8 character other than NUL, 1424 ; CR, LF, double-quote, or backslash 1426 comment = bracket-comment / hash-comment 1428 hash-comment = "#" *UTF8-NOT-CRLF CRLF 1430 identifier = (ALPHA / "_") *(ALPHA / DIGIT / "_") 1432 tag = ":" identifier 1434 number = 1*DIGIT [QUANTIFIER] 1436 QUANTIFIER = "K" / "M" / "G" 1438 quoted-safe = CRLF / UTF8-NOT-QSPECIAL 1439 ; either a CRLF pair, OR a single UTF-8 1440 ; character other than NUL, CR, LF, 1441 ; double-quote, or backslash 1443 quoted-special = "\" ( DQUOTE / "\" ) 1444 ; represents just a double-quote or
backslash 1446 quoted-other = "\" UTF8-NOT-QSPECIAL 1447 ; represents just the UTF8-NOT-QSPECIAL 1448 ; character. SHOULD NOT be used 1450 quoted-text = *(quoted-safe / quoted-special / quoted-other) 1452 quoted-string = DQUOTE quoted-text DQUOTE 1454 multi-line = "text:" *(SP / HTAB) (hash-comment / CRLF) 1455 *(multiline-literal / multiline-dotstuff) 1456 "." CRLF 1458 multiline-literal = [UTF8-NOT-PERIOD *UTF8-NOT-CRLF] CRLF 1460 multiline-dotstuff = "." 1*UTF8-NOT-CRLF CRLF 1461 ; A line containing only "." ends the 1462 ; multi-line. Remove a leading '.' if 1463 ; followed by another '.'. 1465 white-space = 1*(SP / CRLF / HTAB) / comment 1467 8.2. Grammar 1469 The following is the grammar of Sieve after it has been lexically 1470 interpreted. No white space or comments appear below. The start 1471 symbol is "start". Non-terminals for MATCH-TYPE, COMPARATOR, and 1472 ADDRESS-PART are provided for use by extensions. 1474 argument = string-list / number / tag 1476 arguments = *argument [test / test-list] 1478 block = "{" commands "}" 1480 command = identifier arguments ( ";" / block ) 1482 commands = *command 1484 start = commands 1486 string = quoted-string / multi-line 1487 string-list = "[" string *("," string) "]" / string 1488 ; if there is only a single string, the brackets 1489 ; are optional 1491 test = identifier arguments 1493 test-list = "(" test *("," test) ")" 1495 ADDRESS-PART = ":localpart" / ":domain" / ":all" 1497 COMPARATOR = ":comparator" string 1499 MATCH-TYPE = ":is" / ":contains" / ":matches" 1501 9. Extended Example 1503 The following is an extended example of a Sieve script. Note that it 1504 does not make use of the implicit keep. 
1506    #
1507    # Example Sieve Filter
1508    # Declare any optional features or extensions used by the script
1509    #
1510    require ["fileinto"];

1512    #
1513    # Handle messages from known mailing lists
1514    # Move messages from IETF filter discussion list to filter folder
1515    #
1516    if header :is "Sender" "owner-ietf-mta-filters@imc.org"
1517            {
1518            fileinto "filter";  # move to "filter" folder
1519            }
1520    #
1521    # Keep all messages to or from people in my company
1522    #
1523    elsif address :domain :is ["From", "To"] "example.com"
1524            {
1525            keep;               # keep in "In" folder
1526            }

1528    #
1529    # Try and catch unsolicited email. If a message is not to me,
1530    # or it contains a subject known to be spam, file it away.
1531    #
1532    elsif anyof (not address :all :contains
1533                   ["To", "Cc", "Bcc"] "me@example.com",
1534                 header :matches "subject"
1536                   ["*make*money*fast*", "*university*dipl*mas*"])
1537            {
1538            # If message header does not contain my address,
1539            # it's from a list.
1540            fileinto "spam";   # move to "spam" folder
1541            }
1542    else
1543            {
1544            # Move all other (non-company) mail to "personal"
1545            # folder.
1546            fileinto "personal";
1547            }

1549 10. Security Considerations

1551    Users must get their mail. It is imperative that whatever method
1552    implementations use to store the user-defined filtering scripts be
1553    secure.

1555    It is equally important that implementations sanity-check the user's
1556    scripts, and not allow users to create on-demand mailbombs. For
1557    instance, an implementation that allows a user to redirect a message
1558    multiple times might also allow a user to create a mailbomb triggered
1559    by mail from a specific user. Site- or implementation-defined limits
1560    on actions are useful for this.

1562    Several commands, such as "discard", "redirect", and "fileinto", allow
1563    for actions to be taken that are potentially very dangerous.

1565    Implementations SHOULD take measures to prevent languages from
1566    looping.
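As a rough illustration of the site- or implementation-defined limits on actions mentioned above, the following Python sketch caps how many times one script execution may request a given action. Everything in it is an assumption made for illustration only: the names ActionLimiter, ActionLimitExceeded, and run_redirects, as well as the default limit of 4 redirects, are hypothetical and are not defined by this document. An implementation may enforce such limits in an entirely different way.

```python
class ActionLimitExceeded(Exception):
    """Raised when a script requests more actions than the site allows."""


class ActionLimiter:
    """Counts actions requested during one script execution.

    Hypothetical helper; not part of the Sieve specification.
    """

    def __init__(self, limits):
        # limits maps an action name (e.g. "redirect") to its maximum count.
        # Actions without an entry are treated as unlimited in this sketch.
        self.limits = dict(limits)
        self.counts = {}

    def record(self, action):
        # Compute the new count, then compare it against the site limit
        # before committing it.
        count = self.counts.get(action, 0) + 1
        limit = self.limits.get(action)
        if limit is not None and count > limit:
            raise ActionLimitExceeded(
                f"{action!r} requested {count} times; limit is {limit}")
        self.counts[action] = count
        return count


def run_redirects(addresses, max_redirects=4):
    """Return the redirect targets if they fit within the assumed limit.

    Raises ActionLimitExceeded as soon as the limit would be passed.
    """
    limiter = ActionLimiter({"redirect": max_redirects})
    accepted = []
    for address in addresses:
        limiter.record("redirect")
        accepted.append(address)
    return accepted
```

In this sketch a script that exceeds the limit fails as a whole rather than silently dropping the extra redirects; whether to fail, truncate, or fall back to the implicit keep is a policy choice left to the implementation.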
1568    As with any filter on a message stream, if the Sieve implementation
1569    and the mail agents 'behind' Sieve in the message stream differ in
1570    their interpretation of the messages, it may be possible for an
1571    attacker to subvert the filter. Of particular note are differences
1572    in the interpretation of malformed messages (e.g., missing or extra
1573    syntax characters) or those that exhibit corner cases (e.g., NUL
1574    octets encoded via [MIME3]).

1576 11. Acknowledgments

1578    This document has been revised in part based on comments and
1579    discussions that took place on and off the SIEVE mailing list.
1580    Thanks to Cyrus Daboo, Ned Freed, Michael Haardt, Kjetil Torgrim
1581    Homme, Barry Leiba, Mark E. Mallett, Alexey Melnikov, Rob Siemborski,
1582    and Nigel Swinson for reviews and suggestions.

1584 12. Editors' Addresses

1586    Philip Guenther
1587    Sendmail, Inc.
1588    6425 Christie St. Ste 400
1589    Emeryville, CA 94608
1590    Email: guenther@sendmail.com

1592    Tim Showalter
1593    Email: tjs@psaux.com

1595 13. Normative References

1597    [ABNF]     D. Crocker, Ed., and P. Overell, "Augmented BNF for Syntax
1598               Specifications: ABNF", RFC 4234, October 2005.

1600    [COLLATION] Newman, C., Duerst, M., and A. Gulbrandsen, "Internet
1601               Application Protocol Collation Registry", draft-
1602               newman-i18n-comparator-07.txt (work in progress),
1603               March 2006.

1605    [IMAIL]    P. Resnick, Ed., "Internet Message Format", RFC 2822,
1606               April 2001.

1608    [KEYWORDS] Bradner, S., "Key words for use in RFCs to Indicate
1609               Requirement Levels", BCP 14, RFC 2119, March 1997.

1611    [MIME]     Freed, N. and N. Borenstein, "Multipurpose Internet Mail
1612               Extensions (MIME) Part One: Format of Internet
1613               Message Bodies", RFC 2045, November 1996.

1615    [MIME3]    Moore, K., "MIME (Multipurpose Internet Mail Extensions)
1616               Part Three: Message Header Extensions for Non-ASCII
1617               Text", RFC 2047, November 1996.

1619    [MDN]      T. Hansen, Ed., G. Vaudreuil, Ed., "Message Disposition
1620               Notification", RFC 3798, May 2004.

1622    [SMTP]     J. Klensin, Ed., "Simple Mail Transfer Protocol", RFC
1623               2821, April 2001.

1625    [UTF-8]    Yergeau, F., "UTF-8, a transformation format of ISO
1626               10646", RFC 3629, November 2003.

1628 14. Informative References

1630    [BINARY-SI] "Standard IEC 60027-2: Letter symbols to be used in
1631               electrical technology - Part 2: Telecommunications and
1632               electronics", January 1999.

1634    [DSN]      Moore, K. and G. Vaudreuil, "An Extensible Message Format
1635               for Delivery Status Notifications", RFC 1894, January
1636               1996.

1638    [FLAMES]   Borenstein, N., and C. Thyberg, "Power, Ease of Use, and
1639               Cooperative Work in a Practical Multimedia Message
1640               System", Int. J. of Man-Machine Studies, April, 1991.
1641               Reprinted in Computer-Supported Cooperative Work and
1642               Groupware, Saul Greenberg, editor, Harcourt Brace
1643               Jovanovich, 1991. Reprinted in Readings in Groupware and
1644               Computer-Supported Cooperative Work, Ronald Baecker,
1645               editor, Morgan Kaufmann, 1993.

1647    [IMAP]     Crispin, M., "Internet Message Access Protocol - version
1648               4rev1", RFC 3501, March 2003.

1650 15. Full Copyright Statement

1652    Copyright (C) The Internet Society (2006).

1654    This document is subject to the rights, licenses and restrictions
1655    contained in BCP 78, and except as set forth therein, the authors
1656    retain all their rights.

1658    This document and the information contained herein are provided on an
1659    "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
1660    OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
1661    ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
1662    INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
1663    INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
1664    WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
1666 Intellectual Property

1668    The IETF takes no position regarding the validity or scope of any
1669    Intellectual Property Rights or other rights that might be claimed to
1670    pertain to the implementation or use of the technology described in
1671    this document or the extent to which any license under such rights
1672    might or might not be available; nor does it represent that it has
1673    made any independent effort to identify any such rights. Information
1674    on the procedures with respect to rights in RFC documents can be
1675    found in BCP 78 and BCP 79.

1677    Copies of IPR disclosures made to the IETF Secretariat and any
1678    assurances of licenses to be made available, or the result of an
1679    attempt made to obtain a general license or permission for the use of
1680    such proprietary rights by implementers or users of this
1681    specification can be obtained from the IETF on-line IPR repository at
1682    http://www.ietf.org/ipr.

1684    The IETF invites any interested party to bring to its attention any
1685    copyrights, patents or patent applications, or other proprietary
1686    rights that may cover technology that may be required to implement
1687    this standard. Please address the information to the IETF at ietf-
1688    ipr@ietf.org.

1690 Acknowledgement

1692    Funding for the RFC Editor function is currently provided by the
1693    Internet Society.

1695 Appendix A. Change History

1697    This section will be replaced with a summary of the changes since RFC
1698    3028 when this document leaves the Internet-Draft stage.

1700    Open Issues:
1701    1.  Merge reject back in with textual changes to permit MDNs and
1702        protocol level rejection
1703    2.  Should this recommend/require that fileinto map UTF-8 to mUTF-7
1704        when working with an IMAP store?

1706    Changes from draft-ietf-sieve-3028bis-05.txt
1707    1.  The specifics of what names are acceptable for fileinto and
1708        the handling of invalid names are both implementation-defined.
1709    2.  Update to draft-newman-i18n-comparator-07.txt
1710    3.  Adjust the example in 5.7 again

1712    Changes from draft-ietf-sieve-3028bis-04.txt
1713    1.  Change "Syntax:" to "Usage:"
1714    2.  Update ABNF reference to RFC 4234
1715    3.  Add non-terminals for MATCH-TYPE, COMPARATOR, and ADDRESS-PART
1716    4.  Strip leading and trailing whitespace in the value being matched
1717        by header
1718    5.  Collations operate on octets, not characters, and for character
1719        data that is the UTF-8 encoding of the Unicode characters
1720    6.  :matches uses character definition of comparator

1722    Changes from draft-ietf-sieve-3028bis-03.txt
1723    1.  Remove section 2.4.2.4., MIME Parts, as unreferenced
1724    2.  Update to draft-newman-i18n-comparator-04.txt
1725    3.  Various tweaks to examples and syntax lines
1726    4.  Define "control structure" as a control command with a block
1727        argument, then use it consistently. Reword description of
1728        blocks to match
1729    5.  Clarify that "header" can never match an absent header and give
1730        the preferred way to test for absent or empty
1731    6.  Invalid header name syntax is not an error _in tests_ (but could
1732        be elsewhere)
1733    7.  Implementation SHOULD consider unknown envelope parts an error
1734    8.  Remove explicit "omitted" option from 2.7.2p2

1736    Changes from draft-ietf-sieve-3028bis-02.txt
1737    1.  Change "ASCII" to "US-ASCII" throughout
1738    2.  Tweak section 2.7.2 to not require use of UTF-8 internally and
1739        to explicitly leave implementation-defined the handling of text
1740        that can't be converted to Unicode
1741    3.  Add reference to RFC 2047
1742    4.  Clarify that capability strings are case-sensitive
1743    5.  Clarify that address, envelope, and header return false if no
1744        combination of arguments match
1745    6.  Directly state that code that isn't reached may still be checked
1746        for errors
1747    7.  Invalid header name syntax is not an error
1748    8.  Remove description of header unfolding that conflicts with
1749        [IMAIL]
1750    9.  Warn that filters may be subvertable if agents interpret messages
1751        differently
1752    10. Encoded NUL octets SHOULD NOT cause truncation

1754    Changes from draft-ietf-sieve-3028bis-01.txt
1755    1.  Remove ban on side effects
1756    2.  Remove definition of the 'reject' action, as it is being moved
1757        to the doc that also defines the 'refuse' action
1758    3.  Update capability registrations to reference the mailing list
1759    4.  Add Tim back as an editor
1760    5.  Refer to the zero-length string ("") as "empty" instead of
1761        "null"

1763    Changes from draft-ietf-sieve-3028bis-00.txt
1764    1.  More grammar corrections:
1765        - permit /***/,
1766        - remove ambiguity in finding end of bracket comment,
1767        - require valid UTF-8,
1768        - express quoting in the grammar
1769        - ban bare CR and LF in all locations
1770    2.  Correct a bunch of whitespace and linewrapping nits
1771    3.  Update IMAIL and SMTP references to RFC 2822 and RFC 2821
1772    4.  Require support for en;ascii-casemap comparator as well as the
1773        old i;ascii-casemap. As with the old one, you do not need to
1774        use 'require' to use the new comparator
1776    5.  Update IANA considerations to update the existing registrations
1777        to point at this doc instead of 3028
1778    6.  Scripts SHOULD NOT contain superfluous backslashes
1779    7.  Update Acknowledgments

1781    Changes from RFC 3028
1782    1.  Split references into normative and informative
1783    2.  Update references to current versions of DSN, IMAP, MDN, and
1784        UTF-8 RFCs
1785    3.  Replace "e-mail" with "email"
1786    4.  Incorporate RFC 3028 errata
1787    5.  The "reject" action cancels the implicit keep
1788    6.  Replace references to ACAP with references to the i18n-comparator
1789        draft. Further work is needed to completely sync with that
1790        draft
1791    7.  Start to update grammar to only permit legal UTF-8 (incomplete)
1792        and correct various other errors and typos
1793    8.  Update IPR boilerplate to RFC 3978/3979