idnits 2.17.1

draft-ietf-sieve-3028bis-05.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** It looks like you're using RFC 3978 boilerplate. You should update this
     to the boilerplate described in the IETF Trust License Policy document
     (see https://trustee.ietf.org/license-info), which is required now.
  -- Found old boilerplate from RFC 3978, Section 5.1 on line 16.
  -- Found old boilerplate from RFC 3978, Section 5.5 on line 1663.
  -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 1674.
  -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 1681.
  -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 1687.
  ** Found boilerplate matching RFC 3978, Section 5.4, paragraph 1 (on line
     1651), which is fine, but *also* found old RFC 2026, Section 10.4C,
     paragraph 1 text on line 42.
  ** This document has an original RFC 3978 Section 5.4 Copyright Line,
     instead of the newer IETF Trust Copyright according to RFC 4748.
  ** This document has an original RFC 3978 Section 5.5 Disclaimer, instead
     of the newer disclaimer which includes the IETF Trust according to
     RFC 4748.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  -- The draft header indicates that this document obsoletes RFC3028, but the
     abstract doesn't seem to mention this, which it should.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not
     match the current year
  == The document seems to lack the recommended RFC 2119 boilerplate, even if
     it appears to use RFC 2119 keywords -- however, there's a paragraph with
     a matching beginning. Boilerplate error?
     (The document does seem to have the reference to RFC 2119 which the
     ID-Checklist requires).
  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008. If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment. If not, you may need to add the pre-RFC5378 disclaimer.
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)
  -- The document date (November 2005) is 6709 days in the past. Is this
     intentional?
  -- Found something which looks like a code comment -- if you have code
     sections in the document, please surround them with '<CODE BEGINS>' and
     '<CODE ENDS>' lines.

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Missing Reference: 'COMPARATOR' is mentioned on line 1296, but not
     defined
  == Missing Reference: 'ADDRESS-PART' is mentioned on line 1296, but not
     defined
  == Missing Reference: 'MATCH-TYPE' is mentioned on line 1296, but not
     defined
  == Missing Reference: 'QUANTIFIER' is mentioned on line 1431, but not
     defined
  == Unused Reference: 'MIME' is defined on line 1610, but no explicit
     reference was found in the text
  == Unused Reference: 'IMAP' is defined on line 1646, but no explicit
     reference was found in the text
  ** Obsolete normative reference: RFC 4234 (ref. 'ABNF') (Obsoleted by
     RFC 5234)
  == Outdated reference: A later version (-14) exists of
     draft-newman-i18n-comparator-04
  ** Obsolete normative reference: RFC 2822 (ref. 'IMAIL') (Obsoleted by
     RFC 5322)
  ** Obsolete normative reference: RFC 3798 (ref. 'MDN') (Obsoleted by
     RFC 8098)
  ** Obsolete normative reference: RFC 2821 (ref. 'SMTP') (Obsoleted by
     RFC 5321)
  -- Obsolete informational reference (is this intentional?): RFC 1894
     (ref. 'DSN') (Obsoleted by RFC 3464)
  -- Obsolete informational reference (is this intentional?): RFC 3501
     (ref. 'IMAP') (Obsoleted by RFC 9051)

     Summary: 8 errors (**), 0 flaws (~~), 10 warnings (==), 11 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

  2  Network Working Group                                        P. Guenther
  3  Internet-Draft                                            Sendmail, Inc.
  4  Expires: May 2006                                           T. Showalter
  5  Obsoletes: 3028 (if approved)                                    Editors
  6                                                             November 2005

  8                     Sieve: An Email Filtering Language
  9                      draft-ietf-sieve-3028bis-05.txt

 11  Status of this Memo

 13     By submitting this Internet-Draft, each author represents that any
 14     applicable patent or other IPR claims of which he or she is aware
 15     have been or will be disclosed, and any of which he or she becomes
 16     aware will be disclosed, in accordance with Section 6 of BCP 79.

 18     Internet-Drafts are working documents of the Internet Engineering
 19     Task Force (IETF), its areas, and its working groups. Note that
 20     other groups may also distribute working documents as Internet-
 21     Drafts.

 23     Internet-Drafts are draft documents valid for a maximum of six months
 24     and may be updated, replaced, or obsoleted by other documents at any
 25     time. It is inappropriate to use Internet-Drafts as reference
 26     material or to cite them other than as "work in progress."

 28     The list of current Internet-Drafts can be accessed at
 29     http://www.ietf.org/ietf/1id-abstracts.txt

 31     The list of Internet-Draft Shadow Directories can be accessed at
 32     http://www.ietf.org/shadow.html.

 34     A revised version of this draft document will be submitted to the RFC
 35     editor as a Standard Track RFC for the Internet Community.
 36     Discussion and suggestions for improvement are requested, and should
 37     be sent to ietf-mta-filters@imc.org. Distribution of this memo is
 38     unlimited.

 40  Copyright Notice

 42     Copyright (C) The Internet Society (2005). All Rights Reserved.

 44  Abstract

 46     This document describes a language for filtering email messages at
 47     time of final delivery. It is designed to be implementable on either
 48     a mail client or mail server. It is meant to be extensible, simple,
 49     and independent of access protocol, mail architecture, and operating
 50     system. It is suitable for running on a mail server where users may
 51     not be allowed to execute arbitrary programs, such as on black box
 52     Internet Message Access Protocol (IMAP) servers, as it has no
 53     variables, loops, or ability to shell out to external programs.

 55  Table of Contents

 57     1. Introduction
 58     1.1. Conventions Used in This Document
 59     1.2. Example mail messages
 60     2. Design
 61     2.1. Form of the Language
 62     2.2. Whitespace
 63     2.3. Comments
 64     2.4. Literal Data
 65     2.4.1. Numbers
 66     2.4.2. Strings
 67     2.4.2.1. String Lists
 68     2.4.2.2. Headers
 69     2.4.2.3. Addresses
 70     2.5. Tests
 71     2.5.1. Test Lists
 72     2.6. Arguments
 73     2.6.1. Positional Arguments
 74     2.6.2. Tagged Arguments
 75     2.6.3. Optional Arguments
 76     2.6.4. Types of Arguments
 77     2.7. String Comparison
 78     2.7.1. Match Type
 79     2.7.2. Comparisons Across Character Sets
 80     2.7.3. Comparators
 81     2.7.4. Comparisons Against Addresses
 82     2.8. Blocks
 83     2.9. Commands
 84     2.10. Evaluation
 85     2.10.1. Action Interaction
 86     2.10.2. Implicit Keep
 87     2.10.3. Message Uniqueness in a Mailbox
 88     2.10.4. Limits on Numbers of Actions
 89     2.10.5. Extensions and Optional Features
 90     2.10.6. Errors
 91     2.10.7. Limits on Execution
 92     3. Control Commands
 93     3.1. Control If
 94     3.2. Control Require
 95     3.3. Control Stop
 96     4. Action Commands
 97     4.1. Action fileinto
 98     4.2. Action redirect
 99     4.3. Action keep
100     4.4. Action discard
101     5. Test Commands
102     5.1. Test address
103     5.2. Test allof
104     5.3. Test anyof
105     5.4. Test envelope
106     5.5. Test exists
107     5.6. Test false
108     5.7. Test header
109     5.8. Test not
110     5.9. Test size
111     5.10. Test true
112     6. Extensibility
113     6.1. Capability String
114     6.2. IANA Considerations
115     6.2.1. Template for Capability Registrations
116     6.2.2. Initial Capability Registrations
117     6.3. Capability Transport
118     7. Transmission
119     8. Parsing
120     8.1. Lexical Tokens
121     8.2. Grammar
122     9. Extended Example
123     10. Security Considerations
124     11. Acknowledgments
125     12. Editor's Address
126     13. Normative References
127     14. Informative References
128     14. Full Copyright Statement

130  1. Introduction

132     This memo documents a language that can be used to create filters for
133     electronic mail. It is not tied to any particular operating system
134     or mail architecture. It requires the use of [IMAIL]-compliant
135     messages, but should otherwise generalize to many systems.

137     The language is powerful enough to be useful but limited in order to
138     allow for a safe server-side filtering system. The intention is to
139     make it impossible for users to do anything more complex (and
140     dangerous) than write simple mail filters, along with facilitating
141     the use of GUIs for filter creation and manipulation. The language
142     is not Turing-complete: it provides no way to write a loop or a
143     function and variables are not provided.

145     Scripts written in Sieve are executed during final delivery, when the
146     message is moved to the user-accessible mailbox. In systems where
147     the MTA does final delivery, such as traditional Unix mail, it is
148     reasonable to sort when the MTA deposits mail into the user's
149     mailbox.

151     There are a number of reasons to use a filtering system. Mail
152     traffic for most users has been increasing due to increased usage of
153     email, the emergence of unsolicited email as a form of advertising,
154     and increased usage of mailing lists.

156     Experience at Carnegie Mellon has shown that if a filtering system is
157     made available to users, many will make use of it in order to file
158     messages from specific users or mailing lists. However, many others
159     did not make use of the Andrew system's FLAMES filtering language
160     [FLAMES] due to difficulty in setting it up.

162     Because of the expectation that users will make use of filtering if
163     it is offered and easy to use, this language has been made simple
164     enough to allow many users to make use of it, but rich enough that it
165     can be used productively. However, it is expected that GUI-based
166     editors will be the preferred way of editing filters for a large
167     number of users.

169  1.1. Conventions Used in This Document

171     In the sections of this document that discuss the requirements of
172     various keywords and operators, the following conventions have been
173     adopted.

175     The key words "MUST", "MUST NOT", "SHOULD", "SHOULD NOT", and "MAY"
176     in this document are to be interpreted as defined in [KEYWORDS].

178     Each section on a command (test, action, or control) has a line
179     labeled "Usage:". This line describes the usage of the command,
180     including its name and its arguments. Required arguments are listed
181     inside angle brackets ("<" and ">"). Optional arguments are listed
182     inside square brackets ("[" and "]"). Each argument is followed by
183     its type, so "<key: string>" represents an argument called "key" that
184     is a string. Literal strings are represented with double-quoted
185     strings. Alternatives are separated with slashes, and parentheses
186     are used for grouping, similar to [ABNF].

188     In the "Usage:" line, there are three special pieces of syntax that
189     are frequently repeated, MATCH-TYPE, COMPARATOR, and ADDRESS-PART.
190     These are discussed in sections 2.7.1, 2.7.3, and 2.7.4,
191     respectively.

193     The formal grammar for these commands is defined in section 8 and is the
194     authoritative reference on how to construct commands, but the formal
195     grammar does not specify the order, semantics, number or types of
196     arguments to commands, nor the legal command names. The intent is to
197     allow for extension without changing the grammar.

199  1.2. Example mail messages

201     The following mail messages will be used throughout this document in
202     examples.

204     Message A
205     -----------------------------------------------------------
206     Date: Tue, 1 Apr 1997 09:06:31 -0800 (PST)
207     From: coyote@desert.example.org
208     To: roadrunner@acme.example.com
209     Subject: I have a present for you

211     Look, I'm sorry about the whole anvil thing, and I really
212     didn't mean to try and drop it on you from the top of the
213     cliff. I want to try to make it up to you. I've got some
214     great birdseed over here at my place--top of the line
215     stuff--and if you come by, I'll have it all wrapped up
216     for you. I'm really sorry for all the problems I've caused
217     for you over the years, but I know we can work this out.
218     --
219     Wile E. Coyote "Super Genius" coyote@desert.example.org
220     -----------------------------------------------------------

222     Message B
223     -----------------------------------------------------------
224     From: youcouldberich!@reply-by-postal-mail.invalid
225     Sender: b1ff@de.res.example.com
226     To: rube@landru.example.edu
227     Date: Mon, 31 Mar 1997 18:26:10 -0800
228     Subject: $$$ YOU, TOO, CAN BE A MILLIONAIRE! $$$

230     YOU MAY HAVE ALREADY WON TEN MILLION DOLLARS, BUT I DOUBT
231     IT! SO JUST POST THIS TO SIX HUNDRED NEWSGROUPS! IT WILL
232     GUARANTEE THAT YOU GET AT LEAST FIVE RESPONSES WITH MONEY!
233     MONEY! MONEY! COLD HARD CASH! YOU WILL RECEIVE OVER
234     $20,000 IN LESS THAN TWO MONTHS! AND IT'S LEGAL!!!!!!!!!
235     !!!!!!!!!!!!!!!!!!111111111!!!!!!!11111111111!!1 JUST
236     SEND $5 IN SMALL, UNMARKED BILLS TO THE ADDRESSES BELOW!
237     -----------------------------------------------------------

239  2. Design
240  2.1. Form of the Language

242     The language consists of a set of commands. Each command consists of
243     a set of tokens delimited by whitespace. The command identifier is
244     the first token and it is followed by zero or more argument tokens.
245     Arguments may be literal data, tags, blocks of commands, or test
246     commands.

248     The language is represented in UTF-8, as specified in [UTF-8].

250     Tokens in the US-ASCII range are considered case-insensitive.

252  2.2. Whitespace

254     Whitespace is used to separate tokens. Whitespace is made up of
255     tabs, newlines (CRLF, never just CR or LF), and the space character.
256     The amount of whitespace used is not significant.

258  2.3. Comments

260     Two types of comments are offered. Comments are semantically
261     equivalent to whitespace and can be used anyplace that whitespace is
262     (with one exception in multi-line strings, as described in the
263     grammar).

265     Hash comments begin with a "#" character that is not contained within
266     a string and continue until the next CRLF.

268     Example:  if size :over 100K { # this is a comment
269                   discard;
270               }

272     Bracketed comments begin with the token "/*" and end with "*/"
273     outside of a string. Bracketed comments may span multiple lines.
274     Bracketed comments do not nest.

276     Example:  if size :over 100K { /* this is a comment
277                   this is still a comment */ discard /* this is a comment
278                   */ ;
279               }

281  2.4. Literal Data

283     Literal data means data that is not executed, merely evaluated "as
284     is", to be used as arguments to commands. Literal data is limited to
285     numbers and strings.

287  2.4.1. Numbers
288     Numbers are given as ordinary decimal numbers. However, those
289     numbers that have a tendency to be fairly large, such as message
290     sizes, MAY have a "K", "M", or "G" appended to indicate a multiple of
291     a power of two. To be comparable with the power-of-two-based
292     versions of SI units that computers frequently use, K specifies
293     kibi-, or 1,024 (2^10) times the value of the number; M specifies
294     mebi-, or 1,048,576 (2^20) times the value of the number; and G
295     specifies gibi-, or 1,073,741,824 (2^30) times the value of the
296     number [BINARY-SI].

298     Implementations MUST provide 31 bits of magnitude in numbers, but MAY
299     provide more.

301     Only positive integers are permitted by this specification.

303  2.4.2. Strings

305     Scripts involve large numbers of strings as they are used for pattern
306     matching, addresses, textual bodies, etc. Typically, short quoted
307     strings suffice for most uses, but a more convenient form is provided
308     for longer strings such as bodies of messages.

310     A quoted string starts and ends with a single double quote (the <">
311     character, US-ASCII 34). A backslash ("\", ASCII 92) inside of a
312     quoted string is followed by either another backslash or a double
313     quote. This two-character sequence represents a single backslash or
314     double-quote within the string, respectively.

316     Scripts SHOULD NOT escape other characters with a backslash.

318     An undefined escape sequence (such as "\a" in a context where "a" has
319     no special meaning) is interpreted as if there were no backslash (in
320     this case, "\a" is just "a").

322     Non-printing characters such as tabs, CR and LF, and control
323     characters are permitted in quoted strings. Quoted strings MAY span
324     multiple lines. NUL (US-ASCII 0) is not allowed in strings.

326     For entering larger amounts of text, such as an email message, a
327     multi-line form is allowed. It starts with the keyword "text:",
328     followed by a CRLF, and ends with the sequence of a CRLF, a single
329     period, and another CRLF. In order to allow the message to contain
330     lines with a single-dot, lines are dot-stuffed. That is, when
331     composing a message body, an extra `.' is added before each line
332     which begins with a `.'. When the server interprets the script,
333     these extra dots are removed. Note that a line that begins with a
334     dot followed by a non-dot character is not interpreted dot-stuffed;
335     that is, ".foo" is interpreted as ".foo". However, because this is
336     potentially ambiguous, scripts SHOULD be properly dot-stuffed so such
337     lines do not appear.

339     Note that a hashed comment or whitespace may occur in between the
340     "text:" and the CRLF, but not within the string itself. Bracketed
341     comments are not allowed here.

343  2.4.2.1. String Lists

345     When matching patterns, it is frequently convenient to match against
346     groups of strings instead of single strings. For this reason, a list
347     of strings is allowed in many tests, implying that if the test is
348     true using any one of the strings, then the test is true.
349     Implementations are encouraged to use short-circuit evaluation in
350     these cases.

352     For instance, the test `header :contains ["To", "Cc"]
353     ["me@example.com", "me00@landru.example.edu"]' is true if either the
354     To header or Cc header of the input message contains either of the
355     email addresses "me@example.com" or "me00@landru.example.edu".

357     Conversely, in any case where a list of strings is appropriate, a
358     single string is allowed without being a member of a list: it is
359     equivalent to a list with a single member. This means that the test
360     `exists "To"' is equivalent to the test `exists ["To"]'.

362  2.4.2.2. Headers

364     Headers are a subset of strings. In the Internet Message
365     Specification [IMAIL], each header line is allowed to have whitespace
366     nearly anywhere in the line, including after the field name and
367     before the subsequent colon. Extra spaces between the header name
368     and the ":" in a header field are ignored.

370     A header name never contains a colon. The "From" header refers to a
371     line beginning "From:" (or "From :", etc.). No header will match
372     the string "From:" due to the trailing colon.

374     Similarly, syntactically invalid header names cause the same result as
375     syntactically valid header names that are not present in the message.
376     In particular, an implementation MUST NOT cause an error for
377     syntactically invalid header names in tests.

379     Header lines are unfolded as described in [IMAIL] section 2.2.3.
380     Interpretation of header data SHOULD be done according to [MIME3]
381     section 6.2 (see 2.7.2 below for details).

383  2.4.2.3. Addresses
384     A number of commands call for email addresses, which are also a
385     subset of strings. When these addresses are used in outbound
386     contexts, addresses must be compliant with [IMAIL], but are further
387     constrained. Using the symbols defined in [IMAIL], section 3, the
388     syntax of an address is:

390     sieve-address = addr-spec                  ; simple address
391                   / phrase "<" addr-spec ">"   ; name & addr-spec

393     That is, routes and group syntax are not permitted. If multiple
394     addresses are required, use a string list. Named groups are not used
395     here.

397     Implementations MUST ensure that the addresses are syntactically
398     valid, but need not ensure that they actually identify an email
399     recipient.

401  2.5. Tests

403     Tests are given as arguments to commands in order to control their
404     actions. In this document, tests are given to if/elsif/else to
405     decide which block of code is run.

407  2.5.1. Test Lists

409     Some tests ("allof" and "anyof", which implement logical "and" and
410     logical "or", respectively) may require more than a single test as an
411     argument. The test-list syntax element provides a way of grouping
412     tests.

414     Example:  if anyof (not exists ["From", "Date"],
415                       header :contains "from" "fool@example.edu") {
416                   discard;
417               }

419  2.6. Arguments

421     In order to specify what to do, most commands take arguments. There
422     are three types of arguments: positional, tagged, and optional.

424  2.6.1. Positional Arguments

426     Positional arguments are given to a command which discerns their
427     meaning based on their order. When a command takes positional
428     arguments, all positional arguments must be supplied and must be in
429     the order prescribed.

431  2.6.2. Tagged Arguments
432     This document provides for tagged arguments in the style of
433     CommonLISP. These are also similar to flags given to commands in
434     most command-line systems.

436     A tagged argument is an argument for a command that begins with ":"
437     followed by a tag naming the argument, such as ":contains". This
438     argument means that zero or more of the next tokens have some
439     particular meaning depending on the argument. These next tokens may
440     be numbers or strings but they are never blocks.

442     Tagged arguments are similar to positional arguments, except that
443     instead of the meaning being derived from the command, it is derived
444     from the tag.

446     Tagged arguments must appear before positional arguments, but they
447     may appear in any order with other tagged arguments. For simplicity
448     of the specification, this is not expressed in the syntax definitions
449     with commands, but they still may be reordered arbitrarily provided
450     they appear before positional arguments. Tagged arguments may be
451     mixed with optional arguments.

453     To simplify this specification, tagged arguments SHOULD NOT take
454     tagged arguments as arguments.

456  2.6.3. Optional Arguments

458     Optional arguments are exactly like tagged arguments except that they
459     may be left out, in which case a default value is implied. Because
460     optional arguments tend to result in shorter scripts, they have been
461     used far more than tagged arguments.

463     One particularly noteworthy case is the ":comparator" argument, which
464     allows the user to specify which comparator [COLLATION] will be used
465     to compare two strings, since different languages may impose
466     different orderings on UTF-8 [UTF-8] characters.

468  2.6.4. Types of Arguments

470     Abstractly, arguments may be literal data, tests, or blocks of
471     commands. In this way, an "if" control structure is merely a command
472     that happens to take a test and a block as arguments and may execute
473     the block of code.

475     However, this abstraction is ambiguous from a parsing standpoint.
476 The grammar in section 9.2 presents a parsable version of this: 477 Arguments are string-lists, numbers, and tags, which may be followed 478 by a test or a test-list, which may be followed by a block of 479 commands. No more than one test or test list, nor more than one 480 block of commands, may be used, and commands that end with a block of 481 commands do not end with semicolons. 483 2.7. String Comparison 485 When matching one string against another, there are a number of ways 486 of performing the match operation. These are accomplished with three 487 types of matches: an exact match, a substring match, and a wildcard 488 glob-style match. These are described below. 490 In order to provide for matches between character sets and case 491 insensitivity, Sieve uses the comparators defined in the Internet 492 Application Protocol Collation Registry [COLLATION]. 494 However, when a string represents the name of a header, the 495 comparator is never user-specified. Header comparisons are always 496 done with the "en;ascii-casemap" operator, i.e., case-insensitive 497 comparisons, because this is the way things are defined in the 498 message specification [IMAIL]. 500 2.7.1. Match Type 502 There are three match types describing the matching used in this 503 specification: ":is", ":contains", and ":matches". Match type 504 arguments are supplied to those commands which allow them to specify 505 what kind of match is to be performed. 507 These are used as tagged arguments to tests that perform string 508 comparison. 510 The ":contains" match type describes a substring match. If the value 511 argument contains the key argument as a substring, the match is true. 512 For instance, the string "frobnitzm" contains "frob" and "nit", but 513 not "fbm". The empty key ("") is contained in all values. 515 The ":is" match type describes an absolute match; if the contents of 516 the first string are absolutely the same as the contents of the 517 second string, they match. 
Only the string "frobnitzm" is the string 518 "frobnitzm". The empty key ":is" and only ":is" the empty value. 520 The ":matches" match type specifies a wildcard match using the 521 characters "*" and "?"; the entire value must be matched. "*" 522 matches zero or more characters, and "?" matches a single character, 523 using the definition of character appropriate for the comparator in 524 use. That is, "?" will match exactly one octet when the "i;octet" or 525 "en;ascii-casemap" comparators are used, but will match the one or 526 more octets that compose a character in UTF-8 when the 527 "i;basic;uca=3.1.1;uv=3.2" comparator is used. "?" and "*" may be 528 escaped as "\\?" and "\\*" in strings to match against themselves. 529 The first backslash escapes the second backslash; together, they 530 escape the "*". This is awkward, but it is commonplace in several 531 programming languages that use globs and regular expressions. 533 In order to specify what type of match is supposed to happen, 534 commands that support matching take optional tagged arguments 535 ":matches", ":is", and ":contains". Commands default to using ":is" 536 matching if no match type argument is supplied. Note that these 537 modifiers interact with comparators; in particular, only comparators 538 that supoprt the "substring match" operation are suitable for 539 matching with ":contains" or ":matches". It is an error to use a 540 comparator with ":contains" or ":matches" that is not compatible with 541 it. 543 It is an error to give more than one of these arguments to a given 544 command. 546 For convenience, the "MATCH-TYPE" syntax element is defined here as 547 follows: 549 Syntax: ":is" / ":contains" / ":matches" 551 2.7.2. Comparisons Across Character Sets 553 All Sieve scripts are represented in UTF-8, but messages may involve 554 a number of character sets. 
In order for comparisons to work across 555 character sets, implementations SHOULD implement the following 556 behavior: 558 Comparisons are performed on octets. Implementations convert text 559 from header fields in all charsets [MIME3] to Unicode, encoded as 560 UTF-8, as input to the comparator (see 2.7.3). Implementations 561 MUST be capable of converting US-ASCII, ISO-8859-1, the US-ASCII 562 subset of ISO-8859-* character sets, and UTF-8. Text that the 563 implementation cannot convert to Unicode for any reason MAY be 564 treated as plain US-ASCII (including any [MIME3] syntax) or 565 processed according to local conventions. An encoded NUL octet 566 (character zero) SHOULD NOT cause early termination of the header 567 content being compared against. 569 If implementations fail to support the above behavior, they MUST 570 conform to the following: 572 No two strings can be considered equal if one contains octets 573 greater than 127. 575 2.7.3. Comparators 576 In order to allow for language-independent, case-independent matches, 577 the match type may be coupled with a comparator name. The Internet 578 Application Protocol Collation Registry [COLLATION] provides the 579 framework for describing and naming comparators as used by this 580 specification. 582 All implementations MUST support the "i;octet" comparator (simply 583 compares octets), the "en;ascii-casemap" comparator (which treats 584 uppercase and lowercase characters in the US-ASCII subset of UTF-8 as 585 the same), as well as the "i;ascii-casemap" comparator, which is a 586 deprecated synonym for "en;ascii-casemap". If left unspecified, the 587 default is "en;ascii-casemap". 589 Some comparators may not be usable with substring matches; that is, 590 they may only work with ":is". It is an error to try and use a 591 comparator with ":matches" or ":contains" that is not compatible with 592 it. 594 A comparator is specified by the ":comparator" option with commands 595 that support matching. 
This option is followed by a string providing 596 the name of the comparator to be used. For convenience, the syntax 597 of a comparator is abbreviated to "COMPARATOR", and (repeated in 598 several tests) is as follows: 600 Syntax: ":comparator" 602 So in this example, 604 Example: if header :contains :comparator "i;octet" "Subject" 605 "MAKE MONEY FAST" { 606 discard; 607 } 609 would discard any message with subjects like "You can MAKE MONEY 610 FAST", but not "You can Make Money Fast", since the comparator used 611 is case-sensitive. 613 Comparators other than "i;octet", "en;ascii-casemap", and "i;ascii- 614 casemap" must be declared with require, as they are extensions. If a 615 comparator declared with require is not known, it is an error, and 616 execution fails. If the comparator is not declared with require, it 617 is also an error, even if the comparator is supported. (See 2.10.5.) 619 Both ":matches" and ":contains" match types are compatible with the 620 "i;octet" and "en;ascii-casemap" comparators and may be used with 621 them. 623 It is an error to give more than one of these arguments to a given 624 command. 626 2.7.4. Comparisons Against Addresses 628 Addresses are one of the most frequent things represented as strings. 629 These are structured, and being able to compare against the local- 630 part or the domain of an address is useful, so some tests that act 631 exclusively on addresses take an additional optional argument that 632 specifies what the test acts on. 634 These optional arguments are ":localpart", ":domain", and ":all", 635 which act on the local-part (left-side), the domain part (right- 636 side), and the whole address. 638 The kind of comparison done, such as whether or not the test done is 639 case-insensitive, is specified as a comparator argument to the test. 641 If an optional address-part is omitted, the default is ":all". 643 It is an error to give more than one of these arguments to a given 644 command. 
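As an informal sketch of how the address-part arguments select what is compared, assume a message whose only From header is "From: Tim <tim@example.com>" (an illustrative header chosen for this sketch) and the default comparator and match type:

```sieve
# Assumed header for this sketch:  From: Tim <tim@example.com>
if address :is "from" "tim" {
    # Not reached: ":all" is the default address-part, and the whole
    # address "tim@example.com" is not equal to "tim".
    discard;
} elsif address :localpart :is "from" "tim" {
    # Reached: only the local-part, "tim", is compared.
    keep;
} elsif address :domain :is "from" "example.com" {
    # This test would also be true, but an earlier branch has
    # already been taken.
    keep;
}
```

The phrase "Tim" in the header plays no part in any of these tests; only the address itself is examined.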
646 For convenience, the "ADDRESS-PART" syntax element is defined here as 647 follows: 649 Syntax: ":localpart" / ":domain" / ":all" 651 2.8. Blocks 653 Blocks are sets of commands enclosed within curly braces and supplied 654 as the final argument to a command. Such a command is a control 655 structure: when executed it has control over the number of times the 656 commands in the block are executed. 658 With the commands supplied in this memo, there are no loops. The 659 control structures supplied--if, elsif, and else--run a block either 660 once or not at all. 662 2.9. Commands 664 Sieve scripts are sequences of commands. Commands can take any of 665 the tokens above as arguments, and arguments may be either tagged or 666 positional arguments. Not all commands take all arguments. 668 There are three kinds of commands: test commands, action commands, 669 and control commands. 671 The simplest is an action command. An action command is an 672 identifier followed by zero or more arguments, terminated by a 673 semicolon. Action commands do not take tests or blocks as arguments. 675 A control command is a command that affects the parsing or the flow 676 of execution of the Sieve script in some way. A control structure is 677 a control command which ends with a block instead of a semicolon. 679 A test command is used as part of a control command. It is used to 680 specify whether or not the block of code given to the control command 681 is executed. 683 2.10. Evaluation 685 2.10.1. Action Interaction 687 Some actions cannot be used with other actions because the result 688 would be absurd. These restrictions are noted throughout this memo. 690 Extension actions MUST state how they interact with actions defined 691 in this specification. 693 2.10.2. Implicit Keep 695 Previous experience with filtering systems suggests that cases tend 696 to be missed in scripts. To prevent errors, Sieve has an "implicit 697 keep".
699 An implicit keep is a keep action (see 4.3) performed in the absence of 700 any action that cancels the implicit keep. 702 An implicit keep is performed if a message is not written to a 703 mailbox, redirected to a new address, or explicitly thrown out. That 704 is, if a fileinto, a keep, a redirect, or a discard is performed, an 705 implicit keep is not. 707 Some actions may be defined to not cancel the implicit keep. These 708 actions may not directly affect the delivery of a message, and are 709 used for their side effects. None of the actions specified in this 710 document meets that criterion, but extension actions will. 712 For instance, with any of the short messages offered above, the 713 following script produces no actions. 715 Example: if size :over 500K { discard; } 717 As a result, the implicit keep is taken. 719 2.10.3. Message Uniqueness in a Mailbox 720 Implementations SHOULD NOT deliver a message to the same folder more 721 than once, even if a script explicitly asks for a message to be 722 written to a mailbox twice. 724 The test for equality of two messages is implementation-defined. 726 If a script asks for a message to be written to a mailbox twice, it 727 MUST NOT be treated as an error. 729 2.10.4. Limits on Numbers of Actions 731 Site policy MAY limit numbers of actions taken and MAY impose 732 restrictions on which actions can be used together. In the event 733 that a script hits a policy limit on the number of actions taken for 734 a particular message, an error occurs. 736 Implementations MUST allow at least one keep or one fileinto. If 737 fileinto is not implemented, implementations MUST allow at least one 738 keep. 740 2.10.5. Extensions and Optional Features 742 Because of the differing capabilities of many mail systems, several 743 features of this specification are optional. Before any of these 744 extensions can be executed, they must be declared with the "require" 745 action.
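A minimal sketch of such a declaration, assuming an implementation that supports the optional "fileinto" action; the header key and the folder name "lists" are placeholders chosen for illustration:

```sieve
# The extension is declared before anything else in the script.
require "fileinto";

if header :contains "Subject" "sieve" {
    fileinto "lists";    # "lists" is a placeholder folder name
}
```

Without the first line, the same script would be in error even where "fileinto" is available, because an extension that has not been declared is handled as though it were unsupported.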
747 If an extension is not enabled with "require", implementations MUST 748 treat it as if they did not support it at all. 750 If an implementation does not understand an extension declared with require, 751 the script must not be used at all. Implementations MUST NOT execute 752 scripts which require unknown capability names. 754 Note: The reason for this restriction is that prior experiences with 755 languages such as LISP and Tcl suggest that this is a workable 756 way of noting that a given script uses an extension. 758 Experience with PostScript suggests that mechanisms that allow 759 a script to work around missing extensions are not used in 760 practice. 762 Extensions which define actions MUST state how they interact with 763 actions discussed in the base specification. 765 2.10.6. Errors 767 In any programming language, there are compile-time and run-time 768 errors. 770 Compile-time errors are ones in syntax that are detectable if a 771 syntax check is done. 773 Run-time errors are not detectable until the script is run. This 774 includes transient failures like disk full conditions, but also 775 includes issues like invalid combinations of actions. 777 When an error occurs in a Sieve script, all processing stops. 779 Implementations MAY choose to do a full parse, then evaluate the 780 script, then do all actions. Implementations might even go so far as 781 to ensure that execution is atomic (either all actions are executed 782 or none are executed). 784 Other implementations may choose to parse and run at the same time. 785 Such implementations are simpler, but have issues with partial 786 failure (some actions happen, others don't). 788 Implementations might even go so far as to ensure that scripts can 789 never execute an invalid set of actions before execution, although 790 this could involve solving the Halting Problem. 792 This specification allows any of these approaches. Solving the 793 Halting Problem is considered extra credit.
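As a rough illustration of the two kinds of error (the size and the address are placeholders), the first fragment below cannot pass a syntax check, whereas the second is syntactically valid and can fail only while it is being run:

```sieve
# Compile-time error: the semicolon that must terminate the
# "discard" action is missing.
if size :over 100K { discard }
```

```sieve
# Syntactically valid.  It may still fail at run time, for example
# if the implementation detects a mail loop while redirecting
# (see 4.2) or a site limit on actions is exceeded (see 2.10.4).
redirect "bart@example.edu";
```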
795 Implementations MUST perform syntactic, semantic, and run-time checks 796 on code that is actually executed. Implementations MAY perform those 797 checks or any part of them on code that is not reached during 798 execution. 800 When an error happens, implementations MUST notify the user that an 801 error occurred, which actions (if any) were taken, and do an implicit 802 keep. 804 2.10.7. Limits on Execution 806 Implementations may limit certain constructs. However, this 807 specification places a lower bound on some of these limits. 809 Implementations MUST support fifteen levels of nested blocks. 811 Implementations MUST support fifteen levels of nested test lists. 813 3. Control Commands 815 Control structures are needed to allow for multiple and conditional 816 actions. 818 3.1. Control If 820 There are three pieces to if: "if", "elsif", and "else". Each is 821 actually a separate command in terms of the grammar. However, an 822 elsif or else MUST only follow an if or elsif. An error occurs if 823 these conditions are not met. 825 Usage: if 827 Usage: elsif 829 Usage: else 831 The semantics are similar to those of any of the many other 832 programming languages these control structures appear in. When the 833 interpreter sees an "if", it evaluates the test associated with it. 834 If the test is true, it executes the block associated with it. 836 If the test of the "if" is false, it evaluates the test of the first 837 "elsif" (if any). If the test of "elsif" is true, it runs the 838 elsif's block. An elsif may be followed by an elsif, in which case, 839 the interpreter repeats this process until it runs out of elsifs. 841 When the interpreter runs out of elsifs, there may be an "else" case. 842 If there is, and none of the if or elsif tests were true, the 843 interpreter runs the else case. 845 This provides a way of performing exactly one of the blocks in the 846 chain. 848 In the following example, both Message A and B are dropped. 
850 Example: require "fileinto"; 851 if header :contains "from" "coyote" { 852 discard; 853 } elsif header :contains ["subject"] ["$$$"] { 854 discard; 855 } else { 856 fileinto "INBOX"; 857 } 859 When the script below is run over message A, it redirects the message 860 to acm@example.edu; message B, to postmaster@example.edu; any other 861 message is redirected to field@example.edu. 863 Example: if header :contains ["From"] ["coyote"] { 864 redirect "acm@example.edu"; 865 } elsif header :contains "Subject" "$$$" { 866 redirect "postmaster@example.edu"; 867 } else { 868 redirect "field@example.edu"; 869 } 871 Note that this definition prohibits the "... else if ..." sequence 872 used by C. This is intentional, because this construct produces a 873 shift-reduce conflict. 875 3.2. Control Require 877 Usage: require 879 The require action notes that a script makes use of a certain 880 extension. Such a declaration is required to use the extension, as 881 discussed in section 2.10.5. Multiple capabilities can be declared 882 with a single require. 884 The require command, if present, MUST be used before anything other 885 than a require can be used. An error occurs if a require appears 886 after a command other than require. 888 Example: require ["fileinto", "reject"]; 890 Example: require "fileinto"; 891 require "vacation"; 893 3.3. Control Stop 895 Usage: stop 897 The "stop" action ends all processing. If no actions have been 898 executed, then the keep action is taken. 900 4. Action Commands 902 This document supplies four actions that may be taken on a message: 903 keep, fileinto, redirect, and discard. 905 Implementations MUST support the "keep", "discard", and "redirect" 906 actions. 908 Implementations SHOULD support "fileinto". 910 Implementations MAY limit the number of certain actions taken (see 911 section 2.10.4). 913 4.1. Action fileinto 915 Usage: fileinto 917 The "fileinto" action delivers the message into the specified folder. 
918 Implementations SHOULD support fileinto, but in some environments 919 this may be impossible. 921 The capability string for use with the require command is "fileinto". 923 In the following script, message A is filed into folder 924 "INBOX.harassment". 926 Example: require "fileinto"; 927 if header :contains ["from"] "coyote" { 928 fileinto "INBOX.harassment"; 929 } 931 4.2. Action redirect 933 Usage: redirect 935 The "redirect" action is used to send the message to another user at 936 a supplied address, as a mail forwarding feature does. The 937 "redirect" action makes no changes to the message body or existing 938 headers, but it may add new headers. The "redirect" modifies the 939 envelope recipient. 941 The redirect command performs an MTA-style "forward"--that is, what 942 you get from a .forward file using sendmail under UNIX. The address 943 on the [SMTP] envelope is replaced with the one on the redirect 944 command and the message is sent back out. (This is not an MUA-style 945 forward, which creates a new message with a different sender and 946 message ID, wrapping the old message in a new one.) 948 A simple script can be used for redirecting all mail: 950 Example: redirect "bart@example.edu"; 952 Implementations SHOULD take measures to implement loop control, 953 possibly including adding headers to the message or counting received 954 headers. If an implementation detects a loop, it causes an error. 956 4.3. Action keep 958 Usage: keep 959 The "keep" action is whatever action is taken in lieu of all other 960 actions, if no filtering happens at all; generally, this simply means 961 to file the message into the user's main mailbox. This command 962 provides a way to execute this action without needing to know the 963 name of the user's main mailbox, providing a way to call it without 964 needing to understand the user's setup, or the underlying mail 965 system. 
967 For instance, in an implementation where the Internet Message Access 968 Protocol (IMAP) server is running scripts on behalf of the user at 969 time of delivery, a keep command is equivalent to a fileinto "INBOX". 971 Example: if size :under 1M { keep; } else { discard; } 973 Note that the above script is identical to the one below. 975 Example: if not size :under 1M { discard; } 977 4.4. Action discard 979 Usage: discard 981 Discard is used to silently throw away the message. It does so by 982 simply canceling the implicit keep. If discard is used with other 983 actions, the other actions still happen. Discard is compatible with 984 all other actions. (For instance fileinto+discard is equivalent to 985 fileinto.) 987 Discard MUST be silent; that is, it MUST NOT return a non-delivery 988 notification of any kind ([DSN], [MDN], or otherwise). 990 In the following script, any mail from "idiot@example.edu" is thrown 991 out. 993 Example: if header :contains ["from"] ["idiot@example.edu"] { 994 discard; 995 } 997 While an important part of this language, "discard" has the potential 998 to create serious problems for users: Students who leave themselves 999 logged in to an unattended machine in a public computer lab may find 1000 their script changed to just "discard". In order to protect users in 1001 this situation (along with similar situations), implementations MAY 1002 keep messages destroyed by a script for an indefinite period, and MAY 1003 disallow scripts that throw out all mail. 1005 5. Test Commands 1006 Tests are used in conditionals to decide which part(s) of the 1007 conditional to execute. 1009 Implementations MUST support these tests: "address", "allof", 1010 "anyof", "exists", "false", "header", "not", "size", and "true". 1012 Implementations SHOULD support the "envelope" test. 1014 5.1. 
Test address 1016 Usage: address [COMPARATOR] [ADDRESS-PART] [MATCH-TYPE] 1017 1019 The address test matches Internet addresses in structured headers 1020 that contain addresses. It returns true if any header contains any 1021 key in the specified part of the address, as modified by the 1022 comparator and the match keyword. Whether there are other addresses 1023 present in the header doesn't affect this test; this test does not 1024 provide any way to determine whether an address is the only address 1025 in a header. 1027 Like envelope and header, this test returns true if any combination 1028 of the header-list and key-list arguments match and false otherwise. 1030 Internet email addresses [IMAIL] have the somewhat awkward 1031 characteristic that the local-part to the left of the at-sign is 1032 considered case sensitive, and the domain-part to the right of the 1033 at-sign is case insensitive. The "address" command does not deal 1034 with this itself, but provides the ADDRESS-PART argument for allowing 1035 users to deal with it. 1037 The address primitive never acts on the phrase part of an email 1038 address, nor on comments within that address. It also never acts on 1039 group names, although it does act on the addresses within the group 1040 construct. 1042 Implementations MUST restrict the address test to headers that 1043 contain addresses, but MUST include at least From, To, Cc, Bcc, 1044 Sender, Resent-From, Resent-To, and SHOULD include any other header 1045 that utilizes an "address-list" structured header body. 1047 Example: if address :is :all "from" "tim@example.com" { 1048 discard; 1049 } 1051 5.2. Test allof 1053 Usage: allof 1054 The allof test performs a logical AND on the tests supplied to it. 1056 Example: allof (false, false) => false 1057 allof (false, true) => false 1058 allof (true, true) => true 1060 The allof test takes as its argument a test-list. 1062 5.3. 
Test anyof 1064 Usage: anyof 1066 The anyof test performs a logical OR on the tests supplied to it. 1068 Example: anyof (false, false) => false 1069 anyof (false, true) => true 1070 anyof (true, true) => true 1072 5.4. Test envelope 1074 Usage: envelope [COMPARATOR] [ADDRESS-PART] [MATCH-TYPE] 1075 1077 The "envelope" test is true if the specified part of the SMTP (or 1078 equivalent) envelope matches the specified key. This specification 1079 defines the interpretation of the (case insensitive) "from" and "to" 1080 envelope-parts. Additional envelope-parts may be defined by other 1081 extensions; implementations SHOULD consider unknown envelope parts an 1082 error. 1084 If one of the envelope-part strings is (case insensitive) "from", 1085 then matching occurs against the FROM address used in the SMTP MAIL 1086 command. 1088 If one of the envelope-part strings is (case insensitive) "to", then 1089 matching occurs against the TO address used in the SMTP RCPT command 1090 that resulted in this message getting delivered to this user. Note 1091 that only the most recent TO is available, and only the one relevant 1092 to this user. 1094 The envelope-part is a string list and may contain more than one 1095 parameter, in which case all of the strings specified in the key-list 1096 are matched against all parts given in the envelope-part list. 1098 Like address and header, this test returns true if any combination of 1099 the envelope-part list and key-list arguments match and false 1100 otherwise. 1102 All tests against envelopes MUST drop source routes. 1104 If the SMTP transaction involved several RCPT commands, only the data 1105 from the RCPT command that caused delivery to this user is available 1106 in the "to" part of the envelope. 1108 If a protocol other than SMTP is used for message transport, 1109 implementations are expected to adapt this command appropriately. 1111 The envelope command is optional. 
Implementations SHOULD support it, 1112 but the necessary information may not be available in all cases. 1114 Example: require "envelope"; 1115 if envelope :all :is "from" "tim@example.com" { 1116 discard; 1117 } 1119 5.5. Test exists 1121 Usage: exists 1123 The "exists" test is true if the headers listed in the header-names 1124 argument exist within the message. All of the headers must exist or 1125 the test is false. 1127 The following example throws out mail that doesn't have a From header 1128 and a Date header. 1130 Example: if not exists ["From","Date"] { 1131 discard; 1132 } 1134 5.6. Test false 1136 Usage: false 1138 The "false" test always evaluates to false. 1140 5.7. Test header 1142 Usage: header [COMPARATOR] [MATCH-TYPE] 1143 1145 The "header" test evaluates to true if the value of any of the named 1146 headers, ignoring leading and trailing whitespace, matches any key. 1147 The type of match is specified by the optional match argument, which 1148 defaults to ":is" if not specified, as specified in section 2.6. 1150 Like address and envelope, this test returns true if any combination 1151 of the header-names list and key-list arguments match and false 1152 otherwise. 1154 If a header listed in the header-names argument exists, it contains 1155 the empty key (""). However, if the named header is not present, it 1156 does not match any key, including the empty key. So if a message 1157 contained the header 1159 X-Caffeine: C8H10N4O2 1161 these tests on that header evaluate as follows: 1163 header :is ["X-Caffeine"] [""] => false 1164 header :contains ["X-Caffeine"] [""] => true 1166 The preferred way to test whether a given header is either empty or 1167 absent is to combine an "exists" test and a "header" test: 1169 anyof (header :is "Cc" "", not exists "Cc") 1171 5.8. Test not 1173 Usage: not 1175 The "not" test takes some other test as an argument, and yields the 1176 opposite result.
"not false" evaluates to "true" and "not true" 1177 evaluates to "false". 1179 5.9. Test size 1181 Usage: size <":over" / ":under"> 1183 The "size" test deals with the size of a message. It takes either a 1184 tagged argument of ":over" or ":under", followed by a number 1185 representing the size of the message. 1187 If the argument is ":over", and the size of the message is greater 1188 than the number provided, the test is true; otherwise, it is false. 1190 If the argument is ":under", and the size of the message is less than 1191 the number provided, the test is true; otherwise, it is false. 1193 Exactly one of ":over" or ":under" must be specified, and anything 1194 else is an error. 1196 The size of a message is defined to be the number of octets from the 1197 initial header until the last character in the message body. 1199 Note that for a message that is exactly 4,000 octets, the message is 1200 neither ":over" 4000 octets nor ":under" 4000 octets. 1202 5.10. Test true 1204 Usage: true 1206 The "true" test always evaluates to true. 1208 6. Extensibility 1210 New control commands, actions, and tests can be added to the 1211 language. Sites must make these features known to their users; this 1212 document does not define a way to discover the list of extensions 1213 supported by the server. 1215 Any extensions to this language MUST define a capability string that 1216 uniquely identifies that extension. Capability strings are case- 1217 sensitive; for example, "foo" and "FOO" are different capabilities. 1218 If a new version of an extension changes the functionality of a 1219 previously defined extension, it MUST use a different name. 1221 In a situation where there is a submission protocol and an extension 1222 advertisement mechanism aware of the details of this language, 1223 scripts submitted can be checked against the mail server to prevent 1224 use of an extension that the server does not support.
1226 Extensions MUST state how they interact with constraints defined in 1227 section 2.10, e.g., whether they cancel the implicit keep, and which 1228 actions they are compatible and incompatible with. 1230 6.1. Capability String 1232 Capability strings are typically short strings describing what 1233 capabilities are supported by the server. 1235 Capability strings beginning with "vnd." represent vendor-defined 1236 extensions. Such extensions are not defined by Internet standards or 1237 RFCs, but are still registered with IANA in order to prevent 1238 conflicts. Extensions starting with "vnd." SHOULD be followed by the 1239 name of the vendor and product, such as "vnd.acme.rocket-sled". 1241 The following capability strings are defined by this document: 1243 envelope The string "envelope" indicates that the implementation 1244 supports the "envelope" command. 1246 fileinto The string "fileinto" indicates that the implementation 1247 supports the "fileinto" command. 1249 comparator- The string "comparator-elbonia" is provided if the 1250 implementation supports the "elbonia" comparator. 1251 Therefore, all implementations have at least the 1252 "comparator-i;octet", "comparator-en;ascii-casemap", 1253 and "comparator-i;ascii-casemap" capabilities. However, 1254 these comparators may be used without being declared 1255 with require. 1257 6.2. IANA Considerations 1259 In order to provide a standard set of extensions, a registry is 1260 provided by IANA. Capability names may be registered on a first- 1261 come, first-served basis. Extensions designed for interoperable use 1262 SHOULD be defined as standards track or IESG approved experimental 1263 RFCs. 1265 6.2.1. Template for Capability Registrations 1267 The following template is to be used for registering new Sieve 1268 extensions with IANA. 
1270   To: iana@iana.org
1271   Subject: Registration of new Sieve extension

1273   Capability name:
1274   Capability keyword:
1275   Capability arguments:
1276   Standards Track/IESG-approved experimental RFC number:
1277   Person and email address to contact for further information:

1279   6.2.2. Initial Capability Registrations

1281   This RFC updates the following entries in the IANA registry for
1282   Sieve extensions.

1284   Capability name: fileinto
1285   Capability keyword: fileinto
1286   Capability arguments: fileinto
1287   Standards Track/IESG-approved experimental RFC number:
1288           This RFC (Sieve base spec)
1289   Person and email address to contact for further information:
1290           The Sieve discussion list

1292   Capability name: envelope
1293   Capability keyword: envelope
1294   Capability arguments:

1296           envelope [COMPARATOR] [ADDRESS-PART] [MATCH-TYPE]
1297
1298   Standards Track/IESG-approved experimental RFC number:
1299           This RFC (Sieve base spec)
1300   Person and email address to contact for further information:
1301           The Sieve discussion list

1303   Capability name: comparator-*
1304   Capability keyword:
1305           comparator-* (anything starting with "comparator-")
1306   Capability arguments: (none)
1307   Standards Track/IESG-approved experimental RFC number:
1308           This RFC, Sieve, by reference to [COLLATION]
1309   Person and email address to contact for further information:
1310           The Sieve discussion list

1312   6.3. Capability Transport

1314   As the range of mail systems that this document is intended to apply
1315   to is quite varied, a method of advertising which capabilities an
1316   implementation supports is difficult due to the wide range of
1317   possible implementations. Such a mechanism, however, should have the
1318   property that the implementation can advertise the complete set of
1319   extensions that it supports.

1321   7. Transmission

1323   The MIME type for a Sieve script is "application/sieve".
1325 The registration of this type for RFC 2048 requirements is updated as 1326 follows: 1328 Subject: Registration of MIME media type application/sieve 1330 MIME media type name: application 1331 MIME subtype name: sieve 1332 Required parameters: none 1333 Optional parameters: none 1334 Encoding considerations: Most sieve scripts will be textual, 1335 written in UTF-8. When non-7bit characters are used, 1336 quoted-printable is appropriate for transport systems 1337 that require 7bit encoding. 1339 Security considerations: Discussed in section 10 of this RFC. 1340 Interoperability considerations: Discussed in section 2.10.5 1341 of this RFC. 1342 Published specification: this RFC. 1343 Applications which use this media type: sieve-enabled mail servers 1344 Additional information: 1345 Magic number(s): 1346 File extension(s): .siv 1347 Macintosh File Type Code(s): 1348 Person & email address to contact for further information: 1349 See the discussion list at ietf-mta-filters@imc.org. 1350 Intended usage: 1351 COMMON 1352 Author/Change controller: 1353 See Editor information in this RFC. 1355 8. Parsing 1357 The Sieve grammar is separated into tokens and a separate grammar as 1358 most programming languages are. 1360 8.1. Lexical Tokens 1362 Sieve scripts are encoded in UTF-8. The following assumes a valid 1363 UTF-8 encoding; special characters in Sieve scripts are all US-ASCII. 1365 The following are tokens in Sieve: 1367 - identifiers 1368 - tags 1369 - numbers 1370 - quoted strings 1371 - multi-line strings 1372 - other separators 1374 Blanks, horizontal tabs, CRLFs, and comments ("white space") are 1375 ignored except as they separate tokens. Some white space is required 1376 to separate otherwise adjacent tokens and in specific places in the 1377 multi-line strings. CR and LF can only appear in CRLF pairs. 1379 The other separators are single individual characters, and are 1380 mentioned explicitly in the grammar. 
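The fragment below is an informal sketch that shows the token classes listed above in context, with the exception of multi-line strings; the header name, key, and size are placeholders chosen for illustration.

```sieve
/* A bracket comment, which may span
   several lines. */
# A hash comment, which ends at the CRLF.
if allof (header :contains "Subject" "say \"hello\"",
          size :over 100K)
{
    discard;
}
```

Here "if", "allof", "header", "size", and "discard" are identifiers; ":contains" and ":over" are tags; "100K" is a number with a quantifier; the two quoted strings include one that uses backslash escapes for embedded double-quotes; and the parentheses, comma, braces, and semicolon are the other separators.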
1382   The lexical structure of sieve is defined in the following grammar
1383   (as described in [ABNF]):

1385      bracket-comment = "/*" *not-star 1*STAR
1386                        *(not-star-slash *not-star 1*STAR) "/"
1387                          ; No */ allowed inside a comment.
1388                          ; (No * is allowed unless it is the last
1389                          ; character, or unless it is followed by a
1390                          ; character that isn't a slash.)

1392      STAR = "*"

1394      not-star = CRLF / %x01-09 / %x0b-0c / %x0e-29 / %x2b-7f /
1395                 UTF8-2 / UTF8-3 / UTF8-4
1396                   ; either a CRLF pair, OR a single UTF-8
1397                   ; character other than NUL, CR, LF, or star

1399      not-star-slash = CRLF / %x01-09 / %x0b-0c / %x0e-29 / %x2b-2e /
1400                       %x30-7f / UTF8-2 / UTF8-3 / UTF8-4
1401                         ; either a CRLF pair, OR a single UTF-8
1402                         ; character other than NUL, CR, LF, star,
1403                         ; or slash

1405      UTF8-NOT-CRLF = %x01-09 / %x0b-0c / %x0e-7f /
1406                      UTF8-2 / UTF8-3 / UTF8-4
1407                        ; a single UTF-8 character other than NUL,
1408                        ; CR, or LF

1410      UTF8-NOT-PERIOD = %x01-09 / %x0b-0c / %x0e-2d / %x2f-7f /
1411                        UTF8-2 / UTF8-3 / UTF8-4
1412                          ; a single UTF-8 character other than NUL,
1413                          ; CR, LF, or period

1415      UTF8-NOT-NUL = %x01-7f / UTF8-2 / UTF8-3 / UTF8-4
1416                       ; a single UTF-8 character other than NUL

1418      UTF8-NOT-QSPECIAL = %x01-09 / %x0b-0c / %x0e-21 / %x23-5b /
1419                          %x5d-7f / UTF8-2 / UTF8-3 / UTF8-4
1420                            ; a single UTF-8 character other than NUL,
1421                            ; CR, LF, double-quote, or backslash

1423      comment = bracket-comment / hash-comment

1425      hash-comment = "#" *UTF8-NOT-CRLF CRLF

1427      identifier = (ALPHA / "_") *(ALPHA / DIGIT / "_")

1429      tag = ":" identifier

1431      number = 1*DIGIT [QUANTIFIER]

1433      QUANTIFIER = "K" / "M" / "G"

1435      quoted-safe = CRLF / UTF8-NOT-QSPECIAL
1436                      ; either a CRLF pair, OR a single UTF-8
1437                      ; character other than NUL, CR, LF,
1438                      ; double-quote, or backslash

1440      quoted-special = "\" ( DQUOTE / "\" )
1441                         ; represents just a double-quote or backslash

1443      quoted-other = "\" UTF8-NOT-QSPECIAL
1444                       ; represents just the UTF8-NOT-QSPECIAL
1445                       ; character. SHOULD NOT be used

1447      quoted-text = *(quoted-safe / quoted-special / quoted-other)

1449      quoted-string = DQUOTE quoted-text DQUOTE

1451      multi-line = "text:" *(SP / HTAB) (hash-comment / CRLF)
1452                   *(multiline-literal / multiline-dotstuff)
1453                   "." CRLF

1455      multiline-literal = [UTF8-NOT-PERIOD *UTF8-NOT-CRLF] CRLF

1457      multiline-dotstuff = "." 1*UTF8-NOT-CRLF CRLF
1458                             ; A line containing only "." ends the
1459                             ; multi-line. Remove a leading '.' if
1460                             ; followed by another '.'.

1462      white-space = 1*(SP / CRLF / HTAB) / comment

1464   8.2. Grammar

1466   The following is the grammar of Sieve after it has been lexically
1467   interpreted. No white space or comments appear below. The start
1468   symbol is "start". Non-terminals for MATCH-TYPE, COMPARATOR, and
1469   ADDRESS-PART are provided for use by extensions.

1471      argument = string-list / number / tag

1473      arguments = *argument [test / test-list]

1475      block = "{" commands "}"

1477      command = identifier arguments ( ";" / block )

1479      commands = *command

1481      start = commands

1483      string = quoted-string / multi-line

1485      string-list = "[" string *("," string) "]" / string
1486                      ; if there is only a single string, the brackets
1487                      ; are optional

1489      test = identifier arguments

1491      test-list = "(" test *("," test) ")"

1493      ADDRESS-PART = ":localpart" / ":domain" / ":all"

1495      COMPARATOR = ":comparator" string

1497      MATCH-TYPE = ":is" / ":contains" / ":matches"

1499   9. Extended Example

1501   The following is an extended example of a Sieve script. Note that it
1502   does not make use of the implicit keep.

1504     #
1505     # Example Sieve Filter
1506     # Declare any optional features or extensions used by the script
1507     #
1508     require ["fileinto"];

1510     #
1511     # Handle messages from known mailing lists
1512     # Move messages from IETF filter discussion list to filter folder
1513     #
1514     if header :is "Sender" "owner-ietf-mta-filters@imc.org"
1515             {
1516             fileinto "filter";  # move to "filter" folder
1517             }
1518     #
1519     # Keep all messages to or from people in my company
1520     #
1521     elsif address :domain :is ["From", "To"] "example.com"
1522             {
1523             keep;               # keep in "In" folder
1524             }

1526     #
1527     # Try to catch unsolicited email.  If a message is not to me,
1528     # or it contains a subject known to be spam, file it away.
1529     #
1530     elsif anyof (not address :all :contains
1531                    ["To", "Cc", "Bcc"] "me@example.com",
1532                  header :matches "subject"
1533                    ["*make*money*fast*", "*university*dipl*mas*"])
1534             {
1535             # If message header does not contain my address,
1536             # it's from a list.

1538             fileinto "spam";   # move to "spam" folder
1539             }
1540     else
1541             {
1542             # Move all other (non-company) mail to "personal"
1543             # folder.
1544             fileinto "personal";
1545             }

1547  10. Security Considerations

1549     Users must get their mail.  It is imperative that whatever method
1550     implementations use to store the user-defined filtering scripts be
1551     secure.

1553     It is equally important that implementations sanity-check the user's
1554     scripts, and not allow users to create on-demand mailbombs.  For
1555     instance, an implementation that allows a user to redirect a message
1556     multiple times might also allow a user to create a mailbomb triggered
1557     by mail from a specific user.  Site- or implementation-defined limits
1558     on actions are useful for this.

1560     Several commands, such as "discard", "redirect", and "fileinto", allow
1561     for actions to be taken that are potentially very dangerous.

1563     Implementations SHOULD take measures to prevent languages from
1564     looping.
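
As one illustration of the site- or implementation-defined limits on actions mentioned above, the following Python sketch caps the number of distinct redirect destinations that a single script execution may request. The names `ActionBudget` and `ActionLimitExceeded` and the default limit of 4 are assumptions made for this sketch; they are not part of Sieve or of any particular implementation.

```python
class ActionLimitExceeded(Exception):
    """Raised when a script requests more redirects than the site allows."""


class ActionBudget:
    """Tracks redirect actions requested during one script execution.

    Hypothetical helper: neither the class nor the default limit comes
    from the Sieve specification.
    """

    def __init__(self, max_redirects=4):
        self.max_redirects = max_redirects
        self.redirects = []  # distinct destination addresses, in order

    def redirect(self, address):
        # A repeated destination is ignored, so it does not consume budget.
        if address in self.redirects:
            return
        # Refuse the action once the site-defined limit has been reached.
        if len(self.redirects) >= self.max_redirects:
            raise ActionLimitExceeded(
                "script exceeded %d redirects" % self.max_redirects)
        self.redirects.append(address)
```

An interpreter built along these lines might treat the exception as a runtime error in the script, so that the message is kept rather than lost.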

1566     As with any filter on a message stream, if the Sieve implementation
1567     and the mail agents 'behind' Sieve in the message stream differ in
1568     their interpretation of the messages, it may be possible for an
1569     attacker to subvert the filter.  Of particular note are differences
1570     in the interpretation of malformed messages (e.g., missing or extra
1571     syntax characters) or those that exhibit corner cases (e.g., NUL
1572     octets encoded via [MIME3]).

1574  11. Acknowledgments

1576     This document has been revised in part based on comments and
1577     discussions that took place on and off the SIEVE mailing list.
1578     Thanks to Cyrus Daboo, Ned Freed, Michael Haardt, Kjetil Torgrim
1579     Homme, Barry Leiba, Mark E. Mallett, Alexey Melnikov, Rob Siemborski,
1580     and Nigel Swinson for reviews and suggestions.

1582  12. Editors' Addresses

1584     Philip Guenther
1585     Sendmail, Inc.

1587     6425 Christie St. Ste 400
1588     Emeryville, CA 94608
1589     Email: guenther@sendmail.com

1591     Tim Showalter
1592     Email: tjs@psaux.com

1594  13. Normative References

1596     [ABNF]      Crocker, D., Ed., and P. Overell, "Augmented BNF for Syntax
1597                 Specifications: ABNF", RFC 4234, October 2005.

1599     [COLLATION] Newman, C., Duerst, M., and A. Gulbrandsen, "Internet
1600                 Application Protocol Collation Registry",
1601                 draft-newman-i18n-comparator-04.txt (work in progress),
1602                 July 2005.

1604     [IMAIL]     Resnick, P., Ed., "Internet Message Format", RFC 2822,
1605                 April 2001.

1607     [KEYWORDS]  Bradner, S., "Key words for use in RFCs to Indicate
1608                 Requirement Levels", BCP 14, RFC 2119, March 1997.

1610     [MIME]      Freed, N. and N. Borenstein, "Multipurpose Internet Mail
1611                 Extensions (MIME) Part One: Format of Internet
1612                 Message Bodies", RFC 2045, November 1996.

1614     [MIME3]     Moore, K., "MIME (Multipurpose Internet Mail Extensions)
1615                 Part Three: Message Header Extensions for Non-ASCII
1616                 Text", RFC 2047, November 1996.

1618     [MDN]       Hansen, T., Ed., and G. Vaudreuil, Ed., "Message Disposition
1619                 Notification", RFC 3798, May 2004.

1621     [SMTP]      Klensin, J., Ed., "Simple Mail Transfer Protocol", RFC
1622                 2821, April 2001.

1624     [UTF-8]     Yergeau, F., "UTF-8, a transformation format of ISO
1625                 10646", RFC 3629, November 2003.

1627  14. Informative References

1629     [BINARY-SI] "Standard IEC 60027-2: Letter symbols to be used in
1630                 electrical technology - Part 2: Telecommunications and
1631                 electronics", January 1999.

1633     [DSN]       Moore, K. and G. Vaudreuil, "An Extensible Message Format
1634                 for Delivery Status Notifications", RFC 1894, January
1635                 1996.

1637     [FLAMES]    Borenstein, N., and C. Thyberg, "Power, Ease of Use, and
1638                 Cooperative Work in a Practical Multimedia Message
1639                 System", Int. J. of Man-Machine Studies, April, 1991.
1640                 Reprinted in Computer-Supported Cooperative Work and
1641                 Groupware, Saul Greenberg, editor, Harcourt Brace
1642                 Jovanovich, 1991.  Reprinted in Readings in Groupware and
1643                 Computer-Supported Cooperative Work, Ronald Baecker,
1644                 editor, Morgan Kaufmann, 1993.

1646     [IMAP]      Crispin, M., "Internet Message Access Protocol - version
1647                 4rev1", RFC 3501, March 2003.

1649  15. Full Copyright Statement

1651     Copyright (C) The Internet Society (2005).

1653     This document is subject to the rights, licenses and restrictions
1654     contained in BCP 78, and except as set forth therein, the authors
1655     retain all their rights.

1657     This document and the information contained herein are provided on an
1658     "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
1659     OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
1660     ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
1661     INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
1662     INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
1663     WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

1665  Intellectual Property

1667     The IETF takes no position regarding the validity or scope of any
1668     Intellectual Property Rights or other rights that might be claimed to
1669     pertain to the implementation or use of the technology described in
1670     this document or the extent to which any license under such rights
1671     might or might not be available; nor does it represent that it has
1672     made any independent effort to identify any such rights.  Information
1673     on the procedures with respect to rights in RFC documents can be
1674     found in BCP 78 and BCP 79.

1676     Copies of IPR disclosures made to the IETF Secretariat and any
1677     assurances of licenses to be made available, or the result of an
1678     attempt made to obtain a general license or permission for the use of
1679     such proprietary rights by implementers or users of this
1680     specification can be obtained from the IETF on-line IPR repository at
1681     http://www.ietf.org/ipr.

1683     The IETF invites any interested party to bring to its attention any
1684     copyrights, patents or patent applications, or other proprietary
1685     rights that may cover technology that may be required to implement
1686     this standard.  Please address the information to the IETF at
1687     ietf-ipr@ietf.org.

1689  Acknowledgement

1691     Funding for the RFC Editor function is currently provided by the
1692     Internet Society.

1694  Appendix A. Change History

1696     This section will be replaced with a summary of the changes since RFC
1697     3028 when this document leaves the Internet-Draft stage.

1699     Open Issues:
1700     1.  Compression of embedded whitespace
1701     2.  Merge reject back in with textual changes to permit MDNs and
1702         protocol level rejection

1704     Changes from draft-ietf-sieve-3028bis-05.txt
1705     1.  Add non-terminals for MATCH-TYPE, COMPARATOR, and ADDRESS-PART
1706     2.  Strip leading and trailing whitespace in the value being matched
1707         by header
1708     3.  Collations operate on octets, not characters, and for character
1709         data that is the UTF-8 encoding of the Unicode characters
1710     4.  :matches uses character definition of comparator

1712     Changes from draft-ietf-sieve-3028bis-04.txt
1713     1.  Change "Syntax:" to "Usage:"
1714     2.  Update ABNF reference to RFC 4234

1716     Changes from draft-ietf-sieve-3028bis-03.txt
1717     1.  Remove section 2.4.2.4., MIME Parts, as unreferenced
1718     2.  Update to draft-newman-i18n-comparator-04.txt
1719     3.  Various tweaks to examples and syntax lines
1720     4.  Define "control structure" as a control command with a block
1721         argument, then use it consistently.  Reword description of
1722         blocks to match
1723     5.  Clarify that "header" can never match an absent header and give
1724         the preferred way to test for absent or empty
1725     6.  Invalid header name syntax is not an error _in tests_ (but could
1726         be elsewhere)
1727     7.  Implementation SHOULD consider unknown envelope parts an error
1728     8.  Remove explicit "omitted" option from 2.7.2p2

1730     Changes from draft-ietf-sieve-3028bis-02.txt
1731     1.  Change "ASCII" to "US-ASCII" throughout
1732     2.  Tweak section 2.7.2 to not require use of UTF-8 internally and
1733         to explicitly leave implementation-defined the handling of text
1734         that can't be converted to Unicode
1735     3.  Add reference to RFC 2047
1736     4.  Clarify that capability strings are case-sensitive
1737     5.  Clarify that address, envelope, and header return false if no
1738         combination of arguments match
1739     6.  Directly state that code that isn't reached may still be checked
1740         for errors
1741     7.  Invalid header name syntax is not an error
1742     8.  Remove description of header unfolding that conflicts with
1743         [IMAIL]
1744     9.  Warn that filters may be subvertable if agents interpret messages
1745         differently
1746     10. Encoded NUL octets SHOULD NOT cause truncation

1748     Changes from draft-ietf-sieve-3028bis-01.txt
1749     1.  Remove ban on side effects
1750     2.  Remove definition of the 'reject' action, as it is being moved
1751         to the doc that also defines the 'refuse' action
1752     3.  Update capability registrations to reference the mailing list
1753     4.  Add Tim back as an editor
1754     5.  Refer to the zero-length string ("") as "empty" instead of
1755         "null"

1757     Changes from draft-ietf-sieve-3028bis-00.txt
1758     1.  More grammar corrections:
1759         - permit /***/,
1760         - remove ambiguity in finding end of bracket comment,
1761         - require valid UTF-8,
1762         - express quoting in the grammar,
1763         - ban bare CR and LF in all locations
1764     2.  Correct a bunch of whitespace and linewrapping nits
1765     3.  Update IMAIL and SMTP references to RFC 2822 and RFC 2821
1766     4.  Require support for en;ascii-casemap comparator as well as the
1767         old i;ascii-casemap.  As with the old one, you do not need to
1768         use 'require' to use the new comparator
1769     5.  Update IANA considerations to update the existing registrations
1770         to point at this doc instead of 3028
1771     6.  Scripts SHOULD NOT contain superfluous backslashes
1772     7.  Update Acknowledgments

1774     Changes from RFC 3028
1775     1.  Split references into normative and informative
1776     2.  Update references to current versions of DSN, IMAP, MDN, and
1777         UTF-8 RFCs
1778     3.  Replace "e-mail" with "email"
1779     4.  Incorporate RFC 3028 errata
1780     5.  The "reject" action cancels the implicit keep
1781     6.  Replace references to ACAP with references to the i18n-comparator
1782         draft.  Further work is needed to completely sync with that
1783         draft
1784     7.  Start to update grammar to only permit legal UTF-8 (incomplete)
1785         and correct various other errors and typos
1786     8.  Update IPR boilerplate to RFC 3978/3979