idnits 2.17.1 draft-ietf-sieve-3028bis-12.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** It looks like you're using RFC 3978 boilerplate. You should update this to the boilerplate described in the IETF Trust License Policy document (see https://trustee.ietf.org/license-info), which is required now. -- Found old boilerplate from RFC 3978, Section 5.1 on line 16. -- Found old boilerplate from RFC 3978, Section 5.5, updated by RFC 4748 on line 1821. -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 1832. -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 1839. -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 1845. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- -- The draft header indicates that this document obsoletes RFC3028, but the abstract doesn't seem to mention this, which it should. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust Copyright Line does not match the current year -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (February 2007) is 6274 days in the past. 
Is this intentional? -- Found something which looks like a code comment -- if you have code sections in the document, please surround them with '<CODE BEGINS>' and '<CODE ENDS>' lines. Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Missing Reference: 'COMPARATOR' is mentioned on line 1246, but not defined == Missing Reference: 'ADDRESS-PART' is mentioned on line 1176, but not defined == Missing Reference: 'MATCH-TYPE' is mentioned on line 1246, but not defined ** Obsolete normative reference: RFC 4234 (ref. 'ABNF') (Obsoleted by RFC 5234) == Outdated reference: A later version (-14) exists of draft-newman-i18n-comparator-07 ** Obsolete normative reference: RFC 2822 (ref. 'IMAIL') (Obsoleted by RFC 5322) ** Obsolete normative reference: RFC 2821 (ref. 'SMTP') (Obsoleted by RFC 5321) -- Obsolete informational reference (is this intentional?): RFC 3501 (ref. 'IMAP') (Obsoleted by RFC 9051) -- Obsolete informational reference (is this intentional?): RFC 3798 (ref. 'MDN') (Obsoleted by RFC 8098) -- Obsolete informational reference (is this intentional?): RFC 3028 (Obsoleted by RFC 5228, RFC 5429) Summary: 4 errors (**), 0 flaws (~~), 5 warnings (==), 12 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Network Working Group P. Guenther 3 Internet-Draft Sendmail, Inc. 4 Intended status: Standards Track T.
Showalter 5 Expires: August 2007 Editors 6 Obsoletes: 3028 (if approved) February 2007 8 Sieve: An Email Filtering Language 9 draft-ietf-sieve-3028bis-12.txt 11 Status of this Memo 13 By submitting this Internet-Draft, each author represents that any 14 applicable patent or other IPR claims of which he or she is aware 15 have been or will be disclosed, and any of which he or she becomes 16 aware will be disclosed, in accordance with Section 6 of BCP 79. 18 Internet-Drafts are working documents of the Internet Engineering 19 Task Force (IETF), its areas, and its working groups. Note that 20 other groups may also distribute working documents as Internet- 21 Drafts. 23 Internet-Drafts are draft documents valid for a maximum of six months 24 and may be updated, replaced, or obsoleted by other documents at any 25 time. It is inappropriate to use Internet-Drafts as reference 26 material or to cite them other than as "work in progress." 28 The list of current Internet-Drafts can be accessed at 29 http://www.ietf.org/ietf/1id-abstracts.txt 31 The list of Internet-Draft Shadow Directories can be accessed at 32 http://www.ietf.org/shadow.html. 34 A revised version of this draft document will be submitted to the RFC 35 editor as a Standard Track RFC for the Internet Community. 36 Discussion and suggestions for improvement are requested, and should 37 be sent to ietf-mta-filters@imc.org. Distribution of this memo is 38 unlimited. 40 Copyright Notice 42 Copyright (C) The IETF Trust (2007). 44 Abstract 46 This document describes a language for filtering email messages at 47 time of final delivery. It is designed to be implementable on either 48 a mail client or mail server. It is meant to be extensible, simple, 49 and independent of access protocol, mail architecture, and operating 50 system. 
It is suitable for running on a mail server where users may 51 not be allowed to execute arbitrary programs, such as on black box 52 Internet Message Access Protocol (IMAP) servers, as the base language 53 has no variables, loops, or ability to shell out to external 54 programs. 56 Table of Contents 58 1. Introduction ........................................... 3 59 1.1. Conventions Used in This Document ..................... 4 60 1.2. Example mail messages ................................. 5 61 2. Design ................................................. 6 62 2.1. Form of the Language .................................. 6 63 2.2. Whitespace ............................................ 6 64 2.3. Comments .............................................. 6 65 2.4. Literal Data .......................................... 7 66 2.4.1. Numbers ............................................... 7 67 2.4.2. Strings ............................................... 7 68 2.4.2.1. String Lists .......................................... 8 69 2.4.2.2. Headers ............................................... 9 70 2.4.2.3. Addresses ............................................. 9 71 2.4.2.4. Encoding characters using "encoded-character" ......... 9 72 2.5. Tests ................................................. 10 73 2.5.1. Test Lists ............................................ 10 74 2.6. Arguments ............................................. 11 75 2.6.1. Positional Arguments .................................. 11 76 2.6.2. Tagged Arguments ...................................... 11 77 2.6.3. Optional Arguments .................................... 12 78 2.6.4. Types of Arguments .................................... 12 79 2.7. String Comparison ..................................... 12 80 2.7.1. Match Type ............................................ 12 81 2.7.2. Comparisons Across Character Sets ..................... 14 82 2.7.3. Comparators ........................................... 
14 83 2.7.4. Comparisons Against Addresses ......................... 15 84 2.8. Blocks ................................................ 16 85 2.9. Commands .............................................. 16 86 2.10. Evaluation ............................................ 17 87 2.10.1. Action Interaction .................................... 17 88 2.10.2. Implicit Keep ......................................... 17 89 2.10.3. Message Uniqueness in a Mailbox ....................... 17 90 2.10.4. Limits on Numbers of Actions .......................... 18 91 2.10.5. Extensions and Optional Features ...................... 18 92 2.10.6. Errors ................................................ 18 93 2.10.7. Limits on Execution ................................... 19 94 3. Control Commands ....................................... 19 95 3.1. Control if ............................................ 19 96 3.2. Control require ....................................... 21 97 3.3. Control stop .......................................... 21 98 4. Action Commands ........................................ 21 99 4.1. Action fileinto ....................................... 21 100 4.2. Action redirect ....................................... 22 101 4.3. Action keep ........................................... 23 102 4.4. Action discard ........................................ 23 103 5. Test Commands .......................................... 24 104 5.1. Test address .......................................... 24 105 5.2. Test allof ............................................ 25 106 5.3. Test anyof ............................................ 25 107 5.4. Test envelope ......................................... 25 108 5.5. Test exists ........................................... 26 109 5.6. Test false ............................................ 26 110 5.7. Test header ........................................... 27 111 5.8. Test not .............................................. 27 112 5.9. 
Test size ............................................. 27 113 5.10. Test true ............................................. 28 114 6. Extensibility .......................................... 28 115 6.1. Capability String ..................................... 29 116 6.2. IANA Considerations ................................... 29 117 6.2.1. Template for Capability Registrations ................. 30 118 6.2.2. Handling of Existing Capability Registrations ......... 30 119 6.2.3. Initial Capability Registrations ...................... 30 120 6.3. Capability Transport .................................. 31 121 7. Transmission ........................................... 31 122 8. Parsing ................................................ 32 123 8.1. Lexical Tokens ........................................ 32 124 8.2. Grammar ............................................... 34 125 9. Extended Example ....................................... 35 126 10. Security Considerations ................................ 36 127 11. Acknowledgments ........................................ 36 128 12. Editors' Addresses ..................................... 37 129 13. Normative References ................................... 37 130 14. Informative References ................................. 38 131 15. Changes from RFC 3028 .................................. 39 132 16. Full Copyright Statement ............................... 40 134 1. Introduction 136 This memo documents a language that can be used to create filters for 137 electronic mail. It is not tied to any particular operating system 138 or mail architecture. It requires the use of [IMAIL]-compliant 139 messages, but should otherwise generalize to many systems. 141 The language is powerful enough to be useful but limited in order to 142 allow for a safe server-side filtering system. 
The intention is to 143 make it impossible for users to do anything more complex (and 144 dangerous) than write simple mail filters, along with facilitating 145 the use of GUIs for filter creation and manipulation. The base 146 language is intentionally not Turing-complete: it provides no way to 147 write a loop or a function and variables are not provided. 149 Scripts written in Sieve are executed during final delivery, when the 150 message is moved to the user-accessible mailbox. In systems where 151 the Mail Transfer Agent (MTA) does final delivery, such as 152 traditional Unix mail, it is reasonable to sort when the MTA deposits 153 mail into the user's mailbox. 155 There are a number of reasons to use a filtering system. Mail 156 traffic for most users has been increasing due to increased usage of 157 email, the emergence of unsolicited email as a form of advertising, 158 and increased usage of mailing lists. 160 Experience at Carnegie Mellon has shown that if a filtering system is 161 made available to users, many will make use of it in order to file 162 messages from specific users or mailing lists. However, many others 163 did not make use of the Andrew system's FLAMES filtering language 164 [FLAMES] due to difficulty in setting it up. 166 Because of the expectation that users will make use of filtering if 167 it is offered and easy to use, this language has been made simple 168 enough to allow many users to make use of it, but rich enough that it 169 can be used productively. However, it is expected that GUI-based 170 editors will be the preferred way of editing filters for a large 171 number of users. 173 1.1. Conventions Used in This Document 175 In the sections of this document that discuss the requirements of 176 various keywords and operators, the following conventions have been 177 adopted. 
179 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 180 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 181 document are to be interpreted as described in [KEYWORDS]. 183 Each section on a command (test, action, or control) has a line 184 labeled "Usage:". This line describes the usage of the command, 185 including its name and its arguments. Required arguments are listed 186 inside angle brackets ("<" and ">"). Optional arguments are listed 187 inside square brackets ("[" and "]"). Each argument is followed by 188 its type, so "<key: string>" represents an argument called "key" that 189 is a string. Literal strings are represented with double-quoted 190 strings. Alternatives are separated with slashes, and parentheses 191 are used for grouping, similar to [ABNF]. 193 In the "Usage:" line, there are three special pieces of syntax that 194 are frequently repeated, MATCH-TYPE, COMPARATOR, and ADDRESS-PART. 195 These are discussed in sections 2.7.1, 2.7.3, and 2.7.4, 196 respectively. 198 The formal grammar for these commands is defined in section 8 and is 199 the authoritative reference on how to construct commands, but the 200 formal grammar does not specify the order, semantics, number or types 201 of arguments to commands, nor the legal command names. The intent is 202 to allow for extension without changing the grammar. 204 1.2. Example mail messages 206 The following mail messages will be used throughout this document in 207 examples. 209 Message A 210 ----------------------------------------------------------- 211 Date: Tue, 1 Apr 1997 09:06:31 -0800 (PST) 212 From: coyote@desert.example.org 213 To: roadrunner@acme.example.com 214 Subject: I have a present for you 216 Look, I'm sorry about the whole anvil thing, and I really 217 didn't mean to try and drop it on you from the top of the 218 cliff. I want to try to make it up to you.
I've got some 219 great birdseed over here at my place--top of the line 220 stuff--and if you come by, I'll have it all wrapped up 221 for you. I'm really sorry for all the problems I've caused 222 for you over the years, but I know we can work this out. 223 -- 224 Wile E. Coyote "Super Genius" coyote@desert.example.org 225 ----------------------------------------------------------- 227 Message B 228 ----------------------------------------------------------- 229 From: youcouldberich!@reply-by-postal-mail.invalid 230 Sender: b1ff@de.res.example.com 231 To: rube@landru.example.com 232 Date: Mon, 31 Mar 1997 18:26:10 -0800 233 Subject: $$$ YOU, TOO, CAN BE A MILLIONAIRE! $$$ 235 YOU MAY HAVE ALREADY WON TEN MILLION DOLLARS, BUT I DOUBT 236 IT! SO JUST POST THIS TO SIX HUNDRED NEWSGROUPS! IT WILL 237 GUARANTEE THAT YOU GET AT LEAST FIVE RESPONSES WITH MONEY! 238 MONEY! MONEY! COLD HARD CASH! YOU WILL RECEIVE OVER 239 $20,000 IN LESS THAN TWO MONTHS! AND IT'S LEGAL!!!!!!!!! 240 !!!!!!!!!!!!!!!!!!111111111!!!!!!!11111111111!!1 JUST 241 SEND $5 IN SMALL, UNMARKED BILLS TO THE ADDRESSES BELOW! 242 ----------------------------------------------------------- 244 2. Design 246 2.1. Form of the Language 248 The language consists of a set of commands. Each command consists of 249 a set of tokens delimited by whitespace. The command identifier is 250 the first token and it is followed by zero or more argument tokens. 251 Arguments may be literal data, tags, blocks of commands, or test 252 commands. 254 With the exceptions of strings and comments, the language is limited 255 to US-ASCII characters. Strings and comments may contain octets 256 outside the US-ASCII range. Specifically, they will normally be in 257 UTF-8, as specified in [UTF-8]. NUL (US-ASCII 0) is never permitted 258 in scripts, while CR and LF can only appear as the CRLF line ending. 
260 While this specification permits arbitrary octets to appear in 261 sieve scripts inside strings and comments, this has made it 262 difficult to robustly handle sieve scripts in programs that are 263 sensitive to the encodings used. The "encoded-character" 264 capability (section 2.4.2.4) provides an alternative means of 265 representing such octets in strings using just US-ASCII 266 characters. As such, the use of non-UTF-8 text in scripts should 267 be considered a deprecated feature that may be abandoned. 269 Tokens other than strings are considered case-insensitive. 271 2.2. Whitespace 273 Whitespace is used to separate tokens. Whitespace is made up of 274 tabs, newlines (CRLF, never just CR or LF), and the space character. 275 The amount of whitespace used is not significant. 277 2.3. Comments 279 Two types of comments are offered. Comments are semantically 280 equivalent to whitespace and can be used anyplace that whitespace is 281 (with one exception in multi-line strings, as described in the 282 grammar). 284 Hash comments begin with a "#" character that is not contained within 285 a string and continue until the next CRLF. 287 Example: if size :over 100k { # this is a comment 288 discard; 290 } 292 Bracketed comments begin with the token "/*" and end with "*/" 293 outside of a string. Bracketed comments may span multiple lines. 294 Bracketed comments do not nest. 296 Example: if size :over 100K { /* this is a comment 297 this is still a comment */ discard /* this is a comment 298 */ ; 299 } 301 2.4. Literal Data 303 Literal data means data that is not executed, merely evaluated "as 304 is", to be used as arguments to commands. Literal data is limited to 305 numbers, strings, and string lists. 307 2.4.1. Numbers 309 Numbers are given as ordinary decimal numbers. As a shorthand for 310 expressing larger values, such as message sizes, a suffix of "K", 311 "M", or "G" MAY be appended to indicate a multiple of a power of two. 
312 To be comparable with the power-of-two-based versions of SI units 313 that computers frequently use, "K" specifies kibi-, or 1,024 (2^10) 314 times the value of the number; "M" specifies mebi-, or 1,048,576 315 (2^20) times the value of the number; and "G" specifies gibi-, or 316 1,073,741,824 (2^30) times the value of the number [BINARY-SI]. 318 Implementations MUST support integer values in the inclusive range 319 zero to 2,147,483,647 (2^31 - 1), but MAY support larger values. 321 Only non-negative integers are permitted by this specification. 323 2.4.2. Strings 325 Scripts involve large numbers of string values as they are used for 326 pattern matching, addresses, textual bodies, etc. Typically, short 327 quoted strings suffice for most uses, but a more convenient form is 328 provided for longer strings such as bodies of messages. 330 A quoted string starts and ends with a single double quote (the <"> 331 character, US-ASCII 34). A backslash ("\", US-ASCII 92) inside of a 332 quoted string is followed by either another backslash or a double 333 quote. These two-character sequences represent a single backslash or 334 double quote within the value, respectively. 336 Scripts SHOULD NOT escape other characters with a backslash. 338 An undefined escape sequence (such as "\a" in a context where "a" has 339 no special meaning) is interpreted as if there were no backslash (in 340 this case, "\a" is just "a"), though that may be changed by 341 extensions. 343 Non-printing characters such as tabs, CRLF, and control characters 344 are permitted in quoted strings. Quoted strings MAY span multiple 345 lines. NUL (US-ASCII 0) is not allowed in strings. 347 As message header data is converted to [UTF-8] for comparison (see 348 section 2.7.2), most string values will use the UTF-8 encoding. 349 However, implementations MUST accept all strings that match the 350 grammar in section 8. 
The ability to use non-UTF-8 encoded strings 351 matches existing practice and has proven to be useful both in tests 352 for invalid data and in arguments containing raw MIME parts for 353 extension actions that generate outgoing messages. 355 For entering larger amounts of text, such as an email message, a 356 multi-line form is allowed. It starts with the keyword "text:", 357 followed by a CRLF, and ends with the sequence of a CRLF, a single 358 period, and another CRLF. The CRLF before the final period is 359 considered part of the value. In order to allow the message to 360 contain lines with a single dot, lines are dot-stuffed. That is, 361 when composing a message body, an extra `.' is added before each line 362 which begins with a `.'. When the server interprets the script, 363 these extra dots are removed. Note that a line that begins with a 364 dot followed by a non-dot character is not interpreted as dot-stuffed; 365 that is, ".foo" is interpreted as ".foo". However, because this is 366 potentially ambiguous, scripts SHOULD be properly dot-stuffed so such 367 lines do not appear. 369 Note that a hash comment or whitespace may occur in between the 370 "text:" and the CRLF, but not within the string itself. Bracketed 371 comments are not allowed here. 373 2.4.2.1. String Lists 375 When matching patterns, it is frequently convenient to match against 376 groups of strings instead of single strings. For this reason, a list 377 of strings is allowed in many tests, implying that if the test is 378 true using any one of the strings, then the test is true. 380 For instance, the test `header :contains ["To", "Cc"] 381 ["me@example.com", "me00@landru.example.com"]' is true if either a To 382 header or Cc header of the input message contains either of the email 383 addresses "me@example.com" or "me00@landru.example.com".
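As a minimal sketch, the test above could appear in a complete script as follows. The mailbox name "personal" is a hypothetical value chosen only for illustration, and the script assumes the implementation offers the "fileinto" extension (section 4.1).

```sieve
require "fileinto";

# True if any listed header contains any listed key.
if header :contains ["To", "Cc"]
          ["me@example.com", "me00@landru.example.com"] {
    # "personal" is a hypothetical mailbox name.
    fileinto "personal";
}
```

A message with "Cc: me00@landru.example.com" and no matching To header would still satisfy the test, since one header/key pair is enough.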
385 Conversely, in any case where a list of strings is appropriate, a 386 single string is allowed without being a member of a list: it is 387 equivalent to a list with a single member. This means that the test 388 `exists "To"' is equivalent to the test `exists ["To"]'. 390 2.4.2.2. Headers 392 Headers are a subset of strings. In the Internet Message 393 Specification [IMAIL], each header line is allowed to have whitespace 394 nearly anywhere in the line, including after the field name and 395 before the subsequent colon. Extra spaces between the header name 396 and the ":" in a header field are ignored. 398 A header name never contains a colon. The "From" header refers to a 399 line beginning "From:" (or "From :", etc.). No header will match 400 the string "From:" due to the trailing colon. 402 Similarly, no header will match a syntactically invalid header name. 403 An implementation MUST NOT cause an error for syntactically invalid 404 header names in tests. 406 Header lines are unfolded as described in [IMAIL] section 2.2.3. 407 Interpretation of header data SHOULD be done according to [MIME3] 408 section 6.2 (see 2.7.2 below for details). 410 2.4.2.3. Addresses 412 A number of commands call for email addresses, which are also a 413 subset of strings. When these addresses are used in outbound 414 contexts, addresses must be compliant with [IMAIL], but are further 415 constrained within this document. Using the symbols defined in 416 [IMAIL], section 3, the syntax of an address is: 418 sieve-address = addr-spec ; simple address 419 / phrase "<" addr-spec ">" ; name & addr-spec 421 That is, routes and group syntax are not permitted. If multiple 422 addresses are required, use a string list. Named groups are not 423 permitted. 425 It is an error for a script to execute an action with a value for use 426 as an outbound address that doesn't match the "sieve-address" syntax. 428 2.4.2.4. 
Encoding characters using "encoded-character" 430 When the "encoded-character" extension is in effect, certain 431 character sequences in strings are replaced by their decoded value. 432 This happens after escape sequences are interpreted and dot- 433 unstuffing has been done. Implementations SHOULD support "encoded- 434 character". 436 Arbitrary octets can be embedded in strings by using the syntax 437 encoded-arb-octets. The sequence is replaced by the octets with the 438 hexadecimal values given by each hex-pair. 440 encoded-arb-octets = "${hex:" hex-pair-seq "}" 441 hex-pair-seq = hex-pair *(WSP hex-pair) 442 hex-pair = 1*2HEXDIG 444 It may be inconvenient or undesirable to enter Unicode characters 445 verbatim and for these cases the syntax encoded-unicode-char can be 446 used. The sequence is replaced by the UTF-8 encoding of the 447 specified Unicode characters, which are identified by the hexadecimal 448 value of unicode-hex. 450 encoded-unicode-char = "${unicode:" unicode-hex-seq "}" 451 unicode-hex-seq = unicode-hex *(WSP unicode-hex) 452 unicode-hex = 1*6HEXDIG 454 It is an error for a script to use a hexadecimal value that isn't in 455 either the range 0 to D7FF or the range E000 to 10FFFF. (The range 456 D800 to DFFF is excluded as those character numbers are only used as 457 part of the UTF-16 encoding form and are not applicable to the UTF-8 458 encoding that the syntax here represents.) 460 The capability string for use with the require command is "encoded- 461 character". 463 In the following script, message B is discarded, since the specified 464 test string is equivalent to "$$$". 466 Example: require "encoded-character"; 467 if header :contains "Subject" "$${hex:24 24}" { 468 discard; 469 } 471 2.5. Tests 473 Tests are given as arguments to commands in order to control their 474 actions. In this document, tests are given to if/elsif/else to 475 decide which block of code is run. 477 2.5.1.
Test Lists 479 Some tests ("allof" and "anyof", which implement logical "and" and 480 logical "or", respectively) may require more than a single test as an 481 argument. The test-list syntax element provides a way of grouping 482 tests as a comma separated list in parens. 484 Example: if anyof (not exists ["From", "Date"], 485 header :contains "from" "fool@example.com") { 486 discard; 487 } 489 2.6. Arguments 491 In order to specify what to do, most commands take arguments. There 492 are three types of arguments: positional, tagged, and optional. 494 It is an error for a script, on a single command, to use conflicting 495 arguments or to use a tagged or optional argument more than once. 497 2.6.1. Positional Arguments 499 Positional arguments are given to a command which discerns their 500 meaning based on their order. When a command takes positional 501 arguments, all positional arguments must be supplied and must be in 502 the order prescribed. 504 2.6.2. Tagged Arguments 506 This document provides for tagged arguments in the style of 507 CommonLISP. These are also similar to flags given to commands in 508 most command-line systems. 510 A tagged argument is an argument for a command that begins with ":" 511 followed by a tag naming the argument, such as ":contains". This 512 argument means that zero or more of the next tokens have some 513 particular meaning depending on the argument. These next tokens may 514 be literal data but they are never blocks. 516 Tagged arguments are similar to positional arguments, except that 517 instead of the meaning being derived from the command, it is derived 518 from the tag. 520 Tagged arguments must appear before positional arguments, but they 521 may appear in any order with other tagged arguments. For simplicity 522 of the specification, this is not expressed in the syntax definitions 523 with commands, but they still may be reordered arbitrarily provided 524 they appear before positional arguments. 
Tagged arguments may be 525 mixed with optional arguments. 527 To simplify this specification, tagged arguments SHOULD NOT take 528 tagged arguments as arguments. 530 2.6.3. Optional Arguments 532 Optional arguments are exactly like tagged arguments except that they 533 may be left out, in which case a default value is implied. Because 534 optional arguments tend to result in shorter scripts, they have been 535 used far more than tagged arguments. 537 One particularly noteworthy case is the ":comparator" argument, which 538 allows the user to specify which comparator [COLLATION] will be used 539 to compare two strings, since different languages may impose 540 different orderings on UTF-8 [UTF-8] characters. 542 2.6.4. Types of Arguments 544 Abstractly, arguments may be literal data, tests, or blocks of 545 commands. In this way, an "if" control structure is merely a command 546 that happens to take a test and a block as arguments and may execute 547 the block of code. 549 However, this abstraction is ambiguous from a parsing standpoint. 550 The grammar in section 8.2 presents a parsable version of this: 551 Arguments are string-lists, numbers, and tags, which may be followed 552 by a test or a test-list, which may be followed by a block of 553 commands. No more than one test or test list, nor more than one 554 block of commands, may be used, and commands that end with a block of 555 commands do not end with semicolons. 557 2.7. String Comparison 559 When matching one string against another, there are a number of ways 560 of performing the match operation. These are accomplished with three 561 types of matches: an exact match, a substring match, and a wildcard 562 glob-style match. These are described below. 564 In order to provide for matches between character sets and case 565 insensitivity, Sieve uses the comparators defined in the Internet 566 Application Protocol Collation Registry [COLLATION].
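As a rough illustration of the three kinds of match, the sketch below tests the Subject header of the example messages from section 1.2. The choice of actions is arbitrary and serves only to make each test a complete command.

```sieve
# Exact match: the whole Subject must equal the key.
# True for message A.
if header :is "Subject" "I have a present for you" {
    keep;
}

# Substring match: the key may appear anywhere in the Subject.
# True for message B.
if header :contains "Subject" "MILLIONAIRE" {
    discard;
}

# Wildcard match: "*" stands for zero or more characters, and the
# pattern must cover the entire Subject.  True for message B.
if header :matches "Subject" "$$$*$$$" {
    discard;
}
```

No ":comparator" argument is given in these tests, so the keys and header values are compared with the default comparator.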
568 However, when a string represents the name of a header, the 569 comparator is never user-specified. Header comparisons are always 570 done with the "i;ascii-casemap" operator, i.e., case-insensitive 571 comparisons, because this is the way things are defined in the 572 message specification [IMAIL]. 574 2.7.1. Match Type 576 There are three match types describing the matching used in this 577 specification: ":is", ":contains", and ":matches". Match type 578 arguments are supplied to those commands which allow them to specify 579 what kind of match is to be performed. 581 These are used as optional arguments to tests that perform string 582 comparison. 584 The ":contains" match type describes a substring match. If the value 585 argument contains the key argument as a substring, the match is true. 586 For instance, the string "frobnitzm" contains "frob" and "nit", but 587 not "fbm". The empty key ("") is contained in all values. 589 The ":is" match type describes an absolute match; if the contents of 590 the first string are absolutely the same as the contents of the 591 second string, they match. Only the string "frobnitzm" is the string 592 "frobnitzm". The empty key ":is" and only ":is" the empty value. 594 The ":matches" match type specifies a wildcard match using the 595 characters "*" and "?"; the entire value must be matched. "*" 596 matches zero or more characters in the value and "?" matches a single 597 character in the value, where the comparator that is used (see 2.7.3) 598 defines what a character is. For example, the comparators "i;octet" 599 and "i;ascii-casemap" define a character to be a single octet so "?" 600 will always match exactly one octet when one of those comparators is 601 in use. In contrast, the comparator "i;basic;uca=3.1.1;uv=3.2" 602 defines a character to be any UTF-8 octet sequence encoding one 603 Unicode character and thus "?" may match more than one octet. "?" 604 and "*" may be escaped as "\\?" 
and "\\*" in strings to match against 605 themselves. The first backslash escapes the second backslash; 606 together, they escape the "*". This is awkward, but it is 607 commonplace in several programming languages that use globs and 608 regular expressions. 610 In order to specify what type of match is supposed to happen, 611 commands that support matching take optional arguments ":matches", 612 ":is", and ":contains". Commands default to using ":is" matching if 613 no match type argument is supplied. Note that these modifiers 614 interact with comparators; in particular, only comparators that 615 support the "substring match" operation are suitable for matching 616 with ":contains" or ":matches". It is an error to use a comparator 617 with ":contains" or ":matches" that is not compatible with it. 619 It is an error to give more than one of these arguments to a given 620 command. 622 For convenience, the "MATCH-TYPE" syntax element is defined here as 623 follows: 625 Syntax: ":is" / ":contains" / ":matches" 627 2.7.2. Comparisons Across Character Sets 629 Messages may involve a number of character sets. In order for 630 comparisons to work across character sets, implementations SHOULD 631 implement the following behavior: 633 Comparisons are performed on octets. Implementations convert text 634 from header fields in all charsets [MIME3] to Unicode, encoded as 635 UTF-8, as input to the comparator (see 2.7.3). Implementations 636 MUST be capable of converting US-ASCII, ISO-8859-1, the US-ASCII 637 subset of ISO-8859-* character sets, and UTF-8. Text that the 638 implementation cannot convert to Unicode for any reason MAY be 639 treated as plain US-ASCII (including any [MIME3] syntax) or 640 processed according to local conventions. An encoded NUL octet 641 (character zero) SHOULD NOT cause early termination of the header 642 content being compared against. 
644 If implementations fail to support the above behavior, they MUST 645 conform to the following: 647 No two strings can be considered equal if one contains octets 648 greater than 127. 650 2.7.3. Comparators 652 In order to allow for language-independent, case-independent matches, 653 the match type may be coupled with a comparator name. The Internet 654 Application Protocol Collation Registry [COLLATION] provides the 655 framework for describing and naming comparators as used by this 656 specification. 658 All implementations MUST support the "i;octet" comparator (simply 659 compares octets) and the "i;ascii-casemap" comparator (which treats 660 uppercase and lowercase characters in the US-ASCII subset of UTF-8 as 661 the same). If left unspecified, the default is "i;ascii-casemap". 663 Some comparators may not be usable with substring matches; that is, 664 they may only work with ":is". It is an error to try and use a 665 comparator with ":matches" or ":contains" that is not compatible with 666 it. 668 Sieve treats a comparator result of "undefined" the same as a result 669 of "no-match". That is, this base specification does not provide any 670 means to directly detect invalid comparator input. 672 A comparator is specified by the ":comparator" option with commands 673 that support matching. This option is followed by a string providing 674 the name of the comparator to be used. For convenience, the syntax 675 of a comparator is abbreviated to "COMPARATOR", and (repeated in 676 several tests) is as follows: 678 Syntax: ":comparator" 680 So in this example, 682 Example: if header :contains :comparator "i;octet" "Subject" 683 "MAKE MONEY FAST" { 684 discard; 685 } 687 would discard any message with subjects like "You can MAKE MONEY 688 FAST", but not "You can Make Money Fast", since the comparator used 689 is case-sensitive. 691 Comparators other than "i;octet" and "i;ascii-casemap" must be 692 declared with require, as they are extensions. 
If a comparator 693 declared with require is not known, it is an error, and execution 694 fails. If the comparator is not declared with require, it is also an 695 error, even if the comparator is supported. (See 2.10.5.) 697 Both ":matches" and ":contains" match types are compatible with the 698 "i;octet" and "i;ascii-casemap" comparators and may be used with 699 them. 701 It is an error to give more than one of these arguments to a given 702 command. 704 2.7.4. Comparisons Against Addresses 706 Addresses are one of the most frequent things represented as strings. 707 These are structured, and being able to compare against the local- 708 part or the domain of an address is useful, so some tests that act 709 exclusively on addresses take an additional optional argument that 710 specifies what the test acts on. 712 These optional arguments are ":localpart", ":domain", and ":all", 713 which act on the local-part (left-side), the domain part (right- 714 side), and the whole address. 716 If an address is not syntactically valid then it will not be matched 717 by tests specifying ":localpart" or ":domain". 719 The kind of comparison done, such as whether or not the test done is 720 case-insensitive, is specified as a comparator argument to the test. 722 If an optional address-part is omitted, the default is ":all". 724 It is an error to give more than one of these arguments to a given 725 command. 727 For convenience, the "ADDRESS-PART" syntax element is defined here as 728 follows: 730 Syntax: ":localpart" / ":domain" / ":all" 732 2.8. Blocks 734 Blocks are sets of commands enclosed within curly braces and supplied 735 as the final argument to a command. Such a command is a control 736 structure: when executed it has control over the number of times the 737 commands in the block are executed. 739 With the commands supplied in this memo, there are no loops. The 740 control structures supplied--if, elsif, and else--run a block either 741 once or not at all. 743 2.9. 
Commands 745 Sieve scripts are sequences of commands. Commands can take any of 746 the tokens above as arguments, and arguments may be either tagged or 747 positional arguments. Not all commands take all arguments. 749 There are three kinds of commands: test commands, action commands, 750 and control commands. 752 The simplest is an action command. An action command is an 753 identifier followed by zero or more arguments, terminated by a 754 semicolon. Action commands do not take tests or blocks as arguments. 755 The actions referenced in this document are: 756 - keep, to save the message in the default location 757 - fileinto, to save the message in a specific mailbox 758 - redirect, to forward the message to another address 759 - discard, to silently throw away the message 761 A control command is a command that affects the parsing or the flow 762 of execution of the Sieve script in some way. A control structure is 763 a control command which ends with a block instead of a semicolon. 765 A test command is used as part of a control command. It is used to 766 specify whether or not the block of code given to the control command 767 is executed. 769 2.10. Evaluation 771 2.10.1. Action Interaction 773 Some actions cannot be used with other actions because the result 774 would be absurd. These restrictions are noted throughout this memo. 776 Extension actions MUST state how they interact with actions defined 777 in this specification. 779 2.10.2. Implicit Keep 781 Previous experience with filtering systems suggests that cases tend 782 to be missed in scripts. To prevent errors, Sieve has an "implicit 783 keep". 785 An implicit keep is a keep action (see 4.3) performed in the absence of 786 any action that cancels the implicit keep. 788 An implicit keep is performed if a message is not written to a 789 mailbox, redirected to a new address, or explicitly thrown out. That 790 is, if a fileinto, a keep, a redirect, or a discard is performed, an 791 implicit keep is not.
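   As an illustrative sketch only (the mailbox name "Reports" is an
   assumed name, not one defined by this document), the following
   script shows both outcomes described above.  A message whose Subject
   contains "report" is filed, which cancels the implicit keep; for any
   other message no action is taken, so the implicit keep applies.

   Example:  require "fileinto";
             if header :contains "subject" "report" {
                 fileinto "Reports";   # cancels the implicit keep
             }
             # no action taken otherwise: the implicit keep applies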
793 Some actions may be defined to not cancel the implicit keep. These 794 actions may not directly affect the delivery of a message, and are 795 used for their side effects. None of the actions specified in this 796 document meets that criterion, but extension actions may. 798 For instance, with any of the short messages offered above, the 799 following script produces no actions. 801 Example: if size :over 500K { discard; } 803 As a result, the implicit keep is taken. 805 2.10.3. Message Uniqueness in a Mailbox 807 Implementations SHOULD NOT deliver a message to the same mailbox more 808 than once, even if a script explicitly asks for a message to be 809 written to a mailbox twice. 811 The test for equality of two messages is implementation-defined. 813 If a script asks for a message to be written to a mailbox twice, it 814 MUST NOT be treated as an error. 816 2.10.4. Limits on Numbers of Actions 818 Site policy MAY limit numbers of actions taken and MAY impose 819 restrictions on which actions can be used together. In the event 820 that a script hits a policy limit on the number of actions taken for 821 a particular message, an error occurs. 823 Implementations MUST allow at least one keep or one fileinto. If 824 fileinto is not implemented, implementations MUST allow at least one 825 keep. 827 2.10.5. Extensions and Optional Features 829 Because of the differing capabilities of many mail systems, several 830 features of this specification are optional. Before any of these 831 extensions can be executed, they must be declared with the "require" 832 action. 834 If an extension is not enabled with "require", implementations MUST 835 treat it as if they did not support it at all. This protects scripts 836 from having their behavior altered by extensions which the script 837 author might not have even been aware of. 839 Implementations MUST NOT execute at all any script which requires an 840 unknown capability name.
842 Note: The reason for this restriction is that prior experiences with 843 languages such as LISP and Tcl suggest that this is a workable 844 way of noting that a given script uses an extension. 846 Experience with PostScript suggests that mechanisms that allow 847 a script to work around missing extensions are not used in 848 practice. 850 Extensions which define actions MUST state how they interact with 851 actions discussed in the base specification. 853 2.10.6. Errors 855 In any programming language, there are compile-time and run-time 856 errors. 858 Compile-time errors are ones in syntax that are detectable if a 859 syntax check is done. 861 Run-time errors are not detectable until the script is run. This 862 includes transient failures like disk full conditions, but also 863 includes issues like invalid combinations of actions. 865 When an error occurs in a Sieve script, all processing stops. 867 Implementations MAY choose to do a full parse, then evaluate the 868 script, then do all actions. Implementations might even go so far as 869 to ensure that execution is atomic (either all actions are executed 870 or none are executed). 872 Other implementations may choose to parse and run at the same time. 873 Such implementations are simpler, but have issues with partial 874 failure (some actions happen, others don't). 876 Implementations MUST perform syntactic, semantic, and run-time checks 877 on code that is actually executed. Implementations MAY perform those 878 checks or any part of them on code that is not reached during 879 execution. 881 When an error happens, implementations MUST notify the user that an 882 error occurred, which actions (if any) were taken, and do an implicit 883 keep. 885 2.10.7. Limits on Execution 887 Implementations may limit certain constructs. However, this 888 specification places a lower bound on some of these limits. 890 Implementations MUST support fifteen levels of nested blocks. 
892 Implementations MUST support fifteen levels of nested test lists. 894 3. Control Commands 896 Control structures are needed to allow for multiple and conditional 897 actions. 899 3.1. Control if 901 There are three pieces to if: "if", "elsif", and "else". Each is 902 actually a separate command in terms of the grammar. However, an 903 elsif or else MUST only follow an if or elsif. An error occurs if 904 these conditions are not met. 906 Usage: if 908 Usage: elsif 910 Usage: else 912 The semantics are similar to those of any of the many other 913 programming languages these control structures appear in. When the 914 interpreter sees an "if", it evaluates the test associated with it. 915 If the test is true, it executes the block associated with it. 917 If the test of the "if" is false, it evaluates the test of the first 918 "elsif" (if any). If the test of "elsif" is true, it runs the 919 elsif's block. An elsif may be followed by an elsif, in which case, 920 the interpreter repeats this process until it runs out of elsifs. 922 When the interpreter runs out of elsifs, there may be an "else" case. 923 If there is, and none of the if or elsif tests were true, the 924 interpreter runs the else's block. 926 This provides a way of performing exactly one of the blocks in the 927 chain. 929 In the following example, both Message A and B are dropped. 931 Example: require "fileinto"; 932 if header :contains "from" "coyote" { 933 discard; 934 } elsif header :contains ["subject"] ["$$$"] { 935 discard; 936 } else { 937 fileinto "INBOX"; 938 } 940 When the script below is run over message A, it redirects the message 941 to acm@example.com; message B, to postmaster@example.com; any other 942 message is redirected to field@example.com. 
944 Example: if header :contains ["From"] ["coyote"] { 945 redirect "acm@example.com"; 946 } elsif header :contains "Subject" "$$$" { 947 redirect "postmaster@example.com"; 948 } else { 949 redirect "field@example.com"; 950 } 952 Note that this definition prohibits the "... else if ..." sequence 953 used by C. This is intentional, because this construct produces a 954 shift-reduce conflict. 956 3.2. Control require 958 Usage: require 960 The require action notes that a script makes use of a certain 961 extension. Such a declaration is required to use the extension, as 962 discussed in section 2.10.5. Multiple capabilities can be declared 963 with a single require. 965 The require command, if present, MUST be used before anything other 966 than a require can be used. An error occurs if a require appears 967 after a command other than require. 969 Example: require ["fileinto", "reject"]; 971 Example: require "fileinto"; 972 require "vacation"; 974 3.3. Control stop 976 Usage: stop 978 The "stop" action ends all processing. If the implicit keep has not 979 been cancelled, then it is taken. 981 4. Action Commands 983 This document supplies four actions that may be taken on a message: 984 keep, fileinto, redirect, and discard. 986 Implementations MUST support the "keep", "discard", and "redirect" 987 actions. 989 Implementations SHOULD support "fileinto". 991 Implementations MAY limit the number of certain actions taken (see 992 section 2.10.4). 994 4.1. Action fileinto 996 Usage: fileinto 998 The "fileinto" action delivers the message into the specified 999 mailbox. Implementations SHOULD support fileinto, but in some 1000 environments this may be impossible. Implementations MAY place 1001 restrictions on mailbox names; use of an invalid mailbox name MAY be 1002 treated as an error or result in delivery to an implementation- 1003 defined mailbox. 
If the specified mailbox doesn't exist, the 1004 implementation MAY treat it as an error, create the mailbox, or 1005 deliver the message to an implementation-defined mailbox. If the 1006 implementation uses a different encoding scheme than UTF-8 for 1007 mailbox names, it SHOULD reencode the mailbox name from UTF-8 to its 1008 encoding scheme. For example, the Internet Message Access Protocol 1009 [IMAP] uses modified UTF-7, such that a mailbox argument of "odds & 1010 ends" would appear in IMAP as "odds &- ends". 1012 The capability string for use with the require command is "fileinto". 1014 In the following script, message A is filed into mailbox 1015 "INBOX.harassment". 1017 Example: require "fileinto"; 1018 if header :contains ["from"] "coyote" { 1019 fileinto "INBOX.harassment"; 1020 } 1022 4.2. Action redirect 1024 Usage: redirect 1026 The "redirect" action is used to send the message to another user at 1027 a supplied address, as a mail forwarding feature does. The 1028 "redirect" action makes no changes to the message body or existing 1029 headers, but it may add new headers. In particular, existing 1030 Received headers MUST be preserved and the count of Received headers 1031 in the outgoing message MUST be larger than the same count on the 1032 message as received by the implementation. (An implementation that 1033 adds a Received header before processing the message does not need to 1034 add another when redirecting.) 1036 The message is sent back out with the address from the redirect 1037 command as an envelope recipient. Implementations MAY combine 1038 separate redirects for a given message into a single submission with 1039 multiple envelope recipients. (This is not an MUA-style forward, 1040 which creates a new message with a different sender and message ID, 1041 wrapping the old message in a new one.) 1043 The envelope sender address on the outgoing message is chosen by the 1044 Sieve implementation.
It MAY be copied from the message being 1045 processed. 1047 A simple script can be used for redirecting all mail: 1049 Example: redirect "bart@example.com"; 1051 Implementations SHOULD take measures to implement loop control, 1052 possibly including adding headers to the message or counting Received 1053 headers. If an implementation detects a loop, it causes an error. 1055 4.3. Action keep 1057 Usage: keep 1059 The "keep" action is whatever action is taken in lieu of all other 1060 actions, if no filtering happens at all; generally, this simply means 1061 to file the message into the user's main mailbox. This command 1062 provides a way to execute this action without needing to know the 1063 name of the user's main mailbox, providing a way to call it without 1064 needing to understand the user's setup, or the underlying mail 1065 system. 1067 For instance, in an implementation where the IMAP server is running 1068 scripts on behalf of the user at time of delivery, a keep command is 1069 equivalent to a fileinto "INBOX". 1071 Example: if size :under 1M { keep; } else { discard; } 1073 Note that the above script is identical to the one below. 1075 Example: if not size :under 1M { discard; } 1077 4.4. Action discard 1079 Usage: discard 1081 Discard is used to silently throw away the message. It does so by 1082 simply canceling the implicit keep. If discard is used with other 1083 actions, the other actions still happen. Discard is compatible with 1084 all other actions. (For instance fileinto+discard is equivalent to 1085 fileinto.) 1087 Discard MUST be silent; that is, it MUST NOT return a non-delivery 1088 notification of any kind ([DSN], [MDN], or otherwise). 1090 In the following script, any mail from "idiot@example.com" is thrown 1091 out. 
1093 Example: if header :contains ["from"] ["idiot@example.com"] { 1094 discard; 1095 } 1097 While an important part of this language, "discard" has the potential 1098 to create serious problems for users: Students who leave themselves 1099 logged in to an unattended machine in a public computer lab may find 1100 their script changed to just "discard". In order to protect users in 1101 this situation (along with similar situations), implementations MAY 1102 keep messages destroyed by a script for an indefinite period, and MAY 1103 disallow scripts that throw out all mail. 1105 5. Test Commands 1107 Tests are used in conditionals to decide which part(s) of the 1108 conditional to execute. 1110 Implementations MUST support these tests: "address", "allof", 1111 "anyof", "exists", "false", "header", "not", "size", and "true". 1113 Implementations SHOULD support the "envelope" test. 1115 5.1. Test address 1117 Usage: address [COMPARATOR] [ADDRESS-PART] [MATCH-TYPE] 1118 1120 The "address" test matches Internet addresses in structured headers 1121 that contain addresses. It returns true if any header contains any 1122 key in the specified part of the address, as modified by the 1123 comparator and the match keyword. Whether there are other addresses 1124 present in the header doesn't affect this test; this test does not 1125 provide any way to determine whether an address is the only address 1126 in a header. 1128 Like envelope and header, this test returns true if any combination 1129 of the header-list and key-list arguments match and false otherwise. 1131 Internet email addresses [IMAIL] have the somewhat awkward 1132 characteristic that the local-part to the left of the at-sign is 1133 considered case sensitive, and the domain-part to the right of the 1134 at-sign is case insensitive. The "address" command does not deal 1135 with this itself, but provides the ADDRESS-PART argument for allowing 1136 users to deal with it. 
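   As a hedged illustration (the address "tim@example.com" is only a
   sample value), a script author who wants the local-part compared
   case-sensitively and the domain compared case-insensitively could
   combine two "address" tests, each selecting one ADDRESS-PART and its
   own comparator.  Each test is evaluated independently, so when a
   header carries several addresses the two tests may be satisfied by
   different addresses.

   Example:  if allof (address :localpart :comparator "i;octet"
                               :is "from" "tim",
                       address :domain :comparator "i;ascii-casemap"
                               :is "from" "example.com") {
                 discard;
             }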
1138 The address primitive never acts on the phrase part of an email 1139 address, nor on comments within that address. It also never acts on 1140 group names, although it does act on the addresses within the group 1141 construct. 1143 Implementations MUST restrict the address test to headers that 1144 contain addresses, but MUST include at least From, To, Cc, Bcc, 1145 Sender, Resent-From, Resent-To, and SHOULD include any other header 1146 that utilizes an "address-list" structured header body. 1148 Example: if address :is :all "from" "tim@example.com" { 1149 discard; 1150 } 1152 5.2. Test allof 1154 Usage: allof 1156 The "allof" test performs a logical AND on the tests supplied to it. 1158 Example: allof (false, false) => false 1159 allof (false, true) => false 1160 allof (true, true) => true 1162 The allof test takes as its argument a test-list. 1164 5.3. Test anyof 1166 Usage: anyof 1168 The "anyof" test performs a logical OR on the tests supplied to it. 1170 Example: anyof (false, false) => false 1171 anyof (false, true) => true 1172 anyof (true, true) => true 1174 5.4. Test envelope 1176 Usage: envelope [COMPARATOR] [ADDRESS-PART] [MATCH-TYPE] 1177 1179 The "envelope" test is true if the specified part of the [SMTP] (or 1180 equivalent) envelope matches the specified key. This specification 1181 defines the interpretation of the (case insensitive) "from" and "to" 1182 envelope-parts. Additional envelope-parts may be defined by other 1183 extensions; implementations SHOULD consider unknown envelope parts an 1184 error. 1186 If one of the envelope-part strings is (case insensitive) "from", 1187 then matching occurs against the FROM address used in the SMTP MAIL 1188 command. The null reverse-path is matched against as the empty 1189 string, regardless of the ADDRESS-PART argument specified. 
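   As a sketch under stated assumptions (the mailbox name "Bounces" is
   hypothetical), the rule above means that a message sent with a null
   reverse-path can be recognized by comparing the "from" envelope-part
   against the empty string.  Because the null reverse-path is matched
   as the empty string regardless of ADDRESS-PART, no ":localpart" or
   ":domain" argument is needed.

   Example:  require ["envelope", "fileinto"];
             if envelope :is "from" "" {
                 fileinto "Bounces";
             }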
1191 If one of the envelope-part strings is (case insensitive) "to", then 1192 matching occurs against the TO address used in the SMTP RCPT command 1193 that resulted in this message getting delivered to this user. Note 1194 that only the most recent TO is available, and only the one relevant 1195 to this user. 1197 The envelope-part is a string list and may contain more than one 1198 parameter, in which case all of the strings specified in the key-list 1199 are matched against all parts given in the envelope-part list. 1201 Like address and header, this test returns true if any combination of 1202 the envelope-part list and key-list arguments match and false 1203 otherwise. 1205 All tests against envelopes MUST drop source routes. 1207 If the SMTP transaction involved several RCPT commands, only the data 1208 from the RCPT command that caused delivery to this user is available 1209 in the "to" part of the envelope. 1211 If a protocol other than SMTP is used for message transport, 1212 implementations are expected to adapt this command appropriately. 1214 The envelope command is optional. Implementations SHOULD support it, 1215 but the necessary information may not be available in all cases. The 1216 capability string for use with the require command is "envelope". 1218 Example: require "envelope"; 1219 if envelope :all :is "from" "tim@example.com" { 1220 discard; 1221 } 1223 5.5. Test exists 1225 Usage: exists 1227 The "exists" test is true if the headers listed in the header-names 1228 argument exist within the message. All of the headers must exist or 1229 the test is false. 1231 The following example throws out mail that doesn't have a From header 1232 and a Date header. 1234 Example: if not exists ["From","Date"] { 1235 discard; 1236 } 1238 5.6. Test false 1240 Usage: false 1242 The "false" test always evaluates to false. 1244 5.7. 
Test header 1246 Usage: header [COMPARATOR] [MATCH-TYPE] 1247 1249 The "header" test evaluates to true if the value of any of the named 1250 headers, ignoring leading and trailing whitespace, matches any key. 1251 The type of match is specified by the optional match argument, which 1252 defaults to ":is" if not specified, as specified in section 2.6. 1254 Like address and envelope, this test returns true if any combination 1255 of the header-names list and key-list arguments match and false 1256 otherwise. 1258 If a header listed in the header-names argument exists, it contains 1259 the empty key (""). However, if the named header is not present, it 1260 does not match any key, including the empty key. So if a message 1261 contained the header 1263 X-Caffeine: C8H10N4O2 1265 these tests on that header evaluate as follows: 1267 header :is ["X-Caffeine"] [""] => false 1268 header :contains ["X-Caffeine"] [""] => true 1270 Testing whether a given header is either absent or doesn't contain 1271 any non-whitespace characters can be done using a negated "header" 1272 test: 1274 not header :matches "Cc" "?*" 1276 5.8. Test not 1278 Usage: not 1280 The "not" test takes some other test as an argument, and yields the 1281 opposite result. "not false" evaluates to "true" and "not true" 1282 evaluates to "false". 1284 5.9. Test size 1286 Usage: size <":over" / ":under"> 1288 The "size" test deals with the size of a message. It takes either a 1289 tagged argument of ":over" or ":under", followed by a number 1290 representing the size of the message. 1292 If the argument is ":over", and the size of the message is greater 1293 than the number provided, the test is true; otherwise, it is false. 1295 If the argument is ":under", and the size of the message is less than 1296 the number provided, the test is true; otherwise, it is false. 1298 Exactly one of ":over" or ":under" must be specified, and anything 1299 else is an error. 
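   Since ":over" means strictly greater than and ":under" means strictly
   less than, a script that must handle every possible size may need a
   third branch for a message whose size equals the number exactly.  A
   minimal sketch, in which the threshold of 4000 octets and the mailbox
   names "Large" and "Small" are assumed purely for illustration:

   Example:  require "fileinto";
             if size :over 4000 {
                 fileinto "Large";
             } elsif size :under 4000 {
                 fileinto "Small";
             } else {
                 keep;    # reached only when the size is exactly 4000
             }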
1301 The size of a message is defined to be the number of octets in the 1302 [IMAIL] representation of the message. 1304 Note that for a message that is exactly 4000 octets, the message is 1305 neither ":over" 4000 octets nor ":under" 4000 octets. 1307 5.10. Test true 1309 Usage: true 1311 The "true" test always evaluates to true. 1313 6. Extensibility 1315 New control commands, actions, and tests can be added to the 1316 language. Sites must make these features known to their users; this 1317 document does not define a way to discover the list of extensions 1318 supported by the server. 1320 Any extensions to this language MUST define a capability string that 1321 uniquely identifies that extension. Capability strings are case- 1322 sensitive; for example, "foo" and "FOO" are different capabilities. 1323 If a new version of an extension changes the functionality of a 1324 previously defined extension, it MUST use a different name. 1325 Extensions may register a set of related capabilities by registering 1326 just a unique prefix for them. The "comparator-" prefix is an 1327 example of this. The prefix MUST end with a "-" and MUST NOT overlap 1328 any existing registrations. 1330 In a situation where there is a script submission protocol and an 1331 extension advertisement mechanism aware of the details of this 1332 language, scripts submitted can be checked against the mail server to 1333 prevent use of an extension that the server does not support. 1335 Extensions MUST state how they interact with constraints defined in 1336 section 2.10, e.g., whether they cancel the implicit keep, and which 1337 actions they are compatible and incompatible with. Extensions MUST 1338 NOT change the behavior of the "require" control command or alter the 1339 interpretation of the argument to the "require" control.
1341 Extensions that can submit new email messages or otherwise generate 1342 new protocol requests MUST consider loop suppression, at least to 1343 document any security considerations. 1345 6.1. Capability String 1347 Capability strings are typically short strings describing what 1348 capabilities are supported by the server. 1350 Capability strings beginning with "vnd." represent vendor-defined 1351 extensions. Such extensions are not defined by Internet standards or 1352 RFCs, but are still registered with IANA in order to prevent 1353 conflicts. Extensions starting with "vnd." SHOULD be followed by the 1354 name of the vendor and product, such as "vnd.acme.rocket-sled". 1356 The following capability strings are defined by this document: 1358 encoded-character 1359 The string "encoded-character" indicates that the 1360 implementation supports the interpretation of 1361 "${hex:...}" and "${unicode:...}" in strings. 1363 envelope The string "envelope" indicates that the implementation 1364 supports the "envelope" command. 1366 fileinto The string "fileinto" indicates that the implementation 1367 supports the "fileinto" command. 1369 comparator- The string "comparator-elbonia" is provided if the 1370 implementation supports the "elbonia" comparator. 1371 Therefore, all implementations have at least the 1372 "comparator-i;octet" 1373 and "comparator-i;ascii-casemap" capabilities. However, 1374 these comparators may be used without being declared 1375 with require. 1377 6.2. IANA Considerations 1379 In order to provide a standard set of extensions, a registry is 1380 provided by IANA. Capability names may be registered on a first- 1381 come, first-served basis. Extensions designed for interoperable use 1382 SHOULD be defined as standards track or IESG approved experimental 1383 RFCs. Registration of capability prefixes that do not begin with 1384 "vnd." REQUIRES a standards track or IESG approved experimental RFC. 1386 6.2.1. 
Template for Capability Registrations 1388 The following template is to be used for registering new Sieve 1389 extensions with IANA. 1391 To: iana@iana.org 1392 Subject: Registration of new Sieve extension 1394 Capability name: [the string for use in the 'require' statement] 1395 Description: [a brief description of what the extension adds 1396 or changes] 1397 RFC number: [for extensions published as RFCs] 1398 Contact address: [email and/or physical address to contact for 1399 additional information] 1401 6.2.2. Handling of Existing Capability Registrations 1403 In order to bring the existing capability registrations in line with 1404 the new template, IANA is asked to modify each as follows: 1406 1. The "capability name" and "capability arguments" fields 1407 should be eliminated 1408 2. The "capability keyword" field should be renamed to "Capability 1409 name" 1410 3. An empty "Description" field should be added 1411 4. The "Standards Track/IESG-approved experimental RFC number" field 1412 should be renamed to "RFC number" 1413 5. The "Person and email address to contact for further information" 1414 field should be renamed to "Contact address" 1416 6.2.3. Initial Capability Registrations 1418 This RFC updates the following entries in the IANA registry for 1419 Sieve extensions.

1421     Capability name: encoded-character
1422     Description:     changes the interpretation of strings to allow
1423                      arbitrary octets and Unicode characters to be
1424                      represented using US-ASCII
1425     RFC number:      this RFC (Sieve base spec)
1426     Contact address: The Sieve discussion list

1428     Capability name: fileinto
1429     Description:     adds the 'fileinto' action for delivering to a
1430                      mailbox other than the default
1431     RFC number:      this RFC (Sieve base spec)
1432     Contact address: The Sieve discussion list

1433     Capability name: envelope
1434     Description:     adds the 'envelope' test for testing the message
1435                      transport sender and recipient address
1436     RFC number:      this RFC (Sieve base spec)
1437     Contact address: The Sieve discussion list

1439     Capability name: comparator-* (anything starting with "comparator-")
1440     Description:     adds the indicated comparator for use with the
1441                      :comparator argument
1442     RFC number:      this RFC (Sieve base spec) and [COLLATION]
1443     Contact address: The Sieve discussion list

1445  6.3.  Capability Transport

1447     As the range of mail systems that this document is intended to apply
1448     to is quite varied, a method of advertising which capabilities an
1449     implementation supports is difficult due to the wide range of
1450     possible implementations. Such a mechanism, however, should have the
1451     property that the implementation can advertise the complete set of
1452     extensions that it supports.

1454  7.  Transmission

1456     The [MIME] type for a Sieve script is "application/sieve".

1458     The registration of this type for RFC 2048 requirements is updated as
1459     follows:

1461     Subject: Registration of MIME media type application/sieve

1463     MIME media type name: application
1464     MIME subtype name: sieve
1465     Required parameters: none
1466     Optional parameters: none
1467     Encoding considerations: Most sieve scripts will be textual,
1468        written in UTF-8. When non-7bit characters are used,
1469        quoted-printable is appropriate for transport systems
1470        that require 7bit encoding.

1472     Security considerations: Discussed in section 10 of this RFC.
1473     Interoperability considerations: Discussed in section 2.10.5
1474        of this RFC.
1475     Published specification: this RFC.
1476     Applications which use this media type: sieve-enabled mail
1477        servers and clients
1478     Additional information:
1479       Magic number(s):
1480       File extension(s): .siv .sieve
1481       Macintosh File Type Code(s):
1482     Person & email address to contact for further information:
1483        See the discussion list at ietf-mta-filters@imc.org.
1484     Intended usage:
1485        COMMON
1486     Author/Change controller:
1487        See Editor information in this RFC.

1489  8.  Parsing

1491     The Sieve grammar is separated into tokens and a separate grammar as
1492     most programming languages are.

1494  8.1.  Lexical Tokens

1496     Sieve scripts are encoded in UTF-8. The following assumes a valid
1497     UTF-8 encoding; special characters in Sieve scripts are all US-ASCII.

1499     The following are tokens in Sieve:

1501             - identifiers
1502             - tags
1503             - numbers
1504             - quoted strings
1505             - multi-line strings
1506             - other separators

1508     Identifiers, tags, and numbers are case-insensitive, while quoted
1509     strings and multi-line strings are case-sensitive.

1511     Blanks, horizontal tabs, CRLFs, and comments ("white space") are
1512     ignored except as they separate tokens. Some white space is required
1513     to separate otherwise adjacent tokens and in specific places in the
1514     multi-line strings. CR and LF can only appear in CRLF pairs.

1516     The other separators are single individual characters, and are
1517     mentioned explicitly in the grammar.

1519     The lexical structure of sieve is defined in the following grammar
1520     (as described in [ABNF]):

1522     bracket-comment    = "/*" *not-star 1*STAR
1523                          *(not-star-slash *not-star 1*STAR) "/"
1524                            ; No */ allowed inside a comment.
1525                            ; (No * is allowed unless it is the last
1526                            ; character, or unless it is followed by a
1527                            ; character that isn't a slash.)

1529     comment            = bracket-comment / hash-comment

1531     hash-comment       = "#" *octet-not-crlf CRLF

1533     identifier         = (ALPHA / "_") *(ALPHA / DIGIT / "_")

1535     multi-line         = "text:" *(SP / HTAB) (hash-comment / CRLF)
1536                          *(multiline-literal / multiline-dotstuff)
1537                          "." CRLF

1539     multiline-literal  = [ octet-not-period *octet-not-crlf ] CRLF

1541     multiline-dotstuff = "." 1*octet-not-crlf CRLF
1542                            ; A line containing only "." ends the
1543                            ; multi-line. Remove a leading '.' if
1544                            ; followed by another '.'.

1546     not-star           = CRLF / %x01-09 / %x0B-0C / %x0E-29 / %x2B-FF
1547                            ; either a CRLF pair, OR a single octet
1548                            ; other than NUL, CR, LF, or star

1550     not-star-slash     = CRLF / %x01-09 / %x0B-0C / %x0E-29 / %x2B-2E /
1551                          %x30-FF
1552                            ; either a CRLF pair, OR a single octet
1553                            ; other than NUL, CR, LF, star, or slash

1555     number             = 1*DIGIT [ QUANTIFIER ]

1557     octet-not-crlf     = %x01-09 / %x0B-0C / %x0E-FF
1558                            ; a single octet other than NUL, CR, or LF

1560     octet-not-period   = %x01-09 / %x0B-0C / %x0E-2D / %x2F-FF
1561                            ; a single octet other than NUL,
1562                            ; CR, LF, or period

1564     octet-not-qspecial = %x01-09 / %x0B-0C / %x0E-21 / %x23-5B / %x5D-FF
1565                            ; a single octet other than NUL,
1566                            ; CR, LF, double-quote, or backslash

1568     QUANTIFIER         = "K" / "M" / "G"

1570     quoted-other       = "\" octet-not-qspecial
1571                            ; represents just the octet-not-qspecial
1572                            ; character. SHOULD NOT be used

1574     quoted-safe        = CRLF / octet-not-qspecial
1575                            ; either a CRLF pair, OR a single octet other
1576                            ; than NUL, CR, LF, double-quote, or backslash

1578     quoted-special     = "\" (DQUOTE / "\")
1579                            ; represents just a double-quote or backslash

1581     quoted-string      = DQUOTE quoted-text DQUOTE

1583     quoted-text        = *(quoted-safe / quoted-special / quoted-other)

1585     STAR               = "*"

1587     tag                = ":" identifier

1589     white-space        = 1*(SP / CRLF / HTAB) / comment

1591  8.2.  Grammar

1593     The following is the grammar of Sieve after it has been lexically
1594     interpreted. No white space or comments appear below. The start
1595     symbol is "start". Non-terminals for MATCH-TYPE, COMPARATOR, and
1596     ADDRESS-PART are provided for use by extensions.

1598     ADDRESS-PART = ":localpart" / ":domain" / ":all"

1600     argument     = string-list / number / tag

1602     arguments    = *argument [ test / test-list ]

1604     block        = "{" commands "}"

1606     command      = identifier arguments (";" / block)

1608     commands     = *command

1610     COMPARATOR   = ":comparator" string

1612     MATCH-TYPE   = ":is" / ":contains" / ":matches"

1614     start        = commands

1616     string       = quoted-string / multi-line

1618     string-list  = "[" string *("," string) "]" / string
1619                      ; if there is only a single string, the brackets
1620                      ; are optional

1622     test         = identifier arguments

1624     test-list    = "(" test *("," test) ")"

1626  9.  Extended Example

1628     The following is an extended example of a Sieve script. Note that it
1629     does not make use of the implicit keep.

1631     #
1632     # Example Sieve Filter
1633     # Declare any optional features or extension used by the script
1634     #
1635     require ["fileinto"];

1637     #
1638     # Handle messages from known mailing lists
1639     # Move messages from IETF filter discussion list to filter mailbox
1640     #
1641     if header :is "Sender" "owner-ietf-mta-filters@imc.org"
1642             {
1643             fileinto "filter";  # move to "filter" mailbox
1644             }
1645     #
1646     # Keep all messages to or from people in my company
1647     #
1648     elsif address :DOMAIN :is ["From", "To"] "example.com"
1649             {
1650             keep;               # keep in "In" mailbox
1651             }

1653     #
1654     # Try and catch unsolicited email. If a message is not to me,
1655     # or it contains a subject known to be spam, file it away.
1656     #
1657     elsif anyof (NOT address :all :contains
1658                    ["To", "Cc", "Bcc"] "me@example.com",
1659                  header :matches "subject"
1660                    ["*make*money*fast*", "*university*dipl*mas*"])
1661             {
1662             fileinto "spam";   # move to "spam" mailbox
1663             }
1664     else
1665             {
1666             # Move all other (non-company) mail to "personal"
1667             # mailbox.
1668             fileinto "personal";
1669             }

1671  10.  Security Considerations

1673     Users must get their mail. It is imperative that whatever method
1674     implementations use to store the user-defined filtering scripts be
1675     secure.

1677     It is equally important that implementations sanity-check the user's
1678     scripts, and not allow users to create on-demand mailbombs. For
1679     instance, an implementation that allows a user to redirect a message
1680     multiple times might also allow a user to create a mailbomb triggered
1681     by mail from a specific user. Site- or implementation-defined limits
1682     on actions are useful for this.

1684     Several commands, such as "discard", "redirect", and "fileinto" allow
1685     for actions to be taken that are potentially very dangerous.

1687     The "redirect" command has considerations regarding loop prevention;
1688     see the command description for recommendations.

1690     Use of the "redirect" command to generate notifications may easily
1691     overwhelm the target address, especially if it was not designed to
1692     handle large messages.

1694     Implementations SHOULD take measures to prevent scripts from looping.

1696     As with any filter on a message stream, if the sieve implementation
1697     and the mail agents 'behind' sieve in the message stream differ in
1698     their interpretation of the messages, it may be possible for an
1699     attacker to subvert the filter. Of particular note are differences
1700     in the interpretation of malformed messages (e.g., missing or extra
1701     syntax characters) or those that exhibit corner cases (e.g., NUL
1702     octets encoded via [MIME3]).

1704  11.  Acknowledgments

1706     This document has been revised in part based on comments and
1707     discussions that took place on and off the SIEVE mailing list.
1708     Thanks to Sharon Chisholm, Cyrus Daboo, Ned Freed, Arnt Gulbrandsen,
1709     Michael Haardt, Kjetil Torgrim Homme, Barry Leiba, Mark E. Mallett,
1710     Alexey Melnikov, Eric Rescorla, Rob Siemborski, and Nigel Swinson for
1711     reviews and suggestions.

1713  12.  Editors' Addresses

1715     Philip Guenther
1716     Sendmail, Inc.
1717     6425 Christie St. Ste 400
1718     Emeryville, CA 94608
1719     Email: guenther@sendmail.com

1721     Tim Showalter
1722     Email: tjs@psaux.com

1724  13.  Normative References

1726     [ABNF]      D. Crocker, Ed., P. Overell "Augmented BNF for Syntax
1727                 Specifications: ABNF", RFC 4234, October 2005.

1729     [COLLATION] Newman, C., Duerst, M., and A. Gulbrandsen "Internet
1730                 Application Protocol Collation Registry" draft-
1731                 newman-i18n-comparator-07.txt (work in progress),
1732                 March 2006.

1734     [IMAIL]     P. Resnick, Ed., "Internet Message Format", RFC 2822,
1735                 April 2001.

1737     [KEYWORDS]  Bradner, S., "Key words for use in RFCs to Indicate
1738                 Requirement Levels", BCP 14, RFC 2119, March 1997.

1740     [MIME]      Freed, N. and N. Borenstein, "Multipurpose Internet Mail
1741                 Extensions (MIME) Part One: Format of Internet
1742                 Message Bodies", RFC 2045, November 1996.

1744     [MIME3]     Moore, K., "MIME (Multipurpose Internet Mail Extensions)
1745                 Part Three: Message Header Extensions for Non-ASCII
1746                 Text", RFC 2047, November 1996.

1748     [SMTP]      J. Klensin, Ed., "Simple Mail Transfer Protocol", RFC
1749                 2821, April 2001.

1751     [UTF-8]     Yergeau, F., "UTF-8, a transformation format of ISO
1752                 10646", RFC 3629, November 2003.

1754  14.  Informative References

1756     [BINARY-SI] "Standard IEC 60027-2: Letter symbols to be used in
1757                 electrical technology - Part 2: Telecommunications and
1758                 electronics", January 1999.

1760     [DSN]       Moore, K. and G.
Vaudreuil, "An Extensible Message Format
1761                 for Delivery Status Notifications", RFC 3464, January
1762                 2003.

1764     [FLAMES]    Borenstein, N, and C. Thyberg, "Power, Ease of Use, and
1765                 Cooperative Work in a Practical Multimedia Message
1766                 System", Int. J. of Man-Machine Studies, April, 1991.
1767                 Reprinted in Computer-Supported Cooperative Work and
1768                 Groupware, Saul Greenberg, editor, Harcourt Brace
1769                 Jovanovich, 1991. Reprinted in Readings in Groupware and
1770                 Computer-Supported Cooperative Work, Ronald Baecker,
1771                 editor, Morgan Kaufmann, 1993.

1773     [IMAP]      Crispin, M., "Internet Message Access Protocol - version
1774                 4rev1", RFC 3501, March 2003.

1776     [MDN]       T. Hansen, Ed., G. Vaudreuil, Ed., "Message Disposition
1777                 Notification", RFC 3798, May 2004.

1779     [RFC3028]   Showalter, T., "Sieve: A Mail Filtering Language", RFC
1780                 3028, January 2001.

1782  15.  Changes from RFC 3028

1784     The following list is a summary of the changes that have been made
1785     in the Sieve language base specification from [RFC3028].

1787     1. Removed ban on tests having side-effects
1788     2. Removed reject extension (will be specified in a separate RFC)
1789     3. Clarified description of comparators to match [COLLATION], the
1790        new base specification for them
1791     4. Require stripping of leading and trailing whitespace in
1792        "header" test
1793     5. Clarified or tightened handling of many minor items, including:
1794         - invalid [MIME3] encoding
1795         - invalid addresses in headers
1796         - invalid header field names in tests
1797         - 'undefined' comparator result
1798         - unknown envelope parts
1799         - null return-path in "envelope" test
1800     6. Capability strings are case-sensitive
1801     7. Clarified that fileinto should reencode non-ASCII mailbox
1802        names to match the mailstore's conventions
1803     8. Errors in the ABNF were corrected
1804     9. The references were updated and split into normative and
1805        informative

1807  16.  Full Copyright Statement

1809     Copyright (C) The IETF Trust (2007).

1811     This document is subject to the rights, licenses and restrictions
1812     contained in BCP 78, and except as set forth therein, the authors
1813     retain all their rights.

1815     This document and the information contained herein are provided on an
1816     "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
1817     OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
1818     THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
1819     OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
1820     THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
1821     WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

1823  Intellectual Property

1825     The IETF takes no position regarding the validity or scope of any
1826     Intellectual Property Rights or other rights that might be claimed to
1827     pertain to the implementation or use of the technology described in
1828     this document or the extent to which any license under such rights
1829     might or might not be available; nor does it represent that it has
1830     made any independent effort to identify any such rights. Information
1831     on the procedures with respect to rights in RFC documents can be
1832     found in BCP 78 and BCP 79.

1834     Copies of IPR disclosures made to the IETF Secretariat and any
1835     assurances of licenses to be made available, or the result of an
1836     attempt made to obtain a general license or permission for the use of
1837     such proprietary rights by implementers or users of this
1838     specification can be obtained from the IETF on-line IPR repository at
1839     http://www.ietf.org/ipr.

1841     The IETF invites any interested party to bring to its attention any
1842     copyrights, patents or patent applications, or other proprietary
1843     rights that may cover technology that may be required to implement
1844     this standard.
Please address the information to the IETF at ietf-
1845     ipr@ietf.org.

1847  Acknowledgement

1849     Funding for the RFC Editor function is currently provided by the IETF
1850     Administrative Support Activity (IASA).

1852  Appendix A.  Change History

1854     This section will be removed when this document leaves the Internet-
1855     Draft stage.

1857     Changes from draft-ietf-sieve-3028bis-11.txt
1858     1. Correct typo in boilerplate
1859     2. Update [DSN] reference to RFC 3464

1861     Changes from draft-ietf-sieve-3028bis-10.txt
1862     1. Clarify how the "redirect" action uses the address argument
1863     2. Eliminate the phrase "original message"
1864     3. If an outbound address doesn't match the syntax, it's an error

1866     Changes from draft-ietf-sieve-3028bis-09.txt
1867     1. [MDN] reference is merely informative
1868     2. Whitespace tweaks in the ABNF
1869     3. Extensions can't change "require"
1870     4. fileinto a nonexistent mailbox is implementation defined behavior
1871     5. Clarify the definition of the size of a message
1872     6. Make the KEYWORDS boilerplate match expectations
1873     7. Add the encoded-character extension
1874     8. Remove duplication in text regarding unknown extensions
1875     9. Address security concerns about looping with redirect or other
1876        extensions
1877     10. Valid numbers include zero
1878     11. Various changes suggested by the gen-art reviewer
1879     12. Removed references to the Halting Problem. Humor is dead
1880     13. Clarify which tokens are case-insensitive and which are
1881         case-sensitive; use the 'unexpected' case in several examples
1882     14. Add .sieve as an extension for the application/sieve MIME type
1883     15. Permit registration of capability prefixes (like "comparator-"),
1884         but require an IESG approved RFC when they're outside the
1885         "vnd." 'namespace'
1886     16. Replace "example.edu" with "example.com"
1887     17. Update boilerplate
1888     18. Updated page numbers in table of contents

1890     Changes from draft-ietf-sieve-3028bis-08.txt
1891     1. [RFC3028] reference is merely informative
1892     2. String lists are literal data
1893     3. Tagged and optional arguments can take any sort of literal data
1894        as arguments
1895     4. Change "folder" to "mailbox" throughout
1896     5. Added more items to the "Changes from RFC 3028" list
1897     6. A multi-line string includes the CRLF before the final dot

1899     Changes from draft-ietf-sieve-3028bis-07.txt
1900     1. Improve description in the extension registrations
1901     2. Give IANA directions on how to massage existing registrations
1902        into the new form
1903     3. Added "Changes from RFC 3028" section
1904     4. Updated page numbers in table of contents
1905     5. Permit non-UTF-8 octet sequences in comments
1906     6. It's an error to use conflicting or repeated tagged and optional
1907        arguments
1908     7. Update description of script encoding

1910     Changes from draft-ietf-sieve-3028bis-06.txt
1911     1. Tweak wording of how :matches uses character definition
1912        of comparator
1913     2. Add security consideration regarding "redirect" as a notification
1914        method
1915     3. fileinto SHOULD reencode; mention IMAP's mUTF-7
1916     4. en;ascii-casemap is gone; switch back to i;ascii-casemap
1917     5. Permit non-UTF-8 octet sequences in strings
1918     6. Sort grammar non-terminals
1919     7. Syntactically invalid addresses don't match :localpart or :domain
1920     8. The null return-path has empty address parts
1921     9. Treat comparator result of "undefined" the same as "no-match"
1922     10. Envelope sender on redirects is implementation defined
1923     11. Change IANA registration template

1925     Changes from draft-ietf-sieve-3028bis-05.txt
1926     1. The specifics of what names are acceptable for fileinto and
1927        the handling of invalid names are both implementation-defined
1928     2. Update to draft-newman-i18n-comparator-07.txt
1929     3. Adjust the example in 5.7 again

1931     Changes from draft-ietf-sieve-3028bis-04.txt
1932     1. Change "Syntax:" to "Usage:"
1933     2. Update ABNF reference to RFC 4234
1934     3. Add non-terminals for MATCH-TYPE, COMPARATOR, and ADDRESS-PART
1935     4. Strip leading and trailing whitespace in the value being matched
1936        by header
1937     5. Collations operate on octets, not characters, and for character
1938        data that is the UTF-8 encoding of the Unicode characters
1939     6. :matches uses character definition of comparator

1941     Changes from draft-ietf-sieve-3028bis-03.txt
1942     1. Remove section 2.4.2.4., MIME Parts, as unreferenced
1943     2. Update to draft-newman-i18n-comparator-04.txt
1944     3. Various tweaks to examples and syntax lines
1945     4. Define "control structure" as a control command with a block
1946        argument, then use it consistently. Reword description of
1947        blocks to match

1949     5. Clarify that "header" can never match an absent header and give
1950        the preferred way to test for absent or empty
1951     6. Invalid header name syntax is not an error _in tests_ (but could
1952        be elsewhere)
1953     7. Implementation SHOULD consider unknown envelope parts an error
1954     8. Remove explicit "omitted" option from 2.7.2p2

1956     Changes from draft-ietf-sieve-3028bis-02.txt
1957     1. Change "ASCII" to "US-ASCII" throughout
1958     2. Tweak section 2.7.2 to not require use of UTF-8 internally and
1959        to explicitly leave implementation-defined the handling of text
1960        that can't be converted to Unicode
1961     3. Add reference to RFC 2047
1962     4. Clarify that capability strings are case-sensitive
1963     5. Clarify that address, envelope, and header return false if no
1964        combination of arguments match
1965     6. Directly state that code that isn't reached may still be checked
1966        for errors
1967     7. Invalid header name syntax is not an error
1968     8. Remove description of header unfolding that conflicts with
1969        [IMAIL]
1970     9. Warn that filters may be subvertable if agents interpret messages
1971        differently
1972     10. Encoded NUL octets SHOULD NOT cause truncation

1974     Changes from draft-ietf-sieve-3028bis-01.txt
1975     1. Remove ban on side effects
1976     2. Remove definition of the 'reject' action, as it is being moved
1977        to the doc that also defines the 'refuse' action
1978     3. Update capability registrations to reference the mailing list
1979     4. Add Tim back as an editor
1980     5. Refer to the zero-length string ("") as "empty" instead of
1981        "null"

1983     Changes from draft-ietf-sieve-3028bis-00.txt
1984     1. More grammar corrections:
1985        - permit /***/,
1986        - remove ambiguity in finding end of bracket comment,
1987        - require valid UTF-8,
1988        - express quoting in the grammar
1989        - ban bare CR and LF in all locations
1990     2. Correct a bunch of whitespace and linewrapping nits
1991     3. Update IMAIL and SMTP references to RFC 2822 and RFC 2821
1992     4. Require support for en;ascii-casemap comparator as well as the
1993        old i;ascii-casemap. As with the old one, you do not need to
1994        use 'require' to use the new comparator
1995     5. Update IANA considerations to update the existing registrations
1996        to point at this doc instead of 3028

1998     6. Scripts SHOULD NOT contain superfluous backslashes
1999     7. Update Acknowledgments

2001     Changes from RFC 3028
2002     1. Split references into normative and informative
2003     2. Update references to current versions of DSN, IMAP, MDN, and
2004        UTF-8 RFCs
2005     3. Replace "e-mail" with "email"
2006     4. Incorporate RFC 3028 errata
2007     5. The "reject" action cancels the implicit keep
2008     6. Replace references to ACAP with references to the i18n-comparator
2009        draft. Further work is needed to completely sync with that
2010        draft
2011     7. Start to update grammar to only permit legal UTF-8 (incomplete)
2012        and correct various other errors and typos
2013     8. Update IPR boilerplate to RFC 3978/3979
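
   Informal illustration (not part of the specification): two of the
   lexical rules in section 8.1, the backslash quoting in quoted-string
   and the dot-stuffing in multi-line, can be sketched in Python as
   shown below. The function names unquote and decode_multiline are
   hypothetical. The sketch assumes the caller has already isolated one
   token (for unquote) or has split the lines that follow the "text:"
   line and removed their CRLFs (for decode_multiline). It does not
   check for NUL octets or for bare CR and LF.

```python
from typing import Iterable


def unquote(token: str) -> str:
    """Decode one quoted-string token, surrounding double-quotes included."""
    if len(token) < 2 or token[0] != '"' or token[-1] != '"':
        raise ValueError("not a quoted-string token")
    body = token[1:-1]
    out = []
    i = 0
    while i < len(body):
        if body[i] == '"':
            raise ValueError("unescaped double-quote inside quoted-string")
        if body[i] == "\\":
            # quoted-special and quoted-other: the backslash is dropped and
            # the octet that follows it stands for itself.
            i += 1
            if i == len(body):
                raise ValueError("backslash with nothing after it")
        out.append(body[i])
        i += 1
    return "".join(out)


def decode_multiline(lines: Iterable[str]) -> str:
    """Decode the body of a multi-line string.

    `lines` holds the lines after the "text:" line, without their CRLFs.
    Decoding stops at the first line that is exactly ".".
    """
    kept = []
    for line in lines:
        if line == ".":
            # Every content line keeps its CRLF, including the last one
            # before the terminating dot.
            return "".join(text + "\r\n" for text in kept)
        if line.startswith(".."):
            # Dot-stuffed line: remove only the first of the two dots.
            line = line[1:]
        kept.append(line)
    raise ValueError("multi-line string has no terminating '.' line")
```

   A line such as ".keep" starts with a single dot that is not followed
   by another dot, so the sketch leaves it unchanged, which is how the
   comment on multiline-dotstuff reads.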