idnits 2.17.1 

draft-ietf-sieve-3028bis-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** It looks like you're using RFC 3978 boilerplate.  You should update this
     to the boilerplate described in the IETF Trust License Policy document
     (see https://trustee.ietf.org/license-info), which is required now.

  -- Found old boilerplate from RFC 3978, Section 5.1 on line 13.

  -- Found old boilerplate from RFC 3978, Section 5.5 on line 1687.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 1698.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 1705.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 1711.

  ** Found boilerplate matching RFC 3978, Section 5.4, paragraph 1 (on line
     1675), which is fine, but *also* found old RFC 2026, Section 10.4C,
     paragraph 1 text on line 39.

  ** This document has an original RFC 3978 Section 5.4 Copyright Line,
     instead of the newer IETF Trust Copyright according to RFC 4748.

  ** This document has an original RFC 3978 Section 5.5 Disclaimer, instead
     of the newer disclaimer which includes the IETF Trust according to RFC
     4748.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not
     match the current year

  == The document seems to lack the recommended RFC 2119 boilerplate, even if
     it appears to use RFC 2119 keywords -- however, there's a paragraph with
     a matching beginning. Boilerplate error?

     (The document does seem to have the reference to RFC 2119 which the
     ID-Checklist requires).
  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (May 2005) is 6915 days in the past.  Is this
     intentional?

  -- Found something which looks like a code comment -- if you have code
     sections in the document, please surround them with '<CODE BEGINS>' and
     '<CODE ENDS>' lines.


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  -- Looks like a reference, but probably isn't: '1' on line 592

  == Missing Reference: 'COMPARATOR' is mentioned on line 1350, but not
     defined

  == Missing Reference: 'ADDRESS-PART' is mentioned on line 1350, but not
     defined

  == Missing Reference: 'MATCH-TYPE' is mentioned on line 1350, but not
     defined

  == Missing Reference: 'QUANTIFIER' is mentioned on line 1477, but not
     defined

  == Unused Reference: 'SMTP' is defined on line 1645, but no explicit
     reference was found in the text

  == Unused Reference: 'IMAP' is defined on line 1670, but no explicit
     reference was found in the text

  ** Obsolete normative reference: RFC 2234 (ref. 'ABNF') (Obsoleted by RFC
     4234)

  == Outdated reference: A later version (-14) exists of
     draft-newman-i18n-comparator-03

  ** Obsolete normative reference: RFC  822 (ref. 'IMAIL') (Obsoleted by RFC
     2822)

  ** Obsolete normative reference: RFC 3798 (ref. 'MDN') (Obsoleted by RFC
     8098)

  ** Obsolete normative reference: RFC  821 (ref. 'SMTP') (Obsoleted by RFC
     2821)

  -- Obsolete informational reference (is this intentional?): RFC 1894 (ref.
     'DSN') (Obsoleted by RFC 3464)

  -- Obsolete informational reference (is this intentional?): RFC 3501 (ref.
     'IMAP') (Obsoleted by RFC 9051)


     Summary: 8 errors (**), 0 flaws (~~), 10 warnings (==), 11 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

1	Network Working Group                                P. Guenther, Editor
2	Internet-Draft                                            Sendmail, Inc.
3	Expires: November 2005                                          May 2005

5	                   Sieve: An Email Filtering Language
6	                    draft-ietf-sieve-3028bis-00.txt

8	Status of this Memo

10	   By submitting this Internet-Draft, each author represents that any
11	   applicable patent or other IPR claims of which he or she is aware
12	   have been or will be disclosed, and any of which he or she becomes
13	   aware will be disclosed, in accordance with Section 6 of BCP 79.

15	   Internet-Drafts are working documents of the Internet Engineering
16	   Task Force (IETF), its areas, and its working groups.  Note that
17	   other groups may also distribute working documents as Internet-
18	   Drafts.

20	   Internet-Drafts are draft documents valid for a maximum of six months
21	   and may be updated, replaced, or obsoleted by other documents at any
22	   time. It is inappropriate to use Internet-Drafts as reference
23	   material or to cite them other than as "work in progress."

25	   The list of current Internet-Drafts can be accessed at
26	   http://www.ietf.org/ietf/1id-abstracts.txt

28	   The list of Internet-Draft Shadow Directories can be accessed at
29	   http://www.ietf.org/shadow.html.

31	   A revised version of this draft document will be submitted to the RFC
32	   editor as a Standards Track RFC for the Internet Community.
33	   Discussion and suggestions for improvement are requested, and should
34	   be sent to ietf-mta-filters@imc.org.  Distribution of this memo is
35	   unlimited.

37	Copyright Notice

39	   Copyright (C) The Internet Society (2005).  All Rights Reserved.

41	Abstract

43	   This document describes a language for filtering email messages at
44	   time of final delivery.  It is designed to be implementable on either
45	   a mail client or mail server.  It is meant to be extensible, simple,
46	   and independent of access protocol, mail architecture, and operating
47	   system.  It is suitable for running on a mail server where users may
48	   not be allowed to execute arbitrary programs, such as on black box
49	   Internet Message Access Protocol (IMAP) servers, as it has no
50	   variables, loops, or ability to shell out to external programs.

52	Table of Contents

54	   1.      Introduction
55	   1.1.     Conventions Used in This Document
56	   1.2.     Example mail messages
57	   2.      Design
58	   2.1.     Form of the Language
59	   2.2.     Whitespace
60	   2.3.     Comments
61	   2.4.     Literal Data
62	   2.4.1.   Numbers
63	   2.4.2.   Strings
64	   2.4.2.1. String Lists
65	   2.4.2.2. Headers
66	   2.4.2.3. Addresses
67	   2.4.2.4. MIME Parts
68	   2.5.     Tests
69	   2.5.1.   Test Lists
70	   2.6.     Arguments
71	   2.6.1.   Positional Arguments
72	   2.6.2.   Tagged Arguments
73	   2.6.3.   Optional Arguments
74	   2.6.4.   Types of Arguments
75	   2.7.     String Comparison
76	   2.7.1.   Match Type
77	   2.7.2.   Comparisons Across Character Sets
78	   2.7.3.   Comparators
79	   2.7.4.   Comparisons Against Addresses
80	   2.8.     Blocks
81	   2.9.     Commands
82	   2.10.    Evaluation
83	   2.10.1.  Action Interaction
84	   2.10.2.  Implicit Keep
85	   2.10.3.  Message Uniqueness in a Mailbox
86	   2.10.4.  Limits on Numbers of Actions
87	   2.10.5.  Extensions and Optional Features
88	   2.10.6.  Errors
89	   2.10.7.  Limits on Execution
90	   3.      Control Commands
91	   3.1.     Control Structure If
92	   3.2.     Control Structure Require
93	   3.3.     Control Structure Stop
94	   4.      Action Commands
95	   4.1.     Action reject
96	   4.2.     Action fileinto
97	   4.3.     Action redirect
98	   4.4.     Action keep
99	   4.5.     Action discard
100	   5.      Test Commands
101	   5.1.     Test address
102	   5.2.     Test allof
103	   5.3.     Test anyof
104	   5.4.     Test envelope
105	   5.5.     Test exists
106	   5.6.     Test false
107	   5.7.     Test header
108	   5.8.     Test not
109	   5.9.     Test size
110	   5.10.    Test true
111	   6.      Extensibility
112	   6.1.     Capability String
113	   6.2.     IANA Considerations
114	   6.2.1.   Template for Capability Registrations
115	   6.2.2.   Initial Capability Registrations
116	   6.3.     Capability Transport
117	   7.      Transmission
118	   8.      Parsing
119	   8.1.     Lexical Tokens
120	   8.2.     Grammar
121	   9.      Extended Example
122	   10.     Security Considerations
123	   11.     Acknowledgments
124	   12.     Author's Address
125	   13.     References
126	   14.     Full Copyright Statement

128	1.      Introduction

130	   This memo documents a language that can be used to create filters for
131	   electronic mail.  It is not tied to any particular operating system
132	   or mail architecture.  It requires the use of [IMAIL]-compliant
133	   messages, but should otherwise generalize to many systems.

135	   The language is powerful enough to be useful but limited in order to
136	   allow for a safe server-side filtering system.  The intention is to
137	   make it impossible for users to do anything more complex (and
138	   dangerous) than write simple mail filters, along with facilitating
139	   the use of GUIs for filter creation and manipulation.  The language
140	   is not Turing-complete: it provides no way to write a loop or a
141	   function and variables are not provided.

143	   Scripts written in Sieve are executed during final delivery, when the
144	   message is moved to the user-accessible mailbox.  In systems where
145	   the MTA does final delivery, such as traditional Unix mail, it is
146	   reasonable to sort when the MTA deposits mail into the user's
147	   mailbox.

149	   There are a number of reasons to use a filtering system.  Mail
150	   traffic for most users has been increasing due to increased usage of
151	   email, the emergence of unsolicited email as a form of advertising,
152	   and increased usage of mailing lists.

154	   Experience at Carnegie Mellon has shown that if a filtering system is
155	   made available to users, many will make use of it in order to file
156	   messages from specific users or mailing lists.  However, many others
157	   did not make use of the Andrew system's FLAMES filtering language
158	   [FLAMES] due to difficulty in setting it up.

160	   Because of the expectation that users will make use of filtering if
161	   it is offered and easy to use, this language has been made simple
162	   enough to allow many users to make use of it, but rich enough that it
163	   can be used productively.  However, it is expected that GUI-based
164	   editors will be the preferred way of editing filters for a large
165	   number of users.

167	1.1.     Conventions Used in This Document

169	   In the sections of this document that discuss the requirements of
170	   various keywords and operators, the following conventions have been
171	   adopted.

173	   The key words "MUST", "MUST NOT", "SHOULD", "SHOULD NOT", and "MAY"
174	   in this document are to be interpreted as defined in [KEYWORDS].

176	   Each section on a command (test, action, or control structure) has a
177	   line labeled "Syntax:".  This line describes the syntax of the
178	   command, including its name and its arguments.  Required arguments
179	   are listed inside angle brackets ("<" and ">").  Optional arguments
180	   are listed inside square brackets ("[" and "]").  Each argument is
181	   followed by its type, so "<key: string>" represents an argument
182	   called "key" that is a string.  Literal strings are represented with
183	   double-quoted strings.  Alternatives are separated with slashes, and
184	   parentheses are used for grouping, similar to [ABNF].

186	   In the "Syntax" line, there are three special pieces of syntax that
187	   are frequently repeated, MATCH-TYPE, COMPARATOR, and ADDRESS-PART.
188	   These are discussed in sections 2.7.1, 2.7.3, and 2.7.4,
189	   respectively.

191	   The formal grammar for these commands is defined in section 8 and is
192	   the authoritative reference on how to construct commands, but the
193	   formal grammar does not specify the order, semantics, number or types
194	   of arguments to commands, nor the legal command names.  The intent is
195	   to allow for extension without changing the grammar.

197	1.2.     Example mail messages

199	   The following mail messages will be used throughout this document in
200	   examples.

202	   Message A
203	   -----------------------------------------------------------
204	   Date: Tue, 1 Apr 1997 09:06:31 -0800 (PST)
205	   From: coyote@desert.example.org
206	   To: roadrunner@acme.example.com
207	   Subject: I have a present for you

209	   Look, I'm sorry about the whole anvil thing, and I really
210	   didn't mean to try and drop it on you from the top of the
211	   cliff.  I want to try to make it up to you.  I've got some
212	   great birdseed over here at my place--top of the line
213	   stuff--and if you come by, I'll have it all wrapped up
214	   for you.  I'm really sorry for all the problems I've caused
215	   for you over the years, but I know we can work this out.
216	   --
217	   Wile E. Coyote   "Super Genius"   coyote@desert.example.org
218	   -----------------------------------------------------------

220	   Message B
221	   -----------------------------------------------------------
222	   From: youcouldberich!@reply-by-postal-mail.invalid
223	   Sender: b1ff@de.res.example.com
224	   To: rube@landru.example.edu
225	   Date:  Mon, 31 Mar 1997 18:26:10 -0800
226	   Subject: $$$ YOU, TOO, CAN BE A MILLIONAIRE! $$$

228	   YOU MAY HAVE ALREADY WON TEN MILLION DOLLARS, BUT I DOUBT
229	   IT!  SO JUST POST THIS TO SIX HUNDRED NEWSGROUPS!  IT WILL
230	   GUARANTEE THAT YOU GET AT LEAST FIVE RESPONSES WITH MONEY!
231	   MONEY! MONEY! COLD HARD CASH!  YOU WILL RECEIVE OVER
232	   $20,000 IN LESS THAN TWO MONTHS!  AND IT'S LEGAL!!!!!!!!!
233	   !!!!!!!!!!!!!!!!!!111111111!!!!!!!11111111111!!1  JUST
234	   SEND $5 IN SMALL, UNMARKED BILLS TO THE ADDRESSES BELOW!
235	   -----------------------------------------------------------

237	2.      Design

239	2.1.     Form of the Language

241	   The language consists of a set of commands.  Each command consists of
242	   a set of tokens delimited by whitespace.  The command identifier is
243	   the first token and it is followed by zero or more argument tokens.
244	   Arguments may be literal data, tags, blocks of commands, or test
245	   commands.

247	   The language is represented in UTF-8, as specified in [UTF-8].

249	   Tokens in the ASCII range are considered case-insensitive.

251	2.2.     Whitespace

253	   Whitespace is used to separate tokens.  Whitespace is made up of
254	   tabs, newlines (CRLF, never just CR or LF), and the space character.
255	   The amount of whitespace used is not significant.

257	2.3.     Comments

259	   Two types of comments are offered.  Comments are semantically
260	   equivalent to whitespace and can be used anyplace that whitespace is
261	   (with one exception in multi-line strings, as described in the
262	   grammar).

264	   Hash comments begin with a "#" character that is not contained within
265	   a string and continue until the next CRLF.

267	   Example:  if size :over 100K { # this is a comment
268	                discard;
269	             }

271	   Bracketed comments begin with the token "/*" and end with "*/"
272	   outside of a string.  Bracketed comments may span multiple lines.
273	   Bracketed comments do not nest.

275	   Example:  if size :over 100K { /* this is a comment
276	                this is still a comment */ discard /* this is a comment
277	                */ ;
278	             }

280	2.4.     Literal Data

282	   Literal data means data that is not executed, merely evaluated "as
283	   is", to be used as arguments to commands.  Literal data is limited to
284	   numbers and strings.

286	2.4.1.   Numbers

288	   Numbers are given as ordinary decimal numbers.  However, those
289	   numbers that have a tendency to be fairly large, such as message
290	   sizes, MAY have a "K", "M", or "G" appended to indicate a multiple of
291	   a power of two.  To be comparable with the power-of-two-based
292	   versions of SI units that computers frequently use, K specifies
293	   kibi-, or 1,024 (2^10) times the value of the number; M specifies
294	   mebi-, or 1,048,576 (2^20) times the value of the number; and G
295	   specifies gibi-, or 1,073,741,824 (2^30) times the value of the
296	   number [BINARY-SI].

298	   Implementations MUST provide 31 bits of magnitude in numbers, but MAY
299	   provide more.

301	   Only positive integers are permitted by this specification.
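
   As an illustration only, the quantifier arithmetic above can be
   sketched in Python.  The function name parse_sieve_number is
   hypothetical and not part of this specification; the sketch accepts
   the quantifier in either case because ASCII tokens are
   case-insensitive (section 2.1), and it rejects signs and fractions.

```python
import re

# Multipliers for the optional quantifier suffix described in 2.4.1.
_QUANTIFIERS = {"": 1, "K": 2**10, "M": 2**20, "G": 2**30}

# ASCII digits followed by at most one quantifier letter.
_NUMBER_RE = re.compile(r"([0-9]+)([KMG]?)", re.IGNORECASE)


def parse_sieve_number(token: str) -> int:
    """Return the integer value of a number token such as "100K"."""
    match = _NUMBER_RE.fullmatch(token)
    if match is None:
        raise ValueError(f"not a Sieve number: {token!r}")
    digits, suffix = match.groups()
    return int(digits) * _QUANTIFIERS[suffix.upper()]
```

   Under these assumptions, "100K" evaluates to 102400, which is the
   threshold used by the "size :over 100K" examples in section 2.3.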

303	2.4.2.   Strings

305	   Scripts involve large numbers of strings as they are used for pattern
306	   matching, addresses, textual bodies, etc.  Typically, short quoted
307	   strings suffice for most uses, but a more convenient form is provided
308	   for longer strings such as bodies of messages.

310	   A quoted string starts and ends with a single double quote (the <">
311	   character, ASCII 34).  A backslash ("\", ASCII 92) inside of a quoted
312	   string is followed by either another backslash or a double quote.
313	   This two-character sequence represents a single backslash or double-
314	   quote within the string, respectively.

316	   No other characters should be escaped with a single backslash.

318	   An undefined escape sequence (such as "\a" in a context where "a" has
319	   no special meaning) is interpreted as if there were no backslash (in
320	   this case, "\a" is just "a").

322	   Non-printing characters such as tabs, CR and LF, and control
323	   characters are permitted in quoted strings.  Quoted strings MAY span
324	   multiple lines.  NUL (ASCII 0) is not allowed in strings.
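
   A minimal Python sketch of the escape rules above follows.  It
   assumes the caller has already removed the surrounding double
   quotes; the name unescape_quoted_string is hypothetical.  A
   backslash makes the next character literal, which covers both the
   two defined escapes and the "undefined escape" case.

```python
def unescape_quoted_string(body: str) -> str:
    """Decode the text found between the quotes of a quoted string."""
    if "\x00" in body:
        raise ValueError("NUL is not allowed in strings")
    out = []
    chars = iter(body)
    for ch in chars:
        if ch == "\\":
            # Take the character after the backslash literally, so
            # \\ gives \, \" gives ", and an undefined \a gives a.
            ch = next(chars, None)
            if ch is None:
                raise ValueError("backslash at end of string")
        out.append(ch)
    return "".join(out)
```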

326	   For entering larger amounts of text, such as an email message, a
327	   multi-line form is allowed.  It starts with the keyword "text:",
328	   followed by a CRLF, and ends with the sequence of a CRLF, a single
329	   period, and another CRLF.  In order to allow the message to contain
330	   lines with a single-dot, lines are dot-stuffed.  That is, when
331	   composing a message body, an extra `.' is added before each line
332	   which begins with a `.'.  When the server interprets the script,
333	   these extra dots are removed.  Note that a line that begins with a
334	   dot followed by a non-dot character is not interpreted as dot-stuffed;
335	   that is, ".foo" is interpreted as ".foo".  However, because this is
336	   potentially ambiguous, scripts SHOULD be properly dot-stuffed so such
337	   lines do not appear.
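
   The dot-stuffing rule can be sketched as a pair of Python helpers.
   The names dot_stuff and dot_unstuff are hypothetical, and the
   sketch assumes the text uses CRLF line endings throughout.  Note
   that only lines beginning with two dots lose a dot on input, so
   that ".foo" is left as ".foo" as described above.

```python
CRLF = "\r\n"


def dot_stuff(text: str) -> str:
    """Add an extra "." before each line that begins with "."."""
    lines = text.split(CRLF)
    return CRLF.join("." + ln if ln.startswith(".") else ln
                     for ln in lines)


def dot_unstuff(stuffed: str) -> str:
    """Remove one "." from each line that begins with ".."."""
    lines = stuffed.split(CRLF)
    return CRLF.join(ln[1:] if ln.startswith("..") else ln
                     for ln in lines)
```

   Because a stuffed body never contains a line consisting of a
   single period, the terminating CRLF "." CRLF sequence remains
   unambiguous.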

339	   Note that a hashed comment or whitespace may occur in between the
340	   "text:" and the CRLF, but not within the string itself.  Bracketed
341	   comments are not allowed here.

343	2.4.2.1. String Lists

345	   When matching patterns, it is frequently convenient to match against
346	   groups of strings instead of single strings.  For this reason, a list
347	   of strings is allowed in many tests, implying that if the test is
348	   true using any one of the strings, then the test is true.
349	   Implementations are encouraged to use short-circuit evaluation in
350	   these cases.

352	   For instance, the test `header :contains ["To", "Cc"]
353	   ["me@example.com", "me00@landru.example.edu"]' is true if either the
354	   To header or Cc header of the input message contains either of the
355	   email addresses "me@example.com" or "me00@landru.example.edu".

357	   Conversely, in any case where a list of strings is appropriate, a
358	   single string is allowed without being a member of a list: it is
359	   equivalent to a list with a single member.  This means that the test
360	   `exists "To"' is equivalent to the test `exists ["To"]'.
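
   The "any one of the strings" rule can be illustrated with the
   following Python sketch.  It is not a conforming implementation:
   the message is modeled as a hypothetical dictionary from
   lower-cased header names to lists of values, comparators are
   ignored, and the names as_string_list and any_header_contains are
   invented for this example.

```python
from typing import Dict, Iterable, List, Union

StringList = Union[str, Iterable[str]]


def as_string_list(value: StringList) -> List[str]:
    """Treat a bare string as a list with a single member."""
    return [value] if isinstance(value, str) else list(value)


def any_header_contains(headers: Dict[str, List[str]],
                        names: StringList,
                        keys: StringList) -> bool:
    """True if any named header value contains any of the keys."""
    names, keys = as_string_list(names), as_string_list(keys)
    # any() stops at the first hit, i.e. short-circuit evaluation.
    return any(key in value
               for name in names
               for value in headers.get(name.lower(), [])
               for key in keys)
```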

362	2.4.2.2. Headers

364	   Headers are a subset of strings.  In the Internet Message
365	   Specification [IMAIL] [RFC1123], each header line is allowed to have
366	   whitespace nearly anywhere in the line, including after the field
367	   name and before the subsequent colon.  Extra spaces between the
368	   header name and the ":" in a header field are ignored.

370	   A header name never contains a colon.  The "From" header refers to a
371	   line beginning "From:" (or "From   :", etc.).  No header will match
372	   the string "From:" due to the trailing colon.

374	   Folding of long header lines (as described in [IMAIL] 3.4.8) is
375	   removed prior to interpretation of the data.  The folding syntax (the
376	   CRLF that ends a line plus any leading whitespace at the beginning of
377	   the next line that indicates folding) is interpreted as if it were
378	   a single space.
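
   As a sketch of the two rules above, the following Python helpers
   unfold a raw header field and extract its name.  The names
   unfold_header and header_name are hypothetical, and the input is
   assumed to be a single header field that uses CRLF line endings.

```python
import re

# A CRLF followed by the whitespace that marks a continuation line.
_FOLD_RE = re.compile(r"\r\n[ \t]+")


def unfold_header(raw: str) -> str:
    """Replace each fold with a single space."""
    return _FOLD_RE.sub(" ", raw)


def header_name(field: str) -> str:
    """Return the name, ignoring whitespace before the colon."""
    name, _, _ = field.partition(":")
    return name.rstrip(" \t")
```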

380	2.4.2.3. Addresses

382	   A number of commands call for email addresses, which are also a
383	   subset of strings.  When these addresses are used in outbound
384	   contexts, addresses must be compliant with [IMAIL], but are further
385	   constrained.  Using the symbols defined in [IMAIL], section 6.1, the
386	   syntax of an address is:

388	   sieve-address = addr-spec                ; simple address
389	                 / phrase "<" addr-spec ">" ; name & addr-spec

391	   That is, routes and group syntax are not permitted.  If multiple
392	   addresses are required, use a string list.  Named groups are not used
393	   here.

395	   Implementations MUST ensure that the addresses are syntactically
396	   valid, but need not ensure that they actually identify an email
397	   recipient.

399	2.4.2.4. MIME Parts

401	   In a few places, [MIME] body parts are represented as strings.  These
402	   parts include MIME headers and the body.  This provides a way of
403	   embedding typed data within a Sieve script so that, among other
404	   things, character sets other than UTF-8 can be used for output
405	   messages.

407	2.5.     Tests

409	   Tests are given as arguments to commands in order to control their
410	   actions.  In this document, tests are given to if/elsif/else to
411	   decide which block of code is run.

413	   Tests MUST NOT have side effects.  That is, a test cannot affect the
414	   state of the filter or message.  No tests in this specification have
415	   side effects, and side effects are forbidden in extension tests as
416	   well.

418	   The rationale for this is that tests with side effects impair
419	   readability and maintainability and are difficult to represent in a
420	   graphic interface for generating scripts.  Side effects are confined
421	   to actions where they are clearer.

423	2.5.1.   Test Lists

425	   Some tests ("allof" and "anyof", which implement logical "and" and
426	   logical "or", respectively) may require more than a single test as an
427	   argument.  The test-list syntax element provides a way of grouping
428	   tests.

430	   Example:  if anyof (not exists ["From", "Date"],
431	                   header :contains "from" "fool@example.edu") {
432	                discard;
433	             }
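
   One way to picture test lists is as side-effect-free predicates
   that are combined with logical "and" and "or".  The Python sketch
   below models the example above; the message representation (a
   dictionary from lower-cased header names to lists of values) and
   every function name are assumptions made for illustration, and
   comparators are ignored.

```python
from typing import Callable, Dict, List

Message = Dict[str, List[str]]
Test = Callable[[Message], bool]


def exists(*names: str) -> Test:
    """True only if every named header is present."""
    return lambda msg: all(n.lower() in msg for n in names)


def header_contains(name: str, key: str) -> Test:
    return lambda msg: any(key in v for v in msg.get(name.lower(), []))


def not_(test: Test) -> Test:
    return lambda msg: not test(msg)


def allof(*tests: Test) -> Test:
    return lambda msg: all(t(msg) for t in tests)


def anyof(*tests: Test) -> Test:
    return lambda msg: any(t(msg) for t in tests)


# if anyof (not exists ["From", "Date"],
#           header :contains "from" "fool@example.edu") { discard; }
should_discard = anyof(not_(exists("From", "Date")),
                       header_contains("from", "fool@example.edu"))
```

   Each predicate only reads the message, which is consistent with
   the requirement in section 2.5 that tests have no side effects.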

435	2.6.     Arguments

437	   In order to specify what to do, most commands take arguments.  There
438	   are three types of arguments: positional, tagged, and optional.

440	2.6.1.   Positional Arguments

442	   Positional arguments are given to a command which discerns their
443	   meaning based on their order.  When a command takes positional
444	   arguments, all positional arguments must be supplied and must be in
445	   the order prescribed.

447	2.6.2.   Tagged Arguments

449	   This document provides for tagged arguments in the style of
450	   CommonLISP.  These are also similar to flags given to commands in
451	   most command-line systems.

453	   A tagged argument is an argument for a command that begins with ":"
454	   followed by a tag naming the argument, such as ":contains".  This
455	   argument means that zero or more of the next tokens have some
456	   particular meaning depending on the argument.  These next tokens may
457	   be numbers or strings but they are never blocks.

459	   Tagged arguments are similar to positional arguments, except that
460	   instead of the meaning being derived from the command, it is derived
461	   from the tag.

463	   Tagged arguments must appear before positional arguments, but they
464	   may appear in any order with other tagged arguments.  For simplicity
465	   of the specification, this is not expressed in the syntax definitions
466	   with commands, but they still may be reordered arbitrarily provided
467	   they appear before positional arguments.  Tagged arguments may be
468	   mixed with optional arguments.

470	   To simplify this specification, tagged arguments SHOULD NOT take
471	   tagged arguments as arguments.

473	2.6.3.   Optional Arguments

475	   Optional arguments are exactly like tagged arguments except that they
476	   may be left out, in which case a default value is implied.  Because
477	   optional arguments tend to result in shorter scripts, they have been
478	   used far more than tagged arguments.

480	   One particularly noteworthy case is the ":comparator" argument, which
481	   allows the user to specify which comparator [COLLATION] will be used
482	   to compare two strings, since different languages may impose
483	   different orderings on UTF-8 [UTF-8] characters.

485	2.6.4.   Types of Arguments

487	   Abstractly, arguments may be literal data, tests, or blocks of
488	   commands.  In this way, an "if" control structure is merely a command
489	   that happens to take a test and a block as arguments and may execute
490	   the block of code.

492	   However, this abstraction is ambiguous from a parsing standpoint.
493	   The grammar in section 8.2 presents a parsable version of this:
494	   Arguments are string-lists, numbers, and tags, which may be followed
495	   by a test or a test-list, which may be followed by a block of
496	   commands.  No more than one test or test list, nor more than one
497	   block of commands, may be used, and commands that end with blocks of
498	   commands do not end with semicolons.

500	2.7.     String Comparison

502	   When matching one string against another, there are a number of ways
503	   of performing the match operation.  These are accomplished with three
504	   types of matches: an exact match, a substring match, and a wildcard
505	   glob-style match.  These are described below.

507	   In order to provide for matches between character sets and case
508	   insensitivity, Sieve uses the comparators defined in the Internet
509	   Application Protocol Collation Registry [COLLATION].

511	   However, when a string represents the name of a header, the
512	   comparator is never user-specified.  Header comparisons are always
513	   done with the "i;ascii-casemap" operator, i.e., case-insensitive
514	   comparisons, because this is the way things are defined in the
515	   message specification [IMAIL].
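
   A case-insensitive header name comparison restricted to ASCII can
   be sketched in Python as follows.  The sketch assumes that
   "i;ascii-casemap" folds only the letters a-z, as its name
   suggests; [COLLATION] remains the authoritative definition.  The
   function names are hypothetical.

```python
import string

# Fold only a-z to A-Z; all other characters are left untouched.
_ASCII_UPPER = str.maketrans(string.ascii_lowercase,
                             string.ascii_uppercase)


def ascii_casemap(text: str) -> str:
    return text.translate(_ASCII_UPPER)


def header_names_equal(a: str, b: str) -> bool:
    return ascii_casemap(a) == ascii_casemap(b)
```

   A translation table is used instead of str.upper() because the
   latter also changes non-ASCII characters.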

517	2.7.1.   Match Type

519	   There are three match types describing the matching used in this
520	   specification: ":is", ":contains", and ":matches".  Match type
521	   arguments are supplied to those commands which allow them to specify
522	   what kind of match is to be performed.

524	   These are used as tagged arguments to tests that perform string
525	   comparison.

527	   The ":contains" match type describes a substring match.  If the value
528	   argument contains the key argument as a substring, the match is true.
529	   For instance, the string "frobnitzm" contains "frob" and "nit", but
530	   not "fbm".  The null key ("") is contained in all values.

532	   The ":is" match type describes an absolute match; if the contents of
533	   the first string are absolutely the same as the contents of the
534	   second string, they match.  Only the string "frobnitzm" is the string
535	   "frobnitzm".  The null key ("") matches only the null value.

537	   The ":matches" match type specifies a wildcard match using the
538	   characters "*" and "?".  "*" matches zero or more characters, and "?"
539	   matches a single character.  "?" and "*" may be escaped as "\\?" and
540	   "\\*" in strings to match against themselves.  The first backslash
541	   escapes the second backslash; together, they escape the "*".  This is
542	   awkward, but it is commonplace in several programming languages that
543	   use globs and regular expressions.

545	   In order to specify what type of match is supposed to happen,
546	   commands that support matching take optional tagged arguments
547	   ":matches", ":is", and ":contains".  Commands default to using ":is"
548	   matching if no match type argument is supplied.  Note that these
549	   modifiers may interact with comparators; in particular, some
550	   comparators are not suitable for matching with ":contains" or
551	   ":matches".  It is an error to use a comparator with ":contains" or
552	   ":matches" that is not compatible with it.

554	   It is an error to give more than one of these arguments to a given
555	   command.
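
	   As an informal sketch, the three match types can be contrasted in
	   one script.  The folder names and the "Subject" keys below are
	   illustrative assumptions, not part of this specification.

	   Example:  require "fileinto";
	             if header :is "Subject" "frobnitzm" {
	                fileinto "INBOX.exact";
	             } elsif header :matches "Subject" "frob*z?" {
	                fileinto "INBOX.glob";
	             } elsif header :contains "Subject" "frob" {
	                fileinto "INBOX.substring";
	             }

	   Under these assumptions, a Subject of "frobnitzm" takes the first
	   branch, "frobbitzy" takes the second, and "my frobnitzm" takes the
	   third.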

557	   For convenience, the "MATCH-TYPE" syntax element is defined here as
558	   follows:

560	   Syntax:   ":is" / ":contains" / ":matches"

562	2.7.2.   Comparisons Across Character Sets

564	   All Sieve scripts are represented in UTF-8, but messages may involve
565	   a number of character sets.  In order for comparisons to work across
566	   character sets, implementations SHOULD implement the following
567	   behavior:

569	      Implementations decode header charsets to UTF-8.  Two strings are
570	      considered equal if their UTF-8 representations are identical.
571	      Implementations should decode charsets represented in the forms
572	      specified by [MIME] for both message headers and bodies.
573	      Implementations must be capable of decoding US-ASCII, ISO-8859-1,
574	      the ASCII subset of ISO-8859-* character sets, and UTF-8.

576	   If implementations fail to support the above behavior, they MUST
577	   conform to the following:

579	      No two strings can be considered equal if one contains octets
580	      greater than 127.

582	2.7.3.   Comparators

584	   In order to allow for language-independent, case-independent matches,
585	   the match type may be coupled with a comparator name.  The Internet
586	   Application Protocol Collation Registry [COLLATION] provides the
587	   framework for describing and naming comparators as used by this
588	   specification.

590	   << The [COLLATION] draft defines "en;ascii-casemap" and then notes
591	   that Sieve and ACAP use "i;ascii-casemap" as a synonym.  Should that
592	   be removed from [1] and added here?  Should Sieve do more to match
593	   that draft?  >>

595	   While multiple comparator types are defined, only equality types are
596	   used in this specification.

598	   All implementations MUST support the "i;octet" comparator (simply
599	   compares octets) and the "i;ascii-casemap" comparator (which treats
600	   uppercase and lowercase characters in the ASCII subset of UTF-8 as
601	   the same).  If left unspecified, the default is "i;ascii-casemap".

603	   Some comparators may not be usable with substring matches; that is,
604	   they may only work with ":is".  It is an error to try to use a
605	   comparator with ":matches" or ":contains" that is not compatible with
606	   it.

608	   A comparator is specified by the ":comparator" option with commands
609	   that support matching.  This option is followed by a string providing
610	   the name of the comparator to be used.  For convenience, the syntax
611	   of a comparator is abbreviated to "COMPARATOR", and (repeated in
612	   several tests) is as follows:

614	   Syntax:   ":comparator" <comparator-name: string>

616	   So in this example,

618	   Example:  if header :contains :comparator "i;octet" "Subject"
619	                "MAKE MONEY FAST" {
620	                   discard;
621	             }

623	   would discard any message with subjects like "You can MAKE MONEY
624	   FAST", but not "You can Make Money Fast", since the comparator used
625	   is case-sensitive.

627	   Comparators other than i;octet and i;ascii-casemap must be declared
628	   with require, as they are extensions.  If a comparator declared with
629	   require is not known, it is an error, and execution fails.  If the
630	   comparator is not declared with require, it is also an error, even if
631	   the comparator is supported.  (See 2.10.5.)
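
	   A sketch of such a declaration follows.  The "elbonia" comparator
	   is purely hypothetical (it is the placeholder name used in section
	   6.1), so this script would fail on any implementation that does
	   not happen to provide it.

	   Example:  require "comparator-elbonia";
	             if header :is :comparator "elbonia" "Subject" "hello" {
	                keep;
	             }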

633	   Both ":matches" and ":contains" match types are compatible with the
634	   "i;octet" and "i;ascii-casemap" comparators and may be used with
635	   them.

637	   It is an error to give more than one ":comparator" argument to a
638	   given command.

640	2.7.4.   Comparisons Against Addresses

642	   Addresses are one of the most frequent things represented as strings.
643	   These are structured, and being able to compare against the local-
644	   part or the domain of an address is useful, so some tests that act
645	   exclusively on addresses take an additional optional argument that
646	   specifies what the test acts on.

648	   These optional arguments are ":localpart", ":domain", and ":all",
649	   which act on the local-part (left side), the domain part (right
650	   side), and the whole address, respectively.

652	   The kind of comparison done, such as whether or not the test done is
653	   case-insensitive, is specified as a comparator argument to the test.

655	   If an optional address-part is omitted, the default is ":all".

657	   It is an error to give more than one of these arguments to a given
658	   command.

660	   For convenience, the "ADDRESS-PART" syntax element is defined here as
661	   follows:

663	   Syntax:   ":localpart" / ":domain" / ":all"
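
	   For illustration, the following sketch files mail by domain and by
	   local-part.  The addresses and folder names are assumptions made
	   for this example only.

	   Example:  require "fileinto";
	             if address :domain :is "From" "example.com" {
	                fileinto "INBOX.example";
	             } elsif address :localpart :is "From" "postmaster" {
	                fileinto "INBOX.postmaster";
	             }

	   A message from "postmaster@example.com" takes the first branch,
	   since the domain test is tried first.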

665	2.8.     Blocks

667	   Blocks are sets of commands enclosed within curly braces.  Blocks are
668	   supplied to commands so that the commands can implement control
669	   structures.

671	   A control structure is a command that takes a test and a block as
672	   arguments; depending on the result of the test, it runs the code in
673	   the block some number of times.

676	   With the commands supplied in this memo, there are no loops.  The
677	   control structures supplied--if, elsif, and else--run a block either
678	   once or not at all.  So there are two arguments, the test and the
679	   block.

681	2.9.     Commands

683	   Sieve scripts are sequences of commands.  Commands can take any of
684	   the tokens above as arguments, and arguments may be either tagged or
685	   positional arguments.  Not all commands take all arguments.

687	   There are three kinds of commands: test commands, action commands,
688	   and control commands.

690	   The simplest is an action command.  An action command is an
691	   identifier followed by zero or more arguments, terminated by a
692	   semicolon.  Action commands do not take tests or blocks as arguments.

694	   A control command is similar, but it takes a test as an argument, and
695	   ends with a block instead of a semicolon.

697	   A test command is used as part of a control command.  It is used to
698	   specify whether or not the block of code given to the control command
699	   is executed.

701	2.10.    Evaluation

703	2.10.1.  Action Interaction

705	   Some actions cannot be used with other actions because the result
706	   would be absurd.  These restrictions are noted throughout this memo.

708	   Extension actions MUST state how they interact with actions defined
709	   in this specification.

711	2.10.2.  Implicit Keep

713	   Previous experience with filtering systems suggests that cases tend
714	   to be missed in scripts.  To prevent errors, Sieve has an "implicit
715	   keep".

717	   An implicit keep is a keep action (see 4.4) performed in absence of
718	   any action that cancels the implicit keep.

720	   An implicit keep is performed if a message is not written to a
721	   mailbox, redirected to a new address, rejected, or explicitly thrown
722	   out.  That is, if a fileinto, a keep, a redirect, a reject, or a
723	   discard is performed, an implicit keep is not.

725	   Some actions may be defined to not cancel the implicit keep.  These
726	   actions may not directly affect the delivery of a message, and are
727	   used for their side effects.  None of the actions specified in this
728	   document meet those criteria, but extension actions will.

730	   For instance, with any of the short messages offered above, the
731	   following script produces no actions.

733	   Example:  if size :over 500K { discard; }

735	   As a result, the implicit keep is taken.
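
	   Conversely, an explicit action cancels the implicit keep.  In the
	   sketch below (the folder name and key are assumptions), a matching
	   message is filed into "INBOX.reports" and, because of the explicit
	   keep, is also delivered as a keep would deliver it.  Without the
	   keep command, a matching message would only be filed into the
	   folder.

	   Example:  require "fileinto";
	             if header :contains "Subject" "report" {
	                fileinto "INBOX.reports";
	                keep;
	             }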

737	2.10.3.  Message Uniqueness in a Mailbox

739	   Implementations SHOULD NOT deliver a message to the same folder more
740	   than once, even if a script explicitly asks for a message to be
741	   written to a mailbox twice.

743	   The test for equality of two messages is implementation-defined.

745	   If a script asks for a message to be written to a mailbox twice, it
746	   MUST NOT be treated as an error.
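
	   For instance, the following sketch (the folder name is an
	   assumption) is not an error; an implementation following the
	   SHOULD NOT above would store the message in the folder only once.

	   Example:  require "fileinto";
	             fileinto "INBOX.lists";
	             fileinto "INBOX.lists";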

748	2.10.4.  Limits on Numbers of Actions

750	   Site policy MAY limit numbers of actions taken and MAY impose
751	   restrictions on which actions can be used together.  In the event
752	   that a script hits a policy limit on the number of actions taken for
753	   a particular message, an error occurs.

755	   Implementations MUST prohibit more than one reject.

757	   Implementations MUST allow at least one keep or one fileinto.  If
758	   fileinto is not implemented, implementations MUST allow at least one
759	   keep.

761	   Implementations SHOULD prohibit reject when used with other actions.

763	2.10.5.  Extensions and Optional Features

765	   Because of the differing capabilities of many mail systems, several
766	   features of this specification are optional.  Before any of these
767	   extensions can be executed, they must be declared with the "require"
768	   action.

770	   If an extension is not enabled with "require", implementations MUST
771	   treat it as if they did not support it at all.

773	   If an implementation does not understand an extension declared with
774	   require, the script must not be used at all.  Implementations MUST
775	   NOT execute scripts which require unknown capability names.

777	   Note: The reason for this restriction is that prior experiences with
778	         languages such as LISP and Tcl suggest that this is a workable
779	         way of noting that a given script uses an extension.

781	         Experience with PostScript suggests that mechanisms that allow
782	         a script to work around missing extensions are not used in
783	         practice.

785	   Extensions which define actions MUST state how they interact with
786	   actions discussed in the base specification.

788	2.10.6.  Errors

790	   In any programming language, there are compile-time and run-time
791	   errors.

793	   Compile-time errors are ones in syntax that are detectable if a
794	   syntax check is done.

796	   Run-time errors are not detectable until the script is run.  This
797	   includes transient failures like disk full conditions, but also
798	   includes issues like invalid combinations of actions.

800	   When an error occurs in a Sieve script, all processing stops.

802	   Implementations MAY choose to do a full parse, then evaluate the
803	   script, then do all actions.  Implementations might even go so far as
804	   to ensure that execution is atomic (either all actions are executed
805	   or none are executed).

807	   Other implementations may choose to parse and run at the same time.
808	   Such implementations are simpler, but have issues with partial
809	   failure (some actions happen, others don't).

811	   Implementations might even go so far as to ensure that scripts can
812	   never execute an invalid set of actions (e.g., reject + fileinto)
813	   before execution, although this could involve solving the Halting
814	   Problem.

816	   This specification allows any of these approaches.  Solving the
817	   Halting Problem is considered extra credit.

819	   When an error happens, implementations MUST notify the user that an
820	   error occurred and which actions (if any) were taken, and MUST do an
821	   implicit keep.

823	2.10.7.  Limits on Execution

825	   Implementations may limit certain constructs.  However, this
826	   specification places a lower bound on some of these limits.

828	   Implementations MUST support fifteen levels of nested blocks.

830	   Implementations MUST support fifteen levels of nested test lists.

832	3.      Control Commands

834	   Control structures are needed to allow for multiple and conditional
835	   actions.

837	3.1.     Control Structure If

839	   There are three pieces to if: "if", "elsif", and "else".  Each is
840	   actually a separate command in terms of the grammar.  However, an
841	   elsif or else MUST only follow an if or elsif.  An error occurs if
842	   these conditions are not met.

844	   Syntax:   if <test1: test> <block1: block>

846	   Syntax:   elsif <test2: test> <block2: block>

848	   Syntax:   else <block>

850	   The semantics are similar to those of any of the many other
851	   programming languages these control commands appear in.  When the
852	   interpreter sees an "if", it evaluates the test associated with it.
853	   If the test is true, it executes the block associated with it.

855	   If the test of the "if" is false, it evaluates the test of the first
856	   "elsif" (if any).  If the test of "elsif" is true, it runs the
857	   elsif's block.  An elsif may be followed by an elsif, in which case,
858	   the interpreter repeats this process until it runs out of elsifs.

860	   When the interpreter runs out of elsifs, there may be an "else" case.
861	   If there is, and none of the if or elsif tests were true, the
862	   interpreter runs the else case.

864	   This provides a way of performing exactly one of the blocks in the
865	   chain.

867	   In the following example, both Message A and B are dropped.

869	   Example:  require "fileinto";
870	             if header :contains "from" "coyote" {
871	                discard;
872	             } elsif header :contains ["subject"] ["$$$"] {
873	                discard;
874	             } else {
875	                fileinto "INBOX";
876	             }

878	   When the script below is run over message A, it redirects the message
879	   to acm@example.edu; message B, to postmaster@example.edu; any other
880	   message is redirected to field@example.edu.

882	   Example:  if header :contains ["From"] ["coyote"] {
883	                redirect "acm@example.edu";
884	             } elsif header :contains "Subject" "$$$" {
885	                redirect "postmaster@example.edu";
886	             } else {
887	                redirect "field@example.edu";
888	             }

890	   Note that this definition prohibits the "... else if ..." sequence
891	   used by C.  This is intentional, because this construct produces a
892	   shift-reduce conflict.

894	3.2.     Control Structure Require

896	   Syntax:   require <capabilities: string-list>

898	   The require command notes that a script makes use of a certain
899	   extension.  Such a declaration is required to use the extension, as
900	   discussed in section 2.10.5.  Multiple capabilities can be declared
901	   with a single require.

903	   The require command, if present, MUST be used before anything other
904	   than a require can be used.  An error occurs if a require appears
905	   after a command other than require.

907	   Example:  require ["fileinto", "reject"];

909	   Example:  require "fileinto";
910	             require "vacation";

912	3.3.     Control Structure Stop

914	   Syntax:   stop

916	   The "stop" action ends all processing.  If no actions have been
917	   executed, then the keep action is taken.
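
	   As an illustrative sketch (the address and folder name are
	   assumptions), stop can keep later commands from running once a
	   message has been handled:

	   Example:  require "fileinto";
	             if header :contains "From" "coyote" {
	                fileinto "INBOX.harassment";
	                stop;
	             }
	             redirect "bart@example.edu";

	   Mail from "coyote" is filed and is not redirected; all other mail
	   is redirected.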

919	4.      Action Commands

921	   This document supplies five actions that may be taken on a message:
922	   keep, fileinto, redirect, reject, and discard.

924	   Implementations MUST support the "keep", "discard", and "redirect"
925	   actions.

927	   Implementations SHOULD support "reject" and "fileinto".

929	   Implementations MAY limit the number of certain actions taken (see
930	   section 2.10.4).

932	4.1.     Action reject

934	   Syntax:   reject <reason: string>

936	   The optional "reject" action refuses delivery of a message by sending
937	   back an [MDN] to the sender and cancels the implicit keep.  It resends
938	   the message to the sender, wrapping it in a "reject" form, noting
939	   that it was rejected by the recipient.  In the following script,
940	   message A is rejected and returned to the sender.

942	   Example:  if header :contains "from" "coyote@desert.example.org" {
943	                reject "I am not taking mail from you, and I don't want
944	                your birdseed, either!";
945	             }

947	   A reject message MUST take the form of a failure MDN as specified by
948	   [MDN].  The human-readable portion of the message, the first
949	   component of the MDN, contains the human readable message describing
950	   the error, and it SHOULD contain additional text alerting the
951	   original sender that mail was refused by a filter.  This part of the
952	   MDN might appear as follows:

954	   ------------------------------------------------------------
955	   Message was refused by recipient's mail filtering program.  Reason
956	   given was as follows:

958	   I am not taking mail from you, and I don't want your birdseed,
959	   either!
960	   ------------------------------------------------------------

962	   The MDN action-value field as defined in the MDN specification MUST
963	   be "deleted" and MUST have the MDN-sent-automatically and automatic-
964	   action modes set.

966	   Because some implementations cannot or will not implement the reject
967	   command, it is optional.  The capability string to be used with the
968	   require command is "reject".

970	4.2.     Action fileinto

972	   Syntax:   fileinto <folder: string>

974	   The "fileinto" action delivers the message into the specified folder.
975	   Implementations SHOULD support fileinto, but in some environments
976	   this may be impossible.

978	   The capability string for use with the require command is "fileinto".

980	   In the following script, message A is filed into folder
981	   "INBOX.harassment".

983	   Example:  require "fileinto";
984	             if header :contains ["from"] "coyote" {
985	                fileinto "INBOX.harassment";
986	             }

988	4.3.     Action redirect

990	   Syntax:   redirect <address: string>

992	   The "redirect" action is used to send the message to another user at
993	   a supplied address, as a mail forwarding feature does.  The
994	   "redirect" action makes no changes to the message body or existing
995	   headers, but it may add new headers.  The "redirect" action modifies
996	   the envelope recipient.

998	   The redirect command performs an MTA-style "forward"--that is, what
999	   you get from a .forward file using sendmail under UNIX.  The address
1000	   on the SMTP envelope is replaced with the one on the redirect command
1001	   and the message is sent back out.  (This is not an MUA-style forward,
1002	   which creates a new message with a different sender and message ID,
1003	   wrapping the old message in a new one.)

1005	   A simple script can be used for redirecting all mail:

1007	   Example:  redirect "bart@example.edu";

1009	   Implementations SHOULD take measures to implement loop control,
1010	   possibly including adding headers to the message or counting received
1011	   headers.  If an implementation detects a loop, it causes an error.
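
	   Since redirect is among the actions that cancel the implicit keep
	   (see 2.10.2), a script that should forward mail and also retain a
	   local copy needs an explicit keep.  A minimal sketch, in which the
	   address is an assumption:

	   Example:  redirect "bart@example.edu";
	             keep;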

1013	4.4.     Action keep

1015	   Syntax:   keep

1017	   The "keep" action is whatever action is taken in lieu of all other
1018	   actions, if no filtering happens at all; generally, this simply means
1019	   to file the message into the user's main mailbox.  This command
1020	   provides a way to execute this action without needing to know the
1021	   name of the user's main mailbox or to understand the user's setup or
1022	   the underlying mail system.

1025	   For instance, in an implementation where the Internet Message Access
1026	   Protocol (IMAP) server is running scripts on behalf of the user at
1027	   time of delivery, a keep command is equivalent to a fileinto "INBOX".

1029	   Example:  if size :under 1M { keep; } else { discard; }

1031	   Note that the above script is equivalent to the one below.

1033	   Example:  if not size :under 1M { discard; }

1035	4.5.     Action discard

1037	   Syntax:   discard

1039	   Discard is used to silently throw away the message.  It does so by
1040	   simply canceling the implicit keep.  If discard is used with other
1041	   actions, the other actions still happen.  Discard is compatible with
1042	   all other actions.  (For instance, fileinto+discard is equivalent to
1043	   fileinto.)

1045	   Discard MUST be silent; that is, it MUST NOT return a non-delivery
1046	   notification of any kind ([DSN], [MDN], or otherwise).

1048	   In the following script, any mail from "idiot@example.edu" is thrown
1049	   out.

1051	   Example:  if header :contains ["from"] ["idiot@example.edu"] {
1052	                discard;
1053	             }

1055	   While an important part of this language, "discard" has the potential
1056	   to create serious problems for users: Students who leave themselves
1057	   logged in to an unattended machine in a public computer lab may find
1058	   their script changed to just "discard".  In order to protect users in
1059	   this situation (along with similar situations), implementations MAY
1060	   keep messages destroyed by a script for an indefinite period, and MAY
1061	   disallow scripts that throw out all mail.

1063	5.      Test Commands

1065	   Tests are used in conditionals to decide which part(s) of the
1066	   conditional to execute.

1068	   Implementations MUST support these tests: "address", "allof",
1069	   "anyof", "exists", "false", "header", "not", "size", and "true".

1071	   Implementations SHOULD support the "envelope" test.

1073	5.1.     Test address

1075	   Syntax:   address [COMPARATOR] [ADDRESS-PART] [MATCH-TYPE]
1076	             <header-list: string-list> <key-list: string-list>

1078	   The address test matches Internet addresses in structured headers
1079	   that contain addresses.  It returns true if any header contains any
1080	   key in the specified part of the address, as modified by the
1081	   comparator and the match keyword.

1083	   Like envelope and header, this test returns true if any combination
1084	   of the header-list and key-list arguments matches.

1086	   Internet email addresses [IMAIL] have the somewhat awkward
1087	   characteristic that the local-part to the left of the at-sign is
1088	   considered case sensitive, and the domain-part to the right of the
1089	   at-sign is case insensitive.  The "address" command does not deal
1090	   with this itself, but provides the ADDRESS-PART argument for allowing
1091	   users to deal with it.

1093	   The address primitive never acts on the phrase part of an email
1094	   address, nor on comments within that address.  It also never acts on
1095	   group names, although it does act on the addresses within the group
1096	   construct.

1098	   Implementations MUST restrict the address test to headers that
1099	   contain addresses, but MUST include at least From, To, Cc, Bcc,
1100	   Sender, Resent-From, and Resent-To, and SHOULD include any other
1101	   header that utilizes an "address-list" structured header body.

1103	   Example:  if address :is :all "from" "tim@example.com" {
1104	                discard;
	             }

1106	5.2.     Test allof

1108	   Syntax:   allof <tests: test-list>

1110	   The allof test performs a logical AND on the tests supplied to it.

1112	   Example:  allof (false, false)  =>   false
1113	             allof (false, true)   =>   false
1114	             allof (true,  true)   =>   true

1116	   The allof test takes as its argument a test-list.

1118	5.3.     Test anyof

1120	   Syntax:   anyof <tests: test-list>

1122	   The anyof test performs a logical OR on the tests supplied to it.

1124	   Example:  anyof (false, false)  =>   false
1125	             anyof (false, true)   =>   true
1126	             anyof (true,  true)   =>   true
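
	   The allof and anyof tests may be nested.  The following sketch,
	   whose keys and limits are assumptions, discards mail that is from
	   "coyote", or that is both larger than 1M and lacking a Subject
	   header.

	   Example:  if anyof (header :contains "From" "coyote",
	                       allof (size :over 1M,
	                              not exists "Subject")) {
	                discard;
	             }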

1128	5.4.     Test envelope

1130	   Syntax:   envelope [COMPARATOR] [ADDRESS-PART] [MATCH-TYPE]
1131	             <envelope-part: string-list> <key-list: string-list>

1133	   The "envelope" test is true if the specified part of the SMTP (or
1134	   equivalent) envelope matches the specified key.

1136	   If one of the envelope-part strings is (case insensitive) "from",
1137	   then matching occurs against the FROM address used in the SMTP MAIL
1138	   command.

1140	   If one of the envelope-part strings is (case insensitive) "to", then
1141	   matching occurs against the TO address used in the SMTP RCPT command
1142	   that resulted in this message getting delivered to this user.  Note
1143	   that only the most recent TO is available, and only the one relevant
1144	   to this user.

1146	   The envelope-part is a string list and may contain more than one
1147	   parameter, in which case all of the strings specified in the key-list
1148	   are matched against all parts given in the envelope-part list.

1150	   Like address and header, this test returns true if any combination of
1151	   the envelope-part and key-list arguments is true.

1153	   All tests against envelopes MUST drop source routes.

1155	   If the SMTP transaction involved several RCPT commands, only the data
1156	   from the RCPT command that caused delivery to this user is available
1157	   in the "to" part of the envelope.

1159	   If a protocol other than SMTP is used for message transport,
1160	   implementations are expected to adapt this command appropriately.

1162	   The envelope command is optional.  Implementations SHOULD support it,
1163	   but the necessary information may not be available in all cases.

1165	   Example:  require "envelope";
1166	             if envelope :all :is "from" "tim@example.com" {
1167	                discard;
1168	             }

1170	5.5.     Test exists

1172	   Syntax:   exists <header-names: string-list>

1174	   The "exists" test is true if the headers listed in the header-names
1175	   argument exist within the message.  All of the headers must exist or
1176	   the test is false.

1178	   The following example throws out mail that lacks a From header, a
1179	   Date header, or both.

1181	   Example:  if not exists ["From","Date"] {
1182	                discard;
1183	             }

1185	5.6.     Test false

1187	   Syntax:   false

1189	   The "false" test always evaluates to false.

1191	5.7.     Test header

1193	   Syntax:   header [COMPARATOR] [MATCH-TYPE]
1194	             <header-names: string-list> <key-list: string-list>

1196	   The "header" test evaluates to true if the value of any of the named
1197	   headers matches any key.  The type of match is specified by the
1198	   optional match argument, which defaults to ":is" if not specified,
1199	   as specified in section 2.7.1.

1201	   Like address and envelope, this test returns true if any combination
1202	   of the header-names and key-list arguments matches.

1204	   If a header listed in the header-names argument exists, it contains
1205	   the null key ("").  However, if the named header is not present, it
1206	   does not contain the null key.  So if a message contained the header

1208	           X-Caffeine: C8H10N4O2

1210	   these tests on that header evaluate as follows:

1212	           header :is ["X-Caffeine"] [""]         => false
1213	           header :contains ["X-Caffeine"] [""]   => true

1215	5.8.     Test not

1217	   Syntax:   not <test>

1219	   The "not" test takes some other test as an argument, and yields the
1220	   opposite result.  "not false" evaluates to "true" and "not true"
1221	   evaluates to "false".

1223	5.9.     Test size

1225	   Syntax:   size <":over" / ":under"> <limit: number>

1227	   The "size" test deals with the size of a message.  It takes a tagged
1228	   argument of either ":over" or ":under", followed by a number
1229	   representing the size limit to compare against.

1231	   If the argument is ":over", and the size of the message is greater
1232	   than the number provided, the test is true; otherwise, it is false.

1234	   If the argument is ":under", and the size of the message is less than
1235	   the number provided, the test is true; otherwise, it is false.

1237	   Exactly one of ":over" or ":under" must be specified, and anything
1238	   else is an error.

1240	   The size of a message is defined to be the number of octets from the
1241	   initial header until the last character in the message body.

1243	   Note that for a message that is exactly 4,000 octets, the message is
1244	   neither ":over" 4000 octets nor ":under" 4000 octets.
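
	   A sketch of this boundary case follows; the 4000-octet limit is
	   only the figure used above.  The first script discards messages of
	   4000 octets or more, while the second leaves a message of exactly
	   4000 octets to the implicit keep.

	   Example:  if not size :under 4000 {
	                discard;
	             }

	   Example:  if size :over 4000 {
	                discard;
	             }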

1246	5.10.    Test true

1248	   Syntax:   true

1250	   The "true" test always evaluates to true.

1252	6.      Extensibility

1254	   New control structures, actions, and tests can be added to the
1255	   language.  Sites must make these features known to their users; this
1256	   document does not define a way to discover the list of extensions
1257	   supported by the server.

1259	   Any extensions to this language MUST define a capability string that
1260	   uniquely identifies that extension.  If a new version of an extension
1261	   changes the functionality of a previously defined extension, it MUST
1262	   use a different name.

1264	   In a situation where there is a submission protocol and an extension
1265	   advertisement mechanism aware of the details of this language,
1266	   scripts submitted can be checked against the mail server to prevent
1267	   use of an extension that the server does not support.

1269	   Extensions MUST state how they interact with constraints defined in
1270	   section 2.10, e.g., whether they cancel the implicit keep, and which
1271	   actions they are compatible and incompatible with.

1273	6.1.     Capability String

1275	   Capability strings are typically short strings describing what
1276	   capabilities are supported by the server.

1278	   Capability strings beginning with "vnd." represent vendor-defined
1279	   extensions.  Such extensions are not defined by Internet standards or
1280	   RFCs, but are still registered with IANA in order to prevent
1281	   conflicts.  Extensions starting with "vnd." SHOULD be followed by the
1282	   name of the vendor and product, such as "vnd.acme.rocket-sled".

1284	   The following capability strings are defined by this document:

1286	   envelope    The string "envelope" indicates that the implementation
1287	               supports the "envelope" command.

1289	   fileinto    The string "fileinto" indicates that the implementation
1290	               supports the "fileinto" command.

1292	   reject      The string "reject" indicates that the implementation
1293	               supports the "reject" command.

1295	   comparator- The string "comparator-elbonia" is provided if the
1296	               implementation supports the "elbonia" comparator.
1297	               Therefore, all implementations have at least the
1298	               "comparator-i;octet" and "comparator-i;ascii-casemap"
1299	               capabilities.  However, these comparators may be used
1300	               without being declared with require.

1302	6.2.     IANA Considerations

1304	   In order to provide a standard set of extensions, a registry is
1305	   provided by IANA.  Capability names may be registered on a first-
1306	   come, first-served basis.  Extensions designed for interoperable use
1307	   SHOULD be defined as standards track or IESG approved experimental
1308	   RFCs.

1310	6.2.1.     Template for Capability Registrations

1312	   The following template is to be used for registering new Sieve
1313	   extensions with IANA.

1315	   To: iana@iana.org
1316	   Subject: Registration of new Sieve extension

1318	   Capability name:
1319	   Capability keyword:
1320	   Capability arguments:
1321	   Standards Track/IESG-approved experimental RFC number:
1322	   Person and email address to contact for further information:

1324	6.2.2.     Initial Capability Registrations

1326	   The following are to be added to the IANA registry for Sieve
1327	   extensions as the initial contents of the capability registry.

1329	   Capability name:        fileinto
1330	   Capability keyword:     fileinto
1331	   Capability arguments:   fileinto <folder: string>
1332	   Standards Track/IESG-approved experimental RFC number:
1333	           RFC 3028 (Sieve base spec)
1334	   Person and email address to contact for further information:
1335	           Tim Showalter
1336	           tjs@mirapoint.com

1338	   Capability name:        reject
1339	   Capability keyword:     reject
1340	   Capability arguments:   reject <reason: string>
1341	   Standards Track/IESG-approved experimental RFC number:
1342	           RFC 3028 (Sieve base spec)
1343	   Person and email address to contact for further information:
1344	           Tim Showalter
1345	           tjs@mirapoint.com

1347	   Capability name:        envelope
1348	   Capability keyword:     envelope
1349	   Capability arguments:
1350	           envelope [COMPARATOR] [ADDRESS-PART] [MATCH-TYPE]
1351	           <envelope-part: string-list> <key-list: string-list>
1352	   Standards Track/IESG-approved experimental RFC number:
1353	           RFC 3028 (Sieve base spec)
1354	   Person and email address to contact for further information:
1355	           Tim Showalter
1356	           tjs@mirapoint.com

1358	   Capability name:        comparator-*
1359	   Capability keyword:
1360	           comparator-* (anything starting with "comparator-")
1361	   Capability arguments:   (none)
1362	   Standards Track/IESG-approved experimental RFC number:
1363	           RFC 3028, Sieve, by reference of
1364	           RFC 2244, Application Configuration Access Protocol
1365	   Person and email address to contact for further information:
1366	           Tim Showalter
1367	           tjs@mirapoint.com

1369	6.3.     Capability Transport

1371	   Because the range of mail systems to which this document is intended
1372	   to apply is quite varied, it is difficult to define a single method
1373	   of advertising which capabilities an implementation supports.  Such
1374	   a mechanism, however, should have the property that the
1375	   implementation can advertise the complete set of extensions that it
1376	   supports.

1378	7.      Transmission

1380	   The MIME type for a Sieve script is "application/sieve".

1382	   The registration of this type for RFC 2048 requirements is as
1383	   follows:

1385	    Subject: Registration of MIME media type application/sieve

1387	    MIME media type name: application
1388	    MIME subtype name: sieve
1389	    Required parameters: none
1390	    Optional parameters: none
1391	    Encoding considerations: Most Sieve scripts will be textual,
1392	       written in UTF-8.  When non-7bit characters are used,
1393	       quoted-printable is appropriate for transport systems
1394	       that require 7bit encoding.

1396	    Security considerations: Discussed in section 10 of RFC 3028.
1397	    Interoperability considerations: Discussed in section 2.10.5
1398	       of RFC 3028.
1399	    Published specification: RFC 3028.
1400	    Applications which use this media type: Sieve-enabled mail servers
1401	    Additional information:
1402	      Magic number(s):
1403	      File extension(s): .siv
1404	      Macintosh File Type Code(s):
1405	    Person & email address to contact for further information:
1406	       See the discussion list at ietf-mta-filters@imc.org.
1407	    Intended usage:
1408	       COMMON
1409	    Author/Change controller:
1410	       See Author information in RFC 3028.

1412	8.      Parsing

1414	   The Sieve grammar is split into lexical tokens and a separate grammar
1415	   over those tokens, as in most programming languages.

1417	8.1.     Lexical Tokens

1419	   Sieve scripts are encoded in UTF-8.  The following assumes a valid
1420	   UTF-8 encoding; special characters in Sieve scripts are all ASCII.

1422	   The following are tokens in Sieve:

1424	           - identifiers
1425	           - tags
1426	           - numbers
1427	           - quoted strings
1428	           - multi-line strings
1429	           - other separators

1431	   Blanks, horizontal tabs, CRLFs, and comments ("white space") are
1432	   ignored except as they separate tokens.  Some white space is required
1433	   to separate otherwise adjacent tokens and in specific places in the
1434	   multi-line strings.

1436	   The other separators are single individual characters, and are
1437	   mentioned explicitly in the grammar.

1439	   The lexical structure of Sieve is defined in the following BNF (as
1440	   described in [ABNF]):

1442	   bracket-comment = "/*" *(CHAR-NOT-STAR / ("*" CHAR-NOT-SLASH)) "*/"
1443	                       ; No */ allowed inside a comment.
1444	                       ; (No * is allowed unless it is the last
1445	                       ; character, or unless it is followed by a
1446	                       ; character that isn't a slash.)

1448	   <<
1449	   Currently, bracketed comments can contain bare CRs and NLs.  Should
1450	   that be banned?
1451	   Also, the syntax doesn't actually permit asterisks at the end, such
1452	   as "/***/"; that will change once I work out a concise way to express
1453	   it.
1454	   >>
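
The limitation described in the note above can be checked mechanically. The following Python sketch transcribes the bracket-comment rule into a regular expression; the transcription is an approximation made for illustration (the character classes are simplified to "any character except '*'" and "any character except '/'"), and the names are hypothetical.

```python
import re

# "/*" *(CHAR-NOT-STAR / ("*" CHAR-NOT-SLASH)) "*/", transcribed loosely.
BRACKET_COMMENT = re.compile(r"/\*(?:[^*]|\*[^/])*\*/")


def is_bracket_comment(text):
    """True if the whole of text matches the transcribed rule."""
    return BRACKET_COMMENT.fullmatch(text) is not None
```

Under this transcription "/* text */" and "/**/" are accepted, while "/***/" is rejected, which agrees with the observation in the note.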

1456	   CHAR-NOT-DOT    = %x01-09 / %x0b-0c / %x0e-2d / %x2f-7f /
1457	                     UTF8-2 / UTF8-3 / UTF8-4
1458	                       ; no dots, no CRLFs

1460	   CHAR-NOT-CRLF   = %x01-09 / %x0b-0c / %x0e-7f /
1461	                     UTF8-2 / UTF8-3 / UTF8-4

1463	   CHAR-NOT-SLASH  = %x00-2e / %x30-7f /
1464	                     UTF8-2 / UTF8-3 / UTF8-4

1466	   CHAR-NOT-STAR   = %x00-29 / %x2b-7f /
1467	                     UTF8-2 / UTF8-3 / UTF8-4

1469	   comment         = bracket-comment / hash-comment

1471	   hash-comment    = ( "#" *CHAR-NOT-CRLF CRLF )

1473	   identifier      = (ALPHA / "_") *(ALPHA / DIGIT / "_")

1475	   tag             = ":" identifier

1477	   number          = 1*DIGIT [QUANTIFIER]

1479	   QUANTIFIER      = "K" / "M" / "G"
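
As a sketch of how a lexer might evaluate the number token, the following Python function applies the quantifier as a power-of-two multiplier (K = 2^10, M = 2^20, G = 2^30), which is the reading given in the Numbers section of the base specification. The function and table names are hypothetical. Quantifiers are matched without regard to case, on the assumption that ABNF literal strings are case-insensitive.

```python
import re

# 1*DIGIT [QUANTIFIER], anchored at both ends of the token.
_NUMBER = re.compile(r"([0-9]+)([KMG]?)\Z", re.IGNORECASE)
_SCALE = {"": 1, "K": 2**10, "M": 2**20, "G": 2**30}


def parse_number(token):
    """Convert a Sieve number token such as "1M" to an integer."""
    match = _NUMBER.match(token)
    if match is None:
        raise ValueError("not a Sieve number: %r" % (token,))
    digits, quantifier = match.groups()
    return int(digits) * _SCALE[quantifier.upper()]
```

With this sketch, the "1M" used in the extended example evaluates to 1048576 octets.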

1481	   quoted-string   = DQUOTE *CHAR DQUOTE
1482	                       ; in general, \ CHAR inside a string maps
1483	                       ; to CHAR, so \" maps to " and \\ maps
1484	                       ; to \.  Note that newlines and other characters
1485	                       ; are all allowed in strings.
1486	   <<need to move the escaping into the grammar>>
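
The escaping rule in the comment above can be sketched in Python as follows. The function name is hypothetical, and the input is assumed to be the text between the two DQUOTE characters, already isolated by the lexer. Treating a trailing lone backslash as an error is a choice made for this sketch, not something the grammar states.

```python
def unescape_quoted(body):
    """Apply the "\\ CHAR maps to CHAR" rule to the inside of a quoted string."""
    out = []
    chars = iter(body)
    for ch in chars:
        if ch == "\\":
            # The character after a backslash is taken literally.
            try:
                ch = next(chars)
            except StopIteration:
                raise ValueError("dangling backslash at end of string") from None
        out.append(ch)
    return "".join(out)
```

Note that an escape such as \n yields the letter "n" under this rule, not a newline.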

1488	   multi-line          = "text:" *(SP / HTAB) (hash-comment / CRLF)
1489	                         *(multi-line-literal / multi-line-dotstuff)
1490	                         "." CRLF
1491	   multi-line-literal  = [CHAR-NOT-DOT *CHAR-NOT-CRLF] CRLF
1492	   multi-line-dotstuff = "." 1*CHAR-NOT-CRLF CRLF
1493	           ;; A line containing only "." ends the multi-line.
1494	           ;; Remove a leading '.' if followed by another '.'.
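
The two comments above can be sketched as a small Python routine. The function name is hypothetical, and the input is assumed to be the lines that follow the "text:" line, with their CRLF terminators already removed.

```python
def unstuff_multiline(lines):
    """Decode the body lines of a multi-line string.

    Stops at the line containing only "." and removes one leading "."
    from any line that begins with "..".
    """
    decoded = []
    for line in lines:
        if line == ".":
            return decoded
        if line.startswith(".."):
            line = line[1:]
        decoded.append(line)
    raise ValueError("multi-line string is missing its terminating '.' line")
```

Applied to the lines "Thank you.", ".... Fred", and ".", the sketch yields "Thank you." and "... Fred", matching the four-dots-to-three behavior noted in the extended example.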

1496	   white-space = 1*(SP / CRLF / HTAB) / comment

1498	8.2.     Grammar

1500	   The following is the grammar of Sieve after it has been lexically
1501	   interpreted.  No white space or comments appear below.  The start
1502	   symbol is "start".

1504	   argument = string-list / number / tag

1506	   arguments = *argument [test / test-list]

1508	   block = "{" commands "}"

1510	   command = identifier arguments ( ";" / block )

1512	   commands = *command

1514	   start = commands

1516	   string = quoted-string / multi-line

1518	   string-list = "[" string *("," string) "]" / string
1519	     ;; if there is only a single string, the brackets are optional

1521	   test = identifier arguments

1523	   test-list = "(" test *("," test) ")"

1525	9.      Extended Example

1527	   The following is an extended example of a Sieve script.  Note that it
1528	   does not make use of the implicit keep.

1530	    #
1531	    # Example Sieve Filter
1532	    # Declare any optional features or extension used by the script
1533	    #
1534	    require ["fileinto", "reject"];

1536	    #
1537	    # Reject any large messages (note that the four leading dots get
1538	    # "stuffed" to three)
1539	    #
1540	    if size :over 1M
1541	            {
1542	            reject text:
1543	    Please do not send me large attachments.
1544	    Put your file on a server and send me the URL.
1545	    Thank you.
1546	    .... Fred
1547	    .
1548	    ;
1549	            stop;
1550	            }
1551	    #
1552	    # Handle messages from known mailing lists
1553	    # Move messages from IETF filter discussion list to filter folder
1554	    #
1555	    if header :is "Sender" "owner-ietf-mta-filters@imc.org"
1556	            {
1557	            fileinto "filter";  # move to "filter" folder
1558	            }
1559	    #
1560	    # Keep all messages to or from people in my company
1561	    #
1562	    elsif address :domain :is ["From", "To"] "example.com"
1563	            {
1564	            keep;               # keep in "In" folder
1565	            }

1567	    #
1568	    # Try and catch unsolicited email.  If a message is not to me,
1569	    # or it contains a subject known to be spam, file it away.
1570	    #
1571	    elsif anyof (not address :all :contains
1572	                   ["To", "Cc", "Bcc"] "me@example.com",
1573	                 header :matches "subject"
1574	                   ["*make*money*fast*", "*university*dipl*mas*"])
1575	            {
1576	            # If message header does not contain my address,
1577	            # it's from a list.
1578	            fileinto "spam";   # move to "spam" folder
1579	            }
1580	    else
1581	            {
1582	            # Move all other (non-company) mail to "personal"
1583	            # folder.
1584	            fileinto "personal";
1585	            }

1587	10.     Security Considerations

1589	   Users must get their mail.  It is imperative that whatever method
1590	   implementations use to store the user-defined filtering scripts be
1591	   secure.

1593	   It is equally important that implementations sanity-check the user's
1594	   scripts, and not allow users to create on-demand mailbombs.  For
1595	   instance, an implementation that allows a user to redirect a message
1596	   multiple times might also allow a user to create a mailbomb triggered
1597	   by mail from a specific user.  Site- or implementation-defined limits
1598	   on actions are useful for this.

1600	   Several commands, such as "discard", "redirect", and "fileinto" allow
1601	   for actions to be taken that are potentially very dangerous.

1603	   Implementations SHOULD take measures to prevent messages from
1604	   looping.

1606	11.     Acknowledgments

1608	   The editor gratefully acknowledges the extensive work of Tim
1609	   Showalter as the author of RFC 3028.

1611	12.     Author's Addresses

1613	   Philip Guenther
1614	   Sendmail, Inc.
1615	   6425 Christie St. Ste 400
1616	   Emeryville, CA 94608

1618	   Email: guenther+mtafilters@sendmail.com

1620	13.  Normative References

1622	   [ABNF]      Crocker, D. and P. Overell, "Augmented BNF for Syntax
1623	               Specifications: ABNF", RFC 2234, November 1997.

1625	   [COLLATION] Newman, C. and M. Duerst, "Internet Application Protocol
1626	               Collation Registry", draft-newman-i18n-comparator-03.txt
1627	               (work in progress), October 2004.

1629	   [IMAIL]     Crocker, D., "Standard for the Format of ARPA Internet
1630	               Text Messages", STD 11, RFC 822, August 1982.

1632	   [KEYWORDS]  Bradner, S., "Key words for use in RFCs to Indicate
1633	               Requirement Levels", BCP 14, RFC 2119, March 1997.

1635	   [MIME]      Freed, N. and N. Borenstein, "Multipurpose Internet Mail
1636	               Extensions (MIME) Part One: Format of Internet Message
1637	               Bodies", RFC 2045, November 1996.

1639	   [MDN]       Hansen, T., Ed. and G. Vaudreuil, Ed., "Message
1640	               Disposition Notification", RFC 3798, May 2004.

1642	   [RFC1123]   Braden, R., "Requirements for Internet Hosts --
1643	               Application and Support", STD 3, RFC 1123, November 1989.

1645	   [SMTP]      Postel, J., "Simple Mail Transfer Protocol", STD 10, RFC
1646	               821, August 1982.

1648	   [UTF-8]     Yergeau, F., "UTF-8, a transformation format of ISO
1649	               10646", RFC 3629, November 2003.

1651	14.  Informative References

1653	   [BINARY-SI] "Standard IEC 60027-2: Letter symbols to be used in
1654	               electrical technology - Part 2: Telecommunications and
1655	               electronics", January 1999.

1657	   [DSN]       Moore, K. and G. Vaudreuil, "An Extensible Message Format
1658	               for Delivery Status Notifications", RFC 1894, January
1659	               1996.

1661	   [FLAMES]    Borenstein, N. and C. Thyberg, "Power, Ease of Use, and
1662	               Cooperative Work in a Practical Multimedia Message
1663	               System", Int. J. of Man-Machine Studies, April, 1991.
1664	               Reprinted in Computer-Supported Cooperative Work and
1665	               Groupware, Saul Greenberg, editor, Harcourt Brace
1666	               Jovanovich, 1991.  Reprinted in Readings in Groupware and
1667	               Computer-Supported Cooperative Work, Ronald Baecker,
1668	               editor, Morgan Kaufmann, 1993.

1670	   [IMAP]      Crispin, M., "Internet Message Access Protocol - version
1671	               4rev1", RFC 3501, March 2003.

1673	15. Full Copyright Statement

1675	   Copyright (C) The Internet Society (2005).

1677	   This document is subject to the rights, licenses and restrictions
1678	   contained in BCP 78, and except as set forth therein, the authors
1679	   retain all their rights.

1681	   This document and the information contained herein are provided on an
1682	   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
1683	   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
1684	   ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
1685	   INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
1686	   INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
1687	   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

1689	Intellectual Property

1691	   The IETF takes no position regarding the validity or scope of any
1692	   Intellectual Property Rights or other rights that might be claimed to
1693	   pertain to the implementation or use of the technology described in
1694	   this document or the extent to which any license under such rights
1695	   might or might not be available; nor does it represent that it has
1696	   made any independent effort to identify any such rights.  Information
1697	   on the procedures with respect to rights in RFC documents can be
1698	   found in BCP 78 and BCP 79.

1700	   Copies of IPR disclosures made to the IETF Secretariat and any
1701	   assurances of licenses to be made available, or the result of an
1702	   attempt made to obtain a general license or permission for the use of
1703	   such proprietary rights by implementers or users of this
1704	   specification can be obtained from the IETF on-line IPR repository at
1705	   http://www.ietf.org/ipr.

1707	   The IETF invites any interested party to bring to its attention any
1708	   copyrights, patents or patent applications, or other proprietary
1709	   rights that may cover technology that may be required to implement
1710	   this standard.  Please address the information to the IETF at ietf-
1711	   ipr@ietf.org.

1713	Acknowledgement

1715	   Funding for the RFC Editor function is currently provided by the
1716	   Internet Society.

1718	Appendix A. Change History

1720	   Changes from RFC 3028
1721	    1. Split references into normative and informative
1722	    2. Update references to current versions of DSN, IMAP, MDN, and
1723	       UTF-8 RFCs
1724	    3. Replace "e-mail" with "email"
1725	    4. Incorporate RFC 3028 errata
1726	    5. The "reject" action cancels the implicit keep
1727	    6. Replace references to ACAP with references to the
1728	       i18n-comparator draft.  Further work is needed to completely
1729	       sync with that draft.
1730	    7. Start to update grammar to only permit legal UTF-8 (incomplete)
1731	       and correct various other errors and typos
1732	    8. Update IPR broilerplate