Network Working Group                                      Ken Murchison
Document: draft-murchison-sieve-regex-00.txt          Oceana Matrix Ltd.
Expires September 15, 2000                                 10 March 2000


                 Sieve -- Regular Expression Extension


Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   To view the list of Internet-Draft Shadow Directories, see
   http://www.ietf.org/shadow.html.

   Distribution of this memo is unlimited.

Copyright Notice

   Copyright (C) The Internet Society 2000. All Rights Reserved.

Abstract

   In some cases, it is desirable to have a string matching mechanism
   which is more powerful than a simple exact match, a substring match
   or a glob-style wildcard match. The regular expression matching
   mechanism defined in this draft should allow users to isolate just
   about any string or address in a message header or envelope.


Expires September 15, 2000        Murchison                     [Page 1]

Internet Draft         Sieve -- Regex Extension           March 10, 2000


Table of Contents

   Status of this Memo
   Copyright Notice
   Abstract
   0.   Meta-information on this draft
   0.1. Discussion
   0.2. Changes since revision 00
   1.   Introduction
   2.   Capability Identifier
   3.   Regex Match Type
   4.   Security Considerations
   5.   Acknowledgments
   6.   Author's Address
   Appendix A. References
   Appendix B. Full Copyright Statement

0. Meta-information on this draft

   This information is intended to facilitate discussion. It will be
   removed when this document leaves the Internet-Draft stage.

0.1. Discussion

   This draft is intended to be an extension to the Sieve mail
   filtering language, available from the Internet-Drafts repository as
   (where 09 is the version number, which is actually currently 09).

   This draft and the Sieve language itself are being discussed on the
   MTA Filters mailing list at . Subscription requests can be sent to
   (send an email message with the word "subscribe" in the body). More
   information on the mailing list along with a WWW archive of back
   messages is available at .

0.2. Changes since revision 00

   Added POSIX.2 ERE summary.

   Added examples.

   Editorial changes.

1. Introduction

   This is an extension to the Sieve language defined by [SIEVE] for
   comparing strings to regular expressions.

   Conventions for notations are as in [SIEVE] section 1.1, including
   use of [KEYWORDS].

2. Capability Identifier

   The capability string associated with the extension defined in this
   document is "regex".

3. Regex Match Type

   Commands that support matching may take the optional tagged
   argument ":regex" to specify that a regular expression match should
   be performed.
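
   As an illustrative sketch only (the header name, pattern, and action
   below are hypothetical choices, not requirements of this extension),
   a test that uses the ":regex" tagged argument might be written as:

   require "regex";

   # Hypothetical rule: discard messages whose Subject begins with
   # "adv:" or "advertisement:".
   if header :regex "subject" "^adv(ertisement)?:" {
       discard;
   }

   The pattern is passed as an ordinary Sieve string; the ":regex" tag
   only changes how the command interprets that string.
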
   The ":regex" match type is subject to the same rules and
   restrictions as the standard match types defined in [SIEVE]. It is
   compatible with both the "i;octet" and "i;ascii-casemap" comparators
   and may be used with them.

   Implementations MUST support extended regular expressions (EREs) as
   defined by [POSIX.2]. Regular expression syntax outside [POSIX.2]
   EREs, including [POSIX.2] basic regular expressions, word boundaries
   and backreferences, is not supported by this extension.

   The following table provides a brief summary of the regular
   expressions that MUST be supported. This table is presented here
   only as a guideline. [POSIX.2] should be used as the definitive
   reference.

   NOTE: A double-backslash is needed to escape a special character,
   as per section 2.4.2 of [SIEVE]. For example, the Sieve string
   "\\." yields the regular expression \. (a literal period).

   +------------+-----------------------------------------------------+
   | Expression | Pattern                                             |
   +------------+-----------------------------------------------------+
   | Items to match a single character                                |
   +------------+-----------------------------------------------------+
   |     .      | Match any single character except newline.          |
   |    [ ]     | Bracket expression. Match any one of the enclosed   |
   |            | characters. A hyphen (-) indicates a range of       |
   |            | consecutive characters.                             |
   |    [^ ]    | Negated bracket expression. Match any one           |
   |            | character NOT in the enclosed list. A hyphen (-)    |
   |            | indicates a range of consecutive characters.        |
   |     \\     | Escape the following special character (match       |
   |            | the literal character).                             |
   +------------+-----------------------------------------------------+
   | Items to be used within a bracket expression (localization)      |
   +------------+-----------------------------------------------------+
   |   [: :]    | Character class (alnum, alpha, blank, cntrl,        |
   |            | digit, graph, lower, print, punct, space,           |
   |            | upper, xdigit).                                     |
   |   [= =]    | Character equivalents.                              |
   |   [. .]    | Collating sequence.                                 |
   +------------+-----------------------------------------------------+
   | Quantifiers - Items to count the preceding regular expression    |
   +------------+-----------------------------------------------------+
   |     ?      | Match zero or one instances.                        |
   |     *      | Match zero or more instances.                       |
   |     +      | Match one or more instances.                        |
   |   {n,m}    | Match any number of instances between               |
   |            | n and m (inclusive). {n} matches exactly n          |
   |            | instances. {n,} matches n or more instances.        |
   +------------+-----------------------------------------------------+
   | Anchoring - Items to match positions                             |
   +------------+-----------------------------------------------------+
   |     ^      | Match the beginning of the line or string.          |
   |     $      | Match the end of the line or string.                |
   +------------+-----------------------------------------------------+
   | Other constructs                                                 |
   +------------+-----------------------------------------------------+
   |     |      | Alternation. Match either of the separated          |
   |            | regular expressions.                                |
   |    ( )     | Group the enclosed regular expression(s).           |
   +------------+-----------------------------------------------------+

   Example:

   require "regex";

   # Try to catch unsolicited email.
   if anyof (
       # if a message is not to me (with optional +detail),
       not address :regex ["to", "cc", "bcc"]
           "me(\\+.*)?@company\\.com",

       # or the subject contains two or more consecutive dollar signs,
       header :regex "subject" "\\$\\$+",

       # or the subject is all uppercase (no lowercase)
       header :regex :comparator "i;octet" "subject"
           "^[^[:lower:]]*$" ) {

       discard;    # junk it
   }

4. Security Considerations

   Security considerations are discussed in [SIEVE].
   It is believed that this extension doesn't introduce any additional
   security concerns.

5. Acknowledgments

   Thanks to Tim Showalter and Alexey Melnikov for help with this
   document.

6. Author's Address

   Ken Murchison
   Oceana Matrix Ltd.
   21 Princeton Place
   Orchard Park, NY 14127

   Phone: (716) 662-8973
   EMail: ken@oceana.com

Appendix A. References

   [KEYWORDS] Bradner, S., "Key words for use in RFCs to Indicate
   Requirement Levels", Harvard University, RFC 2119, March, 1997.

   [SIEVE] Showalter, T., "Sieve: A Mail Filtering Language",
   Mirapoint, Inc., Work In Progress.

   [POSIX.2] "Portable Operating System Interface (POSIX). Part 2,
   Shell and utilities", National Institute of Standards and
   Technology (U.S.).

Appendix B. Full Copyright Statement

   Copyright (C) The Internet Society 2000. All Rights Reserved.

   This document and translations of it may be copied and furnished to
   others, and derivative works that comment on or otherwise explain it
   or assist in its implementation may be prepared, copied, published
   and distributed, in whole or in part, without restriction of any
   kind, provided that the above copyright notice and this paragraph
   are included on all such copies and derivative works. However, this
   document itself may not be modified in any way, such as by removing
   the copyright notice or references to the Internet Society or other
   Internet organizations, except as needed for the purpose of
   developing Internet standards in which case the procedures for
   copyrights defined in the Internet Standards process must be
   followed, or as required to translate it into languages other than
   English.

   The limited permissions granted above are perpetual and will not be
   revoked by the Internet Society or its successors or assigns.

   This document and the information contained herein is provided on an
   "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
   TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
   BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
   HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
   MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.