idnits 2.17.1

draft-murchison-sieve-regex-06.txt:
-(258): Line appears to be too long, but this could be caused by non-ascii characters in UTF-8 encoding

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Looks like you're using RFC 2026 boilerplate.  This must be updated to
     follow RFC 3978/3979, as updated by RFC 4748.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  ** Missing document type: Expected "INTERNET-DRAFT" in the upper left hand
     corner of the first page

  ** The document seems to lack a 1id_guidelines paragraph about
     Internet-Drafts being working documents.

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     Shadow Directories.

  == There are 2 instances of lines with non-ascii characters in the document.

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an IANA Considerations section.  (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** There are 7 instances of too long lines in the document, the longest one
     being 1 character in excess of 72.

  ** The document seems to lack a both a reference to RFC 2119 and the
     recommended RFC 2119 boilerplate, even if it appears to use RFC 2119
     keywords.

     RFC 2119 keyword, line 100: '... implementations SHOULD reject regexes...'

     RFC 2119 keyword, line 159: '... Implementations MUST support extended...'

     RFC 2119 keyword, line 164: '... Implementations SHOULD reject regular...'

     RFC 2119 keyword, line 168: '...expressions that MUST be supported. T...'

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not
     match the current year

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer.
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (30 June 2002) is 7965 days in the past.  Is this
     intentional?

  -- Found something which looks like a code comment -- if you have code
     sections in the document, please surround them with '<CODE BEGINS>' and
     '<CODE ENDS>' lines.

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  -- Missing reference section? 'SIEVE' on line 254 looks like a reference

  -- Missing reference section? 'KEYWORDS' on line 251 looks like a reference

     Summary: 7 errors (**), 0 flaws (~~), 3 warnings (==), 5 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

  2  Network Working Group                                       K. Murchison
  3  Document: draft-murchison-sieve-regex-06.txt          Oceana Matrix Ltd.
  4  Expires January 5, 2003                                     30 June 2002

  6                  Sieve -- Regular Expression Extension

  8  Status of this Memo

 10     This document is an Internet-Draft and is in full conformance with
 11     all provisions of Section 10 of RFC2026.  Internet-Drafts are working
 12     documents of the Internet Engineering Task Force (IETF), its areas,
 13     and its working groups.
                                 Note that other groups may also distribute
 14     working documents as Internet-Drafts.

 16     Internet-Drafts are draft documents valid for a maximum of six months
 17     and may be updated, replaced, or obsoleted by other documents at any
 18     time.  It is inappropriate to use Internet-Drafts as reference
 19     material or to cite them other than as "work in progress."

 21     The list of current Internet-Drafts can be accessed at
 22     http://www.ietf.org/ietf/1id-abstracts.txt

 24     To view the list of Internet-Draft Shadow Directories, see
 25     http://www.ietf.org/shadow.html.

 27     Distribution of this memo is unlimited.

 29  Copyright Notice

 31     Copyright (C) The Internet Society 2002.  All Rights Reserved.

 33  Abstract

 35     In some cases, it is desirable to have a string matching mechanism
 36     which is more powerful than a simple exact match, a substring match
 37     or a glob-style wildcard match.  The regular expression matching
 38     mechanism defined in this draft should allow users to isolate just
 39     about any string or address in a message header or envelope.

 41  Table of Contents

 43     Status of this Memo

 45     Copyright Notice

 47     Abstract

 49     0.     Meta-information on this draft

 51     0.1.   Discussion

 53     0.2.   Noted Changes

 55     0.2.1  since -05

 57     0.2.2  since -04

 59     0.3.   Open Issues

 61     1.     Introduction

 63     2.     Capability Identifier

 65     3.     Regex Match Type

 67     4.     Security Considerations

 69     5.
               Acknowledgments

 71     6.     Author's Address

 73     Appendix A.  References

 75     Appendix B.  Full Copyright Statement

 76  0.  Meta-information on this draft

 78     This information is intended to facilitate discussion.  It will be
 79     removed when this document leaves the Internet-Draft stage.

 81  0.1.  Discussion

 83     This draft is intended to be an extension to the Sieve mail filtering
 84     language, available from the RFC repository as
 85     .

 87     This draft and the Sieve language itself are being discussed on the
 88     MTA Filters mailing list at .  Subscription
 89     requests can be sent to  (send an
 90     email message with the word "subscribe" in the body).  More
 91     information on the mailing list along with a WWW archive of back
 92     messages is available at .

 94  0.2.  Noted Changes

 96  0.2.1  since -05

 98     Added open issue regarding localization/internationalization.

100     Added that implementations SHOULD reject regexes not supported by
101     this extension.

103     Editorial changes.

105  0.2.2  since -04

107     Editorial changes.

109  0.3.  Open Issues

111     The major open issue with this draft is what to do, if anything,
112     about localization/internationalization.  Are [POSIX.2] collating
113     sequences and character equivalents sufficient?

115  1.  Introduction

117     This is an extension to the Sieve language defined by [SIEVE] for
118     comparing strings to regular expressions.

120     Conventions for notations are as in [SIEVE] section 1.1, including
121     use of [KEYWORDS].

123  2.  Capability Identifier

125     The capability string associated with the extension defined in this
126     document is "regex".

128  3.  Regex Match Type

130     Commands that support matching may take the optional tagged argument
131     ":regex" to specify that a regular expression match should be
132     performed.
                    The ":regex" match type is subject to the same rules and
133     restrictions as the standard match types defined in [SIEVE].  For
134     convenience, the "MATCH-TYPE" syntax element defined in [SIEVE] is
135     augmented here as follows:

137        MATCH-TYPE  =/  ":regex"

139     Example:

141     require "regex";

143     # Try to catch unsolicited email.
144     if anyof (
145       # if a message is not to me (with optional +detail),
146       not address :regex ["to", "cc", "bcc"]
147         "me(\\+.*)?@company\\.com",

149       # or the subject is all uppercase (no lowercase)
150       header :regex :comparator "i;octet" "subject"
151         "^[^[:lower:]]+$" ) {

153       discard;    # junk it
154     }

156     The ":regex" match type is compatible with both the "i;octet" and
157     "i;ascii-casemap" comparators and may be used with them.

159     Implementations MUST support extended regular expressions (EREs) as
160     defined by [POSIX.2].  Any regular expression not defined by
162     [POSIX.2], as well as [POSIX.2] basic regular expressions, word
163     boundaries and backreferences are not supported by this extension.
164     Implementations SHOULD reject regular expressions that are
165     unsupported by this specification as a syntax error.

167     The following table provides a brief summary of the regular
168     expressions that MUST be supported.  This table is presented here
169     only as a guideline.  [POSIX.2] should be used as the definitive
170     reference.

172     +------------+-----------------------------------------------------+
173     | Expression | Pattern                                             |
174     +------------+-----------------------------------------------------+
175     | Items to match a single character                                |
176     +------------+-----------------------------------------------------+
177     | .          | Match any single character except newline.          |
178     | [ ]        | Bracket expression.  Match any one of the enclosed  |
179     |            | characters.  A hyphen (-) indicates a range of      |
180     |            | consecutive characters.                             |
181     | [^ ]       | Negated bracket expression.  Match any one          |
182     |            | character NOT in the enclosed list.
                                                            A hyphen (-)   |
183     |            | indicates a range of consecutive characters.        |
184     | \\         | Escape the following special character (match       |
185     |            | the literal character).  Undefined for other        |
186     |            | characters.                                         |
187     |            | NOTE: Unlike [POSIX.2], a double-backslash is       |
188     |            | required as per section 2.4.2 of [SIEVE].           |
189     +------------+-----------------------------------------------------+
190     | Items to be used within a bracket expression (localization)      |
191     +------------+-----------------------------------------------------+
192     | [: :]      | Character class (alnum, alpha, blank, cntrl,        |
193     |            | digit, graph, lower, print, punct, space,           |
194     |            | upper, xdigit).                                     |
195     | [= =]      | Character equivalents.                              |
196     | [. .]      | Collating sequence.                                 |
197     +------------+-----------------------------------------------------+
198     | Quantifiers - Items to count the preceding regular expression    |
199     +------------+-----------------------------------------------------+
200     | ?          | Match zero or one instances.                        |
201     | *          | Match zero or more instances.                       |
202     | +          | Match one or more instances.                        |
203     | {n,m}      | Match any number of instances between               |
204     |            | n and m (inclusive).  {n} matches exactly n         |
205     |            | instances.  {n,} matches n or more instances.       |
206     +------------+-----------------------------------------------------+
210     | Anchoring - Items to match positions                             |
211     +------------+-----------------------------------------------------+
212     | ^          | Match the beginning of the line or string.          |
213     | $          | Match the end of the line or string.                |
214     +------------+-----------------------------------------------------+
215     | Other constructs                                                 |
216     +------------+-----------------------------------------------------+
217     | |          | Alternation.  Match either of the separated         |
218     |            | regular expressions.
                                                                           |
219     | ( )        | Group the enclosed regular expression(s).           |
220     +------------+-----------------------------------------------------+

222  4.  Security Considerations

224     Security considerations are discussed in [SIEVE].  It is believed
225     that this extension doesn't introduce any additional security
226     concerns.

228     However, a poor implementation could introduce security problems
229     ranging from degradation of performance to denial of service.  If an
230     implementation uses a third-party regular expression library, that
231     library should be checked for potentially problematic regular
232     expressions, such as "(.*)*".

234  5.  Acknowledgments

236     Thanks to Tim Showalter, Alexey Melnikov, Tony Hansen, Phil Pennock
237     and Jutta Degener for their help with this document.

239  6.  Author's Address

241     Kenneth Murchison
242     Oceana Matrix Ltd.
243     21 Princeton Place
244     Orchard Park, NY 14127

246     Phone: (716) 662-8973
247     EMail: ken@oceana.com

249  Appendix A.  References

251     [KEYWORDS] Bradner, S., "Key words for use in RFCs to Indicate
252     Requirement Levels", Harvard University, RFC 2119, March 1997.

254     [SIEVE] Showalter, T., "Sieve: A Mail Filtering Language",
255     Mirapoint, Inc., RFC 3028, January 2001.

257     [POSIX.2] "Portable Operating System Interface (POSIX). Part 2,
258     Shell and utilities", National Institute of Standards and
259     Technology (U.S.).

261  Appendix B.  Full Copyright Statement

263     Copyright (C) The Internet Society 2002.  All Rights Reserved.

265     This document and translations of it may be copied and furnished to
266     others, and derivative works that comment on or otherwise explain it
267     or assist in its implementation may be prepared, copied, published
268     and distributed, in whole or in part, without restriction of any
269     kind, provided that the above copyright notice and this paragraph
270     are included on all such copies and derivative works.
                                                                However, this
271     document itself may not be modified in any way, such as by removing
272     the copyright notice or references to the Internet Society or other
273     Internet organizations, except as needed for the purpose of
274     developing Internet standards in which case the procedures for copyrights
275     defined in the Internet Standards process must be followed, or as
276     required to translate it into languages other than English.

278     The limited permissions granted above are perpetual and will not be
279     revoked by the Internet Society or its successors or assigns.

281     This document and the information contained herein is provided on an
282     "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
283     TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
284     BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
285     HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
286     MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
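
Illustrative sketch (non-normative).  The table in Section 3 notes that a
double backslash is required in a Sieve script because the string is
unescaped before the regular expression engine sees it.  The Python
sketch below shows that two-step handling for the address pattern from
the Section 3 example.  The helper names sieve_unescape and
sieve_regex_matches are hypothetical, and Python's re module is only a
stand-in for a [POSIX.2] ERE engine: it is assumed to agree with ERE on
the constructs used here, but it does not implement [: :], [= =] or
[. .].

```python
import re


def sieve_unescape(quoted: str) -> str:
    """Undo Sieve quoted-string escaping (hypothetical helper).

    Assumption, following [SIEVE] section 2.4.2: a backslash followed
    by any character stands for that character.
    """
    return re.sub(r"\\(.)", lambda m: m.group(1), quoted, flags=re.DOTALL)


def sieve_regex_matches(quoted_pattern: str, value: str) -> bool:
    """Return True if the unescaped pattern matches anywhere in value."""
    # Python's re stands in for a POSIX ERE engine here (assumption).
    pattern = sieve_unescape(quoted_pattern)
    return re.search(pattern, value) is not None


if __name__ == "__main__":
    # Pattern text exactly as written between the quotes in the script.
    script_text = r"me(\\+.*)?@company\\.com"
    print(sieve_unescape(script_text))
    print(sieve_regex_matches(script_text, "me+lists@company.com"))
```

The pattern is searched for anywhere in the value, which is why the
subject test in the Section 3 example anchors itself with ^ and $.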