idnits 2.17.1

draft-ietf-sieve-body-03.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** It looks like you're using RFC 3978 boilerplate.  You should update this
     to the boilerplate described in the IETF Trust License Policy document
     (see https://trustee.ietf.org/license-info), which is required now.

  -- Found old boilerplate from RFC 3978, Section 5.1 on line 14.

  -- Found old boilerplate from RFC 3978, Section 5.5 on line 460.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 471.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 478.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 484.

  ** This document has an original RFC 3978 Section 5.4 Copyright Line,
     instead of the newer IETF Trust Copyright according to RFC 4748.

  ** This document has an original RFC 3978 Section 5.5 Disclaimer, instead of
     the newer disclaimer which includes the IETF Trust according to RFC 4748.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack both a reference to RFC 2119 and the
     recommended RFC 2119 boilerplate, even if it appears to use RFC 2119
     keywords.

     RFC 2119 keyword, line 107: '...t. The implementation MUST NOT remove...'
     RFC 2119 keyword, line 108: '...rom the message, MUST NOT refuse to fi...'
     RFC 2119 keyword, line 110: '...m outright), and MUST treat multipart ...'
     RFC 2119 keyword, line 224: '... converted MAY be treated as plain U...'
     RFC 2119 keyword, line 226: '... SHOULD NOT cause early termination ...'
     (9 more instances...)

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not
     match the current year

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer.
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (March 2006) is 6609 days in the past.  Is this
     intentional?

  -- Found something which looks like a code comment -- if you have code
     sections in the document, please surround them with '<CODE BEGINS>' and
     '<CODE ENDS>' lines.

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  -- Missing reference section? 'SIEVE' on line 432 looks like a reference

  -- Missing reference section? 'KEYWORDS' on line 425 looks like a reference

  -- Missing reference section? 'COMPARATOR' on line 68 looks like a reference

  -- Missing reference section? 'MATCH-TYPE' on line 68 looks like a reference

  -- Missing reference section? 'BODY-TRANSFORM' on line 68 looks like a
     reference

  -- Missing reference section? 'MIME' on line 428 looks like a reference

  -- Missing reference section? 'UTF-8' on line 435 looks like a reference

  -- Missing reference section? 'REGEX' on line 440 looks like a reference

  -- Missing reference section? 'VARIABLES' on line 444 looks like a reference

     Summary: 4 errors (**), 0 flaws (~~), 2 warnings (==), 17 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.
--------------------------------------------------------------------------------

1  Network Working Group                                      Jutta Degener
2  Internet Draft                                           Philip Guenther
3  Expires: September 2006                                   Sendmail, Inc.
4                                                                March 2006

6                 Sieve Email Filtering:  Body Extension
7                      draft-ietf-sieve-body-03.txt

9  Status of this memo

11     By submitting this Internet-Draft, each author represents that any
12     applicable patent or other IPR claims of which he or she is aware
13     have been or will be disclosed, and any of which he or she becomes
14     aware will be disclosed, in accordance with Section 6 of BCP 79.

16     Internet-Drafts are working documents of the Internet Engineering
17     Task Force (IETF), its areas, and its working groups.  Note that
18     other groups may also distribute working documents as
19     Internet-Drafts.

21     Internet-Drafts are draft documents valid for a maximum of six
22     months and may be updated, replaced, or obsoleted by other
23     documents at any time.  It is inappropriate to use Internet-
24     Drafts as reference material or to cite them other than as
25     "work in progress."

27     The list of current Internet-Drafts can be accessed at
28     http://www.ietf.org/1id-abstracts.html

30     The list of Internet-Draft Shadow Directories can be accessed at
31     http://www.ietf.org/shadow.html

33  Copyright Notice

35     Copyright (C) The Internet Society (2006).

37  Abstract

39     This document defines a new primitive for the "Sieve" email
40     filtering language that tests for the occurrence of one or more
41     strings in the body of an email message.

43  1.  Introduction

45     The proposed "body" test checks for the occurrence of one
46     or more strings in the body of an email message.
47     Such a test was initially discussed for the [SIEVE] base
48     document, but was subsequently removed because it was
49     thought to be too costly to implement.

51     Nevertheless, several server vendors have implemented
52     some form of the "body" test.
54     This document reintroduces the "body" test as an extension,
55     and specifies its syntax and semantics.

57  2.  Conventions used.

59     Conventions for notations are as in [SIEVE] section 1.1, including
60     use of [KEYWORDS] and the "Usage:" label for the definition of
61     action and tagged arguments syntax.

63     The capability string associated with the extension defined in
64     this document is "body".

66  3.  Test body

68     Usage: "body" [COMPARATOR] [MATCH-TYPE] [BODY-TRANSFORM]
69            <key-list: string-list>

71     The body test matches text in the body of an email message, that
72     is, anything following the first empty line after the header.
73     (The empty line itself, if present, is not considered to be part
74     of the body.)

76     The COMPARATOR and MATCH-TYPE keyword parameters are defined
77     in [SIEVE].  The BODY-TRANSFORM is a keyword parameter
78     discussed in section 4, below.

80     If a message consists of a header only, not followed by an empty
81     line, all "body" tests return false, including that for an empty
82     string.

84     If a message consists of a header followed only by an empty
85     line with no body lines following it, the message is considered
86     to have an empty string as a body.

88  4.  Body Transform

90     Prior to matching text in a message body, "transformations"
91     can be applied that filter and decode certain parts of the body.
92     These transformations are selected by a "BODY-TRANSFORM"
93     keyword parameter.

95     Usage: ":raw"
96          / ":content" <content-types: string-list>
97          / ":text"

99     The default transformation is :text.

101  4.1  Body Transform ":raw"

103     The ":raw" transform is intended to match against the undecoded
104     body of a message.

106     If the specified body-transform is ":raw", the [MIME] structure
107     of the body is irrelevant.
The implementation MUST NOT remove
108     any transfer encoding from the message, MUST NOT refuse to filter
109     messages with syntactic errors (unless the environment it is
110     part of rejects them outright), and MUST treat multipart boundaries
111     or the MIME headers of enclosed body parts as part of the text
112     being matched against instead of MIME structures to interpret.

114     Example:

116          require ["body", "reject"];

118          # This will match a message containing the literal text
119          # "MAKE MONEY FAST" in body parts (ignoring any
120          # content-transfer-encodings) or MIME headers other than
121          # the outermost RFC 2822 header.

123          if body :raw :contains "MAKE MONEY FAST" {
124                  reject "MAKE MONEY FAST messages are not accepted.";
125          }

127  4.2  Body Transform ":content"

129     If the body transform is ":content", only MIME parts that have
130     the specified content-types are selected for matching.

132     If an individual content type begins or ends with a '/' (slash)
133     or contains multiple slashes, it matches no content types.
134     Otherwise, if it contains a slash, then it specifies a full
135     <type>/<subtype> pair, and matches only that specific content
136     type.  If it is the empty string, all MIME content types are
137     matched.  Otherwise, it specifies a <type> only, and any subtype
138     of that type matches it.

140     The search for MIME parts matching the :content specification
141     is recursive and automatically descends into multipart and
142     message/rfc822 MIME parts.  All MIME parts with matching types
143     are searched for the key strings.  The test returns true if any
144     combination of searched MIME part and key-list argument match.

146     If the :content specification matches a multipart MIME part,
147     only the prologue and epilogue sections of the part will be
148     searched for the key strings; the contents of nested parts are
149     only searched if their respective types match the :content
150     specification.
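
The rules in section 4.2 for deciding whether one :content string selects a given MIME type can be sketched in a few lines of Python. This is an illustration only and not part of the extension: the function name content_type_matches is hypothetical, and the case-insensitive comparison is an assumption (MIME media type names are case-insensitive, but this document does not state how :content strings are compared).

```python
def content_type_matches(spec: str, content_type: str) -> bool:
    """Return True if one :content string selects the given MIME type.

    `content_type` is expected in "type/subtype" form, without parameters.
    """
    spec = spec.lower()                  # assumption: compare case-insensitively
    content_type = content_type.lower()

    # A leading or trailing slash, or more than one slash, matches nothing.
    if spec.startswith("/") or spec.endswith("/") or spec.count("/") > 1:
        return False
    # A single slash means a full type/subtype pair: exact match only.
    if "/" in spec:
        return spec == content_type
    # The empty string matches every content type.
    if spec == "":
        return True
    # Otherwise the string names a type only; any subtype matches.
    return content_type.split("/", 1)[0] == spec
```

With this sketch, "text" selects both text/plain and text/html, "text/plain" selects only text/plain, and malformed strings such as "text/" select nothing.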
152     If the :content specification matches a message/rfc822 MIME part,
153     only the header of the nested message will be searched for the
154     key strings; the contents of the nested message body parts are
155     only searched if its content-type matches the :content specification.

157     (Matches against container types with an empty match string can
158     be useful as tests for the existence of such parts.)
159     Example:
160        From: Whomever
161        To: Someone
162        Date: Whenever
163        Subject: whatever
164        Content-Type: multipart/mixed; boundary=outer

166     &  This is a multi-part message in MIME format.
167     &
168        --outer
169        Content-Type: multipart/alternative; boundary=inner

171     &  This is a nested multi-part message in MIME format.
172     &
173        --inner
174        Content-Type: text/plain; charset="us-ascii"

176     $  Hello
177     $
178        --inner
179        Content-Type: text/html; charset="us-ascii"

181     %  Hello
182     %
183        --inner--
184     &
185     &  This is the end of the inner MIME multipart.
186     &
187        --outer
188        Content-Type: message/rfc822

190     !  From: Someone Else
191     !  Subject: hello request

193     $  Please say Hello
194     $
195        --outer--
196     &
197     &  This is the end of the outer MIME multipart.

199     In the above example, the '&', '$', '%' and '!' characters at the
200     start of a line are used to illustrate what portions of the
201     example message are used in tests:

203     - the lines starting with '&' are the ones that are tested when
204       a 'body :content "multipart" :contains "MIME"'
205       test is executed.

207     - the lines starting with '$' are the ones that are tested when
208       a 'body :content "text/plain" :contains "Hello"' test is
209       executed.

211     - the lines starting with '%' are the ones that are tested when
212       a 'body :content "text/html" :contains "Hello"' test is executed.

214     - the lines starting with '$' or '%' are the ones that are tested
215       when a 'body :content "text" :contains "Hello"' test is executed.

217     - the lines starting with '!'
are the ones that are tested when
218       a 'body :content "message/rfc822" :contains "Hello"' test is
219       executed.

221     Comparisons are performed on octets.  Implementations decode
222     the content-transfer-encoding and convert text to [UTF-8] as
223     input to the comparator.  MIME parts that cannot be decoded and
224     converted MAY be treated as plain US-ASCII, omitted, or processed
225     according to local conventions.  A NUL octet (character zero)
226     SHOULD NOT cause early termination of the content being compared
227     against.  Implementations MUST support the "quoted-printable",
228     "base64", "7bit", "8bit", and "binary" content transfer encodings.
229     Implementations MUST be capable of converting to UTF-8 the
230     US-ASCII, ISO-8859-1, and the US-ASCII subset of
231     ISO-8859-* character sets.

233     Search expressions MUST NOT match across MIME part boundaries.
234     MIME headers of the containing text MUST NOT be included in the
235     data.

237     Example:
238          require ["body", "fileinto"];

240          # Save any message with any text MIME part that contains the
241          # words "missile" or "coordinates" in the "secrets" folder.

243          if body :content "text" :contains ["missile", "coordinates"] {
244                  fileinto "secrets";
245          }

247          # Save any message with an audio/mp3 MIME part in
248          # the "jukebox" folder.

250          if body :content "audio/mp3" :contains "" {
251                  fileinto "jukebox";
252          }

254  4.3  Body Transform ":text"

256     The ":text" body transform matches against the results of
257     an implementation's best effort at extracting UTF-8 encoded
258     text from a message.

260     In simple implementations, :text MAY be treated the same
261     as :content "text".

263     Sophisticated implementations MAY strip mark-up from the text
264     prior to matching, and MAY convert media types other than text
265     to text prior to matching.

267     (For example, they may be able to convert proprietary text
268     editor formats to text or apply optical character recognition
269     algorithms to image data.)
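
As a rough illustration of the simple case, in which :text is treated the same as :content "text", the following Python sketch walks the MIME parts of a message, removes the content transfer encoding, converts each text part from its declared charset, and tests each part separately so that a key never matches across a part boundary. Everything here is an assumption of the sketch and not a requirement of this document: the function names are hypothetical, Python's standard email package stands in for the implementation's MIME parser, parts that cannot be decoded are omitted (one of the options allowed in section 4.2), and a case-sensitive substring check stands in for the comparator and match type.

```python
from email import message_from_bytes
from email.message import Message
from typing import Iterable, Optional


def decoded_text(part: Message) -> Optional[str]:
    """Undo the transfer encoding and charset of one leaf MIME part."""
    payload = part.get_payload(decode=True)   # None for container parts
    if payload is None:
        return None
    charset = part.get_content_charset() or "us-ascii"
    try:
        return payload.decode(charset)
    except (LookupError, UnicodeDecodeError):
        return None                           # assumption: omit undecodable parts


def simple_text_contains(raw_message: bytes, keys: Iterable[str]) -> bool:
    """Sketch of 'body :text :contains keys' for a simple implementation."""
    keys = list(keys)
    # walk() descends into multipart and message/rfc822 parts.
    for part in message_from_bytes(raw_message).walk():
        if part.get_content_maintype() != "text":
            continue
        text = decoded_text(part)
        # Each part is searched on its own; keys never span two parts.
        if text is not None and any(key in text for key in keys):
            return True
    return False
```

A sophisticated implementation would replace the maintype check with mark-up stripping or format conversion; the per-part search loop would stay the same.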
271     Example:
272          require ["body", "fileinto"];

274          # Save messages mentioning the project schedule in the
275          # project/schedule folder.
276          if body :text :contains "project schedule" {
277                  fileinto "project/schedule";
278          }

280  5.  Interaction with Other Sieve Extensions

282     Any extension that extends the grammar for the COMPARATOR or
283     MATCH-TYPE nonterminals will also affect the implementation of
284     "body".

286     The [REGEX] extension can place a considerable load on a system
287     when applied to whole bodies of messages, especially when
288     implemented naively or used maliciously.

290     Regular and wildcard expressions used with "body" are exempt
291     from the side effects described in [VARIABLES].  That is, they
292     MUST NOT set match variables (${1}, ${2}...) to the input values
293     corresponding to wildcard sequences in the matched pattern.
294     However, if that extension is present, variable references in the
295     key strings or content type strings are evaluated as described
296     in [VARIABLES].

298  6.  IANA Considerations

300     The following template specifies the IANA registration of the Sieve
301     extension specified in this document:

303     To: iana@iana.org
304     Subject: Registration of new Sieve extension

306     Capability name: body
307     Capability keyword: body
308     Capability arguments: N/A
309     Standards Track/IESG-approved experimental RFC number: this RFC
310     Person and email address to contact for further information:

312         Jutta Degener
313         jutta@pobox.com

315     This information should be added to the list of sieve extensions
316     given on http://www.iana.org/assignments/sieve-extensions.

318  7.  Security Considerations

320     The system MUST be sized and restricted in such a manner that
321     even malicious use of body matching does not deny service to
322     other users of the host system.

324     Filters relying on string matches in the raw body of an email
325     message may be more general than intended.
Text matches are no
326     replacement for a spam, virus, or other security related
327     filtering system.

329  8.  Acknowledgments

331     This document has been revised in part based on comments and
332     discussions that took place on and off the SIEVE mailing list.
333     Thanks to Cyrus Daboo, Ned Freed, Bob Johannessen, Simon Josefsson,
334     Mark E. Mallett, Chris Markle, Alexey Melnikov, Ken Murchison,
335     Greg Shapiro, Tim Showalter, Nigel Swinson, and Dowson Tong for
336     reviews and suggestions.

338  9.  Authors' Addresses

340     Jutta Degener
341     5245 College Ave, Suite #127
342     Oakland, CA 94618

344     Email: jutta@pobox.com

346     Philip Guenther
347     Sendmail, Inc.
348     6425 Christie Ave, 4th Floor
349     Emeryville, CA 94608

351     Email: guenther@sendmail.com

353  10.  Discussion

355     This section will be removed when this document leaves the
356     Internet-Draft stage.

358     This draft is intended as an extension to the Sieve mail filtering
359     language.  Sieve extensions are discussed on the MTA Filters mailing
360     list at .  Subscription requests can
361     be sent to  (send an email
362     message with the word "subscribe" in the body).

364     More information on the mailing list along with a WWW archive of
365     back messages is available at .

367  10.1  Changes from draft-ietf-sieve-body-02.txt

369     Updated charset conversion to match draft-ietf-sieve-3028bis-06.txt.

371     Change "Syntax:" to "Usage:".

373     Updated references.

375  10.2  Changes from draft-ietf-sieve-body-01.txt

377     Updated charset conversion requirements to match those in
378     draft-ietf-sieve-3028bis-03.txt for headers.

380  10.3  Changes from draft-ietf-sieve-body-00.txt

382     Updated IPR boilerplate to RFC 3978/3979.

384     Many prose corrections in response to WGLC comments.
Of particular
385     note:
386       - made clear that :raw treats MIME boundaries and headers as
387         text to be matched against
388       - corrected description in comment of :raw example
389       - clarified the interpretation of invalid content-types in
390         :content
391       - gave precise description of what gets matched when :content
392         is used with message/rfc822 or any multipart type, as well
393         as a comprehensive example
394       - include an example of :text
395       - tightened wording of interaction with [VARIABLES]
396       - added informative reference to [REGEX]

398  10.4  Changes from draft-degener-sieve-body-04.txt

400     Renamed to draft-ietf-sieve-body-00.txt; tweaked the title and
401     abstract.

403     Added Philip Guenther as co-author.

405     Split references into normative and informative.  Updated [UTF-8]
406     and [VARIABLES] references.

408     Updated IPR boilerplate.

410  10.5  Changes from draft-degener-sieve-body-03.txt

412     Made "body" exempt from variable-setting side effects in the
413     presence of the "variables" extension and wild cards.  It's too
414     hard to implement.

416     Removed :binary.  It's uglier and less useful than it needs to be
417     to bother.

419     Added IANA section.

421  Appendices

423  Appendix A.  Normative References

425     [KEYWORDS]  Bradner, S., "Key words for use in RFCs to Indicate
426                 Requirement Levels", RFC 2119, March 1997.

428     [MIME]      Freed, N. and N. Borenstein, "Multipurpose Internet Mail
429                 Extensions (MIME) Part One: Format of Internet Message
430                 Bodies", RFC 2045, November 1996.

432     [SIEVE]     Guenther, P. and T. Showalter, "Sieve: A Mail Filtering
433                 Language", draft-ietf-sieve-3028bis-06, March 2006.

435     [UTF-8]     Yergeau, F., "UTF-8, a transformation format of ISO
436                 10646", RFC 3629, November 2003.

438  Appendix B.
Informative References

440     [REGEX]     Murchison, K., "Sieve Email Filtering -- Regular
441                 Expression Extension",
442                 draft-ietf-sieve-regex-00.txt, February 2006

444     [VARIABLES] Homme, K.T., "Sieve Extension: Variables",
445                 draft-ietf-sieve-variables-07.txt, October 2005

447  Copyright Statement

449     Copyright (C) The Internet Society (2006).  This document is
450     subject to the rights, licenses and restrictions contained in
451     BCP 78, and except as set forth therein, the authors retain all
452     their rights.

454     This document and the information contained herein are provided on an
455     "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE
456     REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE
457     INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR
458     IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
459     THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
460     WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

462  Intellectual Property

464     The IETF takes no position regarding the validity or scope of any
465     Intellectual Property Rights or other rights that might be claimed to
466     pertain to the implementation or use of the technology described in
467     this document or the extent to which any license under such rights
468     might or might not be available; nor does it represent that it has
469     made any independent effort to identify any such rights.  Information
470     on the procedures with respect to rights in RFC documents can be
471     found in BCP 78 and BCP 79.

473     Copies of IPR disclosures made to the IETF Secretariat and any
474     assurances of licenses to be made available, or the result of an
475     attempt made to obtain a general license or permission for the use
476     of such proprietary rights by implementers or users of this
477     specification can be obtained from the IETF on-line IPR repository
478     at http://www.ietf.org/ipr.
480     The IETF invites any interested party to bring to its attention
481     any copyrights, patents or patent applications, or other
482     proprietary rights that may cover technology that may be required
483     to implement this standard.  Please address the information to the
484     IETF at ietf-ipr@ietf.org.

486  Acknowledgement

488     Funding for the RFC Editor function is currently provided by
489     the Internet Society.