idnits 2.17.1

draft-ietf-sieve-body-02.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** It looks like you're using RFC 3978 boilerplate. You should update this
     to the boilerplate described in the IETF Trust License Policy document
     (see https://trustee.ietf.org/license-info), which is required now.

  -- Found old boilerplate from RFC 3978, Section 5.1 on line 14.

  -- Found old boilerplate from RFC 3978, Section 5.5 on line 453.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 464.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 471.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 477.

  ** This document has an original RFC 3978 Section 5.4 Copyright Line,
     instead of the newer IETF Trust Copyright according to RFC 4748.

  ** This document has an original RFC 3978 Section 5.5 Disclaimer, instead
     of the newer disclaimer which includes the IETF Trust according to
     RFC 4748.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard

  == The page length should not exceed 58 lines per page, but there was 1
     longer page, the longest (page 1) being 509 lines

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack a both a reference to RFC 2119 and the
     recommended RFC 2119 boilerplate, even if it appears to use RFC 2119
     keywords.

     RFC 2119 keyword, line 107: '...t. The implementation MUST NOT remove...'
     RFC 2119 keyword, line 108: '...rom the message, MUST NOT refuse to fi...'
     RFC 2119 keyword, line 110: '...m outright), and MUST treat multipart ...'
     RFC 2119 keyword, line 225: '... converted MAY be treated as plain U...'
     RFC 2119 keyword, line 227: '... SHOULD NOT cause early termination ...'

     (9 more instances...)

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not
     match the current year

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008. If you
     have contacted all the original authors and they are all willing to
     grant the BCP78 rights to the IETF Trust, then this is fine, and you can
     ignore this comment. If not, you may need to add the pre-RFC5378
     disclaimer. (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (July 2005) is 6854 days in the past. Is this
     intentional?

  -- Found something which looks like a code comment -- if you have code
     sections in the document, please surround them with '<CODE BEGINS>' and
     '<CODE ENDS>' lines.

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  -- Missing reference section? 'SIEVE' on line 425 looks like a reference

  -- Missing reference section? 'KEYWORDS' on line 418 looks like a reference

  -- Missing reference section? 'COMPARATOR' on line 68 looks like a reference

  -- Missing reference section? 'MATCH-TYPE' on line 68 looks like a reference

  -- Missing reference section? 'BODY-TRANSFORM' on line 68 looks like a
     reference

  -- Missing reference section? 'MIME' on line 421 looks like a reference

  -- Missing reference section? 'UTF-8' on line 428 looks like a reference

  -- Missing reference section? 'REGEX' on line 433 looks like a reference

  -- Missing reference section? 'VARIABLES' on line 437 looks like a reference

  Summary: 4 errors (**), 0 flaws (~~), 3 warnings (==), 17 comments (--).

  Run idnits with the --verbose option for more detailed information about
  the items above.

--------------------------------------------------------------------------------

1     Network Working Group                                  Jutta Degener
2     Internet Draft                                       Philip Guenther
3     Expires: January 2006                                 Sendmail, Inc.
4                                                                July 2005

6                    Sieve Email Filtering: Body Extension
7                        draft-ietf-sieve-body-02.txt

9     Status of this memo

11        By submitting this Internet-Draft, each author represents that any
12        applicable patent or other IPR claims of which he or she is aware
13        have been or will be disclosed, and any of which he or she becomes
14        aware will be disclosed, in accordance with Section 6 of BCP 79.

16        Internet-Drafts are working documents of the Internet Engineering
17        Task Force (IETF), its areas, and its working groups. Note that
18        other groups may also distribute working documents as
19        Internet-Drafts.

21        Internet-Drafts are draft documents valid for a maximum of six
22        months and may be updated, replaced, or obsoleted by other
23        documents at any time. It is inappropriate to use Internet-
24        Drafts as reference material or to cite them other than as
25        "work in progress."

27        The list of current Internet-Drafts can be accessed at
28        http://www.ietf.org/1id-abstracts.html

30        The list of Internet-Draft Shadow Directories can be accessed at
31        http://www.ietf.org/shadow.html

33    Copyright Notice

35        Copyright (C) The Internet Society (2005).

37    Abstract

39        This document defines a new primitive for the "Sieve" email
40        filtering language that tests for the occurrence of one or more
41        strings in the body of an email message.

43    1. Introduction

45        The proposed "body" test checks for the occurrence of one
46        or more strings in the body of an email message.
47        Such a test was initially discussed for the [SIEVE] base
48        document, but was subsequently removed because it was
49        thought to be too costly to implement.

51        Nevertheless, several server vendors have implemented
52        some form of the "body" test.

54        This document reintroduces the "body" test as an extension,
55        and specifies its syntax and semantics.

57    2. Conventions used.

59        Conventions for notations are as in [SIEVE] section 1.1, including
60        use of [KEYWORDS] and the "Syntax:" label for the definition of
61        action and tagged arguments syntax.

63        The capability string associated with the extension defined in
64        this document is "body".

66    3. Test body

68        Syntax: "body" [COMPARATOR] [MATCH-TYPE] [BODY-TRANSFORM]
69                <key-list: string-list>

71        The body test matches text in the body of an email message, that
72        is, anything following the first empty line after the header.
73        (The empty line itself, if present, is not considered to be part
74        of the body.)

76        The COMPARATOR and MATCH-TYPE keyword parameters are defined
77        in [SIEVE]. The BODY-TRANSFORM is a keyword parameter
78        discussed in section 4, below.

80        If a message consists of a header only, not followed by an empty
81        line, all "body" tests return false, including that for an empty
82        string.

84        If a message consists of a header followed only by an empty
85        line with no body lines following it, the message is considered
86        to have an empty string as a body.

88    4. Body Transform

90        Prior to matching text in a message body, "transformations"
91        can be applied that filter and decode certain parts of the body.
92        These transformations are selected by a "BODY-TRANSFORM"
93        keyword parameter.

95        Syntax: ":raw"
96              / ":content" <content-types: string-list>
97              / ":text"

99        The default transformation is :text.

101   4.1 Body Transform ":raw"

103       The ":raw" transform is intended to match against the undecoded
104       body of a message.

106       If the specified body-transform is ":raw", the [MIME] structure
107       of the body is irrelevant. The implementation MUST NOT remove
108       any transfer encoding from the message, MUST NOT refuse to filter
109       messages with syntactic errors (unless the environment it is
110       part of rejects them outright), and MUST treat multipart boundaries
111       or the MIME headers of enclosed body parts as part of the text
112       being matched against instead of MIME structures to interpret.

114       Example:

116           require ["body", "reject"];

118           # This will match a message containing the literal text
119           # "MAKE MONEY FAST" in body parts (ignoring any
120           # content-transfer-encodings) or MIME headers other than
121           # the outermost RFC 2822 header.

123           if body :raw :contains "MAKE MONEY FAST" {
124               reject;
125           }

127   4.2 Body Transform ":content"

129       If the body transform is ":content", only MIME parts that have
130       the specified content-types are selected for matching.

132       If an individual content type begins or ends with a '/' (slash)
133       or contains multiple slashes, it matches no content types.
134       Otherwise, if it contains a slash, then it specifies a full
135       <type>/<subtype> pair, and matches only that specific content
136       type. If it is the empty string, all MIME content types are
137       matched. Otherwise, it specifies a <type> only, and any subtype
138       of that type matches it.

140       The search for MIME parts matching the :content specification
141       is recursive and automatically descends into multipart and
142       message/rfc822 MIME parts. All MIME parts with matching types
143       are searched for the key strings. The test returns true if any
144       combination of searched MIME part and key-list argument match.

146       If the :content specification matches a multipart MIME part,
147       only the prologue and epilogue sections of the part will be
148       searched for the key strings; the contents of nested parts are
149       only searched if their respective types match the :content
150       specification.
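
The content-type selection rule in section 4.2 (lines 132-138) can be sketched as a small predicate. This is only an illustration and not part of the draft: the function name and both parameter names are hypothetical, and the case folding is an assumption based on MIME type names being case-insensitive.

```python
def content_type_matches(spec: str, part_type: str) -> bool:
    """Decide whether one :content string selects a MIME part.

    spec      -- one string from the :content argument (hypothetical name)
    part_type -- the part's "type/subtype" value (hypothetical name)
    """
    # Assumption: compare case-insensitively, as MIME type names are.
    spec = spec.lower()
    part_type = part_type.lower()

    # The empty string selects every MIME content type.
    if spec == "":
        return True

    # A leading slash, a trailing slash, or more than one slash
    # means the specification matches no content type at all.
    if spec.startswith("/") or spec.endswith("/") or spec.count("/") > 1:
        return False

    # Exactly one interior slash: a full type/subtype pair.
    if "/" in spec:
        return part_type == spec

    # No slash: a bare type, so any subtype of that type matches.
    return part_type.split("/", 1)[0] == spec
```

An implementation would presumably evaluate such a predicate for each string in the :content list against each part found while descending the MIME tree, and search only the parts for which some string returns true.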
152       If the :content specification matches a message/rfc822 MIME part,
153       only the header of the nested message will be searched for the
154       key strings; the contents of the nested message body parts are
155       only searched if its content-type matches the :content specification.

157       (Matches against container types with an empty match string can
158       be useful as tests for the existence of such parts.)

160       Example:
161           From: Whomever
162           To: Someone
163           Date: Whenever
164           Subject: whatever
165           Content-Type: multipart/mixed; boundary=outer

167         & This is a multi-part message in MIME format.
168         &
169           --outer
170           Content-Type: multipart/alternative; boundary=inner

172         & This is a nested multi-part message in MIME format.
173         &
174           --inner
175           Content-Type: text/plain; charset="us-ascii"

177         $ Hello
178         $
179           --inner
180           Content-Type: text/html; charset="us-ascii"

182         % Hello
183         %
184           --inner--
185         &
186         & This is the end of the inner MIME multipart.
187         &
188           --outer
189           Content-Type: message/rfc822

191         ! From: Someone Else
192         ! Subject: hello request

194         $ Please say Hello
195         $
196           --outer--
197         &
198         & This is the end of the outer MIME multipart.

200       In the above example, the '&', '$' and '%' characters at the
201       start of a line are used to illustrate what portions of the
202       example message are used in tests:

204       - the lines starting with '&' are the ones that are tested when
205         a 'body :content "multipart" :contains "MIME"'
206         test is executed.

208       - the lines starting with '$' are the ones that are tested when
209         a 'body :content "text/plain" :contains "Hello"' test is
210         executed.

212       - the lines starting with '%' are the ones that are tested when
213         a 'body :content "text/html" :contains "Hello"' test is executed.

215       - the lines starting with '$' or '%' are the ones that are tested
216         when a 'body :content "text" :contains "Hello"' test is executed.

218       - the lines starting with '!' are the ones that are tested when
219         a 'body :content "message/rfc822" :contains "Hello"' test is
220         executed.

222       Comparisons are performed in Unicode. Implementations decode
223       the content-transfer-encoding and convert text to Unicode as
224       input to the comparator. MIME parts that cannot be decoded and
225       converted MAY be treated as plain US-ASCII, omitted, or processed
226       according to local conventions. A NUL octet (character zero)
227       SHOULD NOT cause early termination of the content being compared
228       against. Implementations MUST support the "quoted-printable",
229       "base64", "7bit", "8bit", and "binary" content transfer encodings.
230       Implementations MUST be capable of converting to the Unicode the
231       US-ASCII, [UTF-8], ISO-8859-1, and the US-ASCII subset of
232       ISO-8859-* character sets.

234       Search expressions MUST NOT match across MIME part boundaries.
235       MIME headers of the containing text MUST NOT be included in the
236       data.

238       Example:
239           require ["body", "fileinto"];

241           # Save any message with any text MIME part that contains the
242           # words "missile" or "coordinates" in the "secrets" folder.

244           if body :content "text" :contains ["missile", "coordinates"] {
245               fileinto "secrets";
246           }

248           # Save any message with an audio/mp3 MIME part in
249           # the "jukebox" folder.

251           if body :content "audio/mp3" :contains "" {
252               fileinto "jukebox";
253           }

255   4.3 Body Transform ":text"

257       The ":text" body transform matches against the results of
258       an implementation's best effort at extracting UTF-8 encoded
259       text from a message.

261       In simple implementations, :text MAY be treated the same
262       as :content "text".

264       Sophisticated implementations MAY strip mark-up from the text
265       prior to matching, and MAY convert media types other than text
266       to text prior to matching.

268       (For example, they may be able to convert proprietary text
269       editor formats to text or apply optical character recognition
270       algorithms to image data.)
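
The simple case in section 4.3, where :text is treated the same as :content "text", might look roughly like the sketch below. It is an illustration only, built on assumptions that are not in the draft: it uses Python's standard email package to parse the message, the names text_parts, body_text_contains and SAMPLE are hypothetical, undecodable parts are simply omitted (one of the options the draft permits), and ASCII case folding stands in for the default comparator.

```python
import string
from email import message_from_string

# Assumption: approximate the default comparator by folding ASCII letters only.
_FOLD = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)

# Hypothetical sample: one base64 text part and one non-text part.
SAMPLE = (
    "From: a@example.com\n"
    "Content-Type: multipart/mixed; boundary=outer\n"
    "\n"
    "--outer\n"
    'Content-Type: text/plain; charset="us-ascii"\n'
    "Content-Transfer-Encoding: base64\n"
    "\n"
    "UHJvamVjdCBTY2hlZHVsZQ==\n"
    "--outer\n"
    "Content-Type: application/octet-stream\n"
    "\n"
    "secret\n"
    "--outer--\n"
)


def text_parts(raw_message: str):
    """Yield the decoded Unicode text of every text/* MIME part."""
    msg = message_from_string(raw_message)
    for part in msg.walk():  # descends into multipart and message/rfc822
        if part.get_content_maintype() != "text":
            continue
        payload = part.get_payload(decode=True)  # undo the transfer encoding
        if payload is None:
            continue
        charset = part.get_content_charset() or "us-ascii"
        try:
            yield payload.decode(charset)
        except (LookupError, UnicodeDecodeError):
            continue  # assumption: omit parts that cannot be converted


def body_text_contains(raw_message: str, keys) -> bool:
    """True if any key occurs in any single text part (never across parts)."""
    return any(
        key.translate(_FOLD) in text.translate(_FOLD)
        for text in text_parts(raw_message)
        for key in keys
    )
```

Each part is searched on its own, so a key can never match across a MIME part boundary. A more capable implementation could add mark-up stripping or conversion of other media types before the comparison, as the draft allows.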
272       Example:
273           require ["body", "fileinto"];

275           # Save messages mentioning the project schedule in the
276           # project/schedule folder.
277           if body :text :contains "project schedule" {
278               fileinto "project/schedule";
279           }

281   5. Interaction with Other Sieve Extensions

283       Any extension that extends the grammar for the COMPARATOR or
284       MATCH-TYPE nonterminals will also affect the implementation of
285       "body".

287       The [REGEX] extension can place a considerable load on a system
288       when applied to whole bodies of messages, especially when
289       implemented naively or used maliciously.

291       Regular and wildcard expressions used with "body" are exempt
292       from the side effects described in [VARIABLES]. That is, they
293       MUST NOT set match variables (${1}, ${2}...) to the input values
294       corresponding to wild card sequences in the matched pattern.
295       However, if the extension is present, variable references in the
296       key strings or content type strings are evaluated as described
297       in the draft.

299   6. IANA Considerations

301       The following template specifies the IANA registration of the Sieve
302       extension specified in this document:

304       To: iana@iana.org
305       Subject: Registration of new Sieve extension

307       Capability name: body
308       Capability keyword: body
309       Capability arguments: N/A
310       Standards Track/IESG-approved experimental RFC number: this RFC
311       Person and email address to contact for further information:

313           Jutta Degener
314           jutta@pobox.com

316       This information should be added to the list of sieve extensions
317       given on http://www.iana.org/assignments/sieve-extensions.

319   7. Security Considerations

321       The system MUST be sized and restricted in such a manner that
322       even malicious use of body matching does not deny service to
323       other users of the host system.

325       Filters relying on string matches in the raw body of an email
326       message may be more general than intended. Text matches are no
327       replacement for a spam, virus, or other security related
328       filtering system.

330   8. Acknowledgments

332       This document has been revised in part based on comments and
333       discussions that took place on and off the SIEVE mailing list.
334       Thanks to Cyrus Daboo, Ned Freed, Bob Johannessen, Simon Josefsson,
335       Mark E. Mallett, Chris Markle, Alexey Melnikov, Ken Murchison,
336       Greg Shapiro, Tim Showalter, Nigel Swinson, and Dowson Tong for
337       reviews and suggestions.

339   9. Authors' Addresses

341       Jutta Degener
342       5245 College Ave, Suite #127
343       Oakland, CA 94618

345       Email: jutta@pobox.com

347       Philip Guenther
348       Sendmail, Inc.
349       6425 Christie Ave, 4th Floor
350       Emeryville, CA 94608

352       Email: guenther@sendmail.com

354   10. Discussion

356       This section will be removed when this document leaves the
357       Internet-Draft stage.

359       This draft is intended as an extension to the Sieve mail filtering
360       language. Sieve extensions are discussed on the MTA Filters mailing
361       list at . Subscription requests can
362       be sent to (send an email
363       message with the word "subscribe" in the body).

365       More information on the mailing list along with a WWW archive of
366       back messages is available at .

368   10.1 Changes from draft-ietf-sieve-body-01.txt

370       Updated charset conversion requirements to match those in
371       draft-ietf-sieve-3028bis-03.txt for headers.

373   10.2 Changes from draft-ietf-sieve-body-00.txt

375       Updated IPR boilerplate to RFC 3978/3979.

377       Many prose corrections in response to WGLC comments. Of particular
378       note:
379       - made clear that :raw treats MIME boundaries and headers as
380         text to be matched against
381       - corrected description in comment of :raw example
382       - clarified the interpretation of invalid content-types in
383         :content
384       - gave precise description of what gets matched when :content
385         is used with message/rfc822 or any multipart type, as well
386         as a comprehensive example
387       - include an example of :text
388       - tightened wording of interaction with [VARIABLES]
389       - added informative reference to [REGEX]

391   10.3 Changes from draft-degener-sieve-body-04.txt

393       Renamed to draft-ietf-sieve-body-00.txt; tweaked the title and
394       abstract.

396       Added Philip Guenther as co-author.

398       Split references into normative and informative. Updated [UTF-8]
399       and [VARIABLES] references.

401       Updated IPR boilerplate.

403   10.4 Changes from draft-degener-sieve-body-03.txt

405       Made "body" exempt from variable-setting side effects in the
406       presence of the "variables" extension and wild cards. It's too
407       hard to implement.

409       Removed :binary. It's uglier and less useful than it needs to be
410       to bother.

412       Added IANA section.

414   Appendices

416   Appendix A. Normative References

418       [KEYWORDS]  Bradner, S., "Key words for use in RFCs to Indicate
419                   Requirement Levels", RFC 2119, March 1997.

421       [MIME]      Freed, N. and N. Borenstein, "Multipurpose Internet Mail
422                   Extensions (MIME) Part One: Format of Internet Message
423                   Bodies", RFC 2045, November 1996.

425       [SIEVE]     Showalter, T., "Sieve: A Mail Filtering Language",
426                   RFC 3028, January 2001.

428       [UTF-8]     Yergeau, F., "UTF-8, a transformation format of ISO
429                   10646", RFC 3629, November 2003.

431   Appendix B. Informative References

433       [REGEX]     Murchison, K., "Sieve Email Filtering -- Regular
434                   Expression Extension",
435                   draft-murchison-sieve-regex-08.txt, October 2004

437       [VARIABLES] Homme, K.T., "Sieve Mail Filtering Language: Variables
438                   Extension", draft-ietf-sieve-variables-03.txt, April 2005

440   Copyright Statement

442       Copyright (C) The Internet Society (2005). This document is
443       subject to the rights, licenses and restrictions contained in
444       BCP 78, and except as set forth therein, the authors retain all
445       their rights.

447       This document and the information contained herein are provided on an
448       "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE
449       REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE
450       INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR
451       IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
452       THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
453       WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

455   Intellectual Property

457       The IETF takes no position regarding the validity or scope of any
458       Intellectual Property Rights or other rights that might be claimed to
459       pertain to the implementation or use of the technology described in
460       this document or the extent to which any license under such rights
461       might or might not be available; nor does it represent that it has
462       made any independent effort to identify any such rights. Information
463       on the procedures with respect to rights in RFC documents can be
464       found in BCP 78 and BCP 79.

466       Copies of IPR disclosures made to the IETF Secretariat and any
467       assurances of licenses to be made available, or the result of an
468       attempt made to obtain a general license or permission for the use
469       of such proprietary rights by implementers or users of this
470       specification can be obtained from the IETF on-line IPR repository
471       at http://www.ietf.org/ipr.

473       The IETF invites any interested party to bring to its attention
474       any copyrights, patents or patent applications, or other
475       proprietary rights that may cover technology that may be required
476       to implement this standard. Please address the information to the
477       IETF at ietf-ipr@ietf.org.

479   Acknowledgement

481       Funding for the RFC Editor function is currently provided by
482       the Internet Society.