Internet Draft                                          Philip Guenther
Expires: November 2005                                   Sendmail, Inc.
                                                               May 2005

                 Sieve Email Filtering: Body Extension
                     draft-ietf-sieve-body-01.txt

Status of this memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other
   documents at any time. It is inappropriate to use
   Internet-Drafts as reference material or to cite them other than
   as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/1id-abstracts.html

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

Copyright Notice

   Copyright (C) The Internet Society (2005).

Abstract

   This document defines a new primitive for the "Sieve" email
   filtering language that tests for the occurrence of one or more
   strings in the body of an email message.

1. Introduction

   The proposed "body" test checks for the occurrence of one
   or more strings in the body of an email message.
   Such a test was initially discussed for the [SIEVE] base
   document, but was subsequently removed because it was
   thought to be too costly to implement.

   Nevertheless, several server vendors have implemented
   some form of the "body" test.

   This document reintroduces the "body" test as an extension,
   and specifies its syntax and semantics.

2. Conventions used

   Conventions for notations are as in [SIEVE] section 1.1, including
   use of [KEYWORDS] and the "Syntax:" label for the definition of
   action and tagged arguments syntax.

   The capability string associated with the extension defined in
   this document is "body".

3. Test body

   Syntax: "body" [COMPARATOR] [MATCH-TYPE] [BODY-TRANSFORM]
           <key-list: string-list>

   The body test matches text in the body of an email message, that
   is, anything following the first empty line after the header.
   (The empty line itself, if present, is not considered to be part
   of the body.)

   The COMPARATOR and MATCH-TYPE keyword parameters are defined
   in [SIEVE]. The BODY-TRANSFORM is a keyword parameter
   discussed in section 4, below.

   If a message consists of a header only, not followed by an empty
   line, all "body" tests return false, including that for an empty
   string.

   If a message consists of a header followed only by an empty
   line with no body lines following it, the message is considered
   to have an empty string as a body.

4. Body Transform

   Prior to matching text in a message body, "transformations"
   can be applied that filter and decode certain parts of the body.
   These transformations are selected by a "BODY-TRANSFORM"
   keyword parameter.

   Syntax: ":raw"
         / ":content" <content-types: string-list>
         / ":text"

   The default transformation is :text.

4.1 Body Transform ":raw"

   The ":raw" transform is intended to match against the undecoded
   body of a message.

   If the specified body-transform is ":raw", the [MIME] structure
   of the body is irrelevant. The implementation MUST NOT remove
   any transfer encoding from the message, MUST NOT refuse to filter
   messages with syntactic errors (unless the environment it is
   part of rejects them outright), and MUST treat multipart boundaries
   or the MIME headers of enclosed body parts as part of the text
   being matched against instead of MIME structures to interpret.

   Example:

       require ["body", "reject"];

       # This will match a message containing the literal text
       # "MAKE MONEY FAST" in body parts (ignoring any
       # content-transfer-encodings) or MIME headers other than
       # the outermost RFC 2822 header.

       if body :raw :contains "MAKE MONEY FAST" {
           # "reject" requires a reason string in [SIEVE].
           reject "Message refused by body filter";
       }

4.2 Body Transform ":content"

   If the body transform is ":content", only MIME parts that have
   the specified content-types are selected for matching.

   If an individual content type begins or ends with a '/' (slash)
   or contains multiple slashes, it matches no content types.
   Otherwise, if it contains a slash, then it specifies a full
   <type>/<subtype> pair, and matches only that specific content
   type. If it is the empty string, all MIME content types are
   matched. Otherwise, it specifies a <type> only, and any subtype
   of that type matches it.

   The search for MIME parts matching the :content specification
   is recursive and automatically descends into multipart and
   message/rfc822 MIME parts. All MIME parts with matching types
   are searched for the key strings. The test returns true if any
   combination of searched MIME part and key-list argument match.

   If the :content specification matches a multipart MIME part,
   only the prologue and epilogue sections of the part will be
   searched for the key strings; the contents of nested parts are
   only searched if their respective types match the :content
   specification.
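
   The content-type selection rules given earlier in this section can
   be sketched in Python as follows. This is an illustrative sketch
   only, not part of the extension: the function name and its
   parameters are hypothetical, and the case-insensitive comparison
   is an assumption based on MIME type names being case-insensitive
   in [MIME].

```python
def content_type_matches(spec: str, mime_type: str) -> bool:
    """Return True if one :content string selects the given MIME type.

    `spec` is a single string from the :content list and `mime_type`
    is a part's "type/subtype" value (both names are illustrative).
    """
    # Assumption: compare case-insensitively, as MIME type names are.
    spec = spec.lower()
    mime_type = mime_type.lower()

    if spec == "":
        return True  # the empty string matches every content type
    if spec.startswith("/") or spec.endswith("/") or spec.count("/") > 1:
        return False  # malformed specification: matches nothing
    if "/" in spec:
        return spec == mime_type  # full type/subtype pair
    # Bare type: any subtype of that type matches.
    return mime_type.split("/", 1)[0] == spec
```

   For instance, under these rules "text" selects both text/plain and
   text/html, while "/html" and "text/" select nothing.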

   If the :content specification matches a message/rfc822 MIME part,
   only the header of the nested message will be searched for the
   key strings; the contents of the nested message body parts are
   only searched if its content-type matches the :content specification.

   (Matches against container types with an empty match string can
   be useful as tests for the existence of such parts.)

   Example:

       From: Whomever
       To: Someone
       Date: Whenever
       Subject: whatever
       Content-Type: multipart/mixed; boundary=outer

     & This is a multi-part message in MIME format.
     &
       --outer
       Content-Type: multipart/alternative; boundary=inner

     & This is a nested multi-part message in MIME format.
     &
       --inner
       Content-Type: text/plain; charset="us-ascii"

     $ Hello
     $
       --inner
       Content-Type: text/html; charset="us-ascii"

     % Hello
     %
       --inner--
     &
     & This is the end of the inner MIME multipart.
     &
       --outer
       Content-Type: message/rfc822

     ! From: Someone Else
     ! Subject: hello request

     $ Please say Hello
     $
       --outer--
     &
     & This is the end of the outer MIME multipart.

   In the above example, the '&', '$', '%' and '!' characters at the
   start of a line are used to illustrate what portions of the
   example message are used in tests:

   - the lines starting with '&' are the ones that are tested when
     a 'body :content "multipart" :contains "MIME"'
     test is executed.

   - the lines starting with '$' are the ones that are tested when
     a 'body :content "text/plain" :contains "Hello"' test is
     executed.

   - the lines starting with '%' are the ones that are tested when
     a 'body :content "text/html" :contains "Hello"' test is executed.

   - the lines starting with '$' or '%' are the ones that are tested
     when a 'body :content "text" :contains "Hello"' test is executed.

   - the lines starting with '!' are the ones that are tested when
     a 'body :content "message/rfc822" :contains "Hello"' test is
     executed.

   MIME parts encoded in "quoted-printable" or "base64" content
   transfer encodings MUST be decoded prior to the match. MIME
   parts in "7bit", "8bit", or "binary" content transfer encodings
   MUST be matched as they are. MIME parts in content transfer
   encodings other than those MAY be decoded, omitted from the test,
   or processed as raw data.

   MIME parts identified as using charsets other than UTF-8 as
   defined in [UTF-8] SHOULD be converted to UTF-8 prior to the match.
   A conversion from US-ASCII to UTF-8 MUST be supported.
   If an implementation does not support conversion of a given
   charset to UTF-8, it MAY compare against the US-ASCII subset
   of the transfer-decoded character data instead. Characters from
   documents tagged with charsets that the local implementation
   cannot convert to UTF-8 and text from mistagged documents MAY
   be omitted or processed according to local conventions.

   Search expressions MUST NOT match across MIME part boundaries.
   MIME headers of the containing text MUST NOT be included in the
   data.

   Example:

       require ["body", "fileinto"];

       # Save any message with any text MIME part that contains the
       # words "missile" or "coordinates" in the "secrets" folder.

       if body :content "text" :contains ["missile", "coordinates"] {
           fileinto "secrets";
       }

       # Save any message with an audio/mp3 MIME part in
       # the "jukebox" folder.

       if body :content "audio/mp3" :contains "" {
           fileinto "jukebox";
       }

4.3 Body Transform ":text"

   The ":text" body transform matches against the results of
   an implementation's best effort at extracting UTF-8 encoded
   text from a message.

   In simple implementations, :text MAY be treated the same
   as :content "text".
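
   A simple implementation of this kind might look like the following
   Python sketch, which treats :text as :content "text" with a
   :contains match. It is an illustration under stated assumptions,
   not a reference implementation: the function names are
   hypothetical, Python's standard "email" package stands in for the
   MIME parser, and the default "i;ascii-casemap" comparator from
   [SIEVE] is approximated by folding ASCII letters only. The
   header-only case from section 3 is not handled.

```python
import email
import string
from email.message import Message

# Fold only ASCII letters, approximating the "i;ascii-casemap" comparator.
_ASCII_FOLD = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def text_part_bodies(msg: Message) -> list[str]:
    """Return the decoded content of every text/* part, one string each."""
    bodies = []
    # walk() descends into multipart and message/rfc822 parts.
    for part in msg.walk():
        if part.get_content_maintype() != "text":
            continue
        raw = part.get_payload(decode=True)  # undoes base64 / quoted-printable
        if raw is None:
            continue
        charset = part.get_content_charset() or "us-ascii"
        try:
            bodies.append(raw.decode(charset))
        except (LookupError, UnicodeDecodeError):
            # Unknown or mistagged charset: keep only the US-ASCII subset.
            bodies.append(raw.decode("ascii", errors="ignore"))
    return bodies


def body_text_contains(raw_message: str, keys: list[str]) -> bool:
    """Hypothetical equivalent of 'body :text :contains <keys>'."""
    msg = email.message_from_string(raw_message)
    folded_keys = [key.translate(_ASCII_FOLD) for key in keys]
    # Each part is searched on its own, so no match spans a part boundary.
    return any(
        key in body.translate(_ASCII_FOLD)
        for body in text_part_bodies(msg)
        for key in folded_keys
    )
```

   Because MIME headers and multipart prologue and epilogue text are
   never added to the searched strings, this sketch also keeps them
   out of the matched data.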

   Sophisticated implementations MAY strip mark-up from the text
   prior to matching, and MAY convert media types other than text
   to text prior to matching.

   (For example, they may be able to convert proprietary text
   editor formats to text or apply optical character recognition
   algorithms to image data.)

   Example:

       require ["body", "fileinto"];

       # Save messages mentioning the project schedule in the
       # project/schedule folder.
       if body :text :contains "project schedule" {
           fileinto "project/schedule";
       }

5. Interaction with Other Sieve Extensions

   Any extension that extends the grammar for the COMPARATOR or
   MATCH-TYPE nonterminals will also affect the implementation of
   "body".

   The [REGEX] extension can place a considerable load on a system
   when applied to whole bodies of messages, especially when
   implemented naively or used maliciously.

   Regular and wildcard expressions used with "body" are exempt
   from the side effects described in [VARIABLES]. That is, they
   MUST NOT set match variables (${1}, ${2}...) to the input values
   corresponding to wildcard sequences in the matched pattern.
   However, if that extension is present, variable references in the
   key strings or content type strings are evaluated as described
   in [VARIABLES].

6. IANA Considerations

   The following template specifies the IANA registration of the Sieve
   extension specified in this document:

   To: iana@iana.org
   Subject: Registration of new Sieve extension

   Capability name: body
   Capability keyword: body
   Capability arguments: N/A
   Standards Track/IESG-approved experimental RFC number: this RFC
   Person and email address to contact for further information:

       Jutta Degener
       jutta@pobox.com

   This information should be added to the list of sieve extensions
   given on http://www.iana.org/assignments/sieve-extensions.

7. Security Considerations

   The system MUST be sized and restricted in such a manner that
   even malicious use of body matching does not deny service to
   other users of the host system.

   Filters relying on string matches in the raw body of an email
   message may be more general than intended. Text matches are no
   replacement for a spam, virus, or other security-related
   filtering system.

8. Acknowledgments

   This document has been revised in part based on comments and
   discussions that took place on and off the SIEVE mailing list.
   Thanks to Cyrus Daboo, Ned Freed, Bob Johannessen, Simon Josefsson,
   Mark E. Mallett, Chris Markle, Alexey Melnikov, Ken Murchison,
   Greg Shapiro, Tim Showalter, Nigel Swinson, and Dowson Tong for
   reviews and suggestions.

9. Authors' Addresses

   Jutta Degener
   5245 College Ave, Suite #127
   Oakland, CA 94618

   Email: jutta@pobox.com

   Philip Guenther
   Sendmail, Inc.
   6425 Christie Ave, 4th Floor
   Emeryville, CA 94608

   Email: guenther@sendmail.com

10. Discussion

   This section will be removed when this document leaves the
   Internet-Draft stage.

   This draft is intended as an extension to the Sieve mail filtering
   language. Sieve extensions are discussed on the MTA Filters mailing
   list at . Subscription requests can
   be sent to (send an email
   message with the word "subscribe" in the body).

   More information on the mailing list along with a WWW archive of
   back messages is available at .

10.1 Changes from draft-ietf-sieve-body-00.txt

   Updated IPR boilerplate to RFC 3978/3979.

   Many prose corrections in response to WGLC comments. Of particular
   note:
   - made clear that :raw treats MIME boundaries and headers as
     text to be matched against
   - corrected description in comment of :raw example
   - clarified the interpretation of invalid content-types in
     :content
   - gave precise description of what gets matched when :content
     is used with message/rfc822 or any multipart type, as well
     as a comprehensive example
   - included an example of :text
   - tightened wording of interaction with [VARIABLES]
   - added informative reference to [REGEX]

10.2 Changes from draft-degener-sieve-body-04.txt

   Renamed to draft-ietf-sieve-body-00.txt; tweaked the title and
   abstract.

   Added Philip Guenther as co-author.

   Split references into normative and informative. Updated [UTF-8]
   and [VARIABLES] references.

   Updated IPR boilerplate.

10.3 Changes from draft-degener-sieve-body-03.txt

   Made "body" exempt from variable-setting side effects in the
   presence of the "variables" extension and wildcards. It's too hard
   to implement.

   Removed :binary. It's uglier and less useful than it needs to be
   to bother.

   Added IANA section.

Appendices

Appendix A. Normative References

   [KEYWORDS]  Bradner, S., "Key words for use in RFCs to Indicate
               Requirement Levels", RFC 2119, March 1997.

   [MIME]      Freed, N. and N. Borenstein, "Multipurpose Internet Mail
               Extensions (MIME) Part One: Format of Internet Message
               Bodies", RFC 2045, November 1996.

   [SIEVE]     Showalter, T., "Sieve: A Mail Filtering Language",
               RFC 3028, January 2001.

   [UTF-8]     Yergeau, F., "UTF-8, a transformation format of
               ISO 10646", RFC 3629, November 2003.

Appendix B. Informative References

   [REGEX]     Murchison, K., "Sieve Email Filtering -- Regular
               Expression Extension",
               draft-murchison-sieve-regex-08.txt, October 2004.

   [VARIABLES] Homme, K.T., "Sieve Mail Filtering Language: Variables
               Extension", draft-ietf-sieve-variables-03.txt,
               April 2005.

Copyright Statement

   Copyright (C) The Internet Society (2005). This document is
   subject to the rights, licenses and restrictions contained in
   BCP 78, and except as set forth therein, the authors retain all
   their rights.

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE
   REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE
   INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR
   IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Intellectual Property

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights. Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use
   of such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository
   at http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention
   any copyrights, patents or patent applications, or other
   proprietary rights that may cover technology that may be required
   to implement this standard. Please address the information to the
   IETF at ietf-ipr@ietf.org.

Acknowledgement

   Funding for the RFC Editor function is currently provided by
   the Internet Society.