Network Working Group                                      Jutta Degener
Internet Draft                                           Philip Guenther
Intended status: Standards Track                          Sendmail, Inc.
Expires: March 2008                                       September 2008
Updates: RFC-ietf-sieve-variables

                 Sieve Email Filtering: Body Extension
                      draft-ietf-sieve-body-08.txt

Status of this memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other
   documents at any time.  It is inappropriate to use
   Internet-Drafts as reference material or to cite them other than
   as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/1id-abstracts.html

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

Copyright Notice

   Copyright (C) The IETF Trust (2008).

Abstract

   This document defines a new command for the "Sieve" email
   filtering language that tests for the occurrence of one or more
   strings in the body of an email message.

1. Introduction

   The "body" test checks for the occurrence of one
   or more strings in the body of an email message.
   Such a test was initially discussed for the [SIEVE] base
   document, but was subsequently removed because it was
   thought to be too costly to implement.

   Nevertheless, several server vendors have implemented
   some form of the "body" test.

   This document reintroduces the "body" test as an extension,
   and specifies its syntax and semantics.

2. Conventions used

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [KEYWORDS].

   Conventions for notations are as in [SIEVE] section 1.1, including
   use of the "Usage:" label for the definition of text and tagged
   arguments syntax.

   The rules for interpreting the grammar are defined in [SIEVE]
   and inherited by this specification.  In particular, readers of
   this document are reminded that according to [SIEVE] sections
   2.6.2 and 2.6.3, optional arguments such as COMPARATOR and
   MATCH-TYPE can appear in any order.

   The capability string associated with the extension defined in
   this document is "body".

3. Test body

   Usage: "body" [COMPARATOR] [MATCH-TYPE] [BODY-TRANSFORM]
                 <key-list: string-list>

   The body test matches content in the body of an email message,
   that is, anything following the first empty line after the header.
   (The empty line itself, if present, is not considered to be part
   of the body.)

   The COMPARATOR and MATCH-TYPE keyword parameters are defined in
   [SIEVE].  As specified in section 2.7.3 of [SIEVE], the default
   COMPARATOR is "i;ascii-casemap" and the default MATCH-TYPE is
   ":is".

   The BODY-TRANSFORM is a keyword parameter that governs how the set
   of strings to be matched against is extracted from the body of
   the message.  If a message consists of a header only, not followed
   by an empty line, then that set is empty and all "body" tests
   return false, including those that test for an empty string.
   (This is similar to how the "header" test always fails when the
   named header fields aren't present.)  Otherwise, the transform
   must be followed as defined below in section 4.

   Note that the transforms defined here do *not* match against
   each line of the message independently, so the strings will
   usually contain CRLFs.  How these can be matched is governed by
   the comparator and match-type.  For example, with the default
   comparator of "i;ascii-casemap", they can be included literally
   in the key string, or be matched with the "*" or "?" wildcards of
   the :matches match-type, or be skipped with :contains.

4. Body Transform

   Prior to matching content in a message body, "transformations"
   can be applied that filter and decode certain parts of the body.
   These transformations are selected by a "BODY-TRANSFORM"
   keyword parameter.

   Usage: ":raw"
        / ":content" <content-types: string-list>
        / ":text"

   The default transformation is :text.

4.1 Body Transform ":raw"

   The ":raw" transform matches against the entire, undecoded body
   of a message as a single item.

   If the specified body-transform is ":raw", the [MIME] structure
   of the body is irrelevant.
   The implementation MUST NOT remove any transfer encoding from the
   message, MUST NOT refuse to filter messages with syntactic errors
   (unless the environment it is part of rejects them outright), and
   MUST treat multipart boundaries or the MIME headers of enclosed
   body parts as part of the content being matched against instead
   of MIME structures to interpret.

   Example:

        require "body";

        # This will match a message containing the literal text
        # "MAKE MONEY FAST" in body parts (ignoring any
        # content-transfer-encodings) or MIME headers other than
        # the outermost RFC 2822 header.

        if body :raw :contains "MAKE MONEY FAST" {
                discard;
        }

4.2 Body Transform ":content"

   If the body transform is ":content", the MIME parts that have
   the specified content-types are matched against independently,
   with each part treated in its entirety as a single string.

   If an individual content type begins or ends with a '/' (slash)
   or contains multiple slashes, it matches no content types.
   Otherwise, if it contains a slash, then it specifies a full
   <type>/<subtype> pair, and matches only that specific content
   type.  If it is the empty string, all MIME content types are
   matched.  Otherwise, it specifies a <type> only, and any subtype
   of that type matches it.

   The search for MIME parts matching the :content specification
   is recursive and automatically descends into multipart and
   message/rfc822 MIME parts.  All MIME parts with matching types
   are searched for the key strings.  The test returns true if any
   combination of searched MIME part and key-list argument match.

   If the :content specification matches a multipart MIME part,
   only the prologue and epilogue sections of the part will be
   searched for the key strings; the contents of nested parts are
   only searched if their respective types match the :content
   specification.

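   As an informal sketch of the content-type forms described in this
   section, the script below shows a type-only specification, a full
   type/subtype pair, the empty string, and a malformed specification.
   The key strings and folder names are hypothetical and chosen only
   for illustration.

   ```sieve
   require ["body", "fileinto"];

   # Type only: parts of any text subtype (text/plain, text/html,
   # and so on) are searched.
   if body :content "text" :contains "invoice" {
       fileinto "billing";
   }

   # Full type/subtype pair: only text/html parts are searched.
   if body :content "text/html" :contains "unsubscribe" {
       fileinto "bulk";
   }

   # Empty string: parts of every MIME content type are searched.
   if body :content "" :contains "confidential" {
       fileinto "restricted";
   }

   # Leading slash: the specification matches no content type,
   # so this test is always false.
   if body :content "/plain" :contains "" {
       fileinto "never";
   }
   ```

   Because each test is a separate "if", a single message may be
   filed into more than one of these folders.
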
   If the :content specification matches a message/rfc822 MIME part,
   only the header of the nested message will be searched for the
   key strings; the contents of the nested message body parts are
   only searched if their content types match the :content
   specification.

   (Matches against container types with an empty match string can
   be useful as tests for the existence of such parts.)

   Example:
        From: Whomever
        To: Someone
        Date: Whenever
        Subject: whatever
        Content-Type: multipart/mixed; boundary=outer

      & This is a multi-part message in MIME format.
      &
        --outer
        Content-Type: multipart/alternative; boundary=inner

      & This is a nested multi-part message in MIME format.
      &
        --inner
        Content-Type: text/plain; charset="us-ascii"

      $ Hello
      $
        --inner
        Content-Type: text/html; charset="us-ascii"

      % Hello
      %
        --inner--
      &
      & This is the end of the inner MIME multipart.
      &
        --outer
        Content-Type: message/rfc822

      ! From: Someone Else
      ! Subject: hello request

      $ Please say Hello
      $
        --outer--
      &
      & This is the end of the outer MIME multipart.

   In the above example, the '&', '$', '%', and '!' characters at
   the start of a line are used to illustrate what portions of the
   example message are used in tests:

   - the lines starting with '&' are the ones that are tested when
     a 'body :content "multipart" :contains "MIME"'
     test is executed.

   - the lines starting with '$' are the ones that are tested when
     a 'body :content "text/plain" :contains "Hello"' test is
     executed.

   - the lines starting with '%' are the ones that are tested when
     a 'body :content "text/html" :contains "Hello"' test is executed.

   - the lines starting with '$' or '%' are the ones that are tested
     when a 'body :content "text" :contains "Hello"' test is executed.

   - the lines starting with '!'
     are the ones that are tested when
     a 'body :content "message/rfc822" :contains "Hello"' test is
     executed.

   Comparisons are performed on octets.  Implementations decode
   the content-transfer-encoding and convert text to [UTF-8] as
   input to the comparator.  MIME parts that cannot be decoded and
   converted MAY be treated as plain US-ASCII, omitted, or processed
   according to local conventions.  A NUL octet (character zero)
   SHOULD NOT cause early termination of the content being compared
   against.  Implementations MUST support the "quoted-printable",
   "base64", "7bit", "8bit", and "binary" content transfer encodings.
   Implementations MUST be capable of converting to UTF-8 the
   US-ASCII, ISO-8859-1, and the US-ASCII subset of
   ISO-8859-* character sets.

   Each matched part is matched against independently: search
   expressions MUST NOT match across MIME part boundaries.
   MIME headers of the containing part MUST NOT be included in the
   data.

   Example:
        require ["body", "fileinto"];

        # Save any message with any text MIME part that contains the
        # words "missile" or "coordinates" in the "secrets" folder.

        if body :content "text" :contains ["missile", "coordinates"] {
                fileinto "secrets";
        }

        # Save any message with an audio/mp3 MIME part in
        # the "jukebox" folder.

        if body :content "audio/mp3" :contains "" {
                fileinto "jukebox";
        }

4.3 Body Transform ":text"

   The ":text" body transform matches against the results of
   an implementation's best effort at extracting UTF-8 encoded
   text from a message.

   It is unspecified whether this transformation results in a single
   string or multiple strings being matched against.  All the text
   extracted from a given non-container MIME part MUST be in the
   same string.

   In simple implementations, :text MAY be treated the same
   as :content "text".

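   As an illustrative sketch, the two tests below would search the
   same strings on an implementation that treats :text as
   :content "text".  The first test omits the transform and so relies
   on :text being the default.  The key string and folder name are
   hypothetical.

   ```sieve
   require ["body", "fileinto"];

   # No transform given, so the default (:text) applies.
   if body :contains "quarterly report" {
       fileinto "reports";
   }

   # On an implementation that treats :text as :content "text",
   # this test searches the same strings as the one above.
   if body :content "text" :contains "quarterly report" {
       fileinto "reports";
   }
   ```

   On other implementations the two tests may differ, since the
   result of :text is only a best effort at text extraction.
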
   Sophisticated implementations MAY strip mark-up from the text
   prior to matching, and MAY convert media types other than text
   to text prior to matching.

   (For example, they may be able to convert proprietary text
   editor formats to text or apply optical character recognition
   algorithms to image data.)

   Example:
        require ["body", "fileinto"];

        # Save messages mentioning the project schedule in the
        # project/schedule folder.
        if body :text :contains "project schedule" {
                fileinto "project/schedule";
        }

5. Interaction with Other Sieve Extensions

   Any extension that extends the grammar for the COMPARATOR or
   MATCH-TYPE nonterminals will also affect the implementation of
   "body".

   Wildcard expressions used with "body" are exempt from the side
   effects described in [VARIABLES].  That is, they MUST NOT set
   match variables (${1}, ${2}...) to the input values corresponding
   to wild card sequences in the matched pattern.  However, if the
   "variables" extension is present, variable references in the key
   strings or content type strings are evaluated as described in
   [VARIABLES].

6. IANA Considerations

   The following template specifies the IANA registration of the Sieve
   extension specified in this document:

   To: iana@iana.org
   Subject: Registration of new Sieve extension

   Capability name: body
   Description:     adds the 'body' test for matching against the
                    body of the message being processed
   RFC number:      this RFC
   Contact Address: Jutta Degener <jutta@pobox.com>

   This information should be added to the list of sieve extensions
   given on http://www.iana.org/assignments/sieve-extensions.

7. Security Considerations

   The system MUST be sized and restricted in such a manner that
   even malicious use of body matching does not deny service to
   other users of the host system.

   Filters relying on string matches in the raw body of an email
   message may be more general than intended.
   Text matches are no
   replacement for a spam, virus, or other security related
   filtering system.

8. Acknowledgments

   This document has been revised in part based on comments and
   discussions that took place on and off the SIEVE mailing list.
   Thanks to Cyrus Daboo, Ned Freed, Bob Johannessen, Simon Josefsson,
   Mark E. Mallett, Chris Markle, Alexey Melnikov, Ken Murchison,
   Greg Shapiro, Tim Showalter, Nigel Swinson, and Dowson Tong for
   reviews and suggestions.

9. Authors' Addresses

   Jutta Degener
   5245 College Ave, Suite #127
   Oakland, CA 94618

   Email: jutta@pobox.com

   Philip Guenther
   Sendmail, Inc.
   6425 Christie Ave, 4th Floor
   Emeryville, CA 94608

   Email: guenther@sendmail.com

10. Discussion

   This section will be removed when this document leaves the
   Internet-Draft stage.

   This draft is intended as an extension to the Sieve mail filtering
   language.  Sieve extensions are discussed on the MTA Filters mailing
   list at .  Subscription requests can
   be sent to (send an email
   message with the word "subscribe" in the body).

   More information on the mailing list along with a WWW archive of
   back messages is available at .

10.1 Changes from draft-ietf-sieve-body-07.txt

   Clarify how transforms generate one or more strings to match against.

   Reiterate the default COMPARATOR and MATCH-TYPE from the base spec.

   [SIEVE] and [VARIABLES] have been published.

10.2 Changes from draft-ietf-sieve-body-06.txt

   Changed "matched text" to "matched content".  Drop the word
   "proposed".

10.3 Changes from draft-ietf-sieve-body-05.txt

   Updated boilerplate to match RFC 4748.

   Added "Intended-Status: Standards Track" and
   "Updates: draft-ietf-sieve-variables-08"

   Change the references from appendices to sections.
   Update [SIEVE] reference.

10.4 Changes from draft-ietf-sieve-body-04.txt

   Changed 'reject' to 'discard' in the example.

   Removed reference to regex draft.

   Update copyright boilerplate.

10.5 Changes from draft-ietf-sieve-body-03.txt

   Update IANA registration to match 3028bis.

   Added direct boilerplate for [KEYWORDS].

10.6 Changes from draft-ietf-sieve-body-02.txt

   Updated charset conversion to match draft-ietf-sieve-3028bis-06.txt.

   Change "Syntax:" to "Usage:".

   Updated references.

10.7 Changes from draft-ietf-sieve-body-01.txt

   Updated charset conversion requirements to match those in
   draft-ietf-sieve-3028bis-03.txt for headers.

10.8 Changes from draft-ietf-sieve-body-00.txt

   Updated IPR boilerplate to RFC 3978/3979.

   Many prose corrections in response to WGLC comments.  Of particular
   note:
   - made clear that :raw treats MIME boundaries and headers as
     text to be matched against
   - corrected description in comment of :raw example
   - clarified the interpretation of invalid content-types in
     :content
   - gave precise description of what gets matched when :content
     is used with message/rfc822 or any multipart type, as well
     as a comprehensive example
   - include an example of :text
   - tightened wording of interaction with [VARIABLES]
   - added informative reference to [REGEX]

10.9 Changes from draft-degener-sieve-body-04.txt

   Renamed to draft-ietf-sieve-body-00.txt; tweaked the title and
   abstract.

   Added Philip Guenther as co-author.

   Split references into normative and informative.  Updated [UTF-8]
   and [VARIABLES] references.

   Updated IPR boilerplate.

10.10 Changes from draft-degener-sieve-body-03.txt

   Made "body" exempt from variable-setting side effects in the
   presence of the "variables" extension and wild cards.  It's too
   hard to implement.

   Removed :binary.  It's uglier and less useful than it needs to be
   to bother.

   Added IANA section.

11. Normative References

   [KEYWORDS]  Bradner, S., "Key words for use in RFCs to Indicate
               Requirement Levels", BCP 14, RFC 2119, March 1997.

   [MIME]      Freed, N. and N. Borenstein, "Multipurpose Internet Mail
               Extensions (MIME) Part One: Format of Internet Message
               Bodies", RFC 2045, November 1996.

   [SIEVE]     Guenther, P. and T. Showalter, "Sieve: An Email
               Filtering Language", RFC 5228, January 2008.

   [UTF-8]     Yergeau, F., "UTF-8, a transformation format of ISO
               10646", RFC 3629, November 2003.

12. Informative References

   [VARIABLES] Homme, K., "Sieve Email Filtering: Variables Extension",
               RFC 5229, January 2008.

Full Copyright Statement

   Copyright (C) The IETF Trust (2008).

   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
   THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
   OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Intellectual Property

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.

Acknowledgement

   Funding for the RFC Editor function is currently provided by the IETF
   Administrative Support Activity (IASA).