idnits 2.17.1

draft-ietf-sieve-body-09.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** It looks like you're using RFC 3978 boilerplate.  You should update this
     to the boilerplate described in the IETF Trust License Policy document
     (see https://trustee.ietf.org/license-info), which is required now.

  -- Found old boilerplate from RFC 3978, Section 5.1 on line 15.

  -- Found old boilerplate from RFC 3978, Section 5.5, updated by RFC 4748 on
     line 526.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 537.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 544.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 550.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust Copyright Line does not match the
     current year

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer.
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (September 2008) is 5664 days in the past.  Is this
     intentional?
  -- Found something which looks like a code comment -- if you have code
     sections in the document, please surround them with '<CODE BEGINS>' and
     '<CODE ENDS>' lines.

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Missing Reference: 'COMPARATOR' is mentioned on line 81, but not defined

  == Missing Reference: 'MATCH-TYPE' is mentioned on line 81, but not defined

  == Missing Reference: 'BODY-TRANSFORM' is mentioned on line 81, but not
     defined

  == Missing Reference: 'REGEX' is mentioned on line 465, but not defined

     Summary: 1 error (**), 0 flaws (~~), 5 warnings (==), 8 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

1    Network Working Group                                    Jutta Degener
2    Internet Draft                                         Philip Guenther
3    Intended status: Standards Track                        Sendmail, Inc.
4    Expires: March 2008                                     September 2008
5    Updates: RFC-ietf-sieve-variables

7                   Sieve Email Filtering: Body Extension
8                       draft-ietf-sieve-body-09.txt

10   Status of this memo

12      By submitting this Internet-Draft, each author represents that any
13      applicable patent or other IPR claims of which he or she is aware
14      have been or will be disclosed, and any of which he or she becomes
15      aware will be disclosed, in accordance with Section 6 of BCP 79.

17      Internet-Drafts are working documents of the Internet Engineering
18      Task Force (IETF), its areas, and its working groups.  Note that
19      other groups may also distribute working documents as
20      Internet-Drafts.

22      Internet-Drafts are draft documents valid for a maximum of six
23      months and may be updated, replaced, or obsoleted by other
24      documents at any time.  It is inappropriate to use Internet-
25      Drafts as reference material or to cite them other than as
26      "work in progress."

28      The list of current Internet-Drafts can be accessed at
29      http://www.ietf.org/1id-abstracts.html

31      The list of Internet-Draft Shadow Directories can be accessed at
32      http://www.ietf.org/shadow.html

34   Copyright Notice

36      Copyright (C) The IETF Trust (2008).

38   Abstract

40      This document defines a new command for the "Sieve" email
41      filtering language that tests for the occurrence of one or more
42      strings in the body of an email message.

44   1.  Introduction

46      The "body" test checks for the occurrence of one
47      or more strings in the body of an email message.
48      Such a test was initially discussed for the [SIEVE] base
49      document, but was subsequently removed because it was
50      thought to be too costly to implement.

52      Nevertheless, several server vendors have implemented
53      some form of the "body" test.

55      This document reintroduces the "body" test as an extension,
56      and specifies its syntax and semantics.

58   2.  Conventions Used in This Document

60      The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
61      "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
62      document are to be interpreted as described in [KEYWORDS].

64      Conventions for notations are as in [SIEVE] section 1.1, including
65      use of the "Usage:" label for the definition of text and tagged
66      arguments syntax.

68      The rules for interpreting the grammar are defined in [SIEVE]
69      and inherited by this specification.  In particular, readers of
70      this document are reminded that according to [SIEVE] sections
71      2.6.2 and 2.6.3, optional arguments such as COMPARATOR and
72      MATCH-TYPE can appear in any order.

74   3.  Capability Identifier

76      The capability string associated with the extension defined in
77      this document is "body".

79   4.  Test body

81      Usage: "body" [COMPARATOR] [MATCH-TYPE] [BODY-TRANSFORM]
82             <key-list: string-list>

84      The body test matches content in the body of an email message,
85      that is, anything following the first empty line after the header.
86      (The empty line itself, if present, is not considered to be part
87      of the body.)
88      The COMPARATOR and MATCH-TYPE keyword parameters are defined in
89      [SIEVE].  As specified in section 2.7.3 of [SIEVE], the default
90      COMPARATOR is "i;ascii-casemap" and the default MATCH-TYPE is
91      ":is".

93      The BODY-TRANSFORM is a keyword parameter that governs how a set
94      of strings to be matched against are extracted from the body of
95      the message.  If a message consists of a header only, not followed
96      by an empty line, then that set is empty and all "body" tests
97      return false, including those that test for an empty string.
98      (This is similar to how the "header" test always fails when the
99      named header fields aren't present.)  Otherwise, the transform
100     must be followed as defined below in section 5.

102     Note that the transforms defined here do *not* match against
103     each line of the message independently, so the strings will
104     usually contain CRLFs.  How these can be matched is governed by
105     the comparator and match-type.  For example, with the default
106     comparator of "i;ascii-casemap", they can be included literally
107     in the key string, or be matched with the "*" or "?" wildcards of
108     the :matches match-type, or be skipped with :contains.

110  5.  Body Transform

112     Prior to matching content in a message body, "transformations"
113     can be applied that filter and decode certain parts of the body.
114     These transformations are selected by a "BODY-TRANSFORM"
115     keyword parameter.

117     Usage: ":raw"
118          / ":content" <content-types: string-list>
119          / ":text"

121     The default transformation is :text.

123  5.1  Body Transform ":raw"

125     The ":raw" transform matches against the entire, undecoded body
126     of a message as a single item.

128     If the specified body-transform is ":raw", the [MIME] structure
129     of the body is irrelevant.
        The implementation MUST NOT remove
130     any transfer encoding from the message, MUST NOT refuse to filter
131     messages with syntactic errors (unless the environment it is
132     part of rejects them outright), and MUST treat multipart boundaries
133     or the MIME headers of enclosed body parts as part of the content
134     being matched against instead of MIME structures to interpret.

136     Example:

138         require "body";

140         # This will match a message containing the literal text
141         # "MAKE MONEY FAST" in body parts (ignoring any
142         # content-transfer-encodings) or MIME headers other than
143         # the outermost RFC 2822 header.

145         if body :raw :contains "MAKE MONEY FAST" {
146             discard;
147         }

149  5.2  Body Transform ":content"

151     If the body transform is ":content", the MIME parts that have
152     the specified content-types are matched against independently,
153     each the entire part as a single string.

155     If an individual content type begins or ends with a '/' (slash)
156     or contains multiple slashes, it matches no content types.
157     Otherwise, if it contains a slash, then it specifies a full
158     <type>/<subtype> pair, and matches only that specific content
159     type.  If it is the empty string, all MIME content types are
160     matched.  Otherwise, it specifies a <type> only, and any subtype
161     of that type matches it.

163     The search for MIME parts matching the :content specification
164     is recursive and automatically descends into multipart and
165     message/rfc822 MIME parts.  All MIME parts with matching types
166     are searched for the key strings.  The test returns true if any
167     combination of searched MIME part and key-list argument match.

169     If the :content specification matches a multipart MIME part,
170     only the prologue and epilogue sections of the part will be
171     searched for the key strings; the contents of nested parts are
172     only searched if their respective types match the :content
173     specification.
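
Editorial illustration, not part of the draft: the content-type matching
rules of section 5.2 (slash handling, empty string, type-only
specifications) can be sketched in Python as below.  The function name
content_type_matches is hypothetical, and the case-insensitive comparison
is an assumption based on MIME media types being case-insensitive; the
text above does not state it explicitly.

```python
def content_type_matches(spec, content_type):
    """Return True if a :content specification selects a MIME content type.

    spec          -- one string from the :content string list
    content_type  -- the "type/subtype" of a MIME part, without parameters
    """
    # Assumption: media types compare case-insensitively.
    spec = spec.lower()
    actual = content_type.lower()

    # A leading or trailing slash, or more than one slash, matches nothing.
    if spec.startswith("/") or spec.endswith("/") or spec.count("/") > 1:
        return False

    # Exactly one slash: a full type/subtype pair that must match exactly.
    if "/" in spec:
        return actual == spec

    # The empty string matches every content type.
    if spec == "":
        return True

    # Otherwise the specification names a type only; any subtype matches.
    return actual.split("/", 1)[0] == spec
```

For instance, content_type_matches("text", "text/html") is true, while
content_type_matches("text/", "text/html") is false because of the
trailing slash.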

175     If the :content specification matches a message/rfc822 MIME part,
176     only the header of the nested message will be searched for the
177     key strings; the contents of the nested message body parts are
178     only searched if its content-type matches the :content specification.

180     (Matches against container types with an empty match string can
181     be useful as tests for the existence of such parts.)
182     Example:
183         From: Whomever
184         To: Someone
185         Date: Whenever
186         Subject: whatever
187         Content-Type: multipart/mixed; boundary=outer

189         & This is a multi-part message in MIME format.
190         &
191         --outer
192         Content-Type: multipart/alternative; boundary=inner

194         & This is a nested multi-part message in MIME format.
195         &
196         --inner
197         Content-Type: text/plain; charset="us-ascii"

199         $ Hello
200         $
201         --inner
202         Content-Type: text/html; charset="us-ascii"

204         % Hello
205         %
206         --inner--
207         &
208         & This is the end of the inner MIME multipart.
209         &
210         --outer
211         Content-Type: message/rfc822

213         ! From: Someone Else
214         ! Subject: hello request

216         $ Please say Hello
217         $
218         --outer--
219         &
220         & This is the end of the outer MIME multipart.

222     In the above example, the '&', '$', '%', and '!' characters at
223     the start of a line are used to illustrate what portions of the
224     example message are used in tests:

226     - the lines starting with '&' are the ones that are tested when
227       a 'body :content "multipart" :contains "MIME"'
228       test is executed.

230     - the lines starting with '$' are the ones that are tested when
231       a 'body :content "text/plain" :contains "Hello"' test is
232       executed.

234     - the lines starting with '%' are the ones that are tested when
235       a 'body :content "text/html" :contains "Hello"' test is executed.

237     - the lines starting with '$' or '%' are the ones that are tested
238       when a 'body :content "text" :contains "Hello"' test is executed.

240     - the lines starting with '!'
          are the ones that are tested when
241       a 'body :content "message/rfc822" :contains "Hello"' test is
242       executed.

244     Comparisons are performed on octets.  Implementations decode
245     the content-transfer-encoding and convert text to [UTF-8] as
246     input to the comparator.  MIME parts that cannot be decoded and
247     converted MAY be treated as plain US-ASCII, omitted, or processed
248     according to local conventions.  A NUL octet (character zero)
249     SHOULD NOT cause early termination of the content being compared
250     against.  Implementations MUST support the "quoted-printable",
251     "base64", "7bit", "8bit", and "binary" content transfer encodings.
252     Implementations MUST be capable of converting to UTF-8 the
253     US-ASCII, ISO-8859-1, and the US-ASCII subset of
254     ISO-8859-* character sets.

256     Each matched part is matched against independently: search
257     expressions MUST NOT match across MIME part boundaries.
258     MIME headers of the containing part MUST NOT be included in the
259     data.

261     Example:
262         require ["body", "fileinto"];

264         # Save any message with any text MIME part that contains the
265         # words "missile" or "coordinates" in the "secrets" folder.

267         if body :content "text" :contains ["missile", "coordinates"] {
268             fileinto "secrets";
269         }

271         # Save any message with an audio/mp3 MIME part in
272         # the "jukebox" folder.

274         if body :content "audio/mp3" :contains "" {
275             fileinto "jukebox";
276         }

278  5.3  Body Transform ":text"

280     The ":text" body transform matches against the results of
281     an implementation's best effort at extracting UTF-8 encoded
282     text from a message.

284     It is unspecified whether this transformation results in a single
285     string or multiple strings being matched against.  All the text
286     extracted from a given non-container MIME part MUST be in the
287     same string.

289     In simple implementations, :text MAY be treated the same
290     as :content "text".
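
Editorial illustration, not part of the draft: a minimal sketch, assuming
Python's standard email package, of the simple strategy in which :text
behaves like :content "text", combined with :contains and the default
"i;ascii-casemap" comparator.  The function names are hypothetical.
Parts that cannot be decoded are omitted, which is one of the options
permitted in section 5.2.  The header-only case (no empty line after the
header) is not handled here.

```python
import email


def extract_text_parts(raw_message):
    """Return the decoded text of every text/* leaf part, one string each."""
    strings = []
    for part in email.message_from_string(raw_message).walk():
        # walk() also yields containers (multipart/*, message/rfc822);
        # only non-container parts of type text are of interest here.
        if part.is_multipart() or part.get_content_maintype() != "text":
            continue
        # decode=True undoes base64 / quoted-printable transfer encodings.
        payload = part.get_payload(decode=True) or b""
        charset = part.get_content_charset() or "us-ascii"
        try:
            strings.append(payload.decode(charset))
        except (LookupError, UnicodeDecodeError):
            continue  # assumption: undecodable parts are omitted
    return strings


def ascii_casemap(text):
    """Approximate i;ascii-casemap: fold ASCII letters only, as octets."""
    return text.encode("utf-8").upper()


def body_text_contains(raw_message, keys):
    """True if any key occurs in any single extracted text part."""
    parts = [ascii_casemap(p) for p in extract_text_parts(raw_message)]
    return any(ascii_casemap(k) in p for p in parts for k in keys)
```

Each part is kept as its own string, so a key can never match across a
MIME part boundary, in line with the requirement in section 5.2.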

292     Sophisticated implementations MAY strip mark-up from the text
293     prior to matching, and MAY convert media types other than text
294     to text prior to matching.

296     (For example, they may be able to convert proprietary text
297     editor formats to text or apply optical character recognition
298     algorithms to image data.)

300     Example:
301         require ["body", "fileinto"];

303         # Save messages mentioning the project schedule in the
304         # project/schedule folder.
305         if body :text :contains "project schedule" {
306             fileinto "project/schedule";
307         }

309  6.  Interaction with Other Sieve Extensions

311     Any extension that extends the grammar for the COMPARATOR or
312     MATCH-TYPE nonterminals will also affect the implementation of
313     "body".

315     Wildcard expressions used with "body" are exempt from the side
316     effects described in [VARIABLES].  That is, they MUST NOT set
317     match variables (${1}, ${2}...) to the input values corresponding
318     to wildcard sequences in the matched pattern.  However, if the
319     extension is present, variable references in the key strings or
320     content type strings are evaluated as described in the draft.

322  7.  IANA Considerations

324     The following template specifies the IANA registration of the Sieve
325     extension specified in this document:

327     To: iana@iana.org
328     Subject: Registration of new Sieve extension

330     Capability name: body
331     Description:     Provides a test for matching against the
332                      body of the message being processed
333     RFC number:      this RFC
334     Contact Address: Jutta Degener

336     This information should be added to the list of sieve extensions
337     given on http://www.iana.org/assignments/sieve-extensions.

339  8.  Security Considerations

341     The system MUST be sized and restricted in such a manner that
342     even malicious use of body matching does not deny service to
343     other users of the host system.

345     Filters relying on string matches in the raw body of an email
346     message may be more general than intended.
        Text matches are no
347     replacement for a spam, virus, or other security related
348     filtering system.

350  9.  Acknowledgments

352     This document has been revised in part based on comments and
353     discussions that took place on and off the SIEVE mailing list.
354     Thanks to Cyrus Daboo, Ned Freed, Bob Johannessen, Simon Josefsson,
355     Mark E. Mallett, Chris Markle, Alexey Melnikov, Ken Murchison,
356     Greg Shapiro, Tim Showalter, Nigel Swinson, Dowson Tong, and
357     Christian Vogt for reviews and suggestions.

359  10.  Authors' Addresses

361     Jutta Degener
362     5245 College Ave, Suite #127
363     Oakland, CA 94618

365     Email: jutta@pobox.com

367     Philip Guenther
368     Sendmail, Inc.
369     6425 Christie Ave, 4th Floor
370     Emeryville, CA 94608

372     Email: guenther@sendmail.com

374  11.  Discussion

376     This section will be removed when this document leaves the
377     Internet-Draft stage.

379     This draft is intended as an extension to the Sieve mail filtering
380     language.  Sieve extensions are discussed on the MTA Filters mailing
381     list at .  Subscription requests can
382     be sent to (send an email
383     message with the word "subscribe" in the body).

385     More information on the mailing list along with a WWW archive of
386     back messages is available at .

388  11.1  Changes from draft-ietf-sieve-body-08.txt

390     Add a "Capability Identifier" section to match existing RFCs.

392     Make the normative and informative references subsections of a
393     "References" section to match existing RFCs.

395     Tweak description field of the IANA registration.

397     Change "wild card" to "wildcard".

399  11.2  Changes from draft-ietf-sieve-body-07.txt

401     Clarify how transforms generate one or more strings to match against.

403     Reiterate the default COMPARATOR and MATCH-TYPE from the base spec.

405     [SIEVE] and [VARIABLES] have been published.

407  11.3  Changes from draft-ietf-sieve-body-06.txt

409     Changed "matched text" to "matched content".  Drop the word
410     "proposed".

412  11.4  Changes from draft-ietf-sieve-body-05.txt

414     Updated boilerplate to match RFC 4748.

416     Added "Intended-Status: Standards Track" and
417     "Updates: draft-ietf-sieve-variables-08"

419     Change the references from appendices to sections.
420     Update [SIEVE] reference.

422  11.5  Changes from draft-ietf-sieve-body-04.txt

424     Changed 'reject' to 'discard' in the example.

426     Removed reference to regex draft.

428     Update copyright boilerplate.

430  11.6  Changes from draft-ietf-sieve-body-03.txt

432     Update IANA registration to match 3028bis.

434     Added direct boilerplate for [KEYWORDS].

436  11.7  Changes from draft-ietf-sieve-body-02.txt

438     Updated charset conversion to match draft-ietf-sieve-3028bis-06.txt.

440     Change "Syntax:" to "Usage:".

442     Updated references.

444  11.8  Changes from draft-ietf-sieve-body-01.txt

446     Updated charset conversion requirements to match those in
447     draft-ietf-sieve-3028bis-03.txt for headers.

449  11.9  Changes from draft-ietf-sieve-body-00.txt

451     Updated IPR boilerplate to RFC 3978/3979.

453     Many prose corrections in response to WGLC comments.  Of particular
454     note:
455     - made clear that :raw treats MIME boundaries and headers as
456       text to be matched against
457     - corrected description in comment of :raw example
458     - clarified the interpretation of invalid content-types in
459       :content
460     - gave precise description of what gets matched when :content
461       is used with message/rfc822 or any multipart type, as well
462       as a comprehensive example
463     - include an example of :text
464     - tightened wording of interaction with [VARIABLES]
465     - added informative reference to [REGEX]

467  11.10  Changes from draft-degener-sieve-body-04.txt

469     Renamed to draft-ietf-sieve-body-00.txt; tweaked the title and
470     abstract.

472     Added Philip Guenther as co-author.

474     Split references into normative and informative.  Updated [UTF-8]
475     and [VARIABLES] references.

477     Updated IPR boilerplate.

479  11.11  Changes from draft-degener-sieve-body-03.txt

481     Made "body" exempt from variable-setting side effects in the
482     presence of the "variables" extension and wildcards.  It's too
483     hard to implement.

485     Removed :binary.  It's uglier and less useful than it needs to be
486     to bother.

488     Added IANA section.

490  12.  References

492  12.1.  Normative References

494     [KEYWORDS]  Bradner, S., "Key words for use in RFCs to Indicate
495                 Requirement Levels", BCP 14, RFC 2119, March 1997.

497     [MIME]      Freed, N. and N. Borenstein, "Multipurpose Internet Mail
498                 Extensions (MIME) Part One: Format of Internet Message
499                 Bodies", RFC 2045, November 1996.

501     [SIEVE]     Guenther, P. and T. Showalter, "Sieve: An Email
502                 Filtering Language", RFC 5228, January 2008.

504     [UTF-8]     Yergeau, F., "UTF-8, a transformation format of ISO
505                 10646", RFC 3629, November 2003.

507  12.2.  Informative References

509     [VARIABLES] Homme, K., "Sieve Email Filtering: Variables Extension",
510                 RFC 5229, January 2008.

512  Full Copyright Statement

514     Copyright (C) The IETF Trust (2008).

516     This document is subject to the rights, licenses and restrictions
517     contained in BCP 78, and except as set forth therein, the authors
518     retain all their rights.

520     This document and the information contained herein are provided on an
521     "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
522     OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
523     THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
524     OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
525     THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
526     WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

528  Intellectual Property

530     The IETF takes no position regarding the validity or scope of any
531     Intellectual Property Rights or other rights that might be claimed to
532     pertain to the implementation or use of the technology described in
533     this document or the extent to which any license under such rights
534     might or might not be available; nor does it represent that it has
535     made any independent effort to identify any such rights.  Information
536     on the procedures with respect to rights in RFC documents can be
537     found in BCP 78 and BCP 79.

539     Copies of IPR disclosures made to the IETF Secretariat and any
540     assurances of licenses to be made available, or the result of an
541     attempt made to obtain a general license or permission for the use of
542     such proprietary rights by implementers or users of this
543     specification can be obtained from the IETF on-line IPR repository at
544     http://www.ietf.org/ipr.

546     The IETF invites any interested party to bring to its attention any
547     copyrights, patents or patent applications, or other proprietary
548     rights that may cover technology that may be required to implement
549     this standard.  Please address the information to the IETF at
550     ietf-ipr@ietf.org.

552  Acknowledgement

554     Funding for the RFC Editor function is currently provided by the IETF
555     Administrative Support Activity (IASA).