Network Working Group                                           S. Bosch
Internet-Draft                                          January 16, 2013
Intended status: Standards Track
Expires: July 20, 2013


         Sieve Email Filtering: Detecting Duplicate Deliveries
                     draft-bosch-sieve-duplicate-00

Abstract

   This document defines a new test command "duplicate" for the "Sieve"
   email filtering language.  It can be used to test whether a
   particular string value is a duplicate, i.e. whether it was seen
   before by the delivery agent that is executing the Sieve script.
   The main application for this new test is detecting duplicate message
   deliveries commonly caused by mailing list subscriptions or
   redirected mail addresses.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on July 20, 2013.

Copyright Notice

   Copyright (c) 2013 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Conventions Used in This Document
   3.  Test "duplicate"
   4.  Sieve Capability Strings
   5.  Examples
   6.  Security Considerations
   7.  IANA Considerations
   8.  References
     8.1.  Normative References
     8.2.  Informative References
   Author's Address

1.  Introduction

   This is an extension to the Sieve filtering language defined by RFC
   5228 [SIEVE].  It adds a test to determine whether a certain string
   value was seen before by the delivery agent in an earlier execution
   of the Sieve script.  This can be used to detect and handle duplicate
   message deliveries.

   Duplicate deliveries are a common side-effect of being subscribed to
   a mailing list.  For example, if a member of the list decides to
   reply to both the user and the mailing list itself, the user will get
   one copy of the message directly and another through the mailing
   list.  Also, if someone cross-posts over several mailing lists to
   which the user is subscribed, the user will receive a copy from each
   of those lists.  In another scenario, the user has several redirected
   mail addresses all pointing to his main mail account.  If one of the
   user's contacts sends the message to more than one of those
   addresses, the user will receive more than a single copy.  Using the
   "duplicate" extension, users have the means to detect and handle such
   duplicates, e.g. by discarding them, marking them as "seen", or
   putting them in a special folder.

   Duplicate messages are normally detected using the Message-ID header
   field, which is required to be unique for each message.  However, the
   "duplicate" test is flexible enough to use different (weaker)
   criteria for defining what makes a message a duplicate, for example
   based on the subject line.  Also, other applications of this new test
   command are possible, as long as the tracked value is a string.

2.
Conventions Used in This Document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [KEYWORDS].

   Conventions for notations are as in [SIEVE] Section 1.1, including
   use of the "Usage:" label for the definition of action and tagged
   arguments syntax.

3.  Test "duplicate"

   Usage: "duplicate" [":handle" <handle: string>]
                      [":header" <header-name: string> /
                          ":value" <value: string>]
                      [":seconds" <timeout: number>]

   The "duplicate" test keeps track of which values were seen before by
   this test in an earlier execution of this Sieve script.  In its basic
   form, the tested value is the content of the Message-ID header of the
   message.  This way, this test can be used to detect duplicate
   deliveries of the same message.  It can also detect duplicate
   deliveries based on other message header fields if requested and it
   can even use a user-provided string value, e.g. as composed from text
   extracted from the message using the "variables" [VARIABLES]
   extension.

   The "duplicate" test evaluates to "true" when the provided value was
   seen before in an earlier Sieve execution for a previous message
   delivery.  If the value was not seen earlier, the test evaluates to
   "false".

   As a side-effect, the "duplicate" test adds the evaluated value to an
   internal duplicate tracking list, so that the test will evaluate to
   "true" the next time the Sieve script is executed and the same value
   is encountered.  Note that the "duplicate" test MUST only check for
   duplicates amongst values encountered in previous executions of the
   Sieve script; it MUST NOT consider values encountered earlier in the
   current Sieve script execution as potential duplicates.
   This means that all "duplicate" tests in a Sieve script execution,
   including those located in scripts included using the "include"
   [INCLUDE] extension, MUST yield the same result if the arguments are
   identical.

   Implementations MUST prevent adding values to the internal duplicate
   tracking list when the Sieve script execution fails.  For example,
   this can be implemented by deferring the definitive modification of
   the tracking list to the end of the Sieve script execution.  If
   failed script executions added values to the duplicate tracking
   list, all "duplicate" tests would erroneously yield "true" for the
   next delivery attempt of the same message, which can -- depending on
   the action taken for a duplicate -- easily lead to discarding the
   message without further notice.

   Implementations SHOULD limit the number of values (and thereby
   messages) that are tracked.  Also, implementations SHOULD let entries
   in the value tracking list expire after a short period of time.  The
   user can explicitly control the length of this expiration time by
   means of the ":seconds" argument.  If the ":seconds" argument is
   omitted, an appropriate default MUST be used.  Sites SHOULD impose a
   maximum limit on the expiration time.  If that limit is exceeded, the
   maximum value MUST silently be substituted; exceeding the limit MUST
   NOT produce an error.

   By default, the tracked value is the content of the message's
   Message-ID header field.  For more advanced purposes, the content of
   another header can be chosen for tracking by specifying the ":header"
   argument.  The tracked string value can also be specified explicitly
   using the ":value" argument.  The ":header" and ":value" arguments
   are mutually exclusive and specifying both for a single "duplicate"
   test command MUST trigger an error at compile time.  If the value is
   extracted from a header, i.e.
   when the ":value" argument is not used, leading and trailing
   whitespace (see Section 2.2 of RFC 5228 [SIEVE]) MUST first be
   trimmed from the value before performing the actual duplicate
   verification.

   Using the ":handle" argument, the duplicate test can be employed for
   multiple independent purposes.  The tracked value is recognized as a
   duplicate only when it was seen before in an earlier script execution
   by a "duplicate" test with the same ":handle" argument.

   NOTE: The necessary mechanism to track duplicate messages is very
   similar to the mechanism that is needed for tracking duplicate
   responses for the "vacation" [VACATION] action.  One way to implement
   the necessary mechanism for the "duplicate" test is therefore to
   store a hash of the tracked value and, if provided, the ":handle"
   argument.

4.  Sieve Capability Strings

   A Sieve implementation that defines the "duplicate" test command will
   advertise the capability string "duplicate".

5.  Examples

   In the following basic example, message duplicates are detected by
   tracking the Message-ID header.  Duplicate deliveries are stored in a
   special folder contained in the user's Trash folder.  If the folder
   does not exist, it is created automatically using the "mailbox"
   [MAILBOX] extension.  This way, the user has a chance to recover
   messages when necessary.  Messages that are not recognized as
   duplicates are stored in the user's inbox as normal.

   require ["duplicate", "fileinto", "mailbox"];

   if duplicate {
     fileinto :create "Trash/Duplicate";
   }

   The next example shows a more complex use of the "duplicate" test.
   The user gets network alerts from a set of remote automated
   monitoring systems.  Multiple notifications can be received about the
   same event from different monitoring systems.  The Message-ID of
   these messages is different, because these are all distinct messages
   from different senders.
   To avoid being notified multiple times about the same event, the user
   writes the following script:

   require ["duplicate", "variables", "imap4flags",
     "fileinto"];

   if header :matches "subject" "ALERT: *" {
     if duplicate :seconds 60 :value "${1}" {
       setflag "\\seen";
     }
     fileinto "Alerts";
   }

   The subjects of the notification messages are structured with a
   predictable pattern which includes a description of the event.  In
   the script above the "duplicate" test is used to detect duplicate
   alert events.  The message subject is matched against a pattern and
   the event description is extracted using the "variables" [VARIABLES]
   extension.  If a message with that event in the subject was received
   before, but more than a minute ago, it is not detected as a duplicate
   due to the specified ":seconds" argument.  In the event of a
   duplicate, the message is marked as "seen" using the "imap4flags"
   [IMAP4FLAGS] extension.  All alert messages are put into the "Alerts"
   mailbox irrespective of whether those messages are duplicates or not.

6.  Security Considerations

   A flood of unique messages could cause the list of tracked values to
   grow indefinitely.  Implementations therefore SHOULD implement limits
   on the number and lifespan of entries in that list.

7.  IANA Considerations

   The following template specifies the IANA registration of the Sieve
   extension specified in this document:

   To: iana@iana.org
   Subject: Registration of new Sieve extension

   Capability name: duplicate
   Description:     Adds test 'duplicate' that can be used to test
                    whether a particular string value is a duplicate,
                    i.e. whether it was seen before by the delivery
                    agent that is executing the Sieve script.  The
                    main application for this test is detecting
                    duplicate message deliveries.
   RFC number:      this RFC
   Contact address: Sieve mailing list

   This information should be added to the list of sieve extensions
   given on http://www.iana.org/assignments/sieve-extensions.

8.  References

8.1.  Normative References

   [INCLUDE]  Daboo, C. and A. Stone, "Sieve Email Filtering: Include
              Extension", RFC 6609, May 2012.

   [KEYWORDS]
              Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [SIEVE]    Guenther, P. and T. Showalter, "Sieve: An Email Filtering
              Language", RFC 5228, January 2008.

8.2.  Informative References

   [IMAP4FLAGS]
              Melnikov, A., "Sieve Email Filtering: Imap4flags
              Extension", RFC 5232, January 2008.

   [MAILBOX]  Melnikov, A., "The Sieve Mail-Filtering Language --
              Extensions for Checking Mailbox Status and Accessing
              Mailbox Metadata", RFC 5490, March 2009.

   [VACATION]
              Showalter, T. and N. Freed, "Sieve Email Filtering:
              Vacation Extension", RFC 5230, January 2008.

   [VARIABLES]
              Homme, K., "Sieve Email Filtering: Variables Extension",
              RFC 5229, January 2008.

Author's Address

   Stephan Bosch
   Enschede
   NL

   Email: stephan@rename-it.nl
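
Implementation note (non-normative)

   The tracking behaviour required of the "duplicate" test could be
   sketched roughly as follows.  This is only an illustration in Python
   and not part of the specification.  The class name, method names,
   default limits, choice of SHA-256, and the decision to refresh an
   entry's expiry each time it is seen are all assumptions made for the
   sketch.  It shows hashing of the value together with the handle,
   deferring changes to the tracking list until the script execution
   succeeds, silently capping ":seconds", and limiting the list size.

```python
import hashlib
import time


class DuplicateTracker:
    """Hypothetical tracking list for the "duplicate" test (illustration only)."""

    def __init__(self, default_seconds=3600, max_seconds=86400,
                 max_entries=1000, clock=time.time):
        # All limits here are assumed values; the draft leaves them to the site.
        self.default_seconds = default_seconds
        self.max_seconds = max_seconds
        self.max_entries = max_entries
        self.clock = clock
        self.entries = {}   # committed: key -> expiry timestamp
        self.pending = {}   # seen in the current execution only

    @staticmethod
    def _key(value, handle):
        # Hash handle and value with length prefixes so that different
        # (handle, value) pairs cannot collide by concatenation.
        # An absent handle is treated like an empty one (assumption).
        digest = hashlib.sha256()
        for part in (handle or "", value):
            data = part.encode("utf-8")
            digest.update(len(data).to_bytes(8, "big"))
            digest.update(data)
        return digest.hexdigest()

    def test(self, value, handle=None, seconds=None, from_header=False):
        """Return True if value was seen in an EARLIER successful execution."""
        if from_header:
            value = value.strip(" \t\r\n")  # trim header whitespace first
        if seconds is None:
            seconds = self.default_seconds
        seconds = min(seconds, self.max_seconds)  # cap silently, no error
        now = self.clock()
        key = self._key(value, handle)
        expiry = self.entries.get(key)
        seen = expiry is not None and expiry > now
        # Record in the pending set only; values from the current
        # execution are never consulted as potential duplicates.
        self.pending[key] = now + seconds
        return seen

    def finish(self, success):
        """Commit pending values only if the script execution succeeded."""
        if success:
            now = self.clock()
            self.entries.update(self.pending)
            self.entries = {k: e for k, e in self.entries.items() if e > now}
            while len(self.entries) > self.max_entries:
                # Evict the entry that would expire soonest.
                del self.entries[min(self.entries, key=self.entries.get)]
        self.pending.clear()
```

   A delivery agent built this way would call test() for every
   "duplicate" test in a script and finish() exactly once when the
   execution ends, passing whether it succeeded.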