Individual submission                                       M. Kucherawy
Internet-Draft                                                 Cloudmark
Intended status: Standards Track                              D. Crocker
Expires: October 28, 2012                    Brandenburg InternetWorking
                                                          April 26, 2012


         Email Greylisting: An Applicability Statement for SMTP
                   draft-ietf-appsawg-greylisting-09

Abstract

   This document describes the art of email greylisting, the practice of
   providing temporarily degraded service to unknown email clients as an
   anti-abuse mechanism.

   Greylisting is an established mechanism deemed essential to the
   repertoire of current anti-abuse email filtering systems.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on October 28, 2012.

Copyright Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.
   Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1. Introduction
     1.1. Background
     1.2. Definitions
   2. Types of Greylisting
     2.1. Connection-Level Greylisting
     2.2. SMTP HELO/EHLO Greylisting
     2.3. SMTP MAIL Greylisting
     2.4. SMTP RCPT Greylisting
     2.5. SMTP DATA Greylisting
     2.6. Additional Heuristics
     2.7. Exceptions
   3. Benefits and Costs
   4. Unintended Consequences
     4.1. Unintended Mail Delivery Failures
     4.2. Unintended SMTP Client Failures
     4.3. Address Space Saturation
   5. Recommendations
   6. Measuring Effectiveness
   7. IPv6 Applicability
   8. IANA Considerations
   9. Security Considerations
     9.1. Tradeoffs
     9.2. Database
   10. References
     10.1. Normative References
     10.2. Informative References
   Appendix A. Acknowledgments
   Authors' Addresses

1. Introduction

   Preferred techniques for handling email abuse explicitly identify
   good actors and bad actors, giving each significantly different
   qualities of service. In some cases an actor does not have a known
   reputation; this can justify providing degraded service, until there
   is a basis for providing better service. This latter approach is
   known as "greylisting". Broadly, the term refers to any degradation
   of service for an unknown or suspect source, over a period of time
   (typically measured in minutes or a small number of hours). The
   narrow use of the term refers to generation of an SMTP temporary
   failure reply code for traffic from such sources. There are diverse
   implementations of this basic concept, and, predictably therefore,
   some blurred terminology.

   Absent a perfect abuse detection mechanism that incurs no cost, the
   current requirement is for an array of techniques to be used by each
   filtering system. They range in cost and effectiveness and in the
   types of abuse techniques they target.

   Greylisting happens to be a technique that is cheap and early (in
   terms of its application in the SMTP sequence) and surprisingly
   remains useful. Some spamware does indeed route around this
   technique, but much does not.

   The firehose of spam over the Internet represents a wide range of
   sophistication. Greylisting is useful for removing a large amount of
   simplistic-but-significant traffic.

   This memo documents common greylisting techniques and discusses their
   benefits and costs. It also defines terminology to enable clear
   distinction and discussion of these techniques.

   There is some confusion in the industry that conflates greylisting
   with an SMTP temporary failure for any reason. This memo also aims
   to dispel that confusion.

1.1. Background

   For many years, large amounts of spam have been sent through
   purpose-built software, or "spamware", that supports only a
   constrained version of SMTP. In particular, such software does not
   perform retransmission attempts after receiving an SMTP temporary
   failure. That is, if the spamware cannot deliver a message, it just
   goes on to the next address in its list since, in spamming, volume
   counts for far more than reliability. Greylisting exploits this by
   rejecting mail from unfamiliar sources with a "transient (soft) fail"
   (4xx) [SMTP] error code. Another application of greylisting is to
   delay mail from newly seen IP addresses on the theory that, if it's a
   spam source, then by the time it retries, it will appear in a list of
   sources to be filtered, and the mail will not be accepted.

   Early references for greylisting descriptions and implementations can
   be found at [SAUCE] and [PUREMAGIC].

1.2. Definitions

1.2.1. Keywords

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [KEYWORDS].

1.2.2. E-Mail Architecture Terminology

   Readers need to be familiar with the material and terminology
   discussed in [MAIL] and [EMAIL-ARCH].

2. Types of Greylisting

   Greylisting is primarily performed at some phase during an SMTP
   session. A set of attributes about the client-side SMTP server is
   used for assessing whether to perform greylisting. At its simplest,
   the attribute is the IP address of the client and the assessment is
   whether it has connected recently.
   More elaborate attribute combinations and more sophisticated
   assessments can be performed. The following discussion covers the
   most common combinations.

2.1. Connection-Level Greylisting

   Connection-level greylisting decides whether to accept the (TCP)
   connection from a "new" [SMTP] client. At this point in the
   communication between the client and the server, the only information
   known to the receiving server is the incoming IP address. This, of
   course, is often (but not always) translatable into a host name.

   The typical application of greylisting here is to keep a record of
   SMTP client IP addresses and/or host names (collectively, "sources")
   that have been seen. Such a database acts as a cache of known
   senders and might or might not expire records after some period. If
   the source is not in the database, or the record of the source has
   not reached some required minimum age (such as 30 minutes since the
   initial connection attempt), the server does one of the following,
   inviting a later retry:

   o  returns a 421 SMTP reply, and closes the connection;

   o  returns a different 4yz SMTP reply to all further commands in this
      SMTP session.

   A useful variant of the basic known/unknown policy is to limit
   greylisting to those addresses that are on some list of IP addresses
   known to be affiliated with bad actors. Whereas the simpler policy
   affects all new connections, including those from good actors, the
   constrained policy applies greylisting actions only to sites that
   already have a negative reputation.

2.2. SMTP HELO/EHLO Greylisting

   HELO/EHLO greylisting refers to the first command verb in an SMTP
   session, HELO or EHLO. It includes a single, required parameter that
   is supposed to contain the client's fully-qualified host name or its
   literal IP address.

   Greylisting implemented at this phase retains a record of sources
   coupled with HELO/EHLO parameters.
   It returns 4yz SMTP replies to
   all commands until the end of the SMTP session if that tuple has not
   previously been recorded or if the record exists but has not reached
   some configured minimum age.

2.3. SMTP MAIL Greylisting

   MAIL command greylisting refers to the command verb in an SMTP
   session that initiates a new transaction, MAIL. It includes at least
   one required parameter that indicates the return email address
   (RFC5321.MailFrom) of the message being relayed from the client to
   the server.

   Greylisting implemented at this phase retains a record of sources
   coupled with return email addresses. It returns 4yz SMTP replies to
   all commands for the remainder of the SMTP session if that tuple has
   not previously been recorded or if the record exists but has not met
   some configured minimum age.

2.4. SMTP RCPT Greylisting

   RCPT greylisting refers to the command verb in an SMTP session that
   specifies intended recipients of an email transaction, RCPT. It
   includes at least one required parameter that indicates the email
   address of an intended recipient of the message being relayed from
   the client to the server.

   Greylisting implemented at this phase retains a record of tuples that
   combine the provided recipient address with any combination of the
   following:

   o  the source, as described above;

   o  the return email address;

   o  the other recipient addresses of the message (if any).

   If the selected tuple is not found in the database, or if the record
   is present but has not reached some configured minimum age, the
   greylisting Mail Transfer Agent (MTA) [EMAIL-ARCH] returns 4yz SMTP
   replies to all commands for the remainder of the SMTP session.

   Note that often a match on a tuple involving the first valid RCPT is
   sufficient to identify a retry correctly, and further checks can be
   omitted.

2.5. SMTP DATA Greylisting

   DATA greylisting refers to the command verb in an SMTP session that
   transmits the actual message content, DATA, as opposed to its
   envelope details (see [MAIL]).

   This type of greylisting can be performed at two places in the SMTP
   sequence:

   1. on receipt of the DATA command, because at that point the entire
      envelope has been received (i.e., all MAIL and RCPT commands have
      been issued);

   2. on completion of the DATA command, i.e., after the "." that
      terminates transmission of the message body, since at that point
      a digest or other analysis of the message could be performed.

   Some implementations do filtering here because there are clients that
   don't bother checking SMTP reply codes to commands other than DATA.
   Hence, it can be useful to add greylisting capability at that point
   in an SMTP session.

   Numerous greylisting policies are possible at this point. All of
   them retain a record of tuples that combine the various parts of the
   SMTP transaction in some combination, including:

   o  the source, as described above;

   o  the return email address;

   o  the recipients of the message, as a set or individually;

   o  identifiers in the message header, such as the contents of the
      RFC5322.From or RFC5322.To fields;

   o  other prominent parts of the content, such as the RFC5322.Subject
      field;

   o  a digest of some or all of the message content, as a test for
      uniqueness;

   o  analysis of arbitrary portions of the message body.

   (The last four items in that list are only possible at the end of
   DATA, not on receipt of the DATA command.)

   If the selected tuple is not found in the database, or if the record
   exists but has not reached some configured minimum age, the
   greylisting MTA returns 4yz SMTP replies to all commands for the
   remainder of the SMTP session.

2.6. Additional Heuristics

   Since greylisting seeks to target spam senders, it follows that being
   able to identify spamware within the SMTP context beyond the simple
   notion of "not seen before" would be desirable. A more targeted
   approach might also include in its selection such heuristics as:

   o  if a [DNSBL] lists an IP address but the implementer wishes to be
      cautious with mitigation actions rather than blocking traffic from
      the IP address outright, then subject it to greylisting;

   o  if the value found in a PTR record follows common naming patterns
      for dynamic IP addresses, then subject it to greylisting.

2.7. Exceptions

   Most greylisting systems provide for an exception mechanism, allowing
   one to specify IP addresses, IP address [CIDR] blocks, hostnames or
   domain names that are exempt from greylisting checks and thus whose
   SMTP client sessions are not subject to such interference.

   Likely candidates to be excepted from greylisting include those known
   not to retry according to a pattern that will be observed as
   legitimate, and those that send so rarely that they will age out of
   the database. In both cases the excepted source is known not to be
   an abusive one by the site implementing greylisting. Otherwise,
   typical non-abusive senders will enter the exception list on the
   first proper retry, and remain there permanently.

   One could also use a [DNSBL] that lists known good hosts as a
   greylisting exception set.

3. Benefits and Costs

   The most obvious benefit with any of the above techniques is that
   spamware generally does not retry, and is therefore less likely to
   succeed, absent a record of a previous delivery attempt.

   The most obvious detriment to implementing greylisting is the
   imposition of delay on legitimate mail.
   Some popular MTAs do not
   retry failed delivery attempts for an hour or more, which can cause
   expensive delays when delivery of mail is time-critical. Worse, some
   legitimate MTAs do not retry at all. (Note however that non-retrying
   clients are not fully SMTP-capable, per Section 2.1 of [SMTP]. A
   client does not know, nor is it entitled to know, the reason for the
   temporary failure status code being returned; greylisting could be in
   effect, or it could be caused by a local resource issue at the
   server. A client therefore needs to be equipped to retry in order to
   be considered fully capable.)

   The counterargument to this "false positive" problem is that email
   has always been a "best-effort" mechanism, and thus this cost is
   ultimately low in comparison to the cost of dealing with high volumes
   of unwanted mail. Still, the actual effect of such delays can be
   significant, such as altering the tone or flow of a multi-participant
   discussion to a mailing list.

   The cache of information stored about SMTP client history does not
   benefit legitimate clients that are already listed for acceptance,
   when the clients are subjected to any kind of reconfiguration,
   especially such as network renumbering. To the greylisting
   implementation, such clients are once again unknown, and they will
   once again be subjected to the delay.

   Another obvious cost is for the required database. It has to be
   large enough to keep the necessary history and fast enough to avoid
   excessive inefficiencies in the server's operations. The primary
   consideration is the maximum age of records in the database.
   If
   records age out too soon, then hosts that do retry per [SMTP] will be
   periodically subjected to greylisting even though they are
   well-behaved; if records age out after too long a period, then
   eventually spamware that launches a new campaign will not be
   identified as "unknown" in this manner, and will not be required to
   retry.

   Presuming that known friendly senders will be manually configured as
   exceptions to the greylisting check, a steady state will eventually
   be reached wherein the only mail that is delayed is mail from an IP
   address that has never sent mail before. Experience suggests that
   the vast majority of mail comes from places on a developed exception
   list, so after a training period, only a small proportion of mail is
   actually affected. The training period could be replaced by
   processing a history of email traffic and adding the IP addresses
   from which most traffic arrives to the exception list.

   Applying greylisting based on actual message content (i.e.,
   post-DATA) is substantially more expensive than any of the other
   alternatives both in terms of the resources required to accept and
   temporarily store a complete message body (which can be quite
   substantial) and any processing that is done on that content. As a
   consequence, such methods incur more cost during the session and thus
   are not typical practice.

4. Unintended Consequences

4.1. Unintended Mail Delivery Failures

   There are a few failure modes of greylisting that are worth
   considering. For example, consider an email message intended for
   user@example.com. The example.com domain is served by two receiving
   mail servers, one called mail1.example.com and one called
   mail2.example.com. On the first delivery attempt, mail1.example.com
   greylists the client, and thus the client places the message in its
   outgoing queue for later retry.
   Later, when a retry is attempted,
   mail2.example.com is selected for the delivery, either because
   mail1.example.com is unavailable or because a round-robin [DNS]
   evaluation produces that result. However, the two example.com hosts
   do not share greylisting databases, so the second host again denies
   the attempt. Thus, although example.com has sought to improve its
   email throughput by having two servers, it has in fact amplified the
   problem of legitimate mail delay introduced by greylisting.

   Similarly, consider a site with multiple outbound MTAs that share a
   common queue. On a first outbound delivery attempt to example.com,
   the attempt is greylisted. On a later retry, a different outbound
   MTA is selected, which means example.com sees a different source, and
   once again greylisting occurs on the same message. The same effect
   can result from the use of [DHCP], where the IP address of an
   outbound MTA changes between attempts.

   For systems that do DATA-level greylisting, if any part of the
   message has changed since the first attempt, the tuple constructed
   might be different than the one for the first attempt, and the
   delivery is again greylisted. Some MTAs do reformulate portions of
   the message at submission time and this can produce visible
   differences for each attempt.

   A host that sends mail to a particular destination infrequently might
   not remain "known" in the receiving server's database and will
   therefore be greylisted for a high percentage of mail despite
   possibly being a legitimate sender.

   All of these and other similar cases can cause greylisting to be
   applied improperly to legitimate MTAs multiple times, leading to long
   delays in delivery or ultimately the return of the message to its
   sender. Other side effects include out-of-order delivery of related
   sequenced messages.
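
   The two-server case described at the start of this section can be
   illustrated with a minimal sketch. It is not taken from any
   particular implementation: the Greylister class, its check() method,
   the 30-minute MIN_AGE value, and the choice of reply codes 250 and
   450 are all assumptions made for illustration. A shared database is
   modelled simply by handing the same dictionary to two instances.

```python
MIN_AGE = 30 * 60  # seconds; hypothetical minimum record age


class Greylister:
    """Hypothetical greylisting check keyed on an envelope tuple."""

    def __init__(self, database=None):
        # Maps (source IP, MAIL FROM, first RCPT TO) -> first-seen time.
        # Passing the same dict to several instances models a shared
        # greylisting database.
        self.database = {} if database is None else database

    def check(self, source_ip, mail_from, first_rcpt, now):
        """Return 250 to accept, or 450 to invite a later retry."""
        key = (source_ip, mail_from, first_rcpt)
        # Record the tuple on first sight; otherwise keep the old time.
        first_seen = self.database.setdefault(key, now)
        if now - first_seen >= MIN_AGE:
            return 250
        return 450
```

   With two instances that each hold their own dictionary, a retry that
   arrives at the second instance an hour after the first attempt is
   still answered with 450, because that instance has never seen the
   tuple. With one dictionary shared between them, the same retry is
   answered with 250.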

   Address translation technologies such as [NAT] cause distinct MTAs to
   appear to come from a common IP address. This can cause greylisting
   to be applied only to the first connection attempt from the shared IP
   address, meaning future MTAs connecting for the first time will be
   exempted from the protection greylisting provides.

4.2. Unintended SMTP Client Failures

   Atypical SMTP client behaviours also need to be considered when
   deploying greylisting.

   Some clients do not retry messages for very long periods. Popular
   open source MTAs implement increasing backoff times when messages
   receive temporary failure messages and/or degrade queue priority for
   very large messages. This means greylisting introduces even more
   delay for MTAs implementing such schemes, and the delay can become
   large enough to become a nuisance to users.

   Some clients do not retry messages at all, in violation of [SMTP].
   This means greylisting will cause outright delivery failure right
   away for sources, envelopes, or messages that it has not seen before,
   regardless of the client attempting the delivery, essentially
   treating legitimate mail and spam the same.

   If a greylisting scheme requires a database record to have reached a
   certain age rather than merely testing for the presence of the record
   in the database, and the client has a retry schedule that is too
   aggressive, the client could be subjected to rate limiting by the MTA
   independent of the restrictions imposed by greylisting.

   Some SMTP implementations make the error of treating all error codes
   as fatal, contrary to [SMTP]; that is, a 4yz response is treated as
   if it were a 5yz response, and the message is returned to the sender
   as undeliverable. This can result in such things as inadvertent
   removal from mailing lists in response to the perceived rejections.
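
   The distinction that such implementations miss depends only on the
   first digit of the reply code, as defined in [SMTP]. A minimal
   sketch of the client-side decision follows. The function name
   client_action and the labels it returns are hypothetical and chosen
   only for illustration.

```python
# First digit of an SMTP reply code -> what the sending client does next.
ACTIONS = {
    2: "success",   # 2yz: positive completion
    3: "continue",  # 3yz: positive intermediate (e.g. 354 after DATA)
    4: "retry",     # 4yz: transient failure; queue and try again later
    5: "bounce",    # 5yz: permanent failure; return to sender
}


def client_action(reply_code):
    """Classify a three-digit SMTP reply code by its first digit."""
    if not 200 <= reply_code <= 599:
        raise ValueError(f"not an SMTP reply code: {reply_code}")
    return ACTIONS[reply_code // 100]
```

   A client written this way queues a message that receives 421 or 450
   for a later attempt, and returns it to the sender only on a 5yz
   reply.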

   Some clients encode message-specific details in the address parameter
   to the [SMTP] MAIL command. If doing so causes the parameter to
   change between retry attempts, a greylisting implementation could see
   it as a new delivery rather than a retry, and disallow the delivery.
   In such cases, the mail will never be delivered, and will be returned
   to the sender after the retry timeout expires.

   A client subjected to greylisting might move to the next host found
   in the ordered [DNS] MX record set for the destination domain and
   re-attempt delivery. This has several considerations of its own:

   o  An increase in traffic to those alternate servers merely as a
      result of greylisting.

   o  Alternate (MX) servers SHOULD share the same greylisting database.
      When they do not -- as is often true when the servers occupy
      different Administrative Management Domains (ADMDs) -- SMTP
      clients can see variable treatment if they try to send to
      different MX hosts.

   o  When alternate MX servers relay mail back to the "primary" MX
      server, the latter SHOULD be configured to permit the other
      servers to relay mail without being subjected to greylisting.

   There are some applications that connect to an SMTP server and
   simulate a transaction up to the point of sending the RCPT command in
   an attempt to confirm that an address is valid. Some of these are
   legitimate applications (e.g., mailing list servers) and others are
   automated programs that attempt to ascertain valid addresses to which
   to send spam (a "directory harvesting" attack). Greylisting can
   interfere with both instances, with harmful effects on the former.

4.3. Address Space Saturation

   Greylisting is obviously not a fool-proof solution to avoiding
   abusive traffic. Bad actors that send mail with just enough
   frequency to avoid having their records expire will never be caught
   by this mechanism after the first instance.
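
   The following minimal sketch shows why such senders stay known. It
   assumes a record timeout that is measured from the most recent
   connection and refreshed on every connection. The KnownSources
   class, its is_known() method, and the one-week EXPIRY value are
   assumptions made for illustration only.

```python
EXPIRY = 7 * 24 * 3600  # seconds; hypothetical one-week record timeout


class KnownSources:
    """Hypothetical record store whose entries expire when idle."""

    def __init__(self, expiry=EXPIRY):
        self.expiry = expiry
        self.last_seen = {}  # source IP -> time of latest connection

    def is_known(self, source_ip, now):
        """Report whether an unexpired record exists, then refresh it."""
        last = self.last_seen.get(source_ip)
        known = last is not None and now - last <= self.expiry
        # Every connection refreshes the record, so a source that keeps
        # connecting inside the expiry period is never forgotten.
        self.last_seen[source_ip] = now
        return known
```

   Under these assumptions, a source that connects every six days is
   unknown only on its first connection, whereas a source that connects
   every eight days is treated as unknown every time.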

   Where this is a concern, combining greylisting with some form of
   reputation service that estimates the likely behaviour for IP
   addresses that are not intercepted by the greylisting function would
   be a good choice.

5. Recommendations

   The following practices are RECOMMENDED based on collected
   experience:

   1. Implement greylisting based on a tuple consisting of (IP address,
      RFC5321.MailFrom, and the first RFC5321.RcptTo). It has proven
      sufficient to use only the first RFC5321.RcptTo as legitimate
      MTAs appear not to reorder recipients between retries. Including
      RFC5321.MailFrom improves accuracy where the IP address is being
      matched in clusters (e.g., CIDR blocks) rather than precisely
      (see below). After a successful retry, allow all further [SMTP]
      traffic from the IP address in that tuple regardless of envelope
      information.

   2. Include a configurable time window within which a retry from a
      greylisted host is considered, and ignored otherwise. The time
      window needs to be configured to contain typical retry times of
      common MTA configurations, thus anticipating that a fully-capable
      MTA will retry sometime after the beginning of the window and
      before the end of it. The default window SHOULD range from one
      minute to 24 hours. Retries during the period of this window are
      permitted and satisfy the greylisting test, and thus the client
      is no longer likely to be a sender of spam; retries after the end
      of the window SHOULD be considered to be a new message for the
      purposes of greylisting evaluation (i.e., reset the "first seen"
      timestamp for that IP address). Some sites use a higher time
      value for the low end of the window time to match common
      legitimate MTA retry timeouts, but additional benefit from doing
      so appears unlikely.

   3. Include a timeout for database entries, after which records for
      IP addresses that have generated no recent traffic are deleted.
      This step is intended to re-enable greylisting for an IP address
      in the event that it has changed "owners", and will subject the
      client to another round of greylisting. The default SHOULD be at
      least one week.

   4. For an Administrative Management Domain (ADMD), all inbound border
      MTAs listed in the [DNS] SHOULD share a common greylisting
      database and common greylisting policies. This handles sequences
      in which a client's retry goes to a different server after the
      first 4yz reply, and it lets all servers share the list of hosts
      that did retry successfully.

   5. To accommodate those senders that have clusters of outgoing mail
      servers, greylisting servers MAY track CIDR blocks of a size of
      their own choosing, such as /24, rather than the full IPv4
      address. (Note, however, that this heuristic will not work for
      clusters having machines on different networks.) A similar
      grouping capability MAY be established based on the domain name
      of the mail server if one can be determined.

   6. Include a manual override capability for adding specific IP
      addresses or network blocks that always bypass checks. There are
      legitimate senders that simply don't respond well to greylisting
      for a variety of reasons, most of which do not conflict with
      [SMTP]. There are also some highly visible online entities such
      as email service providers that will be certain to retry, and
      thus those that are known SHOULD be allowed to bypass the filter.

   7. Greylisting SHOULD NOT be applied by an ADMD's submission service
      (see [SUBMISSION]) for authenticated client hosts. It also
      SHOULD NOT be applied against any authenticated ADMD session.
      Authentication can include whatever mechanisms are deemed
      appropriate for the ADMD, such as known internal IP addresses,
      protocol-level client authentication, or the like.

   There is no specific recommendation as to the choice of 4yz
   code to be returned as a result of a greylisting delay. Per [SMTP],
   however, the only two reasonable choices are 421 if the
   implementation wishes to terminate the connection immediately, and
   450 otherwise. It is possible that some clients treat different 4yz
   codes differently, but no data are available on whether using 421
   versus some other 4yz code is particularly advantageous.

   There is also no specific recommendation as to the choice of text to
   include in the SMTP reply, if any. Some implementers argue that
   indicating that greylisting is in effect can give spamware a hint as
   to when to try again for successful delivery, while others suspect
   that it won't matter to spamware and thus the more likely audience is
   legitimate senders seeking to understand why their mail is being
   delayed.

6. Measuring Effectiveness

   A few techniques are common when measuring the effectiveness of
   greylisting in a particular installation:

   o  Arrange to log the spam vs. legitimate determinations of messages
      and what the greylisting decision would have been if enabled; then
      determine whether there is a correlation (and, of course, whether
      too much legitimate email would also be affected);

   o  Continuing from the previous point, query the set of IP addresses
      subjected to greylisting in any popular [DNSBL] to see if there is
      a strong correlation.

7. IPv6 Applicability

   The descriptions and recommendations presented in this memo are based
   on many years of experience with greylisting in the IPv4 Internet
   environment, and so they clearly pertain to IPv4 deployments only.
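
   One place where the address family visibly matters is the grouping
   of sources by network block, as in the CIDR-block tracking of
   Section 5. The following minimal sketch derives a database key by
   truncating a source address to a prefix. The function name
   source_key and both default prefix lengths are assumptions; in
   particular, the /64 used for IPv6 is only a placeholder and is not
   a value recommended by this memo.

```python
import ipaddress


def source_key(address, v4_prefix=24, v6_prefix=64):
    """Return the enclosing network of an address as a database key.

    The default prefix lengths are hypothetical; a site would choose
    its own values.
    """
    ip = ipaddress.ip_address(address)
    prefix = v4_prefix if ip.version == 4 else v6_prefix
    # strict=False lets ip_network() accept an address with host bits
    # set and mask them off.
    network = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
    return str(network)
```

   Two addresses within the same /24, such as 192.0.2.57 and
   192.0.2.200, map to the same key and therefore share one greylisting
   record.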

   The greater size of an IPv6 address seems likely to permit
   differences in behaviours by bad actors, and this could well mean
   needing to alter the details for applying greylisting; it might even
   negate any benefits in using greylisting at all.  At a minimum, it is
   likely to call for different specific choices for any greylisting
   algorithm variables.

   In addition, an obvious consideration is that the size of the
   database required to store records of all of the IP addresses seen
   will likely be substantially larger in the IPv6 environment.

8.  IANA Considerations

   No actions are requested of IANA in this memo.

   [RFC Editor: Please remove this section prior to publication.]

9.  Security Considerations

   This section discusses potential security issues related to
   greylisting.

9.1.  Tradeoffs

   The discussion above highlights the fact that, although greylisting
   provides some obvious and valuable defenses, it can introduce
   unintentional and detrimental consequences for delivery of legitimate
   mail.  Where timely delivery of email is essential, especially for
   financial, transactional, or security-related applications, the
   possible consequences of such systems need to be carefully
   considered.

   Specific sources can be exempted from greylisting, but of course that
   means they have elevated privilege in terms of access to the
   mailboxes on the greylisting system, and malefactors can seek to
   exploit this.

9.2.  Database

   The database that has to be maintained as part of any greylisting
   system will grow as the diversity of its SMTP clients' hosts grows,
   and of course is larger in general depending on the nature of the
   tuple stored about each delivery attempt.  Even with a record aging
   policy in place, such a database could grow large enough to interfere
   with the system hosting it, or at least to a point at which
   greylisting service is degraded.
   Moreover, an attacker knowing which greylisting scheme is in use
   could rotate parameters of SMTP clients under its control, in an
   attempt to inflate the database to the point of denial-of-service.

   Implementers could consider configuring an appropriate failure policy
   so that something locally acceptable happens when the database is
   attacked or otherwise unavailable.

   In practice, this has not appeared as a serious concern, because any
   reasonable aging policy successfully moderates database growth.  It
   is nevertheless identified here as a consideration as there may be
   implementations in some environments where this is indeed an issue.

10.  References

10.1.  Normative References

   [EMAIL-ARCH]
              Crocker, D., "Internet Mail Architecture", RFC 5598,
              October 2008.

   [KEYWORDS]
              Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [SMTP]     Klensin, J., "Simple Mail Transfer Protocol", RFC 5321,
              October 2008.

   [SUBMISSION]
              Gellens, R. and J. Klensin, "Message Submission for Mail",
              RFC 6409, November 2011.

10.2.  Informative References

   [CIDR]     Fuller, V. and T. Li, "Classless Inter-domain Routing
              (CIDR): The Internet Address Assignment and Aggregation
              Plan", RFC 4632, August 2006.

   [DHCP]     Droms, R., "Dynamic Host Configuration Protocol",
              RFC 2131, March 1997.

   [DNS]      Mockapetris, P., "Domain names - implementation and
              specification", STD 13, RFC 1035, November 1987.

   [DNSBL]    Levine, J., "DNS Blacklists and Whitelists", RFC 5782,
              February 2010.

   [MAIL]     Resnick, P., Ed., "Internet Message Format", RFC 5322,
              October 2008.

   [NAT]      Srisuresh, P. and K. Egevang, "Traditional IP Network
              Address Translator (Traditional NAT)", RFC 3022,
              January 2001.

   [PUREMAGIC]
              Harris, E., "The Next Step in the Spam Control War:
              Greylisting", August 2003.

   [SAUCE]    Jackson, I., "GNU SAUCE", 2001.

Appendix A.  Acknowledgments

   The authors wish to acknowledge Mike Adkins, Steve Atkins, Mihai
   Costea, Dave Crocker, Derek Diget, Peter J. Holzer, John Levine,
   Chris Lewis, Jose-Marcio Martins da Cruz, John Klensin, S. Moonesamy,
   Suresh Ramasubramanian, Mark Risher, Jordan Rosenwald, Gregory
   Shapiro, Joe Sniderman, Roland Turner, and Michael Wise for their
   contributions to this memo.  The various participants of the MAAWG
   Open Sessions about greylisting were also valued contributors.

Authors' Addresses

   Murray S. Kucherawy
   Cloudmark
   128 King St., 2nd Floor
   San Francisco, CA 94107
   US

   Phone: +1 415 946 3800
   Email: msk@cloudmark.com

   D. Crocker
   Brandenburg InternetWorking
   675 Spruce Dr.
   Sunnyvale 94086
   USA

   Phone: +1.408.246.8253
   Email: dcrocker@bbiw.net
   URI:   http://bbiw.net