idnits 2.17.1 

draft-ietf-appsawg-greylisting-09.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD',
     or 'RECOMMENDED' is not an accepted usage according to RFC 2119.  Please
     use uppercase 'NOT' together with RFC 2119 keywords (if that is what you
     mean).
     
     Found 'SHOULD not' in this paragraph:
     
     7.  Greylisting SHOULD NOT be applied by an ADMD's submission
     service (see [SUBMISSION]) for authenticated client hosts.  It also
     SHOULD not be applied against any authenticated ADMD session.
     Authentication can include whatever mechanisms are deemed appropriate for
     the ADMD, such as known internal IP addresses, protocol-level client
     authentication, or the like.

  -- The document date (April 26, 2012) is 4377 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  ** Downref: Normative reference to an Informational RFC: RFC 5598 (ref.
     'EMAIL-ARCH')


     Summary: 1 error (**), 0 flaws (~~), 2 warnings (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Individual submission                                       M. Kucherawy
3	Internet-Draft                                                 Cloudmark
4	Intended status: Standards Track                              D. Crocker
5	Expires: October 28, 2012                    Brandenburg InternetWorking
6	                                                          April 26, 2012

8	         Email Greylisting: An Applicability Statement for SMTP
9	                   draft-ietf-appsawg-greylisting-09

11	Abstract

13	   This document describes the art of email greylisting, the practice of
14	   providing temporarily degraded service to unknown email clients as an
15	   anti-abuse mechanism.

17	   Greylisting is an established mechanism deemed essential to the
18	   repertoire of current anti-abuse email filtering systems.

20	Status of this Memo

22	   This Internet-Draft is submitted in full conformance with the
23	   provisions of BCP 78 and BCP 79.

25	   Internet-Drafts are working documents of the Internet Engineering
26	   Task Force (IETF).  Note that other groups may also distribute
27	   working documents as Internet-Drafts.  The list of current Internet-
28	   Drafts is at http://datatracker.ietf.org/drafts/current/.

30	   Internet-Drafts are draft documents valid for a maximum of six months
31	   and may be updated, replaced, or obsoleted by other documents at any
32	   time.  It is inappropriate to use Internet-Drafts as reference
33	   material or to cite them other than as "work in progress."

35	   This Internet-Draft will expire on October 28, 2012.

37	Copyright Notice

39	   Copyright (c) 2012 IETF Trust and the persons identified as the
40	   document authors.  All rights reserved.

42	   This document is subject to BCP 78 and the IETF Trust's Legal
43	   Provisions Relating to IETF Documents
44	   (http://trustee.ietf.org/license-info) in effect on the date of
45	   publication of this document.  Please review these documents
46	   carefully, as they describe your rights and restrictions with respect
47	   to this document.  Code Components extracted from this document must
48	   include Simplified BSD License text as described in Section 4.e of
49	   the Trust Legal Provisions and are provided without warranty as
50	   described in the Simplified BSD License.

52	Table of Contents

54	   1.  Introduction
55	     1.1.  Background
56	     1.2.  Definitions
57	   2.  Types of Greylisting
58	     2.1.  Connection-Level Greylisting
59	     2.2.  SMTP HELO/EHLO Greylisting
60	     2.3.  SMTP MAIL Greylisting
61	     2.4.  SMTP RCPT Greylisting
62	     2.5.  SMTP DATA Greylisting
63	     2.6.  Additional Heuristics
64	     2.7.  Exceptions
65	   3.  Benefits and Costs
66	   4.  Unintended Consequences
67	     4.1.  Unintended Mail Delivery Failures
68	     4.2.  Unintended SMTP Client Failures
69	     4.3.  Address Space Saturation
70	   5.  Recommendations
71	   6.  Measuring Effectiveness
72	   7.  IPv6 Applicability
73	   8.  IANA Considerations
74	   9.  Security Considerations
75	     9.1.  Tradeoffs
76	     9.2.  Database
77	   10. References
78	     10.1. Normative References
79	     10.2. Informative References
80	   Appendix A.  Acknowledgments
81	   Authors' Addresses

83	1.  Introduction

85	   Preferred techniques for handling email abuse explicitly identify
86	   good actors and bad actors, giving each significantly different
87	   qualities of service.  In some cases an actor does not have a known
88	   reputation; this can justify providing degraded service, until there
89	   is a basis for providing better service.  This latter approach is
90	   known as "greylisting".  Broadly, the term refers to any degradation
91	   of service for an unknown or suspect source, over a period of time
92	   (typically measured in minutes or a small number of hours).  The
93	   narrow use of the term refers to generation of an SMTP temporary
94	   failure reply code for traffic from such sources.  There are diverse
95	   implementations of this basic concept, and, predictably therefore,
96	   some blurred terminology.

98	   Absent a perfect abuse detection mechanism that incurs no cost, the
99	   current requirement is for an array of techniques to be used by each
100	   filtering system.  They vary in cost, in effectiveness, and in the
101	   types of abuse techniques they target.

103	   Greylisting happens to be a technique that is cheap and early (in
104	   terms of its application in the SMTP sequence) and surprisingly
105	   remains useful.  Some spamware does indeed route around this
106	   technique, but much does not.

108	   The firehose of spam over the Internet represents a wide range of
109	   sophistication.  Greylisting is useful for removing a large amount of
110	   simplistic-but-significant traffic.

112	   This memo documents common greylisting techniques and discusses their
113	   benefits and costs.  It also defines terminology to enable clear
114	   distinction and discussion of these techniques.

116	   There is some confusion in industry that conflates greylisting with
117	   an SMTP temporary failure for any reason.  The purpose of this memo
118	   is also to dispel such confusion.

120	1.1.  Background

122	   For many years, large amounts of spam have been sent through purpose-
123	   built software, or "spamware", that supports only a constrained
124	   version of SMTP.  In particular, such software does not perform
125	   retransmission attempts after receiving an SMTP temporary failure.
126	   That is, if the spamware cannot deliver a message, it just goes on to
127	   the next address in its list since, in spamming, volume counts for
128	   far more than reliability.  Greylisting exploits this by rejecting
129	   mail from unfamiliar sources with a "transient (soft) fail" (4xx)
130	   [SMTP] error code.  Another application of greylisting is to delay
131	   mail from newly seen IP addresses on the theory that, if it's a spam
132	   source, then by the time it retries, it will appear in a list of
133	   sources to be filtered, and the mail will not be accepted.

135	   Early references for greylisting descriptions and implementations can
136	   be found at [SAUCE] and [PUREMAGIC].

138	1.2.  Definitions

140	1.2.1.  Keywords

142	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
143	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
144	   document are to be interpreted as described in [KEYWORDS].

146	1.2.2.  E-Mail Architecture Terminology

148	   Readers need to be familiar with the material and terminology
149	   discussed in [MAIL] and [EMAIL-ARCH].

151	2.  Types of Greylisting

153	   Greylisting is primarily performed at some phase during an SMTP
154	   session.  A set of attributes about the client-side SMTP server are
155	   used for assessing whether to perform greylisting.  At its simplest,
156	   the attribute is the IP address of the client and the assessment is
157	   whether it has connected recently.  More elaborate attribute
158	   combinations and more sophisticated assessments can also be
159	   used.  The following discussion covers the most common
160	   combinations.

162	2.1.  Connection-Level Greylisting

164	   Connection-level greylisting decides whether to accept the (TCP)
165	   connection from a "new" [SMTP] client.  At this point in the
166	   communication between the client and the server, the only information
167	   known to the receiving server is the incoming IP address.  This, of
168	   course, is often (but not always) translatable into a host name.

170	   The typical application of greylisting here is to keep a record of
171	   SMTP client IP addresses and/or host names (collectively, "sources")
172	   that have been seen.  Such a database acts as a cache of known
173	   senders and might or might not expire records after some period.  If
174	   the source is not in the database, or the record of the source has
175	   not reached some required minimum age (such as 30 minutes since the
176	   initial connection attempt), the server does one of the following,
177	   inviting a later retry:

179	   o  returns a 421 SMTP reply, and closes the connection;

181	   o  returns a different 4yz SMTP reply to all further commands in this
182	      SMTP session.
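
   As an illustration only, the record-and-minimum-age check described
   above might be sketched in Python as follows.  The class name, the
   in-memory dictionary, and the injectable clock are assumptions made
   for this sketch and are not taken from any particular implementation;
   a production server would use persistent storage and would likely
   expire records.

```python
import time

MIN_AGE_SECONDS = 30 * 60  # the example minimum age from the text (30 minutes)


class ConnectionGreylist:
    """Hypothetical in-memory record of sources and when each was first seen."""

    def __init__(self, min_age=MIN_AGE_SECONDS, clock=time.time):
        self.min_age = min_age
        self.clock = clock      # injectable so the sketch can be tested
        self.first_seen = {}    # source -> time of initial connection attempt

    def check(self, source):
        """Return 421 to defer the client, or 220 to greet it normally."""
        now = self.clock()
        # Record the source on first contact; keep the original timestamp after.
        seen = self.first_seen.setdefault(source, now)
        if now - seen < self.min_age:
            return 421  # unknown or too recent: invite a later retry
        return 220      # record has reached the minimum age
```

   Here a source is deferred on its first connection and on any retry
   made before the minimum age is reached, and is accepted afterwards.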

184	   A useful variant of the basic known/unknown policy is to limit
185	   greylisting to those addresses that are on some list of IP addresses
186	   known to be affiliated with bad actors.  Whereas the simpler policy
187	   affects all new connections, including those from good actors, the
188	   constrained policy applies greylisting actions only to sites that
189	   already have a negative reputation.

191	2.2.  SMTP HELO/EHLO Greylisting

193	   HELO/EHLO greylisting refers to the first command verb in an SMTP
194	   session, HELO or EHLO.  It includes a single, required parameter that
195	   is supposed to contain the client's fully-qualified host name or its
196	   literal IP address.

198	   Greylisting implemented at this phase retains a record of sources
199	   coupled with HELO/EHLO parameters.  It returns 4yz SMTP replies to
200	   all commands until the end of the SMTP session if that tuple has not
201	   previously been recorded or if the record exists but has not reached
202	   some configured minimum age.

204	2.3.  SMTP MAIL Greylisting

206	   MAIL command greylisting refers to the command verb in an SMTP
207	   session that initiates a new transaction, MAIL.  It includes at least
208	   one required parameter that indicates the return email address
209	   (RFC5321.MailFrom) of the message being relayed from the client to
210	   the server.

212	   Greylisting implemented at this phase retains a record of sources
213	   coupled with return email addresses.  It returns 4yz SMTP replies to
214	   all commands for the remainder of the SMTP session if that tuple has
215	   not previously been recorded or if the record exists but has not met
216	   some configured minimum age.

218	2.4.  SMTP RCPT Greylisting

220	   RCPT greylisting refers to the command verb in an SMTP session that
221	   specifies intended recipients of an email transaction, RCPT.  It
222	   includes at least one required parameter that indicates the email
223	   address of an intended recipient of the message being relayed from
224	   the client to the server.

226	   Greylisting implemented at this phase retains a record of tuples that
227	   combine the provided recipient address with any combination of the
228	   following:

230	   o  the source, as described above;

232	   o  the return email address;

234	   o  the other recipient addresses of the message (if any).

236	   If the selected tuple is not found in the database, or if the record
237	   is present but has not reached some configured minimum age, the
238	   greylisting Mail Transfer Agent (MTA) [EMAIL-ARCH] returns 4yz SMTP
239	   replies to all commands for the remainder of the SMTP session.

241	   Note that often a match on a tuple involving the first valid RCPT is
242	   sufficient to identify a retry correctly, and further checks can be
243	   omitted.

245	2.5.  SMTP DATA Greylisting

247	   DATA greylisting refers to the command verb in an SMTP session that
248	   transmits the actual message content, DATA, as opposed to its
249	   envelope details (see [MAIL]).

251	   This type of greylisting can be performed at two places in the SMTP
252	   sequence:

254	   1.  on receipt of the DATA command, because at that point the entire
255	       envelope has been received (i.e., all MAIL and RCPT commands have
256	       been issued);

258	   2.  on completion of the DATA command, i.e., after the "." that
259	       terminates transmission of the message body, since at that point
260	       a digest or other analysis of the message could be performed.

262	   Some implementations do filtering here because there are clients that
263	   don't bother checking SMTP reply codes to commands other than DATA.
264	   Hence, it can be useful to add greylisting capability at that point
265	   in an SMTP session.

267	   Numerous greylisting policies are possible at this point.  All of
268	   them retain a record of tuples that combine the various parts of the
269	   SMTP transaction in some combination, including:

271	   o  the source, as described above;

273	   o  the return email address;
274	   o  the recipients of the message, as a set or individually;

276	   o  identifiers in the message header, such as the contents of the
277	      RFC5322.From or RFC5322.To fields;

279	   o  other prominent parts of the content, such as the RFC5322.Subject
280	      field;

282	   o  a digest of some or all of the message content, as a test for
283	      uniqueness;

285	   o  analysis of arbitrary portions of the message body.

287	   (The last four items in that list are only possible at the end of
288	   DATA, not on receipt of the DATA command.)

290	   If the selected tuple is not found in the database, or if the record
291	   exists but has not reached some configured minimum age, the
292	   greylisting MTA returns 4yz SMTP replies to all commands for the
293	   remainder of the SMTP session.
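
   A minimal sketch of one possible post-DATA tuple, in Python, is shown
   below.  The function name, the choice of SHA-256 as the digest, and
   the particular combination of fields are assumptions for
   illustration; the text above leaves all of these to the implementer.

```python
import hashlib


def data_key(source, mail_from, rcpts, body):
    """Build a hashable greylisting tuple once the message body is known.

    `body` is the raw message content as bytes.  The recipients are kept
    as a set so that their order does not affect the comparison.
    """
    digest = hashlib.sha256(body).hexdigest()  # digest as a uniqueness test
    return (source, mail_from, frozenset(rcpts), digest)
```

   With such a key, a retry that presents identical content matches the
   stored record, while any change to the content yields a different
   tuple and is treated as a first attempt.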

295	2.6.  Additional Heuristics

297	   Since greylisting seeks to target spam senders, it follows that being
298	   able to identify spamware within the SMTP context beyond the simple
299	   notion of "not seen before" would be desirable.  A more targeted
300	   approach might also include in its selection such heuristics as:

302	   o  if a [DNSBL] lists an IP address but the implementer wishes to be
303	      cautious with mitigation actions rather than blocking traffic from
304	      the IP address outright, then subject it to greylisting;

306	   o  if the value found in a PTR record follows common naming patterns
307	      for dynamic IP addresses, then subject it to greylisting.
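
   The two heuristics above could be combined roughly as in the
   following Python sketch.  The regular expressions are illustrative
   guesses at "common naming patterns" and are not drawn from this
   document or from any published list; the DNSBL lookup itself is
   assumed to have been done elsewhere and is passed in as a boolean.

```python
import re

# Hypothetical patterns for PTR names that suggest dynamic addressing.
DYNAMIC_PTR_PATTERNS = [
    # Labels beginning with a typical access-network keyword.
    re.compile(r"(^|[.-])(dyn|dynamic|dhcp|dsl|adsl|pool|ppp|dialup)([.-]|\d)",
               re.IGNORECASE),
    # Four numeric groups embedded in the name, as in "host-192-0-2-15".
    re.compile(r"\d{1,3}[.-]\d{1,3}[.-]\d{1,3}[.-]\d{1,3}"),
]


def looks_dynamic(ptr_name):
    """Return True if the PTR value matches any of the assumed patterns."""
    return any(p.search(ptr_name) for p in DYNAMIC_PTR_PATTERNS)


def should_greylist(ptr_name, dnsbl_listed):
    """Select a client for greylisting if either heuristic applies."""
    return dnsbl_listed or looks_dynamic(ptr_name)
```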

309	2.7.  Exceptions

311	   Most greylisting systems provide for an exception mechanism, allowing
312	   one to specify IP addresses, IP address [CIDR] blocks, hostnames or
313	   domain names that are exempt from greylisting checks and thus whose
314	   SMTP client sessions are not subject to such interference.

316	   Likely candidates to be excepted from greylisting include those known
317	   not to retry according to a pattern that will be observed as
318	   legitimate, and those that send so rarely that they will age out of
319	   the database.  In both cases the excepted source is known not to be
320	   an abusive one by the site implementing greylisting.  Otherwise,
321	   typical non-abusive senders will enter the exception list on the
322	   first proper retry, and remain there permanently.

324	   One could also use a [DNSBL] that lists known good hosts as a
325	   greylisting exception set.
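
   An exception mechanism of the kind described in this section might
   look like the following Python sketch, which uses the standard
   "ipaddress" module.  The class and method names are hypothetical, and
   matching a hostname against a domain by suffix is an assumption about
   how "domain names" would be interpreted.

```python
import ipaddress


class ExceptionList:
    """Hypothetical set of sources that bypass greylisting checks."""

    def __init__(self, networks=(), domains=()):
        # A bare address such as "198.51.100.7" becomes a single-host network.
        self.networks = [ipaddress.ip_network(n) for n in networks]
        self.domains = [d.lower().rstrip(".") for d in domains]

    def is_exempt(self, ip, hostname=None):
        """Return True if the client address or host name is excepted."""
        addr = ipaddress.ip_address(ip)
        if any(addr in net for net in self.networks):
            return True
        if hostname:
            name = hostname.lower().rstrip(".")
            # Match the domain itself or any host beneath it.
            return any(name == d or name.endswith("." + d)
                       for d in self.domains)
        return False
```

   The check would be made before any greylisting lookup, so that an
   excepted client never receives a temporary failure from this
   mechanism.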

327	3.  Benefits and Costs

329	   The most obvious benefit with any of the above techniques is that
330	   spamware generally does not retry, and is therefore less likely to
331	   succeed, absent a record of a previous delivery attempt.

333	   The most obvious detriment to implementing greylisting is the
334	   imposition of delay on legitimate mail.  Some popular MTAs do not
335	   retry failed delivery attempts for an hour or more, which can cause
336	   expensive delays when delivery of mail is time-critical.  Worse, some
337	   legitimate MTAs do not retry at all.  (Note however that non-retrying
338	   clients are not fully SMTP-capable, per Section 2.1 of [SMTP].  A
339	   client does not know, nor is it entitled to know, the reason for the
340	   temporary failure status code being returned; greylisting could be in
341	   effect, or it could be caused by a local resource issue at the
342	   server.  A client therefore needs to be equipped to retry in order to
343	   be considered fully capable.)

345	   The counterargument to this "false positive" problem is that email
346	   has always been a "best-effort" mechanism, and thus this cost is
347	   ultimately low in comparison to the cost of dealing with high volumes
348	   of unwanted mail.  Still, the actual effect of such delays can be
349	   significant, such as altering the tone or flow of a multi-participant
350	   discussion to a mailing list.

352	   The cache of information stored about SMTP client history does not
353	   help legitimate clients that are already listed for acceptance
354	   once those clients undergo any kind of reconfiguration, such as
355	   network renumbering.  To the greylisting implementation, such
356	   clients are once again unknown, and they will once again be
357	   subjected to the delay.

359	   Another obvious cost is for the required database.  It has to be
360	   large enough to keep the necessary history and fast enough to avoid
361	   excessive inefficiencies in the server's operations.  The primary
362	   consideration is the maximum age of records in the database.  If
363	   records age out too soon, then hosts that do retry per [SMTP] will be
364	   periodically subjected to greylisting even though they are well-
365	   behaved; if records age out after too long a period, then eventually
366	   spamware that launches a new campaign will not be identified as
367	   "unknown" in this manner, and will not be required to retry.

369	   Presuming that known friendly senders will be manually configured as
370	   exceptions to the greylisting check, a steady state will eventually
371	   be reached wherein the only mail that is delayed is mail from an IP
372	   address that has never sent mail before.  Experience suggests that
373	   the vast majority of mail comes from places on a developed exception
374	   list, so after a training period, only a small proportion of mail is
375	   actually affected.  The training period could be replaced by
376	   processing a history of email traffic and adding the IP addresses
377	   from which most traffic arrives to the exception list.

379	   Applying greylisting based on actual message content (i.e., post-
380	   DATA) is substantially more expensive than any of the other
381	   alternatives both in terms of the resources required to accept and
382	   temporarily store a complete message body (which can be quite
383	   substantial) and any processing that is done on that content.  As a
384	   consequence, such methods incur more cost during the session and thus
385	   are not a typical practice.

387	4.  Unintended Consequences

389	4.1.  Unintended Mail Delivery Failures

391	   There are a few failure modes of greylisting that are worth
392	   considering.  For example, consider an email message intended for
393	   user@example.com.  The example.com domain is served by two receiving
394	   mail servers, one called mail1.example.com and one called
395	   mail2.example.com.  On the first delivery attempt, mail1.example.com
396	   greylists the client, and thus the client places the message in its
397	   outgoing queue for later retry.  Later, when a retry is attempted,
398	   mail2.example.com is selected for the delivery, either because
399	   mail1.example.com is unavailable or because a round-robin [DNS]
400	   evaluation produces that result.  However, the two example.com hosts
401	   do not share greylisting databases, so the second host again denies
402	   the attempt.  Thus, although example.com has sought to improve its
403	   email throughput by having two servers, it has in fact amplified the
404	   problem of legitimate mail delay introduced by greylisting.
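
   The two-server scenario above can be reproduced with a small Python
   sketch.  The function name, the five-minute minimum age, and the
   addresses are assumptions chosen for illustration; each dictionary
   stands in for one server's greylisting database.

```python
def attempt(db, key, now, min_age=300):
    """Record `key` in `db` on first sight; accept once it is old enough."""
    first = db.setdefault(key, now)
    return now - first >= min_age  # True means the delivery is accepted


key = ("192.0.2.1", "alice@example.net", "user@example.com")

# Separate databases: the retry reaches mail2, which has no record of it.
mail1_db, mail2_db = {}, {}
separate = [attempt(mail1_db, key, 0), attempt(mail2_db, key, 900)]

# Shared database: the retry is recognized whichever server receives it.
shared_db = {}
shared = [attempt(shared_db, key, 0), attempt(shared_db, key, 900)]
```

   In this sketch the message is refused twice when the databases are
   separate, and is accepted on the retry when the database is shared.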

406	   Similarly, consider a site with multiple outbound MTAs that share a
407	   common queue.  On a first outbound delivery attempt to example.com,
408	   the attempt is greylisted.  On a later retry, a different outbound
409	   MTA is selected, which means example.com sees a different source, and
410	   once again greylisting occurs on the same message.  The same effect
411	   can result from the use of [DHCP], where the IP address of an
412	   outbound MTA changes between attempts.

414	   For systems that do DATA-level greylisting, if any part of the
415	   message has changed since the first attempt, the tuple constructed
416	   might be different than the one for the first attempt, and the
417	   delivery is again greylisted.  Some MTAs do reformulate portions of
418	   the message at submission time and this can produce visible
419	   differences for each attempt.

421	   A host that sends mail to a particular destination infrequently might
422	   not remain "known" in the receiving server's database and will
423	   therefore be greylisted for a high percentage of mail despite
424	   possibly being a legitimate sender.

426	   All of these and other similar cases can cause greylisting to be
427	   applied improperly to legitimate MTAs multiple times, leading to long
428	   delays in delivery or ultimately the return of the message to its
429	   sender.  Other side effects include out-of-order delivery of related
430	   sequenced messages.

432	   Address translation technologies such as [NAT] cause distinct MTAs to
433	   appear to come from a common IP address.  This can cause greylisting
434	   to be applied only to the first connection attempt from the shared IP
435	   address, meaning future MTAs connecting for the first time will be
436	   exempted from the protection greylisting provides.

438	4.2.  Unintended SMTP Client Failures

440	   Atypical SMTP client behaviours also need to be considered when
441	   deploying greylisting.

443	   Some clients do not retry messages for very long periods.  Popular
444	   open source MTAs implement increasing backoff times when messages
445	   receive temporary failure messages and/or degrade queue priority for
446	   very large messages.  This means greylisting introduces even more
447	   delay for MTAs implementing such schemes, and the delay can become
448	   large enough to become a nuisance to users.

450	   Some clients do not retry messages at all, in violation of [SMTP].
451	   This means greylisting will cause outright delivery failure right
452	   away for sources, envelopes, or messages that it has not seen before,
453	   regardless of the client attempting the delivery, essentially
454	   treating legitimate mail and spam the same.

456	   If a greylisting scheme requires a database record to have reached a
457	   certain age rather than merely testing for the presence of the record
458	   in the database, and the client has a retry schedule that is too
459	   aggressive, the client could be subjected to rate limiting by the MTA
460	   independent of the restrictions imposed by greylisting.

462	   Some SMTP implementations make the error of treating all error codes
463	   as fatal, contrary to [SMTP]; that is, a 4yz response is treated as
464	   if it were a 5yz response, and the message is returned to the sender
465	   as undeliverable.  This can result in such things as inadvertent
466	   removal from mailing lists in response to the perceived rejections.

468	   Some clients encode message-specific details in the address parameter
469	   to the [SMTP] MAIL command.  If doing so causes the parameter to
470	   change between retry attempts, a greylisting implementation could see
471	   it as a new delivery rather than a retry, and disallow the delivery.
472	   In such cases, the mail will never be delivered, and will be returned
473	   to the sender after the retry timeout expires.

475	   A client subjected to greylisting might move to the next host found
476	   in the ordered [DNS] MX record set for the destination domain and re-
477	   attempt delivery.  This has several considerations of its own:

479	   o  An increase in traffic to those alternate servers merely as a
480	      result of greylisting.

482	   o  Alternate (MX) servers SHOULD share the same greylisting database.
483	      When they do not -- as is often true when the servers occupy
484	      different Administrative Management Domains (ADMDs) -- SMTP
485	      clients can see variable treatment if they try to send to
486	      different MX hosts.

488	   o  When alternate MX servers relay mail back to the "primary" MX
489	      server, the latter SHOULD be configured to permit the other
490	      servers to relay mail without being subjected to greylisting.

492	   There are some applications that connect to an SMTP server and
493	   simulate a transaction up to the point of sending the RCPT command in
494	   an attempt to confirm that an address is valid.  Some of these are
495	   legitimate applications (e.g., mailing list servers) and others are
496	   automated programs that attempt to ascertain valid addresses to which
497	   to send spam (a "directory harvesting" attack).  Greylisting can
498	   interfere with both instances, with harmful effects on the former.

500	4.3.  Address Space Saturation

502	   Greylisting is obviously not a fool-proof solution to avoiding
503	   abusive traffic.  Bad actors that send mail with just enough
504	   frequency to avoid having their records expire will never be caught
505	   by this mechanism after the first instance.

507	   Where this is a concern, combining greylisting with some form of
508	   reputation service that estimates the likely behaviour for IP
509	   addresses that are not intercepted by the greylisting function would
510	   be a good choice.

512	5.  Recommendations

514	   The following practices are RECOMMENDED based on collected
515	   experience:

517	   1.  Implement greylisting based on a tuple consisting of (IP address,
518	       RFC5321.MailFrom, and the first RFC5321.RcptTo).  It has proven
519	       sufficient to use only the first RFC5321.RcptTo as legitimate
520	       MTAs appear not to reorder recipients between retries.  Including
521	       RFC5321.MailFrom improves accuracy where the IP address is being
522	       matched in clusters (e.g., CIDR blocks) rather than precisely
523	       (see below).  After a successful retry, allow all further [SMTP]
524	       traffic from the IP address in that tuple regardless of envelope
525	       information.

527	   2.  Include a configurable time window within which a retry from a
528	       greylisted host is considered, and ignored otherwise.  The time
529	       window needs to be configured to contain typical retry times of
530	       common MTA configurations, thus anticipating that a fully-capable
531	       MTA will retry sometime after the beginning of the window and
532	       before the end of it.  The default window SHOULD range from one
533	       minute to 24 hours.  Retries during the period of this window are
534	       permitted and satisfy the greylisting test, and thus the client
535	       is no longer likely to be a sender of spam; retries after the end
536	       of the window SHOULD be considered to be a new message for the
537	       purposes of greylisting evaluation (i.e., reset the "first seen"
538	       timestamp for that IP address).  Some sites use a higher time
539	       value for the low end of the window time to match common
540	       legitimate MTA retry timeouts, but additional benefit from doing
541	       so appears unlikely.

543	   3.  Include a timeout for database entries, after which records for
544	       IP addresses that have generated no recent traffic are deleted.
545	       This step is intended to re-enable greylisting for an IP address
546	       in the event that it has changed "owners", and will subject the
547	       client to another round of greylisting.  The default SHOULD be at
548	       least one week.

550	   4.  For an Administrative Management Domain (ADMD) all inbound border
551	       MTAs listed in the [DNS] SHOULD share a common greylisting
552	       database and common greylisting policies.  This handles sequences
553	       in which a client's retry goes to a different server after the
554	       first 4yz reply, and it lets all servers share the list of hosts
555	       that did retry successfully.

557	   5.  To accommodate those senders that have clusters of outgoing mail
558	       servers, a greylisting server MAY track CIDR blocks of a size of
559	       its own choosing, such as /24, rather than the full IPv4 address.

561	       (Note, however, that this heuristic will not work for clusters
562	       having machines on different networks.)  A similar grouping
563	       capability MAY be established based on the domain name of the
564	       mail server if one can be determined.

566	   6.  Include a manual override capability for adding specific IP
567	       addresses or network blocks that always bypass checks.  There are
568	       legitimate senders that simply don't respond well to greylisting
569	       for a variety of reasons, most of which do not conflict with
570	       [SMTP].  There are also some highly visible online entities such
571	       as email service providers that will be certain to retry, and
572	       thus those that are known SHOULD be allowed to bypass the filter.

574	   7.  Greylisting SHOULD NOT be applied by an ADMD's submission service
575	       (see [SUBMISSION]) for authenticated client hosts.  It also
576	       SHOULD NOT be applied against any authenticated ADMD session.
577	       Authentication can include whatever mechanisms are deemed
578	       appropriate for the ADMD, such as known internal IP addresses,
579	       protocol-level client authentication, or the like.
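
   The recommendations in the list above can be sketched roughly as
   follows.  This is an illustration only, not a reference
   implementation: the class name Greylist, the constant names, the
   in-memory dictionary, and the choice to key records on the client
   address alone are all assumptions made for this example.  The
   one-minute and 24-hour window bounds and the one-week record timeout
   are the defaults named in the text; the /24 grouping is the optional
   heuristic from item 5, and the sketch handles IPv4 clients only.

```python
import ipaddress
import time

MIN_DELAY = 60                 # low end of the retry window (one minute)
MAX_DELAY = 24 * 60 * 60       # high end of the retry window (24 hours)
ENTRY_TTL = 7 * 24 * 60 * 60   # idle-record timeout (one week)
PREFIX_LEN = 24                # optional /24 grouping from item 5


class Greylist:
    """Hypothetical in-memory greylist keyed on the client's /24 block."""

    def __init__(self, exempt=(), clock=time.time):
        # Item 6: manually configured networks that always bypass checks.
        self.exempt = [ipaddress.ip_network(n) for n in exempt]
        self.clock = clock
        self.records = {}  # key -> dict(first_seen, last_seen, passed)

    def _key(self, ip):
        return str(ipaddress.ip_network(f"{ip}/{PREFIX_LEN}", strict=False))

    def check(self, ip, authenticated=False):
        """Return "accept" or "defer" for a delivery attempt from ip."""
        addr = ipaddress.ip_address(ip)
        # Items 6 and 7: exempt networks and authenticated sessions.
        if authenticated or any(addr in net for net in self.exempt):
            return "accept"
        now = self.clock()
        key = self._key(ip)
        rec = self.records.get(key)
        # Item 3: forget clients that have been idle longer than the TTL.
        if rec is not None and now - rec["last_seen"] > ENTRY_TTL:
            rec = None
        if rec is None:
            self.records[key] = {"first_seen": now, "last_seen": now,
                                 "passed": False}
            return "defer"
        rec["last_seen"] = now
        if rec["passed"]:
            return "accept"
        age = now - rec["first_seen"]
        if age < MIN_DELAY:
            return "defer"           # retried too soon
        if age <= MAX_DELAY:
            rec["passed"] = True     # retried inside the window
            return "accept"
        rec["first_seen"] = now      # too late: treat as a new message
        return "defer"
```

   A deployment following item 4 would replace the dictionary with a
   store shared by all inbound border MTAs of the ADMD.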

581	   There is no specific recommendation as to the choice of 4yz
582	   code to be returned as a result of a greylisting delay.  Per [SMTP],
583	   however, the only two reasonable choices are 421 if the
584	   implementation wishes to terminate the connection immediately, and
585	   450 otherwise.  It is possible that some clients treat different 4yz
586	   codes differently, but no data are available on whether using 421
587	   versus some other 4yz code is particularly advantageous.

589	   There is also no specific recommendation as to the choice of text to
590	   include in the SMTP reply, if any.  Some implementers argue that
591	   indicating that greylisting is in effect can give spamware a hint as
592	   to when to try again for successful delivery, while others suspect
593	   that it won't matter to spamware and thus the more likely audience is
594	   legitimate senders seeking to understand why their mail is being
595	   delayed.
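
   As a small illustration of the two paragraphs above, the reply line
   could be built as shown here.  The function name and the default
   reply text are assumptions for this example; the memo recommends
   neither a particular code nor particular text.

```python
def greylist_reply(close_connection, text="Please try again later"):
    """Build a temporary-failure SMTP reply line for a greylisted client.

    421 is used when the server intends to close the connection
    immediately, 450 otherwise.  The text is illustrative only.
    """
    code = 421 if close_connection else 450
    return f"{code} {text}\r\n"
```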

597	6.  Measuring Effectiveness

599	   A few techniques are common when measuring the effectiveness of
600	   greylisting in a particular installation:

602	   o  Arrange to log the spam vs. legitimate determinations of messages
603	      and what the greylisting decision would have been if enabled; then
604	      determine whether there is a correlation (and, of course, whether
605	      too much legitimate email would also be affected);

607	   o  Continuing from the previous point, query the set of IP addresses
608	      subjected to greylisting in any popular [DNSBL] to see if there is
609	      a strong correlation.
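
   Both techniques can be sketched briefly.  In this example the
   function names, the log record shape (a pair of "would have been
   deferred" and "was spam" flags), and the zone name "dnsbl.example"
   are hypothetical.  The query name construction (reversed IPv4 octets
   followed by the list's zone) follows [DNSBL]; treating any successful
   address lookup as "listed" is a simplification.

```python
import ipaddress
import socket
from collections import Counter


def tally(log):
    """Count (would_defer, is_spam) pairs from shadow-mode log records."""
    return Counter((bool(d), bool(s)) for d, s in log)


def dnsbl_query_name(ip, zone):
    """Return the DNS name to query for an IPv4 address in a DNSBL zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip, zone):
    """True if the DNSBL zone returns an address record for ip."""
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        return False


def listed_fraction(ips, lookup):
    """Fraction of greylisted addresses for which lookup(ip) is true."""
    ips = list(ips)
    if not ips:
        return 0.0
    return sum(1 for ip in ips if lookup(ip)) / len(ips)
```

   For example, listed_fraction(deferred_ips, lambda ip: is_listed(ip,
   "dnsbl.example")) would estimate how strongly the deferred clients
   overlap with that list.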

611	7.  IPv6 Applicability

613	   The descriptions and recommendations presented in this memo are based
614	   on many years of experience with greylisting in the IPv4 Internet
615	   environment, and so they clearly pertain to IPv4 deployments only.

617	   The greater size of an IPv6 address seems likely to permit
618	   differences in behaviours by bad actors, and this could well mean
619	   needing to alter the details for applying greylisting; it might even
620	   negate any benefits in using greylisting at all.  At a minimum, it is
621	   likely to call for different specific choices for any greylisting
622	   algorithm variables.

624	   In addition, an obvious consideration is that the size of the
625	   database required to store records of all of the IP addresses seen
626	   will likely be substantially larger in the IPv6 environment.
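
   One way an implementation might bound the number of IPv6 records is
   to key them on a network prefix instead of the full address, in the
   same spirit as the CIDR grouping described for IPv4.  The sketch
   below is purely illustrative: the function name and the /64 default
   are assumptions, and this memo makes no recommendation about IPv6
   prefix lengths.

```python
import ipaddress


def greylist_key(ip, v4_prefix=32, v6_prefix=64):
    """Return the database key (a network in CIDR form) for a client."""
    addr = ipaddress.ip_address(ip)
    prefix = v4_prefix if addr.version == 4 else v6_prefix
    # strict=False masks off the host bits rather than raising an error.
    return str(ipaddress.ip_network(f"{addr}/{prefix}", strict=False))
```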

628	8.  IANA Considerations

630	   No actions are requested of IANA in this memo.

632	   [RFC Editor: Please remove this section prior to publication.]

634	9.  Security Considerations

636	   This section discusses potential security issues related to
637	   greylisting.

639	9.1.  Tradeoffs

641	   The discussion above highlights the fact that, although greylisting
642	   provides some obvious and valuable defenses, it can introduce
643	   unintentional and detrimental consequences for delivery of legitimate
644	   mail.  Where timely delivery of email is essential, especially for
645	   financial, transactional, or security related applications, the
646	   possible consequences of such systems need to be carefully
647	   considered.

649	   Specific sources can be exempted from greylisting, but of course that
650	   means they have elevated privilege in terms of access to the
651	   mailboxes on the greylisting system, and malefactors can seek to
652	   exploit this.

654	9.2.  Database

656	   The database that has to be maintained as part of any greylisting
657	   system will grow as the diversity of its SMTP clients' hosts grows,
658	   and its size in general also depends on the nature of the
659	   tuple stored about each delivery attempt.  Even with a record aging
660	   policy in place, such a database could grow large enough to interfere
661	   with the system hosting it, or at least to a point at which
662	   greylisting service is degraded.  Moreover, an attacker knowing which
663	   greylisting scheme is in use could rotate parameters of SMTP clients
664	   under its control, in an attempt to inflate the database to the point
665	   of denial-of-service.

667	   Implementers could consider configuring an appropriate failure policy
668	   so that something locally acceptable happens when the database is
669	   attacked or otherwise unavailable.
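
   A failure policy of this kind might look like the following sketch.
   The function name and the idea of passing the database check in as a
   callable are assumptions for this example; whether to accept or defer
   mail while the database is unavailable is a local decision.

```python
def check_with_fallback(check, ip, on_failure="accept"):
    """Run a greylisting check, applying a local policy if it fails.

    check is any callable that returns "accept" or "defer" and may
    raise when the database is overloaded or unreachable.
    """
    try:
        return check(ip)
    except Exception:
        # Database attacked or unavailable: apply the configured policy.
        return on_failure
```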

671	   In practice, this has not appeared as a serious concern, because any
672	   reasonable aging policy successfully moderates database growth.  It
673	   is nevertheless identified here as a consideration as there may be
674	   implementations in some environments where this is indeed an issue.

676	10.  References

678	10.1.  Normative References

680	   [EMAIL-ARCH]
681	              Crocker, D., "Internet Mail Architecture", RFC 5598,
682	              October 2008.

684	   [KEYWORDS]
685	              Bradner, S., "Key words for use in RFCs to Indicate
686	              Requirement Levels", BCP 14, RFC 2119, March 1997.

688	   [SMTP]     Klensin, J., "Simple Mail Transfer Protocol", RFC 5321,
689	              October 2008.

691	   [SUBMISSION]
692	              Gellens, R. and J. Klensin, "Message Submission for Mail",
693	              RFC 6409, November 2011.

695	10.2.  Informative References

697	   [CIDR]     Fuller, V. and T. Li, "Classless Inter-domain Routing
698	              (CIDR): The Internet Address Assignment and Aggregation
699	              Plan", RFC 4632, August 2006.

701	   [DHCP]     Droms, R., "Dynamic Host Configuration Protocol",
702	              RFC 2131, March 1997.

704	   [DNS]      Mockapetris, P., "Domain names - implementation and
705	              specification", STD 13, RFC 1035, November 1987.

707	   [DNSBL]    Levine, J., "DNS Blacklists and Whitelists", RFC 5782,
708	              February 2010.

710	   [MAIL]     Resnick, P., Ed., "Internet Message Format", RFC 5322,
711	              October 2008.

713	   [NAT]      Srisuresh, P. and K. Egevang, "Traditional IP Network
714	              Address Translator (Traditional NAT)", RFC 3022,
715	              January 2001.

717	   [PUREMAGIC]
718	              Harris, E., "The Next Step in the Spam Control War:
719	              Greylisting", August 2003, <http://projects.puremagic.com/
720	              greylisting/whitepaper.html>.

722	   [SAUCE]    Jackson, I., "GNU SAUCE", 2001,
723	              <http://www.gnu.org/software/sauce>.

725	Appendix A.  Acknowledgments

727	   The author wishes to acknowledge Mike Adkins, Steve Atkins, Mihai
728	   Costea, Dave Crocker, Derek Diget, Peter J. Holzer, John Levine,
729	   Chris Lewis, Jose-Marcio Martins da Cruz, John Klensin, S. Moonesamy,
730	   Suresh Ramasubramanian, Mark Risher, Jordan Rosenwald, Gregory
731	   Shapiro, Joe Sniderman, Roland Turner, and Michael Wise for their
732	   contributions to this memo.  The various participants of the MAAWG
733	   Open Sessions about greylisting were also valued contributors.

735	Authors' Addresses

737	   Murray S. Kucherawy
738	   Cloudmark
739	   128 King St., 2nd Floor
740	   San Francisco, CA  94107
741	   US

743	   Phone: +1 415 946 3800
744	   Email: msk@cloudmark.com

746	   D. Crocker
747	   Brandenburg InternetWorking
748	   675 Spruce Dr.
749	   Sunnyvale  94086
750	   USA

752	   Phone: +1.408.246.8253
753	   Email: dcrocker@bbiw.net
754	   URI:   http://bbiw.net