Re: IMAP extensions needed for SPAM/HAM and WHITE/BLACK listing

Dave Cridland <dave@cridland.net> Mon, 06 July 2009 08:25 UTC

Subject: Re: IMAP extensions needed for SPAM/HAM and WHITE/BLACK listing
References: <371C2EEC-8CBD-4AC1-B07A-928352E19DCB@pobox.com> <F9D917EB-A09B-48BB-AA30-F31EADEDD7DD@muada.com>
In-Reply-To: <F9D917EB-A09B-48BB-AA30-F31EADEDD7DD@muada.com>
MIME-Version: 1.0
Message-Id: <28941.1246868682.560969@puncture>
Date: Mon, 06 Jul 2009 09:24:42 +0100
From: Dave Cridland <dave@cridland.net>
To: Iljitsch van Beijnum <iljitsch@muada.com>, ietf-imapext@imc.org, general discussion of application-layer protocols <discuss@ietf.org>, George Michaelson <ggm@pobox.com>

On Sun Jul  5 20:03:30 2009, Iljitsch van Beijnum wrote:
> On 3 jul 2009, at 1:05, George Michaelson wrote:
> 
>> As an example, at the moment if I wish to inform google that I
>> have known spam in local folders, I have to go to the web
>> interface and manually tag. If there was an IMAP extension, I
>> could review my local Bayesian junk folder, remove all non-spam
>> (and flag the senders as white-listed if need be), and request
>> the rest to be flagged as spam back on the IMAP-backed MS.
> 
> Doesn't moving the spam messages to the spam folder accomplish this
> already? That's what my client does.
> 
> 
There's also a proposal (expired?) to use keywords to signal  
spamminess.

> But if you want this, I'd say that it needs to be a fractional
> thing, not a binary spam/no spam indication. For instance, the
> server could give something a spam score of 2 and the client also
> 2 and together that would be 4 so the message is presumed to be
> spam (assuming the spam threshold is 3), but in a binary system no
> spam OR no spam = no spam.

Of course, if both client and server use precisely the same criteria,  
you've simply halved your threshold.
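
To put numbers on that, here is a minimal sketch using the figures from
the quoted example (scores of 2, threshold of 3). The function names
are mine and purely illustrative; nothing here is a real filter's API.

```python
def is_spam_single(score, threshold):
    # One filter, one score, one threshold.
    return score >= threshold

def is_spam_summed(server_score, client_score, threshold):
    # Naive combination: add both scores, compare to the same threshold.
    return server_score + client_score >= threshold

# If both filters apply identical criteria they produce the same score s,
# so "s + s >= t" is just "s >= t / 2".
score, threshold = 2, 3
summed = is_spam_summed(score, score, threshold)
halved = is_spam_single(score, threshold / 2)
alone = is_spam_single(score, threshold)
```

Here `summed` and `halved` always agree, while `alone` says the message
is not spam: the second filter added no information, only a lower bar.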

There are two cases:

1) The server has some feedback-based spam detection mechanism, like
a Bayesian filter. You want to teach the server's filter about your
explicit decisions.

2) Your client has a spam filter (of some sort), and you want the
server to tell your client about its preliminary findings.

The notion of using two spam filters in concert to reach an overall
decision is basically flawed, because it really falls into one or
other of the cases above. If you do go to the effort of having a
range-based spamminess score from one filter, you'd have to use it as
mere input into the other's decision process, since combining the two
naïvely would produce poorly weighted results.
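
A rough sketch of what "mere input" might look like, as opposed to a
naive sum. The weight is hypothetical; a real client would have to tune
or learn it, and a real filter is unlikely to be a plain linear sum.

```python
def naive_total(server_score, client_score):
    # Both scores count in full, even if they measure the same evidence.
    return server_score + client_score

def client_total(client_score, server_score, server_weight):
    # The client's filter owns the decision. The server's score is one
    # more weighted input; server_weight is a hypothetical tuning knob,
    # with 0.0 meaning "ignore the server entirely".
    return client_score + server_weight * server_score

threshold = 3
naive_says_spam = naive_total(2, 2) >= threshold
weighted_says_spam = client_total(2, 2, server_weight=0.25) >= threshold
```

With a weight of 0.25 the total is 2.5, which stays under the threshold
of 3, whereas the naive sum of 4 crosses it.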

A simple keyword approach isn't quite as good as ranges, but it does
have the useful property that nearly every service already supports
arbitrary keywords, and so is quite likely to support this already.
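
As a sketch of how little the keyword approach needs on the wire: the
client checks that the mailbox's PERMANENTFLAGS response code includes
\* (new keywords may be created), then issues an ordinary STORE. The
keyword names $Junk and $NotJunk are an assumption on my part, not
something defined in this thread, and the helper names are made up for
illustration.

```python
def supports_arbitrary_keywords(permanentflags_line):
    # Looks for \* inside the parenthesised PERMANENTFLAGS list, which
    # tells the client it may create new keywords in this mailbox.
    start = permanentflags_line.index("(")
    end = permanentflags_line.index(")", start)
    flags = permanentflags_line[start + 1:end].split()
    return "\\*" in flags

def store_keyword_command(tag, uid_set, keyword, add=True):
    # Builds a UID STORE command that adds or removes a single keyword.
    # .SILENT asks the server not to echo the updated flags back.
    sign = "+" if add else "-"
    return f"{tag} UID STORE {uid_set} {sign}FLAGS.SILENT ({keyword})"

line = r"* OK [PERMANENTFLAGS (\Deleted \Seen \*)] Limited"
ok = supports_arbitrary_keywords(line)
mark_junk = store_keyword_command("a1", "4:7", "$Junk")
unmark_junk = store_keyword_command("a2", "5", "$Junk", add=False)
```

This gives `a1 UID STORE 4:7 +FLAGS.SILENT ($Junk)` for the first
command, which any IMAP4rev1 server accepting new keywords should take
without needing an extension.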

Ranges, on the other hand, require the use of annotations, which is a  
rather more complex area.

Dave.
-- 
Dave Cridland - mailto:dave@cridland.net - xmpp:dwd@dave.cridland.net
  - acap://acap.dave.cridland.net/byowner/user/dwd/bookmarks/
  - http://dave.cridland.net/
Infotrope Polymer - ACAP, IMAP, ESMTP, and Lemonade