Re: [Asrg] 2a. Analysis - Spam filled with words
On Tue, 09 Sep 2003 00:08:12 -0400,
Yakov Shafranovich <research@solidmatrix.com>:
>I started getting weird spam samples in the last few
>days. The spam message consists of words, one after
>the other, with an image in the middle. Looks like
>another attempt to defeat the filters, here is a sample:
As Jose pointed out in his reply, these random invisible words do
serve to "add bulk"--although any random text (even nonsense words)
would serve that same purpose.
I have a pretty strong hunch about what these messages are trying to
do. Specifically, I think they're a clever attempt (by someone who
doesn't really understand statistical language processing) to sneak
past Bayesian classifiers. And they succeed the first time or two;
but by the third time, the Bayesian classifier has identified at least
two "tell-tale giveaways" that make these messages very easy to
spot for any statistically based technology (including mine).
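
To illustrate the "third time" effect, here is a minimal sketch. It
assumes a toy per-message naive Bayes scorer, not any particular
filter. The tokens zq-img and zq-hidden are hypothetical stand-ins for
the giveaways (which are not named in this message), and every
training message below is made up for the example.

```python
import math
from collections import Counter


class TinyBayes:
    """Toy naive Bayes scorer over per-message token presence."""

    def __init__(self):
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.docs = {"spam": 0, "ham": 0}

    @staticmethod
    def tokens(text):
        # Each distinct token counts once per message.
        return set(text.lower().split())

    def train(self, text, label):
        self.docs[label] += 1
        self.counts[label].update(self.tokens(text))

    def spam_score(self, text):
        logit = 0.0
        for tok in self.tokens(text):
            s_seen = self.counts["spam"][tok]
            h_seen = self.counts["ham"][tok]
            if s_seen == 0 and h_seen == 0:
                continue  # never-seen tokens are treated as neutral
            # Laplace-smoothed fraction of messages containing the token.
            s = (s_seen + 1) / (self.docs["spam"] + 2)
            h = (h_seen + 1) / (self.docs["ham"] + 2)
            logit += math.log(s / h)
        return 1 / (1 + math.exp(-logit))


def demo():
    clf = TinyBayes()
    for msg in ("meeting notes attached for monday",
                "lunch on friday with the team",
                "please review the quarterly report",
                "draft agenda for the workshop"):
        clf.train(msg, "ham")
    for msg in ("buy cheap pills now",
                "cheap loans apply now",
                "win a free prize today",
                "free pills no prescription"):
        clf.train(msg, "spam")

    # Padded spam: different filler words each time, same two markers.
    first = "meeting zephyr quartz lantern zq-img zq-hidden"
    second = "lunch walnut cobalt ferret zq-img zq-hidden"
    third = "report orchid pewter badger zq-img zq-hidden"

    before = clf.spam_score(third)   # markers not yet learned
    clf.train(first, "spam")
    clf.train(second, "spam")
    after = clf.spam_score(third)    # markers now count as spam evidence
    return before, after


if __name__ == "__main__":
    before, after = demo()
    print(f"third sample before training: {before:.3f}")
    print(f"third sample after two samples: {after:.3f}")
```

In this toy setup the third sample scores below 0.5 before the
classifier has seen the first two and above 0.5 afterwards, because
the filler words change from message to message while the markers
do not.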
On an unrelated note: I've agreed to try to help coordinate the area 2
analysis work for an indeterminately short time. One of the things I
would really like to do is to run a quick "pilot" study (and I pretty
much don't care about what). This study should be small, tightly
focused, and (ideally) something that could be accomplished in, say,
6 weeks or so. The primary goal of this pilot would be to help folks
working in area 2 to discover what (if any) unique mechanical
requirements there may be to conducting an anti-spam research
project. (Think of it as a "shakedown run.")
Ideas for a possible focus for this pilot study are actively
solicited. My preference would be for folks to email me off-list
with ideas, brainstorms, etc. I will summarize, and then post the
summary to the list.
- Terry
_______________________________________________
Asrg mailing list
Asrg@ietf.org
https://www1.ietf.org/mailman/listinfo/asrg