
[Asrg] While I was on vacation, I came up with a proof that it is not possible to build a 100% effective anti spam filter based on content



People,

    I have just returned from a vacation in England with my father, and he came up with an idea for a proof that it is not possible to build a 100% effective anti-spam filter based on the contents of a message.

    Suppose that there is a spam detector function S that decides whether a given message m is spam, i.e. S(m) is TRUE if the message m is spam and FALSE if it is not.  Whether the algorithm or heuristics S uses are publicly known is not relevant to the proof, although an open S will make it easier for the Bad Guy.

    The Bad Guy can defeat the S function by creating a message modifier function M.  The message modifier function modifies a message m by, for example, adding more words or sentences or changing the spelling of some words.  The Bad Guy can then create a loop of the form:

while(S(m)) {
   m = M(m)
}

    Is this loop guaranteed to terminate?  That depends on the details of S and M.  For example, if S flags messages with a high ratio of misspelled to correctly spelled words, and M only adds more and more misspelled words, then the loop will never terminate.  However, if M were set up to switch to a different strategy whenever one strategy had not worked after a while, then the loop would possibly terminate eventually.  If the internals of S were known, then M could be written to work efficiently.
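
    To make the termination question concrete, here is a rough Python sketch.  Everything in it is made up for illustration: the word list, the toy S (which flags a message when more than half of its words are not in the list), the two modifiers, and the evade() helper that gives each strategy a bounded number of tries before moving to the next one.

```python
# Toy illustration only: the word list, S, and both modifiers are hypothetical.
KNOWN_WORDS = {"buy", "cheap", "pills", "now", "hello", "meeting", "at", "noon"}


def S(m, threshold=0.5):
    """Toy detector: spam if the fraction of unknown words exceeds threshold."""
    words = m.split()
    if not words:
        return False
    misspelled = sum(1 for w in words if w.lower() not in KNOWN_WORDS)
    return misspelled / len(words) > threshold


def add_misspelled(m):
    """A losing strategy against this S: append another unknown word."""
    return m + " xqzv"


def add_known(m):
    """A winning strategy against this S: append a correctly spelled word."""
    return m + " hello"


def evade(m, strategies, tries_per_strategy=20):
    """Try each modifier M in turn, giving up on it after a bounded number of tries.

    Returns a modified message that S accepts, or None if every strategy fails.
    """
    for M in strategies:
        candidate = m  # restart from the original message for each strategy
        for _ in range(tries_per_strategy):
            if not S(candidate):
                return candidate
            candidate = M(candidate)
        if not S(candidate):
            return candidate
    return None


if __name__ == "__main__":
    spam = "buy chep pilz nwo"
    print(evade(spam, [add_misspelled]))             # this strategy alone never succeeds
    print(evade(spam, [add_misspelled, add_known]))  # falls back to the second strategy
```

    The bound on tries is what turns the possibly endless while loop into something that always stops, at the price of sometimes giving up.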

    What if there were many different spam filters, Si(m)?  Then the while loop would have to test the message m against many functions Si(m) and use some sort of statistical technique (e.g. a simple weighted average) to decide when a message is good enough.
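
    One way the many-filter case might look, again as a sketch only: the three filters, their weights, the pad() modifier, and the "good enough" cutoff are all assumptions of mine, and the score is just the weighted fraction of filters that say TRUE.

```python
# Toy illustration only: filters, weights, modifier, and cutoff are hypothetical.

def spam_score(m, filters):
    """Weighted average of verdicts; filters is a list of (Si, weight) pairs."""
    total_weight = sum(w for _, w in filters)
    flagged_weight = sum(w for Si, w in filters if Si(m))
    return flagged_weight / total_weight


def evade_many(m, M, filters, good_enough, max_steps=50):
    """Apply M until the weighted score is at or below good_enough.

    Returns the modified message, or None if max_steps is not sufficient.
    """
    for _ in range(max_steps):
        if spam_score(m, filters) <= good_enough:
            return m
        m = M(m)
    return m if spam_score(m, filters) <= good_enough else None


# Three made-up filters with made-up weights.
FILTERS = [
    (lambda m: "pills" in m.lower(), 1),  # S1: keyword match
    (lambda m: len(m.split()) < 6, 2),    # S2: very short message
    (lambda m: "!" in m, 1),              # S3: exclamation marks
]


def pad(m):
    """Made-up modifier: append a harmless-looking word."""
    return m + " regards"


if __name__ == "__main__":
    spam = "cheap pills now!"
    print(spam_score(spam, FILTERS))
    print(evade_many(spam, pad, FILTERS, good_enough=0.5))
    print(evade_many(spam, pad, FILTERS, good_enough=0.3))
```

    With these toy filters, padding only ever defeats S2, so whether the Bad Guy succeeds depends entirely on where he sets the cutoff.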


    Does this make sense?


    My conclusion is that since we cannot solve the spam problem by content analysis, we will have to have some sort of consent-based white list/black list/gray list solution.



Jeff

_______________________________________________
Asrg mailing list
Asrg at ietf.org
https://www1.ietf.org/mailman/listinfo/asrg