Re: [Asrg] 2.0 Metrics (Was Re: [Asrg] spamstones)
Arrgh! Following up to correct a couple critical typos below which I
didn't catch when proofreading before sending it. Sorry, I plead
Murphy's law.
On Wed, Apr 02, 2003 at 10:38:23AM -1000, Clifton Royston wrote:
> I define FP and FN with the provision that they are not allowed to
> be = 0, but otherwise in the standard way:
>
> measure      spam category        ham category
> -------      ---------------      --------------
> flagged      N(flagged-spam)      N(flagged-ham)
> unflagged    N(unflagged-spam)    N(unflagged-ham)
> -------      ---------------      --------------
> total        N(spam)              N(ham)
>
> Then FP = max(N(flagged-ham), 0.5) / N(ham)
>      FN = max(N(unflagged-spam), 0.5) / N(spam)
>
> Using the minimum of 1.5 for the numerator avoids undefined values in
                       0.5
> the log computation, and also deliberately penalizes results claiming
> "zero false positives" or "zero false negatives" if they use a small
> sample size.
>
> The tentative definition for "dSpam" is:
> 10 * ( -log10(FP) - log10(FP) + log(1/4) )
  10 * ( -log10(FP) - log10(FN) + log(1/4) )
i.e. includes the log of both false positive rate and false negative
rate, not FP twice.
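
For anyone who wants to try the numbers, here is a rough sketch of the corrected computation in Python. The function names and the sample counts are made up for illustration. It also assumes that the "log(1/4)" term is base 10 like the other two terms, which the formula as written does not state.

```python
import math

def floored_rate(count, total):
    """Error rate with the numerator floored at 0.5, so it is never zero."""
    return max(count, 0.5) / total

def dspam(flagged_ham, n_ham, unflagged_spam, n_spam):
    """Tentative dSpam score from the four counts in the table.

    Assumption: log(1/4) is taken as log10(1/4); the original formula
    does not say which base is meant for that term.
    """
    fp = floored_rate(flagged_ham, n_ham)       # false positive rate
    fn = floored_rate(unflagged_spam, n_spam)   # false negative rate
    return 10 * (-math.log10(fp) - math.log10(fn) + math.log10(1 / 4))

# Hypothetical sample: 1 of 100 ham flagged, 10 of 100 spam missed.
print(round(dspam(1, 100, 10, 100), 2))
# A filter that gets half of each category wrong.
print(round(dspam(50, 100, 50, 100), 2))
# "Zero false positives" on only 100 ham is still treated as FP = 0.005.
print(floored_rate(0, 100))
```

Under the base-10 assumption, a filter with FP = FN = 0.5 comes out at 0, since -log10(0.5) - log10(0.5) cancels log10(1/4) exactly.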
-- Clifton
--
Clifton Royston -- LavaNet Systems Architect -- cliftonr@lava.net
"If you ride fast enough, the Specialist can't catch you."
"What's the Specialist?" Samantha says.
"The Specialist wears a hat," says the babysitter. "The hat makes noises."
She doesn't say anything else.
Kelly Link, _The Specialist's Hat_