
Re: False Positive -was- RE: [Asrg] Fwd: Returned mail: see transcript for details



On Tue, Apr 01, 2003 at 02:18:34PM -0500, Liudvikas Bukys wrote:
> 
> I slightly prefer the terms accept/reject over negative/positive.
> 
> PROPORTION DEFINITION:
> 
> I think that a 2x2 matrix is the most straightforward presentation:
> 
> 	{accept,reject} x {ham,spam}
> 
> 	accept	reject	total
> ham	TA	FR	NHAM = TA+FR
> spam	FA	TR	NSPAM = FA+TR
> total	TA+FA	FR+TR	NTOTAL
> 
> and the most helpful intuitive proportions (I think) would be
> FRp = FR/NHAM and FAp = FA/NSPAM.
> 
> Numbers will ALWAYS be dependent on a particular corpus.
> 
> However, I think that the definitions above will be stable
> (measuring classifier quality, not corpus composition) over
> a wide range of ham-spam ratios.  FR/NTOTAL or FA/NTOTAL, by
> contrast, would be dominated by the ham-spam ratio of the test
> set, obscuring the performance of the classifier and making
> results unnecessarily corpus-specific.

  This is exactly correct, IMHO.
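
  To make the point concrete, here is a minimal sketch in Python.  The
function name `rates` and all of the counts are hypothetical, chosen only
for illustration: the same imagined classifier (rejecting 1% of ham,
accepting 5% of spam) is scored on two corpora with different ham-spam
ratios.

```python
from fractions import Fraction

def rates(ta, fr, fa, tr):
    """Return (FRp, FAp, FR/NTOTAL, FA/NTOTAL) for the 2x2 matrix.

    ta, fr: ham accepted / ham rejected   (NHAM  = ta + fr)
    fa, tr: spam accepted / spam rejected (NSPAM = fa + tr)
    Raises ZeroDivisionError if the corpus has no ham or no spam.
    """
    nham = ta + fr
    nspam = fa + tr
    ntotal = nham + nspam
    return (
        Fraction(fr, nham),    # FRp: share of ham wrongly rejected
        Fraction(fa, nspam),   # FAp: share of spam wrongly accepted
        Fraction(fr, ntotal),  # depends on the ham-spam mix
        Fraction(fa, ntotal),  # depends on the ham-spam mix
    )

# Hypothetical counts: identical classifier behaviour, different corpora.
corpus_a = rates(ta=990, fr=10, fa=50, tr=950)     # 1000 ham, 1000 spam
corpus_b = rates(ta=990, fr=10, fa=500, tr=9500)   # 1000 ham, 10000 spam

for name, r in (("A", corpus_a), ("B", corpus_b)):
    print(name, "FRp=%s FAp=%s FR/NTOTAL=%s FA/NTOTAL=%s" % r)
```

  Under these assumed counts, FRp and FAp come out the same for both
corpora, while FR/NTOTAL and FA/NTOTAL shift with the amount of spam in
the test set.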

  Your matrix is a flipped version of the matrix I just presented in
proposing the FP/FN calculations to be used for the "dSpam" measure.

  -- Clifton

-- 
     Clifton Royston  --  LavaNet Systems Architect --  cliftonr@lava.net

  "If you ride fast enough, the Specialist can't catch you."
  "What's the Specialist?" Samantha says. 
  "The Specialist wears a hat," says the babysitter. "The hat makes noises."
  She doesn't say anything else.  
                      Kelly Link, _The Specialist's Hat_