
RE: [Asrg] RE: 2.a.1 Analysis of Actual Spam Data - Titan Key reduces spam attacks



I don't see what's controversial in your message below. I had to go to
the dictionary at least once, but what I was able to make out of the rest
made sense to me.  And the statistical points are much appreciated.

> -----Original Message-----
> From: Terry Sullivan [mailto:terry@pantos.org] 
> Sent: Friday, August 01, 2003 10:32 AM
> To: asrg@ietf.org
> Subject: [Asrg] RE: 2.a.1 Analysis of Actual Spam Data - 
> Titan Key reduces spam attacks
> 
> 
> (Oh boy, for my first posting to the list, I get to help stir up 
> controversy.)
> 
> Hello all...
> 
> For my money, there are two disjoint issues here that have sorta 
> gotten co-mingled in the discussion:
> 
> - statistical power
> 
> - absence of control condition
> 
> Statistically pedantic point: the number of observations in a sample 
> influences the magnitude of the effect that can be detected 
> analytically--nothing more, nothing less.  Statistical power 
> increases monotonically with sample size.  More data are often handy, 
> but a relatively small sample can be used successfully to detect very 
> large statistical effects.  (Tangential statistically pedantic point: 
> with the exception of 0.0, the slope of a regression line is utterly 
> uninformative.)
> 
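
To make the sample-size point concrete, here is a minimal sketch. It is
not based on the data discussed in this thread. It assumes a two-sided,
two-sample z-test with unit variance in each group; the function name
z_test_power, the effect sizes (2.0 and 0.2 standard deviations), and
the 0.05 significance level are illustrative choices only.

```python
from math import sqrt
from statistics import NormalDist


def z_test_power(effect_size, n_per_group, alpha=0.05):
    """Power of a two-sided, two-sample z-test.

    Assumes both groups have variance 1 and n_per_group observations each;
    effect_size is the true difference in means, in standard deviations.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Expected value of the test statistic under the alternative.
    shift = effect_size * sqrt(n_per_group / 2)
    # Probability of landing in either rejection tail.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


# Compare a very large effect with a small one across sample sizes.
for n in (5, 10, 20, 50, 200):
    large = z_test_power(2.0, n)
    small = z_test_power(0.2, n)
    print(f"n={n:4d}  power(d=2.0)={large:.3f}  power(d=0.2)={small:.3f}")
```

Under these assumptions, power rises with n for both effect sizes, but
the large effect is detected almost every time with only 10 observations
per group, while the small effect is rarely detected at that size.
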
> The thing that precludes any legitimate causal inference with these 
> data is the absence of a control condition.  But that's an issue of 
> logic, not statistical power.  Observing a correlation is a great way 
> to generate testable hypotheses; but a tenable claim of causality 
> requires actually testing those hypotheses.
> 
> I confess that I've not seen these data.  (The .xls file hoses my 
> non-Microsoft spreadsheet, and a platform-neutral format is 
> unavailable.)  But seeing/not seeing these data doesn't make the 
> causal claim viable.  In point of fact, my logs show a measurable 
> downward trend in total spam received since I installed my latest new 
> filtering widget.  I am absolutely confident, however, that my 
> filtering widget did not *cause* that decline.  It's just a happy 
> coincidence.  
> 
> On a more philosophical note, I can't help but suspect that 
> daily/weekly spam volume is simply way too "noisy" to serve as a 
> meaningful standalone measure of anything.  There are just too many 
> uncontrolled variables at play.  I've seen short-term longitudinal 
> fluctuations in spam volume of 100% and more, and cross-sectional 
> differences in excess of 50%.  Any "measure" with that much noise is 
> too unreliable (in a statistical sense) to support meaningful 
> analysis.
> 
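
A rough simulation can illustrate why a before/after comparison with no
control condition is hard to interpret when the measure is this noisy.
Everything in it is an assumption made for illustration: daily volume is
drawn independently from a lognormal distribution around a hypothetical
base of 100 messages, the windows are 14 days each, and the "filter"
has no effect at all. None of it comes from anyone's actual logs.

```python
import random
from statistics import mean


def chance_decline_rate(trials=2000, days=14, sigma=0.5,
                        threshold=0.10, seed=1):
    """Fraction of no-effect simulations in which the 'after' window still
    shows a decline of at least `threshold` relative to the 'before' window.
    """
    rng = random.Random(seed)
    declines = 0
    for _ in range(trials):
        # Same distribution before and after: the intervention does nothing.
        before = [100 * rng.lognormvariate(0.0, sigma) for _ in range(days)]
        after = [100 * rng.lognormvariate(0.0, sigma) for _ in range(days)]
        baseline = mean(before)
        change = (mean(after) - baseline) / baseline
        if change <= -threshold:
            declines += 1
    return declines / trials


print(f"Apparent declines of 10% or more: {chance_decline_rate():.1%}")
```

With this much day-to-day variation, a sizable share of the simulated
runs show an apparent decline even though nothing changed. A control
condition (for example, a comparable mailbox without the filter over the
same period) is what would separate a real effect from such coincidences.
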
> - Terry
> 
> 
> 
> _______________________________________________
> Asrg mailing list
> Asrg@ietf.org
> https://www1.ietf.org/mailman/listinfo/asrg
> 
> 
> 


