Re: [Asrg] greylisting with whitelist of good mailservers
William Leibzon:
>
> That is why I said it looks like a stochastic process and was not sure if
> using a mean function is appropriate. It should also be noted that since
> I'm using quantified (x.y) scoring data, the sample space can be considered
> to be a finite set - I can't yet decide if/how this would help, though.
>
Finite - OK, but I suspect that the scores are really only ordinal. Do we
know how *much* 'worse' a message with a score of 10.0 is than one with 5.0?
Or do we merely know that one with 7.5 falls in between them? Is 5:10 the
same as 2.5:5 or 10:15? Or what?
On the other hand, we _do_ know that a source with 50/100 messages scoring
over your threshold has twice the rate of one with 25/100. You could have
more than two buckets of course, but two is easy.
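
To make the two-bucket idea concrete, here is a minimal sketch in Python.
The function name, the sample scores and the threshold of 5.0 are all made
up for illustration; nothing here comes from an actual scoring system.

```python
def rate_over_threshold(scores, threshold):
    """Fraction of scores strictly above the threshold (the 'bad' bucket)."""
    if not scores:
        raise ValueError("need at least one score")
    return sum(1 for s in scores if s > threshold) / len(scores)


# Hypothetical sources: 50 of 100 and 25 of 100 messages over the threshold.
source_a = [6.0] * 50 + [1.0] * 50
source_b = [6.0] * 25 + [1.0] * 75

rate_a = rate_over_threshold(source_a, 5.0)
rate_b = rate_over_threshold(source_b, 5.0)
print(rate_a, rate_b, rate_a / rate_b)
```

The ratio of the two rates is meaningful whatever the scores' scale is,
because only the ordering of each score against the threshold is used.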
> > Are they? Do you really want a
> > set of scores like [5.1, 5.0, 5.2, 0.1] to give the same rep. (arithmetic
> > mean = 3.85) as the set [3.8, 3.9, 4.0, 3.7] ?
>
> Testing will show if this concept works. But for now, yes, I do want
> them to give the same or similar score. I think with a larger sample this
> would give fairly accurate information.
>
I think it's likely to work moderately well. It's a philosophical thing
really, innit? Are you making repeated measures of the same quantity?
Unless you assume that the mail stream associated with a source is bound to
be homogeneous, I'd say not. If you're considering the probability that a
new message from some source will have a particular quality - then you may
be interested in the number of previous messages that have that quality.
I don't believe that a mean is really appropriate or specially useful.
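
As a rough illustration of the difference, the sketch below compares the
arithmetic mean with a simple count for the two example sets quoted above.
The threshold of 5.0 and the helper name are assumptions of mine, picked
only to show the contrast.

```python
from statistics import mean

set_a = [5.1, 5.0, 5.2, 0.1]
set_b = [3.8, 3.9, 4.0, 3.7]
THRESHOLD = 5.0  # hypothetical cut-off, not taken from the thread


def count_over(scores, threshold):
    """Number of scores strictly above the threshold."""
    return sum(1 for s in scores if s > threshold)


for name, scores in (("A", set_a), ("B", set_b)):
    print(name, round(mean(scores), 2), count_over(scores, THRESHOLD))
```

Both sets have the same mean, yet two of the four messages in the first set
are over the threshold and none in the second, which is the information the
mean throws away.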
Incidentally, if you want something to vary with datum age (and sample
size) it should probably be *confidence*, rather than 'reputation'.
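
One possible way to express that, sketched in Python: keep the observed rate
as the 'reputation' and let an interval around it stand for confidence. The
Wilson score interval, the exponential age weighting and the 30-day
half-life are my own choices for the example, not something proposed in the
thread, and feeding fractional weighted counts into the Wilson formula is a
heuristic rather than a strict use of it.

```python
import math


def wilson_interval(hits, n, z=1.96):
    """Wilson score interval for a proportion; z=1.96 is roughly 95%."""
    if n <= 0:
        return (0.0, 1.0)  # no data: no confidence at all
    p = hits / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return (centre - half, centre + half)


def aged_counts(observations, half_life_days=30.0):
    """Down-weight old observations; returns (weighted hits, weighted n).

    observations: iterable of (age_in_days, over_threshold) pairs.
    """
    hits = total = 0.0
    for age, over in observations:
        weight = 0.5 ** (age / half_life_days)
        total += weight
        if over:
            hits += weight
    return hits, total


def width(interval):
    return interval[1] - interval[0]


# Same 50% rate, but the old data counts for less, so the interval is wider.
fresh = [(0, i < 50) for i in range(100)]
stale = [(60, i < 50) for i in range(100)]
fresh_ci = wilson_interval(*aged_counts(fresh))
stale_ci = wilson_interval(*aged_counts(stale))
print(width(fresh_ci), width(stale_ci))
```

The rate stays the same in both cases; only the width of the interval
changes with the age and number of the data.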
Rgds,
JRK
_______________________________________________
Asrg mailing list
Asrg at ietf.org
https://www1.ietf.org/mailman/listinfo/asrg