[Asrg] Collecting statistics
Vernon Schryver wrote:
> > I think a mechanism similar to DCC could be good. We need a way for
> > lots of sensors to dump information into the collection network. We
> > need a way to decide what summaries of the data are useful. And we need
> > a way to extract the data without compromising privacy. (e.g., report
> > SHA1 hashes of addresses rather than addresses themselves.)
> > ...
> I think I'm qualified to comment on that idea. The short version of
> my take is "great in theory but nearly hopeless in practice."
OK.
> The hopeless part is that in practice it is extremely difficult to
> build a network large enough to collect enough data to be other than
> a muddy pile of anecdotes.
Well, this is a research group; we consider blue-sky ideas. If most
people start seeing the level of spam that striker.ottawa.on.ca does,
there will be huge incentive to collaborate.
> The DCC is more than 2 years old, but it
> still sees at most single-digit percentages of all mail in the network
> and perhaps less.
I'm not sure that you need a much higher percentage than that. If
your sensors are well-dispersed, a couple of percent should be pretty
representative, or at least enough to start detecting trends.
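As a rough illustration of why a small sample might suffice, here is a sketch that computes the standard error of an estimated spam fraction. It assumes the sensors see an independent, uniformly random sample of mail, which real sensor placement will only approximate. The daily volume, sampling rate, and spam fraction are made-up numbers, not DCC measurements, and `spam_fraction_stderr` is a hypothetical helper.

```python
import math

def spam_fraction_stderr(p: float, n: int) -> float:
    """Standard error of a sample proportion p estimated from n messages.

    Assumes independent, uniformly random sampling; skewed sensor
    placement would make the real error larger than this.
    """
    return math.sqrt(p * (1.0 - p) / n)

# Hypothetical numbers: sensors see 2% of 10 million daily messages,
# and 40% of the sampled messages are spam.
total_messages = 10_000_000
sampled = total_messages * 2 // 100
se = spam_fraction_stderr(0.40, sampled)
print(f"sampled={sampled} stderr={se:.5f}")
```

Under these assumptions the precision depends on the absolute number of sampled messages rather than on the percentage of all mail, so the harder problem is dispersing the sensors well, not increasing the sampling rate.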
> Even in the skewed population that does use the DCC, there is evidence
> that making generalizations is very hard. There is a 3X difference in
> spam load per user depending on organization type judging from
> http://www.dcc-servers.net/dcc/graphs/comp-rates
Yes, I see that also among my customers. But we're looking for trends
across the whole Internet.
[...]
> Consider the dangers of being able to
> ask whether the system has seen a message with a sender of the hash of
> "Bill Gates" and a recipient of the hash of "Steve Jobs" today.
You probably only want hashes of IP addresses, not e-mail addresses.
If the DCC collected not only message hashes, but also the number of
different IPs from which those messages originated, I bet we'd see
some interesting data.
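A minimal sketch of that idea, assuming a collector that receives (message checksum, source IP) reports and stores only SHA1 hashes of the IPs. The names `hash_ip` and `SourceCounter` are hypothetical and are not part of the DCC. Note that the IPv4 space is small enough to brute-force, so an unsalted hash offers only weak privacy; the optional salt here is an illustrative assumption.

```python
import hashlib
from collections import defaultdict

def hash_ip(ip: str, salt: bytes = b"") -> str:
    """Return the SHA1 hex digest of an IP address, optionally salted."""
    return hashlib.sha1(salt + ip.encode("ascii")).hexdigest()

class SourceCounter:
    """Tracks how many distinct (hashed) source IPs sent each message."""

    def __init__(self) -> None:
        # Maps message checksum -> set of hashed source IPs.
        self._sources = defaultdict(set)

    def report(self, message_checksum: str, source_ip: str) -> None:
        """Record one sighting; only the hash of the IP is stored."""
        self._sources[message_checksum].add(hash_ip(source_ip))

    def distinct_sources(self, message_checksum: str) -> int:
        """Number of different source IPs seen for this message."""
        return len(self._sources.get(message_checksum, ()))

counter = SourceCounter()
# Documentation-range addresses; "abc123" stands in for a real checksum.
for ip in ("192.0.2.1", "192.0.2.2", "192.0.2.1"):
    counter.report("abc123", ip)
counter.report("def456", "198.51.100.7")
print(counter.distinct_sources("abc123"), counter.distinct_sources("def456"))
```

A message checksum reported from many distinct sources would be a candidate for closer study, while one reported repeatedly from a single source points to a different sending pattern.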
--
David.
_______________________________________________
Asrg mailing list
Asrg@ietf.org
https://www1.ietf.org/mailman/listinfo/asrg