At 12:56 PM -0400 5/18/03, Fred Bacon wrote:
> A nice idea, but what we really need is the script you used to analyze your logs. Then additional data can be collected at a variety of locations.

I will provide them. However, they are specific to CommuniGate Pro, and require turning on low-level SMTP monitoring so I can track the transactions. Among other things, that means that I'm getting about 40MB of log files every night for those 40,000 bounces. But I will provide the Perl libraries that I used to do the testing and generate the database. Just give me a bit of time this week to do all the work that I was supposed to have been doing instead of this :-).

> I realize that there are many on this list who find data collection to be pointless, but Kee Hinckley has shown this to be incorrect. Vernon Schryver's assertions were useless (even if correct) without hard evidence, and Kee's data is insufficient without wider deployment.

Yes. It's limited data, and it's from an odd system. Somewhere's email is somewhat abnormal compared to most companies or ISPs, although it probably looks more like a small ISP.

> Likewise, Vernon's followup that Kee is analyzing a different statement than Vernon asserted is a legitimate concern. The data analysis methodology should be publicly vetted to ensure that it is providing meaningful and accurate data.

I don't think we're that far off. The main issue is that spammer drop boxes get shut down, so the longer it is before you run the test, the less likely you are to get a valid email address. (I've considered testing that assertion. I might try periodically retesting addresses and seeing if they disappear.)

> Therefore, it is not possible to determine with certainty whether these accounts actually existed. A better testing strategy would actually send email to these accounts with the DATA command and watch for bounce messages. However, spammers can always choose to use a real email address as the return address, and sending email to valid accounts may in itself be considered spam by the recipients.

Yes. I should have noted that this will report some number of items as valid when they are not. However, unless (as someone has asserted) the major ISPs are doing this, it doesn't impact the numbers we see for them.
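
For anyone who wants to try the same kind of check elsewhere before the Perl code is available, here is a rough sketch of the idea in Python. It is not the code used for the numbers above. It assumes the test stops at RCPT TO and never issues DATA; the names probe_address and classify_rcpt_reply, the null sender, and the verdict labels are made up for illustration. As noted above, an "accepted" verdict can be wrong when the server accepts every recipient and bounces later.

```python
import smtplib


def classify_rcpt_reply(code):
    """Map the SMTP reply code returned for RCPT TO onto a coarse verdict."""
    if 200 <= code < 300:
        # The server says it would take the mail. This can be a false
        # positive if the server accepts everything and bounces later.
        return "accepted"
    if 400 <= code < 500:
        return "deferred"   # temporary failure, worth retesting later
    if 500 <= code < 600:
        return "rejected"   # permanent failure, e.g. no such mailbox
    return "unknown"


def probe_address(mx_host, address, probe_sender="", timeout=30):
    """Ask mx_host whether it would accept mail for address.

    Stops after RCPT TO, so no message is ever delivered. The empty
    probe_sender (null reverse-path) is an assumption; some servers
    may treat it differently from a real sender.
    """
    with smtplib.SMTP(mx_host, 25, timeout=timeout) as smtp:
        smtp.helo()
        smtp.mail(probe_sender)
        code, _message = smtp.rcpt(address)
        smtp.rset()  # abandon the transaction without sending DATA
    return classify_rcpt_reply(code)
```

Rerunning probe_address on the same addresses every few days and recording when "accepted" turns into "rejected" would be one way to test the drop-box assertion.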