Maintaining a Publicly Available Blacklist - Mechanisms and Principles

Posted by samzenpus on Sunday April 14, 2013 @09:00AM from the it's-not-me-it's-you dept.

badger.foo writes "When you publicly assert that somebody sent spam, you need to ensure that your data is accurate. Your process needs to be simple and verifiable, and to compensate for any errors, you want your process to be transparent to the public with clear points of contact and line of responsibility. Here are some pointers from the operator of the bsdly.net greytrap-based blacklist."

Using a blacklist ... by magic+maverick+ · 2013-04-14 09:32 · Score: 5, Interesting

And while we're at it, some hints on using a public blacklist with regard to spam. The correct way is not to trust the blacklist 100%. Instead, you use it as one part of a comprehensive scheme (part of this complete breakfast). So, you may use a dictionary, and for every dictionary word found in the message you add 10 points (viagra, v1agra, v14gr4, etc.). You can use SPF and if it doesn't match, then that's worth 50 points, and if it's not there, maybe 20 points. And if the domain or IP address is on a blacklist, maybe 40 points. You assign the points as you like. Then, if you hit 100 points, you mark the email as "probably spam".
But you never reject or mark an email spam just because it's on some blacklist. That's just stupid. Now I'm off to RTFA.
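A minimal sketch of the point scheme described above, assuming the example weights from the comment (10 per dictionary word, 50 for an SPF mismatch, 20 for a missing SPF record, 40 for a blacklist hit, threshold 100). The function names, the tiny word set, and the `'pass'`/`'fail'`/`'none'` encoding of the SPF result are illustrative assumptions, not part of any particular filter.

```python
import re

# Weights taken from the figures in the comment; tune them as you like.
WORD_POINTS = 10
SPF_FAIL_POINTS = 50
SPF_MISSING_POINTS = 20
BLACKLIST_POINTS = 40
SPAM_THRESHOLD = 100

# Hypothetical dictionary; a real one would be far larger.
SPAM_WORDS = {"viagra", "v1agra", "v14gr4"}


def spam_score(body, spf_result, on_blacklist):
    """Return a cumulative score; spf_result is 'pass', 'fail', or 'none'."""
    words = re.findall(r"[a-z0-9]+", body.lower())
    # Assumption: every occurrence of a dictionary word counts.
    score = WORD_POINTS * sum(1 for w in words if w in SPAM_WORDS)
    if spf_result == "fail":
        score += SPF_FAIL_POINTS
    elif spf_result == "none":
        score += SPF_MISSING_POINTS
    if on_blacklist:
        # The blacklist is one signal among several, never a verdict alone.
        score += BLACKLIST_POINTS
    return score


def is_probably_spam(body, spf_result, on_blacklist):
    return spam_score(body, spf_result, on_blacklist) >= SPAM_THRESHOLD
```

With these weights a blacklist hit alone scores 40, well under the threshold, which is the behaviour the comment argues for.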
----
OK if you have your own blacklist (perhaps a list of domains or IP addresses that have sent email to a catch-all, or that have fallen into a honeytrap), then you do what you want. But you probably should date entries and remove old ones (if they do not misbehave again), in case a legitimate user is now at that location.
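One way to date entries and drop stale ones could look like the sketch below. The 30-day retention period and the `DatedBlacklist` class are assumptions made for illustration; the comment does not name a period.

```python
from datetime import datetime, timedelta

# Hypothetical retention period; the comment does not name one.
MAX_AGE = timedelta(days=30)


class DatedBlacklist:
    def __init__(self, max_age=MAX_AGE):
        self.max_age = max_age
        self.last_seen = {}  # address -> datetime of most recent offence

    def record_offence(self, address, when):
        # A repeat offence refreshes the date, so the entry lives longer.
        self.last_seen[address] = when

    def expire(self, now):
        """Drop and return entries with no offence within max_age."""
        cutoff = now - self.max_age
        stale = [a for a, seen in self.last_seen.items() if seen < cutoff]
        for address in stale:
            del self.last_seen[address]
        return stale

    def is_listed(self, address, now):
        seen = self.last_seen.get(address)
        return seen is not None and now - seen <= self.max_age
```

An address that misbehaves again keeps its entry alive, while one that stays quiet ages out, so a legitimate user who later takes over that address is not blocked indefinitely.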

--
HELP MY ACCOUNT HAS BEEN HACKED BY AN ILLIBERAL ART STUDENT SET TO DESTROY THE INTERWEBZ!
1. Re:Using a blacklist ... by PNutts · 2013-04-14 09:58 · Score: 3, Insightful
  
  I don't disagree with your premise. I work in a health based organization and the SPAM and "dirty word" lexicons block legit e-mails. I've also found that for receiving e-mails SPF and most other common sense checks block too much legit mail. God forbid businesses configure their hosts / gateways correctly. And don't get me started on third party mailer services. It makes an impossible job more impossibler.
Re:Greylist instead by 1s44c · 2013-04-14 09:34 · Score: 5, Insightful

If you ran an open relay you were on the right end of a blacklisting.
Re:Greylist instead by ShanghaiBill · 2013-04-14 09:57 · Score: 3, Insightful

... and all mails you get will be delayed by an hour or more, pretty unacceptable when you get an urgent complaint that something is down. And even in not work-related matters, making people wait for no reason is rude.
Simple solution: Use a whitelist first. If the email is from someone on your family/friend/co-worker/customer list, or someone you have corresponded with in the past, then you see it immediately. Anyone else can wait.
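The whitelist-first check could be as small as the sketch below. The function name and the two sets of lowercased addresses are hypothetical; how those sets are populated is left open.

```python
def should_bypass_greylist(sender, contacts, past_correspondents):
    """Return True if mail from sender should skip the greylist delay.

    contacts and past_correspondents are assumed to be sets of
    lowercased email addresses.
    """
    address = sender.strip().lower()
    return address in contacts or address in past_correspondents
```

Anything that fails this check would then go through the normal greylisting path.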
Re:Greylist instead by jhoegl · 2013-04-14 10:00 · Score: 4, Insightful

Email is not a priority notice system.
If it is so urgent, pick up the phone.
Not realistically achievable by girlintraining · 2013-04-14 10:01 · Score: 3, Insightful

Your process needs to be simple and verifiable,

The process can't be simple because spammers are endlessly creative with how they try to get past the filters. And if it were verifiable, that would mean published, and once published, it becomes useless. Spammers can simply test their latest creation against your filter, and now you effectively have given them a way to bypass your entire process, making it worthless.

and to compensate for any errors, you want your process to be transparent to the public
The administrative process can be transparent, but the technical process, as outlined above, cannot.

with clear points of contact and line of responsibility.
The problem here is: how do you tell the liars from the rest? Responsibility is fine, clear points of contact are fine, but what's the criterion for delineating between 'spam' and 'marketing'? How about between 'spam' and 'opt-in' that the user no longer wants? How about between... you get the idea. There is some grey here, and odds are good you're going to find someone doing something with a legitimate and ethical reason, that by all appearances... isn't. And then you're going to make a decision based on those appearances (because what else can you go on?) and then you're going to burn a bridge down.
These problems can't be solved with a handwave and a post on an internet forum.

--
#fuckbeta #iamslashdot #dicemustdie
Re:Greylist instead by nabsltd · 2013-04-14 10:20 · Score: 3, Informative

and all mails you get will be delayed by an hour or more, pretty unacceptable when you get an urgent complaint that something is down.
In a correctly configured greylist, only the first e-mail ever received from a particular IP address will be delayed. Once you know an IP address follows the RFC and retries, then you know that even if they do send you spam, delaying it won't change that. In order to allow for the actual machine behind an IP address changing, instead of a permanent whitelist, you pick a timeout that is long enough but not too long. I use 40 days, which allows a once-monthly mailing list to not be delayed (since the timeout is reset each time you receive an e-mail from an IP). You also pre-load the database with whitelists for Google, Amazon, Yahoo, etc.
I also set just a 4 minute delay, which means that the one e-mail is rarely even delayed by 10 minutes. I could probably get by with as short as one minute, since that would still handle the spambots that try all MX records but never try again.
Last, since I already have a database, it makes it really easy to build my own "IP address reputation" based on the incoming e-mail, which allows me to do things like temporarily blacklist an IP that has sent a lot of spam recently, etc.
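The per-IP greylisting described above can be sketched roughly as follows, using the 4-minute delay and 40-day timeout from the comment. The `Greylist` class, its in-memory dictionaries, and the use of plain seconds for time are assumptions for illustration; the commenter's actual database layout is not given.

```python
GREYLIST_DELAY = 4 * 60          # seconds; the 4-minute delay from the comment
WHITELIST_TTL = 40 * 24 * 3600   # seconds; the 40-day timeout from the comment


class Greylist:
    def __init__(self, preloaded=()):
        self.first_seen = {}    # ip -> time of first (deferred) attempt
        self.passed_until = {}  # ip -> expiry time of its whitelist entry
        self.preloaded = set(preloaded)  # pre-loaded known-good senders

    def check(self, ip, now):
        """Return 'accept' or 'defer' (an SMTP temporary failure)."""
        if ip in self.preloaded:
            return "accept"
        if self.passed_until.get(ip, 0) > now:
            # Known retrying host: accept and reset the timeout.
            self.passed_until[ip] = now + WHITELIST_TTL
            return "accept"
        self.passed_until.pop(ip, None)  # entry missing or expired
        first = self.first_seen.setdefault(ip, now)
        if now - first >= GREYLIST_DELAY:
            # The host came back after the delay, so it does retry.
            del self.first_seen[ip]
            self.passed_until[ip] = now + WHITELIST_TTL
            return "accept"
        return "defer"
```

Only the first contact from an unknown IP is deferred. Every accepted message pushes the expiry another 40 days out, which is why a once-monthly mailing list stays whitelisted.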
We run DNS-based lists by dskoll · 2013-04-14 10:51 · Score: 4, Interesting
... though they are not publicly-accessible; only accessible to our customers. Here's how they work:
Using our reputation-collection protocol, we receive a constant stream of events from our customers. An "event" is something like "IPv4 address x.y.z.w sent to a nonexistent recipient" or "IPv6 address abcd::1234 sent something that a human voted as spam".
Currently, we have a database of just under two billion events. Once an hour, we go through our database and categorize IP addresses as:
- Greylist Stumblers: Machines that seem to have trouble passing the greylist hurdle.
- Dictionary Attackers: Machines that seem to send to a lot of nonexistent addresses.
- Spam Sources: Machines that send a lot of spam.
- Mixed: Machines that send a lot of spam, but also a lot of ham (think Yahoo's servers, for example.)
- Good: Machines that aren't on any of the other four lists and that seem to send a lot of ham.
The whole system is 99.99% automated. The only manual intervention is when someone requests delisting. If it seems that someone was the victim of a compromise and has now cleaned up his/her machine, we delist it for 45 days, which is long enough for all events from that IP to expire. Then it goes back into consideration for automatic listing.
This system works really well. We have about 3.75 million IPv4 and 3300 IPv6 addresses on our lists; those are machines for which we have confidence that there's enough data to categorize them.
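A rough sketch of the hourly categorization pass, under stated assumptions: the event names (`spam`, `ham`, `bad_recipient`, `greylist_fail`), the thresholds, and the `categorize` function are all hypothetical, since the comment gives neither the event schema nor the cut-offs actually used.

```python
from collections import Counter, defaultdict

# Hypothetical thresholds; the comment does not state the real ones.
MIN_EVENTS = 20     # below this there is too little data to categorize
HIGH_SHARE = 0.5    # share of events that counts as "a lot"
MIXED_SHARE = 0.25  # both spam and ham at or above this means "mixed"


def categorize(events):
    """events: iterable of (ip, kind) pairs; returns {ip: category}."""
    per_ip = defaultdict(Counter)
    for ip, kind in events:
        per_ip[ip][kind] += 1

    result = {}
    for ip, counts in per_ip.items():
        total = sum(counts.values())
        if total < MIN_EVENTS:
            continue  # not enough confidence; leave the address unlisted
        spam = counts["spam"] / total
        ham = counts["ham"] / total
        if counts["greylist_fail"] / total >= HIGH_SHARE:
            result[ip] = "greylist stumbler"
        elif counts["bad_recipient"] / total >= HIGH_SHARE:
            result[ip] = "dictionary attacker"
        elif spam >= MIXED_SHARE and ham >= MIXED_SHARE:
            result[ip] = "mixed"
        elif spam >= HIGH_SHARE:
            result[ip] = "spam source"
        elif ham >= HIGH_SHARE:
            result[ip] = "good"
    return result
```

Addresses with too few events are simply left off every list, mirroring the remark that only machines with enough data are categorized.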
wrong tech. by buss_error · 2013-04-14 11:30 · Score: 3, Insightful

Better solution: Stop trying to force email to be a reliable and concurrent source of information. It has never been reliable, nor has it ever been a concurrent protocol. Check the default settings for sending email: try every hour for up to 5 days before giving up. Wait one day before sending a trouble report.
That email now generally DOES deliver results in almost real time is no excuse to think it will ALWAYS deliver in real time. If your communication is critical or time sensitive, then email is the wrong tool to use.

--
Necessity is the plea for every infringement of human freedom. It is the argument of tyrants; it is the creed of slaves.