What's With All This Spam?
coondoggie writes to mention a Network World article about soaring spam levels, confirmed now by researchers, IT managers, and security vendors. So, indeed, it's not just you: October was a spammy month. From the article: "Levine's assumption is this spike in spam levels is a result of a new generation of viruses and zombies that can infect PCs more quickly and are harder to get rid of. In its October report, messaging security vendor MessageLabs says the spike is largely due to two Trojan programs, Warezov and SpamThru. Others say a new breed of spam messages called image spam -- messages with text embedded in an image file that evade spam filters, which can't recognize the words inside the image -- is responsible." A note: I have no interest in penny stocks.
One thing that has always bemused me about the penny stock spams is the brokerage fees. If you pay, say, 1 1/2 cents per share in brokerage, (thus 3 cents total for buying and eventually selling), your 15 cent stock trade is 20% in the hole the minute you do it.
At work we use spam assassin with a gpl OCR plugin, however, it's getting foiled by intentional added noise in the images. I propose we come up with a way to detect these non-character elements (noise) in the associated spam images instead of just trying to OCR the text. The noise I've seen seems to be like it should be easily detectable.
"Begun, this Captcha Wars has."
-Yada
I can't afford the CPU power to let it check all messages in SpamAssassin. So I have to ditch many of them based on Netblock, Country, IP address, invalid EHLO, claiming they're "localhost" or "friend". Only then, after binning about 99% of connection attempts, do the remaining have to run the SpamAssassin gauntlet.
Most of mine get binned with a 554 "You're not localhost"
Some spammer is using an email address of mine to send spam from. So I get the people writing back, asking why I am sending them spam. And another of my domains is obviously listed somewhere as a domain where guessing user accounts might be a good idea. So I get cqoiecn@mydomain.com, zqopqwn@mydomain.com, etc. It all just sucks. I'm currently getting about 10 spams per minute.
Get your own free personal location tracker
I'm working on a sender stores system for a distributed social networking software called Appleseed based, in theory, on Internet Mail 2000. I figured early on that since the system was distributed, which means that anybody could set up an Appleseed social networking "node", that it would suffer from the same problems as any mail system if I used the standard reciever-stores system.
I don't harbor any illusions about a sender stores system being able to eliminate spam entirely, but the reason I went with it, especially after reading this indepth critique, was that it created a system of accountability. You may not be able to stop spam, but you have much better tools for knowing exactly where the spam came from.
The disadvantage is that it becomes, ideologically anyways, incompatible with current email systems. I consider this a small price to pay to allow admins to have better control and protection over their systems.
The system I'm building is rudimentary for now, and only uses direct HTTP->HTTP connections to send notifications and retrieve messages, and won't have any of the fancy abilities that email has right now, but it's a start, and there's no reason that those features can't be added as it evolves. It's gonna be a big experiment, and I'm expecting a whole lot of unforseen issues, but this whole project is a big experiment, so I'm excited about the possibilities in general.
but i just recently had an older d-link wireless router that got infected with some thing that turned it in to a spam bot. it was using the router as the spam generation unit. sending out packets to and from the most random addresses. stuff that could no doubt be spam oriented. I captured about 100MB of logs pertaining to the whole issue. it even managed to block numerous updates to the firmaware. and would not allow itself to factory default. it's like it had a hwole other firmware implanted in it and was taken control of.
At my ISP, there is even more spam in November.
I often get email that contains no advertising, contains no links, has no attachments, but is definitely not written by a human and does not convey any useful information. Often this is in the form of a short story. Sometimes it is in the form of an essay. In either case, it looks like it is generated with simple probablistic markov chaining. As such, my spam filter accepts it and I have to manually delete it. Is this just nuisance spam? What does the sender get out of it? Seems pointless, and that's pretty scary to me. I can understand being annoying so you can sell more of your product to idiots on the internet, but being annoying just for the sake of it?
How we know is more important than what we know.
Another user mentioned SPF. This is good. You configure a TXT record in your DNS, which says to the world, unless emails claiming to come from mydomain.com come from mail server a.b.c.d, or w.x.y.z, then bin them. It doesn't reduce your spam, but it prevents people being able to use our domain in the from address to send their spam, meaning you get fewer bounce-backs/user not found emails. (It can mess up forwarding though.)
But I haven't got it working in Postfix yet, so I can't benefit from other's SPF records.
Get your own free personal location tracker
Since most of this spam is sent by zombies, they care nothing about the success rate of the delivery. They just pump out thousands/millions of spam messages, hit each e-mail address once and move on. If it fails or appears to fail then it just moves to the next since single-digit success rates still result in thousands or millions of free advertising for the spammer.
As a result, using greylisting results in filtering a HUGE amount of spam out since it fakes a temporary failure from any new server connecting and waits for the server to try sending the mail again after a defined delay (according to the RFC, mailservers are supposed to try sending again if they get this temporary deferral).
I set this up on my primary server (ubuntu with postfix) and saw a 99% decrease in spam since none of the zombies care enough to try connecting again. By the time a zombie gets upgraded to be wise enough to evade this, it is likely to fail all kinds of other spam tests anyway (referring mainly to blacklists, though blacklisting can be extremely evil by nature).
If you run a mailserver, definitely look into setting this up. The wikipedia article explains the low-risk nature and exactly how it works: http://en.wikipedia.org/wiki/Greylisting
I run a small, but publicly traded company. Recently, I was contacted by a "PR firm" about "promoting the stock" of my company. Normally, I just hang up, but he mentioned a few "success stories" which seemed to correlate to some of the recent spam that had slipped through spamassassin. So I got his contact details and said since I was really busy "could he please email a summary of what we'd just talked about" (which he did).
I then called the enforcement division of the SEC and said I had the name and contact details for a company that was responsible for sending a number of unsolicited pump/dump email spams to me. I also told them that I had email from the spammer himself confirming that they'd done the deed. It wasn't some innocent bystander, but the people that actually SENT the mail. I was sent to a voicemail box and assured that I'd be called back. It's now about 2 weeks later and nobody ever called me.
And people wonder why there's so many of these vermin...uh, it's practically impossible to get caught!
I would like to see an "Ask Slashdot" article on why ISPs are not making full use of available anti-spam tools like SPF. Even blocking email from known dynamic-IP ranges would stop a lot of the zombie traffic. Nobody needs to send email from a box with an address assigned to Comcast or AOL or another consumer broadband provider. Why don't spam filters take advantage of this?
Spammers put garbage in the message body, subject, other headers, etc. in order to fool the spam filters - and unfortunately, they are often pretty successful.
But one thing they cannot change is their IP addresses. I wrote a script to parse my mail and save the IP addresses (or more precisely, their first two numbers - e.g., 213.186) that appear in spam messages, but not in normal ones. Then, I run another script on my incoming mail - which marks the message as spam if it contains a blacklisted IP address.
I update the list of IPs once in a while, and it works pretty decently. Right now, I have about 4,500 items in the list - each one corresponding to a range of 256^2 IP addresses - so it's about 7% of the whole address space (kinda scary). It blocks about 2/3 of spam, with almost no false positives. Most of my spam is also marked by the SpamAssassin (or whatever the mail server uses) and automatically moved into the spam folder, so I just run the script once in a while, and it "learns" on its own.
SPF is not actually the silver bullet you think it is.
These are meant to poison filters. The idea being if they send a lot of messages with text they know that don't look like spam they can poison the filters and later use those known words/patterns to get real spam through the filter. There are likely other bits they are trying to poison as well with the non-SPAM SPAM messages.
Yet another group of people all saying how they'd solve the current spam problem, by addressing the current problem. Let's make better OCR!!!!!!! Let's write "true AI" grade image recognition! When will it end?
Don't you people know that the bad guys can program too?
I'm amazed these anti-spam companies don't have their own private small armies of grey-hats trying to break their own products. I swear half these stupid ideas would just go away.
Personally, I think it's time we move to a completely different model, and do a bit of biomimicing.
We already have the equivalent of skin and cell walls, protection of networks and computers against outside pathogens.
What we really lack is an effective way of dealing with viral cells (computers). The fact that the internet continues to tolerate these hundreds of thousands of hosts I find rediculous.
The fact that most of these spam detection systems are held by private that don't share them is insulting.
I think what we need is a more real-time approach to spam and viruses and all bad behaviour, by just quarantining those machines (more or less) off the internet.
Something like this.
And this is a problem because... you can validate it, know that the spam really came from the spammer's own domain, and blacklist them. No, wait, that isn't a problem.
SPF was never about stopping spam, or about bypassing filters. It was about identifying forged senders at the domain level. It happens that there's a high correlation these days between the two, and in the long run knowing whether the sender is valid will be a useful piece of input in spam filters. And of course spam is what gets the headlines.
If you have some way of validating that the sender is who they say they are, you can do a number of things:
The main problem is that neither SPF nor DomainKeys has reached critical mass. Not enough places have implemented them, and implemented them strictly, for it to be worth checking. Not enough places are checking for it to be worth implementing.
Part of it is inertia. And there are still two main problems: forwarding services and road warriors. Both have solutions. You can have an SPF-aware forwarder, or one which implements DomainKeys. You can set up SMTP-AUTH on the submission port and remote users should theoretically be able to send using the home server (unless the network is brain-dead and blocks port 587 in addition to 25. And I have no doubt that they exist).
Whether SPF will prove useful in the long run is, I think, still up in the air. But saying that it's useless because spammers have "adapted" to it is missing the point.
The experts are implying that image spam is a new trick, and in a large part responsible for the increase in spam lately. However, it seems to me that image spam is a very old trick that spam filters are trained for. My spam filters block all messages that only contain images, for instance. I suppose that a mixture of text and images is what is effective, but from the filter's point of view, it doesn't matter much that the image is there. The spammers have already been using tactics like this, with or without images, for a long time. And in my little corner of the universe, image spam hasn't been getting through any better than spam without images.
Anyhow, I'm seeing a massive increase in spam since late September. While our filter is effective, the sheer volume has meant that many more junk messages are getting through. I think that what a lot of people fail to realize is that while the problem of spam can be dealt with effectively for personal email, especially if you take advantage of an online service like gmail, it's a totally different ballgame in the corporate world where spam is a tricky and costly problem. Work email addresses get published (thus harvested) for a number of legitimate reasons, and once mailbox is on the radar it seems like the rest of them start getting sucked in. Some employees can effectively ignore their junk boxes, but others simply can't -- it can be costly to miss an email. This reduces spam filtering for these employees to a simple ranking system: "here are messages that are probably legit and you should look at right away, and here are a whole shitload of messages that are probably junk but there might be an important one in there somewhere."
My organization is relatively small, and we don't benefit from hundreds or thousands of users training the filter. Thus when there's a large increase in spam that's getting through, it can take the filter a while to learn to block them effectively. During this time it's not uncommon for the occasional legitimate message to be sent to the spam filter by a user who doesn't notice it tucked into the 75 new messages in his mailbox, and this makes matters even worse. Finally, it's really hard to get users to send their junk mail to the filters, even when you've got it setup as a simple drag & drop procedure that's just as easy as deleting. If you can only convince a percentage of your people that training the filters actually works and is important, and you only have say 50-100 employees, then you may not have near the support required to really make Bayesian filtering work to its potential effectiveness.
Anyhow, over here we've seen a huge increase in spam, with some email-heavy users who used to get 10 in their inbox per day now getting 30 to 50 or more, and with potentially hundreds going to junk boxes. (this has decreased, I think things have settled down during the past week) We run a variety of filtering measures including header checks, DNS blacklists, and Bayesian analysis but just enough spam is able to get through on a daily basis to make things difficult. Back to my original topic: virtually none of the spam getting into user inboxes has been image spam, and only a small percentage of blocked spam is image spam.
Stats from last thirty days here: Messages Processed: 91588, Spam: 72881, 80%. A large portion of our legitimate messages are internal, which are not "filtered", but still counted by the system. A large number of spam messages are getting through, so I would conservatively bump that percentage up to 83-85%.
What an absurd problem. I'm going to have to put more effort into reducing its affect.
Mmm well. I work in IT Security for a university.. we're used to seeing random PC's get infected with stuff and sending out spam. We were surprised when a few weeks ago we saw our main linux shell machine sending out 14000 spams in an hour. Investigation showed that the spam kiddies had found out login details and setup a perl script to send spam from it. We've also seen it before from MacOS X machines running SSH with weak passwords.
:/
In other words, I suspect it's probably not a great long term plan to be smug about windows vulnerabilities causing all of the problems. It will continue to be one, for sure, but the spammers have other tricks which are contributing to the problem
nope. I've been training S-A for years now and it has worked nearly flawlessly until these embedded image spams. I haven't been reading my spambox closely so I don't know how many of them are caught, but 10-15 of them make it to my inbox each day. Few other spams make it through, but a significant number of these come through.
It's extremely frustrating. I have been looking at the source of them to try to find something common to filter on with procmail but they are encoded MIME attachments which I'm not willing to block wholesale.
- MM
Translate rules as necessary for your favorite mail client.
0 1 - just my two bits
This only works for certain people. If you are, say, an ISP you really can't do this. You'd have a ton of angry beating down the door and ringing the phone off the hook. At my job we use SPF and our server uses OCR. The problem is that the spammers most likely use all the different types of mail software out there and find ways around the newest updates. Sort of like moles.
SPF would be a huge help, but getting everyone to use it will be a task in and of itself. Let alone spammers picking it up and using it. But that still only attacks the e-mails that are spoofs. What really needs to happen is just to scrap the current implementation of e-mail and create a whole new system which incorporates some sort of accountability. Not an easy task by any means I know and I have no suggestions on how this could work or if its even possible. I only see the spam getting more difficult to defeat in the end due to all the scanning/scripts that are in use currently. Eventually it will get to the point where false positives are just too high making the way we currently do e-mail worthless.
"When the people fear the government, there is tyranny. When the government fears the people, there is liberty."
Ignoring for the moment your admission of guilt, how did you make that $20k/day?
Who was paying you?