The Growing Field Guide To Spam Techniques
Aneusomy writes "From ActiveState: 'Compiled by Dr. John Graham-Cumming, a leading anti-spam researcher and member of the ActiveState Anti-Spam Task Force, the ActiveState Field Guide to Spam is a selection of the tricks spammers use to hide their messages from filters, providing examples taken from real-world spam messages.' The hope is that ActiveState and others can contribute to continually expand this guide, so that anti-spam filters improve."
I also thought it was pretty easy to spot and eliminate SPAM offering my mom to "Add 3inches to your penis today_________________12312vxas"
Or to eliminate javascript enabled e-mail.
SPAM is not quite a science. It's script-kiddie stuff, meaning it's not too hard to do: just some open relays, and mass e-mail lists you can buy from AOL.
Error 407 - No creative sig found
Linux and Linus Torvalds are more responsible and liable for spam than any other single entity. Personally I use IIS 6.0 which is secured against any external threat.
From the article:
the ActiveState Field Guide to Spam is a selection of the tricks
The words Active, Smart, Rich etc. are part of MSspeak - leave a bad taste..
providing examples taken from real-world spam messages.
Why not fictional world spam messages? You mean, all those enlargers I got over mail weren't real-world! Boo-hoo....
-
If you keep throwing chairs, one day you'll break windows....
I use Thunderbird, and found it to be a good system.
Before that I used POPFile, but it blocked some good mail. That was reason enough to drop it.
Just a thought, but....
Making it public, the methods used to intercept and filter spam will always mean spammers are one step ahead. If they know the strategy behind those stopping them, then that only helps them.
Is there a better way?
Many of these descriptions show how spammers try to hide text. Why would they do that? Isn't the whole point that we should read the spam?
I assume spam filters read the whole e-mail anyway, so trying to hide text would not accomplish anything.
Or are spammers just stupid?
I've definitely noticed that my spamassassin filters are getting less effective. Six months ago, it was rare to see a spam that didn't get caught. Now maybe 10-20% get through.
As I use a sensible email client that doesn't render HTML by default, I can't even read the text of the spams anyway.
Most of the tricks in the article (yes, I read it) require the mail to be in HTML format. If they were not, filters would be much more effective.
I don't remember ever receiving an e-mail that actually had any content requiring it to be HTML. It would be pretty simple to set up a mail server to bounce any incoming (or outgoing for that matter) HTML mail with a friendly notice that the server does not accept HTML mail, and to please try again using ASCII. The problem is that there are plenty of people who have no idea what they are supposed to do at that point.
Also I wonder if it could be effective for filters to detect whether such obfuscation is used rather than try to parse the contents and filter based on that. Many of the methods used are pretty obvious if you try to detect that specifically.
This post is free (as in cheese in a mousetrap).
Try Ctrl-+ in Mozilla or Mozilla FireBird.
Bayesian filters are all well and good, and are -- for now -- effective. But given these tricks, the only really reliable approach I've found is IP blacklists for repeat offenders. If your machine is used to spam me, and my complaint letter is not answered in a satisfactory way (i.e. an email saying "We are sorry. The spammer has been cut off") I don't accept mail from you any more.
And if you're on ATTBI, or Comcast, or PBI.net, or BT Openworld, or Chello, or any number of large ISPs with too much tolerance for spammers, and you're not on my whitelist, I can't read your emails.
And I don't care. Get an ISP that doesn't shelter spammers.
Athletic Scholarships to universities make as much sense as academic scholarships to sports teams.
I've often had spam get past every one of my filters, simply by being an innocuous subject (something like "Hi there, how's it going") and then a message body completely empty of any content.
I thought that was a pretty impressive attempt by those nifty spammers. Cut out all the bits of spam I ignore (such as offering me crap, giving me html email, popups etc) but keeping the bits I really hate (getting pissed off at receiving spam at all)
Well done kids, hope you keep it up!
So ... this is about techniques to grow spam in fields? Come on guys, April Fool's Day was months ago, and nobody would believe something that stupid!
what really needs to happen is to make spam an unprofitable business somehow...improving filters will just continue the battle between spammers and filter makers indefinitely...as long as they're making $$$ from the .00001% of people who actually click on the links and generate money, the battle will never end.
Why do I h8 apple?
All of these spamming techniques seem to involve visual tricks, because the rendered HTML a human sees is very different from the plain text the filter sees. Things like zero-height fonts, white-on-white text, or just using one big image, etc.
So how about this: I think every single one of these tricks would be defeated by using this process for filtering spam:
1. Render the html to an image (not on the screen, just behind the scenes)
2. Feed the image into OCR
3. Then scan the OCR text for spam
Sure OCR is not perfect, but since these techniques are imprecise already, maybe it would work well.
Although I guess processing power is a limiting factor, but maybe someday this will be worth doing.
-- the only thing we have to fear is really scary things
You mean the "Search Pattern Assessment Model" method?
Anyone see this being helpful to both spammer and spamee
i had a friend who recently turned to the dark side and now boasts that his circle of friends include the biggest spammers in the world.
and believe it or not, the biggest break these guys have had in the past year has been help from people on the "inside".
to give you an example, an ex-AOL employee has written these guys a little proggy to send messages that make the AOL mailservers think that the mail originated on the inside of the network (which means that none of it is spam checked or filtered.)
their usual 10% deliverability to AOL.com suddenly went to 100%. make no mistake -- that was worth millions to 'em.
Why not have your spam filter render the HTML in an offscreen buffer (using existing browser/plugin APIs), then pull the straight text out of the rendered document and run the filter on that?
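A rough sketch of the "pull the straight text out" idea, assuming Python's standard html.parser rather than a real offscreen browser render. It only approximates rendering: it drops comments, script/style bodies, and elements whose inline style matches a few "invisible" patterns. The pattern list and the `visible_text` name are illustrative assumptions, and badly nested or unclosed tags will confuse it.

```python
import re
from html.parser import HTMLParser

# Inline styles treated as invisible; an illustrative, incomplete list.
HIDDEN_STYLE = re.compile(
    r"display\s*:\s*none|visibility\s*:\s*hidden|font-size\s*:\s*0(?![\d.])",
    re.IGNORECASE,
)
# Void elements never get a closing tag, so they must not change nesting depth.
VOID_TAGS = {"br", "img", "hr", "meta", "input", "link", "area", "base",
             "col", "embed", "source", "wbr"}


class VisibleText(HTMLParser):
    """Collects text that is not inside a hidden element (comments are skipped)."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []
        self.stack = []  # one bool per open element: is it hidden?

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        style = dict(attrs).get("style") or ""
        parent_hidden = bool(self.stack) and self.stack[-1]
        hidden = (parent_hidden or tag in ("script", "style")
                  or bool(HIDDEN_STYLE.search(style)))
        self.stack.append(hidden)

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.stack:
            self.stack.pop()

    def handle_data(self, data):
        if not (self.stack and self.stack[-1]):
            self.chunks.append(data)


def visible_text(html):
    """Return the approximately visible text with whitespace collapsed."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return " ".join("".join(parser.chunks).split())
```

Feeding the result to the filter instead of the raw source means a word split up by comments or padded with zero-size text is seen roughly the way the reader would see it. White-on-white text and all-image mails are not handled here; those would need colour comparison or OCR.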
I like using the Bayesian filter Bogofilter to filter out my spams. It works pretty well and I like the ideas behind it.
But there doesn't seem to be any testing on the effectiveness of one Bayesian filter in comparison to another (for example, Bogofilter's effectiveness compared to POPFile's), or to Bayesian Chain Rule filters such as CRM114's Mailfilter or Dspam.
"Stupid People Abusing Mail"
Sorry, but my karma just ran over your dogma.
Who can possibly resist if the word "Free" is in red and bold? Well, me for starters. Still, this one line of the article is taken from the opening, describing a more serious problem: the fact that much spam uses so-called 'enchanted email', that is, HTML mail. For all the other bad things about that, the one thing I find most sinister is that it is easy to have the HTML code pull a picture or something from a remote server, thus making it easy to validate your e-mail address (logically, if you open the mail, the address they sent it to is active). In short, banning 'enchanted email' would lessen the amount of spam, as well as the bandwidth it steals.
Apart from that I got a chuckle out of the fact that spammers now seem to be speaking 1337:
Ze Foreign Accent
What: Replace letters with numbers or use nonsense accents
Example from the wild:
V1DE0 T4PE M0RTG4GE
Fántástìç -- eárn mõnéy thrôugh unçõlleçted judgments
The best spam filter, with the fewest false positives, is the one most people of common sense have between their ears. Anything else is merely sorting your mail according to a fixed set of rules.
Everything in the world is controlled by a small, evil group to which, unfortunately, no one you know belongs.
shows me that most of the examples from the wild use HTML in their spam mails. So my tiny solution here in the office (behind a lousy working spam filter) is to redirect mails with content-type "text/html" to a spam folder, and yes 99.9% of it is really spam that can be thrown away. The other sort of spam that arrives here is encoded with Korean charset and also easy to filter out.
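For what it's worth, the redirect rule above is only a few lines with Python's standard email package. A minimal sketch, assuming the rule looks only at the top-level Content-Type header; the folder names and the `folder_for` helper are made up for illustration.

```python
from email import message_from_string


def folder_for(raw_message):
    """Pick a folder name from the top-level Content-Type header only."""
    msg = message_from_string(raw_message)
    # get_content_type() returns the lowercased type; a missing header
    # is treated as text/plain by the email package.
    if msg.get_content_type() == "text/html":
        return "spam"
    return "inbox"
```

Note that a multipart/alternative message carrying an HTML part still lands in the inbox under this rule, since only the outermost type is checked.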
Ok, I need a proof-reader (either that or an audited-edit feature, you listening Taco?). I meant to say
"Email containing words with your name in it, or words relating to your life or work, would be given a higher probability of being marked genuine."
Sorry, but my karma just ran over your dogma.
How about
* Sad Person After Money
* Some People Are Morons
* Stupid Posts Are Meaningless
* Stop Posting Annoying Messages
* Stupid People Are Mandatory
* Suckers Protesting Against Mules
* Stupid Person At Machine
* Stupid Posters Advocating Maliciousness
* Sexual Perverts And Moneygrabbers
* Some Parts Are Meat
* SPiced hAM *
* Squirrels, Possums And Mice
* Strangled Parakeets Animal Manure
* Satan Posing As Man
* Seventy Percent Are Males
* Sprinkling Possibility As Mail
The official meaning of SPAM in terms of the Internet is "Self Promotional Advertising Message."
I still favour going after the people paying the spammers rather than the spammers themselves...unlike the big spam rings, they at least have to be locatable, otherwise they'd never be able to sell you stuff.
When I am king, you will be first against the wall.
I helped this lady out who had a 100% opt-in mailing list, but some people weren't getting their mailings... We came to find out the emails were being flagged as spam, so I set up a dummy email account for her that took every inbound message, sent it through spamassassin (with verbose reports, etc) - and then sent the email back to her.
Now she can see if there's a problem with the headers, the content of the email, etc - so she tunes the email to get the lowest spamassassin score. (You know, the last major version of spamassassin took off points if you put your email client header as being Mozilla! Hah.. That one is gone now)..
This lady definitely isn't a spammer tho, just someone with a small mailing list of 100% opted-in people.
I'm sure spammers do the same thing. I would.
Received: from ann.coward.com ([unix socket])
    by ann.coward.com (Cyrus v2.1.11) with LMTP; Tue, 04 Mar 2003 04:25:32 -0600
X-Sieve: CMU Sieve 2.2
Return-Path:
Received: from HMT3-CLT1.hotmailtest3.com (hmt3-clt1.hotmailtest3.com [64.4.7.32])
    by tandem.milestonerdl.com (8.12.8/8.12.7) with ESMTP id h24APV1C029223
    for ; Tue, 4 Mar 2003 04:25:31 -0600 (CST)
    (envelope-from Phonecalls@nootede.nl)
Received: from mail.nootede.nl ([61.11.79.215]) by HMT3-CLT1.hotmailtest3.com with Microsoft SMTPSVC(5.0.2195.4821);
    Tue, 4 Mar 2003 02:38:55 -0800
Message-ID:
To:
From: "Life Savings"
Subject: Life Insurance up to 75% Off. Get a FREE Quote Now! 7983
Date: Tue, 04 Mar 2003 02:33:09 -2000
MIME-Version: 1.0
Content-Type: text/html;
    charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
X-OriginalArrivalTime: 04 Mar 2003 10:39:01.0196 (UTC) FILETIME=[44E9F8C0:01C2E23A]
...Graham-Cumming...(snicker)
My filter works 100% of the time. If the mail does NOT include a certain series of letters and numbers, then the mail is deleted. The people that e-mail me know to include that in the mail, so their stuff gets through. Of course, if you want to subscribe to lists, then this sort of thing won't work.
I'm sure that is what the spammers hope and believe, but in fact most Bayesian filters associate a probability factor to each token or word, and they make a decision based on the set of tokens with the highest or lowest scores. For example, in Paul Graham's seminal Plan for Spam he describes using only the 15 most significant tokens to make the determination of the message's spamminess. So it really doesn't help to try to bury words like "penis" or "viagra" in a mass of obscure or invented words, however large; the filters will ignore those and home in on the bad words.
In fact, the spammers' choice of obscure or invented words as padding is dumb. If they would use regular words such as do occur in the legitimate email you want to read, there's actually a chance that over time they could render Bayesian filters less potent, because the good words would become more associated with spam than with legitimate mail. Careful attention to the training corpus is needed to avoid this happening.
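To make the "15 most significant tokens" point concrete, here is a small sketch in the style of the combining rule from A Plan for Spam. The per-token probabilities are assumed to come from training and to lie strictly between 0 and 1; the 0.4 default for unseen tokens follows that essay, while the function name and the example numbers are illustrative.

```python
from math import prod


def spam_probability(tokens, token_probs, top_n=15, unknown=0.4):
    """Combine only the top_n tokens whose probability is farthest from 0.5."""
    # Each distinct token counts once; unseen tokens get a mildly hammy default.
    probs = [token_probs.get(t, unknown) for t in set(tokens)]
    interesting = sorted(probs, key=lambda p: abs(p - 0.5), reverse=True)[:top_n]
    spammy = prod(interesting)
    hammy = prod(1 - p for p in interesting)
    return spammy / (spammy + hammy)
```

Because only the top 15 are combined, padding a message with a hundred invented words gives the same score as padding it with a thousand. Padding with words that score strongly as legitimate is a different story: they crowd the spammy tokens out of the top 15, which is the risk described above.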
It isn't that this new one that I saw was all that amazing an idea, I just hadn't seen it until recently. It is such an obvious idea that I don't know why I haven't seen it until more recently.
They send the mail as you. Fake the headers and make it look like it is from you. To you. From you.
I had our local setup here allowing in anything that was from our domain. Now I have to stop that.
I suppose the spammers saw that people were allowing their own domains and set it up that way.
On a side note and not all that related, I've noticed that I am getting (about once a week) an e-mail from a bank - citibank, or wells fargo, telling me that my loan application has not been approved, see details attached.
Now, I haven't been applying for loans, and the file attached is a *.pif file... which are notorious for being viruses, and not a format that a bank will send you.
Not to mention that looking at the headers, they usually come from attbi.com which is cable modems, and I have seen through Compuserve as well - which aren't exactly how banks usually do business.
There are some odd things afoot now, in the Villa Straylight.
Someone is paying the spammers to spam. They usually have a URL in the email. Set up a screen saver to DDOS the payer. FOLLOW THE MONEY, make it bad to buy spam.
I've often wondered why spammers go to all this effort to make their emails arrive and pass through the filters. If people are filtering spam out, then surely they aren't going to actually buy the product as a result of receiving the email.
It seems like a pointless drain on resources to me.
My other processor is big-endian.
- Check headers for signs of relay-misuse.
- Strip out anything between <mustang> signs; s/<[^>]*>//g;
- Strip out all remaining punctuation.
- Use a tr/// to convert accented characters to unaccented.
- Recall that when used in a scalar context, s/// and tr/// return a count of successful changes made.
- Check for certain words in the munged text.
We can assign messages a score based on how many "nasties" were removed as compared to how many would be in a legitimate e-mail. Then despatch to one of three mailboxes: one for stuff we are sure is legit, one for stuff we are sure is spam, and one for stuff where we aren't sure.
If we wanted to be really paranoid, we would strip out image links and JavaScript from HTML e-mails. It's not inconceivable that an image link could actually be a link to a CGI script with a unique identifier embedded into it, for the purpose of alerting the spammer that copy # 31337 {faute de mieux} of the message went to a working e-mail address. {Possibility for mischief?}
And if we were an ISP, doing this on a public server, we would allow our customers to send abuse notifications to the appropriate server owners {for all the good it's likely to do} with just a few clicks.
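The same steps can be sketched in Python for anyone who doesn't speak Perl. This is only an illustration: the relay-header check is left out, the thresholds and the word list are assumptions, and `re.subn` stands in for the "s/// returns a count" trick.

```python
import re
import unicodedata

TAG = re.compile(r"<[^>]*>")    # anything between angle brackets
PUNCT = re.compile(r"[^\w\s]")  # anything that is not a word character or whitespace


def munge(text):
    """Return (cleaned_text, removals): strip tags, punctuation, then accents."""
    text, tags = TAG.subn("", text)
    text, punct = PUNCT.subn("", text)
    # NFKD splits an accented letter into base letter + combining mark.
    decomposed = unicodedata.normalize("NFKD", text)
    plain = "".join(c for c in decomposed if not unicodedata.combining(c))
    accents = len(decomposed) - len(plain)
    return plain.lower(), tags + punct + accents


def classify(text, nasty_words, sure_spam=5, sure_legit=1):
    """Score = removals + nasty-word hits; thresholds are illustrative guesses."""
    cleaned, removed = munge(text)
    score = removed + sum(1 for word in cleaned.split() if word in nasty_words)
    if score >= sure_spam:
        return "spam"
    if score <= sure_legit:
        return "legit"
    return "unsure"
```

The three return values map onto the three mailboxes. In practice the thresholds would have to be tuned against real mail, since ordinary messages contain plenty of punctuation.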
Je fume. Tu fumes. Nous fûmes!
It's just an idea. Spammers are fighting filters by including meaningless text in their messages. Piping mail through a spellcheck could eliminate most spam. And some friends too.
If you look at these fractions of the population and do a little multiplication:
any idea how the bogus penis enlargment corp. and the spammers make this good business?
Someone please explain. People who have spam filters on don't want receive spam, and will most likely just ignore/delete any spam that does get through. Why do the spammers waste so much time trying to get past the filters? Is it to reach the unwashed masses behind ISP filters?
So a fat lot of good all those HTML tricks do you, eh spammers? (Are spammers stupid? Yes! It's Rule #3.)
One line blog. I hear that they're called Twitters now.
I think the only reasonable solution to this problem is to switch to a spammer deterrent system. We could organise leafleting campaigns outside their companies, or perhaps write deep and understanding letters explaining our dismay at their actions. Maybe we could wrap the said letters around bricks or other solid objects to aid in their delivery through a window of the aforementioned companies. We could take our dismay to the managers of the companies and set up some kind of dialogue, maybe but not definitely involving two jump leads, a car battery and a book entitled Gonad Electrocution for Dummies, to get our point across (done with the utmost respect, of course). Should the spammers still resist our polite but firm requests, things would have to get more serious.
Roses are red, violets are blue, I'm not very good at poetry but I have many other redeeming qualities
I have on occasion misclassified mail myself, both ways. A few spams (unsolicited bulk emails) have been full enough of content I found interesting that only after reading them did I realize they were not from anybody I knew. Conversely, I have a couple of times received mail which was for me, and was genuine, but so poorly formatted (lots of obnoxious HTML, strange subject and so on) that I deleted it as spam and only later came to understand it was a serious message.
The point is, not even I can do spam classification 100% correctly. It would be a tall order indeed to have an automated tool do it. But does this matter? There are two issues: discarded genuine mail, and non-caught spam.
Discarded genuine mail is not really as big a problem as people make it out to be. Mail is inherently not guaranteed; messages do fall between the cracks now and again. Swallowed by a buggy server, lost in limbo as a network connection goes down, never having a chance due to a misspelt or obsolete address, sent on a wild goose chase due to a temporary DNS error. Mail does disappear. Everybody knows that - or should know. Mistaking a mail for spam is just another crack for it to fall into. As long as the rate is low there really is no problem. And those sending mail that can easily be mistaken for spam will wise up eventually, as they see a disproportionate amount of their email get lost in the ether.
Missing spam is no real problem either. The big issue is having fifty spam in your inbox every morning, with another fifty arriving during the day. Having one or two a day, on the other hand, is not that painful.
The point is, it is not a binary system: A spam system that misses two spams a day is better than one that misses five, and vastly better than having no system at all. Similarly, one that classifies one genuine message out of a thousand as spam is no disaster. Not good, but not a reason to shut it all down either. If reliability is _that_ important, what are you doing using email in the first place?
Filtering isn't perfect. It won't ever be perfect. That's quite alright. Saying a technique is worthless because it makes an occasional mistake is throwing out the baby with the bathwater.
Trust the Computer. The Computer is your friend.
Sexual Propaganda Aimed at Men
This space available.
This will all be blindingly obvious to most readers of /., but just for the record:
Don't use your personal email address for anything online. Don't post to usenet with it, don't use it to register for anything, don't ever use it where there's any chance of it being sold to a third party or picked up by a web crawler. Use a free throwaway web-based account like hotmail or yahoo, that's what they're for. I have a verizon.net primary email address, and I've never received a single piece of spam from it.
However, I still have a forward-only email address from my university circa 1992. Back then, there was no spam and that address has to be on every spammer's list on the planet. I still get a legitimate email every year or two, but spam outnumbers these by at least 10,000 to 1. SpamAssassin does a surprisingly good job of identifying the garbage.
I also use a proxy to surf the web, as well as a large hosts file that reroutes requests to adservers to 127.0.0.1:80, combined with a utility that returns a transparent 1x1 gif to any request on port 80. And of course I use mozilla to block pop-ups and whatnot. I'm so used to surfing in this way that I always recoil in horror when I have to use IE on a naked, unprotected box. How on earth can anyone stand it?
As for more traditional types of spam such as telemarketers, there's the national do not call list. It's free, so there's nothing to lose. You'll also want to check out the many excellent resources at the Junkbusters website. One of the most useful features is a Junkbusters Declare page, which builds custom form letters for you that you can use to opt out of Direct Marketing Association junkmail, as well as telling your financial institutions, etc., not to sell your name to third parties. I used it, it's painless, and my privacy is protected.
Of course, it would be much better if we didn't have to jump through hoop after hoop just to get through the day without being pestered by morons.
After a while, SpamAssassin's false negatives and positives drove me to the Tagged Message Delivery Agent (TMDA).
;)
TMDA has flexible whitelist and blacklist capabilities. But the big win is that it can be set to autoreply to anyone not on the whitelist, and require them to reply back before allowing the email to get through. Of course, very few spammers have valid return email addresses...
This may seem drastic, but in fact it has made life soooo much easier. It also helps you to "automagically" get off those email lists you signed up for a long time ago, don't really care about, and are too lazy (or lost the info) to sign yourself off.
The only sad thing is that no longer do Russian women want to extend my length or give me free money or viagra, and I am no longer in contact with Ms. Sesse Seiko from Uganda...
The key difference is that KMail does this on a per message basis, whereas in Mozilla this is set once in Preferences and I suspect the same is true in Evolution. Thus looking at a HTML message I just received I get the following in a box at the top of the message;
"Note: This is an HTML message. For security reasons, only the raw HTML code is shown. If you trust the sender of this message then you can activate formatted HTML display for this message by clicking here."
The HTML code follows and a single click turns it into a fully rendered message, or an alternate click consigns it to the trash can.
It may be possible to add this as a mozilla mail / thunderbird toolbar, and as Thunderbird takes off I hope we will see this type of quick prefs bar develop to the same extent they have been developed for the mozilla browser component.
From the in-the-wild sample for the Camouflage technique:
"those rearing lands
Plasticine sex-cartoons.
eel harness highest
Absolutely new category of adu1t sites.
nobody jets held
Northumbria- diamond sleep."
Any lit majors able to explain this one?
Why DON'T spammers remove us from their lists when we ask? They're working REALLY REALLY hard (with all the filtering, header forging, etc.) to send mail to people that don't want it. If they would just target their email to those who had indicated that they wanted it, and removed us that had indicated they didn't, they'd save themselves a lot of grief, as measured in legal and technical hassle.
Granted, it's easier for them to ignore the "remove me"s, but is the trouble saved in 'not removing' >= the trouble spent in 'getting past spam filters'?
Besides, if the mails were targeted to those that THOUGHT their penis was small and needed extension....doesn't that mean it's not spam anymore? And wouldn't that make their click-through (or whatever) rate higher, therefore making their own attractiveness as a bulk emailer greater to their customers?
I'm just thinkin' here...
It might be offtopic, but there is a good article in The Wall Street Journal (July the 18th, 2003) about how some spammers might benefit from anti-spam laws. The idea is that big corporations that do legit business via e-mail marketing are trying to eliminate competition that gives them, spammers, a bad name. By reducing the amount of 'get rich quick' or 'increase your penis size to 18 inches' e-mails and following strict guidelines, today's spammers have a chance of being legit, reducing costs of operations and having a well established market base. For example, knowing that people opt in for some offers means that companies can target the consumers more precisely.
Ahh, I knew there was a catch. Meanwhile, I am going to post my email addresses as 'my_name at domain dot com.'
1. Most of the SPAM sent today has this little problem, where the sending server does not resolve to the IP which is listed in the header.
2. It will permit people to first map a domain to an IP. (Makes it harder for a SPAMMER because now he needs to register a domain. Once the domain is used to SPAM it can then be blocked. All blocked domains can be easily maintained in a list and shared by ISPs.)
3. Time is money. Moving domains from one ISP to another does not help the SPAMMER. The domain is blocked and the IP is identified. The SPAMMER has to be able to activate multiple domains, multiple DNS servers and such. The patterns will be easier to identify and it will be easier to block SPAM by either blocking the domain or the DNS server or all the IPs of a certain offending ISP.
4. In order to acquire a domain a payment transaction must occur. This can be traced if it's a credit card. ISPs who accept cash without ID or who continually HOST SPAMMERS can be blocked. The work involved to acquire a domain may increase the costs of a domain but I am sure that this will enable people to assign responsibility.
While this system is not perfect and, yes, it may cause some headaches for most, having sendmail match the MX record to the IP of the sending server would eliminate almost 100% of all the SPAM that I have encountered in the last 3 months. We would still need to keep the existing anti-spam practices in place.
When SPAMMERS find a way around this we can then address that issue when it's time.
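The MX check described above boils down to one comparison. A sketch, assuming a hypothetical `lookup_mx_ips(domain)` callable that returns the set of IP addresses behind the domain's MX records; Python's standard library has no MX lookup, so a real resolver would have to be plugged in there.

```python
def sender_matches_mx(envelope_from, connecting_ip, lookup_mx_ips):
    """True if the connecting IP is one of the MX hosts of the sender's domain.

    lookup_mx_ips is a caller-supplied (hypothetical) function:
    domain -> set of IP address strings.
    """
    if "@" not in envelope_from:
        return False
    domain = envelope_from.rpartition("@")[2].lower()
    if not domain:
        return False
    return connecting_ip in lookup_mx_ips(domain)
```

One of the headaches to expect: plenty of legitimate domains send mail from hosts other than the ones that receive it, so a strict version of this rule would also reject some real mail.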
DRM? No thanks, I'll just get it somewhere else...
Depending on how they sent the email, this is likely one of the "tricks" where the text content and HTML content differ.
Many mail clients (IMP for example) will display the text version, and show the HTML version as an attachment. Very likely the "missing" advertisements are in an HTML attachment.
I get spams like this all the time.
retrorocket.o not found, launch anyway?
My Bayesian filter analyzes the message in raw text, including any HTML tags. A handful of HTML "enhanced" spams might make it through the first few times until I classify the new messages as junk. Once that happens the filter learns that random HTML tags increase the chances of it being spam and it's off to the junk pile.
This article highlights why I have stopped using filters altogether. End-user filters address the symptom, not the cause. The problem with even the best filter is the mail is already there, taking up space, hogging bandwidth, and the filter is churning CPU cycles to hopefully deal with it. My mail server uses 3 rbl (blacklists), and one I have programmed myself (rbl.restongeek.com). I get no false positives, and only a trickle of spam that gets through. I also get some small pleasure reviewing my server logs of the rejected mail, where the reject happened before any of the actual data was transmitted (see my /. journal for a sample).
Of the anti-spam legislation currently being proposed, the most important clauses are those that deal with forged headers and illegal use of other servers (relay rape). Once such laws are in place, blacklists will become even more effective, because spammers will have fewer places to run and hide (if they sell something from the U.S.A.).
One final piece to the solution is to get ISPs to act responsibly, and block egress traffic on port 25 for dynamic IP addresses (look up many of my previous posts for more detail on this point). Again, combined with blacklists, this will reduce spam tremendously-- not just in your inbox, but your (and your ISP's) bandwidth.
Filtering is all very well and good - but ultimately it is an arms race that no side will win. Battles may be won but the war will rage on.
The most effective method I have used is whitelists - if your name's not down you're not getting to my inbox. All other mails are placed in a pending folder where I currently have to manually check the mails - filtering could be performed on these mails to cut out the really obvious spams and save me some time.
Human authenticators could be used to move mails not on the white list to a more privileged folder than the pending (to be reviewed) or straight to your inbox. But I expect at some point in the spam wars tricking human authenticators will be on the cards.
I personally find the white list method as used by hushmail works wonderfully.
"Things that you own end up owning you" - Tyler Durden (via Diogenes of Sinope).
What's sad about all the spam is that legitimate email gets flagged as spam by filters.
I created the mailing system for my company and we only send legitimate mail.
A legitimate "Click here to unsubscribe..." is enough for a filter to flag the message as spam.
We mostly have customers from the US and Germany. The number of US customers reached is very low....around 10% of the mailing...for Germany, 25%. It's a matter of time until those German customers become flooded with spam and start using spam filters.
As for myself I moved to a white list. 30 spams a day was more than enough to be annoying.
I'm using Outlook and Qurb is doing the job quite well for white listing. This way I can simply check the quarantined mail once a day and see if there's any good mail in there.
Karma: Very Very Very Very Bad
This thread is quite interesting, but I still cannot understand why ISPs cannot be forced to stop spammers.
IMHO, if an ISP account is generating 50000 messages per day, chances are that its owner is a spammer. So, ISP software could build a list of possible spammers. Maybe some of those messages are real service/useful communications, maybe not... But a look at the sent messages can easily reveal their nature, and a list of "trusted" accounts can be used.
Then if an account is recognized as belonging to a spammer, the latter can be identified and/or the account can be deactivated.
Why should an ISP do all this work? It could be forced by the law.
I know there are many ISPs in different countries, not equally eager to apply such a rule, but preventing a user from receiving spam from Europe or the US would be a step ahead...
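The bookkeeping side of this is trivial. A sketch, assuming the ISP already has a per-day log with one account id per message sent; the log format, the function name and the trusted list are assumptions for illustration, and the 50000 figure is just the number quoted above.

```python
from collections import Counter


def flag_heavy_senders(sent_log, threshold=50000, trusted=frozenset()):
    """Return accounts that sent at least `threshold` messages and are not trusted.

    sent_log: iterable of account ids, one entry per message sent that day.
    """
    counts = Counter(sent_log)
    return sorted(account for account, n in counts.items()
                  if n >= threshold and account not in trusted)
```

Flagged accounts would still need the human look at the sent messages before anyone gets cut off.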
- "Having a clean conscience is sign of bad memory"
Some time ago a new way for filtering spam has been discovered. Solution is simple, yet brilliant - we already have those "To confirm you're not a script, please type the text shown in this image" at various websites to guard against form-submitting bots. Apply this to email (bounce back all emails with image attached) and all the spam is gone! Not that it is a perfect solution (I wish there was...) as I see 2 minor flaws in this system : ;)
1. It introduces a delay in communication - confirmation letter has to be sent and reply received.
2. Not all recipients at the other end are *that smart* to understand "what the hell this image means and what am I supposed to do with it?"
From the other side it can serve as lameness filter
But still a promising technology. I've searched the web and came with both subscription services Mailblocks and client-side apps Icemile. The last one is free and I think I'll stick with it.
I used to maintain my filters stopping spam, but they were only catching about 60% of all spam, and even then, I still had to download it from the POP server. I signed up with spamcop.net (No, I don't work for them :P) about four months ago, and now a good 95% of my spam is blocked on the server side, and I never have to see it. Ever.
I subscribe to all of the IP blacklists, and I've never lost a legitimate email. Since March, the service has stopped about 11,000 spams. The best part is for $30/yr., I don't have to play games with filtering programs... works like a charm.
People who send out spam are in it for the money. I don't know the specifics of the industry, but I'd wager that they're paid on number of emails sent out.
Anybody know how spammers actually get paid?
What's awesome about the author (Dr. John Graham-Cumming) is that he not only knows his stuff, but he puts it out in his open source software called PopFile, written in Perl.
PopFile can be located at http://popfile.sourceforge.net.
I am currently using PopFile, with an accuracy of 98.26% from nearly 8,000 messages. It's the best I've ever used, and it's free!
GeekWares - Buy and Download Today!
The trick is: the Spammer, him/her/itself (well he/she WILL be an "it" if I ever find them), wants to be completely transparent.
They send mail. You see mail. In their depraved mind, you then deal with company that commissioned mail.
First of all, I want to strangle the people who commissioned said mail, especially mr. "Free golf wedge, best in world" and the fuck from K-Mart marketing who bought a cd full of email addresses and added them to K-Mart's bluelight email list.
However, that's not the point.
Think about how we filter. In order to have a realistic opt-out sequence, we have to be able to reach the spammer back. Either by email, or clicking a link, or something of that sort.
The MOMENT something that static is in the email, however, ISP filters will catch it and promptly ban any email that they send with that indicator tag in it.
See the trick? It's all based on evading filters. You can't legitimately provide an opt-out solution, because then that becomes an identifying tag for people to filter you away.
And the last thing spammers want to see is people actually opt out anyways, because if they WERE honoring it, they couldn't claim to be mailing to 50 million people. They make their cash partially on the claim that they reach a huge number of people in order to get responses from a smaller number, just as TV shows do with ratings and ads.
Legitimate email offers are one thing.
For example, I accept emails from Amazon. Why? Because I buy books from them. When something comes up that I might be interested in, I like hearing. Likewise, I accept the occasional email from online computer parts stores I've bought from. Chances are I am not buying again, but if the right offer came along I might, and I have been a customer of theirs.
However, two things need to happen:
Fraudulent email (porn, penis junk, get rich quick, etc...) needs to be stopped, except for people who bought from those people before. It should be all opt-in.
Sales of customer lists, of lists of emails, should be ILLEGAL. I have bought a service or product from you, and only from you. I have no business relationship with your cousin, your "partner" business down the street, or anyone else you might think to send my information to.
If this happened, we wouldn't mind nearly as much. Legitimate mail, from companies I legitimately have dealt with, is fine. The problem is, for every one of those emails I get, there are 5,000 fraudulent spams.
I read a recent story by an ex-spammer that said he was up to sending something like (insert some big ass random number here because I forgot) 10M emails a day, 70M emails a week.
Got paid roughly $1,000 a week on good weeks, that seemed to be his peak.
This little kokgobbler is sending out a third of a billion spam emails just so he can make $40k a year. That alone justifies letting sys/admins kill spammers.
Glonoinha the MebiByte Slayer
...everyone *wants* a larger penis. And breasts.
Cress, cress, lovely lovely cress
You have been trolled.
Have a nice day.
I have discovered a truly remarkable sig which this 120 chars is too small to contain.
I'll answer my own question a bit: After seeing one of these scumbags on TV it's obvious they get off just watching the counter increment saying that he just sent 4,123,456,890 more messages while he watched. They don't really want you to buy or do anything. They just want to send the garbage.
Ever dream you could fly? Get up from the Flight Sim. I Fly
The Reg has just posted an article about anti-spam activists outing some potential future spammers. Give it a read, and if you're sufficiently motivated, join the battle.
I see lots of schemes to kill spam, but anything that requires cooperation between the end users isn't going to catch on. You need to be able to send email to and receive email from anyone in the world. There is an existing user base of billions for email. They won't suddenly all switch over. I think the best way to deal with spam would be a GPG web of trust, but if you blacklist unverified keys you lose one of the major advantages of email.
Give me Classic Slashdot or give me death!
I'm sure you are right, though not everyone believes this. See FOLDOC, where it states: Correspondent Bob White claims the modern use of the term predates Monty Python by at least ten years. He cites an editor for the Dallas Times Herald describing Public Relations as "throwing a can of spam into an electric fan just to see if any of it would stick to the unwary passersby."
the Spammers MAY make money by selling an occasional Penis Enlarger, but they REALLY make money by selling LISTS!
Lists of VALID email addresses.
These lists are SOLD to people trying to actually sell things. "Clean" lists with valid email addresses.
The people who BUY these lists or services want as FEW bounces as possible.
This is one reason why I get gobs of these new spams that are really nothing more than spam filter tests. They are trying to figure out what gets through and also trying to poison the filters so they can claim a higher percentage throughput for the stuff they REALLY want to deliver.
The problem is not people who actually BUY the stuff advertised, the problem is the people who buy the LISTS of email addresses or the services of a spammer thinking that they are using some sort of valid "Direct Marketing" service.
I built a small website for a client and after it was up, he wanted me to find him a way to advertise the site VIA EMAIL! He wanted me to go find a spammer, and PAY the spammer to send his ad to millions of valid email addresses.
He saw absolutely nothing wrong with this. He thought it was no different than buying a snail mail mailing list and sending out thousands of flyers....but Cheaper!
and he was not selling Penis Enlargers, he was selling Printing Services!
It's the Buying of the LISTS and spam SERVICES that's the big problem! Not the people who are actually buying the stuff in the spam.
It's like the Gold Rush, there may be no Gold anymore, but that won't stop people from heading out west to try, and when they get there, they find some nice vendors who are more than happy to sell them all the tools they need to pan for gold. Whether they find gold is immaterial, there's a steady flow of customers buying supplies. The ones that give up, just go away. Plenty more suckers lined up outside to buy pans, picks and shovels.
You can use the metaphone algorithm (I use PHP so, http://us3.php.net/manual/en/function.metaphone.php) which has come in handy.. Just strip all HTML and de-urlencode then run this on the msg, it totally ignores numbers and punctuation and any letters that are not in (a-z A-Z). You will need to have a database pre-made full of metaphone values from a dictionary then start a comparison and you can get a general feel for the msg.
I took all the words used in a product called spamassassin and used that to do a comparison.. Coupled with bayes filtering I imagine this would be pretty much the best way to filter mail.
It is kind of an interesting approach based on what mail "sounds" like vs what it actually contains.. If you filter on the straight contents these guys will just keep coming up with different ways of encoding and generally being twitchy.
However, their mail will *always* have that "buy this!" kind of sound.
I built a system a while back that was processing all double bounces from three servers and handled around 50k/day spams and came up with some interesting results.
If anyone is interested I'll dig up the code and place it on my site with the rest of the stuff there.
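For anyone who wants to play with the "sounds like" idea without PHP, here is a minimal sketch of the pipeline described above: strip HTML, de-urlencode, reduce each word to a phonetic key, and compare against keys built from a spam word list. Note the assumptions: Python's standard library has no metaphone, so this swaps in plain Soundex, a cruder phonetic code, purely as a stand-in. The function names and the tiny word list are hypothetical, and this is not the code from the system mentioned above.

```python
import re
from urllib.parse import unquote

# Soundex digit for each consonant group; vowels, h, w and y have no code.
CODES = {}
for digit, letters in enumerate(["bfpv", "cgjkqsxz", "dt", "l", "mn", "r"], start=1):
    for ch in letters:
        CODES[ch] = str(digit)

def soundex(word):
    """Soundex key (stand-in for metaphone); ignores anything outside a-z."""
    word = "".join(c for c in word.lower() if "a" <= c <= "z")
    if not word:
        return ""
    key = word[0].upper()
    prev = CODES.get(word[0], "")
    for ch in word[1:]:
        code = CODES.get(ch, "")
        if code and code != prev:
            key += code
        if ch not in "hw":  # h and w do not separate repeated codes
            prev = code
    return (key + "000")[:4]

def spam_sound_score(message, spam_words):
    """Fraction of words in the message whose key matches a spam-word key."""
    text = re.sub(r"<[^>]*>", "", message)  # strip HTML tags
    text = unquote(text)                    # de-urlencode
    spam_keys = {soundex(w) for w in spam_words}
    keys = [k for k in (soundex(t) for t in text.split()) if k]
    if not keys:
        return 0.0
    return sum(1 for k in keys if k in spam_keys) / len(keys)

print(soundex("viagra"), soundex("v1agra"))  # V260 V260
print(spam_sound_score("Buy <b>v1agra</b> n%6Fw", ["viagra", "buy"]))
```

Soundex is coarse enough that innocent words collide with spammy ones ("boy" gets the same key as "buy"), so a real metaphone implementation and a proper word list would matter a lot here.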
anime+manga together at last.. in real time.
Umm...you're a troll, maybe, but did you check the links on the left sidebar?
"America has done some terrible things. But I know that Americans don't cheer when innocents die." -Dave Barry
Most non-spam Outlook users send HTML messages that lack tables, iframes, and other post-Mosaic formatting tricks. I think if one were to bounce email that contained these useless HTML entities, you'd still be able to get your precious email from your long-lost Outlook-using girlfriend.
The flag just makes more sense than the constitution. - Judas Gutenberg
Since HTML is such a menace, why not get rid of it? You can remove markup (note: markup) without losing the meaning of text. This Windows client strips HTML and displays sweet, innocent plaintext.
I've had this occasional idea, with an implementation for it. Obviously, Spam depends on two things, getting a message out to everyone, whether they care or not, and getting response from that 1/1000000 person foolish enough to want generic viagra (or whatever).
Is it possible to automatically set up systems that ensure that the places which get the responses are added to the spammers' lists? If they get on each other's queues they drown out their own servers and the cost to benefit ratio goes up sharply.
Or so it would seem to me.
An Invisible Entity of Vast Power whose existence must be taken on faith alone: Liberal Media
A lot of the tricks he lists as "rare" are tricks that my filters frequently pick up on.
This sig no verb.
Until recently, most of my spam was "buy this stock now!". How do you propose I locate the person making money off that one?
... you can submit a spammers address? ;)
:)
:)
;)
;)
Just a thought. Instead of brooding about not being able to remove your emails from spam sites, ever considered putting in a known spammer email in the Remove box instead of yours?
And by spammer email address, I don't mean the address you pluck from the email, which is usually crap. All the spam that comes in usually advertises some sorta URL. Find out who owns the bloody URL using WHOIS, validate their email address to make sure it's legitimate, and throw their email in the Remove box.
Of course, if you have time on your hands, you could always go VISIT the spam sites, find out the contacts (if you can't, check out the source code), validate the email address for legitimacy, and do the same in other spam sites Remove box. Spam sites are just PERFECT breeding ground for Remove and Opt-Out boxes!
If you have a website, it gets even better! Draw up a HTML page and stick the spammer's legitimate email addresses in there... and pray for the bots to come!
Who knows, their email will probably get harvested, and the spammers will in turn get spammed by other spammers. Let them spam themselves to death for all I care.
And after that, if you are really hyped, subscribe them to some bestiality mailing lists, or some nasty ones, and imagine them squirm!
Remember, I get spam too! And it sucks. But thanks to Procmail, it's just a trickle now. Still, one or two do get through a day, and for those rare ones, boy I really DO hope they have REMOVE and OPT-OUT boxes!
So don't shun the Remove options. Embrace it to your advantage!
Okay, this article is a thinly veiled promo for ActiveState. This so-called field guide contains a handful of tricks that are mostly obvious to anyone that knows a little bit about HTML or MIME-Encoding. You would be much better off combing through SpamAssassin's extensive list of heuristics rather than reading a boring rehash of "Hey! you can hide stuff in HTML comments! Betcha didn't know that! (Subscribe to our newsletter, thanks.)"
>And for crying out loud, "spam" is not an acronym so stop writing it in upper case!
SPAM was coined back in the early eighties. Sorry, you're wrong!
A Bayesian filter would learn that "iaga" is a sign of spam. Any spam-hiding technique that fragments words is going to run into this same problem. Basically they'd have to resort to one or two-letter fragments, making their messages even easier to distinguish from legitimate mail.
I agree that Bayesian filters are the way to go.
Wasn't it Justice Potter Stewart who said "I can't define pornography, but I know it when I see it."?
Well, if all other spam filters try to define spam, it is the Bayesian ones that learn by example. They not only learn by the piles that wiggys describes, Bayesian filters learn about new types of spam as the messages arrive.
I learned about Bayesian from the SpamBayes site and then switched to a beta of InBoxer from a small start-up (www.inboxer.com) about a month ago. I hardly see any spam any more.
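To show why a fragment like "iaga" gives the game away, here is a toy naive Bayes token filter. It is only a sketch under simple assumptions (whitespace tokens, add-one smoothing, four made-up training messages) and is not how SpamBayes, InBoxer or PopFile are actually implemented; the class name `TinyBayes` is hypothetical.

```python
import math
from collections import Counter

class TinyBayes:
    """Toy naive Bayes spam scorer that learns from labelled examples."""

    def __init__(self):
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.docs = {"spam": 0, "ham": 0}

    def train(self, label, text):
        self.docs[label] += 1
        self.counts[label].update(text.lower().split())

    def spam_probability(self, text):
        vocab = set(self.counts["spam"]) | set(self.counts["ham"])
        total_docs = self.docs["spam"] + self.docs["ham"]
        # Start from the smoothed prior odds of spam versus ham.
        log_odds = (math.log((self.docs["spam"] + 1) / (total_docs + 2))
                    - math.log((self.docs["ham"] + 1) / (total_docs + 2)))
        for token in text.lower().split():
            for label, sign in (("spam", 1), ("ham", -1)):
                # Add-one smoothing so unseen tokens never give a zero probability.
                seen = self.counts[label][token] + 1
                total = sum(self.counts[label].values()) + len(vocab) + 1
                log_odds += sign * math.log(seen / total)
        return 1 / (1 + math.exp(-log_odds))

f = TinyBayes()
f.train("spam", "v iaga ra cheap")
f.train("spam", "buy v iaga ra now")
f.train("ham", "meeting notes attached")
f.train("ham", "lunch at noon tomorrow")
print(f.spam_probability("v iaga ra") > 0.5)       # True
print(f.spam_probability("lunch tomorrow") > 0.5)  # False
```

The fragments never show up in legitimate mail, so after a couple of examples they become strong spam evidence on their own.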
Yeah, I've been using "Active Spam Killer" (ASK www.paganini.net/ask), which has similar functionality. In the 2.5 months I've been using it, NONE of the 2000+ spams I've received have gotten to my inbox. Works great with procmail, as I don't control my mail server. I still use spamassassin to trash the stuff that's obviously spam. TMDA style software seems to be the way to go... I don't see anything else helping to obliterate spam any time soon (IE: "replace SMTP on every mail server on the planet with something better that hasn't been invented yet". Not gonna happen in our lifetimes).
Here's a blast from the past from 1995:
According to "The New Hacker's Dictionary" (third edition) by Eric S. Raymond:
====
spam vt., vi., n. [From "Monty Python's Flying Circus"]
1. To crash a program by overrunning a fixed-size buffer with excessively large input data. See also buffer overflow, overrun screw, smash the stack.
2. To cause a newsgroup to be flooded with irrelevant or inappropriate messages. You can spam a newsgroup with as little as one well- (or ill-) planned message (e.g., asking "What do you think of abortion?" on soc.women). This is often done with cross-posting (e.g. any message which is crossposted to alt.rush-limbaugh and alt.politics.homosexuality will almost inevitably spam both groups).
3. To send many identical or nearly-identical messages separately to a large number of Usenet newsgroups. This is one sure way to infuriate nearly everyone on the Net.
The second and third definitions have become much more prevalent as the Internet has opened up to non-techies, and to many Usenetters sense 3 is now (1995) primary. In this sense the term has apparantly (sic) begun to go mainstream, though without its original sense of folkloric freight - there is apparently a widespread belief among lusers that "spamming" is what happens when you dump cans of Spam into a revolving fan.
====
Now if I could just stop sneezing from all the dust that was disturbed from opening that book.
Using your e-mail address as From to send spam to you is old news. A new technique is to go to Google Groups, find out who your 'discussion buddies' are, then send you spam disguised as mail from those buddies. I've seen it happening.
Hello,
Thi<!-- Mother -->s is a gre<!-- tuesday --> arti<!-- gift -->cle. Ev<!-- asdfasf -->erybo<!-- sometimes -->y shou<!-- lovely -->ld rea<!-- car -->d it.
It's 10 PM. Do you know if you're un-American?
~~~
the best filter i have is if the message contains "-->" bin it, anything else the magic list-o-words picks up and bins for me. 99% of the spam i have had in the last month the from header is good, because its from someone else getting the spam
-later
Owen
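The comment trick and the "-->" rule above can both be sketched in a few lines. This is an illustration only: the function names are made up, and the sample string is a cleaned-up variant of the joke message earlier in the thread, not real spam.

```python
import re

# Non-greedy match so each comment is removed separately; DOTALL lets
# a comment span several lines.
HTML_COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)

def strip_html_comments(html):
    """Return the text a mail client would show once comments are dropped."""
    return HTML_COMMENT.sub("", html)

def count_html_comments(html):
    """Count comments; many of them inside words is a strong spam hint."""
    return len(HTML_COMMENT.findall(html))

sample = "Thi<!-- Mother -->s is a gre<!-- tuesday -->at arti<!-- gift -->cle."
print(strip_html_comments(sample))  # This is a great article.
print(count_html_comments(sample))  # 3
```

A filter can score the stripped text as usual and treat the comment count as one more signal, which is gentler than binning every message that contains "-->".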
Unless the poster asks for a personal reply, don't cc his personal address. Send the reply to the list, so everyone (including the poster) benefits.
What if, instead of filtering SPAM messages, we let them through and take all the steps towards buying the advertised products, except of course actually buying them?
Suppose they provide a URL to click on, then let's click it several times, even filling out many order forms on their website.
If they provide a phone number to call (it usually is an 800 number), call them 20 times for each email they send you.
Bring them the /. effect, have them pay for wasted bandwidth and long distance charges.
If they send 1 000 000 emails, and if each recipient generates 20 bogus responses, it means they somehow have to process 20 000 000 bogus orders...Now that would lower their ROI, wouldn't it?
After all, aren't they begging to be DDOS'ed and overwhelmed by customer responses?
Yeah, I hate spam as much as the next geek. However most people don't stop to think about the black side of spam filters: false positives.
I use spamassassin and Mozilla's bayesian filters, they do get rid of a lot of spam, but they also do get some false positives. This means I have to check my spam folder every so often, which kind of defeats the purpose, doesn't it?
Moreover, email is not only a personal communication tool anymore. Do you buy on-line? Do you expect an order confirmation, or a shipping confirmation? Well, it's quite likely that those could be flagged as spam by spam filters. It just happened to me yesterday on an ebay winning bid notice, because the subject had an exclamation mark. Businesses -- you know, the kind of organization that usually pays the salaries of us working geeks, or the salaries of the parents of student geeks -- need to get through to communicate with their customers. Spam and spam filters are both getting in the way.
How bad is that? IMHO pretty bad. Spam is killing half of the advantages of using email. Filters, with the pretty much unavoidable false positives in this cat and mouse game are killing another quarter, at least. I don't know what will happen, but it's a pretty sad situation.
/* TANSTAAFL */