Security Predictions of 2004
scubacuda writes "Computer World's security predictions for 2004: R.a..n,d,o.,m p,u,,n,c.t,,u_a.t.1..0.n evading spam filters, Internet access filtering, better desktop management, enterprise personal firewall deployment, tools that securely scrub metadata, corporate policies against USB flash drives, Wi-Fi break-ins, Bluetooth abuses, cell phone hacking, centralized control over IM, public utility breakin publicized, government defense against cybercriminals, organized cybercrime, and a shorter time to exploitation."
hopefully it is too pessimistic
R.a..n,d,o.,m p,u,,n,c.t,,u_a.t.1..0.n makes it nearly impossible to block spam messages by filtering keywords.
Can't the spam filters just remove it all? They don't really need the punctuation to check for Viagra advertisements anyway.
This is a good thing. It makes it harder for the victims to read, and gives a lot of anomalies that any modern statistical filter will find extremely useful.
OK... so they predict...
More Of The Same!
Astounding.
Remind you of something?
That random punctuation stuff is more difficult to read than 1337speak, and will continue to be: leetspeak, at least, has a fairly broad group of people that -want- to understand it and use it conversationally, and thus it's better understood.
At any rate, I doubt such punctuation will be a problem. I've already seen a good deal of it get killed with Bayesian filters anyway.
The other things though - very interesting. It's not like we can't predict these things ourselves, though - it's only a matter of time before they happen, what with the increasingly dense levels of tech in our society.
Being the thrill-seeking geek that I am, the prospect alone of Bluetooth hacking (wartoothing? :P) sends an adrenaline rush through me. I look forward to dealing with such attacks (either preventatively, directly, or for clients, etc.) - seriously. It's exciting stuff.
I can see there being a definite increase in the need for serious, intelligent, and knowledgeable computer security staff; they'll likely start supplanting what's left of IT staff, as well as replacing some of the positions that were dumped in the last several years. After IS? Who knows. Maybe we'll be batteries by then, or maybe fighting the machines.
~/ssh slashdot.org ssh: connect to host slashdot.org port 22: too many beers
It doesn't take very much CPU to s/\W//g
Yeah! Block all email containing only graphics!
Base64 isn't hard to decode... or to just bin.
I've never seen an email with an IP address based URI that wasn't spam. Trash em
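The two checks above (decode Base64 before looking at the text, and distrust URLs that point at a bare IP address) can be sketched in a few lines of Python. This is only an illustration: the function names and the URL pattern are assumptions, and a real filter would take the transfer encoding from the MIME headers rather than assume it.

```python
import base64
import re

# Matches http(s) URLs whose host is a dotted-quad IPv4 address.
# The lookahead keeps names such as 1.2.3.4.example.com from matching.
IP_URL_RE = re.compile(
    r"https?://(?:\d{1,3}\.){3}\d{1,3}(?=[:/\s]|$)", re.IGNORECASE
)

def decode_base64_body(body):
    """Decode a Base64 body into text so ordinary keyword rules can see it."""
    return base64.b64decode(body).decode("utf-8", errors="replace")

def has_ip_url(text):
    """True if the text contains a URL that uses a raw IP address as host."""
    return IP_URL_RE.search(text) is not None
```

Running the decoded body through `has_ip_url` (or any keyword rule) is then no different from scanning a plain-text message.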
Not this user, or this user's spam filter. Spams using these techniques get the highest spam scores and when 5 is worthy of trashing, 35 is worthy of laughing at (at least until I get so much spam I'll put it in /dev/null rather than ~/mail/spam).
nor do you, cliff craven
There are no karma whores, only moderation johns
Don't put your email address online, period. Other solutions like filters only address part of the problem, because you still have to pay for the bandwidth and there's the problem of false positives. I wrote a little Javascript Turing email obfuscator, which renders your email address invisible to bots, even those that can execute javascript.
An ounce of prevention...
My experience since we changed from Windows 3.1 to NT and now 2000 is that the few cases where users screwed up their PCs have been outweighed by the constant demands for an engineer visit to carry out a trivial task using the admin password. And no-one can defrag their hard disks. Ever.
When I am king, you will be first against the wall.
Why not filter for more than a certain number of punctuation marks, and in the body filter for anything with a higher-than-average punctuation-to-letter ratio? Sorry my previous post sounded confusing...
$?!!!@#!Th.,is./ ??is,!@@ sp!*($am!?..,.,;;:
I use a 2.5" 20GB USB hard drive when I move between branch offices for work as it carries all my data and stuff with me. I also use my HD as a kind of FTP directory when I want to install client software across a server network.
Come to think of it, there's nothing to stop somebody with one of these Hard drives from importing and exporting several CDs worth of data on it, and importing all kinds of strange software or even CD-copying software into the workplace to make nice CD ISO images or even whole drive dumps of code that should not be freely distributed.
The USB hard disk is probably way more risky than a flash drive, because 512MB while it can still hold a lot of info, is still expensive and is limited by its size.
READY.
PRINT ""+-0
Spam operators are getting more creative in their efforts to get around spam filters. R.a..n,d,o.,m p,u,,n,c.t,,u_a.t.1..0.n makes it nearly impossible to block spam messages by filtering keywords. Operators are changing to graphics interchange format images with no searchable text. Some spammers send in encoded formats, like Base64, to circumvent keyword filters altogether, and relay through IP addresses that have no Domain Name System domains associated with them.
Why on earth did they expand "GIF" there?
Oh well, the Base64 and even the image methods are not immune from keyword and Bayesian filters (in fact, you could theoretically write a Bayesian filter based on image features, killing any "Ad-like" images!)
autopr0n is like, down and stuff.
Spammers actually seem to try defeating bayesian spam filters by "training" them with random words:
From: Noah Poe
Date: Sun, 04 Jan 2004 15:58:49 -0600
To: a.konrad@aon.at
Subject: canberra happen
aides bone emmanuel rumania persistent josephine pencil majesty bottom
anarch molecular cafe hepburn done ellipsoid monoceros chokeberry pungent decontrolled
orphanage keel cessna lippincott drugstore onion inclement empire
This is just sick.
A monkey is doing the real work for me.
Ok, this is probably a dumb question, but why the hell doesn't anyone make a spell checking spam filter? Just set it to junk any incoming email with more than x% spelling mistakes, and voila! All y,o.ur.,. r,a.,n.d,.om.,,. p,.u,.nc,.tu,at,i.on and |33t 5p34k is fucked. Combine it with a regular spam filter, and you're set!
It'd also have the added bonus of keeping idiots who can't spell worth crap out of your inbox. And since it would work off a dictionary (preferably the same one as your outgoing spell checker, if equipped), you could always add whatever names, phrases, and abbreviations you wanted, while still keeping the "0MG L1EK MAK UR P3N0R 9 INCHZ LONGR!!" crap out of your inbox.
Surely we have the ability to create something like this. So where is it?
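A minimal Python sketch of the idea, assuming a plain set of known words stands in for a real spell-checker dictionary. The word list, the function names, and the 50% threshold are all made up for illustration. As others in the thread point out, source code and mail in other languages would trip a filter like this, so it would make more sense as one score among several than as a hard rule.

```python
# Tiny stand-in dictionary (hypothetical); a real filter would load the
# same word list the outgoing spell checker uses, plus user additions.
DICTIONARY = {
    "all", "your", "the", "meeting", "is", "moved", "to", "friday",
    "please", "bring", "report",
}

def misspelled_fraction(text, dictionary=DICTIONARY):
    """Fraction of whitespace-separated tokens not found in the dictionary."""
    tokens = [t.strip(".,!?;:\"'()") for t in text.lower().split()]
    tokens = [t for t in tokens if t]
    if not tokens:
        return 0.0
    unknown = sum(1 for t in tokens if t not in dictionary)
    return unknown / len(tokens)

def fails_spell_check(text, threshold=0.5):
    """Flag a message when more than `threshold` of its tokens are unknown."""
    return misspelled_fraction(text) > threshold
```

Punctuation stuffed inside a word makes the whole token unknown, so obfuscated text scores badly without any special handling.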
One of the requirements (coming from "concerned parents", of course) was to filter out swearing in the chat rooms. So if someone typed in, say, "you're a shit", what would actually appear for everyone else would be "you're a $!%^" or something similar.
Eventually, of course, we got into an arms race with the kids, who would write "sh1t", "s.h.i.t", "sh*t" and so on.
However, I came up with a program which generated a regexp which matched pretty much all the variations, and - to date - none of the kids have worked out a way around it.
This is how it worked.
(Actually, I can send anyone the original regexp generator code if they're interested - just mail me).
The basic concept was to use a table of "equivalences", for, eg. "a" => [ "@", "4", "A", ....], "f" => [ "ph", .... ]
For each swear word we generate a regexp with (r1|r2|r3|...) for each letter in the bad word, where r1, r2, r3, ... are the list of equivalences for that letter.
That produces a list of swear word - matching regexps which we then combined into a super mega regexp which would match any of the 50 or so banned words.
One interesting thing is that you can end up with a regexp which is too big for GNU regexp to handle ... But there are ways to get round that and you can code it up as a flex parser too which doesn't have any limits as far as I can tell.
The actual code is slightly more complex and does a few more things than above (eg. it works for "s.h.1.t" too, or even "s---h--1----------t"). And it has a concept of "obliterator characters", so "sh*t" can be banned also.
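For anyone who wants to play with the idea, here is a rough Python sketch of a generator along those lines. It is not the original code: the equivalence table is a small made-up sample, the separator and obliterator sets are guesses, and "spam" stands in for a banned word.

```python
import re

# Hypothetical equivalence table; extend it per letter as needed.
EQUIVALENCES = {
    "a": ["a", "@", "4"],
    "f": ["f", "ph"],
    "i": ["i", "1", "!"],
    "s": ["s", "5", "$"],
}
OBLITERATORS = "*#"      # characters that may stand in for any one letter
SEPARATOR = r"[\W_]*"    # optional filler between letters: dots, dashes, ...

def letter_pattern(letter):
    """Alternation of every accepted spelling of a single letter."""
    alternatives = EQUIVALENCES.get(letter, [letter]) + list(OBLITERATORS)
    return "(?:" + "|".join(re.escape(a) for a in alternatives) + ")"

def word_pattern(word):
    """Per-letter alternations joined by optional filler characters."""
    return SEPARATOR.join(letter_pattern(ch) for ch in word.lower())

def build_filter(banned_words):
    """Combine all banned words into one case-insensitive regexp."""
    combined = "|".join("(?:" + word_pattern(w) + ")" for w in banned_words)
    return re.compile(combined, re.IGNORECASE)

def censor(text, pattern, mask="$!%^"):
    """Replace every match with a fixed mask string."""
    return pattern.sub(mask, text)

FILTER = build_filter(["spam"])
```

Because the separator also accepts spaces, a pattern like this will match across word boundaries as well, which is the usual source of false positives with this approach.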
If anyone's interested I can send the code.
Rich.
libguestfs - tools for accessing and modifying virtual machine disk images
Sure, you can defeat spam filters by being obscure enough. Do random punctuation, embed your message in a mass of unrelated words and so on. But from my experience, spam is already approaching the "vanishing point" when it ceases to be comprehensible even to the humans that are supposed to react to the things. I have had spam that has been so obscure it's taken me several minutes to decipher what they are trying to sell (and they still get caught by SpamAssassin).
Trust the Computer. The Computer is your friend.
I've seen a few of these punctuation type spams. Surely it wouldn't be too hard to work on the subject line, delete all punctuation (apart from spaces) and then run it through a Bayesian filter? Rus
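The preprocessing step suggested here is a one-line substitution. A small Python sketch, with hypothetical function names and with the classifier itself left out:

```python
import re

# Anything that is not a letter, digit, or whitespace counts as filler.
# The underscore is listed separately because \w treats it as a word character.
FILLER_RE = re.compile(r"[^\w\s]|_")

def strip_filler(subject):
    """Remove punctuation but keep spaces so word boundaries survive."""
    return FILLER_RE.sub("", subject)

def tokens_for_filter(subject):
    """Lowercased tokens, ready to hand to whatever classifier is in use."""
    return strip_filler(subject).lower().split()
```

Leet digits survive this step, so the example subject from the article comes out as "Random punctuat10n" rather than plain English.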
CPanel + Root from $35/mo - 10% off with discount code SLASHDOT
Stop spam at the source, stupid!
Don't use your email address, period. Other solutions like filters only address part of the problem. I wrote a little Javascript Turing email blocker, which prevents you using email!
No more email means no more spam. Spam harvesters use viruses that collect email addresses from the computers of people that know you.
People that don't know how to use bcc spread your address all over the net. So don't give out your email address at all. Just send lonely test messages to yourself. mmm, a dictionary attack could still find you..... Stop checking your email!!!
Problem solved.
An ounce of prevention...
Subject: fodder gallonage
neglecter appease luis seagram bratwurst bluet
burgundian seamstress adair embolden frontal
rhodonite bitwise neither clara mercy footstool delivery
or how about....
Subject: dewdrop
perspicuous dinosaur fluency depart colombia oaken balfour odometer
because propel bead cowry nihilism
melanesia down mccluskey cryostat elena alphameric
----
I wondered what these emails were, but trying to poison spam filters seems correct. I figured spammers were doing it, but I thought the reason was just to spite us all. I'm sure people are doing this to email addresses and selling lists of "prepared email addresses" with compromised spam filters for extra message penetration panel sandman eyeglass conclusion inhibition globular irrigate -- er, sorry... yes, yes I have been checking my mail lately, why do you ask?
If your Turing email protection scheme actually worked, it would be easy to defeat. Spammers could harvest the XOR of the email, and use a dictionary attack.
policies against usb flash drives are bad news.
but then again, if they can't even be smart enough to buy recordable cds at work, then you can expect them to just blanket ban things...
From the article:
Second, whenever a new technology comes out, its developers generally do a poor job of designing security into it
That was true 5 years ago, but in general it's crap today. Most security problems are in re-implementations by Microsoft of old technology.
Browse through the RFCs issued in the last 5 years, which is where new Internet technology generally appears, and you'll find a generally excellent level of security design.
there are more parts to an email than just the subject line or the message body that still give away emails as spam. So even if random punctuation circumvents the spotting of something as specific as "viagra" by changing it to "v..1.,a,g.r,,a" or something similar it doesn't matter much. There are so many other hints that it's basically meaningless to do this, they still get caught because of those other clues. I'm still amazed at how well my bayesian filter of choice, popfile http://sourceforge.net/projects/popfile does with all my email needs. Filtering out spam, sorting out other emails into work, family, and a handful of other 'buckets' to get everything going where I'd like it to go. Spammers are indeed trying out different ideas all the time, but next to nothing ever gets through. And when something does manage to slip by on a rare occasion, well, you just made popfile that much better at catching the rest of the crap anyways. shrug. Been a long time (since I found popfile) since spam was even the slightest concern to me. There are quite a few different bayesian-based filtering methods out there, definitely a good idea to check at least one of them out. Popfile's a good choice, especially if you'd like to sort things besides spam too.
I expect the new IM worms to be the next major disaster to these tech companies, just like Slammer was for their unmanaged MS SQL installations.
It surprised me that no one listened to my suggestions on setting up an internal server. OK, not every luser knows IRC, but surely there are many IMs that can be set up to use an internal server and block everything else at the firewall. We tried the Lotus Notes clone of AOL's AIM and it sucked (as everything Notes), apart from using encrypted line data.
I remember trying to get hold of a senior developer I was working with using plain old talk in a terminal and he didn't know it... He got the notification in his shell and called me instead. Sort of explains the renaissance of these dummy IM clients.
What is the sound of one hand clapping?
cat
My boss (hardcore BSD hacker and anti-spam activist) added a simple rule to our spam filters: more than 5 consonants in a row in the From: field and it's tagged as spam. I'm pretty sure if necessary he can add a rule to check how many characters in a sentence are vowels, consonants, digits and punctuation. More than x% of punctuation in a sentence plus y% digits and the filter tags as spam.
I'm not as good as him but I'm sure this can be done quite easily in perl with regexes.
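The consonant rule fits in a single regex. A Python sketch, assuming "y" counts as a consonant and using a made-up function name:

```python
import re

# Six or more consonants in a row, i.e. "more than 5". Treating "y" as a
# consonant is an assumption; the original rule does not say.
CONSONANT_RUN_RE = re.compile(r"[bcdfghjklmnpqrstvwxyz]{6,}", re.IGNORECASE)

def has_consonant_run(from_field):
    """True if the From: value contains more than 5 consecutive consonants."""
    return CONSONANT_RUN_RE.search(from_field) is not None
```

A rule this blunt will also catch legitimate addresses built from initials plus a surname, such as `tpschneck@companyname.com`, so it is probably safer as a score contribution than as an outright rejection.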
What ? Me, worry ?
Sheesh, evil *and* a jerk. -- Jade
The problem is, USB thumb drives are more widespread, cheap as chips and, from a security standpoint, easy to lose.
Thankfully I haven't lost any of my USB drives. I usually securely wipe them every few weeks JIC.
512 MB is plenty to do damage. What corporations are scared of is the copying of sensitive documents. Documents such as network diagrams, disaster recovery plans, security plans, etc. are usually no larger than 10 megs, but could deliver a damaging blow to business confidentiality.
I'm seeing a definite rise in the number of large businesses I deal with that are already banning USB thumb drives.
.-.--
Anti SPAM tools already include anti-obfuscation support. Here's one of many scripts for spamassassin.
- cnb
When you make it: viriies then you are clearly talking about the plural in the third person.
Lets stop this debate now.
Spammers exploiting systems to relay spam is a security issue. Spammers sending viruses is a security issue. Other abuses by spammers are potential security issues. T'hh-i.s i_s n,o.t, and neither is spam in general!
Spam is in its own category of abuse, and I'm all for sending out thugs with hammers to get these bastards to stop. Don't clutter security concerns with this dreck. Keep focused Computerworld!
A firewall can not protect you from yourself. Turn off what you do not need. Do not use the firewall to do your work.
Not only that, but couldn't the spam filters just check the ratio of punctuation characters to alphanumeric characters? Normal e-mails wouldn't have more punctuation than alphanumerics (unless we're talking about ASCII art which is scary in and of itself) so filtering those e-mails seems REALLY easy.
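The ratio check takes only a few lines. A Python sketch, where the 0.5 threshold and the function names are arbitrary choices for illustration:

```python
import string

PUNCTUATION = set(string.punctuation)

def punctuation_ratio(text):
    """Punctuation characters divided by alphanumeric characters."""
    punct = sum(1 for ch in text if ch in PUNCTUATION)
    alnum = sum(1 for ch in text if ch.isalnum())
    if alnum == 0:
        return float("inf") if punct else 0.0
    return punct / alnum

def too_much_punctuation(text, threshold=0.5):
    """Flag text whose punctuation-to-alphanumeric ratio exceeds the threshold."""
    return punctuation_ratio(text) > threshold
```

The subject line quoted in the article has more punctuation marks than letters and digits, while an ordinary sentence sits well under 0.1.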
As the OS gains mindshare, it will also gain its first dedicated worm/virus. I hope I'm *not* right.
Email, right now, is not very restrictive. Up the standard, and you'll have many more constraints within which to work.
People have been calling for a p2p solution to email for a while, which presents its own challenges, but does suggest that those in the know are open to change.
Just a thought...
Who mediates your information?
Personal firewalls: yes, more people will use them. In some cases, they will be important, though the rules of "if it isn't running it can't be exploited" and "less is more" are much more effective on an intranet. Firewalls add management issues that can be avoided with careful use of tools like Nessus to audit your network. That said, limited and careful use of local firewalls is a good idea if you've already taken the proper steps and the user has an identifiable need.
A version of Windows will have a major security hole exploi... ah crap, happened before I could finish my post.
They don't need to try every dictionary word, they would only need to try ones that would be answers to obvious questions.
Proof of Concept:
#!/bin/sh
if [ -f /etc/issue ]; then
if fgrep Debian /etc/issue ; then
echo "" >/etc/motd
echo "pw0nz0r3d j00 G4Y H|pp33 f4g!" >>/etc/motd
echo "" >>/etc/motd
exec shutdown -r now
fi
fi
I've noticed a trend with a bit of the spam I've been getting recently: Random HTML.
The following is an example:
<Aegf>Bigger</gorR>><feakj> feet today!<alefa>
I have to admit, it's rather effective in tricking many spam filters. Most spam filters can't tell the difference between real and fake HTML. Additionally, most HTML rendering engines automatically skip the false HTML, and still show the spam message.
Sunny Dubey
The more I read on this, the more I become convinced that AI will come about as a result of the spam wars.
This prediction has been around a while. Mindshare has little to do with Mac OS X's attractiveness to hackers. The attraction is that it is a Unix box and it is very likely to have a user that has no idea that he/she has a Unix box in front of them. System security is at the mercy of Software Update's next scheduled run, and of course an update having been released by Apple by that run.
Hmm...if the greatest email filter (the delete key) isn't working for you and your time is soooo precious because you are a corporate big wig then you always can use your "secretary" to preview the emails and delete the crap. Or have we learned nothing from years of postal services and mailrooms?
Blocking all spam is like saying the RIAA can stop you from burning a CD. It's just not going to happen.
Three major spammers began their sentences today at the U.S. Federal Penitentiary at Allenwood, Pennsylvania. Their Romania-based operation had created several well-known viruses to assist in sending spam by breaking into the computers of others. Each was initially charged with 12,346,000 violations of the Computer Fraud and Abuse Act. The leader was also charged with operating an ongoing criminal enterprise. FBI and Homeland Security investigators located the spammers, and the U.S. Department of State arranged for their extradition to the US for trial. All pled guilty to reduced charges after being convinced that they could be put away for life. The leader will serve 25 years, and his assistants will serve 15 years each.
Over the last several years, NSA has quietly been enhancing NSA Secure Linux, and has now released a secure Linux distribution for general use by U.S. Government sites. In this system, information coming in from the Internet is automatically held at a low level of trust, and cannot corrupt other information on the machine. A compatible secure browser, mail server, web server, and DNS server are provided. Free, open source copies of this code are available.
New York State Attorney General Elliot Spitzer announces a $12.6 billion verdict against Microsoft in the "Blaster VIII" case. The court held that Microsoft violated New York's "reckless endangerment" law by distributing web browsers which automatically opened content that might contain viruses, resulting in the distribution of the "Blaster VIII" worm to over 200 million computers worldwide.
Dell today announced the recall of 1.2 million computers for a security flaw. Fear of a liability lawsuit prompted the move.
Come to think of it, there's no reason to have usable USB ports on corporate desktop PCs.
That's easy enough to counter. Just keep track of how many invalid HTML tags there have been so far (based on the W3C standard), and if more than x% of the tags are invalid, then just flag it as spam. Or, just cut out the invalid tags and their content.
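A rough Python sketch of that counter. The set of known tags here is a short hand-picked sample rather than the full W3C list, and the function names are made up. The stripping helper removes only the bogus tags themselves, not their content, since dropping the content would also drop the text the filter wants to inspect.

```python
import re

# Small sample of real HTML element names; a real filter would carry the
# complete list from the HTML specification.
KNOWN_TAGS = {
    "a", "b", "body", "br", "center", "div", "font", "h1", "h2", "head",
    "html", "i", "img", "li", "p", "span", "table", "td", "tr", "ul",
}
TAG_RE = re.compile(r"</?([A-Za-z][A-Za-z0-9]*)[^>]*>")

def invalid_tag_fraction(html):
    """Fraction of tags whose element name is not in KNOWN_TAGS."""
    names = [m.group(1).lower() for m in TAG_RE.finditer(html)]
    if not names:
        return 0.0
    return sum(1 for n in names if n not in KNOWN_TAGS) / len(names)

def strip_invalid_tags(html):
    """Remove unknown tags, leaving the text between them in place."""
    def keep_known(match):
        return match.group(0) if match.group(1).lower() in KNOWN_TAGS else ""
    return TAG_RE.sub(keep_known, html)
```

On the sample spam line every tag is bogus, so the fraction is 1.0, and stripping leaves readable text for the keyword or Bayesian stage.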
Crushing dreams at the speed of sarcasm
RTFA. Spammers crack their way through the security measures (filters) designed to prevent their unauthorized access to other people's property. The existing computer security laws need to be enforced against this form of cracking.
/. If the government wants us to respect the law, it should set a better example.
to this article... Spam is the one word that gets any geek's blood boiling... It's like yelling "Atkins" at the all-you-can-eat buffet...
"Senior Managers who want to keep their jobs by avoiding a repeat of 2003 are funding enterprisewide personal firewall deployments. Now let's hope that they will be able to effectively manage them and still retain the ability to manage the PCs."
As long as these "Senior Managers" manage windows, job security is the one issue not present in 2004.
Almost all of these are just "we'll see the current trend continue".
...
Ironically, my own prediction isn't much different:
In 2004, lots of interesting things will happen in security, and none of the things that would matter will change. Instead, a lot of time, money and effort will be thrown at the wrong non-solutions.
i.e. more of 2003, or 2002, or 2001,
Assorted stuff I do sometimes: Lemuria.org
I'm surprised that spam filtering software doesn't just run a spellchecker on the email. So much spam tries to evade literal word filtering by clever spellings of p3nis and \/iagra. But if we filter out emails with too many spelling errors (and punctuation-addled non-words) in the subject and body, then all those clever ploys are for naught. (As a side benefit, more people would be careful about spelling in legitimate e-mails).
Filtering out misspelled emails puts spammers in a real quandary -- spell words correctly (and get filtered) or misspell (and get filtered).
Two wrongs don't make a right, but three lefts do.
Glad you went to the trouble of writing an email obfuscator in javascript. I simply typed mine into the gimp and saved it as a png.
They don't scan web pages manually and if someone can't be bothered to type my address into their mail client their message couldn't have been worth reading.
Government of the people, by corporate executives, for corporate profits.
Take any email whose subject has an excessive amount of punctuation and high ASCII characters, and assign it a higher probability of being spam.
My boss (hardcore BSD hacker and anti-spam activist) added a simple rule to our spam filters: more than 5 consonants in a row in the From: field and it's tagged as spam.
That's just swell. The company I work for uses the mail-account naming convention of FirstInitial MiddleInitial LastName, so an employee named "Thomas Phillip Schneck" would be tpschneck@companyname.com.
So your hardcore BSD hacker and anti-spam activist's scheme would automatically tag email from the fictional Mr. Schneck as spam.
Thanks a bunch.
In walking, just walk. In sitting, just sit. Above all, don't wobble.
-- Yun-Men
Finally! A simple solution.
You should hire yourself out as a "Security Consultant" and get some $$$.
This is nothing new; there are a whole slew of programs that do this. One example is iScrub, often used by law firms and intelligent in its design; it's actually pretty cool to see in action. It integrates with Outlook, and can differentiate between an email (containing an attachment that needs scrubbing, like MS Word) that is sent externally versus one that is sent internally. It prompts the user to scrub the document before sending the email; the user has the option not to scrub if they so desire.
For your security, this post has been encrypted with ROT-13, twice.
Yeah, the USB ports don't work on my workplace desktop. It was annoying when I discovered that, as I purchased a USB flash drive for precisely that purpose, transferring files I work on during breaks to and from home. Although I still circumvented it by writing a script on my home PC that allows me to transfer just about anything between the two. Go figure.
Creator of the popular web game Proximity
> but why the hell doesn't anyone make a spell checking spam filter
Sweet! Slashdot lameness-filter technology for my inbox! In all seriousness, I'd be concerned that not all content people wish to send is necessarily words (i.e., what if I want to send source code) or in English (or whatever your native tongue is, since there are a lot of bilingual people who use e-mail too and send messages in various languages).
I for one believe that as we see Bluetooth mature (more Bluetooth mice, keyboards, phones etc.) we will see an increasing number of security problems regarding it. I might be mistaken, but I believe that Apple does not even enable encryption by default. I know, limited range, blah blah... but these issues are rather pressing. I for one would rather not have someone viewing the text I am typing etc... Now, time to crawl back into the Faraday cage...
This on its own isn't enough to get my spambayes installation to recognize spam. But it's well on its way (mostly due to the ".." in the subject, it would appear):
$ echo 'Subject: R.a..n,d,o.,m p,u,,n,c.t,,u_a.t.1..0.n' | sb_client.py
Subject: R.a..n,d,o.,m p,u,,n,c.t,,u_a.t.1..0.n
X-Spambayes-Classification: unsure; 0.45
X-Spambayes-Evidence: '*H*': 0.67; '*S*': 0.58; 'from:none': 0.04;
'to:none': 0.23; 'content-type:text/plain': 0.25;
'x-mailer:none': 0.27; 'reply-to:none': 0.27;
'message-id:invalid': 0.36; 'sender:none': 0.83;
'subject:,': 0.86; 'subject:..': 0.98
Here's another message with only a subject line, for comparison:
$ echo 'Subject: Spambayes is written in Python' | sb_client.py
Subject: Spambayes is written in Python
X-Spambayes-Classification: ham; 0.02
X-Spambayes-Evidence: '*H*': 0.98; '*S*': 0.02; 'subject:Python': 0.00;
'from:none': 0.04; 'to:none': 0.23;
'content-type:text/plain': 0.25; 'x-mailer:none': 0.27;
'reply-to:none': 0.27; 'message-id:invalid': 0.36;
'sender:none': 0.83
Hate stupid software on freshmeat? Laugh at
Here is an idea for getting rid of spam. (Well, it's more of an idea of "relocating" it.) Make a hotmail/yahoo email address, and use that for all your internet registrations, people you don't want to talk to, etc. That way, you don't give out your working email to anyone who says they want to enlarge your member.
It's a bit late now. Nonetheless, I don't mind having to tweak my spam filters every couple of weeks; using only the filters in the MTA I use, I can zap nearly all of them. There are a few tricks, but since spammers read /., pardon me if I don't explain them here.
I'll install SpamAssassin one day but I find no pressing need. It'd be nice to get no, or almost no spam, but I can live with the few that get by that I save and add to my filters when I get bored.
Need Mercedes parts ?
I work for a large Canadian telco, and reading through this list I see a lot of things which I've either had implemented upon my machines from another group, or have been implementing myself in our group. We all have personal firewalls, we have a corporate policy for flash drives (and that is, they're allowed - for now), we've begun a corporate roll-out of Wi-Fi services - this being done means the corp-sec guys are 99.9% sure we're secure on that front, so it'll be interesting to see if we have a breakin on that front. To top that all off, I just finished building an internal corporate IM service based on Microsoft's Live Communication Server (LCS, formerly RTC). Sometimes I feel like we're in the dark ages here, but it's refreshing to see a company giving predictions about things coming in 2004, and knowing that we're already there.
That's sofa kingdom.
--dave
davecb@spamcop.net
I use a spam filtering service which first checks the message headers against a few open relay and blackhole lists. That alone is sufficient to catch 90-95% of the spam.
After that, the remaining spam is put through a Bayes filter. Messages with strange punctuation don't make it through that. Messages that *do* get through have subject lines like "Hi", or blank subject lines. (Many people I know send messages like that too.) When spammers do "strange" things to the messages - like weird punctuation, random words, invalid HTML tags - it just makes it easier to filter out (assuming they aren't already sending from a blackholed location).
So, to the spammers: Please keep up the tricks, it really helps...
Spam filters can already easily deal with this. The latest trend, however, is bayesian-killers with a bunch of random words as one part of the message, and the spam as another part.
Spammers send me volumes of dada poetry like this, and it's all stuff that appears before HTML, which I assume is the main content of the mail. Pity that I filter out HTML. And here I was hoping that there was an international dada poetry guerrilla group...
What we call folk wisdom is often no more than a kind of expedient stupidity.-Edward Abbey
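' Strip every character that is not a letter or digit.
Message = "R.a..n,d,o.,m p,u,,n,c.t,,u_a.t.1..0.n"
Chars = "1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ"
For N = 1 To Len(Message)
' InStr returns 0 when the character is absent; UCase makes the test case-insensitive.
If InStr(1, Chars, UCase(Mid(Message, N, 1))) = 0 Then
Mid(Message, N, 1) = Chr(255)
End If
Next N
Message = Replace(Message, Chr(255), "")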
By far the most nefarious spam I've seen so far is the kind that has a bunch of dictionary words in the bare 7-bit part of a MIME encoded message. It's common to see this stuff if you have a mail client that doesn't render the multi-media portion of the e-mail by default. You'll see something like:
conduit horse house press lingo technical gelatin overlord brown uniform
In the multi-media portion you'll see spam like never before.
How to stop these? You can't train a bayes database with dictionary words as it would eventually defang the whole method. Your only option I suppose would be to compare the contents of the multi-media portion with the 7-bit ASCII portion and see if they match. Problem here is to make the comparison fuzzy enough to allow for multi-byte characters and stuff like that.
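One way to try the comparison is with Python's standard `email` package: collect the words from the text/plain part and from the tag-stripped text/html part, then measure their overlap. This is only a sketch under several assumptions: the Jaccard overlap on ASCII words is a stand-in for a proper fuzzy match, the function names are invented, and `build_sample` exists only to produce test messages.

```python
import re
from email import message_from_string
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

WORD_RE = re.compile(r"[a-z0-9]+")   # ASCII only; multi-byte text needs more care
TAG_RE = re.compile(r"<[^>]+>")

def word_set(text):
    """Set of lowercased ASCII words in the text."""
    return set(WORD_RE.findall(text.lower()))

def part_text(part):
    """Decoded text of one MIME part (handles base64 and quoted-printable)."""
    payload = part.get_payload(decode=True) or b""
    charset = part.get_content_charset() or "utf-8"
    return payload.decode(charset, errors="replace")

def plain_html_similarity(raw_message):
    """Jaccard overlap between plain-text words and HTML words, or None."""
    plain, html = set(), set()
    for part in message_from_string(raw_message).walk():
        ctype = part.get_content_type()
        if ctype == "text/plain":
            plain |= word_set(part_text(part))
        elif ctype == "text/html":
            html |= word_set(TAG_RE.sub(" ", part_text(part)))
    if not plain or not html:
        return None  # nothing to compare
    return len(plain & html) / len(plain | html)

def build_sample(plain, html):
    """Helper for experiments: a two-part multipart/alternative message."""
    msg = MIMEMultipart("alternative")
    msg.attach(MIMEText(plain, "plain"))
    msg.attach(MIMEText(html, "html"))
    return msg.as_string()
```

A legitimate multipart/alternative message should score close to 1.0, while word salad paired with an unrelated HTML pitch scores near 0. Where to put the cutoff is something that would need tuning against real mail.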
The worst thing about this type of spam is that at best your bayes database is circumvented, but at worst it is trained to see good words as bad or bad words as good and is rendered useless.
With SpamAssassin it is easy to set when to auto-train your bayes backend and when not to. I have my required_hits option set to '4.0' so I would use the following settings;
use_bayes 1
auto_learn 1
auto_learn_threshold_spam 7
auto_learn_threshold_nonspam -5.5
With this I am reasonably confident that I am not training my bayes database with good words as bad unless the message really is found to be spam empirically, and the inverse unless I am sure it's a good e-mail, typically by means of AWL or whitelist_from.
If anybody has solved this, I would be very grateful to hear what you did and how you did it.
Wealth is the product of man's capacity to think. -Ayn Rand
MESSAGE = "R.a..n,d,o.,m p,u,,n,c.t,,u_a.t.1..0.n"
FLTR = "1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ "
MESSAGE = UCase(MESSAGE)
For N = 1 To Len(MESSAGE)
x = Mid(MESSAGE, N, 1)
If InStr(1, FLTR, x) = 0 Then
Mid(MESSAGE, N, 1) = Chr(255)
End If
Next N
MESSAGE = Replace(MESSAGE, Chr(255), "")
Result = "RANDOM PUNCTUAT10N"
Then add a thingy that replaces numerical chars if any alphabetical letters are next to them.
Now, how hard was that?
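The follow-up step (turn digits back into letters when they sit next to letters) could look something like this in Python. The digit-to-letter table is a guess at the common substitutions, and the function name is made up.

```python
import re

# Hypothetical leet table: only digits with an obvious letter lookalike.
LEET = {"0": "O", "1": "I", "3": "E", "4": "A", "5": "S", "7": "T"}

# A mapped digit that has a letter immediately before or after it.
ADJACENT_DIGIT_RE = re.compile(r"(?<=[A-Za-z])[013457]|[013457](?=[A-Za-z])")

def deleet(text):
    """Replace lookalike digits with letters when they touch a letter."""
    return ADJACENT_DIGIT_RE.sub(lambda m: LEET[m.group(0)], text)
```

Digits standing alone (phone numbers, prices) are left untouched, but legitimate mixed tokens such as "MP3" would be rewritten too, so the output is best used only as filter input, never shown to the user.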
A flash drive is less conspicuous, especially when connected to one of the rear USB ports. Step 1 in being naughty is not letting anyone see what you're up to.
Let's get drunk and delete production data!
I just got one of those "Millions of email addresses on a CD" spams. It includes the fax number required to request them.
Anyone in the 240 and 416 area codes that feels like clogging up someone's fax machine with tubgirl and goatse?
Here's the meat of this junk (I removed several hundred asterisks):
--quote begins--
DON'T YOU WANT TO KNOW!
PURCHASE OUR Email Addresses Directory ONLY
IF YOU WANT TO PURCHASE OUR Email Addresses Directory with
525 MILLION in 5-disk set.
Complete package 5-disk set only $99.00!!
DO NOT REPLY TO THIS EMAIL ADDRESS. TO ORDER, READ BELOW:
Fill out the Form below and fax it back to
1-240-371-0672 OR 416-467-8986
Spam is no longer a problem for me. It was a pain to get into the habit of saving every message to a "ham" or "spam" folder at first, but it is so worth it. After I got a couple thousand of each, the system effectively saves me from ~250 spam per day, with 1 or 2 a day getting through. It feels like 1998 again.
I did change the default scoring though, to use the bayesian stuff much more strongly. From my .spamassassin/user_prefs:
score BAYES_00 -1.0
score BAYES_01 -1.0
score BAYES_10 -1.0
score BAYES_20 -1.0
score BAYES_30 -1.0
score BAYES_40 -1.0
score BAYES_44 -1.0
score BAYES_50 1.0
score BAYES_56 1.0
score BAYES_60 5.0
score BAYES_70 5.0
score BAYES_80 5.0
score BAYES_90 5.0
score BAYES_99 5.0
I've not seen any false positives yet. But the key is being religious about feeding the filter with all your saved ham and spam (trapped and non-trapped). I have a script that does this every month using the folders I save to.
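For anyone who wants to do the same, a minimal sketch of such a retraining script in Python. This is not the poster's script. It assumes the saved folders are mbox files at hypothetical paths and that sa-learn is on the PATH; only the --ham, --spam, and --mbox options of sa-learn are used.

```python
import subprocess
from pathlib import Path


def build_sa_learn_commands(ham_mbox, spam_mbox):
    """Return the sa-learn command lines for one ham and one spam mbox."""
    return [
        ["sa-learn", "--ham", "--mbox", str(ham_mbox)],
        ["sa-learn", "--spam", "--mbox", str(spam_mbox)],
    ]


def retrain(ham_mbox, spam_mbox, dry_run=True):
    """Print the commands (dry run) or actually run them."""
    commands = build_sa_learn_commands(ham_mbox, spam_mbox)
    for argv in commands:
        if dry_run:
            print(" ".join(argv))
        else:
            subprocess.run(argv, check=True)  # stop on the first failure
    return commands


if __name__ == "__main__":
    # Hypothetical folder locations; adjust to wherever your client saves mail.
    mail = Path.home() / "mail"
    retrain(mail / "ham", mail / "spam", dry_run=True)
```

Run from cron once a month with dry_run=False, this would cover the "religious about feeding the filter" part.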
This doesn't solve the world's problems, but it solves mine. Which is good enough for now.
Cheers.
I get that crap all the time, but it's correctly identified as spam. Spammers think that they're clever, but they're really just a bunch of dickheads.
:P
SpamAssassin 0wns j00 spammer punks.
Do companies really think this works? I mean, spam has at least SOME small ratio of success (it may annoy the crap out of 99.99% of people, but when you're sending out trillions of spams, that 0.01% can be counted as "success" I suppose), but when you receive a spam that is this horribly mangled, how likely is it that that 0.01% of responders will even think it's legitimate anymore?
These spams look so un-professional that I can't imagine anyone would think they're actually going to get something out of it. I mean, would you shop at W..A,;Ll..,M',A=RT? Or am I seriously overestimating the intelligence level of the internet?
How long is it going to take before these people just give up already?
-"One machine can do the work of fifty ordinary men. No machine can do the work of one extraordinary man." -EH
I used to work at HP, yeah, USED to work there. You see, we were subcontracted with a staffing agency to "save the company some money" because the staffing agency would put the job listing out and would only list "some" of the daily tasks and could put a price tag on those tasks. However, once hired, more tasks were piled on the top of what we already had and not given compensation to justify it.
It appears HP wanted to break the contract with the staffing agency, so what they did was put higher restrictions on what kind of media and personal hardware could traverse the building. We were blindly following managers who invoked "no USB media" rules and "no personal hardware" rules. To complicate things, we were denied playing ANY kind of games whether web or local system based games. We were even being denied access to certain websites like /. Yes, forums were not allowed either! If we spoke in jest to any coworker about our job tasks or even saying something to the effect of "I don't get paid enough for some of this stuff I do..." we were severely reprimanded. Thank you, Sir, may I have more gruel?
So, to wrap this up, what happened was, HP eventually trimmed enough of us out and opened a support facility in INDIA, paying them 1/5th of what the quality, American-speaking people made. Our lesson here is to keep jobs in the US and stop letting our managers push us around. If I want to bring a USB media stick to work with some soothing music on it so I can relax a little at lunch, I will. If a company looks like a place you wouldn't really want to work, have some balls and tell them, "Thanks, but no thanks, your money can't buy me off - I stand for something more important." I have stopped buying HP products because I believe in America and support the American worker for supporting me. Thank you America!
-- Game Developers: Stop porting badly-textured games from crappy console systems!
Guys, guys, guys - it's called a whitelist! Most bayesian filters I know of have one (mine does anyways), so a spellchecking filter could just hook into the same whitelist. Then you could talk shop with your programming buddies all you wanted. :)
;)
Multiple languages might be a challenge though. You could always keep more than one dictionary around... the problem would be in identifying the incoming mail's primary language. I don't know if it's possible to do that through code though.
Oh well. I'm a unilingual non-programmer who doesn't know any AOLers. What do I care?
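On whether the primary language can be identified in code: one crude approach is to count common function words per language and pick the best match. The Python sketch below does that with tiny made-up word lists. The lists, the function name, and the three languages are all assumptions for illustration, and a usable filter would need far more data.

```python
import re

# Tiny illustrative stopword lists; real ones would be much larger.
STOPWORDS = {
    "english": {"the", "and", "is", "of", "to", "in", "that", "it"},
    "french": {"le", "la", "les", "et", "est", "de", "un", "une", "que"},
    "german": {"der", "die", "das", "und", "ist", "nicht", "ein", "zu"},
}


def guess_language(text):
    """Return the language whose stopwords occur most often, or None."""
    words = re.findall(r"[^\W\d_]+", text.lower())  # runs of letters only
    counts = {
        lang: sum(1 for w in words if w in stopwords)
        for lang, stopwords in STOPWORDS.items()
    }
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else None


if __name__ == "__main__":
    print(guess_language("The meeting is in the morning and it is important"))
    print(guess_language("Le chat est sur la table et il dort"))
```

Once the language is guessed, the spellchecking filter could pick the matching dictionary. Short or mixed-language mails would still trip it up.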
The side effect is that, if you use an autochecker (or rather, "if ewe ewes anne otto cheque"), you might get a message rejection. But then again, I tend to yell at people who do that anyway. =^_^=
This sig no verb.
Double letters are harder to "trip" over.
Black holes are where the Matrix raised SIGFPE
Most of this comes from SE asia and places that are on blacklists anyway. Besides, it's easy enough to parse out that is a HTML construct such as what's above - just kill anything and look at what remains.
This sig no verb.
Phantasy Star Online tried to do something similar, except that their filter was ridiculous. It had no concept of bad-words-within-good-words (the "Scunthorpe" problem), so you couldn't say things like "shoes". You couldn't say "hell" either, despite the fact that several items had the word "hell" in the name. "Frozen Shooter" was also out. They also filtered "Jew" and "gay", which I found offensive. Just because idiots use them as slurs does not make them bad words.
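The bad-words-within-good-words failure is easy to reproduce. In the Python sketch below, the first function does a plain substring check, which is presumably what that filter did (an assumption, since its code is not public). The second only matches whole words. The blocked list is a mild stand-in.

```python
import re

BLOCKED = ["hell"]  # stand-in list for illustration


def naive_filter(text):
    """Flag text if a blocked word appears anywhere, even inside other words."""
    lowered = text.lower()
    return any(word in lowered for word in BLOCKED)


def word_filter(text):
    """Flag text only if a blocked word appears as a whole word."""
    pattern = r"\b(?:" + "|".join(re.escape(w) for w in BLOCKED) + r")\b"
    return re.search(pattern, text, re.IGNORECASE) is not None


if __name__ == "__main__":
    for sample in ("Seashell necklace", "What the hell"):
        print(sample, naive_filter(sample), word_filter(sample))
```

Whole-word matching fixes the substring case. It does nothing about legitimate uses of a listed word, such as item names, which would still need a whitelist.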
And after all this, what have you gained? Can you filter out kids talking on the playground? Bill Cosby's theoretical 900-year-old-man-disguised-as-a-child who dispenses all of the dirty words to gradeschoolers will still find a way. If _your_ kids start swearing around you (or Grandma), then you have a problem.
WMBC freeform/independent online radio.
From: webmaster@strengths.com
Karma: It's all a bunch of tree-huggin' hippy crap!
Nigerian money scams would seem to me to be a security issue. .gifs and .jpegs of dubious construction could be considered a security issue.
HTML spams which call out for
HTML spams which contain scripts should be considered a security issue.
Spam messages claiming to be from Paypal or [your ISP] should be considered a serious security concern.
HTML spams which contain URLs with non-standard ports should virtually scream "security alert".
Spam containing pornography or links to pornography could thoroughly confound your HR and legal personnel who are charged with enforcement of certain anti-pornography policies.
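Of the items in this list, the non-standard port check is the simplest to automate. A minimal Python sketch, assuming only http and https links matter and that 80 and 443 are the only "standard" ports. The regular expression and the function name are hypothetical.

```python
import re
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}
URL_RE = re.compile(r"https?://[^\s\"'<>]+", re.IGNORECASE)


def urls_with_odd_ports(body):
    """Return the http(s) URLs in body that name a non-default port."""
    flagged = []
    for url in URL_RE.findall(body):
        try:
            parts = urlsplit(url)
            port = parts.port  # raises ValueError on a malformed port
        except ValueError:
            flagged.append(url)  # treat unparseable URLs as suspicious
            continue
        if port is not None and port != DEFAULT_PORTS.get(parts.scheme):
            flagged.append(url)
    return flagged


if __name__ == "__main__":
    sample = '<a href="http://example.com:8081/pay">click</a> http://example.org/'
    print(urls_with_odd_ports(sample))
```

A hit would only raise the message's score. Plenty of legitimate services run on other ports, so this should not reject mail by itself.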
My other car is a 1984 Nark Avenger.
Only 95% legitimate Latin - in the sense that the words are found in Latin (Pompey should be Pompeius, but the anglicised version scans better, 'inisat' is meaningless, AFAIK.) And grammatically, of course, it makes no sense whatsoever.
Still makes me giggle though.
Spam is rarely sent for the amusement of the spammer. It's sent because out of every million recipients at least one idiot will give them money.
Now think about it: in order to give them money they must have some means of contacting them, usually a Web site.
Moreover, the majority of the world's spam is sent by a very small number of people.
Simply find those people and make sure they are put into labor camps forever: international law.
I can't imagine any government standing up to world saying, "We stand for spam - outlawing it would be an outrage!"
And if one does, member nations simply block their IPs until they comply. There could be two Internets: one for 99.9% of the world, and a tiny community of spam-supporters.
Among signing nations, an international task force hunts down the spammer at the point of sale. Such a force would cost member nations about 0.00001% of what they'd spend dealing with spam by any other means.
[I'm replying to several posts in this subthread]
You guys amaze me. I don't think you understand the problem at all, but maybe I'm wrong. Fine, take half a minute, write the Perl code, post it, and we'll see. (My guess is we won't be seeing any code from either you or the "halfway competent C programmer".)
As for "mark anything with a lot of high ASCII characters as spam" guy, is everything except English spam? Maybe to you, but I wouldn't call a solution that only works for you much of a solution.
Almost all of my outgoing email now is UTF-8, and I take advantage of the much wider range of characters it provides. Make sure your 30 second algorithm doesn't mistake a non-ASCII charset for spam.
And what about source code? Do you ever get source code snippets in the mail? Take a few seconds and make sure your algorithm doesn't mistake source code in any programming language or technical acronyms for spam.
Okay, get ready to write code. I'm looking at my watch. Go!
"Those who have never entered upon scientific pursuits know not a tithe of the poetry by which they are surrounded."