The Evil in E-Mail

Dumbest thing I've read all week... by TripMaster Monkey · 2005-06-12 00:09 · Score: 5, Insightful

From TFA:

Skillicorn doesn't know all the ways suspicious e-mails might read differently from innocent ones. The beauty of his approach is that he doesn't need to know. His software is designed simply to look for messages that are different, based on word frequencies, from the mass of e-mails. It needn't understand the reasons for the differences.

Super. I'm predicting a whole lot of false positives...especially during the initial phase of this operation...
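
To make the false-positive worry concrete, here is a minimal sketch of what "different, based on word frequencies" could mean. It is not Skillicorn's software (TFA gives no implementation details); the rarity score, the sample messages, and every name below are assumptions made up for illustration.

```python
import math
from collections import Counter

PUNCTUATION = ".,!?\"'()"


def tokenize(text):
    """Lowercase the text, split on whitespace, and strip basic punctuation."""
    words = (w.strip(PUNCTUATION) for w in text.lower().split())
    return [w for w in words if w]


def surprise_scores(messages):
    """Score each message by the average rarity of its words in the corpus.

    A word's rarity is -log(share of messages containing it), so a word
    that appears in every message contributes 0.
    """
    n = len(messages)
    token_sets = [set(tokenize(m)) for m in messages]
    doc_freq = Counter(w for tokens in token_sets for w in tokens)
    scores = []
    for tokens in token_sets:
        if not tokens:
            scores.append(0.0)
            continue
        total = sum(-math.log(doc_freq[w] / n) for w in tokens)
        scores.append(total / len(tokens))
    return scores


# Hypothetical corpus: three routine office mails and one innocent oddball.
messages = [
    "meeting at noon about the budget",
    "meeting at noon about the schedule",
    "lunch at noon about the budget",
    "my presentation yesterday really bombed",
]
scores = surprise_scores(messages)
most_unusual = max(range(len(messages)), key=scores.__getitem__)
print(messages[most_unusual])
```

In this toy corpus the message that stands out is the harmless one about the presentation, simply because none of its words appear anywhere else. A scorer like this has no notion of why a message is different, which is the point being made above.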

Also from TFA:

One difference might be the complete absence of words someone might possibly think would draw a law enforcement agency's attention to their e-mails, but that most people would occasionally use innocently (as in "my presentation yesterday really bombed.")

Great...so words like 'bombed' get the email flagged...as well as an absence of the word 'bombed'? So far, Skillicorn's test appears 100% sensitive...too bad it's 0% specific.

Some more from TFA:

A related trick, he says, is to examine patterns in who e-mails whom. As an example, in criminal networks it is common to find several people communicating regularly with the same person, but never with each other.

OMG! This is the pattern of emails in my company! My whole company is a giant terrorist organization! I had no idea!

/sarcasm

But here's the kicker...again with the quoting:

To help with his work, Skillicorn has been working with archives of e-mail from Enron Corp., the company at the heart of one of the most prominent scandals in recent U.S. business history. In some respects, he notes, the Enron e-mails are not a good sample for analysis, because Enron employees seemed to have no compunction about what they were doing. "People should feel some guilt or at least some self-consciousness when they're being deceptive," he says.

So let me get this straight...if criminals are okay with their criminal activity (like...say...terrorists), they'll 'slip under the radar'??? Great test, Skillicorn...sounds a lot like a standard polygraph test, which experienced criminals can fool at will, while innocent people fail them 50% of the time. That's what the War on Terror really needs...another inaccurate 'test' that does nothing but throw false positives.

I'm just glad that this method is so obviously stupid that it will never be implemented by our government...
Oh, wait...one more from TFA:

Such technology has obvious applications in surveillance by law enforcement and security bodies, but Skillicorn suspects agencies like the U.S. National Security Agency have little need of his help. "I infer from things they say around me that some of this stuff they already do," he says.

Crap.

--
____

~ |rip/\/\aster /\/\onkey

Re:Dumbest thing I've read all week... by mog007 · 2005-06-12 00:24 · Score: 2, Insightful

I say this guy do something useful with his time, and go after the REAL evil in email:

SPAM.

--
Learn something new.
Re:Dumbest thing I've read all week... by The FooMiester · 2005-06-12 00:25 · Score: 5, Interesting

I especially liked the part about:

Another, Skillicorn says, is that research shows
people speak and write differently when they feel guilt about a
subject, for instance using fewer first-person pronouns, like I and we.

Because people always use first person pronouns in messages. That's just what's done. And a lot of them should be used.

Sounds like a way to track messages with "substance" rather than the "hai h u r? heer are the pictures of my vacation." messages.

Think about that. This man has just come up with a way to measure the relative interest of what the sender has to say to people in the government.

Yet another way to cut down on the messages that the government has to read and be bored with. Yet another way to enable the government to read our communications more effectively.

Yet another reason to look into using real encryption.

--
The previous has been a secret message to my comrades.
Re:Dumbest thing I've read all week... by ergo98 · 2005-06-12 00:33 · Score: 2, Interesting

So very, very true. I'd support the guy just because he's a fellow Ontarian, but there is nothing in this article of any substance or worth, and it sounds like a giant heap of grant-sucking bullshit. I think the "researcher" caught the season premiere of "Numbers", one in which they caught the criminal based on exclusion of activity (e.g. he committed crimes in the area around his stomping grounds, excluding where he lived and worked), and thought he could rationalize some nonsense about email analysis.
Re:Dumbest thing I've read all week... by Otter · 2005-06-12 00:43 · Score: 5, Insightful

If I understand correctly, what he's done is this:

1) Devised a theory

2) Tested it on a sample set of emails from Enron

3) Gotten poor results

4) Blamed the failure on Enron, for being just *too* evil for his theory to work!

Yawn. Maybe he should save the press release until he's gotten something to work.

--
What I'm listening to now on Pandora...
Re:Dumbest thing I've read all week... by danharan · 2005-06-12 01:05 · Score: 3, Insightful

Super. I'm predicting a whole lot of false positives...especially during the initial phase of this operation...
If using contrived language flags this system, I wouldn't want to be the one having to read all the false positives. I imagine I'd find out about a lot of affairs, rumours and backstabbing plans.

--
Information: "I want to be anthropomorphized"
Re:Dumbest thing I've read all week... by MrDomino · 2005-06-12 01:20 · Score: 3, Insightful

Yet another reason to look into using real encryption.

Yeah, sure, until using encryption is flagged as a likely indicator of criminal activity, too...

Remember, if we don't all walk around with our pants down in public, that means that we've got something to hide.
Re:Dumbest thing I've read all week... by kfg · 2005-06-12 01:55 · Score: 2, Insightful

"I infer from things they say around me that some of this stuff they already do," he says.

Crap.

But of course. It is the nature of the monitoring beast and the very reason such monitoring is offensive to freedom.

First you monitor. Then you monitor for the people avoiding the monitoring. Then you monitor for the people avoiding the . . .

Monitoring, if it is to work at all, is an all or nothing sort of deal. Once started it innately progresses toward the end of a secret cop in every pocket. If you know they are monitoring, you know they are heading toward this point, if not already there.

But that's ok, you have nothing to hide, do you. . .comrade?

KFG

What about other languages.? by guyfromindia · 2005-06-12 00:11 · Score: 3, Insightful

This may work well for English, etc., but may not work with other languages.

Re:What about other languages.? by muszek · 2005-06-12 00:19 · Score: 2, Funny

You don't watch enough TV. Terrorists, just like aliens, always speak English to themselves.
Re:What about other languages.? by Anonymous Coward · 2005-06-12 00:57 · Score: 4, Interesting

That may not work either. There's that fine Polish s-f novel "Paradyzja" by Janusz Zajdel about a closed society in a space colony. The population was under constant surveillance and anyone questioning the government was immediately punished. Due to the amount of gathered data, the government had to use automatic systems to find such people. So what the unhappy residents did was to develop a language based on metaphors and associations. To automated systems it looked like spoken poetry, while an intelligent listener easily got the point.
It was written during the Cold War and of course referred to the socialist governments of the time, but I see new parallels now.
Re:What about other languages.? by Alsee · 2005-06-12 02:46 · Score: 2, Insightful

Duh, if they're writing in some other language then they are ALREADY at the top of the potential criminal/terrorist lists.

-

--
- - You can't take something off the Internet! That's like trying to take pee out of a swimming pool.

if you're really up no good.. by Keruo · 2005-06-12 00:17 · Score: 4, Insightful

The emails you send would be encrypted instead of plaintext.
Real criminals aren't dumb, only the bad ones who get caught are.

--
There are no atheists when recovering from tape backup.

Agreed by MarkusQ · 2005-06-12 00:19 · Score: 5, Funny

This line in the lead jumped out at me:

like several people e-mailing one person but not each other, which is how some criminal networks operate.

We have an address, "techsupport@internaldomain", which matches this pattern to a T.

--MarkusQ

P.S. Back when we were on MS-Windows, it would have been OK, because the people asking for TechSupport were often sending each other worms at the same time.

Re:Agreed by Anonymous Coward · 2005-06-12 00:42 · Score: 2, Funny

Well, everyone knows end-users are terrorists.
Re:Agreed by ebuck · 2005-06-12 03:28 · Score: 5, Insightful

Worse yet, people will be watched and harassed by this technology, but never brought to court over it.

In a court, you can question the evidence used against you. Considering that the creator of this evidence indicated that he didn't need to know how it works, it's highly likely that you could get this evidence thrown out because it fails the test of provability.

So this technology will "flag" people, and they will be watched "just in case". However, there's not going to be a court case, just continued monitoring until the budget to watch this person dries up. And it's very easy to get a bigger budget because you can argue, "We are watching 400,000 people who have been flagged as possible terrorists, we can't keep up. We need more money." Even when your flagging system has worse odds of finding a terrorist than the Lottery.
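
The arithmetic behind that can be sketched with Bayes-style bookkeeping. Every number below is an assumption picked for illustration (a population of 40 million mail users, 100 real targets, and a filter that is right 99% of the time both ways); nothing here comes from TFA.

```python
def flagged_breakdown(population, true_targets, sensitivity, false_positive_rate):
    """Return (true positives, false positives, share of flags that are real).

    sensitivity:         fraction of real targets the filter flags
    false_positive_rate: fraction of innocent people the filter flags
    """
    true_pos = true_targets * sensitivity
    false_pos = (population - true_targets) * false_positive_rate
    precision = true_pos / (true_pos + false_pos)
    return true_pos, false_pos, precision


# Hypothetical inputs, chosen only to show the shape of the problem.
true_pos, false_pos, precision = flagged_breakdown(
    population=40_000_000,
    true_targets=100,
    sensitivity=0.99,
    false_positive_rate=0.01,
)
print(f"flagged in total: {true_pos + false_pos:,.0f}")
print(f"real targets among them: {true_pos:,.0f}")
print(f"chance a flagged person is a real target: {precision:.4%}")
```

Under those assumed inputs about 400,000 people get flagged and roughly 99 of them are real targets, so a given flag is right about once in four thousand. The better the filter sounds on paper, the easier the "we can't keep up" budget argument gets.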
Re:Agreed by Master of Transhuman · 2005-06-12 09:21 · Score: 2, Funny

Memo to SAIC:

Suspected terrorist has posted an apparently coded message on Slashdot indicating connections with terrorist supporters in Middle Eastern countries.

Suspect has possible sexual relations with both his wife and his sister based on frequency of email contacts.

Suspect is apparently concealing his connections with his wife's mother from his wife. His wife, however, is also in contact with the terrorist leader. Indications are his wife is part of a different cell than the suspect. [See "Mr. and Mrs. Smith" for operational details.]

Flag for further analysis by the understaffed translation department. Under no circumstances let Sibel Edmonds translate these communications.

Robert Mueller
Director, FBI

--
Richard Steven Hack - This sig is TOO GODDAMN SHORT TO DO ANYTHING USEFUL WITH! MORONS!

The idea isn't new... by Registered Coward v2 · 2005-06-12 00:19 · Score: 3, Insightful

Pattern recognition has been around a long time - from analyzing the causes of infection to finding likely cheats on expense reports (and the latter uses the frequencies of certain digits, rather than looking for the text entries).
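
The digit-frequency check mentioned above is usually associated with Benford's law, which says the leading digit d of many naturally occurring amounts shows up with probability log10(1 + 1/d). Assuming that is the technique meant here, a minimal sketch looks like this; the function names and the sample figures are made up for illustration.

```python
import math
from collections import Counter


def benford_expected(digit):
    """Expected share of values whose leading digit is `digit` (1-9)."""
    return math.log10(1 + 1 / digit)


def leading_digit(amount):
    """Return the first non-zero digit of a non-zero amount."""
    for ch in f"{abs(amount):.2f}":
        if ch in "123456789":
            return int(ch)
    raise ValueError("amount has no non-zero digit")


def leading_digit_deviation(amounts):
    """Sum of absolute gaps between observed and Benford leading-digit shares."""
    counts = Counter(leading_digit(a) for a in amounts)
    total = len(amounts)
    return sum(
        abs(counts.get(d, 0) / total - benford_expected(d)) for d in range(1, 10)
    )


# Powers of two are a standard example of data that follows Benford closely.
natural_looking = [2 ** k for k in range(1, 101)]
# Hypothetical expense claims, all parked just under a 50.00 approval limit.
suspicious = [49.50, 48.75, 47.20, 49.99, 46.10, 48.00, 49.25, 47.80]

print(leading_digit_deviation(natural_looking))
print(leading_digit_deviation(suspicious))
```

The second list scores far worse than the first because every claim starts with a 4. A high deviation is only a reason to look closer, not proof of cheating: plenty of honest data (prices with a fixed range, for instance) does not follow Benford either.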

I do disagree with his statement about not being useful to fight spam - recognizing patterns in spam is already in use. The idea is that the same, or largely the same, words arriving from one sender to many users at the same site and at different sites, or the same basic message sent to multiple users from different mailers / return addresses, might be a good indicator of spam. The challenge is how do you monitor all the traffic?

--
I'm a consultant - I convert gibberish into cash-flow.

Someone set us up the Bombed by FidelCatsro · 2005-06-12 00:20 · Score: 2, Insightful

This will be a total BOMB. Honestly, this is not a new field of science at all. Letters and writing have been examined for years, and criminals writing e-mails will be writing the same things they always write.

--
The only things certain in war are Propaganda and Death. You can never be sure which is which though

Bad sample? by No Such Agency · 2005-06-12 00:26 · Score: 4, Insightful

Ah, my alma mater Queen's makes it onto Slashdot!

I don't know if using the Enron e-mails as his test material is such a good idea. Corporate malfeasance is probably not conducted the same way that every other criminal (or terrorist) network runs. At least their communication might be different due not to a "lack of guilt" but due to the fact that it's probably so easy to make a naughty memo sound like an innocent one without being obvious. After all these memos would be mixed in with a lot of legitimate company business the conspirators are also conducting.

How does automated analysis separate a memo saying "I think we should go ahead and promote Price out of the mailroom" - which means "Have Price-Waterhouse cook those spreadsheets I sent you", from one which just leads to some dude getting promoted out of the mailroom? Of course if they are not bothering to use code words then the system might work very well.

A related trick, he says, is to examine patterns in who e-mails whom. As an example, in criminal networks it is common to find several people communicating regularly with the same person, but never with each other. This is meant to ensure that if one lawbreaker is caught, he or she is unlikely to lead authorities to too many others. But it can also be a clue to suspicious activity.

Traffic analysis is probably more promising, since you can reconstruct relationships between players with it. The traffic pattern could look like a terrorist cell, or it could look like a bunch of guys who know each other - as he says, there's a difference. But this is old news, though automating it would make snoops' lives easier.
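
A rough sketch of the "several people e-mail one person but never each other" check is below. It works purely on who-mails-whom pairs; the function name, the `min_spokes` threshold, and the sample traffic are all hypothetical, and real traffic analysis would need far more than this.

```python
from collections import defaultdict


def find_isolated_hubs(emails, min_spokes=3):
    """Return people whose correspondents never e-mail one another.

    `emails` is an iterable of (sender, recipient) pairs. Direction is
    ignored: any message in either direction counts as contact.
    """
    contacts = defaultdict(set)
    for sender, recipient in emails:
        if sender != recipient:
            contacts[sender].add(recipient)
            contacts[recipient].add(sender)

    hubs = []
    for person, spokes in contacts.items():
        if len(spokes) < min_spokes:
            continue
        # Qualifies only if no contact of `person` is in touch with another one.
        if all(contacts[s].isdisjoint(spokes) for s in spokes):
            hubs.append(person)
    return sorted(hubs)


# Hypothetical traffic: a help-desk address plus a group of friends.
emails = [
    ("ann", "techsupport"),
    ("bob", "techsupport"),
    ("cat", "techsupport"),
    ("dan", "eve"),
    ("eve", "fay"),
    ("fay", "dan"),
    ("dan", "gus"),
]
print(find_isolated_hubs(emails))
```

On this sample the only "cell leader" found is the help-desk mailbox, while the group of friends passes because two of dan's contacts also talk to each other. The structure alone cannot tell a cell from a support queue, which is why the pattern is a clue at best.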

At any rate I find this line of inquiry disturbing for civil rights reasons, but I don't believe we should attack the researcher for working on it. Academic freedom is a very useful concept and ultimately does us more good than harm, IMO.

--
Freedom: "I won't!"

Whatever by M3rk1n_Muffl3y · 2005-06-12 00:26 · Score: 2, Insightful

I am sure this will prove to be as productive as searching eBay images for hidden Al-Qaeda messages.

--
This is not the sig you are looking for...

Word bombs.... by Invalid Character · 2005-06-12 00:26 · Score: 5, Funny

Bomd, jihad, kidnap, extort, terrorize, kill, godless, constitution.

That should keep me safe for a few days.

--

--

Registered .sig quotient : 1337

I can't believe this got funding... by bobbis.u · 2005-06-12 00:26 · Score: 4, Insightful

It seems like he is just using Bayesian filtering (the bit about how he doesn't know how it works gives it away), and using Enron emails for the training.

Personally, I can't see how this would ever work. It is typical of the attitude that "all terrorists are bad, they are all the same and we just have to deal with them all in the same way".

Isn't it obvious that different terrorists will have different styles, different levels of literacy, different levels of security awareness, different languages, different aims, different approaches - the list goes on and on. Normal emails all have these traits too. I can't imagine there is any way of applying Bayesian filtering to help with this task.

GPG by Nicholas Evans · 2005-06-12 00:26 · Score: 2, Insightful

I'm going to go out on a limb here and say that Al Qaeda probably uses GPG or some other form of strong encryption in their e-mails.

--
The Yasashii Syndicate ||

Big Brother right here by m50d · 2005-06-12 00:27 · Score: 3, Insightful

He's just using statistics to detect emails that are "different". So, anyone who isn't conforming is flagged up. Organising an anti-war protest? There you are, flagged. Say goodbye to freedom, if you hadn't already. Or encrypt all your emails, and try and persuade everyone you know to. Maybe we can make encryption widespread enough these things are useless.

--
I am trolling

Re:Big Brother right here by Adult film producer · 2005-06-12 01:00 · Score: 5, Funny

I don't think so. Strongly enforced U.S. export controls have kept Al Qaeda from gaining access to militant grade encryption tools.

Social Networks = Criminal Networks? by Anonymous Coward · 2005-06-12 00:28 · Score: 3, Insightful

...or to examine patterns that might indicate criminal activity - like several people e-mailing one person but not each other, which is how some criminal networks operate.

Not to mention most social networks. Or is everyone you know equally popular?

Too narrow by Anonymous Coward · 2005-06-12 00:39 · Score: 2, Interesting

This reminds me of a Perl module, Text::Gender or something, which I tried out in a few experiments last year. It is supposed to analyse writing and determine whether its author is female or male.

It works rather well provided the author is also American, white and middle class. Any samples outside that field and it fails spectacularly, actually getting more wrong than right (worse than chance).

These sorts of ideas are cute in their ambitions but not science of any kind at all. The tests given in the email analysis article are even more woolly still. It sort of annoys me as a scientist that standards have sunk so low and funding is available for harebrained capers like the one described in the article.

Oh dear by Anonymous Coward · 2005-06-12 00:42 · Score: 5, Insightful

Dr Skillicorn has obviously never done any work with or for a law enforcement or intelligence agency. After spending three years in this area working on data mining of electronic communication, I can say this fella has not done his research properly. He has failed to note that grammatical and spelling mistakes, let alone "missing" words, have become so frequent in the SMS TXT generation that they will cause a major problem when scanning messages on this scale. I really can't be bothered to pick any more holes in this because it is time for a bacon and ketchup sandwich.

Bah by shobadobs · 2005-06-12 00:42 · Score: 2, Funny

Every time I hear one of these stories about how they can catch criminals from their email messages, I'm like, "OMG! They made a fast factoring algorithm!" But then I read the article and discover it only works for unencrypted messages. Gee.

you encrypt - you're a terrorist by l3v1 · 2005-06-12 00:54 · Score: 2, Interesting

Just remember a not-so-old story in which the presence of e-mail encryption software was reportedly considered evidence in some child porn case.

First they start using some very un-smart word-scanning piece of crap filtering system [and god help you if you write foreign language letters, or have a different style than the average], then they will punish the use of mail signing and encryption software [which is something I regularly do], then if the filtering still has a false positive rate above 99% they will ban e-mailing. Then they will find out other forms of efficient communication exist.

--
I am putting myself to the fullest possible use, which is all I can think that any conscious entity can ever hope to do.

Wasted effort by AndroidCat · 2005-06-12 00:56 · Score: 3, Funny

Everyone knows that you just have to check the evil bit. (Some terrorists may be sophisticated enough to tamper with the evil bit but if they use Windows, the lack of the bit will stick out like a sore thumb.)

--
One line blog. I hear that they're called Twitters now.

Do they believe in the effectiveness of this.....? by David Webb · 2005-06-12 01:00 · Score: 3, Insightful

So it is no longer good enough just to have the ability to subpoena your records if you're arrested? Now the government wants to actively sort/monitor the emails of an entire nation. Hmm, I smell more violations of the rights of the people. How much more of this are we willing to accept? How much longer until dissidents start a revolution? That's right, I said it: a revolution. This sounds like a combo of search/packet-sniffing software. Last I heard, PGP and RSA encryption were still unbreakable. This will NOT be effective against the worst thieves or terrorists.

this research is a wonderful example of... by cahiha · 2005-06-12 01:06 · Score: 2, Insightful

Graduate students, take notice. This research is a wonderful example of ...

- going where the wind is blowing; that gives you media coverage and funding from people who know even less than you

- not doing your background research; doing your background research would just discourage you, and it takes time that isn't required for convincing people who know less than you that your sexy proposal is worth funding

Oh Dear by taskforce · 2005-06-12 01:29 · Score: 2, Insightful

either by avoiding using certain words at all that could be flagged for possible criminal context (like "bombed")

So if you don't talk about things which a terrorist would talk about, you are a terrorist?

like several people e-mailing one person but not each other, which is how some criminal networks operate.

Yes, it's also how every other nuclear network of friends operates. Not all my friends know each other. Not all a bank's customers know each other, not all a mailing list's users know each other.

--
My 3D Texturing Skinning work (under construction)

Government sucks. by Fantastic Lad · 2005-06-12 01:32 · Score: 5, Interesting

"The more corrupt the state, the more numerous the laws."

-Tacitus

Government is already too invasive. I'm already forced to seek a building permit before I can erect a structure on my own property. The fines for ignoring this, (and say, having the gall to build a solar powered house which is not connected to the AC power grid, or (horrors!) a straw-bale house), are huge and the government's reasons for these laws are utterly ridiculous.

Any professor who suggests that we should be looking to monitor email content is not thinking clearly. The Government already has their nose in everything, and telling us that, "It's For Our Own Good," is NOT a valid excuse.

It's MUCH more important that people be able to make mistakes -and even die through their own faults- than live ensnared in the safe-keeping of a bunch of ignorant civil servants who are trying to build a Starfleet future where everybody dresses the same, and nobody is allowed to think or act outside a bunch of pre-set 'safe' boundaries designed for middle-class suburbanites who exist in eternal ignorance of the real world, who actually believe in the Discovery Channel, who drink milk, and live in absolute terror of anything you can't experience beyond the confines of a nice, respectable department store.

-FL

letter from college home by e**(i pi)-1 · 2005-06-12 01:36 · Score: 5, Funny

Letter from College: Hi Mom, I blew it and bombed the final exam. The physics prof put the gun to my head and told me to work harder. I could kill him. I feel like I have a knife at my throat. The anger feels like poison in my blood, but I know it is my fault, and it can all be blamed on that virus I had been laboring with for quite a while. I'm working on it, Mom! I promise to make you proud. I cannot wait to be on the subway home to work on my final project on weapons of mass destruction in my political science class. It's mental terror. Love, your son. P.S. The powder you sent me works well for my skin infection. Strong agent.

He must be up to something fishy. by Nordberg · 2005-06-12 02:17 · Score: 2, Funny

This email doesn't contain the words r0lexx, v!/\gr4 or c14ll4s. It sticks out like a sore thumb from 99% of the email traffic we've intercepted, he must be up to no good!!!!

--
*Splort*

Been There Done That by anat0010 · 2005-06-12 02:23 · Score: 2, Interesting

Statistical analysis of word (token) frequency works great in a closed domain set, such as the Enron corpus. But once you scale up to the ISP level it falls down horribly.
Why? The size of the token database increases massively, to the point where it becomes unmaintainable. Every spelling mistake, word variant, not to mention foreign language, gets included. Eventually you are unable to see the wood for the trees, let alone make statistically significant assertions about a single message.
And let's not even mention the fact that all the work on detecting deception in correspondence has been done on English-language text. Those pesky al-Qaeda types tend to speak Arabic. So before you can even begin to detect dodgy emails written by al-Qaeda, you need to construct a written Arabic parser. Then you need access to a large corpus of Arabic emails (if you have one I'd be very interested too). Then you need to research the lexical rules that tend to signify deceptive Arabic.
It's an interesting problem, but not even trained and experienced intelligence operatives are able to routinely detect deceptive correspondence, so coding that algorithm is quite tricky.

This is a good place to start:
http://doi.ieeecomputersociety.org/10.1109/HICSS.2004.1265082

Just stupid by Anonymous Coward · 2005-06-12 02:39 · Score: 3, Insightful

How many criminals are going to send plain text emails discussing criminal activities?
This is clearly just designed to appeal to the government of Police State America, probably to get more funding.
This whole obsession with 'terrorists' is just becoming tiring. There are very few 'terrorists' in the world that the Americans didn't create through their own acts of terror. If America would stop its interference in the affairs of other countries, there would probably be almost none at all outside of the White House.

A couple of reasons for it not to work by dustmite · 2005-06-12 03:13 · Score: 3, Interesting

- Many languages are conjunctive/agglutinating in nature (e.g. Turkish, Finnish, Swahili). This means that the words of a sentence aren't isolated units (as in most European languages) but are in fact formed from 'parts' that change depending on the surrounding words. Moreover, modifying pre-/suffixes are used as inflections for e.g. verb paradigms. This results in languages that effectively have literally billions, or even an infinite number, of possible "words". It is impossible to do keyword-based analysis on such languages without a full morphological parser for each language to break a word into its 'parts' - such a parser is a massive task.

- Chinese is the opposite: it is a totally "isolating" language, meaning each word is distinct with no inflections, and because different characters are used for different words there are NO SPACES between words. So you cannot begin to analyse Chinese data at all unless you have a full "Chinese segmenter" to locate word boundaries.

The need for further disambiguation complicates all of this analysis even more.

There is pretty much no way for this type of analysis to be really accurate under the current level of written language analysis technologies.
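
The segmentation point is easy to see with the most naive tokenizer there is, splitting on whitespace. The snippet below is only a demonstration of the failure, not a segmenter; the function name and both sample sentences are made up for illustration.

```python
def whitespace_tokens(text):
    """Split text on runs of whitespace, as a naive keyword counter would."""
    return text.split()


english = "I like learning Chinese"
chinese = "我喜欢学习中文"  # the same sentence, written without spaces

print(len(whitespace_tokens(english)))  # 4
print(len(whitespace_tokens(chinese)))  # 1
```

The English sentence comes out as four countable words, while the Chinese one comes out as a single unbroken token, so any word-frequency table built this way is meaningless until a real segmenter has run first.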

ATTN: department of homeland security by Vicsun · 2005-06-12 03:33 · Score: 2, Funny

terrorist bomb al qaeda bin laden firebomb death destruction chaos terror plane WMD nuclear weapons

I concur by flowerp · 2005-06-12 03:34 · Score: 3, Funny

Yes, the people who are "up to something" actually write differently. Most of the time they use phrases like "validate your bank account", "please verify your credit card information", etc.

--
--- Eat my sig.

Re:War is Peace by tom's a-cold · 2005-06-12 07:16 · Score: 2, Interesting

we're doomed to live a nightmare where everyone is guilty.

More opportunities for bribery and blackmail in a situation where there's a high rate of false positives and ambiguous or secret regulations that anyone could potentially be found guilty of. Then they can cultivate a huge population of informers, with even more shakedown possibilities as a result.

Then all that will be left is futile, self-destructive petty rebellion.

--
Get your teeth into a small slice: the cake of liberty
