The Evil in E-Mail

Posted by timothy on Sunday June 12, 2005 @12:08AM from the find-what-you-look-for dept.

Frenchy in Ontario writes "An Ontario university researcher is devising ways to help law enforcement agencies better pinpoint likely criminal behavior in e-mails. His theory is that people who are "up to something" are more likely to write differently than people who aren't - either by avoiding using certain words at all that could be flagged for possible criminal context (like "bombed") or to examine patterns that might indicate criminal activity - like several people e-mailing one person but not each other, which is how some criminal networks operate. There's also an interesting paragraph on why Enron's emails aren't as valuable as you might think for this sort of work."

25 of 211 comments

Dumbest thing I've read all week... by TripMaster+Monkey · 2005-06-12 00:09 · Score: 5, Insightful

From TFA:

Skillicorn doesn't know all the ways suspicious e-mails might read differently from innocent ones. The beauty of his approach is that he doesn't need to know. His software is designed simply to look for messages that are different, based on word frequencies, from the mass of e-mails. It needn't understand the reasons for the differences.

Super. I'm predicting a whole lot of false positives...especially during the initial phase of this operation...
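
For concreteness, the approach TFA describes (flag messages whose word frequencies differ from the mass of e-mails) could look roughly like the sketch below. This is only an illustration: the tokenizer, the cosine-distance measure and all function names are assumptions of mine, not Skillicorn's actual method, which TFA doesn't spell out.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a message and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def relative_freqs(tokens):
    """Map each word to its share of the tokens it came from."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()} if total else {}


def cosine_distance(p, q):
    """1 - cosine similarity between two sparse frequency vectors."""
    dot = sum(v * q.get(w, 0.0) for w, v in p.items())
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    if norm_p == 0 or norm_q == 0:
        return 1.0
    return 1.0 - dot / (norm_p * norm_q)


def rank_by_oddness(messages):
    """Return (distance, message) pairs, most unusual first.

    Each message is compared with the word frequencies of the whole corpus.
    """
    corpus = relative_freqs([t for m in messages for t in tokenize(m)])
    scored = [(cosine_distance(relative_freqs(tokenize(m)), corpus), m)
              for m in messages]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)
```

Note that "unusual" here only means "statistically different from the pile", which is exactly why I expect a lot of false positives.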

Also from TFA:

One difference might be the complete absence of words someone might possibly think would draw a law enforcement agency's attention to their e-mails, but that most people would occasionally use innocently (as in "my presentation yesterday really bombed.")

Great...so words like 'bombed' get the email flagged...as well as an absence of the word 'bombed'? So far, Skillicorn's test appears 100% sensitive...too bad it's 0% specific.

Some more from TFA:

A related trick, he says, is to examine patterns in who e-mails whom. As an example, in criminal networks it is common to find several people communicating regularly with the same person, but never with each other.

OMG! This is the pattern of emails in my company! My whole company is a giant terrorist organization! I had no idea!
/sarcasm

But here's the kicker...again with the quoting:

To help with his work, Skillicorn has been working with archives of e-mail from Enron Corp., the company at the heart of one of the most prominent scandals in recent U.S. business history. In some respects, he notes, the Enron e-mails are not a good sample for analysis, because Enron employees seemed to have no compunction about what they were doing. "People should feel some guilt or at least some self-consciousness when they're being deceptive," he says.

So let me get this straight...if criminals are okay with their criminal activity (like...say...terrorists), they'll 'slip under the radar'??? Great test, Skillicorn...sounds a lot like a standard polygraph test, which experienced criminals can fool at will, while innocent people fail them 50% of the time. That's what the War on Terror really needs...another inaccurate 'test' that does nothing but throw false positives.
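
To put rough numbers on the false-positive worry, here is a small Bayes' rule calculation. The figures are made up purely for illustration (nothing in TFA gives real rates), and the function name is my own; the point is only how the arithmetic plays out when the thing you're hunting is rare.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """Probability that a flagged e-mail is actually suspicious (Bayes' rule)."""
    true_pos = prevalence * sensitivity
    false_pos = (1.0 - prevalence) * (1.0 - specificity)
    return true_pos / (true_pos + false_pos)


# Hypothetical numbers, chosen only for illustration:
# 1 in 100,000 e-mails is "bad"; the test is 99% sensitive and 99% specific.
ppv = positive_predictive_value(prevalence=1e-5, sensitivity=0.99, specificity=0.99)
print(f"{ppv:.4%} of flagged e-mails are real hits")
```

With those assumed inputs, fewer than one flagged e-mail in a thousand is a real hit, even though the test sounds accurate on paper.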

I'm just glad that this method is so obviously stupid that it will never be implemented by our government...
Oh, wait...one more from TFA:

Such technology has obvious applications in surveillance by law enforcement and security bodies, but Skillicorn suspects agencies like the U.S. National Security Agency have little need of his help. "I infer from things they say around me that some of this stuff they already do," he says.

Crap.

--
____
~ |rip/\/\aster /\/\onkey
1. Re:Dumbest thing I've read all week... by The+FooMiester · 2005-06-12 00:25 · Score: 5, Interesting
  
  I especially liked the part about:
  
  Another, Skillicorn says, is that research shows
  people speak and write differently when they feel guilt about a
  subject, for instance using fewer first-person pronouns, like I and we.
  
  Because people always use first person pronouns in messages. That's just what's done. And a lot of them should be used.
  
  Sounds like a way to track messages with "substance" rather than the "hai h u r? heer are the pictures of my vacation." messages.
  
  Think about that. This man has just come up with a way to measure the relative interest of what the sender has to say to people in the government.
  
  Yet another way to cut down on the messages that the government has to read and be bored with. Yet another way to enable the government to read our communications more effectively.
  
  Yet another reason to look into using real encryption.
  
  --
  The previous has been a secret message to my comrades.
2. Re:Dumbest thing I've read all week... by Otter · 2005-06-12 00:43 · Score: 5, Insightful
  
  If I understand correctly, what he's done is this:
  
  1) Devised a theory
  
  2) Tested it on a sample set of emails from Enron
  
  3) Gotten poor results
  
  4) Blamed the failure on Enron, for being just *too* evil for his theory to work!
  
  Yawn. Maybe he should save the press release until he's gotten something to work.
  
  --
  What I'm listening to now on Pandora...
3. Re:Dumbest thing I've read all week... by danharan · 2005-06-12 01:05 · Score: 3, Insightful
  
  Super. I'm predicting a whole lot of false positives...especially during the initial phase of this operation...
  If using contrived language flags this system, I wouldn't want to be the one having to read all the false positives. I imagine I'd find out about a lot of affairs, rumours and backstabbing plans.
  
  --
  Information: "I want to be anthropomorphized"
4. Re:Dumbest thing I've read all week... by MrDomino · 2005-06-12 01:20 · Score: 3, Insightful
  
  Yet another reason to look into using real encryption.
  
  Yeah, sure, until using encryption is flagged as a likely indicator of criminal activity, too...
  
  Remember, if we don't all walk around with our pants down in public, that means that we've got something to hide.
What about other languages.? by guyfromindia · 2005-06-12 00:11 · Score: 3, Insightful

This may work well for English, etc., but may not work with other languages.
1. Re:What about other languages.? by Anonymous Coward · 2005-06-12 00:57 · Score: 4, Interesting
  
  That may not work either. There's that fine s-f Polish novel "Paradyzja" by Janusz Zajdel about a closed society in a space colony. The population was under constant surveillance and anyone questioning the government was immediately punished. Due to the amount of gathered data the government had to use automatic systems to find such people. So what the unhappy residents did was to develop a language based on metaphors and associations. To automated systems it looked like spoken poetry, while an intelligent listener easily got the point.
  It was written during the Cold War and of course referred to socialist governments of the time, but I see new parallels now.
if you're really up to no good.. by Keruo · 2005-06-12 00:17 · Score: 4, Insightful

The emails you send would be encrypted instead of plaintext.
Real criminals aren't dumb, only the bad ones who get caught are.

--
There are no atheists when recovering from tape backup.
Agreed by MarkusQ · 2005-06-12 00:19 · Score: 5, Funny

This line in the lead jumped out at me:
like several people e-mailing one person but not each other, which is how some criminal networks operate.
We have an address "techsupport@internaldomain" which matches this pattern to a T.
--MarkusQ

P.S. Back when we were on MS-Windows, it would have been OK, because the people asking for TechSupport were often sending each other worms at the same time.
1. Re:Agreed by ebuck · 2005-06-12 03:28 · Score: 5, Insightful
  
  Worse yet, people will be watched and harassed by this technology, but never brought to court over it.
  
  In a court, you can question the evidence used against you. Considering that the creator of this evidence indicated that he didn't need to know how it works, it's highly likely that you could get this evidence thrown out because it fails the test of provability.
  
  So this technology will "flag" people, and they will be watched "just in case". However, there's not going to be a court case, just continued monitoring until the budget to watch this person dries up. And it's very easy to get a bigger budget because you can argue, "We are watching 400,000 people who have been flagged as possible terrorists, we can't keep up. We need more money." Even when your flagging system has worse odds of finding a terrorist than the Lottery.
The idea isn't new... by Registered+Coward+v2 · 2005-06-12 00:19 · Score: 3, Insightful

Pattern recognition has been around a long time - from analyzing the causes of infection to finding likely cheats on expense reports (and the latter uses the frequencies of certain digits, rather than looking for the text entries).
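
The digit-frequency trick is usually associated with Benford's law: in many naturally occurring sets of amounts, a leading digit d shows up with probability log10(1 + 1/d). A minimal sketch of such a check, assuming the amounts are plain numbers; the function names are my own, and real audit tools are far more careful about sample size and statistics.

```python
import math
from collections import Counter


def benford_expected():
    """Expected share of each leading digit 1-9 under Benford's law."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def leading_digit(amount):
    """First significant digit of a non-zero number."""
    # Scientific notation puts the first significant digit first, e.g. '4.23e+02'.
    return int(f"{abs(amount):.15e}"[0])


def benford_deviation(amounts):
    """Sum of absolute gaps between observed and expected leading-digit shares."""
    digits = [leading_digit(a) for a in amounts if a != 0]
    if not digits:
        raise ValueError("need at least one non-zero amount")
    counts = Counter(digits)
    expected = benford_expected()
    return sum(abs(counts.get(d, 0) / len(digits) - expected[d])
               for d in range(1, 10))
```

A large deviation doesn't prove anyone cheated; it only says the figures look less "natural" than expected and might deserve a second look.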

I do disagree with his statement about not being useful to fight spam - recognizing patterns in spam is already in use. The idea is that many occurrences of the same words, whether from the same person to multiple users at the same site and at different sites, or in the same basic message sent to multiple users from different mailers / return addresses, might be a good indicator of spam. The challenge is how do you monitor all the traffic?

--
I'm a consultant - I convert gibberish into cash-flow.
Bad sample? by No+Such+Agency · 2005-06-12 00:26 · Score: 4, Insightful

Ah, my alma mater Queen's makes it onto Slashdot!

I don't know if using the Enron e-mails as his test material is such a good idea. Corporate malfeasance is probably not conducted the same way that every other criminal (or terrorist) network runs. At least their communication might be different due not to a "lack of guilt" but due to the fact that it's probably so easy to make a naughty memo sound like an innocent one without being obvious. After all these memos would be mixed in with a lot of legitimate company business the conspirators are also conducting.

How does automated analysis separate a memo saying "I think we should go ahead and promote Price out of the mailroom" - which means "Have Price-Waterhouse cook those spreadsheets I sent you", from one which just leads to some dude getting promoted out of the mailroom? Of course if they are not bothering to use code words then the system might work very well.

A related trick, he says, is to examine patterns in who e-mails whom. As an example, in criminal networks it is common to find several people communicating regularly with the same person, but never with each other. This is meant to ensure that if one lawbreaker is caught, he or she is unlikely to lead authorities to too many others. But it can also be a clue to suspicious activity.

Traffic analysis is probably more promising, since you can reconstruct relationships between players with it. The traffic pattern could look like a terrorist cell, or it could look like a bunch of guys who know each other - as he says, there's a difference. But this is old news, though automating it would make snoops' lives easier.
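
As a toy version of that traffic analysis, the sketch below looks for a "hub" whose correspondents never write to each other. The input format (a list of sender/recipient pairs), the function name and the min_spokes threshold are all assumptions for illustration; plenty of perfectly innocent mailboxes would match the same pattern.

```python
from collections import defaultdict
from itertools import combinations


def find_hubs(messages, min_spokes=3):
    """Return addresses whose contacts never e-mail one another.

    messages: iterable of (sender, recipient) pairs; direction is ignored.
    """
    neighbours = defaultdict(set)
    for sender, recipient in messages:
        if sender != recipient:
            neighbours[sender].add(recipient)
            neighbours[recipient].add(sender)

    hubs = []
    for person, contacts in list(neighbours.items()):
        if len(contacts) < min_spokes:
            continue
        # A hub's contacts must have no direct links among themselves.
        if all(b not in neighbours[a] for a, b in combinations(contacts, 2)):
            hubs.append(person)
    return sorted(hubs)
```

For example, `find_hubs([("a", "boss"), ("b", "boss"), ("boss", "c")])` reports `boss`, and a single message between `a` and `b` is enough to make the pattern disappear.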

At any rate I find this line of inquiry disturbing for civil rights reasons, but I don't believe we should attack the researcher for working on it. Academic freedom is a very useful concept and ultimately does us more good than harm, IMO.

--
Freedom: "I won't!"
Word bombs.... by Invalid+Character · 2005-06-12 00:26 · Score: 5, Funny

Bomb, jihad, kidnap, extort, terrorize, kill, godless, constitution.
That should keep me safe for a few days.

--
Registered .sig quotient : 1337
I can't believe this got funding... by bobbis.u · 2005-06-12 00:26 · Score: 4, Insightful

It seems like he is just using Bayesian filtering (the bit about how he doesn't know how it works gives it away), and using Enron emails for the training.
Personally, I can't see how this would ever work. It is typical of the attitude that "all terrorists are bad, they are all the same and we just have to deal with them all in the same way".

Isn't it obvious that different terrorists will have different styles, different levels of literacy, different levels of security awareness, different languages, different aims, different approaches - the list goes on and on. Normal emails all have these traits too. I can't imagine there is any way of applying Bayesian filtering to help with this task.
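
For anyone who hasn't met it, Bayesian filtering in the spam-filter sense boils down to something like the sketch below: count words per class in labelled training mail, then score new mail by log-probability. Whether Skillicorn's software works this way is only my guess from TFA; the class, the labels and any training sentences you feed it are invented for illustration.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Tiny multinomial naive Bayes text classifier with add-one smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of documents
        self.vocab = set()

    def train(self, text, label):
        """Record the words of one labelled training message."""
        words = text.lower().split()
        self.word_counts[label].update(words)
        self.doc_counts[label] += 1
        self.vocab.update(words)

    def log_score(self, text, label):
        """log P(label) plus the sum of log P(word | label) over the text."""
        total_docs = sum(self.doc_counts.values())
        score = math.log(self.doc_counts[label] / total_docs)
        label_total = sum(self.word_counts[label].values())
        for word in text.lower().split():
            count = self.word_counts[label][word]
            score += math.log((count + 1) / (label_total + len(self.vocab)))
        return score

    def classify(self, text):
        """Return the label with the highest log score."""
        return max(self.doc_counts, key=lambda label: self.log_score(text, label))
```

A filter like this only learns whatever separates the two piles of training mail, which is why a single corpus like Enron's looks like a shaky basis to me.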
Big Brother right here by m50d · 2005-06-12 00:27 · Score: 3, Insightful

He's just using statistics to detect emails that are "different". So, anyone who isn't conforming is flagged up. Organising an anti-war protest? There you are, flagged. Say goodbye to freedom, if you hadn't already. Or encrypt all your emails, and try and persuade everyone you know to. Maybe we can make encryption widespread enough these things are useless.

--
I am trolling
1. Re:Big Brother right here by Adult+film+producer · 2005-06-12 01:00 · Score: 5, Funny
  
  I don't think so. Strongly enforced U.S. export controls have kept Al Qaeda from gaining access to militant grade encryption tools.
Social Networks = Criminal Networks? by Anonymous Coward · 2005-06-12 00:28 · Score: 3, Insightful

...or to examine patterns that might indicate criminal activity - like several people e-mailing one person but not each other, which is how some criminal networks operate.

Not to mention most social networks. Or is everyone you know equally popular?
Oh dear by Anonymous Coward · 2005-06-12 00:42 · Score: 5, Insightful

Dr Skillicorn has obviously never done any work with or for a law enforcement or intelligence agency. After spending three years in this area working on data mining of electronic communication, I can say this fella has not done his research properly. He has failed to note that grammatical and spelling mistakes, let alone "missing" words, have become so frequent in the SMS TXT generation that they will cause a major problem when scanning messages on this scale. I really can't be bothered to pick any more holes in this because it is time for a bacon and ketchup sandwich.
Wasted effort by AndroidCat · 2005-06-12 00:56 · Score: 3, Funny

Everyone knows that you just have to check the evil bit. (Some terrorists may be sophisticated enough to tamper with the evil bit but if they use Windows, the lack of the bit will stick out like a sore thumb.)

--
One line blog. I hear that they're called Twitters now.
Do they believe in the effectiveness of this.....? by David+Webb · 2005-06-12 01:00 · Score: 3, Insightful

So it is now no longer good enough to just have the ability to subpoena your records if you're arrested? Now the government wants to actively sort/monitor the emails of an entire nation. Hmm, I smell more violations of the rights of the people. How much more of this are we willing to accept? How much longer until dissidents start a revolution? That's right, I said it: a revolution. This sounds like a combo of search/packet sniffing software. Last I heard, PGP and RSA encryption were still unbreakable. This will NOT be effective for the worst thieves or terrorists.
Government sucks. by Fantastic+Lad · 2005-06-12 01:32 · Score: 5, Interesting

"The more corrupt the state, the more numerous the laws."

-Tacitus

Government is already too invasive. I'm already forced to seek a building permit before I can erect a structure on my own property. The fines for ignoring this, (and say, having the gall to build a solar powered house which is not connected to the AC power grid, or (horrors!) a straw-bale house), are huge and the government's reasons for these laws are utterly ridiculous.

Any professor who suggests that we should be looking to monitor email content is not thinking clearly. The Government already has their nose in everything, and telling us that, "It's For Our Own Good," is NOT a valid excuse.

It's MUCH more important that people be able to make mistakes -and even die through their own faults- than live ensnared in the safe-keeping of a bunch of ignorant civil servants who are trying to build a Starfleet future where everybody dresses the same, and nobody is allowed to think or act outside a bunch of pre-set 'safe' boundaries designed for middle-class suburbanites who exist in eternal ignorance of the real world, who actually believe in the Discovery Channel, who drink milk, and live in absolute terror of anything you can't experience beyond the confines of a nice, respectable department store.

-FL
letter from college home by e**(i+pi)-1 · 2005-06-12 01:36 · Score: 5, Funny

Letter from College: Hi Mom, I blew it and bombed the final exam. The physics prof put the gun on my head and told me to work harder. I could kill him. I feel like having a knife at my throat. The anger feels like poison in my blood but I know it is my fault, and it can all be blamed on that virus I had been laboring with for quite a while. I'm working on it mom! I promise to make you proud. I can not wait to be on the subway home to work on my final project on weapons of mass destruction in my political science class. It's mental terror. Love Your son P.S. The powder you sent me works well for my skin infection. Strong agent.
Just stupid by Anonymous Coward · 2005-06-12 02:39 · Score: 3, Insightful

How many criminals are going to send plain text emails discussing criminal activities?
This is clearly just designed to appeal to the government of Police State America, probably to get more funding.
This whole obsession with 'terrorists' is just becoming tiring. There are very few 'terrorists' in the world that the Americans didn't create through their own acts of terror. If America would stop its interference in the affairs of other countries, there would probably be almost none at all outside of the White House.
A couple of reasons for it not to work by dustmite · 2005-06-12 03:13 · Score: 3, Interesting

- Many languages are conjunctive/agglutinating in nature (e.g. Turkish, Finnish, Swahili). This means that the words of a sentence aren't isolated (as in most European languages) but are in fact formed from 'parts' that change depending on the surrounding words. Moreover, modifying pre-/suffixes are used as inflections for e.g. verb paradigms. This results in languages that effectively have literally billions or even an infinite number of possible "words". It is impossible to do keyword-based analysis on such languages without a full morphological parser for each language to break a word into its 'parts' - and writing such a parser is a massive task.

- Chinese is the opposite: it is a totally "isolating" language, meaning each word is distinct with no inflections, and because different characters are used for different words there are NO SPACES between words. So you cannot begin to analyse Chinese data at all unless you have a full "Chinese segmenter" to locate word boundaries.

The need for disambiguation further complicates all of this analysis.

There is pretty much no way for this type of analysis to be really accurate under the current level of written language analysis technologies.
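
A quick way to see the segmentation problem: naive whitespace splitting, the first step of most simple keyword counters, returns an unsegmented Chinese sentence as one single "word". The example sentences and the function name are my own, and this only demonstrates the failure; it is not a segmenter.

```python
def whitespace_tokens(text):
    """Split on whitespace, the way a naive keyword counter would."""
    return text.split()


english = "my presentation yesterday really bombed"
chinese = "我昨天的演讲彻底失败了"  # roughly: "my talk yesterday failed completely"

print(len(whitespace_tokens(english)))  # 5
print(len(whitespace_tokens(chinese)))  # 1
```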
I concur by flowerp · 2005-06-12 03:34 · Score: 3, Funny

Yes the people who are "up to something" actually write differently. Most of the time they use phrases like "validate your bank account",
"please verify your credit card information", etc.

--
--- Eat my sig.