Slashdot Mirror


Unmasking Anonymous Email Senders

alphadogg writes "Just because you send an email anonymously doesn't mean people can't figure out who you are anymore. A new technique developed by researchers at Concordia University in Quebec could be used to unmask would-be anonymous emailers by sniffing out patterns in their writing style from use of all lowercase letters to common typos. Their research, published in the journal Digital Investigation, describes techniques that could be used to serve up evidence in court, giving law enforcement more detailed information than a simple IP address can produce."

34 of 204 comments

  1. Pretty print it first by gatkinso · · Score: 4, Interesting

    run it thru pretty print or some other formatter before sending it.

    --
    I am very small, utmostly microscopic.
    1. Re:Pretty print it first by Anonymous Coward · · Score: 5, Insightful

      They seriously think an 80% success rate is good enough to be used in court?

      I'm betting the real reason is so they can go to a judge with their pseudo-evidence to get a warrant for more invasive spying.

    2. Re:Pretty print it first by Rob+the+Bold · · Score: 2

      run it thru pretty print or some other formatter before sending it.

      If it's looking for writing style, not just punctuation, spacing, caps, etc., then you might also want to do an auto-translate back and forth from your language. But that would potentially provide another way to find you if you used an online translator.

      --
      I am not a crackpot.
    3. Re:Pretty print it first by Skarecrow77 · · Score: 2

      I can't even see something as good as an 80% match rate on anything less than a full page of text; you'd need a damn huge sample size if you're going to be using typos and capitalization as "fingerprinting".

      Also, doesn't this mean that a simple spellchecker and the auto-capitalization function on many smartphones would defeat this technology?
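A rough sketch of the kind of surface normalization being suggested here (spacing, repeated punctuation, sentence capitalization). The function name `normalize` is made up for illustration, and this only flattens mechanical habits; it does nothing about word choice or sentence structure, and whether it would defeat the researchers' technique is not something the thread establishes.

```python
import re

def normalize(text: str) -> str:
    """Flatten a few surface habits: spacing, repeated punctuation, capitalization."""
    # Collapse every run of whitespace (spaces, tabs, newlines) to one space.
    text = re.sub(r"\s+", " ", text.strip())
    # Collapse a repeated terminator such as "!!!" or ".." to a single character.
    text = re.sub(r"([!?.])\1+", r"\1", text)
    # Split into sentences at a terminator followed by a space.
    sentences = re.split(r"(?<=[.!?]) ", text)
    # Upper-case the first character of each sentence, leave the rest alone.
    return " ".join(s[:1].upper() + s[1:] for s in sentences)
```

A spellchecker pass would be a separate step; it is left out here because it needs a dictionary.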

    4. Re:Pretty print it first by Anonymous Coward · · Score: 4, Interesting

      According to Wikipedia an 80% success rate is good enough for most civil cases, and indictment for criminal cases. These are based on a "preponderance of the evidence," or "more likely than not" standard (>50%). Criminal case decisions are based on a standard of "clear and convincing evidence," but 80% would be more than enough to get them in the door.

      http://en.wikipedia.org/wiki/Legal_burden_of_proof#Examples

    5. Re:Pretty print it first by spun · · Score: 3, Informative

      They seriously think an 80% success rate is good enough to be used in court?

      Why not? 19 states and many countries still admit polygraph tests into court, despite the fact that they are wildly inaccurate, and people can be specifically taught to deceive them.

      http://en.wikipedia.org/wiki/Polygraph#Validity

      --
      - None can love freedom heartily, but good men; the rest love not freedom, but license. -- John Milton
    6. Re:Pretty print it first by OzPeter · · Score: 4, Informative

      run it thru pretty print or some other formatter before sending it.

      Nah .. run it twice through Google translate

      Nah .. ejecutarlo dos veces a través de Google Translate

      Nah .. twice run through Google Translate

      --
      I am Slashdot. Are you Slashdot as well?
    7. Re:Pretty print it first by spun · · Score: 5, Funny

      "It's not a lie if _you_ believe it."

      In totally unrelated news, my dick is a foot long.

      Well I'd like to see that stand up in court.

      --
      - None can love freedom heartily, but good men; the rest love not freedom, but license. -- John Milton
    8. Re:Pretty print it first by Natural+Join · · Score: 2

      This is called "stylometry": the algorithmic analysis of authorship based on the content of the work in question. There are many scholarly articles describing various algorithms that you can find and read. Early efforts in this area involved testing the Shakespeare/Bacon hypothesis, who wrote which of the Federalist papers, and establishing the authorship of the 15th Oz novel.

      The basic concept is pretty easy. I played a fair bit with the idea back a few years ago when I wanted to prove to myself that a certain usenet troll was actually the same person as another poster. I downloaded a bunch of 19th century novels from Project Gutenberg and tried various published algos until I could accurately cluster authorship. Telling husband and wife apart (Mary Shelley and Percy Shelley) was crazy-easy; the appallingly hard case was Charlotte and Emily Bronte. (They were sisters and grew up together, clearly having a lot of influence on each other's writing styles.)

      Yes, one of the most basic issues of analysis of a stylometry algo is how well does it still work if the author is *trying* to obfuscate his style, or *trying* to imitate another author. There are algorithms that are quite insensitive to such efforts. However, they all seem to require a *lot* of text to work well. My best efforts (using other people's published algorithms) worked quite well on full novels but did not work so well given even as much as a few chapters.

      Oh, and that troll turned out NOT to be the guy I thought it was.
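For readers who want to try the basic idea, here is a minimal sketch of one common stylometric feature: the relative frequency of function words, compared by cosine similarity. This is not the Concordia method and not necessarily any of the published algorithms mentioned above; the word list and the names `profile`, `cosine_similarity`, and `closest_author` are assumptions chosen for illustration.

```python
import math
import re
from collections import Counter

# A small, arbitrary list of function words; real studies use much longer lists.
FUNCTION_WORDS = ["the", "of", "and", "to", "a", "in", "that", "it", "is",
                  "was", "i", "for", "with", "as", "but", "not", "on", "by",
                  "upon", "which"]

def profile(text):
    """Relative frequency of each function word in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = len(words) or 1  # avoid dividing by zero on empty input
    return [counts[w] / total for w in FUNCTION_WORDS]

def cosine_similarity(p, q):
    """Cosine of the angle between two profiles; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0

def closest_author(unknown, known):
    """Name of the known author whose profile is most similar to the unknown text.

    `known` maps an author name to a sample of that author's writing.
    """
    target = profile(unknown)
    return max(known, key=lambda name: cosine_similarity(target, profile(known[name])))
```

As the comment above says, a feature like this needs a lot of text before the frequencies settle down, so expect poor results on anything email-sized.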

    9. Re:Pretty print it first by AmonTheMetalhead · · Score: 2

      Google translate to German, then back to English, nobody will ever be able to restore the original message!

    10. Re:Pretty print it first by KahabutDieDrake · · Score: 2

      I invite you to be horrified by taking a look at the actual science behind those CSI shows. The threshold for use in court is far far lower than you might imagine it to be. Furthermore, I'm going to go out on a limb here and say that you don't have the foggiest idea how DNA evidence is handled in a courtroom, or for that matter a criminal lab. You'll be most pleased to know that not only were you a match to the sample we have, but so are all your immediate male family members, most of your extended family members and something like 5% of the population of the earth. But if you'd prefer to think of fingerprints and DNA as foolproof, please, don't let the facts stop you. It hasn't stopped anyone else.

  2. For the lulz by burnit999 · · Score: 4, Insightful

    Sooo... if I want to write an anonymous letter I just switch from my usual grammar natzi mode to my OMFG i c4/Vz p0ns0r your org MANNNN!

    1. Re:For the lulz by spun · · Score: 2

      shake shake roll... Natzi!

      --
      - None can love freedom heartily, but good men; the rest love not freedom, but license. -- John Milton
  3. E E Cummings, that blatant spammer by SMoynihan · · Score: 4, Insightful

    Turns out most spam is written by e e cummings.

    Who'd have thought it?

  4. Finally, they can find that one guy by _0xd0ad · · Score: 5, Funny

    who always types part of the body of his message in the subject line.

  5. Too easy to fake by trollertron3000 · · Score: 2

    Yes but unlike writing this can be easily duplicated. Writing using someone else's style isn't an easy task. Doing it with a keyboard, very easy.

    --
    Tiger Blooded Bi-Winning Machine
  6. Verily, I am scrod by kmdrtako · · Score: 2

    wherefore did I ever adopt such a distinctive writing style.

    1. Re:Verily, I am scrod by Sponge+Bath · · Score: 2

      Information for /. readers: A scrod is a fish *and* is the model of car with large fins driven by Ratliff in the comic strip Eye Beam by Sam Hurt.

  7. Interesting, but easily defeated by zindorsky · · Score: 2

    I'm not saying the research is worthless, but their techniques are easily defeated.
    It would be simple to write a program that would iteratively "fuzz" your message with typos, lowercase/uppercase toggling, etc. and check the result against their algorithm until the message could no longer be tied to you.
    I'm sure someone could do it in 10 lines of Perl, or less.
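A sketch of that loop, in Python rather than Perl and in rather more than 10 lines. It assumes you can call the attribution algorithm as a function `score(text)` that returns a number where higher means "more likely you"; that function, the threshold, and the names `mutate` and `fuzz_until_unattributed` are all hypothetical, since the thread does not show the researchers' code.

```python
import random

def mutate(text, rng):
    """Apply one random surface edit: toggle a character's case or swap two neighbors."""
    if len(text) < 2:
        return text
    i = rng.randrange(len(text) - 1)
    if rng.random() < 0.5:
        # Case toggle (a no-op on spaces and punctuation).
        return text[:i] + text[i].swapcase() + text[i + 1:]
    # Typo-style transposition of characters i and i+1.
    return text[:i] + text[i + 1] + text[i] + text[i + 2:]

def fuzz_until_unattributed(text, score, threshold, max_steps=10000, seed=0):
    """Greedy search: keep a mutation only if it lowers the attribution score.

    Stops once the score drops below `threshold` or after `max_steps` tries.
    Returns the best text found and its score.
    """
    rng = random.Random(seed)
    best, best_score = text, score(text)
    for _ in range(max_steps):
        if best_score < threshold:
            break
        candidate = mutate(best, rng)
        candidate_score = score(candidate)
        if candidate_score < best_score:
            best, best_score = candidate, candidate_score
    return best, best_score
```

The obvious catch is that this only works if you have the same scorer the investigators use, and the result may be barely readable by the time the score drops.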

    --
    If the geiger counter does not click, the coffee, she is not thick.
    1. Re:Interesting, but easily defeated by hedwards · · Score: 2

      As has been pointed out by others, in the past you couldn't auto-translate it into another language and back. You lose virtually all of the identifiable information that would help them analyze the document like that.

    2. Re:Interesting, but easily defeated by vux984 · · Score: 2

      As has been pointed out by others, in the past you couldn't auto-translate it into another language and back. You lose virtually all of the identifiable information that would help them analyze the document like that.

      And people still don't bother most of the time; so the tech is still useful.

      For example, forensic fingerprinting technology is defeated by wearing gloves, but that hasn't rendered the technology irrelevant either.

  8. Simple by LWATCDR · · Score: 5, Informative

    Use Google translate. Translate it into Spanish, then into German, then back into English, then into LEET.

    It should be simple to obscure the style and weaknesses of the author with this method.

    --
    See my blog http://ilovecookes.blogspot.com/ for light hearted technical information.
    1. Re:Simple by 0100010001010011 · · Score: 3, Interesting

      With Google Translate. Translate into Spanish, then German, then English, then in LEET.

      It should be easy to hide the style of the author and weaknesses with this method

      I was expecting some hilariously screwed up result, but that turned out rather well. It also masked your writing style.

  9. This is why I cut & paste by dim5 · · Score: 5, Funny

    This is why I cut & paste each word of anonymous emails from an online dictionary.

    Untraceable.

    --

    Is something burning?
    Oh, it's my karma.

  10. Re:Behavioral Profiling rediscovered by OrangeCowHide · · Score: 3, Funny

    But this is on a computer... On the internet. That's like double implicit innovation.

    --
    Creationists are a lot like zombies. Slow, but powerful and numerous. And they all want to eat our brains. - Evilest Doe
  11. The digital equivalent of cutting up magazines. by sumdumass · · Score: 4, Interesting

    It used to be that people would cut words from magazines and other papers to make ransom notes so no one could recognize their handwriting.

    With this concept moving to the computer and internet, it will be trivial to find words, phrases, auto generation scripts and so on to do the digital equivalent. In fact, I think there are several programs out there that will pull random lines of text from several sources on the internet, take a real message and create an image of some sort to lay information over top of it, all just to get around spam filters. (disable the display of images in your email and you will be surprised at what is underneath them sometimes).

    But something I can see this really having a problem with is how easy it might make the chance at setting someone else up to take a fall. Suppose you and I have emailed each other for quite some time now. I saved all our correspondence and farmed them to find phrases and word misspellings, cut and pasted them to make statements you never intended to make, then sent them off to threaten the president. Something even more disturbing, suppose we know each other in real life and I have the hots for your wife. I make my way into your house, plant some pipes and fertilizer beside some diesel fuel in one of your closets, get on your computer, sign up for a free email address from it using fake information and start spamming chat rooms and emailing government officials your intent to kill the president.

  12. The actual research paper by Sara+Chan · · Score: 4, Informative

    The actual research paper is at

    http://www.dfrws.org/2008/proceedings/p42-iqbal.pdf

    Note that it was published in 2008. So Slashdot is reporting relatively quickly here.

  13. Not Anonymous by khr · · Score: 2

    I long ago gave up any idea that my writing would be very anonymous...

    As an American working in software companies in India for ten years, whenever managers sent out surveys they said would be "totally anonymous" I always figured with my American writing style (complete sentences, very few typos, no "spel it like u sa it", active voice, writing out our product and company name in full) everyone would recognize it was my writing anyway... And that was usually the case, as people who weren't supposed to know who wrote what would invariably reply to me, "hey, why did you write that?"

  14. I can imitate your writing style by spun · · Score: 3, Insightful

    Even worse than false negatives would be false positives. Maybe those death threats to your boss sound just like you, use the same words you use, the same grammar, everything. That's because your jealous coworker pirated himself a copy of this program, fed some writing of yours through it, and then kept editing those death threats until the program claimed they sounded just like you.

    --
    - None can love freedom heartily, but good men; the rest love not freedom, but license. -- John Milton
  15. This gives writers like me an edge in AC by WillAffleckUW · · Score: 2

    I actually write in different styles, and used that for different RPG game systems and stories - now all I have to do is go to a nearby cafe (can't go a block without running into two) and use their free computers using different personas.

    In fact, I think I'll start studying the writing styles of Cheney, Rove, and Fnarf and using them as writing templates for my next posts ...

    Pretty easy to do.

    I think most of my current personae are quite radically different in writing style from my other published pseudonyms.

    --
    -- Tigger warning: This post may contain tiggers! --
  16. Re:I recall - he is correct, mod up, not down... by cboslin · · Score: 3, Informative

    Here is an excerpt that proves the anonymous post is correct:

    But even Unabombers are not infallible. Exulting in his apparent mastery of the FBI, the master criminal made his mistake, in the form of a 35,000-word treatise on the "Future of Industrial Society", which he submitted to the Washington Post and New York Times. If they published the rambling, anti-technology manifesto, the writer said, he would cease his campaign. After much soul-searching, the two papers did so on 20 September 1995, on the advice of the FBI.

    Relatives in Chicago were struck by similarities between some of Ted Kaczynski's earlier writings and the rambling musings of the Unabomber's tract, and eventually his brother informed the FBI. And so the trail of 18 years, dotted with 200 detained suspects along the way, led to a hand-built cabin near the Continental divide. But the tale may not yet be over.

    Here is the article from the Independent.

    I recollected that this was how the Unabomber was finally caught, via relatives who read his writings and recognized him... I respect that some mods might not like anonymous cowards, but if they are correct they should not be modded down, at least not to be fair.

  17. Re:"Could care less" by Desler · · Score: 2

    Or maybe that person really COULD care less, but their current level of caring is so low it doesn't matter.

  18. Re:The vodka is good, but the meat is rotten! by Cyberax · · Score: 3, Informative

    I'm a native Russian speaker and this phrase, indeed, can't be mistranslated this way (I just used it as a well known example). However, it's true that attempting to automatically translate ANYTHING non-trivial from English to Russian invariably results in hilarity.

    For example, I've tried to translate the next Slashdot article's blurb:

    "Google Voice users learned late Monday that the service now has a way of making purely Internet-based phone calls. Making a SIP call with a "sip:" prefix, the Google Voice phone number and @sip.voice.google.com skips the conventional phone network entirely, saving users cellphone minutes. Disruptive Telephony tested it and found that a call worked "great.""

    "Disruptive" was translated as "explosive" in the sense of "trinitrotoluene", and "great" was translated as "big". Translating it back resulted in:

    "Google Voice users learned late Monday that the service is now a way to make a clean Internet phone calls Make a call with SIP. "Sip:" prefix, Google Voice phone transmits the number and@sip.voice.google.com common telephone network fully, saving minutes of mobile phone users. Explosive Telephone tested it and found that the call worked "big""

    You can probably still guess the meaning, but it's not exactly easy.

  19. Just Google it... by ironjaw33 · · Score: 2

    Every once in a while, I get a trollish and insulting comment on my blog. Usually, the commenter leaves the name field anonymous but leaves a valid email address as an invitation for me to take the bait and respond. A quick Google search of the email often reveals other trollish comments posted by the same user elsewhere on the internet, and usually they slip up at least once and leave their name. From there, it's pretty easy to find out more personal information.