Slashdot Mirror


Linguistics Identifies Anonymous Users

mask.of.sanity writes "Researchers have examined writing styles to identify previously anonymous carders and hackers operating on underground forums. Up to 80 percent of users who wrote at least 5000 words across their posts could be identified using linguistic techniques. Techniques such as stylometric analysis were used to track users who posted across different forums, and could even be used to unveil authors of thesis papers or blogs who had taken to underground networks."

3 of 215 comments

  1. I recognise my own writing by kawabago · Score: 3, Insightful

    I'd be rather surprised if someone else couldn't.

  2. Re:College essays by ForgedArtificer · Score: 4, Insightful

    Actually, it's the exact opposite.

    Anti-plagiarism software searches for the same content with completely different styles.

    Writer identification involves searching for the same style amongst completely different content.
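
    A minimal sketch of that distinction, under loud assumptions: "style" is approximated here by the relative frequency of a small hand-picked list of function words, and "content" by the overlap of the remaining words. The word list and the helper names (`style_similarity`, `content_overlap`) are hypothetical illustrations, not the researchers' method or any anti-plagiarism product's method.

```python
import math
import re
from collections import Counter

# Assumption: a tiny hand-picked function-word list stands in for "style".
FUNCTION_WORDS = ("the", "a", "of", "and", "to", "in", "it", "is", "that", "i", "but", "not")


def tokens(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def style_vector(text):
    """Relative frequency of each function word: a crude stylistic fingerprint."""
    words = tokens(text)
    counts = Counter(words)
    total = len(words) or 1  # avoid division by zero on empty text
    return [counts[w] / total for w in FUNCTION_WORDS]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def style_similarity(a, b):
    """High when two texts use function words in similar proportions."""
    return cosine(style_vector(a), style_vector(b))


def content_overlap(a, b):
    """Jaccard overlap of non-function words: a crude measure of shared content."""
    sa = set(tokens(a)) - set(FUNCTION_WORDS)
    sb = set(tokens(b)) - set(FUNCTION_WORDS)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0
```

    In these terms, a plagiarism check would flag pairs with high `content_overlap` regardless of `style_similarity`, while writer identification would look for high `style_similarity` even when `content_overlap` is near zero.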

    --
    The right to offend is central to the right to free speech.

  3. Re:Anonymous First Post by Hotawa+Hawk-eye · Score: 3, Insightful

    Nothing, as long as you have a large enough corpus of the framee's writing. If the framee is your friend, this probably isn't a problem. If they're a public figure, maybe not a problem (depending on how much editing and PRing their written statements undergo before they are released). If they're $RANDOM_PASSERBY, not so easy.

    I think a more common usage would be to tweak your own writing just so it doesn't sound like you. Write something you don't want identified as yours (the test sample), then check it against a corpus of your own written work. If it detects as your work, rough up the test sample until it doesn't. This would be an easier problem than the framing case, since you're not trying to make it look like a specific other person's work; you're trying to make it look like it's ANYONE else's (you don't really care whose) work.
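
    The check-and-rough-up loop described above could be sketched roughly as follows. Everything here is an assumption made for illustration: character trigram counts stand in for a real stylometric feature set, the 0.5 threshold is arbitrary, and `looks_like_me` and `rough_up` are hypothetical helper names, not part of any real tool.

```python
import math
import re
from collections import Counter


def trigram_profile(text):
    """Character trigram counts: a crude stand-in for stylometric features."""
    cleaned = re.sub(r"\s+", " ", text.lower())
    return Counter(cleaned[i:i + 3] for i in range(len(cleaned) - 2))


def cosine(p, q):
    """Cosine similarity of two count profiles (0.0 if either is empty)."""
    dot = sum(p[k] * q[k] for k in p.keys() & q.keys())
    norm = math.sqrt(sum(v * v for v in p.values())) * math.sqrt(sum(v * v for v in q.values()))
    return dot / norm if norm else 0.0


def looks_like_me(sample, my_corpus, threshold=0.5):
    """True if the sample's profile is close to the profile of my own writing."""
    mine = trigram_profile(" ".join(my_corpus))
    return cosine(trigram_profile(sample), mine) >= threshold


def rough_up(sample, my_corpus, rewrites, threshold=0.5):
    """Apply (pattern, replacement) rewrites one at a time until the sample
    stops matching my corpus. Returns (final_text, evaded)."""
    for pattern, replacement in rewrites:
        if not looks_like_me(sample, my_corpus, threshold):
            return sample, True
        sample = re.sub(pattern, replacement, sample)
    return sample, not looks_like_me(sample, my_corpus, threshold)
```

    The stopping condition is only "no longer matches me", which is why this is easier than framing: there is no target author whose profile the rewritten sample has to approach.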