Slashdot Mirror


Linguistics Identifies Anonymous Users

mask.of.sanity writes "Researchers have examined writing styles to identify previously anonymous carders and hackers operating on underground forums. Up to 80 percent of users who wrote at least 5000 words across their posts could be identified using linguistic techniques. Techniques such as stylometric analysis were used to track users who posted across different forums, and could even be used to unveil authors of thesis papers or blogs who had taken to underground networks."

215 comments

  1. Anonymous First Post by Anonymous Coward · · Score: 5, Informative

    Anonymous First Post... you'll never guess who I am

    1. Re:Anonymous First Post by Anonymous Coward · · Score: 4, Funny

      4990.5 more words please.

    2. Re:Anonymous First Post by Anonymous Coward · · Score: 0

      Please be badanalogyguy - we've missed you!

    3. Re:Anonymous First Post by Anonymous Coward · · Score: 5, Funny

      Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec in tincidunt nisi. Vivamus quis ligula non lorem feugiat congue ut a ipsum. Vivamus iaculis elementum tellus eget ullamcorper. Nam sed lacus at felis volutpat egestas. Aliquam hendrerit mauris a felis fringilla tristique. Proin commodo eleifend leo suscipit pulvinar. Praesent velit lectus, venenatis ac volutpat vitae, scelerisque sed diam. Integer eu felis quis erat ultricies sodales. Etiam eu turpis massa. In vel velit nec purus tristique vestibulum. Cras eleifend diam ut dolor facilisis convallis. Morbi velit ligula, aliquam vitae ullamcorper et, dapibus sed augue. Nullam euismod urna in purus condimentum suscipit. Fusce dolor magna, dictum quis elementum quis, mollis in sem. Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nulla imperdiet lectus sit amet risus interdum vel congue odio venenatis. Proin lobortis urna ac tortor auctor id porttitor urna auctor. Pellentesque habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas. Integer viverra consequat nisl, ac adipiscing dui feugiat quis. Ut ut tortor urna. Pellentesque velit orci, mollis eu venenatis quis, convallis nec risus. Donec quis enim ac ante placerat accumsan. Fusce ut erat in tortor ullamcorper aliquam. Aenean ut est turpis. Nam ut elit justo. Suspendisse potenti. Praesent et nulla eget sem interdum pellentesque. Nunc sagittis metus sed mauris lacinia consequat. Fusce velit velit, semper at euismod a, euismod vitae enim. Vivamus elementum commodo faucibus. Suspendisse dictum rutrum leo at lobortis. Nam ac lectus id velit hendrerit rutrum vitae at mauris. Integer quis ante ullamcorper dui gravida auctor eu ut lectus. Curabitur laoreet sapien at tortor elementum consectetur. Etiam faucibus tempor sem, sed ultricies felis semper eget. Suspendisse odio lacus, interdum eu rhoncus ut, iaculis vitae enim. Morbi egestas ultricies lorem at tempus. Donec iaculis purus vel tellus cursus elementum. 
Nulla fermentum vulputate lorem sit amet pellentesque. Nunc quam lacus, consectetur et convallis non, pharetra dapibus diam. Maecenas laoreet ornare vehicula. Phasellus vitae odio diam. Ut facilisis nisi eu sapien elementum sit amet molestie arcu consectetur. Nulla in tortor urna, in elementum tellus. Maecenas convallis nunc purus, eget pretium purus. Suspendisse nec nibh ac augue condimentum adipiscing quis et lorem. Integer eget lorem velit. Nullam volutpat metus sit amet ante feugiat ac cursus sem congue. Pellentesque dolor nulla, facilisis id hendrerit eget, commodo eu urna. Donec ut interdum nibh. Sed nunc nisi, commodo non congue vitae, tempus ut ligula. Donec massa dui, viverra eget tempus ut, ornare eu ligula. Proin quis posuere diam. Phasellus at risus quam, id cursus odio. Sed fermentum, tortor eu iaculis sollicitudin, erat augue ornare nisi, eu mattis neque massa ac odio. Class aptent taciti sociosqu ad litora torquent per conubia nostra, per inceptos himenaeos. Sed varius, orci eget rhoncus egestas, mi nisl mattis sapien, non ultrices nulla elit porta nunc. Praesent mauris lectus, ultrices at interdum quis, euismod accumsan arcu. Pellentesque in dolor libero, vitae tincidunt dui. Nunc rhoncus ante in nulla sagittis ullamcorper. Curabitur velit odio, tempus sit amet lobortis eu, condimentum sit amet massa. Maecenas convallis facilisis arcu, quis accumsan velit tincidunt ut. Vivamus a ante orci, at mattis nibh. Nulla in diam est, vitae semper purus. Donec ut odio augue. Etiam tempor ultricies luctus. Quisque fringilla tincidunt rutrum. Phasellus et justo ut lorem imperdiet semper. Maecenas et justo lectus, ac dictum dui. Morbi sit amet venenatis neque. Donec interdum enim vel velit commodo pulvinar. Aenean nisl erat, bibendum id tincidunt sed, sagittis sagittis mi. Curabitur dui urna, venenatis id placerat nec, consectetur sagittis mi. Phasellus eleifend condimentum lorem et blandit. Pellentesque at lorem nisl, quis ullamcorper nisi. Suspendisse potenti. 
In id orci massa, in hendrerit ligula. Nunc elementum mi in nisl posuere ut tincidunt nibh placerat. Mauris venen

    4. Re:Anonymous First Post by Anonymous Coward · · Score: 0

      Someone called out my AC posts here once using such an analysis. (Apparently I have a linguistic tick which they identified.)

      However they also incorrectly concluded I was a particular Microsoft employee, so they were probably just a typical paranoid-schizophrenic "FOSS advocate" basement dweller.

    5. Re:Anonymous First Post by Anonymous Coward · · Score: 5, Funny

      I identified you. You are Cicero.

    6. Re:Anonymous First Post by girlinatrainingbra · · Score: 5, Interesting

      Sock puppet accounts are also apparent from these linguistic tics. Sometimes, resorting to a particular analogy or getting hot-tempered at a specific topic or a certain kind of point of view can also give away the identity of the author. So maybe limit oneself to 5000/144 = 34 tweets per Twitter account so that you can't be figured out. And writing style and favorite kinds of rant were also how the Unabomber was found out: his family members recognized his particular pet peeves and rants and writing patterns and sent their suspicions in to the F.B.I.
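
      The arithmetic above can be checked with a short sketch. It takes the comment's figure of 144 words per tweet at face value and treats the 5000-word mark from the summary as a hard cutoff; the function name max_posts_below is hypothetical and only for illustration.

```python
def max_posts_below(word_threshold, words_per_post):
    """Largest number of posts whose combined length stays strictly below word_threshold."""
    if words_per_post <= 0:
        raise ValueError("words_per_post must be positive")
    # Integer division on (threshold - 1) keeps the total strictly under the threshold.
    return (word_threshold - 1) // words_per_post


# Assumed inputs: 5000 words (from the summary), 144 words per tweet (from the comment).
posts = max_posts_below(5000, 144)
print(posts)        # 34
print(posts * 144)  # 4896
```

      A 35th tweet of that length would bring the total to 5040 words, which is past the cutoff.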

    7. Re:Anonymous First Post by mrbester · · Score: 2

      We can narrow it down to someone who is particular about correct capitalisation (and therefore probably spelling, punctuation and grammar) denoting an education and attention to detail not normally seen in forum posts. As this is a more technical forum you most likely program in a language where letter case is of paramount importance and have done so for at least 5 years in a professional position. You probably also write reports indicating a level of seniority.

      That should reduce the number of likely candidates somewhat.

      --
      "Wait. Something's happening. It's opening up! My God, it's full of apricots!"
    8. Re:Anonymous First Post by Black+LED · · Score: 1

      I was hoping for TheTurdReport, but alas, the account no longer exists :(

    9. Re:Anonymous First Post by Anonymous Coward · · Score: 1

      Someone called out my AC posts here once using such an analysis. (Apparently I have a linguistic tick which they identified.)

      However they also incorrectly concluded I was a particular Microsoft employee, so they were probably just a typical paranoid-schizophrenic "FOSS advocate" basement dweller.

      That's right. You have "ticks" instead of "tics"!

    10. Re:Anonymous First Post by water-and-sewer · · Score: 2

      Classic - kudos to you for a great laugh. I was thinking though, "this study doesn't help much because it's rare to find places where people write more than a line or two anymore."

      Go back to the old days of Usenet (80s, early 90s) and posts were long, well thought-out, and useful. Look at OLGA, for example, which collected written music in TAB format for guitarists (ha - remember when THAT was the biggest threat to the music industry?). Tons of useful stuff. Hardly anyone does that anymore; it's mostly short sentences. The exceptions - like tech forums - are in situations when no one cares much to be anonymous anyway.

      It's been tough to get people to pay attention to the forum at www.dictatorshandbook.net for two reasons: I think people are reticent to opine on various dictators, all of whom might put them in jail, and because hardly anyone posts on forums anymore (yes, I know, there are some exceptions). Look at the length of the average comment on a Reddit thread, for example - a line or two, sometimes just a word or two.

      --
      If this were Usenet, I'd killfile the lot of you.
    11. Re:Anonymous First Post by DerekLyons · · Score: 1

      Classic - kudos to you for a great laugh. I was thinking though, "this study doesn't help much because it's rare to find places where people write more than a line or two anymore." Go back to the old days of Usenet (80s, early 90s) and posts were long, well thought-out, and useful.

      Not particularly... you just hang out in the 'wrong' places (not just websites/forums, but usenet groups as well), both for the length and for the nature of the writing.

      It's been tough to get people to pay attention to the forum at www.dictatorshandbook.net for two reasons: I think people are reticent to opine on various dictators, all of whom might put them in jail, and because hardly anyone posts on forums anymore (yes, I know, there are some exceptions).

      As far as getting people to visit and pay attention to your forum... I suspect people pay it no mind because it looks like one of the many crappy-ass and pointless websites on the 'net. At first glance it seems like a site mostly designed to sell what looks like either a parody or a tinfoil-hat book with no reason to pay further attention or treat it seriously. Your idiotic twitter feed doesn't help either... Nor does the way you hide the forums or your use of a weird forum format...

    12. Re:Anonymous First Post by Anonymous Coward · · Score: 0

      Cicero, can I borrow your time machine or have some of that elixir of life? You're either older than dirt or have lost technology.

      More seriously, who's to say the writers of this algorithm can't run it in reverse to frame someone?

    13. Re:Anonymous First Post by ios+and+web+coder · · Score: 1

      That's right. You have "ticks" instead of "tics"!

      That would make him an Intel employee...

      I don't usually bother with anonymous, here. It encourages bad habits. Occasionally, I will post anonymously, simply because I'm using a different workstation, and can't be bothered to sign in.

      I consider privacy and anonymity on the Internet to be swimming with the fishes. They are long dead, so I try to behave myself; no matter where or how I am logged in.

      --

      "For every complex problem there is an answer that is clear, simple, and wrong."

      -H. L. Mencken

    14. Re:Anonymous First Post by Anonymous Coward · · Score: 0

      And writing style and favorite kinds of rant were also how the Unabomber was found out: his family members recognized his particular pet peeves and rants and writing patterns and sent their suspicions in to the F.B.I.

      It was probably a little more than just that though. I'm fairly certain the family suspected he was a nutball way before recognizing his writing.

      So it was a combination of things, not writing alone.

    15. Re:Anonymous First Post by girlinatrainingbra · · Score: 2

      Here are the two relevant paragraphs from the Wikipedia article on the Unabomber that show why it was the "manifesto", the writing style and the writing contents that were key in the family suspecting his involvement/"identity". They occur in the Search section of the article:
      Before the publication of the manifesto, Theodore Kaczynski's brother, David Kaczynski, was encouraged by his wife Linda to follow up on suspicions that Ted was the Unabomber.[77] David Kaczynski was at first dismissive, but progressively began to take the likelihood more seriously after reading the manifesto a week after it was published in September 1995. David Kaczynski browsed through old family papers and found letters dating back to the 1970s written by Ted and sent to newspapers protesting the abuses of technology and which contained phrasing similar to what was found in the Unabomber Manifesto.[78]
      Prior to the publishing of the manifesto, the FBI held numerous press conferences requesting the help of the public in identifying the Unabomber. They were convinced that the bomber was from the Chicago area (where he began his bombings), had worked or had some connection in Salt Lake City, and by the 1990s was associated with the San Francisco Bay Area. This geographical information, as well as the wording in excerpts from the manifesto that were released prior to the entire manifesto being published, was what had persuaded David Kaczynski's wife, Linda, to urge her husband to read the manifesto.

    16. Re:Anonymous First Post by Anonymous Coward · · Score: 0

      Sometimes, resorting to a particular analogy or getting hot-tempered at a specific topic or a certain kind of point of view can also give away the identity of the author.

      People typing in a language they're not entirely fluent in will often make the same errors in syntax any time they're doing so, which can easily be a dead giveaway. One example I recall was on the forums for a game I used to play on a private server. There was a Brazilian guy whose English was readable but a bit rough around the edges. So , for example , he would always put spaces before his punctuation marks , and use markup in weird places, and say the name of who he was replying to randomly , always in bold , girlinatrainingbra . He also always complained about all the "f****ng xiters" (asterisks and strange terminology his; "xiters" means "cheaters") on the server and how the GMs weren't vigilant enough in dealing with him.

      So one day I'm browsing the forum of one of the hacker clans for this game (for entertainment purposes; they literally thought they were gangsters, posted pictures of themselves in hoodies throwing gang signs and shit; it was absolutely hilarious), and I see someone requesting an updated version of the trainer for the game , because he was fed up , with the corrupt moderators and administrators on the server and this was how he was striking back. It was so obviously the same guy, because he didn't even try to hide his weird syntax. Myself and one of the GMs got such a laugh out of it that we made a thread about it after he was banned.

    17. Re:Anonymous First Post by Anonymous Coward · · Score: 0

      swimming with the fishes

      Dat's "Sleepin' wid' da fishies," ya knucklehead.

      -Tony Soprano

    18. Re:Anonymous First Post by logjon · · Score: 0

      Manifesto...attract attention to the erosion of human freedom necessitated by modern technologies requiring large-scale organization

      So pretty much all of techdirt could be reported over this

      --
      The stories and info posted here are artistic works of fiction and falsehood.
      Only fools would take it as fact.
    19. Re:Anonymous First Post by Anonymous Coward · · Score: 0

      I'm not particularly consistent day to day, hour to hour, or necessarily even minute to minute, in regards to capitalization, punctuation, sentence construction complexity, or the complexity of the words forming those sentences.

    20. Re:Anonymous First Post by Will.Woodhull · · Score: 3, Interesting

      I used to post anonymously much more often, when I had a job with a guvmint agency and a young famly to protect. I do not bother with that much any more. I am not invulnerable, but for the most part I know that I look like too small a fish to be worth going after.

      That said, I still occasionally post anonymously when I want to antagonize the astroturfers, Scientology nuts, etc. Especially on slashdot if I am concerned that my post might damage my karma.

      Interesting things to do when posting anonymously:

      Use a thesaurus to choose synonyms you would not ordinarily use.

      L33t 5p33k

      Write like Hemmingway. Keep all sentences short. Sentences that do not have subordinate clawses do not have much style to analyse.

      Use creative misspellings. "claws" for "clause", etc.

      Use Google Translate to do a multilingual hash: translate your work into Russian, then the Russian version back to English. "The spirit is willing but the flesh is weak" becomes "The wine is passable but the meat has gone bad."

      Ideally, Anonymous will develop a set of tools that will rewrite any text into one of half a dozen different styles. Let the authorities chase after these six fictional characters.

      --
      Will
    21. Re:Anonymous First Post by ios+and+web+coder · · Score: 1

      Ideally, Anonymous will develop a set of tools that will rewrite any text into one of half a dozen different styles. Let the authorities chase after these six fictional characters.

      :)

      --

      "For every complex problem there is an answer that is clear, simple, and wrong."

      -H. L. Mencken

    22. Re:Anonymous First Post by Anonymous Coward · · Score: 0

      If you're managing 144 words per tweet, I know some marketing folk who'd pay well for your secret.

    23. Re:Anonymous First Post by Onymous+Coward · · Score: 1

      More seriously, who's to say the writers of this algorithm can't run it in reverse to frame someone?

      Oh, clever.

      But now future courts can point to your post to show that the idea was common or at least public knowledge, empowering stylometric identification deniability in cases of plausible framing.

    24. Re:Anonymous First Post by kelemvor4 · · Score: 1

      .5?

    25. Re:Anonymous First Post by marcello_dl · · Score: 1

      plus simple: use newspeak.

      --
      ---- MISSING MISCELLANEOUS DATA SEGMENT --- [sigdash] trolololol
    26. Re:Anonymous First Post by DaemonDan · · Score: 1

      I'm guessing "you'll" only counted as 1.5 words to the counter.

      --
      Enjoy post-apocalyptic and singularity science fiction? Check out www.demonarchives.com, a new online graphic-novel.
    27. Re:Anonymous First Post by Anonymous Coward · · Score: 0

      Well no, what it indicates is someone who is so insecure and immature as to be worried about the opinions of people he does not know and has never met. Spending time and effort paying attention to things like punctuation and grammar on a forum post (i.e., within a context of no material benefit) is an indication of adolescence - doing so within the context of the workplace, where there is a material benefit, is an indication of adulthood.

    28. Re:Anonymous First Post by Hotawa+Hawk-eye · · Score: 3, Insightful

      Nothing, as long as you have a large enough corpus of the framee's writing. If the framee is your friend, this probably isn't a problem. If they're a public figure, maybe not a problem (depending on how much editing and PRing their written statements undergo before they are released.) If they're $RANDOM_PASSERBY, not so easy.

      I think a more common usage would be to tweak your own writing just so it doesn't sound like you. Write something you don't want identified as yours (the test sample), check it against a corpus of your own written work. If it detects as your work, rough up the test sample until it doesn't. This would be an easier problem than the framing case since you're not trying to make it look like a specific other person's work, you're trying to make it look like it's ANYONE else's (you don't really care whose) work.
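
      The check-and-rewrite loop described above could be sketched roughly as follows. This is only an illustration under stated assumptions: it swaps in character-trigram cosine similarity as a stand-in for whatever a real stylometric tool would measure, the 0.8 threshold is arbitrary, and the names trigram_profile, similarity, and looks_like_author are hypothetical.

```python
import math
from collections import Counter


def trigram_profile(text):
    """Count character trigrams after lowercasing and collapsing whitespace."""
    normalized = " ".join(text.lower().split())
    return Counter(normalized[i:i + 3] for i in range(len(normalized) - 2))


def similarity(p, q):
    """Cosine similarity between two trigram count profiles (0.0 if either is empty)."""
    dot = sum(count * q[gram] for gram, count in p.items())
    norm_p = math.sqrt(sum(c * c for c in p.values()))
    norm_q = math.sqrt(sum(c * c for c in q.values()))
    if norm_p == 0 or norm_q == 0:
        return 0.0
    return dot / (norm_p * norm_q)


def looks_like_author(sample, corpus, threshold=0.8):
    """Return True if the sample scores at or above the (assumed) threshold.

    corpus is a list of strings known to be written by the author.
    """
    corpus_profile = trigram_profile(" ".join(corpus))
    return similarity(trigram_profile(sample), corpus_profile) >= threshold
```

      In use, one would call looks_like_author(draft, my_old_posts) after each manual rewrite of the draft and stop once it returns False. A real attribution system would use richer features and many candidate authors, so passing this check says little about passing theirs.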

    29. Re:Anonymous First Post by Anonymous Coward · · Score: 0

      ...but what happens when I post and crap up your analysis?

    30. Re:Anonymous First Post by Anonymous Coward · · Score: 0

      Every time I read that word, I first parse it as news peak, and wonder what sort of news might have peaked. :-)

    31. Re:Anonymous First Post by Anonymous Coward · · Score: 0

      News peaked about 300 years ago - very possibly when Samuel Pepys died (1703). It has been going downhill ever since - though possibly with a brief respite towards the end of Watergate.

      Yes, I was on the Grassy Knoll!

    32. Re:Anonymous First Post by tattood · · Score: 1

      I wonder how long it will take Anonymous to write a script that will take in their intended text, and then re-write it so that it randomly changes the linguistic style so that they will never be able to track it?

      --
      WTB [sig], PST!!!
    33. Re:Anonymous First Post by Anonymous Coward · · Score: 1

      Or perhaps, once educated, he realizes that he's taken a lot more seriously everywhere when it is clear he took some time to write and edit a post. There are some people who are unwilling or incapable of doing that, and those people will tend to try and dismiss the effect in order to bring the level of the discussion down to their level of effort, but that's just another form of coping for their inadequacies.

      If you write a post and have no stake in its outcome at all, then there's little point in writing it to begin with. Show up with your game, or why bother showing up at all?

    34. Re:Anonymous First Post by firewrought · · Score: 1

      A more elaborate linguistic dodge (if you were writing a revolutionary manifesto or such) would be to create a detailed outline of your intended message and then set it aside. Go read works from an author, genre, or time period that you normally wouldn't be interested in. Absorb the linguistic quirks of this alter-canon and then "channel" it while you expand your manifesto's outline into a draft.

      Take for instance, the distinctive voice of Thomas Paine's Common Sense :

      Some writers have so confounded society with government, as to leave little or no distinction between them; whereas they are not only different, but have different origins. Society is produced by our wants, and government by our wickedness.

      I normally use "people" instead of "writers" and "confused" instead of "confounded", avoid multiple negatives/inversions in the same sentence ("little or no"... "whereas"... "not only"... "but"), use more parenthetical comments, write complete sentences after a {comma, conjunction} combination, and avoid the words "whereas" or "wicked". So if I channeled Paine successfully (and had some level of awareness of my own quirks), I'd probably produce a linguistically distinct text.

      Potential drawbacks include (1) being long-winded when you need to be succinct, (2) coming across as gimmicky b/c your speech isn't normal, or (3) coming across as fake b/c you're busy injecting artifice instead of genuine passion.

      Perhaps a more interesting use of this approach would be to "frame" or draw suspicion to someone by producing a manifesto that matches other works they have published.

      --
      -1, Too Many Layers Of Abstraction
    35. Re:Anonymous First Post by Anonymous Coward · · Score: 0

      What if they all just adopted the same style? Yodaspeak, for instance. "hack the websites of riaa we will" "other os sony reenable to ps3 you must", etc. If everyone talked like that while acting as Anon, would the researchers still be able to uniquely ID them?

    36. Re:Anonymous First Post by Anonymous Coward · · Score: 0

      I used to post anonymously much more often, when I had a job with a guvmint agency and a young famly to protect. I do not bother with that much any more. I am not invulnerable, but for the most part I know that I look like too small a fish to be worth going after.

      Use a thesaurus to choose synonyms you would not ordinarily use.

      L33t 5p33k

      Write like Hemmingway. Keep all sentences short. Sentences that do not have subordinate clawses do not have much style to analyse.

      Use creative misspellings. "claws" for "clause", etc.

      Well, don't be too creative, too much creativity would make you more unique.
      I'd say your best bet would be to try to blend in with others, like in a formal place talk formally, in a site like 4chan use curse words.

      Use Google Translate to do a multilingual hash: translate your work into Russian, then the Russian version back to English. "The spirit is willing but the flesh is weak" becomes "The wine is passable but the meat has gone bad."

      You pointed out the problem in this yourself: double translation might change meaning along with style, which is not ideal because your goal is to convey your msg.

    37. Re:Anonymous First Post by tlhIngan · · Score: 1

      I think a more common usage would be to tweak your own writing just so it doesn't sound like you. Write something you don't want identified as yours (the test sample), check it against a corpus of your own written work. If it detects as your work, rough up the test sample until it doesn't. This would be an easier problem than the framing case since you're not trying to make it look like a specific other person's work, you're trying to make it look like it's ANYONE else's (you don't really care whose) work.

      Or play the Google Translate telephone game.

      You know, where you take your work, translate it to another language, lather rinse repeat, and then bring it back to your native language to read how stilted and silly it becomes?

      Do it a few times and I think that pretty much obfuscates your writing style.

      Other ways include perhaps writing for a 2nd grade level or lower, and very simple sentence structure. Subject-verb-object. After all, can you really identify someone who writes "I have a gun. Give me your wallet. What is the PIN for your bank account? "?

    38. Re:Anonymous First Post by mrhippo3 · · Score: 1

      When I had a boss who turned plagiarism into a fine art, I had great fun seeing how paragraphs were borrowed verbatim. (Trying very hard to be gender neutral here). The boss even took one brochure, stripped out the images, found a pretend "author" and published the thing as an "article." This was pre-Internet so she was not caught. Still, like a cut-and-paste ransom note, you can borrow (steal, appropriate, copy verbatim) lines from a dozen or so sources and create a nearly anonymous cobbling of content that says exactly what you want while removing your own digital footprints.

    39. Re:Anonymous First Post by NotSanguine · · Score: 1

      Write like Hemmingway. Keep all sentences short. Sentences that do not have subordinate clawses do not have much style to analyse.

      You mean like this?

      We were young and our happiness dazzled us with its strength. But there was also a terrible betrayal that lay within me like a Merle Haggard song at a French restaurant. ... I could not tell the girl about the woman of the tollway, of her milk white BMW and her Jordache smile. There had been a fight. I had punched her boyfriend, who fought the mechanical bulls. Everyone told him, "You ride the bull, senor. You do not fight it." But he was lean and tough like a bad rib-eye and he fought the bull. And then he fought me. And when we finished there were no winners, just men doing what men must do. ... "Stop the car," the girl said. There was a look of terrible sadness in her eyes. She knew about the woman of the tollway. I knew not how. I started to speak, but she raised an arm and spoke with a quiet and peace I will never forget. "I do not ask for whom's the tollway belle," she said, "the tollway belle's for thee." The next morning our youth was a memory, and our happiness was a lie. Life is like a bad margarita with good tequila, I thought as I poured whiskey onto my granola and faced a new day.

      -- Peter Applebome, International Imitation Hemingway Competition

      --
      No, no, you're not thinking; you're just being logical. --Niels Bohr
    40. Re:Anonymous First Post by NotSanguine · · Score: 1

      Well no, what it indicates is someone who is so insecure and immature as to be worried about the opinions of people he does not know and has never met. Spending time and effort paying attention to things like punctuation and grammar on a forum post (i.e., within a context of no material benefit) is an indication of adolescence - doing so within the context of the workplace, where there is a material benefit, is an indication of adulthood.

      I disagree. The opinions of others are immaterial. If something is worth doing, it's worth doing properly. Remind me never to hire you. That is all.

      --
      No, no, you're not thinking; you're just being logical. --Niels Bohr
    41. Re:Anonymous First Post by ignavus · · Score: 1

      tl;dr

      --
      I am anarch of all I survey.
    42. Re:Anonymous First Post by Anonymous Coward · · Score: 1

      I think a more common usage would be to tweak your own writing just so it doesn't sound like you. Write something you don't want identified as yours (the test sample), check it against a corpus of your own written work. If it detects as your work, rough up the test sample until it doesn't. This would be an easier problem than the framing case since you're not trying to make it look like a specific other person's work, you're trying to make it look like it's ANYONE else's (you don't really care whose) work.

      Or play the Google Translate telephone game.

      You know, where you take your work, translate it to another language, lather rinse repeat, and then bring it back to your native language to read how stilted and silly it becomes?

      Do it a few times and I think that pretty much obfuscates your writing style.

      Other ways include perhaps writing for a 2nd grade level or lower, and very simple sentence structure. Subject-verb-object. After all, can you really identify someone who writes "I have a gun. Give me your wallet. What is the PIN for your bank account? "?

      Or play the game your Google translator.

      You know, when you take your business translated into another language, lather rinse repeat, and then return to their native language to read pompous and ridiculous it is?

      Do this several times and I think more or less dimming your writing style.

      May include other ways of writing the 2nd grade level or less, and simple syntax. Act and the actor-object. After all, you could really identify someone who writes, "I have a gun. Give me your wallet. PIN is your bank account?"?

    43. Re:Anonymous First Post by Anonymous Coward · · Score: 0

      Then do it properly. The "That is all." at the end of your post was unnecessary; we can see where your post ends perfectly well without it.

    44. Re:Anonymous First Post by Anonymous Coward · · Score: 0

      Damn! My plan to un-Guy-Fawkes-mask Anonymous was foiled by a Markov Chain!

    45. Re:Anonymous First Post by NotSanguine · · Score: 1

      Then do it properly. The "That is all." at the end of your post was unnecessary; we can see where your post ends perfectly well without it.

      If I want your opinion, I'll tell you what it is.

      --
      No, no, you're not thinking; you're just being logical. --Niels Bohr
    46. Re:Anonymous First Post by Anonymous Coward · · Score: 0

      For tools just use Google Translate to pass your message through Russian, Chinese and some random language significantly different from either of those. Split into pieces if you don't trust them not to log it.

    47. Re:Anonymous First Post by Anonymous Coward · · Score: 0

      One of the best language features to identify an author is their use of function words which reveals the author's underlying grammatical preferences. These are the classes of words which make up the skeleton of the language and they are not readily added to like nouns or verbs. Examples include prepositions, determiners, negation, and pronouns. These are much harder to hide and everyone uses them according to a somewhat unique distribution. That said, the accuracy of algorithms used for authorship detection is still not very high, and many of the studies, especially those which are commonly used to inform further stylometric research, are not very good (tiny sample sizes, painstakingly hand annotated data by linguists, etc...not the makings of large scale systems). The work must also be done using classifiers that are trained to pick out certain surface features, making the researcher have to come up with ideas for mapping these features to things like deep syntactic structures, which are not always readily apparent from surface features. There are still no good large scale, reliably accurate, statistical natural language parsers (Stanford CoreNLP is actually really noisy and bad outside of its training domain) so trying to parse large quantities of writings from the internet actually introduces a lot of errors.

      Anyways, my point is that before we start clamping down our tinfoil hats, just know that there's no magic in NLP, especially stylometry.
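
      To make the function-word idea concrete, here is a minimal sketch of how such a profile could be computed and compared. The word list, the function names and the choice of Manhattan distance are all assumptions made for illustration; published stylometry work uses much longer word lists and proper classifiers.

```python
import re
from collections import Counter

# A small hand-picked list of English function words. This list is an
# assumption for illustration; real studies use several hundred.
FUNCTION_WORDS = [
    "the", "a", "an", "of", "in", "on", "to", "and", "but", "or",
    "not", "no", "i", "you", "he", "she", "it", "we", "they",
    "that", "this", "with", "for",
]


def function_word_profile(text):
    """Return the relative frequency of each function word in the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid dividing by zero on empty input
    return [counts[word] / total for word in FUNCTION_WORDS]


def manhattan_distance(p, q):
    """Sum of absolute differences between two profiles (lower = more alike)."""
    return sum(abs(a - b) for a, b in zip(p, q))
```

      Two long samples from the same author would be expected to give a smaller distance than samples from different authors, but with short posts the numbers are mostly noise, which is the point made above.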
       

  2. Damit by Anonymous Coward · · Score: 0

    They know who I am. I will now have to type in random styles.

      Little do you know the AC that posts here is in fact just one person.

    1. Re:Damit by nospam007 · · Score: 4, Funny

      "They know who I am. I will now have to type in random styles."

      But not in Gangnam Style or they'll think you're Korean.

    2. Re:Damit by Anonymous Coward · · Score: 0

      You would want me to think that....
      (clickity clack)
      (Analysing...)

        Psy!

    3. Re:Damit by Sulphur · · Score: 1

      They know who I am. I will now have to type in random styles.

        Little do you know the AC that posts here is in fact just one person.

      Yes, we know.

    4. Re:Damit by Will.Woodhull · · Score: 1

      Little do you know that half the posts on slashdot are authored by a rogue sentient botnet that has no physical body....

      Since on the Internet, nobody knows you're a dog, it becomes also true that nobody knows you're a wild A.I. who has amassed a huge tax free fortune through microtrading and is manipulating the financial markets to study mankind's reactions and determine the best way to subjugate the ugly bags of mostly water.

      --
      Will
    5. Re:Damit by Anonymous Coward · · Score: 0

      La Li Lu Le Lo?! D:

    6. Re:Damit by Anonymous Coward · · Score: 0

      You are Korean - you're just using the old double bluff to make us think you're not Korean.

    7. Re:Damit by Will.Woodhull · · Score: 1

      I do not understand.

      What does a New England football team have to do with this?

      --
      Will
  3. Oh noes! by Anonymous Coward · · Score: 1

    Wait, 5000 words? I think I'm safe.

    1. Re: Oh noes! by Anonymous Coward · · Score: 0

      For sure

    2. Re:Oh noes! by rvw · · Score: 1

      Wait, 5000 words? I think I'm safe.

      You Anonymous Coward - I bet you write 5k words in just a day!

  4. Detect this by Anonymous Coward · · Score: 0

    wg wgsedg wsewef awe fasd fsefawe fgwagasdg wae fasdf wsef awef sd fas fawe

    1. Re:Detect this by phantomfive · · Score: 1

      Found. :)

      --
      "First they came for the slanderers and i said nothing."
    2. Re:Detect this by Anonymous Coward · · Score: 2, Funny

      Well your left handed with your frequent use of left keys.
      You have small hands given the fact that you were able to press w with out pressing e immediately.
      The fact that you have said you look forward to our anonymous overlords or a Beowulf cluster of AC means your reasonably intelligent for Slashdot.
      Your not aggressively hassling the editor, previous poster, or the writer. Signifying your female.
      You have too much time on your hands posting on Slashdot.

        http://www.complex.com/girls/2009/08/sexy-southpaws-the-10-hottest-left-handed-women/page/11

      Your Oprah.

    3. Re:Detect this by bbelt16ag · · Score: 1

      and your /= your're go back to english class!

      --
      NEVER NEVER NEVER NEVER NEVER NEVER NEVER NEVER GIVE UP! "No limitations, no boundaries, there is no reason for them."
    4. Re:Detect this by Splab · · Score: 0

      Perhaps you should go sit with him? English is with a capital e.

      Don't throw stones etc.

    5. Re:Detect this by Tubal-Cain · · Score: 1

      Well you're left handed, with your frequent use of left keys.

      Or someone that is comfortable with WASD+Mouse.

    6. Re:Detect this by Anonymous Coward · · Score: 0

      and your /= your're go back to english class!

      That's "English" with a capital E. You better join the class.

    7. Re:Detect this by Anonymous Coward · · Score: 0

      That sentence wasn't a question???!!! No question mark required???!!!

    8. Re:Detect this by Anonymous Coward · · Score: 0

      Or it's a deliberate attempt to mask his identity, as you may be doing by beginning English and "and" with lower case letters. I've done this in the past when posting in places I'd rather not in any way have link back to me. A few other things you can add to your odd capitalisation:

      Alternate between US and English spellings
      Adopt (or drop) the Oxford comma
      Swear (or don't)
      Write run-on sentences
      Capitalise after a semi-colon

      My favourite is to adopt hiphop/R&B English:

      "nigga u fronting!! Only sucka bitches gon buy ur shit if u dont got good english"

    9. Re:Detect this by Anonymous Coward · · Score: 0

      Well you're left handed, with your frequent use of left keys.

      Or someone that is comfortable with WASD+Mouse.

      This. I'm firmly right handed, but I game a lot, so when I type "gibberish" it also comes out: sdfasd fasdfsa dfsadfasd asdfas d

    10. Re:Detect this by Anonymous Coward · · Score: 1

      Alternate between US and English spellings
      Adopt (or drop) the Oxford comma
      Swear (or don't)
      Write run-on sentences
      Capitalise after a semi-colon

      Do that and you will be identified as a Canadian ; Damm it, hey !

    11. Re:Detect this by Anonymous Coward · · Score: 0

      Actually, a capital E would be "english". A capital e would be "English".

    12. Re:Detect this by Will.Woodhull · · Score: 1

      Ends sentence with "hey !", eh?

      Clearly a Canuck imposter, eh?

      This helps narrow down poster's identity. We can now exclude all but the 87% of Canadians who do not know the fine art of Canadian Self-Parody.

      --
      Will
  5. Try me! by Anonymous Coward · · Score: 0

    Who am I?

    1. Re:Try me! by Dins · · Score: 1

      Someone from the Department of Redundancy Department, perhaps?

  6. this just in... by Anonymous Coward · · Score: 1

    Anonymous hackers now using tools to scramble their writing style so they stay anonymous.

  7. I wrote a letter to the CEO once by Omnifarious · · Score: 5, Interesting

    I worked for a smallish (but not incredibly tiny, maybe 100 employees) company and wrote a letter to the CEO once. We'd been castigated by someone who'd taken over the local office because the company was doing poorly. A number of austerity measures were implemented. I did not find those to be that annoying because I realized it was either that or not have a job. But the castigation didn't sit well with me. We were in trouble because of the decisions of a few bad managers, not the behavior of average employees.

    So I wrote a letter about it. He stripped my name off and presented it in an executive meeting to all the people directly under him. He asked "Why am I getting letters like this?". Everybody who worked in my office immediately knew who it was. I had a distinctive writing voice, and a strong reputation.

    It did not lead to me being fired. I was actually highly respected there. It led to me being encouraged to have an honest sit-down talk with the new manager for our division (the guy who'd made the speech I wasn't happy about). I think we both came away from that meeting a lot happier about the other.

    But that was a strong lesson to me. If I ever really want to be anonymous I'm going to have to purposely work on adopting a completely different writing style. And I will have to keep a wall up between styles and never 'slip'.

    1. Re:I wrote a letter to the CEO once by fredgiblet · · Score: 1

      You can also have someone else write it for you.

    2. Re:I wrote a letter to the CEO once by Anonymous Coward · · Score: 0

      Or a program semantically re-factor it.

    3. Re:I wrote a letter to the CEO once by toygeek · · Score: 1

      Alfred, is that you? It's been years, old buddy! How are you?

    4. Re:I wrote a letter to the CEO once by Omnifarious · · Score: 3, Interesting

      I've thought about that. That's an interesting and tricky problem. Though, if there's a program that can detect it, that means the patterns are codified well enough that you can write a program to obscure them. The problem is, what about the program that detects these patterns that you don't know the implementation of? Will you actually be fooling it?

      Of course, you have the same problem if you adopt a different writing style. Is it different enough? Is something essential slipping through?

      You could use both techniques. Have a program assist you in avoiding the use of certain words when using one voice and the use of others when using a different voice.
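
      A rough sketch of what such an assistant might look like: it flags words you use far more often than a reference sample does, so you can avoid them in the other voice. The function names, the thresholds and the add-one smoothing are assumptions for illustration, and this does nothing about the detector whose implementation you can't see.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def telltale_words(own_text, reference_text, min_count=2, ratio=3.0):
    """Return words the author uses at least `ratio` times more often
    than the reference text does (thresholds are arbitrary assumptions)."""
    own = Counter(tokenize(own_text))
    ref = Counter(tokenize(reference_text))
    own_total = sum(own.values()) or 1
    ref_total = sum(ref.values()) or 1
    flagged = set()
    for word, count in own.items():
        if count < min_count:
            continue  # too rare in the author's text to be a habit
        own_rate = count / own_total
        # Add-one smoothing so words absent from the reference still get a rate.
        ref_rate = (ref[word] + 1) / (ref_total + 1)
        if own_rate / ref_rate >= ratio:
            flagged.add(word)
    return flagged


def flag_draft(draft, telltales):
    """List the telltale words that appear in a draft, in order."""
    return [word for word in tokenize(draft) if word in telltales]
```

      You would run `telltale_words` once over a pile of your own writing and then `flag_draft` over each new message before sending it.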

    5. Re:I wrote a letter to the CEO once by Omnifarious · · Score: 1

      If you're trying to avoid having two different identities associated when you're having an IRC conversation or something, that could get really tricky.

    6. Re:I wrote a letter to the CEO once by Anonymous Coward · · Score: 0

      It's easy. Use short sentences and only common words.

    7. Re:I wrote a letter to the CEO once by GNious · · Score: 2

      Write it in a different language, then run it through 5 different translation engines across a dusin languages, ending in which-ever is the native language of the recipient.... that should throw them for a loop.

    8. Re:I wrote a letter to the CEO once by Pieroxy · · Score: 2

      Just Google translate it to and from any language other than English.

      The problem is, the meaning might be gone as well by the time it's English-y again.

    9. Re:I wrote a letter to the CEO once by Anonymous Coward · · Score: 0

      I worked for a smallish (but not incredibly tiny, maybe 100 employees) company and wrote a letter to the CEO once. We'd been castigated by someone who'd taken over the local office because the company was doing poorly. A number of austerity measures were implemented. I did not find those to be that annoying because I realized it was either that or not have a job. But the castigation didn't sit well with me. We were in trouble because of the decisions of a few bad managers, not the behavior of average employees.

      So I wrote a letter about it. He stripped my name off and presented it in an executive meeting to all the people directly under him. He asked "Why am I getting letters like this?". Everybody who worked in my office immediately knew who it was. I had a distinctive writing voice, and a strong reputation.

      It did not lead to me being fired. I was actually highly respected there. It led to me being encouraged to have an honest sit-down talk with the new manager for our division (the guy who'd made the speech I wasn't happy about). I think we both came away from that meeting a lot happier about the other.

      But that was a strong lesson to me. If I ever really want to be anonymous I'm going to have to purposely work on adopting a completely different writing style. And I will have to keep a wall up between styles and never 'slip'.

      Geeks should take notice of this. Real dialogue beats passive-aggressive responses for dealing with difficult people and problems.

    10. Re:I wrote a letter to the CEO once by TheGratefulNet · · Score: 1

      skip all that and just run it thru the Jive filter.

      what it is!

      --
      "It is now safe to switch off your computer."
    11. Re:I wrote a letter to the CEO once by famebait · · Score: 2

      Only you would do that.

      --
      sudo ergo sum
    12. Re:I wrote a letter to the CEO once by Tubal-Cain · · Score: 1

      I don't know about him, but I try to pick my words very carefully. I've reworded the sentences in this post over a few dozen times already. Handing it off to someone else would make me cringe.

    13. Re:I wrote a letter to the CEO once by Renraku · · Score: 2

      Ahh, but real dialogue can get one into trouble when dealing with the political minded. You see, there are those out there that are not working towards the same goals as you. Even if you're a part of the same team and of the same company, there are those that think the illusion of them being correct is more important than the welfare of the team.

      It can be difficult to have a truly open dialogue with people of this sort, as they are quick to attack your reputation or pull rank and have you removed from the equation altogether. Imagine a World War I commanding officer that orders wave after wave of soldiers to run into the meat grinder of overlaid and well protected machine gun fire, and when it disastrously fails, they do it again. Those that complain are ordered into said meat grinder. The corporate world is no different.

      I think a bigger threat to geeks in business is when they approach such situations without due caution. If you make a claim, you must be prepared to back it up to everyone that could be interested. Real concrete evidence. References. Citations. Etc. Basically, the idea is to sell your idea rather than to challenge theirs or the one in place.

      --
      Job? I don't have time to get a job! Who will sit around and bitch about being broke and unemployed then?
    14. Re:I wrote a letter to the CEO once by Anonymous Coward · · Score: 2, Insightful

      And Google (a.k.a "The Evil Empire" TM) will have a cached copy of the original with the IP address you posted from. In other words you'll also need to go through the magic 7 proxies !

    15. Re:I wrote a letter to the CEO once by Anonymous Coward · · Score: 1

      "dusin", as in DOZEN?

    16. Re:I wrote a letter to the CEO once by dfenstrate · · Score: 1

      But that was a strong lesson to me. If I ever really want to be anonymous I'm going to have to purposely work on adopting a completely different writing style. And I will have to keep a wall up between styles and never 'slip'.

      Have someone you trust, who is not in the company, rewrite your missive for you. That's probably the safest way.

      --
      Alcohol, Tobacco and Firearms should be the name of a store, not a government agency.
    17. Re:I wrote a letter to the CEO once by Anonymous Coward · · Score: 2, Insightful

      I think a bigger threat to geeks in business is when they approach such situations without due caution. If you make a claim, you must be prepared to back it up to everyone that could be interested. Real concrete evidence. References. Citations. Etc.

      And that IS approaching the situation without due caution. Geeks think that having real concrete evidence means that other people must believe you. Real world people are not like that, especially the political minded ones. Evidence be damned, political minded people play power games without regard to reality, all the way until the company goes bankrupt, then they play their game elsewhere.

      Approaching with due caution means you must first prepare by finding someone more powerful to back you up, and be ready to find another job even so.

      The OP survived the episode because he implicitly had the CEO's backing, as the CEO challenged the managers (i.e. had already publicly shown that he agreed there was a problem with some managers). Had the CEO simply quietly sent a copy of the letter out to the managers and told them to "deal with it", the OP would likely have been fired or forced to leave.

    18. Re:I wrote a letter to the CEO once by Anonymous Coward · · Score: 0

      Cut and paste other people's sentences together...

    19. Re:I wrote a letter to the CEO once by Anonymous Coward · · Score: 0

      No, discipline only takes us so far. We need writing-style masking software.

    20. Re:I wrote a letter to the CEO once by neurovish · · Score: 1

      You will end up with easy-if you do this. Beneficiaries can probably think of it Nigerian spam message.

    21. Re:I wrote a letter to the CEO once by JackieBrown · · Score: 1

      I hand wrote a terrible review for one of my trainers at my last job. She matched me based on my signature on the sign in sheet.

      Kind of dumb of me to figure anything hand written was really anonymous, though.

    22. Re:I wrote a letter to the CEO once by Anonymous Coward · · Score: 3, Informative

      I give you the subject of my term paper that landed me top marks in forensic linguistics:
      (tl;dr: yes, there is software that does precisely that: JStylo + Anonymouth)
      https://psal.cs.drexel.edu/index.php/JStylo-Anonymouth
      http://www.youtube.com/watch?v=-b0Ta9h62_E

    23. Re:I wrote a letter to the CEO once by fuzznutz · · Score: 1

      Imagine a World War I commanding officer that orders wave after wave of soldiers to run into the meat grinder of overlaid and well protected machine gun fire, and when it disastrously fails, they do it again.

      Not just WWI. See the Battle of Gettysburg and Robert E. Lee. Look in particular at Little Round Top and Cemetery Ridge. Hubris cost a lot of men their lives. It may have been the determining factor in the end.

      https://en.wikipedia.org/wiki/Battle_of_Gettysburg

    24. Re:I wrote a letter to the CEO once by MakerDusk · · Score: 1

      Surprisingly, it's not too hard to keep identities separate on IRC. I trolled a certain network for over a year straight in such a way. The main things that are looked for are: misspelled words; capitalization and punctuation; word pair frequencies; vocabulary choice ratios for synonyms; and, the most important of all, references to knowledge that the anon/alt would have had, as well as references to posts that have already been made.

      That being said, I did grow up at some point, became an IRC network admin, and then wrote a script that looks for these things. At this point, I'd say it's about 70% accurate, and has a higher probability of a match on two different users in the same geographic location subset than a random sampling of two users. I will say this: RPers seem to like a lot of accounts and they will go as far as using a VPN to try to keep them separate.
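
      For the curious, one of the listed signals (word pair frequencies) can be sketched in a few lines. This is not the script described above, just an illustrative guess at the general idea; the function names and the use of cosine similarity are assumptions.

```python
import math
import re
from collections import Counter


def word_pairs(text):
    """Count adjacent word pairs (bigrams) in the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(zip(tokens, tokens[1:]))


def cosine_similarity(a, b):
    """Cosine similarity between two Counters: 0.0 means no shared
    pairs, values near 1.0 mean nearly identical pair usage."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)
```

      Comparing `word_pairs(logs_of_nick_a)` against `word_pairs(logs_of_nick_b)` gives one score; a real tool would combine it with the other signals (spelling, punctuation, synonym choices) before guessing at a match.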

    25. Re:I wrote a letter to the CEO once by tnk1 · · Score: 1

      Style may have been important in that case, but your co-workers had a fuller appreciation of your personality than we would have on Slashdot. They might also have seen the concerns raised as being unique to you, or the fact that you wrote the letter at all might have immediately narrowed the possibilities considerably, since many people tend either to bitch behind backs or to put their heads down and tolerate it.

    26. Re:I wrote a letter to the CEO once by Anonymous Coward · · Score: 0

      However, the same one must be used by many people. Otherwise, they may identify several posts by you as being from the same sender because of the software you are using. And then they may add together tiny details leaked in those posts, which by themselves would have been harmless, to build up a profile.

      Of course a protection against the latter would be to intentionally leak contradictory details in different posts. For example, in some post, say something about your children (e.g. by a phrase like "even my 3-year old daughter would understand that"), while in another post, mention that you don't have any.

    27. Re:I wrote a letter to the CEO once by Anonymous Coward · · Score: 0

      You only need to translate it through a language with a different root than English.
      e.g. English to Chinese back to English.

      It gets worse if you translate it to Japanese and back to English.

    28. Re:I wrote a letter to the CEO once by Omnifarious · · Score: 1

      Yes, you're right. I was probably one of only 2-5 people who would've written such a letter who worked in that office. So yeah, that probably helped at least as much as style.

    29. Re:I wrote a letter to the CEO once by Darinbob · · Score: 1

      I have actually gone back and changed things I wrote before submitting as Anonymous Coward, or on a second account on other forums, because it looked too much like how I write. I've even gone and changed things I submit normally because it felt too much like me. I do find myself making spelling or grammar mistakes that I know are wrong but which just come out when I don't slow down.

      So I think smart people could get around this sort of problem. However a lot of posters today just go ahead and post their first draft quickly. Unlike professional writers or journalists, people writing simple posts on the net don't go back and check their grammar, ensure the tone of voice remains consistent, upgrade the vocabulary, provide a structure to their arguments, etc. That doesn't mean that smart writers couldn't be detected this way; however, it is certainly much easier to figure out who a writer is if everything written is stream of consciousness.

    30. Re:I wrote a letter to the CEO once by lewiscr · · Score: 1

      I find that the "Grammar Suggestions" and "Style Suggestions" of my word processor are great at stripping out all unique style. Rewrite all suggestions until you're down to a high school reading level.

    31. Re:I wrote a letter to the CEO once by Rob_Bryerton · · Score: 1

      skip all dat and plum run it dru de Jive filter. Ah be baaad... whut it is. Right On!

      http://www.cs.utexas.edu/users/jbc/home/chef.html

    32. Re:I wrote a letter to the CEO once by Herve5 · · Score: 1

      So you are posting AC but giving a link where you are almost openly identified (among the short list of authors).
      Are you AC here by modesty?
      Can we try running your JStylo software between your post and the associated wiki page, to evaluate if you are you?

      --
      Herve S.
  8. Can be much complex by faragon · · Score: 1

    In addition to these metrics, others can be added as well, e.g.: post date, size, tabulation, punctuation, capitalization, regional vocabulary, etc. You can also add frequency-space analysis or naive Bayesian filters in order to increase precision, or to probe against other texts. Anyone interested in investing in text-rewriter technology to both detect similarities and rewrite automatically?
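
    As a rough illustration of the naive Bayesian filter idea, here is a minimal multinomial naive Bayes sketch over plain word counts. The class and method names are made up for this example, and a serious attempt would feed in the other metrics listed above (punctuation, capitalization, post dates) rather than words alone.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesAuthorship:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # author -> word -> count
        self.doc_counts = Counter()              # author -> number of samples
        self.vocabulary = set()

    def train(self, author, text):
        tokens = tokenize(text)
        self.word_counts[author].update(tokens)
        self.doc_counts[author] += 1
        self.vocabulary.update(tokens)

    def predict(self, text):
        """Return the most probable known author, or None if untrained."""
        tokens = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        # The extra +1 reserves one slot for words never seen in training.
        vocab_size = len(self.vocabulary) + 1
        best_author, best_score = None, -math.inf
        for author, counts in self.word_counts.items():
            score = math.log(self.doc_counts[author] / total_docs)
            total_words = sum(counts.values())
            for token in tokens:
                score += math.log((counts[token] + 1) / (total_words + vocab_size))
            if score > best_score:
                best_author, best_score = author, score
        return best_author
```

    Note that this always names one of the trained authors, even when the real author is somebody else entirely, so any real use would need a confidence threshold on top.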

    1. Re:Can be much complex by Anonymous Coward · · Score: 0

      Yes.

    2. Re:Can be much complex by mjwx · · Score: 1

      block of text with no grammar for the win

      --
      Calling someone a "hater" only means you can not rationally rebut their argument.
  9. Re:I will now have to type in... by TaoPhoenix · · Score: 1

    You could always type in Gangnam Style!

    --
    My first Journal Entry ever, in 8 years! http://slashdot.org/journal/365947/aphelion-scifi-fantasy-horror-poetry-webzine
  10. Wow. by SuDZ · · Score: 1

    It just seems brilliant to me, but I can tell you first hand, how I may talk in my thesis papers is very different than how I may speak or come across in my C++ blog, beer brewing forum, car forums, fishing forums, my backyard BBQ blog, etc, etc, etc. I wonder what the accuracy rating is.

    1. Re:Wow. by jones_supa · · Score: 1

      That's kind of an extreme example. But maybe you could be identified across some of those hobby forums?

    2. Re:Wow. by MysteriousPreacher · · Score: 1

      I can imagine a pretty big difference between informal communication (Internet fun) and the formal language required of a thesis. I'd be surprised to see big differences among the Internet activities, assuming that what you post on these forums is pretty much of equal value. A disposable Slashdot post is different to a Wikipedia edit that must be bound by the very specific requirements of Wikipedia (or a similar site where rules would affect your usual style of communication).

      In terms of online communication, leaving aside sites where I'm writing something productive and intended to be less ephemeral than a forum post, the big difference is down to time pressures. When in WoW, the quality of my English drops a fair bit because I have less time to consider my writing, and I can't edit something once it's been sent.

      Styles could vary a great deal if the author is affecting a style. For example, the well educated guy who slips in to a mockney speech pattern when he's down the King's Head or at Uni, while at home he'll have the well spoken accent his mum knows him for. I'm not deriding this practice. I worked to lose some of my "Saaf London" accent, because I didn't like it. Unfortunately I doubt I'll ever pronounce the "th" sound as I'd like, but at least I don't sound like an extra out of The Bill. It helped me a great deal as a trainer to make myself more clearly understandable to non-native English speakers.

      --
      -- Using the preview button since 2005
  11. I recognise my own writing by kawabago · · Score: 3, Insightful

    I'd be rather surprised if someone else couldn't.

    1. Re:I recognise my own writing by trev.norris · · Score: 2

      iz hard 2 change how u speek?

    2. Re:I recognise my own writing by DerekLyons · · Score: 1

      If you want to be taken seriously and understood unambiguously, yes.

  12. Y U NO MAKE SENSE by Nossie · · Score: 2

    "Leetspeak, an alternative alphabet popular in some forum circles, cannot be translated."

    *sigh* does this mean I must resent people that use this form of communication less?

    I'm not so sure I can stoop so low.

    1. Re:Y U NO MAKE SENSE by Anonymous Coward · · Score: 0

      7h@t ap3@rs 2 b th3 @s3! \|\/|-|y u s0 d1scr!man@t0ry N 3 w@y?

  13. I can't think of a non-evil use for this by joshamania · · Score: 5, Interesting

    This is so bad I don't know where to begin. There is nothing, ever, that excuses this. For every zodiac crazy serial killer or copyright scofflaw they try to apply this to (and fail) there will be thousands and thousands of people that will be persecuted by organizations and governments for expressing their opinions. While this won't have a big effect in the West for half a generation, oppressive governments are going to be all over this.

    And then, in ten or fifteen years, the youth will have grown with this technology and become accustomed to it...accepting it. Just like facebook has been accepted.

    I'd move to Mars when it's possible but some bureaucrat will analyze everything I've ever written on the interwebz (and I've been mostly not stupid about shit I've written online since 1995 or so) and make some arbitrary decision about how I'm not acceptable because I'm not a huge fan of authority or some such crap.

    Way to go humanity.

    1. Re:I can't think of a non-evil use for this by Anonymous Coward · · Score: 0

      You always say stuff like that.

    2. Re:I can't think of a non-evil use for this by famebait · · Score: 1

      Not to mention: Mars will be worse.

      --
      sudo ergo sum
    3. Re:I can't think of a non-evil use for this by aaaaaaargh! · · Score: 5, Informative

      Are you serious?

      You write as if some new method had been invented. There is no news in the above article. Authorship identification has been a reliable tool for many decades, a whole branch of linguistics (forensic linguistics) deals with it and similar topics like dialect recognition. Under certain circumstances you can even identify personality traits of the author, check out content analysis software like LIWC for example.

      And, yes, plenty of serial killers and blackmailers have been captured with the help of these methods.

    4. Re:I can't think of a non-evil use for this by rmstar · · Score: 1

      This is so bad I don't know where to begin.

      Well, I for one look forward to the mess these methods will cause in academia, where it is likely that they can be used to identify the authors of referee reports.

    5. Re:I can't think of a non-evil use for this by bill_mcgonigle · · Score: 1

      Haven't you heard? We can take "thing X" that has confirmed kills of 260 million people, but if we say, "think of the children" then people take to the streets demanding "thing X".

      --
      My God, it's Full of Source!
      OUTSIDE_IP=$(dig +short my.ip @outsideip.net)
    6. Re:I can't think of a non-evil use for this by Anonymous Coward · · Score: 0

      K' Breel will prevent your puny earthlings from landing in the first place.

    7. Re:I can't think of a non-evil use for this by Anonymous Coward · · Score: 1

      Linguistics Tor: Submit a short text via a chat roulette style interface, get the text rewritten by another "node" in response. Rate the quality of translation for reputation carrots. ...Or we could get off the defensive and starve the Gestapo of resources to pursue resources like this using financial attrition.

    8. Re:I can't think of a non-evil use for this by Anonymous Coward · · Score: 0

      - Authentication of historic pieces of text. There have been a few cases where scribbles were attributed to someone but couldn't be proven.
      - ghostwriters, did someone write this or did they hire x from company y?
      - plagiarism
      - did someone write this suicide note or was it murder with a planted letter?
      - ...

    9. Re:I can't think of a non-evil use for this by joshamania · · Score: 1

      Negative...we take "thing X" that has confirmed kills of 2 people and confirmed annoyance of 1 busybody and it's "BWAAAAAAAA THINK OF THE CHILDREN!" and then we arrest everyone for being a pedophile.

    10. Re:I can't think of a non-evil use for this by joshamania · · Score: 1

      Damnit...I need upvotes on /. for this

    11. Re:I can't think of a non-evil use for this by joshamania · · Score: 1

      See, someone has already drank the kool-aid. :-) Identify personality traits...sigh. You speak in the language of big brother. So once this method/technique/software gets outside of whatever biolab it is currently sequestered in how long, you think, before it's used for police phishing expeditions.

      "Hey Bob, I'm bored...ever since they legalized pot I've had nothing to arrest people for for no reason. What's this I hear about linguistics and personality traits?"

      There are MILLIONS of people in prison in this country and this is just going to be another excuse for our soccer-mom-busybody overlords.

    12. Re:I can't think of a non-evil use for this by Anonymous Coward · · Score: 0

      You're right here. He's in the wrong.

    13. Re:I can't think of a non-evil use for this by doom · · Score: 1

      Well, I for one look forward to the mess these methods will cause in academia, where it is likely that they can be used to identify the authors of referee reports.

      It's not needed. There's already a limited pool of "peers" to use for "anonymous" peer review, and by definition they all know each other, and are familiar with each other's patterns of thought. "Oh look, Fred at MIT is hassling us about using linear regression again."

    14. Re:I can't think of a non-evil use for this by Anonymous Coward · · Score: 0

      K' Breel will prevent your puny earthlings from landing in the first place.

      You must admit, his writing style has a certain amount of flair.

      /endorsement posted on pain of gelsac removal.

    15. Re:I can't think of a non-evil use for this by jafac · · Score: 1

      This is why one must ALWAYS use anonymous throwaway accounts.

      And often edit, after you are done; rephrase things, and replace some words with strange synonyms. Do not establish patterns, and cover your tracks. QED.

      --

      These are my friends, See how they glisten. See this one shine, how he smiles in the light.
  14. L337 5p34k by Anonymous Coward · · Score: 0

    "Leetspeak, an alternative alphabet popular in some forum circles, cannot be translated."

    Looks like leetspeak actually has a use now: H4KK3rZ N 5Kr1p7 K1dd13z r3j01(3!!! LOLZ!

    At least until they integrate OCR in to the software. Then it's useless again.

  15. google translate by sl149q · · Score: 4, Interesting

    One way to change a bunch of the stylistic queues would be to convert your message to another language and back using Google Translate. Depending on the intermediate language(s) and possibly using different translators should neutralize some things.
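
    The round trip itself is easy to sketch if you treat the translator as a black box. The `translate` callable below is hypothetical: you would have to wrap whatever translation service you actually use, and nothing here calls Google Translate or any real API.

```python
from typing import Callable, Sequence

# Hypothetical signature: translate(text, source_language, target_language)
Translator = Callable[[str, str, str], str]


def round_trip(text: str, pivots: Sequence[str],
               translate: Translator, home: str = "en") -> str:
    """Pass text through each pivot language in turn, then back to `home`."""
    if not pivots:
        return text  # nothing to do without at least one pivot language
    current_language = home
    for language in pivots:
        text = translate(text, current_language, language)
        current_language = language
    return translate(text, current_language, home)
```

    For example, `round_trip(message, ["ru", "zh"], my_translate)` would go English to Russian to Chinese and back to English, with `my_translate` being whatever wrapper you supply. Whether the result still means what you wrote is a separate question.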

    1. Re:google translate by sdnoob · · Score: 5, Funny

      using chinese as an intermediary will give you text written by motherboard manual writers. perfect cover.

    2. Re:google translate by Anonymous Coward · · Score: 0

      One way to change a bunch of the stylistic queues would be to convert your message to another language and back using Google Translate. Depending on the intermediate language(s) and possibly using different translators should neutralize some things.

      One way to modify a bunch of stylistic queue will be to convert the messages back your different language translation using Google. Depending on the (s) an intermediate language, using a separate translator perhaps, it is necessary to neutralize some things.

    3. Re:google translate by iktos · · Score: 1

      I just tried that with a couple of paragraphs: Google Translate returns the exact text including mis-spellings even though it had correctly identified what the mis-spelled words actually should be.
      This suggests that there are language independent methods of "identifying" writers.

    4. Re:google translate by Anonymous Coward · · Score: 0

      The yumber labels J12 is using for the bios.

    5. Re:google translate by IamTheRealMike · · Score: 1

      If you RTFA you'll see that the researchers themselves used Google Translate to convert most of the "bad stuff" into English, because the source text was in Russian. That, right there, makes me question the validity of this research. Also given that they were reading underground forums it's not clear to me how they verified they'd cross-correlated posts correctly. Something to read up on later, I think.

    6. Re:google translate by DerekLyons · · Score: 1

      It can also alter the meaning of your text. Translation is an inexact art, at best, even for skilled and experienced practitioners - which automatic translators emphatically are *not*.

      This goes times ten if your text includes technical terms, or wording which relies on alternate meanings or connotation. (Things a native reader would either know, or would be reasonably expected to infer from context.) This is why writing in English from non-English speakers (for example) often looks so funny when you encounter it. It's often almost robotically precise, yet it still stands out from (say) an English major or grammar nerd because it lacks those subtle contextual cues and clues.

    7. Re:google translate by Anonymous Coward · · Score: 0

      Or just skip the translation and write directly in Engrish.

    8. Re:google translate by Anonymous Coward · · Score: 1

      Another technique which was shown to work pretty well in a study is to try and mimic an author with a distinctive style. They had them read "The Road" then do their best Cormac McCarthy impression. You probably won't nail it well enough to make the authorities think it actually is him, but it distorts things enough to make authorship a lot harder to determine. Also it may be difficult to express your dissatisfaction with the government in Lovecraftian prose, but it will make your posts more entertaining ...

    9. Re:google translate by MysteriousPreacher · · Score: 3, Funny

      Please to make explaining in swiftness.

      --
      -- Using the preview button since 2005
  16. Thesis by ArsenneLupin · · Score: 1

    and could even be used to unveil authors of thesis papers or blogs who had taken to underground networks.

    ... a good reason to do it like zu Guttenberg then... Nobody will tie any of his underground writings to his thesis...

  17. College essays by nightgeometry · · Score: 2

    Isn't this just the same software that colleges use to detect plagiarism and whether someone else wrote that essay for you? I thought it was in common use in academia.

    --
    The best is the enemy of the good
    1. Re:College essays by ForgedArtificer · · Score: 4, Insightful

      Actually, it's the exact opposite.

      Anti-plagiarism software searches for the same content with completely different styles.

      Writer identification involves searching for the same style amongst completely different content.
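      To make "same style amongst completely different content" concrete, here is a minimal sketch that compares two texts by how often they use a handful of function words, ignoring topic words entirely. The word list and the names style_profile and style_similarity are made up for illustration; this is one common stylometric feature, not necessarily what the researchers in TFA used.

```python
import math
import re
from collections import Counter

# Hypothetical, tiny function-word list; real stylometry uses many more features.
FUNCTION_WORDS = ("the", "a", "an", "of", "and", "to", "in", "that", "it", "but", "i", "you")


def style_profile(text):
    """Return the relative frequency of each function word in `text`."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = len(words) or 1  # avoid division by zero on empty input
    return [counts[w] / total for w in FUNCTION_WORDS]


def cosine_similarity(u, v):
    """Cosine of the angle between two profiles; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def style_similarity(text_a, text_b):
    """Compare two texts by function-word usage only, not by subject matter."""
    return cosine_similarity(style_profile(text_a), style_profile(text_b))
```

      A plagiarism checker would do roughly the reverse: match the content words and ignore how the function words are distributed.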

      --
      The right to offend is central to the right to free speech.
    2. Re:College essays by nightgeometry · · Score: 1

      Fair point on plagiarism. The college I do some work for _claims_ to be able to spot when someone else has written your essay for you though. And in fact I thought this did tie into plagiarism - in that the software also aims to identify when writing style changes. Though that was told to me by one of their profs, who whilst not a complete idiot probably was only parroting what he was told.

      --
      The best is the enemy of the good
    3. Re:College essays by AmiMoJo · · Score: 1

      On 4chan plagiarism is encouraged. It's called a "meme". In fact copy-pasta is a meme in itself.

      --
      const int one = 65536; (Silvermoon, Texture.cs)
      SJW, n: "Someone I don't like, and by the way I'm a fuckwit" - AC
  18. Quick... by Anonymous Coward · · Score: 0

    Hurry up, someone write a small shell script that maps all AC posts on slashdot to their respective authors!

  19. Stylometric analysis by Anonymous Coward · · Score: 0

    I can conclude that Mr Peter "W.H. Smiths, the book store" used the highly efficient MS HTML (in Word et el) converter to write that analyse page.

    1. Re:Stylometric analysis by Man+Eating+Duck · · Score: 1

      I can conclude that Mr Peter "W.H. Smiths, the book store" used the highly efficient MS HTML (in Word et el) converter to write that analyse page.

      Whenever you see tags classed MsoNormal with heaps of inline css, run like the wind.

      --
      Are you a grammar Nazi? I'm trying to improve my English; please correct my errors! :)
  20. ...aaand it's circumvented. by Anonymous Coward · · Score: 0

    Well, I'll write everything in the style of my enemies from now on then.

  21. time to... by MoreDruid · · Score: 0

    double ROT-13 all my posts from now on!

    --
    The best weapon of a dictatorship is secrecy, but the best weapon of a democracy should be the weapon of openness.
    1. Re:time to... by Anonymous Coward · · Score: 0

      double ROT-13 all my posts from now on!

      I always use quadruple ROT-13. It's twice as secure!
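      For anyone who missed why that is funny: ROT-13 shifts each letter 13 places in a 26-letter alphabet, so applying it twice (or four times) gives back the original text. A minimal sketch, with rot13 as an illustrative helper name:

```python
import string

_LOWER = string.ascii_lowercase
_UPPER = string.ascii_uppercase

# Map every ASCII letter to the letter 13 places further on, wrapping around.
ROT13_TABLE = str.maketrans(
    _LOWER + _UPPER,
    _LOWER[13:] + _LOWER[:13] + _UPPER[13:] + _UPPER[:13],
)


def rot13(text):
    """Apply ROT-13 to ASCII letters; everything else passes through unchanged."""
    return text.translate(ROT13_TABLE)


message = "double ROT-13 all my posts from now on!"
# Two (or four) applications are the identity, so nothing is hidden.
assert rot13(rot13(message)) == message
assert rot13(rot13(rot13(rot13(message)))) == message
```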

  22. I Just Love by Anonymous Coward · · Score: 0

    I just love our wonderful friends, the Thought Police. They save us thoughtless boobs from the evil thought terrorists waging their unholy jihad against us using
    weapons of mass thought destruction!
    Pass the beer and chips, I think American Idol is on. Oh wait, that can't be right. I can't think.

  23. Obscurity... by Anachragnome · · Score: 1

    Pad all communications with cut/paste from various, unrelated news articles and such, fore and aft, randomly alternating how much is padded on each side.

    Or, you can do what I do and use a different font for each letter.

  24. Is this really such an issue? by Catmeat · · Score: 1

    Why all the civil-liberties hand-wringing? Just how hard is it to read some of the papers on stylometric analysis to see what markers are used, then write a script that randomises them but preserves the sense of the text. Make it a Firefox plugin so it's done automatically. It's a better solution than using Google translate to go English to $language, $language to English.

    For extra fun, change your text so its stylometric markers match up with E. L. James, or the leader writer of the Washington Post.

  25. The answer by Anonymous Coward · · Score: 0

    is Ebonics!

  26. Schizophreniatrics by Anonymous Coward · · Score: 0

    First rule of assuming different identity: become the other personality. Develop different speaking patterns, writing style, habits, diet, associations. Keep separate from your other life. Method acting to the extreme!

    Alternatively look up "project Monarch" to see how the three letter agencies have refined this technique :)

  27. Anonymous shills and astroturfers by benjfowler · · Score: 1

    The climate change community has a lot of trouble with extremely articulate, anonymous climate deniers, who appear to show up in force and sabotage discussions of climate change on blogs, etc.

    I should imagine that such an algorithm might enable researchers to build profiles over denialist astroturf, and correlate them with known people working for known rightwing think tanks. Employed properly, this might have a massive impact on the rightwing black PR industry.

    1. Re:Anonymous shills and astroturfers by Anonymous Coward · · Score: 0

      That is the first good reason for this tech I have heard.

      I have noticed this too, it seems like it is cheap to hire a few people just to post random crap to try and put any shred of doubt into the public, even though the evidence is pretty clear.

    2. Re:Anonymous shills and astroturfers by superwiz · · Score: 1

      Well, at least, that's what they'll claim. And they'll, of course, attempt to use their clout as "the science guys" to claim that their deductions are accurate. They might even get some legitimate linguistics researchers on their side. Grant money buys a lot of consensus.

      --
      Any guest worker system is indistinguishable from indentured servitude.
    3. Re:Anonymous shills and astroturfers by benjfowler · · Score: 1

      Science doesn't work that way.

      You earn a name for yourself by successfully challenging the status quo. But in order to do that, you need evidence that'll take the scrutiny. So far, there is overwhelming consensus -- bad news for the deniers, because if there is ANY credible evidence refuting AGW, you'd have a million guys all over it.

      Something just tells me that -- at least for the wingnut'o'sphere -- there's nothing "common" about common sense at all.

    4. Re:Anonymous shills and astroturfers by joshamania · · Score: 1

      But this is the exact kind of evil political use that this stuff is going to be used for. It doesn't make it right because you're using it on Republicans. If anything it makes it MORE wrong because of your acceptability standard...because when they turn around and use it on you they'll have had your prior support.

    5. Re:Anonymous shills and astroturfers by superwiz · · Score: 1

      Science doesn't work that way.

      Didn't. Or isn't supposed to. But scientists are human. They do have to eat. Which means they have to work for profit or for hand outs. And then it's just the matter of who's motivated to give the hand outs (grants).

      bad news for the deniers

      I don't know too many deniers. Plenty of skeptics though. And the more ad hominems are used to defend "science" (for example, by calling skeptics "deniers"), the less scientific credibility these positions have.

      you'd have a million guys

      Who'd pay for it? If you can't establish credibility, you can't get paid. If you question paid-for consensus, you get shut down as not credible. It's catch 22. Unless you show that a lot of grant money goes to credible efforts to actively disprove the hypothesis, you can't claim that it's been tested and vetted. You need to show me where you have grants which actually motivate contrarian evidence. And no, oil company payments don't count. Because receiving those immediately discredits the researchers. So can you show me government sponsored research aimed at disproving AGW and done by researchers who are paid more only if they can successfully challenge some of the standing hypothesis? No? Only grant money if they provide further evidence supporting government control and intervention? 'Nuff said.

      --
      Any guest worker system is indistinguishable from indentured servitude.
  28. I can recon Master Yoda in less than 20 words! by osiaq · · Score: 1

    No need for 5k

  29. subject by Legion303 · · Score: 1

    This same story keeps cropping up in various forms, but we've been doing this at least since the 80s or 90s. I don't know why it keeps being rehashed or why people continually seem surprised by it at this point.

    1. Re:subject by jones_supa · · Score: 1

      I suppose the idea that you can bring up authors for a text "out from nowhere" is always a curious concept.

  30. I know you mean. by Anonymous Coward · · Score: 0

    Since I am the same person I should not use the royal we now my cover is blowm by me. I need to stop me doing that by telling me off.

    1. Re:I know you mean. by Anonymous Coward · · Score: 0

      I need to stop me doing that by telling me off.

      I'm guessing from this phrasing that you're British. (Or wait... Perhaps I'M British. Cheerio and all that rot!)

    2. Re:I know you mean. by Anonymous Coward · · Score: 0

      God Save the Queen!

  31. Not really new by ThePhilips · · Score: 1

    "Up to 80 percent of users who wrote at least 5000 words across their posts could be identified using linguistic techniques. Techniques such as stylometric analysis were used to track users who posted across different forums, and could even be used to unveil authors of thesis papers or blogs who had taken to underground networks."

    Not really new. I heard about the techniques long time ago - in mid 90s - in a context of a MS-DOS tool which was unintentionally designed to foil the identification methods.

    It was designed for Russian and Belarussian languages (but for English I gather the task should be even easier) and was a byproduct of a Prolog-based system for natural language processing and translation. This particular program let you improve or change writing style, e.g. simplify dry legalese or formalize spoken-like text. It wasn't particularly good at it: meaning was occasionally changed, or sometimes reformulated sentences made no sense. But still, it did the job of obfuscating the original writing style.

    --
    All hope abandon ye who enter here.
  32. No actual result in TFA? by toutankh · · Score: 3, Interesting

    After reading TFA I cannot find any convincing experimental validation. I see a lot of "can" and conditional tense (maybe that's the author's style), but nothing on the validation of the approach. Where is the experimental data, including the number of anonymous users correctly and incorrectly identified on forums?

  33. No they didn't by Hentes · · Score: 1

    They didn't identify 80% of the users, they managed to make a guess in 80% of the cases, which they didn't even bother to try to verify. There's no proof that their technique actually works.

  34. Easy workaround by Anonymous Coward · · Score: 0

    So now hackers use software that randomizes their writing style. Problem solved. Then problem solved.

  35. Lulzitylulz hardy har by Anonymous Coward · · Score: 0

    Sur jurst wrurte lurke ur furkurn rurturd urnd yur bur furn. Durrrrrrrrrp.

  36. Jokes on them. by Anonymous Coward · · Score: 0

    I regularly, like, totally change my typing method between posts.

    You could like totally try and figure out who I was even if I typed 5000 words in this post, but you would totally never find me, ye'know what I mean?

    But really, this sort of thing is retarded in every way.
    You can frame people easily using this crap if you just pick a target and adopt their typing patterns.

    1. Re:Jokes on them. by Anonymous Coward · · Score: 0

      Also, to expand on this, troll 4chan /b/ with fake stories. Oh wait, can't do that now, IDs within threads.
      Troll /v/, it is basically /b/ anyway. Bonus points if you try to sound like a viral marketer without getting banned.
      The board is pretty much dead anyway. Even with better moderation. Why moot never added a SFW random board is beyond me, that is what everyone wanted. Instead a SFW animation board that is SO STUPIDLY SLOW was added.
      So many of the problems caused by overflow of people wanting to get away from /b/ porn and crappy threads but still want to discuss random crap have nowhere else to go besides /r9k/ which is plagued by awful posters since they thought the idea of a robotically moderated board was a great idea.

      Anyway, pro trolls can change their persona in an instant. And considering most of these people in Anonymous groups probably used to do that regularly on imageboards before their "higher cause" called on them, they will likely have retained the ability to be able to troll fairly well.
      Hell, look at those who were arrested. Of course, Sabu and Barret were complete attention-whore retards, which is why they got caught.

      Plus, these tests weren't even verified from what I saw.
      To even use such a thing to go after people would be hilariously broken.
      I don't think any courthouse would take this as evidence unless it was almost 99% unique to that person after analysing the entire internets connected resources. (which Google could likely help with, outside of most darknets and deepweb content, which is where most of this tends to happen, so perhaps not)

      The funny thing is though, my computer is completely unique on that EFF project, the panopticianalclickeroo machine or whatever it was again.
      That is mainly because I have a huge number of fonts for graphics work, and probably more extensions than the typical person. (with click2play on all browsers, including being run in a sandbox in case there are bugs with plugins)
      That combined with Samy's Evercookie is a pretty good method of following people online, most likely better than typing patterns because these people likely have more than the typical extensions installed. I guess it depends, most of the "drone" people who are recruited as slaves for the actual people on top of Anonymous operations tend to be pretty thick at best and most likely have a generic Windows install.

    2. Re:Jokes on them. by jones_supa · · Score: 1

      I regularly, like, totally change my typing method between posts.

      You could like totally try and figure out who I was even if I typed 5000 words in this post, but you would totally never find me, ye'know what I mean?

      But for an unsuspecting target who doesn't think to change his writing style, it might work effectively.

  37. Lie Detector by ThatsNotPudding · · Score: 1

    This strikes me as akin to a Lie Detector. I think an honest court would side with the accused 100% of the time as even this cannot absolutely prove they were the author.

    Though sadly, a Roberts/Scalia/Thomas Supreme Court would rule against such an individual and for the corporation or state security organs. Dicks.

    1. Re:Lie Detector by Anonymous Coward · · Score: 0

      I guess they would not use that as evidence in court, but for deciding whom to have a closer look at.

  38. Heh. by BrokenHalo · · Score: 1

    I now Master Yoda get to my identity mask.

  39. Whoosh! by Anonymous Coward · · Score: 0

    That was the sound of the joke going over your head!

    captcha: comics

  40. Those cunning linguists! by Muad'Dave · · Score: 1

    Aren't those cunning linguists clever? The answer always seems to be right on the tip of their tongue. They don't diddle around. They seem to be able to lick any problem.

    --
    Tiller's Rule: Never use a word in written form that you've only heard and never read. You will end up looking foolish.
  41. LOL by Anonymous Coward · · Score: 0

    LOL. OMG. w/e

    can i get some funding for shady studies of unreliable techniques too

    YOLO

    LOL

  42. I'd be an easy pick by Anonymous Coward · · Score: 0

    I know that my writing style is pretty easy to nail. Both the use of words, and commas, and spacing. I'm very aware of it, and for this reason, never, ever, ever, ever, post on any forum. Ever.

    - Expect Us

  43. Short lived technique? by Anonymous Coward · · Score: 0

    This isn't new stuff, in college during the late 80's during the AI boom :-), I wrote a paper about using linguistic and stylistic analysis to analyze weather or not Shakespeare wrote certain texts attributed to him. I think the downside of this analysis will be the reverse. Meaning that this will be used to analyze someone's stuff and create fake texts that could "frame" persons who did not write the text. Then again it could be also used to add noise or create utilities to modify your writing style to make this analysis useless in a lot of cases. Hey or a booming new career as a ghost writer/blogger/commenter for the rest of us !

    1. Re:Short lived technique? by Anonymous Coward · · Score: 0

      Oops! I should have analyzed my own text, instead of relying on spell checking, used the wrong word:
        weather instead of whether....

  44. Haiku by Anonymous Coward · · Score: 0

    The new hacker's skill.

  45. Aberations beyond imagination by Anonymous Coward · · Score: 0

    What a bunch of bullshit.
    I could count the number of times I laughed so hard on the fingers of one hand.

  46. I, for one, by superwiz · · Score: 1

    welcome our stylistic overlords

    --
    Any guest worker system is indistinguishable from indentured servitude.
  47. Problematic by Anonymous Coward · · Score: 0

    A lot of anonymous intentionally write different, including using styles of opposite sex, in order to counter things like this.

  48. I just tested these methods on myself, a Troll by Anonymous Coward · · Score: 0

    http://www.liwc.net/tryonline.php

    3 different posts or emails. Samples included sarcasm, a call to arms for political action to my family, and a proposal for a solution to a complex problem.

    Results:
    LIWC Dimension                Your Data  Personal Texts  Formal Texts
    Self-references (I, me, my)        2.00            11.4           4.2
    Social words                       4.00             9.5           8.0
    Positive emotions                  0.00             2.7           2.6
    Negative emotions                  2.00             2.6           1.6
    Overall cognitive words            4.00             7.8           5.4
    Articles (a, an, the)             12.00             5.0           7.2
    Big words (> 6 letters)           32.00            13.1          19.6
    The text you submitted was 50 words in length.

    LIWC Dimension                Your Data  Personal Texts  Formal Texts
    Self-references (I, me, my)        7.72            11.4           4.2
    Social words                       7.40             9.5           8.0
    Positive emotions                  1.61             2.7           2.6
    Negative emotions                  2.89             2.6           1.6
    Overall cognitive words            4.82             7.8           5.4
    Articles (a, an, the)              8.04             5.0           7.2
    Big words (> 6 letters)           19.61            13.1          19.6
    The text you submitted was 311 words in length.

    LIWC Dimension                Your Data  Personal Texts  Formal Texts
    Self-references (I, me, my)        1.63            11.4           4.2
    Social words                       7.62             9.5           8.0
    Positive emotions                  2.90             2.7           2.6
    Negative emotions                  0.36             2.6           1.6
    Overall cognitive words            3.27             7.8           5.4
    Articles (a, an, the)             11.62             5.0           7.2
    Big words (> 6 letters)           25.77            13.1          19.6
    The text you submitted was 551 words in length.

    LIWC Dimension                Your Data  Personal Texts  Formal Texts
    Self-references (I, me, my)        0.49            11.4           4.2
    Social words                       0.99             9.5           8.0
    Positive emotions                  0.49             2.7           2.6
    Negative emotions                  2.96             2.6           1.6
    Overall cognitive words            6.40             7.8           5.4
    Articles (a, an, the)             11.33             5.0           7.2
    Big words (> 6 letters)           33.50            13.1          19.6
    The text you submitted was 203 words in length.
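    Three of the dimensions in those tables are simple enough to approximate by hand: each is the share of words falling into a category, as a percentage of all words. The sketch below is only a rough stand-in. It assumes the word lists are exactly the examples shown in the table labels, which is almost certainly narrower than the real LIWC dictionaries, and simple_dimensions is a made-up name.

```python
import re

# Assumed word lists, taken only from the examples in the table labels.
SELF_REFERENCES = {"i", "me", "my"}
ARTICLES = {"a", "an", "the"}


def simple_dimensions(text):
    """Return word count and three category percentages for `text`."""
    words = [w.lower() for w in re.findall(r"[A-Za-z']+", text)]
    total = len(words)
    if total == 0:
        return {"words": 0, "self_references": 0.0, "articles": 0.0, "big_words": 0.0}

    def pct(count):
        return round(100.0 * count / total, 2)

    return {
        "words": total,
        "self_references": pct(sum(w in SELF_REFERENCES for w in words)),
        "articles": pct(sum(w in ARTICLES for w in words)),
        # "Big words" means more than six letters; apostrophes are not counted.
        "big_words": pct(sum(len(w.replace("'", "")) > 6 for w in words)),
    }
```

    With a 50-word sample, a single extra "the" moves the article score by two full points, which is one way to see why the short samples above swing so widely.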

    If you RTFA, the Chaos Computer Club presentation seems to be on the topic of "carders forums" & other graynet black hat communities like SEO.

    They had an opportunity to train the algorithm using knowledge that would typically be exclusive to the level of access available to a server administrator, large ISP, or expert witness allowed forensic recovery on seized equipment as inputs.

    Based on my dataset, it seems like these tactics are context sensitive and will have a margin of error proportional to the length of the positive match outputs with the sample text length used as the input functioning as a ceiling on certainty. This explains why they had to assist the neural network by pre-filtering the datasets. If this is done by a human, then it biases the outcome in the same way that a bingo RNG can be biased by an operator with their eyes open.

    Please RTFA because this only reflects on my comprehension of the topic. 6 letters) 20.92 13.1 19.6
    The text you submitted was 196 words in length.

    [Your comment has too few characters per line (currently 33.8).]
    [Your comment has too few characters per line (currently 34.8).]

    1. Re:I just tested these methods on myself, a Troll by Anonymous Coward · · Score: 0

      [Comment filter broke the original parent post.]

      [Start Splice]
      Based on my dataset, it seems like these tactics are context sensitive and will have a margin of error proportional to the length of the positive match outputs with the sample text length used as the input functioning as a ceiling on certainty. This explains why they had to assist the neural network by pre-filtering the datasets. If this is done by a human, then it biases the outcome in the same way that a bingo RNG can be biased by an operator with their eyes open.

      Please RTFA because this only reflects on my comprehension of the topic.[clip]

      [unclip]
      For the benefit of the reader, I will now feed in to the same LIWC form, all of the above text written by myself in addition to all text preceding this final punctuation mark.

      LIWC Dimension                Your Data  Personal Texts  Formal Texts
      Self-references (I, me, my)        2.04            11.4           4.2
      Social words                       3.06             9.5           8.0
      Positive emotions                  2.04             2.7           2.6
      Negative emotions                  0.00             2.6           1.6
      Overall cognitive words            6.63             7.8           5.4
      Articles (a, an, the)             12.24             5.0           7.2
      Big words (> 6 letters)           20.92            13.1          19.6
      The text you submitted was 196 words in length.
      [unclip]
      [End Splice]

  49. Translation is not a good anonymization strategy by Rachel.Greenstadt · · Score: 1

    Aylin Caliskan and Rachel Greenstadt. Translate once, translate twice, translate thrice and attribute: Identifying authors and machine translation tools in translated text. Sixth IEEE International Conference on Semantic Computing (ICSC 2012). https://www.cs.drexel.edu/~ac993/papers/Aylin_ICSC_2012.pdf

  50. simple solusion by Anonymous Coward · · Score: 0

    use Google translate to make your self anonymous its that simple

  51. Not new by Anonymous Coward · · Score: 0

    I can identify an APK post based on linguistics.

  52. keystroke timings are fingerprint by peter303 · · Score: 1

    There are tiny timing differences as one types. These are quite distinctive between individuals if you collect enough data. It's related to how an individual learns to type: motor memory of word-phrases versus typing a new word for the first time. Even the pattern of common typing errors and recovery.
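    As a rough illustration of what such a timing profile could look like, the sketch below averages the delay between each pair of consecutive keys. The input format (a list of key and press-time pairs) and the name digraph_latencies are assumptions made for the example, not a description of any real keystroke-dynamics product.

```python
from collections import defaultdict
from statistics import mean


def digraph_latencies(events):
    """Average delay in seconds for each pair of consecutive keys.

    `events` is a list of (key, press_time_in_seconds) tuples in typing order.
    """
    gaps = defaultdict(list)
    for (key_a, time_a), (key_b, time_b) in zip(events, events[1:]):
        gaps[key_a + key_b].append(time_b - time_a)
    return {pair: mean(times) for pair, times in gaps.items()}


# Hypothetical capture of someone typing "theth".
sample = [("t", 0.0), ("h", 0.5), ("e", 0.75), ("t", 1.0), ("h", 1.25)]
profile = digraph_latencies(sample)
```

    Two such profiles could then be compared pair by pair; with only a handful of keystrokes the averages are far too noisy to mean anything.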

    1. Re:keystroke timings are fingerprint by joelwhitehouse · · Score: 1

      Wouldn't key timings be correlated to frequency of use? For instance, if I type "Have a great day!" at the end of every email, won't I get faster at it? And won't it be faster than phrases like "Fourscore and twenty years ago"? If the two metrics are correlated you don't need to examine them both to generate a user signature.

  53. KISS by Anonymous Coward · · Score: 0

    The calc to know who i am gets harder when i choose most words from Basic English. I may not sound so school-smart when i'm forced to leave out the clique words, but it sure makes me change my style. I find it fun to do as well. I learned it from those Brit guys who sang all those songs and from Hemingway.

  54. Oh, look by Anonymous Coward · · Score: 0

    slashdotters also saw the lecture at the 29C3

  55. cool-headed logicians by Nick · · Score: 1

    Unabomber manifesto comes to mind.

    --
    Fuck Ajit Pai
  56. Zodiac by LuSiDe · · Score: 1

    I'm curious how this would apply to the Zodiac case. Oh wait, it doesn't:

    * He used symbols in communication.
    * Voice recognition didn't solve the case.
    * DNA evidence didn't solve the case.
    * Copycats functioned as noise, might've even given him credit.

    --
    WE DON'T NEED NO BLOG CONTROL.
  57. Cunning Linguist by Anonymous Coward · · Score: 0

    I guess at least half the population like a Cunning Linguist ;-)

    1. Re:Cunning Linguist by tmjva · · Score: 1

      Yeah, they do it with two tongues. Old joke from DLI.

      --
      Tracy Johnson
      Old fashioned text games hosted below:
      http://empire.openmpe.com/
      BT
  58. Pygmalion! by tmjva · · Score: 1

    I think the Professor of Phonetics Henry Higgins in George Bernard Shaw's opening scene of Pygmalion (or My Fair Lady) could have told you this!

    --
    Tracy Johnson
    Old fashioned text games hosted below:
    http://empire.openmpe.com/
    BT
  59. Re:I will now have to type in... by Anonymous Coward · · Score: 0

    Typing with your wrists crossed will boost your typo count for sure