Project Anonymizes Your Writing Style To Hide Your Identity
mikejuk writes "An open source project to combat 'stylometry,' the study of attributing authorship to documents based only on the linguistic style they exhibit, is proving that it is possible to change writing style to evade detection. Artificial Intelligence techniques are routinely used to detect plagiarism and recently were employed to reveal that Harry Potter author J. K. Rowling is indeed the author of The Cuckoo's Calling, which was published under the byline of Robert Galbraith. Now software is tackling the opposite problem — anonymizing writing style to protect the identity of the originator. The JStylo-Anonymouth (JSAN) framework is a work in progress at the Privacy, Security and Automation Lab (PSAL) at Drexel University. It analyzes a written text and detects features which could be used to identify the author. It then suggests changes that need to be made to avoid the author's stylistic fingerprint appearing in the work."
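For anyone wondering what "features which could be used to identify the author" might look like in practice, here is a minimal sketch. It is not JSAN's actual feature set: the function-word list, the feature names, and the `style_features` helper are all assumptions picked for illustration. It only shows the general idea of turning a text into numbers that tend to stay stable for one writer.

```python
import re
from collections import Counter

# Small hand-picked list of English function words.
# This list is an assumption for illustration, not taken from JSAN.
FUNCTION_WORDS = ("the", "of", "and", "to", "a", "in", "that", "it", "is", "was")

def style_features(text):
    """Return a dict of simple stylometric features for a text."""
    # Lowercased word tokens (letters and apostrophes only).
    words = re.findall(r"[a-z']+", text.lower())
    # Rough sentence split on terminal punctuation; empty pieces are dropped.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    counts = Counter(words)
    total = len(words) or 1  # avoid division by zero on empty input

    # Relative frequency of each function word.
    features = {f"fw_{w}": counts[w] / total for w in FUNCTION_WORDS}
    # Average word length in characters.
    features["avg_word_len"] = sum(len(w) for w in words) / total
    # Average sentence length in words.
    features["avg_sentence_len"] = len(words) / (len(sentences) or 1)
    return features
```

Calling `style_features` on two documents and comparing the resulting dicts gives a crude picture of how alike the styles are. A tool that anonymizes would presumably point at the features that deviate most from the crowd.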
How will it disguise my terrible opinions that are obviously wrong?
Which person posted this?
Uhm, what? It was revealed by someone at Rowling's agency tweeting it to a Sunday Times reporter, after the reporter commented on how good it was for a debut novel - that has all been confirmed by the agency.
Unless the above line is badly phrased and is meant to say "recently were employed to confirm prior reports that..." - it didn't reveal anything of the sort, the link had already been revealed by plain old journalism.
A million college students are waiting anxiously for this tool now that some professors have started checking their essays electronically for plagiarism.
If Slashdot were chemistry it would look like this: Cadaverine
Under the spreading chestnut tree
I sold you and you sold me.
There lie they, and here lie we
Under the spreading chestnut tree.
Trust geeks to take the path of needless complexity. I would have thought J K Rowling fessing up, once she was outed by the big mouth of the wife of one of the partners at the law firm she was using, was 'proof' enough of her authorship of The Cuckoo's Calling. Stylistic analysis played no part in the discovery, and is unnecessary afterwards since she flat out admitted it.
Profit does. When your bottom line depends on keeping schools convinced that you're indispensable in the War On Plagiarism, you damn well find plagiarism everywhere you can, whether or not it's actually there. There are approximately 80 MILLION students in the US. With our education system being as repetitive and formulaic as it is, it becomes a virtual certainty that out of 80,000,000 students a significant number will say the same thing the same way.
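To put a rough number on the "virtual certainty" argument, here is a back-of-the-envelope sketch. The per-pair match probability is entirely made up for illustration, and the model assumes every pair of students matches independently with the same chance, which real essays certainly do not. The point is only that the number of pairs grows with the square of the population.

```python
from math import comb

def expected_matching_pairs(n_students, p_match):
    """Expected number of student pairs that coincide by chance.

    n_students: population size.
    p_match: assumed probability that any two independent students
             happen to write the same passage (hypothetical value).
    """
    # Number of distinct pairs times the per-pair probability.
    return comb(n_students, 2) * p_match

# 80 million students, with a one-in-a-trillion chance per pair (an assumption).
pairs = expected_matching_pairs(80_000_000, 1e-12)
print(f"about {pairs:.0f} coincidental pairs")
```

Even with a one-in-a-trillion chance per pair, 80 million students give roughly 3,200 coincidental matches, because there are about 3.2 quadrillion pairs to compare.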
A bullet may have your name on it but splash damage is addressed "To whom it may concern."
load your favorite translator site (babelfish?), translate to a random language, translate back to your language. it's messy because it's often hard to even decipher what you just said, but it obfuscates, or more accurately destroys the linguistic oddisms that make you identifiable. i used to use this technique on an irc client i used via shell. sometimes bread bars do good taste, but sometimes breadsticks are delicious. :)
Dat's why I always troll when I'm writing as Anonymous Coward. So that they can't connect it to my writing style.
I am sorry, but as far as literature goes, writing style anonymization (is that a word?) would harm the original intent of the author. A literary work is valuable (when so) due to the author's style, among other factors, much like in movies, where a certain actor's voiceover is best for a certain character. The same character would become retarded if the actor's voice changes. Imagine Donkey (from Shrek) played by Morgan Freeman or Darth Vader played by Danny DeVito. Good characters, good actors, no match in style and intent.
Yeah, students would love this in their paper, but literature? Hell, no.
...gis sdrawkcab (usually not responding to ACs; don't bother posting as AC)
... in the rest of your digital life.
In light of recent events -and I'm not only referring to the NSA-gate, but also to all the known ways to get your private information- it is hard for me to figure out a digital way of keeping your identity secret in a high profile incident.
This is the next step in surveillance, if the government isn't doing it already. Binding together various accounts of yours based on statistics of phrases.
And it's redundant since they have a database of all IP connections, web pages, and stuff you type in anyway. Sigh. I suppose it will make confirmation of these AI techniques trivial. Yey.
(-1: Post disagrees with my already-settled worldview) is not a valid mod option.
Surely one could simply auto-translate their prose into another language and back to avoid stylometric identification?
So, can any mediocre author convert his story to the style of a known good author using this?
-- Senior Software Engineer, Attorney appearance services, locallawyerapp.com.
Just don't lick the envelope.
The things that pass for artificial intelligence these days! My dictionary and calculator are artificially intelligent! So is my toilet!
Sounds like some company is trying to toot their own horn here or something, but AI didn't out J.K. Rowling. Her lawyer's friend did. http://www.businessinsider.com/russells-apologizes-to-jk-rowling-2013-7
OK so like any self-respecting AC, I have some pretty vitriolic opinions about pretty much everything. I decided to type up my normal flamebait troll and send it through this system. Now, Good luck figuring out which AC is trolling you...
Anonymous identity? What a bag of lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna Portman. Ut enim ad minim veniam, quis beowulf exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Fuckers!
Given I regularly troll morons on 4chan, I would love to see this identify me, or generally just people that either troll or type a lot of random crap in, say, chatrooms.
Even that above sentence is horribly written on purpose.
And by troll, I actually mean troll, not the bastardized form it takes now where people are just out for any silly reply.
I have and always will troll for discussion, not replies. Trolling for replies is the weakest form of troll. They aren't even a troll, just a retard.
Also, you think this is going to identify people that type very little? Or have multiple personalities, bipolar disorders or similar? Has any of this research been done in those areas?
It would be interesting to see how it fares against these types of people.
Stephen King seems to agree with you.
In his book "On Writing", he explains (among many other good points) that one hallmark of good writing is finding the right combination of words for imagery.
He uses examples like "I lit a cigarette, tasted like a plumber's handkerchief" from Raymond Chandler and "It was darker than a carload of assholes" by George V. Higgins.
The Odyssey (IIRC) has the phrase "it was a wine-dark sea", so this has been around for a very long time.
For casual writing the project may be useful, but I wonder how much imagery will be lost in translation.
Many of the works of revolutionaries, radicals, and dissenters are memorable for their specific imagery. Simon Sinek analyzed "I have a dream", and noted the difference between "I have a dream" and "I have a plan". The two are very different, and have different effects on people. (Viz. TED talk "How Great Leaders Inspire Action")
I'm doubtful that AI has progressed to the point where the mood and emotional content will be preserved in such a translation.
To be effective, defiant writing will still require courage.
A software package called Corporate Voice did this 20 years ago.
My version changes my writing style to appear as if I am someone else. It randomly picks newspaper articles and fits my words to someone else's "signature".
At , managers were graded on how their subordinates graded *them* on internal surveys. When managers in my sub-group didn't get good enough grades, they basically told us "just score us high, no matter what you think; shut up and give us good grades." So I did. Then, in the freeform comment section, I wrote a message that reported the order and the behavior. Since I didn't trust the anonymity of the surveys, I ran the text through google translate in a loop. English -> Language A -> Language B -> Language C -> Language D -> Language E -> English, then posted *that* in. By the time it was done, it was *awful*. Understandable, but awful. Managers stopped harping on the internal surveys soon after. Correlation is not causation, but still. Posted AC for obvious reasons.
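The translation loop described above can be sketched as a small helper. This is only an outline under assumptions: `round_trip` and `fake_translate` are hypothetical names, and no real translation service is called. The caller supplies a `translate(text, source, target)` function, which could wrap whatever translator they trust; the stand-in here only tags the text so the order of hops is visible.

```python
def round_trip(text, languages, translate):
    """Pass text through each language in turn and back to the first one.

    languages: list of language codes, starting with the original language.
    translate: caller-supplied function translate(text, source, target).
               It is a placeholder here, not any particular service's API.
    """
    chain = list(languages)
    # Pair each language with the next one; the last pairs with the first,
    # which closes the loop back to the original language.
    hops = zip(chain, chain[1:] + chain[:1])
    for source, target in hops:
        text = translate(text, source, target)
    return text

def fake_translate(text, source, target):
    # Stand-in translator: appends a tag per hop instead of translating.
    return f"{text}[{source}>{target}]"

print(round_trip("hi", ["en", "de", "ja"], fake_translate))
# prints: hi[en>de][de>ja][ja>en]
```

Passing the translator in as a function keeps the loop independent of any one service, so an offline engine can be swapped in without touching the chaining logic.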
So where's the tool that informs me if my current writing style is ranked up with famous authors (i.e. your writing style is 98% like Stephen King's and 42% like J.K. Rowling's) or otherwise (i.e. your writing style is 110% like a chimp throwing feces at a keyboard or Stephenie Meyer, take your pick)?
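A toy version of that "which author do I write like" tool could look like the following. It is a sketch under heavy assumptions: it compares plain word-frequency profiles with cosine similarity, which is far cruder than what real stylometry tools do, and `word_profile`, `cosine`, and `rank_authors` are names invented for this example. You would have to supply your own reference texts for each author.

```python
import math
import re
from collections import Counter

def word_profile(text):
    """Map each word in the text to its relative frequency."""
    words = re.findall(r"[a-z']+", text.lower())
    total = len(words) or 1  # avoid division by zero on empty input
    return {w: c / total for w, c in Counter(words).items()}

def cosine(p, q):
    """Cosine similarity between two sparse frequency profiles."""
    dot = sum(v * q.get(w, 0.0) for w, v in p.items())
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    norm = norm_p * norm_q
    return dot / norm if norm else 0.0

def rank_authors(sample, references):
    """Rank reference authors by similarity to the sample, best first.

    references: dict mapping author name -> reference text for that author.
    Returns a list of (name, score) tuples.
    """
    mine = word_profile(sample)
    scores = [(name, cosine(mine, word_profile(text)))
              for name, text in references.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)
```

With non-negative frequencies the score stays between 0 and 1, so the 110% chimp rating is sadly out of reach with this measure.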
Writing style is just one of many ways you can leak information about your identity, even if using something like Tor. If you really want to write stuff anonymously, this kind of software may be a valuable asset.
I have to wonder how automated translation software would compare in effectiveness. Translate to some other language and back, then fix the broken stuff. Of course, you can't use a web service for that or you leak the pre-anonymized version.
This is a dupe from yesterday: http://tech.slashdot.org/submission/2852793/post-without-worry---anonymouth-hides-your-identity?sdsrc=rel
An excellent point, I will try to remember this in future writing. It's the sort of thing you don't get in a writing course, for which I am grateful.
Thanks.
I am not able rightly to apprehend
the kind of confusion of ideas that could provoke such a question.
TLDR: GIGO
ironic captcha: methods
If I consistently use the program to post under the same pen-name the facts could be correlated to the real me. Gotta make a point of telling a lie about myself with each post. What I need is a way to keep my alter ego's facts consistent.
Dear aunt, let's set so double the killer delete select all
Well, there's spam egg sausage and spam, that's not got much spam in it.
Helping techies pass mandatory English classes without all that cumbersome original writing. Goodbye automatic plagiarism detection.
Way back, in the dim, distant past of the bucolic walled gardens that preceded the Internet as we know it ... there was AOL. AOL had walled predator-free gardens within gardens, where only teens younger than 18 were supposed to be communicating.
There were rumors that evil pedophiles were lurking in these gardens, so I made a sub-account for a totally bogus 16-year old boy named Alex. And Alex went forth to play.
All was going well, Alex was quite a popular young man amongst his peers and had lured ZERO pedophiles when he got this e-mail from a fellow writer: "Alex, are you Tsu?"
BUSTED ... not because of subject matter or vocabulary, but because of a @#$&%^ liking for compound, complex sentences and other arcane constructions ... and using them accurately.
JStylo-Anonymouth (JSAN)?! Could you possibly have come up with any more clunky name than that? ;) Damn, I should set up some agency just to create punchy names for all these projects.
It's a shame that sometimes good grammar can get us into trouble! Keep on speaking/typing eloquently, and maybe it'll help you get a nice job someday-- a nicer job than a lot of the ACs around here maintain.