Project Anonymizes Your Writing Style To Hide Your Identity

Posted by samzenpus on Monday August 5, 2013 @03:51AM from the keeping-it-under-covers dept.

mikejuk writes "An open source project to combat 'stylometry,' the study of attributing authorship to documents based only on the linguistic style they exhibit, is proving that it is possible to change writing style to evade detection. Artificial Intelligence techniques are routinely used to detect plagiarism and recently were employed to reveal that Harry Potter author J. K. Rowling is indeed the author of The Cuckoo's Calling, which was published under the byline of Robert Galbraith. Now software is tackling the opposite problem — anonymizing writing style to protect the identity of the originator. The JStylo-Anonymouth (JSAN) framework is a work in progress at the Privacy, Security and Automation Lab (PSAL) at Drexel University. It analyzes a written text and detects features which could be used to identify the author. It then suggests changes that need to be made to avoid the author's stylistic fingerprint appearing in the work."
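
To make "features which could be used to identify the author" concrete, here is a minimal sketch of the kind of measurements stylometry tends to rely on: relative frequencies of common function words plus average sentence and word length. This is not JSAN's actual feature set; the function `style_features` and the `FUNCTION_WORDS` list are assumptions made up for illustration.

```python
import re
from collections import Counter

# Hypothetical, tiny list of function words; real tools use far larger sets.
FUNCTION_WORDS = ("the", "of", "and", "to", "a", "in", "that", "but")

def style_features(text):
    """Return a dict of simple style measurements for one text."""
    words = re.findall(r"[a-z']+", text.lower())
    # Split on runs of sentence-ending punctuation and drop empty pieces.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    counts = Counter(words)
    total = len(words) or 1  # avoid division by zero on empty input
    features = {w: counts[w] / total for w in FUNCTION_WORDS}
    features["avg_sentence_len"] = len(words) / max(len(sentences), 1)
    features["avg_word_len"] = sum(len(w) for w in words) / total
    return features

sample = "The cat sat. The dog ran to the park."
features = style_features(sample)
print(features["the"], features["avg_sentence_len"])
```

Two texts by the same author would be expected to produce similar numbers, which is what an anonymizing tool would need to disturb.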

13 of 103 comments

I don't know by i+kan+reed · 2013-08-05 03:59 · Score: 5, Funny

How will it disguise my terrible opinions that are obviously wrong?
1. Re:I don't know by 192939495969798999 · 2013-08-05 04:06 · Score: 4, Funny
  
  Those blend right in with the rest of the internet.
  
  --
  stuff |
2. Re:I don't know by i+kan+reed · 2013-08-05 04:51 · Score: 3, Informative
  
  Dude, let it go, this thread was started on a post about how everyone's opinions are wrong. Not a good context for debate.
3. Re:I don't know by plover · 2013-08-05 06:59 · Score: 2
  
  Cardinal Richelieu (supposedly) wrote: "If you give me six lines written by the hand of the most honest of men, I will find something in them which will hang him." Will the JStylo-Anonymouth mean that he'd be able to hang everyone who used it?
  
  --
  John
The Cuckoo's Calling by Richard_at_work · 2013-08-05 04:03 · Score: 4, Informative

Artificial Intelligence techniques are routinely used to detect plagiarism and recently were employed to reveal that Harry Potter author J. K. Rowling is indeed the author of The Cuckoo's Calling, which was published under the byline of Robert Galbraith.
Uhm, what? It was revealed by someone at Rowling's agency tweeting it to a Sunday Times reporter, after the reporter commented on how good it was for a debut novel - that has all been confirmed by the agency.
Unless the above line is badly phrased and is meant to say "recently were employed to confirm prior reports that..." - it didn't reveal anything of the sort, the link had already been revealed by plain old journalism.
1. Re:The Cuckoo's Calling by jabuzz · 2013-08-05 04:10 · Score: 3, Informative
  
  No, it was revealed by a partner at the law firm who should have known better, and should now face sanctions from the Law Society. Being struck off the register would be about right.
  On the other hand they have already reached an out of court settlement for a substantial sum, which probably came out of the partner's own pocket. I would also imagine the firm has lost the JKR account.
conversion to another's style by greywire · 2013-08-05 04:15 · Score: 2

So, can any mediocre author convert his story to the style of a known good author using this?

--
-- Senior Software Engineer, Attorney appearance services, locallawyerapp.com.
1. Re:conversion to another's style by tgd · 2013-08-05 04:34 · Score: 2
  
  So, can any mediocre author convert his story to the style of a known good author using this?
  There's hope for Slashdot's editors! Huzzah!
Wasn't used to out J. K. Rowling by Aurien · 2013-08-05 04:20 · Score: 2

Sounds like some company is trying to toot their own horn here or something, but AI didn't out J.K. Rowling. Her lawyer's friend did. http://www.businessinsider.com/russells-apologizes-to-jk-rowling-2013-7
Stephen King by Okian+Warrior · 2013-08-05 05:02 · Score: 4, Insightful

Stephen King seems to agree with you.
In his book "On Writing", he explains (among many other good points) that one hallmark of good writing is finding the right combination of words for imagery.
He uses examples like "I lit a cigarette, tasted like a plumber's handkerchief" from Raymond Chandler and "It was darker than a carload of assholes" by George V. Higgins.
The Odyssey (IIRC) has the phrase "it was a wine-dark sea", so this has been around for a very long time.
For casual writing the project may be useful, but I wonder how much imagery will be lost in translation.
Many of the works of revolutionaries, radicals, and dissenters are memorable for their specific imagery. Simon Sinek analyzed "I have a dream", and noted the difference between "I have a dream" and "I have a plan". The two are very different, and have different effects on people. (Viz. TED talk "How Great Leaders Inspire Action")
I'm doubtful that AI has progressed to the point where the mood and emotional content will be preserved in such a translation.
To be effective, defiant writing will still require courage.
Re:Google translate? by eyenot · 2013-08-05 05:05 · Score: 2

First of all, this: http://www.youtube.com/watch?v=LMkJuDVJdTw (YouTube)
Second of all:
"Of course you can, just stylometric identification and back home in order to prevent another language is automatically translated prose?" -- (Haitian Creole -> Azerbaijani -> Slovenian -> English ...)
"Not even the same language at home and another stylometric can automatically translated into prose?" -- ( ... Irish -> Hebrew -> Czech -> English ...)
"Not even in the same language and prose automatically translated differently stylometric?" -- ( ... Japanese -> Turkish -> Hmong -> English.)
"However, different stylometric automatically translated prose, and the same language is not it?" -- (... Urdu -> Filipino -> Latin -> English ...)
Depending on who you ask, you seem to have a different "answer" to your question.

--
"Stratigraphically the origin of agriculture and thermonuclear destruction will appear essentially simultaneous" -- Lee
Re:Hurry it up by EmperorArthur · 2013-08-05 06:02 · Score: 2

Tools like this basically do: (step 1) build abstract representation of text - (step 2) rebuild it into a new text using random substitutions.
A plagiarism detection tool will just have to do step 1 and then compare the result with a database of saved essays in the same abstract form.
How would that help if the plagiarism detection tool only has the randomized outcome of step 2?
Simple plagiarism detection tools just use string matching. If a person used popular quotes and phrases in an essay, it is entirely possible for the software to give a high plagiarism percentage. That's why all the good software packages use highlighting with a link to what they think was plagiarized.
More advanced tools can detect things like a student using a thesaurus for one-to-one word replacement. I do not know how much they can do in this regard though. String matching still works as long as the matching algorithm is willing to allow one or more words to not match. The problem is, doing this causes the false positive rate to jump even higher.
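
A rough sketch of what "allow one or more words to not match" could look like: slide a fixed-size window of words over the essay and count how many positions differ from each window of the source. The function `tolerant_match` and its parameters are made up for illustration, and it is a brute-force comparison, not how any particular product works.

```python
def tolerant_match(source_words, essay_words, window=6, max_mismatches=1):
    """Return start indices in essay_words where a window of words matches
    some window of source_words with at most max_mismatches differing words."""
    hits = []
    for i in range(len(essay_words) - window + 1):
        chunk = essay_words[i:i + window]
        for j in range(len(source_words) - window + 1):
            ref = source_words[j:j + window]
            # Count positions where the two windows disagree.
            mismatches = sum(a != b for a, b in zip(chunk, ref))
            if mismatches <= max_mismatches:
                hits.append(i)
                break  # one matching source window is enough for this position
    return hits

source = "the quick brown fox jumps over the lazy dog".split()
essay = "a fast brown fox jumps over the lazy dog today".split()

exact = tolerant_match(source, essay, window=6, max_mismatches=0)
loose = tolerant_match(source, essay, window=6, max_mismatches=1)
print(exact, loose)
```

Raising `max_mismatches` catches the swapped word, but it also flags more windows overall, which is the false positive problem in miniature.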
Going over every possible thesaurus-based permutation of every word is an O(n!) hard problem. If all text in the database was normalized, then we're back to a basic string compare. Normalized in this context means changing a word in all works to a common synonym. For instance, replace every occurrence of the word "proper" with "correct" in the last paragraph.
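
The normalization idea can be sketched in a few lines, assuming you already have a table that maps each word to one agreed-upon synonym. The `CANONICAL` table below is a made-up toy; building a real one is the hard part.

```python
import re

# Hypothetical synonym table: every word in a group maps to one canonical form.
CANONICAL = {
    "proper": "correct",
    "right": "correct",
    "fast": "quick",
    "rapid": "quick",
}

def normalize(text):
    """Lowercase the text, drop punctuation, and replace each word
    with its canonical synonym when the table has one."""
    words = re.findall(r"[a-z']+", text.lower())
    return " ".join(CANONICAL.get(w, w) for w in words)

original = "The proper way to run is fast."
reworded = "The correct way to run is rapid."

# After normalization a plain string comparison is enough.
print(normalize(original) == normalize(reworded))
```

Both the stored essays and the submitted one go through the same `normalize` step, so the thesaurus swap disappears before the compare happens.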
It's possible to do more complicated things involving the actual meaning of a sentence, paragraph, or work. Unfortunately, I have no clue how to go about doing so. The rules of English grammar are hard. Worse still, both professional writers and amateurs violate them all the time.
Remember kids, there's a huge difference between knowing the proper way to do something and still doing it improperly versus not knowing the correct way to begin with.

--
So lets pretend that we've just completed writing this code, as opposed to having just completed sabotaging it -Altera
Re:Hurry it up by epine · 2013-08-05 07:32 · Score: 2

A million college students are waiting anxiously for this tool now that some professors have started checking their essays electronically for plagiarism.
This assumes that they're as stupid as we all suspect, because the next thing the administration begins to do is check whether the student's written oeuvre is self-consistent without bunkering down under a blander identity than a Milli Vanilli cover of Valium Spice.
I'm so busted.