Writing Style Fingerprint Tool Easily Fooled
Urchin writes "Some of the techniques used by literary detectives and courts of law to identify the authorship of text are easily fooled, say US researchers. They found that non-professional writers could hide their identity from 'stylometric' techniques by writing in the style of novelist Cormac McCarthy. Stylometric methods have been used in a number of high-profile legal cases in recent decades, including the 'Unabomber' trial. 'We would strongly suggest that courts examine their methods of stylometry against the possibility of adversarial attacks,' say the researchers."
Yes, but the problem is this:
1. It's not just that it's possible to fake not being myself, it's also that I can pretty much frame someone else. E.g., given enough messages written by KibibyteBrain (just clicking on the user name or id will give me a list of them), it's trivial to do a stylistic analysis on those and not just get an idea of how to write in the same style, but also run the same analysis on my own result and refine it until the match is outstanding.
2. From what I understand, the people in this test fooled it by merely being told to write in the style of someone else, without the help of any analysis tools, and still fooled it majorly. That's some pretty damn fragile "evidence" if anyone asks me. It's something Joe Sixpack can do by himself. Add some tools and it can only get crappier.
Even such idioms as you mention are trivial to notice without any tools. E.g., with only a little correspondence with another team here and reading some of their docs, I can tell that they use "solution" instead of "application".
3. While it can be handwaved as "eh, nobody said it's perfect", some people do seem to take it as less fallible than it really is. Even you just called it "reasonable *evidence* of authorship, where of course evidence != proof." And that's the whole point. Something that can be fooled by almost any Joe Sixpack without any tools or much effort isn't reasonable evidence at all.
We allow evidence like handwriting, signatures, fingerprints, or DNA because they're supposedly very very hard to fake well. Ok, so DNA turned out to be fakeable as well, but you need a fair bit of expensive lab equipment and knowledge. It's something a biology prof at a medical college could probably do, but not something Joey Three-fingers the small time smuggler would even know where to start with if he wanted to plant someone else's fake blood at his latest shootout scene. Or fingerprints turned out to be easy to fake for the purpose of fooling a fingerprint reader, but they're still very very hard to transfer to an object in a way that looks genuine.
But here we have something that untrained people fooled by just being told to try. I'm sorry, but for me then it shouldn't be evidence at all.
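For what it's worth, the analyze-and-refine loop from point 1 can be sketched in a few lines. This is only a toy illustration, not what the researchers or any court tool actually uses: the `FUNCTION_WORDS` list, the cosine-similarity score, and the `best_draft` helper are all my own assumptions, and real stylometric systems use far richer feature sets.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny feature set (an assumption for illustration);
# real stylometric tools use hundreds of features.
FUNCTION_WORDS = ["the", "a", "of", "and", "to", "in", "it", "that", "is", "but", "not", "just"]

def profile(text):
    """Relative frequency of each function word in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = len(words) or 1  # avoid division by zero on empty text
    return [counts[w] / total for w in FUNCTION_WORDS]

def similarity(text_a, text_b):
    """Cosine similarity between two function-word profiles (0.0 to 1.0)."""
    a, b = profile(text_a), profile(text_b)
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def best_draft(target_corpus, drafts):
    """Pick the draft whose profile is closest to the target's collected posts."""
    return max(drafts, key=lambda d: similarity(target_corpus, d))
```

Feed it someone's collected posts as `target_corpus`, write a few drafts, keep the one `best_draft` returns, tweak it, and repeat. The point is that the attacker can score himself against the same kind of measurement the "evidence" relies on.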
A polar bear is a cartesian bear after a coordinate transform.
I've always wondered just how accurate signatures are. I've noticed that my own signature varies widely depending on various factors. For example, when we purchased our house I had to sign my name to a dozen or more papers. The first signature looked "normal" but the later signatures were glorified scribbles. If I needed to sign a check last and just scribbled my signature on the back, would the bank (not privy to my signature's declining quality in the previous paperwork) be able to tell that it wasn't a bad fake?
My sci-fi novel, Ghost Thief, is now available from Amazon.com.