Writing Style Fingerprint Tool Easily Fooled
Urchin writes "Some of the techniques used by literary detectives and courts of law to identify the authorship of text are easily fooled, say US researchers. They found that non-professional writers could hide their identity from 'stylometric' techniques by writing in the style of novelist Cormac McCarthy. Stylometric methods have been used in a number of high-profile legal cases in recent decades, including the 'Unabomber' trial. 'We would strongly suggest that courts examine their methods of stylometry against the possibility of adversarial attacks,' say the researchers."
....from the beginning. Sure it may work on a limited set of individuals. It's the same thing as a polygraph test, it's not based on any sort of quantifiable data but mere suspicion at best. It is completely subjective and there is no real hard science to support such tests. This is the reason why polygraphs are not admissible in court, and why writing analysis shouldn't be either. Be sure to watch for writing analysis to show up on the next Maury show!
hide their identity from 'stylometric' techniques by writing in the style of novelist Cormac McCarthy
... or Anonymous Coward.
Some analysis of handwriting can be useful. In forgery, for instance, a signature can show as false when compared to an authentic one by the presence of a "forger's tremor", because the forger must proceed more slowly to produce the signature than the person to whom it properly belongs.
You mean human beings can mimic each other? Aw shucks
If the methods a stylometry analysis uses are known (and they couldn't very well be a secret to hold up in court), of course you can game them. As long as the algorithm outputs "no" for any reformulation of your message, you can easily find it, by generate-and-test if necessary. The only question is, how fast can you generate a text that (a) says what you intend and (b) does not point to you? Very fast, I'd wager.
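For what it's worth, the generate-and-test loop can be sketched in a few lines of Python. Everything here is an assumption made for illustration: `looks_like_suspect` is a toy stand-in for a real stylometric test (it just counts marker words), and the substitution table is made up. It tries the smallest sets of word swaps first and stops at the first rewrite the detector no longer flags.

```python
import itertools

def looks_like_suspect(text, markers, threshold=2):
    """Toy detector (hypothetical): flag the text if it contains at least
    `threshold` of the suspect's marker words."""
    words = text.lower().split()
    return sum(1 for m in markers if m in words) >= threshold

def generate_and_test(text, substitutions, markers):
    """Try every combination of word swaps, fewest swaps first, and return
    the first rewrite the detector does not flag (or None if none exists)."""
    words = text.split()
    # Words in the text that we know how to replace, without duplicates.
    candidates = list(dict.fromkeys(w for w in words if w in substitutions))
    for k in range(len(candidates) + 1):
        for chosen in itertools.combinations(candidates, k):
            rewritten = " ".join(
                substitutions[w] if w in chosen else w for w in words
            )
            if not looks_like_suspect(rewritten, markers):
                return rewritten
    return None

text = "whilst I reckon the solution is rather good"
markers = {"whilst", "reckon", "rather"}
subs = {"whilst": "while", "reckon": "think", "rather": "quite"}
print(generate_and_test(text, subs, markers))
```

Note that the brute-force enumeration grows exponentially with the number of candidate swaps, so this only shows the idea on a tiny input. A practical tool would need a smarter search.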
I don't think anyone has ever sold writing analysis as a unique identifier. But it can be useful. If one was an unpublished author in any significant form, and then "went unabomber" and started to write letters as a calling card, very similar writing styles and structures between the incriminating work and the unpublished/unpopularized previous work would be evidence to at least raise suspicion that the writer of the previous work was somehow uniquely tied to the crimes, even if not directly. Of course, all bets are off if it is plausible that someone could have pre-analyzed the author to imitate. It's also of note that this is only a positive test (i.e. a failed match in analysis makes no claim at all as to whether or not someone wrote it). A good example would be a set of writing that demonstrates an idiom used only in a certain locale, a business term used only in a certain company, and an ideological term used only in a certain fringe political movement. This is reasonable *evidence* of authorship, where of course evidence != proof. The polygraph, on the other hand, is complete BS because the only real thing a polygraph achieves is to psychologically motivate the taker to tell the truth due to "faith" in the fact he will be outed for lying by the device. It doesn't actually measure anything related to the statements, only the physiological condition, which can depend on millions of independent factors.
This should not really come as a surprise to anyone. Like all evidence that has to be interpreted, the interpretation can be flawed.
Shows like CSI have computers getting an exact match on fingerprints and DNA, but the real world is not like that. Fingerprint matching is entirely subjective and the print recovered from a crime scene is rarely a nice clean one like they show on TV. DNA often has to be manipulated before a match can be made (due to the sample found at the scene being too small or of poor quality) and even then it often matches more than one person.
Even when you do get a match, it's not proof that someone was at a specific place because DNA and fingerprints can easily be transferred. Someone broke into my car a few years ago and despite there being fingerprints the police decided not to prosecute because they were on the outside of the car and the accused could just claim he leant on it on his way home from the pub.
There have been a few cases where fingerprint and DNA evidence have been challenged in the UK courts and shown to be unreliable, with innocent people spending years in jail before being cleared. Yet, the police seem to have started asking for everyone in the area of a crime to "volunteer" their DNA. Presumably if you don't "volunteer" you become a suspect.
The idea that handwriting is any more unique than those two and at all reliable is laughable.
const int one = 65536; (Silvermoon, Texture.cs)
SJW, n: "Someone I don't like, and by the way I'm a fuckwit" - AC
Stylometrics is essentially a correlational field: it's not that people inherently must write in unique styles that are identifiable from a few measurable features: there is no strong genetic causation for handwriting or anything like that, which would mean that a handwriting style really does truly identify an individual or narrow set of individuals. Rather, it's that, all else being equal, people in practice, do tend to write in a way that lets the stylometric features distinguish them. But, when all else isn't equal, and people are actively trying to thwart that sort of analysis, they are, unsurprisingly, able to do so in a lot of cases.
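As a rough illustration of what "a few measurable features" can mean, here is a minimal Python sketch that profiles two texts by the relative frequency of a handful of common function words and measures how far apart the profiles are. The word list and the plain Euclidean distance are assumptions chosen for brevity, not the method used in the study, and the tokenizer is a bare `split()` that does not strip punctuation.

```python
from collections import Counter
import math

# Illustrative marker words (assumption); real analyses use longer lists.
MARKERS = ["the", "of", "and", "to", "in", "that", "it's", "its", "which", "for"]

def profile(text):
    """Relative frequency of each marker word in the text."""
    words = text.lower().split()
    counts = Counter(words)
    total = len(words) or 1  # avoid dividing by zero on empty input
    return [counts[m] / total for m in MARKERS]

def distance(text_a, text_b):
    """Euclidean distance between two profiles (smaller means more alike)."""
    return math.dist(profile(text_a), profile(text_b))

known = "it's the sort of thing that gives it's author away in the end"
disputed = "the sort of thing which gives its author away to the reader"
print(distance(known, disputed))
```

Because the features are just counts, anyone who knows which words are being counted can shift their own profile on purpose, which is the point being made above.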
I suspect that a lot of forensic analysis runs into this problem: it takes some fact that empirically is true among the general population, but only because the general population is not actively trying to thwart you. The set of robust empirical truths about people, that hold up even when the person is aware that you're trying to use it against them and actively trying to keep you from doing so, is much smaller.
10 PRINT CHR$(205.5+RND(1)); : GOTO 10
"I do not think anyone has ever sold as an analysis of writing a unique identifier. But it can be useful. If one was an unpublished author in any way, and then is Unabomber, "and began to write letters as a calling card, can be deduced from very similar writing styles and structures of work and unpublished incriminating / unpopularized previous evidence that at least raise the suspicion that the writer of the earlier work was somehow tied to the crimes, though not directly. Of course, all bets are off if it is possible that someone could have analyzed previously the author to imitate. Also of note, this is only one positive test (ie, a mismatch in the analysis is not intended at all as to whether someone wrote it). I would be a good example of writing that demonstrates a language used only in a particular location, a business in which the term is used only one enterprise, and an ideological term used only in a certain political stripe movement. * * This is reasonable evidence of authorship, which, of course, tests! = Test. The polygraph, on the other hand, is complete BS because the only real thing is a polygraph achieves the beneficial psychological reasons to tell the truth because of the "faith" in the fact that they lie for outt by the device. Not really measure anything related to the statements, only the physiological condition that may depend upon millions of independent factors."
If going Unabomber, you could always bounce it through a translator, as I did with your post. Of course you could end up with quite a bit of "Someone set us up the bomb!" But still semi-legible. I wonder how the analysis would do with that?
"There is nothing to do it. But to do it." -Floyd Pepper
Again, that's why it's clear that writing analysis is only a positive test. If steps are taken to actively change the style of writing, of course it will fail. It is something like saying an audio recording of someone's voice in a phone call is invalid, because it is possible to speak in a different voice. While true, this doesn't significantly weaken the positive test value.
The real issue is why we continue to ban 'criminals' when forensics are both available for testimony but often not for further examination because of deliberate overuse. We've now been shown data that even DNA evidence can be manufactured, if it's not first tested for methyl levels. And that is totally independent of physical specification. Which brings back the essential question that we've not had updated since 2000: What are we willing to expend energy for?
What exactly is the "Cormac McCarthy style"? The article doesn't mention it at all. I even skimmed through the paper and all it does is quote a paragraph from some work of Cormac McCarthy.
I can't figure out what his style exactly is, and I certainly would not be able to fake it as the participants were supposed to. And the participants were supposed to not be literary geniuses.
If you can describe something in enough detail to put it in a certain category (X writes like this), then you can also imitate that category from that same description (I will now write like this in order to seem like X).
I do not really see how you would ever expect different.
IAIFARSIJDPOOTV - I Am In Fact A Reality Star; I Just Don't Play One On TV
no 1 can f00l teh l33t XpertZ just bi change D way D write stuff kekeke stupit fags
If you were blocking sigs, you wouldn't have to read this.
If the methods a stylometry analysis uses are known (and they couldn't very well be a secret to hold up in court), of course you can game them.
Yes, but the problem is this:
1. It's not just that it's possible to fake not being myself, it's also that I can pretty much frame someone else. E.g., given enough messages written by KibibyteBrain (which just clicking on the user name or id will give me a list of), it's trivial to do a stylistical analysis on those and not just get an idea of how to write in the same style, but run the same analysis on the result and refine it until the match is outstanding.
2. From what I understand, the people in this test fooled it by merely being told to write in the style of someone else, without the help of any analysis tools, and still fooled it majorly. That's some pretty damn fragile "evidence" if anyone asks me. It's something Joe Sixpack can do by himself. Add some tools and it can only get crappier.
Even such idioms as you mention, are trivial to notice even without any tools. E.g., with only a little correspondence with another team here and reading some of their docs, I can tell that they use "solution" instead of "application".
3. While it can be handwaved as "eh, nobody said it's perfect", some people do seem to take it as less fallible than it really is. Even you just called it "This is reasonable *evidence* of authorship, where of course evidence != proof." And that's the whole point. Something that can be fooled by almost any Joe Sixpack without any tools or much effort, isn't reasonable evidence at all.
We allow evidence like handwriting, signatures, fingerprints, or DNA because they're supposedly very very hard to fake well. Ok, so DNA turned fakable as well, but you need a fair bit of expensive lab equipment and knowledge. It's something a biology prof at a medical college could probably do, but not something Joey Three-fingers the small time smuggler would even know where to start if he wants to plant someone else's fake blood at his latest shootout scene. Or fingerprints turned out easy to fake for the purpose of fooling a fingerprint reader, but it's still very very hard to transfer to an object in a way that looks genuine.
But here we have something that untrained people fooled by just being told to try. I'm sorry, but for me then it shouldn't be evidence at all.
A polar bear is a cartesian bear after a coordinate transform.
Ah, the irony of someone saying "I could have told you" and then saying that it's "completely subjective" and has "no real hard science to support [it]"!
Writing style probably can be useful evidence where the style isn't known by others in advance, but it is quite easy to fake a style (much like having a "normal written style" and a "formal report style").
As the article says, "the study only attacked some of the less complex stylometry techniques". In fact, I'm surprised that they even considered lexical density because that varies greatly within a single author's writing. It's usually high at the beginning of a text, usually (not always) gradually falls off, jumps when they change subject, and so on. I'm not aware of its being used in forensic linguistics (although it is used in analysing texts to identify, for example, objective divisions within a text).
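To make the lexical density point concrete, here is a small Python sketch that computes it as the share of words that are not function words, over consecutive windows of a text, so the rise and fall within a single document becomes visible. The stop list is a short illustrative assumption. A real analysis would more likely use a part-of-speech tagger.

```python
# Short illustrative function-word list (assumption, far from complete).
FUNCTION_WORDS = {
    "the", "a", "an", "of", "to", "in", "on", "and", "or", "but", "is",
    "was", "it", "he", "she", "they", "that", "this", "i", "we", "you",
    "not", "did", "had", "for", "with", "as", "at", "by",
}

def lexical_density(words):
    """Fraction of the given words that are content words."""
    if not words:
        return 0.0
    content = [
        w for w in words
        if w.lower().strip(".,;:!?\"'") not in FUNCTION_WORDS
    ]
    return len(content) / len(words)

def windowed_density(text, window=50):
    """Lexical density of each consecutive, non-overlapping window."""
    words = text.split()
    return [
        lexical_density(words[i:i + window])
        for i in range(0, len(words), window)
    ]

print(windowed_density("the cat sat on the mat", window=3))
```

With a window of three words, the first half of that sentence scores twice as high as the second, which hints at how unstable the measure is over short stretches of a single author's text.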
The sort of thing that they used in the Derek Bentley case (which contributed to the partial posthumous pardon) was analysis of his statement, which had
That all pointed to the statement not being Bentley's own words, but rather being the police version of his answers to a series of police questions that had been removed from the statement. One aspect of his original trial was a statement "I did not know he was going to use the gun", which was taken as evidence that he knew his accomplice, Craig, had a gun (and the inconsistency with the denial that he knew this, later in the statement, was taken as evidence that he was lying). Since the linguistic analysis shows that this was probably a reply to a question, it seems more likely that it went something like:
Police: Did you know he was going to use the gun?
Bentley: No.
Which makes sense because he knew at the time of the interview that Craig had a gun.
Yes, of course this sort of thing can be gamed, but it wasn't credible that Bentley would have been capable of such sophisticated gaming. The important thing as far as this thread is concerned is that forensic linguistics doesn't plug in a single measure, turn a handle and come out with a yes/no answer; it uses a whole range of measures and builds up an overall picture of what probably happened.
Quidnam Latine loqui modo coepi?
Now every time some anonymous terrorist publishes his manifesto Cormac McCarthy gets a visit from police.
> I don't think anyone has ever sold writing analysis as a unique identifier. But it can be useful.
One problem with that is the human tendency to be overconfident as to how good these tests are. This happens everywhere. Court, business, whatever.
Say you have some metric at work (e.g. lines of code) that's easy to measure. If it's the only measure management has, it's what they'll use to measure how good you're doing. This applies even if the results are absurd, because they would rather believe that they have *some* idea what's going on than to accept the fact that they have no idea what's going on.
In summary, sometimes NO information is better than bad information, but people are very reluctant to accept that fact.
If one was an unpublished author in any way, and then is Unabomber, "and began to write letters as a calling card, can be deduced from very similar writing styles and structures of work and unpublished incriminating / unpopularized previous evidence that at least raise the suspicion that the writer of the earlier work was somehow tied to the crimes, though not directly. Of course, all bets are off if it is possible that someone could have analyzed previously the author to imitate.
Or that either their previous work or their terrorist "press releases" were deliberately in a different style, e.g. maybe they thought they had to write in a specific way to be published in a certain field, or they are trying to incriminate someone else.
"deemed to have no information content" is actually a positive feature for analysis. Vocabulary is one thing, but the little things, like prepositions, malapropisms, punctuation and favorite constructions are harder to fake. If someone consistently uses it's as a possessive and writes "for all intensive purposes", it'll be difficult for that person to suddenly start writing consistently.
cut-and-paste is always an option.
When your child's ransom note, I write.
Now every time some anonymous terrorist publishes his manifesto Cormac McCarthy gets a visit from the police.
The fact that one person may write in the style of another is nothing new. While the use of such writing-style analysis may still have a valid use in some cases, it is clear that it, like any other forensic tool (even DNA analysis) can be beaten.
Prior to contemporary times, I believe the number of people who would have had access to enough writing samples (of persons other than authors, columnists, and other published figures) to successfully mimic another's style would have been limited to family members, friends, and confidants. Today, with broad use of blogs and social networking sites, many more people are exposed to those writers' styles. In workplaces, a small subset of individuals are often called upon to produce a majority of documentation. While such writing samples may well vary from the style a person may use in personal correspondence, each will share characteristics that, together, are unique to the writer's preferred style.
I've been one who has been called on to create much documentation by my employers. Bits of my memos, procedures, and instructional materials would get lifted and used by other departments (my departmental IT writings were most often copied and used by our central IT department). When they did so, other employees would approach me and ask if I wrote the memo, web page, or training manual, because they could recognize my writing style.
Even here on Slashdot, there are those posters who have been around for a long time, who have posted often, and who have identifiable writing styles. Anyone with enough familiarity, if given the opportunity to post under such another's screen name, would likely be able to post something that would seem to be the words of another. Of course, if the context of the message were a great departure from the copied target's known views, the post would be suspect. If it were in line with such views, it might not raise suspicion.
If anything, the number of available sources of writing samples today increases the likelihood that someone else could learn and mimic a writing style. Of course, this could be a handy defense so long as the accused has produced loads of writing that is accessible to others.
I use irony whenever I can, but my shirts are still wrinkled...
"We would strongly suggest that courts examine their methods of stylometry against the possibility of adversarial attacks,' say the researchers."
Of course, this assumes that law enforcement actually cares about the guilt or innocence of the people they convict. They don't. They only care about putting as many people in prison as they can.
No. Comparing someone's writing style and their skin conductance are not even remotely similar. You're a fucking idiot.
I've always wondered just how accurate signatures are. I've noticed that my own signature varies widely depending on various factors. For example, when we purchased our house I had to sign my name to a dozen or more papers. The first signature looked "normal" but the later signatures were glorified scribbles. If I needed to sign a check last and just scribbled my signature on the back, would the bank (not privy to my signature's declining quality in the previous paperwork) be able to tell that it wasn't a bad fake?
My sci-fi novel, Ghost Thief, is now available from Amazon.com.
It is completely subjective and there is no real hard science to support such tests.
I beg to differ. There's very little subjective in stylometrics, the subjective part is interpreting the results, but definitely not producing them. Take a look at http://en.wikipedia.org/wiki/Stylometry and tell me which of the methods described there you think is "completely subjective".
The main problem with stylometry is not the methods, but the data. As TFA describes, changing writing style throws off the results - at least to some extent. Stylometrics relies on the fact that old habits die hard, but if someone is aware that the text they are producing might be subjected to stylometric analyses, they can employ various mechanisms to avoid identification and will probably have a better chance at succeeding than if writing casually. However, most texts used in court have been produced casually (letters, emails, text messages) and almost always have some unique traits specific to their author. Even in cases where people plagiarize a known author, they always miss some subtlety in his/her style that gives away the plagiarism. These subtle differences in style are usually caught somewhere in the stylometric analysis.
It occurs to me now that you may be talking about hand-writing analysis, in which case my reply is completely irrelevant and you have completely missed the point of summary and TFA.
"Live free or don't."
So handwriting analysis has problems. Another recent Slashdot article was about how DNA evidence might be falsifiable. And we all know that eye-witnesses have serious problems. We don't however reject any of these. Why not? Because we don't care about single pieces of evidence but rather about bodies of evidence. It is the collective narrative which matters. It might be possible for one or two types of evidence to be wrong or falsified. But it is extremely difficult to falsify four or five. The real problem is when overzealous prosecutors try to portray something like handwriting analysis as a CSI-style magic bullet. This is, moreover, being balanced by a problem in the opposite direction, with juries increasingly wanting all sorts of technical evidence to convict even when it would be unnecessary, prohibitively expensive or, in some cases, a form of evidence that really only exists in fiction.
I've always wondered just how accurate signatures are. I've noticed that my own signature varies widely depending on various factors.
Signatures written on paper are not all that helpful for a few reasons. First off, they are easy to forge. Second off, a single person might sign his name twice and produce two signatures which look very different to both the naked eye and some forms of analysis - hence not accurate. Where they actually are accurate, however, is when written on pressure-sensitive pads (such as those seen on newfangled credit card swipers). If you were to do an analysis of the pressure and speed at which the signer signed various parts of the signature, you would actually produce some very reliable information. This is because even when you sign your name in slightly different manners you have the tendency to use the same speed/pressure on certain parts of certain letters. Personally I would just use digital signatures...but calculating hash functions on the back of your restaurant receipt is never fun. It's also difficult to fit a 256-bit output on that minuscule "sign here" line.
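To put a number on the "sign here" joke: a 256-bit hash such as SHA-256 comes out as 64 hexadecimal characters. A minimal Python check, with made-up receipt text as the input (an assumption for illustration):

```python
import hashlib

receipt = b"Total: 42.50, table 7"  # hypothetical receipt contents
digest = hashlib.sha256(receipt).hexdigest()

print(digest)
print(len(digest), "hex characters =", len(digest) * 4, "bits")
```

A hash on its own is not a signature, of course. A real digital signature scheme also involves the signer's private key, which makes the back-of-the-receipt arithmetic even less appealing.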
While you can attempt to write in someone else's style, you're going to run into problems duplicating it strongly enough for a stylometric analysis to implicate them. Even if you lifted exact phrases from previous works you will invariably need to come up with original words, phrases, and sentence structures to fill the gaps where the original author has not written. These should be enough to put reasonable doubt on the authorship of the faked text.
Moreover, if it's identified as a fake, by eliminating the material that was copied from previous styles it's likely that your identity may be revealed from the pieces that you inserted to fill gaps. Obviously the longer the piece, the more likely this is.
The technique of hiding one's own identity is a matter of using the same techniques in stylometrics to identify phrases, words, and structures that would identify you, and then changing these until they no longer give an indication of your identity.
Attempting to create a work that duplicates someone else's stylometric signature would be fairly obvious to linguists.
Which is a totally arbitrary differentiation, considering that a confident, arrogant, or unconcerned forger might well write less hesitantly than a person worried about their handwriting quality, or whether they actually have enough in the bank to cover what they're signing for.
Dear Sirs,
I find highly offensive your suggestion that styles of writing may be subject to gimmickry and impersonation. I wish to complain in the strongest possible terms about the broadcast, and am deeply dismayed at the judgment displayed by the BBC in funding and producing such rubbish. Many of my best friends groom haddock and other north Atlantic fishes and only a few of them are transvestites.
Yours faithfully, Brigadier Sir Charles Arthur Strong (Mrs.)
"Speaking the Truth in times of universal deceit is a revolutionary act." -- George Orwell
This has been known for quite a while, with one pair of researchers speculating about how much time the average person would need, with a tool's assistance, to sufficiently hide their identity. A change of 14 words out of 1000 was sufficient to hide their identity pretty well against word-level attacks.
http://research.microsoft.com/apps/pubs/default.aspx?id=69343
It is something of a cat-and-mouse game, for as stylometric analyses become more sophisticated, so will the techniques of obfuscation. However, as more of one's personal style is "blurred", the more likely it is that other as of yet undetected patterns will get swept along in the document alterations. In the end, obfuscation must win.
... you can easily find it, by generate-and-test if necessary.
If you think generate-and-test is an easy way to find it, then I've got some NP-complete problems for you to solve. While you're at it, I also have some public keys I'd like you to crack.
(Not that I think fooling stylometry is hard, but generate-and-test is generally not useful for anything but the smallest problems.)
I don't know, some of those pads are OK at capturing my signature but others leave it a jumbled mess worse than any signature I've ever written with a pen. And that includes the "just signed my name 100 times, here's another paper to sign for my house" signature. I'm guessing that the differences are either expense (places that go with cheap pads get horrid looking signatures) or when the pad was purchased (earlier ones worse at capturing signatures than later ones).
As a side note, am I the only one who doesn't like it when my signature is printed on my receipt? It means that a receipt that I'd otherwise just throw out (since it doesn't have my full credit card number on it) becomes one I need to shred.
sorry, but RTFA. this is stylometry, not handwriting analysis.
Wikipedia: Stylometry
Wikipedia: Graphology
From TFA: "Each volunteer was then asked to write a description of their neighbourhood in a way that masked their personal style, before writing a further passage in the style of novelist and playwright Cormac McCarthy." [...] "the techniques consistently identified Cormac McCarthy as the author of the imitations of his work."
So, yes, the whole bloody experiment was precisely about disguising your style as someone else, and no, it did not give the tests any reasonable doubt. People trying to imitate Cormac McCarthy were consistently identified as Cormac McCarthy by the stylistic analysis techniques. It doesn't get more clear cut than this, really.
So, yes, it is very possible for an average Joe Sixpack to incriminate someone else, if they so choose.
this article is NOT about handwriting analysis, it is about style analysis ( stylometrics ) which is NOT about handwriting at all!
this article is NOT about handwriting in any way, it is about writing style which is the style you write in. Nowhere does it mention anything about handwriting!
Dear Sirs and Madam,
I wish to complain about that last complaint. I can assure you that all groomers of haddock and every other species in order Gadiformes are indeed transvestites. This is in fact a necessary grade to be reached in the apprenticeship process for the Gadiformes Groomers Guild (GGF). If the former complainant indeed knows of any non-transvestite groomers as such, then he should report them both to the GGF and to the Ministry of Fish Groomers in Luton at once!
Angrily,
Mr. Pint
Some analysis of handwriting can be useful. In forgery, for instance, a signature can show as false when compared to an authentic one by the presence of a "forger's tremor", because the forger must proceed more slowly to produce the signature than the person to whom it properly belongs.
Perfect example! What you've detected is the speed and deliberation of the signer. In using this method to detect forgeries, you must make many assumptions regarding the state of mind of the signer.
Using myself as an example, my marriage certificate is one of the very few examples of my verified signature. That is, it was signed in front of witnesses, including a photographer, so there's no doubt I signed that piece of paper.
But that is not my typical signature. In addition to the audience and the occasion, I knew the certificate would be framed. So I didn't do my usual get-it-over-with quick illegible scrawl.
Since then, I've signed many checks, credit receipts, etc. when there was no witness. I'd guess the difference between my everyday signature and my wedding day signature could cast doubt on whether they were all signed by the same person.
The signatures are different, but forgery is just one explanation of that difference.
Polygraphs have the same weakness. Yes the person's heart rate or skin conductivity changes, but deception is just one explanation of that difference.
what's wrong with a PIN ? that cannot be forged. civilised countries do use them for credit card payments now you know.
No. Comparing someone's writing style and their skin conductance are not even remotely similar. You're a fucking idiot.
I am truly stunned by the depth of the logical reasoning and analytical skills you displayed in the way you completely debunked GPP's post. After reading your argument, how could anyone possibly conceive that the usefulness of stylography and polygraphy in legal investigations could be at all comparable? Truly, I'm speechless at your boundless intellect.
</sarcasm>
MCSE? No, sir...I don't do Windows. Yes, I am an idealist. What's your point?
From the Article:
And the techniques consistently identified Cormac McCarthy as the author of the imitations of his work.
The main problem with stylometry is not the methods, but the data. As TFA describes, changing writing style throws off the results - at least to some extent...if someone is aware that the text they are producing might be subjected to stylometric analyses, they can employ various mechanisms to avoid identification and will probably have a better chance at succeeding than if writing casually. However, most texts used in court have been produced casually (letters, emails, text messages) and almost always have some unique traits specific to their author.
But therein lies the rub: how can you be certain that the actual author didn't consider that the text might be subject to stylometric analysis? Even as a kid, if I wrote something that I didn't want traced back to me, I made an effort to disguise my handwriting and writing style. If I thought of that back when I was a semi-delinquent teen/pre-teen (okay, not really delinquent, but I did get a little mischievous once or twice), I can just about guarantee that anyone who is doing something that might land them in real legal trouble will do likewise.
In other words, for stylometric analysis to have *any* degree of validity whatsoever, you not only have to prove that the styles of the sample text and suspected author's typical body of writing match, you also have to prove that the original author never considered that the writing style would be analyzed, and therefore that the original author did not take any steps to disguise writing style. You can't make any assumptions about what the real author expected when composing the message.
I read the book "Author Unknown", which covers this from the forensic side. It was only an okay book. Here's how I would do it.
1. Inconsistently spell things wrong. Misspell a word one way, then down a few paragraphs, misspell it another way.
2. Type in all caps. All the capitalization errors you might normally make go away.
3. Don't use your regional sayings for things. Use some other region's, or use all of them.
4. Run it back & forth with translation services to really obfuscate it.
easy peasy.
I've always wondered just how accurate signatures are. I've noticed that my own signature varies widely depending on various factors.
Yeah... Some of my signatures are clear and readable. Anybody could easily read my name from them. In others I don't even try to write all the letters. Today I actually signed one receipt without trying to write even one letter. It was just curvy lines in the general shape of my autograph.
However, I have noticed that there are some factors that can be found from every single signature I've given. For example, I always make the last curve (which originally was the lower right corner of letter "a") in exactly the same shape, even when I don't actually write the correct letter before it. That's probably not the only similarity in all my signatures, even though the circumstances change. Most likely, the same applies to you.
The question is... How difficult would those similarities be to fake? Also, how much would I need to be able to change them to later say in court "You can't prove that I wrote that"?
This may be slightly offtopic (but hopefully interesting to the slashdot crowd), so I apologize in advance. I've been trying to figure out how to use electronic signature pads to verify job authorizations, and haven't been able to come up with a scheme that seems airtight to me if a customer denies issuing the authorization. Perhaps you or another reader can enlighten me.
I can record the data coming in from the signature pad and associate it with the job ticket in our database easily enough. However, if the customer denies authorizing the work, and we show them the signature data, they can just claim we copied it from another ticket. That seems like a reasonable defense to me, and one that very well might hold up in court if it came to that.
I've tried to think of various ways to hash the signature data with unique information from a job ticket, but can't think of anything that can get around the fact that we have access to the raw data that comes from the signature pad, and can do what we want with it. Therefore, I don't see how they can be used for anything like signing a contract.
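To make the problem concrete, here is a rough sketch of the obvious approach and why it doesn't help. It assumes the pad hands back raw bytes and that a ticket has a string ID; `bind_signature`, the ticket IDs, and the sample bytes are all made up for illustration, not taken from any real signature pad API.

```python
import hashlib

def bind_signature(ticket_id: str, signature_data: bytes) -> str:
    """Hash a ticket ID together with raw pad data (hypothetical scheme)."""
    ticket_bytes = ticket_id.encode("utf-8")
    h = hashlib.sha256()
    # Length-prefix the ticket ID so the two fields cannot run together.
    h.update(len(ticket_bytes).to_bytes(4, "big"))
    h.update(ticket_bytes)
    h.update(signature_data)
    return h.hexdigest()

# Made-up stand-in for whatever the pad actually records.
pad_capture = b"hypothetical pen stroke samples"

# The binding stored when the customer really signs ticket 1001.
authorized = bind_signature("TICKET-1001", pad_capture)

# The shop still holds pad_capture, so it can compute an equally
# valid-looking binding for a ticket the customer never saw.
copied = bind_signature("TICKET-2002", pad_capture)
```

Both digests are well-formed and nothing distinguishes the second from a genuine authorization, because the party computing the hash is the same party holding the raw data.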
Of course, a signature on paper (which is what we currently do) can be forged, but there are ways to tell that have been mentioned elsewhere in this story.
"Any fool can make a rule, and any fool will mind it."
--Henry David Thoreau
That is true, but that's where the habitual aspect comes in. While you may be conscious about various aspects of your writing style, there are certain areas that are less prone to conscious manipulation, e.g. certain syntactic constructions or your active vocabulary. No one (i.e. no forensic linguist) will believe that you are Douglas Coupland if the frequency of certain prepositions in your text deviates wildly from his works. And yes, you can of course tamper with such frequencies, but the point is that most people don't. You don't totally dismiss fingerprints as evidence because some criminals wear gloves, do you?
It's also important to note that no court has ever based a verdict solely on stylometry. Stylometry will never give any definitive answer, but it might corroborate other evidence, which is kind of the whole idea. Stylometry may help eliminate a suspect as well as identify one, so while it may not be usable as the sole basis for a conviction, it's still very useful and should be acknowledged as such.
If you really want to know about stylometry (and forensic linguistics in general), I suggest taking a look at John Olsson's website: http://thetext.co.uk/ and/or reading his book Word Crime, which is easily read even by people without linguistic training. John Olsson is one of the few full-time forensic linguists and has dealt with a lot of different cases, some involving stylometry.
A final note: Please stop referring to it as "writing style fingerprint". No serious forensic linguist does that, since it's in no way similar to fingerprints. Writing style doesn't rely on biometrics and is much more easily changed than the pattern of the ridges on your fingertips.
"Live free or don't."
This is writing analysis, not handwriting analysis. It looks at the words and punctuation you write, not the shape of the letters you write, so it can be used for typed documents. If they were looking at my writing for example, they would look at my vocabulary, the fact I use British rather than American spellings and words and so on.
Over in Japan, we use Hanko, which are simply ink stamps.
While signatures can be forged, Hanko is susceptible to theft AND duplication from the stamp.
I think signatures work on the assumption that signatures are like "artifacts" of one's personality - pretty much like statistics that describe the character of a population. The same goes for stylometrics.
These, like MD5, are good for match identification, but not for authentication.
Using stylometrics as evidence IMHO is a misuse of technology.
Well, it got me wrong.
The fact is, our entire banking system is built upon little more than trust. Neither teller nor merchant has any idea if the squiggle on the check is MY squiggle or not. The cost of analysis to gain any level of certainty exceeds the value of a typical check.
The security features on the check don't mean a lot either. The check printer has no real way of knowing whether I am the person whose information they are printing on the check. In any event, a bank will cash anything check-like that they are given. A man once drew a check on the back of his shirt and sent it to the IRS. He received his shirt back from his bank, canceled.
All of the "security features" and the whole concept of signature for authentication are nothing more than a cursory measure to help keep honest people honest.