Plagiarism-Detection Software Confirms Shakespeare Play
mi tips us that software intended to help essay graders detect plagiarism has been used to attribute to Shakespeare — with high probability — a hitherto unattributed play, 'The Reign of Edward III.' It seems that the work was co-authored by Shakespeare and another playwright of the time, Thomas Kyd. "With a program called Pl@giarism, Vickers detected 200 strings of three or more words in 'Edward III' that matched phrases in Shakespeare's other works. Usually, works by two different authors will only have about 20 matching strings."
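For anyone wondering what "matching strings of three or more words" could look like in practice, here is a minimal sketch. It is not Pl@giarism's actual algorithm, which the summary does not describe; the function names and the tokenizing regex are assumptions made for illustration. It collects every run of three consecutive words in each text and reports the runs that occur in both.

```python
import re

def word_ngrams(text, n=3):
    """Return the set of n-word sequences in text (lowercased, punctuation dropped)."""
    words = re.findall(r"[a-z']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def shared_ngrams(text_a, text_b, n=3):
    """Return the n-word sequences that occur in both texts."""
    return word_ngrams(text_a, n) & word_ngrams(text_b, n)

a = "Shall I compare thee to a summer's day? Thou art more lovely."
b = "I would not compare thee to a summer's day at all."
for gram in sorted(shared_ngrams(a, b)):
    print(" ".join(gram))
```

Note that a longer shared run shows up as several overlapping three-word matches, so a real tool would presumably merge them before counting "strings".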
And the evidence continues to mount against him. All lies!
A human analyst looking for similarities never noticed many strings in common, over 500 years? How could that be?
... Shakespeare plagiarized himself? Stop the presses!
Shall I compare thee to a summer day....
It seems that the work was co-authored by Shakespeare and another playwright of the time, Thomas Kyd.
Or Thomas Kyd plagiarized Shakespeare's work.
Game Show Host (John Cleese): Good evening and welcome to Stake Your Claim. First this evening we have Mr Norman Voles of Gravesend who claims he wrote all Shakespeare's works. Mr Voles, I understand you claim that you wrote all those plays normally attributed to Shakespeare?
Voles (Michael Palin): That is correct. I wrote all his plays and my wife and I wrote his sonnets.
Host: Mr Voles, these plays are known to have been performed in the early 17th century. How old are you, Mr Voles?
Voles: 43.
Host: Well, how is it possible for you to have written plays performed over 300 years before you were born?
Voles: Ah well. This is where my claim falls to the ground.
Host: Ah!
Voles: There's no possible way of answering that argument, I'm afraid. I was only hoping you would not make that particular point, but I can see you're more than a match for me!
Host: Mr Voles, thank you very much for coming along.
Voles: My pleasure.
Host: Next we have Mr Bill Wymiss who claims to have built the Taj Mahal.
Wymiss (Eric Idle): No.
Host: I'm sorry?
Wymiss: No. No.
Host: I thought you cla...
Wymiss: Well I did but I can see I won't last a minute with you.
Host: Next...
Wymiss: I was right!
So they've found a play that has some of Shakespeare's pet phrases in it. How do we know Shakespeare wrote it? We need to be able to reject alternatives like someone plagiarising those phrases from Shakespeare, or someone writing a deliberate homage to Shakespeare. Something similar happens in linguistics, where you're trying to tell if two languages are related but you can't tell if a pair of words are cognates or borrowings.
Host: ... we have Mrs Mittelschmerz of Dundee who cla... Mrs Mittelschmerz, what is your claim?
Mittelschmerz (Graham Chapman in drag): That I can burrow through an elephant.
Host: (Pause) Now you've changed your claim, haven't you. You know we haven't got an elephant.
Mittelschmerz: (Insincerely) Oh, haven't you? Oh dear!
Host: You're not fooling anybody, Mrs Mittelschmerz. In your letter you quite clearly claimed that ... er ... you could be thrown off the top of Beachy Head into the English Channel and then be buried.
Mittelschmerz: No, you can't read my writing.
Host: It's typed.
Mittelschmerz: It says 'elephant'.
Host: Mrs Mittelschmerz, this is an entertainment show, and I'm not prepared to simply sit here bickering. Take her away, Heinz!
Mittelschmerz: Here, no, leave me alone! (Sound of wind and sea).
Mittelschmerz: Oooaaahh! (SPLOSH)
The work done *suggests* that Shakespeare collaborated with Kyd on the work but it's not the slam dunk that the title would have you believe.
Sigs are too short to say anything truly profound so read the above post instead.
Back in college I briefly took a creative writing course which was filled with snobs clutching their leatherbound Infinite Jest copies who used words like "perspectival" and "serendipitous."
During one of the meetings the lecture focused on poetic expression with an emphasis on sonnets. Homework consisted of writing an abab, cdcd, efef, gg sonnet and reading it out loud to the circle of douchebags who then offered their opinions about the piece. Being an industrious person, I applied my murky understanding of F/OSS principles to the fine craft of poetic expression and forked one of Shakespeare's obscure sonnets, changing some archaic words into more modern form.
I got a round of faint applause then dropped the class 2 weeks later.
Shouldn't this be a function of the work's length? Something like x matching strings per word squared? Otherwise it's not surprising that the number of matches between one work and many works would be greater than the expected number between one work and one work.
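To make the "per word squared" idea concrete, here is a rough sketch that divides a match count by the product of the two word counts. Only the 200 and 20 come from the summary; every word count below is made up for illustration, and the function name is hypothetical.

```python
def matches_per_word_pair(match_count, words_a, words_b):
    """Matches divided by the number of (word in A, word in B) pairings."""
    if words_a <= 0 or words_b <= 0:
        raise ValueError("word counts must be positive")
    return match_count / (words_a * words_b)

# Hypothetical sizes: one 20,000-word play against an 800,000-word body of work.
one_vs_many = matches_per_word_pair(200, 20_000, 800_000)
# Hypothetical sizes: two 20,000-word plays by different authors.
one_vs_one = matches_per_word_pair(20, 20_000, 20_000)

print(one_vs_many, one_vs_one)
```

With these invented sizes the normalized rate for the "200 matches" case comes out lower than for the "20 matches" case, which is exactly why the raw counts can't be compared without knowing how much text was on each side.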
This is a very unscientific study, with far more potentially meaningful variables than they have accounted for here.
For example, these matching strings could just as well be common turns of phrase of the day. There doesn't seem to be any indication that the software was re-configured for common expressions of old English.
The study would be more plausible if works by two different authors IN ENGLAND IN THE YEAR 1600 contained 20 or so matching strings. But since that control group is missing -- so is the validity of the conclusion.
The article mentions the fact that there was very high competitive pressure on writers to compose plays very quickly so I wonder if there actually was plagiarism going on here. How hard would it have been for one of these writers to get at least a fairly crude copy of Shakespeare's work and utilise various elements of Shakespeare's previous plays? Can anyone enlighten us as to the probability of this being the case or for that matter how common plagiarism actually was at the time?
This software is for detecting plagiarism. In the situation it is designed for, one person uses another person's work but tries not to reveal the fact. The program catches this by noting that the pieces of writing are too similar. If it's well-designed, then it is good at this task, so it should be reasonably sensitive to similarity.
The "authentication" scenario described in TFA is very different. Assume the play is fake (written by someone pretending to be Shakespeare). Then it is not a case of one person using another person's work and trying to conceal that, but rather one person imitating another person's work. If the program is sensitive to similarity, it might be easy to fool into giving a false positive. We really don't know. In order to tell, we would have to ask some people to deliberately write fake Shakespearean works and see how the program scores those.
Until we have more data on how the software performs at THIS task, rather than the plagiarism-detection task, I'll still be skeptical about the provenance of Edward III.
Another use would be to apply the algorithms to religious books to reveal which parts were really inspired by a divinity, and which parts were simply invented by some random, power hungry, con man, to control his peers.
They could call it Bl@sphemy.
Shakespeare, huh. That guy's works are full of clichés.
Keep Doing Good.
This play has been widely attributed to Shakespeare by Shakespeare scholars for some time. It already appears in the Oxford Complete Works, the New Cambridge Shakespeare, and (my favorite) the Riverside Shakespeare.
Nothing is ever definitive in this line of work, so it's interesting to have the software weigh in on it. But I don't think any scholars would be changing their minds if it didn't.
Shakespeare was the conduit through which Marlowe published his works after he (Marlowe) had to "disappear" through a faked death. Marlowe was a wanted man because of his outspokenness and involvement in the plots and intrigues of the Elizabethan age. The facts about Shakespeare's life that can be determined with absolute certainty make it unlikely that he could be the writer of the great plays, sonnets, and poems that are ascribed to him.
Wrote it.
Seriously, he's filed too many lawsuits as it is.
For anyone interested I'd suggest M. Wood's documentary, "In Search of Shakespeare". The four-part documentary won't answer any of the more delicious and silly questions about the authorship of Shakespeare's plays, but it will give as good an historical insight as is easily available. Thomas Kyd is best known for his play The Spanish Tragedy, worth reading for the style. Christopher Marlowe and Kyd were the new kids on the block before Shakespeare made his mark. A famous critique of Shakespeare, mentioned in Wood's documentary, attacks Shakespeare as unschooled and not an equal to "university wits" like Marlowe. The problem with attribution is that, likely, all authors of that period plagiarized (by our standards) one another. Shakespeare started out as an actor with a traveling company, IIRC the King's Men, who were basically a troupe of government propagandists. Theatre was a relatively new phenomenon and was used in the Elizabethan era as a propaganda tool during the conversion of England from Catholicism to Protestantism. Shakespeare stole many of the best plots he studied as an actor with the King's Men. While Shakespeare was known to have co-authored plays with others (the missing play based on the first part of Cervantes' Don Quixote is the most notable example), I know of no evidence, though evidence of any kind is scant, that Shakespeare and Kyd worked together. Kyd and Marlowe were implicated as Catholic agents and Marlowe was likely murdered because he was Catholic. IMHO neither Marlowe nor Kyd can hold a candle to Shakespeare.
ideopath @ play
Get a copy of the Unabomber Manifesto
http://cyber.eserver.org/unabom.txt
Rate the entire work, and each numbered paragraph, for reading level using the Flesch-Kincaid Grade Level Readability Formula
http://www.readabilityformulas.com/flesch-grade-level-readability-formula.php
Split the work into 2 parts, one with paragraph reading level ratings greater than the overall score, one with the scores less than overall.
Apply plagiarism testing software to compare these two halves and see whether it says they were written by the same or by different persons.
Before the creation of plagiarism testing software, we still had several different reading level testing programs available. I did this test using three different programs. They said that at least two people wrote the work. Ted Kaczynski was never considered to have Multiple Personality Disorder, so if the results (still) say two people wrote it, each with their own style, then it's highly unlikely Kaczynski wrote it by himself.
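The readability step above can be sketched roughly as follows. The grade formula is the standard Flesch-Kincaid one (0.39 × words per sentence + 11.8 × syllables per word − 15.59); the syllable counter is a crude vowel-group heuristic and the function names are my own assumptions, so scores will differ from whatever programs were used in the original test. Paragraphs scoring exactly the overall grade go into the lower group here, which is an arbitrary choice.

```python
import re

def count_syllables(word):
    """Crude heuristic: one syllable per group of consecutive vowels, minimum one."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def fk_grade(text):
    """Flesch-Kincaid grade level of text, using the heuristic syllable counter."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if not sentences or not words:
        raise ValueError("text needs at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59

def split_by_grade(paragraphs):
    """Split paragraphs into those above the overall grade and the rest."""
    overall = fk_grade("\n".join(paragraphs))
    higher = [p for p in paragraphs if fk_grade(p) > overall]
    lower = [p for p in paragraphs if fk_grade(p) <= overall]
    return higher, lower

paragraphs = [
    "The cat sat. The dog ran.",
    "Industrial civilization necessarily restricts individual autonomy considerably.",
]
higher, lower = split_by_grade(paragraphs)
print(len(higher), len(lower))
```

The two resulting halves are what you would then feed to the comparison software. Keep in mind that sorting paragraphs by reading level guarantees the halves differ on that one feature, so a "two authors" verdict from a tool sensitive to sentence length would not be surprising.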
"I may be synthetic, but I'm not stupid." -- Bishop 341-B
Authorship Verification. I was exposed to this while I was working on an independent study project. It's interesting. It tries to create a model from different features based more on word usage than on direct grammatical analysis, but as it eliminates key features the relations follow a certain pattern that represents the author more accurately than using the features directly.
http://portal.acm.org/citation.cfm?id=1015448
Why the pedantry? Because, if you didn't know that, you really shouldn't be pontificating on linguistics or linguistic analysis.
From scarped cliff or quarried stone she cries "A thousand types are gone, I care for nothing, no not one."
It was already widely believed that the play in question was at least partially written by Shakespeare. In research, when an experiment produces evidence that accords with a theory, the correct term is to say that it "confirms" the theory. It does not prove it, but it does confirm it.
The title uses the word precisely and accurately. However, I suspect you're not clear on what the word "confirm" means in the context of an experiment.
"Convictions are more dangerous enemies of truth than lies."
It seems that the work was co-authored by Shakespeare and another playwright of the time, Thomas Kyd.
When working together, they were known by the name "Kyd Shakez."
... and then they built the supercollider.
"So why would the Bard, at this stage in his career - age 32 and well established by the time Edward III was published in 1596 - need to collaborate on a play? Simply because, as literature scholars have documented, the London theaters of the day were competing for audiences and had to churn out material as quickly as possible to stay ahead of one another. To do so, they often used groups of authors to write playbooks in a matter of weeks , paying each author by the scene. The theater companies would then often advertise themselves, rather than the authors, on the published playbooks. "
If this doesn't sound like Hollywood then I don't know what does.
Every vague and vaguely funny Shakespeare reference under the sun. I'm so LUCKY!
Of course, any product that has had @ in the name at any point in the last, oh, decade or so can not by any means be taken seriously.
Now consider TP. Started in local journalism, worked in PR for the nuclear industry. Didn't have a classical education. Very successful author. Like WS, gets themes from all over the place, pastiches, parodies, makes them his own. TP is a "middlebrow" author. If you know the literature of the period, you will know the highbrow stuff - the stuff that would win Bookers nowadays - is almost unreadable today. Shakespeare was a popular playwright, not an intellectual.
In some future, people like you will be explaining that TP could never have written his books as he didn't go to Oxford and didn't live in London. So they must have been written by Will Self, or Martin Amis, when just messing around.
I can only speak based on turnitin, but I assume all of these services are similar in most respects. I note that turnitin will often make mistakes, and is also incredibly easy to fool. By changing keywords, sentence structure, etc., it is rather easy to rewrite the whole thing to avoid detection. Having seen it make mistakes because of stuff that actually was already written, and is coincidentally similar, I wonder how useful it is for text written hundreds of years ago.
What if someone wanted to write in a Shakespearean style, or genuinely had a similar style by default? What is the actual reliable indication that this play was Shakespeare's?
If you ignore ACs because they are anonymous - you're an idiot.
Pretty bad that he even lied about his name.
Confucius say, "Find worm in apple - bad. Find half a worm - worse."
I don't understand. Haven't they run this software against all Shakespeare's plays to determine if Bacon (or whoever you prefer) wrote them? I mean nobody is smart enough to completely change personalities when writing something. Wouldn't software like this show the consistency in writing between all the plays as well?
I would think this software could finally put to rest all the silliness about the Bard.
If someone plagiarized Shakespeare, then of course it's going to contain matches because someone is copying his style and turn of phrase. Isn't that the point of this software? I don't see how finding matches allows anyone to say one way or another that the unknown work was authored by the same person. It could well be an imitator, which I'm sure Shakespeare had plenty of during his time and thereafter.
>Usually, works by two different authors will only have about 20 matching strings :P
Except of course when you compare NSYNC to Backstreet Boys, and then you get 20,000 matching strings.
So the software was designed to detect bodies of work that contain phrases from other works. And it finds a work that is a composite of Shakespeare and Kyd. Isn't it more likely that someone back then was plagiarizing from Shakespeare and Kyd, as opposed to them collaborating?
For example, if I turned in a term paper and the plagiarism software detected phrases from Cory Doctorow and Thomas Pynchon, the conclusion my instructor would leap to is obviously that the three of us collaborated on the term paper, right? Not! Why should this be different for this play?
Some drink at the fountain of knowledge. Others just gargle.
Why are there no comments about Doctor Who and the Carrionites?
GO BLUE!
Francis Baconnnnn
Yup, using speech as a social status marker is what aristocrats use to make sure that everyone around knows what they are.
If you choose to express the paltry contents of your small mind in monosyllabic grunts, that is entirely up to you. Just don't expect it to be worth our while listening. There is nothing pretentious about making use of a rich language in evocative expression, whether that be in speech or written prose.
Some colleges advocate the dumbing-down of written prose into contemptible, footling little single-clause sentences such as "This is Spot. See Spot run....", and I have had enough of it.
It really is not that hard to plumb the depths of a multi-clause English sentence, any more than it is difficult to parse nested expressions in a well-written piece of program code. Furthermore, there is no reason why we have to limit ourselves to 200 words (reserved or otherwise) to reflect all the multifarious aspects of our existence.
Platgiarism? That's a stupid name for a program.
GCHQ Quantum Insert installed. If only our tongues were made of glass, how much more careful we would be when we speak
Personally, I've always thought of these plagiarism detection systems as ticking time bombs. The more data they acquire, the less unique each individual work entered into the system becomes. Eventually, a point will come where there will be a near 100% false-positive rate on submitted works that are original, but fail because they are worded too similarly to works already stored in the database.
For example:
"With a program called Pl@giarism, Vickers detected 200 strings of three or more words in 'Edward III' that matched phrases in Shakespeare's other works. Usually, works by two different authors will only have about 20 matching strings."
Okay... so, is the system keeping track of the time periods in which these works are written? There's a good chance that those numbers can vary greatly based on how literate a person is and their degree of formal education. A small number of matched strings between authors might be likely if they're each familiar with writing enough to utilize things like synonyms in their writing patterns.
But what about authors that aren't as educated and utilize speech and writing patterns that are more normalized among their peers? You could have significantly higher matched string counts between them.
It gets even worse when you introduce the internet savvy into the equation, where most of their contact with the outside world is specifically done through the internet. People of similar interests and trends who spend hours talking with each other in public chat channels are likely to pick up huge similarities in their writing patterns, much like how close knit communities tend to speak with similar accents and phrases over time. Our social networks directly influence how we communicate with one another.
Considering the fact that this is now a global phenomenon, it is inevitable that our individual written works will become so normalized that it will be almost impossible to distinguish who has written what with any real certainty by automated means. Especially in the generations to come!
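A back-of-the-envelope way to see the "ticking time bomb" worry: suppose, purely as an assumption, that an original submission has the same small chance of coincidentally matching each stored work, and that those chances are independent. Then the chance of at least one false match is 1 - (1 - p)^N, which creeps toward 100% as the database grows. The per-work probability below is invented, not measured from any real system.

```python
def prob_any_chance_match(p_single, n_stored):
    """Chance of at least one coincidental match among n_stored independent works."""
    if not 0.0 <= p_single <= 1.0 or n_stored < 0:
        raise ValueError("need 0 <= p_single <= 1 and n_stored >= 0")
    return 1.0 - (1.0 - p_single) ** n_stored

# Hypothetical 0.1% chance of a coincidental match against any one stored work.
for n in (100, 1_000, 10_000):
    print(n, round(prob_any_chance_match(0.001, n), 4))
```

Real databases won't satisfy the independence assumption, and a sensible system would raise its match threshold as it grows, so treat this as an illustration of the trend rather than a prediction.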
8==8 Bones 8==8
The software analysis found only two words not used elsewhere by Shakespeare: "thermonuclear" and "jazzercise".
(Careful analysis of this post reveals that it was plagiarized from "Culture Made Stupid", by Tom Weller.)
Here's a neat experiment. Use the algorithm to compare the Works of Shakespeare to the Shakespeare of Avon's writings, such as his will, his tombstone epitaph, etc. You'll find they weren't written by the same person.
seems like a dumb name.