A Few Million Virtual Monkeys Randomly Recreate Shakespeare
First time accepted submitter eljefe6a writes "On September 23 at 2:30 PST the A Million Amazonian Monkeys project successfully recreated A Lover's Complaint. This is the first time a work of Shakespeare has actually been randomly reproduced. It is one small step for a monkey, one giant leap for virtual primates everywhere. From the article: 'For this project, I used Hadoop, Amazon EC2, and Ubuntu Linux. Since I don’t have real monkeys, I have to create fake Amazonian Map Monkeys. The Map Monkeys create random data in ASCII between a and z. It uses Sean Luke’s Mersenne Twister to make sure I have fast, random, well behaved monkeys. Once the monkey’s output is mapped, it is passed to the reducer which runs the characters through a Bloom Field membership test. If the monkey output passes the membership test, the Shakespearean works are checked using a string comparison. If that passes, a genius monkey has written 9 characters of Shakespeare. The source material is all of Shakespeare’s works as taken from Project Gutenberg.'"
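The pipeline described in the summary can be sketched roughly as below. This is only an illustration under stated assumptions, not the author's Hadoop code: it assumes "Bloom Field" means a Bloom filter, it uses Python's built-in random module (itself a Mersenne Twister) in place of Sean Luke's Java implementation, and the names TinyBloom, chunks_of and monkey_round are hypothetical.

```python
import hashlib
import random
import string

CHUNK = 9  # chunk length mentioned in the summary


class TinyBloom:
    """A small Bloom filter: may give false positives, never false negatives."""

    def __init__(self, bits=1 << 16, hashes=3):
        self.bits = bits
        self.hashes = hashes
        self.field = bytearray(bits // 8)

    def _positions(self, item):
        # Derive several bit positions per item from salted SHA-256 digests.
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.bits

    def add(self, item):
        for pos in self._positions(item):
            self.field[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(self.field[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


def chunks_of(text, n=CHUNK):
    """Keep lowercase letters only, then return every n-character window."""
    letters = "".join(c for c in text.lower() if c in string.ascii_lowercase)
    return {letters[i:i + n] for i in range(len(letters) - n + 1)}


def monkey_round(rng, bloom, exact, n=CHUNK):
    """One monkey types n random letters; return the chunk if it truly matches."""
    guess = "".join(rng.choice(string.ascii_lowercase) for _ in range(n))
    # Cheap probabilistic test first, exact string comparison second.
    if guess in bloom and guess in exact:
        return guess
    return None
```

The Bloom filter only serves as a fast pre-check; the exact set lookup afterwards plays the role of the "string comparison" step, since a Bloom filter alone can report chunks that are not actually in the text.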
I wish I'd thought of it - and what a neat way to go about it.
I guess he had to use virtual monkeys because all real monkeys have progressed to randomly downloading things from bit-torrent.
...and wouldn't it be easier to let them evolve and then one of them can BE Shakespeare 2.0?
These posts express my own personal views, not those of my employer
Computers used to be serious tools made by serious people to solve serious problems. What the hell kind of auto-fellatio is this nonsense? You software retards are really so far up your own ass you've emerged in a new universe. Better thank your lucky stars that the hardware engineers, you know, the people doing the real computing work, have designed fast and cheap computers....
I'm virtually impressed, virtually speechless even! The man is a virtual genius.
These posts express my own personal views, not those of my employer
He's working with a very loose interpretation of the thought experiment here. Also he's apparently letting these monkeys get away with multiple character overlap to successively build the text.
There was a comic strip/sketch where scientists have a roomful of monkeys and typewriters, and their latest "Work" is clutched in a researcher's hand. As they go through, it's page after page of perfect Shakespeare, and they're going through with great excitement until they get to the very last page, they look in disappointment as it degenerates into "Ook eek ook". Anyone remember it?
I always thought the idea was that the characters would be produced sequentially throughout the entire play, not just every word produced independently. Much less credit.
If I'm understanding this, this isn't as cool as it seems. It seems like his 'monkeys' are just randomly creating words, and he matches those words against any word used in Shakespeare. If he gets a match, he marks that one as done. So, at some point one monkey made the word "be" and all of a sudden green lights all over the place.
I think the original saying was how random and unique it would be for a solid set of strings to randomly create a whole piece of work _in one go_ . Not a word here, a word there, OMG 100% of Shakespeare words have been randomly created.
-Malakai
A Dragon Lives in my Garage
Does this mean we can also create the next US President using Amazon? I mean, looking at the choices so far, how hard can that be?
Do your own thing. And overdo it!
So the virtual monkeys are recreating a subset of the work of Shakespeare not an entire work. And the Hadoop instance is splicing them together?
It is not a small step for _a_ monkey, it is a small step for millions of monkeys. It is not what you would think - that one monkey eventually recreated the work - but that millions of monkeys writing out random 9-letter groups of characters, eventually created all the groups necessary to recreate the work. Color me bored.
By the way, even with infinite computation power, using a random generator like Mersenne twister wouldn't work - there are 'only' ~2^20000 possible sequences of a given length that you would ever get. This is orders of magnitude less than the number of possible strings of length 11621 (which is 26^11621 ~= 2^54000)
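The comparison above can be checked with a few lines of arithmetic. This is a rough sketch: it assumes the generator is MT19937, whose internal state is 19937 bits, and takes the length 11621 and the 26-letter alphabet from the comment.

```python
import math

MT_STATE_BITS = 19937   # MT19937 state size in bits (period is 2**19937 - 1)
POEM_LENGTH = 11621     # length quoted in the comment above
ALPHABET = 26           # lowercase letters only

# Bits needed to index every possible string of that length.
bits_needed = POEM_LENGTH * math.log2(ALPHABET)

print(f"bits needed to cover all strings: {bits_needed:.0f}")
print(f"bits of generator state:          {MT_STATE_BITS}")
print(f"gap: a factor of about 2^{bits_needed - MT_STATE_BITS:.0f}")
```

Note that this bound only matters if the whole text has to come out of a single generator run in one piece; it says nothing about matching short chunks.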
"It was the best of times, it was the BLURST of times! Stupid monkeys!" {strikes them with script...}
srand(time(NULL));
while (1)
    if (rand() == 1234)
        puts("OMGOOSES!");
Kinda a waste of CPU cycles...
Better known as 318230.
What a colossal waste of energy and computing resources.
this only shows that the output string of a generator contains a certain string... since the output of the generator is fixed, there are strings, literary works, that is, that can not arise, while others can...
this should be done with "true" randomness generated by processes in labs, or some HQ solid state noise generator...
thoughts?
and that he missed the point of the expression?
Of course it will work: the Mersenne Twister will eventually cover the entire 9-letter space, and then he can search through it for the parts that match (yes, he is doing it concurrently, but that’s just an inefficient way of doing it). If he had the RAM and time he could eventually recreate every book possible.
The Wikipedia page explains it better: an infinite random string is bound to contain something that is perceived as useful. Of course, the literal take on the expression is the funniest.
What's happening here (if I understand the writeup) is that the monkeys are typing random letter combinations, until they hit a small phrase that happens to be in shakespeare. Then that phrase is marked as done.
Let n be the size in characters of the target phrase. If n=1, then the complete works of shakespeare are obtained as soon as each of the letters of the alphabet have been typed at least once. You could do this in a few seconds on your computer keyboard. If n=2, then the complete works are obtained as soon as all the possible pairs of letters have been typed. The experiment in TFA has n=9 I think.
As n grows larger, the time until completion grows exponentially. Once his experiment is done, the case n=10 should take roughly 26 times as long (ignoring punctuation, capitals and diacritical marks). Alternatively, it would require a cloud roughly 26 times bigger to do it in the same amount of time.
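A toy simulation makes the growth with n visible. This is only a sketch under assumptions: the sample phrase, the function names ngrams and draws_until_covered, and the fixed seed are all made up for illustration, and only lowercase letters are considered.

```python
import random
import string


def ngrams(text, n):
    """Every distinct n-letter window of the text, letters only."""
    letters = "".join(c for c in text.lower() if c in string.ascii_lowercase)
    return {letters[i:i + n] for i in range(len(letters) - n + 1)}


def draws_until_covered(text, n, seed=0):
    """Count random n-letter guesses until every n-gram of text has appeared."""
    rng = random.Random(seed)
    missing = ngrams(text, n)
    draws = 0
    while missing:
        guess = "".join(rng.choice(string.ascii_lowercase) for _ in range(n))
        missing.discard(guess)  # "mark that phrase as done"
        draws += 1
    return draws


if __name__ == "__main__":
    sample = "to be or not to be that is the question"
    for n in (1, 2, 3):
        print(n, draws_until_covered(sample, n))
```

Each step up in n multiplies the search space by 26, so the draw counts should climb by roughly that factor, give or take the noise of a single random run.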
I did think of it. I even registered a domain (see my URL and e-mail address). Planned on making a screensaver that would randomly generate stuff, and convince people to run it, ala SETI@Home. Then college happened, then graduate school happened, then marriage happened, then baby happened... And then (once again), I read on SlashDot that someone else has done one of my ideas again and made the front page.
But then again, literally as I'm reading this, my daughter is singing the Blue's Clues theme song next to me while my wife and I get ready to queue up for our nightly game of League of Legends... Sitting in the downstairs den/office that's full of years of gamer stuff that all represents the happy memories of those several years of college. That guy can have my monkeys. Good for him. I found something better. :)
Maxim: People cannot follow directions.
Increases in truth directly with the length of time spent explaining them
The Monkey Shakespeare Simulator was doing this properly back in 2003
http://web.archive.org/web/20061216060137/http://user.tninet.se/~ecf599g/aardasnails/java/Monkey/webpages/
Once I got it randomly generating all 26 letters, that was it.
Talk about a project not worth the electricity.
I think that the goal is that one of the many monkeys types an entire work of Shakespeare, not that many monkeys each type a very small segment of Shakespeare mixed in with gibberish, and then the many very small segments of Shakespeare are cut from the surrounding gibberish and combined by a person of intelligence into a work of Shakespeare.
now if the monkeys are typing out computer code he could copyright all possible programs....
That's all well and good but how much virtual poop did they throw at each other?
The only reference that Google comes up with is from this same article.
what is the point? this is so stupid. you can recreate the entire internet given a big enough computer power but how cool is that? not cool, just a big waste of computer power.
Certainly this story must interest some people. To you, I ask this question: what makes this story interesting? To me it's a waste of energy that doesn't produce anything unexpected or particularly interesting. Compared to this, the Minecraft Enterprise-D is useful--it's at least interesting.
(Note: I am a mathematician, so maybe I'm missing some of the novelty associated with random number generation and exponential growth.)
If anyone wants to see the status of the other works of Shakespeare, you can view them here (http://www.jesse-anderson.com/2011/08/a-few-more-million-amazonian-monkeys/).
If it can be proven that this is not a copy of the original, who owns the copyright? Does this mean that in a couple of years Torrents and PirateBay will be replaced by virtual monkeys creating works of art for me? And, would I have to pay them virtual bananas, or do they work for peanuts?
If my comment didn't sound as good in your head as it did in mine, then I guess we all know who's to blame
Slightly off topic, but thought someone might find this interesting.
In regard to the whole monkey/typewriter thing, even if you had infinite monkey/typewriter/time resources it would still never have any useful output.
Suppose you run the experiment long enough that you are guaranteed that one monkey on a typewriter has written Romeo and Juliet.
How do you find that monkey and his document?
Well, you have to look through every document and find the one that matches Romeo and Juliet.
But you either initially require a copy of Romeo and Juliet to match with (in which case you already have the damn document)
Or worse, you need a human being who is capable of detecting such a document without seeing it beforehand (Shakespeare!), and in that case he might as well just write it himself.
Now have the monkeys write a novel from the future and I will be impressed.
Exactly. Breaking down the problem of "randomly finding thousands of characters in the right order" to "randomly finding 9 characters in the right order" is bullshit, because this requires information about the order of all the 9-character-blobs you find.
In other news: I compressed a Gigabyte down to 2 bits. You just have to know the order of the bits!
Stupid article. Stupid submitter. Stupid waste of energy. That's the 21st century for you. Idiocracy at its best.
You could prove that for the length of a work of Shakespeare (N), the amount of "monkeys" required to solve the problem in the same amount of time is 26^(N-9). Or, as it relates to the proverb, the solution to the equation has the time required to create a work of Shakespeare as infinite and the number of monkeys required to solve it in that time as infinite.
Of course, that solution didn't require programming the monkeys. But it is extrapolatable out to an entire work.
The ______ Agenda
Randomness will produce everything indeed. But this experiment is not random. The monkeys are not *producing* the work of Shakespeare. They are *reproducing* it. The master program already knows the work, and has it programmed in its tests. There is a big filter here: take this random bit, and decide whether it is "part of Shakespeare's work." Not quite the same as letting the monkeys type a full page, and then have readers decide whether this is "as good as Shakespeare." Prior knowledge killed Schrödinger's Cat!
Now we can measure the computational power of systems in Shakespearian Monkeys per hour. I expect all future slashdot articles to convert to this notation.
This experiment, while fun, isn't exactly the infinite monkey experiment.
Of course it is not. It is impossible to simulate an infinite amount of monkeys working for an infinite amount of time. Some concession has to be made to the fact that we have a finite amount of computing power.
can they provide the RNG and seed value that produces the poem? thought not
bite my glorious golden ass.
If someone wants to increase God's bandwidth, that's laudable--taking God seriously. In my experience, there is a correlation between me holding up my end of the conversation and God talking. God said "honest measures" was in play -- by that which you measure-out it will be measured unto you.
God says...
10 i = i + 1
15 IF i > 99999 THEN PRINT ".";: i = 0
20 IF INKEY$ = "" THEN 10
30 PRINT "King James Bible, Line:", i
Line: 74977
themselves ill in their doings.
3:5 Thus saith the LORD concerning the prophets that make my people
err, that bite with their teeth, and cry, Peace; and he that putteth
not into their mouths, they even prepare war against him.
3:6 Therefore night shall be unto you, that ye shall not have a
vision; and it shall be dark unto you, that ye shall not divine; and
the sun shall go down over the prophets, and the day shall be dark
over them.
----------------------
ROFLMAO God's saying the monkeys will bite. Treat Him as entertainment.
And now the question is, given a finite amount of time, is it likely that a team of virtual monkeys sitting at virtual typewriters will come up with working keys for the programs that I download before the downloads are finished?
Yes! I could reproduce all of the works of Shakespeare in nearly zero time on a single computer. All I would need to do is reduce the comparison from 9 characters to 1 bit. A random bit generator could randomly reproduce a bit. I would then compare that one bit to a Shakespeare work and save it if it matched. Heck, we could 'optimize' this and just flip the bit if it didn't match, being that it's the only other possible permutation. Bam, O(n). My point is, as I believe others have noted, the saying is referring to the comparison of a single monkey's output with the only validation being that the monkey reproduced said Shakespeare work in its entirety. Validating a subset of the work is essentially the same as the validator writing the work, not the monkey.
way to REDEFINE SUCCESS
Not really realistic, considering that a real monkey would get p-d off after a few minutes and start throwing feces at the page, and eating the pages of other monkeys. From what I saw, his virtual monkeys don't do that.
http://en.wikipedia.org/wiki/Infinite_monkey_theorem#Direct_proof
Probabilities
Even if the observable universe were filled with monkeys the size of atoms typing from now until the heat death of the universe, their total probability to produce a single instance of Hamlet would still be many orders of magnitude less than one in 10^183,800.
Then they all went back to writing Perl
Table-ized A.I.
But does each snippet have to be generated with multiplicity? I.e., for n=1, it would not be enough to type all 26 letters once; each would have to be typed as many times as it appears in Shakespeare, a huge increase. For large n it makes virtually no difference.
I agree with you that this is a LONG way from the original thought experiment.
Was a five-minute ten-monkey job.
It is the universe that makes fun of us all.
This is more or less nothing but a random number generator.
I think one interesting question is rather: If we did not have Shakespeare as a template, would the monkeys ever recognize they created a poem?
Until it can analyse the random output and report artworks created, it is nothing more than a password cracker looking for the password Shakespeare created as a really long password.
The actual solution is very simple...
If you want to generate the text of every single book ever written or that may be written you first take a large chunk of memory, perhaps several tens of gigabytes and you then cycle through every possible character sequence that fits into that memory space.
It will take a very long time and will consume a huge amount of processing power and a ridiculous amount of disk space but you will eventually have generated every single book ever written.
You could test each sequence for sensibility by counting the number of known words it contains, looking for entire segments of data that contain known words, the higher the quantity of known words the more likely you have generated something readable.
What is the Bloom Field membership test? Googling for it only yields this news story.
If you all donate an infinite amount of cash, you can build an infinite computing platform to run the infinite monkey experiment!
- http://www.milkme.co.uk
Great use of cycles and energy. Imagine using that CPU power for something like seti or folding.
Considering we have rational reasons showing how this is retarded, in comments right under the article, I'd say we're doing okay. It sucks that it gets buried sometimes but that's what mod points are for. Ideally the editors would update the story to say that after consideration, what this person is doing isn't as interesting as we thought. But that's putting judgments into an article and that's not good reporting.
The original context had nothing to do with infinity, and it is often quoted as a 'fact' about infinity by people who understand neither the original context nor what infinity is.
The idea was to give you some idea about the probabilities involved in thermodynamics. The probabilities involved in statistical thermodynamics mean that at a macroscopic level, certain things generally regarded as contravening the laws of physics are possible, although extremely unlikely. For example some fluid could divide itself into its constituent elements, or into a warm region and a cool region. To deal with the conceptual problems caused by believing in this, the monkeys were given as an example of an extremely unlikely event. If you sit a single monkey at a typewriter and get it to type randomly, then there is a positive probability that it will type out the complete works of Shakespeare without errors. This probability is very small. However if you give an example of a thermodynamic event at a macro level which contradicts the 'laws' of thermodynamics (which were held to be laws before the statistical approach existed), the positive probability of this will be MUCH smaller. The original paper gave some simple example which I forget, something like a small amount of water at uniform temperature, and the chance of a temperature difference developing which it was possible to detect.
I guess the idea of the monkeys typing is so compelling that this 'thought experiment' has been shared extremely widely outside of the context of thermodynamics. However infinity does not have anything to do with it really. A very large, but finite, number of monkeys would be enough that one of them would type out the complete works of Shakespeare, on the first attempt, without errors, at least with a probability negligibly close to 1.
I wouldn't say impossible, just improbable. The odds of writing the play are in direct proportion to the number of characters available on the virtual keyboard. Suppose you give the infinite number of monkeys keyboards with 101 keys on them. The number of characters in the shortest play, A comedy of Errors, is 80,011. The actual odds of producing the play are 1/101^80011 or 1.74808391628e-160368. If we assign each monkey 1 cubic meter of space to work in. If we believe that each monkey will produce about 1 character per second A Comedy of Errors should be produced within the first day of this infinite number of monkeys working away. Within 2 days all of Shakespeare's works should be produced.
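The quoted odds are far too small for a floating-point number, but they can be checked in log space. This sketch assumes only the two figures given above (101 keys, 80,011 characters) and uniform, independent keystrokes.

```python
import math

KEYS = 101       # keys on the hypothetical keyboard
LENGTH = 80011   # character count quoted in the comment

# log10 of (1/KEYS)**LENGTH, computed without underflow.
log10_p = -LENGTH * math.log10(KEYS)
exponent = math.floor(log10_p)
mantissa = 10 ** (log10_p - exponent)

print(f"p ~= {mantissa:.3f}e{exponent}")
```

The result agrees with the figure in the comment to the digits shown, so the per-attempt probability itself is right; the timing claims that follow are a separate matter.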
The odds of two of these plays ever coming into contact with each other cannot be calculated since the number of monkeys is infinite. The probability is that the minimum distance between two plays will be 1 meter times the inverse of 1.74808391628e-160368. If we assume that the play floats around actually trying to find another play in this infinite space bouncing off of monkeys, computers and printers then the odds of two plays ever bumping into each other are pretty infinitesimal.
Why is that important? It gives people the idea of the odds of two different intelligent species from two different star systems ever coming into contact with each other. We could convert the meter above to the average distance in light years between planetary systems and then....but, anyone with a basic understanding of math realizes all this already so this experiment is not even mental masturbation.
He kept a million virtual monkeys gainfully employed for X amount of time. The job field is hard out there for virtual monkeys, too, you know. Bunch of anti-virtual monkey people, you are! Hmmmph!
Vote monkeys into Congress. They are cheaper and more trustworthy.
It can't be so difficult to get it.
Experience says that once there were a bunch of monkeys and in a few millions years all of Shakespeare works were written. In order!
This sort of concession misses the point. The "infinite monkey theorem" is about how wildly unlikely things are not the same as impossible things. Therefore you cannot discount the possibility that a thing happened or will happen just because it is very improbable to happen, if it was or is going to be subjected to an arbitrarily high number of "chances" to happen.
This experiment breaks it down to brute-forcing a poor password, billions of times, instead of brute forcing a friggin' insane password, which is a substantial difference.
It seems like it could be useful as a thought experiment in a probability problem set or test, or maybe in a programming course (with n suitably reduced to run on a student's machine in reasonable time).
A better use of time would be to teach the monkeys to render the text correctly in the first place. Monkey brains aren't that different from human brains after all. - And one monkey getting all 11 letters of 'Shakespeare' correct has a lot better chance of a correct match.
See, what you really need is some randomness and selection (the filtering), both of which you've already got, and then add the reproduction and inheritance parts. If you want to be fancy, you can implement crossover. Leave the system set up for quite a while and you'll end up with some amazing things that will keep you endlessly entertained with very little maintenance required.
--God
I did something relatively like this (albeit smaller scale) when I was an 8th grader in the early 80's, learning BASIC.
One of the first programs I actually wrote myself instead of laboriously copying out of a book or a magazine was "Monkeys" - this was a program that randomly generated letters and added them to the right end of a string, checked the string against a target value, and if it didn't fit, deleted the leftmost character from the string.
If a$="tobeornottobethatisthequestion" then b$="The monkeys have succeeded in writing Shakespeare!"
This was on an Apple II, but it ran every time the computer wasn't being used, probably months over the course of a year. My monkeys never succeeded. Stupid monkeys.
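For anyone who wants to try the same idea today, here is a minimal Python sketch of that sliding-window approach. It is a reconstruction from the description above, not the original BASIC; the function name monkey_window and the max_keys cutoff are assumptions added so the loop always terminates.

```python
import random
import string


def monkey_window(target, seed=None, max_keys=10_000_000):
    """Type random letters, keep the last len(target) of them, stop on a match.

    Returns the number of keystrokes used, or None if max_keys is reached.
    """
    rng = random.Random(seed)
    window = ""
    for keys in range(1, max_keys + 1):
        # Append on the right, drop from the left once the window is full.
        window = (window + rng.choice(string.ascii_lowercase))[-len(target):]
        if window == target:
            return keys
    return None


if __name__ == "__main__":
    # A 3-letter target turns up quickly; the 30-letter phrase would not.
    print(monkey_window("tob", seed=1))
```

With 26 letters, a 3-letter target needs on the order of 26^3 keystrokes, while the 30-letter phrase needs on the order of 26^30, which is why the Apple II never got there.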
-Styopa
But that defeats the point. Why 9-character segments? Why not 1-character segments? Then, when each letter has been generated once by your random number generator, you say 'done' and move on. The point of the gedankenexperiment is to show that a true random number generator will eventually produce any sequence, irrespective of whether you ascribe some meaning to that phrase or not. For example, it is just as probable that a monkey would type 'the original submitter is an idiot who misses the point of probability' as it is that they would type 'mfdag gfnaif pwrg kflgsq hmthwrhdga adsfjn fadfm asdfned qemangasd asv'. They are both 70-character strings of lowercase ASCII characters and spaces, and if you have a random number generator set up to produce these with no bias towards letter frequencies then either combination is equally probable. This 'experiment' added an extra step of determinism, which means that it is not an unbiased random number generator, it's a very badly designed program for generating a Shakespeare play.
I am TheRaven on Soylent News
9 character chunks? Should I spend a 1/2 hour writing a monkey class that emits 1 character chunks and simply monitor it for "Much Ado About Nothing"? I bet it could reproduce the work in less than a second with a single monkey.
9 characters of any particular Shakespearean work isn't slashworthy...
Now if he had a monkey emit the entire work that would be interesting in an examination of how long it took and how it occurred.
pics or it didn't happen
The rebuttal is that people making that first argument don't understand the replication part of natural selection. Evolution doesn't say atoms randomly come together to form each person. First they formed useful proteins, and those genes got replicated. Repeat and add one level of complexity each time, keep repeating 4 billion years... and you finally make complex organisms.
Back to the analogy of monkeys typing, the idea is once monkeys bash out a useful combination, a word, consider that word created (gene useful) and will replicate. Turns out, if you apply the analogy right, monkeys can bash out Shakespeare pretty fast, so the monkey analogy is a bad argument against evolution.
But those monkeys wouldn't know Shakespeare's version was any better than the infinity of others produced in the exercise. Even if they created something great they wouldn't know what they had. It would have an equal chance of being discarded as the other iterations. We need enough genius monkeys (read: humans) to read all the crap produced and determine which stands out.
This information then gets put on something called "The New York Times Bestseller list."
Here's a simpler example:
while(1)
{
int x = rand() % 10;
if (x==666) printf("Yes, everything!\n");
}
You do know that x will never be greater or equal to 10, right? Thus x can never equal 666. FAIL.
i always wondered what Jasper Fforde was on about but never bothered looking into it. and i still haven't
i spent five minutes thinking and all i got was this crappy sig
first a question. what is a Bloom Field membership test? can't google that.
Second, I think this test is cooked. One virtual monkey did not write this uncoached. No way. it's mathematically impossible.
I think what he did was just take 9 character pieces from different monkeys.
Some drink at the fountain of knowledge. Others just gargle.
26 to the power of the number of letters in the complete works of Shakespeare is not infinite. It is not even close to infinite.
Finite time and a finite number of monkeys are sufficient to generate the complete works of Shakespeare. And if you don't mind using a lot of one, you can get by with a very small amount of the other.
Hollywood's newest screenwriter!
One of our competitors trademarked the term "hypothesis". From now on, we will call them "boneheaded ideas".
Seems like the coder used genetic/evolutionary programming to achieve the "feat". However, it is not truly random since you are comparing end result with an absolute answer, and not some mathematical approximation. In simple words, the user wanted Shakespeare, so he hardwired the code to get Shakespeare. I don't mean comparing the end result to Shakespeare. The program effectively weeds out non-Shakespeare out of the result pool at every iteration. Not random. Here is an example demonstrating how anyone could do it. http://www.generation5.org/content/2003/gahelloworld.asp
wget http://www.gutenberg.org/ebooks/100.txt.utf8 # The Complete Works of William Shakespeare
perl -e 'local $/; $i=0; $_=<>; while(m/([0-9a-zA-Z_\-]+)/g) {if(length($1)>$i) {$i=length($1); $k=$1; }} print "$i:$k\n"' 100.txt.utf8 # find the longest word:
36:tragical-comical-historical-pastoral
perl -e 'print length("0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ_-")'
64
64^36 = 1.05312292 × 10^65
There are some 10^65 words as long as the longest one written by Shakespeare. http://en.wikipedia.org/wiki/FLOPS says that "the 500 fastest supercomputers in the world combine for 58.9 petaFLOPS of computing power". If one word took just one flop, and we allowed the monkeys to take all those 500 supercomputers, it would take (10^65)/(58.9*10^15) = 1.8 * 10^48 seconds, which is roughly 5.7 * 10^40 years, to get all the 36-char long words. Good luck to the random monkeys.
I thought the point of the original monkey idea was that they create Shakespeare *without* access to the original. This is like other experiments where genetic algorithms are used to recreate something that exists by comparing the output to the existing thing. That's just glorified curve fitting.
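The estimate is easy to re-run. The sketch below takes the 64-character alphabet, the 36-character word length and the 58.9 petaFLOPS figure from the comment; the one-flop-per-word assumption is the comment's, and the 365.25-day year is mine.

```python
ALPHABET = 64                 # characters counted by the Perl one-liner
WORD_LEN = 36                 # length of the longest "word" found
FLOPS = 58.9e15               # combined rate quoted from Wikipedia
SECONDS_PER_YEAR = 365.25 * 24 * 3600

words = ALPHABET ** WORD_LEN  # exact integer, about 1.05e65
seconds = words / FLOPS       # one word per floating-point operation
years = seconds / SECONDS_PER_YEAR

print(f"{words:.3e} words, {seconds:.2e} s, {years:.2e} years")
```

Whatever the exact leading digits, the answer lands in the 10^40-year range, so the conclusion stands.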
The real goal is to design something new that would not necessarily have been arrived at through traditional design methodology. People have "evolved" FPGA circuits by generating random bit configuration files, testing the functionality, and crossbreeding the best circuits. They don't compare it to an existing circuit. Sometimes the circuits don't even use clocks, and no one can quite explain how they do what they do. Some stop working when loaded onto a supposedly identical FPGA, as if the circuit is taking advantage of some unique variance of that particular FPGA. Another company designed an antenna with a genetic algorithm, and the end result looks like nothing you'd come up with using traditional antenna design techniques.
The fact is that a monkey banging on a keyboard is not going to even remotely approximate statistical randomness. Other issues with this "experiment" aside, the simulation of the physical system is broken.
Is there proof you did this earlier? Did you file a patent on your Infinite Monkey Generator? If so, you can sue his sorry ass goodbye and MAKE MONEY FAST!!!!1!11
given enough time, monkeys DID write Shakespeare's works - they just had to evolve to the point that one of them, named William Shakespeare, decided that writing plays was a good thing for him to do and so he did and here we are.
---- You are fully entitled to my opinion.
I was going to say, this experiment doesn't sound like the original idea. The original idea, as I understand it, is that the entire play would be randomly written down as a single string, so instead of finding smaller strings of x length and then putting them together, the monkeys would, as a whole, come up with 1 string that had the entire play in it.
Who will sue them if they randomly reproduce / perform copyrighted works?
The infinite monkeys theorem, imho, simply relates that eventually one of an infinite number of said monkeys will happen across a random sequence of characters equivalent to one of the Shakespearean works. What you have done here however, is akin to trying to reconstitute a corrupted version of said work given the hash and some source data. Each monkey is randomly trying to produce a given small segment of the correct data. I think this doesn't count. Sorry.
What bilge. Jesse Anderson is either an idiot or (more likely) a brilliant self-promoter.
If a million monkeys were typing on computers, one of them will eventually write a Java program. The rest of them will write Perl programs.
In addition to JUST recreating a word list of all the words that Shakespeare used, the other flaw is that monkeys (in experiments performed) do not select keys randomly. The 2003 experiment in a British zoo resulted in the following nonsense: http://www.vivaria.net/experiments/notes/publication/NOTES_EN.pdf
In the 1970's an article was published entitled "How Artificial Is Intelligence".
It posited typewriters with key groupings based on the proportionality of those groupings in writing samples, such as Poe or Shakespeare.
By the time the typewriters had keys with 5 letter groups they were writing Poe and Shakespeare style works.
But that paper, never digitized, is in what Vinge called Pre History.
The posting is kind but a little misleading: I didn't invent Mersenne Twister. I just wrote the fastest Java implementation, built on top of early code by Michael Lecuyer. It's fairly widely used, I suppose. Mersenne Twister proper (MT19937) was the creation of Makoto Matsumoto and Takuji Nishimura. Feel free to ask any questions. (I can't say anything to the Shakespeare project, which sounds fun).
This was a completely pointless waste of time. What on Earth is bruteforcing a 9-character quote good for, and why would it be newsworthy?
Call me when the monkeys create a work of Shakespeare that he could have written if he lived another year ..
Hey don't blame me, IANAB
Sorry, I missed the correct moderation. How can I miss the correct moderation? I need Coke (The beverage, just in case... )
That said, my comment should fix the screw up.
(Comment written by an open collaboration of 24,342,749 monkeys)
"Science can amuse and fascinate us all, but it is engineering that changes the world. " - Asimov.
Did the guy check the nine (9) characters against all the other works of fiction in the world? It is possible the virtual monkeys were copying some other author.
But no!
Of course the reason that randomly typing text until you have re-created shakespeare is so difficult is because of the greater improbability of getting strings of characters in the correct order as the length of that string increases.
To only consider 9-character strings, and purposefully search for those random strings in a work until you find one, eventually finding them all, is a drastic and cheat-ful short cut.
I cry foul.
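To put rough numbers on that, here is a small sketch that counts how many equally likely strings there are for a given length, assuming a 26-letter alphabet (a to z, as in the summary) and uniform random typing. The helper names are made up for illustration.

```python
ALPHABET_SIZE = 26  # a to z only, as described in the summary

def possible_strings(length):
    """Number of distinct strings of this length over the alphabet."""
    return ALPHABET_SIZE ** length

def digits(length):
    """How many decimal digits that count has, to show how fast it grows."""
    return len(str(possible_strings(length)))

if __name__ == "__main__":
    # One specific 9-character string is one of about 5.4 trillion.
    print(possible_strings(9))   # 5429503678976
    # The count gains roughly 1.4 decimal digits per extra character.
    for n in (9, 100, 1000):
        print(n, "characters:", digits(n), "digits")
```

A single random attempt hits one particular string with probability 1 over that count, so 9 characters is within reach of a cluster while a whole poem in one go is not.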
This technique goes back to when I was a kid. Back then people used to come around and offer to randomly generate your street address on the curb in front of your house. They had two tools to do this - a can of spray paint to generate all the random tries, and a stencil they'd tape to the curb to preserve the correct tries. The spray can was directed at the stencil, which would selectively pass correct guesses to the concrete for retention and capture incorrect guesses on the surface of the stencil. Amazingly this system produced the right numbers every time.
So let me get this straight. You're taking random output from 1 million sources (first off, computers can't be truly random, but they can simulate it pretty well) and every time you get a "match" of at least 9 characters from any of Shakespeare's works, it gets flagged and added to the "queue." So eventually, enough of these random matches add up to a complete document.
Seems to me, all you are doing is intelligently pulling out snippets from a million random streams of characters that match sections of a predetermined pattern until you have all the pieces you need. I think you've committed a major logical fallacy. The "randomness" isn't truly random, and you are manipulating the data until it matches your predetermined pattern. A sign of some outside intelligence injecting "meaning" into a stream of pseudo-random data. Or evidence of a "God."
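For anyone who wants to see that procedure spelled out, here is a minimal sketch of it as I understand it from the summary: generate random fixed-length strings, keep the ones that occur somewhere in the target text, and stop once every chunk has been seen. It is not the project's code. A plain Python set stands in for the Bloom filter plus string comparison, the chunk size is shrunk from 9 so it finishes quickly, and all names are hypothetical.

```python
import random
import string

def chunks_of(text, size):
    """All runs of `size` consecutive letters in the text (a to z only, lowercased)."""
    letters = "".join(c for c in text.lower() if c in string.ascii_lowercase)
    return {letters[i:i + size] for i in range(len(letters) - size + 1)}

def run_monkeys(text, size, rng, max_tries):
    """Throw random strings at the text until every chunk has been matched.

    Returns the set of chunks found and the number of random strings tried.
    """
    wanted = chunks_of(text, size)
    found = set()
    tries = 0
    while found != wanted and tries < max_tries:
        guess = "".join(rng.choices(string.ascii_lowercase, k=size))
        tries += 1
        if guess in wanted:  # the filter does the selecting, not the monkey
            found.add(guess)
    return found, tries

if __name__ == "__main__":
    text = "To be, or not to be"
    found, tries = run_monkeys(text, 2, random.Random(42), 1_000_000)
    print(len(found), "of", len(chunks_of(text, 2)), "chunks after", tries, "tries")
```

Note that the loop only ever needs to hit one of a few targets out of 26^size possibilities, and it always knows which guesses to keep. That is the step the objection above is about.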
The real experiment is:
"A monkey hitting keys at random on a typewriter keyboard for an infinite amount of time will almost surely type a given text, such as the complete works of William Shakespeare."
Variations on numbers of monkeys and amount of time are numerous, but the basic statistics can be calculated. It implies that a single monkey will eventually produce a contiguous block of text that matches, character for character, a known piece of literature. It is not mathematically impossible, but the probability is so low that it is indistinguishable from zero.
I absolutely agree. From the point of view of the classic "one million monkeys with typewriters" the "result" described here is completely and utterly uninteresting. If he had had each node randomly generate data until one of them had emitted the full work in _one sequence_ he would have a story. There's just this problem that this is highly unlikely to happen even if you throw all the world's computing grids at it, given that there are ~26^n random texts of length n. Therefore, while reading the abstract I knew it had to be fake, but I was hoping to be proved wrong - unfortunately I was right :(
Forgot to say that I can't rule out that there might be some interesting things in the execution (interesting algorithms for looking up the 9-char bits etc.). I haven't read the details so I'm not able to judge. In my critical post above I was discussing the approach in terms of the classic idea about the ability of a monkey to generate the work of Shakespeare.
The naivety demonstrated in regard to the original postulate is mind-numbing. Or maybe those who buy into his implementation as being technically correct for the given postulate are the most mind-numbing of all.
It has been demonstrated now, that Shakespeare's hallowed works could in fact have been produced by a monkey. I have suspected this ever since my first exposure to this overrated dreck they call "literature" in 4th grade. It occurred to me even then, that if he'd lived today, he'd have been a writer for a show like Friends, or Seinfeld. A corollary of this is of course that hundreds of years from now, poor fourth-grade children will have to stage performances of selections from such shows as Friends. Can't you just see a couple kids on stage, one with a turkey on her head, jiggling her non-existent little boobies, and the other saying "you so gweat, I wuv you!"? Sad. Shakespeare wrote stuff that was popular. The fact that it was popular doesn't make it great, and that's a common misconception in English literary circles. Can't we break this cycle of Shitspeare?
... Waiting for PETA's squawks of outrage at this cruelty to monkeys
Isn't April 1st the agreed time for these pranks? After the 'Cat owners are smarter' one, which everybody swallowed... Has nobody learned yet? ;-)