Making The Case That Voynich Is A Hoax
DeadVulcan writes "The Voynich Manuscript, a mysterious book of uncertain age, is widely believed to be written either in an unknown language or a long-lost encryption scheme. Nature reports that computer scientist Gordon Rugg has demonstrated that it's possible to generate a text like the Voynich manuscript -- containing language-like regularities, despite being potentially meaningless -- using cryptographic techniques of the time. This lends some support to those who claim that the book is a hoax."
Translation from binary:
Ich denke sein vermutlich einen
Translation from German from binary:
I probably think its one
In case you're wondering what it looks like
http://www.voynich.nu/
I've studied the Voynich manuscript before, and the possibility of a hoax seems just as unlikely as many of the theories that have been floating about. Yes, the language of the Voynich manuscript could be an elaborate hoax, but Rugg's analysis only proves what is already widely known.
The problem with the elaborate-hoax idea is that even Rugg's theory doesn't explain all the features of the Voynich manuscript. Furthermore, it seems unlikely that a sixteenth-century forger would go to the trouble of creating something that has all the qualities of a real language and that still resembles an actual document when examined with analytical techniques that wouldn't be developed until centuries later. Occam's Razor makes it seem more likely that there is some kind of language operating in the manuscript than a random system of patterns. Then again, there's no real way of knowing.
There are some images of the text of the Voynich Manuscript available here. Analysis of the text and the illustrations supports the theory that the manuscript has defined sections on astrology, herbal medicine, and other subjects. There have been some serious and some ridiculous theories about the manuscript, ranging from the intriguing notion that the Voynich text is mathematically similar to East Asian languages like Chinese or Vietnamese to the claim that the Voynich manuscript is written in an ancient form of Ukrainian. (I've read the supposed translation of it from the Ukrainian, and it hardly makes sense, given that the manuscript's illustrations don't match the text of the supposed translation.)
In the meantime, this site offers more information on modern translation efforts including a font for the Voynich script. (Which would make a lovely way of annoying co-workers by switching their default system font to Voynich text...)
Prof. Rugg has a website about his methods and results, which may be of interest.
It loads slowly because of the Java applets, though.
The simple truth is that interstellar distances will not fit into the human imagination
- Douglas Adams
Wow, I've never actually made a comment on Slashdot and had so many replies. To be entirely honest, I don't know much about the document in question. When I scanned through, it struck me that they are looking into complex ways of proving it to be a hoax when it could be something simpler. I do understand the complexities of creating a language, and I didn't really mean making up a completely new language with new grammar etc.; I was referring more to creating your own alphabet. Create your own symbols. If you wanted to make it complex you could add your own rules and extra characters like those other languages use (e.g., a character for the "th" sound, another for each of the different pronunciations of the letter "a", and so forth). It would be very time-consuming to translate something like that, but if you have nothing to do for a decade or so and are driven to try to confuse people for many years, then I'm sure it could be done (although I personally doubt it). Anyway, just a brain splurge.
-Zibi
No. It is the job of those who claim the book is genuine to prove that it is. One doesn't need to prove that something is a hoax if it is; Occam's Razor does that job. Which explanation contains the fewest unsubstantiated assumptions: that something was written in a language nobody knows, containing valuable information nobody has any idea about, or that it was produced using a simple encryption technique to fool somebody into paying loads of shiny ducats?
I find it amazing that some people still hold this myth as true! What kind of history education have you had!?!
Look, no scientist has ever claimed the earth was flat. For one thing, nobody claims this about any culture other than the Western one (there it is always "they even knew the earth was spherical"), but some people have somehow got the weird notion that Columbus had to argue that the earth wasn't flat.
He didn't. The moron had the wrong numbers, and would have gotten killed if America hadn't happened to be there.
Already the pupils of Thales claimed their master knew the earth was round. Eratosthenes measured the circumference of the earth with an error of 3%! The true circumference of the earth was known to the Greeks in antiquity! Plato and his pupil Aristotle knew many arguments for the spherical shape of the earth, and why is this important? Because although some Christian scholars around 300 AD didn't like the idea of a spherical earth, St. Augustine adopted much of Plato's philosophy and made it an important part of Christianity in the same century, and the idea of a spherical earth was adopted along with it. Through Augustine, every leading authority accepted the idea of a spherical earth.
Eventually, Eratosthenes' numbers were also accepted, but Columbus didn't like them, because they meant that going the other way to India was infeasible. So he used some other numbers, together with Marco Polo's exaggerated estimates of the distance he had travelled, and that made the trip look quite feasible. But it wasn't; he was wrong.
Columbus thought the distance to Asia was 4000 km, his contemporary scientists 16000 km, the real distance is 23000 km, while Columbus eventually travelled 6500 km.
So, why is this important? Because people who hold this belief often have many other misunderstandings about science. Indeed, you can't prove that the book is a hoax, but for that very reason the burden of proof rests with the proponents of the idea that it is genuine. Who, of course, might cling to the idea that it is, long after the world has moved on to greener pastures. That's how it usually works anyway.
Employee of Inrupt, Project Release Manager and Community Manager for Solid
oops. extraneous space in the link. Here's one you just need to click
Example, my house catches fire. Firefighters are unable to determine the source. The insurance company denies my claim on the grounds that the technology existed to rub two sticks together to generate heat and produce fire.
Of course, this is ridiculous. But there have been many who claimed that producing a hoax as convincing as the Voynich papers was virtually impossible. Rugg has shown that, at the earliest known date of "discovery," it was possible, and perhaps well worth doing for the price it fetched.
So, your analogy is incomplete. The insurance company's argument would have some relevance if you had previously been claiming that it was technologically impossible for you to light the fire. They just produced a counter-argument.
Coming back to the Voynich manuscript, it just means that the possibility of a hoax cannot be ruled out because of the effort required to produce it. Turns out it's not as hard as people thought.
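To make "not as hard as people thought" concrete, here is a toy sketch of the general kind of technique usually attributed to Rugg: a table of syllables read through a card with holes in it. Everything in it is an assumption made for illustration. The syllable lists, the table size, the grille offsets, and the function names `build_table`, `read_through_grille`, and `generate` are all invented here, and none of it is taken from Rugg's actual tables or from the manuscript.

```python
import random

# Invented syllable inventories (assumption: purely illustrative).
PREFIXES = ["qo", "ch", "sh", "o", "d", ""]
MIDDLES = ["ke", "te", "ai", "ol", "e", ""]
SUFFIXES = ["dy", "in", "y", "r", "l", ""]


def build_table(rows, rng):
    """Fill a table with (prefix, middle, suffix) cells chosen at random."""
    return [
        (rng.choice(PREFIXES), rng.choice(MIDDLES), rng.choice(SUFFIXES))
        for _ in range(rows)
    ]


def read_through_grille(table, grille, start):
    """Read one word: the grille gives a row offset for each of the 3 columns."""
    n = len(table)
    return "".join(
        table[(start + offset) % n][column]
        for column, offset in enumerate(grille)
    )


def generate(table, grille, count):
    """Slide the grille down the table and collect the non-empty words."""
    words = []
    for start in range(count):
        word = read_through_grille(table, grille, start)
        if word:
            words.append(word)
    return " ".join(words)


if __name__ == "__main__":
    table = build_table(40, random.Random(1))
    print(generate(table, grille=(0, 3, 7), count=30))
```

Because every word is assembled from the same small stock of pieces, the output shows word-like regularities without carrying any message, which is the point the parent comment is making. Whether the real manuscript was produced this way is a separate question that this sketch does not address.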
Accountability on the heads of the powerful.
Power in the hands of the accountable.
Use Wikipedia for some background information here
No, to you. Occam's Razor is a heuristic for selecting hypotheses to test. It doesn't relieve you of the burden of proof just because your burden is heavier. You definitely do need to prove that "X is false", if that is the hypothesis you selected based on whatever heuristics you choose.
Voynich is patently written in an unknown code (i.e., language): that's not an assumption, it's a given for both hypotheses. The first hypothesis (for which you used the non-synonym "unsubstantiated assumption") is that Voynich has high information content in the algorithmic sense. The second hypothesis is that Voynich has low information content, again in the algorithmic sense. Considerations of value, motive, et cetera are irrelevant to this analysis. They might be of heuristic value for selecting hypotheses, but not for applying Occam's Razor (which is another heuristic).
To sum it up, you still have the burden of proof, and you can't use heuristics for selecting heuristics.
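For readers unfamiliar with "information content in the algorithmic sense": it cannot be computed exactly, but the size of a text after compression gives a crude upper bound on it. The sketch below is only an illustration of that idea under stated assumptions. The helper `compression_ratio` and both sample strings are invented here, and neither string is Voynich text.

```python
import random
import zlib


def compression_ratio(text: str) -> float:
    """Compressed size divided by raw size; lower means more redundancy."""
    raw = text.encode("utf-8")
    return len(zlib.compress(raw, 9)) / len(raw)


# Placeholder samples (assumption: neither is real manuscript text).
repetitive = "qokedy " * 500
rng = random.Random(0)
alphabet = "abcdefghijklmnopqrstuvwxyz "
noisy = "".join(rng.choice(alphabet) for _ in range(len(repetitive)))

if __name__ == "__main__":
    print("repetitive:", compression_ratio(repetitive))
    print("noisy:     ", compression_ratio(noisy))
```

A general-purpose compressor only finds the regularities it was designed to find, so a poor ratio does not show that a text is meaningful and a good ratio does not show that it is a hoax. At best this is one heuristic among the others discussed in this thread.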
The full Borges story is here. Like much of his work, it's a good read.
GROGGS: alive and well and living in
Some other good links for Voynich information:
Elonka :)
> I can see how this could be done as a really big simultaneous equation where the coefficients are dummy values for verb/noun etc with a parallel set of equation based on the grammar rules (combination), then variance analysis to eliminate the typos. Pretty elementary.
If you think it's pretty elementary you should write it up and publish it, since doing so would make your name an instant household word in fields ranging from philology to computer science, and probably also harvest you a fine crop of honorary PhDs and cushy job offers.
The problem of inducing grammars from examples has been intensely studied, and about all we know about it is that it's hard. (For example, we have a theorem showing that it's impossible to learn an arbitrary context-free grammar from positive example strings alone, however many you are given.) The way children learn their language's grammar is so baffling that intelligent people have seriously supposed that you're born with a grammar processor in your brain that already knows how to process any possible natural language grammar, plus a support module that helps you determine which grammar to use by inducing a small set of switch settings from the examples you hear.
For that matter, after several decades of research we are just now getting to the point where computers can reliably parse natural language sentences even when we already know the language, its grammar, and have a lookup dictionary for all the words in the language. Automatically determining the parse of a sentence where both the grammar and vocabulary are completely unknown is a phenomenally more difficult problem; I suspect it's impossible even in principle.
As to your suggestion, I'm curious how you're going to solve a system of simultaneous equations when the data you are working with doesn't actually express any equalities. (Stop for a minute and think how you eliminate variables in a system of simultaneous equations.) For that matter, I'm not even sure what your representations are supposed to be.
It almost sounds like you're wanting to try all possible combinations for the part of speech for each word, but the combinatorial explosion would eat your lunch (n^m solutions, for n parts of speech and m words, even assuming no words can play multiple roles). Perhaps worse, even if you could enumerate all the possibilities you wouldn't be able to tell which one was correct. Since you don't know the grammar in advance, and since natural language grammars can be remarkably different from each other, you simply wouldn't have any way of knowing which part-of-speech mappings resulted in grammatical sentences and which didn't.
And if by some chance you did guess the correct grammar, you still wouldn't have a clue what the words meant.
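The n^m figure above is easy to check on a small scale. The following is a minimal sketch, assuming each distinct word gets exactly one part of speech; the function names `count_taggings` and `enumerate_taggings` and the sample words and tags are made up for illustration.

```python
from itertools import product


def count_taggings(n_tags: int, m_words: int) -> int:
    """Number of ways to give each of m distinct words one of n tags."""
    return n_tags ** m_words


def enumerate_taggings(words, tags):
    """Yield every word-to-tag assignment (only practical for tiny inputs)."""
    for assignment in product(tags, repeat=len(words)):
        yield dict(zip(words, assignment))


if __name__ == "__main__":
    # Three placeholder words and two tags give 2**3 = 8 candidate taggings.
    for tagging in enumerate_taggings(["daiin", "chol", "shedy"], ["N", "V"]):
        print(tagging)
    # Eight tags over just twenty distinct words is already 8**20 = 2**60.
    print(count_taggings(8, 20))
```

Enumerating is the easy half. As the comment says, nothing in the enumeration tells you which of the candidates is the grammatical one.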
> then variance analysis to eliminate the typos
If that's possible for unknown languages, it should be easily applicable to known languages. Are you suggesting a methodology for automated proofreading, that would catch typos in manuscripts? Could it be embedded in the slashcode, to automagically correct the typos in our posts? This technology alone would make you rich, even without all the other stuff you would need for interpreting unknown languages.
Sheesh, evil *and* a jerk. -- Jade