Making The Case That Voynich Is A Hoax

← Back to Stories (view on slashdot.org)

Making The Case That Voynich Is A Hoax

Posted by timothy on Tuesday December 30, 2003 @07:23PM from the exercise-for-the-reader dept.

DeadVulcan writes "The Voynich Manuscript, a mysterious book of uncertain age, is widely believed to be written either in an unknown language or a long-lost encryption scheme. Nature reports that computer scientist Gordon Rugg has demonstrated that it's possible to generate a text like the Voynich manuscript -- containing language-like regularities, despite being potentially meaningless -- using cryptographic techniques of the time. This lends some support to those who claim that the book is a hoax."

9 of 382 comments (clear)

Min score:

Reason:

Sort:

The Salamander Papers by Saeed+al-Sahaf · 2003-12-30 19:28 · Score: 5, Interesting

Somebody is laughing a lot.. Remember way back the Salamander Papers?

--
"Who are in control, they are not in control of anything - they don't even control themselves!" - Glen Beck
Library of Babel by Mrs.+Grundy · 2003-12-30 19:31 · Score: 5, Interesting

This reminds me of a passage from Jorge Luis Borges' Library of Babel. In fact a lot reminds me of that story these days.
Five hundred years ago, the chief of an upper hexagon (2) came upon a book as confusing as the others, but which had nearly two pages of homogeneous lines. He showed his find to a wandering decoder who told him the lines were written in Portuguese; others said they were Yiddish. Within a century, the language was established: a Samoyedic Lithuanian dialect of Guarani, with classical Arabian inflections. The content was also deciphered: some notions of combinative analysis, illustrated with examples of variations with unlimited repetition.
Missing the fact.... by Zibi · 2003-12-30 19:34 · Score: 4, Interesting

I think this report is missing the fact that if someone really wanted to make a hoax book, they could simply translate any other book (even the bible) into a made up language. If it's an obscure book the likliness that anyone would every figure it out is slim.

--
-Zibi
Beale Papers by Dan+East · 2003-12-30 19:39 · Score: 4, Interesting

Sounds a bit like the Beale Papers.

Dan East

--
Better known as 318230.
Ridiculous by SargeZT · 2003-12-30 19:40 · Score: 4, Interesting

I'm sorry, but calling the Voynich Manuscript a hoax is unfeasible. Sure, could it have in theory been a hoax? Yes, but there is no point to this. The "hoaxer" creates this in 3+ months, with very accurate drawings, and probably hangs on to it till he dies, so that it can be sold to a king 100 years later and eventually make it to america? Then again, maybe Nostradamus wrote it.

--
And why did you staple the trout to the RAM?
1. Re:Ridiculous by Seth+Morabito · 2003-12-30 20:00 · Score: 5, Interesting
  
  The point of a hoax, in my opinion, would most likely have been financial gain.
  
  There is no clear evidence pointing to an exact date that the manuscript was written, and the only firm circumstantial evidence we have to go on is Marcus Marci's letter to Anasthasius Kirchir, which mentions that the manuscript was sold to King Rudolph for 600 ducats. That is a heck of a lot of money. It seems perfectly reasonable to me that someone manufactured the manuscript to extract 600 ducats from the emperor.
  
  This assumes a lot. It assumes that the letter is genuine, and it assumes that the facts mentioned in the letter are true, and it assumes that Rudolph was the first buyer, so it is by no means a sure thing. But a lot of us who lean (gingerly) toward the hoax theory stand by Occam's Razor, which points to a hoax being at least a feasable, and probably even likely solution. Rugg's analysis is just more circumstantial evidence, not proof, but every little bit weights the scale more.
The pattern of nonsense by the+end+of+britain · 2003-12-30 19:42 · Score: 4, Interesting

The technique really is interesting. We have techniques that can identify patterns that are meaningful (all of cryptology, most of number theory, graph theory) but this application is neat because it is an effort to prove--rigorously--that a given set of data is just total noise.

--
"Oh, the tragedy of math gone wrong. I can't even talk about it." -Wil Wheaton http://www.wilwheaton.net
Interesting problem. by Black+Parrot · 2003-12-30 22:22 · Score: 4, Interesting

Those who read the article can take note of an interesting challenge: though Rugg has shown that it is possible to generate a high quality hoax using a Cardan grille, proving it to be a hoax may require producing a character grid that will actually generate large portions of the text. My question is, could that be done with a genetic algorithm, and are any Slashdotters up to the task?

Also, a few comments about formal analysis. Notice that if you took some arbitrary text, typeset it in a fixed-width font to force the characters into columns, and then skimmed it with a grille in order to generate a new text, you would automatically preserve such basic statistics as character frequency, including spaces and also punctuation if you used them in your grid. (Depending on how you applied the grille, you could actually be generating a simple permutation of the original text.) However, you would disrupt all the within-word correlations.

For example, in compound words derived from Latin there is a familiar pattern where ad C* ==> aCC* (where C is some arbitrary consonant), but that pattern would be completely obscured if the characters were read off a diagonal grille as shown in the photograph. You would still get the increased frequency for C, but not the common aCC pattern.

More subtly, there are some well known universals of syllable structure in natural languages, but those would be scrambled just as the aCC would be. You would have the right proportions of consonants and vowels, but not a realistic distribution within words.

Likewise, prefixes and suffixes would be scrambled. If it is a hoax generated by a Cardan grille, it should not have prefix/suffix patterns that occur commonly in many languages. (Ditto for suffixal inflections.) In fact, the letters appearing at the beginnings and ends of words should be a random sampling from the frequency distribution of letters in the whole text; this may be the easiest metric to check.

Also, by using spaces as characters in your grid you'd get the right proportion of spaces, and therefore the right average word length, but you would obscure any patterns in word length. Someone has already linked to studies of the word lengths in the manuscripts, but those assumed that the distribution of Latin word lengths word lengths would be preserved. However, only the average would be preserved. I suspect the distribution would be converted to a gaussian. Anyone got time for the experiment? (Notice that you may generate extra spaces with the grille, depending on how you use it. For example, what do you do when your grille starts running off the bottom of the page in your source text? Or, if your grille has 10 windows, do you transcribe to the first space and then move the grille, or do you transcribe everything in the grille and insert a "virtual" space for position 11? It looks to me like you might be able to generate the document's actual "word" lengths from Latin, given only some very basic assumptions.)

--
Sheesh, evil *and* a jerk. -- Jade
Can you say "Kolmogorov complexity"? by dido · 2003-12-30 22:50 · Score: 5, Interesting

One definition of randomness, and one that seems quite reasonable is that a string is "random" if it cannot be compressed to smaller than it is, i.e. listing its characters itself is the most compact possible description. Formally, a string is random if there exists no algorithm generating the string whose description on some universal Turing machine is smaller than the string itself (this is the definition used in the field of Kolmogorov complexity). A string of a billion digits making up Pi, for example, is not random by this definition, as one can easily write a short program, whose length would certainly be less than one billion characters, whose output is the digits of Pi. Think of it this way: the most general form of pattern matching device that we know of is a Turing machine, and if the best device you can construct to match that pattern is as complex or more complex than the pattern itself, then well, you have total randomness. Unfortunately, rigorously proving that a particular string is random by this very strong definition is extremely difficult, as you run into undecidability everywhere you turn.

This is the sort of stuff that real theoretical computer science is made of. For a very good overview of the theory of Kolmogorov Complexity and algorithmic information theory, Gregory Chaitin's home page is a good starting point

To go back to the Voynich manuscript, if there is some sort of regularity that can be discerned from it, then perhaps a context-free or context-sensitive (or something in between) language may be found to characterize it. Once you have such a syntactic characterization, perhaps it might be possible to divine the semantics from context. The shape of the grammar that results may well prove whether the Manuscript is in fact a real language, a fabrication, an elaborate cipher, or just total gibberish.

--
Qu'on me donne six lignes écrites de la main du plus honnête homme, j'y trouverai de quoi le faire pendre.