Slashdot Mirror


Making The Case That Voynich Is A Hoax

DeadVulcan writes "The Voynich Manuscript, a mysterious book of uncertain age, is widely believed to be written either in an unknown language or a long-lost encryption scheme. Nature reports that computer scientist Gordon Rugg has demonstrated that it's possible to generate a text like the Voynich manuscript -- containing language-like regularities, despite being potentially meaningless -- using cryptographic techniques of the time. This lends some support to those who claim that the book is a hoax."

27 of 382 comments

  1. The Salamander Papers by Saeed al-Sahaf · · Score: 5, Interesting

    Somebody is laughing a lot.. Remember way back the Salamander Papers?

    --
    "Who are in control, they are not in control of anything - they don't even control themselves!" - Glen Beck
    1. Re:The Salamander Papers by Timex · · Score: 2, Interesting

      I'd be surprised to find many that even KNOW that the Salamander Papers are related to the Church of Jesus Christ of Latter Day Saints...

      (I know this 'cause I was a member, once.)

      --
      When politicians are involved, everyone loses.
  2. Library of Babel by Mrs. Grundy · · Score: 5, Interesting
    This reminds me of a passage from Jorge Luis Borges' Library of Babel. In fact a lot reminds me of that story these days.

    Five hundred years ago, the chief of an upper hexagon (2) came upon a book as confusing as the others, but which had nearly two pages of homogeneous lines. He showed his find to a wandering decoder who told him the lines were written in Portuguese; others said they were Yiddish. Within a century, the language was established: a Samoyedic Lithuanian dialect of Guarani, with classical Arabian inflections. The content was also deciphered: some notions of combinative analysis, illustrated with examples of variations with unlimited repetition.
  3. Missing the fact.... by Zibi · · Score: 4, Interesting

    I think this report is missing the fact that if someone really wanted to make a hoax book, they could simply translate any other book (even the Bible) into a made-up language. If it's an obscure book, the likelihood that anyone would ever figure it out is slim.

    --
    -Zibi
  4. Beale Papers by Dan East · · Score: 4, Interesting

    Sounds a bit like the Beale Papers.

    Dan East

    --
    Better known as 318230.
  5. Ridiculous by SargeZT · · Score: 4, Interesting

    I'm sorry, but calling the Voynich Manuscript a hoax is unfeasible. Sure, could it in theory have been a hoax? Yes, but there is no point to this. The "hoaxer" creates this in 3+ months, with very accurate drawings, and probably hangs on to it till he dies, so that it can be sold to a king 100 years later and eventually make it to America? Then again, maybe Nostradamus wrote it.

    --
    And why did you staple the trout to the RAM?
    1. Re:Ridiculous by Seth Morabito · · Score: 5, Interesting

      The point of a hoax, in my opinion, would most likely have been financial gain.

      There is no clear evidence pointing to an exact date that the manuscript was written, and the only firm circumstantial evidence we have to go on is Marcus Marci's letter to Athanasius Kircher, which mentions that the manuscript was sold to King Rudolph for 600 ducats. That is a heck of a lot of money. It seems perfectly reasonable to me that someone manufactured the manuscript to extract 600 ducats from the emperor.

      This assumes a lot. It assumes that the letter is genuine, that the facts mentioned in the letter are true, and that Rudolph was the first buyer, so it is by no means a sure thing. But a lot of us who lean (gingerly) toward the hoax theory stand by Occam's Razor, which points to a hoax being at least a feasible, and probably even likely, solution. Rugg's analysis is just more circumstantial evidence, not proof, but every little bit tips the scale further.

  6. The pattern of nonsense by the end of britain · · Score: 4, Interesting

    The technique really is interesting. We have techniques that can identify patterns that are meaningful (all of cryptology, most of number theory, graph theory) but this application is neat because it is an effort to prove--rigorously--that a given set of data is just total noise.

    --
    "Oh, the tragedy of math gone wrong. I can't even talk about it." -Wil Wheaton http://www.wilwheaton.net
  7. Cryptonomicon has this by puzzled · · Score: 3, Interesting



    There is a portion of Cryptonomicon by Neal Stephenson where a real book of coded intercepts is replaced by random number strings encrypted with a fairly simple scheme.

    Does anyone know if this book is a seed for Stephenson's story? He draws an awful lot of information from the history of computing for his stories.

    --
    I am very easy to get along with, but I don't have time to waste being nice to people who are being stupid. -Theo
    1. Re:Cryptonomicon has this by Jonathan · · Score: 2, Interesting

      Well, a more likely inspiration for the "Cryptonomicon" manuscript mentioned in Cryptonomicon and Quicksilver is the Steganographia of Trithemius. In the late 1990's the book was briefly in the news because a well known cryptographer, Jim Reeds, found and deciphered a hidden message from it.

  8. Re:It's one thing to say something is a hoax... by sakusha · · Score: 2, Interesting

    RTFArticle. It is pretty clear that if the text can be produced by the algorithmic chart as described, it is meaningless gibberish.

    You remind me of Stanisław Lem's classic book "Memoirs Found in a Bathtub." It's about a society that revolves around codebreaking. Lem makes huge plot points about short texts that are ambiguously decodable into dozens of other possible texts. They are never sure if the message really IS a code, or whether one of the decoded versions contains further codewords. But everyone is absolutely convinced that everything is encoded, nothing is what it seems.

    And such is true of almost anything, leading to mental masturbation like The Bible Code. People WANT to believe it's real, but it's all a hoax.

  9. Re:It's one thing to say something is a hoax... by Bagheera · · Score: 2, Interesting

    Actually, the burden of proof would be on those who claim there is some meaning in it. Reading the article, and references to the manuscript, the "It's a hoax" proposition now has a plausible explanation as to how a hoax could be perpetrated, though that is not conclusive evidence.

    Anyone can say anything is a hoax but it takes scientific evidence - actual empirical data - to prove such a claim.

    Anyone can claim anything, but the more outrageous the claim the more evidence they need to support it. Someone could claim the book was the work of Aliens. That claim would take more conclusive evidence than "It was part of a clever scam." While this doesn't prove the hoax theory, it gives it more plausibility than simple supposition.

    Honestly, do you think it's more likely to be an authentic encoded manuscript of alchemy? Occam's Razor favors the hoax. To challenge your analogy, it's much more like a 25th century scholar looking back and saying Roswell was a hoax than Kitty Hawk was.

    --
    Never attribute to malice what can as easily be the result of incompetence...
  10. Re:repeats by PurpleFloyd · · Score: 3, Interesting
    Of course, that particular point isn't much, cryptographically. Ever since frequency analysis came into use, historical cryptographers used "nulls" in their codes - random meaningless characters which would hopefully cause trouble to frequency analysts. It may be that the manuscript's code contains keywords that the decoder should ignore (all repetitions of a word, for instance), or that instruct the decoder to perform a certain action (say, 3 repetitions means to skip the next three words).

    On the other hand, this certainly could be a hoax. After all, the author was familiar with cryptographic methods and was paid an enormous amount of money for the manuscript. The real truth could certainly be either hoax or reality - there simply aren't enough facts available to decide right now, despite the huge amount of work put into the manuscript by many talented amateur cryptographers.

    --

    That's it. I'm no longer part of Team Sanity.
  11. Interesting problem. by Black Parrot · · Score: 4, Interesting


    Those who read the article can take note of an interesting challenge: though Rugg has shown that it is possible to generate a high quality hoax using a Cardan grille, proving it to be a hoax may require producing a character grid that will actually generate large portions of the text. My question is, could that be done with a genetic algorithm, and are any Slashdotters up to the task?

    Also, a few comments about formal analysis. Notice that if you took some arbitrary text, typeset it in a fixed-width font to force the characters into columns, and then skimmed it with a grille in order to generate a new text, you would automatically preserve such basic statistics as character frequency, including spaces and also punctuation if you used them in your grid. (Depending on how you applied the grille, you could actually be generating a simple permutation of the original text.) However, you would disrupt all the within-word correlations.

    For example, in compound words derived from Latin there is a familiar pattern where ad C* ==> aCC* (where C is some arbitrary consonant), but that pattern would be completely obscured if the characters were read off a diagonal grille as shown in the photograph. You would still get the increased frequency for C, but not the common aCC pattern.

    More subtly, there are some well known universals of syllable structure in natural languages, but those would be scrambled just as the aCC would be. You would have the right proportions of consonants and vowels, but not a realistic distribution within words.

    Likewise, prefixes and suffixes would be scrambled. If it is a hoax generated by a Cardan grille, it should not have prefix/suffix patterns that occur commonly in many languages. (Ditto for suffixal inflections.) In fact, the letters appearing at the beginnings and ends of words should be a random sampling from the frequency distribution of letters in the whole text; this may be the easiest metric to check.

    Also, by using spaces as characters in your grid you'd get the right proportion of spaces, and therefore the right average word length, but you would obscure any patterns in word length. Someone has already linked to studies of the word lengths in the manuscripts, but those assumed that the distribution of Latin word lengths would be preserved. However, only the average would be preserved. I suspect the distribution would be converted to a Gaussian. Anyone got time for the experiment? (Notice that you may generate extra spaces with the grille, depending on how you use it. For example, what do you do when your grille starts running off the bottom of the page in your source text? Or, if your grille has 10 windows, do you transcribe to the first space and then move the grille, or do you transcribe everything in the grille and insert a "virtual" space for position 11? It looks to me like you might be able to generate the document's actual "word" lengths from Latin, given only some very basic assumptions.)
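
    For anyone who wants to try the experiment, here is a rough Python sketch. It is only an illustration under assumptions: the wrapped-diagonal read-off below is a stand-in for a grille, not a reconstruction of a real Cardan grille or of Rugg's table, and the helper names (to_grid, read_diagonals, word_length_counts) and the placeholder source text are made up for this example. Because the read-off visits every cell exactly once, it is a pure permutation, so character counts (spaces included) are preserved while the word-length histogram is free to change.

```python
from collections import Counter

def to_grid(text, width):
    """Lay the text out in fixed-width rows, padding the last row with spaces."""
    padded = text + " " * (-len(text) % width)
    return [padded[i:i + width] for i in range(0, len(padded), width)]

def read_diagonals(grid):
    """Read every cell exactly once along wrapped diagonals.

    Hypothetical stand-in for a grille; expects a non-empty grid.
    """
    rows, width = len(grid), len(grid[0])
    return "".join(grid[r][(start + r) % width]
                   for start in range(width)
                   for r in range(rows))

def word_length_counts(text):
    """Histogram of word lengths, splitting on runs of whitespace."""
    return Counter(len(word) for word in text.split())

if __name__ == "__main__":
    # Placeholder text; substitute a real Latin passage to run the experiment.
    source = "ad astra per aspera " * 20
    grid = to_grid(source, width=17)
    scrambled = read_diagonals(grid)
    # Same multiset of characters before and after the read-off.
    print(Counter("".join(grid)) == Counter(scrambled))
    # Word-length histograms of the source and of the scrambled text.
    print(sorted(word_length_counts(source).items()))
    print(sorted(word_length_counts(scrambled).items()))
```

    Swapping in a different read-off rule (fewer windows, moving the grille at the first space, and so on) only means replacing read_diagonals; the two histograms can then be compared directly.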

    --
    Sheesh, evil *and* a jerk. -- Jade
    1. Re: Interesting problem. by Black Parrot · · Score: 3, Interesting


      > In fact, the letters appearing at the beginnings and ends of words should be a random sampling from the frequency distribution of letters in the whole text; this may be the easiest metric to check.

      Actually, the distribution of initial letters might be preserved, or at least mostly preserved. If the source text is written so that lines always begin with a new word, and the grille is always aligned with the start of a line, then what you read out of the grille will preserve the frequencies of word-initial letters. But if you read more than one "word" out of the grille before moving it, you will get a mixture of the true word-initial distribution plus the distribution of all the letters in the document. And if you don't always align the grille to the start of a line, all bets are off.

      Off hand, I don't see any way that the distribution of word-final letters would be preserved. The first thing I would do to detect a hoax is compare that distribution to the distribution of all the letters in the document. If they are the same, then I would suspect the use of a grille or some other randomizer.
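
      A minimal sketch of that comparison in Python, under stated assumptions: the text is already transliterated into plain alphabetic tokens, words are split on whitespace, and total variation distance is just one convenient way to compare two distributions (the function names are invented for this example). A distance near 0 means word-final letters look like a random sample of all letters; a distance well above 0 means word endings have their own pattern.

```python
from collections import Counter

def letter_dist(letters):
    """Relative frequency of each letter in an iterable of letters."""
    counts = Counter(letters)
    total = sum(counts.values())
    return {ch: n / total for ch, n in counts.items()}

def total_variation(p, q):
    """Total variation distance between two distributions (0 = identical, 1 = disjoint)."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)

def final_vs_overall(text):
    """Distance between the word-final letter distribution and the all-letter distribution."""
    words = [w for w in text.lower().split() if w.isalpha()]
    finals = letter_dist(w[-1] for w in words)
    overall = letter_dist(ch for w in words for ch in w)
    return total_variation(finals, overall)
```

      How small a distance counts as "the same" depends on the length of the text, so in practice the number would need to be compared against the spread seen in shuffled versions of the same text.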

      --
      Sheesh, evil *and* a jerk. -- Jade
  12. Repeats? by plumby · · Score: 2, Interesting
    contains some features that are not seen in any language. The most common words are often repeated two or three times, for example - the equivalent of English using 'and and and'

    What about Chinese? From the little that I've learned, they often repeat a word for emphasis - e.g., Xie Xie meaning thank you.

  13. Can you say "Kolmogorov complexity"? by dido · · Score: 5, Interesting

    One definition of randomness, and one that seems quite reasonable is that a string is "random" if it cannot be compressed to smaller than it is, i.e. listing its characters itself is the most compact possible description. Formally, a string is random if there exists no algorithm generating the string whose description on some universal Turing machine is smaller than the string itself (this is the definition used in the field of Kolmogorov complexity). A string of a billion digits making up Pi, for example, is not random by this definition, as one can easily write a short program, whose length would certainly be less than one billion characters, whose output is the digits of Pi. Think of it this way: the most general form of pattern matching device that we know of is a Turing machine, and if the best device you can construct to match that pattern is as complex or more complex than the pattern itself, then well, you have total randomness. Unfortunately, rigorously proving that a particular string is random by this very strong definition is extremely difficult, as you run into undecidability everywhere you turn.

    This is the sort of stuff that real theoretical computer science is made of. For a very good overview of the theory of Kolmogorov Complexity and algorithmic information theory, Gregory Chaitin's home page is a good starting point.

    To go back to the Voynich manuscript, if there is some sort of regularity that can be discerned from it, then perhaps a context-free or context-sensitive (or something in between) language may be found to characterize it. Once you have such a syntactic characterization, perhaps it might be possible to divine the semantics from context. The shape of the grammar that results may well prove whether the Manuscript is in fact a real language, a fabrication, an elaborate cipher, or just total gibberish.

    --
    Qu'on me donne six lignes écrites de la main du plus honnête homme, j'y trouverai de quoi le faire pendre.
    1. Re:Can you say "Kolmogorov complexity"? by kasperd · · Score: 2, Interesting

      That definition of randomness does make sense. Unfortunately it is undecidable, so you can never prove something is random according to the definition. You can prove something is not random, if you can find a program generating it. But if you cannot find such a program, you don't know if it is because it doesn't exist, or if you just didn't look at the right one.
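
      The asymmetry can be illustrated with an off-the-shelf compressor. This is only a sketch: zlib is a stand-in for "some program that generates the string", and the helper names are invented here. If zlib shrinks the data, that is evidence of a short description (ignoring the constant cost of the decompressor itself); if it fails to shrink it, nothing is proven, because a better program might still exist.

```python
import os
import zlib

def compressed_size(data: bytes) -> int:
    """Length of the zlib-compressed form at the highest compression level."""
    return len(zlib.compress(data, 9))

def looks_compressible(data: bytes) -> bool:
    """True if zlib found a shorter description; False proves nothing about randomness."""
    return compressed_size(data) < len(data)

if __name__ == "__main__":
    patterned = b"ab" * 5000       # obvious regularity
    noise = os.urandom(10000)      # operating-system randomness
    print(len(patterned), compressed_size(patterned))
    print(len(noise), compressed_size(noise))
```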

      As for finding a language given the string, it isn't hard to find a regular language containing the string, the hard part is to find the right language. It is trivial to define a regular language that contains all strings. But in this case it probably isn't the right one.
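
      To make the point concrete, here is a small Python sketch using regular expressions as the regular languages (the function names are invented for this example). One language accepts every string and the other accepts exactly the given string; both "contain" the text and neither says anything about its structure.

```python
import re

def universal_language():
    """Regular language of all strings: matches anything, so it explains nothing."""
    return re.compile(r".*", re.DOTALL)

def singleton_language(text):
    """Regular language containing only the given string: equally uninformative."""
    return re.compile(re.escape(text))

if __name__ == "__main__":
    sample = "and and and"
    print(bool(universal_language().fullmatch(sample)))
    print(bool(singleton_language(sample).fullmatch(sample)))
    print(bool(singleton_language(sample).fullmatch(sample + " and")))
```

      The interesting languages lie somewhere between these two extremes, which is exactly the hard part.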

      --

      Do you care about the security of your wireless mouse?
  14. Re: Missing a (cryptographic) clue ... by Carewolf · · Score: 2, Interesting

    No, you underestimate the inherent limits of a structured language. The reasons you list are the reasons it might not be deciphered if it was a cryptographic language. If it is a natural language it would still fail.

    Imagine attacking common words and phrases. If you read an English text, you would quickly notice words like "the", "a", and "and", and, if it were a letter, words like "you" and "me". Once you have a large set of common words and phrases, you look at how they are placed and structured, and start making qualified guesses about their relationships.

    Basically, our cryptography today is so advanced that it not only can break most common encryptions, but can in fact break the differences between most languages if guided by human sense.

  15. It's ancient and indecipherable! by buckeyeguy · · Score: 2, Interesting
    Therefore it must be important! Eh, no. (See the Urantia Book for one example of why some old nonsense is better left aside.)

    Years ago I had a coworker who would blather on about the Urantia book and its 'answers'... but then he was an old stoner too.

    --
    I'd have a personalized plate on my car, but "toxic bachelor" won't fit into 7 letters.
  16. Arlet and the rec.puzzles archive by Mikey-San · · Score: 2, Interesting

    Here's a great little bit of information regarding Voynich:

    http://rec-puzzles.org/new/sol.pl/cryptology/Voynich

    Mmm, strangeness.

    --
    Mikey-San
    Karma: +Eleventy billion (mostly affected by watching Celebrity Jeopardy)
  17. Bible Code? by gillbates · · Score: 3, Interesting

    I do believe that there are "codes" in the Bible, but the reason is different than what the fanatics describe. My belief is that the Bible codes exist for only one reason: to ensure accuracy. Consider the following:

    The cat in the hat caught a rat and that was the end of that.

    Notice the rhyming. Now translated into Spanish (courtesy babelfish):

    El gato en el sombrero cogio una rata y ese era el final de eso.

    Now translated back into English:

    The cat in the hat took a rat and that one was the end of that.

    Okay, so notice in the original that the rhyming words appeared in positions 1, 4, 7, 9, and 14 (zero based). In the retranslation, the rhyming words appear in positions 1, 4, 7, 9 and 15. This disparity alone is enough to determine that the retranslation is not accurate.
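
    The position check is easy to reproduce. A small Python sketch, assuming that "rhyming word" simply means a word ending in "at" once punctuation is stripped (the function name is made up for this example):

```python
def rhyme_positions(sentence, ending="at"):
    """Zero-based positions of the words that end with the given rhyme."""
    words = [w.strip(".,;:!?").lower() for w in sentence.split()]
    return [i for i, w in enumerate(words) if w.endswith(ending)]

if __name__ == "__main__":
    original = "The cat in the hat caught a rat and that was the end of that."
    round_trip = "The cat in the hat took a rat and that one was the end of that."
    print(rhyme_positions(original))    # [1, 4, 7, 9, 14]
    print(rhyme_positions(round_trip))  # [1, 4, 7, 9, 15]
```

    The two lists differ only in the last entry, which is the disparity described above.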

    Supposing that one writes in such a manner that there is a definitive pattern to their sentences and word choices, it is easy to determine the accuracy of a text after having gone through many translations. For a book such as the Bible, this was of paramount importance. I believe the original purpose of the "Bible codes" was to ensure that the meaning of scripture was not lost as it was passed from one generation to the next.

    Consider for example, the poem. If a poem is incorrectly copied, it no longer rhymes, or the meter is disrupted. This simple mechanism not only ensures easy memorization, but provides a security against unintended alteration. In much the same manner, the "Bible codes" have provided scholars a way of discerning the accuracy of a copy of scripture. In fact, some of scripture is indeed poetic, further reinforcing the confidence in the original scriptures.

    I find it somewhat interesting that lossless copying was available long before digital electronics were invented.

    --
    The society for a thought-free internet welcomes you.
    1. Re:Bible Code? by Samrobb · · Score: 2, Interesting
      Also, at the time the books in the bible were written, accurate transcription wasn't considered nearly as important as it is today.

      Sorry - you're wrong, particularly in terms of the writings that make up the Old Testament. The requirements for copying these texts were pretty stringent. Requirements 4, 6, and 7 are particularly interesting:

      The Talmud lists the following rules for copying the Old Testament:
      1. The parchment had to be made from the skin of a clean animal, prepared by a Jew only, and was to be fastened by strings from clean animals.
      2. Each column must have no less than forty-eight or more than sixty lines.
      3. The ink must be of no other colour than black, and had to be prepared according to a special recipe.
      4. No word nor letter could be written from memory; the scribe must have an authentic copy before him, and he had to read and pronounce aloud each word before writing it.
      5. He had to reverently wipe his pen each time before writing the Word of God, and had to wash his whole body before writing the sacred name Jehovah.
      6. One mistake on a sheet condemned the sheet; if three mistakes were found on any page, the entire manuscript was condemned.
      7. Every word and every letter was counted, and if a letter were omitted, an extra letter inserted, or if one letter touched another, the manuscript was condemned and destroyed.

      Can't recall the reference at the moment, but I have come across mention that over the course of nearly a thousand years of transcription, there is a staggering lack of transcription errors in the Hebrew texts.

      --
      "Great men are not always wise: neither do the aged understand judgement." Job 32:9
  18. Research project in progress... by oneiron · · Score: 2, Interesting

    There is a serious research project in progress which is trying to get to the bottom of this mystery. If you can look past the occasional conspiracy-theorist kook, there are actually quite a few thoughtful and intelligent folks participating. Here is the discussion thread for the project:

    Voynich Manuscript Research Project @ AboveTopSecret.com

    Note: Some of the other research projects are pretty interesting, also. In particular, the Yellowstone Super-Caldera Research Project.

  19. Was the Cryptonomicon based on the Necronomicon? by mbone · · Score: 2, Interesting

    The "Cryptonomicon" has an obvious linguistic similarity to the "Necronomicon" of H.P. Lovecraft. Colin Wilson later wrote sci-fi / horror stories that included Lovecraft and which stated that the Voynich Manuscript was actually one copy of the Necronomicon.

    I have no idea if Stephenson knew this, but given the similarity of names, I would suspect so.

    More details can be found here.

  20. Re:repeats by Anonymous Coward · · Score: 1, Interesting
    Another example:

    "She said that that 'that' that that boy used was wrong."

    The very common English word "that" is repeated five times in sequence. Granted, that is not a common sentence but coming across such an uncommon sentence in an English text does not mean it was forged.

  21. Surprised this didn't come up by GMFTatsujin · · Score: 2, Interesting

    The Solution of the Voynich Manuscript by Leo Levitov was published by the Aegean Press in 1987. Links to Amazon.com are left as an exercise to the Slashdot readership.

    Levitov provides methodology for extracting the linguistic model that the book encodes. Many examples and translations are provided, and there is plenty of work for the reader to do if he wants to prove the system to himself.

    Levitov proposes that his solution reveals a manual of heretical text regarding the ease and assistance of the mortally ill into death -- euthanasia, basically. To my knowledge, his work has not been discredited, only ignored.

    For the definitive hoax-type artificial reality book, check out the amazing Codex Seraphinianus.