Can You Raed Tihs?
An aoynmnuos raeedr sumbtis: "An interesting tidbit from Bisso's blog site: Scrambled words are legible as long as first and last letters are in place. Word of mouth has spread to other blogs, and articles as well.
From the languagehat site: 'Aoccdrnig to a rscheearch at an Elingsh uinervtisy, it deosn't mttaer in waht oredr the ltteers in a wrod are, the olny iprmoetnt tihng is taht frist and lsat ltteer is at the rghit pclae. The rset can be a toatl mses and you can sitll raed it wouthit porbelm. Tihs is bcuseae we do not raed ervey lteter by it slef but the wrod as a wlohe. ceehiro.'
Jamie Zawinski has also written a Perl script to convert normal text into text where letters excluding the first and last are scrambled."
So d__s t__s m__n t__t we d_n't n__d t_e m____e l____s at all?
End of lesson. You may press the button.
I showed this to a student here who is native to Indonesia, so English is not her first language, and she had a very difficult time reading it. Any thoughts on why this might be so tied to your native tongue? I would have thought that anyone fluent in English (which she is) would be able to read the post without much difficulty.
D
This meme has been kicking around blogland for a couple of days, and it definitely seems to be true. The only part of the above paragraph that was difficult to read was the sentence, "the olny iprmoetnt tihng is taht frist and lsat ltteer is at the rghit pclae".
Normally I would never post a comment about grammar, but it is kind of startling that, in a block of text that jumbled, the absence of 'the' and the swapping of 'is' for 'are' still jump out at you.
Yuor porgarm has a falw. It csnoiders pinctuouatn mkars as ltteers and tuhs any word wtih a pntctuuaion mark at the end wlil condeisr the fanil mark to be the lsat letter.
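For what it's worth, the punctuation flaw described above can be avoided by shuffling only runs of letters. This is a minimal sketch, not jwz's script: the names scramble_word and scramble_text are made up for illustration, and it assumes plain ASCII letters, so "don't" is treated as the two runs "don" and "t".

```python
import random
import re

def scramble_word(word, rng):
    """Shuffle the interior letters of one all-letter word."""
    if len(word) < 4:
        return word  # fewer than two interior letters: nothing to shuffle
    middle = list(word[1:-1])
    rng.shuffle(middle)
    return word[0] + "".join(middle) + word[-1]

def scramble_text(text, rng=None):
    """Scramble every run of ASCII letters; punctuation and spaces stay put."""
    rng = rng or random.Random()
    return re.sub(r"[A-Za-z]+", lambda m: scramble_word(m.group(0), rng), text)

if __name__ == "__main__":
    print(scramble_text("Punctuation marks, like these, are not letters!"))
```

Because the regular expression only ever hands letter runs to the shuffler, a trailing comma or exclamation mark can never be mistaken for the last letter.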
"Don't mind me cutting myself on Occam's Razor"
Understanding a language is only 50% comprehension. The other 50% is being able to predict what will come next based on previous experience. This is especially important in spoken language, because the brain simply does not have the power to parse each word separately in real time.
So while it is possible to understand words that are not spelled correctly, it can still take a while to understand if the nxet few wdors are not qieut waht you epcext. It is aslo mcuh lses pbatldicree wehn you use lgenor wdros.
I hpoe tihs was an imuilntinag eplamxe!
Mclettat
the mistake is in saying that the unscrambling is
done at the word level. jump your eyes randomly into
the text and try to read just one word in isolation.
as someone on cogling@ucsd pointed out, there are
also a bunch of non-scrambled key words that help
your brain figure out what the in-betweens should
be. anyhow, point being that it's not a feature
of word recognition that you can read it, but rather
a feature of higher-level reconstruction.
mt
The "consonant pairs" seem to always be still paired in these words.
If I type
sllpenig it's clear I'm typing "spelling"
but, if I type
slpenlig it's not so clear anymore.
What about: according
Aoccdrnig (as in the article) is ok but...
aocdrncig is not nearly as clear
There's a limit to how far your brain can stretch it. Some consonant pairs your brain DOES interpret much like a single letter, because it's an irregularity in English.
Words that use such consonant pairs and triplets like "tch" are much harder to distinguish when those pairs and triplets (which really sound like a single letter) are split.
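One way to test that idea would be a scrambler that treats such groups as a single unit. A rough sketch follows; the list of groups in UNITS (tch, ch, sh, th, ck, and any doubled letter) is my own guess at what counts as "one sound", and scramble_keeping_units is a hypothetical name.

```python
import random
import re

# Hypothetical set of letter groups to keep together. Longer groups come first
# so "tch" wins over "ch"; ([a-z])\1 matches any doubled lowercase letter.
UNITS = re.compile(r"tch|ch|sh|th|ck|([a-z])\1|.")

def scramble_keeping_units(word, rng):
    """Shuffle the interior of a word, moving each letter group as one piece."""
    if len(word) < 4:
        return word
    units = [m.group(0) for m in UNITS.finditer(word[1:-1])]
    rng.shuffle(units)
    return word[0] + "".join(units) + word[-1]

if __name__ == "__main__":
    rng = random.Random()
    for w in ("spelling", "according", "watching"):
        print(w, "->", scramble_keeping_units(w, rng))
```

With this, "spelling" can come out as "sllpenig" but never as "slpenlig", which makes it easy to generate both kinds of text and compare how readable they are.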
Stewey
There are 10 kinds of people in the world. Those who understand binary and those who don't.
Can anyone track down an authoritative source for this?
Bisso got it from languagehat. Bisso also cites a Nature article that may be related; however, the Nature article clearly deals with hearing time-reversal of segments of spoken sentences, not reading mangled written words. languagehat cites Avva, who languagehat admits doesn't give a source; I can't get to the Avva entry at the moment.
Never take moderation advice from sigs, including this one.
Please go and feed the the cat.
Bet ya didn't see that, did ya?
Re-read it slowly.
-dave-
The pig browse. With Google. Sigh is to the chicken. Chicken is fool. Giggle. The DailyWTF giggle.
While at University I thought I'd take some Xhosa courses and eventually packed it in because I was struggling so much to read Xhosa, though I could speak it better than most of the other kids.
This leads me to think that once one builds a certain familiarity with any language, one can cope with the scramble.
To me, the most interesting part of this discovery/research is that it might find a way to help dyslexic kids. I sure hope so.
Engineering is the art of compromise.
I noticed that compression is worse using scrambled text:
./scrmable.pl genesis.txet
[anthonym@uniblab scrbameld]$
[anthonym@uniblab scrbameld]$ gzip g*
[anthonym@uniblab scrbameld]$ ls -l
total 304
-rwxr-xr-x 1 anthonym staff 63830 Sep 15 16:33 genesis.text.gz
-rw-r--r-- 1 anthonym staff 84945 Sep 15 16:36 genesis.txet.gz
-rwxr-xr-x 1 anthonym staff 1396 Sep 15 15:56 scrmable.pl
[anthonym@uniblab scrbameld]$ gunzip g*
[anthonym@uniblab scrbameld]$ zip genesis.zip g*
adding: genesis.text (deflated 70%)
adding: genesis.txet (deflated 60%)
[anthonym@uniblab scrbameld]$
Interesting. Anyone have an explaination for tihs?
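If anyone wants to reproduce this without the Perl script, here is a small sketch using Python's zlib module, which implements the same DEFLATE method that gzip and zip use. The sample text is a made-up stand-in (not the genesis file from the listing), and scramble_text and deflated_size are illustrative names.

```python
import random
import re
import zlib

def scramble_text(text, rng):
    """Shuffle the interior letters of every run of ASCII letters."""
    def scramble(match):
        word = match.group(0)
        if len(word) < 4:
            return word
        middle = list(word[1:-1])
        rng.shuffle(middle)
        return word[0] + "".join(middle) + word[-1]
    return re.sub(r"[A-Za-z]+", scramble, text)

def deflated_size(text):
    """Size in bytes after DEFLATE compression at the highest level."""
    return len(zlib.compress(text.encode("utf-8"), 9))

if __name__ == "__main__":
    # Stand-in sample; any long English text with recurring words will do.
    sample = "the researchers believed that ordinary readers recognise whole words. " * 200
    scrambled = scramble_text(sample, random.Random())
    print("original :", deflated_size(sample))
    print("scrambled:", deflated_size(scrambled))
```

A plausible explanation: DEFLATE gets most of its savings from finding strings it has already seen, and a word that is shuffled differently every time it occurs no longer repeats exactly.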
A programmer is a machine for converting coffee into code.
This is so darn old... I thought Slashdot was bleeding edge! Here is the original forward FYI:
Titled: Do Spellings Matter?
"... randomising letters in the middle of words [has] little or no effect on the ability of skilled readers to understand the text. This is easy to denmtrasote. In a pubiltacion of New Scnieitst you could ramdinose all the letetrs, keipeng the first two and last two the same, and
reibadailty would hadrly be aftcfeed. My ansaylis did not come to much beucase the thoery at the time was for shape and senqeuce retigcionon.
Saberi's work sugsegts we may have some pofrweul palrlael prsooscers at work. The resaon for this is suerly that idnetiyfing coentnt by paarllel
prseocsing speeds up regnicoiton. We only need the first and last two letetrs to spot chganes in meniang"
And if you liked *that* one so much, you might like this one too:
Read the sentence below carefully:
"I do not know where family doctors acquired illegibly perplexing handwriting nevertheless, extraordinary pharmaceutical intellectuality counterbalancing indecipherability, transcendentalizes intercommunications' incomprehensibleness".
This is a sentence where the Nth word is N letters long.
e.g. 3rd word is 3 letters long, 8th word is 8 letters long and so on.
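If you don't feel like counting by hand, a few lines of Python will check the claim. This is just a sketch: it counts letters only, so the comma and the trailing apostrophe are ignored, and the function names are mine.

```python
import re

SENTENCE = (
    "I do not know where family doctors acquired illegibly perplexing "
    "handwriting nevertheless, extraordinary pharmaceutical intellectuality "
    "counterbalancing indecipherability, transcendentalizes "
    "intercommunications' incomprehensibleness"
)

def word_lengths(sentence):
    """Length of each whitespace-separated word, counting letters only."""
    return [len(re.sub(r"[^A-Za-z]", "", word)) for word in sentence.split()]

def is_nth_word_n_letters(sentence):
    """True when the 1st word has 1 letter, the 2nd has 2, and so on."""
    lengths = word_lengths(sentence)
    return lengths == list(range(1, len(lengths) + 1))

if __name__ == "__main__":
    print(word_lengths(SENTENCE))
    print(is_nth_word_n_letters(SENTENCE))
```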
And if you like that one too, here is another one you can try to kill your boredom...
While sitting, draw clockwise circles on the ground with your right foot. While doing that, try drawing the number "6" in the air with your right hand.
Your foot will change direction.
Actually, this post was more readable for me than the article or many other posts in this discussion. I was quite amazed at how switching the bookend letters made the whole word look backward, but recognizably so, as if all the letters had been reversed. And reading a word backwards (at least for me) is an even easier task than reading the scrambled middle letters (which, I'll admit, was surprisingly easy).
Jacob Fugal
My parents are both teachers, and one of the most tiresome quarrels in education is the Phonics vs. Whole-Word debate. Do you teach someone to read by teaching them how to sound out syllables (phonemes)? Or do you teach them to recognize whole-word patterns by rote?
Experimentally, a pure-phonics approach has proven to have the highest success rate. However, these results would suggest that the whole-word approach *does* map onto some important cognitive structure. Perhaps this means that, once past the basic level, whole-word techniques would prove to be valuable in turning beginning readers into advanced readers.
Enojy :)
#!/usr/bin/perl -p
# scram: scrambles the innards of words
# Usage: scram <input-text >scrambled-text
# Craig Berry (20030915)
s/
  ([a-z])      # Initial letter
  ([a-z]{2,})  # Two or more middle letters
  ([a-z])      # Final letter
 /$1 . shuffle($2) . $3/egix;
# Fisher-Yates shuffle
sub shuffle {
  my @chars = split //, shift;
  my $i = @chars;
  while ($i) {
    my $j = rand $i--;
    @chars[$i, $j] = @chars[$j, $i];
  }
  return join '', @chars;
}
When all you have is a hammer, everything looks like a skull.
This is a common thing when learning speed-reading. You basically do the same thing, but ignore the rest of the word and intuitively know what the word was from the other words in the sentence.
However, it also makes reading out-loud difficult when you are used to skipping words when you read them.
Maybe we DID take the blue pill. You wouldn't remember anyway.
I brought this up over at ScienceForums yesterday, and someone pointed to the mentioned article that says: "They wrote up their results in the 29 April 1999 issue of Nature, but I've been unable to find it online."
The original article that particular blog is based on can be found here
Abstract is here
and full text (HTML and PDF w/ images) for those without access to Nature is here
However, this research was done on words that are reversed, not internally scrambled. I have been unable to locate research on the letter order within longer words, but the principle is accurate and I'm sure such research exists.
I work in a lab. A while back, we did a useability(sp?) study on user interfaces.
We were trying to figure out why text messaging on phones is such a hit in Japan, and yet everyone over here thinks it's rather clumsy.
The study basically pointed out that saying something like "I love you" requires you to "type" a lot of characters to convey that message. Using Kanji, one or two characters will suffice. I should've known (being married to a Chinese person), but after I thought about it, it makes a lot of sense. I have flashbacks of watching old Chinese movies, and seeing the characters say a few characters, and the English subtitles would be a paragraph long.... And conversely, watching English movies, the guy rambles on and on, and the subtitles contain a handful of Chinese characters...
Is this an irablommensucne pomenologicahenl hsiotheyps or the matiostenifan of the igibilitieltny of imatioidc iialistividundc icationommunitercns?
Is tihs an iommensurablnce paogicolenomenhl hesipothys or the mfestatioinan of the iilitteligibny of iiomatidc iividualistidnc icommunicationrntes?
Is tihs an isurablcommenne phenomenological hesipothys or the manifestation of the iigibilitlteny of imatiodic iistilduadivinc iationccommuniernts?
Is tihs an immensurablcone pologicaenmnohel hhesiypots or the mifestationan of the iibilitglienty of itiadiomc iividualistidnc intercommunications?
Is tihs an immensurablocne penomenologicahl hipothesys or the mnifestatioan of the iteligibilitny of iatimdioc iidualistindivc iommunicationcterns?
Is this an icommensurablne phenomenological hhesipotys or the matioanifestn of the ibilitielignty of iatidiomc istialidividunc intiorcommunicaetns?
Is this an imensurablcomne pologicaennomehl hesipothys or the mestatioifnan of the iiliteligibtny of itiamiodc ialistividundic iationtercommunicns?
Is tihs an ilommensurabcne phenomenological hothesiyps or the mestatiofanin of the inteligibility of iomatidic idividualistinc iationcommunicertns?
Is this an isurablmenmncoe penologicaenomhl hipothesys or the mtionifestaan of the iitbilnteligiy of iiomatidc ialistidividunc itioncommunicaertns?
Is this an icommensurablne pnomenologicahel hothesiyps or the mfestatioanin of the ilitibinteligy of iomatiidc iidualistividnc inicationntercommus?
Is this an irablmensucomne pgicaonomenolhel hypothesis or the mestatioanifn of the igibilitnteliy of iomatiidc ilistividuadinc intercommunications?
Is this an immensurablcone pnologicahenomel hisypothes or the matioifestann of the itlinteligibiy of idiomatic iistindividualc inotercommunicatins?
As a graphic design student, I have been taught that it is more difficult to read blocks of text that have been made in ALL CAPITAL LETTERS. The reason for this is that the ascenders (pieces going up from the main body of the word, like the top of the "d" in "word") and descenders (like "y" in "you") help us to see the word at a glance. In effect, once we have gotten used to reading the English language, we no longer read letters at all, but words as whole characters. Even when the middle letters are scrambled, the letters have almost the same shape. I would like to see someone try this little experiment with capital letters, as I doubt it would work nearly as well.
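To make the outline idea concrete, here is a crude sketch that reduces a word to its "shape". The letter classes are my own rough assumption (ascenders b d f h k l t, descenders g j p q y, everything else x-height, every capital the same full height), and shape is a made-up helper, not a real typographic measure.

```python
ASCENDERS = set("bdfhklt")   # assumed: letters rising above the x-height
DESCENDERS = set("gjpqy")    # assumed: letters dropping below the baseline

def shape(word):
    """Map each character to a class: a=ascender, d=descender, x=x-height, C=capital."""
    classes = []
    for ch in word:
        if ch.isupper():
            classes.append("C")   # all capitals share one height
        elif ch in ASCENDERS:
            classes.append("a")
        elif ch in DESCENDERS:
            classes.append("d")
        elif ch.isalpha():
            classes.append("x")
        else:
            classes.append(ch)    # leave digits and punctuation as they are
    return "".join(classes)

if __name__ == "__main__":
    for w in ("word", "wrod", "WORD", "YOUR"):
        print(w, shape(w))
```

Under these assumptions "word" and "wrod" share one outline, while in capitals "WORD" and "YOUR" are indistinguishable by outline alone, which is at least consistent with the guess that the trick would work less well in all caps.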
Also... what happens when the scrambled word is another valid word?
This sounds like a good way to confuse the ole word detector. Four variants spring to mind for an original word of n letters. In all variants, hold the first and last character constant and mix the interior letters. First, can a new n-letter word be formed? Second, can a new (n-1)-letter word be formed including the original first letter, but excluding the original last letter? Third, can a new (n-1)-letter word be formed including the original last letter, but excluding the original first letter? Fourth, can a new (n-2)-letter word be formed excluding both the original first and last letters? I suppose that if n is large (e.g. >= ~7), the pattern could be continued or multiple new valid words might be formed from the n letters.
The resulting false clues should tend to mislead the reader.
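The first variant is easy to hunt for mechanically: two words collide when they share a first letter, a last letter, and the same interior letters in some order. A sketch, assuming you supply your own word list (the short list here is only for illustration, and the function names are mine):

```python
from collections import defaultdict

def scramble_key(word):
    """Words with equal keys are interior scrambles of one another."""
    w = word.lower()
    return (w[0], "".join(sorted(w[1:-1])), w[-1])

def collision_groups(words):
    """Group words of 4+ letters that can be scrambled into each other."""
    groups = defaultdict(set)
    for word in words:
        if len(word) >= 4:
            groups[scramble_key(word)].add(word.lower())
    return sorted(sorted(group) for group in groups.values() if len(group) > 1)

if __name__ == "__main__":
    sample_words = ["salt", "slat", "form", "from", "calm", "clam",
                    "trail", "trial", "word"]
    for group in collision_groups(sample_words):
        print(group)
```

The other three variants (dropping the first letter, the last letter, or both) could be handled the same way by computing the key on the shortened letter set, but that is left out here.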
Another way to fool the old noggin would be to start with a misspelled original.
"We reject as false the choice between our safety and our ideals." --The American President (20.1.2009)
Funny, at first I thought all the words were just backwards. When I started to read them as such, it made a lot more sense.
An experiment like this might be better performed with single words instead of entire sentences, as the human mind excels at finding and deciphering patterns.
When a thing has been said, and said well, have no scruple. Take it and copy it. --Anatole France
Nit: Huffman coding is just a technique for taking a symbol alphabet with an associated probability model and generating a prefix-free binary code with minimal expected length.
It is not a compression algorithm, though it often appears as the last step in a compression algorithm. In particular, it doesn't deal with the problem of how you generate the probability model, or what your symbol alphabet is.
The gzip algorithm, for example, uses "Huffman compression" just fine, but it still does poorly on scrambled text.
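A quick way to see the distinction: if the symbols are single characters, scrambling inside words leaves every character count unchanged, so a character-level Huffman code costs exactly the same number of bits before and after. The sketch below assumes that per-character model; huffman_total_bits and scramble_text are illustrative names, and this is not how gzip chooses its symbols.

```python
import heapq
import random
import re
from collections import Counter

def huffman_total_bits(text):
    """Bits needed to encode text with a Huffman code over its characters."""
    heap = list(Counter(text).values())
    if len(heap) < 2:
        return len(text)  # a lone symbol still costs one bit per occurrence
    heapq.heapify(heap)
    total = 0
    while len(heap) > 1:
        merged = heapq.heappop(heap) + heapq.heappop(heap)
        total += merged  # each merge adds one bit to every symbol beneath it
        heapq.heappush(heap, merged)
    return total

def scramble_text(text, rng):
    """Shuffle the interior letters of every run of ASCII letters."""
    def scramble(match):
        word = match.group(0)
        if len(word) < 4:
            return word
        middle = list(word[1:-1])
        rng.shuffle(middle)
        return word[0] + "".join(middle) + word[-1]
    return re.sub(r"[A-Za-z]+", scramble, text)

if __name__ == "__main__":
    text = "scrambled words are legible as long as first and last letters stay put"
    scrambled = scramble_text(text, random.Random())
    print(huffman_total_bits(text), huffman_total_bits(scrambled))
```

So whatever gzip loses on scrambled text is lost before the Huffman stage, in the part that looks for repeated strings.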
sub f{($f)=@_;print"$f(q{$f});";}f(q{sub f{($f)=@_;print"$f(q{$f});";}f});
Yes, but it brings up interesting lossy text compression, where you can rearrange the middle of the words to reduce the compressed file size. Kinda like MP3 or JPG for your reading.
Code poet, espresso fiend, starter upper.
I just tried this with some co-workers! They still don't believe it.
Chinese is ideographic, and Japanese combines Chinese ideograms ("kanji") with phonetic syllable signs ("kana"). Korean has an actual alphabet ("hangul"), except that instead of the letters coming in a row as in Latin, Cyrillic, or Hebrew, each syllable is packed into a box. Korean used to be written with borrowed Chinese ideograms, but nowadays the alphabet dominates writing.
You can read more about Hangul, but you may have to have Korean support installed on your OS to display the Hangul characters.
Will I retire or break 10K?
Before he did i think ;)
:/
http://junglist.org/jumble.php
src @ http://junglist.org/jumble.php
too bad i am not cool like jwz
You know, I'm not so sure about that -- is a license rendered invalid just because it contains spelling errors? I strongly suspect not.
(Anyway, the copyright is enforceable because everything is copyrighted by default, even if it has no notice at all. The interesting question is whether the license I put on that thing actually grants you any rights. I think it probably does.)
Turhgoh = Through
A topic that does not seem to have had much coverage in this article is the actual iconic visual recognition that our brains appear to use in word recognition.
Obviously each word approximates a patterned rectangle (serif fonts emphasize this further) with occasional outliers (i.e. t, y, l, and any other letters that protrude above or below the base rectangle).
People with poor eyesight rely on this fuzzy but fast recognition frequently. In fact there is a classic psych experiment based around displaying a word that iconically is very similar to another word, while simultaneously presenting a context that implies the second word, and asking the subject to record the word. The subject mis-records the word roughly 90% of the time.
Q.
Insert Signature Here
I noticed, while testing the script out with a paper I happen to be in the process of writing, that compound words do not seem to work with this scheme. Though I'm hardly a linguist, it may be a result of the compound word being translated separately and then placed together when we read it. When the letters intermingle, we aren't able to differentiate the two halves.
Examples from the paragraph I tested with are "worldview", "afterlife", and "humankind". I'm sure iterations that keep the halves partially separate would be readable, but ones I came up with (like "wirovdelw") simply make no sense.
Other, larger words that I've noticed do not work are "consciousness" and "unenlightened", though I'm sure it isn't unusual to expect large words to begin to obfuscate themselves too much.
This doesn't explain the shorter words that seem to obfuscate very readily, such as "religion" and "autonomous". Once letters and/or vowels become repetitious and clump together, the word seems to be more difficult to readily decrypt. I can also confirm this is true from my experience of occasionally playing TextTwist on Yahoo! Games.
(end random paper-avoiding post)
1. At what age does this manifest itself?
2. Does this work in other languages? I am guessing Japanese (at least) would not work....
3. What implications does this have for cryptology, in that you can't look for strings anymore?
Big Bonus question:
4. If 2 is false, in that it doesn't work for other languages, is this intrinsic property of English the reason that English has become the language of global business, or is it simply a by-product of English being spoken by those who sailed the world and conquered the world (British and American Imperialism)? i.e., because English is recognisable after mangling, is that the reason that it is so "popular"?
Inquiring minds want to know....
andy
Modern English is the offspring of many different older languages (as you may know). These languages all had varying ways of representing different sounds with the alphabet given to them by the Romans. English took all of these methods and combined them into one language. Thus, there are many different ways of creating the same sound, or phoneme.
Therefore, English does not encode the spoken language into text exactly. Though there are some sounds that can only be created one way ('ng' and 'ch' come to mind), many can be spelled numerous ways. For example: whir, were, and work have the same sound in them, but are spelled differently. This makes spelling words in English more difficult, but makes identifying misspelled words easier. You could say English now comes with error-correction. This has no doubt helped it remain in existence, despite its lack of consistent grammar rules and general lack of user-friendliness.
Disclaimer: I blame any grammatical or logical errors on my lack of sleep. Now I'm going to bed.
that that is is that that is not is not
Hi, I just tried that script with a German Tagesschau.de article, which reads like this:
Viele altere Lehrer, zu groBe Klassen, zu wenig Studienanfanger - Deutschland hat laut der jungsten OECD-Bildungsstudie in vielen Bereichen weiter Nachholbedarf. Die zu geringe Zahl der Studenten ist nach Ansicht der OECD auch fur die aktuelle Wirtschaftsschwache mitverantwortlich.
Scrambled, it's like this:
Veile altere Lheerr, zu groB Ksselan, zu wineg Sninuetdafanegr - Dnulsaechtd hat laut der junstgen OECD-Biudgdsunsitle in vielen Bechreein weteir Nohahlbeacrdf. Die zu gniegre Zhal der Sutednetn ist nach Ahscnit der OECD acuh fur die akleulte Wftrchhssacstiwache martcrwntvoieilth.
German words get too long, unlike English.
Wll, wht abt vwls? Ths r nncssry mst f th tm, t. N fct, nc y gt rd f th vwls nd mddl lttrs, y cn s hw trly wstfl th nglsh lngg rlly s!
This reminds me of that old programming axiom:
Every program has at least one bug.
Every program can be reduced in size by at least one instruction.
Therefore, by induction every program can be reduced to one instruction which doesn't work.
"Lawyers are for sucks."
- Doug McKenzie
Iltnsegnetiry I'm sdutynig tihs crsrootaivnel pnoheenmon at the Dptmnearet of Liuniigctss at Absytrytewh Uivsreitny and my exartrnairdoy doisiervecs waleoetderhlhy cndairotct the picsbeliud fdnngiis rrgdinaeg the rtlvaeie dfuictlify of ialtnstny ttalrisanng steennces. My rsceeerhars deplveeod a cnionevent ctnoiaptorn at hnasoa/tw.nartswdbvweos/utrtep:k./il taht dosnatterems that the hhpsteyios uuiqelny wrtaarns criieltidby if the aoussmpitn that the prreoecandpne of your wrods is not eendetxd is uueniqtolnabse. Aoilegpos for aidnoptg a cdocianorttry vwpiienot but, ttoheliacrley spkeaing, lgitehnneng the words can mnartafucue an iocnuurgons samenttet that is vlrtiauly isbpilechmoenrne.
:)
Or, if you prefer...
Interestingly I'm studying this controversial phenomenon at the Department of Linguistics at Aberystwyth University and my extraordinary discoveries wholeheartedly contradict the publicised findings regarding the relative difficulty of instantly translating sentences. My researchers developed a convenient contraption at http://www.aardvarkbusiness.net/tool that demonstrates that the hypothesis uniquely warrants credibility if the assumption that the preponderance of your words is not extended is unquestionable. Apologies for adopting a contradictory viewpoint but, theoretically speaking, lengthening the words can manufacture an incongruous statement that is virtually incomprehensible.