The Science of Word Recognition

Posted by michael on Wednesday September 1, 2004 @09:02PM from the paris-in-the-the-spring dept.

neile writes "I stumbled across a fascinating paper over at the Microsoft Typography site today that provides a really nice overview of the different theories on how humans read. If you thought we read by recognizing word shapes, think again! With the assistance of fancy eye-tracking cameras researchers have been able to devise several clever experiments to give us new insight into how reading works." We've linked to some of Larson's work previously.

Reduced Redundancy by plasticmillion · 2004-09-01 21:26 · Score: 3, Informative

This got slashdotted!? The idea of recognizing words by "word shape" seems so silly to me that I almost feel as if the author is attacking a straw man rather than a widely accepted linguistic theory.
The final conclusions are similar to what I learned in my college linguistics classes 15 years ago. Language contains a lot of redundancy. The reason is that we often encounter situations of so-called "reduced redundancy". For example, someone might have sloppy handwriting so you can't make out all of the letters. Or you might be talking to someone while they brush their teeth. If language were highly optimized, we wouldn't understand a thing in these situations, but because of redundancy we can usually communicate very effectively.
The same applies to reading. The conclusions of the paper seem trivial to me. Of course, reading exploits "visual" and "contextual" information. How else would we understand a sentence like "The boy ate a ham___er" (with a few letters obscured)?
The fact that the brain's neural net adds up the weighted lexicographic, syntactic, semantic (and even pragmatic) information available to it in order to interpret language should be familiar to anyone who's read Goedel, Escher, Bach. And that was published in 1979...

--
Peer Pressure
1. Re:Reduced Redundancy by TheWormThatFlies · 2004-09-01 23:00 · Score: 3, Informative
  
  This got slashdotted!? The idea of recognizing words by "word shape" seems so silly to me that I almost feel as if the author is attacking a straw man rather than a widely accepted linguistic theory.
  The author is aiming the article at typographers, not linguists and psychologists. It seems that while everyone who does scientific research into the way that we read has known for a long time that the word shape theory is full of crap, the theory persists as a kind of urban myth among typographers. So the paper is a scientific literature review for the benefit of people working in typography.
Thought comes before language by alanxyzzy · 2004-09-01 21:42 · Score: 4, Informative

I would love to see a study comparing how English is read to how Chinese is read by native speakers.
There is an interesting article at the Harvard Gazette about research which seems to show that thought comes before language. The Korean language distinguishes between two meanings of "in" - fitting loosely or tightly.
Research shows that
Infants of English-speaking parents easily grasp the Korean distinction between a cylinder fitting loosely or tightly into a container. In other words, children come into the world with the ability to describe what's on their young minds in English, Korean, or any other language. But differences in niceties of thought not reflected in a language go unspoken when they get older.
Article in short... by uss_valiant · 2004-09-01 21:55 · Score: 4, Informative

Further examination of the evidence used to support the word shape model has demonstrated that the case for the word shape model was not as strong as it seemed. The word superiority effect is caused by familiar letter sequences and not word shapes. Uppercase is faster than lowercase because of practice. Letter shape similarities rather than word shape similarities drive mistakes in the proofreading task. And pseudowords also suffer from decreased reading speed with alternating case text. All of these findings make more sense with the parallel letter recognition model of reading than the word shape model.
Of course, he describes all three models before concluding that, of Word Shape Recognition (oldest), Serial Letter Recognition, and Parallel Letter Recognition (newest), the last is the most widely accepted today.
Re:How we read... by Johan+Veenstra · 2004-09-01 22:02 · Score: 5, Informative

The example:

Aoccdrnig to a rscheearch at Cmabrigde Uinervtisy, it deosn't mttaer in waht oredr the ltteers in a wrod are, the olny iprmoetnt tihng is taht the frist and lsat ltteer be at the rghit pclae. The rset can be a toatl mses and you can sitll raed it wouthit porbelm. Tihs is bcuseae the huamn mnid deos not raed ervey lteter by istlef, but the wrod as a wlohe.

But soon enough there was a counterexample:

Anidroccg to crad cniyrrag lcitsiugnis planoissefors at an uemannd, utisreviny in Bsitirh Cibmuloa, and crartnoy to the duoibus cmials of the ueticnd rcraeseh, a slpmie, macinahcel ioisrevnn of ianretnl cretcarahs araepps sneiciffut to csufnoe the eadyrevy oekoolnr.

In the counterexample, the letters are not randomly scrambled; the interior letters of each word are in reverse order, with the first and last letters left in place.
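
The two transformations are easy to try for yourself. What follows is only a rough sketch, not anything from the paper or from whoever wrote the two examples: the function names (`shuffle_interior`, `reverse_interior`, `transform_text`) are made up for illustration, and treating a "word" as a plain run of ASCII letters is an assumption.

```python
import random
import re


def shuffle_interior(word: str, rng: random.Random) -> str:
    """Randomly permute the interior letters; keep the first and last in place."""
    if len(word) < 4:
        # Fewer than two interior letters: nothing to rearrange.
        return word
    middle = list(word[1:-1])
    rng.shuffle(middle)
    return word[0] + "".join(middle) + word[-1]


def reverse_interior(word: str) -> str:
    """Reverse the interior letters; keep the first and last in place."""
    if len(word) < 4:
        return word
    return word[0] + word[-2:0:-1] + word[-1]


def transform_text(text: str, transform) -> str:
    """Apply `transform` to every run of ASCII letters (assumed word definition)."""
    return re.sub(r"[A-Za-z]+", lambda m: transform(m.group(0)), text)


if __name__ == "__main__":
    rng = random.Random(2004)  # fixed seed so the shuffled output is repeatable
    sample = "According to card carrying linguistics professors"
    print(transform_text(sample, lambda w: shuffle_interior(w, rng)))
    print(transform_text(sample, reverse_interior))
```

The reversal is deterministic, so it reproduces words from the counterexample exactly ("According" becomes "Anidroccg"), while the random shuffle often leaves short words untouched or nearly so, which may be part of why the first example stays readable.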
Re:aaah!! eyes hurt! by DrSkwid · 2004-09-01 22:40 · Score: 4, Informative

dunno, firefox / moz has one of my favourite features

tools ... options ... general .... fonts & colours .... minimum font size : 14

great for annoying "web site designers" who can't design for shit

--
There are places where the networks are not touching,and there are places where they are-Boeing's Lori Gunter
Microsoft Research Web Site by Numen · 2004-09-01 22:56 · Score: 5, Informative

If you have shied away from Microsoft, well, because they're Microsoft, you might not be aware of http://research.microsoft.com, which, regardless of which side of various fences you sit on, has some very interesting material and is generally worth tracking over time.

Apologies for the tangent; this article seemed an apt place to point to the MS research site for those who might not have been aware of it.
Re:Don't shout! by Seahawk · 2004-09-01 23:31 · Score: 4, Informative

And if you had read the rest of the article, you would know that this is just because 99% of all we read is lowercase.

People can easily be trained to read text in caps as fast as lowercase text - or mirrored text.

What I fail to understand is how randomizing the middle letters of a word doesn't affect reading much. I had hoped he would use that as an example.

Tihs is a emxpale of the efecft.
Re:So ... by macshit · 2004-09-01 23:57 · Score: 3, Informative

Kanji = picture-based
English = character-based

It's like comparing apples and oranges - two completely different ways a written language is interpreted.

I think they're not quite as different as many people seem to think though.

Most kanji are composed of more primitive components. From observing myself reading Japanese, I've noticed that I make many of the same mistakes in recognition, and use similar tricks in recognizing unknown kanji, as I do when reading English. For instance, I frequently confuse two kanji because they have mostly the same primitive components, but differ in one (often the radical -- even though it's arguably the most important part of a kanji, I find I tend to ignore it when reading!).

In my opinion it's not unreasonable to think of the parts of a kanji as being like letters and the whole thing as being like a word.

--
We live, as we dream -- alone....