Words That Speak a Thousand Pictures
venolius writes: "The New York Times (free registration required) has an article
on TextArc (created by W. Bradford
Paley), a site that "aids in the
discovery of patterns and concepts in arbitrary text" (from the detailed
overview at TextArc). The site serves an applet that performs the task
(texts on which analysis is available include Alice
in Wonderland, Hamlet, and thousands
of others made available by Project
Gutenberg). The NYTimes article reports that Paley found that
"Dracula", which relies on a strong storyline, had a few keywords
clustered hotly at the center, and that the metaphoric "Frankenstein"
generated a circle of 50 words of modest intensity that faded towards the edges.
"Portrait of the Artist as a Young Man", with evenly distributed key
words, produces tight and round lines, and "Alice in Wonderland"
produces loopier lines. Check it out! (The applet was tested on better
hardware, but I did well enough with 98/IE6/550MHz/64MB.)"
Although I only viewed one book, it came up with some interesting results. I'd be curious to know how similar an author's books are to one another... can this distinguish an author's style, or merely individual works?
I also imagine that a college professor might be interested in running this against term papers!
How???? He had to go to the site, then go to the prefs in his browser to turn on Java, and then click on the link that said it was going to analyze the entire text of some long book and make pretty pictures out of it... in Java. (And if he didn't have to turn on Java, then he's probably due for some more disappointment in the future.) What alternative does the site have for making their research available to others? Should they have just put up this note?
We are doing some cool research, and we've
developed this really cool tool that we'd
love to let you play with, but we're worried
that some individuals may have unreasonable
expectations of how powerful their machines
are and we don't want to burst their bubbles,
so instead, we'll just keep it to ourselves.
That's just silly. I mean, the system recommendation contains the following:
That sounds like a good enough warning to me that, if you're using a 486 with 32MB of RAM over dialup, perhaps you don't want to try running it.
IMHO,
Michael