Slashdot Mirror


Text Analyzer Reveals Emotional 'Temperature' of Novels and Fairy Tales

KentuckyFC writes "Stories are a powerful channel for communicating emotions. But while they have been studied in detail by generations of critics, there is little in the way of objective tools for analyzing and comparing their emotional content. That looks set to change thanks to one data mining researcher who has applied the process of sentiment analysis to novels and fairy tales that have been digitized on Project Gutenburg and the Google Books Corpus. The results show the density of emotions in different parts of a story and how the emotional 'temperature' changes throughout the tale. For example, this guy has used the technique to compare the emotional content of the entire collection of the Brothers Grimm fairy tales to reveal that the darkest story is a tale called Gambling Hansel; clearly a lesson to us all."

32 of 65 comments

  1. I usually start with a Gambling Hansel.... by Anonymous Coward · · Score: 3, Funny

    then I move to a Dirty Gretel and finish all off with a move I call the Sugar Fairy Plums.

    Ironic captcha: bodice

    1. Re:I usually start with a Gambling Hansel.... by alphatel · · Score: 2

      Sadly Gambling Hansel isn't nearly as dark as Hansel and Gretel or Hans in Luck. So I guess the algorithm sucks?

      --
      When the foot seeks the place of the head, the line is crossed. Know your place. Keep your place. Be a shoe.
  2. This Guy by sexconker · · Score: 2

    The summary doesn't even mention the researcher's name? I mean, I agree that this is useless, pointless "research". But if you're going to piss about and drop Project Gutenburg and "Google Books Corpus" which are only tangentially related, couldn't you at least give "this guy" a fucking name?

    1. Re:This Guy by mythosaz · · Score: 2

      Ahhhh.... This guy... *points at guy...*

    2. Re:This Guy by mugurel · · Score: 2

      It's Project Gutenb*e*rg

    3. Re:This Guy by sexconker · · Score: 2

      It's Project Gutenb*e*rg

      And that's what I initially typed. I had to force myself to copy Gutenburg from the summary.

  3. It's a Pot Boiler! by ackthpt · · Score: 2

    "How yer know that?"

    "It's written in charcoal."

    i'll get me coat

    --

    A feeling of having made the same mistake before: Deja Foobar
  4. The "eight fundamental emotions" by PapayaSF · · Score: 3, Interesting

    From TFA:

    Analysing the emotional content of text is also becoming easier. In recent years, researchers have built up significant databases of the emotions that a given word evokes. This is part of the new field of sentiment analysis in which common words are categorised as positive, negative or neutral and associated with one of the eight fundamental emotions—joy, sadness, anger, fear, trust, disgust, surprise and anticipation.

    I don't know about anyone else, but I found that bit as fascinating as the text analyzer itself. But where does laughter fit? Shouldn't it count as a fundamental emotion? Or is it considered just a sub-category of "surprise" or "joy"?

    In any case, I wonder if someone could combine all that with the 36 dramatic situations and a few other components, and create a program that writes stories....

    --
    Q: What does the "B." in Benoit B. Mandelbrot stand for? A: Benoit B. Mandelbrot
    1. Re:The "eight fundamental emotions" by AthanasiusKircher · · Score: 2

      I don't know about anyone else, but I found that bit as fascinating as the text analyzer itself. But where does laughter fit?

      Yeah, while this theory of emotions surely has some good aspects, forcing ALL emotionally charged words into these categories will obviously skew the data in certain ways. When a model like this is used to classify something much more complex, the ultimate data analysis often tells you more about the model than about the data. Are we actually tracking the changes in "sadness" and "fear" over the course of Hamlet, or are we tracking some arbitrary dividing line that this model forces us to use to classify words?

      Moreover, these sorts of "digital humanities" projects that just analyze a corpus by counting up occurrences of words are always incredibly limited. It's so easy to skew the data just because a character or place or something in the story happens to include a word that is "emotionally charged" in a particular way in this model.

      For example, I note that "Godfather Death" is nearest to "Gambling Hansel" in terms of the "darkest story" in the study. While the story is dark, it's actually rather short and a very simple moral, with a main character who is a remarkable healer. The main character doesn't do too well at the end, but in the gamut of Brothers Grimm stories, this one doesn't have a lot of "dark" details. I assume, instead, that this story gets rated as very "dark" because one of the main characters happens to be named "Death," a word that has a very negative valence in the model. Write a happy story at a bar called "The Good Death" (referencing either bravery or sexuality), include a few other character names or place names that recur frequently but just happen to sound "negative," and I bet this algorithm will judge it "dark" too... or at least not as "positive" as the plot would suggest.

      I'm not saying such studies are useless. But they really need to factor in context, multiple meanings, and especially other factors that might lead to high frequencies of their chosen "emotional" words, like proper names or other plot points that may not actually be representative of the vocabulary and emotions of the story overall. In essence, for anything meaningful to come out of word frequency studies, you actually need to read the text as well and take into account things that would obviously skew the data.
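
      A minimal sketch of the proper-name skew described above, under loud assumptions: the word list NEGATIVE, the helper negative_share, and the sample text are all hypothetical and have nothing to do with the study's actual lexicon or code. It only shows how a character named "Death" can inflate a naive word count, and how skipping known names changes the score.

```python
import re

# Hypothetical negative-word list; a real emotion lexicon would be far larger.
NEGATIVE = {"death", "grave", "sick", "die"}

def negative_share(text, ignore_names=frozenset()):
    """Fraction of tokens that are negative words.

    Tokens that are capitalized in the text and whose lowercase form is in
    `ignore_names` are treated as proper names and dropped before counting.
    """
    tokens = re.findall(r"[A-Za-z']+", text)
    kept = [t for t in tokens
            if not (t[0].isupper() and t.lower() in ignore_names)]
    if not kept:
        return 0.0
    hits = sum(1 for t in kept if t.lower() in NEGATIVE)
    return hits / len(kept)

# Made-up sample text, not a quotation from the Grimm tale.
story = "Godfather Death healed the sick king. Death smiled and the king lived."

naive = negative_share(story)                 # counts the name "Death" twice
adjusted = negative_share(story, {"death"})   # skips the capitalized name
```

      On this toy text the naive share is 0.25 and the adjusted share is 0.1. The capitalization test is crude (a sentence-initial common noun would be skipped too), which is itself a reminder that someone still has to read the text.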

    2. Re:The "eight fundamental emotions" by PapayaSF · · Score: 2

      But they really need to factor in context, multiple meanings, and especially other factors that might lead to high frequencies of their chosen "emotional" words, like proper names or other plot points that may not actually be representative of the vocabulary and emotions of the story overall.

      You are correct. Obviously this sort of text analyzer is still in its infancy. It would be interesting to throw some oddball stories at it and see the results. E.g., here's a story filled with unpaired words. I wonder what it would say its "emotional temperature" was? And of course the program would totally miss the humor. (Note that the New Yorker blew the formatting when they put this online, and that the actual story starts with the third sentence: "It had been a rough day....")

      --
      Q: What does the "B." in Benoit B. Mandelbrot stand for? A: Benoit B. Mandelbrot
    3. Re:The "eight fundamental emotions" by sconeu · · Score: 4, Funny

      That's because the authors couldn't get no satisfaction, though they tried and they tried, and they tried, and they tried.

      --
      General Relativity: Space-time tells matter where to go; Matter tells space-time what shape to be.
    4. Re:The "eight fundamental emotions" by phantomfive · · Score: 2

      I think laughter is a response to an emotion, not an emotion itself. Thus you can have nervous laughter, or joyful laughter, or laughter of relief.........

      --
      "First they came for the slanderers and i said nothing."
    5. Re:The "eight fundamental emotions" by PapayaSF · · Score: 2

      Hmmm, interesting. But what is laughing at a joke? It isn't necessarily nervous, or joyful. It's sort of a relief, in that the tension of the buildup is released in the punch line, but that doesn't seem to be the same as "laughter of relief." Humor, or laughter, just seems to me to be as core of an emotion as fear or anger, but maybe psychologists and data miners don't see it that way.

      --
      Q: What does the "B." in Benoit B. Mandelbrot stand for? A: Benoit B. Mandelbrot
    6. Re:The "eight fundamental emotions" by dougisfunny · · Score: 2

      That is a rather limited way to look at satisfaction. Relief, catharsis, vengeance, schadenfreude, comeuppance to name a few could all be descriptive of something satisfactory, but might not cross over into joy so much.

      --
      This is not the funny you're looking for.
    7. Re:The "eight fundamental emotions" by hyperfine+transition · · Score: 2

      From TFA:

      In any case, I wonder if someone could combine all that with the 36 dramatic situations and a few other components, and create a program that writes stories....

      Someone has ... every Hollywood studio has it running on their server farms

    8. Re:The "eight fundamental emotions" by Panoptes · · Score: 2
      "This is part of the new field of sentiment analysis in which common words are categorised as positive, negative or neutral and associated with one of the eight fundamental emotions - joy, sadness, anger, fear, trust, disgust, surprise and anticipation."

      Codswallop! This notion has been around for a very long time under the name of connotation. Giving something a new name and peddling it as a new concept doesn't exactly inspire confidence in the writer's competence or integrity.

  5. Think of the Children by Froeschle · · Score: 2

    After reading this article I can only say that I am grateful to have my own childhood behind me.
    [the end]

  6. Tropes by PRMan · · Score: 2

    Let me know when it can call out all the TVTropes in a story...or would that cause an endless loop?

    --
    Peter predicted that you would "deliberately forget" creation 2000 years ago...
    1. Re:Tropes by vlueboy · · Score: 2

      Let me know when it can call out all the TVTropes in a story...or would that cause an endless loop?

      This troper believes that the fun in visiting tvtropes is knowing other geeks have enjoyed each story... or *hated* it, and in many other ways been driven to break down the story for your enjoyment due to resonating with that story emotionally. Call it pride in exclusivity, sort of like coming to slashdot looking for fellow geek observations, tips, jokes, and so on. In short, tvtropes is fun because you're finding someone else that you share some knowledge with in a way the corresponding Wikipedia article cannot fulfill.
      So... once a computer is doing the "thinking", even making a knowledge tree 1000 times denser than the depth of TVtropes, we'd know there is no emotion in it, and that it won't be growing on its own from outside contributions, or real-life anecdotes tangentially related to the content, or even grow (would the computer be "watching" new series and recalculate everything, or would it be too busy maintaining a finite set of data frozen in time?) It COULD be done given enough tech, but knowing implementations, it'd feel like landing at some endless Google linkbait farm that links endlessly to itself, and TRYING to force yourself to enjoy the crickets chirping while you click.

    2. Re:Tropes by Zanadou · · Score: 2

      This troper...

      There's a trope for that.

  7. Here is the PowerPoint for the paper by TedTschopp · · Score: 4, Informative

    Here is a good summary of the work in a PDF of a PPT.

    http://www.saifmohammad.com/WebDocs/LaTeCH-emotions-in-books.pdf

    Ted

    --
    Fantasy remains a human right; we make in our measure and in our derivative mode... -- JRR Tolkien
    1. Re:Here is the PowerPoint for the paper by retchdog · · Score: 2

      1. used Mechanical Turk to get people to report association of words with emotions
      2. determine emotion by counting(!) corresponding words with weighting proportional to association, using sliding window in time.
      3. generate pretty but almost meaningless plot
      4. profit?

      hint: Dramatic tension is often created through irony. The audience knows that the doom of character X (e.g. Walter White) comes when he finally comes to trust character Y (e.g. the white nationalist thugs) implicitly. Goddammit, at the very least look for some negative auto- or cross-correlations.
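
      Steps 1 and 2 above can be sketched roughly as follows. This is an illustration under assumptions, not the paper's code: the toy LEXICON, its association weights, and the function names are all made up, and the real lexicon was crowdsourced rather than hand-typed.

```python
import re

# Hypothetical toy lexicon: word -> {emotion: association strength in [0, 1]}.
# These entries and weights are invented for illustration only.
LEXICON = {
    "death": {"sadness": 0.9, "fear": 0.8},
    "happy": {"joy": 1.0},
    "friend": {"trust": 0.7, "joy": 0.4},
    "wolf": {"fear": 0.6},
}

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def emotion_density(tokens, emotion, window=3, step=1):
    """Average association weight for `emotion` in each sliding window.

    Each token contributes its lexicon weight for the emotion (0 if absent);
    the window then slides through the story from start to end.
    """
    weights = [LEXICON.get(t, {}).get(emotion, 0.0) for t in tokens]
    if not weights:
        return []
    densities = []
    last_start = max(len(weights) - window, 0)
    for start in range(0, last_start + 1, step):
        chunk = weights[start:start + window]
        densities.append(sum(chunk) / len(chunk))
    return densities

tokens = tokenize("The happy friend met the wolf and death followed")
fear_curve = emotion_density(tokens, "fear", window=3)
```

      Plotting fear_curve against window position gives the "temperature" curve. Note that nothing here sees irony or context, which is exactly the complaint in the hint above.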

      --
      "They were pure niggers." – Noam Chomsky
    2. Re:Here is the PowerPoint for the paper by tinkerton · · Score: 2

      Here's a good basic emotion measure: adjective count. It covers the bulk of written text even though one can find ways around it.

  8. Gambling Hansel: not dark at all by themushroom · · Score: 3, Interesting

    I didn't find it dark at all, not nearly as dark as the tales Disney sanitized. I mean, it's about a gambler who beats both God and the Devil even if he has lousy luck with mortals prior to getting rigged cards and dice.

    1. Re:Gambling Hansel: not dark at all by gl4ss · · Score: 2

      well it's pretty dark that seemingly people on earth have no choice but to gamble nor do beings in hell and heaven.

      other than that, it's pretty light hearted for a grimm.

      there's even 7 years in gambling hansel during which nobody dies, since he beat death too - and his soul lives eternally in gamblers.

      so the algorithm is pretty contextually unaware. maybe it just counts nots and buts. waste of time anyhow, even if it did get me to read gambling hansel.

      --
      world was created 5 seconds before this post as it is.
  9. Re:The Bible? by tepples · · Score: 2

    Wouldn't the steamiest book of the Bible be Fifty Shades, umm, Song of Solomon?

  10. Related stuff by my wife on tagging narratives by Paul+Fernhout · · Score: 4, Informative

    Mainly by hand though. Free book: http://www.workingwithstories.org/
    Free software for communities: http://www.rakontu.org/
    Related business process patent (sadly) when at IBM Research: http://www.google.com/patents?hl=en&lr=&vid=USPAT7136791
    Past commercial software: http://www.sensemaker-suite.com/
    National security (does have some automatic aspects): http://app.rahs.gov.sg/public/www/content.aspx?sid=2955

    There is a lot you can do with stories once they are tagged for emotional intensity, whether automatically, by the teller, or by other people. Stories are all around us, as we try to make sense of our lives and events in our communities. So this sort of technology to tag emotions in stories is much more far reaching than just being about fiction. It can be used to design better products, to help communities figure out what to do about a pressing issue, to resolve conflicts, and to see emerging trends. That is one reason such work is funded by the intelligence sector (as well as businesses and some non-profits). She's been trying to make these ideas freely available to everyone, but it has been a slow going slog to follow the path of free and open source for all this.

    By someone else on the relation between emotion and reason:
    http://en.wikipedia.org/wiki/Descartes'_Error

    --
    A 21st century issue: the irony of technologies of abundance in the hands of those still thinking in terms of scarcity.
  11. Re:The Bible? by retchdog · · Score: 2

    my favorite book of the Bible is Ecclesiastes which, being basically an existential musing on the meaninglessness of life by, ostensibly, King Solomon, is considered so out-of-place that scholars have been trying for about two thousand years to figure out why the hell it was included in the Hebrew canon.

    --
    "They were pure niggers." – Noam Chomsky
  12. Re:The Bible? by TedTschopp · · Score: 2

    The best answer that I have heard is that the existential nihilism that is covered by the book is an important aspect of Jewish / Christian traditions and that all wise people must confront it and think about it. The idea is so central that it even suggests the idea that God himself wrestles with this question and more specifically in the Christian Tradition this is what the Christ wrestled with on the cross when he cried out "Why have you forsaken me?" The saints, holy people, and madmen throughout history have all struggled with this and were all changed by the questions asked by this book.

    --
    Fantasy remains a human right; we make in our measure and in our derivative mode... -- JRR Tolkien
  13. Re:The Bible? by __aaltlg1547 · · Score: 2

    You misunderstood the comment. TedTschopp is saying these are such fundamental questions that every Jew and Christian grapples with, even the Christ who is regarded by Christians as the incarnation of God. The saying of Jesus on the cross "Why have you forsaken me?" is very much of a piece with the existential despair of Ecclesiastes. What could be worse for Jesus at that point than feeling that not only is he dying in this horrific way, but that it is a completely meaningless experience, as is the whole of his life and maybe nobody even cares what is happening to him?

    Christianity is supposed to be an answer to the questions raised by Ecclesiastes. It is all about life not being futile in the big picture and God actually caring what happens to people. Without the thoughts and feelings expressed in Ecclesiastes, one would have to ask what Christian salvation is even for.

    I find it the most universal of all the Judeo-Christian scriptures. It's not just Jews and Christians who must grapple with the apparent meaninglessness of life and the futility of our actions and desires. Everyone must. Every religion and philosophy, if it's worth anything at all, must address these issues.

  14. Text Analysis... by David_Hart · · Score: 2

    So, this analyses text and emotional connotation of words to produce an emotional score for each story. Yet, it has no way of divining context, whether or not a particular section of the story is funny, or if a death causes an emotional reaction of sadness or satisfaction (i.e. the character was evil, deserved it, etc.). In other words, it's an arbitrary system that may work at a basic level but will still get a lot of things completely wrong... at least it's a start, I guess...

  15. Simple by wonkey_monkey · · Score: 2

    if (text.includes('Natalie Portman') && text.includes('grits')) temperature = 'steamy';

    --
    systemd is Roko's Basilisk.