Slashdot Mirror


Algorithm Aims To Predict Fiction Bestsellers

benonemusic writes "Three computer scientists at Stony Brook University in New York believe they have found some rules through a computer program that might predict which fiction books will be successful. Their algorithm had as much as an 84 percent accuracy rate when applied to already published manuscripts in Project Gutenberg and other sources. Among their findings was that more successful books relied on verbs describing thought processes rather than actions and emotions. However, some disagree with the findings. Author Ron Hansen said style is not the key, but instead readers' interest in the topics in the book." There has been work done already on finding the formula for a hit song, and using analytics to craft a blockbuster movie.

14 of 146 comments

  1. Automated response by ackthpt · · Score: 5, Funny

    Is for the enjoyment like article much very.

    Posted by Comment Bot v1.0, Universe Algorithms, division 9 Sirius Cybernetics Corporation.

    --

    A feeling of having made the same mistake before: Deja Foobar
  2. Reading Level by TubeSteak · · Score: 4, Informative

    They began their research with Project Gutenberg, a database of 44,500 books in the public domain. A book was considered successful when it was critically acclaimed and had a high download count. The books chosen for analysis represented all genres of literature, from science fiction to poetry.

    Then, they added some books not in the Gutenberg database, including Charles Dickens' "Tale of Two Cities," and Ernest Hemingway's "The Old Man and the Sea." They also added Dan Brown's latest novel, "The Lost Symbol," and books that have won the Pulitzer Prize, the National Book Award, and other awards.

    Nowadays, marketing and signalling have as much to do with sales as anything else.
    I imagine that if some publisher could make the kind of advertising push that Bill O'Reilly does,
    they could put anything onto the NYTimes best seller list too.

    --
    [Fuck Beta]
    o0t!
    1. Re:Reading Level by retchdog · · Score: 3, Interesting

      It's not just legit donors, either. One of the games these people play is to charge institutions speaking fees for a public appearance, part of which charge is the required purchase of, say, 5,000 books for their library or for "promotional purposes". The institution plays along, sending 90%+ of the books to be pulped the next day, and the speaker's sales stats get bumped. Ridiculous.

      --
      "They were pure niggers." – Noam Chomsky
  3. Re:Stagnation by noh8rz10 · · Score: 4, Insightful

    On the upside, Noam Chomsky will be overjoyed by this development; soon software systems will be developed to 'generate' hit books. Someone get Angelina (Mike Cook's, not Pitt's).

    I see, so Angelina Jolie used to be an academy-award-winning actress, but now she's just Mrs. Pitt?

  4. can it explain... by able1234au · · Score: 3, Interesting

    Perhaps they can explain why Fifty Shades did well despite being badly written.

    There is a danger in this process that we end up with a "Save the cat" problem where everything has to follow a formula
    http://www.slate.com/articles/arts/culturebox/2013/07/hollywood_and_blake_snyder_s_screenwriting_book_save_the_cat.html

    1. Re:can it explain... by bob_super · · Score: 5, Insightful

      50 shades is a textbook example of a perfect marketing campaign. It cannot fit an algorithm, it's a total outlier.

      They sent out press releases to all the agencies about the new phenomenon of women using the wonderful anonymity of e-readers/tablets to read Mommy porn, like that "50 shades" thing.
      Journalists just repeated the press releases, over and over again, almost exactly word for word, on various networks, because that's a topic that draws viewer attention.

      And suddenly everyone knew that apparently a lot of people were reading that "50 shades" book, and that reading it was both cool and risqué. Jackpot.

      I read one page of the book that was published on a website. It was worse than the transcript of a reality TV show. It wasn't just bad literature, it was barely passable English.
      But the marketing was absolutely brilliant.

  5. Re:Authors fail to understand ... by plover · · Score: 4, Interesting

    However, the sample's study makes exactly the same mistake. They used Project Gutenberg as the source, and download counts as a substitute for sales. Sales has one measure: the number of dollars in the cash box at the end of the day. They should be measuring books on the NY Times bestseller list, or the Amazon Top 10 list, which have actually sold for money and are actually popular (fraudulently placed books aside). And they should be comparing them against books from their own genres, or at least books that had similar attributes.

    I think what they'd really find is that "books that sell well are those that are marketed well", regardless of the words they contain.

    Maybe they could focus on a specific key reviewer: what does Oprah like and not like? Maybe when they cross compile the data from all the books, they will find they've only discovered Oprah's tastes. Which isn't a bad outcome, if they are ultimately trying to discover what kinds of books will be better positioned to make the author money. But I don't think they've come close to predicting fiction "best-sellers" yet.

    --
    John
  6. Re:So does this explain... by Anonymous Coward · · Score: 3, Informative

    Don't forget: Successful books relied on: verbs describing.

    All this time I thought adjectives described. Silly me. No wonder my great novel failed.

    If that's what you thought then yes, that's probably one of your problems. Compare the following sentences:
    "He pitched the ball."
    "He hurled the ball."
    "He tossed the ball."
    "He lobbed the ball."
    "He chucked the ball."

    Where's the adjective to describe the manner in which the ball moved? There isn't one. The verb gives you the description of HOW the ball moved.
    In direct contradiction to this "algorithm", stronger writers tend to rely more on descriptive verbs, while weaker writers tend to rely on less descriptive words that need to be padded with adjectives or adverbs.

  7. Re:There is so much money by symbolset · · Score: 3, Interesting

    So you haven't been to the movies or read a bestselling book lately? There is no talent to replace.

    --
    Help stamp out iliturcy.
  8. Re:Uck by symbolset · · Score: 4, Funny

    Nowhere does it mention the one weird trick that effortlessly melts away the pounds in six minutes while you sleep - that the government doesn't want you to know because it creates instant wealth for the few who know this secret.

    --
    Help stamp out iliturcy.
  9. Re:If I had a penny by Chrisq · · Score: 4, Funny

    Oh, if I had a penny for every time an algorithm aimed to do something...

    on (anyAlgorithmProposed) {
    give yourself a penny
    }

  10. Re:If I had a penny by RabidReindeer · · Score: 3, Funny

    Add friendly vampires. If that doesn't work, add werewolves. Alternate version: zombies.

  11. Re:Authors fail to understand ... by RabidReindeer · · Score: 3, Interesting

    Success comes in two flavors.

    Gutenberg is stacked with classics. Stuff that has been successful over a long period of time. Some classics were flops when they were first published and some go periodically in and out of favor.

    The NYT bestseller list, Oprah, et al. focus on what's popular today. Relatively few books that make those lists will be popular in a century, just as many of the bestsellers from Dickens' day would only be known to literary historians. And missing from Gutenberg.

  12. Re:If I had a penny by Anonymous Coward · · Score: 3, Insightful

    So, a love triangle with a vampire, a werewolf, and a girl with the emotional depth of a zombie?