Slashdot Mirror


Algorithm Aims To Predict Fiction Bestsellers

benonemusic writes "Three computer scientists at Stony Brook University in New York believe they have found some rules through a computer program that might predict which fiction books will be successful. Their algorithm had as much as an 84 percent accuracy rate when applied to already published manuscripts in Project Gutenberg and other sources. Among their findings was that more successful books relied on verbs describing thought processes rather than actions and emotions. However, some disagree with the findings. Author Ron Hansen said style is not the key, but instead readers' interest in the topics in the book." There has been work done already on finding the formula for a hit song, and using analytics to craft a blockbuster movie.

26 of 146 comments

  1. Automated response by ackthpt · · Score: 5, Funny

    Is for the enjoyment like article much very.

    Posted by Comment Bot v1.0, Universe Algorithms, division 9 Sirius Cybernetics Corporation.

    --

    A feeling of having made the same mistake before: Deja Foobar
  2. Reading Level by TubeSteak · · Score: 4, Informative

    They began their research with Project Gutenberg, a database of 44,500 books in the public domain. A book was considered successful when it was critically acclaimed and had a high download count. The books chosen for analysis represented all genres of literature, from science fiction to poetry.

    Then, they added some books not in the Gutenberg database, including Charles Dickens' "Tale of Two Cities," and Ernest Hemingway's "The Old Man and the Sea." They also added Dan Brown's latest novel, "The Lost Symbol," and books that have won the Pulitzer Prize, the National Book Award, and other awards.

    Nowadays, marketing and signalling have as much to do with sales as anything else.
    I imagine that if some publisher could make the kind of advertising push that Bill O'Reilly does,
    they could put anything onto the NYTimes best seller list too.

    --
    [Fuck Beta]
    o0t!
    1. Re:Reading Level by Charliemopps · · Score: 2

      All books written by politically active people like O'Reilly are nothing more than slush funds to funnel money towards a particular party or candidate. The Clintons have done it, Sarah Palin's a master of it... Your donors buy up your books, giving you fame, getting the press to talk about you... and then "donate" them to fund-raisers who "give" them away to donors. It looks like you sold lots of books, and you're all over the news because of it, but no one's reading the book, not even the anchors claiming to interview you about it. God I hate marketing.

    2. Re:Reading Level by retchdog · · Score: 3, Interesting

      It's not just legit donors, either. One of the games these people play is to charge institutions speaking fees for a public appearance, part of which charge is the required purchase of, say, 5,000 books for their library or for "promotional purposes". The institution plays along, sending 90%+ of the books to be pulped the next day, and the speaker's sales stats get bumped. Ridiculous.

      --
      "They were pure niggers." – Noam Chomsky
  3. Stagnation by aslashdotaccount · · Score: 2

    I was about to say that this speaks poorly of the breadth of the current generation's literary interests, and then I recalled books like Little Women and Lord of the Flies, or even Arthur C. Clarke's Childhood's End (although the Rama series might be more about descriptions than emotional exposés). Still, it's a little disheartening that technical manuals don't hit the bestseller lists. On the upside, Noam Chomsky will be overjoyed by this development; soon software systems will be developed to 'generate' hit books. Someone get Angelina (Mike Cook's, not Pitt's).

    1. Re:Stagnation by noh8rz10 · · Score: 4, Insightful

      On the upside, Noam Chomsky will be overjoyed by this development; soon software systems will be developed to 'generate' hit books. Someone get Angelina (Mike Cook's, not Pitt's).

      I see, so Angelina Jolie used to be an academy-award-winning actress, but now she's just Mrs. Pitt?

  4. Authors fail to understand ... by MacTO · · Score: 2

    Two quotes stand out for me:

    "It's very difficult to quantify decisions that are often made by intuition and relationships."

    The study claims that at least some of those decisions are quantifiable, which pretty much contradicts Hamilburg's point.

    "Of stylistic characteristics, the scientists are flying in the face of most teaching of creative writing when they emphasize nouns over verbs. Verbs are the engine of fiction and quality writing is often measured by their variety, precision, and force,"

    Hansen appears to have missed the point of the study: it is about what sells, rather than what's taught or what makes quality writing.

    1. Re:Authors fail to understand ... by plover · · Score: 4, Interesting

      However, the study's sample makes exactly the same mistake. They used Project Gutenberg as the source, and download counts as a substitute for sales. Sales has one measure: the number of dollars in the cash box at the end of the day. They should be measuring books on the NY Times bestseller list, or the Amazon Top 10 list, which have actually sold for money and are actually popular (fraudulently placed books aside). And they should be comparing them against books from their own genres, or at least books that had similar attributes.

      I think what they'd really find is that "books that sell well are those that are marketed well", regardless of the words they contain.

      Maybe they could focus on a specific key reviewer: what does Oprah like and not like? Maybe when they cross compile the data from all the books, they will find they've only discovered Oprah's tastes. Which isn't a bad outcome, if they are ultimately trying to discover what kinds of books will be better positioned to make the author money. But I don't think they've come close to predicting fiction "best-sellers" yet.

      --
      John
    2. Re:Authors fail to understand ... by RabidReindeer · · Score: 3, Interesting

      Success comes in two flavors.

      Gutenberg is stacked with classics. Stuff that has been successful over a long period of time. Some classics were flops when they were first published and some go periodically in and out of favor.

      The NYT bestseller list, Oprah, et al. focus on what's popular today. Relatively few books that make those lists will be popular in a century, just as many of the bestsellers from Dickens' day would only be known to literary historians. And missing from Gutenberg.

    3. Re:Authors fail to understand ... by plover · · Score: 2

      I was commenting based on the titles of the articles discussing the study: "Algorithm aims to predict fiction bestsellers" and "Computer Algorithm Seeks to Crack Code of Fiction Bestsellers". The strong implication is that the algorithm is designed to unlock the secret of making money by writing books that contain certain words or linguistic structures. I'm arguing that a book's financial success has much less to do with any ephemeral "bestsellerness" quality, and has a much stronger association with "marketing campaigns".

      Now, that may or may not be the basis for why the researchers performed their study, or even what they hoped to learn, but it's how their study is being perceived by the media. Which is ironically making my point: it isn't the facts or the content of the study that's important, it's the coverage of the story that's put the slant on what they found. If the [marketing|reporting] for this study had instead said "Researchers develop algorithmic approach to search for linguistic commonalities in Project Gutenberg texts", it probably wouldn't even have merited notice on Slashdot.

      Tl;dr: marketing wins.

      --
      John
    4. Re:Authors fail to understand ... by AthanasiusKircher · · Score: 2

      Gutenberg is stacked with classics. Stuff that has been successful over a long period of time. Some classics were flops when they were first published and some go periodically in and out of favor.

      Or, in other words, what counts as a "classic" right now is simply what's popular today. I think the trends can be better seen in music history. Take, for example, Pachelbel's Canon in D, that piece which seemingly shows up everywhere as "classical music." Johann Pachelbel, however, was a master composer, well-known in his lifetime for all sorts of compositions. Today he has one stupid piece played at thousands of weddings and other occasions every year, just because of some whims of audiences in the late 1960s who got interested in it.

      Take Antonio Vivaldi, who was hugely popular in his lifetime, then almost completely forgotten for centuries (he died a pauper, so his fame was as short-lived as many pop artists today), until some Italian archivists dug up his thousand-or-so compositions in the 1920s, and these pieces were then deliberately promoted as part of Italian cultural history beginning in the 1930s.

      Or, heck, for a recent example, look at Thomas Tallis's Spem in alium, a Renaissance motet that was pretty obscure until the past couple of years after it appeared in the novel Fifty Shades of Grey. Suddenly, recordings of the piece bounded up to the top of the charts, and it has led to a new interest in Renaissance music and certain early music performance groups.

      I'm not saying that these pieces or composers don't have great value or that they shouldn't be "classics." But I do think that interest in particular "classics" is driven almost as much by current culture as actual current art/literature/music is. Measuring downloads from Project Gutenberg is giving us a particular snapshot into what is considered "classic" literature for the past few years. Fifty years ago, or a hundred years ago, I can guarantee you that the lists would be different -- and not just because of works written since then.

  5. can it explain... by able1234au · · Score: 3, Interesting

    Perhaps they can explain why Fifty Shades did well despite being badly written.

    There is a danger in this process that we end up with a "Save the cat" problem where everything has to follow a formula
    http://www.slate.com/articles/arts/culturebox/2013/07/hollywood_and_blake_snyder_s_screenwriting_book_save_the_cat.html

    1. Re:can it explain... by bob_super · · Score: 5, Insightful

      50 shades is a textbook example of a perfect marketing campaign. It cannot fit an algorithm, it's a total outlier.

      They sent out press releases to all the agencies about the new phenomenon of women using the wonderful anonymity of e-readers/tablets to read Mommy porn, like that "50 shades" thing.
      Journalists just repeated the press releases, over and over again, almost exactly word for word, on various networks, because that's a topic that draws viewer attention.

      And suddenly everyone knew that apparently a lot of people were reading that "50 shades" book, and that reading it was both cool and risqué. Jackpot.

      I read one page of the book that was published on a website. It was worse than the transcript of a reality TV show. It wasn't just bad literature, it was barely passable English.
      But the marketing was absolutely brilliant.

    2. Re:can it explain... by hey! · · Score: 2

      I read Snyder's book because he was a friend of a friend. First off, it's not about *everything*. It's about movie scripts. Secondly it's a bit naive to blame the lack of creativity of modern movies on his book; that's a trend that predates 2005.

      In any case screenwriters are nothing like the olympian figures playwrights are in theater. The main creative force in a movie is the director, and writers are relatively minor figures in the enterprise. In the theater the script is gospel. In the movies a director routinely adds to, deletes from or reorganizes a script as he sees fit. It's important to realize that screenwriters structure screenplays, but directors structure the movies. A screenplay isn't the movie story; it's a guideline that helps the director imagine the story HE will tell. Thus things like the page count for each story beat *are for the benefit of the director*, and don't have much if any relationship to the pace of the story as seen by the moviegoers.

      What Snyder did for screenwriting was analogous to what agile programming advocates did for programming. He codified the practices from successful projects. What the linked article does is at best intellectually sloppy or, at worst, disingenuous. Mr Suderman applies the 15 beat structure to recent movies, but fails to note that the same can be done for nearly every commercially successful movie in the last 80 years.

      As for 50 SHADES, it's possible the formula *might* explain why it is more successful than other readily available erotica. And marketing helps too, but remember this was initially a self-published book that took off by word of mouth.

      Ultimately, when you become a discerning reader, you realize that practically every novel is flawed in some way or another. And while all things being equal a better written novel is more likely to be successful, all things are most definitely NOT equal. You cannot craft your way to success with readers, you have to speak to something in them. It's more important that a story does something right, than it does everything, or even most things right.

      --
      Post may contain irony: discontinue use if experiencing mood swings, nausea or elevated blood pressure.
    3. Re:can it explain... by martin-boundary · · Score: 2

      Porn works like that. Have you ever landed on a porn video site? Most of the videos have no story, and show even less acting ability or camera skills.

      But you know what? Nobody cares! It's the same with 50 shades: people don't read it because it's art. Women read it to get ideas and fantasies. And to be honest most porn sites don't cater to women, so they have a limited choice in the matter.

  6. Re:So does this explain... by Anonymous Coward · · Score: 3, Informative

    Don't forget: Successful books relied on: verbs describing. All this time I thought adjectives described. Silly me. No wonder my great novel failed.

    If that's what you thought then yes, that's probably one of your problems. Compare the following sentences:
    "He pitched the ball."
    "He hurled the ball."
    "He tossed the ball."
    "He lobbed the ball."
    "He chucked the ball."

    Where's the adjective to describe the manner in which the ball moved? There isn't one. The verb gives you the description of HOW the ball moved.
    In direct contradiction to this "algorithm", stronger writers tend to rely more on descriptive verbs, while weaker writers tend to rely on less descriptive words which need to be padded with adjectives or adverbs.

  7. What a stupid idea by OhANameWhatName · · Score: 2

    1. Read the algorithm
    2. Write a book
    3. Profit!!!

    I just wrote an algorithm that predicts that no book detailing the death of creativity at the hands of science will ever be written.

  8. Uck by speedplane · · Score: 2

    Does this article make everyone else as sick as it makes me?

    --
    Fast Federal Court and I.T.C. updates
    1. Re:Uck by symbolset · · Score: 4, Funny

      Nowhere does it mention the one weird trick that effortlessly melts away the pounds in six minutes while you sleep - that the government doesn't want you to know because it creates instant wealth for the few who know this secret.

      --
      Help stamp out iliturcy.
  9. Re:There is so much money by symbolset · · Score: 3, Interesting

    So you haven't been to the movies or read a bestselling book lately? There is no talent to replace.

    --
    Help stamp out iliturcy.
  10. Re:If I had a penny by Chrisq · · Score: 4, Funny

    Oh, if I had a penny for every time an algorithm aimed to do something...

    on (anyAlgorithmProposed) {
        give yourself a penny
    }

  11. Re:There is so much money by mcgrew · · Score: 2

    No fancy computer program is going to replace actual talent.

    I don't think there's any correlation between talent and success whatever. Wikipedia quotes Stephen King as saying that James Patterson "is a terrible writer, but very successful." I read Patterson's "When the Wind Blows" and wasn't very impressed with his writing, either, especially the switching back and forth between 1st and 3rd person. But almost every time I see a woman with a book it's one of his.

    Asimov's Hugo-winning Foundation trilogy didn't earn him a dime for ten years, until Doubleday bought the rights from the original publisher.

    Meanwhile I know a lot of incredibly talented musicians who play in bars because the labels offered them ridiculous contracts.

    Anyone remember Milli Vanilli?

    Marketing is king, talent is a dime a dozen.

  12. Re:If I had a penny by RabidReindeer · · Score: 3, Funny

    Add friendly vampires. If that doesn't work, add werewolves. Alternate version: zombies.

  13. A block buster? by ai4px · · Score: 2

    A blockbuster movie? Space, cowboys, roughnecks, scenes of things blowing up, impending doom averted at the last minute and a guy who doesn't make it home and leaves behind a beautiful girl. Oh and crazy Russians. Perfect formula. A blockbuster song? Repeating lyrics which drone on and a drum machine. The public just seems to love it this way!

  14. Re:If I had a penny by Anonymous Coward · · Score: 3, Insightful

    So, a love triangle with a vampire, a werewolf, and a girl with the emotional depth of a zombie?

  15. Re:There is so much money by Tipa · · Score: 2

    Huh? Asimov originally serialized the Foundation series in Astounding Magazine, for which he was paid quite well.

    Those Golden Age SF pros didn't write a word if they weren't going to be paid for that word. This was their livelihood.