Algorithm Aims To Predict Fiction Bestsellers
benonemusic writes "Three computer scientists at Stony Brook University in New York believe they have found some rules through a computer program that might predict which fiction books will be successful. Their algorithm had as much as an 84 percent accuracy rate when applied to already published manuscripts in Project Gutenberg and other sources. Among their findings was that more successful books relied on verbs describing thought processes rather than actions and emotions. However, some disagree with the findings. Author Ron Hansen said style is not the key, but instead readers' interest in the topics in the book." There has been work done already on finding the formula for a hit song, and using analytics to craft a blockbuster movie.
... becomes more cold.
Sex, drugs, and rock 'n' roll.
Oh, if I had a penny for every time an algorithm aimed to do something...
to be made from suckers. No fancy computer program is going to replace actual talent.
Brave Sir Robin ran away. ("No!") Bravely ran away away. ("I didn't!")
Bias and other flaws in the design and statistical analysis.
Suffering increases every day from the ever-increasing Marketing Research and its derivations and accompanying costs. Keep in mind, there is more to costs than just money.
Is for the enjoyment like article much very.
Posted by Comment Bot v1.0, Universe Algorithms, division 9 Sirius Cybernetics Corporation.
A feeling of having made the same mistake before: Deja Foobar
How does Jackie Collins sell so many books? She uses too many verbs? I thought it was about the overly dripping romance themes that women seem to like?!?!
Harrison's Postulate - "For every action there is an equal and opposite criticism"
They began their research with Project Gutenberg, a database of 44,500 books in the public domain. A book was considered successful when it was critically acclaimed and had a high download count. The books chosen for analysis represented all genres of literature, from science fiction to poetry.
Then, they added some books not in the Gutenberg database, including Charles Dickens' "Tale of Two Cities," and Ernest Hemingway's "The Old Man and the Sea." They also added Dan Brown's latest novel, "The Lost Symbol," and books that have won the Pulitzer Prize, the National Book Award, and other awards.
Nowadays, marketing and signalling have as much to do with sales as anything else.
I imagine that if some publisher could make the kind of advertising push that Bill O'Reilly does,
they could put anything onto the NYTimes best seller list too.
[Fuck Beta]
o0t!
I was about to say that this speaks poorly of the breadth of the current generation's literary interests, and then I recalled books like Little Women and Lord of the Flies, or even Arthur C. Clarke's Childhood's End (although the Rama series might be more about descriptions than emotional exposés). Still, it's a little disheartening that technical manuals don't hit the bestseller lists. On the upside, Noam Chomsky will be overjoyed by this development; soon software systems will be developed to 'generate' hit books. Someone get Angelina (Mike Cook's, not Pitt's).
We all know advertising and product placement can make a big difference to return on investment, so what about including paid-for marketing and TV show plugs in the modelling? Nothing can be successful if no one has heard of it.
Two quotes stand out for me:
"It's very difficult to quantify decisions that are often made by intuition and relationships."
The study claims that at least some of those decisions are quantifiable, which pretty much contradicts Hamilburg's point.
"Of stylistic characteristics, the scientists are flying in the face of most teaching of creative writing when they emphasize nouns over verbs. Verbs are the engine of fiction and quality writing is often measured by their variety, precision, and force,"
Hansen appears to have missed the point of the study: it is about what sells, rather than what's taught or what makes quality writing.
Hypnosis is nothing other than an elegant description of a process.
I think it's interesting that as you find yourself looking at this screen, and focusing in clearly on these words you are reading.... you remember a time when you felt very VERY tired... maybe after a long work meeting or after staying up late working on a paper.... What you begin to notice is that you slowly find your eyes relaxing deeply,..... and as you become aware that your mind is slowing down and your mouth widens and begins to yawn... you feel your eyes closing as you drift off to sleep....
READY.
PRINT ""+-0
Perhaps they can explain why Fifty Shades did well despite being badly written.
There is a danger in this process that we end up with a "Save the cat" problem where everything has to follow a formula
http://www.slate.com/articles/arts/culturebox/2013/07/hollywood_and_blake_snyder_s_screenwriting_book_save_the_cat.html
71.4% of algorithms agree.
Look for modern fiction to adjust to fit the parameters of the application, degrading to a common level and uniform format. The literature cannot be observed without being altered. It will be a lot like the mandatory movie formula. The content itself is irrelevant.
1. Read the algorithm
2. Write a book
3. Profit!!!
I just wrote an algorithm that predicts that no book detailing the death of creativity at the hands of science will ever be written.
Does this article make everyone else as sick as it makes me?
Fast Federal Court and I.T.C. updates
Keep in mind, there is more to costs than just money.
Well, if you get "just money" after the costs it shouldn't be too hard to deal with them.
If this is anything like the song version, expect the text equivalent of dubstep and oh-ahs.
Never heard of 'Fitcoin', is that a descendant/parent of 'Bitcoin'? Please advise :)
These things don't actually work. They're curiosities and nothing more.
When they finally develop strong AI... then you might have something. But a non-intelligent system is not going to figure these things out.
I've decided to stop wasting my time responding to AC trolls/sockpuppets... so if you want a response from me... login.
Preceding any great scientific advancement or discovery it is no accident that you will find a surge in the fiction and cultural themes surrounding it.
The New World, Forensics, Avionics, Electronic Computing, Nuclear Reaction, Rocketry, Robotics.
The cultural mind thinks as you do. Its subconscious boils with the direction it will soon take. Ask yourself: What is seen much more now in your culture? What makes you think you have any choice but to latch onto any thoughts but those which come to mind from within? What makes you think society can choose from among the roiling themes anything other than what pattern is most apt? What makes you think?
Cybernetics.
Anyway, I'm a little worried about the methodology. If you train on PG and test on PG, your generalization error will suffer. This is especially easy to get wrong when both the train and test sets are constructed repeatedly with various thresholding rules, and the classifier features are (presumably) optimized while the research is being conducted.
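To make that worry concrete, here is a minimal sketch in Python. It is not the Stony Brook method: the data is pure random noise, and the names (`make_data`, `accuracy`, `pick_best_rule`) and the sizes are made up for illustration. It picks the best of 500 meaningless one-feature "rules" on 200 fake books, then scores the same rule on 2,000 fresh fake books.

```python
import random

def make_data(rng, n_books, n_features):
    """Random 0/1 features and random 0/1 'bestseller' labels.
    By construction no feature has anything to do with the label."""
    features = [[rng.getrandbits(1) for _ in range(n_features)]
                for _ in range(n_books)]
    labels = [rng.getrandbits(1) for _ in range(n_books)]
    return features, labels

def accuracy(features, labels, j):
    """Accuracy of the rule 'predict bestseller iff feature j is 1'."""
    hits = sum(1 for row, y in zip(features, labels) if row[j] == y)
    return hits / len(labels)

def pick_best_rule(features, labels):
    """Return the feature index whose rule scores best on this data."""
    n_features = len(features[0])
    return max(range(n_features), key=lambda j: accuracy(features, labels, j))

rng = random.Random(0)  # fixed seed so the run is repeatable
train_x, train_y = make_data(rng, n_books=200, n_features=500)
fresh_x, fresh_y = make_data(rng, n_books=2000, n_features=500)

best = pick_best_rule(train_x, train_y)          # tuned on the training books
in_sample = accuracy(train_x, train_y, best)     # scored on the same books
held_out = accuracy(fresh_x, fresh_y, best)      # scored on books never seen

print(f"in-sample accuracy: {in_sample:.3f}")
print(f"held-out accuracy:  {held_out:.3f}")
```

The in-sample number comes out well above a coin flip even though the rule is noise, while the held-out number sits near 50%. That gap is the reason to keep a test set that played no part in choosing features or thresholds.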
It's already been done - though only in fiction.
Roald Dahl wrote about a machine called the Great Automatic Grammatizator. You plug in various parameters - such as type of book, characters, proportions of violence/sex/humour - and in fifteen minutes flat it churns out something that's pretty much guaranteed to be a bestseller according to those parameters. Dahl being a writer himself - and a somewhat dark one at that - the end result was a dystopian universe in which writers were forced to give up writing and just license their name to the man with the machine, simply because the machine brought the cost of production down so much that this was the only way to earn a living as a writer.
Remember, there's a HUGE difference between successful and "good".
"Successful" means appealing to the dozen or so big publishers' editors, such that they are willing to pimp your book and market it. They can - and have, obviously - taken utter crapola to the top of the "bestseller" lists.
I entirely understand that the algorithm favors deep internal monologues, because those editors clearly love them.
-Styopa
Their algorithm had as much as an 84 percent accuracy rate when applied to already published manuscripts
I could write an algorithm that's 100% accurate selecting yesterday's lottery numbers.
A blockbuster movie? Space, cowboys, roughnecks, scenes of things blowing up, impending doom averted at the last minute and a guy who doesn't make it home and leaves behind a beautiful girl. Oh, and crazy Russians. Perfect formula. A blockbuster song? Repeating lyrics which drone on and a drum machine. The public just seems to love it this way!
What the algorithm looked at was writing style. That's hardly new. Teachers have been recommending this or that writing style, probably since the preferred medium was stone tablets. Slavish devotion to such recommendations is obviously undesirable, and a few outliers and experiments are necessary if you don't want writing styles to become stultified. But taking some advice about it is nothing new or undesirable. This study said nothing about structure (for which there are also standard recommendations) or subject.
All creative endeavors require a certain amount of less creative craftsmanship to be done well.
Trust me, you don't want to read about the marketing campaign.
quidquid id est, timeo puellas et oscula dantes.
If you read the article they're not really examining best sellers at all. A site like Gutenberg has no correlation with modern best sellers.
Film, TV and Internet have all had drastic effects on the market as well. Thus old books aren't really representative.
Writers aren't going to spend effort to create a well written book about subjects people aren't interested in.
love is just extroverted narcissism
Get something that, krufted up, will work... and the publishers will use it, rather than have readers decide what should be published. You like the crap packaged as "music" from the members of the RIAA? You'll see that in books, too....
mark
Making an algorithm to WRITE the next fiction bestseller
Sometimes it's better not having signature
http://www.gutenberg.org/wiki/Gutenberg:General_FAQ#G.13._How_does_Project_Gutenberg_choose_books_to_publish.3F
All-volunteer; what people scan and proofread is what's there, after a copyright check. Some things that were popular and are therefore common; some things that were always rare and therefore an enthusiast scanned a copy; some things people sought out to fill out a subject heading. There's *lots* of old light fiction, adventure stories and social comedies, that no-one's cared about for a century. (I find it fascinating what changed, and what didn't, and what changed *first*. I love old B-side books.)
"[they believe they have found an algorithm that might] predict which fiction books will be successful. Their algorithm had as much as an 84 percent accuracy rate when applied to already published manuscripts in Project Gutenberg and other sources."
I can predict the success rate of already published books with 100% accuracy.
Backtesting is usually bogus because it means nothing unless the experimenter can precisely enumerate the total number of rules that were formulated and discarded--including those formulated and discarded intuitively--before arriving at the one that tested well. If you consider 100 possible systems, the chances that at least one of them will test with results significant at the 1% level is 63%.
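The 63% figure can be checked with a few lines of Python. This is a sketch that assumes the 100 candidate systems are tested independently and that none of them has any real predictive power; the function name is made up for illustration.

```python
def chance_of_spurious_hit(n_systems, alpha):
    """Probability that at least one of n_systems worthless, independent
    systems passes a significance test at level alpha by luck alone."""
    return 1.0 - (1.0 - alpha) ** n_systems

p = chance_of_spurious_hit(n_systems=100, alpha=0.01)
print(f"{p:.1%}")  # prints 63.4%
```

If the candidate systems are correlated, as variations on one idea usually are, the true figure will differ, but the point stands: try enough rules and one will look significant.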
Also, "A Tale of Two Cities" IS in the Project Gutenberg database, which doesn't give me much confidence in anything else they say...
"How to Do Nothing," kids activities, back in print!
This is actually going to change the publishing game. It may take time to roll out, but soon novels will have to be uploaded electronically to agents and publishing houses. They will buy this software and let it "proofread" the author's work. Only if the book has all the qualities of a best seller will it be read by the agent or publishing house (or one of their lackeys). Then an online site will arise that allows authors to check their work before they submit it.