Software Predicts Movie Success
scheming daemons writes "TechNewsWorld has an article about software that predicts whether a movie will be successful or not by factoring in its rating by censors (e.g. G, PG, R), strength of the cast, genre, competition from other films at the time of release, special effects, whether it is a sequel, and the number of theaters in which it will show."
A good script?
Trolling is a art,
It seems there has been a recent spurt of "smart" systems like this...
Maybe we're finally coming out of the "AI Winter" it seems like we've been in for a decade or so...
hard core geek-ware
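#include <stdio.h>

/* Hypothetical stubs so the joke compiles; neither exists in the original. */
static int this_is_mainstream(void) { return 1; }
static int good = 0;

int main(void) { /* error */
    if (this_is_mainstream()) {
        if (good)
            return 1;
        else
            printf("50 Million USD\n");
    } else {
        printf("Sued out of existence before it's released\n");
    }
    return 0;
}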
King Kong is flopping like a pancake...
I can do this with Excel and some previous statistics! How much of a breakthrough is this? Of course, if it were a program that analyzes the script, that would be another matter, but it's not.
please excuse my apathy
The big recording labels had developed software to determine the quality of a song. Apparently, they could determine if a song would be a megahit or a flop. Judging from what I've heard on the radio, it doesn't seem to work. Hopefully the movie industry will have better success.
http://religiousfreaks.com/
Napoleon Dynamite? I find it hard to believe that this script would have predicted the success of this film.
Also, this actually kind of disgusts me since it seems IMHO that it relies on the same formulaic approach that's responsible for the poor offerings that Hollywood is currently producing.
rating by censors (e.g. G, PG, R), strength of the cast, genre, competition from other films at the time of release, special effects, whether it is a sequel, and the number of theaters in which it will show.
Hmmm...I wonder what it had to say about Waterworld...
New Snot Eunichs.
Suddenly I'm thinking of the measure of the greatness of poetry scene from Dead Poets Society. Right on. Yeah, I know, it's not about greatness, it's about box office success. I bet they left Gigli out of their tests.
must... stay... awake...
If this thing does any good predicting at all, I'm sure it's based on the number of screens that the movie shows on. Once you have that number, I'm sure your pick will usually be pretty close. This is because the theater companies pay public opinion eggheads big bucks to figure out how many screens to reserve for movies... based on the movie's expected audience draw. These theater people do the actual analysis. To piggyback on their results and then pretend you were the insightful one seems really ... unimpressive.
Many of the criteria used here are subjective, and based upon existing human estimation of the movie's success. For instance, when a movie opens in a large number of theatres simultaneously, it usually means people have already predicted it will be successful. Also, movies are often scheduled to open on a date that doesn't conflict with other movies and is chosen to maximise revenue. It's a real stretch to call this software's process 'scientific'.
Network President: Greetings, gentlemen. You already know my execubots: Executive Alpha, programmed to like things it has seen before.
Executive Alpha: Hey hey hey.
Network President: Executive Beta, programmed to roll dice to determine the fall schedule.
Executive Beta: (rolls dice) More reality shows!
Network President: And Executive Gamma, programmed to underestimate Middle America.
Executive Gamma: It's funny, but is it going to get them off their tractors?
"...strength of the cast..."
Will it be based on looks or on acting ability? There would be some serious issues if they used acting ability. There are some horrid actors/actresses that sell boatloads because they look great, and then there are some...well...less visually pleasing folks who are fantastic actors/actresses.
Yet another example of some machine learning bozo overtraining on a dataset to come up with a perfect predictor of historical data with little value for generalization. No doubt they have some dull understanding of cross validation which they mistakenly believe assures they have not overtrained. Heh. In the end just as good as your linear numTit predictor.
And then when they are done they find that any future predictive power it has is focused only into a couple of clusters that any fool could have told you were sure bets. It has no value unless your goal is to recycle the same things over and over till there's just one true formula that all money-making movies must follow.
I suspect movie making is probably a lot like the stock market. While there are general themes that always have positive returns, there can't be a formula for big success, because if there were, then once it was known it would not work anymore. Originality and the cyclic nature of traditional themes are the flow, but they are not predictable.
Some drink at the fountain of knowledge. Others just gargle.
...with formulaic movies is more formulas?
I have my doubts this will work. Statistically speaking, John Ratzenberger, the guy who played Cliff on Cheers, is a very bankable actor: he's been in The Empire Strikes Back, a couple of Superman films, and all six Pixar films, so his films have grossed billions of dollars. I guess a computer might pick him to play the villain in the next Batman film, but in real life there isn't a magic formula.
This is what happens when the bean counters try to quantify the creative process. You can add up all the ingredients for a hit movie and still have a major bomb on your hands.
It's like saying you can dump foie gras, Chateau Latour, beluga caviar and a savoy truffle into a blender and end up with the world's most wonderful milkshake. In the end it's a recipe for mediocrity, at best. More often, all you get is expensive puke.
If one could predict success by adding up the elements that go into movie making, then "Catwoman" should have been the megahit of 2004.
"rating by censors (e.g. G, PG, R), strength of the cast, genre, competition from other films at the time of release, special effects, whether it is a sequel, and the number of theaters in which it will show."
It's ridiculous to expect software to predict entertainment. From the above, success can only be even remotely predicted by "the number of theaters in which it will show". And possibly the "strength of the cast". Mainly I think the trailers shoved down our throats, with only the best parts of the movie, could help success. I highly doubt this software would have predicted the success of The Blair Witch Project. Zero special effects, zero strength of the cast, zero budget.
success = IMDB.com_USER_RATING
With 9 revenue categories, correctly predicting the category 37% of the time (RTFA) is, ahem, unimpressive - a dartboard would guess correctly 11% of the time.
So we have a predictor that makes 0.63/0.89 ~= 70% as many mistakes as a dartboard. If you give it one category of "wiggle", it makes 0.25/0.66 ~= 40% as many mistakes as a dartboard.
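For anyone who wants to check the dartboard arithmetic, here is a rough sketch. It assumes the true category is uniformly distributed over the 9 bins and that the dart is too, which the article does not confirm; the function names are made up for illustration. Counting the two edge categories properly (they have only one neighbour each) makes the dartboard slightly worse with "wiggle" than the rough one-in-three guess, so the second ratio comes out a little under 40%.

```python
from fractions import Fraction

N_CATEGORIES = 9          # revenue categories, per the comment above
MODEL_EXACT = 0.37        # reported exact-category hit rate
MODEL_WITHIN_ONE = 0.75   # reported hit rate with one category of "wiggle"


def dartboard_hit_rate(n, tolerance):
    """Chance that a uniform random guess lands within `tolerance`
    categories of the truth, assuming the truth is uniform too."""
    hits = sum(
        1
        for truth in range(n)
        for guess in range(n)
        if abs(truth - guess) <= tolerance
    )
    return Fraction(hits, n * n)


def relative_error(model_hit_rate, baseline_hit_rate):
    """The model's mistakes as a fraction of the baseline's mistakes."""
    return (1 - model_hit_rate) / (1 - float(baseline_hit_rate))


exact = dartboard_hit_rate(N_CATEGORIES, 0)       # 1/9
within_one = dartboard_hit_rate(N_CATEGORIES, 1)  # 25/81

print(f"dartboard exact: {float(exact):.2f}, within one: {float(within_one):.2f}")
print(f"relative error, exact: {relative_error(MODEL_EXACT, exact):.2f}")
print(f"relative error, within one: {relative_error(MODEL_WITHIN_ONE, within_one):.2f}")
```

Under those assumptions the ratios land at roughly 0.71 and 0.36, close to the back-of-envelope numbers above. If real box-office categories are skewed (most films in the low bins), a "always guess the most common bin" baseline would beat the dartboard, and the model would look even less impressive.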
People are making a lot of hay out of this. It tells you that small movies (opening on fewer screens) are very seldom blockbusters, and that heavily promoted movies almost always make at least ten million or so. How is this unexpected? I bet I could get similar predictive power using a SINGLE variable - the promotion budget for each of the films. If it could tell us something actually interesting (or useful to Hollywood types) - like "why are some big budget movies successful while others are not?" - that might be worth something.
Also, the journalist is a nitwit - "North American ticket sales currently total $7.6 million."
The good and new comes from no quarter where it is looked for, and is always something different from what is expected.
Hollywood uses similar metrics for most of their features.
This explains more than anything else why the quality of the majority of movies dropped so fast in the last few years.
None of those parameters can measure (digitally) the quality of the story, quality of acting (note: not popularity of the cast, Pam Anderson is also popular) and quality of the movie anyway.
Hearing from buddies or critics' reviews that a movie is a poorly done mix of popular actors, effects and soft porn, with a dumb-as-sticks scenario stolen from a bunch of action flicks from the past, is the fastest way to put an average moviegoer off seeing it.
This is unbelievable! Awesome-o has thought up 1193 different film ideas. 906 of which star Adam Sandler!
I am scientifically inaccurate.
First of all, if I was only 37% successful at my tasks at work, I would be out the door in a heartbeat. One category off could mean the difference between success or failure.
Where this gets stupid are the advertising, word of mouth, and "fanatic" factors.
First, if a studio thinks a film is going to tank, they won't advertise it and won't push it to as many screens. As a result, fewer people even know the film exists, and even if they do, it is harder to find. I can think of several movies that were awesome films that were just not advertised. I never saw a commercial for Usual Suspects, but saw it after a friend said it was the best movie they had ever seen. If the studio predicts failure, it could be a self-fulfilling prophecy, but I think the age of quick DVD release and peer recommendations is changing this.
That brings me to the second factor - word of mouth. How do you put word of mouth into a formula? Maybe I am in a very small minority, but my interest in a movie goes up significantly if a trusted friend (key point; with others I do the exact opposite of what they say) says it is an AWESOME movie. They rank many movies as good, but very few as awesome. So what is the Awesome determinator? A movie can creep out of nowhere and just keep growing on the word of mouth factor. I admit that this is not a common event, but one that would seem nearly impossible to predict.
Finally, the fanatic factor. Remember where fan comes from. There are certain writers, directors, actors, soundtrack performers, etc. that carry a certain draw all on their own. Joss Whedon could write a movie about a girl who has poo flinging superpowers and tens of thousands of fans would go see it based on his name, but almost all would be inside a tight demographic. 37% sounds about right in this area.
As a final addition, there is the stupidity of Hollywood factor. They make movies based on what movie-goers like. There are fewer movie-goers each year because there is less for movie-goers to like. Why pay $25 for tickets, coke, and popcorn to take the wife to see a movie when I can go the big screen TV, NetFlix, and Newman's Own Microwave Popcorn route? My wife would probably add the "you can't pause the theater movie to go pee" factor, too.
Hollywood responds with stupid formulas like this that let them focus on certain formula films fed to certain demographics and expect a simple equation where you fill in 40 variables and get instant profit. Political and religious discussions aside, the Passion totally breaks the mold. I went with 10 people to see that movie in the opening week and 6 of those people had not been to a theater in years.
The box is getting smaller each year, and each year Hollywood continues to segment the box into what it thinks is the most profitable section, throw its efforts there, and alienate another year's worth of eyeballs out of the box.
My hope is for alternative delivery and an uprooting of the current studio/distribution model. When the fanatics have a mechanism for funding a film or tv series that goes to internet and/or dvd delivery, the whole world changes. There are multiple ways to do this, too. Fans could pre-pay for a season of tv in order to get the dvds as they are made instead of in a boxed set (with no rental/netflix option until the boxed set was out). A film company could put up a bond that they would sell to the fans for a share of the profits.
If you really think JMS is so awesome, how many $50 bonds would you buy? If he sold 100,000 bonds with a 20% of profit share, made the movie for the $5 million, and netted only $30 million on theater, pay-per-view, and dvd, you would still get $60 back for each $50 investment.
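The bond math above checks out if "netted" is read as profit after costs, which is my assumption here. A quick sketch, with hypothetical names:

```python
BOND_PRICE = 50              # dollars per bond
BONDS_SOLD = 100_000
PROFIT_SHARE_PERCENT = 20    # share of profit promised to bondholders
NET_PROFIT = 30_000_000      # assumed to mean profit after all costs


def payout_per_bond(net_profit, share_percent, bonds_sold):
    """Each bond's slice of the bondholders' profit pool, in dollars."""
    return net_profit * share_percent / (100 * bonds_sold)


budget = BOND_PRICE * BONDS_SOLD  # money raised to make the film
payout = payout_per_bond(NET_PROFIT, PROFIT_SHARE_PERCENT, BONDS_SOLD)

print(f"raised ${budget:,}, payout per bond ${payout:.2f}")
```

So the bond sale raises exactly the $5 million budget, and each $50 bond pays back $60. If the $30 million were gross rather than profit, the pool would shrink to 20% of $25 million and the payout would drop to $50, i.e. break-even.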
1. make a db of meta info for already released movies
2. write software that conforms to the already existing stats and "guesses" the income. If it doesn't guess it, tweak until it "guesses" it.
3. pitch it to Hollywood execs by demonstrating it "works" by entering the same movie info you have already tweaked it for
4. profit
Of course, the fact that it has success (well, relatively poor IMO - 37% success? 75% "sort of success"?) with the db of 800 movies is a result of its having been tuned to work for those stats, and there's absolutely no guarantee it'll work for future releases.
Especially since it can't and won't factor in the most important factor: does the movie suck after all or not.
This isn't perfect, because how would Passion of the Christ or Mystic River fit into this algorithm? There were no special effects, both were rated R, and one was in a language that hasn't been spoken for 2000 years. This is the problem with Hollywood today: they think there is a formula for good movies. Good movies are good because they have a good plot, not highly paid actors or special effects out the wazoo.
The main result is that the method (neural net) works a little better than other methods on the same data (Table 4 of paper). It scores 75% in a test; conventional regression scores 71%. As they say in the statistical literature, "big woop"; the fancy new thing is marginally better than the simple old thing.
As for the practical side of things, the main predictive variable is the number of screens on which the film was initially shown. The next-highest predictive variables are a variable representing the use of technical effects and a variable representing the actors' reputation. Well, none of these indicates that this tool (or others discussed in the paper) is of any real use to the industry. The suggested use of the tool is to predict movie success. But the main predictive variables all represent things the industry already knew when the film was being made and promoted. It's like asking a patient if they have a cold, and then charging them to tell them they have a cold.
It's called the Awesome-O 4000:
Um, Ok, how bout this, Adam Sandler, is like in love with some girl, but then it turns out, that the girl is actually a golden retriever, or something.
I hold very few opinions. I hold information based on observation and fact. If you wish to disagree, please use facts.
Most of the factors it uses depend on a human already deciding that a movie's going to be a success. You don't get a star studded cast unless you think it's going to be a hit. You don't spend lavishly on special effects unless you think it's going to be a hit. And distribution size is determined by its commercial potential. When that's already decided, there's not much point to having a computer algorithm say the same thing.
If this were 1990, the title would read "neural network predicts movie success" and the discussion would be about the impending success of strong AI.
Reading TFA, it's impossible to know whether this study has any value without seeing a proper article, as submitted to a reputable stats journal.
First of all this sounds like simple statistical classification with pretty obvious variables. However making classification work is not always trivial.
Methodology is the key here. The sample of 800 movies is rather small, and the details on the chosen explanatory variables are sketchy. With enough variables, even meaningless ones, one can explain anything on a training sample. However, with proper classification techniques, using for example jackknife or cross-validation rather than plain resubstitution, one can find out if the classification model has any actual predictive value.
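To make the training-sample point concrete, here is a toy sketch. It is not the method from the article: it swaps in a 1-nearest-neighbour classifier (chosen because it memorises its training data perfectly) and feeds it pure noise. The 200 rows, 7 features and 9 categories are arbitrary choices of mine, loosely echoing the factors and revenue bins mentioned in the thread.

```python
import random


def nearest_label(query, points, labels, skip=None):
    """Label of the training point closest to `query` (1-nearest-neighbour).
    `skip` is the index of a row to leave out of the search, if any."""
    best_label, best_dist = None, float("inf")
    for i, (point, label) in enumerate(zip(points, labels)):
        if i == skip:
            continue
        dist = sum((a - b) ** 2 for a, b in zip(query, point))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def resubstitution_accuracy(points, labels):
    """Score the model on the very rows it memorised."""
    hits = sum(
        nearest_label(p, points, labels) == y for p, y in zip(points, labels)
    )
    return hits / len(points)


def leave_one_out_accuracy(points, labels):
    """Score each row with that row held out of the training set."""
    hits = sum(
        nearest_label(p, points, labels, skip=i) == y
        for i, (p, y) in enumerate(zip(points, labels))
    )
    return hits / len(points)


rng = random.Random(0)  # fixed seed so the run is repeatable
N_MOVIES, N_FEATURES, N_CATEGORIES = 200, 7, 9

# Pure noise: the features carry no information about the category.
features = [[rng.random() for _ in range(N_FEATURES)] for _ in range(N_MOVIES)]
categories = [rng.randrange(N_CATEGORIES) for _ in range(N_MOVIES)]

resub = resubstitution_accuracy(features, categories)
loo = leave_one_out_accuracy(features, categories)
print(f"resubstitution: {resub:.2f}, leave-one-out: {loo:.2f}")
```

On the rows it has already seen, the model looks perfect, because every row's nearest neighbour is itself. Held-out scoring should hover near chance (about one in nine), which is the honest answer for data with no signal in it. That gap is exactly what a write-up needs to report before anyone can judge the 37% and 75% figures.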
As someone said "anybody can predict the past", and someone else "prediction is rather difficult, especially about the future".
Neural nets are often badly misapplied, but they can hardly be called "quackery". In fact, this is precisely the sort of thing that neural nets are supposed to do: take numerous factors and try to categorize the input based on those factors.
We have an entire industry devoted to figuring out which movies will be most successful, how best to advertise them, how many theaters to release a given movie in, etc. Arguably, this entire industry is less talented at picking winners than a small shell script. If you want to look for quackery, hare-brained theories, etc., you would do well to start by looking there.
You want the truthiness? You can't handle the truthiness!
What would be particularly interesting is to examine the movies it failed on and attempt to understand why.