"Long Tail Effect" Doesn't Work As Advertised, Say Wharton Researchers
Death Metal writes "In a working paper titled, 'Is Tom Cruise Threatened? Using Netflix Prize Data to Examine the Long Tail of Electronic Commerce,' Wharton Operations and Information Management professor Serguei Netessine and doctoral student Tom F. Tan pull information from the movie rental company Netflix to explore consumer demand for smash hits and lesser-known films. Netflix made its data available as part of a $1 million prize competition to encourage the development of new ways that will improve its ability to introduce customers to lesser-known titles they might find appealing." In short, the researchers say that the Long Tail effect described by Chris Anderson is much less important in the real world than popularly held. Says the article: "The key difference between the opinion of [Anderson's] book and the study by Wharton researchers is how they define 'hits' and 'niches.' In the book, Anderson focuses on the definition of hits in absolute terms such as the top 10 or top 1,000 products, while Netessine and Tan argue that, to take growing product variety into account, one has to define popularity in relative terms, such as the top 1% or top 10% of products, to properly assess the presence or absence of the Long Tail."
The working title of the paper is misleading, since there is no mention of a "threat", an "effect", or anything of that sort. In fact, all I got out of it was that they were debating something rather trivial and inconsequential: the definition of a "hit" in a statistical model, and why using "top 1000" or so is improper based on the Netflix data.
To be honest, this isn't really "news" worthy of a front page listing.
Then the 80/20-rule is just a good rule of thumb.
If we have a simple hyperbolic distribution (which is a special case of Pareto), then adding more elements to the set and waiting for the distribution to renormalize as hyperbolic increases the relative weight of the top 20%. So if you have a big online retailer like Amazon with more titles than a conventional bookstore, you can expect the top 20% of sellers on Amazon to generate a bigger share of Amazon's total sales than the top 20% of a bookstore's titles generate of that bookstore's total sales.
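A quick way to check this is a rough sketch that assumes sales fall off exactly as 1/rank (a pure hyperbolic curve, which real catalogs only approximate). The function name `top_share` and the catalog sizes are made up for illustration and are not taken from the Wharton paper.

```python
def top_share(n_titles, top_fraction=0.2):
    """Share of total sales captured by the top `top_fraction` of titles,
    assuming (hypothetically) that the title at rank r sells in proportion to 1/r."""
    sales = [1.0 / rank for rank in range(1, n_titles + 1)]
    top_n = max(1, round(n_titles * top_fraction))
    return sum(sales[:top_n]) / sum(sales)


# Illustrative catalog sizes only: a small shop, a big store, a huge online catalog.
for n in (1_000, 100_000, 1_000_000):
    print(n, round(top_share(n), 3))
```

Under this assumption the share held by the top 20% keeps creeping up as the catalog grows, which is why a relative cutoff (top 20%) and an absolute cutoff (top 1,000) can tell opposite stories about the same data.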
affect.
Kill all hipsters.
The crux of Anderson's argument is that, while Amazon et al. have the same demand for big-name titles, their tail is longer and higher than a traditional bookstore's. By defining a cut at a certain point (say, titles with less than 5% of the peak sales, those outside the top 10%, or whatever is appropriate), it can be seen that the low-volume sales represent a larger fraction of the total sales due to the extreme length of the tail.
Quoting from the Wikipedia article on the topic:
In the graph shown above, Amazon's book sales or Netflix's movie rentals would be represented along the vertical axis, while the book or movie ranks are along the horizontal axis. The total volume of low popularity items exceeds the volume of high popularity items.
Anderson was suggesting that, in the limit of infinite items to sell and negligible stocking costs, much more profit is to be derived from the large number of items that sell a few copies than from the few items that sell many copies.
Indeed, it went further than that, suggesting that as people got used to having more choice, they would begin to shun the "popular" items in favour of more obscure titles, further fattening the tail. But that's even more speculative and somewhat independent of the other economic predictions.
I'm picturing the demand curve as an exponential, shifted so that it intercepts both the x and y axes. There's a lot of demand for the most popular items, and declining demand for less and less popular ones. By definition, of course, but the shape of the curve matters. No matter how far out you go, there's always somebody who'll want it (given a large enough population).
For a traditional bookstore, the x axis hits the curve pretty high. There's a substantial cost to stock each book, say $2.00/year. There's also a fairly small local demand, say 200 copies a week for a John Grisham novel. Only a few thousand titles sell fast enough to make a profit before that $2.00/year eats up the sale price minus wholesale price.
For a mail-order/online bookstore, the cost to stock each book is lower since you only need a warehouse instead of reading stacks + comfy chairs + cashiers + parking. The cost to stock each book could drop to $0.50/year. The demand is now national, so that same John Grisham novel sells 20,000 copies a week. And a title that sold once a year in a traditional store now sells twice a week. So, many more titles can beat the clock and turn a profit.
The shape of demand didn't change. In both cases it's an exponential cut off at the point of profitability. But that point is now much farther out along the x axis. So the online retailer can make money selling stuff that would never survive in a traditional store. And customers can find stuff online that they'd be lucky to ever see locally.
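To put rough numbers on that, here is a minimal sketch of the cutoff, assuming demand at rank r is peak * exp(-r / decay_rank) and a title is worth stocking when its yearly margin beats its yearly stocking cost. The $5 margin per copy and the decay scale of 1,000 ranks are pure guesses for illustration; only the stocking costs and peak weekly sales come from the figures above.

```python
import math


def profitable_titles(peak_weekly_sales, stocking_cost_per_year,
                      margin_per_copy=5.0, decay_rank=1000.0):
    """Number of titles that earn more per year than they cost to stock.

    Assumes weekly demand at rank r is peak_weekly_sales * exp(-r / decay_rank).
    margin_per_copy and decay_rank are hypothetical values, not from the post.
    """
    peak_yearly_margin = peak_weekly_sales * 52 * margin_per_copy
    if peak_yearly_margin <= stocking_cost_per_year:
        return 0  # even the best seller cannot cover its shelf space
    # Solve peak_yearly_margin * exp(-r / decay_rank) = stocking_cost_per_year for r.
    return int(decay_rank * math.log(peak_yearly_margin / stocking_cost_per_year))


# Same demand shape for both; only the peak and the stocking cost differ.
local_store = profitable_titles(peak_weekly_sales=200, stocking_cost_per_year=2.00)
online_store = profitable_titles(peak_weekly_sales=20_000, stocking_cost_per_year=0.50)
print(local_store, online_store)
```

With a strictly exponential curve the cutoff only moves out logarithmically, so how much the profitable catalog grows depends heavily on the shape you assume for the tail.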