Cutting Through Data Science Hype
An anonymous reader writes: Data science — or "big data" if you prefer — has evolved into a full-fledged buzzword, thanks to marketing departments around the world. John Foreman writes that part of the marketing blitz has been focused on how fast big data analysis can be. Most companies offering some kind of analytic service try to sell you on how it'll make it easy for you to quickly find and fix the problems with your business. But he points out that good, robust models need a stable set of inputs, and businesses often change far too quickly for any kind of stable prediction. He takes IBM's analytic services as an example, quoting Kevin Hillstrom: "If IBM Watson can find hidden correlations that help your business, then why can't IBM Watson stem a 3 year sales drop at IBM?" Foreman offers some simple advice: "Simple analyses don't require huge models that get blown away when the business changes. ... If your business is currently too chaotic to support a complex model, don't build one."
The term "Big Data" is bullshit, but the concept itself is not. It's statistics, plain and simple. When you have sufficient data available, there is a lot of information and insight that can be obtained from these data.
A perfect example of this is the data available about Mozilla Firefox. Let's start by looking at Firefox's market share today. As we can see, it's only about 10% these days, on both the desktop and mobile platforms. Their mobile presence is particularly embarrassing, as it's much less than even mobile IE's! Even the ancient Android 2.3 browser has more users than Firefox for Android! Even more interesting is how Chrome for Android alone likely has more users than Firefox does in total!
Those browser stats are an example of "Big Data" that's tremendously useful. We can learn a lot about Firefox and its role in the modern world from that data alone. When you're dealing with data sets derived from absolutely massive collections of source data, remarkable observations are possible.
We can also look at Mozilla's own Firefox feedback results. These are very interesting! Over the past 7 days, over 10,000 people have submitted feedback. Across all of the Firefox-branded products, 87% of people report being "sad" with Firefox, while only 13% are "happy" with it! That's a huge gap, even when we consider that angry people are more likely to give feedback than happy people. There are roughly 6.7 times as many people who are sad with Firefox as there are people who are happy with it! We can correlate this feedback data set, which is statistically significant, with the results we derive from the browser market share data set. It becomes obvious that people are leaving Firefox behind because they are unhappy with it. Furthermore, Mozilla should already be aware of this displeasure with Firefox.
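For anyone who wants to check the arithmetic, here is a minimal sketch in Python. The 87% and 13% figures are the ones quoted above. The sample size of exactly 10,000 is an assumption (the feedback page only says "over 10,000"), and the margin-of-error formula assumes a simple random sample, which self-selected feedback is not, so treat that number as a best case rather than a real error bar. The function names are just illustrative.

```python
import math

def sad_to_happy_ratio(sad_pct: float, happy_pct: float) -> float:
    """How many 'sad' respondents there are per 'happy' respondent."""
    return sad_pct / happy_pct

def margin_of_error(p: float, n: int, z: float = 1.96) -> float:
    """Normal-approximation margin of error for a proportion p with n responses.

    Assumes a simple random sample; z = 1.96 corresponds to about 95% confidence.
    """
    return z * math.sqrt(p * (1 - p) / n)

# Figures quoted in the comment above.
ratio = sad_to_happy_ratio(87, 13)

# ASSUMPTION: exactly 10,000 responses; the source only says "over 10,000".
moe = margin_of_error(0.87, 10_000)

print(f"sad:happy ratio = {ratio:.2f}")
print(f"95% margin of error = +/-{moe * 100:.2f} percentage points")
```

With that many responses the sampling error is tiny, so the real question is who chooses to submit feedback, not how many did.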
This is the beauty of statistics at work!
When we consider global data sets consisting of data from thousands or millions or even billions of people, we can see some stunning patterns and results. Clearly Mozilla needs to do a better job of listening to its users. Something is seriously wrong when 87% of them are unhappy with Firefox. The data are there, Mozilla! The results are obvious! Please, act on it! Listen to the users!
>> Catastrophe is a critical factor in most evolutionary history.
> Citation, please.
Wikipedia has a fairly good entry on "Catastrophism", and another on "Punctuated equilibrium". But even without large-scale events such as dinosaur killer asteroids or the evolution of photosynthesis poisoning most species with much higher concentrations of volatile oxygen, there are much smaller and more frequent effects. Forest fires are a critical factor in breeding jack pine trees, floods are vital to the fertility of the ecosystem near river banks, and hurricanes spread species along their paths and profoundly affect the ecology and evolution of areas that are likely to endure hurricanes. And catastrophes can and do create a "founder effect", where a small number of introduced species members become a new species quite quickly in their new environment.
Do I need to find individual links for each of those?