'Data Science' Is Dead
Nerval's Lobster writes "If you're going to make up a cool-sounding job title for yourself, 'Data Scientist' seems to fit the bill. When you put 'Data Scientist' on your resume, recruiters perk up, don't they? Go to the Strata conference and look on the jobs board — every company wants to hire Data Scientists. Time to jump aboard that bandwagon, right? Wrong, argues Miko Matsumura in a new column. 'Not only is Data Science not a science, it's not even a good job prospect,' he writes. 'Companies continue to burn millions of dollars to collect and gamely pick through the data under respective roofs. What's the time-to-value of the average "Big Data" project? How about "Never?"' After the 'Big Data' buzz cools a bit, he argues, it will be clear to everyone that 'Data Science' is dead and the job function of 'Data Scientist' will have jumped the shark."
Call yourself a statistician or database engineer and I promise there are still jobs around. And contrary to the summary, they are highly valuable jobs.
Some people die at 25 and aren't buried until 75. -Benjamin Franklin
But this general domain in the realm of contemporary giant data sets is the basic science research of our times. To say that 'data scientist' roles are dead in the near future based on a ROI analysis is to suggest that all these huge data sets aren't likely to pay off for a corp in the near future. And that doesn't sound right at all.
That's a very strong claim, I'll need to consult my Data Scientist to see if it actually fits the data.
No data has been cited during the creation of that blog post.
Opinion is fine, but when the observations are so sweeping, just a little bit of substantiation is nice to have.
How to prevent more people from flocking into your field:
1) Write a Slashdot article
2) ???
3) Profit!
In soviet russia the government regulates the companies.
The author of this piece clearly has never done actual science, as confirmed by his resume, and his opinions on what science is, including the idea that some observational sciences are somehow "soft", are very questionable at best.
In my career I have worked for boring banks and boring monolithic enterprise software giants.
If there is one thing I know for certain, it is that big enterprise will ALWAYS have a huge appetite for quantification of data. It almost doesn't matter if it actually does anything for you: executives at giant corporations have to DO SOMETHING, have to REVIEW SOMETHING. Large scale data aggregation and reporting (one of the many things that go by "big data") might not have sciency uses, but any time a V level can provide a C level with "something" that says "We are doing stuff", there will be a huge market for it.
Basically what I am saying is, even if "Big Data" is nothing but a placebo, like say "HR Training", "Wellness programs", "performance reviews" or "teambuilding" it is a permanent fixture in the big, boring, high paying, stable job providing corporate world.
Unfortunately, unless this is structured data, you will be subjected to the data equivalent of dumpster diving. But surfacing insight from a rotting pile of enterprise data is a ghastly process—at best.
Sounds like this Miko Matsumura has no idea how successful Big Data projects actually work.
To refine his analogy, unstructured data is much like processing recyclables. Everything that might possibly be good gets thrown into a large bin, and several sorting processes run to extract individual relevant (though messy) pieces. While those pieces alone aren't pure enough to be useful, there's enough meaningful information in them that statistical analysis can separate the good from the bad, and that's where the insight comes from.
With a typical RDBMS, insight is readily apparent. A hypothesis that 75% of a user's purchases were widgets is simple to verify. In a non-relational database, as is often used in Big Data projects, that would be an inefficient computation (though it can be done). Rather, those databases are more aligned to produce a whole list of correlations between user demographics and purchasing habits, showing for example that users who buy widgets have often already bought foo bars. The "Data Scientist" didn't have to ever look specifically at statistics for widgets or foo bars, but the correlation is presented in a nice and accessible form, gleaned from millions or billions of independent data points.
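To make the two kinds of question above concrete, here is a minimal sketch in plain Python. The purchase log, user names and both helper functions are made up for illustration; nothing here is a real Big Data stack, and it only shows the difference between checking one targeted hypothesis and listing co-purchase counts for every pair of items at once.

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase log of (user, item) pairs. All data is invented.
purchases = [
    ("alice", "widget"), ("alice", "widget"), ("alice", "widget"), ("alice", "foo bar"),
    ("bob", "foo bar"), ("bob", "widget"),
    ("carol", "gizmo"), ("carol", "foo bar"), ("carol", "widget"),
    ("dave", "gizmo"),
]

def item_share(log, user, item):
    """Targeted question: what fraction of one user's purchases were `item`?"""
    items = [i for u, i in log if u == user]
    return items.count(item) / len(items) if items else 0.0

def co_purchase_counts(log):
    """Broad question: for every pair of items, how many users bought both?"""
    baskets = {}
    for user, item in log:
        baskets.setdefault(user, set()).add(item)
    pairs = Counter()
    for items in baskets.values():
        # Sort so each pair is always counted under the same key.
        for pair in combinations(sorted(items), 2):
            pairs[pair] += 1
    return pairs

share = item_share(purchases, "alice", "widget")
pairs = co_purchase_counts(purchases)
print(share)
print(pairs.most_common(3))
```

The first function needs you to already know which user and item you care about. The second never mentions widgets or foo bars, yet that pair comes out on top simply because it shows up in the most baskets.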
Miko Matsumura is a Vice President at Hazelcast, an open source in-memory data grid company.
This is a SlashBI article written by executives for executives, with little basis in fact. Lovely.
You do not have a moral or legal right to do absolutely anything you want.
Since "Data Science" is dead, do we go back to using the old buzzwords? Or do we have to wait until some marketing MBA whiz-kid comes up with a sexy new word for "Analyst"?
90% of what a data science expert does is what people like to call data jujitsu (data reconfiguration). That basically means getting data out of your RDBMSs, SAP, Twitter, Facebook, random text file dumps (.csv, etc.), random Excel/Word files and legacy databases, and into some place you can actually generate conclusions from (like inside an HDFS/Hadoop cluster). Plus, during this process you need to normalize all your data so you can apply the same algorithm no matter where the data came from.
All this means is that you will spend countless hours trying to connect to the client legacy stuff and then countless hours trying to get the data out (without impacting production systems!), so you can then spend countless hours formatting this data around to be able to spend countless hours trying to get this data into your Big Data(tm) solution so you can finally run some algorithms and create results. Now multiply all that by the number of different kinds of databases the client has and you get the idea.
As an IT professional you really do not want to work in this field. No organization keeps its data in a clean, uniform way; a data scientist is like an IT janitor.
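For a taste of the normalization step described above, here is a small sketch using only the Python standard library. The two "dumps", their field names and the target schema are all invented for illustration; real client systems will be messier, and this is in no way a complete ETL pipeline.

```python
import csv
import io
import json

# Hypothetical exports from two different systems. Field names and values are made up.
CSV_DUMP = "CustomerID,Full Name,Signup\n17, Ada Lovelace ,2013-01-05\n"
JSON_DUMP = '[{"id": "42", "name": "grace hopper", "signup_date": "2013/02/10"}]'

def normalize(customer_id, name, signup):
    """Map one raw record onto a single shared schema (an assumed one)."""
    return {
        "customer_id": int(customer_id),
        "name": " ".join(name.split()).title(),      # trim and collapse whitespace, fix casing
        "signup": signup.strip().replace("/", "-"),  # unify the date separator
    }

def from_csv(text):
    """Read the CSV export and rename its columns to the shared schema."""
    reader = csv.DictReader(io.StringIO(text))
    return [normalize(r["CustomerID"], r["Full Name"], r["Signup"]) for r in reader]

def from_json(text):
    """Read the JSON export and rename its keys to the shared schema."""
    return [normalize(r["id"], r["name"], r["signup_date"]) for r in json.loads(text)]

# Once both sources share a schema, the same algorithm can run over all records.
records = from_csv(CSV_DUMP) + from_json(JSON_DUMP)
for record in records:
    print(record)
```

Every new source needs its own small adapter like `from_csv` or `from_json`, which is exactly where the countless hours go.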
no pipe dream, my employer has those people, making big money
and what's this nonsense and misconceptions about resumes you have between your ears?
in my career, I've held engineering, science and IT positions. I have an "IT-flavored" version of my resume for when I'm seeking an IT-focused job, an "engineering-flavored" one, etc. All the resumes are true, with no inaccuracies, and all experience can be verified by contacting previous employers or talking to former coworkers or references. So the point is, of course you can have different versions of your resume with different focus on duties and skills.
"Science" lacks a robust definition, but clearly the OP's definition is overly simplistic and narrow. Stephen Hawking has a lecture somewhere (found it: http://www.hawking.org.uk/the-origin-of-the-universe.html) where he talks about the idea of the "positivist" approach defined on the ability to predict outcomes, and I like to apply that definition to Science (Hawking doesn't, directly, but it's sort of an underlying theme). That is, Science becomes the observation and experimentation required to form predictions or cause changes in predicted outcomes.
So Social Science can be a science insofar as it actually informs usefully on how people will behave or provides useful ways to affect and improve the behavior or state of society's future. Computer science is a science insofar as it is required to make computers function as expected (as predicted) -- if you want something to perform faster, you must do the research and experimentation to cause the outcome to be faster. Even archaeology can be a science by this definition, in that discoveries are added to a general model of the past that predicts all sorts of things -- ancient society's behavior, glaciation, geological events... "predict" may be a stretch there (except when archaeological finds help predict the future), but in this case the method of building a model of how the world worked based on observation to describe and generalize behavior (of the earth, of ancient religions, or what have you) is a form of prediction; it's just after the fact.
Data Science is very much science in this form; the job of a data scientist is almost universally to predict what the data will say about the future given what it has said in the past. This is invaluable to businesses, and while the name may fall into disfavor, in the same way "actuary" (which means something very similar) already has, the abuse in this article is unwarranted, unfounded, and inaccurate. I will only agree that many who sport the "Data Science" moniker may not actually be doing science by any definition, but that's the individual's fault, not the concept's.
You should have a different resume for each job you apply for......
Keep a master resume with all of your details. When you apply for a job, copy the master and pare it down to the information most relevant to the job you are applying for. Then edit the result so that you look like the perfect candidate.....update project descriptions to emphasize the same buzzwords in the job listing, etc.
If you only have one resume that you blast out to a ton of different job listings, they'll probably focus on the projects that are meaningless to their situation and weed you out more often than not.
Oh, and BI (business intelligence) is still going strong at the company I work for......(not my area, but there's still tons of work under that label).