Why Data Is the New Coal (theguardian.com)
An anonymous reader shares a report on The Guardian: "Is data the new oil?" asked proponents of big data back in 2012 in Forbes magazine. By 2016, and the rise of big data's turbo-powered cousin deep learning, we had become more certain: "Data is the new oil," stated Fortune. Amazon's Neil Lawrence has a slightly different analogy: Data, he says, is coal. Not coal today, though, but coal in the early days of the 18th century, when Thomas Newcomen invented the steam engine. A Devonian ironmonger, Newcomen built his device to pump water out of the south west's prolific tin mines. The problem, as Lawrence told the Re-Work conference on Deep Learning in London, was that the pump was rather more useful to those who had a lot of coal than those who didn't: it was good, but not good enough to buy coal in to run it. That was so true that the first of Newcomen's steam engines wasn't built in a tin mine, but in coal works near Dudley. So why is data coal? The problem is similar: there are a lot of Newcomens in the world of deep learning. Startups like London's Magic Pony and SwiftKey are coming up with revolutionary new ways to train machines to do impressive feats of cognition, from reconstructing facial data from grainy images to learning the writing style of an individual user to better predict which word they are going to type in a sentence.
I thought data was Oreos!?
what?
buy coal in to run it.???
The problem is similar: blahblahblah
Nonsense!
Especially when you think about who the coal miners are, who owns the coal, and who the "coal" is.
God. I'm sure this was written by a hipster. With thick black glasses and a muslim-style beard. The whole nine yards, with bicycle and record player.
If you can't come up with a proper automotive analogy for a technology, get off my fucking Internet.
You are welcome on my lawn.
But the internet is pipes. Surely if data is coal it should be conveyor belts.
In other words, it's beyond true.
Keep your goddamned 'deep learning' machines out of my -- and everyone else's -- business, or else.
I hear rumblings about an upcoming 'civil war' or even a 'race war', but I think the next 'civil war' will not be over race or ethnicity, I think it'll be over our private lives being infiltrated and violated by so-called 'big data' assholes and their asshole machines, sticking their silicon-based noses into things that aren't any of their goddamned business.
I love the smell of datacenters burning in the morning!
"Data is coal, not oil."
You sound like a moron.
Sometimes things do not fit into your analogies. No matter how hard you try to force it.
I'm a good cook. I'm a fantastic eater. - Steven Brust
Am I the only one who found this to be gibberish that makes zero sense? Seriously what is this supposed to mean?
omigod omigod THIS is the NEXT BIG THING !! Don't MISS OUT in this ONCE IN A LIFETIME opportunity to get in on the GROUND FLOOR of the biggest thing since COAL !!!!
cc: Mr Bulschiter and Associates,
As per our agreement, Oh-pinion Makerz has placed 200 stories on social media designed to ignite the hype cycle for your company's offerings. This fulfills the terms of our contract.
Please send fee to our offshore associates.
Thank you.
It goes down smooth and tastes great but if you get more than you can handle you end up driving your car through a house.
But it turns out that data is more like oranges: once you strip away the rough outer layers you've got something wonderful but you still have to divide it up or squeeze it.
But now we know that data is the new web article: if you do a lot of digging you can find nuggets of vital importance in even the blandest drivel!
But just hear me out: data is the new joke format. It gets old fast and you immediately want something new and relevant that will give you an edge over the competition.
Man, he was old.
Coal was a new energy source - a way to replace human and animal labor with machine labor. This resulted in huge productivity gains (measured in productivity per person - productivity per Joule expended actually went down because coal energy was so much cheaper than human labor meaning inefficient machines could still be cheaper). The MO was dirt simple - take anything that used to require people or animals to expend effort to do, make a machine to do it, and power the machine with coal.
Data is just data. Aside from a few data-processing tasks which have already been automated (OCR, statistical analysis), there is no dirt simple way to use data to reduce human labor. You can eke out a small productivity gain by using it to improve the efficiency of marketing (e.g. don't show bra ads to men), but that's pretty much it. The productivity gain is what's necessary to make it "better" than previous ways of doing things. Improvements in economic efficiency show up as productivity gains.
Popularity is one way (probably the best way) to leverage data. You can use it to determine what's popular and position the marketing of your products in that direction. But that's a zero-sum game. Any increased sales you gain because you marketed your products better directly reduces sales of other competing products. This is totally different from coal (and oil) which enabled new methods of production, and thus weren't zero-sum.
After sitting through many code reviews, I can tell you it's not clean coal.
It must have been something you assimilated. . . .
Just like coal, my data is polluted. I deliberately try to pass as much misinformation as I can to companies that collect my data. Obviously a lot of it I can't.
Part of it is for self-protection and privacy- and part of it is because it amuses me and I have a weird sense of humor.
"That's the way to do it" - Punch
For the first time since I can remember, TFA was actually written more poorly than TFS. Of course, that wasn't too hard; TFS only contained one paragraph from the article, while TFA itself went on and on and on in a meandering, fuzzy-headed, buzz-word-filled fashion that said nothing and went nowhere. As a bonus, that 'coal' metaphor seems to have come straight from a cannabis-induced moment of "enlightenment".
'The Economy' is a giant Ponzi scheme whose most pitiable suckers are the youngest among us and the yet-unborn.
Just like the dotcom bubble, there are entire companies whose fate hinges on massive uptake of the "big data" and "deep learning" revolutions. And just like the hype cycles from the last bubble, there's some truth to them but people really take it to an extreme to get headlines and clicks. I think when the bubble pops, there will be plenty of "real" big data problems for serious qualified people to solve, as well as legions of unemployed "data scientists" and "cognitive champions."
I think applying data analysis techniques to societal problems (emergency response, environmental issues, etc.) is a good thing. I don't think the current focus on ever more intrusive advertising and behavior analysis is going to add much value in the long run. This isn't a tinfoil-hat style rejection of tracking, it's my belief that even the dumbest of consumers are going to reach a point where they can't stand having ads shoved in their face anymore and demand that it stop. Ever notice how commerce sites email you when you put an item in your cart, then don't buy it? Lots of sites have at least buried a setting somewhere in their account configs that lets people turn this off. No one ever went broke overestimating the stupidity of the average consumer, but pushing things on every channel (phone, computer, tablet, streaming ads, browser ads, etc.) will lead to consumer fatigue.
Aside from a few data-processing tasks which have already been automated (OCR, statistical analysis), there is no dirt simple way to use data to reduce human labor.
Complete nonsense. Computers are how you use data to reduce human labor, and tons of tasks have been automated. To take the analogy further, oil by itself is useless. You need a machine to do something useful with it. Data is the same way. By itself it is comparatively useless, but with a computer you can do a lot to reduce human labor. For example, CAD, bookkeeping, and inventory are all data processing tasks which substantially reduce human labor with the help of data.
Okay, if big data is the new coal, we should stop using it now because although it is currently cheap and plentiful with apparently many applications, we know eventually it will lead us to the collapse of civilization.
Maintaining access to big-data will eventually cause political conflicts and maybe even wars, and continuing unrestrained usage of big data will eventually cause inconvenient problems in our daily lives that will make our world unliveable and our society unsustainable. The money exploited by the early adopters in the big-data industrial complex will dominate the political landscape and prevent us from doing anything about constraining this monster until it is too late.
If you could have put a cap on companies like the Peabody coal company back in the early days, you wouldn't ever hear statements like this today from coal company analysts...
“We have never seen leases of more than a billion tonnes and we are starting to see that under the Obama Administration.”
If the Obama administration's department of Interior can be bought-and-sold by a coal company with annual revenues of only $5B, what hope do governments have against big-data companies with annual revenues of $74B?
Any other coal analogies people would like to make about big data?
You mean mindlessly repeated buzzwords like "pivot point"?
-- You are in a maze of little, twisty passages, all different... --
to justify hitting it with new deep learning/NN technology. Also, most of the old data has been "mined out" with standard statistical techniques.
Oh, they can do the exercise: chase phantoms, declare that they "found" something new (the same old trends previously found again), with the very occasional true discovery. But by and large, 15 minutes after the data is released, everything we already know will be confirmed and anything new we can know or will know will be revealed.
Nothing here, Folks. Move along please.
Ignoring the grammatical errors and typos, I assume the closer was supposed to be something like, "But right now, the only organizations actually using deep learning techniques are the ones who produce the big data in the first place, and they are using it for their own purposes. We have not yet reached the point where big data or deep learning are being commercialized by third parties."
Pivot point also means dick. How can it be a buzzword? Oh I get it, all buzzwords actually mean dick.
Worst analogy ever.
When they came for the communists, I said "He's next door. Take him away. Goddam commies."
When I saw "coal" on my Buzzword Bingo sheet, I thought I was screwed!
But someone renamed something old and they named it coal and I won!
America, fuck yeah!
The summary should have included the next paragraph of the article to make any sense:
And yet, like Newcomen, their innovations are so much more useful to the people who actually have copious amounts of raw material to work from. And so Magic Pony is acquired by Twitter, SwiftKey is acquired by Microsoft – and Lawrence himself gets hired by Amazon from the University of Sheffield, where he was based until three weeks ago.
Or else the poster should have just written a terse summary and not just cut and paste paragraphs. Yeah I know, this is slash-dot...
Second class citizen of the New Gilded Age
The good news is that Peak Data lies just ahead, predicted for early next week. Things should settle down from there on...
In the humanities there has been a lot of research into the many metaphors used for data, and how they affect society's view of data-hungry operations:
http://dismagazine.com/discussion/73298/sara-m-watson-metaphors-of-big-data/
Personally I think both oil and coal are more useful metaphors than gold, because in contrast to gold, the public understands these as having both up- and downsides. The problem with the gold metaphor is that it doesn't point to the serious problems data gathering can lead to.
I worry that the pumping of data will create a lot of chilling effects and self-censorship. The widespread use of oil led to global warming. I think the widespread use of data could lead to "social cooling".