IT's Next Hot Job: Hadoop Guru
gManZboy writes "JPMorgan Chase and other companies at this year's Hadoop World conference came begging for job applicants: They say they can't find enough IT pros with certain skills, including Hadoop MapReduce. That spells high pay. As for Hadoop's staying power as a career path (a la SQL 30 years ago), IBM, Microsoft and Oracle have all embraced Hadoop this year. Maybe the best news of all: 'Intelligent technologists will pick up Hadoop very quickly.'"
After all, every other framework of the month has lasted for 30 years. Hadoop will have at least as much staying power as Ruby on Rails!
The trick is going to be getting the appropriate experience without having learned it on the job already.
Yes, it can be done. However, this technology is geared towards environments with lots of nodes in big clusters (which can run Linux). That's not the same as simply learning a language.
I got a job utilizing a "Big Data" database technology by being at the right place at the right time, when this technology was being rolled out. It's also hard to find people with that specialized experience.
So I would suggest to companies: hire people and train them. If you can't find someone with the specific skill set, just get quality people.
Except for ending slavery, the Nazis, communism, & securing American independence, war has never solved anything.
If you want a strong userbase, projects with good, easy-to-use learning resources do better. When you hit the Hadoop main page, they tell you what it is, but not what you need to know in order to use it. They don't tell you what languages it supports. They give no examples of usage. Essentially, they don't do you any favours.
I spent some time trying to implement some nice free tools from IBM and Apache. I found I needed to download X and do a build of it, but halfway through it wanted Y to complete the build. OK... So I go find Y and try doing a build on it, but need something else from Apache, which doesn't like the version of Apache I'm running. So I get the other Apache thing and find I can't get it to start up. I go research it and find conflicting and incomplete information all over the web. I throw in the towel.
What's needed is one source of information and clear instructions for a basic, default build of a platform. Once that is reliable, then document ways to add foo and bar or even plugh if it suits you.
A feeling of having made the same mistake before: Deja Foobar
If I attend some public talk on a trendy subject, it's swarming with recruiters. Topics include NoSQL, HTML5, mobile, etc. There seem to be at least ten job openings for everyone looking for something.
If I were a recruiter, I would automatically be wary of anyone who seriously refers to themselves as a "guru" of $language. Sure, you may be good at writing code and may know a particular library inside out, but anyone who calls themselves a guru probably has a very overinflated sense of their importance and actual skill level. These also tend to be the people who have the right buzzwords to get past HR filters and then proceed to bullshit their way through interviews.
"It is a denial of justice not to stretch out a helping hand to the fallen; that is the common right of humanity."
Java is one of the most inefficient languages ever? I take it you've never programmed in ruby, python, perl, etc. IIRC, Java benchmarks have shown it outpacing everything except for C/C++, FORTRAN and OCaml.
On first execution (and compile) it's slow. On first creation of an instance it is slow. After that Java makes up for itself rather nicely. If well implemented it's a great way to go, though I wouldn't choose it for my 3D rendering or reconciling a fiscal year's worth of journal entries. It's not that kind of language.
Drink the Maven Kool-Aid, and your worries will be behind you.
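To use Hadoop, put this dependency:
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-core</artifactId>
      <version>0.20.205.0</version>
    </dependency>
in your pom.xml.
Then write two classes like these: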
class MyMap extends MapReduceBase implements Mapper<K1, V1, K2, V2 >...
class MyReduce extends MapReduceBase implements Reducer<K2, V2, K3, V3>...
Feed instances of those to a JobConf and feed that instance to a JobClient.
The rest should be obvious to a seasoned programmer, just by looking at the nomenclature of the class hierarchy.
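For anyone who wants to see the shape of the computation before touching a cluster, here is a rough sketch in plain Java. It does not use the Hadoop API at all: WordCountSketch, its map and reduce methods, and the run driver are made-up names for illustration, and the grouping loop in run only stands in for the shuffle the framework would do between the two phases.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, Hadoop-free illustration of the map -> group -> reduce flow.
public class WordCountSketch {

    // Map step: one input line in, zero or more (word, 1) pairs out.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                out.add(new AbstractMap.SimpleEntry<>(word, 1));
            }
        }
        return out;
    }

    // Reduce step: one key plus every value emitted for it in, one total out.
    static int reduce(String key, List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    // Driver: groups mapped pairs by key, then reduces each group.
    // In a real job the framework does this grouping across machines.
    static Map<String, Integer> run(List<String> lines) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            for (Map.Entry<String, Integer> kv : map(line)) {
                grouped.computeIfAbsent(kv.getKey(), k -> new ArrayList<>())
                       .add(kv.getValue());
            }
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            result.put(e.getKey(), reduce(e.getKey(), e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        // prints {be=3, not=1, or=1, quick=1, to=2}
        System.out.println(run(Arrays.asList("to be or not to be", "be quick")));
    }
}
```

The real Mapper and Reducer do the same two jobs, except that the pairs go through an output collector and the grouping happens on the cluster rather than in a local TreeMap.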
The great Ward Cunningham is right: put two days into studying something and you are already halfway to expert.
Jehovah be praised, Oracle was not selected
The thing I always wonder about Hadoop is how important can it get? It's only useful if you have too much data for an RDBMS, right? It seems like only JPMorgan and other giant companies could make use of it. Am I wrong?
There's no such thing as too much data for an RDBMS.
There is such a thing as poor database planning and a shitty schema, though.
SQL is a query language, not a database implementation technology. In the future Hadoop-style engines will probably be wrapped by SQL such that it will be an implementation detail or choice, similar to the MyISAM versus InnoDB choice in MySQL.
I'm not saying this will make it a non-career, only that the career will morph to be more like that of an Oracle tuning specialist (who make good money still).
Table-ized A.I.
Sounds like the way I got my first Linux-based job in '95, except I used newsgroups instead of Wikipedia.
God invented whiskey so the Irish would not rule the world.
I spent some time trying to implement some nice free tools from IBM and Apache. I found I needed to download X and do a build of it, but halfway through it wanted Y to complete the build. OK... So I go find Y and try doing a build on it, but need something else from Apache, which doesn't like the version of Apache I'm running. So I get the other Apache thing and find I can't get it to start up. I go research it and find conflicting and incomplete information all over the web. I throw in the towel.
What's needed is one source of information and clear instructions for a basic, default build of a platform. Once that is reliable, then document ways to add foo and bar or even plugh if it suits you.
Sounds like IBM all right. They make some decent products sometimes. I'm fairly certain that other times they go out of their way to make things a pain in the ass to use. Maybe it's supposed to be a joke on the rest of the world?