Slashdot Mirror


IT's Next Hot Job: Hadoop Guru

gManZboy writes "JPMorgan Chase and other companies at this year's Hadoop World conference came begging for job applicants: They say they can't find enough IT pros with certain skills, including Hadoop MapReduce. That spells high pay. As for Hadoop's staying power as a career path (a la SQL 30 years ago), IBM, Microsoft and Oracle have all embraced Hadoop this year. Maybe the best news of all: 'Intelligent technologists will pick up Hadoop very quickly.'"

6 of 112 comments

  1. Getting the Experience by geoffrobinson · · Score: 5, Informative

    The trick is going to be getting the appropriate experience without having learned it on the job already.

    Yes, it can be done. However, this technology is geared towards environments with lots of nodes in big clusters (which can run Linux). That's not the same as simply learning a language.

    I got a job utilizing a "Big Data" database technology by being at the right place at the right time, when this technology was being rolled out. It's also hard to find people with that specialized experience.

    So I would suggest to companies: hire people and train them. Just get quality people if you can't find someone with the specific skill set.

    --
    Except for ending slavery, the Nazis, communism, & securing American independence, war has never solved anything.
  2. Re:Bad learning resources by ackthpt · · Score: 4, Informative

    If you want a strong userbase, projects with good, easy to use learning resources do better. When you hit the hadoop main page, they tell you what it is, but not what you need to know in order to use it. They don't tell you what languages it supports. They give no examples of usage. Essentially, they don't do you any favours.

    I spent some time trying to implement some nice free tools from IBM and Apache. I found I needed to download X and do a build of it, but halfway through it wanted Y to complete the build. OK... So I go find Y and try doing a build on it, but need something else from Apache, which doesn't like the version of Apache I'm running. So I get the other Apache thing and find I can't get it to start up. I go research it and find conflicting and incomplete information all over the web. I throw in the towel.

    The one thing needed is a single source of information with clear instructions for a basic, default build of a platform. Once that is reliable, then document ways to add foo and bar or even plugh if it suits you.

    --

    A feeling of having made the same mistake before: Deja Foobar
  3. Re:Right. by ackthpt · · Score: 4, Interesting

    Java is one of the most inefficient languages ever? I take it you've never programmed in ruby, python, perl, etc. IIRC, Java benchmarks have shown it outpacing everything except for C/C++, FORTRAN and OCaml.

    On first execution (and compile) it's slow. On first creation of an instance it is slow. After that Java makes up for itself rather nicely. If well implemented it's a great way to go, though I wouldn't choose it for my 3D rendering or reconciling a fiscal year's worth of journal entries; it's not that kind of language.
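
    The slow-first-run, fast-afterwards pattern described above can be observed with a small timing loop. This is only a rough sketch: the class name WarmupSketch and the workload are made up for illustration, the speed-up is commonly attributed to class loading and JIT compilation, and the actual numbers depend entirely on the JVM and the machine.

```java
// Hypothetical micro-timing sketch; not a rigorous benchmark.
class WarmupSketch {

    // A small CPU-bound workload: sum of i*i for i in [0, n).
    static long sumOfSquares(int n) {
        long total = 0;
        for (int i = 0; i < n; i++) {
            total += (long) i * i;
        }
        return total;
    }

    // Times one call of the workload and returns elapsed nanoseconds.
    static long timeNanos(int n) {
        long start = System.nanoTime();
        long result = sumOfSquares(n);
        long elapsed = System.nanoTime() - start;
        // Use the result so the call cannot be discarded as dead code.
        if (result < 0) {
            throw new IllegalStateException("unexpected overflow");
        }
        return elapsed;
    }

    public static void main(String[] args) {
        // Early rounds typically take longer than later ones on a warm JVM.
        for (int round = 1; round <= 10; round++) {
            System.out.println("round " + round + ": " + timeNanos(5_000_000) + " ns");
        }
    }
}
```

    For anything beyond a quick look, a proper benchmarking harness is a better choice than hand-rolled timing like this.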

    --

    A feeling of having made the same mistake before: Deja Foobar
  4. Re:"Gurus" need not apply by ackthpt · · Score: 4, Funny

    If I were a recruiter, I would automatically be wary of anyone who seriously refers to themselves as a "guru" of $language. Sure, you may be good at writing code and may know a particular library inside out, but anyone who calls themselves a guru probably has a very overinflated sense of their importance and actual skill level. These also tend to be the people who have the right buzzwords to get past HR filters and then proceed to bullshit their way through interviews.

    "It says in your resume you were part of the initial development team and wrote one of the first reference books on $language."

    "That is correct, I was also part of a team which worked to ensure cross-platform consistency and stability. I've also written tutorials in $language and developed several application examples which are included in the reference website."

    "Anything else you'd like to add?"

    "I also have chaired the past two Worldwide $language development conferences and am teaching an Introduction to $language at the local community college."

    "That all sounds very good, but what development experience do you have developing $language in $businessEnvironment?"

    "None, really. I think this will likely be the first instance of its kind using $language in $businessEnvironment."

    "Sorry to hear that. We're looking for someone with more experience. Thank you for your time, there's the door."

    --

    A feeling of having made the same mistake before: Deja Foobar
  5. Re:Bad learning resources by JonySuede · · Score: 4, Informative

    Drink the Maven Kool-Aid, and your worries will be behind you.
    To use Hadoop:

            <dependency>
                <groupId>org.apache.hadoop</groupId>
                <artifactId>hadoop-core</artifactId>
                <version>0.20.205.0</version>
            </dependency>

    in your pom.xml.

    Then write two classes like these:

    class MyMap extends MapReduceBase implements Mapper<K1, V1, K2, V2> ...
    class MyReduce extends MapReduceBase implements Reducer<K2, V2, K3, V3> ...

    Feed instances of those to a JobConf and feed that instance to a JobClient.

    The rest should be obvious to a seasoned programmer, just by looking at the nomenclature of the class hierarchy.
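
    For anyone who has not seen the model before, the two-class split above can be sketched in plain Java without a cluster. This is not Hadoop code and uses no Hadoop classes: WordCountSketch, map, reduce and run are hypothetical names, and run is only a stand-in for what the framework does between the two phases (grouping map output by key). In a real job the map and reduce logic would live in the MyMap and MyReduce classes and be handed to a JobConf.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// In-memory illustration of map -> group-by-key -> reduce (word count).
class WordCountSketch {

    // Plays the role of MyMap: one input line in, (word, 1) pairs out.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                out.add(Map.entry(word, 1));
            }
        }
        return out;
    }

    // Plays the role of MyReduce: one key plus all its values in, one total out.
    static int reduce(String key, List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    // Stand-in for the framework: run map, group values by key, run reduce.
    static Map<String, Integer> run(List<String> lines) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            for (Map.Entry<String, Integer> kv : map(line)) {
                grouped.computeIfAbsent(kv.getKey(), k -> new ArrayList<>())
                       .add(kv.getValue());
            }
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            result.put(e.getKey(), reduce(e.getKey(), e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = List.of("the quick brown fox", "the lazy dog", "The end");
        System.out.println(run(lines));
    }
}
```

    The point of the split is that map sees one record at a time and reduce sees one key at a time, which is what lets the real framework spread both phases across many nodes.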

    The great Ward Cunningham is right: put two days into studying something and you are already halfway to expert.

    --
    Jehovah be praised, Oracle was not selected
  6. Re:I'll start now! by Xyrus · · Score: 5, Funny

    Hadoop is great, fast, and easy to use!*

    *Statements are based on the word count example and TeraSort. Performance may vary greatly. May need to spend significant amounts of time to tune cluster for your particular data and applications to see any real performance. Applications may need to be specially designed to fit within the tuning constraints of the cluster. This statement does not apply if you are using binary data of significant size (BDOSS). Multiple data sets and apps may not perform equally well within the cluster. Data pre-processing, formatting, sequencing, and other such steps are not included in this statement. If you have any problems, hope to $DEITY Google returns a hit. See your browser search bar for further details.

    --
    ~X~