Slashdot Mirror


Exploring the Relationships Between Tech Skills (Visualization)

Nerval's Lobster writes: Simon Hughes, Dice's Chief Data Scientist, has put together an experimental visualization that explores how tech skills relate to one another. In the visualization, every circle or node represents a particular skill; colors designate communities that coalesce around skills. Try clicking "Java", for example, and notice how many other skills accompany it (a high-degree node, as graph theory would call it). As a popular skill, it appears to be present in many communities: Big Data, Oracle Database, System Administration, Automation/Testing, and (of course) Web and Software Development. You may or may not agree with some relationships, but keep in mind, it was all generated in an automatic way by computer code, untouched by a human. Building it started with Gephi, an open-source network analysis and visualization software package, by importing a pair-wise comma-separated list of skills and their similarity scores (as Simon describes in his article) and running a number of analyses: Force Atlas layout to draw a force-directed graph, Avg. Path Length to calculate the Betweenness Centrality that determines the size of a node, and finally Modularity to detect communities of skills (again, color-coded in the visualization). The graph was then exported as an XML graph file (GEXF) and converted to JSON format with two sets of elements: Nodes and Links. "We would love to hear your feedback and questions," Simon says.
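
The last step in the summary, going from a pair-wise list of skills and similarity scores to a JSON file with Nodes and Links, can be sketched roughly as follows. This is a minimal illustration and not the code behind the visualization: the column order (`source,target,score`), the function name `pairsToGraph`, the JSON field names, and the sample rows are all assumptions. It also does not reproduce the Gephi analyses (layout, betweenness centrality, modularity); a plain link count stands in for node size.

```javascript
// Sketch only: the CSV column order and the JSON field names are assumptions,
// not the format used in the original visualization.
function pairsToGraph(csvText) {
  const degree = new Map(); // skill -> number of links touching it
  const links = [];

  for (const line of csvText.split(/\r?\n/)) {
    if (line.trim() === "") continue; // skip blank lines
    const [source, target, score] = line.split(",").map((s) => s.trim());
    const weight = Number(score);
    // Skip rows with a missing field or a non-numeric score.
    if (!source || !target || !score || Number.isNaN(weight)) continue;

    links.push({ source, target, weight });
    degree.set(source, (degree.get(source) || 0) + 1);
    degree.set(target, (degree.get(target) || 0) + 1);
  }

  // Link count stands in for node size here; the summary says the real
  // graph sized nodes by betweenness centrality computed in Gephi.
  const nodes = [...degree].map(([id, deg]) => ({ id, degree: deg }));
  return { nodes, links };
}

// Hypothetical sample rows, invented for illustration.
const sample = "Java,Spring,0.82\nJava,Hadoop,0.41\nSQL,Oracle Database,0.77";
const graph = pairsToGraph(sample);
console.log(JSON.stringify(graph, null, 2));
```

Splitting on commas is only safe when no skill name contains a comma; a real export would need a proper CSV parser.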

65 comments

  1. Dice by Anonymous Coward · · Score: 0

    Piss this Dice shit off.

    1. Re:Dice by pla · · Score: 1

      No kidding... Did anyone else keep getting a popup on mouseover that said something like "Men's Dockers, $22"?

      Aside from the fucking annoying factor, it didn't even link anywhere or tell you where to buy them. Way to fail even at shilling, DiceDot!

  2. No PHP? by brausch · · Score: 1

    I didn't find PHP anywhere.

    Also, is there a way (or browser) to make the diagram larger?

    --
    "Almost every wise saying has an opposite one, no less wise, to balance it." - George Santayana
    1. Re:No PHP? by brausch · · Score: 1

      Found the zoom. Still didn't find PHP. Curious.

      --
      "Almost every wise saying has an opposite one, no less wise, to balance it." - George Santayana
    2. Re:No PHP? by Anonymous Coward · · Score: 0

      Ha - that's the same first thing I looked for, and likewise did not find.

      This thing sucks. The searches appear to be case-sensitive.

    3. Re:No PHP? by AmazingRuss · · Score: 4, Funny

      "I didn't find PHP anywhere."

      If only...

    4. Re:No PHP? by Lando · · Score: 2

      Remember "You may or may not agree with some relationships, but keep in mind, it was all generated in an automatic way by computer code, untouched by a human."

      How amazing: computer code that has never been touched by a human. Organic software, I guess; no possible way for human bias to creep in.

      --
      /* TODO: Spawn child process, interest child in technology, have child write a new sig */
    5. Re:No PHP? by Anonymous Coward · · Score: 0

      Useless person detected. Just get back to making my burger. The adults are talking.

    6. Re:No PHP? by Anonymous Coward · · Score: 1

      I didn't find PHP anywhere.

      Well, the article *did* say it was limited to technical skills...

      Also, is there a way (or browser) to make the diagram larger?

      Case in point.

    7. Re: No PHP? by Anonymous Coward · · Score: 0

      I've done more with PHP than most coders have done with any other language. Hate it all you want but it gets the job done - quickly and bug-free if you are a disciplined coder.

    8. Re:No PHP? by Platinumrat · · Score: 1

      What technical skills? Maybe web 2.0, but definitely not TECH. Technical is the hard engineering, CompSci / Physical / Simulations.

    9. Re: No PHP? by Anonymous Coward · · Score: 0

      Brainf*ck can be programmed quickly and bug-free if you're good at it. Why, any language can be programmed quickly and bug-free if you have everything memorized. A good language is easy to use correctly for newbs.

    10. Re:No PHP? by Anonymous Coward · · Score: 0

      Looks remarkably like a map of Asia. The cheap Eastern European contractors on the left, the Chinese fabricators in the upper part, the Indian software people in the triangle down at the bottom and the Malaysian factories off on their little spur to the lower right.

      Of course at the default resolution, all I see is the outline. Labels aren't visible to see how well that matches to reality.

    11. Re:No PHP? by pla · · Score: 1

      Technical skills... You mean like how half of the bubbles say fluffy BS like "leadership", "strategy", "demand generation" and "spanish" (yes, the chart includes "spanish" as a node, over near ITIL)? Yeah, sure, "technical" - If you work in HR, maybe.

      "Applicant must recognize a computer, and not attempt to eat the mouse. 2-4 years experience stuffing envelopes preferred, because we don't understand metered trifolds. Ideal candidate can tell Brioni from Armani by smell alone".

    12. Re: No PHP? by Anonymous Coward · · Score: 0

      PHP is like Python. It is easy and fast to make small things, but hard to maintain if the project gets large.

    13. Re: No PHP? by Anonymous Coward · · Score: 0

      1997 called. It's laughing so hard it can hardly speak! I can only make out something like untyped garbage and glued together...

      It said nothing about wanting this language back.

    14. Re:No PHP? by kmoser · · Score: 1

      The data it feeds on was also touched by humans. Dirty, dirty data.

  3. Dice supplying stuff to make a resume look nicer? by GoodNewsJimDotCom · · Score: 4, Insightful

    This is a nice visualization. If I was young and aimless, I might see some value in learning new techs I don't actually need right now. Learning new techs could help you land jobs in today's age of clueless HR people judging you by your tech list vs your ability.

    However, I'm getting older and I learn another tech only if I actually need to use it for something I'm working on. So I'll be happily deficient in lots of languages I don't need. I'm certain I could be at least of average skill after about two weeks of most languages as that is my past experience with new languages. But don't tell HR. Today's software engineering world is so averse to training people it rarely considers searching for a veteran software engineer and letting him come up to speed on random techs.

    If I spent a couple weeks on every tech I hear about for the sake of toying with it, I'd never get anything done.

  4. Now that was cool! by Penguinisto · · Score: 2

    Find and click on MongoDB. Then notice that "Database" is this tiny-assed little circle waaaaaaaaaaaaaaaaaaay off to one side.

    Got a pretty good laugh out of it. :)

    --
    Quo usque tandem abutere, Nimbus, patientia nostra?
    1. Re:Now that was cool! by circletimessquare · · Score: 2

      it's just analyzing the appearance of words in listed skills

      actual database pros would not put "database" as an enumerated skill

      maybe the kind of person who lists "windows" "internet explorer" and "microsoft word" as tech skills would, but such people would not show up in the data set analyzed here: resumes from serious professionals working in the tech sector

      so it makes sense "database" would only be a tiny little distant circle

      --
      intellectual property law is philosophically incoherent. it is your moral duty to ignore it or sabotage it
    2. Re:Now that was cool! by Anonymous Coward · · Score: 0

      I personally found the Palo Alto skill the most interesting. I know that my company does not have enough people who can Palo Alto.

    3. Re:Now that was cool! by t0rkm3 · · Score: 1

      Palo Altos themselves are not that complex. The interface is an interesting attempt at being usable, and it's getting better. What I thought was interesting about the PA node was how many connections to Apache products it has. That makes me think that people are not happy with using Panorama to view/manage the logs and run reports against the logs.

      I can sympathize.

    4. Re:Now that was cool! by ranton · · Score: 1

      it's just analyzing the appearance of words in listed skills

      so it makes sense "database" would only be a tiny little distant circle

      So what you are saying is their chosen implementation for analyzing relationships between skills is incomplete?

      --
      -- All that is necessary for the triumph of evil is that good men do nothing. -- Edmund Burke
    5. Re:Now that was cool! by circletimessquare · · Score: 1

      are you saying there exists some implementation that analyzes every resume in existence perfectly? it's "incomplete" in the sense that any such effort is incomplete and imperfect by nature of the problem. your criticism is invalid, you don't understand the task if you expect completeness is possible

      --
      intellectual property law is philosophically incoherent. it is your moral duty to ignore it or sabotage it
    6. Re:Now that was cool! by ranton · · Score: 1

      are you saying there exists some implementation that analyzes every resume in existence perfectly? it's "incomplete" in the sense that any such effort is incomplete and imperfect by nature of the problem. your criticism is invalid, you don't understand the task if you expect completeness is possible

      The task is to determine which professional skills relate to each other (this goal comes from TFA, not just the summary). The use of resumes to accomplish this is an implementation decision. Any shortcomings of using resumes are not necessarily an inherent limitation of solving the task at hand. There are other ways to complete the task that do not rely on resumes at all, or that rely on them less.

      For instance, you could use web search results to obtain data for your distributional semantics research. Websites that discuss MongoDB are probably more likely to use the term Database more often than Spring, even though the opposite is true for resumes. And in this case MongoDB has far more to do with databases than the Spring framework, so this methodology would return better results in this case.

      I am not necessarily saying a web crawl would return better results overall; I am only clarifying why problems caused by implementation decisions should not be used to claim the task itself will always suffer from the same limitation.

      --
      -- All that is necessary for the triumph of evil is that good men do nothing. -- Edmund Burke
    7. Re:Now that was cool! by circletimessquare · · Score: 1

      your alternative method is inferior as the specific request is tech *skills*, which you find on resumes, people speaking to their merits to get hired

      not "tech appearing together on message boards," which indicates a whole host of relationships, relationship by skillset being far down the list

      the simple fact is there is no perfect methodology so criticizing the methodology for being imperfect is without merit. and in articulating a yet even more inferior methodology in your latest comment i have to assume you're just arguing for the sake of arguing, you're barely trying, you're not serious, and so this useless thread is over

      --
      intellectual property law is philosophically incoherent. it is your moral duty to ignore it or sabotage it
    8. Re:Now that was cool! by EdgeCreeper · · Score: 1

      RDBMS is one of the largest circles though. It is located in the orange area which seems like the traditional, popular, programming skills area. The RDBMS circle doesn't have as many links from it as I thought it would, but that would be how the cutoff is implemented. It's pretty nice.

  5. Paradigms by Anonymous Coward · · Score: 0

    Many paradigms complete release software investigate tree branching source code compile run test.

    1. Re:Paradigms by Anonymous Coward · · Score: 0

      Many paradigms complete release software investigate tree branching source code compile run test.

      That feature is lost with systemd.

  6. a lot comments will probably nitpick by circletimessquare · · Score: 3, Informative

    and i've seen my fair share of poor visualizations

    but i think this is actually really well done, useful even

    congratulations to Simon Hughes

    --
    intellectual property law is philosophically incoherent. it is your moral duty to ignore it or sabotage it
    1. Re:a lot comments will probably nitpick by Anonymous Coward · · Score: 1

      As one who goofs off with some of the Graphviz tools (dot), how possible would it be to do with this toolchain?

    2. Re:a lot comments will probably nitpick by dwpro · · Score: 1

      I agree that it is a cool visualization. I do think it's worth mentioning that the data was reverse-engineered from resumes, so the relationships all have the bias of what individuals _think_ will matter to a hiring committee or want to be hired to do, vs what actual skills they possess.

      --
      Millions long for immortality who do not know what to do with themselves on a rainy Sunday afternoon. -- Susan Ertz
    3. Re:a lot comments will probably nitpick by sribe · · Score: 1

      but i think this is actually really well done, useful even

      To the extent that I want to bookmark this, just for reference as to the tools he used ;-)

  7. Re:Dice supplying stuff to make a resume look nice by SuperKendall · · Score: 2

    Today's software engineering world is so averse to training people it rarely considers searching for a veteran software engineer and letting him come up to speed on random techs.

    Not to put too fine a point on it but that's your own responsibility, not the company you work for.

    If there is an aversion to companies training people, that's offset by the ease of learning any newer (or even older) technology, for free.

    If you wait for the company to help you, you (and your career) will ossify. I have seen the result when I was younger, the result is not good for your freedom to choose favorable working conditions.

    --
    "There is more worth loving than we have strength to love." - Brian Jay Stanley
  8. QA by Anonymous Coward · · Score: 0

    I love that QA is a tiny little dot right in the middle of everything.

  9. Tool of the d by Tablizer · · Score: 1

    Quick! Hide it before they use it to generate H-1B targeting resumes having impossible combos of skills to filter away citizens.

  10. OT question - But dice related by dwpro · · Score: 1

    When did the right panel get subsumed by ads? Has this been happening for long and I just hadn't noticed? I checked out the site in Chrome to see the site without scripts, and holy smokes has it gotten bad.

    I would not frequent this site if the normal experience was accompanied by a trojan condom video ad (among several others) on loop.

    --
    Millions long for immortality who do not know what to do with themselves on a rainy Sunday afternoon. -- Susan Ertz
    1. Re:OT question - But dice related by Anonymous Coward · · Score: 0

      Speaking of condoms, have you tried a condom for your computer? Perhaps Adblock or even updating the hosts file?

  11. No Debian, also by Anonymous Coward · · Score: 0

    Shell Scripting, just beside Linux administration, is not linked.

    I thought Debian was the most used Linux OS for servers? Am I wrong?

    1. Re:No Debian, also by Anonymous Coward · · Score: 0

      Posting Anon because I am modding.

      Debian does not make the list. In North America RedHat/CentOS is the #1 Linux server OS, in Europe it is SUSE Linux. There is a smattering of others at some companies but those two are the most common.

  12. Re:Dice supplying stuff to make a resume look nice by Anonymous Coward · · Score: 2, Informative

    Today's software engineering world is so averse to training people it rarely considers searching for a veteran software engineer and letting him come up to speed on random techs.

    Not to put too fine a point on it but that's your own responsibility, not the company you work for.

    Are you kidding?

    Sure, that applies if you're a short-term contractor. But permanent staff ought to be getting a regular training budget from their companies. If you're not, then you're working for a bad employer.

    If there is an aversion to companies training people, that's offset by the ease of learning any newer (or even older) technology, for free.

    It may be free in the financial sense, but it still has a cost in time. A good employer will give you that time. And pay the costs if there are any. You may need to explicitly ask for it, but the bottom line is that they should be investing in their long-term staff.

    If you wait for the company to help you, you (and your career) will ossify. I have seen the result when I was younger, the result is not good for your freedom to choose favorable working conditions.

    My own experience: In the last year, I've learned Node.js on company time. We haven't done any projects with it yet, but they gave me the time to do a partial rewrite of one of our existing systems using it, as a proof-of-concept. We've got a licensed set of Robert C Martin's "Clean Code" video series on the company network, and dev staff are positively encouraged to spend time watching those during working hours. And I'm being sent to a dev conference later this month.

    I do agree with you though -- not all companies are like that, and I have worked at places where even one of those things would have been a stretch too far. So you really do have to pick the right employer. But truthfully, all employers should be like that if they really value their staff. It's not just about the headline salary figure.

  13. It's not very reliable data. by tlambert · · Score: 3, Insightful

    It's not very reliable data.

    They took the similarity vectors from the job postings, not from resumes, so rather than "what you're likely to know", they computed "what an employer is likely to want at the same time as wanting something else", and then declared that a similarity due to an already skewed cosine similarity metric. This happens because employers are more likely to copy other, similar job postings, or other job postings for companies in a similar business as them, or those of a company whose employees they wish to hire away.

    They claimed that they tried using resumes, but that the resulting data was not as "clean"; uh... duh?

    This visualization was not actually very useful, unless you are trying to design a resume to get yourself hired, regardless of your actual current capabilities.
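
The cosine similarity metric mentioned above can be sketched on a toy scale. This is an illustration under stated assumptions, not the method from the blog post: it assumes each skill is represented as a binary vector with one entry per job posting (1 if the posting mentions the skill), and the postings, `cosineSimilarity`, and `skillVector` are hypothetical.

```javascript
// Sketch only: binary "skill appears in posting" vectors are an assumption;
// the thread does not say how the real vectors were built.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // a skill that never appears has no direction
  return dot / Math.sqrt(normA * normB);
}

// Hypothetical postings, each reduced to the list of skills it mentions.
const postings = [
  ["Java", "Spring", "SQL"],
  ["Java", "Spring"],
  ["Java", "Hadoop"],
  ["SQL", "Oracle Database"],
];

// One vector per skill: 1 if the posting mentions the skill, else 0.
function skillVector(skill) {
  return postings.map((p) => (p.includes(skill) ? 1 : 0));
}

const javaSpring = cosineSimilarity(skillVector("Java"), skillVector("Spring"));
const javaOracle = cosineSimilarity(skillVector("Java"), skillVector("Oracle Database"));
console.log(javaSpring.toFixed(3), javaOracle.toFixed(3));
```

With vectors like these, two skills score high whenever they keep showing up in the same postings, whatever the reason. Postings copied from one another would add identical entries and push the score up, which is the kind of skew the comment describes.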

    1. Re:It's not very reliable data. by Anonymous Coward · · Score: 0

      The summary's insistence that it's "all generated in an automatic way by computer code, untouched by a human" was a huge red flag from the start.

      When you read something like that, rest assured that humans wrote the software, programming in their assumptions, which will operate on crappy data gathered by very flawed humans.

  14. Read the blog post again. by tlambert · · Score: 1

    Read the blog post again. http://insights.dice.com/2015/...

    "I think that’s pretty cool, given we’re generating that automatically from job descriptions posted on our site. We also tried using the resume dataset, but the results were of a lower quality, as the skills extracted from resumes can be from different jobs."

    It was extracted from job postings, which would only identify Schelling points in the hiring industry, not skill clusters common to people with certain desirable skill sets; in other words, it's "how to fudge your resume", rather than "how to find employees like the ones I have which I like".

    1. Re:Read the blog post again. by dwpro · · Score: 1

      Ah, you're right. Thank you for that correction.

      --
      Millions long for immortality who do not know what to do with themselves on a rainy Sunday afternoon. -- Susan Ertz
  15. Dice guy runs software to generate graph from thei by loufoque · · Score: 1

    This is news?

  16. Go figure by WinstonWolfIT · · Score: 1

    C# is just as big as Linux? Whodathunk.

  17. Between tech skills and what? by Anonymous Coward · · Score: 0

    Between implies two things - between tech skills and what? Or did they mean among all tech skills?

  18. Buzzword association by tomhath · · Score: 3, Interesting

    I had to follow the link to Hughes' report to find how he created the list of inputs:

    by importing a pair-wise comma-separated list of skills and their similarity scores ...

    we’re generating that automatically from job descriptions posted on our site.

    So what this really shows is how often the same two buzzwords appear together in a job description posted on Dice.

    I found another comment in his report interesting:

    We also tried using the resume dataset, but the results were of a lower quality,

    I assume by "lower quality" he really means "people list every buzzword they can think of on the resumes posted on Dice".

    Given the inputs I wouldn't expect any surprises in the results. But that said, it's an interesting project and they did a very nice job with the visualization.

  19. All Oranges, no Apples by methano · · Score: 2

    It's interesting that Apple and its products and technologies are not related to any of these things in any way.

    1. Re:All Oranges, no Apples by tomhath · · Score: 1

      People call it a "Walled Garden" for a reason.

  20. Re:Dice supplying stuff to make a resume look nice by Anonymous Coward · · Score: 0

    Yes, self teach yourself how to design a Google datacenter. Your responsibility.

    Problems difficult enough that are worth hiring on or replacing someone are not things you can self teach. You need to be intimate with the problem domain, which requires having access to a real world situation.

  21. Not really "Tech" skills, more "Computer" Skills by Anonymous Coward · · Score: 0

    I didn't see anything about hardware (except where it can be implied in a few of the circles), for instance.

  22. Bizarre scaling by pla · · Score: 1

    Looked for two dots: C and SQL.

    C has a nice big dot, but connects to only a handful of extremely broad-scoped nodes.

    SQL, OTOH, has a tiny dot, and connects to just about everything on the chart.

  23. Re:Dice supplying stuff to make a resume look nice by luis_a_espinal · · Score: 1

    Today's software engineering world is so averse to training people it rarely considers searching for a veteran software engineer and letting him come up to speed on random techs.

    Not to put too fine a point on it but that's your own responsibility, not the company you work for.

    Only if you are a contractor. Otherwise, the answer is no.

  24. Palo Alto?! by rippeltippel · · Score: 1

    Since when is Palo Alto a tech skill?

    Need to update my CV...

  25. NoNerval Script by Fnord666 · · Score: 1

    // ==UserScript==
    // @name No Nervals Lobster
    // @description Remove troll posts from Slashdot front page.
    // @include http://slashdot.org/*
    // @include http://*.slashdot.org/*
    // @include https://slashdot.org/*
    // @include https://*.slashdot.org/*
    // @exclude https://*.slashdot.org/story/*
    // @exclude http://*.slashdot.org/story/*
    // @grant none
    // ==/UserScript==

    // getElementsByTagName returns a live collection, so walk it backwards:
    // removing an element while counting up would skip the one after it.
    var elements = document.getElementsByTagName('article');
    for (var i = elements.length - 1; i >= 0; i--) {
        var text = elements[i].textContent;
        // indexOf does a plain substring match; search() would treat the string as a regex.
        if (text.indexOf("Nerval's Lobster") !== -1) {
            elements[i].parentNode.removeChild(elements[i]);
        }
    }

    My apologies to whoever wrote the original code for dumping Roland/Hugh posts for the lack of attribution. Please let me know and I will provide proper attribution in the future.

    For Chrome users, save this code to a file, then drag the file to the Chrome extensions window to install it as a user script.

    --
    'The tyrant will always find pretext for his tyranny.' - Aesop's Fables
  26. Don't give anyone funny ideas by RuffMasterD · · Score: 1

    It's a polite way of saying "stop putting crap in your resumes". Don't encourage people to do more of it ;-)

    In all seriousness though, have you ever tried to analyse unstructured text? It's hard. How would you realistically improve it? Do you start with a preconceived list of technology key words and count them in the resumes? People misspell words. Words have multiple meanings depending on context. Technologies change and you might miss key words. Or you could count recent Stack Exchange topic tags, but that only shows the latest fads people are struggling with, not what they are being hired for. Survey a bunch of people about what they do at work? Too expensive. I think this is a really good start in spite of the limitations.

    --
    Human Rights, Article 12: Freedom from Interference with Privacy, Family, Home and Correspondence
  27. Electronic Engineering/Hardware Skills? by BlueStrat · · Score: 1

    Where are the hardware skills?

    'Tech skills' don't mean much without hardware.

    Strat

    --
    Progressivism (aka US 'Liberalism'): Ideas so good they need a police/surveillance-state to enforce.
  28. Good Start DICE! by Anonymous Coward · · Score: 0

    More like this please!! Less of the fake article advertising.

  29. Linkedin Data would be more useful by Anonymous Coward · · Score: 0

    I would love to see the same technique applied to linkedin's skill endorsements.

  30. No problemo by tlambert · · Score: 1

    Ah, you're right. Thank you for that correction.

    No problemo

    I miss subtle stuff all the time. I rely on really strict semantics in place of "not trusting people", which I have a hard time doing (data is data).

  31. I've been writing code like this since 1985. by tlambert · · Score: 1

    In all seriousness though, have you ever tried to analyse unstructured text? It's hard. How would you realistically improve it? Do you start with a preconceived list of technology key words and count them in the resumes? People misspell words. Words have multiple meanings depending on context.

    I've been writing code like this since 1985. Then, it was in LISP.

    It's actually trivial to me at this point. You end up with a meaning trie with differential probability vectors, and some of the roots wither away as you go down. Making a machine decision is harder, but not entirely impossible.

    I get incredibly annoyed at people like Laszlo Bock who want to put everyone's resumes into a form that basically allows Google (Laszlo Bock works for Google) or other companies to magically allow you to come into a new job under the horse collar of a performance review of your previous job which they were in no way involved with.

    The whole "HR metrics" industry... uh... kinda pisses me off? I pick companies based on criteria other than standard metrics. If they pick me that way... they do not deserve me. Mostly they stumble into me, I fix them, and then I exit.

    I understand the "OMG we need people who know what they are doing and not recent graduates!" panic. Does not mean I sympathize.