Analyzing StackOverflow Users' Programming Language Leanings
AlexDomo writes to point out this statistical breakdown of the programming languages represented at StackOverflow. "Surprisingly, JavaScript turned out to be the most 'over-represented' language on StackOverflow, by quite a long way at 294% [where "a representation of 100% means that the SO tag count is aligned exactly with the TIOBE language index"]. Could this also be because programming JavaScript is generally quite difficult and will result in people seeking help more often? Following this was C# (which I had expected to be number 1), at 153%. After this, PHP, Ruby and Python were basically fairly balanced at around 100%. The most 'under-represented' major language would definitely be C at 11%. Three other major languages which seemed to be a bit under-represented, below 50%, were C++, Java and Objective-C. For details of the method used and the full results, refer to the original article." One of the attached comments makes an interesting point about the difficulty in divining meaning from such statistics, though.
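For readers who want the arithmetic behind the "representation" figure: the summary only says that 100% means the SO tag count lines up exactly with the TIOBE index. A minimal Python sketch, assuming the figure is simply a language's share of SO tags divided by its TIOBE share (the function name and the sample shares are made up for illustration, not taken from the article):

```python
def representation(so_tag_share: float, tiobe_share: float) -> float:
    """Return the SO tag share relative to the TIOBE share, as a percentage.

    Assumption: both inputs are fractions of their respective totals.
    The original article may normalise differently.
    """
    if tiobe_share <= 0:
        raise ValueError("tiobe_share must be positive")
    return 100.0 * so_tag_share / tiobe_share


# Hypothetical shares, chosen only to show the scale of the measure.
aligned = representation(0.10, 0.10)   # same share on both sides
under = representation(0.05, 0.10)     # half as many SO tags as TIOBE implies
print(round(aligned), round(under))    # prints: 100 50
```

Under this reading, anything above 100% means a language shows up in SO tags more than its TIOBE rating alone would predict, and anything below 100% means less.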
Thank you for being a friend
Traveled down the road and back again
Your heart is true, you're a pal and a cosmonaut.
And if you threw a party
Invited everyone you ever knew
You would see the biggest gift would be from me
And the card attached would say, thank you for being a friend.
JavaScript is most often used for client-side web scripting. I imagine a lot of javascript tagged stackoverflow questions are related to figuring out the HTML DOM, which can be confusing, or trying to figure out browser quirks, jQuery syntax, etc
On the other end, I don't know anyone personally who is in the process of learning C. Everyone I know who uses it is an old C hacker with years and years of experience, and isn't likely to need to ask many questions about it.
The reason Javascript is the most popular is obvious (to me at least): the web is based primarily on three languages - HTML, CSS, and Javascript. With those three, one can do most of what they want with a website. More advanced languages are for more advanced applications. Now, when some geek-lite decides they want to make a website, as many people now toy with, they are going to learn what? The advanced languages or HTML, CSS, and Javascript?
Javascript is the most common not because it's the most difficult. It's the most common because it's the most sought after. Supply - Demand.
Seems obvious to me.
The reason javascript is so high is because the people who use it are clueless morons. Conversely real languages like java and c are typically used by real professionals.
There is also a possibility of people knowing about certain languages more than the others. In other words, in general, somebody programming in C might know the language better than somebody who is doing Javascript.
Having a strong foundation is important to know how to get stuff done before using the 'internet'. Certain languages are just better at that.
I wouldn't say Javascript is a particularly difficult language to program, but there is a huge variation in the skill sets of people developing in it, with a heavy bias towards those who couldn't write an original line of code to save their ass. This is the type of programmer who will flood message boards with requests for help with trivial little problems.
JavaScript is something a newbie might want to try out. Newbies ask more questions.
I don't think that's a reflection on the difficulty of JavaScript.
-Dave
It doesn't surprise me that it seems to correlate to the age of the language multiplied by how widespread the use, with "newer" languages that are widely used being the most represented.
I don't think it has anything to do with how difficult Javascript is, but more with the programming experience of the person using the language. I'm sure there would be more posts asking about QBasic than LISP if there had been internet in 1994 like there is today.
Also people using C/Java/etc. can self-teach by digging through libraries themselves.
At my institution through the 90s and early 2000s we had to have many more Windows tech support "firemen" than Apple support techs. Indeed, on the Apple side there were basically no virus, networking, or printer driver conflict fires to put out. You didn't have to worry about interrupt conflicts between PC cards. No fires.
The result was that every time there was a major IT decision, the Windows support techs would outvote the Apple support techs. Lots of Windows-only software became standard, and at one point there was a push to just go Windows-only.
All because there were more problems and thus more support techs.
I would imagine that more mature languages have fewer people looking for clever tricks on this web site.
Some drink at the fountain of knowledge. Others just gargle.
"Surprisingly, JavaScript turned out to be the most 'over-represented' language on StackOverflow, [...]
Could this also be because programming JavaScript is generally quite difficult and will result in people seeking help more often?
I think that JavaScript is also used by people that do not understand it very well, and they are more likely to resort to the kind of help that this website provides.
Following this was C# (which I had expected to be number 1), at 153%. After this, PHP, Ruby and Python were basically fairly balanced at around 100%. The most 'under-represented' major language would definitely be C at 11%.
I am a C programmer and do not need help from this "stack overflow" web site.
My references are the C programming language standards and the single UNIX specification.
-Do we ask questions because of difficulty or because the underlying technology is more popular?
-Are javascript developers more likely to use sites like stackoverflow vs traditional means (books, mailing list, forums, etc.)?
-Do we underestimate javascript usage? Does javascript span more projects, i.e. I have a C# based web-project, but still use javascript for the UI.
These are the underlying questions that would have to be answered before we could derive anything from this sort of analysis. That said, in our recent study of stackoverflow questions (publication pending), we found that there was a strong correlation between the frequency of using a particular API class (as defined by google code search), and the numbers of questions asking about those classes. This could suggest questions have a large popularity component, or it could mean people are more likely to run into difficulty with popular components!
I think web designers with little to no programming experience account for this, no?
Those who use C++, Java, etc. are more likely to either be in training to become software engineers (for whom stackoverflow would be cheating), or are working as software engineers (and rarely need stackoverflow).
I've seen the StackOverflow site, and I know and have used most of the languages mentioned, but I have no idea what the summary is yammering about.
Fuck systemd. Fuck Redhat. Fuck Soylent, too. Wait, scratch the last one.
Cue the non-JS programmers bashing web developers as "not real programmers". :-)
jrumney beat me to it, but I agree 100%. I don't find these results surprising at all.
I think C is under-represented, using the same difficulty analysis, because syntactically it's much simpler to understand, with a consistent, recipe-like way of solving problems. Only with more experience do you get to make the small decisions that strongly influence the stability and maintainability of the program. But again, that comes from experience and mileage, not from a question you get to ask, which is what Stack Overflow is about. A big influence on the under-representation figure is also the small standard library; larger libraries are maybe what generates more questions in other languages like Python, PHP, etc., which are the so-called "batteries included" languages.
As you can understand, I love C, and I am sad it's such a frowned-upon language. It's not as dangerous as it's painted. With experience you can actually reduce all the typical pointer pitfalls to none, and looking for error patterns when debugging is also a great help. The only language that I think is great for high-level work is Java, and I am really sad it comes with way too much baggage for embedded use, even though it forces you to be a good programmer.
Before the PHP haters start posting comments, I would just like to say: haters gonna hate.
Javascript is - according to its author - the most misunderstood programming language in the world. While it bears surface similarity to languages like C and Java, and allows simple programs to be similar in structure to theirs, its core design is much closer to LISP (and the syntax quite efficiently obscures/hides that), and so few people truly understand it... so questions are very frequent.
It's not necessarily because of ignorant programmers, reusing existing code is not a bad thing in itself. But yes, Javascript is mostly copy&paste because it's very modular. Big programs are rare, it's mostly just snippets of code implementing specific controls so it's very easy to copy.
Has anyone tried to write fresh JavaScript that works well in one browser, much less all of them? JavaScript's problem isn't the language; it's the implementations and their interaction with the browser, which contains so many quirks/gotchas that it's impossible to keep track of them without experience. Add in the ease of entry (meaning more people with less experience) and you get basically this situation.
All the C programmers are busy over at bufferoverflow.com
They clearly tried to manage their data using javascript, a big mistake from the get-go. If they had taken the same data and parsed it with Perl, they would have found that all the questions came from Python and Ruby people. Had they done it in C++, all the questions would have come from C# users. Had they done it in PL/SQL they would have found that the questions all came from rounding errors.
And if they had done it in assembly, they would have found there were no questions at all...
Damn_registrars has no butt-hole. Damn_registrars has no use for a butt-hole.
I do find this interesting, considering the top tags for Stack Overflow questions are C# and Java.
...real programmers don't ask for help, unlike those wimpy JS hacks and C# pretenders.
make imaginary.friends COUNT=100 VISIBLE=false
Good one.
...that there aren't many stack overflows in C!
StackOverflow is only popular because there are a lot of foolish "programmers" out there, and it just happens to be the place where they congregate online and award each other meaningless "badges" and "reputation points".
Virtually all of the questions asked there can be answered by doing the following:
1) Reading the documentation of the programming language, library or software in question.
2) Having even a basic level of skill with the technology in question.
Good programmers realize that this is the case, and thus easily get the answers to their questions by themselves. It's only the shitty "programmers" who can't do this who need to resort to StackOverflow.
It's no wonder that there are so many JavaScript questions at StackOverflow. JavaScript is the definitive programming language of idiots. Good programmers do everything in their power to not use JavaScript, while idiot "programmers" embrace it and try to use it absolutely everywhere, including every place where it surely shouldn't ever be used. Good programmers don't run into the problems that these idiot JavaScript "programmers" run into so often because these good programmers know not to use a shitty language in the first place, and they know not to try to use a shitty language for stuff that it's not good at.
Assuming javascript is difficult because many people ask about it completely forgets that different people have different abilities, skill sets, experiences, and so on. It also forgets that communities themselves have "flavours". That self-selecting bias means that stackoverflow is full of js and c# copypasters because a certain kind of windows-using web enthusiast hangs out there. This reinforces itself, so, well, that's what just got measured. Doesn't say squat about how difficult or even how popular the language is in general. It only says something about what sort of bunch likes to hang out there.
If you'd go to a place talking about systems programming, most will be (implicitly) about C. Find a functional programming hangout, lo and behold, the discussions will center around functional languages. Stackoverflow is no different, even if it doesn't have a specific label put on the outside. The inside is very much biased.
Personally I tend to avoid the site because most of the time it shows up in search results the discussions are too shallow or otherwise unusable, so it's turned out to be a waste of time for me. It's just not my cup of tea, guv, so I don't hang out there. Quelle surprise. Perhaps the most important point about the entire thing is that its fanbois are so infatuated with the thing that they overlook the obvious and start to overgeneralise. To me it is clearly the case that this happened with the rather wild conclusions based on these infographics.
Maybe it's because C, C++ and Java are taught at schools and universities, whereas JavaScript is usually self-taught?
I wish people would bother to master their own native languages before attempting to communicate by/with machines.
It's probably because Javascript has the largest proportion of amateur programmers who aren't willing to learn the language they are programming in. They won't buy a book, they won't take a class, they won't read an online manual or tutorial. What they will do is download a free script and then beg others to customize it for them. This is usually prefaced with "I don't know Javascript, but I have to...."
The basic languages that new people work with most will be the ones that the most questions are posted about. Not many people begin their programming experience in C nowadays, so people learning C will have some general experience behind them. Also, my totally blind guess is that C experts don't frequent stackoverflow as much. That leads to the interesting question of "where do C and other programmers go to get their answers if not to stackoverflow?".
To me it is something as simple as this: most Objective-C developers are coding for Apple platforms, so they ask in an Apple-specific place. My C++ coding is usually done with Qt, so I will ask in a Qt-related place. Linux kernel developers are not going to ask C questions on Stack Overflow; they ask on a Linux-related site.
And so on...
TIOBE is pretty much the most worthless index you can make without just making shit up. I have no idea why people keep paying attention to them.
It boggles the mind why anyone would take a pretty accurate measure of a local population, compare it with a wildly inaccurate one over a larger population, and expect to find some meaningful relation between the two.
With JavaScript, you've got to support multiple browsers from the get-go, doing things that were never intended by the VM implementers, such as network polling and widget systems.
There is a fine line between being a cultivated citizen and being someone else's crop. - A. J. Patrick Liszkie
The most 'under-represented' major language would definitely be C at 11%.
"would definitely be"? It either is or it isn't.
Stackoverflow is great for people that teach themselves. TIY or TitY? The list there is ordered by the most popular languages people teach themselves. Some people, namely myself, need to reach out every once in a while for a little help. Well, there are the kids using it to get someone else to do their homework but no need to go into that.
Having to work for a living is the root of all evil.
there is a huge variation in the skill sets of people developing in it, with a heavy bias towards those who couldn't write an original line of code to save their ass.
That, combined with the fact that the internet is flooded with ancient javascript snippets ripe for copying and pasting despite the fact that they don't work on anything but netscape 4.
If I have been able to see further than others, it is because I bought a pair of binoculars.
JavaScript, although somewhat "powerful" (honestly, a language for which there only exist mono-threaded implementations in this day and age is hardly that "powerful" in my book), also attracts lots of beginners.
They all dream of writing the next FaceBook and think that to make their webapp look great and intuitive and whatnot to be "social-cloud-2.0-compliant" they need JS.
And they ask dumb stupid questions: and SO users like dumb stupid questions because they're the questions they do actually understand and they're the ones they can answer and karma-whore.
Now go ask a smart question: recently some dude put up an amazing little question about the Nash equilibrium. Some 30K rep user kept insisting that "you can't do better than XXX", despite several comments and a piece of code showing how wrong he was. He was still refusing to see the facts: "you cannot do better than XXX". These guys think they know it all because they've got 30K rep, despite often being very mediocre programmers.
So ask these smart questions: they get downvoted and "voted for close" nearly instantly. Once these people with rep can downvote, they'll downvote questions that they realize are too smart for them to answer.
Joel Spolsky thinks that "any programmer with 3K rep on SO is the kind of genius-programmer anyone wants on their team". Sure. I did lose a 7K rep account due to their bogus OpenID implementation and the fact that I didn't know I should link my SO account to several OpenID logins, "just in case" one would be problematic in the future (yeah, sure...). Opened a new account, I'm already at 1K rep on it.
It's really easy to karma-whore / make it to 3K or 10K rep. These aren't great programmers. These aren't good programmers.
The only thing SO is good for is getting rid of the spam.
Their implementation s*ck fat balls too: website is shitty, OpenID issues are boring, uptime is less than stellar. Personally cannot stand that lolcat "I'm working on ur problemz right now" anymore.
Cannot wait for the next big thing that shall attract fewer newbies and that shall actually encourage smart and thought-provoking questions.
IMO, JavaScript has the lead because there are so many people that can barely program anything, that somehow get into web design positions at large companies.
It would be interesting to see which languages the high-scoring members answer most of their questions in, wildly interpreting that as which languages the competent programmers are using.
Instead of considering this an "over-representation," perhaps this just indicates how incredibly inaccurate TIOBE is?
I'd guess that SO is more representative of "what people are actively engaged in". Maybe....for example, I work in "C" and "Perl" all the time. However, I know them well enough that I very, very rarely post any C or Perl questions.
I have worked a lot with Android and iPhone lately. As these are more recent environments for me, I post more questions about them on SO.
So maybe SO is more reflective of what people are learning?
I have recently contributed a few answers to stackoverflow in a language independent subject. I did notice that PHP programmers ask the most obvious questions, immediately followed by C# programmers. Based on my experience the statistics on stackoverflow show one thing: popularity of a language in the group of wanna-be programmers (maybe they do not even want to be programmers).
So, languages that have come into heightened popularity in the last decade or so, most of which are primarily used or oriented around web development, were the most overrepresented, while well-established languages aimed at native applications development were the most underrepresented? And from that they conclude that Javascript is a hard language? I think there are a number of better conclusions that could have been made, such as:
1) Stack Overflow attracts more web developers than native application developers.
2) Newer languages have less documentation available.
3) Those languages that are overrepresented tend to attract more newbies.
4) There's a lot more innovation in the ways that those languages are being used, which prompts more questions.
5) Those languages are the most dissimilar from the others, which means that less previous experience translates over.
I'm not suggesting that any of these is necessarily the cause, or that it's one of these and not the others, but I do think that any of them is a better explanation than what was given in the summary.
The ratings have an inverse relation to the average competence and self-reliance of those using the language. Javascript and C# are typically used by inexperienced programmers or programmers without advanced internals knowledge. People on this side of the coin haven't been told to RTFM enough so they expect others to give them answers rather than finding them themselves. PHP and Python coders thrive in a community populated by many former Perl coders so many of them get the proper RTFM treatment but they are also popular and easy languages so they only get the middle ground. C programmers thrive in a world of RTFM and are close enough to the internals that they usually know what the problem is.
The idea that this is because Javascript and C# are 'hard' is actually fairly laughable.
javascript is heavily represented in stack overflow due to its use as the front-end language
people will use c#, php, ruby, python, asp, java, etc on the server backend
when they present the problem though, it might be javascript related but they'll include the server-side language they're using as well
i would guess the primary driver for this would be ajax
where if the site isn't using ajax, it's just not web 2.0
JavaScript tops the most asked programming language because of the popularity of jQuery.
"How do I do this in jQuery?"
"How do I do that in jQuery?"
Even if you are familiar with JavaScript and HTML DOM, you have to forget what you know and do everything the "jQuery way."
I've always heard that Mac OS X sucks even worse than windows for remote exploits, but it only gets exploited by knowledgeable hackers, not viruses and script kiddies.
I'll happily testify to Mac OS X's kernel being epically inferior to Linux. For example, the whole damn machine hangs when it finds one bad sector, and said bad sector will likely corrupt data, even when the drive's SMART status ain't anywhere near failing. wtf?!?
The Christian religion has been and still is the principal enemy of moral progress in the world. -- Bertrand Russell
This is a completely self-serving poll. The site is used heavily by people who are biased towards developing web sites. Here is a better set of polls:
How many device driver writers program in JavaScript, Virus Broadcast Script (VBS), or COBOL? Not very many.
How many compiler writers program in languages other than C++? Not very many.
How many people who create high performance virtual machines implement them in Scheme, Haskell, Ruby, or Python? Probably zero
How many embedded programmers write in an interpreted language? Probably zero
How many top500 systems developers write code in an interpreted language? Probably zero.
Different domains have different requirements; interpreted languages (even with JITs) are okay for some things, but when performance matters, or explicit understanding of device events/characteristics is required, it is difficult to use them. Not impossible, merely very hard. JavaOS _did_ boot on both Alpha and StrongARM...
The reason Javascript is the most popular is obvious (to me at least): the web is based primarily on three languages - HTML, CSS, and Javascript.
The article said it is overrepresented in SO queries. That's like being 'popular' in something of the way that a highly prevalent disease might be called 'popular'. (I don't deny it may be actually popular as well.)
But the features or bugs of js that make its problems overrepresented on SO are probably linked to the characteristics of js that make me groan whenever I see that a webpage demands the use of it.
I wish it could be outlawed!
-wb-
It's probably more due to the fact that people who use Real Programming Languages (e.g. C, C++ and so on) are more likely to be trained or experienced programmers who usually know what they are doing, while JavaScripters and C# people are a bit more likely to be kids trying to get some webpage working.
JavaScript on its own isn't hard, it's actually a quite nice language. The DOM thing is pretty awful though.
But this takes the cake. Why the hell are they expressing correlation coefficient in percentages? It has always been represented as a rational number between -1 and +1. I know people are dropping out of engineering and science courses. Then they come up with their own units and scale for things with well known standard and commonly accepted practices.
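For what it's worth, the summary describes the figure as a ratio against the TIOBE index rather than a correlation coefficient, which would explain how it can exceed 100%. A small Python sketch of the difference, using entirely made-up shares and a hypothetical `pearson` helper:

```python
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation coefficient; always lies in [-1, 1]."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Made-up percentage shares for five unnamed languages (illustration only).
tiobe_share = [18.0, 17.0, 9.0, 5.0, 2.0]
so_share = [9.0, 2.0, 4.0, 8.0, 6.0]

# One number for the whole data set, bounded by -1 and +1.
r = pearson(tiobe_share, so_share)

# One number per language, unbounded above: SO share relative to TIOBE share.
ratios = [100.0 * s / t for s, t in zip(so_share, tiobe_share)]

print(f"correlation: {r:.2f}")
print("per-language ratios (%):", [round(x) for x in ratios])
```

A per-language ratio like this is naturally written as a percentage, while a correlation coefficient summarises the whole list in a single bounded value.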
sed -e 's/Chuck Norris/Rajnikant/g' joke > fact
Stack Overflow is written in C# and JQuery, you can ask a question about anything on there, but you've always had a better shot of getting or finding an answer if you're looking for a .NET web development related question, if only because the people running the site are more likely to know it. The more likely you are to get an answer the more likely you are to visit the site and provide an answer, so the site is skewed towards that particular technology stack. C# is a nice well designed language whereas Javascript is an abomination, so you end up with a lot more Javascript questions.
The higher incidence of Javascript questions may be due to n00bs asking about Javascript when in fact they are using Java.
Why is the TIOBE index taken to be the ground truth here? The data for the TIOBE index come from a rather crude Google search for language names. So it relies on a fuzzy search, on a mostly irrelevant corpus (all public web sites), performed using a secret algorithm that is constantly being tweaked according to the commercial interests of a monopolist. I would expect Stackoverflow popularity to be a better measure for the incidence of a language.
Most of the comments below--and to a large degree the source article--seem to implicitly assume that all discussion of programming languages happens on Stack Overflow. There probably is some difference in the average experience level of programmers of various languages. But it's also almost certainly the case that OTHER websites also discuss programming languages. For example, someone interested in finding a solution to a Python puzzle might well go to the Python Cookbook (http://code.activestate.com/recipes/langs/python/) rather than Stack Overflow. Similarly, to varying degrees, for all the other languages mentioned (with various sites appropriate to each). All this really amounts to is that "Stack Overflow is a good place to find info on languages X, Y, Z; but not so good for A, B, C" ... and this effect is somewhat self-reinforcing, as users of the "underrepresented languages" look elsewhere for help.
The mere distribution of specialization on various websites says nothing at all about the quality, difficulty, breadth of use, or much of anything else about the languages themselves.
First of all, TIOBE claims to search worldwide, which can't be true as they only search in English. That also explains why the results on TIOBE don't look similar to my gut feeling. E.g. C and C++ jobs are very, very rare in Germany; C is basically only used for embedded programming, and C++ is more or less a legacy language by now.
....
(From their web site)
The ratings are calculated by counting hits of the most popular search engines. The search query that is used is
+"<language> programming"
Now someone is comparing the TIOBE results with StackOverflow, but a lot of TIOBE's data is very likely based on SO.
Like codeproject, codeguru, daniweb, etc., Stackoverflow is language agnostic with its tags system, and so it attracts a disproportionate share of languages which don't have their own established forums to compete. For example, I use it if I have questions about python, but when I have questions about C++, I sometimes go to codeproject.
And they have their own slice of the tags...
If my C doesn't work, it's because I wrote it wrong; I debug, look for the error that I made, correct it and the code eventually works as intended. If I can't find the problem, another pair of eyes from my office will usually help me find the elusive uninitialised value or pointer mishap.
If my Javascript doesn't work, more often than not it's because a browser is interpreting it in a non-standard way; then I have to look online for help in figuring out what non-documented incantation I have to use to make the browser do what I wanted it to do. Javascript problems will fox even the most experienced software engineer because they tend to be arbitrary and illogical in nature.
I'd interpret these statistics as meaning Javascript creates 30 times more requests for help online than C does. It's not a "more difficult language", but when it goes wrong you have to ask the entire world why it isn't working, not just someone from the corridor.
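As a rough check on the "30 times" figure, dividing the two percentages quoted in the summary gives a bit less than that. Note that this is a ratio of TIOBE-relative representation, not of raw question counts, which the summary does not give:

```python
# Representation figures quoted in the summary (percent of TIOBE-aligned share).
js_representation = 294
c_representation = 11

# How many times more over-represented JavaScript is than C on this measure.
ratio = js_representation / c_representation
print(f"{ratio:.1f}")  # prints: 26.7
```

So "roughly 27 times" is closer to what the quoted numbers support, and only relative to each language's TIOBE rating.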
Also, of course, as every other commenter has mentioned JS is for noobs who ask online before they reach for a debugger, etc.
SO is known as the web/.NET hangout, so no surprises. Not exactly one's first stop for other tech.
The answer to this is obvious to me. While some of the other comments do show other reasons that are likely also true, I think this reason is the most impactful.
How many applications out there are a combination of C# and PHP? Maybe a few, but per capita not that many. How many C# / Javascript hybrids are out there? Any ASP.Net application written in C#. How many PHP / Javascript hybrids are out there? Every PHP application ever written.
The biggest reason IMO that Javascript is on top of that list is because any web-based application uses Javascript while not every web-based application uses C# or PHP. Being a C# developer myself I have posted many C# questions to SO and also a handful of Javascript questions. Never have I posted a PHP question.
>Could this also be because programming JavaScript is generally quite difficult and will result in people seeking help more?
Or it could mean that it is so robust and powerful that many people come up with innovative ways of using the language to do many things. Whereas Perl is limited to text manipulation and I/O, JavaScript leverages a browser's DOM completely, allowing not only a hacker to take over a user's browser, but a network admin to test his current network setup, a programmer to develop quick and pretty websites, many regular expressions to be used for parsing data, and .....the list is endless....
If you ask about another language that has minimal usage in today's day and age....(fill in your favorite low end programming language here)....
of course fewer people will respond, leaving you to never return to this site, thereby increasing the questions asked about OTHER programming languages.
It's a catch-22....the less you get in response, the less you come back, and the fewer questions get asked for that language.
Sure stack overflow is a great place for people who teach themselves, but I think it's funny to assume stack overflow is the best place to learn a language for oneself online. It seems to me more like the languages best represented there lack sufficient documentation and other means of getting answers. Java, for example, has some nice beginner forums, sites like JavaRanch for more stack-overflow like questions, good free documentation and tutorials, various open source project forums, etc... I mean, we may as well be asking - who is best represented on expertsexchange... I think it just shows a lack of a better open community to get answers from.
I'm a bit surprised that while a number of people have pointed out how lousy Tiobe is as an index of popularity, that nobody's pointed to an alternative. I'd suggest langpop.com as a considerably better alternative.
The most obvious points of superiority are simply documenting what they actually measure and how they combine the individual measurements to produce a final result. Although Tiobe doesn't document enough of what they do well enough to be sure, it looks like langpop.com covers a couple of types of sources that Tiobe doesn't (or at least doesn't imply they try to cover). One particularly interesting point is that they attempt to gather data about actual code, not just questions about code (e.g., they look at Freshmeat, ohloh, and Google Code).
Oh, and no, I'm not affiliated with Langpop.com (or Tiobe) in any way.
The universe is a figment of its own imagination.