Analyzing StackOverflow Users' Programming Language Leanings
AlexDomo writes to point out this statistical breakdown of the programming languages represented at StackOverflow. "Surprisingly, JavaScript turned out to be the most 'over-represented' language on StackOverflow, by quite a long way at 294% [where "a representation of 100% means that the SO tag count is aligned exactly with the TIOBE language index"]. Could this also be because programming JavaScript is generally quite difficult and will result in people seeking help more often? Following this was C# (which I had expected to be number 1), at 153%. After this, PHP, Ruby and Python were basically fairly balanced at around 100%. The most 'under-represented' major language would definitely be C at 11%. Three other major languages which seemed to be a bit under-represented, below 50%, were C++, Java and Objective-C. For details of the method used and the full results, refer to the original article." One of the attached comments makes an interesting point about the difficulty in divining meaning from such statistics, though.
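For readers wondering what a "representation" percentage means in practice, here is a minimal sketch of one plausible reading: a language's share of Stack Overflow tags divided by its TIOBE share. The function name, the normalization, and the numbers are assumptions for illustration only; the original article may compute it differently.

```python
def representation(so_tag_count, so_total_tags, tiobe_share_percent):
    """Return a language's Stack Overflow share as a percentage of its TIOBE share.

    Assumed definition (not confirmed by the article): 100 means the share of
    SO tags equals the TIOBE rating exactly.
    """
    if so_total_tags <= 0 or tiobe_share_percent <= 0:
        raise ValueError("totals and TIOBE share must be positive")
    so_share_percent = 100.0 * so_tag_count / so_total_tags
    return 100.0 * so_share_percent / tiobe_share_percent


# Hypothetical inputs, not taken from the article: a language holding 5% of
# SO tags with a 2.5% TIOBE rating is "over-represented" at 200%.
example = representation(so_tag_count=5_000, so_total_tags=100_000,
                         tiobe_share_percent=2.5)
print(f"{example:.0f}%")
```

Under this reading, any value above 100 means the language draws more questions than its TIOBE rating alone would predict, which says nothing by itself about why.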
JavaScript is most often used for client-side web scripting. I imagine a lot of javascript tagged stackoverflow questions are related to figuring out the HTML DOM, which can be confusing, or trying to figure out browser quirks, jQuery syntax, etc
On the other end, I don't know anyone personally who is in the process of learning C. Everyone I know who uses it is an old C hacker with years and years of experience who isn't likely to need to ask many questions about it.
The reason Javascript is the most popular is obvious (to me at least): the web is based primarily on three languages - HTML, CSS, and Javascript. With those three, one can do most of what they want with a website. More advanced languages are for more advanced applications. Now, when some geek-lite decides they want to make a website, as many people now toy with, they are going to learn what? The advanced languages or HTML, CSS, and Javascript?
Javascript is the most common not because it's the most difficult. It's the most common because it's the most sought after. Supply - Demand.
Seems obvious to me.
There is also a possibility of people knowing about certain languages more than the others. In other words, in general, somebody programming in C might know the language better than somebody who is doing Javascript.
Having a strong foundation is important to know how to get stuff done before using the 'internet'. Certain languages are just better at that.
I wouldn't say Javascript is a particularly difficult language to program, but there is a huge variation in the skill sets of people developing in it, with a heavy bias towards those who couldn't write an original line of code to save their ass. This is the type of programmer who will flood message boards with requests for help with trivial little problems.
JavaScript is something a newbie might want to try out. Newbies ask more questions.
I don't think that's a reflection on the difficulty of JavaScript.
-Dave
It doesn't surprise me that it seems to correlate to the age of the language multiplied by how widespread its use is, with "newer" languages that are widely used being the most represented.
I don't think it has anything to do with how difficult Javascript is, but more with the programming experience of the person using the language. I'm sure there would have been more posts asking about QBasic than LISP if the internet in 1994 had been what it is today.
Also people using C/Java/etc. can self-teach by digging through libraries themselves.
At my institution through the 90s and early 2000s we had to have many more Windows tech support "firemen" than Apple support techs. On the Apple side there were basically no virus, networking, or printer driver conflict fires to put out. You didn't have to worry about interrupt conflicts between PC cards. No fires.
The result was that every time there was a major IT decision, the Windows support techs would outvote the Apple support techs. Lots of Windows-only software became standard, and at one point there was a push to just go Windows-only.
All because there were more problems and thus more support techs.
I would imagine that more mature languages have fewer people looking for clever tricks on this web site.
Some drink at the fountain of knowledge. Others just gargle.
"Surprisingly, JavaScript turned out to be the most 'over-represented' language on StackOverflow, [...]"
Could this also be because programming JavaScript is generally quite difficult and will result in people seeking help more often?
I think that JavaScript is also used by people that do not understand it very well, and they are more likely to resort to the kind of help that this website provides.
Following this was C# (which I had expected to be number 1), at 153%. After this, PHP, Ruby and Python were basically fairly balanced at around 100%. The most 'under-represented' major language would definitely be C at 11%.
I am a C programmer and do not need help from this "stack overflow" web site.
My references are the C programming language standards and the single UNIX specification.
-Do we ask questions because of difficulty or because the underlying technology is more popular?
-Are javascript developers more likely to use sites like stackoverflow vs traditional means (books, mailing list, forums, etc).
-Do we underestimate javascript usage? Does javascript span more projects, i.e. I have a C# based web-project, but still use javascript for the UI.
These are the underlying questions that would have to be answered before we could derive anything from this sort of analysis. That said, in our recent study of stackoverflow questions (publication pending), we found that there was a strong correlation between the frequency of using a particular API class (as defined by google code search), and the numbers of questions asking about those classes. This could suggest questions have a large popularity component, or it could mean people are more likely to run into difficulty with popular components!
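As a rough illustration of the kind of correlation described above, here is a minimal sketch that computes a Pearson coefficient between how often API classes are used and how many questions mention them. The data are made up for the example, and the commenter's study may well have used a different measure or method.

```python
import math


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of the same length, at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical counts per API class (not real data): usage frequency in a
# code search index, and number of questions mentioning the class.
usage_counts = [120, 45, 300, 10, 75]
question_counts = [30, 12, 80, 2, 20]

r = pearson(usage_counts, question_counts)
print(f"r = {r:.3f}")
```

A coefficient near 1 would only show that the two counts rise together; as the comment notes, it cannot distinguish "popular things get asked about" from "popular things cause more trouble".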
I've seen StackOverflow site, I know and used most of the languages mentioned, but I have no idea what the summary is yammering about.
Fuck systemd. Fuck Redhat. Fuck Soylent, too. Wait, scratch the last one.
Cue the non-JS programmers bashing web developers as "not real programmers". :-)
jrumney beat me to it, but I agree 100%. I don't find these results surprising at all.
Before the PHP haters start posting comments, I would just like to say: haters gonna hate.
Javascript is - according to its author - the most misunderstood programming language in the world. While it bears surface similarity to languages like C and Java, and allows simple programs to be similar in structure to theirs, its core design is much closer to LISP (and the syntax quite efficiently obscures/hides that), and so few people truly understand it... so questions are very frequent.
45 5F E1 04 22 CA 29 C4 93 3F 95 05 2B 79 2A B2
It's not necessarily because of ignorant programmers, reusing existing code is not a bad thing in itself. But yes, Javascript is mostly copy&paste because it's very modular. Big programs are rare, it's mostly just snippets of code implementing specific controls so it's very easy to copy.
All the C programmers are busy over at bufferoverflow.com
They clearly tried to manage their data using javascript, a big mistake from the get-go. If they had taken the same data and parsed it with Perl, they would have found that all the questions came from Python and Ruby people. Had they done it in C++, all the questions would have come from C# users. Had they done it in PL/SQL they would have found that the questions all came from rounding errors.
And if they had done it in assembly, they would have found there were no questions at all...
Damn_registrars has no butt-hole. Damn_registrars has no use for a butt-hole.
I do find this interesting, considering the top tags for Stack Overflow questions are C# and Java.
...real programmers don't ask for help, unlike those wimpy JS hacks and C# pretenders.
make imaginary.friends COUNT=100 VISIBLE=false
...that there aren't many stack overflows in C!
I don't disagree with your post, however, many good programmers start out as "shitty" programmers.
That's simply because not everyone has the background or know-how to get into a "code mentality" right away, and asking stupid questions is a good way to learn (especially if you realize that your question was, indeed, stupid).
I don't think reputation points and badges are worth anything though, but that's me. Still, you can see an incredible amount of really good, informative and "stimulating"** answers posted to not-so-smart questions.
**Stimulating as in some kind of answer that makes you want to try alternate ways to reach your goal. the kind of thing that makes you fire up your IDE/text editor to hack, right away.
So maybe what it tells us is Javascript is where a lot of people get their start coming from non programming fields, and they hit stack overflow having no clue what they're doing. People who learned programming, in any language, don't land on SO because they know most of the basic stuff anyway, and of course most people who learn to program learn to do so in one of the C family and Java.
It's probably because Javascript has the largest proportion of amateur programmers who aren't willing to learn the language they are programming in. They won't buy a book, they won't take a class, they won't read an online manual or tutorial. What they will do is download a free script and then beg others to customize it for them. This is usually prefaced with "I don't know Javascript, but I have to...."
Virtually all of the questions asked there can be answered by doing the following:
1) Reading the documentation of the programming language, library or software in question.
2) Having even a basic level of skill with the technology in question.
A big problem with this is that when "the library" is "All of the .NET framework" just going through the docs isn't always as easy as it seems. And even if you do find what you think is the right parts of it to use you can find yourself confused right up to the point where you ask a question on StackOverflow and someone helpfully points out that .NET actually has multiple implementations of what you want to do and that the obvious one is rarely the right one. Not to mention actual honest-to-god bugs and implementation quirks that aren't mentioned in the official docs (sure, you can search all of MSDN and hope to stumble across some MS advisory that explains a workaround but even then you might find it is overly specific, if you find it at all).
As for JavaScript there are definitely a lot of beginners out there trying to use it. There is also the issue of JavaScript being frequently used with (X)HTML, CSS and some web service that it fetches data from. Couple this with a lot of the information about JavaScript out there being wrong or outdated and it isn't really that strange that a lot of developers who would normally mainly work in say, Java, C# or Python, find themselves confused and facing conflicting information on how to solve a seemingly strange problem. JavaScript as implemented by various browsers also has a few oddities (both in terms of differing implementations and plain WTFs that are bound to baffle developers unfamiliar with it).
Greylisting is to SMTP as NAT is to IPv4
The basic languages that new people work with most will be the ones that most questions are posted about. Not many people begin their programming experience in C nowadays, so people learning C will have some general experience behind them. Also my totally blind guess is that C experts don't frequent stackoverflow as much. That leads to the interesting question of "where do C and other programmers go to get their answers if not to stackoverflow?".
To me it is something as simple as this: most Objective-C developers are coding for Apple platforms, so they ask in an Apple-specific place. My C++ coding is usually done with Qt, so I will ask in a Qt-related place. Linux kernel developers are not going to ask C questions on stack overflow, they ask on a Linux-related site.
And so on...
TIOBE is pretty much the most worthless index you can make without just making shit up. I have no idea why people keep paying attention to them.
It boggles the mind why anyone would take a pretty accurate measure of a local population, compare it with a wildly inaccurate one over a larger population, and expect to find some meaningful relation between the two.
With JavaScript, you've got to support multiple browsers from the get-go, doing things that were never intended by the VM implementers, such as network polling and widget systems.
There is a fine line between being a cultivated citizen and being someone else's crop. - A. J. Patrick Liszkie
Stackoverflow is great for people that teach themselves. TIY or TitY? The list there is ordered by the most popular languages people teach themselves. Some people, namely myself, need to reach out every once in a while for a little help. Well, there are the kids using it to get someone else to do their homework but no need to go into that.
Having to work for a living is the root of all evil.
While certainly, there are a large number of questions that could easily be solved by googling, many questions are more subtle or deal with issues that are not well documented.
Particularly in technologies that change quickly, there is a huge need for this kind of site. One problem with googling information that changes quickly (for example, Linux) is that information that's out there quickly gets out of date, and people spend hours trying to solve their problems with inaccurate how-to's and man pages. Asking a question gets you more up-to-date information from people that know what they're doing, and it becomes a self-documenting system.
StackOverflow has become the primary location to go to search for programming issues you're dealing with, because unlike google, it doesn't contain extraneous results, spam, and things non-programming related.
If you need web hosting, you could do worse than here
I suspect that, in JavaScript's case, it's because there are a lot of things that can go wrong. In C, once you understand pointers that's 90% of the difficulty gone. With JavaScript, you have weird quirks of the scoping, strangeness related to the semicolon insertion and the bizarre binding behaviour of return, and that's before you get into browser-specific quirks and DOM weirdness.
I am TheRaven on Soylent News
there is a huge variation in the skill sets of people developing in it, with a heavy bias towards those who couldn't write an original line of code to save their ass.
That, combined with the fact that the internet is flooded with ancient javascript snippets ripe for copying and pasting despite the fact that they don't work on anything but netscape 4.
If I have been able to see further than others, it is because I bought a pair of binoculars.
It would be interesting to see which languages the high-scoring members answer most of their questions in, wildly interpreting that as which languages the competent programmers are using.
Virtually all of the questions asked there can be answered by doing the following: 1) Reading the documentation of the programming language, library or software in question.
This is one reason there are so many JavaScript (perhaps actually DOM) questions -- where is the documentation to answer questions like "how do I do x across all major browser versions, which didn't really follow standards well"? If I'm programming in, say, Java or C++ with some framework where I control more of the environment, I can go to one place to answer questions, but there's no one definitive source for these cross-browser problems.
C is only particularly dangerous when inexperienced developers just jump in and start writing code without having more experienced developers review what they wrote. The inexperienced developers have not yet developed good practices to ensure they don't overflow a buffer. Similarly they don't understand ownership of allocated memory, so they end up either freeing memory owned by other code, or failing to free memory when ownership is passed to their code.
If experienced developers carefully reviewed the code, these sorts of problems would be noticed and corrected, and things are fine. Unfortunately, real code review is somewhat rare in the business world, as are static analyzers that would also tend to catch many of these same mistakes. If you properly implement safeguards C can be a fine language, (although admittedly I'm a bit partial to C++ myself as a "low-level" language, despite its many, many flaws).
For a high level statically typed language, I actually prefer C# as a language over Java, although C#'s strong ties to Microsoft are admittedly a negative. They are rather similar languages, each with their own flaws, but I like that C# sometimes adds some syntactic shortcuts for common patterns, while Java absolutely refuses to even consider many syntax-only changes, and when they do add one, like anonymous classes, they feel half-baked.[1]
[1] Anonymous classes must be inner classes, they do not provide the option of being static nested classes which I find to be far more useful than inner classes. They are also not full closures, which are (were?) planned for java 7, which would (will?) end up making anonymous classes feel largely obsolete. (Although in Java's Defense, I'll admit that C# made a similar mistake with the anonymous methods, which really are entirely obsoleted by the newer lambda expressions. On the other hand, C#'s anonymous methods were in fact full blown closures.)
Stylish sheet to fix many problems in Slashdot's D3: https://gist.github.com/801524
I'd guess that SO is more representative of "what people are actively engaged in". Maybe... for example, I work in "C" and "Perl" all the time. However, I know them well enough that I very, very rarely post any C or Perl questions.
I have worked a lot with Android and iPhone lately. As these are more recent environments for me, I post more questions about them on SO.
So maybe SO is more reflective of what people are learning?
I have recently contributed a few answers to stackoverflow on a language-independent subject. I did notice that PHP programmers ask the most obvious questions, followed immediately by C# programmers. Based on my experience the statistics on stackoverflow show one thing: the popularity of a language among wanna-be programmers (maybe they do not even want to be programmers).
So, languages that have come into heightened popularity in the last decade or so, most of which are primarily used or oriented around web development, were the most overrepresented, while well-established languages aimed at native applications development were the most underrepresented? And from that they conclude that Javascript is a hard language? I think there are a number of better conclusions that could have been made, such as:
1) Stack Overflow attracts more web developers than native application developers.
2) Newer languages have less documentation available.
3) Those languages that are overrepresented tend to attract more newbies.
4) There's a lot more innovation in the ways that those languages are being used, which prompts more questions.
5) Those languages are the most dissimilar from the others, which means that less previous experience translates over.
I'm not suggesting that any of these is necessarily the cause, or that it's one of these and not the others, but I do think that any of them is a better explanation than what was given in the summary.
The ratings have an inverse relation to the average competence and self-reliance of those using the language. Javascript and C# are typically used by inexperienced programmers or programmers without advanced internals knowledge. People on this side of the coin haven't been told to RTFM enough so they expect others to give them answers rather than finding them themselves. PHP and Python coders thrive in a community populated by many former Perl coders so many of them get the proper RTFM treatment but they are also popular and easy languages so they only get the middle ground. C programmers thrive in a world of RTFM and are close enough to the internals that they usually know what the problem is.
The idea that this is because Javascript and C# are 'hard' is actually fairly laughable.
The problem isn't Javascript (though it's certainly not my first choice of languages), it's the DOM. The problem is that there's about 5-10 versions of the DOM that you have to worry about, and you never know how browser X implemented its DOM. I think you've got it backwards -- you get lots of Javascript questions because good programmers get frustrated writing multiple implementations of the same code for each browser, and want to find something that "just works" so they can move on to some more interesting task.
JavaScript tops the most asked programming language because of the popularity of jQuery.
"How do I do this in jQuery?"
"How do I do that in jQuery?"
Even if you are familiar with JavaScript and HTML DOM, you have to forget what you know and do everything the "jQuery way."
This is a completely self-serving poll. The site is used heavily by people who are biased towards developing web sites. Here are a better set of polls:
How many device driver writers program in JavaScript, Virus Broadcast Script (VBS), or COBOL? Not very many.
How many compiler writers program in languages other than C++? Not very many.
How many people who create high performance virtual machines implement them in Scheme, Haskell, Ruby, or Python? Probably zero
How many embedded programmers write in an interpreted language? Probably zero
How many top500 systems developers write code in an interpreted language? Probably zero.
Different domains have different requirements; interpreted languages (even with JITs) are okay for some things, but when performance matters, or explicit understanding of device events/characteristics are required, it is difficult to use such. Not impossible, merely very hard. JavaOS _did_ boot on both Alpha and StrongARM...
It's probably more due to the fact that people who use Real Programming Languages (e.g. C, C++ and so on) are more likely to be trained or experienced programmers who usually know what they are doing, while JavaScripters and C# people are a bit more likely to be kids trying to get some webpage working.
JavaScript on its own isn't hard, it's actually a quite nice language. The DOM thing is pretty awful though.
But this takes the cake. Why the hell are they expressing correlation coefficient in percentages? It has always been represented as a rational number between -1 and +1. I know people are dropping out of engineering and science courses. Then they come up with their own units and scale for things with well known standard and commonly accepted practices.
sed -e 's/Chuck Norris/Rajnikant/g' joke > fact
C is only particularly dangerous when inexperienced developers just jump in and start writing code without having more experienced developers review what they wrote.
In other words, C is only particularly dangerous most of the time. :^P
I don't care if it's 90,000 hectares. That lake was not my doing.
Stack Overflow is written in C# and JQuery, you can ask a question about anything on there, but you've always had a better shot of getting or finding an answer if you're looking for a .NET web development related question, if only because the people running the site are more likely to know it. The more likely you are to get an answer the more likely you are to visit the site and provide an answer, so the site is skewed towards that particular technology stack. C# is a nice well designed language whereas Javascript is an abomination, so you end up with a lot more Javascript questions.
The higher incidence of Javascript questions may be due to n00bs asking about Javascript when in fact they are using Java.
Most of the comments below--and to a large degree the source article--seem to implicitly assume that all discussion of programming languages happens on Stack Overflow. There probably is some difference in the average experience level of programmer of various languages. But it's also almost certainly the case that OTHER websites also discuss programming languages. For example, someone interested in finding a solution to Python puzzle might well go to the Python Cookbook (http://code.activestate.com/recipes/langs/python/) rather than Stack Overflow. Similarly, to varying degrees, for all the other languages mentioned (with various sites appropriate to each). All this really amounts to is that "Stack Overflow is a good place to find info on languages X, Y, Z; but not so good for A, B, C" ... and this effect is somewhat self-reinforcing, as users of the "underrepresented languages" look elsewhere for help.
The mere distribution of specialization on various websites says nothing at all about the quality, difficulty, breadth of use, or much anything else about the languages themselves.
Buy Text Processing in Python
First of all, TIOBE claims to search worldwide (which can't be true, as they only search in English, but that also explains why the results on TIOBE don't match my gut feeling. E.g. C and C++ jobs are very, very rare in Germany; C is basically only used for embedded programming, and C++ is more or less a legacy language by now)
....
(From their web site)
The ratings are calculated by counting hits of the most popular search engines. The search query that is used is
+" programming"
Now someone is comparing the TIOBE results with StackOverflow, but a lot of TIOBE's data is very likely based on SO
Cost free eBook I read (by iBook/Kobo/Amazon/ObookO/Gutenberg etc.): "The Green Odyssey" by Philip Jose Farmer.
Virtually all of the questions asked there can be answered by doing the following:
1) Reading the documentation of the programming language, library or software in question.
2) Having even a basic level of skill with the technology in question.
Sorry mister elite anonymous super hacker, you could not be more wrong, at least on your point 1).
Most documentation is just bad, really bad. Someone who likes to learn stuff by himself has real trouble finding what he is looking for.
E.g. you already know a bit of C++ ... just a bit, from a one-semester university course, say. Now you want to learn Java: so where are your destructors, where is your virtual member function, how do you declare a reference, how do you write a conversion operator?
You see: plenty of easy-to-answer questions which you won't get answered by searching / reading some Java "documentation" ... because stuff that does not exist is hard to find, stuff that uses different names (member function versus method) is hard to grasp/find etc. etc.
Now your point 2) comes, sure people who don't know that a member function in C++ is the same as a method in Java, don't know much about the basic concepts ....
Like codeproject, codeguru, daniweb, etc. Stackoverflow is language agnostic with its tags system, and so it attracts a disproportionate amount of languages which don't have their own established forums to compete. For example, I use it if I have questions about python, but when I have questions about C++, I go to codeproject sometimes.
Did you really throw Java and C together and say "real languages"? Real men who can grow a full beard program in C++.
Oblivion Awaits
>Could this also be because programming JavaScript is generally quite difficult and will result in people seeking help more?
Or it could mean that it is so robust and powerful that many people come up with innovative ways of using the language to do so many things. Whereas Perl is limited to text manipulation and IO, javascript leverages a browser's DOM completely, allowing not only a hacker to take over a user's browser, but a network admin to test his current network setup, and a programmer to develop quick and pretty websites, to say nothing of the many regular expressions used for parsing data, to ..... the list is endless....
If you ask about another language that has minimal usage in today's day and age....(fill in your favorite low end programming language here)....
of course less people will respond, leaving you to never return to this site, thereby increasing the questions asked about OTHER programming languages.
It's a catch 22....the less you get in response , the less you come back, the less questions asked for that language.
Sure stack overflow is a great place for people who teach themselves, but I think it's funny to assume stack overflow is the best place to learn a language for oneself online. It seems to me more like the languages best represented there lack sufficient documentation and other means of getting answers. Java, for example, has some nice beginner forums, sites like JavaRanch for more stack-overflow like questions, good free documentation and tutorials, various open source project forums, etc... I mean, we may as well be asking - who is best represented on expertsexchange... I think it just shows a lack of a better open community to get answers from.
1) Reading the documentation of the programming language, library or software in question.
Personally, I consider the output of a Google/Stack Overflow search to be part of the documentation of the programming language. I'm not sure why someone would choose to ignore a valuable resource.
If I can find a code snippet that does what I want it to, rather than having to write it from scratch, great. In particular, I'd rather spend my time working on application logic than low level dealings with the OS.
--Jeremy
Jesus was a liberal
I'm a bit surprised that while a number of people have pointed out how lousy Tiobe is as an index of popularity, that nobody's pointed to an alternative. I'd suggest langpop.com as a considerably better alternative.
The most obvious points of superiority are simply documenting what they actually measure and how they combine the individual measurements to produce a final result. Although Tiobe doesn't document enough of what they do well enough to be sure, it looks like langpop.com covers a couple of types of sources that Tiobe doesn't (or at least doesn't imply they try to cover). One particularly interesting point is that they attempt to gather data about actual code, not just questions about code (e.g., they look at Freshmeat, ohloh, and Google Code).
Oh, and no, I'm not affiliated with Langpop.com (or Tiobe) in any way.
The universe is a figment of its own imagination.