Slashdot Mirror


R Throwdown Challenge

theodp (442580) writes "'R beats Python!' screams the headline at Prof. Norm Matloff's Mad (Data) Scientist blog. 'R beats Julia! Anyone else wanna challenge R?' Not that he has anything against Python, Matloff adds, but he just doesn't believe that Python or Julia will become 'the new R' anytime soon, or ever. Why? 'R is written by statisticians, for statisticians,' explains Matloff. 'It matters. An Argentinian chef, say, who wants to make Japanese sushi may get all the ingredients right, but likely it just won't work out quite the same. Similarly, a Pythonista could certainly cook up some code for some statistical procedure by reading a statistics book, but it wouldn't be quite the same. It would likely be missing some things of interest to the practicing statistician. And R is Statistically Correct.'"

3 of 185 comments

  1. Bad analogy by Florian Weimer · Score: 5, Insightful

    An Argentinian chef is more likely to make great sushi than a Japanese automotive engineer.

    You generally want to use programming languages designed by experienced programmers (even better, experienced language designers) who work closely with subject matter experts. Left to their own devices, experts are likely to get a lot of things wrong, and if the language is sufficiently popular, you are stuck with their mistakes for a long time to come.

    1. Re: Bad analogy by professionalfurryele · Score: 4, Insightful

      Sorry, but I use both R and Python in my work as a biomechanist, and while I love working with Python and hate working in R, R is not only less verbose for this task, it is also more consistent, more intuitive, and better documented. Very few languages beat Python for simple, easy-to-read code, but it is not up to the task of doing general-purpose statistics. To see why, consider a problem with that blog post: all the diagnostic plots I need to check the regression are missing. No Q-Q plot, no Cook's distance, not even something simple like fitted vs. residual. Now consider what happens when I notice that, while the fit is decent, the residuals depend on which subject I'm looking at and I need to vary the error term, or I need to switch to a mixed-effects model because the intercept clearly depends on the subject.
      Seriously, when I say I hate R, I mean it. The code is ugly, it can be hard to read, and woe betide the poor git who makes the mistake of needing a plot more complicated than something lattice can do. It is still better than Python for statistics.
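
      For readers unfamiliar with the diagnostics named in the comment above (fitted vs. residual, Cook's distance, Q-Q), here is a minimal sketch of the numbers behind them for a one-predictor least-squares fit. It uses only the Python standard library. The function name simple_ols_diagnostics and the returned dictionary keys are made up for illustration; this is not code from the blog post or from any statistics package, and it covers only the simplest regression case, not the mixed-effects models the commenter goes on to mention.

```python
from statistics import NormalDist, mean


def simple_ols_diagnostics(x, y):
    """Fit y = a + b*x by least squares and return per-point diagnostics.

    Hypothetical helper for illustration; assumes len(x) > 2, x not constant,
    and a fit that is not perfect (residual variance > 0).
    """
    n = len(x)
    p = 2  # fitted coefficients: intercept and slope
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar

    # Fitted values and raw residuals (the "fitted vs. residual" plot data).
    fitted = [intercept + slope * xi for xi in x]
    residuals = [yi - fi for yi, fi in zip(y, fitted)]
    s2 = sum(e ** 2 for e in residuals) / (n - p)  # residual variance estimate

    # Leverage: diagonal of the hat matrix for a one-predictor model.
    leverage = [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]

    # Cook's distance: how much each point moves the fit if it is dropped.
    cooks = [
        e ** 2 / (p * s2) * h / (1 - h) ** 2
        for e, h in zip(residuals, leverage)
    ]

    # Q-Q plot data: sorted standardized residuals against normal quantiles.
    standardized = sorted(
        e / (s2 * (1 - h)) ** 0.5 for e, h in zip(residuals, leverage)
    )
    normal = NormalDist()
    theoretical = [normal.inv_cdf((i + 0.5) / n) for i in range(n)]

    return {
        "intercept": intercept,
        "slope": slope,
        "fitted": fitted,
        "residuals": residuals,
        "leverage": leverage,
        "cooks_distance": cooks,
        "qq_theoretical": theoretical,
        "qq_sample": standardized,
    }


if __name__ == "__main__":
    # Made-up data: roughly linear, with the last point pulled upward.
    xs = [1, 2, 3, 4, 5, 6, 7, 8]
    ys = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 25.0]
    diag = simple_ols_diagnostics(xs, ys)
    worst = max(range(len(xs)), key=lambda i: diag["cooks_distance"][i])
    print("most influential point index:", worst)
```

      Plotting fitted against residuals, qq_theoretical against qq_sample, and Cook's distance by index would give rough equivalents of the plots the commenter misses. In R, calling plot() on a fitted lm object produces this family of plots without the bookkeeping, which is the convenience being pointed to.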

  2. true, but not really because of R itself by Trepidity · Score: 5, Insightful

    R itself is okay, but even as a long-time user I don't think the language or environment is all that much to brag about. What makes it great for statistics is just that statisticians use it, which means that a lot of the packages are written by statisticians. That makes a big difference: recent papers often have R implementations, standard problems have well-maintained R packages with all the bells and whistles, etc. As Matloff notes, this means they often have everything that statisticians are looking for, while the straightforward textbook implementations you find in other languages frequently aren't nearly as thorough in how they handle the statistical models, or only handle some special cases (though there are some really good packages in other languages, just not as many).

    But I don't think that has much to do with R itself being uniquely suited to statisticians. It's used for historical reasons: Bell Labs S was influential in the field way back when nothing like Python or Julia existed, and statisticians started using it because it was a lot nicer than Fortran, which is what other areas of science mostly used back then. GNU R is essentially a free-software workalike for Bell's S, and it's kept most of the community on board through a mixture of existing packages, familiarity, and inertia.