Slashdot Mirror


Interviews: Ask Author and Programmer Andy Nicholls About R

Andy Nicholls has been an R programmer and consultant for Mango Solutions since 2011 (where he currently manages the R consultancy team), after a long stint as a statistician in the pharmaceutical industry. He has a serious background in mathematics, too, with a Masters in math and another in Statistics with Applications in Medicine. Andy has taught more than 50 on-site R training courses and has been involved in the development of more than 30 R packages; he's also a regular contributor to events at LondonR, the largest R user group in the UK. But since not everyone can get to London for a user group meeting, you can get some of the insights he's gained as an R expert in Sams Teach Yourself R In 24 Hours (available in print or at Safari), of which he is the lead author. Today, though, you can ask Andy about the much-lauded statistics-oriented free software (GPL) language directly -- Why to use it, how to get started, how to get things done, and where those intriguing release names come from. (The about page is helpful, too.) As usual, please ask as many questions as you'd like, but one question at a time, please. Note: Slashdot is always looking for interesting interview guests. Who do you want to ask? Let us know!

41 of 187 comments

  1. R? by U2xhc2hkb3QgU3Vja3M · · Score: 5, Funny

    Is that a pirates-only language?

    1. Re:R? by Hognoxious · · Score: 1

      Only if you use Mate[y].

      --
      Confucius say, "Find worm in apple - bad. Find half a worm - worse."
    2. Re:R? by Tablizer · · Score: 1

      only if calculating the Planck Constant

    3. Re:R? by carnivore302 · · Score: 1

      I'm still waiting for the answers of 'ask ray kurzweil'. It's been two months already.

      --
      Please login to access my lawn
  2. Evolution of R by patabongo · · Score: 4, Interesting

    How has the way you use R changed over time? For myself, I don't think I've gone through an entire R session in the past six months without loading dplyr. Combine that with the pipeline operator and I think if you'd shown the R code I wrote yesterday to me of two years ago, I wouldn't have believed it was the same language.

  3. Future of R, now that programmers use it? by Anonymous Coward · · Score: 2, Insightful

    What's your take on the future of R? It used to be that it was a tool for statisticians, and now it's been discovered by programmers. As a statistician who's not a programmer, but who hangs out sometimes on slashdot and stackoverflow, it feels sometimes like it's in danger of becoming just another language for programmers, instead of a tool for statisticians. Should I be worried? Can it be both? Is this mass inflow of programmers going to change it somehow? Or am I just having a "get off my lawn" moment?

    More about me, I'm a PhD statistician at a major public research university, and use R every day for data manipulation, exploration, and analysis, and have for 10+ years. I've done a few packages and enough coding that I know most of R's quirks, but would not consider myself a programmer.

    1. Re:Future of R, now that programmers use it? by Anonymous Coward · · Score: 1

      The only reason any sane programmer uses it is because they have to write some stat code using some obscure test or analysis package only available in R.

    2. Re:Future of R, now that programmers use it? by Pseudonym · · Score: 3, Insightful

      As a statistician who's not a programmer, but who hangs out sometimes on slashdot and stackoverflow, it feels sometimes like it's in danger of becoming just another language for programmers, instead of a tool for statisticians.

      As a programmer who used to research programming languages, there's no danger of that at all.

      It's not much of a stretch to say that no programmer really uses R. At most, programmers use the high-quality statistical libraries which only work with R. R is basically the best statistical packages ever written bound together by one of the worst programming languages ever developed.

      --
      sub f{($f)=@_;print"$f(q{$f});";}f(q{sub f{($f)=@_;print"$f(q{$f});";}f});
    3. Re:Future of R, now that programmers use it? by Anonymous Coward · · Score: 1

      It's not much of a stretch to say that no programmer really uses R. At most, programmers use the high-quality statistical libraries which only work with R. R is basically the best statistical packages ever written bound together by one of the worst programming languages ever developed.

      This is it *exactly*!

    4. Re:Future of R, now that programmers use it? by gumbi+west · · Score: 2

      I actually program exclusively in R and find it OK once you learn the quirks. Where it excels is in sort of "jotting" down thoughts about programs, e.g. you can define an S3 class and then make one that only has a few of the properties, or claim your object is a class it is not. This would drive any Java programmer bananas but it's super nice for going fast and loose.

      Similarly, the fact that it can recover your call in addition to the arguments you passed makes several functions work much better when you haven't specified all of the optional arguments. Once you have specified them then it's back to irrelevant.
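
      A minimal sketch of the two behaviours described above, using base R only. The names new_reading, print.reading and show_call are hypothetical and exist purely for illustration; this is one way to demonstrate the idea, not a recommended design.

      ```r
      # A hypothetical S3 "constructor": just a list with a class attribute.
      new_reading <- function(value, unit) {
        structure(list(value = value, unit = unit), class = "reading")
      }

      # A print method, dispatched on the class attribute alone.
      print.reading <- function(x, ...) {
        cat("reading:", x$value, x$unit, "\n")
        invisible(x)
      }

      full <- new_reading(3.2, "mg")

      # "Fast and loose": claim the class without supplying every field.
      # Nothing checks that `unit` is missing.
      partial <- structure(list(value = 7), class = "reading")
      claims_class <- inherits(partial, "reading")   # TRUE

      # match.call() returns the call as matched against the formal arguments,
      # so a function can see which optional arguments the caller supplied.
      show_call <- function(x, scale = 1, ...) {
        match.call()
      }
      recovered <- show_call(1:3, scale = 2)
      supplied <- names(as.list(recovered))[-1]      # argument names in the call
      ```

      Here supplied contains "x" and "scale"; calling show_call(1:3) instead would leave "scale" out, which is how a function can tell a default from an explicit choice.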

    5. Re:Future of R, now that programmers use it? by shellbeach · · Score: 1

      I actually program exclusively in R and find it OK once you learn the quirks.

      I dunno -- there's an awful lot that's cumbersome about R and constantly does my head in. My pet bugbears:

      No native hash/dictionary construct (there is the third-party hash library, but that's not great for portability).
      It's not possible to define functions at the end of your code, making code difficult to read (or requiring you to source a separate script that contains your functions, but again, portability suffers).
      Variable scoping is ... odd (many people have written previously about R quirks in this regard)
      R is so sloooowwwww ...

      I still use R quite a bit and suffer through writing code for it because of the incredible power of the modules. But ... I kinda feel dirty every time :(
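
      For the first two items in the list above, base R does offer partial workarounds. The sketch below assumes nothing beyond base R; the variable names and the main()/helper() split are made up for illustration, and whether this counts as a "native" dictionary is a matter of taste.

      ```r
      # Environments are hashed by default, so one can stand in for a dictionary.
      counts <- new.env(hash = TRUE)
      for (word in c("r", "python", "r", "julia", "r")) {
        seen <- exists(word, envir = counts, inherits = FALSE)
        current <- if (seen) counts[[word]] else 0
        counts[[word]] <- current + 1
      }
      keys <- sort(ls(counts))

      # Top-level statements run in order, but names inside a function body are
      # looked up only when the function is called. Wrapping the script in main()
      # and calling it last lets the helpers sit below the main logic.
      main <- function() {
        helper(2)
      }
      helper <- function(x) x * 10
      result <- main()
      ```

      After this runs, counts[["r"]] is 3 and result is 20.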

  4. Re:No thanks: I'll stick by the BIG "up & come by Sax+Russell+5449D29A · · Score: 1

    I don't believe you're even a human any more. In fact, I'd go as far as to say you're something akin to 791.

    --
    -SR
  5. Key advantages of R by Compuser · · Score: 5, Interesting

    In your view, what are the key advantages of R over other scientific computing languages, most notably Matlab (which has to be considered with its plethora of toolboxes of course)?

    1. Re:Key advantages of R by TeknoHog · · Score: 4, Interesting

      In your view, what are the key advantages of R over other scientific computing languages, most notably Matlab (which has to be considered with its plethora of toolboxes of course)?

      Or Python with scipy/numpy, or Julia, given their open source nature in addition to the plethora of libraries.

      --
      Escher was the first MC and Giger invented the HR department.
    2. Re:Key advantages of R by martinux · · Score: 1

      While I am really only dipping my toe into R I decided to do some research on this question a while back.

      I have used python for a number of scientific applications and was attempting to determine if I should use Rpy2 (http://rpy2.bitbucket.org/). It initially made sense to keep all of the data retrieval, formatting and analysis in a few python scripts. However, it seems that the design of the R language intrinsically accounts for the problem solving methodology: "R is designed to operate the way that problems are thought about." (http://www.burns-stat.com/documents/tutorials/why-use-the-r-language/)

  6. Tips for new statisticians by Anonymous Coward · · Score: 1

    For those that are relatively new to R and hope to enter the field of statistics, where would you recommend focusing your R training efforts?

    For example, which programming concepts, or fields of application, or packages, etc. do you feel are especially worthy of attention?

    Similarly, what would you recommend we avoid?

    1. Re:Tips for new statisticians by Shadow+of+Eternity · · Score: 2

      Hoisting the AC for asking a good question.

      To add on: R is gaining massive traction in graduate programs but so many professors teach it like it's SPSS, almost as a cargo cult coding language, and so much of the documentation is written for people who are already experienced coders. Is there any decent introduction to R for someone that doesn't already know it (or another programming language) fluently?

      --
      A bullet may have your name on it but splash damage is addressed "To whom it may concern."
    2. Re:Tips for new statisticians by blueshift_1 · · Score: 1

      I know it's not free, but Udemy has some truly excellent R courses. From the very basics progressing to actual data science.

  7. Re:No thanks: I'll stick by the BIG "up & come by moronikos · · Score: 1

    Ken M usually does not post as anonymous.

  8. Re:No thanks: I'll stick by the BIG "up & come by squiggleslash · · Score: 1

    Ken M is actually funny. Of course, as a Ken M enthusiast I'd add one more reason he's a huge improvement over APK here.

    --
    You are not alone. This is not normal. None of this is normal.
  9. What about the painful side of R? by Shadow+of+Eternity · · Score: 5, Interesting

    There's an entire book, the R Inferno, dedicated to R's many "quirks" and problems. Is there ever a plan to dedicate some time to focusing on cleaning up the language and making it less painful to use?

    --
    A bullet may have your name on it but splash damage is addressed "To whom it may concern."
  10. Harsh crowd by Anonymous Coward · · Score: 2, Interesting

    In my experience (from searching for R advice online - I've never mailed the R discussion list myself) the R community is incredibly harsh and unforgiving of new users. Answers to beginners' questions are normally brusque - often extremely so. (I remember one exchange, where a user basically asked "I've read the documentation for par, and I don't understand ...", and the response was, in its entirety, "?par" -- which, for those unfamiliar with R, is the command to bring up the documentation for par.)

    On the statistical end of things, too, the community seems less than helpful. My impression is that it's normally assumed that all R users have good (graduate student-level) backgrounds on the statistical aspects, and little to no consideration is given to those who might not be up to speed on the theoretical basis of some of the functions in R, or who haven't read the (pay-walled, mathematically dense) 1963 paper where the method was first described.

    What are your thoughts on the helpfulness and "beginner friendliness" of the R community? Do you think there might be an issue with going from a very hand-holdy "Teach Yourself In 24 Hours" type work and being abruptly dumped into a much more brusque "why are you asking us? - figure it out yourself!" type environment?

    1. Re: Harsh crowd by shellbeach · · Score: 1

      As a statistician: someone not trained in statistics using statistical methods when they don't understand the concepts in that mathematically dense paper from 1963 is a dangerous thing. If you want me to be your statistics consultant, pay me my consulting rate. I don't generally consult for free, on the r-help mailing list or elsewhere.

      If you don't understand that 1963 paper, you need a statistics consultant. Don't expect someone to do your statistical work for free.

      I think you just beautifully proved the OP's point.

  11. Impressed with R's speed by joeblog · · Score: 2

    I encountered R via Johns Hopkins University's data science series of Coursera courses which I highly recommend. The first one is at https://www.coursera.org/learn...

    As a mainly Python programmer, but someone with an eclectic interest in programming languages (I enjoy Prolog, Lisp, ML...), I've found R very intriguing: it's a very "functional" programming language, but also object oriented (using dollar signs instead of the customary dots). I've also found R to be incredibly quick -- provided you know and use the right builtin functions. I once tried to solve an assignment with a for loop and killed the process after it hadn't finished within a day. Using "aggregate" did the job within an instant of pressing enter.
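
    The contrast between a hand-written loop and "aggregate" can be sketched on toy data. The data frame and its column names below are invented for illustration; on six rows there is no speed difference to observe, so this only shows that the two forms compute the same group means.

    ```r
    # Toy data; the column names are made up.
    df <- data.frame(
      group = c("a", "b", "a", "c", "b", "a"),
      value = c(1, 2, 3, 4, 5, 6),
      stringsAsFactors = FALSE
    )

    # Loop version: subsets the data once per group and grows the result.
    loop_means <- numeric(0)
    for (g in unique(df$group)) {
      loop_means[g] <- mean(df$value[df$group == g])
    }

    # Built-in version: one call, grouping handled internally.
    agg <- aggregate(value ~ group, data = df, FUN = mean)

    # tapply() is another base option; it returns a named vector.
    tap <- tapply(df$value, df$group, mean)
    ```

    All three hold the same means (10/3 for "a", 3.5 for "b", 4 for "c"), differing only in the shape of the result.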

    I've found R to have numerous strange quirks I haven't got the hang of, resulting in weird results sometimes which I can't debug. The Coursera course mentioned above teaches a style of R I'm not particularly fond of using various libraries, which I'm ideologically opposed to in the same way I prefer battling with JavaScript directly rather than learning JQuery as an intermediary "dialect".

    What are your pointers for the "right way" to program in R?

    --
    If it works, it's obsolete
  12. R vs Python by Anonymous Coward · · Score: 1

    I am myself an R aficionado, but what do you answer to someone who says that Python has gone a long way to be a good contender for data analysis tasks (SciPy, Pandas, Scikit etc...)?

    1. Re:R vs Python by 110010001000 · · Score: 1

      You should tell them about Object Pascal/Delphi! I've heard it is awesome from a reputable source!!!

  13. Re:"Tirade" was a direct response to your words by BronsCon · · Score: 1

    I've complimented your work in the past, as a matter of fact. I'm sorry if you did not see it. It's not the application that's the problem, but the personality attached to it. If you read my posts on other topics, not related to you, you'll find that I'm quite often the reasonable one in the room. That leads me to wonder if that's really any different here.

    Hopefully it wasn't me who drove you to that drink. I'm more of a gin man, myself, but I do enjoy a good rum; might I ask what you're pouring tonight?

    --
    APK quotes people (including myself) without context and should not be trusted. Just thought you should know.
  14. Re:No thanks: I'll stick by the BIG "up & come by 110010001000 · · Score: 1

    There is no way that ranking is accurate. It says C is the #2 language. There is no way. C++ is over C and no one programs in Delphi since 1999.

  15. Re:So do other languages... apk by 110010001000 · · Score: 1

    Dude, no one has used Object Pascal since 1999.

  16. Re:Nope, never saw it till now &... apk by BronsCon · · Score: 1

    my persona can be as nice as the next person's UNTIL I am attacked

    And this is why I was trying to point out that my initial comment, months ago, was indeed a joke and not a directed attack. Ya gotta admit, ya jumped in pretty heavy at the onset.

    We good?

    Cruzan is good stuff, definitely one of my choice rums when I go that route. Getting any "spendier" than that is just for show. My gin of preference is Citadelle; I bought a bottle of Tanqueray #10 at 4x the cost per ounce one night when I wanted to indulge and it's ended up being a show piece, certainly not best of breed.

    I miss the winter weather, but I certainly don't miss being able to wear sandals year-round. I envy your snow right now, though.

    --
    APK quotes people (including myself) without context and should not be trusted. Just thought you should know.
  17. Re:Again: TIOBE index shows differently by 110010001000 · · Score: 1

    Don't believe everything you read on the Internet. No one knows what TIOBE is. I can put up a website that says anything. There is no way anyone is using Delphi in 2016.

  18. Using R to learn statistics by twistedcubic · · Score: 3, Interesting

    What topic(s) in statistics do you think students can learn easier today using R than years ago when there was nothing like R widely available?

  19. Minitab by Hognoxious · · Score: 1

    I think Minitab is better. How would you convince me otherwise?

    --
    Confucius say, "Find worm in apple - bad. Find half a worm - worse."
  20. Re:Ok, the real question - Why is it better than.. by Xtifr · · Score: 1

    R has been around longer than Java, and is based on S which is older than C++. There's a huge body of existing code and libraries to leverage. But from what I gather, the real reason to use R is because the only other option you're being offered is SAS, and you don't want to deal with that mess! Or so I hear.

    Bottom line, if you're not being threatened with SAS, there may be little reason to learn R. But if you are, or if you think there's any danger you might be, R is probably something you want to learn ASAP! :)

  21. How about errors and debugging? by GlobalEcho · · Score: 2

    I feel that one of the weakest points of R is the error handling, reporting, and debugging available. Do you have advice on tools or techniques for people coding in R (aside from using RStudio)? Are there plans for improvements in this area? The current facilities are reminiscent, at least to me, of using gdb back in the 1990s.

    I have in mind cases like the following, in which a confusion about list access using the [ operator (when the [[ should have been used) provides a cryptic error message with no traceback available.

    > symlog_scaler <- list(linear_to=2.5,  abscissa=2.0,
    +    scaling_function=function(x,linear_to=2.5,abscissa=2.0){
    +        y <- x; linear_to = abs(linear_to); big_ix = (linear_to<x)
    +        y[big_ix] = linear_to + log(1+(x[big_ix] - linear_to), base=abscissa)
    +        small_ix = (-linear_to>x)
    +        y[small_ix] = -(linear_to + log(1+(-x[small_ix] - linear_to),base=abscissa))
    +        y})
    > symlog_scaler$scaling_function(-5:5)
    [1] -4.307355 -3.821928 -3.084963 -2.000000 -1.000000  0.000000  1.000000  2.000000  3.084963
    [10]  3.821928  4.307355
    > symlog_scaler['scaling_function'](-5:5)
    Error: attempt to apply non-function
    > traceback()
    No traceback available
    >
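
    The difference behind the error in the transcript above can be shown with a cut-down list. The scaler object here is a hypothetical stand-in, not the original symlog_scaler. A likely reason no traceback appears is that the failing call is evaluated at top level, so there is no stack of function calls to report.

    ```r
    # A simplified, hypothetical stand-in for the list in the transcript.
    scaler <- list(linear_to = 2.5, scaling_function = function(x) x * 2)

    # `[` keeps the container: the result is a list of length 1.
    kept <- scaler["scaling_function"]
    class(kept)                                   # "list"

    # `[[` (like `$`) extracts the element itself.
    extracted <- scaler[["scaling_function"]]
    class(extracted)                              # "function"
    doubled <- extracted(1:3)

    # tryCatch() captures the condition so its message can be inspected
    # instead of only being printed at the console.
    msg <- tryCatch(
      scaler["scaling_function"](1:3),
      error = function(e) conditionMessage(e)
    )

    # A cheap guard that fails with a clearer, caller-written message.
    stopifnot(is.function(extracted))
    ```

    Checking is.function() before the call is one way to turn the terse "non-function" error into something that points at the real mistake.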

  22. Re:Doom 4 beat my ass, lol... apk by BronsCon · · Score: 1

    Right and I'd mod myself down, too.. mhm, so sure. I have one account, one single account, a fact which only Slashdot staff will be able to prove or disprove, much like your claim that I have multiple sockpuppet accounts. You're playing yourself for stupid.

    --
    APK quotes people (including myself) without context and should not be trusted. Just thought you should know.
  23. Re:Saw a post of yours in your history by BronsCon · · Score: 1

    Woah, woah, cool it, Alexander. What happened to the last couple messages we exchanged last night? Really not making yourself look good here, buddy, coming at me like this after we made amends.

    --
    APK quotes people (including myself) without context and should not be trusted. Just thought you should know.
  24. Re:THIS happened... apk by BronsCon · · Score: 1

    When did I bring up statistics, other than pointing out that R is a statistical analysis language? Whichever AC said that, I can assure you it was not me, just as whoever modded both of us down was not me. I think your "disappointment" is misdirected; you're not family and clearly have no interest in being a friend, though, so your disappointment really doesn't mean much to me. Sorry about that.

    --
    APK quotes people (including myself) without context and should not be trusted. Just thought you should know.
  25. Re:Saw a post of yours in your history by BronsCon · · Score: 1

    I'd gladly lay off but you started up again even now

    Everything you are referring to was posted before we supposedly made amends and had already been replied to by you.

    Your POST HISTORY SHOWS YOU CONSTANTLY COMING IN AFTER I HAVE BEEN IN POSTS TOO

    I stood up for you in one post and directly replied to you in another, in this very topic. Aside from that, there was another thread a few days ago where we interacted, and I made one off-the-cuff remark about wishing you'd leave me alone (in a thread where that type of comment was actually quite relevant), which was also made during that little tiff.

    My only posts to or about you since our supposed (and clearly meaningless) truce have been directly to you, people posting as you, or people posting attacking me for the conversation we already had, inquiring as to WTF those attacks are all about since we had come to a new understanding and were supposedly past that like a couple of grown adults.

    You know what? I no longer have karma to burn to make amends with you. You're dead to me.

    --
    APK quotes people (including myself) without context and should not be trusted. Just thought you should know.
  26. Third party GUIs by Trogre · · Score: 1

    I have been impressed with the strong community surrounding R, and the excellent third party libraries that are available in the CRAN.

    What is your view on the various third party GUIs that exist for R, such as RStudio, Tinn-R and RExcel? Do you use or recommend any of them?

    --
    "Nine times out of ten, starting a fire is not the best way to solve the problem." - my wife
  27. Re:Saw a post of yours in your history by BronsCon · · Score: 1

    I wasn't gonna go there but... since he's dead to me now... don't you know it?

    --
    APK quotes people (including myself) without context and should not be trusted. Just thought you should know.