Slashdot Mirror


Comparing R, Octave, and Python for Data Analysis

Here is a breakdown of R, Octave and Python, and how analysts can rely on open-source software and online learning resources to bring data-mining capabilities into their companies. The article breaks down which of the three is easiest to use, which do well with visualizations, which handle big data the best, etc. The lack of a budget shouldn't prevent you from experiencing all the benefits of a top-shelf data analysis package, and each of these options brings its own set of strengths while being much cheaper to implement than the typical proprietary solutions.

18 of 61 comments

  1. Get the Popcorn by eldavojohn · · Score: 4, Funny

    So, you're linking a SlashdotBI article to the Slashdot front page?

    Well then.

    --
    My work here is dung.
  2. Did I seriously miss something? by ACK!! · · Score: 4, Informative

    The whole article was not much more than a high-level review. The graphic naturally draws attention to the parameters the writer wanted to cover, but he did not back up his graphic with any sort of serious textual review of what he felt were the weaknesses or advantages of the different programming languages, at least not in any detail.

    --
    ACK /ak/ interj. 2. [from the comic strip "Bloom County"] An exclamation of surprised disgust, esp. i
    1. Re:Did I seriously miss something? by Ruie · · Score: 4, Interesting

      The whole article was not much more than a high-level review. The graphic naturally draws attention to the parameters the writer wanted to cover, but he did not back up his graphic with any sort of serious textual review of what he felt were the weaknesses or advantages of the different programming languages, at least not in any detail.

      And what he has is flawed as well. For example, he marked R as having issues with big data, which is quite wrong - I routinely analyze multi-GB datasets in memory, and my databases go into TB. Of the three languages, R is the only one to have a native format (data.frame) that interfaces easily to database queries. Both Octave (Matlab) and Python have to use compound types, which make addressing difficult.

      Also, I found R easier to master than either Octave or Python, but this is probably because I am familiar with Lisp.

    2. Re:Did I seriously miss something? by Anrego · · Score: 2

      Indeed. This is high-level "meeting for the suits" bullshit. I can picture this showing up in a PowerPoint presentation.

      Here are your three options.. this is the one that sucks, this is the one that sucks for a different reason, and this is the one I want you to go with. Oh, and here is a chart with some pretty checkmarks and stuff to help clarify! Let's do lunch!

    3. Re:Did I seriously miss something? by dondelelcaro · · Score: 2

      there are ways around that through smart planning, variable use, and multiple data files for different variables so not all are in memory at once

      There are also packages like ff and others which handle absolutely gigantic files by offloading parts of them to storage and only allocating memory for them (and storage) when required. R certainly has some problems with dealing with huge amounts of data, but they aren't insurmountable for datasets smaller than 1 TB.
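
      The general idea of keeping only part of the data in memory can be sketched in a few lines. This is not the ff package or its API; it is a minimal Python illustration using only the standard library, and the function name chunked_mean, the column names, and the chunk size are all hypothetical.

```python
import csv
import io
from itertools import islice


def chunked_mean(rows, column, chunk_size=2):
    """Mean of one numeric column, holding at most chunk_size rows in memory."""
    reader = csv.DictReader(rows)
    total, count = 0.0, 0
    while True:
        # Pull the next chunk_size rows from the stream; earlier rows are discarded.
        chunk = list(islice(reader, chunk_size))
        if not chunk:
            break
        total += sum(float(row[column]) for row in chunk)
        count += len(chunk)
    return total / count if count else float("nan")


# Stand-in for a file too large to load at once (an open file handle works the same way).
sample = io.StringIO("id,value\n1,10\n2,20\n3,30\n4,40\n5,50\n")
result = chunked_mean(sample, "value")
print(result)
```

      Only running totals survive between chunks, so memory use depends on the chunk size rather than on the size of the file. Statistics that need all values at once (a median, say) would need a different approach.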

      --
      http://www.donarmstrong.com
    4. Re:Did I seriously miss something? by plopez · · Score: 2

      32 bits? Are you serious?

      --
      putting the 'B' in LGBTQ+
  3. I wish he had learning resources. by Anonymous Coward · · Score: 4, Insightful

    I wish there was also a column for availability of learning resources such as tutorials, free books, example code, etc.

  4. Never selected that way by vlm · · Score: 4, Insightful

    how analysts can rely on open-source software

    I've done that kind of stuff at work and those criteria are NEVER how a package is selected.

    If I need a commercial product I need all manner of signoffs requiring at least weeks of delay and massive IT involvement so they can insert it into windoze images automatically or whatever it is they do.

    If I'm doing FOSS it just ... gets done that day. No agony. And it just works. Instead of a call center script reader in India who can only tell me to reinstall the software over and over, with FOSS the "whole internet" is my support system, and they (as in the whole internet) know what they're doing.

    Nothing about this has changed in about 15 years, so I'm not sure how this is "news". This would have been a good "news" story in the early/mid nineties.

    --
    "Science flies us to the moon. Religion flies us into buildings." - Victor Stenger
    1. Re:Never selected that way by Sebastopol · · Score: 2

      This is a thinly veiled attempt to put Python on the same level as R. /shakes head/

      --
      https://www.accountkiller.com/removal-requested
    2. Re:Never selected that way by Anonymous Coward · · Score: 2, Insightful

      Besides, in research, using something open source (or at the very least gratis) makes it that much easier for others to replicate what you did. Getting SAS scripts just isn't fun.

    3. Re:Never selected that way by Anonymous Coward · · Score: 5, Insightful

      I'm an astronomer. At this point in my career, I move to a new research institution every couple of years. Each institution may have a site licence for some piece of commercial software like IDL or Matlab, but I use free software (Python, in my case) because I know that I can keep using it, rather than rewriting all my scripts for a new language every time I move.

  5. More crap from /. by NoMaster · · Score: 4, Insightful

    "Here is a breakdown of R, Octave and Python ..."

    No, there isn't - there is not much more than a shitty 'feature' table, too high-level to be anything other than facile, which is "Based on [the author's] own user experience and research".

    As a student user of all 3, I would have been interested in reading a good comparative review or explanation aimed at outsiders. This ain't it; it's just more slashvertising.

    --
    What part of "a well regulated militia" do you not understand?
  6. Or if you can't make up your mind by Anonymous Coward · · Score: 2, Interesting

    Sage math http://www.sagemath.org/

  7. Julia? by Chrisq · · Score: 3, Informative

    There was a previous article about Julia, which looked cool. I wonder how it measures up.

  8. Oh.. by Anrego · · Score: 2, Insightful

    Now that's just desperation.

    Come on .. keep this shit in bi. Either it takes off or it doesn't.

  9. Both! by Kludge · · Score: 3, Insightful

    The best option is to use Python and R, through rpy for example.
    R rocks for statistical libraries and good documentation.
    Python rocks for everything else.

  10. I don't understand by utkonos · · Score: 4, Informative

    This article compares three languages that have different purposes. R's purpose is statistical analysis and visualization. Octave is a general mathematical analysis and visualization language. Python is a generalist language that has its own focus on code readability, among other things.

    These languages also have a target audience. R is for statisticians and scientists. Octave is for mathematicians, and Python is for programmers.

  11. Python does have data.frame.. by csirac · · Score: 3, Informative

    Through pandas, for a start. The SciPy/NumPy stack is quite nifty, I'm especially interested in how to apply it for working with irregular time series data.
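
    As a rough sketch of what that looks like, the snippet below loads a database query result into a pandas DataFrame and bins irregularly spaced readings into one-minute means. The in-memory SQLite table, its column names (ts, value), and the sample readings are made up for illustration.

```python
import sqlite3

import pandas as pd

# Hypothetical in-memory table standing in for a real database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (ts TEXT, value REAL)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [
        ("2013-01-01 00:00:05", 1.0),
        ("2013-01-01 00:00:50", 3.0),
        ("2013-01-01 00:02:10", 5.0),
        ("2013-01-01 00:02:40", 7.0),
    ],
)
conn.commit()

# The query result arrives as a DataFrame, with ts parsed into timestamps.
df = pd.read_sql_query(
    "SELECT ts, value FROM readings ORDER BY ts", conn, parse_dates=["ts"]
)

# Index by the irregular timestamps, then average within regular one-minute bins.
series = df.set_index("ts")["value"]
per_minute = series.resample("1min").mean()
print(per_minute)
```

    The minute with no readings comes back as NaN rather than being dropped, so gaps in the irregular data stay visible and can be filled or interpolated explicitly.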

    Not to say anybody should ditch R, I still support our researchers most weeks at work in using it. But it's not as clear-cut as you seem to think it is, especially in terms of memory efficiency.