Slashdot Mirror


Ask Slashdot: Best Language To Learn For Scientific Computing?

New submitter longhunt writes "I just started my second year of grad school and I am working on a project that involves a computationally intensive data mining problem. I initially coded all of my routines in VBA because it 'was there'. They work, but run way too slow. I need to port to a faster language. I have acquired an older Xeon-based server and would like to be able to make use of all four CPU cores. I can load it with either Windows (XP) or Linux and am relatively comfortable with both. I did a fair amount of C and Octave programming as an undergrad. I also messed around with Fortran77 and several flavors of BASIC. Unfortunately, I haven't done ANY programming in about 12 years, so it would almost be like starting from scratch. I need a language I can pick up in a few weeks so I can get back to my research. I am not a CS major, so I care more about the answer than the code itself. What language suggestions or tips can you give me?"

17 of 465 comments

  1. Python by curunir · · Score: 4, Insightful

    I have a friend who works for a company that does gene sequencing and other genetic research and, from what he's told me, the whole industry uses mostly Python. You probably don't have the hardware resources that they do, but I'd bet you also don't have data sets that are nearly as large as theirs.

    You might also get better results from something less general purpose like Julia, which is designed for number crunching.

    --
    "Don't blame me, I voted for Kodos!"
    1. Re:Python by the gnat · · Score: 4, Insightful

      the whole industry uses mostly Python

      This is certainly the way of the future, not just for gene sequencing but many other quantitative sciences, although a complete answer would be Python and C++, because numpy/scipy can't do everything and Python is still very slow for number-crunching. It's best to start with just Python, but eventually some C++ knowledge will be helpful. (Or just plain C, but I can't see any good reason to inflict that on myself or anyone else.)
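
A minimal sketch of the point above about interpreter speed versus numpy, assuming NumPy is installed. The function names and the sample data are made up for illustration and are not from the thread. Both functions compute the same sum of squares; the first loops in the interpreter, the second hands the loop to NumPy's compiled code, which is typically much faster on large arrays.

```python
import numpy as np


def sum_of_squares_loop(values):
    """Pure-Python loop: every iteration goes through the interpreter."""
    total = 0.0
    for v in values:
        total += v * v
    return total


def sum_of_squares_numpy(values):
    """Same computation, but the loop runs inside NumPy's compiled code."""
    arr = np.asarray(values, dtype=np.float64)
    return float(np.dot(arr, arr))


# Hypothetical sample data, only here to show both versions agree.
data = [0.5 * i for i in range(1000)]
loop_result = sum_of_squares_loop(data)
numpy_result = sum_of_squares_numpy(data)
```

If a computation cannot be expressed as array operations like this, that is roughly where the C++ (or C) knowledge mentioned above would come in.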

    2. Re:Python by Anonymous Coward · · Score: 4, Insightful

      Python is the new VB.

    3. Re:Python by shutdown -p now · · Score: 5, Insightful

      Python is VB done right.

    4. Re:Python by dmbasso · · Score: 3, Insightful

      The problem with using the mix (when you actually write the C++ code yourself) is that debugging it is a major pain in the ass

      Only if you don't use the C/C++ code as an independent module, as it should be. If you *must* debug it in parallel, you're designing it wrong.

      --
      `echo $[0x853204FA81]|tr 0-9 ionbsdeaml`@gmail.com
    5. Re:Python by Anonymous Coward · · Score: 3, Insightful

      VB is closed-source trash.

    6. Re:Python by ebno-10db · · Score: 5, Insightful

      Perl is still in wide use.

      Do not use Perl for this. I've been using Perl for 15-20 years, and I love it for "scripting", text processing, etc., but using it for scientific computing sounds like an exercise in masochism.

    7. Re:Python by Joce640k · · Score: 4, Insightful

      Compared to C and C++, Fortran is actually more elegant for pure numerical computing.

      Unsurprising - that's what Fortran was designed for...!

      --
      No sig today...
    8. Re:Python by shutdown -p now · · Score: 3, Insightful

      No, it's a simple language that is easy for beginners to learn. But, unlike VB, it is not horribly designed, and is useful even once you grow out of the beginner phase.

  2. Fortran by Anonymous Coward · · Score: 2, Insightful

    sorry to say, but that is a fact

    1. Re:FORTRAN by Anonymous Coward · · Score: 2, Insightful

      Yeah, sure.

      So that no one can ever check your models or replicate your results even if you publish code and initial data.

  3. FORTRAN by Frosty Piss · · Score: 2, Insightful

    Seriously consider FORTRAN

    --
    If you want news from today, you have to come back tomorrow.
  4. what the rest of your team uses by peter303 · · Score: 4, Insightful

    You should all be sharing your code, both to avoid rewriting it and to perfect it.
    And if you are not a member of a team, then I seriously question the quality of your graduate program.

  5. Profile by Arker · · Score: 5, Insightful

    A lot of people will propose a language because it is their favorite. Others because they believe it is very easy to learn. I will give you a third line of thought.

    I would not look for a language in this case. I would look for a library, then teach myself whatever language gives the easiest and quickest access to it. I would try to profile what you are building, figure out where the bottlenecks are likely to be (profiling your existing mockup can help here, but don't trust it entirely), and try to find the best stable, well-designed, high-performance library for that particular type of code.
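
The profiling step suggested above can be sketched with Python's standard `cProfile` and `pstats` modules. This is only an illustration under assumptions: the pipeline, the function names, and the data are hypothetical stand-ins for the real routines, with a deliberately slow O(n^2) step playing the bottleneck.

```python
import cProfile
import io
import pstats


def load_points(n):
    """Cheap setup step, included so the profile has something to compare."""
    return [(float(i % 17), float(i % 31)) for i in range(n)]


def pairwise_distance_sum(points):
    """Deliberately slow O(n^2) stand-in for a data mining hot spot."""
    total = 0.0
    for i, (x1, y1) in enumerate(points):
        for x2, y2 in points[i + 1:]:
            total += ((x1 - x2) ** 2 + (y1 - y2) ** 2) ** 0.5
    return total


def run_pipeline(n=300):
    """Hypothetical pipeline: load data, then run the expensive step."""
    points = load_points(n)
    return pairwise_distance_sum(points)


def profile_report(func, *args):
    """Run func under cProfile and return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Sort by cumulative time and keep only the top few entries.
    stats.sort_stats("cumulative").print_stats(5)
    return result, stream.getvalue()


result, report = profile_report(run_pipeline, 300)
print(report)
```

The function that dominates the cumulative time column is the one worth replacing with a library call, which is the part of the code that should drive the choice of library and language.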

    --
    =-=-=-=-=-=-=-=-=-=-=-=-=-=-
    Friends don't let friends enable ecmascript.
  6. Re:R-language by green is the enemy · · Score: 2, Insightful

    This is the correct advice: Use whatever language is most common in your research area, so you can benefit from the most existing source code. This will almost certainly be a high-level scripting language like R, MATLAB or Python, with the ability to drop down to C, FORTRAN and CUDA for the small parts of the code that need optimization. (In my case: electrical engineering = MATLAB + C and CUDA mex files)

  7. Re:Fortran (plus MPI and some CUDA) by Anonymous Coward · · Score: 2, Insightful

    Fortran, and learn how to implement MPI and CUDA code if your work is parallelizable.

    DO NOT USE CUDA

    Use OpenCL

  8. C. Obviously. by RandCraw · · Score: 3, Insightful

    You know C. C is simple, as fast as any alternative, it's straightforward to optimize (aside from pointer abuse), and you always know what the compiler/runtime is doing. And threading libraries like pthreads or CUDA are best served via C/C++. Why use anything else?

    Another thought: scientific libraries. If you need external services or algorithms, then your chosen language should support the libraries you need. C/C++ are well served by many fast machine learning libs such as FANN, LIBSVM, and OpenCV, not to mention CBLAS, LINPACK, etc.