Slashdot Mirror


Ask Slashdot: Best Language To Learn For Scientific Computing?

New submitter longhunt writes "I just started my second year of grad school and I am working on a project that involves a computationally intensive data mining problem. I initially coded all of my routines in VBA because it 'was there'. They work, but run way too slow. I need to port to a faster language. I have acquired an older Xeon-based server and would like to be able to make use of all four CPU cores. I can load it with either Windows (XP) or Linux and am relatively comfortable with both. I did a fair amount of C and Octave programming as an undergrad. I also messed around with Fortran77 and several flavors of BASIC. Unfortunately, I haven't done ANY programming in about 12 years, so it would almost be like starting from scratch. I need a language I can pick up in a few weeks so I can get back to my research. I am not a CS major, so I care more about the answer than the code itself. What language suggestions or tips can you give me?"

9 of 465 comments

  1. Re:Python by Garridan · · Score: 5, Informative

    I use Sage. When Python isn't fast enough, I can essentially write in C with Cython. It's gloriously easy. Have some trivially parallelizable data mining? Just use the @parallel decorator. Sage comes with a slew of fast mathematical packages, so your toolbox is massive, and you can hook it all in to your Cython code with minimal overhead.
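
    As a rough illustration of the "trivially parallelizable" case, here is a minimal sketch that uses only the Python standard library rather than Sage's @parallel decorator (it is a stand-in, not Sage's API). The names score_chunk, split and parallel_score, and the sum-of-squares workload, are hypothetical placeholders for a real per-chunk computation. The default of four workers is an assumption that matches the four cores mentioned in the question.

```python
from concurrent.futures import ProcessPoolExecutor


def score_chunk(chunk):
    """Hypothetical CPU-bound task: sum of squares over one chunk of data."""
    return sum(x * x for x in chunk)


def split(data, n_chunks):
    """Split data into n_chunks contiguous pieces of nearly equal size."""
    size, extra = divmod(len(data), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks each take one leftover element.
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks


def parallel_score(data, workers=4, executor_cls=ProcessPoolExecutor):
    """Score each chunk in its own worker, then combine the partial results.

    executor_cls is swappable (an assumption of this sketch) so the same
    code can be tried with a thread-based executor as well.
    """
    with executor_cls(max_workers=workers) as pool:
        return sum(pool.map(score_chunk, split(data, workers)))
```

    When run as a script, a call such as parallel_score(list(range(1_000_000))) should sit under an if __name__ == "__main__": guard, because process-based pools may re-import the main module on some platforms (Windows in particular).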

  2. BAD TIM! BAD! by girlintraining · · Score: 5, Funny

    "What language suggestions or tips can you give me?"

    Timothy, shame on you. You should know better than to start a holy war.

    --
    #fuckbeta #iamslashdot #dicemustdie
  3. Re:Python by shutdown -p now · · Score: 5, Interesting

    a complete answer would be Python and C++, because numpy/scipy can't do everything and Python is still very slow for number-crunching.

    The problem with using the mix (when you actually write the C++ code yourself) is that debugging it is a major pain in the ass - you either attach two debuggers and simulate stepping across the boundary by manually setting breakpoints, or you give up and resort to printf debugging.

    OTOH, if Windows is an option, PTVS is a Python IDE that can debug Python and C++ code side by side, with cross-boundary stepping etc. It can also do Python/Fortran debugging with a Fortran implementation that integrates into VS (e.g. the Intel one).

    (full disclosure: I am a developer on the PTVS team who implemented this particular feature)

  4. Re:Python by shutdown -p now · · Score: 5, Insightful

    Python is VB done right.

  5. Profile by Arker · · Score: 5, Insightful

    A lot of people will propose a language because it is their favorite. Others because they believe it is very easy to learn. I will give you a third line of thought.

    I would not look for a language in this case; I would look for a library, then teach myself whichever language gives the easiest and quickest access to it. I would profile what you are building, figure out where the bottlenecks are likely to be (profiling your existing mockup can help here, but don't trust it entirely), and try to find the best stable, well-designed, high-performance library for that particular type of code.
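
    For what the profiling step might look like in practice, here is a minimal sketch using Python's standard cProfile and pstats modules. The workload (slow_distance_sum, cheap_setup, run_analysis) and the helper profile_report are hypothetical names invented for illustration; a real project would profile its own routines instead.

```python
import cProfile
import io
import pstats


def slow_distance_sum(points):
    """Hypothetical bottleneck: O(n^2) sum of pairwise absolute differences."""
    total = 0.0
    for a in points:
        for b in points:
            total += abs(a - b)
    return total


def cheap_setup(n):
    """Hypothetical inexpensive step, included for contrast in the report."""
    return [float(i % 17) for i in range(n)]


def run_analysis(n=300):
    """Hypothetical top-level routine that calls both steps."""
    return slow_distance_sum(cheap_setup(n))


def profile_report(func, *args, top=5):
    """Run func under cProfile; return its result and a text report of
    the `top` entries sorted by cumulative time."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return result, buffer.getvalue()
```

    Printing the report from profile_report(run_analysis) lists where the time goes. Whichever function dominates the cumulative column is the one worth replacing with a library call.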

    --
    =-=-=-=-=-=-=-=-=-=-=-=-=-=-
    Friends don't let friends enable ecmascript.
  6. Re:FORTRAN by Frosty Piss · · Score: 5, Interesting

    Clearly you are not involved in serious science.

    And if you think FORTRAN is some ancient, esoteric language, you're ignorant as well. The most recent standard, ISO/IEC 1539-1:2010, informally known as Fortran 2008, was approved in September 2010.

    Fortran is, for better or worse, the only major language out there specifically designed for scientific numerical computing. Its array handling is nice, with succinct array operations on both whole arrays and on slices, comparable with Matlab or numpy but super fast. The language is carefully designed to make it very difficult to accidentally write slow code -- to take the standard example, pointers are restricted in such a way that it's immediately obvious if there might be aliasing -- and so the optimizer can go to town on your code. Current incarnations have things like Coarray Fortran, DO CONCURRENT and FORALL built into the language, allowing distributed-memory and shared-memory parallelism, and vectorization.
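
    To make the comparison concrete, here is a minimal sketch of whole-array and slice operations written in numpy, with roughly equivalent Fortran array expressions shown in the comments. The data and the centered-difference calculation are hypothetical examples chosen for illustration, and the sketch assumes numpy is installed.

```python
import numpy as np

# Hypothetical data: ten evenly spaced samples, 0.0 through 9.0.
x = np.linspace(0.0, 9.0, 10)

# Whole-array operation with no explicit loop.
# Fortran-style equivalent: y = 2.0*x + 1.0
y = 2.0 * x + 1.0

# Slice operation: centered difference over the interior points.
# Fortran-style equivalent (1-based, inclusive bounds):
#   d(2:n-1) = (y(3:n) - y(1:n-2)) / (x(3:n) - x(1:n-2))
d = np.zeros_like(y)
d[1:-1] = (y[2:] - y[:-2]) / (x[2:] - x[:-2])
```

    The end points of d are left at zero here. A real calculation would treat the boundaries separately.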

    The downsides of Fortran are mainly the flip side of one of its upsides: Fortran has a very long history. Upside: tonnes of great libraries. Downside: tonnes of historical baggage.

    If you have to do a lot of number crunching, Fortran remains one of the top choices, which is why many of the most sophisticated simulation codes run at supercomputing centres around the world are written in it. But of course it would be a terrible, terrible language to write a web browser in. To each task its tool.

    --
    If you want news from today, you have to come back tomorrow.
  7. Re:Python by SJHillman · · Score: 5, Funny

    VB is feeding your scrotum to a python.

  8. Re:Python by ebno-10db · · Score: 5, Insightful

    Perl is still in wide use.

    Do not use Perl for this. I've been using Perl for 15-20 years, and I love it for "scripting", text processing, etc., but using it for scientific computing sounds like an exercise in masochism.

  9. Re:Python by Just Some Guy · · Score: 5, Funny

    I wrote some Perl that looked like the output of AES once.

    --
    Dewey, what part of this looks like authorities should be involved?